CN115526249A - Time series early classification method, terminal device and storage medium

Info

Publication number
CN115526249A
CN115526249A (application CN202211163048.5A)
Authority
CN
China
Prior art keywords
classification, convolution, data, probability, layer
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211163048.5A
Other languages
Chinese (zh)
Inventor
侯毅
安玮
陈慧玲
盛卫东
马超
林再平
曾瑶源
李振
李骏
周石琳
黄源
乔木
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Application filed by National University of Defense Technology
Priority to CN202211163048.5A
Publication of CN115526249A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods


Abstract

The invention discloses a time series early classification method, a terminal device and a storage medium. A training set is constructed from time series data of human actions and used to train a neural network. The training set is then fed through the trained network to obtain the classification probabilities of all training data at all times, and a probability exit threshold is computed from these probabilities. At inference time, the data observable at time t are input to the trained network to obtain the classification probability Pt at time t; if the maximum of Pt exceeds the probability exit threshold, input stops and the classification result at time t is taken as the final classification result. The method adapts to continually arriving new data and extracts more discriminative features, improving the accuracy of early classification of time series data; it also adapts to sample content and difficulty and extracts more class-specific features, further improving early classification accuracy.

Description

Time series early classification method, terminal device and storage medium

Technical Field

The invention relates to the technical field of time series data classification, and in particular to a time series early classification method, a terminal device and a storage medium.

Background

In recent years, with the development of smart wearable devices, time series data of human actions can be collected almost anywhere for applications such as personal health monitoring and smart home control, and time series classification has attracted wide attention. For some time-sensitive applications, such as detecting a fall of an elderly person, it is desirable to classify the time series as early as possible. In addition, early classification of human activities helps minimize the response time of the system and thus improves the user experience. Classifying time series data as early and as accurately as possible is therefore of significant research interest.

In early classification, data arrive continuously over time, so the length of the input keeps changing and its features differ considerably from one moment to the next; a classifier therefore has difficulty handling time series of arbitrary length.

Traditional early time series classification methods can be divided into prefix-based methods, shapelet-based methods and posterior-probability-based methods. However, these methods usually spend considerable time training multiple classifiers for time series of different lengths, and require substantial expert experience to design hand-crafted features or to set exit thresholds. Compared with traditional methods, deep-learning-based methods can automatically extract more effective features in the present big-data era.

Current deep-learning-based early time series classification methods can be divided into one-stage and two-stage methods. Two-stage methods usually train a classification model on the training set in the first stage, and then in the second stage formulate exit rules or set a fixed exit threshold, producing a classification result early once the classification probability satisfies the exit condition. One-stage methods usually build and jointly train a classification subnetwork and an exit subnetwork: the classification subnetwork produces the classification result, while the exit subnetwork indicates whether to exit at the current moment.

Since the input to early classification keeps changing, deep-learning-based methods usually rely on recurrent neural networks to handle data of ever-growing length. However, because of the forgetting problem inherent in its recursive structure, a recurrent neural network classifies long time series poorly and has weak local feature extraction ability. Some methods combine convolutional neural networks to extract local features; unfortunately, the parameters of a conventional convolution kernel are fixed, so the same feature-matching template is applied at every moment and to every sample, which fails to fully account for intra-class and inter-class differences.

Summary of the Invention

The technical problem addressed by the present invention is to overcome the above deficiencies of the prior art by providing a time series early classification method, a terminal device and a storage medium that fully account for intra-class and inter-class differences and improve the accuracy of classifying human-action time series data.

To solve the above technical problem, the present invention adopts the following technical solution: a time series early classification method comprising the following steps:

S1. Construct a training set from time series data of human actions.

S2. Train a neural network with the training set.

S3. Input the training set into the trained neural network to obtain the classification probabilities of all training data at all times, and compute a probability exit threshold from these classification probabilities.

S4. Input the data observable at time t into the trained neural network to obtain the classification probability Pt at time t, and take the maximum of Pt. If this maximum exceeds the probability exit threshold, stop inputting observable data and take the classification result at time t as the final classification result; otherwise, increment t by 1 and repeat step S4 until the exit condition is satisfied.
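The exit loop of step S4 can be sketched as follows. This is an illustrative Python stub: the stream of probability vectors here is a hand-written stand-in for the per-time outputs of the trained network.

```python
# Illustrative sketch of step S4: stream in one more timestep at a time
# and exit as soon as the top class probability clears the threshold.
# probs_per_t is a hypothetical stand-in for the trained network's outputs.

def classify_early(probs_per_t, exit_threshold):
    """probs_per_t[t] is the class-probability vector P_t obtained from
    the first t+1 observed timesteps."""
    for t, p_t in enumerate(probs_per_t):
        if max(p_t) > exit_threshold:        # max(P_t) > threshold -> exit
            return p_t.index(max(p_t)), t + 1  # class label, timesteps used
    # fall back to the full-length prediction if the threshold is never met
    p_last = probs_per_t[-1]
    return p_last.index(max(p_last)), len(probs_per_t)

# toy stream: confidence grows as more data arrive
stream = [[0.4, 0.35, 0.25], [0.6, 0.25, 0.15], [0.85, 0.1, 0.05]]
label, used = classify_early(stream, exit_threshold=0.8)
print(label, used)  # 0 3: exits after the third timestep with class 0
```

With a lower threshold the same sample exits earlier (threshold 0.55 exits after two timesteps), which is exactly the accuracy/earliness trade-off the cost formula below balances.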

After obtaining the classification probabilities, the invention computes a probability exit threshold from them, and decides whether to stop based on the relation between the exit threshold and the current classification probability. The threshold is not fixed but computed from the classification probabilities, so it adapts to the continually changing input of early classification. If the exit condition is not met, data continue to be fed into the classifier (the trained neural network) as time passes, until the condition is met. Intra-class and inter-class differences are thus fully taken into account, greatly improving the accuracy of early time series classification.

In the present invention, the neural network comprises:

a plurality of cascaded first convolution modules, which extract features from the input data to obtain first features;

a plurality of cascaded second convolution modules, which take the first features as input and extract high-level features of the input data;

an average pooling layer, whose input is the high-level features and whose output is the fused features corresponding to sequences of different time lengths;

a linear layer, whose input is the fused features and whose output is the prediction scores of the different classes;

an exponential normalization (softmax) layer, which normalizes the prediction scores and outputs the classification probabilities.

In the present invention, the input data are fed into the convolution blocks (the first convolution modules) to extract low-level features; the low-level features are then fed into the dynamic convolution blocks (the second convolution modules) to extract time-adaptive high-level features. The high-level features pass through the average pooling layer to obtain fused features for sequences of different time lengths, the fused features pass through the linear layer to obtain prediction scores for the different classes, and the scores finally pass through the exponential normalization layer to yield the output probabilities.

In the present invention, the first convolution module is a low-level convolution module for extracting low-level features (first or primary features), and the second convolution module is a dynamic convolution module for extracting high-level features.

In the present invention, cascading means sequential connection: the output of the first first-convolution module is connected to the input of the second first-convolution module, the output of the second to the input of the third, and so on.

In the present invention, the first convolution module comprises, connected in sequence, a first dilated causal convolution layer, a first normalization layer, a second dilated causal convolution layer, a second normalization layer and a first linear activation unit; the input and output features of the first convolution module are connected by a residual connection.
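As a minimal illustration of the dilated causal convolution used in the first convolution module, the following single-channel numpy sketch shows the key property: the output at time t depends only on inputs at or before t. The patent's layers are of course learned and multi-channel; this is a hand-written stand-in.

```python
import numpy as np

# Minimal single-channel dilated causal 1-D convolution (illustrative):
# the input is left-padded so y[t] depends only on x[t], x[t-d], x[t-2d], ...
# i.e. no information from future timesteps leaks into the present.

def dilated_causal_conv1d(x, kernel, dilation=1):
    k = len(kernel)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])      # causal left-padding
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        # taps at t, t-d, ..., t-(k-1)d in the original signal
        taps = xp[t + pad - np.arange(k) * dilation]
        y[t] = np.dot(taps, kernel)              # kernel[0] weights x[t]
    return y

x = np.array([1.0, 2.0, 3.0, 4.0])
y = dilated_causal_conv1d(x, kernel=np.array([1.0, 1.0]), dilation=1)
print(y)  # [1. 3. 5. 7.] -- y[t] = x[t] + x[t-1]
```

Increasing the dilation widens the receptive field without adding parameters, which is why the description above notes that dilated convolution "effectively enlarges the receptive field".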

In the present invention, the second convolution module comprises a dynamic convolution layer, a third normalization layer and a second linear activation unit. The dynamic convolution layer includes a kernel generation module that uses the input data to generate a convolution kernel for each time step; these kernels are then used to extract features from the input data.

The convolution parameters of a conventional convolution block are fixed and independent of the content and length of the sample, which is unfavorable for extracting features from the streaming data of early classification. The present invention therefore places a dynamic convolution module after the convolution blocks that extract the initial low-level features, so that the features extracted by the dynamic convolution module adapt to changes in data content and length, further improving classification accuracy.

In the present invention, the kernel generation module comprises, connected in sequence, a first convolution layer, a rectified linear unit, a batch normalization layer and a second convolution layer.

For a given sample, a conventional kernel is shared across the whole time span, so its features are not discriminative and the information gain is limited as the amount of data grows. By contrast, the time-adaptive convolution module designed here generates kernels specific to the newly arrived data as time passes, and can therefore extract more discriminative features.
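The idea of a time-adaptive kernel can be illustrated with a toy stand-in. The generator below is a hand-written softmax over the local window, purely for illustration; the patent's generator is the learned convolution / ReLU / batch-normalization / convolution stack described above.

```python
import numpy as np

# Simplified illustration of time-adaptive convolution: instead of one
# kernel shared by every timestep, a generator produces a distinct kernel
# from the features at each timestep. toy_generator is a hypothetical
# stand-in for the patent's learned kernel generation module.

def toy_generator(window):
    # content-adaptive weights: softmax over the window itself, so the
    # kernel attends to the larger recent values
    e = np.exp(window - window.max())
    return e / e.sum()

def dynamic_conv1d(x, kernel_size, gen):
    pad = kernel_size - 1
    xp = np.concatenate([np.zeros(pad), x])      # causal padding
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        window = xp[t:t + kernel_size]
        k = gen(window)                          # per-timestep kernel
        y[t] = np.dot(window, k)
    return y

x = np.array([0.0, 1.0, 0.0, 3.0])
y = dynamic_conv1d(x, kernel_size=3, gen=toy_generator)
print(np.round(y, 3))
```

Because the kernel is recomputed at every timestep, two samples (or two moments within one sample) with different content are filtered by different templates, which is the contrast with a fixed shared kernel that the paragraph above draws.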

Computing the probability exit threshold from the classification probabilities comprises:

sorting the classification probabilities and removing duplicates;

taking the midpoints of adjacent classification probabilities to obtain a series of threshold candidates;

selecting the candidate with the smallest cost as the probability exit threshold.

Directly specifying a threshold for a data set requires substantial expert experience, does not transfer to other data sets, and generalizes poorly. The present invention instead uses a cost formula that automatically yields a threshold suited to each data set, and the formula can be tuned to actual requirements by adjusting the value of α: increase α when classification accuracy matters more, decrease α when an earlier exit matters more. Computing the threshold from a cost formula also has the advantage of interpretability.

The cost of a threshold candidate is computed as Costβ = α·(1 − Accβ) + (1 − α)·Earlinessβ, where Accβ is the accuracy obtained when exiting whenever the maximum classification probability exceeds the candidate β, Earlinessβ is the earliness obtained under the same exit rule, and α is a weight coefficient.

In the present invention, extensive experimental analysis led to the value α = 0.8.
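The threshold search described above (sort, deduplicate, take midpoints, pick the minimum-cost candidate under Costβ = α·(1 − Accβ) + (1 − α)·Earlinessβ) can be sketched in plain Python. The input arrays are hypothetical stand-ins for the per-time outputs of the trained network on the training set.

```python
# max_probs[i][t]: max(P_t) for training sample i after t+1 timesteps.
# correct[i][t]:   1 if the argmax class at that time equals the label.
# (Both are illustrative stand-ins for the trained network's outputs.)

def exit_stats(max_probs, correct, beta):
    n = len(max_probs)
    T = len(max_probs[0])
    acc_hits, earliness = 0, 0.0
    for probs, ok in zip(max_probs, correct):
        # first time the confidence clears beta (else the final timestep)
        t_exit = next((t for t, p in enumerate(probs) if p > beta), T - 1)
        acc_hits += ok[t_exit]
        earliness += (t_exit + 1) / T
    return acc_hits / n, earliness / n

def select_threshold(max_probs, correct, alpha=0.8):
    flat = sorted({p for row in max_probs for p in row})        # sort + dedupe
    candidates = [(a + b) / 2 for a, b in zip(flat, flat[1:])]  # midpoints
    def cost(beta):
        acc, early = exit_stats(max_probs, correct, beta)
        return alpha * (1 - acc) + (1 - alpha) * early          # Cost_beta
    return min(candidates, key=cost)

max_probs = [[0.5, 0.7, 0.9], [0.6, 0.8, 0.95]]
correct = [[0, 1, 1], [1, 1, 1]]
print(select_threshold(max_probs, correct, alpha=0.8))  # 0.55
```

On this toy data the lowest candidate, 0.55, already exits every sample correctly, so the cost is dominated by earliness and the smallest workable threshold wins; with a larger α the search would tolerate later exits in exchange for accuracy.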

As a further inventive concept, the present invention also provides a terminal device comprising a memory, a processor and a computer program stored in the memory; the processor executes the computer program to implement the steps of the above method of the present invention.

A computer-readable storage medium stores a computer program/instructions; when executed by a processor, the computer program/instructions implement the steps of the above method of the present invention.

Compared with the prior art, the present invention has the following beneficial effects:

1. For data whose length keeps growing, the invention adapts to the continually arriving new data and extracts more discriminative features, improving the accuracy of early classification of time series data.

2. For data of different classes, the invention adapts to sample content and difficulty and extracts more class-specific features, improving the accuracy of early classification.

Brief Description of the Drawings

Fig. 1 is a flow chart of the method of Embodiment 1 of the present invention;

Fig. 2 is a structural diagram of the neural network of Embodiment 1 of the present invention;

Fig. 3 is a structural diagram of the dynamic convolution module of Embodiment 1 of the present invention.

Detailed Description

To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.

Herein, the terms "first", "second" and similar words do not imply any order, quantity or importance, but merely distinguish different elements. The terms "a", "an" and similar words do not mean that only one of the described things exists, but that the description concerns only one of them; there may be one or more. The terms "comprise", "include" and similar words denote a logical relationship rather than a spatial one: "A includes B" means that B logically belongs to A, not that B is spatially inside A. These terms are also open rather than closed: "A includes B" means that B belongs to A, but B does not necessarily constitute all of A, which may also include other elements such as C, D and E.

Embodiment 1

As shown in Fig. 1, in this embodiment the network designed by the invention (DTCN; its structure is described in the next subsection) is first trained during the training phase. Specifically, all time series data of the training set are fed into the network, and a loss function (given only as an image in the original document; per the surrounding description, a classification loss over all samples and all time steps) is used to train all parameters θ of the model, where N is the number of samples in the training set and T is the length of the complete time series. The training set is then fed into the trained DTCN to obtain the classification probabilities of all training data at all times, and the probability exit threshold for the data set is computed according to the exit rule of the invention. The threshold is computed as follows: the classification probabilities are sorted and duplicates removed, giving {P1, P2, …, PT}; midpoints of adjacent probabilities are then taken, e.g. β1 = (P1 + P2)/2, yielding a series of threshold candidates {β1, β2, …, βT}. For each candidate threshold β, the invention defines the cost formula:

Costβ = α·(1 − Accβ) + (1 − α)·Earlinessβ.

Here Accβ is the accuracy obtained on the training set when exiting as soon as the predicted probability P exceeds the threshold, and Earlinessβ is the earliness obtained under the same rule. After computing the cost of every candidate threshold, the candidate with the smallest cost is selected as the final exit threshold for the data set. In this embodiment, α is set to 0.8.

In the test phase, taking a sample A as an example, the data observable at time t (i.e. the data of length t obtainable from the device at time t; at time t+1, data of length t+1 become available, so the observable data grow gradually over time) are fed directly into the trained network model to obtain the classification probability Pt at that moment. The maximum of Pt is taken; if it exceeds the exit threshold computed during training, the classification result at this moment is considered reliable, the process exits, and that result is taken as the final classification result of sample A. Otherwise, as time passes, data continue to be fed into the classifier until the exit condition is met.

The architecture of the DTCN proposed in this embodiment is shown in Fig. 2 and consists mainly of convolution blocks and dynamic convolution blocks. Specifically, the input data are fed into the convolution blocks to extract low-level features; the low-level features are fed into the dynamic convolution blocks to extract time-adaptive high-level features; the high-level features pass through the average pooling layer to obtain fused features for sequences of different time lengths; the fused features pass through the linear layer to obtain prediction scores for the different classes; and the scores finally pass through the exponential normalization layer to yield the output probabilities.

The low-level convolution block (the first convolution module) extracts the primary low-level features (first features) and comprises dilated causal convolution, a normalization layer and a rectified linear unit (ReLU). Causal convolution is introduced to avoid information leakage, so that the features at a given moment are independent of anything after that moment; dilated convolution effectively enlarges the receptive field; the normalization layer mitigates overfitting during training; and the ReLU adds nonlinearity to the extracted features. Because the convolution parameters of a conventional block are fixed and independent of sample content and length, which hinders feature extraction from the streaming data of early classification, the invention places a dynamic convolution module after the blocks that extract the initial low-level features, so that the extracted features adapt to changes in data content and length.

The design of the dynamic convolution module in this embodiment is shown in Fig. 3. The input features are passed through a kernel generation module to produce a kernel specific to each time step. During training, the kernel generation module learns to produce kernels adapted to the data content. It consists of a size-1 convolution, a batch normalization layer and a ReLU, and produces a channel-shared, time-adaptive kernel of size K. The newly generated time-adaptive kernel is then convolved with the input features to obtain the output features. For samples of different classes, the dynamic kernel generation yields content-adaptive kernels that can extract class-specific features. For a single sample, a conventional kernel is shared across the whole time span, so its features are not discriminative and the information gain is limited as data accumulate; by contrast, the designed time-adaptive module generates kernels specific to the newly arrived data over time and can extract more discriminative features.

The time-adaptive features output by the dynamic convolution module are fused for each time span by the average pooling layer, yielding fused features for the different time spans; finally, the fused features at each moment pass through the linear layer and the normalized exponential (softmax) layer to give the classification probability at each moment.
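The per-time head described above can be sketched as follows. Shapes and parameters here are hypothetical; in the DTCN the features come from the dynamic convolution stack and the linear layer is learned.

```python
import numpy as np

# Sketch of the per-time head: features H[t] are averaged over the prefix
# 0..t (causal average pooling), passed through a linear layer, and
# softmax-normalised, giving one class-probability vector per exit time t.
# H, W, b below are random stand-ins for learned quantities.

def per_time_probs(H, W, b):
    # H: (T, d) features, W: (d, C) linear weights, b: (C,) bias
    T = H.shape[0]
    out = []
    for t in range(T):
        fused = H[: t + 1].mean(axis=0)       # average-pool the prefix
        scores = fused @ W + b                # class prediction scores
        e = np.exp(scores - scores.max())     # softmax normalisation
        out.append(e / e.sum())
    return np.stack(out)                      # (T, C)

rng = np.random.default_rng(0)
probs = per_time_probs(rng.normal(size=(5, 4)),
                       rng.normal(size=(4, 3)), np.zeros(3))
print(probs.shape)  # (5, 3): one probability vector per timestep
```

Each row of the result is a valid probability vector, so the exit rule of step S4 can be evaluated at every timestep against the learned threshold.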

Experiments on two commonly used human action recognition data sets achieved good results (the metric HM used in the table below is a composite metric computed from earliness and accuracy; the method of this embodiment is denoted DETSCN).

Table 1. Comparison of classification results

[Table 1 is given only as an image in the original document and is not reproduced here.]

In Table 1, the ECLN, ETMD and EARLIEST methods refer, respectively, to:

ECLN: Rußwurm M, Tavenard R, Lefèvre S, et al. Early classification for agricultural monitoring from satellite time series [J]. arXiv preprint arXiv:1908.10283, 2019.

ETMD: Sharma A, Singh S K, Udmale S S, et al. Early Transportation Mode Detection Using Smartphone Sensing Data [J]. IEEE Sensors Journal, 2021, 21(14): 15651-15659.

EARLIEST: Hartvigsen T, Sen C, Kong X, et al. Adaptive-halting policy network for early classification [C] // Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2019: 101-1.

Embodiment 2

Embodiment 2 of the present invention provides a terminal device corresponding to Embodiment 1. The terminal device may be a client-side processing device, such as a mobile phone, a laptop, a tablet or a desktop computer, for executing the method of the above embodiment.

The terminal device of this embodiment comprises a memory, a processor and a computer program stored in the memory; the processor executes the computer program in the memory to implement the steps of the method of Embodiment 1.

In some implementations, the memory may be high-speed random access memory (RAM), and may also include non-volatile memory, for example at least one disk memory.

In other implementations, the processor may be any of various general-purpose processors, such as a central processing unit (CPU) or a digital signal processor (DSP); no limitation is imposed here.

Embodiment 3

Embodiment 3 of the present invention provides a computer-readable storage medium corresponding to Embodiment 1 above, on which a computer program/instructions are stored. When executed by a processor, the computer program/instructions implement the steps of the method in Embodiment 1 above.

A computer-readable storage medium may be a tangible device that holds and stores instructions for use by an instruction-execution device. It may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any combination of the above.

Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code. The solutions in the embodiments of the present application may be implemented in various computer languages, for example, the object-oriented programming language Java or the interpreted scripting language JavaScript.

The present application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks therein, can be realized by computer program instructions. These computer program instructions may be provided to a general-purpose computer, a special-purpose computer, an embedded processor, or the processor of another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for realizing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

While preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once the basic inventive concept is appreciated. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present application.

Obviously, those skilled in the art can make various changes and modifications to the present application without departing from its spirit and scope. If such modifications and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is intended to include them as well.

Claims (10)

1. A time series early classification method, characterized in that it comprises the following steps:
S1: constructing a training set from time-series data of human actions;
S2: training a neural network with the training set;
S3: inputting the training set into the trained neural network to obtain the classification probabilities of all training data at all time steps, and computing a probability exit threshold from these classification probabilities;
S4: inputting the observable data at time t into the trained neural network to obtain the classification probability Pt at time t, and taking the maximum value of Pt; if this maximum is greater than the probability exit threshold, stopping the input of further observable data and taking the classification result at time t as the final classification result of the observable data at time t; otherwise, incrementing t by 1 and repeating step S4.

2. The time series early classification method according to claim 1, characterized in that the neural network comprises:
a plurality of cascaded first convolution modules for extracting features from the input data to obtain first features;
a plurality of cascaded second convolution modules, taking the first features as input, for extracting high-level features of the input data;
an average pooling layer, taking the high-level features as input and outputting fused features corresponding to sequences of different time lengths;
a linear layer, taking the fused features as input and outputting prediction scores for the different classes; and
an exponential normalization (softmax) layer for normalizing the prediction scores and outputting classification probabilities.

3. The time series early classification method according to claim 2, characterized in that the first convolution module comprises a first dilated causal convolution layer, a first normalization layer, a second dilated causal convolution layer, a second normalization layer, and a first linear activation unit connected in sequence; the input features of the first convolution module are connected to its output features by a residual connection.

4. The time series early classification method according to claim 2, characterized in that the second convolution module comprises a dynamic convolution layer, a third normalization layer, and a second linear activation unit; the dynamic convolution layer comprises a convolution kernel generation module, which generates from the input data a convolution kernel corresponding to each time step; features are extracted from the input data using these convolution kernels.

5. The time series early classification method according to claim 4, characterized in that the convolution kernel generation module comprises a first convolution layer, a rectified linear unit, a batch normalization layer, and a second convolution layer connected in sequence.

6. The time series early classification method according to any one of claims 1 to 4, characterized in that computing the probability exit threshold from the classification probabilities comprises:
sorting the classification probabilities and removing duplicates;
taking the midpoints of adjacent classification probabilities to obtain a series of candidate thresholds; and
selecting the candidate threshold with the minimum cost as the probability exit threshold.

7. The time series early classification method according to claim 5, characterized in that the cost of a candidate threshold is computed as: Cost_β = α*(1 - Acc_β) + (1 - α)*Earliness_β, where Acc_β is the accuracy obtained when exiting as soon as the maximum classification probability exceeds the candidate threshold β, Earliness_β is the earliness obtained when exiting as soon as the maximum classification probability exceeds the candidate threshold β, and α is a weight coefficient.

8. The time series early classification method according to claim 6, characterized in that α is 0.8.

9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory, characterized in that the processor executes the computer program to implement the steps of the method according to any one of claims 1 to 8.

10. A computer-readable storage medium on which a computer program/instructions are stored, characterized in that, when executed by a processor, the computer program/instructions implement the steps of the method according to any one of claims 1 to 8.
CN202211163048.5A 2022-09-23 2022-09-23 Time series early classification method, terminal device and storage medium Pending CN115526249A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211163048.5A CN115526249A (en) 2022-09-23 2022-09-23 Time series early classification method, terminal device and storage medium

Publications (1)

Publication Number Publication Date
CN115526249A true CN115526249A (en) 2022-12-27

Family

ID=84699301


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination