CN110071913A

CN110071913A - A time series anomaly detection method based on unsupervised learning

Info

Publication number: CN110071913A
Application number: CN201910234623.8A
Authority: CN
Inventors: 杨恺; 刘音希; 窦绍瑜
Original assignee: Tongji University
Current assignee: Tongji University
Priority date: 2019-03-26
Filing date: 2019-03-26
Publication date: 2019-07-30
Anticipated expiration: 2039-03-26
Also published as: CN110071913B

Abstract

The present invention relates to a time series anomaly detection method based on unsupervised learning, comprising: segmenting the time series data at the positions where it changes significantly, and padding each resulting data segment to a set length; training a neural network for anomaly detection using the multiple segmented and padded data segments of a time series recorded under normal conditions; feeding the multiple segmented and padded data segments of the time series to be detected into the anomaly detection model, which outputs an anomaly score; and judging whether the anomaly score exceeds a threshold: if so, an anomaly is judged to have occurred; otherwise, no anomaly is judged to have occurred. Compared with the prior art, the present invention has the advantages of not relying on labeled anomalous data, not losing data information, and good performance.

Description

A Time Series Anomaly Detection Method Based on Unsupervised Learning

Technical Field

The invention relates to an anomaly detection method, and in particular to a time series anomaly detection method based on unsupervised learning.

Background

Anomaly detection is a means of detecting anomalies in data, where an "anomaly" is a pattern that does not conform to normal behavior. For example, in network traffic analysis, the normal pattern is ordinary network access behavior and the abnormal pattern is the behavior of a network intruder. Anomaly detection is applied in many fields, such as healthcare, network security, financial security, and system maintenance.

A time series is a sequence of data of the form <timestamp, data>. Time series are often used to record system operating status, human health data, and similar data in real time. By analyzing time series data, one can determine the state a system is in and analyze its behavior to support human decision-making. In practice, many systems record their operating status as time series, such as website traffic and server CPU status. In healthcare, electrocardiogram data and data on the progression of disease are likewise represented as time series.

Anomalies in a time series often reflect anomalies in the underlying system. For example, in a website system, database blocking or deadlock shows up in the database monitoring data; similarly, abnormalities caused by heart disease show up in electrocardiogram data. Anomaly detection on time series data therefore helps people discover anomalies as early as possible and take appropriate measures to avoid them.

At present, anomaly detection methods fall into two main categories: supervised and unsupervised. Supervised methods require a large amount of data labeled as anomalous for model training. However, anomalies tend to be sporadic, so a large amount of anomalous data is hard to obtain in practice. We therefore consider an unsupervised approach to anomaly detection.

Summary of the Invention

The purpose of the present invention is to overcome the above defects of the prior art by providing a time series anomaly detection method based on unsupervised learning.

The purpose of the present invention can be achieved through the following technical solution:

A time series anomaly detection method based on unsupervised learning, comprising:

segmenting the time series data at the positions where it changes significantly, and padding each resulting data segment to a set length;

training an anomaly detection model using, as input, the multiple segmented and padded data segments of a time series recorded in the normal state;

feeding the multiple segmented and padded data segments of the time series to be detected into the anomaly detection model, which outputs an anomaly score;

judging whether the anomaly score exceeds a threshold: if so, an anomaly is judged to have occurred; otherwise, no anomaly is judged to have occurred.

Segmenting the time series data at the positions where it changes significantly specifically comprises:

finding all extreme points of the time series data;

then taking the positions of the extreme points with large absolute values as cut points and splitting the time series into multiple data segments, where the cut points are determined by a manually set threshold on the absolute value of the extreme points.

The anomaly detection model comprises a data compressor and a Gaussian mixture model estimator. The data compressor uses a many-to-many LSTM network structure, and the Gaussian mixture model estimator uses a multilayer perceptron structure.

The compression process of the data compressor comprises:

compressing and reconstructing the data segment;

computing the relative distance and the cosine distance between the data before compression and after reconstruction;

combining the relative distance, the cosine distance, and the output of the LSTM network's hidden-layer units into the input of the Gaussian mixture model estimator.

The mathematical expression of the relative distance is:

where r is the relative distance, L is the length of the time series contained in the data segment, x_i is an element of the time series contained in the data segment, and x′ is an element of the reconstructed time series.
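The expression itself is not reproduced in this text. One form consistent with the symbols defined above, offered as an assumption rather than as the patent's exact formula, is the Euclidean norm of the reconstruction error divided by the norm of the original segment:

```latex
r = \frac{\lVert x - x' \rVert_2}{\lVert x \rVert_2}
  = \frac{\sqrt{\sum_{i=1}^{L} \left(x_i - x'_i\right)^2}}
         {\sqrt{\sum_{i=1}^{L} x_i^2}}
```

Under this form, r is 0 for a perfect reconstruction and grows as the reconstruction drifts away from the original segment.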

The mathematical expression of the cosine distance is:

where c is the cosine distance, ||·|| is the norm, x_i is an element of the time series contained in the data segment, and x′ is an element of the reconstructed time series.
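The expression itself is not reproduced in this text. Assuming the usual definition of the cosine between two vectors, which matches the norm symbol defined above, it would read:

```latex
c = \frac{\sum_{i=1}^{L} x_i \, x'_i}{\lVert x \rVert \, \lVert x' \rVert}
```

With this definition, c equals 1 when the reconstructed segment points in the same direction as the original, regardless of scale, which is why it complements the relative distance rather than duplicating it.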

The training process of the Gaussian mixture model estimator comprises:

receiving the output of the data compressor and mapping it to a K-dimensional vector, where K is the number of Gaussian distributions in the model;

obtaining the mixture probability, mean, and covariance of each Gaussian distribution from the elements of the K-dimensional vector.

The detection process of the Gaussian mixture model comprises:

receiving the output of the data compressor and computing the anomaly score.

The mathematical expression of the anomaly score is:

where Score(z) is the anomaly score, φ_k is the mixture probability of the k-th Gaussian distribution, Σ_k is the covariance of the k-th Gaussian distribution, z is the output of the data compressor, μ_k is the mean of the k-th Gaussian distribution, and Σ_k⁻¹ is the inverse matrix of Σ_k.
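The expression itself is not reproduced in this text. A form consistent with the quantities it is said to contain (mixture probabilities, means, covariances, and inverse covariances) is the negative log-likelihood of z under the Gaussian mixture. The following is a sketch under that assumption, with φ_k, μ_k, and Σ_k denoting the mixture probability, mean, and covariance of the k-th component:

```latex
\mathrm{Score}(z) = -\log\left(
  \sum_{k=1}^{K} \phi_k \,
  \frac{\exp\left(-\tfrac{1}{2}\,(z - \mu_k)^{\top} \Sigma_k^{-1} (z - \mu_k)\right)}
       {\sqrt{\left|\,2\pi\,\Sigma_k\,\right|}}
\right)
```

A sample that lies in a low-density region of the mixture gets a large score, which is what the threshold test relies on.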

The mixture probability of the k-th Gaussian distribution is:

The mean of the k-th Gaussian distribution is:

The covariance of the k-th Gaussian distribution is:

where N is the total number of training samples, z_i is the i-th training sample, and the weight of the i-th training sample for the k-th distribution is the k-th component of the K-dimensional vector obtained for that sample.
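The three expressions are not reproduced in this text. Assuming the standard weighted estimates for a mixture model, and writing γ_ik (a hypothetical symbol, not taken from the patent) for the k-th component of the K-dimensional vector of the i-th sample, they would take the form:

```latex
\phi_k = \frac{1}{N} \sum_{i=1}^{N} \gamma_{ik},
\qquad
\mu_k = \frac{\sum_{i=1}^{N} \gamma_{ik}\, z_i}{\sum_{i=1}^{N} \gamma_{ik}},
\qquad
\Sigma_k = \frac{\sum_{i=1}^{N} \gamma_{ik}\, (z_i - \mu_k)(z_i - \mu_k)^{\top}}
                {\sum_{i=1}^{N} \gamma_{ik}}
```

If each row of weights sums to 1 over k, the mixture probabilities φ_k also sum to 1.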

The data compressor and the Gaussian mixture model estimator are trained end to end, with the following training objective function:

where J is the objective function, λ₁ and λ₂ are manually set parameters, x_i is the time series contained in the i-th data segment, x′_i is the time series reconstructed from it, and the last term is a penalty term.
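The objective is not reproduced in this text. A form consistent with the description (a reconstruction term, a term weighted by λ₁, and a penalty term weighted by λ₂) is sketched below. The names ℓ for the per-segment reconstruction loss and P for the penalty are assumptions introduced here, as is the choice of the anomaly score as the λ₁ term:

```latex
J = \frac{1}{N} \sum_{i=1}^{N} \ell\left(x_i, x'_i\right)
  + \frac{\lambda_1}{N} \sum_{i=1}^{N} \mathrm{Score}(z_i)
  + \lambda_2 \, P(\Sigma)
```

Read this way, λ₁ trades reconstruction quality against how well the compressed samples fit the mixture, and λ₂ controls how strongly degenerate covariances are discouraged.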

Compared with the prior art, the present invention has the following beneficial effects:

1) Before model training and anomaly detection, the time series data is segmented at the positions where it changes significantly, and the segmented sequences are used for model training. Conventional anomaly detection methods slide a fixed-length time window to select time slices, which produces a large amount of redundant information in the segmented sequences and hinders feature learning by the neural network. In addition, a fixed-length window does not guarantee that the data inside it carries a fixed meaning, so time series with similar physical meaning cannot be compared.

2) A density-estimation-based approach is used: the segmented training samples are treated as samples drawn from an unknown Gaussian mixture distribution, and a neural network estimates the Gaussian mixture model of that distribution. Conventional methods consider only the probability distribution of the data as a whole and ignore the different feature distributions of its individual segments.

3) In the training stage, the segmented data is fed into a many-to-many recurrent neural network that reconstructs the training samples. The output of the last time step of the network's hidden layer, together with the relative distance and the cosine distance between the reconstructed and original sequences, is fed into a neural network that estimates the parameters of the Gaussian mixture model. Conventional methods use only the reconstruction error as the basis for estimating the Gaussian mixture model.

Description of the Drawings

FIG. 1 is a schematic flow chart of the main steps of the present invention;

FIG. 2 is a flow chart of model training;

FIG. 3 is a schematic diagram of the structure of the neural network model used in the present invention;

FIG. 4 is a flow chart of anomaly prediction;

FIG. 5 is a schematic diagram comparing the performance of the method of the present invention with that of existing methods.

Detailed Description

The present invention is described in detail below with reference to the accompanying drawings and a specific embodiment. This embodiment is implemented on the premise of the technical solution of the present invention and gives a detailed implementation and a specific operating procedure, but the protection scope of the present invention is not limited to the embodiment described below.

A time series anomaly detection method based on unsupervised learning consists mainly of two stages, model training and anomaly detection, as shown in FIG. 1, and comprises:

segmenting the time series data at the positions where it changes significantly, and padding each resulting data segment to a set length. To this end, the flow chart of the model training stage of the present invention is shown in FIG. 2, in which data preprocessing comprises two steps:

Data segmentation: first find all extreme points of the sequence, then take the positions of the extreme points with large absolute values as cut points and split the time series into multiple data segments, where the cut points are determined by a manually set threshold on the absolute value of the extreme points.

Data padding: pad each of the segmented sequences with 0 up to the input length of the anomaly detection model.

The segmented and padded data segments are each used as independent samples for training the subsequent model.
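The two preprocessing steps can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the use of strict sign changes in the first difference to find extreme points, the choice to start a new segment at each cut point, and right-side zero padding are all assumptions.

```python
import numpy as np

def find_extrema(x):
    """Return indices of interior local maxima and minima of x (assumed definition)."""
    x = np.asarray(x, dtype=float)
    d = np.diff(x)
    # A sign change between consecutive first differences marks an extreme point.
    return np.where(d[:-1] * d[1:] < 0)[0] + 1

def segment_series(x, abs_threshold):
    """Split x at extreme points whose absolute value exceeds abs_threshold."""
    x = np.asarray(x, dtype=float)
    cuts = [int(i) for i in find_extrema(x) if abs(x[i]) > abs_threshold]
    # Each cut point starts a new segment (an assumption about where the cut falls).
    return [seg for seg in np.split(x, cuts) if len(seg) > 0]

def pad_segments(segments, input_len):
    """Right-pad every segment with zeros up to the model input length."""
    out = np.zeros((len(segments), input_len))
    for row, seg in zip(out, segments):
        if len(seg) > input_len:
            raise ValueError("segment is longer than the model input length")
        row[:len(seg)] = seg
    return out

# Toy series, invented for illustration only.
series = [0.0, 1.0, 5.0, 1.0, 0.0, -1.0, 0.0, 6.0, 2.0]
segments = segment_series(series, abs_threshold=3.0)
batch = pad_segments(segments, input_len=8)
print([len(s) for s in segments], batch.shape)
```

In this toy run the extreme point at value -1.0 is ignored because its absolute value is below the threshold, so only the two large peaks produce cuts.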

training a neural network for anomaly detection using the multiple segmented and padded data segments of a time series recorded in the normal state.

As shown in FIG. 3, the anomaly detection model comprises a data compressor and a Gaussian mixture model estimator. The data compressor uses a many-to-many LSTM network structure.

The number of time steps used in the data compressor's model structure is larger than every possible input sample length. A time series sample input to the LSTM model is denoted x = [x₁, x₂, …, x_L], and the time series reconstructed by the LSTM network is denoted x′ = [x′₁, x′₂, …, x′_L], where L is the length of the time series. The loss function for training the LSTM network is as follows:

where x_i is the i-th element of a time series sample, x′_i is the i-th element of the reconstructed time series sample, and L is the length of the time series.
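The loss expression is not reproduced in this text. Since it is described only in terms of the elements x_i, x′_i and the length L, a mean squared reconstruction error is one plausible reading. The following is an assumption, written with ℓ for the loss to avoid a clash with the length L:

```latex
\ell(x, x') = \frac{1}{L} \sum_{i=1}^{L} \left(x_i - x'_i\right)^2
```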

The compression process comprises:

compressing and reconstructing the data segment;

The mathematical expression of the relative distance is:

The mathematical expression of the cosine distance is:
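The two expressions are not reproduced in this text. The sketch below shows how the two reconstruction features and the estimator input could be computed, assuming the relative distance is the Euclidean error norm divided by the norm of the original segment and the cosine distance is the usual cosine between the two vectors. The function names and the order in which the features are concatenated are hypothetical.

```python
import numpy as np

def relative_distance(x, x_rec):
    """Assumed form: ||x - x'|| / ||x|| with Euclidean norms."""
    x, x_rec = np.asarray(x, dtype=float), np.asarray(x_rec, dtype=float)
    return float(np.linalg.norm(x - x_rec) / np.linalg.norm(x))

def cosine_distance(x, x_rec):
    """Assumed form: (x . x') / (||x|| ||x'||)."""
    x, x_rec = np.asarray(x, dtype=float), np.asarray(x_rec, dtype=float)
    return float(x @ x_rec / (np.linalg.norm(x) * np.linalg.norm(x_rec)))

def build_estimator_input(hidden, x, x_rec):
    """Concatenate the LSTM hidden output with the two reconstruction features."""
    features = [relative_distance(x, x_rec), cosine_distance(x, x_rec)]
    return np.concatenate([np.asarray(hidden, dtype=float), features])

# Toy values, invented for illustration only.
x = [3.0, 4.0]
x_rec = [4.0, 3.0]
hidden = [0.2, -0.1, 0.7]   # stand-in for the last hidden-state output
z = build_estimator_input(hidden, x, x_rec)
print(z)
```

Both functions divide by a norm, so an all-zero segment would need separate handling in practice.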

The Gaussian mixture model estimator uses a multilayer perceptron (MLP) structure. Given the number K of Gaussian distributions used by the Gaussian mixture model, the estimator estimates three parameters for these K Gaussian distributions: the mixture probability Φ, the mean μ, and the covariance Σ.

The parameter estimation process is as follows:

(1) First, a multilayer neural network maps the input sample to a K-dimensional vector, which determines the data used to estimate each Gaussian distribution. The mapping is:

p = MLN(z; θ)

where z is the data input to the Gaussian mixture model estimator, MLN(·) is a multilayer neural network with parameters θ, softmax(·) is the softmax function, and the softmax output of p is the sample used to estimate the parameters of the Gaussian mixture model.

(2) The parameters of the Gaussian mixture model, namely the mixture probability Φ, the mean μ, and the covariance Σ, are estimated with the following formulas:

where φ_k, μ_k, and Σ_k are the mixture probability, mean, and covariance of the k-th Gaussian distribution, respectively; z_i is the i-th training sample; N is the total number of training samples; and the weight of the i-th training sample for the k-th distribution is the k-th component of the K-dimensional vector obtained for that sample.
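The estimation formulas are not reproduced in this text. The sketch below assumes the standard membership-weighted estimates of a Gaussian mixture, where each row of `gamma` is the softmax output for one sample. The names `softmax`, `estimate_gmm`, and `gamma` are hypothetical and not taken from the patent.

```python
import numpy as np

def softmax(p):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    p = np.asarray(p, dtype=float)
    e = np.exp(p - p.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def estimate_gmm(z, gamma):
    """Weighted estimates of mixture probabilities, means, and covariances.

    z:     (N, d) samples fed to the estimator
    gamma: (N, K) membership weights, each row summing to 1
    """
    z, gamma = np.asarray(z, dtype=float), np.asarray(gamma, dtype=float)
    n_k = gamma.sum(axis=0)                                  # (K,)
    phi = n_k / z.shape[0]                                   # mixture probabilities
    mu = (gamma.T @ z) / n_k[:, None]                        # (K, d) means
    diff = z[:, None, :] - mu[None, :, :]                    # (N, K, d)
    sigma = np.einsum("nk,nki,nkj->kij", gamma, diff, diff)  # weighted outer products
    sigma /= n_k[:, None, None]                              # (K, d, d) covariances
    return phi, mu, sigma

# Toy data, invented for illustration only.
z = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]])
logits = np.array([[4.0, 0.0], [3.0, 0.0], [0.0, 5.0], [0.0, 4.0]])
phi, mu, sigma = estimate_gmm(z, softmax(logits))
print(phi, mu, sigma.shape)
```

Because the weights are soft, every sample contributes a little to every component. With one-hot weights the estimates reduce to per-cluster sample means and covariances.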

The anomaly score output by the Gaussian mixture model estimator is given by the following formula:

where z is the data input to the Gaussian mixture model estimator, K is the given number of Gaussian distributions, and φ_k, μ_k, and Σ_k are the mixture probability, mean, and covariance of the k-th Gaussian distribution, respectively.
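The score formula is not reproduced in this text. The sketch below assumes the score is the negative log-likelihood of z under the mixture, which uses exactly the quantities listed above. The function name, the small `eps` added to the covariance diagonal, and the log-sum-exp evaluation are choices made here for numerical safety, not details from the patent.

```python
import numpy as np

def anomaly_score(z, phi, mu, sigma, eps=1e-6):
    """Negative log-likelihood of z under a Gaussian mixture (assumed score)."""
    z = np.asarray(z, dtype=float)
    d = z.shape[0]
    log_terms = []
    for k in range(len(phi)):
        # eps keeps the covariance invertible if a component is degenerate.
        cov = np.asarray(sigma[k], dtype=float) + eps * np.eye(d)
        diff = z - np.asarray(mu[k], dtype=float)
        maha = diff @ np.linalg.solve(cov, diff)          # (z-mu)^T cov^-1 (z-mu)
        _, logdet = np.linalg.slogdet(2.0 * np.pi * cov)  # log |2*pi*cov|
        log_terms.append(np.log(phi[k]) - 0.5 * maha - 0.5 * logdet)
    log_terms = np.array(log_terms)
    m = log_terms.max()
    # log-sum-exp avoids underflow for samples far from every component.
    return float(-(m + np.log(np.exp(log_terms - m).sum())))

# Toy mixture, invented for illustration only.
phi = [0.5, 0.5]
mu = [[0.0, 0.0], [5.0, 5.0]]
sigma = [np.eye(2), np.eye(2)]
near = anomaly_score([0.1, 0.0], phi, mu, sigma)
far = anomaly_score([20.0, 20.0], phi, mu, sigma)
print(near, far)
```

The point far from both component means receives the larger score, which is the behavior the threshold test depends on.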

The data compressor and the Gaussian mixture model estimator are trained end to end, with the following training objective function:

where J is the objective function, λ₁ and λ₂ are manually set parameters, x_i is the time series contained in the i-th data segment, x′_i is the time series reconstructed from it, and the last term is a penalty term given by the following formula:

where d is the dimension of the sample z input to the Gaussian mixture model estimator, and K is the given number of Gaussian distributions.
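Neither the objective nor the penalty formula is reproduced in this text. The sketch below assumes the penalty sums the reciprocals of the diagonal covariance entries over all K components and d dimensions, which is consistent with the penalty being described in terms of d and K. It also assumes a squared-error reconstruction term and the anomaly score as the λ₁ term. All names and the toy values of `lam1` and `lam2` are hypothetical.

```python
import numpy as np

def covariance_penalty(sigma):
    """Assumed penalty: sum over k and j of 1 / sigma[k, j, j]."""
    diag = np.diagonal(np.asarray(sigma, dtype=float), axis1=1, axis2=2)  # (K, d)
    return float(np.sum(1.0 / diag))

def objective(x, x_rec, scores, sigma, lam1, lam2):
    """Reconstruction error + lam1 * mean score + lam2 * covariance penalty."""
    x, x_rec = np.asarray(x, dtype=float), np.asarray(x_rec, dtype=float)
    recon = np.mean(np.sum((x - x_rec) ** 2, axis=1))   # assumed squared error
    return float(recon + lam1 * np.mean(scores) + lam2 * covariance_penalty(sigma))

# Toy values, invented for illustration only.
x = np.array([[1.0, 2.0], [3.0, 4.0]])
x_rec = np.array([[1.0, 2.0], [3.0, 2.0]])
scores = np.array([1.0, 3.0])
sigma = np.array([[[1.0, 0.0], [0.0, 2.0]],
                  [[4.0, 0.0], [0.0, 4.0]]])
J = objective(x, x_rec, scores, sigma, lam1=0.1, lam2=0.005)
print(J)
```

The penalty grows without bound as any diagonal covariance entry approaches zero, so minimizing J discourages components that collapse onto a single point.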

The data segment used for anomaly detection is determined as follows: compute the variance of the Gaussian distribution produced by training on each of the segmented data segments, and select the data segment that yields the smallest variance as the data fed to the anomaly detection model in the anomaly detection stage.

The flow chart of the anomaly detection stage is shown in FIG. 4, in which data preprocessing comprises the following steps:

(1) Data segmentation: first find all extreme points of the sequence, then take the position of the extreme point with the largest absolute value as the cut point.

(2) Data padding: pad each of the segmented sequences with 0 up to the input length of the anomaly detection model.

(3) Select the data segment that was determined in the model training stage for use in anomaly detection.

The neural network model is the anomaly detection model trained in the model training stage, and τ is a manually specified classification threshold for the anomaly score.
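The final decision reduces to comparing each anomaly score with τ. A minimal sketch, in which the function name and the example scores are invented for illustration:

```python
import numpy as np

def detect(scores, tau):
    """Flag a sample as anomalous when its score exceeds the threshold tau."""
    return np.asarray(scores, dtype=float) > tau

# Toy scores, invented for illustration only.
flags = detect([0.5, 3.2, 1.0], tau=2.0)
print(flags.tolist())
```

Because τ is set by hand, raising it trades missed anomalies for fewer false alarms, and lowering it does the opposite.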

The performance of the above method was evaluated on the Two-lead ECG dataset, using ROC and AUC as performance metrics. The proposed method achieves an AUC of 0.8396573. FIG. 5 compares the performance of the proposed method with that of other methods on the same dataset, where Seq2Cluster denotes the proposed method. On this dataset the proposed method outperforms the existing unsupervised anomaly detection methods of the same kind that it is compared with, which indicates that the anomaly detection method described in this patent is an advance over them.
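For reference, the AUC can be computed from anomaly scores and ground-truth labels as the probability that a randomly chosen anomalous sample scores higher than a randomly chosen normal one. The sketch below uses that pairwise definition, with ties counted as one half. The function name and the toy labels and scores are invented here and are unrelated to the ECG results reported above.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()   # anomalous scored above normal
    ties = (pos[:, None] == neg[None, :]).sum()     # ties count one half
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))

# Toy data, invented for illustration only (1 = anomalous, 0 = normal).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)
```

The metric is threshold-free, so it evaluates the ranking produced by the anomaly score independently of the choice of τ.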

Claims

1. A time series anomaly detection method based on unsupervised learning, characterized in that it comprises:

segmenting the time series data at the positions where it changes significantly, and padding each resulting data segment to a set length;

training an anomaly detection model using, as input, the multiple segmented and padded data segments of a time series recorded in the normal state;

feeding the multiple segmented and padded data segments of the time series to be detected into the anomaly detection model, which outputs an anomaly score;

judging whether the anomaly score exceeds a threshold: if so, an anomaly is judged to have occurred; otherwise, no anomaly is judged to have occurred.

2. The time series anomaly detection method based on unsupervised learning according to claim 1, characterized in that segmenting the time series data at the positions where it changes significantly specifically comprises:

finding all extreme points of the time series data;

then taking the positions of the extreme points whose absolute value exceeds a set threshold as cut points and splitting the time series into multiple data segments.

3. The time series anomaly detection method based on unsupervised learning according to claim 1, characterized in that the anomaly detection model comprises a data compressor and a Gaussian mixture model estimator, the data compressor uses a many-to-many LSTM network structure, and the Gaussian mixture model estimator uses a multilayer perceptron structure.

4. The time series anomaly detection method based on unsupervised learning according to claim 3, characterized in that the compression process of the data compressor comprises:

compressing and reconstructing the data segment;

computing the relative distance and the cosine distance between the data before compression and after reconstruction;

combining the relative distance, the cosine distance, and the output of the LSTM network's hidden-layer units into the input of the Gaussian mixture model estimator.

5. The time series anomaly detection method based on unsupervised learning according to claim 4, characterized in that the mathematical expression of the relative distance is:

where r is the relative distance, L is the length of the time series contained in the data segment, x_i is an element of the time series contained in the data segment, and x′ is an element of the reconstructed time series.

6. The time series anomaly detection method based on unsupervised learning according to claim 4, characterized in that the mathematical expression of the cosine distance is:

where c is the cosine distance, ||·|| is the norm, x_i is an element of the time series contained in the data segment, and x′ is an element of the reconstructed time series.

7. The time series anomaly detection method based on unsupervised learning according to claim 4, characterized in that the training process of the Gaussian mixture model estimator comprises:

receiving the output of the data compressor and mapping it to a K-dimensional vector using a multilayer neural network, where K is the number of Gaussian distributions in the model;

obtaining, from the elements of the K-dimensional vector and using the multilayer perceptron model, the mixture probability, mean, and covariance of each Gaussian distribution;

and the detection process of the Gaussian mixture model comprises:

receiving the output of the data compressor and computing the anomaly score.

8. The time series anomaly detection method based on unsupervised learning according to claim 7, characterized in that the mathematical expression of the anomaly score is:

where Score(z) is the anomaly score, φ_k is the mixture probability of the k-th Gaussian distribution, Σ_k is the covariance of the k-th Gaussian distribution, z is the output of the data compressor, μ_k is the mean of the k-th Gaussian distribution, and Σ_k⁻¹ is the inverse matrix of Σ_k.

9. The time series anomaly detection method based on unsupervised learning according to claim 8, characterized in that

the mixture probability of the k-th Gaussian distribution is:

the mean of the k-th Gaussian distribution is:

the covariance of the k-th Gaussian distribution is:

where N is the total number of training samples, z_i is the i-th training sample, and the weight of the i-th training sample for the k-th distribution is the k-th component of the K-dimensional vector obtained for that sample.

10. The time series anomaly detection method based on unsupervised learning according to claim 3, characterized in that the data compressor and the Gaussian mixture model estimator are trained end to end, with the following training objective function:

where J is the objective function, λ₁ and λ₂ are manually set parameters, x_i is the time series contained in the i-th data segment, x′_i is the time series reconstructed from it, and the last term is a penalty term.