CN114970750A - Method, system, device and medium for detecting time series anomalies - Google Patents

Method, system, device and medium for detecting time series anomalies

Info

Publication number
CN114970750A
Authority
CN
China
Prior art keywords
data
encoder
model
mixture model
gaussian mixture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210688277.2A
Other languages
Chinese (zh)
Inventor
陈静静
吴睿振
王凛
孙华锦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Original Assignee
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd filed Critical Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority to CN202210688277.2A priority Critical patent/CN114970750A/en
Publication of CN114970750A publication Critical patent/CN114970750A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention discloses a method, a system, a device and a medium for detecting anomalies in a time series. The method comprises: inputting the time series into a self-encoder for training, and acquiring abstract features of the time series from the trained self-encoder; inputting the abstract features of the time series into a Gaussian mixture model for training to obtain a corresponding Gaussian mixture model; inputting the data to be detected into the trained self-encoder to obtain abstract features of the data to be detected; inputting the abstract features of the data to be detected into the corresponding Gaussian mixture model to determine the sub-Gaussian model to which those features belong; and acquiring the parameters of that sub-Gaussian model, and judging, based on the parameters and the three-sigma rule, whether the data to be detected are abnormal. With this scheme, time series anomaly detection is achieved without having to label the normal and abnormal data in the historical data in advance.

Description

Method, system, device and medium for detecting time series anomalies
Technical Field
The present invention relates to the field of data detection, and in particular, to a method, system, device, and medium for detecting time series anomalies.
Background
Time series are among the most common forms of data, and most current time series analysis focuses on prediction. For some problems, however, the morphological comparison of time series is equally important. For example, the daily average prices of various commodities (or the daily closing prices of stocks) form time series, and evaluating the consistency of their price trends can be cast as a time series morphological clustering problem.
Anomaly detection is one of the most actively studied directions in time series data analysis; it is defined as the process of identifying abnormal events or behaviors in an otherwise normal time series. Anomaly detection is widely used in industry, for example in quantitative trading, network security detection, autonomous cars, and the routine maintenance of large industrial equipment. Taking an in-orbit spacecraft as an example, given the expense and complexity of the system, a hazard that goes undetected may result in serious or even irreparable damage. An anomaly may develop into a serious fault at any time, so accurate and timely anomaly detection allows the astronauts to take measures as early as possible.
The problem of time series anomaly detection is usually formulated as finding data points that are anomalous relative to some standard or regular signal. There are usually multiple anomaly types, and from a business point of view only the most important are of interest, such as unexpected peaks, unexpected valleys, and abrupt trend changes. In principle, many anomalies can be identified manually. However, when the business mix is complex and the time series are large in scale, judging by traditional manual inspection or by simple absolute-value methods such as year-over-year and period-over-period comparisons becomes impractical. It is therefore important to understand time series anomaly detection methods systematically when facing a wide variety of industrial-scale scenarios.
Basically, anomaly detection algorithms fall into two categories. The first uses a classification algorithm to label each time point as anomalous or non-anomalous; its drawback is that the anomalous/non-anomalous status of the historical data must be labeled manually, so it clearly depends on human judgment. The second uses a prediction algorithm to forecast the signal at a point, then tests whether the difference between the actual and predicted values at that point is large enough to be treated as an anomaly; its drawback is that it depends on the accuracy of the prediction algorithm.
Moreover, both classification and prediction algorithms require historical anomaly data as samples for model training. In practice, however, most data are normal and only a small fraction are abnormal, so how to train an anomaly detection model in the absence of anomalous data is a difficult problem.
Disclosure of Invention
In view of this, the present invention provides a method, a system, a device, and a medium for performing anomaly detection on a time series. The time series is reconstructed with an autoencoder, which guarantees that the feature spaces of all time series have the same dimension during reconstruction; the values of these feature spaces are then used as the input of a GMM algorithm, which performs the anomaly detection on the time series. Normal and abnormal data do not need to be labeled in advance, and the computation and noise are reduced while the original information of the time series is preserved.
Based on the above object, an aspect of the embodiments of the present invention provides a method for detecting an abnormality of a time series, which specifically includes the following steps:
inputting the time sequence into a self-encoder for training, and acquiring abstract features of the time sequence from the trained self-encoder;
inputting the abstract characteristics of the time sequence into a Gaussian mixture model for training to obtain a corresponding Gaussian mixture model;
inputting the data to be detected into a trained self-encoder to obtain abstract characteristics of the data to be detected;
inputting the abstract characteristics of the data to be detected into the corresponding Gaussian mixture model to determine a sub-Gaussian model of the corresponding Gaussian mixture model to which the abstract characteristics of the data to be detected belong;
and acquiring parameters of the sub-Gaussian model, and judging whether the data to be detected is abnormal data or not based on the parameters and a three-sigma rule.
In some embodiments, inputting the abstract features of the time series into a gaussian mixture model for training, and obtaining a corresponding gaussian mixture model includes:
determining a range of k in the Gaussian mixture model, wherein k represents the number of sub-Gaussian models in the Gaussian mixture model;
inputting the abstract features of the time series into the Gaussian mixture model corresponding to each k so as to train each corresponding Gaussian mixture model;
and calculating the distortion degree corresponding to each corresponding Gaussian mixture model based on the elbow rule so as to determine the optimal Gaussian mixture model.
In some embodiments, inputting the abstract features of the time series into the Gaussian mixture model corresponding to each k to train each of the corresponding Gaussian mixture models comprises:
initializing all parameters of the corresponding Gaussian mixture model, wherein the parameters comprise the expectation of each sub-Gaussian model, the variance or covariance of each sub-Gaussian model, and the occurrence probability of each sub-Gaussian model in the mixture model;
calculating the probability of each abstract feature of the time series from each sub-Gaussian model according to the parameters;
updating each of the initialized parameters based on the probability of each of the abstract features from each of the sub-Gaussian models;
and returning to the step of calculating the probability of each abstract feature of the time sequence from each sub-Gaussian model according to the parameters until the updated parameters are converged.
In some embodiments, obtaining parameters of the sub-gaussian model, and determining whether the data to be detected is abnormal data based on the parameters and three sigma rule includes:
obtaining the mean value of the sub-Gaussian model;
determining a threshold based on the mean and the three sigma law;
and judging whether the data to be detected is abnormal data or not based on the threshold value.
In some embodiments, inputting the time series to the self-encoder for training comprises:
determining the dimension of a feature space of an auto encoder, the number of hidden layers and input/output dimension in the encoding and decoding processes and a loss function, and inputting the time sequence into the auto encoder;
training an auto-encoder to which the time series is input by minimizing the loss function.
In some embodiments, training the self-encoder by minimizing the loss function comprises:
and adjusting the number of hidden layers, the input/output dimension and the dimension of the feature space of the self-encoder in the encoding and decoding processes according to the size of the loss function, and training the self-encoder after parameters are adjusted until the loss function meeting the condition is obtained.
In some embodiments, inputting the time series to the self-encoder for training comprises: respectively and sequentially inputting a plurality of time sequences into a self-encoder according to a preset time unit for training;
the method for acquiring the abstract characteristics of the time series from the trained self-encoder comprises the following steps: sequentially acquiring abstract features of all the time sequences from a trained self-encoder;
inputting the abstract characteristics of the time sequence into a Gaussian mixture model for training, and obtaining a corresponding Gaussian mixture model comprises the following steps: and inputting all the abstract characteristics of the time series into a Gaussian mixture model to obtain corresponding Gaussian mixture models of the abstract characteristics corresponding to the time series.
In another aspect of the embodiments of the present invention, a system for detecting an abnormality of a time series is further provided, including:
the first training module is configured to input a time sequence into a self-encoder for training, and obtain abstract features of the time sequence from the trained self-encoder;
the second training module is configured to input the abstract features of the time sequence into a Gaussian mixture model for training to obtain a corresponding Gaussian mixture model;
the input module is configured to input the data to be detected into the trained self-encoder to obtain the abstract characteristics of the data to be detected;
the determining module is configured to input the abstract features of the data to be detected into the corresponding Gaussian mixture model so as to determine a sub-Gaussian model of the corresponding Gaussian mixture model to which the abstract features of the data to be detected belong;
and the judging module is configured to acquire the parameters of the sub-Gaussian model and judge whether the data to be detected is abnormal data or not based on the parameters and the three-sigma rule.
In another aspect of the embodiments of the present invention, there is also provided a computer device, including: at least one processor; and a memory storing a computer program executable on the processor, the computer program when executed by the processor implementing the steps of the method as above.
In a further aspect of the embodiments of the present invention, a computer-readable storage medium is also provided, which stores a computer program that, when executed by a processor, implements the above method steps.
The invention has at least the following beneficial technical effects: the time series is input into a self-encoder for training, and the abstract features of the time series are acquired from the trained self-encoder; the abstract features of the time series are input into a Gaussian mixture model for training to obtain a corresponding Gaussian mixture model; the data to be detected are input into the trained self-encoder to obtain their abstract features; these abstract features are input into the corresponding Gaussian mixture model to determine the sub-Gaussian model to which they belong; the parameters of that sub-Gaussian model are acquired, and whether the data to be detected are abnormal is judged based on the parameters and the three-sigma rule. Time series anomaly detection is thereby realized without labeling the normal and abnormal data in the historical data in advance, and the computation and noise are reduced while the original information of the time series is retained.
Drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. The drawings in the following description are obviously only some embodiments of the present invention; for those skilled in the art, other embodiments can be obtained from these drawings without creative effort.
FIG. 1 is a block diagram of an embodiment of a method for detecting anomalies in a time series according to the present invention;
FIG. 2 is a schematic structural diagram of a self-encoder;
FIG. 3 is a diagram illustrating an embodiment of training a self-encoder according to the present invention;
FIG. 4 is a schematic illustration of K value determination based on elbow rule;
FIG. 5 is a flowchart of an embodiment of detecting anomalies in a time series according to the present invention;
FIG. 6 is a diagram illustrating an embodiment of a system for anomaly detection of a time series according to the present invention;
FIG. 7 is a schematic structural diagram of an embodiment of a computer device provided in the present invention;
fig. 8 is a schematic structural diagram of an embodiment of a computer-readable storage medium provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all expressions using "first" and "second" in the embodiments of the present invention are used to distinguish two entities or parameters that share the same name but are not identical. "First" and "second" are merely for convenience of description and should not be construed as limiting the embodiments of the present invention, and subsequent embodiments will not repeat this note.
For better understanding of the embodiments of the present invention, related technical terms in the embodiments of the present invention are first explained.
Time series: a time series (also called a dynamic series) is a sequence formed by arranging the values of the same statistical indicator in the chronological order in which they occur. The main purpose of time series analysis is to predict the future based on existing historical data. Most economic data are given in the form of time series. The time in a time series may be a year, a quarter, a month or any other form of time, depending on the observation;
euclidean distance: euclidean metric (also known as euclidean distance) is a commonly used definition of distance, referring to the true distance between two points in an m-dimensional space, or the natural length of a vector (i.e., the distance of the point from the origin). The euclidean distance in two-dimensional and three-dimensional space is the actual distance between two points;
loss function (loss function): or cost function, is a function that maps the value of a random event or its associated random variable to a non-negative real number to represent the "risk" or "loss" of the random event. In application, the loss function is usually associated with the optimization problem as a learning criterion, i.e. the model is solved and evaluated by minimizing the loss function.
Autoencoder (AutoEncoder, AE for short): an autoencoder is an artificial neural network (ANN) used in semi-supervised and unsupervised learning; its function is to perform representation learning on the input information by taking the input itself as the learning target. An autoencoder learns an efficient encoding of the data in an unsupervised manner. It comprises an encoder process and a decoder process: the encoder process establishes one (or several) hidden layers containing a low-dimensional vector of the input data, and the decoder reconstructs the data from this low-dimensional hidden vector. AE can be understood as lossy compression followed by decompression.
In a general sense, the AutoEncoder has representation-learning capability and is applied to dimensionality reduction and outlier (anomaly) detection. Autoencoders containing convolutional layers can be applied to computer vision problems, including image denoising, neural style transfer, and the like.
Gaussian mixture model (GMM for short): a Gaussian mixture model quantifies an object precisely using Gaussian probability density functions (normal distribution curves); it decomposes the object into several components, each based on a Gaussian probability density function. A Gaussian mixture model can be regarded as a combination of K single Gaussian models, and the identity of the sub-Gaussian model that generated a sample is a hidden variable of the mixture model.
Three-sigma rule: also known as the empirical rule or the 68-95-99.7 rule, it is used to quickly estimate proportions of normally distributed data when the mean and standard deviation are known. Statistically, in a normal distribution the proportions of values lying within one, two and three standard deviations of the mean are approximately 68.27%, 95.45% and 99.73%, respectively, as the snippet below verifies.
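As a quick numerical check (not part of the patent), the three fractions follow directly from the standard normal CDF: the probability of lying within k standard deviations of the mean is erf(k/√2).

```python
import math

for k in (1, 2, 3):
    fraction = math.erf(k / math.sqrt(2.0))  # P(|X - mu| <= k * sigma) for normal X
    print(f"within {k} sigma: {fraction:.4%}")
# within 1 sigma: 68.2689%
# within 2 sigma: 95.4500%
# within 3 sigma: 99.7300%
```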
In a first aspect of the embodiments of the present invention, an embodiment of a method for detecting an abnormality of a time series is provided. As shown in fig. 1, it includes the following steps:
S10, inputting the time series into a self-encoder for training, and acquiring abstract features of the time series from the trained self-encoder;
S20, inputting the abstract features of the time series into a Gaussian mixture model for training to obtain a corresponding Gaussian mixture model;
S30, inputting the data to be detected into the trained self-encoder to obtain the abstract features of the data to be detected;
S40, inputting the abstract features of the data to be detected into the corresponding Gaussian mixture model to determine the sub-Gaussian model of the corresponding Gaussian mixture model to which the abstract features of the data to be detected belong;
S50, acquiring the parameters of the sub-Gaussian model, and judging whether the data to be detected are abnormal data based on the parameters and the three-sigma rule.
In step S10, the dimension reduction process of the auto encoder (Autoencoder) will be described first with reference to fig. 2 and 3.
The self-encoder comprises an encoder process and a decoder process, wherein in the encoder process, a hidden layer (or a plurality of hidden layers) is/are established, dimension reduction is carried out on input data x through an encoder function, and the input data is mapped to a feature space z, namely abstract features. Then in the decoder process, the abstract feature z is mapped back to the original space through the decoder function to obtain a reconstructed sample x'. The parameters of the encoder function and the decoder function are trained by minimizing a defined loss function.
The specific steps for constructing the self-encoder are as follows:
1) determining the dimension of an abstract feature space;
2) determining the number of layers of a hidden layer and input and output dimensions in an encoder process, and building the encoder, wherein an activation function of the encoder process is generally a sigmoid function;
3) determining the number of layers of a hidden layer and input-output dimensionality in a decoder process, and building the decoder, wherein an activation function of the decoder process is generally a sigmoid function;
4) setting a loss function, generally a mean square error;
5) training parameters of an encoder function and a decoder function through a minimum loss function;
6) in the training process, the dimension of the abstract feature space, the number of hidden layers and the input and output dimensions can be adjusted; training stops when a satisfactory loss function is obtained, and the abstract feature z obtained from the training result is used as the input of the subsequent GMM.
As can be seen from the above process, the encoder part of the self-encoder compresses the data by reducing the number of neurons layer by layer, while the decoder part increases the number of neurons layer by layer from the abstract representation of the data and finally reconstructs the input sample. The optimization of the self-encoder requires no label for the samples (time series), i.e., the sample data need not be labeled as normal or abnormal: each sample serves simultaneously as the input and the target output of the neural network, and the abstract feature z of the sample is learned by minimizing the loss function. This unsupervised optimization greatly improves the generality of the model, and the resulting abstract feature z contains all the information of the original input sample. A minimal sketch of such a self-encoder is given below.
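As an illustration only, the following PyTorch sketch builds such a self-encoder with the layer sizes used in the embodiment described later (input dimension m, hidden dimension 24, feature-space dimension 3, sigmoid activations, MSE loss). The optimizer, learning rate and epoch count are assumptions not fixed by the patent, and the sigmoid output layer presumes series scaled to [0, 1].

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Self-encoder with an m -> 24 -> 3 -> 24 -> m layout (per the embodiment)."""
    def __init__(self, m: int, hidden: int = 24, latent: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(m, hidden), nn.Sigmoid(),
            nn.Linear(hidden, latent), nn.Sigmoid(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent, hidden), nn.Sigmoid(),
            nn.Linear(hidden, m), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x)      # abstract feature z
        return self.decoder(z)   # reconstruction x'

def train(model: AutoEncoder, series: torch.Tensor,
          epochs: int = 200, lr: float = 1e-3) -> AutoEncoder:
    # series: shape (n_series, m); the input doubles as the target,
    # so no normal/abnormal labels are needed.
    opt = torch.optim.Adam(model.parameters(), lr=lr)  # optimizer choice is an assumption
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(series), series)
        loss.backward()
        opt.step()
    return model

# After training, the encoder alone yields the abstract features fed to the GMM:
# features = model.encoder(series)   # shape (n_series, 3)
```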
In step S20, a process of training the gaussian mixture model will be described.
I. Defining the parameter information of the Gaussian mixture model

x_j denotes the j-th observation, j = 1, 2, …, N;

K is the number of sub-Gaussian models in the mixture model, k = 1, 2, …, K;

α_k is the probability that an observation belongs to the k-th sub-Gaussian model, with α_k ≥ 0 and ∑_{k=1}^{K} α_k = 1;

φ(x|θ_k) is the Gaussian density function of the k-th sub-Gaussian model, with parameters θ_k = (μ_k, σ_k²); written out,

φ(x|θ_k) = 1/√(2πσ_k²) · exp(−(x − μ_k)²/(2σ_k²));

γ_jk denotes the probability that the j-th observation belongs to the k-th sub-Gaussian model.

Definition of the Gaussian mixture model: the Gaussian mixture model is a probability distribution model of the form

P(x|θ) = ∑_{k=1}^{K} α_k φ(x|θ_k),

whose parameters are θ = {α_k, μ_k, σ_k² : k = 1, …, K}, i.e., the expectation of each sub-Gaussian model, the variance (or covariance) of each sub-Gaussian model, and the probability with which each sub-Gaussian model occurs in the mixture model.
II. Training the Gaussian mixture model
For a single Gaussian model, the parameter τ can be estimated by maximum log-likelihood estimation:

τ = arg max_τ log L(τ).

Assuming the data points are independent, the likelihood function is given by the probability density function:

L(τ) = ∏_{j=1}^{N} φ(x_j|τ).
for the Gaussian mixture model, the log-likelihood function (log-likelihood) is:
Figure BDA0003700574690000102
since the Gaussian mixture model cannot use maximum likelihood estimation to solve the parameters like the single Gaussian model, because for each observed datum it is not actually known to which subdivision it belongs, and therefore the summation is required inside the log, there is an unknown μ for each sub-Gaussian model kkk
The unknown parameters are solved by the EM algorithm.
The EM algorithm is an iterative algorithm. The procedure for updating the Gaussian mixture model parameters through EM iteration (given sample data x_1, x_2, …, x_N and a Gaussian mixture model with K sub-Gaussian models) is:

Step 1: first initialize all the parameters τ_0 = {α_k, μ_k, σ_k² : k = 1, …, K}.
Step 2: according to the current parameters, calculate the probability γ_jk that each observation j comes from sub-Gaussian model k:

γ_jk = α_k φ(x_j|θ_k) / ∑_{k′=1}^{K} α_{k′} φ(x_j|θ_{k′}),  j = 1, …, N; k = 1, …, K.
Step 3: calculate the model parameters of the new iteration:

μ_k = ∑_{j=1}^{N} γ_jk x_j / ∑_{j=1}^{N} γ_jk;

σ_k² = ∑_{j=1}^{N} γ_jk (x_j − μ_k)² / ∑_{j=1}^{N} γ_jk;

α_k = (∑_{j=1}^{N} γ_jk) / N.
Step 4: repeat steps 2 and 3 until convergence, i.e., until ‖τ_{i+1} − τ_i‖ < ε, where ε is a small positive number indicating that the parameters change very little after a further iteration. A sketch of steps 1-4 follows below.
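A minimal numpy transcription of steps 1-4 might read as follows. For brevity it restricts to one-dimensional features (variance rather than covariance), and the initialization in step 1 (uniform weights, means sampled from the data, pooled variance) is an assumption, since the patent does not specify an initialization scheme.

```python
import numpy as np

def gmm_em(x: np.ndarray, K: int, eps: float = 1e-6, max_iter: int = 500):
    """EM for a 1-D Gaussian mixture, following steps 1-4 above."""
    N = len(x)
    rng = np.random.default_rng(0)
    # Step 1: initialize all parameters alpha_k, mu_k, sigma_k^2
    alpha = np.full(K, 1.0 / K)
    mu = rng.choice(x, size=K, replace=False)
    var = np.full(K, x.var() + 1e-12)
    for _ in range(max_iter):
        old = np.concatenate([alpha, mu, var])
        # Step 2: gamma_jk, probability that x_j comes from sub-model k
        dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        weighted = alpha * dens                        # shape (N, K)
        gamma = weighted / weighted.sum(axis=1, keepdims=True)
        # Step 3: model parameters of the new iteration
        Nk = gamma.sum(axis=0)
        mu = (gamma * x[:, None]).sum(axis=0) / Nk
        var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
        var = np.maximum(var, 1e-12)                   # guard against collapse
        alpha = Nk / N
        # Step 4: stop once the parameter change is below epsilon
        if np.linalg.norm(np.concatenate([alpha, mu, var]) - old) < eps:
            break
    return alpha, mu, var
```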
After K Gaussian mixture models are trained based on the process, the distortion degree corresponding to each Gaussian mixture model is calculated through an elbow rule, and therefore the optimal Gaussian mixture model is determined.
The elbow rule is based on a cost function, the sum of the distortion degrees of the classes. The distortion degree of a class is the sum of the squared distances from each member point to the class center: the more compact the members of a class, the smaller its distortion, and the more dispersed they are, the larger it is. To select the number of classes, the elbow rule plots the cost function value for different values of K. As K increases, the average distortion decreases, each class contains fewer samples, and the samples lie closer to their center. However, as K continues to increase, the improvement in average distortion diminishes. The elbow is the value of K at which the improvement in distortion drops most sharply.
Fig. 4 illustrates the determination of the K value based on the elbow rule. The method suits cases where the true K is relatively small: while the chosen K is smaller than the true K, each increase of K by 1 reduces the cost value greatly; once the chosen K exceeds the true K, each further increase of K no longer changes the cost value significantly. The correct K therefore sits at this turning point, like an elbow, and can be read off a plot of K against the cost function. A sketch of this search is given below.
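As a sketch of this search, the snippet below assumes scikit-learn's GaussianMixture as the GMM trainer (the patent prescribes no library) and computes, for each K in the embodiment's range of 3 to 9, the distortion as the sum of squared distances from each feature to the mean of its assigned sub-model.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def elbow_distortions(features: np.ndarray, k_range=range(3, 10)) -> dict:
    # features: abstract features z, shape (n_series, d)
    distortions = {}
    for k in k_range:
        gmm = GaussianMixture(n_components=k, random_state=0).fit(features)
        centers = gmm.means_[gmm.predict(features)]   # mean of each point's sub-model
        distortions[k] = float(((features - centers) ** 2).sum())
    return distortions  # plot K against distortion and read off the elbow
```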
In steps S30 and S40, the data to be detected are sequentially processed based on the trained self-encoder and the optimal gaussian mixture model, and the specific process is as follows:
inputting the data to be detected into a trained self-encoder to obtain abstract characteristics of the data to be detected;
inputting the abstract features of the data to be detected into the best trained Gaussian mixture model GMM(K), judging which sub-Gaussian model of GMM(K) those abstract features belong to, and acquiring the parameters of that sub-Gaussian model, including its mean and variance.
In step S50, the mean of the sub-Gaussian model is obtained, and whether the data to be detected are abnormal is judged based on the range of ±3σ around the mean, where σ is the standard deviation of the sub-Gaussian model per the three-sigma rule, as sketched below.
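The detection-stage decision can be sketched as follows, under the same assumptions (a fitted scikit-learn GaussianMixture with full covariances and a hypothetical encode function returning the abstract feature as a numpy vector); the per-dimension standard deviation is taken from the diagonal of the sub-model's covariance.

```python
import numpy as np

def is_abnormal(x: np.ndarray, encode, gmm) -> bool:
    z = encode(x)                                   # abstract feature of the sample
    k = int(gmm.predict(z.reshape(1, -1))[0])       # sub-Gaussian model z belongs to
    mu = gmm.means_[k]
    sigma = np.sqrt(np.diag(gmm.covariances_[k]))   # per-dimension standard deviation
    # three-sigma rule: abnormal if any coordinate leaves mean +/- 3 sigma
    return bool(np.any(np.abs(z - mu) > 3 * sigma))
```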
The preferred embodiment of the invention suits scenarios with no or very little historical data. The time series is input into a self-encoder for training, and the abstract features of the time series are acquired from the trained self-encoder; the abstract features are input into a Gaussian mixture model for training to obtain a corresponding Gaussian mixture model; the data to be detected are input into the trained self-encoder to obtain their abstract features; these features are input into the corresponding Gaussian mixture model to determine the sub-Gaussian model to which they belong; the parameters of that sub-Gaussian model are obtained, and whether the data to be detected are abnormal is judged based on the parameters and the three-sigma rule. Time series anomaly detection is thereby realized without labeling the normal and abnormal data in the historical data in advance, and the computation and noise are reduced while the original information of the time series is retained.
In some embodiments, inputting the abstract features of the time series into a gaussian mixture model for training, and obtaining a corresponding gaussian mixture model includes:
determining a range of k in the Gaussian mixture model, wherein k represents the number of sub-Gaussian models in the Gaussian mixture model;
inputting the abstract features of the time series into the Gaussian mixture model corresponding to each k so as to train each corresponding Gaussian mixture model;
and calculating the distortion degree corresponding to each corresponding Gaussian mixture model based on the elbow rule so as to determine the optimal Gaussian mixture model.
In some embodiments, inputting the abstract features of the time series into the Gaussian mixture model corresponding to each k to train each of the corresponding Gaussian mixture models comprises:
initializing all parameters of the corresponding Gaussian mixture model, wherein the parameters comprise the expectation of each sub-Gaussian model, the variance or covariance of each sub-Gaussian model, and the occurrence probability of each sub-Gaussian model in the mixture model;
calculating the probability of each abstract feature of the time series from each sub-Gaussian model according to the parameters;
updating each of the initialized parameters based on the probability of each of the abstract features from each of the sub-Gaussian models;
and returning to the step of calculating the probability of each abstract feature of the time sequence from each sub-Gaussian model according to the parameters until the updated parameters are converged.
In some embodiments, obtaining parameters of the sub-gaussian model, and determining whether the data to be detected is abnormal data based on the parameters and three sigma rule includes:
obtaining the mean value of the sub-Gaussian model;
determining a threshold based on the mean and the three sigma law;
and judging whether the data to be detected is abnormal data or not based on the threshold value.
In some embodiments, inputting the time series to the self-encoder for training comprises:
determining the dimension of a feature space of an auto encoder, the number of hidden layers and input/output dimension in the encoding and decoding processes and a loss function, and inputting the time sequence into the auto encoder;
training an auto-encoder to which the time series is input by minimizing the loss function.
Specifically, the relevant parameters required for training the self-encoder are determined, including the dimension of the feature space of the self-encoder, the number of hidden layers and the input/output dimensions in the encoding and decoding processes, the loss function, and the like; after these parameters are determined, the time series is input into the self-encoder for training, and the final trained self-encoder is obtained by minimizing the loss function.
In some embodiments, training the self-encoder to which the time series is input by minimizing the loss function comprises:
and adjusting the number of hidden layers, the input/output dimension and the dimension of the feature space of the self-encoder in the encoding and decoding processes according to the size of the loss function, and training the self-encoder after parameters are adjusted until the loss function meeting the condition is obtained.
In some embodiments, inputting the time series to the self-encoder for training comprises: respectively and sequentially inputting a plurality of time sequences into a self-encoder according to a preset time unit for training;
the method for acquiring the abstract characteristics of the time series from the trained self-encoder comprises the following steps: sequentially acquiring abstract features of all the time sequences from a trained self-encoder;
inputting the abstract characteristics of the time sequence into a Gaussian mixture model for training, and obtaining a corresponding Gaussian mixture model comprises the following steps: and inputting all the abstract characteristics of the time series into a Gaussian mixture model to obtain corresponding Gaussian mixture models of the abstract characteristics corresponding to the time series.
The following describes a specific embodiment of the present invention with reference to yet another specific example.
Fig. 5 is a flow chart illustrating the time series abnormality detection according to the present invention.
For a time series x = (x_1, x_2, …, x_m), the detection is divided into two stages, a training stage and a detection stage. The specific process is described below with reference to fig. 5, as follows:
I. Training stage
1. Training of the self-encoder (Autoencoder):
1) determining the dimension of the Autoencoder abstract feature space, which can be set to any value, and in this embodiment, it is assumed to be 3;
2) determining the number of hidden layers and the input and output dimensions in the encoder process; these can be set to any values, are given initial values at the start of training, and can be adjusted dynamically according to the size of the loss function during training. In this embodiment it is assumed that there are 2 hidden layers; the input and output dimensions of the first hidden layer are the dimension of the time series and 24, respectively, and those of the second hidden layer are 24 and 3, respectively;
3) determining the number of hidden layers and the dimensions in the decoder process; these can likewise be set to any values, are given initial values at the start of training, and can be adjusted dynamically according to the size of the loss function. The number of hidden layers is generally equal to that of the encoder process, with the input and output dimensions reversed. In this embodiment it is assumed that there are 2 hidden layers; the input and output dimensions of the first hidden layer are 3 and 24, respectively, and those of the second hidden layer are 24 and the dimension of the time series, respectively;
4) setting the loss function to the mean squared error (MSE):

MSE = (1/m) ∑_{i=1}^{m} (x_i − x′_i)²,

where x′ denotes the reconstruction of x;
5) For the time series x = (x_1, x_2, …, x_m):

5.1) construct the encoder function; with the dimensions above, the two hidden layers can be written as

h_{1×24} = sigmoid(x_{1×m} W¹_{m×24} + b¹_{1×24}),

z_{1×3} = sigmoid(h_{1×24} W²_{24×3} + b²_{1×3});

5.2) construct the decoder function, mirroring the encoder:

h′_{1×24} = sigmoid(z_{1×3} W³_{3×24} + b³_{1×24}),

x′_{1×m} = sigmoid(h′_{1×24} W⁴_{24×m} + b⁴_{1×m});

5.3) train the parameters of the encoder function and the decoder function according to the loss function:

arg min MSE(x′_{1×m} − x_{1×m});
5.4) continuously adjust the dimension of the abstract feature space, the number of hidden layers and the input and output dimensions according to the size of the loss function, and repeat 5.1-5.3 until a satisfactory MSE is obtained; the output dimension of the last hidden layer of the encoder process is always the input dimension of the first hidden layer of the decoder process;
5.5) obtain the abstract feature z_{1×3} of the sequence x = (x_1, x_2, …, x_m) from the trained encoder.

In the same way, the abstract features of all the time series can be obtained. All of the obtained abstract features serve as the input of the GMM; at this point, the training of the AutoEncoder is complete.
2. Training of Gaussian mixture models
1) Firstly, determining a range of K values, training a GMM (K) model corresponding to each K, and calculating the distortion degree corresponding to each GMM (K) model according to the training result;
For example: if the value of K ranges from 3 to 9, there are 7 Gaussian mixture models to train, namely GMM(3) to GMM(9).
2) Calculate the distortion degree of each corresponding Gaussian mixture model based on the elbow rule, and draw a line graph of the distortion of each Gaussian mixture model against its K, so as to determine the optimal Gaussian mixture model and its corresponding K value.
3) Obtain the parameters of each sub-model of the trained GMM(K) model according to the determined K value.
At this point, the training of the GMM(K) model is complete; an illustrative driver for the whole training phase follows.
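An illustrative end-to-end driver for the training stage, using the hypothetical helpers sketched earlier (AutoEncoder, train, elbow_distortions) and made-up data, might read:

```python
import torch
from sklearn.mixture import GaussianMixture

series = torch.rand(500, 168)            # made-up data: 500 series of 168 points in [0, 1]
model = train(AutoEncoder(m=series.shape[1]), series)
features = model.encoder(series).detach().numpy()   # abstract features, shape (500, 3)

distortions = elbow_distortions(features)           # K = 3..9 as in the example above
best_k = 6                                          # read off the elbow of the K-distortion plot
gmm = GaussianMixture(n_components=best_k, random_state=0).fit(features)
```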
II. Detection stage:
1) Input the data to be detected, x = (x_1, x_2, …, x_m), into the trained self-encoder to obtain the abstract feature z of the data to be detected;

2) Input the abstract feature z of the data to be detected into the trained GMM(K) model;

3) Determine which sub-model of GMM(K) the abstract feature z belongs to, and obtain the parameters of that sub-model;
4) Judge whether the data to be detected are abnormal data according to the three-sigma rule.
At this point, the abnormality detection of the time series data is completed.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 6, an embodiment of the present invention further provides a system for detecting an abnormality of a time series, including:
a first training module 110, where the first training module 110 is configured to input a time sequence into an auto-encoder for training, and obtain abstract features of the time sequence from the trained auto-encoder;
a second training module 120, where the second training module 120 is configured to input the abstract features of the time series into a gaussian mixture model for training, so as to obtain a corresponding gaussian mixture model;
the input module 130 is configured to input the data to be detected to the trained self-encoder to obtain an abstract feature of the data to be detected;
a determining module 140, where the determining module 140 is configured to input the abstract features of the data to be detected into the corresponding gaussian mixture model, so as to determine a sub-gaussian model of the corresponding gaussian mixture model to which the abstract features of the data to be detected belong;
a determining module 150, where the determining module 150 is configured to obtain parameters of the sub-gaussian model, and determine whether the data to be detected is abnormal data based on the parameters and the three sigma rule.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 7, an embodiment of the present invention further provides a computer device 30, which comprises a processor 310 and a memory 320; the memory 320 stores a computer program 321 that can run on the processor, and the processor 310 performs the steps of the above method when executing the program.
The memory, as a non-volatile computer-readable storage medium, may be used to store a non-volatile software program, a non-volatile computer-executable program, and modules, such as program instructions/modules corresponding to the method for detecting an abnormality in a time series in the embodiments of the present application. The processor executes various functional applications and data processing of the device by running the nonvolatile software program, instructions and modules stored in the memory, that is, the method for detecting the time sequence abnormality of the above method embodiment is realized.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the device, and the like. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be coupled to the local module via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 8, an embodiment of the present invention further provides a computer-readable storage medium 40, where the computer-readable storage medium 40 stores a computer program 410, which when executed by a processor, performs the above method.
Finally, it should be noted that, as will be understood by those skilled in the art, all or part of the processes of the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. The storage medium of the program may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like. The embodiments of the computer program may achieve the same or similar effects as any of the above-described method embodiments.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments of the present invention.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. The numbers of the embodiments disclosed in the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure of the embodiments of the present invention (including the claims) is limited to these examples. Within the spirit of the embodiments of the invention, technical features of the above embodiment or of different embodiments may be combined, and many other variations of the different aspects of the embodiments exist that are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like made within the spirit and principles of the embodiments of the present invention are intended to be included within their scope.

Claims (10)

1. A method for anomaly detection of a time series, comprising:
inputting the time sequence into a self-encoder for training, and acquiring abstract characteristics of the time sequence from the trained self-encoder;
inputting the abstract characteristics of the time sequence into a Gaussian mixture model for training to obtain a corresponding Gaussian mixture model;
inputting the data to be detected into a trained self-encoder to obtain abstract characteristics of the data to be detected;
inputting the abstract characteristics of the data to be detected into the corresponding Gaussian mixture model to determine a sub-Gaussian model of the corresponding Gaussian mixture model to which the abstract characteristics of the data to be detected belong;
and acquiring parameters of the sub-Gaussian model, and judging whether the data to be detected is abnormal data or not based on the parameters and the three-sigma rule.
2. The method of claim 1, wherein inputting the abstract features of the time series into a Gaussian mixture model for training, and obtaining a corresponding Gaussian mixture model comprises:
determining a range of k in the Gaussian mixture model, wherein k represents the number of sub-Gaussian models in the Gaussian mixture model;
inputting the abstract features of the time series into the Gaussian mixture model corresponding to each k so as to train each corresponding Gaussian mixture model;
and calculating the distortion degree corresponding to each corresponding Gaussian mixture model based on the elbow rule so as to determine the optimal Gaussian mixture model.
3. The method of claim 2, wherein inputting the abstract features of the time series into the Gaussian mixture model corresponding to each k to train each corresponding Gaussian mixture model comprises:
initializing all parameters of the corresponding Gaussian mixture model, wherein the parameters comprise the expectation of each sub-Gaussian model, the variance or covariance of each sub-Gaussian model, and the occurrence probability of each sub-Gaussian model in the mixture model;
calculating the probability of each abstract feature of the time series from each sub-Gaussian model according to the parameters;
updating each of the initialized parameters based on the probability of each of the abstract features from each of the sub-Gaussian models;
and returning to the step of calculating the probability of each abstract feature of the time sequence from each sub-Gaussian model according to the parameters until the updated parameters are converged.
4. The method of claim 1, wherein obtaining parameters of the sub-gaussian model and determining whether the data to be detected is abnormal data based on the parameters and three sigma law comprises:
obtaining the mean value of the sub-Gaussian model;
determining a threshold based on the mean and the three sigma law;
and judging whether the data to be detected is abnormal data or not based on the threshold value.
5. The method of claim 1, wherein inputting the time series into an auto-encoder for training comprises:
determining the dimension of a feature space of an auto encoder, the number of hidden layers and input/output dimension in the encoding and decoding processes and a loss function, and inputting the time sequence into the auto encoder;
training an auto-encoder to which the time series is input by minimizing the loss function.
6. The method of claim 5, wherein training the self-encoder by minimizing the loss function comprises:
and adjusting the number of hidden layers, the input/output dimension and the dimension of the feature space of the self-encoder in the encoding and decoding processes according to the size of the loss function, and training the self-encoder after parameters are adjusted until the loss function meeting the condition is obtained.
7. The method of claim 1, wherein inputting the time series into an auto-encoder for training comprises: respectively and sequentially inputting a plurality of time sequences into a self-encoder according to a preset time unit for training;
the method for acquiring the abstract characteristics of the time series from the trained self-encoder comprises the following steps: sequentially acquiring abstract features of all the time sequences from a trained self-encoder;
inputting the abstract characteristics of the time sequence into a Gaussian mixture model for training, and obtaining a corresponding Gaussian mixture model comprises the following steps: and inputting all the abstract characteristics of the time series into a Gaussian mixture model to obtain corresponding Gaussian mixture models of the abstract characteristics corresponding to the time series.
8. A system for anomaly detection of a time series, comprising:
the first training module is configured to input a time sequence into a self-encoder for training, and obtain abstract features of the time sequence from the trained self-encoder;
the second training module is configured to input the abstract features of the time sequence into a Gaussian mixture model for training to obtain a corresponding Gaussian mixture model;
the input module is configured to input the data to be detected into a trained self-encoder to obtain abstract characteristics of the data to be detected;
the determining module is configured to input the abstract features of the data to be detected into the corresponding Gaussian mixture model so as to determine a sub-Gaussian model of the corresponding Gaussian mixture model to which the abstract features of the data to be detected belong;
and the judging module is configured to acquire the parameters of the sub-Gaussian model and judge whether the data to be detected is abnormal data or not based on the parameters and the three-sigma rule.
9. A computer device, comprising:
at least one processor; and
memory storing a computer program operable on the processor, wherein the processor executes the program to perform the steps of the method according to any of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202210688277.2A 2022-06-17 2022-06-17 Method, system, device and medium for detecting time series anomalies Pending CN114970750A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210688277.2A CN114970750A (en) Method, system, device and medium for detecting time series anomalies

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210688277.2A CN114970750A (en) Method, system, device and medium for detecting time series anomalies

Publications (1)

Publication Number Publication Date
CN114970750A true CN114970750A (en) 2022-08-30

Family

ID=82964060

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210688277.2A Pending CN114970750A (en) Method, system, device and medium for detecting time series anomalies

Country Status (1)

Country Link
CN (1) CN114970750A (en)


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115409131A (en) * 2022-10-28 2022-11-29 武汉惠强新能源材料科技有限公司 Production line abnormity detection method based on SPC process control system
CN115409131B (en) * 2022-10-28 2023-02-17 武汉惠强新能源材料科技有限公司 Production line abnormity detection method based on SPC process control system
CN116087759A (en) * 2023-04-12 2023-05-09 广东翰唐智控有限公司 Method for inspecting conductive path of circuit board and circuit system
CN116087759B (en) * 2023-04-12 2023-05-30 广东翰唐智控有限公司 Method for inspecting conductive path of circuit board and circuit system
CN116165353A (en) * 2023-04-26 2023-05-26 江西拓荒者科技有限公司 Industrial pollutant monitoring data processing method and system
CN116165353B (en) * 2023-04-26 2023-07-25 江西拓荒者科技有限公司 Industrial pollutant monitoring data processing method and system

Similar Documents

Publication Publication Date Title
CN114970750A (en) Method, system, device and medium for detecting time series anomalies
US11537898B2 (en) Generative structure-property inverse computational co-design of materials
WO2021007812A1 (en) Deep neural network hyperparameter optimization method, electronic device and storage medium
Bounliphone et al. A test of relative similarity for model selection in generative models
WO2022095645A1 (en) Image anomaly detection method for latent space auto-regression based on memory enhancement
CN114120041B (en) Small sample classification method based on double-countermeasure variable self-encoder
CN113076215A (en) Unsupervised anomaly detection method independent of data types
US20220245422A1 (en) System and method for machine learning architecture for out-of-distribution data detection
CN114048468A (en) Intrusion detection method, intrusion detection model training method, device and medium
Wu et al. Conditional mutual information-based contrastive loss for financial time series forecasting
Ibragimovich et al. Effective recognition of pollen grains based on parametric adaptation of the image identification model
Kong et al. Research on real time feature extraction method for complex manufacturing big data
Jang et al. Decision fusion approach for detecting unknown wafer bin map patterns based on a deep multitask learning model
CN114500335A (en) SDN network flow control method based on fuzzy C-means and mixed kernel least square support vector machine
Sahay et al. Hyperspectral image target detection using deep ensembles for robust uncertainty quantification
CN112597997A (en) Region-of-interest determining method, image content identifying method and device
CN114880659A (en) Method, device, equipment and medium for time series data abnormity detection
Fradi et al. Manifold-based inference for a supervised Gaussian process classifier
CN114970752A (en) Method, system, device and medium for detecting time sequence abnormity
Chen et al. Generalized correntropy induced loss function for deep learning
Magdalena et al. Identification of beef and pork using gray level co-occurrence matrix and probabilistic neural network
CN113341890A (en) Intelligent diagnosis method and system oriented to cooperation of adaptive scheduling and unmanned production line
Vorotnev et al. Training Bayesian classifier with scaling unique colors among image samples
Donets et al. Methodology of the countries’ economic development data analysis
Bu et al. Measuring robustness of deep neural networks from the lens of statistical model checking

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination