CN115169430A - Cloud network end resource multidimensional time sequence anomaly detection method based on multi-scale decoding


Info

Publication number
CN115169430A
CN115169430A CN202210456392.7A CN202210456392A CN115169430A CN 115169430 A CN115169430 A CN 115169430A CN 202210456392 A CN202210456392 A CN 202210456392A CN 115169430 A CN115169430 A CN 115169430A
Authority
CN
China
Prior art keywords
time
sequence
data
correlation
time sequence
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210456392.7A
Other languages
Chinese (zh)
Inventor
王树良
徐卓辉
袁汉宁
耿晶
滕腾
党迎旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Surui Data Intelligent Technology Research Institute
Beijing Institute of Technology BIT
Original Assignee
Shenzhen Surui Data Intelligent Technology Research Institute
Beijing Institute of Technology BIT
Application filed by Shenzhen Surui Data Intelligent Technology Research Institute and Beijing Institute of Technology BIT
Priority to CN202210456392.7A
Publication of CN115169430A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a multi-dimensional time series anomaly detection method for cloud network end resources based on multi-scale decoding. It relates to the technical field of computer science, provides a multi-dimensional time series anomaly detection scheme based on multi-scale integrated decoding, and improves the accuracy of multi-dimensional time series anomaly detection. The technical scheme comprises the following steps: the correlations among the time series are calculated and a correlation feature matrix of the series is constructed; based on the correlation feature matrix, the time series correlations are encoded using an encoder and the temporal patterns are captured using an attention-based convolutional long short-term memory network; decoding is then performed with decoders of different scales, the outputs of the different decoders are constrained using the similarity of tensor kernels, the feature matrix is reconstructed by fusing all decoding results, and the reconstruction error is calculated to detect anomalies.

Description

Cloud network end resource multidimensional time sequence anomaly detection method based on multi-scale decoding
Technical Field
The invention relates to the technical field of computer science, and in particular to a multi-dimensional time series anomaly detection method for cloud network end resources based on multi-scale decoding.
Background
Time series anomaly detection is an important task for discovering problems and avoiding risks in time. In internet applications, technicians monitor data from cloud network end resources and analyze the time series of each dimension to find abnormal conditions, so that potential risks are discovered and alarms are raised promptly, reducing economic losses and safeguarding information security.
"Cloud" refers to cloud computing and the infrastructure and resources that support it, "network" generally refers to the internet, and "end" refers to terminal devices. In terms of how the three developed, terminals are connected to form the network, and the network gives rise to the cloud. In practice, the maturity and spread of network technology in turn promotes the development of computers and terminal devices, while cloud computing makes the network more intelligent. The three complement each other and form an integrated whole.
The development of terminal devices allows large amounts of multi-dimensional time series data to be collected. Compared with a single-dimensional time series, a multi-dimensional time series is more diverse and larger in volume, contains more invalid and interfering information, and exhibits more complex correlations among dimensions, all of which make multi-dimensional time series anomaly detection harder. How to mine deep features in multi-dimensional time series effectively, so that abnormal data can be better distinguished, has attracted wide attention from researchers in China and abroad.
Overall, current research offers two types of methods for multi-dimensional time series anomaly detection. The first type is the traditional time series anomaly detection algorithm, most of which are improvements on clustering-based and classification-based algorithms. When the data scale is small, these methods split the multi-dimensional data by dimension into several single-dimensional time series and find abnormal patterns using single-dimensional algorithms, which works well. However, they perform poorly on large data sets, and because the correlations among the series are not considered, system-level anomalies cannot be identified effectively. The second type is the deep-learning-based anomaly detection algorithm, of which there are two common kinds: methods based on recurrent neural networks (RNN) and methods based on autoencoders (AE).
RNN-based algorithms mainly learn the temporal order in the data, retain valuable historical information, predict the data at a future time, and identify anomalies from the error between the predicted and true values. Autoencoder-based methods instead learn the hidden features of data in its normal pattern and identify anomalies from the reconstruction error after decoding.
RNN-based methods can capture the temporal order of the data, but they consider neither the correlations among series nor the temporal patterns at different scales, and so cannot detect system-level anomalies effectively. Autoencoder-based methods cannot avoid the error accumulation caused by sequential decoding on long time series, and cannot accurately exploit multi-scale information for anomaly detection in the decoding stage.
Therefore, for multi-dimensional time series of cloud network end resources, there is currently no scheme that both considers the correlations among series and the temporal patterns at different scales and accurately exploits multi-scale information for anomaly detection in the decoding stage.
Disclosure of Invention
In view of this, the invention provides a multi-dimensional time series anomaly detection method for cloud network end resources based on multi-scale decoding. It provides a detection scheme based on multi-scale integrated decoding and improves the accuracy of multi-dimensional time series anomaly detection.
To achieve this purpose, the technical scheme of the invention comprises the following steps:
Step 1: calculate the correlations among all time series and construct a correlation feature matrix of the series.
Step 2: based on the correlation feature matrix, encode the time series correlations using an encoder, and capture the temporal patterns using an attention-based convolutional long short-term memory network.
Step 3: decode using decoders of different scales, constrain the outputs of the different decoders using the similarity of tensor kernels, reconstruct the feature matrix by fusing all decoding results, and calculate the reconstruction error to detect anomalies.
In the embodiment of the invention, step 1 further comprises the following:
after the multi-dimensional time series data are standardized, they are segmented using a sliding window algorithm to obtain a number of different time series segments.
Assume that the time series has n dimensions and length T and that the time window has size w. The time series segment obtained by sliding-window segmentation of the i-th dimension data is written
$$x_i^{w} = (x_i^{t-w}, x_i^{t-w+1}, \dots, x_i^{t})$$
where $x_i^{t-w}, x_i^{t-w+1}, \dots, x_i^{t}$ are the values of the i-th dimension at times t-w through t within the segment.
In the embodiment of the invention, step 1 specifically comprises:
calculating, through inner products, the correlation between each dimension of data in the current time series segment and the different dimensions of data in the historical time series segment, to construct an n × n correlation feature matrix $M_t$. The entry in row i and column j of the correlation matrix represents the correlation between the j-th dimension data in the current time series segment and the i-th dimension data in the historical time series segment, i.e. $m_{ij}^{t} \in M_t$. The value of $m_{ij}^{t}$ is calculated by the following formula:
$$m_{ij}^{t} = \frac{\sum_{\delta=0}^{w} x_i^{t-\delta}\, x_j^{t-\delta}}{k}$$
where $x_i^{t-\delta}$ is the value of the i-th dimension data in the corresponding time series segment, $x_j^{t-\delta}$ is the value of the j-th dimension data in the corresponding time series segment, t is the current time, and k is a scaling factor with k = w.
In the embodiment of the invention, step 2 is specifically divided into the following steps:
Step 2.1: after feature extraction from the multi-dimensional time series data is complete, T-w+1 correlation feature matrices are obtained;
Step 2.2: the temporal information in the correlation feature matrices is modeled using a long short-term memory (LSTM) network, which outputs the latent features of the data, i.e. the time series correlation encoding;
the encoding process using the LSTM can be expressed as:
$$h_t = \mathrm{LSTM}([x_t; h_{t-1}])$$
$$h = F_{MLP}(\mathrm{concat}[h_1; h_2; \dots; h_T])$$
where LSTM(·) denotes an LSTM unit; $h_t$ is the hidden state at time t, jointly determined by the hidden state $h_{t-1}$ at time t-1 and the input $x_t$ at time t; when the input $x_t$ is a correlation feature matrix, $h_t$ is the latent feature of the data; h is the hidden state of the whole input data, obtained by concatenating the hidden states $h_1; h_2; \dots; h_T$ of all time steps along the time dimension; concat denotes the concatenation function; and $F_{MLP}$ is a fully connected layer.
In the embodiment of the invention, step 3 specifically comprises:
Step 3.1: decoders with different output lengths are used to decode the latent features of the data, and the decoders output a plurality of reconstructed feature matrices; to ensure that the output sequences of the decoders are similar, tensor similarity is used to constrain the temporal patterns of the outputs;
Step 3.2: the decoder outputs are fused in a coarse-to-fine manner using a multi-scale fusion strategy, and the error between the fused output of the highest-scale decoder and the original input is used to measure anomalies, giving the anomaly detection result.
In a further embodiment of the invention, step 3 comprises the following steps:
Step 3.1: let the set of decoders be D, where the output length of the k-th decoder $D^{(k)}$ is $T^{(k)}$ and the original sequence length is T. $T^{(k)}$ is defined as follows:
$$T^{(k)} = \alpha_k T$$
$$\alpha_k = 1/\tau^{k-1} \in (0, 1]$$
where $\alpha_k$ is the coefficient of the k-th decoder; $\alpha_1 = 1$ and $\tau > 1$, which ensures that the output length of the highest-scale decoder is T;
Step 3.2: tensor similarity is used to constrain the temporal patterns output by the different decoders;
the input matrix sequence M has dimension n × n × T, and the feature matrix sequence $Y^{(k)}$ output by the k-th decoder has dimension n × n × $T^{(k)}$. Both M and $Y^{(k)}$ are regarded as third-order tensors; each tensor is decomposed to obtain tensor kernels of the same size, and the similarity of the tensor kernels is used to approximate the similarity of the tensors;
denote the tensor kernel of the input matrix sequence M by $C^{(M)}$ and the tensor kernel of the feature matrix sequence $Y^{(k)}$ output by the k-th decoder by $C^{(k)}$. The similarity constraint $\mathrm{Cos}(M, Y^{(k)})$ between the two tensors is given by:
$$\mathrm{Cos}(M, Y^{(k)}) = \mathrm{Cos}(C^{(M)}, C^{(k)})$$
[formula image BDA0003618897280000051]
where [image BDA0003618897280000052] is the similarity constraint between the input and output temporal patterns, Cos is the cosine similarity between two tensor kernels, and $L^{(D)}$ is the number of decoders; [image BDA0003618897280000053] is given by the average of the tensor similarities between the outputs of the decoders and the original input sequence;
Step 3.3: the multi-scale fusion strategy is set as follows:
[formula image BDA0003618897280000054]
where [image BDA0003618897280000055] is formed by fusing [image BDA0003618897280000056] and [image BDA0003618897280000057]; the parameter τ is the preset value from the definition of the decoder lengths; $F'_{MLP}$ is a two-layer fully connected network; β is a weight; and [image BDA0003618897280000058] is the final hidden-layer variable at time t. The latent feature at time t, [image BDA0003618897280000059], is jointly determined by [image BDA00036188972800000510] and [image BDA00036188972800000511]:
[formula image BDA00036188972800000512]
The decoding process of the k-th decoder is:
[formula image BDA00036188972800000513]
where [image BDA00036188972800000514] is the latent feature of the data at time t; [image BDA00036188972800000515] is initialized to zero; [image BDA00036188972800000516] is the output of the k-th decoder at time t; $W^{(k)}$ and $b^{(k)}$ are learnable parameters; and [image BDA00036188972800000517] is jointly determined by [image BDA00036188972800000518] and [image BDA00036188972800000519], where δ is artificially introduced noise that follows a normal distribution.
The output of the highest-scale decoder $D^{(1)}$ is the reconstruction of the input sequence. The reconstructed feature matrix sequence $Y^{(1)}$ is written as
[formula image BDA0003618897280000061]
and the original feature matrix sequence M is written as $(M_1, M_2, \dots, M_T)$. The reconstruction error is then given by:
[formula image BDA0003618897280000062]
where $\|\cdot\|_2$ denotes the 2-norm of a matrix.
The total loss function [image BDA0003618897280000063] combines two aspects, the error between the reconstructed sequence and the original sequence and the temporal pattern similarity constraint, and is given as follows:
[formula image BDA0003618897280000064]
where λ is a hyperparameter that weights the temporal pattern similarity error.
Step 3.4: anomalies are detected using the reconstruction error, specifically as follows:
after several rounds of training, when the loss function [image BDA0003618897280000065] reaches a convergence value, an offline anomaly detection model is obtained, and the anomaly score recorded during training is used as the anomaly score threshold.
The trained offline anomaly detection model is then applied to the detection sample set, and a sample whose score exceeds the threshold during detection is judged to be abnormal.
Advantageous effects:
1. The invention provides a multi-dimensional time series anomaly detection method based on multi-scale integrated decoding. The algorithm uses the inter-series correlation matrix instead of the original time series as the model input, which effectively retains the correlation information among series; by constructing the feature matrix and introducing the correlations among the multi-dimensional time series, system-level anomaly detection becomes possible. The invention introduces multi-scale decoding information into multi-dimensional time series anomaly detection: combining coarse and fine granularity makes full use of the information in different temporal patterns, and the multi-scale integrated decoding scheme fully mines information at different scales, which alleviates the error accumulation problem. The method helps to further improve the accuracy of multi-dimensional time series anomaly detection and promotes the practical application of the algorithm on large-scale data sets.
2. In the multi-scale decoding process, the method introduces the similarity of tensor kernels to constrain the similarity of temporal patterns, ensuring that the learned temporal patterns are similar to those of the input sequence.
3. Prior-art approaches must take values over several time windows when calculating the feature matrix and add multi-layer convolution operations, which makes the calculation relatively time-consuming. The invention instead introduces multi-scale information in the decoding stage and makes full use of the information in different temporal patterns; because the decoder lengths decrease exponentially, the computational complexity is reduced.
4. The method adds a similarity constraint when the autoencoder learns temporal patterns. The similarity constraint between tensors ensures that the learned temporal patterns are similar to the input sequence, which in turn ensures the accuracy of the multi-scale information fusion result in the decoding stage.
Drawings
Fig. 1 is a technical route diagram of a cloud network end resource multidimensional time sequence anomaly detection method based on multi-scale decoding according to the present invention.
Fig. 2 is a schematic diagram of the process of performing Tucker decomposition on each segment of data in the embodiment of the present invention.
Detailed Description
The invention is described in detail below by way of example with reference to the accompanying drawings.
The invention provides a cloud network end resource multi-dimensional time sequence anomaly detection method based on multi-scale decoding. The overall technical roadmap is shown in fig. 1.
First, the correlations among the series are calculated and a correlation feature matrix of the series is constructed. Second, given the feature matrix, the time series correlations are encoded using an encoder and the temporal patterns are captured using an attention-based convolutional long short-term memory network. Finally, decoding is performed with decoders of different scales, the outputs of the different decoders are constrained using the similarity of tensor kernels, the feature matrix is reconstructed by fusing all decoding results, and the reconstruction error is calculated to detect anomalies.
The method comprises the following steps:
The state of a time series at the current moment is affected by historical data, and the data at a single moment cannot fully reflect the characteristics of the time series. To improve the stability of the data to be detected and to detect abnormal data effectively, the method first standardizes the multi-dimensional time series data and then segments it using a sliding window algorithm to obtain a number of different time series segments.
Suppose the multi-dimensional time series data is $X = [x_1, x_2, \dots, x_n]$, with n dimensions and length T, and that the time window has size w. The time series segment obtained by sliding-window segmentation of the i-th dimension data is written
$$x_i^{w} = (x_i^{t-w}, x_i^{t-w+1}, \dots, x_i^{t})$$
where $x_i^{t-w}, x_i^{t-w+1}, \dots, x_i^{t}$ are the values of the i-th dimension at times t-w through t within the segment.
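The standardization and sliding-window segmentation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: z-score standardization is an assumption (the text only says the data are standardized), each window is assumed to hold w consecutive samples so that a series of length T yields T-w+1 segments, and the function names `standardize` and `sliding_segments` are hypothetical.

```python
import numpy as np

def standardize(X):
    """Z-score each dimension (row) of an (n, T) array; z-scoring is an assumption."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for constant dimensions
    return (X - mean) / std

def sliding_segments(X, w):
    """Cut an (n, T) array into T - w + 1 overlapping segments of w samples each."""
    n, T = X.shape
    return np.stack([X[:, s:s + w] for s in range(T - w + 1)])

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 10))          # n = 3 dimensions, T = 10 time steps
segments = sliding_segments(standardize(X), w=4)
print(segments.shape)                 # one (n, w) segment per window position
```

Consecutive segments overlap in all but one sample, which is what lets each segment carry the recent history of every dimension.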
Step 1: calculate the correlations among the time series and construct a correlation feature matrix of the series. Step 1 specifically comprises:
calculating, through inner products, the correlation between each dimension of data in the current time series segment and the different dimensions of data in the historical time series segment, to construct an n × n correlation feature matrix $M_t$. The entry in row i and column j of the correlation matrix represents the correlation between the j-th dimension data in the current time series segment and the i-th dimension data in the historical time series segment, i.e. $m_{ij}^{t} \in M_t$. The value of $m_{ij}^{t}$ is calculated by the following formula:
$$m_{ij}^{t} = \frac{\sum_{\delta=0}^{w} x_i^{t-\delta}\, x_j^{t-\delta}}{k}$$
where $x_i^{t-\delta}$ is the value of the i-th dimension data in the corresponding time series segment, $x_j^{t-\delta}$ is the value of the j-th dimension data in the corresponding time series segment, t is the current time, and k is a scaling factor with k = w.
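As a rough illustration of the inner-product correlation with scaling factor k = w, the sketch below builds one n × n matrix per segment. It assumes that each entry is the inner product of two dimensions over the same segment; the names `correlation_matrix` and `correlation_sequence` are hypothetical and not taken from the patent.

```python
import numpy as np

def correlation_matrix(segment, k=None):
    """Return the n x n matrix of scaled inner products for one (n, w) segment."""
    n, w = segment.shape
    k = w if k is None else k           # the text sets the scaling factor k = w
    return segment @ segment.T / k      # entry (i, j) = <row i, row j> / k

def correlation_sequence(X, w):
    """Build one correlation matrix per window position of an (n, T) array."""
    n, T = X.shape
    return np.stack([correlation_matrix(X[:, s:s + w]) for s in range(T - w + 1)])

segment = np.array([[1.0, 2.0],
                    [3.0, 4.0]])        # n = 2 dimensions, w = 2 samples
M_t = correlation_matrix(segment)

X = np.arange(30, dtype=float).reshape(3, 10)
M_seq = correlation_sequence(X, w=4)    # shape (T - w + 1, n, n)
```

Under this assumption each matrix is symmetric, and the stacked result forms the matrix sequence that the encoder consumes.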
Step 2: an effective latent representation of the feature matrices is a key aspect of multi-dimensional time series anomaly detection, because it better preserves the temporal correlations among different dimensions. After feature extraction from the multi-dimensional time series data is complete, T-w+1 correlation matrices are obtained. In time series research, an RNN is generally used to encode time series data, since it takes into account the influence of recent states on the current state. However, an RNN cannot handle the vanishing gradient problem that arises from recursion, and so cannot exploit information from long time series. The LSTM is an RNN variant proposed to solve these problems. It adds filtering of past states on top of the RNN, so that it can select which states have more influence on the present, rather than simply using the most recent state as an ordinary RNN does. Therefore, an LSTM is used here to model the temporal information in the feature matrices and to output the latent features of the data.
Based on the correlation feature matrix, the time series correlations are encoded using an encoder, and the temporal patterns are captured using an attention-based convolutional long short-term memory network. Step 2 is specifically divided into the following steps:
Step 2.1: after feature extraction from the multi-dimensional time series data is complete, T-w+1 correlation feature matrices are obtained;
Step 2.2: the temporal information in the correlation feature matrices is modeled using an LSTM, which outputs the latent features of the data, i.e. the time series correlation encoding;
where the LSTM encoding is expressed as:
$$h_t = \mathrm{LSTM}([x_t; h_{t-1}])$$
$$h = F_{MLP}(\mathrm{concat}[h_1; h_2; \dots; h_T])$$
where LSTM(·) denotes an LSTM unit; $h_t$ is the hidden state at time t, jointly determined by the hidden state $h_{t-1}$ at time t-1 and the input $x_t$ at time t; when the input $x_t$ is a correlation feature matrix, $h_t$ is the latent feature of the data; h is the hidden state of the whole input data, obtained by concatenating the hidden states $h_1; h_2; \dots; h_T$ of all time steps along the time dimension; concat denotes the concatenation function; and $F_{MLP}$ is a fully connected layer.
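The two encoding equations can be illustrated with a plain NumPy forward pass. This is only a sketch under stated assumptions: it uses a standard LSTM cell with an arbitrary gate ordering, flattens each correlation matrix into a vector, and uses random untrained weights; it does not include the attention or convolutional components mentioned elsewhere in the text. The name `lstm_encode` and all weight shapes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_encode(xs, W, b, W_mlp, b_mlp):
    """Encode a (T, d) sequence into one latent vector.

    W has shape (4*H, d+H) and b has shape (4*H,); they hold the stacked
    input, forget, output and candidate weights of one LSTM cell.
    """
    H = b.shape[0] // 4
    h = np.zeros(H)                              # hidden state h_0
    c = np.zeros(H)                              # cell state c_0
    hidden = []
    for x_t in xs:
        z = W @ np.concatenate([x_t, h]) + b     # acts on [x_t; h_{t-1}]
        i, f, o = sigmoid(z[:H]), sigmoid(z[H:2 * H]), sigmoid(z[2 * H:3 * H])
        g = np.tanh(z[3 * H:])
        c = f * c + i * g
        h = o * np.tanh(c)                       # h_t
        hidden.append(h)
    # h = F_MLP(concat[h_1; ...; h_T]) with a single fully connected layer
    return W_mlp @ np.concatenate(hidden) + b_mlp

rng = np.random.default_rng(0)
T, n, H, latent = 5, 3, 4, 6
xs = rng.normal(size=(T, n * n))                 # T flattened n x n matrices
W = rng.normal(scale=0.1, size=(4 * H, n * n + H))
b = np.zeros(4 * H)
W_mlp = rng.normal(scale=0.1, size=(latent, T * H))
b_mlp = np.zeros(latent)
h = lstm_encode(xs, W, b, W_mlp, b_mlp)
```

In practice a deep learning framework would supply the LSTM cell and train the weights; the loop is written out here only to make the role of $[x_t; h_{t-1}]$ and the final concatenation explicit.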
Step 3: decoding is performed with decoders of different scales, the outputs of the different decoders are constrained using the similarity of tensor kernels, the feature matrix is reconstructed by fusing all decoding results, and the reconstruction error is calculated to detect anomalies.
To capture the temporal behavior of the time series at different scales, decoders with different output lengths are used to decode the latent features of the data, producing a plurality of reconstructed feature matrices. Decoders with short output lengths focus on macroscopic temporal characteristics, while decoders with longer output lengths capture more detailed local temporal patterns. The outputs of the decoders must also remain similar as time series, so tensor similarity is used to constrain the temporal patterns of the outputs. Finally, the decoder outputs are fused in a coarse-to-fine manner using a suitable multi-scale fusion strategy, and the error between the fused output of the highest-scale decoder and the original input is used to measure anomalies.
Step 3 can be carried out using the following specific steps:
Step 3.1: decoders with different output lengths are used to decode the latent features of the data, and the decoders output a plurality of reconstructed feature matrices; the outputs of the decoders must remain similar as time series, so tensor similarity is used to constrain the temporal patterns of the outputs;
Step 3.2: the decoder outputs are fused in a coarse-to-fine manner using a multi-scale fusion strategy, and the error between the fused output of the highest-scale decoder and the original input is used to measure anomalies, giving the anomaly detection result.
The invention also provides the following embodiment, in which step 3 is specifically divided into the following steps:
Step 3.1: let the set of decoders be D, where the output length of the k-th decoder $D^{(k)}$ is $T^{(k)}$ and the original sequence length is T. $T^{(k)}$ is defined as follows:
$$T^{(k)} = \alpha_k T$$
$$\alpha_k = 1/\tau^{k-1} \in (0, 1]$$
where $\alpha_k$ is the coefficient of the k-th decoder; $\alpha_1 = 1$ and $\tau > 1$, which ensures that the output length of the highest-scale decoder is T;
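The decoder-length rule can be checked with a few lines of arithmetic. The sketch below evaluates $T^{(k)} = T/\tau^{k-1}$ for each decoder; rounding to an integer and enforcing a minimum length of 1 are assumptions, since the text does not say how non-integer lengths are handled, and `decoder_lengths` is a hypothetical name.

```python
def decoder_lengths(T, tau, num_decoders):
    """Output length T^(k) = T / tau^(k-1) for k = 1..num_decoders."""
    lengths = []
    for k in range(1, num_decoders + 1):
        alpha_k = 1.0 / tau ** (k - 1)                # alpha_1 = 1, then shrinks
        lengths.append(max(1, round(alpha_k * T)))    # rounding is an assumption
    return lengths

lengths = decoder_lengths(T=64, tau=2, num_decoders=3)
print(lengths)
```

With T = 64 and τ = 2, the first decoder keeps the full length and each further decoder halves it, which is the exponential decrease in decoder length that the text relies on.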
Step 3.2: tensor similarity is used to constrain the temporal patterns output by the different decoders;
the output of each decoder is a sequence of feature matrices, and the outputs of different decoders differ in sequence length. Tensor similarity is used here to constrain the temporal patterns of the different decoder outputs. The input matrix sequence M has dimension n × n × T, and the feature matrix sequence $Y^{(k)}$ output by the k-th decoder has dimension n × n × $T^{(k)}$. Both M and $Y^{(k)}$ are regarded as third-order tensors; each tensor is decomposed to obtain tensor kernels of the same size, and the similarity of the tensor kernels is used to approximate the similarity of the tensors.
Because the length T of the time series is generally very large, the whole tensor cannot be decomposed directly. The data are therefore divided into several segments using the time-window idea, and Tucker decomposition is applied to each segment separately, as shown in Fig. 2.
Denote the tensor kernel of the input matrix sequence M by $C^{(M)}$ and the tensor kernel of the feature matrix sequence $Y^{(k)}$ output by the k-th decoder by $C^{(k)}$. The similarity constraint $\mathrm{Cos}(M, Y^{(k)})$ between the two tensors is given by:
$$\mathrm{Cos}(M, Y^{(k)}) = \mathrm{Cos}(C^{(M)}, C^{(k)})$$
[formula image BDA0003618897280000111]
where [image BDA0003618897280000112] is the similarity constraint between the input and output temporal patterns, Cos is the cosine similarity between two tensor kernels, and $L^{(D)}$ is the number of decoders; [image BDA0003618897280000113] is given by the average of the tensor similarities between the outputs of the decoders and the original input sequence;
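One way to picture the tensor-kernel comparison is a truncated higher-order SVD, which is a common way to compute a Tucker core. The sketch below is an assumption-laden illustration rather than the patent's procedure: it decomposes each whole tensor instead of working segment by segment, the core ranks are chosen arbitrarily, and the names `unfold`, `tucker_core` and `kernel_cosine` are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize a tensor along one mode: shape (I_mode, product of the rest)."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tucker_core(tensor, ranks):
    """Core of a truncated HOSVD: project every mode onto its leading singular vectors."""
    core = tensor
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        U = U[:, :r]                                   # (I_mode, r) factor matrix
        core = np.moveaxis(np.tensordot(U.T, core, axes=(1, mode)), 0, mode)
    return core

def kernel_cosine(A, B, ranks):
    """Cosine similarity between the same-sized Tucker cores of two tensors."""
    ca = tucker_core(A, ranks).ravel()
    cb = tucker_core(B, ranks).ravel()
    return float(ca @ cb / (np.linalg.norm(ca) * np.linalg.norm(cb)))

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4, 16))        # input sequence, n x n x T
Y2 = rng.normal(size=(4, 4, 8))        # a shorter decoder output, n x n x T^(2)
ranks = (2, 2, 3)                      # shared core size, chosen arbitrarily here
sims = [kernel_cosine(M, Y, ranks) for Y in (M, Y2)]
mean_similarity = float(np.mean(sims))  # average over decoders
```

Because both cores have the shape given by `ranks`, tensors with different time lengths become directly comparable. Singular vectors are only determined up to sign, so a practical implementation would need a sign convention before comparing cores.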
Step 3.3: the multi-scale fusion strategy is set as follows:
[formula image BDA0003618897280000114]
where [image BDA0003618897280000115] is formed by fusing [image BDA0003618897280000116] and [image BDA0003618897280000117]; the parameter τ is the preset value from the definition of the decoder lengths; $F'_{MLP}$ is a two-layer fully connected network; β is a weight; and [image BDA0003618897280000118] is the final hidden-layer variable at time t. The latent feature at time t, [image BDA0003618897280000119], is jointly determined by [image BDA00036188972800001110] and [image BDA00036188972800001111]:
[formula image BDA00036188972800001112]
The decoding process of the k-th decoder is:
[formula image BDA00036188972800001113]
where [image BDA00036188972800001114] is the latent feature of the data at time t; [image BDA00036188972800001115] is initialized to zero; [image BDA00036188972800001116] is the output of the k-th decoder at time t; $W^{(k)}$ and $b^{(k)}$ are learnable parameters; and [image BDA00036188972800001117] is jointly determined by [image BDA00036188972800001118] and [image BDA00036188972800001119], where δ is artificially introduced noise that follows a normal distribution.
The output of the highest-scale decoder $D^{(1)}$ is the reconstruction of the input sequence. The reconstructed feature matrix sequence $Y^{(1)}$ is written as
[formula image BDA0003618897280000121]
and the original feature matrix sequence M is written as $(M_1, M_2, \dots, M_T)$. The reconstruction error is then given by:
[formula image BDA0003618897280000122]
where $\|\cdot\|_2$ denotes the 2-norm of a matrix.
The total loss function [image BDA0003618897280000123] combines two aspects, the error between the reconstructed sequence and the original sequence and the temporal pattern similarity constraint, and is given as follows:
[formula image BDA0003618897280000124]
where λ is a hyperparameter that weights the temporal pattern similarity error.
Step 3.4, detecting anomalies using the reconstruction error, specifically:
after multiple rounds of training, when the loss function
Figure BDA0003618897280000125
converges, an offline anomaly detection model is obtained, and the anomaly scores recorded during training are used to set the anomaly score threshold;
the trained offline anomaly detection model is then applied to the detection sample set, and any sample whose score exceeds the threshold during detection is judged abnormal.
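The tensor-similarity constraint of step 3.2 can be illustrated with a small NumPy sketch. The text does not say which tensor decomposition is used, so the example assumes a truncated higher-order SVD, one common way to obtain a Tucker-style core (the "tensor kernel"). The function names `unfold`, `hosvd_core`, `core_cosine`, and `mean_similarity`, the rank choice (2, 2, 2), and the random stand-in tensors are illustrative assumptions, not part of the described method.

```python
import numpy as np

def unfold(x, mode):
    """Mode-`mode` unfolding: move that axis first and flatten the rest."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def hosvd_core(x, ranks):
    """Core tensor of a truncated HOSVD with the given per-mode ranks (assumed decomposition)."""
    core = x
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(x, mode), full_matrices=False)
        u_r = u[:, :r]  # leading left singular vectors of this mode
        # Project the current core onto the rank-r subspace along this mode.
        core = np.moveaxis(np.tensordot(u_r.T, core, axes=(1, mode)), 0, mode)
    return core

def core_cosine(a, b, ranks=(2, 2, 2)):
    """Cosine similarity between the equally sized cores of two third-order tensors."""
    ca = hosvd_core(a, ranks).ravel()
    cb = hosvd_core(b, ranks).ravel()
    return float(ca @ cb / (np.linalg.norm(ca) * np.linalg.norm(cb)))

def mean_similarity(m, decoder_outputs, ranks=(2, 2, 2)):
    """Average core similarity between the input and every decoder output."""
    return float(np.mean([core_cosine(m, y, ranks) for y in decoder_outputs]))

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4, 8))    # stand-in for the n x n x T input sequence
Y2 = rng.normal(size=(4, 4, 4))   # stand-in for an n x n x T^(2) decoder output
sim = core_cosine(M, Y2)          # cores have the same size despite different lengths
avg = mean_similarity(M, [M, Y2])
```

Because singular vectors are only determined up to sign, the similarity computed this way is a rough proxy; a practical implementation would need a consistent sign or alignment convention, which the text does not specify.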
In general, the definition of the anomaly score is closely tied to the loss function used to train the model. The model can detect anomalies because training teaches it the characteristics and distribution of normal data, which lowers the loss on normal data and separates it from abnormal data. In this respect the anomaly score and the loss function share the same starting point; the difference is that the anomaly score is defined for the detection phase and the loss function for the training phase. After multiple rounds of training the loss function converges, but the resulting offline anomaly detection model cannot be used directly to judge anomalies. Instead, the error between the reconstructed feature matrix and the original feature matrix, which is available during training or prediction, is used as the anomaly score for the subsequent anomaly decision.
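As a rough illustration of using the reconstruction error as the anomaly score, the sketch below computes one score per time step from the original and reconstructed feature matrices. The exact error formula is not reproduced in this text, so the per-step Frobenius norm of the difference is an assumption, as are the function name `anomaly_scores` and the array layout (T, n, n).

```python
import numpy as np

def anomaly_scores(original, reconstructed):
    """Per-time-step reconstruction error for arrays of shape (T, n, n)."""
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    if original.shape != reconstructed.shape:
        raise ValueError("sequences must have the same shape")
    diff = original - reconstructed
    # Assumed score: Frobenius norm of each n x n residual matrix.
    return np.linalg.norm(diff, axis=(1, 2))

M = np.zeros((3, 2, 2))            # toy original feature matrix sequence
Y1 = np.zeros((3, 2, 2))           # toy reconstruction from the highest-scale decoder
Y1[1] = [[3.0, 0.0], [0.0, 4.0]]   # inject an error at the second time step
scores = anomaly_scores(M, Y1)
```

Only the second time step receives a non-zero score here, which is the behaviour the paragraph above relies on: well-reconstructed (normal) steps score low and poorly reconstructed steps score high.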
The definition of the threshold strongly affects the detection performance of the model. In this method the threshold is defined from the maximum anomaly score on the validation set, and a sample whose score exceeds the threshold during detection is judged abnormal. The threshold is defined as follows:
th = γ · max{score(t)_valid}
where th denotes the threshold, max{score(t)_valid} denotes the maximum anomaly score on the validation set, and γ ∈ [1, 2] is a threshold scaling parameter.
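The threshold rule th = γ · max{score(t)_valid} can be sketched in a few lines of Python. The function names `anomaly_threshold` and `flag_anomalies`, the default γ = 1.2, and the toy score values are hypothetical choices for illustration only.

```python
import numpy as np

def anomaly_threshold(valid_scores, gamma=1.2):
    """th = gamma * max(validation scores), with gamma restricted to [1, 2]."""
    if not 1.0 <= gamma <= 2.0:
        raise ValueError("gamma must lie in [1, 2]")
    return gamma * float(np.max(valid_scores))

def flag_anomalies(test_scores, th):
    """Mark every detection-time score that exceeds the threshold."""
    return np.asarray(test_scores, dtype=float) > th

valid_scores = [0.2, 0.5, 0.4]                # toy validation-set anomaly scores
th = anomaly_threshold(valid_scores, gamma=1.2)
flags = flag_anomalies([0.3, 0.61, 0.59], th)
```

A larger γ makes the detector more conservative (fewer alarms), while γ = 1 flags anything that scores above the worst validation sample.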
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. The cloud network end resource multi-dimensional time sequence anomaly detection method based on multi-scale decoding is characterized by comprising the following steps:
step 1, calculating the correlations among the time series and constructing a correlation feature matrix of the series;
step 2, based on the correlation feature matrix, encoding the time-series correlations with an encoder, and capturing temporal patterns with an attention-based convolutional long short-term memory network;
step 3, decoding with decoders of different scales, constraining the outputs of the different decoders by tensor-kernel similarity, reconstructing the feature matrix by fusing all decoding results, and calculating the reconstruction error to detect anomalies.
2. The cloud network end resource multidimensional time series anomaly detection method based on multi-scale decoding as claimed in claim 1, wherein the step 1 is preceded by the following steps:
after the multi-dimensional time series data are standardized, segmenting them with a sliding-window algorithm to obtain a plurality of different time series segments;
assuming that the time series has dimension n and length T and that the time window has size w, the time series segment obtained by sliding-window segmentation of the i-th dimension data is represented as
Figure FDA0003618897270000011
Figure FDA0003618897270000012
where
Figure FDA0003618897270000013
are the values of the i-th dimension at times t−w to t in the time series segment, respectively.
3. The method for detecting the anomaly of the multi-dimensional time series of the cloud network side resources based on the multi-scale decoding as claimed in claim 2, wherein the step 1 specifically comprises:
calculating, by inner products, the correlation between each dimension of data in the current time series segment and the different dimensions of data in the historical time series segment, and constructing an n × n correlation feature matrix M_t, in which the entry in the i-th row and j-th column represents the correlation between the j-th dimension data in the current time series segment and the i-th dimension data in the historical time series segment, i.e.
Figure FDA0003618897270000014
where the value of
Figure FDA0003618897270000021
is calculated by the following formula:
Figure FDA0003618897270000022
where
Figure FDA0003618897270000023
represents the value of the i-th dimension data in the corresponding time series segment,
Figure FDA0003618897270000024
represents the value of the i-th dimension data in the corresponding time series segment, and t represents the current time; k is a scaling factor, k = w.
4. The method for detecting the anomaly of the multi-dimensional time series of the cloud network resources based on the multi-scale decoding as claimed in claim 3, wherein the step 2 is specifically divided into the following steps:
step 2.1: after feature extraction from the multi-dimensional time series data is completed, T−w+1 correlation feature matrices are obtained;
step 2.2: modeling the temporal information in the correlation feature matrices with a long short-term memory (LSTM) network and outputting the latent features of the data, namely the time-series correlation encoding;
the encoding process using the LSTM can be expressed as:
h_t = LSTM([x_t; h_{t-1}])
h = F_MLP(concat[h_1; h_2; …; h_T])
where LSTM(·) denotes an LSTM unit; h_t denotes the hidden state at time t and is jointly determined by the hidden state h_{t-1} at time t−1 and the input x_t at time t; when the input x_t is a correlation feature matrix, h_t is the latent feature of the data; h denotes the hidden state of the whole input and is obtained by concatenating the hidden states h_1; h_2; …; h_T of all time steps along the time dimension; concat denotes the concatenation function, and F_MLP is a fully connected layer.
5. The method for detecting the anomaly of the multi-dimensional time series of the cloud network resources based on the multi-scale decoding as claimed in any one of claims 1 to 4, wherein the step 3 specifically comprises:
step 3.1: decoding the latent features of the data with decoders of different output lengths, the decoders outputting a plurality of reconstructed feature matrices; to ensure that the output sequences of the decoders are similar, tensor similarity is used to constrain the temporal patterns of the outputs;
step 3.2: fusing the decoder outputs from coarse to fine with a multi-scale fusion strategy, wherein the error between the fused output of the highest-scale decoder and the original input is used to measure abnormality and obtain the anomaly detection result.
6. The method for detecting the anomaly of the multi-dimensional time series of the cloud network resources based on the multi-scale decoding as claimed in claim 5, wherein the step 3 comprises the following steps:
step 3.1: assume a decoder set of D, where the k-th decoder D (k) Is defined as T (k) The original sequence length is T, T (k) Is defined as follows:
T (k) =α k T
α k =1/τ k-1 ∈(0,1]
wherein alpha is k For coefficients of the kth decoder, α 1 =1,τ>1, ensuring the output length of a decoder with the highest scale to be T;
step 3.2, constraining the temporal patterns output by the different decoders using tensor similarity;
the input matrix sequence M has dimension n × n × T, and the feature matrix sequence Y^(k) output by the k-th decoder has dimension n × n × T^(k); the input matrix sequence M and the output feature matrix sequence Y^(k) are each treated as a third-order tensor, tensor decomposition is applied to each to obtain tensor kernels of the same size, and the similarity of the tensors is approximated by the similarity of their tensor kernels;
denoting the tensor kernel of the input matrix sequence M as C^(M) and the tensor kernel of the feature matrix sequence Y^(k) output by the k-th decoder as C^(k), the similarity constraint Cos(M, Y^(k)) between the two tensors is given by:
Cos(M, Y^(k)) = Cos(C^(M), C^(k))
Figure FDA0003618897270000031
where
Figure FDA0003618897270000032
represents the similarity constraint between the input and output temporal patterns, Cos represents the cosine similarity between two tensor kernels, L^(D) represents the number of decoders, and
Figure FDA0003618897270000033
is given by the average of the similarities between the outputs of the multiple decoders and the original input sequence tensor;
step 3.3, the multi-scale fusion strategy is set as follows:
Figure FDA0003618897270000034
where
Figure FDA0003618897270000035
is formed from
Figure FDA0003618897270000036
and
Figure FDA0003618897270000037
by fusion; the parameter τ is the preset value used in the definition of the decoder length, F′_MLP is a two-layer fully connected network, β denotes the weight,
Figure FDA0003618897270000038
denotes the final hidden-layer variable at time t, and the latent feature at time t
Figure FDA0003618897270000041
is jointly determined by
Figure FDA0003618897270000042
and
Figure FDA0003618897270000043
:
Figure FDA0003618897270000044
the decoding process of the k-th decoder is:
Figure FDA0003618897270000045
where
Figure FDA0003618897270000046
denotes the latent feature of the data at time t,
Figure FDA0003618897270000047
Figure FDA0003618897270000048
is initialized to zero,
Figure FDA0003618897270000049
denotes the output of the k-th decoder at time t, and W^(k) and b^(k) are learnable parameters;
Figure FDA00036188972700000410
is determined by
Figure FDA00036188972700000411
and
Figure FDA00036188972700000412
jointly; δ is artificially introduced noise that follows a normal distribution;
the output of the highest-scale decoder D^(1) is the reconstruction of the input sequence; the reconstructed feature matrix sequence Y^(1) is written as
Figure FDA00036188972700000413
and the original feature matrix sequence M is written as (M_1, M_2, …, M_T); the reconstruction error is then given by:
Figure FDA00036188972700000414
where ‖·‖_2 denotes the two-norm of a matrix;
the total loss function
Figure FDA00036188972700000415
combines two terms, the error between the reconstructed sequence and the original sequence and the temporal-pattern similarity constraint, and is given by:
Figure FDA00036188972700000416
where λ is a hyperparameter that weights the temporal-pattern similarity error;
step 3.4, detecting anomalies using the reconstruction error, specifically:
after multiple rounds of training, when the loss function
Figure FDA00036188972700000417
converges, an offline anomaly detection model is obtained, and the anomaly scores recorded during training are used to set the anomaly score threshold;
the trained offline anomaly detection model is then applied to the detection sample set, and any sample whose score exceeds the threshold during detection is judged abnormal.
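For illustration only, the decoder output lengths defined in step 3.1 of claim 6, T^(k) = α_k T with α_k = 1/τ^(k−1), can be computed as in the sketch below. The function name `decoder_lengths` and the rounding rule (floor, with a minimum length of 1) are assumptions; the claim does not say how non-integer lengths are handled.

```python
def decoder_lengths(T, tau, num_decoders):
    """Output length T^(k) = T / tau**(k-1) for k = 1..num_decoders.

    Lengths are floored to integers and kept at least 1 (assumed rounding rule).
    """
    if tau <= 1:
        raise ValueError("tau must be greater than 1")
    if T < 1 or num_decoders < 1:
        raise ValueError("T and num_decoders must be positive")
    return [max(1, int(T // tau ** (k - 1))) for k in range(1, num_decoders + 1)]

lengths = decoder_lengths(T=8, tau=2, num_decoders=3)
```

With T = 8 and τ = 2, three decoders produce sequences of length 8, 4, and 2, so the first (highest-scale) decoder reconstructs the full input length while the others capture coarser temporal patterns.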
CN202210456392.7A 2022-04-27 2022-04-27 Cloud network end resource multidimensional time sequence anomaly detection method based on multi-scale decoding Pending CN115169430A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210456392.7A CN115169430A (en) 2022-04-27 2022-04-27 Cloud network end resource multidimensional time sequence anomaly detection method based on multi-scale decoding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210456392.7A CN115169430A (en) 2022-04-27 2022-04-27 Cloud network end resource multidimensional time sequence anomaly detection method based on multi-scale decoding

Publications (1)

Publication Number Publication Date
CN115169430A true CN115169430A (en) 2022-10-11

Family

ID=83482505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210456392.7A Pending CN115169430A (en) 2022-04-27 2022-04-27 Cloud network end resource multidimensional time sequence anomaly detection method based on multi-scale decoding

Country Status (1)

Country Link
CN (1) CN115169430A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116415197A (en) * 2023-03-13 2023-07-11 海南大学 Underground pipe gallery abnormality detection network and method based on attention mechanism
CN116702083A (en) * 2023-08-10 2023-09-05 武汉能钠智能装备技术股份有限公司四川省成都市分公司 Satellite telemetry data anomaly detection method and system
CN116702083B (en) * 2023-08-10 2023-12-26 武汉能钠智能装备技术股份有限公司四川省成都市分公司 Satellite telemetry data anomaly detection method and system
CN116743646A (en) * 2023-08-15 2023-09-12 云南省交通规划设计研究院有限公司 Tunnel network anomaly detection method based on domain self-adaptive depth self-encoder
CN116743646B (en) * 2023-08-15 2023-12-19 云南省交通规划设计研究院股份有限公司 Tunnel network anomaly detection method based on domain self-adaptive depth self-encoder
CN117310546A (en) * 2023-11-03 2023-12-29 北京迪赛奇正科技有限公司 UPS power health management monitoring system

Similar Documents

Publication Publication Date Title
CN115169430A (en) Cloud network end resource multidimensional time sequence anomaly detection method based on multi-scale decoding
CN111914873A (en) Two-stage cloud server unsupervised anomaly prediction method
CN110751108B (en) Subway distributed vibration signal similarity determination method
CN115688035A (en) Time sequence power data anomaly detection method based on self-supervision learning
CN113673346A (en) Motor vibration data processing and state recognition method based on multi-scale SE-Resnet
CN110895705A (en) Abnormal sample detection device, training device and training method thereof
CN115903741B (en) Industrial control system data anomaly detection method
CN112381790A (en) Abnormal image detection method based on depth self-coding
CN110991471B (en) Fault diagnosis method for high-speed train traction system
CN114048468A (en) Intrusion detection method, intrusion detection model training method, device and medium
CN114760098A (en) CNN-GRU-based power grid false data injection detection method and device
CN115587335A (en) Training method of abnormal value detection model, abnormal value detection method and system
CN116522265A (en) Industrial Internet time sequence data anomaly detection method and device
CN116415200A (en) Abnormal vehicle track abnormality detection method and system based on deep learning
CN117092581A (en) Segment consistency-based method and device for detecting abnormity of electric energy meter of self-encoder
CN117041972A (en) Channel-space-time attention self-coding based anomaly detection method for vehicle networking sensor
CN116628612A (en) Unsupervised anomaly detection method, device, medium and equipment
CN117113139A (en) Training method and device for fault detection model, computer equipment and storage medium
CN115184054B (en) Mechanical equipment semi-supervised fault detection and analysis method, device, terminal and medium
CN116739168A (en) Runoff prediction method based on gray theory and codec
CN113095386B (en) Gesture recognition method and system based on triaxial acceleration space-time feature fusion
CN114841196A (en) Mechanical equipment intelligent fault detection method and system based on supervised learning
CN113435321A (en) Method, system and equipment for evaluating state of main shaft bearing and readable storage medium
CN113821401A (en) WT-GA-GRU model-based cloud server fault diagnosis method
CN111178630A (en) Load prediction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination