CN116089012A

CN116089012A - Self-adaptive container anomaly detection method based on container resource index

Info

Publication number: CN116089012A
Application number: CN202310087709.9A
Authority: CN
Inventors: 刘发贵; 石上松
Original assignee: South China University of Technology SCUT
Current assignee: South China University of Technology SCUT
Priority date: 2023-01-18
Filing date: 2023-01-18
Publication date: 2023-05-09

Abstract

The invention discloses a container anomaly detection method based on container resource indexes. The method comprises the following steps: collecting resource index data of a container cluster and preprocessing it to obtain a normal sample data set; generating sample micro-clusters from the normal sample data set and storing a given number of normal samples in the micro-clusters; training an automatic encoder model with the stored normal samples as the data set; and constructing an automatic updating mechanism for the automatic encoder based on data stream micro-clusters, in which the trained automatic encoder calculates the loss function of each container sample in real time, the sample is judged abnormal or normal according to the loss function, the abnormal container is located when an abnormality is detected, and the model is kept up to date so that container anomalies are detected in real time. The method can detect container resource anomalies on multiple platforms, and has the advantages of requiring no labels, short training time, good adaptability to data streams, low resource usage and strong extensibility.

Description

Self-adaptive container anomaly detection method based on container resource index

Technical Field

The invention belongs to the technical field of computer application, and particularly relates to a self-adaptive container anomaly detection method based on container resource indexes.

Background

Container technology has been widely used in cloud computing in recent years. Compared with traditional virtualization schemes such as virtual machines, containers are lightweight and easy to scale, and anomaly detection for container platforms accordingly faces new requirements of high real-time performance and high adaptability. On the commonly used Kubernetes container platform, the existing fault detection means only monitor whether a container is alive and cannot detect anomalies in containers that are still running. For this reason, researchers have proposed container anomaly detection methods that identify abnormal container states from the large volume of container indexes, so that operation and maintenance personnel can respond early.

Researchers have achieved some results in the field of container anomaly detection. These results meet existing container anomaly detection needs to some extent, but their limitations mean that new solutions are required to detect container anomalies in more situations.

Some research results detect container anomalies by monitoring indexes outside the container. For example, the document "a method and system for detecting abnormal behavior of processes in a container (CN 109858244 A)" proposes monitoring the processes in a container from the host user layer to obtain container behavior data, and detecting anomalies with an LSTM neural network. The document "an abnormality detection method, an apparatus, a readable storage medium, and an electronic device (CN 115185777 A)" proposes an anomaly detection method based on a security facet, which obtains the resource indexes and service indexes of a container by deploying the security facet in a service application, and determines whether the container is abnormal according to an anomaly matching policy. These methods are feasible in specific container clusters; however, external indexes from the host layer, the service layer and the like are not always available to operation and maintenance personnel. By contrast, container resource indexes such as CPU usage rate, memory usage amount, disk I/O data amount and network I/O data amount are easy to collect and are supported by widely available collection tools, so an anomaly detection method based on container resource indexes has good adaptability.

In the field of anomaly detection based on container resource indexes, researchers have taken different approaches. The document "an abnormality detection method and apparatus (CN 114327963 A)" proposes an anomaly detection method based on the historical data of container instances. The method abstracts containers of the same type as a container instance, collects historical data per container instance, and constructs an online anomaly detector based on threshold detection. The document "container cloud cluster node anomaly detection method and system (CN 114942875 A)" proposes an anomaly detection component deployed on a cloud cluster, which collects various indexes and events in the cloud platform and reports them to a cloud service API. However, existing research indicates that in complex and changeable container environments, threshold detection methods struggle to adjust the threshold in time as the working state of containers changes, while more complex anomaly detection methods based on statistics, distance and density must also avoid the excessive overhead caused by massive container data (Z. Zou, Y. Xie, K. Huang, G. Xu, D. Feng and D. Long, "A Docker Container Anomaly Monitoring System Based on Optimized Isolation Forest," in IEEE Transactions on Cloud Computing, vol. 10, no. 1, pp. 134-145, 1 Jan.-March 2022, doi: 10.1109/TCC.2019.2935724). In addition, container data is generated quickly and is difficult to label, so anomaly detection methods based on supervised learning face the problem that training sets are hard to obtain. Cloud platforms therefore still need an unsupervised and self-adaptive anomaly detection method based on container resource indexes.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides an unsupervised and self-adaptive container anomaly detection method based on container resource indexes, addressing the insufficient flexibility and real-time performance of existing container anomaly detection methods. The method uses container resource index data acquired through a container platform interface, processes the data with a data stream algorithm to generate micro-clusters, trains an automatic encoder on the samples stored in the micro-clusters, detects abnormal samples in real time from the automatic encoder's detection results, locates the abnormal containers, and updates the anomaly detection model while the method is running.

The object of the invention is achieved by at least one of the following technical solutions.

A container anomaly detection method based on container resource indexes comprises the following steps:

S1, collecting resource index data of a container cluster and preprocessing the resource index data to obtain a normal sample data set;

S2, generating sample micro-clusters from the normal sample data set, and storing a given number of normal samples in the micro-clusters;

S3, training an automatic encoder model with the stored normal samples as the data set;

S4, constructing an automatic updating mechanism for the automatic encoder based on the data stream micro-clusters, calculating the loss function of each container sample in real time with the trained automatic encoder, judging whether the sample is abnormal according to the loss function, locating the abnormal container when an abnormality is detected, and realizing real-time detection of container anomalies through the constructed automatic updating mechanism.

Further, in step S1, the resource index data of the container cluster includes the CPU usage rate, memory usage amount, disk I/O data amount and network I/O data amount of each container, collected through an interface provided by the container platform;

the interface is the one provided by Docker Stats, and the index types can be extended through a custom data collection interface;

the preprocessing process of the resource index data comprises difference processing, standardization processing and default value processing;

the difference processing, the standardization processing and the default value processing all operate on samples, where a sample is all data collected at a given timestamp, including the container names, index types, collected values and collection times of the container indexes;

the difference processing means that if a resource index is a cumulative value, the difference from the historical record is calculated to obtain the actual resource usage in the current time interval; if the difference is less than or equal to zero, the real-time value of the index is set to the most recent non-zero difference; the memory usage amount, the disk I/O data amount, the network I/O data amount and other such indexes all require difference processing;

the standardization processing means that the data after difference processing is standardized with the existing mean and variance of the resource index, after which the mean and variance of the resource index are updated;

The default value (missing value) processing means that collected samples are gathered into a data block; for all samples stored in the data block, indexes with a non-zero difference are marked as valid indexes and keep their existing difference, while indexes with a zero difference are assigned values by linear filling (interpolation); default value processing is performed once each time a non-zero difference is generated, and samples at the end of the data block are still filled with the valid difference of the most recent timestamp; the memory usage amount, the disk I/O data amount and the network I/O data amount are the three indexes subject to default value processing; default value processing mitigates the data loss caused by overly long sampling intervals of some indexes, but reduces the real-time performance of anomaly detection; it is typically enabled when the index collection intervals are not synchronized, and is applied to the resource indexes with longer sampling intervals;

the container clusters need to keep a normal state in the resource index collection process;

all containers in the cluster are required to be in an operating state, and services carried by the containers are required to be in a normally accessible state; short fluctuations in the container resource metrics or business metrics do not affect the metrics collection process.
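The difference and standardization steps of S1 can be illustrated with a minimal Python sketch. The names `difference` and `RunningStandardizer` are hypothetical, and the use of Welford's online update with population variance is an assumption; the text only states that each value is standardized with the existing mean and variance before those statistics are updated.

```python
import math


def difference(cumulative_values):
    """Turn cumulative readings into per-interval usage.

    A difference <= 0 is replaced by the most recent non-zero difference.
    """
    diffs = []
    last_nonzero = 0.0
    for prev, cur in zip(cumulative_values, cumulative_values[1:]):
        d = cur - prev
        if d <= 0:
            d = last_nonzero
        else:
            last_nonzero = d
        diffs.append(d)
    return diffs


class RunningStandardizer:
    """Standardize with the statistics seen so far, then update them."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (assumed: Welford)

    def transform_then_update(self, value):
        # Standardize using the existing mean and (population) variance.
        std = math.sqrt(self.m2 / self.count) if self.count > 1 else 0.0
        z = (value - self.mean) / std if std > 0 else 0.0
        # Only afterwards fold the new value into the statistics.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return z
```

In this sketch one `RunningStandardizer` would be kept per container index, so that a cluster of N containers with four indexes each holds 4N of them.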

Further, in step S2, a data stream micro-cluster storing a plurality of data samples is constructed, which involves the core parameters of the micro-cluster, the calculation of the key attributes of the micro-cluster, and the division of micro-cluster types, specifically as follows:

The construction of the micro-cluster needs three parameters of epsilon, lambda and beta, wherein epsilon is a micro-cluster radius threshold parameter, lambda is a decay parameter, and beta is an outlier micro-cluster threshold parameter;

the micro-cluster comprises three indexes of w, C and r, wherein w, C and r respectively represent the weight, the center point and the radius of the micro-cluster, and the calculation can be carried out through the history information and the sample point of the micro-cluster; the weight w of the micro clusters determines the number of sample points that the micro clusters can store;

according to different weights w of the micro clusters, the micro clusters are divided into core micro clusters and outlier micro clusters, and the two micro clusters can be mutually converted; removing the outlier micro clusters with insufficient weight after each micro cluster generation period;

in a micro-cluster with n sample points, P_i denotes the i-th sample point in the micro-cluster, T_i denotes the timestamp of the i-th sample point, and T denotes the current time; once ε, λ and β are defined, the attributes of the micro-cluster can be calculated with the following formulas:

the decay parameter λ controls how much importance the method gives to historical data: the larger λ is, the faster the weight of historical data decays; the value range is 0 < λ < 1; with t the time difference between the historical time and the current time, the decay factor f(t) is calculated as shown in formula (1):

$$f(t) = 2^{-\lambda t} \tag{1}$$

the weight w of the micro cluster determines the type and life cycle of the micro cluster and limits the total number of sample points stored in the micro cluster; w is calculated by the time stamp of the sample, and the calculation formula is shown as formula (2):

The center point C and the radius r of the micro cluster are calculated through the weight w and the coordinates of all sample points in the micro cluster; under the action of the decay factor f (t), the longer the sample enters the micro cluster, the smaller the influence degree of the sample on the characteristics; the two parameter calculation formulas are shown in formulas (3) (4):

wherein d (A, B) represents Euclidean distance between the two points A and B, specifically represents absolute distance between the two points A and B in a multidimensional space, and a calculation formula is shown in formula (5):

where n represents the spatial dimension, A and B are points in n-dimensional space, and x_i and y_i represent the coordinates of A and B in the i-th dimension.
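The images of formulas (2) to (5) are not reproduced in this text. The following is a reconstruction from the definitions above, assuming the decayed-weight form commonly used for data stream micro-clusters; it is a sketch consistent with formula (1), not the published formulas themselves. In formulas (2) to (4) the sum runs over the n sample points of the micro-cluster, and in formula (5) over the n spatial dimensions.

```latex
\begin{align}
w &= \sum_{i=1}^{n} f(T - T_i) \tag{2} \\
C &= \frac{1}{w} \sum_{i=1}^{n} f(T - T_i)\, P_i \tag{3} \\
r &= \frac{1}{w} \sum_{i=1}^{n} f(T - T_i)\, d(P_i, C) \tag{4} \\
d(A, B) &= \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{5}
\end{align}
```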

Further, the outlier micro-clusters have lower weights and sample reception priorities than the core micro-clusters, and there is a possibility of being removed; the details of the partitioning of core and outlier micro clusters are as follows:

μ is the maximum value of the micro-cluster weight and is used to calculate the threshold dividing core micro-clusters from outlier micro-clusters; specifically, assuming that one sample flows into the micro-clusters per unit time, as the current time t → ∞ the sum μ of the weights of all micro-clusters in the data stream converges to a constant according to formula (1) and formula (2), and the calculation formula is shown in formula (6):

to distinguish micro-clusters by weight, an outlier micro-cluster threshold parameter β is defined with 0 < β < 1, and βμ is taken as the threshold separating core micro-clusters from outlier micro-clusters; T_p is defined as the shortest time required for a core micro-cluster to be converted into an outlier micro-cluster, and its calculation formula is shown in formula (7):

for outlier micro-clusters, T_p is also the shortest life cycle.
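The images of formulas (6) and (7) are not reproduced in this text. Under the stated assumption that one sample arrives per unit time, formula (6) follows from summing the decay factor of formula (1) over all past arrivals. The form given here for formula (7) is an assumption borrowed from the DenStream algorithm, in which T_p solves 2^{-λ T_p} βμ + 1 = βμ; it requires βμ > 1 and may differ from the published formula.

```latex
\begin{align}
\mu &= \sum_{t=0}^{\infty} 2^{-\lambda t} = \frac{1}{1 - 2^{-\lambda}} \tag{6} \\
T_p &= \left\lceil \frac{1}{\lambda} \log_2 \frac{\beta\mu}{\beta\mu - 1} \right\rceil \tag{7}
\end{align}
```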

Further, in step S2, initialization of the micro cluster is performed;

the initialization process uses a normal sample data set to acquire an initial micro-cluster, wherein the initial micro-cluster is generated by multiple iterations of the normal sample data set, and specifically comprises the following steps:

the first sample point in the data set is taken as an initial micro-cluster with weight w = 1, center point C equal to the sample itself and radius r = 0; the samples in the data set are then presented to the micro-clusters in sequence, and the first batch of micro-clusters is generated through the micro-clusters' reception of these samples; this first batch of micro-clusters then receives the samples in the data set repeatedly over multiple iterations, and the micro-clusters produced by the iterations serve as the initial micro-clusters for subsequent model training.
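A single pass of this initialization can be sketched in Python as follows. The class `MicroCluster` and the function `one_init_pass` are hypothetical names, and the weight, center and radius are computed with an assumed decayed-weight form (a weighted sum, weighted mean and weighted mean distance); the repeated iterations over the data set are left out for brevity.

```python
import math


def decay(dt, lam):
    """Decay factor f(t) = 2^(-lambda * t) from formula (1)."""
    return 2.0 ** (-lam * dt)


class MicroCluster:
    def __init__(self, points, times):
        self.points = [tuple(p) for p in points]
        self.times = list(times)

    def attributes(self, now, lam):
        """Return (w, C, r) under the assumed decayed-weight definitions."""
        fs = [decay(now - t, lam) for t in self.times]
        w = sum(fs)
        dim = len(self.points[0])
        center = tuple(
            sum(f * p[k] for f, p in zip(fs, self.points)) / w for k in range(dim)
        )
        r = sum(f * math.dist(p, center) for f, p in zip(fs, self.points)) / w
        return w, center, r

    def try_receive(self, point, now, eps, lam):
        """Accept the point only if the enlarged cluster keeps radius < eps."""
        candidate = MicroCluster(self.points + [tuple(point)], self.times + [now])
        if candidate.attributes(now, lam)[2] < eps:
            self.points, self.times = candidate.points, candidate.times
            return True
        return False


def one_init_pass(dataset, eps, lam):
    """Feed (timestamp, point) normal samples to the micro-clusters once."""
    clusters = []
    for now, point in dataset:
        point = tuple(point)
        nearest = min(
            clusters,
            key=lambda mc: math.dist(mc.attributes(now, lam)[1], point),
            default=None,
        )
        # A sample that no existing cluster can receive starts a new cluster.
        if nearest is None or not nearest.try_receive(point, now, eps, lam):
            clusters.append(MicroCluster([point], [now]))
    return clusters
```

A freshly created cluster reports w = 1, C equal to its only sample and r = 0, matching the initial micro-cluster described above.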

Further, in step S3, an Automatic Encoder (AE) model is divided into two parts, namely an encoder and a decoder, specifically as follows:

the definition of an automatic encoder is as follows:

$$h = \sigma_e(W_1 x + b_1) \tag{8}$$

$$y = \sigma_d(W_2 h + b_2) \tag{9}$$

where x represents the input of the encoder, h represents the output of the hidden layer, and y represents the output of the decoder; σ_e and σ_d are the activation functions of the encoder and decoder respectively, W_1 and W_2 are the weights of the encoder and decoder respectively, and b_1 and b_2 are the biases of the encoder and decoder respectively; J(W, b) is the loss function of the automatic encoder;

the encoder and decoder are structured as follows:

the input samples of the encoder are the sample points preprocessed in the step S1; in a cluster formed by N containers, four resource indexes of CPU utilization rate, memory utilization amount, disk I/O data amount and network I/O data amount exist in each container, so that the dimension of a sample point is 4N; the encoder comprises 4 hidden layers, wherein each hidden layer is a fully-connected layer with an input dimension of m and an output dimension of n, and m is greater than n; each hidden layer is followed by a ReLU function as an activation function; the input dimension of the encoder is related to the container cluster size, so the structure of the encoder should be adjusted according to the number N of containers;

the structure of the decoder is symmetrical to that of the encoder; for any fully-connected layer in the encoder with input dimension m and output dimension n, the decoder has one and only one corresponding fully-connected layer with input dimension n and output dimension m, and vice versa; likewise, each hidden layer in the decoder is followed by a ReLU function as the activation function.
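How the layer sizes might be derived from the number of containers N can be sketched as below. The function names and the halving ratio `shrink=0.5` are assumptions made for illustration; the text only requires four encoder layers whose output dimension is smaller than the input dimension, mirrored by the decoder.

```python
def encoder_layer_dims(n_containers, n_hidden=4, shrink=0.5):
    """Return (input_dim, output_dim) pairs for the encoder layers."""
    dims = []
    m = 4 * n_containers  # four resource indexes per container
    for _ in range(n_hidden):
        n = max(1, int(m * shrink))
        if n >= m:
            raise ValueError("cluster too small for this many shrinking layers")
        dims.append((m, n))
        m = n
    return dims


def decoder_layer_dims(encoder_dims):
    """Mirror the encoder: each (m, n) layer becomes (n, m), in reverse order."""
    return [(n, m) for (m, n) in reversed(encoder_dims)]
```

For a cluster of 8 containers this gives an encoder of 32, 16, 8, 4 and 2 units and a decoder that expands back to 32.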

Further, in step S4, an automatic encoder automatic updating mechanism based on the data stream micro-cluster is constructed to realize real-time detection of container anomalies, which specifically comprises the following steps:

S4.1, inputting real-time container samples into the trained automatic encoder, calculating the loss function of each sample with the automatic encoder, and detecting abnormal container samples in real time according to the loss function;

S4.2, after the trained automatic encoder has performed anomaly detection on a sample, having the micro-clusters receive the new normal sample, then updating the attributes and sample queues of all micro-clusters and removing outdated samples from the sample queues;

S4.3, after each micro-cluster generation period, checking the weights of all micro-clusters and removing the outlier micro-clusters whose weight is below the set threshold;

S4.4, after each model training period, training a new automatic encoder model with the normal samples stored in the micro-clusters, to address concept drift in the container resource index data stream; the training may complete within one or several timestamps, and the method returns to step S4.1 once training is complete.

Further, in step S4.1, the loss function J (W, b) is a mean square loss function (Mean Square Error, MSE) that calculates the degree of difference between the input x and the output y;

The automatic encoder model expects outputs and inputs to be as identical as possible;

in anomaly detection, after training on data from normal operation of the container cluster, the MSE of a normal sample is relatively low, whereas an abnormal sample is difficult for the decoder to reconstruct, so its MSE is markedly higher than that of normal samples;

after obtaining the MSE, selecting one of the following three methods as an anomaly detection method according to cluster characteristics:

the 3-sigma criterion method, i.e. the anomaly threshold is set to the mean of the model's historical training loss plus three times its standard deviation; a sample is judged abnormal if its MSE exceeds the threshold, and normal otherwise; this method is adaptive and convenient, and is the default anomaly detection method of the scheme;

manual assignment, i.e. operation and maintenance personnel set the anomaly threshold manually; this method can meet stricter requirements on the sensitivity and accuracy of anomaly detection;

the LOF method, i.e. abnormal points in the MSE are further detected with the Local Outlier Factor (LOF) algorithm; this method is suitable when the two methods above perform poorly;
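The default 3-sigma criterion can be sketched in a few lines of Python. The function names are hypothetical, and the choice of the population standard deviation (`statistics.pstdev`) rather than the sample standard deviation is an assumption the text does not settle.

```python
import statistics


def three_sigma_threshold(training_losses):
    """Mean of the historical training loss plus three standard deviations."""
    mean = statistics.fmean(training_losses)
    std = statistics.pstdev(training_losses)  # assumed: population std
    return mean + 3 * std


def is_abnormal(mse, threshold):
    """A sample is abnormal when its reconstruction MSE exceeds the threshold."""
    return mse > threshold
```

Because the threshold is recomputed from each newly trained model's loss history, it follows the model updates without manual tuning.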

if the sample point is abnormal, the MSE is used to locate the containers that contribute most to the sample's abnormality, and these containers are identified as the root causes of the anomaly;

Specifically, the micro-cluster center closest to the sample is selected as a normal reference sample; the resource indexes of some containers in the reference sample are successively replaced with the corresponding container indexes of the sample under test, and the position of the abnormal container is determined by comparing the MSE of the different replacement schemes;

the process uses bisection to search for the container most likely to have caused the anomaly; once an abnormal container is recorded, its indexes in the sample under test are replaced with those of the normal reference sample;

the abnormal container detection process is then repeated to locate other abnormal containers.
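The substitution-and-bisection search can be sketched as follows. The function `locate_abnormal_containers` and its parameters are hypothetical; `mse_fn` stands in for the trained automatic encoder's reconstruction loss, and the sample layout (four consecutive indexes per container) and the stopping rule (loss back under the threshold) are assumptions.

```python
def locate_abnormal_containers(sample, reference, n_containers, mse_fn,
                               threshold, indexes_per_container=4):
    """Return container positions judged to cause the anomaly, in search order."""
    sample = list(sample)
    k = indexes_per_container

    def mixed(containers):
        # Reference sample with the given containers taken from the test sample.
        out = list(reference)
        for c in containers:
            out[c * k:(c + 1) * k] = sample[c * k:(c + 1) * k]
        return out

    found = []
    while mse_fn(sample) > threshold and len(found) < n_containers:
        candidates = [c for c in range(n_containers) if c not in found]
        while len(candidates) > 1:
            mid = len(candidates) // 2
            left, right = candidates[:mid], candidates[mid:]
            # Keep the half whose substitution produces the larger loss.
            if mse_fn(mixed(left)) >= mse_fn(mixed(right)):
                candidates = left
            else:
                candidates = right
        culprit = candidates[0]
        found.append(culprit)
        # Replace the located container with reference values, then repeat.
        sample[culprit * k:(culprit + 1) * k] = reference[culprit * k:(culprit + 1) * k]
    return found
```

Each located container costs roughly 2·log2(N) loss evaluations, which is the reason for bisecting instead of testing containers one by one.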

Further, in step S4.2, after a new sample point has passed anomaly detection, reception is attempted in priority order: first by the core micro-cluster nearest to it in Euclidean distance in the sample space, then by the nearest outlier micro-cluster; if neither micro-cluster can receive the sample, a new outlier micro-cluster centered on the sample is created;

sample points enter a micro-cluster through the following process: when a target micro-cluster mc attempts to receive a sample p, a new micro-cluster mc′ is generated from the coordinates of p and the original attributes of mc; if the radius r_mc′ of mc′ satisfies r_mc′ < ε, sample p can be received by mc; once this is confirmed, mc is deleted from the micro-cluster list and mc′ is added in its place; after being received by a micro-cluster, the sample enters the normal sample queue so as to take part in the initialization and updating of the automatic encoder model in step S4.4;

After the sample points of the latest timestamp have been processed, every micro-cluster updates its attributes whether or not it has absorbed a new sample point; each micro-cluster maintains a normal sample queue holding its latest samples, whose length is w rounded down, and the earliest samples exceeding the queue length are discarded after the micro-cluster attributes are updated.

Further, in step S4.3, a weight check of the micro-clusters is performed once every interval T_p;

for any micro-cluster, if the weight w ≥ βμ, the micro-cluster is a core micro-cluster; if the weight w < βμ, the micro-cluster is an outlier micro-cluster;

after the weights are calculated, an outlier micro-cluster is retained until the next weight check; if its weight then satisfies w ≥ βμ, it is converted into a core micro-cluster, otherwise it is removed;

after the weights of all the micro clusters are checked, the core micro cluster and the outlier micro cluster list are updated.
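One periodic weight check can be sketched as below. The function `check_weights` and its data layout (lists of cluster identifiers plus a weight dictionary) are hypothetical; the sketch assumes a core micro-cluster that falls below βμ is demoted and kept for one more period, while an existing outlier micro-cluster is either promoted or removed.

```python
def check_weights(core, outliers, weights, beta_mu):
    """Return the updated (core, outliers) lists after one weight check.

    core, outliers: lists of micro-cluster ids
    weights: dict mapping id -> current weight w
    beta_mu: threshold separating core from outlier micro-clusters
    """
    new_core, new_outliers = [], []
    for mc in core:
        # A core micro-cluster below the threshold is demoted, not deleted.
        (new_core if weights[mc] >= beta_mu else new_outliers).append(mc)
    for mc in outliers:
        # An outlier kept since the previous check is promoted or dropped.
        if weights[mc] >= beta_mu:
            new_core.append(mc)
    return new_core, new_outliers
```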

Compared with the prior art, the invention has the following technical achievements and advantages:

1. The method uses data stream micro-clusters to extract container resource index information such as CPU, memory, disk and network, and uses an automatic encoder for anomaly detection and localization, achieving good anomaly detection accuracy, running speed and generalization capability.

2. The container anomaly detection method is unsupervised and self-adaptive: the model can be trained with only a small number of normal samples, a large volume of container resource indexes can be processed, and the anomaly detection model can be updated quickly, which suits the lightweight nature of container environments. In contrast, most conventional anomaly detection methods are supervised or semi-supervised, require training data to be collected and labeled in each new container cluster, and cannot update the model in real time.

3. According to the container anomaly detection model, the adopted iterative micro clusters can store normal samples of automatically updated stream data, so that the purity and instantaneity of the positive samples of the anomaly detection training set are improved; the AE adopted by the method can better utilize the anomaly detection training set to complete model training in a short time and perform real-time anomaly detection of samples. The combination of the two methods can efficiently and accurately execute the detection work of the abnormality of the container resource index.

Drawings

FIG. 1 is a schematic diagram of an adaptive container anomaly detection method based on container resource indicators in an embodiment of the present invention.

Fig. 2 is a schematic diagram of a flow of container resource index collection and anomaly detection in an embodiment of the present invention.

FIG. 3 is a schematic diagram of an automatic encoder according to an embodiment of the present invention.

Fig. 4 is a schematic flow chart of a micro cluster receiving sample according to an embodiment of the present invention.

Fig. 5 is a sample distribution diagram of the micro-clusters on TrainTicket in an embodiment of the present invention.

Fig. 6 is a schematic diagram of the results of a container anomaly detection experiment on TrainTicket in an embodiment of the present invention.

Fig. 7 is a schematic diagram of experimental results of the method on TrainTicket in an embodiment of the present invention.

FIG. 8 is a schematic diagram of experimental results of the method on SockShop in the embodiment of the invention.

FIG. 9 is a schematic diagram of experimental results of the method on an Online Boutique in the embodiment of the present invention.

Detailed Description

In order to make the technical solution and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings, but the practice and protection of the present invention are not limited thereto.

Example 1:

a container anomaly detection method based on container resource indexes, as shown in figure 1, comprises the following steps:

the resource index data sample size is determined by an operation and maintenance personnel according to the index sampling interval of the container platform.

In example 1, the sampling interval was 5s, and 300 time-stamped samples were used as the normal sample training set.

As shown in fig. 2, during resource index data collection, the container resource indexes are obtained through the Docker Stats interface every 5 s, and the collected data is uploaded in real time to the time series database Prometheus. The resource index data of the container cluster comprises the CPU usage rate, memory usage amount, disk I/O data amount and network I/O data amount of each container, collected through the interface provided by the container platform; the sample preprocessing module then reads the container resource indexes from the Prometheus database in real time and preprocesses them.

the difference processing means that if a resource index is a cumulative value, the difference from the historical record is calculated to obtain the actual resource usage in the current time interval; if the difference is less than or equal to zero, the real-time value of the index is set to the most recent non-zero difference; in particular, when the difference is negative, operation and maintenance personnel are notified to check the state of the container cluster; the memory usage amount, the disk I/O data amount, the network I/O data amount and other such indexes all require difference processing;

all containers in the cluster are required to be in an operating state, the operations of creating, restarting and the like of the containers are avoided, and the service carried by the containers is required to be in a state of being normally accessed; short fluctuations in the container resource metrics or business metrics do not affect the metrics collection process.

a data stream micro-cluster storing a plurality of data samples is constructed, which involves the core parameters of the micro-cluster, the calculation of the key attributes of the micro-cluster, and the division of micro-cluster types, specifically as follows:

$$f(t) = 2^{-\lambda t} \tag{1}$$

The outlier micro-clusters have lower weights and sample reception priorities than the core micro-clusters, and there is a possibility of being removed; the details of the partitioning of core and outlier micro clusters are as follows:

To distinguish micro-clusters by weight, an outlier micro-cluster threshold parameter β is defined with 0 < β < 1, and βμ is taken as the threshold separating core micro-clusters from outlier micro-clusters; T_p is defined as the shortest time required for a core micro-cluster to be converted into an outlier micro-cluster, and its calculation formula is shown in formula (7):

Further, initialization of the micro-clusters is performed;

the first sample point in the data set is taken as an initial micro-cluster with weight w = 1, center point C equal to the sample itself and radius r = 0; the samples in the data set are then presented to the micro-clusters in sequence, and the first batch of micro-clusters is generated through the micro-clusters' reception of these samples; this first batch of micro-clusters then receives the samples in the data set repeatedly over multiple iterations, and the micro-clusters produced by the iterations serve as the initial micro-clusters for subsequent model training. In example 1, the 300 samples of the initial data set were iterated 30 times to form the initial micro-clusters.

An Automatic Encoder (AE) model is divided into two parts, namely an encoder and a decoder, and is specifically as follows:

the definition of an automatic encoder is as follows:

$$h = \sigma_e(W_1 x + b_1) \tag{8}$$

$$y = \sigma_d(W_2 h + b_2) \tag{9}$$

the encoder and decoder are structured as follows:

the input samples of the encoder are the sample points preprocessed in the step S1; in a cluster formed by N containers, four resource indexes of CPU utilization rate, memory utilization amount, disk I/O data amount and network I/O data amount exist in each container, so that the dimension of a sample point is 4N; the encoder comprises 4 hidden layers, wherein each hidden layer is a fully-connected layer with an input dimension of m and an output dimension of n, and m is greater than n; each hidden layer is followed by a ReLU function as an activation function; the input dimension of the encoder is related to the container cluster size, so the structure of the encoder should be adjusted according to the number N of containers; the automatic encoder structure of embodiment 1 is shown in fig. 3.

S4. Construct an automatic updating mechanism for the automatic encoder based on the data stream micro-clusters, with the following specific steps:

As shown in equation (10), the loss function J(W, b) is the mean squared error (MSE), which measures the degree of difference between the input x and the output y.
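As an illustration of the MSE loss between an input sample x and its reconstruction y, the following sketch averages the squared differences over the sample dimensions. The function name and the averaging over the number of dimensions are assumptions; equation (10) may use a different normalization constant.

```python
def mse_loss(x, y):
    """Mean squared error between input x and reconstruction y.

    Assumes both are equal-length sequences of floats; the 1/len(x)
    normalization is the usual MSE convention, assumed here.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


perfect = mse_loss([0.2, 0.4, 0.6], [0.2, 0.4, 0.6])  # identical vectors
poor = mse_loss([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])     # one dimension is off by 2
```

A sample that the model reconstructs well yields a loss near zero, while a sample unlike the training data yields a larger loss, which is the quantity compared against the anomaly threshold.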

After passing anomaly detection, a new sample point is offered, in order of priority, first to the core micro-cluster nearest to it in Euclidean distance and then to the outlier micro-cluster nearest to it in Euclidean distance in the sample space; if neither micro-cluster can receive it, a new outlier micro-cluster is created with the sample as its center;

As shown in Fig. 4, a sample point enters a micro-cluster through the following process: a target micro-cluster mc attempts to receive a sample p, and a new micro-cluster mc′ is generated from the coordinates of p and the original indexes of mc; if the radius $r_{mc'}$ of mc′ satisfies $r_{mc'} < \varepsilon$, sample p can be received by mc; once it is confirmed that p can be received, mc is deleted from the micro-cluster list and mc′ is added in its place; after being received by a micro-cluster, the sample enters the normal sample queue so that it can take part in the initialization and updating of the automatic encoder model in step S4.4;
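The reception process can be sketched as follows. The priority order (nearest core micro-cluster, then nearest outlier micro-cluster, then a new outlier micro-cluster) and the radius test against ε follow the text. The summary statistics kept per micro-cluster (weight, linear sum, squared sum), the radius formula built from them, and the unit weight added per sample are assumptions borrowed from common stream-clustering practice, not taken from this passage; weight decay and the normal sample queue are omitted.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroCluster:
    weight: float        # number of absorbed samples (decay is ignored here)
    linear_sum: tuple    # per-dimension sum of sample coordinates
    square_sum: float    # sum of squared norms of the samples

    @classmethod
    def from_sample(cls, p):
        """A fresh micro-cluster: weight 1, center p, radius 0."""
        return cls(1.0, tuple(p), sum(a * a for a in p))

    @property
    def center(self):
        return tuple(s / self.weight for s in self.linear_sum)

    @property
    def radius(self):
        """Root-mean-square distance of the samples to the center (assumed)."""
        c2 = sum(c * c for c in self.center)
        return math.sqrt(max(0.0, self.square_sum / self.weight - c2))

    def merged_with(self, p):
        """Return the candidate mc' built from this cluster and sample p."""
        return MicroCluster(self.weight + 1.0,
                            tuple(s + a for s, a in zip(self.linear_sum, p)),
                            self.square_sum + sum(a * a for a in p))


def try_receive(clusters, p, eps):
    """Offer p to the nearest cluster; on success replace mc by mc'."""
    if not clusters:
        return False
    i = min(range(len(clusters)),
            key=lambda k: math.dist(clusters[k].center, p))
    candidate = clusters[i].merged_with(p)
    if candidate.radius < eps:
        clusters[i] = candidate
        return True
    return False


def insert_sample(core, outliers, p, eps):
    """Route p by priority: core first, then outlier, else new outlier."""
    if try_receive(core, p, eps):
        return "core"
    if try_receive(outliers, p, eps):
        return "outlier"
    outliers.append(MicroCluster.from_sample(p))
    return "new outlier"
```

Building mc′ as a separate immutable object mirrors the text: the original mc is only replaced after the radius check on mc′ succeeds.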

At every interval of $T_p$, one weight check of the micro-clusters is performed;
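A periodic weight check could look like the sketch below. It uses the decay function f(t) = 2^(-λt) of formula (1) and the threshold βμ that separates core from outlier micro-clusters. Applying the decay as a multiplicative factor on the stored weight, treating a weight exactly equal to βμ as core, and the dictionary layout are assumptions; the rule for removing outlier micro-clusters is not given in this passage, so the sketch only reclassifies.

```python
def decayed_weight(weight, elapsed, lam):
    """Apply the decay f(t) = 2 ** (-lam * t) to a stored weight."""
    return weight * 2.0 ** (-lam * elapsed)


def weight_check(weights, elapsed, lam, beta, mu):
    """Decay every micro-cluster weight and split clusters by beta * mu.

    `weights` maps a hypothetical micro-cluster name to its weight.
    Returns (core, outliers) as two dicts of decayed weights.
    """
    threshold = beta * mu
    core, outliers = {}, {}
    for name, w in weights.items():
        w = decayed_weight(w, elapsed, lam)
        if w >= threshold:
            core[name] = w
        else:
            outliers[name] = w
    return core, outliers


# With lam = 0.5 and 2 time units elapsed, every weight is halved.
core, outliers = weight_check({"a": 10.0, "b": 3.0},
                              elapsed=2.0, lam=0.5, beta=0.5, mu=4.0)
```

In this usage the threshold is 0.5 × 4 = 2, so micro-cluster "a" (decayed weight 5) stays core and "b" (decayed weight 1.5) becomes an outlier micro-cluster.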

Fig. 5 shows the distribution of all samples, recent samples, and the samples stored in the micro-clusters in Example 1. The original samples are reduced in dimension with the t-SNE method and plotted in a two-dimensional plane so that the number and shape of the sample sets can be compared. As Fig. 5 shows, the micro-cluster samples capture the sample information in the data stream well, and their number does not grow as the total number of samples increases.

S4.4. After each model training period, a new automatic encoder model is trained on the normal samples stored in the micro-clusters, which addresses concept drift in the container resource index data stream; the training may be completed within one or several timestamps, and the procedure returns to step S4.1 once training is complete;

The trained automatic encoder is used to calculate the loss function of each container sample in real time, and the sample is judged abnormal or normal according to that loss. When an abnormality is detected, the abnormal container is located; meanwhile, the constructed automatic updating mechanism keeps the detection of container abnormalities running in real time.

Fig. 6 shows the results of detecting container anomalies in a TrainTicket microservice cluster with the present invention. The experimental results show that the method achieves good anomaly detection accuracy. Fig. 7 compares the time consumed by the present invention in the TrainTicket microservice cluster.

Example 2:

Experiments were performed on the Sock Shop microservice demo. Compared with Example 1, the initial data set size and the ε parameter were adjusted; the experimental results are shown in Fig. 8.

Example 3:

Experiments were performed on the Online Boutique microservice demo. Compared with Example 1, the initial data set size and the ε parameter were adjusted; the experimental results are shown in Fig. 9.

The experimental results show that the method has a low time cost, a stable average time consumption per sample, and good timeliness. Taken together, the experimental results show that the invention is unsupervised and self-adaptive, achieves high accuracy and good timeliness, and can meet the requirements of container anomaly detection based on container resource indexes in real container environments.

Claims

1. The container anomaly detection method based on the container resource index is characterized by comprising the following steps of:

2. The container anomaly detection method based on container resource indexes according to claim 1, wherein in step S1, the resource index data of the container cluster includes CPU utilization, memory usage, disk I/O data volume and network I/O data volume of the container counted by the interface provided by the container platform;

the difference processing means that, if a resource index is an accumulated value, its difference from the historical record is calculated to obtain the actual resource usage in the current time interval; if the difference is less than or equal to zero, the real-time value of the index is set to the most recent non-zero difference; the memory usage, disk I/O data volume and network I/O data volume are subjected to difference processing;

the default value processing means that the collected samples are gathered into a data block; for all samples stored in the data block, indexes with a non-zero difference are marked as valid indexes and keep their existing difference, while indexes with a zero difference are assigned values by a linear filling method; the default value processing is performed once each time a non-zero difference is produced, and samples at the end of the data block are still filled with the valid difference of the most recent timestamp; the memory usage, disk I/O data volume and network I/O data volume are subjected to default value processing;
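One possible reading of the difference processing and default value processing for a single accumulated index is sketched below. Taking differences of consecutive accumulated values and filling the end of the block with the most recent valid difference follow the text; interpreting "linear filling" as linear interpolation between the neighbouring valid differences, filling a leading gap with the first valid difference, and the function names are assumptions.

```python
def to_differences(cumulative):
    """Per-interval usage derived from an accumulated counter."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def linear_fill(diffs):
    """Fill non-positive differences inside one data block.

    Gaps between two valid (positive) differences are interpolated
    linearly; gaps at the end of the block take the most recent valid
    difference. Leading gaps take the first valid one (assumed).
    """
    valid = [i for i, d in enumerate(diffs) if d > 0]
    if not valid:
        return list(diffs)          # nothing valid to fill from
    filled = list(diffs)
    for i in range(valid[0]):       # leading gap
        filled[i] = diffs[valid[0]]
    for left, right in zip(valid, valid[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = diffs[left] + frac * (diffs[right] - diffs[left])
    for i in range(valid[-1] + 1, len(diffs)):   # end of the data block
        filled[i] = diffs[valid[-1]]
    return filled


# Accumulated disk I/O bytes sampled at five timestamps (made-up values).
raw = to_differences([100, 110, 110, 140, 140])
block = linear_fill(raw)
```

Here the raw differences 10, 0, 30, 0 become 10, 20, 30, 30: the interior zero is interpolated and the trailing zero reuses the latest valid difference.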

3. The method for detecting container anomaly based on container resource indexes according to claim 1, wherein in step S2 a data stream micro-cluster storing a plurality of data samples is constructed, which includes constructing the core parameters of the micro-cluster, the calculation methods of the key indexes within the micro-cluster, and the division of micro-cluster types, specifically as follows:

$$f(t) = 2^{-\lambda t} \tag{1}$$

4. The container anomaly detection method based on container resource indexes according to claim 3, wherein outlier micro-clusters have a lower weight and a lower sample reception priority than core micro-clusters and may be removed; core micro-clusters and outlier micro-clusters are distinguished as follows:

5. The method for detecting container abnormality based on container resource index according to claim 4, wherein in step S2, initialization of micro clusters is performed;

6. The method for detecting container anomaly based on container resource index according to claim 1, wherein in step S3, the automatic encoder model is divided into two parts, namely an encoder and a decoder, specifically as follows:

the definition of an automatic encoder is as follows:

$$h = \sigma_e(W_1 x + b_1) \tag{8}$$

$$y = \sigma_d(W_2 h + b_2) \tag{9}$$

The encoder and decoder are structured as follows:

7. The method for detecting abnormal containers based on the container resource index according to claim 1, wherein in step S4, an automatic encoder update mechanism based on the data stream micro-cluster is constructed to realize real-time detection of abnormal containers, specifically comprising the following steps:

8. The method for detecting container anomalies based on the container resource index according to claim 7, characterized in that, in step S4.1, the loss function J(W, b) is the mean squared error (MSE), which measures the degree of difference between the input x and the output y;

the 3σ criterion method, namely setting the anomaly threshold to the sum of the mean and three times the standard deviation of the model's historical training loss, and judging a sample abnormal if its loss is larger than the threshold and normal otherwise;

the manual specification method, namely having operations staff specify the anomaly threshold manually;

the LOF method, namely further detecting outliers among the MSE values with the local outlier factor algorithm;
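The 3σ criterion can be illustrated with the following sketch, which sets the threshold to the mean plus three standard deviations of the historical training losses and flags a sample whose loss exceeds it. Using the population standard deviation (`statistics.pstdev`) rather than the sample standard deviation, and the function names, are assumptions.

```python
import statistics


def three_sigma_threshold(training_losses):
    """Mean plus three standard deviations of the historical losses."""
    mean = statistics.mean(training_losses)
    std = statistics.pstdev(training_losses)   # population std, assumed
    return mean + 3.0 * std


def is_abnormal(sample_loss, threshold):
    """A sample is abnormal when its loss is larger than the threshold."""
    return sample_loss > threshold


# Made-up historical losses with mean 5 and population std 2.
history = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
threshold = three_sigma_threshold(history)
```

For this made-up history the threshold is 5 + 3 × 2 = 11, so a sample with loss 12 would be judged abnormal and one with loss 10 normal.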

9. The method for detecting container anomaly based on container resource index according to claim 7, wherein in step S4.2, after passing anomaly detection, a new sample point is offered, in order of priority, first to the core micro-cluster nearest to it in Euclidean distance and then to the outlier micro-cluster nearest to it in Euclidean distance in the sample space, and if neither micro-cluster can receive it, a new outlier micro-cluster is created with the sample as its center;

a sample point enters a micro-cluster through the following process: a target micro-cluster mc attempts to receive a sample p, and a new micro-cluster mc′ is generated from the coordinates of p and the original indexes of mc; if the radius $r_{mc'}$ of mc′ satisfies $r_{mc'} < \varepsilon$, sample p can be received by mc; once it is confirmed that p can be received, mc is deleted from the micro-cluster list and mc′ is added in its place; after being received by a micro-cluster, the sample enters the normal sample queue so that it can take part in the initialization and updating of the automatic encoder model in step S4.4;

10. The method for detecting container anomalies based on container resource indicators as recited in claim 7, wherein in step S4.3, at every interval of $T_p$, one weight check of the micro-clusters is performed;