CN113190406B

CN113190406B - IT entity group anomaly detection method under cloud native observability

Info

Publication number: CN113190406B
Application number: CN202110478056.8A
Authority: CN
Inventors: 宋祥雨
Original assignee: Shanghai Eisoo Information Technology Co Ltd
Current assignee: Shanghai Eisoo Information Technology Co Ltd
Priority date: 2021-04-30
Filing date: 2021-04-30
Publication date: 2023-02-03
Anticipated expiration: 2041-04-30
Also published as: CN113190406A

Abstract

The invention relates to a method for detecting anomalies in an IT entity group under cloud native observability, comprising the following steps: 1) acquiring historical time series data of the IT entity group for the same index and time period; 2) judging from the historical time series data whether the IT entity group is suitable for group anomaly detection, and if so, executing step 3), otherwise ending; 3) compressing the historical time series data and performing a backward difference calculation to obtain a backward difference matrix; 4) calculating the distance between each IT entity in the group and the other IT entities according to the backward difference matrix; 5) identifying abnormal IT entities with the LOF algorithm according to the distances calculated in step 4); 6) calculating the abnormal points generated by each IT entity and their severity, taking the normal IT entities as a reference. Compared with the prior art, the method can detect anomalies in the index data of multiple IT entities simultaneously and has high calculation efficiency.

Description

IT entity group anomaly detection method under cloud native observability

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to a method for detecting IT entity group anomalies under cloud native observability.

Background

Cloud-native microservice architecture is a current technical trend. Under this architecture, a large number of applications are deployed as distributed clusters, and the nodes, applications, services and other IT entities in a cluster are generally configured identically in some respects and are therefore homogeneous. Within a cluster, IT entities such as nodes, applications or services that share the same configuration or attributes form a group. Traditional anomaly detection methods generally use similarity measurement models, probability statistics models, regression models and similar techniques to detect anomalies in the historical time series data of a single IT entity under a given index. In the index data of IT entities, however, some entities are homogeneous under certain indexes: they show similar behaviors or patterns, and their trends tend to be consistent. If several IT entities are homogeneous under some index, and within a certain time period the trend of the index data of one IT entity differs greatly from the trend of the index data of the others, that IT entity may be abnormal. Detecting multiple IT entities one by one with a traditional anomaly detection method is computationally inefficient.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing a method for detecting IT entity group anomalies under cloud native observability, which can simultaneously detect anomalies in the index data of multiple homogeneous IT entities and has high calculation efficiency.

The purpose of the invention can be achieved by the following technical solution:

A method for detecting IT entity group anomalies under cloud native observability comprises the following steps:

1) Acquiring historical time series data of the IT entity group for the same index and time period;

2) Judging from the historical time series data whether the IT entity group is suitable for group anomaly detection; if so, executing step 3), and if not, ending;

3) Compressing the historical time series data and performing a backward difference calculation to obtain a backward difference matrix;

4) Calculating the distance between each IT entity in the IT entity group and the other IT entities according to the backward difference matrix;

5) Identifying abnormal IT entities with the LOF algorithm according to the distances calculated in step 4);

6) Calculating the abnormal points generated by each IT entity and their severity, taking the normal IT entities as a reference.

Further, step 2) comprises:

judging whether the following conditions are met simultaneously:

the sample size of the index data of each IT entity is not less than the set sample size;

the number of IT entities in the IT entity group is not less than a preset value, and the preset value is not less than 3;

if so, the IT entity group is judged to be suitable for group anomaly detection, and otherwise it is judged to be unsuitable.

Further, the historical time series data is compressed through the PAA algorithm, which comprises:

dividing the index values (timestamps) of the index data in the historical time series data into n equal segments, taking the average of the non-null values in each segment as the new data, and taking the initial index value of each segment as the index value of the new data, so that the length of the index data is compressed to n;

data compression solves the following two problems:

when the number of index samples of each IT entity is excessive, compression reduces the sample size while retaining the characteristics of the data as far as possible, which improves the efficiency of the algorithm;

when the times corresponding to the index data of the IT entities deviate slightly from one another, for example because of machine computation, the compressed sample size can be controlled so that the sample size stays unchanged while the times corresponding to the index data of each IT entity become consistent.

Further, the distance of each IT entity from other IT entities is calculated by the FastDTW step.

Further, step 5) comprises:

calculating the local outlier factor LOF of the IT entity and judging whether the LOF is larger than a set local outlier factor threshold; if so, the IT entity is judged to be an abnormal IT entity, and otherwise it is judged to be a normal IT entity.

Further, regarding each IT entity as a sample point, the calculation formula of the local outlier factor LOF is:

$$\mathrm{LOF}_k(O) = \frac{1}{|N_k(O)|} \sum_{P' \in N_k(O)} \frac{\rho_k(P')}{\rho_k(O)}$$

wherein $\rho_k(O)$ is the local reachable density of point O, $\rho_k(P')$ is the local reachable density of the other points in $N_k(O)$, and $N_k(O)$ is the k-th distance neighborhood of point O.

Further, $N_k(O)$ satisfies:

$$N_k(O) = \{P' \in D \setminus \{O\} \mid d(O, P') \le d_k(O)\}$$

wherein $d_k(O)$ is the k-th distance of point O, $d_k(O) = d(O, P)$, P being the k-th closest point to point O and satisfying the following conditions:

there are at least k points $P' \in D \setminus \{O\}$ in the set such that $d(O, P') \le d(O, P)$;

there are at most k-1 points $P' \in D \setminus \{O\}$ in the set such that $d(O, P') < d(O, P)$;

the calculation formula of $\rho_k(O)$ is:

$$\rho_k(O) = \frac{|N_k(O)|}{\sum_{P' \in N_k(O)} d_k(O, P')}$$

wherein $d_k(O, P')$ is the k-th reachable distance from point P' to point O, calculated as:

$$d_k(O, P') = \max\{d_k(O), d(O, P')\}.$$

Further, step 6) comprises:

61) According to the backward difference matrix, taking the normal IT entities as reference entities, and calculating, at each time point, the number of standard deviations bias by which the index data sample point of each IT entity differs from the reference entities;

62) According to bias, determining the severity of the index data sample point through a preset severity rule.

Further, the number of standard deviations bias is calculated as follows:

when the standard deviation is not null and is not 0,

when the standard deviation is not null and is 0, and the mean is not 0,

when the standard deviation is not null and is 0, and the mean is 0,

wherein mean is the non-null mean of the index data of the reference entity, σ is the non-null standard deviation of the index data of the reference entity, and default_mean is a set default mean of the reference entity index data.

Further, the severity is divided into unknown, normal and abnormal, and the severity rule includes:

621) Judging whether the original value of the index data sample point of the IT entity is -1 or null; if so, marking the index data sample point as an unknown sample point, and otherwise executing step 622);

622) Judging whether bias is not larger than the set quantity; if so, marking the index data sample point as a normal sample point, and otherwise marking it as an abnormal sample point.

Compared with the prior art, the invention has the following beneficial effects:

(1) The invention acquires historical time series data of a group of IT entities for a given index over the same time period and judges from preset conditions whether the group is suitable for group anomaly detection. It then compresses the index data of the group, performs a backward difference calculation on the compressed index data of each IT entity, and merges the results into a backward difference matrix. From this matrix it calculates the distance between each IT entity and the other IT entities for that index, identifies abnormal IT entities from these distances, and detects the times at which the IT entities become abnormal and the severity of the anomalies. The invention can therefore detect anomalies in the index data of multiple IT entities simultaneously, identify whether the IT entities are homogeneous and which of them are abnormal, and greatly improve calculation efficiency;

(2) The invention compresses the historical time series data through the PAA algorithm. Compression reduces the sample size while retaining the characteristics of the data as far as possible, which improves the efficiency of the algorithm; the compressed sample size can also be controlled so that the sample size stays unchanged while the times corresponding to the index data of each IT entity become consistent, which makes group anomaly detection possible;

(3) The invention calculates the distance between each IT entity and the other IT entities from the backward difference matrix through the FastDTW algorithm, which improves the efficiency of the algorithm;

(4) The invention calculates the local outlier factor and judges from its value whether an IT entity is abnormal, with high algorithm efficiency;

(5) The invention obtains the severity of each index sample point of each IT entity from the severity rule and from the number of standard deviations by which the sample point deviates from the index data of the reference entities, so it can detect at which times an IT entity is abnormal and how severe the anomaly is, with high accuracy.

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a schematic diagram of two time series.

Detailed Description

The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.

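A method for detecting an IT entity group anomaly under cloud native observability, as shown in FIG. 1, includes:

1) Acquiring historical time series data of the IT entity group for the same index and time period;

2) Judging from the historical time series data whether the IT entity group is suitable for group anomaly detection; if so, executing step 3), and if not, ending;

3) Compressing the historical time series data and performing a backward difference calculation to obtain a backward difference matrix;

4) Calculating the distance between each IT entity and the other IT entities through the FastDTW algorithm according to the backward difference matrix;

5) Identifying abnormal IT entities with the LOF algorithm according to the distances calculated in step 4);

6) Calculating the abnormal points generated by each IT entity and their severity, taking the normal IT entities as a reference.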

In step 1), a group of IT entities first needs to be determined. In this embodiment, multiple es servers are selected as the IT entities, and their CPU utilization over the same time period is acquired as the index data. To ensure accuracy, the time intervals of the index data of the IT entities are also kept consistent. The acquired historical time series data are shown in table 1:

TABLE 1 historical time series data Table

Timestamp	0	1	2	3	4	5	6
1604419200000	0.134	0.134	0.136	0.132	0.198	0.068	0.066
1604419500000	0.134	0.134	0.068	0.068	0.134	0.134	0.134
1604419800000	0.066	0.066	0.134	0.134	0.132	0.136	0.136
1604420100000	0.132	0.132	0.134	0.134	0.132	0.134	0.066
1604420400000	0.134	0.068	0.132	0.132	0.134	0.198	0.134
1604420700000	0.066	0.132	0.066	0.066	0.134	0.134	0.132
1604421000000	0.136	0.134	0.134	0.134	0.134	0.134	0.132
1604421300000	0.134	0.066	0.068	0.132	0.068	0.068	0.134
1604421600000	0.134	0.134	0.132	0.134	0.134	0.134	0.068
1604421900000	0.066	0.136	0.066	0.132	0.134	0.134	0.198

The first column represents a timestamp, the first row represents the number of the es server, and the data in the rest cells represent the CPU utilization rate of the corresponding es server at the corresponding time.

Step 2) comprises:

judging whether the following conditions are met simultaneously:

the sample size of the index data of each IT entity is not less than the set sample size;

the number of IT entities in the IT entity group is not less than a preset value, and the preset value is not less than 3;

if so, the IT entity group is judged to be suitable for group anomaly detection, and otherwise it is judged to be unsuitable.

According to the 80% rule, when the non-missing part of a variable is lower than 80% of the total sample size, it is recommended to delete the variable. The set sample size is therefore 80% of the expected sample size of a single IT entity: when the sample size of the index data of an IT entity is lower than 80% of the expected sample size, the group of IT entities is not suitable for group anomaly detection. For example, when the expected sample size of a single es server is 288 and the sample size of a certain es server is 200, then 200 < 288 × 80% = 230.4, so the group of es servers is not suitable for group anomaly detection;

since group anomaly detection requires three or more IT entities for comparison, a preset value can be set for the number of IT entities, and the number of IT entities in the group must be greater than or equal to this preset value. The required number of IT entities can be modified according to specific requirements. For example, the number of IT entities in table 1 is 7: if the preset value is 3, the example in table 1 is suitable for group anomaly detection; if the preset value is 9, the number of IT entities is less than 9 and the example in table 1 is not suitable for group anomaly detection.
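The two applicability conditions above can be sketched as a small check. This is a minimal illustration, not the patented implementation: the function name `is_group_applicable`, the dictionary layout and the parameter names are assumptions, and counting only non-null values as the sample size is an interpretation of the 80% rule.

```python
def is_group_applicable(series_by_entity, expected_samples, min_entities=3, min_ratio=0.8):
    """Return True when the group passes both applicability conditions.

    series_by_entity: dict mapping an entity id to its list of index values
    (None marks a missing sample). All names here are hypothetical.
    """
    if min_entities < 3:
        raise ValueError("the preset value must not be less than 3")
    # Condition on group size: enough entities to compare against each other.
    if len(series_by_entity) < min_entities:
        return False
    # Condition on sample size: each entity keeps at least 80% of its expected samples.
    required = expected_samples * min_ratio
    for values in series_by_entity.values():
        non_null = sum(1 for v in values if v is not None)
        if non_null < required:
            return False
    return True


# Seven entities with 10 samples each, as in table 1.
group = {i: [0.1] * 10 for i in range(7)}
print(is_group_applicable(group, expected_samples=10, min_entities=3))
print(is_group_applicable(group, expected_samples=10, min_entities=9))
```

With a preset value of 3 the group of seven passes, and with a preset value of 9 it does not, which mirrors the two cases discussed for table 1.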

The historical time series data is compressed through the PAA algorithm, which comprises:

dividing the index values (timestamps) of the index data in the historical time series data into equal segments, so that index data of length m is divided into n segments, taking the average of the non-null values in each segment as the new data, and taking the initial index value of each segment as the index value of the new data, so that the length of the index data is compressed to n;

data compression solves the following two problems:

when the number of index samples of each IT entity is excessive, compression reduces the sample size while retaining the characteristics of the data as far as possible, which improves the efficiency of the algorithm;

when the times corresponding to the index data of the IT entities deviate slightly from one another, for example because of machine computation, the compressed sample size can be controlled so that the sample size stays unchanged while the times corresponding to the index data of each IT entity become consistent.

For example, in the sample data in table 1 the sample size of the index data of each es server is 10. The index of the index data, that is, the first column of the table, is divided into 5 segments at equal intervals of its numerical values, and the average of the non-null values of each index data series within each segment is calculated, which compresses the sample size of each es server's index data to 5. The results are shown in table 2:

TABLE 2 History time series data table after data compression

Timestamp	0	1	2	3	4	5	6
1604419200000	0.134	0.134	0.102	0.1	0.166	0.101	0.1
1604419800000	0.099	0.099	0.134	0.134	0.132	0.135	0.101
1604420400000	0.1	0.1	0.099	0.099	0.134	0.166	0.133
1604421000000	0.135	0.1	0.101	0.133	0.101	0.101	0.133
1604421600000	0.1	0.135	0.099	0.133	0.134	0.134	0.133
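The compression from table 1 to table 2 can be reproduced with a short sketch. The function name `paa` and its signature are assumptions for illustration; the sketch assumes timestamps sorted in ascending order with a non-zero range, and it takes the first timestamp seen in each segment as the new index value, which matches the index column of table 2.

```python
def paa(timestamps, values, n):
    """Compress one series to n points by averaging equal-width timestamp segments."""
    start, end = min(timestamps), max(timestamps)
    width = (end - start) / n
    sums, counts, first_ts = [0.0] * n, [0] * n, [None] * n
    for t, v in zip(timestamps, values):
        # The last timestamp sits on the right edge, so clamp it into the final segment.
        seg = min(int((t - start) / width), n - 1)
        if first_ts[seg] is None:
            first_ts[seg] = t
        # Only non-null values contribute to the segment average.
        if v is not None:
            sums[seg] += v
            counts[seg] += 1
    index = [first_ts[i] if first_ts[i] is not None else start + i * width for i in range(n)]
    data = [sums[i] / counts[i] if counts[i] else None for i in range(n)]
    return index, data


# CPU utilization of es server 0 from table 1 (5-minute spacing in milliseconds).
ts = [1604419200000 + 300000 * i for i in range(10)]
server0 = [0.134, 0.134, 0.066, 0.132, 0.134, 0.066, 0.136, 0.134, 0.134, 0.066]
index, data = paa(ts, server0, 5)
print(index)
print([round(v, 3) for v in data])
```

Applying the same function to every column of table 1 yields the remaining columns of table 2, up to rounding to three decimals.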

Group anomaly detection mainly checks whether the index data of the IT entities follow a consistent trend, so the backward difference is chosen to measure the change of each index data series at each moment. A backward difference is calculated separately for the compressed index data of each IT entity, and the results are combined to form the backward difference matrix.

The backward difference is defined as follows: the backward difference at the current moment is the difference between the value at the current moment and the value at the previous moment. Because the backward difference at the first moment cannot be obtained, it is filled with the next backward difference. The calculation formula is as follows:

$$\nabla x_t = x_t - x_{t-1} \quad (t \ge 2), \qquad \nabla x_1 = \nabla x_2$$

where $x_t$ is the index value at the t-th moment.

The backward difference is calculated from the sample data in table 1 to generate a backward difference matrix; the result is shown in table 3:

TABLE 3 backward difference matrix calculation results table

Timestamp	0	1	2	3	4	5	6
1604419200000	0	0	-0.068	-0.064	-0.064	0.066	0.068
1604419500000	0	0	-0.068	-0.064	-0.064	0.066	0.068
1604419800000	-0.068	-0.068	0.066	0.066	-0.002	0.002	0.002
1604420100000	0.066	0.066	0	0	0	-0.002	-0.07
1604420400000	0.002	-0.064	-0.002	-0.002	0.002	0.064	0.068
1604420700000	-0.068	0.064	-0.066	-0.066	0	-0.064	-0.002
1604421000000	0.07	0.002	0.068	0.068	0	0	0
1604421300000	-0.002	-0.068	-0.066	-0.002	-0.066	-0.066	0.002
1604421600000	0	0.068	0.064	0.002	0.066	0.066	-0.066
1604421900000	-0.068	0.002	-0.066	-0.002	0	0	0.13
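The backward difference with first-value filling can be checked against table 3 with a few lines. The function names below are illustrative assumptions, and the sketch assumes each series has at least two non-null values.

```python
def backward_difference(values):
    """Backward difference of one series; the first entry is filled with the next one."""
    diffs = [values[i] - values[i - 1] for i in range(1, len(values))]
    # The first moment has no predecessor, so reuse the following backward difference.
    return [diffs[0]] + diffs


def backward_difference_matrix(series_by_entity):
    """Apply the backward difference to every entity and keep the results side by side."""
    return {name: backward_difference(vals) for name, vals in series_by_entity.items()}


# CPU utilization of es server 6 from table 1.
server6 = [0.066, 0.134, 0.136, 0.066, 0.134, 0.132, 0.132, 0.134, 0.068, 0.198]
print([round(d, 3) for d in backward_difference(server6)])
```

Rounded to three decimals, the output corresponds to the column for server 6 in table 3, including the final jump of 0.13.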

The distance between each IT entity and the other IT entities under a given index is calculated from the backward difference matrix through the FastDTW algorithm. FastDTW is an accelerated version of the dynamic time warping (DTW) algorithm. DTW measures the similarity between two time series using dynamic programming and is mostly used to compare two speech signals: because the duration of each letter's pronunciation differs from one utterance to the next, two recordings never coincide exactly, so DTW stretches or compresses them until they are aligned as closely as possible. As shown in fig. 2, DTW calculates the similarity between two time series by extending and shortening them, and FastDTW accelerates the DTW calculation by combining two techniques, constraints and data abstraction.

The distances between the IT entities are calculated from the backward difference matrix and collected into a distance matrix between the IT entities; the result is shown in table 4:

TABLE 4 Calculation results of distances between IT entities

The first row and the first column in table 4 each represent the number of each es server, and as can be seen from table 4, the distance matrix between the IT entities is a symmetric matrix.
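To make the distance step concrete, the following sketch computes a pairwise distance matrix with plain dynamic-programming DTW. This is the exact algorithm that FastDTW approximates, not FastDTW itself; the function names and the absolute-difference point cost are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Exact DTW distance between two series, using |x - y| as the point cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of the first i points of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def distance_matrix(series):
    """Symmetric matrix of DTW distances between every pair of series."""
    n = len(series)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])
    return dist


# A stretched copy aligns perfectly, so its DTW distance is zero.
print(dtw_distance([0, 1, 2], [0, 1, 1, 2]))
```

Because each pair is computed once and mirrored, the resulting matrix is symmetric with a zero diagonal, which is the property noted for table 4.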

Regarding the IT entity as a sample point, step 5) includes:

51) Calculating the k-th distance;

$d_k(O)$ is the k-th distance of point O, $d_k(O) = d(O, P)$, where P is the k-th closest point to point O and satisfies the following conditions:

there are at least k points $P' \in D \setminus \{O\}$ in the set such that $d(O, P') \le d(O, P)$;

there are at most k-1 points $P' \in D \setminus \{O\}$ in the set such that $d(O, P') < d(O, P)$;

52) Calculating the k-th distance neighborhood;

let $N_k(O)$ be the k-th distance neighborhood of point O, satisfying:

$$N_k(O) = \{P' \in D \setminus \{O\} \mid d(O, P') \le d_k(O)\}$$

$N_k(O)$ includes all points whose distance to point O is not greater than the k-th distance of point O;

53) Calculating the k-th reachable distance;

the k-th reachable distance from point P to point O is defined as:

$$d_k(O, P) = \max\{d_k(O), d(O, P)\}$$

that is, the k-th reachable distance from point P to point O is at least the k-th distance of point O, and the reachable distances to point O of the k points nearest to point O are treated as equal, all being $d_k(O)$;

54) Calculating the local reachable density;

the local reachable density is defined as:

$$\rho_k(O) = \frac{|N_k(O)|}{\sum_{P' \in N_k(O)} d_k(O, P')}$$

$\rho_k(O)$ is the reciprocal of the average reachable distance to point O of all points in the k-th distance neighborhood of point O. The number of points in the neighborhood is counted as k even if more than one point lies on the boundary of the k-th neighborhood. If point O and the surrounding neighborhood points are in the same cluster, the reachable distances are more likely to take the smaller value $d_k(O)$, so the sum of the reachable distances is smaller and the local reachable density is larger; if point O is far from the surrounding neighborhood points, the reachable distances may take the larger value $d(O, P')$, so the sum of the reachable distances is larger and the local reachable density is smaller;

55) Calculating the local outlier factor LOF of the IT entity;

the local outlier factor LOF is calculated as:

$$\mathrm{LOF}_k(O) = \frac{1}{|N_k(O)|} \sum_{P' \in N_k(O)} \frac{\rho_k(P')}{\rho_k(O)}$$

wherein $\mathrm{LOF}_k(O)$ is the mean of the ratios of the local reachable density of the other points in the k-th distance neighborhood $N_k(O)$ of point O to the local reachable density of point O. The closer $\mathrm{LOF}_k(O)$ is to 1, the closer the density of point O is to the density of its neighborhood points, and point O may belong to the same cluster as its neighborhood. If $\mathrm{LOF}_k(O)$ is less than 1, the density of point O is higher than that of its neighborhood points and point O is a dense point. If $\mathrm{LOF}_k(O)$ is greater than 1, the density of point O is lower than that of its neighborhood points and point O may be an outlier.

It is then judged whether the LOF is larger than a set local outlier factor threshold. If so, the IT entity is judged to be an abnormal IT entity and marked as -1; otherwise, it is judged to be a normal IT entity and marked as 1. In this embodiment, based on the distance matrix between the IT entities in table 4, the local outlier factor threshold is set to 1.2 and the judgment result contains both labels, 1 and -1: the trend of the CPU utilization of es server number 6 differs from that of the remaining es servers, so it is judged to be an abnormal IT entity.
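Steps 51) to 55) and the threshold judgment can be sketched from a precomputed distance matrix. The sketch follows the definitions as written in this text, with the reachable distance taken as max{d_k(O), d(O, P')}; the classical LOF formulation uses the neighbor's k-th distance in that maximum instead. The function names and the toy distance matrix are hypothetical, and the sketch assumes all k-th distances are positive.

```python
def lof_scores(dist, k):
    """LOF of every point from a symmetric distance matrix, per the text's definitions."""
    n = len(dist)
    k_dist, neigh = [], []
    for o in range(n):
        others = sorted(dist[o][p] for p in range(n) if p != o)
        dk = others[k - 1]  # k-th distance of point o
        k_dist.append(dk)
        # k-th distance neighborhood: every other point within the k-th distance.
        neigh.append([p for p in range(n) if p != o and dist[o][p] <= dk])
    density = []
    for o in range(n):
        # Reachable distance max{d_k(O), d(O, P')} summed over the neighborhood.
        total = sum(max(k_dist[o], dist[o][p]) for p in neigh[o])
        density.append(len(neigh[o]) / total)
    # LOF is the mean ratio of the neighbors' densities to the point's own density.
    return [sum(density[p] / density[o] for p in neigh[o]) / len(neigh[o]) for o in range(n)]


def label_entities(scores, threshold=1.2):
    """Mark an entity -1 (abnormal) when its LOF exceeds the threshold, else 1."""
    return [-1 if s > threshold else 1 for s in scores]


# Toy matrix (not table 4): six mutually close entities and one distant entity.
dist = [[0.0 if i == j else (10.0 if 6 in (i, j) else 1.0) for j in range(7)] for i in range(7)]
print(label_entities(lof_scores(dist, k=2)))
```

In the toy matrix the six close entities have an LOF of about 1, while the distant one scores far above the 1.2 threshold and is the only entity labeled -1.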

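Step 6) comprises:

61) According to the backward difference matrix, taking the normal IT entities as reference entities, and calculating, at each time point, the number of standard deviations bias by which the index data sample point of each IT entity differs from the reference entities;

62) According to bias, determining the severity of the index data sample point through a preset severity rule.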

The number of standard deviations bias is calculated as follows:

when the standard deviation is not null and is not 0,

$$bias = \left| \frac{x - mean}{\sigma} \right|$$

when the standard deviation is not null and is 0, and the mean is not 0,

when the standard deviation is not null and is 0, and the mean is 0,

wherein x is the value of the index data sample point in the backward difference matrix, mean is the non-null mean of the index data of the reference entity, σ is the non-null standard deviation of the index data of the reference entity, and default_mean is a set default mean of the reference entity index data.

In this embodiment, taking the backward difference matrix in table 3 as an example, default_mean is set to 0.5. At timestamp 1604421900000, the non-null mean of the index data of the reference entities is -0.02233 and the non-null standard deviation is 0.03161. The backward difference of the CPU utilization of es server number 0 is -0.068, which is |(-0.068 - (-0.02233)) / 0.03161| ≈ 1.444 standard deviations from the reference; the backward difference of the CPU utilization of es server number 6 is 0.13, which is |(0.13 - (-0.02233)) / 0.03161| ≈ 4.819 standard deviations from the reference. In this way, the number of standard deviations between each index sample point of each IT entity and the reference entity index data can be obtained from the backward difference matrix in table 3 together with the non-null mean and non-null standard deviation of the reference entity index data, as shown in table 5:

TABLE 5 Table of results of calculation of standard deviation

Timestamp	0	1	2	3	4	5	6
1604419200000	0.443	0.443	0.947	0.865	0.865	1.792	1.833
1604419500000	0.443	0.443	0.947	0.865	0.865	1.792	1.833
1604419800000	1.231	1.231	1.218	1.218	0.024	0.049	0.049
1604420100000	1.414	1.414	0.691	0.691	0.691	0.755	2.923
1604420400000	0.054	1.730	0.054	0.054	0.054	1.730	1.839
1604420700000	0.697	1.956	0.656	0.656	0.670	0.616	0.630
1604421000000	1.039	0.960	0.980	0.980	1.019	1.019	1.019
1604421300000	1.414	0.756	0.690	1.414	0.690	0.690	1.545
1604421600000	1.446	0.772	0.641	1.380	0.706	0.706	3.598
1604421900000	1.445	0.770	1.381	0.643	0.707	0.707	4.819

The severity is divided into unknown, normal and abnormal, denoted 0, 2 and 3 respectively, and the severity rule includes:

621) Judging whether the original value of the index data sample point of the IT entity is -1 or null; if so, marking the index data sample point as an unknown sample point, and otherwise executing step 622);

622) Judging whether bias is not larger than the set quantity; if so, marking the index data sample point as a normal sample point, and otherwise marking it as an abnormal sample point.

In this embodiment, the set quantity is 3. From the number of standard deviations between each index sample point of each IT entity and the reference entity index data in table 5, together with the severity rule, the severity of each index sample point of each IT entity can be obtained, as shown in table 6:

TABLE 6 severity result table of IT entity index sample points

Timestamp	0	1	2	3	4	5	6
1604419200000	2	2	2	2	2	2	2
1604419500000	2	2	2	2	2	2	2
1604419800000	2	2	2	2	2	2	2
1604420100000	2	2	2	2	2	2	2
1604420400000	2	2	2	2	2	2	2
1604420700000	2	2	2	2	2	2	2
1604421000000	2	2	2	2	2	2	2
1604421300000	2	2	2	2	2	2	2
1604421600000	2	2	2	2	2	2	3
1604421900000	2	2	2	2	2	2	3
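The severity rule that maps table 5 to table 6 is small enough to state directly in code. The constant and function names are illustrative assumptions; the codes 0, 2 and 3 and the set quantity of 3 come from this embodiment.

```python
UNKNOWN, NORMAL, ABNORMAL = 0, 2, 3


def severity(original_value, bias, set_quantity=3):
    """Apply steps 621) and 622) to one index data sample point."""
    # 621) An original value of -1 or null means the sample point cannot be judged.
    if original_value is None or original_value == -1:
        return UNKNOWN
    # 622) A bias not larger than the set quantity is normal, anything above is abnormal.
    return NORMAL if bias <= set_quantity else ABNORMAL


# Server 6 in the last two rows of table 5, and server 0 in the last row.
print(severity(0.068, 3.598), severity(0.198, 4.819), severity(0.066, 1.445))
```

The bias of 2.923 for server 6 at timestamp 1604420100000 stays below the set quantity, which is why only its last two sample points are marked 3 in table 6.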

To further evaluate the effect of group anomaly detection, this embodiment collects CPU utilization data from 12 servers of a company on the same day at 5-minute intervals, and measures the effect of the group anomaly detection method with a binary classification confusion matrix through accuracy, precision, recall and the F1 value;

the binary classification confusion matrix is shown in table 7:

TABLE 7 Binary classification confusion matrix

	Predicted as Positive	Predicted as Negative
Labeled as Positive	True Positive (TP)	False Negative (FN)
Labeled as Negative	False Positive (FP)	True Negative (TN)

As in table 7, the results can be divided into:

True Positive (TP): the true category is positive and the predicted category is positive;

False Negative (FN): the true category is positive and the predicted category is negative;

False Positive (FP): the true category is negative and the predicted category is positive;

True Negative (TN): the true category is negative and the predicted category is negative.

Accuracy represents the ratio of the number of correctly predicted samples to the total number of predicted samples, and is calculated as:

$$\mathrm{accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$$

Precision represents how many of the samples judged to be positive are truly positive, and is calculated as:

$$\mathrm{precision} = \frac{TP}{TP + FP}$$

Recall represents how many of the truly positive samples are judged to be positive by the model, and is calculated as:

$$\mathrm{recall} = \frac{TP}{TP + FN}$$

F-Measure, also called F-Score, is a weighted harmonic mean of precision and recall. This embodiment uses F1, which is calculated as:

$$F1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
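The four evaluation measures follow directly from the confusion matrix counts, as the sketch below shows. The function name is an assumption, and the counts in the usage line are made-up illustrative numbers, not the evaluation results of this embodiment.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall with equal weights.
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
print(classification_metrics(tp=8, fp=2, fn=2, tn=88))
```

The sketch assumes at least one predicted positive and one labeled positive; with no positives the precision or recall denominator would be zero and would need separate handling.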

The evaluation results of the method for detecting an abnormality of an IT entity group provided in this embodiment are shown in table 8:

TABLE 8 evaluation result table of IT entity group anomaly detection method

In order to verify the efficiency of the group anomaly detection method in the large data volume scenario, this embodiment selects 10 similar IT entities in the same time period, tests the efficiency of group anomaly detection as the data volume increases, and the result is shown in table 9:

TABLE 9 Algorithm efficiency test result table

Data volume	Time (s)
60*10	1.132
120*10	2.283
180*10	3.672
240*10	4.593
300*10	5.887
360*10	7.138
420*10	8.166
480*10	9.441
540*10	10.635
600*10	11.739

As shown in table 9, the running time of the group anomaly detection method grows gradually as the data volume increases. The method includes a data compression step, which reduces the data volume while retaining the characteristics of the data as far as possible and so improves the efficiency of the algorithm. When group anomaly detection is performed on a large data volume, the original data of the IT entities can therefore be compressed appropriately according to the running times in table 9. For example, when the compressed data volume is kept at 240 × 10, group anomaly detection can be completed within 5 s.

This embodiment provides a method for detecting IT entity group anomalies under cloud native observability. The method can detect anomalies in the index data of multiple IT entities simultaneously, identify whether the IT entities are homogeneous and which of them are abnormal, and detect the times at which the IT entities become abnormal. It compresses data through the PAA (Piecewise Aggregate Approximation) algorithm and calculates the distance between each IT entity and the other IT entities through the FastDTW algorithm, which improves the efficiency of group anomaly detection. It identifies abnormal IT entities through the LOF (Local Outlier Factor) algorithm and obtains the severity of each index sample point of each IT entity from the severity rule and from the number of standard deviations by which the sample point deviates from the index data of the reference entities, which improves the accuracy of group anomaly detection.

The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions that can be obtained by a person skilled in the art through logical analysis, reasoning or limited experiments based on the prior art according to the concepts of the present invention should be within the scope of protection determined by the claims.

Claims

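1. A method for detecting an IT entity group anomaly under cloud native observability, wherein the IT entity group comprises a plurality of IT entities with the same configuration or attributes, the IT entities comprise nodes, applications or services in a cluster, and the method comprises the following steps:

1) Acquiring historical time series data of the IT entity group for the same index and time period;

2) Judging from the historical time series data whether the IT entity group is suitable for group anomaly detection; if so, executing step 3), and if not, ending;

3) Compressing the historical time series data and performing a backward difference calculation to obtain a backward difference matrix;

4) Calculating the distance between each IT entity in the IT entity group and the other IT entities according to the backward difference matrix;

5) Identifying abnormal IT entities with the LOF algorithm according to the distances calculated in step 4);

6) Calculating the abnormal points generated by each IT entity and their severity, taking the normal IT entities as a reference;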

step 2) comprises:

judging whether the following conditions are met simultaneously:

the sample size of the index data of each IT entity is not less than the set sample size;

the number of IT entities in the IT entity group is not less than a preset value;

if so, the IT entity group is judged to be suitable for group anomaly detection, and otherwise it is judged to be unsuitable;

step 5) comprises:

calculating the local outlier factor LOF of the IT entity and judging whether the LOF is larger than a set local outlier factor threshold; if so, the IT entity is judged to be an abnormal IT entity, and otherwise it is judged to be a normal IT entity;

step 6) comprises:

61) According to the backward difference matrix, taking the normal IT entities as reference entities, and calculating, at each time point, the number of standard deviations bias by which the index data sample point of each IT entity differs from the reference entities;

62) According to bias, determining the severity of the index data sample point through a preset severity rule;

the severity is divided into unknown, normal and abnormal, and the severity rule comprises:

621) Judging whether the original value of the index data sample point of the IT entity is -1 or null; if so, marking the index data sample point as an unknown sample point, and otherwise executing step 622);

622) Judging whether bias is not larger than the set quantity; if so, marking the index data sample point as a normal sample point, and otherwise marking it as an abnormal sample point.

2. The method of claim 1, wherein the historical time series data is compressed through the PAA algorithm, which comprises:

dividing the index values of the index data in the historical time series data into n equal segments, taking the average of the non-null values in each segment as the new data, and taking the initial index value of each segment as the index value of the new data, so that the length of the index data is compressed to n.

3. The method of claim 1, wherein the distance between each IT entity and other IT entities is calculated through a FastDTW step.

4. The method of claim 1, wherein each IT entity is regarded as a sample point, and the local outlier factor LOF is calculated by the following formula:

$$\mathrm{LOF}_k(O) = \frac{1}{|N_k(O)|} \sum_{P' \in N_k(O)} \frac{\rho_k(P')}{\rho_k(O)}$$

wherein $\rho_k(O)$ is the local reachable density of point O, $\rho_k(P')$ is the local reachable density of the other points in $N_k(O)$, and $N_k(O)$ is the k-th distance neighborhood of point O.

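5. The method of claim 4, wherein $N_k(O)$ satisfies:

$$N_k(O) = \{P' \in D \setminus \{O\} \mid d(O, P') \le d_k(O)\}$$

wherein $d_k(O)$ is the k-th distance of point O, $d_k(O) = d(O, P)$, P is the k-th closest point to point O, and the following conditions are satisfied:

there are at least k points $P' \in D \setminus \{O\}$ in the set such that $d(O, P') \le d(O, P)$;

there are at most k-1 points $P' \in D \setminus \{O\}$ in the set such that $d(O, P') < d(O, P)$;

the calculation formula of $\rho_k(O)$ is:

$$\rho_k(O) = \frac{|N_k(O)|}{\sum_{P' \in N_k(O)} d_k(O, P')}$$

wherein $d_k(O, P')$ is the k-th reachable distance from point P' to point O, calculated as:

$$d_k(O, P') = \max\{d_k(O), d(O, P')\}.$$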

6. The method according to claim 1, wherein the number of standard deviations bias is calculated as follows:

when the standard deviation is not null and is not 0,

when the standard deviation is not null and is 0, and the mean is not 0,

when the standard deviation is not null and is 0, and the mean is 0,