CN110995508B - KPI mutation-based adaptive unsupervised online network anomaly detection method - Google Patents

Info

Publication number
CN110995508B
Authority
CN
China
Prior art keywords
data
kpi
cluster
real
dist
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911334135.0A
Other languages
Chinese (zh)
Other versions
CN110995508A (en)
Inventor
蔡志平
余广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN201911334135.0A
Publication of CN110995508A
Application granted
Publication of CN110995508B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0631: Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823: Errors, e.g. transmission errors
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00: Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08: Feature extraction

Abstract

The invention discloses a KPI mutation-based adaptive unsupervised online network anomaly detection method comprising the following steps: 1. collecting KPI historical data in a network and preprocessing it; 2. extracting high-order differences of the preprocessed historical data and combining them into two-dimensional difference combination features, which together form a two-dimensional difference combination feature set; 3. clustering the feature data in this feature set with an adaptive density clustering method to obtain a cluster set of normal data; 4. acquiring KPI real-time data online; 5. extracting the two-dimensional difference combination feature of the real-time data and checking it against the normal-data cluster set from step 3 to judge whether the feature belongs to some cluster of normal data; if it does not, the real-time data is judged abnormal and the network is judged to be in an abnormal state. Having analyzed the correlation between KPI mutations and network anomalies, the method discovers network anomalies indirectly by detecting KPI mutations, and experiments show that it achieves good performance.

Description

KPI mutation-based self-adaptive unsupervised online network anomaly detection method
Technical Field
The invention belongs to the field of network anomaly detection, and particularly relates to a KPI mutation-based self-adaptive unsupervised online network anomaly detection method.
Background
To ensure reliable and stable internet service, internet companies need to monitor various KPIs and detect anomalies in real time. A KPI is a time series, typically a sequence of values generated by continuous monitoring over a period of time, such as service delay, page views, or the number of online users. An anomaly in a KPI generally refers to a segment of the time series that exhibits a distinct characteristic or unpredictable pattern, such as a sudden increase or decrease, or jitter. When a KPI shows an anomaly, it often means that a related service has failed, for example through a network outage, a configuration error, a server overload, or an external attack. KPI anomalies are often difficult to describe with predefined knowledge or rules, because anomalies tend to be context dependent. When labeling actual anomalies, a domain expert derives a normal pattern by visually scanning the KPI and regards data that do not fit the pattern as anomalies. This is consistent with the general definition of an anomaly: anomalies are observations that deviate from the majority of the observed data. We therefore consider that the key to detecting KPI anomalies without labels is to learn the distribution of normal data. In practice, detecting anomalies in various KPIs is very difficult because of the ambiguity and scarcity of anomalies, the lack of data labels, and the diversity of KPIs.
Existing KPI anomaly detection methods can be divided into three categories: traditional statistical methods, supervised ensemble methods, and unsupervised learning methods. Traditional methods use statistical techniques and build equations for KPIs to achieve prediction-based or deviation-based detection; their major drawback is that a suitable detection model must be selected and its parameters fine-tuned for each KPI. Supervised ensemble methods use multiple traditional detection models to extract features and train a supervised machine-learning classifier; their disadvantage is a heavy dependence on labels, which requires manual labeling of the data in each KPI. More recently, unsupervised learning methods have offered better solutions for KPI anomaly detection. Their core idea is to model the feature distribution of normal data and to judge an anomaly according to whether online data conforms to that distribution. However, these methods often perform poorly in practice. In addition, some of their parameters have an important influence on performance, and these parameters are difficult to estimate in an unsupervised setting, so the methods often require label-assisted fine-tuning of complex parameters.
Disclosure of Invention
The invention aims to solve the technical problem of how to quickly and accurately find KPI abnormal data so as to analyze network faults, and provides a KPI mutation-based self-adaptive unsupervised online network abnormality detection method.
In order to solve the problem, the technical scheme is as follows:
A KPI mutation-based self-adaptive unsupervised online network anomaly detection method comprises the following steps:
step 1: collecting KPI historical data in a network, and preprocessing the historical data;
step 2: extracting high-order differences of the preprocessed historical data and combining them into two-dimensional difference combination features, which form a two-dimensional difference combination feature set;
step 3: performing clustering analysis on the feature data in the two-dimensional difference combination feature set by using a self-adaptive density clustering method to obtain a cluster set of normal data;
step 4: acquiring KPI real-time data online, and preprocessing the real-time data;
step 5: extracting the two-dimensional difference combination feature of the real-time data and checking it against the normal data cluster set from step 3 to judge whether the feature belongs to a cluster of normal data; if not, judging that the real-time data is abnormal and, further, that the network is in an abnormal state.
Further, preprocessing the data includes filling missing values with the mean of neighboring values and then normalizing the data.
Further, the two-dimensional difference combination feature extraction method in step 2 is as follows: discretizing the preprocessed historical data, and extracting the first-order difference $x_t - x_{t-1}$ and the second-order difference $x_t - x_{t-2}$ of the discretized historical data to form the two-dimensional combined feature $(x_t - x_{t-1},\ x_t - x_{t-2})$.
Further, the adaptive density clustering method in step 3 is a method for setting the two parameters of the density-based clustering method DBSCAN, namely the neighborhood radius Eps and the number of data points MinPts within the neighborhood radius Eps, and specifically includes:
step 3.1: determining the neighborhood radius parameter Eps by a self-adaptive method:
step 3.1.1: setting k = MinPts to obtain a specific k-dist graph, wherein k-dist is the distance between a data point and its k-th nearest point; all data points are arranged in ascending order of their k-dist values to obtain an ascending k-dist graph; a data point is the point represented by the two-dimensional difference combination feature of a KPI historical datum, and MinPts is the number of data points within the neighborhood radius Eps;
step 3.1.2: let f and l be the first and last points in the k-dist graph, respectively, and let p be any point between f and l with coordinates (p.x, p.y), where p.x is the abscissa, i.e. the ascending sequence number, and p.y is the ordinate, i.e. the k-dist value; the normalized vectors formed by the points f and p, and by the points p and l, are respectively expressed as:
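[Equation image BDA0002330498290000021 not reproduced: normalized vector from f to p]
[Equation image BDA0002330498290000022 not reproduced: normalized vector from p to l]
Then the cosine of the angle θ between these two vectors is expressed as:
[Equation image BDA0002330498290000031 not reproduced: expression for cos θ]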
step 3.1.3: traversing all points p in the k-dist graph to obtain the minimum cos θ; the point p corresponding to the minimum cos θ is the inflection point t of the k-dist graph, and the k-dist value of the inflection point t is k-dist(t);
step 3.1.4: the value of the neighborhood radius Eps is:
Eps = k-dist(t) × ρ, ρ ≥ 1
where ρ is an abnormal data precision adjustment coefficient;
step 3.2: the minimum number of data points MinPts within the neighborhood radius Eps is set such that MinPts ≥ 4.
Further, in step 5, the method for checking the real-time two-dimensional difference combination feature against the normal data cluster set from step 3, to judge whether the feature belongs to a cluster of normal data, is as follows:
step 5.1: calculating the minimum distance between the real-time two-dimensional difference combination feature and all points in a given cluster;
step 5.2: judging whether this minimum distance is smaller than a threshold; if so, the data belongs to the cluster and the network is judged normal; otherwise, going to step 5.3;
step 5.3: traversing the other clusters and repeating steps 5.1 and 5.2; if the minimum distance between the real-time two-dimensional difference combination feature and the points of every cluster is greater than the threshold, judging that the network is abnormal.
Further, when calculating the minimum distance in step 5.1, if the cluster is the largest cluster, the largest cluster is sub-sampled and replaced by the sub-sample, and only the minimum distance between the data points of the sub-sample and the real-time two-dimensional difference combination feature point is computed.
A computer device comprising a memory storing a computer program and a processor which, when executing the computer program, implements the steps of the KPI mutation-based adaptive unsupervised online network anomaly detection method described above.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the KPI mutation-based adaptive unsupervised online network anomaly detection method described above.
Compared with the prior art, the invention has the following beneficial effects:
In the KPI mutation-based self-adaptive unsupervised online network anomaly detection method, step 2 obtains two-dimensional difference combination features (which help to identify mutations) by extracting and combining high-order differences, and step 3 obtains the feature cluster set of normal data by a self-adaptive density clustering method, which filters out abnormal data. The largest cluster is then sub-sampled, and the features of KPI data acquired in real time are compared with each cluster; the KPI data is judged normal or abnormal according to whether it belongs to some cluster, and the state of the network is judged accordingly. Having analyzed the correlation between KPI mutations and network anomalies, the method discovers network anomalies indirectly by detecting KPI mutations, and experimental data show that it achieves excellent performance.
Drawings
FIG. 1 is a flow chart of the system of the present invention;
FIG. 2 is a schematic representation of three representative types of KPIs studied in accordance with the present invention;
FIG. 3 is a schematic diagram of the type of common and important KPI abnormalities, i.e., mutations, identified and defined by the present invention;
FIG. 4 is a schematic diagram of an inflection point based on a k-dist graph in the adaptive density clustering method of the present invention;
FIG. 5 is an overall framework diagram of the algorithm of the present invention;
FIG. 6 is a graph of an analysis of the effect of subsampling size on F-score and detection time in the algorithm of the present invention;
FIG. 7 is a graph illustrating the effect of the MinPts parameter on F-score in the algorithm of the present invention.
FIG. 8 is a diagram of the detection results and actual label analysis of two-dimensional differential combination features in the algorithm feature space of the present invention.
Fig. 9 and 10 are graphs of the recognition analysis of the algorithm to extract the difference features for sudden changes and expected conceptual drift.
Detailed Description
The method provided by the invention is used for exploring the properties and mechanisms of the abnormalities in various KPIs in detail from the aspect of the abnormalities, and identifying a common and important abnormal form in KPI data, namely mutation.
KPI mutations are defined as:
Let $f(t) = x_t - x_{t-1}$, where t is a timestamp of the KPI and $x_t$ is the monitored value at time t. An abnormal segment of length m in a KPI, $\{x_i, x_{i+1}, \ldots, x_{i+m-1}\}$, is a mutation if it satisfies the following condition:
[Equation image BDA0002330498290000041 not reproduced]
$|f(k+i)| > \text{threshold},\quad k < \text{delay}$
Here the threshold depends on the KPI, since the distribution of f(t) differs between KPIs. For example, if $f(t) \sim N(0, \sigma^2)$, the threshold may be set to 3σ, since the interval within 3σ of the mean contains approximately 99.7% of the data. In practice the distribution of f(t) is usually unknown, and learning it is critical for detecting mutations. The delay is the distance between the first point of the mutation (abnormal segment) and its first mutation point, where a mutation point is a point satisfying |f(t)| > threshold. The allowed delay depends on the actual performance requirements, e.g. delay = 7 when DDCOL is required to detect an anomaly before the seventh monitoring point after the anomaly has occurred.
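The definition above can be illustrated with a minimal Python sketch. It assumes that the condition in the missing equation image is an existence condition over k (some k below the delay and inside the segment satisfies the inequality), and it estimates the 3σ threshold from differences of data assumed to be normal. The function names `first_differences` and `is_mutation` are hypothetical and do not come from the patent.

```python
from statistics import pstdev

def first_differences(x):
    """f(t) = x[t] - x[t-1] for t = 1 .. len(x)-1."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

def is_mutation(x, i, m, threshold, delay):
    """Check the segment x[i : i+m] against the mutation definition.

    Assumption: the segment is a mutation if some offset k with
    0 <= k < min(delay, m) satisfies |f(i + k)| > threshold.
    """
    for k in range(min(delay, m)):
        t = i + k
        if 1 <= t < len(x) and abs(x[t] - x[t - 1]) > threshold:
            return True
    return False

# Threshold of 3 sigma, estimated from differences of data assumed normal.
normal = [10.0, 10.1, 9.9, 10.0, 10.2, 10.0, 9.9, 10.1]
threshold = 3 * pstdev(first_differences(normal))

spike = normal + [25.0, 25.2, 24.9]   # abrupt jump at index 8
drift = normal + [10.2, 10.3, 10.4]   # slow change, no abrupt jump
late = normal + [10.1, 10.0, 25.0]    # jump only at offset k = 2
```

With these series, `is_mutation(spike, 8, 3, threshold, delay=7)` holds, the slowly drifting segment is not a mutation, and the late jump counts only when the delay allows offset 2.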
A mutation generally means that the values in an abnormal segment of a KPI suddenly increase or decrease within a small delay, as with spikes and jitter. Such phenomena have important practical significance: since the values in a KPI are generated by continuous monitoring (for example, monitoring page views yields one observation per minute), values at consecutive timestamps usually do not change drastically. When the values or the trend of a KPI mutate, there is reason to suspect immediately that the underlying mechanism generating the data has changed substantially. Such changes are typically caused by a failure of a network-related service or system. The detection of mutations in KPIs is therefore closely related to the detection of anomalous events in the network, which often appear as contextual anomalies at the relevant timestamps. For example, an external attack on a website may cause a sudden increase in website traffic. A mutation thus often means that a related service in the network is abnormal, e.g. a network failure, server failure, configuration error, server overload, or external attack. In addition, experiments found that most anomalies in KPIs are mutations, so detecting KPI mutations is both necessary and important for discovering network anomalies. However, existing methods cannot accurately detect mutations across various KPIs. To overcome these shortcomings, an adaptive unsupervised online anomaly detection method named DDCOL is provided.
Fig. 1 to fig. 10 show a specific embodiment of an adaptive unsupervised online network anomaly detection method based on KPI mutation, as shown in fig. 1, including the following steps:
step 1: collecting KPI historical data in a network, and preprocessing the historical data;
preprocessing the historical data includes normalizing the data after filling missing values with the mean of neighboring values.
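The preprocessing step can be sketched as follows. This is a minimal illustration, not the patented implementation: missing values are assumed to be marked as `None`, "neighboring values" is taken to mean the nearest observed value on each side, and "normalizing" is assumed to be z-score standardization. The patent does not specify any of these details, and the function names are hypothetical.

```python
from statistics import mean, pstdev

def fill_missing(values):
    """Replace each None with the mean of its nearest observed neighbors."""
    n = len(values)
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Nearest observed value to the left and to the right, if any.
        left = next((values[j] for j in range(i - 1, -1, -1)
                     if values[j] is not None), None)
        right = next((values[j] for j in range(i + 1, n)
                      if values[j] is not None), None)
        neighbors = [x for x in (left, right) if x is not None]
        if not neighbors:
            raise ValueError("series contains no observed values")
        filled[i] = mean(neighbors)
    return filled

def standardize(values):
    """Z-score normalization (assumed form of the normalization)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def preprocess(values):
    """Fill missing values first, then normalize, as the text describes."""
    return standardize(fill_missing(values))
```

For example, `fill_missing([1.0, None, 3.0])` fills the gap with 2.0 before the series is standardized.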
Fig. 2 shows the three representative types of KPI addressed by the present invention: from top to bottom, periodic, stable, and fluctuating. Periodic KPIs have a pattern that recurs at regular intervals, such as page views. In a stable KPI, normal data is distributed over a small region relative to abnormal data, such as the number of slow responses of a search data center. Fluctuating KPIs are unstable, have no apparent periodic characteristics, and are subject to large fluctuations, such as search response time. The fluctuations (i.e., the spikes in the figure) in fluctuating KPIs are actually caused by expected concept drift. Expected concept drift refers to a significant change in the distribution of a KPI that domain experts nevertheless consider normal. It is typically caused by software updates, service updates, and the like, and often appears as a level change on the KPI, such as a spike or a dip. Note that although expected concept drift and mutation anomalies have similar characteristics, they are different in nature. Expected concept drift may cause an anomaly detection algorithm to fail, because the model that the algorithm built on the old concept is not applicable to the new concept. Therefore, accurate anomaly detection requires learning the feature distribution of normal data including expected concept drift. The circles in the figure mark anomalies that look similar macroscopically, i.e. abrupt, despite differences in the type or shape of the KPIs. Fig. 3 shows a mutation microscopically in more detail: within a small delay, the difference between adjacent values in the anomalous segment becomes much larger than the normal difference. In the present embodiment, the network key performance indicator used as the example is page views.
Step 2: extracting high-order differences of the preprocessed historical data and combining them into two-dimensional difference combination features, which form a two-dimensional difference combination feature set. The preprocessed historical data is discretized, and the first-order difference $x_t - x_{t-1}$ and the second-order difference $x_t - x_{t-2}$ of the discretized historical data are extracted to form the two-dimensional combined feature $(x_t - x_{t-1},\ x_t - x_{t-2})$. The two-dimensional combined features formed from the first-order and second-order differences of all the discretized historical data constitute the two-dimensional combined feature set.
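As a small illustration of step 2, the following Python sketch computes the two-dimensional feature for every index t ≥ 2 of a series. The input is assumed to be already preprocessed and discretized, and the name `diff_features` is hypothetical.

```python
def diff_features(x):
    """Return the features (x[t] - x[t-1], x[t] - x[t-2]) for t = 2 .. len(x)-1."""
    return [(x[t] - x[t - 1], x[t] - x[t - 2]) for t in range(2, len(x))]

# A flat series with a single spike at index 3.
features = diff_features([5, 5, 5, 9, 5])
# features == [(0, 0), (4, 4), (-4, 0)]
```

Points in a flat stretch map to features near the origin, while the spike produces features far from it, which is what makes the feature space suitable for density-based separation of normal and abrupt behaviour.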
Step 3: performing clustering analysis on the feature data in the two-dimensional difference combination feature set by using a self-adaptive density clustering method to obtain a cluster set of normal data;
The invention adopts a density-based clustering method, namely DBSCAN (Density-Based Spatial Clustering of Applications with Noise), to cluster the two-dimensional difference combination features. The core idea of DBSCAN is to find the core points in the data, i.e. points that have at least MinPts data points within their neighborhood of radius Eps, and then to expand the core points into clusters through similarity propagation. Points that do not belong to any cluster are filtered out. The clusters formed by DBSCAN are regarded as the feature distribution of normal data in the KPI, with abnormal data already filtered out. DBSCAN is adopted for the following reasons: 1. The principle of DBSCAN is consistent with the definition and discovery of anomalies: most data, having similar characteristics, form clusters and are considered to be generated by normal mechanisms, while a small portion of data with different characteristics is filtered out and considered anomalous. 2. It is very difficult to determine the number of clusters in advance, which makes many algorithms that require a predefined number of clusters, such as K-means, unsuitable for this scenario, whereas DBSCAN forms clusters automatically without a predefined cluster number. 3. The clusters formed by DBSCAN can have any shape, which accommodates the complexity and variability of KPIs. 4. The time complexity of DBSCAN is relatively low.
In the density-based clustering method DBSCAN, the neighborhood radius Eps and the number of data points MinPts within the neighborhood have an important influence on performance and detection accuracy. Therefore, when using DBSCAN, these two parameters are determined first; each data point in the two-dimensional combined feature set is then input into DBSCAN, which finally outputs a number of clusters.
Step 3.1: the neighborhood radius parameter Eps is determined by adopting a self-adaptive method, which specifically comprises the following steps:
step 3.1.1: setting k = MinPts to obtain a specific k-dist graph, wherein k-dist is the distance between a data point and its k-th nearest point; all data points are arranged in ascending order of their k-dist values to obtain an ascending k-dist graph, where a data point is the point represented by the two-dimensional difference combination feature of a KPI historical datum. The inflection point of the k-dist graph, i.e. the representative point of the curved region, provides a good estimate of Eps, because it divides the k-dist graph into two parts: a flat region and a steep region. Points in the flat region lie in dense areas, which should form clusters, while points in the steep region lie in sparse areas and are very likely to be abnormal data. The method for determining the inflection point in the k-dist graph is as follows:
step 3.1.2: let f and l be the first and last points in the k-dist graph, respectively, and let p be any point between f and l with coordinates (p.x, p.y), where p.x is the abscissa, i.e. the ascending sequence number, and p.y is the ordinate, i.e. the k-dist value; the normalized vectors formed by the points f and p, and by the points p and l, are respectively expressed as:
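[Equation image BDA0002330498290000061 not reproduced: normalized vector from f to p]
[Equation image BDA0002330498290000062 not reproduced: normalized vector from p to l]
Then the cosine of the angle θ between these two vectors is expressed as:
[Equation image BDA0002330498290000065 not reproduced: expression for cos θ]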
step 3.1.3: traversing all points p in the k-dist graph to find the minimum value of cos θ; the point p corresponding to the minimum cos θ is the inflection point t of the k-dist graph, and the k-dist value of the inflection point t is k-dist(t);
step 3.1.4: the value of the neighborhood radius Eps is:
Eps = k-dist(t) × ρ, ρ ≥ 1;
where ρ is an abnormal data precision adjustment coefficient.
Note that setting k = MinPts in step 3.1.1 has an important property: when Eps equals k-dist(t) (the k-dist value of the inflection point t), all points whose k-dist value is less than or equal to k-dist(t) form clusters, because they are all core points. This ensures that all points lying in the flat region of the ascending k-dist graph can form clusters. To adapt Eps to different KPIs, the value of Eps in this embodiment is determined from the obtained inflection point as:
Eps = k-dist(t) × ρ, ρ ≥ 1
where ρ is a multiplier that serves as the abnormal data precision adjustment coefficient and helps Eps adapt to the distributions of different KPIs. By determining the most compact feature distribution of normal data, this embodiment provides a lower bound for Eps, i.e. Eps ≥ k-dist(t). The invention thus converts the determination of Eps into the determination of ρ, and experiments found that searching over ρ = 2^0, 2^1, ..., 2^6 achieves good performance on different KPIs. Determining ρ is clearly much easier than determining Eps directly, because the search range is smaller. In practice, domain experts may adjust ρ to capture the anomalous data of interest to them.
FIG. 4 is a schematic diagram of searching inflection points in the adaptive density clustering method of the present invention. This figure shows that the k-dist values are small and close for most data, while k-dist is large for a small portion of anomalous data. The inflection point divides the ascending k-dist plot into two parts, flat and steep, providing a good estimate of one parameter, eps. The self-adaptive method can accurately determine the inflection point.
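Steps 3.1.1 to 3.1.4 can be sketched in Python as follows. Because the equation images are not reproduced here, the exact normalization is not known; this sketch assumes that "normalized" means rescaling both axes of the k-dist graph to [0, 1] before forming the vectors from f to p and from p to l. It also assumes that the k-th nearest point excludes the point itself. The function names are hypothetical, and at least three data points are required.

```python
import math

def k_dist_values(points, k):
    """Ascending k-dist values: distance from each point to its k-th nearest
    other point (assumption: the point itself is not counted)."""
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[k - 1])
    return sorted(out)

def inflection_index(kdist):
    """Index of the point p that minimizes cos(theta) between f->p and p->l."""
    n = len(kdist)
    span_y = (kdist[-1] - kdist[0]) or 1.0
    # Assumption: both axes are rescaled to [0, 1] so that sequence numbers
    # and distances are comparable.
    xs = [i / (n - 1) for i in range(n)]
    ys = [(v - kdist[0]) / span_y for v in kdist]
    best_i, best_cos = None, float("inf")
    for i in range(1, n - 1):
        ax, ay = xs[i] - xs[0], ys[i] - ys[0]      # vector f -> p
        bx, by = xs[-1] - xs[i], ys[-1] - ys[i]    # vector p -> l
        cos_theta = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
        if cos_theta < best_cos:
            best_i, best_cos = i, cos_theta
    return best_i

def estimate_eps(points, min_pts=5, rho=1.0):
    """Eps = k-dist(t) * rho, with k = MinPts and t the inflection point."""
    kd = k_dist_values(points, min_pts)
    return kd[inflection_index(kd)] * rho

# Twenty evenly spaced points on a line plus one far outlier.
sample = [(i * 0.1, 0.0) for i in range(20)] + [(50.0, 50.0)]
eps = estimate_eps(sample, min_pts=2)
```

On this sample the flat part of the k-dist graph ends at a value of about 0.2, so `eps` is close to 0.2, and choosing `rho=2.0` doubles it.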
Step 3.2: the minimum number of data points MinPts within the neighborhood radius Eps is set such that MinPts ≥ 4.
In this embodiment, MinPts = 5 is used, because for two-dimensional data the clustering result does not change significantly when MinPts ≥ 4. Experiments also confirm that the algorithm is insensitive to the parameter MinPts.
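With Eps and MinPts fixed, the clustering itself can be illustrated with a compact, unoptimized DBSCAN in plain Python. This is a textbook-style sketch rather than the patent's implementation: it uses a quadratic neighbor search, counts a point as part of its own neighborhood (a common convention that the patent does not spell out), and returns the noise points separately so they can be discarded as abnormal data.

```python
import math

def dbscan(points, eps, min_pts):
    """Return (clusters, noise); clusters is a list of lists of points."""
    n = len(points)

    def region(i):
        # All points within eps of points[i], including points[i] itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    labels = [None] * n          # None = unvisited, -1 = noise, >= 0 = cluster id
    cid = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = -1       # provisional noise, may become a border point
            continue
        labels[i] = cid
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cid  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cid
            nj = region(j)
            if len(nj) >= min_pts:
                queue.extend(nj)  # j is a core point: keep expanding
        cid += 1
    clusters = [[points[i] for i in range(n) if labels[i] == c] for c in range(cid)]
    noise = [points[i] for i in range(n) if labels[i] == -1]
    return clusters, noise

group_a = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1), (0.05, 0.05)]
group_b = [(5.0, 5.0), (5.0, 5.1), (5.1, 5.0), (5.1, 5.1), (5.05, 5.05)]
clusters, noise = dbscan(group_a + group_b + [(20.0, 20.0)], eps=0.2, min_pts=4)
```

Here the two dense groups form two clusters and the isolated point is returned as noise, mirroring how abnormal features are filtered out of the normal-data cluster set.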
Step 4: acquiring KPI real-time data online, and preprocessing the real-time data;
The real-time data is normalized.
Step 5: extracting the two-dimensional difference combination feature of the preprocessed real-time data and checking it against the normal data cluster set obtained by the adaptive density clustering method in step 3 to judge whether the feature belongs to a cluster of normal data; if not, judging that the real-time data is abnormal and, further, that the network is in an abnormal state.
Step 5.1: calculating the minimum distance between the real-time two-dimensional difference combination feature and all points in a given cluster;
Step 5.2: judging whether this minimum distance is smaller than a threshold; if so, the two-dimensional difference combination feature belongs to the cluster and the network is judged normal; otherwise, going to step 5.3. The threshold in this embodiment is the neighborhood radius Eps.
Step 5.3: traversing the other clusters and repeating steps 5.1 and 5.2; if the minimum distance between the real-time two-dimensional difference feature point and the points of every cluster is greater than the threshold, judging that the network is abnormal.
For real-time online data, the two-dimensional difference combination feature is extracted first; the distance between the feature and each cluster, i.e. the minimum distance between the feature and all points in that cluster, is then calculated; finally, this distance is compared with a threshold. If the distance is smaller than the threshold, the data belongs to the cluster and is normal. If the feature does not belong to any cluster, it is anomalous. In this embodiment, the threshold is set equal to Eps for two reasons: 1. In the training phase, clusters are grown from core points using the radius Eps, so it is natural to use Eps as the distance threshold for deciding whether online data belongs to a normal cluster in the detection phase. 2. The invention keeps the number of parameters as small as possible to make the algorithm more practical. With this design, only one parameter, Eps, needs to be adjusted, and the self-adaptive method helps a domain expert to determine its value, so that accurate detection can be achieved by simply adjusting this parameter.
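The online check of steps 5.1 to 5.3 can be sketched in Python as follows, assuming that clusters are stored as lists of two-dimensional points and that the threshold equals Eps as in this embodiment. The function names are hypothetical.

```python
import math

def online_feature(x_t, x_t1, x_t2):
    """Feature of the newest value x_t given the two previous values."""
    return (x_t - x_t1, x_t - x_t2)

def distance_to_cluster(feature, cluster):
    """Step 5.1: minimum distance from the feature to any point of the cluster."""
    return min(math.dist(feature, p) for p in cluster)

def is_anomalous(feature, clusters, eps):
    """Steps 5.2 and 5.3: normal if the feature is closer than eps to some
    cluster, anomalous if it is not close to any cluster."""
    for cluster in clusters:
        if distance_to_cluster(feature, cluster) < eps:
            return False
    return True

normal_clusters = [[(0.0, 0.0), (0.1, 0.0)], [(5.0, 5.0)]]
```

For example, with `eps=0.5` the feature `(0.2, 0.1)` is accepted by the first cluster, while `(2.0, 2.0)` lies far from both clusters and is flagged as anomalous.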
When a normal cluster contains a large number of samples, computing the minimum distance during online detection takes a long time and degrades real-time performance. This embodiment therefore uses sub-sampling to reduce the number of samples. Specifically, the largest cluster is sub-sampled and replaced by the sub-sample, and only the minimum distance between the data points of the sub-sample and the real-time two-dimensional difference combination feature point is computed. Only the largest cluster is sub-sampled and the other clusters are kept intact, for two reasons: with the Eps obtained by the adaptive method, most normal data features fall into one large cluster while a small portion of the other normal data features fall into small clusters; and a small portion of the samples in the largest cluster can represent the density and features of the entire cluster, because it contains many repeated or similar samples. Sub-sampling the largest cluster therefore reduces the computational cost significantly and enables efficient real-time detection, while retaining the small clusters preserves detection accuracy. Experiments found that once the sub-sample size is increased to a suitable value (2^12), the algorithm achieves accurate and efficient detection, and a larger sample is unnecessary, as it does not improve accuracy but increases the computational overhead.
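A minimal sketch of the sub-sampling step is given below. The patent does not state how the sub-sample is drawn, so uniform random sampling without replacement is an assumption here, as are the function name `subsample_largest` and the fixed seed used for reproducibility.

```python
import random

def subsample_largest(clusters, size=2**12, seed=0):
    """Return a copy of `clusters` in which the largest cluster is replaced by
    a uniform random sample of at most `size` points; other clusters are kept."""
    if not clusters:
        return []
    largest = max(range(len(clusters)), key=lambda i: len(clusters[i]))
    rng = random.Random(seed)
    out = [list(c) for c in clusters]
    if len(out[largest]) > size:
        out[largest] = rng.sample(out[largest], size)
    return out

big = [(i, i) for i in range(100)]
small = [(0, 1), (0, 2)]
reduced = subsample_largest([big, small], size=10)
```

The reduced cluster set can then be used in place of the full one for the minimum-distance check, trading a bounded number of distance computations per online sample against a small loss of coverage of the largest cluster.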
FIG. 5 is the overall framework diagram of the invention. The invention is divided into two phases: an offline training phase and an online detection phase. In the offline training phase (steps 1 to 3), the historical data is first preprocessed, which includes filling missing values with the mean of neighboring values and then normalizing the data. High-order differences are then extracted and combined to obtain the two-dimensional differential combination features, and adaptive density clustering is used to obtain the feature distribution of normal data, that is, the clusters. Finally, the largest cluster is sub-sampled and the other small clusters are retained. In the online detection phase (steps 4 and 5), the real-time data is first preprocessed in the same way as in the offline training phase, its two-dimensional differential combination features are then extracted, and finally it is judged in real time whether those features belong to one of the previously trained clusters, which determines whether the online data is anomalous.
The following verifies, through experimental data and figures, that the invention discovers network anomalies by finding mutation data in KPIs.
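The preprocessing step can be sketched as follows. Several details are assumptions, because the text does not specify them: missing values are represented as NaN, "neighboring values" are taken to be the nearest valid value on each side, and normalization is min-max scaling to [0, 1]. The function name `preprocess` is hypothetical.

```python
import numpy as np

def preprocess(values):
    """Fill gaps with the mean of the nearest valid neighbours, then normalize.

    Assumptions: NaN marks a missing value; min-max scaling is used.
    """
    x = np.asarray(values, dtype=float).copy()
    valid = np.flatnonzero(~np.isnan(x))
    if valid.size == 0:
        raise ValueError("series contains no valid values")
    for i in np.flatnonzero(np.isnan(x)):
        left = valid[valid < i]
        right = valid[valid > i]
        neighbours = []
        if left.size:
            neighbours.append(x[left[-1]])   # nearest valid value before i
        if right.size:
            neighbours.append(x[right[0]])   # nearest valid value after i
        x[i] = np.mean(neighbours)
    lo, hi = x.min(), x.max()
    if hi == lo:
        return np.zeros_like(x)              # constant series
    return (x - lo) / (hi - lo)
```

For online data, a deployment would presumably reuse the scaling learned from the historical data rather than recompute it per point; that choice is outside what the text specifies.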
Table 1 overall performance comparison of the inventive method DDCOL and the other four algorithms
Figure BDA0002330498290000091
Table 1 compares the overall performance of the inventive method DDCOL with the four other methods. Here, precision is the proportion of detected true anomalous points among all points the algorithm marks as anomalous, and recall is the proportion of detected true anomalous points among all true anomalous points. The F-score is the harmonic mean of precision and recall. The best value of all three metrics is 1 and the worst is 0. In general, precision and recall trade off against each other: adjusting the threshold can raise one while lowering the other. The F-score gives a more objective evaluation, since it is high only when both precision and recall are high. Note that when computing these metrics for an anomalous segment, if an anomalous point is detected within a small delay, all points in the segment are considered detected; otherwise, all points in the segment are considered undetected. This is because, in practice, domain experts care whether the algorithm can detect an anomaly and raise an alert within a small delay, not how many anomalous points are detected. For a given dataset, different F-scores can be obtained by adjusting the algorithm's parameters; the experiments compare the optimal F-score because it represents an algorithm's best performance. In addition, the delay is the distance between the first point of an anomalous segment and the first detected point in it. Datasets 1-8 are 8 large-scale KPIs from actual monitoring (each containing 20000-140000 data points from 1-3 months of monitoring), and datasets A1-A4 are the Yahoo open anomaly detection datasets, containing a total of 356 small KPIs (each containing roughly 1400 data points).
Experimental results show that, compared with two traditional statistical methods (simple moving average, abbreviated Simple MA, and the autoregressive integrated moving average model, abbreviated ARIMA) and two unsupervised learning methods (adaptive kernel density estimation, abbreviated AKDE, and Donut, an algorithm name coined in its paper with no full form), the method of the invention achieves the best result on 7 of the 12 datasets and performs stably and well across different datasets.
Table 2 shows the mutation proportions found by the invention, calculated by dividing the number of anomalous segments detected by DDCOL at the optimal F-score by the total number of anomalous segments. The results show that when the delay is 5 or more, the mutation proportion reaches 74% or more on 11 of the 12 datasets, and even 100% on 4 datasets. This result suggests that most anomalies in KPIs are mutations. A likely reason is that in a network environment anomalies tend to arise suddenly, in bursts, whereas the conditions that produce other types of anomalies (such as slow rises) are so demanding that they occur only rarely. This is also why DDCOL, which is designed for mutations, achieves good performance.
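The segment-level evaluation rule described above can be sketched as follows. This is one reading of the rule, assuming binary point labels and predictions and a delay measured in points; the function names are hypothetical, and the patent's exact evaluation code is not given.

```python
import numpy as np

def adjust_predictions(labels, preds, delay):
    """Apply the segment rule: a labelled anomalous segment counts as fully
    detected if any point within `delay` points of its start is flagged,
    and as fully missed otherwise. Points outside segments are unchanged.
    """
    labels = np.asarray(labels, dtype=bool)
    adjusted = np.asarray(preds, dtype=bool).copy()
    n, i = len(labels), 0
    while i < n:
        if labels[i]:
            j = i
            while j < n and labels[j]:
                j += 1                       # segment covers [i, j)
            hit = adjusted[i:min(i + delay + 1, j)].any()
            adjusted[i:j] = hit
            i = j
        else:
            i += 1
    return adjusted

def precision_recall_f(labels, preds, delay):
    """Point-wise precision, recall and F-score after segment adjustment."""
    labels = np.asarray(labels, dtype=bool)
    adjusted = adjust_predictions(labels, preds, delay)
    tp = int(np.sum(adjusted & labels))
    fp = int(np.sum(adjusted & ~labels))
    fn = int(np.sum(~adjusted & labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f
```

With a larger allowed delay, more segments count as detected, so recall (and usually the F-score) can only stay the same or rise for a fixed set of predictions.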
TABLE 2 mutation ratios found in the present invention
Figure BDA0002330498290000101
Table 3 lists the F-score for different features. To explore how the features extracted by DDCOL affect performance, the invention separately extracts the mean, the variance (var), the first-order difference (diff1), the second-order difference (diff2), and combinations of two of these. Among them, the two-dimensional feature (diff1, diff2) is the one the invention considers simplest and most effective for mutations, and it is the only feature extracted in the preceding experiments. The results show that (diff1, diff2) achieves the highest average F-score, while the other features perform poorly on some datasets. This is because the other features are limited in certain cases. For the mean feature, a significant change in the data distribution of a KPI can cause the feature to fail, because the mean distribution at detection time has deviated from the one learned in training. For the variance feature, constant fluctuations in normal data may produce large variances that are misjudged as anomalous. The difference features, by contrast, are direct and effective for mutations, and the combination of high-order differences is robust both for identifying mutations and for distinguishing them from expected concept drift.
TABLE 3 comparison of F-score for different characteristics
Figure BDA0002330498290000111
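As an illustration of the (diff1, diff2) feature compared in Table 3, the sketch below uses the definitions given in claim 1, where the first component is x_t - x_{t-1} and the second is x_t - x_{t-2}. The function name `diff_features` is hypothetical, and the input is assumed to be an already preprocessed one-dimensional series.

```python
import numpy as np

def diff_features(x):
    """Build the two-dimensional feature (x_t - x_{t-1}, x_t - x_{t-2}).

    One feature row is produced for each t >= 2, so a series of length n
    yields n - 2 feature points.
    """
    x = np.asarray(x, dtype=float)
    diff1 = x[2:] - x[1:-1]   # x_t - x_{t-1}
    diff2 = x[2:] - x[:-2]    # x_t - x_{t-2}
    return np.column_stack([diff1, diff2])

features = diff_features([1.0, 2.0, 4.0, 7.0])
# rows: (4-2, 4-1) = (2, 3) and (7-4, 7-2) = (3, 5)
```

Each row is one point in the feature space that the density clustering later operates on.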
FIG. 6 shows the effect of the sub-sample size on the F-score and the detection time. The results show that the F-score converges at a small sub-sample size, so a small sub-sample size provides both a high F-score and a low detection time. Therefore, in this embodiment, once the sub-sample size is increased to a suitable value ($2^{12}$), the invention achieves accurate and efficient detection, and there is no need to increase the sample size further, since doing so does not improve accuracy but increases the computational cost.
Table 4 gives statistics on the clusters formed by the density-based clustering method of the invention. The results show that most feature samples form one large cluster because of their similarity, and the remaining few samples form small clusters. For example, in the feature space of dataset 3, the largest cluster has 12791 samples, while the other 10 clusters have 101 samples. Therefore, retaining the small clusters does not sharply increase the total number of samples after sub-sampling, which ensures efficient real-time detection.
TABLE 4 comparison table of sample sizes in clusters formed by density-based clustering method
Figure BDA0002330498290000112
FIG. 7 shows the effect of the parameter MinPts on the F-score. The results show that the F-score remains stable as MinPts changes, that is, the algorithm is insensitive to the parameter MinPts. In this embodiment, MinPts = 5 is preferred.
FIG. 8 compares the detection results (right) with the actual labels (left) for the two-dimensional differential combination features of the invention, where 'x' represents an anomaly and the dots represent normal features. The figure shows that the DDCOL algorithm can identify mutations (anomalies) through high-order differences and their combination, and obtains a good feature distribution of the normal data in KPIs through density-based clustering.
FIGS. 9 and 10 analyze how the difference features extracted by the algorithm distinguish mutations from expected concept drift. The circles in FIG. 9 mark anomalous data, and 'x' in FIG. 10 marks anomalous two-dimensional differential combination features. FIG. 9 shows that the original KPI is not stable under expected concept drift, and that the first-order difference feature, although stable except at the moment the expected concept drift occurs, still causes many false positives because the first-order difference features of a mutation and of expected concept drift are similar. FIG. 10 shows the two-dimensional differential combination feature space extracted by the algorithm of the invention, in which the features of mutations and of expected concept drift differ. Through density clustering, the DDCOL algorithm can model the normal data in KPIs, including expected concept drift, and thus obtain a good feature distribution of the normal data.
According to the invention, the adaptive density clustering method of step 3 yields a cluster set of normal data and filters out anomalous data. KPI data acquired in real time is then compared against each cluster: whether the KPI data is anomalous is decided by whether it belongs to some cluster, and from this it is decided whether the network is anomalous. Having analyzed the correlation between KPI mutations and network anomalies, the method discovers network anomalies indirectly by searching for KPI mutations, and the experimental data show that it achieves good performance. The invention discovers network anomalies from the perspective of detecting KPI anomalies, provides a method for identifying mutations, a common and important form of anomaly, and finds that most anomalies in KPIs are mutations. Experiments confirm that the network anomaly judgment method provided by the invention is effective and performs well.
The present invention also provides a computer device, comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the KPI mutation-based adaptive unsupervised online network anomaly detection method described above when executing the computer program.
The present invention also provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the KPI mutation-based adaptive unsupervised online network anomaly detection method described above.
The above is only a preferred embodiment of the present invention. The protection scope of the present invention is not limited to the above embodiment, and all technical solutions falling under the idea of the present invention belong to its protection scope. It should be noted that modifications and refinements made by those skilled in the art without departing from the principle of the invention also fall within its protection scope.

Claims (7)

1. A KPI mutation-based self-adaptive unsupervised online network anomaly detection method is characterized in that: the method comprises the following steps:
step 1: collecting KPI historical data in a network, and preprocessing the historical data;
step 2: extracting high-order differences of the preprocessed historical data and combining them into two-dimensional differential combination features, forming a two-dimensional differential combination feature set;
step 3: performing clustering analysis on the feature data in the two-dimensional differential combination feature set by using an adaptive density clustering method to obtain a cluster set of normal data;
step 4: acquiring KPI real-time data online, and preprocessing the real-time data;
step 5: extracting the two-dimensional differential combination feature of the preprocessed real-time data, inputting the real-time two-dimensional differential combination feature into the normal data cluster set trained by the adaptive density clustering method in step 3, and judging whether the combination feature belongs to a cluster of normal data; if not, judging that the real-time data is abnormal, and further judging that the network is in an abnormal state;
the two-dimensional differential combination feature extraction method in step 2 comprises: discretizing the preprocessed historical data, and extracting the first-order difference $x_t - x_{t-1}$ and the second-order difference $x_t - x_{t-2}$ of the discretized historical data to form the two-dimensional combined feature $(x_t - x_{t-1},\ x_t - x_{t-2})$;
The adaptive density clustering method in step 3 sets the two parameters of the density-based clustering method DBSCAN, namely the neighborhood radius Eps and the number of data points MinPts within the neighborhood radius Eps, and specifically comprises the following steps:
step 3.1: the neighborhood radius parameter Eps is determined by adopting a self-adaptive method, which specifically comprises the following steps:
step 3.1.1: setting k = MinPts to obtain a specific k-dist graph, wherein k-dist is the distance between a data point and its k-th nearest point; sorting all data points in ascending order of their k-dist values to obtain an ascending k-dist graph, wherein a data point is the point represented by the two-dimensional differential combination feature of each KPI historical data;
step 3.1.2: let f and l be the first and last points in the k-dist diagram, respectively, p be any point between f and l, and the coordinates of p be (p.x, p.y), where p.x represents the abscissa, i.e., ascending sequence number, and p.y represents the ordinate, i.e., k-dist value, then the normalized vector composed of points f and p, and p and l, respectively, is represented as:
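[Equation image FDA0003841392330000011: normalized vector formed by points f and p]
[Equation image FDA0003841392330000012: normalized vector formed by points p and l]
then the cosine of the angle θ between these two vectors (shown inline as equation images FDA0003841392330000013 and FDA0003841392330000014) is expressed as:
[Equation image FDA0003841392330000015: expression for cos θ]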
step 3.1.3: traversing all points p in the k-dist graph, and finding out the minimum value of cos theta, wherein the point p corresponding to the minimum cos theta is an inflection point t in the k-dist graph, and the k-dist value corresponding to the inflection point t is k-dist (t);
step 3.1.4: value of neighborhood radius Eps:
Eps=k-dist(t)×ρ,ρ≥1;
ρ is the anomaly precision adjustment coefficient;
step 3.2: setting the number of data points MinPts within the neighborhood radius Eps to MinPts ≥ 4.
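Steps 3.1.1 to 3.1.4 can be sketched as follows. The equation images that define the normalized vectors are not available in this text, so the sketch makes two labelled assumptions: both axes of the k-dist graph are scaled to [0, 1], and the two vectors run from f to p and from p to l, so that the smallest cosine corresponds to the sharpest bend. The function names and the brute-force distance computation are illustrative only.

```python
import numpy as np

def k_dist(points, k):
    """Ascending k-dist values: distance from each point to its k-th nearest point."""
    pts = np.asarray(points, dtype=float)
    # Brute-force pairwise Euclidean distances (O(n^2) memory; sketch only).
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    d.sort(axis=1)              # column 0 is the point itself (distance 0)
    return np.sort(d[:, k])

def adaptive_eps(points, min_pts=5, rho=1.0):
    """Eps = k-dist(t) * rho, where t is the inflection point of the k-dist graph."""
    kd = k_dist(points, min_pts)
    n = len(kd)
    xs = np.arange(n) / (n - 1)                       # assumed x normalization
    span = kd[-1] - kd[0]
    ys = (kd - kd[0]) / span if span > 0 else np.zeros(n)  # assumed y normalization
    curve = np.column_stack([xs, ys])
    fp = curve[1:-1] - curve[0]                       # vector from f to p
    pl = curve[-1] - curve[1:-1]                      # vector from p to l
    cos = np.sum(fp * pl, axis=1) / (
        np.linalg.norm(fp, axis=1) * np.linalg.norm(pl, axis=1))
    t = 1 + int(np.argmin(cos))                       # index of inflection point
    return kd[t] * rho
```

The resulting value could then be passed, together with MinPts, to any DBSCAN implementation; a larger ρ widens Eps and makes the detector less strict.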
2. A KPI mutation-based adaptive unsupervised online network anomaly detection method according to claim 1, characterized in that: the preprocessing of the historical data comprises the step of normalizing the data after filling missing values by using the mean value of the adjacent values.
3. The KPI mutation-based adaptive unsupervised online network anomaly detection method of claim 1, wherein: inputting the real-time two-dimensional differential combination characteristic into a normal data cluster set trained by the adaptive density clustering method in the step 3, and judging whether the combination characteristic belongs to a cluster of certain normal data or not, wherein the method comprises the following steps:
step 5.1: calculating the minimum distance between the real-time two-dimensional differential combination characteristics and all the points in a certain cluster;
step 5.2: judging whether the minimum distance is smaller than a threshold; if so, the two-dimensional differential combination feature belongs to the cluster and the network is judged to be normal; otherwise, going to step 5.3;
step 5.3: traversing the other clusters and repeating steps 5.1 and 5.2; if the minimum distance between the real-time two-dimensional differential feature point and the points of every cluster is greater than the threshold, judging that the network is abnormal.
4. A KPI mutation based adaptive unsupervised online network anomaly detection method according to claim 3, characterized in that: the threshold is the neighborhood radius Eps.
5. A KPI mutation-based adaptive unsupervised online network anomaly detection method according to claim 3, wherein: in step 5.1, when the minimum distance between the real-time two-dimensional differential combination feature and all the points in a certain cluster is calculated, when the cluster is the maximum cluster, the maximum cluster is sub-sampled, the sub-sampled sample replaces the maximum cluster, and only the minimum distance between each data point in the sub-sampled sample and the real-time two-dimensional differential combination feature point is obtained.
6. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the KPI mutation-based adaptive unsupervised online network anomaly detection method according to any of claims 1 to 5.
7. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for adaptive unsupervised online network anomaly detection of KPI mutations according to any of the claims 1 to 5.
CN201911334135.0A 2019-12-23 2019-12-23 KPI mutation-based adaptive unsupervised online network anomaly detection method Active CN110995508B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911334135.0A CN110995508B (en) 2019-12-23 2019-12-23 KPI mutation-based adaptive unsupervised online network anomaly detection method


Publications (2)

Publication Number Publication Date
CN110995508A CN110995508A (en) 2020-04-10
CN110995508B true CN110995508B (en) 2022-11-11

Family

ID=70074150

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911334135.0A Active CN110995508B (en) 2019-12-23 2019-12-23 KPI mutation-based adaptive unsupervised online network anomaly detection method

Country Status (1)

Country Link
CN (1) CN110995508B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014183784A1 (en) * 2013-05-14 2014-11-20 Telefonaktiebolaget L M Ericsson (Publ) Resource budget determination for communications network
WO2017016472A1 (en) * 2015-07-28 2017-02-02 Huawei Technologies Co., Ltd. Predicting network performance
CN110032490A (en) * 2018-12-28 2019-07-19 中国银联股份有限公司 Method and device thereof for detection system exception
CN110083507A (en) * 2019-04-19 2019-08-02 中国科学院信息工程研究所 Key Performance Indicator classification method and device
CN110213227A (en) * 2019-04-24 2019-09-06 华为技术有限公司 A kind of network data flow detection method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10909140B2 (en) * 2016-09-26 2021-02-02 Splunk Inc. Clustering events based on extraction rules


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Towards a Personalized Item Recommendation Approach in Social Tagging Systems Using Intuitionistic Fuzzy DBSCAN;Yong Yue等;《IEEE》;20101230;全文 *

Also Published As

Publication number Publication date
CN110995508A (en) 2020-04-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant