CN113419934B - KPI index multivariate anomaly monitoring method based on regression prediction - Google Patents


Info

Publication number
CN113419934B
Authority
CN
China
Prior art keywords
kpi
data
abnormality
anomaly
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110676313.9A
Other languages
Chinese (zh)
Other versions
CN113419934A (en)
Inventor
徐丽燕
徐康
翟明玉
秦银川
林志诚
王纪立
黄鑫健
陈子韵
彭程
王宇冬
季惠英
沙一川
季学纯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NARI Group Corp
Nari Technology Co Ltd
Original Assignee
NARI Group Corp
Nari Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NARI Group Corp, Nari Technology Co Ltd filed Critical NARI Group Corp
Priority to CN202110676313.9A priority Critical patent/CN113419934B/en
Publication of CN113419934A publication Critical patent/CN113419934A/en
Application granted granted Critical
Publication of CN113419934B publication Critical patent/CN113419934B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The invention discloses a regression-prediction-based method for monitoring multiple types of anomalies in KPI (key performance indicator) data, such as single-point anomalies, continuous anomalies and correlation anomalies. The method mainly comprises the following steps: preprocessing the data set and analyzing its periodicity, trend, correlation and other statistical attributes; calculating predicted values for the data set through statistical modeling or a machine learning algorithm; calculating anomaly scores from the actual and predicted values, and detecting anomalous values using the 3-sigma criterion. Single-point anomalies are detected directly with the criterion; for continuous anomalies, the average predicted value over one period is used to calculate the anomaly score, in order to reduce the influence of the continuous anomaly on the prediction. The accuracy of the algorithm is evaluated using the standard metrics AUC and PRAUC. Multiple machine learning regression algorithms can be used to calculate the predicted values, and a targeted solution is provided for each of the anomaly types, so that the algorithms are more flexible and the monitoring of anomaly types in the data is more comprehensive.

Description

KPI (Key performance indicator) multivariate anomaly monitoring method based on regression prediction
Technical Field
The invention belongs to the field of intelligent operation and maintenance, and particularly relates to a KPI (Key performance indicator) multivariate anomaly monitoring method based on regression prediction.
Background
A KPI (key performance indicator) is time series data obtained by periodic sampling, and it has practical application significance. KPI data in Internet enterprises mainly comprises service KPIs (call success rate, call time, etc.), which reflect the monitored status and service quality of a service system, and machine KPIs (disk IO, memory, CPU, etc.), which reflect the running status of the physical machines. KPIs in the financial industry mainly comprise transaction amount, transaction success rate, web page access volume and the like. Monitoring KPI data in real time, finding KPI anomalies promptly and accurately, and thereby keeping the system running stably is of great significance to enterprises.
Anomaly monitoring methods commonly used in the industry include: year-over-year and period-over-period comparison models, threshold models, trend models, unsupervised models, supervised models, polynomial fitting, XGBoost, unsupervised Isolation Forest, supervised Random Forest, and the like. However, their actual monitoring performance differs across service scenarios and anomaly types.
Because visualizing KPI time series data is inexpensive and the patterns are easy to see, it is widely used in the operation and maintenance field to monitor the running state of a system. As systems become more complex and the volume of operation and maintenance data grows, manual monitoring alone can no longer meet the increasing demands of intelligent operation and maintenance.
Disclosure of Invention
Purpose of the invention: to overcome the shortcomings of the prior art, the invention provides a KPI multivariate anomaly monitoring method based on regression prediction. The method monitors KPI anomalies automatically through an algorithm. Different monitoring models can be constructed for different service scenarios, so that monitoring is targeted and intelligent for the various anomaly types, the limitation to particular anomaly types is overcome, and the accuracy and robustness of anomaly detection are improved; the method has low labor cost, low maintenance cost and stable performance.
Technical solution: a KPI multivariate anomaly monitoring method based on regression prediction comprises the following steps:
performing data preprocessing on the collected KPI time series data set;
performing feature engineering on the preprocessed KPI time series data set and extracting relevant features from it; the relevant features include: time features, statistical features, fitting features and time-domain features of the time series; the relevant features are extracted comprehensively for use by the different subsequent prediction algorithms, and it can be chosen whether to delete the trend component multiplicatively or additively;
inputting the extracted feature data set into all prediction algorithm classes, which may be the simple statistical modeling class (WMA, EWMA, ARIMA, Holt-Winters), the linear regression class (simple linear regression, Huber regression, Ridge regression) and the tree-based regression class (regression tree, Random Forest, XGBoost), to obtain a plurality of predicted values corresponding to each KPI time series;
calculating anomaly scores from the plurality of predicted values output by the prediction algorithms and the actual KPI time series data, performing anomaly monitoring and anomaly type judgment on the anomaly scores using the 3-sigma criterion, and using different combinations of prediction algorithm classes for different anomaly types;
evaluating the accuracy of each of the resulting anomaly scores and screening out the highest, the accuracy of the algorithm being evaluated by calculating Precision, Recall, F1-Score, AUC and PRAUC, so as to select the optimal combination of anomaly monitoring methods; and monitoring KPI multivariate anomalies according to the selected optimal combination.
In a further embodiment, during data preprocessing of the collected KPI time series data set, the periodicity, trend components, and correlations between KPI indicators in the data set are analyzed, and all anomaly types are simulated by manually injecting and labeling multivariate anomaly data in the unlabeled data set.
The time features of the KPI time series include: year, quarter, month, day, week, hour, minute, morning, afternoon, evening, weekend, month end, etc.; the statistical features of the time series: actual value, maximum, minimum, value range, mean, median, variance, kurtosis, year-over-year ratio and period-over-period ratio; the fitting features of the time series: MA, WMA, EWMA, Holt-Winters, AR, ARIMA; the time-domain features of the time series: autocorrelation coefficients, partial autocorrelation coefficients, differences, trend, period, noise.
In a further embodiment, the method of analyzing a data set is as follows:
the periodicity and trend components of the KPI time series are analyzed using the STL decomposition method to obtain the periodic characteristics and trend components contained in the data set; these are important for feature extraction in the feature engineering and for the parameters of the prediction algorithms;
the correlation between different KPI indicators is judged by calculating the Pearson coefficient between them;
if the Pearson coefficient between two KPI indicators is in the range 0.8 to 1.0, the two indicators are judged to be highly correlated.
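The correlation test can be illustrated with a minimal sketch in plain Python. The function names `pearson` and `highly_correlated` and the sample series are hypothetical and chosen only for illustration; the 0.8 to 1.0 range is the one stated in the text.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    # Clamp to [-1, 1] to absorb floating-point rounding.
    return max(-1.0, min(1.0, cov / (sx * sy)))

def highly_correlated(x, y, low=0.8, high=1.0):
    """True when the coefficient falls in the 0.8 to 1.0 range given in the text."""
    return low <= pearson(x, y) <= high

# Hypothetical KPI samples taken at the same timestamps.
cpu = [10.0, 12.0, 15.0, 11.0, 18.0, 20.0]
load = [1.1, 1.3, 1.6, 1.2, 1.9, 2.0]     # roughly proportional to cpu
other = [5.0, 1.0, 4.0, 2.0, 3.0, 1.5]    # unrelated to cpu

print(highly_correlated(cpu, load), highly_correlated(cpu, other))
```

Pairs that pass this test are the candidates for the related-change (correlation) anomaly check described later.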
In a further embodiment, when anomalies are manually injected into and labeled in the data set, the anomalies are injected first and then labeled. The methods for manually injecting and labeling anomalies in the unlabeled data set are: random point labeling, random continuous segment labeling, and rule labeling; the method is selected according to the anomaly types of the original KPI time series data set. After manual injection and labeling, the data set contains single-point anomalies, short-time anomalies and continuous anomalies, and the anomalous data is limited to no more than 2% of the total data.
In a further embodiment, random point labeling randomly selects a KPI value in the time series data set for anomaly injection;
random segment labeling randomly selects at least 3 consecutive KPI values in the time series data set for anomaly injection; for a drop-type anomaly, the KPI value is randomly reduced by a percentage drawn from a uniform distribution, the reduction being 30-100% and/or 50-100% of the actual KPI value; for a rise-type anomaly, the KPI value is randomly increased according to a uniform distribution, with the same injection percentages as for drop-type anomalies; for a related-change anomaly, one of the two KPI indicators is randomly increased or decreased according to a uniform distribution, with the same injection percentages as for drop-type anomalies;
rule labeling marks an original KPI value as anomalous if it is lower than 20% of, or higher than 180% of, any KPI value at the same time in the past 2 periods.
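A minimal sketch of drop-type injection by random point and random continuous segment is given below. The function name `inject_drop_anomalies`, its parameters and the fixed seed are assumptions made for illustration; the 2% cap, the minimum segment length of 3 and the 30-100% uniform reduction come from the text.

```python
import random

def inject_drop_anomalies(values, max_ratio=0.02, segment_len=3,
                          low=0.3, high=1.0, seed=0):
    """Return (new_values, labels); labels[i] == 1 marks an injected anomaly."""
    rng = random.Random(seed)
    n = len(values)
    budget = int(n * max_ratio)            # anomalies may not exceed 2% of the data
    out, labels = list(values), [0] * n
    chosen = set()
    # Random continuous segment: `segment_len` consecutive points.
    if budget >= segment_len:
        start = rng.randrange(0, n - segment_len + 1)
        chosen.update(range(start, start + segment_len))
    # Random points: spend the rest of the budget on randomly chosen indices.
    while len(chosen) < budget:
        chosen.add(rng.randrange(n))
    for i in sorted(chosen):
        drop = rng.uniform(low, high)      # reduce by 30%-100% of the actual value
        out[i] = values[i] * (1.0 - drop)
        labels[i] = 1
    return out, labels

series = [100.0] * 500                     # hypothetical flat KPI series
injected, labels = inject_drop_anomalies(series)
print(sum(labels), "of", len(series), "points labeled anomalous")
```

Rise-type injection would multiply by `1.0 + drop` instead; injecting first and labeling in the same pass keeps the labels aligned with the modified points.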
In a further embodiment, relevant features are extracted from the KPI data as follows:
different features are extracted for different prediction algorithms and different anomaly types;
after the feature data set is extracted, the KPI trend component in the data set is optionally deleted.
In a further embodiment, when predicting the time series, it is first determined whether the data has a trend or seasonality; removing the trend and seasonality brings the data to a stationary state. Because the trend component affects the accuracy of anomaly monitoring, and the random variation in a stationary series is greatly reduced, this facilitates prediction; making the series stationary is an important step in time series analysis. In the experiments, however, deleting the trend sometimes helped and sometimes hurt, so whether to delete the trend can be chosen before prediction. The KPI trend component in the data set can be deleted in two ways: additive deletion, in which the trend is subtracted from the actual data, and multiplicative deletion, in which the actual data is divided by the trend.
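The two deletion modes can be sketched as follows. The text obtains the trend from STL decomposition; here a centered moving average stands in for it purely as an assumption to keep the example self-contained, and the names `moving_average_trend` and `remove_trend` are hypothetical.

```python
def moving_average_trend(values, window):
    """Centered moving average; the window shrinks at the edges of the series."""
    half = window // 2
    trend = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        trend.append(sum(values[lo:hi]) / (hi - lo))
    return trend

def remove_trend(values, trend, mode="additive"):
    """Additive deletion subtracts the trend; multiplicative deletion divides by it."""
    if mode == "additive":
        return [v - t for v, t in zip(values, trend)]
    if mode == "multiplicative":
        return [v / t for v, t in zip(values, trend)]   # requires a nonzero trend
    raise ValueError("mode must be 'additive' or 'multiplicative'")

kpi = [10.0, 12.0, 11.0, 14.0, 13.0, 16.0, 15.0, 18.0]   # hypothetical rising KPI
trend = moving_average_trend(kpi, window=3)
print(remove_trend(kpi, trend, "additive"))
print(remove_trend(kpi, trend, "multiplicative"))
```

Additive residuals are centered on 0 and multiplicative residuals on 1, which matters when the detrended series is later compared with predictions.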
In a further embodiment, the prediction algorithm comprises: simple statistical modeling prediction, linear regression prediction, and regression tree based prediction.
Simple statistical modeling prediction uses 4 statistical modeling algorithms, WMA, EWMA, ARIMA and Holt-Winters; its input features are mainly the time features, statistical features and time-domain features.
Linear regression prediction uses 3 linear regression algorithms, simple linear regression, Ridge regression and Huber regression, and outputs the predicted KPI time series data; its input features are the time features and statistical features.
Tree-based regression prediction uses 3 tree-based regression algorithms, regression tree, Random Forest and XGBoost, and outputs the predicted KPI time series data; its input features are the time features, statistical features, fitting features and time-domain features.
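As an illustration of the simplest of these predictors, the sketch below produces one-step-ahead EWMA forecasts. The function name `ewma_predict`, the smoothing factor `alpha=0.3` and the choice to reuse the first observation as its own forecast are assumptions for illustration, not details given in the text.

```python
def ewma_predict(values, alpha=0.3):
    """One-step-ahead EWMA forecasts: preds[t] depends only on values[:t]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    preds = [values[0]]      # no history for the first point; reuse it as its forecast
    level = values[0]
    for t in range(1, len(values)):
        preds.append(level)                               # forecast made before seeing values[t]
        level = alpha * values[t] + (1 - alpha) * level   # then update with the new point
    return preds

kpi = [100.0, 102.0, 101.0, 40.0, 100.0]   # hypothetical KPI with one dip
print(ewma_predict(kpi))
```

Because each forecast uses only earlier observations, the dip at index 3 is compared against a prediction that has not yet been pulled down by it.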
In a further embodiment, the anomaly types include: sudden drop anomaly, sudden increase anomaly, short-time drop anomaly, short-time rise anomaly, sustained drop anomaly, sustained rise anomaly, and related change anomaly.
In a further embodiment, the anomaly score is (actual value - predicted value) / predicted value; the closer the anomaly score is to 0, the lower the degree of anomaly of the data, and the further it is from 0, the higher the degree of anomaly.
In a further embodiment, for sudden drop and short-time drop anomalies, a point is marked as anomalous when its anomaly score < u - 3σ (u is the mean and σ is the standard deviation), and for sudden increase and short-time rise anomalies, when its anomaly score > u + 3σ. For a sustained drop anomaly, because the anomaly lasts a long time, the above score formula is used only for the anomalies detected first; once 10 consecutive anomalies have been detected, the anomaly score of the subsequent time series data is (actual value - average predicted value of the previous period) / average predicted value of the previous period, and a point is marked as anomalous when this score < u - 3σ; sustained rise anomalies are handled analogously. For a correlation anomaly, the anomaly score of each KPI value in the KPI pair is calculated separately; denoting the two scores C1 and C2, if either C1 or C2 does not belong to [u - 3σ, u + 3σ], the KPI pair is anomalous and marked True, otherwise it is marked False.
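The score and the 3-sigma test for point anomalies and KPI pairs can be sketched as below. It is assumed here that u and σ are the mean and (population) standard deviation of the anomaly scores themselves, which the text does not state explicitly; all function names are hypothetical.

```python
import statistics

def anomaly_scores(actual, predicted):
    """Score = (actual value - predicted value) / predicted value."""
    return [(a - p) / p for a, p in zip(actual, predicted)]

def three_sigma_flags(scores, direction="both"):
    """Flag scores below u - 3*sigma ('drop'), above u + 3*sigma ('rise'), or either."""
    u = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)      # assumption: population standard deviation
    lower, upper = u - 3 * sigma, u + 3 * sigma
    flags = []
    for s in scores:
        if direction == "drop":
            flags.append(s < lower)
        elif direction == "rise":
            flags.append(s > upper)
        else:
            flags.append(s < lower or s > upper)
    return flags

def pair_is_anomalous(c1, c2, u, sigma):
    """Correlation anomaly: True if either score leaves [u - 3*sigma, u + 3*sigma]."""
    lo, hi = u - 3 * sigma, u + 3 * sigma
    return not (lo <= c1 <= hi and lo <= c2 <= hi)

predicted = [100.0] * 50                   # hypothetical flat predictions
actual = [100.0] * 50
actual[20] = 40.0                          # one injected sudden drop
scores = anomaly_scores(actual, predicted)
drop_flags = three_sigma_flags(scores, "drop")
print([i for i, f in enumerate(drop_flags) if f])
```

With very short series a single outlier inflates σ enough to hide itself, so the thresholds are better estimated over a reasonably long window.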
In a further embodiment, Precision, Recall and F1-Score are calculated from the anomaly monitoring results, wherein:
Figure BDA0003120695040000061
the calculation formulas are as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{F1\text{-}Score} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
where TP is the number of true positives, FP the number of false positives, FN the number of false negatives and TN the number of true negatives; the metrics AUC and PRAUC are calculated from the values of Precision and Recall; each metric lies between 0 and 1, and the higher the metric, the higher the anomaly monitoring accuracy.
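These definitions can be checked with a short sketch that counts the confusion-matrix cells from 0/1 labels. The function names and the convention of returning 0.0 when a denominator is zero are assumptions for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for 0/1 labels, where 1 means anomalous."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision_recall_f1(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0   # assumption: 0.0 when undefined
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

y_true = [1, 1, 0, 0, 1, 0]    # hypothetical labels from anomaly injection
y_pred = [1, 0, 0, 1, 1, 0]    # hypothetical 3-sigma detections
print(confusion_counts(y_true, y_pred))
print(precision_recall_f1(y_true, y_pred))
```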
Beneficial effects: compared with the prior art, the invention has the following advantages:
(1) The method, from collection of the KPI time series data set through the prediction algorithm to the anomaly judgment step, is flexible and extensible: on this basis, different feature information can be extracted, the prediction algorithm can be replaced, different anomalies can be monitored, and models can be trained with different training sets, which can further improve the accuracy of anomaly monitoring.
(2) During data preprocessing, the data is considered comprehensively, including periodicity, trend components and correlation; the method used to manually inject anomalous data is flexible and novel, and the labeled data is random.
(3) During feature engineering, a variety of features are extracted from the data set; different features have different prediction effects, and the features that give better anomaly monitoring can be selected through experiments, which makes the anomaly monitoring method more robust.
(4) During prediction, a variety of prediction algorithms are available, and matching them with different anomaly injection methods and different features makes the prediction result more accurate.
(5) During anomaly monitoring, 7 types of anomaly can be monitored: sudden drop, sudden increase, short-time drop, short-time rise, sustained drop, sustained rise and related change anomalies; a variety of anomaly monitoring conditions can thus be met, which makes the invention robust and practical.
(6) During accuracy evaluation, 5 metrics, Precision, Recall, F1-Score, AUC and PRAUC, are used, and the combination with the best accuracy is selected.
Drawings
FIG. 1 is a basic framework diagram of anomaly monitoring provided by the present invention.
Fig. 2 is a schematic diagram of the algorithm prediction process provided by the present invention.
FIG. 3 is a schematic diagram of the anomaly types provided by the present invention.
Fig. 4 is a schematic view of an anomaly monitoring process provided by the present invention.
Detailed Description
In order to more fully understand the technical contents of the present invention, the technical solutions of the present invention will be further described and illustrated with reference to specific embodiments, but not limited thereto.
Example one
The anomaly monitoring method based on regression prediction provided by the invention is described in the following steps with reference to FIG. 1:
S10: Perform data preprocessing on the collected KPI time series data set. When the KPI time series data set is collected, the periodicity, trend components and correlations between KPI indicators in the data set are analyzed, and anomalous data is manually injected into and labeled in the data set, the manually labeled anomalous data accounting for 2% of the total data set; the anomalies are injected first and then labeled.
The periodicity and trend components of the KPI time series are analyzed using the STL decomposition method to obtain the periodic characteristics and trend components contained in the data set; these are important for feature extraction in the feature engineering and for the parameters of the prediction algorithms.
The correlation between different KPI indicators is judged by calculating the Pearson coefficient between them;
if the Pearson coefficient between two KPI indicators is in the range 0.8 to 1.0, the two indicators are judged to be highly correlated.
S20: performing feature engineering according to a result obtained by data preprocessing, and extracting relevant features from the time series number set of the KPI; wherein the relevant features include: time features, statistical features, fitting features, and time domain features; the time characteristics of the KPI index time series comprise: year, quarter, month, day, week, hour, minute, morning, afternoon, evening, weekend, month end, etc.; statistical characteristics of the time series: actual value, maximum value, minimum value, value range, mean, median, variance, kurtosis, same ratio and ring ratio; fitting characteristics of time series: MA, WMA, EWMA, Holt-Winter, AR, ARIMA; time domain characteristics of the time series: autocorrelation coefficients, partial autocorrelation coefficients, differences, trends, periods, noise.
The relevant features are extracted comprehensively for use in subsequent different prediction algorithms, and whether to delete trend components by multiplication or addition can be selected.
S30: the extracted feature data set is input into a prediction algorithm, and simple statistical modeling can be adopted: WMA, EWMA, ARIMA, Holt-Winters; linear regression: simple linear regression, Huber regression, Ridge regression; tree-based regression: predicting the regression tree, Random Forest and XGboost to obtain a predicted value corresponding to each KPI index time sequence;
s40: calculating an abnormal degree score for an abnormal predicted value output by a prediction algorithm and actual KPI time sequence data, performing abnormal judgment according to a 3-sigma criterion, and calculating abnormal degree scores in different modes for different abnormalities; wherein the exception types include: a sudden drop anomaly, a sudden increase anomaly, a short drop anomaly, a short rise anomaly, a sustained drop anomaly, a sustained rise anomaly, and a related change anomaly.
S50: and (3) evaluating the Precision of the result of the abnormity judgment, and evaluating the Precision of the algorithm by calculating Precision, Recall, F1-Score, AUC and PRAUC, so as to select the optimal abnormity monitoring method combination and monitor the KPI index multivariate abnormity according to the selected optimal abnormity monitoring method combination.
How the feature engineering and the prediction algorithm are selected according to the result of data preprocessing is further described with reference to FIG. 2:
Embodiment of the feature engineering:
S201: Different features are extracted for different prediction algorithms and different anomaly types. The features extracted are the time features: year, quarter, month, day, week, hour, minute, morning, afternoon, evening, weekend, month end, etc.; the statistical features: actual value, maximum, minimum, value range, mean, median, variance, kurtosis, year-over-year ratio and period-over-period ratio; the fitting features: MA, WMA, EWMA, Holt-Winters, AR, ARIMA; and the time-domain features: autocorrelation coefficients, partial autocorrelation coefficients, differences, trend, period, noise. For example, in the experiments, the day and the hour were selected as feature inputs for the Random Forest algorithm; for the ARIMA model, the autocorrelation coefficients, partial autocorrelation coefficients, MA, AR, first-order difference and the like were selected as feature inputs.
S202: The KPI trend component in the data set is optionally deleted; the trend component affects the monitoring results for different anomaly types to different degrees.
The KPI trend component in the data set can be deleted in two ways: additive deletion, in which the trend is subtracted from the actual data, and multiplicative deletion, in which the actual data is divided by the trend. When predicting the time series, it is determined whether the data has a trend or seasonality, and the trend and seasonality are removed to bring the data to a stationary state. Because the trend component affects the accuracy of anomaly monitoring, and the random variation in a stationary series is greatly reduced, this facilitates prediction; making the series stationary is an important step in time series analysis. In the experiments, however, deleting the trend sometimes helped and sometimes hurt, so whether to delete the trend can be chosen before prediction.
Embodiment of the selection of prediction algorithms:
The prediction algorithms mentioned in S30 include: simple statistical modeling prediction, linear regression prediction, and regression-tree-based prediction.
Simple statistical modeling prediction uses 4 statistical modeling algorithms, WMA, EWMA, ARIMA and Holt-Winters; its input features are mainly the time features, statistical features and time-domain features.
Linear regression prediction uses 3 linear regression algorithms, simple linear regression, Ridge regression and Huber regression, and outputs the predicted KPI time series data; its input features are the time features and statistical features.
Tree-based regression prediction uses 3 tree-based regression algorithms, regression tree, Random Forest and XGBoost, and outputs the predicted KPI time series data; its input features are the time features, statistical features, fitting features and time-domain features.
Simple statistical modeling prediction, linear regression prediction and regression-tree-based prediction are selected and combined according to the anomaly types of the service scenario and the monitoring model:
in the embodiment of the invention, tree-based regression after trend deletion gives the best prediction, with AUC and PRAUC averaging above 0.9 over multiple experiments.
The anomaly monitoring and anomaly judgment in S40 are further described with reference to FIG. 3 and FIG. 4:
S401: An anomaly score is calculated from the actual data and the predicted data: anomaly score = (actual value - predicted value) / predicted value; the closer the anomaly score is to 0, the lower the degree of anomaly of the data, and the further it is from 0, the higher the degree of anomaly.
S402: When the 3-sigma criterion is applied to the anomaly scores, different monitoring methods are used for different anomaly types; the graphs in FIG. 3 visually compare the differences between a sudden drop anomaly, a sudden increase anomaly, a short-time drop anomaly, a short-time rise anomaly, a sustained drop anomaly, a sustained rise anomaly, and a related change anomaly.
Accordingly, for sudden drop and short-time drop anomalies, a point is marked as anomalous when its anomaly score < u - 3σ (u is the mean and σ is the standard deviation), and for sudden increase and short-time rise anomalies, when its anomaly score > u + 3σ. For a sustained drop anomaly, the above score formula is used only for the anomalies detected first; once 10 consecutive anomalies have been detected, the anomaly score of the subsequent time series data is (actual value - average predicted value of the previous period) / average predicted value of the previous period, and a point is marked as anomalous when this score < u - 3σ; sustained rise anomalies are handled analogously. For a correlation anomaly, the anomaly score of each KPI value in the KPI pair is calculated separately; denoting the two scores C1 and C2, if either C1 or C2 does not belong to [u - 3σ, u + 3σ], the KPI pair is anomalous and marked True, otherwise it is marked False.
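The sustained drop rule can be sketched as follows, under one possible reading of the text: the ordinary score is used until 10 consecutive anomalies have been flagged, after which each point is scored against the mean prediction of the period that preceded the run. The function name `sustained_drop_flags`, the precomputed threshold `lower` (standing for u - 3σ) and the choice of baseline window are assumptions for illustration.

```python
def sustained_drop_flags(actual, predicted, period, lower, run_length=10):
    """Flag drops; after `run_length` consecutive flags, score against a fixed baseline."""
    flags = []
    run = 0            # number of consecutive flagged points so far
    baseline = None    # mean prediction over the period preceding the current run
    for t, (a, p) in enumerate(zip(actual, predicted)):
        # Switch to the baseline once the run is long enough.
        ref = baseline if (run >= run_length and baseline is not None) else p
        flagged = (a - ref) / ref < lower
        if flagged:
            if run == 0:
                window = predicted[max(0, t - period):t]
                baseline = sum(window) / len(window) if window else None
            run += 1
        else:
            run, baseline = 0, None
        flags.append(flagged)
    return flags

# Hypothetical data: the KPI halves at t = 20, and the predictor adapts to the
# lower level from t = 30 onward.
actual = [100.0] * 20 + [50.0] * 20
predicted = [100.0] * 30 + [50.0] * 10
flags = sustained_drop_flags(actual, predicted, period=5, lower=-0.3)
naive = [(a - p) / p < -0.3 for a, p in zip(actual, predicted)]
print(sum(flags), sum(naive))
```

In this example the naive score stops flagging once the predictions have drifted down to the anomalous level, while the baseline-based score keeps flagging the whole sustained segment.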
In addition, the abnormal data mode of manually injecting and marking out the data set comprises the following steps: random point labeling, random continuous segment labeling and rule labeling; selecting a multivariate abnormal data mode of manually injecting and marking out a data set according to the abnormal type of KPI index values in the original time series data set in the upper period;
marking the random point as randomly selecting a KPI index value in the time sequence data set to perform abnormal injection;
marking the random section as randomly selecting at least 3 continuous KPI index values in the time sequence data set to perform abnormal injection; if the KPI index value is abnormal in descending category, randomly reducing the KPI index value according to percentage by using uniform distribution, wherein the reduction proportion accounts for 30-100% and/or 50-100% of the actual value of the KPI index; if the KPI index value is abnormal in the ascending class, the KPI index value is randomly increased according to uniform distribution, and the injection percentage of the abnormality is consistent with that of the abnormity in the descending class; for the relevant change abnormality, one of the two KPI indexes is randomly increased or decreased according to uniform distribution, and the injection percentage of the abnormality is consistent with that of the descending type abnormality;
and rule labeling applies threshold rules that compare the original KPI index value with any KPI index value at the same time in the past 2 periods, the thresholds being 20% and 180% of that value (for example, the original KPI index value being lower than 20% of it).
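The random point and random continuous segment injection described above can be sketched as follows. This is a hedged illustration only: the function names, the `max_len` cap on segment length and the use of a seeded `random.Random` instance are assumptions made for the example, and rule labeling is not covered.

```python
import random


def inject_anomaly(values, start, length, rng, low=0.30, high=1.00, rise=False):
    """Scale `length` consecutive values starting at `start`.

    Each value is reduced (or increased when rise=True) by a fraction of its
    actual value drawn uniformly from [low, high]. Returns the modified copy
    and a parallel list of True/False anomaly labels.
    """
    out = list(values)
    labels = [False] * len(values)
    for i in range(start, start + length):
        frac = rng.uniform(low, high)
        out[i] = values[i] * (1.0 + frac if rise else 1.0 - frac)
        labels[i] = True
    return out, labels


def inject_random_point(values, rng, **kwargs):
    """Random point labeling: inject into one randomly chosen KPI value."""
    idx = rng.randrange(len(values))
    return inject_anomaly(values, idx, 1, rng, **kwargs)


def inject_random_segment(values, rng, min_len=3, max_len=10, **kwargs):
    """Random continuous segment labeling: at least `min_len` consecutive values.

    `max_len` is a hypothetical cap that is not given in the text.
    """
    length = rng.randint(min_len, min(max_len, len(values)))
    start = rng.randrange(len(values) - length + 1)
    return inject_anomaly(values, start, length, rng, **kwargs)
```

For example, `inject_random_segment(series, random.Random(0), low=0.5)` would inject a descent-type segment using the 50-100% range instead of the default 30-100%.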
Precision, Recall and F1-Score are calculated from the anomaly monitoring results, where:
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}$$
and F1-Score is calculated as:
$$\mathrm{F1\text{-}Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
where TP denotes true positives, FP denotes false positives, FN denotes false negatives and TN denotes true negatives. The metrics AUC and PRAUC are calculated from the Precision and Recall values; each metric lies between 0 and 1, and the higher the metric, the higher the anomaly monitoring accuracy.
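The standard definitions of these metrics can be sketched in Python as below. The function names are hypothetical, returning 0.0 when a denominator is zero is a convention chosen for this example, and AUC and PRAUC are left out because they require a sweep over score thresholds.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN from parallel lists of True/False labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    """Precision = TP/(TP+FP), Recall = TP/(TP+FN), F1 = 2PR/(P+R)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Here `y_true` holds the injected anomaly labels and `y_pred` holds the True/False marks produced by the monitoring step.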
The different functions discussed herein may be performed in a different order and/or concurrently with each other. Further, if desired, one or more of the functions described above may be optional or may be combined.
The second embodiment:
Assume again that the given training set is a time series of a KPI index, comprising timestamps and KPI index values.
S1: Preprocess the data. The data set is analyzed with the STL decomposition method, which shows a periodicity of one week, a day-night variation and a rising trend. Descent anomalies are then manually injected into the data, namely single-point anomalies, continuous anomalies and rule anomalies, with a reduction ratio of 30%-100% for the injected anomalies.
S2: Perform feature engineering on the preprocessed data obtained in S1. Because the period is one week and the day-night variation is obvious, week, day and hour are extracted as time features; statistical features, fitting features and time-domain features are also extracted, and the trend term is removed in the multiplicative manner.
S3: Perform algorithm prediction on the features obtained in S2. Random Forest is selected as the prediction algorithm, two experiments are performed, with and without trend removal respectively, and the predicted values are output.
S4: Perform anomaly monitoring on the predicted values obtained in S3. The abnormality degree score is calculated according to S401, and sudden-drop, short-time-drop and continuous-drop anomaly monitoring are then performed respectively using the 3-sigma criterion, with abnormal values marked True.
S5: By calculating Precision, Recall and F1-Score, the values of PRAUC and AUC can be derived. The results obtained by the above example procedure were:
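[Results table: reproduced as images (Figures BDA0003120695040000141 and BDA0003120695040000151) in the original publication; the values are not recoverable from the text.]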
It can be seen that, in this experiment, removing the trend makes the time series data stationary, which slightly improves the accuracy of anomaly monitoring.
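The time features and trend removal used in step S2 can be illustrated with the short Python sketch below. The function names are hypothetical, "week" is interpreted here as the day of the week (an assumption), and the trend series is taken as a given input, for instance the trend component produced by an STL decomposition, rather than computed in the sketch.

```python
from datetime import datetime


def time_features(ts):
    """Time features of one timestamp: day of week (Monday=0), day of month, hour."""
    return {"weekday": ts.weekday(), "day": ts.day, "hour": ts.hour}


def remove_trend(values, trend, mode="multiplicative"):
    """Remove a trend component from a series.

    "additive":       actual data minus trend.
    "multiplicative": actual data divided by trend.
    """
    if len(values) != len(trend):
        raise ValueError("values and trend must have the same length")
    if mode == "additive":
        return [v - t for v, t in zip(values, trend)]
    if mode == "multiplicative":
        return [v / t for v, t in zip(values, trend)]
    raise ValueError(f"unknown mode: {mode!r}")
```

The two experiments in step S3 would then correspond to training the predictor once on the raw values and once on the output of `remove_trend`.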
The steps discussed in the present invention are not limited to the order of execution in the embodiments, and different steps may be executed in different orders and/or concurrently with each other. Further, in other embodiments, one or more of the steps described above may be optional or may be combined.
When preprocessing the data, the method considers the data comprehensively, including periodicity, trend components, correlation and the like; the methods used to manually inject abnormal data are flexible and novel, and the labeled data is random.
When performing feature engineering, the method extracts multiple kinds of features from the data set. Different features have different prediction effects, and the feature information with the better anomaly monitoring effect can be selected through experiments, which makes the anomaly monitoring method more robust.
When performing algorithm prediction, the method offers a variety of prediction algorithms, and combining them with different anomaly injection methods and different features makes the prediction results more accurate.
When monitoring anomalies, the method can monitor seven types of anomaly: sudden drop, sudden increase, short-time drop, short-time rise, continuous drop, continuous rise and correlated change. It can therefore cover a variety of anomaly monitoring scenarios, and is highly flexible and widely applicable.
When evaluating accuracy, the method uses five metrics, namely Precision, Recall, F1-Score, AUC and PRAUC, so as to select the combination with the best accuracy.
Therefore, given the collected KPI time series data set, the prediction algorithms and the anomaly judgment steps, the method has good flexibility and extensibility: on this basis, different feature information can be extracted, various prediction algorithms can be swapped in, different anomalies can be monitored, and different training sets can be used to train models, so that the accuracy of anomaly monitoring can be further improved.
The present invention has been described above by way of illustration in the drawings, and it will be understood by those skilled in the art that the present disclosure is not limited to the embodiments described above, and various changes, modifications and substitutions may be made without departing from the scope of the present invention.

Claims (8)

1. A KPI index multivariate abnormality monitoring method, characterized by comprising the following steps:
performing data preprocessing on the collected KPI time series data set;
performing feature engineering on the preprocessed KPI time series data set and extracting relevant features from it; the relevant features include: time features of the time series, statistical features of the time series, fitting features of the time series, and time-domain features of the time series;
inputting the extracted feature data set into all prediction algorithm classes for computation to obtain a plurality of abnormality prediction values corresponding to each KPI time series;
calculating a plurality of abnormality degree scores from the actual data and the plurality of abnormality prediction values;
performing abnormality monitoring on the abnormality degree scores using the 3-sigma criterion and determining the abnormality type;
performing precision evaluation on each of the plurality of abnormality degree scores and screening out the highest score, thereby selecting the optimal combination of abnormality monitoring methods;
monitoring KPI index multivariate abnormalities according to the selected optimal combination of abnormality monitoring methods;
wherein performing precision evaluation on each of the plurality of abnormality degree scores comprises calculating Precision and Recall from the plurality of abnormality degree scores, and F1-Score is calculated as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{F1\text{-}Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
wherein TP denotes true positives, FP denotes false positives, and FN denotes false negatives; the metrics AUC and PRAUC are calculated from the Precision and Recall values, each metric is in the range 0-1, and the abnormality monitoring accuracy increases with the metric;
wherein the abnormality degree score is (actual value - predicted value) / predicted value.
2. A KPI index multivariate abnormality monitoring method according to claim 1, wherein in the data preprocessing of the collected KPI time series data set, the periodicity, the trend components and the correlation between KPI indexes in the data set are analyzed, and all abnormality types are simulated by manually injecting and labeling multivariate abnormal data in the data set.
3. A KPI indicator multivariate abnormality monitoring method according to claim 2, characterized in that the method of analyzing the data set is as follows:
analyzing periodicity and trend components in a KPI time sequence by adopting an STL decomposition method to obtain periodicity characteristics and trend components contained in a data set;
judging the correlation existing between different KPI indexes by calculating the Pearson coefficient between different KPI indexes;
and if the calculated Pearson coefficient between two KPI indexes is in the range of 0.8-1.0, judging that the two KPI indexes are highly correlated.
4. A KPI index multivariate abnormality monitoring method according to claim 2, characterized in that the ways of simulating all abnormality types by manually injecting and labeling multivariate abnormal data in the data set comprise: random point labeling, random continuous segment labeling and rule labeling, wherein abnormalities are manually injected into and labeled in the data set so that the data set contains single-point abnormalities, transient abnormalities and continuous abnormalities, and the abnormal data is limited to no more than 2% of the total data.
5. A KPI index multivariate abnormality monitoring method according to claim 1, characterized in that
the method for extracting relevant features from the KPI index data comprises the following steps:
extracting different features for different prediction algorithms and different abnormality types;
and, after the feature data set is extracted, selectively removing the KPI index trend components from the data set.
6. A KPI multivariate abnormality monitoring method according to claim 5, wherein said removing KPI trend components in the data set comprises two ways: addition deletion and multiplication deletion; the addition deletion is actual data minus trend, and the multiplication deletion is actual data divided by trend.
7. A KPI multivariate abnormality monitoring method according to claim 1, wherein the type of prediction algorithm comprises: statistical modeling prediction class, linear regression prediction class, and regression tree based prediction class.
8. A KPI indicator multivariate abnormality monitoring method according to claim 1, characterized in that said abnormality types comprise: a sudden drop anomaly, a sudden increase anomaly, a short drop anomaly, a short rise anomaly, a sustained drop anomaly, a sustained rise anomaly, and a related change anomaly.
CN202110676313.9A 2021-06-18 2021-06-18 KPI index multivariate anomaly monitoring method based on regression prediction Active CN113419934B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110676313.9A CN113419934B (en) 2021-06-18 2021-06-18 KPI index multivariate anomaly monitoring method based on regression prediction

Publications (2)

Publication Number Publication Date
CN113419934A CN113419934A (en) 2021-09-21
CN113419934B true CN113419934B (en) 2022-07-08

Family

ID=77789040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110676313.9A Active CN113419934B (en) 2021-06-18 2021-06-18 KPI index multivariate anomaly monitoring method based on regression prediction

Country Status (1)

Country Link
CN (1) CN113419934B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108197845A (en) * 2018-02-28 2018-06-22 四川新网银行股份有限公司 A kind of monitoring method of the transaction Indexes Abnormality based on deep learning model LSTM
CN111858231A (en) * 2020-05-11 2020-10-30 北京必示科技有限公司 Single index abnormality detection method based on operation and maintenance monitoring

Also Published As

Publication number Publication date
CN113419934A (en) 2021-09-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant