CN112380044A - Data anomaly detection method and device, computer equipment and storage medium - Google Patents

Publication number
CN112380044A
Authority
CN
China
Prior art keywords
data
residual error
time
slope
curve
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011407049.0A
Other languages
Chinese (zh)
Inventor
杨浩
黄宇
周利
吕越
冯理
张粤峰
张云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202011407049.0A priority Critical patent/CN112380044A/en
Publication of CN112380044A publication Critical patent/CN112380044A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703: Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751: Error or fault detection not based on redundancy

Abstract

The application relates to a data anomaly detection method and apparatus, a computer device, and a storage medium, which can be applied to a cloud server. The method comprises: acquiring time series data to be detected; performing time series decomposition on the time series data to be detected to obtain corresponding seasonal data and a residual term; calculating a corresponding curve slope from the seasonal data; regularizing the residual term based on the curve slope to obtain a regularized residual of the time series data to be detected; and determining a data anomaly detection result according to a preset fluctuation interval, using the regularized residual as a statistic. Because the curve slope of the seasonal data obtained by time series decomposition is used to adjust the residual before statistical analysis, the influence of normal abrupt changes caused by system tasks on data anomaly detection can be reduced, which reduces misjudgment.

Description

Data anomaly detection method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data anomaly detection method and apparatus, a computer device, and a storage medium.
Background
Because of its wide application, anomaly detection has long been a research hotspot in the industry. Conventional anomaly detection is generally model-based: a model is built from a probability distribution function or a fitting function that matches the data distribution characteristics of the object under study. An observation of the object is considered an outlier if it does not fit the model well. Common methods include 3-sigma, linear regression, multiple regression, autoregression, and time series decomposition.
However, in practical applications, time series data may also change abruptly because of specific tasks executed by the system. Such abrupt changes result from normal system operation, yet conventional methods tend to misjudge them as anomalies.
Disclosure of Invention
In view of the above, it is desirable to provide a data anomaly detection method, apparatus, computer device, and storage medium capable of reducing erroneous determination.
A method of data anomaly detection, the method comprising:
acquiring time sequence data to be detected;
performing time sequence decomposition on the time sequence data to be detected to obtain corresponding seasonal data and residual error items;
calculating a corresponding curve slope according to the seasonal data;
regularizing the residual error item based on the slope of the curve to obtain a regularized residual error of the time series data to be detected;
and determining a data abnormity detection result according to a preset fluctuation interval by taking the regularized residual error as a statistic.
An apparatus for data anomaly detection, the apparatus comprising:
the data acquisition module is used for acquiring the time sequence data to be detected;
the time sequence decomposition module is used for carrying out time sequence decomposition on the time sequence data to be detected to obtain corresponding seasonal data and residual error items;
the slope calculation module is used for calculating the corresponding curve slope according to the seasonal data;
the residual error regularization module is used for regularizing the residual error items based on the curve slope to obtain a regularized residual error of the time series data to be detected;
and the anomaly detection module is used for determining a data anomaly detection result according to a preset fluctuation interval by taking the regularized residual error as a statistic.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring time sequence data to be detected;
performing time sequence decomposition on the time sequence data to be detected to obtain corresponding seasonal data and residual error items;
calculating a corresponding curve slope according to the seasonal data;
regularizing the residual error item based on the slope of the curve to obtain a regularized residual error of the time series data to be detected;
and determining a data abnormity detection result according to a preset fluctuation interval by taking the regularized residual error as a statistic.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring time sequence data to be detected;
performing time sequence decomposition on the time sequence data to be detected to obtain corresponding seasonal data and residual error items;
calculating a corresponding curve slope according to the seasonal data;
regularizing the residual error item based on the slope of the curve to obtain a regularized residual error of the time series data to be detected;
and determining a data abnormity detection result according to a preset fluctuation interval by taking the regularized residual error as a statistic.
According to the data anomaly detection method and apparatus, the computer device, and the storage medium described above, time series decomposition is first performed on the acquired time series data to be detected to obtain corresponding seasonal data and a residual term; a curve slope is calculated from the seasonal data; the residual term is adjusted based on the curve slope to obtain a regularized residual; and, using the regularized residual as a statistic, the data anomaly detection result of the time series data to be detected is determined according to the preset fluctuation interval. Because the curve slope of the seasonal data is introduced to regularize the residual before statistical analysis, the influence of normal abrupt changes caused by system tasks on data anomaly detection can be reduced, which reduces misjudgment.
Drawings
FIG. 1 is a flow diagram illustrating a method for data anomaly detection in one embodiment;
FIG. 2(a) is a diagram illustrating timing data in one embodiment;
FIG. 2(b) is a diagram illustrating trend data obtained by performing a time-series decomposition on time-series data according to an embodiment;
FIG. 2(c) is a diagram illustrating seasonal data resulting from time-series decomposition of time-series data in an exemplary embodiment;
FIG. 2(d) is a diagram illustrating calculation of corresponding curve slopes based on seasonal data in one embodiment;
FIG. 2(e) is a diagram illustrating a residual term obtained by performing a time-series decomposition on time-series data according to an embodiment;
FIG. 2(f) is a diagram of regularized residual terms in one embodiment;
FIG. 3 is a flow diagram illustrating regularization of the residual term based on the curve slope to obtain the regularized residual of the time series data to be detected in one embodiment;
FIG. 4 is a flow diagram illustrating a process for scaling factor determination in one embodiment;
FIG. 5 is a flow diagram illustrating a method for data anomaly detection in an exemplary embodiment;
FIG. 6(a) is a diagram illustrating an exemplary embodiment of anomaly detection using residual terms as statistics;
FIG. 6(b) is a diagram illustrating anomaly detection using regularized residual terms as statistics in an exemplary embodiment;
FIG. 7 is a block diagram showing the structure of a data abnormality detecting apparatus according to an embodiment;
FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Large internet companies provide a large number of applications and web services to the outside through servers. However, service failures are difficult to avoid entirely and have many causes, such as network delays, server failures, and malicious network attacks. To keep the company competitive, maintenance personnel expend a great deal of effort to ensure that these services function properly. In general, when such service failures occur, some business-related metrics, typically time series data, show abnormal fluctuations, such as a sudden increase in the number of business failures, a sudden drop in the success rate, or a sudden increase in response delay.
Meanwhile, in some cases the business-related metrics also rise or fall sharply for normal reasons, such as scheduled tasks and stress tests. Therefore, a metric's time series curve may show not only abnormal fluctuations but also normal sudden increases or decreases. Correctly distinguishing normal data from abnormal data is of great significance in actual production.
The present application provides a data anomaly detection method, as shown in fig. 1, this embodiment is illustrated by applying the method to a terminal, and it can be understood that the method can also be applied to a server, and can also be applied to a system including a terminal and a server, and is implemented by interaction between the terminal and the server. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, big data and artificial intelligence platform. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
Cloud computing is a computing mode that distributes computing tasks over a resource pool formed by a large number of computers, so that application systems can acquire computing power, storage space, and information services as needed. The network that provides the resources is referred to as the "cloud". To the user, resources in the "cloud" appear infinitely expandable: they can be acquired at any time, used on demand, expanded at any time, and paid for by usage.
As a basic capability provider of cloud computing, a cloud computing resource pool (referred to as an IaaS (Infrastructure as a Service) platform for short) is established, and multiple types of virtual resources are deployed in the resource pool for external clients to select and use.
According to the logic function division, a PaaS (Platform as a Service) layer can be deployed on an IaaS (Infrastructure as a Service) layer, a SaaS (Software as a Service) layer is deployed on the PaaS layer, and the SaaS can be directly deployed on the IaaS. PaaS is a platform on which software runs, such as a database, a web container, etc. SaaS is a variety of business software, such as web portal, sms, and mass texting. Generally speaking, SaaS and PaaS are upper layers relative to IaaS.
The data anomaly detection method provided by the application can be a cloud computing service provided by a cloud server.
In the present embodiment, the method includes steps S110 to S150.
Step S110, acquiring time sequence data to be detected.
Time series data is data recorded in chronological order for the same index. Many kinds of everyday data are time series data: the values of a quantity at different times of a day, such as a stock price, the CPU usage of a notebook computer, or the indoor temperature, form time series data. In one embodiment, the time series data to be detected includes data such as the business failure count, success rate, and response delay recorded when applications and network services are provided over the internet.
In one embodiment, the time series data to be detected is seasonal. Three parts, a trend part, a seasonal part and an irregular part (residual terms) are included in a seasonal time series.
Further, in an embodiment, the acquiring of the time series data to be detected includes acquiring the time series data to be detected within a preset historical time period. The preset historical time period may be set according to a seasonal period, for example, the time sequence data is seasonally changed time sequence data, the seasonal period is 1 day, the preset historical time period may be set to 8 days of history or 15 days of history, and so on.
And step S120, performing time sequence decomposition on the time sequence data to be detected to obtain corresponding seasonal data and residual error items.
The time sequence decomposition of the seasonal time sequence data is to decompose the time sequence data into a trend part, a seasonal part and an irregular part, and then estimate; in this embodiment, three results of time-series decomposition of time-series data are respectively recorded as trend data, seasonal data, and residual terms; fig. 2(a) shows time series data in an embodiment, fig. 2(b) shows trend data obtained by performing time series decomposition on the time series data in an embodiment, fig. 2(c) shows seasonal data obtained by performing time series decomposition on the time series data in an embodiment, and fig. 2(e) shows residual terms obtained by performing time series decomposition on the time series data in an embodiment. Among them, the data shown in fig. 2(a) to 2(f) each correspond to data in one cycle. In this embodiment, the time series data is subjected to anomaly detection, and seasonal data and residual items in the time series decomposition result are required to be used.
It should be noted that the time series data to be detected spans a plurality of cycles, whereas the seasonal data obtained by time series decomposition corresponds to one cycle. In this embodiment, after the time series decomposition, the method further includes: periodically expanding the obtained seasonal data to obtain seasonal data over the time interval corresponding to the time series data to be detected. The residual term and the trend data cover the same time interval; in one embodiment, this interval is one cycle shorter than that of the time series data to be detected. For example, if the time series data to be detected covers 8 days and its period is 1 day, the time interval corresponding to the residual term and the trend data is 7 days.
In one embodiment, the method for time series decomposition of the time series data is classical decomposition. Classical decomposition assumes that the periodic component is the same in every period. The algorithm has two forms, additive and multiplicative. The additive model is $y_t = S_t + T_t + R_t$ and the multiplicative model is $y_t = S_t \times T_t \times R_t$. The two models can be converted into each other: taking logarithms of both sides of the multiplicative model yields a corresponding additive model. In the models, $S_t$, $T_t$, and $R_t$ correspond to the seasonal, trend, and residual parts, respectively. $S_t$ is a periodic function with period $P$, satisfying $S_t = S_{t+P}$. $T_t$ reflects the long-term trend of the index data. $R_t$ is the remaining part; when there are many data points, $R_t$ can be considered to approximately satisfy a standard normal distribution.
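As an illustration of the additive model, the following is a minimal sketch of a textbook classical decomposition using only the Python standard library. The function name `classical_decompose` and the use of a centered moving average for the trend are assumptions made for this sketch; the application does not prescribe a particular implementation.

```python
from statistics import mean

def classical_decompose(y, period):
    """Additive classical decomposition sketch: y[t] = S[t] + T[t] + R[t].

    Assumes len(y) >= 2 * period. Trend and residual are None for the
    first and last period // 2 points, where the centered window does not fit.
    """
    n = len(y)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: period + 1 points, half weight on the two end points
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
        else:
            # Odd period: plain centered average over exactly one period
            trend[t] = mean(window)
    # Average the detrended values at each phase of the cycle
    phase_means = []
    for p in range(period):
        detrended = [y[t] - trend[t] for t in range(p, n, period) if trend[t] is not None]
        phase_means.append(mean(detrended))
    # Center the seasonal component so that it sums to zero over one period
    offset = mean(phase_means)
    one_period = [m - offset for m in phase_means]
    seasonal = [one_period[t % period] for t in range(n)]
    residual = [None if trend[t] is None else y[t] - trend[t] - seasonal[t]
                for t in range(n)]
    return trend, seasonal, residual
```

In this sketch the trend and residual lose half a period at each end, which matches the statement that their time interval is one cycle shorter than that of the input data.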
In another embodiment, the time sequence decomposition of the time sequence data can also adopt the following method: x11 decomposition (X11 decomposition), SEATS decomposition (SEATS decomposition), STL decomposition (STL decomposition), and the like.
Step S130, calculating a corresponding slope of the curve according to the seasonal data.
After time series decomposition is performed on the time series data, the obtained seasonal data is usually a curve whose time interval is one period. If the time series data to be detected is seasonal time series data, the seasonal data can then be periodically expanded to cover the same time interval as the residual term.
In one embodiment, calculating a corresponding curve slope from seasonal data includes: carrying out periodic expansion on the seasonal data to obtain expanded seasonal data; and calculating the corresponding curve slope of the expanded seasonal data by adopting a local weighted regression algorithm.
In one embodiment, periodically expanding the seasonal data includes: and carrying out periodic translation on a time coordinate axis on the basis of seasonal data in a period to obtain the expanded seasonal data.
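The periodic translation described above can be sketched as tiling one period of seasonal values along the time axis. The function name `extend_seasonal` and the list-based representation are assumptions made for illustration.

```python
def extend_seasonal(seasonal_one_period, target_length):
    """Repeat one period of seasonal values until target_length points are covered."""
    period = len(seasonal_one_period)
    if period == 0:
        raise ValueError("seasonal_one_period must not be empty")
    # Index t maps to phase t % period, which is a shift by whole periods
    return [seasonal_one_period[t % period] for t in range(target_length)]
```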
The basic idea of the local weighted regression algorithm (Lowess) is to perform point-by-point local linear fitting on time series data, and meanwhile, in order to further reflect the local characteristics of the data, different points give different weights according to the distance between the points and a target point when fitting is performed, and usually, the weight of a point closer to the target point is higher, and the weight of a point farther from the target point is lower. For each target point, on one hand, the corresponding smoothed value may be calculated using the fitted straight line, and on the other hand, the slope of the fitted straight line reflects the rate of change of the index data at that time. In addition, the local weighted regression algorithm relies on a training data set to obtain a new parameter value every time a new sample is predicted, namely, the local weighted regression algorithm is a non-parametric learning method.
In one embodiment, calculating the curve slope of the curve in the expanded seasonal data includes the following steps: for any time t on the expanded seasonal data curve, performing weighted regression on the values within a preset time window to obtain the curve slope corresponding to time t; and calculating the corresponding curve slope in this way for each time in the time interval of the seasonal data, thereby obtaining the curve slope of the curve in the seasonal data. The preset time window can be set according to actual conditions. Fig. 2(d) illustrates the calculation of the corresponding curve slope based on seasonal data in one embodiment.
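A minimal sketch of the per-point weighted regression follows. It fits a weighted straight line in a window centered on each index and returns the fitted slope. The tricube weight function, the symmetric window, and the name `local_slopes` are assumptions for illustration; the application only requires that closer points receive higher weights.

```python
def local_slopes(values, half_window):
    """Slope of a locally weighted linear fit at every index of `values`."""
    n = len(values)
    slopes = []
    for t in range(n):
        lo, hi = max(0, t - half_window), min(n, t + half_window + 1)
        xs = list(range(lo, hi))
        # Largest distance in the window plus one, so every point keeps a positive weight
        d_max = max(t - lo, hi - 1 - t) + 1
        # Tricube kernel: weight decays with distance from the target index t
        ws = [(1 - (abs(x - t) / d_max) ** 3) ** 3 for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * values[x] for w, x in zip(ws, xs)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (values[x] - my) for w, x in zip(ws, xs))
        # Weighted least-squares slope; 0.0 when the window holds a single point
        slopes.append(sxy / sxx if sxx > 0 else 0.0)
    return slopes
```

For exactly linear data the weighted fit recovers the true slope at every index, which makes a convenient sanity check.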
And step S140, regularizing the residual error item based on the slope of the curve to obtain a regularized residual error of the time series data to be detected.
The specific process of regularizing the residual term based on the curve slope to obtain the regularized residual is described in detail in the following embodiments and is not repeated here. Fig. 2(f) shows the regularized residual in one embodiment.
And S150, determining a data abnormity detection result according to a preset fluctuation interval by using the regularized residual error as a statistic.
Statistics are variables used in statistical theory to analyze and test data. In the present embodiment, the regularized residual is analyzed as the statistic for anomaly detection. In one embodiment, the data points in the time series data to be detected whose regularized residuals fall outside the preset fluctuation interval are determined to be anomalous points, which gives the data anomaly detection result.
In the anomaly decision process, the choice of the normal fluctuation interval directly determines the effect of anomaly detection. In this embodiment, the preset fluctuation interval may be selected according to actual conditions. In one embodiment, the mean μ and standard deviation σ may be calculated from historical regularized residual data, and a reasonable fluctuation interval may then be calculated with the statistical 3-sigma method: [μ-3σ, μ+3σ], where μ represents the mean of the regularized residuals under normal conditions and σ represents their standard deviation.
In another embodiment, the median and the Median Absolute Deviation (MAD) may also be calculated from historical regularized residual data, and the fluctuation interval may be calculated based on the median and the MAD. Compared with the mean and standard deviation, the median and MAD are more robust to outliers in the data set. In other embodiments, the preset fluctuation interval may be determined in other manners.
According to the data anomaly detection method, time sequence decomposition is firstly carried out on the acquired time sequence data to be detected to obtain corresponding seasonal data and residual error items, curve slopes are calculated according to the seasonal data, the residual error items are adjusted based on the curve slopes to obtain normalized residual errors, and the normalized residual errors are used as statistics to determine data anomaly detection results of the time sequence data to be detected according to a preset fluctuation interval. According to the method, the curve slope is calculated for the seasonal data obtained through time sequence decomposition, the curve slope of the seasonal data is introduced to adjust the residual error, then statistical analysis is carried out, the influence of normal mutation brought by system tasks on data anomaly detection can be reduced, and misjudgment is reduced.
In one embodiment, as shown in fig. 3, the residual term is normalized based on the slope of the curve to obtain a normalized residual of the time series data to be detected, which includes step S141 and step S142.
Step S141, reading the scaling factor, the slope of the curve at time t, and the residual error term at time t.
Wherein, the scaling factor can be set according to the actual situation; the scaling factor may be set to a constant, for example, based on empirical values, or based on the slope of the curve of the seasonal data; in one embodiment, the scaling factor is a positive integer.
In one embodiment, as shown in fig. 4, the determination of the scaling factor includes steps S410 to S430.
Step S410, reading the slope of the curve at each time in the corresponding time interval.
And the time interval corresponding to the slope of the curve is the same as the time interval corresponding to the seasonal data.
In step S420, a median slope of the curve in the time interval in which the slope of the curve is located is calculated based on the slope of the curve at each time.
The median curve slope represents the median of the slope of the curve over the corresponding time interval. In one embodiment, calculating a median curve slope in a time interval in which the curve slope is based on the curve slope at each time comprises: and respectively taking the absolute value of the curve slope at each moment, and taking the median of the absolute value of the curve slope at each moment to determine the median curve slope in the time interval in which the curve slope is positioned.
In step S430, the reciprocal of the median curve slope is determined as the scaling factor.
Two numbers are reciprocals of each other if their product is 1.
In this embodiment, the scaling coefficient is determined by using the slope of the curve, and the scaling coefficient changes with different sample data, so that the adaptability is stronger.
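Steps S410 to S430 can be sketched as follows: take the absolute value of each slope, take the median, and return its reciprocal. The function name `scaling_factor` and the error raised for a zero median are assumptions; the application does not say how a zero median slope should be handled.

```python
from statistics import median

def scaling_factor(slopes):
    """Reciprocal of the median absolute curve slope over the interval."""
    med = median([abs(k) for k in slopes])
    if med == 0:
        # Assumption: the reciprocal is undefined, so the caller must pick a fallback
        raise ValueError("median absolute slope is zero")
    return 1.0 / med
```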
And step S142, determining a regularized residual error at the t moment according to the scaling coefficient, the curve slope at the t moment and the residual error item.
In one embodiment, determining a regularized residual at time t based on the scaling factor, the slope of the curve at time t, and a residual term comprises: calculating the product of the scaling coefficient and the slope of the curve at the time t; calculating a sum of 1 and the product; and determining the ratio of the residual error item at the time t to the sum value as the regularized residual error at the time t. In other embodiments, the regularized residual at time t is determined according to the scaling factor, the slope of the curve at time t, and the residual term, which may also be implemented in other ways, such as replacing 1 with another constant, and so on.
In one embodiment, the process of calculating the regularized residual may be represented by the following equation:
$$R'_t = \frac{R_t}{1 + \alpha K_t}$$
where $R'_t$ is the regularized residual at time t, $R_t$ is the residual at time t, $K_t$ is the slope of the seasonal data at time t, and $\alpha$ is the scaling coefficient.
In the embodiment, when the regularization calculation is performed on the residual error term, a scaling coefficient is introduced, so that a larger fluctuation influence on the process of calculating the regularization residual error can be avoided under the condition that the slope is too large or too small.
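The ratio described in this embodiment (residual divided by the sum of 1 and the product of the scaling coefficient and the slope) can be sketched as below. The function name `regularize_residuals` and the `eps` guard are assumptions. The sketch follows the wording literally and uses the signed slope; the text does not state whether the absolute slope is intended, and with a negative slope the denominator can approach zero, so a caller may prefer to pass absolute slopes.

```python
def regularize_residuals(residuals, slopes, alpha, eps=1e-12):
    """Divide each residual R_t by (1 + alpha * K_t)."""
    regularized = []
    for r, k in zip(residuals, slopes):
        denom = 1.0 + alpha * k
        if abs(denom) < eps:
            # Assumption: fail loudly instead of returning a huge value
            raise ZeroDivisionError("1 + alpha * slope is too close to zero")
        regularized.append(r / denom)
    return regularized
```

Where the seasonal curve is flat (slope 0) the residual is unchanged, and where it rises steeply the residual is scaled down, which is how the sketch dampens residuals during normal abrupt changes.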
Further, in an embodiment, performing data anomaly detection according to a preset fluctuation interval with the regularized residual as a statistic includes: sliding a preset time window over the time interval of the regularized residual, and determining that the data is anomalous when the number of anomalous data points (data points outside the preset fluctuation interval) within the preset time window exceeds a preset threshold, thereby obtaining the data anomaly detection result. For example, in a specific embodiment, the preset time window is set to T and the preset threshold is set to 80%; when anomalous data points account for more than 80% of the data points in any period of length T in the regularized residual, it is determined that the time series data to be detected corresponding to the regularized residual is anomalous. It will be appreciated that in other embodiments, the preset time window and the preset threshold may be set to other values depending on the actual situation.
In the embodiment, when data anomaly detection is actually performed, a preset time window is used for counting normalized residual errors, and if data points of the normalized residual errors exceeding a preset fluctuation interval in the preset time window exceed a set quantity threshold, data anomaly is determined to be detected; this can reduce the influence of data noise on the abnormality detection result.
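The sliding-window decision can be sketched as follows, with the window measured in data points and the threshold expressed as a fraction of the window, as in the 80% example. The function name `has_sustained_anomaly` and its parameters are hypothetical.

```python
def has_sustained_anomaly(regularized, lower, upper, window, ratio_threshold=0.8):
    """True if any window holds more than ratio_threshold out-of-interval points."""
    # 1 marks a point outside the preset fluctuation interval [lower, upper]
    flags = [1 if (r < lower or r > upper) else 0 for r in regularized]
    if window <= 0 or len(flags) < window:
        return False
    count = sum(flags[:window])
    if count / window > ratio_threshold:
        return True
    for end in range(window, len(flags)):
        # Slide by one point: add the entering flag, drop the leaving one
        count += flags[end] - flags[end - window]
        if count / window > ratio_threshold:
            return True
    return False
```

A single isolated spike therefore does not trigger a detection, which is the noise-suppression effect this embodiment describes.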
Further, in an embodiment, before the regularized residual at time t is determined according to the scaling coefficient, the curve slope at time t, and the residual term, regularizing the residual further includes: denoising the residual term to obtain a denoised residual term. In this embodiment, the regularized residual at time t is determined according to the scaling coefficient, the curve slope at time t, and the denoised residual term.
Denoising the residual data, namely removing noise in the residual data; denoising can be performed in any one of the realizable ways. In one embodiment, the residual term may be denoised using an exponential moving average algorithm. The exponential moving average method is simply referred to as an exponential smoothing method. The method is a prediction method which uses the actual value and the predicted value (estimated value) of the previous period to carry out different weighted distribution to the actual value and the predicted value to obtain an exponential smoothing value as the predicted value of the next period.
In this embodiment, before the residual error is normalized, denoising operation is performed on residual error data to reduce the influence of data noise on anomaly detection, so that anomaly detection is more accurate.
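A minimal sketch of exponential moving average denoising is given below. The smoothing weight `smoothing` and the choice to seed the average with the first value are assumptions; the application does not specify either.

```python
def exponential_moving_average(values, smoothing=0.3):
    """Smooth a sequence: s[t] = smoothing * v[t] + (1 - smoothing) * s[t-1]."""
    if not 0 < smoothing <= 1:
        raise ValueError("smoothing must be in (0, 1]")
    smoothed = []
    for v in values:
        if not smoothed:
            # Seed the average with the first observation
            smoothed.append(float(v))
        else:
            smoothed.append(smoothing * v + (1 - smoothing) * smoothed[-1])
    return smoothed
```

A smaller `smoothing` value suppresses more noise but also delays the response to a genuine shift in the residual.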
Further, in one embodiment, the preset fluctuation interval determination includes the steps of: determining the average value and the standard deviation of the normalized residual errors in the corresponding time interval; and determining a preset fluctuation interval by adopting an n-sigma algorithm based on the average value and the standard deviation.
The time interval corresponding to the regularized residual is the same as the time interval corresponding to the seasonal data. If the regularized residual is approximately normally distributed, the fluctuation interval can be determined with an n-sigma algorithm, which is simple to compute and efficient. In one embodiment, the preset fluctuation interval is determined with the 3-sigma algorithm based on the average value and the standard deviation, that is, [μ-3σ, μ+3σ] is taken as the preset fluctuation interval. The 3-sigma rule states that observations of a normally distributed random variable fall into different intervals with fixed probabilities: the probability of falling within three standard deviations of the mean is 99.73%, and falling outside that interval can be regarded as a small-probability event. Therefore, in anomaly detection, data falling outside 3 standard deviations is generally regarded as anomalous. Even if the random variable is not normally distributed, Chebyshev's inequality guarantees that the probability of falling within 3 standard deviations of the mean is at least 88.89%.
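The n-sigma interval and the Chebyshev bound can be sketched as follows. The function names are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the application does not say whether the population or sample form is meant.

```python
from statistics import mean, pstdev

def n_sigma_interval(regularized, n=3.0):
    """Return (mu - n * sigma, mu + n * sigma) for the regularized residuals."""
    mu = mean(regularized)
    sigma = pstdev(regularized)  # assumption: population standard deviation
    return mu - n * sigma, mu + n * sigma

def chebyshev_lower_bound(n):
    """Minimum probability of lying within n standard deviations, any distribution."""
    # Chebyshev: P(|X - mu| >= n * sigma) <= 1 / n**2
    return 1.0 - 1.0 / (n * n)
```

For n = 3 the bound is 1 - 1/9, which is the 88.89% figure quoted above.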
In another embodiment, determining the preset fluctuation interval includes: determining the median regularized residual of the regularized residuals in the corresponding time interval and the median regularized residual offset value between each regularized residual and the median regularized residual; and determining the preset fluctuation interval based on the median regularized residual and the median regularized residual offset value.
The median regularized residual represents the median level of the regularized residuals in the corresponding time interval; the median regularized residual offset value represents the deviation of the regularized residuals from the median regularized residual, i.e., the median absolute deviation (MAD). In a specific embodiment, the preset fluctuation interval determined based on the median regularized residual and the median regularized residual offset value is:
[median − 6·MAD, median + 6·MAD];
where median represents the median regularized residual, and MAD represents the median regularized residual offset value.
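A minimal sketch of the median-based interval, assuming MAD is the median of the absolute deviations from the median. The function name `median_mad_interval` and the parameter `k` are hypothetical; the default `k=6.0` mirrors the multiplier in the interval above.

```python
import statistics


def median_mad_interval(values, k=6.0):
    """Return (median - k*MAD, median + k*MAD) for a list of regularized residuals."""
    med = statistics.median(values)
    # MAD: median of the absolute deviations of each value from the median.
    mad = statistics.median([abs(v - med) for v in values])
    return med - k * mad, med + k * mad


if __name__ == "__main__":
    residuals = [1, 2, 3, 4, 100]
    lower, upper = median_mad_interval(residuals)
    outliers = [v for v in residuals if v < lower or v > upper]
    print(lower, upper, outliers)
```

Because both the median and the MAD are insensitive to a few extreme values, the single large value in the example does not widen the interval the way it would widen a mean and standard deviation based interval.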
In the above embodiments, the preset fluctuation interval is determined in two different ways, either of which achieves anomaly detection on the data. It will be appreciated that in other embodiments, the preset fluctuation interval may be determined in any other feasible way.
The application also provides an application scenario, shown in FIG. 5, to which the data anomaly detection method is applied. In this embodiment, the time series is seasonal time series data with a period of 1 day. Specifically, the data anomaly detection method is applied in this scenario as follows:
1) Acquire the offline time series data to be detected, including historical data within 8 days.
2) Perform time series decomposition on the time series data to be detected to obtain seasonal data, a residual term, and trend data, where the time interval corresponding to the seasonal data is 1 day, and the time interval corresponding to the residual term and the trend data is 7 days.
3) Calculate the curve slope of the seasonal data using the lowess algorithm, and periodically extend the seasonal data to cover the 7-day time interval.
4) Perform exponential moving average denoising on the residual data to remove the noise in the residual data, and regularize the denoised residual data to obtain the regularized residual. The regularization is calculated as:
$$\hat{R}_t = \frac{R_t}{1 + \alpha K_t}$$
where $\hat{R}_t$ denotes the regularized residual, $R_t$ denotes the residual, $K_t$ denotes the slope of the seasonal data at time t, and α is a scaling coefficient that reflects the degree of influence of the slope on the statistic. In one embodiment, α is a positive integer; in another embodiment, the scaling coefficient is determined as:
$$\alpha = \frac{1}{\mathrm{median}\left(|K_{(t)}|\right)}$$
where $K_{(t)}$ represents the slope at time t, with 0 < t < 1440 min in this embodiment, and $\mathrm{median}(|K_{(t)}|)$ represents the median of the slope within one cycle.
5) Using the regularized residual as the statistic, judge abnormal values against the selected preset fluctuation interval. The preset fluctuation interval may be calculated from the historical characteristic data of the index: the average value μ and the standard deviation σ are computed, and a reasonable preset fluctuation interval is then obtained with the statistical 3-sigma method: [μ − 3σ, μ + 3σ]. Alternatively, the median and the median absolute deviation (MAD) may be used to calculate the preset fluctuation interval, for example: [median − 6·MAD, median + 6·MAD].
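The regularization in step 4) divides each residual by one plus the scaled slope, so residuals that occur where the seasonal curve changes steeply are damped. The sketch below illustrates this under stated assumptions: the function names are hypothetical, and taking the absolute value of the slope in the denominator is an assumption of this example (it keeps the denominator at 1 or above when the slope is negative), not something the text confirms.

```python
import statistics


def scaling_coefficient(slopes):
    """alpha = 1 / median(|K_t|) over one cycle of slopes."""
    median_abs_slope = statistics.median([abs(k) for k in slopes])
    if median_abs_slope == 0:
        raise ValueError("median absolute slope is zero; choose alpha another way")
    return 1.0 / median_abs_slope


def regularize_residuals(residuals, slopes, alpha):
    """Regularized residual: R_t / (1 + alpha * |K_t|).

    Assumption: the absolute slope is used so the denominator never
    drops below 1 for negative slopes.
    """
    return [r / (1.0 + alpha * abs(k)) for r, k in zip(residuals, slopes)]


if __name__ == "__main__":
    # The fourth point sits on a steep edge of the seasonal curve,
    # e.g. the start of a scheduled task.
    slopes = [0.5, -0.5, 0.5, 20.0, -0.5]
    residuals = [1.0, -1.0, 1.0, 41.0, 1.0]
    alpha = scaling_coefficient(slopes)
    print(alpha, regularize_residuals(residuals, slopes, alpha))
```

In the example, the raw residual of 41.0 at the steep edge is reduced to the same order of magnitude as its neighbours, which is the effect that lets a fixed fluctuation interval tolerate normal abrupt changes.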
In some service scenarios, such as scheduled tasks and stress testing, index data undergoes normal abrupt changes, and traditional statistical methods then produce many misjudgments in which normal data is judged to be abnormal. The data anomaly detection method in this embodiment can reduce such misjudgments without increasing missed detections, which is a significant advantage. In other scenarios, the method can in theory also show certain advantages over a purely statistical algorithm. On the one hand, the algorithm can be used to label abnormal values in offline data and provide a training set for supervised classification; on the other hand, with suitable adjustments, such as caching the slope and seasonal data and refreshing them on a schedule, the algorithm can also be applied to real-time anomaly monitoring and quickly put into production use.
In FIG. 6(a) and FIG. 6(b), the horizontal lines represent the preset fluctuation intervals for the residual $R_t$ and the regularized residual $\hat{R}_t$, respectively, calculated from the median and the MAD as [median − 6·MAD, median + 6·MAD]. In actual calculation, because of the volatility of the data, an occasional one or two abnormal values may correspond to accidental jitter of the data rather than a true system failure. Therefore, the number of abnormal points within a certain time window (the preset time window) is usually counted, and the system corresponding to the index is considered abnormal only when the number of abnormal points exceeds a preset number threshold. The parts circled by solid-line ellipses in FIG. 6(a) and FIG. 6(b) are real anomalies, and the parts circled by dotted-line ellipses are false anomalies caused by abrupt data changes at the beginning and end of a scheduled task. A comparison shows that when $R_t$ is used as the statistic, the misjudgment rate is as high as 6/(6+3) = 66.7%, whereas using $\hat{R}_t$ as the statistic eliminates the misjudgments caused by normal abrupt changes in the index data. In addition, the data anomaly detection method is general and can be used for anomaly judgment on other seasonal time series data.
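The window-based rule described above, in which a system anomaly is reported only when enough abnormal points fall inside a preset time window, might be sketched as follows. The function name, the parameter names, and the sample values are hypothetical.

```python
def windows_with_anomaly(values, lower, upper, window=4, max_points=2):
    """Slide a fixed-size window over the series and return the start index
    of every window whose count of out-of-interval points exceeds max_points."""
    flags = [not (lower <= v <= upper) for v in values]
    alarms = []
    for start in range(len(flags) - window + 1):
        abnormal_count = sum(flags[start:start + window])
        if abnormal_count > max_points:
            alarms.append(start)
    return alarms


if __name__ == "__main__":
    regularized = [0, 0, 9, 0, 0, 0, 9, 9, 9, 0]
    # The isolated spike at index 2 is treated as jitter;
    # the run at indices 6 to 8 triggers an alarm.
    print(windows_with_anomaly(regularized, lower=-3, upper=3))
```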
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in each flowchart may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 7, a data anomaly detection apparatus is provided. The apparatus may be implemented as part of a computer device by software modules, hardware modules, or a combination of the two, and specifically includes a data acquisition module 710, a time series decomposition module 720, a slope calculation module 730, a residual regularization module 740, and an anomaly detection module 750, wherein:
the data acquisition module 710 is configured to acquire time series data to be detected;
the time series decomposition module 720 is configured to perform time series decomposition on the time series data to be detected to obtain corresponding seasonal data and a residual term;
the slope calculation module 730 is configured to calculate a corresponding curve slope according to the seasonal data;
the residual regularization module 740 is configured to regularize the residual term based on the curve slope to obtain a regularized residual of the time series data to be detected; and
the anomaly detection module 750 is configured to determine a data anomaly detection result according to a preset fluctuation interval, using the regularized residual as a statistic.
The data anomaly detection apparatus first performs time series decomposition on the acquired time series data to be detected to obtain corresponding seasonal data and a residual term, calculates the curve slope according to the seasonal data, adjusts the residual term based on the curve slope to obtain a regularized residual, and uses the regularized residual as a statistic to determine the data anomaly detection result of the time series data to be detected according to the preset fluctuation interval. Because the apparatus calculates the curve slope from the seasonal data obtained by time series decomposition and uses that slope to adjust the residual before statistical analysis, it reduces the influence of normal abrupt changes caused by system tasks on data anomaly detection and reduces misjudgments.
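To make the decomposition step concrete, the following sketch uses a simple classical additive decomposition (moving-average trend, per-phase seasonal average, remainder as residual). This technique is swapped in for illustration only; the text here does not state which decomposition algorithm is used, and the function name `decompose_additive` and the sample series are hypothetical.

```python
import statistics


def decompose_additive(series, period):
    """Split a series into trend, one cycle of seasonal data, and residual.

    Assumption: a crude classical additive decomposition, not necessarily
    the decomposition used by the method described above.
    """
    n = len(series)
    half = period // 2
    # Trend: centered moving average; windows are truncated at the edges.
    trend = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        trend.append(statistics.mean(series[lo:hi]))
    detrended = [y - t for y, t in zip(series, trend)]
    # Seasonal data for one cycle: average of the detrended values at each phase.
    seasonal_cycle = [statistics.mean(detrended[p::period]) for p in range(period)]
    # Residual: what remains after removing trend and seasonal data.
    residual = [d - seasonal_cycle[i % period] for i, d in enumerate(detrended)]
    return trend, seasonal_cycle, residual


if __name__ == "__main__":
    # Four cycles of a period-4 pattern with a slow upward drift.
    data = [float((i % 4) * 2 + 0.1 * i) for i in range(16)]
    trend, seasonal_cycle, residual = decompose_additive(data, period=4)
    print(seasonal_cycle)
```

The seasonal cycle returned here plays the role of the 1-day seasonal data in the scenario above, and the residual is the input to the denoising and regularization steps.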
In one embodiment, the slope calculation module 730 of the above apparatus comprises: the periodic expansion unit is used for periodically expanding the seasonal data to obtain expanded seasonal data; and the calculating unit is used for calculating the corresponding curve slope of the expanded seasonal data by adopting a local weighted regression algorithm.
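The two units described above can be sketched as periodic extension followed by a local weighted linear fit whose slope is read off at each point. This is a simplified stand-in, assuming a fixed-width window with tricube weights and unit time steps; full LOWESS implementations also use a nearest-neighbour bandwidth and robustness iterations. The names `extend_periodically`, `local_slopes`, and `half_window` are hypothetical.

```python
def extend_periodically(seasonal_cycle, repeats):
    """Repeat one cycle of seasonal data `repeats` times."""
    return list(seasonal_cycle) * repeats


def local_slopes(values, half_window=2):
    """Slope of a tricube-weighted local linear fit centered at each index.

    Assumption: samples are equally spaced with a time step of 1.
    """
    n = len(values)
    slopes = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n - 1, i + half_window)
        xs = list(range(lo, hi + 1))
        # Tricube weights; distances are scaled by half_window + 1 so that
        # the outermost points in the window keep a nonzero weight.
        ws = [(1.0 - (abs(x - i) / (half_window + 1.0)) ** 3) ** 3 for x in xs]
        total = sum(ws)
        mean_x = sum(w * x for w, x in zip(ws, xs)) / total
        mean_y = sum(w * values[x] for w, x in zip(ws, xs)) / total
        sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mean_x) * (values[x] - mean_y) for w, x in zip(ws, xs))
        slopes.append(sxy / sxx if sxx > 0 else 0.0)
    return slopes


if __name__ == "__main__":
    # One cycle with an abrupt step, such as a scheduled task starting.
    one_cycle = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
    extended = extend_periodically(one_cycle, repeats=7)
    slopes = local_slopes(extended, half_window=2)
    print(len(extended), max(abs(k) for k in slopes))
```

Extending the cycle before fitting means the slope at the start and end of a cycle is estimated from neighbours on both sides rather than from a truncated window.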
In one embodiment, the residual regularization module includes: the information reading unit is used for reading the scaling coefficient, the curve slope at the t moment in the curve slope and a residual error item at the t moment; and the regularization calculation unit is used for determining regularization residual errors at the t moment according to the scaling coefficients, the curve slope at the t moment and the residual error terms.
In an embodiment, the regularization calculation unit of the apparatus is specifically configured to: calculating the product of the scaling coefficient and the slope of the curve at the time t; calculating a sum of 1 and the product; and determining the ratio of the residual error item at the time t to the sum value as the regularized residual error at the time t.
In one embodiment, the above apparatus further comprises: a reading module, configured to read the curve slope at each moment in the corresponding time interval; and a calculation module, configured to calculate the median of the curve slopes in the time interval based on the curve slope at each moment; the calculation module is further configured to determine the reciprocal of the median curve slope as the scaling coefficient.
In one embodiment, the above apparatus further comprises: the fluctuation interval determining module is used for determining the average value and the standard deviation of the normalized residual errors in the corresponding time interval; and the fluctuation interval determining module is also used for determining a preset fluctuation interval by adopting an n-sigma algorithm based on the average value and the standard deviation.
In another embodiment, the above apparatus further comprises: the fluctuation interval determining module is used for determining a middle normalized residual error of the normalized residual errors in the corresponding time interval and a middle normalized residual error deviation value of each normalized residual error and the middle normalized residual error; and the fluctuation interval determining module is further used for determining a preset fluctuation interval based on the middle normalized residual error and the middle normalized residual error offset value.
For specific limitations of the data anomaly detection apparatus, reference may be made to the above limitations of the data anomaly detection method, which are not repeated here. The modules in the data anomaly detection apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a data anomaly detection method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in FIG. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the following steps when executing the computer program:
acquiring time sequence data to be detected; performing time sequence decomposition on time sequence data to be detected to obtain corresponding seasonal data and residual error items; calculating a corresponding curve slope according to the seasonal data; regularizing the residual error term based on the slope of the curve to obtain a regularized residual error of the time sequence data to be detected; and determining a data abnormity detection result according to a preset fluctuation interval by using the regularized residual error as a statistic.
In one embodiment, the processor, when executing the computer program, further performs the steps of: carrying out periodic expansion on the seasonal data to obtain expanded seasonal data; and calculating the corresponding curve slope of the expanded seasonal data by adopting a local weighted regression algorithm.
In one embodiment, the processor, when executing the computer program, further performs the steps of: reading a scaling coefficient, a curve slope at the t moment in the curve slope and a residual error item at the t moment; and determining the regularized residual error at the t moment according to the scaling coefficient, the curve slope at the t moment and the residual error item.
In one embodiment, the processor, when executing the computer program, further performs the steps of: calculating the product of the scaling coefficient and the slope of the curve at the time t; calculating a sum of 1 and the product; and determining the ratio of the residual error item at the time t to the sum value as the regularized residual error at the time t.
In one embodiment, the processor, when executing the computer program, further performs the steps of: reading the curve slope of each moment in the corresponding time interval; calculating the median curve slope in the time interval of the curve slope based on the curve slope at each moment; the inverse of the slope of the median curve is determined as the scaling factor.
In one embodiment, the processor, when executing the computer program, further performs the steps of: determining the average value and the standard deviation of the normalized residual errors in the corresponding time interval; and determining a preset fluctuation interval by adopting an n-sigma algorithm based on the average value and the standard deviation.
In one embodiment, the processor, when executing the computer program, further performs the steps of: determining a middle normalized residual error of the normalized residual errors in the corresponding time interval and a middle normalized residual error deviation value of each normalized residual error and the middle normalized residual error; and determining a preset fluctuation interval based on the middle normalized residual error and the middle normalized residual error offset value.
In one embodiment, a computer-readable storage medium is provided, storing a computer program that, when executed by a processor, performs the steps of:
acquiring time sequence data to be detected; performing time sequence decomposition on time sequence data to be detected to obtain corresponding seasonal data and residual error items; calculating a corresponding curve slope according to the seasonal data; regularizing the residual error term based on the slope of the curve to obtain a regularized residual error of the time sequence data to be detected; and determining a data abnormity detection result according to a preset fluctuation interval by using the regularized residual error as a statistic.
In one embodiment, the computer program when executed by the processor further performs the steps of: carrying out periodic expansion on the seasonal data to obtain expanded seasonal data; and calculating the corresponding curve slope of the expanded seasonal data by adopting a local weighted regression algorithm.
In one embodiment, the computer program when executed by the processor further performs the steps of: reading a scaling coefficient, a curve slope at the t moment in the curve slope and a residual error item at the t moment; and determining the regularized residual error at the t moment according to the scaling coefficient, the curve slope at the t moment and the residual error item.
In one embodiment, the computer program when executed by the processor further performs the steps of: calculating the product of the scaling coefficient and the slope of the curve at the time t; calculating a sum of 1 and the product; and determining the ratio of the residual error item at the time t to the sum value as the regularized residual error at the time t.
In one embodiment, the computer program when executed by the processor further performs the steps of: reading the curve slope of each moment in the corresponding time interval; calculating the median curve slope in the time interval of the curve slope based on the curve slope at each moment; the inverse of the slope of the median curve is determined as the scaling factor.
In one embodiment, the computer program when executed by the processor further performs the steps of: determining the average value and the standard deviation of the normalized residual errors in the corresponding time interval; and determining a preset fluctuation interval by adopting an n-sigma algorithm based on the average value and the standard deviation.
In one embodiment, the computer program when executed by the processor further performs the steps of: determining a middle normalized residual error of the normalized residual errors in the corresponding time interval and a middle normalized residual error deviation value of each normalized residual error and the middle normalized residual error; and determining a preset fluctuation interval based on the middle normalized residual error and the middle normalized residual error offset value.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above examples express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method for detecting data anomalies, the method comprising:
acquiring time sequence data to be detected;
performing time sequence decomposition on the time sequence data to be detected to obtain corresponding seasonal data and residual error items;
calculating a corresponding curve slope according to the seasonal data;
regularizing the residual error item based on the slope of the curve to obtain a regularized residual error of the time series data to be detected;
and determining a data abnormity detection result according to a preset fluctuation interval by taking the regularized residual error as a statistic.
2. The method of claim 1, wherein said calculating a corresponding slope of a curve from said seasonal data comprises:
carrying out periodic expansion on the seasonal data to obtain expanded seasonal data;
and calculating the corresponding curve slope of the expanded seasonal data by adopting a local weighted regression algorithm.
3. The data anomaly detection method according to claim 1, wherein the regularizing the residual term based on the slope of the curve to obtain a regularized residual of the time series data to be detected comprises:
reading a scaling coefficient, a curve slope at the time t in the curve slopes and a residual error item at the time t;
and determining the regularized residual error at the t moment according to the scaling coefficient, the curve slope at the t moment and the residual error item.
4. The method of claim 3, wherein determining the regularized residual at time t based on the scaling factor, the slope of the curve at time t, and a residual term comprises:
calculating the product of the scaling coefficient and the slope of the curve at the time t;
calculating a sum of 1 and the product;
and determining the ratio of the residual error item at the time t to the sum value as the regularized residual error at the time t.
5. The data anomaly detection method according to claim 3 or 4, characterized in that said determination of scaling factors comprises the steps of:
reading the curve slope of each moment of the curve slope in the corresponding time interval;
calculating the median curve slope in the time interval of the curve slope based on the curve slope at each moment;
determining an inverse of the median curve slope as the scaling factor.
6. The data anomaly detection method according to claim 1, characterized in that said preset fluctuation interval determination comprises the steps of:
determining the average value and the standard deviation of the regularized residual errors in the corresponding time interval; determining the preset fluctuation interval by adopting an n-sigma algorithm based on the average value and the standard deviation;
or,
determining a middle normalized residual error of the normalized residual errors in the corresponding time interval and a middle normalized residual error offset value of each normalized residual error and the middle normalized residual error; and determining the preset fluctuation interval based on the middle normalized residual error and the middle normalized residual error offset value.
7. The method for detecting data abnormality according to claim 1, wherein determining a data abnormality detection result according to a preset fluctuation interval using the regularized residuals as statistics includes:
sliding a preset time window in a time interval in which the normalized residual error is located, and judging that data abnormality is detected when the number of detected data abnormality points in the preset time window exceeds a preset threshold; the data abnormal points comprise data points corresponding to regularized residuals exceeding a preset fluctuation interval.
8. An apparatus for detecting data abnormality, the apparatus comprising:
the data acquisition module is used for acquiring the time sequence data to be detected;
the time sequence decomposition module is used for carrying out time sequence decomposition on the time sequence data to be detected to obtain corresponding seasonal data and residual error items;
the slope calculation module is used for calculating the corresponding curve slope according to the seasonal data;
the residual error regularization module is used for regularizing the residual error items based on the curve slope to obtain a regularized residual error of the time series data to be detected;
and the abnormal detection module is used for determining a data abnormal detection result according to a preset fluctuation interval by taking the normalized residual error as a statistic.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202011407049.0A 2020-12-04 2020-12-04 Data anomaly detection method and device, computer equipment and storage medium Pending CN112380044A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011407049.0A CN112380044A (en) 2020-12-04 2020-12-04 Data anomaly detection method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112380044A true CN112380044A (en) 2021-02-19

Family

ID=74589447

Country Status (1)

Country Link
CN (1) CN112380044A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312239A (en) * 2021-06-11 2021-08-27 腾讯云计算(北京)有限责任公司 Data detection method, device, electronic equipment and medium
CN113568950A (en) * 2021-07-29 2021-10-29 北京字节跳动网络技术有限公司 Index detection method, device, equipment and medium
CN115328723A (en) * 2022-04-29 2022-11-11 上海鼎茂信息技术有限公司 Self-adaptive baseband optimization time sequence abnormity detection method and system
CN116582134A (en) * 2023-07-11 2023-08-11 江苏盖亚环境科技股份有限公司 Drilling and testing integrated equipment data processing method
CN117439827A (en) * 2023-12-22 2024-01-23 中国人民解放军陆军步兵学院 Network flow big data analysis method

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102184162A (en) * 2011-02-01 2011-09-14 环境保护部卫星环境应用中心 Method for remotely sensing and quantitatively monitoring steppe vegetation coverage space-time dynamic change
US20130325785A1 (en) * 2010-12-29 2013-12-05 Israel Aerospace Industries Ltd. Computerized system for monitoring and controlling physical data-producing apparatus
US20140310235A1 (en) * 2013-04-11 2014-10-16 Oracle International Corporation Seasonal trending, forecasting, anomaly detection, and endpoint prediction of java heap usage
US20140358833A1 (en) * 2013-05-29 2014-12-04 International Business Machines Corporation Determining an anomalous state of a system at a future point in time
CN104200087A (en) * 2014-06-05 2014-12-10 清华大学 Parameter optimization and feature tuning method and system for machine learning
US20180324199A1 (en) * 2017-05-05 2018-11-08 Servicenow, Inc. Systems and methods for anomaly detection
CN108984870A (en) * 2018-06-29 2018-12-11 中国科学院深圳先进技术研究院 Freezer data of the Temperature and Humidity module prediction technique and Related product based on ARIMA
KR20190070728A (en) * 2017-12-13 2019-06-21 주식회사 케이티 Method and Apparatus for Checking of Error of Time Series Data
CN109934456A (en) * 2019-01-29 2019-06-25 中国电力科学研究院有限公司 A kind of method and system for acquisition operational system progress intelligent trouble detection
CN110008080A (en) * 2018-12-25 2019-07-12 阿里巴巴集团控股有限公司 Operational indicator method for detecting abnormality, device and electronic equipment based on time series
US20190311297A1 (en) * 2018-04-05 2019-10-10 Microsoft Technology Licensing, Llc Anomaly detection and processing for seasonal data
CN111444168A (en) * 2020-03-26 2020-07-24 易电务(北京)科技有限公司 Distribution room transformer daily maximum load abnormal data detection processing method
CN111680397A (en) * 2020-05-06 2020-09-18 北京航空航天大学 Adaptive stability detection method for satellite seasonal fluctuation remote measurement

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130325785A1 (en) * 2010-12-29 2013-12-05 Israel Aerospace Industries Ltd. Computerized system for monitoring and controlling physical data-producing apparatus
CN102184162A (en) * 2011-02-01 2011-09-14 环境保护部卫星环境应用中心 Method for remotely sensing and quantitatively monitoring steppe vegetation coverage space-time dynamic change
US20140310235A1 (en) * 2013-04-11 2014-10-16 Oracle International Corporation Seasonal trending, forecasting, anomaly detection, and endpoint prediction of java heap usage
US20140358833A1 (en) * 2013-05-29 2014-12-04 International Business Machines Corporation Determining an anomalous state of a system at a future point in time
CN104200087A (en) * 2014-06-05 2014-12-10 清华大学 Parameter optimization and feature tuning method and system for machine learning
US20180324199A1 (en) * 2017-05-05 2018-11-08 Servicenow, Inc. Systems and methods for anomaly detection
KR20190070728A (en) * 2017-12-13 2019-06-21 주식회사 케이티 Method and Apparatus for Checking of Error of Time Series Data
US20190311297A1 (en) * 2018-04-05 2019-10-10 Microsoft Technology Licensing, Llc Anomaly detection and processing for seasonal data
CN108984870A (en) * 2018-06-29 2018-12-11 中国科学院深圳先进技术研究院 ARIMA-based cold storage temperature and humidity data prediction method and related products
CN110008080A (en) * 2018-12-25 2019-07-12 阿里巴巴集团控股有限公司 Time-series-based business indicator anomaly detection method, device and electronic equipment
CN109934456A (en) * 2019-01-29 2019-06-25 中国电力科学研究院有限公司 Method and system for intelligent fault detection of an acquisition operation and maintenance system
CN111444168A (en) * 2020-03-26 2020-07-24 易电务(北京)科技有限公司 Distribution room transformer daily maximum load abnormal data detection processing method
CN111680397A (en) * 2020-05-06 2020-09-18 北京航空航天大学 Adaptive stability detection method for satellite seasonal fluctuation remote measurement

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Ming Feng (明锋): "Research on GPS Coordinate Time Series Analysis", China Excellent Doctoral Dissertations (Electronic Journal), 31 December 2018 (2018-12-31), pages 1-18 *
Lu Jiali (陆佳丽): "Log Anomaly Detection Method Based on an Improved Time Series Model", Excellent Papers, 30 September 2020 (2020-09-30), pages 1-5 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312239A (en) * 2021-06-11 2021-08-27 腾讯云计算(北京)有限责任公司 Data detection method, device, electronic equipment and medium
CN113312239B (en) * 2021-06-11 2024-03-15 腾讯云计算(北京)有限责任公司 Data detection method, device, electronic equipment and medium
CN113568950A (en) * 2021-07-29 2021-10-29 北京字节跳动网络技术有限公司 Index detection method, device, equipment and medium
CN115328723A (en) * 2022-04-29 2022-11-11 上海鼎茂信息技术有限公司 Self-adaptive baseband optimization time sequence abnormity detection method and system
CN116582134A (en) * 2023-07-11 2023-08-11 江苏盖亚环境科技股份有限公司 Drilling and testing integrated equipment data processing method
CN116582134B (en) * 2023-07-11 2023-10-13 江苏盖亚环境科技股份有限公司 Drilling and testing integrated equipment data processing method
CN117439827A (en) * 2023-12-22 2024-01-23 中国人民解放军陆军步兵学院 Network flow big data analysis method
CN117439827B (en) * 2023-12-22 2024-03-08 中国人民解放军陆军步兵学院 Network flow big data analysis method

Similar Documents

Publication Publication Date Title
CN112380044A (en) Data anomaly detection method and device, computer equipment and storage medium
Shapi et al. Energy consumption prediction by using machine learning for smart building: Case study in Malaysia
JP6313730B2 (en) Anomaly detection system and method
US9323599B1 (en) Time series metric data modeling and prediction
CN111045894B (en) Database abnormality detection method, database abnormality detection device, computer device and storage medium
Ibidunmoye et al. Adaptive anomaly detection in performance metric streams
US20190129821A1 (en) Systems and Techniques for Adaptive Identification and Prediction of Data Anomalies, and Forecasting Data Trends Across High-Scale Network Infrastructures
EP3847586A1 (en) Computer-implemented method, computer program product and system for anomaly detection and/or predictive maintenance
CN114285728B (en) Predictive model training method, traffic prediction device and storage medium
Zhang et al. Resource requests prediction in the cloud computing environment with a deep belief network
JP7007243B2 (en) Anomaly detection system
US20210064432A1 (en) Resource needs prediction in virtualized systems: generic proactive and self-adaptive solution
AU2019371339B2 (en) Finite rank deep kernel learning for robust time series forecasting and regression
US20180121275A1 (en) Method and apparatus for detecting and managing faults
CN112257755A (en) Method and device for analyzing operating state of spacecraft
US20130226501A1 (en) Systems and methods for predicting abnormal temperature of a server room using hidden markov model
US20190164067A1 (en) Method and device for monitoring a process of generating metric data for predicting anomalies
Lewis et al. Chaotic attractor prediction for server run-time energy consumption
Li et al. An adaptive prognostics method based on a new health index via data fusion and diffusion process
Wang et al. Concept drift-based runtime reliability anomaly detection for edge services adaptation
Xue et al. Fill-in the gaps: Spatial-temporal models for missing data
Fu et al. SPC methods for nonstationary correlated count data with application to network surveillance
US20220382857A1 (en) Machine Learning Time Series Anomaly Detection
Zhang et al. A novel hybrid model for docker container workload prediction
Méndez et al. Using deep learning to detect anomalies in traffic flow

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination