WO2015142627A1 - Unsupervised anomaly detection for arbitrary time series - Google Patents

Unsupervised anomaly detection for arbitrary time series

Info

Publication number
WO2015142627A1
Authority
WO
WIPO (PCT)
Prior art keywords
anomaly
time series
data
anomalies
processor
Prior art date
Application number
PCT/US2015/020314
Other languages
English (en)
French (fr)
Inventor
Vitaly FILIMONOV
Panagiotis PERIORELLIS
Dmitry Starostin
Alexandre De Baynast
Eldar Akchurin
Aleksandr KLIMOV
Thomas Minka
Alexander SPENGLER
Original Assignee
Microsoft Technology Licensing, Llc
Priority date
Filing date
Publication date
Application filed by Microsoft Technology Licensing, Llc
Priority to EP20150593.0A (EP3671466B1)
Priority to CN201580014770.4A (CN106104496B)
Priority to EP15714070.8A (EP3120248B1)
Publication of WO2015142627A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452 Performance evaluation by statistical analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning

Definitions

  • An anomaly can be defined as anything that differs from expectations.
  • anomaly detection refers to identifying data, events or conditions which do not conform to an expected pattern or to other items in a group. Encountering an anomaly may in some cases indicate a processing abnormality and thus may present a starting point for investigation.
  • anomalies are detected by a human being studying a trace.
  • a trace is a log of information that can come from an application, process, operating system, hardware component, and/or a network.
  • Anomaly detection is classified as supervised, semi-supervised or unsupervised, based on the availability of reference data that acts as a baseline to define what is normal and what is an anomaly.
  • Supervised anomaly detection typically involves training a classifier based on a first type of data that is labeled "normal" and a second type of data that is labeled "abnormal".
  • Semi-supervised anomaly detection typically involves construction of a model representing normal behavior from one type of labeled data: either data that is labeled normal or data that is labeled abnormal, but not both.
  • Unsupervised anomaly detection detects anomalies in data where data is not manually labeled by a human.
  • a system and method for unsupervised anomaly detection can enable automatic detection of values that are abnormal to a high degree of probability in any time series sequence.
  • a sequence refers to a progression of values in a set.
  • a time series sequence or time series refers to any sequence of data in which each item in the sequence is associated with a point in time.
  • Anomalies can be detected in real-time by monitoring and processing the corresponding time series, even when the time series has an evolving distribution, i.e., the time series is not stationary but instead changes or evolves over time. Data in the time series is not labeled before or after being processed. The data is scored without using labeled data.
  • a system for incorporating feedback or prior knowledge is not used.
  • Statistical, signal processing and machine learning techniques can be applied to identify anomalies in time series. Anomaly detection can be based on using a combination of the Z-test and a technique that processes time series that follow the Gaussian distribution pattern.
  • a Z-test is a statistical test based on calculating the distance between the actual value of the current point and the average value of the corresponding sequence in units of its standard deviation (known as z-score).
  • the outcome of the Z-test is a Boolean value indicating that the current point is an outlier or is not an outlier.
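  • As an illustrative sketch of the Z-test described above (not the patented implementation), the z-score and its Boolean outcome can be computed as follows. The function names, the sample history and the cutoff of 3.0 standard deviations are assumptions made for the example; the source does not state a threshold value.

```python
from statistics import mean, pstdev

def z_score(value, history):
    """Distance between value and the mean of history, in units of standard deviation."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Constant history: treat an identical value as distance 0, anything else as infinite.
        return 0.0 if value == mu else float("inf")
    return abs(value - mu) / sigma

def z_test(value, history, threshold=3.0):
    """Boolean outcome: True if value is an outlier. threshold=3.0 is an assumed cutoff."""
    return z_score(value, history) > threshold

history = [10, 12, 11, 13, 12, 11, 10, 12, 11, 12]
print(z_test(12, history))  # False
print(z_test(40, history))  # True
```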
  • Data that follows a Gaussian type of distribution refers to data that falls in a symmetrical bell curve shape.
  • a formal mathematical definition of one or more anomalies can be captured without supervision.
  • Methods taken from statistics, signal processing and machine learning can be used to model a time series and analyze its distribution to detect anomalies in the data.
  • Projection of the input time series can be based on various algorithms including but not limited to linear prediction coding, first order derivative, second order derivative, etc. Projection refers to transformation of the incoming data as it goes through various stages in the pipeline, depending on the processing applied.
  • Control of the frequency of scoring (that is, anomaly detection results) can be based on buffering using time windows of a variable range.
  • the calibration and/or training of algorithms can be based on a specifiable number of data points from ten or more data points.
  • a combination of Boolean and probabilistic results can be produced when an anomaly is identified, potentially increasing reliability. Results can be categorized based on the distribution and types of anomalies detected. The changes of a distribution of a time series can be dynamically monitored so that dynamic and automatic adjustment of the processing of the performance counters data points can occur.
  • FIG. 1a illustrates an example of a system 100 that monitors a component and detects anomalies in time series produced by the component in accordance with aspects of the subject matter described herein;
  • FIG. 1b illustrates a more detailed example of a portion of system 101 in accordance with aspects of the subject matter described herein;
  • FIG. 2a illustrates an example of a method 200 that detects anomalies in accordance with aspects of the subject matter disclosed herein;
  • FIG. 2b illustrates an example of an anomaly display 250 in accordance with aspects of the subject matter disclosed herein; and
  • FIG. 3 is a block diagram of an example of a computing environment in accordance with aspects of the subject matter disclosed herein.
  • Anomalies can be identified in a time series without knowing beforehand what the anomalies of a particular sequence are. Methods taken from statistics, signal processing and machine learning can be used to model a time series, to analyze its distribution and to predict what the anomalies are.
  • the techniques described herein are widely applicable: the unsupervised anomaly detector described herein can detect anomalies in any executing application or service.
  • one application is a cloud computing platform and infrastructure, such as but not limited to AZURE created by Microsoft Corporation. AZURE can be used to build, deploy and manage applications and services.
  • Performance counters can reflect the health of the corresponding instance of the application running on particular hardware. Performance counters can include counters associated with CPU (central processing unit) load, memory, network I/O (input/output), exceptions rate, bytes in a heap, objects in memory stores and many others known to those of skill in the art. Typically, for a single application, more than one hundred counters are used for each hardware unit. Automatically detecting anomalies in the time series values can enable problems to be found more quickly and can enable the problems to be fixed as soon as they occur. Typically today counters are logged and examined later in an offline fashion. This approach can be insufficient today.
  • Historical data that describes how the application has behaved in the past may be unavailable.
  • the unsupervised anomaly detector in accordance with aspects of the subject matter described herein can determine how the application should behave as the application or other type of component runs. Deducing the normal and abnormal behavior of a component quickly means there is typically not enough time to wait for a very large statistical sample to make predictions regarding the characteristics (normal versus anomalous) of a piece of data within a time series.
  • the anomaly detector described herein can calibrate and/or train classifiers based on a specifiable number of data points from as few as ten data points.
  • the anomaly detector described herein can adapt to accommodate a time series that changes dynamically. False positives (normal values mislabeled as anomalous) are minimized and almost no false negatives (anomalous values mislabeled as normal) are produced.
  • Anomalies can be detected in any performance counter for any executing component as the component executes or operates by continually monitoring and processing the performance counter data points.
  • the anomaly can be stored.
  • the anomaly can be displayed along with any related information to an observer (e.g., a customer, user interface, application, etc.).
  • Types of anomalies detected can include at least: out of range value anomaly, spike anomaly, pattern error anomaly, and sudden variation or edge anomaly.
  • the anomaly detector as described herein can be implemented as a rule within a monitoring infrastructure. Such a rule may encapsulate a piece of logic that is applied to a time series. A rule can process the data points of a particular performance counter of a particular component. In accordance with some aspects of the subject matter described herein, within a rule, in one phase the distribution of the data is detected, in another preprocessing is performed, in another anomalies are detected and in another post-processing is performed. The results or another type of indication of the anomalies can be provided.
  • the distribution of the time series of a performance counter can be determined.
  • the distribution of the time series can be a description of the relative number of times each possible value occurs. For example, in the case of a CPU performance counter, values of 98% usage are rare, while values in the range of 20%-70% can be considered normal and usual.
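  • To make the notion of a distribution concrete, the relative number of times each value occurs can be sketched as a binned frequency table. The bin width of 10 percentage points and the sample CPU readings are assumptions chosen for illustration only.

```python
from collections import Counter

def relative_frequencies(values, bin_width=10):
    """Map each bin's lower edge to the fraction of values that fall in that bin."""
    bins = Counter((int(v) // bin_width) * bin_width for v in values)
    total = len(values)
    return {edge: count / total for edge, count in sorted(bins.items())}

# Hypothetical CPU usage samples (percent): mostly 20-70, one rare reading of 98.
cpu = [22, 35, 41, 48, 55, 63, 67, 30, 52, 98]
dist = relative_frequencies(cpu)
print(dist[90])  # 0.1
```

  • In this sample the 90-99% bin holds a single reading, which is the kind of rare value the passage above describes.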
  • features can be extracted from the data points, such as but not limited to the distance of the current data point value to the average value of the time series.
  • An anomaly detection algorithm can be applied to the data in the anomaly detection phase. If the result is positive (i.e., an anomaly has been identified), the result can be sent to the post-processing phase.
  • When a data point is characterized as more than one anomaly (e.g., both a spike and an out-of-range value), post-processing can ensure that only one anomaly event is raised, thus reducing noise.
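  • The de-duplication performed in post-processing can be sketched as follows. The source does not say which anomaly type is kept when several apply to the same data point, so the priority order below, like the function and label names, is purely an assumption.

```python
# Assumed priority order (most important first); not specified by the source.
PRIORITY = ["out_of_range", "spike", "edge", "pattern_error"]

def collapse(detections):
    """detections: iterable of (timestamp, anomaly_type). Keep one event per timestamp."""
    best = {}
    for ts, kind in detections:
        if ts not in best or PRIORITY.index(kind) < PRIORITY.index(best[ts]):
            best[ts] = kind
    return sorted(best.items())

events = collapse([(5, "spike"), (5, "out_of_range"), (9, "edge")])
print(events)  # [(5, 'out_of_range'), (9, 'edge')]
```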
  • the different analytics processing paths may execute in parallel.
  • the processing paths may receive all incoming performance counter time series as input.
  • the normal behavior of the time series can be observed in an initial "warm up phase."
  • the incoming data points can be used to train classifiers using the Z-test and Gaussian distribution algorithms. After the warm up phase, detection of anomalies can begin.
  • the edge detection, spike detection and pattern error processing paths may process data whose time series follows a Gaussian distribution pattern. Because not all performance counters follow a Gaussian distribution pattern, an additional anomaly detection processing path can be provided. This path can receive any time series distribution. To determine the behavior the data stream exhibits and consequently classify the time series into either one that follows a Gaussian distribution pattern or one that does not follow Gaussian distribution, the standard Z-test algorithm can be applied on the raw data. The standard deviation of the inliers can be estimated and compared to a threshold that is dynamically determined during the warm up phase.
  • the standard deviation of the inliers will remain zero, i.e., below the threshold and no anomaly would be raised. Failing to apply a technique such as the Z-test on the raw data can widen the spikes and can generate more noise when filtering isolated outliers.
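  • One possible reading of this classification step is sketched below: the Z-test is applied to the raw data, the standard deviation of the remaining inliers is estimated, and that estimate is compared to a threshold. The z cutoff of 3.0, the threshold of 0.5 and all names are assumptions; in the source the threshold is determined dynamically during the warm up phase.

```python
from statistics import mean, pstdev

def inlier_std(values, z_threshold=3.0):
    """Standard deviation of the points the Z-test keeps as inliers."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return 0.0
    inliers = [v for v in values if abs(v - mu) / sigma <= z_threshold]
    return pstdev(inliers)

def route_to_gaussian_paths(values, std_threshold):
    """True if the inlier spread exceeds the (assumed) threshold."""
    return inlier_std(values) > std_threshold

mostly_zero = [0] * 19 + [1]          # inliers are all zero, so their spread is 0
varying = [10, 12, 11, 13, 12, 11, 10, 12, 11, 12]
print(inlier_std(mostly_zero))                  # 0.0
print(route_to_gaussian_paths(varying, 0.5))    # True
```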
  • Incoming data points can be sent in parallel to the Z-test algorithm (out of range detection) and to the Gaussian based processing paths.
  • the Z-test inliers can be communicated to the distribution analysis component that determines the statistical distribution of the sequence. This can be done continuously. It is possible for a non-Gaussian sequence to switch to a Gaussian patterned sequence. At the same time the incoming data point can be sent to a component that smooths out the data signal, thereby removing noise.
  • For each of the Gaussian processing paths, features including but not limited to the mean, the distance from the mean, the standard deviation, etc. can be received.
  • the preprocessing results can be sent to a pair of algorithms including the Z-test algorithm and an algorithm dedicated to processing data that follows a Gaussian distribution pattern.
  • each of the pairs of algorithms is trained (e.g., during the warm up phase) according to the data received from each processing path independently of the other processing paths.
  • the processing pathway that detected the anomalies can be determined. If the distribution of a stream is Gaussian and the anomalies were detected by one or more of the Gaussian processing paths, the anomaly event is raised. If the detected anomaly matches other distributions (discrete values, zeroes, binary) and the anomaly is detected by the out of range processing pathway, the anomaly event can also be allowed to propagate to the same or another performance store. All other detected anomalies can be ignored.
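  • Read literally, the propagation rule above amounts to a small decision table. The sketch below encodes it with hypothetical string labels for the distribution and the processing path; the labels are not taken from the source.

```python
GAUSSIAN_PATHS = {"edge", "spike", "pattern_error"}    # assumed labels
NON_GAUSSIAN = {"discrete", "zeroes", "binary"}        # assumed labels

def should_raise(distribution, path):
    """Return True if an anomaly found by `path` should be raised for this distribution."""
    if distribution == "gaussian" and path in GAUSSIAN_PATHS:
        return True
    if distribution in NON_GAUSSIAN and path == "out_of_range":
        return True
    return False  # all other detected anomalies are ignored

print(should_raise("gaussian", "spike"))       # True
print(should_raise("binary", "out_of_range"))  # True
print(should_raise("binary", "spike"))         # False
```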
  • the Z-test, linear prediction coding (for identifying errors in patterns), low-pass filter and the Kolmogorov-Smirnov test (algorithm for identifying the distance to the normalized Gaussian distribution) are standard algorithms well-described in the machine learning and signal processing literature.
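  • As a minimal sketch of the Kolmogorov-Smirnov idea mentioned above, the code below standardizes a sample and measures the largest gap between its empirical distribution function and the standard normal distribution function. The function names and the value returned for a constant sample are assumptions, and no significance test is performed.

```python
import math
from statistics import mean, pstdev

def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ks_distance_to_normal(values):
    """Largest gap between the standardized empirical CDF and the standard normal CDF."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return 1.0  # assumed convention for a constant sample
    z = sorted((v - mu) / sigma for v in values)
    n = len(z)
    d = 0.0
    for i, zi in enumerate(z):
        cdf = normal_cdf(zi)
        # Compare against the empirical CDF just after and just before this point.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

bell = [-1.5, -1, -0.5, -0.25, 0, 0.25, 0.5, 1, 1.5]
binary = [0] * 10 + [1] * 10
print(ks_distance_to_normal(bell) < ks_distance_to_normal(binary))  # True
```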
  • FIG. 1a illustrates an example of a system 100 that monitors a component and/or detects anomalies in time series produced by the component in accordance with aspects of the subject matter described herein. All or portions of system 100 may reside on one or more computers or computing devices such as the computers described below with respect to FIG. 3. System 100 or portions thereof may be provided as a stand-alone system or as a plug-in or add-in.
  • System 100 or portions thereof may include information obtained from a service (e.g., in the cloud) or may operate in a cloud computing environment.
  • a cloud computing environment can be an environment in which computing services are not owned but are provided on demand.
  • information may reside on multiple devices in a networked cloud and/or data can be stored on multiple devices within the cloud.
  • System 100 can include one or more computing devices such as, for example, computing device 102.
  • Contemplated computing devices include but are not limited to desktop computers, tablet computers, laptop computers, notebook computers, personal digital assistants, smart phones, cellular telephones, mobile telephones, and so on.
  • a computing device such as computing device 102 can include one or more processors such as processor 142, etc., and a memory such as memory 144 that communicates with the one or more processors.
  • System 100 can include one or more of the following items: a component such as component 150, a monitor such as monitor 152, data such as data 154, an anomaly detector such as anomaly detector 156, an anomaly store such as anomaly store 158 and/or a display 160.
  • System 100 can include one or more program modules comprising a monitor 152.
  • Monitor 152 can monitor execution or operation of a component such as component 150.
  • Monitor 152 can monitor component 150 continuously or intermittently or on demand.
  • Monitor 152 can collect data 154 produced by or associated with component 150.
  • Monitor 152 can be incorporated within anomaly detector 156.
  • Component 150 can be an application, process, operating system, piece of hardware, network, network switch or any component that produces or is accompanied by the production of data 154 including but not limited to performance counters.
  • Performance counters can include performance counters associated with CPU load, memory, network I/O and many other types of performance counters known to those of skill in the art.
  • the data 154 can include trace data.
  • the data 154 can include trace temporal sequences that can be viewed as time series of numbers or strings.
  • the data 154 can be used by the anomaly detector 156 to provide information about a component 150 such as whether component 150 is operating properly or is malfunctioning.
  • Data 154 can be collected continuously or intermittently or on demand or a combination thereof. Some or all of the data 154 collected by monitor 152 can be provided to an anomaly detector such as anomaly detector 156.
  • Anomaly detector 156 can process some or all of the data 154 received from monitor 152. Anomaly detector 156 can automatically detect values in time series data that are out of the normal range of expected values. Anomaly detector 156 can automatically detect values in time series data that are out of the normal range of expected values even in the absence of historical data that describes how the component 150 has behaved in the past. Anomaly detector 156 can automatically detect values in time series data that are out of the normal range of expected values dynamically, as the component is executing.
  • Anomaly detector 156 can automatically adapt to changes in the data produced by or associated with component 150.
  • Anomaly detector 156 can automatically detect anomalies in any performance counter in any application in real-time by continuously monitoring and evaluating performance counter data points.
  • Anomaly detector 156 can use the time series data to analyze a cause of a problem, predict a problem before it happens or react to a problem as quickly as the problem is detected.
  • Anomaly detector 156 can store information associated with anomalies in an anomaly store such as anomaly store 158.
  • Information associated with anomalies can be displayed on display 160. All or some of: component 150, monitor 152, data 154, anomaly detector 156, anomaly store 158 and display 160 can be on the same computing device or on different computing devices.
  • FIG. 1b illustrates a block diagram of an example of a system 101 that detects anomalies in arbitrary time series in accordance with aspects of the subject matter disclosed herein. All or portions of system 101 may reside on one or more computers or computing devices such as the computers described below with respect to FIG. 3. System 101 or portions thereof may be provided as a stand-alone system or as a plug-in or add-in.
  • System 101 or portions thereof may include information obtained from a service (e.g., in the cloud) or may operate in a cloud computing environment.
  • a cloud computing environment can be an environment in which computing services are not owned but are provided on demand.
  • information may reside on multiple devices in a networked cloud and/or data can be stored on multiple devices within the cloud.
  • System 101 can include one or more computing devices such as, for example, computing device 102.
  • Contemplated computing devices include but are not limited to desktop computers, tablet computers, laptop computers, notebook computers, personal digital assistants, smart phones, cellular telephones, mobile telephones, and so on.
  • a computing device such as computing device 102 can include one or more processors such as processor 142, etc., and a memory such as memory 144 that communicates with the one or more processors.
  • System 101 can include one or more program modules that comprise an anomaly detector such as anomaly detector 103.
  • Anomaly detector 103 can include one or more program modules (not shown) that extract performance counter data points from data. Alternatively, data extraction program modules may be external to the anomaly detector 103.
  • the data provided can come from a monitor such as but not limited to monitor 152 of FIG. 1a.
  • Anomaly detector 103 can include one or more program modules (not shown) that detect distribution of data. Alternatively, data distribution program modules may be external to the anomaly detector 103.
  • the distribution of a time series of a performance counter can be a description of the relative number of times each value occurs.
  • Anomaly detector 103 can include one or more program modules that perform a pre-processing phase such as pre-processing modules 104, one or more program modules that perform an anomaly detection phase such as anomaly detection phase modules 106 and one or more modules that perform a post-processing phase such as anomaly detection output analysis module 130.
  • Anomaly detector 103 can also include one or more anomaly stores such as performance counter store 132.
  • Pre-processing modules 104 can include one or more of the following: a data smoothing module 112, a Z-test module 114, a pattern error detection module 116, a spike detection module 118, and/or an edge detection module 120.
  • Anomaly detection phase modules 106 can include one or more of: a combined, Z-test and/or Gaussian distribution processing module such as Z-test, Gaussian module 122, and/or a distribution detector module 128. Results can be displayed on a display device such as display 126.
  • Preprocessing modules 104 can extract features from the data points.
  • Features can include features such as the distance of the current data point value to the average value of the time series, etc.
  • Anomaly detector 103 can receive data such as data 110.
  • Data 110 can be any time series data.
  • Data 110 can be data sequences including but not limited to performance counters produced by an executing or operating component such as component 150 of FIG. 1a.
  • Performance counters can include counters associated with CPU load, memory, network I/O (input/output) and so on. Performance counters can be received from applications, processes, operating systems, hardware components, network switches, and so on.
  • Data 110 can be received from a performance monitoring system that monitors one or more components such as executing applications (not shown), one or more processors such as processor 142, etc., memory such as memory 144, etc.
  • Data 110 can be received from a cloud computing platform and infrastructure, such as but not limited to AZURE created by Microsoft Corporation.
  • Data 110 can be received from one or more computers or computing devices.
  • Data 110 may be provided to a data smoothing module such as data smoothing module 112 that can reduce noise.
  • Data smoothing module 112 can be a low pass filter. Undesirable frequencies within the data 110 can be removed using a low pass filter based on a design such as but not limited to the design of the Chebyshev filter.
  • Chebyshev filters are analog or digital filters that minimize the error between the idealized and the actual filter characteristic over the range of the filter, but have ripples in the pass band.
  • the low-pass filter in accordance with some aspects of the subject matter described herein is part of the pre-processing phase.
  • the low-pass filter can filter out small variations in the input time series. (These small variations may otherwise deteriorate the performance results of the outlier processing pathway). Filtering using the low-pass filter can reduce the numbers of false positives.
  • a transverse filter of order 20 can be used.
  • a transverse filter instead of an auto-regressive filter can be chosen for its stability. Because no a priori statistical knowledge of the input time series may be available, an auto-regressive filter could become unstable for some input sequences.
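  • The stability argument can be illustrated with a minimal transverse (finite impulse response) low-pass filter of order 20. The equal-weight taps used here are an assumption made to keep the sketch short; they are not Chebyshev-derived coefficients and are not taken from the source.

```python
def fir_lowpass(samples, order=20):
    """Transverse (FIR) filter: each output is a weighted sum of the last order+1 inputs."""
    taps = [1.0 / (order + 1)] * (order + 1)  # assumed equal-weight coefficients
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:            # samples before the start are treated as zero
                acc += h * samples[n - k]
        out.append(acc)
    return out

# An impulse produces exactly order+1 non-zero outputs, then the response dies out.
impulse = [1.0] + [0.0] * 29
response = fir_lowpass(impulse)
print(response[21])  # 0.0
```

  • Because the output depends only on past inputs and never on past outputs, the response to any bounded input stays bounded, which is the property the passage above relies on when preferring a transverse filter over an auto-regressive one.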
  • Data 110 may simultaneously be provided to a Z-test module such as Z-test module 114.
  • multiple processing paths can run in parallel. Each of the processing paths detects a single type of anomaly.
  • a first type of processing path can process data that follows a Gaussian distribution pattern while a second type of processing path can process data that does not follow a Gaussian distribution.
  • Processing path 1 (data 110 to Z-test module 114 to distribution detection module 128) can detect an out-of-range anomaly based on individual data points.
  • an out of range anomaly can be detected (e.g., a value of a single data point is significantly out of range).
  • the out of range anomaly can be calculated per data point. If the absolute value of the difference between the value of the current data point and the average of the time series of the corresponding performance counter is larger than the threshold, an anomaly of type "Out of Range" can be raised.
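  • The out of range rule can be sketched as a per-point comparison against the series average. Deriving the threshold as three standard deviations of the warm-up data is an assumption made for this example; the source only says that a threshold is used.

```python
from statistics import mean, pstdev

def calibrate(warm_up, k=3.0):
    """Return (average, threshold). threshold = k * std is an assumed calibration."""
    return mean(warm_up), k * pstdev(warm_up)

def is_out_of_range(value, average, threshold):
    """True if |value - average| exceeds the threshold."""
    return abs(value - average) > threshold

warm_up = [10, 12, 11, 13, 12, 11, 10, 12, 11, 12]
average, threshold = calibrate(warm_up)
print(is_out_of_range(15, average, threshold))  # True
print(is_out_of_range(13, average, threshold))  # False
```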
  • Processing path 2 (data 110 to data smoothing module 112 to edge detection module 120) can detect an edge anomaly.
  • An edge anomaly refers to the occurrence of a sharp drop or sharp rise in the performance counter time series. If the absolute value of the drop or the rise is larger than the standard deviation of the performance counter time series multiplied by the threshold calculated during the warm-up phase, an anomaly of type "Edge Detection" can be raised.
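  • A minimal sketch of the edge rule follows: a step between consecutive (smoothed) values is flagged when its absolute size exceeds the standard deviation multiplied by the threshold. The sample values, the standard deviation of 1.0 and the threshold of 5.0 are assumptions for illustration.

```python
def find_edges(smoothed, series_std, threshold):
    """Return indices where |rise or drop| > series_std * threshold."""
    edges = []
    for i in range(1, len(smoothed)):
        if abs(smoothed[i] - smoothed[i - 1]) > series_std * threshold:
            edges.append(i)
    return edges

print(find_edges([50, 51, 50, 52, 80, 81], series_std=1.0, threshold=5.0))  # [4]
```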
  • Processing path 3 can detect a spike anomaly.
  • a spike anomaly corresponds to the presence of a positive or a negative spike in the performance counter sequence. If a sequence of the average values is identified as a spike, an anomaly of type "Spike Detection" can be raised. Technically, this is equivalent to comparing the second order derivative of the average values to a threshold calculated during the warm up phase. Exceeding the threshold indicates an anomaly.
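  • The equivalence with a second order derivative can be sketched with a discrete second difference of the averaged values. The threshold of 10 and the sample sequence are assumptions; in the source the threshold is calculated during the warm up phase.

```python
def second_difference(avg, i):
    """Discrete second order derivative at index i."""
    return avg[i + 1] - 2 * avg[i] + avg[i - 1]

def find_spikes(avg, threshold):
    """Return indices whose second difference exceeds the threshold in magnitude."""
    return [i for i in range(1, len(avg) - 1)
            if abs(second_difference(avg, i)) > threshold]

print(find_spikes([5, 5, 5, 12, 5, 5], threshold=10))  # [3]
```

  • Taking the absolute value covers both positive and negative spikes, as the passage above describes.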
  • Processing path 4 can detect a linear prediction error or pattern error anomaly.
  • a linear prediction error anomaly can be a result of the occurrence of an unexpected pattern in the performance counter sequence. This can be particularly significant when the signal is a repeated pattern.
  • the time series can be modeled as an auto-regressive process with Gaussian noise during the warm up phase. Assuming the sequence remains stationary during the whole execution phase, the auto-regressive process can predict an expected value for each upcoming data point. If the absolute value of the difference between the expected value and the actual value is larger than a threshold defined during the warm up phase, an anomaly of type "Prediction Error" can be raised.
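  • The prediction-error rule can be sketched with the simplest auto-regressive model, of order 1, fitted on the warm-up data by least squares. The model order, the threshold of 3.0 and all names are assumptions; the source does not state the order of the auto-regressive process.

```python
from statistics import mean

def fit_ar1(warm_up):
    """Fit x[t] - mu = phi * (x[t-1] - mu) + noise by least squares; return (mu, phi)."""
    mu = mean(warm_up)
    c = [x - mu for x in warm_up]
    num = sum(c[t] * c[t - 1] for t in range(1, len(c)))
    den = sum(v * v for v in c[:-1])
    phi = num / den if den else 0.0
    return mu, phi

def predict(mu, phi, previous):
    """Expected value of the next data point given the previous one."""
    return mu + phi * (previous - mu)

def is_prediction_error(actual, previous, mu, phi, threshold):
    return abs(actual - predict(mu, phi, previous)) > threshold

mu, phi = fit_ar1([10, 20] * 10)      # a strictly repeating pattern
print(is_prediction_error(10, 20, mu, phi, threshold=3.0))  # False: pattern continues
print(is_prediction_error(20, 20, mu, phi, threshold=3.0))  # True: pattern broken
```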
  • Algorithms used in combination for detecting anomalies in the anomaly detection phase can include: an algorithm dedicated to processing data that follows a Gaussian distribution and a Z-test algorithm.
  • the Gaussian algorithm can output a variable that indicates the probability of a data point being an inlier or an outlier. A decision can then be made based on the probability value to determine whether or not an event is raised.
  • a Z-test is a statistical test, based on calculating the distance between the actual value of the current point and the average value of the corresponding sequence in units of its standard deviation (known as z-score). The outcome is a Boolean value indicating whether the actual point is an outlier or not.
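  • One way to picture the combination of a Boolean and a probabilistic result is sketched below. The probability used here is the Gaussian mass lying closer to the mean than the observed value, which is a stand-in heuristic and not the Gaussian algorithm of the source; the cutoffs of 3.0 and 0.99 and the rule of requiring both results are likewise assumptions.

```python
import math

def outlier_probability(value, mu, sigma):
    """P(|Z| <= z) for a standard normal Z: near 1 for values far from the mean."""
    z = abs(value - mu) / sigma
    return math.erf(z / math.sqrt(2.0))

def score(value, mu, sigma, z_threshold=3.0, p_threshold=0.99):
    """Return the Boolean Z-test result, the probability, and the combined decision."""
    z_flag = abs(value - mu) / sigma > z_threshold
    p = outlier_probability(value, mu, sigma)
    return {"z_test": z_flag, "probability": p, "raise_event": z_flag and p > p_threshold}

print(score(70, mu=50, sigma=5)["raise_event"])  # True
print(score(55, mu=50, sigma=5)["raise_event"])  # False
```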
  • the algorithms involved in the pathways can calibrate their internal parameters based on the actual values of the corresponding performance counter sequence.
  • the warm up phase can enable the algorithms that analyze performance counters to determine what the typical or normal behavior of the counter is and compare against the normal distribution of the performance counter values.
  • the warm up phase enables accurate results to be obtained, reducing false positives. A false positive is when an anomaly is erroneously raised.
  • the warm up phase can also be called the training phase.
  • the warm up phase in accordance with aspects of the subject matter described herein is entirely automated. No human intervention or additional information is needed.
  • the pathways described herein can start detecting anomalies after the warm up phase ends.
  • the default duration of the warm up phase can be set to 30 minutes, i.e., approximately 30 data points (given a rate of 1 data point per minute).
  • the default warm up period can be changed by changing a value in a configuration file. Algorithms can be further trained and can continue to learn beyond the initial warm up phase.
  • the input for the warm up phase can be raw performance counter values.
  • the first configurable number of data points can be collected in a buffer and used to calibrate the respective anomaly detection algorithms.
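  • A minimal sketch of the warm up buffering described above, assuming a Z-test style detector: the first configurable number of data points is collected, the mean and standard deviation are calibrated from the buffer, and no anomaly decision is returned until calibration completes. The class name, the default of 30 points and the threshold value are hypothetical.

```python
from statistics import fmean, pstdev


class WarmUpDetector:
    """Collects warm-up points, calibrates once, then tests each new point."""

    def __init__(self, warm_up_points=30, z_threshold=3.0):
        self.warm_up_points = warm_up_points
        self.z_threshold = z_threshold  # assumed cutoff, not from the source
        self._buffer = []
        self.mean = None
        self.std = None

    @property
    def calibrated(self):
        return self.mean is not None

    def process(self, value):
        """Return None during warm up, otherwise True if value is an outlier."""
        if not self.calibrated:
            self._buffer.append(value)
            if len(self._buffer) >= self.warm_up_points:
                self.mean = fmean(self._buffer)
                self.std = pstdev(self._buffer)
                self._buffer.clear()
            return None  # no anomaly is raised while warming up
        if self.std == 0.0:
            return value != self.mean
        return abs(value - self.mean) / self.std > self.z_threshold
```

  • For example, with warm_up_points=4 the first four calls to process return None, and only later points are classified.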
  • the state of the anomaly detection algorithms can be represented by a few values of data type double. For some algorithms such as the Gaussian distribution algorithm, the state can be a set of 6 values of data type double and 3 of data type integer, while for the Z-test the state can be a set of 4 values of data type double. Persistence of this state can guarantee that the algorithms maintain knowledge of the entire history of the performance counter even beyond re-cycling and re-deployment.
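  • To illustrate how a small numeric state can carry the history of a counter across restarts, the sketch below keeps running statistics with Welford's online update and serializes them to JSON. The three state fields (count, mean, m2) and the JSON format are assumptions for illustration; the description does not specify which values make up the state of each algorithm.

```python
import json
import math


class RunningStats:
    """Running mean and variance held in three floating-point values."""

    def __init__(self, count=0.0, mean=0.0, m2=0.0):
        self.count, self.mean, self.m2 = count, mean, m2

    def update(self, value):
        """Welford's online update: no past data points need to be stored."""
        self.count += 1.0
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def std(self):
        return math.sqrt(self.m2 / self.count) if self.count else 0.0

    def to_json(self):
        """Serialize the state so it can be persisted between deployments."""
        return json.dumps({"count": self.count, "mean": self.mean, "m2": self.m2})

    @classmethod
    def from_json(cls, text):
        """Restore a detector state that was saved with to_json."""
        return cls(**json.loads(text))
```

  • A restored instance continues updating from the saved state, so statistics learned before a restart are not lost.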
  • Each set of pathways can detect anomalies in a single performance counter sequence of a single component.
  • one pathway set per performance counter can be deployed.
  • each data point can be processed sequentially. In accordance with some aspects of the subject matter described herein, no buffering is needed.
  • a sequence of data from an object is retrieved from a rule corresponding to a single performance counter and a single component.
  • Each data point of the sequence can be provided to the Z-test, which can determine whether the data point is an outlier.
  • Outliers can be sent to post-processing for anomaly detection output analysis.
  • the Z-test can indicate the result using a Boolean value (true or false). Inliers (normal values) can be sent to the distribution analysis component.
  • For the edge detection pathway, assume that a sequence of data from an object from a rule corresponding to a single performance counter and a single component is received.
  • the sequence can be buffered using a tumbling (re-use) window of 30 minutes. Assuming a sampling rate of one data point per minute, the buffer will comprise 30 data points.
  • the arithmetic average over the buffer can be calculated.
  • the sampling rate of the resulting sequence is one value per window duration, i.e., one value every 30 minutes.
  • the first derivative can be calculated, corresponding to the difference between the average of the values of the current buffer and the average of the values of the preceding buffer.
  • the result can be provided to the Z-test and Gaussian outlier detector that can determine whether the first derivative value is an outlier.
  • the Z-test can indicate the result using a Boolean value (true or false). Meanwhile the Gaussian outlier detector can infer the probability that the first derivative value corresponds to an outlier.
  • For the spike detection pathway, the second order derivative is calculated instead of the first order derivative. All the other components of the pathway are the same as in the edge detection pathway.
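  • The windowing and differencing steps of the edge and spike detection pathways can be sketched as follows. The function names are hypothetical, and dropping a trailing partial window is an assumption; the resulting difference values would then be passed to the Z-test and the Gaussian outlier detector.

```python
from statistics import fmean


def tumbling_averages(values, window_size):
    """Average each full, non-overlapping window; a trailing partial window is dropped."""
    return [
        fmean(values[i:i + window_size])
        for i in range(0, len(values) - window_size + 1, window_size)
    ]


def first_difference(series):
    """Current window average minus the preceding one (edge detection input)."""
    return [b - a for a, b in zip(series, series[1:])]


def second_difference(series):
    """Difference of the first differences (spike detection input)."""
    return first_difference(first_difference(series))


# Hypothetical counter values: one window jumps to 7 and then returns to 1.
SAMPLES = [1, 1, 1, 1, 1, 1, 7, 7, 7, 1, 1, 1]
AVERAGES = tumbling_averages(SAMPLES, 3)
```

  • A sustained step in the averages produces one large first difference, whereas a short-lived spike produces large second differences of opposite sign.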
  • the data can be processed in the same way as in the other pathways.
  • An additional algorithm can be used to determine prediction error over the sequence of the average values.
  • the average value determined per time window can be provided to the linear prediction coder that can generate the expected value.
  • the absolute value of the difference between this value and the actual value will be provided to the Z-test and the Gaussian outlier detector that can determine whether it is an outlier. Detection of an outlier can correspond to an erroneous pattern in the original sequence.
  • the state of the linear prediction coder can be stored in a one dimensional array of type double.
  • any policy can be applied to the results of the pathways to determine whether a data point indicates an anomaly.
  • One option is to utilize an anonymous policy in which the pathway characterizes a data point as abnormal before an anomaly event is raised.
  • Other policies can apply.
  • the Z-test algorithm can output a Boolean value (true or false).
  • the Gaussian algorithm can output a probability whose value ranges from 0 to 1. To make the anomaly detection more reliable, the following reasoning can be applied: if the probability is over a pre-defined threshold (e.g., above 65%) and the Z-test confirms that the data point is an anomaly, an anomaly event can be raised and sent for post processing.
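  • The combined decision rule above can be sketched as follows. The description does not state how the Gaussian algorithm computes its probability, so outlier_probability is only a stand-in (the share of normal probability mass lying closer to the mean than the observed value); the function names are hypothetical, and the 0.65 default mirrors the example threshold given above.

```python
import math


def outlier_probability(value, mean, std):
    """Stand-in score in [0, 1]: normal mass closer to the mean than value."""
    if std == 0.0:
        return 0.0 if value == mean else 1.0
    z = abs(value - mean) / std
    return math.erf(z / math.sqrt(2.0))


def should_raise(probability, z_test_outlier, probability_threshold=0.65):
    """Raise an event only when both detectors agree."""
    return probability > probability_threshold and z_test_outlier
```

  • Requiring agreement between the two detectors trades some sensitivity for fewer false positives.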
  • the raised anomaly event can include at least the following fields:
  • a time stamp. In the event that a window operation has been applied, the time stamp will correspond to the end of the window.
  • Each pipeline can raise its own anomaly events.
  • Post processing can be a final step in which the anomaly events raised from each pathway are aggregated.
  • the anomaly events can be combined into a single event which carries all the identifiers of the pathways in which it was detected.
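  • A minimal sketch of the aggregation step, assuming each raised event is a dictionary with hypothetical keys "component", "counter", "timestamp" and "pathway". Events that share the first three keys are merged into one event carrying the identifiers of all pathways that detected it.

```python
from collections import defaultdict


def aggregate_events(events):
    """Merge events for the same data point into one event listing all pathways."""
    grouped = defaultdict(list)
    for event in events:
        key = (event["component"], event["counter"], event["timestamp"])
        grouped[key].append(event["pathway"])
    return [
        {
            "component": component,
            "counter": counter,
            "timestamp": timestamp,
            "pathways": sorted(set(pathways)),
        }
        for (component, counter, timestamp), pathways in grouped.items()
    ]


# Hypothetical events raised by three pathways.
RAISED = [
    {"component": "store", "counter": "connections", "timestamp": 100, "pathway": "spike"},
    {"component": "store", "counter": "connections", "timestamp": 100, "pathway": "out_of_range"},
    {"component": "store", "counter": "connections", "timestamp": 130, "pathway": "edge"},
]
MERGED = aggregate_events(RAISED)
```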
  • FIG. 2a illustrates an example of a method 200 for unsupervised anomaly detection in accordance with aspects of the subject matter described herein.
  • the method described in FIG. 2a can be practiced by a system such as but not limited to the ones described with respect to FIGs. 1a and 1b. While method 200 describes a series of operations that are performed in a sequence, it is to be understood that method 200 is not limited by the order of the sequence depicted. For instance, some operations may occur in a different order than that described. In addition, one operation may occur concurrently with another operation. In some instances, not all operations described are performed.
  • a component can be monitored, as described more fully above.
  • the component can be associated with or can produce data including performance counters.
  • Performance counter data can be extracted from the data produced by the component.
  • the data can be viewed as a time series of numbers or strings.
  • the data can be received by an anomaly detector as described more fully above.
  • Performance counter data points can be extracted from the data received.
  • the data distribution that the data follows can be determined.
  • a Z-test algorithm can be applied to the raw data.
  • the standard deviation of the inliers can be compared to a threshold that can be dynamically determined during a warm up phase.
  • the standard deviation of the inliers will remain zero, i.e., below the threshold and no outlier is raised.
  • the incoming data points can be sent in parallel to both the Z-test algorithm (out of range detection) and to a data smoothing module as described more fully above.
  • pre-processing can be performed.
  • features can be extracted from the data points.
  • Features can include the distance of the current data point value to the average value of the time series.
  • anomalies can be detected.
  • four different types of anomalies can be detected using four analytics pipelines during anomaly detection.
  • the four pipelines can execute in parallel.
  • Each pipeline can take incoming performance counter time series as input.
  • each pipeline can attempt to learn the normal behavior of the time series in an initial warm up phase.
  • anomaly alarms are not raised.
  • incoming data points can be used to train the combination Z-test/Gaussian distribution algorithms.
  • the warm up phase can typically last from 30 to 60 minutes.
  • the analytics pipelines can start detecting anomalies.
  • Edge detection, spike detection and pattern error pipelines can work with data whose time series follows a Gaussian distribution pattern. Data whose time series does not follow a Gaussian distribution can be processed by the out of range pipeline. If the result of the anomaly detection operation is a positive (an anomaly has been identified), the anomaly can be sent to post processing at operation 210. In the event that a data point is characterized by more than one anomaly (e.g., a data point value corresponds to a spike and to an out of range value), post-processing operations can ensure that only one anomaly event is raised, which can reduce noise.
  • results can be provided. Results can be provided in visual or electronic format.
  • FIG. 2b illustrates an example of anomaly results 250 that can be displayed for a selected time interval 02/5 7:30am 252 to 02/5 7:45am 254.
  • Anomaly results can be provided in any tangible form.
  • a table 256 can include metrics that can be tracked for a component 258 ("Performance Metric Store").
  • traceable metrics can include, for example: %Total Processor Time 260, % Agent/Host Processor Time 261, Available Bytes 262, Request Execution Time (Total) 263, % Processor Time 264, %Processor Time (Total) 265, Connections Established (Average) 266, Connections Established (Total) 267, Thread Count 268, Requests Executing 269 and so on.
  • the detected anomalies comprise abnormal behavior in the executing application: before the first anomaly at area 273, the total number of network connections established by the computers running the application is hovering under 1000. At point 274, indicating the first displayed anomaly, the number of connections increased sharply to more than 5000 and then sharply dropped back after just two minutes at point 275, indicating the second anomaly. Something happened that may require the attention of an owner or operator of the application. It will be appreciated that the display illustrated in FIG. 2b is a non-limiting example of a display that can be provided to a user to provide information concerning detected anomalies.
  • FIG. 3 and the following discussion are intended to provide a brief general description of a suitable computing environment 510 in which various embodiments of the subject matter disclosed herein may be implemented. While the subject matter disclosed herein is described in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other computing devices, those skilled in the art will recognize that portions of the subject matter disclosed herein can also be implemented in combination with other program modules and/or a combination of hardware and software. Generally, program modules include routines, programs, objects, physical artifacts, data structures, etc. that perform particular tasks or implement particular data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
  • the computing environment 510 is only one example of a suitable operating environment and is not intended to limit the scope of use or functionality of the subject matter disclosed herein.
  • Computer 512 may include at least one processing unit 514, a system memory 516, and a system bus 518.
  • the at least one processing unit 514 can execute instructions that are stored in a memory such as but not limited to system memory 516.
  • the processing unit 514 can be any of various available processors.
  • the processing unit 514 can be a graphics processing unit (GPU).
  • the instructions can be instructions for implementing functionality carried out by one or more components or modules discussed above or instructions for implementing one or more of the methods described above. Dual microprocessors and other multiprocessor architectures also can be employed as the processing unit 514.
  • the computer 512 may be used in a system that supports rendering graphics on a display screen. In another example, at least a portion of the computing device can be used in a system that comprises a graphical processing unit.
  • the system memory 516 may include volatile memory 520 and nonvolatile memory 522.
  • Nonvolatile memory 522 can include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM) or flash memory.
  • Volatile memory 520 may include random access memory (RAM) which may act as external cache memory.
  • the system bus 518 couples system physical artifacts including the system memory 516 to the processing unit 514.
  • the system bus 518 can be any of several types including a memory bus, memory controller, peripheral bus, external bus, or local bus and may use any variety of available bus architectures.
  • Computer 512 may include a data store accessible by the processing unit 514 by way of the system bus 518.
  • the data store may include executable instructions, 3D models, materials, textures and so on for graphics rendering.
  • Computer 512 typically includes a variety of computer readable media such as volatile and nonvolatile media, removable and non-removable media.
  • Computer readable media may be implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • Computer readable media include computer-readable storage media (also referred to as computer storage media) and communications media.
  • Computer storage media includes physical (tangible) media, such as but not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CDROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices that can store the desired data and which can be accessed by computer 512.
  • Communications media include media such as, but not limited to, communications signals, modulated carrier waves or any other intangible media which can be used to communicate the desired information and which can be accessed by computer 512.
  • FIG. 3 describes software that can act as an intermediary between users and computer resources.
  • This software may include an operating system 528 which can be stored on disk storage 524, and which can allocate resources of the computer 512.
  • Disk storage 524 may be a hard disk drive connected to the system bus 518 through a non-removable memory interface such as interface 526.
  • System applications 530 take advantage of the management of resources by operating system 528 through program modules 532 and program data 534 stored either in system memory 516 or on disk storage 524. It will be appreciated that computers can be implemented with various operating systems or combinations of operating systems.
  • a user can enter commands or information into the computer 512 through an input device(s) 536.
  • Input devices 536 include but are not limited to a pointing device such as a mouse, trackball, stylus, touch pad, keyboard, microphone, voice recognition and gesture recognition systems and the like. These and other input devices connect to the processing unit 514 through the system bus 518 via interface port(s) 538.
  • An interface port(s) 538 may represent a serial port, parallel port, universal serial bus (USB) and the like.
  • Output devices(s) 540 may use the same type of ports as do the input devices.
  • Output adapter 542 is provided to illustrate that there are some output devices 540 like monitors, speakers and printers that require particular adapters.
  • Output adapters 542 include but are not limited to video and sound cards that provide a connection between the output device 540 and the system bus 518.
  • Other devices and/or systems such as remote computer(s) 544 may provide both input and output capabilities.
  • Computer 512 can operate in a networked environment using logical connections to one or more remote computers, such as a remote computer(s) 544.
  • the remote computer 544 can be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 512, although only a memory storage device 546 has been illustrated in FIG. 3.
  • Remote computer(s) 544 can be logically connected via
  • Network interface 548 encompasses communication networks such as local area networks (LANs) and wide area networks (WANs) but may also include other networks.
  • Communication connection(s) 550 refers to the
  • Communication connection(s) 550 may be internal to or external to computer 512 and include internal and external technologies such as modems (telephone, cable, DSL and wireless) and ISDN adapters, Ethernet cards and so on.
  • a computer 512 or other client device can be deployed as part of a computer network.
  • the subject matter disclosed herein may pertain to any computer system having any number of memory or storage units, and any number of applications and processes occurring across any number of storage units or volumes.
  • aspects of the subject matter disclosed herein may apply to an environment with server computers and client computers deployed in a network environment, having remote or local storage.
  • aspects of the subject matter disclosed herein may also apply to a standalone computing device, having programming language functionality, interpretation and execution capabilities.
  • the various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination of both.
  • the methods and apparatus described herein, or certain aspects or portions thereof may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing aspects of the subject matter disclosed herein.
  • the term "machine-readable storage medium" shall be taken to exclude any mechanism that provides (i.e., stores and/or transmits) any form of propagated signals.
  • the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device.
  • One or more programs that may utilize the creation and/or implementation of domain-specific programming models aspects, e.g., through the use of a data processing API or the like, may be implemented in a high level procedural or object oriented programming language to communicate with a computer system.
  • the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language, and combined with hardware implementations.
PCT/US2015/020314 2014-03-18 2015-03-13 Unsupervised anomaly detection for arbitrary time series WO2015142627A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP20150593.0A EP3671466B1 (en) 2014-03-18 2015-03-13 Unsupervised anomaly detection for arbitrary time series
CN201580014770.4A CN106104496B (zh) 2014-03-18 2015-03-13 用于任意时序的不受监督的异常检测
EP15714070.8A EP3120248B1 (en) 2014-03-18 2015-03-13 Unsupervised anomaly detection for arbitrary time series

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/218,119 2014-03-18
US14/218,119 US9652354B2 (en) 2014-03-18 2014-03-18 Unsupervised anomaly detection for arbitrary time series

Publications (1)

Publication Number Publication Date
WO2015142627A1 (en) 2015-09-24

Family

ID=52808130

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/020314 WO2015142627A1 (en) 2014-03-18 2015-03-13 Unsupervised anomaly detection for arbitrary time series

Country Status (4)

Country Link
US (1) US9652354B2 (zh)
EP (2) EP3120248B1 (zh)
CN (1) CN106104496B (zh)
WO (1) WO2015142627A1 (zh)


Families Citing this family (130)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10270748B2 (en) 2013-03-22 2019-04-23 Nok Nok Labs, Inc. Advanced authentication techniques and applications
US9887983B2 (en) 2013-10-29 2018-02-06 Nok Nok Labs, Inc. Apparatus and method for implementing composite authenticators
US9396320B2 (en) 2013-03-22 2016-07-19 Nok Nok Labs, Inc. System and method for non-intrusive, privacy-preserving authentication
US9961077B2 (en) 2013-05-30 2018-05-01 Nok Nok Labs, Inc. System and method for biometric authentication with device attestation
DE102014208034A1 (de) * 2014-04-29 2015-10-29 Siemens Aktiengesellschaft Verfahren zum Bereitstellen von zuverlässigen Sensordaten
US9654469B1 (en) 2014-05-02 2017-05-16 Nok Nok Labs, Inc. Web-based user authentication techniques and applications
US9577999B1 (en) 2014-05-02 2017-02-21 Nok Nok Labs, Inc. Enhanced security for registration of authentication devices
US9413533B1 (en) 2014-05-02 2016-08-09 Nok Nok Labs, Inc. System and method for authorizing a new authenticator
US9779361B2 (en) * 2014-06-05 2017-10-03 Mitsubishi Electric Research Laboratories, Inc. Method for learning exemplars for anomaly detection
US10148630B2 (en) 2014-07-31 2018-12-04 Nok Nok Labs, Inc. System and method for implementing a hosted authentication service
US9875347B2 (en) * 2014-07-31 2018-01-23 Nok Nok Labs, Inc. System and method for performing authentication using data analytics
US9749131B2 (en) 2014-07-31 2017-08-29 Nok Nok Labs, Inc. System and method for implementing a one-time-password using asymmetric cryptography
US9455979B2 (en) 2014-07-31 2016-09-27 Nok Nok Labs, Inc. System and method for establishing trust using secure transmission protocols
US9736154B2 (en) 2014-09-16 2017-08-15 Nok Nok Labs, Inc. System and method for integrating an authentication service within a network architecture
US10284584B2 (en) * 2014-11-06 2019-05-07 International Business Machines Corporation Methods and systems for improving beaconing detection algorithms
US10439898B2 (en) * 2014-12-19 2019-10-08 Infosys Limited Measuring affinity bands for pro-active performance management
US9811562B2 (en) * 2015-02-25 2017-11-07 FactorChain Inc. Event context management system
US10270668B1 (en) * 2015-03-23 2019-04-23 Amazon Technologies, Inc. Identifying correlated events in a distributed system according to operational metrics
US10193912B2 (en) * 2015-05-28 2019-01-29 Cisco Technology, Inc. Warm-start with knowledge and data based grace period for live anomaly detection systems
JP2018525728A (ja) * 2015-07-14 2018-09-06 サイオス テクノロジー コーポレーションSios Technology Corporation コンピュータ環境からのストリーミングデータセットを分析するための分散型機械学習分析フレームワーク
US10057142B2 (en) * 2015-08-19 2018-08-21 Microsoft Technology Licensing, Llc Diagnostic framework in computing systems
US9838409B2 (en) * 2015-10-08 2017-12-05 Cisco Technology, Inc. Cold start mechanism to prevent compromise of automatic anomaly detection systems
FR3044437B1 (fr) * 2015-11-27 2018-09-21 Bull Sas Procede et systeme d'aide a la maintenance et a l'optimisation d'un supercalculateur
US10866939B2 (en) * 2015-11-30 2020-12-15 Micro Focus Llc Alignment and deduplication of time-series datasets
US10146826B1 (en) * 2015-12-28 2018-12-04 EMC IP Holding Company LLC Storage array testing
US9960952B2 (en) 2016-03-17 2018-05-01 Ca, Inc. Proactive fault detection and diagnostics for networked node transaction events
US10783532B2 (en) * 2016-04-06 2020-09-22 Chicago Mercantile Exchange Inc. Detection and mitigation of effects of high velocity value changes based upon match event outcomes
US10372524B2 (en) * 2016-07-28 2019-08-06 Western Digital Technologies, Inc. Storage anomaly detection
US10769635B2 (en) 2016-08-05 2020-09-08 Nok Nok Labs, Inc. Authentication techniques including speech and/or lip movement analysis
US10637853B2 (en) 2016-08-05 2020-04-28 Nok Nok Labs, Inc. Authentication techniques including speech and/or lip movement analysis
US10565046B2 (en) * 2016-09-01 2020-02-18 Intel Corporation Fault detection using data distribution characteristics
CN106371939B (zh) * 2016-09-12 2019-03-22 山东大学 一种时序数据异常检测方法及其系统
US10565513B2 (en) * 2016-09-19 2020-02-18 Applied Materials, Inc. Time-series fault detection, fault classification, and transition analysis using a K-nearest-neighbor and logistic regression approach
WO2018124672A1 (en) 2016-12-28 2018-07-05 Samsung Electronics Co., Ltd. Apparatus for detecting anomaly and operating method for the same
US10237070B2 (en) 2016-12-31 2019-03-19 Nok Nok Labs, Inc. System and method for sharing keys across authenticators
US10091195B2 (en) 2016-12-31 2018-10-02 Nok Nok Labs, Inc. System and method for bootstrapping a user binding
US10992693B2 (en) * 2017-02-09 2021-04-27 Microsoft Technology Licensing, Llc Near real-time detection of suspicious outbound traffic
US10326787B2 (en) 2017-02-15 2019-06-18 Microsoft Technology Licensing, Llc System and method for detecting anomalies including detection and removal of outliers associated with network traffic to cloud applications
TW201904265A (zh) * 2017-03-31 2019-01-16 加拿大商艾維吉隆股份有限公司 異常運動偵測方法及系統
EP3407273A1 (de) * 2017-05-22 2018-11-28 Siemens Aktiengesellschaft Verfahren und anordnung zur ermittlung eines anomalen zustands eines systems
US11062792B2 (en) 2017-07-18 2021-07-13 Analytics For Life Inc. Discovering genomes to use in machine learning techniques
US11139048B2 (en) 2017-07-18 2021-10-05 Analytics For Life Inc. Discovering novel features to use in machine learning techniques, such as machine learning techniques for diagnosing medical conditions
CN107526667B (zh) 2017-07-28 2020-04-28 阿里巴巴集团控股有限公司 一种指标异常检测方法、装置以及电子设备
US10699040B2 (en) * 2017-08-07 2020-06-30 The Boeing Company System and method for remaining useful life determination
WO2019040534A1 (en) * 2017-08-22 2019-02-28 Curemetrix, Inc. DEVICES, SYSTEMS AND METHODS FOR GENERATING SYNTHETIC 2D IMAGES
US10587484B2 (en) * 2017-09-12 2020-03-10 Cisco Technology, Inc. Anomaly detection and reporting in a network assurance appliance
CN109542740B (zh) * 2017-09-22 2022-05-27 阿里巴巴集团控股有限公司 异常检测方法及装置
US11275975B2 (en) * 2017-10-05 2022-03-15 Applied Materials, Inc. Fault detection classification
KR102339239B1 (ko) * 2017-10-13 2021-12-14 후아웨이 테크놀러지 컴퍼니 리미티드 클라우드-디바이스 협업적 실시간 사용자 사용 및 성능 비정상 검출을 위한 시스템 및 방법
FR3074316B1 (fr) * 2017-11-27 2021-04-09 Bull Sas Procede et dispositif de surveillance d'un processus generateur de donnees d'une metrique pour la prediction d'anomalies
US11868995B2 (en) 2017-11-27 2024-01-09 Nok Nok Labs, Inc. Extending a secure key storage for transaction confirmation and cryptocurrency
JP6878260B2 (ja) * 2017-11-30 2021-05-26 パラマウントベッド株式会社 異常判定装置、プログラム
CN109976986B (zh) * 2017-12-28 2023-12-19 阿里巴巴集团控股有限公司 异常设备的检测方法及装置
CN108387342A (zh) * 2018-01-08 2018-08-10 联创汽车电子有限公司 Eps非接触式扭矩传感器故障识别系统及其识别方法
US11831409B2 (en) 2018-01-12 2023-11-28 Nok Nok Labs, Inc. System and method for binding verifiable claims
US11481571B2 (en) 2018-01-12 2022-10-25 Microsoft Technology Licensing, Llc Automated localized machine learning training
EP3514555B1 (en) * 2018-01-22 2020-07-22 Siemens Aktiengesellschaft Apparatus for monitoring an actuator system, method for providing an apparatus for monitoring an actuator system and method for monitoring an actuator system
US11368474B2 (en) * 2018-01-23 2022-06-21 Rapid7, Inc. Detecting anomalous internet behavior
US11209807B2 (en) * 2018-01-26 2021-12-28 Ge Inspection Technologies, Lp Anomaly detection
AT520746B1 (de) * 2018-02-20 2019-07-15 Ait Austrian Inst Tech Gmbh Verfahren zur Erkennung von anormalen Betriebszuständen
US10628289B2 (en) * 2018-03-26 2020-04-21 Ca, Inc. Multivariate path-based anomaly prediction
US20190334759A1 (en) * 2018-04-26 2019-10-31 Microsoft Technology Licensing, Llc Unsupervised anomaly detection for identifying anomalies in data
US11860971B2 (en) * 2018-05-24 2024-01-02 International Business Machines Corporation Anomaly detection
CN108875367B (zh) * 2018-06-13 2020-06-16 曙光星云信息技术(北京)有限公司 一种基于时序的云计算智能安全系统
US10904113B2 (en) 2018-06-26 2021-01-26 Microsoft Technology Licensing, Llc Insight ranking based on detected time-series changes
US11362910B2 (en) 2018-07-17 2022-06-14 International Business Machines Corporation Distributed machine learning for anomaly detection
US11768936B2 (en) * 2018-07-31 2023-09-26 EMC IP Holding Company LLC Anomaly-based ransomware detection for encrypted files
CN110874674B (zh) * 2018-08-29 2023-06-27 阿里巴巴集团控股有限公司 一种异常检测方法、装置及设备
US10776196B2 (en) * 2018-08-29 2020-09-15 International Business Machines Corporation Systems and methods for anomaly detection in a distributed computing system
US11061915B2 (en) * 2018-10-25 2021-07-13 Palo Alto Research Center Incorporated System and method for anomaly characterization based on joint historical and time-series analysis
EP3663951B1 (en) * 2018-12-03 2021-09-15 British Telecommunications public limited company Multi factor network anomaly detection
WO2020114921A1 (en) 2018-12-03 2020-06-11 British Telecommunications Public Limited Company Detecting vulnerability change in software systems
WO2020122287A1 (ko) * 2018-12-13 2020-06-18 주식회사 알고리고 미세 분포 변화를 이용한 비정상 데이터 구분 장치 및 방법
EP3871120A1 (en) * 2018-12-17 2021-09-01 Huawei Technologies Co., Ltd. Apparatus and method for detecting an anomaly among successive events and computer program product therefor
US10901746B2 (en) * 2018-12-20 2021-01-26 Microsoft Technology Licensing, Llc Automatic anomaly detection in computer processing pipelines
EP3681124B8 (en) 2019-01-09 2022-02-16 British Telecommunications public limited company Anomalous network node behaviour identification using deterministic path walking
US20200234321A1 (en) * 2019-01-23 2020-07-23 General Electric Company Cost analysis system and method for detecting anomalous cost signals
JP2022523563A (ja) * 2019-03-04 2022-04-25 アイオーカレンツ, インコーポレイテッド 機械学習および人工知能を使用する、機械異常の近リアルタイム検出ならびに分類
US11277424B2 (en) * 2019-03-08 2022-03-15 Cisco Technology, Inc. Anomaly detection for a networking device based on monitoring related sets of counters
US11720461B2 (en) 2019-03-12 2023-08-08 Microsoft Technology Licensing, Llc Automated detection of code regressions from time-series data
DE102019107363B4 (de) * 2019-03-22 2023-02-09 Schaeffler Technologies AG & Co. KG Verfahren und System zum Bestimmen einer Eigenschaft einer Maschine, insbesondere einer Werkzeugmaschine, ohne messtechnisches Erfassen der Eigenschaft sowie Verfahren zum Bestimmen eines voraussichtlichen Qualitätszustands eines mit einer Maschine gefertigten Bauteils
US11792024B2 (en) 2019-03-29 2023-10-17 Nok Nok Labs, Inc. System and method for efficient challenge-response authentication
CN110245844B (zh) * 2019-05-27 2023-03-28 创新先进技术有限公司 异常指标检测方法及装置
US11448570B2 (en) * 2019-06-04 2022-09-20 Palo Alto Research Center Incorporated Method and system for unsupervised anomaly detection and accountability with majority voting for high-dimensional sensor data
US11237897B2 (en) 2019-07-25 2022-02-01 International Business Machines Corporation Detecting and responding to an anomaly in an event log
US11521254B2 (en) * 2019-08-08 2022-12-06 Ebay Inc. Automatic tuning of machine learning parameters for non-stationary e-commerce data
US11086948B2 (en) 2019-08-22 2021-08-10 Yandex Europe Ag Method and system for determining abnormal crowd-sourced label
US11710137B2 (en) 2019-08-23 2023-07-25 Yandex Europe Ag Method and system for identifying electronic devices of genuine customers of organizations
US11347718B2 (en) 2019-09-04 2022-05-31 Optum Services (Ireland) Limited Manifold-anomaly detection with axis parallel explanations
US11941502B2 (en) 2019-09-04 2024-03-26 Optum Services (Ireland) Limited Manifold-anomaly detection with axis parallel
US11108802B2 (en) 2019-09-05 2021-08-31 Yandex Europe Ag Method of and system for identifying abnormal site visits
RU2757007C2 (ru) 2019-09-05 2021-10-08 Общество С Ограниченной Ответственностью «Яндекс» Method and system for determining malicious actions of a certain kind
US11334559B2 (en) 2019-09-09 2022-05-17 Yandex Europe Ag Method of and system for identifying abnormal rating activity
US11128645B2 (en) 2019-09-09 2021-09-21 Yandex Europe Ag Method and system for detecting fraudulent access to web resource
CN112819491B (zh) * 2019-11-15 2024-02-09 百度在线网络技术(北京)有限公司 Method and apparatus for processing conversion data, electronic device, and storage medium
US11363059B2 (en) * 2019-12-13 2022-06-14 Microsoft Technology Licensing, Llc Detection of brute force attacks
RU2752241C2 (ru) 2019-12-25 2021-07-23 Общество С Ограниченной Ответственностью «Яндекс» Method and system for identifying malicious activity of a predetermined type in a local network
US20210209486A1 (en) * 2020-01-08 2021-07-08 Intuit Inc. System and method for anomaly detection for time series data
WO2021155576A1 (en) * 2020-02-07 2021-08-12 Alibaba Group Holding Limited Automatic parameter tuning for anomaly detection system
US11716337B2 (en) * 2020-02-10 2023-08-01 IronNet Cybersecurity, Inc. Systems and methods of malware detection
US11620581B2 (en) 2020-03-06 2023-04-04 International Business Machines Corporation Modification of machine learning model ensembles based on user feedback
US11374953B2 (en) * 2020-03-06 2022-06-28 International Business Machines Corporation Hybrid machine learning to detect anomalies
US11422992B2 (en) 2020-03-16 2022-08-23 Accenture Global Solutions Limited Auto reinforced anomaly detection
US11851096B2 (en) * 2020-04-01 2023-12-26 Siemens Mobility, Inc. Anomaly detection using machine learning
US11575697B2 (en) 2020-04-30 2023-02-07 Kyndryl, Inc. Anomaly detection using an ensemble of models
US11645558B2 (en) * 2020-05-08 2023-05-09 International Business Machines Corporation Automatic mapping of records without configuration information
US11095544B1 (en) * 2020-06-17 2021-08-17 Adobe Inc. Robust anomaly and change detection utilizing sparse decomposition
US11243986B1 (en) 2020-07-21 2022-02-08 International Business Machines Corporation Method for proactive trouble-shooting of provisioning workflows for efficient cloud operations
US11231985B1 (en) 2020-07-21 2022-01-25 International Business Machines Corporation Approach to automated detection of dominant errors in cloud provisions
US11768915B1 (en) 2020-08-03 2023-09-26 Amdocs Development Limited System, method, and computer program for anomaly detection in time-series data with mixed seasonality
US11748568B1 (en) 2020-08-07 2023-09-05 Amazon Technologies, Inc. Machine learning-based selection of metrics for anomaly detection
KR20220019560A (ko) 2020-08-10 2022-02-17 삼성전자주식회사 Apparatus and method for network monitoring
EP3958100A1 (en) * 2020-08-21 2022-02-23 Microsoft Technology Licensing, LLC Anomaly detection for sensor systems
US11374919B2 (en) * 2020-11-18 2022-06-28 Okta, Inc. Memory-free anomaly detection for risk management systems
CN112416643A (zh) * 2020-11-26 2021-02-26 清华大学 Unsupervised anomaly detection method and device
CN112464248A (zh) * 2020-12-04 2021-03-09 中国科学院信息工程研究所 Method and device for detecting processor vulnerability exploitation threats
WO2022215063A1 (en) * 2021-04-04 2022-10-13 Stardat Data Science Ltd. A machine learning model blind-spot detection system and method
CN113139158B (zh) * 2021-04-21 2023-05-05 国网安徽省电力有限公司 Method and system for monitoring and correcting abnormal COMTRADE recorded-wave data based on Gaussian process regression
US11640387B2 (en) * 2021-04-23 2023-05-02 Capital One Services, Llc Anomaly detection data workflow for time series data
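KR102525187B1 (ko) * 2021-05-12 2023-04-24 네이버클라우드 주식회사 Method and system for time-series-based anomaly detection
CN113342610B (zh) * 2021-06-11 2023-10-13 北京奇艺世纪科技有限公司 Time-series data anomaly detection method and apparatus, electronic device, and storage medium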
US11595283B2 (en) * 2021-07-26 2023-02-28 Cisco Technology, Inc. Message bus subscription management with telemetry inform message
US11727011B2 (en) 2021-08-24 2023-08-15 Target Brands, Inc. Data analysis tool with precalculated metrics
CN116125499B (zh) * 2021-11-12 2024-04-09 北京六分科技有限公司 Method, device, and system for detecting intermediate-frequency data
US11902127B2 (en) 2021-11-23 2024-02-13 Cisco Technology, Inc. Automatic detection and tracking of anomalous rectifiable paths using time-series dynamics
WO2023107937A1 (en) * 2021-12-06 2023-06-15 Jpmorgan Chase Bank, N.A. Systems and methods for collecting and processing application telemetry
CN114689968A (zh) * 2022-03-16 2022-07-01 河南翔宇医疗设备股份有限公司 Filtering method in electromagnetic compatibility testing and related device
US20230349608A1 (en) * 2022-04-29 2023-11-02 Fortive Corporation Anomaly detection for refrigeration systems
WO2024030588A1 (en) * 2022-08-03 2024-02-08 Aviatrix Systems, Inc. Systems and methods for improved monitoring features for of a network topology and corresponding user interfaces
CN116975545A (zh) * 2023-07-31 2023-10-31 长沙穗城轨道交通有限公司 Platform door anomaly detection method and apparatus, electronic device, and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012020329A1 (en) * 2010-04-15 2012-02-16 Caplan Software Development S.R.L. Automated upgrading method for capacity of it system resources

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110178967A1 (en) 2001-05-24 2011-07-21 Test Advantage, Inc. Methods and apparatus for data analysis
JP3821225B2 (ja) 2002-07-17 2006-09-13 日本電気株式会社 Autoregressive model learning device for time-series data, and outlier and change-point detection device using the same
US8185348B2 (en) 2003-10-31 2012-05-22 Hewlett-Packard Development Company, L.P. Techniques for monitoring a data stream
WO2007002838A2 (en) 2005-06-29 2007-01-04 Trustees Of Boston University Whole-network anomaly diagnosis
US20070289013A1 (en) 2006-06-08 2007-12-13 Keng Leng Albert Lim Method and system for anomaly detection using a collective set of unsupervised machine-learning algorithms
US20080103729A1 (en) * 2006-10-31 2008-05-01 Microsoft Corporation Distributed detection with diagnosis
US7788198B2 (en) * 2006-12-14 2010-08-31 Microsoft Corporation Method for detecting anomalies in server behavior using operational performance and failure mode monitoring counters
US7917338B2 (en) 2007-01-08 2011-03-29 International Business Machines Corporation Determining a window size for outlier detection
US7716011B2 (en) * 2007-02-28 2010-05-11 Microsoft Corporation Strategies for identifying anomalies in time-series data
US7620523B2 (en) 2007-04-30 2009-11-17 Integrien Corporation Nonparametric method for determination of anomalous event states in complex systems exhibiting non-stationarity
GB201012519D0 (en) 2010-07-26 2010-09-08 Ucl Business Plc Method and system for anomaly detection in data sets
US8949677B1 (en) * 2012-05-23 2015-02-03 Amazon Technologies, Inc. Detecting anomalies in time series data
US8914317B2 (en) * 2012-06-28 2014-12-16 International Business Machines Corporation Detecting anomalies in real-time in multiple time series data with automated thresholding
CN102945320A (zh) * 2012-10-29 2013-02-27 河海大学 Time series data anomaly detection method and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
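CN106776480A (zh) * 2015-11-25 2017-05-31 中国电力科学研究院 Method for eliminating outliers in on-site radio interference measurements
CN106776480B (zh) * 2015-11-25 2019-07-19 中国电力科学研究院 Method for eliminating outliers in on-site radio interference measurements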
US20190228353A1 (en) * 2018-01-19 2019-07-25 EMC IP Holding Company LLC Competition-based tool for anomaly detection of business process time series in it environments

Also Published As

Publication number Publication date
US20150269050A1 (en) 2015-09-24
EP3671466A1 (en) 2020-06-24
CN106104496A (zh) 2016-11-09
EP3120248B1 (en) 2020-01-08
CN106104496B (zh) 2019-07-30
EP3671466B1 (en) 2022-01-05
EP3120248A1 (en) 2017-01-25
US9652354B2 (en) 2017-05-16

Similar Documents

Publication Publication Date Title
EP3120248B1 (en) Unsupervised anomaly detection for arbitrary time series
CN110708204B (zh) Anomaly handling method, system, terminal, and medium based on an operations and maintenance knowledge base
US10069900B2 (en) Systems and methods for adaptive thresholding using maximum concentration intervals
CN107666410B (zh) Network security analysis system and method
US10585774B2 (en) Detection of misbehaving components for large scale distributed systems
US11294754B2 (en) System and method for contextual event sequence analysis
US10373065B2 (en) Generating database cluster health alerts using machine learning
US20170364818A1 (en) Automatic condition monitoring and anomaly detection for predictive maintenance
EP3079337A1 (en) Event correlation across heterogeneous operations
WO2020134032A1 (zh) Method and apparatus for detecting anomalies in a business system
US9369364B2 (en) System for analysing network traffic and a method thereof
US20130086431A1 (en) Multiple modeling paradigm for predictive analytics
CA2931624A1 (en) Systems and methods for event detection and diagnosis
CN110750429A (zh) Anomaly detection method, apparatus, device, and storage medium for an operations and maintenance management system
US10838791B1 (en) Robust event prediction
CN111666187A (zh) Method and apparatus for detecting abnormal response time
WO2017214613A1 (en) Streaming data decision-making using distributions with noise reduction
JP7086230B2 (ja) Protocol-independent anomaly detection
US20220413481A1 (en) Geometric aging data reduction for machine learning applications
US8806313B1 (en) Amplitude-based anomaly detection
WO2022115419A1 (en) Method of detecting an anomaly in a system
US10740458B2 (en) System and method for high frequency heuristic data acquisition and analytics of information security events
US11269706B2 (en) System and method for alarm correlation and aggregation in IT monitoring
CN111258845A (zh) Detection of event storms
CN113656207B (zh) Fault handling method and apparatus, electronic device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 15714070; Country of ref document: EP; Kind code of ref document: A1)
DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)
REEP Request for entry into the european phase (Ref document number: 2015714070; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 2015714070; Country of ref document: EP)
NENP Non-entry into the national phase (Ref country code: DE)