WO2021155576A1 - Automatic parameter tuning for anomaly detection system - Google Patents

Automatic parameter tuning for anomaly detection system

Info

Publication number
WO2021155576A1
Authority
WO
WIPO (PCT)
Prior art keywords
detection system
anomaly detection
anomaly
new
data points
Prior art date
Application number
PCT/CN2020/074509
Other languages
French (fr)
Inventor
Jingkun Gao
Xiaomin Song
Yan Li
Liang Sun
Shan REN
Xingming XU
Original Assignee
Alibaba Group Holding Limited
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Limited filed Critical Alibaba Group Holding Limited
Priority to PCT/CN2020/074509 priority Critical patent/WO2021155576A1/en
Priority to CN202080094902.XA priority patent/CN115315689A/en
Publication of WO2021155576A1 publication Critical patent/WO2021155576A1/en

Classifications

    • H04L 41/142 — Network analysis or design using statistical or mathematical methods
    • G06F 11/3006 — Monitoring arrangements specially adapted to the computing system or computing system component being monitored, where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G06F 11/3452 — Performance evaluation by statistical analysis
    • G06N 20/00 — Machine learning
    • H04L 41/0823 — Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • H04L 43/0817 — Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters, by checking availability by checking functioning
    • H04L 43/50 — Testing arrangements
    • G06F 11/302 — Monitoring arrangements specially adapted to the computing system or computing system component being monitored, where the computing system component is a software system
    • G06F 11/3072 — Monitoring arrangements determined by the means or processing involved in reporting the monitored data, where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting

Definitions

  • an anomaly detection system may be deployed and configured to monitor performance metrics (such as a percentage of CPU usage, a percentage of memory usage, etc. ) of a plurality of servers in a cloud computing architecture, and detect any occurrence of various types of abnormality or anomaly (e.g., a sudden jump or spike in an amount of traffic, a failure of a certain server, etc. ) in the cloud computing architecture.
  • FIG. 1 illustrates an example environment in which a parameter tuning system may be used.
  • FIG. 2 illustrates an example anomaly detection system.
  • FIG. 3 illustrates an example parameter tuning system.
  • FIG. 4 illustrates an example parameter tuning method.
  • existing anomaly detection systems require human experts to manually or semi-manually tune parameters of the anomaly detection systems to set the optimal parameter configurations necessary for detecting anomalies in time series data associated with day-to-day operations, such as the performance of servers provided in a cloud. It would be impractical and inefficient to have human experts set the optimal parameters manually for each metric, which limits the ability of the anomaly detection system to scale up to monitor a large number of machines provided in the computer system.
  • the parameter tuning system may automatically tune and configure parameters associated with an anomaly detection system to obtain an optimal configuration of parameters usable for anomaly detection in the anomaly detection system, and further adaptively adjust the parameters when new time series data is collected by the anomaly detection system.
  • the parameter tuning system may obtain a set of different value combinations for one or more parameters associated with an anomaly detection system, and obtain one or more time series monitored by the anomaly detection system.
  • each data point of the one or more time series may be marked with a label indicating whether an anomaly is present, absent, or not yet determined.
  • the parameter tuning system may assign a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and apply the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination.
  • the parameter tuning system may then calculate a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric, and select a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different parameter combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  • the parameter tuning system may calculate the performance score based on predicted labels and labels provided by users.
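The tuning procedure described in the bullets above (assign each candidate value combination, run the detector, score against user labels, keep the best) can be sketched as follows. This is an illustrative sketch only: `detect`, `tune`, and the F1-style scoring metric are hypothetical stand-ins, not the patent's actual predetermined evaluation metric or detector interface.

```python
# Sketch of the tuning loop: try each candidate value combination, run the
# detector on labeled data points, and keep the combination with the
# highest performance score. All names here are illustrative assumptions.
from itertools import product

def f1_score(predicted, actual):
    # Simple F1 over binary anomaly labels; points the users left
    # "undecided" are not counted toward any cell.
    tp = sum(1 for p, a in zip(predicted, actual) if a == "anomaly" and p == "anomaly")
    fp = sum(1 for p, a in zip(predicted, actual) if a == "non-anomaly" and p == "anomaly")
    fn = sum(1 for p, a in zip(predicted, actual) if a == "anomaly" and p == "non-anomaly")
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def tune(detect, candidate_values, series, labels):
    """Return the value combination with the highest performance score.

    detect(series, combo) -> list of predicted labels, one per data point.
    candidate_values: one list of candidate values per parameter.
    """
    best_combo, best_score = None, -1.0
    for combo in product(*candidate_values):
        predicted = detect(series, combo)
        score = f1_score(predicted, labels)
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score
```

In this sketch the detector is treated as a black box, mirroring the separation between the parameter tuning system and the anomaly detection system described above.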
  • functions described herein to be performed by the parameter tuning system may be performed by multiple separate units or services.
  • an acquisition service may obtain a set of different value combinations for one or more parameters associated with an anomaly detection system, and obtain one or more time series monitored by the anomaly detection system, while a detection service may assign a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and apply the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination.
  • An evaluation service may calculate a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric, and select a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different parameter combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  • the parameter tuning system may be implemented as a combination of software and hardware installed in a single device, in other examples, the parameter tuning system may be implemented and distributed in multiple devices or as services provided in one or more computing devices over a network and/or in a cloud computing architecture.
  • the application describes multiple and varied embodiments and implementations.
  • the following section describes an example framework that is suitable for practicing various implementations.
  • the application describes example systems, devices, and processes for implementing a parameter tuning system.
  • FIG. 1 illustrates an example environment 100 usable to implement a parameter tuning system.
  • the environment 100 may include a parameter tuning system 102.
  • the parameter tuning system 102 is described to exist as an individual entity.
  • the parameter tuning system 102 may include one or more servers 104.
  • the parameter tuning system 102 may be included as a part of the one or more servers 104, or distributed among the one or more servers 104, which communicate data with one another via a network 106.
  • a first server of the one or more servers 104 may include part of the functions of the parameter tuning system 102, while other functions of the parameter tuning system 102 may be included in a second server of the one or more servers 104.
  • some or all the functions of the parameter tuning system 102 may be included in a cloud computing system or architecture, and may be provided as services for determining or recommending suitable or optimal parameter configuration for anomaly detection.
  • the parameter tuning system 102 may be a part of a client device 108, e.g., software and/or hardware components of the client device 108. In some instances, the parameter tuning system 102 may include a client device 108.
  • the environment 100 may further include an anomaly detection system 110.
  • the parameter tuning system 102 may be included in the anomaly detection system 110, and provide services to the anomaly detection system 110.
  • some or all the functions of the parameter tuning system 102 may be included and provided in the one or more servers 104, the client device 108, and/or the anomaly detection system 110, which communicate with each other via the network 106.
  • the client device 108 may be implemented as any of a variety of computing devices including, but not limited to, a desktop computer, a notebook or portable computer, a handheld device, a netbook, an Internet appliance, a tablet or slate computer, a mobile device (e.g., a mobile phone, a personal digital assistant, a smart phone, etc. ) , etc., or a combination thereof.
  • the network 106 may be a wireless or a wired network, or a combination thereof.
  • the network 106 may be a collection of individual networks interconnected with each other and functioning as a single large network (e.g., the Internet or an intranet) . Examples of such individual networks include, but are not limited to, telephone networks, cable networks, Local Area Networks (LANs) , Wide Area Networks (WANs) , and Metropolitan Area Networks (MANs) . Further, the individual networks may be wireless or wired networks, or a combination thereof.
  • Wired networks may include an electrical carrier connection (such as a communication cable, etc. ) and/or an optical carrier or connection (such as an optical fiber connection, etc. ) .
  • Wireless networks may include, for example, a WiFi network, other radio frequency networks (e.g., Zigbee, etc. ) , etc.
  • the parameter tuning system 102 may receive one or more parameters to be tuned and time series data with labeled data points from the anomaly detection system 110. The parameter tuning system 102 may then generate a tuning space based on different value combinations of the one or more parameters to be tuned and the labeled data points of the time series data, and perform anomaly detection on the labeled data points using the different value combinations of the one or more parameters. The parameter tuning system 102 may evaluate respective performances of the different value combinations of the one or more parameters based on a predetermined evaluation metric, and determine a particular value combination to be recommended to the anomaly detection system as a recommended value combination for the one or more parameters.
  • FIG. 2 illustrates the anomaly detection system 110 in more detail.
  • the anomaly detection system 110 may include, but is not limited to, one or more processors 202, an input/output (I/O) interface 204, and/or a network interface 206, and memory 208.
  • the processors 202 may be configured to execute instructions that are stored in the memory 208, and/or received from the input/output interface 204, and/or the network interface 206.
  • the processors 202 may be implemented as one or more hardware processors including, for example, a microprocessor, an application-specific instruction-set processor, a physics processing unit (PPU) , a central processing unit (CPU) , a graphics processing unit, a digital signal processor, a tensor processing unit, etc. Additionally or alternatively, the functionality described herein can be performed, at least in part, by one or more hardware logic components, such as field-programmable gate arrays (FPGAs) , application-specific integrated circuits (ASICs) , application-specific standard products (ASSPs) , system-on-a-chip systems (SOCs) , and complex programmable logic devices (CPLDs) .
  • the memory 208 may include computer readable media in a form of volatile memory, such as Random Access Memory (RAM) and/or non-volatile memory, such as read only memory (ROM) or flash RAM.
  • the computer readable media may include volatile or non-volatile, removable or non-removable media, which may achieve storage of information using any method or technology.
  • the information may include a computer readable instruction, a data structure, a program module or other data.
  • Examples of computer readable media include, but are not limited to, phase-change memory (PRAM) , static random access memory (SRAM) , dynamic random access memory (DRAM) , other types of random-access memory (RAM) , read-only memory (ROM) , electronically erasable programmable read-only memory (EEPROM) , quick flash memory or other internal storage technology, compact disk read-only memory (CD-ROM) , digital versatile disc (DVD) or other optical storage, magnetic cassette tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission media, which may be used to store information that may be accessed by a computing device.
  • the computer readable media does not include any transitory media, such as modulated data signals and carrier waves.
  • the anomaly detection system 110 may further include other hardware components and/or other software components such as program modules 210 to execute instructions stored in the memory 208 for performing various operations, and program data 212 for storing data associated with anomaly detection, such as data points of one or more time series, values of parameters associated with anomaly detection, etc.
  • the program modules 210 may include a data preprocessing module 214 configured to check and clean an input time series, such as checking timestamps of data points in the input time series, checking any missing data points, performing an interpolation to fill in a missing data point, etc.
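The checking and cleaning performed by the data preprocessing module 214 can be illustrated with a small sketch. This is a minimal toy implementation under assumed conventions (a fixed sampling `step` in seconds, linear interpolation for gaps), not the patent's actual preprocessing logic.

```python
# Minimal preprocessing sketch: verify that timestamps form a regular grid
# and linearly interpolate any missing data points. The function name and
# the fixed sampling interval `step` are illustrative assumptions.
def preprocess(points, step=60):
    """points: list of (timestamp, value) tuples sorted by timestamp.

    Returns a list with gaps filled by linear interpolation.
    """
    cleaned = [points[0]]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        gap = (t1 - t0) // step  # number of sampling intervals in this span
        for i in range(1, gap):
            # Fill each missing timestamp with a linearly interpolated value.
            frac = i / gap
            cleaned.append((t0 + i * step, v0 + frac * (v1 - v0)))
        cleaned.append((t1, v1))
    return cleaned
```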
  • the program modules 210 may further include a classification module 216 configured to separate data of at least some portions of the input time series into different types based on respective data patterns included in the at least some portions of the input time series.
  • the program modules 210 may further include a transformation module 218 configured to process the input time series by, for example, denoising and smoothing the input time series, and decomposing the input time series into different components including, but not limited to, a trend component, a seasonal component, a residual component, etc.
  • the program modules 210 may further include a detection module 220 configured to detect and determine whether an anomaly occurs in the input time series by applying one or more statistical hypothesis tests (such as a T-test, an F-test, and/or an MK test) on the different components obtained by the transformation module 218.
  • FIG. 3 illustrates the example parameter tuning system 102 in more detail.
  • the parameter tuning system 102 may include, but is not limited to, one or more processors 302, an input/output (I/O) interface 304, a network interface 306, and memory 308.
  • the memory 308 may include computer readable media as described in the foregoing description.
  • the parameter tuning system 102 may further include an evaluator 310 and a tuner 312.
  • the parameter tuning system 102 may be implemented using hardware, for example, an ASIC (i.e., Application-Specific Integrated Circuit) , a FPGA (i.e., Field-Programmable Gate Array) , and/or other hardware.
  • the parameter tuning system 102 is described to exist as a separate entity.
  • some or all of the functions of the parameter tuning system 102 may be included in the anomaly detection system 110, the one or more servers 104, and/or the client device 108.
  • the parameter tuning system 102 may further include other hardware components and/or other software components such as program modules 314 to execute instructions stored in the memory 308 for performing various operations, and program data 316 for storing data associated with parameter tuning, such as labeled data points of one or more time series, value combinations of parameters associated with the anomaly detection system 110, etc.
  • FIG. 4 shows a schematic diagram depicting an example method of parameter tuning.
  • the method of FIG. 4 may, but need not, be implemented in the environment of FIG. 1 and using the systems of FIGS. 2 and 3.
  • a method 400 is described with reference to FIGS. 1-3.
  • the method 400 may alternatively be implemented in other environments and/or using other systems.
  • the method 400 is described in the general context of computer-executable instructions.
  • computer-executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, and the like that perform particular functions or implement particular abstract data types.
  • each of the example methods are illustrated as a collection of blocks in a logical flow graph representing a sequence of operations that can be implemented in hardware, software, firmware, or a combination thereof.
  • the order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method, or alternate methods. Additionally, individual blocks may be omitted from the method without departing from the spirit and scope of the subject matter described herein.
  • the blocks represent computer instructions that, when executed by one or more processors, perform the recited operations.
  • some or all of the blocks may represent application specific integrated circuits (ASICs) or other physical components that perform the recited operations.
  • the parameter tuning system 102 may receive information of one or more parameters to be tuned for an anomaly detection system and one or more labeled time series.
  • the parameter tuning system 102 may be connected to or associated with one or more anomaly detection systems (e.g., the anomaly detection system 110) , and may communicate data with the one or more anomaly detection systems through the network 106.
  • the parameter tuning system 102 may receive information of the one or more parameters to be tuned from the anomaly detection system 110 for automatic parameter tuning.
  • the parameter tuning system 102 may receive one or more labeled time series including labeled data points from the anomaly detection system 110.
  • the one or more labeled time series may correspond to performance data (such as a percentage of CPU usage, a percentage of memory usage, etc. ) of a computer system that is collected and monitored by the anomaly detection system 110 on a regular basis.
  • the one or more parameters to be tuned may include, but are not limited to, parameters associated with one or more statistical hypothesis tests that are used for anomaly detection.
  • parameters associated with a statistical test may include a length of a time series (a window size including a data point to be tested) , a direction of the statistical test (e.g., a one-sided test or a two-sided test) , a significance level of the statistical test, etc.
  • the one or more statistical hypothesis tests may include, but are not limited to, a T-test, an F-test, an MK-test, etc.
  • significance levels of one or more statistical hypothesis tests involved in anomaly detection are used as an example of the one or more parameters to be tuned hereinafter.
  • other types of parameters associated with the anomaly detection system such as a length of a time series (a window size including a data point to be tested) , a direction of the statistical test (e.g., a one-sided test or a two-sided test) , etc., may also be tuned and configured in a similar manner.
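The per-test parameters listed above (window size, test direction, significance level) can be gathered into a small configuration object. The container below is purely hypothetical; its field names and default values are illustrative assumptions, not values taken from the patent.

```python
# Hypothetical container for the tunable per-test parameters described
# above. Defaults are placeholders, not recommended settings.
from dataclasses import dataclass

@dataclass
class TestConfig:
    window_size: int = 100           # length of the series around the tested point
    two_sided: bool = True           # direction of the statistical test
    significance_level: float = 0.01 # threshold the p-value is compared against
```

The tuner would then search over candidate values of these fields, with the significance level being the parameter used as the running example hereinafter.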
  • the information of the one or more parameters to be tuned may include initial or default values of the one or more parameters used in the anomaly detection system 110, and these initial or default values of the one or more parameters may not be optimal or may be outdated, and thus need to be tuned or refined.
  • the anomaly detection system 110 may first automatically separate data in at least some portions of the one or more time series into different types (e.g., anomaly types such as a spike, a level shift, a mean change, etc. ) through the classification module 216 based on respective data patterns in the at least some portions of the one or more time series using active learning-based approaches and/or through user selection. The anomaly detection system 110 may then select representative samples from respective data of each type. In implementations, each representative sample may include a predetermined number of data points in a time series.
  • the anomaly detection system 110 may then transform the representative samples of each type through the transformation module 218, and perform anomaly detection (e.g., applying corresponding one or more statistical hypothesis tests) on these representative samples of each type through the detection module 220 to generate anomaly scores for these representative samples.
  • the anomaly scores for these representative samples may be related to probability values (i.e., p-values) of these representative samples generated for the one or more statistical hypothesis tests.
  • an anomaly score for a representative sample under a statistical test may be the negative base-10 logarithm of the p-value generated for that representative sample under the statistical test.
  • a p-value or probability value is defined as a probability of obtaining test results at least as extreme as results that are actually observed during the statistical hypothesis testing.
  • the anomaly detection system 110 may compare the p-values of the representative samples generated for the one or more statistical hypothesis tests with respective significance levels of the one or more statistical hypothesis tests to determine whether an anomaly occurs in each of the representative samples.
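The scoring convention and the comparison described above reduce to two one-line functions. The function names are illustrative, but the relationships they encode (score = -log10(p), anomaly flagged when p falls below the significance level) follow directly from the passage.

```python
# Anomaly score as the negative base-10 logarithm of the p-value, and the
# p-value-versus-significance-level comparison used to flag an anomaly.
import math

def anomaly_score(p_value):
    return -math.log10(p_value)

def is_anomaly(p_value, significance_level):
    # Smaller p-values mean more extreme observations, hence anomalies.
    return p_value < significance_level
```

For example, a p-value of 0.01 yields an anomaly score of 2, and is flagged under a significance level of 0.05 but not under 0.001.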
  • the anomaly detection system 110 may provide the representative samples to one or more users, and request feedback from the one or more users for labeling at least some data points in the representative samples as anomaly points and/or non-anomaly points. In some implementations, the anomaly detection system 110 may further provide the anomaly scores of the representative samples to one or more users as a further reference. Based on the feedback from the one or more users, the representative samples may be labeled accordingly. For example, the one or more users may label data points in the representative samples whether an anomaly is present or absent. In implementations, each data point of the representative samples may be marked with a label indicating whether an anomaly is present, absent, or not determined.
  • a data point that is determined by a user as an anomaly point is marked with a label (e.g., anomaly) indicating that an anomaly is present.
  • a data point that is determined by a user as a non-anomaly point is marked with a label (e.g., non-anomaly) indicating that an anomaly is absent.
  • a data point that is not determined by any user as either an anomaly point or a non-anomaly point is marked with a label (e.g., undecided) indicating that an anomaly is undecided or not determined yet.
  • the anomaly detection system 110 may select a portion of the representative samples and respective anomaly scores (e.g., the p-values) , and send this selected portion of the representative samples and the respective anomaly scores to the parameter tuning system 102 as at least a portion of the data associated with the one or more parameters to be tuned.
  • the parameter tuning system 102 may generate a set of different value combinations for the one or more parameters associated with the anomaly detection system based on the received information of the one or more parameters to be tuned.
  • the parameter tuning system 102 may obtain respective candidate values of the one or more parameters from the received information of the one or more parameters to be tuned.
  • the one or more parameters are described to be one or more significance levels of one or more statistical hypothesis tests, and therefore the respective candidate values of the one or more parameters may include the respective p-values, obtained under the one or more statistical hypothesis tests, of data points marked with a label indicating that an anomaly is present.
  • the parameter tuning system 102 may combine the respective candidate values of the one or more parameters to form a set of different value combinations of the one or more parameters to be tuned. For example, suppose N data points are detected and labeled as anomaly points or normal points under a first statistical test T1, with corresponding p-values p1_1, ..., p1_N generated for the first statistical test T1; M data points are detected and labeled to have an anomaly under a second statistical test T2, with corresponding p-values p2_1, ..., p2_M generated for the second statistical test T2; and K data points are detected and labeled to have an anomaly under a third statistical test T3, with corresponding p-values p3_1, ..., p3_K generated for the third statistical test T3. The set of different value combinations of the one or more parameters (in this example, there are three parameters, namely, the significance levels of the first, second, and third statistical hypothesis tests) generated by the parameter tuning system 102 may then include N × M × K different value combinations (p1_n, p2_m, p3_k) , wherein 1 ≤ n ≤ N, 1 ≤ m ≤ M, and 1 ≤ k ≤ K.
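The N × M × K structure of the tuning space described above can be reproduced directly with a Cartesian product. The candidate p-values below are made-up illustrative numbers, not values from the patent.

```python
# Building the set of value combinations as the Cartesian product of the
# candidate significance levels drawn from each test's p-values.
from itertools import product

p_values_t1 = [0.05, 0.01]         # N = 2 candidate levels for test T1 (illustrative)
p_values_t2 = [0.05, 0.01, 0.001]  # M = 3 candidate levels for test T2
p_values_t3 = [0.02]               # K = 1 candidate level for test T3

combinations = list(product(p_values_t1, p_values_t2, p_values_t3))
# The tuning space holds N * M * K = 2 * 3 * 1 = 6 combinations.
```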
  • the parameter tuning system 102 may further generate a tuning space from the set of different value combinations for the one or more parameters to be tuned and the one or more labeled time series. For example, after generating the set of different value combinations for the one or more parameters to be tuned, the parameter tuning system 102 may further generate a tuning space using the set of different value combinations for the one or more parameters to be tuned and the one or more labeled time series.
  • the tuning space may include a parameter space consisting of the set of different value combinations and a data space consisting of the one or more labeled time series.
  • the parameter tuning system 102 may send the set of different value combinations for the one or more parameters to be tuned to the anomaly detection system for performing anomaly detection on the one or more labeled time series based on each value combination of the one or more parameters to be tuned.
  • the parameter tuning system 102 may send the set of different value combinations to the anomaly detection system 110 to cause the anomaly detection system 110 to perform anomaly detection on data points of the one or more labeled time series based on each value combination (in this example, a combination of candidate values for significance levels of the statistical hypothesis tests) , and obtain predicted results or labels for the one or more labeled time series.
  • the parameter tuning system 102 may receive predicted results or labels of data points of the one or more time series from the anomaly detection system.
  • the anomaly detection system 110 may return predicted results or labels of the data points of the one or more labeled time series to the parameter tuning system 102.
  • each predicted result or label may include a label indicating whether an anomaly is predicted to be present or absent.
  • the parameter tuning system 102 may calculate a respective performance score of each value combination of the set of different value combinations for the one or more parameters to be tuned.
  • the parameter tuning system 102 or the evaluator 310 of the parameter tuning system 102 may evaluate a respective performance score of each value combination of the set of different value combinations for the one or more parameters to be tuned based on the predicted results or labels of the data points of the one or more labeled time series and user labels of the data points of the one or more labeled time series made by the one or more users.
  • the parameter tuning system 102 may calculate an enhanced confusion matrix as shown in the following Table 1.
  • the parameter tuning system 102 may further employ a relaxed mode of anomaly matching between the predicted labels and the user labels of the data points of the one or more labeled time series.
  • a data window of a predetermined size (i.e., a predetermined integer such as 2, 3, 4, etc.) may be used to accommodate a pattern anomaly in which a number of consecutive data points are labeled as anomalies.
  • if a data point labeled with a predicted anomaly is off (i.e., earlier or later) by a few data points compared to a data point that is labeled as an anomaly by a user, the parameter tuning system 102 may still consider the data point labeled with the predicted anomaly to be matched with the data point labeled with the user anomaly, and therefore counts this data point toward TP.
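The relaxed matching mode can be sketched as follows; the function name and window default are illustrative assumptions, not terms from this disclosure:

```python
def relaxed_true_positives(predicted, user_labeled, window=2):
    """Count predicted anomaly indices that fall within `window` data points
    of some not-yet-matched user-labeled anomaly index (relaxed matching)."""
    matched = set()
    tp = 0
    for p in predicted:
        for u in user_labeled:
            if u not in matched and abs(p - u) <= window:
                matched.add(u)  # each user label matches at most one prediction
                tp += 1
                break
    return tp

# A prediction at index 12 is off by 2 from the user label at index 10,
# but still counts as a true positive under the relaxed mode.
print(relaxed_true_positives([12, 30], [10, 50], window=2))  # prints 1
```

Predictions that fall outside every window (index 30 above) are left to be counted as false positives by the enhanced confusion matrix.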
  • the parameter tuning system 102 may calculate a predetermined evaluation metric for each value combination.
  • the predetermined evaluation metric may include, but is not limited to, a recall score and a precision score.
  • a recall score and a precision score for each value combination may be calculated using the following equations respectively:
  • the parameter tuning system 102 may obtain a performance score for each value combination based on the predetermined evaluation metric.
  • the predetermined evaluation metric may include, for example, an F_β score, which may be defined as follows: F_β = (1 + β²) · (precision × recall) / (β² · precision + recall).
  • β is a positive real number, and is chosen such that the recall score is considered to be β times as important as the precision score.
  • the parameter tuning system 102 may choose β as one, such that the F_β score becomes the F_1 score, with the recall score as important as the precision score.
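The metric computation above can be sketched as follows. These are the standard precision, recall, and F_β forms; the enhanced confusion matrix described earlier may additionally adjust the counts for data points whose label is not yet determined, which this sketch does not model:

```python
def precision_recall(tp, fp, fn):
    # Standard forms; the enhanced confusion matrix may further adjust the
    # counts for data points with a "not yet determined" label.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def f_beta(precision, recall, beta=1.0):
    # F_beta: recall is weighted beta times as important as precision.
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

p, r = precision_recall(tp=8, fp=2, fn=2)
score = f_beta(p, r, beta=1.0)  # beta = 1 reduces F_beta to F_1
```

With tp=8, fp=2, fn=2 both precision and recall are 0.8, so the F_1 score is also 0.8.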
  • the parameter tuning system 102 may select or determine a particular value combination from among the set of different value combinations as a recommended value combination for the one or more parameters to be tuned to the anomaly detection system based on the calculated performance score of each value combination of the set of different value combinations for the one or more parameters to be tuned.
  • the parameter tuning system 102 or the tuner 312 of the parameter tuning system 102 may select a value combination that leads to the highest performance score as a recommended value combination for the one or more parameters to be tuned to the anomaly detection system 110.
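The selection step is essentially a grid search over the tuning space, which can be sketched as below; the function names and the toy detector/evaluator stand-ins are hypothetical, not part of this disclosure:

```python
def tune(combinations, detect, evaluate):
    """Grid-search sketch: run the detector with each value combination and
    keep the combination that yields the highest performance score."""
    best_combo, best_score = None, float("-inf")
    for combo in combinations:
        predicted = detect(combo)    # predicted labels from the anomaly detection system
        score = evaluate(predicted)  # e.g., an F_1 score against user labels
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score

# Toy stand-ins for the detector and the evaluator:
combos = [(0.01,), (0.05,), (0.1,)]
best, score = tune(combos, detect=lambda c: c, evaluate=lambda pred: 1.0 - pred[0])
print(best, score)  # prints (0.01,) 0.99
```

In the described system, `detect` corresponds to the anomaly detection system 110 running with the assigned parameter values, and `evaluate` corresponds to the evaluator 310 scoring the predicted labels against the user labels.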
  • the parameter tuning system 102 may request the one or more users to provide further feedback on the one or more time series, for example, labeling additional data points of the one or more time series about whether these additional data points are anomaly points or non-anomaly points, etc.
  • the parameter tuning system 102 may then iteratively perform the operations as described above (e.g., blocks 402 – 412) to obtain a better value combination for the one or more parameters to be tuned.
  • the parameter tuning system 102 may wait for one or more newly labeled time series or new data points of the one or more labeled time series from the anomaly detection system, and automatically perform a new iteration for tuning the one or more parameters for the anomaly detection system.
  • the parameter tuning system 102 may automatically perform a new iteration for tuning the one or more parameters, i.e., repeating the parameter tuning operations as described above in blocks 404 –412.
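A new iteration can fold the p-values of newly user-labeled anomaly points into the candidate significance levels (as in Clause 5 below). A minimal sketch, with an assumed function name:

```python
def update_candidates(candidates, new_p_values):
    """Sketch: merge the p-values computed for newly labeled anomaly points
    into the existing candidate significance levels, deduplicated and sorted,
    before the next tuning iteration runs."""
    return sorted(set(candidates) | set(new_p_values))

print(update_candidates([0.01, 0.05], [0.02, 0.05]))  # prints [0.01, 0.02, 0.05]
```

The enlarged candidate list then regenerates the set of value combinations, and blocks 404 – 412 are repeated.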
  • the parameter tuning system 102 may receive one or more labeled time series from another anomaly detection system and information of one or more parameters associated with this anomaly detection system. The tuning parameter system may then perform parameter tuning operations as described above in blocks 404 –412 for this anomaly detection system.
  • Clause 1 A method implemented by one or more computing devices, the method comprising: generating a set of different value combinations for one or more parameters associated with an anomaly detection system; obtaining one or more time series, each data point of the one or more time series being marked with a label indicating whether an anomaly is present, absent, or not yet determined; assigning a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric; and selecting a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different parameter combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  • Clause 2 The method of Clause 1, wherein the one or more parameters associated with the anomaly detection system comprise one or more parameters used in one or more statistical hypothesis tests for detecting one or more anomaly types in the anomaly detection system.
  • Clause 3 The method of Clause 2, wherein the one or more parameters used in the one or more statistical hypothesis tests comprise one or more significance levels used in the one or more statistical hypothesis tests.
  • Clause 4 The method of Clause 3, wherein the set of different value combinations for the one or more parameters associated with the anomaly detection system comprise respective probability values generated for a plurality of data points marked with a label indicating that an anomaly is present.
  • Clause 5 The method of Clause 1, further comprising: obtaining new data points of the one or more time series, the new data points comprising at least one data point marked with a label indicating that an anomaly is present; calculating a probability value for the at least one data point; using the calculated probability value for the at least one data point as a value of a parameter of the one or more parameters associated with the anomaly detection system; and incorporating the calculated probability value into the set of different value combinations to form a new set of different value combinations for the one or more parameters associated with the anomaly detection system.
  • Clause 6 The method of Clause 5, further comprising: assigning a respective value combination of the new set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a new subset of data points of the one or more time series to obtain predicted labels of the new subset of data points for each value combination of the new set of different value combinations; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on the predetermined evaluation metric; and selecting a new parameter combination corresponding to a new highest performance score of the anomaly detection system from among the new set of different parameter combinations as a new recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  • Clause 7 The method of Clause 1, wherein the performance score of the anomaly detection system comprises a F-1 score that is calculated based on a recall score and a precision score included in the evaluation metric.
  • Clause 8 The method of Clause 7, wherein the precision score depends on a number of correct anomaly predictions, a number of incorrect anomaly predictions, and a number of testing data points having a label indicating that an anomaly is not yet determined.
  • Clause 9 The method of Clause 1, further comprising receiving feedback from one or more users to update labels associated with a plurality of data points of the one or more time series.
  • Clause 10 One or more computer readable media storing executable instructions that, when executed by one or more processors, cause the one or more processors to perform acts comprising: generating a set of different value combinations for one or more parameters associated with an anomaly detection system; obtaining one or more time series, each data point of the one or more time series being marked with a label indicating whether an anomaly is present, absent, or not yet determined; assigning a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric; and selecting a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different parameter combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  • Clause 11 The one or more computer readable media of Clause 10, wherein the one or more parameters associated with the anomaly detection system comprise one or more parameters used in one or more statistical hypothesis tests for detecting one or more anomaly types in the anomaly detection system.
  • Clause 12 The one or more computer readable media of Clause 11, wherein the one or more parameters used in the one or more statistical hypothesis tests comprise one or more significance levels used in the one or more statistical hypothesis tests.
  • Clause 13 The one or more computer readable media of Clause 12, wherein the set of different value combinations for the one or more parameters associated with the anomaly detection system comprise respective probability values generated for a plurality of data points marked with a label indicating that an anomaly is present.
  • Clause 14 The one or more computer readable media of Clause 10, the acts further comprising: obtaining new data points of the one or more time series, the new data points comprising at least one data point marked with a label indicating that an anomaly is present; calculating a probability value for the at least one data point; using the calculated probability value for the at least one data point as a value of a parameter of the one or more parameters associated with the anomaly detection system; and incorporating the calculated probability value into the set of different value combinations to form a new set of different value combinations for the one or more parameters associated with the anomaly detection system.
  • Clause 15 The one or more computer readable media of Clause 14, the acts further comprising: assigning a respective value combination of the new set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a new subset of data points of the one or more time series to obtain predicted labels of the new subset of data points for each value combination of the new set of different value combinations; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on the predetermined evaluation metric; and selecting a new parameter combination corresponding to a new highest performance score of the anomaly detection system from among the new set of different parameter combinations as a new recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  • Clause 16 The one or more computer readable media of Clause 10, wherein the performance score of the anomaly detection system comprises a F-1 score that is calculated based on a recall score and a precision score included in the evaluation metric.
  • Clause 17 The one or more computer readable media of Clause 16, wherein the precision score depends on a number of correct anomaly predictions, a number of incorrect anomaly predictions, and a number of testing data points having a label indicating that an anomaly is not yet determined.
  • Clause 18 The one or more computer readable media of Clause 10, the acts further comprising receiving feedback from one or more users to update labels associated with a plurality of data points of the one or more time series.
  • Clause 19 A system comprising: one or more processors; and memory storing executable instructions that, when executed by the one or more processors, cause the one or more processors to perform acts comprising: generating a set of different value combinations for one or more parameters associated with an anomaly detection system; obtaining one or more time series, each data point of the one or more time series being marked with a label indicating whether an anomaly is present, absent, or not yet determined; assigning a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric; and selecting a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different parameter combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  • Clause 20 The system of Clause 19, the acts further comprising: obtaining new data points of the one or more time series, the new data points comprising at least one data point marked with a label indicating that an anomaly is present; calculating a probability value for the at least one data point; using the calculated probability value for the at least one data point as a value of a parameter of the one or more parameters associated with the anomaly detection system; incorporating the calculated probability value into the set of different value combinations to form a new set of different value combinations for the one or more parameters associated with the anomaly detection system; assigning a respective value combination of the new set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a new subset of data points of the one or more time series to obtain predicted labels of the new subset of data points for each value combination of the new set of different value combinations; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on the predetermined evaluation metric; and selecting a new parameter combination corresponding to a new highest performance score of the anomaly detection system from among the new set of different parameter combinations as a new recommended parameter combination for the one or more parameters associated with the anomaly detection system.


Abstract

A parameter tuning system may obtain a set of different value combinations for one or more parameters associated with an anomaly detection system, and obtain one or more time series monitored by the anomaly detection system. In implementations, each data point of the one or more time series may be marked with a label indicating whether an anomaly is present, absent, or not yet determined. The parameter tuning system may cause the anomaly detection system assigned with each value combination to perform anomaly detection on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points, calculate a performance score of the anomaly detection system assigned with the respective value combination based on a predetermined evaluation metric, and select a parameter combination corresponding to a highest performance score as a recommended parameter combination for the anomaly detection system.

Description

AUTOMATIC PARAMETER TUNING FOR ANOMALY DETECTION SYSTEM BACKGROUND
With the explosive development of computer technologies, a number of computer systems have been developed and used in various application fields for monitoring and controlling purposes. For example, an anomaly detection system may be deployed and configured to monitor performance metrics (such as a percentage of CPU usage, a percentage of memory usage, etc.) of a plurality of servers in a cloud computing architecture, and detect any occurrence of various types of abnormality or anomaly (e.g., a sudden jump or spike in an amount of traffic, a failure of a certain server, etc.) in the cloud computing architecture. In order to accurately detect an occurrence of an anomaly, human experts familiar with anomaly detection algorithms are usually recruited to manually or semi-manually configure or tune parameters of the anomaly detection system to set an optimal configuration of parameters for an anomaly detection algorithm used in the anomaly detection system. However, due to the tremendous number of time series and data points collected for the performance metrics, it is impractical, if not impossible, to involve the experts to configure and tune optimal parameters for each performance metric. Without the involvement of the human experts, the performance (e.g., the accuracy) of the anomaly detection system may be affected and cannot be guaranteed.
BRIEF DESCRIPTION OF THE DRAWINGS
The detailed description is set forth with reference to the accompanying figures. In the figures, the left-most digit (s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different figures indicates similar or identical items.
FIG. 1 illustrates an example environment in which a parameter tuning system may be used.
FIG. 2 illustrates an example anomaly detection system.
FIG. 3 illustrates an example parameter tuning system.
FIG. 4 illustrates an example parameter tuning method.
DETAILED DESCRIPTION
Overview
As noted above, existing anomaly detection systems require human experts to manually or semi-manually tune parameters of the anomaly detection systems to set optimal configurations of parameters necessary for detecting anomalies in time series data associated with daily-life operations, such as the performance of servers provided in a cloud. It would be impractical and inefficient to involve human experts in setting the optimal parameters manually for each metric, which limits the ability of the anomaly detection system to scale up to monitor a large number of machines provided in the computer system.
This disclosure describes an example parameter tuning system. The parameter tuning system may automatically tune and configure parameters  associated with an anomaly detection system to obtain an optimal configuration of parameters usable for anomaly detection in the anomaly detection system, and further adaptively adjust the parameters when new time series data is collected by the anomaly detection system.
In implementations, the parameter tuning system may obtain a set of different value combinations for one or more parameters associated with an anomaly detection system, and obtain one or more time series monitored by the anomaly detection system. In implementations, each data point of the one or more time series may be marked with a label indicating whether an anomaly is present, absent, or not yet determined. In implementations, the parameter tuning system may assign a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and apply the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination. In implementations, the parameter tuning system may then calculate a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric, and select a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different parameter combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system. In implementations, the parameter tuning system may calculate the performance score based on predicted labels and labels provided by users.
In implementations, functions described herein to be performed by the parameter tuning system may be performed by multiple separate units or services. For example, an acquisition service may obtain a set of different value combinations for one or more parameters associated with an anomaly detection system, and obtain one or more time series monitored by the anomaly detection system, while a detection service may assign a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and apply the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination. An evaluation service may calculate a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric, and select a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different parameter combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
Moreover, although in the examples described herein, the parameter tuning system may be implemented as a combination of software and hardware installed in a single device, in other examples, the parameter tuning system may be implemented and distributed in multiple devices or as services provided in one or more computing devices over a network and/or in a cloud computing architecture.
The application describes multiple and varied embodiments and  implementations. The following section describes an example framework that is suitable for practicing various implementations. Next, the application describes example systems, devices, and processes for implementing a parameter tuning system.
Example Environment
FIG. 1 illustrates an example environment 100 usable to implement a parameter tuning system. The environment 100 may include a parameter tuning system 102. In this example, the parameter tuning system 102 is described to exist as an individual entity. In some instances, the parameter tuning system 102 may include one or more servers 104. In other instances, the parameter tuning system 102 may be included as a part of the one or more servers 104, or distributed among the one or more servers 104, which communicate data with one another via a network 106. In implementations, a first server of the one or more servers 104 may include part of the functions of the parameter tuning system 102, while other functions of the parameter tuning system 102 may be included in a second server of the one or more servers 104. Furthermore, in some implementations, some or all the functions of the parameter tuning system 102 may be included in a cloud computing system or architecture, and may be provided as services for determining or recommending suitable or optimal parameter configuration for anomaly detection.
In implementations, the parameter tuning system 102 may be a part of a client device 108, e.g., software and/or hardware components of the client  device 108. In some instances, the parameter tuning system 102 may include a client device 108.
In implementations, the environment 100 may further include an anomaly detection system 110. In implementations, the parameter tuning system 102 may be included in the anomaly detection system 110, and provide services to the anomaly detection system 110. In some implementations, some or all the functions of the parameter tuning system 102 may be included and provided in the one or more servers 104, the client device 108, and/or the anomaly detection system 110, which communicate with each other via the network 106.
The client device 108 may be implemented as any of a variety of computing devices including, but not limited to, a desktop computer, a notebook or portable computer, a handheld device, a netbook, an Internet appliance, a tablet or slate computer, a mobile device (e.g., a mobile phone, a personal digital assistant, a smart phone, etc. ) , etc., or a combination thereof.
The network 106 may be a wireless or a wired network, or a combination thereof. The network 106 may be a collection of individual networks interconnected with each other and functioning as a single large network (e.g., the Internet or an intranet) . Examples of such individual networks include, but are not limited to, telephone networks, cable networks, Local Area Networks (LANs) , Wide Area Networks (WANs) , and Metropolitan Area Networks (MANs) . Further, the individual networks may be wireless or wired networks, or a combination thereof. Wired networks may include an electrical carrier connection (such a communication cable, etc. ) and/or an optical carrier or connection (such as an optical fiber  connection, etc. ) . Wireless networks may include, for example, a WiFi network, other radio frequency networks (e.g., 
Figure PCTCN2020074509-appb-000001
Zigbee, etc. ) , etc.
In implementations, the parameter tuning system 102 may receive one or more parameters to be tuned and time series data with labeled data points from the anomaly detection system 110. The parameter tuning system 102 may then generate a tuning space based on different value combinations of the one or more parameters to be tuned and the labeled data points of the time series data, and perform anomaly detection on the labeled data points using the different value combinations of the one or more parameters. The parameter tuning system 102 may evaluate respective performances of the different value combinations of the one or more parameters based on a predetermined evaluation metric, and determine a particular value combination to be recommended to the anomaly detection system as a recommended value combination for the one or more parameters.
Example Anomaly Detection System
FIG. 2 illustrates the anomaly detection system 110 in more detail. In implementations, the anomaly detection system 110 may include, but is not limited to, one or more processors 202, an input/output (I/O) interface 204, and/or a network interface 206, and memory 208.
In implementations, the processors 202 may be configured to execute instructions that are stored in the memory 208, and/or received from the input/output interface 204, and/or the network interface 206. In implementations, the processors 202 may be implemented as one or more hardware processors  including, for example, a microprocessor, an application-specific instruction-set processor, a physics processing unit (PPU) , a central processing unit (CPU) , a graphics processing unit, a digital signal processor, a tensor processing unit, etc. Additionally or alternatively, the functionality described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include field-programmable gate arrays (FPGAs) , application-specific integrated circuits (ASICs) , application-specific standard products (ASSPs) , system-on-a-chip systems (SOCs) , complex programmable logic devices (CPLDs) , etc.
The memory 208 may include computer readable media in a form of volatile memory, such as Random Access Memory (RAM) and/or non-volatile memory, such as read only memory (ROM) or flash RAM. The memory 208 is an example of computer readable media.
The computer readable media may include a volatile or non-volatile type, a removable or non-removable media, which may achieve storage of information using any method or technology. The information may include a computer readable instruction, a data structure, a program module or other data. Examples of computer readable media include, but not limited to, phase-change memory (PRAM) , static random access memory (SRAM) , dynamic random access memory (DRAM) , other types of random-access memory (RAM) , read-only memory (ROM) , electronically erasable programmable read-only memory (EEPROM) , quick flash memory or other internal storage technology, compact disk read-only memory (CD-ROM) , digital versatile disc (DVD) or other optical storage, magnetic cassette  tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission media, which may be used to store information that may be accessed by a computing device. As defined herein, the computer readable media does not include any transitory media, such as modulated data signals and carrier waves.
Although in this example, only hardware components are described in the anomaly detection system 110, in other instances, the anomaly detection system 110 may further include other hardware components and/or other software components such as program modules 210 to execute instructions stored in the memory 208 for performing various operations, and program data 212 for storing data associated with anomaly detection, such as data points of one or more time series, values of parameters associated with anomaly detection, etc.
By way of example and not limitation, the program modules 210 may include a data preprocessing module 214 configured to check and clean an input time series, such as checking timestamps of data points in the input time series, checking any missing data points, performing an interpolation to fill in a missing data point, etc. In implementations, the program modules 210 may further include a classification module 216 configured to separate data of at least some portions of the input time series into different types based on respective data patterns included in the at least some portions of the input time series. In implementations, the program modules 210 may further include a transformation module 218 configured to process the input time series by, for example, denoising and smoothing the input time series, and decomposing the input time series into different components including, but not limited to, a trend component, a seasonal component, and a residual component, etc. In implementations, the program modules 210 may further include a detection module 220 configured to detect and determine whether an anomaly occurs in the input time series by applying one or more statistical hypothesis tests (such as a T-test, an F-test, and/or an MK test) on different components obtained by the transformation module 218.
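The interpolation performed by the data preprocessing module can be illustrated with a minimal sketch, assuming an evenly spaced time series in which missing data points are represented as `None` and interior gaps are filled linearly between the nearest present neighbors (`fill_missing` is a hypothetical helper name, not part of the disclosure):

```python
def fill_missing(values):
    """Linearly interpolate None entries between their nearest
    non-missing neighbors (interior gaps only)."""
    filled = list(values)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            # Find the end of the run of missing points.
            j = i
            while j < len(filled) and filled[j] is None:
                j += 1
            left, right = filled[i - 1], filled[j]
            step = (right - left) / (j - i + 1)
            for k in range(i, j):
                filled[k] = left + step * (k - i + 1)
            i = j
        else:
            i += 1
    return filled

# One missing point between 2.0 and 4.0 is filled with 3.0.
assert fill_missing([1.0, 2.0, None, 4.0]) == [1.0, 2.0, 3.0, 4.0]
```

This sketch only handles interior gaps; a production preprocessing module would also need a policy for missing points at the boundaries of the series.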
Example Parameter Tuning System
FIG. 3 illustrates the example parameter tuning system 102 in more detail. In implementations, the parameter tuning system 102 may include, but is not limited to, one or more processors 302, an input/output (I/O) interface 304, a network interface 306, and memory 308. In implementations, the memory 308 may include computer readable media as described in the foregoing description. In implementations, the parameter tuning system 102 may further include an evaluator 310 and a tuner 312. In implementations, some of the functions of the parameter tuning system 102 may be implemented using hardware, for example, an ASIC (i.e., Application-Specific Integrated Circuit), an FPGA (i.e., Field-Programmable Gate Array), and/or other hardware. In this example, the parameter tuning system 102 is described as a separate entity. In some instances, some or all of the functions of the parameter tuning system 102 may be included in the anomaly detection system 110, the one or more servers 104, and/or the client device 108.
Although in this example, only hardware components are described in  the parameter tuning system 102, in other instances, the parameter tuning system 102 may further include other hardware components and/or other software components such as program modules 314 to execute instructions stored in the memory 308 for performing various operations, and program data 316 for storing data associated with parameter tuning, such as labeled data points of one or more time series, value combinations of parameters associated with the anomaly detection system 110, etc.
Example Methods
FIG. 4 shows a schematic diagram depicting an example method of parameter tuning. The method of FIG. 4 may, but need not, be implemented in the environment of FIG. 1 and using the systems of FIGS. 2 and 3. For ease of explanation, a method 400 is described with reference to FIGS. 1-3. However, the method 400 may alternatively be implemented in other environments and/or using other systems.
The method 400 is described in the general context of computer-executable instructions. Generally, computer-executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, and the like that perform particular functions or implement particular abstract data types. Furthermore, each of the example methods is illustrated as a collection of blocks in a logical flow graph representing a sequence of operations that can be implemented in hardware, software, firmware, or a combination thereof. The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method, or alternate methods. Additionally, individual blocks may be omitted from the method without departing from the spirit and scope of the subject matter described herein. In the context of software, the blocks represent computer instructions that, when executed by one or more processors, perform the recited operations. In the context of hardware, some or all of the blocks may represent application specific integrated circuits (ASICs) or other physical components that perform the recited operations.
Referring back to FIG. 4, at block 402, the parameter tuning system 102 may receive information of one or more parameters to be tuned for an anomaly detection system and one or more labeled time series.
In implementations, the parameter tuning system 102 may be connected to or associated with one or more anomaly detection systems (e.g., the anomaly detection system 110), and may communicate data with the one or more anomaly detection systems through the network 106. In implementations, when one or more parameters associated with the anomaly detection system 110 need to be configured or tuned, the parameter tuning system 102 may receive information of the one or more parameters to be tuned from the anomaly detection system 110 for automatic parameter tuning. Additionally, the parameter tuning system 102 may receive one or more labeled time series including labeled data points from the anomaly detection system 110. The one or more labeled time series may correspond to performance data (such as a percentage of CPU usage, a percentage of memory usage, etc.) of a computer system that is collected and monitored by the anomaly detection system 110 on a regular basis.
In implementations, the one or more parameters to be tuned may include, but are not limited to, parameters associated with one or more statistical hypothesis tests that are used for anomaly detection. Examples of parameters associated with a statistical test may include a length of a time series (a window size including a data point to be tested), a direction of the statistical test (e.g., a one-sided test or a two-sided test), a significance level of the statistical test, etc. In implementations, the one or more statistical hypothesis tests may include, but are not limited to, a T-test, an F-test, and an MK test, etc. For the sake of description and without loss of generality, significance levels of one or more statistical hypothesis tests involved in anomaly detection are used as an example of the one or more parameters to be tuned hereinafter. Nevertheless, other types of parameters associated with the anomaly detection system, such as a length of a time series (a window size including a data point to be tested), a direction of the statistical test (e.g., a one-sided test or a two-sided test), etc., may also be tuned and configured in a similar manner.
In implementations, the information of the one or more parameters to be tuned may include initial or default values of the one or more parameters used in the anomaly detection system 110, and these initial or default values of the one or more parameters may not be optimal or may be outdated, and thus need to be tuned or refined.
In implementations, before sending the information of the one or more parameters to be tuned to the parameter tuning system 102, the anomaly detection system 110 may first automatically separate data in at least some portions  of the one or more time series into different types (e.g., anomaly types such as a spike, a level shift, a mean change, etc. ) through the classification module 216 based on respective data patterns in the at least some portions of the one or more time series using active learning-based approaches and/or through user selection. The anomaly detection system 110 may then select representative samples from respective data of each type. In implementations, each representative sample may include a predetermined number of data points in a time series. The anomaly detection system 110 may then transform the representative samples of each type through the transformation module 218, and perform anomaly detection (e.g., applying corresponding one or more statistical hypothesis tests) on these representative samples of each type through the detection module 220 to generate anomaly scores for these representative samples.
By way of example and not limitation, the anomaly scores for these representative samples may be related to probability values (i.e., p-values) of these representative samples generated for the one or more statistical hypothesis tests. By way of example and not limitation, an anomaly score for a representative sample under a statistical test may be the negative base-ten logarithm of the p-value generated for that representative sample under the statistical test. In statistical hypothesis testing, a p-value or probability value is defined as a probability of obtaining test results at least as extreme as results that are actually observed during the statistical hypothesis testing. In implementations, the anomaly detection system 110 may compare the p-values of the representative samples generated for the one or more statistical hypothesis tests with respective significance levels of the one or more statistical hypothesis tests to determine whether an anomaly occurs in each of the representative samples.
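The scoring and thresholding rule described above can be sketched as follows, assuming the negative base-ten logarithm described in this example (the function names and the significance level shown are illustrative, not part of the disclosure):

```python
import math

def anomaly_score(p_value: float) -> float:
    """Anomaly score as the negative base-ten logarithm of a p-value.

    Smaller p-values (stronger evidence of an anomaly) map to larger scores;
    e.g. a p-value of 0.001 maps to a score of about 3.
    """
    return -math.log10(p_value)

def is_anomaly(p_value: float, significance_level: float) -> bool:
    """Flag an anomaly when the p-value falls below the significance level."""
    return p_value < significance_level

# Illustrative check against a significance level of 0.01.
flagged = is_anomaly(0.0005, significance_level=0.01)
```

A usage note: because the score is a monotone transform of the p-value, comparing the p-value against a significance level is equivalent to comparing the anomaly score against the corresponding score threshold.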
In implementations, the anomaly detection system 110 may provide the representative samples to one or more users, and request feedback from the one or more users for labeling at least some data points in the representative samples as anomaly points and/or non-anomaly points. In some implementations, the anomaly detection system 110 may further provide the anomaly scores of the representative samples to one or more users as a further reference. Based on the feedback from the one or more users, the representative samples may be labeled accordingly. For example, the one or more users may indicate, for data points in the representative samples, whether an anomaly is present or absent. In implementations, each data point of the representative samples may be marked with a label indicating whether an anomaly is present, absent, or not determined. Specifically, a data point that is determined by a user as an anomaly point (or a data point having an anomaly) is marked with a label (e.g., anomaly) indicating that an anomaly is present. A data point that is determined by a user as a non-anomaly point (or a data point having no anomaly) is marked with a label (e.g., non-anomaly) indicating that an anomaly is absent. A data point that is not determined by any user as either an anomaly point or a non-anomaly point is marked with a label (e.g., undecided) indicating that an anomaly is undecided or not determined yet.
In implementations, the anomaly detection system 110 may select a portion of the representative samples and respective anomaly scores (e.g., the p-values) , and send this selected portion of the representative samples and the  respective anomaly scores to the parameter tuning system 102 as at least a portion of the data associated with the one or more parameters to be tuned.
At block 404, the parameter tuning system 102 may generate a set of different value combinations for the one or more parameters associated with the anomaly detection system based on the received information of the one or more parameters to be tuned.
In implementations, in response to receiving the information of the one or more parameters to be tuned from the anomaly detection system 110, the parameter tuning system 102 may obtain respective candidate values of the one or more parameters from the received information of the one or more parameters to be tuned. In this example, the one or more parameters are described as one or more significance levels of one or more statistical hypothesis tests, and therefore the respective candidate values of the one or more parameters may include the respective p-values, obtained under the one or more statistical hypothesis tests, of data points marked with a label indicating that an anomaly is present.
In implementations, the parameter tuning system 102 may combine the respective candidate values of the one or more parameters to form a set of different value combinations of the one or more parameters to be tuned. For example, suppose N data points are detected and labeled as anomaly points or normal points under a first statistical test T1, with corresponding p-values generated for the first statistical test T1 being p_1^T1, …, p_N^T1; M data points are detected and labeled to have an anomaly under a second statistical test T2, with corresponding p-values generated for the second statistical test T2 being p_1^T2, …, p_M^T2; and K data points are detected and labeled to have an anomaly under a third statistical test T3, with corresponding p-values generated for the third statistical test T3 being p_1^T3, …, p_K^T3. The set of different value combinations of the one or more parameters to be tuned (in this example, there are three parameters, namely, the significance levels of the first, second and third statistical hypothesis tests) that is generated by the parameter tuning system 102 may then include N×M×K different value combinations, i.e., (p_n^T1, p_m^T2, p_k^T3), wherein 1≤n≤N, 1≤m≤M, and 1≤k≤K. N, M and K are positive integers greater than or equal to one.
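The combination step described above amounts to a Cartesian product over the candidate significance levels collected per statistical test. A minimal sketch, in which the candidate p-values are illustrative placeholders rather than values from the disclosure:

```python
from itertools import product

# Candidate significance levels: p-values of user-labeled data points
# under each statistical test (illustrative values).
candidates_t1 = [0.001, 0.02]        # N = 2 candidates from test T1
candidates_t2 = [0.005, 0.01, 0.03]  # M = 3 candidates from test T2
candidates_t3 = [0.0001]             # K = 1 candidate from test T3

# N x M x K value combinations (p_n^T1, p_m^T2, p_k^T3).
value_combinations = list(product(candidates_t1, candidates_t2, candidates_t3))
assert len(value_combinations) == (
    len(candidates_t1) * len(candidates_t2) * len(candidates_t3)
)
```

Each tuple in `value_combinations` is one candidate assignment of the three significance levels, which the anomaly detection system can then be evaluated against.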
In implementations, the parameter tuning system 102 may further generate a tuning space from the set of different value combinations for the one or more parameters to be tuned and the one or more labeled time series. For example, after generating the set of different value combinations for the one or more parameters to be tuned, the parameter tuning system 102 may further generate a tuning space using the set of different value combinations for the one or more parameters to be tuned and the one or more labeled time series. In implementations, the tuning space may include a parameter space consisting of the set of different value combinations and a data space consisting of the one or more labeled time series.
At block 406, the parameter tuning system 102 may send the set of different value combinations for the one or more parameters to be tuned to the  anomaly detection system for performing anomaly detection on the one or more labeled time series based on each value combination of the one or more parameters to be tuned.
In implementations, the parameter tuning system 102 may send the set of different value combinations to the anomaly detection system 110 to cause the anomaly detection system 110 to perform anomaly detection on data points of the one or more labeled time series based on each value combination (in this example, a combination of candidate values for significance levels of the statistical hypothesis tests) , and obtain predicted results or labels for the one or more labeled time series.
At block 408, the parameter tuning system 102 may receive predicted results or labels of data points of the one or more time series from the anomaly detection system.
In implementations, after the anomaly detection system 110 performs the anomaly detection on the data points of the one or more labeled time series based on each value combination of the set of different value combinations for the one or more parameters to be tuned, the anomaly detection system 110 may return predicted results or labels of the data points of the one or more labeled time series to the parameter tuning system 102. In implementations, each predicted result or label may include a label indicating whether an anomaly is predicted to be present or absent.
At block 410, the parameter tuning system 102 may calculate a respective performance score of each value combination of the set of different value combinations for the one or more parameters to be tuned.
In implementations, upon receiving the predicted results or labels of the data points of the one or more time series from the anomaly detection system, the parameter tuning system 102 or the evaluator 310 of the parameter tuning system 102 may evaluate a respective performance score of each value combination of the set of different value combinations for the one or more parameters to be tuned based on the predicted results or labels of the data points of the one or more labeled time series and user labels of the data points of the one or more labeled time series made by the one or more users. In implementations, the parameter tuning system 102 may calculate an enhanced confusion matrix as shown in the following Table 1.
Table 1: Enhanced confusion matrix
                        Predicted Anomaly    Predicted Not Anomaly
  Marked Anomaly        TP                   FN
  Marked Not Anomaly    FN_N                 TP_N
  Not Marked            FP                   TN
In implementations, the parameter tuning system 102 may further employ a relaxed mode of anomaly matching between the predicted labels and the user labels of the data points of the one or more labeled time series. For example, a data window of a predetermined size (i.e., a predetermined integer such as 2, 3, 4, …, etc. ) may be used to accommodate a pattern anomaly in which a number of consecutive data points are labeled as anomalies. This corresponds to a scenario in which a data point labeled with predicted anomaly is off (i.e., earlier or later) by a few data points as compared to a data point that is labeled as anomaly by a user. In  this case, the parameter tuning system 102 may still consider the data point labeled with predicted anomaly to be matched with the data point labeled with user anomaly, and therefore counts this data point towards TP.
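The relaxed matching described above can be sketched as a window test: a predicted anomaly index counts toward TP if it lies within a small window of any user-labeled anomaly index (the window size, indices, and function name below are illustrative):

```python
def matches_relaxed(predicted_idx, labeled_anomaly_idxs, window=2):
    """Return True if a predicted anomaly index falls within `window`
    data points (earlier or later) of any user-labeled anomaly index."""
    return any(abs(predicted_idx - i) <= window for i in labeled_anomaly_idxs)

# A prediction at index 12 matches a user label at index 10 when window=2,
# even though the prediction is off by two data points.
hit = matches_relaxed(12, {10, 50}, window=2)
# A prediction at index 13 is too far from both labels and does not match.
miss = matches_relaxed(13, {10, 50}, window=2)
```

This tolerance accommodates pattern anomalies spanning several consecutive points, where the exact onset chosen by the detector and by the user may differ by a few samples.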
In implementations, based on the enhanced confusion matrix and the relaxed mode, the parameter tuning system 102 may calculate a predetermined evaluation metric for each value combination. In implementations, the predetermined evaluation metric may include, but is not limited to, a recall score and a precision score. Specifically, a recall score and a precision score for each value combination may be calculated using the following equations respectively:
Recall = TP / (TP + FN)

Precision = TP / (TP + FN_N + FP)
Upon obtaining the recall score and the precision score for each value combination, the parameter tuning system 102 may obtain a performance score for each value combination based on the predetermined evaluation metric. In implementations, the predetermined evaluation metric may include, for example, an F_β score, which may be defined as follows:
F_β = (1 + β^2) × Precision × Recall / (β^2 × Precision + Recall)
where β is a real number, and is chosen such that the recall score is considered to be β times as important as the precision score.
In implementations, the parameter tuning system 102 may choose β to be one, such that the F_β score becomes the F_1 score, with the recall score as important as the precision score.
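Putting the enhanced confusion matrix and the F_β score together, the evaluation of one value combination can be sketched as below. The field names follow Table 1; the counts are illustrative, and the precision denominator reflects that both FN_N (predicted anomaly on a point marked not anomaly) and FP (predicted anomaly on an unmarked point) count as incorrect anomaly predictions:

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of user-marked anomalies that were predicted as anomalies."""
    return tp / (tp + fn)

def precision(tp: int, fn_n: int, fp: int) -> float:
    """Fraction of predicted anomalies that match user-marked anomalies;
    predictions on marked-normal (FN_N) and unmarked (FP) points count against it."""
    return tp / (tp + fn_n + fp)

def f_beta(p: float, r: float, beta: float = 1.0) -> float:
    """F_beta score: recall is treated as beta times as important as precision."""
    return (1 + beta**2) * p * r / (beta**2 * p + r)

# Illustrative counts for one value combination: TP=8, FN=2, FN_N=1, FP=1.
r = recall(8, 2)        # 8 / 10
p = precision(8, 1, 1)  # 8 / 10
score = f_beta(p, r, beta=1.0)
```

With β = 1 this reduces to the familiar F_1 score, the harmonic mean of precision and recall.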
At block 412, the parameter tuning system 102 may select or determine a particular value combination from among the set of different value combinations as a recommended value combination for the one or more parameters to be tuned to the anomaly detection system based on the calculated performance score of each value combination of the set of different value combinations for the one or more parameters to be tuned.
In implementations, after calculating the performance score of each value combination of the set of different value combinations for the one or more parameters to be tuned, the parameter tuning system 102 or the tuner 312 of the parameter tuning system 102 may select the value combination that leads to the highest performance score as a recommended value combination for the one or more parameters to be tuned for the anomaly detection system 110.
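The selection at block 412 is then a straightforward argmax over the evaluated combinations. A sketch with illustrative placeholder scores (the tuples and score values are not from the disclosure):

```python
# Performance score (e.g. F_1) per value combination, as computed at block 410.
scores = {
    (0.001, 0.005, 0.0001): 0.72,
    (0.001, 0.01, 0.0001): 0.81,
    (0.02, 0.01, 0.0001): 0.64,
}

# Recommend the value combination with the highest performance score.
recommended = max(scores, key=scores.get)
best_score = scores[recommended]
```

If `best_score` remains unsatisfactory, the system can solicit further user labels and repeat the loop, as described in the following paragraphs.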
In implementations, if the recommended value combination is not desirable, for example, the performance score is low, or one or more users are not satisfied with the performance of the anomaly detection system 110 assigned with the recommended value combination, the parameter tuning system 102 may request the one or more users to provide further feedback on the one or more time series, for example, labeling additional data points of the one or more time series about whether these additional data points are anomaly points or non-anomaly points, etc. The parameter tuning system 102 may then iteratively perform the operations as described above (e.g., blocks 402 – 412) to obtain a better value combination for the one or more parameters to be tuned.
At block 414, the parameter tuning system 102 may wait for one or more newly labeled time series or new data points of the one or more labeled time series from the anomaly detection system, and automatically perform a new iteration for tuning the one or more parameters for the anomaly detection system.
In implementations, if one or more new time series or new data points of the one or more labeled time series are received from the anomaly detection system, the parameter tuning system 102 may automatically perform a new iteration for tuning the one or more parameters, i.e., repeating the parameter tuning operations as described above in blocks 404 –412.
In implementations, additionally or alternatively, the parameter tuning system 102 may receive one or more labeled time series from another anomaly detection system and information of one or more parameters associated with this anomaly detection system. The parameter tuning system 102 may then perform parameter tuning operations as described above in blocks 404 – 412 for this anomaly detection system.
Although the above method blocks are described to be executed in a particular order, in some implementations, some or all of the method blocks can be executed in other orders, or in parallel.
Conclusion
Although implementations have been described in language specific to structural features and/or methodological acts, it is to be understood that the claims are not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claimed subject matter. Additionally or alternatively, some or all of the operations may be implemented by one or more ASICs, FPGAs, or other hardware.
The present disclosure can be further understood using the following clauses.
Clause 1: A method implemented by one or more computing devices, the method comprising: generating a set of different value combinations for one or more parameters associated with an anomaly detection system; obtaining one or more time series, each data point of the one or more time series being marked with a label indicating whether an anomaly is present, absent, or not yet determined; assigning a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric; and selecting a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different parameter combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
Clause 2: The method of Clause 1, wherein the one or more parameters associated with the anomaly detection system comprise one or more  parameters used in one or more statistical hypothesis tests for detecting one or more anomaly types in the anomaly detection system.
Clause 3: The method of Clause 2, wherein the one or more parameters used in the one or more statistical hypothesis tests comprise one or more significance levels used in the one or more statistical hypothesis tests.
Clause 4: The method of Clause 3, wherein the set of different value combinations for the one or more parameters associated with the anomaly detection system comprise respective probability values generated for a plurality of data points marked with a label indicating that an anomaly is present.
Clause 5: The method of Clause 1, further comprising: obtaining new data points of the one or more time series, the new data points comprising at least one data point marked with a label indicating that an anomaly is present; calculating a probability value for the at least one data point; using the calculated probability value for the at least one data point as a value of a parameter of the one or more parameters associated with the anomaly detection system; and incorporating the calculated probability value into the set of different value combinations to form a new set of different value combinations for the one or more parameters associated with the anomaly detection system.
Clause 6: The method of Clause 5, further comprising: assigning a respective value combination of the new set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a new subset of data points of the one or more time series to obtain  predicted labels of the new subset of data points for each value combination of the new set of different value combinations; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on the predetermined evaluation metric; and selecting a new parameter combination corresponding to a new highest performance score of the anomaly detection system from among the new set of different parameter combinations as a new recommended parameter combination for the one or more parameters associated with the anomaly detection system.
Clause 7: The method of Clause 1, wherein the performance score of the anomaly detection system comprises a F-1 score that is calculated based on a recall score and a precision score included in the evaluation metric.
Clause 8: The method of Clause 7, wherein the precision score depends on a number of correct anomaly predictions, a number of incorrect anomaly predictions, and a number of testing data points having a label indicating that an anomaly is not yet determined.
Clause 9: The method of Clause 1, further comprising receiving feedback from one or more users to update labels associated with a plurality of data points of the one or more time series.
Clause 10: One or more computer readable media storing executable instructions that, when executed by one or more processors, cause the one or more processors to perform acts comprising: generating a set of different value combinations for one or more parameters associated with an anomaly detection system; obtaining one or more time series, each data point of the one or more time  series being marked with a label indicating whether an anomaly is present, absent, or not yet determined; assigning a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric; and selecting a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different parameter combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
Clause 11: The one or more computer readable media of Clause 10, wherein the one or more parameters associated with the anomaly detection system comprise one or more parameters used in one or more statistical hypothesis tests for detecting one or more anomaly types in the anomaly detection system.
Clause 12: The one or more computer readable media of Clause 11, wherein the one or more parameters used in the one or more statistical hypothesis tests comprise one or more significance levels used in the one or more statistical hypothesis tests.
Clause 13: The one or more computer readable media of Clause 12, wherein the set of different value combinations for the one or more parameters  associated with the anomaly detection system comprise respective probability values generated for a plurality of data points marked with a label indicating that an anomaly is present.
Clause 14: The one or more computer readable media of Clause 10, the acts further comprising: obtaining new data points of the one or more time series, the new data points comprising at least one data point marked with a label indicating that an anomaly is present; calculating a probability value for the at least one data point; using the calculated probability value for the at least one data point as a value of a parameter of the one or more parameters associated with the anomaly detection system; and incorporating the calculated probability value into the set of different value combinations to form a new set of different value combinations for the one or more parameters associated with the anomaly detection system.
Clause 15: The one or more computer readable media of Clause 14, the acts further comprising: assigning a respective value combination of the new set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a new subset of data points of the one or more time series to obtain predicted labels of the new subset of data points for each value combination of the new set of different value combinations; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on the predetermined evaluation metric; and selecting a new parameter combination corresponding to a new highest  performance score of the anomaly detection system from among the new set of different parameter combinations as a new recommended parameter combination for the one or more parameters associated with the anomaly detection system.
Clause 16: The one or more computer readable media of Clause 10, wherein the performance score of the anomaly detection system comprises a F-1 score that is calculated based on a recall score and a precision score included in the evaluation metric.
Clause 17: The one or more computer readable media of Clause 16, wherein the precision score depends on a number of correct anomaly predictions, a number of incorrect anomaly predictions, and a number of testing data points having a label indicating that an anomaly is not yet determined.
Clause 18: The one or more computer readable media of Clause 10, the acts further comprising receiving feedback from one or more users to update labels associated with a plurality of data points of the one or more time series.
Clause 19: A system comprising: one or more processors; and memory storing executable instructions that, when executed by one or more processors, cause the one or more processors to perform acts comprising: generating a set of different value combinations for one or more parameters associated with an anomaly detection system; obtaining one or more time series, each data point of the one or more time series being marked with a label indicating whether an anomaly is present, absent, or not yet determined; assigning a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric; and selecting a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different value combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
Clause 20: The system of Clause 19, the acts further comprising: obtaining new data points of the one or more time series, the new data points comprising at least one data point marked with a label indicating that an anomaly is present; calculating a probability value for the at least one data point; using the calculated probability value for the at least one data point as a value of a parameter of the one or more parameters associated with the anomaly detection system; incorporating the calculated probability value into the set of different value combinations to form a new set of different value combinations for the one or more parameters associated with the anomaly detection system; assigning a respective value combination of the new set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a new subset of data points of the one or more time series to obtain predicted labels of the new subset of data points for each value combination of the new set of different value combinations; calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on the predetermined evaluation metric; and selecting a new parameter combination corresponding to a new highest performance score of the anomaly detection system from among the new set of different value combinations as a new recommended parameter combination for the one or more parameters associated with the anomaly detection system.
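The tuning loop recited in Clauses 10–13 above (and in claim 1 below) — enumerate candidate value combinations, run the anomaly detector under each combination over labeled time-series points, score each run against a predetermined evaluation metric, and keep the highest-scoring combination — can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `tune` and `f1_with_undetermined`, the use of `None` for points whose anomaly status is not yet determined, and the detector call signature are all hypothetical.

```python
from itertools import product

def f1_with_undetermined(true_labels, predicted):
    """F-1 over points whose label is known; points labeled None
    ('anomaly not yet determined') are excluded from the counts,
    mirroring the precision term described in claim 8."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted):
        if truth is None:        # not yet determined: skip entirely
            continue
        if pred and truth:
            tp += 1
        elif pred and not truth:
            fp += 1
        elif not pred and truth:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def tune(detector, series, labels, grid):
    """Grid search: score the detector under every value combination
    and return the combination with the highest performance score."""
    combos = [dict(zip(grid, values)) for values in product(*grid.values())]
    scored = [(f1_with_undetermined(labels, detector(series, **c)), c)
              for c in combos]
    best_score, best_combo = max(scored, key=lambda pair: pair[0])
    return best_combo, best_score
```

For example, a toy threshold detector `lambda series, level: [x > level for x in series]` swept over `{"level": [1.5, 5.0, 8.0]}` against the series `[1, 2, 9, 3, 10]` with labels `[False, False, True, None, True]` would recommend `level=5.0` with a perfect score, since the point whose label is `None` is ignored by the metric.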

Claims (20)

  1. A method implemented by one or more computing devices, the method comprising:
    generating a set of different value combinations for one or more parameters associated with an anomaly detection system;
    obtaining one or more time series, each data point of the one or more time series being marked with a label indicating whether an anomaly is present, absent, or not yet determined;
    assigning a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination;
    calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric; and
    selecting a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different value combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  2. The method of claim 1, wherein the one or more parameters associated with the anomaly detection system comprise one or more parameters used in one or more statistical hypothesis tests for detecting one or more anomaly types in the anomaly detection system.
  3. The method of claim 2, wherein the one or more parameters used in the one or more statistical hypothesis tests comprise one or more significance levels used in the one or more statistical hypothesis tests.
  4. The method of claim 3, wherein the set of different value combinations for the one or more parameters associated with the anomaly detection system comprises respective probability values generated for a plurality of data points marked with a label indicating that an anomaly is present.
  5. The method of claim 1, further comprising:
    obtaining new data points of the one or more time series, the new data points comprising at least one data point marked with a label indicating that an anomaly is present;
    calculating a probability value for the at least one data point;
    using the calculated probability value for the at least one data point as a value of a parameter of the one or more parameters associated with the anomaly detection system; and
    incorporating the calculated probability value into the set of different value combinations to form a new set of different value combinations for the one or more parameters associated with the anomaly detection system.
  6. The method of claim 5, further comprising:
    assigning a respective value combination of the new set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a new subset of data points of the one or more time series to obtain predicted labels of the new subset of data points for each value combination of the new set of different value combinations;
    calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on the predetermined evaluation metric; and
    selecting a new parameter combination corresponding to a new highest performance score of the anomaly detection system from among the new set of different value combinations as a new recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  7. The method of claim 1, wherein the performance score of the anomaly detection system comprises an F-1 score that is calculated based on a recall score and a precision score included in the evaluation metric.
  8. The method of claim 7, wherein the precision score depends on a number of correct anomaly predictions, a number of incorrect anomaly predictions, and a number of testing data points having a label indicating that an anomaly is not yet determined.
  9. The method of claim 1, further comprising receiving feedback from one or more users to update labels associated with a plurality of data points of the one or more time series.
  10. One or more computer readable media storing executable instructions that, when executed by one or more processors, cause the one or more processors to perform acts comprising:
    generating a set of different value combinations for one or more parameters associated with an anomaly detection system;
    obtaining one or more time series, each data point of the one or more time series being marked with a label indicating whether an anomaly is present, absent, or not yet determined;
    assigning a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination;
    calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric; and
    selecting a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different value combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  11. The one or more computer readable media of claim 10, wherein the one or more parameters associated with the anomaly detection system comprise one or more parameters used in one or more statistical hypothesis tests for detecting one or more anomaly types in the anomaly detection system.
  12. The one or more computer readable media of claim 11, wherein the one or more parameters used in the one or more statistical hypothesis tests comprise one or more significance levels used in the one or more statistical hypothesis tests.
  13. The one or more computer readable media of claim 12, wherein the set of different value combinations for the one or more parameters associated with the anomaly detection system comprises respective probability values generated for a plurality of data points marked with a label indicating that an anomaly is present.
  14. The one or more computer readable media of claim 10, the acts further comprising:
    obtaining new data points of the one or more time series, the new data points comprising at least one data point marked with a label indicating that an anomaly is present;
    calculating a probability value for the at least one data point;
    using the calculated probability value for the at least one data point as a value of a parameter of the one or more parameters associated with the anomaly detection system; and
    incorporating the calculated probability value into the set of different value combinations to form a new set of different value combinations for the one or more parameters associated with the anomaly detection system.
  15. The one or more computer readable media of claim 14, the acts further comprising:
    assigning a respective value combination of the new set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a new subset of data points of the one or more time series to obtain predicted labels of the new subset of data points for each value combination of the new set of different value combinations;
    calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on the predetermined evaluation metric; and
    selecting a new parameter combination corresponding to a new highest performance score of the anomaly detection system from among the new set of different value combinations as a new recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  16. The one or more computer readable media of claim 10, wherein the performance score of the anomaly detection system comprises an F-1 score that is calculated based on a recall score and a precision score included in the evaluation metric.
  17. The one or more computer readable media of claim 16, wherein the precision score depends on a number of correct anomaly predictions, a number of incorrect anomaly predictions, and a number of testing data points having a label indicating that an anomaly is not yet determined.
  18. The one or more computer readable media of claim 10, the acts further comprising receiving feedback from one or more users to update labels associated with a plurality of data points of the one or more time series.
  19. A system comprising:
    one or more processors; and
    memory storing executable instructions that, when executed by one or more processors, cause the one or more processors to perform acts comprising:
    generating a set of different value combinations for one or more parameters associated with an anomaly detection system;
    obtaining one or more time series, each data point of the one or more time series being marked with a label indicating whether an anomaly is present, absent, or not yet determined;
    assigning a respective value combination of the set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a subset of data points of the one or more time series to obtain predicted labels of the subset of data points for each value combination;
    calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on a predetermined evaluation metric; and
    selecting a parameter combination corresponding to a highest performance score of the anomaly detection system from among the set of different value combinations as a recommended parameter combination for the one or more parameters associated with the anomaly detection system.
  20. The system of claim 19, the acts further comprising:
    obtaining new data points of the one or more time series, the new data points comprising at least one data point marked with a label indicating that an anomaly is present;
    calculating a probability value for the at least one data point;
    using the calculated probability value for the at least one data point as a value of a parameter of the one or more parameters associated with the anomaly detection system;
    incorporating the calculated probability value into the set of different value combinations to form a new set of different value combinations for the one or more parameters associated with the anomaly detection system;
    assigning a respective value combination of the new set of different value combinations to the one or more parameters associated with the anomaly detection system, and applying the anomaly detection system assigned with the respective value combination on a new subset of data points of the one or more time series to obtain predicted labels of the new subset of data points for each value combination of the new set of different value combinations;
    calculating a performance score of the anomaly detection system assigned with the respective value combination based at least in part on the predetermined evaluation metric; and
    selecting a new parameter combination corresponding to a new highest performance score of the anomaly detection system from among the new set of different value combinations as a new recommended parameter combination for the one or more parameters associated with the anomaly detection system.
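The feedback path in claims 5–6, 14–15, and 20 — compute a probability (p-) value for a point a user has newly labeled as anomalous, treat that value as a candidate significance level, and fold it into the set of value combinations before re-running the sweep — can be sketched as below. The `significance_level` key, the dict-shaped points, and the `probability_of` callable are illustrative assumptions, not terms from the patent.

```python
def incorporate_feedback(grid, new_points, probability_of):
    """Extend the candidate significance levels with the probability
    values of points newly labeled as anomalous, so the next tuning
    pass can consider thresholds that would catch exactly those points."""
    new_levels = {probability_of(point) for point in new_points
                  if point.get("label") == "anomaly"}
    extended = dict(grid)  # leave the original candidate set untouched
    extended["significance_level"] = sorted(
        set(grid.get("significance_level", [])) | new_levels)
    return extended
```

After the candidate set is extended this way, the enlarged grid is swept exactly as before, and the combination with the new highest performance score becomes the new recommended parameter combination.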

Publications (1)

Publication Number Publication Date
WO2021155576A1 (en) 2021-08-12


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114139613A * 2021-11-18 2022-03-04 Alipay (Hangzhou) Information Technology Co., Ltd. Method and apparatus for updating an anomaly detection system

Citations (6)

Publication number Priority date Publication date Assignee Title
US20150269050A1 (en) * 2014-03-18 2015-09-24 Microsoft Corporation Unsupervised anomaly detection for arbitrary time series
US20180039555A1 (en) * 2016-08-04 2018-02-08 Oracle International Corporation Unsupervised method for baselining and anomaly detection in time-series data for enterprise systems
CN107908521A * 2017-11-10 2018-04-13 Nanjing University of Posts and Telecommunications Method for monitoring server performance and container performance on nodes in a cloud environment
US10042697B2 (en) * 2015-05-28 2018-08-07 Oracle International Corporation Automatic anomaly detection and resolution system
US10373070B2 (en) * 2015-10-14 2019-08-06 International Business Machines Corporation Anomaly detection model selection and validity for time series data
US10375169B1 (en) * 2017-05-24 2019-08-06 United States Of America As Represented By The Secretary Of The Navy System and method for automatically triggering the live migration of cloud services and automatically performing the triggered migration


Also Published As

Publication number Publication date
CN115315689A (en) 2022-11-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 20917661
Country of ref document: EP
Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
122 Ep: pct application non-entry in european phase
Ref document number: 20917661
Country of ref document: EP
Kind code of ref document: A1