WO2020248291A1 - Systems and methods for anomaly detection

Info

Publication number: WO2020248291A1
Authority: WO (WIPO (PCT))
Prior art keywords: machine learning, learning model, samples, anomaly detection, determining
Application number: PCT/CN2019/091433
Other languages: French (fr)
Inventors: Bao ZHU, Shujun Chen, Dongdong CUI
Original Assignee: Beijing Didi Infinity Technology And Development Co., Ltd.
Application filed by Beijing Didi Infinity Technology And Development Co., Ltd.
Publication of WO2020248291A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning
    • G06N 20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G06N 7/00 Computing arrangements based on specific mathematical models
    • G06N 7/01 Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • the present disclosure generally relates to the field of anomaly detection and, specifically, to systems and methods for determining thresholds of a machine learning model for anomaly detection.
  • Machine learning has greatly promoted the development of anomaly detection technology, which in turn has expanded its use.
  • anomaly detection technology is applied to intrusion detection, fault detection, network abnormal traffic detection, etc.
  • unsupervised machine learning techniques are widely used for anomaly detection.
  • a threshold may be predetermined for determining whether an event is anomalous.
  • a threshold for an unsupervised machine learning model for anomaly detection is typically set empirically by a user, which may result in a threshold lacking sufficient accuracy and/or effectiveness, which in turn may decrease the accuracy of estimated results and/or the effectiveness of the unsupervised machine learning model for anomaly detection. Therefore, it is desirable to develop systems and methods for determining a threshold of an unsupervised machine learning model for anomaly detection with improved accuracy and/or effectiveness.
  • a system for anomaly detection may include at least one storage medium storing a set of instructions and at least one processor configured to communicate with the at least one storage medium.
  • the at least one processor may be directed to cause the system to obtain a plurality of samples.
  • Each of the plurality of samples may be associated with an event.
  • the at least one processor may be further directed to cause the system to determine, for each of the plurality of samples, based on a machine learning model for the anomaly detection, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous.
  • the at least one processor may be further directed to cause the system to determine, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model.
  • the at least one processor may be further directed to cause the system to determine an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds.
  • the at least one processor may be further directed to cause the system to determine, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
  • the machine learning model may include at least one of a one-class support vector machine (SVM) model or an isolation forest algorithm.
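By way of illustration only, the sketch below shows how the two named model types could be fitted and used to score samples with scikit-learn. The data, kernel, and hyperparameter values are hypothetical assumptions, not part of the disclosure.

```python
# Illustrative sketch (not the claimed method): fitting a one-class SVM
# and an isolation forest, then scoring samples, using scikit-learn.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))  # 1000 hypothetical samples, 3 features each

ocsvm = OneClassSVM(kernel="rbf", nu=0.05).fit(X)  # assumed parameters
iforest = IsolationForest(n_estimators=100, random_state=0).fit(X)

# score_samples returns a continuous score per sample; lower values mark
# samples the model considers more anomalous.
svm_scores = ocsvm.score_samples(X)
iforest_scores = iforest.score_samples(X)
```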
  • the at least one processor may be directed to cause the system to designate the estimated probability corresponding to each of the at least a portion of the plurality of samples as one of the plurality of candidate thresholds.
  • the at least one processor may be directed to cause the system to determine, for each of the plurality of samples, a reference probability corresponding to the each of the plurality of samples based on a probability estimation model.
  • the at least one processor may be directed to cause the system to evaluate, based on the estimated probability and the reference probability, the machine learning model with respect to each of the plurality of candidate thresholds.
  • the at least one processor may be directed to cause the system to determine, based on the reference probability and the estimated probability, an evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds.
  • the at least one processor may be further directed to cause the system to evaluate, based on the evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds, the machine learning model.
  • the at least one processor may be directed to cause the system to determine, based on each of the plurality of candidate thresholds and the estimated probability, an estimated label of each of the plurality of samples, the estimated label including a negative sample or a positive sample.
  • the at least one processor may be further directed to cause the system to determine, based on the reference probability and the estimated label, the evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds.
  • the at least one processor may be further directed to cause the system to rank the reference probability of each of the plurality of samples.
  • the at least one processor may be further directed to cause the system to determine, based on the ranked reference probability, and the estimated label corresponding to each of the plurality of the samples, the evaluation index.
  • the evaluation index of the machine learning model may include at least one of an area under curve (AUC) or a Gini coefficient.
  • the at least one processor may be directed to cause the system to identify, from the plurality of candidate thresholds, a candidate threshold that corresponds to a maximum of the evaluation index.
  • the at least one processor may be further directed to cause the system to designate the identified candidate threshold as the target threshold associated with the machine learning model.
  • the at least one processor may be further directed to cause the system to obtain data associated with a specific event.
  • the at least one processor may be further directed to cause the system to determine, based on the data associated with the specific event and the machine learning model with respect to the target threshold, whether the specific event is anomalous.
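A minimal sketch of this decision step, assuming a hypothetical model object exposing an estimate_probability method; the comparison direction follows the candidate-threshold rule described later (an event whose estimated probability exceeds the threshold is treated as anomalous):

```python
# Minimal sketch; `model.estimate_probability` is a hypothetical
# placeholder, and the comparison direction follows the
# candidate-threshold rule described later in the disclosure.
def is_anomalous(event_features, model, target_threshold):
    p = model.estimate_probability(event_features)
    return p > target_threshold
```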
  • a method for anomaly detection may include obtaining a plurality of samples. Each of the plurality of samples may be associated with an event. The method may further include determining, for each of the plurality of samples, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous based on a machine learning model for the anomaly detection. The method may further include determining, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model. The method may further include determining an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds. The method may further include determining, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
  • a non-transitory computer readable medium may store instructions that, when executed by a computer, cause the computer to implement a method.
  • the method may include one or more of the following operations.
  • the method may include obtaining a plurality of samples. Each of the plurality of samples may be associated with an event.
  • the method may further include determining, for each of the plurality of samples, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous based on a machine learning model for the anomaly detection.
  • the method may further include determining, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model.
  • the method may further include determining an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds.
  • the method may further include determining, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
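Putting the operations above together, a hedged end-to-end sketch of the described method might look as follows. All names are hypothetical placeholders, and drawing the candidates directly from the estimated probabilities is only one of the options described below.

```python
# Hedged end-to-end sketch of the described method; `evaluate` stands in
# for the evaluation step detailed later (e.g., an AUC-based index).
def select_target_threshold(samples, model, evaluate):
    # 1. Estimated probability that each sample's event is anomalous.
    estimated = [model.estimate_probability(s) for s in samples]
    # 2. Candidate thresholds drawn from the estimated probabilities.
    candidates = sorted(set(estimated))
    # 3. Evaluation result for the model under each candidate threshold.
    scores = {c: evaluate(samples, estimated, c) for c in candidates}
    # 4. Target threshold: the candidate with the best evaluation result.
    return max(scores, key=scores.get)
```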
  • a system for anomaly detection may include an acquisition module, a determination module and an evaluation module.
  • the acquisition module may be configured to obtain a plurality of samples. Each of the plurality of samples may be associated with an event.
  • the determination module may be configured to determine, for each of the plurality of samples, based on a machine learning model for the anomaly detection, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous.
  • the determination module may be also configured to determine, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model.
  • the evaluation module may be configured to determine an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds.
  • the determination module may be further configured to determine, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
  • FIG. 1 is a schematic diagram illustrating an exemplary anomaly detection system according to some embodiments of the present disclosure
  • FIG. 2 is a schematic diagram illustrating exemplary hardware and software components of a computing device according to some embodiments of the present disclosure
  • FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of a mobile device on which a terminal may be implemented according to some embodiments of the present disclosure
  • FIG. 4 is a block diagram illustrating an exemplary processing device according to some embodiments of the present disclosure.
  • FIG. 5 is a flowchart illustrating an exemplary process for determining a threshold of a machine learning model for anomaly detection according to some embodiments of the present disclosure
  • FIG. 6 is a flowchart illustrating an exemplary process for evaluating a machine learning model according to some embodiments of the present disclosure
  • FIG. 7 is a flowchart illustrating an exemplary process for anomaly detection according to some embodiments of the present disclosure.
  • FIGs. 8A-8D are schematic diagrams of exemplary anomaly detection results according to some embodiments of the present disclosure.
  • the terms “system,” “engine,” “unit,” “module,” and/or “block” used herein are one method to distinguish different components, elements, parts, sections, or assemblies of different levels in ascending order. However, the terms may be displaced by another expression if they achieve the same purpose.
  • the terms “module,” “unit,” or “block,” as used herein, refer to logic embodied in hardware or firmware, or to a collection of software instructions.
  • a module, a unit, or a block described herein may be implemented as software and/or hardware and may be stored in any type of non-transitory computer-readable medium or another storage device.
  • a software module/unit/block may be compiled and linked into an executable program. It will be appreciated that software modules can be callable from other modules/units/blocks or from themselves, and/or may be invoked in response to detected events or interrupts.
  • Software modules/units/blocks configured for execution on computing devices may be provided on a computer-readable medium, such as a compact disc, a digital video disc, a flash drive, a magnetic disc, or any other tangible medium, or as a digital download (and can be originally stored in a compressed or installable format that needs installation, decompression, or decryption prior to execution).
  • Such software code may be stored, partially or fully, on a storage device of the executing computing device, for execution by the computing device.
  • Software instructions may be embedded in firmware, such as an erasable programmable read-only memory (EPROM).
  • modules/units/blocks may be included in connected logic components, such as gates and flip-flops, and/or can be comprised of programmable units, such as programmable gate arrays or processors.
  • the modules/units/blocks or computing device functionality described herein may be implemented as software modules/units/blocks, but may also be implemented in hardware or firmware.
  • the modules/units/blocks described herein refer to logical modules/units/blocks that may be combined with other modules/units/blocks or divided into sub-modules/sub-units/sub-blocks despite their physical organization or storage. The description may be applicable to a system, an engine, or a portion thereof.
  • the flowcharts used in the present disclosure illustrate operations that systems implement according to some embodiments of the present disclosure. It is to be expressly understood that the operations of the flowcharts may not be implemented in order. Conversely, the operations may be implemented in inverted order or simultaneously. Moreover, one or more other operations may be added to the flowcharts, and one or more operations may be removed from the flowcharts.
  • Embodiments of the present disclosure may be applied to different transportation systems including but not limited to land transportation, sea transportation, air transportation, space transportation, or the like, or any combination thereof.
  • a vehicle of the transportation systems may include a rickshaw, travel tool, taxi, chauffeured car, hitch, bus, rail transportation (e.g., a train, a bullet train, high-speed rail, and subway), ship, airplane, spaceship, hot-air balloon, driverless vehicle, or the like, or any combination thereof.
  • the transportation system may also include any transportation system that applies management and/or distribution, for example, a system for sending and/or receiving an express.
  • the application scenarios of different embodiments of the present disclosure may include but are not limited to one or more webpages, browser plugins and/or extensions, client terminals, custom systems, intracompany analysis systems, artificial intelligence robots, or the like, or any combination thereof. It should be understood that the application scenarios of the system and method disclosed herein are only some examples or embodiments. Those having ordinary skills in the art, without further creative efforts, may apply these drawings to other application scenarios, for example, another similar server.
  • the terms “passenger,” “requester,” “requestor,” “service requester,” “service requestor,” and “customer” in the present disclosure are used interchangeably to refer to an individual, an entity, or a tool that may request or order a service.
  • the terms “driver,” “provider,” “service provider,” and “supplier” in the present disclosure are used interchangeably to refer to an individual, an entity, or a tool that may provide a service or facilitate the providing of the service.
  • the term “user” in the present disclosure may refer to an individual, an entity, or a tool that may request a service, order a service, provide a service, or facilitate the providing of the service.
  • the user may be a requester, a passenger, a driver, an operator, or the like, or any combination thereof.
  • the terms “requester” and “requester terminal” may be used interchangeably, and “provider” and “provider terminal” may be used interchangeably.
  • the terms “request,” “service,” “service request,” and “order” in the present disclosure are used interchangeably to refer to a request that may be initiated by a passenger, a requester, a service requester, a customer, a driver, a provider, a service provider, a supplier, or the like, or any combination thereof.
  • the service request may be accepted by any one of a passenger, a requester, a service requester, a customer, a driver, a provider, a service provider, or a supplier.
  • the service request may be chargeable or free.
  • Some embodiments of the present disclosure provide systems and methods for determining or predicting whether an event is anomalous using a model.
  • the model may be a machine learning model.
  • the model may be used with a target threshold that may serve as a classifier.
  • the model may be used to predict or determine whether an event is anomalous. For instance, the model may predict or determine a probability that an event is anomalous and then, by comparing the probability with the target threshold, determine/designate the event as anomalous or not.
  • Some embodiments of the present disclosure provide systems and methods for determining a model and a target threshold with respect to the model for anomaly detection that is used to determine or predict whether an event is anomalous.
  • the target threshold may be determined using a plurality of samples associated with different events, the anomaly status of each of which (i.e., whether the event is anomalous or not) may be known or unknown. Estimated probabilities corresponding to the plurality of samples may be determined. Further, a plurality of candidate thresholds associated with a machine learning model for anomaly detection may be determined. Then an evaluation result may be determined by evaluating the machine learning model, with respect to each of the plurality of candidate thresholds, for detecting anomalies in the samples. A target threshold associated with the machine learning model may be determined from the plurality of candidate thresholds based on the evaluation result.
  • the plurality of candidate thresholds may be applied to a machine learning model for anomaly detection to assess the accuracy and/or effectiveness of the candidate thresholds for the machine learning model in anomaly detection.
  • the target threshold may then be determined from the plurality of candidate thresholds by evaluating the machine learning model with respect to each of the plurality of candidate thresholds, which may further improve the accuracy and/or effectiveness of the machine learning model.
  • the systems and methods for anomaly detection according to some embodiments of the present disclosure may reduce or avoid the need to rely on the experience of an individual to select a threshold for a machine learning model for anomaly detection.
  • FIG. 1 is a schematic diagram illustrating an exemplary anomaly detection system 100 according to some embodiments of the present disclosure.
  • the anomaly detection system 100 may be a platform for data and/or information processing, for example, training a machine learning model for anomaly detection and/or data classification, such as image classification, text classification, etc.
  • the anomaly detection system 100 may be applied in intrusion detection, fault detection, network abnormal traffic detection, fraud detection, behavior abnormal detection, or the like, or a combination thereof.
  • An anomaly may also be referred to as an outlier, a novelty, noise, a deviation, an exception, etc.
  • an anomaly refers to an action or an event that is determined to be unusual or abnormal in view of known or inferred conditions.
  • the anomaly may include a network quality anomaly, a user access anomaly, a server anomaly, etc.
  • the anomaly may include an order anomaly, a driver behavior anomaly, a passenger behavior anomaly, a route anomaly, etc.
  • the anomaly detection system 100 may include a data exchange port 101, a data transmitting port 102, a server 110, and storage 120.
  • the anomaly detection system 100 may interact with a data providing system 130 and a service providing system 140 via the data exchange port 101 and the data transmitting port 102, respectively.
  • the anomaly detection system 100 may access information and/or data stored in the data providing system 130 via the data exchange port 101.
  • the server 110 may send information and/or data to a service providing system 140 via the data transmitting port 102.
  • the server 110 may process information and/or data relating to anomaly detection.
  • the server 110 may be a single server, or a server group.
  • the server group may be centralized, or distributed (e.g., the server 110 may be a distributed system) .
  • the server 110 may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
  • the server 110 may be implemented on a computing device having one or more components illustrated in FIG. 2 in the present disclosure.
  • the server 110 may include a processing device 112.
  • the processing device 112 may process information and/or data relating to anomaly detection to perform one or more functions described in the present disclosure.
  • the processing device 112 may receive a machine learning model for anomaly detection from the data providing system 130 and a sample set from the service providing system 140.
  • the processing device 112 may determine a target threshold for the machine learning model for anomaly detection based on the sample set.
  • the processing device 112 may estimate whether a specific sample received from the service providing system 140 is anomalous based on the target threshold using the machine learning model for anomaly detection.
  • the target threshold may be updated from time to time, e.g., periodically or not, based on a sample set that is at least partially different from the original sample set from which the original target threshold is determined. For instance, the target threshold may be updated based on a sample set including new samples that are not in the original sample set, samples whose anomaly is assessed using the machine learning model in connection with the original target threshold or a target threshold of a prior version, or the like, or a combination thereof. As still another example, the processing device 112 may transmit a signal including the estimated result to the service providing system 140. In some embodiments, the determination and/or updating of the target threshold may be performed on a processing device, while the application of a machine learning model in connection with the target threshold may be performed on a different processing device.
  • the determination and/or updating of the target threshold and/or the corresponding machine learning model may be performed on a processing device of a system different than the anomaly detection system 100 or a server different than the server 110 on which the application of a machine learning model in connection with the target threshold is performed.
  • the determination and/or updating of the target threshold and/or the machine learning model may be performed on a first system of a vendor who provides and/or maintains such a machine learning model, including the target threshold, and/or has access to training samples used to determine and/or update the target threshold and/or the machine learning model, while anomaly detection of an event based on the provided machine learning model, including the target threshold, may be performed on a second system of a client of the vendor.
  • the determination and/or updating of the target threshold and/or the machine learning model may be performed online in response to a request for anomaly detection of an event. In some embodiments, the determination and/or updating of the target threshold and/or the machine learning model may be performed offline.
  • the processing device 112 may include one or more processors (e.g., single-core processor(s) or multi-core processor(s)).
  • the processing device 112 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction-set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction-set computer (RISC), a microprocessor, or the like, or any combination thereof.
  • the storage 120 may store data and/or instructions related to content identification and/or data classification. In some embodiments, the storage 120 may store data obtained/acquired from the data providing system 130 and/or the service providing system 140. In some embodiments, the storage 120 may store data and/or instructions that the server 110 may execute or use to perform exemplary methods described in the present disclosure. In some embodiments, the storage 120 may include a mass storage device, a removable storage device, a volatile read-and-write memory, a read-only memory (ROM), or the like, or any combination thereof. Exemplary mass storage devices may include a magnetic disk, an optical disk, a solid-state drive, etc.
  • Exemplary removable storage devices may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc.
  • Exemplary volatile read-and-write memory may include a random access memory (RAM).
  • Exemplary RAM may include a dynamic RAM (DRAM), a double data rate synchronous dynamic RAM (DDR SDRAM), a static RAM (SRAM), a thyristor RAM (T-RAM), and a zero-capacitor RAM (Z-RAM), etc.
  • Exemplary ROM may include a mask ROM (MROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), a compact disk ROM (CD-ROM), and a digital versatile disk ROM, etc.
  • the storage 120 may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
  • the storage 120 may be connected to or communicate with the server 110.
  • the server 110 may access data or instructions stored in the storage 120 directly or via a network.
  • the storage 120 may be a part of the server 110.
  • the data providing system 130 may provide data and/or information related to anomaly detection and/or data classification.
  • the data and/or information may include images, text files, voice segments, web pages, video recordings, user requests, programs, applications, algorithms, instructions, computer codes, or the like, or a combination thereof.
  • the data providing system 130 may provide the data and/or information to the server 110 and/or the storage 120 of the anomaly detection system 100 for processing (e.g., train a machine learning model for anomaly detection) .
  • the data providing system 130 may provide the data and/or information to the service providing system 140 for generating a service response relating to the anomaly detection and/or data classification.
  • the service providing system 140 may be configured to provide online services, such as an anomaly detection service, an online to offline service (e.g., a taxi service, a carpooling service, a food delivery service, a party organization service, an express service, etc. ) , an unmanned driving service, a medical service, a map-based service (e.g., a route planning service) , a live chatting service, a query service, a Q&A service, etc.
  • the service providing system 140 may generate service responses, for example, by inputting the data and/or information received from a user and/or the data providing system 130 into a machine learning model for anomaly detection.
  • the data providing system 130 and/or the service providing system 140 may be a device, a platform, or other entity interacting with the anomaly detection system.
  • the data providing system 130 may be implemented in a device with data acquisition and/or data storage, such as a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, a server 130-4, a storage device (not shown), or the like, or any combination thereof.
  • the service providing system 140 may also be implemented in a device with data processing, such as a mobile device 140-1, a tablet computer 140-2, a laptop computer 140-3, a server 140-4, or the like, or any combination thereof.
  • the mobile devices 130-1 and 140-1 may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof.
  • the smart home device may include a smart lighting device, a control device of an intelligent electrical apparatus, a smart monitoring device, a smart television, a smart video camera, an interphone, or the like, or any combination thereof.
  • the wearable device may include a smart bracelet, a smart footgear, a smart glass, a smart helmet, a smart watch, a smart clothing, a smart backpack, a smart accessory, or the like, or any combination thereof.
  • the smart mobile device may include a smartphone, a personal digital assistant (PDA), a gaming device, a navigation device, a point of sale (POS) device, or the like, or any combination thereof.
  • the virtual reality device and/or the augmented reality device may include a virtual reality helmet, a virtual reality glass, a virtual reality patch, an augmented reality helmet, an augmented reality glass, an augmented reality patch, or the like, or any combination thereof.
  • the virtual reality device and/or the augmented reality device may include a Google Glass, an Oculus Rift, a HoloLens, a Gear VR, etc.
  • the servers 130-4 and 140-4 may include a database server, a file server, a mail server, a web server, an application server, a computing server, a media server, a communication server, etc.
  • the data providing system 130 may be a device with data processing technology for preprocessing acquired or stored information (e.g., identifying images from stored information) .
  • the service providing system 140 may be a device for data processing, for example, training an identification model using a cleaned dataset received from the server 110.
  • the service providing system 140 may directly communicate with the data providing system 130 via a network 150-3.
  • the service providing system 140 may receive a dataset from the data providing system 130, and perform an anomaly detection on the dataset based on a machine learning model for anomaly detection.
  • any two systems of the anomaly detection system 100, the data providing system 130, and the service providing system 140 may be integrated into a device or a platform.
  • both the data providing system 130 and the service providing system 140 may be implemented in a mobile device of a user.
  • the anomaly detection system 100, the data providing system 130, and the service providing system 140 may be integrated into a device or a platform.
  • the anomaly detection system 100, the data providing system 130, and the service providing system 140 may be implemented in a computing device including a server and a user interface.
  • Networks 150-1 through 150-3 may facilitate exchange of information and/or data.
  • one or more components in the anomaly detection system 100 (e.g., the server 110 and/or the storage 120) may exchange information and/or data with the data providing system 130 and/or the service providing system 140 via the networks 150-1 through 150-3.
  • the server 110 may obtain/acquire datasets for anomaly detection from the data providing system 130 via the network 150-1.
  • the server 110 may transmit/output estimated results for anomaly detection to the service providing system 140 via the network 150-2.
  • the networks 150-1 through 150-3 may be any type of wired or wireless networks, or combination thereof.
  • the networks 150 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, an Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public telephone switched network (PSTN), a Bluetooth™ network, a ZigBee™ network, a near field communication (NFC) network, a global system for mobile communications (GSM) network, a code-division multiple access (CDMA) network, a time-division multiple access (TDMA) network, a general packet radio service (GPRS) network, an enhanced data rate for GSM evolution (EDGE) network, a wideband code division multiple access (WCDMA) network, a high speed downlink packet access (HSDPA) network, a long term evolution (LTE) network, a user datagram protocol (UDP) network, or the like, or any combination thereof.
  • FIG. 2 illustrates a schematic diagram of an exemplary computing device 200 according to some embodiments of the present disclosure.
  • the computing device 200 may be a computer, such as the server 110 in FIG. 1 and/or a computer with specific functions, configured to implement any particular system according to some embodiments of the present disclosure.
  • the computing device 200 may be configured to implement any component that performs one or more functions disclosed in the present disclosure.
  • the server 110 (e.g., the processing device 112) may be implemented on the computing device 200.
  • FIG. 2 depicts only one computing device.
  • the functions of the computing device may be implemented by a group of similar platforms in a distributed mode to disperse the processing load of the system.
  • the computing device 200 may include a communication terminal 250 that may connect with a network to implement data communication.
  • the computing device 200 may also include a processor 220 that is configured to execute instructions and includes one or more processors.
  • the schematic computer platform may include an internal communication bus 210, different types of program storage units and data storage units (e.g., a hard disk 270, a read-only memory (ROM) 230, a random-access memory (RAM) 240), various data files applicable to computer processing and/or communication, and some program instructions possibly executed by the processor 220.
  • the computing device 200 may also include an I/O device 260 that may support the input and output of data flows between the computing device 200 and other components. Moreover, the computing device 200 may receive programs and data via the communication network.
  • FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of an exemplary mobile device on which a service system (e.g., the anomaly detection system 100, the data providing system 130, and/or the service providing system 140) may be implemented according to some embodiments of the present disclosure.
  • the mobile device 300 may include a communication platform 310, a display 320, a graphics processing unit (GPU) 330, a central processing unit (CPU) 340, an I/O 350, a memory 360, a mobile operating system (OS) 370, application(s) 380, and a storage 390.
  • any other suitable component, including but not limited to a system bus or a controller (not shown), may also be included in the mobile device 300.
  • the mobile operating system 370 (e.g., iOS™, Android™, Windows Phone™, etc.) and one or more applications 380 may be loaded into the memory 360 from the storage 390 in order to be executed by the CPU 340.
  • the applications 380 may include a browser or any other suitable mobile apps for receiving and rendering information relating to image processing or other information from the anomaly detection system 100.
  • User interactions with the information stream may be achieved via the I/O 350 and provided to the storage device 120, the server 110, and/or other components of the anomaly detection system 100.
  • the mobile device 300 may be an exemplary embodiment corresponding to a terminal associated with, the anomaly detection system 100, the data providing system 130 and/or the service providing system 140.
  • computer hardware platforms may be used as the hardware platform(s) for one or more of the elements described herein.
  • a computer with user interface elements may be used to implement a personal computer (PC) or any other type of work station or terminal device.
  • a computer may also act as a system if appropriately programmed.
  • FIG. 4 is a block diagram illustrating an exemplary processing device 112 according to some embodiments of the present disclosure.
  • the processing device 112 may include an acquisition module 410, a determination module 420, an evaluation module 430, and a storage module 440.
  • the acquisition module 410 may be configured to obtain data for anomaly detection.
  • the acquisition module 410 may obtain a plurality of samples. Each of the plurality of samples may be associated with an event.
  • an event may be defined by information and/or data indicating that something has happened at a specific time or time period.
  • the acquisition module 410 may also obtain data associated with a specific event.
  • the specific event may be an event associated with one of the plurality of samples.
  • the data associated with the specific event may be one of the plurality of samples.
  • the data associated with the specific event may include one or more features that characterize the specific event as described elsewhere in the present disclosure.
  • the acquisition module 410 may be configured to obtain a model, such as a machine learning model for anomaly detection, a probability estimation model, etc.
  • the determination module 420 may be configured to determine an estimated probability for each of the plurality of samples based on a machine learning model for anomaly detection. Each of the plurality of samples may correspond to an estimated probability. The determination module 420 may further determine a plurality of candidate thresholds associated with the machine learning model based on the estimated probabilities of the plurality of samples. The determination module 420 may also determine a target threshold associated with the machine learning model from the plurality of candidate thresholds based on an evaluation result corresponding to each of the plurality of candidate thresholds. The determination module 420 may determine whether the specific event is anomalous based on the data associated with the specific event and the machine learning model with the target threshold.
  • the evaluation module 430 may be configured to determine the evaluation result by evaluating the machine learning model for anomaly detection with respect to each of the plurality of candidate thresholds.
  • the evaluation module 430 may evaluate the machine learning model with respect to each of the plurality of candidate thresholds according to one or more evaluation indexes.
  • the evaluation result may be denoted by one or more values of the one or more evaluation indexes.
  • the storage module 440 may be configured to store information.
  • the information may include programs, software, algorithms, data, text, number, images and some other information.
  • the information may include data which may define an event indicating something has happened at a specific time or time period, etc.
  • the information may include a machine learning model for anomaly detection.
  • any module mentioned above may be implemented in two or more separate units.
  • the functions of determination module 420 may be implemented in two separate units, one of which is configured to determine an estimated probability corresponding to each of the plurality of samples, and the other is configured to determine candidate thresholds associated with the machine learning model.
  • the processing device 112 may further include one or more additional modules (e.g., a storage module). Additionally or alternatively, one or more modules mentioned above may be omitted.
  • FIG. 5 is a flowchart illustrating an exemplary process 500 for determining a threshold of a machine learning model for anomaly detection according to some embodiments of the present disclosure. At least a portion of process 500 may be implemented on the computing device 200 as illustrated in FIG. 2 or the mobile device 300 as illustrated in FIG. 3. In some embodiments, one or more operations of process 500 may be implemented in the anomaly detection system 100 as illustrated in FIG. 1.
  • one or more operations in the process 500 may be stored in a storage device (e.g., the storage 120, the ROM 230, the RAM 240, the storage 390) as a form of instructions, and invoked and/or executed by the server 110 (e.g., the processing device 112 in the server 110, or the processor 220 of the computing device 200) or the CPU 340 of the mobile device 300.
  • the instructions may be transmitted in the form of electronic current or electrical signals.
  • the processing device 112 may obtain a plurality of samples. Each of the plurality of samples may be associated with an event.
  • the plurality of samples may be obtained by the acquisition module 410 from a storage device (e.g., the storage device 120, the ROM 230, the RAM 240, the storage 390) as described elsewhere in the present disclosure.
  • an event may be defined by information and/or data indicating that something has happened at a specific time or within a specific time period.
  • the event may include logging in to the online taxi-hailing platform, initiating a service request, dispatching a service request, picking up a passenger, transporting a passenger to a destination along a predetermined route, a communication between a driver and a passenger on a route, a communication between a client terminal and a server associated with the online taxi-hailing platform, or the like, or a combination thereof.
  • a sample associated with an event may be also referred to as sample data.
  • the sample may be in the form of an image, a video, text, etc.
  • a sample associated with a specific event may include and/or represent one or more features that characterize the specific event.
  • the one or more features associated with the specific event may be presented as a feature vector (e.g., a multi-dimensional vector) .
  • Each dimension of the feature vector may represent a feature of the specific event. For example, in an online taxi-hailing platform, an event may include transporting a passenger to a destination along a predetermined route.
  • the sample data (e.g., one or more features) associated with the event may include a start location, a start time, a destination, an estimated time of arrival, a real-time location, a travel trajectory (e.g., the whole length of the travel trajectory, the whole travel time of the travel trajectory, the length of a road segment in the travel trajectory, the travel time of a road segment in the travel trajectory, etc.), or the like, or any combination thereof.
  • the plurality of samples may form a sample set.
  • the sample set may be denoted as S = {s_1, s_2, ..., s_n}. An element s_i in the sample set may represent a sample.
  • a sample may correspond to a multi-dimensional feature vector denoted as [f_1, f_2, f_3, ...] that represents one or more features of the event.
  • for example, f_1 may represent the start location, f_2 may represent the destination, and f_3 may represent the travel trajectory.
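For concreteness, a sketch of one such sample follows; the specific features and their encodings (coordinates, kilometers, minutes) are illustrative assumptions.

```python
# Illustrative sketch: one sample for a ride event, flattened into the
# [f_1, f_2, f_3, ...] feature-vector form above. Feature choice and
# encodings are assumptions, not taken from the disclosure.
ride_event = {
    "start_location": (39.9042, 116.4074),  # (lat, lon), hypothetical
    "destination": (39.9163, 116.3972),     # (lat, lon), hypothetical
    "trajectory_length_km": 5.8,
    "travel_time_min": 23.0,
}
feature_vector = [
    *ride_event["start_location"],
    *ride_event["destination"],
    ride_event["trajectory_length_km"],
    ride_event["travel_time_min"],
]
```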
  • the processing device 112 may determine an estimated probability for each of the plurality of samples based on a machine learning model for anomaly detection.
  • Each of the plurality of samples may correspond to an estimated probability.
  • the estimated probability of a specific sample determined based on the machine learning model for anomaly detection may refer to a possibility that the event corresponding to the specific sample is anomalous. The smaller the value of the estimated probability is, the higher the possibility that the event is anomalous may be.
  • the machine learning model for anomaly detection may be configured to generate and/or output the estimated probability of an event using the sample corresponding to the event.
  • the processing device 112 may input a specific sample into the machine learning model for anomaly detection.
  • the machine learning model for anomaly detection may generate and output the estimated probability of an event associated with the specific sample using the inputted specific sample.
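The disclosure does not fix how a model's raw output is mapped to an estimated probability. The sketch below shows one plausible mapping, min-max normalization of anomaly scores into [0, 1], purely as an assumption; it preserves the convention above that a smaller value indicates a likelier anomaly.

```python
# Hedged sketch: map raw anomaly scores to estimated probabilities in
# [0, 1] via min-max normalization (an assumption; the disclosure does
# not specify a mapping). Lower scores, hence lower probabilities,
# correspond to likelier anomalies, matching the text's convention.
import numpy as np

def estimated_probabilities(scores):
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    if hi == lo:  # degenerate case: all scores identical
        return np.full_like(scores, 0.5)
    return (scores - lo) / (hi - lo)
```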
  • the machine learning model for anomaly detection may be obtained by the acquisition module 410 from the data providing system 130, the storage 120, the service providing system 140, or any other storage device as described elsewhere in the present disclosure.
  • the machine learning model for anomaly detection may include an unsupervised machine learning model, a semi-supervised machine learning model, etc.
  • Exemplary unsupervised machine learning models may include using a classification-based algorithm, a statistical distribution-based algorithm, a proximity-based algorithm, a density-based algorithm, a cluster-based algorithm, a tree-based algorithm, etc.
  • the classification-based algorithm may include using a neural network model, a Bayesian network model, a one-class support vector machine (SVM) model, a robust SVM, a one-class kernel Fisher discriminant model, etc.
  • the statistical distribution-based algorithm may include using a Gaussian model, a robust regression model, etc.
  • the proximity-based algorithm may include a K-nearest neighbor (KNN) algorithm, an outlier detection using in-degree number (ODIN) algorithm, etc.
  • the density-based algorithm may include a local outlier factor (LOF) algorithm, a connectivity-based outlier factor (COF) algorithm, etc.
  • the tree-based algorithm may include an isolation forest (iForest) algorithm, an interpretable hierarchical clustering unsupervised decision tree (IHCUDT) algorithm, etc.
  • the cluster-based algorithm may include a shared nearest neighbor (SNN) clustering algorithm, a wave cluster algorithm, a K-means clustering algorithm, a self-organizing maps algorithm, an expectation maximization (EM) algorithm, etc.
  • Exemplary semi-supervised machine learning models may include using a Markovian model, a finite state automata (FSA) model, a hidden Markov model (HMM), a probabilistic suffix tree (PST) model, etc.
  • the processing device 112 may determine a plurality of candidate thresholds associated with the machine learning model.
  • a candidate threshold associated with the machine learning model for anomaly detection may be configured to determine whether an event is anomalous. For example, if an estimated probability of an event determined using the machine learning model for anomaly detection is greater than the candidate threshold, the processing device 112 may determine that the event is anomalous when using the candidate threshold.
  • the processing device 112 may determine at least a portion of the plurality of candidate thresholds based on the estimated probability determined in operation 504. For example, the processing device 112 may determine a portion or all of the plurality of candidate thresholds based on estimated probabilities corresponding to at least a portion of the plurality of samples. Further, the processing device 112 may designate each of the estimated probabilities corresponding to at least a portion of the plurality of samples as one of the plurality of candidate thresholds. As a further example, the processing device 112 may rank the estimated probability corresponding to each of the plurality of samples (e.g., in an ascending order or a descending order).
  • the processing device 112 may determine a portion or all of the plurality of candidate thresholds based on the ranked estimated probabilities of the plurality of samples. An estimated probability ranked as, e.g., the top, the bottom, or the middle of the ranked estimated probabilities may be designated as a candidate threshold. As still another example, the processing device 112 may designate one or more estimated probabilities of at least a portion of the plurality of samples within a certain range as one or more candidate thresholds. In some embodiments, the processing device 112 may designate the estimated probability corresponding to each of the plurality of samples as one of the plurality of candidate thresholds.
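A sketch of this candidate-generation step, assuming the ranked estimated probabilities themselves serve as candidates; capping the number of candidates is an illustrative addition used only to bound the cost of the later evaluation step.

```python
# Sketch: candidate thresholds from ranked estimated probabilities.
# The optional cap on the number of candidates is an assumption.
import numpy as np

def candidate_thresholds(estimated_probs, max_candidates=None):
    ranked = np.unique(estimated_probs)  # unique values, ascending order
    if max_candidates is not None and ranked.size > max_candidates:
        # Keep an evenly spaced subset of the ranked probabilities.
        idx = np.linspace(0, ranked.size - 1, max_candidates).astype(int)
        ranked = ranked[idx]
    return ranked
```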
  • the processing device 112 may determine at least a portion of the plurality of candidate thresholds using a probability estimation model. Further, the processing device 112 may determine a reference probability corresponding to each of the plurality of samples using the probability estimation model. The reference probability corresponding to a specific sample of the plurality of samples may be used to measure and/or assess similarity between the specific sample and one or more other samples of the plurality of samples. The greater the reference probability corresponding to the specific sample is, the higher the similarity between the specific sample and the one or more other samples of the plurality of samples may be. The processing device 112 may determine a portion or all of the plurality of candidate thresholds based on reference probabilities corresponding to at least a portion of the plurality of samples.
  • the processing device 112 may designate each of the reference probabilities corresponding to a portion of the plurality of samples as one of the plurality of candidate thresholds. As another example, the processing device 112 may designate each of the reference probabilities corresponding to all of the plurality of samples as one of the plurality of candidate thresholds.
  • Exemplary probability estimation models may include using a parametric estimation algorithm, a Bayes algorithm, a non-parametric estimation algorithm, etc.
  • the parametric estimation algorithm may include a maximum likelihood algorithm.
  • the non-parametric estimation algorithm may include a histogram probability estimation algorithm, a kernel density estimation algorithm, etc.
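As one concrete instance of such a probability estimation model, kernel density estimation could yield a reference probability per sample, as sketched below; the kernel and bandwidth are hypothetical choices.

```python
# Sketch: reference probabilities via kernel density estimation, one of
# the probability estimation models named above. Kernel and bandwidth
# are hypothetical choices.
import numpy as np
from sklearn.neighbors import KernelDensity

def reference_probabilities(X, bandwidth=0.5):
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(X)
    # score_samples returns log-densities; exponentiate so each sample
    # gets a density value usable as its reference probability.
    return np.exp(kde.score_samples(X))
```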
  • At least a portion of the plurality of candidate thresholds may be set by a user or according to a default setting of the anomaly detection system 100.
  • the processing device 112 may determine an evaluation result by evaluating the machine learning model for anomaly detection with respect to each of the plurality of candidate thresholds.
  • the processing device 112 may evaluate the machine learning model with respect to each of the plurality of candidate thresholds according to one or more evaluation indexes.
  • the evaluation result may be denoted by one or more values of the one or more evaluation indexes.
  • Exemplary evaluation indexes of the machine learning model for anomaly detection may include an area under curve (AUC), a Gini coefficient, or the like, or any combination thereof.
  • An evaluation index may be used to measure and/or indicate the accuracy of estimation results of the machine learning model for anomaly detection. For example, the greater the value of the AUC of the machine learning model for anomaly detection with respect to a candidate threshold is, the greater the accuracy of estimation results of the machine learning model for anomaly detection may be.
  • the processing device 112 may determine the value of an evaluation index of the machine learning model for anomaly detection with respect to a specific candidate threshold using the plurality of samples. For example, the processing device 112 may determine a reference probability corresponding to each of the plurality of samples using a probability estimation model as described elsewhere in the present disclosure. The processing device 112 may determine the value of the evaluation index using the estimated probability and the reference probability corresponding to each of the plurality of samples. For the specific candidate threshold, the processing device 112 may assign a label to each of the plurality of samples based on the specific candidate threshold and the estimated probability corresponding to each of the plurality of samples. The label may include a positive sample or a negative sample.
  • the processing device 112 may designate a sample as a positive sample if the estimated probability corresponding to the sample exceeds the specific candidate threshold.
  • the processing device 112 may designate a sample as a negative sample if the estimated probability corresponding to the sample is smaller than the specific candidate threshold.
  • the processing device 112 may determine an evaluation index corresponding to the specific candidate threshold based on the label and the reference probability corresponding to each of the plurality of samples. More descriptions for determining the evaluation result may be found elsewhere in the present disclosure (e.g., FIG. 6 and the descriptions thereof) .
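A sketch of this evaluation step under one candidate threshold, using the labeling rule just described and scikit-learn's AUC; where a Gini coefficient is wanted instead, it can be derived from the AUC as Gini = 2·AUC − 1.

```python
# Sketch: evaluate the model under one candidate threshold. Samples are
# labeled positive when their estimated probability exceeds the
# threshold (per the text), and the AUC is computed using the reference
# probabilities as ranking scores.
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate_candidate(estimated_probs, reference_probs, threshold):
    labels = (np.asarray(estimated_probs) > threshold).astype(int)
    if labels.min() == labels.max():
        return float("nan")  # AUC is undefined when only one class appears
    return roc_auc_score(labels, reference_probs)
```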
  • the processing device 112 may determine a target threshold associated with the machine learning model from the plurality of candidate thresholds based on the evaluation result corresponding to each of the plurality of candidate thresholds. In some embodiments, the processing device 112 may compare the evaluation result corresponding to each of the plurality of candidate thresholds. Each of the plurality of candidate thresholds may correspond to an evaluation result, i.e., a value of an evaluation index. The processing device 112 may determine the target threshold based on the comparison. The processing device 112 may compare the values of an evaluation index (e.g., AUC) of the machine learning model with respect to the plurality of candidate thresholds.
  • the processing device 112 may designate a candidate threshold that corresponds to the maximum of the values of the evaluation index (e.g., AUC) of the machine learning model with respect to the plurality of candidate thresholds as the target threshold.
  • the processing device 112 may determine at least two candidate thresholds from the plurality of candidate thresholds.
  • the values of the evaluation index (e.g., AUC) of the machine learning model corresponding to the at least two candidate thresholds may be greater than (or smaller than) the values of the evaluation index corresponding to the other candidate thresholds of the plurality of candidate thresholds.
  • the processing device 112 may designate an average of the at least two candidate thresholds as the target threshold. Both selection variants are illustrated in the sketch below.
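The following is a minimal Python sketch of the selection step just described, covering both variants: designating the single candidate with the best evaluation-index value, and averaging the top candidates. The function name `select_target_threshold` and the `top_k` parameter are illustrative rather than taken from the disclosure; the candidate/AUC pairs in the usage lines reuse the values reported later for FIGs. 8A-8D.

```python
import numpy as np

def select_target_threshold(candidates, index_values, top_k=1):
    """Pick the candidate threshold(s) with the largest evaluation-index
    values (e.g., AUC); with top_k >= 2, average the top candidates."""
    candidates = np.asarray(candidates, dtype=float)
    index_values = np.asarray(index_values, dtype=float)
    best = np.argsort(index_values)[::-1][:top_k]  # indices of the top_k values
    return float(candidates[best].mean())

cands = [0.1387, 0.1728, 0.5838, 0.8272]  # candidate thresholds (cf. FIGs. 8A-8D)
aucs = [0.9376, 0.9671, 0.9998, 0.9980]   # AUC with respect to each candidate
print(select_target_threshold(cands, aucs))           # -> 0.5838 (argmax AUC)
print(select_target_threshold(cands, aucs, top_k=2))  # -> 0.7055 (mean of top two)
```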
  • one or more operations may be omitted and/or one or more additional operations may be added.
  • operation 510 may be combined into operation 502.
  • Operation 512 and operation 514 may be omitted.
  • one or more operations in process 600 may be added to process 500 to determine an evaluation result of the machine learning model.
  • process 500 may also include performing an anomaly detection on the plurality of samples based on the machine learning model for anomaly detection with the target threshold.
  • the processing device 112 may determine one or more anomalies from the plurality of samples using the machine learning model for anomaly detection with the target threshold.
  • a target machine learning model may be determined based on samples each of whose anomaly status is known, and one or more operations illustrated in FIG. 5 may be omitted. For instance, operation 504 may be omitted.
  • FIG. 6 is a flowchart illustrating an exemplary process for evaluating a machine learning model according to some embodiments of the present disclosure. At least a portion of process 600 may be implemented on the computing device 200 as illustrated in FIG. 2 or the mobile device 300 as illustrated in FIG. 3. In some embodiments, one or more operations of process 600 may be implemented in the anomaly detection system 100 as illustrated in FIG. 1.
  • one or more operations in the process 600 may be stored in a storage device (e.g., the storage 120, the ROM 230, the RAM 240, the storage 390) as a form of instructions, and invoked and/or executed by the server 110 (e.g., the processing device 112 in the server 110, or the processor 220 of the computing device 200) or the CPU 340 of the mobile device 300.
  • the instructions may be transmitted in the form of electronic current or electrical signals. Operation 508 may be performed according to process 600 as described in FIG. 6.
  • the processing device 112 may determine a reference probability corresponding to each of a plurality of samples based on a probability estimation model.
  • the plurality of samples may be obtained as described in connection with 502.
  • a sample may be associated with an event.
  • the sample may include one or more features (e.g., a feature vector) that characterize the event.
  • the reference probability corresponding to a specific sample of the plurality of samples may be used to measure and/or indicate a similarity between the specific sample and other samples of the plurality of samples. The greater the reference probability corresponding to a specific sample is, the larger the similarity between the specific sample and other samples of the plurality of samples may be.
  • the processing device 112 may determine a reference probability corresponding to a specific sample based on the specific sample and other samples of the plurality of samples using the probability estimation model.
  • Exemplary probability estimation models may include a parametric estimation algorithm, a Bayesian algorithm, a non-parametric estimation algorithm, etc.
  • the parametric estimation algorithm may include a maximum likelihood algorithm.
  • the non-parametric estimation algorithm may include a histogram probability estimation algorithm, a kernel density estimation algorithm, etc.
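As one concrete reading of the kernel density estimation option, the sketch below fits a Gaussian KDE over all samples and evaluates it at each sample, so samples similar to the bulk of the data receive higher reference probabilities. Normalizing the densities into a probability-like score is an assumption of this sketch, not a detail fixed by the disclosure.

```python
import numpy as np
from scipy.stats import gaussian_kde

def reference_probabilities(samples):
    """Reference probability per sample from a Gaussian KDE over all samples.

    `samples` is an (n, d) array of feature vectors; scipy's gaussian_kde
    expects the transposed (d, n) layout.
    """
    X = np.atleast_2d(np.asarray(samples, dtype=float))
    kde = gaussian_kde(X.T)
    density = kde(X.T)              # density evaluated at each sample
    return density / density.sum()  # normalize to a probability-like score

# Toy 1-D example: the outlier 9.0 receives the smallest reference probability.
print(reference_probabilities([[1.0], [1.1], [0.9], [9.0]]))
```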
  • the processing device 112 may determine an estimated label for each of the plurality of samples based on a candidate threshold and an estimated probability corresponding to each of the plurality of samples.
  • the estimated probability may be determined as described in connection with operation 504.
  • the candidate threshold may be determined as described in connection with operation 506.
  • the estimated probability corresponding to a specific sample may be used to measure and/or indicate a possibility that an event associated with the specific sample is anomalous.
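The estimated probabilities themselves come from the machine learning model in operation 504, described earlier in the disclosure. Assuming an isolation forest (one of the example models), a hedged sketch of producing scores in [0, 1] follows. Because the detection convention used later treats a below-threshold probability as anomalous, the scikit-learn scores (higher means more normal) are min-max scaled without negation; this scaling is an assumption of the sketch, not part of the disclosure.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def estimated_probabilities(samples, random_state=0):
    """Scale isolation-forest scores to [0, 1]; low values suggest anomalies.

    scikit-learn's score_samples returns larger values for more normal
    points, matching the below-threshold-is-anomalous convention here.
    """
    X = np.asarray(samples, dtype=float)
    model = IsolationForest(random_state=random_state).fit(X)
    raw = model.score_samples(X)
    return (raw - raw.min()) / (raw.max() - raw.min() + 1e-12)

# Toy 1-D example: the planted outlier 9.0 scores lowest.
print(estimated_probabilities([[1.0], [1.1], [0.9], [9.0]]))
```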
  • the processing device 112 may determine an estimated label corresponding to a specific sample by comparing the estimated probability corresponding to the specific sample with the candidate threshold.
  • the estimated label may include a negative sample or a positive sample.
  • the negative sample may indicate that an event associated with the negative sample is anomalous.
  • the positive sample may indicate that an event associated with the positive sample is normal.
  • if the estimated probability corresponding to the specific sample exceeds the candidate threshold, the processing device 112 may label the specific sample as a positive sample. If the estimated probability corresponding to the specific sample is smaller than the candidate threshold, the processing device 112 may label the specific sample as a negative sample.
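A one-line sketch of this labeling rule. Ties are labeled positive here, which matches the worked example below, where sample D's estimated probability equals the candidate threshold and D is labeled positive.

```python
def designate_labels(estimated_probs, threshold):
    # Positive if the estimated probability reaches the candidate threshold,
    # negative otherwise.
    return ["positive" if p >= threshold else "negative" for p in estimated_probs]

print(designate_labels([0.9, 0.8, 0.75, 0.85], 0.85))
# -> ['positive', 'negative', 'negative', 'positive']
```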
  • the processing device 112 may determine an evaluation index of the machine learning model with respect to the candidate threshold based on the reference probability and the estimated label corresponding to each of the plurality of samples.
  • Exemplary evaluation indexes of the machine learning model for anomaly detection may include an area under curve (AUC) , a Gini coefficient, etc.
  • AUC of the machine learning model for anomaly detection may be defined by a probability that, when the machine learning model is used for anomaly detection, a random positive sample is ranked above a random negative sample. The greater the AUC of a machine learning model for anomaly detection is, the greater the accuracy of the estimation results of the machine learning model for anomaly detection may be.
  • the processing device 112 may rank the samples according to the reference probability from small to large.
  • the processing device 112 may determine, statistically, the probability that a positive sample is ranked above a negative sample. For example, the processing device 112 may determine the value of AUC according to Equation (1) below:

    $$\mathrm{AUC} = \frac{1}{M \times N} \sum_{m=1}^{M} \sum_{n=1}^{N} I\left(P_{positive}^{(m)}, P_{negative}^{(n)}\right) \qquad (1)$$

    where M refers to a count of positive samples of the plurality of samples, N refers to a count of negative samples of the plurality of samples, $P_{positive}$ refers to a reference probability (i.e., a true score) of a positive sample, and $P_{negative}$ refers to a reference probability (i.e., a true score) of a negative sample. $I(P_{positive}, P_{negative})$ may be determined according to Equation (2) below:

    $$I\left(P_{positive}, P_{negative}\right) = \begin{cases} 1, & P_{positive} > P_{negative} \\ 0.5, & P_{positive} = P_{negative} \\ 0, & P_{positive} < P_{negative} \end{cases} \qquad (2)$$
  • the plurality of samples may include M positive samples and N negative samples.
  • the processing device 112 may determine M*N sample pairs. Each of the M*N sample pairs may include a negative sample and a positive sample.
  • the processing device 112 may compare a reference probability corresponding to each of the negative sample and the positive sample in each of the M*N sample pairs to determine the AUC of the machine learning model for anomaly detection with respect to the candidate threshold.
  • suppose, for example, that the plurality of samples includes four samples A, B, C, and D, whose estimated probabilities are 0.9, 0.8, 0.75, and 0.85, respectively, and whose reference probabilities are 0.7, 0.6, 0.8, and 0.9, respectively.
  • the candidate threshold equals the estimated probability 0.85 of sample D.
  • the processing device 112 may label sample A as a positive sample, sample B as a negative sample, sample C as a negative sample, and sample D as a positive sample by comparing the estimated probabilities of samples A, B, C, and D, respectively, with the candidate threshold 0.85.
  • the processing device 112 may determine 4 sample pairs including (A, B), (A, C), (D, B), and (D, C). According to Equation (2), the processing device 112 may determine I (A, B), I (A, C), I (D, B), and I (D, C) as 1, 0, 1, and 1, respectively. According to Equation (1), the AUC of the machine learning model for anomaly detection may be determined as 3/4 (i.e., 0.75) when designating the estimated probability 0.85 of sample D as the candidate threshold. Similarly, the processing device 112 may determine the AUCs of the machine learning model for anomaly detection when designating the estimated probabilities of samples A, B, and C as the candidate threshold, respectively.
  • the Gini coefficient of the machine learning model for anomaly detection with respect to the candidate threshold may be determined based on the AUC according to Equation (3) below:

    $$\mathrm{Gini} = 2 \times \mathrm{AUC} - 1 \qquad (3)$$
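The sketch below implements Equations (1)-(3) directly as pairwise comparisons and reproduces the worked example: with samples A-D labeled at the candidate threshold 0.85, the AUC comes out to 0.75 and the Gini coefficient to 0.5. The function name is illustrative.

```python
def pairwise_auc(reference_probs, labels):
    """AUC per Equations (1)-(2): across all positive/negative sample pairs,
    count how often the positive sample's reference probability exceeds the
    negative sample's (ties count 0.5), then divide by M * N."""
    pos = [r for r, l in zip(reference_probs, labels) if l == "positive"]
    neg = [r for r, l in zip(reference_probs, labels) if l == "negative"]
    total = 0.0
    for p in pos:
        for n in neg:
            total += 1.0 if p > n else (0.5 if p == n else 0.0)
    return total / (len(pos) * len(neg))

# Samples A, B, C, D from the worked example above.
reference = [0.7, 0.6, 0.8, 0.9]                           # reference probabilities
labels = ["positive", "negative", "negative", "positive"]  # labels at threshold 0.85
auc = pairwise_auc(reference, labels)
gini = 2 * auc - 1                                         # Equation (3)
print(auc, gini)                                           # -> 0.75 0.5
```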
  • FIG. 7 is a flowchart illustrating an exemplary process 700 for anomaly detection according to some embodiments of the present disclosure. At least a portion of process 700 may be implemented on the computing device 200 as illustrated in FIG. 2 or the mobile device 300 as illustrated in FIG. 3. In some embodiments, one or more operations of process 700 may be implemented in the anomaly detection system 100 as illustrated in FIG. 1.
  • one or more operations in the process 700 may be stored in a storage device (e.g., the storage 120, the ROM 230, the RAM 240, the storage 390) as a form of instructions, and invoked and/or executed by the server 110 (e.g., the processing device 112 in the server 110, or the processor 220 of the computing device 200) or the CPU 340 of the mobile device 300.
  • the instructions may be transmitted in the form of electronic current or electrical signals.
  • the processing device 112 may obtain data associated with a specific event.
  • the specific event may be an event associated with one of the plurality of samples.
  • the data associated with the specific event may be one of the plurality of samples.
  • the data associated with the specific event may be obtained by the acquisition module 410 from the data providing system 130, the service providing system 140, the storage 120, etc.
  • the data associated with the specific event may include one or more features that characterize the specific event as described elsewhere in the present disclosure.
  • the processing device 112 may obtain a machine learning model for anomaly detection with a target threshold.
  • the acquisition module 410 may obtain the machine learning model with the target threshold from the storage 120, the data providing system 130, the service providing system 140, and any other storage device as described elsewhere in the present disclosure.
  • the machine learning model for anomaly detection may include an unsupervised machine learning model, a semi-supervised machine learning model, etc., as described elsewhere in the present disclosure (e.g., FIG. 5, and the descriptions thereof) .
  • the target threshold may be used by the machine learning model for anomaly detection to determine whether an event is anomalous.
  • the target threshold may be determined according to process 500 as described in FIG. 5.
  • the target threshold may be determined using a plurality of samples associated with different events each of whose anomaly status (whether an event is anomalous or not) is unknown.
  • the plurality of samples may include the data associated with the event obtained in operation 702. Estimated probabilities corresponding to the plurality of samples may be determined.
  • a plurality of candidate thresholds associated with a machine learning model for anomaly detection may be determined. Then an evaluation result may be determined by evaluating the machine learning model, with respect to each of the plurality of candidate thresholds, for detecting anomaly of the samples. A target threshold associated with the machine learning model may be determined from the plurality of candidate thresholds based on the evaluation result; the end-to-end sketch below ties these steps together.
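Putting the pieces together, the end-to-end sketch below follows the flow summarized above: estimated probabilities from an unsupervised model, candidate thresholds taken from those probabilities, a per-candidate AUC computed against KDE reference probabilities, and the candidate with the maximum AUC designated as the target threshold. The model choice (isolation forest), the score scaling, and the use of raw densities as reference probabilities are assumptions of this sketch.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.ensemble import IsolationForest

def choose_target_threshold(X, random_state=0):
    """Candidates are the estimated probabilities themselves; each candidate
    is scored by the pairwise AUC of the reference probabilities against the
    positive/negative labels that candidate induces."""
    X = np.asarray(X, dtype=float)
    raw = IsolationForest(random_state=random_state).fit(X).score_samples(X)
    est = (raw - raw.min()) / (raw.max() - raw.min() + 1e-12)  # estimated probabilities
    ref = gaussian_kde(X.T)(X.T)                               # reference probabilities
    best_thr, best_auc = None, -1.0
    for thr in np.unique(est):                                 # candidate thresholds
        pos, neg = ref[est >= thr], ref[est < thr]             # positive = normal here
        if len(pos) == 0 or len(neg) == 0:
            continue
        wins = ((pos[:, None] > neg[None, :]).sum()
                + 0.5 * (pos[:, None] == neg[None, :]).sum())
        auc = wins / (len(pos) * len(neg))
        if auc > best_auc:
            best_thr, best_auc = float(thr), float(auc)
    return best_thr, best_auc

rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), [[6.0, 6.0]]])  # one planted outlier
print(choose_target_threshold(X))
```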
  • the processing device 112 may determine whether the specific event is anomalous based on the data associated with the specific event and the machine learning model with the target threshold.
  • the processing device 112 may determine whether the specific event is anomalous by inputting the data associated with the specific event into the machine learning model.
  • the machine learning model for anomaly detection may be configured to determine and output an estimated probability corresponding to the specific event based on the inputted data associated with the specific event. Further, the determination module 420 may compare the estimated probability with the target threshold. If the estimated probability corresponding to the specific event is smaller than the target threshold, the determination module 420 may determine that the specific event is anomalous.
  • otherwise (e.g., if the estimated probability corresponding to the specific event exceeds the target threshold), the determination module 420 may determine that the specific event is normal.
  • the machine learning model for anomaly detection may be configured to determine an estimated probability corresponding to the specific event and determine whether the specific event is anomalous based on the inputted data associated with the specific event and the target threshold.
  • the machine learning model for anomaly detection may be configured to output an estimated result of the specific event. For example, if the specific event is anomalous, the machine learning model for anomaly detection may output “0. ” If the specific event is normal, the machine learning model for anomaly detection may output “1. ”
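A minimal sketch of the decision rule described above, using the below-threshold-is-anomalous convention and the "0"/"1" outputs; the threshold in the usage lines reuses the FIG. 8C value purely for illustration.

```python
def detect(estimated_prob, target_threshold):
    # Anomalous if the estimated probability is smaller than the target
    # threshold; output "0" for anomalous and "1" for normal.
    return "0" if estimated_prob < target_threshold else "1"

print(detect(0.40, 0.5838))  # -> "0" (anomalous)
print(detect(0.90, 0.5838))  # -> "1" (normal)
```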
  • FIGs. 8A-8D illustrate schematic diagrams of exemplary anomaly detection results according to some embodiments of the present disclosure.
  • the horizontal axis represents the time axis.
  • the vertical axis represents sample signals obtained over time.
  • Curve “a” and curve “b” represent service indicators associated with the sample signals.
  • Curve “c” represents alarm signals that indicate an anomaly appears. The greater a peak of curve “a” or curve “b” , the greater a possibility that an anomaly exists may be.
  • the anomaly detection was performed using a machine learning model for anomaly detection with respect to a target threshold of about 0.1387.
  • the AUC of the machine learning model for anomaly detection with respect to a threshold of about 0.1387 was about 0.9376.
  • FIG. 8A shows that a plurality of anomalies appeared during the anomaly detection.
  • the anomaly detection was performed using a machine learning model for anomaly detection with respect to a target threshold of about 0.1728.
  • the AUC of the machine learning model for anomaly detection with respect to the threshold of about 0.1728 was about 0.9671.
  • FIG. 8B shows that a plurality of anomalies appeared during anomaly detection.
  • the anomaly detection was performed using a machine learning model for anomaly detection with respect to a threshold of about 0.5838.
  • the AUC of the machine learning model for anomaly detection with respect to the threshold of about 0.5838 was about 0.9998.
  • FIG. 8C shows that an anomaly appeared at the time corresponding to a maximum peak “P. ”
  • the anomaly detection was performed using a machine learning model for anomaly detection with respect to a target threshold of about 0.8272.
  • the AUC of the machine learning model for anomaly detection with respect to the threshold of about 0.8272 was about 0.9980.
  • FIG. 8D shows that at least two anomalies appeared at the time corresponding to a maximum peak “P. ”
  • one single anomaly may appear at the time corresponding to a maximum peak, e.g., peak “P. ”
  • the greater the AUC of a machine learning model for anomaly detection is, the greater the accuracy of the machine learning model for anomaly detection may be.
  • aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or context including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented as entirely hardware, entirely software (including firmware, resident software, micro-code, etc. ) or a combination of software and hardware implementations that may all generally be referred to herein as a “block, ” “module, ” “engine, ” “unit, ” “component, ” or “system. ” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
  • a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof.
  • a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
  • Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python or the like, conventional procedural programming languages, such as the “C” programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN) , or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a software as a service (SaaS) .

Abstract

Systems and methods for anomaly detection. The system may obtain a plurality of samples. Each of the plurality of samples may be associated with an event. The system may also determine, for each of the plurality of samples, based on a machine learning model for the anomaly detection, an estimated probability that the event corresponding to the sample is anomalous. The system may also determine a plurality of candidate thresholds associated with the machine learning model based on estimated probabilities corresponding to at least a portion of the plurality of samples. The system may also determine an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds. The system may also determine, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.

Description

SYSTEMS AND METHODS FOR ANOMALY DETECTION
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 201910501710.5 filed on June 11, 2019, the contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present disclosure generally relates to anomaly detection field, and specifically, to systems and methods for determining thresholds of a machine learning model for anomaly detection.
BACKGROUND
Machine learning has greatly promoted the development of anomaly detection technology, which in turn expands the use of anomaly detection technology. For example, an anomaly detection technology is applied to intrusion detection, fault detection, network abnormal traffic detection, etc. Currently, unsupervised machine learning techniques are widely used for anomaly detection. Using an unsupervised machine learning model for anomaly detection, a threshold may be predetermined for determining whether an event is anomalous. At present, a threshold for an unsupervised machine learning model for anomaly detection is set by a user empirically, which may lack sufficient accuracy and/or effectiveness of the threshold, which in turn may decrease accuracy of estimated results and/or effectiveness of the unsupervised machine learning model for anomaly detection. Therefore, it is desirable to develop systems and methods for determining a threshold of an unsupervised machine learning model for anomaly detection with improved accuracy and/or effectiveness.
SUMMARY
According to an aspect of the present disclosure, a system for anomaly detection is provided. The system may include at least one storage medium storing a set of instructions and at least one processor configured to communicate with the at least one  storage medium. When executing the set of instructions, the at least one processor may be directed to cause the system to obtain a plurality of samples. Each of the plurality of samples may be associated with an event. The at least one processor may be further directed to cause the system to determine, for each of the plurality of samples, based on a machine learning model for the anomaly detection, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous. The at least one processor may be further directed to cause the system to determine, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model. The at least one processor may be further directed to cause the system to determine an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds. The at least one processor may be further directed to cause the system to determine, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
In some embodiments, the machine learning model may include at least one of a one-class support vector machine (SVM) model or an isolation forest algorithm.
In some embodiments, to determine, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model, the at least one processor may be directed to cause the system to designate the estimated probability corresponding to each of the at least a portion of the plurality of samples as one of the plurality of candidate thresholds.
In some embodiments, to evaluate the machine learning model with respect to each of the plurality of candidate thresholds, the at least one processor may be directed to cause the system to determine, for each of the plurality of samples, a reference probability  corresponding to the each of the plurality of samples based on a probability estimation model. The at least one processor may be directed to cause the system to evaluate, based on the estimated probability and the reference probability, the machine learning model with respect to each of the plurality of candidate thresholds.
In some embodiments, to evaluate, based on the estimated probability and the reference probability, the machine learning model, the at least one processor may be directed to cause the system to determine, based on the reference probability and the estimated probability, an evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds. The at least one processor may be further directed to cause the system to evaluate, based on the evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds, the machine learning model.
In some embodiments, to determine, based on the reference probability and the estimated probability, an evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds, the at least one processor may be directed to cause the system to determine, based on each of the plurality of candidate thresholds and the estimated probability, an estimated label of each of the plurality of samples, the estimated label including a negative sample or a positive sample. The at least one processor may be further directed to cause the system to determine, based on the reference probability and the estimated label, the evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds.
In some embodiments, the at least one processor may be further directed to cause the system to rank the reference probability of each of the plurality of samples. To determine the evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds, the at least one processor may be further directed to cause the system to determine, based on the ranked reference probability, and the estimated label corresponding to each of the plurality of samples, the evaluation index.
In some embodiments, the evaluation index of the machine learning model may include at least one of an area under curve (AUC) or a Gini coefficient.
In some embodiments, to determine, based on the evaluation result, a target threshold associated with the machine learning model from the plurality of candidate thresholds, the at least one processor may be directed to cause the system to identify, from the plurality of candidate thresholds, a candidate threshold that corresponds to a maximum of the evaluation index. The at least one processor may be further directed to cause the system to designate the identified candidate threshold as the target threshold associated with the machine learning model.
In some embodiments, the at least one processor may be further directed to cause the system to obtain data associated with a specific event. The at least one processor may be further directed to cause the system to determine, based on the data associated with the specific event and the machine learning model with respect to the target threshold, whether the specific event is anomalous.
According to another aspect of the present disclosure, a method for anomaly detection is provided. The method may include obtaining a plurality of samples. Each of the plurality of samples may be associated with an event. The method may further include determining, for each of the plurality of samples, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous based on a machine learning model for the anomaly detection. The method may further include determining, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model. The method may further include determining an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of  the plurality of candidate thresholds. The method may further include determining, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
According to still a further aspect of the present disclosure, a non-transitory computer readable medium is provided. The non-transitory computer readable medium storing instructions, the instructions, when executed by a computer, may cause the computer to implement a method. The method may include one or more of the following operations. The method may include obtaining a plurality of samples. Each of the plurality of samples may be associated with an event. The method may further include determining, for each of the plurality of samples, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous based on a machine learning model for the anomaly detection. The method may further include determining, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model. The method may further include determining an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds. The method may further include determining, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
According to another aspect of the present disclosure, a system for anomaly detection is provided. The system may include an acquisition module, a determination module and an evaluation module. The acquisition module may be configured to obtain a plurality of samples. Each of the plurality of samples may be associated with an event. The determination module may be configured to for each of the plurality of samples, determine, based on a machine learning model for the anomaly detection, an estimated probability that the event corresponding to the each of the plurality of samples is  anomalous. The determination module may be also configured to determine, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model. The evaluation module may be configured to determine an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds. The determination module may be further configured to determine, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
Additional features will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The features of the present disclosure may be realized and attained by practice or use of various aspects of the methodologies, instrumentalities and combinations set forth in the detailed examples discussed below.
BRIEF DESCRIPTION OF THE DRAWINGS
The present disclosure is further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. These embodiments are non-limiting exemplary embodiments, in which like reference numerals represent similar structures throughout the several views of the drawings, and wherein:
FIG. 1 is a schematic diagram illustrating an exemplary anomaly detection system according to some embodiments of the present disclosure;
FIG. 2 is a schematic diagram illustrating exemplary hardware and software components of a computing device according to some embodiments of the present disclosure;
FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of a mobile device on which a terminal may be implemented according to some embodiments of the present disclosure;
FIG. 4 is a block diagram illustrating an exemplary processing device according to some embodiments of the present disclosure;
FIG. 5 is a flowchart illustrating an exemplary process for determining a threshold of a machine learning model for anomaly detection according to some embodiments of the present disclosure;
FIG. 6 is a flowchart illustrating an exemplary process for evaluating a machine learning model according to some embodiments of the present disclosure;
FIG. 7 is a flowchart illustrating an exemplary process for anomaly detection according to some embodiments of the present disclosure; and
FIGs. 8A-8D are schematic diagrams of exemplary anomaly detection results according to some embodiments of the present disclosure.
DETAILED DESCRIPTION
In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant disclosure. However, it should be apparent to those skilled in the art that the present disclosure may be practiced without such details. In other instances, well-known methods, procedures, systems, components, and/or circuitry have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present disclosure. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments shown, but to be accorded the widest scope consistent with the claims.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms “a, ” “an, ” and “the” may be intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprise, ” “comprises, ” and/or “comprising, ” “include, ” “includes, ” and/or “including, ” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It will be understood that the term “system, ” “engine, ” “unit, ” “module, ” and/or “block” used herein are one method to distinguish different components, elements, parts, section or assembly of different level in ascending order. However, the terms may be displaced by another expression if they achieve the same purpose.
Generally, the word “module, ” “unit, ” or “block, ” as used herein, refers to logic embodied in hardware or firmware, or to a collection of software instructions. A module, a unit, or a block described herein may be implemented as software and/or hardware and may be stored in any type of non-transitory computer-readable medium or another storage device. In some embodiments, a software module/unit/block may be compiled and linked into an executable program. It will be appreciated that software modules can be callable from other modules/units/blocks or from themselves, and/or may be invoked in response to detected events or interrupts. Software modules/units/blocks configured for execution on computing devices may be provided on a computer-readable medium, such as a compact disc, a digital video disc, a flash drive, a magnetic disc, or any other tangible medium, or as a digital download (and can be originally stored in a compressed or installable format that needs installation, decompression, or decryption prior to execution) . Such software code may be stored, partially or fully, on a storage device of the executing computing device, for execution by the computing device. Software instructions may be embedded  in firmware, such as an erasable programmable read-only memory (EPROM) . It will be further appreciated that hardware modules/units/blocks may be included in connected logic components, such as gates and flip-flops, and/or can be included of programmable units, such as programmable gate arrays or processors. The modules/units/blocks or computing device functionality described herein may be implemented as software modules/units/blocks, but may be represented in hardware or firmware. In general, the modules/units/blocks described herein refer to logical modules/units/blocks that may be combined with other modules/units/blocks or divided into sub-modules/sub-units/sub-blocks despite their physical organization or storage. The description may be applicable to a system, an engine, or a portion thereof.
It will be understood that when a unit, engine, module or block is referred to as being “on, ” “connected to, ” or “coupled to, ” another unit, engine, module, or block, it may be directly on, connected or coupled to, or communicate with the other unit, engine, module, or block, or an intervening unit, engine, module, or block may be present, unless the context clearly indicates otherwise. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
These and other features, and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, may become more apparent upon consideration of the following description with reference to the accompanying drawings, all of which form a part of this disclosure. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended to limit the scope of the present disclosure. It is understood that the drawings are not to scale.
The flowcharts used in the present disclosure illustrate operations that systems implement according to some embodiments in the present disclosure. It is to be  expressly understood the operations of the flowchart may be implemented not in order. Conversely, the operations may be implemented in inverted order, or simultaneously. Moreover, one or more other operations may be added to the flowcharts. One or more operations may be removed from the flowcharts.
Embodiments of the present disclosure may be applied to different transportation systems including but not limited to land transportation, sea transportation, air transportation, space transportation, or the like, or any combination thereof. A vehicle of the transportation systems may include a rickshaw, travel tool, taxi, chauffeured car, hitch, bus, rail transportation (e.g., a train, a bullet train, high-speed rail, and subway) , ship, airplane, spaceship, hot-air balloon, driverless vehicle, or the like, or any combination thereof. The transportation system may also include any transportation system that applies management and/or distribution, for example, a system for sending and/or receiving an express.
The application scenarios of different embodiments of the present disclosure may include but not limited to one or more webpages, browser plugins and/or extensions, client terminals, custom systems, intracompany analysis systems, artificial intelligence robots, or the like, or any combination thereof. It should be understood that application scenarios of the system and method disclosed herein are only some examples or embodiments. Those having ordinary skills in the art, without further creative efforts, may apply these drawings to other application scenarios, for example, another similar server.
The term “passenger, ” “requester, ” “requestor, ” “service requester, ” “service requestor” and “customer” in the present disclosure are used interchangeably to refer to an individual, an entity or a tool that may request or order a service. Also, the term “driver, ” “provider, ” “service provider, ” and “supplier” in the present disclosure are used interchangeably to refer to an individual, an entity or a tool that may provide a service or facilitate the providing of the service. The term “user” in the present disclosure may refer  to an individual, an entity or a tool that may request a service, order a service, provide a service, or facilitate the providing of the service. For example, the user may be a requester, a passenger, a driver, an operator, or the like, or any combination thereof. In the present disclosure, “requester” and “requester terminal” may be used interchangeably, and “provider” and “provider terminal” may be used interchangeably.
The term “request, ” “service, ” “service request, ” and “order” in the present disclosure are used interchangeably to refer to a request that may be initiated by a passenger, a requester, a service requester, a customer, a driver, a provider, a service provider, a supplier, or the like, or any combination thereof. The service request may be accepted by any one of a passenger, a requester, a service requester, a customer, a driver, a provider, a service provider, or a supplier. The service request may be chargeable or free.
Some embodiments of the present disclosure provide systems and methods for determining or predicting whether an event is anomalous using a model. The model may be a machine learning model. The model may be used with a target threshold that may serve as a classifier. The model may be used to predict or determine whether an event is anomalous. For instance, the model may predict or determine a probability that an event is anomalous, and then by comparing the probability with the target threshold determine/designate the event as anomalous or not.
Some embodiments of the present disclosure provide systems and methods for determining a model and a target threshold with respect to the model for anomaly detection that is used to determine or predict whether an event is anomalous. The target threshold may be determined using a plurality of samples associated with different events each of whose anomaly status (whether an event is anomalous or not) is unknown or known. Estimated probabilities corresponding to the plurality of samples may be determined. Further, a plurality of candidate thresholds associated with a machine  learning model for anomaly detection may be determined. Then an evaluation result may be determined by evaluating the machine learning model, with respect to each of the plurality of candidate thresholds, for detecting anomaly of the samples. A target threshold associated with the machine learning model from the plurality of candidate thresholds may be determined based on the evaluation result. Accordingly, the plurality of candidate thresholds may be applied to a machine learning model for anomaly detection to assess the accuracy and/or effectiveness of the candidate thresholds for the machine learning model in anomaly detection. The target threshold may then be determined from the plurality of candidate thresholds by evaluating the machine learning model with respect to each of the plurality of candidate thresholds, which may further improve the accuracy and/or effectiveness of the machine learning model. The systems and methods for anomaly detection according to some embodiments of the present disclosure may reduce or avoid the need to rely on the experience of an individual to select a threshold for a machine learning model for anomaly detection.
FIG. 1 is a schematic diagram illustrating an exemplary anomaly detection system 100 according to some embodiments of the present disclosure. The anomaly detection system 100 may be a platform for data and/or information processing, for example, training a machine learning model for anomaly detection and/or data classification, such as image classification, text classification, etc. The anomaly detection system 100 may be applied in intrusion detection, fault detection, network abnormal traffic detection, fraud detection, behavior abnormal detection, or the like, or a combination thereof. An anomaly may be also referred to as an outlier, a novelty, a noise, a deviation, an exception, etc. As used herein, an anomaly refers to an action or an event that is determined to be unusual or abnormal in view of known or inferred conditions. For example, for a network subscribe platform (e.g., a video broadcast platform, a social network platform, etc. ) , the anomaly may include a network quality anomaly, a user access anomaly, a server anomaly, etc.  As another example, for an online transport service platform (e.g., an online taxi-hailing service platform) , the anomaly may include an order anomaly, a driver behavior anomaly, a passenger behavior anomaly, a route anomaly, etc.
The anomaly detection system 100 may include a data exchange port 101, a data transmitting port 102, a server 110, and storage 120. In some embodiments, the anomaly detection system 100 may interact with a data providing system 130 and a service providing system 140 via the data exchange port 101 and the data transmitting port 102, respectively. For example, anomaly detection system 100 may access information and/or data stored in the data providing system 130 via the data exchange port 101. As another example, the server 110 may send information and/or data to a service providing system 140 via the data transmitting port 102.
The server 110 may process information and/or data relating to anomaly detection. In some embodiments, the server 110 may be a single server, or a server group. The server group may be centralized, or distributed (e.g., the server 110 may be a distributed system) . In some embodiments, the server 110 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof. In some embodiments, the server 110 may be implemented on a computing device having one or more components illustrated in FIG. 2 in the present disclosure.
In some embodiments, the server 110 may include a processing device 112. The processing device 112 may process information and/or data relating to anomaly detection to perform one or more functions described in the present disclosure. For example, the processing device 112 may receive a machine learning model for anomaly detection from the data providing system 130 and a sample set from the service providing system 140. The processing device 112 may determine a target threshold for the machine learning  model for anomaly detection based on the sample set. As another example, the processing device 112 may estimate whether a specific sample received from the service providing system 140 is anomalous based on the target threshold using the machine learning model for anomaly detection. The target threshold may be updated from time to time, e.g., periodically or not, based on a sample set that is at least partially different from the original sample set from which the original target threshold is determined. For instance, the target threshold may be updated based on a sample set including new samples that are not in the original sample set, samples whose anomaly is assessed using the machine learning model in connection with the original target threshold or a target threshold of a prior version, or the like, or a combination thereof. As still another example, the processing device 112 may transmit a signal including the estimated result to the service providing system 140. In some embodiments, the determination and/or updating of the target threshold may be performed on a processing device, while the application of a machine learning model in connection with the target threshold may be performed on a different processing device. In some embodiments, the determination and/or updating of the target threshold and/or the corresponding machine learning model may be performed on a processing device of a system different than the anomaly detection system 100 or a server different than the server 110 on which the application of a machine learning model in connection with the target threshold is performed. For instance, the determination and/or updating of the target threshold and/or the machine learning model may be performed on a first system of a vendor who provides and/or maintains such a machine learning model, including the target threshold, and/or has access to training samples used to determine and/or update the target threshold and/or the machine learning model, while anomaly detection of an event based on the provided machine learning model, including the target threshold, may be performed on a second system of a client of the vendor. In some embodiments, the determination and/or  updating of the target threshold and/or the machine learning model may be performed online in response to a request for anomaly detection of an event. In some embodiments, the determination and/or updating of the target threshold and/or the machine learning model may be performed offline. In some embodiments, the processing device 112 may include one or more processors (e.g., single-core processor (s) or multi-core processor (s) ) . 
Merely by way of example, the processing device 112 may include a central processing unit (CPU) , an application-specific integrated circuit (ASIC) , an application-specific instruction-set processor (ASIP) , a graphics processing unit (GPU) , a physics processing unit (PPU) , a digital signal processor (DSP) , a field programmable gate array (FPGA) , a programmable logic device (PLD) , a controller, a microcontroller unit, a reduced instruction-set computer (RISC) , a microprocessor, or the like, or any combination thereof.
The storage 120 may store data and/or instructions related to content identification and/or data classification. In some embodiments, the storage 120 may store data obtained/acquired from the data providing system 130 and/or the service providing system 140. In some embodiments, the storage 120 may store data and/or instructions that the server 110 may execute or use to perform exemplary methods described in the present disclosure. In some embodiments, the storage 120 may include a mass storage device, a removable storage device, a volatile read-and-write memory, a read-only memory (ROM), or the like, or any combination thereof. Exemplary mass storage devices may include a magnetic disk, an optical disk, a solid-state drive, etc. Exemplary removable storage devices may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc. Exemplary volatile read-and-write memory may include a random access memory (RAM). Exemplary RAM may include a dynamic RAM (DRAM), a double data rate synchronous dynamic RAM (DDR SDRAM), a static RAM (SRAM), a thyristor RAM (T-RAM), and a zero-capacitor RAM (Z-RAM), etc. Exemplary ROM may include a mask ROM (MROM), a programmable ROM (PROM), an erasable programmable ROM (PEROM), an electrically erasable programmable ROM (EEPROM), a compact disk ROM (CD-ROM), and a digital versatile disk ROM, etc. In some embodiments, the storage 120 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
In some embodiments, the storage 120 may be connected to or communicate with the server 110. The server 110 may access data or instructions stored in the storage 120 directly or via a network. In some embodiments, the storage 120 may be a part of the server 110.
The data providing system 130 may provide data and/or information related to anomaly detection and/or data classification. The data and/or information may include images, text files, voice segments, web pages, video recordings, user requests, programs, applications, algorithms, instructions, computer codes, or the like, or a combination thereof. In some embodiments, the data providing system 130 may provide the data and/or information to the server 110 and/or the storage 120 of the anomaly detection system 100 for processing (e.g., train a machine learning model for anomaly detection) . In some embodiments, the data providing system 130 may provide the data and/or information to the service providing system 140 for generating a service response relating to the anomaly detection and/or data classification.
In some embodiments, the service providing system 140 may be configured to provide online services, such as an anomaly detection service, an online to offline service (e.g., a taxi service, a carpooling service, a food delivery service, a party organization service, an express service, etc. ) , an unmanned driving service, a medical service, a map-based service (e.g., a route planning service) , a live chatting service, a query service, a Q&A service, etc. The service providing system 140 may generate service responses, for  example, by inputting the data and/or information received from a user and/or the data providing system 130 into a machine learning model for anomaly detection.
In some embodiments, the data providing system 130 and/or the service providing system 140 may be a device, a platform, or other entity interacting with the anomaly detection system. In some embodiments, the data providing system 130 may be implemented in a device with data acquisition and/or data storage, such as a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, a server 130-4, a storage device (not shown) , or the like, or any combination thereof. In some embodiments, the service providing system 140 may also be implemented in a device with data processing, such as a mobile device 140-1, a tablet computer 140-2, a laptop computer 140-3, and a server 140-4, or the like, or any combination thereof. In some embodiments, the mobile devices 130-1 and 140-1 may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home device may include a smart lighting device, a control device of an intelligent electrical apparatus, a smart monitoring device, a smart television, a smart video camera, an interphone, or the like, or any combination thereof. In some embodiments, the wearable device may include a smart bracelet, a smart footgear, a smart glass, a smart helmet, a smart watch, a smart clothing, a smart backpack, a smart accessory, or the like, or any combination thereof. In some embodiments, the smart mobile device may include a smartphone, a personal digital assistant (PDA) , a gaming device, a navigation device, a point of sale (POS) device, or the like, or any combination thereof. In some embodiments, the virtual reality device and/or the augmented reality device may include a virtual reality helmet, a virtual reality glass, a virtual reality patch, an augmented reality helmet, an augmented reality glass, an augmented reality patch, or the like, or any combination thereof. For example, the virtual reality device and/or the augmented reality device may include a Google Glass, an Oculus  Rift, a HoloLens, a Gear VR, etc. In some embodiments, the servers 130-4 and 140-4 may include a database server, a file server, a mail server, a web server, an application server, a computing server, a media server, a communication server, etc.
In some embodiments, the data providing system 130 may be a device with data processing technology for preprocessing acquired or stored information (e.g., identifying images from stored information) . In some embodiments, the service providing system 140 may be a device for data processing, for example, train an identification model using a cleaned dataset received from the server 110. In some embodiments, the service providing system 140 may directly communicate with the data providing system 130 via a network 150-3. For example, the service providing system 140 may receive a dataset from the data providing system 130, and perform an anomaly detection on the dataset based on a machine learning model for anomaly detection.
In some embodiments, any two systems of the anomaly detection system 100, the data providing system 130, and the service providing system 140 may be integrated into a device or a platform. For example, both the data providing system 130 and the service providing system 140 may be implemented in a mobile device of a user. In some embodiments, the anomaly detection system 100, the data providing system 130, and the service providing system 140 may be integrated into a device or a platform. For example, the anomaly detection system 100, the data providing system 130, and the service providing system 140 may be implemented in a computing device including a server and a user interface.
Networks 150-1 through 150-3 may facilitate exchange of information and/or data. In some embodiments, one or more components in the anomaly detection system (e.g., the server 110 and/or the storage 120) may send and/or receive information and/or data to/from the data providing system 130 and/or the service providing system 140 via the networks 150-1 through 150-3. For example, the server 110 may obtain/acquire datasets for anomaly detection from the data providing system 130 via the network 150-1. As another example, the server 110 may transmit/output estimated results for anomaly detection to the service providing system 140 via the network 150-2. In some embodiments, the networks 150-1 through 150-3 may be any type of wired or wireless networks, or combination thereof. Merely by way of example, the networks 150 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, an Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public telephone switched network (PSTN), a Bluetooth™ network, a ZigBee™ network, a near field communication (NFC) network, a global system for mobile communications (GSM) network, a code-division multiple access (CDMA) network, a time-division multiple access (TDMA) network, a general packet radio service (GPRS) network, an enhanced data rate for GSM evolution (EDGE) network, a wideband code division multiple access (WCDMA) network, a high speed downlink packet access (HSDPA) network, a long term evolution (LTE) network, a user datagram protocol (UDP) network, a transmission control protocol/Internet protocol (TCP/IP) network, a short message service (SMS) network, a wireless application protocol (WAP) network, an ultra wide band (UWB) network, an infrared ray, or the like, or any combination thereof.
FIG. 2 illustrates a schematic diagram of an exemplary computing device 200 according to some embodiments of the present disclosure. The computing device 200 may be a computer, such as the server 110 in FIG. 1 and/or a computer with specific functions, configured to implement any particular system according to some embodiments of the present disclosure. The computing device 200 may be configured to implement any component that performs one or more functions disclosed in the present disclosure. For example, the server 110 (e.g., the processing device 112) may be implemented in hardware devices, software programs, firmware, or any combination thereof, on a computer like the computing device 200. For brevity, FIG. 2 depicts only one computing device. In some embodiments, the functions of the computing device may be implemented by a group of similar platforms in a distributed mode to disperse the processing load of the system.
The computing device 200 may include a communication terminal 250 that may connect with a network to implement data communication. The computing device 200 may also include a processor 220 that is configured to execute instructions and includes one or more processors. The schematic computer platform may include an internal communication bus 210, different types of program storage units and data storage units (e.g., a hard disk 270, a read-only memory (ROM) 230, a random-access memory (RAM) 240), various data files applicable to computer processing and/or communication, and program instructions possibly executed by the processor 220. The computing device 200 may also include an I/O device 260 that may support the input and output of data flows between the computing device 200 and other components. Moreover, the computing device 200 may receive programs and data via the communication network.
FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of an exemplary mobile device on which a service system (e.g., the anomaly detection system 100, the data providing system 130 and/or the service providing system 140) may be implemented according to some embodiments of the present disclosure. As illustrated in FIG. 3, the mobile device 300 may include a communication platform 310, a display 320, a graphics processing unit (GPU) 330, a central processing unit (CPU) 340, an I/O 350, a memory 360, a mobile operating system (OS) 370, one or more applications 380, and a storage 390. In some embodiments, any other suitable component, including but not limited to a system bus or a controller (not shown), may also be included in the mobile device 300.
In some embodiments, the mobile operating system 370 (e.g., iOS™, Android™, Windows Phone™, etc.) and one or more applications 380 may be loaded into the memory 360 from the storage 390 in order to be executed by the CPU 340. The applications 380 may include a browser or any other suitable mobile app for receiving and rendering information relating to image processing or other information from the anomaly detection system 100. User interactions with the information stream may be achieved via the I/O 350 and provided to the storage device 120, the server 110 and/or other components of the anomaly detection system 100. In some embodiments, the mobile device 300 may be an exemplary embodiment corresponding to a terminal associated with the anomaly detection system 100, the data providing system 130 and/or the service providing system 140.
To implement various modules, units, and their functionalities described in the present disclosure, computer hardware platforms may be used as the hardware platform (s) for one or more of the elements described herein. A computer with user interface elements may be used to implement a personal computer (PC) or any other type of work station or terminal device. A computer may also act as a system if appropriately programmed.
FIG. 4 is a block diagram illustrating an exemplary processing device 112 according to some embodiments of the present disclosure. The processing device 112 may include an acquisition module 410, a determination module 420, an evaluation module 430, and a storage module 440.
The acquisition module 410 may be configured to obtain data for anomaly detection. For example, the acquisition module 410 may obtain a plurality of samples. Each of the plurality of samples may be associated with an event. As used herein, an event may be defined by information and/or data that indicates something has happened at a specific time or time period. As another example, the acquisition module 410 may also obtain data associated with a specific event. In some embodiments, the specific event may be an event associated with one of the plurality of samples. The data associated with the specific event may be one of the plurality of samples. The data associated with the specific event may include one or more features that characterize the specific event as described elsewhere in the present disclosure. As still another example, the acquisition module 410 may be configured to obtain a model, such as a machine learning model for anomaly detection, a probability estimation model, etc.
The determination module 420 may be configured to determine an estimated probability for each of the plurality of samples based on a machine learning model for anomaly detection. Each of the plurality of samples may correspond to an estimated probability. The determination module 420 may further determine a plurality of candidate thresholds associated with the machine learning model based on the estimated probabilities of the plurality of samples. The determination module 420 may also determine a target threshold associated with the machine learning model from the plurality of candidate thresholds based on an evaluation result corresponding to each of the plurality of candidate thresholds. The determination module 420 may determine whether the specific event is anomalous based on the data associated with the specific event and the machine learning model with the target threshold.
The evaluation module 430 may be configured to determine the evaluation result by evaluating the machine learning model for anomaly detection with respect to each of the plurality of candidate thresholds. The evaluation module 430 may evaluate the machine learning model with respect to each of the plurality of candidate thresholds according to one or more evaluation indexes. The evaluation result may be denoted by one or more values of the one or more evaluation indexes.
The storage module 440 may be configured to store information. The information may include programs, software, algorithms, data, text, numbers, images, and other information. For example, the information may include data defining an event that indicates something has happened at a specific time or time period. As another example, the information may include a machine learning model for anomaly detection.
It should be noted that the above description of the processing device 112 is provided for the purposes of illustration, and is not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, multiple variations and modifications may be made under the teachings of the present disclosure. However, those variations and modifications do not depart from the scope of the present disclosure. In some embodiments, any module mentioned above may be implemented in two or more separate units. For example, the functions of the determination module 420 may be implemented in two separate units, one of which is configured to determine an estimated probability corresponding to each of the plurality of samples, and the other is configured to determine candidate thresholds associated with the machine learning model. In some embodiments, the processing device 112 may further include one or more additional modules. Additionally or alternatively, one or more modules mentioned above may be omitted.
FIG. 5 is a flowchart illustrating an exemplary process 500 for determining a threshold of a machine learning model for anomaly detection according to some embodiments of the present disclosure. At least a portion of process 500 may be implemented on the computing device 200 as illustrated in FIG. 2 or the mobile device 300 as illustrated in FIG. 3. In some embodiments, one or more operations of process 500 may be implemented in the anomaly detection system 100 as illustrated in FIG. 1. In some embodiments, one or more operations in the process 500 may be stored in a storage device (e.g., the storage 120, the ROM 230, the RAM 240, the storage 390) as a form of instructions, and invoked and/or executed by the server 110 (e.g., the processing device 112 in the server 110, or the processor 220 of the computing device 200) or the  CPU 340 of the mobile device 300. In some embodiments, the instructions may be transmitted in the form of electronic current or electrical signals.
In 502, the processing device 112 (e.g., the acquisition module 410) may obtain a plurality of samples. Each of the plurality of samples may be associated with an event. The plurality of samples may be obtained by the acquisition module 410 from a storage device (e.g., the storage device 120, the ROM 230, the RAM 240, the storage 390) as described elsewhere in the present disclosure. As used herein, an event may be defined by information and/or data that indicates something has happened at a specific time or within a specific time period. For example, for an online taxi-hailing platform, the event may include logging in the online taxi-hailing platform, initiating a service request, dispatching a service request, picking up a passenger, transporting a passenger to a destination along a predetermined route, a communication between a driver and a passenger on a route, a communication between a client terminal and a server associated with the online taxi-hailing platform, or the like, or a combination thereof.
As used herein, a sample associated with an event may be also referred to as sample data. The sample may be in the form of an image, a video, text, etc. A sample associated with a specific event may include and/or represent one or more features that characterize the specific event. In some embodiments, the one or more features associated with the specific event may be presented as a feature vector (e.g., a multi-dimensional vector) . Each dimension of the feature vector may represent a feature of the specific event. For example, in an online taxi-hailing platform, an event may include transporting a passenger to a destination along a predetermined route. The sample data (e.g., one or more features) associated with the event may include a start location, a start time, a destination, an estimated time of arrival, a real-time location, a travel trajectory (e.g., the whole length of the travel trajectory, the whole travel time of the travel trajectory, the length of a road segment in the travel trajectory, the travel time of a road segment in  the travel trajectory, etc. ) , or the like, or any combination thereof. In some embodiments, the plurality of samples may form a sample set. The sample set may be denoted as
S = {s_1, s_2, s_3, …, s_N}, where N denotes the count of samples in the sample set. An element s_i in the sample set may represent a sample. A sample may correspond to a multi-dimensional feature vector denoted as [f_1, f_2, f_3, …] that represents one or more features of the event. For example, in an online taxi-hailing platform, if an event includes transporting a passenger to a destination along a predetermined route, f_1 may represent the start location, f_2 may represent the destination, and f_3 may represent the travel trajectory.
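Merely by way of illustration, the sample set described above may be held as a two-dimensional array in which each row is one sample s_i and each column is one feature f_j. The sketch below assumes NumPy and uses hypothetical trip features (trip length, travel time, detour ratio) that are not part of the original disclosure:

```python
import numpy as np

# Hypothetical sample set S = {s_1, ..., s_N}: each row is a sample s_i,
# i.e., a feature vector [f_1, f_2, f_3] characterizing one trip event.
# Assumed columns: trip length (km), travel time (min), detour ratio
# relative to the predetermined route.
samples = np.array([
    [12.4, 25.0, 1.02],   # s_1
    [ 8.1, 17.5, 1.00],   # s_2
    [12.9, 26.1, 1.05],   # s_3
    [40.3, 95.0, 2.60],   # s_4: a possible anomaly (long detour)
])
print(samples.shape)  # (4, 3): N = 4 samples, 3 features each
```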
In 504, the processing device 112 (e.g., the determination module 420) may determine an estimated probability for each of the plurality of samples based on a machine learning model for anomaly detection. Each of the plurality of samples may correspond to an estimated probability. As used herein, the estimated probability of a specific sample determined based on the machine learning model for anomaly detection may refer to a possibility that the event corresponding to the specific sample is anomalous. The smaller the value of the estimated probability is, the higher the possibility that the event is anomalous may be. In some embodiments, the machine learning model for anomaly detection may be configured to generate and/or output the estimated probability of an event using the sample corresponding to the event. For example, the processing device 112 may input a specific sample into the machine learning model for anomaly detection. The machine learning model for anomaly detection may generate and output the estimated probability of an event associated with the specific sample using the inputted specific sample.
The machine learning model for anomaly detection may be obtained by the acquisition module 410 from the data providing system 130, the storage 120, the service providing system 140, or any other storage device as described elsewhere in the present disclosure. The machine learning model for anomaly detection may include an unsupervised machine learning model, a semi-supervised machine learning model, etc. Exemplary unsupervised machine learning models may include those using a classification-based algorithm, a statistical distribution-based algorithm, a proximity-based algorithm, a density-based algorithm, a cluster-based algorithm, a tree-based algorithm, etc. For example, the classification-based algorithm may include a neural network model, a Bayesian network model, a one-class support vector machine (SVM) model, a robust SVM, a one-class kernel Fisher discriminants model, etc. The statistical distribution-based algorithm may include a Gaussian model, a robust regression model, etc. The proximity-based algorithm may include a K-nearest neighbor (KNN) algorithm, an outlier detection using in-degree number (ODIN) algorithm, etc. The density-based algorithm may include a local outlier factor (LOF) algorithm, a connectivity-based outlier factor (COF) algorithm, etc. The tree-based algorithm may include an isolation forest (iForest) algorithm, an interpretable hierarchical clustering unsupervised decision tree (IHCUDT) algorithm, etc. The cluster-based algorithm may include a shared nearest neighbor (SNN) clustering algorithm, a wave cluster algorithm, a K-means clustering algorithm, a self-organizing maps algorithm, an expectation maximization (EM) algorithm, etc. Exemplary semi-supervised machine learning models may include those using a Markovian model, a finite state automata (FSA) model, a hidden Markov model (HMM), a probabilistic suffix trees (PST) model, etc.
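As a non-authoritative sketch of operation 504, the isolation forest algorithm listed above can score each sample so that a smaller score indicates a higher possibility of anomaly, matching the convention used in this disclosure. scikit-learn is assumed here, and rescaling the raw scores to [0, 1] is an illustrative assumption rather than part of the disclosure:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def estimated_probabilities(samples: np.ndarray) -> np.ndarray:
    """Sketch of operation 504: one estimated probability per sample,
    where a smaller value suggests a more anomalous event."""
    model = IsolationForest(n_estimators=100, random_state=0).fit(samples)
    scores = model.score_samples(samples)  # lower score = more abnormal
    # Rescale to [0, 1] so the scores read as probabilities (assumption).
    return (scores - scores.min()) / (scores.max() - scores.min() + 1e-12)
```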
In 506, the processing device 112 (e.g., the determination module 420) may determine a plurality of candidate thresholds associated with the machine learning model. A candidate threshold associated with the machine learning model for anomaly detection may be configured to determine whether an event is anomalous. For example, if an estimated probability of an event determined using the machine learning model for anomaly detection is smaller than the candidate threshold, the processing device 112 may determine that the event is anomalous when using the candidate threshold.
In some embodiments, the processing device 112 (e.g., the determination module 420) may determine at least a portion of the plurality of candidate thresholds based on the estimated probabilities determined in operation 504. For example, the processing device 112 may determine a portion or all of the plurality of candidate thresholds based on estimated probabilities corresponding to at least a portion of the plurality of samples. Further, the processing device 112 may designate each of the estimated probabilities corresponding to at least a portion of the plurality of samples as one of the plurality of candidate thresholds. As a further example, the processing device 112 may rank the estimated probabilities corresponding to the plurality of samples (e.g., in an ascending order or a descending order). The processing device 112 may determine a portion or all of the plurality of candidate thresholds based on the ranked estimated probabilities of the plurality of samples. An estimated probability ranked as, e.g., the top, the bottom, or the median of the ranked estimated probabilities may be designated as a candidate threshold. As still another example, the processing device 112 may designate one or more estimated probabilities of at least a portion of the plurality of samples within a certain range as one or more candidate thresholds. In some embodiments, the processing device 112 may designate the estimated probability corresponding to each of the plurality of samples as one of the plurality of candidate thresholds.
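A minimal sketch of operation 506 under the strategies just described, assuming NumPy; which ranked probabilities to keep is a hypothetical parameter introduced only for illustration:

```python
import numpy as np

def candidate_thresholds(est_probs: np.ndarray, keep: str = "all") -> np.ndarray:
    """Sketch of operation 506: derive candidate thresholds from the
    estimated probabilities of the samples."""
    ranked = np.sort(est_probs)  # ascending order
    if keep == "all":
        # Designate every estimated probability as a candidate threshold.
        return ranked
    # Alternatively, keep only the bottom, median, and top of the ranking.
    return np.array([ranked[0], np.median(ranked), ranked[-1]])
```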
In some embodiments, the processing device 112 may determine at least a portion of the plurality of candidate thresholds using a probability estimation model. Further, the processing device 112 may determine a reference probability corresponding to each of the plurality of samples using the probability estimation model. The reference probability corresponding to a specific sample of the plurality of samples may be used to measure and/or assess the similarity between the specific sample and one or more other samples of the plurality of samples. The greater the reference probability corresponding to the specific sample is, the higher the similarity between the specific sample and the one or more other samples of the plurality of samples may be. The processing device 112 may determine a portion or all of the plurality of candidate thresholds based on reference probabilities corresponding to at least a portion of the plurality of samples. For example, the processing device 112 may designate each of the reference probabilities corresponding to a portion of the plurality of samples as one of the plurality of candidate thresholds. As another example, the processing device 112 may designate each of the reference probabilities corresponding to all of the plurality of samples as one of the plurality of candidate thresholds. Exemplary probability estimation models may include those using a parametric estimation algorithm, a Bayes algorithm, a non-parametric estimation algorithm, etc. For example, the parametric estimation algorithm may include a maximum likelihood algorithm. The non-parametric estimation algorithm may include a histogram probability estimation algorithm, a kernel density estimation algorithm, etc.
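For the probability estimation model, a hedged sketch using kernel density estimation (one of the non-parametric algorithms named above) is given below; scikit-learn's KernelDensity is assumed, and normalizing the densities so they sum to one is an illustrative choice, not part of the disclosure:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def reference_probabilities(samples: np.ndarray) -> np.ndarray:
    """Sketch of a probability estimation model: one reference probability
    per sample; a larger value means the sample is more similar to the
    other samples of the plurality of samples."""
    kde = KernelDensity(kernel="gaussian", bandwidth=1.0).fit(samples)
    density = np.exp(kde.score_samples(samples))  # density at each sample
    return density / (density.sum() + 1e-12)      # normalized (assumption)
```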
In some embodiments, at least a portion of the plurality of candidate thresholds may be set by a user or according to a default setting of the anomaly detection system 100.
In 508, the processing device 112 (e.g., the evaluation module 430) may determine an evaluation result by evaluating the machine learning model for anomaly detection with respect to each of the plurality of candidate thresholds. In some embodiments, the processing device 112 may evaluate the machine learning model with respect to each of the plurality of candidate thresholds according to one or more evaluation indexes. The evaluation result may be denoted by one or more values of the one or more evaluation indexes. Exemplary evaluation indexes of the machine learning model for anomaly detection may include an area under curve (AUC), a Gini coefficient, or the like, or any combination thereof. An evaluation index may be used to measure and/or indicate the accuracy of estimation results of the machine learning model for anomaly detection. For example, the greater the value of the AUC of the machine learning model for anomaly detection with respect to a candidate threshold is, the greater the accuracy of estimation results of the machine learning model for anomaly detection may be.
The processing device 112 may determine the value of an evaluation index of the machine learning model for anomaly detection with respect to a specific candidate threshold using the plurality of samples. For example, the processing device 112 may determine a reference probability corresponding to each of the plurality of samples using a probability estimation model as described elsewhere in the present disclosure. The processing device 112 may determine the value of the evaluation index using the estimated probability and the reference probability corresponding to each of the plurality of samples. For the specific candidate threshold, the processing device 112 may assign a label to each of the plurality of samples based on the specific candidate threshold and the estimated probability corresponding to each of the plurality of samples. The label may be a positive sample or a negative sample. For example, the processing device 112 may label a sample as a positive sample if the estimated probability corresponding to the sample exceeds the specific candidate threshold. The processing device 112 may label a sample as a negative sample if the estimated probability corresponding to the sample is smaller than the specific candidate threshold. The processing device 112 may determine an evaluation index corresponding to the specific candidate threshold based on the label and the reference probability corresponding to each of the plurality of samples. More descriptions for determining the evaluation result may be found elsewhere in the present disclosure (e.g., FIG. 6 and the descriptions thereof).
In 510, the processing device 112 (e.g., the determination module 420) may determine a target threshold associated with the machine learning model from the plurality of candidate thresholds based on the evaluation result corresponding to each of the plurality of candidate thresholds. In some embodiments, the processing device 112 may compare the evaluation results corresponding to the plurality of candidate thresholds. Each of the plurality of candidate thresholds may correspond to an evaluation result, i.e., a value of an evaluation index. The processing device 112 may determine the target threshold based on the comparison. For example, the processing device 112 may compare the values of an evaluation index (e.g., AUC) of the machine learning model with respect to the plurality of candidate thresholds. The processing device 112 may designate the candidate threshold that corresponds to the maximum (or minimum) of the values of the evaluation index (e.g., AUC) of the machine learning model with respect to the plurality of candidate thresholds as the target threshold. As another example, the processing device 112 may determine at least two candidate thresholds from the plurality of candidate thresholds. The values of the evaluation index (e.g., AUC) of the machine learning model corresponding to the at least two candidate thresholds may be greater than or smaller than the values of the evaluation index corresponding to the other candidate thresholds of the plurality of candidate thresholds. The processing device 112 may designate an average of the at least two candidate thresholds as the target threshold.
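Tying operations 504 through 510 together, the sketch below evaluates every candidate threshold by AUC and keeps the best one. It reuses the helper names introduced in the earlier sketches (all assumptions), and relies on scikit-learn's roc_auc_score, which computes the pairwise AUC described with FIG. 6:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def select_target_threshold(est_probs: np.ndarray,
                            ref_probs: np.ndarray) -> float:
    """Sketch of operations 506-510: evaluate the model with respect to
    each candidate threshold and return the threshold with maximum AUC."""
    best_threshold, best_auc = None, -np.inf
    for threshold in np.unique(est_probs):  # each estimated probability
        # Label samples against this candidate threshold:
        # 1 = positive (normal), 0 = negative (anomalous). Ties at the
        # threshold are labeled positive here; the disclosure leaves
        # equality to either case.
        labels = (est_probs >= threshold).astype(int)
        if labels.min() == labels.max():
            continue  # AUC is undefined when only one class is present
        auc = roc_auc_score(labels, ref_probs)
        if auc > best_auc:
            best_threshold, best_auc = threshold, auc
    return best_threshold
```

Under these assumptions, select_target_threshold(estimated_probabilities(samples), reference_probabilities(samples)) would yield the target threshold for a given sample set.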
It should be noted that the above description regarding the process 500 is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, multiple variations and modifications may be made under the teachings of the present disclosure. However, those variations and modifications do not depart from the scope of the present disclosure. In some embodiments, one or more operations may be omitted and/or one or more additional operations may be added. For example, operation 510 may be combined into operation 502. Operation 512 and operation 514 may be omitted. As another example, one or more operations in process 600 may be added into the process 500 to determine an evaluation result of the machine learning model. In some embodiments, in operation 514, if the estimated probability corresponding to the specific event equals the target threshold, the determination module 420 may determine that the specific event is anomalous or normal. In some embodiments, process 500 may also include performing an anomaly detection on the plurality of samples based on the machine learning model for anomaly detection with the target threshold. For example, the processing device 112 may determine one or more anomalies from the plurality of samples using the machine learning model for anomaly detection with the target threshold. In some embodiments, a target machine learning model may be determined based on samples each of whose anomaly status is known, and one or more operations illustrated in FIG. 5 may be omitted. For instance, operation 504 may be omitted.
FIG. 6 is a flowchart illustrating an exemplary process for evaluating a machine learning model according to some embodiments of the present disclosure. At least a portion of process 600 may be implemented on the computing device 200 as illustrated in FIG. 2 or the mobile device 300 as illustrated in FIG. 3. In some embodiments, one or more operations of process 600 may be implemented in the anomaly detection system 100 as illustrated in FIG. 1. In some embodiments, one or more operations in the process 600 may be stored in a storage device (e.g., the storage 120, the ROM 230, the RAM 240, the storage 390) as a form of instructions, and invoked and/or executed by the server 110 (e.g., the processing device 112 in the server 110, or the processor 220 of the computing device 200) or the CPU 340 of the mobile device 300. In some embodiments, the instructions may be transmitted in the form of electronic current or electrical signals. Operation 508 may be performed according to process 600 as described in FIG. 6.
In 602, the processing device 112 (e.g., the evaluation module 430) may determine a reference probability corresponding to each of a plurality of samples based on a probability estimation model. The plurality of samples may be obtained as described in connection with 502. For example, a sample may be associated with an event. The sample may include one or more features (e.g., a feature vector) that characterize the  event. The reference probability corresponding to a specific sample of the plurality of samples may be used to measure and/or indicate a similarity between the specific sample and other samples of the plurality of samples. The greater the reference probability corresponding to a specific sample is, the larger the similarity between the specific sample and other samples of the plurality of samples may be.
The processing device 112 may determine a reference probability corresponding to a specific sample based on the specific sample and other samples of the plurality of samples using the probability estimation model. Exemplary probability estimation models may include those using a parametric estimation algorithm, a Bayes algorithm, a non-parametric estimation algorithm, etc. For example, the parametric estimation algorithm may include a maximum likelihood algorithm. The non-parametric estimation algorithm may include a histogram probability estimation algorithm, a kernel density estimation algorithm, etc.
In 604, the processing device 112 (e.g., the evaluation module 430) may determine an estimated label for each of the plurality of samples based on a candidate threshold and an estimated probability corresponding to each of the plurality of samples. The estimated probability may be determined as described in connection with operation 504. The candidate threshold may be determined as described in connection with operation 506.
The estimated probability corresponding to a specific sample may be used to measure and/or indicate a possibility that an event associated with the specific sample is anomalous. The processing device 112 may determine an estimated label corresponding to a specific sample by comparing the estimated probability corresponding to the specific sample with the candidate threshold. The estimated label may be a negative sample or a positive sample. The negative sample may indicate that an event associated with the negative sample is anomalous. The positive sample may indicate that an event associated with the positive sample is normal. In some embodiments, if the estimated probability corresponding to the specific sample exceeds the candidate threshold, the processing device 112 may label the specific sample as a positive sample. If the estimated probability corresponding to the specific sample is smaller than the candidate threshold, the processing device 112 may label the specific sample as a negative sample.
In 606, the processing device 112 (e.g., the evaluation module 430) may determine an evaluation index of the machine learning model with respect to the candidate threshold based on the reference probability and the estimated label corresponding to each of the plurality of samples. Exemplary evaluation indexes of the machine learning model for anomaly detection may include an area under curve (AUC), a Gini coefficient, etc. The AUC of the machine learning model for anomaly detection may be defined by the probability that, when the machine learning model is used for anomaly detection, a random positive sample is ranked above a random negative sample. The greater the AUC of a machine learning model for anomaly detection is, the greater the accuracy of the estimation results of the machine learning model for anomaly detection may be.
In some embodiments, the processing device 112 may rank the samples in ascending order of the reference probability. The processing device 112 may determine, statistically, the probability that a positive sample is ranked above a negative sample. For example, the processing device 112 may determine the value of the AUC according to Equation (1) below:
AUC = (1 / (M × N)) × ∑ I(P_positive, P_negative),    (1)
where M refers to a count of positive samples of the plurality of samples, N refers to a count of negative samples of the plurality of samples, the summation runs over all M × N (positive, negative) sample pairs, P_positive refers to a reference probability (i.e., a true score) of a positive sample, P_negative refers to a reference probability (i.e., a true score) of a negative sample, and I(P_positive, P_negative) may be determined according to Equation (2) below:
I(P_positive, P_negative) = 1 if P_positive > P_negative, and I(P_positive, P_negative) = 0 otherwise.    (2)
According to Equation (1), the plurality of samples may include M positive samples and N negative samples. The processing device 112 may determine M × N sample pairs. Each of the M × N sample pairs may include a negative sample and a positive sample. The processing device 112 may compare the reference probabilities corresponding to the negative sample and the positive sample in each of the M × N sample pairs to determine the AUC of the machine learning model for anomaly detection with respect to the candidate threshold.
For example, assume that the plurality of samples include A, B, C, and D, the estimated probabilities of the plurality of samples are 0.9, 0.8, 0.75, and 0.85, respectively, and the reference probabilities of the plurality of samples are 0.7, 0.6, 0.8, and 0.9, respectively. The candidate threshold equals the estimated probability 0.85 of sample D. The processing device 112 may label sample A as a positive sample, sample B as a negative sample, sample C as a negative sample, and sample D as a positive sample by comparing the estimated probabilities of samples A, B, C, and D, respectively, with the candidate threshold 0.85. The processing device 112 may determine 4 sample pairs including (A, B), (A, C), (D, B), and (D, C). According to Equation (2), the processing device 112 may determine I(A, B), I(A, C), I(D, B), and I(D, C) as 1, 0, 1, and 1, respectively. According to Equation (1), the AUC of the machine learning model for anomaly detection may be determined as
AUC = (1 + 0 + 1 + 1) / (2 × 2) = 0.75
when designating the estimated probability 0.85 of sample D as the candidate threshold. Similarly, the processing device 112 may determine the AUCs of the machine learning  model for anomaly detection when designating estimated probabilities of samples A, B, and C as the candidate threshold, respectively.
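The pairwise computation of Equations (1) and (2) can also be written out directly. The sketch below reproduces the worked example above (positive samples A and D, negative samples B and C under the candidate threshold 0.85) and yields an AUC of 0.75:

```python
def pairwise_auc(ref_probs, labels):
    """AUC per Equations (1)-(2): the fraction of (positive, negative)
    sample pairs in which the positive sample has the larger reference
    probability."""
    positives = [p for p, y in zip(ref_probs, labels) if y == 1]
    negatives = [p for p, y in zip(ref_probs, labels) if y == 0]
    hits = sum(1 for p in positives for n in negatives if p > n)  # I(., .)
    return hits / (len(positives) * len(negatives))               # M * N

# Worked example: reference probabilities of A, B, C, D and their labels.
ref = [0.7, 0.6, 0.8, 0.9]
lab = [1, 0, 0, 1]
print(pairwise_auc(ref, lab))  # 0.75
```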
The Gini coefficient of the machine learning model for anomaly detection with respect to the candidate threshold may be determined based on the AUC according to Equation (3) below:
Gini=2*AUC-1.    (3)
It should be noted that the above description regarding the process 600 is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, multiple variations and modifications may be made under the teachings of the present disclosure. However, those variations and modifications do not depart from the scope of the present disclosure.
FIG. 7 is a flowchart illustrating an exemplary process 700 for anomaly detection according to some embodiments of the present disclosure. At least a portion of process 700 may be implemented on the computing device 200 as illustrated in FIG. 2 or the mobile device 300 as illustrated in FIG. 3. In some embodiments, one or more operations of process 700 may be implemented in the anomaly detection system 100 as illustrated in FIG. 1. In some embodiments, one or more operations in the process 700 may be stored in a storage device (e.g., the storage 120, the ROM 230, the RAM 240, the storage 390) as a form of instructions, and invoked and/or executed by the server 110 (e.g., the processing device 112 in the server 110, or the processor 220 of the computing device 200) or the CPU 340 of the mobile device 300. In some embodiments, the instructions may be transmitted in the form of electronic current or electrical signals.
In 702, the processing device 112 (e.g., the acquisition module 410) may obtain data associated with a specific event. In some embodiments, the specific event may be an event associated with one of a plurality of samples as described in connection with FIG. 5. The data associated with the specific event may be one of the plurality of samples. In some embodiments, the data associated with the specific event may be obtained by the acquisition module 410 from the data providing system 130, the service providing system 140, the storage 120, etc. The data associated with the specific event may include one or more features that characterize the specific event as described elsewhere in the present disclosure.
In 704, the processing device 112 (e.g., the acquisition module 410) may obtain a machine learning model for anomaly detection with a target threshold. The acquisition module 410 may obtain the machine learning model with the target threshold from the storage 120, the data providing system 130, the service providing system 140, and any other storage device as described elsewhere in the present disclosure.
The machine learning model for anomaly detection may include an unsupervised machine learning model, a semi-supervised machine learning model, etc., as described elsewhere in the present disclosure (e.g., FIG. 5 and the descriptions thereof). In some embodiments, the target threshold may be used by the machine learning model for anomaly detection to determine whether an event is anomalous. The target threshold may be determined according to process 500 as described in FIG. 5. The target threshold may be determined using a plurality of samples associated with different events, each of whose anomaly status (i.e., whether the event is anomalous or not) is unknown. In some embodiments, the plurality of samples may include the data associated with the event obtained in operation 702. Estimated probabilities corresponding to the plurality of samples may be determined. Further, a plurality of candidate thresholds associated with a machine learning model for anomaly detection may be determined. Then an evaluation result may be determined by evaluating the machine learning model, with respect to each of the plurality of candidate thresholds, for detecting anomalies in the samples. A target threshold associated with the machine learning model may be determined from the plurality of candidate thresholds based on the evaluation result.
In 706, the processing device 112 (e.g., the determination module 420) may determine whether the specific event is anomalous based on the data associated with the specific event and the machine learning model with the target threshold. The processing device 112 may determine whether the specific event is anomalous by inputting the data associated with the specific event into the machine learning model. In some embodiments, the machine learning model for anomaly detection may be configured to determine and output an estimated probability corresponding to the specific event based on the inputted data associated with the specific event. Further, the determination module 420 may compare the estimated probability with the target threshold. If the estimated probability corresponding to the specific event is smaller than the target threshold, the determination module 420 may determine that the specific event is anomalous. If the estimated probability corresponding to the specific event exceeds the target threshold, the determination module 420 may determine that the event is normal. In some embodiments, the machine learning model for anomaly detection may be configured to determine an estimated probability corresponding to the specific event and determine whether the specific event is anomalous based on the inputted data associated with the specific event and the target threshold. The machine learning model for anomaly detection may be configured to output an estimated result of the specific event. For example, if the specific event is anomalous, the machine learning model for anomaly detection may output “0. ” If the specific event is normal, the machine learning model for anomaly detection may output “1. ”
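As a hedged sketch of operations 702 through 706, the snippet below scores one incoming event with a trained model and compares the result with the target threshold. It assumes an sklearn-style model exposing score_samples, and that the model's scores and the target threshold are on the same scale as in the earlier sketches:

```python
import numpy as np

def is_anomalous(event_features: np.ndarray, model,
                 target_threshold: float) -> bool:
    """Sketch of operation 706: the specific event is determined to be
    anomalous when its estimated probability is smaller than the target
    threshold (a smaller value means more likely anomalous)."""
    score = float(model.score_samples(event_features.reshape(1, -1))[0])
    return score < target_threshold
```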
FIGs. 8A-8D illustrate schematic diagrams of exemplary anomaly detection results according to some embodiments of the present disclosure. As shown in FIG. 8A, the horizontal axis represents the time axis. The vertical axis represents sample signals obtained over time. Curve “a” and curve “b” represent service indicators associated with the sample signals. Curve “c” represents alarm signals that indicate that an anomaly appears. The greater a peak of curve “a” or curve “b” is, the greater the possibility that an anomaly exists may be. As shown in FIG. 8A, the anomaly detection was performed using a machine learning model for anomaly detection with respect to a target threshold of about 0.1387. The AUC of the machine learning model for anomaly detection with respect to the threshold of about 0.1387 was about 0.9376. FIG. 8A shows that a plurality of anomalies appeared during anomaly detection. As shown in FIG. 8B, the anomaly detection was performed using a machine learning model for anomaly detection with respect to a target threshold of about 0.1728. The AUC of the machine learning model for anomaly detection with respect to the threshold of about 0.1728 was about 0.9671. FIG. 8B shows that a plurality of anomalies appeared during anomaly detection. As shown in FIG. 8C, the anomaly detection was performed using a machine learning model for anomaly detection with respect to a threshold of about 0.5838. The AUC of the machine learning model for anomaly detection with respect to the threshold of about 0.5838 was about 0.9998. FIG. 8C shows that an anomaly appeared at the time corresponding to a maximum peak “P.” As shown in FIG. 8D, the anomaly detection was performed using a machine learning model for anomaly detection with respect to a target threshold of about 0.8272. The AUC of the machine learning model for anomaly detection with respect to the threshold of about 0.8272 was about 0.9980. FIG. 8D shows that at least two anomalies appeared at the time corresponding to a maximum peak “P.” Generally, one single anomaly may appear at the time corresponding to a maximum peak, e.g., peak “P.” Accordingly, the greater the AUC of a machine learning model for anomaly detection is, the greater the accuracy of the machine learning model for anomaly detection may be.
Having thus described the basic concepts, it may be rather apparent to those skilled in the art after reading this detailed disclosure that the foregoing detailed disclosure is intended to be presented by way of example only and is not limiting. Various alterations, improvements, and modifications may occur to those skilled in the art, though not expressly stated herein. These alterations, improvements, and modifications are intended to be suggested by this disclosure, and are within the spirit and scope of the exemplary embodiments of this disclosure.
Moreover, certain terminology has been used to describe embodiments of the present disclosure. For example, the terms “one embodiment, ” “an embodiment, ” and/or “some embodiments” mean that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Therefore, it is emphasized and should be appreciated that two or more references to “an embodiment, ” “one embodiment, ” or “an alternative embodiment” in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the present disclosure.
Further, it will be appreciated by one skilled in the art that aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented as entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of software and hardware implementations that may all generally be referred to herein as a “block,” “module,” “engine,” “unit,” “component,” or “system.” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof. A computer  readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python or the like, conventional procedural programming languages, such as the “C” programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider), or in a cloud computing environment, or offered as a service such as a software as a service (SaaS).
Furthermore, the recited order of processing elements or sequences, or the use of numbers, letters, or other designations, is not intended to limit the claimed processes and methods to any order except as may be specified in the claims. Although the above disclosure discusses through various examples what is currently considered to be a variety of useful embodiments of the disclosure, it is to be understood that such detail is solely for that purpose, and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover modifications and equivalent arrangements that are within the spirit and scope of the disclosed embodiments. For example, although the implementation of various components described above may be embodied in a hardware device, it may also be implemented as a software-only solution, e.g., an installation on an existing server or mobile device.
Similarly, it should be appreciated that in the foregoing description of embodiments of the present disclosure, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, claimed subject matter may lie in less than all features of a single foregoing disclosed embodiment.

Claims (22)

  1. A system for anomaly detection, comprising:
    at least one storage medium including a set of instructions;
    at least one processor in communication with the at least one storage medium, wherein when executing the set of instructions, the at least one processor is directed to cause the system to perform operations including:
    obtaining a plurality of samples, each of the plurality of samples being associated with an event;
    for each of the plurality of samples, determining, based on a machine learning model for the anomaly detection, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous;
    determining, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model for the anomaly detection;
    determining an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds; and
    determining, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
  2. The system of claim 1, wherein the machine learning model includes at least one of a one-class support vector machine (SVM) model or an isolation forest algorithm.
  3. The system of claim 1 or claim 2, wherein to determine, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model, the at least one  processor is directed to cause the system to perform additional operations including:
    designating the estimated probability corresponding to each of the at least a portion of the plurality of samples as one of the plurality of candidate thresholds.
  4. The system of any one of claims 1 to 3, wherein to evaluate the machine learning model with respect to each of the plurality of candidate thresholds, the at least one processor is directed to cause the system to perform additional operations including:
    for each of the plurality of samples,
    determining, based on a probability estimation model, a reference probability corresponding to the each of the plurality of samples; and
    evaluating, based on the estimated probability and the reference probability, the machine learning model with respect to each of the plurality of candidate thresholds.
  5. The system of claim 4, wherein to evaluate, based on the estimated probability and the reference probability, the machine learning model, the at least one processor is directed to cause the system to perform additional operations including:
    determining, based on the reference probability and the estimated probability, an evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds.
  6. The system of claim 5, wherein to determine, based on the reference probability and the estimated probability, an evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds, the at least one processor is directed to cause the system to perform additional operations including:
    determining, based on each of the plurality of candidate thresholds and the estimated probability, an estimated label of each of the plurality of samples, the estimated label  including a negative sample or a positive sample; and
    determining, based on the reference probability and the estimated label, the evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds.
  7. The system of claim 6, wherein the at least one processor is further directed to cause the system to perform additional operations including:
    ranking the reference probability of each of the plurality of samples, wherein determining the evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds includes:
    determining, based on the ranked reference probability and the estimated label corresponding to each of the plurality of samples, the evaluation index.
  8. The system of any one of claims 5 to 7, wherein the evaluation index of the machine learning model includes at least one of an area under curve (AUC) or a Gini coefficient.
  9. The system of any one of claims 5 to 8, wherein to determine, based on the evaluation result, a target threshold associated with the machine learning model from the plurality of candidate thresholds, the at least one processor is directed to cause the system to perform additional operations including:
    identifying, from the plurality of candidate thresholds, a candidate threshold that corresponds to a maximum of the evaluation index; and
    designating the identified candidate threshold as the target threshold associated with the machine learning model.
  10. The system of any one of claims 1 to 9, wherein the at least one processor is  directed to cause the system to perform additional operations including:
    obtaining data associated with a specific event; and
    determining, based on the data associated with the specific event and the machine learning model with respect to the target threshold, whether the specific event is anomalous.
  11. A method for anomaly detection, comprising:
    obtaining a plurality of samples, each of the plurality of samples being associated with an event;
    for each of the plurality of samples, determining, based on a machine learning model for the anomaly detection, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous;
    determining, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model;
    determining an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds; and
    determining, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
  12. The method of claim 11, wherein the machine learning model includes at least one of a one-class support vector machine (SVM) model or an isolation forest algorithm.
  13. The method of claim 11 or claim 12, wherein the determining, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of  candidate thresholds associated with the machine learning model includes:
    designating the estimated probability corresponding to each of the at least a portion of the plurality of samples as one of the plurality of candidate thresholds.
  14. The method of any one of claims 11 to 13, wherein the evaluating the machine learning model with respect to each of the plurality of candidate thresholds includes:
    for each of the plurality of samples,
    determining, based on a probability estimation model, a reference probability corresponding to the each of the plurality of samples; and
    evaluating, based on the estimated probability and the reference probability, the machine learning model with respect to each of the plurality of candidate thresholds.
  15. The method of claim 14, wherein the evaluating, based on the estimated probability and the reference probability, the machine learning model includes:
    determining, based on the reference probability and the estimated probability, an evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds; and
    evaluating, based on the evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds, the machine learning model.
  16. The method of claim 15, wherein the determining, based on the reference probability and the estimated probability, an evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds includes:
    determining, based on each of the plurality of candidate thresholds and the estimated probability, an estimated label of each of the plurality of samples, the estimated label including a negative sample or a positive sample; and
    determining, based on the reference probability and the estimated label, the evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds.
  17. The method of claim 16, further comprising:
    ranking the reference probability of each of the plurality of samples, wherein determining the evaluation index of the machine learning model with respect to each of the plurality of candidate thresholds includes:
    determining, based on the ranked reference probability and the estimated label corresponding to each of the plurality of samples, the evaluation index.
  18. The method of any one of claims 15 to 17, wherein the evaluation index of the machine learning model includes at least one of an area under curve (AUC) or a Gini coefficient.
  19. The method of any one of claims 15 to 18, wherein the determining, based on the evaluation result, a target threshold associated with the machine learning model from the plurality of candidate thresholds includes:
    identifying, from the plurality of candidate thresholds, a candidate threshold that corresponds to a maximum of the evaluation index; and
    designating the identified candidate threshold as the target threshold associated with the machine learning model.
  20. The method of any one of claims 11 to 19, further comprising:
    obtaining data associated with a specific event; and
    determining, based on the data associated with the specific event and the machine  learning model with respect to the target threshold, whether the specific event is anomalous.
  21. A non-transitory computer readable medium storing instructions, the instructions, when executed by at least one processor, causing the at least one processor to implement a method comprising:
    obtaining a plurality of samples, each of the plurality of samples being associated with an event;
    for each of the plurality of samples, determining, based on a machine learning model for the anomaly detection, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous;
    determining, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model;
    determining an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds; and
    determining, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
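
An end-to-end sketch of the method of claim 21, with the assumptions flagged: scikit-learn's IsolationForest stands in for "a machine learning model for anomaly detection", its raw scores are min-max mapped to [0, 1] as stand-in estimated probabilities, candidate thresholds are drawn as quantiles of those probabilities, and reference_probs is an externally supplied array of reference probabilities, one per sample. None of these choices is mandated by the claims.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score

def calibrate_anomaly_model(samples, reference_probs, n_thresholds=50, seed=0):
    """Fit an unsupervised model, derive candidate thresholds, and return
    the model with the threshold that maximizes the evaluation index."""
    samples = np.asarray(samples)
    model = IsolationForest(random_state=seed).fit(samples)
    raw = -model.score_samples(samples)  # higher = more anomalous
    probs = (raw - raw.min()) / (raw.max() - raw.min() + 1e-12)
    candidates = np.quantile(probs, np.linspace(0.01, 0.99, n_thresholds))
    best_t, best_auc = candidates[0], float("-inf")
    for t in candidates:
        labels = (probs >= t).astype(int)
        if labels.min() == labels.max():  # AUC undefined for one class
            continue
        auc = roc_auc_score(labels, reference_probs)
        if auc > best_auc:
            best_t, best_auc = t, auc
    return model, best_t
```
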
  22. A system for anomaly detection, comprising:
    an acquisition module configured to obtain a plurality of samples, each of the plurality of samples being associated with an event;
    a determination module configured to:
    for each of the plurality of samples, determine, based on a machine learning model for the anomaly detection, an estimated probability that the event corresponding to the each of the plurality of samples is anomalous; and
    determine, based on estimated probabilities corresponding to at least a portion of the plurality of samples, a plurality of candidate thresholds associated with the machine learning model; and
    an evaluation module configured to determine an evaluation result by evaluating the machine learning model for the anomaly detection with respect to each of the plurality of candidate thresholds, and wherein the determination module is further configured to:
    determine, based on the evaluation result, a target threshold associated with the machine learning model for the anomaly detection from the plurality of candidate thresholds.
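
Finally, the system of claim 22 groups the same steps into acquisition, determination, and evaluation modules. A skeletal sketch of that decomposition follows, with a hypothetical score_fn callable standing in for the machine learning model; the module boundaries mirror the claim, while every interface detail is an assumption.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

class AnomalyDetectionSystem:
    """Skeleton mirroring claim 22's module split."""

    def __init__(self, score_fn, reference_probs):
        self.score_fn = score_fn  # samples -> estimated probabilities
        self.reference_probs = np.asarray(reference_probs)

    def acquire(self, source):
        # Acquisition module: one sample (feature vector) per event.
        return np.asarray(list(source))

    def determine(self, samples):
        # Determination module: estimated probabilities, with the distinct
        # probabilities reused as the candidate thresholds.
        probs = np.asarray(self.score_fn(samples))
        return probs, np.unique(probs)

    def evaluate_and_select(self, probs, candidates):
        # Evaluation module plus final determination: evaluation index
        # (AUC) per candidate threshold; keep the maximizer.
        best_t, best_auc = candidates[0], float("-inf")
        for t in candidates:
            labels = (probs >= t).astype(int)
            if labels.min() == labels.max():
                continue  # AUC undefined when only one class is present
            auc = roc_auc_score(labels, self.reference_probs)
            if auc > best_auc:
                best_t, best_auc = t, auc
        return best_t
```
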
PCT/CN2019/091433 (priority date 2019-06-11, filing date 2019-06-15): Systems and methods for anomaly detection, published as WO2020248291A1 (en)

Applications Claiming Priority (2)

- CN201910501710.5, priority date 2019-06-11
- CN201910501710.5A (published as CN111860872B (en)), priority date 2019-06-11, filing date 2019-06-11: System and method for anomaly detection

Publications (1)

- WO2020248291A1 (en), published 2020-12-17

Family

ID=72966069

Family Applications (1)

- PCT/CN2019/091433 (WO2020248291A1 (en)), priority date 2019-06-11, filing date 2019-06-15: Systems and methods for anomaly detection

Country Status (2)

- CN: CN111860872B (en)
- WO: WO2020248291A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541981B (en) * 2020-11-03 2022-07-22 山东中创软件商用中间件股份有限公司 ETC portal system early warning method, device, equipment and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107798390B (en) * 2017-11-22 2023-03-21 创新先进技术有限公司 Training method and device of machine learning model and electronic equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180096261A1 (en) * 2016-10-01 2018-04-05 Intel Corporation Unsupervised machine learning ensemble for anomaly detection
US20180247220A1 (en) * 2017-02-28 2018-08-30 International Business Machines Corporation Detecting data anomalies
CN109522304A (en) * 2018-11-23 2019-03-26 中国联合网络通信集团有限公司 Exception object recognition methods and device, storage medium

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112738088A (en) * 2020-12-28 2021-04-30 上海观安信息技术股份有限公司 Behavior sequence anomaly detection method and system based on unsupervised algorithm
CN112733897A (en) * 2020-12-30 2021-04-30 胜斗士(上海)科技技术发展有限公司 Method and equipment for determining abnormal reason of multi-dimensional sample data
CN113536050A (en) * 2021-07-06 2021-10-22 贵州电网有限责任公司 Distribution network monitoring system curve data query processing method
CN113536050B (en) * 2021-07-06 2023-12-01 贵州电网有限责任公司 Distribution network monitoring system curve data query processing method
CN114125916A (en) * 2022-01-27 2022-03-01 荣耀终端有限公司 Communication system, method and related equipment
CN114125916B (en) * 2022-01-27 2022-06-10 荣耀终端有限公司 Communication system, method and related equipment
CN114500326A (en) * 2022-02-25 2022-05-13 北京百度网讯科技有限公司 Abnormality detection method, abnormality detection device, electronic apparatus, and storage medium
CN114500326B (en) * 2022-02-25 2023-08-11 北京百度网讯科技有限公司 Abnormality detection method, abnormality detection device, electronic device, and storage medium
CN114726749A (en) * 2022-03-02 2022-07-08 阿里巴巴(中国)有限公司 Data anomaly detection model acquisition method, device, equipment, medium and product
CN114726749B (en) * 2022-03-02 2023-10-31 阿里巴巴(中国)有限公司 Data anomaly detection model acquisition method, device, equipment and medium
CN115001997B (en) * 2022-04-11 2024-02-09 北京邮电大学 Extreme value theory-based smart city network equipment performance abnormal threshold evaluation method
CN115001997A (en) * 2022-04-11 2022-09-02 北京邮电大学 Extreme value theory-based smart city network equipment performance abnormity threshold evaluation method
CN115567371B (en) * 2022-11-16 2023-03-10 支付宝(杭州)信息技术有限公司 Abnormity detection method, device, equipment and readable storage medium
CN115567371A (en) * 2022-11-16 2023-01-03 支付宝(杭州)信息技术有限公司 Abnormity detection method, device, equipment and readable storage medium
CN116127326A (en) * 2023-04-04 2023-05-16 广东电网有限责任公司揭阳供电局 Composite insulator detection method and device, electronic equipment and storage medium
CN116127326B (en) * 2023-04-04 2023-06-23 广东电网有限责任公司揭阳供电局 Composite insulator detection method and device, electronic equipment and storage medium
CN116430831B (en) * 2023-04-26 2023-10-31 宁夏五谷丰生物科技发展有限公司 Data abnormity monitoring method and system applied to edible oil production control system
CN116430831A (en) * 2023-04-26 2023-07-14 宁夏五谷丰生物科技发展有限公司 Data abnormity monitoring method and system applied to edible oil production control system
CN117076991A (en) * 2023-10-16 2023-11-17 云境商务智能研究院南京有限公司 Power consumption abnormality monitoring method and device for pollution control equipment and computer equipment
CN117076991B (en) * 2023-10-16 2024-01-02 云境商务智能研究院南京有限公司 Power consumption abnormality monitoring method and device for pollution control equipment and computer equipment

Also Published As

- CN111860872A (en), published 2020-10-30
- CN111860872B (en), published 2024-03-26

Similar Documents

Publication Title
WO2020248291A1 (en) Systems and methods for anomaly detection
US10254119B2 (en) Systems and methods for recommending an estimated time of arrival
US20200011692A1 (en) Systems and methods for recommending an estimated time of arrival
US20200051193A1 (en) Systems and methods for allocating orders
WO2019232772A1 (en) Systems and methods for content identification
US20210118078A1 (en) Systems and methods for determining potential malicious event
CN111316308B (en) System and method for identifying wrong order requests
EP3568850A1 (en) Systems and methods for speech information processing
US11573084B2 (en) Method and system for heading determination
US20190205785A1 (en) Event detection using sensor data
WO2018171531A1 (en) System and method for predicting classification for object
Abdelrahman et al. Robust data-driven framework for driver behavior profiling using supervised machine learning
US11140531B2 (en) Systems and methods for processing data from an online on-demand service platform
WO2021087663A1 (en) Systems and methods for determining name for boarding point
KR102505303B1 (en) Method and apparatus for classifying image
US20210110140A1 (en) Environment specific model delivery
WO2019232773A1 (en) Systems and methods for abnormality detection in data storage
CN115147618A (en) Method for generating saliency map, method and device for detecting abnormal object
CN111274471B (en) Information pushing method, device, server and readable storage medium
US20230035995A1 (en) Method, apparatus and storage medium for object attribute classification model training
CN112243487A (en) System and method for on-demand services
WO2021051230A1 (en) Systems and methods for object detection
Abdelrahman et al. A robust environment-aware driver profiling framework using ensemble supervised learning
CN111797620A (en) System and method for recognizing proper nouns
Truong et al. Rotated Mask Region-Based Convolutional Neural Network Detection for Parking Space Management System

Legal Events

- 121 (EP): The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19933134; country of ref document: EP; kind code of ref document: A1.
- NENP: Non-entry into the national phase. Ref country code: DE.
- 122 (EP): PCT application non-entry in European phase. Ref document number: 19933134; country of ref document: EP; kind code of ref document: A1.