WO2019232773A1 - Systems and methods for abnormality detection in data storage - Google Patents

Publication number
WO2019232773A1
WO2019232773A1 PCT/CN2018/090357 CN2018090357W WO2019232773A1 WO 2019232773 A1 WO2019232773 A1 WO 2019232773A1 CN 2018090357 W CN2018090357 W CN 2018090357W WO 2019232773 A1 WO2019232773 A1 WO 2019232773A1
Authority
WO
WIPO (PCT)
Prior art keywords
values
relating
comparison result
service
actual values
Prior art date
Application number
PCT/CN2018/090357
Other languages
French (fr)
Inventor
Zuyu GAN
Zhou Ye
Yu Wang
Original Assignee
Beijing Didi Infinity Technology And Development Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology And Development Co., Ltd.
Priority to PCT/CN2018/090357 (WO2019232773A1)
Priority to CN201880001318.8A (CN110945484B)
Publication of WO2019232773A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/02 Reservations, e.g. for tickets, services or events
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3447 Performance evaluation by modeling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452 Performance evaluation by statistical analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/805 Real-time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/81 Threshold

Definitions

  • The present disclosure generally relates to systems and methods for data storage management, and in particular, to systems and methods for detecting abnormality in data storage.
  • A data warehouse may be used to store service data.
  • Anomaly detection may aim to find, within the service data, a set of data that differs from the expected data.
  • Because the service data may reflect the business situation within a certain period of time, the authenticity of the service data in the data warehouse must be ensured, and each abnormal fluctuation of the service data may need to trigger a prompt alert.
  • Current technology often relies on the experience of a database administrator or on continuous iterative amendments of a database management system, which delays the response to an abnormal fluctuation. A method and system to improve the abnormality detection may be desired.
  • a system may include a storage device storing a set of instructions and one or more processors in communication with the storage device.
  • one or more processors may be configured to cause the system to obtain, via a network, a plurality of historical data values relating to a service and determine a category relating to the plurality of historical data values.
  • the one or more processors may also cause the system to determine a plurality of predicted values relating to the service based on a prediction model relating to the category and obtain, via a network, the plurality of actual values relating to the service corresponding to the plurality of predicted values.
  • each of the plurality of predicted values may correspond to a time point.
  • the one or more processors may further cause the system to compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result and determine that at least part of the plurality of actual values are abnormal based on the comparison result.
  • the plurality of historical data values may form a temporal sequence.
  • the one or more processors may further cause the system to determine a plurality of feature values relating to the plurality of historical data values and determine the category relating to the plurality of historical data values based on the plurality of feature values.
  • the category may indicate a characteristic relating to the service.
  • the category may include one of growth period with periodicity, stable period with periodicity, fading period with periodicity, growth period with aperiodicity, stable period with aperiodicity, or fading period with aperiodicity.
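The six categories above can be read as the product of a life-cycle stage (growth, stable, fading) and a periodicity flag. The following is a minimal sketch of such a classifier, not the method of the disclosure: the two feature values (a relative least-squares slope and the autocorrelation at a candidate period) and the cut-offs `slope_tol` and `acf_min` are assumptions chosen only for illustration.

```python
from statistics import mean

def slope(values):
    """Least-squares slope of the values against their index."""
    n = len(values)
    x_bar, y_bar = (n - 1) / 2, mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den

def autocorrelation(values, lag):
    """Sample autocorrelation of the values at the given lag."""
    y_bar = mean(values)
    den = sum((v - y_bar) ** 2 for v in values)
    if den == 0:
        return 0.0  # a constant series carries no periodic signal
    num = sum((values[i] - y_bar) * (values[i + lag] - y_bar)
              for i in range(len(values) - lag))
    return num / den

def classify(values, period, slope_tol=0.01, acf_min=0.5):
    """Map a temporal sequence to one of the six category labels.

    slope_tol and acf_min are hypothetical cut-offs, not values from the source.
    """
    s = slope(values)
    # Feature 1: slope relative to the average level decides the stage.
    relative = s / (abs(mean(values)) or 1.0)
    if relative > slope_tol:
        stage = "growth"
    elif relative < -slope_tol:
        stage = "fading"
    else:
        stage = "stable"
    # Feature 2: autocorrelation of the detrended series at the candidate period.
    detrended = [v - s * i for i, v in enumerate(values)]
    periodic = autocorrelation(detrended, period) >= acf_min
    return f"{stage} period with {'periodicity' if periodic else 'aperiodicity'}"
```

For example, a daily series that rises steadily and peaks every weekend would be labeled "growth period with periodicity" when `period=7`, while a flat series would be labeled "stable period with aperiodicity".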
  • the one or more processors may also cause the system to determine that the category indicating the characteristic relating to the service is associated with periodicity and determine a residual function, a trend function, and a seasonal function relating to the plurality of historical data values based on the category associated with periodicity.
  • the one or more processors may further cause the system to generate the prediction model based on the residual function, the trend function and the seasonal function and determine the plurality of predicted values based on the prediction model.
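One common way to combine a trend function, a seasonal function, and a residual function is an additive decomposition, value(t) = trend(t) + seasonal(t) + residual(t). The sketch below assumes that additive form, a linear trend, and a known period; none of these choices is stated in the source, and the function names are hypothetical. The residual term is summarized by its standard deviation, which a later comparison step could use as a noise scale.

```python
from statistics import mean, pstdev

def fit_linear_trend(values):
    """Return (slope, intercept) of a least-squares line over the index."""
    n = len(values)
    x_bar, y_bar = (n - 1) / 2, mean(values)
    den = sum((i - x_bar) ** 2 for i in range(n))
    k = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values)) / den
    return k, y_bar - k * x_bar

def fit_model(history, period):
    """Fit trend, seasonal, and residual components (additive form assumed)."""
    k, b = fit_linear_trend(history)
    detrended = [v - (k * i + b) for i, v in enumerate(history)]
    # Seasonal function: average detrended value at each phase of the period.
    seasonal = [mean(detrended[p::period]) for p in range(period)]
    residuals = [d - seasonal[i % period] for i, d in enumerate(detrended)]
    return {
        "slope": k,
        "intercept": b,
        "seasonal": seasonal,
        "residual_std": pstdev(residuals),
        "n": len(history),
        "period": period,
    }

def predict(model, steps):
    """Predict the next `steps` values, one per future time point."""
    predicted = []
    for t in range(model["n"], model["n"] + steps):
        trend = model["slope"] * t + model["intercept"]
        predicted.append(trend + model["seasonal"][t % model["period"]])
    return predicted
```

Because the line is fitted before the seasonal averages are taken, the two components are only approximately separated on short histories; the sketch trades accuracy for brevity.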
  • the one or more processors may further cause the system to obtain time points relating to at least part of the plurality of predicted values and obtain the plurality of actual values based on the time points relating to the at least part of the plurality of predicted values.
  • the at least one filter may include a dispersion filter.
  • the one or more processors may also cause the system to determine a statistical value based on the plurality of predicted values and the plurality of actual values using the dispersion filter and compare the statistical value with a first threshold.
  • the statistical value may be associated with dispersion degrees of both the plurality of predicted values and the plurality of actual values.
  • the one or more processors may further cause the system to determine that the at least part of the plurality of actual values are abnormal in response to the comparison result that the statistical value is greater than the first threshold.
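The source does not define the statistical value beyond saying it is associated with the dispersion degrees of both sequences. As a sketch under that constraint, the example below assumes the statistic is the absolute difference between the coefficients of variation of the actual and predicted values; the statistic and the function names are illustrative assumptions.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation divided by the absolute mean (dispersion degree)."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / abs(m)

def dispersion_statistic(predicted, actual):
    """Assumed statistic: gap between the two dispersion degrees."""
    return abs(coefficient_of_variation(actual) - coefficient_of_variation(predicted))

def dispersion_filter(predicted, actual, first_threshold):
    """Return True when the statistic is greater than the first threshold."""
    return dispersion_statistic(predicted, actual) > first_threshold
```

With `predicted = [100, 102, 98, 101, 99]`, an actual sequence of `[100, 150, 40, 101, 99]` is far more dispersed than predicted and would be flagged at a threshold of 0.1, whereas an actual sequence equal to the prediction would not.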
  • the at least one filter may include a threshold filter.
  • the one or more processors may also cause the system to determine a plurality of differences between the plurality of predicted values and the plurality of actual values using the threshold filter and determine a plurality of second thresholds based on a time function.
  • the one or more processors may further cause the system to compare each of the plurality of differences with a corresponding second threshold and determine that the at least part of the plurality of actual values are abnormal in response to the comparison result that each of the plurality of differences is greater than a corresponding second threshold.
  • each of the plurality of differences and the corresponding second threshold may be associated with the same time point.
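The threshold filter pairs each difference with a second threshold evaluated at the same time point. A minimal sketch follows; the time function `second_threshold`, its peak-hour windows, and its numeric values are hypothetical placeholders, since the source only states that the thresholds are determined based on a time function.

```python
def second_threshold(hour, base=50.0, peak_extra=100.0):
    """Hypothetical time function: allow a wider tolerance in assumed peak hours."""
    is_peak = 7 <= hour < 10 or 17 <= hour < 20
    return base + (peak_extra if is_peak else 0.0)

def threshold_filter(time_points, predicted, actual, threshold_fn=second_threshold):
    """Return the time points whose difference exceeds the threshold for that point."""
    flagged = []
    for t, p, a in zip(time_points, predicted, actual):
        difference = abs(a - p)
        if difference > threshold_fn(t):
            flagged.append(t)
    return flagged
```

For time points `[3, 8, 14]` with predicted values `[200, 900, 400]` and actual values `[190, 780, 520]`, the differences are 10, 120, and 120, and the thresholds are 50, 150, and 50, so only time point 14 is flagged.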
  • the at least one filter may include a false alarm filter.
  • the one or more processors may further cause the system to determine a false alarm model based on a pre-labeled data set relating to service data and determine that the at least part of the plurality of actual values are abnormal based on the false alarm model.
  • the pre-labeled data set may include a plurality of false alarm results generated by the system.
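A false alarm model can be learned from past alerts that were later labeled as genuine or false. The sketch below uses a small k-nearest-neighbor vote purely as a stand-in; the source does not name a learning algorithm, and the feature vector (here two arbitrary numeric features per alert) and all function names are assumptions.

```python
from math import dist

def build_false_alarm_model(labeled_alerts, k=3):
    """Build a predicate from (feature_vector, is_false_alarm) pairs.

    The k-nearest-neighbor vote is an illustrative choice, not the source's model.
    """
    def is_false_alarm(features):
        nearest = sorted(labeled_alerts, key=lambda item: dist(item[0], features))[:k]
        votes = sum(1 for _, label in nearest if label)
        return votes * 2 > len(nearest)  # strict majority of neighbors
    return is_false_alarm

def false_alarm_filter(candidates, model):
    """Keep only candidate alerts that the model does not consider false alarms."""
    return [c for c in candidates if not model(c["features"])]
```

In use, candidate alerts produced by earlier filters would be passed through `false_alarm_filter`, and only the survivors would be reported as abnormal.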
  • the one or more processors may also cause the system to compare the plurality of actual values with the plurality of predicted values using a dispersion filter, a threshold filter, and a false alarm filter to generate a first comparison result, a second comparison result, and a third comparison result, respectively.
  • the one or more processors may further cause the system to determine that at least part of the plurality of actual values are abnormal based on the first comparison result, the second comparison result, and the third comparison result.
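The source does not say how the three comparison results are combined. One plausible rule, shown here only as an assumption, is to treat data as abnormal when either the dispersion filter or the threshold filter fires, unless the false alarm filter predicts that the alert is a false alarm.

```python
from dataclasses import dataclass

@dataclass
class ComparisonResults:
    dispersion_exceeded: bool    # first comparison result (dispersion filter)
    threshold_exceeded: bool     # second comparison result (threshold filter)
    predicted_false_alarm: bool  # third comparison result (false alarm filter)

def is_abnormal(results):
    """Hypothetical combination rule: any detector fires and is not suppressed."""
    suspicious = results.dispersion_exceeded or results.threshold_exceeded
    return suspicious and not results.predicted_false_alarm
```

Other rules, such as requiring both detectors to agree, fit the same interface and trade fewer false alarms for a higher chance of missed abnormalities.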
  • a computer-implemented method may include one or more of the following operations performed by one or more processors.
  • the method may include obtaining, via a network, a plurality of historical data values relating to a service and determining a category relating to the plurality of historical data values.
  • the method may also include determining a plurality of predicted values relating to the service based on a prediction model relating to the category and obtaining, via a network, the plurality of actual values relating to the service corresponding to the plurality of predicted values.
  • each of the plurality of predicted values may correspond to a time point.
  • the method may further include comparing the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result and determining that at least part of the plurality of actual values are abnormal based on the comparison result.
  • the method may further include determining a plurality of feature values relating to the plurality of historical data values and determining the category relating to the plurality of historical data values based on the plurality of feature values.
  • the method may also include determining that the category indicating the characteristic relating to the service is associated with periodicity and determining a residual function, a trend function and a seasonal function relating to the plurality of historical data values based on the category associated with periodicity.
  • the method may further include generating the prediction model based on the residual function, the trend function, and the seasonal function and determining the plurality of predicted values based on the prediction model.
  • the method may further include obtaining time points relating to at least part of the plurality of predicted values and obtaining the plurality of actual values based on the time points relating to the at least part of the plurality of predicted values.
  • the at least one filter may include a dispersion filter.
  • the method may also include determining a statistical value based on the plurality of predicted values and the plurality of actual values using the dispersion filter and comparing the statistical value with a first threshold.
  • the statistical value may be associated with dispersion degrees of both the plurality of predicted values and the plurality of actual values.
  • the method may further include determining that the at least part of the plurality of actual values are abnormal in response to the comparison result that the statistical value is greater than the first threshold.
  • the at least one filter may include a threshold filter.
  • the method may also include determining a plurality of differences between the plurality of predicted values and the plurality of actual values using the threshold filter and determining a plurality of second thresholds based on a time function.
  • the method may further include comparing each of the plurality of differences with a corresponding second threshold and determining that the at least part of the plurality of actual values are abnormal in response to the comparison result that each of the plurality of differences is greater than a corresponding second threshold.
  • each of the plurality of differences and the corresponding second threshold may be associated with the same time point.
  • the at least one filter may include a false alarm filter.
  • the method may also include determining a false alarm model based on a pre-labeled data set relating to service data and determining that the at least part of the plurality of actual values are abnormal based on the false alarm model.
  • the pre-labeled data set may include a plurality of false alarm results generated by the system.
  • the method may also include comparing the plurality of actual values with the plurality of predicted values using a dispersion filter, a threshold filter, and a false alarm filter to generate a first comparison result, a second comparison result, and a third comparison result, respectively.
  • the method may further include determining that at least part of the plurality of actual values are abnormal based on the first comparison result, the second comparison result, and the third comparison result.
  • a non-transitory computer-readable medium may store instructions. When executed by one or more processors of a system, the instructions may cause the system to obtain, via a network, a plurality of historical data values relating to a service and determine a category relating to the plurality of historical data values. The instructions may also cause the system to determine a plurality of predicted values relating to the service based on a prediction model relating to the category and obtain, via a network, the plurality of actual values relating to the service corresponding to the plurality of predicted values. In some embodiments, each of the plurality of predicted values may correspond to a time point. The instructions may further cause the system to compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result and determine that at least part of the plurality of actual values are abnormal based on the comparison result.
  • FIG. 1 is a block diagram illustrating an exemplary online to offline service system according to some embodiments.
  • FIG. 2 is a schematic diagram illustrating exemplary hardware and software components of a computing device according to some embodiments.
  • FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of a mobile device according to some embodiments of the present disclosure.
  • FIG. 4 is a block diagram illustrating an exemplary processing engine according to some embodiments of the present disclosure.
  • FIG. 5 is a flowchart illustrating an exemplary process for determining that at least part of a plurality of actual values are abnormal based on a comparison result according to some embodiments of the present disclosure.
  • FIG. 6 is a flowchart illustrating an exemplary process for determining a plurality of predicted values according to some embodiments of the present disclosure.
  • FIG. 7 is a flowchart illustrating an exemplary process for determining that the at least part of the plurality of actual values are abnormal according to some embodiments of the present disclosure.
  • FIG. 8 is a flowchart illustrating an exemplary process for determining that the at least part of the plurality of actual values are abnormal according to some embodiments of the present disclosure.
  • FIG. 9 is a table associated with a plurality of business lines according to some embodiments of the present disclosure.
  • The flowcharts used in the present disclosure illustrate steps that systems implement according to some embodiments described in the present disclosure. It is to be expressly understood that the steps of the flowcharts need not be implemented in the order shown; the steps may instead be implemented in inverted order, or simultaneously. Moreover, one or more other steps may be added to the flowcharts, and one or more steps may be removed from the flowcharts.
  • Although the system and method in the present disclosure are described primarily with regard to distributing a request for a transportation service, it should also be understood that the present disclosure is not intended to be limiting.
  • the system or method of the present disclosure may be applied to any other kind of online to offline service.
  • the system or method of the present disclosure may be applied to transportation systems of different environments including land, ocean, aerospace, or the like, or any combination thereof.
  • the vehicle of the transportation systems may include a taxi, a private car, a hitch, a bus, a train, a bullet train, a high-speed rail, a subway, a vessel, an aircraft, a spaceship, a hot-air balloon, a driverless vehicle, or the like, or any combination thereof.
  • the transportation system may also include any transportation system for management and/or distribution, for example, a system for transmitting and/or receiving an express.
  • the application of the system or method of the present disclosure may be implemented on a user device and include a web page, a plug-in of a browser, a client terminal, a custom system, an internal analysis system, an artificial intelligence robot, or the like, or any combination thereof.
  • The terms “passenger,” “requestor,” “service requestor,” and “customer” in the present disclosure are used interchangeably to refer to an individual, an entity, or a tool that may request or order a service.
  • The terms “driver,” “provider,” and “service provider” in the present disclosure are used interchangeably to refer to an individual, an entity, or a tool that may provide a service or facilitate the providing of the service.
  • The terms “service request,” “request for a service,” “requests,” and “order” in the present disclosure are used interchangeably to refer to a request that may be initiated by a passenger, a service requestor, a customer, a driver, a provider, a service provider, or the like, or any combination thereof.
  • the service request may be accepted by any one of a passenger, a service requestor, a customer, a driver, a provider, or a service provider.
  • the service request may be chargeable or free.
  • The terms “service provider terminal” and “driver terminal” in the present disclosure are used interchangeably to refer to a mobile terminal that is used by a service provider to provide a service or facilitate the providing of the service.
  • The terms “service requestor terminal” and “passenger terminal” in the present disclosure are used interchangeably to refer to a mobile terminal that is used by a service requestor to request or order a service.
  • the positioning technology used in the present disclosure may be based on a global positioning system (GPS), a global navigation satellite system (GLONASS), a compass navigation system (COMPASS), a Galileo positioning system, a quasi-zenith satellite system (QZSS), a wireless fidelity (WiFi) positioning technology, or the like, or any combination thereof.
  • An aspect of the present disclosure relates to online systems and methods for data storage management.
  • a plurality of historical data values relating to a service may be obtained.
  • a category relating to the plurality of historical data values may be determined.
  • a prediction model associated with the category may be determined.
  • a plurality of predicted values relating to the service may be determined based on the prediction model.
  • a plurality of actual values corresponding to the plurality of predicted values may be obtained.
  • the plurality of actual values and the plurality of predicted values may be compared to generate a comparison result based on at least one filter. At least part of the plurality of actual values is determined as abnormal based on the comparison result.
  • the present disclosure employs the functions of the classifiers, the predictors and the comparators and the machine learning algorithm to generate an abnormality alarm system.
  • the system may obtain one or more parameters based on the offline historical service data values. Further, the one or more parameters may be applied to the online predictors and comparators to detect abnormalities in the real-time service data.
  • the present disclosure improves the capability of the abnormality alarm in the data storage management.
  • FIG. 1 is a block diagram illustrating an exemplary online to offline service system 100 according to some embodiments.
  • the online to offline service system 100 may be an online transportation service platform for transportation services.
  • the online to offline service system 100 may include a server 110, a network 120, a service requestor terminal 130, a service provider terminal 140, a vehicle 150, a storage device 160, and a navigation system 170.
  • the online to offline service system 100 may provide a plurality of services.
  • Exemplary services may include a taxi hailing service, a chauffeur service, an express car service, a carpool service, a bus service, a driver hire service, and a shuttle service.
  • the online to offline service may be any on-line service, such as booking a meal, shopping, or the like, or any combination thereof.
  • the server 110 may be a single server, or a server group.
  • the server group may be centralized, or distributed (e.g., the server 110 may be a distributed system).
  • the server 110 may be local or remote.
  • the server 110 may access information and/or data stored in the service requestor terminal 130, the service provider terminal 140, and/or the storage device 160 via the network 120.
  • the server 110 may be directly connected to the service requestor terminal 130, the service provider terminal 140, and/or the storage device 160 to access stored information and/or data.
  • the server 110 may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
  • the server 110 may be implemented on a computing device 1000 having one or more components illustrated in FIG. 10 in the present disclosure.
  • the server 110 may include a processing engine 112.
  • the processing engine 112 may process information and/or data related to the service request to perform one or more functions described in the present disclosure. For example, the processing engine 112 may determine that at least part of a plurality of actual values are abnormal.
  • the processing engine 112 may include one or more processing engines (e.g., single-core processing engine(s) or multi-core processor(s)).
  • the processing engine 112 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction-set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction-set computer (RISC), a microprocessor, or the like, or any combination thereof.
  • the network 120 may facilitate exchange of information and/or data.
  • one or more components in the online to offline service system 100 (e.g., the server 110, the service requestor terminal 130, the service provider terminal 140, the vehicle 150, the storage device 160, and the navigation system 170) may exchange information and/or data with one another via the network 120.
  • the server 110 may receive a service request from the service requestor terminal 130 via the network 120.
  • the network 120 may be any type of wired or wireless network, or combination thereof.
  • the network 120 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, the Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public switched telephone network (PSTN), a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof.
  • the network 120 may include one or more network access points.
  • the network 120 may include wired or wireless network access points such as base stations and/or internet exchange points 120-1, 120-2, ..., through which one or more components of the online to offline service system 100 may be connected to the network 120 to exchange data and/or information.
  • a passenger may be an owner of the service requestor terminal 130. In some embodiments, the owner of the service requestor terminal 130 may be someone other than the passenger. For example, an owner A of the service requestor terminal 130 may use the service requestor terminal 130 to send a service request for a passenger B, or receive a service confirmation and/or information or instructions from the server 110.
  • a service provider may be a user of the service provider terminal 140. In some embodiments, the user of the service provider terminal 140 may be someone other than the service provider. For example, a user C of the service provider terminal 140 may use the service provider terminal 140 to receive a service request for a service provider D, and/or information or instructions from the server 110.
  • “passenger” and “passenger terminal” may be used interchangeably, and “service provider” and “service provider terminal” may be used interchangeably.
  • the service provider terminal may be associated with one or more service providers (e.g., a night-shift service provider, or a day-shift service provider).
  • the service requestor terminal 130 may include a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, a built-in device in a vehicle 130-4, or the like, or any combination thereof.
  • the mobile device 130-1 may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof.
  • the smart home device may include a smart lighting device, a control device of an intelligent electrical apparatus, a smart monitoring device, a smart television, a smart video camera, an interphone, or the like, or any combination thereof.
  • the wearable device may include a smart bracelet, a smart footgear, a smart glass, a smart helmet, a smart watch, smart clothing, a smart backpack, a smart accessory, or the like, or any combination thereof.
  • the smart mobile device may include a smartphone, a personal digital assistant (PDA), a gaming device, a navigation device, a point of sale (POS) device, or the like, or any combination thereof.
  • the virtual reality device and/or the augmented reality device may include a virtual reality helmet, a virtual reality glass, a virtual reality patch, an augmented reality helmet, an augmented reality glass, an augmented reality patch, or the like, or any combination thereof.
  • the virtual reality device and/or the augmented reality device may include a Google™ Glass, an Oculus Rift, a HoloLens, a Gear VR, etc.
  • the built-in device in the vehicle 130-4 may include an onboard computer, an onboard television, etc.
  • the service requestor terminal 130 may be a device with positioning technology for locating the position of the passenger and/or the service requestor terminal 130.
  • the service provider terminal 140 may include a plurality of service provider terminals 140-1, 140-2, ..., 140-n. In some embodiments, the service provider terminal 140 may be similar to, or the same device as, the service requestor terminal 130. In some embodiments, the service provider terminal 140 may be customized to be able to implement the online to offline service. In some embodiments, the service provider terminal 140 may be a device with positioning technology for locating the service provider, the service provider terminal 140, and/or a vehicle 150 associated with the service provider terminal 140. In some embodiments, the service requestor terminal 130 and/or the service provider terminal 140 may communicate with another positioning device to determine the position of the passenger, the service requestor terminal 130, the service provider, and/or the service provider terminal 140.
  • the service requestor terminal 130 and/or the service provider terminal 140 may periodically send the positioning information to the server 110. In some embodiments, the service provider terminal 140 may also periodically send the availability status to the server 110. The availability status may indicate whether a vehicle 150 associated with the service provider terminal 140 is available to carry a passenger. For example, the service requestor terminal 130 and/or the service provider terminal 140 may send the positioning information and the availability status to the server 110 every thirty minutes. As another example, the service requestor terminal 130 and/or the service provider terminal 140 may send the positioning information and the availability status to the server 110 each time the user logs into the mobile application associated with the online to offline service.
  • the service provider terminal 140 may correspond to one or more vehicles 150.
  • the vehicles 150 may carry the passenger and travel to the destination.
  • the vehicles 150 may include a plurality of vehicles 150-1, 150-2, ..., 150-n.
  • One vehicle may correspond to one type of service (e.g., a taxi hailing service, a chauffeur service, an express car service, a carpool service, a bus service, a driver hire service, or a shuttle service).
  • the storage device 160 may store data and/or instructions. In some embodiments, the storage device 160 may store data obtained from the service requestor terminal 130 and/or the service provider terminal 140. In some embodiments, the storage device 160 may store data and/or instructions that the server 110 may execute or use to perform exemplary methods described in the present disclosure. In some embodiments, the storage device 160 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM), or the like, or any combination thereof. Exemplary mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc. Exemplary removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc.
  • Exemplary volatile read-and-write memory may include a random access memory (RAM).
  • RAM may include a dynamic RAM (DRAM), a double data rate synchronous dynamic RAM (DDR SDRAM), a static RAM (SRAM), a thyristor RAM (T-RAM), a zero-capacitor RAM (Z-RAM), etc.
  • Exemplary ROM may include a mask ROM (MROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), a compact disk ROM (CD-ROM), a digital versatile disk ROM, etc.
  • the storage device 160 may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
  • the storage device 160 may be connected to the network 120 to communicate with one or more components in the online to offline service system 100 (e.g., the server 110, the service requestor terminal 130, the service provider terminal 140, etc. ) .
  • One or more components in the online to offline service system 100 may access the data or instructions stored in the storage device 160 via the network 120.
  • the storage device 160 may be directly connected to or communicate with one or more components in the online to offline service system 100 (e.g., the server 110, the service requestor terminal 130, the service provider terminal 140, etc. ) .
  • the storage device 160 may be part of the server 110.
  • the navigation system 170 may determine information associated with an object, for example, one or more of the service requestor terminal 130, the service provider terminal 140, the vehicle 150, etc.
  • the navigation system 170 may be a global positioning system (GPS) , a global navigation satellite system (GLONASS) , a compass navigation system (COMPASS) , a BeiDou navigation satellite system, a Galileo positioning system, a quasi-zenith satellite system (QZSS) , etc.
  • the information may include a location, an elevation, a velocity, or an acceleration of the object, or a current time.
  • the navigation system 170 may include one or more satellites, for example, a satellite 170-1, a satellite 170-2, and a satellite 170-3.
  • the satellites 170-1 through 170-3 may determine the information mentioned above independently or jointly.
  • the satellite navigation system 170 may send the information mentioned above to the network 120, the service requestor terminal 130, the service provider terminal 140, or the vehicle 150 via wireless connections.
  • one or more components in the online to offline service system 100 may have permissions to access the storage device 160.
  • one or more components in the online to offline service system 100 may read and/or modify information related to the passenger, service provider, and/or the public when one or more conditions are met.
  • the server 110 may read and/or modify one or more passengers’ information after a service is completed.
  • the server 110 may read and/or modify one or more service providers’ information after a service is completed.
  • information exchanging of one or more components in the online to offline service system 100 may be initiated by way of requesting a service.
  • the object of the service request may be any product.
  • the product may include food, medicine, commodity, chemical product, electrical appliance, clothing, car, housing, luxury, or the like, or any combination thereof.
  • the product may include a servicing product, a financial product, a knowledge product, an internet product, or the like, or any combination thereof.
  • the internet product may include an individual host product, a web product, a mobile internet product, a commercial host product, an embedded product, or the like, or any combination thereof.
  • the mobile internet product may be used in a software of a mobile terminal, a program, a system, or the like, or any combination thereof.
  • the mobile terminal may include a tablet computer, a laptop computer, a mobile phone, a personal digital assistant (PDA), a smart watch, a point of sale (POS) device, an onboard computer, an onboard television, a wearable device, or the like, or any combination thereof.
  • the product may be any software and/or application used in the computer or mobile phone.
  • the software and/or application may relate to socializing, shopping, transporting, entertainment, learning, investment, or the like, or any combination thereof.
  • the software and/or application related to transporting may include a traveling software and/or application, a vehicle scheduling software and/or application, a mapping software and/or application, etc.
  • the vehicle may include a horse, a carriage, a rickshaw (e.g., a wheelbarrow, a bike, a tricycle, etc. ) , a car (e.g., a taxi, a bus, a private car, etc. ) , a train, a subway, a vessel, an aircraft (e.g., an airplane, a helicopter, a space shuttle, a rocket, a hot-air balloon, etc. ) , or the like, or any combination thereof.
  • FIG. 2 is a schematic diagram illustrating exemplary hardware and software components of a computing device 200 on which the server 110, the requestor terminal 130, and/or the provider terminal 140 may be implemented according to some embodiments of the present disclosure.
  • the processing engine 112 may be implemented on the computing device 200 and configured to perform functions of the processing engine 112 disclosed in this disclosure.
  • the computing device 200 may be a general-purpose computer or a special purpose computer; both may be used to implement an online to offline service system for the present disclosure.
  • the computing device 200 may be used to implement any component of the online to offline service as described herein.
  • the processing engine 112 may be implemented on the computing device 200, via its hardware, software program, firmware, or a combination thereof.
  • Although only one such computer is shown for convenience, the computer functions relating to the online to offline service as described herein may be implemented in a distributed fashion on a number of similar platforms to distribute the processing load.
  • the computing device 200 may include COM ports 250 connected to and from a network connected thereto to facilitate data communications.
  • the computing device 200 may also include a processor (e.g., the processor 220) , in the form of one or more processors, for executing program instructions.
  • the exemplary computing device may include an internal communication bus 210, program storage and data storage of different forms including, for example, a disk 270, and a read only memory (ROM) 230, or a random access memory (RAM) 240, for various data files to be processed and/or transmitted by the computing device.
  • the exemplary computing device may also include program instructions stored in the ROM 230, RAM 240, and/or other type of non-transitory storage medium to be executed by the processor 220.
  • the methods and/or processes of the present disclosure may be implemented as the program instructions.
  • the computing device 200 also includes an I/O component 260, supporting input/output between the computer and other components.
  • the computing device 200 may also receive programming and data via network communications.
  • Merely for illustration, only one CPU and/or processor is illustrated in FIG. 2. Multiple CPUs and/or processors are also contemplated; thus operations and/or method steps performed by one CPU and/or processor as described in the present disclosure may also be jointly or separately performed by the multiple CPUs and/or processors.
  • For example, if the CPU and/or processor of the computing device 200 is described as executing both step A and step B, it should be understood that step A and step B may also be performed by two different CPUs and/or processors jointly or separately in the computing device 200 (e.g., the first processor executes step A and the second processor executes step B, or the first and second processors jointly execute steps A and B).
  • FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of a mobile device 300 according to some embodiments of the present disclosure.
  • the mobile device 300 may include a communication module 310, a display 320, a graphics processing unit (GPU) 330, a processor 340, an I/O 350, a memory 360, and a storage 390.
  • any other suitable component including but not limited to a system bus or a controller (not shown) , may also be included in the mobile device 300.
  • a mobile operating system 370 (e.g., iOS™, Android™, Windows Phone™)
  • the applications 380 may include a browser or any other suitable apps for transmitting, receiving, and presenting information relating to the status of the vehicle 150 (e.g., the location of the vehicle 150) from the server 110.
  • User interactions with the information stream may be achieved via the I/O 350 and provided to the server 110 and/or other components of the online to offline service system 100 via the network 120.
  • FIG. 4 is a block diagram illustrating an exemplary processing engine 112 according to some embodiments of the present disclosure.
  • the processing engine 112 may include an obtaining module 402, a classification module 404, a prediction module 406, a comparison module 408, and a determination module 410. At least a portion of the processing engine 112 may be implemented on a computing device as illustrated in FIG. 2 or a mobile device as illustrated in FIG. 3.
  • the obtaining module 402 may be configured to obtain, via the network 120, a plurality of historical data values and a plurality of actual values relating to a service.
  • the service may be associated with a business line of the online to offline service system 100.
  • the business line may be any service provided through the online to offline service system 100 including but not limited to online taxi-hailing, online car rental, advertisement, Internet finance, or the like, or a combination thereof.
  • the obtaining module 402 may be also configured to obtain, via the network 120, a plurality of actual values relating to the service corresponding to a plurality of predicted values.
  • the plurality of predicted values may be determined by the prediction module 406.
  • the classification module 404 may be configured to determine a category relating to the plurality of historical data values.
  • the classification module 404 may extract a plurality of features from the plurality of historical data values.
  • the classification module 404 may classify the plurality of historical data values into the category based on values of the plurality of the features.
  • the prediction module 406 may be configured to determine a plurality of predicted values relating to the service.
  • the prediction module 406 may determine a plurality of predicted values based on a prediction model relating to the category determined by the classification module 404.
  • the plurality of predicted values may be associated with a plurality of time points in real time.
  • the comparison module 408 may be configured to compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result.
  • the at least one filter may include a dispersion filter, a threshold filter, and a false alarm filter.
  • the comparison module 408 may compare the plurality of actual values with the plurality of predicted values using each of the dispersion filter, the threshold filter, and the false alarm filter to generate the comparison result.
  • the comparison result may include at least one of a first comparison result, a second comparison result, and a third comparison result.
  • the comparison module 408 may determine the first comparison result, the second comparison result, and the third comparison result based on the dispersion filter, the threshold filter, and the false alarm filter, respectively.
  • the determination module 410 may be configured to determine that at least part of the plurality of actual values are abnormal based on the comparison result.
  • Each of the first comparison result, the second comparison result, and the third comparison result may include a determination that the at least part of the plurality of actual values are abnormal.
  • the determination module 410 may determine that the at least part of the plurality of actual values are abnormal based on one of or a combination of the first comparison result, the second comparison result, and the third comparison result.
  • the above description of the processing engine 112 is merely provided for the purpose of illustration, and not intended to limit the scope of the present disclosure.
  • the prediction module 406 and the determination module 410 may be integrated into one single module to perform their functions.
  • FIG. 5 is a flowchart illustrating an exemplary process 500 for determining that at least part of a plurality of actual values are abnormal based on a comparison result according to some embodiments of the present disclosure.
  • the processing engine 112 may perform the process 500 to determine that the at least part of the plurality of actual values are abnormal.
  • one or more operations of the process 500 illustrated in FIG. 5 for determining that the at least part of the plurality of actual values are abnormal may be implemented in the online to offline service system 100 illustrated in FIG. 1.
  • the process 500 illustrated in FIG. 5 may be stored in the storage device 160 in the form of instructions, and invoked and/or executed by the processing engine 112 (e.g., the processor 220 of the computing device 200 as illustrated in FIG. 2, the processor 340 of the mobile device 300 as illustrated in FIG. 3).
  • the processing engine 112 may obtain, via the network 120, a plurality of historical data values relating to a service.
  • the service may be associated with a business line of the online to offline service system 100.
  • the business line may be any service provided through the online to offline service system 100 including but not limited to online taxi-hailing, online car rental, advertisement, Internet finance, or the like, or a combination thereof.
  • the service may be described in the form of online taxi-hailing for illustration purposes, but it should not be interpreted to limit the service to the form of online taxi-hailing only.
  • the online taxi-hailing may be associated with a service requestor, a service provider, and a service request.
  • the service request may include a real-time request and/or an appointment request.
  • a real-time request may be a request that the requestor wishes to use a transportation service at the present moment or at a defined time reasonably close to the present moment for an ordinary person in the art.
  • a request may be a real-time request if the defined time is shorter than a threshold value, such as 1 minute, 5 minutes, 10 minutes or 20 minutes.
  • the appointment request may be a request in which the requestor wishes to use a transportation service at a defined time which is reasonably far from the present moment for the ordinary person in the art.
  • a request may be an appointment request if the defined time is longer than a threshold value, such as 20 minutes, 2 hours, or 1 day.
  • the processing engine 112 may define the real-time request or the appointment request based on a time threshold.
  • the time threshold may be a default setting of the system 100, or may be adjustable depending on different situations. For example, in a traffic peak period, the time threshold may be relatively small (e.g., 10 minutes), whereas in an idle period (e.g., 10:00-12:00 am), the time threshold may be relatively large (e.g., 1 hour).
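The threshold-based distinction between real-time and appointment requests can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names `classify_request` and `time_threshold` are hypothetical, and the 10-minute and 1-hour thresholds simply reuse the example values given above.

```python
from datetime import datetime, timedelta


def time_threshold(is_peak_period: bool) -> timedelta:
    """Pick a threshold; 10 minutes at peak and 1 hour when idle mirror the text's examples."""
    return timedelta(minutes=10) if is_peak_period else timedelta(hours=1)


def classify_request(requested_at: datetime, service_time: datetime,
                     threshold: timedelta) -> str:
    """Label a request by how far its defined service time lies from the request time."""
    lead_time = service_time - requested_at
    return "real-time" if lead_time <= threshold else "appointment"


now = datetime(2018, 6, 8, 8, 0)
pickup = now + timedelta(minutes=15)
peak_label = classify_request(now, pickup, time_threshold(is_peak_period=True))
idle_label = classify_request(now, pickup, time_threshold(is_peak_period=False))
```

With these assumed thresholds, the same 15-minute lead time is treated as an appointment request during a peak period and as a real-time request during an idle period.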
  • the service request may include a start location, an end location, a start time, a duration, or the like.
  • the start location may refer to the location where the service provider picks up the passenger.
  • the end location may refer to the location where the service provider drops off the passenger.
  • the start time may refer to a time that a passenger has been picked up, or a time that a service provider (e.g., a driver) has received or confirmed the service request.
  • the duration may be the time that a service provider takes to drive a passenger from the start location to the end location associated with a service request.
  • the plurality of historical data values may be associated with the service.
  • the plurality of historical data values may include a plurality of numbers of service requests, a plurality of durations of service requests, a plurality of start locations of service requests, or the like, or a combination thereof.
  • the plurality of historical data values may be associated with a plurality of time points (e.g., the start times of service requests) .
  • the historical data values may be described in the form of the plurality of numbers of service requests for illustration purposes, but it should not be interpreted to limit the historical data values to the form of the plurality of numbers of service requests only.
  • the plurality of historical data values may form a temporal sequence (hereinafter also referred to as “sequence”).
  • the sequence may be $(p_1, p_2, p_3, \ldots, p_{i-1}, p_i, \ldots, p_n)$.
  • each value may be associated with a time point (e.g., a start time of a service request).
  • the time points associated with the values $p_{i-1}$ and $p_i$ may be $t_{i-1}$ and $t_i$, respectively.
  • the time point $t_{i-1}$ may be earlier than the time point $t_i$.
  • the value $p_i$ may be the number of service requests at the time point $t_i$.
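As an illustration of how such a sequence might be assembled, the sketch below counts service requests per time point. The hourly bucketing and the name `request_count_sequence` are assumptions made for the example; the disclosure does not specify the granularity of the time points.

```python
from collections import Counter
from datetime import datetime


def request_count_sequence(start_times, time_points):
    """Return [p_1, ..., p_n], where p_i counts requests whose start time falls in hour t_i."""
    # Assumption: each time point t_i is the start of an hour.
    counts = Counter(ts.replace(minute=0, second=0, microsecond=0) for ts in start_times)
    return [counts.get(t, 0) for t in sorted(time_points)]


start_times = [
    datetime(2018, 6, 8, 8, 5),
    datetime(2018, 6, 8, 8, 40),
    datetime(2018, 6, 8, 9, 10),
]
time_points = [datetime(2018, 6, 8, h) for h in (8, 9, 10)]
sequence = request_count_sequence(start_times, time_points)
```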
  • the processing engine 112 may determine a category relating to the plurality of historical data values.
  • the processing engine 112 may analyze the sequence associated with the plurality of historical data values.
  • the processing engine 112 may then extract a plurality of features from the sequence. More description regarding the plurality of features may be found elsewhere in the present disclosure, for example, in FIG. 9 and the descriptions thereof.
  • the plurality of features may be associated with the service (e.g., the online taxi-hailing) corresponding to the plurality of historical data values.
  • the plurality of features may include years of development, volume of business, business flow, profit, or the like, or a combination thereof.
  • the processing engine 112 may determine values of the plurality of the features based on the plurality of historical data values.
  • the processing engine 112 may classify the plurality of historical data values into a category based on the values of the plurality of the features.
  • the category may indicate a characteristic relating to the service.
  • the category may be constructed based on Cartesian product of two sets.
  • a first set may include elements of periodicity, aperiodicity, or the like.
  • a second set may include elements of growth period, stable period, fading period, or the like.
  • the category may include growth period with periodicity, stable period with periodicity, fading period with periodicity, growth period with aperiodicity, stable period with aperiodicity, and fading period with aperiodicity, or the like.
  • the category may include one element of the first set or the second set. More description regarding the category may be found elsewhere in the present disclosure, for example, in FIG. 9 and the descriptions thereof.
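The construction of the categories as a Cartesian product of the two sets can be sketched as follows. The variable names are hypothetical; the element names are taken from the text.

```python
from itertools import product

# First set: whether the historical data values repeat with a period.
PERIODICITY = ("periodicity", "aperiodicity")
# Second set: the development stage of the service.
LIFECYCLE = ("growth period", "stable period", "fading period")

# Each category pairs one element of each set, e.g. "growth period with periodicity".
CATEGORIES = [f"{stage} with {kind}" for kind, stage in product(PERIODICITY, LIFECYCLE)]
```

The two sets yield 2 × 3 = 6 categories, in the same order as the list given above.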
  • the processing engine 112 may determine a classifier for classifying the plurality of historical data values into a category.
  • the processing engine 112 may determine a category associated with a plurality of historical data values via a third party.
  • the processing engine 112 may determine a training set based on values of features relating to the plurality of historical data values and the category.
  • the processing engine 112 may determine the classifier based on the training set using a model (e.g., a Gradient Boosting Decision Tree (GBDT) model).
  • the classifier may classify other historical data values (e.g., a plurality of sequences) into categories.
  • the classifying result may be added into the training set.
  • the classifier may be updated with time using new training sets.
  • the processing engine 112 may classify the plurality of historical data values into a category based on the newest classifier. In some embodiments, the processing engine 112 may classify the plurality of historical data values into more than one category based on the newest classifier.
  • the processing engine 112 may determine a plurality of predicted values relating to the service based on a prediction model relating to the category.
  • the plurality of predicted values may be associated with a plurality of first time points in real time.
  • the plurality of first time points may include $(t_1, t_2, t_3, \ldots, t_{j-1}, t_j, \ldots, t_m)$.
  • the plurality of predicted values may also form a sequence.
  • the processing engine 112 may use an algorithm (e.g., an exponential smoothing algorithm) to determine the plurality of predicted values.
  • the processing engine 112 may determine a statistical parameter relating to the plurality of historical data values using the algorithm.
  • the statistical parameter may be denoted by a residual function, a trend function and/or a seasonal function.
  • the processing engine 112 may determine a prediction model based on the statistical parameter.
  • the processing engine 112 may determine the plurality of predicted values based on the prediction model. More description regarding the determination of the plurality of predicted values may be found elsewhere in the present disclosure, for example, in FIG. 6 and the descriptions thereof.
  • the processing engine 112 may collect the historical data values at a time point periodically. In some embodiments, if the processing engine 112 determines a predicted value (e.g., the number of service requests) at a time point on the next Monday, the processing engine 112 may collect the numbers of service requests at the same time point on the Mondays of the past several weeks. The processing engine 112 may determine the predicted value at the time point on the next Monday based on the collected numbers of service requests.
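The periodic collection described above can be sketched as follows. The helper names are hypothetical, and averaging the collected values is an assumption chosen only to keep the example short; the disclosure itself describes a prediction model based on exponential smoothing.

```python
from datetime import datetime, timedelta
from statistics import mean


def same_weekday_history(history, target, weeks):
    """Collect values observed at the same weekday and clock time in each of the past `weeks` weeks."""
    samples = []
    for w in range(1, weeks + 1):
        t = target - timedelta(weeks=w)
        if t in history:
            samples.append(history[t])
    return samples


def predict_from_history(history, target, weeks=4):
    """Assumed stand-in predictor: the mean of the collected values, or None if none exist."""
    samples = same_weekday_history(history, target, weeks)
    return mean(samples) if samples else None


next_monday = datetime(2018, 6, 11, 9, 0)  # a Monday, 09:00
history = {
    next_monday - timedelta(weeks=1): 100,
    next_monday - timedelta(weeks=2): 110,
    next_monday - timedelta(weeks=3): 90,
}
predicted = predict_from_history(history, next_monday)
```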
  • the processing engine 112 may obtain, via the network 120, a plurality of actual values relating to the service corresponding to the plurality of predicted values.
  • the plurality of actual values may be associated with a plurality of second time points.
  • the plurality of second time points and the plurality of first time points may be in one-to-one correspondence.
  • the plurality of second time points and part of the plurality of first time points may be in one-to-one correspondence.
  • the plurality of actual values may refer to values (e.g., the number of service requests) relating to the service after a service (e.g., a service request) was completed.
  • the plurality of actual values may be stored in the storage device 160. Similar to the plurality of historical data values, the plurality of actual values may also form a sequence. In some embodiments, the plurality of actual values and the plurality of predicted values may be paired. For each of the plurality of predicted values, the processing engine 112 may obtain a corresponding actual value. Each predicted value and its corresponding actual value may be associated with a same time point.
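A minimal sketch of the pairing step is given below, assuming that predicted and actual values are held in mappings keyed by time point (a representation chosen for the example, not stated in the disclosure). Time points that have a predicted value but no actual value are skipped, which covers the case where the second time points correspond to only part of the first time points.

```python
def pair_by_time_point(predicted, actual):
    """Return (time_point, predicted, actual) for every time point present in both mappings."""
    return [(t, predicted[t], actual[t]) for t in sorted(predicted) if t in actual]


predicted = {1: 10, 2: 12, 3: 11}
actual = {1: 9, 3: 30}
pairs = pair_by_time_point(predicted, actual)
```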
  • the processing engine 112 may compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result.
  • the at least one filter may include a dispersion filter, a threshold filter, and a false alarm filter.
  • the processing engine 112 may compare the plurality of actual values with the plurality of predicted values using each of the dispersion filter, the threshold filter, and the false alarm filter to generate the comparison result.
  • the comparison result may include at least one of a first comparison result, a second comparison result, and a third comparison result.
  • the processing engine 112 may compare the plurality of actual values with the plurality of predicted values to generate the first comparison result based on the dispersion filter. More description regarding the first comparison result may be found elsewhere in the present disclosure, for example, in FIG. 7 and the descriptions thereof.
  • the processing engine 112 may compare the plurality of actual values with the plurality of predicted values to generate the second comparison result based on the threshold filter. More description regarding the second comparison result may be found elsewhere in the present disclosure, for example, in FIG. 8 and the descriptions thereof.
  • the processing engine 112 may compare the plurality of actual values with the plurality of predicted values to generate the third comparison result based on the false alarm filter.
  • the processing engine 112 may obtain a pre-labeled data set relating to service data.
  • the pre-labeled data set may include a plurality of false alarm results.
  • the plurality of false alarm results may refer to actual values determined by the online to offline service system 100 as abnormal, which were later corrected as normal by the third party.
  • the plurality of false alarm results may also refer to actual values determined by the online to offline service system 100 as normal, which were later corrected as abnormal by the third party.
  • the processing engine 112 may determine a model based on the pre-labeled data set.
  • the processing engine 112 may determine the model by training a classifying model based on the pre-labeled data set.
  • the classifying model may include a GBDT model, a random forest model, or the like.
  • the processing engine 112 may use the plurality of actual values and/or the plurality of predicted values as input of the model.
  • the processing engine 112 may then obtain the third comparison result based on the model.
  • the processing engine 112 may determine that at least part of the plurality of actual values are abnormal based on the comparison result.
  • the comparison result may include at least one of the first comparison result, the second comparison result, and the third comparison result.
  • Each of the first comparison result, the second comparison result, and the third comparison result may include a determination that the at least part of the plurality of actual values are abnormal.
  • the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal based on the comparison result. For example, the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal if one of the first comparison result, the second comparison result, and the third comparison result includes a determination that the at least part of the plurality of actual values are abnormal. As another example, the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal if two of the first comparison result, the second comparison result, and the third comparison result both include determinations that the at least part of the plurality of actual values are abnormal.
  • the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal if all of the first comparison result, the second comparison result, and the third comparison result include determinations that the at least part of the plurality of actual values are abnormal.
  • the processing engine 112 may omit the at least part of the plurality of actual values from the plurality of actual values.
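The three combination rules above (one, two, or all of the comparison results indicating an abnormality) can be expressed as a vote count, as in the sketch below. Representing each comparison result as a list of per-value boolean flags, and the names `combine_filter_results` and `omit_abnormal`, are assumptions made for illustration.

```python
def combine_filter_results(dispersion, threshold, false_alarm, min_votes=1):
    """Return indices of actual values flagged abnormal by at least `min_votes` of the three filters."""
    if not 1 <= min_votes <= 3:
        raise ValueError("min_votes must be 1, 2, or 3")
    return [
        i
        for i, votes in enumerate(zip(dispersion, threshold, false_alarm))
        if sum(votes) >= min_votes
    ]


def omit_abnormal(actual_values, abnormal_indices):
    """Drop the actual values that were determined to be abnormal."""
    flagged = set(abnormal_indices)
    return [v for i, v in enumerate(actual_values) if i not in flagged]


# Hypothetical per-value flags from the first, second, and third comparison results.
first = [False, True, True]
second = [False, False, True]
third = [False, True, True]

any_filter = combine_filter_results(first, second, third, min_votes=1)
all_filters = combine_filter_results(first, second, third, min_votes=3)
cleaned = omit_abnormal([5, 50, 500], all_filters)
```

A lower `min_votes` favors sensitivity (more values flagged), while requiring all three filters to agree favors fewer false alarms.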
  • FIG. 6 is a flowchart illustrating an exemplary process 600 for determining a plurality of predicted values according to some embodiments of the present disclosure.
  • the processing engine 112 may perform the process 600 to determine the plurality of predicted values.
  • one or more operations of the process 600 illustrated in FIG. 6 for determining the plurality of predicted values may be implemented in the online to offline service system 100 illustrated in FIG. 1.
  • the process 600 illustrated in FIG. 6 may be stored in the storage device 160 in the form of instructions, and invoked and/or executed by the processing engine 112 (e.g., the processor 220 of the computing device 200 as illustrated in FIG. 2, the processor 340 of the mobile device 300 as illustrated in FIG. 3).
  • the processing engine 112 may determine that the category is associated with periodicity.
  • the category may be one of the growth period with periodicity, the stable period with periodicity, and the fading period with periodicity.
  • the category associated with periodicity may indicate that the plurality of historical data values are periodical.
  • the processing engine 112 may determine a statistical parameter relating to the plurality of historical data values based on the category.
  • the processing engine 112 may analyze the plurality of historical data values using a time series method.
  • the time series method may include a moving average model, an autoregressive model, an autoregressive moving average model, an exponential smoothing model, or the like, or a combination thereof.
  • the exponential smoothing model may include a basic exponential smoothing model, a double exponential smoothing model, a triple exponential smoothing model, or the like, or a combination thereof.
  • the processing engine 112 may determine the statistical parameter based on the exponential smoothing model (e.g., the triple exponential smoothing model) .
  • the statistical parameter may include a residual function, a trend function, and a seasonal function.
  • the residual function, the trend function, and the seasonal function may all be time functions.
  • the processing engine 112 may generate a prediction model based on the statistical parameter.
  • the processing engine 112 may determine the prediction model based on the residual function, the trend function, and the seasonal function.
  • the prediction model may be expressed by Equation (1) :
  • a (t) may refer to the residual function
  • b (t) may refer to the trend function
  • s (t) may refer to the seasonal function
  • p_{t+h} may refer to the predicted value
  • t may refer to a current time point
  • h may refer to a time interval from the current time point t to a time point associated with the predicted value p_{t+h}
  • k may refer to the period associated with the plurality of historical data values
  • the “mod” may refer to a modulo operation.
  • the processing engine 112 may determine the plurality of predicted values based on the prediction model.
  • the processing engine 112 may obtain time points associated with the plurality of predicted values.
  • the processing engine 112 may determine the plurality of predicted values based on the time points associated with the plurality of predicted values using the Equation (1) .
  • FIG. 7 is a flowchart illustrating an exemplary process 700 for determining that the at least part of the plurality of actual values are abnormal according to some embodiments of the present disclosure.
  • the processing engine 112 may perform the process 700 to determine that the at least part of the plurality of actual values are abnormal.
  • the processing engine 112 may determine the first comparison result based on the dispersion filter by performing the process 700.
  • one or more operations of the process 700 illustrated in FIG. 7 for determining that the at least part of the plurality of actual values are abnormal may be implemented in the online to offline service system 100 illustrated in FIG. 1.
  • the process 700 illustrated in FIG. 7 may be stored in the storage device 160 in the form of instructions, and invoked and/or executed by the processing engine 112 (e.g., the processor 210 of the computing device 200 as illustrated in FIG. 2, the CPU 340 of the mobile device 300 as illustrated in FIG. 3).
  • the processing engine 112 may determine a statistical value based on both the plurality of predicted values and the plurality of actual values.
  • the processing engine 112 may determine the plurality of predicted values and the plurality of actual values each as a sample sequence.
  • the processing engine 112 may perform a paired t-test on the two sample sequences.
  • the processing engine 112 may then determine the statistical value based on the paired t-test.
  • the statistical value may be associated with dispersion degrees of both the plurality of predicted values and the plurality of actual values.
  • the processing engine 112 may compare the statistical value with a first threshold to generate the first comparison result.
  • the first threshold may be a predetermined value set in the system.
  • the first threshold may also be adjusted based on real-time conditions. In some embodiments, the first threshold may be any value including 0.5, 0.7, 1, etc.
  • the processing engine 112 may compare the statistical value with the first threshold to determine whether the statistical value is greater than the first threshold.
  • the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal in response to the first comparison result that the statistical value is greater than the first threshold. If the processing engine 112 determines that the statistical value is less than the first threshold, the processing engine 112 may determine that the plurality of actual values are normal. If the processing engine 112 determines that the statistical value is greater than the first threshold, the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal. The first comparison result may indicate that the at least part of the plurality of actual values are abnormal. The processing engine 112 may omit the at least part of the plurality of actual values from the plurality of actual values.
  • FIG. 8 is a flowchart illustrating an exemplary process 800 for determining that the at least part of the plurality of actual values are abnormal according to some embodiments of the present disclosure.
  • the processing engine 112 may perform the process 800 to determine that the at least part of the plurality of actual values are abnormal.
  • the processing engine 112 may determine the second comparison result by performing the process 800.
  • one or more operations of the process 800 illustrated in FIG. 8 for determining that the at least part of the plurality of actual values are abnormal may be implemented in the online to offline service system 100 illustrated in FIG. 1.
  • the process 800 illustrated in FIG. 8 may be stored in the storage device 160 in the form of instructions, and invoked and/or executed by the processing engine 112 (e.g., the processor 210 of the computing device 200 as illustrated in FIG. 2, the CPU 340 of the mobile device 300 as illustrated in FIG. 3).
  • the processing engine 112 may determine a plurality of differences between the plurality of predicted values and the plurality of actual values using the threshold filter. For each difference, the predicted value and the corresponding actual value may be associated with a same time point. For each of at least part of the plurality of predicted values, the processing engine 112 may determine a difference based on the predicted value and the corresponding actual value.
  • the processing engine 112 may determine a plurality of second thresholds based on a time function.
  • the processing engine 112 may determine the time function based on the category into which the plurality of historical data values as a sequence is classified.
  • the processing engine 112 may determine the plurality of second thresholds based on the time function. For each of the at least part of the plurality of first time points and/or the plurality of second time points, the processing engine 112 may determine a second threshold based on the time point and the time function.
  • the processing engine 112 may determine the plurality of second thresholds accordingly.
  • the processing engine 112 may compare each of the plurality of differences with a corresponding second threshold. For a time point, the processing engine 112 may determine a difference and a corresponding second threshold. For each of at least part of the first time points and/or the second time points, the processing engine 112 may determine if the difference associated with the time point is greater than a corresponding second threshold.
  • the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal in response to the comparison result that each of the plurality of differences is greater than a corresponding second threshold. If the processing engine 112 determines that part of the plurality of differences are less than corresponding second threshold (s) , respectively, the processing engine 112 may determine that the plurality of actual values are normal. If the processing engine 112 determines that each of the plurality of differences is greater than a corresponding second threshold, the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal. The second comparison result may indicate that the at least part of the plurality of actual values are abnormal. The processing engine 112 may omit the at least part of the plurality of actual values from the plurality of actual values.
  • FIG. 9 is a table 900 associated with a plurality of business lines according to some embodiments of the present disclosure.
  • Four business line IDs 902 may be shown in the table 900.
  • the processing engine 112 may obtain a plurality of historical data values.
  • the processing engine 112 may analyze the plurality of historical data values and extract four features from the plurality of historical data values.
  • the four features may include a first feature 904 (e.g., year of development) , a second feature 906 (e.g., volume of business) , a third feature 908 (e.g., business flow) , a fourth feature 910 (e.g., profit) .
  • the processing engine 112 may also determine values of the first feature 904, the second feature 906, the third feature 908 and the fourth feature 910.
  • the processing engine 112 may use the business line ID 902, values of the four features as input of the classifier.
  • the processing engine 112 may determine a category into which the business line ID is classified based on the classifier. Accordingly, the plurality of historical data values associated with the business line ID may be classified into the category.
  • Four categories (a first category 912, a second category 914, a third category 916, a fourth category 918) are shown in the table 900.
  • the four categories each may be associated with the Cartesian product of the two sets as described in connection with the step 504.
  • a value of the category into which a business line ID is classified may be set as 1, and values of other categories that a business line ID is not classified may be set as 0.
  • the business line ID 1 is classified into the first category.
  • the business line ID 2 is classified into the second category.
  • the business line ID 3 is classified into the third category.
  • the business line ID 4 is classified into the fourth category.
  • the processing engine 112 may use the values of the four features and the category into which the business line ID is classified to update the training set associated with the classifier.
  • the classifier may be updated based on the updated training set associated with the classifier.
  • each business line ID may be any other value instead of 4.
  • the value of category into which a business line ID is classified may be any other value instead of 0.
  • the number of categories shown in the table 900 may be any other value instead of 4.
  • aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in an implementation combining software and hardware, all of which may generally be referred to herein as a "block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer-readable media having computer-readable program code embodied thereon.
  • a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electromagnetic, optical, or the like, or any suitable combination thereof.
  • a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
  • Computer program code for carrying out steps for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python or the like, conventional procedural programming languages, such as the "C" programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN) , or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a software as a service (SaaS) .
  • an Internet Service Provider for example, AT&T, MCI, Sprint, EarthLink, MSN, etc.

Abstract

A system includes a storage device storing a set of instructions and at least one processor in communication with the storage device. When executing the instructions, the at least one processor is configured to cause the system to obtain a plurality of historical data values and determine a category relating to the plurality of historical data values. The at least one processor may also cause the system to determine a plurality of predicted values based on the category and obtain a plurality of actual values relating to the service corresponding to the plurality of predicted values. The at least one processor may further cause the system to compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result and determine that at least part of the plurality of actual values are abnormal based on the comparison result.

Description

SYSTEMS AND METHODS FOR ABNORMALITY DETECTION IN DATA STORAGE
TECHNICAL FIELD
The present disclosure generally relates to systems and methods for data storage management, and in particular, to systems and methods for detecting abnormality in data storage.
BACKGROUND
With the vigorous development of various service lines of the online to offline service system, the volume of service data may experience explosive growth. A data warehouse may be used to store the service data. Anomaly detection may aim to find, from the service data, a set of data that differs from the expected data.
As the service data may reflect the business situation within a certain period of time, the authenticity of the service data in the data warehouse must be ensured, and each abnormal fluctuation of the service data may need to trigger a prompt alert. Current technology often relies on the experience of a database administrator or on continuous iterative amendments of a database management system, causing a delayed response to the abnormal fluctuation. A method and system to improve the abnormality detection may be desired.
SUMMARY
According to an aspect of the present disclosure, a system may include a storage device storing a set of instructions and one or more processors in communication with the storage device. When executing the instructions, the one or more processors may be configured to cause the system to obtain, via a network, a plurality of historical data values relating to a service and determine a category relating to the plurality of historical data values. The one or more processors may also cause the system to determine a plurality of predicted values relating to the service based on a prediction model relating to the category and obtain, via a network, a plurality of actual values relating to the service corresponding to the plurality of predicted values. In some embodiments, each of the plurality of predicted values may correspond to a time point. The one or more processors may further cause the system to compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result and determine that at least part of the plurality of actual values are abnormal based on the comparison result.
In some embodiments, the plurality of historical data values may form a temporal sequence.
In some embodiments, the one or more processors may further cause the system to determine a plurality of feature values relating to the plurality of historical data values and determine the category relating to the plurality of historical data values based on the plurality of feature values.
In some embodiments, the category may indicate a characteristic relating to the service. The category may include one of growth period with periodicity, stable period with periodicity, fading period with periodicity, growth period with aperiodicity, stable period with aperiodicity, or fading period with aperiodicity.
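As a non-limiting illustration, the six categories listed above can be enumerated as the Cartesian product of a set of development stages and a set of periodicity labels, and the assigned category can be marked with 1 and all other categories with 0, as in the table of FIG. 9. The following Python sketch is an assumption made for illustration only; the names `STAGES`, `PERIODICITY`, `CATEGORIES`, `one_hot`, and `is_periodic` are hypothetical and do not appear in the disclosure.

```python
from itertools import product

# Two sets whose Cartesian product yields the six categories (hypothetical names).
STAGES = ("growth period", "stable period", "fading period")
PERIODICITY = ("periodicity", "aperiodicity")

# e.g. "growth period with periodicity", "growth period with aperiodicity", ...
CATEGORIES = [f"{stage} with {label}" for stage, label in product(STAGES, PERIODICITY)]


def one_hot(category):
    """Return a 0/1 list with a 1 at the position of the assigned category."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return [1 if c == category else 0 for c in CATEGORIES]


def is_periodic(category):
    """True for the three categories associated with periodicity."""
    return category.endswith("with periodicity")
```

A check such as `is_periodic` could decide whether a seasonal prediction model is applicable to the historical data values of a given service.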
In some embodiments, the one or more processors may also cause the system to determine that the category indicating the characteristic relating to the service is associated with periodicity and determine a residual function, a trend function, and a seasonal function relating to the plurality of historical data values based on the category associated with periodicity. The one or more processors may further cause the system to generate the prediction model based on the residual function, the trend function and the seasonal function and determine the plurality of predicted values based on the prediction model.
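As a non-limiting illustration, a prediction model built from a residual (level) term a, a trend term b, and seasonal terms s with period k can be sketched with the standard additive triple exponential smoothing (Holt-Winters) recurrences. Under that assumption, the forecast h steps ahead of the current time point t takes the form p_{t+h} = a(t) + h·b(t) + s(t − k + 1 + ((h − 1) mod k)). The smoothing factors `alpha`, `beta`, and `gamma`, the initialization from the first two periods, and the function names below are assumptions chosen for this sketch and are not specified in the disclosure.

```python
def fit_additive_holt_winters(values, k, alpha=0.5, beta=0.1, gamma=0.1):
    """Fit level a, trend b, and k seasonal terms s to a periodic sequence.

    Assumed initialization: level = mean of the first period, trend = change
    of the period mean per time step, seasonal = deviation from the first mean.
    """
    if k < 1 or len(values) < 2 * k:
        raise ValueError("need at least two full periods of historical data")
    first_mean = sum(values[:k]) / k
    second_mean = sum(values[k:2 * k]) / k
    a = first_mean
    b = (second_mean - first_mean) / k
    s = [values[i] - first_mean for i in range(k)]  # s[i]: term for phase i
    for t in range(k, len(values)):
        y, phase, prev_a = values[t], t % k, a
        a = alpha * (y - s[phase]) + (1 - alpha) * (prev_a + b)
        b = beta * (a - prev_a) + (1 - beta) * b
        s[phase] = gamma * (y - a) + (1 - gamma) * s[phase]
    return a, b, s


def forecast(a, b, s, n, h):
    """Predict the value h >= 1 steps after the last of n observed values.

    The seasonal index (n - 1 + h) % k selects the same phase as
    t - k + 1 + ((h - 1) mod k) with t = n - 1.
    """
    k = len(s)
    return a + h * b + s[(n - 1 + h) % k]
```

The predicted values for a set of future time points could then be obtained by calling `forecast` once per time point with the corresponding interval h.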
In some embodiments, the one or more processors may further cause the system to obtain time points relating to at least part of the plurality of predicted values and obtain the plurality of actual values based on the time points relating to the at least part of the plurality of predicted values.
In some embodiments, the at least one filter may include a dispersion filter. The one or more processors may also cause the system to determine a statistical value based on the plurality of predicted values and the plurality of actual values using the dispersion filter and compare the statistical value with a first threshold. In some embodiments, the statistical value may be associated with dispersion degrees of both the plurality of predicted values and the plurality of actual values. The one or more processors may further cause the system to determine that the at least part of the plurality of actual values are abnormal in response to the comparison result that the statistical value is greater than the first threshold.
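As a non-limiting illustration, the dispersion filter can be sketched by treating the predicted values and the actual values as two paired sample sequences and computing a paired t statistic from their differences. Using the absolute value of the t statistic as the statistical value, the default threshold of 1.0, and the function names below are assumptions made for this sketch; the disclosure does not fix these details.

```python
import math
import statistics


def paired_t_statistic(predicted, actual):
    """Paired t statistic of the differences (actual - predicted)."""
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    diffs = [a - p for p, a in zip(predicted, actual)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        # All differences are identical: no spread to normalize by.
        return 0.0 if mean_diff == 0 else math.copysign(math.inf, mean_diff)
    return mean_diff / (sd / math.sqrt(len(diffs)))


def dispersion_filter(predicted, actual, first_threshold=1.0):
    """Return True when the statistical value exceeds the first threshold."""
    return abs(paired_t_statistic(predicted, actual)) > first_threshold
```

A return value of True corresponds to a first comparison result indicating that at least part of the actual values are abnormal.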
In some embodiments, the at least one filter may include a threshold filter. The one or more processors may also cause the system to determine a plurality of differences between the plurality of predicted values and the plurality of actual values using the threshold filter and determine a plurality of second thresholds based on a time function. The one or more processors may further cause the system to compare each of the plurality of differences with a corresponding second threshold and determine that the at least part of the plurality of actual values are abnormal in response to the comparison result that each of the plurality of differences is greater than a corresponding second threshold. In some embodiments, each of the plurality of differences and the corresponding second threshold may be associated with a same time point.
In some embodiments, the at least one filter may include a false alarm filter. The one or more processors may further cause the system to determine a false alarm model based on a pre-labeled data set relating to service data and determine that the at least part of the plurality of actual values are abnormal based on the false alarm model. In some embodiments, the pre-labeled data set may include a plurality of false alarm results generated by the system.
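As a non-limiting illustration, the threshold filter can be sketched as a per-time-point comparison between the difference of an actual value and its predicted value and a second threshold produced by a time function. Taking the absolute difference, passing the time function as a callable `threshold_at`, and the function name `threshold_filter` are assumptions made for this sketch only.

```python
def threshold_filter(predicted, actual, time_points, threshold_at):
    """Compare each difference with the second threshold of its time point.

    Returns (exceeds, abnormal): `exceeds` maps each time point to whether its
    difference is greater than the corresponding second threshold, and
    `abnormal` is True only when every difference exceeds its threshold.
    """
    if not (len(predicted) == len(actual) == len(time_points)):
        raise ValueError("sequences must have the same length")
    exceeds = {}
    for t, p, a in zip(time_points, predicted, actual):
        # Difference and threshold are associated with the same time point t.
        exceeds[t] = abs(a - p) > threshold_at(t)
    abnormal = bool(exceeds) and all(exceeds.values())
    return exceeds, abnormal
```

For example, a hypothetical time function `lambda t: 5 + t` would yield thresholds that grow with the time point; the actual time function may depend on the category into which the historical data values are classified.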
In some embodiments, the one or more processors may also cause the system to compare the plurality of actual values with the plurality of predicted values using a dispersion filter, a threshold filter, and a false alarm filter to generate a first comparison result, a second comparison result, and a third comparison result, respectively. The one or more processors may further cause the system to determine that at least part of the plurality of actual values are abnormal based on the first comparison result, the second comparison result, and the third comparison result.
According to another aspect of the present disclosure, a computer-implemented method may include one or more of the following operations performed by one or more processors. The method may include obtaining, via a network, a plurality of historical data values relating to a service and determining a category relating to the plurality of historical data values. The method may also include determining a plurality of predicted values relating to the service based on a prediction model relating to the category and obtaining, via a network, a plurality of actual values relating to the service corresponding to the plurality of predicted values. In some embodiments, each of the plurality of predicted values may correspond to a time point. The method may further include comparing the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result and determining that at least part of the plurality of actual values are abnormal based on the comparison result.
In some embodiments, the method may further include determining a plurality of feature values relating to the plurality of historical data values and determining the category relating to the plurality of historical data values based on the plurality of feature values.
In some embodiments, the method may also include determining that the category indicating the characteristic relating to the service is associated with periodicity and determining a residual function, a trend function and a seasonal function relating to the plurality of historical data values based on the category associated with periodicity. The method may further include generating the prediction model based on the residual function, the trend function, and the seasonal function and determining the plurality of predicted values based on the prediction model.
In some embodiments, the method may further include obtaining time points relating to at least part of the plurality of predicted values and obtaining the plurality of actual values based on the time points relating to the at least part of the plurality of predicted values.
In some embodiments, the at least one filter may include a dispersion filter. The method may also include determining a statistical value based on the plurality of predicted values and the plurality of actual values using the dispersion filter and comparing the statistical value with a first threshold. In some embodiments, the statistical value may be associated with dispersion degrees of both the plurality of predicted values and the plurality of actual values. The method may further include determining that the at least part of the plurality of actual values are abnormal in response to the comparison result that the statistical value is greater than the first threshold.
In some embodiments, the at least one filter may include a threshold filter. The method may also include determining a plurality of differences between the plurality of predicted values and the plurality of actual values using the threshold filter and determining a plurality of second thresholds based on a time function. The method may further include comparing each of the plurality of differences with a corresponding second threshold and determining that the at least part of the plurality of actual values are abnormal in response to the comparison result that each of the plurality of differences is greater than a corresponding second threshold. In some embodiments, each of the plurality of differences and the corresponding second threshold may be associated with a same time point.
In some embodiments, the at least one filter may include a false alarm filter. The method may also include determining a false alarm model based on a pre-labeled data set relating to service data and determining that the at least part of the plurality of actual values are abnormal based on the false alarm model. In some embodiments, the pre-labeled data set may include a plurality of false alarm results generated by the system.
In some embodiments, the method may also include comparing the plurality of actual values with the plurality of predicted values using a dispersion filter, a threshold filter, and a false alarm filter to generate a first comparison result, a second comparison result, and a third comparison result, respectively. The method may further include determining that at least part of the plurality of actual values are abnormal based on the first comparison result, the second comparison result, and the third comparison result.
According to yet another aspect of the present disclosure, a non-transitory computer-readable medium may store instructions. When executed by one or more processors of a system, the instructions may cause the system to obtain, via a network, a plurality of historical data values relating to a service and determine a category relating to the plurality of historical data values. The instructions may also cause the system to determine a plurality of predicted values relating to the service based on a prediction model relating to the category and obtain, via a network, a plurality of actual values relating to the service corresponding to the plurality of predicted values. In some embodiments, each of the plurality of predicted values may correspond to a time point. The instructions may further cause the system to compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result and determine that at least part of the plurality of actual values are abnormal based on the comparison result.
Additional features will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The features of the present disclosure may be realized and attained by practice or use of various aspects of the methodologies, instrumentalities and combinations set forth in the detailed examples discussed below.
BRIEF DESCRIPTION OF THE DRAWINGS
The present disclosure is further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. These embodiments are non-limiting exemplary embodiments, in which like reference numerals represent similar structures throughout the several views of the drawings, and wherein:
FIG. 1 is a block diagram illustrating an exemplary online to offline service system according to some embodiments;
FIG. 2 is a schematic diagram illustrating exemplary hardware and software components of a computing device according to some embodiments;
FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of a mobile device according to some embodiments of the present disclosure;
FIG. 4 is a block diagram illustrating an exemplary processing engine according to some embodiments of the present disclosure;
FIG. 5 is a flowchart illustrating an exemplary process for determining that at least part of a plurality of actual values are abnormal based on a comparison result according to some embodiments of the present disclosure;
FIG. 6 is a flowchart illustrating an exemplary process for determining a plurality of predicted values according to some embodiments of the present disclosure;
FIG. 7 is a flowchart illustrating an exemplary process for determining that the at least part of the plurality of actual values are abnormal according to some embodiments of the present disclosure;
FIG. 8 is a flowchart illustrating an exemplary process for determining that the at least part of the plurality of actual values are abnormal according to some embodiments of the present disclosure; and
FIG. 9 is a table associated with a plurality of business lines according to some embodiments of the present disclosure.
DETAILED DESCRIPTION
The following description is presented to enable any person skilled in the art to make and use the present disclosure and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments shown but is to be accorded the widest scope consistent with the claims.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms "a," "an," and "the" may be intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprise," "comprises," and/or "comprising," "include," "includes," and/or "including," when used in this specification, specify the presence of stated features, integers, steps, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, elements, components, and/or groups thereof.
These and other features and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, may become more apparent upon consideration of the following description with reference to the accompanying drawings, all of which form a part of this disclosure. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended to limit the scope of the present disclosure. It is understood that the drawings are not to scale.
The flowcharts used in the present disclosure illustrate steps that systems implement according to some embodiments described in the present disclosure. It is to be expressly understood that the steps of the flowcharts need not be implemented in the order shown. Instead, the steps may be implemented in inverted order or simultaneously. Moreover, one or more other steps may be added to the flowcharts, and one or more steps may be removed from the flowcharts.
Moreover, while the system and method in the present disclosure are described primarily with regard to distributing a request for a transportation service, it should also be understood that the present disclosure is not intended to be limiting. The system or method of the present disclosure may be applied to any other kind of online to offline service. For example, the system or method of the present disclosure may be applied to transportation systems of different environments including land, ocean, aerospace, or the like, or any combination thereof. The vehicle of the transportation systems may include a taxi, a private car, a hitch, a bus, a train, a bullet train, a high-speed rail, a subway, a vessel, an aircraft, a spaceship, a hot-air balloon, a driverless vehicle, or the like, or any combination thereof. The transportation system  may also include any transportation system for management and/or distribution, for example, a system for transmitting and/or receiving an express. The application of the system or method of the present disclosure may be implemented on a user device and include a web page, a plug-in of a browser, a client terminal, a custom system, an internal analysis system, an artificial intelligence robot, or the like, or any combination thereof.
The terms "passenger," "requestor," "service requestor," and "customer" in the present disclosure are used interchangeably to refer to an individual, an entity, or a tool that may request or order a service. Also, the terms "driver," "provider," and "service provider" in the present disclosure are used interchangeably to refer to an individual, an entity, or a tool that may provide a service or facilitate the providing of the service.
The terms "service request," "request for a service," "requests," and "order" in the present disclosure are used interchangeably to refer to a request that may be initiated by a passenger, a service requestor, a customer, a driver, a provider, a service provider, or the like, or any combination thereof. The service request may be accepted by any one of a passenger, a service requestor, a customer, a driver, a provider, or a service provider. The service request may be chargeable or free.
The terms "service provider terminal" and "driver terminal" in the present disclosure are used interchangeably to refer to a mobile terminal that is used by a service provider to provide a service or facilitate the providing of the service. The terms "service requestor terminal" and "passenger terminal" in the present disclosure are used interchangeably to refer to a mobile terminal that is used by a service requestor to request or order a service.
The positioning technology used in the present disclosure may be based on a global positioning system (GPS), a global navigation satellite system (GLONASS), a compass navigation system (COMPASS), a Galileo positioning system, a quasi-zenith satellite system (QZSS), a wireless fidelity (WiFi) positioning technology, or the like, or any combination thereof. One or more of the above positioning systems may be used interchangeably in the present disclosure.
An aspect of the present disclosure relates to online systems and methods for data storage management. A plurality of historical data values relating to a service may be obtained. A category relating to the plurality of historical data values may be determined. A prediction model associated with the category may be determined. A plurality of predicted values relating to the service may be determined based on the prediction model. A plurality of actual values corresponding to the plurality of predicted values may be obtained. The plurality of actual values and the plurality of predicted values may be compared to generate a comparison result based on at least one filter. At least part of the plurality of actual values is determined as abnormal based on the comparison result. The present disclosure employs the functions of the classifiers, the predictors and the comparators and the machine learning algorithm to generate an abnormality alarm system. Based on the category of the service data, the system may obtain one or more parameters based on the offline historical service data values. Further, the one or more parameters may be applied to the online predictors and comparators to detect abnormalities in the real-time service data. The present disclosure improves the capability of the abnormality alarm in the data storage management.
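Merely for illustration, the sequence of operations summarized above (determine a category, select the prediction model associated with the category, generate predicted values, and compare them with actual values) may be sketched as follows. The function names, the mean-based classifier and predictor, and the 30% tolerance are hypothetical stand-ins chosen for the sketch and are not taken from the present disclosure.

```python
from statistics import mean
from typing import Callable, Sequence


def detect_abnormal(
    historical: Sequence[float],
    actual: Sequence[float],
    classify: Callable[[Sequence[float]], str],
    models: dict[str, Callable[[Sequence[float], int], list[float]]],
    is_abnormal: Callable[[float, float], bool],
) -> list[int]:
    """Return the indices of the actual values that are flagged as abnormal."""
    category = classify(historical)               # category of the historical values
    predict = models[category]                    # prediction model for that category
    predicted = predict(historical, len(actual))  # one predicted value per actual value
    return [
        i for i, (a, p) in enumerate(zip(actual, predicted)) if is_abnormal(a, p)
    ]


# Hypothetical classifier: split series by their average level.
def classify_by_level(values: Sequence[float]) -> str:
    return "high_volume" if mean(values) >= 100 else "low_volume"


# Hypothetical prediction model: repeat the historical mean.
def mean_forecast(values: Sequence[float], horizon: int) -> list[float]:
    return [mean(values)] * horizon


# Hypothetical comparison: flag a value that deviates by more than 30%.
def relative_gap(actual: float, predicted: float, tolerance: float = 0.3) -> bool:
    return abs(actual - predicted) > tolerance * max(abs(predicted), 1.0)


models = {"high_volume": mean_forecast, "low_volume": mean_forecast}
flagged = detect_abnormal(
    [100, 110, 90, 100], [105, 40, 98], classify_by_level, models, relative_gap
)
print(flagged)  # index 1 (the value 40) is flagged
```

In this sketch the classifier, the per-category models, and the comparison are passed in as parameters, so that parameters obtained offline could be swapped into the online path without changing the control flow.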
FIG. 1 is a block diagram illustrating an exemplary online to offline service system 100 according to some embodiments. For example, the online to offline service system 100 may be an online transportation service platform for transportation services. The online to offline service system 100 may include a server 110, a network 120, a service requestor terminal 130, a service provider terminal 140, a vehicle 150, a storage device 160, and a navigation system 170.
The online to offline service system 100 may provide a plurality of services. Exemplary service may include a taxi hailing service, a chauffeur service, an express car service, a carpool service, a bus service, a driver hire service, and a shuttle service. In some embodiments, the online to offline service may be any on-line service, such as booking a meal, shopping, or the like, or any combination thereof.
In some embodiments, the server 110 may be a single server, or a server group. The server group may be centralized, or distributed (e.g., the server 110 may be a distributed system). In some embodiments, the server 110 may be local or remote. For example, the server 110 may access information and/or data stored in the service requestor terminal 130, the service provider terminal 140, and/or the storage device 160 via the network 120. As another example, the server 110 may be directly connected to the service requestor terminal 130, the service provider terminal 140, and/or the storage device 160 to access stored information and/or data. In some embodiments, the server 110 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof. In some embodiments, the server 110 may be implemented on a computing device 200 having one or more components illustrated in FIG. 2 in the present disclosure.
In some embodiments, the server 110 may include a processing engine 112. The processing engine 112 may process information and/or data related to the service request to perform one or more functions described in the present disclosure. For example, the processing engine 112 may determine that at least part of a plurality of actual values are abnormal. In some embodiments, the processing engine 112 may include one or more processing engines (e.g., single-core processing engine(s) or multi-core processor(s)). Merely by way of example, the processing engine 112 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction-set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction-set computer (RISC), a microprocessor, or the like, or any combination thereof.
The network 120 may facilitate exchange of information and/or data. In some embodiments, one or more components in the online to offline service system 100 (e.g., the server 110, the service requestor terminal 130, the service provider terminal 140, the vehicle 150, the storage device 160, and the navigation system 170) may send information and/or data to other component(s) in the online to offline service system 100 via the network 120. For example, the server 110 may receive a service request from the service requestor terminal 130 via the network 120. In some embodiments, the network 120 may be any type of wired or wireless network, or combination thereof. Merely by way of example, the network 120 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, the Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public switched telephone network (PSTN), a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof. In some embodiments, the network 120 may include one or more network access points. For example, the network 120 may include wired or wireless network access points such as base stations and/or internet exchange points 120-1, 120-2, …, through which one or more components of the online to offline service system 100 may be connected to the network 120 to exchange data and/or information.
In some embodiments, a passenger may be an owner of the service requestor terminal 130. In some embodiments, the owner of the service requestor terminal 130 may be someone other than the passenger. For example, an owner A of the service requestor terminal 130 may use the service requestor terminal 130 to send a service request for a passenger B, or receive a service confirmation and/or information or instructions from the server 110. In some embodiments, a service provider may be a user of the service provider terminal 140. In some embodiments, the user of the service provider terminal 140 may be someone other than the service provider. For example, a user C of the service provider terminal 140 may use the service provider terminal 140 to receive a service request for a service provider D, and/or information or instructions from the server 110. In some embodiments, "passenger" and "passenger terminal" may be used interchangeably, and "service provider" and "service provider terminal" may be used interchangeably. In some embodiments, the service provider terminal may be associated with one or more service providers (e.g., a night-shift service provider, or a day-shift service provider) .
In some embodiments, the service requestor terminal 130 may include a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, a built-in device in a vehicle 130-4, or the like, or any combination thereof. In some embodiments, the mobile device 130-1 may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home device may include a smart lighting device, a control device of an intelligent electrical apparatus, a smart monitoring device, a smart television, a smart video camera, an interphone, or the like, or any combination thereof. In some embodiments, the wearable device may include a smart bracelet, a smart footgear, a smart glass, a smart helmet, a smart watch, smart clothing, a smart backpack, a smart accessory, or the like, or any combination thereof. In some embodiments, the smart mobile device may include a smartphone, a personal digital assistant (PDA), a gaming device, a navigation device, a point of sale (POS) device, or the like, or any combination thereof. In some embodiments, the virtual reality device and/or the augmented reality device may include a virtual reality helmet, a virtual reality glass, a virtual reality patch, an augmented reality helmet, an augmented reality glass, an augmented reality patch, or the like, or any combination thereof. For example, the virtual reality device and/or the augmented reality device may include a Google™ Glass, an Oculus Rift, a HoloLens, a Gear VR, etc. In some embodiments, the built-in device in the vehicle 130-4 may include an onboard computer, an onboard television, etc. In some embodiments, the service requestor terminal 130 may be a device with positioning technology for locating the position of the passenger and/or the service requestor terminal 130.
The service provider terminal 140 may include a plurality of service provider terminals 140-1, 140-2, …, 140-n. In some embodiments, the service provider terminal 140 may be similar to, or the same device as the service requestor terminal 130. In some embodiments, the service provider terminal 140 may be customized to be able to implement the online to offline service. In some embodiments, the service provider terminal 140 may be a device with positioning technology for locating the service provider, the service provider terminal 140, and/or a vehicle 150 associated with the service provider terminal 140. In some embodiments, the service requestor terminal 130 and/or the service provider terminal 140 may communicate with other positioning devices to determine the position of the passenger, the service requestor terminal 130, the service provider, and/or the service provider terminal 140. In some embodiments, the service requestor terminal 130 and/or the service provider terminal 140 may periodically send the positioning information to the server 110. In some embodiments, the service provider terminal 140 may also periodically send the availability status to the server 110. The availability status may indicate whether a vehicle 150 associated with the service provider terminal 140 is available to carry a passenger. For example, the service requestor terminal 130 and/or the service provider terminal 140 may send the positioning information and the availability status to the server 110 every thirty minutes. As another example, the service requestor terminal 130 and/or the service provider terminal 140 may send the positioning information and the availability status to the server 110 each time the user logs into the mobile application associated with the online to offline service.
In some embodiments, the service provider terminal 140 may correspond to one or more vehicles 150. The vehicles 150 may carry the passenger and travel to the destination. The vehicles 150 may include a plurality of vehicles 150-1, 150-2, …, 150-n. One vehicle may correspond to one type of services (e.g., a taxi hailing service, a chauffeur service, an express car service, a carpool service, a bus service, a driver hire service, and a shuttle service) .
The storage device 160 may store data and/or instructions. In some embodiments, the storage device 160 may store data obtained from the service requestor terminal 130 and/or the service provider terminal 140. In some embodiments, the storage device 160 may store data and/or instructions that the server 110 may execute or use to perform exemplary methods described in the present disclosure. In some embodiments, the storage device 160 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM), or the like, or any combination thereof. Exemplary mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc. Exemplary removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc. Exemplary volatile read-and-write memory may include a random access memory (RAM). Exemplary RAM may include a dynamic RAM (DRAM), a double data rate synchronous dynamic RAM (DDR SDRAM), a static RAM (SRAM), a thyristor RAM (T-RAM), a zero-capacitor RAM (Z-RAM), etc. Exemplary ROM may include a mask ROM (MROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), a compact disk ROM (CD-ROM), a digital versatile disk ROM, etc. In some embodiments, the storage device 160 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
In some embodiments, the storage device 160 may be connected to the network 120 to communicate with one or more components in the online to offline service system 100 (e.g., the server 110, the service requestor terminal 130, the service provider terminal 140, etc. ) . One or more components in the online to offline service system 100 may access the data or instructions stored in the storage device 160 via the network 120. In some embodiments, the storage device 160 may be directly connected to or communicate with one or more components in the online to offline service system 100 (e.g., the server 110, the service requestor terminal 130, the service provider terminal 140, etc. ) . In some embodiments, the storage device 160 may be part of the server 110.
The navigation system 170 may determine information associated with an object, for example, one or more of the service requestor terminal 130, the service provider terminal 140, the vehicle 150, etc. In some embodiments, the navigation system 170 may be a global positioning system (GPS) , a global navigation satellite system (GLONASS) , a compass navigation system (COMPASS) , a BeiDou navigation satellite system, a Galileo positioning  system, a quasi-zenith satellite system (QZSS) , etc. The information may include a location, an elevation, a velocity, or an acceleration of the object, or a current time. The navigation system 170 may include one or more satellites, for example, a satellite 170-1, a satellite 170-2, and a satellite 170-3. The satellites 170-1 through 170-3 may determine the information mentioned above independently or jointly. The satellite navigation system 170 may send the information mentioned above to the network 120, the service requestor terminal 130, the service provider terminal 140, or the vehicle 150 via wireless connections.
In some embodiments, one or more components in the online to offline service system 100 (e.g., the server 110, the service requestor terminal 130, the service provider terminal 140, etc.) may have permissions to access the storage device 160. In some embodiments, one or more components in the online to offline service system 100 may read and/or modify information related to the passenger, service provider, and/or the public when one or more conditions are met. For example, the server 110 may read and/or modify one or more passengers' information after a service is completed. As another example, the server 110 may read and/or modify one or more service providers' information after a service is completed.
In some embodiments, information exchanging of one or more components in the online to offline service system 100 may be initiated by way of requesting a service. The object of the service request may be any product. In some embodiments, the product may include food, medicine, commodity, chemical product, electrical appliance, clothing, car, housing, luxury, or the like, or any combination thereof. In some other embodiments, the product may include a servicing product, a financial product, a knowledge product, an internet product, or the like, or any combination thereof. The internet product may include an individual host product, a web product, a mobile internet product, a commercial host product, an embedded product, or the like, or any combination thereof. The mobile internet product may be used in a software of a mobile terminal, a program, a system, or the like, or any combination thereof. The mobile terminal may include a tablet computer, a laptop computer, a mobile phone, a personal digital assistant (PDA), a smart watch, a point of sale (POS) device, an onboard computer, an onboard television, a wearable device, or the like, or any combination thereof. For example, the product may be any software and/or application used in the computer or mobile phone. The software and/or application may relate to socializing, shopping, transporting, entertainment, learning, investment, or the like, or any combination thereof. In some embodiments, the software and/or application related to transporting may include a traveling software and/or application, a vehicle scheduling software and/or application, a mapping software and/or application, etc. In the vehicle scheduling software and/or application, the vehicle may include a horse, a carriage, a rickshaw (e.g., a wheelbarrow, a bike, a tricycle, etc.), a car (e.g., a taxi, a bus, a private car, etc.), a train, a subway, a vessel, an aircraft (e.g., an airplane, a helicopter, a space shuttle, a rocket, a hot-air balloon, etc.), or the like, or any combination thereof.
FIG. 2 is a schematic diagram illustrating exemplary hardware and software components of a computing device 200 on which the server 110, the requestor terminal 130, and/or the provider terminal 140 may be implemented according to some embodiments of the present disclosure. For example, the processing engine 112 may be implemented on the computing device 200 and configured to perform functions of the processing engine 112 disclosed in this disclosure.
The computing device 200 may be a general-purpose computer or a special purpose computer; both may be used to implement an online to offline service system for the present disclosure. The computing device 200 may be used to implement any component of the online to offline service as described herein. For example, the processing engine 112 may be  implemented on the computing device 200, via its hardware, software program, firmware, or a combination thereof. Although only one such computer is shown, for convenience, the computer functions relating to the online to offline service as described herein may be implemented in a distributed fashion on a number of similar platforms, to distribute the processing load.
The computing device 200, for example, may include COM ports 250 connected to and from a network connected thereto to facilitate data communications. The computing device 200 may also include a processor (e.g., the processor 220) , in the form of one or more processors, for executing program instructions. The exemplary computing device may include an internal communication bus 210, program storage and data storage of different forms including, for example, a disk 270, and a read only memory (ROM) 230, or a random access memory (RAM) 240, for various data files to be processed and/or transmitted by the computing device. The exemplary computing device may also include program instructions stored in the ROM 230, RAM 240, and/or other type of non-transitory storage medium to be executed by the processor 220. The methods and/or processes of the present disclosure may be implemented as the program instructions. The computing device 200 also includes an I/O component 260, supporting input/output between the computer and other components. The computing device 200 may also receive programming and data via network communications.
Merely for illustration, only one CPU and/or processor is illustrated in FIG. 2. Multiple CPUs and/or processors are also contemplated; thus operations and/or method steps performed by one CPU and/or processor as described in the present disclosure may also be jointly or separately performed by the multiple CPUs and/or processors. For example, if in the present disclosure the CPU and/or processor of the computing device 200 executes both step A and step B, it should be understood that step A and step B may also be performed by two different CPUs and/or processors jointly or  separately in the computing device 200 (e.g., the first processor executes step A and the second processor executes step B, or the first and second processors jointly execute steps A and B) .
FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of a mobile device 300 according to some embodiments of the present disclosure. As illustrated in FIG. 3, the mobile device 300 may include a communication module 310, a display 320, a graphics processing unit (GPU) 330, a processor 340, an I/O 350, a memory 360, and a storage 390. In some embodiments, any other suitable component, including but not limited to a system bus or a controller (not shown), may also be included in the mobile device 300. In some embodiments, a mobile operating system 370 (e.g., iOS™, Android™, Windows Phone™) and one or more applications 380 may be loaded into the memory 360 from the storage 390 in order to be executed by the processor 340. The applications 380 may include a browser or any other suitable apps for transmitting, receiving and presenting information relating to the status of the vehicle 150 (e.g., the location of the vehicle 150) from the server 110. User interactions with the information stream may be achieved via the I/O 350 and provided to the server 110 and/or other components of the online to offline service system 100 via the network 120.
FIG. 4 is a block diagram illustrating an exemplary processing engine 112 according to some embodiments of the present disclosure. The processing engine 112 may include an obtaining module 402, a classification module 404, a prediction module 406, a comparison module 408, and a determination module 410. At least a portion of the processing engine 112 may be implemented on a computing device as illustrated in FIG. 2 or a mobile device as illustrated in FIG. 3.
The obtaining module 402 may be configured to obtain, via the network 120, a plurality of historical data values relating to a service. The service may be associated with a business line of the online to offline service system 100. The business line may be any service provided through the online to offline service system 100 including but not limited to online taxi-hailing, online car rental, advertisement, Internet finance, or the like, or a combination thereof. The obtaining module 402 may also be configured to obtain, via the network 120, a plurality of actual values relating to the service that correspond to a plurality of predicted values. The plurality of predicted values may be determined by the prediction module 406.
The classification module 404 may be configured to determine a category relating to the plurality of historical data values. The classification module 404 may extract a plurality of features from the plurality of historical data values. The classification module 404 may classify the plurality of historical data values into the category based on values of the plurality of the features.
The prediction module 406 may be configured to determine a plurality of predicted values relating to the service. The prediction module 406 may determine a plurality of predicted values based on a prediction model relating to the category determined by the classification module 404. The plurality of predicted values may be associated with a plurality of time points in real time.
The comparison module 408 may be configured to compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result. The at least one filter may include a dispersion filter, a threshold filter, and a false alarm filter. In some embodiments, the comparison module 408 may compare the plurality of actual values with the plurality of predicted values using each of the dispersion filter, the threshold filter, and the false alarm filter to generate the comparison result. The comparison result may include at least one of a first comparison result, a second comparison result, and a third comparison result. The comparison  module 408 may determine the first comparison result, the second comparison result, and the third comparison result based on the dispersion filter, the threshold filter, and the false alarm filter, respectively.
The determination module 410 may be configured to determine that at least part of the plurality of actual values are abnormal based on the comparison result. Each of the first comparison result, the second comparison result, and the third comparison result may include a determination that the at least part of the plurality of actual values are abnormal. The determination module 410 may determine that the at least part of the plurality of actual values are abnormal based on one of or a combination of the first comparison result, the second comparison result, and the third comparison result.
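Merely for illustration, one possible way of combining the first, second, and third comparison results is sketched below. The sketch assumes that each comparison result is represented as a set of indices of the actual values that the corresponding filter considers abnormal; this representation, the function name `combine_results`, and the "any"/"all" modes are hypothetical and are not prescribed by the present disclosure.

```python
from typing import Iterable


def combine_results(results: Iterable[set[int]], mode: str = "any") -> set[int]:
    """Combine per-filter comparison results into one set of abnormal indices.

    mode="any": an actual value is abnormal if at least one filter flags it.
    mode="all": an actual value is abnormal only if every filter flags it.
    """
    results = list(results)
    if not results:
        return set()
    if mode == "any":
        return set().union(*results)
    if mode == "all":
        return set.intersection(*results)
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical comparison results from the three filters.
first_result = {2, 5}    # e.g., from the dispersion filter
second_result = {2, 7}   # e.g., from the threshold filter
third_result = {2, 5}    # e.g., from the false alarm filter

print(combine_results([first_result, second_result, third_result], "any"))
print(combine_results([first_result, second_result, third_result], "all"))
```

The "all" mode trades sensitivity for fewer false alarms, whereas the "any" mode does the opposite; which combination is appropriate may depend on the business line being monitored.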
It should be noted that the above description of the processing engine 112 is merely provided for the purpose of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, multiple variations and modifications may be made under the teachings of the present disclosure. However, those variations and modifications do not depart from the scope of the present disclosure. For example, the prediction module 406 and the determination module 410 may be integrated into one single module to perform their functions.
FIG. 5 is a flowchart illustrating an exemplary process 500 for determining that at least part of a plurality of actual values are abnormal based on a comparison result according to some embodiments of the present disclosure. In some embodiments, the processing engine 112 may perform the process 500 to determine that the at least part of the plurality of actual values are abnormal. In some embodiments, one or more operations of the process 500 illustrated in FIG. 5 for determining that the at least part of the plurality of actual values are abnormal may be implemented in the online to offline service system 100 illustrated in FIG. 1. For example, the process 500 illustrated in FIG. 5 may be stored in the storage device 160 in the form of instructions, and invoked and/or executed by the processing engine 112 (e.g., the processor 220 of the computing device 200 as illustrated in FIG. 2, the processor 340 of the mobile device 300 as illustrated in FIG. 3).
In 502, the processing engine 112 (e.g., the obtaining module 402) may obtain, via the network 120, a plurality of historical data values relating to a service. The service may be associated with a business line of the online to offline service system 100. The business line may be any service provided through the online to offline service system 100 including but not limited to online taxi-hailing, online car rental, advertisement, Internet finance, or the like, or a combination thereof.
In the present application, the service may be described in the form of online taxi-hailing for illustration purposes, but it should not be interpreted to limit the service to the form of online taxi-hailing only. In some embodiments, the online taxi-hailing may be associated with a service requestor, a service provider, a service request. The service request may include a real-time request and/or an appointment request. As used herein, a real-time request may be a request that the requestor wishes to use a transportation service at the present moment or at a defined time reasonably close to the present moment for an ordinary person in the art. For example, a request may be a real-time request if the defined time is shorter than a threshold value, such as 1 minute, 5 minutes, 10 minutes or 20 minutes. The appointment request may refer to that the requestor wishes to use a transportation service at a defined time which is reasonably far from the present moment for the ordinary person in the art. For example, a request may be an appointment request if the defined time is longer than a threshold value, such as 20 minutes, 2 hours, or 1 day. In some embodiments, the processing engine 112 may define the real-time request or the appointment request based on a time threshold. The time threshold may be default settings of the system 100, or may be adjustable  depending on different situations. For example, in a traffic peak period, the time threshold may be relatively small (e.g., 10 minutes) , otherwise in idle period (e.g., 10: 00-12: 00 am) , the time threshold may be relatively large (e.g., 1 hour) .
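Merely for illustration, the distinction between a real-time request and an appointment request based on an adjustable time threshold may be sketched as follows. The peak-hour windows, the function names, and the treatment of an interval exactly equal to the threshold as an appointment request are assumptions made for the sketch; only the example thresholds of 10 minutes and 1 hour come from the description above.

```python
from datetime import datetime, timedelta

# Hypothetical peak windows, expressed as (start hour, end hour) pairs.
PEAK_HOURS = ((7, 9), (17, 19))


def time_threshold(now: datetime) -> timedelta:
    """Return a smaller threshold in a traffic peak period, a larger one otherwise."""
    in_peak = any(start <= now.hour < end for start, end in PEAK_HOURS)
    return timedelta(minutes=10) if in_peak else timedelta(hours=1)


def request_type(now: datetime, defined_time: datetime) -> str:
    """Classify a request by how far its defined time lies from the present moment."""
    interval = defined_time - now
    return "real-time" if interval < time_threshold(now) else "appointment"


peak_now = datetime(2018, 6, 8, 8, 0)
idle_now = datetime(2018, 6, 8, 11, 0)

# The same 15-minute interval is classified differently in peak and idle periods.
print(request_type(peak_now, peak_now + timedelta(minutes=15)))
print(request_type(idle_now, idle_now + timedelta(minutes=15)))
```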
The service request may include a start location, an end location, a start time, a duration, or the like. The start location may refer to the location where the service provider picks up the passenger. The end location may refer to the location where the service provider drops off the passenger. The start time may refer to a time at which a passenger has been picked up, or a time at which a service provider (e.g., a driver) has received or confirmed the service request. The duration may be the time that a service provider spent driving a passenger from the start location to the end location associated with a service request.
The plurality of historical data values may be associated with the service. In some embodiments, the plurality of historical data values may include a plurality of numbers of service requests, a plurality of durations of service requests, a plurality of start locations of service requests, or the like, or a combination thereof. The plurality of historical data values may be associated with a plurality of time points (e.g., the start times of service requests). In the present application, the historical data values may be described in the form of the plurality of numbers of service requests for illustration purposes, but this should not be interpreted as limiting the historical data values to the plurality of numbers of service requests only.
The plurality of historical data values may form a temporal sequence (hereinafter also referred to as the “sequence”). For example, the sequence may be $(p_1, p_2, p_3, \ldots, p_{i-1}, p_i, \ldots, p_n)$. In the sequence, each value may be associated with a time point (e.g., a start time of a service request). The time points associated with the values $p_{i-1}$ and $p_i$ may be $t_{i-1}$ and $t_i$, respectively. The time point $t_{i-1}$ may be earlier than the time point $t_i$. In some embodiments, the value $p_i$ may be the number of service requests at the time point $t_i$.
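One way such a sequence could be assembled from raw start times is sketched below in Python. The function name `build_sequence`, the bucketing of start times into fixed-width time points, and the default bucket width are illustrative assumptions; the disclosure does not specify how time points are discretized.

```python
from collections import Counter
from datetime import datetime


def build_sequence(start_times, bucket_minutes=60):
    """Return [(t_i, p_i)] ordered by time, where p_i counts requests at t_i.

    Assumes bucket_minutes divides 60 and is at most 60. Time points with no
    requests are simply absent from the result.
    """
    def bucket(ts):
        minute = (ts.minute // bucket_minutes) * bucket_minutes
        return ts.replace(minute=minute, second=0, microsecond=0)

    counts = Counter(bucket(ts) for ts in start_times)
    return sorted(counts.items())


if __name__ == "__main__":
    starts = [
        datetime(2018, 6, 8, 8, 5),
        datetime(2018, 6, 8, 8, 40),
        datetime(2018, 6, 8, 9, 10),
    ]
    for t_i, p_i in build_sequence(starts):
        print(t_i, p_i)
```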
In 504, the processing engine 112 (e.g., the classification module 404) may determine a category relating to the plurality of historical data values. The processing engine 112 may analyze the sequence associated with the plurality of historical data values. The processing engine 112 may then extract a plurality of features from the sequence. More description regarding the plurality of features may be found elsewhere in the present disclosure, for example, in FIG. 9 and the descriptions thereof.
In some embodiments, the plurality of features may be associated with the service (e.g., the online taxi-hailing) corresponding to the plurality of historical data values. The plurality of features may include years of development, volume of business, business flow, profit, or the like, or a combination thereof. The processing engine 112 may determine values of the plurality of the features based on the plurality of historical data values.
The processing engine 112 may classify the plurality of historical data values into a category based on the values of the plurality of the features. The category may indicate a characteristic relating to the service. The category may be constructed based on Cartesian product of two sets. A first set may include elements of periodicity, aperiodicity, or the like. A second set may include elements of growth period, stable period, fading period, or the like. In some embodiments, the category may include growth period with periodicity, stable period with periodicity, fading period with periodicity, growth period with aperiodicity, stable period with aperiodicity, and fading period with aperiodicity, or the like. In another embodiment, the category may include one element of the first set or the second set. More description regarding the category may be found elsewhere in the present disclosure, for example, in FIG. 9 and the descriptions thereof.
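The construction of the categories as a Cartesian product of the two sets can be sketched as follows. The string labels follow the wording above; the variable and function names are hypothetical.

```python
from itertools import product

# First set: periodicity elements; second set: life-cycle elements.
PERIODICITY = ("periodicity", "aperiodicity")
LIFE_CYCLE = ("growth period", "stable period", "fading period")

# Each category pairs one element of each set, e.g. "growth period with periodicity".
CATEGORIES = [
    f"{stage} with {period}" for period, stage in product(PERIODICITY, LIFE_CYCLE)
]


def is_periodic(category):
    """True if the category label is associated with periodicity."""
    return category.endswith("with periodicity")


if __name__ == "__main__":
    for name in CATEGORIES:
        print(name, is_periodic(name))
```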
The processing engine 112 may determine a classifier for classifying the plurality of historical data values into a category. The processing engine 112 may determine a category associated with a plurality of historical data values via a third party. The processing engine 112 may determine a training set based on values of features relating to the plurality of historical data values and the category. The processing engine 112 may determine the classifier based on the training set using a model (e.g., a Gradient Boosting Decision Tree (GBDT) model). The classifier may classify other historical data values (e.g., a plurality of sequences) into categories. The classifying result may be added into the training set. The classifier may be updated over time using new training sets. The processing engine 112 may classify the plurality of historical data values into a category based on the newest classifier. In some embodiments, the processing engine 112 may classify the plurality of historical data values into more than one category based on the newest classifier.
In 506, the processing engine 112 (e.g., the prediction module 406) may determine a plurality of predicted values relating to the service based on a prediction model relating to the category. The plurality of predicted values may be associated with a plurality of first time points in real time. For example, the plurality of first time points may include $(t_1, t_2, t_3, \ldots, t_{j-1}, t_j, \ldots, t_m)$. Similar to the plurality of historical data values, the plurality of predicted values may also form a sequence.
If the category is associated with periodicity, the processing engine 112 may use an algorithm (e.g., an exponential smoothing algorithm) to determine the plurality of predicted values. The processing engine 112 may determine a statistical parameter relating to the plurality of historical data values using the algorithm. The statistical parameter may be denoted by a residual function, a trend function and/or a seasonal function.
In some embodiments, the processing engine 112 may determine a prediction model based on the statistical parameter. The processing engine 112 may determine the plurality of predicted values based on the prediction model. More description regarding the determination of the plurality of predicted values may be found elsewhere in the present disclosure, for example, in FIG. 6 and the descriptions thereof.
If the category is associated with aperiodicity, the processing engine 112 may periodically collect the historical data values at a time point. In some embodiments, to determine a predicted value (e.g., the number of service requests) at a time point on the next Monday, the processing engine 112 may collect the numbers of service requests at that time point on the Mondays of the past several weeks. The processing engine 112 may determine the predicted value at the time point on the next Monday based on the collected numbers of service requests.
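The Monday example can be sketched in Python as follows. The disclosure only states that the predicted value is determined based on the collected numbers; taking their arithmetic mean, the four-week look-back, and the function name `predict_same_weekday` are assumptions made for this sketch.

```python
from datetime import datetime, timedelta
from statistics import mean


def predict_same_weekday(history, target, weeks=4):
    """Predict the value at `target` from the same time point in past weeks.

    history maps a time point (datetime) to the number of service requests.
    The collected values are averaged, which is an assumed aggregation.
    """
    samples = []
    for w in range(1, weeks + 1):
        past = target - timedelta(weeks=w)
        if past in history:
            samples.append(history[past])
    if not samples:
        raise ValueError("no historical values at this time point")
    return mean(samples)


if __name__ == "__main__":
    history = {
        datetime(2018, 6, 4, 9, 0): 120,   # last Monday, 09:00
        datetime(2018, 5, 28, 9, 0): 110,
        datetime(2018, 5, 21, 9, 0): 100,
    }
    print(predict_same_weekday(history, datetime(2018, 6, 11, 9, 0)))
```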
In 508, the processing engine 112 (e.g., the obtaining module 402) may obtain, via the network 120, a plurality of actual values relating to the service corresponding to the plurality of predicted values. The plurality of actual values may be associated with a plurality of second time points. The plurality of second time points and the plurality of first time points may be in one-to-one correspondence. Alternatively or additionally, the plurality of second time points and part of the plurality of first time points may be in one-to-one correspondence.
For each of the plurality of second time points, there may be an actual value corresponding to the second time point. The plurality of actual values may refer to values (e.g., the number of service requests) relating to the service after a service (e.g., a service request) was completed. The plurality of actual values may be stored in the storage 160. Similar to the plurality of historical data values, the plurality of actual values may also form a sequence. In some embodiments, the plurality of actual values and the plurality of predicted values may be paired. For each of the plurality of predicted values, the processing engine 112 may obtain an actual value corresponding to the predicted value. Each predicted value and the corresponding actual value may be associated with a same time point.
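The pairing of predicted and actual values by time point, including the case in which the second time points correspond to only part of the first time points, might look like the following sketch. Representing each sequence as a mapping from time point to value, and the name `pair_by_time`, are assumptions.

```python
def pair_by_time(predicted, actual):
    """Pair values that share a time point.

    predicted and actual map a time point to a value. Only time points
    present in both mappings are paired, in chronological order.
    """
    common = sorted(predicted.keys() & actual.keys())
    return [(t, predicted[t], actual[t]) for t in common]


if __name__ == "__main__":
    predicted = {1: 100.0, 2: 105.0, 3: 110.0}
    actual = {1: 98.0, 2: 140.0}  # no actual value yet for time point 3
    print(pair_by_time(predicted, actual))
```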
In 510, the processing engine 112 may compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result. The at least one filter may include a dispersion filter, a threshold filter, and a false alarm filter. In some embodiments, the processing engine 112 may compare the plurality of actual values with the plurality of predicted values using each of the dispersion filter, the threshold filter, and the false alarm filter to generate the comparison result.
The comparison result may include at least one of a first comparison result, a second comparison result, and a third comparison result. The processing engine 112 may compare the plurality of actual values with the plurality of predicted values to generate the first comparison result based on the dispersion filter. More description regarding the first comparison result may be found elsewhere in the present disclosure, for example, in FIG. 7 and the descriptions thereof.
The processing engine 112 may compare the plurality of actual values with the plurality of predicted values to generate the second comparison result based on the threshold filter. More description regarding the second comparison result may be found elsewhere in the present disclosure, for example, in FIG. 8 and the descriptions thereof.
The processing engine 112 may compare the plurality of actual values with the plurality of predicted values to generate the third comparison result based on the false alarm filter. The processing engine 112 may obtain a pre-labeled data set relating to service data. The pre-labeled data set may include a plurality of false alarm results. The plurality of false alarm results may refer to actual values determined by the online to offline service system 100 as abnormal, which were later corrected as normal by the third party. The plurality of false alarm results may also refer to actual values determined by the online to offline service system 100 as normal, which were later corrected as abnormal by the third party.
In some embodiments, the processing engine 112 (e.g., the comparison module 408) may determine a model based on the pre-labeled data set. The processing engine 112 may determine the model by training a classifying model based on the pre-labeled data set. The classifying model may include a GBDT model, a random forest model, or the like. The processing engine 112 may use the plurality of actual values and/or the plurality of predicted values as input of the model. The processing engine 112 may then obtain the third comparison result based on the model.
In 512, the processing engine 112 may determine that at least part of the plurality of actual values are abnormal based on the comparison result. The comparison result may include at least one of the first comparison result, the second comparison result, and the third comparison result. Each of the first comparison result, the second comparison result, and the third comparison result may include a determination that the at least part of the plurality of actual values are abnormal.
The processing engine 112 may determine that the at least part of the plurality of actual values are abnormal based on the comparison result. For example, the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal if one of the first comparison result, the second comparison result, and the third comparison result includes a determination that the at least part of the plurality of actual values are abnormal. As another example, the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal if two of the first comparison result, the second comparison result, and the third comparison result both include determinations that the at least part of the plurality of actual values are abnormal. As still another example, the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal if all of the first comparison result, the second comparison result, and the third comparison result include determinations that the at least part of the plurality of actual values are abnormal. The processing engine 112 may omit the at least part of the plurality of actual values from the plurality of actual values.
It should be noted that the above description of the process for determining that the at least part of the plurality of actual values are abnormal is provided for illustration purposes, and should not be designated as the only practical embodiment. Persons having ordinary skill in the art, after understanding the general principle of the process for determining that the at least part of the plurality of actual values are abnormal, may, without departing from the principle, modify or change the forms or details of the particular practical ways and steps, make simple deductions or substitutions, or make modifications or combinations of some steps without further creative efforts. However, those variations and modifications do not depart from the scope of the present disclosure. Additionally or alternatively, one or more steps may be omitted. In some embodiments, two or more steps may be integrated into a step, or a step may be separated into two steps. In some embodiments, 506 and 508 may be combined into one operation.
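The three example policies above (one, two, or all of the comparison results indicating an abnormality) can be expressed as a single vote count. The sketch below is illustrative only; the function name `is_abnormal` and the `min_votes` parameter are hypothetical.

```python
def is_abnormal(filter_results, min_votes=1):
    """Combine per-filter determinations into one decision.

    filter_results holds one boolean per comparison result (dispersion,
    threshold, false alarm). min_votes=1 flags on any filter, 2 on any two,
    and len(filter_results) only when all filters agree.
    """
    votes = sum(bool(result) for result in filter_results)
    return votes >= min_votes


if __name__ == "__main__":
    results = [True, False, True]  # first, second, third comparison results
    for needed in (1, 2, 3):
        print(needed, is_abnormal(results, min_votes=needed))
```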
FIG. 6 is a flowchart illustrating an exemplary process 600 for determining a plurality of predicted values according to some embodiments of the present disclosure. In some embodiments, the processing engine 112 may perform the process 600 to determine the plurality of predicted values. In some embodiments, one or more operations of the process 600 illustrated in FIG. 6 for determining the plurality of predicted values may be implemented in the online to offline service system 100 illustrated in FIG. 1. For example, the process 600 illustrated in FIG. 6 may be stored in the storage device 160 in the form of instructions, and invoked and/or executed by the processing engine 112 (e.g., the processor 210 of the computing device 200 as illustrated in FIG. 2, the CPU 340 of the mobile device 300 as illustrated in FIG. 3).
In 602, the processing engine 112 (e.g., the prediction module 406) may determine that the category is associated with periodicity. For example, the category may be one of the growth period with periodicity, the stable period with periodicity, and the fading period with periodicity. The category associated with periodicity may indicate that the plurality of historical data values are periodical.
In 604, the processing engine 112 (e.g., the prediction module 406) may determine a statistical parameter relating to the plurality of historical data values based on the category. The processing engine 112 may analyze the plurality of historical data values using a time series method. The time series method may include a moving average model, an autoregressive model, an autoregressive moving average model, an exponential smoothing model, or the like, or a combination thereof. The exponential smoothing model may include a basic exponential smoothing model, a double exponential smoothing model, a triple exponential smoothing model, or the like, or a combination thereof.
In some embodiments, the processing engine 112 may determine the statistical parameter based on the exponential smoothing model (e.g., the triple exponential smoothing model). The statistical parameter may include a residual function, a trend function, and a seasonal function. The residual function, the trend function, and the seasonal function may all be functions of time.
In 606, the processing engine 112 (e.g., the prediction module 406) may generate a prediction model based on the statistical parameter. In some embodiments, the processing engine 112 may determine the prediction model based on the residual function, the trend function, and the seasonal function. For example, the prediction model may be expressed by Equation (1):
$$p_{t+h} = a(t) + h \cdot b(t) + s\left[t - k + 1 + \left((h - 1) \bmod k\right)\right] \qquad (1)$$
where $a(t)$ may refer to the residual function, $b(t)$ may refer to the trend function, $s(t)$ may refer to the seasonal function, $p_{t+h}$ may refer to the predicted value, $t$ may refer to a current time point, $h$ may refer to a time interval from the current time point $t$ to a time point associated with the predicted value $p_{t+h}$, $k$ may refer to the period associated with the plurality of historical data values, and “mod” may refer to a modulo operation.
In 608, the processing engine 112 (e.g., the prediction module 406) may determine the plurality of predicted values based on the prediction model. The processing engine 112 may obtain time points associated with the plurality of predicted values. The processing engine 112 may determine the plurality of predicted values based on the time points associated with the plurality of predicted values using the Equation (1) .
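A minimal sketch of how Equation (1) might be evaluated is given below. The fitting routine is a common textbook form of additive triple exponential smoothing and is an assumption: the disclosure does not give the update rules, the initialization, or the smoothing coefficients, so `fit_triple_smoothing`, `alpha`, `beta`, and `gamma` are hypothetical. Only `predict` corresponds directly to Equation (1).

```python
def fit_triple_smoothing(values, k, alpha=0.5, beta=0.5, gamma=0.5):
    """Fit additive triple exponential smoothing with period k.

    Returns (a, b, s, t): the last value a(t), the last trend b(t), the list
    of seasonal terms s indexed by time point, and the last index t.
    Needs at least two full periods of history for the assumed initialization.
    """
    if len(values) < 2 * k:
        raise ValueError("need at least two full periods of history")
    first, second = values[:k], values[k:2 * k]
    level = sum(first) / k
    trend = (sum(second) / k - level) / k
    seasonal = [v - level for v in first]
    for i in range(k, len(values)):
        prev_level = level
        level = alpha * (values[i] - seasonal[i - k]) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal.append(gamma * (values[i] - level) + (1 - gamma) * seasonal[i - k])
    return level, trend, seasonal, len(values) - 1


def predict(a, b, s, t, h, k):
    """Equation (1): p_{t+h} = a(t) + h*b(t) + s[t - k + 1 + (h - 1) mod k]."""
    return a + h * b + s[t - k + 1 + (h - 1) % k]


if __name__ == "__main__":
    history = [10, 20, 30] * 3  # a purely periodic sequence with k = 3
    a, b, s, t = fit_triple_smoothing(history, k=3)
    print([predict(a, b, s, t, h, 3) for h in range(1, 5)])
```

The modulo term in the seasonal index makes forecasts beyond one period reuse the most recent full period of seasonal terms, so the horizon h may exceed k.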
It should be noted that the above description of the process for determining the plurality of predicted values is provided for illustration purposes, and should not be designated as the only practical embodiment. Persons having ordinary skill in the art, after understanding the general principle of the process for determining the plurality of predicted values, may, without departing from the principle, modify or change the forms or details of the particular practical ways and steps, make simple deductions or substitutions, or make modifications or combinations of some steps without further creative efforts. However, those variations and modifications do not depart from the scope of the present disclosure. Additionally or alternatively, one or more steps may be omitted. In some embodiments, two or more steps may be integrated into a step, or a step may be separated into two steps.
FIG. 7 is a flowchart illustrating an exemplary process 700 for determining that the at least part of the plurality of actual values are abnormal according to some embodiments of the present disclosure. In some embodiments, the processing engine 112 may perform the process 700 to determine that the at least part of the plurality of actual values are abnormal. The processing engine 112 may determine the first comparison result based on the dispersion filter by performing the process 700. In some embodiments, one or more operations of the process 700 illustrated in FIG. 7 for determining that the at least part of the plurality of actual values are abnormal may be implemented in the online to offline service system 100 illustrated in FIG. 1. For example, the process 700 illustrated in FIG. 7 may be stored in the storage device 160 in the form of instructions, and invoked and/or executed by the processing engine 112 (e.g., the processor 210 of the computing device 200 as illustrated in FIG. 2, the CPU 340 of the mobile device 300 as illustrated in FIG. 3).
In 702, the processing engine 112 (e.g., the comparison module 408) may determine a statistical value based on both the plurality of predicted values and the plurality of actual values. The processing engine 112 may treat the plurality of predicted values and the plurality of actual values each as a sample sequence. The processing engine 112 may perform a paired t-test on the two sample sequences. The processing engine 112 may then determine the statistical value based on the paired t-test. The statistical value may be associated with dispersion degrees of both the plurality of predicted values and the plurality of actual values.
In 704, the processing engine 112 (e.g., the comparison module 408) may compare the statistical value with a first threshold to generate the first comparison result. The first threshold may be a predetermined value set in the system. The first threshold may also be adjusted based on real-time conditions. In some embodiments, the first threshold may be any value, such as 0.5, 0.7, or 1. The processing engine 112 may compare the statistical value with the first threshold. The processing engine 112 may then determine whether the statistical value is greater than the first threshold.
In 706, the processing engine 112 (e.g., the comparison module 408) may determine that the at least part of the plurality of actual values are abnormal in response to the first comparison result that the statistical value is greater than the first threshold. If the processing engine 112 determines that the statistical value is less than the first threshold, the processing engine 112 may determine that the plurality of actual values are normal. If the processing engine 112 determines that the statistical value is greater than the first threshold, the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal. The first comparison result may indicate that the at least part of the plurality of actual values are abnormal. The processing engine 112 may omit the at least part of the plurality of actual values from the plurality of actual values.
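The process 700 can be sketched with a hand-computed paired t statistic. Using the absolute value of the t statistic as the statistical value, the default first threshold of 1.0, and the names `paired_t_statistic` and `dispersion_filter` are assumptions; the disclosure does not state which quantity from the paired t-test is compared with the first threshold.

```python
from math import inf, sqrt
from statistics import mean, stdev


def paired_t_statistic(predicted, actual):
    """Absolute paired t statistic of the differences actual - predicted."""
    diffs = [a - p for p, a in zip(predicted, actual)]
    n = len(diffs)
    if n < 2:
        raise ValueError("a paired t-test needs at least two pairs")
    d_mean, d_sd = mean(diffs), stdev(diffs)
    if d_sd == 0:
        # All differences are identical: no spread to normalize by.
        return 0.0 if d_mean == 0 else inf
    return abs(d_mean) / (d_sd / sqrt(n))


def dispersion_filter(predicted, actual, first_threshold=1.0):
    """First comparison result: True means at least part of the values are abnormal."""
    return paired_t_statistic(predicted, actual) > first_threshold


if __name__ == "__main__":
    predicted = [100, 100, 100, 100]
    print(dispersion_filter(predicted, [101, 99, 102, 98]))    # small, balanced errors
    print(dispersion_filter(predicted, [110, 120, 115, 125]))  # systematic deviation
```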
FIG. 8 is a flowchart illustrating an exemplary process 800 for determining that the at least part of the plurality of actual values are abnormal according to some embodiments of the present disclosure. In some embodiments, the processing engine 112 may perform the process 800 to determine that the at least part of the plurality of actual values are abnormal. The processing engine 112 may determine the second comparison result by performing the process 800. In some embodiments, one or more operations of the process 800 illustrated in FIG. 8 for determining that the at least part of the plurality of actual values are abnormal may be implemented in the online to offline service system 100 illustrated in FIG. 1. For example, the process 800 illustrated in FIG. 8 may be stored in the storage device 160 in the form of instructions, and invoked and/or executed by the processing engine 112 (e.g., the processor 210 of the computing device 200 as illustrated in FIG. 2, the CPU 340 of the mobile device 300 as illustrated in FIG. 3) .
In 802, the processing engine 112 (e.g., the comparison module 408) may determine a plurality of differences between the plurality of predicted values and the plurality of actual values using the threshold filter. For each difference, the predicted value and the corresponding actual value may be associated with a same time point. For each of at least part of the plurality of predicted values, the processing engine 112 may determine a difference based on the predicted value and the corresponding actual value.
In 804, the processing engine 112 (e.g., the comparison module 408) may determine a plurality of second thresholds based on a time function. The processing engine 112 may determine the time function based on the category into which the plurality of historical data values as a sequence is classified. The processing engine 112 may determine the plurality of second thresholds based on the time function. For each of the at least part of the plurality of first time points and/or the plurality of second time points, the processing engine 112 may determine a second threshold based on the time point and the time function. The processing engine 112 may determine the plurality of second thresholds accordingly.
In 806, the processing engine 112 (e.g., the comparison module 408) may compare each of the plurality of differences with a corresponding second threshold. For a time point, the processing engine 112 may determine a difference and a corresponding second threshold. For each of at least part of the first time points and/or the second time points, the processing engine 112 may determine if the difference associated with the time point is greater than a corresponding second threshold.
In 808, the processing engine 112 (e.g., the comparison module 408) may determine that the at least part of the plurality of actual values are abnormal in response to the comparison result that each of the plurality of differences is greater than a corresponding second threshold. If the processing engine 112 determines that part of the plurality of differences are less than the corresponding second threshold(s), respectively, the processing engine 112 may determine that the plurality of actual values are normal. If the processing engine 112 determines that each of the plurality of differences is greater than a corresponding second threshold, the processing engine 112 may determine that the at least part of the plurality of actual values are abnormal. The second comparison result may indicate that the at least part of the plurality of actual values are abnormal. The processing engine 112 may omit the at least part of the plurality of actual values from the plurality of actual values.
It should be noted that the above description of the process for determining that the at least part of the plurality of actual values are abnormal is provided for illustration purposes, and should not be designated as the only practical embodiment. Persons having ordinary skill in the art, after understanding the general principle of the process for determining that the at least part of the plurality of actual values are abnormal, may, without departing from the principle, modify or change the forms or details of the particular practical ways and steps, make simple deductions or substitutions, or make modifications or combinations of some steps without further creative efforts. However, those variations and modifications do not depart from the scope of the present disclosure. Additionally or alternatively, one or more steps may be omitted. In some embodiments, two or more steps may be integrated into a step, or a step may be separated into two steps.
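Operations 802 through 808 can be sketched as follows. The particular time function `second_threshold`, the use of hours as time points, and the use of the absolute difference are assumptions for illustration; the disclosure only states that the time function is determined based on the category.

```python
def second_threshold(hour):
    """Hypothetical time function: allow larger deviations in the morning peak."""
    return 20.0 if 7 <= hour < 9 else 10.0


def threshold_filter(predicted, actual, time_points, threshold_fn=second_threshold):
    """Second comparison result: True means at least part of the values are abnormal.

    Following the description above, an abnormality is reported only when
    every difference exceeds its corresponding second threshold.
    """
    diffs = [abs(a - p) for p, a in zip(predicted, actual)]
    thresholds = [threshold_fn(t) for t in time_points]
    return bool(diffs) and all(d > th for d, th in zip(diffs, thresholds))


if __name__ == "__main__":
    hours = [8, 11]
    print(threshold_filter([100, 100], [130, 125], hours))
    print(threshold_filter([100, 100], [130, 105], hours))
```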
FIG. 9 is a table 900 associated with a plurality of business lines according to some embodiments of the present disclosure. Four business line IDs 902 may be shown in the table 900. For each business line ID 902, the processing engine 112 may obtain a plurality of historical data values. The processing engine 112 may analyze the plurality of historical data values and extract four features from the plurality of historical data values. The four features may include a first feature 904 (e.g., years of development), a second feature 906 (e.g., volume of business), a third feature 908 (e.g., business flow), and a fourth feature 910 (e.g., profit). The processing engine 112 may also determine values of the first feature 904, the second feature 906, the third feature 908, and the fourth feature 910.
For each business line ID 902, the processing engine 112 may use the business line ID 902 and the values of the four features as input of the classifier. The processing engine 112 may determine a category into which the business line ID is classified based on the classifier. Accordingly, the plurality of historical data values associated with the business line ID is classified into the category. Four categories (a first category 912, a second category 914, a third category 916, and a fourth category 918) are shown in the table 900. Each of the four categories may be associated with the Cartesian product of the two sets as described in connection with the step 504.
In some embodiments, a value of the category into which a business line ID is classified may be set as 1, and values of other categories that a business line ID is not classified may be set as 0. As shown in the table 900, the business line ID 1 is classified into the first category. The business line ID 2 is classified into the second category. The business line ID 3 is classified into the third category. The business line ID 4 is classified into the fourth category.
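The 1/0 encoding of the category columns in the table 900 can be sketched as follows. The category labels and the function name `category_row` are placeholders chosen for illustration.

```python
CATEGORY_COLUMNS = ["first category", "second category", "third category", "fourth category"]


def category_row(assigned, columns=CATEGORY_COLUMNS):
    """Return one table row: 1 for the assigned category, 0 for all others."""
    if assigned not in columns:
        raise ValueError(f"unknown category: {assigned}")
    return [1 if name == assigned else 0 for name in columns]


if __name__ == "__main__":
    assignments = {1: "first category", 2: "second category",
                   3: "third category", 4: "fourth category"}
    for business_line_id, category in assignments.items():
        print(business_line_id, category_row(category))
```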
For each business line ID, the processing engine 112 may use the values of the four features and the category into which the business line ID is classified to update the training set associated with the classifier. The classifier may be updated based on the updated training set associated with the classifier.
It should be noted that the above description of the table is provided for illustration purposes, and should not be designated as the only practical embodiment. Persons having ordinary skill in the art, after understanding the general principle of the process for determining that the at least part of the plurality of actual values are abnormal, may, without departing from the principle, modify or change the forms or details of the particular practical ways and steps, make simple deductions or substitutions, or make modifications or combinations of some steps without further creative efforts. However, those variations and modifications do not depart from the scope of the present disclosure. For example, the number of features of each business line ID may be any other value instead of 4. The value of the category into which a business line ID is classified may be any other value instead of 1. The number of categories shown in the table 900 may be any other value instead of 4.
Having thus described the basic concepts, it may be rather apparent to those skilled in the art after reading this detailed disclosure that the foregoing detailed disclosure is intended to be presented by way of example only and is not limiting. Various alterations, improvements, and modifications may occur to those skilled in the art and are intended, though not expressly stated herein. These alterations, improvements, and modifications are intended to be suggested by this disclosure and are within the spirit and scope of the exemplary embodiments of this disclosure.
Moreover, certain terminology has been used to describe embodiments of the present disclosure. For example, the terms “one embodiment,” “an embodiment,” and/or “some embodiments” mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Therefore, it is emphasized and should be appreciated that two or more references to “an embodiment,” “one embodiment,” or “an alternative embodiment” in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined as suitable in one or more embodiments of the present disclosure.
Further, it will be appreciated by one skilled in the art that aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in an implementation combining software and hardware, all of which may generally be referred to herein as a “block,” “module,” “engine,” “unit,” “component,” or “system.” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer-readable media having computer readable program code embodied thereon.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electromagnetic, optical, or the like, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
Computer program code for carrying out steps for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python or the like, conventional procedural programming languages, such as the “C” programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as software as a service (SaaS).
Furthermore, the recited order of processing elements or sequences, or the use of numbers, letters, or other designations, therefore, is not intended to limit the claimed processes and methods to any order except as may be specified in the claims. Although the above disclosure discusses through various examples what is currently considered to be a variety of useful embodiments of the disclosure, it is to be understood that such detail is solely for that purpose, and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover modifications and equivalent arrangements that are within the spirit and scope of the disclosed embodiments. For example, although the implementation of various components described above may be embodied in a hardware device, it may also be implemented as a software-only solution, e.g., an installation on an existing server or mobile device.
Similarly, it should be appreciated that in the foregoing description of embodiments of the present disclosure, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, claimed subject matter may lie in less than all features of a single foregoing disclosed embodiment.

Claims (21)

  1. A system for abnormality detection in data storage comprising:
    a storage device storing a set of instructions; and
    one or more processors in communication with the storage device, wherein when executing the set of instructions, the one or more processors are configured to cause the system to:
    obtain, via a network, a plurality of historical data values relating to a service;
    determine a category relating to the plurality of historical data values;
    determine a plurality of predicted values relating to the service based on a prediction model relating to the category, each of the plurality of predicted values corresponding to a time point;
    obtain, via a network, the plurality of actual values relating to the service corresponding to the plurality of predicted values;
    compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result; and
    determine that at least part of the plurality of actual values are abnormal based on the comparison result.
  2. The system of claim 1, wherein the plurality of historical data values form a temporal sequence.
  3. The system of any one of claims 1 or 2, wherein to determine the category relating to the plurality of historical data values, the one or more processors are further configured to cause the system to:
    determine a plurality of feature values relating to the plurality of historical data values; and
    determine the category relating to the plurality of historical data values based on the plurality of feature values.
  4. The system of any one of claims 1-3, wherein the category indicates a characteristic relating to the service, the category including one of growth period with periodicity, stable period with periodicity, fading period with periodicity, growth period with aperiodicity, stable period with aperiodicity, or fading period with aperiodicity.
  5. The system of claim 4, wherein to determine the plurality of predicted values relating to the service based on the prediction model relating to the category, the one or more processors are further configured to cause the system to:
    determine that the category indicating the characteristic relating to the service is associated with periodicity;
    determine a residual function, a trend function and a seasonal function relating to the plurality of historical data values based on the category associated with periodicity;
    generate the prediction model based on the residual function, the trend function, and the seasonal function; and
    determine the plurality of predicted values based on the prediction model.
  6. The system of any one of claims 1-5, wherein to obtain the plurality of actual values relating to the service corresponding to the plurality of predicted values, the one or more processors are further configured to cause the system to:
    obtain time points relating to at least part of the plurality of predicted values; and
    obtain the plurality of actual values based on the time points relating to the at least part of the plurality of predicted values.
  7. The system of any one of claims 1-6, wherein the at least one filter includes a dispersion filter, to determine that the at least part of the plurality of actual values are abnormal based on the comparison result, the one or more processors are further configured to cause the system to:
    determine a statistical value based on the plurality of predicted values and the plurality of actual values using the dispersion filter, the statistical value associated with dispersion degrees of both the plurality of predicted values and the plurality of actual values;
    compare the statistical value with a first threshold; and
    determine that the at least part of the plurality of actual values are abnormal in response to the comparison result that the statistical value is greater than the first threshold.
  8. The system of any one of claims 1-7, wherein the at least one filter includes a threshold filter, to determine that the at least part of the plurality of actual values are abnormal based on the comparison result, the one or more processors are further configured to cause the system to:
    determine a plurality of differences between the plurality of predicted values and the plurality of actual values using the threshold filter;
    determine a plurality of second thresholds based on a time function;
    compare each of the plurality of differences with a corresponding second threshold, each of the plurality of differences and the corresponding second threshold being associated with a same time point; and
    determine that the at least part of the plurality of actual values are abnormal in response to the comparison result that each of the plurality of differences is greater than a corresponding second threshold.
  9. The system of any one of claims 1-8, wherein the at least one filter includes a false alarm filter, to determine that the at least part of the plurality of actual values are abnormal based on the comparison result, the one or more processors are further configured to cause the system to:
    determine a false alarm model based on a pre-labeled data set relating to service data, the pre-labeled data set including a plurality of false alarm results generated by the system; and
    determine that the at least part of the plurality of actual values are abnormal based on the false alarm model.
  10. The system of any one of claims 1-9, wherein the one or more processors are further configured to cause the system to:
    compare the plurality of actual values with the plurality of predicted values using a dispersion filter, a threshold filter, and a false alarm filter to generate a first comparison result, a second comparison result, and a third comparison result, respectively; and
    determine that at least part of the plurality of actual values are abnormal based on the first comparison result, the second comparison result, and the third comparison result.
  11. A method implemented on a computing device having at least one processor, storage, and a communication platform connected to a network for abnormality detection in data storage, the method comprising:
    obtaining, via a network, a plurality of historical data values relating to a service;
    determining a category relating to the plurality of historical data values;
    determining a plurality of predicted values relating to the service based on a prediction model relating to the category, each of the plurality of predicted values corresponding to a time point;
    obtaining, via a network, the plurality of actual values relating to the service corresponding to the plurality of predicted values;
    comparing the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result; and
    determining that at least part of the plurality of actual values are abnormal based on the comparison result.
  12. The method of claim 11, wherein the plurality of historical data values form a temporal sequence.
  13. The method of any one of claims 11 or 12, wherein the determining the category relating to the plurality of historical data values comprises:
    determining a plurality of feature values relating to the plurality of historical data values; and
    determining the category relating to the plurality of historical data values based on the plurality of feature values.
  14. The method of any one of claims 11-13, wherein the category indicates a characteristic relating to the service, the category including one of growth period with periodicity, stable period with periodicity, fading period with periodicity, growth period with aperiodicity, stable period with aperiodicity, or fading period with aperiodicity.
  15. The method of claim 14, wherein the determining the plurality of predicted values relating to the service based on the prediction model relating to the category comprises:
    determining that the category indicating the characteristic relating to the service is associated with periodicity;
    determining a residual function, a trend function and a seasonal function relating to the plurality of historical data values based on the category associated with periodicity;
    generating the prediction model based on the residual function, the trend function, and the seasonal function; and
    determining the plurality of predicted values based on the prediction model.
  16. The method of any one of claims 11-15, wherein the obtaining the plurality of actual values relating to the service corresponding to the plurality of predicted values comprises:
    obtaining time points relating to at least part of the plurality of predicted values; and
    obtaining the plurality of actual values based on the time points relating to the at least part of the plurality of predicted values.
  17. The method of any one of claims 11-16, wherein the at least one filter includes a dispersion filter, the determining that the at least part of the plurality of actual values are abnormal based on the comparison result comprises:
    determining a statistical value based on the plurality of predicted values and the plurality of actual values using the dispersion filter, the statistical value associated with dispersion degrees of both the plurality of predicted values and the plurality of actual values;
    comparing the statistical value with a first threshold; and
    determining that the at least part of the plurality of actual values are abnormal in response to the comparison result that the statistical value is greater than the first threshold.
  18. The method of any one of claims 11-17, wherein the at least one filter includes a threshold filter, the determining that the at least part of the plurality of actual values are abnormal based on the comparison result comprises:
    determining a plurality of differences between the plurality of predicted values and the plurality of actual values using the threshold filter;
    determining a plurality of second thresholds based on a time function;
    comparing each of the plurality of differences with a corresponding second threshold, each of the plurality of differences and the corresponding second threshold being associated with a same time point; and
    determining that the at least part of the plurality of actual values are abnormal in response to the comparison result that each of the plurality of differences is greater than a corresponding second threshold.
  19. The method of any one of claims 11-18, wherein the at least one filter includes a false alarm filter, the determining that the at least part of the plurality of actual values are abnormal based on the comparison result comprises:
    determining a false alarm model based on a pre-labeled data set relating to service data, the pre-labeled data set including a plurality of false alarm results generated by the system; and
    determining that the at least part of the plurality of actual values are abnormal based on the false alarm model.
  20. The method of any one of claims 11-19, further comprising:
    comparing the plurality of actual values with the plurality of predicted values using a dispersion filter, a threshold filter, and a false alarm filter to generate a first comparison result, a second comparison result, and a third comparison result, respectively; and
    determining that at least part of the plurality of actual values are abnormal based on the first comparison result, the second comparison result, and the third comparison result.
  21. A non-transitory computer-readable medium, comprising at least one set of instructions for abnormality detection in data storage, wherein when executed by at least one processor, the at least one set of instructions directs the at least one processor to:
    obtain, via a network, a plurality of historical data values relating to a service;
    determine a category relating to the plurality of historical data values;
    determine a plurality of predicted values relating to the service based on a prediction model relating to the category, each of the plurality of predicted values corresponding to a time point;
    obtain, via a network, the plurality of actual values relating to the service corresponding to the plurality of predicted values;
    compare the plurality of actual values with the plurality of predicted values using at least one filter to generate a comparison result; and
    determine that at least part of the plurality of actual values are abnormal based on the comparison result.
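
For illustration only, the prediction and filtering steps recited in claims 5, 7, 8, and 10 may be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the implementation of the disclosure: the function names, the least-squares straight line used as the trend function, the per-phase mean used as the seasonal function, the difference of standard deviations used as the dispersion statistic, and the rule that a value is reported only when both filters agree are all hypothetical choices made for the example. The false alarm filter of claim 9 depends on a model trained on pre-labeled data and is not modeled here.

```python
from statistics import mean, pstdev


def fit_additive_model(history, period):
    """Return a predictor t -> trend(t) + seasonal(t) fitted to `history`."""
    n = len(history)
    t_bar, y_bar = mean(range(n)), mean(history)
    # Trend function: ordinary least-squares straight line (an assumption).
    slope = (sum((t - t_bar) * (y - y_bar) for t, y in enumerate(history))
             / sum((t - t_bar) ** 2 for t in range(n)))
    intercept = y_bar - slope * t_bar

    def trend(t):
        return intercept + slope * t

    # Seasonal function: mean detrended value at each phase of the period.
    seasonal = [mean(history[t] - trend(t) for t in range(phase, n, period))
                for phase in range(period)]
    # The residual is treated as zero-mean noise, so it adds nothing here.
    return lambda t: trend(t) + seasonal[t % period]


def dispersion_statistic(predicted, actual):
    """Hypothetical statistic: gap between the two standard deviations."""
    return abs(pstdev(actual) - pstdev(predicted))


def threshold_flags(predicted, actual, time_points, threshold_at):
    """Flag each point whose difference exceeds its time-dependent threshold."""
    return [abs(a - p) > threshold_at(t)
            for p, a, t in zip(predicted, actual, time_points)]


def detect_abnormal(predicted, actual, time_points, first_threshold, threshold_at):
    """Return the time points reported as abnormal by both filters."""
    dispersed = dispersion_statistic(predicted, actual) > first_threshold
    flags = threshold_flags(predicted, actual, time_points, threshold_at)
    # Combination rule (an assumption): both filters must agree.
    return [t for t, flagged in zip(time_points, flags) if flagged and dispersed]


# Hypothetical data: linear growth plus a pattern that repeats every 4 points.
pattern = [0, 2, 0, -2]
history = [10 + 0.5 * t + pattern[t % 4] for t in range(16)]
predict = fit_additive_model(history, period=4)
time_points = [16, 17, 18, 19]
predicted = [predict(t) for t in time_points]
actual = [10 + 0.5 * t + pattern[t % 4] for t in time_points]
actual[2] += 10  # inject a spike at time point 18
print(detect_abnormal(predicted, actual, time_points,
                      first_threshold=1.0,
                      threshold_at=lambda t: 2.0 + 0.5 * (t % 4)))
```

With the hypothetical data above, only the injected spike at time point 18 is reported. Requiring agreement between the filters is one possible way of combining comparison results; other combinations (for example, reporting a value when any one filter fires) would be equally consistent with the claim language.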

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/090357 WO2019232773A1 (en) 2018-06-08 2018-06-08 Systems and methods for abnormality detection in data storage
CN201880001318.8A CN110945484B (en) 2018-06-08 2018-06-08 System and method for anomaly detection in data storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/090357 WO2019232773A1 (en) 2018-06-08 2018-06-08 Systems and methods for abnormality detection in data storage

Publications (1)

Publication Number Publication Date
WO2019232773A1 true WO2019232773A1 (en) 2019-12-12

Family

ID=68769638

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/090357 WO2019232773A1 (en) 2018-06-08 2018-06-08 Systems and methods for abnormality detection in data storage

Country Status (2)

Country Link
CN (1) CN110945484B (en)
WO (1) WO2019232773A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113688385A (en) * 2021-07-20 2021-11-23 电子科技大学 Lightweight distributed intrusion detection method
CN113762569A (en) * 2020-10-15 2021-12-07 北京沃东天骏信息技术有限公司 Data processing method, device, equipment and computer readable storage medium
CN114915542A (en) * 2022-04-28 2022-08-16 远景智能国际私人投资有限公司 Data abnormity warning method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102890803A (en) * 2011-07-21 2013-01-23 阿里巴巴集团控股有限公司 Method and device for determining abnormal transaction process of electronic commodity
CN105323111A (en) * 2015-11-17 2016-02-10 南京南瑞集团公司 Operation and maintenance automation system and method
CN105871879A (en) * 2016-05-06 2016-08-17 中国联合网络通信集团有限公司 Automatic network element abnormal behavior detection method and device
CN106126391A (en) * 2016-06-28 2016-11-16 北京百度网讯科技有限公司 System monitoring method and apparatus

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101729301B (en) * 2008-11-03 2012-08-15 中国移动通信集团湖北有限公司 Monitor method and monitor system of network anomaly traffic
CN107153882B (en) * 2016-03-03 2021-10-15 北京嘀嘀无限科技发展有限公司 Method and system for predicting passenger taxi taking time distribution interval
CN107194488B (en) * 2016-03-14 2020-12-22 北京嘀嘀无限科技发展有限公司 Travel information pushing method and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113762569A (en) * 2020-10-15 2021-12-07 北京沃东天骏信息技术有限公司 Data processing method, device, equipment and computer readable storage medium
CN113688385A (en) * 2021-07-20 2021-11-23 电子科技大学 Lightweight distributed intrusion detection method
CN113688385B (en) * 2021-07-20 2023-04-07 电子科技大学 Lightweight distributed intrusion detection method
CN114915542A (en) * 2022-04-28 2022-08-16 远景智能国际私人投资有限公司 Data abnormity warning method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN110945484B (en) 2024-01-19
CN110945484A (en) 2020-03-31

Similar Documents

Publication Publication Date Title
US11631027B2 (en) Systems and methods for allocating service requests
JP6737805B2 (en) System and method for obtaining a forecast distribution of future transportation service points
US11017662B2 (en) Systems and methods for determining a path of a moving device
US11546729B2 (en) System and method for destination predicting
WO2017202112A1 (en) Systems and methods for distributing request for service
WO2018214361A1 (en) Systems and methods for improvement of index prediction and model building
US20200042885A1 (en) Systems and methods for determining an estimated time of arrival
EP3535709A1 (en) Systems and methods for providing information for on-demand services
EP3586285A1 (en) Systems and methods for recommending an estimated time of arrival
AU2017419266B2 (en) Methods and systems for estimating time of arrival
WO2019242259A1 (en) Systems and methods for determining potential malicious event
AU2017411198B2 (en) Systems and methods for route planning
EP3365864B1 (en) Systems and methods for updating sequence of services
WO2017157069A1 (en) Systems and methods for predicting service time point
WO2019109604A1 (en) Systems and methods for determining an estimated time of arrival for online to offline services
WO2019232773A1 (en) Systems and methods for abnormality detection in data storage
WO2021012244A1 (en) Systems and methods for order dispatching
WO2019109756A1 (en) Systems and methods for cheat examination
WO2019100366A1 (en) Systems and methods for distributing on-demand service requests
US20230072625A1 (en) Systems and methods for online to offline services

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18921398

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18921398

Country of ref document: EP

Kind code of ref document: A1