CN111931860B - Abnormal data detection method, device, equipment and storage medium

Info

Publication number: CN111931860B
Application number: CN202010901199.0A
Authority: CN
Inventors: 董善东; 姚华宁; 黄小龙; 梁晓聪; 张加浪; 黄荣庚; 高传泽; 李雄政
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-09-01
Filing date: 2020-09-01
Publication date: 2021-02-09
Anticipated expiration: 2040-09-01
Also published as: CN111931860A

Abstract

Embodiments of the application provide a method, an apparatus, a device, and a storage medium for detecting abnormal data. The method comprises: acquiring historical state data of an object to be monitored within a preset historical duration; predicting a trend baseline of current state data based on the historical state data; predicting a first boundary value and a second boundary value of the trend baseline based on the trend baseline and the historical state data, wherein, at any given moment, the first boundary value is greater than the data value on the trend baseline and the second boundary value is less than the data value on the trend baseline; detecting the current state data based on the first boundary value and the second boundary value to obtain a detection result; and, when the detection result indicates that the current state data is abnormal data, outputting alarm information prompting that abnormal data has occurred. Because the current state data is checked against two boundary values of the trend baseline that are determined in real time, the accuracy of anomaly detection can be improved.

Description

Abnormal data detection method, device, equipment and storage medium

Technical Field

The present application relates to the field of computers, and in particular, to a method, an apparatus, a device, and a storage medium for detecting abnormal data.

Background

In the related art, features are extracted from a time series, a trained supervised model is loaded, and a supervised algorithm is used to perform anomaly detection and identify the abnormal data. Because a time series may be collected at minute-level granularity, the amount of data involved is very large. Moreover, different features or rules must be mined for the time series of different scenarios; as services keep expanding and the types of time series increase, more and more features are required, and considerable manpower is needed for feature mining, system maintenance, and similar work.

Disclosure of Invention

Embodiments of the application provide an abnormal data detection method, apparatus, device, and storage medium, in which current state data is checked against two boundary values of a trend baseline determined in real time to determine whether the current state data is abnormal data, so that the accuracy of anomaly detection can be improved.

The technical scheme of the embodiment of the application is realized as follows:

In a first aspect, an embodiment of the present application provides an abnormal data detection method, including: acquiring historical state data of an object to be monitored within a preset historical duration; predicting a trend baseline of current state data based on the historical state data; predicting a first boundary value and a second boundary value of the trend baseline based on the trend baseline and the historical state data, wherein, at any given moment, the first boundary value is greater than the data value on the trend baseline and the second boundary value is less than the data value on the trend baseline; detecting the current state data based on the first boundary value and the second boundary value to obtain a detection result; and outputting alarm information prompting that abnormal data has occurred when the detection result indicates that the current state data is abnormal data.

In some embodiments, after predicting the first boundary value and the second boundary value of the trend baseline based on the trend baseline and the historical state data, the method further comprises: determining a state characteristic of the historical state data; and adjusting the first boundary value and the second boundary value according to the state characteristic to obtain an adjusted first boundary value and an adjusted second boundary value.

Correspondingly, detecting the current state data based on the first boundary value and the second boundary value to obtain a detection result includes: detecting the current state data based on the adjusted first boundary value and the adjusted second boundary value to obtain the detection result.

In some embodiments, outputting the alarm information prompting that abnormal data has occurred when the detection result indicates that the current state data is abnormal data includes: determining the duration of the abnormal data when the detection result indicates that the current state data is abnormal data; and generating and outputting the alarm information when the duration is greater than or equal to a preset duration.
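The duration condition above can be illustrated with a minimal sketch. It assumes that detection results arrive as per-sample flags at a constant sampling interval, and that a run of n consecutive abnormal samples is treated as lasting n times that interval. The function name `should_alarm` and its parameters are hypothetical and are not taken from the application.

```python
def should_alarm(flags, interval_s, min_duration_s):
    """Return True if the most recent run of abnormal samples lasts long enough.

    flags          -- per-sample detection results, oldest first (True = abnormal)
    interval_s     -- sampling interval in seconds (assumed constant)
    min_duration_s -- preset duration threshold in seconds
    """
    run = 0
    # Count consecutive abnormal samples ending at the newest sample.
    for abnormal in reversed(flags):
        if not abnormal:
            break
        run += 1
    # Assumption: a run of n abnormal samples is treated as lasting n * interval_s.
    return run > 0 and run * interval_s >= min_duration_s
```

With one-minute samples and a preset duration of three minutes, for instance, an alarm would be raised only once the three newest samples are all abnormal; a single isolated abnormal point would not trigger it.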

In a second aspect, an embodiment of the present application provides an abnormal data detection apparatus, where the apparatus includes:

a first acquisition module, configured to acquire historical state data of an object to be monitored within a preset historical duration; a first prediction module, configured to predict a trend baseline of current state data based on the historical state data; a second prediction module, configured to predict a first boundary value and a second boundary value of the trend baseline based on the trend baseline and the historical state data, wherein, at any given moment, the first boundary value is greater than the data value on the trend baseline and the second boundary value is less than the data value on the trend baseline; a first detection module, configured to detect the current state data based on the first boundary value and the second boundary value to obtain a detection result; and a first alarm module, configured to output alarm information prompting that abnormal data has occurred when the detection result indicates that the current state data is abnormal data.

In some embodiments, the second prediction module is further configured to determine a state characteristic of the historical state data; adjusting the first boundary value and the second boundary value according to the state characteristic to obtain an adjusted first boundary value and an adjusted second boundary value;

the first detection module is further configured to detect the current state data based on the adjusted first boundary value and the adjusted second boundary value, so as to obtain the detection result.

In some embodiments, the first alarm module is further configured to determine the duration of the abnormal data when the detection result indicates that the current state data is abnormal data, and to generate and output the alarm information when the duration is greater than or equal to a preset duration.

In a third aspect, an embodiment of the present application provides a device for detecting abnormal data, including: a memory for storing executable instructions; and a processor for implementing the method provided by the embodiments of the present application when executing the executable instructions stored in the memory.

In a fourth aspect, an embodiment of the present application provides a storage medium storing executable instructions for causing a processor to implement a method provided by an embodiment of the present application when the processor executes the executable instructions.

The embodiments of the application have the following beneficial effects. For an object to be monitored that is related to an information service, the monitoring system first predicts a trend baseline of the current state data based on the acquired historical state data, and then predicts a first boundary value and a second boundary value of the trend baseline based on the trend baseline and the historical state data. Because the first and second boundary values are generated in real time, manual threshold setting is no longer needed; and because the two boundary values are predicted from the trend baseline and the historical state data, a user can inspect both the generated boundary values and the detection result, which makes the detection result more intuitive and easier to interpret. Finally, the current state data is checked against the first and second boundary values determined in real time to decide whether it is abnormal data, so that the accuracy of anomaly detection can be improved.

Drawings

FIG. 1 is an optional architecture diagram of an abnormal data detection system provided by an embodiment of the present application;

FIG. 2A is another optional architecture diagram of an abnormal data detection system provided by an embodiment of the present application;

FIG. 2B is a schematic structural diagram of an abnormal data detection system provided by an embodiment of the present application;

FIG. 3 is a schematic flow chart of an implementation of an abnormal data detection method provided by an embodiment of the present application;

FIG. 4 is a schematic flow chart of another implementation of the abnormal data detection method provided by an embodiment of the present application;

FIG. 5A is a diagram illustrating the fluctuation of a dynamic threshold detection curve provided by an embodiment of the present application;

FIG. 5B is a schematic diagram of another structure of the abnormal data detection system provided by an embodiment of the present application;

FIG. 6 is a schematic diagram of an implementation framework of an abnormal data detection method provided by an embodiment of the present application;

FIG. 7 is a page diagram of dynamic threshold configuration provided by an embodiment of the present application.

Detailed Description

To make the objectives, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be considered as limiting the application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the application.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.

In the following description, the terms "first/second/third" are used only to distinguish similar objects and do not denote a particular order of the objects. It is understood that, where permitted, "first/second/third" may be interchanged in a specific order or sequence, so that the embodiments of the application described herein can be practiced in an order other than that shown or described herein.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.

Before the embodiments of the present application are described in further detail, the terms and expressions used in the embodiments are explained; the following explanations apply to them.

1) Time series: a sequence of data points arranged in chronological order. The time interval of a time series is constant (e.g., 10 seconds, 1 minute, 5 minutes). In the embodiments of the present application, time series mainly refers to monitoring time series.

2) Time series anomaly: an anomaly in a time series shows up as a sudden rise or fall of the traffic curve, or as fluctuation after the curve deviates from its normal value.

3) Alarm strategy: comprises three essential components, namely a name, a type, and an alarm triggering condition.

4) Moving Average (MA): a computational method that analyzes data points by creating a series of averages of different subsets of the full data set. Moving average methods include simple moving averages, cumulative moving averages, and weighted moving averages.

5) Autoregressive Moving Average (ARMA) model: a model applied to time series that combines an Autoregressive (AR) model with a Moving Average (MA) model.

6) Autoregressive Integrated Moving Average (ARIMA) model: also known as the differenced integrated moving average autoregressive model, one of the methods of time series prediction analysis. The ARIMA model is an extension of the ARMA model.

7) Blockchain: an encrypted, chained transactional storage structure formed of blocks.

8) Blockchain network: the set of nodes that incorporate new blocks into the blockchain by consensus.

9) Cloud technology: a general term for the network technology, information technology, integration technology, management platform technology, application technology, and so on that are applied under the cloud computing business model. These can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support: the background services of technical network systems, such as video websites, picture websites, and web portals, require a large amount of computing and storage resources. As the internet industry develops and its applications spread, every item may come to have its own identifier that must be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require strong back-end system support, which can only be provided through cloud computing.

10) Cloud storage: a new concept extended and developed from the concept of cloud computing. A distributed cloud storage system (hereinafter, storage system) is a storage system that uses functions such as cluster applications, grid technology, and distributed storage file systems to bring a large number of storage devices of various types in a network (storage devices are also referred to as storage nodes) to work together through application software or application interfaces, and that provides data storage and service access functions to the outside.
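To make term 4) concrete, the three moving average variants named there can be sketched as follows. This is an illustrative sketch only; the function names are hypothetical, and the convention that the last weight applies to the newest value is an assumption rather than something stated in the application.

```python
def simple_moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


def cumulative_moving_average(values):
    """Mean of all values seen so far, one output per input."""
    out, total = [], 0.0
    for n, v in enumerate(values, start=1):
        total += v
        out.append(total / n)
    return out


def weighted_moving_average(values, weights):
    """Weighted mean over a sliding window; weights[-1] applies to the newest value."""
    w, norm = len(weights), sum(weights)
    return [sum(x * k for x, k in zip(values[i - w + 1:i + 1], weights)) / norm
            for i in range(w - 1, len(values))]
```

The simple and weighted variants produce one output per full window, so their results are shorter than the input by the window length minus one; the cumulative variant produces one output per input.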

In the course of providing services internally and externally, an enterprise information department needs to monitor the status of a large amount of infrastructure (e.g., servers, databases, networks) and of its business (e.g., user online volume, access success rate). When a service problem occurs, operation and maintenance personnel need to learn of it quickly and then handle and recover from it, so that a stable and fast service is ensured. In the monitoring method of the related art, data generated while the infrastructure or service is running is first collected and reported to form time-series indicators, and the state of the corresponding infrastructure or service is then monitored through those indicators. In some embodiments, static thresholds are set manually: a static threshold is set for each service according to the experience of the operation and maintenance personnel, and an abnormality alarm is triggered when a time-series value goes outside the set threshold range. However, as the business grows, static thresholds become costly to maintain and labor-intensive, and their detection effect depends on expert experience. To address this, the related art adopts machine learning: a large amount of data is labeled, suitable features are designed through feature engineering that relies on algorithms or on the experience of operation and maintenance experts, and a machine learning model is used to perform anomaly detection on the state data. This alleviates the high maintenance cost and poor detection effect of manually set static thresholds, but it depends on extensive data labeling and feature engineering, and as services continue to develop and evolve, the labeling and feature engineering work must keep pace. Meanwhile, the detection result of a machine learning classification model is only "abnormal" or "not abnormal", and lacks intuitive interpretability.

Based on this, embodiments of the application provide an abnormal data detection method, apparatus, device, and storage medium. The baseline trend and the dynamic upper and lower thresholds of a time series are learned adaptively from historical time-series data, yielding an end-to-end detection model that avoids tedious feature engineering and manual threshold setting and reduces labor cost. The abnormal data detection system provided by the embodiments also has good accuracy and recall and is easier to extend to new scenarios. Specifically, historical state data is learned adaptively with models such as ARIMA, which gives high detection accuracy and recall; the computed dynamic upper and lower boundaries serve as the basis for decisions, which makes the detection result readily interpretable; and the whole process relies on statistical and unsupervised machine learning methods, which avoids a large amount of data labeling and facilitates extension to new scenarios.
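For reference, the ARMA and ARIMA models mentioned above are conventionally written as below. These are the standard textbook forms, not formulas taken from the application; the symbols (constant c, autoregressive coefficients \varphi_i, moving-average coefficients \theta_j, white-noise term \varepsilon_t, and backshift operator B) follow common usage.

```latex
% ARMA(p, q): autoregressive part plus moving-average part
X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}

% ARIMA(p, d, q): the same ARMA structure applied to the d-times differenced series,
% where B is the backshift operator, B X_t = X_{t-1}
\Bigl(1 - \sum_{i=1}^{p} \varphi_i B^i\Bigr)(1 - B)^d X_t
  = c + \Bigl(1 + \sum_{j=1}^{q} \theta_j B^j\Bigr)\varepsilon_t
```

Setting d = 0 reduces the ARIMA form to the ARMA form, which is the sense in which ARIMA extends ARMA.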

An exemplary application of the abnormal data detection device provided in the embodiments of the present application is described below. The device may be implemented as various types of user terminals or as a server; an exemplary application in which the device is implemented as a terminal or a server is described below. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in this application.

Referring to fig. 1, fig. 1 is an optional architecture diagram of an abnormal data detection system provided in an embodiment of the present application. To support an exemplary application, for the infrastructure and service 11 to be detected, historical state data 12 of the infrastructure and service 11 within a historical period is first obtained, and a trend baseline 13 of the state data of the infrastructure or service is predicted based on the historical state data 12. Then, the trend baseline 13 and the historical state data 12 are considered together to predict a dynamic boundary value 14 of the trend baseline. In this way, the tedious work of feature engineering is eliminated and detection thresholds need not be set manually; instead, the dynamic boundary value 14 (comprising the first boundary value and the second boundary value) used to detect the current state data is generated adaptively and dynamically by learning from the historical state data. Finally, the current state data 15 is checked against the boundary values of the trend baseline 13 at the current moment to determine whether it is abnormal data, and the detection result is fed back to the client. From the generated dynamic threshold, a user can readily see why the current state data of the corresponding service was detected as abnormal (for example, because it exceeds the boundary of the dynamic threshold), so the product effect is intuitive.
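The flow of fig. 1 (historical data, trend baseline, dynamic boundary values, detection) can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the application: a moving average stands in for an ARIMA-style baseline predictor, and the boundary values are taken as the baseline plus or minus k times the spread of recent residuals. All function names, the window size, and the factor k are hypothetical.

```python
import statistics


def predict_baseline(history, window):
    """Predict the next value as the mean of the last `window` points.
    Assumption: a moving average stands in for the ARIMA-style predictor."""
    return statistics.fmean(history[-window:])


def predict_bounds(history, baseline, window, k=3.0):
    """Hypothetical boundary rule: baseline +/- k * spread of recent residuals."""
    # Residual of each historical point against the moving average that precedes it.
    residuals = [history[i] - statistics.fmean(history[i - window:i])
                 for i in range(window, len(history))]
    spread = statistics.pstdev(residuals) if len(residuals) > 1 else 0.0
    return baseline + k * spread, baseline - k * spread  # (first, second) boundary


def detect(current, upper, lower):
    """Current state data is abnormal if it falls outside [lower, upper]."""
    return current > upper or current < lower


# Hypothetical historical state data (e.g., user online volume per minute).
history = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98]
baseline = predict_baseline(history, window=5)
upper, lower = predict_bounds(history, baseline, window=5)
print(detect(100, upper, lower), detect(150, upper, lower))
```

Because the boundaries are recomputed from the most recent history each time, no threshold has to be configured by hand, and the boundary values themselves can be shown to the user alongside the detection result.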

Referring to fig. 2A, fig. 2A is another alternative architecture schematic diagram of the abnormal data detection system provided in the embodiment of the present application, including a blockchain network 20 (exemplarily showing a server 200 as a native node), a monitoring system 30 (exemplarily showing a device 300 belonging to the monitoring system 30 and a graphical interface 301 thereof), which are described below.

The type of the blockchain network 20 is flexible; it may be, for example, any of a public chain, a private chain, or a consortium chain. Taking a public chain as an example, electronic devices of any service entity, such as user equipment and servers, can access the blockchain network 20 without authorization; taking a consortium chain as an example, an electronic device (e.g., a device or server) under the jurisdiction of a service entity may access the blockchain network 20 after obtaining authorization, at which point the service entity becomes a special kind of node in the blockchain network 20, namely a terminal node.

It should be noted that a terminal node may provide only the functionality that supports the service entity in initiating transactions (e.g., storing data on the chain or querying on-chain data). The functions of native nodes of the blockchain network 20, such as the ordering function, consensus service, and ledger function described below, may be implemented by the terminal node by default or selectively (e.g., depending on the specific business requirements of the service entity). In this way, the data and service processing logic of the service entity can be migrated to the blockchain network 20 to the greatest extent, and the credibility and traceability of the data and of the service processing are achieved through the blockchain network 20.

Blockchain network 20 receives a transaction submitted from an end node (e.g., device 300 shown in fig. 2A belonging to monitoring system 30) of a business entity (e.g., monitoring system 30 shown in fig. 2A), executes the transaction to update or query the ledger, and displays various intermediate or final results of executing the transaction on a user interface (e.g., graphical interface 301 of device 300) of the device.

An exemplary application of the blockchain network is described below, taking as an example the monitoring system accessing the blockchain network to put abnormal data detection on the chain.

The device 300 of the monitoring system 30 accesses the blockchain network 20 and becomes a terminal node of the blockchain network 20. The device 300 acquires, through a sensor, historical state data of the state of the infrastructure or service, and the final detection result is fed back to the server 200 in the blockchain network 20 or stored in the device 300. When upload logic has been deployed on the device 300 or the user performs an operation, the device 300 generates a transaction corresponding to an update operation or query operation according to the to-be-processed task or synchronous time query request. The transaction specifies the smart contract to be called to implement the update or query operation and the parameters passed to the smart contract, and also carries a digital signature of the monitoring system 30 (for example, obtained by encrypting a digest of the transaction with the private key in the digital certificate of the monitoring system 30). The device 300 then broadcasts the transaction to the blockchain network 20. The digital certificate can be obtained by the monitoring system 30 registering with the certificate authority 31.

When a native node in the blockchain network 20, for example the server 200, receives the transaction, it verifies the digital signature carried by the transaction. After the digital signature is verified successfully, the node determines, according to the identity of the monitoring system 30 carried in the transaction, whether the monitoring system 30 has permission to transact. Failure of either the digital signature verification or the permission verification causes the transaction to fail. After successful verification, the native node appends its own digital signature (e.g., by encrypting a digest of the transaction using the native node's private key) and continues to broadcast the transaction in the blockchain network 20.

After a node with the ordering function in the blockchain network 20 receives a successfully verified transaction, it fills the transaction into a new block and broadcasts the block to the nodes providing the consensus service in the blockchain network 20.

The nodes in the blockchain network 20 that provide the consensus service perform a consensus process on the new block to reach agreement, the nodes that provide the ledger function append the new block to the end of the blockchain, and perform the transaction in the new block: for an abnormal data detection request initiated by a terminal, the current state data can be detected by adaptively determining two boundary values of the trend baseline, and the identified abnormal data is displayed in a graphical interface 301 of the device 300.

The native node in the blockchain network 20 may read the state data of the object to be monitored from the blockchain, and present the detection result in the monitoring page of the native node, or the native node may detect the current state data of the state of the infrastructure or the service by using the state of the infrastructure or the service stored in the blockchain.

In practical applications, different functions may be set for different native nodes of the blockchain network 20; for example, the server 200 may be configured with an abnormal data detection function and an accounting function. In this case, the server 200 may receive, in a transaction, the abnormal data detection request sent by the device 300. In the server 200, the trend baseline of the state data is determined adaptively based on the historical state data, and the dynamic upper and lower boundary values of the trend baseline are then determined; performing anomaly detection on the current state data with these dynamic upper and lower boundary values can improve the accuracy of anomaly detection.

Referring to fig. 2B, fig. 2B is a schematic structural diagram of an abnormal data detection system according to an embodiment of the present application. The device 400 shown in fig. 2B includes: at least one processor 410, a memory 450, at least one network interface 420, and a user interface 430. The various components in the device 400 are coupled together by a bus system 440. It is understood that the bus system 440 is used to enable communication among the components. In addition to a data bus, the bus system 440 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, however, the various buses are all labeled as bus system 440 in fig. 2B.

The processor 410 may be an integrated circuit chip having signal processing capabilities such as a general purpose processor, a digital signal processor, or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc., wherein the general purpose processor may be a microprocessor or any conventional processor, etc.

The user interface 430 includes one or more output devices 431 that enable the presentation of media content, including one or more speakers and/or one or more visual displays. The user interface 430 also includes one or more input devices 432, including user interface components that facilitate user input, such as a keyboard, a mouse, a microphone, a touch screen display, a camera, and other input buttons and controls.

The memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 450 optionally includes one or more storage devices physically located remote from processor 410.

The memory 450 includes either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a Random Access Memory (RAM). The memory 450 described in embodiments herein is intended to comprise any suitable type of memory.

In some embodiments, memory 450 is capable of storing data, examples of which include programs, modules, and data structures, or a subset or superset thereof, to support various operations, as exemplified below.

An operating system 451, including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;

a network communication module 452 for communicating with other computing devices via one or more (wired or wireless) network interfaces 420, exemplary network interfaces 420 including: Bluetooth, Wireless Fidelity (WiFi), and Universal Serial Bus (USB), etc.;

a presentation module 453 for enabling presentation of information (e.g., user interfaces for operating peripherals and displaying content and information) via one or more output devices 431 (e.g., display screens, speakers, etc.) associated with user interface 430;

an input processing module 454 for detecting one or more user inputs or interactions from one of the one or more input devices 432 and translating the detected inputs or interactions.

In some embodiments, the apparatus provided by the embodiments of the present application may be implemented in software. Fig. 2B illustrates a server 455 stored in the memory 450, which may be software in the form of programs, plug-ins, and the like, and which includes the following software modules: a first acquisition module 4551, a first prediction module 4552, a second prediction module 4553, a first detection module 4554, and a first alarm module 4555. These modules are logical and may therefore be combined or further split according to the functions implemented. The functions of the respective modules are explained below.

In other embodiments, the apparatus provided in the embodiments of the present application may be implemented in hardware. For example, the apparatus may be a processor in the form of a hardware decoding processor that is programmed to execute the abnormal data detection method provided in the embodiments of the present application; such a processor may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field-Programmable Gate Arrays (FPGAs), or other electronic components.

The abnormal data detection method provided by the embodiment of the present application will be described in conjunction with exemplary applications and implementations of the device provided by the embodiment of the present application.

Referring to fig. 3, fig. 3 is a schematic implementation flow diagram of the abnormal data detection method provided in an embodiment of the present application. The method is applied to a monitoring system to implement anomaly detection on state data of an object to be monitored that is related to an information service, and is described below with reference to the steps shown in fig. 3.

Step S301, obtaining historical state data of the object to be monitored in a preset historical time.

In some embodiments, the information service includes services provided by an enterprise information department both internally and externally, and the object to be monitored includes the infrastructure state and the business state in the provided information service, wherein the infrastructure includes: servers, databases, networks, etc.; the business includes: user online volume, user access success rate, etc. The historical state data is a Key Performance Indicator (KPI) of the state of the infrastructure or the business; in terms of data representation, the state data is time-series data arranged in chronological order of occurrence.

In some possible implementations, taking the current moment as the starting point and counting backward, a historical time period that meets a preset duration is first determined; for example, the preset duration may be set to 20 minutes or to 1 hour. Then, historical state data of the state of the infrastructure or business within that historical period is obtained. The historical state data may be continuous data within the historical duration, or data collected at certain time intervals within the historical duration. For example, if the monitored state is the user online volume of a business, the historical state data may be time-series data of the user online volume in the 20 minutes before the current time.

Step S302, based on the historical state data, the trend baseline of the current state data is predicted.

In some embodiments, a trend baseline is predicted for the data point of the infrastructure or business state data that has not yet been detected at the current time, by analyzing the statistics of the historical state data. For example, the historical state data is time-series data of the past 1 hour counted back from the current time; since the state data at the current time has not yet been acquired, the trend baseline of the undetected data point at the current time is predicted by analyzing this time-series data.

In some possible implementations, the historical time series (namely, the historical state data) is processed with a moving average or a weighted moving average to obtain a fitted time series, which gives the predicted trend baseline; an ARMA model may also be used to process the historical time series to obtain a fitted time series and thereby the predicted trend baseline. In this way, the trend baseline of the state data at the current time is predicted in real time from the historical time series, so that processes such as manual setting can be omitted and the continuous manual modification and maintenance otherwise required as services develop can be reduced; moreover, because the trend baseline is determined through a statistical machine learning algorithm, abnormal points can be distinguished at the millisecond level and calculation efficiency can be improved.
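As a non-limiting illustration, the moving-average and weighted-moving-average fitting mentioned above may be sketched as follows. The function names `moving_average_baseline` and `predict_next`, the window length and the example weights are assumptions made only for this sketch and are not specified by the embodiments; using the last fitted value as the estimate for the current time is likewise one possible choice.

```python
from typing import List, Optional, Sequence


def moving_average_baseline(history: Sequence[float], window: int = 5,
                            weights: Optional[Sequence[float]] = None) -> List[float]:
    """Fit a trailing (optionally weighted) moving average over `history`.

    Element i of the result averages the last `window` points ending at i
    (fewer points are available at the start of the series).
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    if weights is not None and len(weights) != window:
        raise ValueError("weights must have exactly `window` entries")
    fitted = []
    for i in range(len(history)):
        chunk = history[max(0, i - window + 1): i + 1]
        if weights is None:
            fitted.append(sum(chunk) / len(chunk))
        else:
            # Align the newest weight with the newest data point.
            w = weights[-len(chunk):]
            fitted.append(sum(x * wi for x, wi in zip(chunk, w)) / sum(w))
    return fitted


def predict_next(history: Sequence[float], window: int = 5,
                 weights: Optional[Sequence[float]] = None) -> float:
    """Assumed choice: the last fitted value is the baseline estimate for the
    not-yet-observed data point at the current time."""
    return moving_average_baseline(history, window, weights)[-1]


history = [10, 12, 11, 13, 12]
baseline = moving_average_baseline(history, window=3)
```

With weights such as `[1, 2, 3]`, more recent points contribute more to the baseline, which makes the fitted curve follow recent changes more closely.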

Step S303, predicting a first boundary value and a second boundary value of the trend baseline based on the trend baseline and the historical state data.

In some embodiments, at the same time point, the first boundary value is greater than the data value in the trend baseline, and the second boundary value is less than the data value in the trend baseline. The first and second boundary values may be understood as the upper and lower boundary values of the trend baseline, respectively, i.e. the maximum value and the minimum value of the trend baseline at any given time point. By comprehensively considering the trend baseline and the historical state data, the upper and lower boundaries are generated in real time on the basis of the trend baseline, so that the two boundary values of the trend baseline at the current moment can be predicted in real time, and the cost of manually maintaining a static threshold can be saved.

In some possible implementations, a difference sequence is first obtained by determining the difference between the trend baseline and the historical state data at each time point; then, the differences in the difference sequence are combined with the trend baseline by time point to obtain the boundary values at each time point. For example, for the current time point, the difference for the current time point is combined with the trend baseline at the current time point, so that the first boundary value and the second boundary value of the trend baseline at the current time can be predicted.

Step S304, detecting the current state data based on the first boundary value and the second boundary value to obtain a detection result.

In some embodiments, the current state data is a data point that needs to be detected at the current time, and a boundary value of the trend baseline at the current time is adopted to determine whether the current state data exceeds the boundary value, if the boundary value is exceeded, the current state data is an abnormal data point, and if the boundary value is not exceeded, the current state data is a normal data point.

Step S305, outputting alarm information for prompting the occurrence of abnormal data when the detection result shows that the current state data is abnormal data.

In some embodiments, if the current state data exceeds the range between the first boundary value and the second boundary value, that is, the current state data is greater than the first boundary value or less than the second boundary value, indicating that the current state data is abnormal data, generating and outputting alarm information according to an alarm policy set by a user (for example, if the state data in three consecutive periods are all detected as abnormal data, determining the current state data as abnormal data, and outputting an alarm prompt in a voice or text manner), so as to prompt the user that the abnormal data are detected, so that the user can check which state data are abnormal in time, and the interpretability of the detection effect is good.
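The example alarm policy mentioned above (alarming only when the state data in three consecutive periods are all detected as abnormal) may be sketched as follows. The class name `ConsecutiveAlarmPolicy` and its interface are hypothetical and serve only to illustrate one way such a policy could be expressed.

```python
from collections import deque


class ConsecutiveAlarmPolicy:
    """Signal an alarm only after `required` consecutive abnormal periods."""

    def __init__(self, required: int = 3):
        self.required = required
        # Keeps only the most recent `required` detection results.
        self.recent = deque(maxlen=required)

    def update(self, value: float, upper: float, lower: float) -> bool:
        """Record one period and return True when an alarm should be output."""
        is_abnormal = value > upper or value < lower
        self.recent.append(is_abnormal)
        return len(self.recent) == self.required and all(self.recent)


policy = ConsecutiveAlarmPolicy(required=3)
alarms = [policy.update(v, upper=10, lower=0) for v in [11, 12, 5, 11, 12, 13]]
```

In this usage example, only the last point triggers an alarm, because the normal value 5 interrupts the first run of abnormal points.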

In the embodiment of the application, the process of manually setting the threshold is omitted by generating the two boundary values of the trend baseline in real time, the two boundary values of the trend baseline are predicted by the trend baseline and historical state data, and a user can check the generated boundary values and the detection result, so that the detection result is more intuitive and the result interpretability is higher. And the boundary value of the trend baseline at the current moment is adopted to detect the current state data so as to determine whether the current state data is abnormal data, so that the accuracy of abnormal data detection can be improved.

In some embodiments, the abnormal data detection method is applied to a monitoring system to implement anomaly detection on state data of an object to be monitored that is related to an information service, for example, anomaly detection on servers, databases, networks and the like in the infrastructure, and on the user online volume, user access success rate and the like in the business. The historical state data is time-series data acquired within a preset historical duration, and the trend baseline of the state data at the current moment is predicted from the historical state data; the trend baseline may be understood as the predicted ideal value of the time-series data point at the current time. Then, the error between the trend baseline and the historical state data is determined, and the upper and lower boundaries (namely, the first boundary value and the second boundary value) of the ideal value (namely, the value of the trend baseline at the current moment) are predicted. If the actual state data at the current moment is detected to fall outside the range between the first boundary value and the second boundary value, the current state data is abnormal, and alarm information is sent out to prompt the user that abnormal data has been detected. In implementation, a person skilled in the art may obtain the historical state data of the infrastructure or business actually to be detected and, based on that historical state data, adaptively predict the trend baseline of the state data at the current time and the dynamic boundary values of the trend baseline, including the first boundary value and the second boundary value.
When the detection result indicates that the current state data is greater than the first boundary value or less than the second boundary value, the current state data is abnormal data; alarm information is generated based on an alarm policy set by the user and output in the alarm mode set in that policy. For example, the set alarm mode may be to label the abnormal data and send the labeled abnormal data to the user side for viewing. In this way, the user can readily understand why the state data of the corresponding service was detected as abnormal (the current point exceeds the dynamic threshold boundary), which makes the result very intuitive.

In some embodiments, the dynamic real-time generation of the trend baseline for the historical status data, i.e. step S302, may be implemented by:

Step S321, fitting the historical state data to obtain fitted data.

In some possible implementations, a moving average or a weighted moving average may be used to fit the historical state data to obtain fitted data. The historical state data is time sequence data, and a fitted sequence is obtained by fitting the time sequence data. Step S321 may be implemented by the following procedure:

Firstly, verifying the historical state data by adopting a preset verification condition to obtain a verification result.

Here, the preset verification condition is a condition for verifying the reasonableness of the data and is related to the data characteristics of the historical state data. It includes: whether the preset number of data points is met, whether there are breakpoints or missing values, and whether the data values satisfy a preset threshold. For example, if the historical state data is flow data at different time points within the historical duration, it is verified whether the number of flow data points meets the preset number of data points, whether the flow data at some time point is missing, whether the flow data at some time point is too large or too small, and so on. The verification result is either passed or not passed. Passing indicates that the historical state data meets the preset verification condition: for example, the number of flow data points meets the preset number of data points, no flow data is missing at any time point, and the flow data at every time point is within the threshold range. Failing indicates that the number of flow data points does not meet the preset number of data points, or the flow data at some time point is missing, or the flow data at some time point is too large or too small, and so on.

And then, updating the historical state data based on the verification result to obtain verified data meeting preset verification conditions.

In some possible implementations, if the verification result indicates that the historical state data has not passed verification, the historical state data is supplemented or adjusted according to the specific verification condition that it fails to satisfy, so that the updated state data satisfies the preset verification condition and the verified data is obtained. In a specific example, if the historical state data is flow data and the number of flow data points does not meet the preset number of data points, additional data points are collected in chronological order from the historical period of the historical state data and added to the historical state data. For example, if the original historical state data was acquired every 30 seconds within the past 20 minutes, data points are instead acquired every 10 seconds, and the newly acquired data points are added to the original historical state data so that the number of data points of the supplemented historical state data satisfies the preset number of data points.

In another example, the historical state data is flow data and some of the flow data is too large or too small (for example, the flow value is greater than the set maximum flow value or less than the set minimum flow value). If only a few data points are too large or too small, these data points can be eliminated; if many data points are too large or too small, the historical state data of another historical period can be selected instead.
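The verification and updating described above may be sketched as follows, purely for illustration. The function names `validate_history` and `drop_out_of_range`, the use of `None` to represent a missing data point, and the `max_dropped` limit that separates "few" from "many" out-of-range points are all assumptions of this sketch.

```python
from typing import List, Optional, Sequence, Tuple


def validate_history(points: Sequence[Optional[float]], min_points: int,
                     low: float, high: float) -> Tuple[bool, List[str]]:
    """Check point count, missing values and value range.

    Returns (passed, reasons); `reasons` lists every condition that failed.
    """
    reasons = []
    if len(points) < min_points:
        reasons.append("too few data points")
    if any(p is None for p in points):
        reasons.append("missing data points")
    if any(p is not None and not (low <= p <= high) for p in points):
        reasons.append("values outside the allowed range")
    return (not reasons, reasons)


def drop_out_of_range(points: Sequence[Optional[float]], low: float, high: float,
                      max_dropped: int = 2) -> Optional[List[float]]:
    """Eliminate out-of-range points (missing points are skipped as well).

    Returns None when more than `max_dropped` points are out of range,
    signalling that another historical period should be selected.
    """
    kept = [p for p in points if p is not None and low <= p <= high]
    dropped = sum(1 for p in points if p is not None and not (low <= p <= high))
    return kept if dropped <= max_dropped else None


passed, reasons = validate_history([1.0, None, 30.0], min_points=5, low=0, high=10)
```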

Then, normalization processing is carried out on the verified data to obtain normalized data.

In some possible implementations, the verified data is subjected to data preprocessing, i.e., min-max normalization, so that the values of the normalized data are unified into the range [0, 1].

And finally, fitting the normalized data to obtain fitted data.

In some possible implementations, the normalized data may be fitted in a variety of ways, for example, the normalized data is fitted by using a moving average, a weighted moving average, an AR model or an ARIMA model, etc. to obtain the fitted data.
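The normalization and fitting steps may be illustrated by the following minimal sketch, which assumes min-max scaling followed by a trailing moving average as the fitting method. The function names, the window length and the handling of a constant series are assumptions of this sketch; an AR or ARIMA model could be substituted for the fitting function.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(values: Sequence[float]) -> Tuple[List[float], float, float]:
    """Scale values into [0, 1]; also return (min, max) so that fitted
    results can be mapped back to the original scale if needed."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant series: avoid division by zero (assumed convention).
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def fit_moving_average(values: Sequence[float], window: int = 3) -> List[float]:
    """Trailing moving average used here as the fitting step."""
    fitted = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        fitted.append(sum(chunk) / len(chunk))
    return fitted


verified = [100.0, 120.0, 110.0, 140.0, 130.0]
normalized, lo, hi = min_max_normalize(verified)
fitted = fit_moving_average(normalized, window=2)
```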

Step S322, determining the trend baseline based on the fitted data.

In some possible implementations, the fitted data may be taken as a trend baseline, that is, the trend baseline may be a fitted time series; the trend baseline can also be obtained by connecting the fitted data.

In the embodiment of the application, the boundary value is generated dynamically and in real time from the historical state data, so that the boundary value can better adapt to the dynamic development of the state of the infrastructure or the business over its whole life cycle.

In some embodiments, in order to accurately predict the first boundary value and the second boundary value of the trend baseline in real time, step S303 may be implemented by steps shown in fig. 4, where fig. 4 is a schematic flow chart of another implementation of the abnormal data detection method provided in this embodiment of the application, and the following description is performed with reference to the steps shown in fig. 3 and fig. 4:

step S401, determining a difference value between the trend baseline and the historical state data of the same time point to obtain a difference value sequence.

In some possible implementations, the trend baseline is the fitted time-series data obtained by fitting the historical state data and includes a plurality of data points. For each data point, the difference between the data point and the historical state data at the time point to which the data point belongs is computed, giving a difference at each time point and thus a difference sequence.

In step S402, the mean and variance of the difference sequence are determined.

In some possible implementations, the differences in the difference sequence are averaged; the mean includes the mean absolute error and the mean relative error of the difference sequence, where the mean absolute error is the average of the absolute value of each difference in the difference sequence, and the mean relative error is the average of the differences between the trend baseline and the historical state data at the same time point. Then, the variance of the difference sequence is calculated based on the mean absolute error and the mean relative error.

In step S403, a first boundary value and a second boundary value are determined according to the mean and the variance.

In some possible implementations, the adjustment amount is determined by the mean and the variance, and then the adjustment amount is considered comprehensively on the basis of the trend baseline, so as to obtain the maximum value and the minimum value of the boundary value, which can be implemented by the following processes:

firstly, adjusting the variance by adopting a preset adjusting parameter to obtain an adjusted variance.

Here, preset adjustment parameters are used to indicate different sensitivities of data fluctuation, and may be set according to the state characteristics. In a specific example, the adjusted variance is obtained by multiplying a preset adjustment parameter by the variance.

Secondly, summing the mean and the adjusted variance to obtain the boundary adjustment amount.

Here, the mean and the adjusted variance are added to obtain a boundary adjustment amount, which is used to adjust the up-and-down fluctuation range of the trend baseline.

And thirdly, determining the sum of the trend baseline and the boundary adjustment amount as a first boundary value of the trend baseline.

Here, the sum of the trend baseline and the boundary adjustment amount at the same time point is determined as the first boundary value of the trend baseline, that is, the upper boundary value (e.g., a maximum boundary value) at that time point.

And finally, determining the difference between the trend baseline and the boundary adjustment amount as a second boundary value of the trend baseline.

Here, the difference between the trend baseline and the boundary adjustment amount at the same time point is determined as a second boundary value of the trend baseline, that is, a lower boundary value (for example, a minimum boundary value) at the same time point.
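The four sub-steps above may be sketched as follows. This is an illustrative sketch only: the function name `dynamic_bounds`, the parameter `k` (standing in for the preset adjustment parameter), the choice of the mean absolute error as the "mean", and the use of the population variance are assumptions, since the embodiments do not fix these details.

```python
from statistics import fmean, pvariance
from typing import List, Sequence, Tuple


def dynamic_bounds(baseline: Sequence[float], history: Sequence[float],
                   k: float = 3.0) -> Tuple[List[float], List[float]]:
    """Return (first, second) boundary sequences around `baseline`."""
    if len(baseline) != len(history):
        raise ValueError("baseline and history must cover the same time points")
    # Step S401: difference sequence between baseline and historical data.
    diffs = [b - h for b, h in zip(baseline, history)]
    # Step S402: mean (assumed here: mean absolute error) and variance.
    mean_abs_error = fmean([abs(d) for d in diffs])
    variance = pvariance(diffs)
    # Step S403: boundary adjustment = mean + k * variance.
    adjustment = mean_abs_error + k * variance
    upper = [b + adjustment for b in baseline]   # first boundary value
    lower = [b - adjustment for b in baseline]   # second boundary value
    return upper, lower


baseline = [10.0, 10.0, 10.0, 10.0]
history = [9.0, 11.0, 9.0, 11.0]
upper, lower = dynamic_bounds(baseline, history, k=2.0)
```

A larger `k` widens the band and makes detection less sensitive to fluctuation; a smaller `k` narrows it.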

In other possible implementations, a mean square error of the difference sequence may also be determined, from which the boundary values are determined.

The above steps S401 to S403 provide a way to realize "predicting a first boundary value and a second boundary value of the trend baseline based on the trend baseline and the historical state data": a baseline is adaptively fitted to the time series of the historical state data, and the upper and lower boundary values of the trend baseline are then determined in real time according to the mean and variance of the difference sequence between the fitted time series and the historical state data. Because the average error between the trend baseline and the historical state data and the standard deviation of their differences are considered at the same time, more accurate upper and lower boundary values can be obtained.

In step S404, if the current state data is smaller than the second boundary value, or the current state data is larger than the first boundary value, it is determined that the current state data is abnormal data.

In some possible implementations, the first boundary value and the second boundary value are the predicted upper and lower boundary values of the trend baseline at the current time, and the current state data is detected by judging whether the current state data at the current time is greater than the first boundary value or less than the second boundary value. If the data value of the current state data falls outside the range between the second boundary value and the first boundary value, the current state data is unreasonable, that is, the current state data is abnormal data.

In some embodiments, if the current state data is greater than or equal to the second boundary value and the current state data is less than or equal to the first boundary value, the current state data is determined to be normal data.

In some possible implementations, if the data value of the current state data lies within the range between the second boundary value and the first boundary value, it is determined that the current state data is reasonable, i.e., the current state data is normal data.

The step S404 mentioned above provides a manner of implementing "detecting the current state data based on the first boundary value and the second boundary value to obtain the detection result", in which the current state data is detected by the first boundary value and the second boundary value, so that the accuracy of detecting the abnormality of the state data of the infrastructure and the service can be improved.
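The comparison in step S404 may be written as the following minimal sketch; the function name `detect` and the string labels it returns are hypothetical. Values equal to a boundary value are treated as normal, consistent with the "greater than or equal to" and "less than or equal to" conditions stated above.

```python
def detect(current: float, first_boundary: float, second_boundary: float) -> str:
    """Classify the current data point against the predicted boundary values.

    `first_boundary` is the upper boundary, `second_boundary` the lower one.
    """
    if current > first_boundary or current < second_boundary:
        return "abnormal"
    return "normal"
```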

In some embodiments, the boundary value determined by the trend baseline and the historical state data may be fine-tuned according to the state characteristic by the service characteristic layer in the abnormal data detection system, that is, after step S303, the method further includes the following steps:

step S331, determining the state characteristic of the historical state data when the detection result indicates that the current state data is abnormal data.

In some possible implementations, the state characteristic of the state data of the infrastructure or business is determined. The state characteristic can be understood as the actual meaning of the state data in the service; for example, if the state of the infrastructure or business is a success rate indicator, its state characteristic is a success rate. In the case where the object to be monitored is a business state, the state characteristic can be understood as a business characteristic.

Step S332 of adjusting the first boundary value and the second boundary value according to the state characteristic to obtain an adjusted first boundary value and an adjusted second boundary value.

In some possible implementations, adjusting the first boundary value and the second boundary value according to the state characteristic may mean increasing or decreasing the first boundary value and the second boundary value. In a specific example, the state of the infrastructure or business is a success rate indicator and the state characteristic is a success rate. Since a success rate must be less than or equal to 100% and greater than or equal to 0, it can be determined whether the preliminarily set first boundary value exceeds 100% and whether the second boundary value is below 0; if so, the boundary values are adjusted accordingly so that the adjusted boundary values match the state characteristic.

The above steps S331 and S332 realize the adjustment of the boundary value, so that the boundary value is adjusted based on the state characteristic, thereby enabling the adjusted boundary value to more accurately distinguish abnormal data in the state data of the infrastructure and the service.
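For the success rate example, one possible adjustment is to clamp the predicted boundary values to the range that the indicator can actually take. The sketch below assumes this interpretation; the function name `clamp_bounds` and its default range of 0 to 1 (i.e., 0% to 100%) are hypothetical.

```python
from typing import Tuple


def clamp_bounds(first_boundary: float, second_boundary: float,
                 valid_max: float = 1.0, valid_min: float = 0.0) -> Tuple[float, float]:
    """Limit the boundary values to the valid range of the state characteristic,
    e.g. a success rate expressed as a fraction in [0, 1]."""
    adjusted_first = min(first_boundary, valid_max)
    adjusted_second = max(second_boundary, valid_min)
    return adjusted_first, adjusted_second
```

For example, a predicted upper boundary of 1.07 for a success rate carries no meaning above 1.0, so it is reduced to 1.0 while a lower boundary of 0.91 is left unchanged.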

Step S333, detecting the current state data based on the adjusted first boundary value and the adjusted second boundary value to obtain a detection result.

In some possible implementations, the current state data is detected by using the first boundary value and the second boundary value as adjusted based on the state characteristic, so that the accuracy of abnormal data detection can be improved.

In the embodiment of the application, the boundary value obtained preliminarily can be adjusted by combining the state characteristics of the state data, so that the adjusted boundary value can be more matched with the uniqueness of the state characteristics, and the accuracy of abnormality detection is improved.

In some embodiments, in order to reduce the misjudgment on the abnormal data, an adjusted boundary value may be obtained through a decision layer in the abnormal data detection system, and the current state data that is determined as the abnormal data for the first time is judged again by using the adjusted boundary value, that is, after step S304, the method further includes the following steps:

in step S341, if the detection result indicates that the current state data is abnormal data, the state characteristics of the historical state data are acquired.

In some possible implementations, a detection result indicating that the current state data is abnormal data means that the current state data is smaller than the second boundary value or larger than the first boundary value. If the current state data is preliminarily judged to be abnormal data, the state characteristic is acquired, and the preliminarily determined boundary values are adjusted according to the state characteristic.

In step S342, the first boundary value and the second boundary value are adjusted according to the state characteristic, and the adjusted first boundary value and the adjusted second boundary value are obtained.

Here, the implementation procedure of step S342 may be the same as that of step S332; the difference is that step S332 is implemented at the state characteristic layer of the abnormal data detection system, whereas step S342 is implemented at the decision layer of the abnormal data detection system.

Step S343, detecting the current state data by adopting the adjusted first boundary value and the adjusted second boundary value to obtain an updated detection result.

In some possible implementation manners, for the current state data preliminarily judged as abnormal data, secondary detection is performed on the current state data by adopting an adjusted boundary value updated based on the state characteristic, so as to obtain an updated detection result. If the updated detection result still indicates that the current state data is abnormal data, alarm information can be generated and output to prompt the user that the data point is abnormal data.

In the embodiment of the application, the current state data that is preliminarily judged to be abnormal data is detected again by adopting the adjusted boundary values, so that misjudgment of the state data is reduced.

In step S344, when the updated detection result indicates that the current state data is abnormal data, the alarm information is output.

In some possible implementation manners, if the updated detection result indicates that the current state data is abnormal data, the alarm information is generated and output. That is, for the same state data, if the plurality of detections are all abnormal data, alarm information is generated and output.
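The secondary detection of steps S341 to S344 may be sketched as follows. The function names and the `adjust` callback are hypothetical; in particular, the widening adjustment used in the usage example is only an assumed stand-in for whatever adjustment the state characteristic calls for.

```python
from typing import Callable, Tuple


def is_outside(value: float, first_boundary: float, second_boundary: float) -> bool:
    """True when the value lies above the first or below the second boundary."""
    return value > first_boundary or value < second_boundary


def two_stage_detect(value: float, first_boundary: float, second_boundary: float,
                     adjust: Callable[[float, float], Tuple[float, float]]) -> bool:
    """Return True (output an alarm) only if the point is abnormal both before
    and after the boundary values are adjusted."""
    if not is_outside(value, first_boundary, second_boundary):
        return False  # normal on the first pass, no re-check needed
    adjusted_first, adjusted_second = adjust(first_boundary, second_boundary)
    return is_outside(value, adjusted_first, adjusted_second)


def widen(first: float, second: float) -> Tuple[float, float]:
    """Hypothetical adjustment: tolerate an extra margin of 10 on each side."""
    return first + 10, second - 10


should_alarm = two_stage_detect(120, 100, 50, widen)
```

In this example a value of 105 would be flagged on the first pass but cleared by the second, whereas 120 remains abnormal after adjustment and leads to an alarm.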

In some embodiments, in the case that the detection result indicates that the current state data is abnormal data, an alarm message for prompting the user that the data is abnormal may be generated, that is, after step S304, the method further includes the following steps:

step one, if the detection result shows that the current state data is abnormal data, determining the duration of the abnormal data.

In some possible implementations, if the detection result indicates that the current state data is abnormal data, the duration for which the state data continues to be judged abnormal, starting from the current time, is determined; for example, the state data within the 1 minute following the current time is all abnormal data.

And secondly, if the duration is greater than or equal to the preset duration, generating and outputting alarm information.

For example, the preset duration may be set based on a data attribute of the state data. For instance, if the monitored infrastructure state is the network traffic of a network, the data attribute is an attribute of network traffic; since network traffic should not remain abnormal for a long time, the preset duration may be set to a small value, for example 5 seconds. When alarm information is generated for the abnormal data, it may be generated according to an alarm policy selected by the user, for example, output in a voice manner or a text manner.
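The duration-based alarm condition may be sketched as follows, assuming that each detection result is available as a pair of a timestamp in seconds and an abnormal flag. The function name `first_alarm_time` and this input representation are assumptions of the sketch.

```python
from typing import Iterable, Optional, Tuple


def first_alarm_time(samples: Iterable[Tuple[float, bool]],
                     min_duration: float) -> Optional[float]:
    """`samples` holds (timestamp_seconds, is_abnormal) pairs in time order.

    Returns the timestamp at which an uninterrupted abnormal run first lasts
    at least `min_duration` seconds, or None if that never happens.
    """
    run_start = None
    for timestamp, abnormal in samples:
        if not abnormal:
            run_start = None  # a normal point ends the current abnormal run
            continue
        if run_start is None:
            run_start = timestamp
        if timestamp - run_start >= min_duration:
            return timestamp
    return None


samples = [(0, False), (1, True), (2, True), (3, False), (4, True), (6, True), (9, True)]
alarm_at = first_alarm_time(samples, min_duration=5)
```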

In other embodiments, after alarm information is generated and output for the abnormal data, the current state data is first labeled to obtain labeled data; for example, a marker is attached to the current state data determined to be abnormal data to obtain the labeled data, and the data point corresponding to the labeled data is a point at which an alarm occurred.

And then, the marked data is sent to the client so that the client verifies the marked data and adjusts the alarm information based on the verification result.

For example, the labeled data of the data point labeled with the alarm occurrence is sent to the client, so that the user can verify the data point labeled with the alarm occurrence at the client, and can adjust the alarm information and the output strategy of the alarm information according to the verification result.

In the embodiment of the application, alarm information is generated and sent when abnormal data persists for a long time or when state data is still judged abnormal after multiple detections, so that the user can be reminded of the abnormal data in time and false alarms can be avoided.

Next, an exemplary application of the embodiment of the present application in an actual application scenario is described, taking anomaly detection based on a dynamic threshold as an example.

To facilitate understanding of the embodiment of the present application, the time series is first described. In the embodiment of the present application, the time series refers to a time series of the monitoring class. As shown in fig. 5A, fig. 5A is a schematic fluctuation diagram of a dynamic threshold detection curve according to the embodiment of the present application, where a curve 501 represents a baseline obtained by connecting the monitoring data points reported once every minute, and boundary lines 502 and 503 represent a first boundary value and a second boundary value in a dynamic threshold, respectively; a curve 504 represents the actual data curve of the state data to be detected, and the part of the curve 504 in a region 505 shows traffic that suddenly increases and continuously deviates from the normal value; based on this, the alarm information "the traffic curve suddenly increases and continuously deviates from the normal value" is output.

In the related art, the detection of abnormal data may be achieved in a variety of ways:

In the first mode, a static threshold is set for each time-series index based on the experience of service operation and maintenance personnel: the threshold is set manually according to observation or experience, and when a time-series value exceeds the set threshold, it is recognized as abnormal and an alarm is sent. Although this method is simple and intuitive and its results have good interpretability, its maintenance cost is high: maintaining the thresholds requires a large amount of manpower, the applicable scenarios are narrow, and as services develop, threshold adjustment cannot keep pace with the speed of development.

In the second mode, abnormal values in the time series are detected based on feature engineering and a random forest method.

For example, feature engineering is used to extract the abnormal features in the time series, and time series anomalies are detected with a random forest model or another supervised model such as a Gradient Boosting Decision Tree (GBDT) or eXtreme Gradient Boosting (XGBoost). Fig. 5B, which is another schematic structural diagram of the abnormal data detection system provided in an embodiment of the present application, shows the process of detecting abnormal values in a time series based on feature engineering and a random forest method. The following description is made with reference to fig. 5B:

The KPI data input module 51 is configured to label positive and negative sample data through a data labeling tool.

The detector 52 is configured to perform feature extraction using a plurality of commonly used detectors, thereby extracting features of abnormal values in the time series.

The feature module 53 is configured to obtain the feature sequence and the labels.

In some possible implementations, the obtained feature sequence and labels are used to train and test a random forest model until a random forest model meeting the requirements is obtained.

The latest anomaly classifier 54 is configured to classify the test set and pass the abnormal data to the anomaly module 55.

The anomaly module 55 is configured to output the abnormal data.

In the third mode, feature information is extracted through feature engineering; the feature information comprises statistical features, fitting features and classification features, 186 features in total across the three types. The model decision comprises three layers: the first layer is a statistical layer, which uses a statistical scheme similar to the 3-sigma criterion to filter out a large number of positive samples and thus performs a preliminary screening of anomalies; the second layer is an unsupervised layer, in which several unsupervised methods such as Isolation Forest, Exponentially Weighted Moving Average (EWMA) and polynomial fitting jointly arbitrate and pass suspected anomalies to the third layer; and the third layer is a supervised layer, which extracts the features of the time series, loads a trained supervised model, performs anomaly detection with supervised algorithms such as GBDT and XGBoost, and finally outputs the abnormal data.
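The first-layer statistical screening mentioned above can be illustrated with a plain 3-sigma check. This is only a sketch of the general criterion, not the related-art system itself; the function name `three_sigma_suspects` and the use of the population standard deviation are assumptions made for illustration.

```python
import statistics
from typing import List


def three_sigma_suspects(history: List[float], candidates: List[float]) -> List[float]:
    """Return the candidates lying more than 3 standard deviations from the history mean.

    Hypothetical helper: points that pass this filter would be handed to the
    later (unsupervised and supervised) layers for further checks.
    """
    mu = statistics.fmean(history)
    sigma = statistics.pstdev(history)  # population standard deviation (assumption)
    return [x for x in candidates if abs(x - mu) > 3 * sigma]
```

Points within 3 standard deviations are treated as normal and dropped early, which is how such a layer can discard most samples cheaply before the more expensive models run.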

In the second and third modes, first, feature engineering is required to extract the features of outliers: an analyst analyzes a large amount of data to extract valid features or rules. However, time series are collected not only by day but, more often, by hour or even by minute, so the data volume is huge, and analyzing data and extracting features from massive time series requires enormous manual effort.

Second, time series in different application scenarios exhibit different features, so an analyst needs to mine different features or rules for each scenario, and the features of one scenario are difficult to apply to another. In addition, as services keep expanding and the types of time series increase, more and more features are required, and analysts must keep spending enormous effort on feature mining and system maintenance, so feature engineering becomes increasingly complex and tedious.

Third, when a machine learning classification model is used to detect abnormal data, the interpretability of the detection result is poor: in many cases the user obtains only a detection result (abnormal or not).

Based on the above, the embodiments of the present application provide an abnormal data detection method that uses a dynamic threshold method based on statistical machine learning to generate the upper and lower detection boundaries dynamically and in real time for anomaly detection on time series indexes. First, the tedious work of feature engineering is avoided and detection thresholds need not be set manually: by learning historical state data, the model adaptively and dynamically generates the upper and lower detection boundaries used for detection and judgment. Second, the model computation yields a detection result within seconds, which improves timeliness. Third, from the generated dynamic threshold the user can readily see why the time series of the corresponding service was detected as abnormal (the current point exceeds a boundary of the dynamic threshold); the product effect is highly intuitive, which improves the interpretability of the detection result.

Fig. 6 is a schematic diagram of an implementation framework of an abnormal data detection method provided in an embodiment of the present application, and the following description is made in conjunction with fig. 6, where the framework includes a data layer 601, an algorithm layer 602, a service layer 603, and a decision layer 604, where:

the data layer 601 is configured to receive input raw KPI data 611, and perform data verification 612 and data preprocessing 613 on the raw KPI data.

In some embodiments, data verification and data preprocessing are first performed by the data layer. Data verification checks the reasonableness of the original KPI data (for example, whether the original KPI data contains the required number of data points, whether there are breakpoints or missing values, and whether the data values satisfy the value range constraint). Data preprocessing applies max-min normalization to the original KPI data, unifying the data value range to between 0 and 1.
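A minimal sketch of the data-layer steps described above, assuming the series is a plain list of numbers in which `None` marks a missing value. The function names, the `min_points` parameter and the value-range arguments are illustrative assumptions, not names taken from the application.

```python
import math
from typing import List, Optional


def validate_series(values: List[Optional[float]], min_points: int,
                    lower: float, upper: float) -> List[str]:
    """Return a list of problems found; an empty list means the data passed."""
    problems = []
    if len(values) < min_points:
        problems.append(f"too few points: {len(values)} < {min_points}")
    for i, v in enumerate(values):
        if v is None or math.isnan(v):
            problems.append(f"missing value at index {i}")
        elif not (lower <= v <= upper):
            problems.append(f"value {v} at index {i} outside [{lower}, {upper}]")
    return problems


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values linearly into [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

How failed checks are repaired (for example, filling missing points) is left open here because the text does not specify it.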

The algorithm layer 602 is configured to adaptively fit the trend baseline of the time series to the data processed by the data layer, using a moving average method or ARIMA fitting 621, to calculate the mean and the mean absolute error (MAE) from the fitting result, and to determine the upper and lower boundaries 622 of the dynamic threshold according to the mean and the MAE.

In some embodiments, the algorithm layer 602 may determine the trend baseline of the time series in various ways, for example, using a moving average, a weighted moving average, AR or ARIMA, and the like, wherein:

The first method: the trend baseline of the time series is determined by a moving average. First, the moving average $\bar{x}_t$ is determined by formula (1):

$$\bar{x}_t = \frac{1}{\text{window\_size}} \sum_{i=t-\text{window\_size}+1}^{t} x_i \qquad (1)$$

Then, the trend baseline is obtained from the moving average $\bar{x}_t$.

In formula (1), window_size is a value set according to the actual situation, generally 10% to 20% of the total number of points in the time series, and $x_i$ is the value of the original time series at time i.
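The moving-average baseline can be sketched as follows. The sketch assumes a trailing window (the current point and the `window_size - 1` points before it) and, at the start of the series, averages over the points available so far; the text does not fix either choice, and the function name is hypothetical.

```python
from typing import List


def moving_average_baseline(series: List[float], window_size: int) -> List[float]:
    """Trailing moving average; early points use the shorter window available."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    baseline = []
    running = 0.0
    for t, x in enumerate(series):
        running += x
        if t >= window_size:
            # Drop the point that just left the window.
            running -= series[t - window_size]
        baseline.append(running / min(t + 1, window_size))
    return baseline
```

Following the guideline in the text, `window_size` would typically be chosen as roughly 10% to 20% of `len(series)`.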

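The second method: the trend baseline of the time series is determined by a weighted moving average. First, the weighted moving average $\bar{x}^{w}_t$ is determined by formula (2):

$$\bar{x}^{w}_t = \frac{\sum_{i=t-\text{window\_size}+1}^{t} w_i \, x_i}{\sum_{i=t-\text{window\_size}+1}^{t} w_i} \qquad (2)$$

where $w_i$ is the weight of the value $x_i$ of the original time series at time i. Then, a trend baseline is obtained according to the weighted moving average.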
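A weighted moving average differs from the plain moving average only in that each point in the window carries its own weight. The sketch below assumes caller-supplied positive weights, with `weights[0]` applied to the newest point in the window; the application does not state how the weights are chosen, so both the weighting scheme and the function name are assumptions.

```python
from typing import List


def weighted_moving_average_baseline(series: List[float],
                                     weights: List[float]) -> List[float]:
    """Weighted trailing average; weights[k] applies to the point k steps back."""
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty list of positive numbers")
    baseline = []
    for t in range(len(series)):
        num = 0.0
        den = 0.0
        for k, w in enumerate(weights):
            if t - k < 0:
                break  # Not enough history yet: use the weights that fit.
            num += w * series[t - k]
            den += w
        baseline.append(num / den)
    return baseline
```

Giving larger weights to recent points makes the baseline follow trend changes faster than the unweighted average with the same window.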

The third method: the trend baseline of the time series is determined by an ARMA model. First, the fitted time series is determined by formula (3), and then the trend baseline is obtained from the fitted time series:

$$X_t = \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t \qquad (3)$$

In formula (3), $\sum_{i=1}^{p} \varphi_i X_{t-i}$ is the autoregressive part, the non-negative integer p is the autoregressive order, and $\varphi_i$ is the autoregressive coefficient; $\sum_{j=1}^{q} \theta_j \varepsilon_{t-j}$ is the moving average part, the non-negative integer q is the moving average order, and $\theta_j$ is the moving average coefficient; $X_t$ is the fitted time series, and $\varepsilon_t$ is a white noise sequence.
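To make the roles of the autoregressive and moving average parts concrete, the sketch below computes one-step fitted values of an ARMA(p, q) model for coefficients that are assumed to be already known. Estimating the coefficients is normally done with a statistics library and is not shown; the zero-mean form without a constant term, the treatment of missing history as zero, and the function name are all assumptions.

```python
from typing import List


def arma_fitted_values(series: List[float], phi: List[float],
                       theta: List[float]) -> List[float]:
    """One-step fitted values for given AR coefficients phi and MA coefficients theta.

    The residual at each step (observed minus fitted) stands in for the
    white-noise term and feeds the moving average part of later steps.
    """
    p, q = len(phi), len(theta)
    fitted: List[float] = []
    residuals: List[float] = []
    for t, x in enumerate(series):
        ar = sum(phi[i] * series[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * residuals[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        x_hat = ar + ma
        fitted.append(x_hat)
        residuals.append(x - x_hat)
    return fitted
```

The returned list plays the role of the fitted time series from which the trend baseline is taken.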

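Next, the upper and lower boundaries of the dynamic threshold are determined based on the trend baseline.

First, the difference sequence between the original time series and the trend baseline is determined (i.e., the sequence of differences at corresponding points, $d_i = x_i - \hat{x}_i$, where $\hat{x}_i$ is the trend baseline value at time i and n is the number of points). Then, the mean absolute error (mae) of the difference sequence is calculated by formula (4), the mean of the difference sequence by formula (5), and the standard deviation (std) of the difference sequence by formula (6):

$$mae = \frac{1}{n} \sum_{i=1}^{n} \left| d_i \right| \qquad (4)$$

$$mean = \frac{1}{n} \sum_{i=1}^{n} d_i \qquad (5)$$

$$std = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( d_i - mean \right)^2} \qquad (6)$$

Then, the upper and lower boundaries of the dynamic threshold are determined according to the mean absolute error and the standard deviation, wherein:

the first boundary value (upper boundary) $upper_i$ is shown in formula (7):

$$upper_i = \hat{x}_i + \left( mae + Scale \times std \right) \qquad (7)$$

the second boundary value (lower boundary) $lower_i$ is shown in formula (8):

$$lower_i = \hat{x}_i - \left( mae + Scale \times std \right) \qquad (8)$$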

In formulas (7) and (8), Scale is an adjustment parameter that sets the sensitivity to data fluctuation and can be chosen according to the state characteristics (for example, when the state of the object to be monitored is a traffic state, the state characteristics can be understood as traffic characteristics). The upper and lower boundaries are thus generated from the dynamic threshold, and the user can view the generated boundaries together with the detection result, so the detection result is intuitive and highly interpretable. Moreover, because both the average error of the baseline fit and the standard deviation of the differences between the baseline and the original time series data are taken into account when the boundaries are determined, more accurate upper and lower thresholds are obtained, which improves the accuracy of anomaly detection.
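The boundary computation can be sketched as below. The sketch assumes that the margin added to and subtracted from the baseline is the mean absolute error plus Scale times the standard deviation of the difference sequence, which matches the description that the boundaries are determined from the mean absolute error and the standard deviation; the population form of the standard deviation (dividing by n) and the function name are assumptions.

```python
import math
from typing import List, Tuple


def dynamic_bounds(series: List[float], baseline: List[float],
                   scale: float) -> Tuple[List[float], List[float]]:
    """Return (upper, lower) boundary curves around the trend baseline."""
    if not series or len(series) != len(baseline):
        raise ValueError("series and baseline must be non-empty and equally long")
    diffs = [x - b for x, b in zip(series, baseline)]
    n = len(diffs)
    mae = sum(abs(d) for d in diffs) / n
    mean = sum(diffs) / n
    std = math.sqrt(sum((d - mean) ** 2 for d in diffs) / n)
    margin = mae + scale * std  # assumed form of the boundary adjustment amount
    upper = [b + margin for b in baseline]
    lower = [b - margin for b in baseline]
    return upper, lower
```

A larger `scale` widens the band and makes detection less sensitive to fluctuation; a smaller one tightens it.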

The service layer 603 is configured to apply the service characteristics to the upper and lower boundaries to obtain the adjusted boundary values 631.

For example, for a success-rate index the first boundary value is always 100%, and for a failure-rate index the lower boundary value is always 0. That is, the service characteristics can adjust the upper and lower boundaries, for example by fixing a boundary at a specific value determined by the service characteristics.
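One way to picture the service-layer adjustment is a function that pins a boundary to a fixed value when the service characteristics call for it. This is a sketch under that reading of the examples above; the function name and the `fixed_upper` / `fixed_lower` parameters are hypothetical.

```python
from typing import List, Optional, Tuple


def apply_service_limits(upper: List[float], lower: List[float],
                         fixed_upper: Optional[float] = None,
                         fixed_lower: Optional[float] = None
                         ) -> Tuple[List[float], List[float]]:
    """Replace a boundary curve with a constant where the service defines one."""
    new_upper = [fixed_upper] * len(upper) if fixed_upper is not None else list(upper)
    new_lower = [fixed_lower] * len(lower) if fixed_lower is not None else list(lower)
    return new_upper, new_lower
```

For a success-rate index one would call it with `fixed_upper=100.0`, and for a failure-rate index with `fixed_lower=0.0`, leaving the other boundary as computed by the algorithm layer.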

In this way, the algorithm layer outputs the upper and lower boundaries of the trend baseline, and the decision layer judges anomalies according to these boundaries while taking the service characteristics of the service layer into account, which gives better interpretability. The algorithm layer only needs to perform anomaly detection on the data from the data layer and does not need to consider the service characteristics of different services, so the algorithm layer is general-purpose.

The decision layer 604 is configured to determine whether the current point to be detected is abnormal, so as to evaluate whether the state data is abnormal 641.

In some embodiments, a point is considered an outlier when it lies above the upper boundary or below the lower boundary of the dynamic threshold. The upper and lower boundaries of the dynamic threshold output by the algorithm layer are adjusted according to the service characteristics of the service layer, and whether the current point to be detected is abnormal is decided according to the adjusted upper and lower boundaries.
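The decision rule amounts to a per-point comparison against the two boundary curves. The sketch below assumes that a value exactly on a boundary counts as normal and that all three sequences are aligned by time; the function name is hypothetical.

```python
from typing import List


def find_outliers(series: List[float], upper: List[float],
                  lower: List[float]) -> List[int]:
    """Return the indices of points above the upper or below the lower boundary."""
    return [i for i, (x, u, l) in enumerate(zip(series, upper, lower))
            if x > u or x < l]
```

Because the result is just "which side of which curve", a user can see at a glance why a point was flagged.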

In some embodiments, the dynamic threshold of the time series index is generated based on a machine learning method. When the point to be detected lies outside the upper and lower boundaries of the dynamic threshold, the point is detected as an outlier. With outlier detection based on the dynamic threshold scheme, the user can see directly why a point was detected as abnormal (the visual effect is equivalent to a manually set threshold). Compared with manual threshold setting, the method avoids the cost of setting and maintaining thresholds and provides better detection accuracy. Based on this, the method is applied as follows:

First, the user may select either a static threshold scheme or a dynamic threshold scheme on the alarm configuration page.

In some possible implementations, a user may select the static threshold scheme or the dynamic threshold scheme in the dynamic threshold configuration interface 701 shown in fig. 7, which is a page diagram of dynamic threshold configuration provided in an embodiment of the present application. As can be seen from the interface 701, the condition type 71 includes a static threshold 72 and a dynamic threshold 73, and the interface 701 shows the content presented when the dynamic threshold 73 is selected as the condition type. The state data is the outbound traffic 721, and the triggering condition 702 for the alarm is that the sum of the outbound traffic 721 over a certain period of time is greater than the maximum value of the dynamic threshold 722 or less than the minimum value of the dynamic threshold. The time period selected in the area 703 may be a preset time period, for example 1 hour or 3 hours, or a freely set time period, for example from 12:00:00 on November 20, 2019 to 12:00:00 on November 26, 2019. For example, when the selected time period is 1 hour, the upper boundary value of the predicted boundary values (corresponding to the first boundary value in the above embodiments) is the curve 704, the lower boundary value (corresponding to the second boundary value in the above embodiments) is the curve 705, and the curve 706 is the actual outbound traffic data. The screening dimension 707 represents screening by the state of the infrastructure or service, and includes a service type 771, for example screening by input machine 772. The aggregation dimension 708 represents the dimension along which data is aggregated, for example aggregating chat-class data or aggregating status-specific traffic.
In the embodiment of the present application, the statistical period 709 is set to 1 minute, that is, anomaly detection is performed on the state data once every minute; the number of alarm-triggering periods 710 is set to 3 consecutive periods, that is, an alarm is triggered when the state data is abnormal in 3 consecutive periods; and the notification frequency 711 is set to 30 minutes, that is, the detection result is reported every 30 minutes.
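The "3 consecutive periods" trigger condition can be sketched as a streak counter over per-period detection results. The sketch covers only the trigger condition; throttling notifications to the configured notification frequency is not modeled, and the function and parameter names are hypothetical.

```python
from typing import List


def alarm_periods(abnormal_flags: List[bool], required_consecutive: int = 3) -> List[int]:
    """Return the indices of periods at which the alarm condition is met.

    abnormal_flags[i] is True when the state data was abnormal in period i.
    """
    alarms = []
    streak = 0
    for i, flag in enumerate(abnormal_flags):
        streak = streak + 1 if flag else 0  # A normal period resets the streak.
        if streak >= required_consecutive:
            alarms.append(i)
    return alarms
```

With a 1-minute statistical period, `required_consecutive=3` means an anomaly must persist for three consecutive minutes before it raises an alarm, which filters out isolated spikes.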

Second, when the dynamic threshold detection scheme is selected, the alarm page may generate intuitive dynamic upper and lower thresholds based on the historical status data (e.g., past week, or past day) selected by the user.

Third, the user can select an alarm policy suited to the state characteristics; once the alarm policy is selected, dynamic threshold detection is combined with the alarm policy to mark the alarm points.

Finally, the user can verify the state characteristic alarm according to the marked alarm point, so that a more suitable alarm strategy is selected.

In the embodiments of the present application, the upper and lower boundaries of the dynamic threshold are generated adaptively and in real time from the time series, which eliminates manual threshold setting. Moreover, as the state characteristics evolve, a manually set static threshold must be continually modified and maintained by hand, whereas the abnormal data detection method provided by the embodiments of the present application sets the dynamic threshold adaptively and thus saves manual maintenance cost. In addition, because the dynamic threshold is determined by a statistical machine learning algorithm, abnormal points can be identified at the millisecond level.

Continuing with the exemplary structure in which the server 455 for detecting abnormal data provided by the embodiments of the present application is implemented as software modules, in some embodiments, as shown in fig. 2B, the software modules of the server 455 for detecting abnormal data stored in the memory 450 may include: a first obtaining module 4551, configured to obtain historical state data of the object to be monitored within a preset historical duration; a first prediction module 4552, configured to predict a trend baseline of the current state data based on the historical state data; a second prediction module 4553, configured to predict a first boundary value and a second boundary value of the trend baseline based on the trend baseline and the historical state data, wherein, at the same time, the first boundary value is greater than the data value in the trend baseline, and the second boundary value is less than the data value in the trend baseline; a first detecting module 4554, configured to detect the current state data based on the first boundary value and the second boundary value to obtain a detection result; and a first alarm module 4555, configured to output alarm information for prompting the occurrence of abnormal data when the detection result indicates that the current state data is abnormal data.

In some embodiments, the first prediction module 4552 is further configured to: fitting the historical state data to obtain fitted data; determining the trend baseline based on the fitted data.

In some embodiments, the first prediction module 4552 is further configured to: verifying the historical state data by adopting a preset verification condition to obtain a verification result; updating the historical state data based on the verification result to obtain verified data meeting the preset verification condition; carrying out normalization processing on the verified data to obtain normalized data; and fitting the normalized data to obtain the fitted data.

In some embodiments, the second prediction module 4553 is further configured to: determining a difference value between the trend baseline and historical state data of the same time point to obtain a difference value sequence; determining a mean and a variance of the sequence of difference values; determining the first boundary value and the second boundary value according to the mean and the variance.

In some embodiments, the second prediction module 4553 is further configured to: adjusting the variance by adopting a preset adjusting parameter to obtain an adjusted variance; summing the mean value and the adjusted variance to obtain a boundary adjustment amount; determining the sum of the trend baseline and the boundary adjustment amount as a first boundary value of the trend baseline; and determining the difference between the trend baseline and the boundary adjustment amount as a second boundary value of the trend baseline.

In some embodiments, the first detecting module 4554 is further configured to: if the current state data is smaller than the second boundary value or the current state data is larger than the first boundary value, determining that the current state data is abnormal data; and if the current state data is greater than or equal to the second boundary value and the current state data is less than or equal to the first boundary value, determining that the current state data is normal data.

In some embodiments, the second prediction module 4553 is further configured to: determining a state characteristic of the historical state data; adjusting the first boundary value and the second boundary value according to the state characteristic to obtain an adjusted first boundary value and an adjusted second boundary value;

the first detecting module 4554 is further configured to: and detecting the current state data based on the adjusted first boundary value and the adjusted second boundary value to obtain the detection result.

In some embodiments, the second prediction module 4553 is further configured to: determining the state characteristics of the historical state data under the condition that the detection result shows that the current state data is abnormal data; adjusting the first boundary value and the second boundary value according to the state characteristic to obtain an adjusted first boundary value and an adjusted second boundary value; detecting the current state data by adopting the adjusted first boundary value and the adjusted second boundary value to obtain an updated detection result; and outputting the alarm information under the condition that the updated detection result shows that the current state data is abnormal data.
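The two-stage check described above (detect, adjust the boundaries only when the first check reports an anomaly, then detect again) can be sketched as follows. The `adjust` callable is a hypothetical stand-in for the adjustment based on state characteristics, whose exact form the text leaves open.

```python
from typing import Callable, Tuple

# Hypothetical type: maps (upper, lower) to adjusted (upper, lower).
Adjust = Callable[[float, float], Tuple[float, float]]


def detect_with_recheck(value: float, upper: float, lower: float,
                        adjust: Adjust) -> bool:
    """Return True only if the value is abnormal before and after adjustment."""
    if lower <= value <= upper:
        return False  # Normal on the first check: no adjustment needed.
    adj_upper, adj_lower = adjust(upper, lower)
    return value > adj_upper or value < adj_lower
```

For a success-rate index, for instance, `adjust` could be `lambda u, l: (100.0, l)`, so a reading of 99.5 that exceeded a computed upper boundary of 98.0 would not raise an alarm after the recheck.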

In some embodiments, the first alarming module 4555 is further configured to: if the detection result shows that the current state data is abnormal data, determining the duration of the abnormal data; and if the duration is greater than or equal to the preset duration, generating and outputting the alarm information.

Embodiments of the present application provide a storage medium storing executable instructions, which when executed by a processor, will cause the processor to execute the method provided by the embodiments of the present application. In some embodiments, the storage medium may be a memory such as a flash memory, a magnetic surface memory, an optical disk, or an optical disk memory; or may be various devices including one or any combination of the above memories.

In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. By way of example, executable instructions may, but need not, correspond to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). By way of example, executable instructions may be deployed to be executed on one in-vehicle computing device or on multiple computing devices located at one site or distributed across multiple sites and interconnected by a communication network.

To sum up, for an object to be monitored related to an information service, a monitoring system in the embodiment of the present application first predicts a trend baseline of current state data based on acquired historical state data; then, predicting a first boundary value and a second boundary value of the trend baseline based on the trend baseline and the historical state data; therefore, the first boundary value and the second boundary value of the trend baseline are generated in real time, the process of manually setting the threshold is omitted, the two boundary values of the trend baseline are predicted through the trend baseline and historical state data, and a user can check the generated boundary values and the detection result, so that the detection result is more visual and the result interpretability is higher. Finally, detecting the current state data by adopting the first boundary value and the second boundary value of the trend baseline which are determined in real time so as to determine whether the current state data are abnormal data; thus, the accuracy of abnormality detection can be improved.

The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims

1. An abnormal data detection method is applied to a monitoring system to realize abnormal detection of state data of an object to be monitored related to information service, and is characterized by comprising the following steps:

acquiring historical state data of the object to be monitored in a preset historical time;

predicting a trend baseline of current state data based on the historical state data;

determining a difference value between the trend baseline and historical state data of the same time point to obtain a difference value sequence;

determining a mean and a variance of the sequence of difference values;

determining a first boundary value and a second boundary value according to the mean value and the variance; wherein, at the same time, the first boundary value is greater than the data value in the trend baseline, and the second boundary value is less than the data value in the trend baseline;

fine-tuning the first boundary value and the second boundary value through a service characteristic layer to obtain a fine-tuned first boundary value and a fine-tuned second boundary value;

detecting the current state data based on the fine-tuned first boundary value and the fine-tuned second boundary value to obtain a detection result;

determining the state characteristics of the historical state data through a decision layer under the condition that the detection result indicates that the current state data is abnormal data;

adjusting the fine-tuned first boundary value and the fine-tuned second boundary value according to the state characteristic to obtain an adjusted first boundary value and an adjusted second boundary value;

detecting the current state data by adopting the adjusted first boundary value and the adjusted second boundary value to obtain an updated detection result;

and outputting alarm information under the condition that the updated detection result shows that the current state data is abnormal data.

2. The method of claim 1, wherein predicting a trend baseline for the current state data based on the historical state data comprises:

fitting the historical state data to obtain fitted data;

determining the trend baseline based on the fitted data.

3. The method of claim 2, wherein said fitting said historical state data to obtain fitted data comprises:

verifying the historical state data by adopting a preset verification condition to obtain a verification result;

updating the historical state data based on the verification result to obtain verified data meeting the preset verification condition;

carrying out normalization processing on the verified data to obtain normalized data;

and fitting the normalized data to obtain the fitted data.

4. The method of claim 1, wherein determining the first boundary value and the second boundary value based on the mean and the variance comprises:

adjusting the variance by adopting a preset adjusting parameter to obtain an adjusted variance;

summing the mean value and the adjusted variance to obtain a boundary adjustment amount;

determining the sum of the trend baseline and the boundary adjustment amount as a first boundary value of the trend baseline;

and determining the difference between the trend baseline and the boundary adjustment amount as a second boundary value of the trend baseline.

5. The method of claim 1, wherein detecting the current state data based on the fine-tuned first boundary value and the fine-tuned second boundary value to obtain a detection result comprises:

if the current state data is smaller than the fine-tuned second boundary value or the current state data is larger than the fine-tuned first boundary value, determining that the current state data is abnormal data;

and if the current state data is greater than or equal to the fine-tuned second boundary value and the current state data is less than or equal to the fine-tuned first boundary value, determining that the current state data is normal data.

6. An abnormal data detection device is applied to a monitoring system to realize abnormal detection of state data of an object to be monitored related to information service, and is characterized by comprising the following components:

the first acquisition module is used for acquiring historical state data of the object to be monitored within a preset historical time;

the first prediction module is used for predicting the trend baseline of the current state data based on the historical state data;

the second prediction module is used for determining a difference value between the trend base line and historical state data of the same time point to obtain a difference value sequence; determining a mean and a variance of the sequence of difference values; determining a first boundary value and a second boundary value according to the mean value and the variance; wherein, at the same time, the first boundary value is greater than the data value in the trend baseline, and the second boundary value is less than the data value in the trend baseline; fine-tuning the first boundary value and the second boundary value through a service characteristic layer to obtain a fine-tuned first boundary value and a fine-tuned second boundary value;

the first detection module is used for detecting the current state data based on the fine-tuned first boundary value and the fine-tuned second boundary value to obtain a detection result;

the first alarm module is used for determining the state characteristics of the historical state data through a decision layer under the condition that the detection result shows that the current state data is abnormal data; adjusting the fine-tuned first boundary value and the fine-tuned second boundary value according to the state characteristic to obtain an adjusted first boundary value and an adjusted second boundary value; detecting the current state data by adopting the adjusted first boundary value and the adjusted second boundary value to obtain an updated detection result; and outputting alarm information under the condition that the updated detection result shows that the current state data is abnormal data.

7. An apparatus for anomalous data detection, comprising:

a memory for storing executable instructions;

a processor for implementing the method of any one of claims 1 to 5 when executing executable instructions stored in the memory.

8. A storage medium having stored thereon executable instructions for causing a processor to perform the method of any one of claims 1 to 5 when executed.