CN112084056A - Abnormality detection method, apparatus, device and storage medium - Google Patents

Publication number
CN112084056A
Authority
CN
China
Prior art keywords
time sequence
sequence data
data
anomaly detection
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010862378.8A
Other languages
Chinese (zh)
Inventor
董善东
张加浪
黄荣庚
李雄政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010862378.8A priority Critical patent/CN112084056A/en
Publication of CN112084056A publication Critical patent/CN112084056A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751Error or fault detection not based on redundancy

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to an anomaly detection method, apparatus, device, and storage medium. The method comprises: acquiring at least one piece of time series data to be detected; extracting basic feature information from the at least one piece of time series data to be detected; determining an anomaly detection model corresponding to the basic feature information; inputting the at least one piece of time series data to be detected into the corresponding anomaly detection model and performing anomaly detection processing to obtain an anomaly detection result corresponding to the at least one piece of time series data to be detected; taking the time series data to be detected whose anomaly detection result is abnormal as target time series data, and acquiring a service type corresponding to the target time series data; and verifying the target time series data based on the service type corresponding to the target time series data to determine a target anomaly detection result corresponding to the target time series data. The anomaly detection processing has low labor cost and low time consumption, the anomaly detection model generalizes well, and the anomaly detection is fine-grained.

Description

Abnormality detection method, apparatus, device and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to an anomaly detection method, apparatus, device, and storage medium.
Background
Existing detection of time series of index data generally uses either manual threshold detection or machine learning based on feature engineering. In manual threshold detection, service operation and maintenance personnel rely on experience to set a static threshold for the index data of each time series; when the index data exceeds the static threshold, it is determined to be abnormal and an alarm is sent. In real monitoring scenarios, however, the monitored indexes often number in the millions, and a static-threshold monitoring scheme must be maintained for each index, so the maintenance cost is too high. Moreover, as the service evolves, the initially set static threshold may no longer be appropriate, and adjustment of the static threshold cannot keep up with the pace of evolution.
Although machine learning based on feature engineering solves some problems of the traditional scheme, it still has shortcomings, such as long detection time and weak model generalization across different business scenarios: the labels applied to index data carry business-specific characteristics, so a model trained on business A is no longer applicable to business B. Each business must therefore label its own data and train its own model, which involves a large workload.
Disclosure of Invention
In view of the above technical problems, the present application provides an abnormality detection method, apparatus, device and storage medium.
According to an aspect of the present application, there is provided an abnormality detection method including:
acquiring at least one to-be-detected time sequence data;
extracting basic characteristic information from the at least one to-be-detected time series data;
determining an anomaly detection model corresponding to the basic characteristic information;
inputting the at least one time sequence data to be detected into a corresponding anomaly detection model, and performing anomaly detection processing to obtain an anomaly detection result corresponding to the at least one time sequence data to be detected;
acquiring time sequence data to be detected with abnormal detection results as target time sequence data and acquiring a service type corresponding to the target time sequence data;
and verifying the target time sequence data based on the service type corresponding to the target time sequence data, and determining a target abnormity detection result corresponding to the target time sequence data.
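The steps listed above can be read as a small pipeline. The following is a minimal Python sketch of that flow, not the claimed implementation: the `SeriesRecord` type, the `run_pipeline` function, and every callable passed to it are hypothetical names introduced only for illustration, and the models and verifiers are assumed to return a simple boolean.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class SeriesRecord:
    """Hypothetical container for one piece of time series data to be detected."""
    values: Sequence[float]  # the last element is the detection data point
    service_type: str        # assumed label used to pick a verifier


def run_pipeline(
    records: List[SeriesRecord],
    extract_features: Callable[[Sequence[float]], dict],
    select_model: Callable[[dict], str],
    models: Dict[str, Callable[[Sequence[float]], bool]],
    verifiers: Dict[str, Callable[[Sequence[float]], bool]],
) -> List[bool]:
    """Return the target anomaly detection result (True = abnormal) per record."""
    results = []
    for record in records:
        # Extract basic feature information and pick the matching model.
        features = extract_features(record.values)
        model = models[select_model(features)]
        abnormal = model(record.values)
        # Only series flagged as abnormal become target series and are
        # verified again according to their service type.
        if abnormal:
            abnormal = verifiers[record.service_type](record.values)
        results.append(abnormal)
    return results
```

Because each record is handled independently, the loop body could be dispatched to parallel workers, which matches the parallel processing mentioned later in this section.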
According to another aspect of the present application, there is provided an abnormality detection apparatus including:
the time sequence data acquisition module to be detected is used for acquiring at least one time sequence data to be detected;
the basic characteristic information extraction module is used for extracting basic characteristic information from the at least one to-be-detected time series data;
an anomaly detection model determining module, configured to determine an anomaly detection model corresponding to the basic feature information;
the anomaly detection result acquisition module is used for inputting the at least one time sequence data to be detected into a corresponding anomaly detection model and carrying out anomaly detection processing to obtain an anomaly detection result corresponding to the at least one time sequence data to be detected;
the target time sequence data and service type acquisition module is used for acquiring the time sequence data to be detected with abnormal detection results as target time sequence data and acquiring the service type corresponding to the target time sequence data;
and the target anomaly detection result determining module is used for verifying the target time sequence data based on the service type corresponding to the target time sequence data and determining a target anomaly detection result corresponding to the target time sequence data.
According to another aspect of the present application, there is provided an abnormality detection device including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
According to another aspect of the application, a non-transitory computer-readable storage medium is provided, having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the above-described method.
In the method, basic feature information is extracted from the time series data to be detected, the time series data to be detected is input into the corresponding anomaly detection model for anomaly detection processing, the target time series data is verified based on its corresponding service type, and a target anomaly detection result corresponding to the target time series data is determined. The anomaly detection processing of the application does not require detection thresholds to be set and maintained manually, so the labor cost is low. A large number of features need not be calculated; only basic feature information is extracted, which avoids the computation of extensive feature engineering, makes the anomaly detection processing more lightweight, and keeps the time consumption low, reaching the millisecond level. Moreover, the anomaly detection model is decoupled from the service type and only needs to examine the shape of the time series data, so the model generalizes well, is more universal, and is easier to extend. In addition, inputting the time series data to be detected into the corresponding anomaly detection models allows anomaly detection to be processed in parallel, which further improves the efficiency of anomaly detection and makes the anomaly detection more fine-grained.
Other features and aspects of the present application will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the application and, together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic diagram illustrating an application system according to an embodiment of the present application.
FIG. 2 illustrates a schematic diagram of time series data according to an embodiment of the present application.
Fig. 3 shows a flow chart of a training method of an anomaly detection model according to an embodiment of the present application.
FIG. 4 shows a flow diagram of an anomaly detection method according to an embodiment of the present application.
Fig. 5 is a flowchart illustrating a method for verifying the target time series data and determining a target anomaly detection result corresponding to the target time series data based on a service type corresponding to the target time series data according to an embodiment of the present application.
FIG. 6 shows a flow diagram of an anomaly detection method according to an embodiment of the present application.
Fig. 7a and 7b are schematic diagrams illustrating an alarm of abnormal time series data according to an embodiment of the present application.
FIG. 8 shows a flow diagram of an anomaly detection method according to an embodiment of the present application.
FIG. 9 illustrates a schematic diagram of an anomaly detection technique architecture according to an embodiment of the present application.
FIG. 10 shows a flow diagram of an anomaly detection method according to an embodiment of the present application.
Fig. 11 illustrates a block diagram of an abnormality detection apparatus of an image according to an embodiment of the present application.
Fig. 12 is a block diagram illustrating an apparatus 1200 for anomaly detection according to an exemplary embodiment.
Detailed Description
Various exemplary embodiments, features and aspects of the present application will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present application.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
In recent years, with research and development of artificial intelligence technology, the artificial intelligence technology is widely applied in a plurality of fields, and the scheme provided by the embodiment of the application relates to computer vision and other technologies, and is specifically described by the following embodiments:
Referring to fig. 1, fig. 1 is a schematic diagram illustrating an application system according to an embodiment of the present application. The application system can be used for the anomaly detection method of the application. As shown in fig. 1, the application system may include at least a server 01 and a terminal 02.
In this embodiment of the application, the server 01 may include an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), and a big data and artificial intelligence platform.
In this embodiment, the terminal 02 may include a smart phone, a desktop computer, a tablet computer, a notebook computer, a smart speaker, a digital assistant, an Augmented Reality (AR)/Virtual Reality (VR) device, a smart wearable device, and other types of physical devices. The terminal may also include software running on the physical device, such as an application program. The operating system running on the terminal 02 in this embodiment of the present application may include, but is not limited to, Android, iOS, Linux, Windows, and the like.
In the embodiment of the present disclosure, the terminal 02 and the server 01 may be directly or indirectly connected by a wired or wireless communication method, and the present disclosure is not limited thereto.
The terminal 02 may be used to provide user-oriented anomaly detection processing. The user can upload the time series data to be detected at the terminal 02. The terminal 02 may receive and display the warning information, the time series diagram corresponding to the abnormal time series data, and the basic feature information of the abnormal time series data. The user can also feed back the alarm information on the terminal 02, that is, the abnormal time series data can be fed back. The manner in which the terminal 02 provides the user-oriented abnormality detection processing may include, but is not limited to, an application manner, a web page manner, and the like.
It should be noted that, in the embodiment of the present application, the server 01 may execute the anomaly detection method; preferably, the method is implemented in the server 01 so as to reduce the data processing load on the terminal and improve the performance of the user-facing terminal device.
In a specific embodiment, when the server 01 is a distributed system, the distributed system may be a blockchain system. In that case, the distributed system may be formed by a plurality of nodes (computing devices of any form in the access network, such as servers and user terminals), with a peer-to-peer (P2P) network formed between the nodes; the P2P protocol is an application layer protocol running on top of the Transmission Control Protocol (TCP). In a distributed system, any machine, such as a server or a terminal, can join to become a node, and a node comprises a hardware layer, a middle layer, an operating system layer, and an application layer. Specifically, the functions of each node in the blockchain system may include:
1) routing, a basic function that a node has, is used to support communication between nodes.
Besides the routing function, the node may also have the following functions:
2) the application is used for being deployed in a block chain, realizing specific services according to actual service requirements, recording data related to the realization functions to form recording data, carrying a digital signature in the recording data to represent a source of task data, and sending the recording data to other nodes in the block chain system, so that the other nodes add the recording data to a temporary block when the source and integrity of the recording data are verified successfully.
It should be noted that the following figures show a possible sequence of steps, and in fact do not limit the order that must be followed. Some steps may be performed in parallel without being dependent on each other.
In the embodiments of this specification, the correspondence among the basic feature information, the feature type, and the anomaly detection model is used both in anomaly detection and in training the anomaly detection model, so this correspondence is introduced first. The correspondence may be set according to actual requirements, and the present application is not limited in this respect.
The basic feature information can be used to reflect the distribution of the time series data and to select an anomaly detection model. For example, the basic feature information may include fluctuation information, trend (monotonicity) information, periodicity information, and the like of the time series data. The feature types may include a stationary type and non-stationary types. For stationary time series data, a sigma model, such as an N-sigma model where N may be 3, may perform well for anomaly detection. For non-stationary time series data, however, the sigma model cannot achieve a good anomaly detection effect. To better handle anomaly detection on non-stationary time series data, the method introduces trend information and periodicity information to subdivide the non-stationary type. In one example, the non-stationary types may include a first non-stationary type, a second non-stationary type, and a third non-stationary type. Specifically, the first non-stationary type may cover time series data with large fluctuation, no trend, and no difference between periods, as well as time series data with large fluctuation, a trend, and no difference between periods; the second non-stationary type may cover time series data with large fluctuation, a trend, and a difference between periods; and the third non-stationary type may cover time series data with large fluctuation, no trend, and a difference between periods. The time series data is described in the corresponding part below.
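For the stationary case, an N-sigma check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `n_sigma_is_anomalous` is hypothetical, the mean and standard deviation are estimated from the historical data points alone, and the last element of the array is taken as the detection data point.

```python
from statistics import mean, pstdev
from typing import Sequence


def n_sigma_is_anomalous(series: Sequence[float], n: float = 3.0) -> bool:
    """Flag the last point if it lies more than n standard deviations
    from the mean of the preceding (historical) points."""
    history, point = series[:-1], series[-1]
    if len(history) < 2:
        raise ValueError("need at least two historical data points")
    mu = mean(history)
    sigma = pstdev(history)  # population standard deviation of the history
    if sigma == 0:
        # A perfectly flat history: any deviation at all is treated as abnormal.
        return point != mu
    return abs(point - mu) > n * sigma
```

With N = 3, a detection data point is flagged only when it falls outside the band of three standard deviations around the historical mean, which is why the approach suits series whose fluctuation stays small.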
In the embodiments of this specification, the feature type may be determined based on the basic feature information and a corresponding fluctuation threshold. For example, if the fluctuation information in the basic feature information is less than or equal to the fluctuation threshold, the feature type may be determined to be the stationary type; if it is greater than the fluctuation threshold, the feature type may be determined to be a non-stationary type. When the basic feature information includes fluctuation information, trend information, and periodicity information of the time series data, in one example, the non-stationary type may be further divided as follows: time series data whose fluctuation information is greater than the fluctuation threshold and which is periodic with no difference between periods is assigned to the first non-stationary type, whether it has no trend or a rising or falling trend; time series data whose fluctuation information is greater than the fluctuation threshold and which has a rising or falling trend and is periodic with a difference between periods is assigned to the second non-stationary type; and time series data whose fluctuation information is greater than the fluctuation threshold and which has no trend and is periodic with a difference between periods is assigned to the third non-stationary type.
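The division rules above amount to a small decision procedure. A minimal Python sketch follows; the `BasicFeatures` fields, the `FeatureType` names, and the `OTHER` fallback for aperiodic data are assumptions made for illustration, and how the fluctuation value itself is computed is left open.

```python
from dataclasses import dataclass
from enum import Enum


class FeatureType(Enum):
    STATIONARY = "stationary"
    NON_STATIONARY_1 = "first non-stationary"
    NON_STATIONARY_2 = "second non-stationary"
    NON_STATIONARY_3 = "third non-stationary"
    OTHER = "other"  # assumed fallback, e.g. for aperiodic data


@dataclass
class BasicFeatures:
    fluctuation: float       # degree of fluctuation of the series
    has_trend: bool          # rising or falling trend before the detection point
    periodic: bool           # whether the series is periodic
    period_difference: bool  # whether the data differs between periods


def classify(features: BasicFeatures, fluctuation_threshold: float) -> FeatureType:
    """Map basic feature information to a feature type."""
    if features.fluctuation <= fluctuation_threshold:
        return FeatureType.STATIONARY
    if not features.periodic:
        return FeatureType.OTHER
    if not features.period_difference:
        # With or without a trend, no inter-period difference -> first type.
        return FeatureType.NON_STATIONARY_1
    if features.has_trend:
        return FeatureType.NON_STATIONARY_2
    return FeatureType.NON_STATIONARY_3
```

The returned type can then serve as the lookup key for the corresponding anomaly detection model.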
Here, the fluctuation information may refer to the degree of fluctuation of the time series data. The trend information may indicate whether the data rises, falls, or remains stable over a preset time before the detection data point in the time series data; the preset time may be half an hour, which is not limited in the present application. The periodicity information may indicate whether the time series data is periodic and whether the data differs between periods. The data either differs between periods (a period difference) or does not (no period difference): a period difference may mean that the detection data point differs from the historical data point corresponding to the detection time point, and no period difference may mean that the detection data point is the same as that historical data point. For example, if the detection data point is the current 10:00 data point, the corresponding historical data points may be those at the historical detection time point 10:00. The time series data, the detection data point, and the historical data points are described below. Historical time points correspond to historical data points. As an example, the periodicity information of the time series data may be acquired by using a time series decomposition algorithm such as STL (Seasonal and Trend decomposition using Loess), which is based on locally weighted regression smoothing.
Alternatively, the abnormality detection model may be classified into a non-deep learning model and a deep learning model. The abnormality detection model may be a supervised learning model or an unsupervised learning model, which is not limited in the present application.
As an example, the corresponding relationship between the basic feature information, the feature type, and the anomaly detection model may be as shown in table 1 below:
TABLE 1
[Table 1 appears as image BDA0002648571290000081 in the original publication.]
It should be noted that Table 1 is merely an example and does not limit the present application. For types with large fluctuation and no difference between periods, using a decision tree model for anomaly detection can achieve a better effect; for example, the decision tree model may include a GBDT (Gradient Boosting Decision Tree) model or an XGBoost (eXtreme Gradient Boosting) model. GBDT is built from regression trees and can be used for both regression prediction and classification. XGBoost is an efficient gradient boosting implementation that can automatically use multiple CPU threads, which can improve the efficiency of anomaly detection and makes it better suited to scenarios with large data volumes. For types with large fluctuation, a trend, and a difference between periods, a moving-average-type algorithm model can effectively evaluate how well the data points near the detection data point are fitted, so that anomalies are detected effectively. For example, the moving-average-type algorithm model may include an EWMA (Exponentially Weighted Moving Average) algorithm model or an ARIMA (Autoregressive Integrated Moving Average) algorithm model. For types with large fluctuation, no trend, and a difference between periods, the data points adjacent to the detection data point show no obvious trend, so the EWMA algorithm model cannot detect the anomaly; a combination of a polynomial fitting algorithm model and a change point detection algorithm model can therefore be selected for anomaly detection. The polynomial fitting algorithm model is still essentially a linear model, except that the variables may be raised to the power of 2 or higher. The polynomial fitting algorithm model may fit the time series data with a polynomial expansion, where the expansion coefficients may be determined by least squares fitting.
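As an illustration of the moving-average family mentioned above, the following Python sketch smooths the historical data points with an EWMA and compares the detection data point against the smoothed level. The decision rule (deviation larger than a multiple of the mean absolute one-step error) is an assumption chosen for the example, as are the function names and the default `alpha` and `tolerance` values; the source does not specify how the EWMA output is thresholded.

```python
from statistics import mean
from typing import Sequence


def ewma(values: Sequence[float], alpha: float) -> float:
    """Exponentially weighted moving average of values, seeded with the first one."""
    smoothed = values[0]
    for x in values[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
    return smoothed


def ewma_is_anomalous(series: Sequence[float], alpha: float = 0.3,
                      tolerance: float = 3.0) -> bool:
    """Flag the last point if it deviates from the EWMA of the history by more
    than `tolerance` times the typical one-step error seen in the history."""
    history, point = series[:-1], series[-1]
    if len(history) < 2:
        raise ValueError("need at least two historical data points")
    smoothed = history[0]
    errors = []
    for x in history[1:]:
        errors.append(abs(x - smoothed))  # error of predicting x from the past
        smoothed = alpha * x + (1 - alpha) * smoothed
    scale = mean(errors)
    if scale == 0:
        return point != smoothed
    return abs(point - smoothed) > tolerance * scale
```

A larger `alpha` makes the smoothed level follow recent points more closely, which suits series with a clear local trend.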
The change point detection algorithm model mainly uses a difference method to measure the amplitude of change in the time series; if the amplitude is greater than an amplitude threshold, a change point can be considered to exist, that is, the anomaly detection result is abnormal.
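The difference method described above can be sketched as follows. This minimal Python example assumes first-order differencing and checks only the change that lands on the detection data point (the last element); both function names are hypothetical.

```python
from typing import List, Sequence


def first_differences(series: Sequence[float]) -> List[float]:
    """First-order differences x[i+1] - x[i] of the series."""
    return [b - a for a, b in zip(series, series[1:])]


def has_change_point(series: Sequence[float], amplitude_threshold: float) -> bool:
    """Report a change point when the jump into the last (detection) data
    point is larger in magnitude than the amplitude threshold."""
    if len(series) < 2:
        raise ValueError("need at least two data points")
    return abs(first_differences(series)[-1]) > amplitude_threshold
```

In practice the threshold would depend on the scale of the monitored index, so a single fixed value is unlikely to suit every series.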
In the training stage of the anomaly detection model, a sub-sample time sequence data set corresponding to the feature type can be obtained, and machine learning training is performed on the preset machine learning model by using the sub-sample time sequence data set to obtain a corresponding non-deep learning model in table 1. The preset deep learning model can also be trained by utilizing a sub-sample time sequence data set to obtain a corresponding deep learning model in table 1: a first abnormality detection model, a second abnormality detection model, a third abnormality detection model, and a fourth abnormality detection model.
It should be noted that, the basic feature of the time series data may also be aperiodic, and correspondingly, a corresponding anomaly detection model may also be set, and aperiodic sub-sample time series data may be obtained, and based on the sub-sample time series data, machine learning training is performed on a preset machine learning model to obtain a corresponding anomaly detection model. And subsequently, the non-periodic time series data can be subjected to abnormity detection.
In the application of anomaly detection, the time series data to be detected can be input into a corresponding anomaly detection model. Here, the anomaly detection model corresponding to the time series data to be detected is of two types, a non-deep learning model and a deep learning model. The type of anomaly detection model may be selected by a user or selected automatically; alternatively, the time series data may be input into the corresponding non-deep learning model and deep learning model at the same time, and the anomaly detection result may be determined based on weights corresponding to the outputs of the two models. The present application is not limited in this respect.
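One way to read the weighted combination of the two model outputs is sketched below in Python. It assumes, beyond what the text states, that each model emits an anomaly score between 0 and 1, that the two weights sum to 1, and that the combined score is compared with a fixed decision threshold; the function name and defaults are illustrative only.

```python
def combine_scores(non_deep_score: float, deep_score: float,
                   non_deep_weight: float = 0.5, threshold: float = 0.5) -> bool:
    """Weighted vote of two anomaly scores in [0, 1]; True means abnormal."""
    if not 0.0 <= non_deep_weight <= 1.0:
        raise ValueError("non_deep_weight must be between 0 and 1")
    combined = non_deep_weight * non_deep_score + (1.0 - non_deep_weight) * deep_score
    return combined >= threshold
```

Setting `non_deep_weight` to 1 or 0 reduces the rule to using only the non-deep learning model or only the deep learning model, respectively.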
In addition, the time series data in the embodiment of the present specification will be described first. The time series data may be reported by the monitoring device, the time series data may be data arranged according to a reporting time sequence, and may be represented by a one-dimensional array, where the one-dimensional array may include data from left to right, and the data from left to right corresponds to the data arranged according to the reporting time sequence. The time series data can be regarded as a time series and is a group of data point sequences arranged according to the time occurrence sequence. Each element in the one-dimensional array may be a data point (corresponding to data reported once), and the rightmost (last) data point in the one-dimensional array may be a detection data point; the other data points may be historical data points. Wherein the interval of data points may be constant, as the reported interval may be constant, such as 10 seconds, 1 minute, 5 minutes, etc. The time series herein may refer to a time series of monitoring classes, such as a time series of cloud monitoring.
The anomaly detection of the present application targets the detection data point, for example a sudden rise or fall of the data point or a deviation from normal values. When the currently reported data is taken as the detection data point, anomaly detection is performed on the currently reported data.
In the embodiments of this specification, the time series data may be composed of historical data points and a detection data point. In one example, as can be seen in FIG. 2, the one-dimensional array of time series data is [first historical data points; second historical data points; third historical data points; detection data point]. The detection data point corresponds to 10:00 on the current day, 20180810, and 10:00 is the detection time point $x_t$. The third historical data points may include the historical data points within a preset time $k$ before the detection data point, that is, the data points within $[x_{t-k}, x_t)$. The second historical data points may include the historical data points within the preset time $k$ before and after the same time ($y_t$) as the detection time point on the previous day (e.g., 20180809), that is, the data points within $[y_{t-k}, y_{t+k}]$. The first historical data points may include the historical data points within the preset time $k$ before and after the same time ($z_t$) as the detection time point on the same day of the previous week, that is, the data points within $[z_{t-k}, z_{t+k}]$. The same day of the previous week may be 7 days before the date corresponding to the detection data point, e.g., 20180803. Here $k$ may be 3 hours, which is not limited in the present application.
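The three history windows in this example can be computed directly from the detection time point. The Python sketch below assumes a constant reporting interval and a lookup table keyed by timestamp; `history_windows`, `build_series`, and the `points` mapping are hypothetical names used only for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def history_windows(t: datetime, k: timedelta = timedelta(hours=3)
                    ) -> Dict[str, Tuple[datetime, datetime]]:
    """Start and end of each history window around detection time t."""
    day, week = timedelta(days=1), timedelta(days=7)
    return {
        "first": (t - week - k, t - week + k),  # same time a week earlier, closed
        "second": (t - day - k, t - day + k),   # same time a day earlier, closed
        "third": (t - k, t),                    # just before t, end excluded
    }


def build_series(points: Dict[datetime, float], t: datetime,
                 k: timedelta, step: timedelta) -> List[float]:
    """Assemble [first; second; third; detection point] as a one-dimensional list."""
    def collect(start: datetime, end: datetime, include_end: bool) -> List[float]:
        out, ts = [], start
        while ts < end or (include_end and ts == end):
            out.append(points[ts])
            ts += step
        return out

    w = history_windows(t, k)
    return (collect(*w["first"], include_end=True)
            + collect(*w["second"], include_end=True)
            + collect(*w["third"], include_end=False)
            + [points[t]])
```

For a detection time of 10:00 on 20180810 and k of 3 hours, the windows cover 07:00 to 13:00 on 20180803 and 20180809, and 07:00 up to but not including 10:00 on 20180810.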
It should be noted that fig. 2 is only an example, and the detected data point may be current reported data corresponding to a current time point, or may be a data point to be detected. The present application is not limited to this, and as long as the last data point of the time series data is the detection data point. The historical data points may include only the third historical data point, or the historical data points may include the third historical data point and the first historical data point or the third historical data point and the second historical data point. Or the historical data point may also select another historical data point based on the detection time point corresponding to the detection data point, which is not limited in this application as long as the selected historical data point can represent the basic feature information of the time series data in combination with the detection data point.
In this embodiment, the time series data may include data of different index types. Overall, these can be divided into a basic monitoring index type and a service monitoring index type. The basic monitoring index type may refer to a monitoring index type for lower-level services such as basic machines and databases; data of this type includes, for example, CPU utilization, memory utilization, and network bandwidth. The service monitoring index type may refer to a monitoring index type for higher-level services; data of this type includes, for example, the success rate of an interface, the success rate of accessing a web page, the stutter rate of a live video, and the number of online users of an APP (Application). For each time series, each time point corresponds to a unique value.
By performing anomaly detection on the time series data provided by the cloud monitoring system, the service states corresponding to data of different index types can be detected quickly, accurately, and in a timely manner, and an alarm can be triggered to notify operation and maintenance personnel to check, handle, and repair the problem, so that stable service can be guaranteed.
In particular, fig. 3 shows a flow chart of a training method of an anomaly detection model according to an embodiment of the present application. As shown in fig. 3, the method may include:
S301, acquiring a sample time series data set, wherein the sample time series data set comprises sample time series data and corresponding labels.
In this embodiment, a large amount of time series data may be acquired and preprocessed, for example by cleaning and interpolation, so as to obtain a sample time series data set. Each piece of sample time series data is a one-dimensional array; the labels may include normal and abnormal.
In one example, preprocessing the large amount of time series data may include:
Performing data verification on the large amount of time series data. For example, a size check, a missing value check, a NaN (Not a Number) value check, an illegal value check, and the like are performed on the data in the time series data.
Performing data cleaning on the time series data. For example, if the time series data has missing values, the missing values can be filled in by interpolation; if the time series data has NaN values, the NaN values may be removed through interpolation or averaging. If the time series data has illegal values (such as characters), a detection failure code can be returned directly.
Standardizing the cleaned time series data. For example, the cleaned time series data is normalized, so that the data in the time series data is scaled to [0, 1].
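The verification, cleaning, and normalization steps above can be sketched as follows. This is an illustrative sketch under stated assumptions: linear interpolation and min-max scaling are one possible choice, and the name preprocess and the exception InvalidSeriesError (standing in for the "detection failure code") are hypothetical.

```python
import math


class InvalidSeriesError(ValueError):
    """Stand-in for the detection failure code returned on illegal values."""


def preprocess(values):
    """Validate, fill gaps by linear interpolation, and min-max scale to [0, 1].

    None and NaN both count as missing values in this sketch.
    """
    if not values:
        raise InvalidSeriesError("empty series")
    cleaned = []
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            cleaned.append(None)                      # mark as missing
        elif isinstance(v, (int, float)) and not isinstance(v, bool):
            cleaned.append(float(v))
        else:
            raise InvalidSeriesError(f"illegal value: {v!r}")

    known = [i for i, v in enumerate(cleaned) if v is not None]
    if not known:
        raise InvalidSeriesError("no valid data points")
    for i, v in enumerate(cleaned):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            cleaned[i] = cleaned[right]               # leading gap: copy nearest
        elif right is None:
            cleaned[i] = cleaned[left]                # trailing gap: copy nearest
        else:
            w = (i - left) / (right - left)
            cleaned[i] = cleaned[left] + w * (cleaned[right] - cleaned[left])

    lo, hi = min(cleaned), max(cleaned)
    if hi == lo:
        return [0.0] * len(cleaned)                   # constant series
    return [(v - lo) / (hi - lo) for v in cleaned]
```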
S303, sample basic feature information is extracted from each sample time series data.
In one possible implementation manner, the S303 may be implemented by the following manner, that is, the S303 may include:
Inputting the sample time series data set into a basic feature extraction model for basic feature extraction processing, and acquiring the basic feature information of each piece of sample time series data in the sample time series data set. The basic feature extraction model may be a pre-trained machine learning model. For example, the machine learning model may be an N-sigma model, and the value of N may be set according to actual needs.
Alternatively, statistical processing or fitting processing may be performed on each piece of sample time series data to extract its basic feature information. This application does not limit the specific methods of statistical processing and fitting processing. For example, in the statistical processing mode, statistics may be computed over each piece of sample time series data, and the resulting variance is used as fluctuation information; trend information may be extracted from the rising or falling trend of the data points within half an hour before the detection time point; and periodicity information of each piece of sample time series data may be obtained through statistical analysis, where the periodicity information may include whether periodicity exists and whether the periods differ.
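The statistical processing mode can be illustrated with the following sketch, which computes the variance as fluctuation information and a trend label from the most recent points. The function name basic_features, the parameters trend_points and flat_tol, and the use of a least-squares slope are assumptions made for the example; periodicity analysis is omitted.

```python
from statistics import pvariance


def basic_features(values, trend_points=6, flat_tol=1e-9):
    """Return fluctuation (variance) and the trend of the most recent points.

    trend_points stands in for "half an hour before the detection time point"
    and depends on the sampling interval; it is an assumed parameter.
    """
    recent = values[-trend_points:]
    n = len(recent)
    # Least-squares slope of the recent points against their index.
    x_mean = (n - 1) / 2
    y_mean = sum(recent) / n
    denom = sum((i - x_mean) ** 2 for i in range(n))
    if denom == 0:
        slope = 0.0
    else:
        slope = sum((i - x_mean) * (y - y_mean)
                    for i, y in enumerate(recent)) / denom
    if slope > flat_tol:
        trend = "rising"
    elif slope < -flat_tol:
        trend = "falling"
    else:
        trend = "flat"
    return {"fluctuation": pvariance(values), "trend": trend}
```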
S305, determining the characteristic type corresponding to the sample basic characteristic information.
In the embodiment of the present specification, the corresponding relationship between the basic feature information and the feature type may be obtained, and then the feature type corresponding to the basic feature information of the sample may be determined by finding the basic feature information matched with the basic feature information of the sample. For example, the table 1 may be looked up according to the sample basic feature information, and the feature type corresponding to the sample basic feature information may be determined in a matching manner.
S307, dividing the sample time sequence data set into sub-sample time sequence data sets corresponding to the feature types based on the feature types corresponding to the sample time sequence data. That is, sample time-series data having the same feature type in a sample time-series data set may be divided into the same sub-sample time-series data set, thereby dividing the sample time-series data set into sub-sample time-series data sets corresponding to the feature type.
Optionally, the sub-sample time series data set corresponding to the feature type may also be directly obtained from a large amount of time series data, that is, the sub-sample time series data set corresponding to each feature type may be directly obtained from a large amount of time series data one by one.
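The division in S307 can be sketched as a simple grouping step. The function name split_by_feature_type and the caller-supplied classify function are hypothetical; in this application the feature type is determined by looking up Table 1.

```python
from collections import defaultdict


def split_by_feature_type(samples, classify):
    """Group (series, label) samples by the feature type of each series.

    classify is a caller-supplied function mapping a series to a feature-type
    name; it stands in for the Table 1 lookup and is an assumption here.
    """
    subsets = defaultdict(list)
    for series, label in samples:
        subsets[classify(series)].append((series, label))
    return dict(subsets)
```

Each resulting subset can then be used to train the anomaly detection model for its feature type.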
S309, performing machine learning training on a preset machine learning model based on the sub-sample time sequence data set corresponding to the characteristic type until a preset condition is met, and obtaining an abnormal detection model corresponding to the characteristic type.
In the embodiment of the present specification, the feature types may include 5 feature types as shown in table 1, and since the anomaly detection model includes two types, i.e., a non-deep learning model and a deep learning model, and the stationary type does not require a deep learning model, the 5 feature types may correspond to 9 anomaly detection models. Machine learning training can be respectively carried out on the preset machine learning model based on the sub-sample time sequence data set corresponding to the feature type until the preset condition is met, and the anomaly detection model corresponding to the feature type is obtained. The preset condition may be a preset number of iterations or a preset error threshold, etc.
Optionally, for the decision tree model, S309 may include:
Extracting sample detection feature information from the sub-sample time series data in the sub-sample time series data set. Here, the sample detection feature information may be extracted based on feature engineering, and may be feature information used for anomaly detection classification. As an example, the sample detection feature information may include the minimum value, the maximum value, and the like of the sub-sample time series data, which is not limited in this application.
Performing machine learning training on a preset decision tree model based on the sample detection feature information of the sub-sample time series data set corresponding to the feature type until a preset condition is met, so as to obtain the anomaly detection model corresponding to the feature type. As one example, the preset decision tree model may include a GBDT model or an xgboost model. This is not a limitation of the present application.
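As an illustration of the feature engineering step only, the sketch below builds a small feature vector for one series. Only the minimum and maximum are named in this description; the remaining features and the function name detection_features are assumptions, and the training of the GBDT or xgboost model on these vectors is not shown.

```python
from statistics import mean, pstdev


def detection_features(values):
    """Compute a small, fixed-order feature vector for one series.

    The last element of values is treated as the detection data point;
    at least two points are expected.
    """
    history, current = values[:-1], values[-1]
    return [
        min(values),                # named in the description
        max(values),                # named in the description
        mean(history),              # assumed extra feature
        pstdev(history),            # assumed extra feature
        current - history[-1],      # change relative to the preceding point
    ]
```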
FIG. 4 shows a flow diagram of an anomaly detection method according to an embodiment of the present application. As shown in fig. 4, the method may include:
S401, acquiring at least one time sequence data to be detected.
In this embodiment, the time series data to be detected may refer to time series data that needs to be subjected to anomaly detection. For example, a current data point reported by the monitoring device may be obtained, and the current data point is combined with the selected historical data point to be used as the time series data to be detected. Based on the mode, at least one time series data to be detected can be acquired.
S403, extracting basic characteristic information from the at least one to-be-detected time series data;
S405, determining an abnormality detection model corresponding to the basic characteristic information.
The implementation manners of the above steps S403 and S405 may refer to steps S303 and S305, which are not described herein again.
S407, inputting the at least one time series data to be detected into a corresponding anomaly detection model, and performing anomaly detection processing to obtain an anomaly detection result corresponding to the at least one time series data to be detected.
In the embodiment of the present specification, the anomaly detection model corresponding to the basic feature information of the time series data to be detected is used to perform anomaly detection processing on that data, so that an anomaly detection result, such as normal or abnormal, corresponding to the time series data to be detected can be obtained. The anomaly detection model may include the anomaly detection models in Table 1 above.
S409, acquiring the to-be-detected time series data whose anomaly detection result is abnormal as target time series data, and acquiring the service type corresponding to the target time series data.
In the embodiment of the present specification, through the screening of the anomaly detection model, the time series data to be detected whose anomaly detection result is abnormal can be screened, and the screened time series data to be detected whose anomaly detection result is abnormal can be used as the target time series data.
Optionally, the anomaly detection model performs anomaly detection only on the time series data to be detected itself. That is, the model's detection is abstracted away from the service scenario, so that detecting anomalies in the data itself is separated from detecting anomalies in the service scenario. This improves the generalization capability of the anomaly detection model; however, whether the time series data to be detected is truly abnormal also depends on the service type. For example, for to-be-detected data of a success rate index, an anomaly in the service scenario may mean that the detection data point shows a downward trend; for to-be-detected data of a failure rate index, an anomaly in the service scenario may mean that the detection data point shows an upward trend. Therefore, the service type corresponding to the target time series data is acquired, and the anomaly detection result can be verified (detected again) based on the service type, thereby ensuring the accuracy of anomaly detection.
S411, based on the service type corresponding to the target time sequence data, verifying the target time sequence data, and determining a target abnormity detection result corresponding to the target time sequence data.
In this embodiment, the service type may refer to an index type of the time series data.
In an example, an abnormal threshold corresponding to the service type may be set. For example, the abnormal threshold of the interface success rate is 30%, and an interface success rate lower than 30% is considered abnormal. If the service type of the target time series data is the interface success rate service type, it can be checked whether the value of the detection data point in the target time series data is lower than the abnormal threshold of the interface success rate. If so, the target anomaly detection result corresponding to the target time series data can be determined to be abnormal; if not, the target anomaly detection result can be determined to be normal. The abnormal threshold corresponding to each service type is not limited and can be set according to actual requirements.
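The threshold check in this example can be sketched as follows. The dictionary ABNORMAL_THRESHOLDS, its key name, and the function verify_by_threshold are hypothetical; the value 0.30 mirrors the 30% example above.

```python
# Hypothetical per-service-type thresholds; 0.30 mirrors the 30% example.
ABNORMAL_THRESHOLDS = {"interface_success_rate": 0.30}


def verify_by_threshold(target_series, service_type,
                        thresholds=ABNORMAL_THRESHOLDS):
    """Return "abnormal" if the detection point is below the type's threshold."""
    threshold = thresholds[service_type]
    detection_point = target_series[-1]   # last element is the detection point
    return "abnormal" if detection_point < threshold else "normal"
```

A "lower than" comparison suits success-rate style indexes; other service types would need their own comparison direction.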
In this application, basic feature information is extracted from the time series data to be detected, the time series data to be detected is input into the corresponding anomaly detection model for anomaly detection processing, the target time series data is then verified based on its service type, and the target anomaly detection result corresponding to the target time series data is determined. The anomaly detection processing of this application does not require detection thresholds to be set and maintained manually, so the labor cost is low. It does not require a large number of engineered features: only basic feature information needs to be extracted, which avoids heavy feature engineering, makes the detection pipeline flatter, and keeps the time cost low, so that detection can reach the millisecond level. Moreover, because the anomaly detection model is abstracted away from service types, it generalizes well: it only needs to examine the shape of the time series data, which makes it more universal and easier to extend. In addition, because each piece of time series data to be detected is input into its own corresponding anomaly detection model, anomaly detection can be processed in parallel, which further improves efficiency and makes the detection more fine-grained.
Fig. 5 is a flowchart illustrating a method for verifying the target time series data and determining a target anomaly detection result corresponding to the target time series data based on a service type corresponding to the target time series data according to an embodiment of the present application. As shown in fig. 5, may include:
S501, obtaining basic characteristic information of the target time sequence data and preset abnormal information of a service type corresponding to the target time sequence data.
In this embodiment of the present specification, for a specific implementation of "acquiring the basic feature information of the target time series data" in S501, reference may be made to S303, which is not described herein again. Alternatively, since the basic feature information of the at least one to-be-detected time series data has already been extracted in S403, the to-be-detected time series data corresponding to the target time series data can be identified and its basic feature information reused as the basic feature information of the target time series data.
In this embodiment of the present specification, the preset abnormal information may be index information that corresponds to a service type and can indicate whether time series data is abnormal. For example, the preset abnormal information may include an abnormal threshold, abnormal trend information, and the like. This is not limited in this application and can be set according to actual needs or business experience. The corresponding preset abnormal information can be acquired based on the service type corresponding to the target time series data and the correspondence between service types and preset abnormal information.
S503, determining a target abnormity detection result corresponding to the target time sequence data according to the basic characteristic information of the target time sequence data and the preset abnormity information of the service type corresponding to the target time sequence data.
In an example, for the success rate service type, the corresponding preset abnormal information may be a downward trend. Therefore, it can be verified whether the trend information in the basic feature information of the target time series data indicates a downward trend. If the trend is downward, the target anomaly detection result corresponding to the target time series data can be determined to be abnormal; if the trend is upward, the target anomaly detection result can be determined to be normal. In this way, the anomaly detection result can be further verified based on the service type.
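The trend-based verification can be sketched as follows. The mapping ABNORMAL_TREND, the service type names, the function verify_by_trend, and the fallback of keeping the model's verdict when no rule is configured are all assumptions made for illustration.

```python
# Hypothetical mapping from service type to the trend that signals a problem.
ABNORMAL_TREND = {"success_rate": "falling", "failure_rate": "rising"}


def verify_by_trend(basic_info, service_type, abnormal_trend=ABNORMAL_TREND):
    """Confirm a model-flagged anomaly only if its trend matches the rule.

    basic_info: dict with a "trend" entry ("rising", "falling", or "flat").
    """
    expected = abnormal_trend.get(service_type)
    if expected is None:
        return "abnormal"   # assumed fallback: no rule, keep the model's verdict
    return "abnormal" if basic_info["trend"] == expected else "normal"
```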
In the embodiment of the present specification, when the obtained target abnormality detection result is abnormal, an abnormality alarm may be triggered. In one possible implementation, referring to fig. 6, fig. 6 shows a flowchart of an anomaly detection method according to an embodiment of the present application. The method may further comprise:
S601, obtaining, from the target time series data, the target time series data whose target anomaly detection result is abnormal as abnormal time series data, and obtaining the basic feature information corresponding to the abnormal time series data. For this step, reference may be made to S409 and S403, which are not described herein again.
S603, generating alarm information corresponding to the abnormal time sequence data and a time sequence chart corresponding to the abnormal time sequence data.
In this embodiment, the alarm information may include a detection time point (i.e., a time point at which an abnormality occurs) corresponding to the detection data point. Based on the abnormal time series data, a corresponding time series diagram can be generated, and as shown in fig. 7a and 7b, the abnormal time series data can be visually presented in the form of the diagram.
S605, sending the alarm information, the time sequence chart corresponding to the abnormal time sequence data and the basic characteristic information of the abnormal time sequence data to a terminal.
In the embodiment of this specification, the alarm information, the time series chart corresponding to the abnormal time series data, and the basic feature information of the abnormal time series data may be sent to the terminal so that the terminal can display them to a user. The user can thus view the abnormal time series data intuitively, and the basic feature information helps the user understand the cause of the anomaly, which improves the interpretability of anomaly detection.
In this embodiment of the present specification, when the abnormal time series data is displayed to the user, the user may perform a feedback operation. For example, in the anomaly detection of the time series data of the ping unreachable event shown in FIG. 7b, when the alarm information, the time series chart corresponding to the abnormal time series data, and the basic feature information corresponding to the abnormal time series data are sent to the terminal, operation information for feedback may also be sent. The operation information may include, for example, options such as "mark as abnormal", "mark as normal", and "more feedback", so that the user can perform the corresponding feedback operation and anomaly detection can be optimized and improved based on that feedback. Based on this, a feedback mechanism may be added. In one possible implementation, FIG. 8 shows a flowchart of an anomaly detection method according to an embodiment of the present application. As shown in FIG. 8, the anomaly detection method may further include:
S801, acquiring feedback information of the abnormal time sequence data.
In the embodiment of the present specification, as shown in fig. 7b, if the user performs a feedback operation, feedback information on abnormal time series data may be acquired. The feedback information may include an identifier of the abnormal time series data, abnormal information or normal information, and the like.
S803, adding the abnormal time series data to a sample time series data set according to the feedback information.
In the embodiment of the present specification, the abnormal time series data corresponding to the abnormal information in the feedback information may be added to the sample time series data set, that is, the abnormal time series data confirmed by the user feedback may be added to the sample time series data set, so as to enrich the sample time series data set for optimization of the subsequent abnormal detection model, and save the labeling process on the sample time series data, so as to obtain more sample time series data. Thus, the anomaly detection technical architecture of the embodiment of the present specification can be formed, and as shown in fig. 9, the anomaly detection technical architecture may include a sample time series data set, acquisition of time series data to be detected, selection of an anomaly detection model, verification based on a service type, alarm triggering, and user feedback.
Optionally, when the feedback information of the abnormal time series data is obtained, a target feature type corresponding to the abnormal time series data may also be obtained, and the abnormal time series data may be added, according to the feedback information, to the sub-sample time series data set corresponding to the target feature type. That is, the abnormal time series data can be added directly to the corresponding sub-sample time series data set.
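The feedback step can be sketched as follows. The data layout (dictionaries keyed by an identifier and by feature type) and the function name apply_feedback are assumptions for illustration; only user-confirmed anomalies are added, matching the description above.

```python
def apply_feedback(sample_sets, feedback, flagged):
    """Add a user-confirmed anomaly to the sub-sample set of its feature type.

    sample_sets: dict feature_type -> list of (series, label)
    feedback: dict with "id" and "verdict" ("abnormal" or "normal")
    flagged: dict id -> {"series": [...], "feature_type": str}
    Returns True if a sample was added.
    """
    if feedback["verdict"] != "abnormal":
        return False            # only confirmed anomalies are added here
    item = flagged[feedback["id"]]
    sample_sets.setdefault(item["feature_type"], []).append(
        (item["series"], "abnormal"))
    return True
```

Because the label comes from the user's confirmation, no separate labeling pass is needed for these samples.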
In a possible implementation manner, the determining an anomaly detection model corresponding to the basic feature information may include:
determining a feature type corresponding to the basic feature information;
if the feature type is a stationary type, determining that the anomaly detection model corresponding to the basic feature information is a sigma model;
if the feature type is a first non-stationary type, determining that the anomaly detection model corresponding to the basic feature information is a decision tree model;
if the feature type is a second non-stationary type, determining that the anomaly detection model corresponding to the basic feature information is a moving average algorithm model;
and if the feature type is a third non-stationary type, determining that the anomaly detection model corresponding to the basic feature information is a polynomial fitting algorithm model and a change point detection algorithm model.
In the embodiment of the present specification, the corresponding abnormality detection model may be determined based on the correspondence relationship between the feature type and the abnormality detection model in table 1.
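The selection of a detector by feature type can be sketched as a lookup table. The sketch below covers only two of the listed types with simple stand-in detectors; the feature-type keys, the function names, and the parameters n, window, and tolerance are assumptions, and the actual models of Table 1 may differ.

```python
from statistics import mean, pstdev


def n_sigma_detect(values, n=3.0):
    """Flag the last point if it is more than n std devs from the history mean."""
    history, current = values[:-1], values[-1]
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return "abnormal" if current != mu else "normal"
    return "abnormal" if abs(current - mu) > n * sigma else "normal"


def moving_average_detect(values, window=5, tolerance=0.5):
    """Flag the last point if its relative deviation from the mean of the
    preceding window exceeds tolerance."""
    history, current = values[:-1], values[-1]
    baseline = mean(history[-window:])
    scale = abs(baseline) or 1.0   # avoid division by zero
    return "abnormal" if abs(current - baseline) / scale > tolerance else "normal"


# Hypothetical registry; decision tree, polynomial fitting, and change point
# detection models are not sketched here.
DETECTORS = {
    "stationary": n_sigma_detect,
    "second_non_stationary": moving_average_detect,
}


def detect(values, feature_type):
    """Route a series to the detector registered for its feature type."""
    detector = DETECTORS.get(feature_type)
    if detector is None:
        raise NotImplementedError(f"no detector sketched for {feature_type!r}")
    return detector(values)
```

Keeping the mapping in a table makes it straightforward to register an additional model when a new feature type is introduced.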
In one possible implementation, fig. 10 shows a flowchart of an anomaly detection method according to an embodiment of the present application. As shown in fig. 10, when the anomaly detection model is a decision tree model, before inputting the at least one to-be-detected time series data into the corresponding anomaly detection model and performing anomaly detection processing to obtain an anomaly detection result corresponding to the at least one to-be-detected time series data, the method may further include:
S1001, extracting target detection feature information from the at least one to-be-detected time series data;
S1003, the inputting the at least one to-be-detected time series data into a corresponding anomaly detection model and performing anomaly detection processing to obtain an anomaly detection result corresponding to the at least one to-be-detected time series data includes: inputting the target detection feature information into the decision tree model, and performing anomaly detection processing to obtain the anomaly detection result corresponding to the at least one to-be-detected time series data.
The implementation manners of steps S1001 and S1003 may refer to the corresponding portions of the decision tree model in S309, and are not described herein again.
In a possible implementation manner, the extracting basic feature information from the at least one to-be-detected time series data may include:
inputting the at least one time sequence data to be detected into a basic feature extraction model for basic feature extraction processing, and acquiring basic feature information of the at least one time sequence data to be detected;
or,
and performing statistical processing or fitting processing on the at least one to-be-detected time series data to extract basic characteristic information of the at least one to-be-detected time series data.
The implementation manner here may specifically refer to step S303, and is not described herein again.
Fig. 11 shows a block diagram of an abnormality detection apparatus according to an embodiment of the present application. As shown in fig. 11, the apparatus may include:
the to-be-detected time series data acquisition module 1101 is configured to acquire at least one to-be-detected time series data;
a basic feature information extraction module 1103, configured to extract basic feature information from the at least one to-be-detected time series data;
an anomaly detection model determining module 1105, configured to determine an anomaly detection model corresponding to the basic feature information;
an anomaly detection result obtaining module 1107, configured to input the at least one to-be-detected time series data into a corresponding anomaly detection model, and perform anomaly detection processing to obtain an anomaly detection result corresponding to the at least one to-be-detected time series data;
a target time series data and service type obtaining module 1109, configured to obtain time series data to be detected with an abnormal detection result as target time series data and obtain a service type corresponding to the target time series data;
and the target anomaly detection result determining module 1111 is configured to verify the target time series data based on the service type corresponding to the target time series data, and determine a target anomaly detection result corresponding to the target time series data.
In this application, basic feature information is extracted from the time series data to be detected, the time series data to be detected is input into the corresponding anomaly detection model for anomaly detection processing, the target time series data is then verified based on its service type, and the target anomaly detection result corresponding to the target time series data is determined. The anomaly detection processing of this application does not require detection thresholds to be set and maintained manually, so the labor cost is low. It does not require a large number of engineered features: only basic feature information needs to be extracted, which avoids heavy feature engineering, makes the detection pipeline flatter, and keeps the time cost low, so that detection can reach the millisecond level. Moreover, because the anomaly detection model is abstracted away from service types, it generalizes well: it only needs to examine the shape of the time series data, which makes it more universal and easier to extend. In addition, because each piece of time series data to be detected is input into its own corresponding anomaly detection model, anomaly detection can be processed in parallel, which further improves efficiency and makes the detection more fine-grained.
In a possible implementation manner, the target abnormality detection result determining module 1111 may include:
a basic feature information and preset abnormal information acquiring unit, configured to acquire basic feature information of the target time series data and preset abnormal information of a service type corresponding to the target time series data;
and the target abnormity detection result determining unit is used for determining a target abnormity detection result corresponding to the target time sequence data according to the basic characteristic information of the target time sequence data and the preset abnormity information corresponding to the target time sequence data.
In one possible implementation, the apparatus may further include:
the basic characteristic information acquisition module of the abnormal time sequence data is used for acquiring target time sequence data with abnormal target abnormal detection results from the target time sequence data as abnormal time sequence data and acquiring basic characteristic information of the abnormal time sequence data;
the time sequence chart generating module is used for generating alarm information corresponding to the abnormal time sequence data and a time sequence chart corresponding to the abnormal time sequence data;
and the alarm sending module is used for sending the alarm information, the time sequence chart corresponding to the abnormal time sequence data and the basic characteristic information of the abnormal time sequence data to a terminal.
In one possible implementation, the anomaly detection model determining module 1105 may include:
the characteristic type determining unit is used for determining a characteristic type corresponding to the basic characteristic information;
an anomaly detection model determining unit, configured to determine that the anomaly detection model corresponding to the basic feature information is a sigma model if the feature type is a stationary type; determine that the anomaly detection model corresponding to the basic feature information is a decision tree model if the feature type is a first non-stationary type; determine that the anomaly detection model corresponding to the basic feature information is a moving average algorithm model if the feature type is a second non-stationary type; and determine that the anomaly detection model corresponding to the basic feature information is a polynomial fitting algorithm model and a change point detection algorithm model if the feature type is a third non-stationary type.
In a possible implementation, when the anomaly detection model is a decision tree model, the apparatus may further include:
the target detection characteristic information extraction module is used for extracting target detection characteristic information from the at least one to-be-detected time sequence data;
the anomaly detection result obtaining module 1107 may be further configured to input the target detection feature information into a decision tree model, and perform anomaly detection processing to obtain an anomaly detection result corresponding to the at least one to-be-detected time series data.
In one possible implementation, the basic feature information extraction module 1103 may include:
a basic feature information extraction unit, configured to input the at least one to-be-detected time series data into a basic feature extraction model for basic feature extraction processing, and acquire the basic feature information of the at least one to-be-detected time series data; or,
configured to perform statistical processing or fitting processing on the at least one to-be-detected time series data to extract the basic feature information of the at least one to-be-detected time series data.
In one possible implementation, the apparatus may further include:
a sample time series dataset acquisition module for acquiring a sample time series dataset, the sample time series dataset comprising sample time series data and a corresponding tag;
the sample basic characteristic information extraction module is used for extracting sample basic characteristic information from each sample time sequence data;
the characteristic type determining module is used for determining a characteristic type corresponding to the sample basic characteristic information;
the sub-sample time sequence data set dividing module is used for dividing the sample time sequence data set into a sub-sample time sequence data set corresponding to the characteristic type based on the characteristic type corresponding to each sample time sequence data;
and the anomaly detection model acquisition module is used for performing machine learning training on a preset machine learning model based on the sub-sample time sequence data set corresponding to the characteristic type until a preset condition is met, so as to obtain an anomaly detection model corresponding to the characteristic type.
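The training flow above (split the labeled samples by feature type, then train one model per sub-dataset) might be sketched as follows. The helpers `split_by_feature_type` and `train_per_type` are hypothetical, and `toy_type_of` and `toy_train` are deliberately trivial stand-ins for the feature type determination and the machine learning training, which the text leaves unspecified.

```python
from collections import defaultdict


def split_by_feature_type(samples, type_of):
    """Group (series, label) pairs into sub-datasets keyed by feature type."""
    subsets = defaultdict(list)
    for series, label in samples:
        subsets[type_of(series)].append((series, label))
    return dict(subsets)


def train_per_type(subsets, train):
    """Train one model per feature type with the supplied training function."""
    return {ftype: train(subset) for ftype, subset in subsets.items()}


def toy_type_of(series):
    """Toy rule (assumption): a small value range counts as stationary."""
    return "stationary" if max(series) - min(series) < 1.0 else "non_stationary"


def toy_train(subset):
    """Toy 'model' (assumption): largest value seen in normal (label 0) series."""
    normal_values = [v for series, label in subset if label == 0 for v in series]
    return max(normal_values, default=None)
```

Passing the type rule and the trainer as arguments keeps the partitioning logic independent of whichever preset machine learning model is actually used.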
In one possible implementation, the apparatus may further include:
the feedback information acquisition module is used for acquiring feedback information of the abnormal time sequence data;
and the sample time sequence data set updating module is used for adding the abnormal time sequence data to the sample time sequence data set according to the feedback information.
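One way to picture the feedback loop is a function that appends the flagged series to the sample set with a label taken from the feedback. The function `update_sample_set` and the shape of the feedback (a dict with a boolean `confirmed` field) are assumptions for illustration only; the text does not define the feedback format or how labels are derived from it.

```python
def update_sample_set(sample_set, abnormal_series, feedback):
    """Append a flagged series to the sample set, labeled from user feedback.

    Assumption: feedback is a dict with a boolean "confirmed" field, where
    True means the alarm was a real anomaly and False means a false alarm.
    """
    label = 1 if feedback.get("confirmed") else 0
    sample_set.append((list(abnormal_series), label))
    return sample_set
```

Under this assumption, rejected alarms are kept as normal samples so that later retraining can learn to avoid the same false alarm.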
With regard to the apparatus in the above-described embodiment, the specific manner in which the respective modules and units perform operations has been described in detail in the embodiment related to the method, and will not be elaborated upon here.
In another aspect, the present application provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the anomaly detection method provided in the various alternative implementations described above.
Fig. 12 is a block diagram illustrating an apparatus 1200 for anomaly detection according to an exemplary embodiment. For example, the apparatus 1200 may be provided as a server. Referring to fig. 12, the apparatus 1200 includes a processing component 1222 that further includes one or more processors, and memory resources, represented by memory 1232, for storing instructions, such as application programs, that are executable by the processing component 1222. The application programs stored in memory 1232 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1222 is configured to execute instructions to perform the above-described methods.
The apparatus 1200 may also include a power supply component 1226 configured to perform power management of the apparatus 1200, a wired or wireless network interface 1250 configured to connect the apparatus 1200 to a network, and an input/output (I/O) interface 1258. The apparatus 1200 may operate based on an operating system stored in the memory 1232, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as the memory 1232, is also provided that includes computer program instructions executable by the processing component 1222 of the apparatus 1200 to perform the above-described methods.
The present application may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present application.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present application may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, the electronic circuitry can execute computer-readable program instructions to implement aspects of the present application by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA).
Various aspects of the present application are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present application, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. An anomaly detection method, characterized in that it comprises:
acquiring at least one to-be-detected time sequence data;
extracting basic characteristic information from the at least one to-be-detected time series data;
determining an anomaly detection model corresponding to the basic characteristic information;
inputting the at least one time sequence data to be detected into a corresponding anomaly detection model, and performing anomaly detection processing to obtain an anomaly detection result corresponding to the at least one time sequence data to be detected;
acquiring, as target time sequence data, the time sequence data to be detected whose anomaly detection result is abnormal, and acquiring a service type corresponding to the target time sequence data;
and verifying the target time sequence data based on the service type corresponding to the target time sequence data, and determining a target anomaly detection result corresponding to the target time sequence data.
2. The method according to claim 1, wherein the verifying the target time series data based on the service type corresponding to the target time series data and determining the target anomaly detection result corresponding to the target time series data comprises:
acquiring basic characteristic information of the target time sequence data and preset anomaly information of the service type corresponding to the target time sequence data;
and determining a target anomaly detection result corresponding to the target time sequence data according to the basic characteristic information of the target time sequence data and the preset anomaly information of the service type corresponding to the target time sequence data.
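The service-type verification in claim 2 can be illustrated with a small rule table. Everything in this sketch is hypothetical: the service type names, the `direction` and `min_deviation` fields, the `mean` and `last` features, and the function `verify_by_service_type` are assumptions, since the claim does not say what the preset anomaly information contains.

```python
# Hypothetical preset anomaly information per service type: which direction
# of change matters, and how large the deviation must be to count.
PRESET_ANOMALY_INFO = {
    "latency_ms": {"direction": "up", "min_deviation": 50.0},
    "success_rate": {"direction": "down", "min_deviation": 0.02},
}


def verify_by_service_type(features, service_type, presets=PRESET_ANOMALY_INFO):
    """Return True if the model's anomaly verdict survives the service check.

    `features` is assumed to hold the series mean and its latest value.
    """
    rule = presets.get(service_type)
    if rule is None:
        return True  # no preset for this service type: keep the model's verdict
    deviation = features["last"] - features["mean"]
    if rule["direction"] == "down":
        deviation = -deviation  # only drops count for this service type
    return deviation >= rule["min_deviation"]
```

With these assumed rules, a latency jump from a mean of 100 to 180 stays flagged, while a latency drop to 60 is filtered out as not being an anomaly for that service type.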
3. The method of claim 1, further comprising:
acquiring, from the target time sequence data, the target time sequence data whose target anomaly detection result is abnormal as abnormal time sequence data, and acquiring basic characteristic information of the abnormal time sequence data;
generating alarm information corresponding to the abnormal time sequence data and a time sequence chart corresponding to the abnormal time sequence data;
and sending the alarm information, the time sequence chart corresponding to the abnormal time sequence data and the basic characteristic information of the abnormal time sequence data to a terminal.
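A possible shape for the message sent to the terminal in claim 3 is sketched below. The function `build_alarm` and all field names are assumptions made for illustration, and the time sequence chart is represented only as a list of plottable points rather than a rendered image.

```python
import json
from datetime import datetime, timezone


def build_alarm(metric_name, series, features):
    """Bundle alarm text, chart points, and basic features into one payload."""
    return {
        "alarm": f"Anomaly detected on {metric_name}",
        "generated_at": datetime.now(timezone.utc).isoformat(),
        # Points a terminal could plot as the time sequence chart.
        "chart_points": [{"t": i, "value": v} for i, v in enumerate(series)],
        "basic_features": features,
    }


# The payload is plain data, so it can be serialized before sending.
payload_json = json.dumps(build_alarm("cpu_usage", [0.2, 0.9], {"mean": 0.55}))
```

Sending the basic features alongside the chart lets the recipient see why the series was flagged without recomputing anything.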
4. The method of claim 1, wherein determining the anomaly detection model corresponding to the base feature information comprises:
determining a characteristic type corresponding to the basic characteristic information;
if the characteristic type is a stationary type, determining that the anomaly detection model corresponding to the basic characteristic information is a sigma model;
if the characteristic type is a first non-stationary type, determining that the anomaly detection model corresponding to the basic characteristic information is a decision tree model;
if the characteristic type is a second non-stationary type, determining that the anomaly detection model corresponding to the basic characteristic information is a moving average algorithm model;
and if the characteristic type is a third non-stationary type, determining that the anomaly detection model corresponding to the basic characteristic information is a polynomial fitting algorithm model and a change-point detection algorithm model.
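Two of the models named in claim 4 are simple enough to sketch directly. The code below shows a common form of the sigma rule (flag points more than k standard deviations from the mean) and a trailing moving-average check. The function names, the default `k=3.0`, the window size, and the absolute tolerance `tol` are assumptions; the claim does not give parameters or the exact formulation.

```python
from statistics import fmean, pstdev


def sigma_anomalies(series, k=3.0):
    """Indices whose value lies more than k standard deviations from the mean."""
    mean = fmean(series)
    std = pstdev(series)
    if std == 0:
        return []  # a constant series has no outliers under this rule
    return [i for i, v in enumerate(series) if abs(v - mean) > k * std]


def moving_average_anomalies(series, window=5, tol=3.0):
    """Indices whose value differs from the trailing moving average by more than tol."""
    flagged = []
    for i in range(window, len(series)):
        baseline = fmean(series[i - window:i])  # mean of the previous `window` points
        if abs(series[i] - baseline) > tol:
            flagged.append(i)
    return flagged
```

The sigma rule assumes a roughly constant level, which fits the stationary type, while the moving-average check compares each point with its recent history and so tolerates slow drift.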
5. The method of claim 4, wherein the anomaly detection model is a decision tree model; before inputting the at least one to-be-detected time series data into the corresponding anomaly detection model and performing anomaly detection processing to obtain an anomaly detection result corresponding to the at least one to-be-detected time series data, the method further includes:
extracting target detection characteristic information from the at least one to-be-detected time series data;
the inputting the at least one to-be-detected time series data into a corresponding anomaly detection model, and performing anomaly detection processing to obtain an anomaly detection result corresponding to the at least one to-be-detected time series data, includes: inputting the target detection characteristic information into the decision tree model, and performing anomaly detection processing to obtain the anomaly detection result corresponding to the at least one to-be-detected time series data.
6. The method according to claim 1, wherein the extracting of the basic feature information from the at least one to-be-detected time series data includes:
inputting the at least one time sequence data to be detected into a basic feature extraction model for basic feature extraction processing, and acquiring basic feature information of the at least one time sequence data to be detected; or,
performing statistical processing or fitting processing on the at least one to-be-detected time series data to extract basic characteristic information of the at least one to-be-detected time series data.
7. The method of claim 1, further comprising:
obtaining a sample time series dataset comprising sample time series data and corresponding labels;
extracting sample basic characteristic information from each sample time sequence data;
determining a characteristic type corresponding to the sample basic characteristic information;
dividing the sample time sequence data set into a sub-sample time sequence data set corresponding to the characteristic type based on the characteristic type corresponding to each sample time sequence data;
and performing machine learning training on a preset machine learning model based on the sub-sample time sequence data set corresponding to the feature type until a preset condition is met, to obtain an anomaly detection model corresponding to the feature type.
8. The method of claim 3, further comprising:
acquiring feedback information of the abnormal time sequence data;
and adding the abnormal time sequence data to a sample time sequence data set according to the feedback information.
9. An abnormality detection device characterized by comprising:
the time sequence data acquisition module to be detected is used for acquiring at least one time sequence data to be detected;
the basic characteristic information extraction module is used for extracting basic characteristic information from the at least one to-be-detected time series data;
an anomaly detection model determining module, configured to determine an anomaly detection model corresponding to the basic feature information;
the anomaly detection result acquisition module is used for inputting the at least one time sequence data to be detected into a corresponding anomaly detection model and carrying out anomaly detection processing to obtain an anomaly detection result corresponding to the at least one time sequence data to be detected;
the target time sequence data and service type acquisition module is used for acquiring, as target time sequence data, the time sequence data to be detected whose anomaly detection result is abnormal, and acquiring the service type corresponding to the target time sequence data;
and the target anomaly detection result determining module is used for verifying the target time sequence data based on the service type corresponding to the target time sequence data and determining a target anomaly detection result corresponding to the target time sequence data.
10. An abnormality detection apparatus characterized by comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the executable instructions to implement the method of any one of claims 1 to 8.
CN202010862378.8A 2020-08-25 2020-08-25 Abnormality detection method, apparatus, device and storage medium Pending CN112084056A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010862378.8A CN112084056A (en) 2020-08-25 2020-08-25 Abnormality detection method, apparatus, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010862378.8A CN112084056A (en) 2020-08-25 2020-08-25 Abnormality detection method, apparatus, device and storage medium

Publications (1)

Publication Number Publication Date
CN112084056A true CN112084056A (en) 2020-12-15

Family

ID=73728600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010862378.8A Pending CN112084056A (en) 2020-08-25 2020-08-25 Abnormality detection method, apparatus, device and storage medium

Country Status (1)

Country Link
CN (1) CN112084056A (en)

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112712113A (en) * 2020-12-29 2021-04-27 广州品唯软件有限公司 Alarm method and device based on indexes and computer system
CN112712113B (en) * 2020-12-29 2024-04-09 广州品唯软件有限公司 Alarm method, device and computer system based on index
CN112783972A (en) * 2020-12-31 2021-05-11 武汉工程大学 Method and system for synchronizing image characteristic value data
CN112817955A (en) * 2021-02-02 2021-05-18 中国人民解放军海军航空大学青岛校区 Regression model-based data cleaning method
CN112817955B (en) * 2021-02-02 2022-07-01 中国人民解放军海军航空大学青岛校区 Regression model-based data cleaning method
CN112988443A (en) * 2021-03-16 2021-06-18 上海哔哩哔哩科技有限公司 Method and device for processing business exception
CN112905671A (en) * 2021-03-24 2021-06-04 北京必示科技有限公司 Time series exception handling method and device, electronic equipment and storage medium
CN112860524A (en) * 2021-03-31 2021-05-28 中国工商银行股份有限公司 Abnormal behavior detection method, device and equipment
CN113064796A (en) * 2021-04-13 2021-07-02 上海浦东发展银行股份有限公司 Unsupervised index abnormality detection method
CN113064796B (en) * 2021-04-13 2022-08-12 上海浦东发展银行股份有限公司 Unsupervised index abnormality detection method
CN113127305A (en) * 2021-04-22 2021-07-16 北京百度网讯科技有限公司 Abnormality detection method and apparatus
CN113127305B (en) * 2021-04-22 2024-02-13 北京百度网讯科技有限公司 Abnormality detection method and device
CN113094284A (en) * 2021-04-30 2021-07-09 中国工商银行股份有限公司 Application fault detection method and device
CN113434498A (en) * 2021-05-14 2021-09-24 国网河北省电力有限公司衡水供电分公司 Method and device for monitoring data abnormity of database of power system and electronic equipment
CN113342610A (en) * 2021-06-11 2021-09-03 北京奇艺世纪科技有限公司 Time sequence data anomaly detection method and device, electronic equipment and storage medium
CN113342610B (en) * 2021-06-11 2023-10-13 北京奇艺世纪科技有限公司 Time sequence data anomaly detection method and device, electronic equipment and storage medium
CN113595240A (en) * 2021-06-21 2021-11-02 深圳供电局有限公司 Power data detection method, device, equipment and storage medium
CN113595240B (en) * 2021-06-21 2024-01-19 深圳供电局有限公司 Method, device, equipment and storage medium for detecting electric power data
CN113536066A (en) * 2021-07-16 2021-10-22 全球能源互联网研究院有限公司 Data anomaly detection algorithm determination method and device and computer equipment
CN113568950A (en) * 2021-07-29 2021-10-29 北京字节跳动网络技术有限公司 Index detection method, device, equipment and medium
CN114338284A (en) * 2021-12-24 2022-04-12 深圳尊悦智能科技有限公司 5G intelligent gateway of Internet of things
CN115858633B (en) * 2023-02-27 2023-10-20 广州汇通国信科技有限公司 Time sequence data analysis method and device based on data lake
CN116108086B (en) * 2023-02-27 2023-09-26 广州汇通国信科技有限公司 Time sequence data evaluation method and device, electronic equipment and storage medium
CN116108086A (en) * 2023-02-27 2023-05-12 广州汇通国信科技有限公司 Time sequence data evaluation method and device, electronic equipment and storage medium
CN115858633A (en) * 2023-02-27 2023-03-28 广州汇通国信科技有限公司 Time sequence data analysis method and device based on data lake
CN117520410A (en) * 2023-11-03 2024-02-06 华青融天(北京)软件股份有限公司 Service data processing method, device, electronic equipment and computer readable medium
CN117807545A (en) * 2024-02-28 2024-04-02 广东优信无限网络股份有限公司 Abnormality detection method and system based on data mining
CN117807545B (en) * 2024-02-28 2024-05-31 广东优信无限网络股份有限公司 Abnormality detection method and system based on data mining
CN118513269A (en) * 2024-07-25 2024-08-20 威海三元塑胶科技有限公司 Injection mold surface flatness measurement method and system

Similar Documents

Publication Publication Date Title
CN112084056A (en) Abnormality detection method, apparatus, device and storage medium
US20210067527A1 (en) Structural graph neural networks for suspicious event detection
CN110865929B (en) Abnormality detection early warning method and system
US11294754B2 (en) System and method for contextual event sequence analysis
US11586972B2 (en) Tool-specific alerting rules based on abnormal and normal patterns obtained from history logs
US11176206B2 (en) Incremental generation of models with dynamic clustering
CN108011782B (en) Method and device for pushing alarm information
US11675641B2 (en) Failure prediction
US10832150B2 (en) Optimized re-training for analytic models
JP6355683B2 (en) Risk early warning method, apparatus, storage medium, and computer program
US11816586B2 (en) Event identification through machine learning
CN109815085B (en) Alarm data classification method and device, electronic equipment and storage medium
US11736363B2 (en) Techniques for analyzing a network and increasing network availability
CN111756706A (en) Abnormal flow detection method and device and storage medium
US10737904B2 (en) Elevator condition monitoring using heterogeneous sources
CN110401567B (en) Alarm data processing method and device, computing equipment and medium
CN113223121B (en) Video generation method, device, electronic equipment and storage medium
CN112306808A (en) Performance monitoring and evaluating method and device, computer equipment and readable storage medium
CN111666187B (en) Method and apparatus for detecting abnormal response time
US20180053117A1 (en) Labelling intervals using system data to identify unusual activity in information technology systems
US12086038B2 (en) Unsupervised log data anomaly detection
US20230274160A1 (en) Automatically training and implementing artificial intelligence-based anomaly detection models
US20210021456A1 (en) Bayesian-based event grouping
US11573888B2 (en) Machine learning test result analyzer for identifying and triggering remedial actions
CN116010187A (en) Log detection method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination