CN113326177A - Index anomaly detection method, device, equipment and storage medium - Google Patents

Publication number
CN113326177A
Authority
CN
China
Prior art keywords
index
monitoring data
target
target service
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110671950.7A
Other languages
Chinese (zh)
Inventor
陈鉴镔
杨军
卢道和
陈刚
程志峰
朱嘉伟
罗海湾
李勋棋
汪晓雪
周琪
郭英亚
李兴龙
胡仲臣
周佳振
文玉茹
何勇彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202110671950.7A
Publication of CN113326177A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452 Performance evaluation by statistical analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The embodiments of the application provide an index anomaly detection method, device, equipment and storage medium, relating to the technical field of financial technology. In the method, feature extraction is performed on historical monitoring data of a business index to obtain historical index features, and a first index label of the business index is then predicted from those features. Index labels therefore do not need to be marked manually for every business index, which automates labeling, improves labeling efficiency and reduces labor cost. Next, a corresponding first detection rule is obtained from an abnormality detection rule base based on the first index label of a target business index, and the first detection rule is used to perform anomaly detection on first monitoring data to determine whether the target business index is an abnormal index. Because different anomaly detection rules are applied according to the characteristics of different indexes, the detection requirements of different indexes are met and the accuracy of anomaly detection is improved.

Description

Index anomaly detection method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of financial technology (Fintech), in particular to a method, a device, equipment and a storage medium for index abnormality detection.
Background
With the development of computer technology, more and more technologies are applied in the financial field, and the traditional financial industry is gradually shifting to financial technology (Fintech). However, because the financial industry demands security and real-time performance, it also places higher requirements on these technologies. Specifically, as financial services grow, the number of systems, virtual machines and containers keeps increasing, and the number of monitoring indexes grows by leaps and bounds. Operating and maintaining the monitoring system manually, configuring alarm thresholds and observing whether the system runs normally consumes a large amount of manpower and is prone to error.
In the related art, a single index anomaly detection algorithm is used to detect whether the indexes in all scenarios are abnormal. However, the indexes of each scenario have their own characteristics, and each anomaly detection algorithm has its own shortcomings. If one algorithm is used to cover index anomaly detection in all scenarios, it cannot accommodate the indexes of every scenario, so the accuracy of anomaly detection is low.
Disclosure of Invention
The embodiment of the application provides an index abnormality detection method, device, equipment and storage medium, which are used for improving the accuracy of abnormality detection.
In one aspect, an embodiment of the present application provides an index abnormality detection method, including:
acquiring a first index label and first monitoring data of a target business index, wherein the first index label of the target business index is obtained by performing feature extraction on historical monitoring data of the target business index to obtain historical index features and performing prediction based on the historical index features;
acquiring a corresponding first detection rule from an abnormality detection rule base based on a first index label of the target service index;
and carrying out anomaly detection on the first monitoring data by adopting the first detection rule, and determining whether the target service index is an anomaly index.
In the embodiments of the application, feature extraction is performed on historical monitoring data of a business index to obtain historical index features, and a first index label of the business index is then predicted from those features. Index labels therefore do not need to be marked manually for every business index, which automates labeling, improves labeling efficiency and reduces labor cost. Next, a corresponding first detection rule is obtained from the abnormality detection rule base based on the first index label of the target business index, and the first detection rule is used to perform anomaly detection on the first monitoring data to determine whether the target business index is an abnormal index. Because different anomaly detection rules are applied according to the characteristics of different indexes, the detection requirements of different indexes are met and the accuracy of anomaly detection is improved.
Optionally, the extracting the feature of the historical monitoring data of the target service index to obtain the historical index feature includes:
and sequentially performing feature extraction on the historical monitoring data of the target service index by adopting N feature extraction modules to obtain the historical index features of the target service index, wherein each feature extraction module at least comprises a convolution layer and a pooling layer, and N is a preset positive integer.
Optionally, the N feature extraction modules include a first feature extraction module, a second feature extraction module, and a third feature extraction module, where the first feature extraction module includes a first convolution layer, a second convolution layer, and a first pooling layer, the second feature extraction module includes a third convolution layer and a second pooling layer, and the third feature extraction module includes a fourth convolution layer and a third pooling layer.
Optionally, the performing, by using N feature extraction modules, feature extraction on the historical monitoring data of the target service index in sequence to obtain the historical index feature of the target service index includes:
sequentially extracting features of the historical monitoring data of the target service index by adopting the first convolution layer, the second convolution layer and the first pooling layer to obtain a first feature;
sequentially performing feature extraction on the first features by adopting the third convolution layer and the second pooling layer to obtain second features;
and sequentially extracting the features of the second feature by adopting the fourth convolution layer and the third pooling layer to obtain the historical index features of the target service index.
Optionally, the predicting of the first index label of the target business index based on the historical index features includes:
flattening the historical index features with a flatten layer and inputting the flattened features into a fully connected network to obtain the first index label of the target service index.
In the embodiments of the application, a convolutional neural network model is used to predict the index labels of service indexes automatically, which improves labeling efficiency and reduces labor cost. In addition, service indexes produce a large amount of monitoring data, and through local connections, weight sharing and downsampling the convolutional neural network model reduces the number of parameters while extracting and compressing features. Performing feature extraction with multiple convolution layers and multiple pooling layers captures the positional relationship among the pixel points and addresses the gradient propagation problem of a fully connected network, which improves the performance of the model and the accuracy of the predicted index labels.
Optionally, the first index label of the target service index indicates that the target service index is a periodic index;
the performing of anomaly detection on the first monitoring data by using the first detection rule to determine whether the target service index is an abnormal index includes:
performing anomaly detection on the first monitoring data with a standard deviation model, an isolation forest model and a moving average model respectively, to obtain the anomaly detection result output by each model;
if at least two of the obtained anomaly detection results indicate that the target service index is an abnormal index, determining that the target service index is an abnormal index; otherwise, determining that the target service index is a normal index.
Optionally, after determining that the target service index is an abnormal index, the method further includes:
if the same-period and period-over-period ratios of the first monitoring data meet a preset alarm condition, triggering an alarm.
In the embodiments of the application, a standard deviation model, an isolation forest model and a moving average model are configured for periodic indexes according to their periodic variation characteristics, which improves the anomaly detection capability. In addition, the anomaly detection results of these three models are combined to judge whether the target service index is an abnormal index, which avoids the one-sidedness of detection by a single model and improves the accuracy of anomaly detection.
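The two-of-three decision described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the thresholds `k`, `window` and `tolerance` are hypothetical, and `range_detector` is only a simple stand-in for the isolated random forest (isolation forest) model, whose internals the text does not specify.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence

# A detector takes the history and the newest value, and returns True if abnormal.
Detector = Callable[[Sequence[float], float], bool]


def std_dev_detector(history: Sequence[float], value: float, k: float = 3.0) -> bool:
    """Flag the value if it lies more than k standard deviations from the mean."""
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) > k * sigma


def moving_average_detector(history: Sequence[float], value: float,
                            window: int = 5, tolerance: float = 0.3) -> bool:
    """Flag the value if it deviates from the recent moving average by more
    than `tolerance` (relative deviation)."""
    baseline = mean(history[-window:])
    if baseline == 0:
        return value != 0
    return abs(value - baseline) / abs(baseline) > tolerance


def range_detector(history: Sequence[float], value: float) -> bool:
    """Hypothetical placeholder for the third model: flag values outside the
    historical min/max range. A real isolation forest would replace this."""
    return value < min(history) or value > max(history)


def is_abnormal(history: Sequence[float], value: float,
                detectors: List[Detector], min_votes: int = 2) -> bool:
    """Declare the index abnormal when at least `min_votes` detectors agree."""
    votes = sum(1 for detect in detectors if detect(history, value))
    return votes >= min_votes


history = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101]
detectors = [std_dev_detector, moving_average_detector, range_detector]
print(is_abnormal(history, 160, detectors))
print(is_abnormal(history, 104, detectors))
```

Keeping each detector behind the same call signature means the voting step does not need to know which models are configured for a given label.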
Optionally, the first index label of the target service index indicates that the target service index is a periodic index whose grade is greater than a preset threshold;
the performing of anomaly detection on the first monitoring data by using the first detection rule to determine whether the target service index is an abnormal index includes:
determining the same-period and period-over-period ratios and a target derivative of the first monitoring data;
performing anomaly detection on the first monitoring data with a standard deviation model, an isolation forest model, a moving average model and an autoregressive integrated moving average (ARIMA) model respectively, to obtain the anomaly detection result output by each model;
and determining, with a trained comprehensive identification model, whether the target service index is an abnormal index based on the obtained anomaly detection results, the same-period and period-over-period ratios, and the target derivative.
In the embodiments of the application, the same-period and period-over-period ratios, the target derivative, and the results of the standard deviation model, the isolation forest model, the moving average model and the ARIMA model are combined to determine whether the target service index is an abnormal index. This avoids the one-sidedness of detection by a single model, improves the accuracy of anomaly detection, and meets the accuracy requirements of important indexes.
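As a small illustration of the ratio inputs, the sketch below assumes that the ratio in the steps above combines a same-period comparison (against the same point one cycle earlier) and a period-over-period comparison (against the immediately preceding point). That reading, the function names and the `period` parameter are assumptions; the text does not define the calculation, and the target derivative is left out because it is not defined either.

```python
from typing import Optional, Sequence, Tuple


def ratio_change(current: float, reference: float) -> Optional[float]:
    """Relative change of `current` against `reference`; None if undefined."""
    if reference == 0:
        return None
    return (current - reference) / reference


def same_period_and_sequential(
    series: Sequence[float], period: int
) -> Tuple[Optional[float], Optional[float]]:
    """Return (same-period ratio, period-over-period ratio) for the newest
    sample. `series` holds equally spaced samples, newest last; `period` is
    the number of samples in one cycle (for example, one day)."""
    if period < 1 or len(series) < period + 1:
        raise ValueError("need at least period + 1 samples")
    current = series[-1]
    same_period = ratio_change(current, series[-1 - period])
    sequential = ratio_change(current, series[-2])
    return same_period, sequential


samples = [100, 110, 120, 130, 150, 121]
same_period, sequential = same_period_and_sequential(samples, period=4)
print(same_period, sequential)
```

In the example, the newest value 121 is compared with 110 (one cycle of four samples earlier) and with 150 (the previous sample).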
Optionally, a first index tag of the target service index represents that the target service index is a stationary index or a random fluctuation index;
the performing, by using the first detection rule, an anomaly detection on the first monitoring data to determine whether the target service index is an abnormal index includes:
performing fluctuation detection on the first monitoring data by adopting a fluctuation detection model, and determining a target fluctuation value of the target service index;
adopting a fluctuation trend identification model to identify the fluctuation trend of the first monitoring data and determining a first fluctuation trend of the target service index;
if the target fluctuation value and/or the first fluctuation trend meet corresponding alarm conditions, determining the target service index as an abnormal index and triggering an alarm; otherwise, determining the target service index as a normal index.
In the embodiment of the application, according to the characteristics of the stable indexes or the random fluctuation indexes, the fluctuation detection model and the fluctuation trend recognition model are arranged for the stable indexes or the random fluctuation indexes, so that the indexes to be detected are more matched with the detection model, and the accuracy of abnormal detection is improved.
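The text does not define how the fluctuation value or the fluctuation trend is computed, so the following sketch substitutes simple, clearly hypothetical definitions: the fluctuation value is the peak-to-peak range relative to the mean, the trend is the least-squares slope over the window, and the two thresholds are arbitrary. It only illustrates the "and/or" alarm condition.

```python
from statistics import mean
from typing import Sequence


def fluctuation_value(window: Sequence[float]) -> float:
    """Assumed definition: peak-to-peak range relative to the mean magnitude."""
    spread = max(window) - min(window)
    base = abs(mean(window))
    if base == 0:
        return float("inf") if spread else 0.0
    return spread / base


def trend_slope(window: Sequence[float]) -> float:
    """Assumed definition: least-squares slope per sample (needs >= 2 points)."""
    n = len(window)
    x_bar = (n - 1) / 2
    y_bar = mean(window)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(window))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def check_stationary_index(window: Sequence[float],
                           max_fluctuation: float = 0.2,
                           max_abs_slope: float = 1.0) -> bool:
    """Return True (abnormal, raise alarm) if either condition is met."""
    return (fluctuation_value(window) > max_fluctuation
            or abs(trend_slope(window)) > max_abs_slope)


print(check_stationary_index([50, 51, 49, 50, 50]))
print(check_stationary_index([50, 55, 60, 65, 70]))
```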
Optionally, the method further comprises:
acquiring a second index label and second monitoring data of a target basic index, wherein the second index label of the target basic index is obtained from an index label library by regular-expression matching based on the name of the target basic index;
acquiring a corresponding second detection rule from an abnormality detection rule base based on a second index label of the target basic index;
and carrying out anomaly detection on the second monitoring data by adopting the second detection rule, and determining whether the target basic index is an anomaly index.
Optionally, the performing, by using the second detection rule, an abnormal detection on the second monitoring data to determine whether the target base indicator is an abnormal indicator includes:
adopting a fluctuation trend identification model to identify the fluctuation trend of the second monitoring data and determining a second fluctuation trend of the target basic index;
if the second fluctuation trend meets the corresponding alarm condition or the second monitoring data exceeds the range of a preset threshold value, determining the target basic index as an abnormal index and triggering an alarm; otherwise, determining the target basic index as a normal index.
In one aspect, an embodiment of the present application provides an index abnormality detection apparatus, where the apparatus includes:
an acquisition module, configured to acquire a first index label and first monitoring data of a target business index, wherein the first index label of the target business index is obtained by performing feature extraction on historical monitoring data of the target business index to obtain historical index features and performing prediction based on the historical index features;
a matching module, configured to obtain a corresponding first detection rule from an abnormality detection rule base based on the first index label of the target service index;
and a detection module, configured to perform anomaly detection on the first monitoring data by using the first detection rule and determine whether the target service index is an abnormal index.
In one aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the index abnormality detection method when executing the program.
In one aspect, an embodiment of the present application provides a computer-readable storage medium, which stores a computer program executable by a computer device, and when the program runs on the computer device, the computer device is caused to execute the steps of the index abnormality detection method.
In the embodiments of the application, feature extraction is performed on historical monitoring data of a business index to obtain historical index features, and a first index label of the business index is then predicted from those features. Index labels therefore do not need to be marked manually for every business index, which automates labeling, improves labeling efficiency and reduces labor cost. Next, a corresponding first detection rule is obtained from the abnormality detection rule base based on the first index label of the target business index, and the first detection rule is used to perform anomaly detection on the first monitoring data to determine whether the target business index is an abnormal index. Because different anomaly detection rules are applied according to the characteristics of different indexes, the detection requirements of different indexes are met and the accuracy of anomaly detection is improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of a system architecture according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of an index abnormality detection method according to an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of a convolutional neural network according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of a convolutional neural network according to an embodiment of the present disclosure;
Fig. 5 is a schematic flowchart of a service index classification method according to an embodiment of the present application;
Fig. 6 is a schematic flowchart of an index abnormality detection method according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an index abnormality detection apparatus according to an embodiment of the present disclosure;
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more clearly apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
For convenience of understanding, terms referred to in the embodiments of the present invention are explained below.
CNN: Convolutional Neural Networks, a class of feedforward neural networks that contain convolution calculations and have a deep structure, are one of the representative algorithms of deep learning. Convolutional neural networks have a representation learning capability and can perform shift-invariant classification of input information according to their hierarchical structure, so they are also called Shift-Invariant Artificial Neural Networks (SIANN).
Support Vector Machine (SVM): a generalized linear classifier that performs binary classification of data by supervised learning; its decision boundary is the maximum-margin hyperplane solved from the learning samples.
Decision Tree: a decision analysis method that, given the known probabilities of various situations, builds a decision tree to obtain the probability that the expected net present value is greater than or equal to zero, so as to evaluate the risk of a project and judge its feasibility. It is called a decision tree because the decision branches, when drawn, resemble the branches of a tree. In machine learning, a decision tree is a predictive model that represents a mapping between object attributes and object values.
Index: data collected by the monitoring system that reflects the operating condition of a system, including basic indexes and service indexes. The basic indexes include host indexes and program runtime indexes. Host indexes, such as Central Processing Unit (CPU), memory, IO and network indexes, may be further divided into container indexes, virtual machine indexes, and the like. Program runtime indexes include the various indexes generated when a Java Virtual Machine (JVM) runs, such as the number of active threads, the buffer size and the memory size. A service index is related to a specific service and includes, for example, success rate, TPS, failure rate, number of failures, average delay and maximum delay.
Referring to Fig. 1, which is a diagram of a system architecture applicable to the embodiments of the present application, the system architecture includes at least a configuration management system 101, a monitoring system 102, a service system 103 and an anomaly detection system 104.
When the service system 103 is connected as a new system, the configuration management system 101 configures the system information corresponding to the service system 103. After the configuration is finished, the service system 103 may provide corresponding services for each user terminal. During operation of the service system 103, the monitoring system 102 scans the logs of the service system 103 to obtain monitoring data of each service index of the service system 103, and an agent collects monitoring data of each basic index of the service system 103. The monitoring system 102 then sends the monitoring data of each service index and of each basic index to the anomaly detection system 104.
For any one service index, the anomaly detection system 104 performs feature extraction on historical monitoring data of the service index to obtain a historical index feature, predicts and obtains a first index tag of the service index based on the historical index feature, and then stores the first index tag of the service index in a database.
For any basic index, the anomaly detection system 104 obtains the second index label of the basic index from the index label library by regular-expression matching based on the name of the basic index, and then stores the second index label of the basic index in the database.
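A minimal sketch of looking up a basic index label by matching its name against patterns in a label library. The index names, patterns and label strings are hypothetical; the text does not give the contents of the index label library.

```python
import re
from typing import List, Pattern, Tuple

# Hypothetical label library: (compiled name pattern, label), checked in order.
LABEL_LIBRARY: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"^(host|vm|container)\.cpu\."), "cpu"),
    (re.compile(r"^(host|vm|container)\.mem(ory)?\."), "memory"),
    (re.compile(r"^jvm\.thread"), "jvm_thread"),
]


def lookup_label(index_name: str, default: str = "unlabeled") -> str:
    """Return the label of the first pattern that matches the index name."""
    for pattern, label in LABEL_LIBRARY:
        if pattern.search(index_name):
            return label
    return default


print(lookup_label("host.cpu.usage"))
print(lookup_label("container.memory.used"))
print(lookup_label("disk.io.wait"))
```

Because basic indexes such as CPU or memory usage follow naming conventions, a name-based lookup of this kind avoids running a prediction model for them.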
After determining the index labels of the service indexes and the basic indexes, if the abnormality detection system 104 receives first monitoring data of a target service index sent by the monitoring system 102, the abnormality detection system 104 obtains the first index label of the target service index from the database, then obtains a corresponding first detection rule from the abnormality detection rule base based on the first index label of the target service index, and then performs abnormality detection on the first monitoring data by using the first detection rule to determine whether the target service index is an abnormal index.
If the anomaly detection system 104 receives second monitoring data of the target basic index sent by the monitoring system 102, the anomaly detection system 104 obtains a second index tag of the target basic index from the database, then obtains a corresponding second detection rule from the anomaly detection rule base based on the second index tag of the target basic index, and then performs anomaly detection on the second monitoring data by adopting the second detection rule to determine whether the target basic index is an anomaly index.
It should be noted that, in this embodiment of the application, each system may be an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), and a big data and artificial intelligence platform. The above systems may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
Based on the system architecture diagram shown in fig. 1, an embodiment of the present application provides a flow of an index abnormality detection method, as shown in fig. 2, where the flow of the method is executed by a computer device, which may be the abnormality detection system 104 shown in fig. 1, and includes the following steps:
step S201, a first index tag of a target service index and first monitoring data are obtained.
Specifically, a service index is related to a specific service and includes success rate, TPS, failure rate, number of failures, average delay, maximum delay, and the like. The target service index may be any one of the service indexes. The first index label represents the service index type, and the service index types include a periodic index, a stationary index and a random fluctuation index. The first index label of the target service index is obtained by performing feature extraction on historical monitoring data of the target service index to obtain historical index features and performing prediction based on the historical index features. Optionally, when the first index label of the target service index is predicted based on the historical index features, each local feature in the historical index features may be assigned to one of the service index types based on the similarity between that local feature and each service index type. Then, the probability corresponding to each service index type is determined from the number of local features assigned to that type, and the service index type with the highest probability is used as the first index label.
In the same way, the first index labels of other service indexes except the target service index can be obtained. After the first index labels of the business indexes are obtained, the first index labels of the business indexes are stored in a database so as to be called when whether the business indexes are abnormal or not is detected.
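The optional similarity-based prediction in step S201 can be sketched as below. The sketch assumes cosine similarity and one prototype vector per service index type; both are illustrative choices, since the text does not state how similarity is measured or how each type is represented.

```python
from collections import Counter
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_label(local_features: List[Sequence[float]],
                  prototypes: Dict[str, Sequence[float]]
                  ) -> Tuple[str, Dict[str, float]]:
    """Assign each local feature to its most similar type, turn the counts
    into probabilities, and return the most probable type."""
    counts: Counter = Counter()
    for feature in local_features:
        best = max(prototypes, key=lambda t: cosine(feature, prototypes[t]))
        counts[best] += 1
    total = sum(counts.values())
    probabilities = {t: counts[t] / total for t in prototypes}
    label = max(probabilities, key=probabilities.get)
    return label, probabilities


# Hypothetical two-dimensional prototypes for the three index types.
prototypes = {"periodic": [1.0, 0.0], "stationary": [0.0, 1.0],
              "random_fluctuation": [1.0, 1.0]}
features = [[0.9, 0.1], [1.0, 0.2], [0.1, 1.0]]
label, probabilities = predict_label(features, prototypes)
print(label, probabilities)
```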
Step S202, based on the first index label of the target service index, a corresponding first detection rule is obtained from the abnormality detection rule base.
Specifically, a corresponding anomaly detection rule is configured in advance for each first index label, and each first index label is stored in the abnormality detection rule base together with its corresponding anomaly detection rule. After the first index label of the target business index is obtained, the abnormality detection rule base is queried with the first index label of the target business index, and the first detection rule is obtained from among the anomaly detection rules.
Step S203, performing anomaly detection on the first monitoring data by using a first detection rule, and determining whether the target service index is an anomaly index.
Specifically, one anomaly detection rule may include one or more anomaly detection models. When a plurality of abnormality detection models are included in one abnormality detection rule, the abnormality detection rule further includes an execution order among the plurality of abnormality detection models, and a policy of determining a final abnormality detection result by synthesizing output results of the plurality of abnormality detection models. The first detection rule is an abnormality detection rule matched with the first index tag in each abnormality detection rule.
When the first detection rule includes only one anomaly detection model, that model is used to perform anomaly detection on the first monitoring data and determine whether the target service index is an abnormal index. When the first detection rule includes a plurality of anomaly detection models, each of the models is used to perform anomaly detection on the first monitoring data, and the output results of the models are then combined to determine whether the target service index is an abnormal index.
In the embodiments of the application, feature extraction is performed on historical monitoring data of a business index to obtain historical index features, and a first index label of the business index is then predicted from those features. Index labels therefore do not need to be marked manually for every business index, which automates labeling, improves labeling efficiency and reduces labor cost. Next, a corresponding first detection rule is obtained from the abnormality detection rule base based on the first index label of the target business index, and the first detection rule is used to perform anomaly detection on the first monitoring data to determine whether the target business index is an abnormal index. Because different anomaly detection rules are applied according to the characteristics of different indexes, the detection requirements of different indexes are met and the accuracy of anomaly detection is improved.
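Steps S202 and S203 can be pictured as a lookup followed by a dispatch. In the sketch below, a rule holds a list of models and a policy for combining their outputs. The `DetectionRule` class, the two toy models and the rule base contents are hypothetical and only illustrate the structure described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Model = Callable[[Sequence[float]], bool]  # returns True if data looks abnormal


@dataclass
class DetectionRule:
    models: List[Model]                     # executed in list order
    combine: Callable[[List[bool]], bool]   # policy for merging model outputs


def spike(data: Sequence[float]) -> bool:
    """Toy model: newest point is more than double the mean of earlier points."""
    rest = data[:-1]
    return data[-1] > 2 * (sum(rest) / len(rest))


def drop(data: Sequence[float]) -> bool:
    """Toy model: newest point is less than half the mean of earlier points."""
    rest = data[:-1]
    return data[-1] < 0.5 * (sum(rest) / len(rest))


# Hypothetical rule base keyed by index label.
RULE_BASE: Dict[str, DetectionRule] = {
    "periodic": DetectionRule(models=[spike, drop], combine=any),
    "stationary": DetectionRule(models=[spike], combine=all),
}


def detect(label: str, data: Sequence[float],
           rule_base: Dict[str, DetectionRule] = RULE_BASE) -> bool:
    """Look up the rule for the label, run its models, combine the results."""
    rule = rule_base[label]
    results = [model(data) for model in rule.models]
    return rule.combine(results)


print(detect("periodic", [10, 10, 10, 3]))
print(detect("stationary", [10, 10, 10, 3]))
```

The same data gives different verdicts under different labels, which is the point of selecting the rule by label.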
Optionally, in the step S201, a convolutional neural network may be used to perform feature extraction on historical monitoring data of the target service index to obtain a historical index feature. Specifically, the convolutional neural network comprises N feature extraction modules, after the convolutional neural network is trained by adopting training samples, the N feature extraction modules in the convolutional neural network are adopted to sequentially perform feature extraction on historical monitoring data of a target service index to obtain historical index features of the target service index, wherein each feature extraction module at least comprises a convolutional layer and a pooling layer, and N is a preset positive integer.
In a possible implementation manner, the N feature extraction modules include a first feature extraction module, a second feature extraction module, and a third feature extraction module, where the first feature extraction module includes a first convolution layer and a first pooling layer, the second feature extraction module includes a second convolution layer and a second pooling layer, and the third feature extraction module includes a third convolution layer and a third pooling layer.
Further, in order to aggregate more data to improve the accuracy of index anomaly detection, in the embodiment of the present invention, another possible implementation manner is preferably adopted, as shown in fig. 3, the N feature extraction modules include a first feature extraction module, a second feature extraction module, and a third feature extraction module, where the first feature extraction module includes a first convolution layer, a second convolution layer, and a first pooling layer, the second feature extraction module includes a third convolution layer and a second pooling layer, and the third feature extraction module includes a fourth convolution layer and a third pooling layer.
The first convolution layer, the second convolution layer, and the first pooling layer are used in sequence to perform feature extraction on the historical monitoring data of the target service index to obtain a first feature. The third convolution layer and the second pooling layer are then used in sequence to perform feature extraction on the first feature to obtain a second feature. Finally, the fourth convolution layer and the third pooling layer are used in sequence to perform feature extraction on the second feature to obtain the historical index features of the target service index. It should be noted that, in this embodiment, placing both the first convolution layer and the second convolution layer in the first feature extraction module aggregates more data, which makes subsequent index anomaly detection more accurate.
Specifically, the convolution layers are used for extracting local features by using the position relationship among pixel points, and the pooling layers are used for downsampling, removing unimportant samples and reducing the number of parameters.
Optionally, as shown in fig. 4, the convolutional neural network further includes a flattening layer and a fully connected network, and after extracting the historical index features of the target service index, the flattening layer is used to expand the historical index features and then input the expanded historical index features into the fully connected network, so as to obtain a first index tag of the target service index.
The following illustrates the structure, training process, and classification process of the convolutional neural network in combination with a specific implementation scenario:
First, the training sample acquisition process:
The monitoring system collects monitoring data once every first preset duration (one minute in this embodiment; other values, such as two minutes, may be used in other embodiments), so that 1440 monitoring data points are collected for each service index per day. The 1440 monitoring data points are compressed using a time window of a second preset duration (5 minutes in this embodiment; other values, such as 10 minutes, may be used in other embodiments) by averaging the monitoring data points within each 5-minute window, which yields 288 monitoring data points. The monitoring system identifies each service index by a metricID.
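The window compression described above can be sketched as follows. This is a minimal illustration, assuming non-overlapping windows and a synthetic one-day series; the function name `compress` and the sample data are hypothetical and do not come from the application.

```python
from statistics import mean


def compress(points, window=5):
    """Average consecutive, non-overlapping windows of `window` points."""
    if len(points) % window != 0:
        raise ValueError("number of points must be a multiple of the window")
    return [mean(points[i:i + window]) for i in range(0, len(points), window)]


# Synthetic stand-in for one day of per-minute monitoring data (1440 points).
minute_points = [float(i % 60) for i in range(1440)]
compressed = compress(minute_points)  # one averaged value per 5-minute window
```

With a one-minute collection interval and a 5-minute window, each day is reduced from 1440 to 288 points, which matches the input length used by the network later in this embodiment.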
Each business index is manually classified based on its monitoring data and assigned an index label, and the labeled indexes form the training sample set. The index labels are {0: random fluctuation index, 1: periodic index, 2: stationary index}. In the training sample set, the amount of sample data is substantially the same for each of the three index labels.
The embodiment of the application adopts the following modes to import a training sample set:
The training sample set comprises two parts of data: monitoring data and index labels. A data pulling module periodically imports the monitoring data into a database from an interface of the monitoring system, and the index labels are imported into the database by Excel import or robot import.
Excel import means that the index labels are written into an Excel file, a dedicated script is then written, and the script imports the index labels from the Excel file into the database.
Robot import means that a robot script is written and the index labels are then imported into the database through the robot script.
Second, training the convolutional neural network model:
The convolutional neural network comprises convolutional layer 1, convolutional layer 2, pooling layer 1, convolutional layer 3, pooling layer 2, convolutional layer 4, pooling layer 3, a flattening layer, and a fully connected network.
Specifically, the attribute parameters of convolutional layer 1 are 3 × 1 × 4, where 3 means that data features are extracted from 3 adjacent data points, 1 denotes a one-dimensional matrix, and 4 denotes 4 different filters. Examples of such filters include [1, 0, 1], [-1, 0, -1], and [-1, 0, 1]; different filters extract different features.
The attribute parameters of convolutional layer 2 are 3 × 1. Their meaning is the same as for convolutional layer 1 and is not repeated here.
The attribute parameters of pooling layer 1 are 3 × 1, meaning that the data point with the largest value is selected from every 3 data points, which removes unimportant samples and reduces the number of data points.
The attribute parameters of convolutional layer 3 are 3 × 1 × 4.
The attribute parameters of pooling layer 2 are 3 × 1. Their meaning is the same as for pooling layer 1 and is not repeated here.
The attribute parameters of convolutional layer 4 are 3 × 1 × 1.
The attribute parameters of pooling layer 3 are 3 × 1.
Because the ReLU function is fast to compute and alleviates the vanishing gradient problem, it is selected as the activation function between layers in the convolutional neural network, as shown in formula (1):
$$F(x) = \max(0, x) \tag{1}$$
According to research, the optimal activation rate of a neural network is between 15% and 30%. Because the ReLU function does not activate a neuron when its input is less than 0, a relatively small activation rate, such as 15%, can be achieved with the ReLU function.
In the network structure, the convolutional layers are adopted to extract the relation between data points (time sequences), the pooling layers are adopted to compress parameters, and a plurality of convolutional layers and a plurality of pooling layers are adopted to extract more fine index characteristics, so that the accuracy of the model is improved.
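To make the roles of the layers concrete, the following minimal sketch applies one 3-point filter, the ReLU activation of formula (1), and a 3-point max pooling step to a short series. It is an illustration in plain Python rather than the Keras implementation of this embodiment; the function names and the sample series are assumptions, and only the filter [-1, 0, 1] is taken from the description above.

```python
def conv1d_valid(series, kernel):
    """Slide the kernel over the series without padding (output is shorter)."""
    k = len(kernel)
    return [sum(w * series[i + j] for j, w in enumerate(kernel))
            for i in range(len(series) - k + 1)]


def relu(values):
    """Formula (1): negative responses are set to zero."""
    return [max(0.0, v) for v in values]


def max_pool(values, size=3):
    """Keep the largest value of each non-overlapping group of `size` points."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


series = [1.0, 2.0, 4.0, 7.0, 6.0, 3.0, 1.0, 0.0]
edge = conv1d_valid(series, [-1, 0, 1])  # responds to rising or falling data
activated = relu(edge)
pooled = max_pool(activated)
```

Here the filter yields positive values while the series rises and negative values while it falls, ReLU discards the negative responses, and pooling keeps one value per group of three.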
After the network structure is determined, the convolutional neural network is built using the Keras framework in Python, and the training sample set is divided into a training set and a verification set at a ratio of 7:3. The convolutional neural network is trained iteratively 40 times on the training set, with each iteration producing a candidate model. Each candidate model is verified on the verification set to determine its loss value, and the candidate model with the smallest loss value is selected as the trained convolutional neural network model.
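The 7:3 split and the selection of the candidate with the smallest verification loss can be sketched as follows. This is a simplified illustration that leaves out the Keras training loop itself; the shuffling step, the seed, the function names, and the loss values are assumptions made for the example.

```python
import random


def split_samples(samples, seed=0):
    """Shuffle a copy of the samples and split it 7:3."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def best_candidate(verification_losses):
    """Return the index of the candidate with the smallest loss."""
    return min(range(len(verification_losses)),
               key=verification_losses.__getitem__)


train_set, verification_set = split_samples(range(100))
# Hypothetical verification losses of three candidate models.
chosen = best_candidate([0.9, 0.4, 0.6])
```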
Third, the classification process:
Historical monitoring data of a target business index for 30 days is obtained, with 1440 data points per day. The data points of each day are then compressed using a 5-minute time window to obtain 288 data points.
For the 288 data points corresponding to one day of historical monitoring data, as shown in fig. 5, the 288 data points are input to the trained convolutional neural network model as a one-dimensional image (288 × 1 × 1, where 288 × 1 represents an image height of 288 and an image width of 1, and the final 1 represents a channel number of 1).
The convolution layer 1 performs feature extraction on the one-dimensional image (288 × 1) based on the attribute parameter (3 × 1 × 4), and obtains a first feature image (286 × 1 × 4, where 286 × 1 denotes an image height 286 and an image width 1, and 4 denotes a channel number 4).
The first feature image is input to the convolution layer 2, and the convolution layer 2 performs feature extraction on the first feature image (286 × 1 × 4) based on the attribute parameter (3 × 1), to obtain a second feature image (284 × 1 × 4, where 284 × 1 represents the image height 284 and the image width 1, and 4 represents the number of channels 4).
The second feature image is input to the pooling layer 1, and the pooling layer 1 down-samples the second feature image (284 x 1 x 4) based on the attribute parameter (3 x 1) to obtain a first pooled image (94 x 1 x 4, wherein 94 x 1 represents the image height 94 and the image width 1 and 4 represents the number of channels 4).
The first pooled image was input to convolutional layer 3, and convolutional layer 3 performs feature extraction on first pooled image (94 x 1 x 4) based on attribute parameters (3 x 1 x 4) to obtain a third feature image (92 x 1 x 16, where 92 x 1 represents image height 92 and image width 1, and 16 represents channel number 16).
The third feature image is input to the pooling layer 2, and the pooling layer 2 down-samples the third feature image (92 x 1 x 16) based on the attribute parameter (3 x 1) to obtain a second pooled image (30 x 1 x 16, where 30 x 1 represents the image height 30 and the image width 1 and 16 represents the number of channels 16).
The second pooled image was input to convolutional layer 4, and convolutional layer 4 performs feature extraction on the second pooled image (30 x 1 x 16) based on the attribute parameter (3 x 1), to obtain a fourth feature image (28 x 1 x 16, where 28 x 1 represents image height 28 and image width 1, and 16 represents channel number 16).
The fourth feature image is input to the pooling layer 3, and the pooling layer 3 down-samples the fourth feature image (28 x 1 x 16) based on the attribute parameter (3 x 1) to obtain a third pooled image (9 x 1 x 16, where 9 x 1 indicates the image height 9 and the image width 1 and 16 indicates the number of channels 16).
The third pooled image (9 x 1 x 16) is input to the flattening layer and expanded to obtain a fifth feature image (144 x 1 x 1, where 144 x 1 indicates an image height of 144 and an image width of 1, and the final 1 indicates a channel number of 1).
Inputting the fifth feature image (144 x 1) into a fully-connected network, dividing 144 local features in the fifth feature image into three categories of a random fluctuation index, a periodic index and a stationary index based on the similarity between the local features in the fifth feature image and each category, then determining the probability corresponding to the three categories of the random fluctuation index, the periodic index and the stationary index according to the number of the divided local features in the three categories of the random fluctuation index, the periodic index and the stationary index, and taking the category with the maximum probability as a first index label of the target service index.
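The image heights quoted in the classification process can be checked with a short calculation. The sketch below assumes unpadded 3-point convolutions (each shortens the series by 2) and non-overlapping 3-point pooling (integer division by 3); the channel count of 16 is taken from the description as stated rather than derived.

```python
def conv_len(n, kernel=3):
    """Output length of an unpadded convolution."""
    return n - kernel + 1


def pool_len(n, size=3):
    """Output length of non-overlapping pooling (any remainder is dropped)."""
    return n // size


n = 288
trace = [n]
for layer in ["conv", "conv", "pool", "conv", "pool", "conv", "pool"]:
    n = conv_len(n) if layer == "conv" else pool_len(n)
    trace.append(n)

flattened = trace[-1] * 16  # 16 channels, as stated for the third pooled image
```

The resulting lengths 288, 286, 284, 94, 92, 30, 28, and 9 agree with the feature image sizes given above, and 9 × 16 gives the 144 values that enter the fully connected network.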
In addition, in practical applications, the above processes can be encapsulated into functions in advance, and the functions can then be called directly to build the model, train the model, classify service indexes, and so on. For example, the model is configured with the model.compile() function, trained with the model.fit() function, and finally saved with the model.save_weights() function. When the model is needed later, the model.load_weights() function is called to load it, and the model.predict() function is then called to predict the first index label of the target service index.
In the embodiment of the application, the index label of the service index is automatically predicted by adopting the convolutional neural network model, so that the labeling efficiency is improved, and the labor cost is reduced. Secondly, monitoring data of the service indexes are more, and the convolutional neural network model can reduce the number of parameters and extract and compress the characteristics through local connection, weight sharing and down sampling. The characteristic extraction is carried out by utilizing the plurality of convolution layers and the plurality of pooling layers, the position relation among the pixel points is obtained, the gradient conduction problem of a full-connection network is solved, the performance of the model is improved, and the accuracy of the index label obtained by prediction is improved.
It should be noted that, in the embodiment of the present application, the structure of the convolutional neural network is not limited to the structure described above; the number and arrangement of the convolutional layers and pooling layers may be adjusted according to actual needs. In addition, the model used to classify service indexes is not limited to a convolutional neural network model and may be another model with a classification function, such as a support vector machine, a decision tree, or K clustering, which is not specifically limited in the present application.
Optionally, in this embodiment of the present application, the indexes include basic indexes in addition to service indexes. The basic indexes include host indexes and program runtime indexes. Host indexes may be further divided into container indexes, virtual machine indexes, and the like, and cover, for example, CPU, memory, IO, and network. Program runtime indexes include the various indexes generated while the Java virtual machine runs, such as the number of active threads, the buffer size, and the memory size.
The method for detecting the abnormality of the basic index in the embodiment of the application is shown in fig. 6, and includes the following steps:
step S601, a second index tag of the target basic index and second monitoring data are obtained.
Specifically, the second index label of the target basic index is obtained from the index label library by regular-expression matching based on the name of the target basic index. The index label library contains a variety of predefined index labels, such as CPU usage, CPU 1-minute load, CPU 5-minute load, CPU 15-minute load, IO read count, IO write count, memory usage, network inbound packets, packet loss, GC (garbage collection) time, and active thread count. The name of each basic index is obtained through the monitoring system, and the name is then matched by regular expression against the index labels in the index label library to obtain the second index label of the basic index.
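One way to picture the matching of an index name against the index label library is the sketch below. The label names come from the list above, but the regular expressions, the index names such as `host.cpu.usage_percent`, and the function `match_label` are hypothetical and serve only as an illustration.

```python
import re

# Hypothetical label library: (index label, pattern applied to the index name).
LABEL_PATTERNS = [
    ("CPU usage", re.compile(r"cpu.*(usage|util)", re.IGNORECASE)),
    ("memory usage", re.compile(r"mem(ory)?.*(usage|used)", re.IGNORECASE)),
    ("GC time", re.compile(r"\bgc\b.*time", re.IGNORECASE)),
    ("active thread count", re.compile(r"thread.*(active|count)", re.IGNORECASE)),
]


def match_label(index_name):
    """Return the first label whose pattern matches the name, else None."""
    for label, pattern in LABEL_PATTERNS:
        if pattern.search(index_name):
            return label
    return None  # no match: the index is left for manual labeling
```

For instance, `match_label("jvm.gc.time_ms")` returns the GC time label, while a name that matches no pattern returns `None`.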
In addition, important indexes, or indexes that cannot be labeled through regular-expression matching or the convolutional neural network model, can be labeled manually.
Step S602, based on the second index label of the target basic index, a corresponding second detection rule is obtained from the anomaly detection rule base.
Step S603, performing anomaly detection on the second monitoring data by using a second detection rule, and determining whether the target base index is an anomaly index.
In the embodiment of the application, index labels are automatically assigned to the basic indexes by regular-expression matching against the index label library, so that the basic indexes do not need to be labeled manually, which improves labeling efficiency and reduces labor cost. Secondly, the second detection rule obtained from the anomaly detection rule base is used to perform anomaly detection on the second monitoring data and determine whether the basic index is an abnormal index. Different detection rules are adopted according to the characteristics of different indexes, which meets the anomaly detection requirements of different indexes and improves the accuracy of anomaly detection.
Optionally, the anomaly detection rule base is pre-constructed, and the anomaly detection rule base at least includes the following anomaly detection models:
1. Same-ring ratio.
The same-ring ratio reflects the fluctuation range of an index and can also reflect its periodicity.
The same-ring ratio comprises a same ratio and a ring ratio. The same ratio is obtained by comparing the monitoring data at the current time point with the average value of the monitoring data at the same time point over a past period, as shown in formula (2):
$$f_1(x) = \frac{\left| X_{new} - \mathrm{mean}(x_1, x_2, \ldots, x_n) \right|}{\mathrm{mean}(x_1, x_2, \ldots, x_n)} \tag{2}$$
where $f_1(x)$ denotes the same ratio, $X_{new}$ denotes the monitoring data at the current time point, $\mathrm{mean}(x_1, x_2, \ldots, x_n)$ denotes the average value of the monitoring data at the same time point over the past period, and $|\cdot|$ denotes the absolute value.
For example, suppose the current time is 14:23 on April 19. The monitoring data of the target service index at 14:23 on April 19 is obtained, together with the monitoring data of the target service index within 14:21-14:25 on each of the past 7 days, such as 14:21-14:25 on April 18 and 14:21-14:25 on April 17. The average value of the monitoring data obtained for those 7 days is calculated, and the same ratio of the target service index is then calculated using formula (2).
The ring ratio is obtained by comparing the monitoring data at the current time point with the average value of the monitoring data at the same time point in each past cycle (for example, the same weekday in previous weeks), as shown in formula (3):
$$f_2(x) = \frac{\left| K_{new} - \mathrm{mean}(k_1, k_2, \ldots, k_n) \right|}{\mathrm{mean}(k_1, k_2, \ldots, k_n)} \tag{3}$$
where $f_2(x)$ denotes the ring ratio, $K_{new}$ denotes the monitoring data at the current time point, $\mathrm{mean}(k_1, k_2, \ldots, k_n)$ denotes the average value of the monitoring data at the same time point in each past cycle, and $|\cdot|$ denotes the absolute value.
For example, suppose the current time is 14:23 on April 19 (a Monday). The monitoring data of the target service index at 14:23 on April 19 is obtained, together with the monitoring data of the target service index within 14:21-14:25 on each Monday of the past 4 weeks, such as 14:21-14:25 on April 12 and 14:21-14:25 on April 5. The average value of the monitoring data obtained for those 4 Mondays is calculated, and the ring ratio of the target service index is then calculated using formula (3).
The average value of the same ratio and the ring ratio is taken as the same-ring ratio, and an alarm condition is preset for the same-ring ratio, for example that the same-ring ratio is greater than 50%. An alarm is triggered when the same-ring ratio meets the alarm condition.
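Formulas (2) and (3) share the same form and differ only in which historical values are averaged, so both can be computed with one helper. The following sketch uses made-up monitoring values; the function name `deviation_ratio` and the data are assumptions, while the 50% alarm threshold is the example given above.

```python
from statistics import mean


def deviation_ratio(current, history):
    """|current - mean(history)| / mean(history), as in formulas (2) and (3)."""
    baseline = mean(history)
    if baseline == 0:
        raise ValueError("baseline mean is zero; the ratio is undefined")
    return abs(current - baseline) / baseline


current = 180.0
# Hypothetical values at the same time of day on each of the past 7 days.
past_7_days = [100.0, 110.0, 90.0, 105.0, 95.0, 100.0, 100.0]
# Hypothetical values at the same time of day on the past 4 Mondays.
past_4_mondays = [120.0, 120.0, 110.0, 130.0]

same_ratio = deviation_ratio(current, past_7_days)
ring_ratio = deviation_ratio(current, past_4_mondays)
same_ring_ratio = (same_ratio + ring_ratio) / 2
alarm = same_ring_ratio > 0.5  # example alarm condition: greater than 50%
```

With these values the same ratio is 0.8, the ring ratio is 0.5, and their average of 0.65 exceeds the example threshold, so an alarm would be triggered.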
2. First and second derivatives.
The first derivative reflects the variation trend of the index, and the second derivative reflects the change of the slope.
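For discretely sampled monitoring data, the derivatives can be approximated by successive differences. The sketch below is one such approximation, assuming equally spaced samples; the application does not specify how the derivatives are computed, and the sample series is made up.

```python
def diff(values):
    """Difference between each point and the previous one."""
    return [b - a for a, b in zip(values, values[1:])]


series = [10.0, 12.0, 15.0, 19.0, 19.0]
first_derivative = diff(series)             # trend: rising, then flat
second_derivative = diff(first_derivative)  # change of slope
```

The large negative last value of the second difference marks the point where the rise stops abruptly.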
3. Standard deviation model.
The standard deviation represents the degree of dispersion of the monitored data of the index.
4. Isolated random forest model.
The isolated random forest is used for isolating abnormal data points in the monitoring data of the indexes, and reflects the discrete condition of the data.
5. A moving average model.
The moving average model is used for predicting the monitoring data of the index in the future.
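A simple moving average forecast can be sketched as follows. The window length of 3 and the function name are assumptions for the example; the application does not state which window or which variant of moving average is used.

```python
from statistics import mean


def moving_average_forecast(values, window=3):
    """Predict the next value as the mean of the last `window` values."""
    if len(values) < window:
        raise ValueError("not enough data points for the chosen window")
    return mean(values[-window:])


history = [10.0, 12.0, 14.0, 16.0, 18.0]
forecast = moving_average_forecast(history)
```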
6. Autoregressive Integrated Moving Average model, ARIMA model for short.
The ARIMA model is used for predicting the monitoring data of the index in the future.
7. Fluctuation detection model.
Specifically, the fluctuation value of the index is determined by comparing the monitoring data at the current time point with the average value of the monitoring data over a past period of time, as shown in the following formula (4):
$$f_3(x) = \frac{\mathrm{mean} - x}{\mathrm{mean}} \tag{4}$$
where $f_3(x)$ denotes the fluctuation value, $\mathrm{mean}$ denotes the average value of the monitoring data over a past period of time, and $x$ denotes the monitoring data at the current time point.
Further, an alarm condition corresponding to the fluctuation value is preset, for example, for the basic index, if the fluctuation value is within the fluctuation range of [ -20%, 30% ], no alarm is triggered, and if the fluctuation value is not within the fluctuation range of [ -20%, 30% ], an alarm is triggered.
For another example, for the service index, if the fluctuation value is within the fluctuation range of [ -30%, 20% ], no alarm is triggered, and if the fluctuation value is not within the fluctuation range of [ -30%, 20% ], an alarm is triggered.
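The fluctuation value of formula (4) and the two example ranges above can be sketched as follows. The function names and the sample values are assumptions; the ranges are the ones given in the examples for basic indexes and service indexes.

```python
def fluctuation_value(history_mean, current):
    """Formula (4): (mean - x) / mean."""
    return (history_mean - current) / history_mean


def should_alarm(value, lower, upper):
    """Alarm when the fluctuation value falls outside [lower, upper]."""
    return not (lower <= value <= upper)


BASIC_RANGE = (-0.20, 0.30)    # example range for basic indexes
SERVICE_RANGE = (-0.30, 0.20)  # example range for service indexes

value = fluctuation_value(100.0, 75.0)  # current value is 25% below the mean
basic_alarm = should_alarm(value, *BASIC_RANGE)
service_alarm = should_alarm(value, *SERVICE_RANGE)
```

Because formula (4) subtracts the current value from the mean, a drop below the mean gives a positive fluctuation value. The same 25% drop therefore stays within the basic-index range but exceeds the upper bound of the service-index range.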
8. Fluctuation trend identification model.
Specifically, the fluctuation trend identification model is used for identifying the variation trend of the monitoring data of the index, the variation trend comprises increasing trend and decreasing trend, and during specific identification, the fluctuation trend can be identified from multiple dimensions such as a minute dimension, an hour dimension, a day dimension and the like.
For example, in the minute dimension: a first monitoring data average value over the current 5 minutes is obtained, together with the monitoring data average value of each of the past ten 5-minute windows. The average values of the past ten 5-minute windows are averaged to obtain a second monitoring data average value, and the minute-dimension fluctuation value between the first monitoring data average value and the second monitoring data average value is calculated. The variation trend of the monitoring data in the minute dimension is judged based on the minute-dimension fluctuation value. A fluctuation range of [-10%, 10%] is also set: if the minute-dimension fluctuation value is within [-10%, 10%], no alarm is triggered; otherwise, an alarm is triggered.
The hour dimension:
1. A first monitoring data average value over the current hour (17:00 to 18:00 today) is obtained, together with the monitoring data average value of each of the past 10 hours (7:00 to 17:00 today). The average values of the past 10 hours are averaged to obtain a second monitoring data average value, and the hour-dimension fluctuation value between the first monitoring data average value and the second monitoring data average value is calculated. The variation trend of the monitoring data in the hour dimension is judged based on the hour-dimension fluctuation value. A fluctuation range of [-10%, 10%] is also set: if the hour-dimension fluctuation value is within [-10%, 10%], no alarm is triggered; otherwise, an alarm is triggered.
2. A first monitoring data average value over the current hour (17:00 to 18:00 today, where today is a Monday) is obtained, together with the monitoring data average value of each of the 10 hours preceding the same time point last Monday (7:00 to 17:00 last Monday). The average values of those 10 hours are averaged to obtain a second monitoring data average value, and the hour-dimension fluctuation value between the first monitoring data average value and the second monitoring data average value is calculated. The variation trend of the monitoring data in the hour dimension is judged based on the hour-dimension fluctuation value. A fluctuation range of [-10%, 10%] is also set: if the hour-dimension fluctuation value is within [-10%, 10%], no alarm is triggered; otherwise, an alarm is triggered.
The day dimension:
1. A first monitoring data average value for today and a second monitoring data average value for yesterday are obtained. The day-dimension fluctuation value between the first monitoring data average value and the second monitoring data average value is then calculated. The variation trend of the monitoring data in the day dimension is judged based on the day-dimension fluctuation value. A fluctuation range of [-10%, 10%] is also set: if the day-dimension fluctuation value is within [-10%, 10%], no alarm is triggered; otherwise, an alarm is triggered.
2. A first monitoring data average value for today and a second monitoring data average value for the past month are obtained. The day-dimension fluctuation value between the first monitoring data average value and the second monitoring data average value is then calculated. The variation trend of the monitoring data in the day dimension is judged based on the day-dimension fluctuation value. A fluctuation range of [-10%, 10%] is also set: if the day-dimension fluctuation value is within [-10%, 10%], no alarm is triggered; otherwise, an alarm is triggered.
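The minute, hour, and day dimensions follow the same pattern: average the current window, average the averages of the past windows, and compare the two. A minimal sketch of that pattern is given below. The application does not state how the fluctuation value between the two averages is signed, so the sketch assumes the relative change (first - second) / second, chosen so that a positive value means an increasing trend; the function name and the data are likewise hypothetical.

```python
from statistics import mean


def trend_check(current_window, past_windows, band=0.10):
    """Return (trend, fluctuation, alarm) for one time dimension."""
    first_avg = mean(current_window)
    second_avg = mean([mean(w) for w in past_windows])
    # Assumed sign convention: positive means the index is increasing.
    fluctuation = (first_avg - second_avg) / second_avg
    if fluctuation > 0:
        trend = "increasing"
    elif fluctuation < 0:
        trend = "decreasing"
    else:
        trend = "flat"
    alarm = abs(fluctuation) > band  # outside [-10%, 10%] triggers an alarm
    return trend, fluctuation, alarm


# Minute dimension: current 5 minutes versus the past ten 5-minute windows.
past = [[100.0] * 5 for _ in range(10)]
spike = trend_check([120.0] * 5, past)
mild = trend_check([95.0] * 5, past)
```

The same function can be reused for the hour and day dimensions by passing hourly or daily windows instead of 5-minute windows.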
9. Comprehensive identification model.
A comprehensive identification model is trained based on the anomaly detection results output by models 1-6 above. After training, models 1-6 are each applied to the monitoring data of an index to determine a plurality of anomaly detection results for the index, and these results are then input into the comprehensive identification model to obtain a more accurate comprehensive detection result.
Specifically, the models described in 1-6 above are used to process the historical monitoring data of an index to obtain training samples for training the comprehensive identification model.
For the historical monitoring data in any period of time, the same-ring ratio, the first derivative, and the second derivative of the index are calculated based on the historical monitoring data. The historical monitoring data of the index is input into the standard deviation model to obtain the standard deviation of the index. The historical monitoring data of the index is input into the isolated random forest model to obtain the abnormal data points of the index. The historical monitoring data of the index is input into the moving average model to obtain first predicted monitoring data of the index. The historical monitoring data of the index is input into the ARIMA model to obtain second predicted monitoring data of the index.
The obtained anomaly detection results and a pre-labeled comprehensive detection result of the index are combined into one training sample. The anomaly detection results comprise the same-ring ratio, the first derivative, the second derivative, the standard deviation, the abnormal data points, the first predicted monitoring data, and the second predicted monitoring data of the index. The pre-labeled comprehensive detection result may be the anomaly detection result with the highest accuracy among the anomaly detection results, or a comprehensive detection result obtained by combining the anomaly detection results. Other training samples can be constructed in the same manner and are not described in detail here.
The comprehensive identification model is trained using the obtained training samples. After training, the same-ring ratio, the first derivative, and the second derivative of the target service index are calculated from the monitoring data of the target service index. The monitoring data of the target service index is input into the standard deviation model to obtain the standard deviation of the target service index. The monitoring data of the target service index is input into the isolated random forest model to obtain the abnormal data points of the target service index. The monitoring data of the target service index is input into the moving average model to obtain first predicted monitoring data of the target service index. The monitoring data of the target service index is input into the ARIMA model to obtain second predicted monitoring data of the target service index.
The same-ring ratio, the first derivative, the second derivative, the standard deviation, the abnormal data points, the first predicted monitoring data, and the second predicted monitoring data of the target service index are then input into the trained comprehensive identification model to obtain the comprehensive detection result of the target service index.
It should be noted that, in the embodiment of the present application, the output result of the model described in above 1 to 6 is not limited to be combined to construct the comprehensive identification model, and the output result of the partial model in the model described in above 1 to 6 may also be used to construct the comprehensive identification model, which is not limited in this application.
In the embodiment of the application, the anomaly detection rule base is pre-configured with various anomaly detection algorithms, such as a time series algorithm, a neural network algorithm, a data probability statistical algorithm and the like, so that the anomaly detection requirements of various indexes are met.
Optionally, on the basis of each abnormality detection model in the abnormality detection rule base, the embodiment of the application lays out corresponding abnormality detection rules for each index according to the characteristics of each index.
In a possible implementation manner, the first index label of the target service index indicates that the target service index is a periodic index. A standard deviation model, an isolated random forest model, and a moving average model are each used to perform anomaly detection on the first monitoring data to obtain the anomaly detection result output by each model. If at least two of the obtained anomaly detection results indicate that the target service index is an abnormal index, the target service index is determined to be an abnormal index; otherwise, the target service index is determined to be a normal index.
Specifically, the first monitoring data is input into a standard deviation model, and the standard deviation of the target service index is determined. And then judging whether the standard deviation of the target service index is within a preset range, if so, determining the target service index as a normal index, and otherwise, determining the target service index as an abnormal index.
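The standard deviation check described above can be sketched as follows. This is only a minimal illustration: the function name `std_dev_verdict`, the use of the population standard deviation, and the bounds `low` and `high` are assumptions, since the source does not specify how the preset range is chosen.

```python
from statistics import pstdev


def std_dev_verdict(data, low, high):
    """Return 'normal' if the standard deviation of data lies in [low, high].

    The bounds are hypothetical; the source only says "a preset range".
    """
    sigma = pstdev(data)  # population standard deviation (an assumption)
    return "normal" if low <= sigma <= high else "abnormal"


# A steady series stays inside the range, a widely scattered one does not.
print(std_dev_verdict([100, 102, 98, 101, 99], 0.0, 5.0))
print(std_dev_verdict([100, 180, 20, 160, 40], 0.0, 5.0))
```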
Anomaly detection is performed on the first monitoring data by the isolated random forest model to determine the abnormal points in the first monitoring data, and the degree of dispersion of the first monitoring data is then determined according to those abnormal points. If the degree of dispersion of the first monitoring data is greater than a preset threshold, the target service index is determined to be an abnormal index; otherwise, the target service index is determined to be a normal index.
And predicting the monitoring data of the target service index in a future period of time based on the first monitoring data by adopting a moving average model, and then judging whether the target service index is an abnormal index or not according to the monitoring data in the future period of time.
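A moving average forecast of this kind might look like the sketch below. The source does not state how the predicted data is judged, so the band check in `moving_average_verdict` (predicted values must stay within `low` and `high`) is an assumption, as are the function names and the window size.

```python
def moving_average_forecast(data, window, steps):
    """Forecast `steps` future points; each is the mean of the last `window` values."""
    if len(data) < window:
        raise ValueError("need at least `window` observations")
    history = list(data)
    forecast = []
    for _ in range(steps):
        next_value = sum(history[-window:]) / window
        forecast.append(next_value)
        history.append(next_value)  # feed the forecast back in for the next step
    return forecast


def moving_average_verdict(data, window, steps, low, high):
    """Hypothetical judgment: abnormal if any predicted value leaves [low, high]."""
    forecast = moving_average_forecast(data, window, steps)
    return "abnormal" if any(v < low or v > high for v in forecast) else "normal"


print(moving_average_forecast([1, 2, 3], window=3, steps=2))
```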
When at least two of the anomaly detection results output by the three models indicate that the target service index is an abnormal index, the target service index is determined to be an abnormal index.
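The two-out-of-three decision can be illustrated with a short voting sketch. The dictionary keys and the `vote` function are hypothetical names; each model's verdict is assumed to have been computed already.

```python
def vote(results, min_abnormal=2):
    """Combine per-model verdicts; abnormal if at least `min_abnormal` models agree."""
    abnormal_votes = sum(1 for verdict in results.values() if verdict == "abnormal")
    return "abnormal" if abnormal_votes >= min_abnormal else "normal"


results = {
    "standard_deviation": "abnormal",
    "isolated_random_forest": "normal",
    "moving_average": "abnormal",
}
print(vote(results))
```

Requiring agreement between at least two models means that a single model's false positive does not, on its own, mark the index as abnormal.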
Optionally, after the target service index is determined to be an abnormal index, whether the homocyclic ratio of the first monitoring data meets a preset alarm condition is determined, and if so, an alarm is triggered.
Specifically, the same ratio and the ring ratio of the first monitoring data are calculated, and the average of the two is taken as the homocyclic ratio of the first monitoring data. The alarm condition may be set according to actual requirements, for example, that the homocyclic ratio of the first monitoring data is greater than 50%. The calculation methods of the same ratio, the ring ratio and the homocyclic ratio are described above and are not repeated here.
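As a rough illustration of the alarm check, the sketch below averages two growth ratios and compares the result with the 50% example threshold. The growth formula `(current - reference) / reference` is an assumption made here for illustration; the document's own definitions of the same ratio and the ring ratio appear in an earlier section and take precedence.

```python
def growth_ratio(current, reference):
    """Relative change of `current` against `reference` (assumed formula).

    Raises ZeroDivisionError if `reference` is zero.
    """
    return (current - reference) / reference


def homocyclic_ratio(current, same_period_value, previous_period_value):
    """Average of the same ratio and the ring ratio, as described in the text."""
    same_ratio = growth_ratio(current, same_period_value)
    ring_ratio = growth_ratio(current, previous_period_value)
    return (same_ratio + ring_ratio) / 2


def should_alarm(ratio, threshold=0.5):
    """Example alarm condition from the text: homocyclic ratio greater than 50%."""
    return ratio > threshold


ratio = homocyclic_ratio(current=180, same_period_value=100, previous_period_value=120)
print(ratio, should_alarm(ratio))
```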
In the embodiment of the application, according to the periodic variation characteristics of periodic indexes, the standard deviation model, the isolated random forest model and the moving average model are arranged for periodic indexes to carry out anomaly detection, which improves the anomaly detection capability. In addition, the anomaly detection results of the three models are combined to judge whether the target service index is an abnormal index, which avoids the one-sidedness of anomaly detection by a single model and improves the accuracy of anomaly detection.
In a possible implementation manner, the first index tag of the target service index represents that the target service index is a periodic index, and the grade of the target service index is greater than a preset threshold.
The homocyclic ratio and the target derivative of the first monitoring data are determined, and anomaly detection is then performed on the first monitoring data separately by a standard deviation model, an isolated random forest model, a moving average model and an integrated moving average autoregressive (ARIMA) model to obtain the anomaly detection result output by each model. Based on the obtained anomaly detection results, the homocyclic ratio and the target derivative, the trained comprehensive identification model is then used to determine whether the target service index is an abnormal index.
Specifically, when the grade of the target service index is greater than the preset threshold, it is indicated that the target service index is an important index, and the important index has a high requirement on the accuracy of the abnormality detection. The target derivative of the first monitored data includes a first derivative and a second derivative of the first monitored data, the comprehensive identification model is a neural network model, and the process of training the comprehensive identification model is described above and is not described herein again.
And calculating the homocyclic ratio, the first derivative and the second derivative of the target service index based on the first monitoring data. And inputting the first monitoring data into a standard deviation model to obtain the standard deviation of the target service index. And inputting the first monitoring data into the isolated random forest model to obtain abnormal data points of the target service index. And inputting the first monitoring data into the moving average model to obtain first prediction monitoring data of the target service index. And inputting the first monitoring data into an ARIMA model to obtain second predicted monitoring data of the target service index.
And then inputting the homocyclic ratio, the first derivative, the second derivative, the standard deviation, the abnormal data point, the first prediction monitoring data and the second prediction monitoring data of the target service index into the trained comprehensive identification model to obtain the comprehensive detection result of the target service index. And then determining whether the target service index is an abnormal index based on the comprehensive detection result of the target service index.
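The first and second derivative inputs mentioned above can be approximated on discrete monitoring data by successive differences. This is a sketch under the assumption that samples are evenly spaced; the document's own derivative calculation is defined in an earlier section, and the function names here are hypothetical.

```python
def first_derivative(data):
    """Finite-difference approximation: change between consecutive samples."""
    return [b - a for a, b in zip(data, data[1:])]


def second_derivative(data):
    """Difference of the first differences."""
    return first_derivative(first_derivative(data))


series = [1, 4, 9, 16]
print(first_derivative(series))   # rate of change
print(second_derivative(series))  # change in the rate of change
```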
In the embodiment of the application, the homocyclic ratio, the target derivative, the standard deviation model, the isolated random forest model, the moving average model and the integrated moving average autoregressive model are integrated to determine whether the target service index is an abnormal index, so that the one-sidedness of a single model in abnormal detection is avoided, the accuracy of abnormal detection is improved, and the requirement of important indexes on the accuracy of abnormal detection is met.
In one possible implementation, the first index tag of the target service index represents that the target service index is a stable index or a random fluctuation index.
And performing fluctuation detection on the first monitoring data by adopting a fluctuation detection model, and determining a target fluctuation value of the target service index. And then, adopting a fluctuation trend identification model to identify the fluctuation trend of the first monitoring data and determining the first fluctuation trend of the target service index. If the target fluctuation value and/or the first fluctuation trend meet the corresponding alarm conditions, determining the target service index as an abnormal index and triggering an alarm; otherwise, determining the target service index as a normal index.
Specifically, the first fluctuation trend of the target service index may be a fluctuation trend in a minute dimension, an hour dimension, a day dimension, and the like. The alarm conditions may be set according to actual requirements. For example, the alarm condition corresponding to the target fluctuation value may be that the fluctuation falls outside the range [-10%, 10%], and the alarm condition corresponding to the first fluctuation trend may be that the fluctuation is larger than 20% when increasing or larger than 30% when decreasing. When at least one of the target fluctuation value and the first fluctuation trend meets its corresponding alarm condition, the target service index is determined to be an abnormal index and an alarm is triggered. The way of calculating the target fluctuation value and the first fluctuation trend is described in the foregoing and is not repeated here.
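The example alarm conditions can be written as a small predicate. The sketch assumes that the fluctuation value and the trend change are already available as fractions (0.10 for 10%) and that the trend direction is given as a string; these representations and the function name are hypothetical.

```python
def fluctuation_alarm(fluctuation_value, trend_direction, trend_change):
    """Return True if either example alarm condition from the text is met.

    fluctuation_value: signed fraction, alarm when outside [-10%, 10%].
    trend_change: magnitude of the trend, alarm when > 20% while increasing
    or > 30% while decreasing.
    """
    value_alarm = not (-0.10 <= fluctuation_value <= 0.10)
    if trend_direction == "increasing":
        trend_alarm = trend_change > 0.20
    elif trend_direction == "decreasing":
        trend_alarm = trend_change > 0.30
    else:
        trend_alarm = False
    return value_alarm or trend_alarm


print(fluctuation_alarm(0.05, "increasing", 0.25))
print(fluctuation_alarm(0.05, "decreasing", 0.25))
```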
In the embodiment of the application, according to the characteristics of the stable indexes or the random fluctuation indexes, the fluctuation detection model and the fluctuation trend recognition model are arranged for the stable indexes or the random fluctuation indexes, so that the indexes to be detected are more matched with the detection model, and the accuracy of abnormal detection is improved.
In one possible implementation manner, the fluctuation trend identification model is adopted to identify the fluctuation trend of the second monitoring data, and the second fluctuation trend of the target basic index is determined. If the second fluctuation trend meets the corresponding alarm condition or the second monitoring data exceeds the preset threshold range, determining the target basic index as an abnormal index and triggering an alarm; otherwise, determining the target basic index as a normal index.
Specifically, a threshold range is preset for the second monitoring data, and when the second monitoring data exceeds the preset threshold range, the target basic index is determined to be an abnormal index and an alarm is triggered. In order to learn of an abnormal condition of the second monitoring data in advance, the embodiment of the application uses a fluctuation trend identification model to identify the fluctuation trend of the second monitoring data and determine a second fluctuation trend of the target basic index. When the second fluctuation trend meets the corresponding alarm condition, for example when the trend is increasing and the fluctuation value is greater than 10%, the growth is too fast, and an alarm can be raised even if the second monitoring data has not yet exceeded the preset threshold range, thereby ensuring the safety of the system.
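A minimal sketch of this early-warning logic is shown below. The parameter names, the example threshold range, and the string encoding of the trend direction are assumptions; only the 10% growth example comes from the text.

```python
def basic_index_alarm(value, low, high, trend_direction, trend_change,
                      max_increase=0.10):
    """Alarm if the value leaves [low, high] or the upward trend is too steep."""
    out_of_range = value < low or value > high
    rising_too_fast = trend_direction == "increasing" and trend_change > max_increase
    return out_of_range or rising_too_fast


# Still inside the range, but growing by 15%: the alarm fires early.
print(basic_index_alarm(70, 0, 90, "increasing", 0.15))
```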
It should be noted that, on the basis of setting each abnormality detection model in the abnormality detection rule base, the abnormality detection rules arranged for each type of index are not limited to the above-mentioned ones, and may also be other rules, such as configuring an abnormality detection algorithm for each type of index.
Based on the same technical concept, the present application provides an index abnormality detection apparatus, as shown in fig. 7, the apparatus 700 includes:
an obtaining module 701, configured to obtain a first index tag and first monitoring data of a target service index, where the first index tag of the target service index is obtained by performing feature extraction on historical monitoring data of the target service index to obtain a historical index feature and then performing prediction based on the historical index feature;
a matching module 702, configured to obtain a corresponding first detection rule from an anomaly detection rule base based on the first index tag of the target service index;
the detecting module 703 is configured to perform anomaly detection on the first monitoring data by using the first detection rule, and determine whether the target service index is an anomaly index.
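The cooperation of the matching module and the detecting module can be pictured as a lookup followed by a call. In the sketch below the label strings, the `RULE_BASE` dictionary, and both rule functions are hypothetical stand-ins; the real rules are the model combinations described earlier.

```python
from typing import Callable, Dict, List

Rule = Callable[[List[float]], bool]  # returns True when the index is abnormal


def periodic_rule(data: List[float]) -> bool:
    # Stand-in only: the source uses a vote among three models for periodic indexes.
    return max(data) - min(data) > 50


def fluctuation_rule(data: List[float]) -> bool:
    # Stand-in only: relative change between the last two points beyond 10%.
    return abs(data[-1] - data[-2]) / abs(data[-2]) > 0.10


# Hypothetical rule base keyed by index tag.
RULE_BASE: Dict[str, Rule] = {
    "periodic": periodic_rule,
    "stable": fluctuation_rule,
    "random_fluctuation": fluctuation_rule,
}


def detect(index_tag: str, monitoring_data: List[float]) -> bool:
    rule = RULE_BASE[index_tag]   # matching module: tag -> detection rule
    return rule(monitoring_data)  # detecting module: apply the rule


print(detect("periodic", [10, 100, 20]))
print(detect("stable", [100, 101]))
```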
Optionally, a tag identification module 704 is further included;
the tag identification module 704 is specifically configured to:
and sequentially performing feature extraction on the historical monitoring data of the target service index by adopting N feature extraction modules to obtain the historical index features of the target service index, wherein each feature extraction module at least comprises a convolution layer and a pooling layer, and N is a preset positive integer.
Optionally, the N feature extraction modules include a first feature extraction module, a second feature extraction module, and a third feature extraction module, where the first feature extraction module includes a first convolution layer, a second convolution layer, and a first pooling layer, the second feature extraction module includes a third convolution layer and a second pooling layer, and the third feature extraction module includes a fourth convolution layer and a third pooling layer.
Optionally, the tag identification module 704 is specifically configured to:
sequentially extracting features of the historical monitoring data of the target service index by adopting the first convolution layer, the second convolution layer and the first pooling layer to obtain a first feature;
sequentially performing feature extraction on the first features by adopting the third convolution layer and the second pooling layer to obtain second features;
and sequentially extracting the features of the second feature by adopting the fourth convolution layer and the third pooling layer to obtain the historical index features of the target service index.
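The layer ordering of the three feature extraction modules can be traced with a dependency-free sketch. It assumes a single-channel one-dimensional input, "valid" convolution without padding, max pooling of size 2, and hand-picked kernels; none of these details are given in the source, and a real implementation would use trained kernels in a deep learning framework.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation form, as in most frameworks)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def max_pool1d(signal, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]


def extract_features(history, kernels):
    k1, k2, k3, k4 = kernels
    first = max_pool1d(conv1d(conv1d(history, k1), k2))  # module 1: conv, conv, pool
    second = max_pool1d(conv1d(first, k3))               # module 2: conv, pool
    return max_pool1d(conv1d(second, k4))                # module 3: conv, pool


# Hypothetical kernels that simply pick the middle sample of each window.
kernels = [[0, 1, 0]] * 4
print(extract_features(list(range(32)), kernels))
```

With 32 input samples and kernels of length 3, the sequence length shrinks 32, 30, 28, 14, 12, 6, 4, 2 through the layers, which shows how the modules progressively compress the history into a short feature.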
Optionally, the tag identification module 704 is specifically configured to:
and expanding the historical index characteristics by adopting a flattening layer and inputting the expanded historical index characteristics into a full-connection network to obtain a first index label of the target service index.
Optionally, a first index tag of the target service index represents that the target service index is a periodic index;
the detection module 703 is specifically configured to:
respectively carrying out anomaly detection on the first monitoring data by adopting a standard deviation model, an isolated random forest model and a moving average model to obtain anomaly detection results output by each model;
if at least two of the obtained anomaly detection results indicate that the target service index is an abnormal index, determining that the target service index is an abnormal index; otherwise, determining the target service index as a normal index.
Optionally, the detecting module 703 is further configured to:
and after the target service index is determined to be an abnormal index, if the homocyclic ratio of the first monitoring data is determined to meet a preset alarm condition, triggering an alarm.
Optionally, a first index tag of the target service index represents that the target service index is a periodic index, and the grade is greater than a preset threshold;
the detection module 703 is specifically configured to:
determining a homocyclic ratio and a target derivative of the first monitored data;
respectively carrying out anomaly detection on the first monitoring data by adopting a standard deviation model, an isolated random forest model, a moving average model and an integrated moving average autoregressive model to obtain anomaly detection results output by each model;
and determining whether the target service index is an abnormal index or not by adopting a trained comprehensive identification model based on each obtained abnormal detection result, the homocyclic ratio and the target derivative.
Optionally, a first index tag of the target service index represents that the target service index is a stationary index or a random fluctuation index;
the detection module 703 is specifically configured to:
performing fluctuation detection on the first monitoring data by adopting a fluctuation detection model, and determining a target fluctuation value of the target service index;
adopting a fluctuation trend identification model to identify the fluctuation trend of the first monitoring data and determining a first fluctuation trend of the target service index;
if the target fluctuation value and/or the first fluctuation trend meet corresponding alarm conditions, determining the target service index as an abnormal index and triggering an alarm; otherwise, determining the target service index as a normal index.
Optionally, the obtaining module 701 is further configured to:
acquiring a second index label and second monitoring data of a target basic index, wherein the second index label of the target basic index is acquired from an index label library by regular-expression matching based on the name of the target basic index;
the matching module 702 is further configured to:
acquiring a corresponding second detection rule from an abnormality detection rule base based on a second index label of the target basic index;
the detection module 703 is further configured to:
and carrying out anomaly detection on the second monitoring data by adopting the second detection rule, and determining whether the target basic index is an anomaly index.
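Assuming that "regular matching" refers to regular-expression matching on the index name, the label lookup for a basic index might be sketched as follows. The patterns, label strings, and index names are all hypothetical; the source does not describe the contents of the index label library.

```python
import re

# Hypothetical index label library: (name pattern, label) pairs.
LABEL_LIBRARY = [
    (re.compile(r"^cpu[_.]"), "cpu"),
    (re.compile(r"^mem(ory)?[_.]"), "memory"),
    (re.compile(r"^disk[_.]"), "disk"),
]


def match_label(index_name):
    """Return the label of the first pattern matching the index name, else None."""
    for pattern, label in LABEL_LIBRARY:
        if pattern.search(index_name):
            return label
    return None


print(match_label("cpu_usage"))
print(match_label("memory.used"))
print(match_label("qps"))
```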
Optionally, the detection module 703 is specifically configured to:
adopting a fluctuation trend identification model to identify the fluctuation trend of the second monitoring data and determining a second fluctuation trend of the target basic index;
if the second fluctuation trend meets the corresponding alarm condition or the second monitoring data exceeds the range of a preset threshold value, determining the target basic index as an abnormal index and triggering an alarm; otherwise, determining the target basic index as a normal index.
In the embodiment of the application, the historical index features are obtained by extracting features from the historical monitoring data of the service indexes, and the first index labels of the service indexes are then obtained by prediction based on the historical index features, so that index labels do not need to be manually marked on all the service indexes; labeling is automated, labeling efficiency is improved, and labor cost is reduced. In addition, based on the first index label of the target service index, a corresponding first detection rule is obtained from the anomaly detection rule base, and the first detection rule is then used to carry out anomaly detection on the first monitoring data to determine whether the target service index is an abnormal index. Different anomaly detection rules are adopted according to the characteristics of different indexes, so that the anomaly detection requirements of different indexes are met and the accuracy of anomaly detection is improved.
Based on the same technical concept, the embodiment of the present application provides a computer device, which may be a terminal or a server. As shown in fig. 8, the computer device includes at least one processor 801 and a memory 802 connected to the at least one processor. The specific connection medium between the processor 801 and the memory 802 is not limited in the embodiment of the present application; in fig. 8, the processor 801 and the memory 802 are connected through a bus as an example. The bus may be divided into an address bus, a data bus, a control bus, etc.
In the embodiment of the present application, the memory 802 stores instructions executable by the at least one processor 801, and the at least one processor 801 may execute the steps included in the index abnormality detection method by executing the instructions stored in the memory 802.
The processor 801 is the control center of the computer device, and may connect various parts of the computer device by using various interfaces and lines, and perform index abnormality detection by running or executing instructions stored in the memory 802 and calling data stored in the memory 802. Optionally, the processor 801 may include one or more processing units, and the processor 801 may integrate an application processor and a modem processor, wherein the application processor mainly handles operating systems, user interfaces, application programs, and the like, and the modem processor mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 801. In some embodiments, the processor 801 and the memory 802 may be implemented on the same chip, or in some embodiments, they may be implemented separately on separate chips.
The processor 801 may be a general-purpose processor, such as a Central Processing Unit (CPU), a digital signal processor, an Application Specific Integrated Circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof, configured to implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present Application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in a processor.
The memory 802, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 802 may include at least one type of storage medium, for example a flash memory, a hard disk, a multimedia card, a card-type memory, a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Programmable Read-Only Memory (PROM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a magnetic memory, a magnetic disk, an optical disk, and so on. The memory 802 may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 802 in the embodiments of the present application may also be a circuit or any other device capable of performing a storage function, for storing program instructions and/or data.
Based on the same inventive concept, embodiments of the present application provide a computer-readable storage medium storing a computer program executable by a computer device, which, when the program runs on the computer device, causes the computer device to execute the steps of the index abnormality detection method described above.
It should be apparent to those skilled in the art that embodiments of the present invention may be provided as a method or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (14)

1. An index abnormality detection method characterized by comprising:
acquiring a first index label and first monitoring data of a target service index, wherein the first index label of the target service index is obtained by performing feature extraction on historical monitoring data of the target service index to obtain a historical index feature and then performing prediction based on the historical index feature;
acquiring a corresponding first detection rule from an abnormality detection rule base based on a first index label of the target service index;
and carrying out anomaly detection on the first monitoring data by adopting the first detection rule, and determining whether the target service index is an anomaly index.
2. The method of claim 1, wherein the performing feature extraction on the historical monitoring data of the target service index to obtain a historical index feature comprises:
and sequentially performing feature extraction on the historical monitoring data of the target service index by adopting N feature extraction modules to obtain the historical index features of the target service index, wherein each feature extraction module at least comprises a convolution layer and a pooling layer, and N is a preset positive integer.
3. The method of claim 2, wherein the N feature extraction modules comprise a first feature extraction module, a second feature extraction module, and a third feature extraction module, wherein the first feature extraction module comprises a first convolutional layer, a second convolutional layer, and a first pooling layer, the second feature extraction module comprises a third convolutional layer and a second pooling layer, and the third feature extraction module comprises a fourth convolutional layer and a third pooling layer.
4. The method of claim 3, wherein the using N feature extraction modules to sequentially perform feature extraction on the historical monitoring data of the target service index to obtain the historical index feature of the target service index comprises:
sequentially extracting features of the historical monitoring data of the target service index by adopting the first convolution layer, the second convolution layer and the first pooling layer to obtain a first feature;
sequentially performing feature extraction on the first features by adopting the third convolution layer and the second pooling layer to obtain second features;
and sequentially extracting the features of the second feature by adopting the fourth convolution layer and the third pooling layer to obtain the historical index features of the target service index.
5. The method of claim 1, wherein obtaining the first index label of the target service index by prediction based on the historical index feature comprises:
and expanding the historical index characteristics by adopting a flattening layer and inputting the expanded historical index characteristics into a full-connection network to obtain a first index label of the target service index.
6. The method according to any of claims 1 to 5, wherein the first index tag of the target business index characterizes the target business index as a periodic index;
the performing, by using the first detection rule, an anomaly detection on the first monitoring data to determine whether the target service index is an abnormal index includes:
respectively carrying out anomaly detection on the first monitoring data by adopting a standard deviation model, an isolated random forest model and a moving average model to obtain anomaly detection results output by each model;
if at least two of the obtained anomaly detection results indicate that the target service index is an abnormal index, determining that the target service index is an abnormal index; otherwise, determining the target service index as a normal index.
7. The method of claim 6, wherein after determining that the target service index is an abnormal index, the method further comprises:
and if the homocyclic ratio of the first monitoring data meets a preset alarm condition, triggering an alarm.
8. The method according to any one of claims 1 to 5, wherein the first index tag of the target service index characterizes that the target service index is a periodic index, and the grade is greater than a preset threshold value;
the performing, by using the first detection rule, an anomaly detection on the first monitoring data to determine whether the target service index is an abnormal index includes:
determining a homocyclic ratio and a target derivative of the first monitored data;
respectively carrying out anomaly detection on the first monitoring data by adopting a standard deviation model, an isolated random forest model, a moving average model and an integrated moving average autoregressive model to obtain anomaly detection results output by each model;
and determining whether the target service index is an abnormal index or not by adopting a trained comprehensive identification model based on each obtained abnormal detection result, the homocyclic ratio and the target derivative.
9. The method according to any one of claims 1 to 5, wherein the first index tag of the target business index characterizes the target business index as a stationary index or a random fluctuation index;
the performing, by using the first detection rule, an anomaly detection on the first monitoring data to determine whether the target service index is an abnormal index includes:
performing fluctuation detection on the first monitoring data by adopting a fluctuation detection model, and determining a target fluctuation value of the target service index;
adopting a fluctuation trend identification model to identify the fluctuation trend of the first monitoring data and determining a first fluctuation trend of the target service index;
if the target fluctuation value and/or the first fluctuation trend meet corresponding alarm conditions, determining the target service index as an abnormal index and triggering an alarm; otherwise, determining the target service index as a normal index.
10. The method of claim 1, further comprising:
acquiring a second index label and second monitoring data of a target basic index, wherein the second index label of the target basic index is acquired from an index label library by regular-expression matching based on the name of the target basic index;
acquiring a corresponding second detection rule from an abnormality detection rule base based on a second index label of the target basic index;
and carrying out anomaly detection on the second monitoring data by adopting the second detection rule, and determining whether the target basic index is an anomaly index.
11. The method of claim 10, wherein the performing anomaly detection on the second monitoring data by using the second detection rule to determine whether the target basic index is an abnormal index comprises:
adopting a fluctuation trend identification model to identify the fluctuation trend of the second monitoring data and determining a second fluctuation trend of the target basic index;
if the second fluctuation trend meets the corresponding alarm condition or the second monitoring data exceeds the range of a preset threshold value, determining the target basic index as an abnormal index and triggering an alarm; otherwise, determining the target basic index as a normal index.
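A minimal sketch of the decision in claim 11 follows, under stated assumptions: the trend model is replaced by a check for a strictly increasing run at the end of the window, and the preset threshold is treated as a closed range `[lower, upper]`. The function names and default values are illustrative and do not come from the patent.

```python
def has_sustained_rise(data, run_length=4):
    """True if the last `run_length` samples are strictly increasing.

    Stand-in for the unspecified fluctuation trend identification model.
    """
    if len(data) < run_length:
        return False
    tail = data[-run_length:]
    return all(later > earlier for earlier, later in zip(tail, tail[1:]))


def is_basic_index_abnormal(data, lower=0.0, upper=90.0, run_length=4):
    """Abnormal if the trend alarms OR any sample leaves the preset range."""
    trend_alarm = has_sustained_rise(data, run_length)
    threshold_alarm = any(value < lower or value > upper for value in data)
    return trend_alarm or threshold_alarm
```

The two conditions are independent, so a slow climb that is still inside the range can raise an alarm before the threshold is crossed.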
12. An index anomaly detection device, comprising:
an acquisition module, configured to acquire a first index label and first monitoring data of a target service index, wherein the first index label of the target service index is obtained by performing feature extraction on historical monitoring data of the target service index to obtain historical index features and performing prediction on the historical index features;
a matching module, configured to acquire a corresponding first detection rule from an anomaly detection rule base based on the first index label of the target service index;
and a detection module, configured to perform anomaly detection on the first monitoring data by using the first detection rule and determine whether the target service index is an abnormal index.
13. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method of any of claims 1 to 11 are performed when the program is executed by the processor.
14. A computer-readable storage medium, storing a computer program executable by a computer device, the program, when executed on the computer device, causing the computer device to perform the steps of the method of any one of claims 1 to 11.
CN202110671950.7A 2021-06-17 2021-06-17 Index anomaly detection method, device, equipment and storage medium Pending CN113326177A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110671950.7A CN113326177A (en) 2021-06-17 2021-06-17 Index anomaly detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110671950.7A CN113326177A (en) 2021-06-17 2021-06-17 Index anomaly detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113326177A (en) 2021-08-31

Family

ID=77423579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110671950.7A Pending CN113326177A (en) 2021-06-17 2021-06-17 Index anomaly detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113326177A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115311829A (en) * 2022-10-12 2022-11-08 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Accurate alarm method and system based on mass data
CN115496644A (en) * 2022-11-18 2022-12-20 山东超华环保智能装备有限公司 Solid waste treatment equipment monitoring method based on data identification
CN115496644B (en) * 2022-11-18 2023-09-26 南通万达能源动力科技有限公司 Solid waste treatment equipment monitoring method based on data identification

Similar Documents

Publication Publication Date Title
CN111178456B (en) Abnormal index detection method and device, computer equipment and storage medium
US10209974B1 (en) Automated model management methods
US10311044B2 (en) Distributed data variable analysis and hierarchical grouping system
US11631029B2 (en) Generating combined feature embedding for minority class upsampling in training machine learning models with imbalanced samples
US10877863B2 (en) Automatic prediction system for server failure and method of automatically predicting server failure
CN108563739B (en) Weather data acquisition method and device, computer device and readable storage medium
CN112800116A (en) Method and device for detecting abnormity of service data
CN112150237B (en) Multi-model fused order overdue early warning method, device, equipment and storage medium
CN113326177A (en) Index anomaly detection method, device, equipment and storage medium
CN110781818B (en) Video classification method, model training method, device and equipment
CN112765385A (en) Information management method and system based on big data and Internet
CN114638633A (en) Abnormal flow detection method and device, electronic equipment and storage medium
CN110471945B (en) Active data processing method, system, computer equipment and storage medium
CN115269981A (en) Abnormal behavior analysis method and system combined with artificial intelligence
CN110874601B (en) Method for identifying running state of equipment, state identification model training method and device
CN115168509A (en) Processing method and device of wind control data, storage medium and computer equipment
CN109978038B (en) Cluster abnormity judgment method and device
CN114580791A (en) Method and device for identifying working state of bulking machine, computer equipment and storage medium
CN113689020A (en) Service information prediction method, device, computer equipment and storage medium
CN113569879B (en) Training method of abnormal recognition model, abnormal account recognition method and related device
CN116842174A (en) Agricultural resource database platform building method based on network data
CN117591860A (en) Data anomaly detection method and device
CN116755836A (en) Container abnormal restarting prediction method, training device and computer equipment
CA3221902A1 (en) Efficient cross-platform serving of deep neural networks for low latency applications
CN115907059A (en) Method for predicting periodicity of time series data, computer device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination