CN112882889A - Abnormality monitoring method, abnormality monitoring system, electronic device, and storage medium - Google Patents


Publication number
CN112882889A
Authority
CN
China
Prior art keywords
monitoring
data
curve
determining
data curve
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110088001.6A
Other languages
Chinese (zh)
Inventor
曹臻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202110088001.6A
Publication of CN112882889A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/30 - Monitoring
    • G06F11/3065 - Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/30 - Monitoring
    • G06F11/3003 - Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 - Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention discloses an anomaly monitoring method, an anomaly monitoring system, an electronic device, and a storage medium. The method comprises the following steps: acquiring a monitoring data curve of a target monitoring item; determining a dynamic threshold of each data partition in the monitoring data curve according to the distribution characteristics of the monitoring data curve; and performing anomaly monitoring on the target monitoring item based on the dynamic threshold of each data partition. In the embodiment of the invention, the dynamic threshold of each data partition can be determined merely by maintaining the historical time series data of the target monitoring item, namely the monitoring data curve. There is no need to spend large amounts of computing and storage resources training or updating a prediction model for each monitoring item, or large amounts of manpower maintaining the models, and an anomaly monitoring result with higher accuracy can still be obtained. Monitoring efficiency can therefore be improved to a certain extent, and the scheme can also meet the monitoring accuracy and monitoring efficiency requirements of scenarios with large-scale monitoring items.

Description

Abnormality monitoring method, abnormality monitoring system, electronic device, and storage medium
Technical Field
The present invention relates to the field of network technologies, and in particular, to an anomaly monitoring method, an anomaly monitoring system, an electronic device, and a storage medium.
Background
Currently, in the field of data anomaly monitoring, a prediction model for each monitoring item, for example an autoregressive moving average (ARMA) model, is usually trained on sample data of that monitoring item in order to predict an anomaly threshold. During anomaly monitoring, the trained prediction model processes the historical data to predict the anomaly threshold of the monitoring item, and the monitoring item is then monitored against that threshold.
However, in practical applications, the accuracy of anomaly monitoring depends on the complexity of the prediction model and the size of the training sample, and this approach requires establishing, training, and maintaining a prediction model for each monitoring item. As the number of monitoring items grows, the existing scheme therefore requires a large amount of storage and computing resources, as well as a large amount of human resources for model maintenance. In other words, the conventional anomaly monitoring scheme struggles to balance anomaly monitoring accuracy against data maintenance complexity, and in monitoring scenarios with large-scale monitoring items in particular, it is difficult to meet the requirements on monitoring accuracy and monitoring efficiency.
Disclosure of Invention
The embodiments of the invention aim to provide an anomaly monitoring method, an anomaly monitoring system, an electronic device, and a storage medium, so as to solve the technical problem that the monitoring results of existing anomaly monitoring methods are not accurate enough and may produce false alarms. The specific technical scheme is as follows:
In a first aspect of the embodiments of the present invention, there is provided an anomaly monitoring method, including the following steps:
acquiring a monitoring data curve of a target monitoring item, wherein the monitoring data curve reflects the correspondence between historical monitoring data and historical monitoring time of the target monitoring item;
determining a dynamic threshold of each data partition in the monitoring data curve according to the distribution characteristics of the monitoring data curve, wherein the monitoring data curve comprises N data partitions, and N is a positive integer;
and performing anomaly monitoring on the target monitoring item based on the dynamic threshold of each data partition.
In a second aspect of the embodiments of the present invention, there is also provided an anomaly monitoring system, including:
an acquisition module, configured to acquire a monitoring data curve of the target monitoring item, wherein the monitoring data curve reflects the correspondence between historical monitoring data and historical monitoring time of the target monitoring item;
a determining module, configured to determine a dynamic threshold of each data partition in the monitoring data curve according to the distribution characteristics of the monitoring data curve, wherein the monitoring data curve comprises N data partitions, and N is a positive integer;
and a monitoring module, configured to perform anomaly monitoring on the target monitoring item based on the dynamic threshold of each data partition.
In a third aspect of the embodiments of the present invention, there is further provided a computer-readable storage medium, in which instructions are stored, and when the instructions are run on a computer, the instructions cause the computer to execute the abnormality monitoring method according to any one of the above embodiments.
In a fourth aspect of the present invention, there is also provided a computer program product containing instructions, which when run on a computer, causes the computer to perform the anomaly monitoring method according to any one of the above embodiments.
In the embodiment of the invention, a monitoring data curve of a target monitoring item is acquired, wherein the monitoring data curve reflects the correspondence between historical monitoring data and historical monitoring time; a dynamic threshold of each data partition in the monitoring data curve is determined according to the distribution characteristics of the monitoring data curve; and anomaly monitoring is performed on the target monitoring item based on the dynamic threshold of each data partition. Unlike prior-art schemes that establish, train, and maintain a separate model for each monitoring item, the embodiment of the invention only needs to maintain the historical time series data of the target monitoring item, namely the monitoring data curve, and can flexibly determine the dynamic threshold of each data partition based on the distribution characteristics of that curve. There is no need to spend large amounts of computing and storage resources to train or update a prediction model for each monitoring item or to store parameter combination data for each monitoring item, nor to spend a large amount of manpower on model maintenance. An anomaly monitoring result with higher accuracy can thus be obtained from the way the monitoring item data changes over time, monitoring efficiency can be improved to a certain extent, and the scheme can also meet the monitoring accuracy and monitoring efficiency requirements of scenarios with large-scale monitoring items.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a schematic diagram of a monitoring data curve according to an embodiment of the present invention;
FIG. 2 is a flow chart of an anomaly monitoring method in an embodiment of the present invention;
FIG. 3 is a diagram illustrating an application scenario of the anomaly monitoring method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an anomaly monitoring system according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
The anomaly monitoring method provided by this embodiment can be applied to an anomaly monitoring system, which performs anomaly monitoring on the monitoring data of monitoring items. The monitoring data may include, but is not limited to, at least one of: data with a periodic distribution rule delivered by operation and maintenance personnel, and data generated automatically from user operations or automatic operations in the monitoring scenario.
It should be understood that the anomaly monitoring system may monitor a plurality of monitoring items. In this case, the monitoring system may generate a monitoring data curve for each monitoring item and, according to that curve, determine the monitoring threshold of each monitoring item in different monitoring time periods.
For the purpose of clearly explaining the technical scheme, the technical scheme is explained by taking the example that the abnormality monitoring system monitors 1 monitoring item.
In the case of monitoring 1 monitoring item using the anomaly monitoring system, the anomaly monitoring system may be used to monitor load data of the video website server, for example. The load of the video website server can be understood as a monitoring item, and the load value of the server can be understood as monitoring data.
In the case of monitoring 2 monitoring items using the anomaly monitoring system, the anomaly monitoring system may be used, for example, to monitor load data of the video website server and the game website server. In this case, the load of the video website server may be understood as a first monitoring item, the load of the game website server may be understood as a second monitoring item, the load value of the video website server may be understood as monitoring data corresponding to the first monitoring item, and the load value of the game website server may be understood as monitoring data corresponding to the second monitoring item.
The monitoring data comprises at least timestamp information and a monitoring value. Optionally, the monitoring data may take the form {"timestamp":1602432000, "value":1}, where "timestamp":1602432000 is the timestamp information, which represents the monitoring time corresponding to the monitoring data, and "value":1 is the monitoring value, i.e., the load value of the server. In one embodiment, 1602432000 may be preset to indicate 00:00:00 on October 12, 2020, so that the monitoring data {"timestamp":1602432000, "value":1} indicates that the server has a load value of 1 at 00:00:00 on October 12, 2020.
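As a minimal sketch of how such a record could be read, the Python snippet below parses the example monitoring data and converts the timestamp to a wall-clock time. It assumes the timestamp is a Unix timestamp in seconds and that the wall-clock time in the example is expressed in UTC+8; the function name parse_monitoring_data is hypothetical and not part of the described system.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are Unix seconds and the example's wall-clock
# time is expressed in UTC+8.
UTC8 = timezone(timedelta(hours=8))


def parse_monitoring_data(raw: str):
    """Return (monitoring_time, monitoring_value) from one JSON record."""
    record = json.loads(raw)
    monitoring_time = datetime.fromtimestamp(record["timestamp"], tz=UTC8)
    return monitoring_time, record["value"]


when, value = parse_monitoring_data('{"timestamp": 1602432000, "value": 1}')
print(when.isoformat(), value)
```

Under these assumptions the record resolves to midnight on October 12, 2020 with a load value of 1, matching the example in the text.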
It should be noted that, for a given monitoring item, the monitoring data may be collected, acquired, or read according to a delivery frequency. For example, a delivery frequency of once per minute means that the operation and maintenance personnel deliver one piece of monitoring data every minute, so the anomaly monitoring system receives 1440 pieces of monitoring data in a day. The monitoring system can fit these 1440 discrete data points to obtain a monitoring data curve. The monitoring data curve may be characterized as a sequence of 1440 elements, each related to one piece of monitoring data.
It should be noted that a monitoring data curve with a periodic distribution rule is related to the behavior pattern of the users of the monitoring item. For example, users access the video website frequently between 7 pm and 8 pm, which places a high load on the video website server, so the load values of the monitoring data curve in the 7 pm to 8 pm delivery period are high.
To explain the technical scheme intuitively, please refer to FIG. 1. FIG. 1 contains a two-dimensional coordinate system whose abscissa axis represents the delivery time of the monitoring data, which can also be understood as the monitoring time, and whose ordinate axis represents the monitoring value of the monitoring data. The figure also includes one solid line L1 and two broken lines L2 and L3. L1 is the monitoring data curve; the broken line L2 above L1 is formed from the upper thresholds of all the data partitions, and the broken line L3 below L1 is formed from the lower thresholds of all the data partitions.
It can be understood that the anomaly monitoring scheme provided by the embodiment of the invention is not limited to anomaly monitoring of equipment load. Illustratively, the scheme can also be applied to anomaly monitoring of server hardware data, in which case the monitoring item may include, but is not limited to, at least one of: central processing unit (CPU), load, network card inbound and outbound traffic, disk storage usage, available memory ratio, and the like. Illustratively, the scheme may also be applied to anomaly monitoring of service-level data, in which case the monitoring item may include, but is not limited to, at least one of: queries per second (QPS) of server requests, server request response time, server request success rate, and the like.
In the prior art, anomaly monitoring of a monitoring item generally adopts the following technical means:
The first anomaly monitoring method sets a static threshold according to experience and issues an alarm for the monitoring item when a monitoring value of the monitoring data curve is higher than the static threshold, where the static threshold is a fixed, unchanging value.
However, as described above, a monitoring data curve with a periodic distribution rule is strongly related to the behavior pattern of the users of the monitoring item. The curve may show high load values in certain time periods, which is normal, but monitoring with a static threshold judges this condition to be abnormal and therefore raises a false alarm.
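The false-alarm problem of the static threshold can be illustrated with a small sketch. All numbers below (the threshold of 70 and the hourly load values) are hypothetical and chosen only to mimic a normal evening peak.

```python
# Hypothetical fixed load limit chosen "according to experience".
STATIC_THRESHOLD = 70

# Hypothetical (hour of day, load value) samples; the evening peak at
# 19:00 and 20:00 is assumed to be normal user behavior.
samples = [(3, 20), (12, 45), (19, 85), (20, 80)]


def static_alarm(value, threshold=STATIC_THRESHOLD):
    """Static rule: alarm whenever the value exceeds the fixed threshold."""
    return value > threshold


alarm_hours = [hour for hour, load in samples if static_alarm(load)]
print(alarm_hours)
```

With this data the static rule raises alarms at 19:00 and 20:00 even though the high load in those hours is the expected daily peak.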
The second anomaly monitoring method trains a prediction model, such as an ARMA model, on a large amount of historical monitoring data, uses the model to set a dynamic threshold for the monitoring data curve, and issues an alarm if a monitoring value is higher than the dynamic threshold.
However, this method requires establishing, training, and maintaining prediction models for different monitoring items, which consumes a large amount of storage and computing resources as well as a large amount of human resources for model maintenance. It is therefore difficult to balance anomaly monitoring accuracy against data maintenance complexity, and in monitoring scenarios with large-scale monitoring items in particular, it is difficult to meet the requirements on monitoring accuracy and efficiency.
Based on the above technical problems, the embodiment of the invention provides the following technical scheme:
The time series data of the monitoring items is maintained, and the dynamic threshold of each data partition is determined according to the distribution characteristics of the time series data. There is no need to spend large amounts of computing and storage resources training or updating a prediction model for each monitoring item, or a large amount of manpower maintaining the models, and the scheme can still meet the monitoring accuracy and monitoring efficiency requirements of scenarios with large-scale monitoring items.
Specifically, the embodiment of the invention provides an anomaly monitoring method. Referring to fig. 2, fig. 2 is a flowchart illustrating an anomaly monitoring method according to an embodiment of the present invention. The anomaly monitoring method provided by the embodiment comprises the following steps:
S101: acquiring a monitoring data curve of the target monitoring item.
In this step, the monitoring data curve is used to reflect the corresponding relationship between the historical monitoring data of the target monitoring item and the historical monitoring time.
Assume a two-dimensional coordinate system in which the abscissa axis is the monitoring time, which may also be understood as the delivery time of the monitoring data, and the ordinate axis is the monitoring value of the monitoring data. Because each piece of monitoring data has a different monitoring time, each piece of monitoring data in the message queue may be understood as a point in this coordinate system: the abscissa of the point represents the monitoring time of the monitoring data, and the ordinate represents its monitoring value. Optionally, the points may be fitted to a curve, or connected to form a curve, and the resulting curve may be understood as the monitoring data curve.
Illustratively, the time interval covered by the monitoring data curve is 24 hours and the delivery frequency of the monitoring data is once per minute, so one piece of monitoring data is acquired every minute. The monitoring data may be delivered to the anomaly monitoring system by the operation and maintenance personnel at 1-minute intervals, or acquired automatically by the anomaly monitoring system at 1-minute intervals. A total of 1440 pieces of monitoring data are therefore obtained over the course of a day. As described above, each piece of monitoring data may be represented as a point in the two-dimensional coordinate system whose abscissa is associated with the monitoring time and whose ordinate is associated with the monitoring value, so the 1440 points may be fitted to a curve, which may be referred to as the monitoring data curve.
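The construction of a curve by connecting points can be sketched as follows. This is only an illustration under stated assumptions: the points are connected with straight segments (linear interpolation), which is one possible way of "connecting" them, and the names build_curve and curve_value as well as the sample values are hypothetical.

```python
from bisect import bisect_right


def build_curve(points):
    """Sort (monitoring_time, monitoring_value) points by time."""
    return sorted(points)


def curve_value(curve, t):
    """Read the curve at time t by linear interpolation between points."""
    times = [p[0] for p in curve]
    if t <= times[0]:
        return curve[0][1]
    if t >= times[-1]:
        return curve[-1][1]
    i = bisect_right(times, t)
    (t0, v0), (t1, v1) = curve[i - 1], curve[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# One hypothetical day of data: one point per minute, 1440 points in total.
START = 1602432000
points = [(START + 60 * k, float(k % 10)) for k in range(1440)]
curve = build_curve(points)
print(len(curve), curve_value(curve, START + 30))
```

A fitted curve (for example a smoothing fit) could be substituted for the straight segments without changing how the rest of the scheme uses the curve.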
It should be noted that the anomaly monitoring system may receive a large amount of monitoring data in a short time. One possible case is that the monitoring data is data for a single monitoring item sent by the operation and maintenance personnel in a short time; another possible case is that the monitoring data is data for a plurality of monitoring items sent by the operation and maintenance personnel in a short time.
In these cases, the anomaly monitoring system needs to process a large amount of data in a short time, which is likely to cause data disorder.
To address this possible problem, an optional scheme is to configure a message queue that stores the monitoring data received by the system, with the monitoring data in the message queue sorted by delivery time. This ensures that the anomaly monitoring system processes the monitoring data in a defined order and avoids data disorder.
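One way to picture a queue ordered by delivery time is a priority queue keyed on the timestamp. The sketch below uses Python's heapq module for this; the class name MonitoringQueue is hypothetical, and an in-memory heap is only a stand-in for whatever message queue the system actually uses.

```python
import heapq
import itertools


class MonitoringQueue:
    """In-memory queue that releases records in delivery-time order."""

    def __init__(self):
        self._heap = []
        # Tie-breaker: records with equal timestamps keep arrival order.
        self._counter = itertools.count()

    def put(self, record):
        entry = (record["timestamp"], next(self._counter), record)
        heapq.heappush(self._heap, entry)

    def get(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = MonitoringQueue()
# Records arrive out of order within a short time.
for ts in (1602432120, 1602432000, 1602432060):
    queue.put({"timestamp": ts, "value": 1})

ordered = [queue.get()["timestamp"] for _ in range(len(queue))]
print(ordered)
```

The records come out sorted by timestamp regardless of arrival order, which is the property the text relies on to avoid data disorder.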
S102: determining the dynamic threshold of each data partition in the monitoring data curve according to the distribution characteristics of the monitoring data curve.
As described above, the monitoring data curve may be divided into N data partitions, where N is a positive integer. In an actual scene, the monitoring data can be partitioned to obtain a plurality of data partitions; alternatively, the partitions may not be performed, and the entire monitoring data curve may be processed as one data partition. How to perform partitioning is specifically described later, and this is not expanded here.
In addition, it can be understood that, when a plurality of monitoring items are involved, the number of data partitions corresponding to different monitoring items may be the same or different, and the monitoring time periods corresponding to the data partitions may also be the same or different.
In this step, the dynamic threshold of each data partition in the monitoring data curve may be determined according to the distribution characteristics of the monitoring data curve, where the distribution characteristics reflect the periodic distribution rule present in the monitoring data curve.
How to determine the dynamic threshold of each data partition according to the distribution characteristics of the monitoring data curve is described in detail below.
S103: performing anomaly monitoring on the target monitoring item based on the dynamic threshold of each data partition.
In this step, each data partition corresponds to a different monitoring time period, and the dynamic threshold includes an upper threshold and/or a lower threshold.
A monitoring value of the target monitoring item and the monitoring time corresponding to that value are obtained, the data partition corresponding to the monitoring time period to which the monitoring time belongs is determined, and anomaly monitoring is performed on the target monitoring item based on the dynamic threshold of that data partition.
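The lookup described in this step can be sketched as follows. The sketch assumes that a value is abnormal when it is above the upper threshold or below the lower threshold of its partition, and that either threshold may be absent; the DataPartition type, the minute-of-day representation, and all threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataPartition:
    start_minute: int               # inclusive, minutes since midnight
    end_minute: int                 # exclusive
    upper: Optional[float] = None   # upper threshold, if configured
    lower: Optional[float] = None   # lower threshold, if configured


def find_partition(partitions, minute_of_day):
    """Return the partition whose monitoring period contains the time."""
    for partition in partitions:
        if partition.start_minute <= minute_of_day < partition.end_minute:
            return partition
    return None


def is_abnormal(partitions, minute_of_day, value):
    """Assumed rule: abnormal if outside the partition's thresholds."""
    partition = find_partition(partitions, minute_of_day)
    if partition is None:
        raise ValueError("no data partition covers this monitoring time")
    above = partition.upper is not None and value > partition.upper
    below = partition.lower is not None and value < partition.lower
    return above or below


partitions = [
    DataPartition(0, 1140, upper=60.0, lower=5.0),
    DataPartition(1140, 1200, upper=95.0, lower=40.0),  # 19:00 to 20:00 peak
    DataPartition(1200, 1440, upper=70.0),              # upper threshold only
]
print(is_abnormal(partitions, 19 * 60 + 30, 85.0))
print(is_abnormal(partitions, 3 * 60, 85.0))
```

The same load value of 85 is accepted during the evening peak partition but flagged at 03:00, which is the behavior a per-partition threshold is meant to provide.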
How to perform anomaly monitoring on the target monitoring item based on the dynamic threshold of each data partition is described in detail below.
In the embodiment of the invention, only the historical time series data of the target monitoring item, namely the monitoring data curve, needs to be maintained, and the dynamic threshold of each data partition can be flexibly determined based on the distribution characteristics of the monitoring data curve. There is no need to spend large amounts of computing and storage resources to train or update a prediction model for each monitoring item or to store parameter combination data for each monitoring item, nor to spend a large amount of manpower on model maintenance. An anomaly monitoring result with higher accuracy can thus be obtained from the way the monitoring item data changes over time, monitoring efficiency can be improved to a certain extent, and the scheme can also meet the monitoring accuracy and monitoring efficiency requirements of scenarios with large-scale monitoring items.
How to determine the target monitoring item is explained in detail below.
Optionally, before the monitoring data curve of the target monitoring item is acquired, the method further includes:
receiving a monitoring request from a user, and determining the index data corresponding to the monitoring request; querying the index data in a preset database; and, if the index data is not stored in the database, determining the monitoring item corresponding to the index data as the target monitoring item.
It should be noted that the monitoring request sent by the user includes index data and monitoring data, where the index data is used to represent a monitoring item corresponding to the monitoring request in a database, a preset database reflects a mapping relationship between the index data and the monitoring item, and the index data and the monitoring item correspond to each other one to one. It should be understood that the index data for different monitoring requests may be the same.
In this embodiment, after receiving the monitoring request, the system parses it to obtain the index data and queries the preset database with that index data. The database reflects the mapping between index data and monitoring items, and it should be understood that the index data stored in the database is the index data of all monitoring items that are already being monitored.
If the index data is not found in the database, the monitoring item corresponding to the index data has not yet been monitored. In this case, the monitoring item corresponding to the index data is determined as the target monitoring item, and a new mapping relationship between the index data and the target monitoring item is established in the database (which can be understood as creating a new data table in the database).
If the index data exists in the database, the monitoring item corresponding to the index data is already being monitored and has a corresponding dynamic threshold configured. In one optional implementation, the monitoring item is monitored based on that dynamic threshold. In another optional implementation, the already monitored item may be determined as the target monitoring item and then monitored using the method provided in the present application; it should be understood that in this implementation no new mapping relationship needs to be established in the database for the already monitored item.
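The lookup logic above can be sketched with an in-memory dictionary standing in for the preset database. The request layout (keys "index" and "data"), the stored record shape, and the function name handle_monitoring_request are all hypothetical and serve only to illustrate the two branches.

```python
def handle_monitoring_request(database, request):
    """Return (monitoring_item_record, is_new) for one monitoring request.

    `database` maps index data to a record for the monitoring item.
    """
    index_data = request["index"]
    if index_data not in database:
        # Not monitored yet: register a new mapping for the target item.
        database[index_data] = {"item": index_data, "dynamic_thresholds": None}
        return database[index_data], True
    # Already monitored: reuse the existing mapping, create nothing new.
    return database[index_data], False


database = {}
request = {"index": "video-server-load",
           "data": {"timestamp": 1602432000, "value": 1}}

first_record, first_is_new = handle_monitoring_request(database, request)
second_record, second_is_new = handle_monitoring_request(database, request)
print(first_is_new, second_is_new, len(database))
```

Two requests carrying the same index data resolve to the same monitoring item, and only the first one creates a mapping.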
It should be noted that, before determining the dynamic threshold of each data partition in the monitoring data curve, it is necessary to determine a plurality of data partitions of the monitoring data curve first, and then determine the dynamic threshold of each data partition.
In the following, how to determine a plurality of data partitions of the monitoring data curve is specifically described.
Optionally, before the dynamic threshold of each data partition in the monitoring data curve is determined according to the distribution characteristics of the monitoring data curve, the method further includes:
judging whether the monitoring data curve has a periodic distribution rule; and, if the monitoring data curve has a periodic distribution rule, dividing the monitoring data curve into a plurality of data partitions.
In this embodiment, when the monitoring data curve has a seasonal component, it may be determined that the monitoring data curve has a periodic distribution rule. It should be understood that the seasonal component is also referred to as a periodic component.
For example, suppose that in a monitoring data curve S(1) = 1, S(2) = 2, S(3) = 3, S(4) = 1, S(5) = 2, and S(6) = 3, where S(1) is the monitoring value at delivery time 10:00, S(2) at 11:00, S(3) at 12:00, S(4) at 13:00, S(5) at 14:00, and S(6) at 15:00.
It follows that S(1) = S(4), S(2) = S(5), and S(3) = S(6). In other words, the part of the monitoring data curve for delivery times 10:00 to 12:00 has the same distribution rule as the part for delivery times 13:00 to 15:00, so the monitoring data curve for delivery times 10:00 to 15:00 has 2 seasonal components, and the period of each seasonal component is 2 hours.
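The equalities in this example can be checked mechanically. The sketch below tests whether every value equals the value a fixed number of samples later; exact equality is an assumption that fits this toy example, and real monitoring data would need a tolerance or a correlation measure instead. The function name repeats_with_offset is hypothetical.

```python
def repeats_with_offset(values, offset):
    """True if every value equals the value `offset` samples later."""
    if offset <= 0 or offset >= len(values):
        return False
    return all(values[i] == values[i + offset]
               for i in range(len(values) - offset))


# S(1)..S(6) from the example, sampled hourly from 10:00 to 15:00.
s = [1, 2, 3, 1, 2, 3]
print(repeats_with_offset(s, 3), repeats_with_offset(s, 2))
```

An offset of three samples reproduces the checks S(1) = S(4), S(2) = S(5), and S(3) = S(6), while an offset of two samples does not match.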
If the monitoring data curve has a periodic distribution rule, the monitoring data curve may be divided into a plurality of data partitions. How to divide the monitoring data curve into a plurality of data partitions is described in the following embodiments.
In this embodiment, the monitoring data curve is divided into different data partitions, and the monitoring time periods corresponding to each data partition are different, so that the monitoring items can be monitored according to the actual conditions of the monitoring data curve in different monitoring time periods, and the monitoring result is ensured to conform to the actual conditions of the monitoring data curve in the corresponding monitoring time period.
It should be noted that a data partition may be understood as a part of the monitoring data curve, and the monitoring time periods corresponding to any two data partitions are not identical. In an exemplary embodiment, the monitoring time periods corresponding to any two data partitions may be completely different.
In order to reflect the periodic distribution rule of the monitoring data curve, the monitoring duration corresponding to any data partition is less than or equal to the minimum period of the periodic distribution rule of the monitoring data curve, that is, less than or equal to the period of the seasonal component of the monitoring data curve. This means that the data change of the monitoring item within one period can be represented by one or more data partitions, that the data within one data partition is closely related, and that its periodic components remain consistent. It also prevents data spanning more than one period within a single data partition from interfering with the determination of the dynamic threshold, which helps reduce noise in the data partition, yields a dynamic threshold that better matches the data distribution characteristics within the partition, and thus helps obtain a more accurate anomaly monitoring result.
In an exemplary embodiment, the monitoring duration corresponding to a data partition may be equal to the period of the seasonal component, so that the data in one data partition represents the data change of the monitoring item over one period. This helps reduce noise, yields data whose components are consistent and closely related, and helps obtain a more accurate anomaly monitoring result.
It should be noted that, in the case that the monitoring data curve does not have a periodic distribution rule, the monitoring data curve may be determined as a data partition without performing partition processing on the monitoring data curve.
How to determine whether the monitoring data curve has a periodic distribution rule is described in detail as follows:
optionally, the determining whether the monitoring data curve has a periodic distribution rule includes:
calculating K autocorrelation coefficients corresponding to the monitoring data curve in K monitoring time periods; and under the condition that the K autocorrelation coefficients are regularly distributed, determining that the monitoring data curve has a periodic distribution rule.
In this embodiment, the monitoring data curve may be divided into K monitoring data sub-curves, and the monitoring durations corresponding to the monitoring data sub-curves may be the same or different.
For each monitoring data sub-curve, an autocorrelation function (ACF) may be applied to the monitoring duration of that sub-curve and the monitored values within that duration to obtain the autocorrelation coefficient of the sub-curve, yielding K autocorrelation coefficients in total.
If the values of the K autocorrelation coefficients are regularly distributed, it can be determined that the monitoring data curve has a periodic distribution rule.
For example, if the monitoring duration corresponding to one monitoring data curve is 24 hours, the monitoring data curve may be divided into 8 monitoring data sub-curves, the monitoring duration corresponding to each monitoring data sub-curve is 3 hours, the autocorrelation coefficient corresponding to the first monitoring data sub-curve may be referred to as a first autocorrelation coefficient, the autocorrelation coefficient corresponding to the second monitoring data sub-curve may be referred to as a second autocorrelation coefficient, and so on, the autocorrelation coefficient corresponding to the eighth monitoring data sub-curve may be referred to as an eighth autocorrelation coefficient.
The first autocorrelation coefficient is calculated to be -0.102, the second autocorrelation coefficient is -0.657, the third autocorrelation coefficient is -0.060, the fourth autocorrelation coefficient is 0.869, the fifth autocorrelation coefficient is -0.089, the sixth autocorrelation coefficient is -0.635, the seventh autocorrelation coefficient is -0.054, and the eighth autocorrelation coefficient is 0.832.
Here, the coefficient difference may be preset, and if the difference between the absolute value of the first autocorrelation coefficient and the absolute value of the fifth autocorrelation coefficient is smaller than the preset coefficient difference, the difference between the absolute value of the second autocorrelation coefficient and the absolute value of the sixth autocorrelation coefficient is smaller than the preset coefficient difference, the difference between the absolute value of the third autocorrelation coefficient and the absolute value of the seventh autocorrelation coefficient is smaller than the preset coefficient difference, and the difference between the absolute value of the fourth autocorrelation coefficient and the absolute value of the eighth autocorrelation coefficient is smaller than the preset coefficient difference, the distribution law of the first to fourth autocorrelation coefficients may be determined to be the same as the distribution law of the fifth to eighth autocorrelation coefficients.
In an actual implementation scenario, the differences between the autocorrelation coefficients can be calculated by traversal, and whether the autocorrelation coefficients are regularly distributed is determined based on the relationship between each difference and the preset coefficient difference. It can be understood that this scheme does not require all autocorrelation coefficients to be regularly distributed; when at least two groups of autocorrelation coefficients are regularly distributed, it may be determined that the autocorrelation coefficients are regularly distributed.
In this case, the autocorrelation coefficients are determined to be regularly distributed, the monitoring data sub-curves corresponding to the autocorrelation coefficients are therefore determined to be regularly distributed, and the monitoring data curve can thus be determined to have a periodic distribution rule.
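The comparison described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names, the choice to compare the first half of the coefficients against the second half, and the preset coefficient difference of 0.05 are assumptions made for the example. The helper `autocorrelation` is the ordinary sample autocorrelation at a given lag; how the embodiment derives one coefficient per sub-curve is not specified in the text.

```python
def autocorrelation(values, lag=1):
    """Sample autocorrelation coefficient of a sequence at the given lag."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0 or lag >= n:
        return 0.0
    cov = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return cov / var


def halves_match(coeffs, max_diff):
    """Compare |c_i| with |c_(i + K/2)| for every i in the first half.

    Returns True when every pairwise difference of absolute values is
    below max_diff (the hypothetical "preset coefficient difference").
    """
    half = len(coeffs) // 2
    if half == 0:
        return False
    return all(
        abs(abs(coeffs[i]) - abs(coeffs[i + half])) < max_diff
        for i in range(half)
    )


# The eight coefficients from the example in the text.
coeffs = [-0.102, -0.657, -0.060, 0.869, -0.089, -0.635, -0.054, 0.832]
is_periodic = halves_match(coeffs, max_diff=0.05)  # assumed preset difference
```

With the assumed preset difference of 0.05, the largest gap among the four pairs (0.869 versus 0.832) still passes, so the example coefficients are treated as regularly distributed; a stricter value such as 0.03 would reject them.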
It should be noted that a data partition can also be understood as a special monitoring data sub-curve. The monitoring time periods corresponding to the data partitions may be the same as or different from the K monitoring time periods. In other words, after it is determined that the monitoring curve has a regular distribution, the K monitoring time periods (if the foregoing requirement is met) may be directly used as the monitoring time periods corresponding to the data partitions, so as to obtain K data partitions (in this case, N is equal to K). In addition, after it is determined that there is a regular distribution of the monitoring curve based on the K monitoring time periods, N data partitions satisfying the aforementioned requirement (i.e., the monitoring time period corresponding to any data partition is less than or equal to the period of the seasonal component of the monitoring curve) may be determined in any other manner.
It should be understood that, in some embodiments, the number of data partitions of the monitoring data curve may be preset, and the monitoring data curve may be divided into a plurality of data partitions based on the monitoring duration and the number of data partitions corresponding to the monitoring data curve.
For example, N data partitions with the same data time period length may be determined in the monitoring data curve according to a preset number (N) of data partitions.
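A minimal sketch of dividing a curve into a preset number of partitions is shown below. It assumes the monitoring data curve is available as a list of equally spaced monitored values; the function name is hypothetical. When the number of values is not divisible by N, the first few partitions receive one extra value, which is a choice made for this example only.

```python
def split_into_partitions(values, n_partitions):
    """Split a sequence of monitored values into n_partitions contiguous parts."""
    if n_partitions < 1 or n_partitions > len(values):
        raise ValueError("n_partitions must be between 1 and len(values)")
    base, extra = divmod(len(values), n_partitions)
    parts, start = [], 0
    for i in range(n_partitions):
        # The first `extra` partitions take one additional value.
        size = base + (1 if i < extra else 0)
        parts.append(values[start:start + size])
        start += size
    return parts
```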
As an example, the present invention further provides another embodiment of dividing the monitoring data curve into a plurality of data partitions:
optionally, the dividing the monitoring data curve into a plurality of data partitions includes:
calculating the permutation entropy of the monitoring data curve; carrying out normalization processing on the permutation entropy to obtain a normalized permutation entropy; dividing the monitoring data curve into M data partitions (N = M) under the condition that the normalized permutation entropy is larger than a fifth preset value; and under the condition that the normalized permutation entropy is less than or equal to the fifth preset value, dividing the monitoring data curve into L data partitions (N = L). M and L are both positive integers greater than 1, and L is greater than M.
In this embodiment, the permutation entropy of the monitoring data curve may be calculated by using a permutation entropy algorithm, where it is to be noted that the permutation entropy of the monitoring data curve represents the complexity of the monitoring data curve, and a smaller permutation entropy indicates a more regular distribution of the monitoring data curve, and a larger permutation entropy indicates a more complex distribution of the monitoring data curve.
The permutation entropy algorithm is an algorithm for measuring the complexity of a time series, and its calculation principle can be briefly summarized as follows: performing phase space reconstruction on the time series to obtain a matrix; rearranging each row of the matrix in ascending order and recording, for each row, the sequence of original subscripts after sorting, so as to obtain a group of symbol sequences; and determining the permutation entropy of the monitoring data curve according to the number of occurrences of each subscript sequence.
After the permutation entropy of the monitoring data curve is obtained, normalization processing can be performed on the permutation entropy to obtain a normalized permutation entropy with a numerical range of 0 to 1.
It should be noted that the larger the normalized permutation entropy is, the more complex the distribution of the monitoring data curve is and the more noise the monitoring data curve contains. In this case, in order to limit the amount of noise in each data partition, the monitoring duration corresponding to each data partition needs to be set shorter, so that the number of data partitions is larger.
The smaller the normalized permutation entropy is, the more regular the distribution of the monitoring data curve is and the less noise the monitoring data curve contains. For the same monitoring duration of the monitoring data curve, the monitoring duration corresponding to each data partition can be set longer than in the case of a larger normalized permutation entropy, so that the number of data partitions is smaller.
For the above reasons, the fifth preset value is preset in the embodiment, and it should be understood that the fifth preset value is an empirical value. And under the condition that the normalized permutation entropy is larger than a fifth preset value, the fact that the noise in the monitoring data curve is more is shown, the monitoring data curve is divided into M data partitions, and M is a positive integer larger than 1. And under the condition that the normalized permutation entropy is less than or equal to a fifth preset value, the noise in the monitoring data curve is less, the monitoring data curve is divided into L data partitions, L is a positive integer greater than 1, and L is greater than M.
For example, the fifth preset value may be 0.7, and if the monitoring duration corresponding to the monitoring data curve is 24 hours, the value of M may be set to be 48, and the value of L may be set to be 96.
In the above case, if the value of the normalized permutation entropy is greater than 0.7, the monitoring data curve is divided into 48 data partitions, and the time duration corresponding to each data partition is 30 minutes.
If the normalized permutation entropy value is less than or equal to 0.7, the monitoring data curve is divided into 96 data partitions, and the corresponding time length of each data partition is 15 minutes.
In this embodiment, the monitoring data curve is divided according to the permutation entropy of the monitoring data curve, so that the components of each data partition tend to be consistent, and thus, the dynamic threshold can be flexibly set according to the actual situation of the monitoring data in each monitoring time period.
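The permutation entropy steps summarized above can be sketched as follows. This is an illustrative sketch only: the embedding order of 3, the delay of 1, and the function names are assumptions, and normalization divides by log(order!) so that the result lies between 0 and 1. The partition counts follow the example values in the text (fifth preset value 0.7, M = 48, L = 96).

```python
import math
from collections import Counter


def normalized_permutation_entropy(values, order=3, delay=1):
    """Normalized permutation entropy of a sequence, in the range [0, 1]."""
    n_windows = len(values) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for the chosen order and delay")
    patterns = Counter()
    for start in range(n_windows):
        # One row of the reconstructed phase space.
        window = [values[start + j * delay] for j in range(order)]
        # Ordinal pattern: original subscripts after sorting the row ascending.
        pattern = tuple(sorted(range(order), key=lambda j: window[j]))
        patterns[pattern] += 1
    entropy = -sum(
        (count / n_windows) * math.log(count / n_windows)
        for count in patterns.values()
    )
    return entropy / math.log(math.factorial(order))


def partition_count(values, threshold=0.7, m_count=48, l_count=96):
    """Choose M partitions above the threshold, otherwise L partitions."""
    if normalized_permutation_entropy(values) > threshold:
        return m_count
    return l_count
```

A strictly increasing series contains a single ordinal pattern, so its normalized permutation entropy is 0 and `partition_count` returns the L value.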
Optionally, the determining a dynamic threshold of each data partition in the monitoring data curve according to the distribution characteristic of the monitoring data curve includes:
determining distribution characteristics met by the monitoring data curve based on the curve skewness and kurtosis of the monitoring data curve; and determining a dynamic threshold value of each data partition based on the distribution characteristics.
It should be noted that the skewness of the curve is a measure of the skew direction and degree of the statistical data distribution, and can be used to measure the asymmetry of the curve distribution; the kurtosis of the curve is a statistic of the steepness degree of distribution form of all values in the curve, and can be used for measuring the steepness degree of the distribution of the curve. Based on skewness and kurtosis of the monitoring data curve, distribution characteristics of the monitoring data curve can be determined, and further based on distribution characteristics of the whole monitoring data curve, dynamic thresholds of all data partitions are determined, wherein the dynamic thresholds comprise an upper threshold and a lower threshold.
In the following, how to determine the distribution characteristics satisfied by the monitoring data curve based on the curve skewness and kurtosis of the monitoring data curve is specifically described:
optionally, the determining, based on the curve skewness and the kurtosis of the monitoring data curve, a distribution characteristic that is satisfied by the monitoring data curve includes:
judging whether the skewness of the monitoring data curve is equal to a first preset value or not; and if the skewness of the monitoring data curve is equal to a first preset value, determining that the distribution characteristic of the monitoring data curve is a first distribution characteristic.
The first distribution characteristic is a normal distribution characteristic. It should be noted that a normally distributed curve is low at both ends, high in the middle, and symmetrical.
If the skewness of the monitoring data curve is not equal to a first preset value, judging whether the skewness of the monitoring data curve is larger than a second preset value or not, and judging whether the kurtosis of the monitoring data curve is larger than a preset kurtosis or not;
and if the skewness of the monitoring data curve is larger than a second preset value and the kurtosis of the monitoring data curve is larger than a preset kurtosis, determining that the distribution characteristic of the monitoring data curve is a second distribution characteristic.
The second distribution characteristic is the fat-tail distribution characteristic. It should be noted that the fat-tail distribution is a special probability distribution: compared with the normal distribution, the two ends of a fat-tail distribution curve decay more slowly. Because the two ends of the curve decay more slowly, extreme values occur with higher probability in a fat-tail distribution curve than in a normal distribution curve.
If the skewness of the monitoring data curve is not greater than a second preset value and/or the kurtosis of the monitoring data curve is not greater than a preset kurtosis, determining that the distribution characteristic of the monitoring data curve is a third distribution characteristic.
The third distribution characteristic is any distribution characteristic other than the normal distribution characteristic and the fat-tail distribution characteristic; that is, the distribution characteristic of the monitoring data curve is neither a normal distribution characteristic nor a fat-tail distribution characteristic.
In this embodiment, the skewness of the monitoring data curve may be compared with a first preset value, and if the skewness of the monitoring data curve is equal to the first preset value, the distribution characteristic of the monitoring data curve is determined to be a normal distribution characteristic.
For a normally distributed curve, the data on both sides of the curve mean are distributed identically, so the curve skewness is equal to 0. The curve mean is understood to be the mean of the curve data. Accordingly, the first preset value may be set to 0.
In this embodiment, when the skewness of the monitoring data curve is not equal to the first preset value, it is determined whether the skewness of the monitoring data curve is greater than a second preset value and whether the kurtosis of the monitoring data curve is greater than a preset kurtosis, where the second preset value and the preset kurtosis are empirical values that can be set by the user.
In the case that the skewness of the monitoring data curve is greater than the second preset value and the kurtosis of the monitoring data curve is greater than the preset kurtosis, the data distribution in the monitoring data curve is asymmetric and the curve distribution is steeper, so that the distribution characteristic of the monitoring data curve can be determined to be the fat-tail distribution characteristic.
In the case that the skewness of the monitoring data curve is not greater than the second preset value and/or the kurtosis of the monitoring data curve is not greater than the preset kurtosis, the monitoring data curve is not in a fat-tail distribution. Because the skewness of the monitoring data curve is not equal to the first preset value, the monitoring data curve is not normally distributed either. In this case, it may be determined that the distribution characteristic of the monitoring data curve is the third distribution characteristic, that is, neither a normal distribution characteristic nor a fat-tail distribution characteristic.
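The three-way decision above can be sketched as follows. This is a rough illustration under stated assumptions: skewness and kurtosis are computed as standardized central moments (kurtosis is not excess kurtosis, so it is 3 for an ideal normal distribution), the second preset value of 1.0 and the preset kurtosis of 3.0 are placeholder empirical values, and a small tolerance `tol` is added because an exact floating-point comparison with the first preset value would rarely succeed on real data. None of these values or names come from the text.

```python
def central_moment(values, order):
    """Central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    m2 = central_moment(values, 2)
    return 0.0 if m2 == 0 else central_moment(values, 3) / m2 ** 1.5


def kurtosis(values):
    m2 = central_moment(values, 2)
    return 0.0 if m2 == 0 else central_moment(values, 4) / m2 ** 2


def classify_distribution(values, first_preset=0.0, second_preset=1.0,
                          preset_kurtosis=3.0, tol=1e-9):
    """Return 'normal', 'fat_tail', or 'other' following the decision order in the text."""
    skew = skewness(values)
    if abs(skew - first_preset) <= tol:  # assumed tolerance for "equal to"
        return "normal"
    if skew > second_preset and kurtosis(values) > preset_kurtosis:
        return "fat_tail"
    return "other"
```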
In the following, how to determine the dynamic threshold of each data partition when the monitored data curves have different distribution characteristics is specifically described:
optionally, in a case that the distribution feature of the data partition is a normal distribution feature, the determining the dynamic threshold of each data partition based on the distribution feature includes:
calculating the mean value and the standard deviation of any data partition; calculating a first product of the standard deviation and a third preset value; determining the sum of the first product and the mean as the upper threshold; determining a difference between the mean value and the first product as the lower threshold.
It is understood that, statistically, for data that follow a normal distribution, the interval from the mean minus 3 standard deviations to the mean plus 3 standard deviations covers approximately 99.73% of the data. Therefore, in this embodiment, if the monitoring data curve satisfies the normal distribution, the third preset value may be set to 3, and the dynamic threshold of each data partition is determined by adding or subtracting 3 times the standard deviation to or from the mean. The sum of the mean and 3 times the standard deviation is determined as the upper threshold; the difference between the mean and 3 times the standard deviation is determined as the lower threshold.
For a data partition, the mean and standard deviation of the data partition are calculated, it being understood that the mean and standard deviation are the mean and standard deviation of all the monitored values included in the data partition. The average value is a numerical value obtained by dividing the sum of all monitoring values by the number of the monitoring values; the standard deviation is a numerical value obtained by performing standard deviation operation on all the monitored values.
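The calculation for one data partition can be sketched as below. The function name is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) rather than the sample standard deviation is an assumption; the text does not say which is intended.

```python
import statistics


def normal_thresholds(values, k=3.0):
    """Return (lower, upper) as mean -/+ k standard deviations of one partition."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumed)
    return mean - k * std, mean + k * std


lower, upper = normal_thresholds([2, 4, 4, 4, 5, 5, 7, 9])
```

For the sample partition above, the mean is 5 and the population standard deviation is 2, so the thresholds are -1 and 11.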
Optionally, in a case that the distribution characteristic of the data partition is a third type of distribution characteristic, the determining the dynamic threshold of each data partition based on the distribution characteristic includes:
calculating a first quantile and a second quantile of any data partition; calculating a second product of the quantile difference value (the second quantile minus the first quantile) and a fourth preset value; determining a sum of the second product and the second quantile as the upper threshold; determining a difference between the first quantile and the second product as the lower threshold.
The criterion for judging the abnormal value in the normal distribution is based on calculating the mean and standard deviation of the data, whereas in the non-normal distribution, since the data does not follow the normal distribution, the present embodiment provides a calculation method that relies on actual data, does not require the assumption in advance that the data follows a specific distribution form, and does not impose any restrictive requirement on the data.
In this embodiment, for any data partition, the upper threshold and the lower threshold of the data partition satisfy the following formula:
UpperLimit = Q3 + 1.5 × IQR
LowerLimit = Q1 - 1.5 × IQR
wherein UpperLimit is the upper threshold; LowerLimit is the lower threshold; Q1 is the first quantile; Q3 is the second quantile; IQR is the quantile difference value, namely IQR = Q3 - Q1; and 1.5 is the fourth preset value.
It should be understood that the first quantile, the second quantile and the fourth preset value are all empirical values that can be set by the user. Preferably, the first quantile is set to the 25% quantile, the second quantile is set to the 75% quantile, and the fourth preset value is 1.5.
In the following, the meaning of the first quantile is specified by taking the 25% quantile as an example:
for example, if a data partition includes n monitored values, the n monitored values are sorted from large to small to obtain a sequence X = (x_(1), x_(2), ..., x_(n)), and the following may be defined:
Figure BDA0002911622150000171
where
Figure BDA0002911622150000172
and p is 25.
If L_p is an integer, the value of Q1 is the average of
Figure BDA0002911622150000173
and
Figure BDA0002911622150000174
If L_p is not an integer, L_p is rounded up; for example, when L_p is 1.2,
Figure BDA0002911622150000175
is x_(2).
The second quantile of the data partition may be determined based on the same principle.
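The quantile rule and the two threshold formulas can be sketched together as follows. Several points are assumptions made for this example, since the defining formulas appear only as figures: the position is taken as L_p = n × p / 100, the values are sorted in ascending order so that x_(1) is the smallest, and when L_p is an integer the quantile is the average of x_(L_p) and x_(L_p + 1). The function names are hypothetical.

```python
import math


def quantile(values, p):
    """p-th percentile using the integer / round-up rule described in the text."""
    xs = sorted(values)            # ascending order (assumed)
    n = len(xs)
    pos = n * p / 100              # assumed definition of L_p
    if pos.is_integer() and 0 < pos < n:
        k = int(pos)
        return (xs[k - 1] + xs[k]) / 2   # average of x_(L_p) and x_(L_p + 1)
    k = min(max(math.ceil(pos), 1), n)   # round up, clamped to a valid index
    return xs[k - 1]


def iqr_thresholds(values, k=1.5, lower_p=25, upper_p=75):
    """Return (LowerLimit, UpperLimit) = (Q1 - k*IQR, Q3 + k*IQR)."""
    q1 = quantile(values, lower_p)
    q3 = quantile(values, upper_p)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = iqr_thresholds([1, 2, 3, 4, 5, 6, 7, 8])
```

For the eight values above, L_p is 2 for the 25% quantile and 6 for the 75% quantile, giving Q1 = 2.5, Q3 = 6.5, IQR = 4, and thresholds of -3.5 and 12.5.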
Optionally, in a case that the distribution feature of the data partition is a fat tail distribution feature, the determining the dynamic threshold of each data partition based on the distribution feature includes:
for any data partition, acquiring the data amount, shape parameter, scale parameter, upper quantile, lower quantile and probability parameter of the data partition; determining a first number in the data partition, the first number being the number of data having a value greater than the upper quantile; determining a second number in the data partition, the second number being the number of data having a value less than the lower quantile; determining the upper threshold based on a Pareto distribution relation satisfied by the data amount, the shape parameter, the scale parameter, the upper quantile, the first number, and the probability parameter; determining the lower threshold based on a Pareto distribution relation satisfied by the data amount, the shape parameter, the scale parameter, the lower quantile, the second number, and the probability parameter.
In this embodiment, if the monitoring data curve satisfies the fat-tail distribution, extreme values of the curve occur with high probability, so that the dynamic threshold of each data partition can be determined by a calculation method based on extreme value theory.
In this embodiment, the data amount, the shape parameter, the scale parameter, the upper quantile, the lower quantile and the probability parameter of the data partition are obtained, and a first number and a second number are determined in the data partition; the first number is the number of data with a value greater than the upper quantile, and the second number is the number of data with a value less than the lower quantile. The manner of obtaining these quantities from the data partition is not specifically described in this embodiment.
The pareto distribution relation satisfied by the data volume, the shape parameter, the scale parameter, the upper quantile, the first number and the probability parameter satisfies the following formula:
Figure BDA0002911622150000181
wherein Z_q1 is the upper threshold; t_1 is the numerical value calculated based on the upper quantile; delta is the scale parameter; gamma is the shape parameter; q is the probability parameter; m is the number of monitored values; and N_t1 is the first number, that is, the number of monitored values among the m monitored values that are greater than or equal to the upper quantile.
The pareto distribution relation satisfied by the data volume, the shape parameter, the scale parameter, the lower quantile, the second number and the probability parameter satisfies the following formula:
Figure BDA0002911622150000182
wherein Z_q2 is the lower threshold; t_2 is the numerical value calculated based on the lower quantile; and N_t2 is the second number, that is, the number of monitored values among the m monitored values that are smaller than the lower quantile. The meaning of the other parameters in this formula is consistent with the formula for determining the upper threshold.
In the foregoing formula, the upper quantile and the lower quantile may be set in a user-defined manner based on actual conditions, and this is not particularly limited in the embodiment of the present invention. In an exemplary embodiment, the upper quantile may be a 95% quantile and the lower quantile may be a 5% quantile.
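As a rough illustration of an extreme-value threshold built from these parameters, the sketch below uses the generalized Pareto quantile estimate that is common in peaks-over-threshold methods: Z = t + (scale / shape) × ((q × m / N_t)^(-shape) - 1), mirrored for the lower tail. Whether this is exactly the relation used in the embodiment is an assumption, because the formulas themselves are not reproduced in this text. The function names are hypothetical, and the scale and shape parameters are taken as inputs because the text does not describe how they are fitted.

```python
import math


def pot_offset(scale, shape, q, m, n_exceed):
    """Distance from the initial quantile t to the extreme threshold (GPD estimate)."""
    ratio = q * m / n_exceed
    if abs(shape) < 1e-12:
        # Limit of the formula as the shape parameter tends to 0.
        return -scale * math.log(ratio)
    return (scale / shape) * (ratio ** (-shape) - 1.0)


def fat_tail_thresholds(values, t_upper, t_lower, scale, shape, q):
    """Return (lower, upper) thresholds for one data partition."""
    m = len(values)
    n_upper = sum(1 for v in values if v > t_upper)   # first number
    n_lower = sum(1 for v in values if v < t_lower)   # second number
    if n_upper == 0 or n_lower == 0:
        raise ValueError("no values beyond the upper or lower quantile")
    upper = t_upper + pot_offset(scale, shape, q, m, n_upper)
    lower = t_lower - pot_offset(scale, shape, q, m, n_lower)
    return lower, upper


# 100 values, 5 above t_upper and 5 below t_lower; parameter values are illustrative.
lower, upper = fat_tail_thresholds(list(range(1, 101)), t_upper=95, t_lower=6,
                                   scale=2.0, shape=0.0, q=0.001)
```

A smaller probability parameter q pushes both thresholds further from the initial quantiles, so fewer monitored values fall outside the resulting range.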
It will be appreciated that in some embodiments, the values of the scale parameter and the shape parameter described above are related to a complexity parameter, and that alternatively, the complexity parameter of the monitoring data curve may be calculated using a Lempel-Ziv complexity algorithm. It should be noted that the Lempel-Ziv complexity algorithm can measure the repetition degree of each monitoring data in the monitoring data curve, and the higher the repetition degree of the monitoring data curve is, the larger the complexity parameter is; the lower the repetition of the monitored data curve, the smaller the complexity parameter.
Further, the complexity parameter may be normalized to obtain a complexity parameter with a numerical range of 0 to 1. In an optional implementation, in the case that the complexity parameter is greater than a preset parameter value, the probability parameter takes a higher value, for example 0.05; in the case that the complexity parameter is less than or equal to the preset parameter value, the probability parameter takes a lower value, for example 0.001.
The preset parameter value is an empirical value, and optionally, the preset parameter value is 0.5.
Optionally, the monitoring the abnormality of the target monitoring item based on the dynamic threshold of each data partition includes:
when the monitoring value of the target monitoring item is out of the range indicated by the dynamic threshold value of the corresponding data partition, outputting an alarm aiming at the target monitoring item.
An optional implementation of monitoring the abnormality of the target monitoring item may be at least one of the following:
first, when the monitored value is greater than the upper threshold of the corresponding data partition, it is determined that the target monitored item is abnormal, and an alarm for the target monitored item is issued.
And secondly, when the monitoring value is smaller than the lower limit threshold of the corresponding data partition, determining that the target monitoring item is abnormal, and sending an alarm aiming at the target monitoring item.
Thirdly, when monitored values are greater than the upper threshold of the corresponding data partition and the number of such monitored values is greater than a preset number, it is determined that the target monitoring item is abnormal, and an alarm for the target monitoring item is issued.
Fourthly, when monitored values are smaller than the lower threshold of the corresponding data partition and the number of such monitored values is greater than the preset number, it is determined that the target monitoring item is abnormal, and an alarm for the target monitoring item is issued.
It can be understood that the foregoing 4 alarm manners may be used alone or in combination; when used in combination, the first cannot be combined with the third, and the second cannot be combined with the fourth.
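The alarm manners above can be sketched with a single check. This is a minimal sketch under one interpretation: the "number of the monitoring values" is read as the count of monitored values in the evaluated window that fall outside the threshold. With `preset_count=0` the check behaves like the first and second manners (any single out-of-range value triggers an alarm); with a larger `preset_count` it behaves like the third and fourth manners. The function name and parameters are hypothetical.

```python
def should_alarm(values, lower, upper, preset_count=0):
    """Return True when out-of-range values exceed the preset count on either side."""
    above = sum(1 for v in values if v > upper)   # values above the upper threshold
    below = sum(1 for v in values if v < lower)   # values below the lower threshold
    return above > preset_count or below > preset_count
```

For example, `should_alarm([1, 2, 30], lower=0, upper=10)` reports an anomaly because one value exceeds the upper threshold, while the same call with `preset_count=1` does not, since only one value is out of range.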
The technical solution is explained below with reference to a specific application scenario of an embodiment of the present invention. Referring to fig. 3, the anomaly monitoring system may obtain a monitoring request sent by a user from a message queue, parse the monitoring request, query a database with the index data corresponding to the monitoring request, and determine whether the monitoring item corresponding to the monitoring request is a target monitoring item; these steps may be understood as the consumption index task in fig. 3. In the case that the monitoring item corresponding to the monitoring request is a target monitoring item, a time sequence analysis module determines whether the monitoring data curve corresponding to the target monitoring item has a periodic distribution rule, a data partitioning module divides the monitoring data curve corresponding to the target monitoring item into a plurality of data partitions, and a baseline calculation module calculates the dynamic threshold of each data partition. It should be understood that the above baseline task may also be performed by a single module or by multiple modules. After the baseline threshold of each data partition is obtained, the baseline threshold is written into the database to realize the monitoring of the target monitoring item.
The anomaly monitoring scheme provided by the embodiment of the invention can be used in combination with other anomaly monitoring schemes. For example, this scheme performs a primary screening of suspicious samples in the monitoring item data, and a secondary screening is then performed using other anomaly monitoring methods, which helps to improve the coverage and resource utilization rate of the anomaly monitoring system.
In summary, different from the conventional dynamic threshold generation manner and the anomaly monitoring manner, the anomaly monitoring method provided in the embodiment of the present invention takes the historical distribution of the monitoring item data as a drive, and ensures screening out of suspicious samples in the time sequence samples of the monitoring item by estimating the normal data fluctuation interval, which is beneficial to improving the accuracy of anomaly monitoring; moreover, models and parameters of all monitoring items do not need to be established, trained, maintained and stored, so that computing resources and storage resources are saved; the scheme can meet the requirements of monitoring scenes of large-scale monitoring items on monitoring accuracy and monitoring efficiency.
As shown in fig. 4, an embodiment of the present invention further provides an anomaly monitoring system 200, including:
an obtaining module 201, configured to obtain a monitoring data curve corresponding to a target monitoring item;
a first determining module 202, configured to determine a dynamic threshold of each data partition in the monitoring data curve according to a distribution characteristic of the monitoring data curve;
and the monitoring module 203 is configured to perform exception monitoring on the target monitoring item based on the dynamic threshold of each data partition.
Optionally, the first determining module 202 includes:
a first determining unit, configured to determine, based on a curve skewness and a kurtosis of the monitoring data curve, a distribution characteristic that the monitoring data curve satisfies;
and the second determining unit is used for determining the dynamic threshold of each data partition based on the distribution characteristics.
Optionally, the first determining unit is further configured to:
judging whether the skewness of the monitoring data curve is equal to a first preset value or not;
if the skewness of the monitoring data curve is equal to a first preset value, determining that the distribution characteristic of the monitoring data curve is a normal distribution characteristic;
if the skewness of the monitoring data curve is not equal to a first preset value, judging whether the skewness of the monitoring data curve is larger than a second preset value or not, and judging whether the kurtosis of the monitoring data curve is larger than a preset kurtosis or not;
if the skewness of the monitoring data curve is larger than the second preset value and the kurtosis of the monitoring data curve is larger than the preset kurtosis, determining that the distribution characteristic of the monitoring data curve is a fat-tail distribution characteristic;
and if the skewness of the monitoring data curve is not greater than the second preset value and/or the kurtosis of the monitoring data curve is not greater than the preset kurtosis, determining that the distribution characteristic of the monitoring data curve is a third type of distribution characteristic, wherein the third type of distribution characteristic is any distribution characteristic other than the normal distribution and the fat-tail distribution.
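The three-way decision above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names, the default preset values (0 for the first preset value, 1 for the second, 3 for the preset kurtosis), and the tolerance `tol` used in place of a strict equality test on floating-point skewness are all assumptions introduced here.

```python
import statistics


def central_moment(xs, order):
    """Population central moment of the given order."""
    mean = statistics.fmean(xs)
    return sum((x - mean) ** order for x in xs) / len(xs)


def skewness(xs):
    """Third standardized moment; 0.0 for a constant series."""
    m2 = central_moment(xs, 2)
    return 0.0 if m2 == 0 else central_moment(xs, 3) / m2 ** 1.5


def kurtosis(xs):
    """Fourth standardized moment (non-excess); 0.0 for a constant series."""
    m2 = central_moment(xs, 2)
    return 0.0 if m2 == 0 else central_moment(xs, 4) / m2 ** 2


def classify_distribution(xs, first_preset=0.0, second_preset=1.0,
                          preset_kurtosis=3.0, tol=0.1):
    """Return 'normal', 'fat_tail', or 'other' (all presets are assumed values)."""
    s, k = skewness(xs), kurtosis(xs)
    # Assumption: "equal to the first preset value" is tested within a tolerance.
    if abs(s - first_preset) <= tol:
        return "normal"
    if s > second_preset and k > preset_kurtosis:
        return "fat_tail"
    return "other"
```

A symmetric sample such as `[1, 2, 3, 4, 5]` falls into the first branch, while a sample dominated by one large spike falls into the fat-tail branch.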
Optionally, the second determining unit is further configured to:
calculating the mean value and the standard deviation of any data partition;
calculating a first product of the standard deviation and a third preset value;
determining the sum of the first product and the mean as the upper threshold;
determining a difference between the mean value and the first product as the lower threshold.
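For the normal-distribution case, the steps above amount to a mean plus or minus k standard deviations band. The sketch below assumes a population standard deviation and a third preset value of 3; both choices, and the function name, are illustrative assumptions.

```python
import statistics


def normal_thresholds(partition, third_preset=3.0):
    """Return (lower, upper) thresholds for one data partition."""
    mean = statistics.fmean(partition)
    std = statistics.pstdev(partition)  # assumption: population standard deviation
    first_product = third_preset * std
    return mean - first_product, mean + first_product
```

For the partition `[2, 4, 4, 4, 5, 5, 7, 9]` the mean is 5 and the standard deviation is 2, so the band is roughly [-1, 11].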
Optionally, the second determining unit is further configured to:
calculating a first quantile and a second quantile of any data partition;
calculating a second product of a quantile difference value and a fourth preset value, wherein the quantile difference value is the difference between the second quantile and the first quantile;
determining a sum of the second product and the second quantile as the upper threshold;
determining a difference between the first quantile and the second product as the lower threshold.
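For the third type of distribution, the quantile-based steps above resemble the familiar interquartile-range rule. The sketch below assumes that the first and second quantiles are the lower and upper quartiles and that the fourth preset value is 1.5; this section does not fix either choice.

```python
import statistics


def quantile_thresholds(partition, fourth_preset=1.5):
    """Return (lower, upper) thresholds from two quantiles of one partition."""
    # Assumption: first quantile = Q1, second quantile = Q3.
    first_q, _, second_q = statistics.quantiles(partition, n=4, method="inclusive")
    second_product = fourth_preset * (second_q - first_q)
    return first_q - second_product, second_q + second_product
```

For the partition `1..9` the quartiles are 3 and 7, giving thresholds of -3 and 13.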
Optionally, the second determining unit is further configured to:
for any data partition, acquiring the data amount, shape parameter, scale parameter, upper quantile, lower quantile, and probability parameter of the data partition;
determining a first number in the data partition, the first number being a number of data having a value greater than the upper quantile;
determining, in the data partition, a second number, the second number being a number of data having a value less than the lower quantile;
determining the upper limit threshold value based on a pareto distribution relation satisfied by the data amount, the shape parameter, the scale parameter, the upper quantile, the first number, and the probability parameter;
determining the lower threshold based on a pareto distribution relationship that is satisfied by the data amount, the shape parameter, the scale parameter, the lower quantile, the second number, and the probability parameter.
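This section does not spell out the Pareto distribution relation itself. One commonly used form is the peaks-over-threshold estimate from extreme value theory, in which the threshold equals the initial quantile plus (scale / shape) * ((q * n / count) ** (-shape) - 1). The sketch below uses that form as an assumption; the function names are hypothetical, the shape and scale parameters are taken as already fitted, and the same pair is reused for both tails purely for brevity.

```python
import math


def count_exceedances(partition, upper_quantile, lower_quantile):
    """Return (first_number, second_number) as defined in the steps above."""
    first_number = sum(1 for x in partition if x > upper_quantile)
    second_number = sum(1 for x in partition if x < lower_quantile)
    return first_number, second_number


def pot_offset(n, shape, scale, count, q):
    """Distance from the initial quantile to the threshold (assumed POT form)."""
    if count <= 0:
        raise ValueError("at least one value must lie beyond the quantile")
    ratio = q * n / count
    if shape == 0:
        # Limit of the expression below as the shape parameter tends to zero.
        return -scale * math.log(ratio)
    return (scale / shape) * (ratio ** (-shape) - 1.0)


def pareto_thresholds(n, shape, scale, upper_quantile, lower_quantile,
                      first_number, second_number, q):
    """Return (lower, upper) thresholds for a fat-tailed partition."""
    upper = upper_quantile + pot_offset(n, shape, scale, first_number, q)
    lower = lower_quantile - pot_offset(n, shape, scale, second_number, q)
    return lower, upper
```

A smaller probability parameter `q` pushes both thresholds further from the initial quantiles, so fewer points are flagged.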
Optionally, the anomaly monitoring system 200 further includes:
the judging module is used for judging whether the monitoring data curve has a periodic distribution rule or not;
and the dividing module is used for dividing the monitoring data curve into a plurality of data partitions under the condition that the monitoring data curve has a periodic distribution rule.
Optionally, the determining module is further configured to:
calculating K autocorrelation coefficients corresponding to the monitoring data curve in K monitoring time periods, wherein K is a positive integer greater than 1;
and under the condition that the K autocorrelation coefficients are regularly distributed, determining that the monitoring data curve has a periodic distribution rule.
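The periodicity check can be illustrated as below. The text does not define what "regularly distributed" means for the K coefficients, so this sketch makes a strong assumption: the coefficients are taken at lags of 1 to K times a candidate period, and the curve is treated as periodic when all of them exceed a minimum correlation. The names `looks_periodic` and `min_corr` are hypothetical.

```python
import math


def autocorr(xs, lag):
    """Sample autocorrelation coefficient of xs at the given lag."""
    n = len(xs)
    if lag >= n:
        return 0.0
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return 0.0
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var


def looks_periodic(xs, period, k=3, min_corr=0.5):
    """Return (is_periodic, coefficients) for K lags at multiples of period."""
    coeffs = [autocorr(xs, period * i) for i in range(1, k + 1)]
    # Assumption: "regularly distributed" means every coefficient is high.
    return all(c >= min_corr for c in coeffs), coeffs
```

For a sine wave sampled 24 times per cycle over 10 cycles, the coefficients at lags 24, 48, and 72 are about 0.9, 0.8, and 0.7, whereas testing the wrong period of 12 yields a strongly negative first coefficient.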
Optionally, the dividing module is further configured to:
calculating the permutation entropy of the monitoring data curve;
carrying out normalization processing on the permutation entropy to obtain a normalized permutation entropy;
under the condition that the normalized permutation entropy is larger than a fifth preset value, dividing the monitoring data curve into M data partitions, wherein M is a positive integer larger than 1;
and under the condition that the normalized permutation entropy is less than or equal to the fifth preset value, dividing the monitoring data curve into L data partitions, wherein L is a positive integer greater than 1 and is greater than M.
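A minimal sketch of this partition-count decision follows. It uses the standard definition of permutation entropy (Shannon entropy of ordinal patterns, normalized by the logarithm of the number of possible patterns); the embedding order of 3, the fifth preset value of 0.6, and the counts M = 4 and L = 12 are illustrative assumptions only.

```python
import math
from collections import Counter


def normalized_permutation_entropy(xs, order=3):
    """Permutation entropy of xs divided by log(order!), so it lies in [0, 1]."""
    patterns = Counter(
        # Ordinal pattern: window indices sorted by the values they point at.
        tuple(sorted(range(order), key=lambda j: xs[i + j]))
        for i in range(len(xs) - order + 1)
    )
    total = sum(patterns.values())
    entropy = -sum((c / total) * math.log(c / total) for c in patterns.values())
    return entropy / math.log(math.factorial(order))


def partition_count(xs, fifth_preset=0.6, m=4, l=12, order=3):
    """Return M when the normalized entropy exceeds the preset, else L (L > M)."""
    if normalized_permutation_entropy(xs, order) > fifth_preset:
        return m
    return l
```

A monotonic series has a single ordinal pattern and an entropy of zero, so it receives L partitions; an irregular zigzag series has higher entropy and receives M partitions.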
Optionally, the anomaly monitoring system 200 further includes:
the second determining module is used for receiving a monitoring request of a user and determining index data corresponding to the monitoring request;
the query module is used for querying the index data in a preset database, and the database stores historical index data of all monitored monitoring items;
and the third determining module is used for determining the monitoring item corresponding to the index data as the target monitoring item under the condition that the index data is not stored in the database.
Optionally, the monitoring module 203 is further configured to:
when the monitoring value of the target monitoring item is out of the range indicated by the dynamic threshold value of the corresponding data partition, outputting an alarm aiming at the target monitoring item.
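The alarm step can be sketched as a simple range check against the thresholds of the partition that the new value belongs to. The `DynamicThreshold` type, the `check_value` function, and the alarm message format are hypothetical; how a timestamp is mapped to a partition index is left outside the sketch.

```python
from typing import NamedTuple, Optional, Sequence


class DynamicThreshold(NamedTuple):
    lower: float
    upper: float


def check_value(item: str, partition_index: int, value: float,
                thresholds: Sequence[DynamicThreshold]) -> Optional[str]:
    """Return an alarm message if value is outside its partition's range."""
    t = thresholds[partition_index]
    if value < t.lower or value > t.upper:
        return (f"ALERT {item}: value {value} outside "
                f"[{t.lower}, {t.upper}] in partition {partition_index}")
    return None
```

Values inside the range, including the boundary values themselves, produce no alarm in this sketch.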
An embodiment of the present invention further provides an electronic device, as shown in fig. 5, including a processor 301, a communication interface 302, a memory 303, and a communication bus 304, where the processor 301, the communication interface 302, and the memory 303 complete mutual communication through the communication bus 304,
a memory 303 for storing a computer program;
the processor 301 is configured to implement the anomaly monitoring method according to any one of the above embodiments when executing the program stored in the memory 303.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The Memory may include a Random Access Memory (RAM) or a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In another embodiment of the present invention, a computer-readable storage medium is further provided, in which instructions are stored, and when the instructions are executed on a computer, the computer is enabled to execute the abnormality monitoring method according to any one of the above embodiments.
In a further embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the anomaly monitoring method as described in any one of the above embodiments.
The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiments in the present specification are described in a related manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, its description is brief, and for the relevant points reference may be made to the corresponding description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (14)

1. An anomaly monitoring method, comprising the steps of:
acquiring a monitoring data curve of a target monitoring item, wherein the monitoring data curve is used for reflecting the corresponding relation between historical monitoring data and historical monitoring time of the target monitoring item;
determining a dynamic threshold value of each data partition in the monitoring data curve according to the distribution characteristics of the monitoring data curve, wherein the monitoring data curve comprises N data partitions, and N is a positive integer;
and performing anomaly monitoring on the target monitoring item based on the dynamic threshold value of each data partition.
2. The anomaly monitoring method according to claim 1, wherein determining the dynamic threshold of each data partition in the monitored data curve according to the distribution characteristics of the monitored data curve comprises:
determining distribution characteristics met by the monitoring data curve based on the curve skewness and kurtosis of the monitoring data curve;
determining a dynamic threshold value of each data partition based on the distribution characteristics; wherein the dynamic threshold comprises an upper threshold and a lower threshold.
3. The anomaly monitoring method according to claim 2, wherein the determining the distribution characteristic satisfied by the monitoring data curve based on the curve skewness and kurtosis of the monitoring data curve comprises:
judging whether the skewness of the monitoring data curve is equal to a first preset value or not;
if the skewness of the monitoring data curve is equal to a first preset value, determining that the distribution characteristic of the monitoring data curve is a normal distribution characteristic;
if the skewness of the monitoring data curve is not equal to a first preset value, judging whether the skewness of the monitoring data curve is larger than a second preset value or not, and judging whether the kurtosis of the monitoring data curve is larger than a preset kurtosis or not;
if the skewness of the monitoring data curve is larger than the second preset value and the kurtosis of the monitoring data curve is larger than the preset kurtosis, determining that the distribution characteristic of the monitoring data curve is a fat-tail distribution characteristic;
and if the skewness of the monitoring data curve is not greater than the second preset value and/or the kurtosis of the monitoring data curve is not greater than the preset kurtosis, determining that the distribution characteristic of the monitoring data curve is a third type of distribution characteristic, wherein the third type of distribution characteristic is any distribution characteristic other than the normal distribution and the fat-tail distribution.
4. The anomaly monitoring method according to claim 2, wherein in a case that the distribution characteristic of the data partition is a normal distribution characteristic, the determining the dynamic threshold of each data partition based on the distribution characteristic comprises:
calculating the mean value and the standard deviation of any data partition;
calculating a first product of the standard deviation and a third preset value;
determining the sum of the first product and the mean as the upper threshold;
determining a difference between the mean value and the first product as the lower threshold.
5. The anomaly monitoring method according to claim 2, wherein in a case that the distribution characteristic of the data partition is a third type distribution characteristic, the determining the dynamic threshold value of each data partition based on the distribution characteristic comprises:
calculating a first quantile and a second quantile of any data partition;
calculating a second product of a quantile difference value and a fourth preset value, wherein the quantile difference value is the difference between the second quantile and the first quantile;
determining a sum of the second product and the second quantile as the upper threshold;
determining a difference between the first quantile and the second product as the lower threshold.
6. The anomaly monitoring method according to claim 2, wherein in a case that the distribution characteristic of the data partition is a fat tail distribution characteristic, the determining the dynamic threshold of each data partition based on the distribution characteristic comprises:
for any data partition, acquiring the data amount, shape parameter, scale parameter, upper quantile, lower quantile, and probability parameter of the data partition;
determining a first number in the data partition, the first number being a number of data having a value greater than the upper quantile;
determining, in the data partition, a second number, the second number being a number of data having a value less than the lower quantile;
determining the upper limit threshold value based on a pareto distribution relation satisfied by the data amount, the shape parameter, the scale parameter, the upper quantile, the first number, and the probability parameter;
determining the lower threshold based on a pareto distribution relationship that is satisfied by the data amount, the shape parameter, the scale parameter, the lower quantile, the second number, and the probability parameter.
7. The anomaly monitoring method according to any one of claims 1-6, wherein before determining the dynamic threshold for each data partition in the monitored data curve based on the distribution characteristics of the monitored data curve, the method further comprises:
judging whether the monitoring data curve has a periodic distribution rule or not;
under the condition that the monitoring data curve has a periodic distribution rule, dividing the monitoring data curve into a plurality of data partitions;
the monitoring time periods corresponding to any two data partitions are not completely the same, and the monitoring time duration corresponding to any data partition is less than or equal to the minimum period of the monitoring data curve in the periodic distribution rule.
8. The anomaly monitoring method according to claim 7, wherein said determining whether the monitoring data curve has a periodic distribution rule comprises:
calculating K autocorrelation coefficients corresponding to the monitoring data curve in K monitoring time periods, wherein K is a positive integer greater than 1;
and under the condition that the K autocorrelation coefficients are regularly distributed, determining that the monitoring data curve has a periodic distribution rule.
9. The anomaly monitoring method of claim 7, wherein said dividing said monitored data curve into a plurality of data partitions comprises:
calculating the permutation entropy of the monitoring data curve;
carrying out normalization processing on the permutation entropy to obtain a normalized permutation entropy;
under the condition that the normalized permutation entropy is larger than a fifth preset value, dividing the monitoring data curve into M data partitions, wherein M is a positive integer larger than 1;
and under the condition that the normalized permutation entropy is less than or equal to the fifth preset value, dividing the monitoring data curve into L data partitions, wherein L is a positive integer greater than 1 and is greater than M.
10. The anomaly monitoring method according to any one of claims 1-6, wherein before the obtaining of the monitoring data curve of the target monitoring item, the method further comprises:
receiving a monitoring request of a user, and determining index data corresponding to the monitoring request;
inquiring the index data in a preset database, wherein the database stores historical index data of all monitored monitoring items;
and under the condition that the index data is not stored in the database, determining the monitoring item corresponding to the index data as a target monitoring item.
11. The anomaly monitoring method according to claim 1, wherein the anomaly monitoring of the target monitoring item based on the dynamic threshold of each data partition comprises:
when the monitoring value of the target monitoring item is out of the range indicated by the dynamic threshold value of the corresponding data partition, outputting an alarm aiming at the target monitoring item.
12. An anomaly monitoring system, comprising:
the acquisition module is used for acquiring a monitoring data curve of the target monitoring item, and the monitoring data curve is used for reflecting the corresponding relation between the historical monitoring data of the target monitoring item and the historical monitoring time;
the determining module is used for determining a dynamic threshold value of each data partition in the monitoring data curve according to the distribution characteristics of the monitoring data curve, wherein the monitoring data curve comprises N data partitions, and N is a positive integer;
and the monitoring module is used for performing anomaly monitoring on the target monitoring item based on the dynamic threshold value of each data partition.
13. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the anomaly monitoring method of any one of claims 1-11 when executing a program stored on a memory.
14. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the anomaly monitoring method according to any one of claims 1-11.
CN202110088001.6A 2021-01-22 2021-01-22 Abnormality monitoring method, abnormality monitoring system, electronic device, and storage medium Pending CN112882889A (en)


Publications (1)

Publication Number Publication Date
CN112882889A true CN112882889A (en) 2021-06-01

Family

ID=76050163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110088001.6A Pending CN112882889A (en) 2021-01-22 2021-01-22 Abnormality monitoring method, abnormality monitoring system, electronic device, and storage medium

Country Status (1)

Country Link
CN (1) CN112882889A (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination