CN112380100A - Method, apparatus and medium for detecting traffic anomaly based on directional deviation - Google Patents
- Publication number
- CN112380100A (application CN202011384760.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- real
- base
- baseline
- service
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3452—Performance evaluation by statistical analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Quality & Reliability (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Computer Hardware Design (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention provides a method, an apparatus and a medium for detecting service anomalies. The method comprises: a service baseline generation step of generating, based on historical service data of a specific service characteristic, a service baseline describing how the service data changes over time; an actual service data statistics step of recording the curve of the actual service data of the specific service characteristic over time; a tolerance determination step of determining, according to a tolerance algorithm, whether the actual service data falls within the tolerance range of the service baseline; a direction deviation calculation step of calculating, for adjacent data points, a baseline vector of the baseline data and a corresponding actual service data vector of the actual service data, and comparing the two vectors to obtain the direction deviation; and a service anomaly determination step of combining the direction deviations of all adjacent data points among a plurality of consecutive data points to determine whether a service anomaly has occurred.
Description
Technical Field
The present invention relates to the field of network traffic anomaly detection, and in particular to a method, apparatus, and medium that use a learned traffic baseline in a network and detect traffic anomalies from calculated vector direction deviations.
Background
Many studies detect anomalies using a traffic baseline. They typically take one characteristic of the network traffic, such as bandwidth (bps), packet rate (pps), packet length, or the value of some other characteristic field, compute statistics of that characteristic to form a baseline, and set a tolerance interval. The current actual traffic characteristic is then compared with the baseline value: if the difference between the actual value and the baseline value falls within the tolerance interval, it is treated as normal error; otherwise it is treated as an anomaly. Baseline-threshold approaches usually specify the threshold range as a percentage or as an absolute value.
Judging traffic anomalies by a threshold range is adequate in most scenarios, but in some scenarios problems cannot be found in time. For example, suppose the threshold range of the baseline for a certain characteristic is 10-20, so that any detected value inside the range counts as normal. Under normal conditions the detected data should fluctuate around 15. If the detected data stays at 19 for a long time, a traffic anomaly is very likely, yet a traditional threshold-based detection scheme cannot identify it. Detecting such anomalies is particularly important for industrial control systems: if data such as temperature or pressure stay high or stay low for a long time, the industrial system is likely to be damaged, with serious consequences. The Stuxnet worm discovered in 2010 essentially used this mode of attack, making Iranian uranium enrichment centrifuges run for long periods at the two extremes of high load and low load, which accelerated aging and led to a very high failure rate.
In view of these scenarios and the limitations of existing traffic anomaly detection algorithms, the present invention adopts a vector direction deviation detection algorithm, which identifies the change characteristics of service data more accurately and detects service anomalies effectively.
Disclosure of Invention
Service anomaly detection is a technical problem faced by all network service systems. The most common approach is to establish various traffic baselines, set a tolerance or threshold range, and judge whether the service is abnormal by checking whether the actual service data falls within the baseline range.
In an industrial control system, some service data curves must satisfy not only the tolerance or threshold but also a requirement on the direction of change. If the monitored data is within the tolerance range but stays at a high level for a long time (as in the example above, where the value stays at 19), an anomaly has probably occurred, and an alarm should be raised promptly so that the system can be inspected.
The present invention has been made in view of the above circumstances. The method, apparatus and medium for traffic anomaly detection based on direction deviation according to the present invention detect service anomalies mainly by comparing the direction of change of the actual curve with that of the service baseline over the same period. Compared with traditional anomaly detection algorithms, they introduce the concept of vector direction deviation on top of the baseline tolerance or threshold: the direction deviation between the real-time data and the baseline data is calculated, with the focus on the direction in which the curve fluctuates, so that service data departing from the correct direction of the baseline is detected in time. The change characteristics of the service data are thereby identified more accurately, and some better-hidden service anomalies are found. Moreover, the calculation is simple, consumes little CPU, and is easy to implement in engineering practice.
According to a first aspect of the present invention, a method for detecting a service anomaly is provided, the method comprising:
a service baseline generation step of generating, based on historical service data of a specific service characteristic, a service baseline (Base_t, Base_v) describing how the service data changes over time, wherein Base_t represents time and Base_v represents baseline data;
an actual service data statistics step of recording the curve (Real_t, Real_v) of the actual service data of the specific service characteristic over time, wherein Real_t represents time and Real_v represents the actual service data;
a tolerance determination step of determining, according to a tolerance algorithm, whether the actual service data Real_v at the time Real_t falls within the tolerance range of the service baseline (Base_t, Base_v);
a direction deviation calculation step of calculating, for adjacent data points, a baseline vector Base_d of the baseline data and a corresponding actual service data vector Real_d of the actual service data, and comparing the baseline vector Base_d with the actual service data vector Real_d to obtain the direction deviation d; and
a service anomaly determination step of combining the direction deviations d of all adjacent data points among a plurality of consecutive data points to determine whether a service anomaly has occurred.
Further, Base_t and Real_t have the same time interval, and the baseline data and the actual service data have the same period.
Further, in the direction deviation calculation step, two adjacent baseline data points are defined as (Base_t1, Base_v1) and (Base_t2, Base_v2), wherein
when Base_v2 - Base_v1 > 0, Base_d = 1;
when Base_v2 - Base_v1 = 0, Base_d = 0;
when Base_v2 - Base_v1 < 0, Base_d = -1; and wherein
the corresponding two adjacent actual service data points are defined as (Real_t1, Real_v1) and (Real_t2, Real_v2), wherein
when Real_v2 - Real_v1 > 0, Real_d = 1;
when Real_v2 - Real_v1 = 0, Real_d = 0;
when Real_v2 - Real_v1 < 0, Real_d = -1.
Further, the direction deviation d between the baseline vector Base_d of the baseline data and the actual service data vector Real_d is d = Real_d - Base_d, wherein
when d = 0, the direction of the actual service data is consistent with that of the baseline data;
when d = 2 or d = -2, the direction deviation of the actual service data from the baseline data is at its maximum and the vector directions are opposite.
Further, in the service anomaly determination step, a sum direction deviation D is calculated by adding up the direction deviations d of all adjacent data points among the plurality of data points, and
when the sum direction deviation D falls outside a preset direction deviation threshold range, the actual service data is determined to be abnormal.
Further, the direction deviation threshold range is determined according to the number of the plurality of data points.
Further, in the tolerance determination step, if the actual service data Real_v exceeds the tolerance range, the service data is directly determined to be abnormal.
Further, in the service baseline generation step, the specific service characteristic is taken as the detection object, and a service baseline model is established through service cycle learning based on the historical service data of that characteristic, so as to form a time-related curve of the service baseline (Base_t, Base_v).
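The direction, deviation and summation rules of the first aspect can be restated compactly. The following is only a notational summary, assuming n consecutive data points; the index i and the bounds T_min and T_max of the direction deviation threshold range are symbols introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
\mathrm{Base\_d}_i &= \operatorname{sgn}\left(\mathrm{Base\_v}_{i+1} - \mathrm{Base\_v}_i\right), \\
\mathrm{Real\_d}_i &= \operatorname{sgn}\left(\mathrm{Real\_v}_{i+1} - \mathrm{Real\_v}_i\right), \\
d_i &= \mathrm{Real\_d}_i - \mathrm{Base\_d}_i \in \{-2, -1, 0, 1, 2\}, \\
D &= \sum_{i=1}^{n-1} d_i, \qquad \text{abnormal if } D \notin [T_{\min}, T_{\max}].
\end{aligned}
```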
According to a second aspect of the present invention, there is provided a traffic anomaly detection apparatus, comprising:
a service baseline generation module configured to generate, based on historical service data of a specific service characteristic, a service baseline (Base_t, Base_v) describing how the service data changes over time, wherein Base_t represents time and Base_v represents baseline data;
an actual service data statistics module configured to record the curve (Real_t, Real_v) of the actual service data of the specific service characteristic over time, wherein Real_t represents time and Real_v represents the actual service data;
a tolerance determination module configured to determine, according to a tolerance algorithm, whether the actual service data Real_v at the time Real_t falls within the tolerance range of the service baseline (Base_t, Base_v);
a direction deviation calculation module configured to calculate, for adjacent data points, a baseline vector Base_d of the baseline data and a corresponding actual service data vector Real_d of the actual service data, and to compare the baseline vector Base_d with the actual service data vector Real_d to obtain the direction deviation d; and
a service anomaly determination module configured to combine the direction deviations d of all adjacent data points among a plurality of consecutive data points to determine whether a service anomaly has occurred.
Further, Base_t and Real_t have the same time interval, and the baseline data and the actual service data have the same period.
Further, the direction deviation calculation module defines two adjacent baseline data points as (Base_t1, Base_v1) and (Base_t2, Base_v2), wherein
when Base_v2 - Base_v1 > 0, Base_d = 1;
when Base_v2 - Base_v1 = 0, Base_d = 0;
when Base_v2 - Base_v1 < 0, Base_d = -1; and wherein
the direction deviation calculation module defines the corresponding two adjacent actual service data points as (Real_t1, Real_v1) and (Real_t2, Real_v2), wherein
when Real_v2 - Real_v1 > 0, Real_d = 1;
when Real_v2 - Real_v1 = 0, Real_d = 0;
when Real_v2 - Real_v1 < 0, Real_d = -1.
Further, the direction deviation d between the baseline vector Base_d of the baseline data and the actual service data vector Real_d is d = Real_d - Base_d, wherein
when d = 0, the direction of the actual service data is consistent with that of the baseline data; and
when d = 2 or d = -2, the direction deviation of the actual service data from the baseline data is at its maximum and the vector directions are opposite.
Further, the service anomaly determination module calculates a sum direction deviation D by adding up the direction deviations d of all adjacent data points among the plurality of data points, and
when the sum direction deviation D falls outside a preset direction deviation threshold range, the actual service data is determined to be abnormal.
According to a third aspect of the present invention, there is provided a traffic abnormality detection apparatus including a storage unit storing a program and a processing unit, wherein,
the processing unit executes the program to implement the steps of the method of the first aspect.
According to a fourth aspect of the present invention, there is provided a computer-readable medium, wherein,
the medium has stored thereon a program that is executed to implement the steps of the method according to the first aspect.
The technical solutions of the present invention will be described in further detail below with reference to the drawings and preferred embodiments of the present invention, and the advantageous effects of the present invention will be further apparent.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention.
Fig. 1 is a diagram illustrating the steps of a traffic anomaly detection method based on directional deviation according to a preferred embodiment of the present invention;
Fig. 2 is a schematic diagram showing the configuration of a traffic anomaly detection apparatus based on directional deviation according to a preferred embodiment of the present invention;
Fig. 3 illustrates a specific example of the curves of actual service data and the service baseline over a period of time;
Fig. 4 illustrates another specific example of the curves of actual service data and the service baseline over another period of time; and
Fig. 5 is a diagram showing a schematic configuration of a computer system of the traffic anomaly detection apparatus based on directional deviation according to a preferred embodiment of the present invention.
Detailed Description
The technical solution of the present invention will be clearly and completely described below with reference to the specific embodiments of the present invention and the accompanying drawings. It is to be understood that the described embodiments are only a few of the presently preferred embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The detailed steps of the traffic anomaly detection method based on directional deviation according to the present invention are described below with reference to fig. 1.
Fig. 1 is a schematic diagram illustrating steps of a traffic anomaly detection method based on directional deviation according to a preferred embodiment of the present invention, and as shown in fig. 1, the traffic anomaly detection method based on directional deviation according to the present invention includes a traffic baseline generation step S1, an actual traffic data statistics step S2, a tolerance determination step S3, a directional deviation calculation step S4, and a traffic anomaly determination step S5.
The steps of the above-mentioned traffic anomaly detection method based on directional deviation will be described in detail with reference to fig. 1.
S1: Service baseline generation step
Based on historical service data of a specific service characteristic, a service baseline (Base_t, Base_v) describing how the service data changes over time is generated.
Specifically, the chosen service characteristic is taken as the detection object, and a service baseline model is established through service cycle learning based on historical service data of that characteristic, forming a time-related service baseline curve. The service baseline is typically a sequence of 2-dimensional vectors: each baseline value can be represented by (Base_t, Base_v), where Base_t represents time and Base_v represents baseline service data. A service baseline can be plotted as a curve in which the times Base_t usually form an arithmetic sequence, that is, the time intervals are equal.
The service characteristic may be network traffic (Kbps), packet rate (pps), CPU, memory, temperature, rotational speed, voltage, air pressure, and the like.
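The text does not specify how service cycle learning builds the baseline. As a minimal sketch, one simple possibility (an assumption, not the method of the invention) is to average the historical samples that fall at the same position within each period. The function name `build_baseline` and its parameters are hypothetical.

```python
from statistics import mean


def build_baseline(history, period, interval=1):
    """Return a baseline as a list of (Base_t, Base_v) pairs.

    history  -- flat list of historical values sampled at a fixed interval
    period   -- number of samples in one service period
    interval -- time between two samples (arbitrary unit)

    Assumption: the baseline value at each position in the period is the
    mean of all historical samples observed at that position.
    """
    if period <= 0 or len(history) < period:
        raise ValueError("history must cover at least one full period")
    baseline = []
    for pos in range(period):
        samples = history[pos::period]  # same position in every period
        baseline.append((pos * interval, mean(samples)))
    return baseline


# Two historical periods of three samples each (hypothetical data).
print(build_baseline([10, 20, 30, 12, 22, 32], period=3))
```

Because Base_t is generated as `pos * interval`, the times form an arithmetic sequence, matching the description above.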
S2: Actual service data statistics step
The curve (Real_t, Real_v) of the actual service data of the specific service characteristic over time is recorded.
Specifically, the change of the actual service data over time is recorded in real time. Each actual service data point can be represented by (Real_t, Real_v), where Real_t represents time and Real_v represents the actual service data. The change of the actual service data over time can be plotted as a curve in which the times Real_t usually form an arithmetic sequence, that is, the time intervals are equal. Preferably, Base_t and Real_t have the same time interval, and the baseline data and the actual service data have the same period (the length of a single cycle).
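The equal-interval condition can be checked before any comparison is made. The sketch below assumes integer timestamps (so that differences compare exactly); the helper names `sampling_interval` and `same_sampling` are hypothetical.

```python
def sampling_interval(times):
    """Return the common step of an evenly spaced time series.

    Raises ValueError if the times do not form an arithmetic sequence.
    """
    if len(times) < 2:
        raise ValueError("need at least two time points")
    step = times[1] - times[0]
    for earlier, later in zip(times, times[1:]):
        if later - earlier != step:
            raise ValueError("time points are not evenly spaced")
    return step


def same_sampling(base_t, real_t):
    """True if both series have the same length and the same interval."""
    return (len(base_t) == len(real_t)
            and sampling_interval(base_t) == sampling_interval(real_t))


print(same_sampling([0, 5, 10], [100, 105, 110]))  # same 5-unit interval
```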
S3: tolerance determination step
According to the tolerance algorithm, it is determined whether the actual service data Real_v at the time Real_t falls within the tolerance range of the service baseline (Base_t, Base_v).
Specifically, if the actual service data Real_v exceeds the tolerance range, the actual service data is determined to be abnormal, and the abnormal result can be reported directly.
The tolerance algorithm used by the invention may be, for example, an absolute-value deviation or a proportional deviation.
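The two tolerance variants named above can be sketched as follows. The function name `within_tolerance` and the parameters `abs_tol` and `rel_tol` are hypothetical; the text names only the two kinds of deviation, not how they are configured.

```python
def within_tolerance(real_v, base_v, abs_tol=None, rel_tol=None):
    """Check Real_v against Base_v.

    abs_tol -- maximum allowed absolute deviation |Real_v - Base_v|
    rel_tol -- maximum allowed deviation as a fraction of |Base_v|
    At least one limit must be given; if both are given, both must hold.
    """
    if abs_tol is None and rel_tol is None:
        raise ValueError("specify abs_tol, rel_tol, or both")
    deviation = abs(real_v - base_v)
    if abs_tol is not None and deviation > abs_tol:
        return False
    if rel_tol is not None and deviation > rel_tol * abs(base_v):
        return False
    return True


# A value of 19 against a baseline of 15 passes an absolute tolerance of 5,
# which is the situation the direction deviation check is meant to catch.
print(within_tolerance(19, 15, abs_tol=5))
```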
S4: Direction deviation calculation step
Vector direction deviation detection is performed on the actual service data and the baseline data of the same period: the baseline vector Base_d of the baseline data at adjacent time points and the actual service data vector Real_d of the corresponding actual service data are calculated, and Base_d is compared with Real_d to obtain the direction deviation d.
The specific steps are as follows.
First, take 2 corresponding data points in the actual service data and the baseline data as an example. Assume that the 2 data points in the baseline data are (Base_t1, Base_v1) and (Base_t2, Base_v2), and that the corresponding 2 data points in the actual service data are (Real_t1, Real_v1) and (Real_t2, Real_v2). The time difference is the same for both, that is, Base_t2 - Base_t1 = Real_t2 - Real_t1.
Let the baseline vector be Base_d and the actual service data vector be Real_d. Both are calculated by the same rule, namely the sign of the change between the two adjacent values:
when Base_v2 - Base_v1 > 0, Base_d = 1; when Base_v2 - Base_v1 = 0, Base_d = 0; when Base_v2 - Base_v1 < 0, Base_d = -1;
when Real_v2 - Real_v1 > 0, Real_d = 1; when Real_v2 - Real_v1 = 0, Real_d = 0; when Real_v2 - Real_v1 < 0, Real_d = -1.
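The rule above is a sign function applied to the difference of two adjacent values. A minimal sketch follows; the function names `direction` and `directions` are hypothetical.

```python
def direction(v1, v2):
    """Return 1 if the value rises from v1 to v2, 0 if flat, -1 if it falls."""
    diff = v2 - v1
    return (diff > 0) - (diff < 0)  # booleans subtract to 1, 0 or -1


def directions(values):
    """Direction of every adjacent pair in a series (n points, n-1 results)."""
    return [direction(a, b) for a, b in zip(values, values[1:])]


# The same function serves for Base_d and Real_d.
print(directions([20, 22, 21, 21]))  # [1, -1, 0]
```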
The values of the baseline vector Base_d and the actual service data vector Real_d are thus obtained separately by the above calculation, and the direction deviation d is calculated from them:
d = Real_d - Base_d.
From the possible values of Base_d and Real_d, the value range of d is -2 ≤ d ≤ 2.
As can be seen from the above calculation:
when d = 0, the direction of the actual data is completely consistent with the direction of the baseline data;
when d = 2, the direction deviation between the actual data and the baseline data is positive and at its maximum, meaning that the baseline vector points downward while the actual data vector points upward;
when d = -2, the direction deviation between the actual data and the baseline data is negative and at its maximum, meaning that the baseline vector points upward while the actual data vector points downward.
Therefore, the direction deviation of the vectors of 2 data points can be determined from the value of d.
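The cases described above can be checked with a short sketch that computes d for one segment. The function names are hypothetical, and the input values are made up for illustration.

```python
def sign(x):
    """Sign of x as 1, 0 or -1."""
    return (x > 0) - (x < 0)


def direction_deviation(base_pair, real_pair):
    """d = Real_d - Base_d for one pair of adjacent data points.

    base_pair -- (Base_v1, Base_v2)
    real_pair -- (Real_v1, Real_v2)
    """
    base_d = sign(base_pair[1] - base_pair[0])
    real_d = sign(real_pair[1] - real_pair[0])
    return real_d - base_d


print(direction_deviation((20, 18), (20, 23)))  # baseline falls, actual rises: 2
print(direction_deviation((18, 20), (23, 20)))  # baseline rises, actual falls: -2
print(direction_deviation((18, 20), (19, 22)))  # both rise: 0
```

Intermediate values of 1 and -1 arise when exactly one of the two series is flat over the segment.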
S5: service abnormality determination step
The direction deviations d of all adjacent data points among the consecutive data points are combined to determine whether a service anomaly has occurred. Specifically, a sum direction deviation D is calculated by adding up the direction deviations d of all adjacent data points among the plurality of data points, and when the sum direction deviation D falls outside a preset direction deviation threshold range, the actual service data is determined to be abnormal.
Since the actual service data is always allowed to fluctuate around the baseline data, the direction deviation of only 2 data points is not enough to judge whether the service is abnormal. To determine an anomaly more accurately, the direction deviations of a plurality of consecutive data points must be combined into an overall judgment.
Take the data of 10 service points as an example. The 10 data points form 9 segments; the direction deviation d is calculated for each segment and denoted d1, d2, d3, ..., d9. With the sum direction deviation denoted D, D can be expressed as:
D = d1 + d2 + d3 + ... + d9
As can be seen, the larger the absolute value of D, the larger the direction deviation between the actual service data vectors and the baseline vectors. If the actual data fluctuates up and down around the baseline data, the values of d include both positive and negative numbers, which cancel in the sum D. In fact, if the baseline is reasonably accurate, D approaches 0 when the sample is large enough. However, too many samples are unsuitable for anomaly determination, because too much historical data impairs its timeliness. In theory, the number of samples should not exceed the number of data points in 1 service period; otherwise the deviations may cancel across periods and weaken the direction deviation detection. For example, if the service period is 40 points, that is, the cycle restarts every 40 points, then the number of samples should be less than 40, and the exact number should be chosen with reference to the waveform of the service data. In general about 10 samples is suitable, and the number can be adjusted to the actual needs of the service.
Once the number of samples has been chosen according to the actual needs of the service, the direction deviation threshold range is determined from that number. It can then be judged whether the sum direction deviation falls outside the direction deviation threshold range; if it does, the service is abnormal.
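The summation and threshold check of step S5 can be sketched as below. The function names and the sample series are hypothetical; the threshold range is passed in by the caller, since the text only says that it depends on the number of samples.

```python
def sign(x):
    """Sign of x as 1, 0 or -1."""
    return (x > 0) - (x < 0)


def sum_direction_deviation(base_values, real_values):
    """D = sum of d over all adjacent pairs of two equally long series."""
    if len(base_values) != len(real_values):
        raise ValueError("series must have the same length")
    total = 0
    for i in range(len(base_values) - 1):
        base_d = sign(base_values[i + 1] - base_values[i])
        real_d = sign(real_values[i + 1] - real_values[i])
        total += real_d - base_d
    return total


def is_direction_abnormal(base_values, real_values, threshold_range):
    """True if D falls outside the (low, high) threshold range."""
    low, high = threshold_range
    total = sum_direction_deviation(base_values, real_values)
    return not (low <= total <= high)


# Hypothetical 10-point window: the baseline oscillates, the actual data rises.
base = [20, 22, 20, 22, 20, 22, 20, 22, 20, 22]
rising = [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
print(sum_direction_deviation(base, rising))         # 8
print(is_direction_abnormal(base, rising, (-6, 6)))  # True
```

With 10 samples there are 9 segments, so D can range from -18 to 18; a series that follows the baseline's ups and downs gives D = 0.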
The steps of the traffic anomaly detection method based on directional deviation according to the present invention are described in detail above. To facilitate understanding, how the traffic anomaly detection method based on the direction deviation performs traffic anomaly detection based on the direction deviation will be described below by using a specific example.
It is to be understood that the following text is merely illustrative of one or more specific embodiments of the invention and does not strictly limit the scope of the invention as specifically claimed.
The service baseline of the present embodiment concerns the memory usage (the specific service characteristic) of a certain service process. For simplicity, the memory unit is omitted; for example, a usage of 20 MB is recorded as 20. In addition, this embodiment omits the conventional tolerance determination process, that is, step S3 described above, and assumes that none of the actual service data exceeds the tolerance range in step S3.
Assume that, for a certain period of time, the baseline data of the memory usage service baseline (obtained through step S1) and the actual service data (obtained through step S2) are as shown in the following table (Table 1):
TABLE 1
Referring to Fig. 3, the curves of the actual service data 1 and the service baseline over time are shown.
Base_d between each pair of adjacent data points is calculated as 1, -1, -1, 0, respectively.
Real_d1 between each pair of adjacent data points is calculated as -1, -1, respectively.
The direction deviation d1 = Real_d1 - Base_d is calculated as -2, 0, -2, 0, -2, -1, respectively.
The corresponding sum direction deviation D1 is the sum of the direction deviations d1 calculated above, that is, D1 = -3.
Assume that the memory usage baseline data and the actual service data for another period of time are as shown in the following table (Table 2):
TABLE 2
Referring to Fig. 4, the curves of the actual service data 2 and the service baseline over time are shown.
Base_d between each pair of adjacent data points is calculated as 1, -1, -1, 0, respectively.
Real_d2 between each pair of adjacent data points is calculated as 1, respectively.
The direction deviation d2 = Real_d2 - Base_d is calculated as 0, 2, 0, 2, 0, and 1, respectively.
The corresponding sum direction deviation D2 is the sum of the direction deviations d2 calculated above, that is, D2 = 7.
If the direction deviation threshold range is set to [-6, 6], then according to the above calculation D1 is within the range and D2 is outside it, so the actual service data 1 is determined to be normal and the actual service data 2 is determined to have a service anomaly.
It can be seen that although the actual service data 2 stays within the tolerance range of step S3, that is, within the threshold range of memory usage, its direction deviation falls outside the direction deviation threshold range, so a service anomaly may exist and deserves attention.
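The situation just described, data that passes the tolerance check of S3 but fails the direction check of S4 and S5, can be reproduced with a short end-to-end sketch. The data below is hypothetical and is not the data of Table 1 or Table 2; the function name `detect`, the absolute tolerance of 5 and the returned labels are assumptions made for illustration.

```python
def sign(x):
    """Sign of x as 1, 0 or -1."""
    return (x > 0) - (x < 0)


def detect(base_values, real_values, abs_tol, threshold_range):
    """Run S3 (tolerance), then S4 and S5 (direction deviation) on one window."""
    if len(base_values) != len(real_values):
        raise ValueError("series must have the same length")
    # S3: any point outside the tolerance range is reported immediately.
    for base_v, real_v in zip(base_values, real_values):
        if abs(real_v - base_v) > abs_tol:
            return "tolerance_exceeded"
    # S4: direction deviation d for every pair of adjacent points.
    deviations = [
        sign(real_values[i + 1] - real_values[i])
        - sign(base_values[i + 1] - base_values[i])
        for i in range(len(base_values) - 1)
    ]
    # S5: compare the sum D with the direction deviation threshold range.
    low, high = threshold_range
    total = sum(deviations)
    return "normal" if low <= total <= high else "direction_abnormal"


base = [20, 22] * 5                           # oscillating baseline, 10 points
drifting = [18 + 0.5 * i for i in range(10)]  # steady rise from 18 to 22.5
print(detect(base, drifting, abs_tol=5, threshold_range=(-6, 6)))
```

Every point of `drifting` lies within 5 of the baseline, yet the series rises in all 9 segments while the baseline alternates, giving D = 8 and the result `direction_abnormal`.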
The traffic anomaly detection method based on directional deviation and a specific embodiment of it have been described in detail above. The method detects service anomalies mainly by comparing the direction of change of the actual curve with that of the service baseline over the same period. Compared with traditional anomaly detection algorithms, it introduces the concept of vector direction deviation on top of the baseline tolerance or threshold: the direction deviation between the real-time data and the baseline data is calculated, with the focus on the direction in which the curve fluctuates, so that service data departing from the correct direction of the baseline is detected in time. The change characteristics of the service data are thereby identified more accurately, and some better-hidden service anomalies are found. Moreover, the calculation is simple, consumes little CPU, and is easy to implement in engineering practice.
In another aspect of the present invention, a traffic anomaly detection apparatus 100 based on directional deviation is provided as shown in fig. 2, and the traffic anomaly detection apparatus 100 based on directional deviation according to the present invention includes a traffic baseline generation module 110, an actual traffic data statistics module 120, a tolerance determination module 130, a directional deviation calculation module 140, and a traffic anomaly determination module 150.
The respective modules of the above-described traffic abnormality detection apparatus 100 based on the directional deviation will be described in detail below with reference to fig. 2.
Business baseline generation module 110
The service baseline generation module 110 is configured to generate a service baseline (Base _ t, Base _ v) of service data changing with time based on historical service data of a specific service feature.
Specifically, the business baseline generating module 110 takes the determined business features as detection objects, and based on historical business data of the business features, a business baseline model is established according to business cycle learning, so as to form a time-dependent business baseline curve. The traffic baseline typically represents a 2-dimensional vector, each baseline value may be represented by (Base _ t, Base _ v), Base _ t representing time, and Base _ v representing baseline traffic data. An actual traffic baseline may be graphically represented as a curve, where the times Base _ t are typically an arithmetic series, i.e., the time intervals are the same.
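The text does not specify how the baseline model is learned from the business cycle. As a minimal sketch only, assuming that the historical data are sampled at a fixed interval and that the baseline value of each time slot is simply the mean of that slot over all complete historical periods (an assumption, not the method of the invention), a baseline could be built as follows. The function name `build_baseline` and its parameters are hypothetical.

```python
from statistics import mean


def build_baseline(history, period, interval=1):
    """Return one period of (Base_t, Base_v) pairs.

    history:  historical values sampled at a fixed interval (assumption)
    period:   number of samples in one service period
    interval: time step between samples, so Base_t is an arithmetic series
    """
    if period <= 0 or len(history) < period:
        raise ValueError("need at least one full period of history")
    cycles = len(history) // period  # a trailing partial period is ignored
    baseline = []
    for slot in range(period):
        # Collect the value of this time slot from every complete period.
        values = [history[c * period + slot] for c in range(cycles)]
        baseline.append((slot * interval, mean(values)))
    return baseline
```

Any other learning scheme that yields equally spaced (Base_t, Base_v) pairs would serve the later steps equally well.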
The service characteristics can be network flow Kbps, packet rate pps, CPU, memory, temperature, rotating speed, voltage, air pressure and the like.
Actual business data statistics module 120
The actual traffic data statistics module 120 is configured to count a time variation curve (Real _ t, Real _ v) of the actual traffic data of the specific traffic characteristic.
Specifically, the actual service data statistics module 120 counts the change of the actual service data with time in real time, where each actual service data point may be represented by (Real_t, Real_v), Real_t representing time and Real_v representing the actual service data. The change of the actual traffic data with time can be graphically represented as a curve, where the times Real_t generally form an arithmetic progression, i.e., the time intervals are the same. Preferably, the time interval of Base_t is the same as that of Real_t, and the period (unit time length) of the baseline data is the same as that of the actual traffic data.
Tolerance determination module 130
The tolerance determining module 130 is configured to determine whether the actual traffic data Real _ v at the time point Real _ t meets a tolerance range of a traffic baseline (Base _ t, Base _ v) according to a tolerance algorithm.
Specifically, if the actual service data Real _ v exceeds the tolerance range, the tolerance determining module 130 determines that the actual service data is abnormal, and may directly report the abnormal result.
As an example, the tolerance algorithm on which the invention is based may employ, for example, an absolute value deviation or a proportional deviation.
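The two tolerance algorithms named above can be sketched as follows. This is an illustration only: the function name `within_tolerance`, the `mode` switch, and the default limit are assumptions, and the handling of a zero baseline value in the proportional case is one possible choice that the text does not address.

```python
def within_tolerance(real_v, base_v, mode="absolute", limit=10.0):
    """Check whether Real_v lies within the tolerance range around Base_v.

    mode "absolute":     |Real_v - Base_v| <= limit
    mode "proportional": |Real_v - Base_v| / |Base_v| <= limit
    """
    deviation = abs(real_v - base_v)
    if mode == "absolute":
        return deviation <= limit
    if mode == "proportional":
        if base_v == 0:
            # Assumption: with a zero baseline only an exact match passes.
            return deviation == 0
        return deviation / abs(base_v) <= limit
    raise ValueError("mode must be 'absolute' or 'proportional'")
```

A point for which this check fails would be reported as abnormal directly, without waiting for the direction deviation calculation.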
Direction deviation calculation module 140
The direction deviation calculation module 140 is configured to perform vector direction deviation detection on actual service data and baseline data in the same period, calculate a baseline vector Base _ d of the baseline data at adjacent time points and an actual service data vector Real _ d of the actual service data, respectively, and compare the baseline vector Base _ d with the actual service data vector Real _ d to obtain a direction deviation d.
The specific steps are as follows:
First, 2 corresponding data points in the actual traffic data and the baseline data are taken as an example for explanation. Assume that the 2 data points in the baseline data are (Base_t1, Base_v1) and (Base_t2, Base_v2), and the corresponding 2 data points in the actual traffic data are (Real_t1, Real_v1) and (Real_t2, Real_v2). The time difference between the two points is the same in both series, namely, Base_t2 - Base_t1 = Real_t2 - Real_t1.
Assuming that the baseline vector is Base_d and the actual traffic data vector is Real_d, the specific calculation performed by the direction deviation calculation module 140 is expressed in pseudo code as follows:
If(Base_v2-Base_v1>0)Then
Base_d=1
Else If(Base_v2-Base_v1==0)Then
Base_d=0
Else
Base_d=-1;
The direction deviation calculation module 140 calculates the vector Real_d of the real-time data by the same algorithm:
If(Real_v2-Real_v1>0)Then
Real_d=1
Else If(Real_v2-Real_v1==0)Then
Real_d=0
Else
Real_d=-1;
That is, when Base_v2 - Base_v1 > 0, Base_d = 1; when Base_v2 - Base_v1 = 0, Base_d = 0; when Base_v2 - Base_v1 < 0, Base_d = -1.
When Real_v2 - Real_v1 > 0, Real_d = 1; when Real_v2 - Real_v1 = 0, Real_d = 0; when Real_v2 - Real_v1 < 0, Real_d = -1.
The direction deviation calculation module 140 thus obtains the values of the baseline vector Base_d and the actual traffic data vector Real_d by the above calculation, and calculates the direction deviation d from these two vectors as:
d = Real_d - Base_d.
Based on the values of the baseline vector Base _ d and the actual service data vector Real _ d, it can be seen that the value range of d is-2 ≤ d ≤ 2.
As can be seen from the above description of the calculation process:
when d is 0, the direction of the actual data is completely consistent with the direction of the baseline data;
when d is 2, the direction deviation between the actual data and the baseline data is positive and reaches its maximum, meaning that the baseline vector points downward but the actual data vector points upward;
when d is -2, the direction deviation between the actual data and the baseline data is negative and reaches its maximum magnitude, meaning that the baseline vector points upward but the actual data vector points downward.
Therefore, the direction deviation of the vector of 2 data points can be determined from the value of d.
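The per-segment calculation described above can be sketched in Python as follows. The function names `direction` and `direction_deviation` are hypothetical; the logic follows the three-way comparison and the formula d = Real_d - Base_d given in the text.

```python
def direction(v1, v2):
    """Return 1 if the value rises from v1 to v2, 0 if unchanged, -1 if it falls."""
    diff = v2 - v1
    return (diff > 0) - (diff < 0)


def direction_deviation(base_v1, base_v2, real_v1, real_v2):
    """Return d = Real_d - Base_d for one pair of adjacent data points."""
    base_d = direction(base_v1, base_v2)
    real_d = direction(real_v1, real_v2)
    return real_d - base_d
```

For example, a falling baseline (5 to 3) with rising actual data (4 to 6) gives d = 2, and the reverse situation gives d = -2.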
Service anomaly determination module 150
The traffic anomaly determination module 150 is configured to integrate the direction deviations d of a plurality of consecutive data points to determine whether a traffic anomaly occurs. Specifically, the traffic anomaly determination module 150 calculates a sum direction deviation D by adding up the direction deviations d of all adjacent data points in the plurality of data points, and determines that the actual traffic data is anomalous when the sum direction deviation D exceeds a preset direction deviation threshold range.
Since the actual traffic data is always allowed to fluctuate around the baseline data, relying only on the above directional deviation of 2 data points is not enough to judge whether the traffic is abnormal. In view of this, in order to determine the anomaly more accurately, the traffic anomaly determination module 150 needs to collect the directional deviations of a plurality of consecutive data points for comprehensive determination.
As an example, assume that the traffic anomaly determination module 150 determines the direction deviation from 10 traffic data points. The 10 data points form 9 segments, and the direction deviation d is calculated for each segment, denoted d1, d2, d3, ..., d9. With the sum direction deviation denoted D, D can be expressed as:
D = d1 + d2 + d3 + ... + d9
As can be seen, the larger the magnitude of D, the larger the direction deviation between the actual traffic data vectors and the baseline vectors. If the actual data fluctuate up and down around the baseline data, both positive and negative values of d appear and cancel each other out in the sum D. In fact, if the baseline values are relatively accurate, D approaches 0 when the sampling space is large enough. However, too many samples are not suitable for anomaly determination, because too much historical data impairs its timeliness. Theoretically, the number of samples cannot be greater than the number of data points in 1 service period; otherwise, the detection of direction deviation may be weakened because fluctuations within a period offset each other. For example, if the traffic period is 40 points, i.e., a cycle restarts after every 40 points, then the number of samples should be less than 40, and the specific number should be determined with reference to the waveform of the traffic data. Generally, about 10 samples are suitable, and the specific number can be adjusted according to the actual requirements of the service.
The traffic anomaly determination module 150 determines the number of samples according to the actual requirements of the service, and then determines the direction deviation threshold range according to the number of samples, so as to determine whether the sum direction deviation exceeds the direction deviation threshold range. If it does, a traffic anomaly is indicated.
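The summation and threshold comparison can be sketched as follows. The function names are hypothetical, and the default threshold range of (-6, 6) is taken from the worked example earlier in the description rather than prescribed by the text; in practice it would be chosen according to the number of samples.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sum_direction_deviation(base_values, real_values):
    """Return D, the sum of d = Real_d - Base_d over all adjacent point pairs."""
    if len(base_values) != len(real_values):
        raise ValueError("baseline and actual series must have the same length")
    total = 0
    for i in range(1, len(base_values)):
        base_d = sign(base_values[i] - base_values[i - 1])
        real_d = sign(real_values[i] - real_values[i - 1])
        total += real_d - base_d
    return total


def is_direction_anomaly(base_values, real_values, threshold_range=(-6, 6)):
    """Return True when D falls outside the direction deviation threshold range."""
    low, high = threshold_range
    total = sum_direction_deviation(base_values, real_values)
    return not (low <= total <= high)
```

With 10 points there are 9 segments, so D can range from -18 to 18. Actual data that merely oscillate around a flat baseline produce values of d that largely cancel, while a sustained opposite trend accumulates.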
The traffic anomaly detection apparatus based on directional deviation according to the present invention has been described above. The apparatus detects traffic anomalies effectively, mainly by comparing the change direction of the traffic baseline with that of the actual curve in the same period. Compared with traditional anomaly detection algorithms and their apparatuses, the apparatus introduces the concept of vector direction deviation on top of the baseline tolerance or threshold and calculates the direction deviation between the real-time data and the baseline data. By focusing on the fluctuation direction of the curve, it can detect in time when the service data deviate from the correct direction of the baseline, thereby identifying the change characteristics of the service data more accurately and finding some more hidden service anomalies. Moreover, the apparatus is simple to compute, consumes little CPU, and is easy to implement in engineering.
In addition, embodiments of the present invention further provide a computer system to which the method and apparatus for detecting a business anomaly based on directional deviation are applied. Fig. 5 is a schematic structural diagram of a computer system suitable for implementing the apparatus for detecting a business anomaly based on directional deviation according to an embodiment of the present invention. The computer system shown in fig. 5 is merely an example and should not impose any limitation on the functionality or scope of use of embodiments of the present invention.
As shown in fig. 5, the computer system 300 includes a Central Processing Unit (CPU) 301 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage section 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data necessary for the operation of the system 300 are also stored. The CPU 301, ROM 302, and RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
The following components are connected to the I/O interface 305: an input section 306 including a keyboard, a mouse, and the like; an output section 307 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 308 including a hard disk and the like; and a communication section 309 including a network interface card such as a LAN card, a modem, or the like. The communication section 309 performs communication processing via a network such as the internet. A drive 310 is also connected to the I/O interface 305 as needed. A removable medium 311 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 310 as necessary, so that a computer program read out therefrom is installed into the storage section 308 as necessary.
In particular, the steps described above with reference to the flow diagrams may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in FIG. 1. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 309, and/or installed from the removable medium 311. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 301.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods, apparatus, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules and their units may also be provided in a processor, which may be described as: a processor comprises a service baseline generation module, an actual service data statistics module, a tolerance judgment module, a direction deviation calculation module and a service abnormity judgment module. The names of these modules do not in some cases constitute a limitation to the module and its units themselves, and for example, the directional deviation calculation module may also be described as a "directional deviation acquisition module".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform the following steps:
S1: Business baseline generation step
Based on historical traffic data for a particular traffic feature, a traffic baseline (Base _ t, Base _ v) is generated for the traffic data over time.
Specifically, the determined service characteristics are used as detection objects, and a service baseline model is established according to service cycle learning based on historical service data of the service characteristics so as to form a time-related service baseline curve. The traffic baseline typically represents a 2-dimensional vector, each baseline value may be represented by (Base _ t, Base _ v), Base _ t representing time, and Base _ v representing baseline traffic data. An actual traffic baseline may be graphically represented as a curve, where the times Base _ t are typically an arithmetic series, i.e., the time intervals are the same.
The service characteristics can be network flow Kbps, packet rate pps, CPU, memory, temperature, rotating speed, voltage, air pressure and the like.
S2: Actual business data statistics step
A change curve (Real_t, Real_v) of the actual service data of the specific service characteristic over time is counted.
Specifically, the change of the actual service data with time is counted in real time, and each actual service data point can be represented by (Real_t, Real_v), where Real_t represents time and Real_v represents the actual service data. The change of the actual traffic data with time can be graphically represented as a curve, where the times Real_t generally form an arithmetic progression, i.e., the time intervals are the same. Preferably, the time interval of Base_t is the same as that of Real_t, and the period (unit time length) of the baseline data is the same as that of the actual traffic data.
S3: Tolerance determination step
According to the tolerance algorithm, whether the actual service data Real _ v at the time point Real _ t conforms to the tolerance range of the service baseline (Base _ t, Base _ v) is judged.
Specifically, if the actual service data Real _ v exceeds the tolerance range, it is determined that the actual service data is abnormal, and the abnormal result can be directly reported.
As an example, the tolerance algorithm on which the invention is based may employ, for example, an absolute value deviation or a proportional deviation.
S4: Direction deviation calculation step
Vector direction deviation detection is performed on the actual service data and the baseline data in the same period: a baseline vector Base_d of the baseline data at adjacent time points and a corresponding actual service data vector Real_d of the actual service data are calculated respectively, and the baseline vector Base_d is compared with the actual service data vector Real_d to obtain the direction deviation d.
The specific steps are as follows:
First, 2 corresponding data points in the actual traffic data and the baseline data are taken as an example for explanation. Assume that the 2 data points in the baseline data are (Base_t1, Base_v1) and (Base_t2, Base_v2), and the corresponding 2 data points in the actual traffic data are (Real_t1, Real_v1) and (Real_t2, Real_v2). The time difference between the two points is the same in both series, namely, Base_t2 - Base_t1 = Real_t2 - Real_t1.
Assuming that the baseline vector is Base _ d and the actual traffic data vector is Real _ d, the specific calculation method is expressed in the form of pseudo code as follows:
If(Base_v2-Base_v1>0)Then
Base_d=1
Else If(Base_v2-Base_v1==0)Then
Base_d=0
Else
Base_d=-1;
The same algorithm is used to calculate the vector Real_d of the real-time data:
If(Real_v2-Real_v1>0)Then
Real_d=1
Else If(Real_v2-Real_v1==0)Then
Real_d=0
Else
Real_d=-1;
That is, when Base_v2 - Base_v1 > 0, Base_d = 1; when Base_v2 - Base_v1 = 0, Base_d = 0; when Base_v2 - Base_v1 < 0, Base_d = -1.
When Real_v2 - Real_v1 > 0, Real_d = 1; when Real_v2 - Real_v1 = 0, Real_d = 0; when Real_v2 - Real_v1 < 0, Real_d = -1.
The values of the baseline vector Base_d and the actual traffic data vector Real_d are thus obtained by the above calculation, and the direction deviation d is calculated from them as:
d = Real_d - Base_d.
Based on the values of the baseline vector Base _ d and the actual service data vector Real _ d, it can be seen that the value range of d is-2 ≤ d ≤ 2.
As can be seen from the above description of the calculation process:
when d is 0, the direction of the actual data is completely consistent with the direction of the baseline data;
when d is 2, the direction deviation between the actual data and the baseline data is positive and reaches its maximum, meaning that the baseline vector points downward but the actual data vector points upward;
when d is -2, the direction deviation between the actual data and the baseline data is negative and reaches its maximum magnitude, meaning that the baseline vector points upward but the actual data vector points downward.
Therefore, the direction deviation of the vector of 2 data points can be determined from the value of d.
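The pseudo code and the case analysis above can be restated compactly with the sign function. This is only an equivalent notation for the same calculation; the use of sgn is a notational choice not found in the original text, and the snippet assumes the amsmath package.

```latex
\[
\mathrm{Base\_d} = \operatorname{sgn}(\mathrm{Base\_v2} - \mathrm{Base\_v1}),
\qquad
\mathrm{Real\_d} = \operatorname{sgn}(\mathrm{Real\_v2} - \mathrm{Real\_v1}),
\]
\[
d = \mathrm{Real\_d} - \mathrm{Base\_d} \in \{-2, -1, 0, 1, 2\},
\qquad
\operatorname{sgn}(x) =
\begin{cases}
1, & x > 0 \\
0, & x = 0 \\
-1, & x < 0
\end{cases}
\]
```

The extreme values d = 2 and d = -2 occur only when the two vectors point in strictly opposite directions, and d = 0 occurs whenever the two directions coincide, including the case where both are flat.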
S5: service abnormality determination step
The direction deviations d of a plurality of consecutive data points are integrated to determine whether a service anomaly occurs. Specifically, a sum direction deviation D is calculated by adding up the direction deviations d of all adjacent data points in the plurality of data points, and when the sum direction deviation D exceeds a preset direction deviation threshold range, the actual traffic data is determined to be abnormal.
Since the actual traffic data is always allowed to fluctuate around the baseline data, relying only on the above directional deviation of 2 data points is not enough to judge whether the traffic is abnormal. In view of this, in order to more accurately determine an abnormality, it is necessary to integrate the directional deviations of a plurality of consecutive data points for comprehensive determination.
As an example, the direction deviation is determined from 10 service data points. The 10 data points form 9 segments, and the direction deviation d is calculated for each segment, denoted d1, d2, d3, ..., d9. With the sum direction deviation denoted D, D can be expressed as:
D = d1 + d2 + d3 + ... + d9
As can be seen, the larger the magnitude of D, the larger the direction deviation between the actual traffic data vectors and the baseline vectors. If the actual data fluctuate up and down around the baseline data, both positive and negative values of d appear and cancel each other out in the sum D. In fact, if the baseline values are relatively accurate, D approaches 0 when the sampling space is large enough. However, too many samples are not suitable for anomaly determination, because too much historical data impairs its timeliness. Theoretically, the number of samples cannot be greater than the number of data points in 1 service period; otherwise, the detection of direction deviation may be weakened because fluctuations within a period offset each other. For example, if the traffic period is 40 points, i.e., a cycle restarts after every 40 points, then the number of samples should be less than 40, and the specific number should be determined with reference to the waveform of the traffic data. Generally, about 10 samples are suitable, and the specific number can be adjusted according to the actual requirements of the service.
After the sampling number is determined according to the actual requirement of the service, the direction deviation threshold range is determined according to the sampling number, so that whether the total direction deviation exceeds the direction deviation threshold range or not can be judged, and if the total direction deviation exceeds the direction deviation threshold range, the service is abnormal.
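Steps S3 to S5 can be combined into a single pass over the data, as in the following sketch. It is an illustration under stated assumptions, not the implementation of the invention: the function name `detect`, the use of an absolute-value tolerance, the symmetric threshold, and all default values are hypothetical, and the two series are assumed to be aligned point by point with equal time intervals.

```python
from collections import deque


def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def detect(baseline, actual, tolerance=10.0, window=10, threshold=6):
    """Yield (index, reason) for every data point judged abnormal.

    baseline, actual: aligned sequences of Base_v and Real_v values
    tolerance:        absolute tolerance around the baseline (step S3)
    window:           number of consecutive data points summed in step S5
    threshold:        D is abnormal when outside [-threshold, threshold]
    """
    # A window of N points has N - 1 segments, hence N - 1 values of d.
    recent = deque(maxlen=window - 1)
    for i, (base_v, real_v) in enumerate(zip(baseline, actual)):
        if i > 0:
            base_d = sign(base_v - baseline[i - 1])
            real_d = sign(real_v - actual[i - 1])
            recent.append(real_d - base_d)  # step S4
        if abs(real_v - base_v) > tolerance:
            yield i, "tolerance"  # step S3: reported directly
        elif len(recent) == window - 1 and abs(sum(recent)) > threshold:
            yield i, "direction"  # step S5: sum direction deviation D
```

In this sketch the direction check only starts once a full window of points is available, so data that stay inside the tolerance band but trend against the baseline are flagged after `window` points at the earliest.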
Various embodiments of the present invention have been described in detail above. The method, apparatus and medium for detecting traffic anomalies based on directional deviation according to the present invention detect traffic anomalies effectively, mainly by comparing the change direction of the traffic baseline with that of the actual curve in the same period. Compared with traditional anomaly detection algorithms, they introduce the concept of vector direction deviation on top of the baseline tolerance or threshold and calculate the direction deviation between the real-time data and the baseline data. By focusing on the fluctuation direction of the curve, they can detect in time when the service data deviate from the correct direction of the baseline, thereby identifying the change characteristics of the service data more accurately and finding some more hidden service anomalies. Moreover, the method, apparatus and medium are simple to compute, consume little CPU, and are easy to implement in engineering.
The above description is only an example of the present application and is not intended to limit the present invention, and it is obvious to those skilled in the art that various modifications and variations can be made in the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.
Claims (15)
1. A method for detecting a service anomaly, characterized in that the method comprises the following steps:
a service baseline generation step of generating, based on historical service data of a specific service characteristic, a service baseline (Base_t, Base_v) of the service data changing with time, wherein Base_t represents time and Base_v represents baseline data;
an actual service data statistics step of counting a change curve (Real_t, Real_v) of the actual service data of the specific service characteristic over time, wherein Real_t represents time and Real_v represents the actual service data;
a tolerance determination step of determining, according to a tolerance algorithm, whether the actual service data Real_v at the time Real_t conforms to the tolerance range of the service baseline (Base_t, Base_v);
a direction deviation calculation step of calculating a baseline vector Base_d of the baseline data at adjacent data points and a corresponding actual traffic data vector Real_d of the actual traffic data, respectively, and comparing the baseline vector Base_d with the actual traffic data vector Real_d to obtain a direction deviation d; and
a service anomaly determination step of integrating the direction deviations d of all adjacent data points in a plurality of consecutive data points to determine whether a service anomaly occurs.
2. The method of claim 1, wherein,
the time interval of Base _ t and Real _ t is the same, and the period of the baseline data and the actual traffic data is the same.
3. The method of claim 1 or 2, wherein,
in the direction deviation calculating step, two adjacent baseline data points are defined as (Base_t1, Base_v1) and (Base_t2, Base_v2), wherein,
when Base_v2 - Base_v1 > 0, Base_d = 1;
when Base_v2 - Base_v1 = 0, Base_d = 0;
when Base_v2 - Base_v1 < 0, Base_d = -1; and wherein,
the two corresponding adjacent actual traffic data points are defined as (Real_t1, Real_v1) and (Real_t2, Real_v2), wherein,
when Real_v2 - Real_v1 > 0, Real_d = 1;
when Real_v2 - Real_v1 = 0, Real_d = 0;
when Real_v2 - Real_v1 < 0, Real_d = -1.
4. The method of claim 3, wherein,
the direction deviation d between the baseline vector Base_d of the baseline data and the actual traffic data vector Real_d is d = Real_d - Base_d, wherein,
when d = 0, the direction of the actual traffic data is consistent with that of the baseline data;
when d = 2 or d = -2, the direction deviation of the actual traffic data from the baseline data is maximum and the vector directions are opposite.
5. The method of claim 4, wherein,
in the traffic abnormality determination step, a sum directional deviation D is calculated by adding up the directional deviations d of all the adjacent data points of the plurality of data points, and
when the sum directional deviation D exceeds a preset direction deviation threshold range, the actual service data is determined to be abnormal.
6. The method of claim 5, wherein,
the direction deviation threshold range is determined according to the number of the data points.
7. The method according to any one of claims 1 to 6, wherein,
in the tolerance determining step, if the actual service data Real _ v exceeds the tolerance range, the service data is directly determined to be abnormal.
8. The method according to any one of claims 1 to 7, wherein,
in the service baseline generation step, the specific service characteristic is used as a detection object, and a service baseline model is established according to service cycle learning based on the historical service data of the specific service characteristic so as to form a curve of the service baseline (Base _ t, Base _ v) related to time.
9. An apparatus for detecting traffic anomalies, the apparatus comprising:
a service baseline generation module configured to generate, based on historical service data of a specific service characteristic, a service baseline (Base_t, Base_v) of the service data changing with time, wherein Base_t represents time and Base_v represents baseline data;
an actual service data statistics module configured to count a change curve (Real_t, Real_v) of the actual service data of the specific service characteristic over time, wherein Real_t represents time and Real_v represents the actual service data;
a tolerance determination module configured to determine, according to a tolerance algorithm, whether the actual business data Real_v at the time point Real_t conforms to the tolerance range of the business baseline (Base_t, Base_v);
a direction deviation calculation module configured to calculate a baseline vector Base_d of the baseline data at adjacent data points and a corresponding actual business data vector Real_d of the actual business data, respectively, and to compare the baseline vector Base_d with the actual business data vector Real_d to obtain a direction deviation d; and
a business anomaly determination module configured to integrate the direction deviations d of all adjacent data points in a plurality of consecutive data points to determine whether a business anomaly occurs.
10. The apparatus of claim 9, wherein,
Base_t and Real_t have the same time interval, and the baseline data and the actual service data have the same period.
11. The apparatus of claim 9 or 10, wherein,
the direction deviation calculation module sets two mutually adjacent baseline data points as (Base_t1, Base_v1) and (Base_t2, Base_v2), wherein,
when Base_v2 - Base_v1 > 0, Base_d = 1;
when Base_v2 - Base_v1 = 0, Base_d = 0;
when Base_v2 - Base_v1 < 0, Base_d = -1; and wherein,
the direction deviation calculation module sets the two corresponding adjacent actual service data points as (Real_t1, Real_v1) and (Real_t2, Real_v2), wherein,
when Real_v2 - Real_v1 > 0, Real_d = 1;
when Real_v2 - Real_v1 = 0, Real_d = 0;
when Real_v2 - Real_v1 < 0, Real_d = -1.
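The direction rule of claim 11 amounts to taking the sign of the difference between two adjacent values: 1 for a rise, 0 for no change, and -1 for a fall. A minimal Python sketch follows; the function names `direction` and `direction_vectors` are introduced here for illustration and do not appear in the claims.

```python
from typing import List, Sequence


def direction(v1: float, v2: float) -> int:
    """Sign of (v2 - v1): 1 if rising, 0 if flat, -1 if falling."""
    diff = v2 - v1
    return (diff > 0) - (diff < 0)


def direction_vectors(values: Sequence[float]) -> List[int]:
    """Direction for every pair of adjacent data points."""
    return [direction(a, b) for a, b in zip(values, values[1:])]
```

The same function serves for both Base_d and Real_d, since the rule is identical for baseline data and actual service data.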
12. The apparatus of claim 11, wherein,
the direction deviation d between the baseline vector Base_d of the baseline data and the actual service data vector Real_d is d = Real_d - Base_d, wherein,
when d = 0, the direction of the actual service data is consistent with the direction of the baseline data; and
when d = 2 or d = -2, the direction deviation of the actual service data from the baseline data is the largest, the two directions being opposite.
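The deviation of claim 12 can be illustrated with a few lines of Python, assuming the directions have already been reduced to the values 1, 0, and -1. The helper names `direction_deviation` and `deviations` are hypothetical.

```python
from typing import List, Sequence


def direction_deviation(base_d: int, real_d: int) -> int:
    """d = Real_d - Base_d; with directions in {-1, 0, 1}, d lies in [-2, 2]."""
    return real_d - base_d


def deviations(base_dirs: Sequence[int], real_dirs: Sequence[int]) -> List[int]:
    """Pairwise deviations for two equally long direction sequences."""
    if len(base_dirs) != len(real_dirs):
        raise ValueError("baseline and actual sequences must have equal length")
    return [direction_deviation(b, r) for b, r in zip(base_dirs, real_dirs)]
```

A deviation of 0 means both curves moved the same way over that interval, while 2 or -2 means one rose where the other fell.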
13. The apparatus of claim 12, wherein,
the traffic anomaly determination module calculates a sum direction deviation D by adding up the direction deviations d of all adjacent data points among the plurality of data points, and
when the sum direction deviation D exceeds a preset direction deviation threshold range, judges that the actual service data is abnormal.
14. A traffic abnormality detection apparatus comprising a storage unit storing a program and a processing unit, wherein,
the processing unit executes the program to implement the steps of the method according to any one of claims 1 to 8.
15. A computer-readable medium, wherein,
the medium has stored thereon a program which is executed to implement the steps in the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011384760.9A CN112380100A (en) | 2020-12-01 | 2020-12-01 | Method, apparatus and medium for detecting traffic anomaly based on directional deviation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011384760.9A CN112380100A (en) | 2020-12-01 | 2020-12-01 | Method, apparatus and medium for detecting traffic anomaly based on directional deviation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112380100A true CN112380100A (en) | 2021-02-19 |
Family
ID=74589190
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011384760.9A Pending CN112380100A (en) | 2020-12-01 | 2020-12-01 | Method, apparatus and medium for detecting traffic anomaly based on directional deviation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112380100A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1677934A (en) * | 2004-03-31 | 2005-10-05 | 华为技术有限公司 | Method and system for monitoring network service performance |
CN108965347A (en) * | 2018-10-10 | 2018-12-07 | 腾讯科技(深圳)有限公司 | A kind of detecting method of distributed denial of service attacking, device and server |
CN110287078A (en) * | 2019-04-12 | 2019-09-27 | 上海新炬网络技术有限公司 | Abnormality detection and alarm method based on zabbix performance baseline |
CN110674891A (en) * | 2019-10-16 | 2020-01-10 | 北京天泽智云科技有限公司 | Data quality abnormity detection method for monitoring system |
CN111143102A (en) * | 2019-12-13 | 2020-05-12 | 东软集团股份有限公司 | Abnormal data detection method and device, storage medium and electronic equipment |
CN111199018A (en) * | 2019-12-27 | 2020-05-26 | 东软集团股份有限公司 | Abnormal data detection method and device, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10585774B2 (en) | Detection of misbehaving components for large scale distributed systems | |
KR102260417B1 (en) | Method and apparatus for detecting traffic | |
US9772896B2 (en) | Identifying intervals of unusual activity in information technology systems | |
CN110288624A (en) | Detection method, device and the relevant device of straightway in a kind of image | |
CN110967585B (en) | Malignant load identification method and device | |
CN114363212B (en) | Equipment detection method, device, equipment and storage medium | |
CN115794578A (en) | Data management method, device, equipment and medium for power system | |
CN116245865A (en) | Image quality detection method and device, electronic equipment and storage medium | |
CN113204467B (en) | Method, device, equipment and storage medium for monitoring online service system | |
CN113746862A (en) | Abnormal flow detection method, device and equipment based on machine learning | |
CN112380100A (en) | Method, apparatus and medium for detecting traffic anomaly based on directional deviation | |
US11956256B2 (en) | Priority determination apparatus, priority determination method, and computer readable medium | |
CN110334125A (en) | A kind of power distribution network measurement anomalous data identification method and device | |
CN112507957B (en) | Vehicle association method and device, road side equipment and cloud control platform | |
CN110120893B (en) | Method and device for positioning network system security problem | |
CN114581711A (en) | Target object detection method, apparatus, device, storage medium, and program product | |
CN115174426B (en) | Output message detection method and device, electronic equipment and storage medium | |
CN115290798B (en) | Stability performance monitoring method and terminal of transformer oil chromatographic online monitoring device | |
CN117648202B (en) | Heterogeneous system data synchronization process endless loop detection method, system and medium | |
CN113102385B (en) | Method, device and system for removing metal particles in GIS and storage medium | |
CN114724370B (en) | Traffic data processing method, device, electronic equipment and medium | |
CN115378746A (en) | Network intrusion detection rule generation method, device, equipment and storage medium | |
CN109842586B (en) | Abnormal network flow detection method, device and storage medium | |
CN118075172A (en) | Network quality anomaly detection method, device and readable storage medium | |
CN117743767A (en) | Time sequence segmentation method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||