CN107257351B - Flow anomaly detection system based on gray LOF and detection method thereof - Google Patents

Flow anomaly detection system based on gray LOF and detection method thereof

Info

Publication number
CN107257351B
Authority
CN
China
Prior art keywords
data
gray
flow
dimension
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710631334.2A
Other languages
Chinese (zh)
Other versions
CN107257351A (en)
Inventor
张众发
陈炽光
王冬生
杨福国
刘东东
赖群
焦力
王广
黄祖迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunfu Power Supply Bureau of Guangdong Power Grid Co Ltd
Original Assignee
Yunfu Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunfu Power Supply Bureau of Guangdong Power Grid Co Ltd filed Critical Yunfu Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority to CN201710631334.2A priority Critical patent/CN107257351B/en
Publication of CN107257351A publication Critical patent/CN107257351A/en
Application granted granted Critical
Publication of CN107257351B publication Critical patent/CN107257351B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Abstract

The invention relates to the technical field of flow anomaly detection, and in particular to a gray LOF-based flow anomaly detection system and a detection method thereof. An information acquisition module acquires the original data traffic packets, preprocesses the data using a data cleaning technique, and extracts and summarizes the highly correlated fields of each traffic packet as the detection data source. A gray distinguishing module uses gray theory to analyze and pre-judge the data provided by the information acquisition module, which greatly reduces the scale of the data to be computed, reduces the time complexity of the LOF algorithm, and improves timeliness. An LOF analysis module then calculates the degree of abnormality of the traffic packets. Detection is density-based: the degree of separation between each traffic packet and its neighboring packets is calculated, so the specific abnormal states of the traffic do not need to be preset, which makes the method more flexible than traditional methods.

Description

Flow anomaly detection system based on gray LOF and detection method thereof
Technical Field
The invention relates to the technical field of flow anomaly detection, and in particular to a flow anomaly detection system based on gray LOF and a detection method thereof.
Background
With the construction of the smart grid, the data network and the service systems it carries have developed rapidly, and a large amount of network traffic is generated every day. Abnormal traffic mixed into normal traffic does great damage to the network: it rapidly degrades the network's quality of service and, in severe cases, can even paralyze the network. Detecting abnormal traffic is therefore an important part of data network operation and maintenance.
Several abnormal traffic detection methods currently exist. One is simple and easy to understand and has high accuracy, but on the one hand it is difficult to construct the network traffic matrix and solve the high-dimensional covariance matrix, and on the other hand the time complexity of the algorithm is $O(n^3)$, so the time cost is too high. In addition, a network traffic time-series analysis model based on ODSP has been proposed, together with a multi-view collaborative visual analysis prototype system; it can monitor network conditions comprehensively, but anomaly detection is still mostly done manually. A network traffic analysis method based on entropy theory has also been proposed, which improves the entropy theory by exploiting the long-range correlation of information units in the traffic space; however, it has difficulty handling the large differences in traffic distribution between time periods, cannot easily guarantee a high detection rate and a low misjudgment rate at the same time, and lacks adaptivity. Some researchers have proposed methods based on signal analysis, which detect anomalies by analyzing signal characteristics such as the frequency spectrum and the energy spectral density; but because the characteristics of abnormal traffic are complex and variable, these methods have high missed-detection and false-detection rates.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a gray LOF-based flow anomaly detection system and a detection method thereof that require no labels, adapt well, have good timeliness, and can meet the requirements of diversified data network service traffic types and real-time anomaly detection.
To solve the above technical problems, the invention adopts the following technical scheme:
The gray LOF-based flow anomaly detection system comprises an information acquisition module, a gray distinguishing module, an LOF analysis module and an output module. The information acquisition module acquires and preprocesses the original data and transmits the data to the gray distinguishing module; the gray distinguishing module analyzes and pre-judges the data to obtain the gray area that needs to be calculated and transmits the gray area to the LOF analysis module; the LOF analysis module analyzes the objects in the gray area and transmits the analysis result to the output module; and the output module outputs the analysis result to a terminal.
In this system, the information acquisition module acquires the original data traffic packets, each of which generally contains 25 fields. The data is preprocessed using a data cleaning technique, and the highly correlated fields of each traffic packet are extracted and summarized as the detection data source. The gray distinguishing module uses gray theory to analyze and predict the data provided by the information acquisition module and compares the prediction with the actual data: traffic whose deviation lies within a certain range is classified as normal traffic, and traffic whose deviation exceeds the range is judged to be gray traffic. All gray traffic forms the gray area, which becomes the region analyzed by the LOF analysis module; this reduces the time complexity of the LOF module and improves timeliness. The LOF analysis module calculates the degree of abnormality of the traffic packets: a point whose anomaly factor is close to 1 is classified as a normal point, and a point whose anomaly factor deviates from 1 is classified as an abnormal point. The detected abnormal points are passed to the output module, which outputs the abnormal traffic to the required terminal.
The invention also provides a detection method based on the gray LOF flow anomaly detection system, which comprises the following steps:
S1. acquiring the original data traffic packets through traffic acquisition equipment deployed in bypass mode on a data network node, preprocessing the data using a data cleaning technique, extracting and summarizing the highly correlated fields of each traffic packet, and selecting the four fields PacketsIn, PacketsOut, BytesIn and BytesOut as the detection data source;
S2. after step S1, evaluating the prediction result using dimension normalization; letting the original data sequence be $x^{(0)} = \{x^{(0)}(1), x^{(0)}(2), \dots, x^{(0)}(n)\}$, where n is the gray prediction number, and building a GM(1,1) model from the sequence $x^{(0)}$ to perform prediction; comparing the predicted result with the actual data, and judging traffic whose deviation exceeds the comparison threshold to be gray traffic;
S3. after step S2, abstracting each data traffic packet as an object p; calculating, by the LOF analysis module and according to the KTLAD algorithm, the local reachability density and the local outlier factor LOF(p) of object p; judging points whose factor LOF(p) is close to 1 to be normal points, and obtaining the abnormal points by comparing the factor against a threshold;
S4. after step S3, outputting the abnormal points to the terminal.
Preferably, in step S2 the GM(1,1) model is built from the data sequence $x^{(0)}$ and prediction is performed according to the following steps:
a. accumulating the original data to weaken the volatility and randomness of the random sequence and obtain a new data sequence $x^{(1)}$:
$x^{(1)} = \{x^{(1)}(1), x^{(1)}(2), \dots, x^{(1)}(n)\}$ (1)
where each $x^{(1)}(k)$ is the cumulative sum of the first k items of the original data:
Figure GDA0002516448260000031
b. establishing a first-order linear differential equation for $x^{(1)}(k)$ according to formula (2), namely the GM(1,1) model:
Figure GDA0002516448260000032
where a and b are coefficients to be determined, called the development coefficient and the gray action quantity respectively; the valid interval of a is (-2, 2), and the matrix formed by a and b is denoted as the gray parameter
Figure GDA0002516448260000033
Once the parameters a and b are determined, $x^{(1)}(k)$ can be obtained, and from it the future predicted values of $x^{(0)}$.
c. averaging the accumulated data to generate the matrix B, and forming the constant-term vector $Y_n$, according to formulas (3) and (4):
Figure GDA0002516448260000034
$Y_n = [x^{(0)}(2), x^{(0)}(3), \dots, x^{(0)}(n)]^{\mathsf{T}}$ (4)
d. solving for the gray parameter
Figure GDA0002516448260000035
by the least squares method according to formula (5):
Figure GDA0002516448260000036
e. substituting the gray parameter
Figure GDA0002516448260000037
into formula (2) and solving for $x^{(1)}(k)$ to obtain formula (6):
Figure GDA0002516448260000038
f. calculating the predicted value
Figure GDA0002516448260000039
of the data sequence $x^{(1)}$ according to formula (7):
Figure GDA00025164482600000310
g. calculating the predicted value
Figure GDA00025164482600000311
of $x^{(0)}$ according to formula (8):
Figure GDA0002516448260000041
h. calculating the gray contrast value gc according to formula (9):
Figure GDA0002516448260000042
where $X_i$ is the predicted value of the i-th field, $Y_i$ is the actual value of the i-th field, and $k_i$ is the weight of the i-th dimension, used in combination with the distance calculation in the next module; $X_{i,0.9}$ is the value at the 0.9 position when the values in dimension i are sorted from large to small, and $X_{i,0.1}$ is the value at the 0.1 position under the same ordering.
Because traffic is self-similar, the trend of the data can be predicted from known data. The existing traffic data is used as the original sequence and the predicted result is compared with the actual data: if the deviation is within a certain range the traffic is considered normal, and if it exceeds the range the traffic is judged to be gray. All gray traffic forms the gray area, which is the region analyzed by the next module. Gray traffic is not necessarily abnormal traffic: it consists of most of the abnormal traffic plus part of the normal traffic. Some abnormal traffic may fail to be judged as gray traffic, which lowers the detection rate of the process; by adjusting the gray contrast value, the detection rate of abnormal traffic can be brought close to 100%.
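The prediction part of steps a to g can be sketched in a few lines of Python. This is a minimal illustration that assumes the textbook GM(1,1) formulation (cumulative sum, averages of adjacent accumulated values, least-squares fit, exponential time response); the function names `gm11_fit` and `gm11_predict` are hypothetical, and the gray contrast value gc of formula (9) is not reproduced because its exact form is given only in the figure.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, b) from the original sequence x0."""
    n = len(x0)
    if n < 3:
        raise ValueError("need at least 3 points for a two-parameter fit")
    # Step a: accumulated sequence x1(k) = x0(1) + ... + x0(k).
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Step c: averages of adjacent accumulated values and the vector Y_n.
    z = [0.5 * (x1[k - 1] + x1[k]) for k in range(1, n)]
    y = list(x0[1:])
    # Step d: least squares for y = -a*z + b, solved via the 2x2 normal equations.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    det = m * szz - sz * sz
    if det == 0:
        raise ValueError("degenerate sequence")
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b


def gm11_predict(x0, steps=1):
    """Predict the next `steps` values of x0 (steps e to g)."""
    a, b = gm11_fit(x0)
    n = len(x0)

    def x1_hat(k):
        # Time response of the accumulated sequence; k = 0 corresponds to x1(1).
        if abs(a) < 1e-12:
            return x0[0] + b * k
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Step g: difference the accumulated predictions to return to x0.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Usage: a synthetic sequence growing by 10% per step (illustrative data only).
history = [100.0 * 1.1 ** i for i in range(8)]
forecast = gm11_predict(history, steps=1)[0]
print(round(forecast, 2))
```

The predicted value would then be compared with the observed value of the same field, and the packet marked gray when the deviation exceeds the chosen threshold.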
Preferably, the KTLAD algorithm in step S3 specifically includes the following steps:
a. calculating the variance of each dimension and finding the dimension d with the largest variance; sorting the points in ascending order along dimension d, taking the median point as the split point, assigning points smaller than the median to the left child and points larger than the median to the right child; and in this way building a k-d tree whose nodes are data traffic packets;
b. after step a, abstracting each data traffic packet as an object p, applying normalization, and applying weighting according to the importance of the different dimensions to obtain the distance d(p, q), which is calculated according to formula (10):
Figure GDA0002516448260000043
where $k_i$ is the weight of the i-th dimension, $X_{i,\alpha}$ is the value at position α when the values in dimension i are sorted from large to small, $X_{i,\beta}$ is the value at position β under the same ordering, and α and β satisfy $0.6826 \le \beta - \alpha \le 0.9544$;
c. after step b, querying the nearest neighbors using the k-d tree; finding the k-th nearest neighbor yields the k-distance;
d. after step c, computing the k-distance neighborhood according to formula (11):
$N_{k\text{-dis}}(p) = \{q \mid d(p, q) \le k\text{-dis}(p)\}$ (11)
e. after step d, given a natural number k, calculating the reachability distance $r\text{-dis}_k$ of object p relative to object o according to formula (12):
$r\text{-dis}_k(p, o) = \max\{k\text{-dis}(o),\, d(p, o)\}$ (12)
f. after step e, calculating the local reachability density $lrd_{k\text{-dis}}(p)$ of object p according to formula (13), and the local outlier factor LOF(p) of object p according to formula (14):
Figure GDA0002516448260000051
Figure GDA0002516448260000052
Using the k-d tree, nearest neighbors can be queried easily. When querying the k-th nearest neighbor, an array can be used to record whether a point can still be used to update the nearest distance; once the k-th nearest neighbor has been found, the k-distance is obtained. The k-distance neighborhood is then calculated by formula (11); it contains all objects whose distance from p does not exceed k-dis(p). The LOF values of all points are calculated, and abnormal points are obtained by comparing the factor against a threshold. A point whose factor is close to 1 has a density consistent with that of the surrounding points and can be judged normal; the larger the factor, the greater the difference from the density of the surrounding points and the higher the probability that the point is abnormal. Because detection is density-based and calculates the degree of separation between each traffic packet and its neighboring packets, the specific abnormal states of the traffic do not need to be preset, which makes the method more flexible than traditional methods.
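Steps c to f can be illustrated with a small brute-force sketch. It assumes the standard definitions of local reachability density and local outlier factor, uses plain Euclidean distance in place of the weighted distance of formula (10), and scans all pairs instead of querying a k-d tree; the names `euclidean` and `lof_scores` are hypothetical, and distinct points are assumed.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def lof_scores(points, k, dist=euclidean):
    """Return the local outlier factor of every point (brute force, O(n^2))."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

    k_dis, neigh = [], []
    for i in range(n):
        others = sorted((d[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]                       # k-distance of point i
        k_dis.append(kd)
        # Formula (11): every q with d(p, q) <= k-dis(p).
        neigh.append([j for dij, j in others if dij <= kd])

    def lrd(i):
        # Formula (12): reachability distance of i relative to each neighbor o.
        reach = sum(max(k_dis[o], d[i][o]) for o in neigh[i])
        return len(neigh[i]) / reach if reach > 0 else math.inf

    lrds = [lrd(i) for i in range(n)]
    # LOF(p): average ratio of the neighbors' density to p's own density.
    return [sum(lrds[o] / lrds[i] for o in neigh[i]) / len(neigh[i])
            for i in range(n)]


# Usage: a 3x3 grid of points plus one isolated point (illustrative data only).
cluster = [(float(x), float(y)) for x in range(3) for y in range(3)]
scores = lof_scores(cluster + [(10.0, 10.0)], k=3)
print(max(range(len(scores)), key=scores.__getitem__))  # index of the most isolated point
```

Points inside the grid receive factors near 1, while the isolated point receives a much larger factor, which is the behavior the threshold comparison relies on.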
Compared with the prior art, the invention has the following beneficial effects:
The gray LOF-based flow anomaly detection system comprises an information acquisition module, a gray distinguishing module, an LOF analysis module and an output module. The information acquisition module acquires the original data traffic packets, preprocesses the data using a data cleaning technique, and extracts and summarizes the highly correlated fields of each traffic packet as the detection data source. The gray distinguishing module uses gray theory to analyze and pre-judge the data provided by the information acquisition module and finds the range that needs to be calculated, called the gray area. Only the objects in the gray area then need to be calculated, which greatly reduces the scale of the data to be computed and the time cost. The LOF analysis module calculates the degree of abnormality of the traffic packets and can find abnormal points efficiently in an unsupervised setting. The system is simple, requires little computation, and has good timeliness.
The gray LOF-based flow anomaly detection method uses a data cleaning technique to extract and summarize the highly correlated fields of each traffic packet, uses gray theory to predict the trend of the data from known data and determine the gray area, and thereby greatly reduces the scale of the data to be computed. Detection is density-based and calculates the degree of separation between each traffic packet and its neighboring packets, so the specific abnormal states of the traffic do not need to be preset. The method is therefore more flexible than traditional methods, is highly feasible, and effectively reduces the time cost.
Drawings
FIG. 1 is a block diagram of the algorithm of the gray LOF-based flow anomaly detection system according to the first embodiment;
FIG. 2 is the gc threshold correspondence table;
FIG. 3 is a graph of the gray detection rate and the gray compression ratio for different LOF values;
FIG. 4 is the LOF threshold adjustment table;
FIG. 5 is a graph comparing the accuracy and the detection rate of the detection method of the present invention with those of the classical density algorithms DBSCAN and RIDBSCAN and the hierarchical-clustering-based CURE algorithm;
FIG. 6 is a graph comparing the time consumption of the detection method of the present invention with that of the classical density algorithms DBSCAN and RIDBSCAN and the hierarchical-clustering-based CURE algorithm.
Detailed Description
The present invention is further described below with reference to the following embodiments. The drawings are for illustration only; they are schematic rather than physical representations and are not to be construed as limiting this patent. To better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product. Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by the terms "upper", "lower", "left", "right", etc. based on the orientation or positional relationship shown in the drawings, it is only for convenience of describing the present invention and simplifying the description, but it is not intended to indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes and are not to be construed as limiting the present patent, and the specific meaning of the terms may be understood by those skilled in the art according to specific circumstances.
Example 1
FIG. 1 shows a first embodiment of the gray LOF-based flow anomaly detection system according to the present invention, which includes an information acquisition module, a gray distinguishing module, an LOF analysis module and an output module. The information acquisition module acquires and preprocesses the raw data and transmits the data to the gray distinguishing module; the gray distinguishing module analyzes and pre-judges the data to obtain the gray area that needs to be calculated and transmits the gray area to the LOF analysis module; the LOF analysis module analyzes the objects in the gray area and transmits the analysis result to the output module; and the output module outputs the analysis result to the desired target terminal.
The invention also provides a gray LOF-based flow anomaly detection method, which comprises the following steps:
S1. acquiring the original data traffic packets through traffic acquisition equipment deployed in bypass mode on a data network node, preprocessing the data on the basis of gray theory, extracting and summarizing the highly correlated fields of each traffic packet, and selecting the four fields PacketsIn, PacketsOut, BytesIn and BytesOut as the detection data source;
S2. after step S1, evaluating the prediction result using dimension normalization; letting the original data sequence be $x^{(0)} = \{x^{(0)}(1), x^{(0)}(2), \dots, x^{(0)}(n)\}$, where n is the gray prediction number, and building a GM(1,1) model from the sequence $x^{(0)}$ to perform prediction; comparing the predicted result with the actual data, and judging traffic whose deviation exceeds the comparison threshold to be gray traffic;
S3. after step S2, abstracting each data traffic packet as an object p; calculating, by the LOF analysis module and according to the KTLAD algorithm, the local reachability density and the local outlier factor LOF(p) of object p; judging points whose factor LOF(p) is close to 1 to be normal points, and obtaining the abnormal points by comparing the factor against a threshold;
S4. after step S3, outputting the abnormal points to the terminal.
Specifically, in step S2 the GM(1,1) model is built from the data sequence $x^{(0)}$ and prediction is performed according to the following steps:
a. accumulating the original data to weaken the volatility and randomness of the random sequence and obtain a new data sequence $x^{(1)}$:
$x^{(1)} = \{x^{(1)}(1), x^{(1)}(2), \dots, x^{(1)}(n)\}$ (1)
where each $x^{(1)}(k)$ is the cumulative sum of the first k items of the original data:
Figure GDA0002516448260000071
b. establishing a first-order linear differential equation for $x^{(1)}(k)$ according to formula (2), namely the GM(1,1) model:
Figure GDA0002516448260000072
where a and b are coefficients to be determined, called the development coefficient and the gray action quantity respectively; the valid interval of a is (-2, 2), and the matrix formed by a and b is denoted as the gray parameter
Figure GDA0002516448260000081
c. averaging the accumulated data to generate the matrix B, and forming the constant-term vector $Y_n$, according to formulas (3) and (4):
Figure GDA0002516448260000082
$Y_n = [x^{(0)}(2), x^{(0)}(3), \dots, x^{(0)}(n)]^{\mathsf{T}}$ (4)
d. solving for the gray parameter
Figure GDA0002516448260000083
by the least squares method according to formula (5):
Figure GDA0002516448260000084
e. substituting the gray parameter
Figure GDA0002516448260000085
into formula (2) and solving for $x^{(1)}(k)$ to obtain formula (6):
Figure GDA0002516448260000086
f. calculating the predicted value
Figure GDA0002516448260000087
of the data sequence $x^{(1)}$ according to formula (7):
Figure GDA0002516448260000088
g. calculating the predicted value
Figure GDA0002516448260000089
of $x^{(0)}$ according to formula (8):
Figure GDA00025164482600000810
h. calculating the gray contrast value gc according to formula (9):
Figure GDA00025164482600000811
where $X_i$ is the predicted value of the i-th field, $Y_i$ is the actual value of the i-th field, and $k_i$ is the weight of the i-th dimension, used in combination with the distance calculation in the next module; $X_{i,0.9}$ is the value at the 0.9 position when the values in dimension i are sorted from large to small, and $X_{i,0.1}$ is the value at the 0.1 position under the same ordering.
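The formulas behind the figure references in steps a to g are not reproduced in this text. For orientation, the GM(1,1) model is commonly written as below. This is the textbook form, and it is an assumption, not something the text confirms, that formulas (1) to (8) in the figures match it term by term; formula (9) for gc is omitted because its form cannot be recovered from the surrounding description.

```latex
\begin{align*}
x^{(1)}(k) &= \sum_{i=1}^{k} x^{(0)}(i), \qquad k = 1, 2, \dots, n \\
\frac{\mathrm{d}x^{(1)}}{\mathrm{d}t} + a\,x^{(1)} &= b \\
\hat{a} &= [a,\ b]^{\mathsf{T}} \\
B &= \begin{bmatrix}
  -\tfrac{1}{2}\bigl(x^{(1)}(1) + x^{(1)}(2)\bigr) & 1 \\
  -\tfrac{1}{2}\bigl(x^{(1)}(2) + x^{(1)}(3)\bigr) & 1 \\
  \vdots & \vdots \\
  -\tfrac{1}{2}\bigl(x^{(1)}(n-1) + x^{(1)}(n)\bigr) & 1
\end{bmatrix} \\
\hat{a} &= (B^{\mathsf{T}} B)^{-1} B^{\mathsf{T}} Y_n \\
\hat{x}^{(1)}(k+1) &= \Bigl(x^{(0)}(1) - \frac{b}{a}\Bigr) e^{-ak} + \frac{b}{a} \\
\hat{x}^{(0)}(k+1) &= \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k)
\end{align*}
```

With $Y_n$ as defined in formula (4), each row of the least-squares system reads $x^{(0)}(k) = -a \cdot \tfrac{1}{2}\bigl(x^{(1)}(k-1) + x^{(1)}(k)\bigr) + b$ for $k = 2, \dots, n$.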
In addition, the KTLAD algorithm in step S3 specifically includes the following steps:
a. calculating the variance of each dimension and finding the dimension d with the largest variance; sorting the points in ascending order along dimension d, taking the median point as the split point, assigning points smaller than the median to the left child and points larger than the median to the right child; and in this way building a k-d tree whose nodes are data traffic packets;
b. after step a, abstracting each data traffic packet as an object p, applying normalization, and applying weighting according to the importance of the different dimensions to obtain the distance d(p, q), which is calculated according to formula (10):
Figure GDA0002516448260000091
where $k_i$ is the weight of the i-th dimension, $X_{i,\alpha}$ is the value at position α when the values in dimension i are sorted from large to small, and $X_{i,\beta}$ is the value at position β under the same ordering; α and β satisfy $0.6826 \le \beta - \alpha \le 0.9544$, with $\alpha \in [0.0228, 0.1587]$ and $\beta \in [0.8413, 0.9772]$. In this embodiment, α = 0.1 and β = 0.9 are used as comparatively good reference values.
c. after step b, querying the nearest neighbors using the k-d tree; finding the k-th nearest neighbor yields the k-distance;
d. after step c, computing the k-distance neighborhood according to formula (11):
$N_{k\text{-dis}}(p) = \{q \mid d(p, q) \le k\text{-dis}(p)\}$ (11)
e. after step d, given a natural number k, calculating the reachability distance $r\text{-dis}_k$ of object p relative to object o according to formula (12):
$r\text{-dis}_k(p, o) = \max\{k\text{-dis}(o),\, d(p, o)\}$ (12)
f. after step e, calculating the local reachability density $lrd_{k\text{-dis}}(p)$ of object p according to formula (13), and the local outlier factor LOF(p) of object p according to formula (14):
Figure GDA0002516448260000092
Figure GDA0002516448260000093
Through the above steps, the data set is preprocessed on the basis of gray theory, which greatly reduces the scale of the data to be computed. Detection is density-based and calculates the degree of separation between each traffic packet and its neighboring packets, so the specific abnormal states of the traffic do not need to be preset, which makes the method more flexible than traditional methods.
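Steps a and c of the KTLAD algorithm (building the k-d tree on the dimension of largest variance, then querying nearest neighbors) can be sketched as follows. This is an illustrative sketch only: the names `Node`, `build_kdtree` and `k_nearest` are hypothetical, points are assumed to be tuples of floats, the split dimension is re-selected at every level, and plain Euclidean distance stands in for the weighted distance of formula (10).

```python
import heapq
import math


class Node:
    def __init__(self, point, axis, left, right):
        self.point, self.axis, self.left, self.right = point, axis, left, right


def build_kdtree(points):
    """Split on the dimension of largest variance, at the median point."""
    if not points:
        return None

    def variance(axis):
        vals = [p[axis] for p in points]
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals) / len(vals)

    axis = max(range(len(points[0])), key=variance)
    pts = sorted(points, key=lambda p: p[axis])   # ascending along the split axis
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build_kdtree(pts[:mid]), build_kdtree(pts[mid + 1:]))


def k_nearest(root, target, k):
    """Return the k nearest (distance, point) pairs to target, closest first."""
    heap = []  # max-heap on distance, stored as (-distance, point)

    def visit(node):
        if node is None:
            return
        dist = math.dist(node.point, target)
        if len(heap) < k:
            heapq.heappush(heap, (-dist, node.point))
        elif dist < -heap[0][0]:
            heapq.heapreplace(heap, (-dist, node.point))
        diff = target[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # The far side can only help if the splitting plane is closer than
        # the current k-th best distance (or the heap is not yet full).
        if len(heap) < k or abs(diff) < -heap[0][0]:
            visit(far)

    visit(root)
    return sorted((-neg, point) for neg, point in heap)


# Usage with illustrative two-dimensional points.
pts = [(float((i * 7) % 11), float((i * 5) % 13)) for i in range(30)]
root = build_kdtree(pts)
target = (3.2, 4.7)
nearest = k_nearest(root, target, 5)
print(nearest[-1][0])  # distance to the 5th nearest neighbor
```

When the query point is itself stored in the tree it is returned at distance 0, so the k-distance of a stored point is obtained by querying k + 1 neighbors and discarding the first.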
Example 2
A continuous segment of traffic packets was acquired and used for experimental simulation with the gray LOF-based flow anomaly detection system and detection method of the first embodiment:
First, for a given LOF threshold, simulation was used to determine the gc threshold corresponding to each gray prediction number at which the detection rate and timeliness are optimal; the gc threshold correspondence table is shown in FIG. 2.
Second, the gray detection rate and the gray compression ratio were tested for different LOF values, because the influence of the gray distinguishing module on the LOF analysis module is measured mainly by these two parameters. The gray detection rate is defined as the ratio of the number of abnormal flows in the gray traffic to the number of abnormal flows in the total traffic, and the gray compression ratio is defined as the ratio of the number of gray flows to the number of total flows. The corresponding graph for different LOF values is shown in FIG. 3. The experimental results show that the best effect is achieved when the gray prediction number is 50 and LOF is 4.15.
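The two ratios defined above can be computed directly from per-flow flags. The sketch below is illustrative: the function name `gray_metrics` and the boolean-list inputs are assumptions, and the sample flags are made up for demonstration rather than taken from the experiment.

```python
def gray_metrics(is_gray, is_abnormal):
    """Return (gray detection rate, gray compression ratio) for labeled flows."""
    total = len(is_gray)
    if total == 0 or total != len(is_abnormal):
        raise ValueError("inputs must be non-empty and of equal length")
    gray_count = sum(1 for g in is_gray if g)
    abnormal_total = sum(1 for a in is_abnormal if a)
    abnormal_in_gray = sum(1 for g, a in zip(is_gray, is_abnormal) if g and a)
    # Gray detection rate: abnormal flows inside the gray area / all abnormal flows.
    detection_rate = (abnormal_in_gray / abnormal_total
                      if abnormal_total else float("nan"))
    # Gray compression ratio: gray flows / all flows.
    compression_ratio = gray_count / total
    return detection_rate, compression_ratio


# Usage with made-up flags for ten flows.
gray_flags = [True, True, False, False, True, False, False, False, False, False]
abnormal_flags = [True, False, False, False, True, True, False, False, False, False]
rate, ratio = gray_metrics(gray_flags, abnormal_flags)
print(rate, ratio)
```

A high detection rate together with a low compression ratio means the gray distinguishing module passes most abnormal flows to the LOF analysis module while discarding most of the normal ones.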
Then, the threshold of the LOF algorithm was adjusted to observe the detection effect. Because the gray traffic produced by the gray distinguishing module is far smaller than the original traffic, the proportion of abnormal traffic to gray traffic is greatly reduced while the detection rate is maintained. The LOF threshold adjustment table is shown in FIG. 4.
Example 3
The accuracy and detection rate of the gray LOF-based flow anomaly detection system and detection method of the first embodiment were compared with those of the classical density algorithms DBSCAN and RIDBSCAN and the hierarchical-clustering-based CURE algorithm; the comparison is shown in FIG. 5. The time consumption of the system and method of the first embodiment was compared with that of the traditional LOF algorithm and the DBSCAN algorithm; the comparison is shown in FIG. 6, which shows, from left to right, the time consumption of gray LOF, traditional LOF and DBSCAN when MinPts takes the values 10, 15 and 20.
It should be understood that the above embodiments are merely examples given to illustrate the present invention clearly and are not intended to limit its implementation. Other variations and modifications will be apparent to those skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (2)

1. A detection method based on a gray LOF flow anomaly detection system, characterized by comprising the following steps:
S1. acquiring the original data traffic packets through traffic acquisition equipment deployed in bypass mode on a data network node, preprocessing the data on the basis of gray theory, extracting and summarizing the highly correlated fields of each traffic packet, and selecting the four fields PacketsIn, PacketsOut, BytesIn and BytesOut as the detection data source;
S2. after step S1, evaluating the prediction result using dimension normalization; letting the original data sequence be $x^{(0)} = \{x^{(0)}(1), x^{(0)}(2), \dots, x^{(0)}(n)\}$, where n is the gray prediction number, and building a GM(1,1) model from the sequence $x^{(0)}$ to perform prediction; comparing the predicted result with the actual data, and judging traffic whose deviation exceeds the comparison threshold to be gray traffic;
S3. after step S2, abstracting each data traffic packet in the gray traffic as an object p; calculating, by the LOF analysis module and according to the KTLAD algorithm, the local reachability density and the local outlier factor LOF(p) of object p; judging points whose factor LOF(p) is close to 1 to be normal points, and obtaining the abnormal points by comparing the factor against a threshold;
S4. after step S3, outputting the abnormal points to a terminal;
in step S2, according to x(0)Establishing a GM (1,1) model by the data column to realize prediction according to the following steps:
a. accumulating the original data, weakening the volatility and randomness of the random sequence to obtain a new data sequence x(1)
x(1)={x(1)(1),x(1)(2),...,x(1)(k),...,x(1)(n)} (1)
Wherein x is(1)(k) Each data represents the accumulation of the corresponding first several items of data:
Figure FDA0002523763940000011
b. for x according to formula (2)(1)(k) Establishing a first-order linear differential equation, namely a GM (1,1) model:
Figure FDA0002523763940000012
wherein a and b are undetermined coefficients respectively calledCoefficient of development and amount of gray effect; the effective interval of a is (-2,2), and the matrix formed by a and b is recorded as a gray parameter
Figure FDA0002523763940000013
c. Averaging the accumulated data to generate B and constant term vector Y according to formula (3) and formula (4)n:
Figure FDA0002523763940000021
$Y_n=[x^{(0)}(2),x^{(0)}(3),\ldots,x^{(0)}(n)]^T$ (4)
d. solving for the gray parameter by the least squares method according to formula (5):
Figure FDA0002523763940000022
Figure FDA0002523763940000023
e. substituting the gray parameter
Figure FDA0002523763940000024
into formula (2) and solving for $x^{(1)}(k)$ according to formula (6), to obtain:
Figure FDA0002523763940000025
f. calculating, according to formula (7), the predicted value of the data sequence $x^{(1)}$:
Figure FDA0002523763940000026
Figure FDA0002523763940000027
g. calculating, according to formula (8), the predicted value of $x^{(0)}$:
Figure FDA0002523763940000028
Figure FDA0002523763940000029
h. calculating the gray contrast value gc according to formula (9):
Figure FDA00025237639400000210
where $X_i$ is the predicted value of the i-th field, $Y_i$ is the actual value of the i-th field, and $k_i$ is the weight of the i-th dimension, used together with the distance calculation in the next module; the two X terms are, respectively, the value at the 0.9 position and the value at the 0.1 position when the values in dimension i are sorted from large to small;
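Steps a to g can be sketched in plain Python. Formulas (2) to (8) are not reproduced in the text above, so this sketch assumes the textbook GM(1,1) forms: the whitening equation dx1/dt + a*x1 = b, a least-squares fit of x0(k) = -a*z(k) + b, and the exponential time response. The function names `gm11_fit`, `gm11_predict` and `relative_deviation` are hypothetical, and the relative deviation is only a stand-in for the gray contrast value gc of formula (9), whose exact form is not given here.

```python
import math


def gm11_fit(x0):
    """Fit a textbook GM(1,1) model to the sequence x0 and return (a, b)."""
    n = len(x0)
    if n < 3:
        raise ValueError("two unknowns (a, b) need at least three data points")
    # Step a: accumulate the original data (1-AGO).
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Step c: mean of adjacent accumulated values, and Y_n = x0(2..n).
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(x0[1:])
    # Step d: closed-form least squares for y = slope * z + b, slope = -a.
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    det = m * szz - sz * sz
    if det == 0:
        raise ValueError("degenerate data: cannot solve for a and b")
    slope = (m * szy - sz * sy) / det
    b = (sy - slope * sz) / m
    return -slope, b


def gm11_predict(x0, a, b, steps=1):
    """Return fitted values for x0 followed by `steps` forecasts."""
    if abs(a) < 1e-12:
        raise ValueError("a is (numerically) zero; time response undefined")
    n = len(x0)
    c = x0[0] - b / a
    # Steps e and f: time response of the accumulated sequence.
    x1_hat = [c * math.exp(-a * k) + b / a for k in range(n + steps)]
    # Step g: difference back to the original scale.
    return [x0[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n + steps)]


def relative_deviation(actual, predicted):
    """Hypothetical stand-in for gc: relative gap between actual and predicted."""
    return abs(actual - predicted) / max(abs(actual), 1e-12)
```

For a series that grows by 10% per step, such as `[2 * 1.1 ** k for k in range(6)]`, the fit gives a of about -0.0952, and the one-step forecast lands within about 0.2% of the true next value. A record would then be routed to the gray area when its deviation exceeds the chosen comparison threshold.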
the KTLAD algorithm in step S3 specifically comprises the following steps:
a. calculating the variance of each dimension and finding the dimension d with the largest variance; sorting the points in ascending order along dimension d, taking the median point as the splitting point, assigning points smaller than the median to the left child and points larger than the median to the right child; and thereby building a k-d tree whose nodes are data traffic packets;
b. after step a, abstracting a data traffic packet as an object p, applying normalization, and applying weights according to the importance of the different dimensions, to obtain the distance d(p, q), which is calculated according to formula (10):
Figure FDA0002523763940000031
where $k_i$ is the weight of the i-th dimension, and the two X terms are, respectively, the value at the 0.1 position and the value at the 0.9 position when the values in dimension i are sorted from large to small;
c. after step b, querying nearest neighbors using the k-d tree built in step a, and finding the k-th nearest neighbor, thereby obtaining the k-distance;
d. after step c, calculating the k-distance neighborhood according to formula (11):
$N_{k\text{-}dis}(p)=\{q \mid d(p,q)\le k\text{-}dis(p)\}$ (11)
e. after step d, given a natural number k, calculating the reachability distance $r\text{-}dis_k$ of object p relative to object o according to formula (12):
$r\text{-}dis_k(p,o)=\max\{k\text{-}dis(o),\,d(p,o)\}$ (12)
f. after step e, calculating the local reachability density $lrd_{k\text{-}dis}$ of object p according to formula (13), and calculating the local outlier factor LOF(p) of object p according to formula (14):
Figure FDA0002523763940000032
Figure FDA0002523763940000033
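Steps a to f can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it uses plain Euclidean distance in place of the weighted, normalized distance of formula (10); it takes exactly k neighbors, so ties at the k-distance are not expanded as formula (11) would allow; it assumes the standard LOF definitions for formulas (13) and (14); and it assumes more than k distinct points. The names `build_kdtree`, `knn` and `lof_scores` are hypothetical.

```python
import heapq
import math


def build_kdtree(points, idx=None):
    """Step a: split on the largest-variance dimension at the median point.

    Returns nested tuples (point_index, split_dim, left, right) or None.
    """
    if idx is None:
        idx = list(range(len(points)))
    if not idx:
        return None

    def spread(d):
        vals = [points[i][d] for i in idx]
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals)

    d = max(range(len(points[0])), key=spread)
    idx = sorted(idx, key=lambda i: points[i][d])
    mid = len(idx) // 2
    return (idx[mid], d,
            build_kdtree(points, idx[:mid]),
            build_kdtree(points, idx[mid + 1:]))


def knn(points, tree, qi, k):
    """Step c: the k nearest neighbors of points[qi] as sorted (dist, index)."""
    q = points[qi]
    heap = []  # max-heap on distance, stored as (-distance, index)

    def visit(node):
        if node is None:
            return
        i, d, left, right = node
        if i != qi:
            dq = math.dist(points[i], q)
            if len(heap) < k:
                heapq.heappush(heap, (-dq, i))
            elif dq < -heap[0][0]:
                heapq.heapreplace(heap, (-dq, i))
        diff = q[d] - points[i][d]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near)
        # Search the far side only if the splitting plane is close enough.
        if len(heap) < k or abs(diff) <= -heap[0][0]:
            visit(far)

    visit(tree)
    return sorted((-nd, i) for nd, i in heap)


def lof_scores(points, k):
    """Steps d to f: local outlier factor of every point."""
    tree = build_kdtree(points)
    neigh = [knn(points, tree, i, k) for i in range(len(points))]
    k_dis = [nb[-1][0] for nb in neigh]
    # Local reachability density: inverse mean reachability distance.
    lrd = [len(nb) / sum(max(k_dis[j], dij) for dij, j in nb) for nb in neigh]
    return [sum(lrd[j] for _, j in nb) / (len(nb) * lrd[i])
            for i, nb in enumerate(neigh)]
```

On a 3x3 unit grid plus one distant point at (10, 10) with k = 3, the grid points score close to 1 while the distant point scores far above 1, which matches the rule in step S3 that points with LOF(p) near 1 are treated as normal.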
2. A gray LOF flow anomaly detection system based on the method of claim 1, characterized by comprising an information acquisition module, a gray distinguishing module, an LOF analysis module and an output module, wherein: the information acquisition module acquires and preprocesses the raw data and transmits the data to the gray distinguishing module; the gray distinguishing module analyzes and pre-judges the data to obtain the gray area that needs to be calculated, and transmits the gray area to the LOF analysis module; the LOF analysis module analyzes the objects in the gray area and transmits the analysis result to the output module; and the output module outputs the analysis result to a terminal.
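The data flow between the four modules of claim 2 can be sketched as a chain of callables. Everything named here is hypothetical and chosen only for illustration: `FlowRecord` mirrors the four fields selected in step S1, and `run_pipeline` wires the modules together in the order the claim describes.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class FlowRecord:
    """One traffic sample with the four fields selected in step S1."""
    packets_in: int
    packets_out: int
    bytes_in: int
    bytes_out: int


def run_pipeline(
    acquire: Callable[[], List[FlowRecord]],
    discriminate: Callable[[List[FlowRecord]], List[FlowRecord]],
    analyze: Callable[[List[FlowRecord]], List[FlowRecord]],
    output: Callable[[List[FlowRecord]], None],
) -> List[FlowRecord]:
    """Chain the four modules: acquisition, gray distinguishing, LOF, output."""
    records = acquire()                # information acquisition module
    gray_area = discriminate(records)  # gray distinguishing module (GM(1,1))
    anomalies = analyze(gray_area)     # LOF analysis module (KTLAD)
    output(anomalies)                  # output module
    return anomalies
```

Only the records that the gray distinguishing stage passes on reach the LOF stage, so the more expensive neighbor search runs on a reduced set of records rather than on all traffic.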
CN201710631334.2A 2017-07-28 2017-07-28 Flow anomaly detection system based on gray LOF and detection method thereof Active CN107257351B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710631334.2A CN107257351B (en) 2017-07-28 2017-07-28 OF flow anomaly detection system based on gray L and detection method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710631334.2A CN107257351B (en) 2017-07-28 2017-07-28 OF flow anomaly detection system based on gray L and detection method thereof

Publications (2)

Publication Number Publication Date
CN107257351A CN107257351A (en) 2017-10-17
CN107257351B true CN107257351B (en) 2020-08-04

Family

ID=60025647

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710631334.2A Active CN107257351B (en) 2017-07-28 2017-07-28 Flow anomaly detection system based on gray LOF and detection method thereof

Country Status (1)

Country Link
CN (1) CN107257351B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107733921A (en) * 2017-11-14 2018-02-23 深圳中兴网信科技有限公司 Network flow abnormal detecting method, device, computer equipment and storage medium
CN109446189A (en) * 2018-10-31 2019-03-08 成都天衡智造科技有限公司 A kind of technological parameter outlier detection system and method
CN110232082B (en) * 2019-06-13 2022-08-30 中国科学院新疆理化技术研究所 Anomaly detection method for continuous space-time refueling data
CN111144435B (en) * 2019-11-11 2022-11-11 国电南瑞科技股份有限公司 Electric energy abnormal data monitoring method based on LOF and verification filtering framework
CN110986946B (en) * 2019-11-15 2022-07-26 上海宇航系统工程研究所 Dynamic pose estimation method and device
CN112685473B (en) * 2020-12-29 2022-07-05 山东大学 Network abnormal flow detection method and system based on time sequence analysis technology
CN113347181A (en) * 2021-06-01 2021-09-03 上海明略人工智能(集团)有限公司 Abnormal advertisement flow detection method, system, computer equipment and storage medium
CN113626502A (en) * 2021-08-13 2021-11-09 南方电网深圳数字电网研究院有限公司 Power grid data anomaly detection method and device based on ensemble learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104717106A (en) * 2015-03-04 2015-06-17 贵州电网公司信息通信分公司 Distributed network traffic abnormity detection method based on multi-variable sequential analysis
CN105025515A (en) * 2015-06-30 2015-11-04 电子科技大学 Method for detecting flow abnormity of wireless sensor network based on GM model
CN106375156A (en) * 2016-09-30 2017-02-01 国网冀北电力有限公司信息通信分公司 Power network traffic anomaly detection method and device

Also Published As

Publication number Publication date
CN107257351A (en) 2017-10-17

Similar Documents

Publication Publication Date Title
CN107257351B (en) Flow anomaly detection system based on gray LOF and detection method thereof
CN107682319B (en) Enhanced angle anomaly factor-based data flow anomaly detection and multi-verification method
CN110505179B (en) Method and system for detecting network abnormal flow
CN109787979B (en) Method for detecting electric power network event and invasion
CN110895526A (en) Method for correcting data abnormity in atmosphere monitoring system
CN111107102A (en) Real-time network flow abnormity detection method based on big data
CN112788066B (en) Abnormal flow detection method and system for Internet of things equipment and storage medium
CN108601026B (en) Perception data error attack detection method based on random sampling consistency
CN103812577A (en) Method for automatically identifying and learning abnormal radio signal type
CN107561997A (en) A kind of power equipment state monitoring method based on big data decision tree
CN110830946B (en) Mixed type online data anomaly detection method
CN104866831B (en) The face recognition algorithms of characteristic weighing
CN110472671B (en) Multi-stage-based fault data preprocessing method for oil immersed transformer
CN108632269A (en) Detecting method of distributed denial of service attacking based on C4.5 decision Tree algorithms
CN112367303B (en) Distributed self-learning abnormal flow collaborative detection method and system
CN109711664B (en) Power transmission and transformation equipment health assessment system based on big data
CN111614576A (en) Network data traffic identification method and system based on wavelet analysis and support vector machine
Zhang et al. Pca-svm-based approach of detecting low-rate dos attack
CN110851422A (en) Data anomaly monitoring model construction method based on machine learning
CN111191720B (en) Service scene identification method and device and electronic equipment
CN116012780A (en) Fire disaster monitoring method and system based on image recognition
CN116684878B (en) 5G information transmission data safety monitoring system
CN111314910B (en) Wireless sensor network abnormal data detection method for mapping isolation forest
CN117201410A (en) Flow management method and system for Internet of things
CN116996665A (en) Intelligent monitoring method, device, equipment and storage medium based on Internet of things

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant