CN109257384B - Application layer DDoS attack identification method based on access rhythm matrix - Google Patents


Info

Publication number
CN109257384B
Authority
CN
China
Prior art keywords
matrix
rhythm
data
data packet
packet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811354199.2A
Other languages
Chinese (zh)
Other versions
CN109257384A (en)
Inventor
王风宇
林欢
孔健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan Bainarui Information Technology Co ltd
Original Assignee
Jinan Bainarui Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan Bainarui Information Technology Co ltd
Priority to CN201811354199.2A
Publication of CN109257384A
Application granted
Publication of CN109257384B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1458 Denial of Service

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an application layer DDoS attack identification method based on an access rhythm matrix. Built on the access rhythm matrix data structure, the method uses the abnormality degree of the matrix's change to detect attacks and uses outliers to identify attacking host IPs. The method has low time complexity and space complexity and does not affect actual network conditions. A detection system integrating the algorithm can be deployed between the router nearest to the target host and the host itself; it obtains inbound traffic packets through port mirroring, constructs the access rhythm matrix, detects application layer DDoS attacks, and identifies attacking host IPs.

Description

Application layer DDoS attack identification method based on access rhythm matrix
Technical Field
The disclosure relates to the technical field of computer network security, in particular to an application layer DDoS attack identification method based on an access rhythm matrix.
Background
DDoS attacks pose a great deal of harm to the network, and particularly DDoS flooding attacks on application layer protocols may directly cause users to be unable to access normally. This type of attack is more destructive than other underlying attack patterns and is difficult to detect by conventional DDoS detection systems.
Current systems for detecting application layer DDoS attacks can be roughly divided into two categories: one is misuse detection and the other is anomaly detection. The former finds attacks by matching input data with predefined attack features, while the latter builds a legal behavior model using normal behavior, and determines abnormal behavior if it deviates from the model. The DDoS attack detection method provided by the disclosure belongs to the category of anomaly detection.
In a DDoS attack detection system based on anomaly detection, the main tasks are selecting features and constructing a normal behavior model from them. We select the network-layer message length and message arrival time interval as the detection features. Previous studies have used the message length or the message arrival time interval as a feature for DDoS attack detection, for example [L. Zhou, M. Liao, C. Yuan, and H. Zhang, "Low-rate DDoS attack detection using expectation of packet size," Security and Communication Networks, vol. 2017, 2017] and [S. N. Shiaeles, V. Katos, A. S. Karakos, and B. K. Papadopoulos, "Real time DDoS detection using fuzzy estimators," Computers & Security, vol. 31, no. 6, pp. 782-790, Sep. 2012]. In these studies, however, the message length and the message arrival time interval are not linked to the application layer, which makes application layer attacks difficult to detect. Web content distribution and user access behavior are implicit in the message length and the message arrival time interval; a rhythm matrix constructed from these two features describes the user access behavior pattern, so that application layer DDoS attacks can be identified.
A few studies on DDoS attack detection organize data features in matrix form to construct a normal behavior model. The literature [Y. Xie and S.-Z. Yu, "Monitoring the application-layer DDoS attacks for popular websites," IEEE/ACM Transactions on Networking, vol. 17, no. 1, pp. 15-25, 2009] constructs a user access matrix (Access Matrix) by extending a common concept in website statistics, the request hit rate (Request Hit Rate), and uses a hidden semi-Markov model to infer changes in the popularity of documents on the website so as to discover abnormal behavior. The rhythm matrix we propose differs from that access matrix in both its construction method and the raw features it uses, but it likewise captures the spatio-temporal characteristics of access traffic. The document [S. M. Lee, D. S. Kim, J. H. Lee, and J. S. Park, "Detection of DDoS attacks using optimized traffic matrix," Computers & Mathematics with Applications, vol. 63, no. 2, pp. 501-510, Jan. 2012] constructs a traffic matrix (Traffic Matrix) from the source IP address in the IP packet header and detects DDoS attacks according to its distribution characteristics. In contrast, the rhythm matrix adopted here depicts richer user behavior characteristics, so that application layer DDoS attacks can be identified accurately.
Disclosure of Invention
To overcome the deficiencies of the prior art, the present disclosure provides an application layer DDoS attack identification method based on an access rhythm matrix, used for application layer DDoS attack detection and attacking host IP identification.
In order to achieve the purpose, the following technical scheme is adopted in the disclosure:
an application layer DDoS attack identification method based on an access rhythm matrix comprises the following steps:
step (1): data packet IP address capturing step: when the data packet reaches the boundary router, the IP address of the data packet is obtained through port mirroring, and whether the IP address of the current data packet is the IP address in the blacklist or not is judged; if yes, the border router directly blocks the sending of the data packet of the current IP address; if not, merging the data packets according to the IP addresses, and dividing the data packets of the same IP address into the same network data stream;
step (2): setting a data window, wherein the data window divides the network data stream into equal segments, and a corresponding rhythm matrix is generated in each data window; with the continuous arrival of the data packet, the data window is continuously pushed, a plurality of rhythm matrixes are constructed, and a rhythm matrix sequence is formed; if the number of the data packets in the data window reaches a set value, selecting a base matrix of the current rhythm matrix, and calculating a change rate matrix; after the change rate matrix is obtained, calculating the abnormal degree of the rhythm matrix;
step (3): judging whether an attack occurs: if the abnormality degree of the rhythm matrix is greater than a set threshold, a DDoS attack is determined to have occurred; if the abnormality degree of the rhythm matrix is less than or equal to the set threshold, no DDoS attack is determined to have occurred.
As some possible implementations of the present disclosure, if no attack has occurred, the method returns to the data packet IP address capturing step; if an attack has occurred, outlier detection is performed on the constructed change rate matrix and the detected outliers are marked. If the number of times a host IP address falls on an outlier exceeds a set number, that host IP address is considered an attacking host IP address, and the identified IP address is added to the blacklist.
As some possible implementation manners of the present disclosure, a border router is arranged between the internet and the destination server, and port mirroring is configured on the border router to capture the data packets flowing into the destination server; a blacklist is set up to store the IP addresses of data packets that have attacked the server.
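Step (1) can be illustrated with a minimal Python sketch. The packet tuple layout `(src_ip, length, timestamp)`, the function name `group_by_source`, and the sample addresses are assumptions made for illustration only; the disclosure does not prescribe a data format, and in a real deployment the blocking itself would be done by the border router.

```python
from collections import defaultdict

def group_by_source(packets, blacklist):
    """Skip packets from blacklisted IPs and group the rest into per-IP flows.

    packets: iterable of (src_ip, length, timestamp) tuples (hypothetical layout).
    Returns (flows, blocked), where flows maps src_ip -> list of (length, timestamp).
    """
    flows = defaultdict(list)
    blocked = 0
    for src_ip, length, timestamp in packets:
        if src_ip in blacklist:
            blocked += 1  # stands in for the router blocking this packet
            continue
        flows[src_ip].append((length, timestamp))
    return dict(flows), blocked

packets = [
    ("10.0.0.1", 512, 0.000),
    ("10.0.0.2", 1460, 0.002),
    ("10.0.0.9", 60, 0.003),
    ("10.0.0.1", 1460, 0.010),
]
flows, blocked = group_by_source(packets, blacklist={"10.0.0.9"})
print(flows, blocked)
```

A set is used for the blacklist so that the membership check per packet stays constant time.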
As some possible implementation manners of the present disclosure, the specific steps of generating the rhythm matrix are:
step (21): recording the length of each data packet and the arrival interval time of each data packet; removing, from all data packets in the same network data stream, the data packets that carry no upper layer protocol payload, the session start packets and the session end packets; for the remaining data packets, calculating the size of each data packet and the arrival time interval between each data packet and the previous data packet; checking whether the number of data packets is an integral multiple of d; if not, returning to the data packet IP address capturing step; if yes, entering step (22);
step (22): d data packets are normalized, and the steps are as follows:
let $s_i$ be the i-th packet sent by the client to the destination server and $\Delta t_i$ the inter-arrival time of the i-th data packet, where i ranges from 1 to n and n is the number of packets in a network data stream; the network data stream F is then abstracted as:
$F = \{(s_1, \Delta t_1), (s_2, \Delta t_2), \ldots, (s_n, \Delta t_n)\}$ (1)
extracting the length feature of the data packets in the network data stream F to obtain the length feature stream of the current network data stream; letting the length of data packet $s_i$ be $p_i$, a network data stream is converted into:
$F' = \{(p_1, \Delta t_1), (p_2, \Delta t_2), \ldots, (p_n, \Delta t_n)\}$ (2)
wherein F' represents the feature stream obtained after extracting the packet length feature;
all data packets flowing to the destination server are abstracted into a feature stream through formula (2);
then, the lengths and arrival interval times of all the data packets in the feature stream are discretized respectively:
[Formula (3): Figure GDA0002683279290000033]
where $b(p)$ represents the discretization base of the packet length, $\sup(p)$ represents the upper bound of the standard packet length, and $\inf(p)$ represents the lower bound of the standard packet length; the $\log_{10}$ function is used to discretize $\Delta t_i$ into a digit from 0 to 9; $p'$ represents the packet length after discretization, $\Delta t'$ represents the packet arrival interval time after discretization, and $p$ represents the original packet length;
d adjacent data packets $s_1, s_2, \ldots, s_d$ are obtained through the normalization processing and are combined using the following formulas (4-1) and (4-2):
$X = p_1' \cdot 10^{d-1} + p_2' \cdot 10^{d-2} + p_3' \cdot 10^{d-3} + \cdots + p_d'$ (4-1)
$Y = \Delta t_1' \cdot 10^{d-1} + \Delta t_2' \cdot 10^{d-2} + \Delta t_3' \cdot 10^{d-3} + \cdots + \Delta t_d'$ (4-2)
wherein X and Y are the values obtained after the data packets are processed, and both lie in the interval $[0, 10^d - 1]$; $p_i'$ denotes the length of the i-th packet after discretization, and $\Delta t_i'$ denotes the inter-arrival time of the i-th packet after discretization.
step (23): the obtained X is taken as the abscissa, and Y as the ordinate, of an element in the $10^d$-dimensional rhythm matrix S, thereby mapping the network data stream onto the rhythm matrix; the initial value of each element in the rhythm matrix is 0, and each time a new drop point arrives, the element value of the rhythm matrix S at the corresponding coordinate position is incremented by 1;
step (24): repeating step (23) until the data packets in the current time window have all been mapped onto the rhythm matrix, at which point the mapping ends; the length of the time window is measured by the number of drop points in the rhythm matrix, that is, the mapping of the data packets in a time window ends when the number of drop points reaches the set number.
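Steps (22) to (24) can be sketched in Python as follows, assuming the packet lengths and intervals have already been discretized to digits 0 to 9. The function names, the sparse `Counter` representation (instead of a dense matrix with 10^d rows and columns) and the sample digits are illustrative assumptions, not part of the disclosure.

```python
from collections import Counter

def pack_digits(digits):
    """Combine d discretized values (each 0-9) into one coordinate,
    as in formulas (4-1) and (4-2): the first digit is the most significant."""
    value = 0
    for digit in digits:
        if not 0 <= digit <= 9:
            raise ValueError("each discretized value must be in 0..9")
        value = value * 10 + digit
    return value

def build_rhythm_matrices(groups, window_size):
    """Map groups of d adjacent packets to drop points and count them per window.

    groups: iterable of (length_digits, interval_digits), one pair per d packets.
    window_size: number of drop points that closes a window.
    Returns a list of sparse matrices, each a Counter keyed by (X, Y).
    """
    matrices = []
    current = Counter()  # every element implicitly starts at 0
    filled = 0
    for length_digits, interval_digits in groups:
        x = pack_digits(length_digits)
        y = pack_digits(interval_digits)
        current[(x, y)] += 1  # a new drop point adds 1 at (X, Y)
        filled += 1
        if filled == window_size:
            matrices.append(current)
            current, filled = Counter(), 0
    return matrices  # a trailing incomplete window is not emitted

groups = [
    ([3, 9, 1], [0, 4, 2]),
    ([3, 9, 1], [0, 4, 2]),
    ([1, 0, 0], [5, 5, 5]),
    ([9, 9, 9], [9, 9, 9]),
]
matrices = build_rhythm_matrices(groups, window_size=3)
print(matrices)
```

With d = 3 the coordinates stay between 0 and 999, and only the cells that actually receive drop points occupy memory.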
As some possible implementations of the present disclosure, the selection rule of the base matrix is as follows:
(2a) for the constructed rhythm matrix sequence, detection starts from the third rhythm matrix, and the first rhythm matrix in the rhythm matrix sequence is taken as the base matrix of the third rhythm matrix; from the fourth rhythm matrix onward, detection follows the rule in (2b) or (2c);
(2b) if the previous rhythm matrix $S_{i-1}$ of the current rhythm matrix $S_i$ is judged to be normal, that is, no attack occurred, the matrix $S_{i-2}$ is selected as the base matrix of the current rhythm matrix;
(2c) if the previous rhythm matrix $S_{i-1}$ of the current rhythm matrix $S_i$ is judged to be abnormal, the base matrix of $S_{i-1}$ is selected as the base matrix of the current rhythm matrix.
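Rules (2a) to (2c) amount to a small piece of bookkeeping, sketched below in Python. The function names and the 1-based indexing are illustrative choices; the verdict list in the example is made up to exercise each rule.

```python
def select_base_index(i, prev_abnormal, prev_base_index):
    """Return the 1-based index of the base matrix for rhythm matrix i.

    prev_abnormal: verdict for matrix i-1 (ignored when i == 3).
    prev_base_index: base index that was used for matrix i-1.
    """
    if i < 3:
        raise ValueError("detection starts from the third rhythm matrix")
    if i == 3:
        return 1                # rule (2a)
    if prev_abnormal:
        return prev_base_index  # rule (2c): reuse the base of S_{i-1}
    return i - 2                # rule (2b): use S_{i-2}

def base_sequence(verdicts):
    """verdicts[k] is True if matrix k+3 was judged abnormal.
    Returns the base index chosen for matrices 3, 4, 5, ..."""
    bases = []
    prev_abnormal, prev_base = False, None
    for offset, abnormal in enumerate(verdicts):
        base = select_base_index(offset + 3, prev_abnormal, prev_base)
        bases.append(base)
        prev_abnormal, prev_base = abnormal, base
    return bases

# Matrices 3..6: normal, abnormal, abnormal, normal
bases = base_sequence([False, True, True, False])
print(bases)
```

While consecutive windows are judged abnormal, the base stays pinned to the same earlier matrix, so attack traffic is not compared against attack traffic.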
As some possible implementations of the present disclosure, the specific steps of calculating the rate of change matrix are as follows:
$\alpha_{x,y} = \dfrac{S_{x,y}}{R_{x,y}}$ (5)
if $S_{x,y} = 0$, then $\alpha_{x,y} = 0$; if $R_{x,y} = 0$, then $\alpha_{x,y} = S_{x,y}/1$;
wherein $S_{x,y}$ represents the value at the $(x, y)$ coordinates of the current rhythm matrix; $R_{x,y}$ represents the value at the $(x, y)$ coordinates of the base matrix; and $\alpha_{x,y}$ represents the value at the $(x, y)$ coordinates of the change rate matrix.
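A minimal Python sketch of the change rate computation follows. It assumes the change rate is the element-wise quotient of the current matrix by the base matrix, which is what the two special cases in the text suggest, and it stores matrices as sparse dictionaries keyed by (x, y); both the representation and the function name are illustrative assumptions.

```python
def change_rate(current, base):
    """Return the nonzero entries of the change rate matrix as a dict.

    current, base: sparse matrices mapping (x, y) -> count.
    Assumes alpha = S / R, with alpha = 0 when S is 0 and alpha = S / 1 when R is 0.
    """
    rates = {}
    for coord, s in current.items():
        if s == 0:
            continue  # alpha is 0 here, so the entry is simply left out
        r = base.get(coord, 0)
        rates[coord] = s / r if r != 0 else s / 1
    return rates

current = {(1, 2): 6, (3, 4): 5, (5, 6): 0}
base = {(1, 2): 3, (7, 8): 4}
rates = change_rate(current, base)
print(rates)
```

Cells present only in the base matrix have S equal to 0 and therefore contribute nothing to the result.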
As some possible implementation manners of the present disclosure, after the change rate matrix is obtained, the nonzero values in the change rate matrix are extracted into a one-dimensional vector, on which outlier detection is performed.
As some possible implementations of the present disclosure, the outlier detection method is a detection method based on the interquartile range:
if the value of a point in the one-dimensional vector is not within the interval $[Q_1 - k(Q_3 - Q_1),\ Q_3 + k(Q_3 - Q_1)]$, the point is considered to be an outlier, wherein $Q_1$ denotes the first quartile, $Q_3$ denotes the third quartile, and the value of k is 1.5.
The quartile method is a statistical analysis method. All data are arranged from small to large; the number at the 1/4 position (i.e., the 25% position) is called the first quartile Q1, the number at the 3/4 position (i.e., the 75% position) is called the third quartile Q3, and the number at the middle position (i.e., the 50% position) is called the second quartile Q2. The difference between the third quartile and the first quartile is defined as the interquartile range, which represents the degree of dispersion of a set of data. The outlier detection method above is based on this interquartile range.
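The interquartile range test can be sketched with Python's standard library. Quartile conventions differ between tools; this sketch uses `statistics.quantiles` with the `inclusive` method, which is an assumption, since the text only describes quartiles by position. The function name and sample values are illustrative.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    if len(values) < 2:
        return []  # quartiles are not meaningful for fewer than two points
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Nonzero change rates extracted into a one-dimensional vector
rates = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 9.0]
outliers = iqr_outliers(rates)
print(outliers)
```

Here most cells change by a factor close to 1 between windows, so the cell whose count grew ninefold is the only one flagged.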
As some possible implementations of the present disclosure, the abnormality degree γ of the rhythm matrix is calculated, i.e. the sum of the abnormality degrees of all elements in the matrix:
$\gamma = \sum_{x,y} \max\left(0,\ S_{x,y} - R_{x,y} \cdot \mathrm{mean}(\alpha)\right)$ (6)
wherein $S_{x,y}$ represents the value at the $(x, y)$ coordinates of the current matrix, $R_{x,y}$ represents the value at the $(x, y)$ coordinates of the base matrix, and $\mathrm{mean}(\alpha)$ represents the average of the change rates of all the points of the current matrix.
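Formula (6) and the threshold decision can be sketched in Python as below. Two points are assumptions rather than statements of the disclosure: the mean is taken over the nonzero change rates only, and the change rate is computed as S/R with the special cases given in the text. The matrices are sparse dictionaries, and the threshold value is a made-up number for illustration.

```python
def abnormality_degree(current, base):
    """Sum of max(0, S - R * mean(alpha)) over the cells of the current matrix.

    current, base: sparse matrices mapping (x, y) -> count.
    mean(alpha) is taken over the nonzero change rates (an assumption).
    """
    rates = []
    for coord, s in current.items():
        if s > 0:
            r = base.get(coord, 0)
            rates.append(s / r if r else s / 1)
    if not rates:
        return 0.0
    mean_rate = sum(rates) / len(rates)
    # Cells where S is 0 can only contribute 0, so iterating over current suffices.
    return sum(
        max(0.0, s - base.get(coord, 0) * mean_rate)
        for coord, s in current.items()
    )

base = {(1, 2): 10, (3, 4): 10}
quiet = {(1, 2): 10, (3, 4): 10}
burst = {(1, 2): 10, (3, 4): 10, (5, 6): 40}

THRESHOLD = 20  # hypothetical value; the disclosure only says "a set threshold"
gamma_quiet = abnormality_degree(quiet, base)
gamma_burst = abnormality_degree(burst, base)
print(gamma_quiet, gamma_quiet > THRESHOLD)
print(gamma_burst, gamma_burst > THRESHOLD)
```

In the second window the 40 drop points on a previously empty cell are the entire contribution to γ, which pushes it over the illustrative threshold.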
As some possible implementation manners of the present disclosure, if an application layer DDoS attack is found in the current time window, tracking and recording of data packet IP addresses starts in the next time window; an IP check is then performed by examining all IP records in the current time window; if the number of drop points of a given IP on the rhythm matrix S that are outliers exceeds a set threshold, the IP is judged to be an attacking host IP and is added to the prepared blacklist.
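The IP check can be sketched in Python as a per-IP count of drop points that land on outlier cells. The record layout `(src_ip, (x, y))`, the function name and the threshold value are assumptions for illustration.

```python
from collections import Counter

def identify_attackers(tracked_points, outlier_cells, min_hits):
    """Return the IPs whose drop points hit outlier cells more than min_hits times.

    tracked_points: iterable of (src_ip, (x, y)) records from the tracking window.
    outlier_cells: set of (x, y) coordinates flagged by outlier detection.
    """
    hits = Counter(ip for ip, cell in tracked_points if cell in outlier_cells)
    return {ip for ip, count in hits.items() if count > min_hits}

tracked = [("10.0.0.5", (5, 6))] * 4 + [
    ("10.0.0.1", (1, 2)),
    ("10.0.0.1", (5, 6)),
]
blacklist = set()
blacklist |= identify_attackers(tracked, outlier_cells={(5, 6)}, min_hits=2)
print(blacklist)
```

A host that touches an outlier cell only occasionally stays below the count threshold and is not blacklisted.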
Compared with the prior art, the beneficial effect of this disclosure is:
1) Low time and space complexity
The algorithm has low time complexity and space complexity and places low demands on CPU time and memory. Consider the time complexity first. The algorithm processes each incoming data packet only once, computing the message length and the arrival interval and then classifying the packet into a flow according to its source IP address, which has complexity O(1). When constructing the rhythm matrix and the change rate matrix, the element-wise quotient of the $10^d$-dimensional matrices is computed; since the matrix size is fixed once d is chosen, its time complexity is of constant order O(1). The time complexity of the subsequent outlier detection and abnormality degree calculation is also of constant order. The time complexity of our algorithm therefore remains of constant order O(1) throughout.
Next consider the space complexity. As the time window advances, only the rhythm matrix of the current time window and its corresponding base matrix need to be recorded. In addition, the change rate matrix of the current window and the corresponding 0-1 matrix are stored temporarily. Because the time windows are consecutive, the memory is updated continually, but the space complexity of the overall algorithm remains of constant order O(1).
2) High application layer DDoS detection accuracy and high attacking host IP recognition rate
Experiments were carried out with different types of DDoS attack traffic data. Within the optimized parameter range, the accuracy of the algorithm in detecting application layer DDoS attacks can exceed 98%; after an attack is detected, the accuracy of attacking host IP identification can exceed 97%, and the recall rate can reach 100%.
3) Application layer DDoS attacks can be distinguished from burst access
Since burst access does not produce outliers in the access rhythm matrix, it is not recognized as a DDoS attack by our method. The method was verified with a publicly available burst congestion data set, and the results show that it effectively avoids identifying burst traffic as attack traffic.
4) Application layer data in the message need not be parsed, which protects user privacy
The method analyzes only the size of the data packets and the arrival interval time of adjacent packets and does not need to parse the application layer data in the message, so that leakage of user data is avoided and user privacy is protected.
The detection method provided by the disclosure uses the network-layer packet length and packet arrival time interval to construct the access rhythm matrix; because these are network-layer features, the method can in theory be used to detect DDoS attacks based on different upper layer protocols.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is a network deployment diagram of the present disclosure;
FIG. 2 is a construction diagram of a rhythm matrix;
FIG. 3 is a general flow chart of the algorithm.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; it should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in FIG. 1, the application layer DDoS attack detection and attacking host IP identification system, that is, the device implementing the present method, is deployed between the destination host and the router closest to it. All traffic entering the destination host is captured by the detection system through port mirroring on the router and then inspected; the identified attacking host IPs are added to the blacklist, and the IPs in the blacklist can be blocked by the router.
As shown in fig. 3, the application layer DDoS attack identification method based on the access rhythm matrix includes:
step (1): data packet IP address capturing step: when the data packet reaches the boundary router, the IP address of the data packet is obtained through port mirroring, and whether the IP address of the current data packet is the IP address in the blacklist or not is judged; if yes, the border router directly blocks the sending of the data packet of the current IP address; if not, merging the data packets according to the IP addresses, and dividing the data packets of the same IP address into the same network data stream;
step (2): setting a data window, wherein the data window divides the network data stream into a plurality of equal-sized segments, and a corresponding rhythm matrix is generated in each data window; as data packets keep arriving and the data window keeps advancing, a plurality of rhythm matrices are constructed, forming a rhythm matrix sequence as shown in FIG. 2; if the number of drop points in the data window reaches a set value, selecting a base matrix of the current rhythm matrix and calculating a change rate matrix; after the change rate matrix is obtained, calculating the abnormality degree;
step (3): judging whether an attack occurs: if the abnormality degree of the rhythm matrix is greater than a set threshold, a DDoS attack is determined to have occurred; if the abnormality degree of the rhythm matrix is less than or equal to the set threshold, no DDoS attack is determined to have occurred.
As some possible implementations of the present disclosure, if no attack has occurred, the packet capturing step continues; if an attack is detected, outlier detection is performed on the constructed change rate matrix and the detected outliers are marked, so that the attacking host IP can be identified in the subsequent step.
As some possible implementation manners of the present disclosure, a border router is arranged between the internet and the destination server, and port mirroring is configured on the border router to capture the data packets flowing into the destination server; a blacklist is set up to store the IP addresses of data packets that have attacked the server.
As some possible implementation manners of the present disclosure, the specific steps of mapping out the corresponding rhythm matrix for the network data stream are as follows:
step (21): recording the length of each data packet and the arrival interval time of each data packet; removing, from all data packets in the same network data stream, the data packets that carry no upper layer protocol payload, the session start packets and the session end packets; for the remaining data packets, calculating the size of each data packet and the arrival time interval between each data packet and the previous data packet; checking whether the number of data packets is an integral multiple of d; if not, returning to the data packet IP address capturing step; if yes, entering step (22);
step (22): d data packets are normalized, and the steps are as follows:
let $s_i$ be the i-th packet sent by the client to the destination server and $\Delta t_i$ the inter-arrival time of the i-th data packet, where i ranges from 1 to n and n is the number of packets in a network data stream; the network data stream F is then abstracted as:
$F = \{(s_1, \Delta t_1), (s_2, \Delta t_2), \ldots, (s_n, \Delta t_n)\}$ (1)
extracting the length feature of the data packets in the network data stream F to obtain the length feature stream of the current network data stream; letting the length of data packet $s_i$ be $p_i$, a network data stream is converted into:
$F' = \{(p_1, \Delta t_1), (p_2, \Delta t_2), \ldots, (p_n, \Delta t_n)\}$ (2)
wherein F' represents the feature stream obtained after extracting the packet length feature;
all data packets flowing to the destination server are abstracted into a feature stream through formula (2);
then, the lengths and arrival interval times of all the data packets in the feature stream are discretized respectively:
[Formula (3): Figure GDA0002683279290000081]
where $b(p)$ represents the discretization base of the packet length, $\sup(p)$ represents the upper bound of the standard packet length, and $\inf(p)$ represents the lower bound of the standard packet length; the $\log_{10}$ function is used to discretize $\Delta t_i$ into a digit from 0 to 9; $p'$ represents the packet length after discretization, $\Delta t'$ represents the packet arrival interval time after discretization, and $p$ represents the original packet length;
d adjacent data packets $s_1, s_2, \ldots, s_d$ are obtained through the normalization processing and are combined using the following formulas (4-1) and (4-2):
$X = p_1' \cdot 10^{d-1} + p_2' \cdot 10^{d-2} + p_3' \cdot 10^{d-3} + \cdots + p_d'$ (4-1)
$Y = \Delta t_1' \cdot 10^{d-1} + \Delta t_2' \cdot 10^{d-2} + \Delta t_3' \cdot 10^{d-3} + \cdots + \Delta t_d'$ (4-2)
wherein X and Y are the values obtained after the data packets are processed, and both lie in the interval $[0, 10^d - 1]$; $p_i'$ denotes the length of the i-th packet after discretization, and $\Delta t_i'$ denotes the inter-arrival time of the i-th packet after discretization.
step (23): the obtained X is taken as the abscissa, and Y as the ordinate, of an element in the $10^d$-dimensional rhythm matrix S, thereby mapping the network data stream onto the rhythm matrix; the initial value of each element in the rhythm matrix is 0, and each time a new drop point arrives, the element value of the rhythm matrix S at the corresponding coordinate position is incremented by 1;
step (24): repeating step (23) until the data packets in the current time window have all been mapped onto the rhythm matrix, at which point the mapping ends; the length of the time window is measured by the number of drop points in the rhythm matrix, that is, the mapping of the data packets in a time window ends when the number of drop points reaches the set number.
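The packet filtering and grouping of step (21) can be sketched in Python as follows. The tuple layout `(timestamp_ms, payload_len, is_syn, is_fin)`, the use of SYN and FIN flags to stand for session start and end packets, and the choice to measure each interval against the previous retained packet are all assumptions made for illustration; the text does not fix these details.

```python
def payload_features(packets):
    """Keep payload-carrying packets and pair each length with its inter-arrival time.

    packets: list of (timestamp_ms, payload_len, is_syn, is_fin), in arrival order.
    The first retained packet only serves as a time reference.
    """
    features = []
    prev_ts = None
    for ts, payload_len, is_syn, is_fin in packets:
        if payload_len == 0 or is_syn or is_fin:
            continue  # no upper layer payload, or a session start/end packet
        if prev_ts is not None:
            features.append((payload_len, ts - prev_ts))
        prev_ts = ts
    return features

def complete_groups(features, d):
    """Split features into groups of exactly d; leftovers wait for more packets."""
    usable = len(features) - len(features) % d
    return [features[i:i + d] for i in range(0, usable, d)]

packets = [
    (0, 0, True, False),       # session start, dropped
    (100, 500, False, False),  # first payload packet, time reference only
    (300, 1460, False, False),
    (350, 0, False, False),    # bare acknowledgement, dropped
    (600, 800, False, False),
    (700, 0, False, True),     # session end, dropped
]
features = payload_features(packets)
groups = complete_groups(features, d=2)
print(features, groups)
```

Each complete group of d features would then be discretized and combined into one drop point.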
As some possible embodiments of the present disclosure, the selection rule of the base matrix is as follows:
(2a) for the constructed rhythm matrix sequence, detection starts from the third rhythm matrix, and the first rhythm matrix in the rhythm matrix sequence is taken as the base matrix of the third rhythm matrix; from the fourth rhythm matrix onward, detection follows the rule in (2b) or (2c);
(2b) if the previous rhythm matrix $S_{i-1}$ of the current rhythm matrix $S_i$ is judged to be normal, that is, no attack occurred, the matrix $S_{i-2}$ is selected as the base matrix of the current rhythm matrix;
(2c) if the previous rhythm matrix $S_{i-1}$ of the current rhythm matrix $S_i$ is judged to be abnormal, the base matrix of $S_{i-1}$ is selected as the base matrix of the current rhythm matrix.
As some possible embodiments of the present disclosure, the specific steps of calculating the rate of change matrix are as follows:
$\alpha_{x,y} = \dfrac{S_{x,y}}{R_{x,y}}$ (5)
if $S_{x,y} = 0$, then $\alpha_{x,y} = 0$; if $R_{x,y} = 0$, then $\alpha_{x,y} = S_{x,y}/1$;
wherein $S_{x,y}$ represents the value at the $(x, y)$ coordinates of the current matrix; $R_{x,y}$ represents the value at the $(x, y)$ coordinates of the base matrix; and $\alpha_{x,y}$ represents the value at the $(x, y)$ coordinates of the change rate matrix.
As some possible embodiments of the present disclosure, after the change rate matrix is obtained, the nonzero values in the change rate matrix are extracted into a one-dimensional vector, on which outlier detection is performed.
As some possible embodiments of the present disclosure, the outlier detection method is a detection method based on the interquartile range:
if the value of a point in the one-dimensional vector is not within the interval $[Q_1 - k(Q_3 - Q_1),\ Q_3 + k(Q_3 - Q_1)]$, the point is considered to be an outlier, wherein $Q_1$ denotes the first quartile, $Q_3$ denotes the third quartile, and the value of k is 1.5.
The quartile method is a statistical analysis method. All data are arranged from small to large; the number at the 1/4 position (i.e., the 25% position) is called the first quartile Q1, the number at the 3/4 position (i.e., the 75% position) is called the third quartile Q3, and the number at the middle position (i.e., the 50% position) is called the second quartile Q2. The difference between the third quartile and the first quartile is defined as the interquartile range, which represents the degree of dispersion of a set of data. The outlier detection method above is based on this interquartile range.
As some possible embodiments of the present disclosure, the abnormality degree γ of the rhythm matrix is calculated, i.e. the sum of the abnormality degrees of all elements in the matrix:
$\gamma = \sum_{x,y} \max\left(0,\ S_{x,y} - R_{x,y} \cdot \mathrm{mean}(\alpha)\right)$ (6)
wherein $S_{x,y}$ represents the value at the $(x, y)$ coordinates of the current matrix, $R_{x,y}$ represents the value at the $(x, y)$ coordinates of the base matrix, and $\mathrm{mean}(\alpha)$ represents the average of the change rates of all the points of the current matrix.
As some possible embodiments of the present disclosure, if an application layer DDoS attack is found in the current time window, IP address tracking of data packets is started in the next time window; IP inspection is then performed by examining all IP records in that time window; if the number of drop points of a given IP on the rhythm matrix S that are outliers exceeds a set threshold, that IP is judged to be the IP of an attack host and is added to the prepared blacklist.
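The IP inspection step can be illustrated as a simple count per source address. This is a sketch under assumptions: the record layout `(ip, (x, y))`, the function name `find_attack_ips`, and the sample addresses (taken from documentation address ranges) are hypothetical, and the set of outlier cells is assumed to come from the outlier detection on the change rate matrix.

```python
from collections import Counter

def find_attack_ips(drop_points, outlier_cells, threshold):
    """Return the IPs whose number of drop points on outlier cells exceeds threshold.

    drop_points: iterable of (ip, (x, y)) records from the tracked time window.
    outlier_cells: set of (x, y) coordinates flagged as outliers.
    """
    hits = Counter(ip for ip, cell in drop_points if cell in outlier_cells)
    return {ip for ip, count in hits.items() if count > threshold}

# Hypothetical tracking records for one time window.
records = [
    ("192.0.2.10", (5, 5)), ("192.0.2.10", (5, 5)), ("192.0.2.10", (5, 5)),
    ("198.51.100.7", (1, 1)), ("198.51.100.7", (5, 5)),
]
blacklist = set()
blacklist |= find_attack_ips(records, outlier_cells={(5, 5)}, threshold=2)
print(blacklist)
```

A host that lands on an outlier cell only occasionally stays below the threshold, which limits false positives for legitimate clients whose traffic happens to resemble the attack pattern.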
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (7)

1. An application layer DDoS attack identification method based on an access rhythm matrix is characterized by comprising the following steps:
step (1): data packet IP address capturing step: when the data packet reaches the boundary router, the IP address of the data packet is obtained through port mirroring, and whether the IP address of the current data packet is the IP address in the blacklist or not is judged; if yes, the border router directly blocks the sending of the data packet of the current IP address; if not, merging the data packets according to the IP addresses, and dividing the data packets of the same IP address into the same network data stream;
step (2): setting a data window, wherein the data window divides a network data stream into equal segments, and a corresponding rhythm matrix is generated in each data window; with the continuous arrival of the data packet, the data window is continuously pushed, a plurality of rhythm matrixes are constructed, and a rhythm matrix sequence is formed; if the number of the data packets in the data window reaches a set value, selecting a base matrix of the current rhythm matrix, and calculating a change rate matrix; after the change rate matrix is obtained, calculating the abnormal degree of the rhythm matrix;
step (3): judging whether an attack occurs: if the degree of abnormality of the rhythm matrix is greater than a set threshold, determining that a DDoS attack occurs; if the degree of abnormality of the rhythm matrix is less than or equal to the set threshold, determining that no DDoS attack occurs;
the specific generation steps of the rhythm matrix are as follows:
step (21): recording the length of each data packet and the arrival interval time of each data packet; eliminating data packets without upper layer protocol load, session start packets and end packets from all data packets in the same network data stream; respectively calculating the capacity of each data packet for the rest data packets, and calculating the arrival time interval between each data packet and the previous data packet; calculating whether the number of the data packets is integral multiple of d, if not, returning to the data packet IP address capturing step; if yes, entering the step (22);
step (22): d data packets are normalized, and the steps are as follows:
let $s_i$ be the i-th data packet sent by the client to the destination server and $\Delta t_i$ be the arrival interval time of the i-th data packet, where i ranges from 1 to n and n is the number of data packets in a network data stream; the network data stream F is then abstracted as:
Figure FDA0002683279280000011
extracting the length feature of the data packets in the network data stream F to obtain the length feature stream of the current network data stream; letting the length of data packet $s_i$ be $p_i$, a network data stream is converted into:
Figure FDA0002683279280000012
wherein, F' represents a characteristic flow after the characteristic of the length of the data packet is extracted;
abstracting all data packets flowing to a destination server into a characteristic flow through a formula (2);
then, discretizing the lengths and arrival interval times of all the data packets in the feature stream respectively:
Figure FDA0002683279280000021
where b(p) - inf(p) represents the discrete base of the data packet length, sup(p) represents the upper bound of the standard data packet length, inf(p) represents the lower bound of the standard data packet length, and the $\log_{10}$ function is used to discretize $\Delta t_i$ into a number from 0 to 9; p' represents the discretized data packet length, Δt' represents the discretized arrival interval time, and p represents the original data packet length;
d adjacent data packets $s_1, s_2, \ldots, s_d$ are obtained through the normalization processing, and the data packets $s_1, s_2, \ldots, s_d$ are arranged using formulas (4-1) and (4-2):
$X = p_1' \cdot 10^{d-1} + p_2' \cdot 10^{d-2} + p_3' \cdot 10^{d-3} + \cdots + p_d'$ (4-1)
$Y = \Delta t_1' \cdot 10^{d-1} + \Delta t_2' \cdot 10^{d-2} + \Delta t_3' \cdot 10^{d-3} + \cdots + \Delta t_d'$ (4-2)
wherein X and Y are the values obtained after the data packets are processed, and both X and Y lie in the range 0 to $10^d - 1$; $p_i'$ denotes the discretized length of the i-th data packet, and $\Delta t_i'$ denotes the discretized arrival interval time of the i-th data packet;
step (23): the obtained X is taken as the abscissa of an element in the $10^d$-dimensional rhythm matrix S, and the obtained Y is taken as the ordinate of that element in the $10^d$-dimensional rhythm matrix S, thereby mapping the network data stream onto the rhythm matrix; the initial value of each element in the rhythm matrix is 0, and each time a new drop point arrives, the element value of the rhythm matrix S at the corresponding coordinate position is incremented by 1;
step (24): repeating step (23) until the data packets in the current time window are all mapped onto the rhythm matrix, at which point the mapping is finished; the length of the time window is measured by the number of drop points on the rhythm matrix, that is, the mapping of the data packets in the time window is complete when the number of drop points reaches a set number;
the specific calculation steps of the change rate matrix are as follows:
Figure FDA0002683279280000022
if $S_{x,y} = 0$, then $\alpha_{x,y} = 0$; if $R_{x,y} = 0$, then $\alpha_{x,y} = S_{x,y}/1$;
where $S_{x,y}$ represents the value at coordinate (x, y) of the current rhythm matrix, $R_{x,y}$ represents the value at coordinate (x, y) of the base matrix, and $\alpha_{x,y}$ represents the value at coordinate (x, y) of the change rate matrix;
wherein the degree of abnormality γ of the rhythm matrix is the sum of the degrees of abnormality of all elements in the matrix:
$\gamma = \sum_{x,y} \max\bigl(0,\ S_{x,y} - R_{x,y} \cdot \mathrm{mean}(\alpha)\bigr)$ (6)
where $S_{x,y}$ represents the value at coordinate (x, y) of the current matrix, $R_{x,y}$ represents the value at coordinate (x, y) of the base matrix, and mean(α) represents the average of the change rates of all points of the current matrix.
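The mapping of steps (22) to (24) in claim 1 can be sketched as follows. The sketch assumes the packet lengths and arrival intervals have already been discretized to digits 0 to 9, because formula (3) is not reproduced in the text; it also assumes that packets are consumed in consecutive, non-overlapping groups of d. The names `encode` and `build_rhythm_matrix` are hypothetical, and the matrix is stored sparsely because a dense $10^d \times 10^d$ array grows quickly with d.

```python
from collections import defaultdict

def encode(digits):
    """Positional encoding of formulas (4-1)/(4-2): d digits -> one integer."""
    value = 0
    for digit in digits:
        if not 0 <= digit <= 9:
            raise ValueError("discretized values must be in 0..9")
        value = value * 10 + digit
    return value

def build_rhythm_matrix(packets, d):
    """packets: list of (p', dt') digit pairs, already discretized to 0..9."""
    S = defaultdict(int)  # sparse matrix; every element starts at 0
    # Assumption: packets are taken in non-overlapping groups of d.
    for start in range(0, len(packets) - d + 1, d):
        group = packets[start:start + d]
        x = encode([p for p, _ in group])    # formula (4-1)
        y = encode([dt for _, dt in group])  # formula (4-2)
        S[(x, y)] += 1  # each new drop point increments its cell by 1
    return S

# Hypothetical discretized stream with d = 2; the fifth packet has no partner.
stream = [(1, 2), (3, 4), (1, 2), (3, 4), (5, 6)]
matrix = build_rhythm_matrix(stream, d=2)
print(dict(matrix))
```

Two identical packet groups land on the same cell, so repetitive request patterns concentrate counts on a few coordinates while varied traffic spreads out over the matrix.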
2. The DDoS attack recognition method for application layer based on access rhythm matrix as claimed in claim 1, wherein if no attack occurs, returning to the packet IP address capture step; if the attack is detected, detecting outliers of the constructed change rate matrix, and marking the detected outliers; and if the number of times that a certain host IP address falls in the outlier exceeds the set number of times, the host IP address is considered as an attack host IP address, and the identified IP address is supplemented into a blacklist.
3. The DDoS attack recognition method for application layer based on access rhythm matrix as claimed in claim 1, wherein a border router is provided between internet and destination server, and a port mirror image is provided for the border router to capture the data packet flowing into the destination server; and setting a blacklist for storing the IP address of the data packet which attacks the server once.
4. The method for identifying an application layer DDoS attack based on an access rhythm matrix as claimed in claim 1,
the selection rule of the base matrix is as follows:
(2a) for the constructed rhythm matrix sequence, detecting from a third rhythm matrix, and taking a first rhythm matrix in the rhythm matrix sequence as a base matrix of the third rhythm matrix; when the fourth rhythm matrix is detected, detecting according to the rule in (2b) or (2 c);
(2b) if the previous rhythm matrix $S_{i-1}$ of the current rhythm matrix $S_i$ is judged to be normal, that is, no attack occurred, the matrix $S_{i-2}$ is selected as the base matrix of the current rhythm matrix;
(2c) if the previous rhythm matrix $S_{i-1}$ of the current rhythm matrix $S_i$ is judged to be abnormal, the base matrix of $S_{i-1}$ is selected as the base matrix of the current rhythm matrix.
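Rules (2a) to (2c) amount to a small recurrence over matrix indices, sketched below. The function name `select_base_index`, the 1-based indexing, and the sample verdicts are assumptions made for illustration only.

```python
def select_base_index(i, prev_is_normal, prev_base_index):
    """Return the 1-based index of the base matrix for rhythm matrix S_i."""
    if i < 3:
        raise ValueError("detection starts from the third rhythm matrix")
    if i == 3:
        return 1  # rule (2a): S_1 is the base matrix of S_3
    if prev_is_normal:
        return i - 2  # rule (2b): S_(i-1) was normal, so use S_(i-2)
    return prev_base_index  # rule (2c): reuse the base matrix of S_(i-1)

# Hypothetical verdicts: S_3 normal, S_4 and S_5 abnormal.
verdicts = {3: True, 4: False, 5: False}
bases = {}
for i in range(3, 7):
    bases[i] = select_base_index(i, verdicts.get(i - 1, True), bases.get(i - 1))
print(bases)
```

While consecutive windows are judged abnormal, the base index stays frozen at the last baseline chosen before the attack, so attack traffic is not absorbed into the reference matrix.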
5. The method according to claim 1, wherein after obtaining the change rate matrix, extracting values in the change rate matrix other than 0 into a one-dimensional vector for outlier detection.
6. The method for identifying an application layer DDoS attack based on an access rhythm matrix as claimed in claim 1,
the outlier detection method is a detection method based on the interquartile range:
if the value of a point in the one-dimensional vector does not lie within the interval $[Q_1 - k(Q_3 - Q_1),\ Q_3 + k(Q_3 - Q_1)]$, the point is considered to be an outlier; where $Q_1$ denotes the first quartile, $Q_3$ denotes the third quartile, and the value of k is 1.5.
7. The method according to claim 1, wherein if a DDoS attack is found in the current time window, IP address tracking of data packets is started in the next time window; IP inspection is performed by examining all IP records in that time window; if the number of drop points of a given IP on the rhythm matrix S that are outliers exceeds a set threshold, that IP is judged to be the IP of an attack host and is added to the prepared blacklist.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811354199.2A CN109257384B (en) 2018-11-14 2018-11-14 Application layer DDoS attack identification method based on access rhythm matrix

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811354199.2A CN109257384B (en) 2018-11-14 2018-11-14 Application layer DDoS attack identification method based on access rhythm matrix

Publications (2)

Publication Number Publication Date
CN109257384A CN109257384A (en) 2019-01-22
CN109257384B true CN109257384B (en) 2020-12-04

Family

ID=65044396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811354199.2A Active CN109257384B (en) 2018-11-14 2018-11-14 Application layer DDoS attack identification method based on access rhythm matrix

Country Status (1)

Country Link
CN (1) CN109257384B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111600859B (en) * 2020-05-08 2022-08-05 恒安嘉新(北京)科技股份公司 Method, device, equipment and storage medium for detecting distributed denial of service attack
CN111556068B (en) * 2020-05-12 2020-12-22 上海有孚智数云创数字科技有限公司 Flow characteristic identification-based distributed denial service monitoring and prevention and control method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102238047A (en) * 2011-07-15 2011-11-09 山东大学 Distributed denial-of-service attack detection method based on external connection behaviors of Web communication group
CN102638474A (en) * 2012-05-08 2012-08-15 山东大学 Application layer DDOS (distributed denial of service) attack and defense method
CN102882881A (en) * 2012-10-10 2013-01-16 常州大学 Special data filtering method for eliminating denial-of-service attacks to DNS (domain name system) service
WO2015012422A1 (en) * 2013-07-24 2015-01-29 Kim Hangjin Method for dealing with ddos attack and guaranteeing business continuity by using "2d matrix-based distributed access network"
CN104468507A (en) * 2014-10-28 2015-03-25 刘胜利 Torjan detection method based on uncontrolled end flow analysis

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Application layer DDoS attack detection using cluster with label based on sparse vector decomposition and rhythm matching";Liao, Q 等;《SECURITY AND COMMUNICATION NETWORKS》;20151125;第8卷(第17期);第3111-3120页 *
"基于节奏矩阵的AL_DDoS攻击检测技术研究";吴佳燕;《中国优秀硕士学位论文全文数据库 信息科技特辑》;20170115(第01期);第I139-42页 *

Also Published As

Publication number Publication date
CN109257384A (en) 2019-01-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant