CN108632269A

CN108632269A - Distributed denial-of-service attack detection method based on the C4.5 decision tree algorithm

Info

Publication number: CN108632269A
Application number: CN201810412986.1A
Authority: CN
Inventors: 刘俊杰; 王珺; 王梦林
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2018-05-02
Filing date: 2018-05-02
Publication date: 2018-10-09
Anticipated expiration: 2038-05-02
Also published as: CN108632269B

Abstract

The invention discloses a method for detecting distributed denial-of-service (DDoS) attacks in a software-defined networking environment, based on the C4.5 decision tree algorithm. The method comprises the following steps: collecting, via the OpenFlow protocol, the flow table information returned by OpenFlow switches; extracting the fields of the flow table information that are relevant to DDoS attacks and converting them into parameters that reflect changes in the network traffic distribution, which serve as attributes and form the training set of a decision tree; classifying the traffic with the C4.5 decision tree algorithm and computing the class information entropy from the classes in the training set; computing, in turn, each attribute's conditional entropy, information gain, attribute information entropy and information gain ratio; selecting the attribute with the largest information gain ratio as the root node of the decision tree, then choosing the attribute with the largest information gain ratio among the remaining attributes as a branch node, and repeating these steps until the decision tree is formed; and classifying new network traffic with the resulting decision tree to detect whether a DDoS attack is present. The invention can detect DDoS attacks more accurately.

Description

Distributed Denial of Service Attack Detection Method Based on C4.5 Decision Tree Algorithm

Technical field

The invention relates to the technical field of computer communications. It is a method for detecting denial-of-service attacks in a software-defined environment, and in particular a method for detecting distributed denial-of-service attacks based on the C4.5 decision tree algorithm.

Background

The number of network devices connected to the Internet is growing at an accelerating pace, driven not only by the surge in mobile devices but also by the development of emerging technologies. As networks keep expanding they become more complex and pose more challenges, and existing network technologies and infrastructure cannot support such increasingly complex systems. Many approaches have been proposed for designing future networks that can meet these rapidly growing demands, and software-defined networking (SDN) is one of the more important of them.

The defining feature of software-defined networking is the decoupling of the data plane from the control plane in network devices. In a traditional network, routers use routing algorithms to decide where packets are forwarded. In a software-defined network, decision-making and forwarding are separated: the controller makes the decisions and the switches forward the data. Simplified network devices and centralized management are the most practical features of software-defined networking. Although SDN has many advantages, a number of challenges still need attention. Research on SDN security remains limited, and its vulnerabilities stem from two of its characteristics: the network is controlled by software, and network intelligence is centralized in the controller. These characteristics can lead to trust problems and to failure of the single point of management. Trust problems can be addressed with authorization and authentication mechanisms for applications, whereas compromising the availability of the controller causes the single point of management to fail, and distributed denial-of-service attacks are among the most common ways of doing so. A denial-of-service attack denies system resources to legitimate users and reduces system availability. Its basic mechanism is to send a large amount of superfluous network traffic to the target so that it cannot respond to genuine service requests. If the attacker uses multiple sources, the attack is called a distributed denial-of-service attack, which is harder to deal with than a plain denial-of-service attack. One weakness of the SDN architecture in the face of such attacks is that the switches are too passive: they send every packet belonging to an unknown flow to the controller. Because the controller manages the network centrally, a distributed denial-of-service attack has catastrophic consequences if the controller becomes saturated by attack traffic.

Some methods already exist for detecting distributed denial-of-service attacks in a software-defined networking environment. For example, one approach processes packet information and uses an entropy calculation to decide whether an attack is under way; another continuously monitors packet traffic to find potential victims and attackers. These methods have a rather low detection success rate and a relatively high false-alarm frequency.

Summary of the invention

The main purpose of the invention is to overcome the shortcomings and deficiencies of the prior art by providing a distributed denial-of-service attack detection method based on the C4.5 decision tree algorithm, which achieves a higher detection success rate and a lower false-alarm rate. The specific technical solution is as follows:

A distributed denial-of-service attack detection method based on the C4.5 decision tree algorithm, for detecting distributed denial-of-service attacks in a software-defined networking environment, comprises the following steps:

S1: Collect, via the OpenFlow protocol, the flow table information returned by the OpenFlow switches;

S2: Extract the fields of the flow table information that are relevant to DDoS attacks and convert them into parameters that reflect changes in the network traffic distribution; these parameters serve as attributes and form the training set of a decision tree;

S3: Classify the network traffic with the C4.5 decision tree algorithm and compute the class information entropy from the classes in the training set;

S4: Compute, in turn, each attribute's conditional entropy, information gain, attribute information entropy and information gain ratio;

S5: Select the attribute with the largest information gain ratio as the root node of the decision tree, then select the attribute with the largest information gain ratio among the remaining attributes as a branch node, and repeat steps S3 and S4 to form the decision tree;

S6: Use the decision tree formed in step S5 to classify new network traffic and detect whether a DDoS attack is present.

In a further refinement of the invention, the attributes comprise the average number of packets per flow (ANPPF), the pair-flow ratio (PCF), the port growth rate (PGS) and the source IP growth rate (SGS). The conditional entropy of an attribute expresses the total uncertainty of the classes given that attribute and is calculated as $\mathrm{Info}(A_x) = \sum_{i=1}^{n} \frac{|D_i|}{|D|}\,\mathrm{Info}(D_i)$, where $A_x$ denotes an attribute; the training set is partitioned by that attribute into n subsets $D_1, D_2, \ldots, D_n$; n is the number of distinct cases of attribute $A_x$; $|D_i|$ is the number of samples in the i-th case out of $|D|$ samples in total; and $\mathrm{Info}(D_i)$ is the information entropy of subset $D_i$.

In a further refinement of the invention, the average number of packets per flow (ANPPF) is used to judge whether an attack using illegitimate IP addresses is present; the pair-flow ratio (PCF) expresses the interaction state during an attack, when the packets with which the victim replies cannot reach the botnet; and the port growth rate (PGS) and the source IP growth rate (SGS) change markedly while the network is under attack and can be used to judge whether an illegitimate attack is present.

In a further refinement of the invention, the network traffic comprises normal traffic and attack traffic, and the class information entropy over the two is calculated as $\mathrm{Info}(D) = -\sum_{i} \frac{|C_i|}{|C|} \log_2 \frac{|C_i|}{|C|}$, where $|C_i|$ is the number of samples of class i (normal or attack) and $|C|$ is the total number of samples.

In a further refinement of the invention, the information entropy of an attribute characterizes how the attribute splits the training set and is calculated as $H(A_x) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log_2 \frac{|D_i|}{|D|}$; the information gain is calculated as $\mathrm{Gain}(A_x) = \mathrm{Info}(D) - \mathrm{Info}(A_x)$; and the information gain ratio, which supplements the plain use of information gain, is calculated as $\mathrm{IGR}(A_x) = \mathrm{Gain}(A_x) / H(A_x)$.
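As a worked illustration of these quantities, consider a hypothetical training set (the counts are made up for the example and do not come from the patent): 8 samples, 4 attack and 4 normal, and an attribute $A_x$ with two cases. Case $D_1$ holds 3 attack samples and 1 normal sample; case $D_2$ holds 1 attack sample and 3 normal samples. The calculation assumes the standard C4.5 forms of the class entropy and of the attribute (split) entropy.

```latex
\begin{aligned}
\mathrm{Info}(D)   &= -\tfrac{4}{8}\log_2\tfrac{4}{8} - \tfrac{4}{8}\log_2\tfrac{4}{8} = 1 \\
\mathrm{Info}(D_1) &= \mathrm{Info}(D_2)
                    = -\tfrac{3}{4}\log_2\tfrac{3}{4} - \tfrac{1}{4}\log_2\tfrac{1}{4} \approx 0.811 \\
\mathrm{Info}(A_x) &= \tfrac{4}{8}\,(0.811) + \tfrac{4}{8}\,(0.811) \approx 0.811 \\
\mathrm{Gain}(A_x) &= \mathrm{Info}(D) - \mathrm{Info}(A_x) \approx 1 - 0.811 = 0.189 \\
H(A_x)             &= -\tfrac{4}{8}\log_2\tfrac{4}{8} - \tfrac{4}{8}\log_2\tfrac{4}{8} = 1 \\
\mathrm{IGR}(A_x)  &= \mathrm{Gain}(A_x) / H(A_x) \approx 0.189
\end{aligned}
```

An attribute that split the same set into many small subsets would have a larger $H(A_x)$ and therefore a smaller gain ratio for the same gain, which is the correction the gain ratio adds over plain information gain.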

The distributed denial-of-service attack detection method of the invention, based on the C4.5 decision tree algorithm, first obtains the flow table information returned by the OpenFlow switches via the OpenFlow protocol. It then extracts the fields relevant to DDoS attacks and converts them into parameters that reflect changes in the network traffic distribution; these parameters serve as attributes and form the training set of the decision tree. Next, it classifies the network traffic with the C4.5 decision tree algorithm and computes the class information entropy of the traffic, the conditional entropy of each attribute, the information gain, the information entropy of each attribute and the information gain ratio, from which the decision tree is obtained. Finally, the decision tree classifies new data to detect whether a DDoS attack is present. Compared with the prior art, the invention detects the presence of DDoS attacks in the network with higher accuracy.

Brief description of the drawings

Fig. 1 is a flow block diagram of the attack detection method of the invention;

Fig. 2 is a schematic diagram of the software-defined networking framework of the invention;

Fig. 3 is a schematic flowchart of the attack detection method of the invention.

Detailed description

To help those skilled in the art better understand the solution of the invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the invention, not all of them; the drawings show preferred embodiments. The invention can be implemented in many different forms and is not limited to the embodiments described here; rather, these embodiments are provided so that the disclosure is understood more thoroughly and completely. All other embodiments that a person of ordinary skill in the art obtains from these embodiments without creative effort fall within the scope of protection of the invention.

Referring to Fig. 1, an embodiment of the invention provides a distributed denial-of-service attack detection method based on the C4.5 decision tree algorithm, for detecting such attacks in a software-defined networking environment. Referring to Fig. 2, the software-defined networking environment comprises network applications, an SDN controller and a data plane. The network applications exchange data with the data plane through the SDN controller, which is connected to the network applications and to the data plane through dedicated interfaces; the data plane itself contains a number of node devices. Referring to Fig. 3, the method first gathers traffic statistics from the collected flow tables, extracts the corresponding features from those statistics, derives the detection criteria from the features, then classifies subsequent new traffic against those criteria and obtains the final classification result. The method is described in detail below:

In the invention, the SDN controller uses the OpenFlow protocol to periodically send flow table request messages to all SDN switches and thereby obtains the flow table information returned by the OpenFlow switches. Specifically, the polling interval is set to 5 seconds, matching the time the controller configures for deleting flows that have not been matched recently, so that flow table information is collected more comprehensively and completely.
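The periodic collection can be sketched as a simple polling loop. This is only an illustration under stated assumptions: `request_flow_stats` is a hypothetical callable standing in for whatever flow-statistics request the controller platform provides, and the switch identifiers are placeholders; only the 5-second interval comes from the text.

```python
import time
from typing import Callable, Dict, Iterable, List

POLL_INTERVAL_S = 5.0  # interval stated in the description


def poll_flow_tables(
    switch_ids: Iterable[str],
    request_flow_stats: Callable[[str], list],  # hypothetical controller hook
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, list]]:
    """Collect one flow table snapshot per switch, once per polling round."""
    switch_ids = list(switch_ids)
    snapshots = []
    for round_index in range(rounds):
        # Ask every switch for its current flow entries.
        snapshots.append({sw: request_flow_stats(sw) for sw in switch_ids})
        # Wait one interval before the next round (not after the last one).
        if round_index < rounds - 1:
            sleep(POLL_INTERVAL_S)
    return snapshots


# Usage with a stub in place of a real controller and a recorded "sleep".
waits: List[float] = []
snaps = poll_flow_tables(["s1", "s2"], lambda sw: [sw + "-flow"], rounds=3, sleep=waits.append)
```

Passing the sleep function as a parameter keeps the sketch testable without waiting; a real controller application would more likely use its framework's own timer or event loop.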

To build a decision tree, a training set must first be formed. After the flow table information returned by the OpenFlow switches has been collected via the OpenFlow protocol, the invention extracts the fields relevant to DDoS attacks and converts them into parameters that can be used to analyze the network traffic distribution; these parameters serve as attributes and form the training set. Specifically, the attributes are the average number of packets per flow (ANPPF), the pair-flow ratio (PCF), the port growth rate (PGS) and the source IP growth rate (SGS). The average number of packets per flow is $\mathrm{ANPPF} = \frac{1}{\mathrm{FlowNum}} \sum_{i=1}^{\mathrm{FlowNum}} \mathrm{PacketsNum}_i$, where $\mathrm{PacketsNum}_i$ is the number of packets in the i-th flow within a given time interval and FlowNum is the total number of flows in that interval. The pair-flow ratio is $\mathrm{PCF} = 2 \times \mathrm{Pair} / \mathrm{FlowNum}$, where Pair is the number of pairs of interactive flows. The port growth rate is $\mathrm{PGS} = \mathrm{PortsNum} / \mathrm{interval}$, where PortsNum is the number of distinct ports within the interval and interval is the length of the time interval. The source IP growth rate is $\mathrm{SGS} = \mathrm{sIPNum} / \mathrm{interval}$, where sIPNum is the number of source IP addresses. Computing the value of each attribute yields the training set of the decision tree. Denoting the training set by D, the corresponding decision tree can be built from D; see step S3 and the subsequent operations for details.
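The four attributes can be computed from one interval's flow entries roughly as follows. This is a minimal sketch, not the patent's implementation: the `FlowEntry` record is a hypothetical, simplified view of a flow table entry; a flow is treated as interactive when its exact reverse flow is also present; and PortsNum is assumed to count distinct destination ports, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class FlowEntry:
    """Hypothetical, simplified view of one flow table entry."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    packets: int


def extract_features(flows: Iterable[FlowEntry], interval: float) -> Dict[str, float]:
    """Compute ANPPF, PCF, PGS and SGS for the flows seen in one interval."""
    flows = list(flows)
    flow_num = len(flows)
    if flow_num == 0 or interval <= 0:
        raise ValueError("need at least one flow and a positive interval")

    # ANPPF: total packets divided by the number of flows.
    anppf = sum(f.packets for f in flows) / flow_num

    # Pair: a flow A->B counts as interactive if the reverse flow B->A exists.
    keys = {(f.src_ip, f.src_port, f.dst_ip, f.dst_port) for f in flows}
    in_pair = sum(1 for (s, sp, d, dp) in keys if (d, dp, s, sp) in keys)
    pair = in_pair // 2
    pcf = 2 * pair / flow_num

    # Assumption: PortsNum counts distinct destination ports.
    ports_num = len({f.dst_port for f in flows})
    s_ip_num = len({f.src_ip for f in flows})

    return {
        "ANPPF": anppf,
        "PCF": pcf,
        "PGS": ports_num / interval,
        "SGS": s_ip_num / interval,
    }


# One two-way conversation plus two one-way flows, observed over 5 seconds.
flows = [
    FlowEntry("10.0.0.1", "10.0.0.2", 1000, 80, packets=10),
    FlowEntry("10.0.0.2", "10.0.0.1", 80, 1000, packets=8),
    FlowEntry("10.0.0.3", "10.0.0.2", 2000, 80, packets=1),
    FlowEntry("10.0.0.4", "10.0.0.2", 2001, 443, packets=1),
]
features = extract_features(flows, interval=5.0)
```

In this example half of the flows belong to a two-way pair, so PCF is 0.5; a flood of one-way flows from spoofed sources would push PCF and ANPPF down while PGS and SGS rise.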

In this embodiment, the average number of packets per flow is used because attackers usually attack by continuously generating random illegitimate IP addresses, so flows are created much faster and each flow carries fewer packets. The pair-flow ratio is used because, during an attack, the victim's reply packets cannot reach the botnet, and the ratio captures this interaction state. The port growth rate and the source IP growth rate are used because they change markedly during an attack. A decision tree built from a training set composed of these attributes can therefore combine several different criteria to judge whether a DDoS attack is present in the network traffic.

S3: Classify the network traffic with the C4.5 decision tree algorithm and compute the class information entropy from the classes in the training set; and S4: compute, in turn, each attribute's conditional entropy, information gain, attribute information entropy and information gain ratio;

In the invention, obtaining the decision tree requires finding its root node and branch nodes, which is done as follows:

First, the C4.5 decision tree algorithm divides the network traffic into classes, which in this embodiment are normal traffic and attack traffic. The class information entropy is then computed from the training set D as $\mathrm{Info}(D) = -\sum_{i} \frac{|C_i|}{|C|} \log_2 \frac{|C_i|}{|C|}$, where $|C_i|$ is the number of normal or attack traffic samples and $|C|$ is the total number of samples.

Next, the conditional entropy of each of the four attributes is computed from the classes under each attribute value: $\mathrm{Info}(A_x) = \sum_{i=1}^{n} \frac{|D_i|}{|D|}\,\mathrm{Info}(D_i)$, where $A_x$ denotes an attribute; the training set is partitioned by that attribute into n subsets $D_1, D_2, \ldots, D_n$; n is the number of distinct cases of attribute $A_x$ (for example, attribute values may be grouped into high, medium and low); $|D_i|$ is the number of samples in the i-th case out of $|D|$ samples in total; and $\mathrm{Info}(D_i)$ is the information entropy of subset $D_i$. The conditional entropy expresses the total uncertainty of the classes given a particular attribute. The information gain is then $\mathrm{Gain}(A_x) = \mathrm{Info}(D) - \mathrm{Info}(A_x)$.

The information entropy of each attribute is computed as $H(A_x) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log_2 \frac{|D_i|}{|D|}$. It serves as a split-information measure that accounts for the number and size of the branches produced when splitting on the attribute, which helps improve the accuracy of DDoS attack detection. The information gain ratio of each attribute, $\mathrm{IGR}(A_x) = \mathrm{Gain}(A_x) / H(A_x)$, supplements the plain use of information gain.

In summary, combining the class information entropy of the network traffic with each attribute's conditional entropy, information gain, attribute information entropy and information gain ratio characterizes the network traffic well, so that the presence of a DDoS attack in the traffic can be judged and predicted.
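The quantities in steps S3 and S4 can be sketched in a few lines of Python. This is an illustrative sketch assuming the standard C4.5 definitions of entropy and split information; the function names, the `discretize` helper and its thresholds are hypothetical and not taken from the patent.

```python
import math
from collections import Counter, defaultdict
from typing import Hashable, List, Sequence


def info(items: Sequence[Hashable]) -> float:
    """Shannon entropy (base 2) of the value distribution in `items`."""
    total = len(items)
    return -sum(c / total * math.log2(c / total) for c in Counter(items).values())


def conditional_info(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Info(A_x): entropy of the labels within each subset D_i, weighted by |D_i|/|D|."""
    subsets = defaultdict(list)
    for value, label in zip(values, labels):
        subsets[value].append(label)
    total = len(labels)
    return sum(len(s) / total * info(s) for s in subsets.values())


def gain_ratio(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """IGR(A_x) = Gain(A_x) / H(A_x); returns 0.0 when the attribute does not split."""
    gain = info(labels) - conditional_info(values, labels)
    split_info = info(values)  # H(A_x): entropy of the attribute's own values
    return gain / split_info if split_info > 0 else 0.0


def discretize(value: float, low_cut: float, high_cut: float) -> str:
    """Group a numeric attribute into low/medium/high (thresholds are hypothetical)."""
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "medium"
    return "high"


# Eight hypothetical samples: one attribute column and the class column.
attribute: List[str] = ["high"] * 4 + ["low"] * 4
labels: List[str] = ["attack", "attack", "attack", "normal",
                     "attack", "normal", "normal", "normal"]
ratio = gain_ratio(attribute, labels)
```

Here the split entropy H(A_x) reuses `info` on the attribute column itself, because it is the same entropy formula applied to the subset sizes rather than to the class counts.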

In the invention, the attribute with the largest information gain ratio computed in steps S3 and S4 is taken as the root node of the decision tree, and the attribute with the largest information gain ratio among the remaining attributes is taken as a branch node. Steps S3 and S4 are repeated several times, each time taking the attributes whose information gain ratios rank first and second as the root node and branch node respectively, until the decision tree is formed.
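One way to read this step is as a recursive procedure: pick the attribute with the largest gain ratio, split on it, and repeat on each subset with the remaining attributes. The sketch below follows that reading under stated assumptions: attributes are already discretized, the nested-dict tree layout and function names are hypothetical, and pruning and the handling of continuous values found in full C4.5 are omitted.

```python
import math
from collections import Counter
from typing import Dict, List, Union

Tree = Union[str, dict]


def entropy(items: List[str]) -> float:
    total = len(items)
    return -sum(c / total * math.log2(c / total) for c in Counter(items).values())


def gain_ratio(rows: List[Dict[str, str]], labels: List[str], attr: str) -> float:
    values = [r[attr] for r in rows]
    cond = 0.0
    for v in set(values):
        subset = [y for r, y in zip(rows, labels) if r[attr] == v]
        cond += len(subset) / len(labels) * entropy(subset)
    split = entropy(values)
    return (entropy(labels) - cond) / split if split > 0 else 0.0


def build_tree(rows: List[Dict[str, str]], labels: List[str], attrs: List[str]) -> Tree:
    majority = Counter(labels).most_common(1)[0][0]
    # Leaf: only one class left, or no attribute left to split on.
    if len(set(labels)) == 1 or not attrs:
        return majority
    best = max(attrs, key=lambda a: gain_ratio(rows, labels, a))
    if gain_ratio(rows, labels, best) <= 0:
        return majority
    remaining = [a for a in attrs if a != best]
    node = {"attr": best, "default": majority, "children": {}}
    for v in sorted({r[best] for r in rows}):
        idx = [i for i, r in enumerate(rows) if r[best] == v]
        node["children"][v] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return node


# Six hypothetical, already discretized training samples.
names = ["ANPPF", "PCF", "PGS", "SGS"]
raw = [
    ("low", "low", "high", "high", "attack"),
    ("low", "low", "high", "low", "attack"),
    ("high", "low", "low", "high", "attack"),
    ("high", "high", "low", "low", "normal"),
    ("high", "high", "high", "low", "normal"),
    ("low", "high", "low", "high", "normal"),
]
rows = [dict(zip(names, r[:4])) for r in raw]
labels = [r[4] for r in raw]
tree = build_tree(rows, labels, names)
```

In this toy data PCF separates the two classes perfectly, so it becomes the root and both of its branches end in leaves; with real traffic the recursion would continue into the remaining attributes.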

In this embodiment, once the decision tree has been formed it can be used to classify network traffic, thereby accurately detecting DDoS attacks in the traffic, so that countermeasures can be taken promptly after detection to keep the network running safely.
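Classifying a new traffic sample amounts to walking the tree from the root to a leaf. The sketch below assumes a hypothetical nested-dict tree in which each internal node names an attribute, maps attribute values to child nodes and carries a default class for unseen values; the tree shown is hand-written for illustration and is not a result from the patent.

```python
from typing import Dict, Union

Tree = Union[str, dict]

# Hand-written illustrative tree over discretized attributes.
EXAMPLE_TREE: Tree = {
    "attr": "PCF",
    "default": "normal",
    "children": {
        "high": "normal",
        "low": {
            "attr": "SGS",
            "default": "attack",
            "children": {"high": "attack", "low": "normal"},
        },
    },
}


def classify(tree: Tree, sample: Dict[str, str]) -> str:
    """Follow the branch matching each tested attribute until a leaf is reached."""
    node = tree
    while isinstance(node, dict):
        value = sample.get(node["attr"])
        # Fall back to the node's default class for values not seen in training.
        node = node["children"].get(value, node["default"])
    return node


verdict = classify(EXAMPLE_TREE, {"PCF": "low", "SGS": "high"})
```

A controller application could run this on the attribute values computed for each polling interval and trigger its mitigation logic whenever the verdict is the attack class.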

The above are only preferred embodiments of the invention and do not limit the scope of the patent. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in those embodiments or replace some of their technical features with equivalents. Any equivalent structure made using the contents of the specification and drawings of the invention, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the invention.

Claims

1. A distributed denial-of-service attack detection method based on the C4.5 decision tree algorithm, being a method for detecting distributed denial-of-service attacks in a software-defined networking environment, characterized in that the method comprises the following steps:

S1: Collect, via the OpenFlow protocol, the flow table information returned by the OpenFlow switches;

S2: Extract the fields of the flow table information that are relevant to DDoS attacks and convert them into parameters that reflect changes in the network traffic distribution; these parameters serve as attributes and form the training set of a decision tree;

S3: Classify the network traffic with the C4.5 decision tree algorithm and compute the class information entropy from the classes in the training set;

S4: Compute, in turn, each attribute's conditional entropy, information gain, attribute information entropy and information gain ratio;

S5: Select the attribute with the largest information gain ratio as the root node of the decision tree, then select the attribute with the largest information gain ratio among the remaining attributes as a branch node, and repeat steps S3 and S4 to form the decision tree;

S6: Use the decision tree formed in step S5 to classify new network traffic and detect whether a DDoS attack is present.

2. The distributed denial-of-service attack detection method based on the C4.5 decision tree algorithm according to claim 1, characterized in that the attributes comprise the average number of packets per flow (ANPPF), the pair-flow ratio (PCF), the port growth rate (PGS) and the source IP growth rate (SGS); the conditional entropy of an attribute expresses the total uncertainty of the classes given that attribute and is calculated as $\mathrm{Info}(A_x) = \sum_{i=1}^{n} \frac{|D_i|}{|D|}\,\mathrm{Info}(D_i)$, where $A_x$ denotes an attribute, the training set is partitioned by that attribute into n subsets $D_1, D_2, \ldots, D_n$, n is the number of distinct cases of attribute $A_x$, $|D_i|$ is the number of samples in the i-th case out of $|D|$ samples in total, and $\mathrm{Info}(D_i)$ is the information entropy of subset $D_i$.

3. The distributed denial-of-service attack detection method based on the C4.5 decision tree algorithm according to claim 2, characterized in that the average number of packets per flow (ANPPF) is used to judge whether an attack using illegitimate IP addresses is present; the pair-flow ratio (PCF) expresses the interaction state during an attack, when the packets with which the victim replies cannot reach the botnet; and the port growth rate (PGS) and the source IP growth rate (SGS) change markedly while the network is under attack and can be used to judge whether an illegitimate attack is present.

4. The distributed denial-of-service attack detection method based on the C4.5 decision tree algorithm according to claim 1, characterized in that the network traffic comprises normal traffic and attack traffic, and the class information entropy over the two is calculated as $\mathrm{Info}(D) = -\sum_{i} \frac{|C_i|}{|C|} \log_2 \frac{|C_i|}{|C|}$.

5. The distributed denial-of-service attack detection method based on the C4.5 decision tree algorithm according to claim 1, characterized in that the information entropy of an attribute characterizes how the attribute splits the training set and is calculated as $H(A_x) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log_2 \frac{|D_i|}{|D|}$; the information gain is calculated as $\mathrm{Gain}(A_x) = \mathrm{Info}(D) - \mathrm{Info}(A_x)$; and the information gain ratio, which supplements the plain use of information gain, is calculated as $\mathrm{IGR}(A_x) = \mathrm{Gain}(A_x) / H(A_x)$.