Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be understood that the technical solutions of the embodiments of the present invention can be applied to various communication systems, for example: a Global System for Mobile communications (GSM) System, a Code Division Multiple Access (CDMA) System, a Wideband Code Division Multiple Access (WCDMA) System, a General Packet Radio Service (GPRS), a Long Term Evolution (LTE) System, an LTE Frequency Division Duplex (FDD) System, an LTE Time Division Duplex (TDD) System, a Universal Mobile Telecommunications System (UMTS), or a Worldwide Interoperability for Microwave Access (WiMAX) communication System, etc.
Fig. 1 is a schematic flow diagram of a method of selecting a packet classification algorithm according to one embodiment of the invention. The method of fig. 1 may be performed by an apparatus 300 for selecting a packet classification algorithm.
110, determining a first value range of a first field of the data packet, and a set of first ranges, on the first field, of the rules in a rule set for classifying the data packet.
The first field (also called the first domain herein) of the data packet may be one of a plurality of fields of the header of the data packet, and the first value range of the first field may be the maximum range of values that the first field can take. The first value range is an inherent attribute of the first field and can be determined according to the type of the first field. For example, the port field has a value range of [0, 65535], and the IP address field has a value range of [0, 2^32 - 1].
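The relationship between a field type and its value range can be sketched as follows. This is a minimal illustration only: the names `FIELD_BITS` and `value_range` are hypothetical, and the 8-bit protocol width is taken from the IPv4 header rather than from this description.

```python
# Bit widths of the header fields (assumed names; the protocol width
# follows the 8-bit IPv4 protocol field, not this description).
FIELD_BITS = {
    "src_ip": 32,
    "dst_ip": 32,
    "src_port": 16,
    "dst_port": 16,
    "protocol": 8,
}


def value_range(field_name):
    """Return the inherent value range [0, 2^bits - 1] of a field."""
    bits = FIELD_BITS[field_name]
    return (0, (1 << bits) - 1)
```

With these assumptions, a port field yields [0, 65535] and an IP address field yields [0, 2^32 - 1], matching the examples given above.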
The number of fields in the packet header is the dimension of the rule that classifies the packet. For example, the fields of a packet header include the source IP address, destination IP address, source port number, destination port number, and protocol type. The packet classification rule for classifying the packet is as follows:
153.0.0.0/8 224.0.0.0/8 0:65535 80:80 TCP -> DROP (rule 1)
Where 153.0.0.0/8 represents the range of rule 1 on the source IP address domain, 224.0.0.0/8 represents the range of rule 1 on the destination IP address domain, 0:65535 represents the range of rule 1 on the source port domain, and 80:80 represents the range of rule 1 on the destination port domain. Rule 1 is a 5-dimensional rule: if the fields of the packet header are such that the source IP address matches 153.0.0.0/8, the destination IP address matches 224.0.0.0/8, the source port lies in 0:65535, the destination port is 80, and the protocol is TCP, then a discard operation is performed.
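As an illustration of how such a 5-dimensional rule maps to one range per domain, the following sketch parses a rule into integer ranges. It assumes the rule is written as whitespace-separated fields followed by `->` and an action; the `Rule` class and the helper names are hypothetical and are not part of the described method.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Rule:
    src_ip: tuple      # (low, high) on the source IP address domain
    dst_ip: tuple      # (low, high) on the destination IP address domain
    src_port: tuple    # (low, high) on the source port domain
    dst_port: tuple    # (low, high) on the destination port domain
    protocol: str
    action: str


def prefix_to_range(prefix):
    """Convert an IPv4 prefix such as 153.0.0.0/8 to an integer range."""
    net = ip_network(prefix)
    return int(net.network_address), int(net.broadcast_address)


def port_to_range(text):
    """Convert 'low:high' to an integer range."""
    low, high = text.split(":")
    return int(low), int(high)


def parse_rule(line):
    """Parse 'src dst sport dport proto -> action' (assumed layout)."""
    match_part, action = line.split("->")
    src, dst, sport, dport, proto = match_part.split()
    return Rule(prefix_to_range(src), prefix_to_range(dst),
                port_to_range(sport), port_to_range(dport),
                proto, action.strip())


rule1 = parse_rule("153.0.0.0/8 224.0.0.0/8 0:65535 80:80 TCP -> DROP")
```

Representing every domain as a closed integer interval makes the later steps (length ratios, intersection tests) uniform across IP and port domains.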
The rule set may include a plurality of rules, each rule in the rule set corresponding to a plurality of scopes across a plurality of domains of the data packet. Each rule may include a first range on a first domain. The set of first scopes may be represented as a set of first scopes of rules of a rule set across the first domain. For example, the number of the first ranges included in the set of first ranges of the N rules in the rule set in the first domain may be N, or may be less than N, and the embodiment of the present invention is not limited thereto.
Optionally, as another embodiment, the set of first ranges of the rules in the rule set on the first domain may be obtained by scanning the range of each rule in the rule set. For example, each first range may be looked up by query code, and the first ranges found may be saved by storage code for use in creating the first partition tree.
120, generating a first partition tree according to the first value range and the set of first ranges, wherein the interval represented by the root node of the first partition tree is the first value range, and the interval represented by a leaf node of the first partition tree is a first range.
Specifically, the first partition tree generated according to the first value range and the set of first ranges may be used to measure the uniformity of the distribution of the first ranges in the rule set. For example, the more uniform the distribution of the first ranges, the closer the first partition tree is to a full binary tree.
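The scan described above can be sketched as follows, assuming each rule is held as a tuple of (low, high) ranges, one per domain. The function name `collect_ranges` and this representation are illustrative assumptions.

```python
def collect_ranges(rules, field_index):
    """Scan every rule and return the distinct ranges on one domain.

    Identical ranges from different rules are stored once, so N rules
    yield at most N ranges.
    """
    return sorted({rule[field_index] for rule in rules})


# Two rules with two domains each; both share the range [1, 3] on domain 0.
example_rules = [((1, 3), (2, 4)), ((1, 3), (2, 2))]
first_ranges = collect_ranges(example_rules, 0)
```

Using a set reflects the remark that the set of first ranges of N rules may contain N ranges or fewer.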
According to the first value range, an interval represented by a root node of the first partition tree can be determined, and according to the first value range and a set of the first range, an interval represented by a child node of the first partition tree can be determined.
It should be understood that the first partition tree includes a root node, and the root node of the first partition tree represents an interval, and the interval represented by the root node may be a value range of the first field of the data packet. For example, if the field of the packet has four bits, the value range of the field is [0,15 ]. The interval represented by the root node can be determined to be [0,15] according to the value range of the domain.
It is also understood that the first partition tree includes at least one leaf node, each leaf node of the first partition tree may represent an interval, and at least one leaf node of the first partition tree may represent at least one interval corresponding to the at least one leaf node. The at least one interval of the at least one leaf node of the first partition tree may be a first range of the first domain for a rule of the rule set.
130, determining a first maximum balance distance according to the first partition tree, wherein the first maximum balance distance is the maximum number of subtrees included between a first subtree in which the root node of the first partition tree is located and a second subtree in which a leaf node of the first partition tree is located.
Optionally, as another embodiment, the subtree may be a binary tree that satisfies a certain condition. The balance distance may represent the number of subtrees included between a subtree where a root node of the partition tree is located and a subtree where a leaf node of the partition tree is located. The maximum balancing distance may represent the largest of all balancing distances of the partition tree. The subtree where the root node is located, the subtree where the leaf node is located, and the included subtree may be the same or different, and may also be a binary tree that satisfies a certain condition at the same time.
The first subtree in which the root node of the first partition tree is located, the second subtree in which a leaf node of the first partition tree is located, and the subtrees included between the first subtree and the second subtree may be the same or different, and may all be binary trees that satisfy a certain condition.
Each leaf node of the first partition tree corresponds to a balancing distance, and the maximum value of the balancing distances may be the first maximum balancing distance.
140, selecting, according to the first maximum balance distance, a packet classification algorithm for classifying the data packet.
Optionally, as another embodiment, the decision tree-based packet classification algorithm is a packet classification algorithm with better performance. The decision tree-based packet classification algorithm may include a HyperSplit algorithm and a HyperCuts algorithm, and the method of selecting the packet classification algorithm may be to select a suitable packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm to classify the packets.
Optionally, as another embodiment, a packet classification algorithm is selected according to the first maximum balance distance, the packet classification algorithm may be selected by analyzing the first maximum balance distance, or the first maximum balance distance may be compared with a certain threshold, and the packet classification algorithm is selected according to a result of the comparison, which is not limited in this embodiment of the present invention.
The embodiment of the invention selects between the data packet classification algorithms according to the range distribution condition of the rule set, only needs to create the partition tree, and avoids the operation of respectively creating the decision tree for each data packet classification algorithm. The data packet classification algorithm can be rapidly selected according to the partition tree, and the efficiency of the method for selecting the data packet classification algorithm is improved.
Fig. 2 is a schematic flow chart diagram of a method of selecting a packet classification algorithm according to another embodiment of the invention. As shown in fig. 2, optionally, the method further includes:
150, determining a second value range of a second domain of the data packet and a set of second ranges of rules in a rule set for classifying the data packet in the second domain;
160, generating a second partition tree according to the second value range and the set of the second range, where an interval represented by a root node of the second partition tree is the second value range, and an interval represented by a leaf node of the second partition tree is the second range;
170, determining a second maximum balance distance according to the second partition tree, where the second maximum balance distance is the maximum number of subtrees included between a third subtree in which the root node of the second partition tree is located and a fourth subtree in which a leaf node of the second partition tree is located;
It is to be understood that the technical solution of steps 150, 160 and 170 is the same as that of steps 110, 120 and 130; to avoid repetition, the detailed description of determining the second maximum balance distance is omitted herein.
Wherein 140, selecting a packet classification algorithm for classifying the packet according to the first maximum balance distance comprises:
140A, determining the greater of the first maximum balancing distance and the second maximum balancing distance;
140B, based on the larger value, a packet classification algorithm for classifying the packet is selected.
Optionally, as another embodiment, a partition tree may be created for the range distribution of one domain, and the maximum balance distance of that partition tree determined. Two partition trees may also be created, one for the range distribution of each of two domains; a maximum balance distance is then determined from each of the two partition trees, and the packet classification algorithm is selected according to the greater of the two maximum balance distances. Alternatively, a plurality of partition trees may be created for the range distributions of a plurality of domains, a corresponding plurality of maximum balance distances determined from them, and the packet classification algorithm selected according to the largest of these maximum balance distances. The embodiments of the present invention are not limited in this respect.
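Combining the per-domain results reduces to taking the largest maximum balance distance. A minimal sketch, assuming the distances have already been computed and are keyed by a hypothetical domain name:

```python
def decisive_balance_distance(distances):
    """Return the largest of the per-domain maximum balance distances."""
    if not distances:
        raise ValueError("at least one domain is required")
    return max(distances.values())


# Illustrative values only; they are not taken from a real rule set.
d = decisive_balance_distance({"src_ip": 3, "dst_ip": 5})
```

The same function covers the one-domain, two-domain and multi-domain variants described above.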
The embodiment of the invention selects between the data packet classification algorithms according to the distribution condition of the rule set in the range of a plurality of domains, and creates a plurality of corresponding segmentation trees. And respectively determining a plurality of corresponding maximum balance distances, and selecting a data packet classification algorithm according to the maximum value of the maximum balance distances. The accuracy of the method for selecting the packet classification algorithm can be improved. The method avoids the operation of respectively creating the decision tree for each data packet classification algorithm, and improves the efficiency and the accuracy of the method for selecting the data packet classification algorithm.
Optionally, as another embodiment, the first domain may be a source IP address domain and the second domain may be a destination IP address domain, or the first domain may be a destination IP address domain and the second domain may be a source IP address domain. Two corresponding partition trees are generated according to the first ranges and the second ranges corresponding to the source IP address domain and the destination IP address domain, respectively. According to the two partition trees, the maximum balance distance D_src corresponding to the source IP address domain and the maximum balance distance D_dst corresponding to the destination IP address domain are determined, respectively. A packet classification algorithm is then selected to classify the data packet according to the greater of the two maximum balance distances.
In the method for selecting a packet classification algorithm provided in the embodiment of the present invention, partition trees are respectively created according to a range corresponding to a source IP address field and a range corresponding to a destination IP address field, and a larger value of two maximum balance distances corresponding to the two partition trees is determined. And comparing the larger value with the number of rules corresponding to the narrow range meeting the conditions selected from the range corresponding to the source IP address field and the range corresponding to the target IP address field, and selecting a data packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm. The method avoids the operation of respectively creating the decision tree for each data packet classification algorithm, can quickly select the data packet classification algorithm, and improves the efficiency of the method for selecting the data packet classification algorithm.
Fig. 3 is a schematic flow chart diagram of a method of selecting a packet classification algorithm according to another embodiment of the invention. As shown in fig. 3, generating a first partition tree according to the first value range and the set of the first range includes:
121, generating a root node of the first partition tree according to the first value range.
Specifically, each node of the first partition tree represents an interval, and the first value range may be used as an interval represented by a root node of the first partition tree. For example, if the field of the packet has four bits, the value range of the field is [0,15 ]. The interval represented by the root node can be determined to be [0,15] according to the value range of the domain.
122, determining a first division point of the root node of the first partition tree according to the first value range.
Optionally, the first dividing point may select one value from the values included in the first value range as the first dividing point according to experience, may also take a middle value of the first value range as the first dividing point, and may also determine the first dividing point through calculation, which is not limited in the embodiment of the present invention.
Optionally, as another embodiment, the first division point may be calculated as:
$I_m = \lfloor (I_{min} + I_{max}) / 2 \rfloor$
wherein I_m is the first division point, I_max is the maximum value of the first value range, and I_min is the minimum value of the first value range. For example, the first value range [0,15] can be represented as [I_min, I_max], wherein I_min = 0 and I_max = 15; calculating according to the above formula gives I_m = 7.
123, selecting a set of first narrow ranges from the set of first ranges, wherein the ratio of the length of a first narrow range to the length of the first value range is smaller than a first value.
The first partition tree may be generated from those first ranges in the set that are short, referred to as first narrow ranges. A first narrow range may satisfy the condition that the ratio of its length to the length of the first value range is smaller than a first value. The narrow range may also be defined by other methods, and the embodiments of the present invention are not limited thereto.
Optionally, as another embodiment, when the first field is a source IP address field or a destination IP address field, the first value may be 0.05; when the first field is a port field or a protocol field, the first value may be 0.5. For example, if the first range of a rule in the rule set is denoted as (F_L, F_H), the condition for the first range to be a first narrow range may be: 1) for the source IP address field or the destination IP address field, (F_H - F_L + 1)/len(I) < 0.05; 2) for a port field or a protocol field, (F_H - F_L + 1)/len(I) < 0.5. Here I represents the first value range of the first field of the data packet, and len(I) represents the length of the first value range.
Optionally, as another embodiment, the first value may also take other values, such as 0.045, 0.046, 0.047, 0.048, 0.049, 0.051, 0.052, and the like. Narrow ranges determined with these other values can likewise be used to generate a partition tree, from which a packet classification algorithm can be selected. The embodiments of the present invention do not limit the first value.
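The narrow-range condition can be sketched as a small predicate. It assumes that len(I) counts the values in the closed interval, that is I_max - I_min + 1; the function name `is_narrow` is illustrative.

```python
def is_narrow(rng, value_range, first_value):
    """Check (F_H - F_L + 1) / len(I) < first_value for a closed range."""
    f_low, f_high = rng
    i_min, i_max = value_range
    length_i = i_max - i_min + 1  # assumed definition of len(I)
    return (f_high - f_low + 1) / length_i < first_value


IP_RANGE = (0, 2**32 - 1)
PORT_RANGE = (0, 65535)

# A /8 prefix covers 2^24 addresses, a ratio of about 0.0039.
slash8 = (153 << 24, (153 << 24) + (1 << 24) - 1)
```

Under these assumptions a /8 prefix is narrow at the 0.05 threshold, whereas the wildcard port range 0:65535 is not narrow at the 0.5 threshold.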
124, generating child nodes of the first partition tree according to the first value range, the first division point and the first narrow ranges, wherein the interval represented by each child node intersects a first narrow range.
It should be understood that the child nodes of the first partition tree include other nodes than the root node, including leaf nodes of the first partition tree and intermediate nodes of the first partition tree.
According to the first value range and the first segmentation point, a left node and a right node of a root node of the first segmentation tree can be generated. Specifically, the first division point may divide the interval represented by the root node of the first division tree into two intervals represented by left and right nodes of the root node of the first division tree, respectively. For example, the first value range is [0,15], and the division point can be determined to be 7 according to the above formula. According to the division point 7, the intervals represented by the left and right nodes of the root node are determined to be [0,7] and [8,15], respectively.
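The split of a parent interval into left and right intervals can be sketched as below. The midpoint rule is the one implied by the example (0 and 15 give 7); the function name `split_interval` is hypothetical.

```python
def split_interval(i_min, i_max):
    """Split [i_min, i_max] at the integer midpoint into two intervals."""
    if i_min >= i_max:
        raise ValueError("a single-value interval cannot be split")
    i_m = (i_min + i_max) // 2       # division point
    return (i_min, i_m), (i_m + 1, i_max)


left, right = split_interval(0, 15)
```

The right interval starts at the division point plus one, so the two halves do not overlap.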
Optionally, as another embodiment. The generation process of the first partition tree excluding the root node and the other child nodes of the left and right nodes of the root node may be the same as the generation process of the left and right nodes of the root node, that is, the partition point is determined to divide the section of the parent node into two sections of the left and right nodes of the parent node.
The generated child nodes of the first partition tree must satisfy the condition that the interval represented by each child node intersects a first narrow range; child nodes that do not satisfy this condition are deleted. Optionally, intersection may be defined as follows: if range 1 is (F_1L, F_1H) and range 2 is (F_2L, F_2H), range 1 and range 2 are said to intersect when neither F_1H < F_2L nor F_2H < F_1L holds.
The iterative process stops when the interval represented by a child node of the first partition tree is a first narrow range. A child node whose interval is a first narrow range can be taken as a leaf node of the first partition tree. For example, the first value range is [0,15], and the division point can be determined to be 7. According to the division point 7, the intervals represented by the left and right nodes of the root node are determined to be [0,7] and [8,15], respectively. Taking these left and right nodes as parent nodes, their division points are determined to be 3 and 11, respectively; the two intervals represented by the left and right nodes of the left node are then [0,3] and [4,7], and the two intervals represented by the left and right nodes of the right node are [8,11] and [12,15]. By analogy, the intervals represented by the child nodes in the partition tree can be determined in sequence as [0,7], [8,15], [0,3], [4,7], [8,11], [12,15], [0,1], [2,3], [8,9] and [10,11]. If the set of first narrow ranges includes [2,3], [8,9] and [12,15], then, to ensure that the interval represented by each leaf node is a first narrow range and that the interval represented by each child node intersects a first narrow range, nodes that do not satisfy the intersection condition are deleted. Accordingly, the nodes representing [4,7], [0,1] and [10,11] are deleted from the sequentially determined child nodes, and the intervals represented by the leaf nodes of the first partition tree are determined to be [2,3], [8,9] and [12,15].
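The intersection definition translates directly into a predicate on two closed ranges. The function name `intersects` is illustrative.

```python
def intersects(range1, range2):
    """Two closed ranges intersect unless one ends before the other starts."""
    f1_low, f1_high = range1
    f2_low, f2_high = range2
    return not (f1_high < f2_low or f2_high < f1_low)
```

For instance, [0,3] intersects the narrow range [2,3], while [0,1] does not, which is why a node representing [0,1] would be deleted when [2,3] is the only nearby narrow range.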
125, generating the first partition tree according to the root node of the first partition tree and the child nodes of the first partition tree, wherein the interval represented by each leaf node of the first partition tree is a first narrow range.
The embodiment of the invention selects between data packet classification algorithms according to the range distribution condition of the rule set, creates a partition tree according to the range of the rule set only, and quantizes the uniformity of range distribution through the interval represented by the leaf nodes of the partition tree. The method avoids the operation of respectively creating the decision tree for each data packet classification algorithm, can quickly select the data packet classification algorithm according to the quantization result of the partition tree, and improves the efficiency of the method for selecting the data packet classification algorithm.
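Steps 121 to 125 can be sketched end to end as follows. This is a hedged illustration, not the original implementation: the `Node` class and function names are hypothetical, the midpoint division rule is assumed, and a node is treated as a leaf when its interval is covered by a narrow range (or cannot be split further), which is one way to guarantee termination when narrow ranges are not aligned with the division points.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    low: int
    high: int
    narrow: list                      # narrow ranges intersecting [low, high]
    children: list = field(default_factory=list)


def intersects(a, b):
    return not (a[1] < b[0] or b[1] < a[0])


def build(low, high, narrow_ranges):
    """Recursively build the partition tree for [low, high]."""
    node = Node(low, high,
                [r for r in narrow_ranges if intersects((low, high), r)])
    # Assumed stop rule: the interval lies inside one narrow range.
    covered = any(r[0] <= low and high <= r[1] for r in node.narrow)
    if covered or low == high:
        return node
    mid = (low + high) // 2           # division point
    for c_low, c_high in ((low, mid), (mid + 1, high)):
        # Children that intersect no narrow range are never created,
        # which corresponds to deleting them.
        if any(intersects((c_low, c_high), r) for r in node.narrow):
            node.children.append(build(c_low, c_high, node.narrow))
    return node


def leaves(node):
    """Return the intervals of the leaf nodes, left to right."""
    if not node.children:
        return [(node.low, node.high)]
    return [leaf for child in node.children for leaf in leaves(child)]


tree = build(0, 15, [(2, 3), (8, 9), (12, 15)])
```

For the value range [0,15] and the narrow ranges [2,3], [8,9] and [12,15], the leaves of this sketch coincide with the three narrow ranges, as in the worked example.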
Alternatively, as another embodiment, the method of generating the first partition tree may be implemented by code. The specific implementation can be shown by the following codes:
Optionally, a node N of the partition tree may include at least one of the following information:
1) the interval [I_min, I_max] represented by the node; optionally, for a given domain of the rules, the interval represented by the root node of the partition tree is equal to the first value range;
2) the set N.S of all narrow ranges that intersect [I_min, I_max];
3) the number of rules N.L(r) corresponding to each narrow range r ∈ N.S, where
$\sum_{r \in N.S} N.L(r) = N.numRules$
The method for selecting the data packet classification algorithm provided by the embodiment of the invention comprises the steps of establishing a partition tree according to the range of a rule set, comparing the maximum balance distance of the partition tree with a judgment value determined according to the number of rules corresponding to a narrow range, and selecting the data packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm. The method avoids the operation of respectively creating the decision tree for each data packet classification algorithm, can quickly select the data packet classification algorithm, and improves the efficiency of the method for selecting the data packet classification algorithm.
Fig. 4 is a schematic flow chart diagram of a method of selecting a packet classification algorithm according to another embodiment of the invention. As shown in fig. 4, selecting a packet classification algorithm for classifying the packet according to the first maximum balance distance includes:
141, determining a first rule number corresponding to the first narrow range set according to the first narrow range set.
It should be understood that a rule has one narrow range on a given domain. The narrow ranges of different rules may be the same, so that the N rules of a rule set correspond to at most N narrow ranges, and one narrow range may correspond to multiple rules, N being a positive integer. For example, given rule 1: [1,3] [2,4] and rule 2: [1,3] [2,2], the two rules have a single narrow range on the first domain, namely [1,3]. This narrow range [1,3] corresponds to two rules.
Optionally, as an embodiment, the first rule number corresponding to the first narrow ranges may be obtained by scanning the first narrow ranges. For example, the first narrow ranges may be looked up by query code, and the first rule number found may be saved by storage code.
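Counting how many rules share each narrow range can be sketched with a counter. The representation of rules as tuples of (low, high) ranges and the `is_narrow` callback are assumptions made for illustration.

```python
from collections import Counter


def rules_per_narrow_range(rules, field_index, is_narrow):
    """Map each narrow range on one domain to the number of rules using it."""
    return Counter(rule[field_index] for rule in rules
                   if is_narrow(rule[field_index]))


# Rule 1: [1,3] [2,4]; rule 2: [1,3] [2,2]. Every range is treated as
# narrow here purely to mirror the example.
example = [((1, 3), (2, 4)), ((1, 3), (2, 2))]
counts = rules_per_narrow_range(example, 0, lambda r: True)
num_rules = sum(counts.values())
```

The sum over all narrow ranges gives the rule number associated with the set of narrow ranges.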
142, according to the first rule number, determining a first judgment value.
Optionally, the first judgment value may be calculated according to the number of rules of the rule set, using a formula summarized from multiple experiments; alternatively, a predetermined value obtained from prior analysis may be used as the judgment value. The present invention is not limited in this respect.
143, selecting a packet classification algorithm for classifying the packet according to the first maximum balance distance and the first decision value.
Alternatively, the packet classification algorithm may be selected by comparing the first maximum balance distance with the first judgment value, or may be selected by other calculation methods. For example, the threshold may be obtained by a mathematical operation, and the packet classification algorithm may be selected by determining the size of the threshold, but the present invention is not limited thereto.
The embodiment of the invention selects between the data packet classification algorithms according to the range distribution condition of the rule set. A partition tree is created from only the range of the rule set, from which a maximum balance distance embodying the uniformity of the range distribution is determined. And selecting a data packet classification algorithm according to the comparison result of the maximum balance distance and the judgment value. The operation of respectively creating a decision tree for each data packet classification algorithm is avoided, the data packet classification algorithm is rapidly selected, and the efficiency of the method for selecting the data packet classification algorithm is improved.
Optionally, as another embodiment. The determination method of the first judgment value may be obtained by calculation according to a first rule number, and the calculation formula may be:
wherein X is the judgment value, and numRules is the first rule number.
Optionally, as another embodiment, the decision tree-based packet classification algorithms may include a HyperSplit algorithm and a HyperCuts algorithm, and the selection method selects a suitable packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm to classify the packet. The first maximum balance distance is compared with the first judgment value: when the first maximum balance distance is greater than the first judgment value, the HyperSplit algorithm is selected; when the first maximum balance distance is smaller than or equal to the first judgment value, the HyperCuts algorithm is selected.
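The comparison rule can be sketched as below. The judgment value is passed in as a parameter because its calculation formula is not reproduced here; the function name `select_algorithm` is hypothetical.

```python
def select_algorithm(max_balance_distance, judgment_value):
    """Pick HyperSplit for uneven range distributions, else HyperCuts."""
    if max_balance_distance > judgment_value:
        return "HyperSplit"
    return "HyperCuts"
```

Equality falls on the HyperCuts side, following the "smaller than or equal to" wording of the comparison.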
According to the method for selecting the data packet classification algorithm provided by the embodiment of the invention, the partition tree is created according to the range distribution condition of the rule set. And comparing the maximum balance distance of the partition tree with a judgment value determined according to the number of rules of the rule set, and selecting a packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm. The operation of respectively creating the decision tree for each packet classification algorithm is avoided, the data packet classification algorithm can be rapidly selected, and the efficiency of the method for selecting the data packet classification algorithm is improved.
Fig. 5 is a schematic flow chart diagram of a method of selecting a packet classification algorithm according to another embodiment of the invention. As shown in fig. 5, the sub-tree is a quasi-balanced sub-tree, and determining a first maximum balanced distance according to the first partition tree includes:
131, determining the quasi-balanced subtrees contained in the first partition tree, wherein the ratio of the number of nodes of layer k+1 to the number of nodes of layer k of a quasi-balanced subtree is greater than or equal to a second value, and k is a positive integer greater than or equal to 1.
The first partition tree may include a plurality of quasi-balanced subtrees, which may be the same or different. A quasi-balanced subtree can be defined as follows:
$n_{k+1} / n_k \ge \mathrm{Bratio}, \quad k \ge 1$
wherein n_k denotes the number of nodes of layer k of the subtree and Bratio is the second value. According to the characteristics of a binary tree, the maximum value of the ratio of the number of nodes of layer k+1 to the number of nodes of layer k is 2. When Bratio is 2, the ratio of the number of nodes of layer k+1 to the number of nodes of layer k is equal to 2, and the quasi-balanced subtree is a full binary tree. When Bratio is less than 2, the ratio of the number of nodes of layer k+1 to the number of nodes of layer k is greater than or equal to Bratio, and the quasi-balanced subtree is a binary tree formed by removing some nodes from a full binary tree.
Optionally, Bratio may take any value between 1.5 and 1.8. For example, Bratio in this embodiment may take the values 1.52, 1.55, 1.65, 1.75, 1.78, and so on. Values of Bratio between 1.5 and 1.8 are preferred embodiments. Bratio may also take other values such that the quasi-balanced subtree satisfies a certain condition, and the embodiment of the present invention is not limited thereto. For example, Bratio may take values close to 1.5 or 1.8, such as 1.4, 1.45, 1.83, 1.85, etc.
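The layer-ratio condition can be checked from the node counts of consecutive layers. In this sketch, `level_counts` is an assumed input listing the number of nodes per layer from the subtree root downwards, and a single-layer subtree is treated as trivially quasi-balanced.

```python
def is_quasi_balanced(level_counts, bratio=1.5):
    """Check that every layer has at least bratio times the nodes above it."""
    return all(nxt / cur >= bratio
               for cur, nxt in zip(level_counts, level_counts[1:]))
```

A full binary tree with layers 1, 2, 4 passes even at Bratio = 2, whereas layers 1, 2, 2 fail at Bratio = 1.5 because the last ratio is only 1.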
132, performing a depth-first traversal on the first partition tree, and determining a first quasi-balanced subtree in which the root node of the first partition tree is located and a second quasi-balanced subtree in which a leaf node of the first partition tree is located.
The first quasi-balanced subtree in which the root node of the first partition tree is located may be the same as or different from the second quasi-balanced subtree in which a leaf node of the first partition tree is located. The first quasi-balanced subtree and the second quasi-balanced subtree may be binary trees that satisfy the definition of a quasi-balanced subtree.
According to the depth-first traversal method, the child nodes of the first partition tree are sequentially traversed in a depth mode, and a first quasi-balanced sub-tree where a root node of the first partition tree is located and a second quasi-balanced sub-tree where a leaf node of the first partition tree is located can be determined.
133, determining the maximum number of quasi-balanced subtrees included between the first quasi-balanced subtree and the second quasi-balanced subtree as the first maximum balanced distance.
The balancing distance of a leaf node may refer to the number of quasi-balanced subtrees passed through from the quasi-balanced subtree that includes the root node of the tree to the quasi-balanced subtree in which the leaf node is located. The maximum balancing distance may refer to the largest of the balancing distances in the binary tree.
It should be understood that the number of leaf nodes of the first partition tree in the embodiment of the present invention may be multiple. The number of the second quasi-balanced subtrees where the leaf nodes of the first partition tree are located is also multiple. The plurality of second quasi-balanced sub-trees and the first quasi-balanced sub-tree may determine a corresponding plurality of first balanced distances. The largest value among the plurality of first balance distances may be taken as the first maximum balance distance.
Optionally, as another embodiment, the method of determining the maximum balance distance may be implemented by code. The specific implementation can be shown by the following code:
The functions getBaccereTreeLeaves(N), ChildrenCount(node) and getChildren(N) may be implemented by other code. getBaccereTreeLeaves(N) determines, by checking the quasi-balanced subtree condition, the quasi-balanced subtree whose root node is N, and obtains the set of child nodes of the leaf nodes of that quasi-balanced subtree. ChildrenCount(node) obtains the number of child nodes of node. getChildren(N) obtains the child nodes of node N.
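The listing referred to above is not reproduced in this text. The following is only a minimal Python sketch of one way the maximum balance distance could be computed, not the original listing. The names `Node`, `get_children`, `next_subtree_roots` and `max_balance_distance` are hypothetical. Two assumptions are made: a quasi-balanced subtree is grown layer by layer from its root for as long as the ratio of consecutive layer sizes stays at or above Bratio, and the distance counts the subtree containing the root as 1 (subtract 1 if only the subtrees crossed after the first one are to be counted).

```python
from dataclasses import dataclass
from typing import List, Optional

BRATIO = 1.5  # assumed value; the text prefers values between 1.5 and 1.8


@dataclass
class Node:
    """Hypothetical binary partition-tree node (intervals omitted here)."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def get_children(n: Node) -> List[Node]:
    """Return the existing child nodes of n."""
    return [c for c in (n.left, n.right) if c is not None]


def next_subtree_roots(root: Node, bratio: float = BRATIO) -> List[Node]:
    """Grow the quasi-balanced subtree rooted at `root` layer by layer.

    Returns the children of its last layer, i.e. the roots of the
    quasi-balanced subtrees that follow it (empty if none follow).
    """
    layer = [root]
    while True:
        nxt = [c for n in layer for c in get_children(n)]
        # Stop when there is no next layer or the layer ratio drops below Bratio.
        if not nxt or len(nxt) / len(layer) < bratio:
            return nxt
        layer = nxt


def max_balance_distance(root: Node, bratio: float = BRATIO) -> int:
    """Largest number of quasi-balanced subtrees on any root-to-leaf path."""
    roots = next_subtree_roots(root, bratio)
    return 1 + max((max_balance_distance(r, bratio) for r in roots), default=0)
```

Under these assumptions, a full binary tree forms a single quasi-balanced subtree, while a chain of single-child nodes yields one subtree per node, so a more uneven range distribution gives a larger distance.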
According to the method for selecting the packet classification algorithm provided by the embodiment of the invention, the partition tree is created according to the ranges of the rule set. The maximum balance distance of the partition tree is compared with a first judgment value determined according to the number of rules corresponding to the narrow ranges, and a data packet classification algorithm is selected from the HyperSplit algorithm and the HyperCuts algorithm according to the comparison. The method avoids creating a decision tree separately for each packet classification algorithm, can quickly select the data packet classification algorithm, and improves the efficiency of selecting the data packet classification algorithm.
Fig. 6 is a schematic flow chart diagram of a process of a method of selecting a packet classification algorithm according to another embodiment of the invention. As shown in fig. 6, the process of the method for selecting the packet classification algorithm is as follows:
In S201, it is determined that the rules of the rule set include a range corresponding to the source IP address field and a range corresponding to the destination IP address field.
The rules of the rule set are scanned to determine the range of each rule on the source IP address field and on the destination IP address field.
The rule set may include a plurality of rules, each rule in the rule set corresponding to a plurality of ranges across a plurality of fields of the data packet. Each rule may include a range on the source IP address field.
Optionally, as another embodiment, the ranges of the rule set corresponding to the source IP address field and to the destination IP address field may be obtained by scanning the ranges of each rule in the rule set. For example, the ranges may be queried by query code, and the queried ranges may be stored by storage code for use in creating the partition trees.
In S202, a partition tree is created for each of the range corresponding to the source IP address field and the range corresponding to the destination IP address field.
From the ranges corresponding to the source IP address field, the source IP narrow ranges are selected, namely those ranges whose length divided by the length of the value range of the source IP address field is smaller than 0.05. Likewise, from the ranges corresponding to the destination IP address field, the destination IP narrow ranges are selected, namely those ranges whose length divided by the length of the value range of the destination IP address field is smaller than 0.05.
A root node of the partition tree is generated according to the value range of the source IP address field, and the interval represented by the root node is the value range of the source IP address field. For example, if a field of the packet has four bits, the value range of the field is [0,15]. The interval represented by the root node can then be determined to be [0,15] according to the value range of the field.
The middle value of the interval of the root node is taken as a division point, and the two intervals represented by the left and right nodes of the root node are generated from it. That is, the left and right nodes of the root node of the partition tree are generated. The first division point may be calculated as $I_m = \lfloor (I_{min} + I_{max}) / 2 \rfloor$,
where $I_m$ is the first division point, $I_{max}$ is the maximum value of the first value range, and $I_{min}$ is the minimum value of the first value range. For example, the first value range [0,15] can be represented as $[I_{min}, I_{max}]$ with $I_{min} = 0$ and $I_{max} = 15$, and the above formula gives $I_m = 7$. Therefore, the left node of the root node represents the interval [0,7], and the right node of the root node represents the interval [8,15].
The other child nodes of the partition tree, apart from the root node and its left and right nodes, may be generated in the same way as the left and right nodes of the root node: a division point is determined, and it divides the interval of the parent node into the two intervals of the parent node's left and right nodes. For example, the interval represented by the left node of the root node is [0,7], and the division point 3 is obtained from the formula. The intervals represented by the left and right child nodes of that left node are therefore [0,3] and [4,7], respectively.
This process is iterated until the interval represented by a child node of the partition tree is a selected source IP narrow range, and child nodes that contain no source IP narrow range are deleted. The child nodes whose intervals are source IP narrow ranges can be used as the leaf nodes of the partition tree.
Similarly, another partition tree created according to the range corresponding to the destination IP address field may be created according to the above method.
In S203, the maximum balance distance $D_{src}$ is determined from the partition tree created according to the ranges corresponding to the source IP address field, and the maximum balance distance $D_{dst}$ is determined from the partition tree created according to the ranges corresponding to the destination IP address field.
The quasi-balanced sub-trees contained in the partition tree created according to the ranges corresponding to the source IP address field are determined. A quasi-balanced sub-tree satisfies the condition that the ratio of the number of nodes in layer k+1 to the number of nodes in layer k is greater than or equal to a second value, where the second value is between 1.5 and 1.8. The partition tree may include a plurality of quasi-balanced sub-trees, and a quasi-balanced sub-tree may be defined as follows: (number of nodes in layer k+1) / (number of nodes in layer k) ≥ Bratio,
where Bratio is the second value. When Bratio is 2, the ratio of the number of nodes in layer k+1 to the number of nodes in layer k is equal to 2, and the quasi-balanced sub-tree is a full binary tree. When Bratio is less than 2, the ratio of the number of nodes in layer k+1 to the number of nodes in layer k is greater than or equal to Bratio, and the quasi-balanced sub-tree is a binary tree formed by removing some nodes from a full binary tree.
A plurality of balance distances are determined, each being the number of quasi-balanced subtrees contained between the quasi-balanced subtree where the root node is located and the quasi-balanced subtree where one of the leaf nodes is located. The largest of these balance distances is selected as the maximum balance distance $D_{src}$ of the partition tree created according to the ranges corresponding to the source IP address field.
It should be understood that the balance distance may refer to the number of quasi-balanced subtrees between the quasi-balanced subtree in which a leaf node is located and the quasi-balanced subtree that contains the root node of the tree. The maximum balance distance may refer to the largest of the balance distances in the binary tree.
It should also be understood that the partition tree in the embodiment of the present invention may have multiple leaf nodes, and accordingly multiple quasi-balanced subtrees in which those leaf nodes are located. Each quasi-balanced subtree in which a leaf node is located, together with the quasi-balanced subtree in which the root node is located, determines a corresponding balance distance. The largest value among the plurality of balance distances may be taken as the maximum balance distance.
Similarly, the maximum balance distance $D_{dst}$ of the partition tree created according to the ranges corresponding to the destination IP address field can be determined according to the above method.
In S204, the larger of $D_{src}$ and $D_{dst}$ is taken as $D_m$.
In S205, when $D_m$ and numRules satisfy the condition for HyperSplit, the HyperSplit algorithm is selected, where numRules is the number of rules corresponding to the source IP narrow ranges and the destination IP narrow ranges.
In S206, when $D_m$ and numRules satisfy the condition for HyperCuts, the HyperCuts algorithm is selected.
The decision tree-based packet classification algorithm may include a HyperSplit algorithm and a HyperCuts algorithm, and the selection method selects a suitable packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm to classify the packet.
According to the method for selecting the data packet classification algorithm provided by the embodiment of the invention, partition trees are created separately according to the ranges corresponding to the source IP address field and the ranges corresponding to the destination IP address field in the rule set, and the larger of the two maximum balance distances corresponding to the two partition trees is determined. The larger value is compared with the number of rules corresponding to those ranges, among the ranges corresponding to the source IP address field and the destination IP address field, that meet the narrow range condition, and a packet classification algorithm is selected from the HyperSplit algorithm and the HyperCuts algorithm. The method avoids creating a decision tree separately for each packet classification algorithm, can quickly select the data packet classification algorithm, and improves the efficiency of selecting the data packet classification algorithm.
The method for selecting a packet classification algorithm according to an embodiment of the present invention is described in detail above with reference to fig. 1 to 6, and the apparatus for selecting a packet classification algorithm according to an embodiment of the present invention is described in detail below with reference to fig. 7.
Fig. 7 shows a schematic block diagram of an apparatus for processing data packets according to an embodiment of the invention. The apparatus of fig. 7 may implement aspects of the method of selecting a packet classification algorithm described in fig. 1-6. As shown in fig. 7, the apparatus 300 for selecting a packet classification algorithm includes:
the first determining module 310 is configured to determine a first value range of a first domain of a data packet and a first range set of rules in a rule set classifying the data packet on the first domain.
The first field of the data packet may be one of a plurality of fields of the header of the data packet, and the first value range of the first field may be the maximum range of values that the first field can take. The first value range is an inherent attribute of the first field and can be determined according to the type of the first field. For example, the port field has a value range of [0, 65535], and the IP address field has a value range of [0, 2^32 - 1].
The number of fields in the packet header is the dimension of the rule that classifies the packet. For example, the fields of a packet header include the source IP address, destination IP address, source port number, destination port number, and protocol type. The packet classification rule for classifying the packet is as follows:
153.0.0.0/8  224.0.0.0/8  0:65535  80:80  TCP -> DROP (rule 1)
Here, 153.0.0.0/8 represents the range of rule 1 on the source IP address field, 224.0.0.0/8 represents the range of rule 1 on the destination IP address field, 0:65535 represents the range of rule 1 on the source port field, and 80:80 represents the range of rule 1 on the destination port field. Rule 1 is a 5-dimensional rule. It indicates that a discard operation is performed if the fields of the packet header satisfy all of the following: the source IP address matches 153.0.0.0/8, the destination IP address matches 224.0.0.0/8, the source port lies in 0:65535, the destination port is 80, and the protocol type is TCP.
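To make the relation between a rule and its per-field ranges concrete, the following Python sketch converts rule 1 into integer ranges. It is an illustration only: the whitespace-separated rule layout and the names `prefix_to_range`, `port_to_range` and `parse_rule` are assumptions and are not part of the described apparatus.

```python
import ipaddress
from typing import Dict, Tuple

Range = Tuple[int, int]


def prefix_to_range(prefix: str) -> Range:
    """Convert an IPv4 prefix such as '153.0.0.0/8' to an inclusive integer range."""
    net = ipaddress.ip_network(prefix)
    return int(net.network_address), int(net.broadcast_address)


def port_to_range(spec: str) -> Range:
    """Convert a 'low:high' port specification to an inclusive integer range."""
    low, high = spec.split(":")
    return int(low), int(high)


def parse_rule(text: str) -> Dict[str, object]:
    """Parse 'srcIP dstIP srcPort dstPort proto -> action' (assumed layout)."""
    src, dst, sport, dport, proto, _arrow, action = text.split()
    return {
        "src": prefix_to_range(src),
        "dst": prefix_to_range(dst),
        "sport": port_to_range(sport),
        "dport": port_to_range(dport),
        "proto": proto,
        "action": action,
    }


rule1 = parse_rule("153.0.0.0/8 224.0.0.0/8 0:65535 80:80 TCP -> DROP")
```

In this sketch, `rule1["src"]` is the range of rule 1 on the source IP address field expressed as a pair of integers inside [0, 2^32 - 1], which is the form needed when a range is compared against the intervals of a partition tree.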
The rule set may include a plurality of rules, each rule in the rule set corresponding to a plurality of ranges across a plurality of fields of the data packet. Each rule may include a first range on the first field. The set of first ranges is the set of the first ranges that the rules of the rule set have on the first field. For example, the set of first ranges of N rules on the first field may contain N first ranges, or fewer than N, and the embodiment of the present invention is not limited thereto.
Optionally, as another embodiment, the set of first ranges of the rules in the rule set on the first field may be obtained by scanning the ranges of each rule in the rule set. For example, the first ranges may be queried by query code, and the queried first ranges may be stored by storage code for use in creating the first partition tree.
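As a hedged illustration of such a scan, the Python sketch below collects the distinct ranges of a rule set on one field and counts how many rules share each range. The rule representation (a tuple with one `(low, high)` pair per field) and the name `collect_ranges` are assumptions made for this example only.

```python
from collections import Counter
from typing import Sequence, Tuple

Range = Tuple[int, int]
Rule = Tuple[Range, ...]  # assumed layout: one (low, high) pair per field


def collect_ranges(rules: Sequence[Rule], field: int) -> Counter:
    """Map each distinct range on `field` to the number of rules that use it."""
    return Counter(rule[field] for rule in rules)


# Three two-field rules; the first two share the same range on field 0.
rules = [
    ((0, 3), (80, 80)),
    ((0, 3), (0, 65535)),
    ((8, 9), (443, 443)),
]
first_ranges = collect_ranges(rules, field=0)
```

Because rules may share a range, the number of distinct first ranges (2 here) can be smaller than the number of rules (3 here), while the per-range counts still sum to the number of rules.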
The generating module 320 is configured to generate a first partition tree according to the first value range and the set of the first range, where an interval represented by a root node of the first partition tree is the first value range, and an interval represented by a leaf node of the first partition tree is the first range.
Specifically, the first partition tree generated according to the first value range and the set of first ranges may be used to measure how uniformly the first ranges of the rule set are distributed. For example, the more uniform the distribution of the first ranges, the closer the first partition tree is to a full binary tree.
According to the first value range, an interval represented by a root node of the first partition tree can be determined, and according to the first value range and a set of the first range, an interval represented by a child node of the first partition tree can be determined.
It should be understood that the first partition tree includes a root node, and the root node of the first partition tree represents an interval, and the interval represented by the root node may be a value range of the first field of the data packet. For example, if the field of the packet has four bits, the value range of the field is [0,15 ]. The interval represented by the root node can be determined to be [0,15] according to the value range of the domain.
It is also understood that the first partition tree includes at least one leaf node, each leaf node of the first partition tree may represent an interval, and at least one leaf node of the first partition tree may represent at least one interval corresponding to the at least one leaf node. The at least one interval of the at least one leaf node of the first partition tree may be a first range of the first domain for a rule of the rule set.
The second determining module 330 is configured to determine a first maximum balance distance according to the first partition tree, where the first maximum balance distance is a maximum number of subtrees included between a first subtree in which a root node of the first partition tree is located and a second subtree in which a leaf node of the first partition tree is located.
Optionally, as another embodiment, the subtree may be a binary tree that satisfies a certain condition. The balance distance may represent the number of subtrees included between a subtree where a root node of the partition tree is located and a subtree where a leaf node of the partition tree is located. The maximum balancing distance may represent the largest of all balancing distances of the partition tree. The subtree where the root node is located, the subtree where the leaf node is located, and the included subtree may be the same or different, and may also be a binary tree that satisfies a certain condition at the same time.
The first sub-tree where the root node of the first split tree is located, the second sub-tree where the leaf node of the first split tree is located, and the sub-tree included between the first sub-tree and the second sub-tree may be the same or different, and may also be a binary tree that simultaneously satisfies a certain condition.
Each leaf node of the first partition tree corresponds to a balancing distance, and the maximum value of the balancing distances may be the first maximum balancing distance.
A selecting module 340, configured to select a packet classification algorithm for classifying the packet according to the first maximum balance distance.
Optionally, as another embodiment, the decision tree-based packet classification algorithm is a packet classification algorithm with better performance. The decision tree-based packet classification algorithm may include a HyperSplit algorithm and a HyperCuts algorithm, and the means for selecting the packet classification algorithm may select a suitable packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm to classify the packets.
Optionally, as another embodiment, when a packet classification algorithm is selected according to the first maximum balance distance, the algorithm may be selected by analyzing the first maximum balance distance, or the first maximum balance distance may be compared with a certain threshold and the algorithm selected according to the result of the comparison, which is not limited in this embodiment of the present invention.
The device for selecting the data packet classification algorithm provided by the embodiment of the invention selects between the data packet classification algorithms according to the range distribution condition of the rule set, only needs to create the partition tree, and avoids the operation of respectively creating the decision tree for each data packet classification algorithm. The data packet classification algorithm can be rapidly selected according to the partition tree, and the efficiency of the method for selecting the data packet classification algorithm is improved.
Optionally, as another embodiment, in the apparatus 300 for selecting a packet classification algorithm:
the first determining module 310 is further configured to determine a second value range of a second field of the data packet and a set of second ranges, on the second field, of the rules in the rule set for classifying the data packet;
the generating module 320 is further configured to generate a second partition tree according to the second value range and the set of second ranges, where the interval represented by the root node of the second partition tree is the second value range, and the interval represented by a leaf node of the second partition tree is a second range;
the second determining module 330 is further configured to determine a second maximum balance distance according to the second partition tree, where the second maximum balance distance is the maximum number of subtrees between a third subtree where the root node of the second partition tree is located and a fourth subtree where a leaf node of the second partition tree is located.
It should be appreciated that modules 310, 320 and 330 create the second partition tree by the same process as the first partition tree, and determine the second maximum balance distance by the same process as the first maximum balance distance. To avoid repetition, the steps by which modules 310, 320 and 330 determine the second maximum balance distance are omitted here.
The selecting module 340 is configured to select, according to the first maximum balance distance, a packet classification algorithm for classifying the packet, where the selecting module includes:
340A, specifically configured to determine a larger value of the first maximum balancing distance and the second maximum balancing distance;
340B, specifically configured to select, according to the larger value, a packet classification algorithm for classifying the packet.
Optionally, as another embodiment, the apparatus 300 of the embodiment of the present invention may create one partition tree for the range distribution of one field and determine the maximum balance distance of that partition tree. Two partition trees may also be created, one for the range distribution of each of two fields; a maximum balance distance is determined from each of the two partition trees, and the packet classification algorithm is selected according to the larger of the two. Alternatively, a plurality of corresponding partition trees may be created for the range distributions of a plurality of fields, a plurality of corresponding maximum balance distances determined from them, and the packet classification algorithm selected according to the largest of these maximum balance distances, which is not limited in the embodiments of the present invention.
The apparatus for selecting the data packet classification algorithm provided by the embodiment of the invention selects among the data packet classification algorithms according to how the ranges of the rule set are distributed on a plurality of fields, and creates a plurality of corresponding partition trees. A plurality of corresponding maximum balance distances are determined, and a data packet classification algorithm is selected according to the largest of them, which can improve the accuracy of the selection. The apparatus avoids creating a decision tree separately for each data packet classification algorithm, and improves both the efficiency and the accuracy of selecting the data packet classification algorithm.
Optionally, as another embodiment, the first domain may be a source IP address domain and the second domain may be a destination IP address domain, or the first domain may be a destination IP address domain and the second domain may be a source IP address domain. Two corresponding partition trees are generated according to the first ranges and the second ranges corresponding to the source IP address field and the destination IP address field, respectively. From the two partition trees, the maximum balance distance $D_{src}$ corresponding to the source IP address field and the maximum balance distance $D_{dst}$ corresponding to the destination IP address field are determined, respectively. A data packet classification algorithm is then selected to classify the data packet according to the larger of the two maximum balance distances.
The apparatus for selecting the data packet classification algorithm provided by the embodiment of the invention creates partition trees separately according to the ranges corresponding to the source IP address field and the ranges corresponding to the destination IP address field, and determines the larger of the two maximum balance distances corresponding to the two partition trees. The larger value is compared with the number of rules corresponding to the narrow ranges selected, under the narrow range condition, from the ranges corresponding to the source IP address field and the destination IP address field, and a data packet classification algorithm is selected from the HyperSplit algorithm and the HyperCuts algorithm. The apparatus avoids creating a decision tree separately for each data packet classification algorithm, can quickly select the data packet classification algorithm, and improves the efficiency of selecting the data packet classification algorithm.
Optionally, as another embodiment, the generating module 320 generates the first partition tree as follows:
the generating module 320 is specifically configured to generate a root node of the first partition tree according to the first value range.
Specifically, each node of the first partition tree represents an interval, and the first value range may be used as an interval represented by a root node of the first partition tree. For example, if the field of the packet has four bits, the value range of the field is [0,15 ]. The interval represented by the root node can be determined to be [0,15] according to the value range of the domain.
The generating module 320 is specifically configured to determine a first splitting point of a root node of the first splitting tree according to the first value range.
Optionally, the first division point may be a value chosen empirically from the values included in the first value range, may be the middle value of the first value range, or may be determined through calculation, which is not limited in the embodiment of the present invention.
Optionally, as another embodiment, the first division point may be calculated as $I_m = \lfloor (I_{min} + I_{max}) / 2 \rfloor$,
where $I_m$ is the first division point, $I_{max}$ is the maximum value of the first value range, and $I_{min}$ is the minimum value of the first value range. For example, the first value range [0,15] can be represented as $[I_{min}, I_{max}]$ with $I_{min} = 0$ and $I_{max} = 15$, and the above formula gives $I_m = 7$.
The generating module 320 is specifically configured to select a first narrow range set from the first range set, where a ratio of a length of the first narrow range to a length of the first value range is smaller than a first value.
The first partition tree may be generated from those first ranges in the set of first ranges that are short, referred to as first narrow ranges. A first narrow range may satisfy the condition that the ratio of its length to the length of the first value range is smaller than a first value. The narrow range may also be defined by other methods, and the embodiments of the present invention are not limited thereto.
Optionally, as another embodiment, when the first field is a source IP address field or a destination IP address field, the first value may be 0.05; when the first field is a port field or a protocol field, the first value may be 0.5. For example, if a first range of a rule in the rule set is denoted as $(F_L, F_H)$, the condition for the first range to be a first narrow range may be: 1) for the source IP address field or the destination IP address field, $(F_H - F_L + 1)/len(I) < 0.05$; 2) for a port field or a protocol field, $(F_H - F_L + 1)/len(I) < 0.5$. Here, $I$ represents the first value range of the first field of the data packet, and $len(I)$ represents the length of the first value range of the first field.
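The narrow range condition can be sketched in Python as follows. The thresholds 0.05 and 0.5 and the field lengths come from the text; the function name `is_narrow` and the constant names are hypothetical.

```python
IP_FIELD_LEN = 2 ** 32    # len(I) for an IPv4 address field, [0, 2^32 - 1]
PORT_FIELD_LEN = 2 ** 16  # len(I) for a port field, [0, 65535]

IP_THRESHOLD = 0.05       # first value for source/destination IP address fields
PORT_THRESHOLD = 0.5      # first value for port or protocol fields


def is_narrow(f_low: int, f_high: int, field_len: int, threshold: float) -> bool:
    """Return True if (F_H - F_L + 1) / len(I) is smaller than the threshold."""
    return (f_high - f_low + 1) / field_len < threshold


# 153.0.0.0/8 covers 2^24 addresses, i.e. 1/256 of the IPv4 value range.
src_is_narrow = is_narrow(153 << 24, (154 << 24) - 1, IP_FIELD_LEN, IP_THRESHOLD)
# The port range 0:65535 covers the whole port value range.
any_port_is_narrow = is_narrow(0, 65535, PORT_FIELD_LEN, PORT_THRESHOLD)
```

With these inputs, the /8 prefix qualifies as a narrow range on the IP address field, whereas the wildcard port range 0:65535 does not qualify on the port field.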
Optionally, as another embodiment, the first value may also take other values, such as 0.045, 0.046, 0.047, 0.048, 0.049, 0.051, 0.052, and the like. It suffices that the short narrow ranges determined with such a value allow a partition tree to be generated from which a packet classification algorithm can be accurately selected. The value of the first value is not limited in the embodiments of the present invention.
The generating module 320 is specifically configured to generate the child nodes of the first partition tree according to the first value range, the first division point, and the set of first narrow ranges, where the interval represented by each child node intersects a first narrow range.
It should be understood that the child nodes of the first partition tree include other nodes than the root node, including leaf nodes of the first partition tree and intermediate nodes of the first partition tree.
According to the first value range and the first segmentation point, a left node and a right node of a root node of the first segmentation tree can be generated. Specifically, the first division point may divide the interval represented by the root node of the first division tree into two intervals represented by left and right nodes of the root node of the first division tree, respectively. For example, the first value range is [0,15], and the division point can be determined to be 7 according to the above formula. According to the division point 7, the intervals represented by the left and right nodes of the root node are determined to be [0,7] and [8,15], respectively.
Optionally, as another embodiment, the other child nodes of the first partition tree, apart from the root node and its left and right nodes, may be generated in the same way as the left and right nodes of the root node: a division point is determined, and it divides the interval of the parent node into the two intervals of the parent node's left and right nodes.
The generated child nodes of the first partition tree satisfy the condition that the interval represented by each child node intersects a first narrow range, and child nodes that do not satisfy this condition are deleted. Optionally, intersection may be defined as follows: given range 1 $(F_{1L}, F_{1H})$ and range 2 $(F_{2L}, F_{2H})$, range 1 and range 2 are said to intersect if neither $F_{1H} < F_{2L}$ nor $F_{2H} < F_{1L}$ holds.
The iterative process stops when the interval represented by a child node of the first partition tree is a first narrow range. A child node whose interval is a first narrow range can be taken as a leaf node of the first partition tree. For example, the first value range is [0,15], and the division point can be determined to be 7. According to the division point 7, the intervals represented by the left and right nodes of the root node are determined to be [0,7] and [8,15], respectively. Taking these left and right nodes as parent nodes, their division points are determined to be 3 and 11, respectively, so the two intervals represented by the left and right nodes of the left node are [0,3] and [4,7], and the two intervals represented by the left and right nodes of the right node are [8,11] and [12,15]. Continuing in the same way, the intervals represented by the child nodes in the partition tree can be determined in turn as [0,7], [8,15], [0,3], [4,7], [8,11], [12,15], [0,1], [2,3], [8,9] and [10,11]. If the first narrow ranges are [2,3], [8,9] and [12,15], then, to ensure that the interval represented by each leaf node of the first partition tree is a first narrow range and that the interval represented by each child node intersects a first narrow range, the nodes that do not satisfy the intersection condition are deleted. The child nodes with the intervals [4,7], [0,1] and [10,11] are therefore deleted, and the intervals represented by the leaf nodes of the first partition tree are determined to be [2,3], [8,9] and [12,15].
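The worked example above can be reproduced with the following Python sketch. It is a minimal illustration under stated assumptions rather than the implementation of the generating module 320: the names `Node`, `intersects`, `build` and `leaves` are hypothetical, the division point is taken as the floor of the midpoint, and a node becomes a leaf as soon as its interval lies entirely inside one narrow range (which, for narrow ranges aligned to the splits as in the example, is the point where the interval equals the narrow range).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Interval = Tuple[int, int]


@dataclass
class Node:
    lo: int
    hi: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def intersects(a: Interval, b: Interval) -> bool:
    """Two closed ranges intersect unless one ends before the other starts."""
    return not (a[1] < b[0] or b[1] < a[0])


def build(lo: int, hi: int, narrow: List[Interval]) -> Optional[Node]:
    """Build the partition (sub)tree for [lo, hi]; None means the node is deleted."""
    hits = [r for r in narrow if intersects((lo, hi), r)]
    if not hits:
        return None  # interval intersects no narrow range: delete the node
    node = Node(lo, hi)
    # Leaf: interval covered by a narrow range (assumption), or cannot be split.
    if lo == hi or any(r[0] <= lo and hi <= r[1] for r in hits):
        return node
    mid = (lo + hi) // 2  # division point, floor of the midpoint
    node.left = build(lo, mid, hits)
    node.right = build(mid + 1, hi, hits)
    return node


def leaves(node: Optional[Node]) -> List[Interval]:
    """Collect leaf intervals from left to right."""
    if node is None:
        return []
    if node.left is None and node.right is None:
        return [(node.lo, node.hi)]
    return leaves(node.left) + leaves(node.right)


tree = build(0, 15, [(2, 3), (8, 9), (12, 15)])
leaf_intervals = leaves(tree)
```

For the value range [0,15] and the narrow ranges [2,3], [8,9] and [12,15], the nodes for [4,7], [0,1] and [10,11] are never kept, and `leaf_intervals` holds the three narrow ranges in left-to-right order.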
The generating module 320 is specifically configured to generate the first partition tree according to the root node of the first partition tree and the child nodes of the first partition tree, where the interval represented by the leaf node of the first partition tree is the first narrow range.
The device for selecting the data packet classification algorithm provided by the embodiment of the invention selects among the data packet classification algorithms according to the range distribution condition of the rule set, creates the partition tree according to the range of the rule set only, and quantizes the uniformity of the range distribution through the interval represented by the leaf node of the partition tree. The device avoids the operation of respectively creating a decision tree for each data packet classification algorithm, can quickly select the data packet classification algorithm according to the quantization result of the partition tree, and improves the efficiency of the method for selecting the data packet classification algorithm.
Alternatively, as another embodiment, the method of generating the first partition tree of the generating module may be implemented by code. The specific implementation can be shown by the following codes:
Optionally, a child node N may include at least one of the following items of information:
1) the interval represented by the node, N.[I_min, I_max]; optionally, for a given range in the rules, the interval represented by the root node of the partition tree is equal to the first value range;
2) the set N.S of all narrow ranges that intersect N.[I_min, I_max];
3) the rule counts N.L(r), that is, the first number of rules corresponding to each first narrow range r ∈ N.S, which satisfy
$$\sum_{r \in N.S} N.L(r) = N.\mathrm{numRules}$$
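The node information listed above can be pictured with a small data structure. The sketch below is an assumption-laden illustration: the class name `PartitionNode`, the helper `make_node`, and the choice to store N.L as a dictionary keyed by narrow range are all hypothetical and not taken from the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Range = Tuple[int, int]  # closed interval (low, high)


@dataclass
class PartitionNode:
    i_min: int  # lower bound of N.[I_min, I_max]
    i_max: int  # upper bound of N.[I_min, I_max]
    # N.L: number of rules for each narrow range intersecting this node
    rule_counts: Dict[Range, int] = field(default_factory=dict)

    @property
    def narrow_set(self) -> Set[Range]:
        """N.S: the narrow ranges that intersect the node interval."""
        return set(self.rule_counts)

    @property
    def num_rules(self) -> int:
        """N.numRules: the sum of N.L(r) over all r in N.S."""
        return sum(self.rule_counts.values())


def make_node(i_min: int, i_max: int, all_counts: Dict[Range, int]) -> PartitionNode:
    """Keep only the narrow ranges that intersect [i_min, i_max]."""
    kept = {r: n for r, n in all_counts.items()
            if not (r[1] < i_min or i_max < r[0])}
    return PartitionNode(i_min, i_max, kept)


all_counts = {(2, 3): 4, (8, 9): 1, (12, 15): 2}  # illustrative rule counts
root = make_node(0, 15, all_counts)
left = make_node(0, 7, all_counts)
print(root.num_rules, left.narrow_set, left.num_rules)
```

Deriving `num_rules` from the stored counts keeps the sum relation true by construction, rather than storing a separate field that could drift out of step.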
the device for selecting the data packet classification algorithm provided by the embodiment of the invention creates the partition tree according to the range of the rule set, compares the maximum balance distance of the partition tree with the judgment value determined according to the number of rules corresponding to the narrow range, and selects the packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm. The device avoids the operation of respectively creating a decision tree for each packet classification algorithm, can quickly select the packet classification algorithm with better performance, and improves the efficiency of the method for selecting the data packet classification algorithm.
Optionally, as another embodiment, the selection module 340 is configured to determine, according to the set of first narrow ranges, a first rule number corresponding to the set of first narrow ranges.
It should be understood that a rule has one narrow range on a given domain. The narrow ranges of different rules may be the same, so the N rules of a rule set correspond to at most N narrow ranges, and one narrow range may correspond to multiple rules, N being a positive integer. For example, given rule 1: [1,3] [2,4] and rule 2: [1,3] [2,2], the two rules have a single narrow range on the first domain, namely [1,3]. This narrow range [1,3] corresponds to two rules.
Optionally, as an embodiment, the first rule number corresponding to a first narrow range may be obtained by scanning the first narrow ranges. For example, the first narrow ranges may be looked up by query code, and the first rule numbers found may be saved by storage code.
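A scan of this kind might look like the sketch below. The function name `narrow_range_counts` is hypothetical, and two details are assumptions: the length of a closed range [lo, hi] is taken as hi - lo + 1, and a range counts as narrow when the ratio of its length to the length of the value range is below a threshold called `first_value` here.

```python
from collections import Counter
from typing import Iterable, Tuple

Range = Tuple[int, int]  # closed interval (low, high)


def narrow_range_counts(rule_ranges: Iterable[Range],
                        value_range: Range,
                        first_value: float) -> Counter:
    """Count how many rules share each narrow range on one domain."""
    total = value_range[1] - value_range[0] + 1  # assumed length measure
    counts: Counter = Counter()
    for lo, hi in rule_ranges:
        # A range is narrow when its relative length is below the threshold.
        if (hi - lo + 1) / total < first_value:
            counts[(lo, hi)] += 1
    return counts


# Rule 1 and rule 2 both use [1,3] on the first domain; a third,
# illustrative rule covers the whole value range and is not narrow.
first_domain_ranges = [(1, 3), (1, 3), (0, 15)]
counts = narrow_range_counts(first_domain_ranges, (0, 15), 0.5)
print(dict(counts))
```

Here the two rules collapse onto the single narrow range [1,3] with a rule count of two, which matches the example in the text.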
The selecting module 340 is configured to determine a first judgment value according to the first rule number.
Optionally, the first judgment value may be calculated from the number of rules in the rule set, using a formula summarized from multiple experiments; alternatively, a predetermined value obtained by prior analysis may be used as the judgment value. The present invention is not limited in this respect.
The selecting module 340 is configured to select a packet classification algorithm for classifying the data packet according to the first maximum balance distance and the first judgment value.
Alternatively, the packet classification algorithm may be selected by comparing the first maximum balance distance with the first judgment value, or may be selected by other calculation methods. For example, the threshold may be obtained by a mathematical operation, and the packet classification algorithm may be selected by determining the size of the threshold, but the present invention is not limited thereto.
The device for selecting the data packet classification algorithm provided by the embodiment of the invention selects the data packet classification algorithm according to the range distribution condition of the rule set. A partition tree is created from only the range of the rule set, from which a maximum balance distance embodying the uniformity of the range distribution is determined. And selecting a data packet classification algorithm according to the comparison result of the maximum balance distance and the judgment value. The operation of respectively creating a decision tree for each data packet classification algorithm is avoided, the data packet classification algorithm is rapidly selected, and the efficiency of the method for selecting the data packet classification algorithm is improved.
Optionally, as another embodiment, the method for determining the first determination value may be obtained by calculating according to a first rule number, and the calculation formula may be:
wherein X is the judgment value, and numRules is the first rule number.
Optionally, as another embodiment, the decision tree-based packet classification algorithms may include a HyperSplit algorithm and a HyperCuts algorithm, and the selection method selects a suitable packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm to classify the data packet. The first maximum balance distance is compared with the first judgment value: when the first maximum balance distance is greater than the first judgment value, the HyperSplit algorithm is selected; when the first maximum balance distance is less than or equal to the first judgment value, the HyperCuts algorithm is selected.
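The comparison rule can be written as a short function. This is only a sketch: the name `select_algorithm` is hypothetical, and the judgment value is passed in as a parameter because its formula is not reproduced in this text.

```python
def select_algorithm(max_balance_distance: float, judgment_value: float) -> str:
    """Pick HyperSplit when the distance exceeds the judgment value, else HyperCuts."""
    if max_balance_distance > judgment_value:
        return "HyperSplit"
    return "HyperCuts"  # covers the "less than or equal" case


print(select_algorithm(5, 3))
print(select_algorithm(3, 3))
```

The equality case falls to HyperCuts, as stated in the paragraph above.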
According to the device for selecting the data packet classification algorithm, provided by the embodiment of the invention, the partition tree is created according to the range distribution condition of the rule set. And comparing the maximum balance distance of the partition tree with a judgment value determined according to the number of rules of the rule set, and selecting a packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm. The operation of respectively creating the decision tree for each packet classification algorithm is avoided, the data packet classification algorithm can be rapidly selected, and the efficiency of the method for selecting the data packet classification algorithm is improved.
Optionally, as another embodiment, the second determining module 330 is configured to determine, according to the first partition tree, the quasi-balanced subtrees contained in the first partition tree, where the ratio of the number of nodes at layer k+1 of a quasi-balanced subtree to the number of nodes at layer k is greater than or equal to a second value, and k is a positive integer greater than or equal to 1.
The first partition tree may include a plurality of quasi-balanced sub-trees, wherein each quasi-balanced sub-tree may be the same or different. The quasi-balanced subtree can be defined as follows:
where Bratio is the second value. By the properties of a binary tree, the maximum ratio of the number of nodes at layer k+1 to the number of nodes at layer k is 2. When Bratio is 2, the ratio of the number of nodes at layer k+1 to the number of nodes at layer k is equal to 2, and the quasi-balanced subtree is a full binary tree. When Bratio is less than 2, the ratio of the number of nodes at layer k+1 to the number of nodes at layer k is greater than or equal to Bratio, and the quasi-balanced subtree is a binary tree formed by removing some nodes from a full binary tree.
Optionally, Bratio may take any value between 1.5 and 1.8. For example, Bratio in this embodiment may be 1.53, 1.55, 1.6, 1.65, 1.75, and so on. Values of Bratio between 1.5 and 1.8 are preferred. Bratio may also take other values such that the quasi-balanced subtree satisfies a certain condition, and the embodiment of the present invention is not limited in this respect. For example, Bratio may take values close to 1.5 or 1.8, such as 1.45, 1.48, 1.83 or 1.85.
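The layer-ratio condition can be checked directly from the number of nodes on each layer. The sketch below assumes the subtree is summarized as a list of per-layer node counts, starting from its root layer; the function name `is_quasi_balanced` is hypothetical.

```python
from typing import Sequence


def is_quasi_balanced(layer_counts: Sequence[int], bratio: float = 1.5) -> bool:
    """True when every layer k+1 has at least bratio times the nodes of layer k."""
    return all(nxt / cur >= bratio
               for cur, nxt in zip(layer_counts, layer_counts[1:]))


print(is_quasi_balanced([1, 2, 4, 8], bratio=2.0))  # full binary tree
print(is_quasi_balanced([1, 2, 3], bratio=1.5))     # 3/2 meets the bound exactly
print(is_quasi_balanced([1, 2, 3], bratio=1.6))     # 3/2 falls below 1.6
```

A larger Bratio is therefore a stricter condition, with Bratio = 2 admitting only full binary trees.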
The second determining module 330 is configured to perform depth-first traversal on the first partition tree, and determine a first quasi-balanced sub-tree in which a root node of the first partition tree is located and a second quasi-balanced sub-tree in which a leaf node of the first partition tree is located.
The first quasi-balanced subtree where the root node of the first partition tree is located may be the same as or different from the second quasi-balanced subtree where a leaf node of the first partition tree is located. The first quasi-balanced subtree and the second quasi-balanced subtree may be binary trees that satisfy the definition of a quasi-balanced subtree.
According to the depth-first traversal method, the child nodes of the first partition tree are sequentially traversed in a depth mode, and a first quasi-balanced sub-tree where a root node of the first partition tree is located and a second quasi-balanced sub-tree where a leaf node of the first partition tree is located can be determined.
The second determining module 330 is configured to determine that the maximum number of quasi-balanced subtrees included between the first quasi-balanced subtree and the second quasi-balanced subtree is the first maximum balanced distance.
The balance distance may refer to the number of quasi-balanced subtrees between the quasi-balanced subtree in which a leaf node is located and the quasi-balanced subtree that includes the root node of the tree. The maximum balance distance may refer to the largest of the balance distances in the binary tree.
It should be understood that the number of leaf nodes of the first partition tree in the embodiment of the present invention may be multiple. The number of the second quasi-balanced subtrees where the leaf nodes of the first partition tree are located is also multiple. The plurality of second quasi-balanced sub-trees and the first quasi-balanced sub-tree may determine a corresponding plurality of first balanced distances. The largest value among the plurality of first balance distances may be taken as the first maximum balance distance.
Optionally, as another embodiment. The method of determining the maximum balancing distance may be implemented by code. The specific implementation can be shown by the following codes:
The functions getBaccereTreeLeaves(N), ChildrenCount(node) and getChildren(N) may be implemented by other code. getBaccereTreeLeaves(N) judges whether a subtree satisfies the quasi-balanced subtree condition and obtains the set of child nodes of the leaf nodes of the quasi-balanced subtree rooted at N. ChildrenCount(node) obtains the number of child nodes of the node. getChildren(N) obtains the child nodes of node N.
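One way to compute a maximum balance distance along these lines is sketched below. It is not the code of the embodiment, and several points are assumptions: the quasi-balanced subtree rooted at a node is grown layer by layer until the layer ratio drops below Bratio, the nodes just below it each start a new subtree, and the distance counts how many times such a boundary is crossed on the way from the root to a leaf. The names `Node`, `below_balanced_subtree` and `max_balance_distance` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    children: List["Node"] = field(default_factory=list)


def below_balanced_subtree(root: Node, bratio: float) -> List[Node]:
    """Return the children of the leaves of the quasi-balanced subtree rooted at `root`."""
    layer = [root]
    while True:
        nxt = [c for n in layer for c in n.children]
        if not nxt:
            return []      # the subtree reaches the leaves of the whole tree
        if len(nxt) / len(layer) < bratio:
            return nxt     # ratio broken: these nodes start new subtrees
        layer = nxt


def max_balance_distance(root: Node, bratio: float = 1.5) -> int:
    """Largest number of subtree boundaries crossed from the root to any leaf."""
    below = below_balanced_subtree(root, bratio)
    if not below:
        return 0
    return 1 + max(max_balance_distance(n, bratio) for n in below)


# Tree with layer sizes 1, 2, 3, 2 (shape of the [0,15] example's partition tree).
example = Node([
    Node([Node([Node()])]),          # [0,7] -> [0,3] -> [2,3]
    Node([Node([Node()]), Node()]),  # [8,15] -> [8,11] -> [8,9], and [12,15]
])
print(max_balance_distance(example, bratio=1.5))
print(max_balance_distance(example, bratio=1.6))
```

With Bratio = 1.5 the first three layers form one quasi-balanced subtree, so only one boundary is crossed; with Bratio = 1.6 the root's subtree ends one layer earlier and a second boundary appears below it.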
According to the device for selecting the data packet classification algorithm, provided by the embodiment of the invention, the partition tree is created according to the range of the rule set. And comparing the maximum balance distance of the partition tree with a first judgment value determined according to the rule number corresponding to the narrow range, and selecting a data packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm. The operation of respectively creating the decision tree for each packet classification algorithm is avoided, the data packet classification algorithm can be rapidly selected, and the efficiency of the method for selecting the data packet classification algorithm is improved.
It should be further noted that each component in fig. 7 may be implemented by hardware, or implemented by software on a hardware basis.
Fig. 8 is a block diagram of a selection device 80 for selecting a packet classification algorithm according to another embodiment of the invention. The selection device 80 of fig. 8 comprises a processor 81 and a memory 82. The processor 81 and the memory 82 are connected by a bus system 83.
The processor 81 controls the operation of the selection device 80. The memory 82 may include a read-only memory and a random access memory, and provides instructions and data to the processor 81. A portion of the memory 82 may also include non-volatile random access memory (NVRAM). The various components of the selection device 80 are coupled together by a bus system 83, wherein the bus system 83 may include a power bus, a control bus, a status signal bus, etc., in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 83 in the figures.
The processor 81 may be an integrated circuit chip having signal processing capabilities. The processor 81 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The processor 81 reads the information in the memory 82 and, in conjunction with its hardware, controls the various components of the selection device 80.
The methods of fig. 1-6 may be implemented in the selection device 80 of fig. 8, or the apparatus for selecting a packet classification algorithm of fig. 7 may be implemented by the selection device 80 of fig. 8. Details are not repeated here.
In particular, the processor 81 is configured to invoke, via the bus system 83, code stored in the memory 82 to: determine a first value range of a first domain of a data packet and a set of first ranges, on the first domain, of rules in a rule set for classifying the data packet; generate a first partition tree according to the first value range and the set of first ranges, where the interval represented by the root node of the first partition tree is the first value range, and the interval represented by a leaf node of the first partition tree is a first range; determine a first maximum balance distance of the first partition tree, the first maximum balance distance being the maximum number of subtrees included between a first subtree where the root node of the first partition tree is located and a second subtree where a leaf node of the first partition tree is located; and select, according to the first maximum balance distance, a packet classification algorithm for classifying the data packet.
According to the selection device for selecting the data packet classification algorithm, provided by the embodiment of the invention, the partition tree is created according to the range distribution condition of the rule set, the maximum balance distance of the partition tree is compared with the judgment value determined according to the rule number of the rule set, and the packet classification algorithm is selected from the HyperSplit algorithm and the HyperCuts algorithm, so that the operation of respectively creating a decision tree for each packet classification algorithm is avoided, the packet classification algorithm with better performance can be quickly selected, and the efficiency of the method for selecting the data packet classification algorithm is improved.
The method disclosed in the above embodiments of the present invention may be applied to the processor 81, or implemented by the processor 81. The processor 81 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 81 or by instructions in the form of software. The processor 81 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as a Random Access Memory (RAM), a flash memory, a Read-Only Memory (ROM), a programmable ROM, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 82, and the processor 81 reads the information in the memory 82 and completes the steps of the method in combination with its hardware.
Optionally, as another embodiment, the processor 81 is further configured to: determine a second value range of a second domain of the data packet and a set of second ranges, on the second domain, of rules in the rule set for classifying the data packet; generate a second partition tree according to the second value range and the set of second ranges, where the interval represented by the root node of the second partition tree is the second value range, and the interval represented by a leaf node of the second partition tree is a second range; and determine a second maximum balance distance of the second partition tree, the second maximum balance distance being the maximum number of subtrees between a third subtree where the root node of the second partition tree is located and a fourth subtree where a leaf node of the second partition tree is located. In this case, the processor 81 being configured to select a packet classification algorithm for classifying the data packet according to the first maximum balance distance includes: determining the larger of the first maximum balance distance and the second maximum balance distance; and selecting, according to the larger value, a packet classification algorithm for classifying the data packet.
Optionally, as another embodiment, the first domain is a source IP address domain and the second domain is a destination IP address domain, or the first domain is a destination IP address domain and the second domain is a source IP address domain.
Optionally, as another embodiment, the processor 81 is specifically configured to: generate a root node of the first partition tree according to the first value range; determine a first division point of the root node of the first partition tree according to the first value range; select a set of first narrow ranges from the set of first ranges, where the ratio of the length of a first narrow range to the length of the first value range is smaller than a first value; generate child nodes of the first partition tree according to the first value range, the first division point, and the set of first narrow ranges, where the interval represented by each child node intersects a first narrow range; and generate the first partition tree according to the root node of the first partition tree and the child nodes of the first partition tree, where the interval represented by a leaf node of the first partition tree is a first narrow range.
Optionally, as another embodiment, the processor 81 being specifically configured to determine, according to the first value range, a first division point of the root node of the first partition tree includes:
determining the first division point according to the formula;
where I_m is the first division point, I_max is the maximum value of the first value range, and I_min is the minimum value of the first value range.
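The formula itself does not survive in this text. One form that agrees with the worked example given earlier, in which [0,15] is divided at 7, [0,7] at 3 and [8,15] at 11, is the floor of the midpoint. It is offered here as an assumption for illustration, not as the definition used by the embodiment:

```latex
I_m = \left\lfloor \frac{I_{\min} + I_{\max}}{2} \right\rfloor,
\qquad
\left\lfloor \tfrac{0 + 15}{2} \right\rfloor = 7,\quad
\left\lfloor \tfrac{0 + 7}{2} \right\rfloor = 3,\quad
\left\lfloor \tfrac{8 + 15}{2} \right\rfloor = 11
```

Under this assumption the left child represents [I_min, I_m] and the right child represents [I_m + 1, I_max].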
Optionally, as another embodiment, the processor 81 is configured to: determine, according to the set of first narrow ranges, a first rule number corresponding to the set of first narrow ranges; determine a first judgment value according to the first rule number; and select, according to the first maximum balance distance and the first judgment value, a packet classification algorithm for classifying the data packet.
Optionally, as another embodiment, the processor 81 being configured to determine a first judgment value according to the first rule number includes:
determining the first judgment value according to the formula;
where X is the first judgment value and numRules is the first rule number. The processor 81 is specifically configured to select the HyperSplit algorithm when the first maximum balance distance is greater than the first judgment value, and to select the HyperCuts algorithm when the first maximum balance distance is less than or equal to the first judgment value.
Optionally, as another embodiment, the processor 81 is configured to: determine, according to the first partition tree, the quasi-balanced subtrees included in the first partition tree, where the ratio of the number of nodes at layer k+1 of a quasi-balanced subtree to the number of nodes at layer k is greater than or equal to a second value, and k is a positive integer greater than or equal to 1; perform a depth-first traversal of the first partition tree to determine a first quasi-balanced subtree where the root node of the first partition tree is located and a second quasi-balanced subtree where a leaf node of the first partition tree is located; and determine the maximum number of quasi-balanced subtrees included between the first quasi-balanced subtree and the second quasi-balanced subtree as the first maximum balance distance.
Optionally, as another embodiment, when the first domain is a source IP address domain or a destination IP address domain, the first value is 0.05; when the first domain is a port domain or a protocol domain, the first value is 0.5.
Optionally, as another embodiment, the second value is any value between 1.5 and 1.8.
According to the selection device for selecting the data packet classification algorithm provided by the embodiment of the invention, the partition trees are respectively created according to the range corresponding to the source IP address domain and the range corresponding to the destination IP address domain in the rule set, and the larger value of the two maximum balance distances corresponding to the partition trees is determined. And comparing the larger value with the number of rules corresponding to the range meeting the narrow range condition in the range corresponding to the source IP address field and the range corresponding to the target IP address field, and selecting a packet classification algorithm from the HyperSplit algorithm and the HyperCuts algorithm. The operation of respectively creating the decision tree for each packet classification algorithm is avoided, the data packet classification algorithm can be rapidly selected, and the efficiency of the method for selecting the data packet classification algorithm is improved.
It should be understood that, in various embodiments of the present invention, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
Additionally, the terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.
It should be understood that in the present embodiment, "B corresponding to A" means that B is associated with A, and B can be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by hardware, firmware, or a combination thereof. When implemented in software, the functions described above may be stored on, or transmitted as one or more instructions or code on, a computer-readable medium. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example and not limitation, computer-readable media can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. In addition, any connection may properly be termed a computer-readable medium. For example, if software is transmitted from a website, a server, or another remote source using a coaxial cable, a fiber optic cable, a twisted pair, a Digital Subscriber Line (DSL), or a wireless technology such as infrared, radio, or microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technology such as infrared, radio, and microwave is included in the definition of the medium. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
In short, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.