WO2011110003A1 - Method and apparatus for establishing, compressing, and searching a binary tree - Google Patents

Method and apparatus for establishing, compressing, and searching a binary tree

Publication number
WO2011110003A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
compression
binary tree
bitmap
rule set
Application number
PCT/CN2010/076299
Inventor
张文勇
王慧
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to PCT/CN2010/076299 (WO2011110003A1)
Priority to EP10847261A (EP2477363A4)
Priority to CN201080003336.3A (CN102405622B)
Publication of WO2011110003A1
Priority to US13/353,884 (US8711014B2)
Priority to US14/213,167 (US9521082B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2441 Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3066 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction by means of a mask or a bit-map
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/60 General implementation details not specific to a particular type of compression
    • H03M7/6058 Saving memory space in the encoder or decoder
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/70 Type of the data to be coded, other than image and sound
    • H03M7/707 Structured documents, e.g. XML
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/48 Routing tree calculation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers

Definitions

  • The present invention relates to the field of flow classification, and in particular to a method and apparatus for establishing, compressing, and searching a binary tree.
  • Background
  • Flow classification checks multiple fields in the packet header against predefined rules and processes the packet according to the match result.
  • The collection of rules used for flow classification is called a flow classifier.
  • Each rule in the flow classifier relates to several fields in the packet header.
  • For example, the standard IPv4 5-tuple rule includes five fields: source IP address, destination IP address, protocol type, source port number, and destination port number.
  • Different fields may use different matching methods: IP addresses use prefix matching, the protocol type uses exact matching, and port numbers use range matching.
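As an illustration of the three matching methods, the following minimal sketch checks one packet against one 5-tuple rule. The `Rule` and `Packet` classes and the `matches` function are hypothetical names introduced here; the text does not prescribe any data layout.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Rule:
    src: str                  # source prefix, e.g. "10.0.0.0/8"
    dst: str                  # destination prefix
    proto: Optional[int]      # exact protocol number, None means "any"
    sport: Tuple[int, int]    # inclusive source port range
    dport: Tuple[int, int]    # inclusive destination port range


@dataclass
class Packet:
    src: str
    dst: str
    proto: int
    sport: int
    dport: int


def in_prefix(addr: str, prefix: str) -> bool:
    # Prefix matching for IP addresses.
    return ipaddress.ip_address(addr) in ipaddress.ip_network(prefix)


def in_range(value: int, bounds: Tuple[int, int]) -> bool:
    # Range matching for port numbers.
    return bounds[0] <= value <= bounds[1]


def matches(rule: Rule, pkt: Packet) -> bool:
    return (in_prefix(pkt.src, rule.src)
            and in_prefix(pkt.dst, rule.dst)
            and (rule.proto is None or rule.proto == pkt.proto)  # exact match
            and in_range(pkt.sport, rule.sport)
            and in_range(pkt.dport, rule.dport))


web_rule = Rule("10.0.0.0/8", "0.0.0.0/0", 6, (0, 65535), (80, 80))
print(matches(web_rule, Packet("10.1.2.3", "192.168.0.1", 6, 12345, 80)))
```

A linear scan over such rules is the baseline that the decision-tree methods described next try to avoid.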
  • The decision-tree-based flow classification algorithm is a rule set segmentation algorithm: it uses a segmentation strategy to recursively split the rule set until the number of rules in each sub-rule set is less than the preset bucket size.
  • Segmentation produces a binary decision tree, referred to here as a binary tree.
  • Each intermediate node of the binary tree stores the method used to split the rule set, and each leaf node stores the set of all rules that may match.
  • When searching, the relevant fields are extracted from the packet header to form a keyword (key), and the keyword is used to traverse the decision tree until the corresponding leaf node is found. Comparing the keyword with the rules in the leaf node yields the highest-priority rule that matches the packet.
  • The Modular algorithm treats a rule as a ternary bit string consisting of '0', '1', and '*', without the concept of a dimension, where '*' represents a wildcard whose bit can be 0 or 1.
  • For a selected bit, rules with the value '0' are placed in one sub-rule set, rules with the value '1' are placed in the other sub-rule set, and rules with '*' appear in both sub-rule sets.
  • The original rule set is thus divided into two sub-rule sets. A range rule can first be converted to prefixes and then split by the method above. The original rule set is recursively split in this way until the number of rules in each sub-rule set is less than the maximum number of rules allowed in a leaf node, which yields a binary decision tree.
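The bit-based splitting described above can be sketched as follows. This is a simplified illustration, not the Modular algorithm itself: the bit-selection heuristic (pick the bit that minimizes the larger sub-rule set), the `Node` class, and the use of `bucket_size` as the maximum number of rules a leaf may hold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

RuleT = Tuple[str, str]  # (name, ternary pattern), listed highest priority first


@dataclass
class Node:
    bit: Optional[int] = None          # bit tested by an intermediate node
    zero: Optional["Node"] = None      # subtree for key bit 0
    one: Optional["Node"] = None       # subtree for key bit 1
    rules: Optional[List[RuleT]] = None  # set only on leaf nodes


def split(rules: List[RuleT], bit: int):
    # '0' goes to one side, '1' to the other, '*' is copied into both.
    zero = [r for r in rules if r[1][bit] in "0*"]
    one = [r for r in rules if r[1][bit] in "1*"]
    return zero, one


def build(rules: List[RuleT], bucket_size: int) -> Node:
    if len(rules) <= bucket_size:
        return Node(rules=rules)
    width = len(rules[0][1])
    # Assumed heuristic: choose the bit whose larger half is smallest.
    best = min(range(width), key=lambda b: max(map(len, split(rules, b))))
    zero, one = split(rules, best)
    if max(len(zero), len(one)) == len(rules):
        return Node(rules=rules)  # no bit separates the rules any further
    return Node(bit=best, zero=build(zero, bucket_size), one=build(one, bucket_size))


def lookup(node: Node, key: str) -> Optional[str]:
    while node.rules is None:
        node = node.one if key[node.bit] == "1" else node.zero
    for name, pattern in node.rules:  # linear scan inside the leaf
        if all(p in ("*", k) for p, k in zip(pattern, key)):
            return name
    return None


rules = [("R1", "00*"), ("R2", "01*"), ("R3", "1*0"), ("R4", "***")]
tree = build(rules, bucket_size=2)
print(lookup(tree, "010"), lookup(tree, "111"))
```

Note how `R4`, which is all wildcards, is copied into every leaf; this is the rule copying that the later embodiments try to reduce.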
  • The Modular algorithm also divides the rule set into four sub-rule sets according to whether neither the source IP nor the destination IP is '*', only the source IP is '*', only the destination IP is '*', or both the source IP and the destination IP are '*'. A separate binary decision tree is established for each of the four sub-rule sets, and the trees are searched in parallel during lookup. In the process of implementing the present invention, the inventors found that the prior art has at least the following problems:
  • An embodiment of the present invention provides a method for binary tree compression. The method includes: determining a compression parameter, where the compression parameter is a compression level n or a number of intermediate nodes K;
  • establishing a bitmap of the compression node.
  • An embodiment provides a method for establishing a binary tree that avoids the rule expansion caused by converting range rules into prefixes while the binary tree is being built. The method includes:
  • segmenting the non-range rules in the rule set by using a bit-selection segmentation algorithm;
  • converting the range rules in the rule set into prefixes, while keeping the identifier of the rule corresponding to each prefix the same as the identifier of the rule corresponding to the range;
  • An embodiment provides a method for establishing a binary tree. The method includes:
  • segmenting the rule set by using a bit-selection segmentation algorithm;
  • extracting the rules that need to be copied during segmentation and placing them in another sub-rule set;
  • separately establishing binary trees corresponding to the rule set and the other sub-rule set.
  • An embodiment provides a method for binary tree search. The method includes:
  • determining whether each node of the binary tree is a leaf node or a compression node;
  • An embodiment of the present invention provides a device for binary tree compression. The device includes: a determining module, configured to determine a compression parameter, where the compression parameter is a compression level n or a number of intermediate nodes K;
  • a compression module, configured to compress the binary tree according to the compression parameter to form at least one compression node; and a bitmap module, configured to establish a bitmap of the compression node.
  • An embodiment provides a device for establishing a binary tree. The device includes:
  • a first segmentation module, configured to segment the non-range rules in the rule set;
  • a conversion module, configured to convert the range rules in the rule set into prefixes when the segmentation efficiency is lower than a preset threshold;
  • a second segmentation module, configured to segment the prefixes by using the bit-selection segmentation algorithm.
  • An embodiment provides a device for establishing a binary tree. The device includes:
  • a segmentation module, configured to segment a rule set by using a bit-selection segmentation algorithm;
  • an extraction module, configured to extract the rules that need to be copied and place them in another sub-rule set.
  • An embodiment provides a device for searching a binary tree. The device includes:
  • an obtaining module, configured to obtain the search keyword;
  • a judging module, configured to determine whether each node of the binary tree is a leaf node or a compression node;
  • a processing module, configured to parse the node when it is a compression node and, when it is a leaf node, traverse the linear table corresponding to the leaf node to find a rule matching the keyword.
  • The depth of the decision tree is greatly reduced, and the search speed is improved;
  • The non-range rules in the rule set are segmented first by using the bit-selection segmentation algorithm, and the range rules in the rule set are converted into prefixes only when necessary.
  • Converting ranges into prefixes only when necessary effectively avoids the rule expansion caused by converting all ranges into prefixes during binary tree establishment; extracting the rules that need to be copied and placing them in another sub-rule set, that is, building multiple decision trees, effectively reduces rule copying during binary tree establishment;
  • Determining whether each node of the binary tree is a leaf node or a compression node, parsing the node when it is a compression node, and, when it is a leaf node, traversing the linear table corresponding to the leaf node to find the rule matching the keyword reduces the depth of the binary tree and improves the search speed.
  • FIG. 1 is a flowchart of a method for binary tree compression provided by Embodiment 1 of the present invention.
  • FIG. 2 is a flowchart of a shape compression method according to Embodiment 2 of the present invention.
  • FIG. 3 is a schematic diagram of a binary tree segment provided by Embodiment 2 of the present invention.
  • FIG. 4 is a schematic diagram of an updated binary tree segment provided by Embodiment 2 of the present invention.
  • FIG. 5 is a flowchart of a binary tree search according to Embodiment 2 of the present invention.
  • FIG. 6 is a flowchart of parsing a shape-compressed compression node according to Embodiment 2 of the present invention.
  • FIG. 7 is a flowchart of an adaptive compression method according to Embodiment 3 of the present invention.
  • FIG. 8 is a schematic diagram of a binary tree segment provided by Embodiment 3 of the present invention.
  • FIG. 9 is a schematic diagram of adaptive compression of another binary tree segment provided by Embodiment 3 of the present invention.
  • FIG. 10 is a schematic diagram of width-first pruning of another binary tree segment provided by Embodiment 3 of the present invention.
  • FIG. 11 is a flowchart of parsing an adaptively compressed compression node provided by Embodiment 3 of the present invention.
  • FIG. 12 is a flowchart of a method for establishing a binary tree according to Embodiment 4 of the present invention.
  • FIG. 13 is a schematic diagram of rule coverage provided by Embodiment 4 of the present invention.
  • FIG. 15 is a schematic structural diagram of a device for binary tree compression provided by Embodiment 6 of the present invention.
  • FIG. 16 is a schematic structural diagram of a device for establishing a binary tree according to Embodiment 7 of the present invention.
  • FIG. 17 is a schematic structural diagram of a device for establishing a binary tree according to Embodiment 8 of the present invention.
  • FIG. 18 is a schematic structural diagram of a device for searching a binary tree according to Embodiment 9 of the present invention.
  • Detailed description
  • This embodiment provides a method for binary tree compression, including:
  • The method provided in this embodiment compresses multiple nodes into one node according to the compression level or the number of intermediate nodes, which greatly reduces the depth of the decision tree and improves the search speed.
  • Embodiment 2
  • This embodiment provides a method for binary tree compression that compresses a binary tree according to a compression level; it is also called the shape compression method.
  • The shape compression method specifically includes:
  • The compression level n is determined according to the number of bits Nb of data read in one memory access, the number of bits Ni used by the bit index of each intermediate node, the number of bits Na used by the start address of the child nodes of the compression node, the number of bits Nt used by the compression node type, and the number of bits used by the bitmap;
  • A child node of the compression node, also called a sub-node, is a node hanging under the compression node.
  • From $(2^n - 1) \times N_i + (2^n - 1) + N_a + N_t \le N_b$, it is determined that $n \le \log_2\left(\frac{N_b - N_a - N_t}{N_i + 1} + 1\right)$, where $(2^n - 1)$ represents the number of bits used by the bitmap.
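A small worked computation may help here. The sketch below assumes the size constraint is (2^n - 1)·Ni + (2^n - 1) + Na + Nt ≤ Nb, that is, a compression node with n levels must fit in one memory access, and searches for the largest n by integer arithmetic instead of evaluating the logarithm. The function name and the numeric values are illustrative assumptions, not figures from the text.

```python
def max_compression_level(nb: int, ni: int, na: int, nt: int) -> int:
    """Largest n with (2**n - 1) * ni + (2**n - 1) + na + nt <= nb."""
    n = 0
    # Try n + 1 levels: 2**(n+1) - 1 nodes, each needing ni index bits
    # plus one shape-bitmap bit.
    while (2 ** (n + 1) - 1) * (ni + 1) + na + nt <= nb:
        n += 1
    return n


# Assumed example: 128-bit memory word, 8-bit bit index,
# 32-bit child start address, 2-bit node type.
print(max_compression_level(nb=128, ni=8, na=32, nt=2))  # 3
```

With these assumed widths, three levels use 7 × 9 + 34 = 97 bits, while four levels would need 15 × 9 + 34 = 169 bits and no longer fit.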
  • The nodes whose layer number is less than or equal to the compression level n are taken as one compression node; then, starting from the child nodes of that compression node, the binary tree continues to be compressed by the same method until the whole binary tree has been traversed.
  • For example, when the compression level is 3, compression starts from the root node of the binary tree: the three-layer binary tree is compressed into one layer to form a large compression node 1, and compression then continues from the child nodes of compression node 1 to form large compression nodes 2 and 3.
  • In this embodiment, the bitmap is the shape bitmap.
  • The compression node is traversed in width-first order and the type of each node is identified in turn; for example, an intermediate node is identified as 1 and a leaf node or empty node is identified as 0. The identification result is the shape bitmap of the compression node.
  • The "width-first order" is the order from top to bottom and from left to right.
  • An open circle is an intermediate node and a solid circle is a leaf node; intermediate nodes are represented by 1, and leaf nodes and empty nodes by 0. Traversing compression node 1 in width-first order gives the shape bitmap 1100100, in which the last two 0 bits are empty nodes.
  • When the binary tree is updated, the shape bitmap is updated directly. For example, referring to FIG. 4, for the updated binary tree segment, the shape bitmap of the updated compression node 1 is 1110100.
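The width-first construction of a shape bitmap can be sketched as below. The figures are not reproduced in this text, so the tree built here is a hypothetical one chosen to be consistent with the bitmaps quoted above (1100100 before the update, 1110100 after); `TreeNode` and `shape_bitmap` are names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None

    @property
    def is_intermediate(self) -> bool:
        return self.left is not None or self.right is not None


def shape_bitmap(root: TreeNode, levels: int) -> str:
    """Traverse `levels` layers top to bottom, left to right.

    Intermediate nodes become '1'; leaf nodes and empty slots become '0'.
    """
    bits = []
    layer = [root]
    for _ in range(levels):
        next_layer = []
        for node in layer:
            intermediate = node is not None and node.is_intermediate
            bits.append("1" if intermediate else "0")
            if intermediate:
                next_layer += [node.left, node.right]
            else:
                next_layer += [None, None]  # empty slots below a leaf
        layer = next_layer
    return "".join(bits)


def leaf() -> TreeNode:
    return TreeNode()


# Assumed shape: the root's left child is intermediate, the right child is a leaf.
root = TreeNode(left=TreeNode(left=leaf(), right=TreeNode(leaf(), leaf())),
                right=leaf())
print(shape_bitmap(root, 3))  # 1100100

# Update: the right child becomes an intermediate node with two leaf children.
root.right = TreeNode(leaf(), leaf())
print(shape_bitmap(root, 3))  # 1110100
```

Because every slot of the n-level shape has a fixed position, an update that changes one node type only changes the bit at that position.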
  • 204: Store all child nodes of a compression node contiguously and save the starting address in the compression node; when searching, the address of a child node is determined from the starting address and the index of the child node.
  • The search process includes: obtaining the search keyword; determining whether each node of the binary tree is a leaf node or a compression node; when the node is a compression node, parsing the compression node; when it is a leaf node, traversing the linear table corresponding to the leaf node to find the rule that matches the keyword.
  • The binary tree search process is as follows:
  • A1: Obtain the search keyword;
  • A2: Determine whether the root node of the binary decision tree is a leaf node. If not, the root node corresponds to a compression node, which is used as the current compression node, and step A3 is performed. If it is a leaf node, step A5 is performed.
  • For example, the root node of FIG. 3 is not a leaf node; it corresponds to a compression node, so step A3 is performed.
  • A4: Determine whether the child node of the current compression node is a leaf node. If not, the child node corresponds to a compression node, which is used as the current compression node, and step A3 is performed. If it is a leaf node, step A5 is performed.
  • A5: Traverse the linear table corresponding to the leaf node, find the rule matching the keyword, and end the process.
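The outer loop of steps A1 to A5 can be sketched as follows. The parsing of a compression node (step A3) is replaced here by a deliberately simple stand-in that indexes a contiguous child list with a few key bits; it is not the bitmap-based parsing of the embodiment. `Leaf`, `Compressed`, `parse_compressed`, and `search` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class Leaf:
    # Linear table of (name, ternary pattern), highest priority first.
    rules: List[Tuple[str, str]]


@dataclass
class Compressed:
    bit_indexes: List[int]                       # key bits examined in this node
    children: List[Union["Compressed", Leaf]]    # child nodes stored contiguously


def parse_compressed(node: Compressed, key: str) -> Union[Compressed, Leaf]:
    # Stand-in for step A3: derive the child index from the key bits,
    # then address the child as start + index.
    index = 0
    for b in node.bit_indexes:
        index = (index << 1) | int(key[b])
    return node.children[index]


def search(root: Union[Compressed, Leaf], key: str) -> Optional[str]:
    node = root                                  # A1: key obtained by caller
    while isinstance(node, Compressed):          # A2 / A4: leaf or compression node?
        node = parse_compressed(node, key)       # A3
    for name, pattern in node.rules:             # A5: scan the leaf's linear table
        if all(p in ("*", k) for p, k in zip(pattern, key)):
            return name
    return None


inner = Compressed([2, 3], [Leaf([("R3", "1*00")]), Leaf([("R4", "****")]),
                            Leaf([("R4", "****")]),
                            Leaf([("R5", "1*11"), ("R4", "****")])])
root = Compressed([0, 1], [Leaf([("R1", "00**")]),
                           Leaf([("R2", "01*1"), ("R4", "****")]),
                           inner, inner])
print(search(root, "0101"), search(root, "1011"))
```

Each pass through the loop consumes several key bits with a single node read, which is where the reduction in tree depth comes from.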
  • A shape-compressed node can be parsed layer by layer, with the root node of the compression node as the first layer; whether each node of the compression node is a leaf node is determined from the shape bitmap of the compression node. Referring to FIG. 6, the compression node parsing process is:
  • If no, go to step B5; if yes, go to step B7.
  • If yes, go to step B9; if no, go to step B10.
  • B10: The child node is a compression node, and the process ends.
  • The search method determines whether each node of the binary tree is a leaf node or a compression node, parses the node when it is a compression node, and, when it is a leaf node, traverses the linear table corresponding to the leaf node to find the rule matching the keyword, which reduces the depth of the binary tree and improves the search speed.
  • The method provided in this embodiment determines the compression level according to the number of bits of data read in one memory access, the number of bits used by the bit index of each intermediate node, the number of bits used by the starting address, the number of bits used by the compression node type, and the number of bits used by the bitmap, and compresses multiple nodes into one node according to the compression level, which greatly reduces the depth of the decision tree and improves the search speed.
  • Embodiment 3
  • This embodiment provides a method for binary tree compression that compresses a binary tree according to the number of intermediate nodes; it is also called the adaptive compression method.
  • The adaptive compression method specifically includes:
  • The number of intermediate nodes K is determined according to the number of bits Nb of data read in one memory access, the number of bits Ni used by the bit index of each intermediate node, the number of bits Na used by the start address of the child nodes of the compression node, the number of bits Nt used by the compression node type, and the number of bits used by the bitmaps;
  • Here the bitmaps are a shape bitmap and an external bitmap. The shape bitmap represents the type of each node inside a compression node. The external bitmap represents the type of each child node of the compression node.
  • A group of nodes containing no more than K intermediate nodes is taken as a compression node; then, starting from the child nodes of the compression node, the binary tree continues to be compressed by the same method until the binary tree has been traversed.
  • With shape compression, every 3 layers are compressed into one large node (i.e., a compression node), so three large nodes are needed.
  • With adaptive compression, every 8 intermediate nodes are compressed into one large node, so only one large node is needed, and the compression efficiency is higher.
  • Further, a width-first pruning algorithm is used for optimization.
  • The number of intermediate nodes in the subtree rooted at each intermediate node, including the node itself, is counted. Starting from the root node of the binary tree, it is determined for each intermediate node whether this number is less than or equal to the number of intermediate nodes K. When the number for every intermediate node in the compression node is greater than K, the compression node is kept unchanged. When the number for an intermediate node in the compression node is less than or equal to K, that intermediate node and all its child nodes are cut off to form a new compression node. After pruning, the nodes of the compression node other than that intermediate node remain in the compression node; that is, the number of intermediate nodes of the compression node associated with the newly formed compression node is adjusted.
  • In the binary tree segment shown in FIG. 9, open circles are intermediate nodes and solid circles are leaf nodes.
  • With the adaptive compression method alone, the segment is compressed into 9 compression nodes. With the width-first pruning algorithm, the number of intermediate nodes under the first intermediate node, including itself, is 15; since 15 > K, it is not pruned. The number under the second intermediate node, including itself, is 7; since 7 ≤ K, the second intermediate node and all its child nodes are cut off as a compression node. Similarly, the number under the third intermediate node, including itself, is 7; since 7 ≤ K, the third intermediate node and all its child nodes are cut off as a compression node. After pruning, the number of intermediate nodes of the compression node associated with the two newly formed compression nodes is adjusted to one. Finally, only three compression nodes are formed, as shown in FIG. 10, which greatly improves the compression efficiency.
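The effect of width-first pruning can be sketched as below. This is a simplified model written for illustration: it groups intermediate nodes in width-first order, at most K per compression node, and with `prune=True` cuts off any subtree that fits entirely into one compression node. The example assumes a full binary tree with 15 intermediate nodes and K = 7; these values are assumptions chosen because they reproduce the counts quoted above (9 compression nodes without pruning, 3 with pruning), since the figures themselves are not available here.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_intermediate(node: Optional[Node]) -> bool:
    return node is not None and (node.left is not None or node.right is not None)


def count_intermediate(node: Optional[Node]) -> int:
    """Intermediate nodes in the subtree rooted at `node`, including itself."""
    if not is_intermediate(node):
        return 0
    return 1 + count_intermediate(node.left) + count_intermediate(node.right)


def compress(root: Node, k: int, prune: bool = True) -> List[List[Node]]:
    """Return the intermediate nodes grouped per compression node."""
    groups = []
    pending = deque([root])          # roots of compression nodes still to form
    while pending:
        top = pending.popleft()
        whole_fits = count_intermediate(top) <= k
        group, queue = [], deque([top])
        while queue:
            node = queue.popleft()
            if not is_intermediate(node):
                continue
            # Pruning: a subtree that fits into one compression node is cut off.
            cut = (prune and not whole_fits and node is not top
                   and count_intermediate(node) <= k)
            if cut or len(group) == k:
                pending.append(node)
                continue
            group.append(node)
            queue.extend([node.left, node.right])
        groups.append(group)
    return groups


def full_tree(levels: int) -> Node:
    # `levels` layers of intermediate nodes with leaves underneath.
    if levels == 0:
        return Node()
    return Node(full_tree(levels - 1), full_tree(levels - 1))


tree = full_tree(4)  # 15 intermediate nodes
print([len(g) for g in compress(tree, k=7, prune=False)])
print([len(g) for g in compress(tree, k=7, prune=True)])
```

Under these assumptions the pruned result keeps only the root in the first compression node, matching the statement that its number of intermediate nodes is adjusted to one.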
  • The compression node is traversed in width-first order, the type of each node in it is identified in turn, and the identification result is used as the shape bitmap of the compression node.
  • The child nodes of the compression node are traversed in width-first order, the type of each child node is identified in turn, and the identification result is used as the external bitmap of the compression node.
  • Incremental update can be implemented: when the type of a child node of the compression node changes, only the bit corresponding to that child node in the external bitmap is adjusted, and the shape bitmap is left unchanged.
  • For example, when the first leaf node in FIG. 8 becomes an intermediate node after a leaf node is hung under it, the corresponding bit in the external bitmap changes from 0 to 1 and the types of the other child nodes are unchanged, so the external bitmap is updated to 10000000 and the shape bitmap is unchanged.
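The incremental update amounts to rewriting one bit of the external bitmap. A minimal sketch, using the 8-bit external bitmap from the example above and a hypothetical `set_child_type` helper; the shape bitmap value below is a placeholder, not taken from the figure:

```python
def set_child_type(external_bitmap: str, child_index: int, is_intermediate: bool) -> str:
    """Return the external bitmap with one child's type bit rewritten."""
    bits = list(external_bitmap)
    bits[child_index] = "1" if is_intermediate else "0"
    return "".join(bits)


shape_bitmap = "placeholder"          # never touched by the update
external_bitmap = "00000000"          # all eight child nodes are leaf nodes

# The first child node turns from a leaf node into an intermediate node.
external_bitmap = set_child_type(external_bitmap, 0, True)
print(external_bitmap)  # 10000000
```

Keeping the two bitmaps separate is what makes the update local: the internal shape of the compression node does not need to be recomputed.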
  • C2: Extract the bit corresponding to the first bit index from the search keyword;
  • C4: Determine whether the position of the current node (i.e., the current binary tree node) in the shape bitmap is greater than 2(K-1);
  • If no, go to step C5; if yes, that is, the position is greater than 2(K-1), go to step C6.
  • C6: The shape bitmap bit corresponding to the current node is 0;
  • C7: Determine whether the shape bitmap bit corresponding to the current node is 0;
  • If no, go to step C3; if yes, go to step C8.
  • C9: Determine whether the external bitmap bit corresponding to the current node is 0;
  • If it is not 0, go to step C10; if it is 0, go to step C11.
  • C10: The child node is a compression node, and the process ends;
  • C11: The child node is a leaf node, and the process ends.
  • The search method determines whether each node of the binary tree is a leaf node or a compression node, parses the node when it is a compression node, and, when it is a leaf node, traverses the linear table corresponding to the leaf node to find the rule matching the keyword, which reduces the depth of the binary tree and improves the search speed.
  • The method provided in this embodiment determines the number of intermediate nodes according to the number of bits of data read in one memory access, the number of bits used by the bit index of each intermediate node, the number of bits used by the starting address, the number of bits used by the compression node type, and the number of bits used by the bitmaps, and compresses multiple nodes into one node according to the number of intermediate nodes, which greatly reduces the depth of the decision tree and improves the search speed; the width-first pruning method further improves the compression efficiency and reduces the depth of the decision tree.
  • Embodiment 4
  • This embodiment provides a method for establishing a binary tree, including:
  • Each time, the bit with the highest segmentation efficiency and the fewest rule copies is selected for segmentation; this embodiment does not limit the specific segmentation method.
  • The range rules in the rule set are converted into prefixes; during conversion, the identifier of the rule corresponding to each prefix is kept the same as the identifier of the rule corresponding to the range.
  • Rules may overlap, and rules that are completely covered by a higher-priority rule are removed. For example, see FIG. 13: if rule R1 has a higher priority than rule R2 and R1 completely covers R2, then R2 will never be hit and can be removed from the rule set. In addition, if several rules in a leaf node have the same identifier, only one rule with that identifier is retained.
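The coverage check can be sketched as follows, assuming each rule is a list of inclusive `(low, high)` ranges, one per field, and that rules are listed from highest to lowest priority. The representation and the names `covers` and `remove_covered` are assumptions made for this example.

```python
from typing import List, Tuple

Fields = List[Tuple[int, int]]        # one inclusive (low, high) range per field
NamedRule = Tuple[str, Fields]


def covers(a: Fields, b: Fields) -> bool:
    """True if rule `a` completely covers rule `b` in every field."""
    return all(alo <= blo and bhi <= ahi
               for (alo, ahi), (blo, bhi) in zip(a, b))


def remove_covered(rules: List[NamedRule]) -> List[NamedRule]:
    """Drop every rule that a higher-priority rule completely covers.

    `rules` must be ordered from highest to lowest priority.
    """
    kept: List[NamedRule] = []
    for name, fields in rules:
        # Checking only kept rules is enough, because coverage is transitive.
        if not any(covers(k, fields) for _, k in kept):
            kept.append((name, fields))
    return kept


rules = [
    ("R1", [(0, 100), (0, 50)]),
    ("R2", [(10, 20), (5, 5)]),      # inside R1 in both fields: never hit
    ("R3", [(90, 200), (0, 50)]),    # only partly overlaps R1: kept
]
print([name for name, _ in remove_covered(rules)])  # ['R1', 'R3']
```

Removing such rules before segmentation shrinks the rule set without changing any lookup result.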
  • All the segmentation results include the segmentation result of the non-range rules in the rule set and the segmentation result of the converted rule set.
  • The rules that need to be copied are extracted and placed in another sub-rule set, that is, another decision tree is created.
  • The multiple rules produced by expansion have the same identifier as the original rule. Therefore, when extracting, all rules to be copied that have the same identifier are extracted and placed in the other sub-rule set.
  • In the method provided in this embodiment, the non-range rules in the rule set are segmented first by using the bit-selection segmentation algorithm, and the range rules in the rule set are converted into prefixes only when the segmentation efficiency falls below a preset threshold.
  • Converting ranges into prefixes only when necessary effectively avoids the rule expansion caused by converting all ranges into prefixes.
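To make the rule expansion concrete, the sketch below converts an integer range into ternary prefixes with the usual greedy split into aligned power-of-two blocks. The text does not spell out its conversion procedure, so this standard technique and the name `range_to_prefixes` are assumptions used only to show why one range rule can turn into several prefix rules that share one identifier.

```python
from typing import List


def range_to_prefixes(lo: int, hi: int, width: int) -> List[str]:
    """Cover the inclusive range [lo, hi] of `width`-bit values with prefixes."""
    prefixes = []
    while lo <= hi:
        # Grow the block while it stays aligned at `lo` and inside the range.
        size = 1
        while (size < (1 << width) and lo % (size * 2) == 0
               and lo + size * 2 - 1 <= hi):
            size *= 2
        wild = size.bit_length() - 1          # number of trailing '*' bits
        fixed = width - wild
        head = format(lo >> wild, "0{}b".format(fixed)) if fixed else ""
        prefixes.append(head + "*" * wild)
        lo += size
    return prefixes


# A single 4-bit range rule expands into six prefix rules.
print(range_to_prefixes(1, 14, 4))
# ['0001', '001*', '01**', '10**', '110*', '1110']
```

For 16-bit port fields the same effect can produce many more prefixes per range, which is why the embodiment delays the conversion until segmentation on the non-range rules stops being efficient.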
  • In addition, rule copying is effectively reduced by extracting the rules that need to be copied and placing them in another sub-rule set, that is, by creating multiple decision trees.
  • This embodiment provides a method for establishing a binary tree, including:
  • 501: Segment the rule set by using a bit-selection segmentation algorithm;
  • The details are the same as those of step 401 and are not described here.
  • The rules that need to be copied during segmentation are extracted and placed in another sub-rule set;
  • The multiple rules produced by expansion have the same identifier as the original rule. Therefore, when extracting, all rules to be copied that have the same identifier are extracted and placed in the other sub-rule set.
  • The method provided in this embodiment effectively reduces rule copying by extracting the rules that need to be copied and placing them in another sub-rule set, that is, by creating multiple decision trees.
  • This embodiment provides a device for binary tree compression, including:
  • a determining module 601, configured to determine a compression parameter, where the compression parameter is a compression level n or a number of intermediate nodes K;
  • a compression module 602, configured to compress the binary tree according to the compression parameter to form at least one compression node;
  • a bitmap module 603, configured to establish a bitmap of the compression node.
  • The determining module 601 is specifically configured to determine the compression parameter according to the number of bits Nb of data read in one memory access, the number of bits Ni used by the bit index of each intermediate node, the number of bits Na used by the start address of the child nodes of the compression node, the number of bits Nt used by the compression node type, and the number of bits used by the bitmap.
  • The compression module 602 is specifically configured to:
  • take the nodes whose layer number is less than or equal to the compression level n as a compression node;
  • starting from the child nodes of the compression node, continue to compress the binary tree by the same compression method until the binary tree has been traversed.
  • Nb represents the number of bits of data read in one memory access;
  • Ni represents the number of bits used by the bit index of each intermediate node;
  • Na represents the number of bits used by the start address of the child nodes of the compression node;
  • Nt represents the number of bits used by the compression node type;
  • 2(K - 1) is the number of bits used by the shape bitmap, ignoring the first node and the last two nodes;
  • (K + 1) is the number of bits used by the external bitmap.
  • The compression module 602 is specifically configured to:
  • take a group of nodes containing no more than K intermediate nodes as a compression node; and, starting from the child nodes of the compression node, continue to compress the binary tree by the same compression method until the binary tree has been traversed.
  • The compression module 602 is further configured to, after the binary tree has been traversed,
  • take the intermediate node and all its child nodes as a new compression node;
  • the nodes of the compression node other than that intermediate node remain in the compression node.
  • The compression node is traversed in width-first order, the type of each node is identified in turn, and the identification result is used as the shape bitmap of the compression node;
  • The child nodes of the compression node are traversed in width-first order, the type of each child node is identified in turn, and the identification result is used as the external bitmap of the compression node.
  • The device further includes:
  • an incremental update module, configured to: after the bitmap of the compression node is established, when the type of a child node of the compression node changes, adjust the bit corresponding to that child node in the external bitmap and leave the shape bitmap unchanged, so as to implement incremental update.
  • The device provided in this embodiment compresses multiple nodes into one node according to the compression level or the number of intermediate nodes, which greatly reduces the depth of the decision tree and improves the search speed.
  • the embodiment provides a device for establishing a binary tree, including:
  • a first segmentation module 701, configured to segment the non-range rules in the rule set using a bit-selection segmentation algorithm
  • a conversion module 702, configured to convert the range rules in the rule set into prefixes when the segmentation efficiency is lower than a preset threshold, keeping the identifier of each rule corresponding to a prefix the same as the identifier of the corresponding range rule;
  • after the ranges in the rule set have been converted into prefixes, rules that are covered and have lower priority are removed.
  • the second segmentation module 703 is configured to segment the converted rule set using the bit-selection segmentation algorithm
  • the establishing module 704 is configured to establish the binary tree corresponding to the rule set according to all the segmentation results.
  • the device further includes:
  • an extraction module, configured to extract the rules that need to be replicated during segmentation and place them in another sub-rule set.
  • the apparatus provided in this embodiment first segments the rule set using the bit-selection segmentation algorithm.
  • only when the segmentation efficiency is lower than a preset threshold
  • are the ranges in the rule set converted into prefixes.
  • converting ranges into prefixes only when necessary effectively avoids the rule expansion caused by converting all ranges into prefixes.
  • extracting the rules that need to be replicated into another sub-rule set effectively reduces rule replication.
  • the embodiment provides a device for establishing a binary tree, including:
  • a segmentation module 801, configured to segment a rule set using a bit-selection segmentation algorithm
  • the extracting module 802 is configured to extract the rules that need to be replicated during segmentation and place them in another sub-rule set.
  • the building module 803 is configured to establish the binary trees corresponding to the rule set and to the other sub-rule set respectively.
  • the apparatus provided in this embodiment effectively reduces rule replication by extracting the rules that need to be replicated and placing them in another sub-rule set, that is, by creating multiple decision trees.
  • this embodiment provides a device for searching for a binary tree, including:
  • the obtaining module 901 is configured to obtain a keyword for searching
  • the determining module 902 is configured to determine whether each node of the binary tree is a leaf node or a compression node;
  • the processing module 903 is configured to parse the compression node when the node is a compression node, and, when the node is a leaf node, to traverse the linear table corresponding to the leaf node and find a rule matching the keyword.
  • the processing module 903 includes a first parsing unit 903a, configured to determine, according to the shape bitmap of the compression node, whether each node in the compression node is a leaf node. The specific process is shown in Figure 6 and is not repeated here.
  • the processing module 903 includes a second parsing unit 903b, configured to, according to the shape bitmap of the compression node, when a node in the compression node is 0 in the shape bitmap, determine whether that node is 0 in the external bitmap, and determine from the result whether the child node corresponding to that node is a leaf node.
  • the specific process is shown in Figure 11 and is not repeated here.
  • the apparatus provided in this embodiment determines whether each node of the binary tree is a leaf node or a compression node; when it is a compression node, the compression node is parsed; when it is a leaf node, the linear table corresponding to the leaf node is traversed to find rules that match the keyword. This reduces the search depth of the binary tree and improves search speed.
  • Embodiments of the invention may be implemented in software, and the corresponding software program may be stored in a readable storage medium, such as a hard disk, a cache, or an optical disk of a computer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Security & Cryptography (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Description

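Methods and Devices for Creating, Compressing and Searching a Binary Tree
Technical Field
The present invention relates to the field of traffic classification, and in particular to methods and devices for creating, compressing and searching a binary tree.
Background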
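Traffic classification inspects multiple fields of a packet header against predefined rules and processes the packet according to the match result. The set of rules used for traffic classification is called a classifier. Each rule in the classifier relates to several fields of the packet header. For example, a standard IPv4 5-tuple rule comprises five fields: source IP address, destination IP address, protocol type, source port number and destination port number. Different fields may be matched in different ways: IP addresses use prefix matching, the protocol type uses exact matching, and port numbers use range matching.
A decision-tree-based traffic classification algorithm is a rule-set segmentation algorithm. It recursively divides the rule set using some segmentation strategy until the number of rules in every sub-rule set is smaller than a preset bucket size. Segmentation produces a binary decision tree, referred to below as a binary tree. An intermediate node of the binary tree stores the method used to segment the rule set, and a leaf node stores all the sub-rules that may match. During a search, the relevant fields are extracted from the packet header to form a keyword, and the keyword is used to traverse the decision tree until the corresponding leaf node is found. The keyword is then compared with the rules in that leaf node, which yields the highest-priority rule that matches the packet.
One existing decision-tree-based algorithm that segments by bit selection in stages is the Modular algorithm. The Modular algorithm treats a rule as a ternary bit string made up of '0', '1' and '*', with no notion of dimensions; '*' is a wildcard whose bit may be 0 or 1. When segmenting, the algorithm counts the rules in which a given bit is '0', '1' or '*', and selects the most suitable bit according to a preference metric formula. When a bit is selected, the rules whose value at that bit is '0' are placed in one sub-rule set, the rules whose value is '1' are placed in the other, and the rules whose value is '*' appear in both. The original rule set is thus divided into two sub-rule sets. Range rules can first be converted into prefixes and then segmented in the same way. The original rule set is segmented recursively in this manner until the number of rules in every sub-rule set is smaller than the preset maximum number of rules allowed in a leaf node, which produces a binary decision tree. In addition, to reduce rule replication, the Modular algorithm divides the rule set into four sub-rule sets according to whether neither the source IP nor the destination IP is '*', only the source IP is '*', only the destination IP is '*', or both are '*'. A separate binary decision tree is built for each of the four sub-rule sets, and the trees are searched in parallel. In the course of implementing the present invention, the inventors found that the prior art has at least the following problems:
Because one bit is selected for each segmentation, the result is a binary decision tree whose depth is large, which hurts decision efficiency. If ranges are expanded into prefixes while the binary tree is being built, an arbitrary range may in the worst case be converted into 30 prefixes. Taking the standard IPv4 5-tuple as an example, each rule contains two ranges (source port number and destination port number), so in the worst case one rule is expanded into 900 rules, which greatly increases memory usage. Moreover, the method for reducing rule replication during tree construction is coarse: when a sub-rule set contains many '*' values, a large amount of rule replication remains.
Summary of the Invention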
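To reduce the depth of a binary tree, an embodiment of the present invention provides a method for compressing a binary tree, the method including:
determining a compression parameter, the compression parameter being a compression level n or a number of intermediate nodes K;
compressing the binary tree according to the compression parameter to form at least one compression node;
establishing a bitmap of the compression node.
To avoid the rule expansion caused by converting range rules into prefixes while a binary tree is being built, an embodiment provides a method for creating a binary tree, the method including:
segmenting the non-range rules in a rule set using a bit-selection segmentation algorithm;
when the segmentation efficiency is lower than a preset threshold, converting the range rules in the rule set into prefixes, keeping the identifier of each rule corresponding to a prefix the same as the identifier of the corresponding range rule;
segmenting the converted rule set using the bit-selection segmentation algorithm;
creating the binary tree corresponding to the rule set according to all the segmentation results.
To reduce rule replication while a binary tree is being built, an embodiment provides a method for creating a binary tree, the method including:
segmenting a rule set using a bit-selection segmentation algorithm;
extracting the rules that need to be replicated during segmentation and placing them in another sub-rule set;
creating the binary trees corresponding to the rule set and to the other sub-rule set respectively.
To reduce search depth and improve search speed, an embodiment provides a method for searching a binary tree, the method including:
obtaining a search keyword;
determining whether each node of the binary tree is a leaf node or a compression node;
when the node is a compression node, parsing the compression node;
when the node is a leaf node, traversing the linear table corresponding to the leaf node and finding the rule that matches the keyword.
To reduce the depth of a binary tree, an embodiment of the present invention provides a device for compressing a binary tree, the device including:
a determining module, configured to determine a compression parameter, the compression parameter being a compression level n or a number of intermediate nodes K;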
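a compression module, configured to compress the binary tree according to the compression parameter to form at least one compression node;
a bitmap module, configured to establish a bitmap of the compression node.
To avoid the rule expansion caused by converting range rules into prefixes while a binary tree is being built, an embodiment provides a device for creating a binary tree, the device including:
a first segmentation module, configured to segment the non-range rules in a rule set using a bit-selection segmentation algorithm;
a conversion module, configured to convert the range rules in the rule set into prefixes when the segmentation efficiency is lower than a preset threshold;
a second segmentation module, configured to segment the prefixes using the bit-selection segmentation algorithm.
To reduce rule replication while a binary tree is being built, an embodiment provides a device for creating a binary tree, the device including:
a segmentation module, configured to segment a rule set using a bit-selection segmentation algorithm;
an extraction module, configured to extract the rules that need to be replicated and place them in another sub-rule set.
To reduce search depth and improve search speed, an embodiment provides a device for searching a binary tree, the device including:
an obtaining module, configured to obtain a search keyword;
a determining module, configured to determine whether each node of the binary tree is a leaf node or a compression node;
a processing module, configured to parse the compression node when the node is a compression node and, when the node is a leaf node, to traverse the linear table corresponding to the leaf node and find the rule that matches the keyword.
The beneficial effects of the technical solutions provided by the embodiments of the present invention are as follows:
By determining a compression level or a number of intermediate nodes and compressing multiple nodes into one node accordingly, the depth of the decision tree is greatly reduced and search speed is improved.
The non-range rules in the rule set are first segmented using the bit-selection segmentation algorithm, and the range rules are converted into prefixes only when the segmentation efficiency falls below a preset threshold. Converting ranges into prefixes only "when necessary" effectively avoids the rule expansion that results from converting all ranges into prefixes while the binary tree is being built.
By extracting the rules that need to be replicated and placing them in another sub-rule set, that is, by creating multiple decision trees, rule replication during tree construction is effectively reduced.
By determining whether each node of the binary tree is a leaf node or a compression node, parsing the node when it is a compression node, and traversing the corresponding linear table to find the rule that matches the keyword when it is a leaf node, the search depth of the binary tree is reduced and search speed is improved.
Brief Description of the Drawings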
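Figure 1 is a flowchart of the binary tree compression method provided by Embodiment 1 of the present invention;
Figure 2 is a flowchart of the shape compression method provided by Embodiment 2 of the present invention;
Figure 3 is a schematic diagram of a binary tree fragment provided by Embodiment 2 of the present invention;
Figure 4 is a schematic diagram of the updated binary tree fragment provided by Embodiment 2 of the present invention;
Figure 5 is a flowchart of the binary tree search provided by Embodiment 2 of the present invention;
Figure 6 is a flowchart of parsing a compression node produced by shape compression, provided by Embodiment 2 of the present invention;
Figure 7 is a flowchart of the adaptive compression method provided by Embodiment 3 of the present invention;
Figure 8 is a schematic diagram of a binary tree fragment provided by Embodiment 3 of the present invention;
Figure 9 is a schematic diagram of adaptive compression applied to another binary tree fragment, provided by Embodiment 3 of the present invention;
Figure 10 is a schematic diagram of breadth-first pruning applied to another binary tree fragment, provided by Embodiment 3 of the present invention;
Figure 11 is a flowchart of parsing a compression node produced by adaptive compression, provided by Embodiment 3 of the present invention;
Figure 12 is a flowchart of the binary tree creation method provided by Embodiment 4 of the present invention;
Figure 13 is a schematic diagram of rule coverage provided by Embodiment 4 of the present invention;
Figure 14 is a flowchart of the binary tree creation method provided by Embodiment 5 of the present invention;
Figure 15 is a schematic structural diagram of the binary tree compression device provided by Embodiment 6 of the present invention;
Figure 16 is a schematic structural diagram of the binary tree creation device provided by Embodiment 7 of the present invention;
Figure 17 is a schematic structural diagram of the binary tree creation device provided by Embodiment 8 of the present invention;
Figure 18 is a schematic structural diagram of the binary tree search device provided by Embodiment 9 of the present invention.
Detailed Description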
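To make the objectives, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiment 1
Referring to Figure 1, this embodiment provides a method for compressing a binary tree, including:
101: Determine a compression parameter, which is a compression level n or a number of intermediate nodes K;
102: Compress the binary tree according to the compression parameter to form at least one compression node;
103: Establish a bitmap of the compression node.
By determining a compression level or a number of intermediate nodes and compressing multiple nodes into one node accordingly, the method provided by this embodiment greatly reduces the depth of the decision tree and improves search speed.
Embodiment 2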
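This embodiment provides a method for compressing a binary tree according to the compression level, also called the shape compression method.
Referring to Figure 2, the shape compression method includes:
201: Determine the compression level n according to the number of bits Nb of data read in one memory access, the number of bits Ni used by the bit index of each intermediate node, the number of bits Na used by the starting address of the child nodes of the compression node, the number of bits Nt used by the compression node type, and the number of bits used by the bitmap;
Here, the child nodes of a compression node are the nodes attached directly below that compression node.
Specifically, from $(2^n - 1) \times N_i + (2^n - 1) + N_a + N_t \le N_b$ it follows that $n \le \log_2\left(\frac{N_b - N_a - N_t}{N_i + 1} + 1\right)$, where $(2^n - 1)$ is the number of bits used by the bitmap.
For example, let Na = 32, Nt = 2 and Ni = 9. When Nb = 128, n = 3, so 3 layers can be compressed into 1 layer; when Nb = 256, n = 4, so 4 layers can be compressed into 1 layer.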
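The bound on n can be checked numerically. The following is a minimal sketch, assuming the bit budget is exactly the inequality in step 201; the function name `compression_level` is illustrative and does not come from the patent.

```python
def compression_level(nb: int, ni: int, na: int, nt: int) -> int:
    """Largest n with (2^n - 1)*Ni + (2^n - 1) + Na + Nt <= Nb."""
    n = 0
    # Grow n while one more layer still fits in a single memory access.
    while (2 ** (n + 1) - 1) * (ni + 1) + na + nt <= nb:
        n += 1
    return n


print(compression_level(128, 9, 32, 2))  # 3
print(compression_level(256, 9, 32, 2))  # 4
```

Searching upward from n = 0 avoids floating-point logarithms and reproduces the two values quoted in the example.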
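202: Compress the binary tree according to the compression level n to form at least one compression node;
Specifically, starting from the root node or a leaf node of the binary tree, the nodes whose layer count is less than or equal to the compression level n are taken as one compression node. Then, starting from the child nodes of that compression node, the binary tree continues to be compressed in the same way until the whole tree has been traversed.
For example, referring to Figure 3, when the compression level is 3, three layers of the binary tree are compressed into one layer starting from the root node, forming the large compression node 1. Compression then continues from the child nodes of compression node 1, forming the large compression nodes 2 and 3.
In addition, when the root node is a leaf node, no compression is needed and there is no compression node.
203: Establish a bitmap for each compression node;
For the shape compression method, the bitmap is the shape bitmap. Specifically, the compression node is traversed in breadth-first order and the type of each node in it is identified in turn; for example, an intermediate node is marked 1 and a leaf node or empty node is marked 0. The result is used as the shape bitmap of the compression node. "Breadth-first order" means top to bottom and left to right.
Taking the binary tree fragment in Figure 3 as an example, hollow circles are intermediate nodes and solid circles are leaf nodes; 1 denotes an intermediate node and 0 denotes a leaf node or an empty node. Traversing compression node 1 in breadth-first order gives the shape bitmap 1100100, in which the last two 0 bits are empty nodes.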
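The breadth-first marking in step 203 can be sketched as follows. This is an illustration under assumptions: the `Node` class and `shape_bitmap` function are hypothetical names, and the example tree is reconstructed from the bitmap 1100100 quoted in the text, since Figure 3 itself is not reproduced here.

```python
class Node:
    """A tree node; a node with two children is an intermediate node."""

    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

    @property
    def is_intermediate(self):
        return self.left is not None and self.right is not None


def shape_bitmap(root, n):
    """Breadth-first shape bitmap of the n-layer compression node rooted at root."""
    bits = []
    level = [root]
    for _ in range(n):
        next_level = []
        for node in level:
            inter = node is not None and node.is_intermediate
            bits.append("1" if inter else "0")  # leaf or empty slot -> 0
            # Below a leaf or an empty slot there are only empty slots.
            next_level += [node.left, node.right] if inter else [None, None]
        level = next_level
    return "".join(bits)


# Assumed shape: root -> (intermediate, leaf); that intermediate -> (leaf, intermediate).
inner = Node(Node(), Node(Node(), Node()))
fragment = Node(inner, Node())
print(shape_bitmap(fragment, 3))  # 1100100

# Turning the root's right child into an intermediate node gives the updated bitmap.
updated = Node(inner, Node(Node(), Node()))
print(shape_bitmap(updated, 3))  # 1110100
```

The bitmap always has 2^n - 1 bits, which matches the bitmap term in the bit budget of step 201.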
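Further, when a compression node is incrementally updated, the shape bitmap is updated directly. For example, Figure 4 shows the updated binary tree fragment; the shape bitmap of the updated compression node 1 is 1110100.
204: Store all the child nodes of a compression node contiguously, and save the starting address in the compression node;
Further, during a search, the address of a child node is determined from the starting address and the index of the child node.
With shape compression as described above, the node search process includes: obtaining a search keyword; determining whether each node of the binary tree is a leaf node or a compression node; when it is a compression node, parsing the compression node; when it is a leaf node, traversing the linear table corresponding to the leaf node and finding the rule that matches the keyword. Specifically, referring to Figure 5, the binary tree search process is as follows:
A1: Obtain the search keyword;
A2: Determine whether the root node of the decision binary tree is a leaf node;
Specifically, if it is not a leaf node, the root node corresponds to a compression node, which is taken as the current compression node, and step A3 is performed; if it is a leaf node, step A5 is performed.
For example, the root node in Figure 3 is not a leaf node; it corresponds to a compression node, so step A3 is performed.
A3: Parse the current compression node;
A4: Determine whether the child node of the current compression node is a leaf node;
Specifically, if it is not a leaf node, the child node corresponds to a compression node, which is taken as the current compression node, and step A3 is performed; if it is a leaf node, step A5 is performed.
A5: Traverse the linear table corresponding to the leaf node, find the rule that matches the keyword, and end the process.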
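Steps A1 to A5 amount to a short loop. The sketch below is illustrative only: the dictionary layout, the `parse_compressed` callback and the `ternary_match` helper are assumptions made for the example, and rules are assumed to be stored in priority order in each leaf.

```python
def search(root, key, parse_compressed, match):
    """Walk compression nodes until a leaf is reached, then scan its rule list."""
    node = root
    while node["kind"] == "compressed":      # A2/A4: leaf node or compression node?
        node = parse_compressed(node, key)   # A3: parse the current compression node
    for rule in node["rules"]:               # A5: linear scan of the leaf's rules
        if match(rule, key):
            return rule
    return None


def ternary_match(rule, key):
    """A rule bit matches if it is '*' or equals the key bit."""
    return all(r in ("*", k) for r, k in zip(rule, key))


leaf0 = {"kind": "leaf", "rules": ["00*", "0**"]}
leaf1 = {"kind": "leaf", "rules": ["1*1"]}
tree = {"kind": "compressed", "children": [leaf0, leaf1]}
pick = lambda node, key: node["children"][int(key[0])]  # toy parser: branch on bit 0

print(search(tree, "010", pick, ternary_match))  # 0**
```

Passing the parser as a callback keeps the loop independent of which compression scheme produced the nodes.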
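Based on the search process in Figure 5, a shape-compressed node can be parsed layer by layer. The root of the compression node is taken as the first layer, and the shape bitmap of the compression node is used to determine whether each node in the compression node is a leaf node. Referring to Figure 6, the compression node is parsed as follows:
B1: Enter the compression node;
B2: Extract from the search keyword the bit corresponding to the first bit index;
B3: Go to the next layer;
B4: Determine whether the current layer number is greater than the compression level n;
Specifically, if not, step B5 is performed; if so, step B7 is performed.
B5: Calculate the position of the current node (that is, the current binary tree node) in the shape bitmap, and extract the bit at that position from the shape bitmap;
B6: Determine whether the shape bitmap bit of the current node is 0;
Specifically, if it is 0, step B9 is performed; if it is not 0, step B3 is performed.
B7: Enter the child node and read the type of the child node;
B8: Determine whether the child node is a leaf node;
Specifically, if so, step B9 is performed; if not, step B10 is performed.
B9: The node is a leaf node; the process ends;
B10: The child node is a compression node; the process ends.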
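The layer-by-layer parse can be sketched as below. Several details are assumptions rather than statements from the patent: nodes inside the compression node are addressed by heap position (root at 1, children of p at 2p and 2p + 1), `bit_index` holds one keyword bit position per heap slot, and the returned child slot is counted among the 2^n potential positions below layer n.

```python
def parse_shape_node(key, bit_index, shape, n):
    """Return ("leaf", pos) for a leaf inside the node, or ("child", slot)
    when the walk leaves the node through a child below layer n."""
    pos = 1                                  # B1: root of the compression node
    layer = 1
    while True:
        bit = int(key[bit_index[pos - 1]])   # B2: keyword bit chosen by this node
        pos = 2 * pos + bit                  # B3: move down one layer
        layer += 1
        if layer > n:                        # B4: past the compressed layers
            return ("child", pos - 2 ** n)   # B7: the child's type is read next
        if shape[pos - 1] == "0":            # B5/B6: shape bit 0 means leaf
            return ("leaf", pos)


# Shape bitmap 1100100 from the text; bit indexes exist only for intermediate nodes.
indexes = [0, 1, None, None, 2, None, None]
print(parse_shape_node("011", indexes, "1100100", 3))  # ('child', 3)
print(parse_shape_node("100", indexes, "1100100", 3))  # ('leaf', 3)
```

In a real layout the child slot would be combined with the stored starting address to locate the child node; how that index is compacted is not specified in the text.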
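By determining whether each node of the binary tree is a leaf node or a compression node, parsing the node when it is a compression node, and traversing the corresponding linear table to find the rule that matches the keyword when it is a leaf node, this search method reduces the search depth of the binary tree and improves search speed.
The method provided by this embodiment determines the compression level according to the number of bits of data read in one memory access, the number of bits used by the bit index of each intermediate node, the number of bits used by the starting address, the number of bits used by the compression node type, and the number of bits used by the bitmap, and compresses multiple nodes into one node according to the compression level. This greatly reduces the depth of the decision tree and improves search speed.
Embodiment 3
This embodiment provides a method for compressing a binary tree according to the number of intermediate nodes, also called the adaptive compression method.
First, a theorem used in this embodiment is introduced: in a binary tree in which every intermediate node has two child nodes, N connected intermediate nodes rooted at one intermediate node necessarily have (N + 1) child nodes. Proof:
1) If only one intermediate node is considered, it necessarily has two child nodes, so the theorem holds.
2) Consider K connected intermediate nodes rooted at one intermediate node, and assume that these K intermediate nodes have (K + 1) child nodes. When the number of connected intermediate nodes rooted at the same intermediate node becomes (K + 1), one of the original child nodes must have become one of the intermediate nodes under consideration. Because that child node is an intermediate node, it has 2 child nodes of its own, so the number of child nodes becomes (K + 1) - 1 + 2 = (K + 1) + 1.
Combining 1) and 2) proves the theorem.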
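Referring to Figure 7, the adaptive compression method includes:
301: Determine the number of intermediate nodes K according to the number of bits Nb of data read in one memory access, the number of bits Ni used by the bit index of each intermediate node, the number of bits Na used by the starting address of the child nodes of the compression node, the number of bits Nt used by the compression node type, and the number of bits used by the bitmaps;
For the adaptive compression method, the bitmaps are the shape bitmap and the external bitmap. The shape bitmap indicates the type of each node in the compression node. The external bitmap indicates the type of each child node of the compression node.
Specifically, from $K \times N_i + 2(K - 1) + (K + 1) + N_a + N_t \le N_b$ it follows that $K \le \frac{N_b - N_a - N_t + 1}{N_i + 3}$.
Here, 2(K - 1) is the number of bits used by the shape bitmap. A compression node involves 2K + 1 nodes in total. The first node of the compressed binary tree fragment is always an intermediate node, so the first bit of the shape bitmap is always 1; the last two nodes are always nodes that are not compressed, so the last two bits of the shape bitmap are always '00'. The shape bitmap can therefore be represented with 2K + 1 - 3 = 2(K - 1) bits. Because there are K + 1 external nodes in total, the external bitmap is represented with K + 1 bits.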
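Solving the inequality for K gives an integer division. A minimal sketch follows, assuming the bit budget is exactly the inequality in step 301; the function names are illustrative.

```python
def max_intermediate_nodes(nb: int, ni: int, na: int, nt: int) -> int:
    """Largest K with K*Ni + 2(K-1) + (K+1) + Na + Nt <= Nb."""
    return max(0, (nb - na - nt + 1) // (ni + 3))


def fits(k: int, nb: int, ni: int, na: int, nt: int) -> bool:
    """Check the bit budget directly for a candidate K."""
    return k * ni + 2 * (k - 1) + (k + 1) + na + nt <= nb


k = max_intermediate_nodes(128, 9, 32, 2)
print(k)  # 7
print(fits(k, 128, 9, 32, 2), fits(k + 1, 128, 9, 32, 2))  # True False
```

With Na = 32, Nt = 2, Ni = 9 and Nb = 128, seven intermediate nodes use 127 bits, while eight would need 139.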
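302: Compress the binary tree according to the number of intermediate nodes K to form at least one compression node;
Specifically, starting from the root node of the binary tree, a group of nodes whose count is less than or equal to the number of intermediate nodes K is taken as one compression node. Then, starting from the child nodes of that compression node, the binary tree continues to be compressed in the same way until the whole tree has been traversed.
For example, let Na = 32, Nt = 2 and Ni = 9. When Nb = 128, the number of intermediate nodes is K = 7 and the compression level is n = 3. Referring to the binary tree fragment in Figure 8, the shape compression method of Embodiment 2 compresses every 3 layers into one large node (that is, a compression node), so three large nodes are needed. With the adaptive compression method of this embodiment, every 7 intermediate nodes are compressed into one large node, so only one large node is needed and compression is more efficient.
In addition, when the root node is a leaf node, no compression is needed and there is no compression node.
303: To further improve compression efficiency, optimize using a breadth-first pruning algorithm;
Specifically, for each intermediate node, the number of intermediate nodes in its subtree, itself included, is counted. Starting from the root node of the binary tree, it is determined whether this count for an intermediate node is less than or equal to the number of intermediate nodes K. When the count for every intermediate node in a compression node is greater than K, the compression node is left unchanged. When the count for an intermediate node in a compression node is less than or equal to K, that intermediate node and all its child nodes are pruned off and become a new compression node. After pruning, the other nodes of the compression node, apart from that intermediate node, remain in the compression node; in other words, the number of intermediate nodes of the compression node associated with the newly pruned compression node is adjusted.
For example, suppose the calculated number of intermediate nodes is K = 7. In the binary tree fragment in Figure 9, hollow circles are intermediate nodes and solid circles are leaf nodes; with the adaptive compression method alone, 9 compression nodes are needed. With the breadth-first pruning algorithm, the count for the first intermediate node, itself included, is 15; since 15 > K, it is not pruned. The count for the second intermediate node, itself included, is 7; since 7 ≤ K, the second intermediate node and all its child nodes are pruned off as one compression node. Likewise, the count for the third intermediate node, itself included, is 7; since 7 ≤ K, the third intermediate node and all its child nodes are pruned off as one compression node. After pruning, the number of intermediate nodes of the compression node associated with the two pruned compression nodes is adjusted to 1. In the end only 3 compression nodes are formed, as shown in Figure 10, which greatly improves compression efficiency.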
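The pruning pass can be sketched as follows. This is a simplified illustration, not the patent's exact procedure: the `Node` class, `prune` and `full` are hypothetical names, the intermediate nodes left above the cuts are only counted rather than regrouped into compression nodes of at most K, and the example tree (15 intermediate nodes in four full layers) is an assumed stand-in for Figure 9.

```python
from collections import deque


class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

    def is_intermediate(self):
        return self.left is not None


def count_intermediate(node):
    """Intermediate nodes in the subtree rooted at node, itself included."""
    if node is None or not node.is_intermediate():
        return 0
    return 1 + count_intermediate(node.left) + count_intermediate(node.right)


def prune(root, k):
    """Breadth-first pass: cut off every subtree with at most k intermediate nodes.
    Returns (roots of the cut subtrees, intermediate nodes left above the cuts)."""
    cut, kept = [], 0
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if not node.is_intermediate():
            continue
        if count_intermediate(node) <= k:
            cut.append(node)   # becomes a compression node of its own
        else:
            kept += 1          # stays in the enclosing compression node
            queue.extend([node.left, node.right])
    return cut, kept


def full(depth):
    """Complete tree with `depth` layers of intermediate nodes above the leaves."""
    return Node() if depth == 0 else Node(full(depth - 1), full(depth - 1))


cut, kept = prune(full(4), 7)
print(len(cut), kept)  # 2 1
```

Two pruned subtrees plus the one remaining root give three compression nodes, matching the count in the example.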
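304: Establish the bitmaps of the compression node, including the shape bitmap and the external bitmap;
For the shape bitmap, the compression node is traversed in breadth-first order, the type of each node in it is identified in turn, and the result is used as the shape bitmap of the compression node.
For the external bitmap, the compression node is traversed in breadth-first order, the type of each child node is identified in turn, and the result is used as the external bitmap of the compression node.
Still taking Figure 8 as an example, the shape bitmap, ignoring the first node and the last two nodes, needs 2(K - 1) = 12 bits and is 010101010101; the external bitmap needs K + 1 bits and is 00000000.
Further, using a shape bitmap together with an external bitmap makes incremental updates possible: when the type of a child node of the compression node changes, the bit corresponding to that child node in the external bitmap is adjusted and the shape bitmap stays unchanged. For example, when the first leaf node in Figure 8 has leaf nodes attached below it and becomes an intermediate node, the corresponding bit in the external bitmap changes from 0 to 1 while the types of the other child nodes are unchanged, so the external bitmap is updated to 10000000 and the shape bitmap is unchanged.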
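The incremental update touches a single bit of the external bitmap. A minimal sketch, assuming bitmaps are modelled as strings of '0' and '1' and children are indexed in breadth-first order; `update_external` is an illustrative name.

```python
def update_external(external: str, child_index: int, is_compression_node: bool) -> str:
    """Return the external bitmap with one child's type bit rewritten."""
    bits = list(external)
    bits[child_index] = "1" if is_compression_node else "0"
    return "".join(bits)


shape = "010101010101"            # unchanged by the update
external = "00000000"
external = update_external(external, 0, True)
print(shape, external)  # 010101010101 10000000
```

Because only the child's type changes, nothing inside the compression node has to be rebuilt.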
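After the adaptive compression described above, and based on the search process in Figure 5, an adaptively compressed node can be parsed layer by layer. The root of the compression node is taken as the first layer. According to the shape bitmap of the compression node, when a node in the compression node is 0 in the shape bitmap, it is determined whether that node is 0 in the external bitmap, and the result determines whether the child node corresponding to that node is a leaf node. Referring to Figure 11, the compression node is parsed as follows:
C1: Enter the compression node;
C2: Extract from the search keyword the bit corresponding to the first bit index;
C3: Go to the next layer;
C4: Determine whether the position of the current node (that is, the current binary tree node) in the shape bitmap is greater than 2(K - 1);
Specifically, if not, step C5 is performed; if so, that is, when the position is greater than 2(K - 1), step C6 is performed.
C5: Extract the bit corresponding to the current node from the shape bitmap, and perform step C7;
C6: The shape bitmap bit of the current node is 0;
C7: Determine whether the shape bitmap bit of the current node is 0;
Specifically, if not, step C3 is performed; if so, that is, when the bit is 0, step C8 is performed.
C8: Extract the bit corresponding to the current node from the external bitmap;
C9: Determine whether the external bitmap bit of the current node is 0;
Specifically, if it is not 0, step C10 is performed; if it is 0, step C11 is performed.
C10: The child node is a compression node; the process ends;
C11: The child node is a leaf node; the process ends.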
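The flow in Figure 11 can be sketched as below. The position arithmetic is an assumption that the text does not spell out: because every intermediate node has two children, the children of the r-th intermediate node (in breadth-first order) sit at breadth-first positions 2r and 2r + 1, and the index of an external child equals the number of 0 bits before its position. `bit_index` is assumed to be ordered by intermediate-node rank, and the stored shape bitmap omits the first bit and the last two bits as described above.

```python
def parse_adaptive_node(key, bit_index, shape, external, k):
    """Return ("compressed", i) or ("leaf", i) for the i-th child of the node."""
    rank = 1                                    # C1: the root is the 1st intermediate node
    while True:
        bit = int(key[bit_index[rank - 1]])     # C2: keyword bit chosen by this node
        pos = 2 * rank + bit                    # C3: breadth-first position of the child
        stored = pos - 1                        # 1-based index into the stored bitmap
        # C4-C7: positions past 2(K-1) are implicitly 0.
        inside = stored <= 2 * (k - 1) and shape[stored - 1] == "1"
        ones_before = 1 + shape[:stored - 1].count("1")   # root plus earlier 1 bits
        if inside:
            rank = ones_before + 1              # this node is the next intermediate node
            continue
        child = (pos - 1) - ones_before         # C8: zeros before pos = child index
        kind = "compressed" if external[child] == "1" else "leaf"   # C9-C11
        return (kind, child)


# Bitmaps quoted in the text for Figure 8 (K = 7); bit indexes 0..6 are assumed.
shape, k = "010101010101", 7
print(parse_adaptive_node("1111111", list(range(7)), shape, "00000000", k))  # ('leaf', 7)
print(parse_adaptive_node("0000000", list(range(7)), shape, "10000000", k))  # ('compressed', 0)
```

The child index returned here would be added to the stored starting address to locate the child, since all children of a compression node are stored contiguously.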
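By determining whether each node of the binary tree is a leaf node or a compression node, parsing the node when it is a compression node, and traversing the corresponding linear table to find the rule that matches the keyword when it is a leaf node, this search method reduces the search depth of the binary tree and improves search speed.
The method provided by this embodiment determines the number of intermediate nodes according to the number of bits of data read in one memory access, the number of bits used by the bit index of each intermediate node, the number of bits used by the starting address, the number of bits used by the compression node type, and the number of bits used by the bitmaps, and compresses multiple nodes into one node according to the number of intermediate nodes. This greatly reduces the depth of the decision tree and improves search speed. The breadth-first pruning method further improves compression efficiency and further reduces the depth of the decision tree.
Embodiment 4
Referring to Figure 12, this embodiment provides a method for creating a binary tree, including:
401: Segment the non-range rules in the rule set using a bit-selection segmentation algorithm;
Specifically, each time, the bit with the highest segmentation efficiency and the fewest rule replications is selected for segmentation. This embodiment does not restrict the specific segmentation method.
402: When the segmentation efficiency is lower than a preset threshold, convert the range rules in the rule set into prefixes;
During conversion, the identifier of each rule corresponding to a prefix is kept the same as the identifier of the rule corresponding to the range.
After conversion, rules may overlap; rules that are completely covered and have lower priority are removed. For example, referring to Figure 13, if rule R1 has higher priority than rule R2 and R1 completely covers R2, then R2 can never be hit and can be removed from the rule set. In addition, if several rules in a leaf node have the same identifier, only one rule with that identifier is kept.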
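The range-to-prefix conversion in step 402 can be sketched with the standard greedy split into aligned power-of-two blocks. This is an illustration, not the patent's own procedure; `range_to_prefixes` and `expand_rule` are hypothetical names, and each produced prefix carries the identifier of the range rule it came from.

```python
def range_to_prefixes(lo: int, hi: int, width: int):
    """Split the integer range [lo, hi] into ternary prefixes of `width` bits."""
    prefixes = []
    while lo <= hi:
        # Largest aligned block that starts at lo and still fits inside [lo, hi].
        size = lo & -lo if lo else 1 << width
        while size > hi - lo + 1:
            size >>= 1
        free = size.bit_length() - 1            # number of trailing wildcard bits
        fixed = format(lo >> free, "b").zfill(width - free) if free < width else ""
        prefixes.append(fixed + "*" * free)
        lo += size
    return prefixes


def expand_rule(rule_id, lo, hi, width):
    """Every prefix keeps the identifier of the original range rule."""
    return [(rule_id, p) for p in range_to_prefixes(lo, hi, width)]


print(range_to_prefixes(4, 7, 4))              # ['01**']
print(len(range_to_prefixes(1, 65534, 16)))    # 30
print(expand_rule("R1", 2, 5, 3))              # [('R1', '01*'), ('R1', '10*')]
```

The 16-bit range [1, 65534] reproduces the worst case of 30 prefixes mentioned in the background, which is why deferring the conversion until segmentation efficiency drops keeps the rule set small.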
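403: Segment the converted rule set using the bit-selection segmentation algorithm;
404: Create the binary tree corresponding to the rule set according to all the segmentation results.
Here, all the segmentation results include the results of segmenting the non-range rules in the rule set and the results of segmenting the converted rule set.
Further, to address rule replication, the rules that need to be replicated are extracted during the bit-selection segmentation described above and placed in another sub-rule set; that is, an additional decision tree is created.
In addition, after a range has been converted, the resulting expanded rules have the same identifier as the original rule. Therefore, during extraction, all rules that need to be replicated and share the same identifier must be extracted and placed in the other sub-rule set.
The method provided by this embodiment first segments the non-range rules in the rule set using the bit-selection segmentation algorithm, and converts the range rules into prefixes only when the segmentation efficiency falls below a preset threshold. Converting ranges into prefixes only "when necessary" effectively avoids the rule expansion that results from converting all ranges into prefixes. In addition, extracting the rules that need to be replicated and placing them in another sub-rule set, that is, creating multiple decision trees, effectively reduces rule replication.
Embodiment 5
Referring to Figure 14, this embodiment provides a method for creating a binary tree, including:
501: Segment the rule set using a bit-selection segmentation algorithm;
This step is the same as step 401 and is not repeated here.
502: Extract the rules that need to be replicated during segmentation and place them in another sub-rule set;
503: Create the binary trees corresponding to the rule set and to the other sub-rule set respectively.
Further, after a range has been converted, the resulting expanded rules have the same identifier as the original rule. Therefore, during extraction, all rules that need to be replicated and share the same identifier must be extracted and placed in the other sub-rule set.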
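One segmentation step with extraction can be sketched as follows. The representation is assumed for illustration: rules are (identifier, ternary string) pairs, `split_on_bit` is a hypothetical name, and a rule "needs replication" when it has '*' at the selected bit. All entries sharing the identifier of such a rule are moved together.

```python
def split_on_bit(rules, bit):
    """Split rules on one bit. Returns (zeros, ones, extracted)."""
    # Identifiers of rules that would otherwise be copied into both halves.
    wildcard_ids = {rid for rid, pattern in rules if pattern[bit] == "*"}
    zeros, ones, extracted = [], [], []
    for rid, pattern in rules:
        if rid in wildcard_ids:
            extracted.append((rid, pattern))   # goes to the other sub-rule set
        elif pattern[bit] == "0":
            zeros.append((rid, pattern))
        else:
            ones.append((rid, pattern))
    return zeros, ones, extracted


rules = [(1, "00*"), (2, "1*0"), (3, "*01"), (3, "110"), (4, "01*")]
zeros, ones, extracted = split_on_bit(rules, 0)
print(zeros)      # [(1, '00*'), (4, '01*')]
print(ones)       # [(2, '1*0')]
print(extracted)  # [(3, '*01'), (3, '110')]
```

The extracted set would then be used to build a separate decision tree, so no rule appears in both halves of the main tree.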
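By extracting the rules that need to be replicated and placing them in another sub-rule set, that is, by creating multiple decision trees, the method provided by this embodiment effectively reduces rule replication.
Embodiment 6
Referring to Figure 15, this embodiment provides a device for compressing a binary tree, including:
a determining module 601, configured to determine a compression parameter, the compression parameter being a compression level n or a number of intermediate nodes K;
a compression module 602, configured to compress the binary tree according to the compression parameter to form at least one compression node;
a bitmap module 603, configured to establish a bitmap of the compression node.
The determining module 601 is specifically configured to determine the compression parameter according to the number of bits Nb of data read in one memory access, the number of bits Ni used by the bit index of each intermediate node, the number of bits Na used by the starting address of the child nodes of the compression node, the number of bits Nt used by the compression node type, and the number of bits used by the bitmap.
When the compression parameter is the compression level n,
the determining module 601 is specifically configured to
determine, from $(2^n - 1) \times N_i + (2^n - 1) + N_a + N_t \le N_b$, that $n \le \log_2\left(\frac{N_b - N_a - N_t}{N_i + 1} + 1\right)$, where Nb is the number of bits of data read in one memory access, Ni is the number of bits used by the bit index of each intermediate node, Na is the number of bits used by the starting address of the child nodes of the compression node, Nt is the number of bits used by the compression node type, and $(2^n - 1)$ is the number of bits used by the bitmap.
The compression module 602 is specifically configured to,
starting from the root node or a leaf node of the binary tree, take the nodes whose layer count is less than or equal to the compression level n as one compression node;
and, starting from the child nodes of the compression node, continue compressing the binary tree in the same way as for the compression node until the whole tree has been traversed.
The bitmap module 603 is specifically configured to
traverse the compression node in breadth-first order, identify the type of each node in turn, and use the result as the shape bitmap of the compression node.
When the compression parameter is the number of intermediate nodes K,
the determining module 601 is specifically configured to
determine, from $K \times N_i + 2(K - 1) + (K + 1) + N_a + N_t \le N_b$, that $K \le \frac{N_b - N_a - N_t + 1}{N_i + 3}$,
where Nb is the number of bits of data read in one memory access, Ni is the number of bits used by the bit index of each intermediate node, Na is the number of bits used by the starting address of the child nodes of the compression node, Nt is the number of bits used by the compression node type, 2(K - 1) is the number of bits used by the shape bitmap ignoring the first node and the last two nodes, and (K + 1) is the number of bits used by the external bitmap.
The compression module 602 is specifically configured to,
starting from the root node of the binary tree, take a group of nodes whose count is less than or equal to the number of intermediate nodes K as one compression node;
and, starting from the child nodes of the compression node, continue compressing the binary tree in the same way as for the compression node until the whole tree has been traversed.
Further, the compression module 602 is also configured to, after the whole tree has been traversed,
count, for each intermediate node, the number of intermediate nodes in its subtree, itself included;
starting from the root node of the binary tree, determine whether this count for an intermediate node is less than or equal to the number of intermediate nodes K;
when the count for every intermediate node in a compression node is greater than the number of intermediate nodes K, leave the compression node unchanged;
when the count for an intermediate node in a compression node is less than or equal to the number of intermediate nodes K, take that intermediate node and all its child nodes as a new compression node, and keep the other nodes of the compression node, apart from that intermediate node, in the compression node.
The bitmap module 603 is specifically configured to
traverse the compression node in breadth-first order, identify the type of each node in turn, and use the result as the shape bitmap of the compression node;
traverse the compression node in breadth-first order, identify the type of each child node in turn, and use the result as the external bitmap of the compression node.
Further, the device also includes:
an incremental update module, configured to, after the bitmap of the compression node has been established and when the type of a child node of the compression node changes, adjust the bit corresponding to that child node in the external bitmap while leaving the shape bitmap unchanged, thereby implementing incremental updates.
By determining a compression level or a number of intermediate nodes and compressing multiple nodes into one node accordingly, the device provided by this embodiment greatly reduces the depth of the decision tree and improves search speed.
Embodiment 7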
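Referring to Figure 16, this embodiment provides a device for creating a binary tree, including:
a first segmentation module 701, configured to segment the non-range rules in the rule set using a bit-selection segmentation algorithm;
a conversion module 702, configured to convert the range rules in the rule set into prefixes when the segmentation efficiency is lower than a preset threshold, keeping the identifier of each rule corresponding to a prefix the same as the identifier of the corresponding range rule;
Specifically, when the segmentation efficiency is lower than the preset threshold, the ranges in the rule set are converted into prefixes, and the identifier of the rule corresponding to each prefix is kept the same as the identifier of the rule corresponding to the range.
Further, after the ranges in the rule set have been converted into prefixes, rules that are covered and have lower priority are removed.
a second segmentation module 703, configured to segment the converted rule set using the bit-selection segmentation algorithm;
a creation module 704, configured to create the binary tree corresponding to the rule set according to all the segmentation results.
Further, the device also includes:
an extraction module, configured to extract the rules that need to be replicated during segmentation and place them in another sub-rule set.
Specifically, all rules that need to be replicated and share the same identifier are extracted and placed in the other sub-rule set.
The device provided by this embodiment first segments the rule set using the bit-selection segmentation algorithm, and converts the ranges in the rule set into prefixes only when the segmentation efficiency falls below a preset threshold. Converting ranges into prefixes only "when necessary" effectively avoids the rule expansion that results from converting all ranges into prefixes. In addition, extracting the rules that need to be replicated and placing them in another sub-rule set, that is, creating multiple decision trees, effectively reduces rule replication.
Embodiment 8
Referring to Figure 17, this embodiment provides a device for creating a binary tree, including:
a segmentation module 801, configured to segment the rule set using a bit-selection segmentation algorithm;
an extraction module 802, configured to extract the rules that need to be replicated during segmentation and place them in another sub-rule set;
a creation module 803, configured to create the binary trees corresponding to the rule set and to the other sub-rule set respectively.
Specifically, all rules that need to be replicated and share the same identifier are extracted and placed in the other sub-rule set.
By extracting the rules that need to be replicated and placing them in another sub-rule set, that is, by creating multiple decision trees, the device provided by this embodiment effectively reduces rule replication.
Embodiment 9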
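Referring to Figure 18, this embodiment provides a device for searching a binary tree, including:
an obtaining module 901, configured to obtain a search keyword;
a determining module 902, configured to determine whether each node of the binary tree is a leaf node or a compression node;
a processing module 903, configured to parse the compression node when the node is a compression node and, when the node is a leaf node, to traverse the linear table corresponding to the leaf node and find the rule that matches the keyword.
The processing module 903 includes a first parsing unit 903a, configured to determine, according to the shape bitmap of the compression node, whether each node in the compression node is a leaf node. The specific process is shown in Figure 6 and is not repeated here.
The processing module 903 includes a second parsing unit 903b, configured to, according to the shape bitmap of the compression node, when a node in the compression node is 0 in the shape bitmap, determine whether that node is 0 in the external bitmap, and determine from the result whether the child node corresponding to that node is a leaf node. The specific process is shown in Figure 11 and is not repeated here.
By determining whether each node of the binary tree is a leaf node or a compression node, parsing the node when it is a compression node, and traversing the corresponding linear table to find the rule that matches the keyword when it is a leaf node, the device provided by this embodiment reduces the search depth of the binary tree and improves search speed.
The embodiments of the present invention may be implemented in software, and the corresponding software program may be stored in a readable storage medium, for example a computer hard disk, cache or optical disc.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.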

Claims

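1. A method for compressing a binary tree, characterized in that the method comprises:
determining a compression parameter, the compression parameter being a compression level n or a number of intermediate nodes K;
compressing the binary tree according to the compression parameter to form at least one compression node;
establishing a bitmap of the compression node.
2. The method according to claim 1, characterized in that determining the compression parameter comprises:
determining the compression parameter according to the number of bits Nb of data read in one memory access, the number of bits Ni used by the bit index of each intermediate node, the number of bits Na used by the starting address of the child nodes of the compression node, the number of bits Nt used by the compression node type, and the number of bits used by the bitmap.
3. The method according to claim 1 or 2, characterized in that, when the compression parameter is the compression level n, determining the compression parameter specifically comprises:
determining, from $(2^n - 1) \times N_i + (2^n - 1) + N_a + N_t \le N_b$, that $n \le \log_2\left(\frac{N_b - N_a - N_t}{N_i + 1} + 1\right)$, where Nb is the number of bits of data read in one memory access, Ni is the number of bits used by the bit index of each intermediate node, Na is the number of bits used by the starting address of the child nodes of the compression node, Nt is the number of bits used by the compression node type, and $(2^n - 1)$ is the number of bits used by the bitmap.
4. The method according to claim 1, characterized in that, when the compression parameter is the compression level n,
compressing the binary tree according to the compression parameter to form at least one compression node specifically comprises:
starting from the root node or a leaf node of the binary tree, taking the nodes whose layer count is less than or equal to the compression level n as one compression node;
starting from the child nodes of the compression node, continuing to compress the binary tree in the same way as for the compression node until the binary tree has been fully traversed.
5. The method according to claim 1, characterized in that, when the compression parameter is the compression level n,
establishing the bitmap of the compression node specifically comprises:
traversing the compression node in breadth-first order, identifying the type of each node in turn, and using the result as the shape bitmap of the compression node.
6. The method according to claim 1 or 2, characterized in that, when the compression parameter is the number of intermediate nodes K, determining the compression parameter specifically comprises:
determining, from $K \times N_i + 2(K - 1) + (K + 1) + N_a + N_t \le N_b$, that $K \le \frac{N_b - N_a - N_t + 1}{N_i + 3}$, where Nb is the number of bits of data read in one memory access, Ni is the number of bits used by the bit index of each intermediate node, Na is the number of bits used by the starting address of the child nodes of the compression node, Nt is the number of bits used by the compression node type, 2(K - 1) is the number of bits used by the shape bitmap ignoring the first node and the last two nodes, and (K + 1) is the number of bits used by the external bitmap.
7. The method according to claim 1, characterized in that, when the compression parameter is the number of intermediate nodes K, compressing the binary tree according to the compression parameter to form at least one compression node specifically comprises:
starting from the root node of the binary tree, taking a group of nodes whose count is less than or equal to the number of intermediate nodes K as one compression node;
starting from the child nodes of the compression node, continuing to compress the binary tree in the same way as for the compression node until the binary tree has been fully traversed.
8. The method according to claim 7, characterized in that, after the binary tree has been fully traversed, the method comprises:
counting, for each intermediate node, the number of intermediate nodes in its subtree, itself included;
starting from the root node of the binary tree, determining whether this count for an intermediate node is less than or equal to the number of intermediate nodes K;
when the count for every intermediate node in a compression node is greater than the number of intermediate nodes K, leaving the compression node unchanged;
when the count for an intermediate node in a compression node is less than or equal to the number of intermediate nodes K, taking that intermediate node and all its child nodes as a new compression node, and keeping the other nodes of the compression node, apart from that intermediate node, in the compression node.
9. The method according to claim 1, characterized in that, when the compression parameter is the number of intermediate nodes K, establishing the bitmap of the compression node specifically comprises:
traversing the compression node in breadth-first order, identifying the type of each node in turn, and using the result as the shape bitmap of the compression node;
traversing the compression node in breadth-first order, identifying the type of each child node in turn, and using the result as the external bitmap of the compression node.
10. The method according to claim 9, characterized in that, after establishing the bitmap of the compression node, the method further comprises:
when the type of a child node of the compression node changes, adjusting the bit corresponding to that child node in the external bitmap while leaving the shape bitmap unchanged, thereby implementing incremental updates.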
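11. A method for creating a binary tree, characterized in that the method comprises:
segmenting the non-range rules in a rule set using a bit-selection segmentation algorithm;
when the segmentation efficiency is lower than a preset threshold, converting the range rules in the rule set into prefixes, keeping the identifier of each rule corresponding to a prefix the same as the identifier of the corresponding range rule;
segmenting the converted rule set using the bit-selection segmentation algorithm;
creating the binary tree corresponding to the rule set according to all the segmentation results.
12. The method according to claim 11, characterized in that the method further comprises:
extracting all rules that need to be replicated during segmentation and share the same identifier, and placing them in another sub-rule set;
creating the binary tree corresponding to the other sub-rule set.
13. The method according to claim 11, characterized in that, after converting the range rules in the rule set into prefixes when the segmentation efficiency is lower than the preset threshold, the method comprises:
removing rules that are covered and have lower priority.
14. A method for creating a binary tree, characterized in that the method comprises:
segmenting a rule set using a bit-selection segmentation algorithm;
extracting the rules that need to be replicated during segmentation and placing them in another sub-rule set;
creating the binary trees corresponding to the rule set and to the other sub-rule set respectively.
15. The method according to claim 14, characterized in that extracting the rules that need to be replicated and placing them in another sub-rule set comprises:
extracting all rules that need to be replicated and share the same identifier, and placing them in the other sub-rule set.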
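16. A device for compressing a binary tree, characterized in that the device comprises:
a determining module, configured to determine a compression parameter, the compression parameter being a compression level n or a number of intermediate nodes K;
a compression module, configured to compress the binary tree according to the compression parameter to form at least one compression node;
a bitmap module, configured to establish a bitmap of the compression node.
17. The device according to claim 16, characterized in that, when the compression parameter is the compression level n, the compression module is specifically configured to,
starting from the root node or a leaf node of the binary tree, take the nodes whose layer count is less than or equal to the compression level n as one compression node;
and, starting from the child nodes of the compression node, continue compressing the binary tree in the same way as for the compression node until the binary tree has been fully traversed.
18. The device according to claim 16, characterized in that, when the compression parameter is the compression level n, the bitmap module is specifically configured to
traverse the compression node in breadth-first order, identify the type of each node in turn, and use the result as the shape bitmap of the compression node.
19. The device according to claim 16, characterized in that, when the compression parameter is the number of intermediate nodes K, the compression module is specifically configured to,
starting from the root node of the binary tree, take a group of nodes whose count is less than or equal to the number of intermediate nodes K as one compression node;
and, starting from the child nodes of the compression node, continue compressing the binary tree in the same way as for the compression node until the binary tree has been fully traversed.
20. The device according to claim 19, characterized in that the compression module is further configured to, after the binary tree has been fully traversed,
count, for each intermediate node, the number of intermediate nodes in its subtree, itself included;
starting from the root node of the binary tree, determine whether this count for an intermediate node is less than or equal to the number of intermediate nodes K;
when the count for every intermediate node in a compression node is greater than the number of intermediate nodes K, leave the compression node unchanged;
when the count for an intermediate node in a compression node is less than or equal to the number of intermediate nodes K, take that intermediate node and all its child nodes as a new compression node, and keep the other nodes of the compression node, apart from that intermediate node, in the compression node.
21. The device according to claim 16, characterized in that, when the compression parameter is the number of intermediate nodes K, the bitmap module is specifically configured to
traverse the compression node in breadth-first order, identify the type of each node in turn, and use the result as the shape bitmap of the compression node;
traverse the compression node in breadth-first order, identify the type of each child node in turn, and use the result as the external bitmap of the compression node.
22. The device according to claim 21, characterized in that the device further comprises:
an incremental update module, configured to, after the bitmap of the compression node has been established and when the type of a child node of the compression node changes, adjust the bit corresponding to that child node in the external bitmap while leaving the shape bitmap unchanged, thereby implementing incremental updates.
23. A device for creating a binary tree, characterized in that the device comprises:
a first segmentation module, configured to segment the non-range rules in a rule set using a bit-selection segmentation algorithm;
a conversion module, configured to convert the range rules in the rule set into prefixes when the segmentation efficiency is lower than a preset threshold, keeping the identifier of each rule corresponding to a prefix the same as the identifier of the corresponding range rule;
a second segmentation module, configured to segment the converted rule set using the bit-selection segmentation algorithm;
a creation module, configured to create the binary tree corresponding to the rule set according to all the segmentation results.
24. The device according to claim 23, characterized in that the device further comprises:
an extraction module, configured to extract all rules that need to be replicated during segmentation and share the same identifier, and place them in another sub-rule set.
25. A device for creating a binary tree, characterized in that the device comprises:
a segmentation module, configured to segment a rule set using a bit-selection segmentation algorithm;
an extraction module, configured to extract the rules that need to be replicated during segmentation and place them in another sub-rule set;
a creation module, configured to create the binary trees corresponding to the rule set and to the other sub-rule set respectively.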
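26. A method for searching a binary tree, characterized in that the method comprises:
obtaining a search keyword;
determining whether each node of the binary tree is a leaf node or a compression node;
when the node is a compression node, parsing the compression node;
when the node is a leaf node, traversing the linear table corresponding to the leaf node and finding the rule that matches the keyword.
27. The method according to claim 26, characterized in that parsing the compression node specifically comprises:
determining, according to the shape bitmap of the compression node, whether each node in the compression node is a leaf node.
28. The method according to claim 26, characterized in that parsing the compression node specifically comprises:
according to the shape bitmap of the compression node, when a node in the compression node is 0 in the shape bitmap, determining whether that node is 0 in the external bitmap, and determining from the result whether the child node corresponding to that node is a leaf node.
29. A device for searching a binary tree, characterized in that the device comprises:
an obtaining module, configured to obtain a search keyword;
a determining module, configured to determine whether each node of the binary tree is a leaf node or a compression node;
a processing module, configured to parse the compression node when the node is a compression node and, when the node is a leaf node, to traverse the linear table corresponding to the leaf node and find the rule that matches the keyword.
30. The device according to claim 29, characterized in that the processing module comprises a first parsing unit, configured to
determine, according to the shape bitmap of the compression node, whether each node in the compression node is a leaf node.
31. The device according to claim 29, characterized in that the processing module comprises a second parsing unit, configured to,
according to the shape bitmap of the compression node, when a node in the compression node is 0 in the shape bitmap, determine whether that node is 0 in the external bitmap, and determine from the result whether the child node corresponding to that node is a leaf node.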
PCT/CN2010/076299 2010-08-24 2010-08-24 Methods and devices for creating, compressing and searching binary tree WO2011110003A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
PCT/CN2010/076299 WO2011110003A1 (zh) 2010-08-24 2010-08-24 Methods and devices for creating, compressing and searching binary tree
EP10847261A EP2477363A4 (en) 2010-08-24 2010-08-24 METHOD AND DEVICES FOR CONSTRUCTION, COMPRESSION AND SEARCH BINARY HIERARCHIES
CN201080003336.3A CN102405622B (zh) 2010-08-24 2010-08-24 二叉树建立、压缩和查找的方法和装置
US13/353,884 US8711014B2 (en) 2010-08-24 2012-01-19 Methods and devices for creating, compressing and searching binary tree
US14/213,167 US9521082B2 (en) 2010-08-24 2014-03-14 Methods and devices for creating, compressing and searching binary tree

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2010/076299 WO2011110003A1 (zh) 2010-08-24 2010-08-24 Methods and devices for creating, compressing and searching binary tree

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/353,884 Continuation US8711014B2 (en) 2010-08-24 2012-01-19 Methods and devices for creating, compressing and searching binary tree

Publications (1)

Publication Number Publication Date
WO2011110003A1 true WO2011110003A1 (zh) 2011-09-15

Family

ID=44562832

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2010/076299 WO2011110003A1 (zh) 2010-08-24 2010-08-24 Methods and devices for creating, compressing and searching binary tree

Country Status (4)

Country Link
US (2) US8711014B2 (zh)
EP (1) EP2477363A4 (zh)
CN (1) CN102405622B (zh)
WO (1) WO2011110003A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107577756A (zh) * 2017-08-31 2018-01-12 南通大学 一种基于多层迭代的改进递归数据流匹配方法

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8856138B1 (en) * 2012-08-09 2014-10-07 Google Inc. Faster substring searching using hybrid range query data structures
US20150256450A1 (en) * 2012-09-28 2015-09-10 Siyu Yang Generating a Shape Graph for a Routing Table
US9305041B2 (en) 2014-01-06 2016-04-05 International Business Machines Corporation Compression of serialized B-tree data
US9667528B2 (en) 2014-03-31 2017-05-30 Vmware, Inc. Fast lookup and update of current hop limit
US9578617B2 (en) * 2014-08-19 2017-02-21 Walkbase Oy Anonymous device position measuring system and method
CN106209614B (zh) * 2015-04-30 2019-09-17 新华三技术有限公司 一种网包分类方法和装置
CN106845990B (zh) * 2015-12-03 2020-09-18 阿里巴巴集团控股有限公司 一种规则处理方法和设备
WO2017143988A1 (en) * 2016-02-26 2017-08-31 Versitech Limited Shape-adaptive model-based codec for lossy and lossless compression of images
CN108334888B (zh) * 2017-01-20 2022-03-11 微软技术许可有限责任公司 针对比特序列的压缩编码
CN106971528A (zh) * 2017-03-31 2017-07-21 上海智觅智能科技有限公司 一种压缩红外空调遥控码库的算法
US10402094B2 (en) * 2017-10-17 2019-09-03 Seagate Technology Llc Mapping system for data storage devices
CN108170866B (zh) * 2018-01-30 2022-03-11 深圳市茁壮网络股份有限公司 一种样本查找方法及装置
US11204962B2 (en) 2018-10-01 2021-12-21 Palo Alto Networks, Inc. Explorable visual analytics system having reduced latency
CN109558520A (zh) * 2018-11-28 2019-04-02 平安科技(深圳)有限公司 一种基于用户画像的数据处理方法和装置
US11240355B2 (en) * 2019-05-17 2022-02-01 Arista Networks, Inc. Platform agnostic abstraction for forwarding equivalence classes with hierarchy
CN110263862B (zh) * 2019-06-21 2021-05-07 北京字节跳动网络技术有限公司 信息的推送方法、装置、电子设备及可读存储介质
CN112565072B (zh) * 2020-11-02 2022-08-09 鹏城实验室 一种路由表压缩方法、路由器及存储介质
CN112507665B (zh) * 2021-02-01 2021-06-01 北京江融信科技有限公司 一种基于圆周率pi的中文数据压缩和同步加密方法及系统
CN114745453B (zh) * 2022-03-21 2024-06-11 北京左江科技股份有限公司 一种将熵二叉树合并为多叉树的方法
CN115776468B (zh) * 2022-11-11 2024-06-18 北京大学 一种提高频谱效率的数据包封装方法

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101241499A (zh) * 2008-02-26 2008-08-13 中兴通讯股份有限公司 Patricia树快速查找方法
CN100536435C (zh) * 2007-03-13 2009-09-02 中兴通讯股份有限公司 一种基于二叉树的流分类查找方法
CN101741708A (zh) * 2008-11-13 2010-06-16 华为技术有限公司 一种存储数据的方法、装置及系统

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69334349D1 (de) * 1992-09-01 2011-04-21 Apple Inc Verbesserte Vektorquatisierung
JP3746098B2 (ja) * 1996-02-28 2006-02-15 株式会社日立製作所 データの暗号化装置
WO2000005599A2 (en) * 1998-07-22 2000-02-03 Geo Energy, Inc. Fast compression and transmission of seismic data
US7249149B1 (en) * 1999-08-10 2007-07-24 Washington University Tree bitmap data structures and their use in performing lookup operations
CA2387653C (en) * 1999-08-13 2006-11-14 Fujitsu Limited File processing method, data processing device and storage medium
US7039641B2 (en) 2000-02-24 2006-05-02 Lucent Technologies Inc. Modular packet classification
US6996071B2 (en) * 2001-04-30 2006-02-07 Adtran Inc. Binary decision tree-based arbitrator for packetized communications
US6983334B2 (en) * 2001-11-07 2006-01-03 International Business Machines Corporation Method and system of tracking missing packets in a multicast TFTP environment
US20030236793A1 (en) * 2002-06-19 2003-12-25 Ericsson Inc. Compressed prefix tree structure and method for traversing a compressed prefix tree
JP4037875B2 (ja) * 2005-02-24 2008-01-23 株式会社東芝 コンピュータグラフィックスデータ符号化装置、復号化装置、符号化方法、および、復号化方法
EP2055050A1 (en) * 2006-07-27 2009-05-06 University Of Florida Research Foundation, Inc. Dynamic tree bitmap for ip lookup and update
CN101577662B (zh) * 2008-05-05 2012-04-04 华为技术有限公司 一种基于树形数据结构的最长前缀匹配方法和装置

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
GUAN, AIFANG ET AL.: "Multi-field packet Classification Algorithm Based on Aggregated and Folded Vector", APPLICATION RESEARCH OF COMPUTERS, vol. 24, no. 9, September 2007 (2007-09-01), pages 276 - 281, XP008161165 *
See also references of EP2477363A4 *
SUN, WEIQIANG ET AL.: "Expanded Compressed Trie Algorithm-a Trie-based Fast Routing Lookup Algorithm", COMPUTER ENGINEERING AND APPLICATIONS, vol. 22, 2001, pages 51 - 52, XP009161090 *
SUN, YI ET AL.: "Research on Packet Classification Algorithm", APPLICATION RESEARCH OF COMPUTERS, vol. 24, no. 4, April 2007 (2007-04-01), pages 5 - 11, XP008161293 *
XIAO, JINGE ET AL.: "Research on Rule Conversion Method in Packet Classification Algorithm", COMPUTER ENGINEERING, vol. 35, no. 9, May 2009 (2009-05-01), pages 46 - 48, XP008161264 *
XU, KE ET AL.: "Survey on Routing Lookup Algorithms", JOURNAL OF SOFTWARE, vol. 13, no. 1, 2002, XP009025401 *
YAO, XINGMIAO ET AL.: "A Multi-dimensional Packet Classification Algorithm with Trees Divided by Value", JOURNAL OF ELECTRONICS AND INFORMATION TECHNOLOGY, vol. 26, no. 9, 2004, pages 1414 - 1417, XP008161097 *
ZHANG, FEIFEI ET AL.: "Non-backtracking Longest Prefix Match Search Algorithm", COMPUTER ENGINEERING, vol. 34, no. 10, May 2008 (2008-05-01), pages 52 - 54, XP008161487 *

Also Published As

Publication number Publication date
CN102405622B (zh) 2014-11-05
US9521082B2 (en) 2016-12-13
US8711014B2 (en) 2014-04-29
CN102405622A (zh) 2012-04-04
US20140198807A1 (en) 2014-07-17
EP2477363A4 (en) 2012-08-22
EP2477363A1 (en) 2012-07-18
US20120119927A1 (en) 2012-05-17

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201080003336.3

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10847261

Country of ref document: EP

Kind code of ref document: A1

REEP Request for entry into the european phase

Ref document number: 2010847261

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2010847261

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE