WO2011108168A1

WO2011108168A1 - Packet classifier, packet classification method, and packet classification program

Info

Publication number: WO2011108168A1
Application number: PCT/JP2010/072548
Authority: WO
Inventors: 則夫山垣
Original assignee: 日本電気株式会社
Priority date: 2010-03-05
Filing date: 2010-12-15
Publication date: 2011-09-09
Also published as: JP5673667B2; JPWO2011108168A1

Abstract

Disclosed is a packet classifier that searches a rule set, consisting of a plurality of rules defined using a plurality of fields, for a rule that matches a search key. The packet classifier uses a Decision Tree to narrow a large number of rules down to a predetermined number of rules that may match; uses, for each predetermined piece of data in the search key, Bit Vectors whose length equals the number of rules narrowed down by the Decision Tree; uses a rule identifier list, which lists the rule identifiers indicated by the bit positions of these Bit Vectors, to identify the matching rules among the narrowed-down rules; and determines the final matching rule according to the priorities of the identified rules.

Description

Packet classifier, packet classification method, packet classification program

The present invention relates to a packet classifier, and more particularly to a packet classifier using a plurality of packet header fields as search keys.

Packet classification (Packet Classification) is an important technology for classifying packets into series of packet sequences called flows in routers and switches on a network, and for providing QoS (Quality of Service) for individual flows. It is also an indispensable technology for realizing value-added network applications such as firewalls (Firewall), network intrusion detection systems (NIDS: Network Intrusion Detection System), and network intrusion prevention systems (NIPS: Network Intrusion Prevention System).

In packet classification, a plurality of packet header fields are used as the search key: for example, the source IP address, destination IP address, and protocol number defined in the IP (Internet Protocol) header of the packet, together with the source port number and destination port number defined in the TCP (Transmission Control Protocol) / UDP (User Datagram Protocol) header. A series of packet sequences specified by this search key is called a flow. The above five packet header fields are generally called the 5-tuple. The search key is defined in advance as a rule (sometimes referred to as a filter), and packet classification using a plurality of packet header fields in this way is referred to in particular as Multi-Field Packet Classification. The following matching methods are used in rules: Exact Match, which defines the packet header field as a specific value; Prefix Match, which specifies some number of upper bits of the packet header field and leaves the remaining lower bits undefined using a wildcard (*); Range Match, which defines the packet header field as a range between two specific values; and Wildcard Match, which defines the packet header field by specifying wildcards in units of individual bits. For example, for an 8-bit packet header field, specifying a specific value such as “00110101” is Exact Match; specifying “0011****”, that is, any value whose upper 4 bits are 0011, is Prefix Match; specifying a range such as [3-64], so that the field viewed as a decimal number only needs to lie between 3 and 64, is Range Match; and using wildcards at individual bit positions of the field, such as “0**10*01”, is Wildcard Match.
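The four matching methods can be illustrated with a minimal sketch for a single 8-bit field. The function names and signatures below are hypothetical and chosen only for illustration; they do not appear in the cited documents.

```python
# Illustrative sketch of the four matching methods on one 8-bit field.
# All function names here are hypothetical, not taken from the source.

def exact_match(value: int, pattern: int) -> bool:
    """Exact Match: the field must equal one specific value."""
    return value == pattern

def prefix_match(value: int, prefix: int, prefix_len: int, width: int = 8) -> bool:
    """Prefix Match: only the upper prefix_len bits are compared."""
    return (value >> (width - prefix_len)) == prefix

def range_match(value: int, low: int, high: int) -> bool:
    """Range Match: the field must lie in the closed range [low, high]."""
    return low <= value <= high

def wildcard_match(value: int, pattern: str) -> bool:
    """Wildcard Match: '*' may appear at any individual bit position."""
    bits = format(value, f"0{len(pattern)}b")
    return all(p == "*" or p == b for p, b in zip(pattern, bits))

field = 0b00110101  # decimal 53
print(exact_match(field, 0b00110101))
print(prefix_match(field, 0b0011, 4))
print(range_match(field, 3, 64))
print(wildcard_match(field, "0**10*01"))
```

The same field value satisfies all four example patterns quoted in the text, which shows that the methods differ only in how much of the field a rule constrains.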

One of the technical issues in such Multi-Field Packet Classification technology is how to keep processing in routers and switches fast as rule sets grow larger and link speeds increase. In order to realize high-speed processing, a ternary content addressable memory (TCAM) is often used.

However, TCAM has problems such as high cost, high power consumption, and large circuit scale. In addition, when Range Match is used, a rule must be split into multiple rules expressed with Prefix Match, so there is a problem that the number of rules increases.

On the other hand, to avoid the high cost and high power consumption of TCAM, various Multi-Field Packet Classification methods have been proposed that use static random access memory (SRAM) or dynamic random access memory (DRAM), which are lower in cost and power consumption.

For example, Non-Patent Document 1 proposes a technique called HyperCuts that uses a Decision Tree (decision tree). A method based on such a decision tree is briefly described below with reference to FIGS. 1, 2, and 3.

FIG. 1 is a diagram illustrating an example of a rule set including 12 rules, R0 to R11, defined using two fields X and Y of 4 bits each. The fields X and Y are 4 bits each here, but they correspond to actual packet header fields such as a source IP address and a source port number. The field X is expressed in binary, where “*” represents a wildcard bit whose value may be 0 or 1. The field Y is expressed as a Range Match “[a:b]”, where a is the lower limit and b is the upper limit (both in decimal notation). In general, each rule is also given a priority (Priority) and a method of handling a matching packet (Action), but these are omitted here.

FIG. 2 is a diagram showing the rules of the rule set of FIG. 1 in the two-dimensional space represented by the fields X and Y. Note that the numbers on the X-axis and the Y-axis are expressed in decimal.

In a Decision Tree based method such as HyperCuts, the space shown in FIG. 2 is divided along a plurality of dimensions, and the division is repeated until the number of rules existing in each divided region is at or below a certain threshold; the decision tree is constructed in this way. Here, the group of rules managed in a divided region is referred to as a rule list. FIG. 3 is a diagram showing an example of a basic Decision Tree constructed for this rule set. In the decision tree shown in FIG. 3, the threshold on the number of rules in a divided region is set to 2. In FIG. 3, X and Y are first each divided into two, giving four regions. As a result, the entire space (X, Y) = ([0:15], [0:15]) is divided into region 0 ([0:7], [0:7]), region 1 ([0:7], [8:15]), region 2 ([8:15], [0:7]), and region 3 ([8:15], [8:15]). At this time, the rule lists managed in the regions are [R5, R6, R7, R9] (region 0), [R0, R3, R5, R6, R11] (region 1), [R1, R2, R4, R10] (region 2), and [R3, R4, R8] (region 3). Since each region still manages more rules than the threshold of 2, each region is divided further until its number of rules is equal to or less than the threshold. Note that FIG. 3 is merely an example; the algorithm for constructing the Decision Tree is described in Non-Patent Document 1 and is omitted here.

When packet classification is performed, the Decision Tree is traversed, and all the rules (at most the threshold number) managed by the node that is reached are searched. For example, when packet classification is performed on a packet having X = 0111 and Y = 1001, the tree is traversed from the root node of the Decision Tree. In the decision tree shown in FIG. 3, the root node is divided into the above four regions, and the packet is found to belong to region 1 ([0:7], [8:15]). Next, the node for region 1 is divided into two in the X direction and two in the Y direction, giving four regions: region 10 ([0:3], [8:11]), region 11 ([0:3], [12:15]), region 12 ([4:7], [8:11]), and region 13 ([4:7], [12:15]). Since the packet belongs to region 12, the traversal continues to that node. The next node is divided into two in the Y direction, into region 120 ([4:7], [8:9]) and region 121 ([4:7], [10:11]). The packet belongs to region 120, so R3 and R6, the rules managed in region 120, are searched to determine the matching rule. In this case, since the packet matches both R3 and R6, the matching rule is selected according to the priority assigned to each rule (priorities are omitted from the rule set figure).
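The traversal in this example can be reproduced with a small sketch that halves the X and Y intervals at each level. The function name trace_regions, the equal-size splits, and the child numbering (X index times the number of Y parts, plus the Y index) are assumptions inferred from the region numbers quoted in the text, not a description of the HyperCuts implementation itself.

```python
# Illustrative sketch: follow a key through successive region divisions.
# `cuts` lists, per tree level, how many equal parts X and Y are split into.
# The child numbering is an assumption inferred from the region labels
# used in the text (region 1 -> region 12 -> region 120).

def trace_regions(x, y, cuts, width=4):
    x_lo, x_hi = 0, (1 << width) - 1
    y_lo, y_hi = 0, (1 << width) - 1
    label = ""
    for x_parts, y_parts in cuts:
        x_size = (x_hi - x_lo + 1) // x_parts
        y_size = (y_hi - y_lo + 1) // y_parts
        xi = (x - x_lo) // x_size          # which X slice holds the key
        yi = (y - y_lo) // y_size          # which Y slice holds the key
        label += str(xi * y_parts + yi)    # child index within this node
        x_lo, x_hi = x_lo + xi * x_size, x_lo + (xi + 1) * x_size - 1
        y_lo, y_hi = y_lo + yi * y_size, y_lo + (yi + 1) * y_size - 1
    return label, (x_lo, x_hi), (y_lo, y_hi)

# Packet X = 0111, Y = 1001; splits 2x2, then 2x2, then Y only.
label, x_range, y_range = trace_regions(0b0111, 0b1001, [(2, 2), (2, 2), (1, 2)])
print(label, x_range, y_range)  # 120 (4, 7) (8, 9)
```

The result agrees with the region 120 ([4:7], [8:9]) reached in the text.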

In this way, the method using a Decision Tree reduces the number of rules to be searched by dividing the space along a plurality of dimensions and searching only the small number of rules managed in the divided region that is reached.

In the method using the Decision Tree, when a region is divided, some rules end up being managed in more than one of the divided regions. Hereinafter, this is referred to as rule duplication. For example, in FIG. 3, rules such as R3 and R4 are managed in a plurality of regions. The more rules are duplicated in this way, the more address values pointing to the duplicated rules, or copies of the rules themselves, must be managed; the tree effectively handles more rules than the actual rule set contains, and the amount of data in the Decision Tree increases. To prevent this, Non-Patent Document 1 and Non-Patent Document 2 propose a method in which a node that is not a leaf node also has a rule list: a rule that would be duplicated among the node's child nodes (regions) is managed in the node's own rule list, so that the subsequent child nodes do not manage it, which reduces the number of duplicated rules.
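The extent of rule duplication after the first division can be counted directly from the four rule lists given above for regions 0 to 3. The following is only an illustrative tally; the function name duplication_stats is hypothetical.

```python
from collections import Counter

# Rule lists of regions 0 to 3 after the first division, as given in the text.
region_rule_lists = {
    0: ["R5", "R6", "R7", "R9"],
    1: ["R0", "R3", "R5", "R6", "R11"],
    2: ["R1", "R2", "R4", "R10"],
    3: ["R3", "R4", "R8"],
}

def duplication_stats(rule_lists):
    """Return (stored entries, distinct rules, rules held in several regions)."""
    counts = Counter(rule for rules in rule_lists.values() for rule in rules)
    stored = sum(counts.values())
    distinct = len(counts)
    duplicated = sorted((r for r, c in counts.items() if c > 1),
                        key=lambda r: int(r[1:]))
    return stored, distinct, duplicated

print(duplication_stats(region_rule_lists))
# (16, 12, ['R3', 'R4', 'R5', 'R6'])
```

After a single division, 16 rule entries are already stored for 12 distinct rules, which is the overhead that holding rule lists at non-leaf nodes is meant to reduce.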

Non-Patent Document 3 proposes a multi-field packet classification method called Parallel Bit Vector (hereinafter referred to as “Parallel BV”). Parallel BV focuses on each field constituting a rule and performs packet classification by preparing a bit array (called a Bit Vector (BV)) for each section into which the field is divided by the rules. Parallel BV is briefly described below with reference to FIG. 4.

FIG. 4 is a diagram showing an example of the BVs for the rule set of R0 to R11 described above. In each BV, bit positions and the rules included in the rule set are associated one-to-one, and each bit is set to “1” if the values in the section match the associated rule and to “0” otherwise. In FIG. 4, R11, R10, ..., R0 are assigned from the upper bits of each BV. Such BVs are prepared for all the fields constituting the rules.

When packet classification is performed, a BV is selected for each field according to the value of that field, and a bitwise AND (logical product) of the selected BVs is taken. The packet is determined to match the rules corresponding to the bit positions set to “1” in the resulting BV. For example, consider a case where packet classification is performed on a packet having X = 0111 and Y = 1001. Since the value of the field X is 0111, that is, 7 in decimal notation, “100011001000” is selected as its BV (from FIG. 4, the packet may match R3, R6, R7, and R11). Similarly, since the value of the field Y is 1001, that is, 9 in decimal notation, “000001111000” is selected as its BV (from FIG. 4, the packet may match R3, R4, R5, and R6). Taking the bitwise AND of the BVs obtained for the fields X and Y gives “000001001000”. As a result, it can be seen that the packet matches both R3 and R6. Finally, the matching rule is selected according to the priority assigned to each rule (priorities are omitted from the rule set figure).
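The AND step of this example can be checked with a short sketch. The two bit strings are the BVs quoted in the text; the function name matching_rules is hypothetical, and the least significant bit is taken to correspond to R0, following the assignment of R11, R10, ..., R0 from the upper bits.

```python
# Illustrative sketch of the bitwise AND step in Parallel BV.
# Bit i (counting from the least significant bit) corresponds to rule Ri.

def matching_rules(bit_vectors):
    """AND the per-field BVs and list the indices of the rules left set."""
    n = len(bit_vectors[0])
    result = int(bit_vectors[0], 2)
    for bv in bit_vectors[1:]:
        result &= int(bv, 2)
    rules = [i for i in range(n) if (result >> i) & 1]
    return format(result, f"0{n}b"), rules

bv_x = "100011001000"  # BV selected for X = 0111 (R11, R7, R6, R3)
bv_y = "000001111000"  # BV selected for Y = 1001 (R6, R5, R4, R3)
print(matching_rules([bv_x, bv_y]))  # ('000001001000', [3, 6])
```

The rules that remain, R3 and R6, would then be compared by priority to select the final match.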

In this way, Parallel BV selects the rules that may match for each field constituting the rules, and then combines the results for all fields to narrow down the matching rules.

Methods for performing high-speed processing of the above HyperCuts and Parallel BV using hardware architectures have also been proposed.

For example, Non-Patent Document 2 proposes a hardware architecture that uses a pipeline to process a Decision Tree based method such as HyperCuts. As described above, Non-Patent Document 2 reduces the number of duplicated rules by giving a rule list to nodes that are not leaf nodes. FIG. 5 is a diagram showing an example of a decision tree constructed for the same rule set using the method of Non-Patent Document 2. Since the method of building the Decision Tree is described in Non-Patent Document 2, details are omitted; however, as a comparison of FIG. 3 and FIG. 5 shows, nodes that are not leaf nodes also have rule lists. As a result, the number of duplicated rules is reduced, and the height of the Decision Tree is kept low.

In the hardware architecture of Non-Patent Document 2, packet classification is performed using two kinds of pipelines in parallel: a Tree Pipeline for traversing the Decision Tree and Rule Pipelines for searching all the rules included in the rule list at each node. There is only one Tree Pipeline; it basically has as many pipeline stages as the depth (height) of the Decision Tree, and each stage advances one node deeper. When a node is reached, an address value pointing to one of the rules included in its rule list is designated, and matching against the rules, one per stage of a Rule Pipeline, is started. For this reason, the number of stages in a Rule Pipeline is equal to the number of rules included in a rule list, that is, the threshold, and the number of Rule Pipelines is one more than the number of stages in the Tree Pipeline. Since the detailed architecture is described in Non-Patent Document 2, it is omitted here.

However, with the technique of Non-Patent Document 2, the larger the sum of the header field lengths constituting one rule, the larger the required capacity of memory that can be accessed at high speed, for example SRAM. In addition, since the amount of data read from the SRAM to process one packet increases, the dynamic power of the memory increases, resulting in an increase in overall power consumption.

On the other hand, Non-Patent Document 4 discloses an algorithm that extends Parallel BV and a hardware architecture that processes the algorithm with a pipeline. In the Parallel BV of Non-Patent Document 3, the memory capacity necessary for managing the BVs usually grows as O(N²) with respect to the number of rules N. In Non-Patent Document 4, each field is instead divided into a plurality of subfields of a small number of bits, and BVs are prepared for all possible values of each subfield. For example, when a field is divided into 1-bit subfields, one BV is prepared for each value (0 or 1) that the subfield can take. As a result, the number of BVs to be combined by bitwise AND increases, but the growth of the required memory capacity is held to linear. In addition, since Parallel BV does not need to perform matching using the rules themselves, it is possible to store the rules in, for example, large-capacity, low-speed DRAM and hold only the BVs in high-speed SRAM.
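The 1-bit subfield case can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of Non-Patent Document 4 itself: the four rules are invented for the example, rules are written as ternary strings over one 4-bit field, and the function names are hypothetical.

```python
# Illustrative sketch of per-bit (1-bit subfield) Bit Vectors for one field.
# The rule patterns below are invented for this example only.

def build_subfield_bvs(rules, width):
    """For each bit position and each value 0/1, build an N-bit BV (as an int)."""
    bvs = [[0, 0] for _ in range(width)]
    for r, pattern in enumerate(rules):
        for i, ch in enumerate(pattern):
            for v in (0, 1):
                if ch == "*" or int(ch) == v:
                    bvs[i][v] |= 1 << r   # rule r may match when bit i == v
    return bvs

def lookup(bvs, key, n_rules):
    """AND together the BV chosen by each key bit; return matching rule indices."""
    result = (1 << n_rules) - 1
    for i, ch in enumerate(key):
        result &= bvs[i][int(ch)]
    return [r for r in range(n_rules) if (result >> r) & 1]

rules = ["0***", "01**", "1001", "****"]   # hypothetical rules 0 to 3
bvs = build_subfield_bvs(rules, width=4)
print(lookup(bvs, "0110", len(rules)))  # [0, 1, 3]
print(lookup(bvs, "1001", len(rules)))  # [2, 3]
```

For a field of W bits this keeps 2W vectors of N bits each, so the storage grows linearly in N, at the cost of W AND operands per lookup instead of one BV per field.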

However, in Parallel BV, since the length of one BV is proportional to the number of rules N, the larger the number of rules, the more clock cycles are consumed in reading a BV from the memory. Moreover, N-bit BVs must be read for each of the fields constituting the rules, so there is a problem that the dynamic power increases, resulting in an increase in power consumption.

Furthermore, the hardware architectures proposed in Non-Patent Document 2 and Non-Patent Document 4 assume the above-described 5-tuple as the packet header information used in rules, so changing the packet header information used in rules requires changing the hardware circuit.

Further, Non-Patent Document 5 discloses a method that aims to compensate for the disadvantages of both by combining HyperCuts and Parallel BV. In this method, only as many rules as keep the BV read from the memory at a bit length realistic for high-speed processing are processed with Parallel BV, and the rest are processed with HyperCuts. In particular, the memory capacity required for HyperCuts can be reduced by processing with Parallel BV the rules that would require more duplication under HyperCuts.

However, this method merely combines HyperCuts and Parallel BV and does not fundamentally solve the problems of each described above.

The hardware-based multi-field packet classification methods described above, which use memory such as SRAM, have the following problems.

The first problem is that the larger the sum of the header field lengths that make up one rule, and the greater the number of rules, the more the dynamic power of the memory increases, and as a result the power consumption of the hardware as a whole also increases.

The reason is that, in an algorithm using a Decision Tree, searching all the rules included in a rule list requires the rules themselves to be read from the memory and compared.

The second problem is that the greater the number of rules, the greater the number of clock cycles required to read data from the memory.

The reason is that, in Parallel BV, one rule is associated with one bit of a BV, so each BV needs as many bits as there are rules.

Finally, the third problem is that the packet header information used in rules cannot be freely changed.

The reason is that the hardware circuit is built assuming specific packet header information for use in rules, for example the 5-tuple, so changing it requires changing the hardware circuit. Note that changing the packet header information here does not mean changing the packet header information used by each individual rule. It means that the set of packet header information available to the rules, which is determined in advance, can be changed freely without changing the hardware circuit.

An object of the present invention is to provide a packet classifier, a packet classification method, and a packet classification program that can solve any of the problems described above.

The packet classifier of the present invention is
a packet classifier that searches a rule set composed of a large number of rules defined using a plurality of fields for a rule that matches a search key to be searched, using a plurality of types of bit arrays each having a predetermined short length, wherein the packet classifier:
uses a decision tree to narrow down the rules that may match from the large number of rules to a predetermined number;
uses, for each predetermined piece of data in the search key, bit arrays having the same length as the number of rules narrowed down by the decision tree, together with a rule identifier list that lists the rule identifiers indicated by the bit positions of these bit arrays, to identify the matching rules among the narrowed-down rules; and
determines the final matching rule according to the priorities of the identified rules.

The packet classification method of the present invention is
a packet classification method performed by a packet classifier that searches a rule set composed of a large number of rules defined using a plurality of fields for a rule that matches a search key to be searched, the method comprising:
using a decision tree to narrow down the rules that may match from the large number of rules to a predetermined number;
using, for each predetermined piece of data in the search key, bit arrays having the same length as the number of rules narrowed down by the decision tree, together with a rule identifier list that lists the rule identifiers indicated by the bit positions of these bit arrays, to identify the matching rules among the narrowed-down rules; and
determining the final matching rule according to the priorities of the identified rules.

The packet classification program of the present invention
causes a computer, which searches a rule set composed of a large number of rules defined using a plurality of fields for a rule that matches a search key to be searched, to execute:
a process of using a decision tree to narrow down the rules that may match from the large number of rules to a predetermined number;
a process of using, for each predetermined piece of data in the search key, bit arrays having the same length as the number of rules narrowed down by the decision tree, together with a rule identifier list that lists the rule identifiers indicated by the bit positions of these bit arrays, to identify the matching rules among the narrowed-down rules; and
a process of determining the final matching rule according to the priorities of the identified rules.

According to the present invention, combining a decision tree and bit arrays makes it possible to reduce the amount of data read from the memory in processing each packet. Even when the sum of the header field lengths constituting one rule increases or the number of rules increases, an increase in the dynamic power of the memory can be suppressed, and as a result an increase in the power consumption of the hardware as a whole can be prevented.

Also, because the decision tree narrows down the rules that may match even when the number of rules is large, the bit length of the bit arrays can be kept short, which suppresses an increase in the number of clock cycles necessary for reading data from the memory.

The drawings, in order, are as follows.
A diagram showing an example of a rule set.
A diagram representing the rule set of FIG. 1 in a two-dimensional space.
A diagram showing an example of a Decision Tree for the rule set.
A diagram showing an example of Bit Vectors for the rule set.
A diagram showing an example of a Decision Tree constructed for the rule set using the method of Non-Patent Document 2.
A diagram showing an example of the Decision Tree used in the basic packet classification method of the present invention.
A diagram showing an example of the area division information in each node of the Decision Tree used in the basic packet classification method of the present invention.
A diagram showing an example of the Bit Vector information in each node of the Decision Tree used in the basic packet classification method of the present invention.
A flowchart showing the basic operation of the basic packet classification method of the present invention.
A diagram showing an example of the rule set used in the basic packet classification method of the present invention.
A diagram representing the rule set shown in FIG. 10 in a two-dimensional space.
A diagram showing an example of a Decision Tree constructed for the rule set by the basic packet classification method of the present invention.
A diagram showing the rule list of node 0 in FIG. 12 in a two-dimensional space, together with the Bit Vectors for the effective bits of the fields X and Y.
A diagram showing the rule list of node 4 in FIG. 12 in a two-dimensional space, together with the Bit Vectors for the effective bits of the fields X and Y.
A diagram showing the rule list of node 8 in FIG. 12 in a two-dimensional space, together with the Bit Vectors for the effective bits of the fields X and Y.
A diagram showing the rule list of node 8 in FIG. 12 in a two-dimensional space, together with the Bit Vectors for the effective bits of the fields X and Y.
A block diagram showing the first embodiment of the present invention.
A diagram showing an example of the mapping of each node of the Decision Tree to the Tree Pipeline Stages in the first embodiment of the present invention.
A block diagram showing the structure of a Tree Pipeline Stage in the first embodiment of the present invention.
A block diagram showing the structure of the area division circuit in the first embodiment of the present invention.
A block diagram showing the structure of the field division circuit in the first embodiment of the present invention.
A diagram showing an example of a Virtual Node in the first embodiment of the present invention.
A diagram showing an example of the arrangement of the area division information of a Virtual Node in the first embodiment of the present invention.
A block diagram showing the structure of a Priority Pipeline Stage in the first embodiment of the present invention.
A block diagram showing the structure of the Bit Vector selection circuit in the first embodiment of the present invention.
A flowchart showing the operation during Parallel Bit Vector processing (step A2) in the first embodiment of the present invention.
A flowchart showing the operation during area division (step A6) in the first embodiment of the present invention.
A flowchart showing the operation during Parallel Bit Vector processing (step A2) in the first embodiment of the present invention.
A diagram showing an example of the mapping of a plurality of Decision Tree nodes to the Tree Pipeline Stages in the first embodiment of the present invention.
A block diagram showing the structure of a Tree Pipeline Stage in the second embodiment of the present invention.
A diagram showing an example of the area division information in each node of the Decision Tree in the second embodiment of the present invention.
A block diagram showing the structure of the area division circuit in the second embodiment of the present invention.
A flowchart showing the operation during area division (step A6) in the second embodiment of the present invention.
A block diagram showing the third embodiment of the present invention.
A block diagram showing the structure of the Decision Tree processing circuit in the third embodiment of the present invention.
A block diagram showing the structure of the Bit Vector processing circuit in the third embodiment of the present invention.
A diagram showing a structural example of the rule ID list in the third embodiment of the present invention.
A flowchart showing the operation of packet classification in the third embodiment of the present invention.
A flowchart showing the operation during Parallel Bit Vector processing (step A2) in the third embodiment of the present invention.
A block diagram showing the fourth embodiment of the present invention.

[Summary of Invention]
Before describing the packet classifier and packet classification method of the present invention, first, an outline of a basic packet classification method of the present invention will be described.

FIG. 6 is a diagram showing an example of the Decision Tree used in the basic packet classification method according to the present invention. In the example of FIG. 6, node 0 is the root node of the decision tree, and each node that is not a leaf node is divided into two or four regions. Each leaf node manages a rule list (indicated by a solid line in FIG. 6), which is the group of at most L rules (L being the threshold) managed in the corresponding divided region. Nodes that are not leaf nodes also hold a rule list (indicated by a dotted line in FIG. 6) for reducing rule duplication, as proposed in Non-Patent Document 1 and Non-Patent Document 2. Such a Decision Tree is constructed using the methods proposed in Non-Patent Document 1 and Non-Patent Document 2, and a detailed description is omitted here.

FIG. 7 is a diagram showing the area division information in each node of the Decision Tree used in the basic packet classification method according to the present invention. The area division information consists of: a “Leaf Flag” indicating whether or not the node is a leaf node of the Decision Tree; the number of divisions (Num. of Cutting) for each of the C fields used to divide the region at each node of the Decision Tree; the Base Address at which the area division information for the child nodes of the node is stored; and a “Virtual Flag” indicating that the node is not a real node but a virtual node. Here, the number of divisions is specified in the same manner as in Non-Patent Document 2: when k is specified as the number of divisions, the field is divided into 2^k parts. The specific use of the other information will be described later. Note that this area division information is stored in the memory, and when the traversal of the Decision Tree arrives at a node, the corresponding area division information is read from the memory.

FIG. 8 is a diagram showing the Bit Vector (BV) information in each node of the Decision Tree used in the basic packet classification method according to the present invention. In this method, each node of the Decision Tree manages one rule ID list in addition to its BVs. The rule ID list corresponds to the rule list of the node and gives the rule IDs of the at most L rules managed in the divided region of the node, together with their priorities (the priorities are not shown in FIG. 8 but are assumed to be managed together with the rule IDs). Each BV is L bits long, and as shown in FIG. 8, its bit positions correspond one-to-one to the entries of the rule ID list. That is, if the bits of a BV are written BV[L-1], BV[L-2], BV[L-3], ..., BV[0] from the upper bit, the bit BV[i] (i = L-1, L-2, ..., 0) corresponds to the rule whose ID is stored in Rule ID #i of the rule ID list. The meaning of each bit of a BV is the same as in the existing Parallel BV, so its description is omitted. Although FIG. 8 shows only one BV, in practice a BV is prepared for each section of each field constituting the rules, as described in Non-Patent Document 3. Alternatively, each field may be divided into subfields, with BVs prepared for all possible values of each subfield, as in the subfield-based extension of Parallel BV described earlier. Details will be given in the embodiments described later. In this method, however, an effective bit length is managed for each field of the search key, and BVs are managed only for the number of bits indicated by the effective bit length. The rule ID list and the BVs are assumed to be held in the memory.
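The correspondence between BV bit positions and the rule ID list can be sketched as below. The list contents, the function name select_rule, and the convention that a smaller priority number means a higher priority are all assumptions made for this illustration; the source does not specify how priorities are encoded.

```python
# Illustrative sketch: map the bits of an L-bit BV result back to rule IDs.
# Entry i of rule_id_list plays the role of "Rule ID #i" and pairs with BV[i].
# Assumption: a smaller priority number means a higher priority.

def select_rule(bv_result, rule_id_list):
    """Return the (rule_id, priority) of the best rule whose BV bit is set."""
    best = None
    for i, (rule_id, priority) in enumerate(rule_id_list):
        bit_is_set = (bv_result >> i) & 1
        if bit_is_set and (best is None or priority < best[1]):
            best = (rule_id, priority)
    return best

# Hypothetical node with L = 4: Rule ID #0 .. Rule ID #3 with priorities.
rule_id_list = [("R3", 2), ("R6", 1), ("R11", 5), ("R0", 3)]

print(select_rule(0b0011, rule_id_list))  # ('R6', 1)
print(select_rule(0b0000, rule_id_list))  # None
```

Because each node holds at most L entries, the BV read per field at a node is only L bits long regardless of the total number of rules in the rule set.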

FIG. 9 is a flowchart showing the basic operation of the basic packet classification method in the present invention.

In the basic packet classification method of the present invention, when packet header information serving as a search key is input, processing starts from the root node of the Decision Tree (step A1). At the processing node, a predetermined Parallel BV process is executed based on the managed rule ID list and the search key (step A2). The Parallel BV processing here is based on the methods described in Non-Patent Document 3, Non-Patent Document 4, and the like. Next, the optimal rule is selected from among the rules matched at the processing node and the optimal rule obtained at the processing nodes preceding it (step A3). Subsequently, the area division information of the processing node is read from the memory (step A4). Although details will be described later, the memory address of the area division information of a node is determined during processing at its parent node and notified to the child node, which then becomes the processing node and reads the area division information stored at that address value. The leaf flag of the read area division information is checked to determine whether the node is a leaf node of the Decision Tree (step A5). If the node is not a leaf node (No in step A5), the memory address value at which the area division information of the next child node is stored is determined from the number of divisions for the fields specified in the area division information of the processing node (this is called selecting the next child node), and that child node is set as the processing node (step A6). The details of the method for selecting the next child node will be described later; in outline, for each divided field specified by the area division information, the first k bits of the effective bits of the corresponding field value of the search key are extracted. These bits are concatenated and the result is added to the Base Address of the area division information, which determines the memory address value at which the area division information of the next child node is stored, that is, which child node is selected. Here, the variable k is the value specified by the number of divisions in the area division information. Next, the effective bit length is updated by subtracting the k specified by the area division information from the effective bit length of each field (step A7), and the process returns to step A2 and is repeated. The effective bit length is managed as an internal variable, is initialized with the field length of each field, and is passed from the root node to the following nodes of the Decision Tree. On the other hand, when the processing node is a leaf node of the Decision Tree (Yes in step A5), the processing is terminated and the optimal rule selected so far is taken as the final solution.
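Steps A1 to A7 can be sketched in software as follows. This is a minimal sketch under stated assumptions, not the circuit of the embodiment: the Parallel BV process of step A2 is abstracted as a callback that returns the rules matched at a node, a smaller priority value is assumed to mean a higher priority, the first field in the divisions dictionary is assumed to supply the upper bits of the concatenation, and all names (AreaDivisionInfo, classify, the toy rules "Ra" and "Rb") are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Rule = Tuple[str, int]  # (rule ID, priority); smaller value = higher priority (assumption)


@dataclass
class AreaDivisionInfo:
    leaf_flag: bool            # Leaf Flag
    base_address: int          # Base Address of the child nodes
    divisions: Dict[str, int]  # field name -> number of division bits k


def leading_bits(value: int, eff_len: int, k: int) -> int:
    """First k bits of the eff_len effective (low-order) bits of value."""
    effective = value & ((1 << eff_len) - 1)
    return effective >> (eff_len - k)


def classify(key: Dict[str, int], field_len: Dict[str, int],
             memory: Dict[int, AreaDivisionInfo], root_addr: int,
             parallel_bv: Callable[[int, Dict[str, int], Dict[str, int]], List[Rule]]
             ) -> Optional[Rule]:
    eff = dict(field_len)          # effective bit length starts at the field length
    addr, best = root_addr, None   # step A1
    while True:
        for rule in parallel_bv(addr, key, eff):        # step A2
            if best is None or rule[1] < best[1]:       # step A3
                best = rule
        info = memory[addr]                             # step A4
        if info.leaf_flag:                              # step A5
            return best
        offset = 0
        for name, k in info.divisions.items():          # step A6
            offset = (offset << k) | leading_bits(key[name], eff[name], k)
        addr = info.base_address + offset
        for name, k in info.divisions.items():          # step A7
            eff[name] -= k


# Toy two-level tree: root at address 0, four leaf children at addresses 1..4.
memory = {0: AreaDivisionInfo(False, 1, {"X": 1, "Y": 1})}
memory.update({a: AreaDivisionInfo(True, 0, {}) for a in range(1, 5)})
matches = {0: [("Ra", 5)], 4: [("Rb", 2)]}  # hypothetical per-node matches
result = classify({"X": 0b1001, "Y": 0b1111}, {"X": 4, "Y": 4}, memory, 0,
                  lambda addr, key, eff: matches.get(addr, []))
print(result)
```

The optimal rule is carried along the path rather than decided at the leaf, which is why a rule held at an inner node can still be the final answer.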

Next, packet classification using the basic packet classification method of the present invention will be described for a packet having X = 1001 and Y = 1111, using the rule set including the fields X and Y shown in FIG. 10. In the following example, the Parallel BV process in step A2 of FIG. 9 is based on the technique described in Non-Patent Document 2. FIG. 10 is a diagram illustrating an example of a rule set including 20 rules R0 to R19 defined using two fields X and Y of 4 bits each, as in the rule set of FIG. . Although the fields X and Y are each 4 bits, they correspond to actual packet header fields such as a source IP address and a source port number. The notation of the fields X and Y and the handling of priorities and actions are the same as in FIG. 1, so their description is omitted here. FIG. 11 is a diagram showing the rule set of FIG. 10 in the two-dimensional space of fields X and Y. FIG. 12 is an example of a Decision Tree constructed by a packet classification method for the rule set shown in FIG. However, the threshold L (the number of rules that can be included in the rule ID list) in the Decision Tree in FIG.

In the basic packet classification method of the present invention, node 0, which is the root node of the Decision Tree, is first set as the processing node and processing starts (step A1). Since the rule list of node 0 includes R7 and R8, the rule ID list is (R7, R8), and the BVs for fields X and Y are "10" and "01", respectively. FIG. 13a shows the rule list of node 0 in the two-dimensional space together with the BVs for fields X and Y; from FIG. 13a it can be seen that the BVs for fields X and Y at node 0 are "10" and "01", respectively. Next, the bitwise AND of the acquired BVs is taken to obtain the final BV "00" at processing node 0 (step A2). From this result, it can be determined that no matching rule exists and that there is no optimal solution up to this processing node (step A3). Next, the area division information of node 0 is read (step A4). Detailed area division information is not shown in this example, but the number of divisions for fields X and Y used for area division is shown in each node of FIG. It is assumed that the Leaf Flag of nodes 5 to 7 and nodes 9 to 14 is 1, that is, they are leaf nodes, and that the Leaf Flag of all other nodes is 0, that is, they are not leaf nodes. It is further assumed that the Virtual Flags are all 0, that is, all nodes are real nodes. Since the area division information of node 0 shows that node 0 is not a leaf node of the Decision Tree (No in step A5), area division is performed and the next child node is selected as the processing node (step A6). In this case, 1 is designated as the number of divisions k for fields X and Y and the effective bit length of each field is 4, so the leading bits of the effective bits of fields X and Y of the search key are concatenated to give "11". Adding the Base Address to "11" determines the memory address value at which the area division information of node 4, the next processing node, is stored. Finally, the effective bit length is updated by reducing it by 1, the number of divisions (step A7).

Next, since the rule list of node 4, the next processing node, includes R13, the rule ID list is (R13, NULL), and the BVs for "001" and "111", the effective bits of fields X and Y, are "10" and "00", respectively. FIG. 13b shows the rule list of node 4 in the two-dimensional space together with the BVs for fields X and Y. Since the effective bit length of fields X and Y is 3 at node 4, BVs are prepared for the lower 3 bits, which are the effective bits, and the regions not covered by the effective bits are shown filled in. From FIG. 13b, it can be seen that the BVs for fields X and Y at node 4 are "10" and "00", respectively. The bitwise AND of the obtained BVs is taken to obtain the final BV "00" at processing node 4 (step A2). As a result, it can be seen that there is no matching rule in the processing up to node 4 (step A3). Next, the area division information of node 4 is read (step A4). Since node 4 is not a leaf node of the Decision Tree (No in step A5), the area is divided and the next processing node is selected (step A6). In this case, 1 is designated as the number of divisions for fields X and Y and the effective bit length of each field is 3, so the leading bits of "001" and "111", the effective bits of fields X and Y of the search key, are concatenated to give "01". Adding the Base Address to "01" determines the memory address value at which the area division information of node 8, the next processing node, is stored. Finally, the effective bit length is decreased by 1 (step A7).

Similarly, since the rule list of node 8 includes R17 and R18, the rule ID list is (R17, R18), and the BVs for the effective bits "01" and "11" of fields X and Y are "11" and "11", respectively. FIG. 13c shows the rule list of node 8 in the two-dimensional space together with the BVs for fields X and Y. Since the effective bit length of fields X and Y is 2 at node 8, BVs are prepared for the lower 2 bits, which are the effective bits, and the regions not covered by the effective bits are shown filled in. The bitwise AND of the obtained BVs is taken to obtain the final BV "11" at processing node 8 (step A2). In this case, the matching rules at node 8 are R17 and R18, and there was no matching rule at the previous processing nodes (node 0 and node 4), so the priorities of R17 and R18 are checked and the rule with the higher priority becomes the optimal solution (although omitted from the rule set of FIG. 10, an individual priority is assumed to be set for each rule in the rule set). Next, the area division information of node 8 is read (step A4). Since node 8 is not a leaf node of the Decision Tree (No in step A5), the area is divided and the next processing node is selected (step A6). In this case, 1 is designated as the number of divisions for fields X and Y and the effective bit length of each field is 2, so the leading bits of "01" and "11", the effective bits of fields X and Y of the search key, are concatenated to give "01". Adding the Base Address to "01" determines the memory address value of the area division information of node 12, the next processing node. Finally, the effective bit length is decreased by 1 (step A7).

Similarly, since the rule list of node 12 includes R3, the rule ID list is (R3, NULL), and the BVs for the effective bits "1" and "1" of fields X and Y are "00" and "00", respectively. FIG. 13d shows the rule list of node 12 in the two-dimensional space together with the BVs for fields X and Y. Since the effective bit length of fields X and Y is 1 at node 12, BVs are prepared for the lowest bit, which is the effective bit, and the regions not covered by the effective bits are shown filled in. The bitwise AND of the obtained BVs is taken to obtain the final BV "00" at processing node 12 (step A2). As a result, there is no matching rule at node 12, and it can be determined that R17 or R18, the optimal rule at the processing nodes so far, remains the optimal rule (step A3). Next, the area division information of node 12 is read (step A4). Since node 12 is found to be a leaf node of the Decision Tree (Yes in step A5), the processing is terminated and the current optimal rule, R17 or R18, is taken as the final solution.
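The walkthrough for X = 1001 and Y = 1111 can be reproduced with a short sketch. The per-node BVs are the values quoted in the text; the helper name and the string representation of bits are assumptions made for illustration. No offset is computed for node 12 because it is a leaf node.

```python
X, Y = "1001", "1111"   # search key fields from the example

# BVs for fields X and Y at each visited node, as quoted in the text.
node_bvs = {0: ("10", "01"), 4: ("10", "00"), 8: ("11", "11"), 12: ("00", "00")}
leaf_nodes = {12}


def leading(bits: str, eff_len: int, k: int) -> str:
    """First k bits of the last eff_len (effective) bits of a bit string."""
    return bits[len(bits) - eff_len:][:k]


eff_len, k = 4, 1       # initial effective bit length and number of divisions
trace = []
for node in (0, 4, 8, 12):
    bv_x, bv_y = node_bvs[node]
    final_bv = format(int(bv_x, 2) & int(bv_y, 2), "02b")   # step A2
    if node in leaf_nodes:
        offset = None                                        # step A5: leaf, stop
    else:
        offset = leading(X, eff_len, k) + leading(Y, eff_len, k)  # step A6
    trace.append((node, final_bv, offset))
    eff_len -= k                                             # step A7

for row in trace:
    print(row)
```

The offsets "11", "01", and "01" are the values that are added to the Base Address at nodes 0, 4, and 8, and the only non-zero final BV is the "11" obtained at node 8.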

In the example of the rule set shown in FIG. 10, rules are defined using only the two fields X and Y. However, in the basic packet classification method according to the present invention, rules can be defined using more fields. In this case, as described above, the data structure shown in FIG. 7 is used to specify the fields to be divided and the number of divisions at each node of the Decision Tree.
[Embodiment of the Invention]
Based on the basic packet classification method in the present invention described above, an embodiment of the present invention and its operation will be described below.

(1) First Embodiment
(1-1) Configuration of First Embodiment
First, a first embodiment of the present invention will be described with reference to the drawings.

FIG. 14 is a block diagram illustrating the packet classifier according to the first embodiment of this invention. Referring to FIG. 14, the packet classifier 1 according to the first embodiment of the present invention is realized as a hardware circuit. A search key is input as input 2, and the rule ID of the optimal rule is output as output 3.

The packet classifier 1 includes a tree pipeline (tree pipeline processing circuit) 10 and a priority pipeline (priority pipeline processing circuit) 20.

The Tree Pipeline 10 has a pipeline structure and executes the process of tracing the Decision Tree when performing the packet classification of the present invention. The Tree Pipeline 10 is an H-stage pipeline consisting of Tree Pipeline Stage (tree pipeline processing unit) 10-1, Tree Pipeline Stage 10-2, Tree Pipeline Stage 10-3, ..., Tree Pipeline Stage 10-H. Here, H is equal to the height (depth) of the Decision Tree, and in this embodiment a Decision Tree having a height of H or less is constructed and used.

FIG. 15 is a diagram illustrating an example of mapping each node of the Decision Tree to the Tree Pipeline Stages according to the first embodiment of this invention. For example, in a Decision Tree of height 4 composed of 15 nodes, node 0 to node 14, nodes of the same depth (referred to as a level) are basically placed on the same stage of the Tree Pipeline, as in the Tree Pipeline of Non-Patent Document 2. However, a node at a certain level may be placed on a stage after that level. For example, node 10 has a depth of 3, but is placed not on Tree Pipeline Stage #3 but on Tree Pipeline Stage #4, the following stage. Placing a node means placing the area division information of the node as shown in FIG. 7, that is, holding the area division information on a storage medium such as a memory provided in each Tree Pipeline Stage. This placement method provides the flexibility that the number of words in the memory storing the area division information in a Tree Pipeline Stage need not correspond one-to-one to the number of rule lists that can be realized in the Priority Pipeline Stage (priority pipeline processing unit) described later, that is, the number of words in the memory storing the BVs. Details are described in Non-Patent Document 2, and the description is therefore omitted here.

FIG. 16 is a block diagram showing the configuration of the Tree Pipeline Stage in the first embodiment of the present invention. Referring to FIG. 16, each of the Tree Pipeline Stages 10-1 to 10-H according to the present invention includes: an area division circuit 100 that performs the area division processing for the node processed at the stage and, as a result, determines the address value in the next stage at which the area division information of the next node is stored; a Memory Controller 101 that reads the area division information of the processing node at the address value designated by the previous stage; an area division information storage block 102, formed of a storage medium such as a memory, that stores the area division information of the Decision Tree nodes placed on the stage; an effective bit length update circuit 103 that updates the effective bit length of the search key; and a search key delay circuit 104 that applies a fixed delay to the search key input to the stage so as to synchronize it with the other outputs.

Each Tree Pipeline Stage receives from the preceding Tree Pipeline Stage the search key to be searched, its effective bit length, and the address value in the area division information storage block 102 at which the area division information of the node to be processed in that Tree Pipeline Stage is held. The search key is input to the area division circuit 100 and the search key delay circuit 104. Here, the search key is configured as a bit string including all the packet header field information contained in the rules handled by this packet classifier, and information such as the bit length of each field is held on the circuit in advance, so that the packet classifier can uniquely refer to or cut out each field. The effective bit length is input to the area division circuit 100 and the effective bit length update circuit 103. Here, the effective bit length is a bit string representing the effective bit length of each packet header field that is included in the search key and is designated in advance for use in area division by this packet classifier, as indicated in the area division information. Finally, the address value is input to the Memory Controller 101. In Tree Pipeline Stage 10-1, the search key described above is given as input 2 to the packet classifier 1. The effective bit length is specified as the length of each header field itself, that is, all bits are effective. The address value is the address value in the area division information storage block 102 at which the area division information of the root node of the Decision Tree is stored. The effective bit length and the address value may be given as input 2 or may be specified within the packet classifier 1.

The Memory Controller 101, to which the address value has been input, reads from the area division information storage block 102 the area division information shown in FIG. 7 stored at the designated address value, and outputs it to the area division circuit 100 and the effective bit length update circuit 103.

The effective bit length update circuit 103 updates the effective bit length using the input effective bit length and the area division information. Specifically, the area division information designates the number of divisions k for each header field used for area division by this packet classifier; the effective bit length update circuit 103 subtracts the number of divisions k from the input effective bit length and outputs the updated effective bit length to the subsequent Tree Pipeline Stage.

The search key delay circuit 104 delays the input search key by a predetermined interval using a register or the like, so that the search key is output at the same timing as the effective bit length output from the effective bit length update circuit 103 and the address value output from the area division circuit 100 described later.

FIG. 17 is a block diagram showing the configuration of the area division circuit 100 according to the first embodiment of the present invention. Referring to FIG. 17, the area division circuit 100 includes an area division information separation circuit 1000 that extracts the number of divisions for each header field used for area division from the input area division information, a multiplexer 1001, field division circuits 100-1, 100-2, ..., 100-C, an OR gate 1002, and an adder 1003.

The area division information input to the area division circuit 100 is input to the area division information separation circuit 1000. The area division information is composed of the information shown in FIG. 7, and the area division information separation circuit 1000 cuts out each piece of this information and outputs it to the circuit that uses it. For example, the Virtual Flag is output to the multiplexer 1001, the numbers of divisions for the C fields are output to the field division circuits 100-1 to 100-C, and the Base Address is output to the adder 1003. The search key and the effective bit length input to the area division circuit 100 are input, field by field, to the field division circuit that processes the corresponding field. Here, the search key in which a plurality of fields are bundled, or the effective bit length data itself, may be input to each field division circuit, which then cuts out the data of the field it is in charge of. Alternatively, the data of each field may be cut out beforehand and input to each field division circuit.

The OR gate 1002 takes a logical sum (OR) of the output results of each field dividing circuit described later and inputs the result to the adder 1003.

The adder 1003 adds the Base Address included in the area division information and the output result of the OR gate 1002, and outputs the result as the address value of the area division information of the child node processed in the subsequent Tree Pipeline Stage.

FIG. 18 is a block diagram showing the configuration of the field dividing circuit in the first embodiment of the present invention. Referring to FIG. 18, the field division circuit includes a subtracter 1004, a right shifter 1005, an adder 1006, and a left shifter 1007.

The effective bit length input to the field division circuit and the number of divisions from the area division information are input to the subtractor 1004, which subtracts the number of divisions from the effective bit length. This result is output to the Right Shifter 1005, and the Right Shifter 1005 shifts the field of the input search key to the right by the value of the result of the subtractor 1004. Meanwhile, the value input from the lower field division circuit and the number of divisions are added by the adder 1006, and the result is output to the Left Shifter 1007 and to the upper field division circuit. In the field division circuit 100-1, the value input to the adder 1006 from the lower stage is zero. Here, the adder 1006 of each field division circuit adds the input number of divisions to the addition result from the lower field division circuit; alternatively, instead of using the lower result, the adder 1006 may itself add up all the numbers of divisions to be accumulated up to that point. The Left Shifter 1007 shifts the result of the Right Shifter 1005 to the left by the value of the result of the adder 1006, and outputs the result to the OR gate 1002.

As a result, the OR gate 1002 calculates the logical sum of the results of the field division circuits. For each field used for the division specified by the area division information, the first k bits of the effective bits of the field are cut out, and the relative address value of the next node is determined according to the number of divisions of each field used for area division. Then, by adding the Base Address and this relative address value in the adder 1003, the address value at which the area division information of the next node is stored in the subsequent Tree Pipeline Stage can be specified.
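The combined effect of the field division circuits, the OR gate 1002, and the adder 1003 can be sketched behaviourally as follows. This is not a gate-level model, and it rests on three assumptions: each field is masked to its effective bits before the right shift, the left shift amount of each field is taken to be the sum of the division numbers of the lower fields (the text describes the shift amount as the output of the adder 1006), and field Y is treated as the lowest field. These choices were made so that the sketch reproduces the concatenated values "11" and "01" of the worked example for X = 1001 and Y = 1111.

```python
from typing import List, Tuple

# One entry per divided field, lowest field first:
# (field value, effective bit length, number of division bits k)
FieldInput = Tuple[int, int, int]


def relative_address(fields: List[FieldInput]) -> int:
    shift_so_far = 0   # division bits contributed by the lower fields
    result = 0
    for value, eff_len, k in fields:
        effective = value & ((1 << eff_len) - 1)  # assumed masking to effective bits
        top_k = effective >> (eff_len - k)         # subtractor + right shifter
        result |= top_k << shift_so_far            # left shifter + OR gate
        shift_so_far += k                          # running sum of division numbers
    return result


def child_address(base_address: int, fields: List[FieldInput]) -> int:
    return base_address + relative_address(fields)  # role of adder 1003


# Node 0 of the worked example: effective bit length 4, k = 1 for X and Y.
print(relative_address([(0b1111, 4, 1), (0b1001, 4, 1)]))
# Node 4 of the worked example: effective bit length 3, k = 1 for X and Y.
print(relative_address([(0b1111, 3, 1), (0b1001, 3, 1)]))
```

A field with k = 0 contributes nothing to the relative address, which is the property the Virtual Flag relies on.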

Here, the use of the Virtual Flag of the area division information will be described. As described above, in this packet classifier, as in Non-Patent Document 2, nodes of the same depth are basically placed on the same stage of the Tree Pipeline, but a node may be placed on a later stage. The packet classifier realizes this using a virtual node (Virtual Node), as illustrated in FIG. 19, which shows an example of the Virtual Node according to the first embodiment of this invention. In FIG. 19, node 10, which is a child node of node 4, would originally be placed on the Tree Pipeline Stage following that of node 4, but here it is placed one stage later. In this case, node 4 calculates the address values of the subsequent nodes 7, 8, 9, and 10 using, as the Base Address, the address at which the area division information of node 7 is stored. However, other nodes, for example node 11 to node 14, are placed on the Tree Pipeline Stage where node 10 is actually placed, so it is difficult to determine the address value of node 10 from an address value calculated from the Base Address of node 4. For this reason, when a node is placed on a stage different from the stage on which it would originally be placed, a Virtual Node is placed on the original stage. In FIG. 19, node V0 is placed as the Virtual Node of node 10. Since node V0 is not a real node, it has no rule list; in other words, there is no need to store BVs for it in the Priority Pipeline Stage. As a result, as in Non-Patent Document 2, the number of memory words in the Tree Pipeline Stage can be increased and the Decision Tree mapped more flexibly.

In this case, by performing the above-described processing in the area division circuit 100, node 4 can determine the address values at which the area division information of its child nodes 7, 8, 9, and V0 is stored. When the node following node 4 is node 10, the Tree Pipeline Stage following that of node 4 is directed to node V0, reads its area division information, and performs the same area division processing. At this time, since the Virtual Flag in the area division information of node V0 is "1", the multiplexer 1001 sets the division information input to each field division circuit to zero. As a result, the output of the OR gate 1002 becomes 0, and the adder 1003 outputs an address value equal to the Base Address. Since the Base Address of the area division information of a Virtual Node holds the address value of node 10, its child node, the address value of node 10 can be specified without any problem by the above processing. The same applies when a child node of a certain node is placed several Tree Pipeline Stages later and Virtual Nodes are placed on the intermediate Tree Pipeline Stages.
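The effect of the Virtual Flag on the address calculation can be shown with a minimal sketch. It assumes that the leading bits of each field have already been cut out and that the lowest field comes first; the function name and the sample Base Address are hypothetical.

```python
from typing import List, Tuple


def next_address(base_address: int, virtual_flag: int,
                 fields: List[Tuple[int, int]]) -> int:
    """fields: (leading bits already cut out, number of division bits k),
    lowest field first (an assumption of this sketch)."""
    relative, shift = 0, 0
    for top_bits, k in fields:
        if virtual_flag:
            k = 0   # multiplexer 1001 forces the division number to zero
        relative |= (top_bits & ((1 << k) - 1)) << shift
        shift += k
    return base_address + relative


# A real node adds a relative address; a Virtual Node passes Base Address through.
print(next_address(40, 0, [(1, 1), (1, 1)]))
print(next_address(40, 1, [(1, 1), (1, 1)]))
```

Because a Virtual Node only forwards its Base Address, it needs one word of area division information and no rule ID list or BV.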

Furthermore, a supplementary explanation is given for nodes that are placed, using a Virtual Node, on a stage after the Tree Pipeline Stage on which they would originally be placed. FIG. 20 is a diagram illustrating an arrangement example of the area division information of the Virtual Node according to the first embodiment of this invention. Let W_T be the number of words of the area division information storage block 102 in a Tree Pipeline Stage, and let W_P be the number of words for which a Priority Pipeline Stage can store Bit Vectors (here, the number of words means the number of nodes; that is, the Tree Pipeline Stage can hold W_T nodes, including real nodes and virtual nodes, and the Priority Pipeline Stage can store the BVs of W_P nodes). In this case, of the W_T nodes in the Tree Pipeline Stage, W_P nodes are real nodes and can have a rule ID list. Therefore, in the present invention, the child nodes of each node are stored in the area division information storage block 102 packed in order from address value 0, and nodes beyond the W_P-th node are placed on a Tree Pipeline Stage later than the stage on which they would originally be placed. In other words, for a node that has a Virtual Node as a child node, either all of its child nodes are virtual nodes, or among its child nodes those with smaller node IDs are real nodes and those with larger node IDs are all virtual nodes.

A Tree Pipeline configuration of the kind described above is also disclosed in Non-Patent Document 2. However, in the configuration disclosed in Non-Patent Document 2, the effective bits are not taken into account and the result of the division is not added to a Base Address, so it is considered difficult to set an appropriate address. In addition, when a child node of a certain node is placed several stages later, Non-Patent Document 2 tracks this using a counter called Distance Value, whereas this configuration differs in that it uses a 1-bit flag.

Next, the Priority Pipeline 20 included in the packet classifier 1 has a pipeline structure and executes the processing related to the selection of BVs and the selection of the optimal solution when performing the packet classification of the present invention. The Priority Pipeline 20 is composed of Priority Pipeline Stage 20-0, Priority Pipeline Stage 20-1, Priority Pipeline Stage 20-2, ...

FIG. 21 is a diagram showing the configuration of the Priority Pipeline Stage in the first embodiment of the present invention. Referring to FIG. 21, the Priority Pipeline Stage includes a field separation circuit 200, an address conversion circuit 201, Bit Vector (BV) selection circuits (bit array selection circuits) 200-1, 200-2, ..., 200-F, an AND gate 202, a priority check circuit 203, and a rule ID list storage block 204 formed of a storage medium such as a memory.

The field separation circuit 200 separates the search key and the effective bit length input from the Tree Pipeline Stage into the F fields included in the search key, and inputs each to the corresponding one of the F BV selection circuits. The effective bit length is set only for the fields used for area division in the Tree Pipeline. Therefore, for a field for which no effective bit length is defined, the field length of the field may be input to the BV selection circuit as the effective bit length, or nothing may be input (don't care) and the BV selection circuit may treat all bits as effective.

The address conversion circuit 201 receives, from the Tree Pipeline Stage, the address value at which the area division information of the child node at the next stage is stored. The address conversion circuit 201 determines whether the input address value is larger or smaller than W_P, the number of words (number of nodes) for which a rule ID list can be held in the Priority Pipeline Stage. If it is smaller, the node is a real node. Therefore, in order to perform the Parallel BV processing on its rule list, the address conversion circuit 201 outputs the Base Address at which the BVs of the node are stored to each BV selection circuit, and outputs the input address value as the address value of the node in the rule ID list storage block 204. At this time, since the rule ID list storage block 204 holds rule ID lists for only W_P nodes, the bit width of the address may be the smallest integer greater than or equal to log2 W_P. On the other hand, if the input address value is larger than W_P, the node is a Virtual Node, and the address conversion circuit 201 outputs a signal indicating that no processing is to be performed in the Priority Pipeline Stage.
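The decision made by the address conversion circuit 201 can be sketched as follows. The sketch assumes that addresses start at 0, so that addresses 0 to W_P - 1 belong to real nodes and every address from W_P upward belongs to a Virtual Node; the text only distinguishes "smaller" and "larger", so the treatment of the boundary value is an assumption. How the Base Address of the BVs is derived is not modelled, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def convert_address(addr: int, w_p: int) -> Tuple[bool, Optional[int]]:
    """Return (is_real_node, address in the rule ID list storage block).

    Real nodes reuse the input address; Virtual Nodes get None, which stands
    for the "do not process in this Priority Pipeline Stage" signal.
    """
    if addr < w_p:      # assumed boundary: addresses 0 .. W_P - 1 are real nodes
        return True, addr
    return False, None


def rule_list_address_width(w_p: int) -> int:
    """Smallest integer >= log2(W_P), computed without floating point."""
    return (w_p - 1).bit_length()


print(convert_address(3, 8), convert_address(9, 8))
print(rule_list_address_width(8), rule_list_address_width(5))
```

Packing real nodes at the low addresses is what lets a single comparison against W_P replace a per-node real or virtual lookup.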

The configuration of the BV selection circuit will be described later; each BV selection circuit selects the BV for its field and outputs it to the AND gate 202. The AND gate 202, to which the BV for each field is input, takes the logical product (AND) of these BVs and outputs the result to the priority check circuit 203.

The priority check circuit 203 reads the rule ID list at the address value specified by the address conversion circuit 201, compares the priorities of the rules whose rule IDs correspond to the bits having the value "1" in the BV output from the AND gate 202, together with the optimal rule input from the preceding Priority Pipeline Stage, and outputs the ID of the optimal rule at that point to the subsequent Priority Pipeline Stage.
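The comparison performed by the priority check circuit 203 can be sketched as follows. A smaller priority value is assumed to mean a higher priority, and the priority values attached to R17, R18, and R3 are hypothetical, since the priorities are not given in the rule set of FIG. 10; the position of each rule within its list is likewise an assumption.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, int]  # (rule ID, priority); smaller value = higher priority (assumption)


def priority_check(final_bv: int, rule_list: List[Optional[Rule]],
                   incoming: Optional[Rule]) -> Optional[Rule]:
    """Pick the best rule among the incoming optimal rule and the rules
    whose bit is 1 in final_bv. rule_list[i] corresponds to bit BV[i]."""
    best = incoming
    for i, entry in enumerate(rule_list):
        if entry is not None and (final_bv >> i) & 1:
            if best is None or entry[1] < best[1]:
                best = entry
    return best


# Node 8 of the worked example: final BV "11", no optimal rule so far.
best = priority_check(0b11, [("R18", 4), ("R17", 2)], None)
# Node 12 of the worked example: final BV "00", so the incoming rule survives.
best = priority_check(0b00, [None, ("R3", 1)], best)
print(best)
```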

FIG. 22 is a block diagram showing a configuration of the BV selection circuit according to the first embodiment of the present invention. Referring to FIG. 22, the BV selection circuit according to the present embodiment includes a search circuit 2000 and a Bit Vector (BV) storage block (bit array storage block) 2001 configured from a storage medium such as a memory.

The search circuit 2000 receives the Base Address at which the BVs of the node are stored, input from the address conversion circuit 201, and the data of the header field processed by this BV selection circuit and its effective bit length, input from the field separation circuit 200. It selects the BV corresponding to the field data, reads it from the BV storage block 2001, and outputs it to the AND gate 202. In the present embodiment, the Parallel BV method disclosed in Non-Patent Document 3 is used: the BV storage block 2001 stores, starting from the Base Address, the start position or end position of each section having a BV together with the BV of that section, and the search circuit 2000 reads these data and reads out the appropriate BV by performing, for example, a binary search while referring to the input header field data and effective bit length. Note that methods for realizing a binary search are well known to those skilled in the art, and a detailed description is therefore omitted here.
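A binary search over section boundaries can be sketched as follows. The sketch assumes that each section is stored as a (start position, BV) pair sorted by start position, that the first section starts at 0, and that only the effective bits of the field take part in the comparison. The section table is hypothetical sample data, not the contents of any figure, and the layout relative to the Base Address is not modelled.

```python
from bisect import bisect_right
from typing import List, Tuple


def select_bv(sections: List[Tuple[int, int]], field_value: int, eff_len: int) -> int:
    """sections: (start position, BV) pairs sorted by start, first start == 0.

    Returns the BV of the section that contains the effective bits of
    field_value.
    """
    key = field_value & ((1 << eff_len) - 1)  # compare effective bits only
    starts = [start for start, _ in sections]
    index = bisect_right(starts, key) - 1      # last section starting at or before key
    return sections[index][1]


# Hypothetical sections for a field with 3 effective bits: [0,1], [2,5], [6,7].
sections = [(0, 0b10), (2, 0b11), (6, 0b01)]
print(format(select_bv(sections, 0b1001, 3), "02b"))
```

Storing only section boundaries keeps the table size proportional to the number of sections rather than to the number of possible field values.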

In the present embodiment, the processing described above is performed, and since the output from Priority Pipeline Stage 20-H is the optimal solution, it is output as output 3.

In the present embodiment, when the Leaf Flag in the area division information indicates that the node is a leaf node, the area division in that Tree Pipeline Stage is not executed (this is omitted from the configuration diagrams). This information is output to the later Tree Pipeline Stages, and the subsequent Tree Pipeline Stages perform processing such as skipping the reading of the area division information and the area division. At this time, the Priority Pipeline Stage immediately after that Tree Pipeline Stage still needs to execute the Parallel BV processing for the rule ID list of the leaf node. Therefore, for example, the Priority Pipeline Stages that follow the Priority Pipeline Stage which received, among the various signals input from the Tree Pipeline Stage, a signal indicating a leaf node perform processing such as not executing the Parallel BV processing.

(1-2) Operation of the First Embodiment
Next, the operation of the present embodiment will be described with reference to the flowchart of FIG. 9, which shows the basic packet classification method of the present invention, and the configuration diagrams of the present embodiment in FIGS. 14, 16, 17, 18, 21, and 22. The basic packet classification method of the present invention and the outline of the operation of each component of the present embodiment are as described above, so the explanation here focuses on the operations characteristic of this embodiment.

When the header field data of the packet to be searched is input to the packet classifier 1 of the present embodiment, the root node of the Decision Tree is set as the processing node (step A1), and the Priority Pipeline Stage 20-0 performs the Parallel BV processing (step A2).

FIG. 23 is a flowchart showing the operation of the Parallel BV processing (step A2) according to the first embodiment of the present invention. In Priority Pipeline Stage 20-0, the address conversion circuit 201 converts the input address value into the address value at which the rule ID list and BV information of the processing node are stored (step B1). Subsequently, the field separation circuit 200 cuts out the effective bits of each field from the input search key and effective bit length (step B2). Next, in each of the BV selection circuits 200-1 to 200-F, the search circuit 2000 selects the appropriate BV from the BV storage block 2001 based on the address value from the address conversion circuit 201 and the effective bits from the field separation circuit 200 (step B3). The AND gate 202 takes the bitwise logical product of the BVs selected by the BV selection circuits to obtain the final BV at the node (step B4), and the processing of step A2 is completed.
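Step B4 can be sketched as a fold over the per-field BVs. The function name is hypothetical; the sample values are the BVs quoted for node 0 and node 8 in the worked example.

```python
from functools import reduce
from operator import and_
from typing import List


def final_bv(per_field_bvs: List[int], length: int) -> int:
    """Bitwise AND of the BVs selected for each of the F fields (step B4).

    length is L, the BV length; starting from all ones means a rule survives
    only if every field's BV has its bit set.
    """
    return reduce(and_, per_field_bvs, (1 << length) - 1)


print(format(final_bv([0b10, 0b01], 2), "02b"))  # node 0: BVs "10" and "01"
print(format(final_bv([0b11, 0b11], 2), "02b"))  # node 8: BVs "11" and "11"
```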

Subsequently, the priority check circuit 203 reads the rule ID list of the node from the rule ID list storage block 204, based on the address value specified by the address conversion circuit 201, and selects the optimal rule, taking into account the optimal rule obtained up to the preceding Priority Pipeline Stage (step A3). Note that the rule ID list also holds the priority of each rule.

Next, the area division information of the root node is read out in Tree Pipeline Stage 10-1 (step A4). If the Leaf Flag of the read area division information indicates that the node is not a leaf node (No in step A5), the area division processing is executed in the area division circuit 100 using the read area division information (step A6).

FIG. 24 is a flowchart showing an operation (step A6) at the time of area division according to the first embodiment of the present invention. The area division processing is executed by the field division circuits 100-1 to 100-C in the area division circuit 100. The input data to each field dividing circuit is as described above.

In the area division processing, first, the number of divisions so far that is input to the adder 1006 in the field division circuit 100-1 is set to 0 (step C1). Subsequently, the subtracter 1004 subtracts the number of divisions of the field from the input effective bit length (step C2), and the right shifter 1005 shifts the input field data to the right by the value obtained in step C2 (step C3). Meanwhile, the adder 1006 adds the number of divisions so far and the number of divisions of the field (step C4), and the left shifter 1007 shifts the field data obtained in step C3 to the left by the resulting value (step C5).

The above is the processing in the field division circuit 100-1. If there are other fields for which area division is performed (Yes in step C6), the same processing (steps C2 to C5) is executed for those fields. The number of divisions so far that is input to the adder 1006 in the field division circuit 100-2 is the addition result of the adder 1006 in the field division circuit 100-1, and the number of divisions so far that is input to the adder 1006 in the field division circuit 100-3 is the addition result of the adder 1006 in the field division circuit 100-2. In general, the number of divisions so far that is input to the adder 1006 in the field division circuit 100-n (n = 2, 3, ..., C) is the addition result of the adder 1006 in the field division circuit 100-(n−1).

When the field division circuit 100-C has completed the processing up to step C5, processing is complete for all the area division fields included in the area division information (No in step C6), and the logical sum of the results obtained by the field division circuits (the outputs of the left shifters 1007) is calculated (step C7). Finally, the adder 1003 adds the result of step C7 and the Base Address (step C8), and the processing of step A6 ends.

Meanwhile, the effective bit length update circuit 103 updates the effective bit length of each input field used for the division, based on the number of divisions specified by the area division information (step A7), and the processing returns to step A2.

The above processing is performed in all Priority Pipeline Stages and all Tree Pipeline Stages. Here, the processing of the leaf nodes of the Decision Tree will be described. If the node processed by Tree Pipeline Stage 10-H is not a leaf node, that is, if the Leaf Flag is '0' when the area division information is read, the node processed next is always a leaf node because of the restriction on the Decision Tree handled by this packet classifier. Since no area division is performed at this final leaf node, Priority Pipeline Stage 20-H performs the Parallel BV processing on the rule list of the leaf node, and the optimal solution is obtained from the results so far. On the other hand, if the Leaf Flag in the read area division information is '1' before Tree Pipeline Stage 10-H is reached, that is, if the processing node is a leaf node, then as described above only the Parallel BV processing for the rule list of the leaf node is executed in the following Priority Pipeline Stage. The later Priority Pipeline Stages and Tree Pipeline Stages either do not execute their processing or, if they do, leave the optimal solution obtained so far unchanged. Once the leaf node has been processed, the optimal rule obtained so far becomes the final solution (step A5).

In the above-described embodiment, each Tree Pipeline Stage determines whether the processing node is a real node or a Virtual Node by means of a Virtual Flag added to the area division information. Alternatively, the same processing can be realized without holding the Virtual Flag in the area division information: the address conversion circuit 201 of the Priority Pipeline Stage is placed immediately after the adder 1003 of the Tree Pipeline Stage, the calculated address value is compared with the number of words (number of nodes) W_P of the subsequent Priority Pipeline Stage to determine whether the node is a Virtual Node, and the result is output to the subsequent Tree Pipeline Stage.

Further, in the above-described embodiment, each Priority Pipeline Stage is based on the Parallel BV processing disclosed in Non-Patent Document 3, but it may instead be based on the Parallel BV processing disclosed in Non-Patent Document 4. The details of the Parallel BV processing in that case are omitted because they are disclosed in Non-Patent Document 4, but the Parallel BV processing in step A2 is as shown in FIG. In either case, whether based on Non-Patent Document 3 or Non-Patent Document 4, the rule ID list is used to associate each bit position of the BV with a rule.

FIG. 25 is a flowchart showing the operation (step A2) at the time of Parallel BV processing according to the first embodiment of the present invention when the Parallel BV processing disclosed in Non-Patent Document 4 is used as the base. In this case as well, Priority Pipeline Stage 20-0 performs the processing of steps B1 and B2. Steps B1 and B2 are the same as the operations in the flowchart shown in FIG. Next, when the Parallel BV processing disclosed in Non-Patent Document 4 is used, in each of the BV selection circuits 200-1 to 200-F the search circuit 2000 reads one BV per sub-field of each field from the BV storage block 2001, based on the address value from the address conversion circuit 201 and the valid bits from the field separation circuit 200 (step B5). Subsequently, the AND gate 201 takes the bitwise logical product of the plurality of BVs read by the BV selection circuits to obtain the final BV at the node (step B6), and the processing of step A2 is completed. When the Parallel BV processing disclosed in Non-Patent Document 4 is used as the base, the BV selection circuits need not be separated for each field, and a plurality of BVs may be read by a single BV selection circuit.
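The sub-field variant of steps B5 and B6 can be sketched as follows. This is only an illustration under assumptions: each sub-field is taken to be 2 bits wide, each sub-field has a small table with one BV per possible sub-field value, and a BV is a Python integer whose bit k corresponds to the k-th rule list entry. The names `subfield_bv` and `TABLES` are hypothetical, and the actual table organization of Non-Patent Document 4 may differ.

```python
def subfield_bv(tables, key, width, num_rules, sub_bits=2):
    """Read one BV per sub-field (step B5, sketch) and AND them (step B6).
    `tables[i][v]` is the BV for sub-field i when its bits equal v,
    with sub-field 0 being the most significant bits of `key`."""
    final_bv = (1 << num_rules) - 1          # start with all rules possible
    mask = (1 << sub_bits) - 1
    for i, table in enumerate(tables):
        shift = width - (i + 1) * sub_bits   # position of sub-field i
        chunk = (key >> shift) & mask
        final_bv &= table[chunk]
    return final_bv

# Hypothetical 4-bit field with L = 2 rules:
#   rule 0 (bit 0) matches prefix 01**, rule 1 (bit 1) matches exactly 0110.
TABLES = [
    [0b00, 0b11, 0b00, 0b00],  # upper 2 bits: only 01 keeps both rules
    [0b01, 0b01, 0b11, 0b01],  # lower 2 bits: rule 0 is wildcard, rule 1 needs 10
]

print(bin(subfield_bv(TABLES, 0b0110, width=4, num_rules=2)))
```

Because every sub-field lookup is a direct table index, no binary search over sections is needed; the cost is one BV read per sub-field.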

Furthermore, the above description uses an example in which the root node of the Decision Tree is arranged in Tree Pipeline Stage 10-1, but it is obvious that the root nodes of a plurality of subtrees of the Decision Tree may instead be arranged starting from Tree Pipeline Stage 10-1, without any change to the configuration of the present embodiment. FIG. 26 is a diagram illustrating an example of mapping the nodes of a plurality of Decision Trees to the Tree Pipeline Stages according to the first embodiment of this invention. In FIG. 26, the child nodes (node 1 and node 14 in FIG. 26) of node 0, which is the root node of the Decision Tree, are arranged in Tree Pipeline Stage 10-1. In this case, N subtrees whose root nodes are the N child nodes of node 0 are arranged on one Tree Pipeline. However, a processing block corresponding to the area division processing at node 0, the root node, is then required. A processing block corresponding to an Index Table is therefore arranged immediately after input 2 of the packet classifier 1 shown in FIG. 14, and the search key, effective bit length, address value, and other inputs to Tree Pipeline Stage 10-1 and Priority Pipeline Stage 20-0 are determined based on its output. This processing block may use the configuration of the packet classifier described above, or may make its determination by referring to the first few bits of the search key. Since it can be easily configured by those skilled in the art, its description is omitted.

(1-3) Operational Effects of First Embodiment

Next, the operational effects of the first embodiment of the present invention will be described.

As described above, in the present embodiment, combining the Decision Tree and the Parallel Bit Vector reduces the amount of data read from the memory per packet. Even if the total length of the header fields constituting one rule increases or the number of rules increases, it is possible to provide a packet classifier that suppresses the increase in the dynamic power of the memory and consequently does not increase the power consumption of the hardware as a whole.

Specifically, the amount of data read from the memory per packet by the packet classifier of the present invention is compared with that of the Decision Tree-based method as follows.

First, if the rule is composed of F fields, the field length of each field is W_i [bits] (i = 0, 1, ..., F−1), and the total bit length constituting the rule is W [bits], the following equation holds.

$$W = \sum_{i=0}^{F-1} W_i \qquad \text{(Equation 1)}$$

Assuming Exact Match, Prefix Match, and Range Match as the matching methods for the rules targeted by the packet classifier of the present invention, each field constituting the rule requires twice its bit length, so the rule as a whole requires 2W [bits]. This is because a lower limit value and an upper limit value must both be specified, as in Range Match. When two values are specified per field in this way, Exact Match and Prefix Match can also be expressed by specifying a concrete value with the first value and a mask with the second value. Strictly speaking, a 1-bit flag or the like may also be needed to indicate whether the second value is a mask or the upper limit value of a Range Match, but for simplicity the rule is assumed here to be 2W [bits].
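The two-value encoding described above can be illustrated with a small sketch. It assumes that the optional 1-bit flag is present and is passed as `is_range`; the function name `field_matches` and the sample values are hypothetical.

```python
def field_matches(key, first, second, is_range):
    """Check one header field against a rule field stored as two values.
    is_range=True : first/second are the lower/upper limits (Range Match).
    is_range=False: first is a value and second is a mask
                    (all-ones mask = Exact Match, leading-ones mask = Prefix Match)."""
    if is_range:
        return first <= key <= second
    return (key & second) == (first & second)

# Prefix Match on a 32-bit address: 192.168.1.0/24
print(field_matches(0xC0A80105, 0xC0A80100, 0xFFFFFF00, is_range=False))
# Range Match on a 16-bit port: 1024..65535
print(field_matches(80, 1024, 65535, is_range=True))
```

With this encoding every field occupies two values of its own width, which is where the 2W [bits] per rule comes from.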

Next, let R be the total number of rules supported by this packet classifier, and let L be the number of rules that can be included in the rule list of each node. The number of rules L that can be included in a rule list may differ from node to node, but here the same value is used for all nodes. In the Decision Tree-based method, the rules are stored in a separate storage area for each rule list so that each rule list can be read efficiently from contiguous locations. That is, a rule that spans several regions is duplicated in each of their rule lists, so R and the sum of L over all nodes are not necessarily equal.

Note that the amount of data read in each Tree Pipeline Stage, which traverses the Decision Tree itself, is almost the same for the Decision Tree-based method and the packet classification method of the present invention. It is therefore omitted, and the estimate focuses on the amount of data read in the matching process against the rule list.

First, in the case of the Decision Tree-based method, all L rules included in one rule list must be read from the memory. Since a rule is defined by 2W [bits] as described above, the amount of data D_E [bits] read per node is the product of the number of rules L included in the rule list and the rule bit length 2W [bits], that is, D_E = 2 × L × W.

On the other hand, when the packet classification method of the present invention is used based on the Parallel BV processing disclosed in Non-Patent Document 3, the search circuit 2000 must read, for each field, W_i + L [bits] of data from the BV storage block 2001, consisting of the start value or end value of a section and the BV for the L rules included in the rule list. When a binary search is used over N elements, the search can generally be completed with [log₂ N] + 1 comparisons, where [x] means the smallest integer greater than or equal to x. In addition, according to Non-Patent Document 2, the number of sections in Parallel BV for L rules is at most 2L + 1. Therefore, the amount of data read per field is (W_i + L) × {[log₂(2L + 1)] + 1}, and this is done for F fields. One rule ID list is also read. When the number of rules is R, a rule ID can be expressed in [log₂ R] [bits], and the rule ID list includes L rule IDs. Using Equation 1, the amount of data D_P [bits] read per rule list is then given by the following equation.

$$D_P = L \times [\log_2 R] + \sum_{i=0}^{F-1} (W_i + L) \times ([\log_2 (2L+1)] + 1) = L \times [\log_2 R] + (W + F \times L) \times ([\log_2 (2L+1)] + 1)$$

For example, suppose the rule is a 5-tuple of source IP address (32 bits), destination IP address (32 bits), protocol number (8 bits), source port number (16 bits), and destination port number (16 bits), so that W = 104 [bits] and F = 5, the number of rules included in the rule list is L = 8, and the total number of rules is R = 10K (= 10 × 2¹⁰). Then D_E = 2 × 8 × 104 = 1664 [bits] and D_P = 8 × [log₂(10 × 2¹⁰)] + (104 + 5 × 8) × ([log₂(2 × 8 + 1)] + 1) = 8 × 14 + 144 × 6 = 112 + 864 = 976 [bits], so the amount of data read is reduced by 688 bits per rule list. Since H + 1 rule lists are processed for a Decision Tree of height H, the total amount of data read is reduced by H + 1 times this difference.

Likewise, when L = 16 in the above, D_E = 2 × 16 × 104 = 3328 [bits] and D_P = 16 × [log₂(10 × 2¹⁰)] + (104 + 5 × 16) × ([log₂(2 × 16 + 1)] + 1) = 16 × 14 + 184 × 7 = 224 + 1288 = 1512 [bits], so the amount of data read is reduced by 1816 bits per rule list. It can be seen that the difference increases as the number of rules L included in the rule list increases.

Similarly, assuming IPv6 in the above, if the source IP address and the destination IP address are 128 bits each, W = 296 [bits]. In this case, D_E = 2 × 8 × 296 = 4736 [bits] and D_P = 8 × [log₂(10 × 2¹⁰)] + (296 + 5 × 8) × ([log₂(2 × 8 + 1)] + 1) = 8 × 14 + 336 × 6 = 112 + 2016 = 2128 [bits], so the amount of data read is reduced by 2608 bits per rule list. It can be seen that the difference increases as the bit length of the rule increases.

Further, assuming that the total number of rules is R = 1M (= 2²⁰), D_E = 2 × 8 × 104 = 1664 [bits] and D_P = 8 × [log₂(2²⁰)] + (104 + 5 × 8) × ([log₂(2 × 8 + 1)] + 1) = 8 × 20 + 144 × 6 = 160 + 864 = 1024 [bits]. Even when the total number of rules increases to 1M, the amount of data read is reduced by 640 bits per rule list.
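The estimates for the variant based on Non-Patent Document 3 can be reproduced with a short calculation. The sketch below assumes the per-node read amounts are D_E = 2 × L × W for the Decision Tree-based method and D_P = L × [log₂ R] + (W + F × L) × ([log₂(2L + 1)] + 1) for the present method, as described in the text; the function names are hypothetical.

```python
def ceil_log2(n):
    """Smallest integer >= log2(n), computed exactly with integers (n >= 1)."""
    return (n - 1).bit_length()

def d_e(W, L):
    """Bits read per rule list when all L rules of 2W bits are read."""
    return 2 * L * W

def d_p_sections(W, F, L, R):
    """Bits read per rule list with section-based Parallel BV:
    one rule ID list plus, per field, a binary search over at most 2L+1 sections."""
    rule_id_list = L * ceil_log2(R)
    reads_per_field = ceil_log2(2 * L + 1) + 1
    return rule_id_list + (W + F * L) * reads_per_field

# IPv4 5-tuple example: W = 104, F = 5, L = 8, R = 10 * 2**10
print(d_e(104, 8))                          # 1664
print(d_p_sections(104, 5, 8, 10 * 2**10))  # 976
```

Changing `W`, `L`, or `R` shows how the gap between the two methods varies with rule width, rule list size, and total rule count.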

Further, when the packet classification method of the present invention is used based on the Parallel BV processing disclosed in Non-Patent Document 4, the search circuit 2000 reads a BV of L [bits] for each sub-field consisting of several bits. If 2-bit sub-fields are defined for all fields, that is, if a BV is read for every 2 bits, W/2 BVs are read for a rule of bit length W [bits]. Since one rule ID list is also read, the amount of data D_P [bits] read per rule list is given by the following equation.

$$D_P = L \times \left([\log_2 R] + \frac{W}{2}\right)$$

For example, if W = 104 [bits], F = 5, L = 8, and R = 10K (= 10 × 2¹⁰), then D_E = 1664 [bits] and D_P = 8 × ([log₂(10 × 2¹⁰)] + 104/2) = 8 × (14 + 52) = 528 [bits], so the amount of data read is reduced by 1136 bits per rule list. Since H + 1 rule lists are processed for a Decision Tree of height H, the total amount of data read is reduced by H + 1 times this difference.

Further, if L = 16 in the above, D_E = 3328 [bits] and D_P = 16 × ([log₂(10 × 2¹⁰)] + 104/2) = 16 × (14 + 52) = 1056 [bits], so the amount of data read is reduced by 2272 bits per rule list. It can be seen that the difference increases as the number of rules L included in the rule list increases.

Similarly, assuming IPv6 in the above, if the source IP address and the destination IP address are 128 bits each, W = 296 [bits]. In this case, D_E = 4736 [bits] and D_P = 8 × ([log₂(10 × 2¹⁰)] + 296/2) = 8 × (14 + 148) = 1296 [bits], so the amount of data read is reduced by 3440 bits per rule list. It can be seen that the difference increases as the bit length of the rule increases.

Further, assuming that the total number of rules is R = 1M (= 2²⁰), D_E = 1664 [bits] and D_P = 8 × ([log₂(2²⁰)] + 104/2) = 8 × (20 + 52) = 576 [bits]. Even when the total number of rules increases to 1M, the amount of data read is reduced by 1088 bits per rule list.
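The estimates for the variant based on Non-Patent Document 4 follow from D_P = L × ([log₂ R] + W/2) with 2-bit sub-fields. The sketch below reproduces them; the function names are hypothetical, and the `sub_bits` parameter (with rounding up when W is not a multiple of it) is an assumed generalization that the text does not state.

```python
def ceil_log2(n):
    """Smallest integer >= log2(n), computed exactly with integers (n >= 1)."""
    return (n - 1).bit_length()

def d_p_subfields(W, L, R, sub_bits=2):
    """Bits read per rule list with sub-field Parallel BV:
    one L-bit BV per sub-field plus one rule ID list of L IDs."""
    num_bvs = -(-W // sub_bits)   # ceiling division; equals W/2 for the cases in the text
    return L * (ceil_log2(R) + num_bvs)

def saving(W, L, R):
    """Difference from the Decision Tree-based read amount 2 * L * W."""
    return 2 * L * W - d_p_subfields(W, L, R)

print(d_p_subfields(104, 8, 10 * 2**10))  # 528
print(saving(104, 8, 10 * 2**10))         # 1136
```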

From the above, it can be seen that the amount of data read from the memory per packet processing can be reduced by using the packet classifier of the present invention. As a result, the dynamic power of the memory can be reduced, and the overall power consumption can be reduced.

In addition, by combining the Decision Tree and Parallel BV, the number of rules that may match can be narrowed down by the Decision Tree even when there are many rules. The bit length of the BV can therefore be reduced, and it is possible to provide a packet classifier that suppresses an increase in the number of clock cycles required to read data from the memory.

(2) Second Embodiment

(2-1) Configuration of Second Embodiment

Next, a second embodiment of the present invention will be described with reference to the drawings.

The packet classifier according to the second embodiment of the present invention differs from the packet classifier of the first embodiment in that it can be used without reconfiguring the packet classifier itself even when fields other than predetermined fields are used as the header fields constituting the rule. However, the bit length W [bits] of the entire rule, the number F of header fields constituting the rule, and the number C of header fields used for area division at each node are determined in advance, and the fields can be set freely within the range permitted by these conditions.

The overall configuration of the packet classifier in the second embodiment of the present invention is the same as that of the packet classifier in the first embodiment shown in FIG. 14, but the configuration of the Tree Pipeline Stage in the second embodiment differs from that of the Tree Pipeline Stage in the first embodiment.

FIG. 27 is a block diagram showing the configuration of the Tree Pipeline Stage according to the second embodiment of the present invention. Referring to FIG. 27, the Tree Pipeline Stage in the second embodiment of the present invention replaces the area dividing circuit 100 and the effective bit length update circuit 103 of the Tree Pipeline Stage in the first embodiment of the present invention shown in FIG. with an area dividing circuit 105 and an effective bit length update circuit 107, respectively, and newly adds a field extraction circuit 106. The other configurations are the same as those of the Tree Pipeline Stage in the first embodiment, and thus detailed description thereof is omitted.

FIG. 28 is a diagram showing the area division information at each node of the Decision Tree in the second embodiment of the present invention. The second embodiment uses area division information in which each header field constituting a rule, defined within the range that meets the above-described conditions, is uniquely identified by a field ID. This area division information is the same as the area division information of the basic packet classification method of the present invention shown in FIG. 7, except that it holds C pairs of a field ID and a number of divisions. The field IDs are defined in advance for the header fields constituting the rule.

The field extraction circuit 106 in this configuration receives the area division information read by the memory controller 101 from the area division information storage block 102, together with the search key and its effective bit length. The field extraction circuit 106 refers to the field IDs included in the area division information and extracts, from the search key and the effective bit length, the data of the header fields used for area division at the processing node. The extracted field data of the search key and of the effective bit length, together with the area division information, are output to the area dividing circuit 105 with each item of information separated. The extracted field data of the effective bit length and the number of divisions of each field are output to the effective bit length update circuit 107.

The effective bit length update circuit 107 updates each effective bit length by subtracting the number of divisions from the effective bit length of each input header field used for area division.

FIG. 29 is a block diagram showing the configuration of the area dividing circuit 105 according to the second embodiment of the present invention. Referring to FIG. 29, the area dividing circuit 105 according to the second embodiment of the present invention is the area dividing circuit 100 according to the first embodiment of the present invention shown in FIG. with the area division information separating circuit 1000 removed. Since the other configurations are the same as those of the area dividing circuit 100 according to the first embodiment, detailed description thereof is omitted. In the present embodiment, each item of the area division information has already been separated by the field extraction circuit 106 when it is input to the area dividing circuit 105, so the area division information separating circuit 1000, which performed the same function, is removed.

(2-2) Operation of the Second Embodiment

Next, the operation of the present embodiment will be described with reference to FIG. 30, a flowchart showing the operation (step A6) at the time of area division according to the second embodiment of the present invention. The operation of the present embodiment is basically the same as the flowchart of the basic packet classification method shown in FIG. 9, and only the operation at the time of area division in step A6 differs. Therefore, only the operation of step A6 shown in FIG. 9 is described here, and detailed description of the other steps is omitted.

In the processing at the time of area division in the present embodiment, first, the field extraction circuit 106 receives the area division information from the Memory Controller 101, and the search key and effective bit length from the preceding Tree Pipeline Stage. The field extraction circuit 106 refers to the C field IDs included in the area division information, extracts the number of divisions of each corresponding field and the corresponding field data from the search key and the effective bit length, and outputs them to the area dividing circuit 105 (step C9). The area dividing circuit 105 that has received this data executes the processing from step C1 to step C8 in the same way as the operation at the time of area division in the first embodiment of the present invention shown in FIG. The resulting address value is output as the address value of the memory storing the area division information of the next node (step C10). Note that the processing from step C1 to step C8 is the same as the operation in the first embodiment of the present invention, and thus detailed description thereof is omitted.

Note that in this embodiment, as in the first embodiment, each Tree Pipeline Stage determines whether the processing node is a real node or a Virtual Node by means of a Virtual Flag provided in the area division information. Alternatively, the same processing can be realized without holding the Virtual Flag in the area division information: the address conversion circuit 201 of the subsequent Priority Pipeline Stage is placed immediately after the adder 1003 of the Tree Pipeline Stage, the calculated address value is compared with the number of words (number of nodes) W_P of the subsequent Priority Pipeline Stage to determine whether the node is a Virtual Node, and the result is output to the subsequent Tree Pipeline Stage.

Further, in the present embodiment, as in the first embodiment, each Priority Pipeline Stage is based on the Parallel BV processing disclosed in Non-Patent Document 3, but it may instead be based on the Parallel BV processing disclosed in Non-Patent Document 4.

Furthermore, in the present embodiment, as in the first embodiment, the description uses an example in which the root node of the Decision Tree is arranged in Tree Pipeline Stage 10-1, but it is obvious that the root nodes of a plurality of subtrees of the Decision Tree may instead be arranged starting from Tree Pipeline Stage 10-1, without any change to the configuration of the present embodiment.

(2-3) Operational Effects of Second Embodiment

Next, the operational effects of the second embodiment of the present invention will be described.

In this embodiment, as in the first embodiment, combining the Decision Tree and the Parallel Bit Vector reduces the amount of data read from the memory per packet. Even if the total length of the header fields constituting one rule increases or the number of rules increases, it is possible to provide a packet classifier that suppresses the increase in the dynamic power of the memory and consequently does not increase the power consumption of the hardware as a whole. The comparison between the Decision Tree-based method and the packet classifier of the present invention in terms of the amount of data read from the memory per packet is the same as in the first embodiment, and is therefore omitted.

Also, in this embodiment, as in the first embodiment, by combining the Decision Tree and Parallel BV, the number of rules that may be matched by the Decision Tree can be narrowed down even if the number of rules is large. Therefore, it is possible to provide a packet classifier that can reduce the BV bit length and can suppress an increase in the number of clock cycles required to read data from the memory.

Furthermore, unlike the first embodiment, the present embodiment can provide a packet classifier in which the packet header information used in the rules can be freely changed, without changing the hardware circuit, within the range allowed by the bit length W of the rule, the number of fields F, and the number of fields C that can be used for area division. In the present embodiment, the number C of header fields used for area division at each node (included in the area division information shown in FIG. 28), which is also the number C of field division circuits included in the area dividing circuit 105, the number F of BV selection circuits included in each Priority Pipeline Stage, and the bit width W of the signal lines that carry the search key within this packet classifier are determined in advance. It is therefore clear that this packet classifier allows free changes within the range that meets the above-described conditions.

(3) Third Embodiment

(3-1) Configuration of Third Embodiment

Next, a third embodiment of the present invention will be described with reference to the drawings.

The packet classifier according to the third embodiment of the present invention differs from the first and second embodiments in that the search key is divided into a plurality of parts, for example by header field, and packet classification is performed using a plurality of Decision Trees, one for each of the resulting search keys (referred to as sub-search keys).

FIG. 31 is a diagram showing the configuration of a packet classifier according to the third embodiment of the present invention. Referring to FIG. 31, the packet classifier 4 in the third embodiment of the present invention includes P Decision Tree processing circuits 30-1, 30-2, ..., 30-P, an optimal solution selection circuit 40, and a search key separation circuit 50.

In the input 2 in the present embodiment, a search key is input as in the first and second embodiments. In the output 3 according to the present embodiment, the most appropriate rule ID is output as a result among the matched rules, as in the first and second embodiments.

The search key separation circuit 50 separates the input search key into P sub-search keys and outputs each sub-search key to the Decision Tree processing circuit that implements the Decision Tree corresponding to that sub-search key.

FIG. 32 is a diagram illustrating the configuration of a Decision Tree processing circuit according to the third embodiment of the present invention. The Decision Tree processing circuit in the present embodiment has a configuration in which the Priority Pipeline 20 of the packet classifier 1 in the first and second embodiments is replaced with a solution candidate selection circuit 21, and a rule ID list of solution candidates is output from the solution candidate selection circuit 21 to the optimal solution selection circuit 40. Since the other configurations are the same as those of the packet classifier 1 in the first and second embodiments, detailed description thereof is omitted.

The solution candidate selection circuit 21 includes H + 1 bit vector (BV) processing circuits 21-0, 21-1, 21-2, ..., 21-(H−1), 21-H. FIG. 33 is a block diagram showing the configuration of the BV processing circuit in the present embodiment. Referring to FIG. 33, the BV processing circuit of the present embodiment is obtained from the configuration of the Priority Pipeline Stage in the first and second embodiments of the present invention shown in FIG. by replacing the priority check circuit 203 with a solution candidate rule ID list generation circuit 205. There is no rule ID input from the preceding Priority Pipeline Stage and no rule ID output to the subsequent Priority Pipeline Stage; instead, the solution candidate rule ID list generation circuit 205 outputs a rule ID list of solution candidates to the optimal solution selection circuit 40. The other configuration is the same as that of the Priority Pipeline Stage in the first and second embodiments of the present invention shown in FIG., and its description is therefore omitted.

The solution candidate rule ID list generation circuit 205 receives the final BV of this BV processing circuit from the OR gate 202, and reads the rule ID list from the rule ID list storage block 204 according to the address value input from the Tree Pipeline Stage. Subsequently, for each bit position of the BV received from the OR gate 202, if the value is 1, the corresponding rule ID in the rule ID list is left as it is. If the value is 0, the search key does not match that rule, so all the bits of the rule ID are set to 1. A rule ID whose bits are all 1 is treated as don't care, meaning that there is no matching rule in that rule ID slot. The rule ID list generated in this way is output to the optimal solution selection circuit 40.
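The masking performed by the solution candidate rule ID list generation circuit 205 can be sketched in software as follows. The sketch assumes that bit position k of the BV corresponds to the k-th entry of the rule ID list and that rule IDs are `id_bits` wide; the function name `candidate_rule_ids` and the sample IDs are hypothetical.

```python
def candidate_rule_ids(bv, rule_ids, id_bits):
    """Keep the rule ID where the BV bit is 1; otherwise replace it with
    the all-ones pattern, which is treated as don't care (no matching rule)."""
    dont_care = (1 << id_bits) - 1
    return [
        rule_id if (bv >> pos) & 1 else dont_care
        for pos, rule_id in enumerate(rule_ids)
    ]

# Hypothetical rule list of 4 entries with 14-bit rule IDs; bits 0 and 2 matched.
print(candidate_rule_ids(0b0101, [3, 7, 12, 9], id_bits=14))
# [3, 16383, 12, 16383]
```

Note that under this convention the all-ones value cannot be used as a real rule ID.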

The optimal solution selection circuit 40 receives a total of P × (H + 1) rule ID lists from the H + 1 BV processing circuits included in each of the P Decision Tree processing circuits. Among these, the rule ID lists input from the H + 1 BV processing circuits belonging to the same Decision Tree are combined into the solution candidate rule ID list for that sub-search key. FIG. 34 is a diagram illustrating a configuration example of a rule ID list according to the third embodiment of the present invention. The optimal solution selection circuit 40 compares the P solution candidate rule ID lists, identifies the rule IDs included in all P lists, takes the one with the highest priority as the optimal solution, and outputs its rule ID as output 3. Note that identifying the rule IDs included in all P solution candidate rule ID lists is merely a comparison process that can be easily realized by those skilled in the art, and thus detailed description thereof is omitted.
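The selection performed by the optimal solution selection circuit 40 amounts to intersecting the P candidate lists and picking the highest-priority survivor. The following sketch assumes that each candidate list has already been merged over the H + 1 BV processing circuits of one Decision Tree, that unmatched slots hold an all-ones don't-care value, and that a smaller priority number means higher priority. The names `select_optimal`, `DONT_CARE`, and `priority` are hypothetical.

```python
DONT_CARE = (1 << 14) - 1  # all-ones pattern for 14-bit rule IDs (assumed width)

def select_optimal(candidate_lists, priority, dont_care=DONT_CARE):
    """Return the rule ID present in every candidate list with the best
    priority, or None when no rule matches all sub-search keys."""
    common = None
    for rule_ids in candidate_lists:
        matched = {r for r in rule_ids if r != dont_care}
        common = matched if common is None else common & matched
    if not common:
        return None
    # Assumption: a smaller priority value means a higher priority.
    return min(common, key=lambda r: priority[r])

lists = [
    [3, DONT_CARE, 12],   # candidates from Decision Tree 1
    [12, 3, DONT_CARE],   # candidates from Decision Tree 2
    [DONT_CARE, 12, 3],   # candidates from Decision Tree 3
]
print(select_optimal(lists, priority={3: 5, 12: 1}))
```

A rule that matches only some of the sub-search keys drops out of the intersection, which is why the final check across all P lists is needed.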

In this embodiment, the rule is divided into P parts, for example in units of one or more of the fields constituting the rule or by fixed division of the bit length of the rule, and a Decision Tree is configured for each of the resulting fields or bit groups. The P Decision Trees configured in this way are implemented by the P Decision Tree processing circuits, respectively. More specifically, each Decision Tree is configured on the Tree Pipeline 10 in its Decision Tree processing circuit, and the Parallel BV processing of the first and second embodiments is executed in the BV processing circuits of the solution candidate selection circuit 21. In this case, however, a solution obtained by an individual Decision Tree processing circuit matches only the divided field or bit group, and it is not yet known whether the rule matches the entire search key. For this reason, the optimum solution selection circuit 40 checks the solution candidate rule IDs from all Decision Tree processing circuits again and selects the optimum solution.

(3-2) Operation of the Third Embodiment Next, the operation of the present embodiment will be described with reference to the flowchart of FIG. 35 showing the operation of the third embodiment of the present invention.

In the packet classification operation of the present embodiment, the search key input as input 2 is first separated by the search key separation circuit 50 into sub search keys corresponding to the P Decision Trees, and each sub search key is output to the corresponding Decision Tree processing circuit (step A8). As described above, in the present embodiment the rule, and hence the search key, is divided into P parts in units of one or more constituent fields or by fixed division of the bit length, and a Decision Tree is configured for each part. A sub search key is one of these P fields or bit groups.
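The separation of step A8 can be sketched in software as a simple bit slicing of the search key. This is only an illustration under stated assumptions: the search key is modeled as a non-negative integer with the first sub search key in the most significant bits, and the function name `split_search_key` and the example widths (a 32-bit address followed by a 16-bit port) are hypothetical.

```python
from typing import List


def split_search_key(key: int, widths: List[int]) -> List[int]:
    """Split a search key into P sub search keys of the given bit widths.

    key    -- whole search key as a non-negative integer (MSB-first layout assumed)
    widths -- bit width of each of the P sub search keys, in order
    """
    total = sum(widths)
    if key < 0 or key >= (1 << total):
        raise ValueError("search key does not fit in the total bit length")
    sub_keys = []
    shift = total
    for width in widths:
        shift -= width  # position of the lowest bit of this sub search key
        sub_keys.append((key >> shift) & ((1 << width) - 1))
    return sub_keys


# Illustrative values only: P = 2, a 32-bit field followed by a 16-bit field.
key = (0xC0A80101 << 16) | 0x0050
print([hex(k) for k in split_search_key(key, [32, 16])])  # ['0xc0a80101', '0x50']
```

Each element of the returned list would then be handed to its own Decision Tree processing circuit.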

Subsequently, each Decision Tree processing circuit performs the processing of steps A1 to A7 shown in FIG. 9 in the same manner as in the first and second embodiments (step A9). The only difference is that the Decision Tree in the first and second embodiments covers the entire search key, whereas the Decision Tree in the present embodiment covers a sub search key; the processing itself is unchanged. FIG. 36 is a flowchart showing the Parallel BV processing performed in step A2 in the present embodiment. In the present embodiment, the Parallel BV processing shown in FIG. 36 is performed in each BV processing circuit of the solution candidate selection circuit 21. Referring to FIG. 36, this processing is basically the same as the Parallel BV processing in the first and second embodiments. Each BV processing circuit receives an address value, a search key (here, a sub search key), and its effective bit length from the corresponding Tree Pipeline Stage, and executes the processing of steps B1 to B4 (step B7). Since step B7 is the same as in the first and second embodiments, its detailed description is omitted. The solution candidate rule ID list generation circuit 205 in each BV processing circuit then reads the rule ID list from the rule ID list storage block 204 in accordance with the address value input from the Tree Pipeline Stage, and generates the rule ID list to be output to the optimum solution selection circuit 40 while checking the value of each bit position of the BV input from the OR gate 202 against the read rule ID list. For each bit of the BV input from the OR gate 202, a value of 1 means that the rule corresponding to that bit position is a solution candidate, so the rule ID in the rule ID list is left as it is; a value of 0 means that the rule does not match, so all bits of that rule ID are set to 1 (step B8). A rule ID whose bits are all 1 is treated as don't care, meaning that there is no matching rule in that rule ID area.

The rule ID list generated as described above is output to the optimum solution selection circuit 40. The optimum solution selection circuit 40 receives rule ID lists of matching rules for each sub search key from the H + 1 BV processing circuits included in each of the P Decision Tree processing circuits. Of these rule ID lists, those from BV processing circuits included in the same Decision Tree processing circuit are combined to generate a solution candidate rule ID list (step A10).

Finally, the optimum solution selection circuit 40 identifies the rule IDs included in all P solution candidate rule ID lists, and takes the rule with the highest priority among them as the optimum solution (step A11).
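Steps A10 and A11 can be sketched in software as a union within each Decision Tree followed by an intersection across the P trees. This is a minimal sketch, not the circuit: the function names are hypothetical, and the priority order is an assumption, since the sketch treats a smaller rule ID as higher priority, which is not stated in this passage.

```python
from typing import Iterable, List, Optional, Set


def combine_stage_lists(stage_lists: Iterable[List[int]], dont_care: int) -> Set[int]:
    """Step A10: merge the H + 1 rule ID lists of one Decision Tree, dropping don't-care IDs."""
    candidates: Set[int] = set()
    for id_list in stage_lists:
        candidates.update(rid for rid in id_list if rid != dont_care)
    return candidates


def select_optimum(per_tree_lists: List[List[List[int]]], dont_care: int) -> Optional[int]:
    """Step A11: keep rule IDs present in all P candidate lists and pick one by priority.

    Assumption for illustration: a smaller rule ID means a higher priority.
    Returns None when no rule matches every sub search key.
    """
    if not per_tree_lists:
        return None
    candidate_sets = [combine_stage_lists(lists, dont_care) for lists in per_tree_lists]
    common = set.intersection(*candidate_sets)
    return min(common) if common else None


# Illustrative values only: P = 2 trees, 2 BV stages each, 8-bit rule IDs (255 = don't care).
tree0 = [[3, 255], [12, 255]]
tree1 = [[255, 12], [20, 3]]
print(select_optimum([tree0, tree1], dont_care=255))  # 3
```

Rule 20 matches only the second sub search key in this example, so it is excluded even though it is a candidate in one tree.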

In the above-described embodiment, the description has been made based on the first embodiment. However, the present embodiment can be implemented using the second embodiment as a base.

Further, as in the first and second embodiments, each Tree Pipeline Stage determines whether the node being processed is a real node or a Virtual Node by means of a virtual flag held in the area division information. Alternatively, the virtual flag need not be held in the area division information: the address conversion circuit 201 of the Priority Pipeline Stage may be placed after the adder 1003 of the Tree Pipeline Stage, so that the calculated address value is compared with the number of words (number of nodes) W_P of the subsequent Priority Pipeline Stage to determine whether the node is a Virtual Node, and the result is output to the subsequent Tree Pipeline Stage. The same processing can be realized in this way.
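The flag-free alternative can be illustrated with a short sketch. The direction of the comparison is an assumption, because the text says only that the address is compared with the number of words: here an address at or beyond the number of stored words is taken to mean that no real rule ID list exists for the node. The function names and the base-plus-index address calculation are likewise hypothetical.

```python
def child_address(base_address: int, child_index: int) -> int:
    """Address computed by the adder: base address of the child nodes plus the child index (assumed form)."""
    return base_address + child_index


def is_virtual_node(address: int, num_words: int) -> bool:
    """Treat a node as virtual when its address is not below the number of stored words.

    Assumption: real nodes occupy addresses 0 .. num_words - 1 of the following
    Priority Pipeline Stage, so any address outside that range has no rule ID list.
    """
    if num_words < 0:
        raise ValueError("number of words must be non-negative")
    return address >= num_words


# Illustrative values only: 6 words stored in the following Priority Pipeline Stage.
print(is_virtual_node(child_address(4, 1), num_words=6))  # False
print(is_virtual_node(child_address(4, 3), num_words=6))  # True
```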

Further, in this embodiment, as in the first and second embodiments, each Priority Pipeline Stage is based on the Parallel BV processing disclosed in Non-Patent Document 3, but this is not a limitation. The Parallel BV processing disclosed in Non-Patent Document 4 may be used as the base instead.

In addition, in the present embodiment, as in the first and second embodiments, the description has used an example in which the root node of the Decision Tree is placed in Tree Pipeline Stage 10-1. However, the Decision Tree may obviously be divided into a plurality of subtrees whose root nodes are placed starting from Tree Pipeline Stage 10-1, and this can be done without any change to the configuration of this embodiment.

(3-3) Effects of Third Embodiment Next, functions and effects of the third embodiment of the present invention will be described.

In the present embodiment, as in the first and second embodiments, combining the Decision Tree and Parallel BV reduces the amount of data read from memory per packet. This suppresses the increase in dynamic power of the memory, and it is consequently possible to provide a packet classifier whose overall hardware power consumption does not increase even when the total header field length or the number of rules is large. The comparison of the amount of data read from memory between Decision Tree-based processing and the packet classifier of the present invention differs from that in the first and second embodiments only in that the Parallel BV processing is performed on sub search keys; since it is essentially the same, its description is omitted.

Also, in this embodiment, as in the first and second embodiments, combining the Decision Tree and Parallel BV means that, even when the number of rules is large, the Decision Tree narrows down the number of rules that may match. The bit length of the BV can therefore be reduced, and it is possible to provide a packet classifier that suppresses the increase in the number of clock cycles required to read data from memory.

Furthermore, in the present embodiment, as in the second embodiment, it is possible to provide a packet classifier in which the packet header information used in rules can be changed freely, without changing the hardware circuit, within the range permitted by the bit length W of the rule, the number of fields F, and the number of fields C that can be used for area division.

(4) Fourth Embodiment Next, a fourth embodiment of the present invention will be described with reference to the drawings.

FIG. 37 shows a configuration example of a packet classifier according to the fourth embodiment of the present invention. In FIG. 37, the packet classifier includes a program processing device 5, a network interface device 6, and a packet classification program 7.

The program processing device 5 is realized by a CPU of a host such as a server or a PC. The network interface device 6 is, for example, a server expansion card or an on-board NIC (Network Interface Card).

In the present embodiment, the search key used in the packet classification of the present invention is extracted from the packet input from the network to the network interface device 6 and input to the program processing device 5.

The packet classification program 7 is a computer program executed by the program processing device 5 and controls the operation of the program processing device 5.

The program processing device 5 implements the packet classifier 1 of the first and second embodiments of the present invention, more specifically the Tree Pipeline 10 and the Priority Pipeline 20, which are realized by the program processing device 5 executing the packet classification program 7. The functions of the Tree Pipeline 10 and the Priority Pipeline 20 are the same as those in the first and second embodiments of the present invention.

In the first and second embodiments of the present invention, the packet classifier 1 is realized by a hardware circuit; in the present embodiment, the same processing is executed by software. Further, if the program processing device 5 is a multi-core processor having a plurality of CPU cores (or a many-core processor having even more CPU cores) and each CPU core executes the processing of one Pipeline Stage of the Tree Pipeline 10 or the Priority Pipeline 20, higher-speed processing is possible.

Note that the packet classification program 7 may be recorded on a computer-readable recording medium, and may cause the program processing device 5 to execute the processing of the packet classifier 1 in the third embodiment of the present invention.

The embodiments of the present invention have been described above with reference to the accompanying drawings. However, the present invention is not limited to the above-described embodiment, and can be appropriately changed by those skilled in the art without departing from the gist.

The present invention can be applied to network devices such as switches and routers, and to appliance devices such as load balancers, that identify the flow to which a packet belongs from a combination of specific fields of the packet header information and perform specific processing for each flow, such as QoS processing or load distribution.

This application claims priority based on Japanese Patent Application No. 2010-49051 filed on Mar. 5, 2010, the entire disclosure of which is incorporated herein.

Claims

A packet classifier that searches a rule set composed of a large number of rules defined using a plurality of fields for a rule that matches a search key to be searched, using a plurality of types of bit arrays having a predetermined small length, wherein the packet classifier:
narrows down, using a decision tree, the rules that may match from the large number of rules to a predetermined number,
identifies the matching rules from the narrowed-down rules by using, for each predetermined piece of data in the search key, a bit array having the same length as the number of rules narrowed down by the decision tree, together with a rule identifier list that lists the rule identifiers indicated by the bit positions of these bit arrays, and
determines the final matching rule according to the priorities of the identified rules.
The packet classifier
A tree pipeline processing circuit that performs processing by pipeline control to narrow down a rule that may match from a large number of rules to a predetermined number using a decision tree;
A priority pipeline processing circuit that performs, by pipeline control, processing for identifying the matching rules from the narrowed-down rules by using, for each predetermined piece of data in the search key, a bit array having the same length as the number of rules narrowed down by the decision tree, together with a rule identifier list that lists the rule identifiers indicated by the bit positions of these bit arrays, and processing for determining the matching rule as the final result according to the priorities of the identified rules,
The packet classifier according to claim 1, comprising:
The tree pipeline processing circuit includes:
A plurality of tree pipeline processing units for executing a process starting from the root node of the decision tree and tracing one node in the depth direction,
Further, each of the plurality of tree pipeline processing units includes:
An area division information storage block for storing area division information of nodes processed by the tree pipeline processing unit;
A memory controller that reads out the region division information;
Based on the area division information, the area is divided using the input search key, and an address value for the area division information storage block in which the area division information of the node to be processed by the subsequent tree pipeline processing unit is calculated is calculated. An area dividing circuit;
An effective bit length update circuit for updating the effective bit length of the search key;
A search key delay circuit for causing a search key input to the tree pipeline processing unit to have a certain delay and synchronizing with another output;
Further, the region dividing circuit includes:
An area division information separation circuit that extracts the number of divisions for each header field used for area division from the area division information of the node;
A field division circuit that performs region division processing for each header field;
A determination unit that determines an address value in which region division information of a node to be processed by a subsequent tree pipeline processing unit is stored based on the output of the region division information separation circuit and the output of the field division circuit,
The priority pipeline processing circuit includes:
A plurality of priority pipeline processing units, each of which refers to the rule identifier list of the node processed by the tree pipeline processing unit and to bit arrays whose bit positions correspond to the order of the rule identifier list, identifies the matching rules, and determines, according to priority, the most suitable rule at that time from among the rules determined to match so far,
Further, each of the plurality of priority pipeline processing units includes:
A field separation circuit that extracts data of each header field included in the search key;
An address conversion circuit that determines whether the node actually has a rule identifier list from the address value specified by the tree pipeline processing unit;
A plurality of bit array selection circuits for selecting the bit array of the node;
A priority check circuit that selects the most suitable rule among the matching rule candidates;
A rule ID list storage block for storing a rule identifier list,
Further, the bit array selection circuit includes:
A search circuit for selecting an appropriate bit arrangement according to the value of the header field cut out from the search key;
A bit array storage block for storing the bit array;
The packet classifier according to claim 2.
The area division information in the decision tree node is:
A flag indicating whether or not the node is a leaf node in the decision tree;
The number of divisions at that node for a particular number of header fields used to divide the region at each node of the decision tree;
A base address for an area division information storage block in which area division information of child nodes of the node is stored;
A flag indicating that the node is not a real node but a virtual node, and
The packet classifier according to claim 3, characterized in that the number of divisions is specified by k when the header field is divided into 2^k parts.
Each of the plurality of tree pipeline processing units includes:
Furthermore, a field extraction circuit that identifies the division number of the area division information in each node by a field identifier, associates the field identifier with a header field defined in an actual rule, and extracts necessary information is provided.
The packet classifier according to claim 3.
The area division information in the decision tree node is:
A flag indicating whether or not the node is a leaf node in the decision tree;
A set of a field identifier and a division number indicating the division number of the header field used to divide the region in each node of the decision tree;
A base address for an area division information storage block in which area division information of child nodes of the node is stored;
A flag indicating that the node is not a real node but a virtual node, and
The packet classifier according to claim 5, characterized in that the number of divisions is specified by k when the header field is divided into 2^k parts.
The packet classifier
A plurality of decision tree processing circuits that, for a plurality of decision trees configured independently in header field units or specific bit length units, each perform, by pipeline control, processing that narrows down the rules using the corresponding decision tree, identifies the rules that may match from the narrowed-down rules by using, for each predetermined piece of data in the search key, a bit array having the same length as the number of rules narrowed down by the decision tree, together with a rule identifier list that lists the rule identifiers indicated by the bit positions of these bit arrays, and generates a solution candidate rule identifier list;
An optimal solution selection circuit that selects a rule with the highest priority from the solution candidate rule identifier list identified by the plurality of decision tree processing circuits;
A search key separation circuit for extracting a search key corresponding to the decision tree processed by each decision tree processing circuit;
The packet classifier according to claim 1, comprising:
The decision tree processing circuit includes:
A tree pipeline processing circuit that uses pipeline control to narrow down rules that may match from a large number of rules using a decision tree;
A solution candidate selection circuit,
Further, the tree pipeline processing circuit includes:
A plurality of tree pipeline processing units for executing a process starting from the root node of the decision tree and tracing one node in the depth direction,
Further, each of the plurality of tree pipeline processing units includes:
An area division information storage block for storing area division information of nodes processed by the tree pipeline processing unit;
A memory controller that reads out the region division information;
Based on the area division information, the area is divided using the input search key, and an address value for the area division information storage block in which the area division information of the node to be processed by the subsequent tree pipeline processing unit is calculated is calculated. An area dividing circuit;
An effective bit length update circuit for updating the effective bit length of the search key;
A search key delay circuit for causing a search key input to the tree pipeline processing unit to have a certain delay and synchronizing with another output;
Further, the region dividing circuit includes:
An area division information separation circuit that extracts the number of divisions for each header field used for area division from the area division information of the node;
A field division circuit that performs region division processing for each header field;
A determination unit that determines an address value in which region division information of a node to be processed by a subsequent tree pipeline processing unit is stored based on the output of the region division information separation circuit and the output of the field division circuit,
The solution candidate selection circuit includes:
A field separation circuit that extracts data of each field included in the search key;
An address conversion circuit that determines whether the node actually has a rule identifier list from the address value specified by the tree pipeline processing unit;
A plurality of bit array selection circuits for selecting the bit array of the node;
A rule ID list storage block for storing the rule identifier list;
A solution candidate selection circuit that generates a solution candidate rule identifier list using the bit arrangement and the rule identifier list;
The packet classifier according to claim 7.
A packet classification method performed by a packet classifier that searches a rule set composed of a large number of rules defined using a plurality of fields for a rule that matches a search key to be searched, the method comprising:
narrowing down, using a decision tree, the rules that may match from the large number of rules to a predetermined number,
identifying the matching rules from the narrowed-down rules by using, for each predetermined piece of data in the search key, a bit array having the same length as the number of rules narrowed down by the decision tree, together with a rule identifier list that lists the rule identifiers indicated by the bit positions of these bit arrays, and
determining the final matching rule according to the priorities of the identified rules.
A packet classification program that causes a computer, which searches a rule set composed of a large number of rules defined using a plurality of fields for a rule that matches a search key to be searched, to execute:
a process of narrowing down, using a decision tree, the rules that may match from the large number of rules to a predetermined number,
a process of identifying the matching rules from the narrowed-down rules by using, for each predetermined piece of data in the search key, a bit array having the same length as the number of rules narrowed down by the decision tree, together with a rule identifier list that lists the rule identifiers indicated by the bit positions of these bit arrays, and
a process of determining the final matching rule according to the priorities of the identified rules.