GB2463889A - Adaptive clocking system for a packet classifier - Google Patents

Adaptive clocking system for a packet classifier

Info

Publication number
GB2463889A
Authority
GB
United Kingdom
Prior art keywords
packet
buffer
classifier
packet classifier
operably
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB0817659A
Other versions
GB0817659D0 (en
Inventor
Alan Kennedy
Xiaojun Wang
Liu Zhen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dublin City University
Original Assignee
Dublin City University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dublin City University
Priority to GB0817659A
Publication of GB0817659D0
Publication of GB2463889A
Legal status: Withdrawn

Classifications

    • H04L12/569
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/90 Buffering arrangements

Abstract

An adaptive clocking system 100 for a packet classifier is described. The system provides for control of the clocking frequency of the packet classifier dependent on the traffic within the packet based network. By operating the packet classifier at a frequency commensurate with the traffic encountered it is possible to match fluctuations in the traffic and effect power savings during periods of low traffic. The system comprises a buffer 120 for storing the packet header information of received packets 160 and a comparator 130 for determining the number of buffer slots occupied by the information. A controller selects one of a plurality of different clocking frequencies for clocking the packet classifier dependent upon the number of occupied buffer slots. The packet classifier may divide a ruleset into multiple groups, each containing rules that can be processed in a linear search and SRAM with a long word line may be used so as to reduce the number of clock cycles required to perform the linear search on selected rules.

Description

Adaptive clocking system for a packet classifier
Field of the Invention
The present invention relates to packet based networks and in particular to packet classification within such packet based networks. The invention more particularly relates to an adaptive clocking system which is configured to dynamically change the clock frequency of a packet classifier to match fluctuations in the traffic.
Background
Within the context of modern information networks a packet may be considered a formatted block of data carried by a packet based computer network. Within such networks, packet classification is the process of mapping a packet to one of a finite set of flows or categories using information from the packet header. This information includes the source and destination IP addresses which are matched using longest prefix matching, the source and destination port numbers which are matched using range matching and the protocol number which can be an exact match or wildcard.
Packets belonging to the same flow match a pre-defined rule and are processed in the same way by a router, specifically a router line card. The classifier will select the flow with the highest priority in the case where there are multiple rule matches.
Packet classification is used by networking devices to carry out advanced Internet services like network security, traffic billing based on Internet usage, giving priority to VoIP and IPTV packets, rate limiting, load balancing and resource reservation.
With increases in the complexity of traffic that is carried by such packet based networks there has been an increasing number of services that the classifier needs to classify. With this increase in the number of services there is an equivalent increase in the number of classifications that are required, with the result that the number of rules used to separate incoming packets into appropriate flows has grown from hundreds to thousands of rules. This growth in ruleset size has further complicated the problem of packet classification. The problem of packet classification has been looked at in detail. Implementing packet classification algorithms solely in software is not feasible when trying to achieve high speed packet classification. High throughput algorithms such as RFC are unable to reach OC-768 or even OC-192 line rates when run on devices such as programmable network processors for even relatively small sized rulesets. Packet classification algorithms tailored towards high throughput architectures also tend to suffer from large memory consumption for rulesets containing thousands of rules, making them more suited to using slower DRAM rather than faster SRAM.
These approaches at packet classification seldom consider power consumption.
Key components on a router's line card such as the Intel IXP2800 network processor can consume up to 30 Watts. Each line card on a router typically contains two network processors for ingress and egress processing and a router can contain multiple line cards. Current hardware methods for high speed packet classification such as Ternary Content Addressable Memory (TCAM) can meet OC-768 line rate but tend to use large amounts of power. State of the art low power packet classification devices such as the Cypress Ayama 10000 Network Search Engine can use up to 19 Watts. This is due to the fact that TCAM carries out parallel comparisons on all the stored rules in one clock cycle and has a poor die density with one bit requiring 10-12 transistors, compared to 4-6 transistors for more power efficient SRAM. TCAM also suffers from poor storage efficiency.
There is therefore a need for a packet classifier that may be implemented with reduced power requirements.
Summary
These and other problems are addressed by an adaptive clocking system which is configured to dynamically change the clock frequency of a packet classifier to match fluctuations in the traffic within the network. Use of such an adaptive clocking system within the context of the present teaching provides maintenance of clock frequencies within the packet classifier at the lowest speed capable of classifying the packets while reducing frequency switches. By reducing the operation frequency of the packet classifier in response to fluctuations in the traffic it is possible to reduce the power consumption of the classifier.
Such an adaptive clocking system is advantageously employed to dynamically scale the frequency to a packet classifier so that it matches fluctuations in traffic volume to a router line card. By dynamically changing the classifier's clock frequency, it is possible to reduce its dynamic power consumption by running it at lower speeds, reducing idle time when traffic volume is low. It is also possible to reduce the buffer size and therefore its power consumption, by allowing the classifier to respond to bursts of packets, by increasing its clock frequency in order to keep the buffer clear.
These and other features will be apparent from the following exemplary arrangements which are provided to assist in an understanding of the teaching of the invention.
Brief Description Of The Drawings
The present invention will now be described with reference to the accompanying drawings in which:
Figure 1 shows in schematic form an architecture incorporating an adaptive clocking system in accordance with the present invention.
Figure 2 is an example of a switching sequence with all states used that may be implemented using an adaptive clocking system in accordance with the present teaching.
Figure 3 shows an example of a switching sequence with selected states used that may be implemented using an adaptive clocking system in accordance with another implementation of the present teaching.
Figure 4 is an example schematic of hardware architecture for a packet classifier whose clock signal is controlled using an adaptive clock system in accordance with the present teaching.
Figure 5 shows exemplary power figures for an ASIC implementation.
Figure 6 shows exemplary power figures for a Cyclone 3 implementation.
Figure 7 shows exemplary power figures for a Stratix 3 implementation.
Detailed Description Of The Drawings
The teaching of the invention will now be described with reference to exemplary arrangements thereof which are provided to assist in an understanding of the teaching of the invention but which are not in any way intended to limit the scope of the invention to that described.
Figure 1 shows an example of a system architecture incorporating an adaptive clocking system or unit 100 in accordance with the present invention. The system is provided between a router or other network node, which provides a plurality of packets 160 for classification purposes, and the packet classifier. Typically the information provided in the packets is extracted from the packet header and as such does not represent the entirety of the data packet that is being transmitted within the network. The header information provided will typically include about 104 bits which, it will be appreciated, is not the sum total of the bits within each packet. Extracting these representative bits will typically require a level of pre-processing of the individual packets, for example at the router level, prior to transfer to the adaptive clocking unit of the present invention.
The adaptive clocking unit is configured to buffer the received packets within a buffer 120 which desirably operates on a FIFO principle. To ensure packets are not delayed for too long it is desirable that the buffer stores the individual packets for as short a time period as possible before transferring the packets onto a packet classifier 110 where packet classification can be effected. Once a packet has been classified it is then possible to provide that classification as a result 170 back to the router or other part of the network management system for subsequent usage. As will be appreciated by those skilled in the art, packet classification is an intensive process that typically requires hardware implementations. Such hardware implementations use an internal clock to effect the processing of a packet, typical processing taking one or two clock cycles within the packet classifier.
In an exemplary arrangement, the adaptive clocking system 100 is configured to run the packet classification hardware accelerator 110 at speeds to match the incoming traffic volume 160 from for example the router's line card while buffering the incoming packets at 128 MHz within buffer 120 of the adaptive clocking system.
While it is not intended to limit the teaching of the present invention to any one type of hardware accelerator an exemplary type that could be used may implement modified versions of the HiCuts and HyperCuts packet classification algorithms.
Instead of comparing a large number of rules simultaneously (as is the case with TCAM as discussed in the Background portion of this description), these algorithms divide the hyperspace of the ruleset heuristically into multiple groups so that each subset contains only a small number of rules that are suitable for linear search, reducing the unnecessary comparisons and thus the power consumption. The hardware accelerator uses SRAM with long word line to reduce the number of clock cycles needed to perform a linear search on the selected rules.
The buffer 120 typically uses dual port SRAM to buffer information from the packet header. This information includes the Source and Destination Internet Protocol addresses, the Source and Destination Port numbers and the protocol number, which are read in at a speed of 128 MHz. The headers are read in at this speed to avoid packets being dropped when back-to-back 40 byte packets arrive at OC-768 line speeds, resulting in up to 125 Mpps. The number of packets stored in the buffer is calculated by providing a comparator 130 to monitor the difference between the buffer's read and write addresses. This difference is used as a trigger to determine which clock frequency the packet classification hardware accelerator should be run at. The comparator 130 is configured to interact with both control logic 140 and a clocking unit 150 provided as elements of the adaptive clocking system.
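The 125 Mpps figure and the comparator's occupancy count can be checked with a short calculation. The following is a minimal sketch only: the nominal 40 Gbit/s figure for OC-768, the buffer depth of 1024 slots and all function names are assumptions introduced for illustration and do not appear in the specification.

```python
# Illustrative sketch only. The 40 Gbit/s nominal rate, the buffer depth and
# every name below are assumptions, not taken from the specification.

NOMINAL_OC768_BPS = 40_000_000_000  # assumed round figure for OC-768
MIN_PACKET_BYTES = 40               # back-to-back 40 byte packets (from the text)


def worst_case_packet_rate_mpps(line_rate_bps: int, packet_bytes: int) -> float:
    """Millions of packets per second for back-to-back packets of one size."""
    return line_rate_bps / (packet_bytes * 8) / 1e6


def occupied_slots(write_addr: int, read_addr: int, depth: int) -> int:
    """Occupancy as a comparator might derive it: write address minus read
    address, taken modulo the buffer depth so address wrap-around is handled."""
    return (write_addr - read_addr) % depth


if __name__ == "__main__":
    print(worst_case_packet_rate_mpps(NOMINAL_OC768_BPS, MIN_PACKET_BYTES))
    print(occupied_slots(write_addr=5, read_addr=1020, depth=1024))
```

Note that a plain modulo difference cannot distinguish a completely full buffer from an empty one; a real design would typically carry an extra wrap bit or a separate counter, which is outside what the text describes.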
Typically these frequencies will be provided in bins or ranges such that a volume of traffic in a first range will provide for processing at a first clock frequency whereas that in a second range will effect processing at a different frequency. In this way the number of frequency changes is minimised while still providing for adaptive control of the clocking frequency of the hardware packet classifier. In an exemplary arrangement, as will be discussed below with respect to Figures 2 and 3, the adaptive clocking unit has been designed to run a packet classification hardware accelerator at up to ten different frequencies, each of the frequencies being assigned to a specific state. Each frequency is generated using a separate Phase Lock Loop (PLL) to reduce the setup time of changing a PLL frequency. Dedicated clock switching logic on the FPGA is used to prevent clock glitches when switching between frequencies. Ideally the control logic 140 is configured to put the packet classification hardware accelerator in an idle state when changing clock frequencies to prevent problems that may occur due to glitches. The frequencies allowed for running a hardware accelerator packet classifier may be calculated using the following equation:
$F_i = F_{buff} / 2^n$ (1)
where n must be a natural number greater than zero and $F_{buff}$ is the frequency used by the buffer and adaptive clocking logic. Frequencies calculated using this equation allow the hardware accelerator to run synchronously with the buffer and adaptive clocking logic. In testing of a system provided in accordance with the present invention it has been found that 32 MHz is fast enough to deal with worst case bursts of packets for OC-768 line speeds when using the packet classifier to classify packets using rulesets containing over 20,000 rules. In this exemplary arrangement the adaptive clocking unit uses ten different states to determine the clock frequency of the hardware accelerator.
The entering and exiting of each state is triggered by the number of packets stored in the buffer 120. Table 1 shows an example of the clock frequency associated with each state. In this arrangement each state shift provides for a doubling of the operable frequency of the clock.
State    Speed (MHz)
S0       F0 = 0.0625
S1       F1 = 0.125
S2       F2 = 0.25
S3       F3 = 0.5
S4       F4 = 1
S5       F5 = 2
S6       F6 = 4
S7       F7 = 8
S8       F8 = 16
S9       F9 = 32
Table 1
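The ten frequencies of Table 1 can be reproduced from the 128 MHz buffer clock by successive halving. The sketch below assumes that the divisor in equation (1) is a power of two, with n running from 11 down to 2; that reading, the function name and its parameters are assumptions made for illustration, chosen because they reproduce the values in Table 1.

```python
# Sketch under an assumption: the divisor in equation (1) is taken as 2**n.
# Function and parameter names are hypothetical.
from fractions import Fraction

F_BUFF_MHZ = 128  # frequency of the buffer and adaptive clocking logic (from the text)


def state_frequencies_mhz(f_buff_mhz: int, num_states: int = 10, max_n: int = 11):
    """Return [F0, ..., F9] in MHz, lowest first, computed as F_buff / 2**n."""
    exponents = range(max_n, max_n - num_states, -1)  # n = 11, 10, ..., 2
    return [Fraction(f_buff_mhz, 2 ** n) for n in exponents]


if __name__ == "__main__":
    for i, freq in enumerate(state_frequencies_mhz(F_BUFF_MHZ)):
        print(f"S{i}: F{i} = {float(freq)} MHz")
```

Exact fractions are used so that the lowest frequencies (for example 0.0625 MHz) are represented without rounding.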
Each state has a threshold for determining how many packets can be stored in the buffer before the next higher frequency should be used. The threshold is variable and the number of buffer slots corresponding to each state Si can be any number between zero and N (the total number of buffer slots) as long as the following equation is satisfied, where $S_i$ denotes the number of slots assigned to state Si:
$N = \sum_{i} S_i$ (2)
The threshold for determining when a state is exited and the next higher state entered is saved in a register in the adaptive clocking unit and can be changed at any time. A comparator 130 is used to compare the volume of traffic within the buffer with the threshold values set so as to determine if changes in the clocking frequency should be made. The threshold for each state is calculated using the following equation, that is, the sum of the slots in the preceding states plus the slots in the current state:
$T_i = \sum_{j<i} S_j + S_i$ (3)
The output clock frequency always starts at the frequency of the lowest-used state and only changes to the frequency of the next higher-used state if the number of packets stored in the buffer exceeds the threshold for the lowest-used state. There are two conditions for leaving the subsequent states and thus changing the output clock frequency. The first of these conditions is that the threshold for the currently occupied state, indicating how many packets the buffer can store, is exceeded, with the output clock frequency changing to that of the next higher-used state. The second condition is that the number of packets stored in the buffer reaches zero, meaning the output clock frequency changes to that of the lowest-used state. This means that the number of buffer slots that the current state can occupy before a frequency change is equal to the sum of the buffer slots occupied by the previous states plus the number of slots occupied by the current state itself. This is done to allow larger fluctuations in the number of packets stored in the buffer without unnecessary frequency drops.
It also keeps the latency time of processing a packet to a minimum, by trying to clear the buffer before reducing the clock frequency. The clock frequency of the packet classification hardware accelerator remains fixed if all buffer slots are occupied by one state.
In the example shown in Figure 2 the buffer slots are distributed equally among all states. The output clock frequency to the packet classification hardware accelerator will start at that of the lowest-used state S0. If the threshold for this state is exceeded (meaning the buffer slots occupied by state S0 have filled) then the next higher-used state S1 will be entered and the clock frequency will change to F1. The output clock frequency will remain at that of state S1 until the number of packets stored in the buffer is reduced to zero, returning the output clock frequency to F0, or the threshold for state S1 is exceeded in which case the output clock frequency changes to F2. The same is true for all subsequent states. The output clock frequency will remain at that of state S2 until either all packets in the buffer are cleared, returning the output clock frequency to F0, or the maximum threshold for state S2 is exceeded, meaning state S3 is entered and the output clock frequency changes to F3.
Figure 3 shows an example where only states S4, S7, S8 and S9 have been selected. In this case the output clock frequency to the packet classifier will start at that of the lowest-used state S4. It will stay in this state until the threshold for this state is exceeded, increasing the clock frequency to F7. The output clock frequency will stay at that of state S7 until all packets in the buffer have cleared, returning the output frequency to F4, or the threshold for state S7 is exceeded, increasing the output frequency to F8. The same procedure is followed for states S8 and S9.
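The switching behaviour described for Figures 2 and 3 can be sketched as a small state machine. This is an illustrative model under stated assumptions, not the circuit of the specification: the class and method names are hypothetical, the allocation of four slots per state is invented for the example, and the model steps at most one state per occupancy update.

```python
# Illustrative model of the state switching rules; names and the slot
# allocation used in the example are assumptions.
from itertools import accumulate


class AdaptiveClockController:
    """Selects an output frequency from the buffer occupancy."""

    def __init__(self, freqs_mhz, slots_per_state):
        if not freqs_mhz or len(freqs_mhz) != len(slots_per_state):
            raise ValueError("need one slot count per used state")
        self.freqs_mhz = list(freqs_mhz)
        # Threshold of a state: slots of all preceding states plus its own.
        self.thresholds = list(accumulate(slots_per_state))
        self.state = 0  # always start in the lowest-used state

    def update(self, occupancy: int):
        """Apply the two exit conditions and return the selected frequency."""
        top = len(self.freqs_mhz) - 1
        if occupancy == 0:
            self.state = 0          # buffer cleared: back to lowest-used state
        elif occupancy > self.thresholds[self.state] and self.state < top:
            self.state += 1         # threshold exceeded: next higher-used state
        return self.freqs_mhz[self.state]


if __name__ == "__main__":
    # Figure 3 style selection: only S4, S7, S8 and S9 are used (1, 8, 16, 32 MHz).
    ctrl = AdaptiveClockController([1, 8, 16, 32], [4, 4, 4, 4])
    for occupancy in [1, 4, 5, 8, 9, 3, 0]:
        print(occupancy, ctrl.update(occupancy))
```

In this model the frequency is held while the occupancy falls (from 9 to 3 in the example) and only drops once the buffer is empty, which corresponds to the stated aim of avoiding unnecessary frequency drops.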
Exemplary implementation
Figure 4 shows an exemplary implementation of a packet classifier incorporating an adaptive clocking unit in accordance with the present teaching. It will be appreciated that while the present teaching of dynamically changing the frequency of the packet classifier depending on the nature of the traffic within the network may be used with any number of different types of packet classifiers, one of the advantages derived from such frequency scaling for multidimensional packet classifiers is the reduction in power consumption that may be achieved. It is therefore advantageous to combine an adaptive clocking unit with a low power architecture such as shown in Figure 4.
The same reference numerals will be used for similar components. The hardware accelerator 110 has been designed to traverse an internal node of the decision tree and do a parallel comparison of up to 48 rules contained in a leaf node in 1 clock cycle. This is possible due to the fact the hardware accelerator can access a 7704-bit memory word every clock cycle. By storing the decision tree root node information in a register separate from main memory, it is possible to traverse the root node for an incoming packet while searching a leaf node for the previous packet. Carrying out these tasks in parallel has the effect of reducing the worst-case number of clock cycles by 1. This means that the hardware accelerator is able to classify a packet every clock cycle if the worst-case number of clock cycles needed to classify a packet is 2.
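The effect of overlapping the root node traversal with the previous packet's leaf search can be expressed as a simple throughput estimate. The sketch below assumes that, in steady state, each packet occupies the worst-case cycle count minus the one overlapped cycle; the function name and this steady-state model are assumptions for illustration.

```python
# Rough throughput estimate; the steady-state model and names are assumptions.

def sustained_rate_mpps(clock_mhz: float, worst_case_cycles: int) -> float:
    """Packets per microsecond when one cycle per packet is hidden by
    traversing the root node in parallel with the previous leaf search."""
    if worst_case_cycles < 1:
        raise ValueError("a packet needs at least one clock cycle")
    effective_cycles = max(1, worst_case_cycles - 1)
    return clock_mhz / effective_cycles


if __name__ == "__main__":
    print(sustained_rate_mpps(32, 2))  # worst case of 2 cycles at 32 MHz
    print(sustained_rate_mpps(32, 3))  # a deeper search structure
```

With a worst case of 2 clock cycles the estimate gives one packet per clock cycle, matching the statement in the text.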
Before any packets can be classified by the hardware accelerator, the first step is to save the preprocessed search structure to memory. The hardware accelerator's memory structure 400 consists of 107 memory cells which are 72-bits wide each.
The search structure is saved using 72-bit memory words, which are first loaded to the memory 400 where they are then read by the hardware accelerator. A write enable signal is used for selecting which memory cell is to be written to, while a write address is saved into the memory 400 with each memory word. The write address selects which line of the selected memory cell the memory word is to be written to.
The clock speed for the hardware accelerator is fixed at 32 MHz when the search structure is being saved in order to save it as quickly as possible.
In this implementation, once the Reset pin is placed low, the hardware accelerator 110 transfers the decision tree's root node information from main memory 400 to Reg A in 1 clock cycle. This information includes the starting position, memory location and node type for each of the root's child nodes. It also includes the 8-bit mask and shift values for each dimension used for selecting which child the incoming packet should go to. On the next rising clock edge the hardware accelerator begins scanning the Start signal from the adaptive clocking unit 100, which will be high if there are packets stored in the buffer 120. The hardware accelerator 110 places a Ready signal high when this Start signal is high to read in a new packet from the buffer 120 to be classified. An index value for each dimension is created by ANDing the five 8-bit mask values stored in Reg A with the 8 most significant bits from each of the packet's 5 dimensions read from the buffer 120. The resulting indexes are shifted using the 8-bit shift values stored in Reg A and then added together to determine which node address should be selected from Reg A. This node address is used to select which memory word should be loaded from memory 400 on the next rising clock edge. On this edge the hardware accelerator checks if the node to be loaded from memory 400 is an internal or leaf node.
If the node loaded from main memory is an internal node then the hardware accelerator will use the internal node information loaded to traverse to the next node.
The mask values from the internal node loaded are ANDed with the packet values from the buffer 120. These values are shifted using the shift values from the internal node loaded and then added together. The result is used to determine which child node should be loaded from main memory on the next rising clock edge. If the selected child is still an internal node, then the process of traversing the internal nodes is repeated until a leaf node is found. Each internal node to be traversed takes 1 clock cycle.
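The mask, shift and add steps used to select a child node can be sketched as follows. The field widths (32, 32, 16, 16 and 8 bits, totalling the 104 header bits) follow from the header fields named in the text; the right-shift direction, the function names and the example mask and shift values are assumptions made for illustration.

```python
# Illustrative child-node index calculation. Shift direction, names and the
# example mask/shift values are assumptions, not taken from the specification.

# Source IP, destination IP, source port, destination port, protocol.
FIELD_WIDTHS = (32, 32, 16, 16, 8)


def top8(value: int, width: int) -> int:
    """Return the 8 most significant bits of a field of the given width."""
    return (value >> (width - 8)) & 0xFF


def child_index(fields, masks, shifts) -> int:
    """AND each field's top 8 bits with its mask, shift, then sum the results."""
    index = 0
    for value, width, mask, shift in zip(fields, FIELD_WIDTHS, masks, shifts):
        index += (top8(value, width) & mask) >> shift
    return index


if __name__ == "__main__":
    packet = (0xC0A80001, 0x0A000001, 80, 443, 6)  # hypothetical header values
    masks = (0x80, 0xC0, 0x00, 0x00, 0x00)         # cut on 1 bit and 2 bits
    shifts = (5, 6, 0, 0, 0)                       # align into a 3-bit index
    print(child_index(packet, masks, shifts))
```

In this example the source address contributes values 0 or 4 and the destination address contributes 0 to 3, so the sum addresses one of eight children without the contributions overlapping.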
The packet value loaded from the buffer will be transferred to Reg B if the node loaded from main memory on a rising clock edge is a leaf. The accelerator 110 then uses 48 comparator blocks 410 in parallel to compare the packet value in Reg B with the leaf node's rule information loaded from memory 400. While this compare takes place the Start signal is again monitored, and if high will cause the Ready signal to go high, loading a new packet to be classified. The mask and index values for the root node stored in Reg A are used with the packet value loaded from the buffer, to determine which child node should be loaded from memory 400 once a matching rule has been found for the previous packet.
On the next rising clock edge the hardware accelerator 110 checks if a matching rule has been found. The hardware accelerator 110 will continue searching the leaf node if a matching rule has not been found. If a match has been found then the hardware accelerator checks to see if a packet has been loaded from the buffer 120. If a packet has not been loaded, then the hardware accelerator will continue monitoring the Start signal until it goes high. If a packet has been loaded, the hardware accelerator will check if the child node traversed to is an internal node or a leaf node. An internal node will mean repeating the process of searching for a leaf node, while a leaf node will mean repeating the process of searching for a matching rule.
Simulation Results
To determine performance benefits derivable from implementation of an adaptive clocking unit for scaling the clocking frequency of a packet classifier a number of simulations were generated.
The low power architecture for high speed packet classification was implemented in VHDL and targeted at three devices: a Cyclone EP3C120F484C8 FPGA which is built on Taiwan Semiconductor Manufacturing Company's (TSMC's) 65nm process technology running at 1.2 Volts, a Stratix EP3SE260F1152C47 FPGA also built on TSMC's 65nm technology running at 0.9 Volts and a 65nm ASIC library by TSMC running at 1.08 Volts. The low power architecture was synthesized using Altera Quartus 2 software for both the Cyclone 3 and Stratix 3 FPGA implementations. Post place and route timing analysis showed that timing requirements were met for the architecture implemented on both devices. The adaptive clocking unit met its timing requirement of 128 MHz and the hardware accelerator met its timing requirement of 32 MHz. Post place and route simulations were carried out using the Quartus 2 PowerPlay Power Analyzer Tool using VCD files generated by ModelSim. The results are explained below.
For an ASIC solution the logic for the low power architecture was synthesized using Synopsys software. Post place and route timing analysis showed that the timing requirements for both the adaptive clocking logic and hardware accelerator logic were met. In order to estimate the power consumption for the logic the Synopsys Prime Power tool was used to analyze the annotated switching information from VCD files generated using ModelSim. Power results from RAM compilers obtained from Chartered Semiconductor Manufacturing were substituted for the 65nm TSMC RAM compilers to provide a measurement of the power consumed by memory. These dual and single port RAM compilers use 130nm process technology running at 1.2 Volts. To normalize the power results for the RAM so that they were the same as the 65nm process technology running at 1.08 Volts used for the logic, the following equation was used, where S is the scaling factor of the process technology and U is the scaling factor of the voltage:
$P' = P \cdot S^2 \cdot U$ (4)
Power Results
In order to measure the power saved when using our adaptive clocking unit on an energy efficient packet classification hardware accelerator such as that described in Figure 4, we implemented two systems. System A used the adaptive clocking unit to run the hardware accelerator at speeds to match the traffic volume while buffering the incoming packets at a frequency of 128 MHz, in accordance with the teaching of the present invention. It uses the same architecture described in Figure 4. System B ran the hardware accelerator at a fixed clock speed of 32 MHz while buffering the incoming packets at 128 MHz. It used the architecture shown in Figure 4 without Reg 1, the clocking unit and the comparator logic used for deciding the appropriate clock frequency. Power simulations were run for both systems implemented as an ASIC and on the Cyclone and Stratix FPGAs using the PowerPlay Power Analyzer and Prime Power tools.
The resource utilization for the systems implemented on the Cyclone and Stratix 3 FPGAs can be seen in Table 2.
                 System A                    System B
Device       Logic Cells   M9K RAMs      Logic Cells   M9K RAMs
Cyclone 3    18.2%         99.8%         17.9%         99.8%
Stratix 3    5.9%          99.4%         5.8%          99.4%
Table 2. FPGA Resource Utilization
The simulation conditions for both systems are identical with packets read in at rates of 32, 16, 8, 4, 2, 1, 0.5, 0.25, 0.125 and 0.0625 Mpps. The search structure used needed at worst 2 clock cycles to classify a packet. This meant a packet was classified on each clock cycle when reading in 32 Mpps. The power consumption for the 2 systems implemented on the 3 technologies can be seen in Figures 5, 6 and 7.
The power figures for system A are shown on the right for each packet speed and system B on the left. Looking at Figure 5 it can be seen that system A with the adaptive clocking uses 0.25% more power than system B with the fixed clock speed when implemented as an ASIC while classifying 32 Mpps. This is due to the extra logic used for the frequency scaling. It can be seen that system A shows power savings of 89% when the packet speed drops down to 0.0625 Mpps. The ASIC implementation shows good power savings, as most of the power consumed is dynamic rather than static.
Figure 6 shows power figures for the Cyclone 3 and it can be seen that system A with the adaptive clocking uses 0.7% more power than system B with the fixed clock speed when there are 32 Mpps. This is due to the fact system A uses 0.3% more of the Cyclone 3 logic resources to implement frequency scaling. System A shows power savings of 57.16% when the packet speed drops down to 0.0625 Mpps. The Cyclone 3 implementation shows lower power savings than the ASIC implementation due to the fact the FPGA has a larger percentage of its power consumption due to static power than the ASIC.
Finally Figure 7 shows the power results for the Stratix 3. When classifying 32 Mpps it can be seen that the power consumed by system A and B are almost identical, as system A only uses an extra 0.1% of the Stratix 3 logic resources to implement frequency scaling. System A shows power savings of 19% when the packet speed drops to 0.0625 Mpps. It can be seen that the power consumption is much higher for the Stratix than the Cyclone FPGA. This is because the classifier implemented on the Stratix uses double the memory of the classifier implemented on the Cyclone. The Stratix also has much more logic and memory resources available, leading to a larger amount of static power consumption. This large amount of static power is why the Stratix shows poorer reductions in power consumption.
It will be appreciated that what has been described herein is an exemplary arrangement of a low power architecture for a high speed packet classifier capable of meeting OC-768 line speed for rulesets containing up to 49,000 rules. The architecture presented has been tested while classifying packets at line speeds of up to OC-768 using large rulesets. Simulation results show that ASIC and FPGA implementations of such an architecture can reduce power consumption by between 17% and 88% by adjusting the frequency of an energy efficient hardware accelerator to match the traffic volume on a router line card.
While the architecture would be ideally suited to implementation as an on-chip hardware accelerator, relieving the burden from a programmable network processor's processing engines, or as an off-chip high speed classifier on a router line card, it is not intended that the teaching be so limited. Similarly, the architecture described has been in the context of an adaptive clocking unit used with a low power hardware accelerator which implements modified versions of the HiCuts and HyperCuts algorithms, but it will be appreciated and understood that the teaching of the present invention is not to be construed as being limited to the specifics of the hardware accelerator described as such an adaptive clocking unit could be used with different hardware accelerators without departing from the teaching of the present invention. While the low power hardware accelerator implementation of the packet classifier described uses SRAM rather than TCAM in order to reduce power consumption, it will be understood that the teaching of the present invention is not to be limited to such an exemplary arrangement.
The words comprises/comprising when used in this specification are to specify the presence of stated features, integers, steps or components but do not preclude the presence or addition of one or more other features, integers, steps, components or groups thereof. Similarly, the words outer, inner, upper, lower, top and bottom are provided to imply relative geometries but are not to be construed as limiting the teaching of the present invention to such specific orientations.

Claims (25)

  1. An adaptive clocking system for controlling the clocking frequency of a packet classifier, the system being operably coupled to a node within a packet network architecture so as to receive packet header information relating to packet traffic at that node and to subsequently transfer that packet header information to the packet classifier, the system comprising:
     a. A buffer having a plurality of buffer slots, each buffer slot being configured to operably store packet header information of a specific packet received from the node,
     b. A comparator configured to operably determine the number of buffer slots occupied within the buffer, and
     c. A controller which operably provides a control signal to the packet classifier to vary the clocking frequency of the packet classifier dependent on the number of buffer slots occupied within the buffer.
  2. The system of claim 1 wherein the packet header information includes source and destination IP address, source and destination port numbers and protocol number.
  3. The system of claim 1 or 2 wherein the buffer buffers 104 bits of a packet header.
  4. The system of any preceding claim wherein the buffer is configured to buffer the packet header information at speeds of up to 128 MHz.
  5. The system of any preceding claim wherein the buffer uses dual port SRAM to buffer the information.
  6. The system of any preceding claim wherein the controller uses a plurality of states for determination of the clocking frequency to be applied within the packet classifier.
  7. The system of claim 6 wherein each of the plurality of states is associated with a specific range of occupancy of the buffer.
  8. The system of claim 7 wherein the states differ from one another in the clocking signal to be used within the packet classifier while within that state.
  9. The system of any one of claims 6 to 8 wherein the plurality of states provide for a range of clocking signals to be applied within the packet classifier.
  10. The system of claim 9 wherein the controller operably sequentially uses individual ones of the plurality of states so as to provide for a sequential increment in the clocking frequency applied within the packet classifier dependent on the occupancy of the buffer.
  11. The system of claim 10 wherein the controller operably steps back to the lowest state only on determination that the buffer is empty.
  12. The system of claim 7 wherein the specific range of occupancy for each state is defined by a minimum and maximum threshold in the number of buffer slots occupied.
  13. The system of any one of claims 6 to 12 wherein the buffer slots are allocated equally amongst each of the states.
  14. The system of any one of claims 6 to 12 wherein the buffer slots are not equally allocated amongst each of the states.
  15. The system of any one of claims 6 to 14 wherein the frequency associated with each state provides for the packet classifier to operate synchronously with the adaptive clocking system.
  16. The system of any preceding claim configured to operably transfer packet header information of a specific packet from the buffer to the packet classifier on confirmation that the packet classifier has effected classification of packet header information from a previously transferred packet.
  17. The system of claim 16 wherein the buffer operates on a FIFO basis.
  18. A packet classification apparatus comprising a packet classifier as claimed in any preceding claim coupled to a packet classifier, the packet classifier configured to operably compare packet header information of a received packet against a plurality of rules of a classification ruleset to determine an appropriate classification for that packet.
  19. The apparatus of claim 18 wherein each rule differs from each other rule in the processing to be applied to packets defined within that rule.
  20. The apparatus of claim 18 or 19 wherein the packet classifier is configured to operably divide a ruleset into multiple groups, each of the multiple groups containing a number of rules that can be processed in a linear search.
  21. The apparatus of claim 20 wherein the packet classifier comprises SRAM with a long word line so as to reduce the number of clock cycles required to do the linear search on selected rules.
  22. The apparatus of any one of claims 18 to 21 wherein the packet classifier is implemented in an FPGA.
  23. A system substantially as hereinbefore described with reference to any one of Figures 1 to 7.
  24. A clocking system for providing a clock signal to a packet classifier, the system being operably coupled to a node within a packet network architecture so as to receive packet header information relating to packet traffic at that node and to subsequently transfer that packet header information to the packet classifier, the system comprising:
     a. A buffer having a plurality of buffer slots, each buffer slot being configured to operably store packet header information of a specific packet received from the node,
     b. A comparator configured to operably determine the number of buffer slots occupied within the buffer, and
     c. A controller which operably provides a control signal to the packet classifier to vary the clocking frequency of the packet classifier dependent on the number of buffer slots occupied within the buffer.
  25. A method of controlling the clocking frequency of a packet classifier, the method including providing:
     a. A buffer having a plurality of buffer slots, each buffer slot being configured to operably store packet header information of a specific packet received from a node within a packet network,
     b. A comparator configured to operably determine the number of buffer slots occupied within the buffer, and
     c. A controller which operably provides a control signal to the packet classifier to vary the clocking frequency of the packet classifier dependent on the number of buffer slots occupied within the buffer,
     d. Means for transferring packet header information of a packet from the buffer to the packet classifier for packet classification on determination that the packet classifier has effected classification of a previously transferred buffer.
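
By way of illustration only, the buffer, comparator and controller recited in the claims above can be modelled in software. The following Python sketch is not taken from the specification: the class name AdaptiveClockModel, its method names, the equal division of slots among states, and any frequency values passed in are hypothetical. It assumes one reading of the state behaviour, namely that the controller moves up one state at a time as occupancy exceeds the current state's upper threshold and returns to the lowest state only when the buffer is empty.

```python
from collections import deque


class AdaptiveClockModel:
    """Toy software model of the claimed buffer, comparator and controller.

    All names and thresholds are illustrative assumptions, not the
    patented hardware design.
    """

    def __init__(self, num_slots, freqs_mhz):
        self.num_slots = num_slots
        self.freqs_mhz = list(freqs_mhz)  # one illustrative frequency per state
        self.fifo = deque()               # buffered packet headers, FIFO order
        self.state = 0                    # start in the lowest-frequency state
        n = len(self.freqs_mhz)
        # Assumption: slots are shared equally among the states.
        # max_occupancy[i] is the highest occupancy tolerated in state i.
        self.max_occupancy = [(i + 1) * num_slots // n for i in range(n)]

    def occupancy(self):
        """Comparator: number of buffer slots currently occupied."""
        return len(self.fifo)

    def push(self, header):
        """Store a header from the node; return False if the buffer is full."""
        if self.occupancy() >= self.num_slots:
            return False
        self.fifo.append(header)
        self._update_state()
        return True

    def pop(self):
        """Hand the oldest header to the classifier once it reports it is done."""
        header = self.fifo.popleft() if self.fifo else None
        self._update_state()
        return header

    def _update_state(self):
        """Controller: step up one state at a time, reset only when empty."""
        occ = self.occupancy()
        if occ == 0:
            self.state = 0
        elif (self.state < len(self.freqs_mhz) - 1
              and occ > self.max_occupancy[self.state]):
            self.state += 1

    def frequency_mhz(self):
        """Clock frequency the controller currently requests."""
        return self.freqs_mhz[self.state]
```

With, say, 8 slots and four states, this model stays in the lowest state for the first two headers, moves to the second state on the third, and holds whatever state it has reached while the buffer drains, dropping back to the lowest frequency only once occupancy reaches zero. Unequal thresholds could be modelled by supplying max_occupancy directly instead of deriving it.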
GB0817659A 2008-09-26 2008-09-26 Adaptive clocking system for a packet classifier Withdrawn GB2463889A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB0817659A GB2463889A (en) 2008-09-26 2008-09-26 Adaptive clocking system for a packet classifier

Publications (2)

Publication Number Publication Date
GB0817659D0 GB0817659D0 (en) 2008-11-05
GB2463889A true GB2463889A (en) 2010-03-31

Family

ID=40019615

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0817659A Withdrawn GB2463889A (en) 2008-09-26 2008-09-26 Adaptive clocking system for a packet classifier

Country Status (1)

Country Link
GB (1) GB2463889A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03273731A (en) * 1990-03-22 1991-12-04 Nec Corp Fixed length packet switch
US20070201461A1 (en) * 2006-02-27 2007-08-30 Masayuki Shinohara Network switching device
US20080084893A1 (en) * 2006-10-10 2008-04-10 Samsung Electronics Co., Ltd. Network-on-chip apparatus, and method for controlling dynamic frequency for the same

Also Published As

Publication number Publication date
GB0817659D0 (en) 2008-11-05

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)