WO2001039430A2 - High-speed programmable packet scheduler and buffer manager - Google Patents

High-speed programmable packet scheduler and buffer manager

Info

Publication number
WO2001039430A2
Authority
WO
WIPO (PCT)
Prior art keywords
queue
packet
sequencer
minicell
hol
Prior art date
Application number
PCT/CA2000/001389
Other languages
English (en)
Other versions
WO2001039430A3 (fr)
Inventor
Alberto Leon-Garcia
Massoud Hashemi
Original Assignee
Leon Garcia Alberto
Massoud Hashemi
Priority date
Filing date
Publication date
Application filed by Leon Garcia Alberto, Massoud Hashemi filed Critical Leon Garcia Alberto
Priority to AU16848/01A priority Critical patent/AU1684801A/en
Publication of WO2001039430A2 publication Critical patent/WO2001039430A2/fr
Publication of WO2001039430A3 publication Critical patent/WO2001039430A3/fr

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/30Peripheral units, e.g. input or output ports
    • H04L49/3081ATM peripheral units, e.g. policing, insertion or extraction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/25Routing or path finding in a switch fabric
    • H04L49/253Routing or path finding in a switch fabric using establishment or release of connections between ports
    • H04L49/254Centralised controller, i.e. arbitration or scheduling
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/30Peripheral units, e.g. input or output ports
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q11/00Selecting arrangements for multiplex systems
    • H04Q11/04Selecting arrangements for multiplex systems for time-division multiplexing
    • H04Q11/0428Integrated services digital network, i.e. systems for transmission of different types of digitised signals, e.g. speech, data, telecentral, television signals
    • H04Q11/0478Provisions for broadband connections
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5638Services, e.g. multimedia, GOS, QOS
    • H04L2012/5646Cell characteristics, e.g. loss, delay, jitter, sequence integrity
    • H04L2012/5651Priority, marking, classes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5638Services, e.g. multimedia, GOS, QOS
    • H04L2012/5646Cell characteristics, e.g. loss, delay, jitter, sequence integrity
    • H04L2012/5652Cell construction, e.g. including header, packetisation, depacketisation, assembly, reassembly
    • H04L2012/566Cell construction, e.g. including header, packetisation, depacketisation, assembly, reassembly using the ATM layer
    • H04L2012/5661Minicells
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5678Traffic aspects, e.g. arbitration, load balancing, smoothing, buffer management
    • H04L2012/5679Arbitration or scheduling
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5678Traffic aspects, e.g. arbitration, load balancing, smoothing, buffer management
    • H04L2012/5681Buffer or queue management
    • H04L2012/5683Buffer or queue management for avoiding head of line blocking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/50Overload detection or protection within a single switching element

Definitions

  • the present invention in general relates to high-speed packet switches and routers for use in telecommunication and computer networks; more specifically, the present invention relates to queuing and scheduling in switches and routers that provide differential transfer service to multiple subclasses of packets destined to a given output port.
  • the present invention can be used in switches and routers to provide quality of service across packet networks.
  • a data packet is a discrete quantity of information that is normally sent serially between devices connected via a network. Packet switches and routers are used to direct these packets along the various branches of a network to its destination, using information contained in the packet header.
  • in switches and routers, data packets arriving from input ports are buffered and, after routing and classification (performed by processing the packet headers), the packets are queued and scheduled for transmission to individual output ports. Queuing is required in switches and routers because more than one packet may arrive from different input ports for the same output port; the packets have to be queued and sent from the output port one at a time. Packets destined for an output port may belong to different applications such as data, voice, video and so on.
  • a desirable feature of packet networks is the capability to buffer and transfer packets differentially in a manner that may depend on the type of application. Different applications can require different qualities of service (QoS) in the transfer of packets.
  • the quality of a service can be specified by the fraction of packets that are lost, by the total packet delay, and by the variations in the total packet delay (jitter) that packets experience in traversing switches and routers across the network.
  • To provide differential buffering and transfer service in a switch or router, packets that are destined to a given output port are classified according to priority level, connection or flow, or membership in some subclass of packets that require a certain type of service. The packets of different classes or priorities may then be put in different queues.
  • a scheduler is used to select the packets from these queues for transfer to the output port according to some scheduling algorithm.
  • the order in which the queued packets are selected for transmission to an output port affects the loss, delay, and jitter that a packet incurs in a switch or router.
  • the relative preference that different packet classes are given in access to buffering also affects the loss probability. Packets that are given preferential access will in general experience lower loss than other packets. Buffer management algorithms are used to implement differential access to buffering as well as to discard packets under certain conditions.
  • Queuing and scheduling and buffer management are key parts of any switch or router that is designed to provide differential packet buffering and transfer in a packet network.
  • Different network architectures use different approaches to providing differential service and so impose different requirements on the queuing and scheduling system.
  • Differential buffer access can be provided in a FIFO queuing system to provide different levels of packet loss to different packet types. For example, if there are two types of packet traffic, both classes are allowed access if the number of packets in queue is below a first threshold value. When the number of packets exceeds the first threshold value, only the first packet type is allowed access and second-packet-type arrivals are discarded. Once the number of packets exceeds a second, higher threshold, all packet arrivals are discarded.
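As an editorial illustration (not part of the patent text), the following Python sketch models the two-threshold admission rule described above; the threshold values and the packet-type encoding are assumptions chosen for the example.

```python
# Illustrative two-threshold differential buffer admission for two packet types
# sharing one FIFO queue; constants are arbitrary example values.
THRESHOLD_1 = 64    # above this, only the preferred packet type is admitted
THRESHOLD_2 = 128   # above this, all arrivals are discarded

def admit(queue_length: int, packet_type: int) -> bool:
    """Return True if an arriving packet of the given type may be enqueued."""
    if queue_length >= THRESHOLD_2:
        return False                # second threshold exceeded: discard everything
    if queue_length >= THRESHOLD_1:
        return packet_type == 1     # only the first (preferred) packet type is admitted
    return True                     # below the first threshold: admit both types
```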
  • Buffer management algorithms such as Random Early Detection (RED) [Floyd 1993] are designed to throttle the rate at which end systems send packets into a network.
  • packets are discarded according to a probability that depends on the margin by which the average queue occupancy exceeds the threshold.
  • the discarded packets are intended to signal to an endsystem mechanism such as TCP to reduce the rate at which packets are sent into the network before congestion sets in. Once the number of packets exceeds a second threshold all arriving packets are discarded.
  • Each packet type has a corresponding pair of threshold values and associated discard probabilities.
  • Each packet arrival is then discarded according to the current average number of packets in queue and the type-specific thresholds and discard probabilities.
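The following sketch, offered only as a hedged illustration of the RED-style discard described above, keeps an exponentially weighted average of the queue length and applies per-type thresholds and discard probabilities; all numeric parameters are assumptions of the example.

```python
import random

# Per packet type: (min_threshold, max_threshold, maximum discard probability).
PARAMS = {
    0: (40, 80, 0.10),   # preferred type: higher thresholds, lower drop probability
    1: (20, 60, 0.20),
}
WEIGHT = 0.002           # weight of the running average of the queue length
avg_qlen = 0.0

def discard_on_arrival(current_qlen: int, packet_type: int) -> bool:
    """Return True if the arriving packet should be discarded."""
    global avg_qlen
    avg_qlen = (1 - WEIGHT) * avg_qlen + WEIGHT * current_qlen
    min_th, max_th, max_p = PARAMS[packet_type]
    if avg_qlen < min_th:
        return False                      # no early discard below the first threshold
    if avg_qlen >= max_th:
        return True                       # above the second threshold: discard all arrivals
    # discard probability grows with the margin above the first threshold
    p = max_p * (avg_qlen - min_th) / (max_th - min_th)
    return random.random() < p
```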
  • Policing mechanisms are used to enforce the manner in which packets from a given class are allowed to arrive at a system. For example, the peak and average arrival rate in bytes per second may be enforced by a policing mechanism. The maximum burst size of a packet arrival in bytes may also be enforced. When packets arrive that violate said arrival pattern, the policing mechanism may discard the packets. Alternatively, the policing mechanism may mark a packet as being non-conforming and indicating that it is to undergo certain treatment in the case of congestion.
  • the simplest approach to providing differential packet transfer service and hence differential delay performance is to use static priority classes where packets are classified into a small number of classes [Keshav, pg. 223]. Packets destined to a given output port are placed in a queue dedicated to its given priority class. Each time a packet transmission is completed in an output port, the next packet to be transmitted is selected from the head of the line of the highest-priority non-empty queue. As a result packets from higher-priority classes experience lower delay than packets from lower-priority classes. However, large arrival rates of higher-priority packets can result in an insufficient number of transmission opportunities for the lower-priority queues, which can eventually fill up with packets and overflow, resulting in packet loss and excessive delay for the lower priority packets.
  • Round robin scheduling is a mechanism for providing an equitable share of transmission opportunities [Peterson, pg. 403].
  • each packet class has its own queue, and the scheduler selects the head-of-line packet from each queue in round robin fashion, so that after each round each queue has had one transmission opportunity. If packets from each subclass have the same average packet length, then over the long run each queue will transmit the same volume of information, measured in bits. However, round robin scheduling can be unfair. If packets from different subclasses have different average lengths then the volume of information transmitted over the long run by different queues will not be equal.
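For concreteness, the sketch below contrasts static-priority and round-robin selection of a head-of-line packet from per-class queues, as described in the two preceding paragraphs; the number of classes and the queue contents are illustrative.

```python
from collections import deque

queues = [deque() for _ in range(4)]     # index 0 = highest-priority class

def select_static_priority():
    """Serve the head of line of the highest-priority non-empty queue."""
    for q in queues:
        if q:
            return q.popleft()
    return None                          # all queues empty

_rr_next = 0                             # class to visit first in the next round

def select_round_robin():
    """Give each class one transmission opportunity per round, in turn."""
    global _rr_next
    for i in range(len(queues)):
        q = queues[(_rr_next + i) % len(queues)]
        if q:
            _rr_next = (_rr_next + i + 1) % len(queues)
            return q.popleft()
    return None
```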
  • Weighted fair queuing can guarantee pre-assigned but not necessarily equal levels of long-run transmission volumes to different subclasses of packets. Weighted fair queuing scheduling has been shown to provide guaranteed bounds on packet delay transfer across a network under certain conditions [Zhang, 1995]. Fair queuing, weighted fair queuing, self-clocked fair queuing and other scheduling algorithms have been shown to be implementable using a dynamic sorting mechanism where each packet is assigned a dynamically-computed tag which is then sorted to find its placement in a queue [Zhang, 1995].
  • ATM networks provide quality of service guarantees to every virtual connection of cells (fixed-length packets) that is established across the network. This can require that the cells of each connection be queued separately, called per-VC queuing, and that the scheduler be aware of every connection in its corresponding output port.
  • Integrated Services IP routers can provide packet transfer delay guarantees but they must place packets for each separate packet flow in a separate queue and the scheduler must be aware of the connections in an output port [Peterson, pg. 464].
  • the requirements that ATM and Integrated Services IP handle individual packet flows imply that schedulers must be able to handle a very large number of queues for the different flows. To provide delay guarantees, weighted fair queuing scheduling mechanisms are required in ATM switches and Integrated Services IP routers.
  • Differentiated services IP routers provide differential transfer to packets according to their membership in specified classes.
  • a field in the IP header is used to indicate that a packet is to experience a certain type of per-hop behavior (PHB).
  • Various types of scheduling algorithms can be used to produce the standardized behaviors. Six bits have been allocated to identify different PHBs, so a differentiated service router may be required to handle up to 64 separate classes. PHBs address not only delay behavior but also packet loss performance. In particular, different PHBs can be defined to have different levels of packet drop precedence.
  • a customer and a service provider may establish a traffic agreement that specifies the temporal properties of the traffic stream that the customer is allowed to transmit into the network.
  • Two service providers may also establish such agreements to exchange traffic. Users and service providers may apply shaping mechanisms to their traffic to ensure conformance to the agreement.
  • Some routers are required to allocate transmission opportunities on a hierarchical basis. For example, a router may be required to share the transmission on an output port first according to organization, and within each organization, according to packet flow or application type [Floyd 1995].
  • Hierarchical packet fair queuing algorithms [Bennett 1996] can provide this type of allocation.
  • Hierarchical allocation of transmission is an important requirement in the partitioning of packet networks into private virtual networks that have dedicated amounts of transmission allocated to them.
  • Virtual networks can be created by partitioning the transmission capability of an output port among a number of virtual networks using a mechanism such as Multiprotocol Label Switching [Davie]. Each virtual network may handle its own set of packet flows. A scheduling algorithm can be used to enforce the partitioning of buffering and bandwidth in a router and if necessary to provide the differential treatment to different packet flows within each virtual network. Routers that implement MPLS are intended for operation in the core of the Internet where very high-speed operation will be required. The number of MPLS flows that need to be handled in a large backbone network can be very large.
  • Connection-oriented networks such as ATM provide quality of service guarantees to every virtual connection of cells (fixed-length packets) that is established across the network. This can require that the cells of each connection be queued separately, called per-VC queuing, and that the scheduler be aware of every connection in its corresponding output port.
  • the requirement that ATM handle individual packet flows implies that schedulers must be able to handle a very large number of queues for the different flows.
  • shaping and scheduling mechanisms need to be set up in the switches along a connection.
  • U.S. Patent 5,864,540 and [Rexford et al, 1997] describe an integrated shaping and scheduling mechanism that handles fixed-length ATM cell connections with specified rate and burstiness traffic parameters.
  • an ideal packet scheduler should be able to: 1. Implement the scheduling algorithm that is appropriate in a given network scenario; 2. Schedule large to very large numbers of packet flows depending on the network scenario; and 3. Be capable of operating at very high speed.
  • Schedulers that can operate at high speed and for a large number of queues are difficult to implement. Many existing switches and routers implement queue scheduling in software. Software-based scheduling is programmable, but does not scale to very high speed [Kumar], [Newman], [Keshav 1998]. Existing hardware-based schedulers can achieve high speeds but have limited flexibility.
  • the insert-from-the-front method involves inserting new packets from the front of the queue, since the only candidates for departure are the packet that is currently at the head of the queue and the new arrivals.
  • a series of research publications has developed packet schedulers using the exact-sorting insertion from the front method.
  • [Hashemi 1997a] describes a sequencer that can implement the scheduling of packets arriving from a plurality of input ports and destined for a single output port. The sequencer can implement any scheduling algorithm that involves the comparison of a tag.
  • [Hashemi 1997b] shows that the sequencer circuit can be programmed to implement a variety of scheduling algorithms, including priority queues, windowed-priority queues, fair queuing, pacing mechanisms, and hybrid combinations of scheduling algorithms.
  • [Hashemi 1997c] describes a "single queue switch" sequencer that can schedule packets arriving from a plurality of input ports and destined to a plurality of output ports. Said sequencer can be used as a centralized scheduler in a memory-based packet switch [Leon-Garcia and Hashemi, "The Single Queue Switch", Canadian Application No. 2,227,655]. [Hashemi 1997d] describes how the single-queue switch can be used to schedule variable length packets. [Zhang 1999] shows that very high speeds can be achieved by integrated circuit implementations of the single queue switch.
  • One feature of the insert-from-the-front sequencer is the serial comparison of numerical or logical relationships between binary numbers that are attached to minicells or minipackets that enter the sequencer.
  • a major drawback of all sequencer circuits is scalability in terms of the number of flows that can be scheduled individually.
  • the total number of packets that can be buffered is limited by the number of comparison cells that can be built into an integrated sequencer circuit [Zhang 1999].
  • the number of packet classes is typically much smaller than the total number of packets in the system. For this reason sequencer circuits have not been used in packet schedulers.
  • An insert-from-the-front sequencer performs serial comparison of numerical or logical relationships between binary numbers that are attached to minicells or minipackets that enter the sequencer.
  • Said sequencer circuits can be made programmable by placing a superset of comparison logic that can be selected according to the type of scheduling that is desired.
  • the packet scheduler in the present invention provides programmability and high speed by incorporating said type of sequencer circuits and enhancements thereof.
  • a large number of queues can be implemented due to a new approach in which packets are queued in linked-list queues and scheduled by a single-queue sequencer.
  • the method of combining the queuing and scheduling parts makes it possible to schedule a large number of queues using a smaller sequencer.
  • the present invention provides a method and apparatus for managing the buffering and scheduling the transfer of packets of data arriving on one or a plurality of input ports and destined for one or a plurality of output ports of a switch or router or a subsystem thereof.
  • Each arriving packet has an index that specifies both a unique destination output port for the packet and membership in a subclass.
  • a minipacket is created that contains the index of the packet, a unique identifier such as the packet's storage location, and information relevant to the scheduling of the packet.
  • Minipackets are input into a queue control part that stores said minipackets in order of arrival in a unique queue that is assigned to each index.
  • These queues are implemented using a bank of linked-list queues that can be assigned flexibly and arbitrarily to the minipackets in the system.
  • the bank of linked-list queues is scalable in that it can handle a very large number of queues.
  • the queue control part can implement a variety of buffer management and policing mechanisms.
  • a head-of-line (HoL) selector part identifies the head-of-line minipacket in a queue and at an appropriate time creates a shorter data unit called a minicell that contains a pointer to the minipacket and the essential scheduling information.
  • the HoL selector transfers the head-of-line minicell to a head-of-line (HoL) sequencer part.
  • the HoL selector part replaces each minicell that exits the HoL sequencer part with the HoL minicell from the queue with the same index. This ensures that the HoL sequencing part always has one HoL minicell for each non-empty queue.
  • the HoL sequencer part determines which minicell should be scheduled for transfer out of the system according to a selected scheduling algorithm.
  • a sequencer circuit determines the order in which the HoL minicells should be transferred out of the one or plurality of output ports.
  • a generalized sequencer circuit is disclosed that can implement any scheduling algorithm that can be translated into a logical or numerical relationship between a series of numbers carried in the minicells.
  • a tagging unit that precedes the HoL sequencer circuit can assign such numbers to the minicells according to the desired scheduling algorithm.
  • the associated minipacket is dequeued and output from the scheduler system.
  • the actual packets in the switch, router, or subsystem thereof are transferred from storage to output ports according to the sequence of minipackets that exit the scheduler system.
  • the packet scheduler of the present invention handles a sequence of data units called minipackets to control the buffering and transfer of packets outside the scheduler system.
  • the packet scheduler implements exact sorting of all packets in the system by arranging them in a bank of FIFO queues in tandem with a sequencer that contains the single head-of-line minipacket from each non-empty FIFO queue. Provisioning of different transmission rates to the packets of different FIFO queues may be accomplished through the sorting of minipackets according to tags. The choice of sequencer and the structure of the tag determine the type of scheduling that is implemented.
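The data units named in this summary are not given a fixed format by the text; the sketch below is therefore only an assumed layout, in Python, of a minipacket (queue index, unique identifier such as the packet's storage location, and scheduling information) and of the shorter minicell that points to it.

```python
from dataclasses import dataclass

@dataclass
class Minipacket:
    queue_index: int       # destination output port and, optionally, subclass membership
    packet_location: int   # unique identifier, e.g. the storage location of the data packet
    length: int            # packet length, used by length-aware scheduling algorithms
    priority: int = 0      # optional class or priority information
    timestamp: int = 0     # arrival time stamp assigned inside the scheduler system

@dataclass
class Minicell:
    minipacket_ptr: int    # pointer to the minipacket in the minipacket memory
    queue_index: int       # queue whose head of line this minicell represents
    tag: int = 0           # scheduling tag produced by the tagging circuit
```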
  • FIG. 1 Block diagram of a scheduler system
  • Figure 9 Generalized Single-Queue Sequencer Circuit
  • Figure 10 Example interleaving the logical output queues for 8 ports
  • FIG. 12 Example application of Scheduling System in Centralized Switch
  • Figure 15 is a schematic diagram of a queue and scheduling system in an input line card.
  • FIG. 1 shows the block diagram of a scheduler system, according to the present invention, that is used for scheduling the transfer of data packets arriving on one or a plurality of input ports and destined for one or a plurality of output ports in a switch, router, or subsystem thereof.
  • the scheduler system of the present invention has an input interface 16 through which it receives data units, called minipackets, that represent data packets that have arrived from input ports and that are temporarily placed in a storage location outside the scheduler system.
  • the scheduler system accepts these minipackets, performs a scheduling algorithm, and outputs on output interface 18 minipackets that specify the order in which data packets are to be transmitted on the output ports.
  • the scheduler system may optionally perform buffer management and packet policing functions in which case it may output minipackets from time to time that indicate that the corresponding data packets are to be discarded.
  • the system of the present invention can be viewed as a combination of four major parts: a queue control part 10; a system memory part 11; a head-of-line (HoL) selector part 12; and a head-of-line (HoL) sequencer part 14.
  • the objective of the queue control part 10 is to accept minipackets as input, and to place these minipackets in the queue indicated by the queue index in each minipacket.
  • the system memory part 11 is used for the storage of minipackets and as a repository of queue state information.
  • An important aspect of this invention is that all packets of the same queue index are to be transmitted in first-in first-out (FIFO) fashion.
  • the queue control part can place minipackets in their corresponding queue in order of arrival.
  • An important aspect of this invention is that, at any given time, the minipacket that is selected for transmission must belong to the set of minipackets that are at the head of their respective queue.
  • the objective of the HoL selector part 12 is to ensure that every minipacket from the head of line of each non-empty queue is represented in the HoL sequencer part.
  • the HoL selector creates a shorter data unit called a minicell that contains a pointer to the minipacket and the essential scheduling information.
  • the HoL sequencer part 14 selects the next minipacket to be transmitted from among the set of all HoL minipackets according to a given scheduling algorithm.
  • an important requirement of this invention is that a minipacket arriving to an empty system be transferred to the HoL sequencer directly.
  • It is assumed that minipackets are created outside the scheduler system, and that they enter the system sequentially as shown in Figure 1.
  • Each minipacket is prepared so that it contains information required by the queuing and scheduling algorithms.
  • Said information can include priority level or class, packet length, time stamp, and other information, some of which could be optional or not implemented in some realizations.
  • Said information can be extracted directly from the header of the data packets or indirectly by processing of the headers by a packet processor or classifier.
  • each minipacket contains a queue index that specifies the output port that the corresponding data packet is destined for, and optionally membership of the data packet in a subclass.
  • The specific format of minipackets is not of particular importance in this invention; however, all the minipackets should have the same format and the realization of the system should also conform to the selected format.
  • Each minipacket represents a data packet, and so each minipacket must also contain a unique identifier that specifies its corresponding data packet.
  • the unique identifier may consist of the storage location of the data packet.
  • Minipackets may represent different types of data packets such as IP packets, ATM cells, or ethernet frames.
  • the data packets can be fixed-length such as in ATM, or variable-length such as in IP.
  • the queue control part 10 accepts minipackets on input 16. Each input minipacket is stored in memory allocated for this purpose. The queue control part 10 is responsible for keeping track of the order of each minipacket in each queue. The queue control part 10 is also responsible for the output of minipackets.
  • When the HoL sequencer part 14 indicates that a certain data packet should be transmitted next, the corresponding minicell is transferred from the HoL sequencer to the queue control part.
  • the minicell contains a pointer that specifies the location of the corresponding minipacket in memory.
  • the queue control part removes the minipacket from memory, performs certain bookkeeping functions associated with the given minipacket, and sends the minipacket from the scheduler system on output 18. The data packet corresponding to the given minipacket can then be retrieved from storage and transmitted on the corresponding output port(s).
  • Table I is a pseudocode representation of the Queue Control part.
  • the plurality of queues are not pre-assigned to any particular service, priority, class, flow or any other category of packets.
  • the use of queues and their assignment to different categories of packets is arbitrary and can be decided by previous stages of the switch or router, or by the controller software in the switch or router.
  • Each queue is identified by an index.
  • a controller outside the scheduler system such as a switch/router control, or management software, assigns the queues to different categories of packets.
  • the controller also keeps a record of the assignment in its memory, most likely in the form of a lookup table.
  • the target queue for a packet needs to be determined before the corresponding minipacket can be sent to the scheduler system.
  • the queue can be chosen by a packet processor, a packet classifier, or other means depending on the switch or router architecture.
  • the category to which the packet belongs is determined from the information attached to the packet. Based on this information, and according to the queue assignment information, a queue is selected for the packet and the corresponding index is attached to the minipacket.
  • Figure 2 shows the queue control part 10 that performs the function of enqueueing and dequeueing of the minipackets into the queues.
  • the queue control part includes a linked-list controller 20 that maintains the storage of minipackets in the queues.
  • Minipackets arriving to the scheduler system of the present invention are stored in a minipacket memory 30 in the system memory part, 11 in Figure 3.
  • the linked-list controller 20 contains a memory space controller 21 that provides pointers to free spaces in the memory that is used to store arriving minipackets. The pointers are returned to the memory space controller whenever the minipackets are sent out to the fabric.
  • a pool of pointers or a linked list queue of free blocks of memory can be used by the memory space controller to provide pointers to free blocks in the memory.
  • the linked-list controller may optionally use the unique identifier that arrives with an input minipacket as the pointer to a minipacket memory space.
  • a minipacket must be placed in the appropriate queue after it has been stored in the minipacket memory 30 in Figure 3.
  • the pointer of the storage location of the minipacket is linked to the linked-list queue identified by the queue index attached to the minipacket.
  • Each linked-list is accessed through two pointers called head pointer and tail pointer that are stored in two buffers known as head pointer buffer 31 and tail pointer buffer 33 as shown in Figure 3.
  • the head pointer buffers are in a separate memory.
  • the head pointer buffer of a linked-list queue stores the pointer of the first minipacket of the queue.
  • the tail pointer buffers are in another separate memory.
  • the tail pointer buffer of a linked-list queue stores the pointer of the last minipacket of the queue.
  • the head and tail pointers of a linked-list queue are determined using the queue index.
  • the head pointer buffer is used to read the head-of-line minipacket out of the linked list queue.
  • the tail pointer is used to write a new minipacket into the linked list queue.
  • the minipackets in a queue are linked to each other in the order of arrival by writing the pointer of the new minipacket in the link buffer memory 32 of the last minipacket in the linked-list queue.
  • a separate link buffer memory 32 is used. Both the minipacket memory and the link buffer memory are accessed using the same pointer.
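The following Python model is a hedged software rendering of the structures just described: a shared minipacket memory, a link buffer addressed by the same pointers, per-queue head and tail pointer buffers, and a pool of free pointers standing in for the memory space controller 21. Sizes and names are assumptions of the example.

```python
MEM_SIZE = 1024
NUM_QUEUES = 256
NIL = -1

minipacket_memory = [None] * MEM_SIZE    # minipacket memory 30
link_buffer = [NIL] * MEM_SIZE           # link buffer memory 32 (same pointer space)
head_ptr = [NIL] * NUM_QUEUES            # head pointer buffer 31
tail_ptr = [NIL] * NUM_QUEUES            # tail pointer buffer 33
free_pool = list(range(MEM_SIZE))        # pool of pointers to free memory blocks

def enqueue(queue_index, minipacket):
    """Store a minipacket and link it to the tail of its linked-list queue."""
    ptr = free_pool.pop()                # pointer to a free block (assumed available)
    minipacket_memory[ptr] = minipacket
    link_buffer[ptr] = NIL
    if head_ptr[queue_index] == NIL:     # the queue was empty
        head_ptr[queue_index] = ptr
    else:                                # link behind the current tail
        link_buffer[tail_ptr[queue_index]] = ptr
    tail_ptr[queue_index] = ptr
    return ptr

def dequeue_head(queue_index):
    """Unlink and return the pointer of the head-of-line minipacket, or NIL."""
    ptr = head_ptr[queue_index]
    if ptr == NIL:
        return NIL
    head_ptr[queue_index] = link_buffer[ptr]
    if head_ptr[queue_index] == NIL:
        tail_ptr[queue_index] = NIL      # the queue became empty
    return ptr

def release(ptr):
    """Return a pointer to the free pool once its minipacket has been output."""
    minipacket_memory[ptr] = None
    free_pool.append(ptr)
```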
  • the queue control part 10 in Figure 2 can include a queue information controller 26 that maintains per-queue control information, such as empty/nonempty state, packet arrival time, queue length, average queue length, and the last time a queue was serviced.
  • the queue information controller uses the queue information memory 34 in Figure 3 to store and retrieve information. The collection and use of some of this control information is optional and could vary in different implementations of the present invention.
  • the queue control part 10 in Figure 2 can also include a buffer management controller 28 that can process information generated by the queue information controller to determine whether a data packet should be discarded or marked for certain treatment. Alternatively, the information in the queue information controller 26 can be reported to an attached processor that uses this information for switch management purposes.
  • the sequence of functions performed upon arrival of a minipacket into the queue control part of the present invention is as follows and is shown in Figure 4.
  • the minipacket and associated information is stored in a free block of memory in the minipacket memory 30.
  • the pointer for the location of the minipacket is linked to the tail of the linked-list queue whose index is attached to the minipacket, and the linked list 32 is updated accordingly.
  • the queue information 34 for the given queue index is updated as well.
  • This information includes a time stamp that is assigned to the minipacket according to a global clock that indicates the current time of the system.
  • the length of the queue is incremented and stored back.
  • a running average of the queue length can be updated.
  • the age of the queue, which is defined as the time stamp of the head-of-line minipacket in the queue, is updated if the minipacket arrives to an empty queue.
  • HoL selector 12 obtains the queue index of the queue that is selected.
  • the queue index is used to access the head pointer buffer memory 31.
  • the head pointer is used to read out the first minipacket in the queue and the linked-list is updated accordingly.
  • a minicell is prepared containing the subset of the information in the minipacket that is required by the HoL sequencer part 14. Said information contains the pointer to the minipacket location in the minipacket memory 30. Said minicell is transferred to the HoL sequencer part. In this particular implementation, the minipacket is kept in the minipacket memory 30 even though it is removed from the link buffer memory 32.
  • the queue length is updated by reading the queue length, decrementing it and writing it back into the entry for the given queue index in the queue information memory 34.
  • the age of the queue is also updated by writing the age of the new HoL minipacket in the queue age entry for the given queue index in the queue information memory.
  • the age of the new HoL minipacket is obtained from the information stored for the HoL minipacket in the minipacket memory 30.
  • the series of functions to output a scheduled minipacket in the queue control part 10 of the example implementation of the present invention is as follows.
  • the winner minicell is passed to the queue control part 10 by the HoL sequencer.
  • the pointer in the minicell is used to read the actual minipacket out from the minipacket memory 30.
  • the minipacket is then sent out through the output interface 18 of the scheduler system.
  • the pointer of the minipacket is returned to the pointer pool of free memory that can be used to store new minipackets and that is maintained by the memory space controller 21.
  • the exiting minipacket may contain the time stamp of the instant of arrival to the scheduler system and the instant of departure from the scheduler system, or the difference thereof.
  • the functions in the queue control part, system memory part, and HoL selector part of the present invention are performed by separate units and in parallel wherever it is possible, and if necessary serially within a series of time slots (clock cycles), such as in cases where two or more functions access the same memory or register.
  • An optional buffer management controller 28 of the queue control part 10 can use the per-queue information contained in the queue information memory 34 to discard minipackets from the queues if necessary, or to mark minipackets for certain subsequent treatment in the scheduling system.
  • Different discarding algorithms can be implemented in said buffer management section, for example the minipacket at the head or the tail of a queue can be discarded.
  • Different criteria can be used to decide on when to discard a minipacket, e.g. queue size and pre-established thresholds.
  • Packet discarding as in RED and RIO can also be implemented by comparing a pseudo-random number and a threshold that depends on the average queue size [Floyd 1993], [Clark 1998].
  • the buffer management controller may also be used to police arriving packet flows.
  • the queue information memory and associated processing is used to police packet arrival patterns.
  • the Generic Cell Rate Algorithm for ATM cell arrivals can be policed in this fashion [DePrycker pg 305].
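As one hedged example of how per-queue state in the queue information memory could be used for policing, the sketch below implements the virtual-scheduling form of the Generic Cell Rate Algorithm cited above; the state layout and parameter names are assumptions of this example.

```python
class GCRA:
    """Generic Cell Rate Algorithm, virtual-scheduling formulation."""
    def __init__(self, increment: float, limit: float):
        self.T = increment    # expected inter-cell time (1 / policed rate)
        self.tau = limit      # tolerance allowing limited burstiness
        self.tat = 0.0        # theoretical arrival time of the next conforming cell

    def conforming(self, arrival_time: float) -> bool:
        """Return True if the arriving cell conforms to the traffic contract."""
        if arrival_time < self.tat - self.tau:
            return False      # cell arrived too early: non-conforming (discard or mark)
        self.tat = max(arrival_time, self.tat) + self.T
        return True
```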
  • This discarding procedure is performed upon arrival of a minipacket for a queue.
  • the discarded minipackets are marked in a unique manner and output by the scheduler system.
  • the original data packet is then discarded by the switch, router, or subsystem thereof.
  • a HoL selector part 12 transfers minicells representing the head-of-line minipackets to the HoL sequencer part 14 of the system.
  • Each subclass of packets that belong to a given queue is defined so that only first-in first-out scheduling is required for the minipackets within the queue. Consequently, each time the HoL sequencer needs to select the next minipacket to be transferred out of the entire scheduler system, only the head-of-line minipacket of each queue needs to be considered.
  • the function of the HoL selector part 12 is to ensure that the HoL sequencer part 14 has the appropriate head-of- line minicell from each non-empty queue.
  • the HoL selector part transfers to the HoL sequencer the next HoL minicell of the queue with the same queue index.
  • the size of the required space in the HoL sequencer part is equal to the number of queues (distinct indices) in use, rather than the total number of minipackets in the system. Typically the total number of minipackets in the system is much larger than the number of queues.
  • Table II below is a pseudocode representation of a HoL Selector and sequencer:
  • the HoL selector mechanism is an important aspect of the present invention.
  • the HoL selector part 12 of the present invention contains a HoL controller 40 that ensures that the new HoL minipacket of the most recently serviced queue is sent to the HoL sequencer part in time for the next scheduling decision.
  • the following describes one implementation of the mechanism. However it must be noted that this implementation is not the only way to realize the mechanism. The basic goal of this mechanism can be achieved through other implementations.
  • Each of the queues has a flag bit associated with it that indicates whether or not the queue is represented in the HoL sequencer part.
  • this flag information can be stored in the queue information memory 34.
  • An arriving minipacket is placed in the queue if the flag is already set to 1.
  • a minipacket arriving to a queue with the flag bit set to 0 always results in the sending of a corresponding minicell to the HoL sequencer part, and the flag is set to 1.
  • the direct transfer of said minicell to the sequencer part is essential to maintain the correct operation of the scheduling algorithm.
  • a possible implementation of the direct transfer mechanism involves setting aside time slots so that each arriving minipacket can have a corresponding minicell transferred directly to the sequencer part if necessary. However it must be noted that this implementation is not the only way to realize the mechanism.
  • When a minipacket is selected for output, the minicell representing the minipacket is sent back to the HoL selector part, where the HoL controller 40 extracts the queue index in the minicell and sends the next HoL minicell of the same queue to the HoL sequencer part before the next scheduling decision time for the related output port. If the queue is empty, the HoL controller sets the flag bit to 0 to indicate that the queue is not represented in the HoL sequencer part. This mechanism is completed by the procedure for empty queues that is explained next.
  • the winner minicell is also sent to the queue controller part 10, where the corresponding minipacket is removed from the minipacket memory and then output back to the switch or router or subsystem thereof.
  • An important aspect of the present invention is the way empty queues are handled.
  • a major limitation in queuing systems that incorporate a large number of queues is the need for updating the state of the queuing system to track the state transitions from an empty queue to a non-empty queue or vice versa.
  • a major problem with these systems is that whenever the queue that is chosen for service is empty, many time slots are required before another non-empty queue can be scheduled for service.
  • a queue is examined only at the time that a new minipacket arrives for that queue or when a minicell from that queue is serviced.
  • This aspect of the present invention can also be implemented in different ways. Here we describe one implementation of the mechanism to elaborate on fundamental objectives of the mechanism.
  • An empty queue flag bit associated with each queue indicates whether or not the queue is empty.
  • the empty queue flag can be stored in the entry for a given queue index in the queue information memory 34. In this mechanism, whenever a new minipacket arrives to an empty queue that is already represented, the minipacket is put in the linked-list queue and the empty flag is set to 1. If a minipacket arrives to an empty queue that is not represented in the sequencer part, it is sent to the HoL sequencer part. In this case the empty flag bit remains 0; however, the represented flag bit is set to 1.
  • the next HoL minicell of the serviced queue is sent to the HoL sequencer part, and if the minicell is the last in the queue, the empty queue flag is reset to 0 to indicate an empty queue. If the queue is already empty, the empty flag bit remains 0; however, the represented flag bit is set to 0 to indicate the queue is not represented in the HoL sequencer part.
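Building on the linked-list sketch given earlier, the hedged model below shows one way the represented flag and the empty flag could be maintained; send_to_sequencer and make_minicell are placeholders for the transfer of a minicell to the HoL sequencer part, and all names are illustrative.

```python
represented = [False] * NUM_QUEUES   # queue has a minicell in the HoL sequencer part
non_empty = [False] * NUM_QUEUES     # "empty flag": linked-list queue still holds minipackets

def make_minicell(queue_index, minipacket):
    return (queue_index, minipacket)         # placeholder minicell

def send_to_sequencer(minicell):
    pass                                     # placeholder for the HoL sequencer interface

def on_minipacket_arrival(q, minipacket):
    if not represented[q]:
        # queue not represented: its minicell is transferred directly to the sequencer
        send_to_sequencer(make_minicell(q, minipacket))
        represented[q] = True                # empty flag stays 0: nothing else is queued
    else:
        enqueue(q, minipacket)               # queue behind the represented head of line
        non_empty[q] = True

def on_winner(q):
    """Called when the winner minicell of queue q has exited and been output."""
    if non_empty[q]:
        ptr = dequeue_head(q)                # replace it with the next head of line
        send_to_sequencer(make_minicell(q, minipacket_memory[ptr]))
        if head_ptr[q] == NIL:
            non_empty[q] = False             # that was the last queued minipacket
    else:
        represented[q] = False               # the queue is no longer represented
```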
  • the HoL selector part of the present invention can also provide shaping and rate regulation using a control circuit that marks minicells for certain treatment in the HoL sequencer part, or that enables or disables the transfer of certain HoL minicells to the HoL sequencer part.
  • the control circuit can be implemented in different ways. For example per-queue token bucket regulators can be used to determine the time at which an HoL minicell may be transferred to the HoL sequencer part [Ferguson 1998].
  • the control mechanism is incorporated in the present invention by a simple modification that can be implemented into the HoL controller: The next HoL minicell is transferred only if sufficient tokens are available for the given packet length. Otherwise the normal replacement of the HoL minicells is not performed.
  • the policing and shaping circuit records the indices of the queues whose HoL replacement has not been performed because of the lack of the tokens.
  • the HoL replacement of a minicell is resumed when sufficient tokens become available for the transfer of the HoL minicell.
  • the shaping and policing circuit operates in parallel with, and independently of, the HoL controller so that the speed of operation is unaffected.
  • rate control can also be implemented directly in the sequencer part by using rate-based scheduling algorithms [Hashemi 1997b].
  • said token bucket regulators can set an earliest service time field in the minicell that indicates to the HoL sequencer part the time when a minicell is eligible to exit the system.
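A per-queue token bucket of the kind mentioned above might be sketched as follows; the rate, depth, and the bookkeeping of deferred queues are assumptions of the example, not details given by the text.

```python
import time

class TokenBucket:
    def __init__(self, rate_bytes_per_s: float, depth_bytes: float):
        self.rate = rate_bytes_per_s
        self.depth = depth_bytes
        self.tokens = depth_bytes
        self.last = time.monotonic()

    def try_consume(self, packet_length: int) -> bool:
        """Refill, then consume tokens for one packet if enough are available."""
        now = time.monotonic()
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_length:
            self.tokens -= packet_length
            return True                  # HoL replacement may proceed
        return False                     # defer HoL replacement until tokens accumulate

deferred = set()   # indices of queues whose HoL replacement is waiting for tokens
```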
  • the HoL sequencer part 14 of the present invention consists of a tagging circuit and a sequencer circuit as shown in Figure 6.
  • a tagging circuit takes a HoL minicell and associated information from the HoL selector part and produces a tag.
  • the sequencer circuit sorts the HoL minicells according to a specific scheduling algorithm and at the appropriate time instants outputs a winner minicell which is then transferred to the HoL selector part and to the queue control part.
  • the HoL sequencer can handle the scheduling of variable length packets as follows. Since an output port will not become available for the next transmission until an entire packet is transmitted, the next minicell is not allowed to exit the system until said transmission time is completed.
  • the sequencer of the present invention orders the minicells based on their tags. Therefore a scheduling algorithm can be implemented by tagging minicells in such manner that the minicells are sorted in the same order that they would be chosen by the specific scheduling algorithm.
  • the tagging algorithm in the tagging circuit and the ordering algorithm in the sequencer need to be selected jointly so that the desired scheduling algorithm can be realized.
  • the tagging circuit can be designed and implemented in different ways and can be customized for specific scheduling algorithms.
  • the tagging circuit can be very simple. In the case where minicells carry static priority values, the circuit only adds a number of flag bits required by the sequencer circuit to perform the basic sorting operation.
  • complex processing may be required in the tagging section that can be provided by arithmetic circuits in hardware or by local processors or through an attached processor.
  • weighted fair queuing involves the calculation of a tag that depends on the packet length, the weight of the queue, and a virtual time of the given queue.
  • the information required to calculate the tag may be carried by the minicells or transferred in parallel to the HoL sequencer part. It is also possible to store per queue information required for tagging locally in the tagging section.
  • An interface to an attached processor can be used to initialize or dynamically update this information.
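As a hedged illustration of the tag computation described above, the sketch below derives a weighted-fair-queuing finish-number tag from the packet length, the queue weight, and a virtual time, keeping the per-queue finish number locally as the text allows; the formula follows the general form of the algorithms surveyed in [Zhang 1995], and the variable names are assumptions.

```python
NUM_QUEUES = 256                   # as in the earlier sketches
finish = [0.0] * NUM_QUEUES        # last finish number assigned per queue
weights = [1.0] * NUM_QUEUES       # relative service weight of each queue

def wfq_tag(queue_index: int, packet_length: int, virtual_time: float) -> float:
    """Compute the finish-number tag to be carried by the packet's minicell."""
    start = max(virtual_time, finish[queue_index])
    finish[queue_index] = start + packet_length / weights[queue_index]
    return finish[queue_index]
```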
  • Any sequencer that can select the minicell with the highest tag value can be used in the HoL sequencer part.
  • An important aspect of the current invention is the use of a sequencer circuit that maintains a sorted list of minicells according to tag values.
  • Various sequencers that maintain said sorted list of minicells have been disclosed in [Hashemi 1997a], [Hashemi 1997b], [Hashemi 1997c], [Hashemi 1997d] and [Leon-Garcia and Hashemi, "The Single Queue Switch", Canadian Application No. 2,227,655].
  • the generalized sequencer circuit of the present invention uses the principles of the sequencer circuit described in [Hashemi 1997a].
  • the generalized sequencer of the present invention consists of a chain of buffering units 53 in Figure 7. Each buffering unit can accommodate two minicells. Each minicell, as described before, is a fixed-length block of data that includes a tag. Minicells residing in the chained buffering units form a sorted list. Minicells in the sorted list can travel from one unit to the adjacent units in forward and backward directions.
  • Each buffering unit 53 has a comparison unit 55 that can compare the two tags of the two minicells residing in the buffering unit. Comparison can involve an arithmetic or logical operation. Moreover, as an important aspect of the present invention, each buffering unit has an arithmetic and logical unit 56 that can perform arithmetic and logical operations on the tags.
  • a new minicell enters the queue from the head of the queue. The cell is compared to the cell at the head of the queue. According to the predefined logic of comparison and the two tag values, one of the two minicells is sent out as the winner. The other minicell, the loser, is sent one step back inside the queue to the next buffering unit. According to the present invention an arithmetic and/or logical operation can be performed on the tag of the minicell that is sent back in the queue.
  • the comparison and forwarding procedure is repeated at every buffering unit spreading into the buffer, like a wave, until the last unit is reached.
  • the winner of the unit i is always forwarded to the unit i-1 (the forward path), and the loser to the unit i+1 (the backward path) as is shown in Figure 7.
  • the next new minicell enters the sequencer circuit immediately after the previous minicell has departed the first buffering unit, thus generating a new wave.
  • the events inside the sequencer are exactly synchronized. Therefore, whenever the new wave arrives at unit i, the outcome of the previous wave is already in unit i+1 and the comparison can start immediately. Each arriving minicell moves backward into the queue until it finds its right place in the queue pushing other minicells backward into the queue.
  • a blocking capability 57 in Figure 7 is implemented in the sequencer, so that minicells can enter the sequencer even when a minicell is not allowed to exit the sequencer. To realize this capability, the winner in the comparison in each unit remains at the same unit instead of being forwarded to the adjacent unit. The loser is sent backward as before. In the scheduling of variable length packets, the blocking capability is used to determine the timing of the exit of the next winner minicell.
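The sketch below is a behavioral software model, not a hardware description, of the insert-from-the-front sequencer: each new minicell is compared starting at the head and ripples backward past residents with better tags, so the net effect is a list kept sorted by tag with the winner at the head. Here a larger tag is assumed to win; reverse the comparison for smallest-tag-first scheduling.

```python
sorted_list = []   # minicells resident in the chain of buffering units, head first

def better(tag_a, tag_b):
    return tag_a > tag_b                   # predefined comparison logic of the units

def insert_from_front(minicell, tag):
    """Insert a new minicell, starting the comparison wave at the head unit."""
    for i, (resident_tag, _) in enumerate(sorted_list):
        if better(tag, resident_tag):
            sorted_list.insert(i, (tag, minicell))   # resident loses, is pushed backward
            return
    sorted_list.append((tag, minicell))              # new minicell lost every comparison

def pop_winner():
    """Let the head-of-list minicell exit, unless the blocking capability holds it."""
    return sorted_list.pop(0) if sorted_list else None
```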
  • the generalized sequencer circuit can use the arithmetic and logical units 56 in Figure 7 to recalculate minicell tag values "on the fly" as the minicells are flowing inside the sequencer.
  • the feature is required in scheduling algorithms where the arrival of a packet to a queue can trigger the recalculation of tags in related minicells.
  • in scheduling algorithms such as hierarchical weighted fair queuing [Zhang 1995], recalculations may be required upon arrival of a minicell to an empty queue.
  • the insertion of a minicell with appropriate flag information indicating an arrival to an empty queue can trigger the recalculation of tag values of related minicells as the wave generated by the arriving minicell propagates through the sequencer.
  • each logical subqueue can represent minicells belonging to a given priority class. Additional functions can be implemented within each logical subqueue such as age control, discarding over-aged cells and so on.
  • weighted fair queuing and priority-based algorithms can be handled by the same hardware.
  • In weighted fair queuing, the tag reflects the finish number of a data packet, which is determined upon its arrival, and the minicells are served in the order of their finish numbers [Keshav page 239].
  • In the second case, the tag reflects the priority level of a cell and the packets are served in the order of their priority levels [Hashemi 1997a]. In either case the comparison hardware determines the larger of two numerical values.
  • the tag can be composed of several distinct fields, each controlling an aspect in the sequencing of the cells.
  • an age field can be used, in conjunction with the priority field, to arrange cells of the same priority in order of age as shown in Figure 8.
  • programmable options can be implemented in the hardware and controlled by flag bits in the tag. For example, discarding over-aged cells, aging, and joining of the priority and age fields together can be enabled or disabled by using flag bits in the tag field of each cell.
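One possible composition of such a tag, shown only as an assumed example, packs a priority field into the high-order bits and an age field into the low-order bits, so that a single numerical comparison orders minicells first by priority and, within a priority, by the age field.

```python
PRIORITY_BITS = 4       # example field widths; the patent leaves these open
AGE_BITS = 12

def make_tag(priority: int, age: int) -> int:
    """Compose a tag whose numerical comparison sorts by priority, then age."""
    return (priority << AGE_BITS) | (age & ((1 << AGE_BITS) - 1))

def split_tag(tag: int):
    """Recover the priority and age fields from a composite tag."""
    return tag >> AGE_BITS, tag & ((1 << AGE_BITS) - 1)
```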
  • programmable logic can be used for the comparison logic units in the sequencer, allowing flexibility in the implementation of scheduling algorithms.
  • a sequencer circuit with generic comparison logic can be manufactured and later customized for specific scheduler implementations by appropriately programming the logic section.
  • the exact definition of the tag field which includes the type, length, and position of each field in the tag relating to the properties such as the age and priority can be defined according to the desired scheduling algorithm after the manufacture of the circuit.
  • the operations used to compare and operate on the tags can also be defined after the manufacture of the circuit.
  • An important aspect of the present invention is the capability of scheduling of data packets that are destined to different output ports using a single scheduler system.
  • a generalized single-queue sequencer circuit is used as the scheduler circuit of the HoL sequencer part of the present invention to realize the aspect of the present invention.
  • a single-queue sequencer circuit is described in [Leon-Garcia and Hashemi, "The Single Queue Switch", Canadian Application No. 2,227,655].
  • a generalized single-queue circuit is obtained by replacing the buffering units of the single queue sequencer circuit with the novel buffering units 53 of the generalized sequencer circuit 52 of this invention.
  • the generalized sequencer circuit 52 can schedule the transfer of data packets from a multiplicity of input ports to a single output port.
  • the generalized single-queue sequencer of this invention can combine N generalized sequencers, each dedicated to a unique output port, into a single sequencer circuit.
  • the sequencer circuit in the HoL sequencer part of the present invention has a multiplexer 61 and output controller 62 that manages the input and output of the generalized single-queue sequencer as shown in Figure 9.
  • the generalized single-queue sequencer 60 in Figure 9 uses the principles of the single- queue switch architecture described in [Leon-Garcia and Hashemi, "The Single Queue Switch", Canadian Application No. 2,227,655] and Hashemi [1997d] and introduces a novel buffering element that includes arithmetic and logical processing.
  • In the single-queue switch, the N output queues that correspond to N output ports are interleaved into a single sequencer circuit using a grouping algorithm.
  • the generalized sequencer 52 of Figure 7 has the capability of organizing the logical queues within the same physical queue.
  • the same capability is used to interleave the queues of different output ports into the same physical sequencer circuit.
  • the minicells destined for a given output port are said to belong to the same logical queue.
  • the interleaving mechanism is as follows.
  • the sequence of minicells is divided into groups. The first group contains the first minicell of each logical (output) queue.
  • the second group contains the second minicell of each logical (output) queue and so on. Within each group the cells are placed in order of increasing logical output queue number.
  • An example of the interleaved sequence of the minicells is shown in Figure 10.
  • the grouping mechanism is implemented simply by exploiting the basic function of tagging and tag comparison in the generalized sequencer. Every minicell entering the generalized single-queue sequencer carries a field in its tag that indicates its output port number. Inside the generalized single-queue sequencer the output port number field of the arriving minicell is compared to the existing minicells in each group. If there is a minicell with the same port number in the group, the minicell is sent to the next group. Within a group, a minicell is inserted in order of increasing output port number. This ordering is accomplished by comparison of the output port number field of a new minicell and minicells in a group.
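Ignoring for the moment the per-port priority comparison described later, the net effect of the grouping rule can be modeled in software as follows; this is an illustrative model only, with each group holding at most one minicell per output port and one group served per scheduling round.

```python
groups = []   # each group maps output_port -> minicell, at most one minicell per port

def insert_minicell(output_port: int, minicell):
    """Place the minicell in the first group that does not yet contain its port."""
    for group in groups:
        if output_port not in group:
            group[output_port] = minicell
            return
    groups.append({output_port: minicell})     # start a new group at the tail

def serve_round():
    """One scheduling round: the head group exits, one minicell per port, in port order."""
    if not groups:
        return []
    head = groups.pop(0)
    return [head[port] for port in sorted(head)]
```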
  • the generalized single-queue sequencer operates according to scheduling rounds of fixed duration.
  • M minicells can enter a scheduler system during a scheduling round.
  • one group of minicells leaves the single-queue sequencer every scheduling round.
  • the minicells exit the sequencer sequentially in order of output port number.
  • the scheduler system outputs a maximum of N minicells in a scheduling round when the HoL group has one minicell for each output port.
  • if the HoL group does not contain N minicells, then one or more of the output ports will not be utilized during a scheduling round.
  • the time to transfer the group remains constant.
  • the output will remain idle during the turns of the absent minicells. Obviously, the queue will not move forward during this time period necessitating the use of the aforementioned blocking mechanism 63 in Figure 9.
  • the generalized single-queue sequencer can handle variable length packets as follows. If the port corresponding to a given minicell in the HoL group has not completed its previous packet transmission, then said minicell is sent back into the generalized single-queue sequencer (see the sketch below).
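Continuing the same software model (and reusing `insert_minicell` from the previous sketch), one scheduling round might look as follows. The `port_busy` flags and the re-insertion of blocked minicells are a loose stand-in for the blocking mechanism 63 and the variable-length handling described above, not the circuit's actual implementation.

```python
def scheduling_round(groups, port_busy, num_ports):
    """Emit at most one minicell per output port from the head-of-line group.
    Ports with no minicell in the HoL group stay idle for their time slot;
    minicells whose port is still busy with a previous variable-length packet
    are sent back into the sequencer for a later round."""
    if not groups:
        return []
    hol = groups.pop(0)                                   # head-of-line group leaves the sequencer
    sent = []
    for port in range(num_ports):                         # one time slot per output port
        cell = next((c for c in hol.cells if c.output_port == port), None)
        if cell is None:
            continue                                      # output idle during this turn
        if port_busy.get(port, False):
            insert_minicell(groups, cell)                 # blocked: back into the sequencer
        else:
            sent.append(cell)
    return sent
```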
  • each logical queue can be viewed as an independent virtual queue operating on a distinct generalized sequencer.
  • Each logical queue can then implement a different scheduling algorithm by invoking different features of the comparison logic and arithmetic logic in each buffering element.
  • the tagging circuit must operate on each minicell according to the scheduling algorithm that corresponds to the specific minicell.
  • Each minicell must carry flag bits that invoke the appropriate processing in each unit.
  • the output port field of the transit minicell is compared to the output port field of the minicell in the buffering unit. If the two output port fields are different, the transit minicell is declared the loser and is passed to the next unit. If the two output port fields are the same, the priority fields of the two minicells are compared to each other first. If the transit minicell is the loser, the procedure continues as before. If the transit minicell is the winner, then the transit minicell is placed in the unit and the resident minicell is pushed out. In this case the loser minicell is passed to the next group, as illustrated in the sketch below.
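The per-unit decision just described reduces to a two-level comparison: first on the output port field, then on the priority field. A minimal sketch follows, assuming a smaller priority value wins; in the circuit the comparison operations themselves are programmable, so this ordering is only an example.

```python
def resolve_unit(transit, resident):
    """Return (cell kept in the unit, cell passed on) for one buffering unit.
    Minicells for different output ports never displace each other; for the
    same port the priority fields decide, and the displaced cell is passed
    onwards as the loser."""
    if transit.output_port != resident.output_port:
        return resident, transit          # transit loses and moves to the next unit
    if transit.priority < resident.priority:
        return transit, resident          # transit takes the unit; resident is pushed out
    return resident, transit
```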
  • a multicasting mechanism can be incorporated in the multiple-port implementation of the present invention.
  • a packet can be scheduled for transmission to its destination output ports independently, while keeping only one copy of the original data packet outside the scheduler system and using only one minicell in the scheduler system.
  • the multicasting capability of the scheduler of the present invention can be used for multicasting in ATM switches. It can also be exploited as a complementary facility for multicasting in IP routers. The details of the multicasting mechanism are as follows.
  • Multicast minipackets are put in a different set of queues than unicast minipackets by the queue control part.
  • the HoL minicell of each multicast queue is sent to the sequencer circuit.
  • These multicast minicells are treated by the generalized single-queue sequencer circuit as belonging to a separate virtual queue.
  • the multicast logical queue is treated as if it corresponds to the queue for a fictitious output port with a number (say port #0) that precedes the number of all other output ports.
  • the first timeslot of every scheduling round of the generalized single-queue sequencer is dedicated to the virtual multicast port (port #0).
  • the head of line multicast minicell in the first group of the sequencer circuit is sent to a multicast controller module 72 of Figure 11.
  • the destination list of said multicast minicell is retrieved and is stored in a register in the multicast controller as a bitmapped list.
  • the destination list can be part of the minicell set by the routing protocol in the switch.
  • the list can also be retrieved from a local look up table set per connection for connection-oriented switches such as ATM.
  • the HoL unicast minicell for said output port is sent to the multicast controller.
  • if the output port is one of the destinations of the multicast minicell, a copy of the multicast minicell is sent to the output by the output controller 71 and the corresponding bit in the destination list is reset to 0.
  • the HoL unicast minicell is sent back to the sequencer in this case using multiplexer 74. If the destination list becomes empty in a given time slot, the copy of the multicast minicell that is sent out is marked as the final copy of the multicast minicell.
  • the minipacket corresponding to said minicell is output from the scheduler system.
  • the corresponding minipacket is not removed from the minipacket memory if the minicell is not the final copy of a multicast minicell.
  • the HoL selector transfers the next HoL minicell of the same multicast queue to the HoL sequencer part.
  • the queue control part clears the minicell from the memory by recycling the pointers similar to pointers of normal minicells.
  • the minipacket that is output from the scheduler system is marked to indicate that it corresponds to the last copy of the data packet.
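A software sketch of one output time slot of the multicast handling above. The bitmapped destination list, the multicast controller 72, the output controller 71 and multiplexer 74 are modelled very loosely; the function and variable names are invented for illustration, and `insert_minicell` is reused from the earlier sketch.

```python
def multicast_time_slot(mc_cell, dest_bitmap, port, hol_unicast, outputs, groups):
    """If `port` is a remaining destination of the multicast minicell, send a
    copy of it to that output, clear the corresponding bit, and return the
    HoL unicast minicell for the port to the sequencer; otherwise the unicast
    minicell goes out as usual. A cleared bitmap marks the final copy."""
    if dest_bitmap & (1 << port):
        dest_bitmap &= ~(1 << port)                       # reset this destination's bit
        is_final_copy = (dest_bitmap == 0)
        outputs[port] = ("multicast copy", mc_cell, is_final_copy)
        if hol_unicast is not None:
            insert_minicell(groups, hol_unicast)          # unicast HoL cell re-enters the sequencer
    elif hol_unicast is not None:
        outputs[port] = ("unicast", hol_unicast, True)    # no multicast copy: unicast goes out
    return dest_bitmap
```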
  • Packets can be discarded in the HoL sequencer part of the present invention.
  • a minicell can be marked as discarded.
  • a discarded minicell is then output from the scheduler system so that the original data packet can be discarded.
  • the discarding mechanism is implemented as follows.
  • a minicell may be marked as discarded if it is found to exceed a preset age limit.
  • a discard control circuit sends such a minicell back to the queue control part, where the corresponding minipacket is removed from memory; the discarded minicell is then output from the scheduler system so that the corresponding packet can be discarded.
  • the HoL minicell replacement is performed for discarded minicells in the same manner as for normal minicells.
  • a discarded multicast minicell is handled in the same manner as a discarded unicast minicell.
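As an illustration of the age-based discard rule, a minimal sketch: the `arrival_time` attribute, the `discarded` flag and the age limit are assumptions made for the sketch, whereas the circuit works on an age field carried in the minicell tag.

```python
def age_out_hol_cells(hol_cells, now, age_limit):
    """Split HoL minicells into normal and discarded ones by a preset age
    limit. Discarded cells still flow out of the scheduler system so that
    the queue control part can release the stored packet."""
    normal, discarded = [], []
    for cell in hol_cells:
        if now - getattr(cell, "arrival_time", now) > age_limit:
            cell.discarded = True          # flag examined by the queue control part
            discarded.append(cell)
        else:
            normal.append(cell)
    return normal, discarded
```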
  • the present invention can be integrated in a switch or router or subsystem thereof to queue and schedule the transmission of packets on one or a multiplicity of output ports.
  • the integration of the system of the present invention in switching and routing systems can be done in different ways depending on the architecture of the switching or the routing system.
  • FIG. 12 shows a block diagram of a switching system 80 that incorporates the present invention as a centralized queuing and scheduling controller.
  • Data packets arriving from a multiplicity of input ports 82 are stored and for each packet a minicell is sent to the controller.
  • the minicell contains information regarding the class, priority, output port number and a pointer to the storage location of the data packet.
  • Such a minicell can be composed and presented to the queue controller and scheduler module in different ways. However, the location of each information field must be known to the interface logic that is used to integrate the module into the system; one possible layout is sketched below.
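The field widths and their order in the following packing sketch are purely illustrative; as stated above, the actual layout is fixed only by agreement with the interface logic.

```python
# Illustrative packing of a minicell word; the widths and field order are
# examples only -- the patent leaves the exact layout to the interface logic.
CLASS_BITS, PRIO_BITS, PORT_BITS, PTR_BITS = 4, 4, 8, 16

def pack_minicell(pkt_class, priority, output_port, storage_ptr):
    word = pkt_class
    word = (word << PRIO_BITS) | priority
    word = (word << PORT_BITS) | output_port
    word = (word << PTR_BITS) | storage_ptr
    return word

def unpack_minicell(word):
    storage_ptr = word & ((1 << PTR_BITS) - 1); word >>= PTR_BITS
    output_port = word & ((1 << PORT_BITS) - 1); word >>= PORT_BITS
    priority    = word & ((1 << PRIO_BITS) - 1); word >>= PRIO_BITS
    pkt_class   = word
    return pkt_class, priority, output_port, storage_ptr
```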
  • the minicells are queued and scheduled based on the information contained in them.
  • the scheduler system selects a minicell for each output port during each scheduling period.
  • Each scheduling period contains one time-slot per output port.
  • Selected "winner" minicells are transferred to the switch fabric during their corresponding time-slots.
  • the minicell refers to a packet that is stored in the fabric. The packet is then retrieved and transmitted to the output port according to the switch fabric architecture, as sketched below. The queuing and scheduling controller must be fast enough to handle all N ports in this manner.
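A sketch of the per-period transfer implied above: each winner minicell carries a pointer to the stored packet, which is retrieved and handed to the fabric during the time slot of its output port. `packet_store` and `fabric` are placeholders for the storage and fabric interfaces, not components named in the patent.

```python
def transfer_winners(winners, packet_store, fabric):
    """For each selected minicell, look up the stored data packet through the
    pointer carried in the minicell and transmit it towards its output port
    during the corresponding time slot."""
    for cell in winners:
        packet = packet_store[cell.payload]       # pointer to the packet's storage location
        fabric.send(cell.output_port, packet)     # hand off during this port's time slot
```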
  • a classifier circuit can be used to perform packet classification and to provide class and priority information required by the queuing and scheduling controller module.
  • the classifier receives the packets, or the headers of the packets, or the information in the headers, determines the priority, class, or type of the packet, and, along with the destination port number, sends this information and a queue index to the queuing and scheduling system.
  • the classifier circuit can be combined with the queuing and scheduling controller circuit.
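For illustration, a toy classifier along the lines just described: the header fields, the DSCP thresholds and the queue-index rule are invented for the sketch and are not prescribed by the invention.

```python
def classify(header):
    """Map header fields to a class, priority, destination port and queue
    index for the queuing and scheduling controller."""
    dscp = header.get("dscp", 0)
    dest_port = header["output_port"]
    if dscp >= 46:
        pkt_class, priority = "expedited", 0
    elif dscp >= 10:
        pkt_class, priority = "assured", 1
    else:
        pkt_class, priority = "best-effort", 2
    queue_index = dest_port * 3 + priority        # one queue per (port, priority) pair
    return pkt_class, priority, dest_port, queue_index
```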
  • the multiple-port queuing and scheduling system can also be used in a multiport line card in which the queuing and scheduling system handles the buffer management and scheduling of data packets destined for a multiplicity of output ports on the line card.
  • data packets are transferred from the switch fabric of a switch or router to an output line card 80.
  • the data packets are stored in memory and the scheduler system 81 is used to implement buffer management, scheduling, and other packet management functions.
  • the scheduler system determines the order in which data packets are transmitted on the multiplicity of output ports 82 in the line card.
  • a switch or router may have several line cards, each containing a number of output ports.
  • the queuing and scheduling system of the present invention can be used in a single-port line card 85 in Figure 14 for queuing and scheduling the packets destined for an output port on a line card.
  • a generalized sequencer circuit can be used as the sequencer circuit of the scheduling part of the present invention in this case.
  • the queuing and scheduling controller of the present invention as described in the above example can be implemented as a single-board card that can be added to a system such as a personal computer that is used as a switch or router.
  • the queuing and scheduling controller directs the transfer of data packets from the input line card across a standard bus such as PCI (Peripheral Component Interconnect) to the output line card.
  • the single-port and multi-port queuing and scheduling system 89 can also be used in an input line card 88 to process packets as they enter a switch or router and prior to transfer across a switching fabric as shown in Figure 15.
  • the arriving data packets are stored in memory and the queuing and scheduling system implements policing, metering and buffer management functions.
  • the queuing and scheduling system may also be used to schedule the order in which packets are transferred across the switch fabric.
  • the multi-port queuing and scheduling system can be used to schedule the transfer of packets in switch and router designs that use several parallel fabrics to transfer packets between line cards.
  • said unique identifier is the packet's storage location.
  • said queues are implemented using a bank of linked-list queues that can be assigned flexibly and arbitrarily to the minipackets in the system.
  • said bank of linked-list queues is scalable.
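A minimal software model of such a bank of linked-list queues, assuming a single descriptor memory whose slots are threaded by per-slot next pointers and recycled through a free list; the class and method names are illustrative, and the hardware uses pointer memories rather than Python lists and dictionaries.

```python
class LinkedListQueueBank:
    """Bank of FIFO queues implemented as linked lists over one descriptor
    array, so queues can be assigned flexibly to indices and grow only as
    minipackets arrive."""
    def __init__(self, capacity):
        self.next = [None] * capacity            # per-slot "next" pointers
        self.free = list(range(capacity))        # free-slot recycling list
        self.data = [None] * capacity
        self.head = {}                           # queue index -> first slot
        self.tail = {}                           # queue index -> last slot

    def enqueue(self, qid, minipacket):
        slot = self.free.pop()                   # raises if the bank is full
        self.data[slot], self.next[slot] = minipacket, None
        if qid in self.tail:
            self.next[self.tail[qid]] = slot
        else:
            self.head[qid] = slot
        self.tail[qid] = slot

    def dequeue(self, qid):
        slot = self.head.get(qid)
        if slot is None:
            return None
        minipacket = self.data[slot]
        nxt = self.next[slot]
        if nxt is None:
            del self.head[qid], self.tail[qid]
        else:
            self.head[qid] = nxt
        self.free.append(slot)                   # recycle the pointer/slot
        return minipacket
```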
  • said scheduler for scheduling routing of packets between an input port and an output port, wherein each packet has an index that specifies both a unique destination output port for the packet and membership in a subclass, the scheduler comprising:
  • a memory for storing minipacket information corresponding to an arriving packet, including the index of the packet, a unique identifier, and scheduling information;

Abstract

The invention relates to a method and apparatus for managing the buffering and scheduling the transfer of data packets arriving on one or more input ports and destined for one or more output ports of a packet switch or router, or a subsystem thereof. An index is assigned to each arriving packet; the index specifies both a unique destination output port for the packet and membership in a subclass, such as a priority class, a connection, or a flow. For each arriving packet, a minipacket is created that contains the index of the packet and a unique identifier, such as a storage location. A queue is assigned to each index, in which the minipackets bearing that index are placed in their order of arrival. Packets are transmitted from the storage location to the output ports according to the sequence of minipackets produced by the scheduler system.
PCT/CA2000/001389 1999-11-24 2000-11-24 Programmateur de paquets programmable grande vitesse et gestionnaire de mémoire tampon WO2001039430A2 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU16848/01A AU1684801A (en) 1999-11-24 2000-11-24 A high-speed, programmable packet scheduler and buffer manager

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CA2,290,265 1999-11-24
CA002290265A CA2290265A1 (fr) 1999-11-24 1999-11-24 Ordonnanceur programmable haute vitesse de paquets de donnees et gestionnaire de memoire tampon

Publications (2)

Publication Number Publication Date
WO2001039430A2 true WO2001039430A2 (fr) 2001-05-31
WO2001039430A3 WO2001039430A3 (fr) 2001-10-18

Family

ID=4164692

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2000/001389 WO2001039430A2 (fr) 1999-11-24 2000-11-24 Programmateur de paquets programmable grande vitesse et gestionnaire de mémoire tampon

Country Status (3)

Country Link
AU (1) AU1684801A (fr)
CA (1) CA2290265A1 (fr)
WO (1) WO2001039430A2 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7974191B2 (en) * 2004-03-10 2011-07-05 Alcatel-Lucent Usa Inc. Method, apparatus and system for the synchronized combining of packet data
CN106528598B (zh) * 2016-09-23 2019-10-18 华为技术有限公司 一种链的管理方法及物理设备

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5850395A (en) * 1995-07-19 1998-12-15 Fujitsu Network Communications, Inc. Asynchronous transfer mode based service consolidation switch

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Hashemi, M. R. et al., "A General Purpose Cell Sequencer/Scheduler for ATM Switches", Kobe, April 7-12, 1997, Los Alamitos, CA: IEEE Computer Society, US, 7 April 1997, pages 29-37, XP000850281, ISBN 0-8186-7782-1; cited in the application *
Hashemi, M. R. et al., "The Single-Queue Switch: A Building Block for Switches with Programmable Scheduling", IEEE Journal on Selected Areas in Communications, IEEE Inc., New York, US, vol. 15, no. 5, 1 June 1997, pages 785-794, XP000657032, ISSN 0733-8716; cited in the application *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1271859A1 (fr) * 2001-06-21 2003-01-02 Alcatel internetworking Elimination aléatoire précoce pour un commutateur de données à commutation de cellules
US6937607B2 (en) 2001-06-21 2005-08-30 Alcatel Random early discard for cell-switched data switch
WO2004080149A3 (fr) * 2003-03-07 2005-07-28 Cisco Tech Ind Systeme et procede pour l'ordonnancement dynamique dans un processeur de reseau
US7039914B2 (en) 2003-03-07 2006-05-02 Cisco Technology, Inc. Message processing in network forwarding engine by tracking order of assigned thread in order group
US7287255B2 (en) 2003-03-07 2007-10-23 Cisco Technology, Inc. System and method for dynamic ordering in a network processor
CN100392602C (zh) * 2003-03-07 2008-06-04 思科技术公司 用于在网络处理器中动态排序的系统和方法
EP1626544A1 (fr) * 2003-03-13 2006-02-15 Alcatel Amélioration du calcul de la longueur moyenne d'une file d'attente en vue de l'élimination aléatoire précoce de paquets de données (RED)
WO2017219993A1 (fr) * 2016-06-22 2017-12-28 新华三技术有限公司 Planification de paquet
CN107528789A (zh) * 2016-06-22 2017-12-29 新华三技术有限公司 报文调度方法及装置
CN107528789B (zh) * 2016-06-22 2020-02-11 新华三技术有限公司 报文调度方法及装置

Also Published As

Publication number Publication date
AU1684801A (en) 2001-06-04
WO2001039430A3 (fr) 2001-10-18
CA2290265A1 (fr) 2001-05-24

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase