WO2002073884A9 - A method and system for bandwidth allocation tracking
- Publication number
- WO2002073884A9 PCT/US2002/007122
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- queues
- bandwidth
- group
- network
- mps
Classifications
- H04L47/30—Flow control; Congestion control in combination with information about buffer occupancy at either end or at transit nodes
- H04L12/2852—Metropolitan area networks
- H04L47/2433—Allocation of priorities to traffic types
- H04L47/2441—Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
- H04L47/50—Queue scheduling
- H04L47/52—Queue scheduling by attributing bandwidth to queues
- H04L47/525—Queue scheduling by attributing bandwidth to queues by redistribution of residual bandwidth
- H04L47/562—Attaching a time tag to queues
Definitions
- the present invention relates to the field of packet data networks. More specifically, the present invention pertains to a data flow control method and system for managing the data flow with respect to the available bandwidth in a metro packet transport ring network.
- the present disclosure describes a system and method for bandwidth allocation tracking in a packet data network
- the Internet is a general purpose, public computer network which allows millions of computers all over the world, connected to the Internet, to communicate and exchange digital data with other computers also coupled to the Internet.
- the speed at which one can connect onto the Internet is ever increasing.
- users on the Internet have the bandwidth to participate in live discussions in chat rooms, play games in real-time, watch streaming video, listen to music, shop and trade on-line, etc.
- the bandwidth will be such that video-on-demand, HDTV, IP telephony, video teleconferencing, and other types of bandwidth intensive applications will soon be possible.
- One approach by which bandwidth is being increased relates to fiber optics technology.
- QoS quality of service
- a flow refers to the transmission of packets from a sender to a receiver to support an application, such as transferring a Web page, implementing a voice over IP conversation, playing a video, or the like.
- Some flows are described as real time flows since they require very low latency (e.g., a voice over IP application).
- Other flows are not so much latency dependent as they are consistent data transfer rate dependent (e.g., video over the Web).
- for real-time application flows such as video on demand, HDTV, voice communications, etc., dropped packets or late-arriving packets of the flows can seriously disrupt or even destroy performance.
- ISP's Internet Service Providers
- ASP's Applications Service Providers
- web sites/portals web sites/portals, and businesses
- for ISPs, ASPs, web sites/portals, and businesses, it is of paramount importance that they have the ability to provide these flows with a certain minimum threshold bandwidth and/or latency.
- an e-commerce or business web site may lose critical revenue from lost sales due to customers not being able to access their site during peak hours.
- TDM time division multiplexing
- T-carrier services e.g., T1 line for carrying data at 1.544 Mbits/sec. and T3 line for carrying data at a much faster rate of 44.736 Mbits/sec.
- T1 and T3 lines are dedicated point-to-point datalinks leased out by the telephone companies.
- the telephone companies typically charge long distance rates (e.g., $1,500-$20,000 per month) for leasing out a plain old T1 line.
- SONET Synchronous Optical Network
- SONET uses TDM to assign individual channels, or flows, to pre-determined time slots.
- each channel is guaranteed its own specific time slot in which it can transmit its data.
- although TDM enables QoS, it is costly to implement because both the transmitter and receiver must be synchronized at all times. The circuits and overhead associated with maintaining this precise synchronization are costly.
- TDM based networking technologies are inefficient with respect to unused time slots. If flows are inactive, their allocated bandwidth is wasted. In general, with TDM technologies, unused bandwidth from inactive flows is not reallocated to other users.
- Asynchronous data transmission schemes provide numerous advantages when compared to synchronous TDM type schemes, and as such, are generally overtaking synchronous technologies in both voice and data network installations (e.g., the IP based networks of the Internet).
- asynchronous schemes usually function by reserving a portion of their bandwidth for "high priority" latency sensitive flows.
- QoS performance deteriorates with the increasing bandwidth utilization of the network.
- Such schemes either maintain a large margin of unused bandwidth to ensure QoS, thereby virtually guaranteeing an under-utilization of available total bandwidth, or over-allocate bandwidth, leading to the abrupt dropping of data for some users and/or ruining QoS for any high priority users.
- the required solution should be able to track individual flows on an individual basis, in order to ensure individual flows are not starved of bandwidth, while simultaneously ensuring bandwidth is not over-allocated to flows which do not require it.
- the required solution should be able to track when individual flows are active and when they are inactive, thereby allowing the bandwidth allocated to the inactive flows to be reassigned to those flows in need of it.
- the required solution should be capable of tracking total allocated bandwidth in real time, thereby allowing efficient allocation of unused bandwidth in real time while maintaining QoS.
- the real-time total allocated bandwidth tracking should allow the dynamic allocation of unused bandwidth in real-time.
- the present invention provides a novel solution to the above requirements.
- the present invention comprises a method and system that provides the advantages of asynchronous data networks while efficiently implementing QoS.
- the present invention enables the efficient allocation of available bandwidth, thereby allowing guaranteed QoS.
- the present invention is able to track individual flows on an individual basis, in order to ensure individual flows are not starved of bandwidth, while simultaneously ensuring bandwidth is not over-allocated to flows which do not require it.
- the present invention can track when individual flows are active and when they are inactive, thereby allowing the bandwidth allocated to the inactive flows to be reassigned to those flows in need of it.
- the present invention can track total allocated bandwidth in real time, thereby allowing efficient allocation of unused bandwidth in real time while maintaining QoS.
- the real-time total allocated bandwidth tracking allows the dynamic allocation of unused bandwidth in real-time, while maintaining QoS.
- the present invention is a system for maintaining an accurate total of the amount of allocated bandwidth on the network, as implemented within a metropolitan area switch (MPS) that functions by allocating bandwidth of a metropolitan area network.
- MPS metropolitan area switch
- a plurality of incoming packets are assigned to a respective plurality of queues of the MPS.
- a finish time for each respective queue is computed, the finish time describing a time at which the respective queue will be emptied using the output rate.
- the plurality of queues are grouped into multiple groups in accordance with their respective finish times. These groups are referred to as "buckets" due to the fact that they include those queues having the same finish times.
- the earliest group includes the reserved bandwidth of those queues having a finish time indicating an empty condition at a first time increment.
- the second earliest group includes the reserved bandwidth of those queues having a finish time indicating an empty condition at a second time increment later than the first time increment, and so on.
- bucket 0 contains those queues which will be empty at the next time increment
- bucket 1 contains those queues that will be empty at the next two time increments, and so on.
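The finish-time computation and the grouping into buckets described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the use of bits and a fixed `increment` as units, and the dictionary layout are all assumptions introduced here.

```python
import math
from collections import defaultdict


def finish_time(now, queue_depth_bits, output_rate):
    """Time at which a queue drains to empty at its output rate."""
    return now + queue_depth_bits / output_rate


def bucket_index(finish, now, increment):
    """Bucket 0: empty by the next increment; bucket 1: by the one after; and so on."""
    remaining = max(finish - now, 0.0)
    return max(math.ceil(remaining / increment) - 1, 0)


def group_into_buckets(queues, now, increment):
    """queues maps a queue name to (depth_bits, output_rate); both are hypothetical fields."""
    buckets = defaultdict(list)
    for name, (depth, rate) in queues.items():
        finish = finish_time(now, depth, rate)
        buckets[bucket_index(finish, now, increment)].append(name)
    return dict(buckets)
```

With three queues of depth 1000, 2000, and 500 bits draining at 1000 bits per increment, the first and third fall into bucket 0 and the second into bucket 1.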
- the amount of allocated bandwidth on the network is determined by counting the reserved bandwidth of all active flows.
- the first time increment, second time increment, and the like are indexed with respect to a schedule clock.
- One increment of the schedule clock comprises one complete round robin arbitration (e.g., per queue output onto the metropolitan area network) of all active queues within the MPS.
- the earliest group thus indicates those queues that will have an empty condition at a next time increment (e.g., output round) of the schedule clock.
- a new finish time is computed for each respective queue when a new packet is received by the respective queue.
- the series of buckets are progressively "emptied" as the schedule clock progresses, and new buckets are filled as new queues receive new packets for transmission and new associated empty times.
- the queues that are empty at the next time increment indicate those flows that will be inactive at the next time increment.
- the bandwidth allocated to those flows can be reallocated. In this manner, the determination of the amount of allocated bandwidth can be accomplished in real time, thereby allowing the efficient allocation of unallocated bandwidth in real time while maintaining quality of service.
- the earliest bucket e.g., bucket 0
- embodiments of the present invention can efficiently scale up to handle an extremely large number (e.g., 1 million or more) individual flows.
- the flows are assigned to buckets as described above on an individual basis.
- Their condition (active vs. inactive) is individually tracked in real-time, allowing their allocated bandwidth for inactive flows to be reallocated to active flows in real time.
- the present invention enables the efficient allocation of available bandwidth, since the MPS is capable of tracking total allocated bandwidth in real time. This allows the efficient allocation of unused bandwidth in real time while maintaining QoS.
- Figure 1 shows the overall architecture of the asynchronous metro packet transport ring network according to the currently preferred embodiment of the present invention.
- Figure 2 shows an exemplary Metro Packet Transport Ring.
- Figure 3 shows an exemplary diagram of components of an MPTR.
- Figure 4 shows a diagram of a set of MPS units and ring segments as implemented within an exemplary system in accordance with one embodiment of the present invention.
- Figure 5 shows a diagram of a queue of an MPS and its associated finish time.
- Figure 6A shows a diagram depicting the multi-group queuing process in accordance with one embodiment of the present invention.
- Figure 6B graphically depicts the summation of all $r_i$ and $w_i$ in accordance with one embodiment of the present invention.
- Figure 7 shows a diagram of a bucket information base (BIB) in accordance with one embodiment of the present invention.
- Figure 8 shows a flow information base (FIB) in accordance with one embodiment of the present invention.
- Figure 9 shows a flow chart of the steps of a bandwidth tracking and allocation process in accordance with one embodiment of the present invention.
- Embodiments of the present invention are directed to a method and system for maintaining an accurate total of the amount of allocated bandwidth on a network, as implemented within a metropolitan area switch (MPS).
- the present invention provides the advantages of asynchronous data networks while efficiently implementing QoS.
- the present invention enables the efficient allocation of available bandwidth, thereby allowing guaranteed QoS.
- the present invention is capable of tracking total allocated bandwidth in real time, thereby allowing efficient allocation of unused bandwidth in real time while maintaining QoS.
- Figure 1 shows an overall architecture of an asynchronous metro packet transport ring network in accordance with a currently preferred embodiment of the present invention.
- a metropolitan packet transport ring consists of a ring which is laid to transmit data packets in a metropolitan area network (MAN).
- MAN is a backbone network which spans a geographical metropolitan area.
- telephone companies, cable companies, and other telecommunications providers supply MAN services to other companies, businesses, and users who need access to networks spanning public rights-of-way in metropolitan areas.
- the communications channel of the MPTR is implemented using a ring topology of installed fiber optic cables.
- Other less efficient transmission mediums such as hybrid fiber coax, coax cables, copper wiring, or even wireless (radio frequency or over-the-air laser beams) can be used or substituted in part thereof.
- Users coupled to a particular MPTR can transmit and receive packetized data to/from each other through that MPTR.
- a personal computer coupled to MPTR1 can transmit and receive data packets to/from a server also coupled to MPTR1.
- data packets originating from one MPTR can be routed to another MPTR by means of a router.
- a computer coupled to MPTR1 can transmit data packets over its fiber ring to a router 101 to MPTR2. The data packets can then be sent to their final destination (e.g., a computer coupled to MPTR2) through the fiber ring associated with MPTR2.
- MPTR rings can be of various sizes and configurations. Although the currently preferred embodiment contemplates the use of a ring, the present invention can also utilize other types of topologies.
- the MPTRs can also be coupled onto the Internet backbone via a router.
- MPTR1 can be coupled to a dense wavelength division multiplexed (DWDM) fiber backbone 102 by means of router 101.
- DWDM dense wavelength division multiplexed
- users coupled to MPTR1 have access to the resources available on traditional Internet 103.
- the present invention can be used in conjunction with traditional Internet schemes employing standard routers, switches, and other LAN equipment 104-107. And any number of MPTR's can thusly be coupled together to gracefully and cost-efficiently scale to meet the most stringent networking demands which may arise.
- MPTR may be added to accommodate the increased load.
- These MPTR's can be coupled to the same router (e.g., MPTR5, MPTR6, and MPTR7) or may alternatively be coupled to different routers.
- an MPTR can be used to support one or more LANs.
- MPTR6 may support traffic flowing to/from LAN 108.
- an MPTR may be coupled directly to another MPTR. In this manner, data flowing in MPTR8 can be directly exchanged with data packets flowing through MPTR7.
- a single MPTR can have multiple entries/exits.
- MPTR5 is coupled to both router 109 as well as router/switch 110. Thereby, users on MPTR5 have the ability to transmit and receive data packets through either of the two routers 109 or 110. Virtually any configuration, protocol, medium, and topology is made possible with the present MPTR invention.
- MPTR 200 is comprised of two fiber cable rings 201 and 202; a number of Metro Packet Switches (MPS1-MPSn); and a Ring Management System (RMS) 203.
- MPS1-MPSn Metro Packet Switches
- RMS Ring Management System
- the physical layer of an MPTR is actually comprised of two redundant fiber cable rings 201 and 202. Data packets flow in opposite directions through the two rings (e.g., clockwise in ring
- MPS Metro Packet Switches
- An MPS is coupled to both of the fiber rings 201 and 202. Thereby, if there is a break in one segment of the fiber ring, data can be redirected through one of the MPS's to flow through the other, operational fiber ring. Alternatively, traffic can be redirected to minimize localized congestion occurring in either of the rings.
- each MPTR can support up to
- An MPS is a piece of equipment which can be housed in specially designed environmental structures or it can be located in wiring closets or it can reside at a place of business, etc.
- the distances between MPS's can be variable. It is through an MPS that each individual end user gains access to the fiber rings 201 and 202. Each individual end user transmits packetized data onto the MPS first. The MPS then schedules how that packetized data is put on the fiber ring. Likewise, packetized data are first pulled off a fiber ring by the MPS before being sent to the recipient end user coupled to the MPS.
- a single MPS can support up to 128 end users.
- An end user can be added to an MPS by inserting a line interface card into that particular MPS.
- the line interface cards provide I/O ports through which data can be transferred between the MPS and its end users.
- Different line interface cards are designed in order to meet the particular protocol corresponding to that particular end user.
- Some of the protocols supported include T1, T3, SONET, Asynchronous Transfer Mode (ATM), digital subscriber line (DSL), Ethernet, etc.
- ATM Asynchronous Transfer Mode
- DSL digital subscriber line
- line interface cards can be designed to meet the specifications of future protocols.
- end users such as mainframe computers, workstations, servers, personal computers, set-top boxes, terminals, digital appliances, TV consoles, routers, switches, hubs, and other computing/processing devices, can gain access to either of the fiber rings 201 and 202 through an MPS.
- an MPS provides a means for inputting packetized data into the MPTR and also for outputting packetized data out from the MPTR.
- data packets are input to MPTR 200 via MPS 204 which is coupled to router 205.
- data packets are output from MPTR 200 via MPS 204 to router 205.
- An MPS receives upstream data packets forwarded from an upstream MPS via an input fiber port coupled to the fiber ring. Data packets received from the fiber ring are examined by that MPS. If the data packet is destined for an end user coupled to that particular MPS, the data packet is routed to the appropriate I/O port. Otherwise, the MPS immediately forwards that data packet to the next downstream MPS as quickly as possible. The data packet is output from the MPS by an output fiber port onto the fiber ring.
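The deliver-or-forward decision described above can be sketched in a few lines. The class and field names below (`Packet`, `MetroPacketSwitch`, `dest_user`, `local_ports`) are hypothetical and chosen only for illustration; the source does not specify data structures for this step.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dest_user: str          # hypothetical addressing field
    payload: bytes = b""


class MetroPacketSwitch:
    def __init__(self, name, local_ports):
        self.name = name
        self.local_ports = dict(local_ports)  # end user -> I/O port number
        self.delivered = []                   # (port, packet) pairs sent to local users
        self.forwarded = []                   # packets passed to the next downstream MPS

    def receive_from_ring(self, packet):
        """Route to a local I/O port if the destination is attached here, else pass it on."""
        port = self.local_ports.get(packet.dest_user)
        if port is not None:
            self.delivered.append((port, packet))
            return "deliver"
        self.forwarded.append(packet)
        return "forward"
```

A packet addressed to a user on this MPS is handed to that user's port; anything else is placed back on the ring unchanged.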
- a computer 207 coupled to MPS4 can transmit and receive data to/from the Internet as follows.
- Data packets generated by the computer are first transmitted to MPS4 via a line coupled to a line interface card residing within MPS4. These data packets are then sent on to MPS3 by MPS4 via ring segment 206.
- MPS3 examines the data packets and passes the data packets downstream to MPS2 via ring segment 207;
- MPS2 examines the data packets and passes the data packets downstream to MPS1 via ring segment 208.
- based on the addresses contained in the data packets, MPS1 knows to output these data packets onto the I/O port corresponding to router 205. It can be seen that MPS1 is connected to router 205.
- Router 205 routes data packets to/from MPTR 200, other MPTR's, and the Internet backbone. In this case, the data packets are then routed over the Internet to their final destination. Similarly, data packets from the Internet are routed by router 205 to MPTR 200 via MPS1. The incoming data packets are then examined and forwarded from MPS1 to MPS2 via ring segment 209; examined and forwarded from MPS2 to MPS3 via ring segment 210; and examined and forwarded from MPS3 to MPS4 via ring segment 211. MPS4 examines these data packets and determines that they are destined for computer 207, whereby MPS4 outputs the data packets through its I/O port corresponding to computer 207.
- users coupled to any of the MPS's can transmit and receive packets from any other MPS on the same MPTR without having to leave the ring.
- a user on MPS2 can transmit data packets to a user on MPS4 by first transmitting the packets into MPS2; sending the packets from MPS2 to MPS3 over ring segment 207; MPS3 sending the packets to MPS4 over ring 202; and MPS4 outputting them on the appropriate port corresponding to the intended recipient.
- the strict priority problem refers to the fact that upstream nodes (e.g., an upstream MPS) have larger amounts of available bandwidth in the communications channel in comparison to downstream nodes.
- upstream nodes e.g., an upstream MPS
- MPS 2 is able to insert its local input flows (e.g., insertion traffic) onto segment 210 prior to MPS 3, and so on with MPS 3 and MPS 4 with ring segment 211.
- MPS 4 by virtue of its location within the ring topology, has less available bandwidth to insert its local input flow in comparison to MPS 3 and MPS 2.
- bandwidth utilization information should be available on a "per-flow" basis and should be sufficiently timely to allow intelligent allocation decisions to be made in real time.
- FIG. 3 shows an exemplary diagram of components of an MPTR.
- a number of MPS's 301-306 are shown coupled to a fiber ring 307.
- Two of the MPS's 302 and 303 have been shown in greater detail to depict how data flows in an MPTR.
- a number of computers 308-310 are shown coupled to MPS 302.
- Each of these computers 308-310 has a corresponding buffer 311-313.
- These buffers 311-313 are used to temporarily store incoming data packets from their respective computers 308-310.
- Associated with each of these buffers 311-313 is a respective controller 314-316, which controls when packets queued in that particular buffer are allowed to be transmitted onto the ring 307.
- In a preferred MPTR embodiment, fair bandwidth allocation is implemented using a per-flow bandwidth allocation concept. Traffic on the ring 307 is classified into flows. For example, all packets from one user belong to one flow. The granularity of a flow can be fine (e.g., per-session) or coarse (e.g., per service port, etc.), and is generally specifiable by packet classification rules. Once packets are classified into a flow, each MPS can allocate bandwidth to each flow fairly and monitor that no flow exceeds its allocation.
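Classification at different granularities can be illustrated with a small sketch. The packet fields (`src_user`, `dst_user`, `service_port`, `session`) and the three granularity names are assumptions made for this example; the source only says that granularity is set by packet classification rules.

```python
def classify(packet, granularity="per_user"):
    """Return a flow key for a packet; coarser rules merge more packets into one flow."""
    if granularity == "per_user":
        return (packet["src_user"],)
    if granularity == "per_port":
        return (packet["src_user"], packet["service_port"])
    if granularity == "per_session":
        return (
            packet["src_user"],
            packet["dst_user"],
            packet["service_port"],
            packet["session"],
        )
    raise ValueError(f"unknown granularity: {granularity}")
```

Two packets from the same user on different service ports share a flow under the per-user rule and are split into two flows under the per-port rule.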
- the flow thus must be set up before the packets can be sent on the ring.
- Setting up a flow involves specifying a number of parameters. Among these, the reserved bandwidth, $r_i$, and the allocation weight, $w_i$, are necessary for flow control, where "i" is the flow's unique identifier, referred to as the flow ID. Once set up, a flow is recognized by its unique flow ID.
- Figure 4 shows three MPS units (MPS 0, MPS 1, and MPS 2) with their respective ring segments 401-404. The MPS units are shown with their respective insertion traffic (I0, I1, and I2) and their respective exit traffic (E0, E1, and E2). Each MPS 0-2 is shown with a plurality of internal queues (four depicted within each MPS) used for tracking the flows.
- the queues of each MPS track the allocated bandwidth on each outgoing ring segment 401-404.
- the traffic on the outgoing segment is represented as $\sum_{\text{active}} r_i$ and $\sum_{\text{active}} w_i$.
- the queues of each MPS track the data traffic belonging to each individual flow (described in greater detail below).
- the traffic on each segment takes into account the exit traffic of the previous MPS, the insertion traffic of the previous MPS, and the through traffic on the ring.
- the insertion traffic of each MPS is shown in Figure 4 as "I" and the exit traffic of each MPS is shown as "E".
- the insertion traffic is the flows from the users coupled to the MPS that want to get onto the ring, for example, destined for users coupled to another MPS.
- the exit traffic is the flows destined for the users coupled to the MPS coming from other MPS units.
- the queues within each MPS are used to track the unique flows (e.g., having unique flow IDs) that are monitored and maintained by an MPS.
- Each of the queues tracking the outgoing flow for the outgoing ring segment is drained at a rate equal to the allocated bandwidth.
- the queues are emptied at a rate affected by their respective weight, $w_i$.
- the weight of each queue allows the implementation of differing levels of bandwidth per queue. For example, where queues are of equal weight, the individual flow packets are routed from the queues at an equal rate.
- an MPS in accordance with the present invention maintains large sets of virtual queues (VQs) to monitor flow activity on all of its output links.
- Virtual queues function in a manner similar to the queues described above (e.g., the queues shown within the MPS units depicted in Figure 4); however, they are implemented as counters which track the depth of the queues so that the data packets are not delayed as they flow through their respective buffers. Additional descriptions of virtual queues as implemented in the preferred embodiment can be found in "PER-FLOW CONTROL FOR AN ASYNCHRONOUS METRO PACKET
- a VQ will have a finish time describing the time when all the packets are completely drained from the VQ at a flow allocated rate of $f_i$.
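A virtual queue of this kind can be sketched as a counter that is charged when a packet arrives and credited lazily according to elapsed time, so that no packet is actually held back. The class below is an illustrative assumption: the name `VirtualQueue`, the lazy-drain approach, and the time units are not taken from the source.

```python
class VirtualQueue:
    """A depth counter that mimics a queue drained at a fixed rate, without storing packets."""

    def __init__(self, drain_rate):
        self.drain_rate = drain_rate   # bits per time unit (the flow's allocated rate)
        self.depth = 0.0               # virtual backlog in bits
        self.last_update = 0.0

    def _drain(self, now):
        # Credit the counter for the time elapsed since the last update.
        elapsed = now - self.last_update
        self.depth = max(0.0, self.depth - elapsed * self.drain_rate)
        self.last_update = now

    def on_packet(self, now, size_bits):
        """Account for an arriving packet and return the new finish time."""
        self._drain(now)
        self.depth += size_bits
        return now + self.depth / self.drain_rate

    def is_active(self, now):
        self._drain(now)
        return self.depth > 0
```

Each arrival yields a fresh finish time, which is the quantity the grouping into buckets relies on.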
- Figure 5 shows a diagram of a queue 415 and its associated finish time.
- the output rate of the queues 411-415 allows the determination of a "finish time" describing the time at which the respective queue will be emptied.
- This finish time provides a key measure of the total allocated bandwidth of the ring 450.
- queue 415 has a finish time that describes the time at which queue 415 will be emptied at its output rate.
- a new finish time is computed reflecting the new depth of the queue 415.
- the MPS routes packets from the respective queues at a specified output rate, and a finish time for each respective queue is computed, the finish time describing a time at which the respective queue will be emptied using the allocated output rate (e.g., $f_i$ as defined below).
- each MPS maintains a large number of queues (e.g., up to 1 million or more), one for each flow at each link.
- Each queue grows at the rate of the traffic belonging to the flow, and is drained at a rate equal to the allocated bandwidth.
- Congestion is measured in the forms of $\sum_{\text{active}} r_i$ and $\sum_{\text{active}} w_i$ of all non-empty (active) queues (e.g., queues 411-415).
- High values of $\sum_{\text{active}} r_i$ and $\sum_{\text{active}} w_i$ indicate that more flows are competing for the outgoing link bandwidth of the MPS.
- Each MPS frequently monitors the states of its queues to update these two parameters. Once congestion is detected, an MPS uses $\sum_{\text{active}} r_i$ and $\sum_{\text{active}} w_i$ to calculate bandwidth allocation for each flow.
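As a reference point, the two congestion measures can be computed by a direct scan over all queues, summing only the non-empty ones. The record layout (`r`, `w`, `depth`) is a hypothetical one used for illustration. A scan like this costs time proportional to the number of flows, which is the cost a bucket-based tracking scheme is meant to avoid.

```python
def congestion_measures(flows):
    """Return (sum of r_i, sum of w_i) over flows whose queue is non-empty."""
    active = [f for f in flows if f["depth"] > 0]
    total_r = sum(f["r"] for f in active)
    total_w = sum(f["w"] for f in active)
    return total_r, total_w
```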
- each MPS calculates a fair allocation of bandwidth for all flows going through each of congestion points (e.g., at the outgoing ring segments).
- the allocation is calculated based on the following calculation:
- each MPS does not send out $f_i$ for every flow it sees. Instead, it sends a capacity reserved ratio (CRR) which generally describes the amount of unallocated bandwidth of the link.
- CRR capacity reserved ratio
- the CRR can then be used by each source within each MPS to calculate its own $f_i$ from its static database of $r_i$ and $w_i$. CRR is more formally defined as follows:
- each source uses the equation below to calculate its $f_i$.
- the MPS needs to track the total amount of allocated bandwidth and the total weight of the allocated bandwidth, $\sum_{\text{active}} r_i$ and $\sum_{\text{active}} w_i$.
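The equations for CRR and $f_i$ are not reproduced in this text, so the sketch below is not the patent's formula. It assumes, purely for illustration, a generic weighted fair-share rule in which each active flow receives its reserved bandwidth plus a weight-proportional share of whatever link capacity is left over. The function name, the `link_capacity` parameter, and the rule itself are assumptions.

```python
def fair_allocations(flows, link_capacity):
    """flows maps flow ID -> (r, w) for active flows only.

    Assumed rule (not from the source): f_i = r_i + w_i * spare / sum(w),
    where spare is the capacity not covered by reserved bandwidth.
    """
    total_r = sum(r for r, _ in flows.values())
    total_w = sum(w for _, w in flows.values())
    spare = max(link_capacity - total_r, 0)
    per_weight = spare / total_w if total_w else 0.0
    return {fid: r + w * per_weight for fid, (r, w) in flows.items()}
```

Under this assumed rule a node only needs to distribute one scalar (here `per_weight`) for every source to derive its own rate from its stored $r_i$ and $w_i$, which matches the motivation given above for sending a single ratio rather than a per-flow value.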
- these terms are tracked in real time and track flow activity at high speeds, as high as 10 Gbps per ring segment.
- the present invention uses the finish times of the respective queues and the assigned weights of the respective queues to implement a high speed tracking method for $\sum_{\text{active}} r_i$ and $\sum_{\text{active}} w_i$.
- These techniques involve the use of per-flow queues, a flow information base (FIB), a bucket information base (BIB), and a schedule clock.
- FIB flow information base
- BIB bucket information base
- embodiments of the present invention can efficiently scale up to handle an extremely large number (e.g., 1 million or more) individual flows, while remaining within the capabilities of integrated circuit technology (e.g., can be implemented in an ASIC).
- the individual flows can be tracked in real-time, allowing their allocated bandwidth for inactive flows to be reallocated to active flows in real time.
- Figure 6A shows a diagram depicting the multi-group queuing process of the present embodiment.
- Figure 6A depicts a plurality of flows sorted into a plurality of groups, shown as bucket 0, bucket 1, bucket 2, and so on, to bucket n.
- the plurality of queues are grouped into the multiple buckets, or groups, in accordance with their respective finish times.
- the finish times are indexed with respect to a schedule clock.
- the schedule clock or global clock, provides the time reference for finish times.
- the value of schedule clock represents the current virtual time that finish times are compared to.
- Schedule clocks increment at a rate proportional to the congestion at a node, as described below.
- As depicted in Figure 6A, as buckets are emptied, they move from right to left, as each bucket successively reaches the "queue empty" state shown on the left side of Figure 6A.
- These groups of flows are referred to as "buckets" due to the fact that they include those queues having the same finish times with respect to the schedule clock.
- bucket 0 includes the reserved bandwidth and weight of those flows having a finish time corresponding to the next increment of the schedule clock
- bucket n includes the reserved bandwidth and weight of those flows having the longest finish time with respect to the schedule clock.
- the earliest bucket e.g., bucket 0
- the second earliest bucket e.g., bucket 1
- the second earliest bucket includes those queues having a finish time indicating an empty condition at a second time increment later than the first time increment, and so on.
- bucket 0 contains the reserved bandwidth and weight of those queues which will be empty at the next time increment of the schedule clock
- bucket 1 contains those queues that will be empty at the next two time increments of the schedule clock, and so on, thereby indicating the amount of unallocated bandwidth that becomes available each time increment.
- the amount of allocated bandwidth on the network is determined by counting the total allocated bandwidth and total allocated weight of all the active flows (e.g., all bucket totals).
- the time increments for the first bucket, the second bucket, and the like are indexed with respect to the schedule clock.
- One increment of the schedule clock comprises one complete round robin arbitration (e.g., per queue output onto the metropolitan area network) of all active queues within the MPS, in the manner described in figure 4 above. Inactive, or empty, queues do not contribute to the schedule clock period.
- the bucket 0 thus indicates those flows that will have an empty condition at a next time increment (e.g., output round) of the schedule clock.
- a new finish time is computed for each respective queue, and thus for each flow, when a new packet is received by the respective queue.
- the schedule clock advances by one every time interval, T_Sclk, given below:
- the schedule clock (represented as SCLK) advances independently based on the flow activity on the corresponding link. It should be noted that the SCLK does not necessarily advance at a constant rate as a conventional clock does. The quantity $\sum_{\text{active}} r_i + CRR \cdot \sum_{\text{active}} w_i$, divided by the link capacity $C$, represents the percentage of link usage at current CRR values. The higher the value of the sum over the active flows, the slower the SCLK advances.
- the difference between the finish time of a queue and the schedule clock represents the degree of backlog of the queue in terms of the amount of time to empty the queue (empty time).
- the schedule clock can also be used to pace flows to determine whether any of them have exceeded their respective allocated bandwidths. This can be done by ensuring that T_empty does not get too large.
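The empty-time check described above can be sketched as follows. This is only an illustration: the function names and the `max_empty_time` threshold are hypothetical, and the source does not say how large T_empty may grow before a flow is considered to have exceeded its allocation.

```python
def empty_time(finish_time: int, sclk: int) -> int:
    """Backlog of a queue, in schedule clock increments (0 if already empty)."""
    return max(0, finish_time - sclk)


def exceeds_allocation(finish_time: int, sclk: int, max_empty_time: int) -> bool:
    """Flag a flow whose backlog has grown past a hypothetical limit.

    max_empty_time is an assumed, operator-chosen bound; it is not
    specified in the source text.
    """
    return empty_time(finish_time, sclk) > max_empty_time
```

For instance, a queue with finish time 12 at schedule clock 10 has an empty time of 2 increments.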
- $\sum_{\text{active}} w_i$ is also a term that can be computed incrementally. When a flow moves from one bucket to another, its $r_i$ and $w_i$ are subtracted from the sums of the old bucket and added to those of the new bucket. For a flow that comes back from a previously inactive (empty) state, its $r_i$ and $w_i$ should be added to $\sum_{\text{active}} r_i$ and $\sum_{\text{active}} w_i$ too.
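The incremental bookkeeping described above might look like the following minimal sketch. The class and method names are hypothetical, and it assumes that a flow counts as active while its finish time is later than the schedule clock and that each bucket is keyed by finish time; the source describes hardware counters rather than Python dictionaries.

```python
from collections import defaultdict


class BucketTotals:
    """Tracks the active sums of r_i and w_i without rescanning all flows."""

    def __init__(self):
        self.sclk = 0
        self.bucket_r = defaultdict(int)  # finish time -> sum of r_i in that bucket
        self.bucket_w = defaultdict(int)  # finish time -> sum of w_i in that bucket
        self.total_r = 0                  # sum of r_i over active flows
        self.total_w = 0                  # sum of w_i over active flows
        self.finish = {}                  # flow id -> last computed finish time

    def on_packet(self, flow_id, r, w, new_finish):
        """Record a new finish time for a flow when a packet arrives."""
        assert new_finish > self.sclk, "finish time must lie in the future"
        old = self.finish.get(flow_id)
        if old is None or old <= self.sclk:
            # Flow returns from the inactive state: add it to the active totals.
            self.total_r += r
            self.total_w += w
        else:
            # Flow moves between buckets: remove it from the old bucket.
            self.bucket_r[old] -= r
            self.bucket_w[old] -= w
        self.bucket_r[new_finish] += r
        self.bucket_w[new_finish] += w
        self.finish[flow_id] = new_finish

    def tick(self):
        """Advance the schedule clock and retire the bucket that expires."""
        self.sclk += 1
        self.total_r -= self.bucket_r.pop(self.sclk, 0)
        self.total_w -= self.bucket_w.pop(self.sclk, 0)
```

Each packet arrival touches at most two buckets, and each clock increment touches one, so the cost per event does not depend on the number of flows.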
- Figure 6B graphically depicts the summation of all $r_i$ and $w_i$ in accordance with one embodiment of the present invention.
- the vertical axis is bandwidth and the horizontal axis is time.
- the link capacity is as shown.
- the trace shows the utilization of the link capacity as it changes over time (e.g., as some flows become active and other flows become inactive).
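As a worked example of the link usage expression given earlier (the sum of active $r_i$ plus CRR times the sum of active $w_i$, divided by the link capacity $C$), the sketch below evaluates it for made-up numbers. The function name, the sample rates and weights, and the assumption that CRR is expressed in bits per second per unit of weight are all illustrative and are not taken from the source.

```python
def link_usage(reserved_rates, weights, crr, capacity):
    """Fraction of link capacity used at the current CRR value."""
    return (sum(reserved_rates) + crr * sum(weights)) / capacity


# Hypothetical numbers: two active flows on a 10 Gbps ring segment.
rates = [2_000_000_000, 1_000_000_000]   # r_i, bits per second
weights = [1, 3]                         # w_i
usage = link_usage(rates, weights, crr=500_000_000, capacity=10_000_000_000)
print(usage)  # 0.5
```

Here the reserved rates contribute 3 Gbps and the weighted share contributes 2 Gbps, so half of the link is in use.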
- FIG. 7 shows a diagram of a bucket information base (BIB) 700 in accordance with one embodiment of the present invention.
- the buckets depicted in figure 6A are implemented as a series of counters within a database, the BIB 700, maintained within each MPS.
- each bucket is implemented as a ring total bandwidth counter and a corresponding ring total weight counter.
- the counters are incremented to reflect the number of flows, and their associated weights, within the bucket.
- the schedule clock functions as a pointer that cycles through the counters in accordance with the time increment at which their respective flows will be empty, in the manner described above. Thus, for example, at the next time increment, the schedule time pointer will move to indicate the counters associated with bucket 1, and so on.
- BIB 700 is organized as a two column and 8K long table as shown in figure 7.
- BIB 700 is able to sustain 16 accesses for every 50 ns, thereby allowing updates when, for example, new packets arrive within the queues.
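One way to picture the two-column, 8K-row table with the schedule clock acting as a pointer is the sketch below. It is an assumption-laden illustration: the class name, the modulo indexing, and the requirement that no finish time lies a full table length or more ahead of the clock are not stated in the source.

```python
BIB_ROWS = 8192  # "8K long" table; each row holds one counter pair


class BucketInfoBase:
    """Circular table of (total bandwidth, total weight) counter pairs."""

    def __init__(self, rows=BIB_ROWS):
        self.rows = rows
        self.bandwidth = [0] * rows
        self.weight = [0] * rows

    def add(self, finish_time, r, w):
        """Add a flow's r_i and w_i to the bucket for its finish time."""
        i = finish_time % self.rows  # assumed indexing scheme
        self.bandwidth[i] += r
        self.weight[i] += w

    def remove(self, finish_time, r, w):
        """Remove a flow from the bucket it previously occupied."""
        i = finish_time % self.rows
        self.bandwidth[i] -= r
        self.weight[i] -= w

    def drain(self, sclk):
        """Read and clear the row the schedule clock now points at."""
        i = sclk % self.rows
        r, w = self.bandwidth[i], self.weight[i]
        self.bandwidth[i] = self.weight[i] = 0
        return r, w
```

Because a drained row is cleared, it can be reused once the clock has wrapped around the table.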
- FIG. 8 shows a flow information base (FIB) 800 in accordance with one embodiment of the present invention.
- An MPS uses the FIB 800 as a flow descriptor.
- the FIB 800 contains fields specifying various actions to be applied to the packets belonging to each flow (transit forward on the ring, exit the ring to a specific port, etc.) and fields holding flow parameters, such as $r_i$ and $w_i$.
- the finish time of a flow, which tracks its virtual queue depth, is stored in the FIB. When packets arrive, the finish time in the FIB is updated, and used to access the BIB as described above. Thus the FIB is only accessed as packets arrive.
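A flow descriptor of this kind, and its update on packet arrival, could be sketched as follows. The field names, the `on_arrival` helper, and in particular the `service_ticks` argument (the number of schedule clock increments assumed to be needed to drain the new packet) are hypothetical; this section of the source does not give the finish time formula.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    """One FIB record: forwarding action plus flow parameters."""
    action: str           # e.g. "transit" or "exit:port3" (illustrative values)
    r: int                # reserved bandwidth r_i
    w: int                # weight w_i
    finish_time: int = 0  # virtual time at which the queue will be empty


def on_arrival(entry: FlowEntry, sclk: int, service_ticks: int):
    """Update the finish time when a packet arrives.

    Returns (old_bucket, new_bucket); old_bucket is None if the flow was
    inactive, meaning its previous finish time had already passed.
    """
    was_active = entry.finish_time > sclk
    old_bucket = entry.finish_time if was_active else None
    start = entry.finish_time if was_active else sclk
    entry.finish_time = start + service_ticks
    return old_bucket, entry.finish_time
```

The returned pair identifies which bucket counters to decrement and which to increment, so the table of buckets is touched only when a packet arrives.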
- FIG. 9 shows a flow chart of the steps of an operating process 900 in accordance with one embodiment of the present invention.
- process 900 shows the operating steps of an MPS maintaining an accurate total of the amount of allocated bandwidth on the network, as implemented within an MPTR.
- Process 900 begins in step 901, where data packets for transmission are received from a plurality of users by the queues of an MPS. Within the MPS, the plurality of incoming packets from the various users are assigned to a respective plurality of queues of the MPS.
- In step 902, data from the queues is routed onto the ring.
- a controller is configured to empty the respective queues at a specified output rate.
- In step 903, a finish time is computed for each respective queue.
- the finish time describes the time at which the respective queue will be emptied using the current output rate.
- the queues are grouped into respective buckets based on their respective finish times.
- the plurality of queues are grouped into multiple buckets, or groups, in accordance with their respective finish times. These groups are referred to as "buckets" due to the fact that they include those queues having the same finish times.
- the buckets can be implemented using respective counter pairs within a database, the counter pairs configured to track the total reserved bandwidth $r_i$ of queues having the same finish times and their respective weights.
- a schedule clock is incremented in accordance with the cycle time of the controller. As described above, a higher number of active flows leads to a slower increment rate of the schedule clock, and vice versa.
- the finish times are indexed with respect to the schedule clock.
- the earliest bucket includes those queues having a finish time indicating an empty condition at a first time increment, the second earliest bucket includes those queues having a finish time indicating an empty condition at a second time increment later than the first time increment, and so on.
- In step 906, the total $r_i$ of flows becoming inactive and their associated weights are determined using the buckets.
- counter pairs configured to track the reserved bandwidth of queues having the same finish times and their respective weights can be used to determine the allocated bandwidth of flows and their associated weights becoming inactive on the next schedule clock increment.
- In step 907, the amount of unallocated bandwidth is determined based upon information obtained in step 906.
- the amount of allocated bandwidth on the network is determined by counting $\sum_{\text{active}} r_i$ and $\sum_{\text{active}} w_i$.
- This information allows the MPS to accurately determine an amount of unallocated bandwidth available for distribution to the active flows.
- In step 908, new finish times are computed for the active flows as new data arrives at the queues for transmission. Subsequently, in step 909, process 900 continues by repeating steps 904-909. In this manner, the series of buckets are progressively "emptied" as the schedule clock progresses, and new buckets are filled as new queues receive new packets for transmission and new associated empty times.
- the determination of the amount of allocated bandwidth can be accomplished in real time, thereby allowing the efficient allocation of unallocated bandwidth in real time while maintaining quality of service.
- the earliest bucket e.g., bucket 0
- the present invention enables the efficient allocation of available bandwidth, since the MPS is capable of tracking total allocated bandwidth in real time. This allows the efficient allocation of unused bandwidth in real time while maintaining QoS.
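The repeating part of process 900 (steps 904 through 909) can be illustrated with the small simulation below. It is a sketch under stated assumptions rather than the patented implementation: the function name and inputs are hypothetical, unallocated bandwidth is taken to be the link capacity minus the active sum of $r_i$, each arrival carries an assumed number of clock increments needed to drain it, and every loop iteration stands for one schedule clock increment regardless of its real duration.

```python
def run_ticks(capacity, flows, arrivals, ticks):
    """Return the unallocated bandwidth seen at each schedule clock increment.

    flows:    {flow_id: (r, w)}
    arrivals: {tick: [(flow_id, service_ticks), ...]}  # hypothetical input
    """
    finish = {}    # flow id -> finish time
    buckets = {}   # finish time -> [sum of r, sum of w]
    total_r = total_w = 0
    history = []
    for sclk in range(ticks):
        # Steps 905-906: the clock reaches a bucket; its flows become inactive.
        expired_r, expired_w = buckets.pop(sclk, (0, 0))
        total_r -= expired_r
        total_w -= expired_w
        # Step 908: new packets give their flows new finish times.
        for flow_id, service_ticks in arrivals.get(sclk, []):
            r, w = flows[flow_id]
            old = finish.get(flow_id, 0)
            if old > sclk:               # still active: move between buckets
                buckets[old][0] -= r
                buckets[old][1] -= w
                start = old
            else:                        # inactive: rejoin the active totals
                total_r += r
                total_w += w
                start = sclk
            new = start + service_ticks
            bucket = buckets.setdefault(new, [0, 0])
            bucket[0] += r
            bucket[1] += w
            finish[flow_id] = new
        # Step 907: assumed definition of unallocated bandwidth.
        history.append(capacity - total_r)
    return history


flows = {"a": (4, 1), "b": (3, 1)}
arrivals = {0: [("a", 2), ("b", 1)], 1: [("a", 1)]}
print(run_ticks(10, flows, arrivals, 5))  # [3, 6, 6, 10, 10]
```

In this run flow b expires after one increment, flow a is extended by a second packet and expires at increment 3, and the unallocated bandwidth rises accordingly.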
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Small-Scale Networks (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2002254147A AU2002254147A1 (en) | 2001-03-08 | 2002-03-08 | A method and system for bandwidth allocation tracking |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US27462101P | 2001-03-08 | 2001-03-08 | |
US60/274,621 | 2001-03-08 | ||
US27533801P | 2001-03-12 | 2001-03-12 | |
US60/275,338 | 2001-03-12 | ||
US10/094,035 US6947998B2 (en) | 2001-03-08 | 2002-03-07 | Method and system for bandwidth allocation tracking in a packet data network |
US10/094,035 | 2002-03-07 |
Publications (3)
Publication Number | Publication Date |
---|---|
WO2002073884A2 WO2002073884A2 (en) | 2002-09-19 |
WO2002073884A3 WO2002073884A3 (en) | 2004-03-04 |
WO2002073884A9 WO2002073884A9 (en) | 2004-04-15 |
Family
ID=27377633
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2002/007122 WO2002073884A2 (en) | 2001-03-08 | 2002-03-08 | A method and system for bandwidth allocation tracking |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN1462528A (en) |
AU (1) | AU2002254147A1 (en) |
WO (1) | WO2002073884A2 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101237417B (en) * | 2008-02-29 | 2010-09-08 | 华为技术有限公司 | Queue index method, device and traffic shaping method and device |
CN103067306B (en) * | 2012-12-28 | 2016-12-28 | 山石网科通信技术有限公司 | The method and device of distribution bandwidth |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2150967C (en) * | 1994-12-22 | 2001-04-03 | Jon C. R. Bennett | Method and a scheduler for controlling when a server provides service with rate control to an entity |
JP2001519121A (en) * | 1997-04-04 | 2001-10-16 | アセンド コミュニケーションズ インコーポレイテッド | High-speed packet scheduling method and apparatus |
-
2002
- 2002-03-08 CN CN02800572A patent/CN1462528A/en active Pending
- 2002-03-08 AU AU2002254147A patent/AU2002254147A1/en not_active Abandoned
- 2002-03-08 WO PCT/US2002/007122 patent/WO2002073884A2/en not_active Application Discontinuation
Also Published As
Publication number | Publication date |
---|---|
WO2002073884A2 (en) | 2002-09-19 |
CN1462528A (en) | 2003-12-17 |
AU2002254147A1 (en) | 2002-09-24 |
WO2002073884A3 (en) | 2004-03-04 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | WWE | Wipo information: entry into national phase | Ref document number: 028005724 Country of ref document: CN |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | AK | Designated states | Kind code of ref document: C2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: C2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | COP | Corrected version of pamphlet | Free format text: PAGES 1/10-10/10, DRAWINGS, REPLACED BY NEW PAGES 1/11-11/11; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE |
| | REG | Reference to national code | Ref country code: DE Ref legal event code: 8642 |
| | COP | Corrected version of pamphlet | Free format text: PAGES 1/10-10/10, DRAWINGS, REPLACED BY NEW PAGES 1/11-11/11; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE |
| | 122 | Ep: pct application non-entry in european phase | |
| | NENP | Non-entry into the national phase | Ref country code: JP |
| | WWW | Wipo information: withdrawn in national office | Country of ref document: JP |