EP1411680A2 - Metro Ethernet network system with selective upstream pause messaging - Google Patents

Metro Ethernet network system with selective upstream pause messaging

Publication number
EP1411680A2
Authority
EP
European Patent Office
Prior art keywords
buffer
packet
node
virtual space
packets
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP20030021891
Other languages
German (de)
English (en)
Other versions
EP1411680B1 (fr
EP1411680A3 (fr
Inventor
Mohamed Nizam Mohamed Hamzah
Girish Chiruvolu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alcatel Lucent SAS
Original Assignee
Alcatel CIT SA
Alcatel SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alcatel CIT SA, Alcatel SA
Publication of EP1411680A2
Publication of EP1411680A3
Application granted
Publication of EP1411680B1
Anticipated expiration
Expired - Lifetime

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/26 Flow control; Congestion control using explicit feedback to the source, e.g. choke packets
    • H04L47/263 Rate modification at the source after receiving feedback
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/30 Flow control; Congestion control in combination with information about buffer occupancy at either end or at transit nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/50 Queue scheduling
    • H04L47/62 Queue scheduling characterised by scheduling criteria
    • H04L47/621 Individual queue per connection or flow, e.g. per VC
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/50 Queue scheduling
    • H04L47/62 Queue scheduling characterised by scheduling criteria
    • H04L47/6215 Individual queue per QOS, rate or priority

Definitions

  • The present embodiments relate to computer networks and are more particularly directed to a Metro Ethernet network system whose nodes transmit upstream pause messaging to cause backpressure for only selected upstream switches.
  • Metro Ethernet networks are one type of network that has found favor in various applications in the networking industry, and for various reasons. For example, Ethernet is a widely used and cost-effective medium, with numerous interfaces and capable of communications at various speeds up to the Gbps range.
  • a Metro Ethernet network is generally a publicly accessible network that provides a Metro domain, typically under the control of a single administrator, such as an Internet Service Provider ("ISP"). Metro Ethernet may be used to connect to the global Internet and to connect between geographically separated sites, such as between different locations of a business entity.
  • ISP Internet Service Provider
  • the Metro Ethernet network is often shared among different customer virtual local area networks (“VLAN”), where these networks are so named because a first VLAN is unaware of the shared use of the Metro Ethernet network by one or more additional VLANs. In this manner, long-standing technologies and infrastructures may be used to facilitate efficient data transfer.
  • VLAN virtual local area networks
  • a Metro Ethernet network includes various nodes for sake of routing traffic among the network, where such nodes include what are referred to in the art as switches or routers and are further distinguished as edge nodes or core nodes based on their location in the network.
  • Edge nodes are so named as they provide a link to one or more nodes outside of the Metro Ethernet network and, hence, logically they are located at the edge of the network.
  • core nodes are inside the edges defined by the logically perimeter-located edge nodes.
  • both types of nodes employ known techniques for servicing traffic arriving from different nodes and for minimizing transient (i.e., short term) congestion at any of the nodes.
  • Under IEEE 802.3x, which is the IEEE standard on congestion control, and in the event of such congestion, a node provides "backpressure" by sending pause messages to all upstream Metro Ethernet nodes, that is, those that are transmitting data to the congestion-detecting node. Such congestion is detected by a node in response to its buffering system reaching a threshold, where once that threshold is reached and without intervention, the node will become unable to properly communicate its buffered packets onward to the link extending outward from that node.
  • the node transmits a pause message to every upstream adjacent node whereby all such adjacent nodes are commanded to cease the transmission of data to the congested node, thereby permitting the congested node additional time to relieve its congested state by servicing the then-stored data in its buffering system.
  • the approach is non-scalable, as there should be n number of buffers (or buffer space) in a node that switches traffic to n different MAC destinations.
  • the number of buffers required also increases when traffic-class is introduced.
  • when one of the buffers is not optimally utilized, other traffic with a different MAC destination is not able to utilize the unused resources in the sub-optimal buffer(s), thereby leading to wastage.
  • each session capacity requirement and path can vary with time as well as network condition and, hence, there is no provision for local Max-Min fairness.
  • there is no scheme for differentiation among sessions and the traffic of each of the sessions may vary with time. Some sessions may be idle and some may become active for a period of time and so on.
  • Max-Min fairness is an outcome of one such arbitrator for bandwidth.
  • Under Max-Min fairness, the session that requires the least bandwidth is first satisfied/allocated by the arbitrator, and the procedure is repeated recursively for the remaining sessions until the available capacity is shared.
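The Max-Min procedure described above can be illustrated with a minimal sketch. It assumes each session states a single bandwidth demand and that one link capacity is shared; the function name `max_min_fair` and the session labels are hypothetical and do not come from the patent text.

```python
def max_min_fair(capacity, demands):
    """Share `capacity` among sessions by progressive filling.

    `demands` maps a session label to its requested bandwidth. The session
    with the smallest demand is served first; no session receives more than
    an equal split of whatever capacity is still unallocated.
    """
    allocation = {}
    remaining = float(capacity)
    # Smallest demand first, as in the description above.
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        fair_share = remaining / len(pending)
        session, demand = pending.pop(0)
        granted = min(demand, fair_share)
        allocation[session] = granted
        remaining -= granted
    return allocation


if __name__ == "__main__":
    print(max_min_fair(10, {"A": 2, "B": 4, "C": 10}))
```

With a capacity of 10 and demands of 2, 4, and 10, the two smaller sessions are fully satisfied and the largest session receives the remaining 4 units.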
  • a buffer is described that is shared by more than one session, where a session is defined as a packet or packets communicated between a same ingress and egress Metro Ethernet network edge node (i.e., as identifiable by the addresses in the MAC-in-MAC addressing scheme used for Metro Ethernet networks).
  • the buffer is divided into segments and each segment is given an identification number. Each segment is allowed to store packets with different MAC addresses at the same time, but an arriving packet can only be stored in a segment that currently has packets with the same MAC addresses.
  • the node disallows any arriving packets from being stored not only in the congested segment but also other segments whose identification number is smaller than the congested one.
  • a backpressure message is sent to every adjacent upstream node.
  • the upstream-nodes will then temporarily stop serving all buffer segments that have identification number similar or smaller than the downstream congested-node segment.
  • the upstream node is prevented not only from transmitting to the segment that was filled, but also to other segments as well (i.e., those with a smaller identification code). These segments also will be temporarily prevented from accepting any arriving packets.
  • each segment is also rigid, that is, the number of packets that can be stored within a segment is fixed.
  • the congestion mechanism is inefficient in that it is always triggered by the state of any one segment, even if the total packet occupancy in the buffer space, including potentially numerous other segments, has not reached a congestion state.
  • the system comprises a first network node, and that node comprises an input for receiving a packet, and the node also comprises a buffer, coupled to the input and for storing the packet.
  • the first network node also comprises circuitry for detecting when a number of packets stored in the buffer exceeds a buffer storage threshold and circuitry, responsive to a detection by the circuitry for detecting that the number of packets stored in the buffer exceeds the buffer storage threshold, for issuing a pause message along an output to at least a second network node.
  • the pause message indicates a message ingress address and a message egress address, where that message ingress address and the message egress address correspond to a network ingress address and a network egress address in a congestion-causing packet received by the first network node.
  • the pause message commands the second network node to discontinue, for a period of time, transmitting to the first network node any packets that have the message ingress address and the message egress address.
  • FIG. 1 illustrates a block diagram of a system 10 into which the preferred embodiments may be implemented.
  • System 10 generally represents a Metro Ethernet network that includes a number of Metro nodes. As introduced earlier in the Background Of The Invention section of this document, such nodes are typically described as edge nodes or core nodes based on their location in the network; by way of example, system 10 includes five Metro edge nodes ME 0 through ME 4 and nine Metro core nodes MC 0 through MC 8 . These nodes include various aspects as known in the art, such as operating to send packets as a source or receive packets as a destination.
  • system 10 is typically coupled with stations or nodes external to system 10, such as may be implemented in the global Internet or at remotely located networks, such as at different physical locations of a business entity.
  • those external nodes can communicate packets with system 10; for example, one such node external from but coupled to, Metro edge node ME 1 can communicate a packet to Metro edge node ME 1 .
  • In this case, Metro edge node ME 1 is referred to as an ingress node because in this example it is the location of ingress into system 10.
  • that packet may be forwarded on through various paths of system 10, and ultimately it will reach one of the other edge nodes and then pass outward of system 10, such as to another external node.
  • This latter edge node, which communicates the packet outward, is referred to as the egress node because in this example it is the location of egress out of system 10.
  • the number of nodes shown in Figure 1 is solely by way of example and to simplify the illustration and example, where in reality system 10 may include any number of such nodes.
  • the specific connections shown also are by way of example, and are summarized in the following Table 1, which identifies each node as well as the other node(s) to which it is connected.
  • Figure 2 illustrates the general form of each packet 20 that passes through system 10, according to the preferred embodiment.
  • Packet 20 includes four fields.
  • a first field 20 1 indicates the address of the ingress edge node; in other words, that address is the address of the first node that encountered packet 20 once packet 20 was provided by an external source to system 10; this address is also sometimes referred to as a Metro source address.
  • a second field 20 2 indicates the address of the egress edge node for packet 20.
  • Metro Ethernet networks provide sufficient controls such that when a packet is received external from the network, it includes a source and destination address as relating to nodes external from the network; in response and in order to cause the packet ultimately to be directed to the packet-specified destination address, the ingress edge node determines a desired egress edge node within system 10 such that once the packet arrives at that egress node, it can travel onward to the destination address. Then, the ingress edge node locates within packet 20 both its own address, shown as field 20 1 , as well as the egress edge node address, shown as field 20 2 . Continuing with Figure 2, a third field 20 3 is included to designate the packet as pertaining to a given class.
  • packet 20 includes a payload field 20 4 .
  • Payload field 20 4 includes the packet as originally transmitted from the external node to system 10; note, therefore, that this packet information includes the user data as well as the originally-transmitted external source address and destination address, where those addresses are directed to nodes outside of system 10. Accordingly, packet 20 includes two sets of addresses, one (in field 20 4 ) pertaining to the source and destination node external from system 10 and the other (in fields 20 1 and 20 2 ) identifying the ingress and egress edge nodes.
  • This technique may be referred to as encapsulation and with respect to the addressing in the preferred embodiment may be referred to as MAC-in-MAC encapsulation in that there is one set of MAC addresses encapsulated within another set of MAC addresses.
  • When packet 20 reaches its egress edge node ME x, that node strips fields 20 1, 20 2, and 20 3 from the packet and then forwards, to the destination address in payload field 20 4, the remaining payload field 20 4 as the entirety of the packet.
  • Upon receipt of the final packet at a node external from system 10, the destination node is unaware of the information previously provided in fields 20 1, 20 2, and 20 3.
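The four-field packet and the strip-at-egress behavior described above can be sketched as follows. This is an illustrative model only: the class `MetroPacket`, the functions `encapsulate` and `decapsulate`, and the string addresses are assumptions, and no real frame layout or wire encoding is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetroPacket:
    ingress: str      # field 20 1: address of the ingress edge node
    egress: str       # field 20 2: address of the egress edge node
    metro_class: str  # field 20 3: class designation
    payload: bytes    # field 20 4: original frame, external addresses included


def encapsulate(original_frame, ingress, egress, metro_class):
    """Ingress edge node: wrap the externally received frame (MAC-in-MAC)."""
    return MetroPacket(ingress, egress, metro_class, original_frame)


def decapsulate(packet):
    """Egress edge node: strip fields 20 1 to 20 3 and forward only the payload."""
    return packet.payload


if __name__ == "__main__":
    frame = b"external-src|external-dst|user-data"
    packet = encapsulate(frame, "ME0", "ME2", "high")
    assert decapsulate(packet) == frame
```

Because the payload is carried unchanged, the external destination sees exactly the frame that the external source transmitted.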
  • Figure 3 illustrates a block diagram of various aspects of the preferred embodiment as included in each Metro node in system 10, that is, in either an edge node or a core node and, thus, in Figure 3 the node is indicated generally as node MN x .
  • node MN x includes various other circuitry not illustrated but as will be understood to be included so as to support the functionality known in the art to be provided by such a node.
  • Figure 3 instead illustrates those additional aspects that are particularly directed to the preferred embodiments. Further, these aspects are generally described in terms of functionality, where one skilled in the art may readily ascertain various hardware and/or software to implement such functionality.
  • a packet sent to node MN x is received along an input 30 IN , where each such packet is then switched to one of two buffers 30 HC and 30 LC , based on the class designation of the packet as is specified in field 20 3 in Figure 2. More particularly in the present illustration, two such classes are contemplated and are therefore referred to as a high class, corresponding to buffer 30 HC , and a low class, corresponding to buffer 30 LC .
  • Each buffer 30 HC and 30 LC represents a packet storage facility that may be formed by various devices and is subject to the control of a controller 32, where various aspects of controller 32 are detailed throughout the remainder of this document and include detecting and responding to packet congestion in node MN x .
  • Each buffer is generally the same in structure and functions so as to store the respective high or low class of packets, subject to being logically partitioned by controller 32. Further, note that two buffers are shown in node MN x by way of example as corresponding to two respective packet classes, while in an alternative embodiment a different number of buffers corresponding to a different number of classes may be implemented.
  • Looking to buffer 30 HC, it is shown to include three different virtual space regions VS HC0, VS HC1, and VS HC2, where each such region corresponds to a different session of high class packets.
  • the term "session" is meant to correspond to any packets having both the same ingress and egress Metro edge nodes, where recall those nodes are identified in the packet fields 20 1 and 20 2 as shown in Figure 2. For example with respect to Figure 1, if three packets are communicated from node ME 0 to node ME 2 , then each of those same packets are said to belong to the same session.
  • Returning to buffer 30 HC in Figure 3, since it includes three virtual space regions directed to high class packets, those regions correspond to three different sessions and the high class packets thereof, that is, three sets of high class packets, where each such set differs from the other sets in its Metro ingress node address, its Metro egress node address, or both. For sake of example in the remainder of this document, therefore, assume that the three virtual space regions of buffer 30 HC correspond to the high class packets in the sessions indicated in the following Table 2.
  • While Table 2 illustrates three different sessions, note that there are other paths through system 10 having different possible ingress and egress nodes, such as from node ME 3 as an ingress node to node ME 2 as an egress node; however, since that combination is not shown in Table 2 and there is no corresponding virtual space in buffer 30 HC, it may be assumed as of the time of the example illustrated in Figure 3 that a packet has not been received by node MN x so as to establish such a buffer space, or that such a space had previously been established but was no longer needed and therefore has been discontinued.
  • controller 32 will allocate a new virtual space for that session, while re-adjusting the allocation of one or more of the existing virtual spaces in order to allow buffer storage resources for the new session.
  • the amount of buffer space allocated to each session by controller 32 is proportional to the session's computed share of bandwidth. The share preferably is re-evaluated when there is a change in network conditions and requirements, and the computation is performed periodically for Best Effort class traffic in order to achieve Max-Min Fairness. Further, once all packets in a virtual space have been output, then the virtual space can be closed so that its resource may be available for a different session.
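The proportional allocation described above, including the re-adjustment when a new session appears, might look like the sketch below. The function name, the unit (packets), and the rounding down to whole packets are assumptions; the source states only that allocated space is proportional to the session's computed share of bandwidth.

```python
def allocate_virtual_spaces(buffer_packets, bandwidth_shares):
    """Split a buffer of `buffer_packets` slots in proportion to bandwidth share.

    `bandwidth_shares` maps a session label to its computed share of bandwidth.
    Rounding down to whole packets is an assumption of this sketch.
    """
    total_share = sum(bandwidth_shares.values())
    if total_share <= 0:
        return {session: 0 for session in bandwidth_shares}
    return {
        session: int(buffer_packets * share / total_share)
        for session, share in bandwidth_shares.items()
    }


if __name__ == "__main__":
    shares = {"A": 2, "B": 4, "C": 4}
    print(allocate_virtual_spaces(100, shares))
    # A new session arrives: every existing virtual space is re-adjusted.
    shares["D"] = 10
    print(allocate_virtual_spaces(100, shares))
```

Calling the function again after the set of sessions changes models the controller re-evaluating the shares; closing a virtual space corresponds to removing its entry before the next call.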
  • buffer 30 LC is shown to include virtual space regions VS LC0 , VS LC1 , VS LC2 , and VS LC3 , where here therefore there are four regions as opposed to three for buffer 30 HC , and for buffer 30 LC each such region corresponds to a different session of low class packets.
  • each virtual space region in buffers 30 HC and 30 LC is referred to as a "virtual" space because it represents a logically apportioned amount of the entire storage in the respective buffer, where that logical area may be changed at various times so as to accommodate a different number of packets for the corresponding class and session, as will be further evident later.
  • At the top of each virtual space region is indicated, by a dashed line, a respective threshold.
  • For example, virtual space region VS HC0 of buffer 30 HC has a corresponding threshold THR HC0. This threshold, as well as the threshold for each other virtual space region in buffers 30 HC and 30 LC, is monitored by controller 32.
  • each such threshold is determined by controller 32 and may represent a different percentage of the overall space in buffer 30 HC , as compared to the apportionment of the threshold for any other virtual space region in buffers 30 HC and 30 LC .
  • the threshold for a session region in effect provides a nominal boundary of that region, that is, it is anticipated that packets of that session can be accommodated in the region, and as detailed later, if the threshold is exceeded then the session is considered aggressive in that it has exceeded its anticipated needed buffering.
  • each of buffers 30 HC and 30 LC has an associated global threshold GTHR HC and GTHR LC , respectively.
  • each global threshold GTHR HC and GTHR LC is set such that if that threshold is exceeded, based on the combination of all space regions occupied by valid packets in the corresponding buffer, then the preferred embodiment interprets such an event as a first indicator of potential traffic congestion for the class of packets stored in that buffer.
  • node MN x also includes a server 34.
  • controller 32 is operable at the appropriate timing to read a packet from either buffer 30 HC or buffer 30 LC and the packet then passes to server 34; in response, server 34 outputs the packet to a downstream link 34 L . Accordingly, the output packet may then proceed from downstream link 34 L to another node, either within or external from system 10. In this same manner, therefore, additional packets over time may be output from each of buffers 30 HC and 30 LC to downstream nodes, thereby also freeing up space in each of those buffers to store newly-received upstream packets.
  • FIG. 4 further illustrates the operation of node MN x of Figure 3 through the use of a flow chart depicting a method 40.
  • Method 40 begins with a step 42, indicating the receipt of a packet; thus, step 42 represents a wait state by node MN x as it awaits receipt of an upstream packet. When such a packet arrives, method 40 continues from step 42 to step 44.
  • In step 44, node MN x directs the flow of method 40 based on the Metro class of the packet received in the immediately-preceding step 42, where recall that this class information is available from class field 20 3 as shown in Figure 2. Further, and as mentioned above, one preferred embodiment contemplates only two different Metro classes; thus, consistent with that example, step 44 demonstrates alternative flows in two respective directions. Specifically, if step 44 detects a high class packet, then the flow continues to step 46 H, in which case node MN x is controlled to handle the packet in connection with the high class packet buffer 30 HC.
  • If step 44 detects a low class packet, then the flow continues to step 46 L, in which case node MN x is controlled to handle the packet in connection with the low class packet buffer 30 LC.
  • After either of steps 46 H or 46 L, method 40 continues to step 48.
  • In step 48, node MN x stores the packet at issue in the appropriate virtual space region of the buffer to which the packet was directed from step 44.
  • each virtual space in each of buffers 30 HC and 30 LC corresponds to a particular session; thus, step 48 represents the storage of the packet, based on its session, into the corresponding virtual space, provided there is vacant space available in that virtual space region.
  • Using the example of Table 2, suppose a high class packet is received by node MN x with a Metro ingress node of ME 0 and a Metro egress node of ME 1, where recall these two node addresses are identifiable from fields 20 1 and 20 2, respectively, as shown in Figure 2.
  • this first example packet is stored in virtual space region VS HC0 , provided there is vacant space available in that virtual space region.
  • this second example packet is stored in virtual space region VS HC1 , again, provided there is vacant space available in that virtual space region. Additional examples will be ascertainable by one skilled in the art, and note also that such packet storage may be achieved also into buffer 30 LC and the virtual space regions therein.
  • If there is no vacant space available in the virtual space region corresponding to the packet's session, the packet is stored in another virtual space region.
  • In that case, the total packet occupancy of the former virtual space region (the one corresponding to the packet's session) is increased, and that region is considered to have exceeded its virtual threshold.
  • The total packet occupancy of the latter virtual space region, however, is not incremented, as the packet that occupies its region does not actually belong to its virtual space.
  • Also in step 48, if upon receipt of the packet there is no virtual space yet established in a buffer for that packet's session (i.e., ingress and egress nodes), then step 48 also establishes such a virtual space and then stores the packet therein. Following the packet storage of step 48, method 40 continues to step 50.
  • In step 50, node MN x determines whether the buffer into which the packet was just stored has now reached its corresponding global threshold, where recall those thresholds are GTHR HC for buffer 30 HC and GTHR LC for buffer 30 LC.
  • step 50 determines whether that packet has now caused the buffer, in its entirety (i.e., including all packets from all virtual space regions), to exceed its respective global threshold.
  • If the packet was stored into the high class packet buffer 30 HC, then step 50 determines whether the total number of valid packets then stored in that buffer exceeds the global threshold GTHR HC, and similarly if the packet was stored into the low class packet buffer 30 LC, then step 50 determines whether the total number of valid packets then stored in that buffer exceeds the global threshold GTHR LC. As indicated earlier, if such a threshold is exceeded, then the preferred embodiment interprets such an event as an indicator of potential traffic congestion for the class of packets stored in that buffer, where this action is now shown to occur first in connection with step 50. Toward this end, if the relevant global threshold is exceeded, then method 40 continues from step 50 to step 52.
  • If the relevant global threshold is not exceeded, method 40 returns from step 50 to step 42 to await receipt of the next packet.
  • this analysis may be performed at other times, such as after receiving and storing multiple packets or based on other resource and or time considerations.
  • In step 52, node MN x determines which packet occupancy or session within the congested buffer has exceeded its respective threshold for its respective virtual space region. Again, this step may occur each time following the storage of a packet in a buffer, and preferably step 52 is repeated for all virtual space regions within the congested buffer. Further, recall that during the operation of node MN x to receive packets according to method 40, at the same time server 34 is servicing packets in buffers 30 HC and 30 LC such that the serviced packets are being output to downstream link 34 L so as to reach another downstream node. Thus, this operation also may affect the extent to which any virtual space region is filled with valid packets.
  • For example, for buffer 30 HC, node MN x determines whether the stored valid packets in virtual space region VS HC0 exceed THR HC0, whether the stored valid packets in virtual space region VS HC1 exceed THR HC1, and whether the stored valid packets in virtual space region VS HC2 exceed THR HC2.
  • method 40 continues from step 52 to step 54.
  • any session within the congested buffer that is causing the threshold in its corresponding virtual space region to be exceeded is considered an aggressive session in the present context, that is, it is considered a cause of potential congestion at node MN x because it is seeking to exceed its allotted space (or the threshold within that space), and this condition is detected by node MN x after it detects that its buffer 30 HC or 30 LC in its entirety has exceeded its corresponding global threshold.
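Steps 48 through 52 can be sketched together as a small model of one class buffer. The sketch counts packets per session rather than modeling physical placement, so a packet that overflows into another region is still charged to its own session, as the description states. The class `ClassBuffer`, the `default_threshold` parameter, and the second session label are hypothetical.

```python
class ClassBuffer:
    """One per-class buffer (such as 30 HC) with per-session virtual spaces."""

    def __init__(self, global_threshold, thresholds):
        self.global_threshold = global_threshold        # e.g. GTHR HC
        self.thresholds = dict(thresholds)              # session -> THR
        self.occupancy = {session: 0 for session in thresholds}

    def store(self, session, default_threshold=1):
        """Step 48: store a packet, establishing a virtual space if needed."""
        if session not in self.occupancy:
            # default_threshold is an assumption; the source has the
            # controller compute it from the session's bandwidth share.
            self.thresholds[session] = default_threshold
            self.occupancy[session] = 0
        # Occupancy is charged to the packet's own session even if the
        # packet physically sits in another session's region.
        self.occupancy[session] += 1

    def aggressive_sessions(self):
        """Steps 50 and 52: return the sessions to pause, if any."""
        # Step 50: has the buffer as a whole exceeded its global threshold?
        if sum(self.occupancy.values()) <= self.global_threshold:
            return []
        # Step 52: which sessions exceed their own virtual-space threshold?
        return [s for s, count in self.occupancy.items()
                if count > self.thresholds[s]]


if __name__ == "__main__":
    buf = ClassBuffer(5, {("ME0", "ME1"): 2, ("ME1", "ME3"): 3})
    for _ in range(4):
        buf.store(("ME0", "ME1"))
    buf.store(("ME1", "ME3"))
    print(buf.aggressive_sessions())   # global threshold not yet exceeded
    buf.store(("ME1", "ME3"))
    print(buf.aggressive_sessions())   # only the over-threshold session
```

Note the two-stage test: a session that exceeds its own threshold is not reported until the buffer as a whole is past the global threshold, which is what distinguishes this scheme from the per-segment triggering criticized earlier.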
  • As an example, assume that step 52 is reached following receipt of a packet in virtual space region VS HC0; using the example of Table 2, this packet is therefore from ingress node ME 0 and is passing to (or has reached) egress node ME 1.
  • Assume further that threshold THR HC0 is exceeded upon storage of that packet into virtual space region VS HC0.
  • node MN x transmits what will be referred to herein as a pause message to each upstream adjacent node, that is, to each node that is immediately connected to node MN x in the upstream direction, where due to their connections therefore such nodes have the ability to communicate packets downstream to node MN x .
  • Recall that under IEEE 802.3x a type of pause message is known, and that a node under that standard sends pause messages to all adjacent upstream nodes; in response, all of those nodes are thereby directed to cease communicating all packets downstream to the node that communicated the pause message.
  • In contrast, when step 54 issues the pause message, it takes a form such as message 60 shown in Figure 5, and preferably includes the following information.
  • Message 60 includes a pause message identifier field 60 1 , which indicates to a node receiving message 60 that the message is a pause message.
  • Message 60 also includes a session identifier field 60 2 ; this field 60 2 indicates both the Metro ingress and egress addresses of the aggressive session detected by node MN x , that is, it identifies the session that exceeded the threshold in its respective virtual space region.
  • Message 60 also includes a traffic class field 60 3, which thereby identifies the class of the aggressive session.
  • message 60 includes a pause time field 60 4 , where the pause time indicates the amount of time one or more upstream nodes should cease transmitting certain packets to node MN x , which issued pause message 60.
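  • the length of the pause time in field 60 4 is proportional to the number of packets by which the aggressive session has exceeded the threshold of its virtual space region.
  • For virtual space region VS HC0, for example, that excess is E = P HC0 - THR HC0, where P HC0 is the total packet occupancy of the region.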
  • each such adjacent upstream node is only commanded to cease transmitting packets, for the aggressive session and in the specified class, for the duration of the pause time in field 60 4 .
  • the adjacent upstream nodes may still communicate other packets for other sessions to node MN x .
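The four fields of message 60 and the proportional pause time can be sketched as below. The source says only that the pause time is proportional to the amount by which the aggressive session exceeds its threshold; the proportionality constant `seconds_per_excess_packet`, the class `PauseMessage`, and the field encodings are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PauseMessage:
    ingress: str          # field 60 2: Metro ingress address of the session
    egress: str           # field 60 2: Metro egress address of the session
    traffic_class: str    # field 60 3: class of the aggressive session
    pause_time: float     # field 60 4: how long upstream nodes should pause
    kind: str = "PAUSE"   # field 60 1: pause message identifier


def build_pause_message(session, traffic_class, occupancy, threshold,
                        seconds_per_excess_packet):
    """Return a pause message for an aggressive session, or None.

    The pause time grows linearly with the excess over the threshold;
    `seconds_per_excess_packet` is a hypothetical tuning constant.
    """
    excess = occupancy - threshold
    if excess <= 0:
        return None  # session is within its virtual space: nothing to pause
    ingress, egress = session
    return PauseMessage(ingress, egress, traffic_class,
                        excess * seconds_per_excess_packet)


if __name__ == "__main__":
    print(build_pause_message(("ME0", "ME1"), "high", 14, 10, 0.001))
```

A session that is four packets over its threshold is paused four times as long as one that is a single packet over, which is the "more aggressive traffic is paused longer" property the description relies on.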
  • For example, assume again that step 52 is reached following receipt of a high class packet from a session with ingress node ME 0 and egress node ME 1, and assume THR HC0 is exceeded upon storage of that packet into virtual space region VS HC0.
  • In response, node MN x transmits a pause message in the form of message 60 to each adjacent upstream Metro node.
  • For the duration of the pause time in field 60 4, each of those Metro nodes is prohibited from transmitting a high class packet with ingress node ME 0 and egress node ME 1 to node MN x; however, during that same duration, each of those adjacent upstream Metro nodes is free to communicate packets of other sessions and also classes to node MN x, assuming that such nodes have not received, and do not in the interim receive, additional pause messages directed to such other packets.
  • the preferred embodiments provide a Metro Ethernet network with nodes functioning to detect a threshold-exceeding buffer and aggressive sessions therein, and to issue pause messages to upstream nodes in response thereto. Further in this regard, note that if an upstream node receives a pause message directed to a particular class and session, it will cease transmitting packets from that class and session toward the detecting node; this cessation may cause the upstream node itself to fill its own buffer beyond the buffer's global threshold, whereby that upstream node also will determine for its own buffer the aggressive session(s) and issue pause messages still further upstream.
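How an upstream node might honor a selective pause can be sketched as a table keyed by session and class. The class `UpstreamPauseTable`, the explicit `now` argument (used instead of a real clock so the behavior is easy to follow), and the choice to keep the later expiry when pauses overlap are all assumptions of this sketch.

```python
class UpstreamPauseTable:
    """Tracks which (ingress, egress, class) flows are currently paused."""

    def __init__(self):
        self._paused_until = {}  # (ingress, egress, class) -> expiry time

    def on_pause(self, ingress, egress, traffic_class, pause_time, now):
        """Record a received pause message for one session and class."""
        key = (ingress, egress, traffic_class)
        expiry = now + pause_time
        # Assumption: overlapping pauses keep whichever expiry is later.
        self._paused_until[key] = max(self._paused_until.get(key, 0.0), expiry)

    def may_send(self, ingress, egress, traffic_class, now):
        """True if packets of this session and class may go downstream."""
        key = (ingress, egress, traffic_class)
        return now >= self._paused_until.get(key, 0.0)


if __name__ == "__main__":
    table = UpstreamPauseTable()
    table.on_pause("ME0", "ME1", "high", pause_time=2.0, now=10.0)
    print(table.may_send("ME0", "ME1", "high", now=11.0))  # paused session
    print(table.may_send("ME0", "ME1", "low", now=11.0))   # other class
    print(table.may_send("ME0", "ME2", "high", now=11.0))  # other session
```

Packets held back by `may_send` accumulate in the upstream node's own buffer, which is how the backpressure described above can propagate further upstream when that node's global threshold is exceeded in turn.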
  • the preferred embodiments also provide various benefits over the prior art. For example, unlike the current IEEE 802.3x standard, the preferred embodiment pauses only those Metro sessions that contribute to the state of congestion. Further, the pause time is proportional to the aggressive session's level of aggressiveness; that is, more aggressive traffic is paused for a longer duration than less aggressive traffic, thereby distributing network resources more equitably. As another benefit in contrast to the prior art, under the preferred embodiment a node issuing a pause message can still receive other (i.e., non-aggressive) traffic from adjacent upstream nodes.
  • the preferred embodiment preferably implements a single buffer to monitor and control each session of similar class, and the partitioning of the different virtual space regions within that buffer is preferably altered over time, thereby also leading to reduced hardware modification and cost.
  • each session has an allocated virtual buffer space, which is determined from the session's evaluated share of bandwidth; because the spaces are virtual, the allocation of that space is non-rigid.
  • each of the Ethernet frames is augmented with a spanning tree id (ST-id).
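
The selective pause behaviour described in the items above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `PauseMessage`, `build_pause`, `UpstreamNode`, and the proportionality constant `unit_time` are hypothetical, and a session is assumed to be identified by its class plus its ingress and egress nodes. The sketch shows only two ideas taken from the text: the pause time grows in proportion to how far a session exceeds its virtual space region, and an upstream node withholds only the paused session while other sessions keep flowing.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical session key: (class, ingress node, egress node).
Session = Tuple[str, str, str]


@dataclass(frozen=True)
class PauseMessage:
    """Illustrative stand-in for message 60; field names are assumptions."""
    issuer: str        # node that detected congestion, e.g. "MN_x"
    session: Session   # class plus ingress/egress pair being paused
    pause_time: float  # plays the role of field 60_4, arbitrary time units


def build_pause(issuer: str, session: Session, occupancy: int,
                threshold: int, unit_time: float = 1.0) -> Optional[PauseMessage]:
    """Return a pause message if the session exceeds its region, else None.

    pause_time = unit_time * (occupancy - threshold); unit_time is an
    assumed proportionality constant that the text does not specify.
    """
    excess = occupancy - threshold
    if excess <= 0:
        return None
    return PauseMessage(issuer, session, unit_time * excess)


class UpstreamNode:
    """Tracks which sessions are paused toward which downstream node."""

    def __init__(self) -> None:
        self._paused_until: Dict[Tuple[str, Session], float] = {}

    def on_pause(self, msg: PauseMessage, now: float) -> None:
        # Only the named session toward the issuing node is affected.
        self._paused_until[(msg.issuer, msg.session)] = now + msg.pause_time

    def may_send(self, downstream: str, session: Session, now: float) -> bool:
        # Sessions with no pause entry are always allowed.
        return now >= self._paused_until.get((downstream, session), 0.0)
```

In this sketch a node holding 14 packets for a session whose threshold is 10 would issue a pause of `4 * unit_time`, and an upstream node receiving it would keep sending every other session to the issuer during that interval.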

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Small-Scale Networks (AREA)
  • Computer And Data Communications (AREA)
EP03021891A 2002-10-18 2003-09-27 Metro Ethernet network system with selective upstream pause messaging Expired - Lifetime EP1411680B1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US41975602P 2002-10-18 2002-10-18
US419756P 2002-10-18
US626066 2003-07-24
US10/626,066 US7327678B2 (en) 2002-10-18 2003-07-24 Metro ethernet network system with selective upstream pause messaging

Publications (3)

Publication Number Publication Date
EP1411680A2 EP1411680A2 (fr) 2004-04-21
EP1411680A3 EP1411680A3 (fr) 2004-05-06
EP1411680B1 EP1411680B1 (fr) 2006-11-02

Family

ID=32045444

Family Applications (1)

Application Number Title Priority Date Filing Date
EP03021891A Expired - Lifetime EP1411680B1 (fr) 2002-10-18 2003-09-27 Metro Ethernet network system with selective upstream pause messaging

Country Status (4)

Country Link
US (1) US7327678B2 (fr)
EP (1) EP1411680B1 (fr)
AT (1) ATE344564T1 (fr)
DE (1) DE60309414T2 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101453419B (zh) * 2007-12-06 2012-10-10 Lucent Technologies Inc. Controlling congestion in a packet-switched data network
EP2875617B1 (fr) * 2012-07-20 2017-11-08 Cisco Technology, Inc. Intelligent pause for distributed switch fabric system

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1260926C (zh) * 2002-11-08 2006-06-21 Huawei Technologies Co., Ltd. Flow control method for virtual container mapping channels in metropolitan area transmission equipment
CA2508833A1 (fr) * 2003-01-28 2004-08-12 Telefonaktiebolaget L M Ericsson (Publ) Method and device for notifying of congestion occurring in packet-switched networks and indicating the different causes
US7400639B2 (en) * 2003-08-07 2008-07-15 Intel Corporation Method, system, and article of manufacture for utilizing host memory from an offload adapter
US7508763B2 (en) * 2003-09-04 2009-03-24 Hewlett-Packard Development Company, L.P. Method to regulate traffic congestion in a network
US7317682B2 (en) * 2003-09-04 2008-01-08 Mitsubishi Electric Research Laboratories, Inc. Passive and distributed admission control method for ad hoc networks
US7761589B1 (en) * 2003-10-23 2010-07-20 Foundry Networks, Inc. Flow control for multi-hop networks
US7639608B1 (en) 2003-10-23 2009-12-29 Foundry Networks, Inc. Priority aware MAC flow control
EP1650905A1 (fr) * 2004-10-25 2006-04-26 Siemens Aktiengesellschaft Method for managing bandwidth profiles in a Metro Ethernet network (MAN)
US7436773B2 (en) * 2004-12-07 2008-10-14 International Business Machines Corporation Packet flow control in switched full duplex ethernet networks
US7697528B2 (en) * 2005-11-01 2010-04-13 Nortel Networks Limited Multilink trunking for encapsulated traffic
US7872973B2 (en) * 2006-03-17 2011-01-18 Alcatel Lucent Method and system for using a queuing device as a lossless stage in a network device in a communications network
US9621375B2 (en) 2006-09-12 2017-04-11 Ciena Corporation Smart Ethernet edge networking system
US8542693B2 (en) * 2007-08-01 2013-09-24 Texas Instruments Incorporated Managing free packet descriptors in packet-based communications
US8149719B2 (en) * 2008-02-13 2012-04-03 Embarq Holdings Company, Llc System and method for marking live test packets
US8315179B2 (en) * 2008-02-19 2012-11-20 Centurylink Intellectual Property Llc System and method for authorizing threshold testing within a network
US7978607B1 (en) * 2008-08-29 2011-07-12 Brocade Communications Systems, Inc. Source-based congestion detection and control
US20100260108A1 (en) * 2009-04-13 2010-10-14 Qualcomm Incorporated Setting up a reverse link data transmission within a wireless communications system
US20110267948A1 (en) * 2010-05-03 2011-11-03 Koc Ali T Techniques for communicating and managing congestion in a wireless network
US8861366B1 (en) * 2010-06-21 2014-10-14 Arris Enterprises, Inc. Multi-level flow control
US9847889B2 (en) * 2011-07-20 2017-12-19 Cisco Technology, Inc. Packet trains to improve packet success rate in carrier sense multiple access networks
US8817807B2 (en) 2012-06-11 2014-08-26 Cisco Technology, Inc. System and method for distributed resource control of switches in a network environment
US10791062B1 (en) * 2017-11-14 2020-09-29 Amazon Technologies, Inc. Independent buffer memory for network element
DE112020002497T5 (de) 2019-05-23 2022-04-28 Hewlett Packard Enterprise Development Lp System and method for dynamic allocation of reduction engines
US11809289B2 (en) * 2021-10-15 2023-11-07 Dell Products L.P. High-availability (HA) management networks for high performance computing platforms

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0748087A1 (fr) * 1995-06-09 1996-12-11 International Business Machines Corporation Shared buffer access control system
US6035333A (en) * 1997-11-24 2000-03-07 International Business Machines Corporation Method and system for providing congestion control in a data communications network
EP1079660A1 (fr) * 1999-08-20 2001-02-28 Alcatel Method for accepting data into a memory
US6405258B1 (en) * 1999-05-05 2002-06-11 Advanced Micro Devices Inc. Method and apparatus for controlling the flow of data frames through a network switch on a port-by-port basis
US20020087723A1 (en) * 2001-01-03 2002-07-04 Robert Williams Method and apparatus for performing priority-based flow control

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6167054A (en) * 1997-02-14 2000-12-26 Advanced Micro Devices, Inc. Method and apparatus providing programmable thresholds for full-duplex flow control in a network switch
US6026075A (en) * 1997-02-25 2000-02-15 International Business Machines Corporation Flow control mechanism
ATE329433T1 (de) * 2000-03-29 2006-06-15 Cit Alcatel Method for generating an acceptance decision in a telecommunication system
IL143456A0 (en) * 2001-05-30 2002-04-21 Broadlight Ltd Communication system architecture useful to converge voice and data services for point-to-multipoint transmission

Also Published As

Publication number Publication date
DE60309414T2 (de) 2007-09-06
DE60309414D1 (de) 2006-12-14
ATE344564T1 (de) 2006-11-15
EP1411680B1 (fr) 2006-11-02
EP1411680A3 (fr) 2004-05-06
US7327678B2 (en) 2008-02-05
US20040095882A1 (en) 2004-05-20

Similar Documents

Publication Publication Date Title
US7327678B2 (en) Metro ethernet network system with selective upstream pause messaging
US7372814B1 (en) Network system with color-aware upstream switch transmission rate control in response to downstream switch traffic buffering
US7136351B2 (en) Switched ethernet networks
KR100735408B1 (ko) Method for controlling switching of traffic by service class in an Ethernet-based network, and switching apparatus therefor
US6621791B1 (en) Traffic management and flow prioritization over multiple physical interfaces on a routed computer network
US8274887B2 (en) Distributed congestion avoidance in a network switching system
EP1457008B1 (fr) Methods and apparatus for network congestion control
US9699095B2 (en) Adaptive allocation of headroom in network devices
US8520522B1 (en) Transmit-buffer management for priority-based flow control
CN101040471B (zh) Ethernet extension for the data center
US8553684B2 (en) Network switching system having variable headers and addresses
CN1633786B (zh) Method and apparatus for priority-based flow control in an Ethernet architecture
US9882817B2 (en) Inter-device policing on network interface devices in LAG configuration
EP1650905A1 (fr) Method for managing bandwidth profiles in a Metro Ethernet network (MAN)
US20030185249A1 (en) Flow control and quality of service provision for frame relay protocols
US7944834B2 (en) Policing virtual connections
EP2651054A1 (fr) Fibre Channel over Ethernet
US7231471B2 (en) System using fairness logic for mediating between traffic associated with transit and transmit buffers based on threshold values of transit buffer
JPH06224941A (ja) Network access control system
CN103957156A (zh) Method for transmitting data over a network
US7092359B2 (en) Method for distributing the data-traffic load on a communication network and a communication network for implementing this method
CN107770085B (zh) Network load balancing method, device and system
CN110943933A (zh) Method, apparatus and system for implementing data transmission
US20050068798A1 (en) Committed access rate (CAR) system architecture
US6320870B1 (en) Method and apparatus for flow control on a switched CSMA/CD network implementing BLAM

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL LT LV MK

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL LT LV MK

RIN1 Information on inventor provided before grant (corrected)

Inventor name: CHIRUVOLU, GIRISH

Inventor name: HAMZAH, MOHAMED NIZAM MOHAMED

17P Request for examination filed

Effective date: 20041105

AKX Designation fees paid

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: BE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: LI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: CH

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REF Corresponds to:

Ref document number: 60309414

Country of ref document: DE

Date of ref document: 20061214

Kind code of ref document: P

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070202

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070202

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070202

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070213

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070402

RAP2 Party data changed (patent owner data changed or rights of a patent transferred)

Owner name: ALCATEL LUCENT

NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

ET Fr: translation filed

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20070803

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070930

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070203

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070927

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070927

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061102

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070503

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: IT

Payment date: 20100923

Year of fee payment: 8

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110927

REG Reference to a national code

Ref country code: FR

Ref legal event code: GC

Effective date: 20140717

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20140922

Year of fee payment: 12

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20140919

Year of fee payment: 12

Ref country code: FR

Payment date: 20140919

Year of fee payment: 12

REG Reference to a national code

Ref country code: FR

Ref legal event code: RG

Effective date: 20141016

REG Reference to a national code

Ref country code: FR

Ref legal event code: CA

Effective date: 20150521

REG Reference to a national code

Ref country code: FR

Ref legal event code: CA

Effective date: 20150521

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 60309414

Country of ref document: DE

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20150927

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20160531

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20150927

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20160401

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20150930