WO2001005107A1 - Apparatus and method to minimize congestion in an output queuing switch - Google Patents

Apparatus and method to minimize congestion in an output queuing switch

Info

Publication number
WO2001005107A1
Authority
WO
WIPO (PCT)
Prior art keywords
moving average
utilization
data
threshold
established
Application number
PCT/US2000/019006
Other languages
French (fr)
Inventor
Cheng-Gang Kong
Original Assignee
Alteon Web Systems, Inc.
Priority to US14343199P
Priority to US60/143,431
Application filed by Alteon Web Systems, Inc.
Publication of WO2001005107A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
        • H04L47/00 Traffic regulation in packet switching networks
            • H04L47/10 Flow control or congestion control
                • H04L47/29 Using a combination of thresholds
                • H04L47/30 Flow control or congestion control using information about buffer occupancy at either end or transit nodes
                • H04L47/32 Packet discarding or delaying
        • H04L49/00 Packet switching elements
            • H04L49/30 Peripheral units, e.g. input or output ports
            • H04L49/50 Overload detection; Overload protection
        • H04L12/00 Data switching networks
            • H04L12/54 Store-and-forward switching systems
                • H04L12/56 Packet switching systems
                    • H04L12/5601 Transfer mode dependent, e.g. ATM
                        • H04L2012/5629 Admission control
                            • H04L2012/5631 Resource management and allocation
                                • H04L2012/5636 Monitoring or policing, e.g. compliance with allocated rate, corrective actions
                        • H04L2012/5638 Services, e.g. multimedia, GOS, QOS
                            • H04L2012/5646 Cell characteristics, e.g. loss, delay, jitter, sequence integrity
                                • H04L2012/5647 Cell loss
                                    • H04L2012/5648 Packet discarding, e.g. EPD, PTD
                        • H04L2012/5678 Traffic aspects, e.g. arbitration, load balancing, smoothing, buffer management
                            • H04L2012/5681 Buffer or queue management
                                • H04L2012/5682 Threshold; Watermark

Abstract

Computer network switching units have limited hardware resources. When a switching unit can not accommodate the aggregate data load arriving at its input, it must drop data frames that it can not forward. Switching units will normally set two thresholds that are compared to the current state of hardware utilization in the switching unit. The first threshold indicates that the utilization of hardware is low. If current utilization falls below this first threshold, the inference is that the switching unit can pass a data frame. If current hardware utilization exceeds an upper threshold, then the inference is that the switching unit is saturated and can not effectively pass a data frame. When current hardware utilization is found to be between these two threshold limits, the switching unit relies on one of four probability tables to decide if the data frame should be dropped. The values in these tables are established empirically. The probability values in these four tables define the probability that a data frame will be lost given the current level of hardware utilization. The switching unit maintains historical trends of utilization and uses the sign of the first derivative of the historical data and the sign of the first derivative of a filtered version of the historical data to predict if hardware utilization is increasing or decreasing. Using the signs of these two first derivatives to select one of the four tables, the switching unit can then read a probability value from one of the tables. If that value exceeds a third threshold, the switching unit decides to drop the data frame.

Description

APPARATUS AND METHOD TO MINIMIZE CONGESTION IN AN OUTPUT QUEUING SWITCH

BACKGROUND OF THE INVENTION

TECHNICAL FIELD

The present invention relates to the processing and management of data flowing through a computer network switch.

DESCRIPTION OF THE PRIOR ART

Computer networks are constructed by tying together a plurality of switching units. The switching units receive data from various sources in a quantum known as a frame. As computer networks continue to proliferate throughout the world, switching units must be able to route an ever-increasing bandwidth of data. As the bandwidth increases, switching units must be able to handle a greater number of data frames per given unit of time.

The switching units themselves rely on various strategies to ensure that the total frame rate can be accommodated. In the previous art, as frame rate increased, the hardware foundation of the switching unit would be taxed to such an extent that some frames would inevitably be lost. These lost frames are known as dropped frames.

Switching units do not drop frames arbitrarily. Frames are dropped intentionally and systematically, based on a set of criteria that reflects the current utilization of all of the resources in the switching unit. Some switching units known today wait until the data frames undergo a process known as "forwarding" before the decision to drop a frame is made. This means that the decision point occurs after the frame is queued up for output to a communications channel. Contention for available resources in the switching unit causes a state of congestion through the switching unit. To reduce the congestion, the switching unit executes a process that monitors the total amount of data traffic currently being handled and creates historical traffic patterns that it uses to predict future contention levels. These techniques are collectively known as active queue management methods.

One popular congestion prediction mechanism employed by prior art switching units is known as the Random Early Drop method. The Random Early Drop method compares current resource demand against two thresholds: high-threshold and low-threshold. The resource that is monitored is utilization of queue output buffers. In order to reduce the noise associated with the bursty use of these output buffers, the utilization rate is first subjected to a low pass filter. It is the output of the low pass filter that is actually compared against the two thresholds.

If the output of the low pass filter is greater than the high-threshold, the data frame is dropped. Conversely, data frames are never dropped if the low-threshold level is not reached. When the filtered buffer utilization rate is found to be between the two thresholds, a table is consulted to determine the probability that the new data frame will cause congestion.

The probability tables are indexed by the difference between the filtered output and the low-threshold. The probability tables specify the likelihood of congestion based on a-priori knowledge of the switching unit's capabilities.

SUMMARY OF THE INVENTION

The methods and apparatus described herein implement a novel and unique facility that provides for significant improvement in the actual rate at which data frames are dropped by a computer network switching unit. By generating and continuously updating utilization histograms, the switching unit can anticipate utilization of switching unit resources. The switching unit uses a new congestion controller that considers the first derivatives of the real-time utilization and of a filtered rendition of the utilization histograms. The signs of these two derivatives define four states of utilization, wherein each state carries an inference of upcoming changes in utilization. The new congestion controller uses these four states to select one of four probability tables. The congestion controller reads a probability value from the selected table to determine if a data frame should be dropped.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a function flow diagram that depicts the processing performed by the Prior Art Random Early Drop method of congestion prediction;

Fig. 2 is a graph that presents a histogram for output queue buffer volume;

Fig. 3 is a table that defines the four states that the variance in queue buffer utilization and the average thereof can assume;

Fig. 4 is a block diagram that depicts the preferred hardware embodiment of the present invention;

Fig. 5 is a block diagram that depicts the preferred hardware embodiment of the congestion controller integral to the present invention;

Fig. 6A is the first portion of a flow diagram that depicts the sequence of steps that the congestion controller follows when determining if a data frame should be dropped; and

Fig. 6B is the second portion of a flow diagram that depicts the sequence of steps that the congestion controller follows when determining if a data frame should be dropped.

DETAILED DESCRIPTION OF THE INVENTION

The present invention is a method and an apparatus that minimizes the loss of data flowing through a switching unit based on a novel method of congestion control. The apparatus is embodied as a queue switch.

Prior Art

In order to fully appreciate the utility of the present invention, it is necessary to review the prior art related to congestion control and most specifically the Random Early Drop (RED) method of congestion prediction.

Fig. 1 depicts the processing performed by the RED method. Real-time buffer utilization rates are indicative of the number of output queues that will be required to send output data to a communications channel. As can be seen in this figure, the real-time buffer utilization rates 5 are directed to a low pass filter 10. The low pass filter 10 is implemented in software that executes in the control processor of a switching unit. The low pass filter 10 is best implemented as an exponential weighted moving average, the output of which is referred to by the mnemonic AVG. The moving average filter reduces short-term variations in the buffer utilization rate. This provides a more stable statistical basis for the congestion prediction method. The output of the low pass filter 10 is called the average buffer utilization rate 15 (AVG). In the prior art RED method, the value of AVG 15 is immediately compared to two thresholds: high-threshold and low-threshold. The comparison is made in software, but it is functionally depicted in this figure as two hardware comparators, 20 and 25 respectively. The output of the high-threshold comparator 20 is used to determine if the incoming data frame must be dropped. If the value of AVG 15 is less than the low-threshold, the low-threshold comparator 25 indicates that the frame can be accepted without increased contention for the switching unit's hardware resources.

If neither the drop-frame indicator 30 nor the accept-frame indicator 35 is active, as detected by an AND gate 40, the value of AVG 15 is summed with the negative of the low-threshold. The difference of these two values, which is called the table index 45, is used to select a probability value from a probability table 50. The probability table 50 is filled with values that are empirically determined by monitoring the performance of the switching unit under varying data loads with the congestion control methods disabled. The output of the probability table 50 is then compared against an acceptable probability threshold 55 in order to make the drop-accept decision.
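The RED processing of Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by any particular switching unit: the function names, the filter weight of 0.25, the threshold values, and the linearly increasing table contents are all hypothetical and chosen only for readability.

```python
def ewma(prev_avg, sample, weight=0.25):
    """Low pass filter 10 modeled as an exponentially weighted moving average.

    The weight of 0.25 is an assumed value, not taken from the source.
    """
    return (1.0 - weight) * prev_avg + weight * sample


def red_should_drop(avg, low, high, table, prob_threshold):
    """Return True to drop the frame and False to accept it."""
    if avg > high:                # high-threshold comparator 20
        return True
    if avg < low:                 # low-threshold comparator 25
        return False
    # Table index 45: AVG summed with the negative of the low-threshold.
    # Clamping to the table length is an assumption made for safety.
    index = min(int(avg - low), len(table) - 1)
    return table[index] > prob_threshold   # probability threshold 55


# Hypothetical probability table 50 with 40 entries rising from 0.0 to 0.975.
example_table = [i / 40 for i in range(40)]

avg = 0.0
for utilization in (10, 40, 80, 80, 80):   # bursty real-time utilization 5
    avg = ewma(avg, utilization)
decision = red_should_drop(avg, low=20, high=60,
                           table=example_table, prob_threshold=0.5)
```

Because the filtered value lags the raw samples, the burst to 80 does not trip the high-threshold immediately; the decision in the intermediate range is left to the table lookup.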

Improved Method

The prior art is an effective congestion control method, but it has limited value because its prediction is based only on the filtered value of the data frame volume. The two thresholds are an adequate means for establishing a straightforward frame-dropping criterion, but the intermediate values of AVG 15 that lie between the two thresholds add little to no benefit aside from serving as an index into the probability table 50. The improved method considers the historical trend of resource utilization in order to provide additional table indexing. Real-time buffer utilization rates can change almost instantaneously, i.e. in step functions. This type of traffic pattern is often referred to as being bursty. The filtered rendition of the real-time buffer utilization rate does not exhibit these step functions; rather, it follows a smoother curve commensurate with the filtering function provided by the low pass filter 10.

The prior art used the filtered value of the real-time buffer utilization, AVG 15, alone to index a table of congestion probabilities. The present invention uses not only the AVG 15 value, but also considers the direction of the change in the value of AVG 15 to provide additional frame-drop decision criteria. The present invention also considers the direction of change in the unfiltered real-time buffer utilization.

Fig. 2 presents a typical histogram for network data frame volume. The gray bars 65 record the number of queue buffers being used by the switching unit per unit of time (such as 5,000 per second). The output of the low pass filter is recorded as a curve 70. The present invention exploits the predictive nature of the filter output 70 in that the sign of the first derivative of this AVG curve indicates if the average of the buffer utilization rate is increasing or decreasing. The present invention also determines the sign of the first derivative of the total number of output buffers being used per unit time. This latter characteristic indicates if the utilization of buffers is declining or increasing. Any values specified in this paragraph or that can be inferred from Fig. 2 are for purposes of illustration only. Actual values are dependent on the actual load conditions that the switching unit is exposed to.

Fig. 3 presents a table that defines the four states that are defined by the polarity of the first derivative of the quantity of buffers used and the first derivative of the AVG moving average of the quantity of buffers used. In effect, these two first derivatives predict the direction of change in hardware utilization based on the historical trend. A key feature of the present invention is the use of this trend-based prediction to distinguish states of hardware utilization. By distinguishing the state of hardware utilization, the improved method of congestion control can select one of four probability tables instead of one.
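The four-state selection can be illustrated with a small lookup keyed on the two derivative signs. The sketch below is hypothetical: Fig. 3 is not reproduced in this text, so the assignment of sign combinations to the labels PT-1 through PT-4 is an assumption, as is the choice to treat a zero derivative as "not increasing".

```python
# Assumed mapping of (utilization rising?, moving average rising?) to a table
# label. The actual assignment in Fig. 3 may differ.
TABLE_FOR_STATE = {
    (True, True): "PT-1",
    (True, False): "PT-2",
    (False, True): "PT-3",
    (False, False): "PT-4",
}


def select_table_name(d_utilization, d_average):
    """Pick a probability table label from the signs of the two derivatives.

    d_utilization: first derivative of the raw quantity of buffers used.
    d_average: first derivative of the AVG moving average.
    A derivative of exactly zero is treated as not increasing (assumption).
    """
    return TABLE_FOR_STATE[(d_utilization > 0, d_average > 0)]
```

For example, a falling raw utilization with a still-rising average, which typically occurs just after a burst, selects a different table than the case where both quantities are rising.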

In practice, the four probability tables, referenced herein as PT-1 through PT-4, inclusive, are populated with probability values that are discovered through an empirical process. This process involves subjecting the switching unit to varying load conditions that result in the four states defined in Fig. 3. The switching unit is then operated in each of these four states with the congestion control process disabled. The actual drop rate for data frames is recorded for each state at varying levels of the AVG moving average. Basic statistical methods are used to develop drop probability values for each state at the various AVG levels that index the probability table. The probability threshold that is compared against the values stored in the tables can be derived empirically. As an added refinement to this method, the probability threshold can be generated in a random fashion in order to approximate the actual random nature of network loading.
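One simple way to turn such recordings into table values is to compute, for each state and each AVG level, the fraction of offered frames that were lost. The following Python sketch assumes that approach; the source says only that "basic statistical methods" are used, so the relative-frequency estimate, the observation format, and the function name are all assumptions.

```python
from collections import defaultdict


def build_probability_tables(observations, num_levels):
    """Estimate drop probabilities as relative frequencies.

    observations: iterable of (state, avg_level, was_lost) tuples recorded
    with congestion control disabled. This tuple layout is hypothetical.
    Levels with no observations are given probability 0.0 (assumption).
    """
    offered = defaultdict(lambda: [0] * num_levels)
    lost = defaultdict(lambda: [0] * num_levels)
    for state, level, was_lost in observations:
        offered[state][level] += 1
        if was_lost:
            lost[state][level] += 1
    tables = {}
    for state, counts in offered.items():
        tables[state] = [
            lost[state][i] / counts[i] if counts[i] else 0.0
            for i in range(num_levels)
        ]
    return tables


# Tiny made-up data set, for illustration only.
observations = [
    ("PT-1", 0, False),
    ("PT-1", 0, True),
    ("PT-1", 1, True),
    ("PT-2", 0, False),
]
tables = build_probability_tables(observations, num_levels=2)
```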

Preferred Embodiment

This improved method of congestion control is best reduced to practice in the form of computer based signal processing. On a periodic basis, a processing element maintains the history of queue utilization. From this history, the processing element calculates a moving average of queue utilization and differentiates both the raw queue utilization function and the filtered moving average function.

Fig. 4 presents the preferred embodiment of the new switching unit including a new congestion controller 100. In operation, the switching unit 95 receives data from an external source through a media attachment unit 105. The media attachment unit 105 generally receives serial data, although the data can be parallel. After receiving data from the external source, the media attachment unit 105 creates data frames that it then presents to an input first-in-first-out (FIFO) buffer 110.

The input FIFO 110 forwards each data frame that it receives to a forwarding engine 115. The forwarding engine 115 determines what output queue each data frame must be directed to in accordance with either a-priori routing knowledge or dynamic maps that it creates. Once the forwarding engine 115 has processed a data frame, it is stored in a queue memory 120. The forwarding engine 115 identifies the data frame and the queue to which it was directed and delivers this identification to a queue linker 125. The queue linker 125 informs the congestion controller 100 that a new queue buffer has been allocated. If the congestion controller 100 determines that the data frame should be dropped, queue linker 125 removes the data frame from the processing stream and frees the associated data block in the queue memory 120.

Otherwise, the queue linker 125 notifies a switch queue 130 that the queue can be transmitted.

Once the switch queue 130 acknowledges the new queue, it retrieves the data frame from the queue memory 120 and delivers it to a switch media access controller (MAC) 135.

Fig. 5 is a block diagram that depicts the construction of the congestion controller 100. The congestion controller 100 comprises a high speed processing element 150, a firmware storage memory 155, a history memory 170 and a probability table memory 180. A regular central processor unit (CPU) or a digital signal processor (DSP) can be used in this application. The CPU, or alternatively the DSP, executes a series of instructions stored in the firmware storage memory 155.

The congestion controller 100 further comprises an input port 160 and an output port 165. The input port 160 is used by the processing element 150 to detect when a buffer has been allocated. A signal is received from the queue linker 125 that indicates when buffers are allocated. This signal is then captured by the input port 160 and conveyed to the processing element 150. After the processing element 150 has determined that a data frame should be dropped, it sends a drop-frame signal to the switch queue 130 using the output port 165.

Fig. 6A and Fig. 6B demonstrate the functional flow of the instruction sequence stored in the firmware storage memory 155. Once the processing element 150 has sensed the buffer allocation signal (step 200), it begins the process of creating a histogram of buffer allocations (step 205). This is stored in the history memory 170 as a function of time, B(t). The processing element 150, based on the history of the buffer allocation, creates a moving average of the buffer allocation function (step 210). Any suitable moving average method can be used. The moving average is referred to by the mnemonic AVG.

On a periodic basis, the period of which is established empirically to maximize the throughput of the switching unit, the processing element 150 executes a series of instructions that effectively differentiates the buffer allocation function stored in the history memory 170 (step 215). The resultant first derivative of the buffer allocation function is also stored in the history memory 170. The processing element 150 then executes a series of instructions that differentiate the moving average of the buffer allocation function (step 220). The resultant first derivative of the moving average of the buffer allocation function is also stored in the history memory 170.
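Steps 205 through 220 can be sketched as follows. This Python is an illustrative model only: the class name, the use of an exponentially weighted moving average with a weight of 0.5, and the approximation of each first derivative by the difference between the two most recent samples are assumptions, since the source permits any suitable moving average and does not specify how differentiation is performed.

```python
def _sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


class UtilizationHistory:
    """Hypothetical stand-in for the contents of the history memory 170."""

    def __init__(self, weight=0.5):
        self.weight = weight      # assumed filter weight
        self.samples = []         # B(t): buffers allocated per period
        self.averages = []        # AVG(t): moving average of B(t)

    def record(self, buffers_in_use):
        """Steps 205 and 210: extend the history and the moving average."""
        prev = self.averages[-1] if self.averages else float(buffers_in_use)
        avg = (1 - self.weight) * prev + self.weight * buffers_in_use
        self.samples.append(buffers_in_use)
        self.averages.append(avg)

    def derivative_signs(self):
        """Steps 215 and 220: signs of dB/dt and dAVG/dt by finite difference."""
        if len(self.samples) < 2:
            return (0, 0)
        d_b = self.samples[-1] - self.samples[-2]
        d_avg = self.averages[-1] - self.averages[-2]
        return (_sign(d_b), _sign(d_avg))


history = UtilizationHistory()
for buffers in (10, 30, 24):
    history.record(buffers)
signs = history.derivative_signs()
```

In this example the raw utilization has just fallen from 30 to 24 while the moving average is still rising from 20 to 22, so the two signs differ. This is the kind of case that a single filtered value cannot distinguish.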

The processing element 150 maintains upper and lower threshold values in the probability table memory 180. These are referred to as TH and TL respectively. Whenever the switching unit must decide if a data frame should be dropped, the value of the moving average of buffer allocation (AVG) is compared to the upper and lower thresholds. If the value of AVG exceeds the upper threshold TH (step 225), then the processing element 150 uses the output port 165 to signal the switch queue 130 that the data frame should be dropped (step 230). If the value of AVG is less than the lower threshold TL (step 235), the processing element 150 does not perform any other processing for the current data frame and the data frame is not dropped. This method is analogous to the prior art.

Fig. 6A shows that, in the present invention, the processing element 150 performs additional processing to determine if a data frame should be dropped. The processing element 150 uses the sign of the first derivative of the buffer allocation function and the sign of the first derivative of the moving average to select one of four probability tables stored in the probability table memory 180 (step 240). The table selection is made according to the combinations described in Fig. 3. If the value of AVG is greater than the lower threshold, as determined by inference by step 235, the processing element 150 then subtracts the value of the lower threshold TL from the moving average AVG (AVG - TL) (step 245). The difference AVG - TL is used as an index into the selected probability table according to the statement:

P[(AVG - TL)]

where P is a probability table selected by the sign of the first derivative of the buffer utilization function and the sign of the first derivative of the buffer utilization moving average AVG.

Once the processing element 150 has read a probability value from one of the four probability tables stored in the probability table memory 180 (step 250), it then compares that value to an empirically established probability threshold Tp. If the table value exceeds the probability threshold (step 255), then the processing element 150 uses the output port 165 to indicate to the switch queue 130 that the current data frame must be dropped (step 260). In a refinement to the present embodiment, the probability threshold for this comparison can be derived in a random manner to more closely approximate the random nature of actual network loading.
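The complete decision sequence of steps 225 through 260 can be summarized in one function. The sketch below is a hedged illustration, not the firmware described above: the function signature, the dictionary keyed on derivative signs, the clamping of the index, and the example table contents are hypothetical. Passing no fixed threshold models the refinement in which the probability threshold is drawn at random.

```python
import random


def should_drop(avg, d_util, d_avg, t_low, t_high, tables,
                t_prob=None, rng=random):
    """Return True if the current data frame should be dropped.

    tables maps (d_util > 0, d_avg > 0) to a list of probabilities; this
    layout is an assumption standing in for probability table memory 180.
    """
    if avg > t_high:                         # steps 225 and 230
        return True
    if avg < t_low:                          # step 235
        return False
    table = tables[(d_util > 0, d_avg > 0)]  # step 240
    index = min(int(avg - t_low), len(table) - 1)   # step 245, clamped
    probability = table[index]               # step 250
    if t_prob is None:
        t_prob = rng.random()                # randomized threshold refinement
    return probability > t_prob              # steps 255 and 260


# Made-up tables: high drop probability only when both trends are rising.
example_tables = {
    (True, True): [0.9] * 20,
    (True, False): [0.1] * 20,
    (False, True): [0.1] * 20,
    (False, False): [0.1] * 20,
}
```

With these made-up tables and a fixed threshold of 0.5, the same AVG value of 50 between thresholds of 40 and 60 leads to a drop when both derivatives are positive and to acceptance when both are negative.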

Alternative Embodiments

The key essence of the present invention is the use of the trend analysis mechanism to predict the direction of change in buffer utilization and the moving average thereof. Many alternative embodiments have been considered by the inventor including, but not limited to, using a multidimensional table for the storage of empirically discovered probability values. In such an alternative embodiment, the four tables discussed herein are replaced with one table having three indices. A value from such a table would be referenced by the statement:

P[B'(t), AVG', (AVG - TL)]

Two tables could be used to store probability values, with each table having two indices. An example of such a table reference, having two tables selected by the sign of the first derivative of the unfiltered buffer utilization, would be:

P_B'(t)[AVG', (AVG - TL)]

All of the probability tables have been described with an index that represents the difference between the average buffer utilization AVG and the lower threshold TL. Alternatively, the index to any of these tables can be the difference of the upper threshold TH and the average buffer utilization AVG.
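The single-table alternative with three indices might be modeled as a nested list, as in the Python sketch below. The nesting order, the encoding of each derivative sign as 0 or 1, the clamping of the level index, and the sample values are all assumptions made for illustration. The second helper shows the alternative index formed from the upper threshold.

```python
def lookup_3d(table, d_util, d_avg, avg, t_low):
    """Read P[B'(t), AVG', (AVG - TL)] from one three-index table.

    The first two indices encode the derivative signs (1 if positive,
    otherwise 0); the third is the clamped difference AVG - TL.
    """
    i = 1 if d_util > 0 else 0
    j = 1 if d_avg > 0 else 0
    levels = table[i][j]
    k = min(max(int(avg - t_low), 0), len(levels) - 1)
    return levels[k]


def index_from_upper(avg, t_high, num_levels):
    """Alternative index based on TH - AVG instead of AVG - TL."""
    return min(max(int(t_high - avg), 0), num_levels - 1)


# Hypothetical 2 x 2 x 3 table of drop probabilities.
example_table = [
    [[0.1, 0.2, 0.3], [0.2, 0.3, 0.4]],
    [[0.3, 0.4, 0.5], [0.6, 0.8, 0.9]],
]
```

Note that a table indexed by TH - AVG would need its entries stored in the opposite order, since a small difference then corresponds to utilization close to the upper threshold.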

Claims

1. A method for reducing resource congestion in an output queuing switch comprising the steps of: monitoring the utilization of output queue buffers; calculating a moving average of the utilization of output queue buffers; accepting a data frame if the moving average is less than a first lower threshold; dropping a data frame if the moving average is greater than a first upper threshold; determining the sign of the first derivative of said utilization of output queue buffers; determining the sign of the first derivative of said moving average; selecting a probability table based on said sign of the first derivative of said utilization of output queue buffers and said sign of the first derivative of said moving average; using the difference between said moving average and said lower threshold as an index to read a value in said selected probability table; and dropping the frame if said value read from said selected probability table according to said index exceeds a pre-determined value and the value of said moving average is greater than said first lower threshold.
2. The method of Claim 1 wherein the pre-determined value is established in a random manner.
3. The method of Claim 1, wherein the first lower threshold in the step of accepting a data frame if the moving average is less than a first lower threshold is established empirically.
4. The method of Claim 1, wherein the first upper threshold in the step of dropping a data frame if the moving average is greater than a first upper threshold is established empirically.
5. The method of Claim 1, wherein the values stored in said probability tables are established empirically.
6. The method of Claim 1, wherein the first lower threshold in the step of accepting a data frame if the moving average is less than a first lower threshold is established analytically.
7. The method of Claim 1, wherein the first upper threshold in the step of dropping a data frame if the moving average is greater than a first upper threshold is established analytically.
8. The method of Claim 1, wherein the values stored in said probability tables are established analytically.
9. A method for reducing resource congestion in an output queuing switch comprising the steps of: monitoring the utilization of output queue buffers; calculating a moving average of the utilization of output queue buffers; accepting a data frame if the moving average is less than a first lower threshold; dropping a data frame if the moving average is greater than a first upper threshold; determining the sign of the first derivative of the utilization of output queue buffers; determining the sign of the first derivative of the moving average; using the difference between said moving average and said lower threshold together with said sign of the first derivative of said utilization of output queue buffers and said sign of the first derivative of said moving average as indices to read a value in a probability table; and dropping the frame if said value read from said probability table according to said indices exceeds a pre-determined value and the value of said moving average is greater than said first lower threshold.
10. The method of Claim 9 wherein the pre-determined value is established in a random manner.
11. The method of Claim 9, wherein the first lower threshold in the step of accepting a data frame if the moving average is less than a first lower threshold is established empirically.
12. The method of Claim 9, wherein the first upper threshold in the step of dropping a data frame if the moving average is greater than a first upper threshold is established empirically.
13. The method of Claim 9, wherein the values stored in said probability tables are established empirically.
14. The method of Claim 9, wherein the first lower threshold in the step of accepting a data frame if the moving average is less than a first lower threshold is established analytically.
15. The method of Claim 9, wherein the first upper threshold in the step of dropping a data frame if the moving average is greater than a first upper threshold is established analytically.
16. The method of Claim 9, wherein the values stored in said probability tables are established analytically.
17. An output queuing switch apparatus comprising: network receiver circuit that accepts data from an external data source; wire-input first-in-first-out buffer that accepts data from said network receiver circuit and assembles data frames; memory element; forwarding engine that receives data frames from said wire-input first-in-first-out buffer, determines the appropriate destination output queue and stores data frames in said memory element; queue linker that receives data frame descriptors from said forwarding engine and creates output queues containing said data in said memory element; congestion controller that monitors the utilization of output queue buffers, calculates a moving average based on the utilization of output queue buffers, issues a discard signal if said moving average exceeds a first upper threshold, determines the sign of the first derivative of the utilization of output queue buffers, determines the sign of the first derivative of said moving average, selects a probability table based on the said sign of the first derivative of said utilization of output queue buffers and said sign of the first derivative of said moving average, issues a discard signal if said moving average is greater than a first lower threshold and the value stored in said selected probability table as indexed by the difference of said moving average and said first lower threshold exceeds a first probability threshold; switch manager that accepts queue data from said memory and discards data frames in response to a discard signal issued by said congestion controller; and switch media access controller that accepts queue data from said switch manager and dispatches the queue data to a network port.
18. The apparatus of Claim 17, wherein the first lower threshold used in the congestion controller is established empirically.
19. The apparatus of Claim 17, wherein the first upper threshold used in the congestion controller is established empirically.
20. The apparatus of Claim 17, wherein the values stored in said probability tables used in the congestion controller are established empirically.
21. The apparatus of Claim 17, wherein the first lower threshold used in the congestion controller is established analytically.
22. The apparatus of Claim 17, wherein the first upper threshold used in the congestion controller is established analytically.
23. The apparatus of Claim 17, wherein the values stored in said probability tables used in the congestion controller are established analytically.
24. The apparatus of Claim 17, wherein said first probability threshold is established empirically.
25. The apparatus of Claim 17, wherein said first probability threshold is established in a random manner.
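The discard decision recited for the congestion controller of Claim 17 can be illustrated with a minimal Python sketch. It is not the patented implementation. The class name `CongestionController`, the method `should_discard`, the fixed-size window used for the moving average, the finite-difference approximation of the first derivative, and the clamping of the table index are all assumptions made for illustration; the claims do not specify any of them.

```python
from collections import deque


def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


class CongestionController:
    """Hypothetical sketch of the per-queue discard decision of Claim 17."""

    def __init__(self, lower, upper, tables, window=4):
        self.lower = lower      # first lower threshold
        self.upper = upper      # first upper threshold
        # Assumption: one probability table per pair
        # (sign of d(utilization), sign of d(moving average)).
        self.tables = tables
        # Assumption: the moving average is a plain mean over a sliding window.
        self.samples = deque(maxlen=window)
        self.prev_util = 0
        self.prev_avg = 0.0

    def should_discard(self, utilization, prob_threshold):
        """Return True if a discard signal would be issued for this sample."""
        self.samples.append(utilization)
        avg = sum(self.samples) / len(self.samples)

        # Assumption: the sign of the first derivative is approximated by
        # the sign of the difference between consecutive samples.
        s_util = sign(utilization - self.prev_util)
        s_avg = sign(avg - self.prev_avg)
        self.prev_util, self.prev_avg = utilization, avg

        if avg > self.upper:
            return True             # unconditional discard above upper threshold
        if avg > self.lower:
            table = self.tables[(s_util, s_avg)]
            # Index by the distance above the lower threshold, clamped to
            # the table length (the clamp is an assumption).
            index = min(int(avg - self.lower), len(table) - 1)
            return table[index] > prob_threshold
        return False                # below the lower threshold: never discard
```

A caller would build one table for each of the nine sign pairs and invoke `should_discard` once per utilization sample. Passing a fresh value from `random.random()` as `prob_threshold` is one possible reading of a threshold "established in a random manner" (Claim 25), while a fixed constant corresponds to an empirically chosen threshold (Claim 24).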
26. An output queuing switch apparatus comprising: network receiver circuit that accepts data from an external source; wire-input first-in-first-out buffer that accepts data from said network receiver circuit and assembles data frames; memory element; forwarding engine that receives data frames from said wire-input first-in-first-out buffer, determines the appropriate destination output queue and stores data frames in said memory element; queue linker that receives data frame descriptors from said forwarding engine and creates output queues containing said data in said memory element; congestion controller that monitors the utilization of output queue buffers, calculates a moving average based on the utilization of output queue buffers, issues a discard signal if said moving average exceeds a first upper threshold, determines the sign of the first derivative of the utilization of output queue buffers, determines the sign of the first derivative of said moving average, and issues a discard signal if said moving average is greater than a first lower threshold and the value stored in a probability table as indexed by the difference of said moving average and said first lower threshold together with said sign of the first derivative of said utilization of output queue buffers and said sign of the first derivative of said moving average exceeds a first probability threshold; switch manager that accepts queue data from said memory and discards data frames in response to a discard signal issued by said congestion controller; and switch media access controller that accepts queue data from said switch manager and dispatches the queue data to a network port.
27. The apparatus of Claim 26, wherein the first lower threshold used in the congestion controller is established empirically.
28. The apparatus of Claim 26, wherein the first upper threshold used in the congestion controller is established empirically.
29. The apparatus of Claim 26, wherein the values stored in said probability table used in the congestion controller are established empirically.
30. The apparatus of Claim 26, wherein the first lower threshold used in the congestion controller is established analytically.
31. The apparatus of Claim 26, wherein the first upper threshold used in the congestion controller is established analytically.
32. The apparatus of Claim 26, wherein the values stored in said probability table used in the congestion controller are established analytically.
33. The apparatus of Claim 26, wherein said first probability threshold is established empirically.
34. The apparatus of Claim 26, wherein said first probability threshold is established in a random manner.
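Claim 26 differs from Claim 17 in that a single probability table is indexed jointly by the distance of the moving average above the lower threshold and by the two derivative signs, rather than selecting one of several tables. The following Python sketch illustrates that lookup under stated assumptions: the names `build_table` and `discard_decision`, the dictionary keyed by a `(difference, sign, sign)` tuple, the `depth` parameter, and the example probability values are all hypothetical and do not come from the patent.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def build_table(depth):
    """Build a hypothetical table keyed by (difference, s_util, s_avg).

    Assumption: the stored value grows with the distance above the lower
    threshold and is halved unless both derivatives are positive.
    """
    table = {}
    for diff in range(depth):
        base = (diff + 1) / (depth + 1)
        for s_util in (-1, 0, 1):
            for s_avg in (-1, 0, 1):
                rising = s_util > 0 and s_avg > 0
                table[(diff, s_util, s_avg)] = base if rising else base * 0.5
    return table


def discard_decision(avg, prev_avg, util, prev_util,
                     lower, upper, table, depth, prob_threshold):
    """Return True if a discard signal would be issued (Claim 26 variant)."""
    if avg > upper:
        return True                 # above the upper threshold: always discard
    if avg <= lower:
        return False                # at or below the lower threshold: never discard
    # Assumption: derivative signs come from consecutive-sample differences.
    s_util = sign(util - prev_util)
    s_avg = sign(avg - prev_avg)
    # Clamp the difference so it stays inside the table (an assumption).
    diff = min(int(avg - lower), depth - 1)
    return table[(diff, s_util, s_avg)] > prob_threshold
```

With `depth=4`, a moving average of 13 against a lower threshold of 10 maps to the last table row; the sketch then signals a discard against a threshold of 0.5 only when both utilization and moving average are rising. Whether one joint table or several selected tables is preferable would depend on memory layout and lookup cost in the actual hardware, which the claims leave open.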
PCT/US2000/019006 1999-07-13 2000-07-13 Apparatus and method to minimize congestion in an output queuing switch WO2001005107A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US14343199P 1999-07-13 1999-07-13
US60/143,431 1999-07-13

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU60914/00A AU6091400A (en) 1999-07-13 2000-07-13 Apparatus and method to minimize congestion in an output queuing switch

Publications (1)

Publication Number Publication Date
WO2001005107A1 (en) 2001-01-18

Family

ID=22504047

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/019006 WO2001005107A1 (en) 1999-07-13 2000-07-13 Apparatus and method to minimize congestion in an output queuing switch

Country Status (2)

Country Link
AU (1) AU6091400A (en)
WO (1) WO2001005107A1 (en)

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6389479B1 (en) 1997-10-14 2002-05-14 Alacritech, Inc. Intelligent network interface device and system for accelerated communication
WO2002058322A1 (en) * 2001-01-18 2002-07-25 International Business Machines Corporation Congestion management in computer networks
US6427171B1 (en) 1997-10-14 2002-07-30 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US6427173B1 (en) 1997-10-14 2002-07-30 Alacritech, Inc. Intelligent network interfaced device and system for accelerated communication
US6434620B1 (en) 1998-08-27 2002-08-13 Alacritech, Inc. TCP/IP offload network interface device
WO2003028288A2 (en) * 2001-09-27 2003-04-03 Hyperchip Inc. Method and system for congestion avoidance in packet switching devices
US6591302B2 (en) 1997-10-14 2003-07-08 Alacritech, Inc. Fast-path apparatus for receiving data corresponding to a TCP connection
US6658480B2 (en) 1997-10-14 2003-12-02 Alacritech, Inc. Intelligent network interface system and method for accelerated protocol processing
EP1374498A1 (en) * 2001-03-06 2004-01-02 Pluris, Inc. An improved system for fabric packet control
US6687758B2 (en) 2001-03-07 2004-02-03 Alacritech, Inc. Port aggregation for network connections that are offloaded to network interface devices
US6697868B2 (en) 2000-02-28 2004-02-24 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
EP1417498A1 (en) * 2001-07-26 2004-05-12 International Business Machines Corporation Guarding against a "denial-of-service"
US6751665B2 (en) 2002-10-18 2004-06-15 Alacritech, Inc. Providing window updates from a computer to a network interface device
US6757746B2 (en) 1997-10-14 2004-06-29 Alacritech, Inc. Obtaining a destination address so that a network interface device can write network data without headers directly into host memory
US6807581B1 (en) 2000-09-29 2004-10-19 Alacritech, Inc. Intelligent network storage interface system
EP1478141A2 (en) * 2003-03-13 2004-11-17 Alcatel Improved determination of average queue depth for RED (Random Early Discard)
US6965941B2 (en) 1997-10-14 2005-11-15 Alacritech, Inc. Transmit fast-path processing on TCP/IP offload network interface device
EP1626544A1 (en) * 2003-03-13 2006-02-15 Alcatel Improvement in average queue depth calculation for use in random early packet discard (red) algorithms
US7042898B2 (en) 1997-10-14 2006-05-09 Alacritech, Inc. Reducing delays associated with inserting a checksum into a network message
WO2006072876A1 (en) * 2005-01-06 2006-07-13 Telefonaktiebolaget Lm Ericsson (Publ) Method of controlling packet flow
US7237036B2 (en) 1997-10-14 2007-06-26 Alacritech, Inc. Fast-path apparatus for receiving data corresponding a TCP connection
US7284070B2 (en) 1997-10-14 2007-10-16 Alacritech, Inc. TCP offload network interface device
US8019901B2 (en) 2000-09-29 2011-09-13 Alacritech, Inc. Intelligent network storage interface system
US8893159B1 (en) 2008-04-01 2014-11-18 Alacritech, Inc. Accelerating data transfer in a virtual computer system with tightly coupled TCP connections
US9055104B2 (en) 2002-04-22 2015-06-09 Alacritech, Inc. Freeing transmit memory on a network interface device prior to receiving an acknowledgment that transmit data has been received by a remote device
US9306793B1 (en) 2008-10-22 2016-04-05 Alacritech, Inc. TCP offload device that batches session layer headers to reduce interrupts as well as CPU copies
US9413788B1 (en) 2008-07-31 2016-08-09 Alacritech, Inc. TCP offload send optimization

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5748901A (en) * 1996-05-21 1998-05-05 Ramot University Authority Ltd. Flow control algorithm for high speed networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FLOYD S ET AL: "RANDOM EARLY DETECTION GATEWAYS FOR CONGESTION AVOIDANCE", IEEE / ACM TRANSACTIONS ON NETWORKING,US,IEEE INC. NEW YORK, vol. 1, no. 4, 1 August 1993 (1993-08-01), pages 397 - 413, XP000415363, ISSN: 1063-6692 *

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6757746B2 (en) 1997-10-14 2004-06-29 Alacritech, Inc. Obtaining a destination address so that a network interface device can write network data without headers directly into host memory
US7237036B2 (en) 1997-10-14 2007-06-26 Alacritech, Inc. Fast-path apparatus for receiving data corresponding a TCP connection
US6427171B1 (en) 1997-10-14 2002-07-30 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US6427173B1 (en) 1997-10-14 2002-07-30 Alacritech, Inc. Intelligent network interfaced device and system for accelerated communication
US7284070B2 (en) 1997-10-14 2007-10-16 Alacritech, Inc. TCP offload network interface device
US6389479B1 (en) 1997-10-14 2002-05-14 Alacritech, Inc. Intelligent network interface device and system for accelerated communication
US6591302B2 (en) 1997-10-14 2003-07-08 Alacritech, Inc. Fast-path apparatus for receiving data corresponding to a TCP connection
US9009223B2 (en) 1997-10-14 2015-04-14 Alacritech, Inc. Method and apparatus for processing received network packets on a network interface for a computer
US6658480B2 (en) 1997-10-14 2003-12-02 Alacritech, Inc. Intelligent network interface system and method for accelerated protocol processing
US8131880B2 (en) 1997-10-14 2012-03-06 Alacritech, Inc. Intelligent network interface device and system for accelerated communication
US8856379B2 (en) 1997-10-14 2014-10-07 A-Tech Llc Intelligent network interface system and method for protocol processing
US6965941B2 (en) 1997-10-14 2005-11-15 Alacritech, Inc. Transmit fast-path processing on TCP/IP offload network interface device
US8782199B2 (en) 1997-10-14 2014-07-15 A-Tech Llc Parsing a packet header
US8447803B2 (en) 1997-10-14 2013-05-21 Alacritech, Inc. Method and apparatus for distributing network traffic processing on a multiprocessor computer
US7042898B2 (en) 1997-10-14 2006-05-09 Alacritech, Inc. Reducing delays associated with inserting a checksum into a network message
US6434620B1 (en) 1998-08-27 2002-08-13 Alacritech, Inc. TCP/IP offload network interface device
US6697868B2 (en) 2000-02-28 2004-02-24 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US8019901B2 (en) 2000-09-29 2011-09-13 Alacritech, Inc. Intelligent network storage interface system
US6807581B1 (en) 2000-09-29 2004-10-19 Alacritech, Inc. Intelligent network storage interface system
WO2002058322A1 (en) * 2001-01-18 2002-07-25 International Business Machines Corporation Congestion management in computer networks
US6870811B2 (en) 2001-01-18 2005-03-22 International Business Machines Corporation Quality of service functions implemented in input interface circuit interface devices in computer network hardware
EP1374498A1 (en) * 2001-03-06 2004-01-02 Pluris, Inc. An improved system for fabric packet control
EP1374498A4 (en) * 2001-03-06 2006-09-27 Pluris Inc An improved system for fabric packet control
US6938092B2 (en) 2001-03-07 2005-08-30 Alacritech, Inc. TCP offload device that load balances and fails-over between aggregated ports having different MAC addresses
US6687758B2 (en) 2001-03-07 2004-02-03 Alacritech, Inc. Port aggregation for network connections that are offloaded to network interface devices
EP1417498A1 (en) * 2001-07-26 2004-05-12 International Business Machines Corporation Guarding against a "denial-of-service"
EP1417498A4 (en) * 2001-07-26 2005-03-02 Ibm Guarding against a "denial-of-service"
US8125902B2 (en) 2001-09-27 2012-02-28 Hyperchip Inc. Method and system for congestion avoidance in packet switching devices
WO2003028288A3 (en) * 2001-09-27 2003-07-31 Robin Boivin Method and system for congestion avoidance in packet switching devices
WO2003028288A2 (en) * 2001-09-27 2003-04-03 Hyperchip Inc. Method and system for congestion avoidance in packet switching devices
US9055104B2 (en) 2002-04-22 2015-06-09 Alacritech, Inc. Freeing transmit memory on a network interface device prior to receiving an acknowledgment that transmit data has been received by a remote device
US6751665B2 (en) 2002-10-18 2004-06-15 Alacritech, Inc. Providing window updates from a computer to a network interface device
EP1478141A2 (en) * 2003-03-13 2004-11-17 Alcatel Improved determination of average queue depth for RED (Random Early Discard)
EP1478141A3 (en) * 2003-03-13 2006-03-15 Alcatel Improved determination of average queue depth for RED (Random Early Discard)
EP1626544A1 (en) * 2003-03-13 2006-02-15 Alcatel Improvement in average queue depth calculation for use in random early packet discard (red) algorithms
WO2006072876A1 (en) * 2005-01-06 2006-07-13 Telefonaktiebolaget Lm Ericsson (Publ) Method of controlling packet flow
US7301907B2 (en) 2005-01-06 2007-11-27 Telefonktiebolaget Lm Ericsson (Publ) Method of controlling packet flow
US8893159B1 (en) 2008-04-01 2014-11-18 Alacritech, Inc. Accelerating data transfer in a virtual computer system with tightly coupled TCP connections
US9667729B1 (en) 2008-07-31 2017-05-30 Alacritech, Inc. TCP offload send optimization
US9413788B1 (en) 2008-07-31 2016-08-09 Alacritech, Inc. TCP offload send optimization
US9306793B1 (en) 2008-10-22 2016-04-05 Alacritech, Inc. TCP offload device that batches session layer headers to reduce interrupts as well as CPU copies

Also Published As

Publication number Publication date
AU6091400A (en) 2001-01-30

Similar Documents

Publication Publication Date Title
JP3548720B2 (en) How to improve system performance in a data network by the queue management based on monitoring of the entry-side speed
US6388992B2 (en) Flow control technique for traffic in a high speed packet switching network
US6498781B1 (en) Self-tuning link aggregation system
US7421273B2 (en) Managing priority queues and escalation in wireless communication systems
US7616572B2 (en) Call admission control/session management based on N source to destination severity levels for IP networks
US5491687A (en) Method and system in a local area network switch for dynamically changing operating modes
US7616573B2 (en) Fair WRED for TCP UDP traffic mix
JP4394203B2 (en) How to share the available bandwidth, the processor to implement such a method as well as a scheduler, intelligent buffer and communication system comprising such a processor,
CN101917330B (en) Methods and apparatus for defining a flow control signal
EP1352495B1 (en) Congestion management in computer networks
US6667985B1 (en) Communication switch including input bandwidth throttling to reduce output congestion
US6011776A (en) Dynamic bandwidth estimation and adaptation in high speed packet switching networks
Sun et al. PD-RED: to improve the performance of RED
US6377546B1 (en) Rate guarantees through buffer management
US8036117B1 (en) Dequeuing and congestion control systems and methods
JP4852194B2 (en) System and method for coordinating message flows in a digital data network
US6333917B1 (en) Method and apparatus for red (random early detection) and enhancements.
RU2277300C2 (en) Method and device for monitoring and prediction of multi-thread load
US20040233912A1 (en) Method and systems for controlling ATM traffic using bandwidth allocation technology
US6754215B1 (en) Packet scheduling device
US6144636A (en) Packet switch and congestion notification method
EP1225734B1 (en) Method, system and computer program product for bandwidth allocation in a multiple access system
US7031313B2 (en) Packet transfer apparatus with the function of flow detection and flow management method
US20040240389A1 (en) Method and apparatus for load sharing and overload control for packet media gateways under control of a single media gateway controller
EP0932282A2 (en) TCP admission control

Legal Events

Date Code Title Description
AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP