US20190007323A1 - Phantom queue link level load balancing system, method and device - Google Patents
- Publication number
- US20190007323A1 (application US16/126,644)
- Authority
- US
- United States
- Prior art keywords
- output ports
- packet
- output
- selection logic
- link selection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
- H04L47/125—Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
- H04L43/0882—Utilisation of link capacity
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/16—Threshold monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/11—Identifying congestion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
- H04L47/122—Avoiding congestion; Recovering from congestion by diverting traffic away from congested entities
Definitions
- the present invention relates to load balancing. More particularly, the present invention relates to using phantom queues to balance the load on a system.
- Ethernet switches typically have static balance algorithms that are limited because they do not respond to load in the network. Thus, current switches are unable to dynamically adjust to different loads and as a result are not as efficient as possible.
- a data processing system comprises a phantom queue for each of a plurality of output ports each associated with an output link for outputting data.
- the phantom queues receive/monitor traffic on the respective ports and/or the associated links such that the congestion or traffic volume on the output ports/links is able to be determined by a congestion mapper coupled with the phantom queues. Based on the determined congestion level on each of the ports/links, the congestion mapper selects one or more non-congested or less congested ports/links as the destination of one or more packets.
- a link selection logic element then processes the packets according to the selected path or multi-path thereby reducing congestion on the system.
- the system provides the advantage of providing dynamic load balancing for non-TCP traffic by leveraging the phantom queue fill levels.
- a first aspect is directed to a dynamic load balancing system on a processing microchip.
- the system comprises a multipath interface group comprising a plurality of paths for outputting packets from the microchip, wherein each of the paths is coupled to an output port of the microchip, link selection logic that receives input traffic packets and, for each of the packets, selects which one of the output ports the packet is to be output from onto the path coupled to the one of the output ports and a plurality of shapers, wherein each of the shapers is coupled to one of the output ports and limits the outputting of the packets out of the output port such that a rate of data output by the output port is below a data output rate threshold, and further wherein each of the shapers indicate a congestion level of the output port coupled to the shaper that corresponds to a quantity of the packets sent to the output port by the link selection logic during a time period, wherein for each packet the link selection logic determines whether the packet has a transmission control protocol (TCP) format, and if the packet does not have the TCP format, the link selection logic selects the one of the output ports based
- the link selection logic selects the one of the output ports independent of the congestion level of each of the output ports. In some embodiments, if the packet does have the TCP format, the link selection logic selects the one of the output ports based on a hash of the packet and an equal or weighted cost multipath selection protocol. In some embodiments, if the packet does not have the TCP format, the link selection logic selects the one of the output ports according to a metric except the link selection logic will remove all of the output ports whose congestion level is above a congestion threshold value from a pool of the output ports that are able to be selected according to the metric.
- the link selection logic selects the one of the output ports according to the metric while including all of the output ports in the pool despite the congestion level of all of the output ports.
- the metric is one of the group consisting of round robin, random, and smallest congestion level first.
- each of the shapers comprise a phantom queue and a credit generator that deposits a credit into the phantom queue at a predefined credit deposit rate, wherein as each packet is output by one of the output ports, the shaper coupled to the one of the output ports removes one or more credits from the phantom queue of the shaper such that a total value of the removed credits is equal to or greater than a size of the packet.
- the link selection logic determines the congestion level of each of the output ports based on the number of credits within the phantom queue coupled to the output port.
- the system further comprises a plurality of packet queues each coupled with one of the output ports such that the queues receive and queue each of the packets to be output by the output ports.
- the link selection logic determines the congestion level of each of the output ports based on a number of the packets within the packet queue associated with the output port.
- the system further comprises one or more additional shapers, wherein each of the additional shapers is coupled to one of the output ports and monitors the outputting of the packets out of the output port to determine whether the rate of data output by the output port is above an additional data output rate threshold, and further wherein each of the additional shapers indicates an additional congestion level of the output port coupled to the additional shaper that corresponds to the quantity of the packets sent to the output port by the link selection logic during the time period.
- a second aspect is directed to a link selection logic element stored on a non-transitory computer-readable medium of a processing microchip having a plurality of shapers and a multipath interface group including a plurality of paths for outputting packets from the microchip, wherein each of the paths is coupled to an output port of the microchip and each of the shapers is coupled to one of the output ports and monitors the outputting of the packets out of the output port to determine whether a rate of data output by the output port is above a data output rate threshold, the link selection logic element configured to receive a plurality of traffic packets input by the microchip, for each of the traffic packets, determine whether the packet has a transmission control protocol (TCP) format and for each of the traffic packets, select which one of the output ports the packet is to be output from onto the path coupled to the one of the output ports, wherein each of the shapers indicate a congestion level of the output port coupled to the shaper that corresponds to a quantity of the packets sent to the output port by the link selection
- the link selection logic selects the one of the output ports independent of the congestion level of each of the output ports. In some embodiments, if the packet does have the TCP format, the link selection logic selects the one of the output ports based on a hash of the packet and an equal or weighted cost multipath selection protocol. In some embodiments, if the packet does not have the TCP format, the link selection logic selects the one of the output ports according to a metric except the link selection logic will remove all of the output ports whose congestion level is above a congestion threshold value from a pool of the output ports that are able to be selected according to the metric.
- the link selection logic selects the one of the output ports according to the metric while including all of the output ports in the pool despite the congestion level of all of the output ports.
- the metric is one of the group consisting of round robin, random, and smallest congestion level first.
- each of the shapers comprise a phantom queue and a credit generator that deposits a credit into the phantom queue at a predefined credit deposit rate, wherein as each packet is output by one of the output ports, the shaper coupled to the one of the output ports removes one or more credits from the phantom queue of the shaper such that a total value of the removed credits is equal to or greater than a size of the packet.
- the link selection logic determines the congestion level of each of the output ports based on the number of credits within the phantom queue coupled to the output port.
- the microchip has a plurality of packet queues each coupled with one of the output ports such that the queues receive and queue each of the packets to be output by the output ports.
- the link selection logic determines the congestion level of each of the output ports based on a number of the packets within the packet queue associated with the output port.
- the microchip further comprises one or more additional shapers such that each of the additional shapers is coupled to one of the output ports, wherein each of the additional shapers indicates an additional congestion level of the output port coupled to the additional shaper that corresponds to the quantity of the packets sent to the output port by the link selection logic during the time period, and further wherein if the packet does not have the TCP format, the link selection logic selects the one of the output ports based on the congestion level and the additional congestion levels of each of the output ports.
- a third aspect is directed to a method of dynamic load balancing within a dynamic load balancing system.
- the method comprises receiving a plurality of traffic packets with link selection logic on a processing microchip having a plurality of shapers and a multipath interface group including a plurality of paths for outputting packets from the microchip, wherein each of the paths is coupled to an output port of the microchip and each of the shapers is coupled to one of the output ports and monitors the outputting of the packets out of the output port to determine whether a rate of data output by the output port is above a data output rate threshold, for each of the traffic packets, determining whether the packet has a transmission control protocol (TCP) format with the link selection logic and for each of the traffic packets, selecting which one of the output ports the packet is to be output from onto the path coupled to the one of the output ports with the link selection logic, wherein each of the shapers indicate a congestion level of the output port coupled to the shaper that corresponds to a quantity of the packets sent to
- the method further comprises, if the packet does have the TCP format, selecting the one of the output ports independent of the congestion level of each of the output ports with the link selection logic. In some embodiments, the method further comprises, if the packet does have the TCP format, selecting the one of the output ports based on a hash of the packet and an equal or weighted cost multipath selection protocol with the link selection logic. In some embodiments, the method further comprises, if the packet does not have the TCP format, selecting the one of the output ports according to a metric with the link selection logic wherein the link selection logic removes all of the output ports whose congestion level is above a congestion threshold value from a pool of the output ports that are able to be selected according to the metric.
- the method further comprises, if the packet does not have the TCP format and all of the output ports have a congestion level that is above the congestion threshold value, selecting the one of the output ports according to the metric with the link selection logic while including all of the output ports in the pool despite the congestion level of all of the output ports.
- the metric is one of the group consisting of round robin, random, and smallest congestion level first.
- each of the shapers comprise a phantom queue and a credit generator that deposits a credit into the phantom queue at a predefined credit deposit rate, further comprising as each packet is output by one of the output ports, removing, with the shaper coupled to the one of the output ports, one or more credits from the phantom queue of the shaper such that a total value of the removed credits is equal to or greater than a size of the packet.
- the method further comprises determining the congestion level of each of the output ports with the link selection logic based on the number of credits within the phantom queue coupled to the output port.
- the processing microchip further comprises a plurality of packet queues each coupled with one of the output ports such that the queues receive and queue each of the packets to be output by the output ports.
- the method further comprises determining the congestion level of each of the output ports with the link selection logic based on a number of the packets within the packet queue associated with the output port.
- the processing microchip has one or more additional shapers such that each of the additional shapers is coupled to one of the output ports and monitors the outputting of the packets out of the output port to determine whether the rate of data output by the output port is below an additional data output rate threshold, and further wherein each of the additional shapers indicate an additional congestion level of the output port coupled to the additional shaper that corresponds to the quantity of the packets sent to the output port by the link selection logic during the time period, and further wherein if the packet does not have the TCP format, the selecting of the one of the output ports is based on the congestion level and the additional congestion levels of each of the output ports.
- FIG. 1 illustrates a dynamic load balancing system 100 according to some embodiments.
- FIG. 2 illustrates a method of dynamic load balancing within a dynamic load balancing system according to some embodiments.
- Embodiments are directed to a data processing system that comprises a phantom queue for each of a plurality of output ports each associated with an output link for outputting data.
- the phantom queues receive/monitor traffic on the respective ports and/or the associated links such that the congestion or traffic volume on the output ports/links is able to be determined by a congestion mapper coupled with the phantom queues.
- Based on the determined congestion level on each of the ports/links, the congestion mapper selects one or more non-congested or less congested ports/links as the destination of one or more packets.
- a link selection logic element then processes the packets according to the selected path or multi-path thereby reducing congestion on the system. For example, when a current port/link is determined to be congested, packets are able to be re-routed to one or more of the other links/ports until the current port/link is no longer congested.
- the non-congested ports are selected by masking links to congested ports. In some embodiments, the non-congested ports are determined based on their congestion level value being below a congestion threshold value and the congested ports are determined based on their congestion level being above the congestion threshold value or a different threshold value. In some embodiments, a link/port is determined to be congested if a bucket of the associated phantom queue is empty and/or out of credits for outputting the traffic packets. Alternatively, or in addition, a link/port is determined to be congested based on the queue fill level for the port/link.
- TCP traffic is not enabled for the dynamic load balancing of the system such that the traffic is able to ignore congestion levels and thus is not directed to different ports/links by the congestion mapper regardless of the congestion state.
- non-TCP traffic is enabled for the load balancing of the system such that it is able to be routed to different ports/links based on the congestion levels by the congestion mapper.
- both the TCP and the non-TCP traffic are enabled for the load balancing of the system such that both are able to be routed to different ports/links based on the congestion levels by the congestion mapper.
- the ports/links are able to be selected randomly, in a round robin order, based on the level of congestion (e.g. which has the least current congestion), and/or according to other types of selection priority protocols.
- one or more of the phantom queues are able to be replaced and/or supplemented with a traffic shaper.
- FIG. 1 illustrates a dynamic load balancing system 100 according to some embodiments.
- the dynamic load balancing system 100 is able to be located within and/or stored on one or more processing microchips 102 (e.g. one or more software-defined network microchips, datacenter switches, Ethernet switches).
- the system 100 is able to be located within and/or stored on one or more components of a processing circuit.
- the dynamic load balancing system 100 comprises a plurality of output ports 104 , output paths 106 , shapers 110 , packet queues 112 and link selection logic 114 .
- Although the system 100 comprises two output ports 104 , output paths 106 , shapers 110 and packet queues 112 , more output ports 104 , output paths 106 , shapers 110 and/or packet queues 112 are contemplated. Further, the system 100 is able to comprise more or fewer components. For example, in some embodiments the packet queues 112 are able to be omitted. Additionally, in some embodiments one or more of the output ports 104 are able to each have a plurality of shapers 110 and/or packet queues 112 operably coupled therewith.
- the plurality of output ports 104 are each associated with one of the output paths 106 , which together form a multipath interface 108 . Thus, packets that exit the chip 102 via one of the output ports 104 will travel on the output path 106 associated with the output port 104 .
- one or more of the output ports 104 are physical ports of the microchip 102 .
- one or more of the output ports 104 are able to be virtual ports of the microchip 102 .
- Each one of the shapers 110 is operably coupled to a different one of the packet queues 112 and/or a different one of the output ports 104 such that each link or path 106 is associated with a set of one queue 112 , one shaper 110 and one port 104 .
- a group of a plurality of shapers 110 is able to be operably coupled to each of the packet queues 112 and/or the output ports 104 such that each link or path 106 is associated with a set of one queue 112 , a group of shapers 110 and one port 104 .
- the packet queue 112 coupled to that port 104 is able to receive and buffer packets that are to be sent to the port 104 until the port 104 is ready to output them.
- the queue 112 is able to receive packets as routed by the link selection logic 114 and buffer the packets according to a first in first out (FIFO) or other buffering system until they are ready to be received by the corresponding output port 104 .
- the shaper 110 coupled to that port 104 is able to shape or control the packet rate (e.g. number of packets/time) of the packet traffic traveling out of the output port 104 .
- the shaper 110 is able to comprise a credit generator 110 a and a phantom queue 110 b, wherein the credit generator 110 a fills the phantom queue 110 b with credits at a predetermined credit rate and the shaper 110 must remove one of the credits each time the shaper 110 permits a number of packets having a size equal to or less than a value of the credit or credits to be output through the corresponding output port 104 .
- For example, if the value of each credit is 256 bytes, the shaper 110 must remove one credit before permitting the output of one or more packets whose sizes together equal 256 bytes (i.e. the value of the credit).
- If a packet is larger than the value of one credit but no larger than the value of two credits, the shaper 110 must wait for at least two credits to accumulate within the phantom queue 110 b before permitting the packet to be output and removing two of the credits. Consequently, the shaper 110 is able to limit the maximum output rate of the packets out of the output port 104 because if there are no credits remaining in the phantom queue 110 b (because they have all previously been removed and the next credit has yet to be deposited by the credit generator 110 a ) the shaper 110 will prevent any further packets from being output until a new credit is available.
- the phantom queue 110 b is able to fill up with extra credits (that cannot be used because there are no packets to output) until the phantom queue 110 b is completely full.
- the fill level of each of the phantom queues 110 b is able to indicate a congestion level of the associated ports 104 , wherein the fuller the phantom queue 110 b the lower the congestion level of the port 104 and vice versa.
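- The credit mechanism of the shaper 110 described above can be illustrated with a minimal sketch. The class name, the field names, the capacity, and the normalized congestion scale are assumptions made for illustration and do not appear in the description; only the 256-byte credit value is taken from the example above.

```python
from dataclasses import dataclass


@dataclass
class PhantomQueueShaper:
    """Credit bucket for one output port (hypothetical names and units)."""
    capacity: int            # maximum number of credits the phantom queue holds
    credit_bytes: int = 256  # value of one credit in bytes (example value from the text)
    credits: int = 0         # current fill level of the phantom queue

    def deposit(self, n: int = 1) -> None:
        """Credit generator tick: add credits, saturating once the queue is full."""
        self.credits = min(self.capacity, self.credits + n)

    def credits_needed(self, packet_bytes: int) -> int:
        """Fewest credits whose total value is at least the packet size."""
        return -(-packet_bytes // self.credit_bytes)  # ceiling division

    def try_send(self, packet_bytes: int) -> bool:
        """Remove credits and permit the packet, or hold it if credits are short."""
        need = self.credits_needed(packet_bytes)
        if need > self.credits:
            return False  # the packet is delayed, not dropped
        self.credits -= need
        return True

    def congestion_level(self) -> float:
        """Assumed scale: 0.0 when the queue is full of credits, 1.0 when empty."""
        return 1.0 - self.credits / self.capacity


shaper = PhantomQueueShaper(capacity=4)
shaper.deposit(1)
first = shaper.try_send(300)   # False: 300 bytes needs two credits, one is available
shaper.deposit(1)
second = shaper.try_send(300)  # True: two credits are removed
```

- In this sketch a fuller phantom queue maps to a lower congestion level, matching the relationship stated above; a passive shaper would report the same level without ever holding a packet.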
- the shapers 110 are able to be passive in that they do not enforce restricting traffic or packet transmission to the shaper rate, rather they only passively monitor the rate of the packet traffic to detect when a congestion level is reached and then signal that information to the selection logic. Further, in some embodiments wherein one or more groups of shapers 110 are each coupled to different single output ports 104 , each shaper 110 of the groups is able to have a credit generator 110 a that generates credits at a different rate and/or of a different size than the other credit generators 110 a of the other shapers 110 in the group. As a result, the different shapers 110 will each have different phantom queue fill levels (i.e.
- multi-level congestion indications (e.g. one for each shaper 110 in the group) are able to be provided to the link selection logic 114 for each port 104 coupled with one of the groups of shapers 110 .
- the link selection logic 114 is coupled with or is provided access to input traffic packets 116 and each path 106 including the associated port 104 , shaper 110 and packet queue 112 .
- the selection logic 114 is able to input or access traffic in the form of packets that enter the system 100 and phantom queue vectors from the shapers 110 indicating the current number of credits (e.g. a congestion level) within each of the phantom queues 110 b, and further able to determine which of the paths 106 and/or ports 104 each of the packets are output from by the system 100 .
- the link selection logic 114 is able to use a TCP selection metric to select one of the ports 104 from which to output the input packet determined to be a TCP or TCP format packet.
- the TCP metric is able to be a weighted or equal cost multipath metric that selects a port 104 based on a hash or other representation of the TCP packets in order to attempt to maintain the order of the sequence of the TCP packets.
- the TCP metric is able to be other types of selection metrics that prioritize maintaining the sequence of the TCP packets.
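- One possible form of the hash-based TCP selection metric is sketched below. The use of CRC32, the choice of flow fields, and the function names are assumptions for illustration; the description only requires a hash or other representation of the packet combined with an equal or weighted cost multipath selection.

```python
import zlib
from typing import Sequence


def flow_key(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Build a byte string identifying a TCP flow (hypothetical field choice)."""
    return f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()


def select_port_by_hash(key: bytes, weights: Sequence[int]) -> int:
    """Map a flow to a port index; weights[i] is the relative share of port i.

    Equal weights give equal cost multipath; unequal weights give weighted
    cost multipath. Congestion levels are deliberately not consulted.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one port must have a positive weight")
    slot = zlib.crc32(key) % total  # same flow always lands in the same slot
    for port, weight in enumerate(weights):
        if slot < weight:
            return port
        slot -= weight
    raise AssertionError("unreachable: slot is always below the weight total")


key = flow_key("10.0.0.1", "10.0.0.2", 40000, 443)
port = select_port_by_hash(key, [1, 1, 1, 1])
```

- Because the port depends only on the flow key and the weights, every packet of a flow follows the same path, which is what keeps the TCP sequence in order.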
- the link selection logic 114 is able to use a non-TCP selection metric and the phantom queue vectors to select one of the ports 104 from which to output the input packet. Specifically, the link selection logic 114 is able to input or review the latest phantom queue vector and remove any of the ports 104 whose vector value (or congestion level or phantom queue 110 b fill level) indicates a level of congestion that exceeds a predetermined congestion threshold from the pool of ports 104 that are able to be selected by the non-TCP selection metric.
- the link selection logic 114 is able to select the one of the ports 104 from which to output the input packet based on the non-TCP selection metric.
- the TCP selection metric is able to be used based on the remaining pool of ports 104 .
- the non-TCP selection metric is able to be the port 104 whose vector value indicates the lowest level of congestion.
- the port 104 with the lowest congestion level is able to be determined based on which port 104 has the least number of shapers whose congestion level is above the threshold.
- the number of shapers 110 of each of the groups of shapers 110 that indicate a congestion level above the threshold is able to be used by the selection logic 114 to determine which port 104 to select and/or which ports 104 to remove from the pool of selectable ports 104 .
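- Where a group of shapers 110 is coupled to each port 104 , the comparison described above can be sketched as a count of shapers above the threshold per port. The function names, the dictionary layout, the normalized congestion values, and the tie-break on the lowest port number are assumptions for illustration.

```python
from typing import Dict, List


def congested_shaper_count(levels: List[float], threshold: float) -> int:
    """Number of shapers in a group whose congestion level is above the threshold."""
    return sum(1 for level in levels if level > threshold)


def least_congested_port(levels_by_port: Dict[int, List[float]],
                         threshold: float) -> int:
    """Pick the port with the fewest shapers above the threshold.

    Ties go to the lowest port number (an assumption; the text does not say).
    """
    return min(sorted(levels_by_port),
               key=lambda p: congested_shaper_count(levels_by_port[p], threshold))


# Three ports, each with a group of two shapers running at different credit rates.
levels = {0: [0.9, 0.8], 1: [0.9, 0.2], 2: [0.1, 0.95]}
chosen = least_congested_port(levels, threshold=0.5)  # port 1
```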
- the non-TCP selection metric is able to be a random, round robin or other schedule of selecting one of the pool of ports 104 .
- the link selection logic 114 is able to add all of the ports 104 back into the pool (despite their congestion levels) and based on this full pool of the ports 104 select the one of the ports 104 from which to output the input packet based on the non-TCP selection metric.
- the TCP selection metric is able to be used in such a case based on the full pool of ports 104 .
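- The pool handling described above, including the fallback to the full pool when every port exceeds the threshold, can be sketched as follows. All names and the normalized congestion scale are assumptions for illustration, and the round robin shown is a simple counter over whatever pool is currently eligible rather than any particular hardware scheduler.

```python
import random
from typing import Callable, Dict, List

Metric = Callable[[List[int], Dict[int, float]], int]


def eligible_pool(congestion: Dict[int, float], threshold: float) -> List[int]:
    """Drop ports whose congestion exceeds the threshold; restore all if none remain."""
    pool = sorted(p for p, level in congestion.items() if level <= threshold)
    return pool if pool else sorted(congestion)


def smallest_congestion_first(pool: List[int], congestion: Dict[int, float]) -> int:
    return min(pool, key=lambda p: congestion[p])


def random_pick(pool: List[int], congestion: Dict[int, float]) -> int:
    return random.choice(pool)


class RoundRobin:
    """Cycle through the eligible pool with a running counter."""

    def __init__(self) -> None:
        self._next = 0

    def __call__(self, pool: List[int], congestion: Dict[int, float]) -> int:
        port = pool[self._next % len(pool)]
        self._next += 1
        return port


def select_non_tcp_port(congestion: Dict[int, float], threshold: float,
                        metric: Metric) -> int:
    """Apply the chosen non-TCP metric to the pool of selectable ports."""
    return metric(eligible_pool(congestion, threshold), congestion)


vector = {0: 0.9, 1: 0.3, 2: 0.6}  # hypothetical phantom queue vector
best = select_non_tcp_port(vector, 0.7, smallest_congestion_first)  # port 1
```

- Restoring the full pool when it would otherwise be empty is what keeps packet flow from halting when all ports are congested.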
- the system provides the advantage of considering phantom queue 110 b indications of congestion levels to dynamically balance output port 104 packet loads for non-TCP traffic while disregarding phantom queue 110 b indications of congestion levels when distributing TCP traffic (e.g. statically balancing output port 104 packet loads for TCP traffic).
- each shaper 110 is subject to the same credit generation rate (e.g. congestion threshold). Alternatively, one or more of the shapers 110 are able to be subject to different credit generation rates (e.g. congestion thresholds).
- the link selection logic 114 is able to comprise a first component 114 a that receives the phantom queue vectors and performs the link selection for the non-TCP traffic and a second component 114 b that performs the link selection for the TCP traffic. Alternatively, the first and second components 114 a, 114 b are able to be combined as a single component 114 .
- the determination whether the traffic is TCP or non-TCP is able to be omitted and instead all traffic is able to be subject to the non-TCP selection metric as if it were all non-TCP traffic as described above.
- the link selection logic is able to determine and consider the current level of fullness of packets of the associated packet queue 112 .
- the packet queue fullness level is able to be compared to the packet queue fullness threshold wherein a port 104 is removed from the pool only when both the packet queue fullness and the phantom queue thresholds have been exceeded, when at least one of the packet queue fullness and the phantom queue thresholds have been exceeded, or solely based on when the packet queue fullness threshold has been exceeded.
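- The three alternatives above for combining the packet queue fullness threshold with the phantom queue threshold can be written as a small policy function. The enum and function names are assumptions for illustration only.

```python
from enum import Enum


class RemovalPolicy(Enum):
    BOTH = "both"                            # both thresholds must be exceeded
    EITHER = "either"                        # at least one threshold exceeded
    PACKET_QUEUE_ONLY = "packet_queue_only"  # only packet queue fullness counts


def should_remove_port(phantom_exceeded: bool, packet_queue_exceeded: bool,
                       policy: RemovalPolicy) -> bool:
    """Decide whether a port leaves the pool of selectable ports."""
    if policy is RemovalPolicy.BOTH:
        return phantom_exceeded and packet_queue_exceeded
    if policy is RemovalPolicy.EITHER:
        return phantom_exceeded or packet_queue_exceeded
    return packet_queue_exceeded


# Phantom queue threshold exceeded, packet queue fullness still below its threshold.
kept_under_both = not should_remove_port(True, False, RemovalPolicy.BOTH)
```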
- other factors such as quantized congestion notification methods are able to be used to determine when to remove ports 104 from the pool of ports.
- FIG. 2 illustrates a method of dynamic load balancing within a dynamic load balancing system 100 according to some embodiments.
- the link selection logic 114 accesses or receives a plurality of traffic packets at the step 202 . Then, for each of the traffic packets, the selection logic 114 determines whether the packet has a TCP format at the step 204 .
- the link selection logic 114 selects which one of the output ports 104 the packet is to be output from onto the path 106 coupled to the one of the output ports 104 , wherein if the packet does not have the TCP format, the link selection logic 114 selects the one of the output ports 104 based on the congestion level of each of the output ports 104 at the step 206 . Specifically, the link selection logic 114 is able to determine the congestion level of each of the output ports 104 based on the number of credits within the phantom queue 110 b coupled to the output port 104 .
- the link selection logic 114 is able to determine the congestion level of each of the output ports 104 based on a number of the packets within the packet queue 112 associated with the output port 104 .
- the method is able to provide the advantage of dynamically load balancing the outputting of the non-TCP traffic based on port congestion level. If instead the packet does have the TCP format, the link selection logic 114 is able to select the one of the output ports 104 independent of the congestion level of each of the output ports 104 .
- the system 100 is able to recognize the preference for keeping TCP traffic in sequence and thus does not apply the dynamic load balancing to its port selection for TCP traffic.
- the method further provides the advantage of distinguishing between traffic types and applying different port selection metrics based on the traffic type/format.
- the selecting the one of the output ports 104 is able to be based on a hash of the packet and an equal or weighted cost multipath selection protocol. If the packet does not have the TCP format, the selecting the one of the output ports 104 is able to be according to a non-TCP metric, wherein the link selection logic 114 removes all of the output ports 104 whose congestion level is above a congestion threshold value from a pool of the output ports 104 that are able to be selected according to the non-TCP metric.
- the non-TCP metric is one of the group consisting of round robin, random, and smallest congestion level first.
- the link selection logic 114 selects the one of the output ports 104 according to the metric while including all of the output ports 104 in the pool despite the congestion level of all of the output ports 104 .
- the dynamic load balancing system provides the advantage of distinguishing between traffic types and applying different port selection metrics based on the traffic type/format. Further, the system provides the advantage of considering phantom queue indications of congestion levels to dynamically balance output port packet loads for non-TCP traffic while disregarding phantom queue indications of congestion levels when distributing TCP traffic (e.g. statically balancing output port packet loads for TCP traffic). Moreover, the system provides the advantage of ensuring the packet flow is not halted in the case that all the ports are above the congestion threshold. Therefore, the dynamic load balancing system described herein has numerous advantages.
- shapers 110 do not drop any packets in order to control the output rate of a port 104 . Instead, shapers 110 only delay the packets to ensure the maximum output rate is not exceeded.
Description
- This application claims priority under 35 U.S.C. § 119(e) of the co-pending U.S. provisional patent application Ser. No. 62/043,331, filed Aug. 28, 2014, and titled “PHANTOM QUEUE LINK LEVEL LOAD BALANCING SYSTEM, METHOD AND DEVICE,” which is hereby incorporated by reference.
- The present invention relates to load balancing. More particularly, the present invention relates to using phantom queues to balance the load on a system.
- Load balancing has become increasingly important as data centers look to adopt solutions to minimize congestion and/or packet loss and application jitter. Ethernet switches typically have static balance algorithms that are limited because they do not respond to load in the network. Thus, current switches are unable to adjust dynamically to different loads and, as a result, are not as efficient as possible.
- A data processing system comprises a phantom queue for each of a plurality of output ports, each associated with an output link for outputting data. The phantom queues receive/monitor traffic on the respective ports and/or the associated links such that the congestion or traffic volume on the output ports/links is able to be determined by a congestion mapper coupled with the phantom queues. Based on the determined congestion level on each of the ports/links, the congestion mapper selects one or more non-congested or less congested ports/links as the destination of one or more packets. A link selection logic element then processes the packets according to the selected path or multi-path, thereby reducing congestion on the system. As a result, the system provides dynamic load balancing for non-TCP traffic by leveraging the phantom queue fill levels. A first aspect is directed to a dynamic load balancing system on a processing microchip.
- The system comprises a multipath interface group comprising a plurality of paths for outputting packets from the microchip, wherein each of the paths is coupled to an output port of the microchip, link selection logic that receives input traffic packets and, for each of the packets, selects which one of the output ports the packet is to be output from onto the path coupled to the one of the output ports and a plurality of shapers, wherein each of the shapers is coupled to one of the output ports and limits the outputting of the packets out of the output port such that a rate of data output by the output port is below a data output rate threshold, and further wherein each of the shapers indicate a congestion level of the output port coupled to the shaper that corresponds to a quantity of the packets sent to the output port by the link selection logic during a time period, wherein for each packet the link selection logic determines whether the packet has a transmission control protocol (TCP) format, and if the packet does not have the TCP format, the link selection logic selects the one of the output ports based on the congestion level of each of the output ports. In some embodiments, if the packet does have the TCP format, the link selection logic selects the one of the output ports independent of the congestion level of each of the output ports. In some embodiments, if the packet does have the TCP format, the link selection logic selects the one of the output ports based on a hash of the packet and an equal or weighted cost multipath selection protocol. In some embodiments, if the packet does not have the TCP format, the link selection logic selects the one of the output ports according to a metric except the link selection logic will remove all of the output ports whose congestion level is above a congestion threshold value from a pool of the output ports that are able to be selected according to the metric. 
In some embodiments, if the packet does not have the TCP format and all of the output ports have a congestion level that is above the congestion threshold value, the link selection logic selects the one of the output ports according to the metric while including all of the output ports in the pool despite the congestion level of all of the output ports. In some embodiments, the metric is one of the group consisting of round robin, random, and smallest congestion level first. In some embodiments, each of the shapers comprise a phantom queue and a credit generator that deposits a credit into the phantom queue at a predefined credit deposit rate, wherein as each packet is output by one of the output ports, the shaper coupled to the one of the output ports removes one or more credits from the phantom queue of the shaper such that a total value of the removed credits is equal to or greater than a size of the packet. In some embodiments, the link selection logic determines the congestion level of each of the output ports based on the number of credits within the phantom queue coupled to the output port. In some embodiments, the system further comprises a plurality of packet queues each coupled with one of the output ports such that the queues receive and queue each of the packets to be output by the output ports. In some embodiments, the link selection logic determines the congestion level of each of the output ports based on a number of the packets within the packet queue associated with the output port. 
In some embodiments, the system further comprises one or more additional shapers, wherein each of the additional shapers is coupled to one of the output ports and monitors the outputting of the packets out of the output port to determine whether the rate of data output by the output port is above an additional data output rate threshold, and further wherein each of the additional shapers indicates an additional congestion level of the output port coupled to the additional shaper that corresponds to the quantity of the packets sent to the output port by the link selection logic during the time period.
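The credit mechanism recited above (a credit generator filling a phantom queue, with credits removed as packets are output) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than the claimed implementation: the class name `PhantomQueueShaper`, the parameters `credit_bytes`, `capacity` and `deposit_rate`, the choice to start with a full queue, and the mapping of fill level onto a 0.0 to 1.0 congestion value are all hypothetical.

```python
import math


class PhantomQueueShaper:
    """Hypothetical credit-based shaper for one output port (illustrative only)."""

    def __init__(self, credit_bytes=256, capacity=8, deposit_rate=1.0):
        self.credit_bytes = credit_bytes    # value of one credit, in bytes
        self.capacity = capacity            # most credits the phantom queue can hold
        self.deposit_rate = deposit_rate    # credits deposited per unit of time
        self.credits = capacity             # assumption: the queue starts full (idle port)
        self._fraction = 0.0                # partial credit carried between ticks

    def tick(self, elapsed=1.0):
        """Credit generator: deposit credits for the elapsed time, up to capacity."""
        self._fraction += elapsed * self.deposit_rate
        whole = int(self._fraction)
        self._fraction -= whole
        self.credits = min(self.capacity, self.credits + whole)

    def credits_needed(self, packet_bytes):
        """Fewest credits whose total value is equal to or greater than the packet size."""
        return max(1, math.ceil(packet_bytes / self.credit_bytes))

    def try_send(self, packet_bytes):
        """Remove credits and return True if the packet may be output now."""
        needed = self.credits_needed(packet_bytes)
        if needed > self.credits:
            return False                    # the packet is delayed, not dropped
        self.credits -= needed
        return True

    def congestion_level(self):
        """A fuller phantom queue means lower congestion (0.0 idle, 1.0 out of credits)."""
        return 1.0 - self.credits / self.capacity
```

With 256-byte credits, `credits_needed(300)` is two, so a 300-byte packet is held until two credits are available. A packet larger than the whole queue's credit value would never be released by this sketch; handling that case is left out.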
- A second aspect is directed to a link selection logic element stored on a non-transitory computer-readable medium of a processing microchip having a plurality of shapers and a multipath interface group including a plurality of paths for outputting packets from the microchip, wherein each of the paths is coupled to an output port of the microchip and each of the shapers is coupled to one of the output ports and monitors the outputting of the packets out of the output port to determine whether a rate of data output by the output port is above a data output rate threshold, the link selection logic element configured to receive a plurality of traffic packets input by the microchip, for each of the traffic packets, determine whether the packet has a transmission control protocol (TCP) format and for each of the traffic packets, select which one of the output ports the packet is to be output from onto the path coupled to the one of the output ports, wherein each of the shapers indicate a congestion level of the output port coupled to the shaper that corresponds to a quantity of the packets sent to the output port by the link selection logic during a time period, and further wherein if the packet does not have the TCP format, the link selection logic selects the one of the output ports based on the congestion level of each of the output ports. In some embodiments, if the packet does have the TCP format, the link selection logic selects the one of the output ports independent of the congestion level of each of the output ports. In some embodiments, if the packet does have the TCP format, the link selection logic selects the one of the output ports based on a hash of the packet and an equal or weighted cost multipath selection protocol. 
In some embodiments, if the packet does not have the TCP format, the link selection logic selects the one of the output ports according to a metric except the link selection logic will remove all of the output ports whose congestion level is above a congestion threshold value from a pool of the output ports that are able to be selected according to the metric. In some embodiments, if the packet does not have the TCP format and all of the output ports have a congestion level that is above the congestion threshold value, the link selection logic selects the one of the output ports according to the metric while including all of the output ports in the pool despite the congestion level of all of the output ports. In some embodiments, the metric is one of the group consisting of round robin, random, and smallest congestion level first. In some embodiments, each of the shapers comprise a phantom queue and a credit generator that deposits a credit into the phantom queue at a predefined credit deposit rate, wherein as each packet is output by one of the output ports, the shaper coupled to the one of the output ports removes one or more credits from the phantom queue of the shaper such that a total value of the removed credits is equal to or greater than a size of the packet. In some embodiments, the link selection logic determines the congestion level of each of the output ports based on the number of credits within the phantom queue coupled to the output port. In some embodiments, the microchip has a plurality of packet queues each coupled with one of the output ports such that the queues receive and queue each of the packets to be output by the output ports. In some embodiments, the link selection logic determines the congestion level of each of the output ports based on a number of the packets within the packet queue associated with the output port. 
In some embodiments, the microchip further comprises one or more additional shapers such that each of the additional shapers is coupled to one of the output ports, wherein each of the additional shapers indicates an additional congestion level of the output port coupled to the additional shaper that corresponds to the quantity of the packets sent to the output port by the link selection logic during the time period, and further wherein if the packet does not have the TCP format, the link selection logic selects the one of the output ports based on the congestion level and the additional congestion levels of each of the output ports.
- A third aspect is directed to a method of dynamic load balancing within a dynamic load balancing system. The method comprises receiving a plurality of traffic packets with link selection logic on a processing microchip having a plurality of shapers and a multipath interface group including a plurality of paths for outputting packets from the microchip, wherein each of the paths is coupled to an output port of the microchip and each of the shapers is coupled to one of the output ports and monitors the outputting of the packets out of the output port to determine whether a rate of data output by the output port is above a data output rate threshold, for each of the traffic packets, determining whether the packet has a transmission control protocol (TCP) format with the link selection logic and for each of the traffic packets, selecting which one of the output ports the packet is to be output from onto the path coupled to the one of the output ports with the link selection logic, wherein each of the shapers indicate a congestion level of the output port coupled to the shaper that corresponds to a quantity of the packets sent to the output port by the link selection logic during a time period, and further wherein if the packet does not have the TCP format, the link selection logic selects the one of the output ports based on the congestion level of each of the output ports. In some embodiments, the method further comprises, if the packet does have the TCP format, selecting the one of the output ports independent of the congestion level of each of the output ports with the link selection logic. In some embodiments, the method further comprises, if the packet does have the TCP format, selecting the one of the output ports based on a hash of the packet and an equal or weighted cost multipath selection protocol with the link selection logic. 
In some embodiments, the method further comprises, if the packet does not have the TCP format, selecting the one of the output ports according to a metric with the link selection logic wherein the link selection logic removes all of the output ports whose congestion level is above a congestion threshold value from a pool of the output ports that are able to be selected according to the metric. In some embodiments, the method further comprises, if the packet does not have the TCP format and all of the output ports have a congestion level that is above the congestion threshold value, selecting the one of the output ports according to the metric with the link selection logic while including all of the output ports in the pool despite the congestion level of all of the output ports. In some embodiments, the metric is one of the group consisting of round robin, random, and smallest congestion level first. In some embodiments, each of the shapers comprise a phantom queue and a credit generator that deposits a credit into the phantom queue at a predefined credit deposit rate, further comprising as each packet is output by one of the output ports, removing, with the shaper coupled to the one of the output ports, one or more credits from the phantom queue of the shaper such that a total value of the removed credits is equal to or greater than a size of the packet. In some embodiments, the method further comprises determining the congestion level of each of the output ports with the link selection logic based on the number of credits within the phantom queue coupled to the output port. In some embodiments, the processing microchip further comprises a plurality of packet queues each coupled with one of the output ports such that the queues receive and queue each of the packets to be output by the output ports. 
In some embodiments, the method further comprises determining the congestion level of each of the output ports with the link selection logic based on a number of the packets within the packet queue associated with the output port. In some embodiments, the processing microchip has one or more additional shapers such that each of the additional shapers is coupled to one of the output ports and monitors the outputting of the packets out of the output port to determine whether the rate of data output by the output port is below an additional data output rate threshold, and further wherein each of the additional shapers indicate an additional congestion level of the output port coupled to the additional shaper that corresponds to the quantity of the packets sent to the output port by the link selection logic during the time period, and further wherein if the packet does not have the TCP format, the selecting of the one of the output ports is based on the congestion level and the additional congestion levels of each of the output ports.
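Where several shapers are coupled to one port, each shaper yields its own congestion level. One way to combine them, sketched below in Python, is to count how many of a port's shapers report a level above the threshold and to prefer the port with the fewest. The function names, the dictionary layout and the tie-breaking rule are hypothetical and chosen only for illustration.

```python
def shapers_over_threshold(levels, threshold):
    """Count the congestion levels of one port that exceed the threshold."""
    return sum(1 for level in levels if level > threshold)


def least_congested_port(levels_by_port, threshold):
    """Return the port with the fewest shapers above the threshold.

    levels_by_port maps a port id to the list of congestion levels reported by
    the shapers coupled to that port. Ties go to the lowest port id, which is
    an arbitrary choice made for this sketch.
    """
    return min(
        sorted(levels_by_port),
        key=lambda port: shapers_over_threshold(levels_by_port[port], threshold),
    )
```

The same count could instead be used to decide which ports to remove from the selectable pool; the sketch shows only the selection case.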
- FIG. 1 illustrates a dynamic load balancing system 100 according to some embodiments.
- FIG. 2 illustrates a method of dynamic load balancing within a dynamic load balancing system according to some embodiments.
- In the following description, numerous details are set forth for purposes of explanation. However, one of ordinary skill in the art will realize that the invention can be practiced without the use of these specific details. Thus, the present invention is not intended to be limited to the embodiments shown but is to be accorded the widest scope consistent with the principles and features described herein.
- Embodiments are directed to a data processing system that comprises a phantom queue for each of a plurality of output ports, each associated with an output link for outputting data. The phantom queues receive/monitor traffic on the respective ports and/or the associated links such that the congestion or traffic volume on the output ports/links is able to be determined by a congestion mapper coupled with the phantom queues. Based on the determined congestion level on each of the ports/links, the congestion mapper selects one or more non-congested or less congested ports/links as the destination of one or more packets. A link selection logic element then processes the packets according to the selected path or multi-path, thereby reducing congestion on the system. For example, when a current port/link is determined to be congested, packets are able to be re-routed to one or more of the other links/ports until the current port/link is no longer congested.
- In some embodiments, the non-congested ports are selected by masking links to congested ports. In some embodiments, the non-congested ports are determined based on their congestion level value being below a congestion threshold value and the congested ports are determined based on their congestion level being above the congestion threshold value or a different threshold value. In some embodiments, a link/port is determined to be congested if a bucket of the associated phantom queue is empty and/or out of credits for outputting the traffic packets. Alternatively, or in addition, a link/port is determined to be congested based on the queue fill level for the port/link. In some embodiments, TCP traffic is not enabled for the dynamic load balancing of the system such that the traffic is able to ignore congestion levels and thus is not directed to different ports/links by the congestion mapper regardless of the congestion state. In some embodiments, non-TCP traffic is enabled for the load balancing of the system such that it is able to be routed to different ports/links based on the congestion levels by the congestion mapper. Alternatively, both the TCP and the non-TCP traffic are enabled for the load balancing of the system such that both are able to be routed to different ports/links based on the congestion levels by the congestion mapper. In some embodiments, if selection of one of a plurality of non-congested ports is required, the ports/links are able to be selected randomly, in a round robin order, based on the level of congestion (e.g. which has the least current congestion), and/or according to other types of selection priority protocols. In some embodiments, one or more of the phantom queues are able to be replaced and/or supplemented with a traffic shaper. As a result, the system provides the advantage of considering phantom queue indications of congestion levels to dynamically balance output port packet loads for non-TCP traffic while disregarding phantom queue indications of congestion levels when distributing TCP traffic (e.g. statically balancing output port packet loads for TCP traffic).
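The masking of congested ports and the selection among the remaining ports can be sketched in Python as follows. This is a minimal illustration under assumptions, not the patented logic itself: the function names, the dictionary of per-port congestion values, the metric labels `"least"`, `"random"` and `"round_robin"`, and the one-element list used as round robin state are all hypothetical. The fallback to the full pool when every port is above the threshold follows the embodiment in which all ports are included despite their congestion levels.

```python
import random


def selectable_pool(congestion_by_port, threshold):
    """Mask ports above the threshold; if every port is masked, use them all."""
    pool = [p for p, level in sorted(congestion_by_port.items()) if level <= threshold]
    return pool if pool else sorted(congestion_by_port)


def select_non_tcp_port(congestion_by_port, threshold, metric="least",
                        rr_state=None, rng=random):
    """Pick an output port for a non-TCP packet from the unmasked pool."""
    pool = selectable_pool(congestion_by_port, threshold)
    if metric == "least":
        # Smallest congestion level first.
        return min(pool, key=lambda p: congestion_by_port[p])
    if metric == "random":
        return rng.choice(pool)
    if metric == "round_robin":
        # rr_state is a one-element list holding a running packet counter.
        port = pool[rr_state[0] % len(pool)]
        rr_state[0] += 1
        return port
    raise ValueError(f"unknown metric: {metric}")
```

For example, with congestion values `{0: 0.9, 1: 0.3, 2: 0.4}` and a threshold of 0.5, port 0 is masked and the `"least"` metric picks port 1.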
- FIG. 1 illustrates a dynamic load balancing system 100 according to some embodiments. As shown in FIG. 1, the dynamic load balancing system 100 is able to be located within and/or stored on one or more processing microchips 102 (e.g. one or more software-defined network microchips, datacenter switch, Ethernet switch). Alternatively, the system 100 is able to be located within and/or stored on one or more components of a processing circuit. The dynamic load balancing system 100 comprises a plurality of output ports 104, output paths 106, shapers 110, packet queues 112 and link selection logic 114. Although as shown in FIG. 1, the system 100 comprises two output ports 104, output paths 106, shapers 110 and packet queues 112, more output ports 104, output paths 106, shapers 110 and/or packet queues 112 are contemplated. Further, the system 100 is able to comprise more or fewer components. For example, in some embodiments the packet queues 112 are able to be omitted. Additionally, in some embodiments one or more of the output ports 104 are able to each have a plurality of shapers 110 and/or packet queues 112 operably coupled therewith.
- The plurality of output ports 104 are each associated with one of the output paths 106, which together form a multipath interface 108. Thus, packets that exit the chip 102 via one of the output ports 104 will travel on the output path 106 associated with the output port 104. In some embodiments, one or more of the output ports 104 are physical ports of the microchip 102. Alternatively, one or more of the output ports 104 are able to be virtual ports of the microchip 102. Each one of the shapers 110 is operably coupled to a different one of the packet queues 112 and/or a different one of the output ports 104 such that each link or path 106 is associated with a set of one queue 112, one shaper 110 and one port 104. Alternatively, as described above, a group of a plurality of shapers 110 is able to be operably coupled to each of the packet queues 112 and/or the output ports 104 such that each link or path 106 is associated with a set of one queue 112, a group of shapers 110 and one port 104. As a result, for each of the output ports 104, the packet queue 112 coupled to that port 104 is able to receive and buffer packets that are to be sent to the port 104 until the port 104 is ready to output them. For example, the queue 112 is able to receive packets as routed by the link selection logic 114 and buffer the packets according to a first in first out (FIFO) or other buffering system until they are ready to be received by the corresponding output port 104. Also for each of the output ports 104, the shaper 110 coupled to that port 104 is able to shape or control the packet rate (e.g. number of packets/time) of the packet traffic traveling out of the output port 104. In particular, the shaper 110 is able to comprise a credit generator 110a and a phantom queue 110b, wherein the credit generator 110a fills the phantom queue 110b with credits at a predetermined credit rate and the shaper 110 must remove one of the credits each time the shaper 110 permits a number of packets having a size equal to or less than a value of the credit or credits to be output through the corresponding output port 104. For example, if each credit is worth 256 bytes, the shaper 110 must remove one credit before permitting one or more packets whose sizes together equal 256 bytes (i.e. the value of the credit). Correspondingly, if each credit is worth 256 bytes and the packet to be transmitted has a size of 300 bytes, the shaper 110 must wait for at least two credits to accumulate within the queue 110b before permitting the packet to be output and removing two of the at least two credits. Consequently, the shaper 110 is able to limit the maximum output rate of the packets out of the output port 104 because if there are no credits remaining in the phantom queue 110b (because they all have previously been removed and the next credit has yet to be deposited by the credit generator 110a) the shaper 110 will prevent any further packets from being output until a new credit is available. On the other hand, if fewer packets are being selected for output via the port 104 (and therefore input by the packet queue 112) than the value of the number of credits being deposited, the phantom queue 110b is able to fill up with extra credits (that cannot be used because there are no packets to output) until the phantom queue 110b is completely full. In this manner, the fill level of each of the phantom queues 110b is able to indicate a congestion level of the associated port 104, wherein the fuller the phantom queue 110b the lower the congestion level of the port 104 and vice versa.
- In some embodiments, the shapers 110 are able to be passive in that they do not enforce restricting traffic or packet transmission to the shaper rate; rather, they only passively monitor the rate of the packet traffic to detect when a congestion level is reached and then signal that information to the selection logic. Further, in some embodiments wherein one or more groups of shapers 110 are each coupled to different single output ports 104, each shaper 110 of the groups is able to have a credit generator 110a that generates credits at a different rate and/or of a different size than the other credit generators 110a of the other shapers 110 in the group. As a result, the different shapers 110 will each have different phantom queue fill levels (i.e. indicate different congestion levels) based on the rates that credits are produced by the separate credit generators 110a in comparison with the rate that packets are being output via the associated output port 104. Thus, in such embodiments, multi-level congestion indications (e.g. one for each shaper 110 in the group) are able to be provided to the link selection logic 114 for each port 104 coupled with one of the groups of shapers 110.
- The link selection logic 114 is coupled with or is provided access to input traffic packets 116 and each path 106 including the associated port 104, shaper 110 and packet queue 112. As a result, the selection logic 114 is able to input or access traffic in the form of packets that enter the system 100 and phantom queue vectors from the shapers 110 indicating the current number of credits (e.g. a congestion level) within each of the phantom queues 110b, and is further able to determine which of the paths 106 and/or ports 104 each of the packets is output from by the system 100. In particular, upon determining whether an input packet is a TCP or non-TCP format packet, the link selection logic 114 is able to use a TCP selection metric to select one of the ports 104 from which to output the input packet determined to be a TCP format packet. For example, the TCP metric is able to be a weighted or equal cost multipath metric that selects a port 104 based on a hash or other representation of the TCP packets in order to attempt to maintain the order of the sequence of the TCP packets. Alternatively, the TCP metric is able to be other types of selection metrics that prioritize maintaining the sequence of the TCP packets.
- In contrast, if the input packet is determined to be a non-TCP format packet, the link selection logic 114 is able to use a non-TCP selection metric and the phantom queue vectors to select one of the ports 104 from which to output the input packet. Specifically, the link selection logic 114 is able to input or review the latest phantom queue vector and remove any of the ports 104 whose vector value (or congestion level or phantom queue 110b fill level) indicates a level of congestion that exceeds a predetermined congestion threshold from the pool of ports 104 that are able to be selected by the non-TCP selection metric. Then, based on this remaining pool of the ports 104, the link selection logic 114 is able to select the one of the ports 104 from which to output the input packet based on the non-TCP selection metric. Alternatively, the TCP selection metric is able to be used based on the remaining pool of ports 104. As a result, heavily congested ports 104 are prohibited from selection by the selection logic 114 until their congestion level falls back below the threshold, thereby dynamically balancing the traffic load on the ports 104 for the non-TCP traffic.
- In some embodiments, the non-TCP selection metric is able to be the port 104 whose vector value indicates the lowest level of congestion. In particular, in the case wherein a group of shapers 110 produces a plurality of congestion levels for each of the ports 104, the port 104 with the lowest congestion level is able to be determined based on which port 104 has the least number of shapers whose congestion level is above the threshold. In other words, in such embodiments the number of shapers 110 of each of the groups of shapers 110 that indicate a congestion level above the threshold is able to be used by the selection logic 114 to determine which port 104 to select and/or which ports 104 to remove from the pool of selectable ports 104. Alternatively, the non-TCP selection metric is able to be a random, round robin or other schedule of selecting one of the pool of ports 104.
- In the case where, based on the vector values, the congestion levels of all of the ports 104 of the multipath interface 108 exceed the congestion threshold, the link selection logic 114 is able to add all of the ports 104 back into the pool (despite their congestion levels) and based on this full pool of the ports 104 select the one of the ports 104 from which to output the input packet based on the non-TCP selection metric. Alternatively, the TCP selection metric is able to be used in such a case based on the full pool of ports 104. Thus, in any case the system provides the advantage of considering phantom queue 110b indications of congestion levels to dynamically balance output port 104 packet loads for non-TCP traffic while disregarding phantom queue 110b indications of congestion levels when distributing TCP traffic (e.g. statically balancing output port 104 packet loads for TCP traffic). In some embodiments, each shaper 110 is subject to the same credit generation rate (e.g. congestion threshold). Alternatively, one or more of the shapers 110 are able to be subject to different credit generation rates (e.g. congestion thresholds). In some embodiments, as shown in FIG. 1, the link selection logic 114 is able to comprise a first component 114a that receives the phantom queue vectors and performs the link selection for the non-TCP traffic and a second component 114b that performs the link selection for the TCP traffic. Alternatively, the first and second components are able to be combined into a single component 114. In some embodiments, the determination whether the traffic is TCP or non-TCP is able to be omitted and instead all traffic is able to be subject to the non-TCP selection metric as if it were all non-TCP traffic as described above.
- In some embodiments, other factors are able to be considered for non-TCP traffic before removing ports 104 from the pool of ports 104 from which a packet is output. For example, in addition to or in lieu of whether the congestion threshold is exceeded based on the phantom queue, the link selection logic is able to determine and consider the current level of fullness of packets of the associated packet queue 112. In particular, the packet queue fullness level is able to be compared to the packet queue fullness threshold wherein a port 104 is removed from the pool only when both the packet queue fullness and the phantom queue thresholds have been exceeded, when at least one of the packet queue fullness and the phantom queue thresholds has been exceeded, or solely based on when the packet queue fullness threshold has been exceeded. Alternatively or in addition, other factors such as quantized congestion notification methods are able to be used to determine when to remove ports 104 from the pool of ports.
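The three ways of combining the phantom queue threshold with the packet queue fullness threshold described above can be written as one small predicate. The Python sketch below is illustrative only; the function name `should_remove_port`, its parameters and the mode labels `"both"`, `"either"` and `"queue"` are hypothetical names introduced here.

```python
def should_remove_port(phantom_level, queue_fill, phantom_threshold,
                       queue_threshold, mode="both"):
    """Decide whether a port leaves the selectable pool.

    mode "both":   both thresholds must be exceeded
    mode "either": at least one threshold must be exceeded
    mode "queue":  only the packet queue fullness threshold is considered
    """
    phantom_over = phantom_level > phantom_threshold
    queue_over = queue_fill > queue_threshold
    if mode == "both":
        return phantom_over and queue_over
    if mode == "either":
        return phantom_over or queue_over
    if mode == "queue":
        return queue_over
    raise ValueError(f"unknown mode: {mode}")
```

The `"both"` mode is the most conservative of the three, since a port with a drained phantom queue but a nearly empty packet queue stays selectable.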
FIG. 2 illustrates a method of dynamic load balancing within a dynamic load balancing system 100 according to some embodiments. As shown in FIG. 2, the link selection logic 114 accesses or receives a plurality of traffic packets at the step 202. Then, for each of the traffic packets, the link selection logic 114 determines whether the packet has a TCP format at the step 204. Accordingly, for each of the traffic packets, the link selection logic 114 selects which one of the output ports 104 the packet is to be output from onto the path 106 coupled to the one of the output ports 104, wherein if the packet does not have the TCP format, the link selection logic 114 selects the one of the output ports 104 based on the congestion level of each of the output ports 104 at the step 206. Specifically, the link selection logic 114 is able to determine the congestion level of each of the output ports 104 based on the number of credits within the phantom queue 110b coupled to the output port 104. Alternatively or in addition, the link selection logic 114 is able to determine the congestion level of each of the output ports 104 based on a number of the packets within the packet queue 112 associated with the output port 104. As a result, the method is able to provide the advantage of dynamically load balancing the outputting of the non-TCP traffic based on port congestion level. If instead the packet does have the TCP format, the link selection logic 114 is able to select the one of the output ports 104 independent of the congestion level of each of the output ports 104. In other words, unlike for non-TCP traffic, the system 100 is able to recognize the preference for keeping TCP traffic in sequence and thus does not apply the dynamic load balancing to its port selection for TCP traffic. As a result, the method further provides the advantage of distinguishing between traffic types and applying different port selection metrics based on the traffic type/format.
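The flow of steps 202 to 206 can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the `Packet` type, the `flow_key` field and the `select_port` function are hypothetical, the TCP check uses the IP protocol number (6), and a CRC32 hash stands in for whatever hash an actual device would use. Non-TCP packets here simply go to the least congested port.

```python
import zlib
from dataclasses import dataclass

IPPROTO_TCP = 6  # IANA-assigned IP protocol number for TCP


@dataclass(frozen=True)
class Packet:
    protocol: int     # IP protocol number carried by the packet
    flow_key: bytes   # hypothetical: bytes identifying the flow (e.g. addresses and ports)


def select_port(packet, congestion):
    """Pick an output port id.

    `congestion` maps port id -> current congestion level (assumed to be
    derived from phantom queue credits and/or packet queue occupancy).
    """
    ports = sorted(congestion)
    if packet.protocol == IPPROTO_TCP:
        # Static choice: depends only on the flow, never on congestion,
        # so packets of one TCP flow stay on one port and remain in sequence.
        return ports[zlib.crc32(packet.flow_key) % len(ports)]
    # Dynamic choice for non-TCP traffic: least congested port first.
    return min(ports, key=lambda p: congestion[p])
```

Calling `select_port` twice for the same TCP packet with different congestion maps returns the same port, whereas a non-TCP packet follows the congestion levels.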
If the packet does have the TCP format, the selecting of the one of the output ports 104 is able to be based on a hash of the packet and an equal or weighted cost multipath selection protocol. If the packet does not have the TCP format, the selecting of the one of the output ports 104 is able to be according to a non-TCP metric, wherein the link selection logic 114 removes all of the output ports 104 whose congestion level is above a congestion threshold value from a pool of the output ports 104 that are able to be selected according to the non-TCP metric. In some embodiments, the non-TCP metric is one of the group consisting of round robin, random, and smallest congestion level first. Alternatively, other metrics are able to be used and/or a combination of round robin, random, and smallest congestion level first, wherein the combined metrics are prioritized and implemented according to the priority such that the next metric is used to break ties of the previous metric. Also, in some embodiments if the packet does not have the TCP format and all of the output ports 104 have a congestion level that is above the congestion threshold value, the link selection logic 114 selects the one of the output ports 104 according to the metric while including all of the output ports 104 in the pool despite the congestion level of all of the output ports 104. Thus, the method provides the advantage of ensuring the packet flow is not halted in the case that all the ports 104 are above the congestion threshold.

Accordingly, the dynamic load balancing system provides the advantage of distinguishing between traffic types and applying different port selection metrics based on the traffic type/format. Further, the system provides the advantage of considering phantom queue indications of congestion levels to dynamically balance output port packet loads for non-TCP traffic while disregarding phantom queue indications of congestion levels when distributing TCP traffic (e.g. statically balancing output port packet loads for TCP traffic).
Moreover, the system provides the advantage of ensuring the packet flow is not halted in the case that all the ports are above the congestion threshold. Therefore, the dynamic load balancing system described herein has numerous advantages.
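One possible reading of the non-TCP selection described above, including the fallback to the full pool and a prioritized combination of metrics, is sketched below in Python. The class name `NonTcpSelector` and its fields are hypothetical, and the particular combination shown (smallest congestion level first, with round robin breaking ties) is just one of the combinations the text allows.

```python
class NonTcpSelector:
    """Illustrative non-TCP port selector (not the patented implementation)."""

    def __init__(self, port_ids, threshold):
        self.port_ids = list(port_ids)
        self.threshold = threshold
        self._rr = 0  # round-robin cursor, used only to break ties

    def select(self, congestion):
        """`congestion` maps port id -> current congestion level."""
        # Remove every port whose congestion level is above the threshold.
        pool = [p for p in self.port_ids if congestion[p] <= self.threshold]
        if not pool:
            # All ports are congested: use the full pool so traffic is not halted.
            pool = list(self.port_ids)
        # First metric: smallest congestion level first.
        lowest = min(congestion[p] for p in pool)
        tied = [p for p in pool if congestion[p] == lowest]
        # Second metric: round robin among the ports tied on the first metric.
        choice = tied[self._rr % len(tied)]
        self._rr += 1
        return choice
```

For example, with three ports and a threshold of 5, two successive calls with levels `{0: 2, 1: 2, 2: 9}` alternate between ports 0 and 1, while a call with levels `{0: 9, 1: 8, 2: 7}` falls back to the full pool and returns port 2.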
One of ordinary skill in the art will realize other uses and advantages also exist. While the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. For example, although the system described herein illustrates a single multipath interface 108, a plurality of multipath interfaces 108 are contemplated wherein each packet is assigned to one of the interfaces and is then sent to one of the ports 104 of that interface 108 as described above. As another example, although the different methods described herein describe a particular order of steps, other orders are contemplated as well as the omission of one or more of the steps and/or the addition of one or more new steps. Moreover, although the methods above are described herein separately, one or more of the methods are able to be combined (in whole or part). Thus, one of ordinary skill in the art will understand that the invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims. Additionally, it should be noted that, unlike policers, shapers 110 do not drop any packets in order to control the output rate of a port 104. Instead, shapers 110 only delay the packets to ensure the maximum output rate is not exceeded.
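The difference between a policer and a shaper noted in the description can be illustrated with a generic token (credit) bucket in Python. This is a textbook-style sketch under stated assumptions, not the shapers 110 themselves: `TokenBucket`, `police` and `shape` are hypothetical names, time is passed in explicitly as seconds, and packet sizes are expressed in the same units as the credits.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate: float           # credits generated per second
    burst: float          # maximum number of stored credits
    tokens: float = 0.0   # credits currently available
    last: float = 0.0     # time up to which credits have been accounted for

    def refill(self, now):
        """Add the credits generated since `last`, capped at `burst`."""
        if now > self.last:
            self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
            self.last = now


def police(bucket, now, size):
    """Policer: forward the packet if enough credits exist, otherwise drop it."""
    bucket.refill(now)
    if bucket.tokens >= size:
        bucket.tokens -= size
        return True
    return False  # packet dropped


def shape(bucket, now, size):
    """Shaper: never drops; returns the time at which the packet may be sent."""
    bucket.refill(now)
    # `last` lies in the future when earlier packets are still waiting.
    start = max(now, bucket.last)
    if bucket.tokens >= size:
        bucket.tokens -= size
        return start
    wait = (size - bucket.tokens) / bucket.rate
    bucket.tokens = 0.0
    bucket.last = start + wait  # credits are fully spent up to the release time
    return bucket.last
```

With a full bucket of 100 credits refilled at 100 credits per second, a policer accepts a 100-unit packet and drops a 50-unit packet that arrives at the same instant, whereas a shaper sends the first immediately and schedules the second for 0.5 seconds later.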
Claims (33)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/126,644 US10523567B2 (en) | 2014-08-28 | 2018-09-10 | Phantom queue link level load balancing system, method and device |
US16/694,841 US11095561B2 (en) | 2014-08-28 | 2019-11-25 | Phantom queue link level load balancing system, method and device |
US17/372,286 US11700204B2 (en) | 2014-08-28 | 2021-07-09 | Phantom queue link level load balancing system, method and device |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201462043331P | 2014-08-28 | 2014-08-28 | |
US14/667,568 US9900253B2 (en) | 2014-08-28 | 2015-03-24 | Phantom queue link level load balancing system, method and device |
US15/862,509 US10103993B2 (en) | 2014-08-28 | 2018-01-04 | Phantom queue link level load balancing system, method and device |
US16/126,644 US10523567B2 (en) | 2014-08-28 | 2018-09-10 | Phantom queue link level load balancing system, method and device |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/862,509 Continuation US10103993B2 (en) | 2014-08-28 | 2018-01-04 | Phantom queue link level load balancing system, method and device |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/694,841 Continuation US11095561B2 (en) | 2014-08-28 | 2019-11-25 | Phantom queue link level load balancing system, method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
US20190007323A1 (en) | 2019-01-03 |
US10523567B2 (en) | 2019-12-31 |
Family
ID=55403852
Family Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/667,568 Active 2035-10-23 US9900253B2 (en) | 2014-08-28 | 2015-03-24 | Phantom queue link level load balancing system, method and device |
US15/862,509 Active US10103993B2 (en) | 2014-08-28 | 2018-01-04 | Phantom queue link level load balancing system, method and device |
US16/126,644 Active US10523567B2 (en) | 2014-08-28 | 2018-09-10 | Phantom queue link level load balancing system, method and device |
US16/694,841 Active US11095561B2 (en) | 2014-08-28 | 2019-11-25 | Phantom queue link level load balancing system, method and device |
US17/372,286 Active US11700204B2 (en) | 2014-08-28 | 2021-07-09 | Phantom queue link level load balancing system, method and device |
Family Applications Before (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/667,568 Active 2035-10-23 US9900253B2 (en) | 2014-08-28 | 2015-03-24 | Phantom queue link level load balancing system, method and device |
US15/862,509 Active US10103993B2 (en) | 2014-08-28 | 2018-01-04 | Phantom queue link level load balancing system, method and device |
Family Applications After (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/694,841 Active US11095561B2 (en) | 2014-08-28 | 2019-11-25 | Phantom queue link level load balancing system, method and device |
US17/372,286 Active US11700204B2 (en) | 2014-08-28 | 2021-07-09 | Phantom queue link level load balancing system, method and device |
Country Status (1)
Country | Link |
---|---|
US (5) | US9900253B2 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210288909A1 (en) * | 2016-03-30 | 2021-09-16 | Intel Corporation | Switch, devices and methods for receiving and forwarding ethernet packets |
US11477122B2 (en) * | 2017-09-27 | 2022-10-18 | Intel Corporation | Technologies for selecting non-minimal paths and throttling port speeds to increase throughput in a network |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170048144A1 (en) * | 2015-08-13 | 2017-02-16 | Futurewei Technologies, Inc. | Congestion Avoidance Traffic Steering (CATS) in Datacenter Networks |
US10003537B2 (en) * | 2015-10-01 | 2018-06-19 | Keysight Technologies Singapore (Holding) Pte Ltd | Egress port overload protection for network packet forwarding systems |
CN106803812B (en) * | 2015-11-26 | 2020-12-01 | 华为技术有限公司 | Method and device for realizing load sharing |
CN108111431B (en) * | 2016-11-24 | 2021-09-24 | 腾讯科技(北京)有限公司 | Service data sending method, device, computing equipment and computer readable storage medium |
US11563698B2 (en) * | 2017-11-30 | 2023-01-24 | Telefonaktiebolaget Lm Ericsson (Publ) | Packet value based packet processing |
US20190260657A1 (en) * | 2018-02-21 | 2019-08-22 | Cisco Technology, Inc. | In-band performance loss measurement in ipv6/srv6 software defined networks |
US11184235B2 (en) * | 2018-03-06 | 2021-11-23 | Cisco Technology, Inc. | In-band direct mode performance loss measurement in software defined networks |
US11102127B2 (en) * | 2018-04-22 | 2021-08-24 | Mellanox Technologies Tlv Ltd. | Load balancing among network links using an efficient forwarding scheme |
US10848458B2 (en) | 2018-11-18 | 2020-11-24 | Mellanox Technologies Tlv Ltd. | Switching device with migrated connection table |
US10764315B1 (en) * | 2019-05-08 | 2020-09-01 | Capital One Services, Llc | Virtual private cloud flow log event fingerprinting and aggregation |
WO2021134621A1 (en) * | 2019-12-31 | 2021-07-08 | 华为技术有限公司 | Message scheduling method and apparatus |
US20220321478A1 (en) * | 2022-06-13 | 2022-10-06 | Intel Corporation | Management of port congestion |
Family Cites Families (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB9618137D0 (en) * | 1996-08-30 | 1996-10-09 | Sgs Thomson Microelectronics | Improvements in or relating to an ATM switch |
US7724760B2 (en) | 2001-07-05 | 2010-05-25 | Broadcom Corporation | Method and apparatus for bandwidth guarantee and overload protection in a network switch |
US7207062B2 (en) * | 2001-08-16 | 2007-04-17 | Lucent Technologies Inc | Method and apparatus for protecting web sites from distributed denial-of-service attacks |
US7457297B2 (en) * | 2001-11-16 | 2008-11-25 | Enterasys Networks, Inc. | Methods and apparatus for differentiated services over a packet-based network |
US7330430B2 (en) * | 2002-06-04 | 2008-02-12 | Lucent Technologies Inc. | Packet-based traffic shaping |
US7440573B2 (en) | 2002-10-08 | 2008-10-21 | Broadcom Corporation | Enterprise wireless local area network switching system |
US7796627B2 (en) | 2004-08-12 | 2010-09-14 | Broadcom Corporation | Apparatus and system for coupling and decoupling initiator devices to a network using an arbitrated loop without disrupting the network |
KR100603567B1 (en) | 2004-09-02 | 2006-07-24 | 삼성전자주식회사 | Method and system for quality of service using bandwidth reservation in switch |
US7860006B1 (en) * | 2005-04-27 | 2010-12-28 | Extreme Networks, Inc. | Integrated methods of performing network switch functions |
US7619971B1 (en) * | 2005-05-16 | 2009-11-17 | Extreme Networks, Inc. | Methods, systems, and computer program products for allocating excess bandwidth of an output among network users |
EP2036267B1 (en) * | 2006-06-22 | 2009-10-07 | Xelerated AB | A processor and a method for a processor |
US8417257B2 (en) | 2006-08-22 | 2013-04-09 | Ca, Inc. | Method and system for load balancing traffic in a wireless network |
US8259715B2 (en) | 2007-07-25 | 2012-09-04 | Hewlett-Packard Development Company, L.P. | System and method for traffic load balancing to multiple processors |
US8068416B2 (en) * | 2007-09-20 | 2011-11-29 | At&T Intellectual Property I, L.P. | System and method of communicating a media stream |
US8467294B2 (en) * | 2011-02-11 | 2013-06-18 | Cisco Technology, Inc. | Dynamic load balancing for port groups |
US8930505B2 (en) * | 2011-07-26 | 2015-01-06 | The Boeing Company | Self-configuring mobile router for transferring data to a plurality of output ports based on location and history and method therefor |
US9590820B1 (en) | 2011-09-02 | 2017-03-07 | Juniper Networks, Inc. | Methods and apparatus for improving load balancing in overlay networks |
US9331929B1 (en) * | 2012-03-29 | 2016-05-03 | Juniper Networks, Inc. | Methods and apparatus for randomly distributing traffic in a multi-path switch fabric |
US8995277B2 (en) * | 2012-10-30 | 2015-03-31 | Telefonaktiebolaget L M Ericsson (Publ) | Method for dynamic load balancing of network flows on LAG interfaces |
WO2014108173A1 (en) * | 2013-01-08 | 2014-07-17 | Telefonaktiebolaget L M Ericsson (Publ) | Distributed traffic inspection in a telecommunications network |
US9582440B2 (en) | 2013-02-10 | 2017-02-28 | Mellanox Technologies Ltd. | Credit based low-latency arbitration with data transfer |
US9590914B2 (en) | 2013-11-05 | 2017-03-07 | Cisco Technology, Inc. | Randomized per-packet port channel load balancing |
US9391876B2 (en) * | 2014-03-18 | 2016-07-12 | Telefonaktiebolaget L M Ericsson (Publ) | Better alternate paths for multi homed IS-IS prefixes |
US10708187B2 (en) * | 2014-05-22 | 2020-07-07 | Intel Corporation | Data center congestion management for non-TCP traffic |
Also Published As
Publication number | Publication date |
---|---|
US11095561B2 (en) | 2021-08-17 |
US9900253B2 (en) | 2018-02-20 |
US11700204B2 (en) | 2023-07-11 |
US10523567B2 (en) | 2019-12-31 |
US10103993B2 (en) | 2018-10-16 |
US20200092208A1 (en) | 2020-03-19 |
US20210336885A1 (en) | 2021-10-28 |
US20160065477A1 (en) | 2016-03-03 |
US20180131618A1 (en) | 2018-05-10 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US11700204B2 (en) | Phantom queue link level load balancing system, method and device | |
US11962490B2 (en) | Systems and methods for per traffic class routing | |
US9185047B2 (en) | Hierarchical profiled scheduling and shaping | |
US8520522B1 (en) | Transmit-buffer management for priority-based flow control | |
JP5365415B2 (en) | Packet relay apparatus and congestion control method | |
US9667570B2 (en) | Fabric extra traffic | |
TWI543568B (en) | Reducing headroom | |
US20170272372A1 (en) | Flexible application of congestion control measures | |
US9197570B2 (en) | Congestion control in packet switches | |
US20150236955A1 (en) | Congestion Notification in a Network | |
JP2017063388A (en) | Band control device and band control system | |
KR100546968B1 (en) | Method and system for controlling transmission of packets in computer networks | |
US20160285777A1 (en) | Apparatus to achieve quality of service (qos) without requiring fabric speedup | |
US8625624B1 (en) | Self-adjusting load balancing among multiple fabric ports | |
JP4342395B2 (en) | Packet relay method and apparatus | |
JP2015149537A (en) | Path controller, system, and method | |
US9424088B1 (en) | Multi-level deficit weighted round robin scheduler acting as a flat single scheduler | |
US10742710B2 (en) | Hierarchal maximum information rate enforcement | |
JP5938939B2 (en) | Packet switching apparatus, packet switching method, and bandwidth control program | |
US20150049770A1 (en) | Apparatus and method | |
Fuentes Saez et al. | Network unfairness in dragonfly topologies | |
JP2003023456A (en) | Multiplexer, band controller, program and recording medium | |
JP2003023450A (en) | Rate controller |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: XPLIANT, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: WHITE, MARTIN LESLIE; REEL/FRAME: 046830/0420. Effective date: 20150709. Owner name: CAVIUM NETWORKS LLC, CALIFORNIA. Free format text: MERGER; ASSIGNOR: XPLIANT, INC.; REEL/FRAME: 046830/0537. Effective date: 20150429. Owner name: CAVIUM, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: CAVIUM NETWORKS LLC; REEL/FRAME: 046830/0637. Effective date: 20160308 |
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
AS | Assignment | Owner name: CAVIUM, LLC, CALIFORNIA. Free format text: CHANGE OF NAME; ASSIGNOR: CAVIUM, INC.; REEL/FRAME: 047577/0653. Effective date: 20180924 |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
AS | Assignment | Owner name: CAVIUM INTERNATIONAL, CAYMAN ISLANDS. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: CAVIUM, LLC; REEL/FRAME: 051948/0807. Effective date: 20191231 |
AS | Assignment | Owner name: MARVELL ASIA PTE, LTD., SINGAPORE. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: CAVIUM INTERNATIONAL; REEL/FRAME: 053179/0320. Effective date: 20191231 |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |