US20170111283A1 - CONGESTION CONTROL AND QoS IN NoC BY REGULATING THE INJECTION TRAFFIC - Google Patents

Info

Publication number
US20170111283A1
Authority
US
United States
Prior art keywords
node
allocation
destination
source
agents
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/392,154
Inventor
Sailesh Kumar
Eric Norige
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
NetSpeed Systems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NetSpeed Systems Inc
Priority to US15/392,154
Publication of US20170111283A1
Assigned to INTEL CORPORATION: assignment of assignors interest (see document for details). Assignors: Netspeed Systems, Inc.
Legal status: Abandoned

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/18 End to end
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/30 Flow control; Congestion control in combination with information about buffer occupancy at either end or at transit nodes
    • H04L47/39 Credit based

Definitions

  • Methods and example implementations described herein are generally directed to interconnect architecture, and more specifically, to weight assignment and weighted arbitration of node channels in a Network on Chip (NoC) system interconnect architecture.
  • NoC: Network on Chip
  • Bi-directional rings (as shown in FIG. 1( a )), 2-D (two dimensional) mesh (as shown in FIG. 1( b )) and 2-D Torus (as shown in FIG. 1( c )) are examples of topologies in the related art.
  • Mesh and Torus can also be extended to 2.5-D (two and a half dimensional) or 3-D (three dimensional) organizations.
  • FIG. 1( d ) shows a 3D mesh NoC, where there are three layers of 3×3 2D mesh NoC shown over each other.
  • dimension order routing may not be feasible between certain source and destination nodes, and alternative paths may have to be taken.
  • the alternative paths may not be shortest or minimum turn.
  • Source routing and routing using tables are other routing options used in NoC.
  • Adaptive routing can dynamically change the path taken between two points on the network based on the state of the network. This form of routing may be complex to analyze and implement.
  • NoC interconnects may employ wormhole routing, wherein a large message or packet is broken into small pieces known as flits (also referred to as flow control digits).
  • The first flit is the header flit, which holds information about this packet's route and key message level info along with payload data and sets up the routing behavior for all subsequent flits associated with the message.
  • One or more body flits follow the head flit, containing the remaining payload of data.
  • The final flit is the tail flit, which in addition to containing the last payload also performs some bookkeeping to close the connection for the message.
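The head/body/tail split described above can be sketched as follows. This is a minimal illustration, not a flit format defined by the source: the `Flit` fields, the `packetize` function, the `flit_bytes` parameter, and the `"head_tail"` label for a packet that fits in a single flit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Flit:
    kind: str                      # "head", "body", "tail", or "head_tail" (assumed label)
    route: Optional[Tuple[str, ...]]  # route info, carried only by the first flit (assumed field)
    payload: bytes


def packetize(payload: bytes, flit_bytes: int, route: Tuple[str, ...]) -> list:
    """Split a message into flits; only the first flit carries the route."""
    if flit_bytes <= 0:
        raise ValueError("flit_bytes must be positive")
    # An empty message still produces one (empty) flit so the route is delivered.
    chunks = [payload[i:i + flit_bytes]
              for i in range(0, len(payload), flit_bytes)] or [b""]
    last = len(chunks) - 1
    flits = []
    for i, chunk in enumerate(chunks):
        if last == 0:
            kind = "head_tail"
        elif i == 0:
            kind = "head"
        elif i == last:
            kind = "tail"
        else:
            kind = "body"
        flits.append(Flit(kind, route if i == 0 else None, chunk))
    return flits
```

For instance, an 8-byte message with 3-byte flits yields one head, one body and one tail flit, and only the head flit holds the route.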
  • virtual channels are often implemented.
  • the physical channels are time sliced into a number of independent logical channels called virtual channels (VCs).
  • VCs provide multiple independent paths to route packets, however they are time-multiplexed on the physical channels.
  • a virtual channel holds the state needed to coordinate the handling of the flits of a packet over a channel. At a minimum, this state identifies the output channel of the current node for the next hop of the route and the state of the virtual channel (idle, waiting for resources, or active).
  • the virtual channel may also include pointers to the flits of the packet that are buffered on the current node and the number of flit buffers available on the next node.
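The per-virtual-channel state listed above can be pictured as a small record. The sketch below is only illustrative: the class and field names (`VirtualChannel`, `output_channel`, `buffered_flits`, `next_node_free_buffers`) and the `can_forward` helper are hypothetical, chosen to mirror the items named in the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class VCState(Enum):
    IDLE = "idle"
    WAITING = "waiting for resources"
    ACTIVE = "active"


@dataclass
class VirtualChannel:
    # Minimum state: the VC status and the output channel for the next hop.
    state: VCState = VCState.IDLE
    output_channel: Optional[int] = None
    # Optional state: flits buffered on this node and free flit buffers downstream.
    buffered_flits: list = field(default_factory=list)
    next_node_free_buffers: int = 0

    def can_forward(self) -> bool:
        """A flit can move on only if the VC is active, holds a flit,
        and the next node has at least one free flit buffer."""
        return (self.state is VCState.ACTIVE
                and bool(self.buffered_flits)
                and self.next_node_free_buffers > 0)
```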
  • The term "wormhole" plays on the way messages are transmitted over the channels: the output port at the next router can be so short that received data can be translated in the head flit before the full message arrives. This allows the router to quickly set up the route upon arrival of the head flit and then opt out from the rest of the conversation. Since a message is transmitted flit by flit, the message may occupy several flit buffers along its path at different routers, creating a worm-like image.
  • Each of the five components is connected to a local router node, and the router nodes are connected with each other using point to point channels as shown in FIG. 3 .
  • Each of the channels has a receive bandwidth of the destination component equal to a transmit bandwidth of the source component.
  • FIG. 4 the routers and components are separated for clarity, and the channels that connect components with their local routers are illustrated.
  • At router 42 in FIG. 4 , messages arriving at the left input port (e.g., from router 43 ) and bottom input port (e.g., from source 3 ) will contend for the right output port. If routers implement a uniformly fair arbitration policy to arbitrate between incoming messages at different input ports contending for an output port, then the 50% share of the output port's bandwidth (computed in the above step) will be equally split between the two input ports as shown. Each input port will receive 25% of the destination bandwidth; source 3 will therefore receive a quarter of the destination bandwidth.
  • FIG. 4 illustrates that even though each router employs a uniformly fair arbitration policy wherein the router gives fair share of output port bandwidth among all input port contenders, the four sources receive vastly different shares of the destination bandwidth.
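The arithmetic behind this observation can be reproduced with a short sketch. It assumes, purely for illustration, a linear chain of routers toward the destination in which each router splits its output port bandwidth evenly between one local source and the upstream router; the function name `chain_shares` and this chain layout are assumptions, not a description of FIG. 4 itself.

```python
def chain_shares(num_sources: int) -> list:
    """Fraction of destination bandwidth per source, nearest source first,
    when every router splits its output port 50/50 between its local source
    and the upstream router (the last router serves two sources)."""
    if num_sources < 1:
        raise ValueError("need at least one source")
    shares = []
    remaining = 1.0
    for _ in range(num_sources - 1):
        shares.append(remaining / 2)  # local source gets half of what reaches this router
        remaining /= 2                # the other half goes to everything upstream
    shares.append(remaining)          # farthest source takes what is left
    return shares
```

With four sources this gives 50%, 25%, 12.5% and 12.5%, which matches the 50% and 25% figures quoted for the router nearest the destination and for source 3.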
  • The bandwidth allocated to various source components when they contend for various destinations may vary substantially. This may be undesirable in many applications, wherein fair or equal allocation of various resources among all contenders may be important to achieve a high system performance.
  • weighted allocation is desired, so that the various resource bandwidths are allocated among various contenders in a pre-specified ratio.
  • Rate limiting the sources: Each source contending for a resource destination is allowed to send data at a pre-specified rate based on its fair share. This technique is independent of the state of other sources, whether the other sources are contending for the resource or not. Therefore, based upon the pre-specified rates of sources, rate limiting of the sources can either lead to under-utilization of resource bandwidth, or unfair allocation.
  • Age based arbitration: Every message injected by various components carries timestamp information, which describes the age of the message. Within the NoC interconnect, routers give higher preference to older messages over newer messages, whenever multiple messages contend for an output port. This technique can provide end-to-end uniform fairness, however it is unable to provide weighted fairness. Furthermore, age based arbitration comes at a high implementation cost of additional bits needed to carry the age information and complex circuitry at every router to determine the oldest message.
  • Weight based arbitration: Weights for various channels in a network on chip (NoC) are computed based on the bandwidth requirements of the traffic flows at the channels. Subsequently these weights are used to perform weighted arbitration between channels at each router in the NoC to provide Quality of Service (QoS). Advanced implementations may dynamically adjust the weights by monitoring the activity of flows at the channels to avoid unfair allocations, and perform weighted arbitration using the newly computed channel weights. This is described in U.S. application Ser. No. 13/745,696, herein incorporated by reference in its entirety for all purposes. Using this scheme, an example assignment of weights to various channels of the NoC illustrated in FIG. 4 is shown in FIG. 5 .
  • QoS: Quality of Service
  • aspects of the present application include a method, which may involve providing congestion avoidance and end-to-end flow control and QoS by using explicit notification messages between communicating agents for congestion notification; using the congestion notification information to adjust and regulate the transmission rates at various agents; computing the transmission rates and enforcing them at the agents, or alternatively using various types of end-to-end flow credit based flow control schemes for controlling the resource allocation to various agents.
  • aspects of the present application include a computer readable storage medium storing instructions for executing a process.
  • the process may involve providing congestion avoidance and end-to-end flow control and QoS by using explicit notification messages between communicating agents for congestion notification; using the congestion notification information to adjust and regulate the transmission rates at various agents; computing the transmission rates and enforcing them at the agents, or alternatively using various types of end-to-end flow credit based flow control schemes for controlling the resource allocation to various agents.
  • aspects of the present application include a system or apparatus, which may involve providing congestion avoidance and end-to-end flow control and QoS by using explicit notification messages between communicating agents for congestion notification; using the congestion notification information to adjust and regulate the transmission rates at various agents; computing the transmission rates and enforcing them at the agents, or alternatively using various types of end-to-end flow credit based flow control schemes for controlling the resource allocation to various agents.
  • NoC which may be configured at the NoC level to provide congestion avoidance and end-to-end flow control and QoS by use of explicit notification messages between communicating agents for congestion notification; use the congestion notification information to adjust and regulate the transmission rates at various agents; compute the transmission rates and enforce them at the agents, or alternatively use various types of end-to-end flow credit based flow control schemes to control the resource allocation to various agents.
  • FIGS. 1( a ), 1( b ), 1( c ) and 1( d ) illustrate examples of Bidirectional ring, 2D Mesh, 2D Torus, and 3D Mesh NoC Topologies.
  • FIG. 2 illustrates an example of XY routing in a related art two dimensional mesh.
  • FIG. 3 illustrates an example of a NoC interconnect.
  • FIG. 4 illustrates a NoC interconnect with routers and interconnects separated for clarity and bandwidth received by various channels.
  • FIG. 5 illustrates an example assignment of weights to various NoC channels, in accordance with an example implementation.
  • FIG. 6( a ) illustrates a system with four source agents communicating with a destination agent and the weight and the transmission rates of the source agents, in accordance with an example implementation.
  • FIG. 6( b ) illustrates the resulting rate computed by the rate computation module, total leftover bandwidth and the way leftover bandwidths are assigned to the source agents, in accordance with an example implementation.
  • FIG. 6( c ) illustrates the system with updated transmission rates after the leftover rates were notified to the source agents and source agents updated their transmission rates, in accordance with an example implementation.
  • FIG. 7( a ) illustrates a system with four source agents and one destination agent and an end-to-end credit based flow control scheme using separate buffers at the destination for each source agent, in accordance with an example implementation.
  • FIG. 7( b ) illustrates a system with four source agents and one destination agent and an end-to-end credit based flow control scheme using a shared buffer pool at the destination for all source agents.
  • FIGS. 8( a ), ( b ), ( c ) and ( d ) illustrate the message transmission and credit return protocol in various example implementations.
  • FIG. 9 illustrates a computer/apparatus block diagram upon which the example implementations described herein may be implemented.
  • FIG. 10 illustrates an example Network on Chip (NoC) block diagram, on which example implementations may be implemented.
  • Example implementations of the present application involve regulating the way various agents connected to the NoC inject traffic into the NoC so that network congestion can be avoided, fairness provided, and QoS maintained.
  • Certain implementations of such traffic injection regulation schemes may have been employed in Internet traffic management, but not in NoC interconnects with 2-D, 2.5-D or 3-D mesh or Torus topologies.
  • The "slow start" congestion strategy, in which source agents slowly (i.e. cautiously) increase their transmission rates, has been employed in the Internet to avoid such oscillations.
  • the slow start congestion strategy works only when the transmitting agents are sending steadily and their burstiness is much smaller than the round-trip latency in the network.
  • the agent's traffic behavior may be highly bursty, and maintaining low latency and high resource utilization may be more important. Therefore, such standard techniques may not work very well in an SoC or NoC.
  • Standard congestion control techniques that regulate the traffic injection rate may also cause unfairness in the resource bandwidth allocation among various contenders in the system adversely affecting the QoS.
  • Example implementations described herein are directed to solutions for 2-D, 2.5-D and 3-D NoC interconnects that avoid network congestion by regulating the injection of traffic at various source agents and provide end-to-end uniform-fair and weighted-fair allocation of destination bandwidth among the contending source agents.
  • A number of novel traffic injection regulation designs are described. The example implementations are fully distributed and scale well with the number of agents in the NoC interconnect.
  • To control congestion and regulate the traffic injection, congestion must be detected first.
  • the congestion detection is described below with three example methods, which can be used individually or together with each other in a system.
  • The destination agents which are receiving messages from various sources send an explicit notification back to the corresponding sources indicating whether they are congested or not (i.e. whether a destination is receiving messages at a rate higher than it can process and therefore has a backlog). If the destination is congested then the sources should back off and slow down their transmission rate.
  • the notification may be piggybacked if the destination sends a response message back to the sources, or the notification can be a separate message sent on the same NoC or on a separate set of side band channels. Based on the congestion notification, sources regulate the traffic injection.
  • the sources compute the level of congestion in the system interconnect by monitoring various metrics, such as the round-trip time of the request to response messages if applicable, or the amount of backpressure the source is experiencing from the network when the source attempts to inject a message to a destination. Based on this information, the sources take action locally to avoid the congestion and ensure QoS.
  • the destination agents use buffers or allocate credit for buffer slots to various sources to control the message arrival from various sources. Buffers or credits may be allocated among all contending source agents based on the QoS policy. If a source does not have an allocated buffer or credit for the destination, it may not send to the destination. Buffers or credits for a destination can be pre-allocated among the sources or can be allocated by the destination upon receiving an explicit request from a source.
  • This method may provide end-to-end flow control in the system and congestion avoidance and QoS policy may be enforced with the correct allocation of buffers and credit distribution.
  • the first method uses explicit congestion notification messages.
  • An explicit congestion notification message from a destination agent may contain information such as the current load at the destination agent, the acceptable load that the agent can handle from the source, etc.
  • the source may begin to regulate the injection rate of messages for the destination based on the notification information.
  • the explicit notification information may include the rate at which the sources are allowed to transmit to the destination, in which case sources can regulate the traffic to the destination with this rate.
  • the notification may only indicate the congestion state at the destination to the sources communicating with the destination; the congestion state may be a bit indicating whether there is congestion at the destination or not or can also indicate the amount of congestion. Based on this information, sources may locally determine the transmission rate to the destination.
  • the destination computes the rate at which various sources may transmit messages to the destination. To compute this rate, the destinations may use information such as the average and peak transmission rates of various source agents when communicating with the destination, and the relative weight between them, in addition to the current level of congestion at the destination.
  • Let rp(i, j) and ra(i, j) be the peak and average transmission rate of messages respectively from source agent i to destination agent j.
  • Let w(i, j) be the weight of messages from source agent i to destination agent j. The weight decides how messages from a source are serviced.
  • a work-conserving design may ensure that fair allocation of bandwidth occurs between contending sources and the remaining leftover bandwidth (if any) is distributed between the remaining sources based on their weights to fully use the system bandwidth.
  • The example implementation can be fair as well as work-conserving. Every destination keeps track of the rate at which it is receiving messages. The destination is congested by x% if it is receiving messages at a rate x% higher than the rate at which it can process them. For all source agents that communicate with the destination, the destination tracks whether they are currently transmitting or not. If n source agents are currently transmitting, indicated by the set {active}, then the fair rate at which the source agent s may send messages to destination d is:
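Equation (1) itself does not survive in this text. One weight-proportional form that agrees with the definitions above and with the worked example of FIG. 6 is sketched below; it is an assumption, not the source's formula. The symbol C(d), standing for the receive capacity of destination d, is introduced here for illustration only.

```latex
\[
\mathrm{rate}(s, d) \;=\; C(d)\,\cdot\,
\frac{w(s, d)}{\displaystyle\sum_{i \,\in\, \{\mathrm{active}\}} w(i, d)}
\]
```

Under this form the rates of all active sources sum to C(d), and with weights 10, 20, 30 and 40, C(d) = 100 and all four agents active, the rates are 10, 20, 30 and 40.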
  • This rate in equation (1) is notified by the destination agent d to the sources and they limit their transmission rate to the destination to this value. Notice that some source agents may be sending at a rate lower than their fair share, in which case the destination will receive traffic from the source at a rate lower than it has allocated to the source. To detect this and maintain work-conserving property, the rate at which destination d is receiving messages from each source is tracked. This tracking may be implemented at the destination, with a counter for each source. The destination can then increment the corresponding counter whenever a message arrives from a source, and then track the rate at which various counters are increasing.
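The per-source counters described above might look like the following sketch. The class name `ReceiveRateTracker`, its methods, and the idea of sampling over a fixed `window` are assumptions made for illustration; the source only states that a counter per source is incremented on arrival and its growth rate is tracked.

```python
from collections import Counter


class ReceiveRateTracker:
    """Counts arrivals per source and converts them to rates once per window."""

    def __init__(self):
        self._counts = Counter()

    def on_message(self, source: str) -> None:
        # Called by the destination whenever a message from `source` arrives.
        self._counts[source] += 1

    def sample(self, window: float) -> dict:
        """Return messages per time unit for each source seen in this window,
        then reset the counters for the next window."""
        if window <= 0:
            raise ValueError("window must be positive")
        rates = {src: n / window for src, n in self._counts.items()}
        self._counts.clear()
        return rates
```

Comparing each sampled rate with the rate allocated to that source is one way to decide which sources are "slow", that is, sending below their share.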
  • The adjusted rate of a source agent in the set {active − slow} is now its previously computed rate, rate(s, d), plus its share of the leftover bandwidth, leftover(s, d).
  • The updated transmission rate is notified to the sources in the set {active − slow} with new notification messages.
  • The destination may start to receive enough traffic or the destination may detect that there are no source agents left in the set {active − slow}. In both cases, the maximum utilization state is achieved and no additional notification to adjust the source agents' rates needs to be sent.
  • the destination may get congested when new source agents become active and begin transmitting or existing active agents decide to increase their transmission rates.
  • When a destination agent detects congestion by x% (the message receive rate exceeds the destination processing capacity by x%), the destination agent readjusts the transmission rate of the active sources and sends new notification messages to enforce the adjusted rates at the sources so that the total receive traffic at the destination is reduced by x%.
  • the fair share transmission rate of all active agents is re-computed based on the current transmission rate of all active source agents, and their computed fair share.
  • the source agents that are already sending at rates lower than their new fair share do not need to reduce their transmission rates, and do not need to be notified again. Only those agents which are sending more than their new fair share of bandwidth need to be notified to reduce their rates. Once the source agents are notified to reduce their rates, they will slow down and the congestion should disappear.
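The source does not give the recomputation procedure, so the sketch below swaps in a standard technique, weighted max-min fair allocation, to illustrate the behavior described: sources sending below their share keep their rate, and only sources above their new share are notified. The function names `fair_rates` and `rates_to_notify` are hypothetical.

```python
def fair_rates(capacity: float, weights: dict, current: dict) -> dict:
    """Weighted max-min allocation: a source never gets more than it currently
    sends, and capacity it leaves unused is shared by weight among the rest."""
    alloc = {}
    remaining = capacity
    unsatisfied = set(weights)
    while unsatisfied:
        wsum = sum(weights[s] for s in unsatisfied)
        share = {s: remaining * weights[s] / wsum for s in unsatisfied}
        # Sources already at or below their share keep their current rate.
        satisfied = {s for s in unsatisfied if current[s] <= share[s]}
        if not satisfied:
            alloc.update(share)
            break
        for s in satisfied:
            alloc[s] = current[s]
            remaining -= current[s]
        unsatisfied -= satisfied
    return alloc


def rates_to_notify(capacity: float, weights: dict, current: dict) -> dict:
    """Only sources sending above their new share need a notification."""
    alloc = fair_rates(capacity, weights, current)
    return {s: r for s, r in alloc.items() if current[s] > r}
```

For example, with capacity 100, weights 10, 30 and 40, and current rates 40, 70 and 10 (20% congestion), only the first two sources are notified, with new rates 22.5 and 67.5; the third keeps sending at 10.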
  • the source agents need to be provided with a rate at which they begin transmitting when they become active.
  • An example implementation may use a fixed initial rate at which any given inactive agent begins transmitting, or may begin at their fair share of rate when all agents are active (this value will be proportional to the weight of the agents and cumulative rate will be equal to the destination's receive capacity).
  • an agent may use the initial rates, or the last transmission rate of the agent when it was active last.
  • Other initial rate values which are either higher or lower than the fair share rates may be used to improve the utilization of the system when agents go from inactive to active states.
  • In FIG. 6( a ), there is a destination d to which four source agents, s 1 , s 2 , s 3 and s 4 , are transmitting messages.
  • The weights of the four sources to destination d are 10, 20, 30 and 40 respectively.
  • Destination d's maximum receive rate is 100.
  • the initial rate assigned to the sources is 10, 20, 30, and 40 respectively which is the fair share of the rates when all four agents are active.
  • s 1 , s 3 and s 4 are active.
  • Their initial transmit rates are 10, 30 and 40, however, in this example the agent s 4 is transmitting only at rate of 10.
  • the total receive rate at the destination is 50.
  • the destination will begin allocating additional bandwidth to the active agents.
  • the agents s 1 and s 3 are active ( 600 and 601 ) while agent s 4 is active but slow ( 602 ), i.e. it is transmitting less than its fair share.
  • the leftover bandwidth can be divided among the active but not slow agents, which are s 1 and s 3 .
  • the resulting leftover bandwidth allocation to these two agents is shown in 604 .
  • the leftover bandwidth is allocated based on the weight of the two agents.
  • the leftover bandwidth is added to the current transmit rate of the agents and they are notified of the new transmit rates, 22.5 and 67.5 respectively.
  • the resulting state of the system after the source agents s 1 and s 3 update their transmit rates is shown in FIG. 6( c ) .
  • the source agents store the notified rate from the destinations for rate regulation.
  • The notified rate value is stored and the source ensures that its transmission rate to each destination conforms to the notified rate.
  • An example implementation may use resources or buffers at each destination to accept messages which can be partitioned statically or dynamically based on the source agent's activity and weights.
  • the destination may notify the sources about the number of available buffers it has for the source and the source agent therefore does not send more messages than the available buffers.
  • the buffer allocation can be made dynamic and the notification of buffer availability may be piggybacked on existing sets of messages in the system.
  • An example implementation may also utilize the source agent's average and peak transmission rates to determine the transmission rate by the source agents to the destination.
  • Another example implementation may use a mechanism where the destination agents send the delta or difference between the expected transmission rate of the source agents and their current transmission rates instead of sending transmission rate values. Using this delta, the sources may adjust their transmission rates. Destinations may continue to send the delta notifications until the system is stabilized and the destinations are no longer congested.
  • the notification from destination agents to source agents may only convey to the sources that there is congestion at the destination and optionally by how much, and not the rate at which the sources should transmit or adjust their rates to avoid congestion.
  • the source agents are responsible for computing the rate at which they may transmit to various destinations.
  • The source agents may start with a fixed transmission rate once they become active, which can be determined based on the QoS policy.
  • The destination agents notify the active sources about whether the destination is currently congested or not, and optionally the amount of congestion. This notification is broadcast to all active sources which are currently communicating with the destination.
  • the notification messages may be transmitted whenever the level of congestion at the destination changes, or the set of active agents changes (new agents become active and start talking to the destination, or an active agent stops transmitting to the destination).
  • When a notification message arrives at a source, the source reacts to the message by updating its transmission rate. If the destination is not congested, then source agents may increase their rates by a fixed value or a fixed multiple of the current transmit rate. They can continue increasing the rates every time they receive a notification that indicates that there is no congestion at the destination. Once a congestion notification message arrives, source agents may reduce their transmission rate. If the level of congestion is not known then the source agents may reduce the transmission rate by a fixed value or a fixed factor; otherwise the rate reduction can be proportional to the level of congestion at the destination.
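The source-side reaction to a notification can be sketched as a single update function. The name `next_rate`, the default `step` and `decrease` values, and the choice to express the congestion level as a fraction (0.25 meaning 25%) are assumptions for illustration.

```python
from typing import Optional


def next_rate(rate: float, congested: bool, step: float = 1.0,
              decrease: float = 0.5, level: Optional[float] = None,
              min_rate: float = 0.0) -> float:
    """Update a source's transmit rate after one notification.

    Not congested: additive increase by `step`.
    Congested, level unknown: multiply by the fixed factor `decrease`.
    Congested, level known: reduce in proportion to the congestion level.
    """
    if not congested:
        return rate + step
    if level is not None:
        return max(min_rate, rate * (1.0 - level))
    return max(min_rate, rate * decrease)
```

Starting from a rate of 10, a non-congestion notification gives 11, a congestion notification without a level gives 5, and one reporting 25% congestion gives 7.5.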
  • congestion notification only indicates whether there is congestion or not without the actual amount of congestion
  • a number of optimizations may be used to avoid the oscillations.
  • The amount by which the transmit rates of various sources are increased may be reduced upon each non-congestion notification to allow smooth convergence to a steady state rate.
  • a standard proportional integral (PI) or proportional integral derivative (PID) controller based mechanism may also be used to adjust the rates.
  • the Proportional gain, Integral gain, and the Derivative gain tuning parameters can be chosen based on the system parameters such as number of agents, burstiness in traffic, etc.
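A PI controller of the kind mentioned above can be sketched in a few lines. The class name `PIRateController`, the use of a `base_rate` offset, and the definition of the error signal (for example, destination capacity minus observed load) are assumptions; the source does not specify them, and the gains would need tuning for a given system.

```python
class PIRateController:
    """Positional-form PI controller: rate = base + Kp*error + Ki*sum(errors)."""

    def __init__(self, kp: float, ki: float, base_rate: float,
                 min_rate: float = 0.0):
        self.kp = kp
        self.ki = ki
        self.base_rate = base_rate
        self.min_rate = min_rate
        self.integral = 0.0

    def update(self, error: float) -> float:
        """`error` > 0 means spare capacity (speed up); < 0 means congestion."""
        self.integral += error
        rate = self.base_rate + self.kp * error + self.ki * self.integral
        return max(self.min_rate, rate)
```

The integral term is what removes steady-state error; a derivative term could be added in the same way to obtain a PID variant.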
  • Another concern with such a design may be the unfairness of the destination's bandwidth allocation among various contending source agents. Since the source agents are reacting locally and independently, the final rate at which they stabilize may depend upon when they became active and how many notification messages they have received so far. To address this, an example implementation may reset the transmit rates of all source agents to various destination agents to the initial configured value periodically and then repeat the notification protocol until the rates stabilize again. Since the notification messages are sent to all source agents, in this round if no new sources were activated during the stabilization period, the resulting rates will be fair.
  • the frequency at which notification messages are sent may vary in various implementations.
  • the notification may be piggybacked on the response and therefore can be sent for every arriving message.
  • the notification may be sent less frequently.
  • The notification message may be sent by the destination agent whenever new agents begin to communicate with the destination, or an active agent stops communicating with the destination. Additionally, the notification message may be sent when the level of congestion at the destination changes in a way that requires the transmission rates at the sources to be updated.
  • the second method of congestion avoidance and QoS via injection regulation involves computation of the transmission rate by the source agents without any explicit notification from the destinations.
  • The sources determine the level of congestion at the destinations with which they are communicating.
  • the congestion level may be determined by monitoring for observable metrics such as round-trip time of request messages to response messages, when response messages are expected for the request messages, or the amount of backpressure the source agent is experiencing from the network when the source agent attempts to inject a message for a given destination.
  • the backpressure is the flow control signaling from the network to an agent if the agent is no longer allowed to send any more data into the network due to congestion in the network.
  • source agents can take action locally to avoid congestion and also ensure QoS.
  • Source agents may regulate the transmission rates in a number of ways.
  • the source agents may use multiple sets of local registers, one for each destination, to track the round-trip latencies and current transmission rates to the destinations.
  • Sources may start transmission at a fixed rate which may be decided based on the QoS policy, relative weight of various traffic flows from sources to the destination agents, and the maximum bandwidth of the destination agents.
  • A source agent may decide to increase the transmission rate to the destination.
  • the rate of increase may be linear additive, i.e. after each round-trip time of latency observation the transmission rate is increased by a fixed value as long as round-trip time does not indicate congestion.
  • the fixed value may be determined based on how quickly the source agents want to achieve full bandwidth utilization; choosing a high rate of increase will enable source agents to reach high bandwidth quickly and increase the system utilization, however it may also lead to oscillations in the system congestion.
  • a standard proportional integral (PI) or proportional integral derivative (PID) controller based mechanism may also be used to adjust the rates based on the observed congestion.
  • the source agents can simply track the backpressure signal at their outgoing interfaces to infer the level of congestion in the system. The more frequently an outgoing interface experiences backpressure, the more congestion there is at the set of destinations for the messages transmitted on that interface. Once the congestion level is determined at an interface, the transmission rates to the corresponding set of destination agents can be regulated using the previously described additive increase and multiplicative decrease scheme or a PI or PID controller scheme.
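  • A minimal sketch of backpressure-based congestion inference is given below in Python. The sliding-window length, the congestion threshold, and the names `BackpressureMonitor` and `regulate` are hypothetical and chosen only for illustration; an actual design may measure backpressure differently.

```python
from collections import deque


class BackpressureMonitor:
    """Tracks how often an outgoing interface was stalled by backpressure."""

    def __init__(self, window: int = 64):
        # Each sample is 1 if the injection attempt was stalled, else 0.
        self.samples = deque(maxlen=window)

    def record(self, stalled: bool) -> None:
        self.samples.append(1 if stalled else 0)

    def congestion_level(self) -> float:
        """Fraction of recent injection attempts that saw backpressure."""
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)


def regulate(rate: float, level: float, threshold: float = 0.25,
             step: float = 0.05, backoff: float = 0.5,
             max_rate: float = 1.0) -> float:
    """Additive increase below the threshold, multiplicative decrease above."""
    if level > threshold:
        return rate * backoff
    return min(max_rate, rate + step)
```

The resulting rate would apply to every destination reached through the monitored interface, since backpressure at an interface does not identify a single destination.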
  • end to end credit based flow control is used between all source and destination agents.
  • the destination agents use buffers to receive arriving messages and control the message arrival by providing credits to the source agents.
  • a credit corresponds to an empty buffer slot at the destination allocated for a message from the source agent. If a source does not have an allocated buffer or credit for a destination, it cannot send messages to the destination, and must acquire credit first.
  • Buffer allocation and credit distribution to various source agents are performed at each destination agent based on the QoS policy.
  • This method provides end-to-end flow control in the system and congestion avoidance and QoS policy can be ensured with the correct allocation of buffers, distribution of credit and processing of the arriving messages at the destination. Buffers or credits at a destination can be pre-allocated among the source agents communicating with the destination or can be dynamically allocated upon receiving an explicit request from the sources.
  • every destination agent has separate buffer pools for arriving messages from every source agent. This is illustrated in FIG. 7( a ) .
  • At the destination there are four buffer pools, one for each source agent.
  • the arriving messages 1, 2, 3, and 4 from the source agents are stored in the corresponding buffer pools in First In First Out (FIFO) order.
  • Certain designs may store the arriving messages in non-FIFO order depending on the priority of various messages or certain isolation requirements such as one type of messages cannot block the other types.
  • Source agents must acquire credit before sending a message to the destination.
  • the sources can begin with a credit value equal to the number of slots in its buffer pool at the destination.
  • sources may begin with zero credit, and the destinations distribute credits to the sources after reset based on the number of free slots in the buffer pool.
  • the arriving messages stored in the buffers are read ( 700 ) for further processing at the destination agent.
  • the mechanism to read the messages from various buffer pools is based on the system QoS policy.
  • a QoS policy which assigns weights w 1 , w 2 , w 3 and w 4 to the four source agents.
  • the number of messages read and processed from each buffer can be made proportional to the weight of the source that writes into the buffer.
  • the weights of the four buffers are 1, 2, 3 and 4, respectively. In this case, in every 10 messages that are read, one should be from the first buffer, two should be from the second buffer, three should be from the third buffer and four should be from the fourth buffer, providing weighted fair allocation of bandwidth to each source agent.
  • a slightly approximate implementation may provide fairness over larger periods of time, allowing some unfairness during short time periods.
  • Standard weighted Round Robin (WRR), Deficit Round Robin (DRR), or Weighted Fair Queuing (WFQ) based designs may be used to implement the read mechanism.
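  • As one possible illustration of the read mechanism, a simple weighted round robin over the four buffer pools may be sketched in Python as follows. The function name and the use of `deque` objects as buffer pools are assumptions for this example; DRR or WFQ designs would differ in detail.

```python
from collections import Counter, deque


def weighted_round_robin(pools, weights, max_reads):
    """Yield (pool index, message) pairs, serving pool i up to weights[i]
    times per round while it has messages waiting."""
    reads = 0
    while reads < max_reads and any(pools):
        for index, (pool, weight) in enumerate(zip(pools, weights)):
            for _ in range(weight):
                if not pool or reads >= max_reads:
                    break
                yield index, pool.popleft()
                reads += 1


# Four backlogged pools with weights 1, 2, 3 and 4.
pools = [deque(f"s{i + 1}-m{k}" for k in range(20)) for i in range(4)]
served = Counter(i for i, _ in weighted_round_robin(pools, [1, 2, 3, 4], 10))
```

With all four pools backlogged, one round of ten reads serves the pools in the ratio 1:2:3:4. This simple form reads each pool in bursts; the smoother interleaving of other schedulers is omitted here.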
  • QoS policy may also provide different priority to the source agents, and the priority may be strict, i.e. if there is a message of higher priority waiting then it has to be processed before all messages of lower priority. Between messages of the same priority value, equal or weighted fairness may be needed. In this case, a combination of weighted arbitration and strict priority arbitration may be implemented.
  • the corresponding source agent can send a new message to the destination. Therefore, the destination agent can send a credit back to the source agents each time a message is read from the buffer.
  • the credit can be sent as a separate message to the source agent, or can be piggybacked on an existing message that is being sent to the source agent. If the arriving message at the destination is going to generate a response message back to the source then it may be efficient to piggyback the credit on the response message.
  • the resulting protocol of message transmission and credit return is illustrated in FIG. 8( a ) .
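  • The message transmission and credit return with dedicated buffer pools may be sketched as follows in Python. The class and method names are hypothetical, credits are modeled as simple counters, and network latency is not modeled.

```python
from collections import deque


class Destination:
    """Destination agent with one FIFO buffer pool per source agent."""

    def __init__(self, sources, slots_per_source):
        self.pools = {src: deque() for src in sources}
        self.slots = slots_per_source

    def receive(self, src, msg):
        pool = self.pools[src]
        # A credit guarantees a free slot, so the pool can never overflow.
        assert len(pool) < self.slots, "source sent without a credit"
        pool.append(msg)

    def read(self, src):
        """Read the oldest message from src's pool; one credit is returned."""
        return self.pools[src].popleft(), 1


class Source:
    """Source agent that must hold a credit before sending."""

    def __init__(self, name, dest, initial_credits):
        self.name, self.dest, self.credits = name, dest, initial_credits

    def try_send(self, msg) -> bool:
        if self.credits == 0:
            return False  # no allocated buffer slot at the destination
        self.credits -= 1
        self.dest.receive(self.name, msg)
        return True

    def on_credit(self, count=1):
        self.credits += count
```

In this sketch the source begins with as many credits as its pool has slots; the alternative of starting at zero credits amounts to passing `initial_credits=0` and having the destination call `on_credit` after reset.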
  • the total number of buffers needed at a destination may be proportional to the number of source agents talking to the destination.
  • the total number of buffers in the entire system may be O(n^2) for n agents, since each destination may need n-1 buffer pools, or one for each source.
  • the number of slots in the buffer pool for each source agent may need to be proportional to the round-trip latency between the source and the destination, and the maximum message rate from the source to the destination.
  • the round-trip latency may be O(n) and therefore the total number of buffer slots in the system may be O(n^3).
  • the round-trip time may be O(n^(1/2)), in which case the total number of buffer slots in the system may be O(n^(5/2)).
  • the number of buffer slots may grow to become excessive as a fully connected system scales in number of agents.
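  • The scaling argument above can be written out as a short calculation. The symbols S, L(n) and r are introduced here only for illustration, and the calculation assumes that each of the n agents communicates with every other agent and that the maximum message rate r per source does not grow with n.

```latex
% n destinations, each with (n-1) pools of O(r * L(n)) slots,
% where L(n) is the round-trip latency and r the maximum message rate.
\begin{align*}
S(n) &= n\,(n-1)\cdot O\bigl(r\,L(n)\bigr) = O\bigl(n^{2}\,L(n)\bigr) \\
L(n) = O(n) \;&\Rightarrow\; S(n) = O(n^{3}) \\
L(n) = O(n^{1/2}) \;&\Rightarrow\; S(n) = O(n^{5/2})
\end{align*}
```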
  • a second example implementation of end-to-end flow control method may use a shared pool of buffers at the destinations to store the arriving messages from various sources instead of separate buffers for each source agent.
  • An example is illustrated in FIG. 7( b ) .
  • the buffer slots from the shared pool may be dynamically allocated to the requesting source agents based on their need and based on the QoS policy. Thus the source agents begin with no credit.
  • To send a message source agents send a credit request message to the destination. If destination has a buffer slot available then it reserves the slot for the request and responds back with a credit to the source. Source can then consume the credit and send the message.
  • the resulting protocol of message transmission and credit return is illustrated in FIG. 8( b ) .
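  • A minimal sketch of the request and grant exchange with a shared buffer pool follows, in Python. Queuing an unsatisfied request in arrival order is an assumption of this example (a QoS policy could select a different waiting source), and the names used are hypothetical.

```python
from collections import deque


class SharedPoolDestination:
    """Destination agent that grants credits from one shared buffer pool."""

    def __init__(self, total_slots):
        self.free_slots = total_slots
        self.pending = deque()  # sources still waiting for a credit
        self.buffer = deque()   # messages occupying reserved slots

    def request_credit(self, src) -> bool:
        """Reserve a slot and grant a credit if one is free, else queue."""
        if self.free_slots > 0:
            self.free_slots -= 1
            return True
        self.pending.append(src)
        return False

    def receive(self, src, msg):
        # The message lands in the slot reserved when the credit was granted.
        self.buffer.append((src, msg))

    def read(self):
        """Process one message and return (item, source granted a credit).

        The freed slot is handed straight to a waiting source if there is
        one; otherwise it goes back to the free pool."""
        item = self.buffer.popleft()
        if self.pending:
            return item, self.pending.popleft()
        self.free_slots += 1
        return item, None
```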
  • To reduce latency, it is possible for sources to acquire a few credits ahead of time. In such designs, there may be deadlock if the destination runs out of buffer slots and new credits while sources hold credits that they are not using. To avoid this, the source agents return unused credits to the destination if they are not used within a certain timeout interval.
  • source agents can go ahead and send a message to the destination without acquiring a credit from the destination.
  • the destination agents may choose to accept an arriving message if there are available resources to accept and process the message or it may decide to discard it.
  • When the destination discards an arriving message from a source, the destination notifies the source so that the source agent can re-transmit the message.
  • an example implementation may restrain the sources to always acquire a credit before re-sending a message that was earlier discarded by the destination.
  • the resulting protocol of message transmission and credit return is illustrated in FIG. 8( c ) .
  • An additional optimization may spare a source from sending an explicit credit request to a destination for a previously discarded message. Assuming that source agents always resend discarded messages later, the destination can register all discards and send credits to the affected source agents once resources and buffer slots are available at the destination for the source. Once the credit arrives at the source, the source agent may re-send the discarded message, which is guaranteed to be accepted at the destination this time.
  • the resulting protocol of message transmission and credit return is illustrated in FIG. 8( d ) .
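  • The discard and deferred-credit behavior may be sketched in Python as follows. The names, the boolean return value standing in for the discard notification, and the first-come ordering of registered discards are assumptions made for this example only.

```python
from collections import deque


class SpeculativeDestination:
    """Destination that accepts uncredited messages when it has room and
    registers discards so that credits can be sent later."""

    def __init__(self, slots):
        self.slots = slots
        self.buffer = deque()
        self.reserved = 0    # slots held for outstanding credits
        self.owed = deque()  # sources whose messages were discarded

    def receive(self, src, msg, has_credit=False) -> bool:
        """Return True if accepted; False means discarded and notified."""
        if has_credit:
            self.reserved -= 1  # use the slot reserved for this credit
            self.buffer.append((src, msg))
            return True
        if len(self.buffer) + self.reserved < self.slots:
            self.buffer.append((src, msg))
            return True
        self.owed.append(src)   # register the discard
        return False

    def read(self):
        """Free one slot; if a discard is registered, reserve the slot and
        return that source so a credit can be sent to it."""
        item = self.buffer.popleft()
        if self.owed:
            self.reserved += 1
            return item, self.owed.popleft()
        return item, None
```

Because the slot is reserved at the moment the credit is issued, the re-sent message cannot be discarded a second time, at the cost of that slot being unavailable to uncredited traffic in the meantime.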
  • Re-transmission of messages may affect the ordering of message delivery so the source and destination agents should ensure that the un-ordered delivery of message is either acceptable or is resolved correctly.
  • the buffer slot is freed up, and can be used for a newly arriving message or can be allocated for a source agent that requested a credit previously or had a message discarded.
  • a hybrid implementation of the two end-to-end flow control schemes may use both a set of separate buffer pools, one for each source agent, and a dynamically allocated buffer pool shared among all sources.
  • the source agents will track two types of credits for each destination, one for the dedicated buffer pool each has at the destination, and the other for the buffer slots it requests and is allocated dynamically at the destination.
  • the source agents, based on their design, the types of messages they are sending, and the QoS policy, may use the two types of credits for the different types of messages being sent.
  • a number of alternative example implementations are possible within the context of the previously described end-to-end flow control schemes, in which the buffer allocation and credit distribution at the destination may be performed in various ways depending upon the latency between the destination and source agents, the topology of the NoC interconnect, the bandwidth of various NoC channels and the transmission and receive capability of the agents.
  • One may also combine the end-to-end credit flow control schemes with the feedback based congestion notification schemes to avoid congestion more effectively and provide end-to-end QoS more efficiently.
  • FIG. 9 illustrates an example computer system 900 on which example designs may be implemented.
  • the computer system 900 includes an apparatus 905 which may involve an I/O unit 935 , storage 960 , and a processor 910 operable to execute one or more units as known to one of skill in the art.
  • the term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor 910 for execution, which may come in the form of computer-readable storage mediums, such as, but not limited to optical disks, magnetic disks, read-only memories, random access memories, solid state devices and drives, or any other types of tangible media suitable for storing electronic information, or computer-readable signal mediums, which can include transitory media such as carrier waves.
  • the I/O unit processes input from user interfaces 940 and operator interfaces 945 which may utilize input devices such as a keyboard, mouse, touch device, or verbal command.
  • the apparatus 905 may also be connected to an external storage 950 , which can contain removable storage such as a portable hard drive, optical media (CD or DVD), disk media or any other medium from which a computer can read executable code.
  • the apparatus may also be connected to an output device 955 , such as a display to output data and other information to a user, as well as request additional information from a user.
  • the connections from the apparatus 905 to the user interface 940 , the operator interface 945 , the external storage 950 , and the output device 955 may be via wireless protocols, such as the 802.11 standards, Bluetooth® or cellular protocols, or via physical transmission media, such as cables or fiber optics.
  • the output device 955 may therefore further act as an input device for interacting with a user.
  • the processor 910 may execute one or more modules.
  • the congestion detection module 911 may be configured to determine the congestion level in the network based on various performance metrics such as the round trip latency or the amount of backpressure observed by the source agents, or the receive rate of messages at a destination agent.
  • the rate computation module 912 present at a NoC node may compute the rate at which an agent may transmit data into the network.
  • Source agents may compute the rates by observing the congestion level metrics such as round trip time or amount of backpressure, or destination agents may compute the rate at which various sources may send to it based on the rate at which it is receiving messages currently and the rate at which it can process them.
  • the QoS enforcement module 913 may be configured to dynamically adjust the transmission rates at various source agents so that the end-to-end QoS specification is satisfied.
  • the various modules and the processor may, singly or in combination, be configured to perform certain operations.
  • Such operations can include to receive, at one of a first node and a second node in the NoC, an instruction based on at least one of a command signal from the other of the first node and the second node, a computed level of congestion based on a QoS metric indicative of traffic congestion, and an end-to-end flow control buffer allocation result; and to determine, at the one of the first node and the second node, an allocation of traffic bandwidth based on a result of the instruction.
  • the QoS metric can include at least one of a round trip time of a request message to a response message, and backpressure experienced by the one of the first node and the second node.
  • the command signal can be in the form of a bit signal and/or a notification message indicative of congestion. Further operations can include to determine an allocation of traffic bandwidth by a computation of a transmission rate at the one of the first node and the second node and an allocation of the traffic bandwidth based on the computed transmission rate, and/or by an issuance of a buffer allocation to the one of the first node and the second node from the other of the first node and the second node based on the result of the instruction, as described in the example implementations above.
  • FIG. 10 illustrates an example Network on Chip (NoC) hardware block diagram 1000 , on which example implementations may be implemented.
  • the NoC 1010 may include a plurality of routers and hosts that are connected by interconnects, as illustrated and described in FIGS. 1-6 .
  • the NoC 1010 can be implemented on a chip 1015 , which may be in the form of an integrated circuit, such as a System on Chip (SoC), Very-Large-Scale-Integration (VLSI) device or other hardware configurations, depending on the desired implementation.
  • the NoC 1010 is configured to handle all of the functions as described in the example implementations above at the NoC level, or can be operated on with a processor.
  • Chip 1015 may also include an I/O unit 1035 for facilitating communications between the chip 1015 and a computer system implementing the chip 1015 via a computer bus interface 1045 and external storage 1050 .
  • Chip 1015 may also include Random Access Memory (RAM) 1060 and processor 1015 .
  • Processor 1015 may store and execute the congestion detection module 911 , the rate computation module 912 , and the QoS enforcement module 913 as described above. Additionally, the modules in the processor 1015 can be stored and executed within the nodes of the NoC 1010 itself at the NoC level.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Systems and methods described herein are directed to solutions for NoC interconnects that provide congestion avoidance and end-to-end uniform and weighted-fair allocation of resource bandwidths among various contenders in a mesh or torus interconnect. The example implementations are fully distributed and involve using explicit congestion notification messages or local congestion identification for congestion detection. Based on the congestion level detected, the injection rates of traffic at various agents are regulated that avoids congestion and also provides end-to-end QoS. Alternative example implementations may also utilize end-to-end credit based flow control between communicating agents for resource and bandwidth allocation of the destination between the contending sources. The resource allocation is performed so that both the weighted and strict bandwidth allocation QoS policies are satisfied.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application claims benefit under 35 USC §120 and is a Continuation of U.S. patent application Ser. No. 13/886,794, filed on May 3, 2013, titled “Congestion Control and QoS in NoC By Regulating the Injection Traffic”, the content of which is incorporated herein in its entirety by reference for all purposes.
  • BACKGROUND
  • Technical Field
  • Methods and example implementations described herein are generally directed to interconnect architecture, and more specifically, to weight assignment and weighted arbitration of node channels in a Network on Chip (NoC) system interconnect architecture.
  • Related Art
  • The number of components on a chip is rapidly growing due to increasing levels of integration, system complexity and shrinking transistor geometry. Complex System-on-Chips (SoCs) may involve a variety of components e.g., processor cores, DSPs, hardware accelerators, memory and I/O, while Chip Multi-Processors (CMPs) may involve a large number of homogenous processor cores, memory and I/O subsystems. In both systems the on-chip interconnect plays a role in providing high-performance communication between the various components. Due to scalability limitations of traditional buses and crossbar based interconnects, Network-on-Chip (NoC) has emerged as a paradigm to interconnect a large number of components on the chip. NoC is a global shared communication infrastructure made up of several routing nodes interconnected with each other using point-to-point physical links.
  • Messages are injected by the source and are routed from the source node to the destination over multiple intermediate nodes and physical links. The destination node then ejects the message and provides the message to the destination. For the remainder of this application, the terms ‘components’, ‘blocks’, ‘hosts’ or ‘cores’ will be used interchangeably to refer to the various system components which are interconnected using a NoC. Terms ‘routers’ and ‘nodes’ will also be used interchangeably. Without loss of generality, the system with multiple interconnected components will itself be referred to as a ‘multi-core system’.
  • There are several topologies in which the routers can connect to one another to create the system network. Bi-directional rings (as shown in FIG. 1(a)), 2-D (two dimensional) mesh (as shown in FIG. 1(b)) and 2-D Torus (as shown in FIG. 1(c)) are examples of topologies in the related art. Mesh and Torus can also be extended to 2.5-D (two and a half dimensional) or 3-D (three dimensional) organizations. FIG. 1(d) shows a 3D mesh NoC, where there are three layers of 3×3 2D mesh NoC shown over each other. The NoC routers have up to two additional ports, one connecting to the router in the higher layer, and another connecting to the router in the lower layer. Router 111 in the middle layer of the example has both ports used, one connecting to the router at the top layer and another connecting to the router at the bottom layer. Routers 110 and 112 are at the bottom and top mesh layers respectively, therefore they have only the upper facing and lower facing ports of the two additional ports used. The inter-layer ports or channels between these three routers are 113 and 114.
  • Packets are message transport units for intercommunication between various components. Routing involves identifying a path composed of a set of routers and physical links of the network over which packets are sent from a source to a destination. Components are connected to one or multiple ports of one or multiple routers, with each such port having a unique ID. Packets carry the destination's router and port ID for use by the intermediate routers to route the packet to the destination component.
  • Examples of routing techniques include deterministic routing, which involves choosing the same path from A to B for every packet. This form of routing is independent from the state of the network and does not load balance across path diversities, which might exist in the underlying network. However, such deterministic routing may be implemented in hardware, maintains packet ordering and may be rendered free of network level deadlocks. Shortest path routing may minimize the latency as such routing reduces the number of hops from the source to the destination. For this reason, the shortest path may also be the lowest power path for communication between the two components. Dimension-order routing is a form of deterministic shortest path routing in 2-D, 2.5-D, and 3-D mesh networks. In this routing scheme, messages are routed along each coordinate in a particular sequence until the message reaches the final destination. For example in a 3-D mesh network, one may first route along the X dimension until the message reaches a router whose X-coordinate is equal to the X-coordinate of the destination router. Next, the message takes a turn and is routed along the Y dimension and finally takes another turn and moves along the Z dimension until the message reaches the final destination router. Dimension ordered routing is often minimal turn and shortest path routing.
  • FIG. 2 pictorially illustrates an example of XY routing in a two dimensional mesh. More specifically, FIG. 2 illustrates XY routing from node ‘34’ to node ‘00’. In the example of FIG. 2, each component is connected to only one port of one router. A packet is first routed over the x-axis till the packet reaches node ‘04’ where the x-coordinate of the node is the same as the x-coordinate of the destination node. The packet is next routed over the y-axis until the packet reaches the destination node.
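  • Dimension-order routing may be sketched as the following Python function, which works for 2-D and 3-D meshes alike. Reading node label ‘34’ as the coordinate pair (x=3, y=4) is an assumption inferred from the turn at node ‘04’; the function name is hypothetical.

```python
def dimension_order_route(src, dst):
    """Return the router coordinates visited from src to dst (inclusive),
    correcting one dimension at a time in index order (X, then Y, then Z)."""
    path = [tuple(src)]
    current = list(src)
    for dim in range(len(current)):
        step = 1 if dst[dim] > current[dim] else -1
        while current[dim] != dst[dim]:
            current[dim] += step
            path.append(tuple(current))
    return path


# XY route of FIG. 2, assuming node '34' is (3, 4) and node '00' is (0, 0).
xy_path = dimension_order_route((3, 4), (0, 0))
```

The path first moves along X from (3, 4) to (0, 4), then along Y down to (0, 0), taking a single turn and seven hops.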
  • In heterogeneous mesh topology in which one or more routers or one or more links are absent, dimension order routing may not be feasible between certain source and destination nodes, and alternative paths may have to be taken. The alternative paths may not be shortest or minimum turn.
  • Source routing and routing using tables are other routing options used in NoC. Adaptive routing can dynamically change the path taken between two points on the network based on the state of the network. This form of routing may be complex to analyze and implement.
  • A NoC interconnect may contain multiple physical networks. Over each physical network, there may exist multiple virtual networks, wherein different message types are transmitted over different virtual networks. In this case, at each physical link or channel, there are multiple virtual channels; each virtual channel may have dedicated buffers at both end points. In any given clock cycle, only one virtual channel can transmit data on the physical channel.
  • NoC interconnects may employ wormhole routing, wherein, a large message or packet is broken into small pieces known as flits (also referred to as flow control digits). The first flit is the header flit, which holds information about this packet's route and key message level info along with payload data and sets up the routing behavior for all subsequent flits associated with the message. Optionally, one or more body flits follows the head flit, containing the remaining payload of data. The final flit is the tail flit, which in addition to containing the last payload also performs some bookkeeping to close the connection for the message. In wormhole flow control, virtual channels are often implemented.
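  • As an illustration only, splitting a packet into head, body and tail flits may be sketched in Python as follows. The dictionary layout, the field names and the four-byte flit payload are assumptions for this example and do not reflect any particular flit format.

```python
def packetize(route_info, payload: bytes, flit_payload_bytes: int = 4):
    """Split a payload into flits. The first flit is the head and carries
    the route information; the last flit is marked as the tail."""
    chunks = [payload[i:i + flit_payload_bytes]
              for i in range(0, len(payload), flit_payload_bytes)] or [b""]
    flits = []
    for index, chunk in enumerate(chunks):
        flit = {
            "is_head": index == 0,
            "is_tail": index == len(chunks) - 1,
            "payload": chunk,
        }
        if flit["is_head"]:
            # Only the head flit holds routing information; later flits
            # follow the path that the head flit sets up.
            flit["route"] = route_info
        flits.append(flit)
    return flits


flits = packetize({"dst_router": (0, 0), "dst_port": 1}, b"abcdefghij")
```

A payload that fits in one flit yields a single flit marked as both head and tail.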
  • The physical channels are time sliced into a number of independent logical channels called virtual channels (VCs). VCs provide multiple independent paths to route packets, however they are time-multiplexed on the physical channels. A virtual channel holds the state needed to coordinate the handling of the flits of a packet over a channel. At a minimum, this state identifies the output channel of the current node for the next hop of the route and the state of the virtual channel (idle, waiting for resources, or active). The virtual channel may also include pointers to the flits of the packet that are buffered on the current node and the number of flit buffers available on the next node.
  • The term “wormhole” plays on the way messages are transmitted over the channels: the output port at the next router can be so short that received data can be translated in the head flit before the full message arrives. This allows the router to quickly set up the route upon arrival of the head flit and then opt out from the rest of the conversation. Since a message is transmitted flit by flit, the message may occupy several flit buffers along its path at different routers, creating a worm-like image.
  • Based upon the traffic between various end points, and the routes and physical networks that are used for various messages, different physical channels of the NoC interconnect may experience different levels of load and congestion. During congestion, when multiple sources transmit messages to the same destination, their messages may contend with each other and with the cross-traffic for the bandwidth. Therefore, the effective destination bandwidth received by each source will depend on their positions in the network, how their routes overlap with each other, cross-traffic along their routes to the destination, and the arbitration policies deployed at various routers where arbitration is needed. In spite of uniformly fair arbitration policies at all routers, depending on location of various sources there may be a substantial difference in the destination bandwidth received.
  • Consider a section of a NoC interconnect shown in FIG. 3, wherein four components (source 1, source 2, source 3, and source 4) transmit messages to one component (destination). In this example, the maximum data transmit bandwidth of the four source components is equal to the maximum data receive bandwidth of the destination component. Each of the five components are connected to a local router node, and the router nodes are connected with each other using point to point channels as shown in FIG. 3. In the example of FIG. 3, each of the channels have a receive bandwidth of the destination component equal to a transmit bandwidth of the source component.
  • In the system shown in FIG. 3, if all four source components attempt to transmit data at their peak transmit rate and if the destination component is ready to accept data at its peak receive rate, then messages from the four source components will contend with each other within the NoC interconnect.
  • In FIG. 4, the routers and components are separated for clarity, and the channels that connect components with their local routers are illustrated.
  • At router 41 in FIG. 4, messages arriving at the left input port (e.g., from router 42) and the bottom input port (e.g., from source 4) will contend for the right output port (e.g., to router 40). If routers implement uniformly fair arbitration policy to arbitrate between incoming messages at different input ports contending for an output port, then the output port's bandwidth will be equally split between the two input ports as shown. Each input port will receive 50% of the destination bandwidth—source 4 therefore will receive half of the destination bandwidth.
  • At router 42 in FIG. 4, messages arriving at the left input port (e.g. from router 43) and bottom input port (e.g., from source 3) will contend for the right output port. If routers implement uniformly fair arbitration policy to arbitrate between incoming messages at different input ports contending for an output port, then the 50% output port's bandwidth (computed in the above step) will be equally split between the two input ports as shown. Each input port will receive 25% of the destination bandwidth—source 3 therefore will receive a quarter of the destination bandwidth.
  • At router 43 in FIG. 4, messages arriving at the left input port (e.g., from router 44) and bottom input port (e.g., from source 2) will contend for the right output port (e.g., to router 42). If routers implement uniformly fair arbitration policy to arbitrate between incoming messages at different input ports contending for an output port, then the 25% output port's bandwidth (computed in the above step) will be equally split between the two input ports as shown. Each input port will receive 12.5% of the destination bandwidth—source 2 therefore will receive 12.5% of the destination bandwidth. The remaining 12.5% bandwidth will be received by source 1.
  • The example of FIG. 4 illustrates that even though each router employs a uniformly fair arbitration policy wherein the router gives fair share of output port bandwidth among all input port contenders, the four sources receive vastly different shares of the destination bandwidth. In a complex network with additional cross-traffic, the bandwidth allocated to various source components when they contend for various destinations may vary substantially. This may be undesirable in many applications, wherein fair or equal allocation of various resources among all contenders may be important to achieve a high system performance. In many systems, weighted allocation is desired, so that the various resource bandwidths are allocated among various contenders in a pre-specified ratio.
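  • The shares derived for FIG. 4 can be reproduced with a short Python calculation. The function name is hypothetical, and the sketch assumes a single chain of routers in which every source is always ready to send and each router splits its output bandwidth equally between its upstream port and its local source.

```python
def fair_chain_shares(num_sources):
    """Destination-bandwidth share of each source on a chain of routers.

    Sources are numbered 1..num_sources, with the highest-numbered source
    attached to the router nearest the destination."""
    shares = {}
    remaining = 1.0
    for src in range(num_sources, 1, -1):
        remaining /= 2           # equal split at this router
        shares[src] = remaining  # the local source takes one half
    shares[1] = remaining        # the farthest source gets what is left
    return shares


shares = fair_chain_shares(4)
```

For four sources this gives 50%, 25%, 12.5% and 12.5% for sources 4, 3, 2 and 1 respectively, matching the walk-through above.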
  • There are several techniques in the related art to provide uniform or weighted fair arbitration within a single router, wherein the output port bandwidth is allocated to contending input ports based on the weight specification. Weighted round-robin, deficit round-robin, weighted fair queuing, etc. are a few techniques that are used in the related art. Guaranteeing weighted or uniform allocation of various resources among contenders in a distributed NoC interconnect with resources and contenders connected at arbitrary positions in the NoC interconnect is challenging. A few techniques that are used in the related art are described below.
  • Rate limiting the sources: Each source contending for a resource destination is allowed to send data at a pre-specified rate based on its fair share. This technique is independent of the state of other sources, whether the other sources are contending for the resource or not. Therefore, based upon the pre-specified rates of sources, rate limiting of the sources can either lead to under-utilization of resource bandwidth, or unfair allocation.
  • Age based arbitration: Every message injected by various components carries timestamp information, which describes the age of the message. Within the NoC interconnect, routers give higher preference to older messages over newer messages, whenever multiple messages contend for an output port. This technique can provide end-to-end uniform fairness, however it is unable to provide weighted fairness. Furthermore, age based arbitration comes at a high implementation cost of additional bits needed to carry the age information and complex circuitry at every router to determine the oldest message.
  • Weight based arbitration: Weights for various channels in a network on chip (NoC) are computed based on the bandwidth requirements of the traffic flows at the channels. Subsequently these weights are used to perform weighted arbitration between channels at each router in the NoC to provide Quality of Service (QoS). Advanced implementations may dynamically adjust the weights by monitoring the activity of flows at the channels to avoid unfair allocations, and perform weighted arbitration using the newly computed channel weights. This is described in U.S. application Ser. No. 13/745,696, herein incorporated by reference in its entirety for all purposes. Using this scheme, an example assignment of weights to various channels of the NoC illustrated in FIG. 4 is shown in FIG. 5. There are four flows from four sources which are contending for the destination's bandwidth. Assume that each flow needs to share the bandwidth equally. At node 40, the bandwidth at the incoming channel and the outgoing channel is the same, therefore after normalization their weights are 1. At node 41, the bandwidth requirement of the incoming channel from left is three times the bandwidth requirement of the incoming channel at the bottom, as there are three flows being carried on the former channel versus only one flow at the latter. Therefore the weights are 3 and 1 respectively. Similarly, weights of the left and bottom incoming channels at node 42 are 2 and 1 respectively and at node 43 are 1 and 1 respectively. With this weight assignment, if weighted arbitration is performed at all nodes then fair allocation of destination bandwidth may be provided to all sources. If certain sources are inactive and are not participating in the arbitration then this weighted scheme may become unfair. For example if source 2 and source 3 are not participating then source 1 may receive three-fold higher bandwidth than source 4.
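  • The effect of the FIG. 5 weights, including the unfairness when some sources are idle, can be checked with the following Python sketch. It assumes work-conserving weighted arbitration (an idle input's share passes to the other input) and that every active source always has data to send; the function name and argument layout are hypothetical.

```python
def weighted_chain_shares(weights, active):
    """Share of destination bandwidth per source on a chain of routers.

    weights: (left_weight, bottom_weight) per merging router, ordered from
             the router farthest from the destination to the nearest.
    active:  set of source numbers currently sending. Source 1 feeds the
             left port of the first router; source k+2 feeds the bottom
             port of the router at index k."""
    num_sources = len(weights) + 1
    shares = {s: 0.0 for s in range(1, num_sources + 1)}
    remaining = 1.0
    # Walk from the router nearest the destination back toward source 1.
    for k in range(len(weights) - 1, -1, -1):
        left_w, bottom_w = weights[k]
        local = k + 2
        upstream_active = any(s in active for s in range(1, local))
        if local in active and upstream_active:
            shares[local] = remaining * bottom_w / (left_w + bottom_w)
            remaining -= shares[local]
        elif local in active:
            shares[local] = remaining  # nothing arrives from upstream
            remaining = 0.0
    if 1 in active:
        shares[1] = remaining
    return shares


fig5_weights = [(1, 1), (2, 1), (3, 1)]  # nodes 43, 42 and 41
all_active = weighted_chain_shares(fig5_weights, {1, 2, 3, 4})
two_idle = weighted_chain_shares(fig5_weights, {1, 4})
```

With all four sources active each receives one quarter of the destination bandwidth. With sources 2 and 3 idle, the 3:1 weights at node 41 give source 1 three quarters and source 4 one quarter.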
  • SUMMARY
  • Aspects of the present application include a method, which may involve providing congestion avoidance and end-to-end flow control and QoS by using explicit notification messages between communicating agents for congestion notification; using the congestion notification information to adjust and regulate the transmission rates at various agents; computing the transmission rates and enforcing them at the agents, or alternatively using various types of end-to-end flow credit based flow control schemes for controlling the resource allocation to various agents.
  • Aspects of the present application include a computer readable storage medium storing instructions for executing a process. The process may involve providing congestion avoidance and end-to-end flow control and QoS by using explicit notification messages between communicating agents for congestion notification; using the congestion notification information to adjust and regulate the transmission rates at various agents; computing the transmission rates and enforcing them at the agents, or alternatively using various types of end-to-end flow credit based flow control schemes for controlling the resource allocation to various agents.
  • Aspects of the present application include a system or apparatus, which may involve providing congestion avoidance and end-to-end flow control and QoS by using explicit notification messages between communicating agents for congestion notification; using the congestion notification information to adjust and regulate the transmission rates at various agents; computing the transmission rates and enforcing them at the agents, or alternatively using various types of end-to-end flow credit based flow control schemes for controlling the resource allocation to various agents.
  • Aspects of the present application may involve a NoC which may be configured at the NoC level to provide congestion avoidance and end-to-end flow control and QoS by use of explicit notification messages between communicating agents for congestion notification; use the congestion notification information to adjust and regulate the transmission rates at various agents; compute the transmission rates and enforce them at the agents, or alternatively use various types of end-to-end flow credit based flow control schemes to control the resource allocation to various agents.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIGS. 1(a), 1(b), 1(c) and 1(d) illustrate examples of Bidirectional ring, 2D Mesh, 2D Torus, and 3D Mesh NoC Topologies.
  • FIG. 2 illustrates an example of XY routing in a related art two dimensional mesh.
  • FIG. 3 illustrates an example of a NoC interconnect.
  • FIG. 4 illustrates a NoC interconnect with routers and interconnects separated for clarity and bandwidth received by various channels.
  • FIG. 5 illustrates an example assignment of weights to various NoC channels, in accordance with an example implementation.
  • FIG. 6(a) illustrates a system with four source agents communicating with a destination agent and the weight and the transmission rates of the source agents, in accordance with an example implementation.
  • FIG. 6(b) illustrates the resulting rate computed by the rate computation module, total leftover bandwidth and the way leftover bandwidths are assigned to the source agents, in accordance with an example implementation.
  • FIG. 6(c) illustrates the system with updated transmission rates after the leftover rates were notified to the source agents and source agents updated their transmission rates, in accordance with an example implementation.
  • FIG. 7(a) illustrates a system with four source and one destination agents and end-to-end credit based flow control scheme using separate buffers at the destination for each source agent, in accordance with an example implementation.
  • FIG. 7(b) illustrates a system with four source and one destination agents and end-to-end credit based flow control scheme using a shared buffer pool at the destination for all source agents.
  • FIGS. 8(a), (b), (c) and (d) illustrate the message transmission and credit return protocol in various example implementations.
  • FIG. 9 illustrates a computer/apparatus block diagram upon which the example implementations described herein may be implemented.
  • FIG. 10 illustrates an example Network on Chip (NoC) block diagram, on which example implementations may be implemented.
  • DETAILED DESCRIPTION
  • The following detailed description provides further details of the figures and example implementations of the present application. Reference numerals and descriptions of redundant elements between figures are omitted for clarity. Terms used throughout the description are provided as examples and are not intended to be limiting. For example, the use of the term “automatic” may involve fully automatic or semi-automatic implementations involving user or administrator control over certain aspects of the implementation, depending on the desired implementation of one of ordinary skill in the art practicing implementations of the present application.
  • Example implementations of the present application involve regulating the way various agents connected to the NoC inject traffic into the NoC so that network congestion can be avoided, fairness can be provided, and QoS can be maintained. Certain implementations of such traffic injection regulation schemes may have been employed in Internet traffic management, but not in NoC interconnects with 2-D, 2.5-D or 3-D mesh or Torus topologies. There are a number of challenges in implementing injection regulation in NoC interconnects to avoid congestion. The first is delayed congestion detection. The injection rate regulation at sources can be enforced only after congestion is detected in the network, which often happens a certain delay after the moment congestion actually occurs. For example, if an explicit congestion notification is sent by the destination agent to sources or congestion is computed by the sources based on the detected round-trip latency, then there will be a round trip time delay from the moment congestion occurs to the moment source agents are notified. This time gap may cause oscillations in the network congestion and destination's bandwidth utilization.
  • By the time the sources take action, there may already be excessive congestion, which in turn may prompt the sources to overreact, which may thereby result in underutilization. The “slow start” congestion strategy, in which source agents slowly (i.e. cautiously) increase their transmission rates, has been employed in the Internet to avoid such oscillations. However, the slow start congestion strategy works only when the transmitting agents are sending steadily and their burstiness is much smaller than the round-trip latency in the network. In on-chip networks and SoCs, the agent's traffic behavior may be highly bursty, and maintaining low latency and high resource utilization may be more important. Therefore, such standard techniques may not work very well in an SoC or NoC.
  • Standard congestion control techniques that regulate the traffic injection rate may also cause unfairness in the resource bandwidth allocation among various contenders in the system, adversely affecting the QoS. Example implementations described herein are directed to solutions for 2-D, 2.5-D and 3-D NoC interconnects that avoid network congestion by regulating the injection of traffic at various source agents and provide end-to-end uniform-fair and weighted-fair allocation of destination bandwidth among the contending source agents. A number of novel traffic injection regulation designs are described. The example implementations are fully distributed and scale well with the number of agents in the NoC interconnect.
  • To control congestion and regulate the traffic injection, congestion must be detected first. The congestion detection is described below with three example methods, which can be used individually or together with each other in a system. In the first method, the destination agents which are receiving messages from various sources send an explicit notification back to the corresponding sources indicating whether they are congested or not (i.e. whether they are receiving messages at a rate higher than they can process and therefore have a backlog). If the destination is congested then the sources should back off and slow down their transmission rate. The notification may be piggybacked if the destination sends a response message back to the sources, or the notification can be a separate message sent on the same NoC or on a separate set of side band channels. Based on the congestion notification, sources regulate the traffic injection.
  • In the second method, the sources (transmitting agents) compute the level of congestion in the system interconnect by monitoring various metrics, such as the round-trip time of the request to response messages if applicable, or the amount of backpressure the source is experiencing from the network when the source attempts to inject a message to a destination. Based on this information, the sources take action locally to avoid the congestion and ensure QoS.
  • In the third method, the destination agents use buffers or allocate credit for buffer slots to various sources to control the message arrival from various sources. Buffers or credits may be allocated among all contending source agents based on the QoS policy. If a source does not have an allocated buffer or credit for the destination, it may not send to the destination. Buffers or credits for a destination can be pre-allocated among the sources or can be allocated by the destination upon receiving an explicit request from a source. This method may provide end-to-end flow control in the system, and congestion avoidance and QoS policy may be enforced with the correct allocation of buffers and credit distribution. These three example methods are described in greater detail below.
  • The first method uses explicit congestion notification messages. An explicit congestion notification message from a destination agent may contain information such as the current load at the destination agent, the acceptable load that the agent can handle from the source, etc. When the notification information from a destination agent is available at a source, the source may begin to regulate the injection rate of messages for the destination based on the notification information. The explicit notification information may include the rate at which the sources are allowed to transmit to the destination, in which case sources can regulate the traffic to the destination with this rate. In an alternative design, the notification may only indicate the congestion state at the destination to the sources communicating with the destination; the congestion state may be a bit indicating whether there is congestion at the destination or not or can also indicate the amount of congestion. Based on this information, sources may locally determine the transmission rate to the destination.
  • If the notification information contains the rate at which various sources may transmit to various destinations, the destination computes the rate at which various sources may transmit messages to the destination. To compute this rate, the destinations may use information such as the average and peak transmission rates of various source agents when communicating with the destination, and the relative weight between them, in addition to the current level of congestion at the destination.
  • Let rp(i, j) and ra(i, j) be the peak and average transmission rate of messages respectively from source agent i to destination agent j. Let w(i, j) be the weight of messages from source agent i to destination agent j. The weight decides how messages from a source are serviced.
  • For example, if two sources, a and b, transmit messages to a destination c and are transmitting faster than the destination can accept, then the ratio of the number of messages from the two sources that get serviced is the ratio of their weights, i.e. w(a, c)/w(b, c). This is referred to as fair allocation, and every source agent gets its fair share of bandwidth.
  • However, if a source is sending fewer messages than its fair share, then the remaining bandwidth may be utilized by the sources which are willing to send more than their fair-share of bandwidth. This is called the work-conserving property. A work-conserving design may ensure that fair allocation of bandwidth occurs between contending sources and the remaining leftover bandwidth (if any) is distributed between the remaining sources based on their weights to fully use the system bandwidth.
  • The example implementation can be fair as well as work-conserving. Every destination keeps track of the rate at which it is receiving messages. The destination is congested by x % if it is receiving messages at a rate x % higher than the rate at which it can process them. For all source agents that communicate with the destination, the destination tracks whether they are currently transmitting or not. If n source agents are currently transmitting, as indicated by the set {active}, then the fair rate at which the source agent s may send messages to destination d is:

  • $$\mathrm{rate}(s,d)=\frac{w(s,d)}{\sum_{k\in\{\mathrm{active}\}} w(k,d)}\times \text{receive rate of } d \qquad (1)$$
  • This rate in equation (1) is notified by the destination agent d to the sources and they limit their transmission rate to the destination to this value. Notice that some source agents may be sending at a rate lower than their fair share, in which case the destination will receive traffic from the source at a rate lower than it has allocated to the source. To detect this and maintain work-conserving property, the rate at which destination d is receiving messages from each source is tracked. This tracking may be implemented at the destination, with a counter for each source. The destination can then increment the corresponding counter whenever a message arrives from a source, and then track the rate at which various counters are increasing.
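  • As a minimal sketch of equation (1), assume the weights w(s, d) for one destination are held in a dictionary keyed by source and the set {active} is passed in explicitly. The function name `fair_rates` and the data layout are illustrative assumptions, not part of the described implementation.

```python
def fair_rates(weights, active, receive_rate):
    """Equation (1): split receive_rate among the active sources by weight.

    weights      -- hypothetical mapping of source name to w(s, d)
    active       -- sources currently transmitting to the destination
    receive_rate -- receive rate of the destination d
    """
    total_weight = sum(weights[k] for k in active)
    if total_weight == 0:
        return {}
    # Multiply before dividing to keep the arithmetic exact for round numbers.
    return {s: weights[s] * receive_rate / total_weight for s in active}


# Four hypothetical sources sharing a destination with receive rate 100.
weights = {"s1": 10, "s2": 20, "s3": 30, "s4": 40}
all_active = fair_rates(weights, {"s1", "s2", "s3", "s4"}, 100)
three_active = fair_rates(weights, {"s1", "s3", "s4"}, 100)
```

  • With all four sources active, each rate equals the weight; with s2 inactive, the same capacity is shared among the remaining three in the ratio 10:30:40.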
  • Once the receive rates from all sources are known at the destination, the destination can detect sources that are sending at a rate lower than their fair share, and the remaining bandwidth of the destination can be re-distributed among the remaining active sources. Let the sources that are sending less than their fair share of traffic be indicated by the set {slow}; then the leftover bandwidth of the destination is distributed among the agents in the set {active}−{slow} according to the following equation:

  • $$\mathrm{leftover}(s,d)=\frac{w(s,d)}{\sum_{k\in\{\mathrm{active}\}-\{\mathrm{slow}\}} w(k,d)}\times \text{leftover bandwidth of } d \qquad (2)$$
  • The adjusted rate of a source agent in the set {active}−{slow} is now its previously computed rate, rate(s, d), plus its share of the leftover bandwidth, leftover(s, d). The updated transmission rate is notified to the sources in the set {active}−{slow} with new notification messages.
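  • The redistribution step of equation (2) can be sketched as follows. The sketch assumes that a source counts as slow when its measured rate is below its currently allocated rate, and that the leftover bandwidth is the destination capacity minus the total measured rate. The function name and argument layout are hypothetical.

```python
def redistribute_leftover(weights, rates, received, capacity):
    """Equation (2): share unused capacity among active sources that are not slow.

    weights  -- w(s, d) for each active source
    rates    -- rate currently allocated to each active source
    received -- rate actually measured from each active source
    capacity -- maximum receive rate of the destination
    """
    # Assumption: "slow" means measured rate below the allocated rate.
    slow = {s for s in rates if received[s] < rates[s]}
    eligible = set(rates) - slow
    leftover = capacity - sum(received.values())
    total_weight = sum(weights[s] for s in eligible)
    new_rates = dict(rates)
    if leftover <= 0 or total_weight == 0:
        return new_rates  # nothing to hand out, or nobody to hand it to
    for s in eligible:
        new_rates[s] = rates[s] + weights[s] * leftover / total_weight
    return new_rates


# Hypothetical values: b uses only 5 of its 20, leaving 15 to share.
weights = {"a": 1, "b": 1, "c": 2}
rates = {"a": 20, "b": 20, "c": 40}
received = {"a": 20, "b": 5, "c": 40}
adjusted = redistribute_leftover(weights, rates, received, capacity=80)
```

  • Here the leftover of 15 is split 1:2 between a and c, while the allocation of the slow source b is left unchanged so that it can still ramp up to its fair share.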
  • At some point, the destination may start to receive enough traffic or the destination may detect that there are no source agents left in the set {active}−{slow}. In both cases, the maximum utilization state is achieved and no additional notification to adjust the source agents' rates needs to be sent.
  • The destination may get congested when new source agents become active and begin transmitting or existing active agents decide to increase their transmission rates. When a destination agent detects congestion by x % (message receive rate exceeds the destination processing capacity by x %), the destination agent readjusts the transmission rate of the active sources and sends new notification messages to enforce the adjusted rates at the sources so that the total receive traffic at the destination is reduced by x %. The fair share transmission rate of all active agents is re-computed based on the current transmission rate of all active source agents, and their computed fair share.
  • The source agents that are already sending at rates lower than their new fair share do not need to reduce their transmission rates, and do not need to be notified again. Only those agents which are sending more than their new fair share of bandwidth need to be notified to reduce their rates. Once the source agents are notified to reduce their rates, they will slow down and the congestion should disappear.
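  • One possible reading of this readjustment step is sketched below: the destination recomputes the fair share of every active source from the weights and its capacity, and notifies only the sources whose current rate exceeds that share. The function name is hypothetical, and capping each notified source at exactly its fair share is an assumption; the description only requires that the total receive traffic be reduced.

```python
def throttle_notifications(weights, current, capacity):
    """Return {source: new_rate} for sources sending above their new fair share.

    weights  -- w(s, d) for each source
    current  -- current transmission rate of each active source
    capacity -- processing capacity of the destination
    """
    total_weight = sum(weights[s] for s in current)
    notifications = {}
    for s, rate in current.items():
        fair = weights[s] * capacity / total_weight
        if rate > fair:
            # Only over-senders are told to slow down; others are left alone.
            notifications[s] = fair
    return notifications


# Hypothetical state: s2 has just become active and pushed the total to 120.
weights = {"s1": 10, "s2": 20, "s3": 30, "s4": 40}
current = {"s1": 22.5, "s2": 20, "s3": 67.5, "s4": 10}
to_notify = throttle_notifications(weights, current, capacity=100)
```

  • In this hypothetical state only s1 and s3 are notified. Any bandwidth left unused afterwards could be handed back through the leftover redistribution of equation (2).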
  • Since no agents are active right after reset or in the beginning, the source agents need to be provided with a rate at which they begin transmitting when they become active. An example implementation may use a fixed initial rate at which any given inactive agent begins transmitting, or may begin at their fair share of rate when all agents are active (this value will be proportional to the weight of the agents and the cumulative rate will be equal to the destination's receive capacity). When an agent becomes active again after being inactive for some time, an example implementation may use the initial rates, or the last transmission rate of the agent when it was last active. Other initial rate values which are either higher or lower than the fair share rates may be used to improve the utilization of the system when agents go from inactive to active states.
  • Consider the following example that illustrates this type of explicit notification based rate regulation. In FIG. 6(a), there is a destination d to which four source agents, s1, s2, s3 and s4 are transmitting messages. The weight of the four sources to destination d, are 10, 20, 30 and 40 respectively. Destination d's maximum receive rate is 100. In this example, assume that the initial rate assigned to the sources is 10, 20, 30, and 40 respectively which is the fair share of the rates when all four agents are active. In the beginning only three agents, s1, s3 and s4 are active. Their initial transmit rates are 10, 30 and 40, however, in this example the agent s4 is transmitting only at rate of 10. Thus the total receive rate at the destination is 50.
  • In FIG. 6(b), the receive rate computation module at the destination tracks the receive rate from each source and computes the leftover bandwidth (603) as the maximum receive rate minus the total receive rate, here 100 − 50 = 50. Once the module detects this leftover bandwidth of 50 in the system, the destination will begin allocating additional bandwidth to the active agents. The agents s1 and s3 are active (600 and 601) while agent s4 is active but slow (602), i.e. it is transmitting less than its fair share. The leftover bandwidth can be divided among the active but not slow agents, which are s1 and s3.
  • The resulting leftover bandwidth allocation to these two agents is shown in 604. The leftover bandwidth is allocated based on the weight of the two agents. Subsequently, the leftover bandwidth is added to the current transmit rate of the agents and they are notified of the new transmit rates, 22.5 and 67.5 respectively. The resulting state of the system after the source agents s1 and s3 update their transmit rates is shown in FIG. 6(c).
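  • The arithmetic behind the rates in this example follows directly from equation (2), using the weights 10 and 30 of s1 and s3 and the measured rates 10, 30 and 10 of s1, s3 and s4:

```latex
\begin{align*}
\text{leftover bandwidth of } d &= 100 - (10 + 30 + 10) = 50 \\
\mathrm{leftover}(s_1,d) &= \frac{10}{10 + 30} \times 50 = 12.5 \\
\mathrm{leftover}(s_3,d) &= \frac{30}{10 + 30} \times 50 = 37.5 \\
\text{new rate of } s_1 &= 10 + 12.5 = 22.5 \\
\text{new rate of } s_3 &= 30 + 37.5 = 67.5 \\
\text{total receive rate} &= 22.5 + 67.5 + 10 = 100
\end{align*}
```

  • The total after the update equals the destination's maximum receive rate of 100, so the destination is fully utilized without being congested.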
  • In this example, the source agents store the notified rate from the destinations for rate regulation. When there are multiple destinations with which a source agent communicates, the notified rate value is stored for each destination and the source ensures that the transmission rate to each destination meets the rate. A number of alternative designs are possible. An example implementation may use resources or buffers at each destination to accept messages which can be partitioned statically or dynamically based on the source agent's activity and weights.
  • The destination may notify the sources about the number of available buffers it has for the source and the source agent therefore does not send more messages than the available buffers. The buffer allocation can be made dynamic and the notification of buffer availability may be piggybacked on existing sets of messages in the system. An example implementation may also utilize the source agent's average and peak transmission rates to determine the transmission rate by the source agents to the destination. Another example implementation may use a mechanism where the destination agents send the delta or difference between the expected transmission rate of the source agents and their current transmission rates instead of sending transmission rate values. Using this delta, the sources may adjust their transmission rates. Destinations may continue to send the delta notifications until the system is stabilized and the destinations are no longer congested.
  • In an alternative example implementation, the notification from destination agents to source agents may only convey to the sources that there is congestion at the destination and optionally by how much, and not the rate at which the sources should transmit or adjust their rates to avoid congestion. In this example, the source agents are responsible for computing the rate at which they may transmit to various destinations. In such a system, the source agents may start with a fixed transmission rate once they become active, which can be determined based on the QoS policy. Subsequently when destination agents receive messages, the destination agents notify the active sources about whether the destination is currently congested or not, and optionally the amount of congestion. This notification is broadcast to all active sources which are currently communicating with the destination. The notification messages may be transmitted whenever the level of congestion at the destination changes, or the set of active agents changes (new agents become active and start talking to the destination, or an active agent stops transmitting to the destination).
  • When a notification message arrives at a source, the source reacts to the message by updating its transmission rate. If the destination is not congested, then source agents may increase their rates by a fixed value or a fixed multiple of the current transmit rate. They can continue increasing the rates every time they receive a notification that indicates that there is no congestion at the destination. Once a congestion notification message arrives, source agents may thereby reduce their transmission rate. If the level of congestion is not known then the source agents may reduce the transmission rate with a fixed value or a fixed multiple, or the rate reduction can be proportional to the level of congestion at the destination.
  • In this example where congestion notification only indicates whether there is congestion or not without the actual amount of congestion, there may be oscillations in the congestion level at the destination depending on the source actions. A number of optimizations may be used to avoid the oscillations. For example, the rate at which the transmit rate of various sources are increased may be reduced upon each non-congestion notification to allow smooth convergence to a steady state rate. A standard proportional integral (PI) or proportional integral derivative (PID) controller based mechanism may also be used to adjust the rates. The Proportional gain, Integral gain, and the Derivative gain tuning parameters can be chosen based on the system parameters such as number of agents, burstiness in traffic, etc.
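  • A source-side reaction to such binary notifications, including the damped increase mentioned above, might be sketched as follows. The class name, the parameter names and the default values are illustrative assumptions; a PI or PID controller could replace this logic.

```python
class SourceRateController:
    """Adjusts one source's rate to one destination from binary notifications."""

    def __init__(self, initial_rate, step, damping=0.5, backoff=0.5, min_rate=1.0):
        self.rate = initial_rate
        self.step = step          # increase applied on a "not congested" notification
        self.damping = damping    # factor that shrinks the step after each increase
        self.backoff = backoff    # fixed multiple applied on a "congested" notification
        self.min_rate = min_rate  # floor so the source never stops completely

    def on_notification(self, congested):
        if congested:
            self.rate = max(self.min_rate, self.rate * self.backoff)
        else:
            self.rate += self.step
            # Shrinking the step lets the rate converge smoothly instead of oscillating.
            self.step *= self.damping
        return self.rate


ctrl = SourceRateController(initial_rate=10.0, step=8.0)
history = [ctrl.on_notification(c) for c in (False, False, True)]
```

  • In this run the rate climbs by 8 and then by 4 on the two non-congestion notifications, and is halved when the congestion notification arrives.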
  • Another concern with such a design may be the unfairness of the destination's bandwidth allocation among various contending source agents. Since the source agents are reacting locally and independently, the final rate at which they stabilize may depend upon when they became active and how many notification messages they have received so far. To address this, an example implementation may reset the transmit rates of all source agents to various destination agents to the initial configured value periodically and then repeat the notification protocol until the rates stabilize again. Since the notification messages are sent to all source agents, in this round if no new sources were activated during the stabilization period, the resulting rates will be fair.
  • The frequency at which notification messages are sent may vary in various implementations. In a design where the destination agents respond back with a response message for an arriving message, the notification may be piggybacked on the response and therefore can be sent for every arriving message. When such response messages are not present, or when the notification cannot be piggybacked on the response messages, the notification may be sent less frequently. In an example implementation, the notification message may be sent by the destination agent whenever new agents begin to communicate with the destination, or an active agent stops communicating with the destination. Additionally, the notification message may be sent when the level of congestion at the destination changes in a way that requires the transmission rates at the sources to be updated.
  • The second method of congestion avoidance and QoS via injection regulation involves computation of the transmission rate by the source agents without any explicit notification from the destinations. To compute the transmission rate, the sources determine the level of congestion at the destinations with which they are communicating. The congestion level may be determined by monitoring for observable metrics such as round-trip time of request messages to response messages, when response messages are expected for the request messages, or the amount of backpressure the source agent is experiencing from the network when the source agent attempts to inject a message for a given destination. The backpressure is the flow control signaling from the network to an agent if the agent is no longer allowed to send any more data into the network due to congestion in the network. Since the network is reliable, when a destination becomes congested, the congestion propagates into the network and finally appears at the source agent's interface to the network. Based on the amount of backpressure from the network to an agent (e.g., number of queued messages, etc.) which indicates the congestion level, source agents can take action locally to avoid congestion and also ensure QoS.
  • Source agents may regulate the transmission rates in a number of ways. In an example implementation where the round-trip time from request messages to response messages is used by source agents as the metric of congestion, the source agents may use multiple sets of local registers, one for each destination, to track the round-trip latencies and current transmission rates to the destinations. Sources may start transmission at a fixed rate which may be decided based on the QoS policy, relative weight of various traffic flows from sources to the destination agents, and the maximum bandwidth of the destination agents.
  • If the observed round-trip time to a destination is comparable to the latency in an uncongested system, then the source agent may decide to increase the transmission rate to the destination. The rate of increase may be linear additive, i.e. after each round-trip time of latency observation the transmission rate is increased by a fixed value as long as the round-trip time does not indicate congestion. The fixed value may be determined based on how quickly the source agents want to achieve full bandwidth utilization; choosing a high rate of increase will enable source agents to reach high bandwidth quickly and increase the system utilization, however it may also lead to oscillations in the system congestion.
  • When congestion occurs in the system, the round-trip time will begin to grow; when source agents observe this, they may back off and reduce the transmission rates. The rate at which the transmission rate is reduced may be multiplicative, i.e. the rate will be reduced by a fixed multiple (new rate=current rate×k, k<1). This method of additive increase but multiplicative decrease may help in avoiding oscillations and ensure better fairness and QoS between various source agents contending for a destination. A standard proportional integral (PI) or proportional integral derivative (PID) controller based mechanism may also be used to adjust the rates based on the observed congestion.
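  • The additive increase and multiplicative decrease rule described above can be sketched as a single update applied once per observed round trip. The function name, the `tolerance` threshold used to decide whether a round-trip time is "comparable" to the uncongested latency, and the default constants are assumptions made for illustration.

```python
def aimd_update(rate, rtt, base_rtt, increase=1.0, k=0.5, tolerance=1.25):
    """One additive-increase / multiplicative-decrease step.

    rate      -- current transmission rate to the destination
    rtt       -- round-trip time just observed
    base_rtt  -- round-trip latency of the uncongested system
    increase  -- fixed value added when no congestion is observed
    k         -- fixed multiple (k < 1) applied when congestion is observed
    tolerance -- hypothetical threshold: rtt above tolerance * base_rtt means congestion
    """
    if rtt <= tolerance * base_rtt:
        return rate + increase  # additive increase
    return rate * k             # multiplicative decrease


# Three uncongested round trips followed by one congested round trip.
rate = 10.0
for rtt in (20, 21, 22, 40):
    rate = aimd_update(rate, rtt, base_rtt=20)
```

  • A source would keep one such rate per destination, for example in the per-destination registers mentioned earlier.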
  • In designs where round-trip latency may not be used as congestion feedback, the source agents can simply track the backpressure signal at their outgoing interfaces to infer the level of congestion in the system. The more frequently an outgoing interface experiences backpressure, the more congestion there is at the set of destinations for the transmitted messages of the interface. As the congestion level is determined at an interface, the transmission rates to the set of corresponding destination agents can be regulated based on the previously described additive increase and multiplicative decrease or PI and PID controller scheme.
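  • As a rough sketch of how the frequency of backpressure might be turned into a congestion level, the monitor below keeps a sliding window of recent injection attempts at one outgoing interface and reports the fraction that were stalled. The class name, the window length and the use of a simple fraction are assumptions; a hardware design would more likely use saturating counters.

```python
from collections import deque


class BackpressureMonitor:
    """Estimates congestion as the fraction of recent attempts that were stalled."""

    def __init__(self, window=64):
        # Only the most recent `window` samples are kept.
        self.samples = deque(maxlen=window)

    def record(self, stalled):
        """Record one injection attempt; stalled is True if backpressure was asserted."""
        self.samples.append(1 if stalled else 0)

    def congestion_level(self):
        """Return a value between 0.0 (never stalled) and 1.0 (always stalled)."""
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)


monitor = BackpressureMonitor(window=4)
for stalled in (True, False, True, True):
    monitor.record(stalled)
level_before = monitor.congestion_level()
for stalled in (False, False):
    monitor.record(stalled)
level_after = monitor.congestion_level()
```

  • The resulting level could then drive the same increase and decrease rules that are used with round-trip time feedback.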
  • In the third method of congestion avoidance and QoS via injection regulation, end to end credit based flow control is used between all source and destination agents. The destination agents use buffers to receive arriving messages and control the message arrival by providing credits to the source agents. A credit corresponds to an empty buffer slot at the destination allocated for a message from the source agent. If a source does not have an allocated buffer or credit for a destination, it cannot send messages to the destination, and must acquire credit first. Buffer allocation and credit distribution to various source agents are performed at each destination agent based on the QoS policy. This method provides end-to-end flow control in the system and congestion avoidance and QoS policy can be ensured with the correct allocation of buffers, distribution of credit and processing of the arriving messages at the destination. Buffers or credits at a destination can be pre-allocated among the source agents communicating with the destination or can be dynamically allocated upon receiving an explicit request from the sources. These two example implementations are described next.
  • In the first example implementation of end-to-end credit based flow control, every destination agent has separate buffer pools for arriving messages from every source agent. This is illustrated in FIG. 7(a). There are four source agents s1, s2, s3, s4 communicating with a destination agent d. At the destination, there are four buffer pools, one for each source agent. The arriving messages 1, 2, 3, and 4, from the source agents are stored in the corresponding buffer pool in a First in First out order (FIFO). Certain designs may store the arriving messages in non-FIFO order depending on the priority of various messages or certain isolation requirements such as one type of messages cannot block the other types.
  • Source agents must acquire credit before sending a message to the destination. In this case, since separate buffer pools are available for each source, each source can begin with a credit value equal to the number of slots in its buffer pool at the destination. Alternatively sources may begin with zero credit, and the destinations distribute credits to the sources after reset based on the number of free slots in the buffer pool. Once a source agent has a credit for the destination d, it can send a message and decrement the credit value. The source agent can continue sending as long as it has positive credit left. At the destination, an arriving message is guaranteed to be accepted eventually as it will always have a buffer slot available for it.
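  • The source-side credit accounting described above can be sketched with a simple counter per destination. The class and method names are hypothetical, and the sketch models only the counter, not the messages themselves.

```python
class CreditedSource:
    """Tracks the credits one source holds for one destination."""

    def __init__(self, credits):
        # Initial credits: the slots in this source's buffer pool, or zero
        # if the destination distributes credits after reset.
        self.credits = credits

    def try_send(self):
        """Consume one credit and report whether a message may be sent."""
        if self.credits == 0:
            return False  # no buffer slot reserved at the destination
        self.credits -= 1
        return True

    def on_credit_return(self, count=1):
        """Called when the destination frees buffer slots and returns credits."""
        self.credits += count


src = CreditedSource(credits=2)
sent = [src.try_send() for _ in range(3)]  # third attempt has no credit left
src.on_credit_return()
sent.append(src.try_send())
```

  • Because a message is sent only while a credit is held, every message that reaches the destination finds a free slot in its buffer pool.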
  • At the destination, the arriving messages stored in the buffers are read (700) for further processing at the destination agent. The mechanism to read the messages from various buffer pools is based on the system QoS policy. Consider a QoS policy which assigns weights w1, w2, w3 and w4 to the four source agents. In this case if the destination agent is congested, i.e. messages are arriving at a rate higher than it can process, then the number of messages read and processed from each buffer can be made proportional to the weight of the source that writes into the buffer. Say the weights of the four buffers are 1, 2, 3 and 4, respectively. In this case, in every 10 messages that are read, one should be from the first buffer, two should be from the second buffer, three should be from the third buffer and four should be from the fourth buffer, providing weighted fair allocation of bandwidth to each source agent.
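  • A minimal weighted round robin read mechanism for this example is sketched below, assuming one FIFO per source and integer weights. The function name and the (source, sequence) message labels are hypothetical, and this simple form serves each buffer in a burst of up to its weight per round.

```python
from collections import deque


def weighted_round_robin(buffers, weights, count):
    """Read up to count messages, taking at most weight messages per buffer per round."""
    served = []
    while len(served) < count and any(buffers):
        for buf, weight in zip(buffers, weights):
            for _ in range(weight):
                if not buf or len(served) == count:
                    break  # buffer drained or read budget exhausted
                served.append(buf.popleft())
    return served


# Four backlogged FIFOs, one per source; messages are (source index, sequence number).
buffers = [deque((i, n) for n in range(10)) for i in range(4)]
served = weighted_round_robin(buffers, [1, 2, 3, 4], 10)
per_source = [sum(1 for src, _ in served if src == i) for i in range(4)]
```

  • With all four buffers backlogged, the ten messages read contain one, two, three and four messages from the respective sources. If a buffer is empty its turn is skipped, which keeps the reader work-conserving.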
  • A slightly approximate implementation may provide fairness over larger periods of time, allowing some unfairness during short time periods. Standard Weighted Round Robin (WRR), Deficit Round Robin (DRR), or Weighted Fair Queuing (WFQ) based designs may be used to implement the read mechanism. The QoS policy may also assign different priorities to the source agents, and the priority may be strict, i.e. if there is a message of higher priority waiting, then it has to be processed before all messages of lower priority. Between messages of the same priority value, equal or weighted fairness may be needed. In this case, a combination of weighted arbitration and strict priority arbitration may be implemented.
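A combined arbiter of this kind might look like the following sketch. Strict priority is applied first; within the winning priority level, a simple "least served per unit weight" rule stands in for WRR, DRR or WFQ. That rule, the class name `Arbiter`, and the example sources are assumptions, not something the text prescribes.

```python
from collections import deque


class Arbiter:
    """Strict priority across levels, weighted sharing within a level (illustrative)."""

    def __init__(self):
        self.sources = {}

    def add_source(self, name, priority, weight):
        self.sources[name] = {"priority": priority, "weight": weight,
                              "queue": deque(), "served": 0}

    def enqueue(self, name, message):
        self.sources[name]["queue"].append(message)

    def read(self):
        """Return the next message to process, or None if all buffers are empty."""
        pending = [(n, s) for n, s in self.sources.items() if s["queue"]]
        if not pending:
            return None
        # Strict priority: only the highest waiting priority level is eligible.
        top = max(s["priority"] for _, s in pending)
        candidates = [(n, s) for n, s in pending if s["priority"] == top]
        # Weighted sharing (assumed rule): pick the source that has been served
        # least relative to its weight; ties are broken by name.
        _, src = min(candidates,
                     key=lambda c: (c[1]["served"] / c[1]["weight"], c[0]))
        src["served"] += 1
        return src["queue"].popleft()


arb = Arbiter()
arb.add_source("hi", priority=1, weight=1)
arb.add_source("a", priority=0, weight=1)
arb.add_source("b", priority=0, weight=2)
for k in range(2):
    arb.enqueue("hi", f"hi{k}")
for k in range(6):
    arb.enqueue("a", f"a{k}")
    arb.enqueue("b", f"b{k}")

reads = [arb.read() for _ in range(8)]
```

In this run the two high-priority messages are read first, and the remaining six reads are split 2:4 between `a` and `b`, following their 1:2 weights.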
  • Once a message is read and removed from the buffer, the corresponding source agent can send a new message to the destination. Therefore, the destination agent can send a credit back to the source agents each time a message is read from the buffer. The credit can be sent as a separate message to the source agent, or can be piggybacked on an existing message that is being sent to the source agent. If the arriving message at the destination is going to generate a response message back to the source then it may be efficient to piggyback the credit on the response message. The resulting protocol of message transmission and credit return is illustrated in FIG. 8(a).
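The credit return step, including the piggyback option, can be sketched as follows. The message dictionaries, the class name `Destination`, and the `needs_response` flag are hypothetical; they only illustrate that each read frees one slot and sends exactly one credit back to the source.

```python
from collections import deque


class Destination:
    """Destination with one dedicated pool per source (illustrative sketch)."""

    def __init__(self, sources, slots_per_source):
        self.slots = slots_per_source
        self.pools = {s: deque() for s in sources}

    def accept(self, source, message):
        pool = self.pools[source]
        # A sender that respects its credits can never overflow its own pool.
        if len(pool) >= self.slots:
            raise RuntimeError("sender exceeded its credits")
        pool.append(message)

    def read(self, source, needs_response):
        """Remove one message and build the outgoing message that carries the credit."""
        message = self.pools[source].popleft()
        if needs_response:
            # Piggyback the credit on the response that is sent anyway.
            return {"to": source, "type": "response",
                    "payload": f"ack:{message}", "credits": 1}
        # Otherwise return the credit as a separate message.
        return {"to": source, "type": "credit", "payload": None, "credits": 1}


d = Destination(["s1", "s2"], slots_per_source=2)
d.accept("s1", "read-request")
d.accept("s1", "posted-write")
out1 = d.read("s1", needs_response=True)    # credit rides on the response
out2 = d.read("s1", needs_response=False)   # credit sent on its own
```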
  • In the first example implementation of end-to-end credit based flow control, there is a separate buffer pool for every source agent; therefore the total number of buffers needed at a destination may be proportional to the number of source agents talking to the destination. In a fully connected system where all agents talk to all other agents, the total number of buffers in the entire system may be O(n^2) for n agents, since each destination may need n−1 buffer pools, one for each source. To maintain high performance, the number of slots in the buffer pool for each source agent may need to be proportional to the round-trip latency between the source and the destination, and to the maximum message rate from the source to the destination. If a ring topology interconnect is used, the round-trip latency may be O(n), and therefore the total number of buffer slots in the system may be O(n^3). If a mesh or torus topology interconnect is used, the round-trip time may be O(n^(1/2)), in which case the total number of buffer slots in the system may be O(n^(5/2)). Clearly, in this example, the number of buffer slots may grow to become excessive as a fully connected system scales in number of agents.
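The scaling argument can be written out explicitly. The symbols below are introduced only for this illustration: S is the number of slots per pool, T_rtt the round-trip latency, and r_max the maximum message rate, which is treated as independent of n.

```latex
\begin{aligned}
B_{\text{total}} &= \underbrace{n\,(n-1)}_{\text{buffer pools}} \cdot S,
  \qquad S \propto T_{\text{rtt}} \cdot r_{\max} \\[4pt]
\text{ring:}\quad T_{\text{rtt}} = O(n)
  &\;\Rightarrow\; B_{\text{total}} = O(n^{2}) \cdot O(n) = O(n^{3}) \\[4pt]
\text{mesh or torus:}\quad T_{\text{rtt}} = O(n^{1/2})
  &\;\Rightarrow\; B_{\text{total}} = O(n^{2}) \cdot O(n^{1/2}) = O(n^{5/2})
\end{aligned}
```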
  • To reduce the number of buffer slots, a second example implementation of the end-to-end flow control method may use a shared pool of buffers at the destinations to store the arriving messages from various sources, instead of separate buffers for each source agent. An example is illustrated in FIG. 7(b). The buffer slots from the shared pool may be dynamically allocated to the requesting source agents based on their need and on the QoS policy. Thus the source agents begin with no credit. To send a message, a source agent sends a credit request message to the destination. If the destination has a buffer slot available, it reserves the slot for the request and responds with a credit to the source. The source can then consume the credit and send the message. The resulting protocol of message transmission and credit return is illustrated in FIG. 8(b). To reduce latency, it is possible for sources to acquire a few credits ahead of time. In such designs, there may be deadlock if the destination runs out of buffer slots and new credits while sources hold credits that they are not using. To avoid this, the source agents return unused credits to the destination if they are not used within a certain timeout interval.
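The request, grant and timeout behavior described above can be sketched as follows. The class names, the abstract integer time values, and the timeout of 5 time units are assumptions chosen for the example; a real design would also apply the QoS policy when deciding which request to grant.

```python
class SharedPoolDestination:
    """Destination with a single shared pool of buffer slots (illustrative)."""

    def __init__(self, total_slots):
        self.free_slots = total_slots

    def request_credit(self):
        """Reserve a slot and grant a credit, or refuse when the pool is exhausted."""
        if self.free_slots == 0:
            return False
        self.free_slots -= 1
        return True

    def release_slot(self):
        """A slot frees up when its message is processed or its credit comes back unused."""
        self.free_slots += 1


class PrefetchingSource:
    """Source that may acquire credits ahead of time and returns stale ones."""

    def __init__(self, destination, timeout):
        self.destination = destination
        self.timeout = timeout
        self.credit_times = []   # acquisition time of each unused credit

    def acquire(self, now):
        if self.destination.request_credit():
            self.credit_times.append(now)
            return True
        return False

    def send(self):
        """Consume the oldest credit; the message then occupies the reserved slot."""
        if not self.credit_times:
            return False
        self.credit_times.pop(0)
        return True

    def expire(self, now):
        """Return credits that have been held unused for at least `timeout` time units."""
        keep = [t for t in self.credit_times if now - t < self.timeout]
        returned = len(self.credit_times) - len(keep)
        for _ in range(returned):
            self.destination.release_slot()
        self.credit_times = keep
        return returned


dest = SharedPoolDestination(total_slots=2)
hoarder = PrefetchingSource(dest, timeout=5)
other = PrefetchingSource(dest, timeout=5)

hoarder.acquire(now=0)
hoarder.acquire(now=0)             # hoarder now holds every slot without sending
blocked = other.acquire(now=1)     # refused: the pool is exhausted
returned = hoarder.expire(now=5)   # timeout reached, unused credits go back
unblocked = other.acquire(now=6)   # the other source can make progress again
```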
  • In another example, source agents can go ahead and send a message to the destination without acquiring a credit from the destination. The destination agents may choose to accept an arriving message if there are available resources to accept and process the message or it may decide to discard it. In case the destination discards an arriving message from a source, the destination notifies the source so that the source agent can re-transmit this message. In order to avoid multiple retransmissions and discards of the same message, an example implementation may restrain the sources to always acquire a credit before re-sending a message that was earlier discarded by the destination.
  • The resulting protocol of message transmission and credit return is illustrated in FIG. 8(c). An additional optimization may spare a source from sending an explicit credit request to a destination for a previously discarded message. Assuming that source agents always resend discarded messages later, the destination can register all discards and send credits to the affected source agents later, once resources and buffer slots are available at the destination for the source. Once the credit arrives at the source, the source agent may re-send the discarded message, which is guaranteed to be accepted at the destination this time. The resulting protocol of message transmission and credit return is illustrated in FIG. 8(d).
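The discard-and-deferred-credit behavior can be sketched as below. The class name `Destination`, the `credited` flag, and the first-come-first-served order for owed credits are assumptions for illustration; the text leaves the order in which credits are issued to the QoS policy.

```python
from collections import deque


class Destination:
    """Accepts uncredited messages when possible and registers discards (illustrative)."""

    def __init__(self, slots):
        self.free_slots = slots
        self.buffer = deque()
        self.owed = deque()   # sources whose messages were discarded, oldest first

    def receive(self, source, message, credited=False):
        """Accept the message if possible, else discard it and remember the source."""
        if credited:
            # The slot was already reserved when the credit was issued.
            self.buffer.append((source, message))
            return "accepted"
        if self.free_slots > 0:
            self.free_slots -= 1
            self.buffer.append((source, message))
            return "accepted"
        self.owed.append(source)   # register the discard; no explicit request needed
        return "discarded"

    def process_one(self):
        """Process one message; return the source that gets the freed slot's credit, if any."""
        self.buffer.popleft()
        if self.owed:
            return self.owed.popleft()   # slot stays reserved for this source
        self.free_slots += 1
        return None


d = Destination(slots=1)
first = d.receive("s1", "a")                   # fits in the only slot
second = d.receive("s2", "b")                  # no room: discarded and registered
credit_to = d.process_one()                    # freed slot is reserved for s2
retry = d.receive("s2", "b", credited=True)    # the re-sent message must be accepted
```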
  • Re-transmission of messages may affect the ordering of message delivery, so the source and destination agents should ensure that out-of-order delivery of messages is either acceptable or is resolved correctly.
  • When arriving messages are processed by the destination, the buffer slot is freed up, and can be used for a newly arriving message or can be allocated for a source agent that requested a credit previously or had a message discarded.
  • A hybrid implementation of the two end-to-end flow control schemes may use both a set of separate buffer pools, one for each source agent, and a dynamically allocated buffer pool shared among all sources. In this case, each source agent will track two types of credits for each destination: one for the dedicated buffer pool it has for itself at the destination, and the other for the buffer slots it requests and is allocated dynamically at the destination. A source agent, based on its design, the types of messages it is sending, and the QoS policy, may use the two types of credits for the different types of messages being sent.
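A source that tracks both credit types might be sketched as follows. The mapping of message kinds to credit types is purely hypothetical (here "urgent" messages prefer dedicated credits and "bulk" messages use only shared credits); the text leaves that choice to the agent design and the QoS policy.

```python
class HybridSource:
    """Tracks dedicated and dynamically granted credits for one destination (illustrative)."""

    def __init__(self, dedicated_slots):
        self.dedicated = dedicated_slots   # pre-allocated pool at the destination
        self.shared = 0                    # credits granted from the shared pool

    def grant_shared(self, count=1):
        """Record credits the destination granted from its shared pool."""
        self.shared += count

    def send(self, kind):
        """Return the credit type consumed, or None if the message must wait."""
        # Hypothetical policy: urgent traffic uses the dedicated pool first and
        # falls back to shared credits; bulk traffic uses shared credits only.
        if kind == "urgent" and self.dedicated > 0:
            self.dedicated -= 1
            return "dedicated"
        if self.shared > 0:
            self.shared -= 1
            return "shared"
        return None


src = HybridSource(dedicated_slots=1)
bulk_first = src.send("bulk")       # no shared credit yet, so the message waits
src.grant_shared()
bulk_second = src.send("bulk")      # uses the granted shared credit
urgent_first = src.send("urgent")   # uses the dedicated credit
urgent_second = src.send("urgent")  # both credit types exhausted
```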
  • A number of alternative example implementations are possible within the context of the previously described end-to-end flow control schemes, in which the buffer allocation and credit distribution at the destination may be performed in various ways depending upon the latency between the destination and source agents, the topology of the NoC interconnect, the bandwidth of various NoC channels and the transmission and receive capability of the agents. One may also combine the end-to-end credit flow control schemes with the feedback based congestion notification schemes to avoid congestion more effectively and provide end-to-end QoS more efficiently.
  • FIG. 9 illustrates an example computer system 900 on which example designs may be implemented. The computer system 900 includes an apparatus 905 which may involve an I/O unit 935, storage 960, and a processor 910 operable to execute one or more units as known to one of skill in the art. The term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor 910 for execution, which may come in the form of computer-readable storage mediums, such as, but not limited to optical disks, magnetic disks, read-only memories, random access memories, solid state devices and drives, or any other types of tangible media suitable for storing electronic information, or computer-readable signal mediums, which can include transitory media such as carrier waves. The I/O unit processes input from user interfaces 940 and operator interfaces 945 which may utilize input devices such as a keyboard, mouse, touch device, or verbal command.
  • The apparatus 905 may also be connected to an external storage 950, which can contain removable storage such as a portable hard drive, optical media (CD or DVD), disk media or any other medium from which a computer can read executable code. The apparatus may also be connected to an output device 955, such as a display, to output data and other information to a user, as well as to request additional information from a user. The connections from the apparatus 905 to the user interface 940, the operator interface 945, the external storage 950, and the output device 955 may be via wireless protocols, such as the 802.11 standards, Bluetooth® or cellular protocols, or via physical transmission media, such as cables or fiber optics. The output device 955 may therefore further act as an input device for interacting with a user.
  • The processor 910 may execute one or more modules. The congestion detection module 911 may be configured to determine the congestion level in the network based on various performance metrics, such as the round trip latency or the amount of backpressure observed by the source agents, or the receive rate of messages at a destination agent. The rate computation module 912 present at a NoC node may compute the rate at which an agent may transmit data into the network. Source agents may compute the rates by observing congestion level metrics such as round trip time or amount of backpressure, or destination agents may compute the rate at which various sources may send to them based on the rate at which they are currently receiving messages and the rate at which they can process them. The QoS enforcement module 913 may be configured to dynamically adjust the transmission rates at various source agents so that the end-to-end QoS specification is satisfied.
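As one possible reading of the destination-side rate computation, the sketch below leaves sources unthrottled while the destination can keep up and otherwise splits the processing rate in proportion to the QoS weights. The function name `allowed_rates`, the proportional split, and the numbers are assumptions; the text does not specify a formula. The split is deliberately simple and may grant a source more than it currently sends.

```python
def allowed_rates(arrival_rates, weights, service_rate):
    """Return per-source send rates, throttling by weight only when congested."""
    total_arrival = sum(arrival_rates.values())
    if total_arrival <= service_rate:
        return dict(arrival_rates)          # not congested: no throttling needed
    total_weight = sum(weights[s] for s in arrival_rates)
    # Congested: share the processing rate in proportion to the QoS weights.
    return {s: service_rate * weights[s] / total_weight for s in arrival_rates}


weights = {"s1": 1, "s2": 2, "s3": 3, "s4": 4}

# Each source offers 4 messages per unit time, but only 10 can be processed.
congested = allowed_rates({s: 4 for s in weights}, weights, service_rate=10)

# Each source offers 1 message per unit time, well within capacity.
relaxed = allowed_rates({s: 1 for s in weights}, weights, service_rate=10)
```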
  • The various modules and the processor, singly or in combination, may be configured to perform certain operations. Such operations can include to receive, at one of a first node and a second node in the NoC, an instruction based on at least one of a command signal from the other of the first node and the second node, a computed level of congestion based on a QoS metric indicative of traffic congestion, and an end-to-end flow control buffer allocation result; and to determine, at the one of the first node and the second node, an allocation of traffic bandwidth based on a result of the instruction. As described above, the QoS metric can include at least one of a round trip time of a request message to a response message, and backpressure experienced by the one of the first node and the second node. The command signal can be in the form of a bit signal and/or a notification message indicative of congestion. Further operations can include to determine an allocation of traffic bandwidth by a computation of a transmission rate at the one of the first node and the second node and an allocation of the traffic bandwidth based on the computed transmission rate, and/or by an issuance of a buffer allocation to the one of the first node and the second node from the other of the first node and the second node based on the result of the instruction, as described in the example implementations above.
  • FIG. 10 illustrates an example Network on Chip (NoC) hardware block diagram 1000, on which example implementations may be implemented. The NoC 1010 may include a plurality of routers and hosts that are connected by interconnects, as illustrated and described in FIGS. 1-6. The NoC 1010 can be implemented on a chip 1015, which may be in the form of an integrated circuit, such as a System on Chip (SoC), Very-Large-Scale-Integration (VLSI) device or other hardware configurations, depending on the desired implementation. In an example configuration, the NoC 1010 is configured to handle all of the functions as described in the example implementations above at the NoC level, or can be operated on with a processor.
  • Chip 1015 may also include an I/O unit 1035 for facilitating communications between the chip 1015 and a computer system implementing the chip 1015 via a computer bus interface 1045 and external storage 1050. Chip 1015 may also include Random Access Memory (RAM) 1060 and processor 1015. Processor 1015 may store and execute the congestion detection module 911, the rate computation module 912, and the QoS enforcement module 913 as described above. Additionally, the modules in the processor 1015 can be stored and executed within the nodes of the NoC 1010 itself at the NoC level.
  • Furthermore, some portions of the detailed description are presented in terms of algorithms and symbolic representations of operations within a computer. These algorithmic descriptions and symbolic representations are the means used by those skilled in the data processing arts to most effectively convey the essence of their innovations to others skilled in the art. An algorithm is a series of defined steps leading to a desired end state or result. In the example implementations, the steps carried out require physical manipulations of tangible quantities for achieving a tangible result.
  • Moreover, other implementations of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the example implementations disclosed herein. Various aspects and/or components of the described example implementations may be used singly or in any combination. It is intended that the specification and examples be considered as examples, with a true scope and spirit of the application being indicated by the following claims.

Claims (18)

What is claimed is:
1. A method for a Network on Chip (NoC) comprising a first node and a second node, the method comprising:
receiving, at one of the first node and the second node, an instruction from the other of the first node and the second node based on an end-to-end flow control buffer allocation result, the instruction comprising at least one of a buffer slot credit allocation and a buffer allocation for the one of the first node and the second node; and
determining, at the one of the first node and the second node, an allocation of traffic bandwidth based on the instruction.
2. The method of claim 1, wherein the at least one of the buffer slot credit allocation and the buffer allocation is determined from a Quality of Service (QoS) policy for the NoC.
3. The method of claim 1, wherein the at least one of the buffer slot credit allocation and the buffer allocation is determined from messages being discarded when dynamic credits are allocated.
4. The method of claim 1, wherein the determining the allocation of traffic bandwidth comprises determining a transmission rate at the one of the first node and the second node, and allocating the traffic bandwidth based on the determined transmission rate.
5. The method of claim 1, wherein the determining the allocation of traffic bandwidth comprises issuing a buffer allocation to the one of the first node and the second node from the other of the first node and the second node based on the instruction.
6. The method of claim 1, wherein the at least one of the buffer slot credit allocation and the buffer allocation is pre-allocated based on the one of the first node and the second node.
7. A non-transitory computer readable medium, storing instructions for a Network on Chip (NoC) comprising a first node and a second node, the instructions comprising:
receiving, at one of the first node and the second node, an instruction from the other of the first node and the second node based on an end-to-end flow control buffer allocation result, the instruction comprising at least one of a buffer slot credit allocation and a buffer allocation for the one of the first node and the second node; and
determining, at the one of the first node and the second node, an allocation of traffic bandwidth based on the instruction.
8. The non-transitory computer readable medium of claim 7, wherein the at least one of the buffer slot credit allocation and the buffer allocation is determined from a Quality of Service (QoS) policy for the NoC.
9. The non-transitory computer readable medium of claim 7, wherein the at least one of the buffer slot credit allocation and the buffer allocation is determined from messages being discarded when dynamic credits are allocated.
10. The non-transitory computer readable medium of claim 7, wherein the determining the allocation of traffic bandwidth comprises determining a transmission rate at the one of the first node and the second node, and allocating the traffic bandwidth based on the determined transmission rate.
11. The non-transitory computer readable medium of claim 7, wherein the determining the allocation of traffic bandwidth comprises issuing a buffer allocation to the one of the first node and the second node from the other of the first node and the second node based on the instruction.
12. The non-transitory computer readable medium of claim 7, wherein the at least one of the buffer slot credit allocation and the buffer allocation is pre-allocated based on the one of the first node and the second node.
13. A Network on Chip (NoC) comprising a first node and a second node, the NoC configured to:
receive, at one of the first node and the second node, an instruction from the other of the first node and the second node based on an end-to-end flow control buffer allocation result, the instruction comprising at least one of a buffer slot credit allocation and a buffer allocation for the one of the first node and the second node; and
determine, at the one of the first node and the second node, an allocation of traffic bandwidth based on the instruction.
14. The NoC of claim 13, wherein the at least one of the buffer slot credit allocation and the buffer allocation is determined from a Quality of Service (QoS) policy for the NoC.
15. The NoC of claim 13, wherein the at least one of the buffer slot credit allocation and the buffer allocation is determined from messages being discarded when dynamic credits are allocated.
16. The NoC of claim 13, wherein the NoC is configured to determine the allocation of traffic bandwidth from determining a transmission rate at the one of the first node and the second node, and allocating the traffic bandwidth based on the determined transmission rate.
17. The NoC of claim 13, wherein the NoC is configured to determine the allocation of traffic bandwidth from issuing a buffer allocation to the one of the first node and the second node from the other of the first node and the second node based on the instruction.
18. The NoC of claim 13, wherein the at least one of the buffer slot credit allocation and the buffer allocation is pre-allocated based on the one of the first node and the second node.
US15/392,154 2013-05-03 2016-12-28 CONGESTION CONTROL AND QoS IN NoC BY REGULATING THE INJECTION TRAFFIC Abandoned US20170111283A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/392,154 US20170111283A1 (en) 2013-05-03 2016-12-28 CONGESTION CONTROL AND QoS IN NoC BY REGULATING THE INJECTION TRAFFIC

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/886,794 US9571402B2 (en) 2013-05-03 2013-05-03 Congestion control and QoS in NoC by regulating the injection traffic
US15/392,154 US20170111283A1 (en) 2013-05-03 2016-12-28 CONGESTION CONTROL AND QoS IN NoC BY REGULATING THE INJECTION TRAFFIC

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US13/886,794 Continuation US9571402B2 (en) 2013-05-03 2013-05-03 Congestion control and QoS in NoC by regulating the injection traffic

Publications (1)

Publication Number Publication Date
US20170111283A1 true US20170111283A1 (en) 2017-04-20

Family

ID=51841373

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/886,794 Active 2033-09-29 US9571402B2 (en) 2013-05-03 2013-05-03 Congestion control and QoS in NoC by regulating the injection traffic
US15/392,154 Abandoned US20170111283A1 (en) 2013-05-03 2016-12-28 CONGESTION CONTROL AND QoS IN NoC BY REGULATING THE INJECTION TRAFFIC


Country Status (1)

Country Link
US (2) US9571402B2 (en)




EP1863232A1 (en) * 2006-05-29 2007-12-05 Stmicroelectronics Sa On-chip bandwidth allocator
US20080072182A1 (en) 2006-09-19 2008-03-20 The Regents Of The University Of California Structured and parameterized model order reduction
EP2080128A1 (en) 2006-10-10 2009-07-22 Ecole Polytechnique Federale De Lausanne (Epfl) Method to design network-on-chip (noc)-based communication systems
US7502378B2 (en) * 2006-11-29 2009-03-10 Nec Laboratories America, Inc. Flexible wrapper architecture for tiled networks on a chip
WO2008126516A1 (en) 2007-04-10 2008-10-23 Naoki Suehiro Transmitting method, transmitting device, receiving method, and receiving device
US8136071B2 (en) 2007-09-12 2012-03-13 Neal Solomon Three dimensional integrated circuits and methods of fabrication
US8099757B2 (en) 2007-10-15 2012-01-17 Time Warner Cable Inc. Methods and apparatus for revenue-optimized delivery of content in a network
TWI390869B (en) 2008-04-24 2013-03-21 Univ Nat Taiwan System of network resources distribution and method of the same
US8040799B2 (en) * 2008-05-15 2011-10-18 International Business Machines Corporation Network on chip with minimum guaranteed bandwidth for virtual communications channels
CN102017548B (en) 2008-06-12 2013-08-28 松下电器产业株式会社 Network monitoring device, bus system monitoring device, method and program
US8050256B1 (en) 2008-07-08 2011-11-01 Tilera Corporation Configuring routing in mesh networks
US8312402B1 (en) 2008-12-08 2012-11-13 Cadence Design Systems, Inc. Method and apparatus for broadband electromagnetic modeling of three-dimensional interconnects embedded in multilayered substrates
US8065433B2 (en) 2009-01-09 2011-11-22 Microsoft Corporation Hybrid butterfly cube architecture for modular data centers
CN102415059B (en) * 2009-07-07 2014-10-08 松下电器产业株式会社 Bus control device
FR2948840B1 (en) * 2009-07-29 2011-09-16 Kalray CHIP COMMUNICATION NETWORK WITH SERVICE WARRANTY
US8285912B2 (en) 2009-08-07 2012-10-09 Arm Limited Communication infrastructure for a data processing apparatus and a method of operation of such a communication infrastructure
US8276105B2 (en) 2009-09-18 2012-09-25 International Business Machines Corporation Automatic positioning of gate array circuits in an integrated circuit design
FR2951342B1 (en) * 2009-10-13 2017-01-27 Arteris Inc NETWORK ON CHIP WITH NULL LATENCY
US8407647B2 (en) 2009-12-17 2013-03-26 Springsoft, Inc. Systems and methods for designing and making integrated circuits with consideration of wiring demand ratio
US8541819B1 (en) 2010-12-09 2013-09-24 Monolithic 3D Inc. Semiconductor device and structure
US8492886B2 (en) 2010-02-16 2013-07-23 Monolithic 3D Inc 3D integrated circuit with logic
FR2961048B1 (en) * 2010-06-03 2013-04-26 Arteris Inc CHIP NETWORK WITH QUALITY-OF-SERVICE CHARACTERISTICS
US20130080073A1 (en) 2010-06-11 2013-03-28 Waters Technologies Corporation Techniques for mass spectrometry peak list computation using parallel processing
US8196086B2 (en) 2010-07-21 2012-06-05 Lsi Corporation Granular channel width for power optimization
US9396162B2 (en) 2010-07-22 2016-07-19 John APPLEYARD Method and apparatus for estimating the state of a system
CN102467582B (en) 2010-10-29 2014-08-13 国际商业机器公司 Method and system for optimizing wiring constraint in integrated circuit design
US9397933B2 (en) 2010-12-21 2016-07-19 Verizon Patent And Licensing Inc. Method and system of providing micro-facilities for network recovery
US8717875B2 (en) 2011-04-15 2014-05-06 Alcatel Lucent Condensed core-energy-efficient architecture for WAN IP backbones
CN103348640B (en) * 2011-07-22 2016-11-23 松下知识产权经营株式会社 Relay
US8711867B2 (en) 2011-08-26 2014-04-29 Sonics, Inc. Credit flow control scheme in a router with flexible link widths utilizing minimal storage
US9213788B2 (en) 2011-10-25 2015-12-15 Massachusetts Institute Of Technology Methods and apparatus for constructing and analyzing component-based models of engineering systems
US20130151215A1 (en) 2011-12-12 2013-06-13 Schlumberger Technology Corporation Relaxed constraint delaunay method for discretizing fractured media
JP2013125906A (en) 2011-12-15 2013-06-24 Toshiba Corp Flare map calculation method, flare map calculation program, and method of manufacturing semiconductor device
ITTO20111180A1 (en) * 2011-12-21 2013-06-22 St Microelectronics Grenoble 2 CONTROL DEVICE, FOR EXAMPLE FOR SYSTEM-ON-CHIP, AND CORRESPONDENT PROCEDURE
US20130174113A1 (en) 2011-12-30 2013-07-04 Arteris SAS Floorplan estimation
US9070121B2 (en) 2012-02-14 2015-06-30 Silver Spring Networks, Inc. Approach for prioritizing network alerts
US9111151B2 (en) 2012-02-17 2015-08-18 National Taiwan University Network on chip processor with multiple cores and routing method thereof
US8756541B2 (en) 2012-03-27 2014-06-17 International Business Machines Corporation Relative ordering circuit synthesis
GB2500915B (en) * 2012-04-05 2018-03-14 Stmicroelectronics Grenoble2 Sas Arrangement and method
US8635577B2 (en) 2012-06-01 2014-01-21 International Business Machines Corporation Timing refinement re-routing
US9244880B2 (en) 2012-08-30 2016-01-26 Netspeed Systems Automatic construction of deadlock free interconnects
US20140092740A1 (en) 2012-09-29 2014-04-03 Ren Wang Adaptive packet deflection to achieve fair, low-cost, and/or energy-efficient quality of service in network on chip devices
US8885510B2 (en) 2012-10-09 2014-11-11 Netspeed Systems Heterogeneous channel capacities in an interconnect
GB2507124A (en) * 2012-10-22 2014-04-23 St Microelectronics Grenoble 2 Controlling data transmission rates based on feedback from the data recipient
US8601423B1 (en) 2012-10-23 2013-12-03 Netspeed Systems Asymmetric mesh NoC topologies
US8667439B1 (en) 2013-02-27 2014-03-04 Netspeed Systems Automatically connecting SoCs IP cores to interconnect nodes to minimize global latency and reduce interconnect cost

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060041889A1 (en) * 2002-10-08 2006-02-23 Koninklijke Philips Electronics N.V. Integrated circuit and method for establishing transactions
US20070195748A1 (en) * 2004-04-05 2007-08-23 Koninklijke Philips Electronics, N.V. Integrated circuit and method for time slot allocation
US20050276274A1 (en) * 2004-05-28 2005-12-15 Matthew Mattina Method and apparatus for synchronous unbuffered flow control of packets on a ring interconnect
US20080267211A1 (en) * 2004-06-09 2008-10-30 Koninklijke Philips Electronics, N.V. Integrated Circuit and Method for Time Slot Allocation
US20080186991A1 (en) * 2005-05-18 2008-08-07 Koninklijke Philips Electronics, N.V. Integrated Circuit and Method of Arbitration in a Network on an Integrated Circuit
US20080310458A1 (en) * 2005-05-26 2008-12-18 Nxp B.V. Electronic Device and Method of Communication Resource Allocation
US20070274215A1 (en) * 2006-05-23 2007-11-29 International Business Machines Corporation Method and a system for flow control in a communication network
US20110069612A1 (en) * 2009-03-12 2011-03-24 Takao Yamaguchi Best path selecting device, best path selecting method, and program
US20100281144A1 (en) * 2009-04-29 2010-11-04 Stmicroelectronics S.R.L. Control device for a system-on-chip and corresponding method
US20120140633A1 (en) * 2009-06-12 2012-06-07 Cygnus Broadband, Inc. Systems and methods for prioritizing and scheduling packets in a communication network
US20130142066A1 (en) * 2011-03-28 2013-06-06 Panasonic Corporation Router, method for controlling router, and program
US8667205B2 (en) * 2012-04-30 2014-03-04 International Business Machines Corporation Deadlock resolution in end-to-end credit protocol
US20140204740A1 (en) * 2012-07-24 2014-07-24 Panasonic Corporation Bus system and router

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10845868B2 (en) 2014-10-08 2020-11-24 Apple Inc. Methods and apparatus for running and booting an inter-processor communication link between independently operable processors
US11824962B2 (en) 2018-03-28 2023-11-21 Apple Inc. Methods and apparatus for sharing and arbitration of host stack information with user space communication stacks
US10819831B2 (en) 2018-03-28 2020-10-27 Apple Inc. Methods and apparatus for channel defunct within user space stack architectures
US11843683B2 (en) 2018-03-28 2023-12-12 Apple Inc. Methods and apparatus for active queue management in user space networking
US11095758B2 (en) 2018-03-28 2021-08-17 Apple Inc. Methods and apparatus for virtualized hardware optimizations for user space networking
US11146665B2 (en) 2018-03-28 2021-10-12 Apple Inc. Methods and apparatus for sharing and arbitration of host stack information with user space communication stacks
US11159651B2 (en) 2018-03-28 2021-10-26 Apple Inc. Methods and apparatus for memory allocation and reallocation in networking stack infrastructures
US11178260B2 (en) 2018-03-28 2021-11-16 Apple Inc. Methods and apparatus for dynamic packet pool configuration in networking stack infrastructures
US20190303222A1 (en) * 2018-03-28 2019-10-03 Apple Inc. Methods and apparatus for self-tuning operation within user space stack architectures
US11792307B2 (en) 2018-03-28 2023-10-17 Apple Inc. Methods and apparatus for single entity buffer pool management
US11368560B2 (en) * 2018-03-28 2022-06-21 Apple Inc. Methods and apparatus for self-tuning operation within user space stack architectures
US10846224B2 (en) 2018-08-24 2020-11-24 Apple Inc. Methods and apparatus for control of a jointly shared memory-mapped region
US11477123B2 (en) 2019-09-26 2022-10-18 Apple Inc. Methods and apparatus for low latency operation in user space networking
US11558348B2 (en) 2019-09-26 2023-01-17 Apple Inc. Methods and apparatus for emerging use case support in user space networking
US11829303B2 (en) 2019-09-26 2023-11-28 Apple Inc. Methods and apparatus for device driver operation in non-kernel space
US11606302B2 (en) 2020-06-12 2023-03-14 Apple Inc. Methods and apparatus for flow-based batching and processing
US11775359B2 (en) 2020-09-11 2023-10-03 Apple Inc. Methods and apparatuses for cross-layer processing
US11954540B2 (en) 2020-09-14 2024-04-09 Apple Inc. Methods and apparatus for thread-level execution in non-kernel space
US11799986B2 (en) 2020-09-22 2023-10-24 Apple Inc. Methods and apparatus for thread level execution in non-kernel space
US11882051B2 (en) 2021-07-26 2024-01-23 Apple Inc. Systems and methods for managing transmission control protocol (TCP) acknowledgements
US11876719B2 (en) 2021-07-26 2024-01-16 Apple Inc. Systems and methods for managing transmission control protocol (TCP) acknowledgements

Also Published As

Publication number Publication date
US9571402B2 (en) 2017-02-14
US20140328172A1 (en) 2014-11-06

Similar Documents

Publication Publication Date Title
US9571402B2 (en) Congestion control and QoS in NoC by regulating the injection traffic
US10110499B2 (en) QoS in a system with end-to-end flow control and QoS aware buffer allocation
US9007920B2 (en) QoS in heterogeneous NoC by assigning weights to NoC node channels and using weighted arbitration at NoC nodes
US10355996B2 (en) Heterogeneous channel capacities in an interconnect
EP3641244B1 (en) Method and apparatus for selecting path
US20140211622A1 (en) Creating multiple noc layers for isolation or avoiding noc traffic congestion
US8867559B2 (en) Managing starvation and congestion in a two-dimensional network having flow control
US9025456B2 (en) Speculative reservation for routing networks
US8817619B2 (en) Network system with quality of service management and associated management method
US10536385B2 (en) Output rates for virtual output queues
US20140036680A1 (en) Method to Allocate Packet Buffers in a Packet Transferring System
US9185026B2 (en) Tagging and synchronization for fairness in NOC interconnects
US9436642B2 (en) Bus system for semiconductor circuit
US11165705B2 (en) Data transmission method, device, and computer storage medium
US10983910B2 (en) Bandwidth weighting mechanism based network-on-chip (NoC) configuration
KR20130137539A (en) System for performing data cut-through
US11768784B2 (en) Latency and jitter for traffic over PCIe
Wang et al. Predictable vFabric on informative data plane
US20090274049A1 (en) Non-blocked network system and packet arbitration method thereof
CN109995608B (en) Network rate calculation method and device
CN109716719B (en) Data processing method and device and switching equipment
US20180198682A1 (en) Strategies for NoC Construction Using Machine Learning
US20230254259A1 (en) System And Method For Using Dynamic Thresholds With Route Isolation For Heterogeneous Traffic In Shared Memory Packet Buffers
KR101421232B1 (en) Packet processing device, method and computer readable recording medium thereof

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation; Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION
AS Assignment; Owner name: INTEL CORPORATION, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: NETSPEED SYSTEMS, INC.; REEL/FRAME: 060753/0662; Effective date: 20220708