US20050088969A1 - Port congestion notification in a switch - Google Patents
Port congestion notification in a switch
- Publication number
- US20050088969A1 (U.S. application Ser. No. 10/873,329)
- Authority
- US
- United States
- Prior art keywords
- congestion
- switch
- credit
- module
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/30—Peripheral units, e.g. input or output ports
- H04L49/3009—Header conversion, routing tables or routing tags
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/30—Peripheral units, e.g. input or output ports
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/54—Store-and-forward switching systems
- H04L12/56—Packet switching systems
- H04L12/5601—Transfer mode dependent, e.g. ATM
- H04L2012/5678—Traffic aspects, e.g. arbitration, load balancing, smoothing, buffer management
- H04L2012/5681—Buffer or queue management
- H04L2012/5683—Buffer or queue management for avoiding head of line blocking
Definitions
- the present invention relates to congestion notification in a switch. More particularly, the present invention relates to maintaining and updating a congestion status for all destination ports within a switch.
- Fibre Channel is a switched communications protocol that allows concurrent communication among servers, workstations, storage devices, peripherals, and other computing devices.
- Fibre Channel can be considered a channel-network hybrid, containing enough network features to provide the needed connectivity, distance and protocol multiplexing, and enough channel features to retain simplicity, repeatable performance and reliable delivery.
- Fibre Channel is capable of full-duplex transmission of frames at rates extending from 1 Gbps (gigabits per second) to 10 Gbps. It is also able to transport commands and data according to existing protocols such as Internet protocol (IP), Small Computer System Interface (SCSI), High Performance Parallel Interface (HIPPI) and Intelligent Peripheral Interface (IPI) over both optical fiber and copper cable.
- Fibre Channel is used to connect one or more computers or workstations together with one or more storage devices.
- each of these devices is considered a node.
- One node can be connected directly to another, or can be interconnected such as by means of a Fibre Channel fabric.
- the fabric can be a single Fibre Channel switch, or a group of switches acting together.
- the N_ports (node ports) on each node are connected to F_ports (fabric ports) on the switch.
- Multiple Fibre Channel switches can be combined into a single fabric. The switches connect to each other via E-Port (Expansion Port) forming an interswitch link, or ISL.
- Fibre Channel data is formatted into variable length data frames. Each frame starts with a start-of-frame (SOF) indicator and ends with a cyclical redundancy check (CRC) code for error detection and an end-of-frame indicator. In between are a 24-byte header and a variable-length data payload field that can range from 0 to 2112 bytes.
- the switch uses a routing table and the source and destination information found within the Fibre Channel frame header to route the Fibre Channel frames from one port to another. Routing tables can be shared between multiple switches in a fabric over an ISL, allowing one switch to know when a frame must be sent over the ISL to another switch in order to reach its destination port.
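- By way of illustration only, the following C sketch models the frame layout and destination lookup just described. The patent contains no code; the struct and helper names are hypothetical, with the field sizes taken from the text above and the D_ID position taken from the standard Fibre Channel header layout.

```c
#include <stdint.h>

#define FC_HEADER_LEN  24
#define FC_MAX_PAYLOAD 2112

/* Illustrative layout of a Fibre Channel frame as described
 * above: SOF indicator, 24-byte header, 0-2112 byte payload,
 * CRC, and EOF indicator. The struct and field names are
 * hypothetical; real hardware operates on raw transmission
 * words rather than C structs. */
typedef struct {
    uint32_t sof;                     /* start-of-frame indicator */
    uint8_t  header[FC_HEADER_LEN];   /* carries D_ID and S_ID    */
    uint16_t payload_len;             /* 0..2112 bytes            */
    uint8_t  payload[FC_MAX_PAYLOAD];
    uint32_t crc;                     /* error detection          */
    uint32_t eof;                     /* end-of-frame indicator   */
} fc_frame_t;

/* D_ID, the destination address consulted against the routing
 * table, occupies bytes 1-3 of the 24-byte header. */
static uint32_t fc_d_id(const fc_frame_t *f)
{
    return ((uint32_t)f->header[1] << 16) |
           ((uint32_t)f->header[2] << 8)  |
            (uint32_t)f->header[3];
}
```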
- Fibre Channel switches are required to deliver frames to any destination in the same order that they arrive from a source.
- One common approach to ensure in-order delivery in this context is to process frames in strict temporal order at the input or ingress side of a switch. This is accomplished by managing the input buffer as a first in, first out (FIFO) buffer.
- Sometimes, however, a switch encounters a frame that cannot be delivered due to congestion at the destination port. This frame remains at the top of the buffer until the destination port becomes un-congested, even when the next frame in the FIFO is destined for a port that is not congested and could be transmitted immediately. This condition is referred to as head of line blocking.
- Scheduling algorithms do not use true FIFOs. Rather, they search the input FIFO buffer looking for matches between waiting data and available output ports. If the top frame is destined for a busy port, the scheduling algorithm merely scans the FIFO buffer for the first frame that is destined for an available port. Such algorithms must take care to avoid sending Fibre Channel frames out of order.
- Another approach is to divide the input buffer into separate buffers for each possible destination. However, this requires large amounts of memory and a good deal of complexity in large switches having many possible destination ports.
- a third approach is the deferred queuing solution described in detail in the incorporated references.
- Deferred queuing requires that all incoming data frames that are destined for a congested port be placed in a deferred queue, which keeps these frames from unduly interfering with frames destined for uncongested ports. This technique requires a dependable method for determining the congestion status for all destinations at each input port.
- Congestion and blocking are especially troublesome when the destination port is an E_Port providing an interswitch link to another switch.
- One reason that the E_Port can become congested is that the input port on the second switch has filled up its input buffer. The flow control between the switches prevents the first switch from sending any more data to the second switch. Often times the input buffer on the second switch becomes filled with frames that are all destined for a single congested port on that second switch. This filled buffer has congested the ISL, so that the first switch cannot send any data to the second switch—including data that is destined for an un-congested port on the second switch.
- Several manufacturers have proposed the use of virtual channels to prevent the situation where congestion on an interswitch link is caused by traffic to a single destination.
- the virtual channel flow control solution requires that every input port in the downstream switch know the congestion status of all destinations in the switch.
- the existing solutions for providing this information are not satisfactory, as they do not easily present accurate congestion status information to each of the ingress ports in a switch.
- the foregoing needs are met, to a great extent, by the present invention, which provides a method for noticing port congestion and informing ingress ports of the congestion.
- the present invention utilizes a switch that submits data to a crossbar component for making connections to a destination port. Before data is submitted to the crossbar, it is stored in a virtual output queue structure in a memory subsystem. A separate virtual output queue is maintained for each destination within the switch. When a connection is made over the crossbar to a destination port, data is removed from the virtual output queue associated with that destination port and transmitted over the connection. When a destination port becomes congested, flow control within the switch will prevent data from leaving the virtual output queues associated with that destination.
- the present invention utilizes a cell credit manager at the ingress to the switch.
- the cell credit manager tracks credits associated with each virtual output queue in order to obtain knowledge about the amount of data within each queue. If the credit count in the cell credit manager drops below a threshold value, the cell credit manager views the associated port as a congested port and asserts an XOFF signal.
- the XOFF signal includes three components: an internal switch destination address for the relevant destination port, an XOFF/XON status bit, and a validity signal to indicate that a valid XOFF signal is being sent.
- the XOFF signal of the cell credit manager is received by a plurality of XOFF mask modules.
- One XOFF mask is utilized at each ingress to the switch.
- Each XOFF mask receives the XOFF signal, and assigns the designated destination port to the indicated XOFF/XON status.
- the XOFF mask maintains the status for every destination port in a look up table that assigns a single bit to each port. If the bit assigned to a port is set to “1,” the port has an XOFF status. If the bit is “0,” the port has an XON status and is free to receive data.
- the present invention recognizes that the XOFF mask should not set the status of the destination port to XON during certain portions of the deferred queuing procedure. Consequently, the present invention utilizes an XON history register that also tracks the current status of all ports. This XON history register receives the XOFF signals from the cell credit manager and reflects those changes in its own lookup table. The values in the look up table in the XON history register are then used to periodically update the values in the look up table in the XOFF mask.
- the present invention also recognizes flow control signals directly from the memory subsystem that request that all data stop flowing to that subsystem.
- When these signals are received, a “gross_xoff” signal is sent to the XOFF mask.
- the XOFF mask is then able to combine the results of this signal with the status of every destination port as maintained in its lookup table.
- When another portion of the switch wishes to determine the status of a particular port, the internal switch destination address is submitted to the XOFF mask. This address is used to reference the status of that destination in the lookup table, and the result is ORed with the value of the gross_xoff signal. The resulting signal indicates the status of the indicated destination port.
- the present invention utilizes a single cell credit manager to track the inputs to the memory subsystem for a plurality of ports. Since each port has its own XOFF mask, the XOFF signals must be sent to the XOFF mask for each port that the cell credit manager tracks. Other cell credit managers exist within the switch.
- the present invention utilizes a special bus to transfer XOFF signals between the various cell credit managers within a switch.
- the present invention provides a technique for a stop_all signal to be shared with all XOFF masks utilizing a single memory subsystem. This signal will ensure that when the gross_xoff signal is set, it will prevent all traffic from flowing into the memory subsystem.
- FIG. 1 is a block diagram of one possible Fibre Channel switch in which the present invention can be utilized.
- FIG. 2 is a block diagram showing the details of the port protocol device of the Fibre Channel switch shown in FIG. 1 .
- FIG. 3 is a block diagram showing the details of the memory controller of the port protocol device shown in FIG. 2 .
- FIG. 4 is a block diagram showing the queuing utilized in an upstream switch and a downstream switch communicating over an interswitch link.
- FIG. 5 is a block diagram showing XOFF flow control between the ingress memory subsystem and the egress memory subsystem in the switch of FIG. 1 .
- FIG. 6 is a block diagram showing backplane credit flow control between the ingress memory subsystem and the egress memory subsystem in the switch of FIG. 1 .
- FIG. 7 is a block diagram showing flow control between the ingress memory subsystem and the protocol interface module in the switch of FIG. 1 .
- FIG. 8 is a block diagram showing flow control between the fabric interface module and the egress memory subsystem in the switch of FIG. 1 .
- FIG. 9 is a block diagram showing the interactions of the fabric interface modules, the XOFF masks, and the cell credit manager in the switch of FIG. 1 .
- FIG. 10 is a block diagram showing the details of the cell credit manager, the XON history register, and the XOFF mask in the switch of FIG. 1 .
- the present invention is best understood after examining the major components of a Fibre Channel switch, such as switch 100 shown in FIG. 1 .
- the components shown in FIG. 1 are helpful in understanding the applicant's preferred embodiment, but persons of ordinary skill will understand that the present invention can be incorporated in switches of different construction, configuration, or port counts.
- Switch 100 is a director class Fibre Channel switch having a plurality of Fibre Channel ports 110 .
- the ports 110 are physically located on one or more I/O boards inside of switch 100 .
- While FIG. 1 shows only two I/O boards, namely ingress board 120 and egress board 122 , a director class switch 100 would contain eight or more such boards.
- the preferred embodiment described in the application can contain thirty-two such I/O boards 120 , 122 .
- Each board 120 , 122 contains a microprocessor 124 that, along with its RAM and flash memory (not shown), is responsible for controlling and monitoring the other components on the boards 120 , 122 and for handling communication between the boards 120 , 122 .
- each board 120 , 122 also contains four port protocol devices (or PPDs) 130 .
- PPDs 130 can take a variety of known forms, including an ASIC, an FPGA, a daughter card, or even a plurality of chips found directly on the boards 120 , 122 .
- the PPDs 130 are ASICs, and can be referred to as the FCP ASICs, since they are primarily designed to handle Fibre Channel protocol data.
- Each PPD 130 manages and controls four ports 110 . This means that each I/O board 120 , 122 in the preferred embodiment contains sixteen Fibre Channel ports 110 .
- switch 100 also contains a crossbar 140 designed to establish a switched communication path between two ports 110 .
- crossbar 140 is cell-based, meaning that it is designed to switch small, fixed-size cells of data. This is true even though the overall switch 100 is designed to switch variable length Fibre Channel frames.
- the Fibre Channel frames are received on a port, such as input port 112 , and are processed by the port protocol device 130 connected to that port 112 .
- the PPD 130 contains two major logical sections, namely a protocol interface module 150 and a fabric interface module 160 .
- the protocol interface module 150 receives Fibre Channel frames from the ports 110 and stores them in temporary buffer memory.
- the protocol interface module 150 also examines the frame header for its destination ID and determines the appropriate output or egress port 114 for that frame.
- the frames are then submitted to the fabric interface module 160 , which segments the variable-length Fibre Channel frames into fixed-length cells acceptable to crossbar 140 .
- the fabric interface module 160 then transmits the cells to an ingress memory subsystem (iMS) 180 .
- a single iMS 180 handles all frames received on the I/O board 120 , regardless of the port 110 or PPD 130 on which the frame was received.
- When the ingress memory subsystem 180 receives the cells that make up a particular Fibre Channel frame, it treats that collection of cells as a variable length packet.
- the iMS 180 assigns this packet a packet ID (or “PID”) that indicates the cell buffer address in the iMS 180 where the packet is stored.
- the PID and the packet length are then passed on to the ingress Priority Queue (iPQ) 190 , which organizes the packets in iMS 180 into one or more queues, and submits those packets to crossbar 140 .
- Before submitting a packet to crossbar 140 , the iPQ 190 submits a “bid” to arbiter 170 .
- When the arbiter 170 receives the bid, it configures the appropriate connection through crossbar 140 , and then grants access to that connection to the iPQ 190 .
- the packet length is used to ensure that the connection is maintained until the entire packet has been transmitted through the crossbar 140 , although the connection can be terminated early.
- a single arbiter 170 can manage four different crossbars 140 .
- the arbiter 170 handles multiple simultaneous bids from all iPQs 190 in the switch 100 , and can grant multiple simultaneous connections through crossbar 140 .
- the arbiter 170 also handles conflicting bids, ensuring that no output port 114 receives data from more than one input port 112 at a time.
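- The bid-and-grant exchange described above can be pictured with a short behavioral sketch. This is not the AMCC chipset's actual interface; all names are invented, and only the rule that no egress port may receive from more than one ingress port at a time is taken from the description.

```c
#include <stdbool.h>

#define NUM_PORTS 544          /* 512 ports + 32 microprocessors */
#define NO_OWNER  (-1)

/* Behavioral sketch of the arbiter: an egress port is granted to
 * at most one ingress at a time, so no output port ever receives
 * data from two input ports simultaneously. */
typedef struct {
    int owner[NUM_PORTS];      /* ingress holding each egress    */
} arbiter_t;

static void arbiter_init(arbiter_t *arb)
{
    for (int i = 0; i < NUM_PORTS; i++)
        arb->owner[i] = NO_OWNER;
}

static bool arbiter_bid(arbiter_t *arb, int ingress, int egress)
{
    if (arb->owner[egress] != NO_OWNER)
        return false;               /* conflicting bid: egress busy */
    arb->owner[egress] = ingress;   /* configure crossbar path      */
    return true;                    /* grant returned to the iPQ    */
}

static void arbiter_release(arbiter_t *arb, int egress)
{
    arb->owner[egress] = NO_OWNER;  /* whole packet sent, or the
                                       connection terminated early  */
}
```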
- the output or egress memory subsystem (eMS) 182 receives the data cells comprising the packet from the crossbar 140 , and passes a packet ID to an egress priority queue (ePQ) 192 .
- the egress priority queue 192 provides scheduling, traffic management, and queuing for communication between egress memory subsystem 182 and the PPD 130 in egress I/O board 122 .
- the eMS 182 transmits the cells comprising the Fibre Channel frame to the egress portion of PPD 130 .
- the fabric interface module 160 then reassembles the data cells and presents the resulting Fibre Channel frame to the protocol interface module 150 .
- the protocol interface module 150 stores the frame in its buffer, and then outputs the frame through output port 114 .
- crossbar 140 and the related components are part of a commercially available cell-based switch chipset, such as the nPX8005 or “Cyclone” switch fabric manufactured by Applied Micro Circuits Corporation of San Diego, Calif. More particularly, in the preferred embodiment, the crossbar 140 is the AMCC S8705 Crossbar product, the arbiter 170 is the AMCC S8605 Arbiter, the iPQ 190 and ePQ 192 are AMCC S8505 Priority Queues, and the iMS 180 and eMS 182 are AMCC S8905 Memory Subsystems, all manufactured by Applied Micro Circuits Corporation.
- FIG. 2 shows the components of one of the four port protocol devices 130 found on each of the I/O Boards 120 , 122 .
- incoming Fibre Channel frames are received over a port 110 by the protocol interface 150 .
- a link controller module (LCM) 300 in the protocol interface 150 receives the Fibre Channel frames and submits them to the memory controller module 310 .
- One of the primary jobs of the link controller module 300 is to compress the start of frame (SOF) and end of frame (EOF) codes found in each Fibre Channel frame. By compressing these codes, space is created for status and routing information that must be transmitted along with the data within the switch 100 .
- As each frame passes through PPD 130 , the PPD 130 generates information about the frame's port speed, its priority value, the internal switch destination address (or SDA) for the source port 112 and the destination port 114 , and various error indicators. This information is added to the SOF and EOF in the space made by the LCM 300 . This “extended header” stays with the frame as it traverses through the switch 100 , and is replaced with the original SOF and EOF as the frame leaves the switch 100 .
- the LCM 300 uses a SERDES chip (such as the Gigablaze SERDES available from LSI Logic Corporation, Milpitas, Calif.) to convert between the serial data used by the port 110 and the 10-bit parallel data used in the rest of the protocol interface 150 .
- the LCM 300 performs all low-level link-related functions, including clock conversion, idle detection and removal, and link synchronization.
- the LCM 300 also performs arbitrated loop functions, checks frame CRC and length, and counts errors.
- the memory controller module 310 is responsible for storing the incoming data frame on the inbound frame buffer memory 320 .
- Each port 110 on the PPD 130 is allocated a separate portion of the buffer 320 .
- each port 110 could be given a separate physical buffer 320 .
- This buffer 320 is also known as the credit memory, since the BB_Credit flow control between switch 100 and the upstream device is based upon the size or credits of this memory 320 .
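- The BB_Credit mechanism mentioned here reduces to a simple counter, as in the sketch below. This is a generic model of Fibre Channel buffer-to-buffer credit with invented names, not logic taken from the patent.

```c
#include <stdbool.h>

/* Sketch of BB_Credit flow control on one ingress link: the
 * upstream device may transmit only while it holds credits, one
 * per buffer in credit memory 320, and an R_RDY primitive
 * returns a credit as each buffer drains. */
typedef struct {
    int bb_credit;   /* granted at login: buffers in memory 320 */
} fc_link_t;

static bool upstream_may_send(const fc_link_t *l) { return l->bb_credit > 0; }
static void on_frame_sent(fc_link_t *l)           { l->bb_credit--; }
static void on_r_rdy(fc_link_t *l)                { l->bb_credit++; }
```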
- the memory controller 310 identifies new Fibre Channel frames arriving in credit memory 320 , and shares the frame's destination ID and its location in credit memory 320 with the inbound routing module 330 .
- the routing module 330 of the present invention examines the destination ID found in the frame header of the frames and determines the switch destination address (SDA) in switch 100 for the appropriate destination port 114 .
- the router 330 is also capable of routing frames to the SDA associated with one of the microprocessors 124 in switch 100 .
- the SDA is a ten-bit address that uniquely identifies every port 110 and processor 124 in switch 100 .
- a single routing module 330 handles all of the routing for the PPD 130 .
- the routing module 330 then provides the routing information to the memory controller 310 .
- the memory controller 310 consists of a memory write module 340 , a memory read module 350 , and a queue control module 400 .
- a separate memory controller 310 exists for each of the four ports 110 on the PPD 130 .
- the memory write module 340 handles all aspects of writing data to the credit memory 320 .
- the memory read module 350 is responsible for reading the data frames out of memory 320 and providing the frame to the fabric interface module 160 .
- the queue control module 400 handles the queuing and ordering of data on the credit memory 320 .
- the XON history register 420 can also be considered a part of the memory controller 310 , although only a single XON history register 420 is needed to service all four ports 110 on a PPD 130 .
- the queue control module 400 stores the routing results received from the inbound routing module 330 .
- the queue control module 400 decides which frame should leave the memory 320 next. In doing so, the queue module 400 utilizes procedures that avoid head-of-line blocking.
- the queue control module 400 has four primary components, namely the deferred queue 402 , the backup queue 404 , the header select logic 406 , and the XOFF mask 408 . These components work in conjunction with the XON History register 420 and the cell credit manager or credit module 440 to control ingress queuing and to assist in managing flow control within switch 100 .
- the deferred queue 402 stores the frame headers and locations in buffer memory 320 for frames waiting to be sent to a destination port 114 that is currently busy.
- the backup queue 404 stores the frame headers and buffer locations for frames that arrive at the input port 112 while the deferred queue 402 is sending deferred frames to their destination.
- the header select logic 406 determines the state of the queue control module 400 , and uses this determination to select the next frame in credit memory 320 to be submitted to the FIM 160 . To do this, the header select logic 406 supplies to the memory read module 350 a valid buffer address containing the next frame to be sent.
- the functioning of the backup queue 404 , the deferred queue 402 , and the header select logic 406 are described in the incorporated Fibre Channel Switch application.
- the XOFF mask 408 contains a congestion status bit for each port 110 in the switch.
- the XON history register 420 is used to delay updating the XOFF mask 408 under certain conditions.
- the queue control 400 passes the frame's routed header and pointer to the memory read portion 350 .
- This read module 350 then takes the frame from the credit memory 320 and provides it to the fabric interface module 160 .
- the fabric interface module 160 converts the variable-length Fibre Channel frames received from the protocol interface 150 into fixed-sized data cells acceptable to the cell-based crossbar 140 . Each cell is constructed with a specially configured cell header appropriate to the cell-based switch fabric.
- the cell header When using the Cyclone switch fabric of Applied Micro Circuits Corporation, the cell header includes a starting sync character, the switch destination address of the egress port 114 and a priority assignment from the inbound routing module 330 , a flow control field and ready bit, an ingress class of service assignment, a packet length field, and a start-of-packet and end-of-packet identifier.
- the preferred embodiment of the fabric interface 160 creates fill data to compensate for the speed difference between the memory controller 310 output data rate and the ingress data rate of the cell-based crossbar 140 . This process is described in more detail in the incorporated Fibre Channel Switch application.
- Egress data cells are received from the crossbar 140 and stored in the egress memory subsystem 182 . When these cells leave the eMS 182 , they enter the egress portion of the fabric interface module 160 . The FIM 160 then examines the cell headers, removes fill data, and concatenates the cell payloads to re-construct Fibre Channel frames with extended SOF/EOF codes. If necessary, the FIM 160 uses a small buffer to smooth gaps within frames caused by cell header and fill data removal.
- There are multiple links between each PPD 130 and the iMS 180 .
- Each separate link uses a separate FIM 160 .
- each port 110 on the PPD 130 is given a separate link to the iMS 180 , and therefore each port 110 is assigned a separate FIM 160 .
- the FIM 160 then submits the frames to the outbound processor module (OPM) 450 .
- a separate OPM 450 is used for each port 110 on the PPD 130 .
- the outbound processor module 450 checks each frame's CRC, and handles the necessary buffering between the fabric interface 160 and the ports 110 to account for their different data transfer rates.
- the primary job of the outbound processor modules 450 is to handle data frames received from the cell-based crossbar 140 that are destined for one of the Fibre Channel ports 110 . This data is submitted to the link controller module 300 , which replaces the extended SOF/EOF codes with standard Fibre Channel SOF/EOF characters, performs 8b/10b encoding, and sends data frames through its SERDES to the Fibre Channel port 110 .
- the components of the PPD 130 can communicate with the microprocessor 124 on the I/O board 120 , 122 through the microprocessor interface module (MIM) 360 .
- the microprocessor 124 can read and write registers on the PPD 130 and receive interrupts from the PPDs 130 . This communication occurs over a microprocessor communication path 362 .
- the microprocessor 124 also uses the microprocessor interface 360 to communicate with the ports 110 and with other processors 124 over the cell-based switch fabric.
- FIG. 4 shows two switches 260 , 270 that are communicating over an interswitch link 230 .
- the ISL 230 connects an egress port 114 on upstream switch 260 with an ingress port 112 on downstream switch 270 .
- the egress port 114 is located on the first PPD 262 (labeled PPD 0 ) on the first I/O Board 264 (labeled I/O Board 0 ) on switch 260 .
- This I/O board 264 contains a total of four PPDs 130 , each containing four ports 110 .
- This means I/O board 264 has a total of sixteen ports 110 , numbered 0 through 15.
- switch 260 contains thirty-one other I/O boards 120 , 122 , meaning the switch 260 has a total of five hundred and twelve ports 110 .
- This particular configuration of I/O Boards 120 , 122 , PPDs 130 , and ports 110 is for exemplary purposes only, and other configurations would clearly be within the scope of the present invention.
- I/O Board 264 has a single egress memory subsystem 182 to hold all of the data received from the crossbar 140 (not shown) for its sixteen ports 110 .
- the data in eMS 182 is controlled by the egress priority queue 192 (also not shown).
- the ePQ 192 maintains the data in the eMS 182 in a plurality of output class of service queues (O_COS_Q) 280 .
- Data for each port 110 on the I/O Board 264 is kept in a total of “n” O_COS queues, with the number n reflecting the number of virtual channels 240 defined to exist within the ISL 230 .
- the eMS 182 and ePQ 192 add the cell to the appropriate O_COS_Q 280 based on the destination SDA and priority value assigned to the cell. This information was placed in the cell header as the cell was created by the ingress FIM 160 .
- the cells are then removed from the O_COS_Q 280 and are submitted to the PPD 262 for the egress port 114 , which converts the cells back into a Fibre Channel frame and sends it across the ISL 230 to the downstream switch 270 .
- the frame enters switch 270 over the ISL 230 through ingress port 112 .
- This ingress port 112 is actually the second port (labeled port 1 ) found on the first PPD 272 (labeled PPD 0 ) on the first I/O Board 274 (labeled I/O Board 0 ) on switch 270 .
- this I/O board 274 contains a total of four PPDs 130 , with each PPD 130 containing four ports 110 .
- switch 270 has the same five hundred and twelve ports as switch 260 .
- When the frame is received at port 112 , it is placed in credit memory 320 .
- the D_ID of the frame is examined, and the frame is queued and a routing determination is made as described above. Assuming that the destination port on switch 270 is not XOFFed according to the XOFF mask 408 servicing input port 112 , the frame will be subdivided into cells and forwarded to the ingress memory subsystem 180 .
- the iMS 180 is organized and controlled by the ingress priority queue 190 , which is responsible for ensuring in-order delivery of data cells and packets.
- the iPQ 190 organizes the data in its iMS 180 into a number (“m”) of different virtual output queues (V_O_Qs) 290 .
- a separate V_O_Q 290 is established for every destination within the switch 270 . In switch 270 , this means that there are at least five hundred forty-four V_O_Qs 290 (five hundred twelve physical ports 110 and thirty-two microprocessors 124 ) in iMS 180 .
- the iMS 180 places incoming data on the appropriate V_O_Q 290 according to the switch destination address assigned to that data by the routing module 330 in PPD 272 .
- Data in the V_O_Qs 290 is handled like the data in O_COS_Qs 280 , such as by using round robin servicing.
- When data is removed from a V_O_Q 290 , it is submitted to the crossbar 140 and provided to an eMS 182 on the switch 270 .
- FIG. 4 also shows a virtual input queue structure 282 within each ingress port 112 in downstream switch 270 .
- Each of these V_I_Qs 282 corresponds to one of the virtual channels 240 on the ISL 230 , which in turn corresponds to one of the O_COS_Qs 280 on the upstream switch.
- the downstream switch 270 can identify which O_COS_Q 280 in switch 260 was assigned to the frame.
- the switch 270 is able to communicate that congestion to the upstream switch by performing flow control for the virtual channel 240 assigned to that V_I_Q 282 .
- the cell-based switch fabric used in the preferred embodiment of the present invention can be considered to include the memory subsystems 180 , 182 , the priority queues 190 , 192 , the cell-based crossbar 140 , and the arbiter 170 . As described above, these elements can be obtained commercially from companies such as Applied Micro Circuits Corporation.
- This switch fabric utilizes a variety of flow control mechanisms to prevent internal buffer overflows, to control the flow of cells into the cell-based switch fabric, and to receive flow control instructions to stop cells from exiting the switch fabric.
- XOFF internal flow control within the cell-based switch fabric is shown as communication 500 in FIG. 5 .
- This flow control serves to stop data cells from being sent from iMS 180 to eMS 182 over the crossbar 140 in situations where the eMS 182 or one of the O_COS_Qs 280 in the eMS 182 is becoming full. If there were no flow control, congestion at an egress port 114 would prevent data in the port's associated O_COS_Qs 280 from exiting the switch 100 . If the iMS 180 were allowed to keep sending data to these queues 280 , eMS 182 would overflow and data would be lost.
- This flow control works as follows.
- When one or more of the O_COS_Qs 280 in the eMS 182 fill beyond a threshold, an XOFF signal is generated internal to the switch fabric to stop transmission of data from the iMS 180 to these O_COS_Qs 280 .
- the preferred Cyclone switch fabric uses three different thresholds, namely a routine threshold, an urgent threshold, and an emergency threshold. Each threshold creates a corresponding type of XOFF signal to the iMS 180 .
- the XOFF signal generated by the eMS 182 cannot simply turn off data for a single O_COS_Q 280 .
- the XOFF signal is not even specific to a single congested port 110 .
- Upon a routine XOFF signal, the iMS 180 responds by stopping all cell traffic to the group of four ports 110 found on the PPD 130 that contains the congested egress port 114 .
- Urgent and Emergency XOFF signals cause the iMS 180 and Arbiter 170 to stop all cell traffic to the affected egress I/O board 122 .
- the eMS 182 is able to accept additional packets of data before the iMS 180 stops sending data.
- Emergency XOFF signals mean that new packets arriving at the eMS 182 will be discarded.
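- The differing scope of the three thresholds might be summarized in code as follows. This is a behavioral sketch with invented helper functions; only the four-ports-per-PPD and sixteen-ports-per-board figures come from the embodiment described above.

```c
#include <stdio.h>

#define PORTS_PER_PPD   4
#define PORTS_PER_BOARD 16

typedef enum { XOFF_ROUTINE, XOFF_URGENT, XOFF_EMERGENCY } xoff_level_t;

static void stop_ppd(int ppd)     { printf("halt traffic to PPD %d\n", ppd); }
static void stop_board(int board) { printf("halt traffic to board %d\n", board); }

/* Behavioral summary of the three eMS thresholds: a routine XOFF
 * halts traffic to the four-port PPD containing the congested
 * egress port, while urgent and emergency XOFFs halt all traffic
 * to the whole egress I/O board (at the emergency level the eMS
 * is already discarding newly arriving packets). */
static void ims_on_xoff(xoff_level_t level, int egress_port)
{
    switch (level) {
    case XOFF_ROUTINE:
        stop_ppd(egress_port / PORTS_PER_PPD);
        break;
    case XOFF_URGENT:
    case XOFF_EMERGENCY:
        stop_board(egress_port / PORTS_PER_BOARD);
        break;
    }
}
```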
- the iPQ 190 also uses a backplane credit flow control 510 (shown in FIG. 6 ) to manage the traffic from the iMS 180 to the different egress memory subsystems 182 more granularly than the XOFF signals 500 described above. For every packet submitted to an egress port 114 , the iPQ 190 decrements its “backplane” credit count for that port 114 . When the packet is transmitted out of the eMS 182 , a backplane credit is returned to the iPQ 190 .
- Even though only a single O_COS_Q 280 is not sending data, the iPQ 190 only maintains credits on a per-port 110 basis, not a class of service basis. Thus, the affected iPQ 190 will stop sending all data to the port 114 , including data with a different class of service that could be transmitted over the port 114 . In addition, since the iPQ 190 services an entire I/O board 120 , all traffic to that egress port 114 from any of the ports 110 on that board 120 is stopped. Other iPQs 190 on other I/O boards 120 , 122 can continue sending packets to the same egress port 114 as long as those other iPQs 190 have backplane credits for that port 114 .
- the backplane credit system 510 can provide some internal switch flow control from ingress to egress on the basis of a virtual channel 240 , but it is inconsistent. If two ingress ports 112 on two separate I/O boards 120 , 122 are each sending data to different virtual channels 240 on the same ISL 230 , the use of backplane credits will flow control those channels 240 differently. One of those virtual channels 240 might have an XOFF condition. Packets to that O_COS_Q 280 will back up, and backplane credits will not be returned. The lack of backplane credits will cause the iPQ 190 sending to the XOFFed virtual channel 240 to stop sending data.
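- A minimal sketch of the backplane credit counts follows, assuming one integer counter per egress port. The per-port (rather than per-class-of-service) granularity is the limitation discussed above; all names are invented.

```c
#include <stdbool.h>

#define NUM_PORTS 544

/* Sketch of the iPQ's backplane credit counts: decremented per
 * packet submitted toward an egress port, incremented when the
 * eMS reports the packet has been transmitted out. */
typedef struct {
    int credits[NUM_PORTS];
} ipq_t;

static bool ipq_may_send(const ipq_t *q, int egress)
{
    return q->credits[egress] > 0;   /* no credit, no packet     */
}

static void ipq_packet_submitted(ipq_t *q, int egress)
{
    q->credits[egress]--;            /* one credit per packet    */
}

static void ipq_credit_returned(ipq_t *q, int egress)
{
    q->credits[egress]++;            /* packet left the eMS 182  */
}
```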
- the cell-based switch fabric must be able to stop the flow of data from its data source (i.e., the FIM 160 ) whenever the iMS 180 or a V_O_Q 290 maintained by the iPQ 190 is becoming full.
- the switch fabric signals this XOFF condition by setting the RDY (ready) bit to 0 on the cells it returns to the FIM 160 , shown as 520 on FIG. 7 .
- Although this XOFF is an input flow control signal between the iMS 180 and the ingress portion of the PPD 130 , the signals are communicated from the eMS 182 into the egress portion of the same PPD 130 .
- When the egress portion of the FIM 160 receives the cells with RDY set to 0, it informs the ingress portion of the PPD 130 to stop sending data to the iMS 180 .
- flow control cells 520 are sent by the eMS 182 to the egress portion of the FIM 160 to inform the PPD 130 of this updated state. These flow control cells use the RDY bit in the cell header to indicate the current status of the iMS 180 and its related queues 290 .
- In the first case, the iMS 180 fills up to its threshold level, and no more traffic should be sent to the iMS 180 .
- When a FIM 160 receives the flow control cells 520 indicating this condition, it sends a congestion signal (or “gross_xoff” signal) 522 to the XOFF mask 408 in the memory controller 310 . This signal informs the MCM 310 to stop all data traffic to the iMS 180 , as described in more detail below.
- the FIM 160 will also broadcast an external signal called STOP_ALL 164 to the FIMs 160 on its PPD 130 , as well as to the other three PPDs 130 on its I/O board 120 .
- the STOP_ALL congestion signal 164 may take the same form as the gross_xoff congestion signal 522 , or it may be differently formatted.
- the interconnection between the PPDs 130 and the STOP_ALL signal 164 is shown in FIG. 9 , although it is possible to use the physical linkage 454 between the cell credit managers 440 to communicate this signal.
- When a FIM 160 receives the STOP_ALL signal 164 , the gross_xoff signal 522 is sent to its memory controller 310 . Since all FIMs 160 on a board 120 will receive the STOP_ALL signal 164 , this signal will stop all traffic to the iMS 180 .
- the gross_xoff signal 522 will remain on until the flow control cells 520 received by the FIM 160 indicate the buffer condition at the iMS 180 is over.
- In the second case, a single V_O_Q 290 in the iMS 180 fills up to its threshold.
- the signal 520 back to the PPD 130 will behave just as it did in the first case, with the generation of a gross_xoff congestion signal 522 and a STOP_ALL congestion signal 164 .
- the entire iMS 180 stops receiving data, even though only a single V_O_Q 290 has become congested.
- the third case involves a failed link between a FIM 160 and the iMS 180 .
- Flow control cells indicating this condition will cause a gross_xoff signal 522 to be sent only to the MCM 310 for the corresponding FIM 160 .
- No STOP_ALL signal 164 is sent in this situation.
- When an egress portion of a PPD 130 wishes to stop traffic coming from the eMS 182 , it signals an XOFF to the switch fabric by sending a cell from the input FIM 160 to the iMS 180 , which is shown as flow control 530 on FIG. 8 .
- the cell header contains a queue flow control field and a RDY bit to help define the XOFF signal.
- the queue flow control field is eleven bits long, and can identify the class of service, port 110 and PPD 130 , as well as the desired flow status (XON or XOFF).
- the PPD 130 might desire to stop the flow of data from the eMS 182 for several reasons.
- the iMS 180 extracts each XOFF instruction from the cell header, and sends it to the eMS 182 , directing the eMS 182 to XOFF or XON a particular O_COS_Q 280 . If the O_COS_Q 280 is sending a packet to the FIM 160 , it finishes sending the packet. The eMS 182 then stops sending fabric-to-port or fabric-to-microprocessor packets to the FIM 160 .
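- One plausible packing of the eleven-bit queue flow control field is sketched below. The patent does not give the bit positions, so the layout here is assumed purely for illustration.

```c
#include <stdint.h>

/* An assumed packing of the eleven-bit queue flow control field:
 * flow status, class of service, PPD, and port. The actual bit
 * assignment is not specified in the text above. */
typedef struct {
    unsigned xoff : 1;   /* 1 = XOFF, 0 = XON        */
    unsigned cos  : 6;   /* class of service queue    */
    unsigned ppd  : 2;   /* PPD 0-3 on the I/O board  */
    unsigned port : 2;   /* port 0-3 on that PPD      */
} q_fc_field_t;          /* eleven bits in all        */

static uint16_t q_fc_pack(q_fc_field_t f)
{
    return (uint16_t)((f.xoff << 10) | (f.cos << 4) |
                      (f.ppd  << 2)  |  f.port);
}

static q_fc_field_t q_fc_unpack(uint16_t w)
{
    q_fc_field_t f;
    f.xoff = (w >> 10) & 0x1;
    f.cos  = (w >> 4)  & 0x3f;
    f.ppd  = (w >> 2)  & 0x3;
    f.port =  w        & 0x3;
    return f;
}
```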
- the XOFF mask 408 shown in FIG. 10 is responsible for notifying the ingress ports 112 of the congestion status of all egress ports 114 and microprocessors 124 in the switch. Every port 112 has its own XOFF mask 408 , as shown in FIG. 9 .
- the XOFF mask 408 is considered part of the queue control module 400 in the memory controller 310 , and is therefore shown within the MCM 310 in FIG. 9 .
- Each XOFF mask 408 contains a separate status bit for all destinations within the switch 100 .
- In the switch 100 there are five hundred and twelve physical ports 110 and thirty-two microprocessors 124 that can serve as a destination for a frame.
- the XOFF mask 408 uses a 544 by 1 look up table 410 to store the “XOFF” status of each destination. If a bit in XOFF look up table 410 is set, the port 110 corresponding to that bit is busy and cannot receive any frames.
- the XOFF mask 408 returns a status for a destination by first receiving the switch destination address for that port 110 or microprocessor 124 on SDA input 412 .
- the look up table 410 is examined for the SDA on input 412 , and if the corresponding bit is set, the XOFF mask 408 asserts a signal on “defer” output 414 , which indicates to the rest of the queue control module 400 that the selected port 110 or processor 124 is busy.
- This construction of the XOFF mask 408 is the preferred way to store the congestion status of possible destinations at each port 110 . Other ways are possible, as long as they can quickly respond to a status query about a destination with the congestion status for that destination.
- the output of the XOFF look up table 410 is not the sole source for the defer signal 414 .
- the XOFF mask 408 receives the gross_xoff signal 522 from its associated FIM 160 . This signal 522 is ORed with the output of the lookup table 410 in order to generate the defer signal 414 . This means that whenever the gross_xoff signal 522 is set, the defer signal 414 will also be set, effectively stopping all traffic to the iMS 180 .
- a force defer signal that is controlled by the microprocessor 124 is also able to cause the defer signal 414 to go on. When the defer signal 414 is set, it informs the header select logic 406 and the remaining elements of the queue module 400 that the port 110 having the address on next frame header output 415 is congested, and this frame should be stored on the deferred queue 402 .
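- A minimal sketch of the XOFF mask logic follows, assuming a plain array for the 544-by-1 lookup table 410 . The names are invented; the defer computation (the addressed table bit ORed with the gross_xoff and force defer signals) is as described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_DESTS 544   /* 512 physical ports + 32 microprocessors */

/* Sketch of one XOFF mask 408: a one-bit-per-destination lookup
 * table indexed by switch destination address (SDA), combined by
 * OR gates with the gross_xoff and force defer signals. */
typedef struct {
    uint8_t table[NUM_DESTS];   /* 1 = XOFF (busy), 0 = XON       */
    bool    gross_xoff;         /* stop all traffic to the iMS    */
    bool    force_defer;        /* asserted by microprocessor 124 */
} xoff_mask_t;

/* Status query: returns true when the frame whose SDA is on
 * input 412 must be placed on the deferred queue 402. */
static bool xoff_mask_defer(const xoff_mask_t *m, unsigned sda)
{
    return m->table[sda] || m->gross_xoff || m->force_defer;
}
```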
- the XON history register 420 is used to record the history of the XON status of all destinations in the switch 100 .
- the XOFF mask 408 cannot be updated with an XON event when the queue control 400 is servicing deferred frames in the deferred queue 402 .
- the XOFF mask 408 will ignore (or not receive) the XOFF signal 452 from the cell credit manager 440 and will therefore not update its lookup table 410 .
- the signal 452 from the cell credit manager 440 will, however, update the lookup table 422 within the XON history register 420 .
- the XON history register 420 maintains the current XON status of all ports 110 .
- the update signal 416 is made active by the header select 406 portion of the queue control module 400 , the entire content of the lookup table 422 in the XON history register 420 is transferred to the lookup table 410 of the XOFF mask 408 .
- Registers within the table 422 containing a zero (having a status of XON) will cause corresponding registers within the XOFF mask lookup table 410 to be reset to zero.
- the dual register setup allows for XOFFs to be written directly to the XOFF mask 408 at any time the cell credit manager 440 requires traffic to be halted, and causes XONs to be applied only when the logic within the queue control module 400 allows for a change to an XON value. While a separate queue control module 400 and its associated XOFF mask 408 are necessary for each port 110 in the PPD 130 , only one XON history register 420 is necessary to service all four ports 110 in the PPD 130 , which again is shown in FIG. 9 .
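- The dual-register behavior can be sketched as follows, with invented function names: XOFF events write straight through to the mask, while XON events wait in the history table until the update signal 416 fires.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_DESTS 544

typedef struct { uint8_t bit[NUM_DESTS]; } xoff_table_t;   /* LUT 410 */
typedef struct { uint8_t bit[NUM_DESTS]; } xon_history_t;  /* LUT 422 */

/* XOFF events land in both tables immediately; XON events are
 * recorded only in the history table. */
static void on_credit_event(xoff_table_t *mask, xon_history_t *hist,
                            unsigned sda, bool xoff)
{
    hist->bit[sda] = xoff ? 1 : 0;   /* history always tracks events */
    if (xoff)
        mask->bit[sda] = 1;          /* XOFFs apply at any time      */
    /* XONs are deliberately withheld from the mask here */
}

/* When update signal 416 fires, every zero (XON) in the history
 * table resets the corresponding XOFF mask bit. */
static void on_update_signal(xoff_table_t *mask, const xon_history_t *hist)
{
    for (unsigned sda = 0; sda < NUM_DESTS; sda++)
        if (hist->bit[sda] == 0)
            mask->bit[sda] = 0;
}
```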
- the cell credit manager or credit module 440 sets the XOFF/XON status of the possible destination ports 110 in the lookup tables 410 , 422 of the XOFF mask 408 and the XON history register 420 . To update these tables 410 , 422 , the cell credit manager 440 maintains a cell credit count of every cell in the virtual output queues 290 of the iMS 180 . Every time a cell addressed to a particular SDA leaves the FIM 160 and enters the iMS 180 , the FIM 160 informs the credit module 440 through a cell credit event signal 442 . The credit module 440 then decrements the cell count for that SDA.
- When a cell leaves the iMS 180 and is transmitted over the crossbar 140 , the credit module 440 is again informed and adds a credit to the count for the associated SDA.
- the iPQ 190 sends this credit information back to the credit module 440 by sending a cell containing the cell credit back to the FIM 160 through the eMS 182 .
- the FIM 160 then sends an increment credit signal 442 to the cell credit manager 440 .
- This cell credit flow control is designed to prevent the occurrence of more drastic levels of flow control from within the cell-based switch fabric described above, since these flow control signals 500 - 520 can result in multiple blocked ports 110 , shutting down an entire iMS 180 , or even the loss of data.
- the cell credits are tracked through increment and decrement credit events 442 received from FIM 160 . These events are stored in dedicated increment FIFOs 444 and decrement FIFOs 446 . Each FIM 160 is associated with a separate increment FIFO 444 and a separate decrement FIFO 446 , although ports 1 - 3 are shown as sharing FIFOs 444 , 446 for the sake of simplicity. Decrement FIFOs 446 contain SDAs for cells that have entered the iMS 180 . Increment FIFOs 444 contain SDAs for cells that have left the iMS 180 .
- FIFOs 444 , 446 are handled in round robin format, decrementing and incrementing the credit count that the credit module 440 maintains for each SDA in its cell credit accumulator 447 .
- the cell credit accumulator 447 is able to handle one increment event from one of the FIFOs 444 and one decrement event from one of the FIFOs 446 at the same time.
- Event select logic services the FIFOs 444 , 446 in a round robin manner while monitoring the status of each FIFO 444 , 446 so as to avoid giving access to the accumulator 447 to empty FIFOs 444 , 446 .
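- The event select logic might look like the sketch below, assuming simple ring-buffer FIFOs of an arbitrary depth; only the round-robin servicing that skips empty FIFOs is taken from the description.

```c
#include <stdbool.h>

#define NUM_FIFOS  4    /* one increment (or decrement) FIFO per FIM */
#define FIFO_DEPTH 64   /* depth chosen only for illustration        */

typedef struct {
    unsigned sda[FIFO_DEPTH];   /* queued credit events (SDAs)  */
    int head, tail;             /* head == tail means empty     */
} credit_fifo_t;

static bool fifo_empty(const credit_fifo_t *f) { return f->head == f->tail; }

/* Visit the FIFOs round robin, skipping empty ones, so an empty
 * FIFO never gets access to accumulator 447. Returns false when
 * nothing is pending anywhere. */
static bool select_next_event(credit_fifo_t fifo[NUM_FIFOS], int *rr,
                              unsigned *sda_out)
{
    for (int i = 0; i < NUM_FIFOS; i++) {
        credit_fifo_t *f = &fifo[(*rr + i) % NUM_FIFOS];
        if (!fifo_empty(f)) {
            *sda_out = f->sda[f->head];
            f->head  = (f->head + 1) % FIFO_DEPTH;
            *rr = (*rr + i + 1) % NUM_FIFOS;   /* resume after this one */
            return true;
        }
    }
    return false;
}
```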
- the accumulator 447 maintains separate credit counts for each SDA, with each count reflecting the number of cells contained within the iMS 180 for a given SDA.
- a compare module 448 detects when the count for an SDA within accumulator 447 crosses an XOFF or XON threshold stored in threshold memory 449 . When a threshold is crossed, the compare module 448 causes a driver to send the appropriate XOFF or XON event 452 to the XOFF mask 408 and the XON history register 420 . If the count gets too low, then that SDA is XOFFed. This means that Fibre Channel frames that are to be routed to that SDA are held in the credit memory 320 by queue control module 400 .
- the credit module 440 waits for the count for that SDA to rise to a certain level, and then the SDA is XONed, which instructs the queue control module 400 to release frames for that destination from the credit memory 320 .
- the XOFF and XON thresholds in threshold memory 449 can be different for each individual SDA, and are programmable by the processor 124 .
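- A minimal sketch of the accumulator 447 and compare module 448 follows, assuming per-SDA XOFF and XON thresholds with the XON threshold set higher to provide hysteresis, consistent with the description above. All names are invented.

```c
#include <stdbool.h>
#include <stdio.h>

#define NUM_DESTS 544

/* One credit count per SDA, decremented as cells enter the iMS
 * 180 and incremented as they leave, with per-SDA programmable
 * XOFF/XON thresholds. */
typedef struct {
    int  count[NUM_DESTS];
    int  xoff_thresh[NUM_DESTS];  /* programmable by processor 124 */
    int  xon_thresh[NUM_DESTS];   /* set higher than xoff_thresh   */
    bool xoffed[NUM_DESTS];
} credit_mgr_t;

/* Stand-in for driving XOFF/XON event 452 to the XOFF masks 408
 * and XON history register 420. */
static void send_event_452(unsigned sda, bool xoff)
{
    printf("SDA %u -> %s\n", sda, xoff ? "XOFF" : "XON");
}

static void credit_adjust(credit_mgr_t *c, unsigned sda, int delta)
{
    c->count[sda] += delta;       /* -1 cell in, +1 cell out       */

    if (!c->xoffed[sda] && c->count[sda] < c->xoff_thresh[sda]) {
        c->xoffed[sda] = true;    /* frames for this SDA are held
                                     in credit memory 320          */
        send_event_452(sda, true);
    } else if (c->xoffed[sda] && c->count[sda] >= c->xon_thresh[sda]) {
        c->xoffed[sda] = false;   /* frames may be released again  */
        send_event_452(sda, false);
    }
}
```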
- the credit module 440 sends an XOFF instruction 452 to the XON history register 420 and all four XOFF masks 408 in its PPD 130 .
- the XOFF instruction 452 is a three-part signal identifying the SDA, the new XOFF status, and a validity signal.
- each cell credit manager 440 receives communications from the FIMs 160 on its PPD 130 regarding the cells that each FIM 160 submits to the iMS 180 .
- the FIMs 160 also report back to the cell credit manager 440 when those cells are submitted by the iMS 180 over the crossbar 140 .
- the cell credit managers 440 are able to track the status of all cells submitted to the iMS 180 . Even though each cell credit manager 440 is only tracking cells related to its PPD 130 (approximately one fourth of the total cells passing through the iMS 180 ), this information could be used to implement a useful congestion notification system.
- the ingress memory subsystem 180 of the preferred embodiment, manufactured by AMCC, does not return cell credit information to the same FIM 160 that submitted the cell.
- the cell credit relating to a cell submitted by the first FIM 160 on the first PPD 130 might be returned by the iMS 180 to the last FIM 160 on the last PPD 130 . Consequently, the cell credit managers 440 cannot assume that each decrement credit event 442 they receive relating to a cell entering the iMS 180 will ever result in a related increment credit event 442 being returned to it when that cell leaves the iMS 180 .
- the increment credit event 442 may very well end up at another cell credit manager 440 .
- an alternative embodiment of the present invention has the four cell credit managers 440 on an I/O board 120 , 122 combine their cell credit events 442 in a master/slave relationship.
- each board 120 , 122 has a single “master” cell credit manager 441 and three “slave” cell credit managers 440 .
- When a slave unit 440 receives a cell credit event signal 442 from a FIM 160 , the signal 442 is forwarded to the master cell credit manager 441 over a special XOFF bus 454 (as seen in FIG. 9 ).
- the master unit 441 receives cell credit event signals 442 from the three slave units 440 as well as the FIMs 160 that directly connect to the master unit 441 .
- the master cell credit manager 441 receives the cell credit event signals 442 from all of the FIMs 160 on an I/O board 120 . This allows the master unit to maintain a credit count for each SDA in its accumulator 447 that reflects all data cells entering and leaving the iMS 180 .
- the master cell credit manager 441 is solely responsible for maintaining the credit counts and for comparing the credit counts with the threshold values stored in its threshold memory 449 .
- When a threshold is crossed, the master unit 441 sends an XOFF or XON event 452 to its associated XON history register 420 and XOFF masks 408 .
- the master unit 441 sends an instruction to the slave cell credit managers 440 to send the same XOFF or XON event 452 to their XON history registers 420 and XOFF masks 408 .
- In this way, the four cell credit managers 440 , 441 send the same XOFF/XON event 452 to all four XON history registers 420 and all sixteen XOFF masks 408 on the I/O board 120 , 122 , effectively unifying the cell credit congestion notification across the board 120 , 122 .
- To resynchronize the credit counts, the FIM 160 toggles a ‘state’ bit in the headers of all cells sent to the iMS 180 to reflect a transition in the system's state.
- the credit counters in cell credit accumulator 447 are restored to full credit. Since each of the cell credits returned from the iMS 180 /eMS 182 includes an indication of the value of the state bit in the cell, it is possible to differentiate credits relating to cells sent before the state change. Any credits received by the FIM 160 that do not have the proper state bit are ignored.
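- The state-bit filtering can be sketched as follows, with invented names; only the toggle-on-resynchronization and ignore-stale-credits behavior comes from the description above.

```c
#include <stdbool.h>

/* When credit counts are rebuilt, the FIM toggles the state bit
 * carried in every outgoing cell header and the accumulator is
 * restored to full credit; any returned credit stamped with the
 * stale bit is then discarded. */
typedef struct {
    bool state;    /* value currently written into cell headers */
} fim_state_t;

static void begin_resync(fim_state_t *f)
{
    f->state = !f->state;   /* mark the transition; the caller also
                               restores accumulator 447 to full credit */
}

static bool credit_is_current(const fim_state_t *f, bool credit_state_bit)
{
    /* credits from cells sent before the state change carry the
       old bit and are ignored */
    return credit_state_bit == f->state;
}
```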
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Small-Scale Networks (AREA)
Abstract
Description
- This application is a continuation-in-part application based on U.S. patent application Ser. No. 10/020,968, entitled “Deferred Queuing in a Buffered Switch,” filed on Dec. 19, 2001, which is hereby incorporated by reference.
- This application is related to U.S. patent application entitled “Fibre Channel Switch,” Ser. No. ______, attorney docket number 3194, filed on even date herewith with inventors in common with the present application. This related application is hereby incorporated by reference.
- The present invention relates to congestion notification in a switch. More particularly, the present invention relates to maintaining and updating a congestion status for all destination ports within a switch.
- Fibre Channel is a switched communications protocol that allows concurrent communication among servers, workstations, storage devices, peripherals, and other computing devices. Fibre Channel can be considered a channel-network hybrid, containing enough network features to provide the needed connectivity, distance and protocol multiplexing, and enough channel features to retain simplicity, repeatable performance and reliable delivery. Fibre Channel is capable of full-duplex transmission of frames at rates extending from 1 Gbps (gigabits per second) to 10 Gbps. It is also able to transport commands and data according to existing protocols such as Internet protocol (IP), Small Computer System Interface (SCSI), High Performance Parallel Interface (HIPPI) and Intelligent Peripheral Interface (IPI) over both optical fiber and copper cable.
- In a typical usage, Fibre Channel is used to connect one or more computers or workstations together with one or more storage devices. In the language of Fibre Channel, each of these devices is considered a node. One node can be connected directly to another, or can be interconnected such as by means of a Fibre Channel fabric. The fabric can be a single Fibre Channel switch, or a group of switches acting together. Technically, the N_ports (node ports) on each node are connected to F_ports (fabric ports) on the switch. Multiple Fibre Channel switches can be combined into a single fabric. The switches connect to each other via E_Ports (Expansion Ports), forming an interswitch link, or ISL.
- Fibre Channel data is formatted into variable length data frames. Each frame starts with a start-of-frame (SOF) indicator and ends with a cyclical redundancy check (CRC) code for error detection and an end-of-frame indicator. In between are a 24-byte header and a variable-length data payload field that can range from 0 to 2112 bytes. The switch uses a routing table and the source and destination information found within the Fibre Channel frame header to route the Fibre Channel frames from one port to another. Routing tables can be shared between multiple switches in a fabric over an ISL, allowing one switch to know when a frame must be sent over the ISL to another switch in order to reach its destination port.
- Fibre Channel switches are required to deliver frames to any destination in the same order that they arrive from a source. One common approach to ensure in-order delivery in this context is to process frames in strict temporal order at the input or ingress side of a switch. This is accomplished by managing its input buffer as a first in, first out (FIFO) buffer. Sometimes, however, a switch encounters a frame that cannot be delivered due to congestion at the destination port. This frame remains at the top of the buffer until the destination port becomes un-congested, even when the next frame in the FIFO is destined for a port that is not congested and could be transmitted immediately. This condition is referred to as head of line blocking.
- Various techniques have been proposed to deal with the problem of head of line blocking. Scheduling algorithms, for instance, do not use true FIFOs. Rather, they search the input FIFO buffer looking for matches between waiting data and available output ports. If the top frame is destined for a busy port, the scheduling algorithm merely scans the FIFO buffer for the first frame that is destined for an available port. Such algorithms must take care to avoid sending Fibre Channel frames out of order. Another approach is to divide the input buffer into separate buffers for each possible destination. However, this requires large amounts of memory and a good deal of complexity in large switches having many possible destination ports. A third approach is the deferred queuing solution described in detail in the incorporated references. Deferred queuing requires that all incoming data frames that are destined for a congested port be placed in a deferred queue, which keeps these frames from unduly interfering with frames destined for uncongested ports. This technique requires a dependable method for determining the congestion status for all destinations at each input port.
- Congestion and blocking are especially troublesome when the destination port is an E_Port providing an interswitch link to another switch. One reason that the E_Port can become congested is that the input port on the second switch has filled up its input buffer. The flow control between the switches prevents the first switch from sending any more data to the second switch. Often times the input buffer on the second switch becomes filled with frames that are all destined for a single congested port on that second switch. This filled buffer has congested the ISL, so that the first switch cannot send any data to the second switch—including data that is destined for an un-congested port on the second switch. Several manufacturers have proposed the use of virtual channels to prevent the situation where congestion on an interswitch link is caused by traffic to a single destination. In these proposals, traffic on the link is divided into several virtual channels, and no virtual channel is allowed to interfere with traffic on the other virtual channels. Inrange Technologies Corporation has proposed a technique for flow control over virtual channels that is described in the incorporated Fibre Channel Switch application. This flow control technique monitors the congestion status of all destination ports at the downstream switch. If a destination port becomes congested, the flow control process determines which virtual channel on the ISL is affected, and sends an XOFF message so informing the upstream switch. The upstream switch will then stop sending data on the affected virtual channel.
- Like the deferred queuing solution, the virtual channel flow control solution requires that every input port in the downstream switch know the congestion status of all destinations in the switch. Unfortunately, the existing solutions for providing this information are not satisfactory, as they do not easily present accurate congestion status information to each of the ingress ports in a switch.
- The foregoing needs are met, to a great extent, by the present invention, which provides a method for detecting port congestion and informing ingress ports of the congestion. The present invention utilizes a switch that submits data to a crossbar component for making connections to a destination port. Before data is submitted to the crossbar, it is stored in a virtual output queue structure in a memory subsystem. A separate virtual output queue is maintained for each destination within the switch. When a connection is made over the crossbar to a destination port, data is removed from the virtual output queue associated with that destination port and transmitted over the connection. When a destination port becomes congested, flow control within the switch prevents data from leaving the virtual output queues associated with that destination.
- The present invention utilizes a cell credit manager at the ingress to the switch. The cell credit manager tracks credits associated with each virtual output queue in order to obtain knowledge about the amount of data within each queue. If the credit count in the cell credit manager drops below a threshold value, the cell credit manager views the associated port as a congested port and asserts an XOFF signal. The XOFF signal includes three components: an internal switch destination address for the relevant destination port, an XOFF/XON status bit, and a validity signal to indicate that a valid XOFF signal is being sent.
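- A minimal C sketch of the three-part XOFF signal and the threshold test just described follows. The struct and function names are assumptions; the ten-bit width of the switch destination address is stated elsewhere in this application.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three components of the XOFF signal described above. */
struct xoff_signal {
    uint16_t sda;    /* internal switch destination address (10 bits) */
    bool     xoff;   /* true = XOFF (congested), false = XON */
    bool     valid;  /* a valid XOFF signal is being driven */
};

/* Assert XOFF for a destination whose credit count has fallen below
 * its threshold; otherwise drive no valid signal. */
struct xoff_signal credit_check(uint16_t sda, int credits, int threshold)
{
    struct xoff_signal s = { .sda = sda, .xoff = false, .valid = false };
    if (credits < threshold) {
        s.xoff  = true;
        s.valid = true;
    }
    return s;
}
```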
- The XOFF signal of the cell credit manager is received by a plurality of XOFF mask modules. One XOFF mask is utilized at each ingress to the switch. Each XOFF mask receives the XOFF signal, and sets the designated destination port to the indicated XOFF/XON status. The XOFF mask maintains the status for every destination port in a lookup table that assigns a single bit to each port. If the bit assigned to a port is set to “1,” the port has an XOFF status. If the bit is “0,” the port has an XON status and is free to receive data.
- The present invention recognizes that the XOFF mask should not set the status of the destination port to XON during certain portions of the deferred queuing procedure. Consequently, the present invention utilizes an XON history register that also tracks the current status of all ports. This XON history register receives the XOFF signals from the cell credit manager and reflects those changes in its own lookup table. The values in the lookup table in the XON history register are then used to periodically update the values in the lookup table in the XOFF mask.
- The present invention also recognizes flow control signals sent directly from the memory subsystem that request that all data stop flowing to that subsystem. When these signals are received, a “gross_xoff” signal is sent to the XOFF mask. The XOFF mask is then able to combine the result of this signal with the status of every destination port as maintained in its lookup table. When another portion of the switch wishes to determine the status of a particular port, the internal switch destination address is submitted to the XOFF mask. This address is used to reference the status of that destination in the lookup table, and the result is ORed with the value of the gross_xoff signal. The resulting signal indicates the status of the indicated destination port.
- The present invention utilizes a single cell credit manager to track the inputs to the memory subsystem for a plurality of ports. Since each port has its own XOFF mask, the XOFF signals must be sent to the XOFF mask of each port that the cell credit manager tracks. Other cell credit managers exist within the switch, and the present invention utilizes a special bus to transfer XOFF signals between the various cell credit managers within a switch. In addition, the present invention provides a technique for a stop_all signal to be shared with all XOFF masks that utilize a single memory subsystem. This signal ensures that when the gross_xoff signal is set, all traffic is prevented from flowing into the memory subsystem.
- FIG. 1 is a block diagram of one possible Fibre Channel switch in which the present invention can be utilized.
- FIG. 2 is a block diagram showing the details of the port protocol device of the Fibre Channel switch shown in FIG. 1.
- FIG. 3 is a block diagram showing the details of the memory controller of the port protocol device shown in FIG. 2.
- FIG. 4 is a block diagram showing the queuing utilized in an upstream switch and a downstream switch communicating over an interswitch link.
- FIG. 5 is a block diagram showing XOFF flow control between the ingress memory subsystem and the egress memory subsystem in the switch of FIG. 1.
- FIG. 6 is a block diagram showing backplane credit flow control between the ingress memory subsystem and the egress memory subsystem in the switch of FIG. 1.
- FIG. 7 is a block diagram showing flow control between the ingress memory subsystem and the protocol interface module in the switch of FIG. 1.
- FIG. 8 is a block diagram showing flow control between the fabric interface module and the egress memory subsystem in the switch of FIG. 1.
- FIG. 9 is a block diagram showing the interactions of the fabric interface modules, the XOFF masks, and the cell credit manager in the switch of FIG. 1.
- FIG. 10 is a block diagram showing the details of the cell credit manager, the XON history register, and the XOFF mask in the switch of FIG. 1.
- 1. Switch 100
- The present invention is best understood after examining the major components of a Fibre Channel switch, such as switch 100 shown in FIG. 1. The components shown in FIG. 1 are helpful in understanding the applicant's preferred embodiment, but persons of ordinary skill will understand that the present invention can be incorporated in switches of different construction, configuration, or port counts.
- Switch 100 is a director class Fibre Channel switch having a plurality of Fibre Channel ports 110. The ports 110 are physically located on one or more I/O boards inside of switch 100. Although FIG. 1 shows only two I/O boards, namely ingress board 120 and egress board 122, a director class switch 100 would contain eight or more such boards. The preferred embodiment described in the application can contain thirty-two such I/O boards. Each board 120, 122 contains a microprocessor 124 that, along with its RAM and flash memory (not shown), is responsible for controlling and monitoring the other components on the boards 120, 122.
- In the preferred embodiment, each board 120, 122 contains four port protocol devices (PPDs) 130. The PPDs 130 can take a variety of known forms, including an ASIC, an FPGA, a daughter card, or even a plurality of chips found directly on the boards 120, 122. In the preferred embodiment, the PPDs 130 are ASICs, and can be referred to as the FCP ASICs, since they are primarily designed to handle Fibre Channel protocol data. Each PPD 130 manages and controls four ports 110. This means that each I/O board 120, 122 contains sixteen Fibre Channel ports 110.
- The I/O boards 120, 122 are connected to one or more crossbars 140 designed to establish a switched communication path between two ports 110. Although only a single crossbar 140 is shown, the preferred embodiment uses four or more crossbar devices 140 working together. In the preferred embodiment, crossbar 140 is cell-based, meaning that it is designed to switch small, fixed-size cells of data. This is true even though the overall switch 100 is designed to switch variable length Fibre Channel frames.
- The Fibre Channel frames are received on a port, such as input port 112, and are processed by the port protocol device 130 connected to that port 112. The PPD 130 contains two major logical sections, namely a protocol interface module 150 and a fabric interface module 160. The protocol interface module 150 receives Fibre Channel frames from the ports 110 and stores them in temporary buffer memory. The protocol interface module 150 also examines the frame header for its destination ID and determines the appropriate output or egress port 114 for that frame. The frames are then submitted to the fabric interface module 160, which segments the variable-length Fibre Channel frames into fixed-length cells acceptable to crossbar 140.
- The fabric interface module 160 then transmits the cells to an ingress memory subsystem (iMS) 180. A single iMS 180 handles all frames received on the I/O board 120, regardless of the port 110 or PPD 130 on which the frame was received.
- When the ingress memory subsystem 180 receives the cells that make up a particular Fibre Channel frame, it treats that collection of cells as a variable length packet. The iMS 180 assigns this packet a packet ID (or “PID”) that indicates the cell buffer address in the iMS 180 where the packet is stored. The PID and the packet length are then passed on to the ingress Priority Queue (iPQ) 190, which organizes the packets in iMS 180 into one or more queues, and submits those packets to crossbar 140. Before submitting a packet to crossbar 140, the iPQ 190 submits a “bid” to arbiter 170. When the arbiter 170 receives the bid, it configures the appropriate connection through crossbar 140, and then grants access to that connection to the iPQ 190. The packet length is used to ensure that the connection is maintained until the entire packet has been transmitted through the crossbar 140, although the connection can be terminated early.
- A single arbiter 170 can manage four different crossbars 140. The arbiter 170 handles multiple simultaneous bids from all iPQs 190 in the switch 100, and can grant multiple simultaneous connections through crossbar 140. The arbiter 170 also handles conflicting bids, ensuring that no output port 114 receives data from more than one input port 112 at a time.
- The output or egress memory subsystem (eMS) 182 receives the data cells comprising the packet from the crossbar 140, and passes a packet ID to an egress priority queue (ePQ) 192. The egress priority queue 192 provides scheduling, traffic management, and queuing for communication between egress memory subsystem 182 and the PPD 130 in egress I/O board 122. When directed to do so by the ePQ 192, the eMS 182 transmits the cells comprising the Fibre Channel frame to the egress portion of PPD 130. The fabric interface module 160 then reassembles the data cells and presents the resulting Fibre Channel frame to the protocol interface module 150. The protocol interface module 150 stores the frame in its buffer, and then outputs the frame through output port 114.
- In the preferred embodiment, crossbar 140 and the related components are part of a commercially available cell-based switch chipset, such as the nPX8005 or “Cyclone” switch fabric manufactured by Applied Micro Circuits Corporation of San Diego, Calif. More particularly, in the preferred embodiment, the crossbar 140 is the AMCC S8705 Crossbar product, the arbiter 170 is the AMCC S8605 Arbiter, the iPQ 190 and ePQ 192 are AMCC S8505 Priority Queues, and the iMS 180 and eMS 182 are AMCC S8905 Memory Subsystems, all manufactured by Applied Micro Circuits Corporation.
- 2. Port Protocol Device 130
- a) Link Controller Module 300
- FIG. 2 shows the components of one of the four port protocol devices 130 found on each of the I/O Boards 120, 122. Incoming Fibre Channel frames are received at a port 110 by the protocol interface 150. A link controller module (LCM) 300 in the protocol interface 150 receives the Fibre Channel frames and submits them to the memory controller module 310. One of the primary jobs of the link controller module 300 is to compress the start of frame (SOF) and end of frame (EOF) codes found in each Fibre Channel frame. By compressing these codes, space is created for status and routing information that must be transmitted along with the data within the switch 100. More specifically, as each frame passes through PPD 130, the PPD 130 generates information about the frame's port speed, its priority value, the internal switch destination address (or SDA) for the source port 112 and the destination port 114, and various error indicators. This information is added to the SOF and EOF in the space made by the LCM 300. This “extended header” stays with the frame as it traverses through the switch 100, and is replaced with the original SOF and EOF as the frame leaves the switch 100.
- The LCM 300 uses a SERDES chip (such as the Gigablaze SERDES available from LSI Logic Corporation, Milpitas, Calif.) to convert between the serial data used by the port 110 and the 10-bit parallel data used in the rest of the protocol interface 150. The LCM 300 performs all low-level link-related functions, including clock conversion, idle detection and removal, and link synchronization. The LCM 300 also performs arbitrated loop functions, checks frame CRC and length, and counts errors.
- b) Memory Controller Module 310
- The memory controller module 310 is responsible for storing the incoming data frame in the inbound frame buffer memory 320. Each port 110 on the PPD 130 is allocated a separate portion of the buffer 320. Alternatively, each port 110 could be given a separate physical buffer 320. This buffer 320 is also known as the credit memory, since the BB_Credit flow control between switch 100 and the upstream device is based upon the size or credits of this memory 320. The memory controller 310 identifies new Fibre Channel frames arriving in credit memory 320, and shares the frame's destination ID and its location in credit memory 320 with the inbound routing module 330.
- The routing module 330 of the present invention examines the destination ID found in the frame header of the frames and determines the switch destination address (SDA) in switch 100 for the appropriate destination port 114. The router 330 is also capable of routing frames to the SDA associated with one of the microprocessors 124 in switch 100. In the preferred embodiment, the SDA is a ten-bit address that uniquely identifies every port 110 and processor 124 in switch 100. A single routing module 330 handles all of the routing for the PPD 130. The routing module 330 then provides the routing information to the memory controller 310.
- As shown in FIG. 3, the memory controller 310 consists of a memory write module 340, a memory read module 350, and a queue control module 400. A separate memory controller 310 exists for each of the four ports 110 on the PPD 130. The memory write module 340 handles all aspects of writing data to the credit memory 320. The memory read module 350 is responsible for reading the data frames out of memory 320 and providing the frame to the fabric interface module 160. The queue control module 400 handles the queuing and ordering of data on the credit memory 320. The XON history register 420 can also be considered a part of the memory controller 310, although only a single XON history register 420 is needed to service all four ports 110 on a PPD 130.
- c) Queue Control Module 400
- The queue control module 400 stores the routing results received from the inbound routing module 330. When the credit memory 320 contains multiple frames, the queue control module 400 decides which frame should leave the memory 320 next. In doing so, the queue module 400 utilizes procedures that avoid head-of-line blocking.
- The queue control module 400 has four primary components, namely the deferred queue 402, the backup queue 404, the header select logic 406, and the XOFF mask 408. These components work in conjunction with the XON history register 420 and the cell credit manager or credit module 440 to control ingress queuing and to assist in managing flow control within switch 100. The deferred queue 402 stores the frame headers and locations in buffer memory 320 for frames waiting to be sent to a destination port 114 that is currently busy. The backup queue 404 stores the frame headers and buffer locations for frames that arrive at the input port 112 while the deferred queue 402 is sending deferred frames to their destination. The header select logic 406 determines the state of the queue control module 400, and uses this determination to select the next frame in credit memory 320 to be submitted to the FIM 160. To do this, the header select logic 406 supplies to the memory read module 350 a valid buffer address containing the next frame to be sent. The functioning of the backup queue 404, the deferred queue 402, and the header select logic 406 are described in the incorporated Fibre Channel Switch application.
- The XOFF mask 408 contains a congestion status bit for each port 110 in the switch. The XON history register 420 is used to delay updating the XOFF mask 408 under certain conditions. These two components and their interactions with the cell credit manager 440 and FIM 160 are described in more detail below.
- d) Fabric Interface Module 160
- When a Fibre Channel frame is ready to be submitted to the ingress memory subsystem 180 of I/O board 120, the queue control 400 passes the frame's routed header and pointer to the memory read portion 350. This read module 350 then takes the frame from the credit memory 320 and provides it to the fabric interface module 160. The fabric interface module 160 converts the variable-length Fibre Channel frames received from the protocol interface 150 into fixed-sized data cells acceptable to the cell-based crossbar 140. Each cell is constructed with a specially configured cell header appropriate to the cell-based switch fabric. When using the Cyclone switch fabric of Applied Micro Circuits Corporation, the cell header includes a starting sync character, the switch destination address of the egress port 114 and a priority assignment from the inbound routing module 330, a flow control field and ready bit, an ingress class of service assignment, a packet length field, and a start-of-packet and end-of-packet identifier.
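- For illustration, the cell header fields just listed can be gathered into a C structure. This is a conceptual sketch: the real Cyclone bit layout is defined by the chipset documentation, and all widths other than the ten-bit SDA and the eleven-bit queue flow control field noted in this application are assumptions.

```c
/* Conceptual collection of the cell header fields described above. */
struct cell_header {
    unsigned sync     : 8;   /* starting sync character */
    unsigned sda      : 10;  /* switch destination address of egress port */
    unsigned priority : 3;   /* priority from the inbound routing module */
    unsigned qfc      : 11;  /* queue flow control field */
    unsigned rdy      : 1;   /* ready bit; 0 signals XOFF to the source */
    unsigned cos      : 3;   /* ingress class of service assignment */
    unsigned len      : 16;  /* packet length field */
    unsigned sop      : 1;   /* start-of-packet identifier */
    unsigned eop      : 1;   /* end-of-packet identifier */
};
```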
- When necessary, the preferred embodiment of the fabric interface 160 creates fill data to compensate for the speed difference between the memory controller 310 output data rate and the ingress data rate of the cell-based crossbar 140. This process is described in more detail in the incorporated Fibre Channel Switch application.
- Egress data cells are received from the crossbar 140 and stored in the egress memory subsystem 182. When these cells leave the eMS 182, they enter the egress portion of the fabric interface module 160. The FIM 160 then examines the cell headers, removes fill data, and concatenates the cell payloads to re-construct Fibre Channel frames with extended SOF/EOF codes. If necessary, the FIM 160 uses a small buffer to smooth gaps within frames caused by cell header and fill data removal.
- In the preferred embodiment, there are multiple links between each PPD 130 and the iMS 180. Each separate link uses a separate FIM 160. Preferably, each port 110 on the PPD 130 is given a separate link to the iMS 180, and therefore each port 110 is assigned a separate FIM 160.
- e) Outbound Processor Module 450
- The FIM 160 then submits the frames to the outbound processor module (OPM) 450. A separate OPM 450 is used for each port 110 on the PPD 130. The outbound processor module 450 checks each frame's CRC, and handles the necessary buffering between the fabric interface 160 and the ports 110 to account for their different data transfer rates. The primary job of the outbound processor modules 450 is to handle data frames received from the cell-based crossbar 140 that are destined for one of the Fibre Channel ports 110. This data is submitted to the link controller module 300, which replaces the extended SOF/EOF codes with standard Fibre Channel SOF/EOF characters, performs 8b/10b encoding, and sends data frames through its SERDES to the Fibre Channel port 110.
- The components of the PPD 130 can communicate with the microprocessor 124 on the I/O board 120, 122 through the microprocessor interface 360. Using the microprocessor interface 360, the microprocessor 124 can read and write registers on the PPD 130 and receive interrupts from the PPDs 130. This communication occurs over a microprocessor communication path 362. The microprocessor 124 also uses the microprocessor interface 360 to communicate with the ports 110 and with other processors 124 over the cell-based switch fabric.
- 3. Queues
- a) Class of Service Queue 280
- FIG. 4 shows two switches 260, 270 communicating over an interswitch link 230. The ISL 230 connects an egress port 114 on upstream switch 260 with an ingress port 112 on downstream switch 270. The egress port 114 is located on the first PPD 262 (labeled PPD 0) on the first I/O Board 264 (labeled I/O Board 0) on switch 260. This I/O board 264 contains a total of four PPDs 130, each containing four ports 110. This means I/O board 264 has a total of sixteen ports 110, numbered 0 through 15. In FIG. 4, switch 260 contains thirty-one other I/O boards, meaning that switch 260 has a total of five hundred and twelve ports 110. This particular configuration of I/O Boards, PPDs 130, and ports 110 is for exemplary purposes only, and other configurations would clearly be within the scope of the present invention.
- I/O Board 264 has a single egress memory subsystem 182 to hold all of the data received from the crossbar 140 (not shown) for its sixteen ports 110. The data in eMS 182 is controlled by the egress priority queue 192 (also not shown). In the preferred embodiment, the ePQ 192 maintains the data in the eMS 182 in a plurality of output class of service queues (O_COS_Q) 280. Data for each port 110 on the I/O Board 264 is kept in a total of “n” O_COS queues, with the number n reflecting the number of virtual channels 240 defined to exist on the ISL 230. When cells are received from the crossbar 140, the eMS 182 and ePQ 192 add the cell to the appropriate O_COS_Q 280 based on the destination SDA and priority value assigned to the cell. This information was placed in the cell header as the cell was created by the ingress FIM 160. The cells are then removed from the O_COS_Q 280 and are submitted to the PPD 262 for the egress port 114, which converts the cells back into a Fibre Channel frame and sends it across the ISL 230 to the downstream switch 270.
- b) Virtual Output Queue 290
- The frame enters switch 270 over the ISL 230 through ingress port 112. This ingress port 112 is actually the second port (labeled port 1) found on the first PPD 272 (labeled PPD 0) on the first I/O Board 274 (labeled I/O Board 0) on switch 270. Like the I/O board 264 on switch 260, this I/O board 274 contains a total of four PPDs 130, with each PPD 130 containing four ports 110. With a total of thirty-two I/O boards, switch 270 has the same five hundred and twelve ports as switch 260.
- When the frame is received at port 112, it is placed in credit memory 320. The D_ID of the frame is examined, and the frame is queued and a routing determination is made as described above. Assuming that the destination port on switch 270 is not XOFFed according to the XOFF mask 408 servicing input port 112, the frame will be subdivided into cells and forwarded to the ingress memory subsystem 180.
- The iMS 180 is organized and controlled by the ingress priority queue 190, which is responsible for ensuring in-order delivery of data cells and packets. To accomplish this, the iPQ 190 organizes the data in its iMS 180 into a number (“m”) of different virtual output queues (V_O_Qs) 290. To avoid head-of-line blocking, a separate V_O_Q 290 is established for every destination within the switch 270. In switch 270, this means that there are at least five hundred forty-four V_O_Qs 290 (five hundred twelve physical ports 110 and thirty-two microprocessors 124) in iMS 180. The iMS 180 places incoming data on the appropriate V_O_Q 290 according to the switch destination address assigned to that data by the routing module 330 in PPD 272.
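- A minimal sketch of the virtual output queue placement just described, assuming the 544 destinations of this example; the types and names are hypothetical stand-ins for the queues the iPQ maintains inside the iMS.

```c
#define NUM_DESTS 544   /* 512 physical ports + 32 microprocessors */

struct cell_list { void *head, *tail; };   /* hypothetical queue type */

static struct cell_list voq[NUM_DESTS];    /* one V_O_Q per destination */

extern void list_append(struct cell_list *q, void *cell);

void ims_enqueue(unsigned sda, void *cell)
{
    /* separate queues per destination avoid head-of-line blocking;
     * the index is the SDA assigned by the routing module */
    list_append(&voq[sda], cell);
}
```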
- Data in the V_O_Qs 290 is handled like the data in O_COS_Qs 280, such as by using round robin servicing. When data is removed from a V_O_Q 290, it is submitted to the crossbar 140 and provided to an eMS 182 on the switch 270.
- c) Virtual Input Queue 282
- FIG. 4 also shows a virtual input queue structure 282 within each ingress port 112 in downstream switch 270. Each of these V_I_Qs 282 corresponds to one of the virtual channels 240 on the ISL 230, which in turn corresponds to one of the O_COS_Qs 280 on the upstream switch. By assigning frames to a V_I_Q 282 in ingress port 112, the downstream switch 270 can identify which O_COS_Q 280 in switch 260 was assigned to the frame. As a result, if a particular data frame encounters a congested port within the downstream switch 270, the switch 270 is able to communicate that congestion to the upstream switch by performing flow control for the virtual channel 240 assigned to that V_I_Q 282.
- 4. Flow Control in Switch
- a) XOFF Flow Control between iMS 180 and eMS 182
- The cell-based switch fabric used in the preferred embodiment of the present invention can be considered to include the memory subsystems 180, 182, the priority queues 190, 192, the crossbar 140, and the arbiter 170. As described above, these elements can be obtained commercially from companies such as Applied Micro Circuits Corporation. This switch fabric utilizes a variety of flow control mechanisms to prevent internal buffer overflows, to control the flow of cells into the cell-based switch fabric, and to receive flow control instructions to stop cells from exiting the switch fabric.
- XOFF internal flow control within the cell-based switch fabric is shown as communication 500 in FIG. 5. This flow control serves to stop data cells from being sent from iMS 180 to eMS 182 over the crossbar 140 in situations where the eMS 182 or one of the O_COS_Qs 280 in the eMS 182 is becoming full. If there were no flow control, congestion at an egress port 114 would prevent data in the port's associated O_COS_Qs 280 from exiting the switch 100. If the iMS 180 were allowed to keep sending data to these queues 280, eMS 182 would overflow and data would be lost.
- This flow control works as follows. When cell occupancy of an O_COS_Q 280 reaches a threshold, an XOFF signal is generated internal to the switch fabric to stop transmission of data from the iMS 180 to these O_COS_Qs 280. The preferred Cyclone switch fabric uses three different thresholds, namely a routine threshold, an urgent threshold, and an emergency threshold. Each threshold creates a corresponding type of XOFF signal to the iMS 180.
- Unfortunately, since the V_O_Qs 290 in iMS 180 are not organized into the individual classes of service for each possible output port 114, the XOFF signal generated by the eMS 182 cannot simply turn off data for a single O_COS_Q 280. In fact, due to the manner in which the cell-based fabric addresses individual ports, the XOFF signal is not even specific to a single congested port 110. Rather, in the case of the routine XOFF signal, the iMS 180 responds by stopping all cell traffic to the group of four ports 110 found on the PPD 130 that contains the congested egress port 114. Urgent and emergency XOFF signals cause the iMS 180 and arbiter 170 to stop all cell traffic to the affected egress I/O board 122. In the case of routine and urgent XOFF signals, the eMS 182 is able to accept additional packets of data before the iMS 180 stops sending data. Emergency XOFF signals mean that new packets arriving at the eMS 182 will be discarded.
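- The escalating responses to the routine, urgent, and emergency thresholds can be sketched as follows; the enum and function names are assumptions, not the fabric's actual interface.

```c
enum xoff_level { XOFF_ROUTINE, XOFF_URGENT, XOFF_EMERGENCY };

extern void stop_traffic_to_port_group(unsigned ppd); /* 4-port group */
extern void stop_traffic_to_board(unsigned board);    /* whole I/O board */

void on_ocos_threshold(enum xoff_level lvl, unsigned ppd, unsigned board)
{
    switch (lvl) {
    case XOFF_ROUTINE:
        /* stop cell traffic only to the PPD holding the congested port */
        stop_traffic_to_port_group(ppd);
        break;
    case XOFF_URGENT:
    case XOFF_EMERGENCY:
        /* stop all cell traffic to the affected egress I/O board; at
         * the emergency level, new packets reaching the eMS are lost */
        stop_traffic_to_board(board);
        break;
    }
}
```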
- b) Backplane Credit Flow Control
- The iPQ 190 also uses a backplane credit flow control 510 (shown in FIG. 6) to manage the traffic from the iMS 180 to the different egress memory subsystems 182 more granularly than the XOFF signals 500 described above. For every packet submitted to an egress port 114, the iPQ 190 decrements its “backplane” credit count for that port 114. When the packet is transmitted out of the eMS 182, a backplane credit is returned to the iPQ 190. If a particular O_COS_Q 280 cannot submit data to an ISL 230 (such as when the associated virtual channel 240 has an XOFF status), credits will not be returned to the iPQ 190 that submitted those packets. Eventually, the iPQ 190 will run out of credits for that egress port 114, and will stop making bids to the arbiter 170 for these packets. These packets will then be held in the iMS 180.
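- A minimal sketch of this per-port backplane credit accounting; the initial credit depth and all names are assumptions.

```c
#include <stdbool.h>

#define NUM_PORTS       512
#define INITIAL_CREDITS 16    /* placeholder; the real depth is unstated */

static int bp_credit[NUM_PORTS];   /* backplane credits per egress port */

void bp_init(void)
{
    for (int i = 0; i < NUM_PORTS; i++)
        bp_credit[i] = INITIAL_CREDITS;
}

/* The iPQ only bids to the arbiter while credits remain for the port. */
bool may_bid(unsigned egress_port)
{
    return bp_credit[egress_port] > 0;
}

void on_packet_submitted(unsigned egress_port) { bp_credit[egress_port]--; }
void on_credit_returned(unsigned egress_port)  { bp_credit[egress_port]++; }
```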
- Note that even though only a single O_COS_Q 280 is not sending data, the iPQ 190 only maintains credits on a port 110 basis, not a class of service basis. Thus, the affected iPQ 190 will stop sending all data to the port 114, including data with a different class of service that could be transmitted over the port 114. In addition, since the iPQ 190 services an entire I/O board 120, all traffic to that egress port 114 from any of the ports 110 on that board 120 is stopped. Other iPQs 190 on other I/O boards 120 can continue sending data to the same egress port 114 as long as those other iPQs 190 have backplane credits for that port 114.
- Thus, the backplane credit system 510 can provide some internal switch flow control from ingress to egress on the basis of a virtual channel 240, but it is inconsistent. If two ingress ports 112 on two separate I/O boards 120 send data to two different virtual channels 240 on the same ISL 230, the use of backplane credits will flow control those channels 240 differently. One of those virtual channels 240 might have an XOFF condition. Packets to that O_COS_Q 280 will back up, and backplane credits will not be returned. The lack of backplane credits will cause the iPQ 190 sending to the XOFFed virtual channel 240 to stop sending data. Assuming the other virtual channel does not have an XOFF condition, credits from its O_COS_Q 280 to the other iPQ 190 will continue, and data will flow through that channel 240. However, if the two ingress ports 112 sending to the two virtual channels 240 utilize the same iPQ 190, the lack of returned backplane credits from the XOFFed O_COS_Q 280 will stop traffic to all virtual channels 240 on the ISL 230.
- c) Input to Fabric Flow Control 520
- The cell-based switch fabric must be able to stop the flow of data from its data source (i.e., the FIM 160) whenever the iMS 180 or a V_O_Q 290 maintained by the iPQ 190 is becoming full. The switch fabric signals this XOFF condition by setting the RDY (ready) bit to 0 on the cells it returns to the FIM 160, shown as 520 on FIG. 7. Although this XOFF is an input flow control signal between the iMS 180 and the ingress portion of the PPD 130, the signals are communicated from the eMS 182 into the egress portion of the same PPD 130. When the egress portion of the FIM 160 receives the cells with RDY set to 0, it informs the ingress portion of the PPD 130 to stop sending data to the iMS 180.
- There are three situations where the switch fabric may request an XOFF or XON state change. In every case, flow control cells 520 are sent by the eMS 182 to the egress portion of the FIM 160 to inform the PPD 130 of this updated state. These flow control cells use the RDY bit in the cell header to indicate the current status of the iMS 180 and its related queues 290.
- In the first of the three different situations, the iMS 180 may fill up to its threshold level. In this case, no more traffic should be sent to the iMS 180. When a FIM 160 receives the flow control cells 520 indicating this condition, it sends a congestion signal (or “gross_xoff” signal) 522 to the XOFF mask 408 in the memory controller 310. This signal informs the MCM 310 to stop all data traffic to the iMS 180, as described in more detail below. The FIM 160 will also broadcast an external signal called STOP_ALL 164 to the FIMs 160 on its PPD 130, as well as to the other three PPDs 130 on its I/O board 120. The STOP_ALL congestion signal 164 may take the same form as the gross_xoff congestion signal 522, or it may be differently formatted. The interconnection between the PPDs 130 and the STOP_ALL signal 164 is shown in FIG. 9, although it is possible to use the physical linkage 454 between the cell credit managers 440 to communicate this signal. When a FIM 160 receives the STOP_ALL signal 164, the gross_xoff signal 522 is sent to its memory controller 310. Since all FIMs 160 on a board 120 will receive the STOP_ALL signal 164, this signal will stop all traffic to the iMS 180. The gross_xoff signal 522 will remain on until the flow control cells 520 received by the FIM 160 indicate the buffer condition at the iMS 180 is over.
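- The gross_xoff/STOP_ALL propagation just described can be sketched in C as follows. Signal names follow the text where possible; the structure, the per-board FIM count layout, and the simplified clearing of the broadcast state are assumptions.

```c
#include <stdbool.h>

#define FIMS_PER_BOARD 16   /* four PPDs with four FIMs each */

struct fim { bool gross_xoff; };   /* stops this FIM's traffic into the iMS */

static struct fim fims[FIMS_PER_BOARD];

static void broadcast_stop_all(void)
{
    /* STOP_ALL reaches every FIM feeding the shared iMS */
    for (int i = 0; i < FIMS_PER_BOARD; i++)
        fims[i].gross_xoff = true;
}

void on_flow_control_cell(struct fim *self, bool rdy)
{
    if (!rdy) {                    /* iMS at its threshold */
        self->gross_xoff = true;
        broadcast_stop_all();
    } else {
        self->gross_xoff = false;  /* buffer condition is over (clearing
                                      the broadcast state is simplified) */
    }
}
```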
- In the second case, a single V_O_Q 290 in the iMS 180 fills up to its threshold. When this occurs, the signal 520 back to the PPD 130 will behave just as it did in the first case, with the generation of a gross_xoff congestion signal 522 and a STOP_ALL congestion signal 164. Thus, the entire iMS 180 stops receiving data, even though only a single V_O_Q 290 has become congested.
- The third case involves a failed link between a FIM 160 and the iMS 180. Flow control cells indicating this condition will cause a gross_xoff signal 522 to be sent only to the MCM 310 for the corresponding FIM 160. No STOP_ALL signal 164 is sent in this situation.
- d) Output from Fabric Flow Control 530
- When an egress portion of a PPD 130 wishes to stop traffic coming from the eMS 182, it signals an XOFF to the switch fabric by sending a cell from the input FIM 160 to the iMS 180, which is shown as flow control 530 on FIG. 8. The cell header contains a queue flow control field and a RDY bit to help define the XOFF signal. The queue flow control field is eleven bits long, and can identify the class of service, port 110, and PPD 130, as well as the desired flow status (XON or XOFF).
- The PPD 130 might desire to stop the flow of data from the eMS 182 for several reasons. First, an internal buffer within the egress portion of the FIM 160 may be approaching an overflow condition. Second, the egress portion of the PIM 150 may have received a switch-to-switch flow control signal. This signal may request stopping the flow of data over the entire link. Alternatively, the signal may reflect only a desire to stop traffic over a particular virtual channel 240 on a link. Regardless of the reason, when the FIM 160 needs to stop data traffic from the eMS 182, the FIM 160 sends an XOFF to the switch fabric in an ingress cell header directed toward iMS 180. The iMS 180 extracts each XOFF instruction from the cell header, and sends it to the eMS 182, directing the eMS 182 to XOFF or XON a particular O_COS_Q 280. If the O_COS_Q 280 is sending a packet to the FIM 160, it finishes sending the packet. The eMS 182 then stops sending fabric-to-port or fabric-to-microprocessor packets to the FIM 160.
- 5. Congestion Notification
- a) XOFF Mask 408
- The XOFF mask 408 shown in FIG. 10 is responsible for notifying the ingress ports 112 of the congestion status of all egress ports 114 and microprocessors 124 in the switch. Every port 112 has its own XOFF mask 408, as shown in FIG. 9. The XOFF mask 408 is considered part of the queue control module 400 in the memory controller 310, and is therefore shown within the MCM 310 in FIG. 9.
- Each XOFF mask 408 contains a separate status bit for all destinations within the switch 100. In one embodiment of the switch 100, there are five hundred and twelve physical ports 110 and thirty-two microprocessors 124 that can serve as a destination for a frame. Hence, the XOFF mask 408 uses a 544 by 1 lookup table 410 to store the “XOFF” status of each destination. If a bit in XOFF lookup table 410 is set, the port 110 corresponding to that bit is busy and cannot receive any frames.
- In the preferred embodiment, the XOFF mask 408 returns a status for a destination by first receiving the switch destination address for that port 110 or microprocessor 124 on SDA input 412. The lookup table 410 is examined for the SDA on input 412, and if the corresponding bit is set, the XOFF mask 408 asserts a signal on “defer” output 414, which indicates to the rest of the queue control module 400 that the selected port 110 or processor 124 is busy. This construction of the XOFF mask 408 is the preferred way to store the congestion status of possible destinations at each port 110. Other ways are possible, as long as they can quickly respond to a status query about a destination with the congestion status for that destination.
signal 414. In addition, theXOFF mask 408 receives the gross_xoff signal 522 from its associatedFIM 160. Thissignal 522 is ORed with the output of the lookup table 410 in order to generate the defersignal 414. This means that whenever thegross_xoff signal 522 is set, the defersignal 414 will also be set, effectively stopping all traffic to theiMS 180. In another embodiment (not shown), a force defer signal that is controlled by themicroprocessor 124 is also able to cause the defersignal 414 to go on. When the defersignal 414 is set, it informs the headerselect logic 406 and the remaining elements of thequeue module 400 that theport 110 having the address on next frame header output 415 is congested, and this frame should be stored on the deferredqueue 402. - b)
- b) XON History Register 420
- The XON history register 420 is used to record the history of the XON status of all destinations in the switch 100. Under the procedure established for deferred queuing, the XOFF mask 408 cannot be updated with an XON event when the queue control 400 is servicing deferred frames in the deferred queue 402. During that time, whenever a port 110 changes status from XOFF to XON, the XOFF mask 408 will ignore (or not receive) the XOFF signal 452 from the cell credit manager 440 and will therefore not update its lookup table 410. The signal 452 from the cell credit manager 440 will, however, update the lookup table 422 within the XON history register 420. Thus, the XON history register 420 maintains the current XON status of all ports 110. When the update signal 416 is made active by the header select 406 portion of the queue control module 400, the entire content of the lookup table 422 in the XON history register 420 is transferred to the lookup table 410 of the XOFF mask 408. Registers within the table 422 containing a zero (having a status of XON) will cause corresponding registers within the XOFF mask lookup table 410 to be reset to zero. The dual register setup allows XOFFs to be written directly to the XOFF mask 408 at any time the cell credit manager 440 requires traffic to be halted, and causes XONs to be applied only when the logic within the queue control module 400 allows for a change to an XON value. While a separate queue control module 400 and its associated XOFF mask 408 are necessary for each port 110 in the PPD 130, only one XON history register 420 is necessary to service all four ports 110 in the PPD 130, which again is shown in FIG. 9.
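- The interplay between the XOFF mask and the XON history register can be modeled as follows; this is an illustrative C model, not the register-level implementation.

```c
#include <stdint.h>
#include <string.h>

#define NUM_DESTS 544

/* XOFFs are applied to both tables at once; XONs accumulate only in
 * the history table until the update signal copies it into the mask. */
struct congestion_tables {
    uint8_t mask[NUM_DESTS];      /* XOFF mask lookup table */
    uint8_t history[NUM_DESTS];   /* XON history lookup table */
};

void on_xoff_event(struct congestion_tables *t, unsigned sda, uint8_t xoff)
{
    t->history[sda] = xoff;       /* history always tracks current status */
    if (xoff)
        t->mask[sda] = 1;         /* XOFFs take effect immediately */
    /* an XON is deliberately not written to the mask here */
}

void on_update_signal(struct congestion_tables *t)
{
    /* zeros (XON) in the history clear the matching mask bits */
    memcpy(t->mask, t->history, sizeof t->mask);
}
```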
- c) Cell Credit Manager 440
- The cell credit manager or credit module 440 sets the XOFF/XON status of the possible destination ports 110 in the lookup tables 410, 422 of the XOFF mask 408 and the XON history register 420. To update these tables 410, 422, the cell credit manager 440 maintains a cell credit count of every cell in the virtual output queues 290 of the iMS 180. Every time a cell addressed to a particular SDA leaves the FIM 160 and enters the iMS 180, the FIM 160 informs the credit module 440 through a cell credit event signal 442. The credit module 440 then decrements the cell count for that SDA. Every time a cell for that destination leaves the iMS 180, the credit module 440 is again informed and adds a credit to the count for the associated SDA. The iPQ 190 sends this credit information back to the credit module 440 by sending a cell containing the cell credit back to the FIM 160 through the eMS 182. The FIM 160 then sends an increment credit signal 442 to the cell credit manager 440. This cell credit flow control is designed to prevent the occurrence of more drastic levels of flow control from within the cell-based switch fabric described above, since these flow control signals 500-520 can result in multiple blocked ports 110, shutting down an entire iMS 180, or even the loss of data.
- In the preferred embodiment, the cell credits are tracked through increment and decrement credit events 442 received from FIM 160. These events are stored in dedicated increment FIFOs 444 and decrement FIFOs 446. Each FIM 160 is associated with a separate increment FIFO 444 and a separate decrement FIFO 446, although ports 1-3 are shown as sharing FIFOs 444, 446. Decrement FIFOs 446 contain SDAs for cells that have entered the iMS 180. Increment FIFOs 444 contain SDAs for cells that have left the iMS 180. These FIFOs 444, 446 feed the credit counts that the credit module 440 maintains for each SDA in its cell credit accumulator 447. In the preferred embodiment, the cell credit accumulator 447 is able to handle one increment event from one of the FIFOs 444 and one decrement event from one of the FIFOs 446 at the same time. An event select logic services the FIFOs 444, 446 and applies their events to the accumulator 447 so as to empty the FIFOs 444, 446 as quickly as possible.
- The accumulator 447 maintains separate credit counts for each SDA, with each count reflecting the number of cells contained within the iMS 180 for a given SDA. A compare module 448 detects when the count for an SDA within accumulator 447 crosses an XOFF or XON threshold stored in threshold memory 449. When a threshold is crossed, the compare module 448 causes a driver to send the appropriate XOFF or XON event 452 to the XOFF mask 408 and the XON history register 420. If the count gets too low, then that SDA is XOFFed. This means that Fibre Channel frames that are to be routed to that SDA are held in the credit memory 320 by queue control module 400. After the SDA is XOFFed, the credit module 440 waits for the count for that SDA to rise to a certain level, and then the SDA is XONed, which instructs the queue control module 400 to release frames for that destination from the credit memory 320. The XOFF and XON thresholds in threshold memory 449 can be different for each individual SDA, and are programmable by the processor 124.
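- A minimal C model of the accumulator and compare-module behavior just described, with hysteresis between the XOFF and XON thresholds; the structure and names are assumptions.

```c
#include <stdbool.h>

#define NUM_DESTS 544

struct credit_mgr {
    int  count[NUM_DESTS];        /* cells credited per SDA */
    int  xoff_thresh[NUM_DESTS];  /* programmable per SDA */
    int  xon_thresh[NUM_DESTS];   /* programmable per SDA */
    bool xoffed[NUM_DESTS];
};

extern void send_xoff_event(unsigned sda, bool xoff); /* to masks/history */

/* XOFF when the count falls below one threshold; XON only after it
 * climbs back to the (higher) XON threshold. */
static void compare(struct credit_mgr *m, unsigned sda)
{
    if (!m->xoffed[sda] && m->count[sda] < m->xoff_thresh[sda]) {
        m->xoffed[sda] = true;
        send_xoff_event(sda, true);   /* frames held in credit memory */
    } else if (m->xoffed[sda] && m->count[sda] >= m->xon_thresh[sda]) {
        m->xoffed[sda] = false;
        send_xoff_event(sda, false);  /* deferred frames released */
    }
}

void on_decrement(struct credit_mgr *m, unsigned sda) /* cell entered iMS */
{
    m->count[sda]--;
    compare(m, sda);
}

void on_increment(struct credit_mgr *m, unsigned sda) /* cell left iMS */
{
    m->count[sda]++;
    compare(m, sda);
}
```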
- When an XOFF event or an XON event occurs, the credit module 440 sends an XOFF instruction 452 to the XON history register 420 and all four XOFF masks 408 in its PPD 130. In the preferred embodiment, the XOFF instruction 452 is a three-part signal identifying the SDA, the new XOFF status, and a validity signal.
- In the above description, each cell credit manager 440 receives communications from the FIMs 160 on its PPD 130 regarding the cells that each FIM 160 submits to the iMS 180. The FIMs 160 also report back to the cell credit manager 440 when those cells are submitted by the iMS 180 over the crossbar 140. As long as the system works as described, the cell credit managers 440 are able to track the status of all cells submitted to the iMS 180. Even though each cell credit manager 440 is only tracking cells related to its PPD 130 (approximately one fourth of the total cells passing through the iMS 180), this information could be used to implement a useful congestion notification system.
- Unfortunately, the ingress memory subsystem 180 of the preferred embodiment manufactured by AMCC does not return cell credit information to the same FIM 160 that submitted the cell. In fact, the cell credit relating to a cell submitted by the first FIM 160 on the first PPD 130 might be returned by the iMS 180 to the last FIM 160 on the last PPD 130. Consequently, the cell credit managers 440 cannot assume that each decrement credit event 442 they receive relating to a cell entering the iMS 180 will ever result in a related increment credit event 442 being returned to them when that cell leaves the iMS 180. The increment credit event 442 may very well end up at another cell credit manager 440.
- To overcome this issue, an alternative embodiment of the present invention has the four cell credit managers 440 on an I/O board 120 share their cell credit events 442 in a master/slave relationship. In this embodiment, each board 120, 122 contains one “master” cell credit manager 441 and three “slave” cell credit managers 440. When a slave unit 440 receives a cell credit event signal 442 from a FIM 160, the signal 442 is forwarded to the master cell credit manager 441 over a special XOFF bus 454 (as seen in FIG. 9). The master unit 441 receives cell credit event signals 442 from the three slave units 440 as well as the FIMs 160 that directly connect to the master unit 441. In this way, the master cell credit manager 441 receives the cell credit event signals 442 from all of the FIMs 160 on an I/O board 120. This allows the master unit to maintain a credit count for each SDA in its accumulator 447 that reflects all data cells entering and leaving the iMS 180.
- The master cell credit manager 441 is solely responsible for maintaining the credit counts and for comparing the credit counts with the threshold values stored in its threshold memory 449. When a threshold is crossed, the master unit 441 sends an XOFF or XON event 452 to its associated XON history register 420 and XOFF masks 408. In addition, the master unit 441 sends an instruction to the slave cell credit managers 440 to send the same XOFF or XON event 452 to their XON history registers 420 and XOFF masks 408. In this manner, the four cell credit managers 440, 441 work together to submit the same XOFF or XON event 452 to all four XON history registers 420 and all sixteen XOFF masks 408 on the I/O board 120, 122.
- Due to error probabilities, there is a possibility that the cell credit counts in accumulator 447 may drift from actual values over time. The present invention overcomes this issue by periodically re-syncing these counts. To do this, the FIM 160 toggles a ‘state’ bit in the headers of all cells sent to the iMS 180 to reflect a transition in the system's state. At the same time, the credit counters in cell credit accumulator 447 are restored to full credit. Since each of the cell credits returned from the iMS 180/eMS 182 includes an indication of the value of the state bit in the cell, it is possible to differentiate credits relating to cells sent before the state change. Any credits received by the FIM 160 that do not have the proper state bit are ignored. After the iMS 180 recognizes the state change, credits will only be returned for those cells indicating the new state. In the preferred embodiment, this changing of the state bit and the re-syncing of the credits in cell credit accumulator 447 occurs approximately every eight minutes, although this time period is adjustable under the control of the processor 124.
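- The periodic re-sync just described can be modeled as follows; the full-credit value and all names are assumptions.

```c
#include <stdbool.h>

#define NUM_DESTS   544
#define FULL_CREDIT 256   /* placeholder for the full credit value */

static bool current_state;        /* mirrored in outgoing cell headers */
static int  credit[NUM_DESTS];

/* Run periodically (about every eight minutes in the description). */
void resync_credits(void)
{
    current_state = !current_state;   /* new cells carry the new state */
    for (int i = 0; i < NUM_DESTS; i++)
        credit[i] = FULL_CREDIT;      /* counters restored to full */
}

void on_returned_credit(unsigned sda, bool state_bit)
{
    if (state_bit != current_state)
        return;                       /* stale credit; ignore it */
    credit[sda]++;
}
```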
- The many features and advantages of the invention are apparent from the above description. Numerous modifications and variations will readily occur to those skilled in the art. For instance, persons of ordinary skill could easily reconfigure the various components described above into different elements, each of which has a slightly different functionality than those described. The component reconfigurations do not fundamentally alter the present invention. Since such modifications are possible, the invention is not to be limited to the exact construction and operation illustrated and described. Rather, the present invention should be limited only by the following claims.
Claims (53)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/873,329 US20050088969A1 (en) | 2001-12-19 | 2004-06-21 | Port congestion notification in a switch |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/020,968 US7260104B2 (en) | 2001-12-19 | 2001-12-19 | Deferred queuing in a buffered switch |
US10/873,329 US20050088969A1 (en) | 2001-12-19 | 2004-06-21 | Port congestion notification in a switch |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/020,968 Continuation-In-Part US7260104B2 (en) | 2001-06-13 | 2001-12-19 | Deferred queuing in a buffered switch |
Publications (1)
Publication Number | Publication Date |
---|---|
US20050088969A1 true US20050088969A1 (en) | 2005-04-28 |
Family
ID=21801585
Family Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/020,968 Expired - Fee Related US7260104B2 (en) | 2001-06-13 | 2001-12-19 | Deferred queuing in a buffered switch |
US10/873,329 Abandoned US20050088969A1 (en) | 2001-12-19 | 2004-06-21 | Port congestion notification in a switch |
US10/873,430 Expired - Fee Related US7773622B2 (en) | 2001-12-19 | 2004-06-21 | Deferred queuing in a buffered switch |
US12/826,959 Expired - Fee Related US8379658B2 (en) | 2001-12-19 | 2010-06-30 | Deferred queuing in a buffered switch |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/020,968 Expired - Fee Related US7260104B2 (en) | 2001-06-13 | 2001-12-19 | Deferred queuing in a buffered switch |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/873,430 Expired - Fee Related US7773622B2 (en) | 2001-12-19 | 2004-06-21 | Deferred queuing in a buffered switch |
US12/826,959 Expired - Fee Related US8379658B2 (en) | 2001-12-19 | 2010-06-30 | Deferred queuing in a buffered switch |
Country Status (5)
Country | Link |
---|---|
US (4) | US7260104B2 (en) |
EP (1) | EP1466449A1 (en) |
AU (1) | AU2002366842A1 (en) |
CA (1) | CA2470758A1 (en) |
WO (1) | WO2003055157A1 (en) |
Families Citing this family (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7260104B2 (en) * | 2001-12-19 | 2007-08-21 | Computer Network Technology Corporation | Deferred queuing in a buffered switch |
US7277448B1 (en) * | 2003-06-27 | 2007-10-02 | Cisco Technology, Inc. | Hierarchical scheduler inter-layer eligibility deferral |
US20050089054A1 (en) * | 2003-08-11 | 2005-04-28 | Gene Ciancaglini | Methods and apparatus for provisioning connection oriented, quality of service capabilities and services |
US7539143B2 (en) * | 2003-08-11 | 2009-05-26 | Netapp, Inc. | Network switching device ingress memory system |
KR100612442B1 (en) * | 2004-01-26 | 2006-08-16 | 삼성전자주식회사 | Buffer switch and scheduling method thereof |
US20050281282A1 (en) * | 2004-06-21 | 2005-12-22 | Gonzalez Henry J | Internal messaging within a switch |
US7916743B2 (en) * | 2004-11-17 | 2011-03-29 | Jinsalas Solutions, Llc | System and method for improved multicast performance |
US20060174050A1 (en) * | 2005-01-31 | 2006-08-03 | International Business Machines Corporation | Internal data bus interconnection mechanism utilizing shared buffers supporting communication among multiple functional components of an integrated circuit chip |
US7136954B2 (en) * | 2005-01-31 | 2006-11-14 | International Business Machines Corporation | Data communication method and apparatus utilizing credit-based data transfer protocol and credit loss detection mechanism |
US7493426B2 (en) * | 2005-01-31 | 2009-02-17 | International Business Machines Corporation | Data communication method and apparatus utilizing programmable channels for allocation of buffer space and transaction control |
US7548513B2 (en) * | 2005-02-17 | 2009-06-16 | Intel Corporation | Techniques to provide recovery receive queues for flooded queues |
WO2007138250A2 (en) * | 2006-05-25 | 2007-12-06 | Solarflare Communications Incorporated | Computer system with lock-protected queues for sending and receiving data |
US8068429B2 (en) * | 2007-05-31 | 2011-11-29 | Ixia | Transmit scheduling |
US8135025B2 (en) * | 2009-06-03 | 2012-03-13 | Microsoft Corporation | Asynchronous communication in an unstable network |
US8644140B2 (en) * | 2009-09-09 | 2014-02-04 | Mellanox Technologies Ltd. | Data switch with shared port buffers |
US8699491B2 (en) * | 2011-07-25 | 2014-04-15 | Mellanox Technologies Ltd. | Network element with shared buffers |
WO2013022427A1 (en) * | 2011-08-08 | 2013-02-14 | Hewlett-Packard Development Company, L.P. | Fabric chip having trunked links |
EP2742653A4 (en) * | 2011-08-08 | 2015-04-15 | Hewlett Packard Development Co | Fabric chip having a port resolution module |
US9013997B2 (en) * | 2012-06-01 | 2015-04-21 | Broadcom Corporation | System for performing distributed data cut-through |
US9582440B2 (en) | 2013-02-10 | 2017-02-28 | Mellanox Technologies Ltd. | Credit based low-latency arbitration with data transfer |
US8989011B2 (en) | 2013-03-14 | 2015-03-24 | Mellanox Technologies Ltd. | Communication over multiple virtual lanes using a shared buffer |
CA2819539C (en) * | 2013-06-21 | 2021-01-12 | Ibm Canada Limited - Ibm Canada Limitee | Dynamic management of integration protocols |
US9641465B1 (en) | 2013-08-22 | 2017-05-02 | Mellanox Technologies, Ltd | Packet switch with reduced latency |
US9548960B2 (en) | 2013-10-06 | 2017-01-17 | Mellanox Technologies Ltd. | Simplified packet routing |
US10601713B1 (en) * | 2013-10-15 | 2020-03-24 | Marvell Israel (M.I.S.L) Ltd. | Methods and network device for performing cut-through |
CN104679667B (en) * | 2013-11-28 | 2017-11-28 | 中国航空工业集团公司第六三一研究所 | Efficient sampling port amortization management method |
US9372500B2 (en) | 2014-02-27 | 2016-06-21 | Applied Micro Circuits Corporation | Generating a timeout signal based on a clock counter associated with a data request |
US9325641B2 (en) | 2014-03-13 | 2016-04-26 | Mellanox Technologies Ltd. | Buffering schemes for communication over long haul links |
US9584429B2 (en) | 2014-07-21 | 2017-02-28 | Mellanox Technologies Ltd. | Credit based flow control for long-haul links |
US10229230B2 (en) | 2015-01-06 | 2019-03-12 | International Business Machines Corporation | Simulating a large network load |
US9946819B2 (en) | 2015-01-06 | 2018-04-17 | International Business Machines Corporation | Simulating a large network load |
US9342388B1 (en) * | 2015-12-02 | 2016-05-17 | International Business Machines Corporation | Dynamic queue alias |
US10951549B2 (en) | 2019-03-07 | 2021-03-16 | Mellanox Technologies Tlv Ltd. | Reusing switch ports for external buffer network |
US11558316B2 (en) | 2021-02-15 | 2023-01-17 | Mellanox Technologies, Ltd. | Zero-copy buffering of traffic of long-haul links |
US11973696B2 (en) | 2022-01-31 | 2024-04-30 | Mellanox Technologies, Ltd. | Allocation of shared reserve memory to queues in a network device |
US20240070060A1 (en) * | 2022-08-30 | 2024-02-29 | Micron Technology, Inc. | Synchronized request handling at a memory device |
Family Cites Families (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2625392B1 (en) * | 1987-12-24 | 1993-11-26 | Quinquis Jean Paul | CIRCUIT FOR MANAGING BUFFER WRITE POINTERS IN PARTICULAR FOR SELF-ROUTING PACKET TIME SWITCH |
JP3269273B2 (en) * | 1994-09-02 | 2002-03-25 | 三菱電機株式会社 | Cell switching device and cell switching system |
GB9509484D0 (en) * | 1995-05-10 | 1995-07-05 | Gen Datacomm Adv Res | Atm network switch |
US6442172B1 (en) * | 1996-07-11 | 2002-08-27 | Alcatel Internetworking, Inc. | Input buffering and queue status-based output control for a digital traffic switch |
JPH1032585A (en) * | 1996-07-18 | 1998-02-03 | Nec Corp | Atm switch control system |
JP3156623B2 (en) | 1997-01-31 | 2001-04-16 | 日本電気株式会社 | Fiber channel fabric |
KR100247022B1 (en) * | 1997-06-11 | 2000-04-01 | 윤종용 | A single switch element of atm switching system and buffer thresholds value decision method |
US6094435A (en) * | 1997-06-30 | 2000-07-25 | Sun Microsystems, Inc. | System and method for a quality of service in a multi-layer network element |
US6091707A (en) * | 1997-12-18 | 2000-07-18 | Advanced Micro Devices, Inc. | Methods and apparatus for preventing under-flow conditions in a multiple-port switching device |
US6078959A (en) * | 1998-01-29 | 2000-06-20 | Opuswave Networks, Inc. | Subscriber-originated call deferred queuing |
JP3001502B2 (en) | 1998-05-20 | 2000-01-24 | 九州日本電気通信システム株式会社 | ATM switch module, ATM switch capacity expansion method, and ATM routing information setting method |
IL125271A0 (en) * | 1998-07-08 | 1999-03-12 | Galileo Technology Ltd | Head of line blocking |
US6473827B2 (en) | 1998-12-22 | 2002-10-29 | Ncr Corporation | Distributed multi-fabric interconnect |
US7120117B1 (en) * | 2000-08-29 | 2006-10-10 | Broadcom Corporation | Starvation free flow control in a shared memory switching device |
US6952401B1 (en) * | 1999-03-17 | 2005-10-04 | Broadcom Corporation | Method for load balancing in a network switch |
US6625121B1 (en) * | 1999-04-28 | 2003-09-23 | Cisco Technology, Inc. | Dynamically delisting and relisting multicast destinations in a network switching node |
US6904043B1 (en) * | 1999-05-21 | 2005-06-07 | Advanced Micro Devices, Inc. | Apparatus and methods for storing and processing header information in a network switch |
AU2001239595A1 (en) | 2000-03-07 | 2001-09-17 | Sun Microsystems, Inc. | Virtual channel flow control |
US7046632B2 (en) * | 2000-04-01 | 2006-05-16 | Via Technologies, Inc. | Method and switch controller for relieving flow congestion in network |
TW477133B (en) * | 2000-04-01 | 2002-02-21 | Via Tech Inc | Method for solving network congestion and Ethernet switch controller using the same |
US6987732B2 (en) * | 2000-12-15 | 2006-01-17 | Tellabs San Jose, Inc. | Apparatus and methods for scheduling packets in a broadband data stream |
US7260104B2 (en) * | 2001-12-19 | 2007-08-21 | Computer Network Technology Corporation | Deferred queuing in a buffered switch |
US7042842B2 (en) * | 2001-06-13 | 2006-05-09 | Computer Network Technology Corporation | Fiber channel switch |
US6804245B2 (en) | 2001-08-17 | 2004-10-12 | Mcdata Corporation | Compact, shared route lookup table for a fiber channel switch |
US6606322B2 (en) | 2001-08-17 | 2003-08-12 | Mcdata Corporation | Route lookup caching for a fiber channel switch |
US7423967B2 (en) * | 2002-05-09 | 2008-09-09 | Broadcom Corporation | Fairness scheme method and apparatus for pause capable and pause incapable ports |
US20060098660A1 (en) * | 2004-11-10 | 2006-05-11 | Rajesh Pal | Mechanism for automatic protection switching and apparatus utilizing same |
- 2001
  - 2001-12-19 US US10/020,968 patent/US7260104B2/en not_active Expired - Fee Related
- 2002
  - 2002-12-19 CA CA002470758A patent/CA2470758A1/en not_active Abandoned
  - 2002-12-19 AU AU2002366842A patent/AU2002366842A1/en not_active Abandoned
  - 2002-12-19 WO PCT/US2002/040519 patent/WO2003055157A1/en not_active Application Discontinuation
  - 2002-12-19 EP EP02805619A patent/EP1466449A1/en not_active Withdrawn
- 2004
  - 2004-06-21 US US10/873,329 patent/US20050088969A1/en not_active Abandoned
  - 2004-06-21 US US10/873,430 patent/US7773622B2/en not_active Expired - Fee Related
- 2010
  - 2010-06-30 US US12/826,959 patent/US8379658B2/en not_active Expired - Fee Related
Patent Citations (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4710868A (en) * | 1984-06-29 | 1987-12-01 | International Business Machines Corporation | Interconnect scheme for shared memory local networks |
US5455820A (en) * | 1993-05-20 | 1995-10-03 | Nec Corporation | Output-buffer switch for asynchronous transfer mode |
US5533201A (en) * | 1994-03-07 | 1996-07-02 | Unisys Corporation | Method and apparatus for simultaneous interconnection of multiple requestors to multiple memories |
US6067286A (en) * | 1995-04-11 | 2000-05-23 | General Datacomm, Inc. | Data network switch with fault tolerance |
US5983260A (en) * | 1995-07-19 | 1999-11-09 | Fujitsu Network Communications, Inc. | Serial control and data interconnects for coupling an I/O module with a switch fabric in a switch |
US5844887A (en) * | 1995-11-30 | 1998-12-01 | Scorpio Communications Ltd. | ATM switching fabric |
US5751969A (en) * | 1995-12-04 | 1998-05-12 | Motorola, Inc. | Apparatus and methods for predicting and managing congestion in a network |
US6967924B1 (en) * | 1996-01-29 | 2005-11-22 | Hitachi, Ltd. | Packet switching device and cell transfer control method |
US5999527A (en) * | 1996-02-02 | 1999-12-07 | Telefonaktiebolaget Lm Ericsson | Modular switch |
US5781549A (en) * | 1996-02-23 | 1998-07-14 | Allied Telesyn International Corp. | Method and apparatus for switching data packets in a data network |
US6160813A (en) * | 1997-03-21 | 2000-12-12 | Brocade Communications Systems, Inc. | Fibre channel switching system and method |
US6370145B1 (en) * | 1997-08-22 | 2002-04-09 | Avici Systems | Internet switch router |
US5974467A (en) * | 1997-08-29 | 1999-10-26 | Extreme Networks | Protocol for communicating data between packet forwarding devices via an intermediate network interconnect device |
US6421348B1 (en) * | 1998-07-01 | 2002-07-16 | National Semiconductor Corporation | High-speed network switch bus |
US6643256B1 (en) * | 1998-12-15 | 2003-11-04 | Kabushiki Kaisha Toshiba | Packet switch and packet switching method using priority control based on congestion status within packet switch |
US6335992B1 (en) * | 2000-02-15 | 2002-01-01 | Tellium, Inc. | Scalable optical cross-connect system and method transmitter/receiver protection |
US6992980B2 (en) * | 2000-06-20 | 2006-01-31 | International Business Machines Corporation | System and method for enabling a full flow control down to the sub-ports of a switch fabric |
US20020156918A1 (en) * | 2001-04-23 | 2002-10-24 | Brocade Communications Systems, Inc. | Dynamic path selection with in-order delivery within sequence in a communication network |
US20030016686A1 (en) * | 2001-05-01 | 2003-01-23 | Wynne John M. | Traffic manager for network switch port |
US20020176363A1 (en) * | 2001-05-08 | 2002-11-28 | Sanja Durinovic-Johri | Method for load balancing in routers of a network using overflow paths |
US6937607B2 (en) * | 2001-06-21 | 2005-08-30 | Alcatel | Random early discard for cell-switched data switch |
US20030026267A1 (en) * | 2001-07-31 | 2003-02-06 | Oberman Stuart F. | Virtual channels in a network switch |
US20030202474A1 (en) * | 2002-04-29 | 2003-10-30 | Brocade Communications Systems, Inc. | Frame-pull flow control in a fibre channel network |
US20040017771A1 (en) * | 2002-07-29 | 2004-01-29 | Brocade Communications Systems, Inc. | Cascade credit sharing for fibre channel links |
US20040024906A1 (en) * | 2002-07-31 | 2004-02-05 | Brocade Communications Systems, Inc. | Load balancing in a network comprising communication paths having different bandwidths |
US20040081096A1 (en) * | 2002-10-28 | 2004-04-29 | Brocade Communications Systems, Inc. | Method and device for extending usable lengths of fibre channel links |
Cited By (135)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100008375A1 (en) * | 2002-04-01 | 2010-01-14 | Cisco Technology, Inc. | Label switching in fibre channel networks |
US9350653B2 (en) | 2002-04-01 | 2016-05-24 | Cisco Technology, Inc. | Label switching in fibre channel networks |
US8462790B2 (en) | 2002-04-01 | 2013-06-11 | Cisco Technology, Inc. | Label switching in fibre channel networks |
US7830809B2 (en) | 2002-06-12 | 2010-11-09 | Cisco Technology, Inc. | Methods and apparatus for characterizing a route in a fibre channel fabric |
US20070153816A1 (en) * | 2002-06-12 | 2007-07-05 | Cisco Technology, Inc. | Methods and apparatus for characterizing a route in a fibre channel fabric |
US20070081527A1 (en) * | 2002-07-22 | 2007-04-12 | Betker Steven M | Method and system for primary blade selection in a multi-module fibre channel switch |
US7729288B1 (en) | 2002-09-11 | 2010-06-01 | Qlogic, Corporation | Zone management in a multi-module fibre channel switch |
US7184466B1 (en) * | 2002-09-12 | 2007-02-27 | Xilinx, Inc. | Radio frequency data conveyance system including configurable integrated circuits |
US20050135251A1 (en) * | 2002-10-07 | 2005-06-23 | Kunz James A. | Method and system for reducing congestion in computer networks |
US20080316942A1 (en) * | 2002-11-27 | 2008-12-25 | Cisco Technology, Inc. | Methods and devices for exchanging peer parameters between network devices |
US8605624B2 (en) | 2002-11-27 | 2013-12-10 | Cisco Technology, Inc. | Methods and devices for exchanging peer parameters between network devices |
US20110090816A1 (en) * | 2003-06-26 | 2011-04-21 | Cisco Technology, Inc. | FIBRE CHANNEL SWITCH THAT ENABLES END DEVICES IN DIFFERENT FABRICS TO COMMUNICATE WITH ONE ANOTHER WHILE RETAINING THEIR UNIQUE FIBRE CHANNEL DOMAIN_IDs |
US7876711B2 (en) | 2003-06-26 | 2011-01-25 | Cisco Technology, Inc. | Fibre channel switch that enables end devices in different fabrics to communicate with one another while retaining their unique fibre channel domain_IDs |
US8625460B2 (en) | 2003-06-26 | 2014-01-07 | Cisco Technology, Inc. | Fibre channel switch that enables end devices in different fabrics to communicate with one another while retaining their unique fibre channel domain_IDs |
US20050013318A1 (en) * | 2003-07-16 | 2005-01-20 | Fike John M. | Method and system for fibre channel arbitrated loop acceleration |
US20050013258A1 (en) * | 2003-07-16 | 2005-01-20 | Fike John M. | Method and apparatus for detecting and removing orphaned primitives in a fibre channel network |
US20050015518A1 (en) * | 2003-07-16 | 2005-01-20 | Wen William J. | Method and system for non-disruptive data capture in networks |
US20050025193A1 (en) * | 2003-07-16 | 2005-02-03 | Fike John M. | Method and apparatus for test pattern generation |
US20050018676A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Programmable pseudo virtual lanes for fibre channel systems |
US20050018663A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Method and system for power control of fibre channel switches |
US20050044267A1 (en) * | 2003-07-21 | 2005-02-24 | Dropps Frank R. | Method and system for routing and filtering network data packets in fibre channel systems |
US20050030978A1 (en) * | 2003-07-21 | 2005-02-10 | Dropps Frank R. | Method and system for managing traffic in fibre channel systems |
US7894348B2 (en) * | 2003-07-21 | 2011-02-22 | Qlogic, Corporation | Method and system for congestion control in a fibre channel switch |
US20050018671A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Method and system for keeping a fibre channel arbitrated loop open during frame gaps |
US7646767B2 (en) | 2003-07-21 | 2010-01-12 | Qlogic, Corporation | Method and system for programmable data dependant network routing |
US20050018621A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Method and system for selecting virtual lanes in fibre channel switches |
US20050018606A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Method and system for congestion control based on optimum bandwidth allocation in a fibre channel switch |
US7684401B2 (en) | 2003-07-21 | 2010-03-23 | Qlogic, Corporation | Method and system for using extended fabric features with fibre channel switch elements |
US20050018672A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Lun based hard zoning in fibre channel switches |
US20050018649A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Method and system for improving bandwidth and reducing idles in fibre channel switches |
US20050018603A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Method and system for reducing latency and congestion in fibre channel switches |
US20050018701A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Method and system for routing fibre channel frames |
US20050030893A1 (en) * | 2003-07-21 | 2005-02-10 | Dropps Frank R. | Method and system for detecting congestion and over subscription in a fibre channel network |
US20050018680A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Method and system for programmable data dependant network routing |
US7792115B2 (en) | 2003-07-21 | 2010-09-07 | Qlogic, Corporation | Method and system for routing and filtering network data packets in fibre channel systems |
US20050018650A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Method and system for configuring fibre channel ports |
US20050018675A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Multi-speed cut through operation in fibre channel |
US20050030954A1 (en) * | 2003-07-21 | 2005-02-10 | Dropps Frank R. | Method and system for programmable data dependant network routing |
US20050018674A1 (en) * | 2003-07-21 | 2005-01-27 | Dropps Frank R. | Method and system for buffer-to-buffer credit recovery in fibre channel systems using virtual and/or pseudo virtual lanes |
US20050174936A1 (en) * | 2004-02-05 | 2005-08-11 | Betker Steven M. | Method and system for preventing deadlock in fibre channel fabrics using frame priorities |
US20050174942A1 (en) * | 2004-02-05 | 2005-08-11 | Betker Steven M. | Method and system for reducing deadlock in fibre channel fabrics using virtual lanes |
US8174978B2 (en) * | 2004-03-05 | 2012-05-08 | Xyratex Technology Limited | Method for congestion management of a network, a signalling protocol, a switch, an end station and a network |
US20080253289A1 (en) * | 2004-03-05 | 2008-10-16 | Xyratex Technology Limited | Method For Congestion Management of a Network, a Signalling Protocol, a Switch, an End Station and a Network |
US7930377B2 (en) | 2004-04-23 | 2011-04-19 | Qlogic, Corporation | Method and system for using boot servers in networks |
US20060013135A1 (en) * | 2004-06-21 | 2006-01-19 | Schmidt Steven G | Flow control in a switch |
US20060023705A1 (en) * | 2004-07-27 | 2006-02-02 | Alcatel | Method and apparatus for closed loop, out-of-band backpressure mechanism |
US7453810B2 (en) * | 2004-07-27 | 2008-11-18 | Alcatel Lucent | Method and apparatus for closed loop, out-of-band backpressure mechanism |
US8295299B2 (en) | 2004-10-01 | 2012-10-23 | Qlogic, Corporation | High speed fibre channel switch element |
US20060072616A1 (en) * | 2004-10-01 | 2006-04-06 | Dropps Frank R | Method and system for LUN remapping in fibre channel networks |
US20060072580A1 (en) * | 2004-10-01 | 2006-04-06 | Dropps Frank R | Method and system for transferring data directly between storage devices in a storage area network |
US20060072473A1 (en) * | 2004-10-01 | 2006-04-06 | Dropps Frank R | High speed fibre channel switch element |
US20060087963A1 (en) * | 2004-10-25 | 2006-04-27 | Cisco Technology, Inc. | Graceful port shutdown protocol for fibre channel interfaces |
US8750094B2 (en) | 2004-11-01 | 2014-06-10 | Cisco Technology, Inc. | Trunking for fabric ports in Fibre channel switches and attached devices |
US7916628B2 (en) | 2004-11-01 | 2011-03-29 | Cisco Technology, Inc. | Trunking for fabric ports in fibre channel switches and attached devices |
US20110141906A1 (en) * | 2004-11-01 | 2011-06-16 | Cisco Technology, Inc. | Trunking for fabric ports in fibre channel switches and attached devices |
US20060092932A1 (en) * | 2004-11-01 | 2006-05-04 | Cisco Technology, Inc. | Trunking for fabric ports in fibre channel switches and attached devices |
US7649844B2 (en) * | 2004-12-29 | 2010-01-19 | Cisco Technology, Inc. | In-order fibre channel packet delivery |
US20060153186A1 (en) * | 2004-12-29 | 2006-07-13 | Cisco Technology, Inc. | In-order fibre channel packet delivery |
US20070047535A1 (en) * | 2005-08-31 | 2007-03-01 | Intel Corporation | Switching device utilizing flow-control management |
US7719982B2 (en) * | 2005-08-31 | 2010-05-18 | Intel Corporation | Switching device utilizing flow-control management |
US7680039B2 (en) * | 2005-09-02 | 2010-03-16 | Intel Corporation | Network load balancing |
US20070064605A1 (en) * | 2005-09-02 | 2007-03-22 | Intel Corporation | Network load balancing apparatus, systems, and methods |
US8428055B2 (en) | 2005-09-09 | 2013-04-23 | Juniper Networks, Inc. | Scalable central memory switching fabric |
US7903644B1 (en) | 2005-09-09 | 2011-03-08 | Juniper Networks, Inc. | Scalable central memory switching fabric |
US20110122892A1 (en) * | 2005-09-09 | 2011-05-26 | Juniper Networks, Inc. | Scalable central memory switching fabric |
US7577133B1 (en) | 2005-09-09 | 2009-08-18 | Juniper Networks, Inc. | Scalable central memory switching fabric |
US20100128735A1 (en) * | 2006-01-30 | 2010-05-27 | Juniper Networks, Inc. | Processing of partial frames and partial superframes |
US7593330B1 (en) * | 2006-01-30 | 2009-09-22 | Juniper Networks, Inc. | Processing of partial frames and partial superframes |
US8077727B2 (en) | 2006-01-30 | 2011-12-13 | Juniper Networks, Inc. | Processing of partial frames and partial superframes |
US20070268825A1 (en) * | 2006-05-19 | 2007-11-22 | Michael Corwin | Fine-grain fairness in a hierarchical switched system |
US20080168302A1 (en) * | 2007-01-10 | 2008-07-10 | International Business Machines Corporation | Systems and methods for diagnosing faults in a multiple domain storage system |
US20080168161A1 (en) * | 2007-01-10 | 2008-07-10 | International Business Machines Corporation | Systems and methods for managing faults within a high speed network employing wide ports |
US20080270638A1 (en) * | 2007-04-30 | 2008-10-30 | International Business Machines Corporation | Systems and methods for monitoring high speed network traffic via simultaneously multiplexed data streams |
US9088497B1 (en) * | 2007-05-09 | 2015-07-21 | Marvell Israel (M.I.S.L) Ltd. | Method and apparatus for switch port memory allocation |
US20090041057A1 (en) * | 2007-08-06 | 2009-02-12 | International Business Machines Corporation | Performing a recovery action in response to a credit depletion notification |
US20090043880A1 (en) * | 2007-08-06 | 2009-02-12 | International Business Machines Corporation | Credit depletion notification for transmitting frames between a port pair |
US7787375B2 (en) | 2007-08-06 | 2010-08-31 | International Business Machines Corporation | Performing a recovery action in response to a credit depletion notification |
US7975027B2 (en) * | 2007-08-06 | 2011-07-05 | International Business Machines Corporation | Credit depletion notification for transmitting frames between a port pair |
US10848426B2 (en) * | 2007-10-17 | 2020-11-24 | Dispersive Networks, Inc. | Virtual dispersive networking systems and methods |
US8644326B2 (en) * | 2008-04-04 | 2014-02-04 | Micron Technology, Inc. | Queue processing method |
US20090252167A1 (en) * | 2008-04-04 | 2009-10-08 | Finbar Naven | Queue processing method |
US8711697B1 (en) * | 2011-06-22 | 2014-04-29 | Marvell International Ltd. | Method and apparatus for prioritizing data transfer |
US10757039B2 (en) | 2014-12-24 | 2020-08-25 | Intel Corporation | Apparatus and method for routing data in a switch |
EP3238386B1 (en) * | 2014-12-24 | 2020-03-04 | Intel Corporation | Apparatus and method for routing data in a switch |
WO2016105419A1 (en) * | 2014-12-24 | 2016-06-30 | Intel Corporation | Apparatus and method for routing data in a switch |
CN107005467A (en) * | 2014-12-24 | 2017-08-01 | 英特尔公司 | Apparatus and method for routing data in a switch |
US9838330B2 (en) | 2014-12-29 | 2017-12-05 | Oracle International Corporation | System and method for supporting credit management for output ports in a networking device |
WO2016109104A1 (en) * | 2014-12-29 | 2016-07-07 | Oracle International Corporation | System and method for supporting efficient virtual output queue (voq) resource utilization in a networking device |
US9838338B2 (en) | 2014-12-29 | 2017-12-05 | Oracle International Corporation | System and method for supporting efficient virtual output queue (VOQ) resource utilization in a networking device |
CN107005487A (en) * | 2014-12-29 | 2017-08-01 | 甲骨文国际公司 | System and method for supporting efficient virtual output queue (VOQ) resource utilization in a networking device |
US9621484B2 (en) | 2014-12-29 | 2017-04-11 | Oracle International Corporation | System and method for supporting efficient buffer reallocation in a networking device |
US9832143B2 (en) | 2014-12-29 | 2017-11-28 | Oracle International Corporation | System and method for supporting efficient virtual output queue (VOQ) packet flushing scheme in a networking device |
CN106301967A (en) * | 2016-10-25 | 2017-01-04 | 杭州华为数字技术有限公司 | Data synchronization method and out-of-band management device |
US11010086B2 (en) | 2016-10-25 | 2021-05-18 | Huawei Technologies Co., Ltd. | Data synchronization method and out-of-band management device |
CN110703985A (en) * | 2016-10-25 | 2020-01-17 | 杭州华为数字技术有限公司 | Data synchronization method and out-of-band management equipment |
US11134021B2 (en) * | 2016-12-29 | 2021-09-28 | Intel Corporation | Techniques for processor queue management |
US10686872B2 (en) * | 2017-12-19 | 2020-06-16 | Xilinx, Inc. | Network interface device |
US10686731B2 (en) * | 2017-12-19 | 2020-06-16 | Xilinx, Inc. | Network interface device |
US20190190853A1 (en) * | 2017-12-19 | 2019-06-20 | Solarflare Communications, Inc. | Network Interface Device |
US20190190982A1 (en) * | 2017-12-19 | 2019-06-20 | Solarflare Communications, Inc. | Network interface device |
US11165720B2 (en) | 2017-12-19 | 2021-11-02 | Xilinx, Inc. | Network interface device |
US11394664B2 (en) | 2017-12-19 | 2022-07-19 | Xilinx, Inc. | Network interface device |
US11394768B2 (en) | 2017-12-19 | 2022-07-19 | Xilinx, Inc. | Network interface device |
US11088966B2 (en) * | 2018-11-06 | 2021-08-10 | Mellanox Technologies, Ltd. | Managing congestion in a network adapter based on host bus performance |
US11784920B2 (en) | 2019-05-23 | 2023-10-10 | Hewlett Packard Enterprise Development Lp | Algorithms for use of load information from neighboring nodes in adaptive routing |
US11899596B2 (en) | 2019-05-23 | 2024-02-13 | Hewlett Packard Enterprise Development Lp | System and method for facilitating dynamic command management in a network interface controller (NIC) |
US11757764B2 (en) | 2019-05-23 | 2023-09-12 | Hewlett Packard Enterprise Development Lp | Optimized adaptive routing to reduce number of hops |
US11765074B2 (en) | 2019-05-23 | 2023-09-19 | Hewlett Packard Enterprise Development Lp | System and method for facilitating hybrid message matching in a network interface controller (NIC) |
US11777843B2 (en) | 2019-05-23 | 2023-10-03 | Hewlett Packard Enterprise Development Lp | System and method for facilitating data-driven intelligent network |
US11750504B2 (en) | 2019-05-23 | 2023-09-05 | Hewlett Packard Enterprise Development Lp | Method and system for providing network egress fairness between applications |
US11792114B2 (en) | 2019-05-23 | 2023-10-17 | Hewlett Packard Enterprise Development Lp | System and method for facilitating efficient management of non-idempotent operations in a network interface controller (NIC) |
US11799764B2 (en) | 2019-05-23 | 2023-10-24 | Hewlett Packard Enterprise Development Lp | System and method for facilitating efficient packet injection into an output buffer in a network interface controller (NIC) |
US11818037B2 (en) | 2019-05-23 | 2023-11-14 | Hewlett Packard Enterprise Development Lp | Switch device for facilitating switching in data-driven intelligent network |
US11848859B2 (en) | 2019-05-23 | 2023-12-19 | Hewlett Packard Enterprise Development Lp | System and method for facilitating on-demand paging in a network interface controller (NIC) |
US11855881B2 (en) | 2019-05-23 | 2023-12-26 | Hewlett Packard Enterprise Development Lp | System and method for facilitating efficient packet forwarding using a message state table in a network interface controller (NIC) |
US11863431B2 (en) | 2019-05-23 | 2024-01-02 | Hewlett Packard Enterprise Development Lp | System and method for facilitating fine-grain flow control in a network interface controller (NIC) |
US11876701B2 (en) | 2019-05-23 | 2024-01-16 | Hewlett Packard Enterprise Development Lp | System and method for facilitating operation management in a network interface controller (NIC) for accelerators |
US11876702B2 (en) | 2019-05-23 | 2024-01-16 | Hewlett Packard Enterprise Development Lp | System and method for facilitating efficient address translation in a network interface controller (NIC) |
US11882025B2 (en) | 2019-05-23 | 2024-01-23 | Hewlett Packard Enterprise Development Lp | System and method for facilitating efficient message matching in a network interface controller (NIC) |
US11757763B2 (en) | 2019-05-23 | 2023-09-12 | Hewlett Packard Enterprise Development Lp | System and method for facilitating efficient host memory access from a network interface controller (NIC) |
US11902150B2 (en) | 2019-05-23 | 2024-02-13 | Hewlett Packard Enterprise Development Lp | Systems and methods for adaptive routing in the presence of persistent flows |
US11916781B2 (en) | 2019-05-23 | 2024-02-27 | Hewlett Packard Enterprise Development Lp | System and method for facilitating efficient utilization of an output buffer in a network interface controller (NIC) |
US11916782B2 (en) | 2019-05-23 | 2024-02-27 | Hewlett Packard Enterprise Development Lp | System and method for facilitating global fairness in a network |
US11929919B2 (en) | 2019-05-23 | 2024-03-12 | Hewlett Packard Enterprise Development Lp | System and method for facilitating self-managing reduction engines |
US11962490B2 (en) | 2019-05-23 | 2024-04-16 | Hewlett Packard Enterprise Development Lp | Systems and methods for per traffic class routing |
US11968116B2 (en) | 2019-05-23 | 2024-04-23 | Hewlett Packard Enterprise Development Lp | Method and system for facilitating lossy dropping and ECN marking |
US11973685B2 (en) | 2019-05-23 | 2024-04-30 | Hewlett Packard Enterprise Development Lp | Fat tree adaptive routing |
US11985060B2 (en) | 2019-05-23 | 2024-05-14 | Hewlett Packard Enterprise Development Lp | Dragonfly routing with incomplete group connectivity |
US11991072B2 (en) | 2019-05-23 | 2024-05-21 | Hewlett Packard Enterprise Development Lp | System and method for facilitating efficient event notification management for a network interface controller (NIC) |
US12003411B2 (en) | 2019-05-23 | 2024-06-04 | Hewlett Packard Enterprise Development Lp | Systems and methods for on the fly routing in the presence of errors |
US12021738B2 (en) | 2019-05-23 | 2024-06-25 | Hewlett Packard Enterprise Development Lp | Deadlock-free multicast routing on a dragonfly network |
US12034633B2 (en) | 2019-05-23 | 2024-07-09 | Hewlett Packard Enterprise Development Lp | System and method for facilitating tracer packets in a data-driven intelligent network |
US12040969B2 (en) | 2019-05-23 | 2024-07-16 | Hewlett Packard Enterprise Development Lp | System and method for facilitating data-driven intelligent network with flow control of individual applications and traffic flows |
US12058032B2 (en) | 2019-05-23 | 2024-08-06 | Hewlett Packard Enterprise Development Lp | Weighting routing |
US12058033B2 (en) | 2019-05-23 | 2024-08-06 | Hewlett Packard Enterprise Development Lp | Method and system for providing network ingress fairness between applications |
Also Published As
Publication number | Publication date |
---|---|
AU2002366842A1 (en) | 2003-07-09 |
US8379658B2 (en) | 2013-02-19 |
US7260104B2 (en) | 2007-08-21 |
WO2003055157A1 (en) | 2003-07-03 |
CA2470758A1 (en) | 2003-07-03 |
US20050088970A1 (en) | 2005-04-28 |
US20100265821A1 (en) | 2010-10-21 |
US7773622B2 (en) | 2010-08-10 |
EP1466449A1 (en) | 2004-10-13 |
US20030112818A1 (en) | 2003-06-19 |
Similar Documents
Publication | Title |
---|---|
US20050088969A1 (en) | Port congestion notification in a switch |
US7606150B2 (en) | Fibre channel switch |
US7296093B1 (en) | Network processor interface system |
US7346001B1 (en) | Systems and methods for limiting low priority traffic from blocking high priority traffic |
US7151744B2 (en) | Multi-service queuing method and apparatus that provides exhaustive arbitration, load balancing, and support for rapid port failover |
EP1779607B1 (en) | Network interconnect crosspoint switching architecture and method |
US7515537B2 (en) | Method and apparatus for rendering a cell-based switch useful for frame based protocols |
US7298739B1 (en) | System and method for communicating switch fabric control information |
US7406092B2 (en) | Programmable pseudo virtual lanes for fibre channel systems |
EP0981878B1 (en) | Fair and efficient scheduling of variable-size data packets in an input-buffered multipoint switch |
US7161906B2 (en) | Three-stage switch fabric with input device features |
EP0772323A2 (en) | Method and apparatus for tracking buffer availability |
JPH08265369A (en) | Data communication accelerating switch |
EP0709984A2 (en) | High performance path allocation system and method for a fiber optic switch |
JP3908483B2 (en) | Communication device |
US20050281282A1 (en) | Internal messaging within a switch |
US7522529B2 (en) | Method and system for detecting congestion and over subscription in a fibre channel network |
US7724666B1 (en) | Credit-based flow control over unreliable links |
US10880236B2 (en) | Switch with controlled queuing for multi-host endpoints |
US20060013135A1 (en) | Flow control in a switch |
EP1322079B1 (en) | System and method for providing gaps between data elements at ingress to a network element |
US8131854B2 (en) | Interfacing with streams of differing speeds |
US7773592B1 (en) | Method and system for routing network information |
US7039057B1 (en) | Arrangement for converting ATM cells to infiniband packets |
US7609710B1 (en) | Method and system for credit management in a networking system |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: COMPUTER NETWORK TECHNOLOGY CORPORATION, MINNESOTA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:CARLSEN, SCOTT;TORNETTA, ANTHONY G.;SCHMIDT, STEVEN G.;REEL/FRAME:015949/0358. Effective date: 20041021 |
AS | Assignment | Owner name: MCDATA SERVICES CORPORATION, CALIFORNIA. Free format text: MERGER;ASSIGNOR:COMPUTER NETWORK TECHNOLOGY CORPORATION;REEL/FRAME:021952/0245. Effective date: 20050531 |
AS | Assignment | Owner name: BANK OF AMERICA, N.A. AS ADMINISTRATIVE AGENT, CALIFORNIA. Free format text: SECURITY AGREEMENT;ASSIGNORS:BROCADE COMMUNICATIONS SYSTEMS, INC.;FOUNDRY NETWORKS, INC.;INRANGE TECHNOLOGIES CORPORATION;AND OTHERS;REEL/FRAME:022012/0204. Effective date: 20081218 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
AS | Assignment | Owner names: INRANGE TECHNOLOGIES CORPORATION, CALIFORNIA; FOUNDRY NETWORKS, LLC, CALIFORNIA; BROCADE COMMUNICATIONS SYSTEMS, INC., CALIFORNIA. Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT;REEL/FRAME:034792/0540. Effective date: 20140114 |