WO2015147840A1 - Modular input/output aggregation zone - Google Patents
- Publication number
- WO2015147840A1 PCT/US2014/032066
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- ports
- servers
- crosslink
- switch chip
- aggregation zone
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/64—Hybrid switching systems
- H04L12/6418—Hybrid transport
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
In one example implementation according to aspects of the present disclosure, a system is disclosed having a modular input/output aggregation zone to directly communicatively couple together a plurality of servers within an enclosure shared by the modular input/output aggregation zone and the plurality of servers. The example modular input/output aggregation zone includes a first switch chip having link ports configurable as uplink ports and crosslink ports, the uplink ports being communicatively coupleable to a network device and the crosslink ports being communicatively coupleable to a crosslink port of a second switch chip.
Description
MODULAR INPUT/OUTPUT AGGREGATION ZONE BACKGROUND
[0001] The electronic data that consumers and companies generate and use continues to grow in size and complexity, as do the related applications. In response, data centers housing this data and these applications have begun to implement a variety of networking and server configurations to provide access to the data and applications.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] The following detailed description references the drawings, in which:
[0003] FIG. 1 illustrates a block diagram of a system utilizing a modular input/output aggregation zone having a switch chip according to examples of the present disclosure;
[0004] FIG. 2 illustrates a block diagram of a system utilizing two modular input/output aggregation zones each having a switch chip according to examples of the present disclosure;
[0005] FIG. 3 illustrates a block diagram of a modular input/output aggregation zone having a first switch chip and a second switch chip according to examples of the present disclosure; and
[0006] FIG. 4 illustrates a block diagram of a modular input/output aggregation zone having a first switch chip, a second switch chip, and a third switch chip according to examples of the present disclosure.
DETAILED DESCRIPTION
[0007] Data centers store growing amounts of data and host increasingly complex applications. The data and applications may be distributed across numerous servers networked together in a traditional hierarchical network topology. Server application architectures, particularly those making heavy use of virtualization technology and data spread across multiple scale-out servers, may not be well-served by traditional hierarchical network switching topologies. Problems associated with a traditional hierarchical network approach may include cost, latency, and management complexity.
[0008] Cost is typically measured in terms of cost per connected server. Thus each layer of networking adds to the total solution cost, affecting the cost per connected server. This situation is particularly aggravated by high-density, low-cost servers, as many individual servers connect to a top-of-rack (TOR) switch, which then must connect to the next level network. These connections typically utilize a network interface controller (NIC) for each port of each server for purposes of redundancy.
[0009] Switches with many ports (e.g., 24 ports, 48 ports, or more) are disproportionately expensive, on a cost-per-port basis, relative to switches with fewer ports (e.g., fewer than 24 ports). The cost disparity is at least in part due to the increased connectivity and bandwidth between the switch chips in the switch chassis. As more ports are added, more internal connections are utilized, switch chip sizes are increased, silicon area is increased, and cost thus is increased. Furthermore, as chip-to-chip distances expand due to the number of switch chips utilized, signal loading degrades the switch bit rate and aggregate bandwidth.
[0010] The cost is further aggravated by the number of cables needed to connect the servers to the switches. In a typical redundant connection topology, an enclosure containing ten servers, with an "A" link and a "B" link for each server, will use twenty cables. A rack containing four such enclosures will have eighty cables. The top-of-rack (TOR) switch (or switches) must then handle eighty downlinks as well as an appropriate number of uplinks. Though switch over-subscription is often employed in order to mitigate costs, it is an insufficient remedy.
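The cable arithmetic in the paragraph above can be reproduced with a short Python sketch. The function name and parameters are illustrative only and do not come from the disclosure; the sketch simply assumes that every server link runs as its own cable to a TOR switch.

```python
def cable_count(enclosures: int, servers_per_enclosure: int, links_per_server: int = 2) -> int:
    """Cables needed when each server link (e.g. an "A" and a "B" link) is its own cable."""
    return enclosures * servers_per_enclosure * links_per_server


# One enclosure of ten servers with redundant "A"/"B" links.
print(cable_count(1, 10))  # 20
# A rack of four such enclosures, i.e. the downlinks the TOR switch(es) must handle.
print(cable_count(4, 10))  # 80
```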
[0011] Latency remains another issue. For example, for a server to communicate with a peer server in a system complex, the central processing unit communicates with the NIC, which then communicates via a cable to the TOR switch. The TOR switch forwards the information to the next level (L2) switch, then to the next level (L3) switch, then down to the next level (L2) switch, which for purposes of this example forwards the information to the next TOR switch, then down the cable to the appropriate NIC, then onward to the appropriate central processing unit. Each of these transactions accumulates a switching delay. For this example, servers in the same row would experience a 5-hop path with switch latency at each hop.
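As a rough illustration of the hop count described above, the following Python sketch lists the switches crossed in the hierarchical example and accumulates a per-hop delay. The per-hop delay is a hypothetical input supplied by the caller; the disclosure gives no latency figures.

```python
# Switches crossed between two servers in the same row, per the walk-through above.
HIERARCHICAL_PATH = ["TOR", "L2", "L3", "L2", "TOR"]


def hop_count(path):
    """Number of switches a packet traverses along the path."""
    return len(path)


def path_delay_us(path, per_hop_delay_us):
    """Total switching delay, assuming the same (hypothetical) delay at every hop."""
    return hop_count(path) * per_hop_delay_us


print(hop_count(HIERARCHICAL_PATH))  # 5
```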
[0012] The latency problem is exacerbated in architectures with a high degree of east-west traffic, that is, traffic between peer servers in the same enclosure. Since each packet traverses cables up to the TOR and back down, the aggregate switch bandwidth should be high to enable the servers to effectively communicate.
[0013] In terms of management, the more complex a switch is, the more difficult it is to manage. When multiple complex switches are implemented to facilitate communications between servers, the management problem is compounded.
[0014] Various implementations are described below by referring to several examples of a modular input/output aggregation zone having a switch chip. For example, a system is disclosed having a modular input/output aggregation zone to directly communicatively couple together a plurality of servers within an enclosure shared by the modular input/output aggregation zone and the plurality of servers. The example modular input/output aggregation zone includes a first switch chip having link ports configurable as uplink ports and crosslink ports, the uplink ports being communicatively coupleable to a network device and the crosslink ports being communicatively coupleable to a crosslink port of a second switch chip. Additional examples are described below.
[0015] In some implementations, cost can be significantly decreased by reducing the number of TOR switches and cables used to connect a plurality of servers. For example, efficient communications for workloads with significant amounts of east-west traffic is provided by communicating at a lower level in the switching hierarchy, reducing latency while also reducing the port count for expensive L2 and L3 switches. Also, a path with fewer hops provides improvements in latency. Moreover, management of the network topology is simplified. These and other advantages will be apparent from the description that follows.
[0016] FIG. 1 illustrates a block diagram of a system utilizing a modular input/output (I/O) aggregation zone 100 having a switch chip 110 according to examples of the present disclosure. It should be understood that FIG. 1 includes particular components, modules, etc. according to various examples. However, in different embodiments, more, fewer, and/or other components, modules, arrangements of components/modules, etc. may be used according to the teachings described herein. In addition, various components, modules, etc. described herein may be implemented as one or more software modules, hardware modules, special-purpose hardware (e.g., application specific hardware, application specific integrated circuits (ASICs), embedded controllers, hardwired circuitry, etc.), or some combination of these.
[0017] The example system shown utilizes the modular I/O aggregation zone 100 to directly communicatively couple together a plurality of servers 104a-f within an enclosure such as shared enclosure 106 that is shared by the modular I/O aggregation zone 100 and the plurality of servers 104a-f. That is, the modular I/O aggregation zone 100 and the plurality of servers 104a-f are contained within the shared enclosure 106. The shared enclosure 106 may be made of a suitable material and of a suitable size to contain both the modular I/O aggregation zone 100 and the plurality of servers 104a-f.
[0018] The modular I/O aggregation zone 100 may contain a switch chip, such as switch chip 110, in one example. However, in another example, such as that shown in FIG. 1, the modular I/O aggregation zone 100 also includes a second switch chip 120. The modular I/O aggregation zone 100 is also configured to be directly communicatively coupled to the plurality of servers 104a-f. The switch chips described herein may include various switch chips from different manufacturers including, for example, Intel's Red Rock Canyon switch chip.
[0019] The plurality of servers 104a-f may include servers of a similar or identical configuration, or the servers may be of a variety of types and configurations. It should be appreciated that the servers 104a-f may be blade servers, modular servers, or servers of a similar type, and may include hardware components such as processing resources, memory resources, storage resources, and other appropriate components. For example, the plurality of servers 104a-f may include a processing resource that represents generally any suitable type or form of processing unit or units capable of processing data or interpreting and executing instructions. The instructions may be stored on a non-transitory tangible computer-readable storage medium, such as a memory resource, or on a separate device (not shown), or on any other type of volatile or non-volatile memory that stores instructions. Alternatively or additionally, the plurality of servers 104a-f may include dedicated hardware, such as one or more integrated circuits, Application Specific Integrated Circuits (ASICs), Application Specific Special Processors (ASSPs), Field Programmable Gate Arrays (FPGAs), or any combination of the foregoing examples of dedicated hardware. In some implementations, multiple processors may be used, as appropriate, along with multiple memories and/or types of memory.
[0020] The plurality of servers 104a-f are directly communicatively coupled to the modular I/O aggregation zone 100, and more specifically are directly communicatively coupled to the switch chips 110 and 120 of the modular I/O aggregation zone 100. The direct coupling may include a Peripheral Component Interconnect Express (PCIe) or similar connection between the servers 104a-f and the modular I/O aggregation zone 100. These connections are depicted by the dotted lines in FIG. 1. Once directly communicatively coupled to the modular I/O aggregation zone 100, the servers may transmit and receive data among one another and with other network connected devices via the direct communicatively coupled connection to the switch chips 110 and 120.
[0021] The switch chips 110 and 120 may each include link ports, which are configurable as uplink ports and crosslink ports. In the example shown, switch chip 110 includes an uplink port 112 and two crosslink ports 114a,b. Similarly, switch chip 120 includes an uplink port 122 and two crosslink ports 124a,b. In other examples, the switch chips may include additional ports in a variety of configurations. It should be understood that the link ports may be configured (and re-configured) as either uplink ports or crosslink ports, either automatically by the nature of the connections created to the uplink ports or manually by an administrator when the chips are installed or when the modular I/O aggregation zone 100 is set up. Because the link ports are configurable, the bandwidth for the connections between the servers, the switch chips, and the network devices may be variable such that some connections may support only minimal bandwidth while other connections support much greater bandwidth. Each switch chip may be configured individually, thus increasing the flexibility and bandwidth possibilities for each modular I/O aggregation zone.
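The idea of link ports that can be configured and re-configured per chip can be sketched as a small data model. This is a minimal Python illustration, not the disclosed implementation: the class names, method names, and the use of reference numerals as port names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    UPLINK = "uplink"
    CROSSLINK = "crosslink"


@dataclass
class LinkPort:
    name: str
    role: Role


@dataclass
class SwitchChip:
    name: str
    ports: dict = field(default_factory=dict)

    def add_port(self, name: str, role: Role) -> None:
        self.ports[name] = LinkPort(name, role)

    def configure(self, name: str, role: Role) -> None:
        """Re-configure an existing link port as an uplink or a crosslink."""
        self.ports[name].role = role

    def ports_with_role(self, role: Role) -> list:
        return sorted(p.name for p in self.ports.values() if p.role is role)


# Switch chip 110 as drawn in FIG. 1: one uplink port and two crosslink ports.
chip_110 = SwitchChip("110")
chip_110.add_port("112", Role.UPLINK)
chip_110.add_port("114a", Role.CROSSLINK)
chip_110.add_port("114b", Role.CROSSLINK)

# Each chip is configured individually; here one crosslink is turned into a second uplink.
chip_110.configure("114b", Role.UPLINK)
print(chip_110.ports_with_role(Role.UPLINK))     # ['112', '114b']
print(chip_110.ports_with_role(Role.CROSSLINK))  # ['114a']
```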
[0022] The crosslink ports (e.g., crosslink ports 114a,b and 124a,b) are communicatively coupleable to one another (or to additional switch chips) using any suitable network connection, including Ethernet, optical, or other electrical connection. In the example shown in FIG. 1, the crosslink port 114a of switch chip 110 is communicatively coupled to the crosslink port 124a of switch chip 120, and the crosslink port 114b of switch chip 110 is communicatively coupled to the crosslink port 124b of switch chip 120. These connections are depicted by the dashed lines in FIG. 1. Additionally, either or both of the uplink ports 112 and 122 may be configured as crosslink ports, and any or all of the crosslink ports 114a,b and 124a,b may be configured as uplink ports, as appropriate. Additional ports may also be implemented.
[0023] Data or network traffic transmitted from one of the plurality of servers to another of the plurality of servers is transmitted through at least one crosslink port of at least one of the first switch chip and the second switch chip to the other of the plurality of servers. For example, data transmitted from the server 104a to the server 104d is transmitted through crosslink port 114b of switch chip 110 and the crosslink port 124b of switch chip 120 to the server 104d. In another example, the data could be transmitted through crosslink port 114a of switch chip 110 and the crosslink port 124a of switch chip 120 to the server 104d. By transmitting the data between the switch chips within the modular I/O aggregation zone 100, the data need not be transmitted up to the network device 140 and back down to the server, thus reducing the latency, cost, and management concerns discussed above.
[0024] In this configuration, switch chips 110 and 120 are said to have redundant connections. That is, the switch chips 110 and 120 are connected to each other along two separate paths, such that if one path fails, the switch chips 110 and 120 may communicate via the second path. In other examples, such as illustrated in FIG. 4, additional switch chips may be implemented in the modular I/O aggregation zone 100, enabling the modular I/O aggregation zone 100 to provide a mesh network, ring network, star network, fully connected network, linear network, tree network, bus network, dragonfly network, and any other suitable network topology or combinations of network topologies.
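The redundant crosslink paths of FIG. 1 can be illustrated with a small routing sketch in Python. Several details here are assumptions made only for illustration: that servers 104a-c attach to switch chip 110 and servers 104d-f to switch chip 120, that crosslinks are tried in a fixed order, and the `route` function itself, which is not part of the disclosure.

```python
# Assumption: which servers attach to which switch chip (the text does not spell this out).
SERVER_TO_CHIP = {
    "104a": "110", "104b": "110", "104c": "110",
    "104d": "120", "104e": "120", "104f": "120",
}

# Crosslinks of FIG. 1 as (port on chip 110, port on chip 120) pairs.
CROSSLINKS = [("114a", "124a"), ("114b", "124b")]


def route(src, dst, failed=frozenset()):
    """Return the elements traversed from src to dst, or None if every crosslink is down."""
    src_chip, dst_chip = SERVER_TO_CHIP[src], SERVER_TO_CHIP[dst]
    if src_chip == dst_chip:
        # Both servers hang off the same chip, so no crosslink is needed.
        return [src, f"chip {src_chip}", dst]
    for port_110, port_120 in CROSSLINKS:
        if (port_110, port_120) in failed:
            continue  # this path is down, fall through to the redundant one
        near, far = (port_110, port_120) if src_chip == "110" else (port_120, port_110)
        return [src, f"chip {src_chip}", near, far, f"chip {dst_chip}", dst]
    return None


print(route("104a", "104d"))
print(route("104a", "104d", failed={("114a", "124a")}))
```

In both calls the traffic stays inside the aggregation zone; the second call shows the remaining crosslink pair carrying the traffic after the first pair is marked as failed.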
[0025] The uplink ports (e.g., uplink ports 112 and 122) are communicatively coupleable to a network device, such as network device 140, using any suitable network connection, including Ethernet, optical, or other electrical connection. In the example shown in FIG. 1, the uplink port 112 of switch chip 110 is communicatively coupled to the network device 140, as depicted by the dashed line between the uplink port 112 of switch chip 110 and the network device 140. In other examples, the uplink port 122 of switch chip 120 may also be communicatively coupled to the network device 140 or another network device.
[0026] The network device 140 may be any suitable network device, including at least a switch, a hub, and a router. The network device 140 may be part of a larger network, the network representing generally hardware components and computers interconnected by communications channels that allow sharing of resources and information. The network may include one or more of a cable, wireless, fiber optic, or remote connection via a telecommunication link, an infrared link, a radio frequency link, or any other connectors or systems that provide electronic communication. The network may include, at least in part, an intranet, the Internet, or a combination of both. The network may also include intermediate proxies, routers, switches, load balancers, and the like, including the network device 140 and the modular I/O aggregation zone 100 via the switch chips 110 and 120. The paths followed by the network between switch chip 110 and network device 140 as depicted in FIG. 1 represent the logical communication paths between these devices, not necessarily the physical paths between the devices.
[0027] In other examples, as discussed below, the modular I/O aggregation zone 100 may be communicatively coupled to another modular I/O aggregation zone to expand or scale the number of servers serviced by the functionality that the modular I/O aggregation zone 100 provides. For example, additional modular input/output aggregation zones may directly communicatively couple together additional pluralities of servers within shared enclosures, and each of the additional modular input/output aggregation zones includes at least one switch chip having link ports configurable as uplink ports and crosslink ports. The additional modular I/O aggregation zones may be arranged in a variety of network topologies. For instance, the modular I/O aggregation zones may be arranged in a mesh network, ring network, star network, fully connected network, linear network, tree network, bus network, dragonfly network, and any other suitable network topology or combinations of network topologies.
[0028] FIG. 2 illustrates a block diagram of a system utilizing two modular input/output aggregation zones each having a switch chip according to examples of the present disclosure. It should be understood that FIG. 2 includes particular components, modules, etc. according to various examples. However, in different embodiments, more, fewer, and/or other components, modules, arrangements of components/modules, etc. may be used according to the teachings described herein. In addition, various components, modules, etc. described herein may be implemented as one or more software modules, hardware modules, special-purpose hardware (e.g., application specific hardware, application specific integrated circuits (ASICs), embedded controllers, hardwired circuitry, etc.), or some combination of these.
[0029] Like FIG. 1, FIG. 2 illustrates a modular I/O aggregation zone 200 to directly communicatively couple together a plurality of servers 204a-c within an enclosure such as shared enclosure 206 that is shared by the modular I/O aggregation zone 200 and the plurality of servers 204a-c. The modular I/O aggregation zone 200 may contain a switch chip, such as switch chip 210. The modular I/O aggregation zone 200 is also configured to be directly communicatively coupled to the plurality of servers 204a-c.
[0030] Additionally, FIG. 2 illustrates a second modular I/O aggregation zone 201 to directly communicatively couple together a second plurality of servers 205d-f within a second enclosure such as shared enclosure 207 that is shared by the second modular I/O aggregation zone 201 and the second plurality of servers 205d-f. The second modular I/O aggregation zone 201 may contain a second switch chip, such as switch chip 211. The second modular I/O aggregation zone 201 is also configured to be directly communicatively coupled to the second plurality of servers 205d-f.
[0031] The switch chips 210 and 211 each include link ports configurable as uplink ports and crosslink ports. For example, switch chip 210 includes uplink port 212 and crosslink ports 214a,b while switch chip 211 includes uplink port 213 and crosslink ports 215a,b. In the example illustrated, the crosslink port 214b of switch chip 210 is communicatively coupled to the crosslink port 215a of the second switch chip 211. Thus, data may be transmitted between the first plurality of servers 204a-c and the second plurality of servers 205d-f via the first switch chip 210 and the second switch chip 211 without having to transmit the data up to a higher level network device (not shown).
[0032] In examples, additional crosslink ports of switch chips 210 and 211 may be communicatively coupled to additional switch chips (not shown) within the respective modular I/O aggregation zones 200 and 201. For example, either of the illustrated modular I/O aggregation zones 200 and 201 may include additional switch chips, which may be communicatively coupled via optical or electrical links such as Ethernet links.
[0033] Moreover, additional crosslink ports of switch chips 210 and 211 may be communicatively coupled to the switch chips of additional modular I/O aggregation zones (not shown). For example, a third modular input/output aggregation zone may communicatively couple together a third plurality of servers. The third modular input/output aggregation zone may include a third switch chip having link ports configurable as uplink ports and crosslink ports. Then, the first, second, and third modular input/output aggregation zones may be communicatively coupled in any number or combinations of appropriate network topologies such as mesh network, ring network, star network, fully connected network, linear network, tree network, bus network, dragonfly network, and any other suitable network topology. In this way, multiple modular I/O aggregation zones can be linked together in a variety of network topologies to enable servers such as servers 204a-c, servers 205d-f, and additional servers to transmit and receive network traffic and data without having to transmit the network traffic and data up to a higher level network device (not shown).
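To make the topology options above concrete, the following Python sketch builds ring and fully connected crosslink layouts among several aggregation zones and checks that every zone can reach every other zone over crosslinks alone. The zone names and helper functions are hypothetical and chosen only for illustration; four zones are used because a ring and a fully connected layout need the same number of links when there are only three.

```python
from collections import deque
from itertools import combinations


def ring(zones):
    """Crosslinks joining each zone to its neighbor, closing the loop at the end."""
    if len(zones) < 2:
        raise ValueError("a ring needs at least two zones")
    n = len(zones)
    return {frozenset((zones[i], zones[(i + 1) % n])) for i in range(n)}


def fully_connected(zones):
    """One crosslink between every pair of zones."""
    return {frozenset(pair) for pair in combinations(zones, 2)}


def reachable(links, start):
    """Zones reachable from start using crosslinks only (breadth-first search)."""
    adjacency = {}
    for link in links:
        a, b = tuple(link)
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        zone = queue.popleft()
        for neighbor in adjacency.get(zone, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


zones = ["zone-1", "zone-2", "zone-3", "zone-4"]  # hypothetical zone names
print(len(ring(zones)), len(fully_connected(zones)))  # 4 6
print(reachable(ring(zones), "zone-1") == set(zones))  # True
```

A fully connected layout consumes more crosslink ports per switch chip than a ring, which is one reason the choice of topology interacts with how many link ports are left available as uplinks.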
[0034] The crosslink port 214a of switch chip 210 may be communicatively coupled to the crosslink port 215b of switch chip 211 to create two discrete network paths between the modular I/O aggregation zone 200 and the second modular I/O aggregation zone 201.
[0035] In another example, the uplink port 212 of the switch chip 210 and/or the uplink port 213 of the second switch chip 211 may be communicatively coupled to a network device, such as a switch, hub, router, or other appropriate network device, using optical or electrical networking connections. Additionally, either or both of the uplink ports 212 and 213 may be configured as crosslink ports, and any or all of the crosslink ports 214a,b and 215a,b may be configured as uplink ports, as appropriate. Additional ports may also be implemented.
[0036] Because the link ports are configurable, the bandwidth for the connections between the servers, the switch chips, and the network devices may be variable such that some connections may support only minimal bandwidth while other connections support much greater bandwidth. Each switch chip may be configured individually, thus increasing the flexibility and bandwidth possibilities for each modular I/O aggregation zone.
[0037] FIG. 3 illustrates a block diagram of a modular input/output aggregation zone 300 having a first switch chip 310 and a second switch chip 320 according to examples of the present disclosure. It should be understood that FIG. 3 includes particular components, modules, etc. according to various examples. However, in different embodiments, more, fewer, and/or other components, modules, arrangements of components/modules, etc. may be used according to the teachings described herein. In addition, various components, modules, etc. described herein may be implemented as one or more software modules, hardware modules, special-purpose hardware (e.g., application specific hardware, application specific integrated circuits (ASICs), embedded controllers, hardwired circuitry, etc.), or some combination of these.
[0038] The modular input/output aggregation zone 300 shown directly communicatively couples together a plurality of servers 304a-f within an enclosure such as shared enclosure 306 that is shared by the modular I/O aggregation zone 300 and the plurality of servers 304a-f. That is, the modular I/O aggregation zone 300 and the plurality of servers 304a-f are contained within the shared enclosure 306. The shared enclosure 306 may be made of a suitable material and of a suitable size to contain both the modular I/O aggregation zone 300 and the plurality of servers 304a-f.
[0039] The plurality of servers 304a-f may include servers of a similar or identical configuration, or the servers may be of a variety of types and configurations. It should be appreciated that the servers 304a-f may be blade servers, modular servers, or servers of a similar type, and may include hardware components such as processing resources, memory resources, storage resources, and other appropriate components. For example, the plurality of servers 304a-f may include a processing resource that represents generally any suitable type or form of processing unit or units capable of processing data or interpreting and executing instructions. The instructions may be stored on a non-transitory tangible computer-readable storage medium, such as a memory resource, or on a separate device (not shown), or on any other type of volatile or non-volatile memory that stores instructions. Alternatively or additionally, the plurality of servers 304a-f may include dedicated hardware, such as one or more integrated circuits, Application Specific Integrated Circuits (ASICs), Application Specific Special Processors (ASSPs), Field Programmable Gate Arrays (FPGAs), or any combination of the foregoing examples of dedicated hardware. In some implementations, multiple processors may be used, as appropriate, along with multiple memories and/or types of memory.
[0040] The plurality of servers 304a-f are directly communicatively coupled to the modular I/O aggregation zone 300, and more specifically are directly communicatively coupled to the switch chips 310 and 320 of the modular I/O aggregation zone 300 via a Peripheral Component Interconnect Express (PCIe) connection 306 (shown as dotted lines in FIG. 3). Once directly communicatively coupled to the modular I/O aggregation zone 300, the servers may transmit and receive data among one another and with other network connected devices via the PCIe connection to the switch chips 310 and 320.
[0041] The modular I/O aggregation zone 300 may contain a first switch chip 310 and a second switch chip 320. The switch chips 310 and 320 may each include link ports, which are configurable as uplink ports and crosslink ports. In the example shown, switch chip 310 includes an uplink port 312 and two crosslink ports 314a,b. Similarly, switch chip 320 includes an uplink port 322 and two crosslink ports 324a,b. In other examples, the switch chips may include additional ports in a variety of configurations. It should be understood that the link ports may be configured (and re-configured) as either uplink ports or crosslink ports, either automatically by the nature of the connections created to the uplink ports or manually by an administrator when the chips are installed or when the modular I/O aggregation zone 300 is set up. Because the link ports are configurable, the bandwidth for the connections between the servers, the switch chips, and the network devices may be variable such that some connections may support only minimal bandwidth while other connections support much greater bandwidth. Each switch chip may be configured individually, thus increasing the flexibility and bandwidth possibilities for each modular I/O aggregation zone.
[0042] The crosslink ports (e.g., crosslink ports 314a,b and 324a,b) are communicatively coupleable to one another (or to additional switch chips) using any suitable network connection, including Ethernet, optical, or other electrical connection. In the example shown in FIG. 3, the crosslink port 314a of switch chip 310 is communicatively coupled to the crosslink port 324a of switch chip 320, and the crosslink port 314b of switch chip 310 is communicatively coupled to the crosslink port 324b of switch chip 320. These connections are depicted by the dashed lines in FIG. 3. Additionally, either or both of the uplink ports 312 and 322 may be configured as crosslink ports, and any or all of the crosslink ports 314a,b and 324a,b may be configured as uplink ports, as appropriate. In an example, at least one of the uplink ports 312 and 322 of switch chips 310 and 320 respectively may be communicatively coupled to a network device (not shown) such as a switch, router, hub, or other suitable networking device. The connection between the uplink ports and the networking device may be an optical network connection in one example or may be an electrical connection in another example. It is also possible that multiple connections between the switch chips and the network device are implemented using a combination of different network connection types. For example, the uplink port 312 of the switch chip 310 may be connected to a network device via an electrical connection while the uplink port 322 of the switch chip 320 may be connected to the same network device via an optical connection. Of course, in examples, the connections between the network device and the uplink ports may be of the same type in any suitable number. Additional ports may also be implemented.
[0043] Data or network traffic transmitted from one of the plurality of servers to another of the plurality of servers is transmitted through at least one crosslink port of at least one of the first switch chip and the second switch chip to the other of the plurality of servers. For example, data transmitted from the server 304a to the server 304d is transmitted through crosslink port 314b of switch chip 310 and the crosslink port 324b of switch chip 320 to the server 304d. In another example, the data could be transmitted through crosslink port 314a of switch chip 310 and the crosslink port 324a of switch chip 320 to the server 304d. By transmitting the data between the switch chips within the modular I/O aggregation zone 300, the data need not be transmitted up to a network device (not shown) and back down to the server, thus reducing the latency, cost, and management concerns discussed above.
[0044] In this configuration, switch chips 310 and 320 are said to have redundant connections. That is, the switch chips 310 and 320 are connected to each other along two separate paths, such that if one path fails, the switch chips 310 and 320 may communicate via the second path. In other examples, such as illustrated in FIG. 4, additional switch chips may be implemented in the modular I/O aggregation zone 300, enabling the modular I/O aggregation zone 300 to provide a mesh network, star network, or other appropriate network topology among the switch chips.
[0045] FIG. 4 illustrates a block diagram of a modular input/output aggregation zone 400 having a first switch chip 410, a second switch chip 420, and a third switch chip 430 according to examples of the present disclosure. It should be understood that FIG. 4 includes particular components, modules, etc. according to various examples. However, in different embodiments, more, fewer, and/or other components, modules, arrangements of components/modules, etc. may be used according to the teachings described herein. In addition, various components, modules, etc. described herein may be implemented as one or more software modules, hardware modules, special-purpose hardware (e.g., application specific hardware, application specific integrated circuits (ASICs), embedded controllers, hardwired circuitry, etc.), or some combination of these.
[0046] The modular input/output aggregation zone 400 shown directly communicatively couples together a plurality of servers 404a-i within an enclosure such as shared enclosure 406 that is shared by the modular I/O aggregation zone 400 and the plurality of servers 404a-i. That is, the modular I/O aggregation zone 400 and the plurality of servers 404a-i are contained within the shared enclosure 406. The shared enclosure 406 may be made of a suitable material and of a suitable size to contain both the modular I/O aggregation zone 400 and the plurality of servers 404a-i.
[0047] The plurality of servers 404a-i may include servers of a similar or identical configuration, or the servers may be of a variety of types and configurations. It should be appreciated that the servers 404a-i may be blade servers, modular servers, or servers of a similar type, and may include hardware components such as processing resources, memory resources, storage resources, and other appropriate components. For example, the plurality of servers 404a-i may include a processing resource that represents generally any suitable type or form of processing unit or units capable of processing data or interpreting and executing instructions. The instructions may be stored on a non-transitory tangible computer-readable storage medium, such as a memory resource, or on a separate device (not shown), or on any other type of volatile or non-volatile memory that stores instructions. Alternatively or additionally, the plurality of servers 404a-i may include dedicated hardware, such as one or more integrated circuits, Application Specific Integrated Circuits (ASICs), Application Specific Special Processors (ASSPs), Field Programmable Gate Arrays (FPGAs), or any combination of the foregoing examples of dedicated hardware. In some implementations, multiple processors may be used, as appropriate, along with multiple memories and/or types of memory.
[0048] The plurality of servers 404a-i are directly communicatively coupled to the modular I/O aggregation zone 400, and more specifically are directly communicatively coupled to the switch chips 410, 420, and 430 of the modular I/O aggregation zone 400 via a direct connection such as a Peripheral Component Interconnect Express (PCIe) connection or other suitable connection (shown as dotted lines in FIG. 4). Once directly communicatively coupled to the modular I/O aggregation zone 400, the servers may transmit and receive data among one another and with other network connected devices via the direct connection to the switch chips 410, 420, and 430.
[0049] The modular I/O aggregation zone 400 may contain a first switch chip 410, a second switch chip 420, and a third switch chip 430. The switch chips 410, 420, and 430 may each include link ports, which are configurable as uplink ports and crosslink ports. It should be understood that the link ports may be configured (and re-configured) as either uplink ports or crosslink ports, either automatically by the nature of the connections created to the uplink ports or manually by an administrator when the chips are installed or when the modular I/O aggregation zone 400 is set up. Because the link ports are configurable, the bandwidth for the connections between the servers, the switch chips, and the network devices may be variable such that some connections may support only minimal bandwidth while other connections support much greater bandwidth. Each switch chip may be configured individually, thus increasing the flexibility and bandwidth possibilities for each modular I/O aggregation zone.
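By way of a non-limiting sketch, the configurability of the link ports may be modeled in Python as a per-port role that can be set and later re-set. The class names `PortRole` and `SwitchChip`, the port identifiers, and the bandwidth figures are hypothetical and are not drawn from the disclosure.

```python
from enum import Enum


class PortRole(Enum):
    UPLINK = "uplink"
    CROSSLINK = "crosslink"


class SwitchChip:
    """Hypothetical model of a switch chip with configurable link ports."""

    def __init__(self, name):
        self.name = name
        self.roles = {}  # link-port id -> PortRole
        self.gbps = {}   # link-port id -> bandwidth in Gb/s (assumed values)

    def configure(self, port, role, gbps):
        """Set, or re-set, the role and bandwidth of one link port."""
        self.roles[port] = role
        self.gbps[port] = gbps

    def bandwidth(self, role):
        """Total bandwidth of all ports currently configured in a role."""
        return sum(self.gbps[p] for p, r in self.roles.items() if r is role)


if __name__ == "__main__":
    chip = SwitchChip("410")
    chip.configure("p0", PortRole.UPLINK, 40)
    chip.configure("p1", PortRole.CROSSLINK, 10)
    # Re-configure p0 from an uplink port to a crosslink port.
    chip.configure("p0", PortRole.CROSSLINK, 40)
    print(chip.bandwidth(PortRole.UPLINK), chip.bandwidth(PortRole.CROSSLINK))
```

Because each `SwitchChip` instance holds its own port table in this sketch, each chip can be configured individually, mirroring the per-chip flexibility described above.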
[0050] The crosslink ports are communicatively coupleable to one another (or to additional switch chips) using any suitable network connection, including Ethernet, optical, or other electrical connection. In the example shown in FIG. 4, a crosslink port of switch chip 410 is communicatively coupled to a crosslink port of switch chip 420 and to a crosslink port of switch chip 430. Similarly, a crosslink port of switch chip 420 is communicatively coupled to a crosslink port of switch chip 430. In this configuration, each switch chip is communicatively coupled to each of the other two switch chips, forming a redundant network topology such that if any one connection fails, the switch chips may still communicate via the remaining connections. The configuration shown is only one possible network topology, and other network topologies may include a mesh network, ring network, star network, fully connected network, linear network, tree network, bus network, dragonfly network, and any other suitable network topology or combinations of network topologies.
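The redundancy property of the three-chip configuration may be checked, as an illustrative sketch only, by removing each crosslink in turn and testing whether the switch chips remain mutually reachable. The helper names `connected` and `survives_single_failure` are hypothetical, and the sketch models each coupling between two switch chips as a single undirected link.

```python
from itertools import combinations

CHIPS = ["410", "420", "430"]
# FIG. 4 configuration: every pair of switch chips shares a crosslink.
LINKS = {frozenset(pair) for pair in combinations(CHIPS, 2)}


def connected(chips, links):
    """Return True if every chip can reach every other chip over links."""
    seen, frontier = {chips[0]}, [chips[0]]
    while frontier:
        node = frontier.pop()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    frontier.append(other)
    return seen == set(chips)


def survives_single_failure(chips, links):
    """Return True if the chips stay connected after any one link fails."""
    return all(connected(chips, links - {failed}) for failed in links)


if __name__ == "__main__":
    print(survives_single_failure(CHIPS, LINKS))
    # A linear topology (410-420-430) lacks this property.
    linear = {frozenset(("410", "420")), frozenset(("420", "430"))}
    print(survives_single_failure(CHIPS, linear))
```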
[0051] It should be emphasized that the above-described examples are merely possible examples of implementations and set forth for a clear understanding of the present disclosure. Many variations and modifications may be made to the above-described examples without departing substantially from the spirit and principles of the present disclosure. Further, the scope of the present disclosure is intended to cover any and all appropriate combinations and sub-combinations of all elements, features, and aspects discussed above. All such appropriate modifications and variations are intended to be included within the scope of the present disclosure, and all possible claims to individual aspects or combinations of elements or steps are intended to be supported by the present disclosure.
Claims
WHAT IS CLAIMED IS: 1. A system comprising:
a modular input/output aggregation zone to directly communicatively couple together a plurality of servers within an enclosure shared by the modular input/output aggregation zone and the plurality of servers, the modular input/output aggregation zone comprising a first switch chip having link ports configurable as uplink ports and crosslink ports, the uplink ports being communicatively coupleable to a network device and the crosslink ports being communicatively coupleable to a crosslink port of a second switch chip.
2. The system of claim 1, wherein two crosslink ports of the first switch chip are communicatively coupled to two crosslink ports of the second switch chip.
3. The system of claim 2, wherein the two crosslink ports of the first and second switch chips are communicatively coupled via at least one of the group consisting of an optical cable and an electrical cable.
4. The system of claim 1, wherein the plurality of servers are directly communicatively coupled together using Peripheral Component Interconnect Express connections in the modular input/output aggregation zone.
5. The system of claim 1, wherein the modular input/output aggregation zone is communicatively coupleable to a plurality of additional modular input/output aggregation zones, the plurality of additional modular input/output aggregation zones to directly communicatively couple together additional pluralities of servers within shared enclosures, wherein each of the plurality of additional modular input/output aggregation zones comprises a switch chip having link ports configurable as uplink ports and crosslink ports.
6. The system of claim 5, wherein the modular input/output aggregation zone and the plurality of additional modular input/output aggregation zones are arranged in a mesh network topology.
7. The system of claim 1,
wherein an uplink port of the second switch chip is communicatively coupled to the networking device, and
wherein a second uplink port of the first switch chip and a second uplink port of the second switch chip are communicatively coupled to the networking device via at least one of the group consisting of an optical cable and an electrical cable.
8. The system of claim 1, wherein data transmitted from one of the plurality of servers to another of the plurality of servers is transmitted through at least one crosslink port of at least one of the first switch chip and the second switch chip to the other of the plurality of servers.
9. The system of claim 8, wherein the data transmitted from one of the plurality of servers to another of the plurality of servers is not transmitted to the network device.
10. A system comprising:
a first modular input/output aggregation zone to directly communicatively couple together a first plurality of servers within a first enclosure shared by the first modular input/output aggregation zone and the first plurality of servers, the first modular input/output aggregation zone comprising a first switch chip having link ports configurable as uplink ports and crosslink ports; and
a second modular input/output aggregation zone to directly communicatively couple together a second plurality of servers within a second enclosure shared by the second modular input/output aggregation zone and the second plurality of servers, the second modular input/output aggregation zone comprising a second switch chip having link ports configurable as uplink ports and crosslink ports,
wherein at least one of the crosslink ports of the first switch chip of the first modular input/output aggregation zone is communicatively coupled to at least one of the crosslink ports of the second switch chip of the second modular input/output aggregation zone, and
wherein a total bandwidth of the crosslink ports and the uplink ports of the first and second switch chips is greater than a total bandwidth of the direct communicative coupling together of first and second plurality of servers to the respective first and second modular input/output aggregation zones.
11. The system of claim 10, wherein at least one of the uplink ports of the first switch chip and at least one of the uplink ports of the second switch chip are communicatively coupleable to a network device.
12. The system of claim 10, further comprising:
a third modular input/output aggregation zone to directly communicatively couple together a third plurality of servers, the third modular input/output aggregation zone comprising a third switch chip having link ports configurable as uplink ports and crosslink ports,
wherein the first, second, and third modular input/output aggregation zones are communicatively coupled in a networking topology selected from the group consisting of a mesh network, a ring network, a star network, a fully connected network, a linear network, a tree network, a bus network, and a dragonfly network.
13. A modular input/output aggregation zone comprising:
a first switch chip having first link ports configurable as first uplink ports and first crosslink ports; and
a second switch chip having second link ports configurable as second uplink ports and second crosslink ports,
wherein at least one of the first crosslink ports of the first switch chip is communicatively coupleable to at least one of the second crosslink ports of the second switch chip, and
wherein the modular input/output aggregation zone is directly communicatively coupled to a plurality of servers within an enclosure shared by the plurality of servers and the modular input/output aggregation zone via Peripheral Component Interconnect Express connections.
14. The modular input/output aggregation zone of claim 13, wherein at least one of the first uplink ports and at least one of the second uplink ports are communicatively coupleable to a network device by an optical network connection.
15. The modular input/output aggregation zone of claim 13, wherein the at least one of the first crosslink ports of the first switch chip is communicatively coupleable to the at least one of the second crosslink ports of the second switch chip via an electrical network connection.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2014/032066 WO2015147840A1 (en) | 2014-03-27 | 2014-03-27 | Modular input/output aggregation zone |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2014/032066 WO2015147840A1 (en) | 2014-03-27 | 2014-03-27 | Modular input/output aggregation zone |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015147840A1 (en) | 2015-10-01 |
Family
ID=54196157
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2014/032066 WO2015147840A1 (en) | 2014-03-27 | 2014-03-27 | Modular input/output aggregation zone |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2015147840A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070097948A1 (en) * | 2005-10-27 | 2007-05-03 | Boyd William T | Creation and management of destination ID routing structures in multi-host PCI topologies |
US8031722B1 (en) * | 2008-03-31 | 2011-10-04 | Emc Corporation | Techniques for controlling a network switch of a data storage system |
US20120201253A1 (en) * | 2010-10-20 | 2012-08-09 | International Business Machines Corporation | Multi-Adapter Link Aggregation for Adapters with Hardware Based Virtual Bridges |
US20130156028A1 (en) * | 2011-12-20 | 2013-06-20 | Dell Products, Lp | System and Method for Input/Output Virtualization using Virtualized Switch Aggregation Zones |
US20130322434A1 (en) * | 2012-05-30 | 2013-12-05 | Siemens Aktiengesellschaft | Network Mechanism, Network Arrangement And Method For Operating A Network Arrangement |
2014-03-27 WO PCT/US2014/032066 patent/WO2015147840A1/en active Application Filing
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10484519B2 (en) | 2014-12-01 | 2019-11-19 | Hewlett Packard Enterprise Development Lp | Auto-negotiation over extended backplane |
US11128741B2 (en) | 2014-12-01 | 2021-09-21 | Hewlett Packard Enterprise Development Lp | Auto-negotiation over extended backplane |
CN107534590A (en) * | 2015-10-12 | 2018-01-02 | 慧与发展有限责任合伙企业 | switch network architecture |
EP3284218A4 (en) * | 2015-10-12 | 2018-03-14 | Hewlett-Packard Enterprise Development LP | Switch network architecture |
US10616142B2 (en) * | 2015-10-12 | 2020-04-07 | Hewlett Packard Enterprise Development Lp | Switch network architecture |
CN107534590B (en) * | 2015-10-12 | 2020-07-28 | 慧与发展有限责任合伙企业 | Network system |
US11223577B2 (en) * | 2015-10-12 | 2022-01-11 | Hewlett Packard Enterprise Development Lp | Switch network architecture |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11509538B2 (en) | Network interconnect as a switch | |
US11223577B2 (en) | Switch network architecture | |
RU2543558C2 (en) | Input/output routing method and device and card | |
US9774499B2 (en) | System guided surrogating control in broadcast and multicast | |
US9473833B2 (en) | Systems and methods for increasing bandwidth in a computer network | |
US8706938B2 (en) | Bandwidth limiting on generated PCIE packets from debug source | |
WO2013093734A1 (en) | Flexible and scalable enhanced transmission selection method for network fabrics | |
US9419912B2 (en) | Selective underflow protection in a network switch | |
US9774546B2 (en) | Fibre channel over ethernet (FCoE) zoning in a distributed ethernet switch | |
WO2015147840A1 (en) | Modular input/output aggregation zone | |
US8842523B2 (en) | Fencing off switch domains | |
US9864728B2 (en) | Automatic generation of physically aware aggregation/distribution networks | |
US20200235815A1 (en) | Methods and sysems for reconfigurable network topologies | |
US9750135B2 (en) | Dual faced ATCA backplane | |
Xia et al. | Flat-tree: A convertible data center network architecture from CLOS to random graph | |
CN104081693A (en) | Reconfiguration of an optical connection infrastructure | |
US9706274B2 (en) | Distributed control of a modular switching system | |
WO2018057160A1 (en) | Technologies for increasing bandwidth in partitioned hierarchical networks | |
US8284770B1 (en) | Physical layer switching and network packet switching integrated into a hybrid switching module | |
WO2015107344A1 (en) | Reconfigurable computing system | |
Cai et al. | Software defined data center network architecture with hybrid optical wavelength routing and electrical packet switching | |
Shu et al. | Programmable optical packet/circuit switched data centre interconnects: traffic modeling and evaluation | |
Shu et al. | Evaluation of function-topology programmable (FTP) optical packet/circuit switched data centre interconnects |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14887301; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | |
| 122 | EP: PCT application non-entry in European phase | Ref document number: 14887301; Country of ref document: EP; Kind code of ref document: A1 |