US9461867B1 - Assigning communication paths among computing devices utilizing a multi-path communication protocol - Google Patents

Assigning communication paths among computing devices utilizing a multi-path communication protocol

Info

Publication number
US9461867B1
Authority
US
United States
Prior art keywords
communication
program
storage
computing
path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
US15/014,191
Inventor
Ohad Atia
Yuval A. Ben-Horin
Alon Marx
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to US15/014,191 priority Critical patent/US9461867B1/en
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION reassignment INTERNATIONAL BUSINESS MACHINES CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BEN-HORIN, YUVAL A., ATIA, OHAD, MARX, ALON
Priority to US15/209,775 priority patent/US9531626B1/en
Application granted granted Critical
Publication of US9461867B1 publication Critical patent/US9461867B1/en
Priority to US15/365,997 priority patent/US9674078B2/en

Classifications

    • H: ELECTRICITY; H04: ELECTRIC COMMUNICATION TECHNIQUE; H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L29/06
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/22 Alternate routing
    • H04L45/24 Multipath
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/12 Avoiding congestion; Recovering from congestion
    • H04L47/125 Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/14 Multichannel or multilink protocols
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols
    • H04L2101/00 Indexing scheme associated with group H04L61/00
    • H04L2101/60 Types of network addresses
    • H04L2101/618 Details of network addresses
    • H04L2101/663 Transport layer addresses, e.g. aspects of transmission control protocol [TCP] or user datagram protocol [UDP] ports

Definitions

  • the present invention relates generally to the field of network communications, and more particularly to controlling the network connections utilized by multi-path data transmission protocols.
  • the execution of software and programs can be deployed to hardware that supports virtual systems (e.g., virtual machines).
  • various components within a computing system can be virtualized, such as network switches and communication adapters.
  • a virtual machine (e.g., an application server) can be dynamically configured (e.g., computational speed, multitasking, high-volume network traffic, response time, reliability, etc.) and optimized for the applications executed on the virtual machine (VM).
  • some objectives of utilizing NAS and/or SAN systems are: improved availability (e.g., fault-tolerance), improved performance (e.g., bandwidth), improved scalability, and improved maintainability (e.g., disaster recovery processes).
  • a mesh network topology provides at least two nodes with two or more communication paths between them, yielding redundant paths for the communications.
  • various communication protocols can be utilized within a communication network. Some networking protocols (e.g., Fibre Channel Protocol (FCP)) are implemented on the communication adapters (e.g., host bus adapters), which reduces the resource demands on central processing units (CPUs) within a computing system and/or a SAN. Other networking protocols can take advantage of the redundant paths within some networks to increase the bandwidth of information transfer. Two such protocols are Stream Control Transmission Protocol (SCTP) and Multi-path Transmission Control Protocol (MPTCP).
  • various virtualization technologies can be applied to communication ports of a computing system and/or a SAN. In one instance, virtualizing port IDs improves the isolation of VMs utilizing the same physical port on a SAN.
  • the method includes one or more computer processors identifying a computing entity and a data storage entity that transfers data.
  • the method further includes determining a plurality of communication ports that the data storage entity utilizes to transfer data to the computing entity.
  • the method further includes identifying a plurality of computing resources respectively associated with the determined plurality of communication ports that the data storage entity utilizes to transfer the data to the computing entity.
  • the method further includes generating a list of tuples for the data storage entity based, at least in part, on the identified plurality of computing resources and the determined plurality of communication ports.
  • FIG. 1 illustrates a distributed computing environment, in accordance with an embodiment of the present invention.
  • FIG. 2 depicts an illustrative example of a configuration of physical and virtual ports associated with communication adapters of a computing system to communicate with a network, in accordance with an embodiment of the present invention.
  • FIG. 3 depicts an illustrative example of a storage area network utilizing physical and virtual ports to communicate with a network, in accordance with an embodiment of the present invention.
  • FIG. 4 depicts a flowchart of steps of a path generation program, in accordance with an embodiment of the present invention.
  • FIG. 5 depicts a flowchart of steps of an alternate path selection program, in accordance with an embodiment of the present invention.
  • FIG. 6 depicts an illustrative example of a portion of a list of communication ports for a storage area network, sorted, in accordance with an embodiment of the present invention.
  • FIG. 7 depicts a block diagram of components of a computer, in accordance with an embodiment of the present invention.
  • Embodiments of the present invention recognize that virtual environments for application servers as well as storage systems are an area of growth for some businesses. Data center decentralization and Cloud computing are two factors that contribute to this growth. To ensure that the transmission of data among application servers and SAN and/or NAS systems is reliable and low-latency, multi-path information transfer may be utilized.
  • Some networking technologies utilize a World Wide Name (WWN) as a unique identifier for a storage technology.
  • Examples of networking technologies that utilize WWNs are Fibre Channel (FC), serial-attached Small Computer System Interface (SAS), and Advanced Technology Attachment (ATA).
  • a WWN may be used as a World Wide Node Name (WWNN) to identify a switch.
  • a World Wide Port Name may be utilized to identify a port on a switch (e.g., communication adapter).
  • in addition, N_Port ID Virtualization (NPIV) may be utilized to control which logical unit numbers (LUNs) (e.g., storage devices) are visible to a VM.
  • NPIV enables multiple WWPNs to be assigned to a single physical N_Port. Different LUNs can be assigned to each WWPN rendering a LUN visible only to the VM that is zoned to a specific WWPN.
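  • A minimal sketch of this zoning relationship, assuming a simple mapping of virtual WWPNs to zoned LUNs (the identifiers are illustrative, not taken from the figures):
```python
# Hypothetical zoning table for one physical N_Port virtualized with NPIV:
# each virtual WWPN is zoned to the LUNs visible to a single VM.
npiv_zoning = {
    "wwpn_vm_131": {"lun_0", "lun_1"},
    "wwpn_vm_135": {"lun_2"},
}

def lun_visible(wwpn: str, lun: str) -> bool:
    """A LUN is visible only to the VM whose WWPN it is zoned to."""
    return lun in npiv_zoning.get(wwpn, set())

assert lun_visible("wwpn_vm_131", "lun_0")
assert not lun_visible("wwpn_vm_135", "lun_0")
```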
  • Embodiments of the present invention recognize that in a computing system that utilizes virtualization (e.g., hypervisor based), a large number of VMs may be provisioned.
  • the communication ports on the computing system are “initiators” of the communication with a storage system.
  • with I/O and port virtualization, the total number of possible communication paths that can be created between the VMs and a network may, in some instances, exceed 10,000 paths.
  • the communication ports of a storage system (e.g., a SAN) are identified as “targets.”
  • multiple connection paths may provide connection redundancy between a VM and a storage device; however, this does not preclude creating “bottle-necks” at the physical ports of a storage device.
  • in addition, connection paths may result in a limited number of hardware components and ports being utilized, generating “hot spots” or congestion, while other hardware components and ports are not utilized.
  • Embodiments of the present invention provide redundancy and load balancing for VMs of a computing system that communicate information over a network to a storage system utilizing a multi-path communication protocol.
  • Embodiments of the present invention determine the communication resources (e.g., servers, HBAs, physical ports, physical port IDs, virtual port IDs, etc.) available on a storage system and the requirements (e.g., LUNs, bandwidth, timeout constraints, fault-tolerance, etc.) of the VMs and associated software executing on the VMs of a computing system that communicate with the storage system.
  • the computing resources of a storage system are identified, and a sorted list of the computing resources is generated, in accordance with an embodiment of the present invention.
  • computing resources of a storage system are cyclically distributed, based on a hardware hierarchy as a method to minimize the effects of faults that may occur on a storage system.
  • the sorted list includes the targets (e.g., port IDs, WWPNs) on the storage system.
  • the targets are distributed among the initiators (e.g., port IDs, WWPNs) utilized by one or more VMs of a computing system.
  • a computing system and/or a storage system may reserve one or more system (e.g., computing) resources.
  • a computing system includes a communication routing program that utilizes information associated with the initiator-target assignments for negotiating (e.g., handshaking) the communication paths between a computing system and a storage system.
  • a percentage of the total possible targets is assigned to each initiator.
  • the number of targets assigned to an initiator is related to the bandwidth associated with a VM and/or software application executing on a VM.
  • embodiments of the present invention recognize that VMs of a computing system may be dynamically created (e.g., provisioned), paused, stopped, and destroyed. Limiting the distribution of the total number of targets provides a method to assign (e.g., allocate) targets to new initiators without generating a new sorted target list and renegotiating a plurality of communication paths between a computing system and a storage system.
  • one or more communication paths between a computing system and a storage system are monitored.
  • the monitoring of one or more communication paths is utilized to detect communication faults.
  • the monitoring of one or more communication paths is utilized to verify that sufficient bandwidth (e.g., communication paths) is assigned to an initiator.
  • Embodiments of the present invention also recognize that a VM of a computing system may execute various software programs where each software program may utilize one or more initiators to communicate with a storage system.
  • each initiator utilized by a VM may be assigned a different number of targets on a storage system.
  • a software program is determined to have timeout constraints that may affect the routing selection method, routing selection delay, and target assignments. For example, a software program that includes a timeout constraint may fail, abort, and/or produce an error message when information requested from data storage is not received within a period of time dictated by the code of the software program.
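  • A brief sketch of such a timeout constraint, assuming a socket-based read and an illustrative deadline (neither is specified by the text):
```python
import socket

def read_with_deadline(sock: socket.socket, nbytes: int, timeout_s: float) -> bytes:
    """Fail when requested data does not arrive within the program's timeout constraint."""
    sock.settimeout(timeout_s)
    try:
        return sock.recv(nbytes)
    except socket.timeout as err:
        # The calling program may abort or surface an error message here.
        raise RuntimeError("data not received within the timeout constraint") from err
```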
  • An alternate embodiment of the present invention recognizes that multi-path communications may also exist within computing systems hosting application servers and data storage systems as part of input/output (I/O) virtualization. Some embodiments of the present invention may be utilized by internal communication networks to handle redundancy and load balancing issues.
  • FIG. 1 illustrates distributed computing environment 100 , which includes computing system 102 , storage 105 , and network 110 .
  • An embodiment of distributed computing environment 100 includes computing system 102 and storage 105 interconnected over network 110 .
  • Many modifications to the depicted environment may be made by those skilled in the art without departing from the scope of the invention as recited by the claims.
  • computing system 102 is comprised of physical and virtualized systems.
  • Physical systems can be a stand-alone computer, or alternatively, a computing system utilizing clustered computers and components.
  • Virtual systems are independent operating environments that use virtual resources made up of logical divisions of physical resources, such as processors 121 , disks 122 , network cards 123 (e.g., input/output (I/O) adapters), and memory 124 .
  • Hypervisor 120 provides the ability to divide physical computing system resources into isolated logical partitions. Each logical partition operates like an independent computing system running its own operating system (e.g., a virtual system).
  • the independent operating environments controlled by a hypervisor may be structured in various schemes and hierarchies (e.g., logical partitions (LPAR), servers).
  • hypervisor 120 interfaces with communication routing program 125 as the status of one or more VM changes (e.g., provisioned, destroyed, etc.).
  • hypervisor 120 manages communication between the logical partitions and other systems within computing system 102 via one or more virtual switches (not shown).
  • some virtual switches and internal network communications are represented by bus 190 .
  • each logical partition may include one or more virtual adaptors for communication between the VMs within a logical partition and VMs or other systems outside of the LPAR.
  • LPAR 130 includes virtual adapters 140 , 150 , and 160 associated with VM 131 , VM 133 , and VM 135 , respectively.
  • the type of the virtual adapter depends on the operating system used by the logical partition.
  • examples of virtual adapters include virtual Ethernet adapters, virtual Fibre Channel adapters, virtual small computer system interface (SCSI) adapters, and virtual serial adapters.
  • Some of the virtual adapters utilize bus 190 to facilitate communications.
  • bus 190 may be configured as a Virtual Local Area Network (VLAN) within computing system 102 .
  • computing system 102 may utilize other technologies, such as Virtual Machine Communication Interface (VMCI) protocol or virtual network interface cards (VNIC), to enhance the communications among virtual adapters.
  • Physical and virtual adapters within computing system 102 may utilize protocols that support communication via virtual port IDs (e.g., NPIV, WWPNs).
  • computing system 102 is divided into logical partitions (LPARs) that include LPAR 129 and LPAR 130 with each LPAR executing an independent operating environment, such as an operating system (OS).
  • LPAR 129 contains an OS that supports a virtual I/O server (VIOS).
  • the VIOS LPAR (i.e., LPAR 129 ) includes I/O server 170 and I/O server 180 , the I/O servers respectively including communication adapters (e.g., host bus adapters (HBA)) 171 and/or 177 , and 181 and/or 187 .
  • LPAR 130 includes VM 131 , VM 133 , and VM 135 executing a shared OS.
  • at least one of I/O server 170 and I/O server 180 are physical computer servers (e.g., blade servers, network servers, etc.) that communicate with various portions of computing system 102 via an internal communication system, bus 190 .
  • communications from network 110 are routed through communication adapters (e.g., HBAs, network interface cards (NIC), etc.) 171 and 177 , and 181 and 187 of I/O server 170 and I/O server 180 , respectively, on logical partition 129 , to bus 190 .
  • Communications from virtual adapters 140 , 150 , and 160 in logical partition 130 may be routed through bus 190 to network 110 , in accordance with an embodiment of the present invention.
  • one or more of communication adapters 171 , 177 , 181 , and 187 are physical adapters included in network cards 123 .
  • one or more of communication adapters 171 , 177 , 181 , and 187 are virtual adapters derived from network cards 123 .
  • one or more physical network adapters are allocated to logical partition 130 for VM 131 , 133 , and 135 to function in place of instances of virtual adapters 140 , 150 , and 160 .
  • computing system 102 communicates through network 110 to storage 105 (e.g., a storage area network).
  • One or more connections of computing system 102 to network 110 occur via ports, herein identified as initiators.
  • One or more connections of storage 105 to network 110 occur via ports, herein identified as targets.
  • Network 110 can be, for example, a local area network (LAN), a telecommunications network, a wireless local area network (WLAN), a wide area network (WAN), such as the Internet, or any combination of the previous, and can include wired, wireless, or fiber optic connections.
  • network 110 can be any combination of connections and protocols that will support communications between computing system 102 and storage 105 , in accordance with embodiments of the present invention.
  • network 110 operates locally via wired, wireless, or optical connections and can be any combination of connections and protocols (e.g., NFC, laser, infrared, etc.).
  • Network 110 may be configured in various topologies (e.g., bus, tree, mesh, hybrid, fabric, switched fabric, etc.).
  • Communication routing program 125 utilizes various communication protocols to communicate data between computing system 102 and storage 105 , based on the networking technologies utilized by network 110 and the protocols supported by instances of physical and/or virtualized network adapters utilized by computing system 102 and storage 105 .
  • communication routing program 125 utilizes one or more protocols that support multi-path data communication between computing system 102 and storage 105 .
  • communication routing program 125 may establish (e.g., negotiate) a plurality of connections (e.g., communication paths) between computing system 102 and storage 105 and communicate information associated with the connections (e.g., initiator IDs, target IDs) to path generation program 400 .
  • communication routing program 125 dictates the assignment of the target ports on storage 105 based on a sorted list of targets generated by path generation program 400 .
  • communication routing program 125 receives initiator-target assignment from path generation program 400 for initiator ports of computing system 102 (e.g., referencing FIG. 6 , target port list for storage 105 ).
  • communication routing program 125 utilizes one or more networking utilities to determine information associated with a communication path. Information determined by communication routing program 125 may include: a status, a retry rate, a packet loss rate, a queuing delay, a propagation delay, an error rate, a fault, and a handshaking error.
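  • The per-path information listed above could be held in a simple record; the following dataclass and health check are an assumed representation for illustration, not the program's actual data structure:
```python
from dataclasses import dataclass

@dataclass
class PathInfo:
    """Per-path information a communication routing program might track (illustrative)."""
    status: str                 # e.g., "up", "degraded", "down"
    retry_rate: float
    packet_loss_rate: float
    queuing_delay_ms: float
    propagation_delay_ms: float
    error_rate: float
    fault: bool
    handshaking_error: bool

    def healthy(self, max_loss: float = 0.01) -> bool:
        """A simple, assumed health check combining the tracked fields."""
        return (self.status == "up" and not self.fault
                and not self.handshaking_error
                and self.packet_loss_rate <= max_loss)
```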
  • communication routing program 125 may utilize technologies, such as N_Port_ID Virtualization (NPIV) and N_Port Virtualization (NPV) to enable one or more ports of I/O server 170 and I/O server 180 to be identified by more than one instance of a WWPN.
  • communication routing program 125 enables multi-path communication (e.g., multi-path I/O) via I/O virtualization within computing system 102 .
  • communication routing program 125 enables multi-path communication between VM 131 (e.g., processors 121 ) utilizing virtual adapter 140 (e.g., network cards 123 ) and disks 122 .
  • Environment 200 includes I/O server 170 , communication adapter 171 , communication adapter 177 , bus 190 , and connections to network 110 . Environment 200 is described in further detail with respect to FIG. 2 .
  • Path generation program 400 determines which physical or virtual initiators (e.g., communication adapter ports) of computing system 102 communicate with storage 105 utilizing a multi-path communication protocol. Multiple instances of path generation program 400 may execute concurrently. Path generation program 400 may execute concurrently with alternate path selection program 500 . In some embodiments, another instance of path generation program 400 may execute when a new VM activates. In an embodiment, path generation program 400 determines the number of targets available on storage 105 and the physical relationships (e.g., communication adapter address, server supporting a communication adapter) of those targets on storage 105 . Path generation program 400 utilizes this information to determine the number of connections that may be utilized by each initiator and a group of targets that are assigned to each initiator.
  • Path generation program 400 communicates groups of targets that are assigned to each initiator to communication routing program 125 , which subsequently generates the connection paths between computing system 102 and storage 105 via network 110 utilizing a multi-path communication protocol.
  • path generation program 400 communicates with hypervisor 120 to determine when a new VM is provisioned and which initiators are utilized by the provisioned VM.
  • path generation program 400 communicates with hypervisor 120 to determine when the status of a VM changes; for example, when a VM de-provisions and which initiators are associated with the de-provisioned VM.
  • path generation program 400 may monitor a communication path between an initiator and a target. In a scenario, if path generation program 400 detects a problem with a communication path (e.g., connection), path generation program 400 may activate alternate path selection program 500 . In another embodiment, path generation program 400 may also determine which VM within LPAR 130 utilizes the initiator of the affected communication path.
  • Alternate path selection program 500 responds to faults detected in storage 105 .
  • Alternate path selection program 500 determines which targets (e.g., ports, port IDs, WWPNs, WWNNs) on storage 105 are associated with a fault detected on storage 105 . Based on the sorted list of targets generated by path generation program 400 , alternate path selection program 500 determines which initiators are assigned (e.g., connected) to the affected targets. In one embodiment, alternate path selection program 500 may communicate with communication routing program 125 and flag the affected connection to prevent further use of the flagged connection while the communication fault is present on storage 105 .
  • alternate path selection program 500 may activate path generation program 400 to generate a new list of connections for the affected VM.
  • alternate path selection program 500 may utilize information associated with a VM of an affected initiator determined by path generation program 400 (step 410 ) to determine if software executing on the VM includes timeout constraints which may be triggered by the affected communication connection.
  • Storage 105 includes data and information utilized by VM 131 , VM 133 , and VM 135 .
  • the data and information contained within storage 105 may include: text files, video files, audio files, numerical data, e-mail files, databases, etc.
  • storage 105 is a storage area network (SAN).
  • storage 105 is a network-attached storage (NAS) system.
  • storage 105 may be a SAN-NAS hybrid system.
  • storage 105 may utilize storage virtualization to enable additional functionality and more advanced features within a computer data storage system.
  • FIG. 2 depicts a functional block diagram illustrating environment 200 of computing system 102 within distributed computing environment 100 of FIG. 1 .
  • environment 200 includes: I/O server 170 , communication adapter 171 , communication adapter 177 , a portion of bus 190 , and associated physical and virtual communication ports (i.e., ports) that connect to network 110 .
  • communication adapter 171 includes physical ports P 201 and P 202
  • communication adapter 177 includes physical ports P 217 and P 218 .
  • physical port P 201 includes virtual ports VP 203 , VP 205 , and VP 207 .
  • physical port P 202 includes virtual ports VP 204 , VP 206 , and VP 208 .
  • Communication adapters 171 and 177 may communicate with VM 131 , VM 133 , and VM 135 via bus 190 .
  • VM 131 , VM 133 , and VM 135 may utilize one or more ports: VP 203 , VP 204 , VP 205 , VP 206 , VP 207 , VP 208 , P 217 , and P 218 to act as initiators (i.e., communication initiators) to communicate with storage 105 via network 110 .
  • FIG. 3 depicts a functional block diagram illustrating storage 105 .
  • storage 105 includes: server 340 , server 350 , and server 360 , where each server further includes two communication adapters and each communication adapter includes two physical ports.
  • Storage 105 may also include, but is not shown: persistent storage devices (e.g., hard-disk arrays, magnetic tape libraries, optical jukeboxes, solid-state disk drives, etc.), support/monitoring hardware and software, virtualization firmware and software (e.g., a hypervisor), and networking devices.
  • server 340 includes communication adapter A 341 that includes physical ports P 342 and P 343 ; and communication adapter A 347 that includes physical ports P 348 and P 349 .
  • one physical port of communication adapter A 341 , port P 342 may utilize NPIV to create virtual ports VP 344 and VP 345 .
  • server 350 includes communication adapter A 351 that includes physical ports P 352 and P 353 ; and communication adapter A 357 that includes physical ports P 358 and P 359 .
  • server 360 includes communication adapter A 361 that includes physical ports P 362 and P 363 ; and communication adapter A 367 that includes physical ports P 368 and P 369 .
  • storage 105 utilizes I/O virtualization and multi-path communication within storage 105 .
  • servers 340 , 350 , and 360 may have HBAs that communicate with the controllers of the storage devices (not shown) via storage virtualization utilizing NPIV and NPV.
  • Some protocols utilized within storage 105 may include: Fibre Channel protocol (FCP), Internet SCSI (iSCSI), SAS, Fibre Channel over Internet Protocol (FCIP), ATA over Ethernet (AoE), and Fibre Channel over Ethernet (FCoE).
  • FIG. 4 is a flowchart depicting operational steps for path generation program 400 , executing on computing system 102 within distributed computing environment 100 of FIG. 1 .
  • Path generation program 400 determines which initiators of a computing system communicate with a storage system utilizing a multi-path communication protocol.
  • path generation program 400 generates a sorted list of targets (e.g., ports) and assigns groups of items (e.g., targets) from a sorted list that are available (e.g., not reserved, not flagged) on a storage system to various initiators of a computing device to improve the performance (e.g., bandwidth) and reliability (e.g., redundancy) of the multi-path communications between the computing device and the storage system.
  • path generation program 400 determines which initiators of a computing system communicate with a storage system utilizing a multi-path communication protocol. In one embodiment, path generation program 400 determines which initiators of computing system 102 communicate with storage 105 by the protocol utilized by a communication connection. In an example, path generation program 400 may communicate with communication routing program 125 to determine which initiators of computing system 102 communicate with storage 105 via MPTCP, SCTP, and/or FCP. In another example, path generation program 400 utilizes handshaking information determined by communication routing program 125 to determine which initiators utilize a multi-path communication protocol.
  • path generation program 400 determines which initiators of computing system 102 utilize a multi-path communication protocol based on configuration information associated with a VM (e.g., VM 135 ). In one scenario, path generation program 400 may determine that an initiator (e.g., P 218 ) utilizes a multi-path communication protocol based on the middleware and/or application programming interfaces (APIs) (not shown) executed by a VM (e.g., VM 131 ) that utilizes the initiator (e.g., P 218 ). In another scenario, path generation program 400 determines which initiators utilize a multi-path communication protocol based on the provisioning information associated with a VM (e.g., VM 131 ).
  • path generation program 400 determines a number of targets of a storage system that connect to each initiator.
  • path generation program 400 utilizes information determined by communication routing program 125 obtained while establishing a plurality of connections between computing system 102 and storage 105 to determine the number of targets that are connected to each initiator.
  • path generation program 400 communicates with storage 105 to determine the number of targets (e.g., ports, port IDs) that are utilized by computing system 102 and which targets are reserved (e.g., inactive) on storage 105 .
  • path generation program 400 generates a sorted list of targets.
  • Path generation program 400 generates a sorted list of targets (e.g., port IDs, WWNNs, WWPNs) for storage 105 based on physical and virtual computing resources related to the targets.
  • the resources of storage 105 include one or more servers, where each server further includes one or more communication adapters, and each communication adapter further includes one or more ports.
  • path generation program 400 generates a sorted list that may be described as a series of tuples in the format of (S,C,P) where (S), (C), and (P) respectively identify: a server, a communication adapter, and a communication port (e.g., a port ID). Each element within the tuple format of (S,C,P) is cyclically utilized.
  • path generation program 400 prioritizes the sort based on a hardware hierarchy (e.g., servers affect communication adapters, communication adapters affect ports). In an example (referring to the elements within FIG. 3 ), an instance of path generation program 400 generates the series: ( 340 , A 341 ), ( 350 , A 351 ), ( 360 , A 361 ), ( 340 , A 347 ), ( 350 , A 357 ), and ( 360 , A 367 ).
  • the length of the series generated by path generation program 400 is based on the total number of target ports on storage 105 . In other scenarios, the length of the series generated by path generation program 400 is based on the number of active target ports on storage 105 .
  • path generation program 400 bases the sorted list on server 340 , server 350 , server 360 , and the communication adapters (e.g., A 341 /A 347 , A 351 /A 357 , A 361 /A 367 ) respectively associated with each server.
  • Path generation program 400 also cyclically distributes the respective communication ports, physical and/or virtual, among each server/communication adapter combination. Path generation program 400 applies a sorting routine to the components depicted in FIG. 3 to generate the target port list for storage 105 depicted in FIG. 6 (a minimal sketch of such a cyclic distribution is shown below).
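  • A minimal sketch of the cyclic, hierarchy-prioritized sort described above, assuming the topology of FIG. 3 is represented as a nested mapping (identifiers are illustrative):
```python
from itertools import cycle

# Assumed nested mapping of the storage topology in FIG. 3:
# server -> communication adapter -> target ports (physical and/or virtual).
topology = {
    "server_340": {"A341": ["VP344", "VP345", "P343"], "A347": ["P348", "P349"]},
    "server_350": {"A351": ["P352", "P353"], "A357": ["P358", "P359"]},
    "server_360": {"A361": ["P362", "P363"], "A367": ["P368", "P369"]},
}

def sorted_target_list(topology):
    """Yield (server, adapter, port) tuples, cyclically distributing servers first,
    then each server's adapters, then each adapter's ports."""
    total = sum(len(p) for adp in topology.values() for p in adp.values())
    server_cycle = cycle(topology)
    adapter_cycle = {s: cycle(adp) for s, adp in topology.items()}
    port_cycle = {(s, a): cycle(p) for s, adp in topology.items() for a, p in adp.items()}

    result, seen = [], set()
    while len(seen) < total:
        s = next(server_cycle)
        a = next(adapter_cycle[s])
        p = next(port_cycle[(s, a)])
        if (s, a, p) not in seen:  # each target appears once in this simplified sketch
            seen.add((s, a, p))
            result.append((s, a, p))
    return result

# The first tuples follow the server/adapter series described above:
# (server_340, A341, VP344), (server_350, A351, P352), (server_360, A361, P362),
# (server_340, A347, P348), (server_350, A357, P358), (server_360, A367, P368), ...
```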
  • path generation program 400 does not include WWPNs as port IDs when generating a sorted list of targets.
  • Path generation program 400 assigns the WWPN based on the LUNs that are utilized by a VM.
  • path generation program 400 may create a sorted list of ports based on other factors. Such factors may include: connection speed, routing paths, connection reliability, etc.
  • path generation program 400 defines a number of connections for each initiator and assigns targets to each initiator based on the sorted list of targets.
  • Path generation program 400 may utilize one or more rules (e.g., determined by a system administrator, dictated by a software program, included in the provisioning of a VM, etc.).
  • path generation program 400 identifies the bandwidth needed for the multi-path communications of each initiator utilized by a VM (e.g., VM 131 , VM 133 , and VM 135 ) and determines a number of connections to storage 105 (e.g., targets) to support the identified bandwidth.
  • initiators VP 203 , VP 205 , and VP 207 utilize a similar bandwidth (e.g., communication rate) and path generation program 400 assigns four connections (e.g., communication paths) to initiators VP 203 , VP 205 , and VP 207 .
  • path generation program 400 determines that initiator P 217 utilizes a higher bandwidth and path generation program 400 assigns initiator P 217 seven connections.
  • path generation program 400 receives an indication from VM 135 that a program executing on VM 135 , which utilizes initiator P 218 , includes timeout constraints.
  • Path generation program 400 may determine that the bandwidth utilized by P 218 is similar to VP 203 ; however, path generation program 400 also determines that the program of VM 135 that utilizes P 218 includes timeout constraints. Based on a timeout constraint associated with the program associated with initiator P 218 , path generation program 400 assigns seven connections to P 218 as opposed to four connections (see the sketch below). In some scenarios, path generation program 400 dynamically defines the number of connections assigned to each initiator. In one instance, path generation program 400 utilizes historical data from a load balancing program (not shown) to define the number of connections assigned to each initiator utilizing multi-path communication. In another instance, path generation program 400 defines the number of connections assigned to each initiator based on the number of active VMs of computing system 102 .
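  • A hedged sketch of such a rule set; the attribute names are assumptions, and the values four and seven simply mirror the example above:
```python
from dataclasses import dataclass

@dataclass
class Initiator:
    port_id: str
    high_bandwidth: bool = False
    timeout_constraint: bool = False

def connections_for(initiator: Initiator) -> int:
    """Assign more connections to initiators with higher bandwidth or timeout constraints."""
    if initiator.high_bandwidth or initiator.timeout_constraint:
        return 7
    return 4

assert connections_for(Initiator("VP203")) == 4
assert connections_for(Initiator("P217", high_bandwidth=True)) == 7
assert connections_for(Initiator("P218", timeout_constraint=True)) == 7
```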
  • path generation program 400 defines a uniform number of connections for each initiator utilizing multi-path communications.
  • path generation program 400 utilizes information obtained from the administrator of computing system 102 to define the number of connections that are assigned to each initiator utilizing multi-path communications with storage 105 .
  • path generation program 400 defines a number of connections assigned to each initiator based on a percentage of the total available targets on storage 105 , rounded to a whole number of connections. For example, storage 105 includes 1600 target port IDs and computing system 102 has 200 initiators that utilize multi-path communications. Path generation program 400 assigns 40% of the targets to the initiators engaged in multi-path communications with storage 105 .
  • path generation program 400 assigns 3 connections ( 3.2 rounded to the nearest integer number of connections) to each of the 200 initiators.
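  • The arithmetic of that example, as a short sketch (the rounding rule follows the text):
```python
total_targets = 1600   # target port IDs on storage 105
initiators = 200       # initiators of computing system 102 using multi-path communication
share = 0.40           # percentage of the total targets distributed among the initiators

connections_per_initiator = round(total_targets * share / initiators)  # 3.2 -> 3
assert connections_per_initiator == 3
```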
  • path generation program 400 utilizes unassigned targets of a sorted target list to assign targets to new initiators that utilize multi-path communication.
  • path generation program 400 returns assigned targets to a sorted target list when an initiator utilizing multi-path communication is not used (e.g., VM utilizing the initiator is de-provisioned).
  • path generation program 400 may communicate with storage 105 to determine the number of targets (e.g., ports, port IDs, WWPNs, WWNNs) available prior to assigning a number of connections to initiators on computing system 102 . Based on the information communicated to path generation program 400 by storage 105 , in some instances, the reserved ports are excluded from the sorted list generation (step 406 ). In other instances, path generation program 400 includes the reserved ports in the sorted port list; however, path generation program 400 does not assign the reserved ports to an initiator. In some scenarios, path generation program 400 determines that storage 105 reserves a portion of the available targets.
  • path generation program 400 determines that storage 105 reserves targets, P 368 and P 369 of communication adapter A 367 on server 360 .
  • the reserved targets are released during workload spikes.
  • the reserved targets are released when a communication fault is identified on storage 105 (e.g., an HBA failure, a communication adapter failure, a networking cable failure, a server failure, etc.).
  • path generation program 400 communicates the assigned targets for an initiator to communication routing program 125 .
  • communication routing program 125 utilizes the assigned targets for an initiator to modify the connection paths of network 110 utilized by computing system 102 to communicate with storage 105 .
  • path generation program 400 may transfer a list of ports (e.g., group) associated with each initiator to communication routing program 125 .
  • path generation program 400 communicates the list depicted in FIG. 6 to communication routing program 125 .
  • path generation program 400 may communicate changes associated with individual initiators to communication routing program 125 .
  • VM 131 , which utilizes initiator VP 203 , is de-provisioned.
  • Path generation program 400 communicates to communication routing program 125 that initiator VP 203 is not utilized.
  • path generation program 400 determines that a new VM is provisioned and communicates, to communication routing program 125 , a group of unassigned, substantially sequential targets from the sorted list of targets for the initiator associated with the new VM.
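  • A minimal sketch of handing a newly provisioned initiator the next unassigned, substantially sequential tuples from the sorted list (function and variable names are illustrative):
```python
def assign_to_new_initiator(sorted_targets, assigned, count):
    """Take the next `count` unassigned (server, adapter, port) tuples, in order,
    from the sorted target list and mark them as assigned."""
    group = [t for t in sorted_targets if t not in assigned][:count]
    assigned.update(group)
    return group

# Example usage against the sorted_target_list() sketch above:
# targets = sorted_target_list(topology)
# group = assign_to_new_initiator(targets, assigned=set(), count=4)
```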
  • path generation program 400 may assign multiple WWPNs to targets on storage 105 when NPIV is utilized. Path generation program 400 may utilize this strategy to zone specific LUNs of storage 105 to a VM (e.g., VM 135 ) on computing system 102 .
  • Path generation program 400 includes the WWPN assignments with the initiator-target assignment communicated to communication routing program 125 .
  • path generation program 400 optionally monitors a communication path between an initiator and a target.
  • path generation program 400 may monitor one or more communication paths associated with an initiator to ensure that sufficient connections were assigned to meet a bandwidth dictate of an initiator.
  • path generation program 400 may monitor one or more communication paths for an initiator associated with a program that includes timeout constraints.
  • path generation program 400 may execute alternate path selection program 500 to determine if a communication path issue is related to a fault on storage 105 .
  • path generation program 400 may determine that a communication path issue is not associated with a fault on storage 105 .
  • path generation program 400 may communicate with hypervisor 120 to assign another initiator to a VM associated with the affected communication path.
  • the path may fail due to another fault on: computing system 102 , storage 105 , and/or network 110 .
  • Examples of other issues that affect communication paths between computing system 102 and storage 105 are: incorrect hardware settings, duplicate IP addresses, access authority errors, errors in LUN masking, and incorrect LUN zoning.
  • FIG. 5 is a flowchart depicting operational steps for alternate path selection program 500 , executing on computing system 102 within distributed computing environment 100 of FIG. 1 .
  • Alternate path selection program 500 determines whether one or more communication faults may be related to a fault on storage 105 . If alternate path selection program 500 identifies a communication fault associated with storage 105 , then alternate path selection program 500 determines which targets are affected and whether an affected target includes a timeout constraint. Subsequently, alternate path selection program 500 utilizes one or more techniques to select unassigned communication paths (e.g., target IDs) and communicates the alternative communication path to communication routing program 125 and/or path generation program 400 .
  • alternate path selection program 500 identifies whether a communication fault is associated with a fault on a storage system. In one embodiment, alternate path selection program 500 receives information from communication routing program 125 that one or more connection paths are not established with storage 105 . In another embodiment, alternate path selection program 500 determines that there is a fault on storage 105 from a communication (e.g., status message) between storage 105 and computing system 102 . Responsive to determining that a communication fault is identified on storage 105 (yes branch, decision step 502 ), alternate path selection program 500 determines which targets are affected by a communication fault of a storage system (step 504 ).
  • alternate path selection program 500 determines which targets are associated with a communication fault of a storage system. In one embodiment, alternate path selection program 500 determines which targets (e.g., port IDs) are affected by a hardware fault. In one scenario, alternate path selection program 500 communicates with a monitoring program on storage 105 to determine that a server is associated with an identified communication fault. In an example, (referring to FIG. 3 ) if server 360 is shutdown, then alternate path selection program 500 determines that targets (e.g., ports) P 362 , P 363 , P 368 , and P 369 are affected.
  • alternate path selection program 500 determines that targets P 358 and/or P 359 may be affected. In this scenario, alternate path selection program 500 may require additional information from storage 105 to determine whether one or both of targets P 358 and P 359 are affected by a fault on communication adapter A 357 .
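  • A sketch of mapping a server-level or adapter-level fault to its affected target ports, reusing the hypothetical topology mapping from the earlier sketch:
```python
def affected_targets(topology, failed_server=None, failed_adapter=None):
    """Return the target ports behind a failed server, or behind a failed
    (server, adapter) pair, as in the examples above."""
    affected = []
    for server, adapters in topology.items():
        for adapter, ports in adapters.items():
            if server == failed_server or (server, adapter) == failed_adapter:
                affected.extend(ports)
    return affected

# affected_targets(topology, failed_server="server_360")            -> P362, P363, P368, P369
# affected_targets(topology, failed_adapter=("server_350", "A357")) -> P358, P359
```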
  • alternate path selection program 500 determines which targets (e.g., port IDs) are affected by a software fault.
  • alternate path selection program 500 communicates with a hypervisor of storage 105 to determine whether internal communication occurs among storage systems and server 340 , server 350 , and server 360 .
  • in an example, if the hypervisor of storage 105 cannot communicate with server 360 , then alternate path selection program 500 determines that targets (e.g., ports) P 362 , P 363 , P 368 , and P 369 are affected.
  • if alternate path selection program 500 determines that N_Port ID virtualization (NPIV) is associated with a communication fault on storage 105 , then alternate path selection program 500 determines that targets VP 344 and VP 345 are affected.
  • alternate path selection program 500 communicates with communication routing program 125 to determine which targets of storage 105 are affected by a communication fault based on one or more network diagnostics executed by communication routing program 125 .
  • communication routing program 125 may periodically utilize a traceroute function (not shown) to identify which connection paths cannot be accessed.
  • communication routing program 125 may utilize a neighbor discovery protocol monitor (not shown) to identify which connection paths cannot be accessed.
  • alternate path selection program 500 determines which initiators are assigned to affected targets. In an embodiment, alternate path selection program 500 communicates with communication routing program 125 to determine the one or more initiators that are assigned to the affected targets. In another embodiment, alternate path selection program 500 communicates with path generation program 400 to obtain the sorted list of targets and the initiator-target assignments to determine the one or more initiators that are assigned to the affected targets.
  • alternate path selection program 500 identifies a software program associated with an initiator that includes a timeout constraint.
  • alternate path selection program 500 communicates with path generation program 400 to determine whether a software program, executing on computing system 102 , that utilizes an affected target on storage 105 includes a timeout constraint.
  • alternate path selection program 500 communicates with hypervisor 120 to determine whether a software program, executing on computing system 102 , that utilizes an affected target on storage 105 includes a timeout constraint.
  • alternate path selection program 500 determines from communications with path generation program 400 that the initiator associated with a timeout constraint is also affected by varying bandwidth utilization.
  • alternate path selection program 500 determines whether an initiator utilizes an affected target. In one embodiment, alternate path selection program 500 determines that an initiator is affected based on a determined timeout constraint of a VM and/or program that utilizes a target that is affected by a fault. In another embodiment, alternate path selection program 500 determines that an initiator is associated with an affected target based on information obtained from path generation program 400 . In one scenario, alternate path selection program 500 determines that an initiator is affected based on “quality of service” measurements (e.g., error rates, retry rates, etc.) obtained from path generation program 400 . In another scenario, alternate path selection program 500 determines that sufficient communication paths do not exist for an initiator based on a bandwidth constraint of a VM and/or software program.
  • responsive to determining that an initiator utilizes an affected target (yes branch, decision step 510 ), alternate path selection program 500 selects an alternative communication path for the affected initiator (step 512 ).
  • alternate path selection program 500 determines that the affected initiator is not associated with a software program that includes a timeout constraint.
  • alternate path selection program 500 communicates to communication routing program 125 which target of storage 105 is affected and that communication routing program 125 may proceed to another assigned target.
  • Alternate path selection program 500 may utilize various scheduling or queuing algorithms for selecting the other assigned target. Examples of queuing algorithms that alternate path selection program 500 may utilize include: round robin, weighted round robin, deficit round robin, multilevel queuing, and random.
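  • A simple round-robin sketch of proceeding to another assigned target while skipping flagged targets; the weighted, deficit, and other variants named above would change only the selection rule:
```python
def next_target(assigned, flagged, last_index):
    """Round-robin over an initiator's assigned targets, excluding flagged ones.
    Returns the chosen target and the updated index, or (None, last_index)
    if no healthy target remains."""
    usable = [t for t in assigned if t not in flagged]
    if not usable:
        return None, last_index
    index = (last_index + 1) % len(usable)
    return usable[index], index
```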
  • alternate path selection program 500 flags the affected target to communication routing program 125 so that the affected target is excluded from subsequent multi-path data communications.
  • alternate path selection program 500 determines from communications with path generation program 400 that an initiator has a bandwidth constraint. Alternate path selection program 500 communicates the information related to the affected target to path generation program 400 for path generation program 400 to communicate a replacement target assignment to communication routing program 125 .
  • alternate path selection program 500 determines that the affected initiator is associated with a software program that includes a timeout constraint. In one scenario, alternate path selection program 500 communicates to communication routing program 125 which target of storage 105 is affected and that communication routing program 125 may reduce the delay and/or number of communication retries before proceeding to the next assigned target in round-robin fashion. In addition, alternate path selection program 500 flags the affected target to communication routing program 125 so that the affected target is excluded from subsequent multi-path data communications.
  • alternate path selection program 500 communicates with path generation program 400 to replace the affected targets with a new group of target assignments from the sorted target list for storage 105 . Subsequently, path generation program 400 communicates the replacement target assignments to communication routing program 125 .
  • alternate path selection program 500 determines that an affected initiator includes both a timeout constraint and variable bandwidth utilization. In an instance, alternate path selection program 500 communicates with path generation program 400 to replace the affected targets with a new group of target assignments from the sorted target list for storage 105 , where the new group of targets is larger to support periods of higher bandwidth utilization. Subsequently, path generation program 400 communicates the new and larger group of target assignments to communication routing program 125 .
  • in response to determining that an initiator does not utilize an affected target (no branch, decision step 510 ), alternate path selection program 500 terminates.
  • FIG. 6 is an illustrative example of a portion of a list of target ports on storage 105 and the initiators of computing system 102 that may utilize the target ports.
  • FIG. 6 depicts twenty-four targets assigned to five initiators, in accordance with an embodiment of the present invention.
  • Target port list for storage 105 is generated based on an embodiment of path generation program 400 (referencing step 406 ).
  • Column 602 depicts the assignment of five physical and virtual initiators of computing system 102 to targets (e.g., ports) of storage 105 , in accordance with an embodiment of the present invention.
  • Column 604 depicts a cyclical assignment of servers: server 340 , then server 350 , then server 360 , and the cycle begins again with server 340 .
  • Column 606 depicts a cyclical assignment of communication adapters respectively associated with each instance of a server (referencing FIG. 3 ).
  • communication adapter A 341 is assigned to the first, third, fifth, and seventh instances of server 340 .
  • Communication adapter A 347 is assigned to the second, fourth, sixth, and eighth instances of server 340 .
  • Column 608 depicts a cyclical assignment of physical and virtual ports (e.g., targets) respectively associated with each instance of a communication adapter (referencing FIG. 3 ).
  • virtual port VP 344 is assigned to the first and fourth instances of communication adapter A 341 .
  • Virtual port VP 345 is assigned to the second instance of communication adapter A 341
  • port P 343 is assigned to the third instance of communication adapter A 341 .
  • initiator P 217 is assigned ports P 343 , P 352 , P 362 , P 348 , P 358 , and P 368 .
  • Row 610 depicts an instance where alternate path selection program 500 determines that communication adapter A 361 (stippled shading) of storage 105 is identified as having a communication fault. In an embodiment, alternate path selection program 500 may determine that there is no impact to initiator P 217 . Alternate path selection program 500 communicates with communication routing program 125 to proceed with a round-robin utilization of ports P 343 , P 352 , P 348 , P 358 , and P 368 for multi-path communication between computing system 102 and storage 105 .
  • initiator P 218 is assigned ports VP 344 , P 352 , P 362 , P 348 , P 358 , and P 368 .
  • Row 615 and row 616 depict an instance where alternate path selection program 500 determines that server 350 (random stipple shading) of storage 105 is identified as having a communication fault. A fault in server 350 affects targets P 352 , P 353 , P 358 , and P 359 .
  • alternate path selection program 500 may determine, based on communications with path generation program 400 , that initiator P 218 is bandwidth constrained based on losing the communication paths associated with targets P 352 and P 358 .
  • Alternate path selection program 500 communicates with path generation program 400 that additional targets are needed to maintain the bandwidth dictated by initiator P 218 .
  • initiators VP 203 , VP 205 , VP 207 , P 217 , and P 218 of computing system 102 are engaged in multi-path communication with storage 105 .
  • path generation program 400 assigns the next two target ports in sequence.
  • the tuples that describe the two target ports in sequence are: server 340 /communication adapter A 341 /port VP 345 and server 350 /communication adapter A 351 /port P 353 .
  • port P 353 is affected by the fault in server 350 ; therefore, path generation program 400 assigns a different target, which is described by a tuple as server 360 /communication adapter A 361 /port VP 343 .
  • FIG. 7 depicts computer system 700 , which is representative of computing system 102 and storage 105 .
  • Computer system 700 is an example of a system that includes software and data 712 .
  • Computer system 700 includes processor(s) 701 , cache 703 , memory 702 , persistent storage 705 , communications unit 707 , input/output (I/O) interface(s) 706 , and communications fabric 704 .
  • Communications fabric 704 provides communications between cache 703 , memory 702 , persistent storage 705 , communications unit 707 , and input/output (I/O) interface(s) 706 .
  • Communications fabric 704 can be implemented with any architecture designed for passing data and/or control information between processors (such as microprocessors, communications and network processors, etc.), system memory, peripheral devices, and any other hardware components within a system.
  • communications fabric 704 can be implemented with one or more buses or a crossbar switch.
  • Memory 702 and persistent storage 705 are computer readable storage media.
  • memory 702 includes random access memory (RAM).
  • memory 702 can include any suitable volatile or non-volatile computer readable storage media.
  • Cache 703 is a fast memory that enhances the performance of processor(s) 701 by holding recently accessed data, and data near recently accessed data, from memory 702 .
  • memory 702 includes, at least in part, designated memory 124 (e.g., physical hardware) depicted in FIG. 1 to be shared among logical partitions.
  • persistent storage 705 includes a magnetic hard disk drive.
  • persistent storage 705 can include a solid-state hard drive, a semiconductor storage device, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, or any other computer readable storage media that is capable of storing program instructions or digital information.
  • persistent storage 705 includes, at least in part, disks 122 (e.g., physical hardware) depicted in FIG. 1 to be shared among logical partitions.
  • the media used by persistent storage 705 may also be removable.
  • a removable hard drive may be used for persistent storage 705 .
  • Other examples include optical and magnetic disks, thumb drives, and smart cards that are inserted into a drive for transfer onto another computer readable storage medium that is also part of persistent storage 705 .
  • Software and data 712 are stored in persistent storage 705 for access and/or execution by one or more of the respective processor(s) 701 via cache 703 and one or more memories of memory 702 .
  • software and data 712 includes hypervisor 120 , communication routing program 125 , path generation program 400 , alternate path selection program 500 , and other programs.
  • Communications unit 707, in these examples, provides for communications with other data processing systems or devices, including resources of computing system 102 and processors 121.
  • communications unit 707 includes one or more network interface cards.
  • Communications unit 707 may provide communications through the use of either or both physical and wireless communications links.
  • hypervisor 120 , virtual adapter 140 , virtual adapter 150 , virtual adapter 160 , bus 190 , and program instructions and data used to practice embodiments of the present invention may be downloaded to persistent storage 705 through communications unit 707 .
  • communications unit 707 includes, at least in part, one or more network cards 123 (e.g., physical hardware), virtual adapter 140, virtual adapter 150, virtual adapter 160, communication adapter 171, communication adapter 177, communication adapter 181, and communication adapter 187 depicted in FIG. 1, to be shared among logical partitions.
  • I/O interface(s) 706 allows for input and output of data with other devices that may be connected to each computer system.
  • I/O interface(s) 706 may provide a connection to external device(s) 708 , such as a keyboard, a keypad, a touch screen, and/or some other suitable input device.
  • External device(s) 708 can also include portable computer readable storage media such as, for example, thumb drives, portable optical or magnetic disks, and memory cards.
  • Software and data used to practice embodiments of the present invention can be stored on such portable computer readable storage media and can be loaded onto persistent storage 705 via I/O interface(s) 706 .
  • I/O interface(s) 706 also connect to display device 709 .
  • Display device 709 provides a mechanism to display data to a user and may be, for example, a computer monitor. Display device 709 can also function as a touch screen, such as the display of a tablet computer or a smartphone.
  • aspects of the present invention may be embodied as a system, method, or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer-readable medium(s) having computer-readable program code/instructions embodied thereon.
  • the present invention may be a system, a method, and/or a computer program product.
  • the computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
  • the computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.
  • the computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • a non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
  • a computer readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
  • Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.
  • the network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
  • Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
  • the computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
  • These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block may occur out of the order noted in the figures.
  • two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.

Abstract

A method for routing communication paths among computing devices. The method includes one or more computer processors identifying a computing entity and a data storage entity that transfer data. The method further includes determining a plurality of communication ports that the data storage entity utilizes to transfer data to the computing entity. The method further includes identifying a plurality of computing resources respectively associated with the determined plurality of communication ports that the data storage entity utilizes to transfer the data to the computing entity. The method further includes generating a list of tuples for the data storage entity based, at least in part, on the identified plurality of computing resources and the determined plurality of communication ports.

Description

BACKGROUND OF THE INVENTION
The present invention relates generally to the field of network communications, and more particularly to controlling the network connections utilized by multi-path data transmission protocols.
In a distributed computing environment, large volumes of information can be stored on systems that are optimized for data storage, such as a network-attached storage (NAS) system and/or a storage area network (SAN). The execution of software and programs can be deployed to hardware that supports virtual systems (e.g., virtual machines). In addition, various components within a computing system can be virtualized, such as network switches and communication adapters. A virtual machine (e.g., an application server) can be dynamically configured (e.g., computational speed, multitasking, high-volume network traffic, response time, reliability, etc.) and optimized for the applications executed on the virtual machine (VM). In contrast, some objectives of utilizing NAS and/or SAN systems, as opposed to integrating data storage on a computing system hosting VMs, are: improved availability (e.g., fault-tolerance), improved performance (e.g., bandwidth), improved scalability, and improved maintainability (e.g., disaster recovery processes).
Various network topologies exist that can enable the communication between a computing system hosting VMs and a SAN storing the information utilized by a VM. A mesh network topology provides at least two nodes with two or more communication paths between the nodes, which provides redundant paths for the communications. In addition, various communication protocols can be utilized within a communication network. Some networking protocols (e.g., Fibre Channel Protocol (FCP)) are implemented on the communication adapters (e.g., host bus adapters), which reduces the resource demands on central processing units (CPUs) within a computing system and/or a SAN. Other networking protocols can take advantage of the redundant paths within some networks to increase the bandwidth of information transfer. Two such protocols are Stream Control Transmission Protocol (SCTP) and Multi-path Transmission Control Protocol (MPTCP). In addition, various virtualization technologies can be applied to communication ports of a computing system and/or a SAN. In one instance, virtualizing port IDs improves the isolation of VMs utilizing the same physical port on a SAN.
SUMMARY
Aspects of an embodiment of the present invention disclose a method, computer program product, and computing system for routing communication paths among computing devices. The method includes one or more computer processors identifying a computing entity and a data storage entity that transfer data. The method further includes determining a plurality of communication ports that the data storage entity utilizes to transfer data to the computing entity. The method further includes identifying a plurality of computing resources respectively associated with the determined plurality of communication ports that the data storage entity utilizes to transfer the data to the computing entity. The method further includes generating a list of tuples for the data storage entity based, at least in part, on the identified plurality of computing resources and the determined plurality of communication ports.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates a distributed computing environment, in accordance with an embodiment of the present invention.
FIG. 2 depicts an illustrative example of a configuration of physical and virtual ports associated with communication adapters of a computing system to communicate with a network, in accordance with an embodiment of the present invention.
FIG. 3 depicts an illustrative example of a storage area network utilizing physical and virtual ports to communicate with a network, in accordance with an embodiment of the present invention.
FIG. 4 depicts a flowchart of steps of a path generation program, in accordance with an embodiment of the present invention.
FIG. 5 depicts a flowchart of steps of an alternate path selection program, in accordance with an embodiment of the present invention.
FIG. 6 depicts an illustrative example of a portion of a list of communication ports for a storage area network, sorted, in accordance with an embodiment of the present invention.
FIG. 7 depicts a block diagram of components of a computer, in accordance with an embodiment of the present invention.
DETAILED DESCRIPTION
Embodiments of the present invention recognize that virtual environments for application servers as well as storage systems are an area of growth for some businesses. Data center decentralization and Cloud computing are two factors that contribute to this growth. To ensure that the transmission of data among application servers and SAN and/or NAS systems is reliable and low-latency, multi-path information transfer may be utilized. Some networking technologies utilize a World Wide Name (WWN) as a unique identifier for a storage technology. Some storage technologies that utilize WWNs are Fibre Channel (FC), Serial Attached SCSI (SAS), and Advanced Technology Attachment (ATA). In one instance, a WWN may be used as a World Wide Node Name (WWNN) to identify a switch. In another instance, a World Wide Port Name (WWPN) may be utilized to identify a port on a switch (e.g., communication adapter). To reduce the number of physical ports required on communication adapters (e.g., host bus adapters (HBA)) of a computing system, SAN, and/or a NAS, N_Port_ID Virtualization (NPIV) may be utilized. In addition, without NPIV, the logical unit numbers (LUNs) (e.g., storage devices) zoned to a specific physical N_Port_ID would be visible to any VM that is connected to that N_Port_ID. NPIV enables multiple WWPNs to be assigned to a single physical N_Port. Different LUNs can be assigned to each WWPN, rendering a LUN visible only to the VM that is zoned to a specific WWPN.
Embodiments of the present invention recognize that in a computing system that utilizes virtualization (e.g., hypervisor based), a large number of VMs may be provisioned. The communication ports on the computing system are “initiators” of the communication with a storage system. By utilizing I/O and port virtualization, the total number of possible communication paths that can be created between the VMs and a network, in some instances, may exceed 10,000 paths. Similarly, a storage system (e.g., a SAN) can present a large number of possible connections (targets) to a network. This large number of possible connection paths (e.g., routes) may provide connection redundancy between a VM and a storage device; however, this does not preclude creating “bottle-necks” at the physical ports of a storage device. Embodiments of the present invention recognize that simply limiting the number of connection paths may result in a limited number of hardware components and ports being utilized, generating “hot spots” or congestion, while other hardware components and ports are not utilized.
Embodiments of the present invention provide redundancy and load balancing for VMs of a computing system that communicate information over a network to a storage system utilizing a multi-path communication protocol. Embodiments of the present invention determine the communication resources (e.g., servers, HBAs, physical ports, physical port IDs, virtual port IDs, etc.) available on a storage system and the requirements (e.g., LUNs, bandwidth, timeout constraints, fault-tolerance, etc.) of the VMs and associated software executing on the VMs of a computing system that communicate with the storage system. The computing resources of a storage system are identified, and a sorted list of the computing resources is generated, in accordance with an embodiment of the present invention. In some embodiments of the present invention, computing resources of a storage system are cyclically distributed, based on a hardware hierarchy as a method to minimize the effects of faults that may occur on a storage system. The sorted list includes the targets (e.g., port IDs, WWPNs) on the storage system. The targets are distributed among the initiators (e.g., port IDs, WWPNs) utilized by one or more VMs of a computing system. In addition, embodiments of the present invention recognize that a computing system and/or a storage system may reserve one or more system (e.g., computing) resources. In an embodiment of the present invention, a computing system includes a communication routing program that utilizes information associated with the initiator-target assignments for negotiating (e.g., handshaking) the communication paths between a computing system and a storage system.
In one embodiment, a percentage of the total possible targets is assigned to each initiator. In another embodiment, the number of targets assigned to an initiator is related to the bandwidth associated with a VM and/or software application executing on a VM. In addition, embodiments of the present invention recognize that VMs of a computing system may be dynamically created (e.g., provisioned), paused, stopped, and destroyed. Limiting the distribution of the total number of targets provides a method to assign (e.g., allocate) targets to new initiators without generating a new sorted target list and renegotiating a plurality of communication paths between a computing system and a storage system. In some embodiments of the present invention, one or more communication paths between a computing system and a storage system are monitored. In an embodiment, the monitoring of one or more communication paths is utilized to detect communication faults. In another embodiment, the monitoring of one or more communication paths is utilized to verify that sufficient bandwidth (e.g., communication paths) is assigned to an initiator.
Embodiments of the present invention also recognize that a VM of a computing system may execute various software programs where each software program may utilize one or more initiators to communicate with a storage system. In some embodiments, each initiator utilized by a VM may be assigned a different number of targets on a storage system. In other embodiments, a software program is determined to have timeout constraints that may affect the routing selection method, routing selection delay, and target assignments. For example, a software program that includes a timeout constraint may fail, abort, and/or produce an error message when information requested from data storage is not received within a period of time dictated by the code of the software program.
An alternate embodiment of the present invention recognizes that multi-path communications may also exist within computing systems hosting application servers and data storage systems as part of input/output (I/O) virtualization. Some embodiments of the present invention may be utilized by internal communication networks to handle redundancy and load balancing issues.
The present invention will now be described in detail with reference to the Figures. FIG. 1 illustrates distributed computing environment 100, which includes computing system 102, storage 105, and network 110. An embodiment of distributed computing environment 100 includes computing system 102 and storage 105 interconnected over network 110. Many modifications to the depicted environment may be made by those skilled in the art without departing from the scope of the invention as recited by the claims.
In some embodiments, computing system 102 is comprised of physical and virtualized systems. Physical systems can be a stand-alone computer, or alternatively, a computing system utilizing clustered computers and components. Virtual systems are independent operating environments that use virtual resources made up of logical divisions of physical resources, such as processors 121, disks 122, network cards 123 (e.g., input/output (I/O) adapters), and memory 124. Hypervisor 120 provides the ability to divide physical computing system resources into isolated logical partitions. Each logical partition operates like an independent computing system running its own operating system (e.g., a virtual system). The independent operating environments controlled by a hypervisor may be structured in various schemes and hierarchies (e.g., logical partitions (LPAR), servers). In some embodiments, hypervisor 120 interfaces with communication routing program 125 as the status of one or more VM changes (e.g., provisioned, destroyed, etc.).
In some embodiments, in addition to creating and managing the logical partitions and associated VMs, hypervisor 120 manages communication between the logical partitions and other systems within computing system 102 via one or more virtual switches (not shown). In an embodiment, some virtual switches and internal network communications are represented by bus 190. To facilitate communication, each logical partition may include one or more virtual adapters for communication between the VMs within a logical partition and VMs or other systems outside of the LPAR. For example, LPAR 130 includes virtual adapters 140, 150, and 160 associated with VM 131, VM 133, and VM 135, respectively. The type of the virtual adapter depends on the operating system used by the logical partition. Examples of virtual adapters include virtual Ethernet adapters, virtual Fibre Channel adapters, virtual Small Computer System Interface (SCSI) adapters, and virtual serial adapters. Some of the virtual adapters utilize bus 190 to facilitate communications. In an embodiment, bus 190 may be configured as a Virtual Local Area Network (VLAN) within computing system 102. In another embodiment, computing system 102 may utilize other technologies, such as Virtual Machine Communication Interface (VMCI) protocol or virtual network interface cards (VNIC), to enhance the communications among virtual adapters. Physical and virtual adapters within computing system 102 may utilize protocols that support communication via virtual port IDs (e.g., NPIV, WWPNs).
In some embodiments, computing system 102 is divided into logical partitions (LPARs) that include LPAR 129 and LPAR 130 with each LPAR executing an independent operating environment, such as an operating system (OS). In an embodiment, LPAR 129 contains an OS that supports a virtual I/O server (VIOS). The VIOS LPAR (i.e., LPAR 129) includes I/O server 170 and I/O server 180, with the I/O servers respectively including communication adapters (e.g., host bus adapters (HBA)) 171 and/or 177, and 181 and/or 187. In another embodiment, LPAR 130 includes VM 131, VM 133, and VM 135 executing a shared OS. In other embodiments, at least one of I/O server 170 and I/O server 180 are physical computer servers (e.g., blade servers, network servers, etc.) that communicate with various portions of computing system 102 via an internal communication system, bus 190.
In an embodiment, communications from network 110 are routed through communication adapters (e.g., HBAs, network interface cards (NIC), etc.) 171 and 177, and 181 and 187 of I/O server 170 and I/O server 180, respectively, on logical partition 129, to bus 190. Communications from virtual adapters 140, 150, and 160 in logical partition 130 may be routed through bus 190 to network 110, in accordance with an embodiment of the present invention. In one embodiment, one or more of communication adapters 171, 177, 181, and 187 are physical adapters included in network cards 123. In another embodiment, one or more of communication adapters 171, 177, 181, and 187 are virtual adapters derived from network cards 123. In an alternative embodiment, one or more physical network adapters are allocated to logical partition 130 for VM 131, 133, and 135 to function in place of instances of virtual adapters 140, 150, and 160.
In an embodiment, computing system 102 communicates through network 110 to storage 105 (e.g., a storage area network). One or more connections of computing system 102 to network 110 occur via ports; herein identified as initiators. One or more connections of storage 105 to network 110 occur via ports; herein identified as targets. Network 110 can be, for example, a local area network (LAN), a telecommunications network, a wireless local area network (WLAN), a wide area network (WAN), such as the Internet, or any combination of the previous, and can include wired, wireless, or fiber optic connections. In general, network 110 can be any combination of connections and protocols that will support communications between computing system 102 and storage 105, in accordance with embodiments of the present invention. In another embodiment, network 110 operates locally via wired, wireless, or optical connections and can be any combination of connections and protocols (e.g., NFC, laser, infrared, etc.). Network 110 may be configured in various topologies (e.g., bus, tree, mesh, hybrid, fabric, switched fabric, etc.).
Communication routing program 125 utilizes various communication protocols to communicate data between computing system 102 and storage 105, based on the networking technologies utilized by network 110 and the protocols supported by instances of physical and/or virtualized network adapters utilized by computing system 102 and storage 105. In one embodiment, communication routing program 125 utilizes one or more protocols that support multi-path data communication between computing system 102 and storage 105. In some embodiments, communication routing program 125 may establish (e.g., negotiate) a plurality of connections (e.g., communication paths) between computing system 102 and storage 105 and communicate information associated with the connections (e.g., initiator IDs, target IDs) to path generation program 400. In another embodiment, communication routing program 125 dictates the assignment of the target ports on storage 105 based on a sorted list of targets generated by path generation program 400. In an example, communication routing program 125 receives initiator-target assignment from path generation program 400 for initiator ports of computing system 102 (e.g., referencing FIG. 6, target port list for storage 105). In other embodiments, communication routing program 125 utilizes one or more networking utilities to determine information associated with a communication path. Information determined by communication routing program 125 may include: a status, a retry rate, a packet loss rate, a queuing delay, a propagation delay, an error rate, a fault, and a handshaking error.
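As an illustration only (not part of the patent), the per-path information enumerated above could be carried in a simple record such as the following Python sketch; every field name here is an assumption introduced for clarity rather than a term defined in the specification.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PathInfo:
    """Per-path measurements of the kind listed above (illustrative names only)."""
    initiator_id: str                 # e.g., a port ID or WWPN on computing system 102
    target_id: str                    # e.g., a port ID or WWPN on storage 105
    status: str = "up"
    retry_rate: float = 0.0
    packet_loss_rate: float = 0.0
    queuing_delay_ms: float = 0.0
    propagation_delay_ms: float = 0.0
    error_rate: float = 0.0
    fault: Optional[str] = None
    handshaking_error: Optional[str] = None
```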
In a further embodiment, communication routing program 125 may utilize technologies, such as N_Port_ID Virtualization (NPIV) and N_Port Virtualization (NPV) to enable one or more ports of I/O server 170 and I/O server 180 to be identified by more than one instance of a WWPN. In a different embodiment, communication routing program 125 enables multi-path communication (e.g., multi-path I/O) via I/O virtualization within computing system 102. For example, communication routing program 125 enables multi-path communication between VM 131 (e.g., processors 121) utilizing virtual adapter 140 (e.g., network cards 123) and disks 122.
Environment 200 includes I/O server 170, communication adapter 171, communication adapter 177, bus 190, and connections to network 110. Environment 200 is described in further detail with respect to FIG. 2.
Path generation program 400 determines which physical or virtual initiators (e.g., communication adapter ports) of computing system 102 communicate with storage 105 utilizing a multi-path communication protocol. Multiple instances of path generation program 400 may execute concurrently. Path generation program 400 may execute concurrently with alternate path selection program 500. In some embodiments, another instance of path generation program 400 may execute when a new VM activates. In an embodiment, path generation program 400 determines the number of targets available on storage 105 and the physical relationships (e.g., communication adapter address, server supporting a communication adapter) of those targets on storage 105. Path generation program 400 utilizes this information to determine the number of connections that may be utilized by each initiator and a group of targets that are assigned to each initiator. Path generation program 400 communicates groups of targets that are assigned to each initiator to communication routing program 125, which subsequently generates the connection paths between computing system 102 and storage 105 via network 110 utilizing a multi-path communication protocol. In other embodiments, path generation program 400 communicates with hypervisor 120 to determine when a new VM is provisioned and which initiators are utilized by the provisioned VM. In addition, path generation program 400 communicates with hypervisor 120 to determine when the status of a VM changes; for example, when a VM de-provisions and which initiators are associated with the de-provisioned VM.
In a further embodiment, path generation program 400 may monitor a communication path between an initiator and a target. In a scenario, if path generation program 400 detects a problem with a communication path (e.g., connection), path generation program 400 may activate alternate path selection program 500. In another embodiment, path generation program 400 may also determine which VM within LPAR 130 utilizes the initiator of the affected communication path.
Alternate path selection program 500 responds to faults detected in storage 105. Alternate path selection program 500 determines which targets (e.g., ports, port IDs, WWPNs, WWNNs) on storage 105 are associated with a fault detected on storage 105. Based on the sorted list of targets generated by path generation program 400, alternate path selection program 500 determines which initiators are assigned (e.g., connected) to the affected targets. In one embodiment, alternate path selection program 500 may communicate with communication routing program 125 and flag the affected connection to prevent further use of the flagged connection while the communication fault is present on storage 105. In another embodiment, if sufficient connections between an initiator and storage 105 are affected that degrade the communication bandwidth for a VM, alternate path selection program 500 may activate path generation program 400 to generate a new list of connections for the affected VM. In a further embodiment, alternate path selection program 500 may utilize information associated with a VM of an affected initiator determined by path generation program 400 (step 410) to determine if software executing on the VM includes timeout constraints which may be triggered by the affected communication connection.
Storage 105 includes data and information utilized by VM 131, VM 133, and VM 135. The data and information contained within storage 105 may include: text files, video files, audio files, numerical data, e-mail files, databases, etc. In one embodiment, storage 105 is a storage area network (SAN). In another embodiment, storage 105 is network-attached storage (NAS) system. In some embodiments, storage 105 may be a SAN-NAS hybrid system. In addition, storage 105 may utilize storage virtualization to enable additional functionality and more advanced features within a computer data storage system.
FIG. 2 depicts a functional block diagram illustrating environment 200 of computing system 102 within distributed computing environment 100 of FIG. 1. In an embodiment, environment 200 includes: I/O server 170, communication adapter 171, communication adapter 177, a portion of bus 190, and associated physical and virtual communication ports (i.e., ports) that connect to network 110. With respect to I/O server 170, communication adapter 171 includes physical ports P201 and P202, and communication adapter 177 includes physical ports P217 and P218. With respect to communication adapter 171, physical port P201 includes virtual ports VP203, VP205, and VP207. With respect to communication adapter 171, physical port P202 includes virtual ports VP204, VP206, and VP208. Communication adapters 171 and 177 may communicate with VM 131, VM 133, and VM 135 via bus 190. VM 131, VM 133, and VM 135 may utilize one or more ports: VP203, VP204, VP205, VP206, VP207, VP208, P217, and P218 to act as initiators (i.e., communication initiators) to communicate with storage 105 via network 110.
FIG. 3 depicts a functional block diagram illustrating storage 105. In one embodiment, storage 105 includes: server 340, server 350, and server 360, where each server further includes two communication adapters and each communication adapter includes two physical ports. Storage 105 may also include, but is not shown: persistent storage devices (e.g., hard-disk arrays, magnetic tape libraries, optical jukeboxes, solid-state disk drives, etc.), support/monitoring hardware and software, virtualization firmware and software (e.g., a hypervisor), and networking devices. With respect to server 340, server 340 includes communication adapter A341 that includes physical ports P342 and P343; and communication adapter A347 that includes physical ports P348 and P349. In an embodiment, one physical port of communication adapter A341, port P342, may utilize NPIV to create virtual ports VP344 and VP345. With respect to server 350, server 350 includes communication adapter A351 that includes physical ports P352 and P353; and communication adapter A357 that includes physical ports P358 and P359. With respect to server 360, server 360 includes communication adapter A361 that includes physical ports P362 and P363; and communication adapter A367 that includes physical ports P368 and P369.
In a different embodiment, storage 105 utilizes I/O virtualization and multi-path communication within storage 105. For example, servers 340, 350, and 360 may have HBAs that communicate with the controllers of the storage devices (not shown) via storage virtualization utilizing NPIV and NPV. Some protocols utilized within storage 105 may include: Fibre Channel protocol (FCP), Internet SCSI (iSCSI), SAS, Fibre Channel over Internet Protocol (FCIP), ATA over Ethernet (AoE), and Fibre Channel over Ethernet (FCoE).
FIG. 4 is a flowchart depicting operational steps for path generation program 400, executing on computing system 102 within distributed computing environment 100 of FIG. 1. Path generation program 400 determines which initiators of a computing system communicate with a storage system utilizing a multi-path communication protocol. In addition, path generation program 400 generates a sorted list of targets (e.g., ports) and assigns groups of items (e.g., targets) from a sorted list that are available (e.g., not reserved, not flagged) on a storage system to various initiators of a computing device to improve the performance (e.g., bandwidth) and reliability (e.g., redundancy) of the multi-path communications between the computing device and the storage system.
In step 402, path generation program 400 determines which initiators of a computing system communicate with a storage system utilizing a multi-path communication protocol. In one embodiment, path generation program 400 determines which initiators of computing system 102 communicate with storage 105 based on the protocol utilized by a communication connection. In an example, path generation program 400 may communicate with communication routing program 125 to determine which initiators of computing system 102 communicate with storage 105 via MPTCP, SCTP, and/or FCP. In another example, path generation program 400 utilizes handshaking information determined by communication routing program 125 to determine which initiators utilize a multi-path communication protocol. In another embodiment, path generation program 400 determines which initiators of computing system 102 utilize a multi-path communication protocol based on configuration information associated with a VM (e.g., VM 135). In one scenario, path generation program 400 may determine that an initiator (e.g., P218) utilizes a multi-path communication protocol based on the middleware and/or application programming interfaces (APIs) (not shown) executed by a VM (e.g., VM 131) that utilizes the initiator (e.g., P218). In another scenario, path generation program 400 determines which initiators utilize a multi-path communication protocol based on the provisioning information associated with a VM (e.g., VM 131).
In step 404, path generation program 400 determines a number of targets of a storage system that connect to each initiator. In an embodiment, path generation program 400 utilizes information determined by communication routing program 125 obtained while establishing a plurality of connections between computing system 102 and storage 105 to determine the number of targets that are connected to each initiator. In a further embodiment, path generation program 400 communicates with storage 105 to determine the number of targets (e.g., ports, port IDs) that are utilized by computing system 102 and which targets are reserved (e.g., inactive) on storage 105.
In step 406, path generation program 400 generates a sorted list of targets. Path generation program 400 generates a sorted list of targets (e.g., port IDs, WWNNs, WWPNs) for storage 105 based on physical and virtual computing resources related to the targets. In an embodiment, the resources of storage 105 include one or more servers, where each server further includes one or more communication adapters, and each communication adapter further includes one or more ports.
In one embodiment, path generation program 400 generates a sorted list that may be described as a series of tuples in the format of (S,C,P), where (S), (C), and (P) respectively identify: a server, a communication adapter, and a communication port (e.g., a port ID). Each element within the tuple format of (S,C,P) is cyclically utilized. In this embodiment, path generation program 400 prioritizes the sort based on a hardware hierarchy (e.g., servers affect communication adapters, communication adapters affect ports). In an example (referring to the elements within FIG. 3), utilizing the (S) and (C) elements of the tuple format, an instance of path generation program 400 generates the series: (340, A341), (350, A351), (360, A361), (340, A347), (350, A357), and (360, A367). In some scenarios, the length of the series generated by path generation program 400 is based on the total number of target ports on storage 105. In other scenarios, the length of the series generated by path generation program 400 is based on the number of active target ports on storage 105.
In some embodiments, one or more components of storage 105 may be virtualized. Referencing FIG. 3, path generation program 400 bases the sorted list on server 340, server 350, server 360, and the communication adapters (e.g., A341/A347, A351/A357, A361/A367) respectively associated with each server. Path generation program 400 also cyclically distributes the respective communication ports, physical and/or virtual, among each server/communication adapter combination. Referencing FIG. 3, the ports that are included in the list are VP344, VP345, P343, P348, P349, P352, P353, P358, P359, P362, P363, P368, and P369. In another embodiment, path generation program 400 applies a sorting routine to the components depicted in FIG. 3 to generate the target port list for storage 105 depicted in FIG. 6.
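A minimal Python sketch of this cyclical sort follows. It is illustrative only: the dictionary literal mirrors the FIG. 3 hierarchy, and the function and variable names are assumptions rather than anything defined by the patent.

```python
from itertools import cycle, islice

# Storage 105 topology taken from FIG. 3 (server -> communication adapter ->
# ports); physical port P342 is virtualized via NPIV into VP344 and VP345.
STORAGE_105 = {
    "340": {"A341": ["VP344", "VP345", "P343"], "A347": ["P348", "P349"]},
    "350": {"A351": ["P352", "P353"], "A357": ["P358", "P359"]},
    "360": {"A361": ["P362", "P363"], "A367": ["P368", "P369"]},
}

def sorted_target_list(topology, length):
    """Generate (server, adapter, port) tuples by cycling each hierarchy level.

    Adapters are ordered so that consecutive tuples rotate across servers
    first ((340, A341), (350, A351), (360, A361), (340, A347), ...), and each
    adapter then cycles through its own physical and virtual ports.
    """
    lanes = []
    max_adapters = max(len(adapters) for adapters in topology.values())
    for rank in range(max_adapters):
        for server, adapters in topology.items():
            names = list(adapters)
            if rank < len(names):
                lanes.append((server, names[rank], cycle(adapters[names[rank]])))

    def generate():
        for server, adapter, ports in cycle(lanes):
            yield (server, adapter, next(ports))

    return list(islice(generate(), length))

# The first entries reproduce the pattern of FIG. 6: the first twelve tuples
# would cover initiators such as VP203, VP205, and VP207 (four targets each
# in the FIG. 6 example), and positions 13-18 (P343, P352, P362, P348, P358,
# P368) match the targets listed for initiator P217.
for position, entry in enumerate(sorted_target_list(STORAGE_105, 18), start=1):
    print(position, entry)
```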
Referring to step 406, in other embodiments path generation program 400 does not include WWPNs as port IDs when generating a sorted list of targets. Path generation program 400 assigns the WWPN based on the LUNs that are utilized by a VM. In a different embodiment, path generation program 400 may create a sorted list of ports based on other factors. Such factors may include: connection speed, routing paths, connection reliability, etc.
In step 408, path generation program 400 defines a number of connections for each initiator and assigns targets to each initiator based on the sorted list of targets. Path generation program 400 may utilize one or more rules (e.g., determined by a system administrator, dictated by a software program, included in the provisioning of a VM, etc.). In one embodiment, path generation program 400 identifies the bandwidth needed for the multi-path communications of each initiator utilized by a VM (e.g., VM 131, VM 133, and VM 135) and determines a number of connections to storage 105 (e.g., targets) to support the identified bandwidth. In an instance, referring to the sorted list of FIG. 6, initiators VP203, VP205, and VP207 utilize a similar bandwidth (e.g., communication rate), and path generation program 400 assigns four connections (e.g., communication paths) to initiators VP203, VP205, and VP207. In another instance, path generation program 400 determines that initiator P217 utilizes a higher bandwidth, and path generation program 400 assigns initiator P217 seven connections. In a different instance, path generation program 400 receives an indication from VM 135 that a program executing on VM 135, which utilizes initiator P218, includes timeout constraints. Path generation program 400 may determine that the bandwidth utilized by P218 is similar to VP203; however, path generation program 400 also determines that the program of VM 135 that utilizes P218 includes timeout constraints. Based on the timeout constraint associated with the program associated with initiator P218, path generation program 400 assigns seven connections to P218 as opposed to four connections. In some scenarios, path generation program 400 dynamically defines the number of connections assigned to each initiator. In one instance, path generation program 400 utilizes historical data from a load balancing program (not shown) to define the number of connections assigned to each initiator utilizing multi-path communication. In another instance, path generation program 400 defines the number of connections assigned to each initiator based on the number of active VMs of computing system 102.
In another embodiment, path generation program 400 defines a uniform number of connections to each initiator utilizing multi-path communications. In one scenario, path generation program 400 utilizes information obtained from the administrator of computing system 102 to define the number of connections that are assigned to each initiator utilizing multi-path communications with storage 105. In another scenario, path generation program 400 defines a number of connections assigned to each initiator based on a percentage of the total available targets on storage 105, rounded to a whole number of connections. For example, storage 105 includes 1600 target port IDs and computing system 102 has 200 initiators that utilize multi-path communications. Path generation program 400 assigns 40% of the targets to the initiators engaged in multi-path communications with storage 105. In this example, path generation program 400 assigns 3 connections (3.2 rounded to the nearest whole number of connections) to each of the 200 initiators. In some scenarios, path generation program 400 utilizes unassigned targets of a sorted target list to assign targets to new initiators that utilize multi-path communication. In other scenarios, path generation program 400 returns assigned targets to a sorted target list when an initiator utilizing multi-path communication is not used (e.g., the VM utilizing the initiator is de-provisioned).
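The two assignment rules described for step 408 can be sketched as follows. This is a non-authoritative illustration: the baseline and expanded connection counts come from the FIG. 6 example above, and the function names are assumptions.

```python
def uniform_connections(total_targets, num_initiators, assigned_share):
    """Uniform rule: spread a fixed share of the available targets evenly
    across the multi-path initiators, rounded to a whole number of
    connections per initiator."""
    return round(total_targets * assigned_share / num_initiators)

def dynamic_connections(high_bandwidth, timeout_constrained, baseline=4, expanded=7):
    """Dynamic rule sketched in the FIG. 6 discussion: a baseline number of
    connections, expanded for high-bandwidth initiators (P217) or initiators
    whose programs include timeout constraints (P218)."""
    return expanded if (high_bandwidth or timeout_constrained) else baseline

# Worked example from the text: 1600 target port IDs, 200 multi-path
# initiators, 40% of the targets assigned -> 1600 * 0.40 / 200 = 3.2,
# which rounds to 3 connections per initiator.
print(uniform_connections(1600, 200, 0.40))   # 3
print(dynamic_connections(False, False))      # e.g., VP203 -> 4
print(dynamic_connections(True, False))       # e.g., P217 -> 7
print(dynamic_connections(False, True))       # e.g., P218 -> 7
```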
Referring to step 408, in a further embodiment, path generation program 400 may communicate with storage 105 to determine the number of targets (e.g., ports, port IDs, WWPNs, WWNNs) available prior to assigning a number of connections to initiators on computing system 102. Based on the information communicated to path generation program 400 by storage 105, in some instances the reserved ports are excluded from the sorted list generation (step 406). In other instances, path generation program 400 includes the reserved ports in the sorted port list; however, path generation program 400 does not assign the reserved ports to an initiator. In some scenarios, path generation program 400 determines that storage 105 reserves a portion of the available targets. For example, path generation program 400 determines that storage 105 reserves targets P368 and P369 of communication adapter A367 on server 360. In one instance, the reserved targets are released during workload spikes. In another instance, the reserved targets are released when a communication fault is identified on storage 105 (e.g., an HBA failure, a communication adapter failure, a networking cable failure, a server failure, etc.).
In step 409, path generation program 400 communicates the assigned targets for an initiator to communication routing program 125. Subsequently, communication routing program 125 utilizes the assigned targets for an initiator to modify the connection paths of network 110 utilized by computing system 102 to communicate with storage 105. In one embodiment, path generation program 400 may transfer a list of ports (e.g., group) associated with each initiator to communication routing program 125. For example, path generation program 400 communicates the list depicted in FIG. 6 to communication routing program 125. In another embodiment, path generation program 400 may communicate changes associated with individual initiators to communication routing program 125. In an example, VM 131, which utilizes initiator VP203, is de-provisioned. Path generation program 400 communicates to communication routing program 125 that initiator VP203 is not utilized. In another example, path generation program 400 determines that a new VM is provisioned and communicates, to communication routing program 125, a group of unassigned, substantially sequential targets from the sorted list of targets for assignment to the initiator associated with the new VM. In some embodiments, path generation program 400 may assign multiple WWPNs to targets on storage 105 when NPIV is utilized. Path generation program 400 may utilize this strategy to zone specific LUNs of storage 105 to a VM (e.g., VM 135) on computing system 102. Path generation program 400 includes the WWPN assignments with the initiator-target assignment communicated to communication routing program 125.
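A minimal sketch of the bookkeeping implied by step 409 and the unassigned-target scenarios above might look like the following illustrative Python; none of these class or method names come from the patent.

```python
class TargetPool:
    """Hand groups of substantially sequential, unassigned targets to
    initiators and reclaim them when the associated VM is de-provisioned
    (illustrative names only)."""

    def __init__(self, sorted_targets):
        self.unassigned = list(sorted_targets)   # output of the sort in step 406
        self.assignments = {}                    # initiator -> assigned targets

    def assign(self, initiator, count):
        # Take the next `count` targets from the sorted list; this group is
        # what would be communicated to communication routing program 125.
        group = self.unassigned[:count]
        self.unassigned = self.unassigned[count:]
        self.assignments[initiator] = group
        return group

    def release(self, initiator):
        # Return an initiator's targets to the pool, e.g., when the VM that
        # utilizes the initiator is de-provisioned.
        self.unassigned.extend(self.assignments.pop(initiator, []))

# Example with a short literal list of targets.
pool = TargetPool(["VP344", "P352", "P362", "P348", "P358", "P368", "VP345", "P353"])
print(pool.assign("VP203", 4))   # ['VP344', 'P352', 'P362', 'P348']
pool.release("VP203")            # targets become available for a new initiator
```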
In step 410, path generation program 400 optionally monitors a communication path between an initiator and a target. In one embodiment, path generation program 400 may monitor one or more communication paths associated with an initiator to ensure that sufficient connections were assigned to meet a bandwidth dictate of an initiator. In another embodiment, path generation program 400 may monitor one or more communication paths for an initiator associated with a program that includes timeout constraints. In one scenario, path generation program 400 may execute alternate path selection program 500 to determine if a communication path issue is related to a fault on storage 105. In another scenario, path generation program 400 may determine that a communication path issue is not associated with a fault on storage 105. In one instance, path generation program 400 may communicate with hypervisor 120 to assign another initiator to a VM associated with the affected communication path. In another instance, the path may fail due to another fault on: computing system 102, storage 105, and/or network 110. Examples of other issues that affect communication paths between computing system 102 and storage 105 are: incorrect hardware setting, duplicate IP addresses, access authority, errors in LUN masking, and incorrect LUN zoning.
FIG. 5 is a flowchart depicting operational steps for alternate path selection program 500, executing on computing system 102 within distributed computing environment 100 of FIG. 1. Alternate path selection program 500 determines whether one or more communication faults may be related to a fault on storage 105. If alternate path selection program 500 identifies a communication fault associated with storage 105, then alternate path selection program 500 determines which targets are affected and whether an affected target is associated with a program that includes a timeout constraint. Subsequently, alternate path selection program 500 utilizes one or more techniques to select unassigned communication paths (e.g., target IDs) and communicates the alternative communication path to communication routing program 125 and/or path generation program 400.
In decision step 502, alternate path selection program 500 identifies whether a communication fault is associated with a fault on a storage system. In one embodiment, alternate path selection program 500 receives information from communication routing program 125 that one or more connection paths are not established with storage 105. In another embodiment, alternate path selection program 500 determines that there is a fault on storage 105 from a communication (e.g., status message) between storage 105 and computing system 102. Responsive to determining that a communication fault is identified on storage 105 (yes branch, decision step 502), alternate path selection program 500 determines which targets are affected by a communication fault of a storage system (step 504).
In step 504, alternate path selection program 500 determines which targets are associated with a communication fault of a storage system. In one embodiment, alternate path selection program 500 determines which targets (e.g., port IDs) are affected by a hardware fault. In one scenario, alternate path selection program 500 communicates with a monitoring program on storage 105 to determine that a server is associated with an identified communication fault. In an example (referring to FIG. 3), if server 360 is shut down, then alternate path selection program 500 determines that targets (e.g., ports) P362, P363, P368, and P369 are affected. In another scenario, if a communication fault on storage 105 is associated with communication adapter A357, then alternate path selection program 500 determines that targets P358 and/or P359 may be affected. In this scenario, alternate path selection program 500 may require additional information from storage 105 to determine whether one or both of targets P358 and P359 are affected by a fault on communication adapter A357.
In another embodiment, alternate path selection program 500 determines which targets (e.g., port IDs) are affected by a software fault. In one scenario, alternate path selection program 500 communicates with a hypervisor of storage 105 to determine whether internal communication occurs among storage systems and server 340, server 350, and server 360. In an example (referring to FIG. 3), if alternate path selection program 500 determines that server 360 is affected by an internal communication fault of storage 105, then alternate path selection program 500 determines that targets (e.g., ports) P362, P363, P368, and P369 are affected. In another scenario, if alternate path selection program 500 determines that N_Port_ID virtualization is associated with a communication fault on storage 105, then alternate path selection program 500 determines that targets VP344 and VP345 are affected.
Referring to step 504, in a different embodiment, alternate path selection program 500 communicates with communication routing program 125 to determine which targets of storage 105 are affected by a communication fault based on one or more network diagnostics executed by communication routing program 125. In an example, communication routing program 125 may periodically utilize a traceroute function (not shown) to identify which connection paths cannot be accessed. In another example, communication routing program 125 may utilize a neighbor discovery protocol monitor (not shown) to identify which connection paths cannot be accessed.
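To make the fault-to-target mapping of step 504 concrete, the following hedged Python sketch derives the affected ports from a server-level or adapter-level fault; the topology literal mirrors FIG. 3 and the function name is an assumption.

```python
# FIG. 3 hierarchy repeated here so the sketch is self-contained:
# server -> communication adapter -> ports.
STORAGE_105 = {
    "340": {"A341": ["VP344", "VP345", "P343"], "A347": ["P348", "P349"]},
    "350": {"A351": ["P352", "P353"], "A357": ["P358", "P359"]},
    "360": {"A361": ["P362", "P363"], "A367": ["P368", "P369"]},
}

def affected_targets(topology, faulty_server=None, faulty_adapter=None):
    """Map a server-level or adapter-level fault to the ports it affects."""
    affected = []
    for server, adapters in topology.items():
        for adapter, ports in adapters.items():
            if server == faulty_server or adapter == faulty_adapter:
                affected.extend(ports)
    return affected

# Examples from the text: shutting down server 360 affects P362, P363, P368,
# and P369; a fault on communication adapter A357 affects P358 and P359.
print(affected_targets(STORAGE_105, faulty_server="360"))
print(affected_targets(STORAGE_105, faulty_adapter="A357"))
```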
In step 506, alternate path selection program 500 determines which initiators are assigned to affected targets. In an embodiment, alternate path selection program 500 communicates with communication routing program 125 to determine the one or more initiators that are assigned to the affected targets. In another embodiment, alternate path selection program 500 communicates with path generation program 400 to obtain the sorted list of targets and the initiator-target assignments to determine the one or more initiators that are assigned to the affected targets.
In step 508, alternate path selection program 500 identifies a software program associated with an initiator that includes a timeout constraint. In one embodiment, alternate path selection program 500 communicates with path generation program 400 to determine whether a software program, executing on computing system 102, that utilizes an affected target on storage 105 includes a timeout constraint. In another embodiment, alternate path selection program 500 communicates with hypervisor 120 to determine whether a software program, executing on computing system 102, that utilizes an affected target on storage 105 includes a timeout constraint. In a further embodiment, alternate path selection program 500 determines from communications with path generation program 400 that the initiator associated with a timeout constraint is also affected by varying bandwidth utilization.
In decision step 510, alternate path selection program 500 determines whether an initiator utilizes an affected target. In one embodiment, alternate path selection program 500 determines that an initiator is affected based on a determined timeout constraint of a VM and/or program that utilizes a target that is affected by a fault. In another embodiment, alternate path selection program 500 determines that an initiator is associated with an affected target based on information obtained from path generation program 400. In one scenario, alternate path selection program 500 determines that an initiator is affected based on “quality of service” measurements (e.g., error rates, retry rates, etc.) obtained from path generation program 400. In another scenario, alternate path selection program 500 determines that sufficient communication paths do not exist for an initiator based on a bandwidth constraint of a VM and/or software program.
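The two scenarios of decision step 510 can be combined into one illustrative test. The sketch below is hypothetical (the parameter names and thresholds are assumptions, not part of the claimed method): an initiator with a timeout constraint is treated as affected as soon as any of its targets is faulted, while a bandwidth-constrained initiator is treated as affected only when too few communication paths remain.

def initiator_is_affected(assigned_targets, affected_targets,
                          has_timeout_constraint=False, min_paths=1):
    """Decide whether an initiator needs an alternative communication path."""
    affected = set(affected_targets)
    remaining = [t for t in assigned_targets if t not in affected]
    if has_timeout_constraint and len(remaining) < len(assigned_targets):
        # Any lost path risks exceeding the program's timeout during retries.
        return True
    # Bandwidth rule: not enough unaffected paths remain for the VM/program.
    return len(remaining) < min_paths

assigned = ["VP344", "P352", "P362", "P348", "P358", "P368"]
affected = ["P352", "P353", "P358", "P359"]          # e.g., a fault on server 350
print(initiator_is_affected(assigned, affected, has_timeout_constraint=True))  # True
print(initiator_is_affected(assigned, affected, min_paths=5))                  # True
print(initiator_is_affected(assigned, affected, min_paths=3))                  # False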
In decision step 510, responsive to determining that an initiator utilizes an affected target (yes branch, decision step 510), alternate path selection program 500 selects an alternative communication path for an affected initiator (step 512).
In step 512, alternate path selection program 500 selects an alternative communication path for an affected initiator. In one embodiment, alternate path selection program 500 determines that the affected initiator is not associated with a software program that includes a timeout constraint. In one scenario, alternate path selection program 500 communicates to communication routing program 125 which target of storage 105 is affected and that communication routing program 125 may proceed to another assigned target. Alternate path selection program 500 may utilize various scheduling or queuing algorithms for selecting the other assigned target. Examples of queuing algorithms that alternate path selection program 500 may utilize include: round robin, weighted round robin, deficit round robin, multilevel queuing, and random. In addition, alternate path selection program 500 flags the affected target to communication routing program 125 so that the affected target is excluded from subsequent multi-path data communications. In another scenario, alternate path selection program 500 determines from communications with path generation program 400 that an initiator has a bandwidth constraint. Alternate path selection program 500 communicates the information related to the affected target to path generation program 400 for path generation program 400 to communicate a replacement target assignment to communication routing program 125.
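Read as scheduling logic, step 512 for an initiator without a timeout constraint amounts to a round-robin rotation over the initiator's assigned targets that skips any target flagged as affected. The sketch below is illustrative only (the class name and interface are hypothetical); a weighted, deficit, multilevel, or random policy could replace the plain rotation shown here.

from itertools import cycle

class RoundRobinSelector:
    """Cycle through an initiator's assigned targets, skipping flagged targets."""

    def __init__(self, targets):
        self._targets = list(targets)
        self._rotation = cycle(self._targets)
        self._excluded = set()

    def flag_affected(self, target):
        # Exclude the affected target from subsequent multi-path communications.
        self._excluded.add(target)

    def next_target(self):
        # Any window of len(targets) consecutive draws visits each target once,
        # so this loop finds an unaffected target if one exists.
        for _ in range(len(self._targets)):
            candidate = next(self._rotation)
            if candidate not in self._excluded:
                return candidate
        raise RuntimeError("no unaffected targets remain for this initiator")

selector = RoundRobinSelector(["P343", "P352", "P362", "P348", "P358", "P368"])
selector.flag_affected("P362")  # e.g., communication adapter A361 reported a fault
print([selector.next_target() for _ in range(6)])
# ['P343', 'P352', 'P348', 'P358', 'P368', 'P343']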
In another embodiment, alternate path selection program 500 determines that the affected initiator is associated with a software program that includes a timeout constraint. In one scenario, alternate path selection program 500 communicates to communication routing program 125 which target of storage 105 is affected and that communication routing program 125 may reduce the delay and/or number of communication retries before proceeding to the next assigned target in round-robin fashion. In addition, alternate path selection program 500 flags the affected target to communication routing program 125 so that the affected target is excluded from subsequent multi-path data communications.
Referring to step 512, in another scenario, alternate path selection program 500 communicates with path generation program 400 to replace the affected targets with a new group of target assignments from the sorted target list for storage 105. Subsequently, path generation program 400 communicates the replacement target assignments to communication routing program 125. In another scenario, alternate path selection program 500 determines that an affected initiator includes both a timeout constraint and variable bandwidth utilization. In an instance, alternate path selection program 500 communicates with path generation program 400 to replace the affected targets with a new group of target assignments from the sorted target list for storage 105, where the new group of targets is larger to support periods of higher bandwidth utilization. Subsequently, path generation program 400 communicates the new and larger group of target assignments to communication routing program 125.
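A possible reading of these replacement scenarios is sketched below (hypothetical function; the sorted target list is assumed to come from the path generation step, and the extra parameter models the larger group used to support periods of higher bandwidth utilization):

def replacement_targets(sorted_target_list, current_targets, affected_targets, extra=0):
    """Pick replacement targets for the affected ones from the sorted list.

    Targets that are themselves affected, or already assigned to the initiator,
    are skipped; extra enlarges the replacement group for higher bandwidth.
    """
    current = set(current_targets)
    affected = set(affected_targets)
    needed = len(current & affected) + extra
    replacements = []
    for target in sorted_target_list:
        if needed == 0:
            break
        if target in affected or target in current or target in replacements:
            continue
        replacements.append(target)
        needed -= 1
    return replacements

sorted_targets = ["VP344", "P352", "P362", "P348", "P358", "P368",
                  "VP345", "P353", "P363", "P349", "P359", "P369"]
current = ["P343", "P352", "P362"]
affected = ["P362", "P363"]                    # e.g., a fault on adapter A361
print(replacement_targets(sorted_targets, current, affected))           # ['VP344']
print(replacement_targets(sorted_targets, current, affected, extra=1))  # ['VP344', 'P348']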
Referring to decision step 502, responsive to determining that a communication fault is not identified on storage 105 (no branch, decision step 502), alternate path selection program 500 terminates.
Referring to decision step 510, responsive to determining that an initiator does not utilize an affected target (no branch, decision step 510), alternate path selection program 500 terminates.
FIG. 6 is an illustrative example of a portion of a list of target ports on storage 105 and the initiators of computing system 102 that may utilize the target ports. FIG. 6 depicts twenty-four targets assigned to five initiators, in accordance with an embodiment of the present invention. The target port list for storage 105 is generated based on an embodiment of path generation program 400 (referencing step 406). Column 602 depicts the assignment of five physical and virtual initiators of computing system 102 to targets (e.g., ports) of storage 105, in accordance with an embodiment of the present invention. Column 604 depicts a cyclical assignment of servers: server 340, then server 350, then server 360, after which the cycle begins again with server 340. Column 606 depicts a cyclical assignment of communication adapters respectively associated with each instance of a server (referencing FIG. 3). In an example, communication adapter A341 is assigned to the first, third, fifth, and seventh instances of server 340. Communication adapter A347 is assigned to the second, fourth, sixth, and eighth instances of server 340. Column 608 depicts a cyclical assignment of physical and virtual ports (e.g., targets) respectively associated with each instance of a communication adapter (referencing FIG. 3). In an example, virtual port VP344 is assigned to the first and fourth instances of communication adapter A341. Virtual port VP345 is assigned to the second instance of communication adapter A341, and port P343 is assigned to the third instance of communication adapter A341.
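The cyclical ordering depicted in FIG. 6 (servers, then the adapters of each server, then the ports of each adapter) can be reproduced with nested round-robin rotations. The following sketch is a simplified illustration under the hypothetical FIG. 3-style topology used in the earlier sketches; the function name is an assumption, and the actual path generation program may apply additional sorting criteria.

from itertools import cycle

# Hypothetical topology in the style of FIG. 3 (same as the earlier sketch).
TOPOLOGY = {
    "server_340": {"A341": ["VP344", "VP345", "P343"], "A347": ["P348", "P349"]},
    "server_350": {"A351": ["P352", "P353"], "A357": ["P358", "P359"]},
    "server_360": {"A361": ["P362", "P363"], "A367": ["P368", "P369"]},
}

def cyclic_tuple_list(topology, length):
    """Yield (server, adapter, port) tuples: cycle the servers, then each
    server's adapters, then each adapter's ports."""
    server_rotation = cycle(topology)
    adapter_rotations = {s: cycle(adapters) for s, adapters in topology.items()}
    port_rotations = {(s, a): cycle(ports)
                      for s, adapters in topology.items()
                      for a, ports in adapters.items()}
    for _ in range(length):
        server = next(server_rotation)
        adapter = next(adapter_rotations[server])
        port = next(port_rotations[(server, adapter)])
        yield (server, adapter, port)

for entry in cyclic_tuple_list(TOPOLOGY, 6):
    print(entry)
# ('server_340', 'A341', 'VP344')
# ('server_350', 'A351', 'P352')
# ('server_360', 'A361', 'P362')
# ('server_340', 'A347', 'P348')
# ('server_350', 'A357', 'P358')
# ('server_360', 'A367', 'P368')

Under this hypothetical ordering, the first six tuples correspond to the targets that the FIG. 6 example assigns to initiator P218.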
In an example, initiator P217 is assigned ports P343, P352, P362, P348, P358, and P368. Row 610 depicts an instance where alternate path selection program 500 determines that communication adapter A361 (stippled shading) of storage 105 is identified as having a communication fault. In an embodiment, alternate path selection program 500 may determine that there is no impact to initiator P217. Alternate path selection program 500 communicates with communication routing program 125 to proceed with a round-robin utilization of ports P343, P352, P348, P358, and P368 for multi-path communication between computing system 102 and storage 105.
In another example, initiator P218 is assigned ports VP344, P352, P362, P348, P358, and P368. Rows 615 and 616 depict an instance where alternate path selection program 500 determines that server 350 (random stipple shading) of storage 105 is identified as having a communication fault. A fault in server 350 affects targets P352, P353, P358, and P359. In an embodiment, alternate path selection program 500 may determine, based on communications with path generation program 400, that initiator P218 is bandwidth constrained based on losing the communication paths associated with targets P352 and P358. Alternate path selection program 500 communicates with path generation program 400 that additional targets are needed to maintain the bandwidth dictated by initiator P218. In a scenario, initiators VP203, VP205, VP207, P217, and P218 of computing system 102 are engaged in multi-path communication with storage 105. Based on the bandwidth constraint of initiator P218, path generation program 400 assigns the next two target ports in sequence. The tuples that describe the two target ports in sequence are: server 340/communication adapter A341/port VP345 and server 350/communication adapter A351/port P353. However, port P353 is affected by the fault in server 350; therefore, path generation program 400 assigns a different target, which is described by a tuple as server 360/communication adapter A361/port VP343.
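Continuing the hypothetical sketches above (reusing TOPOLOGY and cyclic_tuple_list), the bandwidth-driven replacement in this example can be illustrated as walking the cyclically ordered tuple list and skipping ports that are already assigned or affected by the fault; the identifiers differ slightly from FIG. 6 because the topology used here is an assumption.

def next_unaffected(tuple_sequence, assigned_ports, affected_ports, count):
    """Pick the next count tuples whose ports are neither already assigned to
    the initiator nor affected by the identified fault."""
    taken = set(assigned_ports)
    picked = []
    for server, adapter, port in tuple_sequence:
        if len(picked) == count:
            break
        if port in taken or port in affected_ports:
            continue  # e.g., P353 is skipped because server_350 is faulted
        picked.append((server, adapter, port))
        taken.add(port)
    return picked

assigned = {"VP344", "P352", "P362", "P348", "P358", "P368"}   # initiator P218
affected = {"P352", "P353", "P358", "P359"}                    # server_350 fault
print(next_unaffected(cyclic_tuple_list(TOPOLOGY, 24), assigned, affected, count=2))
# [('server_340', 'A341', 'VP345'), ('server_360', 'A361', 'P363')]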
FIG. 7 depicts computer system 700, which is representative of computing system 102 and storage 105. Computer system 700 is an example of a system that includes software and data 712. Computer system 700 includes processor(s) 701, cache 703, memory 702, persistent storage 705, communications unit 707, input/output (I/O) interface(s) 706, and communications fabric 704. Communications fabric 704 provides communications between cache 703, memory 702, persistent storage 705, communications unit 707, and input/output (I/O) interface(s) 706. Communications fabric 704 can be implemented with any architecture designed for passing data and/or control information between processors (such as microprocessors, communications and network processors, etc.), system memory, peripheral devices, and any other hardware components within a system. For example, communications fabric 704 can be implemented with one or more buses or a crossbar switch.
Memory 702 and persistent storage 705 are computer readable storage media. In this embodiment, memory 702 includes random access memory (RAM). In general, memory 702 can include any suitable volatile or non-volatile computer readable storage media. Cache 703 is a fast memory that enhances the performance of processor(s) 701 by holding recently accessed data, and data near recently accessed data, from memory 702. With respect to computing system 102, memory 702 includes, at least in part, designated memory 124 (e.g., physical hardware) depicted in FIG. 1 to be shared among logical partitions.
Program instructions and data used to practice embodiments of the present invention may be stored in persistent storage 705 and in memory 702 for execution by one or more of the respective processor(s) 701 via cache 703. In an embodiment, persistent storage 705 includes a magnetic hard disk drive. Alternatively, or in addition to a magnetic hard disk drive, persistent storage 705 can include a solid-state hard drive, a semiconductor storage device, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, or any other computer readable storage media that is capable of storing program instructions or digital information. With respect to computing system 102, persistent storage 705 includes, at least in part, disks 122 (e.g., physical hardware) depicted in FIG. 1 to be shared among logical partitions.
The media used by persistent storage 705 may also be removable. For example, a removable hard drive may be used for persistent storage 705. Other examples include optical and magnetic disks, thumb drives, and smart cards that are inserted into a drive for transfer onto another computer readable storage medium that is also part of persistent storage 705. Software and data 712 are stored in persistent storage 705 for access and/or execution by one or more of the respective processor(s) 701 via cache 703 and one or more memories of memory 702. With respect to computing system 102, software and data 712 includes hypervisor 120, communication routing program 125, path generation program 400, alternate path selection program 500, and other programs.
Communications unit 707, in these examples, provides for communications with other data processing systems or devices, including resources of computing system 102 and processors 121. In these examples, communications unit 707 includes one or more network interface cards. Communications unit 707 may provide communications through the use of either or both physical and wireless communications links. With respect to computing system 102, hypervisor 120, virtual adapter 140, virtual adapter 150, virtual adapter 160, bus 190, and program instructions and data used to practice embodiments of the present invention may be downloaded to persistent storage 705 through communications unit 707. With respect to computing system 102, communications unit 707 includes, at least in part, one or more network cards 123 (e.g., physical hardware), virtual adapter 140, virtual adapter 150, virtual adapter 160, communication adapter 171, communication adapter 177, communication adapter 181, and communication adapter 187 depicted in FIG. 1, to be shared among logical partitions.
I/O interface(s) 706 allows for input and output of data with other devices that may be connected to each computer system. For example, I/O interface(s) 706 may provide a connection to external device(s) 708, such as a keyboard, a keypad, a touch screen, and/or some other suitable input device. External device(s) 708 can also include portable computer readable storage media such as, for example, thumb drives, portable optical or magnetic disks, and memory cards. Software and data used to practice embodiments of the present invention can be stored on such portable computer readable storage media and can be loaded onto persistent storage 705 via I/O interface(s) 706. I/O interface(s) 706 also connects to display device 709.
Display device 709 provides a mechanism to display data to a user and may be, for example, a computer monitor. Display device 709 can also function as a touch screen, such as the display of a tablet computer or a smartphone.
It is understood in advance that although this disclosure discusses system virtualization, implementation of the teachings recited herein is not limited to a virtualized computing environment. Rather, the embodiments of the present invention are capable of being implemented in conjunction with any type of clustered computing environment now known (e.g., cloud computing) or later developed.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer-readable medium(s) having computer-readable program code/instructions embodied thereon.
The programs described herein are identified based upon the application for which they are implemented in a specific embodiment of the invention. However, it should be appreciated that any particular program nomenclature herein is used merely for convenience, and thus the invention should not be limited to use solely in any specific application identified and/or implied by such nomenclature.
The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The terminology used herein was chosen to best explain the principles of the embodiment, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (1)

What is claimed is:
1. A method for routing communication paths among computing devices, the method comprising:
identifying, by one or more computer processors, a computing entity and a data storage entity that transfer data;
determining, by one or more computer processors, a plurality of communication ports that the data storage entity utilizes to transfer data to the computing entity;
identifying, by one or more computer processors, a plurality of computing resources respectively associated with the determined plurality of communication ports that the data storage entity utilizes to transfer the data to the computing entity; and
generating, by one or more computer processors, a list of tuples for the data storage entity based, at least in part, on the identified plurality of computing resources and the determined plurality of communication ports;
wherein the list of tuples for the data storage entity comprises information corresponding to computing resources that include: a server identifier, a communication adapter identifier, and a communication port identifier;
wherein the communication adapter identifier is respectively associated with a server; wherein the communication port identifier is respectively associated with a communication adapter; and
wherein the communication port identifier corresponds to at least one of a physical computing resource and a virtualized computing resource;
sorting, by one or more computer processors, the generated list of tuples such that the generated list of tuples is cyclically ordered based on a hierarchy of: the server identifier, the communication adapter identifier, and the communication port identifier; and
identifying, by one or more computer processors, a rule associated with the computing entity, wherein the rule dictates a number of communication paths that the computing entity utilizes to transfer data between a communication port of the computing entity and the plurality of communication ports of the data storage entity; and
assigning, by one or more computer processors, a group of communication port identifiers from the generated list of tuples to the communication port of the computing entity such that the number of the group of assigned communication ports is based, at least in part, on the rule associated with the computing entity.
US15/014,191 2015-08-18 2016-02-03 Assigning communication paths among computing devices utilizing a multi-path communication protocol Expired - Fee Related US9461867B1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US15/014,191 US9461867B1 (en) 2015-08-18 2016-02-03 Assigning communication paths among computing devices utilizing a multi-path communication protocol
US15/209,775 US9531626B1 (en) 2015-08-18 2016-07-14 Assigning communication paths among computing devices utilizing a multi-path communication protocol
US15/365,997 US9674078B2 (en) 2015-08-18 2016-12-01 Assigning communication paths among computing devices utilizing a multi-path communication protocol

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/828,895 US9942132B2 (en) 2015-08-18 2015-08-18 Assigning communication paths among computing devices utilizing a multi-path communication protocol
US15/014,191 US9461867B1 (en) 2015-08-18 2016-02-03 Assigning communication paths among computing devices utilizing a multi-path communication protocol

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/828,895 Continuation US9942132B2 (en) 2015-08-18 2015-08-18 Assigning communication paths among computing devices utilizing a multi-path communication protocol

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/209,775 Continuation US9531626B1 (en) 2015-08-18 2016-07-14 Assigning communication paths among computing devices utilizing a multi-path communication protocol

Publications (1)

Publication Number Publication Date
US9461867B1 true US9461867B1 (en) 2016-10-04

Family

ID=56995177

Family Applications (4)

Application Number Title Priority Date Filing Date
US14/828,895 Expired - Fee Related US9942132B2 (en) 2015-08-18 2015-08-18 Assigning communication paths among computing devices utilizing a multi-path communication protocol
US15/014,191 Expired - Fee Related US9461867B1 (en) 2015-08-18 2016-02-03 Assigning communication paths among computing devices utilizing a multi-path communication protocol
US15/209,775 Expired - Fee Related US9531626B1 (en) 2015-08-18 2016-07-14 Assigning communication paths among computing devices utilizing a multi-path communication protocol
US15/365,997 Expired - Fee Related US9674078B2 (en) 2015-08-18 2016-12-01 Assigning communication paths among computing devices utilizing a multi-path communication protocol

Country Status (1)

Country Link
US (4) US9942132B2 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090055831A1 (en) * 2007-08-24 2009-02-26 Bauman Ellen M Allocating Network Adapter Resources Among Logical Partitions
US8582469B2 (en) * 2007-11-14 2013-11-12 Cisco Technology, Inc. Peer-to-peer network including routing protocol enhancement
US20120155468A1 (en) * 2010-12-21 2012-06-21 Microsoft Corporation Multi-path communications in a data center environment
US8954575B2 (en) * 2012-05-23 2015-02-10 Vmware, Inc. Fabric distributed resource scheduling
US9942132B2 (en) 2015-08-18 2018-04-10 International Business Machines Corporation Assigning communication paths among computing devices utilizing a multi-path communication protocol

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020178268A1 (en) * 2001-05-22 2002-11-28 Aiken John Andrew Methods, systems and computer program products for port assignments of multiple application instances using the same source IP address
US20030189929A1 (en) 2002-04-04 2003-10-09 Fujitsu Limited Electronic apparatus for assisting realization of storage area network system
US8743872B2 (en) 2004-02-13 2014-06-03 Oracle International Corporation Storage traffic communication via a switch fabric in accordance with a VLAN
US8848727B2 (en) 2004-02-13 2014-09-30 Oracle International Corporation Hierarchical transport protocol stack for data transfer between enterprise servers
US7739415B2 (en) 2005-07-01 2010-06-15 International Business Machines Corporation Method for managing virtual instances of a physical port attached to a network
US20150023361A1 (en) 2005-10-27 2015-01-22 Cisco Technology, Inc. Technique for implementing virtual fabric membership assignments for devices in a storage area network
US8918537B1 (en) 2007-06-28 2014-12-23 Emc Corporation Storage array network path analysis server for enhanced path selection in a host-based I/O multi-path system
US8032730B2 (en) 2008-05-15 2011-10-04 Hitachi, Ltd. Method and apparatus for I/O priority control in storage systems
US8705351B1 (en) * 2009-05-06 2014-04-22 Qlogic, Corporation Method and system for load balancing in networks
US8274881B2 (en) 2009-05-12 2012-09-25 International Business Machines Corporation Altering access to a fibre channel fabric
US8510815B2 (en) 2009-06-05 2013-08-13 Hitachi, Ltd. Virtual computer system, access control method and communication device for the same
US8364869B2 (en) 2009-09-29 2013-01-29 Hitachi, Ltd. Methods and apparatus for managing virtual ports and logical units on storage systems
US8856337B2 (en) 2011-08-16 2014-10-07 Hitachi, Ltd. Method and apparatus of cluster system provisioning for virtual maching environment
US8914553B2 (en) 2012-04-10 2014-12-16 Oracle International Corporation Multiple path load distribution for host communication with a tape storage device

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Adlung, et al., "FCP for the IBM eServer zSeries systems: Access to distributed storage", IBM J. Res. & Dev., vol. 46, No. 4/5, Jul./Sep. 2002, accepted for publication Apr. 8, 2002, © 2002 IBM, pp. 487-502.
Atia, et al., "Assigning Communication Paths Among Computing Devices Utilizing a Multi-Path Communication Protocol", U.S. Appl. No. 15/014,191, filed Feb. 3, 2016, (a copy is not provided as this application is available to the Examiner).
List of IBM Patents or Patent Applications Treated as Related, Appendix P, Filed Feb. 29, 2016.
List of IBM Patents or Patent Applications Treated as Related, Appendix P, Filed Herewith, 2 pages.
Ohad Atia, et al., "Assigning Communication Paths Among Computing Devices Utilizing a Multi-Path Communication Protocol", U.S. Appl. No. 14/828,895, filed Aug. 18, 2015, (a copy is not provided as this application is available to the Examiner).
Srikrishnan, et al., "Sharing FCP adapters through virtualization", IBM J. Res. & Dev., vol. 51, No. 1/2, Jan./Mar. 2007, accepted for publication Jun. 16, 2006, Internet publication Jan. 11, 2007, © 2007 IBM, pp. 103-118.

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9674078B2 (en) 2015-08-18 2017-06-06 International Business Machines Corporation Assigning communication paths among computing devices utilizing a multi-path communication protocol
US20180294610A1 (en) * 2017-04-07 2018-10-11 Centurylink Intellectual Property Llc Power Distribution Unit for Transmitting Data Over a Powerline
US10454226B2 (en) * 2017-04-07 2019-10-22 Centurylink Intellectual Property Llc Power distribution unit for transmitting data over a power line
US10740157B2 (en) 2018-06-05 2020-08-11 International Business Machines Corporation Cache load balancing in a virtual input/output server
US10496486B1 (en) * 2018-06-29 2019-12-03 International Business Machines Corporation Protecting data integrity in a multiple path input/output environment
US11012510B2 (en) * 2019-09-30 2021-05-18 EMC IP Holding Company LLC Host device with multi-path layer configured for detecting target failure status and updating path availability

Also Published As

Publication number Publication date
US9674078B2 (en) 2017-06-06
US9942132B2 (en) 2018-04-10
US20170054632A1 (en) 2017-02-23
US20170078192A1 (en) 2017-03-16
US9531626B1 (en) 2016-12-27

Similar Documents

Publication Publication Date Title
US9674078B2 (en) Assigning communication paths among computing devices utilizing a multi-path communication protocol
US8041987B2 (en) Dynamic physical and virtual multipath I/O
US8856319B1 (en) Event and state management in a scalable cloud computing environment
US9619429B1 (en) Storage tiering in cloud environment
US10708140B2 (en) Automatically updating zone information in a storage area network
US10162681B2 (en) Reducing redundant validations for live operating system migration
US11669360B2 (en) Seamless virtual standard switch to virtual distributed switch migration for hyper-converged infrastructure
US10235206B2 (en) Utilizing input/output configuration templates to reproduce a computing entity
US10616141B2 (en) Large scale fabric attached architecture
WO2011144633A1 (en) Migrating virtual machines among networked servers upon detection of degrading network link operation
US9571585B2 (en) Using alternate port name for uninterrupted communication
US11070512B2 (en) Server port virtualization for guest logical unit number (LUN) masking in a host direct attach configuration
US11481356B2 (en) Techniques for providing client interfaces
US10623341B2 (en) Configuration of a set of queues for multi-protocol operations in a target driver
US11119801B2 (en) Migrating virtual machines across commonly connected storage providers
WO2016195634A1 (en) Storage area network zone set
US11442626B2 (en) Network scaling approach for hyper-converged infrastructure (HCI) and heterogeneous storage clusters
US11630581B2 (en) Host bus adaptor (HBA) virtualization awareness for effective input-output load balancing
US11392459B2 (en) Virtualization server aware multi-pathing failover policy
US11256446B1 (en) Host bus adaptor (HBA) virtualization aware multi-pathing failover policy
US11620054B1 (en) Proactive monitoring and management of storage system input-output operation limits

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ATIA, OHAD;BEN-HORIN, YUVAL A.;MARX, ALON;SIGNING DATES FROM 20150804 TO 20150813;REEL/FRAME:037653/0189

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20201004