WO2015163868A1 - Determining connectivity information for a storage device - Google Patents

Determining connectivity information for a storage device

Info

Publication number
WO2015163868A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage device
network
coupled
storage
device
Prior art date
Application number
PCT/US2014/035134
Other languages
French (fr)
Inventor
Curtis C. BALLARD
Seth Pickett
Original Assignee
Hewlett-Packard Development Company, L.P.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hewlett-Packard Development Company, L.P.
Priority to PCT/US2014/035134
Publication of WO2015163868A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing packet switching networks
    • H04L43/08Monitoring based on specific metrics
    • H04L43/0805Availability
    • H04L43/0811Connectivity
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network-specific arrangements or communication protocols supporting networked applications
    • H04L67/10Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network
    • H04L67/1097Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network for distributed storage of data in a network, e.g. network file system [NFS], transport mechanisms for storage area networks [SAN] or network attached storage [NAS]

Abstract

A method includes, in a storage device coupled to a storage network, determining connectivity information for the storage device. Determining the connectivity information includes determining a first set of at least one network device that should be coupled to the storage device and determining a second set of at least one network device that is coupled to the storage device. The method includes selectively initiating corrective action based at least in part on a comparison of the first set with the second set.

Description

DETERMINING CONNECTIVITY INFORMATION FOR A STORAGE DEVICE

Background

[0001] A computer may access a storage area network (SAN) for purposes of storing and retrieving large amounts of data. The typical SAN includes a consolidated pool of mass storage devices (magnetic tape drives, hard drives, optical drives, and so forth); and the SAN typically provides relatively high speed block level storage, which may be advantageous for backup applications, archival applications, database applications and other such purposes.

Brief Description of the Drawings

[0002] Fig. 1 is a schematic diagram of a computer system according to an example implementation.

[0003] Figs. 2 and 3 are flow diagrams depicting techniques to determine connectivity information for a storage device of a storage area network (SAN) according to example implementations.

[0004] Fig. 4 is a flow diagram depicting a technique to detect and report a failed connection path according to an example implementation.

[0005] Fig. 5 is a schematic diagram of a storage device of Fig. 1 according to an example implementation.

Detailed Description

[0008] Referring to Fig. 1, in accordance with example implementations, a computer system 100 includes host resources 102 and storage resources 150 that may be configured in numerous ways to provide datacenter or cloud services for clients (not shown). As examples, the computer system 100 may provide such services as Software as a Service (SaaS), Infrastructure as a Service (IaaS) or Platform as a Service (PaaS). For the example of Fig. 1, the storage resources 150 include P physical storage devices 154 (which are identified by reference numerals 154-1, 154-2 ... 154-P in Fig. 1); and the host resources 102 include N hosts 110 (hosts 110-1, 110-2 ... 110-N-1 and 110-N being depicted as examples in Fig. 1). In general, the computer system 100 may be locally disposed at a given site or may be geographically distributed at multiple locations, depending on the particular implementation.

[0009] Clients (not shown in Fig. 1) of the system 100, such as thin clients, tablets, portable computers, smartphones, desktop computers, and so forth, may access (via a network fabric not shown in Fig. 1) the computer system 100 for purposes of using the hosted services. Depending on the particular service for a given client, one or multiple storage devices 154 may be pooled together. Moreover, a given host 110 may be shared by clients, a given storage device 154 may be shared by multiple clients, and so forth.

[0010] As depicted in Fig. 1, the storage resources 150 are part of a storage area network (SAN) 120. The SAN 120 has several points where network connections are made; and in general, a new connection is formed whenever a given network device (a host 110 or storage device 154, as examples) is attached, or coupled, to the SAN 120. The computer system 100 may have hundreds of storage devices 154 and may also have a relatively large number of hosts 110. Due to the resulting complexity of the computer system 100, a given host 110 may miss a network notification that a new storage device 154 has been attached and as a result, the host 110 may fail to connect to the storage device 154. Moreover, a given host 110 may miss a momentary disconnection with a storage device 154 and fail to reconnect to the device 154 when the device 154 becomes available. It is also possible that the same result may occur due to other causes (hardware or software defects, for example). It is noted that these are merely examples of points of the SAN 120 where network connections are made.

[0011] One solution for a host 110 failing to connect to a storage device, or for the failure of a connection in the SAN 120 preventing such a connection, is for a human storage administrator to perform relatively complex troubleshooting for purposes of identifying the underlying cause of the connectivity problem. In this troubleshooting, if a new storage device 154 is attached and the connections do not work correctly, then an assumption may be incorrectly made that the newly-attached storage device 154 is faulty, even though another problem of the SAN 120 may be preventing the connection.

[0012] In accordance with example systems and techniques that are disclosed herein, the storage device 154 is constructed to initiate and perform a connectivity analysis so that this connectivity analysis may be used (by a human storage administrator or by an automated component, as examples) to resolve the connectivity problem relatively quickly and accurately. As a result, replacement of a non-defective storage device 154 may be avoided.

[0013] More specifically, referring to Fig. 1 in connection with Fig. 2, in accordance with example implementations, a given storage device 154 is constructed to perform a technique 200 for purposes of initiating and performing a connectivity analysis for the device 154. Pursuant to the technique 200, the storage device 154 undertakes measures to determine (block 202) a first set of one or multiple network devices that should be coupled to the storage device 154. The storage device 154 further undertakes measures to determine (block 204) a second set of one or multiple network devices that are coupled to the storage device 154. Based at least in part on the difference(s) between the first and second sets, the storage device 154 selectively initiates corrective action, pursuant to block 206. As further described herein, this corrective action may include, as examples, generating a report identifying the network device(s) that should be but are not coupled to the storage device 154; communicating an alert message to a storage administrator; alerting a connectivity analysis or repair engine; and so forth.
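
As an illustration only, the comparison of blocks 202 through 206 can be sketched as a set difference. The following Python sketch is not taken from the disclosure: the function names, the device identifiers and the callback used to stand in for "corrective action" are all assumptions made for the example.

```python
def compare_connectivity(expected, actual):
    """Return the devices that should be coupled but are not.

    'expected' plays the role of the first set (block 202) and 'actual'
    the role of the second set (block 204). Both are hypothetical inputs.
    """
    return set(expected) - set(actual)


def selectively_initiate_corrective_action(expected, actual, on_mismatch):
    """Invoke on_mismatch only when a discrepancy exists (block 206)."""
    missing = compare_connectivity(expected, actual)
    if missing:
        on_mismatch(sorted(missing))
    return missing


# Hypothetical identifiers, loosely modeled on the reference numerals.
expected_hosts = {"host-110-1", "host-110-2", "host-110-3"}
actual_hosts = {"host-110-1", "host-110-3"}

reports = []  # stands in for a report, an alert, or a repair-engine call
missing = selectively_initiate_corrective_action(
    expected_hosts, actual_hosts, reports.append
)
```

When the two sets agree, the callback is never invoked, which corresponds to the "selective" nature of the corrective action.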

[0014] Referring back to Fig. 1, as a more specific example, in accordance with example implementations, the computer system 100 may contain switch fabric 124, which, in general, represents the network cables, switches, gateways, bridges, routers and so forth that couple the storage devices 154 to the hosts 110. The switch fabric 124 may contain one or multiple types of network fabric, such as a local area network (LAN) fabric, wide area network (WAN) fabric, Internet-based fabric, Fibre Channel (FC) fabric, Small Computer System Interface (SCSI) fabric, Fibre Channel over Ethernet (FCoE) fabric, a combination of one or more of these fabrics, and so forth.

[0015] For the following example implementation, at least one of the storage devices 154 contains a connectivity analysis and reporting engine 160, or "engine 160." In accordance with example implementations, the engine 160 is constructed to use a management interface (a management application programming interface (API), for example) of the SAN 120 for purposes of acquiring connectivity information data for the storage device 154 and use the acquired connectivity information data to at least make a preliminary analysis of any connectivity issues associated with the storage device 154.

[0016] More specifically, although a management interface may be used by network components (a switch, for example) of the SAN 120 other than the storage devices 154 to initiate logins with the storage devices 154, in accordance with example implementations, the engine 160 makes use of the management interface to initiate a connection with the SAN 120 for purposes of accessing the SAN's network management functions. In this manner, in accordance with example implementations, the engine 160 may initiate a connection between the storage device 154 and the SAN 120 using, for example, a port login request for a FC SAN.

[0017] As a more specific example, in accordance with example implementations, the switch fabric 124 may include a switch 130 that provides a network management service, such as a "Nameserver" service 132, which may be accessed by the engine 160 for purposes of acquiring connectivity information. In this manner, the switch 130 contains a memory 134 that stores connectivity data 136 that is acquired by the Nameserver service 132 and which identifies the network devices that should be coupled to the storage device 154, among other connectivity data. The engine 160 accesses the Nameserver service 132 to retrieve the connectivity information data 136.

[0018] In accordance with example implementations, the storage device 154 contains a hardware interface 155 that is constructed to receive connections from host computers. The interface 155 is further constructed to initiate connections, such as initiating a connection to a switch, and in accordance with further example implementations, the interface 155 is constructed to recover a lost connection to a host by reinitiating the connection. As an example, the interface 155 may be constructed from logic that forms state machines to receive and initiate connections, according to the protocol that is used by the switch fabric 124.
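
The state-machine behavior described for the interface 155 (receiving, initiating and reinitiating connections) can be pictured with a small table-driven sketch. This is a software illustration under stated assumptions, not the hardware logic of the disclosure: the state names, event names and transition table are invented for the example and do not correspond to any particular fabric protocol.

```python
from enum import Enum, auto


class LinkState(Enum):
    DISCONNECTED = auto()
    CONNECTING = auto()
    CONNECTED = auto()


# (current state, event) -> next state. All event names are hypothetical.
TRANSITIONS = {
    (LinkState.DISCONNECTED, "initiate"): LinkState.CONNECTING,  # device-initiated
    (LinkState.DISCONNECTED, "incoming"): LinkState.CONNECTED,   # host-initiated
    (LinkState.CONNECTING, "accepted"): LinkState.CONNECTED,
    (LinkState.CONNECTING, "rejected"): LinkState.DISCONNECTED,
    (LinkState.CONNECTED, "link_lost"): LinkState.DISCONNECTED,
}


def step(state, event):
    """Apply one event; events with no matching transition leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=LinkState.DISCONNECTED):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state
```

For instance, `run(["incoming", "link_lost", "initiate", "accepted"])` models a host-initiated connection that is lost and then recovered by the device reinitiating it.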

[0019] In accordance with example implementations, after retrieving the connectivity information data 136, the engine 160 parses the data 136 for purposes of building a table of network devices that should be (according to the data 136) coupled to the storage device 154. Using this constructed table, the engine 160 compares the network devices that should be coupled to the storage device 154 with a list of network devices that actually are coupled to the storage device 154. In this manner, in accordance with example implementations, the list of network devices that are actually coupled to the storage device 154 may be stored in an internal memory of the storage device 154.
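
The parse-and-build-table step can be sketched as follows. The disclosure does not specify the format of the connectivity information data 136, so this example assumes a simple comma-separated layout; the column names, port identifiers and world wide port names (WWPNs) below are made up purely for illustration.

```python
import csv
import io


def build_expected_table(raw):
    """Parse hypothetical CSV text into a table keyed by WWPN."""
    table = {}
    for row in csv.DictReader(io.StringIO(raw)):
        wwpn = row["wwpn"].strip().lower()
        table[wwpn] = {
            "port_id": row["port_id"].strip(),
            "name": row["name"].strip(),
        }
    return table


# Assumed stand-in for the data retrieved from the Nameserver service.
RAW = """port_id,wwpn,name
0x010200,50:01:43:80:00:00:00:01,host-110-1
0x010300,50:01:43:80:00:00:00:02,host-110-2
"""

expected_table = build_expected_table(RAW)

# Stand-in for the list kept in the storage device's internal memory.
actually_coupled = {"50:01:43:80:00:00:00:01"}

# Devices that should be coupled (per the table) but are not.
not_coupled = sorted(set(expected_table) - actually_coupled)
```

Keying the table by a stable identifier makes the later comparison a plain set operation, while the remaining columns stay available for a human-readable report.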

[0020] By comparing the list of network devices that are coupled to the storage device 154 with the list of network devices that should be coupled to the storage device 154, the engine 160 identifies any differences and flags these differences as potential connectivity issues. Moreover, in accordance with example implementations, the engine 160 may generate data (data representing a connectivity report, for example) that highlights any connectivity issues that are identified by the engine's analysis.

[0021] The engine 160 may initiate one or multiple corrective actions based on the detected differences between which network devices should be coupled to the storage device 154 and which network devices are coupled to the storage device 154. For example, in accordance with some implementations, in response to detecting discrepancies, the engine 160 generates an alert message that is communicated via the switch fabric 124 to a monitoring station 114 for the SAN 120. In accordance with example implementations, in response to receiving the alert message, the monitoring station 114 generates an alert message 115 (an SMS message, an electronic message (email), multiple messages of different types, and so forth) to a storage administrator so that the administrator may take the appropriate action(s). In further implementations, the monitoring station 114 may communicate an alert message to an automated component for purposes of addressing the connectivity issue. Moreover, in accordance with example implementations, the engine 160 may store or communicate data representing a report that identifies any potential connectivity issues. This report data may accompany the alert message; or in accordance with further example implementations, the engine 160 may store data representing the report in a memory of the storage device 154 for subsequent retrieval by a storage administrator, for example.
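
A possible shape for the report data and the accompanying alert message is sketched below. The field names, the severity value and the JSON encoding are assumptions chosen for the example; the disclosure does not define a message format. The sketch also reports devices that are coupled but not expected, which is one reading of "differences" and is likewise an assumption.

```python
import json


def build_report(device_id, expected, actual):
    """Summarize discrepancies between expected and actual connectivity."""
    expected, actual = set(expected), set(actual)
    return {
        "storage_device": device_id,
        "missing": sorted(expected - actual),     # should be coupled, are not
        "unexpected": sorted(actual - expected),  # coupled, but not expected
    }


def make_alert(report):
    """Return a JSON alert carrying the report, or None if nothing differs."""
    if not report["missing"] and not report["unexpected"]:
        return None
    message = {
        "severity": "warning",  # hypothetical field
        "summary": "Connectivity discrepancy detected",
        "report": report,
    }
    return json.dumps(message, sort_keys=True)


report = build_report("154-1", {"host-110-1", "host-110-2"}, {"host-110-1"})
alert = make_alert(report)
```

The report can be stored on the device for later retrieval even when the alert itself is sent elsewhere, mirroring the two options described above.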

[0022] Thus, referring to Fig. 3 in conjunction with Fig. 1, in accordance with example implementations, the engine 160 may perform a technique 300. Pursuant to the technique 300, the engine 160 uses (block 302) a storage device of a storage network to initiate a connection to the storage network (using the interface 155, for example) for purposes of accessing (block 304) management functions to retrieve connectivity information data. The engine 160 then parses (block 306) the retrieved connectivity information data to build a table of one or multiple network devices that should be coupled to the storage device 154. The engine 160 further retrieves (block 308) connectivity data from an internal memory of the storage device 154 to compare the network device(s) that are coupled to the storage device 154 with the network device(s) that should be coupled to the storage device. The engine 160 selectively communicates (block 310) an alert to a management station based at least in part on this comparison.
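
The ordering of blocks 302 through 310 can be summarized as a short pipeline. In this sketch each block is supplied as a callable so that the flow can be exercised without a real SAN; every callable, the session object and the sample data are hypothetical stand-ins rather than interfaces described in the disclosure.

```python
def technique_300(connect, fetch_connectivity_data, parse,
                  read_internal_list, send_alert):
    """Run the blocks of technique 300 in order using injected callables."""
    session = connect()                     # block 302: initiate connection
    raw = fetch_connectivity_data(session)  # block 304: access management functions
    expected = set(parse(raw))              # block 306: build table of expected devices
    actual = set(read_internal_list())      # block 308: read internal memory
    missing = sorted(expected - actual)
    if missing:                             # block 310: selectively alert
        send_alert(missing)
    return missing


# Stub implementations used only to demonstrate the flow.
sent_alerts = []
result = technique_300(
    connect=lambda: "session-1",
    fetch_connectivity_data=lambda session: "host-110-1 host-110-2",
    parse=lambda raw: raw.split(),
    read_internal_list=lambda: ["host-110-1"],
    send_alert=sent_alerts.append,
)
```

Passing the steps in as callables keeps the control flow separate from fabric-specific details, which would differ between, say, an FC SAN and another fabric type.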

[0023] In accordance with further example implementations, the engine 160 performs a corrective action that includes identifying one or possibly multiple connection paths that caused the failure of the storage device 154 to be coupled to a given network device. For example, in accordance with example implementations, the engine 160 may attempt to connect to a given network device and collect a record detailing the step(s) of the connection process that succeeded and the step(s) of the connection process that failed.

[0024] As a more specific example, in accordance with example implementations, the engine 160 is constructed to detect "hops" in which a connection request may be communicated through multiple connection paths, or points, to a given network device. In this manner, by analyzing the hops, the engine 160 may generate a report identifying connection path(s) in which the request failed. If a request is successful for a given connection path, then the problem may have been automatically resolved or at least the physical connections that are part of the successful connection path may be eliminated as the source of the problem.

[0025] Thus, referring to Fig. 4 in conjunction with Fig. 1, in accordance with example implementations, the engine 160 performs a technique 400 that includes attempting (block 402) to communicate with a network device that should be coupled to the storage device 154 but is not coupled to the storage device 154. The engine 160 determines (block 404) which connection path(s) failed based at least in part on hops data. From this information, the engine 160 may generate (block 406) a report identifying the failed connection path(s).
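
One way to picture the hop analysis of blocks 404 and 406 is to record, for each connection path, whether each hop was reached and then report the first hop that was not. The data layout, the path names and the hop names below (including "switch-131") are hypothetical; the disclosure does not describe how hops data is represented.

```python
def first_failed_hop(hops):
    """Return the name of the first hop not reached, or None if all succeeded.

    'hops' is an ordered list of (hop_name, reached) pairs.
    """
    for name, reached in hops:
        if not reached:
            return name
    return None


def failed_paths_report(paths):
    """Map each failed connection path to the hop at which it failed."""
    report = {}
    for path_name, hops in paths.items():
        failed_at = first_failed_hop(hops)
        if failed_at is not None:
            report[path_name] = failed_at
    return report


# Hypothetical hops data for two paths to the same host.
paths = {
    "path-A": [("interface-155", True), ("switch-130", True), ("host-110-2", True)],
    "path-B": [("interface-155", True), ("switch-131", False), ("host-110-2", False)],
}

hop_report = failed_paths_report(paths)
```

Paths that succeed are omitted from the report, so the physical connections along them can be set aside when looking for the source of the problem.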

[0026] Referring to Fig. 5, in accordance with example implementations, the storage device 154 is a sequential access medium device, such as a magnetic tape drive, and as such, the device 154 includes a bay to receive removable media, such as a physical magnetic tape cartridge 550. It is noted that the storage device 154 may be a storage device other than a magnetic tape drive or a sequential access medium device, in accordance with further example implementations.

[0027] In general, the storage device 154 is a physical machine that includes actual hardware and actual machine executable instructions, or "software." For example, the hardware may include a controller 520 that, in general, controls the overall operations of the device 154. The controller 520 may include one or multiple processors 522 (one or multiple central processing units (CPUs), microcontrollers, processing cores, and so forth), as well as a memory 524 (a non-transitory memory, such as semiconductor storage, optical storage, and so forth) that may store data, program instructions and so forth for processing by the processor(s) 522. Through the execution of the machine executable instructions, the controller 520 forms an instance of the connectivity analysis and reporting engine 160, in accordance with example implementations.

[0028] For the example implementation of Fig. 5, the storage device 154 includes a drive interface 540 for purposes of writing data to and reading data from the physical cartridge 550. In this regard, the drive interface 540 may include such features as motors coupled to reels of the physical tape cartridge 550, read elements, write elements, servo elements and various other components, such as sense amplifiers, positioners, pulse detectors, error correction code (ECC) engines, and so forth, as can be appreciated by the skilled artisan.

[0029] Among its other features, the storage device 154 may include a read data path 530, a write data path 532, a drive motor interface 534, and one or multiple interfaces 155 (for redundancy purposes) that, as described above, may be coupled to the switch fabric 124 to receive as well as initiate connections for the storage device 154.

[0030] Among the advantages of the systems and techniques that are disclosed herein, storage devices may be used to automatically initiate and perform connectivity analyses for purposes of detecting and reporting connectivity issues more rapidly, thereby preventing lengthy and costly service events and customer escalations. Other and different advantages may be achieved using the systems and techniques that are disclosed herein in accordance with further example implementations.

[0031] While the present invention has been described with respect to a limited number of embodiments, those skilled in the art, having the benefit of this disclosure, will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.

Claims

What is claimed is:
1. A method comprising:
in a storage device coupled to a storage network, determining connectivity information for the storage device, wherein determining the connectivity information comprises determining a first set of at least one network device that should be coupled to the storage device and determining a second set of at least one network device that is coupled to the storage device; and
selectively initiating corrective action based at least in part on a comparison of the first set with the second set.
2. The method of claim 1, wherein determining which network device should be coupled to the storage device comprises using the storage device to initiate a connection with the storage network.
3. The method of claim 1, wherein determining which network device should be coupled to the storage device comprises retrieving connectivity information data from the storage network.
4. The method of claim 3, wherein determining which network device should be coupled to the storage device further comprises parsing the connectivity information data and building a table to identify the at least one device of the first list.
5. The method of claim 1, wherein determining the second list comprises retrieving connectivity information from an internal memory of the storage device.
6. The method of claim 1, wherein selectively initiating the corrective action comprises communicating an alert message to a system monitoring station.
7. The method of claim 1, further comprising:
attempting to communicate over multiple connection paths to at least one device that is not coupled to the storage device but should be coupled to the storage device and generating a report identifying at least one of the multiple connection paths that failed.
8. The method of claim 7, wherein generating the report comprises analyzing hops associated with the multiple connection paths.
9. An article comprising a non-transitory computer readable storage medium to store instructions that when executed by a computer cause the computer to:
initiate a connection from a storage device to a storage network;
access at least one management function of the storage network to retrieve first connectivity information data for the storage device, the first connectivity information indicating at least one device that should be coupled to the storage device;
retrieve second connectivity information data from an internal memory of the storage device, the second connectivity information indicating at least one device that is coupled to the storage device;
identify at least one device that should be coupled to the storage device but is not coupled to the storage device based on the first connectivity information and the second connectivity information; and
selectively communicate an alert to a management station for the storage network based at least in part on the identification.
10. The article of claim 9, the storage medium storing instructions that when executed by the computer cause the computer to communicate with a service provided by a switch of the storage network.
11. The article of claim 9, the storage medium storing instructions that when executed by the computer cause the computer to attempt to communicate over multiple connection paths to at least one device that is not coupled to the storage device but should be coupled to the storage device and generating a report identifying at least one of the multiple connection paths that failed.
12. A storage device comprising:
storage media interface;
a network interface; and
a controller to:
initiate a connection with a storage network using the network interface to determine connectivity information for the storage device, wherein determining the connectivity information comprises determining a first set of at least one network device that should be coupled to the storage device and determine a second set of at least one network device that is coupled to the storage device; and selectively initiate corrective action based at least in part on a comparison of the first set with the second set.
13. The apparatus of claim 12, wherein the network interface:
accepts network connections for the storage device; and
initiates a network connection for the storage device to determine the first set of at least one network device that should be coupled to the storage device.
14. The storage device of claim 12, wherein the storage network comprises a storage area network (SAN).
15. The storage device of claim 12, further comprising a memory, wherein the controller determines which devices should be coupled to the storage device based at least in part on data retrieved from the memory.
PCT/US2014/035134 2014-04-23 2014-04-23 Determining connectivity information for a storage device WO2015163868A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2014/035134 WO2015163868A1 (en) 2014-04-23 2014-04-23 Determining connectivity information for a storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2014/035134 WO2015163868A1 (en) 2014-04-23 2014-04-23 Determining connectivity information for a storage device

Publications (1)

Publication Number Publication Date
WO2015163868A1 (en) 2015-10-29

Family

ID=54332900

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2014/035134 WO2015163868A1 (en) 2014-04-23 2014-04-23 Determining connectivity information for a storage device

Country Status (1)

Country Link
WO (1) WO2015163868A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060047850A1 (en) * 2004-08-31 2006-03-02 Singh Bhasin Harinder P Multi-chassis, multi-path storage solutions in storage area networks
US20060080416A1 (en) * 2004-08-31 2006-04-13 Gandhi Shreyas P Virtual logical unit state maintenance rules engine
US20080109584A1 (en) * 2006-11-06 2008-05-08 Dot Hill Systems Corp. Method and apparatus for verifying fault tolerant configuration
US20100293316A1 (en) * 2009-05-15 2010-11-18 Vivek Mehrotra Migration of Switch in a Storage Area Network
US20130151646A1 (en) * 2004-02-13 2013-06-13 Sriram Chidambaram Storage traffic communication via a switch fabric in accordance with a vlan

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14890107

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14890107

Country of ref document: EP

Kind code of ref document: A1