CN109120522B - Multipath state monitoring method and device - Google Patents

Info

Publication number: CN109120522B
Authority: CN (China)
Prior art keywords: storage, judging, port, path, fault
Legal status: Active
Application number: CN201810953868.1A
Other languages: Chinese (zh)
Other versions: CN109120522A (en)
Inventor: 黄远超
Current Assignee: Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee: Zhengzhou Yunhai Information Technology Co Ltd
Priority date: 2018-08-21
Filing date: 2018-08-21
Publication date: 2021-07-27

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/22 Alternate routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/28 Routing or path finding of packets in data switching networks using route fault recovery

Abstract

The invention discloses a multipath state monitoring method comprising the following steps: detecting a path fault by periodically querying the multipath output information; executing a path fault diagnosis process selected according to the storage type; and sending the link fault point to the monitoring end. The invention also discloses a multipath state monitoring device, which automatically scans the multipath state and monitors the multipath state of multiple storage types in a unified way, without regard to the differences between storage types, so that the link state of multi-type storage can be monitored and diagnosed simply.

Description

Multipath state monitoring method and device
Technical Field
The invention relates to the technical field of computer storage, and in particular to a method for implementing, on a Linux system, a multipath function and a function for diagnosing abnormal faults on storage-to-host links.
Background
With the development of technologies such as cloud computing and big data, storage servers are used in ever greater numbers. A single data center machine room usually contains storage servers of different models, and each storage server usually has several links to a host, which brings in the concept of multipath redundancy. Multipath I/O refers to a server being connected to a storage device through multiple physical paths. Its main purpose is that when one physical path fails because of a fault in a host HBA card, a cable, a switch, or a RAID controller of the storage device, the server can move the I/O on that path to the other healthy physical paths without the application noticing the change, which improves system availability. However, when no physical hardware alarm is raised, a multipath failover is hard for the administrator to notice on the host side, and if the fault is not repaired in time, more serious consequences may follow.
Disclosure of Invention
The invention aims to provide a multi-path state monitoring method and a multi-path state monitoring device.
In order to achieve the purpose, the invention adopts the following technical scheme:
A first aspect of the invention provides a multipath state monitoring method, which comprises the following steps:
detecting a path fault by periodically querying the multipath output information;
executing a path fault diagnosis process selected according to the storage type;
and sending the link fault point to the monitoring end.
With reference to the first aspect, in a first possible implementation manner of the first aspect, before the step of detecting a path fault by periodically querying the multipath output information, the method further includes:
setting the interval and the execution permission of the periodic query.
With reference to the first aspect, in a second possible implementation manner of the first aspect, detecting a path fault by periodically querying the multipath output information includes:
periodically querying the multipath output information, and determining that a path has failed when a fault keyword is found in the state of any path in the multipath state information.
With reference to the first aspect, in a third possible implementation manner of the first aspect, executing the path fault diagnosis process according to the storage type includes:
when the storage type is determined to be IP storage, testing in turn whether communication with the storage service port, the storage management port, and the switch management port is normal, and determining the link fault point;
and when the storage type is determined to be FC storage, testing in turn whether communication with the storage optical port, the storage management port, and the optical fiber switch management port is normal, and determining the link fault point.
With reference to the first aspect, in a fourth possible implementation manner of the first aspect, executing the path fault diagnosis process according to the storage type further includes: searching the system for IP storage connection device information; if such information is found, the storage type is determined to be IP storage; otherwise it is determined to be FC storage.
With reference to the first aspect, in a fifth possible implementation manner of the first aspect, determining that the storage type is IP storage and testing in turn whether communication with the storage service port, the storage management port, and the switch management port is normal includes:
on the host system, testing the storage service port with "ping + IP address of the storage service port": if the peer IP address answers the ping, the link is judged normal; if it does not, the link is judged faulty; and testing the storage management port with "ping + IP address of the storage management port" and the switch management port with "ping + IP address of the switch management port" in the same way.
With reference to the first aspect, in a sixth possible implementation manner of the first aspect, determining that the storage type is FC storage and testing in turn whether communication with the storage optical port, the storage management port, and the optical fiber switch management port is normal includes:
on the host system, testing communication from the host to the storage optical port with the command "fcping + WWN address of the storage optical port": if the command succeeds, the link is normal; if it fails, the link is faulty; and testing the storage management port and the optical fiber switch management port with "ping + IP address of the storage management port" and "ping + IP address of the optical fiber switch management port".
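The storage type decision in the fourth implementation manner can be sketched as follows. This is a minimal illustration, not the patent's script: it assumes the text being inspected is the session listing printed by an iSCSI initiator tool, and that each session record has the shape "<transport>: [<id>] <portal> <target>". The function name `classify_storage` and the sample values are hypothetical.

```python
def classify_storage(iscsi_session_output: str) -> str:
    """Return "IP" if the text lists at least one iSCSI session, else "FC".

    Assumption: a session record looks like
    "tcp: [1] 192.0.2.10:3260,1 iqn.2018-08.example:array1".
    Anything else (empty text, an error message) is treated as "no IP storage".
    """
    for line in iscsi_session_output.splitlines():
        parts = line.split()
        # A record starts with "<transport>:" followed by "[<session id>]".
        if len(parts) >= 4 and parts[0].endswith(":") and parts[1].startswith("["):
            return "IP"
    # No IP storage connection information found: fall back to FC storage.
    return "FC"
```

The fallback mirrors the text: FC storage is not detected directly, it is simply what remains when no IP storage connection is found.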
A second aspect of the invention provides a multipath state monitoring device, including:
a query setting module, used for setting the interval and the execution permission of the periodic query;
a path fault judging module, used for detecting a path fault by periodically querying the multipath output information;
a fault point diagnosis module, used for executing a path fault diagnosis process selected according to the storage type;
and a fault sending module, used for sending the link fault point to the monitoring end.
The multipath state monitoring device according to the second aspect can implement the method of the first aspect and of each of its implementation manners, and achieves the same effects.
The effects described in this summary are only those of the embodiments, not all effects of the invention. One of the above technical solutions has the following advantages or beneficial effects:
the invention provides a multipath state monitoring method based on the Linux operating system. It automatically scans the multipath state and monitors the multipath state of multiple storage types in a unified way, without regard to the differences between storage types, so that the link state of multi-type storage can be monitored and diagnosed simply. It also makes a preliminary diagnosis of a link fault and raises an alarm for it, which helps the administrator handle the fault promptly. The invention is easy to use and has no compatibility problems with physical devices or operating systems.
Drawings
FIG. 1 is a flow chart of method embodiment one of the present invention;
FIG. 2 is a flow chart of method embodiment two of the present invention;
FIG. 3 is a schematic structural diagram of a device embodiment of the present invention.
Detailed Description
In order to clearly explain the technical features of the present invention, the following detailed description of the present invention is provided with reference to the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different features of the invention. To simplify the disclosure of the present invention, the components and arrangements of specific examples are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and procedures are omitted so as to not unnecessarily limit the invention.
Example one
As shown in FIG. 1, a multipath state monitoring method includes the following steps:
S1, detecting a path fault by periodically querying the multipath output information;
S2, executing a path fault diagnosis process selected according to the storage type;
S3, sending the link fault point to the monitoring end.
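The three steps can be read as one monitoring pass. The sketch below only shows the control flow; the four callables are hypothetical placeholders for the real query, keyword check, diagnosis, and notification logic, none of which the source specifies at this level.

```python
from typing import Callable, Optional


def monitor_once(
    query_paths: Callable[[], str],
    has_fault: Callable[[str], bool],
    diagnose: Callable[[], str],
    notify: Callable[[str], None],
) -> Optional[str]:
    """Run one monitoring pass and return the fault point, or None if healthy."""
    status = query_paths()        # S1: query the multipath output information
    if not has_fault(status):     # S1: no fault keyword, nothing to do
        return None
    fault_point = diagnose()      # S2: diagnosis chosen by storage type
    notify(fault_point)           # S3: report to the monitoring end
    return fault_point
```

Passing the steps in as functions keeps the pass easy to exercise without real hardware; a scheduler would simply call `monitor_once` at each interval.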
Example two
As shown in FIG. 2, a multipath state monitoring method includes the following steps:
S1, setting the interval and the execution permission of the periodic query.
S2, periodically querying the multipath output information, and determining that a path has failed when a fault keyword is found in the state of any path in the multipath state information.
S3, if the storage type is IP storage, testing in turn whether communication with the storage service port, the storage management port, and the switch management port is normal, and determining the link fault point.
S4, if the storage type is FC storage, testing in turn whether communication with the storage optical port, the storage management port, and the optical fiber switch management port is normal, and determining the link fault point.
S5, sending the link fault point to the monitoring end.
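Steps S3 and S4 differ only in the first probe: IP storage pings the storage service port, while FC storage uses fcping against the storage optical port. A minimal sketch of that selection is shown below; `build_probe_plan`, its parameters, and the tuple layout are illustrative assumptions, and the function only plans probes, it does not run them.

```python
from typing import List, Tuple

Probe = Tuple[str, str, str]  # (port being tested, tool, target address)


def build_probe_plan(
    storage_type: str, data_port: str, storage_mgmt_ip: str, switch_mgmt_ip: str
) -> List[Probe]:
    """List the probes to run, in order, for the given storage type."""
    if storage_type == "IP":
        first = ("storage service port", "ping", data_port)
        switch_name = "switch management port"
    elif storage_type == "FC":
        # data_port is the WWN of the storage optical port in this case.
        first = ("storage optical port", "fcping", data_port)
        switch_name = "optical fiber switch management port"
    else:
        raise ValueError(f"unknown storage type: {storage_type!r}")
    # Management ports have IP addresses for both storage types.
    return [
        first,
        ("storage management port", "ping", storage_mgmt_ip),
        (switch_name, "ping", switch_mgmt_ip),
    ]
```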
Example three
A multipath state monitoring method includes the following steps:
S1, installing the multipath software, copying the script to a directory on the Linux system, and adding execute permission to the script with chmod +x.
S2, writing a scheduled time or interval with the crontab command on the Linux system, so that the system runs the script automatically according to the schedule.
S3, monitoring the multipath state automatically and continuously according to the schedule; when the script detects a fault keyword in the state of any path in the multipath state information, it determines that the path has failed and then runs the fault diagnosis process.
S4, when a multipath fault is found, the script first uses the iscsiadm command to check whether the system holds link information for IP storage. If IP storage connection device information is found, the storage type is determined to be IP storage; otherwise it is determined to be FC storage. The script then enters the corresponding fault diagnosis process.
S5, for IP storage, the storage service port is tested on the host system with "ping + IP address of the storage service port": if the peer IP address answers the ping, the link is normal; if it does not, the link is faulty. The storage management port and the switch management port are tested in the same way to narrow down the point where the link fault may have occurred.
S6, for FC storage, the script first checks communication from the host to the storage optical port with the command "fcping + WWN address of the storage optical port": if the command succeeds, the link is normal; if it fails, the link is faulty. It then checks communication with "ping + IP address of the storage management port" and "ping + IP address of the optical fiber switch management port" to determine the fault point of the link.
S7, after the diagnosis process finishes, the diagnosis result is printed on the screen as an alarm and sent to the administrator by mail. After the administrator confirms the fault, the detection process starts again.
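Step S7 can be illustrated with a small helper that turns probe results into a mail message. This is a sketch under stated assumptions: the result dictionary, the subject wording, and the addresses are hypothetical, and the message is only built here, not sent.

```python
from email.message import EmailMessage
from typing import Dict


def build_alert(results: Dict[str, bool], sender: str, recipient: str) -> EmailMessage:
    """Build the alarm mail from probe results (True means the port answered)."""
    failed = [name for name, ok in results.items() if not ok]
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Multipath link fault: " + (
        ", ".join(failed) or "no failing port found"
    )
    # One line per tested port, so the same text can also be printed on screen.
    lines = [f"{name}: {'normal' if ok else 'FAULT'}" for name, ok in results.items()]
    msg.set_content("\n".join(lines))
    return msg
```

The body returned by `msg.get_content()` can be printed for the on-screen alarm, and the message could then be handed to `smtplib.SMTP(...).send_message(msg)`; which mail host to use is site-specific and not stated in the source.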
ping is a command available on Windows, Unix, and Linux systems. It relies on ICMP echo messages, which belong to the TCP/IP protocol suite. The command "ping + IP address" checks whether an Ethernet port is reachable over the network and helps analyze and locate network faults.
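A ping-based reachability check might look like the sketch below. It assumes the Linux iputils `ping`, where `-c 1` sends a single echo request and `-W 2` waits up to two seconds for the reply; the function names and the injectable `runner` parameter are illustrative choices made so the logic can be exercised without a network.

```python
import subprocess
from typing import Callable, Sequence


def run_command(argv: Sequence[str]) -> int:
    """Run a command quietly and return its exit status."""
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode


def ping_ok(ip_address: str, runner: Callable[[Sequence[str]], int] = run_command) -> bool:
    """Return True if the peer answers one echo request (exit status 0)."""
    return runner(["ping", "-c", "1", "-W", "2", ip_address]) == 0
```

For example, `ping_ok("192.0.2.20")` would test a storage management port at that (documentation-range) address; an exit status other than 0 is treated as a link fault.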
fcping is similar to ping on Ethernet. Because an optical port has no IP address, only a unique WWN address, the command "fcping + WWN address" can be used to check link communication from the host to a specified optical port and to obtain link latency information.
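Because the FC probe is addressed by WWN rather than by IP address, a script has to pass a well-formed WWN to the tool. The helper below is a sketch that assumes the common 8-byte (16 hex digit) form and accepts a few spellings; `normalize_wwn` is a hypothetical name, and the exact fcping command line is deliberately left out because the source does not give it.

```python
import re


def normalize_wwn(raw: str) -> str:
    """Return an 8-byte WWN as lowercase, colon-separated hex pairs."""
    digits = re.sub(r"[:\-\s]", "", raw).lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    if not re.fullmatch(r"[0-9a-f]{16}", digits):
        raise ValueError(f"not an 8-byte WWN: {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 16, 2))
```

Rejecting malformed input early keeps a typo in the configured WWN from being reported as a link fault.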
The invention uses if and grep statements to search for and classify the multipath state keywords, and therefore copes well with different types of storage servers. The invention is easy to use: the script only needs to be copied to any directory on the host, given execute permission with the chmod command, and then scheduled through crontab to run periodically.
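The grep-style keyword search can be sketched in a few lines. The source does not list the fault keywords, so the defaults below ("failed", "faulty") are an assumption based on the path states that multipath tools commonly print; `find_faulty_paths` is a hypothetical name.

```python
from typing import List, Sequence

# Assumed keywords; the patent text does not enumerate them.
DEFAULT_FAULT_KEYWORDS = ("failed", "faulty")


def find_faulty_paths(
    multipath_output: str, keywords: Sequence[str] = DEFAULT_FAULT_KEYWORDS
) -> List[str]:
    """Return the lines of the multipath status text that contain a fault keyword."""
    hits = []
    for line in multipath_output.splitlines():
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(line.strip())
    return hits
```

A non-empty result corresponds to the "path has failed" branch of the script, after which the diagnosis process would start.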
As shown in FIG. 3, a multipath state monitoring device includes:
the query setting module 101, used for setting the interval and the execution permission of the periodic query;
the path fault judging module 102, used for detecting a path fault by periodically querying the multipath output information;
the fault point diagnosis module 103, used for executing a path fault diagnosis process selected according to the storage type;
and the fault sending module 104, used for sending the link fault point to the monitoring end.
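What the query setting module 101 has to do can be illustrated as follows: grant execute permission to the monitoring script and produce a schedule entry for the chosen interval. This is a sketch; the function names and the script path in the usage note are hypothetical, and the entry uses the standard five-field crontab syntax with a minute step.

```python
import os
import stat


def make_executable(script_path: str) -> None:
    """Add execute permission for user, group, and others (like chmod a+x)."""
    mode = os.stat(script_path).st_mode
    os.chmod(script_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def crontab_entry(script_path: str, every_minutes: int) -> str:
    """Return a crontab line that runs the script every N minutes."""
    if not 1 <= every_minutes <= 59:
        raise ValueError("every_minutes must be between 1 and 59")
    return f"*/{every_minutes} * * * * {script_path}"
```

For instance, `crontab_entry("/opt/monitor/multipath_check.sh", 5)` yields a line that could be added with `crontab -e`; the path is only a placeholder.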
Although the embodiments of the invention have been described with reference to the accompanying drawings, they are not intended to limit the scope of the invention, and those skilled in the art should understand that various modifications and variations can be made, without inventive effort, on the basis of the technical solution of the invention.

Claims (1)

1. A multipath state monitoring method, characterized by comprising the following steps:
setting the interval and the execution permission of a periodic query;
detecting a path fault by periodically querying the multipath output information;
wherein detecting a path fault by periodically querying the multipath output information includes:
periodically querying the multipath output information, and determining that a path has failed when a fault keyword is found in the state of any path in the multipath state information;
executing a path fault diagnosis process selected according to the storage type;
wherein executing the path fault diagnosis process according to the storage type includes:
when the storage type is determined to be IP storage, testing in turn whether communication with the storage service port, the storage management port, and the switch management port is normal, and determining the link fault point;
when the storage type is determined to be FC storage, testing in turn whether communication with the storage optical port, the storage management port, and the optical fiber switch management port is normal, and determining the link fault point;
searching the system for IP storage connection device information; if such information is found, determining that the storage type is IP storage, and otherwise determining that the storage type is FC storage;
wherein determining that the storage type is IP storage and testing in turn whether communication with the storage service port, the storage management port, and the switch management port is normal includes:
on the host system, testing the storage service port with "ping + IP address of the storage service port": if the peer IP address answers the ping, the link is judged normal; if it does not, the link is judged faulty; testing whether communication with the storage management port is normal with "ping + IP address of the storage management port", and testing whether communication with the switch management port is normal with "ping + IP address of the switch management port";
wherein determining that the storage type is FC storage and testing in turn whether communication with the storage optical port, the storage management port, and the optical fiber switch management port is normal includes:
on the host system, testing communication from the host to the storage optical port with the command "fcping + WWN address of the storage optical port": if the command succeeds, the link is normal; if it fails, the link is faulty; testing whether communication with the storage management port and the optical fiber switch management port is normal with "ping + IP address of the storage management port" and "ping + IP address of the optical fiber switch management port";
and sending the link fault point to the monitoring end.
CN201810953868.1A 2018-08-21 2018-08-21 Multipath state monitoring method and device Active CN109120522B (en)

Publications (2)

Publication Number Publication Date
CN109120522A (en) 2019-01-01
CN109120522B (en) 2021-07-27

Family

ID=64853293

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant