CN112579384A - Method, device and system for monitoring nodes of SAS domain and nodes

Info

Publication number
CN112579384A
Authority
CN
China
Prior art keywords
node
nodes
storage
exp
pages
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910926440.2A
Other languages
Chinese (zh)
Other versions
CN112579384B (en
Inventor
王腾腾
李庆华
吴海波
张宏海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201910926440.2A priority Critical patent/CN112579384B/en
Publication of CN112579384A publication Critical patent/CN112579384A/en
Application granted granted Critical
Publication of CN112579384B publication Critical patent/CN112579384B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3034 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a storage system, e.g. DASD based or network based
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Hardware Redundancy (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiment of the invention provides a method, a device and a system for monitoring nodes in a SAS domain, and a node. The method is applied to any node in a multi-node SAS domain system. Each node in the multi-node SAS domain system comprises a host bus adapter (HBA), an expander (EXP) and at least one disk (HDD); the EXP of each node connects the HBA and the HDD of that node and stores a storage page, and each node periodically updates its own storage page. The method comprises the following steps: periodically reading the storage pages in the EXPs of other nodes; determining whether the storage pages of the other nodes have been updated; and if the storage page of another node has not been updated, determining that the node is in the downtime state. Because the EXP storage pages can be shared among the nodes, they can carry heartbeat monitoring between the nodes, so that nodes in the downtime state can be found accurately and in time.

Description

Method, device and system for monitoring nodes of SAS domain and nodes
Technical Field
The invention relates to the technical field of SAS domain storage clusters, in particular to a method, a device, a system and a node for monitoring a SAS domain node.
Background
A SAS (Serial Attached SCSI, where SCSI stands for Small Computer System Interface) domain system may include a plurality of storage devices, hereinafter referred to as nodes of the SAS domain system. The nodes are connected through SAS technology, which enables point-to-point data interaction between them.
However, a node in the SAS domain system may, for some reason, enter a down state in which it cannot work. The disks of that node are then left unmanaged, which wastes storage resources. How to accurately monitor the states of the nodes in the SAS domain system and discover nodes in the downtime state in time, so as to reduce the waste of storage resources, is therefore a technical problem to be solved urgently.
Disclosure of Invention
The embodiment of the invention aims to provide a method, a device, a system and a node for monitoring nodes in a SAS domain, so as to monitor the state of the nodes in a multi-node SAS domain. The specific technical scheme is as follows:
In a first aspect of embodiments of the present invention, a method for monitoring nodes in a SAS domain is provided, which is applied to any node in a multi-node SAS domain system, where each node in the multi-node SAS domain system includes a host bus adapter HBA, an expander EXP, and at least one disk HDD, where the EXP of each node is used to connect the HBA and the HDD of the node, the EXP of each node stores a storage page, each node periodically updates the storage page of the node, and a communication connection is established between nodes in the multi-node SAS domain system, and the method includes:
periodically reading the storage pages in the EXPs of other nodes;
determining whether the storage pages of the other nodes are updated;
and if the storage pages of the other nodes are not updated, determining that the other nodes are in the downtime state.
With reference to the first aspect, in a possible implementation manner, the method further includes:
and if the storage pages of the other nodes are updated, determining that the other nodes are in a normal state.
With reference to the first aspect, in a possible implementation manner, after determining that the other nodes are down, the method further includes:
managing HDDs of the other nodes.
With reference to the first aspect, in a possible implementation manner, before the managing the HDDs of the other nodes, the method further includes:
determining whether the storage pages of the other nodes store takeover identifiers;
if the storage pages of other nodes do not store the takeover identifier, writing the takeover identifier into the storage pages of other nodes;
terminating the step of managing the HDDs of the other nodes if the takeover identifier is stored in the storage pages of the other nodes.
In a second aspect of the embodiments of the present invention, there is provided a node monitoring apparatus of a SAS domain, applied to any node in a multi-node SAS domain system, where each node in the multi-node SAS domain system includes a host bus adapter HBA, an expander EXP, and at least one disk HDD, where the HBA of each node is configured to manage the HDD of the node through the EXP, and the EXP in each node stores a storage page, and each node periodically updates the storage page of the node, and a communication connection is established between nodes in the multi-node SAS domain system, the apparatus including:
the page reading module is used for regularly reading the storage pages in the EXP of other nodes;
an update judgment module, configured to determine whether the storage page of the other node is updated;
and the state judgment module is used for determining that the other nodes are in the downtime state if the storage pages of the other nodes are not updated.
With reference to the second aspect, in a possible implementation manner, the state determining module is further configured to determine that the other node is in a normal state if the storage page of the other node is updated.
With reference to the second aspect, in a possible implementation manner, the apparatus further includes an HDD takeover module, configured to manage HDDs of the other nodes after the determination that the other nodes are down is made.
With reference to the second aspect, in a possible implementation manner, the HDD takeover module is further configured to determine, before the managing the HDDs of the other nodes, whether takeover identifiers are stored in the storage pages of the other nodes;
if the storage pages of other nodes do not store the takeover identifier, writing the takeover identifier into the storage pages of other nodes;
terminating the step of managing the HDDs of the other nodes if the takeover identifier is stored in the storage pages of the other nodes.
In a third aspect of the embodiments of the present invention, there is provided a multi-node SAS domain system, including:
a plurality of nodes, each of which comprises a host bus adapter HBA, an expander EXP and at least one disk HDD, wherein the HBA of each node is used for managing the HDD of the node through the EXP, the EXP of each node stores storage pages, each node periodically updates the storage pages of the node, and a communication connection is established among the nodes;
each node of the plurality of nodes implements the method steps of any of the first aspect.
With reference to the third aspect, in a possible implementation manner, the system further includes a SAS switch, where the SAS switch establishes a communication connection with the EXP of each node in the plurality of nodes;
the SAS switch is used for realizing data interaction among the nodes.
With reference to the third aspect, in a possible implementation manner, the plurality of nodes are two nodes;
the system further comprises a SAS wire, one end of the SAS wire is connected with one of the two nodes, and the other end of the SAS wire is connected with the other of the two nodes;
the SAS wire is used for realizing communication connection between the two nodes.
In a fourth aspect of the embodiments of the present invention, a node is provided, which is applied to a multi-node SAS domain system, where the node includes a host bus adapter HBA, an expander EXP, at least one disk HDD, a processor, and a memory;
the EXP is used for connecting the HBA and the HDD;
the EXP stores storage pages;
the processor is used for updating the storage page periodically;
the memory is used for storing a computer program;
the processor is further configured to implement the method steps of any of the first aspect described above when executing the program stored in the memory.
In a fifth aspect of embodiments of the present invention, there is provided a computer-readable storage medium having stored therein a computer program which, when executed by a processor, performs the method steps of any one of the above-mentioned first aspects.
The method, the device, the system and the nodes for monitoring the nodes of the SAS domain provided by the embodiment of the invention can realize the heartbeat monitoring among the nodes through the EXP storage page which can be shared among the nodes, so as to accurately and timely discover the nodes in the downtime state in the multi-node SAS domain system. Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1a is a schematic diagram of a multi-node SAS domain system according to an embodiment of the present invention;
FIG. 1b is a schematic diagram of another structure of a multi-node SAS domain system according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method for monitoring nodes in a SAS domain according to an embodiment of the present invention;
FIG. 3 is a flow chart of a HDD takeover method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating an exemplary configuration of a node monitoring device in a SAS domain;
FIG. 5 is a schematic structural diagram of a node applied in a multi-node SAS domain system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1a, FIG. 1a is a schematic structural diagram of a multi-node SAS domain system according to an embodiment of the present invention. The system may include nodes 100 and a SAS switch 200, where each node 100 includes a host bus adapter (HBA) 110, an expander (EXP) 120, and at least one hard disk drive (HDD) 130. The number of HDDs in different nodes may be the same or different; for example, node A may include 4 HDDs and node B may include 6 HDDs, which is not limited in this embodiment.
The EXP 120 is used to connect the HBA 110 and the HDD 130. The EXP stores a storage page, and each node periodically updates its own storage page. The update may be periodic with a fixed period, or aperiodic according to a preset rule or a user instruction. For example, the storage page may be updated every 3 minutes, or at the 10th, 20th, and 40th minute of each hour.
Updating a storage page may refer to changing at least part of the content of the storage page, so that the content after the update is not identical to the content before the update. For example, assuming that the storage page includes 3 bytes in total and the 3 bytes before the update are 000 (one digit per byte), the 3 bytes after the update may be 010 or 123, which is not limited in this embodiment. The content stored in the storage page may differ according to the application scenario, which is not limited in this embodiment. For example, a count may be stored in the storage page; the count is incremented by one each time the storage page is updated until a preset maximum value is reached, and is reset to zero at the next update.
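The counting scheme described above can be sketched as follows. This is a minimal illustration only: the one-byte page layout, the name MAX_COUNT and its value of 255, and the helper names are assumptions made for the example and are not specified by the embodiment.

```python
# Hypothetical layout: the storage page holds a single wrap-around counter.
MAX_COUNT = 255  # assumed preset maximum; the embodiment does not fix a value


def next_count(count: int, max_count: int = MAX_COUNT) -> int:
    """Return the counter value to write on the next update."""
    # Increment until the preset maximum is reached, then reset to zero.
    return 0 if count >= max_count else count + 1


def update_page(page: bytearray) -> None:
    """Update a one-byte page in place so that its content changes."""
    page[0] = next_count(page[0])


page = bytearray([254])
update_page(page)  # counter reaches the maximum (255)
update_page(page)  # counter is reset to zero
```

As long as the maximum is at least 1, two consecutive updates never leave the page with the same content, which is the property the update check relies on.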
Communication connections are established among the nodes based on SAS technology. In the embodiment shown in FIG. 1a, the connections are realized through the SAS switch. In other application scenarios, the communication connection may be implemented in other manners; for example, when the multi-node SAS domain system has 2 nodes, as shown in FIG. 1b, the connection may use a SAS line 300, one end of which is connected to one of the two nodes and the other end of which is connected to the other node.
The multi-node SAS domain systems shown in FIG. 1a and FIG. 1b are only two possible architectures of the multi-node SAS domain system provided in the embodiment of the present invention; in other possible embodiments, the multi-node SAS domain system may have another architecture, which is not limited in this embodiment. For convenience of description, the architecture shown in FIG. 1a is taken as an example below to explain the method for monitoring nodes of the SAS domain provided in the embodiment of the present invention. The principle is the same for the architecture shown in FIG. 1b and for other possible architectures, so the details are not repeated.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating a method for monitoring nodes in a SAS domain according to an embodiment of the present invention, where the method may include:
S201, periodically reading the storage pages in the EXPs of other nodes.
The method can be applied to any node in the multi-node SAS domain system, and the other nodes can be all or part of the nodes in the system other than that node. For example, assuming that the multi-node SAS domain system includes node A, node B, node C and node D, and the method is applied to node A, the other nodes may be node B, node C and node D; in other possible application scenarios, they may also be one or two of the three nodes, such as node B, or node B and node C. It is understood that, since a communication connection based on SAS technology is established between the nodes, any node can access the EXPs of the other nodes and read the storage pages therein.
The reading may be periodic with a fixed period, or aperiodic according to a preset rule or a user instruction. With reference to the foregoing description of the update, in the embodiment of the present invention the interval between any two consecutive reads contains at least one update. For example, the storage pages may be updated every 3 minutes and the storage pages of other nodes read every 3 minutes, with a one-minute delay between updating and reading: at t = 0 min each node updates its storage page, at t = 1 min the storage pages of other nodes are read, at t = 3 min each node updates its storage page again, at t = 4 min the storage pages of other nodes are read again, and so on. For another example, the storage page may be updated at the 10th, 20th, and 40th minute of every hour, and read at the 11th, 21st, and 41st minute of every hour.
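The timing constraint above (at least one update between any two consecutive reads) can be checked for a given schedule. The following is a minimal sketch; the function name updates_between and the representation of schedules as lists of minute values are assumptions made for illustration.

```python
def updates_between(update_times, read_times):
    """For each pair of consecutive reads, count the updates in (prev, cur]."""
    reads = sorted(read_times)
    return [
        sum(1 for u in update_times if prev < u <= cur)
        for prev, cur in zip(reads, reads[1:])
    ]


# Fixed 3-minute period, reads delayed by one minute after each update.
periodic_counts = updates_between([0, 3, 6, 9], [1, 4, 7, 10])

# Updates at minutes 10, 20, 40 of an hour; reads at minutes 11, 21, 41.
hourly_counts = updates_between([10, 20, 40], [11, 21, 41])

# The schedule is usable when every interval contains at least one update.
schedule_ok = all(c >= 1 for c in periodic_counts + hourly_counts)
```

If any interval contained no update, a healthy node would look unchanged at the next read and could be misjudged as down, which is why the constraint matters.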
S202, determining whether the storage pages of other nodes are updated.
A storage page is regarded as updated when the content read this time differs from the content read last time. It can be understood that, when a storage page is read for the first time, there is no previously read content, so the storage page may be regarded as updated when the content read this time differs from the preset initial content.
The manner of determining whether the storage pages of other nodes are updated may differ according to the application scenario. For example, the storage page read this time may be compared with the storage page read last time: if the two differ, it is determined that the storage page of the other node is updated; if they do not differ, it is determined that the storage page of the other node is not updated.
In other possible application scenarios, the storage page may also store information indicating the time of its latest update. The time of the latest update is determined from the read information, and if that time is later than the time at which the storage page was last read, the storage page is determined to be updated. For example, assuming that the storage page was last read at t = 1 min and is read this time at t = 4 min: if the information in the storage page shows that it was last updated at t = 3 min, it may be determined that the storage page has been updated; if the information shows that it was last updated at t = 0 min, it may be determined that the storage page has not been updated.
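Both ways of deciding whether a page has been updated can be sketched briefly. The function names, the 3-byte INITIAL_PAGE value, and the use of minutes as the time unit are assumptions made for this example; the embodiment does not prescribe a page format.

```python
from typing import Optional

INITIAL_PAGE = bytes(3)  # assumed preset initial content (three zero bytes)


def updated_by_content(current: bytes, previous: Optional[bytes]) -> bool:
    """Content comparison: updated if the page differs from the last read.

    On the first read there is no previous page, so the preset initial
    content is used as the baseline.
    """
    baseline = INITIAL_PAGE if previous is None else previous
    return current != baseline


def updated_by_timestamp(last_update_min: float, last_read_min: float) -> bool:
    """Timestamp comparison: updated if the recorded update time is later
    than the time of the previous read."""
    return last_update_min > last_read_min


# Worked example from the text: previous read at minute 1.
fresh = updated_by_timestamp(last_update_min=3, last_read_min=1)
stale = updated_by_timestamp(last_update_min=0, last_read_min=1)
```

Content comparison needs no clock agreement between nodes, whereas the timestamp variant assumes the reader can meaningfully compare the writer's recorded time with its own read time.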
S203, if the storage pages of other nodes are not updated, determining that the other nodes are in the downtime state.
Referring to the foregoing description of the storage pages, it can be understood that a node in a normal state will, in theory, update its storage page periodically. Therefore, if the storage page of another node has not been updated, that node can be determined to be in a down state; if the storage page of another node has been updated, that node may be considered to be in a normal state.
For example, take a multi-node SAS domain system including node A, node B, node C, and node D. Assuming that, after node A reads the storage pages of node B, node C, and node D, it determines that the storage pages of node B and node C are updated and the storage page of node D is not updated, it may determine that node B and node C are in a normal state and node D is in a down state.
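The four-node example can be written out as a small classification step performed by node A. This is an illustrative sketch only: the function name classify, the dictionary representation, and the one-byte page values are assumptions, and both dictionaries are assumed to contain the same node names.

```python
def classify(previous: dict, current: dict) -> dict:
    """Map each peer to 'normal' if its page changed since the last read,
    otherwise to 'down'."""
    return {
        name: "normal" if current[name] != previous[name] else "down"
        for name in current
    }


# Pages that node A saw at the previous read and at the current read.
previous_pages = {"B": b"\x01", "C": b"\x05", "D": b"\x07"}
current_pages = {"B": b"\x02", "C": b"\x06", "D": b"\x07"}

states = classify(previous_pages, current_pages)
```

Here node D's page is byte-for-byte identical across the two reads, so it is the only peer classified as down.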
By adopting the embodiment, the heartbeat monitoring among the nodes can be realized through the EXP storage pages which can be shared among the nodes, so that the nodes in the downtime state in the multi-node SAS domain system can be accurately and timely found.
Referring again to the embodiment shown in fig. 1a (the principle of the embodiment shown in fig. 1b is the same as that of other possible embodiments, and is not described again), if a node is in a down state, the HDD of the node is in an unmanaged state, which results in waste of hardware resources. In view of this, referring to fig. 3, fig. 3 is a schematic flowchart of a HDD takeover method according to an embodiment of the present invention, where the method includes:
S301, periodically reading the storage pages in the EXPs of other nodes.
The step is the same as S201, and reference may be made to the related description in S201, which is not described herein again.
S302, determining whether the storage pages of other nodes are updated, if the storage pages of other nodes are updated, executing S303, and if the storage pages of other nodes are not updated, executing S304.
S303, determining that other nodes are in a normal state, and returning to execute S301.
S304, determining that other nodes are in the downtime state.
The step is the same as S203, and reference may be made to the related description of S203, which is not described herein again.
S305, determining whether the storage pages of other nodes store the takeover identifier.
The takeover identifier may be represented in different forms according to different application scenarios. For example, in one possible embodiment, a specified location in the memory page may be set as the takeover flag. Determining whether the storage page stores the takeover identifier may be determining whether a value of a specified location in the storage page is 1, determining that the storage page stores the takeover identifier if the value of the specified location is 1, and determining that the storage page does not store the takeover identifier if the value of the specified location is 0.
S306, if the storage pages of other nodes do not store the takeover identifier, writing the takeover identifier into the storage pages of other nodes.
In this embodiment, if there is no takeover identifier in the storage page of another node, it may be considered that the HDDs of that node, which is in the downtime state, have not yet been taken over and are unmanaged, so these HDDs may be taken over. To prevent multiple nodes from taking over these HDDs at the same time, the takeover identifier may be written into the storage page before the takeover, so that any other node preparing to take over these HDDs stops doing so once it finds the takeover identifier in the storage page.
S307, managing the HDDs of other nodes.
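Steps S305 to S307 can be sketched as a check-then-mark routine. The page layout (a bytearray with the takeover identifier at TAKEOVER_OFFSET, value 1 meaning "taken over") and the function name try_take_over are hypothetical choices for this example. The sketch is sequential and does not model how simultaneous accesses by several nodes are ordered.

```python
TAKEOVER_OFFSET = 1  # hypothetical position of the takeover identifier


def try_take_over(page: bytearray) -> bool:
    """Return True if the caller should go on to manage the down node's HDDs."""
    # S305: check whether the takeover identifier is already stored.
    if page[TAKEOVER_OFFSET] == 1:
        return False  # another node has taken over; stop here
    # S306: write the takeover identifier before taking over.
    page[TAKEOVER_OFFSET] = 1
    # S307: the caller may now manage the HDDs of the down node.
    return True


down_node_page = bytearray([7, 0])  # heartbeat byte, then takeover byte
first_attempt = try_take_over(down_node_page)   # first node to respond
second_attempt = try_take_over(down_node_page)  # a slower node
```

Only the first caller is told to proceed; every later caller sees the identifier and backs off, which mirrors the behaviour described for the faster and slower nodes.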
To describe the HDD takeover method provided in the embodiment of the present invention more clearly, a multi-node SAS domain system including node A, node B, node C, and node D is described below as a specific application scenario. Assume that the four nodes update their storage pages with a period of 3 minutes and read the storage pages of other nodes with a period of 3 minutes, that node A, node B, and node C are always in a normal state, and that node D goes down at t = 2 min. The timing of the HDD takeover may then be as follows:
At t = 0 min, node A updates the storage page of node A, node B updates the storage page of node B, node C updates the storage page of node C, and node D updates the storage page of node D.
At t = 1 min, node A reads the storage pages of node B, node C and node D; node B reads the storage pages of node A, node C and node D; node C reads the storage pages of node A, node B and node D; and node D reads the storage pages of node A, node B and node C.
Node A determines whether the storage pages of node B, node C and node D are updated; node B determines whether the storage pages of node A, node C and node D are updated; node C determines whether the storage pages of node A, node B and node D are updated; and node D determines whether the storage pages of node A, node B and node C are updated.
Since every node is in a normal state and updated its storage page at t = 0 min, each node finds that the storage pages of all other nodes are updated, and there is no need to take over the HDDs of other nodes.
At t = 3 min, node A updates the storage page of node A, node B updates the storage page of node B, and node C updates the storage page of node C. Node D is down and therefore does not update its storage page.
At t = 4 min, node A reads the storage pages of node B, node C and node D; node B reads the storage pages of node A, node C and node D; and node C reads the storage pages of node A, node B and node D.
Node A determines whether the storage pages of node B, node C and node D are updated; node B determines whether the storage pages of node A, node C and node D are updated; and node C determines whether the storage pages of node A, node B and node D are updated.
Since node D did not update its storage page at t = 3 min, node A, node B, and node C each determine that the storage page of node D is not updated, and therefore that node D is in the downtime state and its HDDs need to be taken over.
Suppose node A responds fastest. Node A determines whether the storage page of node D stores the takeover identifier. At this time, the HDDs of node D have not been taken over by any node, so node A determines that the storage page of node D does not store the takeover identifier. Node A therefore writes the takeover identifier into the storage page of node D and manages the HDDs of node D.
Node B and node C respond more slowly. By the time they determine whether the storage page of node D stores the takeover identifier, node A has already written it, so node B and node C determine that the storage page of node D stores the takeover identifier and do not take over the HDDs of node D. That is, after node D goes down, node A takes over the HDDs of node D.
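The scenario above can be replayed with a small simulation. This is a sketch under stated assumptions: time advances in whole minutes, each page holds a counter and a takeover field, the order of NODES stands in for response speed (node A first), and all names (simulate, is_up, DOWN_AT) are invented for the example.

```python
PERIOD = 3                 # update and read period in minutes
NODES = ["A", "B", "C", "D"]
DOWN_AT = {"D": 2}         # node D goes down at t = 2 min


def is_up(node: str, t: int) -> bool:
    """A node is up until the minute at which it goes down."""
    return node not in DOWN_AT or t < DOWN_AT[node]


def simulate(end_min: int) -> list:
    """Return takeover events as (minute, taking_over_node, down_node)."""
    pages = {n: {"count": 0, "takeover": None} for n in NODES}
    last_seen = {n: {} for n in NODES}
    events = []
    for t in range(end_min + 1):
        if t % PERIOD == 0:  # update instants: 0, 3, 6, ...
            for n in NODES:
                if is_up(n, t):
                    pages[n]["count"] += 1
        if t % PERIOD == 1:  # read instants: 1, 4, 7, ...
            for reader in NODES:  # list order models response speed
                if not is_up(reader, t):
                    continue
                for peer in NODES:
                    if peer == reader:
                        continue
                    count = pages[peer]["count"]
                    # First read compares against the initial count of 0.
                    not_updated = last_seen[reader].get(peer, 0) == count
                    last_seen[reader][peer] = count
                    if not_updated and pages[peer]["takeover"] is None:
                        pages[peer]["takeover"] = reader
                        events.append((t, reader, peer))
    return events


takeovers = simulate(4)
```

Running the simulation up to minute 4 yields a single takeover of node D by node A; node B and node C also see the stale page but find the takeover field already set.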
Referring to fig. 4, fig. 4 is a schematic structural diagram of a node monitoring apparatus of a SAS domain according to an embodiment of the present invention, where the apparatus is applied to any node in a multi-node SAS domain system, each node in the multi-node SAS domain system includes a host bus adapter HBA, an expander EXP, and at least one disk HDD, where the HBA of each node is used to manage the HDD of the node through the EXP, the EXP in each node stores storage pages, and each node periodically updates the storage pages of the node, and a communication connection is established between nodes in the multi-node SAS domain system, and the apparatus may include:
a page reading module 401, configured to periodically read a storage page in the EXP of another node;
an update judging module 402, configured to determine whether the storage pages of the other nodes are updated;
and a state judgment module 403, configured to determine that other nodes are in a downtime state if the storage pages of the other nodes are not updated.
In a possible embodiment, the state judgment module 403 is further configured to determine that the other node is in a normal state if the storage page of the other node is updated.
In one possible embodiment, the apparatus further includes an HDD takeover module for managing HDDs of the other nodes after determining that the other nodes are down.
In a possible embodiment, the HDD takeover module is further configured to determine whether a takeover identifier is stored in a storage page of the other node before managing HDDs of the other nodes;
if the storage pages of other nodes do not store the takeover identifier, writing the takeover identifier into the storage pages of other nodes;
and if the takeover identification is stored in the storage pages of the other nodes, terminating the step of managing the HDDs of the other nodes.
An embodiment of the present invention further provides a node, which is applied to a multi-node SAS domain system. As shown in fig. 5, the node may include a host bus adapter HBA 110, an expander EXP 120, at least one disk HDD 130, a processor 140, and a memory 150.
the EXP 120 is used to connect the HBA 110 and the HDD 130;
the EXP 120 stores a storage page;
the processor 140 is configured to periodically update the storage page;
the memory 150 is configured to store a computer program;
the processor 140 is further configured to implement the following steps when executing the program stored in the memory 150:
periodically reading the storage pages in the EXPs of other nodes;
determining whether the storage pages of the other nodes are updated;
and if the storage pages of the other nodes are not updated, determining that the other nodes are in a downtime state.
In one possible embodiment, the steps further include:
if the storage pages of the other nodes are updated, determining that the other nodes are in a normal state.
In one possible embodiment, after determining that the other nodes are in the downtime state, the steps further include:
managing the HDDs of the other nodes.
In one possible embodiment, before managing the HDDs of the other nodes, the steps further include:
determining whether a takeover identifier is stored in the storage pages of the other nodes;
if the storage pages of the other nodes do not store the takeover identifier, writing the takeover identifier into the storage pages of the other nodes;
and if the takeover identifier is already stored in the storage pages of the other nodes, terminating the step of managing the HDDs of the other nodes.
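The node also updates its own storage page periodically so that its peers can observe that it is alive. One possible form of that update, shown purely as an assumption, is a counter that the processor increments on each period. The function name `bump_heartbeat`, the 32-bit little-endian counter, and its offset are all hypothetical and are not taken from the embodiment.

```python
import struct

HEARTBEAT_OFFSET = 4  # hypothetical position of the counter within the storage page


def bump_heartbeat(page: bytearray, offset: int = HEARTBEAT_OFFSET) -> int:
    """Increment a 32-bit little-endian counter in the page and return the new value."""
    (count,) = struct.unpack_from("<I", page, offset)
    count = (count + 1) & 0xFFFFFFFF       # wrap around instead of overflowing
    struct.pack_into("<I", page, offset, count)
    return count
```

Any change that a reader can detect would serve the same purpose; a counter is used here only because it makes two consecutive reads of a live node differ.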
The memory of the above node may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, such as a Central Processing Unit (CPU) or a Network Processor (NP); it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, a computer-readable storage medium is further provided, which has instructions stored therein which, when run on a computer, cause the computer to perform the node monitoring method of a SAS domain according to any of the above embodiments.
In yet another embodiment, a computer program product containing instructions is also provided which, when run on a computer, causes the computer to perform the node monitoring method of a SAS domain according to any of the above embodiments.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The embodiments in this specification are described in an interrelated manner: the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the embodiments of the apparatus, the system, the node, the computer-readable storage medium, and the computer program product are substantially similar to the method embodiments, so they are described relatively briefly, and the relevant points may be found in the description of the method embodiments.
The above describes only preferred embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (13)

1. A node monitoring method of a SAS domain, applied to any node in a multi-node SAS domain system, each node in the multi-node SAS domain system including a host bus adapter HBA, an expander EXP, and at least one disk HDD, wherein the EXP of each node is used to connect the HBA and the HDD of the node, and the EXP in each node stores a storage page, and each node periodically updates the storage page of the node, and a communication connection is established between nodes in the multi-node SAS domain system, the method comprising:
periodically reading the storage pages in the EXPs of other nodes;
determining whether the storage pages of the other nodes are updated;
and if the storage pages of the other nodes are not updated, determining that the other nodes are in a downtime state.
2. The method of claim 1, further comprising:
and if the storage pages of the other nodes are updated, determining that the other nodes are in a normal state.
3. The method of claim 1, wherein after said determining that said other nodes are in the downtime state, said method further comprises:
managing HDDs of the other nodes.
4. The method of claim 3, wherein prior to said managing HDDs of said other nodes, said method further comprises:
determining whether a takeover identifier is stored in the storage pages of the other nodes;
if the storage pages of the other nodes do not store the takeover identifier, writing the takeover identifier into the storage pages of the other nodes;
terminating the step of managing the HDDs of the other nodes if the takeover identifier is stored in the storage pages of the other nodes.
5. A node monitoring apparatus of a SAS domain applied to any node in a multi-node SAS domain system, each node in the multi-node SAS domain system including a host bus adapter HBA, an expander EXP, and at least one disk HDD, wherein the HBA of each node is used to manage the HDD of the node by the EXP, and the EXP in each node stores a storage page, and each node periodically updates the storage page of the node, and a communication connection is established between nodes in the multi-node SAS domain system, the apparatus comprising:
a page reading module, configured to periodically read the storage pages in the EXPs of other nodes;
an update judgment module, configured to determine whether the storage pages of the other nodes are updated;
and a state judgment module, configured to determine that the other nodes are in a downtime state if the storage pages of the other nodes are not updated.
6. The apparatus of claim 5, wherein the state judgment module is further configured to determine that the other nodes are in a normal state if the storage pages of the other nodes are updated.
7. The apparatus of claim 5, further comprising an HDD takeover module, configured to manage the HDDs of the other nodes after it is determined that the other nodes are in the downtime state.
8. The apparatus of claim 7, wherein the HDD takeover module is further configured to, before managing the HDDs of the other nodes, determine whether a takeover identifier is stored in the storage pages of the other nodes;
if the storage pages of the other nodes do not store the takeover identifier, write the takeover identifier into the storage pages of the other nodes;
and terminate the step of managing the HDDs of the other nodes if the takeover identifier is stored in the storage pages of the other nodes.
9. A multi-node SAS domain system, the system comprising:
a plurality of nodes, each of which comprises a host bus adapter HBA, an expander EXP and at least one disk HDD, wherein the HBA of each node is used for managing the HDD of the node through the EXP, the EXP of each node stores storage pages, each node periodically updates the storage pages of the node, and a communication connection is established among the nodes;
each node of said plurality of nodes implementing the method steps of any of claims 1-4.
10. The system of claim 9, further comprising a SAS switch, the SAS switch having established communication connections with the EXP of each of the plurality of nodes;
the SAS switch is used for realizing data interaction among the nodes.
11. The system of claim 9, wherein the plurality of nodes is two nodes;
the system further comprises a SAS wire, one end of the SAS wire is connected with one of the two nodes, and the other end of the SAS wire is connected with the other of the two nodes;
the SAS wire is used for realizing communication connection between the two nodes.
12. A node, for application in a multi-node SAS domain system, comprising a host bus adapter HBA, an expander EXP, at least one disk HDD, a processor, a memory;
the EXP is used for connecting the HBA and the HDD;
the EXP stores storage pages;
the processor is used for updating the storage page periodically;
the memory is used for storing a computer program;
the processor is further configured to perform the method steps of any of claims 1-4 when executing the program stored in the memory.
13. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910926440.2A CN112579384B (en) 2019-09-27 2019-09-27 Method, device and system for monitoring nodes of SAS domain and nodes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910926440.2A CN112579384B (en) 2019-09-27 2019-09-27 Method, device and system for monitoring nodes of SAS domain and nodes

Publications (2)

Publication Number Publication Date
CN112579384A (en) 2021-03-30
CN112579384B (en) 2023-07-04

Family

ID=75110040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910926440.2A Active CN112579384B (en) 2019-09-27 2019-09-27 Method, device and system for monitoring nodes of SAS domain and nodes

Country Status (1)

Country Link
CN (1) CN112579384B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102325163A (en) * 2011-07-18 2012-01-18 福建星网锐捷网络有限公司 Routing updating method, device and relevant equipment
CN103475695A (en) * 2013-08-21 2013-12-25 华为数字技术(成都)有限公司 Interconnection method and device for storage system
CN104657316A (en) * 2015-03-06 2015-05-27 北京百度网讯科技有限公司 Server
CN105843557A (en) * 2016-03-24 2016-08-10 天津书生云科技有限公司 Redundant storage system, redundant storage method and redundant storage device
CN105912666A (en) * 2016-04-12 2016-08-31 中国科学院软件研究所 Method for high-performance storage and inquiry of hybrid structure data aiming at cloud platform
CN107046575A (en) * 2017-04-18 2017-08-15 南京卓盛云信息科技有限公司 A kind of cloud storage system and its high density storage method
CN109582213A (en) * 2017-09-29 2019-04-05 杭州海康威视系统技术有限公司 Data reconstruction method and device, data-storage system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3326669B2 (en) * 1995-06-30 2002-09-24 ソニー株式会社 Data playback device
CN108762987A (en) * 2018-05-30 2018-11-06 上海顺舟智能科技股份有限公司 Data reconstruction method and device for double copies microcontroller flash memory

Also Published As

Publication number Publication date
CN112579384B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
CN114020482A (en) Method and apparatus for data writing
JP6325001B2 (en) Method and system using recursive event listeners in nodes of hierarchical data structures
US20130166672A1 (en) Physically Remote Shared Computer Memory
CN103338243A (en) Method and system for updating cache data of Web node
RU2653254C1 (en) Method, node and system for managing data for database cluster
US10795579B2 (en) Methods, apparatuses, system and computer program products for reclaiming storage units
US11237761B2 (en) Management of multiple physical function nonvolatile memory devices
CN115599747A (en) Metadata synchronization method, system and equipment of distributed storage system
CN107577775B (en) Data reading method and device, electronic equipment and readable storage medium
US8738816B2 (en) Management of detected devices coupled to a host machine
CN112579384B (en) Method, device and system for monitoring nodes of SAS domain and nodes
CN111078418A (en) Operation synchronization method and device, electronic equipment and computer readable storage medium
US20150135004A1 (en) Data allocation method and information processing system
US11150847B2 (en) Shingled magnetic recording drive mapping using nonvolatile random access memory for persistent updates
CN110083509B (en) Method and device for arranging log data
EP2916231B1 (en) Directory maintenance method and apparatus
JP2013186765A (en) Batch processing system, progress confirmation device, progress confirmation method and program
US10866756B2 (en) Control device and computer readable recording medium storing control program
JP6988178B2 (en) Information processing device, log management program and log management method
JP6542172B2 (en) Job execution control device and program
CN107209882B (en) Multi-stage de-registration for managed devices
US20230244390A1 (en) Collecting quality of service statistics for in-use child physical functions of multiple physical function non-volatile memory devices
CN114731326B (en) Block chain system, program and network connection device
CN112543213B (en) Data processing method and device
US10853188B2 (en) System and method for data retention in a decentralized system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant