CN103051470B - Cluster and control method for its disk heartbeat - Google Patents

Cluster and control method for its disk heartbeat

Info

Publication number: CN103051470B
Authority: CN (China)
Prior art keywords: heartbeat, node, sector, cluster, disk
Legal status: Active (an assumption, not a legal conclusion)
Application number: CN201210500389.7A
Other languages: Chinese (zh)
Other versions: CN103051470A (en)
Inventor: 魏子然
Current Assignee: China Standard Software Co Ltd
Original Assignee: China Standard Software Co Ltd
Application filed by: China Standard Software Co Ltd
Priority to: CN201210500389.7A
Publications: CN103051470A (application), CN103051470B (grant)

Abstract

The invention discloses a method for controlling the disk heartbeat of a cluster, comprising: when the communication heartbeat flag is true, any node of the cluster either alternates the disk heartbeat with one of the network heartbeat or the serial heartbeat, or separates every two disk heartbeats by at least one network heartbeat and one serial heartbeat; when the communication heartbeat flag is false, the node performs the disk heartbeat. When a node cannot detect a network heartbeat or serial heartbeat, it sets the communication heartbeat flag to false. Each node designates one of the sectors not used by other nodes as the sector it will use, and designates the remaining sectors as the sectors the other nodes will use. A node conveys its heartbeat to the other nodes by writing data to its own sector and obtains the heartbeat messages of the other nodes by reading the other sectors. Because the disk heartbeat is controlled separately, the cluster continues to work normally when the network connection and/or the serial cable fails, and cluster split-brain is prevented.

Description

Cluster and control method for its disk heartbeat
Technical field
The present invention relates to the field of computer networks, and in particular to a cluster and a method for controlling its disk heartbeat.
Background art
A high-availability cluster is one of the more common kinds of cluster. When its hardware or software fails, the data running in the cluster system is not easily lost, and normal operation can be restored in as short a time as possible.
The layered architecture of a high-availability cluster consists, from bottom to top, of the messaging and infrastructure layer (Messaging and Infrastructure Layer), the membership layer (Membership Layer), the resource allocation layer (Resource Allocation Layer), and the resource layer (Resource Layer). The messaging and infrastructure layer is the important layer that transmits heartbeat messages. Heartbeat transmission means that each node periodically announces its own heartbeat message to the other nodes; if no heartbeat from a node has been detected after 3 to 5 heartbeat cycles, that node is considered to have failed.
In the prior art, the heartbeat connection of each node in a cluster is an independent physical connection, which can be a serial cable, an Ethernet connection, or a shared disk.
1. Serial cable: this is considered somewhat more secure than an Ethernet connection, because an attacker cannot run programs such as telnet, ssh, or rsh over a serial connection, which reduces the chance that an attacker who has hijacked one node can break into another node from it. However, a serial cable is limited in usable length, so the two nodes of the cluster must be very close together. The serial cable corresponds to the serial heartbeat.
2. Ethernet connection: this removes the length limitation of the serial cable, and file systems can be synchronized between nodes over the same connection, which reduces the bandwidth taken from the normal communication links. The Ethernet connection corresponds to the network heartbeat. Its security is lower, however, and it is more easily attacked.
3. Shared disk: this requires a shared storage system in the high-availability cluster. Heartbeat signals are exchanged through the shared disk: each node writes its own heartbeat message to a sector of the shared disk and reads the messages written by the other nodes. The shared disk corresponds to the disk heartbeat. The greatest benefit of the disk heartbeat is that it prevents nodes from corrupting the data in shared storage when the cluster suffers split-brain.
OpenAIS is an application programming interface specification for cluster frameworks based on the SA Forum standard. OpenAIS provides a cluster model that covers the cluster framework, cluster membership management, and communication, and it can offer cluster software or tools a cluster interface that conforms to the AIS standard. Corosync, which was derived from OpenAIS, is an open cluster engine project that implements a group communication system for high-availability applications. For heartbeats, Corosync supports the network heartbeat, and each network heartbeat sends its heartbeat message by multicast or broadcast.
However, Corosync implements no form of heartbeat other than the network heartbeat. When the network environment fails, the cluster splits into several small clusters, each of which may read and write the data in shared storage at the same time, so that the data in shared storage is corrupted and the cluster cannot work normally.
A solution to this problem is therefore urgently needed.
Summary of the invention
One of technical problem to be solved by this invention needs to provide a kind of control method can supporting magnetic disk heartbeat in the cluster of the heartbeat of various ways.
In order to solve the problems of the technologies described above, the invention provides a kind of control method of magnetic disk heartbeat of cluster, the method comprises: communication heartbeat be designated be time, the any node of described cluster alternately performs magnetic disk heartbeat and one of network Heartbeat or serial heartbeat or at least heartbeat of interval primary network and serial heartbeat between twice magnetic disk heartbeat, and when communication heartbeat is designated no, described any node carries out magnetic disk heartbeat; Wherein, when the continuous preset times of described any node cannot detect network Heartbeat or serial heartbeat, described communication heartbeat mark is set to no by described any node; And when the sector that each node determined in described cluster uses, one of sector do not used by other nodes in the sector being used for magnetic disk heartbeat in the shared disk of described cluster is defined as the sector that described any node will use by described any node, and described other sector be used in the sector of magnetic disk heartbeat except the sector that described any node will use is defined as the sector that other node in described cluster except this node will use; When performing magnetic disk heartbeat, described any node, by passing on heartbeat to other node to the described sector write data that will use, obtains the heartbeat message of other node by reading other sector.
Control method according to a further aspect of the invention, also comprise: when communication heartbeat is designated no, described any node interval preset time period sends retroeflection message to other node, if receive the response to this retroeflection message, then described communication heartbeat mark is set to be.
Control method according to a further aspect of the invention, in described any node, one of sector do not used by other nodes in the shared disk of described cluster is defined as in the process of the sector that described any node will use, repeatedly travels through the sector for magnetic disk heartbeat in the shared disk of described cluster; Described to be used in the sector of magnetic disk heartbeat is judged as not by sector that other node uses without the sector of data variation during repeatedly traversal.
Control method according to a further aspect of the invention, in described any node, one of sector do not used by other nodes in the shared disk of described cluster is defined as in the process of the sector that described any node will use, for in the process of the sector of magnetic disk heartbeat in the shared disk repeatedly traveling through described cluster, repeatedly travel through n sector before the shared disk of described cluster; Be judged as in a described front n sector not by sector that other node uses without the sector of data variation during repeatedly traversal.
Control method according to a further aspect of the invention, magnetic disk heartbeat is performed: the maximum node number supported according to cluster described in configuration file, repeatedly travels through the several sector of maximum node before in the shared disk of described cluster by setting up the communication module performing following operation in Corosync; Be judged as not by sector that other node use without the sector of data variation by the described several sector of front maximum node during repeatedly traversal; One of described sector do not used by other nodes is defined as the sector that described any node will use by described any node, and other sector in several for front maximum node sector except the sector that described any node will use is defined as the sector that other node in described cluster except this node will use; When performing magnetic disk heartbeat, described any node, by passing on heartbeat to other node to the described sector write data that will use, obtains the heartbeat message of other node by reading other sector.
Control method according to a further aspect of the invention, arranges timer in described any node, and described timer triggers described any node obtains other node heartbeat message by reading other sector.
According to a further aspect of the invention, a cluster is also provided. The cluster comprises multiple nodes that are connected to one another by a network connection and/or a serial connection and by a shared disk, wherein:
When the communication heartbeat flag is true, any node of the cluster either alternates the disk heartbeat with one of the network heartbeat or the serial heartbeat, or separates every two disk heartbeats by at least one network heartbeat and one serial heartbeat; when the communication heartbeat flag is false, the node performs the disk heartbeat;
When the node fails to detect a network heartbeat or serial heartbeat a preset number of consecutive times, it sets the communication heartbeat flag to false;
When the sectors used by the nodes of the cluster are determined, the node designates one of the sectors not used by other nodes, among the sectors of the cluster's shared disk reserved for the disk heartbeat, as the sector it will use, and designates the remaining disk-heartbeat sectors as the sectors to be used by the other nodes of the cluster;
When performing the disk heartbeat, the node conveys its heartbeat to the other nodes by writing data to its own sector and obtains the heartbeat messages of the other nodes by reading the other sectors.
According to a further aspect of the invention, in the cluster, when the communication heartbeat flag is false, the node sends an echo message to the other nodes at preset intervals and, if it receives a response to the echo message, sets the communication heartbeat flag to true.
According to a further aspect of the invention, in the cluster, in designating one of the sectors of the shared disk not used by other nodes as its own, the node traverses the disk-heartbeat sectors of the shared disk multiple times; a disk-heartbeat sector whose data does not change across the traversals is judged not to be used by any other node.
According to a further aspect of the invention, in the cluster, in traversing the disk-heartbeat sectors of the shared disk multiple times, the node traverses the first n sectors of the shared disk multiple times; among those n sectors, a sector whose data does not change across the traversals is judged not to be used by any other node.
Compared with the prior art, one or more embodiments of the present invention can have the following advantages:
By determining the sector of the shared disk that corresponds to each node of the cluster, the invention controls the disk heartbeat separately while the network heartbeat and/or serial heartbeat is also running. Because the disk heartbeat is independent of the network heartbeat and the serial heartbeat, the cluster still works normally when the network connection and/or the serial cable fails, which prevents nodes from dropping out of the cluster because they fail to receive heartbeats.
Other features and advantages of the invention are set forth in the description that follows; in part they are apparent from the description or can be understood by practising the invention. The objects and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims, and the drawings.
Brief description of the drawings
The drawings provide a further understanding of the invention and form part of the specification. Together with the embodiments of the invention they serve to explain the invention and do not limit it. In the drawings:
Fig. 1 is a flowchart of the method for controlling the disk heartbeat of a cluster according to an embodiment of the invention;
Fig. 2 is a flowchart of determining the sector used by each node of the cluster according to an embodiment of the invention;
Fig. 3 is a schematic diagram of the overall architecture of a high-availability cluster according to an example of the invention.
Detailed description of the embodiments
Embodiments of the invention are described in detail below with reference to the drawings and examples, so that how the invention applies technical means to solve the technical problem and achieve the technical effect can be fully understood and implemented accordingly. Provided they do not conflict, the embodiments of the invention and the features of the embodiments may be combined with one another, and the resulting technical solutions all fall within the scope of protection of the invention.
In addition, the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
Fig. 1 is a flowchart of the method for controlling the disk heartbeat of a cluster according to an embodiment of the invention. Each step of the embodiment is described in detail below with reference to Fig. 1.
In this embodiment the features and advantages of the invention are described using a high-availability cluster as the example. The high-availability cluster of the embodiment comprises multiple nodes that are connected to one another by a network connection and/or a serial connection and by a shared disk, so the cluster can implement the serial heartbeat, the network heartbeat, and the disk heartbeat. Importantly, the cluster can run the network heartbeat and/or serial heartbeat and the disk heartbeat at the same time. The flow for controlling the disk heartbeat in this mode is described in detail below.
Step S110: initialize the shared disk to determine the sector used by each node of the cluster.
Specifically, when the sectors used by the nodes of the cluster are determined, any node designates one of the sectors not used by other nodes, among the sectors of the cluster's shared disk reserved for the disk heartbeat, as the sector it will use, and designates the remaining disk-heartbeat sectors as the sectors to be used by the other nodes of the cluster.
In designating one of the unused sectors of the shared disk as its own, the node traverses the disk-heartbeat sectors of the shared disk multiple times and judges a sector whose data does not change across the traversals to be unused by other nodes. Alternatively, the node traverses the first n sectors of the shared disk multiple times and judges a sector among them whose data does not change across the traversals to be unused by other nodes.
More specifically, a communication module is added to the cluster management engine, for example Corosync, to perform the disk heartbeat. The communication module is a file, exec/totemdisk.c, which contains the function totemdisk_initialize() for initializing the shared disk. This function is responsible for initializing the shared disk so as to determine which node corresponds to each sector of the shared disk.
The function performs the following operations: according to the maximum number of nodes supported by the cluster, as given in the configuration file, it repeatedly reads and traverses the leading sectors of the shared disk, one sector per supported node, and judges a sector among them whose data does not change across the traversals to be unused by other nodes. The node designates one of the unused sectors as the sector it will use and the remaining leading sectors as the sectors to be used by the other nodes of the cluster.
As shown in Fig. 2, when a sector of the shared disk is allocated to a node (node 1, say) and the maximum number of nodes supported by the cluster according to the configuration file is m, the first m sectors are traversed multiple times. If the data in a sector keeps being updated, the sector is judged to be in use by some node of the cluster. If the data in a sector stays constant, the sector is unused and can be designated as node 1's sector in the shared disk, that is, the sector to which node 1 writes its heartbeat messages. If the data in a sector stays constant and the sector is unused, but node 1 has already designated a sector as its own, the sector is reserved for other nodes. Finally, the sector corresponding to each node of the cluster is recorded.
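The allocation procedure of step S110 can be illustrated with a minimal sketch. It is not the totemdisk_initialize() function itself: the 512-byte sector size, the choice of the lowest-numbered unchanged sector, and all function names below are assumptions made for illustration, and an ordinary file stands in for the shared disk.

```python
import os
import tempfile

SECTOR_SIZE = 512  # assumed sector size in bytes


def read_sectors(path, count):
    """Read the first `count` sectors of the (simulated) shared disk."""
    with open(path, "rb") as disk:
        return [disk.read(SECTOR_SIZE) for _ in range(count)]


def find_free_sectors(snapshots):
    """Return indices of sectors whose data is identical in every snapshot.

    A sector that never changes across the traversals is treated as unused."""
    first = snapshots[0]
    return [
        i for i in range(len(first))
        if all(snap[i] == first[i] for snap in snapshots[1:])
    ]


def allocate(snapshots):
    """Pick one unused sector for this node; the rest belong to other nodes."""
    free = find_free_sectors(snapshots)
    if not free:
        raise RuntimeError("no unused heartbeat sector found")
    mine = free[0]  # assumption: take the lowest-numbered unused sector
    others = [i for i in range(len(snapshots[0])) if i != mine]
    return mine, others


def demo():
    m = 4  # assumed maximum node count read from the configuration file
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.img")
        with open(path, "wb") as disk:
            disk.write(bytes(SECTOR_SIZE * m))
        snapshots = [read_sectors(path, m)]
        # Between two traversals, another node updates sector 0.
        with open(path, "r+b") as disk:
            disk.write(b"node-A seq=1".ljust(SECTOR_SIZE, b"\0"))
        snapshots.append(read_sectors(path, m))
        return allocate(snapshots)
```

In this sketch demo() models the second node of a two-node cluster: sector 0 changes between the traversals, so the node takes sector 1 and records the other sectors as belonging to other nodes.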
Step S120: when the communication heartbeat flag is true (TRUE), any node of the cluster either alternates the disk heartbeat with one of the network heartbeat or the serial heartbeat, or separates every two disk heartbeats by at least one network heartbeat and one serial heartbeat.
After each node of the cluster starts, the communication heartbeat flag is normally set to TRUE, indicating that every heartbeat mode in the cluster can operate normally.
If the cluster can perform both the network heartbeat and the disk heartbeat, then after the nodes start, any node in the cluster alternates the disk heartbeat and the network heartbeat; the two kinds of heartbeat are independent and do not affect each other.
For example, at a first moment the node performs a network heartbeat with the other nodes, that is, it sends a network heartbeat to the other nodes and the other nodes reply with a network heartbeat; at a second moment the node performs a disk heartbeat with the other nodes.
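One way to picture the alternation of step S120 is a round-robin choice among the heartbeat channels whose flag is still TRUE. The sketch below is only an illustration under that assumption; the function name, the channel names, and the dictionary of flags are hypothetical and do not come from Corosync.

```python
def next_channel(channels, flags, last):
    """Return the channel to use for the next heartbeat.

    channels: ordered list of channel names, e.g. ["network", "disk"]
    flags:    dict mapping channel name to True (TRUE) or False (FAULTY)
    last:     channel used for the previous heartbeat, or None at start
    """
    start = channels.index(last) + 1 if last in channels else 0
    for offset in range(len(channels)):
        candidate = channels[(start + offset) % len(channels)]
        if flags[candidate]:
            return candidate  # first healthy channel after the previous one
    return None  # every channel is marked FAULTY
```

With both flags TRUE the calls yield network, disk, network, and so on; once the network flag is FAULTY every call returns the disk channel, which matches the behaviour described for step S130.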
When performing the disk heartbeat, the node conveys its heartbeat to the other nodes by writing data to its own sector and obtains the heartbeat messages of the other nodes by reading the other sectors.
For example, the heartbeat-sending functions totemdisk_token_send() and totemdisk_mcast_send() can be added to the exec/totemdisk.c file; when this node needs to send a heartbeat message to other nodes, they write the heartbeat message to the sector of the shared disk that belongs to this node.
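As a rough sketch of the sector read and write just described, the code below stores a small heartbeat record in a node's own sector and reads the records in the other sectors. The record layout (a node id and a sequence number), the 512-byte sector size, and the function names are assumptions for illustration; the patent does not specify what totemdisk_token_send() writes.

```python
import io
import struct

SECTOR_SIZE = 512               # assumed sector size in bytes
RECORD = struct.Struct("<IQ")   # assumed layout: node id (uint32), sequence (uint64)


def write_heartbeat(disk, my_sector, node_id, seq):
    """Write this node's heartbeat record into its own sector."""
    disk.seek(my_sector * SECTOR_SIZE)
    disk.write(RECORD.pack(node_id, seq).ljust(SECTOR_SIZE, b"\0"))
    disk.flush()


def read_heartbeats(disk, other_sectors):
    """Read the heartbeat records that other nodes wrote to their sectors."""
    records = {}
    for index in other_sectors:
        disk.seek(index * SECTOR_SIZE)
        records[index] = RECORD.unpack_from(disk.read(SECTOR_SIZE))
    return records


def demo():
    disk = io.BytesIO(bytes(SECTOR_SIZE * 2))  # in-memory stand-in for the shared disk
    write_heartbeat(disk, 0, node_id=1, seq=7)  # node A owns sector 0
    return read_heartbeats(disk, [0, 1])
```

On a real shared disk the reads and writes would probably also need to bypass the operating system's cache so that every node sees the on-disk contents; that detail is omitted here.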
Reading of the other nodes' heartbeat messages is controlled by a timer in each node. That is, a timer is set in the node, and the timer triggers the node to obtain the heartbeat messages of the other nodes by reading the other sectors.
Specifically, the timer fires at fixed intervals; each time it fires, the node reads the heartbeat messages sent by the other nodes from the sectors those nodes use in the shared disk, which keeps the cluster working normally. For example, a timer function timer_function_disk_read() is added to the exec/totemdisk.c file; each time it fires, it reads, from the recorded sectors used by the other nodes, the heartbeat messages those nodes have sent.
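The work done on each timer tick can be sketched as follows. This is an illustration only, not the timer_function_disk_read() function: the class name, the read_sector callable, and the rule that a change in a sector's content counts as a new heartbeat are assumptions, and the wiring to an actual periodic timer is left out.

```python
class HeartbeatPoller:
    """Detects new heartbeat messages in the sectors owned by other nodes."""

    def __init__(self, read_sector, other_sectors):
        self.read_sector = read_sector  # callable: sector index -> bytes
        # Remember what each foreign sector contained when polling started.
        self.last_seen = {i: read_sector(i) for i in other_sectors}

    def on_timer(self):
        """Run once per timer tick; return the sectors holding new data."""
        fresh = []
        for index, old in list(self.last_seen.items()):
            current = self.read_sector(index)
            if current != old:
                self.last_seen[index] = current
                fresh.append(index)
        return fresh
```

A caller would invoke on_timer() from whatever periodic timer the cluster engine provides and re-arm the timer after each call.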
Step S130: when the communication heartbeat flag is false (FAULTY), the node performs the disk heartbeat. When the node fails to detect a network heartbeat or serial heartbeat a preset number of consecutive times, it sets the communication heartbeat flag to false.
It should be noted that neither the network heartbeat nor the serial heartbeat can solve the problem of data in shared storage being corrupted during cluster split-brain. When the network heartbeat and the serial heartbeat fail, the cluster splits into several small clusters, each of which believes it has control of the shared storage and may read and write the data there at the same time, so that the data in shared storage is corrupted.
In this embodiment, however, because the disk heartbeat is also implemented in the cluster, a faulty node is forced to disconnect from shared storage; even if it obtains control of the shared storage, it cannot read or write the data there, so the data in shared storage is protected from corruption.
That is, when a node fails to detect a network heartbeat or serial heartbeat a preset number of consecutive times, its network heartbeat or serial heartbeat may have failed, so the node sets the communication heartbeat flag to false. Because the disk heartbeat between the nodes is still running at that point, the data of the cluster in the shared disk remains protected.
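The rule that the flag becomes FAULTY after a preset number of consecutive misses amounts to a small counter. The sketch below assumes a hypothetical ChannelMonitor class and a caller that reports, once per heartbeat cycle, whether a network or serial heartbeat was detected; the threshold value is whatever the preset number is configured to be.

```python
class ChannelMonitor:
    """Tracks consecutive missed heartbeats on one communication channel."""

    def __init__(self, threshold):
        self.threshold = threshold  # the preset number of consecutive misses
        self.misses = 0
        self.flag = True            # True stands for TRUE, False for FAULTY

    def record(self, heartbeat_seen):
        """Record one heartbeat cycle and return the current flag."""
        if heartbeat_seen:
            self.misses = 0         # a detected heartbeat resets the streak
        else:
            self.misses += 1
            if self.misses >= self.threshold:
                self.flag = False   # mark the channel FAULTY
        return self.flag
```

In this sketch a detected heartbeat only resets the miss counter; restoring a FAULTY flag is left to the echo mechanism described next.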
In addition, when the communication heartbeat flag is false, the node sends an echo message to the other nodes at preset intervals and, if it receives a response to the echo message, sets the communication heartbeat flag to true. In this way, when the communication heartbeat recovers from the interrupted state to the normal state, the cluster returns to the mode in which several kinds of heartbeat are used.
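The recovery path can be sketched in the same style. The class name and the send_echo callable, which is assumed to return True when the peer answers the echo message, are hypothetical; the sketch shows only that probing happens while the flag is FAULTY and stops once a response arrives.

```python
class CommFlag:
    """Communication heartbeat flag with echo-based recovery."""

    def __init__(self):
        self.value = False  # this sketch starts in the FAULTY state

    def on_probe_timer(self, send_echo):
        """Run once per preset interval; probe only while the flag is FAULTY."""
        if not self.value and send_echo():
            self.value = True  # a response arrived, so the channel is usable again
        return self.value
```

Once the flag is TRUE again no further echo messages are sent, and the node can resume alternating between the heartbeat channels.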
To further illustrate the features and advantages of the invention, an example that combines the network heartbeat and the disk heartbeat is described below.
Fig. 3 is a schematic diagram of the overall architecture of a high-availability cluster according to an example of the invention; the example uses a cluster with two nodes. As shown in Fig. 3, the high-availability cluster of this example comprises two nodes (node A and node B in the figure). Each node uses one network card together with network access equipment such as a router to implement the network heartbeat, and one shared storage device is used to implement the disk heartbeat. In the figure, eth0 and eth1 denote the Ethernet cards used by node A and node B respectively, and /dev/sdz and /dev/sdy denote the disk partitions of node A and node B respectively.
Under normal conditions, after node A and node B start, because two heartbeat paths are in use, nodes A and B read from the configuration file /etc/corosync/corosync.conf that the redundant heartbeat mode is passive; in passive mode the communication heartbeat flags of both heartbeat paths are marked TRUE. The two nodes alternate among the heartbeat paths marked TRUE when sending heartbeat messages, and the two heartbeat paths are independent and do not affect each other.
Node A and node B each bind the IP address of the local machine (this node). Before heartbeats are transmitted, the sectors that nodes A and B will use in the shared disk are determined. When node A's sector is determined, sector 0 is designated as the sector used by node A, and the sectors other than sector 0 as the sectors used by other nodes. When node B's sector is determined, node B finds that the data in sector 0 keeps being updated, so it designates another sector (sector 1, for example) as the sector used by node B.
When the network environment and the shared disk are both normal, node A sends a network heartbeat at some moment and node B replies with a network heartbeat. If the timer function for reading the shared disk in node A fires at this point because its interval has elapsed, node A checks whether a new heartbeat message has arrived in the sectors other than sector 0. Because none has arrived (node B has not yet sent a heartbeat message through the disk heartbeat), the timer for reading the shared disk is re-armed in node A.
When the redundant heartbeat mode is passive, the network heartbeat and the disk heartbeat are used alternately. At the next moment node A sends a disk heartbeat by writing its own heartbeat message to sector 0. If the timer function for reading the shared disk in node A fires at this point, node A checks whether a new heartbeat message has arrived in the sectors other than sector 0; because none has arrived, the timer is re-armed in node A. When the timer function for reading the shared disk in node B fires, node B checks whether a new heartbeat message has arrived in the sectors other than sector 1; because sector 0 contains the heartbeat message sent by node A, node B reads it and replies to node A with a disk heartbeat, writing its own heartbeat message to sector 1, which it uses. When the timer function for reading the shared disk in node A fires again, node A checks the sectors other than sector 0; because the heartbeat message sent by node B has arrived in sector 1, node A reads it.
Alternating the network heartbeat and the disk heartbeat in this way keeps the cluster working normally.
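The disk-heartbeat exchange between node A and node B in this example can be replayed with a toy model in which a Python list stands in for the shared disk. The Node class, the "name:sequence" message format, and the method names are assumptions for illustration only.

```python
class Node:
    """Toy model of one cluster node using a list as the shared disk."""

    def __init__(self, name, my_sector, disk):
        self.name = name
        self.my_sector = my_sector
        self.disk = disk
        self.seq = 0
        self.seen = list(disk)  # last content observed in every sector

    def send(self):
        """Send a disk heartbeat by writing to this node's own sector."""
        self.seq += 1
        self.disk[self.my_sector] = f"{self.name}:{self.seq}"

    def on_timer(self):
        """Check the other sectors and return any new heartbeat messages."""
        news = []
        for index, content in enumerate(self.disk):
            if index != self.my_sector and content != self.seen[index]:
                self.seen[index] = content
                news.append(content)
        return news


def demo():
    disk = ["", ""]          # sector 0 for node A, sector 1 for node B
    a = Node("A", 0, disk)
    b = Node("B", 1, disk)
    a.send()                 # A writes its heartbeat to sector 0
    first = a.on_timer()     # A finds nothing new in sector 1
    second = b.on_timer()    # B finds A's heartbeat in sector 0
    b.send()                 # B replies by writing to sector 1
    third = a.on_timer()     # A finds B's reply in sector 1
    return first, second, third
```

The three results returned by demo() correspond to the three timer checks in the walk-through: an empty check by node A, node B reading node A's message, and node A reading node B's reply.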
Suppose that at some moment the network cable of node A is unplugged. In the prior art, which has no disk heartbeat, the whole cluster cannot work normally because no heartbeat is transmitted.
In this example, however, node A retransmits its heartbeat message because it has not received one. Because the network heartbeat and the disk heartbeat are used alternately, the retransmission uses the disk heartbeat. Node B receives the disk heartbeat sent by node A; because the previous heartbeat was sent as a disk heartbeat, node B replies with a network heartbeat, but because of node A's network failure node A cannot receive it. Node B then retransmits using the disk heartbeat, and node A receives the disk heartbeat sent by node B. After several disk heartbeat cycles, the communication heartbeat flag of the network heartbeat on node A and on node B is marked FAULTY because neither has received a network heartbeat message from the other for a long time. Each node then periodically (at preset intervals) sends an echo message in the hope of a response from the peer; if a response arrives, the communication heartbeat flag of the network heartbeat is marked TRUE again.
While the communication heartbeat flag is marked FAULTY, node A and node B can exchange heartbeat messages only through the shared disk, and the disk heartbeat keeps the cluster running normally.
In this embodiment the disk heartbeat is an independent heartbeat mode that does not rely on the network heartbeat or the serial heartbeat. When the shared disk, the serial cable connection, and the Ethernet connection are used in combination and the network connection and/or the serial cable connection fails, the disk heartbeat still runs normally and maintains the heartbeat of the cluster.
Those skilled in the art will understand that the modules or steps of the invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network of several computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device; or they can each be made into a separate integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The invention is therefore not restricted to any specific combination of hardware and software.
Although embodiments of the invention are disclosed above, the content described is only an embodiment adopted to make the invention easier to understand and is not intended to limit the invention. Any person skilled in the art to which the invention belongs may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention remains as defined by the appended claims.

Claims (10)

1. A control method for the disk heartbeat of a cluster, characterized by comprising:
When the communication heartbeat flag is yes, any node of the cluster alternately performs a disk heartbeat and one of a network heartbeat or a serial heartbeat, or performs at least one network heartbeat and one serial heartbeat between two disk heartbeats; when the communication heartbeat flag is no, the node performs the disk heartbeat; wherein,
When the node fails to detect a network heartbeat or a serial heartbeat for a preset number of consecutive times, the node sets the communication heartbeat flag to no; and
When determining the sector used by each node in the cluster, the node designates one of the sectors not used by other nodes, among the sectors reserved for the disk heartbeat on the shared disk of the cluster, as the sector that the node will use, and designates the remaining sectors reserved for the disk heartbeat as the sectors that the other nodes in the cluster will use;
When performing the disk heartbeat, the node conveys its heartbeat to the other nodes by writing data to the sector that it will use, and obtains the heartbeat information of the other nodes by reading the other sectors;
The network heartbeat and the serial heartbeat each correspond to an independent physical connection in the cluster, wherein the network heartbeat corresponds to an Ethernet connection and the serial heartbeat corresponds to a serial cable connection.
2. The control method according to claim 1, characterized by further comprising:
When the communication heartbeat flag is no, the node sends an echo message to the other nodes at a preset interval, and if a response to the echo message is received, the communication heartbeat flag is set to yes.
3. The control method according to claim 1, characterized in that, in the process in which the node designates one of the sectors not used by other nodes on the shared disk of the cluster as the sector that the node will use,
The sectors reserved for the disk heartbeat on the shared disk of the cluster are traversed multiple times;
Any sector reserved for the disk heartbeat whose data does not change during the multiple traversals is judged to be a sector not used by other nodes.
4. The control method according to claim 1, characterized in that, in the process in which the node designates one of the sectors not used by other nodes on the shared disk of the cluster as the sector that the node will use,
In traversing multiple times the sectors reserved for the disk heartbeat on the shared disk of the cluster, the first n sectors of the shared disk of the cluster are traversed multiple times;
Any of the first n sectors whose data does not change during the multiple traversals is judged to be a sector not used by other nodes, wherein,
The number n of sectors equals the maximum number of nodes supported by the cluster as specified in the configuration file.
5. The control method according to claim 3, characterized in that,
The disk heartbeat is performed by providing, in Corosync, a communication module that performs the following operations:
According to the maximum number of nodes supported by the cluster as specified in the configuration file, traversing multiple times the first sectors of the shared disk of the cluster, the number of sectors traversed being equal to the maximum number of nodes;
Judging any of those first sectors whose data does not change during the multiple traversals to be a sector not used by other nodes;
The node designates one of the sectors not used by other nodes as the sector that the node will use, and designates the remaining sectors among those first sectors as the sectors that the other nodes in the cluster will use;
When performing the disk heartbeat, the node conveys its heartbeat to the other nodes by writing data to the sector that it will use, and obtains the heartbeat information of the other nodes by reading the other sectors.
6. The control method according to claim 1, characterized in that,
A timer is provided in the node, and the timer triggers the node to obtain the heartbeat information of the other nodes by reading the other sectors.
7. A cluster, characterized in that:
The cluster comprises a plurality of nodes, the nodes being connected to one another by a network connection and/or a serial connection and by a shared disk connection, wherein,
When the communication heartbeat flag is yes, any node of the cluster alternately performs a disk heartbeat and one of a network heartbeat or a serial heartbeat, or performs at least one network heartbeat and one serial heartbeat between two disk heartbeats; when the communication heartbeat flag is no, the node performs the disk heartbeat; wherein,
When the node fails to detect a network heartbeat or a serial heartbeat for a preset number of consecutive times, the node sets the communication heartbeat flag to no; and
When determining the sector used by each node in the cluster, the node designates one of the sectors not used by other nodes, among the sectors reserved for the disk heartbeat on the shared disk of the cluster, as the sector that the node will use, and designates the remaining sectors reserved for the disk heartbeat as the sectors that the other nodes in the cluster will use;
When performing the disk heartbeat, the node conveys its heartbeat to the other nodes by writing data to the sector that it will use, and obtains the heartbeat information of the other nodes by reading the other sectors;
The network heartbeat and the serial heartbeat each correspond to an independent physical connection in the cluster, wherein the network heartbeat corresponds to an Ethernet connection and the serial heartbeat corresponds to a serial cable connection.
8. The cluster according to claim 7, characterized by further comprising:
When the communication heartbeat flag is no, the node sends an echo message to the other nodes at a preset interval, and if a response to the echo message is received, the communication heartbeat flag is set to yes.
9. The cluster according to claim 7, characterized in that, in the process in which the node designates one of the sectors not used by other nodes on the shared disk of the cluster as the sector that the node will use,
The sectors reserved for the disk heartbeat on the shared disk of the cluster are traversed multiple times;
Any sector reserved for the disk heartbeat whose data does not change during the multiple traversals is judged to be a sector not used by other nodes.
10. The cluster according to claim 7, characterized in that, in the process in which the node designates one of the sectors not used by other nodes on the shared disk of the cluster as the sector that the node will use,
In traversing multiple times the sectors reserved for the disk heartbeat on the shared disk of the cluster, the first n sectors of the shared disk of the cluster are traversed multiple times;
Any of the first n sectors whose data does not change during the multiple traversals is judged to be a sector not used by other nodes, wherein,
The number n of sectors equals the maximum number of nodes supported by the cluster as specified in the configuration file.
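
The sector-selection procedure recited in claims 3, 4, 9 and 10 (traverse the first n sectors several times and treat any sector whose data never changes as unused) can be sketched as follows. This is a hedged illustration rather than the claimed implementation: the function names, the `passes` count, the `wait` hook, and the choice of the lowest-numbered free sector are all assumptions, and the sketch does not address two nodes scanning at the same moment.

```python
def find_free_sectors(read_sector, max_nodes, passes=3, wait=lambda: None):
    """Return the indices, among the first max_nodes sectors, whose data never changed.

    read_sector(i) is a caller-supplied function returning the bytes of sector i.
    wait() is called between traversals; a real caller would sleep for longer
    than one heartbeat period so that live nodes have time to rewrite their sector.
    """
    baseline = [read_sector(i) for i in range(max_nodes)]  # first traversal
    unchanged = set(range(max_nodes))
    for _ in range(passes - 1):                            # remaining traversals
        wait()
        for i in list(unchanged):
            if read_sector(i) != baseline[i]:
                unchanged.discard(i)  # data changed: some other node uses this sector
    return sorted(unchanged)


def claim_sector(read_sector, max_nodes, passes=3, wait=lambda: None):
    """Pick one unused sector for this node, or None if every sector is in use."""
    free = find_free_sectors(read_sector, max_nodes, passes, wait)
    return free[0] if free else None
```

Here `max_nodes` plays the role of n, the maximum node count read from the configuration file; all remaining sectors among the first n are then treated as belonging to the other nodes.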
CN201210500389.7A 2012-11-29 2012-11-29 The control method of a kind of cluster and magnetic disk heartbeat thereof Active CN103051470B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210500389.7A CN103051470B (en) 2012-11-29 2012-11-29 The control method of a kind of cluster and magnetic disk heartbeat thereof

Publications (2)

Publication Number Publication Date
CN103051470A CN103051470A (en) 2013-04-17
CN103051470B true CN103051470B (en) 2015-10-07

Family

ID=48063975

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210500389.7A Active CN103051470B (en) 2012-11-29 2012-11-29 The control method of a kind of cluster and magnetic disk heartbeat thereof

Country Status (1)

Country Link
CN (1) CN103051470B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10102088B2 (en) 2013-12-25 2018-10-16 Nec Solution Innovators, Ltd. Cluster system, server device, cluster system management method, and computer-readable recording medium
CN105528202B (en) * 2014-10-22 2021-01-26 中兴通讯股份有限公司 Resource processing method and device of multi-controller system
CN105045533B (en) * 2015-07-09 2019-03-22 上海爱数信息技术股份有限公司 Magnetic disk heartbeat receiving/transmission method suitable for dual control high availability storage system
CN105681074B (en) * 2015-12-29 2018-11-09 北京同有飞骥科技股份有限公司 A kind of enhancing dual computer group is reliable, availability method and device
CN106873918A (en) * 2017-02-27 2017-06-20 郑州云海信息技术有限公司 Storage method to set up and device in a kind of virtualization system
US10785350B2 (en) * 2018-10-07 2020-09-22 Hewlett Packard Enterprise Development Lp Heartbeat in failover cluster
CN109728981A (en) * 2019-03-19 2019-05-07 江苏汇智达信息科技有限公司 A kind of cloud platform fault monitoring method and device
CN112822078B (en) * 2021-02-26 2023-01-13 上海沄熹科技有限公司 Method for realizing raft heartbeat report of nodes in different network domains
CN113595836A (en) * 2021-09-27 2021-11-02 云宏信息科技股份有限公司 Heartbeat detection method of high-availability cluster, storage medium and computing node
CN114844809A (en) * 2022-04-18 2022-08-02 北京凝思软件股份有限公司 Multi-factor arbitration method and device based on network heartbeat and kernel disk heartbeat
CN116743550B (en) * 2023-08-11 2023-12-29 之江实验室 Processing method of fault storage nodes of distributed storage cluster

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101291243A (en) * 2007-04-16 2008-10-22 广东省新支点技术服务有限公司 Split brain preventing method for highly available cluster system
CN102402395A (en) * 2010-09-16 2012-04-04 上海中标软件有限公司 Quorum disk-based non-interrupted operation method for high availability system
CN102799394A (en) * 2012-06-29 2012-11-28 华为技术有限公司 Method and device for realizing heartbeat services of high-availability clusters

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6944785B2 (en) * 2001-07-23 2005-09-13 Network Appliance, Inc. High-availability cluster virtual server system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant