CN103167010B - Method and apparatus for indicating node aliveness in a cluster - Google Patents

Info

Publication number
CN103167010B
Authority
CN
China
Prior art keywords
cluster
node
generation
host node
shared resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110430012.4A
Other languages
Chinese (zh)
Other versions
CN103167010A (en
Inventor
吴江
黄剑
李卫华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
EMC Corp
Original Assignee
EMC Corp
Filing date
Publication date
Application filed by EMC Corp
Priority to CN201110430012.4A
Publication of CN103167010A
Application granted
Publication of CN103167010B
Legal status: Active
Anticipated expiration

Abstract

Embodiments of the present invention relate to a method and apparatus for indicating that a node is alive in a cluster. In particular, a method is disclosed by which a master node in a cluster indicates to at least one slave node in the cluster that the master node is alive, where the master node and the at least one slave node conform to the Small Computer System Interface (SCSI) protocol. The method includes: creating a persistent reservation on a shared resource of the cluster; and updating a generation indication of the cluster by periodically changing the state of the persistent reservation, where the generation indication is obtainable by the at least one slave node by accessing the shared resource. Also disclosed is a method by which a slave node in a cluster detects whether the master node in the cluster is alive, including: registering with the shared resource of the cluster; periodically accessing the shared resource to obtain the generation indication of the cluster; and detecting whether the master node is alive by determining whether the generation indication has been updated compared with a previously obtained generation indication. Corresponding apparatuses are also disclosed.

Description

Method and apparatus for indicating node aliveness in a cluster
Technical field
Embodiments of the present invention relate generally to the field of distributed information processing and, more specifically, to a method and apparatus for indicating that a node is alive in a cluster.
Background
With the development of computing technology, the cluster has become an important modern computing architecture. In a cluster, a particular node in the system (which may be a device, an application program, and so on) usually needs to indicate its aliveness to other nodes, that is, to indicate that the node is currently in an operable state or a valid working state. For example, in a cluster composed of multiple nodes, one or more nodes usually serve as the master node of the cluster and are responsible, for example, for controlling the behavior of the cluster and of the other member nodes. Such a master node needs to keep indicating to the other nodes in the cluster (referred to as "slave nodes") that it is alive. If the master node does not do so, the slave nodes may conclude that the master node has failed. The slave nodes in the cluster system can then "compete" to become the new master node so that the cluster system as a whole continues to operate normally. A master node indicating its aliveness to the other nodes is therefore one of the foundations of maintaining normal cluster operation.
Besides the master node of a cluster, another common scenario that requires an aliveness indication is one in which a node that holds a shared resource in a sustained, mutually exclusive manner (that is, uses the shared resource exclusively for a period of time) indicates its aliveness to the nodes waiting to use that resource. Specifically, because the current owner node holds the resource exclusively until it gives the resource up or fails, it needs to indicate to the other nodes waiting for the shared resource that it is still using the resource normally, so that deadlock is avoided. If the owner node of the resource no longer indicates that it is alive, the waiting nodes can stop waiting and actively compete for use of the shared resource.
In the prior art, one way of indicating that a node is alive is based on network heartbeat messages. A node that needs to indicate its aliveness sends heartbeat messages over the network to other nodes at a predetermined interval to declare that it is alive. If the other nodes receive a heartbeat message from the network within the predetermined interval, the master node is currently alive. Conversely, if no heartbeat message is received within the predetermined interval, the master node may be considered to have failed. This approach has significant drawbacks. First, the loss of a network heartbeat may be caused by the network on the master side and/or the slave side. Auxiliary algorithms are therefore needed to determine whether the problem actually lies with the master node or with the network on the slave side. Second, many clusters based on shared storage do not have strong network connections between nodes (for example, Virtual Machine File System, VMFS). In that case, the network heartbeat mechanism may not work at all. In addition, a network partition indicates only a connectivity problem and is not sufficient to show that the master node has failed. Subsequent cluster protection actions triggered by a network partition may unnecessarily compromise data integrity.
Besides the network heartbeat, another known aliveness indication strategy is the disk heartbeat. For example, a shared storage device (such as a disk) accessible to the master node and all slave nodes can be provided, with a dedicated region set aside on the disk. The master node continually updates the content stored in this region, thereby notifying the other nodes that it is alive. The disk heartbeat overcomes some of the drawbacks of the network heartbeat. However, with a disk heartbeat the system must reserve additional storage space for the heartbeat information. Moreover, corruption of the heartbeat information in the reserved storage space may cause a "split-brain" problem. Furthermore, although storing the disk heartbeat together with important mutual exclusion data allows a more accurate determination, frequently accessing and updating heartbeat information in different sectors makes the read/write head move back and forth, which reduces the efficiency of other data I/O accesses. In addition, the disk heartbeat information must carry identification information of the master node so that the other nodes know which node is the master. In some cluster configurations, however, multiple master nodes may exist at the same time, and the system must then use very complex algorithms to track the aliveness of the master nodes.
Therefore, there is a need in the art for a technical solution that indicates device aliveness more accurately, efficiently, and conveniently.
Summary of the invention
In view of the above problems, the present invention proposes a method and apparatus for indicating more efficiently and conveniently that a master node is alive in a cluster.
In one aspect of the present invention, a method is provided for indicating, at a master node in a cluster, to at least one slave node that the master node is alive. The method includes: creating a persistent reservation on a shared resource of the cluster; and updating a generation indication of the cluster by periodically changing the state of the persistent reservation, wherein the generation indication is obtainable by the at least one slave node by accessing the shared resource.
In another aspect of the present invention, an apparatus is provided for indicating, at a master node in a cluster, to at least one slave node that the master node is alive. The apparatus includes: a creating device configured to create a persistent reservation on a shared resource of the cluster; and an updating device configured to update a generation indication of the cluster by periodically changing the state of the persistent reservation, wherein the generation indication is obtainable by the at least one slave node by accessing the shared resource.
In yet another aspect of the present invention, a method is provided for detecting, at a slave node in a cluster, whether the master node in the cluster is alive. The method includes: registering with a shared resource of the cluster; periodically accessing the shared resource to obtain a generation indication of the cluster; and detecting whether the master node is alive by determining whether the generation indication has been updated compared with a previously obtained generation indication.
In a further aspect of the present invention, an apparatus is provided for detecting, at a slave node in a cluster, whether the master node in the cluster is alive. The apparatus includes: a registering device configured to register with a shared resource of the cluster; an accessing device configured to periodically access the shared resource to obtain a generation indication of the cluster; and a detecting device configured to detect whether the master node is alive by determining whether the generation indication has been updated compared with a previously obtained generation indication.
According to embodiments of the present invention, in a distributed cluster system built from SCSI-compatible devices, the master node can operate on a persistent reservation for a shared resource so that the generation indication of the system keeps being updated even though the membership of the cluster does not change. Correspondingly, a slave node can obtain the generation indication by accessing the shared resource and detect whether the master node is alive according to whether the indication is updated regularly. In preferred embodiments of the present invention, all operations of the master node and the slave nodes on the shared resource can be implemented with existing SCSI commands. In this way, the master node can indicate its alive state to the slave nodes efficiently, conveniently, and flexibly.
Brief description of the drawings
The above and other objects, features, and advantages of embodiments of the present invention will become readily understood by reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the present invention are shown by way of example and not limitation, in which:
Fig. 1 shows a block diagram of an application environment architecture 100 in which embodiments of the present invention can be used;
Fig. 2 shows a flowchart of a method 200 for indicating, at a master node of a cluster, to other nodes that the master node is alive, according to an exemplary embodiment of the present invention;
Fig. 3 shows a flowchart of a method 300 for detecting, at a slave node of a cluster, whether the master node is alive, according to an exemplary embodiment of the present invention;
Fig. 4 shows a block diagram of an apparatus 400 for indicating, at a master node of a cluster, to other nodes that the master node is alive, according to an exemplary embodiment of the present invention;
Fig. 5 shows a block diagram of an apparatus 500 for detecting, at a slave node of a cluster, whether the master node is alive, according to an exemplary embodiment of the present invention; and
Fig. 6 shows a block diagram of a computer system 600 suitable for practicing embodiments of the present invention.
In the figures, identical or corresponding reference numerals denote identical or corresponding parts.
Detailed description
The principle and spirit of the present invention are described below with reference to several exemplary embodiments shown in the drawings. It should be understood that these embodiments are given only to enable those skilled in the art to better understand and implement the present invention, and not to limit the scope of the present invention in any way.
Note that the term "node" as used herein may denote a physical computing device (for example, a computer), or a virtual machine, an application, a process, or a thread. Moreover, the term "master node" as used herein may denote a master control node responsible for controlling the operation of a cluster, or a node that currently holds the right to use a mutually exclusive shared resource, and so on. Correspondingly, the term "slave node" may denote a controlled node in a cluster, or a node waiting to use a mutually exclusive shared resource, and so on.
In addition, although embodiments of the present invention are described below mainly with reference to the Small Computer System Interface (SCSI) protocol, this is merely exemplary. The purpose is to better explain the basic principle and idea of the present invention to those skilled in the art, and not to limit the scope of the present invention in any way. In fact, embodiments of the present invention are applicable to any similar protocol, whether currently known or developed in the future.
Reference is now made to Fig. 1, which shows a block diagram of a cluster 100 in which embodiments of the present invention can be implemented. Fig. 1 shows node 1 110, node 2 112, ..., and node N 114. These nodes are connected to one another through a connection medium 120 and cooperate with one another, thereby forming a distributed system or cluster 100. Although not shown, the cluster 100 may also include other nodes as well as data storage devices, control devices, shared resources, and so on.
As described above, according to embodiments of the present invention, the nodes 110-114 may be physical computing devices. For example, a node may be a conventional desktop computing device such as a personal computer (PC), or a mobile terminal such as a personal digital assistant (PDA), a cell phone, a smart phone, or a laptop computer. It should be understood that these are only some possible examples and are not intended to limit the scope of the present invention. In fact, the nodes 110-114 may be any physical devices with information processing and communication capabilities, whether currently known or developed in the future. Alternatively or additionally, the nodes 110-114 may be virtual nodes, including but not limited to virtual machines, application programs, processes, threads, and so on. In particular, according to preferred embodiments of the present invention, the nodes 110-114 conform to the Small Computer System Interface protocol (for example, SCSI-3).
According to embodiments of the present invention, the connection medium 120 may be a network. In particular, the connection medium 120 may be a wired network, a wireless network, or a combination thereof, including but not limited to at least one of the following: a cellular telephone network, Ethernet, a wireless local area network (WLAN) based on IEEE 802.11, 802.16, 802.20, and the like, and/or a Worldwide Interoperability for Microwave Access (WiMAX) network. Moreover, the connection medium 120 may be a public network (for example, the Internet), a private network (for example, an intranet), or a combination thereof. Alternatively or additionally, the connection medium 120 may be a bus or another facility used for interconnection and communication between devices. The scope of the present invention is not limited in this respect.
In addition, as shown in Fig. 1, the cluster 100 may also include a shared resource 130. The term "shared resource" as used herein refers to any resource that the nodes in the cluster, such as node 110, are all entitled to access and use, including physical resources and logical resources, for example a SCSI storage device. Node 110 can access the shared resource 130 by means of the connection medium 120. Alternatively or additionally, node 110 can access and use the shared resource 130 through other suitable technical means.
In the cluster 100, one or more nodes may serve as the master node, that is, the control node of the cluster or the node that exclusively uses a mutually exclusive shared resource. The other nodes correspondingly serve as slave nodes. It should be understood that the roles of master node and slave node in the cluster 100 are interchangeable. For example, a master node may voluntarily give up its master identity, or lose its master identity because of a failure. A slave node can then become the new master node through "competition".
Reference is now made to Fig. 2, which shows a flowchart of a method 200 for indicating, at a master node of a cluster, to other nodes that the master node is alive, according to an exemplary embodiment of the present invention. Method 200 may be performed by the master node in the cluster or by an apparatus associated with the master node (for example, the apparatus 400 described below with reference to Fig. 4).
After method 200 starts, at step S202 the master node creates a persistent reservation on a shared resource of the cluster to which it belongs.
As described above, a shared resource is a resource (for example, a storage resource) that all members of the cluster can access; it may be a physical resource or a logical unit. The term "persistent reservation" as used herein means that a node has an exclusive right to use the shared resource or a specific part of it, and that this exclusive use lasts for a period of time. In other words, during this period the other nodes in the cluster cannot modify the persistently reserved part of the shared resource (for example, they are prohibited from performing write operations on it). Logically, a persistent reservation can be understood as a mutual exclusion lock applied to the shared resource. It should be understood that one or more nodes may be allowed to create one or more persistent reservations on different parts of the shared resource in the cluster, depending on the setup and configuration of the cluster.
As noted above, according to embodiments of the present invention, the nodes in the cluster may conform to the SCSI protocol. In such embodiments, step S202 can be implemented in two sub-steps. In the first sub-step, the master node registers with the shared resource. For example, the master node may generate or otherwise obtain a key associated with itself (denoted k1) and use the key k1 to register with the shared resource. According to embodiments of the present invention, this sub-step can be implemented, for example, with the SCSI command PERSISTENT RESERVE OUT, where the service action code of the command is "REGISTER" and the registration key is specified as k1. As is known in the art, the SCSI command PERSISTENT RESERVE OUT is a primitive command, so by using it the master node can easily complete registration with the shared resource. (For more details on the SCSI commands, refer to the various materials on the SCSI protocol known in the art, for example the SCSI specification documents available from http://www.t10.org/drafts.htm.)
Next, in the second sub-step of step S202, the master node creates on the shared resource a persistent reservation associated with the registered key. This can be implemented, for example, with the SCSI command PERSISTENT RESERVE OUT with its service action code set to "RESERVE". As is known in the art, the SCSI protocol supports creating, on the shared resource, a persistent reservation that is associated with the node's registered key k1 and has a particular type (TYPE) and scope (SCOPE).
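By way of non-limiting illustration, the two sub-steps of step S202 can be sketched against an in-memory model. The class SimulatedLun, its method names, and the key value 0x1111 are hypothetical and do not correspond to any real SCSI library. The sketch also assumes that a registration advances the generation counter while a reservation does not, which is the author of this sketch's understanding of the SCSI Primary Commands specification rather than a statement of the patent.

```python
class SimulatedLun:
    """Hypothetical in-memory model of a shared SCSI logical unit."""

    def __init__(self):
        self.keys = set()        # registered reservation keys
        self.holder = None       # key holding the persistent reservation
        self.pr_generation = 0   # models the 32-bit generation counter

    def register(self, key):
        # Models PERSISTENT RESERVE OUT with service action REGISTER.
        self.keys.add(key)
        self.pr_generation = (self.pr_generation + 1) & 0xFFFFFFFF

    def reserve(self, key):
        # Models PERSISTENT RESERVE OUT with service action RESERVE.
        if key not in self.keys:
            raise PermissionError("key is not registered")
        if self.holder not in (None, key):
            raise PermissionError("reservation conflict")
        self.holder = key        # assumed: RESERVE leaves the counter unchanged


K1 = 0x1111                      # hypothetical key value for the master node
lun = SimulatedLun()
lun.register(K1)                 # first sub-step of S202
lun.reserve(K1)                  # second sub-step of S202
print(lun.holder == K1, lun.pr_generation)
```

A real implementation would issue the corresponding SCSI commands to the device instead of mutating local state.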
Next, method 200 proceeds to step S204, where the master node updates the generation indication of the cluster by changing the state of the persistent reservation it has created. Note that the term "generation" as used herein is information that reflects changes in the membership of the cluster. Specifically, whenever a new node joins the cluster and becomes a member, the generation indication is incremented to represent the update of the cluster membership. Likewise, when a member node leaves the cluster, the generation indication is changed accordingly.
However, under a cluster architecture that conforms to a protocol such as SCSI, at step S204 the generation indication of the cluster can be made to change without any change in cluster membership, by changing the state of the persistent reservation on the shared resource. Specifically, according to some embodiments of the present invention, the master node can repeatedly register with the shared resource using the key associated with the persistent reservation (that is, k1). This can be implemented, for example, with the SCSI command PERSISTENT RESERVE OUT with its service action code set to "REGISTER AND IGNORE EXISTING KEY" and the reservation key specified as k1. In the SCSI protocol, when the master node issues this command to the shared resource, the generation indication changes even though the membership of the cluster has not changed.
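As a hedged illustration of step S204, the following sketch models repeated registration with an unchanged key. SimulatedLun and register_and_ignore are hypothetical names for an in-memory stand-in, not a real device interface. The point shown is that the set of registered keys stays the same while the generation counter advances with each repeated registration.

```python
class SimulatedLun:
    """Hypothetical in-memory model of a shared SCSI logical unit."""

    def __init__(self):
        self.keys = set()        # registered reservation keys
        self.pr_generation = 0   # models the 32-bit generation counter

    def register_and_ignore(self, key):
        # Models PERSISTENT RESERVE OUT with service action
        # REGISTER AND IGNORE EXISTING KEY: it succeeds even if the
        # key is already registered, and the counter still advances.
        self.keys.add(key)
        self.pr_generation = (self.pr_generation + 1) & 0xFFFFFFFF


K1 = 0x1111                      # hypothetical key value for the master node
lun = SimulatedLun()
lun.register_and_ignore(K1)      # initial registration
members_before = set(lun.keys)
generation_before = lun.pr_generation

for _ in range(3):               # three heartbeat updates
    lun.register_and_ignore(K1)

print(lun.keys == members_before, lun.pr_generation - generation_before)
```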
As described below, a slave node in the cluster can obtain the generation indication by reading the state of the persistent reservation that the master node has created on the shared resource, and can detect whether the master node is alive according to whether the generation indication is being updated. This is described in detail below with reference to Fig. 3.
Next, method 200 proceeds to step S206, where the master node determines whether it is giving up its master identity. It will be appreciated that this includes determining whether the master node is giving up its position as controller of the cluster, or ending its exclusive use of a mutually exclusive shared resource. If the master node determines that it is not giving up its master identity (the "No" branch of step S206), method 200 returns to step S204 after a predetermined time interval. In this way, as long as the master node is alive and has not given up its master identity, it updates the generation indication regularly by changing the state of the persistent reservation. As long as a slave node finds that the generation indication of the cluster keeps changing, it can conclude that the master node is currently alive.
On the other hand, if the master node determines that it is giving up its master identity (the "Yes" branch of step S206), method 200 proceeds to step S208, where the master node releases the persistent reservation on the shared resource. This can be implemented, for example, with the SCSI command PERSISTENT RESERVE OUT with its service action code set to "RELEASE". In this case, the slave nodes will detect that the generation indication no longer changes periodically. The slave nodes in the cluster can thus determine that the master node is no longer alive, and can take various suitable follow-up actions accordingly, such as competing to become the new master node.
Method 200 ends after step S208.
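The overall flow of method 200 (steps S202 to S208) can be sketched as a loop. This is a minimal sketch under stated assumptions: SimulatedLun, run_master, and the should_resign callback are hypothetical names, the device is an in-memory model, and the sleep function is injectable only so that the example runs without delay.

```python
import time


class SimulatedLun:
    """Hypothetical in-memory model of a shared SCSI logical unit."""

    def __init__(self):
        self.keys = set()
        self.holder = None
        self.pr_generation = 0

    def _bump(self):
        self.pr_generation = (self.pr_generation + 1) & 0xFFFFFFFF

    def register(self, key):             # REGISTER
        self.keys.add(key)
        self._bump()

    def register_and_ignore(self, key):  # REGISTER AND IGNORE EXISTING KEY
        self.keys.add(key)
        self._bump()

    def reserve(self, key):              # RESERVE (counter assumed unchanged)
        if key not in self.keys or self.holder not in (None, key):
            raise PermissionError("cannot reserve")
        self.holder = key

    def release(self, key):              # RELEASE (counter assumed unchanged)
        if self.holder == key:
            self.holder = None


def run_master(lun, key, should_resign, interval_s, sleep=time.sleep):
    """Run the master-side flow and return the number of updates issued."""
    lun.register(key)                    # S202, first sub-step
    lun.reserve(key)                     # S202, second sub-step
    beats = 0
    while True:
        lun.register_and_ignore(key)     # S204: update the generation
        beats += 1
        if should_resign():              # S206: give up master identity?
            break
        sleep(interval_s)                # wait, then return to S204
    lun.release(key)                     # S208
    return beats


answers = iter([False, False, True])     # resign at the third check
lun = SimulatedLun()
beats = run_master(lun, 0x1111, lambda: next(answers), interval_s=0.0)
print(beats, lun.pr_generation, lun.holder)
```

In this model the counter ends at one registration plus one increment per update, and the reservation holder is cleared once the master resigns.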
With method 200, the generation indication in the SCSI protocol can be used flexibly to indicate that the master node is alive (without actually changing the configuration of the cluster). As is known in the art, the SCSI commands mentioned above are primitive commands that are supported at the firmware level and can therefore be executed efficiently. Of course, the principle and spirit of the present invention are equally applicable to other protocols similar to the SCSI protocol.
Reference is now made to Fig. 3, which shows a flowchart of a method 300 for detecting, at a slave node of a cluster, whether the master node is alive, according to an exemplary embodiment of the present invention. Method 300 may be performed by each slave node in the cluster or by an apparatus associated with a slave node (for example, the apparatus 500 described below with reference to Fig. 5).
After method 300 starts, at step S302 the slave node registers with the shared resource of the cluster.
In embodiments that conform to the SCSI protocol, this can be implemented, for example, with the PERSISTENT RESERVE OUT command of the SCSI protocol, with its service action code set to "REGISTER" and the registration key associated with the slave node (denoted k2) specified.
Next, method 300 proceeds to step S304, where the slave node obtains the generation indication of the cluster by accessing the shared resource.
According to some embodiments of the present invention, step S304 can be divided into two sub-steps. In the first sub-step, the slave node reads from the shared resource the information about the current persistent reservation on the shared resource. Taking a cluster that conforms to the SCSI protocol as an example, this can be implemented with the SCSI command PERSISTENT RESERVE IN, with its service action code set to "READ RESERVATION". In response, the shared resource returns to the slave node the information related to the current persistent reservation on the shared resource. In the second sub-step, the slave node parses the generation indication of the cluster from this information.
As is known in the art, when a slave node issues the SCSI command PERSISTENT RESERVE IN to the shared resource, the response returned by the shared resource contains a field indicating the current generation of the cluster. In the SCSI protocol this field is called the "persistent reservations generation" (PRGENERATION) field; it is a 32-bit counter maintained by the firmware of the shared resource. Like PERSISTENT RESERVE OUT, the PERSISTENT RESERVE IN command is a primitive command, so the slave node can obtain the generation indication of the cluster efficiently and conveniently at step S304.
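The second sub-step of step S304 can be illustrated with a small parser. The byte layout used here is an assumption based on this sketch author's reading of the SCSI Primary Commands specification (a 4-byte big-endian PRGENERATION, a 4-byte additional length, then an optional 16-byte reservation descriptor) and should be checked against the T10 documents before use. The function name parse_read_reservation and the sample buffer are hypothetical.

```python
import struct


def parse_read_reservation(data: bytes):
    """Parse READ RESERVATION parameter data (layout assumed, see above).

    Returns (pr_generation, reservation), where reservation is None when
    no persistent reservation is held.
    """
    if len(data) < 8:
        raise ValueError("need at least the 8-byte header")
    # Bytes 0-3: PRGENERATION; bytes 4-7: ADDITIONAL LENGTH (big-endian).
    pr_generation, additional_length = struct.unpack(">II", data[:8])
    reservation = None
    if additional_length >= 16 and len(data) >= 24:
        key = struct.unpack(">Q", data[8:16])[0]   # bytes 8-15: key
        scope_type = data[21]                      # byte 21: SCOPE | TYPE
        reservation = {
            "key": key,
            "scope": scope_type >> 4,
            "type": scope_type & 0x0F,
        }
    return pr_generation, reservation


# Hypothetical response: generation 7, one reservation held by key 0x1111.
sample = struct.pack(">IIQ", 7, 16, 0x1111) + bytes([0, 0, 0, 0, 0, 0x01, 0, 0])
generation, reservation = parse_read_reservation(sample)
print(generation, reservation)
```

Only the first four bytes are needed for aliveness detection; the descriptor is parsed here so that a slave node could also learn which key currently holds the reservation.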
Next, method 300 proceeds to step S306, where the slave node determines whether the generation indication obtained at step S304 has been updated compared with the previously obtained generation indication, and thereby detects whether the master node is alive.
If the slave node determines that the generation indication has been updated (the "Yes" branch of step S306), it considers the master node of the cluster to be currently alive. Method 300 then returns to step S304 after a predetermined time interval. In this way, the slave node periodically accesses the shared resource of the cluster to obtain the generation indication of the cluster.
On the other hand, if the slave node determines that the generation indication has not been updated (the "No" branch of step S306), it may consider that the master node is currently not alive (for example, the master node has failed or has voluntarily given up its master identity).
In that case, method 300 may proceed to step S308, where the slave node removes the persistent reservation on the shared resource created by the master node. According to some embodiments of the present invention, step S308 can be implemented with the SCSI command PERSISTENT RESERVE OUT with the service action code PREEMPT. This service action code allows the registration made on the shared resource by the master node to be removed by specifying the registration key associated with the master node. Note that step S308 is optional; for example, it need not be performed when the master node has voluntarily released the persistent reservation.
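A simplified model of step S308 is sketched below. SimulatedLun and preempt are hypothetical names for an in-memory stand-in. The sketch assumes that preempting the key that holds the reservation removes that key's registration and passes the reservation to the preempting key, which reflects this sketch author's understanding of the SCSI specification; actual device behavior should be confirmed against the specification.

```python
class SimulatedLun:
    """Hypothetical in-memory model of a shared SCSI logical unit."""

    def __init__(self, keys, holder):
        self.keys = set(keys)    # registered reservation keys
        self.holder = holder     # key holding the persistent reservation
        self.pr_generation = 0   # models the 32-bit generation counter

    def preempt(self, own_key, victim_key):
        # Models PERSISTENT RESERVE OUT with service action PREEMPT.
        if own_key not in self.keys:
            raise PermissionError("preempting key is not registered")
        self.keys.discard(victim_key)          # remove the victim's registration
        if self.holder == victim_key:
            self.holder = own_key              # assumed: reservation moves over
        self.pr_generation = (self.pr_generation + 1) & 0xFFFFFFFF


K1, K2 = 0x1111, 0x2222          # hypothetical master and slave keys
lun = SimulatedLun(keys={K1, K2}, holder=K1)
lun.preempt(own_key=K2, victim_key=K1)
print(lun.keys == {K2}, lun.holder == K2)
```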
Next, method 300 may optionally proceed to step S310, where the slave node can request to become the new master node. Any mechanism or strategy for selecting one or more of multiple slave nodes as the master node, whether currently known or developed in the future, can be used in combination with embodiments of the present invention. The scope of the present invention is not limited in this respect.
Method 300 ends after step S310.
With method 300, each slave node can periodically obtain the generation indication of the cluster by accessing the shared resource. If the generation indication keeps being updated, the slave node considers the master node to be operating normally, that is, alive. Conversely, if the slave node finds that the generation indication of the cluster is no longer being updated, it can conclude that the master node is no longer alive. The slave node can then perform various follow-up operations, such as removing the registration of the master node or competing to become the new master node.
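The detection loop of method 300 (steps S304 and S306) can be sketched as follows. The names master_alive and watch are hypothetical, and read_generation stands for whatever mechanism returns the current generation indication; here it is fed from a fixed list of readings so that the example runs without a device. Comparing for inequality rather than "greater than" is a design choice of this sketch so that wraparound of the 32-bit counter is not mistaken for a stall.

```python
import time


def master_alive(previous, current):
    # Any change counts as an update; inequality also covers the case
    # where the 32-bit counter wraps from 0xFFFFFFFF back to 0.
    return current != previous


def watch(read_generation, interval_s, max_polls, sleep=time.sleep):
    """Return the poll number at which the master is presumed dead,
    or None if it stayed alive for max_polls polls."""
    previous = read_generation()             # initial S304
    for poll in range(1, max_polls + 1):
        sleep(interval_s)                    # predetermined interval
        current = read_generation()          # S304
        if not master_alive(previous, current):   # S306
            return poll                      # proceed to S308 / S310
        previous = current
    return None


readings = iter([5, 6, 7, 7])                # generation stalls at 7
result = watch(lambda: next(readings), interval_s=0.0, max_polls=3)
print(result)
```

In practice the slave's polling interval would presumably need to be longer than the master's update interval, and the baseline would be read after the slave's own registration, since that registration may itself change the counter.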
Reference is now made to Fig. 4, which shows a block diagram of an apparatus 400 for indicating, at a master node in a cluster, to at least one slave node that the master node is alive, according to an exemplary embodiment of the present invention. The master node and the at least one slave node in the cluster conform to the Small Computer System Interface (SCSI) protocol. According to embodiments of the present invention, the apparatus 400 may reside at the master node of the cluster; alternatively, the apparatus 400 may be located outside the master node and be in operative communication with it.
As shown, the apparatus 400 includes: a creating device 402 configured to create a persistent reservation on the shared resource of the cluster; and an updating device 404 configured to update the generation indication of the cluster by periodically changing the state of the persistent reservation, wherein the generation indication is obtainable by the at least one slave node by accessing the shared resource.
According to some embodiments of the present invention, the apparatus 400 may further include a registering device configured to register with the shared resource using a key associated with the master node. In such embodiments, the creating device 402 may be configured to create, on the shared resource, the persistent reservation associated with the key.
According to some embodiments of the present invention, the updating device 404 may include a repeated-registration device configured to periodically re-register with the shared resource using the key associated with the persistent reservation.
According to some embodiments of the present invention, the apparatus 400 further includes a releasing device configured to release the persistent reservation on the shared resource in response to the master node giving up its master identity.
According to some embodiments of the present invention, the creating device 402 and the updating device 404 may be configured to perform their respective operations based on SCSI commands.
Reference is now made to Fig. 5, which shows a block diagram of an apparatus 500, according to an exemplary embodiment of the present invention, for detecting, at a slave node in a cluster, whether the master node of the cluster is alive. The master node and the at least one slave node comply with the Small Computer System Interface (SCSI) protocol. According to embodiments of the present invention, apparatus 500 may reside at a slave node of the cluster; alternatively, apparatus 500 may be located outside the slave node and be in operative communication with it.
As shown, apparatus 500 includes: a registering unit 502 configured to register with the shared resource of the cluster; an accessing unit 504 configured to periodically access the shared resource to obtain the generation indication of the cluster; and a detecting unit 506 configured to detect whether the master node is alive by determining whether the generation indication has been updated compared with the previously obtained generation indication.
According to some embodiments of the present invention, the accessing unit 504 may include: a reading unit configured to read information about the current persistent reservation on the shared resource; and a parsing unit configured to parse the generation indication of the cluster from that information.
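As a non-limiting sketch of the parsing step, the following Python function extracts the generation indication from a raw reply buffer. It assumes the reply has the layout used by the SCSI PERSISTENT RESERVE IN command with the READ KEYS service action (a 4-byte big-endian PRGENERATION, a 4-byte ADDITIONAL LENGTH, then a list of 8-byte reservation keys); the function name `parse_read_keys` is hypothetical, and how the buffer is obtained from the device is outside the sketch.

```python
import struct


def parse_read_keys(data: bytes):
    """Return (prgeneration, keys) parsed from a READ KEYS style reply."""
    if len(data) < 8:
        raise ValueError("reply is shorter than the 8-byte header")
    # Header: PRGENERATION and ADDITIONAL LENGTH, both big-endian 32-bit.
    prgeneration, additional_length = struct.unpack_from(">II", data, 0)
    # Never read past the buffer, even if the header claims more data.
    available = min(additional_length, len(data) - 8)
    keys = [
        struct.unpack_from(">Q", data, 8 + 8 * i)[0]
        for i in range(available // 8)
    ]
    return prgeneration, keys
```

Only the first four bytes are needed for liveness detection; the key list is parsed here merely to show where the registered keys would appear.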
According to some embodiments of the present invention, apparatus 500 may further include a removing unit configured to remove the master node's registration with the shared resource in response to determining that the generation indication has not been updated.
According to some embodiments of the present invention, apparatus 500 may further include a requesting unit configured to request to become the new master node of the cluster.
According to some embodiments of the present invention, the registering unit 502 and the accessing unit 504 are configured to perform their respective operations based on SCSI commands.
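The slave-side behaviour of apparatus 500 may likewise be sketched in Python, again as an illustration only. The class `MasterLivenessDetector`, the function `poll_and_take_over`, and the three callables passed to them are hypothetical names; treating the first poll as "alive" because no earlier value exists for comparison is an assumption of the sketch, not something stated in the description.

```python
class MasterLivenessDetector:
    """Compares each newly read generation value with the previous one."""

    def __init__(self, read_generation):
        # read_generation: callable returning the current generation value,
        # for example one obtained by reading the shared resource.
        self._read_generation = read_generation
        self._last = None

    def poll(self) -> bool:
        """Return True if the master node is considered alive."""
        current = self._read_generation()
        if self._last is None:
            # Assumption: the first poll only records a baseline.
            self._last = current
            return True
        alive = current != self._last
        self._last = current
        return alive


def poll_and_take_over(detector, remove_master_registration, request_mastership):
    """Run one detection cycle; return True if a takeover was attempted."""
    if detector.poll():
        return False
    remove_master_registration()   # removing unit
    request_mastership()           # requesting unit
    return True
```

A slave node would call `poll_and_take_over` once per detection period, with a period longer than the master node's update period so that a live master always has time to change the generation between two polls.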
Note that, for clarity, the optional units and the sub-units contained in each unit are not shown in Figs. 4 and 5. It should be understood, however, that the units described for apparatuses 400 and 500 correspond respectively to the steps of methods 200 and 300 described above with reference to Figs. 2 and 3. Thus, the operations and features described above for methods 200 and 300 apply equally to apparatuses 400 and 500 and the units they contain, and are not repeated here.
It should also be understood that apparatuses 400 and 500 may be implemented in various ways. For example, in some embodiments, apparatuses 400 and 500 may be implemented in software and/or firmware. Alternatively or additionally, they may be implemented partially or fully in hardware. For example, apparatuses 400 and 500 may be implemented as an integrated circuit (IC) chip or an application-specific integrated circuit (ASIC), or as a system on chip (SOC). Other approaches now known or developed in the future are also feasible, and the scope of the present invention is not limited in this respect.
Reference is now made to Fig. 6, which shows a schematic block diagram of a computer system 600 suitable for practicing embodiments of the present invention. For example, the computer system 600 shown in Fig. 6 may be used to implement the master node and/or the slave nodes described above.
As shown, computer system 600 may include: a central processing unit (CPU) 601, random access memory (RAM) 602, read-only memory (ROM) 603, a system bus 604, a hard disk controller 605, a keyboard controller 606, a serial interface controller 607, a parallel interface controller 608, a display controller 609, a hard disk 610, a keyboard 611, a serial peripheral device 612, a parallel peripheral device 613, and a display 614. Among these devices, the CPU 601, RAM 602, ROM 603, hard disk controller 605, keyboard controller 606, serial interface controller 607, parallel interface controller 608, and display controller 609 are coupled to the system bus 604. The hard disk 610 is coupled to the hard disk controller 605, the keyboard 611 to the keyboard controller 606, the serial peripheral device 612 to the serial interface controller 607, the parallel peripheral device 613 to the parallel interface controller 608, and the display 614 to the display controller 609. It should be understood that the block diagram in Fig. 6 is shown for purposes of example only and does not limit the scope of the present invention. In some cases, devices may be added or removed as circumstances require.
As described above, apparatuses 400 and 500 may be implemented purely in hardware, for example as a chip, ASIC, or SOC. Such hardware may be integrated into computer system 600. In addition, embodiments of the present invention may also be implemented in the form of a computer program product. For example, methods 200 and 300 described with reference to Figs. 2 and 3 may be implemented by a computer program product. This computer program product may be stored in, for example, the RAM 602, ROM 603, or hard disk 610 shown in Fig. 6 and/or any other suitable storage medium, or may be downloaded to computer system 600 from a suitable location over a network. The computer program product may include a computer code portion comprising program instructions executable by a suitable processing device (for example, the CPU 601 shown in Fig. 6). The program instructions may include at least instructions for implementing the steps of methods 200 and 300.
The spirit and principles of the present invention have been explained above in connection with several specific embodiments. According to embodiments of the present invention, in a cluster composed of nodes compatible with SCSI or a similar protocol, the information indicating the cluster generation can be used flexibly to indicate node liveness. According to embodiments of the present invention, there is no need to reserve storage space as a disk heartbeat algorithm does, because the generation indication (for example, the PRGENERATION field) is only a 32-bit counter and is typically maintained by the firmware of the cluster storage, which makes it more efficient to operate. Moreover, operating on the persistent reservation of the shared resource does not cause disk head movement and is not affected by corruption of the data on disk.
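Because the generation indication is a 32-bit counter, it eventually wraps around to zero. The short Python sketch below, whose function names are hypothetical, illustrates why this need not disturb detection: the detecting side only asks whether the value differs from the one previously read, not whether it is larger.

```python
MASK_32 = 0xFFFFFFFF


def next_generation(value: int) -> int:
    """Advance a 32-bit counter by one, wrapping to zero after 2**32 - 1."""
    return (value + 1) & MASK_32


def generation_changed(previous: int, current: int) -> bool:
    """Liveness test based on inequality, so wraparound is harmless."""
    return (previous & MASK_32) != (current & MASK_32)
```

A greater-than comparison would wrongly report a dead master node at the moment of wraparound, which is why the sketch compares for inequality.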
In addition, according to embodiments of the present invention, because the generation indication can be updated atomically, cluster configurations with multiple master nodes can be built easily. Moreover, a slave node can easily learn whether the master node is alive or valid in a single I/O operation. Furthermore, using existing SCSI (primitive) commands to indicate liveness makes the infrastructure of the cluster clearer and simpler.
It should be noted that embodiments of the present invention can be implemented in hardware, software, or a combination of software and hardware. The hardware part can be implemented using dedicated logic; the software part can be stored in memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will understand that the apparatuses and methods described above can be implemented using computer-executable instructions and/or processor control code, with such code provided, for example, on a carrier medium such as a magnetic disk, CD, or DVD-ROM, in programmable memory such as read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier. The apparatuses of the present invention and their modules can be implemented by hardware circuitry such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices; they can also be implemented in software executed by various types of processors, or by a combination of such hardware circuitry and software, for example firmware.
The communication networks mentioned in the description may include various types of networks, including but not limited to local area networks (LANs), wide area networks (WANs), networks based on the IP protocol (for example, the Internet), and peer-to-peer networks (for example, ad hoc peer-to-peer networks).
It should be noted that although several units or sub-units of the apparatuses are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the units described above may be embodied in a single unit. Conversely, the features and functions of one unit described above may be further divided and embodied by multiple units.
In addition, although the operations of the methods of the present invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve the desired result. On the contrary, the steps depicted in the flowcharts may be executed in a different order. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be broken down into multiple steps.
Although the present invention has been described with reference to several specific embodiments, it should be understood that the present invention is not limited to the specific embodiments disclosed. The present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims. The scope of the appended claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims (24)

1. A method for indicating, at a master node in a cluster, to at least one slave node of the cluster that the master node is alive, the method comprising:
creating a persistent reservation on a shared resource of the cluster; and
updating a generation indication of the cluster by periodically changing the state of the persistent reservation, wherein the generation reflects changes in the membership of the cluster, and wherein the generation indication is obtainable by the at least one slave node by accessing the shared resource.
2. The method according to claim 1, wherein creating the persistent reservation on the shared resource of the cluster comprises:
registering with the shared resource using a key associated with the master node; and
creating, on the shared resource, the persistent reservation associated with the key.
3. The method according to claim 2, wherein periodically changing the state of the persistent reservation comprises:
periodically re-registering with the shared resource using the key associated with the persistent reservation.
4. The method according to claim 1, further comprising:
in response to the master node being about to give up its master role, releasing the persistent reservation on the shared resource.
5. The method according to claim 1, wherein creating the persistent reservation on the shared resource of the cluster and periodically changing the state of the persistent reservation are performed based on Small Computer System Interface (SCSI) commands.
6. The method according to any one of claims 1-5, wherein the master node and the at least one slave node comply with the Small Computer System Interface (SCSI) protocol.
7. An apparatus for indicating, at a master node in a cluster, to at least one slave node of the cluster that the master node is alive, the apparatus comprising:
a creating unit configured to create a persistent reservation on a shared resource of the cluster; and
an updating unit configured to update a generation indication of the cluster by periodically changing the state of the persistent reservation, wherein the generation reflects changes in the membership of the cluster, and wherein the generation indication is obtainable by the at least one slave node by accessing the shared resource.
8. The apparatus according to claim 7, further comprising a registering unit configured to register with the shared resource using a key associated with the master node; and wherein
the creating unit is configured to create, on the shared resource, the persistent reservation associated with the key.
9. The apparatus according to claim 8, wherein the updating unit comprises:
a re-registering unit configured to periodically re-register with the shared resource using the key associated with the persistent reservation.
10. The apparatus according to claim 7, further comprising:
a releasing unit configured to release the persistent reservation on the shared resource in response to the master node being about to give up its master role.
11. The apparatus according to claim 7, wherein the creating unit and the updating unit are configured to perform their respective operations based on Small Computer System Interface (SCSI) commands.
12. The apparatus according to any one of claims 7-11, wherein the master node and the at least one slave node comply with the Small Computer System Interface (SCSI) protocol.
13. A method for detecting, at a slave node in a cluster, whether a master node of the cluster is alive, the method comprising:
registering the slave node with a shared resource of the cluster;
periodically accessing the shared resource, by scanning the current persistent reservation of the shared resource, to obtain a generation indication of the cluster, wherein the generation reflects changes in the membership of the cluster, and wherein the generation indication is obtainable by the slave node by accessing the shared resource; and
detecting whether the master node is alive by determining whether the generation indication has been updated compared with a previously obtained generation indication.
14. The method according to claim 13, wherein periodically accessing the shared resource to obtain the generation indication of the cluster comprises:
reading information about the current persistent reservation on the shared resource; and
parsing the generation indication of the cluster from the information.
15. The method according to claim 13, further comprising:
in response to determining that the generation indication has not been updated, removing the master node's registration with the shared resource.
16. The method according to claim 13, further comprising:
requesting to become the new master node of the cluster.
17. The method according to claim 13, wherein registering with the shared resource of the cluster and periodically accessing the shared resource are performed based on Small Computer System Interface (SCSI) commands.
18. The method according to any one of claims 13-17, wherein the master node and the slave node comply with the Small Computer System Interface (SCSI) protocol.
19. An apparatus for detecting, at a slave node in a cluster, whether a master node of the cluster is alive, the apparatus comprising:
a registering unit configured to register the slave node with a shared resource of the cluster;
an accessing unit configured to periodically access the shared resource, by scanning the current persistent reservation of the shared resource, to obtain a generation indication of the cluster, wherein the generation reflects changes in the membership of the cluster, and wherein the generation indication is obtainable by the slave node by accessing the shared resource; and
a detecting unit configured to detect whether the master node is alive by determining whether the generation indication has been updated compared with a previously obtained generation indication.
20. The apparatus according to claim 19, wherein the accessing unit comprises:
a reading unit configured to read information about the current persistent reservation on the shared resource; and
a parsing unit configured to parse the generation indication of the cluster from the information.
21. The apparatus according to claim 19, further comprising:
a removing unit configured to remove the master node's registration with the shared resource in response to determining that the generation indication has not been updated.
22. The apparatus according to claim 19, further comprising:
a requesting unit configured to request to become the new master node of the cluster.
23. The apparatus according to claim 19, wherein the registering unit and the accessing unit are configured to perform their respective operations based on Small Computer System Interface (SCSI) commands.
24. The apparatus according to any one of claims 19-23, wherein the master node and the slave node comply with the Small Computer System Interface (SCSI) protocol.
CN201110430012.4A 2011-12-16 For the method and apparatus indicating node to survive in the cluster Active CN103167010B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110430012.4A CN103167010B (en) 2011-12-16 For the method and apparatus indicating node to survive in the cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110430012.4A CN103167010B (en) 2011-12-16 For the method and apparatus indicating node to survive in the cluster

Publications (2)

Publication Number Publication Date
CN103167010A CN103167010A (en) 2013-06-19
CN103167010B true CN103167010B (en) 2016-12-14

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1348134A (en) * 2000-10-13 2002-05-08 国际商业机器公司 Method and equipment for providing multi-channel input/output in the environment of non-concurrent cluster
CN101179466A (en) * 2007-10-15 2008-05-14 北京交通大学 Centralized service based distributed peer-to-peer network implementing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200409

Address after: Massachusetts, USA

Patentee after: EMC IP Holding Company LLC

Address before: Massachusetts, USA

Patentee before: EMC Corp.