CN103167010B - Method and apparatus for indicating node aliveness in a cluster - Google Patents
- Publication number
- CN103167010B (application number CN201110430012.4A)
- Authority
- CN
- China
- Prior art keywords
- cluster
- node
- generation
- master node
- shared resource
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
Embodiments of the present invention relate to a method and apparatus for indicating node aliveness in a cluster. In particular, a method is disclosed for indicating, at a master node in a cluster, the aliveness of the master node to at least one slave node in the cluster, where the master node and the at least one slave node follow the Small Computer System Interface (SCSI) protocol. The method includes: creating a persistent reservation for a shared resource of the cluster; and updating a generation indication of the cluster by periodically changing the state of the persistent reservation, where the generation indication is obtainable by the at least one slave node by accessing the shared resource. Also disclosed is a method for detecting, at a slave node in a cluster, whether the master node in the cluster is alive, including: registering with the shared resource of the cluster; periodically accessing the shared resource to obtain the generation indication of the cluster; and detecting whether the master node is alive by determining whether the generation indication has been updated compared with a previously obtained generation indication. Corresponding apparatuses are also disclosed.
Description
Technical field
Embodiments of the present invention relate generally to the field of distributed information processing and, more specifically, to a method and apparatus for indicating node aliveness in a cluster.
Background
With the development of computing technology, the cluster has become an important modern computing architecture. In a cluster, a particular node in the system (which may be a device, an application, and so on) usually needs to indicate its aliveness to other nodes, that is, that the node is currently in an operable state or a valid working state. For example, in a cluster made up of multiple nodes, one or more nodes usually serve as the master node of the cluster, responsible for example for controlling the behavior of the cluster and of the other member nodes. Such a master node needs to continually indicate its aliveness to the other nodes in the cluster (called "slave nodes"). If the master node does not indicate its aliveness, the slave nodes may conclude that the master node has failed. The slave nodes in the cluster system may then "compete" to become the new master node, so as to ensure the normal operation of the whole cluster system. It can be seen that a master node indicating its aliveness to other nodes is one of the foundations for maintaining normal cluster operation.
Besides the master node of a cluster, another common scenario that requires an aliveness indication is one in which a node that holds a shared resource exclusively and persistently (that is, uses the shared resource exclusively for a period of time) indicates its aliveness to the nodes waiting to use that resource. Specifically, because the current owner node holds the resource exclusively until it gives the resource up or fails, it needs to indicate to the other nodes waiting for the shared resource that it is still using the resource normally, so as to avoid deadlock. If the owner node no longer indicates its aliveness, the waiting nodes can stop waiting and actively compete for use of the shared resource.
In the prior art, one method of indicating node aliveness is based on network heartbeat messages. A node that needs to indicate its aliveness sends heartbeat messages over the network to other nodes at a predetermined interval to declare that it is alive. If the other nodes receive the heartbeat messages over the network at the predetermined interval, the master node is considered to be alive. Conversely, if no heartbeat message is received within the predetermined interval, the master node is considered to have failed. This method has notable drawbacks. First, the loss of a network heartbeat may be caused by the network at the master node side and/or the slave node side, so additional auxiliary algorithms are needed to determine whether the problem actually lies with the master node or with the network at the slave node side. Second, many clusters based on shared storage have no strong network connection between nodes (for example, the Virtual Machine File System, VMFS), in which case a network heartbeat mechanism may not work at all. In addition, a network partition indicates only a network connectivity problem and is not sufficient to show that the master node has failed. Subsequent cluster protection actions triggered by a network partition may unnecessarily damage data integrity.
Besides the network heartbeat, another known aliveness indication strategy is the disk heartbeat. For example, shared storage (such as a disk) accessible to the master node and all slave nodes can be provided, with a dedicated region set aside on the disk. The master node continually updates the content stored in this region, thereby notifying the other nodes that it is alive. Compared with the network heartbeat, the disk heartbeat overcomes some shortcomings. However, with the disk heartbeat strategy the system must reserve additional storage space for the disk heartbeat information. Moreover, corruption of the disk heartbeat information in the reserved storage space may cause a "split-brain" problem. Furthermore, although storing the disk heartbeat together with important mutually exclusive data can provide a more accurate judgment, frequently accessing and updating disk heartbeat information in different sectors makes the read/write head move back and forth, which reduces the efficiency of other data I/O accesses. In addition, the disk heartbeat information must carry identification information of the master node so that other nodes can know which node is the master. In some cluster configurations, however, multiple master nodes may exist at the same time, and the system must then use very complex algorithms to track the aliveness of the master nodes.
Therefore, there is a need in the art for a technical solution that indicates node aliveness more accurately, efficiently, and conveniently.
Summary of the invention
In view of the above problems, the present invention proposes a method and apparatus for indicating master node aliveness in a cluster more efficiently and flexibly.
In one aspect of the invention, a method is provided for indicating, at a master node in a cluster, the aliveness of the master node to at least one slave node. The method includes: creating a persistent reservation for a shared resource of the cluster; and updating a generation indication of the cluster by periodically changing the state of the persistent reservation, where the generation indication is obtainable by the at least one slave node by accessing the shared resource.
In another aspect of the invention, an apparatus is provided for indicating, at a master node in a cluster, the aliveness of the master node to at least one slave node. The apparatus includes: a creating device configured to create a persistent reservation for a shared resource of the cluster; and an updating device configured to update a generation indication of the cluster by periodically changing the state of the persistent reservation, where the generation indication is obtainable by the at least one slave node by accessing the shared resource.
In still another aspect of the invention, a method is provided for detecting, at a slave node in a cluster, whether the master node in the cluster is alive. The method includes: registering with a shared resource of the cluster; periodically accessing the shared resource to obtain a generation indication of the cluster; and detecting whether the master node is alive by determining whether the generation indication has been updated compared with a previously obtained generation indication.
In yet another aspect of the invention, an apparatus is provided for detecting, at a slave node in a cluster, whether the master node in the cluster is alive. The apparatus includes: a registering device configured to register with a shared resource of the cluster; an accessing device configured to periodically access the shared resource to obtain a generation indication of the cluster; and a detecting device configured to detect whether the master node is alive by determining whether the generation indication has been updated compared with a previously obtained generation indication.
According to embodiments of the present invention, in a distributed cluster system built from SCSI-compatible devices, the master node can operate on the persistent reservation for the shared resource so that the generation indication of the system is continually updated even though the membership of the cluster does not change. Correspondingly, a slave node can obtain the generation indication by accessing the shared resource and detect whether the master node is alive according to whether the indication is updated regularly. In preferred embodiments of the present invention, all operations of the master node and the slave nodes on the shared resource can be implemented with existing SCSI commands. In this way, the master node can indicate its alive state to the slave nodes efficiently, conveniently, and flexibly.
Brief description of the drawings
The above and other objects, features, and advantages of embodiments of the present invention will become easier to understand by reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the present invention are shown by way of example and not limitation, in which:
Fig. 1 shows a block diagram of an application environment architecture 100 in which embodiments of the present invention can be used;
Fig. 2 shows a flowchart of a method 200 for indicating, at the master node of a cluster, the aliveness of the master node to other nodes, according to an exemplary embodiment of the present invention;
Fig. 3 shows a flowchart of a method 300 for detecting, at a slave node of a cluster, whether the master node is alive, according to an exemplary embodiment of the present invention;
Fig. 4 shows a block diagram of an apparatus 400 for indicating, at the master node of a cluster, the aliveness of the master node to other nodes, according to an exemplary embodiment of the present invention;
Fig. 5 shows a block diagram of an apparatus 500 for detecting, at a slave node of a cluster, whether the master node is alive, according to an exemplary embodiment of the present invention; and
Fig. 6 shows a block diagram of a computer system 600 suitable for practicing embodiments of the present invention.
In the figures, identical or corresponding reference numerals denote identical or corresponding parts.
Detailed description
The principle and spirit of the present invention are described below with reference to several exemplary embodiments shown in the drawings. It should be understood that these embodiments are provided only to enable those skilled in the art to better understand and thereby implement the present invention, and not to limit the scope of the present invention in any way.
Note that the term "node" as used herein may denote a physical computing device (for example, a computer), or a virtual machine, application, process, or thread. Moreover, the term "master node" as used herein may denote either the master control node responsible for controlling the operation of a cluster, or a node that currently holds the right to use a mutually exclusive shared resource, and so on. Correspondingly, the term "slave node" may denote either a controlled node in a cluster, or a node waiting to use the mutually exclusive shared resource, and so on.
In addition, although the following description of embodiments of the present invention is given mainly with reference to the Small Computer System Interface (SCSI) protocol, this is merely exemplary. Its purpose is to better explain the basic principles and ideas of the present invention to those skilled in the art, not to limit the scope of the present invention in any way. In fact, embodiments of the present invention are applicable to any similar protocol, whether currently known or developed in the future.
Reference is now made to Fig. 1, which shows a block diagram of a cluster 100 in which embodiments of the present invention can be implemented. Fig. 1 shows node 1 110, node 2 112, ..., and node N 114. These nodes are connected to each other through a connection medium 120 and operate cooperatively, thereby forming a distributed system or cluster 100. Although not shown, the cluster 100 may also include other nodes as well as data storage devices, control devices, shared resources, and so on.
As described above, according to embodiments of the present invention, the nodes 110-114 may be physical computing devices. For example, a node may be a conventional desktop computing device such as a personal computer (PC), or a mobile terminal such as a personal digital assistant (PDA), cellular phone, smart phone, or laptop computer. It should be understood that these are only some possible examples and are not intended to limit the scope of the present invention. In fact, the nodes 110-114 may be any physical devices with information processing and communication capabilities, whether currently known or developed in the future. Alternatively or additionally, the nodes 110-114 may also be virtual nodes, including but not limited to virtual machines, applications, processes, threads, and so on. In particular, according to preferred embodiments of the present invention, the nodes 110-114 follow the Small Computer System Interface (for example, SCSI-3) protocol.
According to embodiments of the present invention, the connection medium 120 may be a network. In particular, the connection medium 120 may be a wired network, a wireless network, or a combination thereof, including but not limited to at least one of the following: a cellular phone network, Ethernet, a wireless local area network (WLAN) based on IEEE 802.11, 802.16, 802.20, and so on, and/or a Worldwide Interoperability for Microwave Access (WiMAX) network. Furthermore, the connection medium 120 may be a public network (for example, the Internet), a private network (for example, an intranet), or a combination thereof. Alternatively or additionally, the connection medium 120 may also be a bus or similar means for direct interconnection and communication between devices. The scope of the present invention is not limited in this respect.
In addition, as shown in Fig. 1, the cluster 100 may also include a shared resource 130. The term "shared resource" as used herein refers to a resource that the nodes 110-114 in the cluster are all eligible to access and use, including physical resources and logical resources, such as a SCSI storage device. A node's access to the shared resource 130 can be achieved through the connection medium 120. Alternatively or additionally, a node can also access and use the shared resource 130 through other suitable technical means.
In the cluster 100, one or more nodes may serve as the master node, that is, the control node of the cluster or the node that exclusively uses a mutually exclusive shared resource. The other nodes correspondingly serve as slave nodes. It should be understood that the roles of master node and slave node in the cluster 100 can be exchanged. For example, the master node may actively give up its master identity, or lose it because of a failure. A slave node can then become the new master node through "competition".
Reference is now made to Fig. 2, which shows a flowchart of a method 200 for indicating, at the master node of a cluster, the aliveness of the master node to other nodes, according to an exemplary embodiment of the present invention. The method 200 may be performed by the master node in the cluster or by an apparatus associated with the master node (for example, the apparatus 400 described below with reference to Fig. 4).
After the method 200 starts, in step S202 the master node creates a persistent reservation for the shared resource of the cluster to which it belongs.
As described above, a shared resource is a resource (for example, a storage resource) accessible to all members of the cluster; it may be a physical resource or a logical unit. The term "persistent reservation" as used herein means that a node has an exclusive right to use the shared resource or a specific part of it, and that this exclusive use lasts for a period of time. In other words, during this period, other nodes in the cluster cannot modify the persistently reserved part of the shared resource (for example, they are prohibited from performing write operations). Logically, a persistent reservation can be understood as a mutual exclusion lock applied on the shared resource. It should be understood that one or more nodes may be allowed to create one or more persistent reservations on different parts of the shared resource in the cluster, depending on the settings and configuration of the cluster.
As noted above, according to embodiments of the present invention, the nodes in the cluster may follow the SCSI protocol. In such embodiments, step S202 can be implemented in two sub-steps. In the first sub-step, the master node registers with the shared resource. For example, the master node may generate or otherwise obtain a key associated with it (denoted k1), and use this key k1 to register with the shared resource. According to embodiments of the present invention, this sub-step can be implemented, for example, with the SCSI command PERSISTENT RESERVE OUT, where the service action code of the command is "REGISTER" and the registration key is specified as k1. As is known in the art, the SCSI protocol command PERSISTENT RESERVE OUT is a primitive command. Therefore, by using this command, the master node can easily complete registration with the shared resource (for more details about SCSI commands, refer to the various materials on the SCSI protocol known in the art, for example the SCSI specification documents available from http://www.t10.org/drafts.htm).
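As a rough illustration of the registration sub-step, the sketch below packs a PERSISTENT RESERVE OUT command descriptor block (CDB) and its 24-byte parameter list for the REGISTER service action. The opcode (0x5F), the service action code (0x00), and the field offsets reflect the SPC-3 layout as it is commonly documented and should be checked against the T10 specification. The helper names and the example key value are hypothetical, and actually sending the command to a device (for example through an operating system pass-through interface) is not shown.

```python
import struct

PR_OUT_OPCODE = 0x5F   # PERSISTENT RESERVE OUT (assumed per SPC-3)
SA_REGISTER = 0x00     # REGISTER service action (assumed per SPC-3)


def build_pr_out_cdb(service_action: int, scope: int = 0, pr_type: int = 0,
                     param_len: int = 24) -> bytes:
    """Pack a 10-byte PERSISTENT RESERVE OUT CDB (hypothetical helper)."""
    return struct.pack(
        ">BBB2xIB",
        PR_OUT_OPCODE,                               # byte 0: operation code
        service_action & 0x1F,                       # byte 1: service action
        ((scope & 0x0F) << 4) | (pr_type & 0x0F),    # byte 2: scope and type
        param_len,                                   # bytes 5-8: parameter list length
        0,                                           # byte 9: control
    )


def build_pr_out_params(reservation_key: int, service_action_key: int) -> bytes:
    """Pack the 24-byte parameter list: two 8-byte keys, remaining bytes zero."""
    return struct.pack(">QQ8x", reservation_key, service_action_key)


K1 = 0xA001  # hypothetical key value chosen by the master node

# A node that is not yet registered passes 0 as its current key and
# supplies the new key k1 in the service action reservation key field.
cdb = build_pr_out_cdb(SA_REGISTER)
params = build_pr_out_params(reservation_key=0, service_action_key=K1)
```

Keeping the CDB and the parameter list as separate byte strings mirrors how pass-through interfaces usually take a command and a data-out buffer as two arguments.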
Next, in the second sub-step of step S202, the master node creates on the shared resource a persistent reservation associated with the registration key. This action can be implemented, for example, with the SCSI command PERSISTENT RESERVE OUT with its service action code set to "RESERVE". As is known in the art, the SCSI protocol supports creating, on the shared resource, a persistent reservation that is associated with the node's registration key k1 and has a particular type (TYPE) and scope (SCOPE).
Next, the method 200 proceeds to step S204, where the master node updates the generation indication of the cluster by changing the state of the persistent reservation it created. Note that the term "generation" as used herein is information that reflects changes in the membership of the cluster. Specifically, whenever a new node joins the cluster and becomes a member, the generation indication may be incremented to represent the update of cluster membership. Likewise, when a member node leaves the cluster, the generation indication is changed accordingly.
However, under a cluster architecture that follows, for example, the SCSI protocol, step S204 can make the generation indication of the cluster change without any change in cluster membership, by changing the state of the persistent reservation on the shared resource. Specifically, according to some embodiments of the present invention, the master node can repeatedly register with the shared resource using the key associated with the persistent reservation (that is, k1). This action can be implemented, for example, with the SCSI command PERSISTENT RESERVE OUT with its service action code set to "REGISTER AND IGNORE EXISTING KEY" and the key specified as k1. In the SCSI protocol, when the master node issues this command to the shared resource, the generation indication changes even though the membership of the cluster has not changed.
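The behavior this step relies on can be illustrated with a small in-memory model. The class below is a hypothetical stand-in for the shared resource, not a real SCSI target: it only imitates the rule described above, namely that registration-type actions advance the generation counter while RESERVE leaves it unchanged. All names are assumptions made for the sketch.

```python
class SimulatedSharedResource:
    """Toy model of the persistent reservation state kept by a shared resource."""

    def __init__(self) -> None:
        self.pr_generation = 0        # 32-bit counter kept by the device
        self.registered_keys = set()  # keys of the registered nodes
        self.holder_key = None        # key of the current reservation holder

    def _bump_generation(self) -> None:
        # Wrap around at 32 bits, as a fixed-width counter would.
        self.pr_generation = (self.pr_generation + 1) & 0xFFFFFFFF

    def register(self, key: int) -> None:
        self.registered_keys.add(key)
        self._bump_generation()

    def reserve(self, key: int) -> None:
        if key not in self.registered_keys:
            raise PermissionError("key is not registered")
        if self.holder_key not in (None, key):
            raise PermissionError("reservation conflict")
        self.holder_key = key         # RESERVE leaves the generation unchanged

    def register_and_ignore_existing_key(self, key: int) -> None:
        self.registered_keys.add(key)  # membership stays the same for a known key
        self._bump_generation()        # but the generation still advances

    def read_generation(self) -> int:
        return self.pr_generation


resource = SimulatedSharedResource()
resource.register(0xA001)             # first sub-step of S202
resource.reserve(0xA001)              # second sub-step of S202
before = resource.read_generation()
resource.register_and_ignore_existing_key(0xA001)   # S204
after = resource.read_generation()
print(before, after)                  # prints "1 2"
```

The set of registered keys is identical before and after the repeated registration, which is the point of the technique: the counter moves without any change in cluster membership.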
As described below, a slave node in the cluster can obtain the generation indication by reading the state of the persistent reservation created by the master node on the shared resource, and can detect whether the master node is alive according to whether the generation indication is being updated. This is described in detail below with reference to Fig. 3.
Next, the method 200 proceeds to step S206, where the master node determines whether it is giving up its master identity. It will be appreciated that this includes determining whether the master node gives up its controlling position in the cluster, or ends its exclusive use of the mutually exclusive shared resource. If the master node determines not to give up its master identity (branch "No" of step S206), the method 200 returns to step S204 after a predetermined time interval. In this way, as long as the master node is alive and has not given up its master identity, it regularly updates the generation indication by changing the state of the persistent reservation. Accordingly, as long as a slave node finds that the generation indication of the cluster keeps changing, it can conclude that the master node is currently alive.
On the other hand, if the master node determines to give up its master identity (branch "Yes" of step S206), the method 200 proceeds to step S208, where the master node releases the persistent reservation on the shared resource. This action can be implemented, for example, with the SCSI command PERSISTENT RESERVE OUT with its service action code set to "RELEASE". In this case, the other slave nodes will detect that the generation indication no longer changes periodically, and can therefore determine that the master node is no longer alive. The slave nodes can then take various suitable follow-up actions, such as competing to become the new master node.
The method 200 ends after step S208.
With the method 200, the generation indication in the SCSI protocol can be flexibly used to indicate the aliveness of the master node without actually changing the configuration of the cluster. As is known in the art, the SCSI commands mentioned above are primitive commands supported at the firmware level, and can therefore be executed efficiently. Of course, the principle and spirit of the present invention are equally applicable to other protocols similar to the SCSI protocol.
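The control flow of the method 200 (steps S202 to S208) can be sketched as a simple loop. In the sketch below, `RecordingResource` is a hypothetical stand-in that only records which service action would be issued, and `run_master`, `should_give_up`, and `interval_s` are names invented for the illustration; a real implementation would issue the corresponding SCSI commands instead.

```python
import time
from typing import Callable, List


class RecordingResource:
    """Hypothetical stand-in that records which actions were issued."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def register(self, key: int) -> None:
        self.calls.append("REGISTER")

    def reserve(self, key: int) -> None:
        self.calls.append("RESERVE")

    def register_and_ignore_existing_key(self, key: int) -> None:
        self.calls.append("REGISTER AND IGNORE EXISTING KEY")

    def release(self, key: int) -> None:
        self.calls.append("RELEASE")


def run_master(resource, key: int, should_give_up: Callable[[], bool],
               interval_s: float) -> None:
    """Run the master side until should_give_up() returns True."""
    # S202: register, then create the persistent reservation.
    resource.register(key)
    resource.reserve(key)
    while True:
        # S204: re-register so that the generation indication advances.
        resource.register_and_ignore_existing_key(key)
        # S206: decide whether to give up the master identity.
        if should_give_up():
            break
        time.sleep(interval_s)
    # S208: release the persistent reservation.
    resource.release(key)


# Example run: keep the master role for two rounds, then give it up.
answers = iter([False, False, True])
res = RecordingResource()
run_master(res, key=0xA001, should_give_up=lambda: next(answers), interval_s=0.0)
```

Passing the give-up decision in as a callable keeps the loop independent of how a particular cluster decides that the master should step down.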
Reference is now made to Fig. 3, which shows a flowchart of a method 300 for detecting, at a slave node of a cluster, whether the master node is alive, according to an exemplary embodiment of the present invention. The method 300 may be performed by each slave node in the cluster or by an apparatus associated with a slave node (for example, the apparatus 500 described below with reference to Fig. 5).
After the method 300 starts, in step S302 the slave node registers with the shared resource of the cluster.
In embodiments that follow the SCSI protocol, this can be implemented, for example, with the SCSI PERSISTENT RESERVE OUT command with its service action code set to "REGISTER" and specifying the registration key associated with the slave node (denoted k2).
Next, the method 300 proceeds to step S304, where the slave node obtains the generation indication of the cluster by accessing the shared resource.
According to some embodiments of the present invention, step S304 can be divided into two sub-steps. In the first sub-step, the slave node reads from the shared resource the information about the current persistent reservation for the shared resource. Taking a cluster that follows the SCSI protocol as an example, this can be implemented with the SCSI command PERSISTENT RESERVE IN with its service action code set to "READ RESERVATION". In response, the shared resource returns to the slave node the information related to the current persistent reservation on the shared resource. In the second sub-step, the slave node parses the generation indication of the cluster from this information.
As is known in the art, when a slave node issues the SCSI command PERSISTENT RESERVE IN to the shared resource, the response returned by the shared resource to the slave node contains a field indicating the current generation of the cluster. In the SCSI protocol this field is called the "persistent reservation generation" (PRGENERATION) field; it is a 32-bit counter maintained by the firmware of the shared resource. Like PERSISTENT RESERVE OUT, the PERSISTENT RESERVE IN command is a primitive command, so the slave node can obtain the generation indication of the cluster efficiently and conveniently in step S304.
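The parsing sub-step can be sketched as follows. The byte offsets used here (PRGENERATION in bytes 0 to 3, an additional-length field in bytes 4 to 7, and the holder's reservation key in bytes 8 to 15 when a reservation exists) are assumptions based on the READ RESERVATION response layout as commonly documented for SPC-3, and should be verified against the specification. The function name and the sample response are hypothetical.

```python
import struct


def parse_read_reservation(response: bytes) -> dict:
    """Extract PRGENERATION and, if present, the reservation holder's key."""
    if len(response) < 8:
        raise ValueError("response shorter than the 8-byte header")
    # Big-endian: 32-bit generation counter, then 32-bit additional length.
    pr_generation, additional_length = struct.unpack_from(">II", response, 0)
    holder_key = None
    if additional_length >= 16 and len(response) >= 16:
        # A reservation descriptor follows the header; its first 8 bytes
        # hold the reservation key of the holder.
        (holder_key,) = struct.unpack_from(">Q", response, 8)
    return {"pr_generation": pr_generation, "holder_key": holder_key}


# Hand-built sample: generation 7, one 16-byte descriptor held by key 0xA001.
sample = struct.pack(">IIQ4xBB2x", 7, 16, 0xA001, 0, 0x01)
info = parse_read_reservation(sample)

# A response with no reservation carries only the 8-byte header.
empty = parse_read_reservation(struct.pack(">II", 9, 0))
```

Only the first four bytes matter for aliveness detection; the holder key is returned as well because a slave node needs it if it later removes the master's registration.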
Next, the method 300 proceeds to step S306, where it is determined whether the generation indication obtained in step S304 has been updated compared with the previously obtained generation indication, thereby detecting whether the master node is alive.
If the slave node determines that the generation indication has been updated (branch "Yes" of step S306), it considers the master node of the cluster to be currently alive. The method 300 then returns to step S304 after a predetermined time interval. In this way, the slave node periodically obtains the generation indication of the cluster by accessing the shared resource of the cluster.
On the other hand, if the slave node determines that the generation indication has not been updated (branch "No" of step S306), it may conclude that the master node is currently not alive (for example, the master node has failed or has actively given up its master identity).
The method 300 may then proceed to step S308, where the persistent reservation on the shared resource created by the master node is removed. According to some embodiments of the present invention, step S308 can be implemented with the SCSI command PERSISTENT RESERVE OUT with the service action code PREEMPT. This service action code allows the registration made by the master node on the shared resource to be removed by specifying the registration key associated with the master node. Note that step S308 is optional; for example, it need not be performed when the master node has already actively released the persistent reservation.
Point can ask to become new host node.For selecting one or more one-tenth from node from multiple
Various mechanism and strategy for host node all can be used in combination with embodiments of the present invention, no matter
That be currently known or exploitation in the future.The scope of the present invention is not limited in this respect.
Method 300 terminates after step S310.
By method 300, each can share resource and obtain periodically by accessing from node
Take the generation instruction of cluster.If instruction is updated continuously from generation to generation, then think from node
Host node is in normal operating state, i.e. existing state.Whereas if find cluster from node
Generation instruction no longer update, then it can be assumed that host node is no longer on existing state.Now,
Can perform to remove the registration of host node from node, that competition becomes new host node etc. is various follow-up
Operation.
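The detection loop of the method 300 can be sketched as follows. The function names, the `max_polls` bound, and the use of a callable to read the generation are assumptions made for the illustration. The sketch also assumes that the slave node polls no more often than the master node updates the generation, since otherwise two consecutive reads could return the same value while the master is still alive.

```python
import time
from typing import Callable, Optional


def master_is_alive(previous: Optional[int], current: int) -> bool:
    """S306: the master counts as alive if the generation has changed."""
    # The first reading has nothing to compare against, so it passes.
    return previous is None or current != previous


def watch_master(read_generation: Callable[[], int], interval_s: float,
                 max_polls: int) -> bool:
    """Poll the generation; return False as soon as it stops changing."""
    previous = None
    for _ in range(max_polls):
        current = read_generation()              # S304
        if not master_is_alive(previous, current):
            return False                         # S308 / S310 would follow here
        previous = current
        time.sleep(interval_s)
    return True


# Example: the generation advances twice and then stalls.
readings = iter([5, 6, 7, 7])
still_alive = watch_master(lambda: next(readings), interval_s=0.0, max_polls=10)

# Example: the generation keeps advancing for every poll.
steady = iter([1, 2, 3])
steady_alive = watch_master(lambda: next(steady), interval_s=0.0, max_polls=3)
```

Comparing for inequality rather than "greater than" means the check keeps working when the 32-bit counter wraps around to zero.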
Reference is now made to Fig. 4, which shows a block diagram of an apparatus 400 for indicating, at a master node in a cluster, the aliveness of the master node to at least one slave node, according to an exemplary embodiment of the present invention. The master node and the at least one slave node in the cluster follow the Small Computer System Interface (SCSI) protocol. According to embodiments of the present invention, the apparatus 400 may reside at the master node of the cluster; alternatively, the apparatus 400 may be located outside the master node and operatively communicate with it.
As shown in the figure, the apparatus 400 includes: a creating device 402 configured to create a persistent reservation for the shared resource of the cluster; and an updating device 404 configured to update the generation indication of the cluster by periodically changing the state of the persistent reservation, where the generation indication is obtainable by the at least one slave node by accessing the shared resource.
According to some embodiments of the invention, the apparatus 400 may further
include a registering device, configured to register with the shared resource
using a key value associated with the host node. In such embodiments, the
creating device 402 may be configured to create, on the shared resource, the
persistent reservation associated with the key value.
According to some embodiments of the invention, the updating device 404 may
include a re-registering device, configured to periodically re-register with
the shared resource using the key value associated with the persistent
reservation.
According to some embodiments of the invention, the apparatus 400 further
includes a releasing device, configured to release the persistent reservation
on the shared resource in response to the host node being about to give up
its host node identity.
According to some embodiments of the invention, the creating device 402 and
the updating device 404 may be configured to perform their respective
operations based on SCSI commands.
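Purely as an illustrative sketch, the host-side operations above can be
modelled in Python as follows. The classes `SimulatedSharedResource` and
`HostNode` and all of their methods are hypothetical names introduced for
this example. The in-memory object merely stands in for a shared SCSI logical
unit, and it assumes that each registration or re-registration increments the
generation counter while reserving and releasing do not. A real
implementation would issue SCSI persistent reservation commands instead.

```python
class SimulatedSharedResource:
    """In-memory stand-in for a shared logical unit (assumption, not a device)."""

    def __init__(self) -> None:
        self.generation = 0            # models a 32-bit generation counter
        self.registered_keys = set()
        self.reservation_holder = None

    def register(self, key: int) -> None:
        # Registering (or re-registering) a key bumps the generation counter.
        self.registered_keys.add(key)
        self.generation = (self.generation + 1) & 0xFFFFFFFF

    def reserve(self, key: int) -> None:
        if key not in self.registered_keys:
            raise PermissionError("key is not registered")
        if self.reservation_holder not in (None, key):
            raise PermissionError("reservation conflict")
        self.reservation_holder = key

    def release(self, key: int) -> None:
        if self.reservation_holder == key:
            self.reservation_holder = None


class HostNode:
    """Hypothetical host node that signals liveness via re-registration."""

    def __init__(self, resource: SimulatedSharedResource, key: int) -> None:
        self.resource = resource
        self.key = key

    def start(self) -> None:
        # Register the key, then create the reservation tied to it.
        self.resource.register(self.key)
        self.resource.reserve(self.key)

    def heartbeat(self) -> None:
        # Re-registering changes the reservation state and thereby
        # advances the generation indication seen by slave nodes.
        self.resource.register(self.key)

    def step_down(self) -> None:
        # Release the reservation before giving up the host node identity.
        self.resource.release(self.key)
```

In this sketch a scheduler would call `heartbeat()` at a fixed interval that
is shorter than the polling interval used by the slave nodes.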
Referring now to Fig. 5, a block diagram is shown of an apparatus 500,
according to an exemplary embodiment of the invention, for detecting, at a
slave node in a cluster, whether the host node in the cluster is alive. The
host node and the at least one slave node comply with the Small Computer
System Interface (SCSI) protocol. According to embodiments of the invention,
the apparatus 500 may reside at a slave node of the cluster; alternatively,
the apparatus 500 may be located outside the slave node and operatively
communicate with the slave node.
As shown, the apparatus 500 includes: a registering device 502, configured to
register with the shared resource of the cluster; an accessing device 504,
configured to periodically access the shared resource so as to obtain the
generation indication of the cluster; and a detecting device 506, configured
to detect whether the host node is alive by determining whether the
generation indication has been updated compared with a previously obtained
generation indication.
According to some embodiments of the invention, the accessing device 504 may
include: a reading device, configured to read information about the current
persistent reservation on the shared resource; and a parsing device,
configured to parse the generation indication of the cluster from that
information.
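As a non-limiting sketch of the reading and parsing step, the following
Python function extracts the generation indication from the parameter data
returned by a PERSISTENT RESERVE IN command with the READ KEYS service
action. The function name `parse_read_keys` is hypothetical, and the assumed
layout (a 4-byte big-endian PRGENERATION field, a 4-byte ADDITIONAL LENGTH
field, then a list of 8-byte reservation keys) should be checked against the
applicable SCSI Primary Commands standard before use.

```python
import struct
from typing import List, Tuple


def parse_read_keys(data: bytes) -> Tuple[int, List[int]]:
    """Return (generation, registered keys) from READ KEYS parameter data."""
    if len(data) < 8:
        raise ValueError("parameter data shorter than the 8-byte header")
    # Header: PRGENERATION and ADDITIONAL LENGTH, both big-endian 32-bit.
    generation, additional_length = struct.unpack_from(">II", data, 0)
    # Only parse keys that are actually present in the returned buffer.
    available = min(additional_length, len(data) - 8)
    whole = available - available % 8
    keys = [struct.unpack_from(">Q", data, 8 + off)[0]
            for off in range(0, whole, 8)]
    return generation, keys
```

A slave node would keep the returned generation value from one poll to the
next and compare the two, while the key list can be used to identify the
registration of the host node.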
According to some embodiments of the invention, the apparatus 500 may further
include a removing device, configured to remove the registration of the host
node with the shared resource in response to determining that the generation
indication has not been updated.
According to some embodiments of the invention, the apparatus 500 may further
include a requesting device, configured to request to become the new host
node of the cluster.
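The removal and takeover steps can be illustrated with the following Python
sketch. Everything in it is an assumption made for illustration: the class
`SimulatedReservation`, its `preempt` method, and the helper
`try_become_host` are hypothetical names, and the model assumes that removing
the registration of the failed host node also transfers its reservation to
the requester and that a second attempt against an already removed key is
rejected.

```python
class SimulatedReservation:
    """In-memory model of registrations and one reservation (assumption)."""

    def __init__(self, host_key: int) -> None:
        self.keys = {host_key}      # registered keys
        self.holder = host_key      # key currently holding the reservation
        self.generation = 1         # bumped by register and preempt

    def register(self, key: int) -> None:
        self.keys.add(key)
        self.generation += 1

    def preempt(self, own_key: int, victim_key: int) -> bool:
        """Remove victim_key's registration and take over its reservation."""
        if own_key not in self.keys or victim_key not in self.keys:
            return False            # models a rejected command
        self.keys.discard(victim_key)
        if self.holder == victim_key:
            self.holder = own_key
        self.generation += 1
        return True


def try_become_host(res: SimulatedReservation, own_key: int,
                    dead_host_key: int) -> bool:
    """Return True if this slave node won the competition to be host."""
    return res.preempt(own_key, dead_host_key)
```

When two slave nodes detect the failure at about the same time, only the
first call succeeds in this model, so exactly one of them becomes the new
host node.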
According to some embodiments of the invention, the registering device 502
and the accessing device 504 are configured to perform their respective
operations based on SCSI commands.
Note that, for clarity, the optional devices and the sub-devices contained in
each device are not shown in Figs. 4 and 5. It should be understood, however,
that the devices described for the apparatuses 400 and 500 correspond
respectively to the steps of the methods 200 and 300 described above with
reference to Figs. 2 and 3. Thus, the operations and features described above
for the methods 200 and 300 apply equally to the apparatuses 400 and 500 and
to the devices contained therein, and are not repeated here.
It should also be understood that the apparatuses 400 and 500 can be
implemented in various ways. For example, in some embodiments, the
apparatuses 400 and 500 can be implemented in software and/or firmware.
Alternatively or additionally, the apparatuses 400 and 500 can be implemented
partially or fully in hardware. For example, the apparatuses 400 and 500 can
be implemented as an integrated circuit (IC) chip or an application-specific
integrated circuit (ASIC). The apparatuses 400 and 500 can also be
implemented as a system on chip (SOC). Other ways, whether currently known or
developed in the future, are also feasible, and the scope of the invention is
not limited in this respect.
Referring now to Fig. 6, a schematic block diagram is shown of a computer
system 600 suitable for practicing embodiments of the invention. For example,
the computer system 600 shown in Fig. 6 may be used to implement the host
node and/or the slave node described above.
As shown, the computer system 600 may include: a CPU (central processing
unit) 601, a RAM (random access memory) 602, a ROM (read-only memory) 603, a
system bus 604, a hard disk controller 605, a keyboard controller 606, a
serial interface controller 607, a parallel interface controller 608, a
display controller 609, a hard disk 610, a keyboard 611, a serial peripheral
device 612, a parallel peripheral device 613, and a display 614. Among these
devices, the CPU 601, the RAM 602, the ROM 603, the hard disk controller 605,
the keyboard controller 606, the serial interface controller 607, the
parallel interface controller 608, and the display controller 609 are coupled
to the system bus 604. The hard disk 610 is coupled to the hard disk
controller 605, the keyboard 611 to the keyboard controller 606, the serial
peripheral device 612 to the serial interface controller 607, the parallel
peripheral device 613 to the parallel interface controller 608, and the
display 614 to the display controller 609.
It should be understood that the block diagram in Fig. 6 is shown for
purposes of example only and does not limit the scope of the invention. In
some cases, devices may be added or removed as circumstances require.
As described above, the apparatuses 400 and 500 can be implemented as pure
hardware, such as a chip, an ASIC, or an SOC. Such hardware can be integrated
into the computer system 600. In addition, embodiments of the invention can
also be implemented in the form of a computer program product. For example,
the methods 200 and 300 described with reference to Figs. 2 and 3 can be
implemented by a computer program product. The computer program product may
be stored, for example, in the RAM 602, the ROM 603, or the hard disk 610
shown in Fig. 6 and/or in any suitable storage medium, or may be downloaded
to the computer system 600 from a suitable location over a network. The
computer program product may include a computer code portion comprising
program instructions executable by a suitable processing device (e.g., the
CPU 601 shown in Fig. 6). The program instructions may include at least
instructions for implementing the steps of the methods 200 and 300.
The spirit and principles of the invention have been explained above in
connection with several specific embodiments. According to embodiments of the
invention, in a cluster made up of nodes compatible with SCSI or a similar
protocol, the information indicating the cluster generation can be flexibly
reused to indicate that a node is alive. According to embodiments of the
invention, no storage space needs to be reserved as in a disk heartbeat
algorithm, because the generation indication (e.g., the PRGENERATION field)
is only a 32-bit counter and is typically maintained by the firmware of the
cluster storage, which makes it efficient to operate. Moreover, operating on
the persistent reservation of the shared resource does not cause disk head
movement and is not affected by corruption of data on the disk.
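Because the generation indication is a 32-bit counter, it eventually wraps
around to zero. The following Python sketch, whose function name
`generation_delta` is hypothetical, shows one way a slave node might compare
two samples so that a wrap is still recognised as an update. It assumes that
the counter advances fewer than 2^32 times between two polls.

```python
MASK32 = 0xFFFFFFFF  # width of the generation counter


def generation_delta(previous: int, current: int) -> int:
    """Number of increments between two samples, modulo 2**32."""
    return (current - previous) & MASK32
```

A result of zero means the generation indication was not updated between the
two polls; any non-zero result means it was updated, even if the raw value
decreased because of a wrap.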
In addition, according to embodiments of the invention, because the update of
the generation indication can be performed atomically, a cluster
configuration with multiple host nodes can be built easily. Moreover, a slave
node can readily learn whether the host node is alive or valid in a single
I/O operation. Furthermore, using existing SCSI (primitive) commands to
indicate liveness makes the infrastructure of the cluster clearer and more
concise.
It should be noted that embodiments of the invention can be implemented in
hardware, software, or a combination of software and hardware. The hardware
part can be implemented using dedicated logic; the software part can be
stored in memory and executed by a suitable instruction execution system,
such as a microprocessor or specially designed hardware. Those skilled in the
art will understand that the apparatuses and methods described above may be
implemented using computer-executable instructions and/or processor control
code, for example code provided on a carrier medium such as a disk, CD, or
DVD-ROM, on a programmable memory such as read-only memory (firmware), or on
a data carrier such as an optical or electronic signal carrier. The apparatus
of the invention and its modules may be implemented by hardware circuitry
such as very-large-scale integrated circuits or gate arrays, semiconductors
such as logic chips and transistors, or programmable hardware devices such as
field-programmable gate arrays and programmable logic devices; by software
executed by various types of processors; or by a combination of such hardware
circuitry and software, for example firmware.
The communication networks mentioned in this description may include various
kinds of networks, including but not limited to a local area network
("LAN"), a wide area network ("WAN"), a network based on the IP protocol
(e.g., the Internet), and a peer-to-peer network (e.g., an ad hoc
peer-to-peer network).
It should be noted that although several devices or sub-devices of the
apparatus are mentioned in the detailed description above, this division is
not mandatory. In fact, according to embodiments of the invention, the
features and functions of two or more of the devices described above may be
embodied in a single device. Conversely, the features and functions of one
device described above may be further divided so as to be embodied by
multiple devices.
In addition, although the operations of the method of the invention are
depicted in the drawings in a particular order, this does not require or
imply that the operations must be performed in that particular order, or that
all of the illustrated operations must be performed, to achieve the desired
result. On the contrary, the steps depicted in the flowcharts may be executed
in a different order. Additionally or alternatively, some steps may be
omitted, multiple steps may be merged into one step, and/or one step may be
split into multiple steps.
Although the invention has been described with reference to several specific
embodiments, it should be understood that the invention is not limited to the
disclosed embodiments. The invention is intended to cover the various
modifications and equivalent arrangements included within the spirit and
scope of the appended claims. The scope of the appended claims is to be
accorded the broadest interpretation so as to encompass all such
modifications and equivalent structures and functions.
Claims (24)
1. A method for indicating, at a host node in a cluster, to at least one
slave node of the cluster that the host node is alive, the method comprising:
creating a persistent reservation on a shared resource of the cluster; and
updating a generation indication of the cluster by periodically changing the
state of the persistent reservation, wherein the generation reflects changes
in the membership of the cluster, and wherein the generation indication is
obtainable by the at least one slave node by accessing the shared resource.
2. The method according to claim 1, wherein creating the persistent
reservation on the shared resource of the cluster comprises:
registering with the shared resource using a key value associated with the
host node; and
creating, on the shared resource, the persistent reservation associated with
the key value.
3. The method according to claim 2, wherein periodically changing the state
of the persistent reservation comprises:
periodically re-registering with the shared resource using the key value
associated with the persistent reservation.
4. The method according to claim 1, further comprising:
releasing the persistent reservation on the shared resource in response to
the host node being about to give up its host node identity.
5. The method according to claim 1, wherein creating the persistent
reservation on the shared resource of the cluster and periodically changing
the state of the persistent reservation are performed based on Small Computer
System Interface (SCSI) commands.
6. The method according to any one of claims 1-5, wherein the host node and
the at least one slave node comply with the Small Computer System Interface
(SCSI) protocol.
7. An apparatus for indicating, at a host node in a cluster, to at least one
slave node in the cluster that the host node is alive, the apparatus
comprising:
a creating device, configured to create a persistent reservation on a shared
resource of the cluster; and
an updating device, configured to update a generation indication of the
cluster by periodically changing the state of the persistent reservation,
wherein the generation reflects changes in the membership of the cluster, and
wherein the generation indication is obtainable by the at least one slave
node by accessing the shared resource.
8. The apparatus according to claim 7, further comprising a registering
device, configured to register with the shared resource using a key value
associated with the host node; and wherein
the creating device is configured to create, on the shared resource, the
persistent reservation associated with the key value.
9. The apparatus according to claim 8, wherein the updating device comprises:
a re-registering device, configured to periodically re-register with the
shared resource using the key value associated with the persistent
reservation.
10. The apparatus according to claim 7, further comprising:
a releasing device, configured to release the persistent reservation on the
shared resource in response to the host node being about to give up its host
node identity.
11. The apparatus according to claim 7, wherein the creating device and the
updating device are configured to perform their respective operations based
on Small Computer System Interface (SCSI) commands.
12. The apparatus according to any one of claims 7-11, wherein the host node
and the at least one slave node comply with the Small Computer System
Interface (SCSI) protocol.
13. A method for detecting, at a slave node in a cluster, whether a host node
in the cluster is alive, the method comprising:
registering the slave node with a shared resource of the cluster;
periodically accessing the shared resource by scanning the current persistent
reservation on the shared resource, so as to obtain a generation indication
of the cluster, wherein the generation reflects changes in the membership of
the cluster, and wherein the generation indication is obtainable by the slave
node by accessing the shared resource; and
detecting whether the host node is alive by determining whether the
generation indication has been updated compared with a previously obtained
generation indication.
14. The method according to claim 13, wherein periodically accessing the
shared resource to obtain the generation indication of the cluster comprises:
reading information about the current persistent reservation on the shared
resource; and
parsing the generation indication of the cluster from the information.
15. The method according to claim 13, further comprising:
removing the registration of the host node with the shared resource in
response to determining that the generation indication has not been updated.
16. The method according to claim 13, further comprising:
requesting to become the new host node of the cluster.
17. The method according to claim 13, wherein registering with the shared
resource of the cluster and periodically accessing the shared resource are
performed based on Small Computer System Interface (SCSI) commands.
18. The method according to any one of claims 13-17, wherein the host node
and the slave node comply with the Small Computer System Interface (SCSI)
protocol.
19. An apparatus for detecting, at a slave node in a cluster, whether a host
node in the cluster is alive, the apparatus comprising:
a registering device, configured to register the slave node with a shared
resource of the cluster;
an accessing device, configured to periodically access the shared resource by
scanning the current persistent reservation on the shared resource, so as to
obtain a generation indication of the cluster, wherein the generation
reflects changes in the membership of the cluster, and wherein the generation
indication is obtainable by the slave node by accessing the shared resource;
and
a detecting device, configured to detect whether the host node is alive by
determining whether the generation indication has been updated compared with
a previously obtained generation indication.
20. The apparatus according to claim 19, wherein the accessing device
comprises:
a reading device, configured to read information about the current persistent
reservation on the shared resource; and
a parsing device, configured to parse the generation indication of the
cluster from the information.
21. The apparatus according to claim 19, further comprising:
a removing device, configured to remove the registration of the host node
with the shared resource in response to determining that the generation
indication has not been updated.
22. The apparatus according to claim 19, further comprising:
a requesting device, configured to request to become the new host node of the
cluster.
23. The apparatus according to claim 19, wherein the registering device and
the accessing device are configured to perform their respective operations
based on Small Computer System Interface (SCSI) commands.
24. The apparatus according to any one of claims 19-23, wherein the host node
and the slave node comply with the Small Computer System Interface (SCSI)
protocol.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110430012.4A CN103167010B (en) | 2011-12-16 | For the method and apparatus indicating node to survive in the cluster |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110430012.4A CN103167010B (en) | 2011-12-16 | For the method and apparatus indicating node to survive in the cluster |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103167010A | 2013-06-19 |
CN103167010B | 2016-12-14 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1348134A (en) * | 2000-10-13 | 2002-05-08 | 国际商业机器公司 | Method and equipment for providing multi-channel input/output in the environment of non-cocurrent cluster |
CN101179466A (en) * | 2007-10-15 | 2008-05-14 | 北京交通大学 | Centralized service based distributed peer-to-peer network implementing method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20200409 Address after: Massachusetts, USA Patentee after: EMC IP Holding Company LLC Address before: Massachusetts, USA Patentee before: EMC Corp. |