CN100420251C - Self adaptable electing algorithm for main controlled node in group - Google Patents

Publication number
CN100420251C
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2005100182320A
Other languages
Chinese (zh)
Other versions
CN1645862A (en)
Inventor
陈勇
涂小明
叶磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Northern Fiberhome Technologies Co Ltd
Original Assignee
Beijing Northern Fiberhome Technologies Co Ltd


Abstract

The present invention relates to a self-adaptive election algorithm for the main control node of a cluster, which uses a finger-guessing module and a heartbeat detection module. A unique main control node is selected by comparing randomly generated integer finger-guessing values; when the main control node leaves the cluster for any reason, the remaining cluster automatically selects a new main control node. The invention is applicable to high-reliability clusters in telecommunication systems. The main control node is generated entirely autonomously within the cluster, without being determined through a configuration file and without manual intervention.

Description

Self-adaptive election algorithm for the main control node in a cluster
Technical field
The invention belongs to the technical field of computer clusters, and specifically relates to an algorithm for electing the main control node in a cluster while guaranteeing the high availability of the main control node.
Background art
In a cluster, one node is generally selected to act as the main control node (the main control node is a special functional entity in the cluster that controls all nodes in the cluster). At present, the usual practice is to designate a particular node as the main control node through a configuration file; when the main control node fails, a new main control node is selected according to the weight defined for each node in the configuration file. With this approach, manual intervention is needed to define the order in which the nodes of the cluster become the main control node.
The patent "High usable system based on multi TCP linking map" (publication number 1423197) describes, in its election submodule section (page 7/13 of that patent's specification), how a master server is elected:
When the system has just started, this submodule determines from the configuration file which server is the master server. The method is as follows. A server that has just started reads its configuration file and finds that it is configured as the master server. It does not elect itself master server immediately. Instead, it listens on the heartbeat receiving port for a period of time (at least longer than the heartbeat interval, so that it is guaranteed to receive the heartbeat messages of servers that have already started) and judges from the heartbeat messages received whether a master server already exists in the system. If one exists, the server does not elect itself master server, which avoids a conflict; if none exists, it elects itself master server. This removes any constraint on the startup order.
Afterwards, at a configurable interval, this submodule checks the state of the servers in the current system. When it finds that the master server has failed, it elects a new master server from the active backup servers; when it finds that a backup server has failed, it deletes that server from the linked list that records the servers. Election mechanism: the basic idea is to decide who acts as master server according to each server's id number, that is, to see whose id is smallest. The id number is given in each server's configuration file. Some time after the system starts, the state information of every server (hostlist_head) has been established; it holds the id number of each server, and hostlist_head is identical on every server. When a server fails, the other servers delete it from their own hostlist_head. Every server that is still active therefore holds the same set, formed from the id numbers of all active servers. Finding the minimum id in this identical set must give the same result everywhere, which guarantees the consistency of the election. Once the election result is known, each server compares it with its own id number. If they match, the server starts the IP takeover thread, which binds the system's external virtual IP address to its own external network card and also binds the gateway address of the internal LAN to its own internal network card.
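The minimum-id rule of the cited prior art can be sketched in a few lines. This is only an illustration of the idea as summarized above, not code from either patent; the function name `elect_master` and the use of a Python set in place of the hostlist_head linked list are assumptions.

```python
def elect_master(active_ids):
    """Return the smallest id among the servers still marked active.

    Every surviving server holds the same set of ids, so each one
    computes the same result without exchanging further messages.
    """
    if not active_ids:
        raise ValueError("no active servers to elect from")
    return min(active_ids)


# Hypothetical ids; in the cited scheme they come from configuration files.
host_list = {3, 7, 12}
first_master = elect_master(host_list)

# When server 3 fails, every survivor deletes it and re-runs the election.
host_list.discard(3)
new_master = elect_master(host_list)
```

The sketch also makes the drawback visible: the outcome depends entirely on ids that someone has to assign in advance.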
From the above, the method this module uses to elect a master server requires intervention through a configuration file; after the master server fails, the new master server is still selected by the id values that the configuration file assigns to each backup server. This method clearly has the following problems:
1. It needs manual intervention; that is, a person must designate the main control node.
2. It needs a configuration file.
Summary of the invention
Existing methods must designate the main control node through a configuration file and therefore lack flexibility. To address this, the present invention proposes a self-adaptive election algorithm for the main control node in a computer cluster. The cluster produces its main control node entirely on its own, without a configuration file and without manual intervention. The algorithm ensures that the cluster always has a main control node: when the main control node fails, the cluster automatically selects a node at random as the new main control node.
The technical solution of the present invention is a self-adaptive election algorithm for the main control node in a computer cluster, characterized in that every node has a finger-guessing module and a heartbeat detection module. In the initial state, the finger-guessing module of each node starts: each node first generates a random integer as its finger-guessing value and announces this value in the cluster; each node receives the finger-guessing values of the other nodes and compares them with its own. The comparison rule is as follows: if a node's own finger-guessing value is smaller than that of another node, it withdraws from the finger-guessing; if its value is larger than those of all other nodes, it announces to the cluster that it is the main control node; if another node's finger-guessing value is identical to its own, a new round of finger-guessing is carried out, until a main control node is selected. After the cluster enters the stable state with a main control node, the heartbeat detection module starts. The non-main-control nodes in the cluster periodically send detection packets to the main control node, and the main control node returns a response packet after receiving a detection packet. If a non-main-control node does not receive a response packet from the main control node within a certain (configurable) period, the main control node is deemed to have gone down, and the finger-guessing module is restarted in the new cluster to select a new main control node.
In the self-adaptive election algorithm described above, the finger-guessing module announces the finger-guessing value in the cluster using the multicast capability of UDP (User Datagram Protocol) in the TCP/IP suite.
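As a rough illustration of announcing a finger-guessing value over UDP multicast, the sketch below uses Python's standard socket and struct modules. The patent does not specify a wire format, a group address, or a port, so the 9-byte message layout, the address 239.1.2.3, port 50000, and all function names here are assumptions made for the example.

```python
import socket
import struct

# Hypothetical values chosen for this sketch; the patent fixes none of them.
MAILBOX_GROUP = "239.1.2.3"  # an address in the administratively scoped range
MAILBOX_PORT = 50000
MSG_GUESS = 1                # hypothetical message-type code


def encode_guess(node_id, value):
    """Pack message type, sender id and finger-guessing value (network byte order)."""
    return struct.pack("!BII", MSG_GUESS, node_id, value)


def decode_guess(payload):
    """Unpack a finger-guessing message into (node_id, value)."""
    msg_type, node_id, value = struct.unpack("!BII", payload)
    if msg_type != MSG_GUESS:
        raise ValueError("not a finger-guessing message")
    return node_id, value


def open_mailbox():
    """Create a UDP socket that has joined the mailbox multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", MAILBOX_PORT))
    membership = struct.pack(
        "4s4s", socket.inet_aton(MAILBOX_GROUP), socket.inet_aton("0.0.0.0")
    )
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
    return sock


def announce_guess(sock, node_id, value):
    """Send this node's finger-guessing value to every member of the group."""
    sock.sendto(encode_guess(node_id, value), (MAILBOX_GROUP, MAILBOX_PORT))
```

A node would call `open_mailbox()` once at startup and then `announce_guess(sock, node_id, value)` in each round. Because UDP gives no delivery guarantee, a real implementation would also need a policy for lost announcements, which the patent text does not discuss.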
The principle of the invention is to produce the main control node in a cluster by combining existing heartbeat detection technology with the finger-guessing algorithm specific to the invention.
The finger-guessing algorithm selects one node in a cluster as the main control node. After the cluster enters the stable state with a main control node, the heartbeat detection module starts and periodically checks the liveness of the main control node. If the main control node is found to have failed, the other nodes in the cluster start the finger-guessing module again and select a new main control node. The main method is as follows. When a node in the cluster starts, it judges whether it is itself the main control node. If it is, no finger-guessing is needed. Otherwise, it asks the cluster whether a main control node exists. If it receives a reply from a main control node, a main control node exists and no finger-guessing is needed either. Otherwise, the cluster has no main control node and each node carries out finger-guessing: each node first generates a random integer as its finger-guessing value and announces it in the cluster (using UDP multicast); each node receives the finger-guessing values of the other nodes and compares them with its own. The following rule is applied: if a node's own finger-guessing value is smaller than that of another node, it withdraws from the finger-guessing; if its value is larger than those of all other nodes, it announces to the cluster that it is the main control node; if another node's finger-guessing value is identical to its own, a new round of finger-guessing is carried out, until a main control node is selected.
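The comparison rule for a single node can be written as a small decision function. This is a minimal sketch under assumptions: the names `draw_guess` and `judge_round`, the string return values, and the value range are illustrative and do not come from the patent.

```python
import random


def draw_guess(rng=random):
    """Draw a random integer finger-guessing value (the range is an assumption)."""
    return rng.randrange(0, 2**31)


def judge_round(own, others):
    """Apply the comparison rule from one node's point of view.

    Returns "withdraw" if any other value is larger than ours,
    "master" if ours is larger than every other value,
    and "retry" if nobody beats us but at least one value equals ours.
    """
    if any(own < value for value in others):
        return "withdraw"
    if all(own > value for value in others):
        return "master"
    return "retry"


outcome_high = judge_round(5, [1, 3])  # larger than all others
outcome_low = judge_round(2, [1, 3])   # smaller than one other
outcome_tie = judge_round(3, [1, 3])   # tied for the highest value
```

With an empty `others` list the function returns "master", which matches the case where a node hears no other finger-guessing value and takes the role by itself.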
Heartbeat detection checks whether the main control node is alive in the cluster. It starts after the finger-guessing module has selected a main control node, and checks the liveness of the main control node by periodically sending detection packets and receiving responses. If the main control node is found to be down, the finger-guessing algorithm is started and a new main control node is selected in the cluster. The main method is as follows: the non-main-control nodes in the cluster periodically send detection packets to the main control node, and the main control node returns a response packet after receiving a detection packet. If a non-main-control node does not receive a response packet from the main control node within a certain (configurable) period, the main control node is deemed to have gone down, and the finger-guessing algorithm is restarted in the new cluster to select a new main control node.
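The timeout decision on a non-main-control node can be sketched as follows. The class name `MasterMonitor`, its method names, and the injectable clock are assumptions introduced so the logic can be exercised without real network traffic; the patent only states that the period is configurable.

```python
import time


class MasterMonitor:
    """Tracks the last response from the main control node and reports a timeout."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s       # configurable detection period
        self.clock = clock               # injectable so tests can fake time
        self.last_response = clock()

    def on_response(self):
        """Record that a response packet arrived from the main control node."""
        self.last_response = self.clock()

    def master_is_down(self):
        """True once no response has arrived for longer than the period."""
        return self.clock() - self.last_response > self.timeout_s


# Usage with a fake clock, purely for illustration.
fake_now = [0]
monitor = MasterMonitor(timeout_s=3, clock=lambda: fake_now[0])
fake_now[0] = 2
monitor.on_response()          # response seen at t=2
fake_now[0] = 4
alive_at_4 = not monitor.master_is_down()
fake_now[0] = 6
down_at_6 = monitor.master_is_down()
```

When `master_is_down()` becomes true, the node would start a new finger-guessing round. A monotonic clock is used by default so that wall-clock adjustments do not trigger a spurious election.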
The algorithm of the present invention has the following advantages:
1. The main control node of the cluster is selected automatically and at random, rather than being designated through a configuration file.
2. After the main control node goes down, the other nodes in the cluster detect this and automatically select a new main control node.
3. The algorithm makes the cluster scalable: when nodes are added to or removed from the cluster, the cluster adapts automatically.
Description of the drawings
Fig. 1 is the workflow diagram of the finger-guessing module.
Fig. 2 is a schematic diagram of the cluster electing a main control node.
Fig. 3 is a schematic diagram of heartbeat detection in a cluster that has a main control node.
Detailed description of the embodiments
Fig. 1 describes the workflow of the finger-guessing module.
Step 10: The module begins initialization.
Step 11: The node checks whether it is the main control node; the default initial state is non-main-control node. If it is the main control node, go to step 12; otherwise, go to step 14.
Step 12: If the node is the main control node, it checks whether it has received a message from another, non-main-control node asking whether a main control node exists in the cluster. If it has received such an inquiry, go to step 13; otherwise, return to step 12.
Step 13: After receiving the inquiry, the main control node replies to the inquirer with an "I am the main control node" message.
Step 14: If the node is a non-main-control node, it asks all members of the cluster whether a main control node exists. This uses the multicast feature of UDP: one multicast address is designated as the mailbox of the cluster, and a message that a node sends to this mailbox is received by all members of the cluster.
Step 15: The node checks whether it receives a reply from a main control node. This uses the point-to-point feature of UDP: every node collects messages from the multicast address defined as the mailbox, and if a main control node exists in the cluster, it must, after receiving the inquiry, send an "I am the main control node" reply point-to-point to the inquirer. If this node receives the reply, go to step 16; otherwise, go to step 17.
Step 16: Having received the reply from the main control node, this node accepts the control of the main control node.
Step 17: Having received no reply from a main control node, the node enters the finger-guessing flow: it generates a random integer as its finger-guessing value and announces it in the cluster.
Step 18: The node checks whether it receives the finger-guessing values of other nodes in the cluster. If it does, go to step 110; otherwise, go to step 19.
Step 19: If no message is received from other nodes in the cluster, this node becomes the main control node.
Step 110: The main control node periodically sends an "I am the main control node" message to the other nodes in the cluster, mainly to report its own liveness to them.
Step 111: If the node receives finger-guessing values from other nodes in the cluster, it compares them with its own finger-guessing value.
Step 112: If the finger-guessing value of this node is smaller than a received value, go to step 113; otherwise, go to step 17.
Step 113: Because the finger-guessing value of this node is smaller than a value received from another node, this node withdraws from the election of the main control node.
Step 114: The node receives the multicast message of the main control node for the first time, which shows that the cluster already has, or has just produced, a main control node.
Step 115: The node starts a timer that monitors the liveness of the main control node. A period can be specified such that, if no message from the main control node is received within it, the main control node is deemed dead.
Step 116: The node determines the type of the message received.
Step 117: If a timeout message is received, the main control node is dead and the new cluster has no main control node, so the election must be run again by finger-guessing; go to step 17.
Step 118: If a message from the main control node is received, the main control node is alive; the node cancels the previous timer and goes to step 115.
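The election part of this flow (steps 17 to 19 and 111 to 113) can be simulated in memory to see how ties lead to extra rounds. The sketch below is not the patented implementation: the function `run_election`, the in-memory exchange that stands in for the multicast mailbox, and the assumption that only the nodes tied for the highest value re-draw are all choices made for illustration.

```python
import random


def run_election(node_ids, rng, max_value=2**31 - 1):
    """Simulate finger-guessing rounds; return (master_id, rounds_played)."""
    candidates = list(node_ids)
    if not candidates:
        raise ValueError("an election needs at least one node")
    rounds = 0
    while True:
        rounds += 1
        # Step 17: every remaining candidate draws and announces a value.
        guesses = {node: rng.randint(0, max_value) for node in candidates}
        top = max(guesses.values())
        # Steps 111-113: nodes holding a smaller value withdraw.
        leaders = [node for node, value in guesses.items() if value == top]
        if len(leaders) == 1:
            # A unique highest value (or a lone node, step 19) wins.
            return leaders[0], rounds
        # Tie for the highest value: the tied nodes play another round.
        candidates = leaders


# A tiny value range makes ties, and therefore extra rounds, likely.
master, rounds = run_election([1, 2, 3, 4], random.Random(0), max_value=2)
```

With a 31-bit value range ties are rare, so an election over a handful of nodes almost always finishes in one round; the small `max_value` in the usage line only serves to exercise the retry path.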
Fig. 2 describes the election of the main control node in a concrete example.
The figure shows 4 nodes, all non-main-control nodes, and the directions in which this module sends and receives messages during the election. Note that the mailbox here is a multicast IP address; nodes exchange messages by sending to and receiving from this IP address.
Fig. 3 describes the detection of the main control node in a concrete example.
Here node 1 has become the main control node. It periodically sends heartbeat messages to the mailbox, and the other, non-main-control nodes receive these heartbeat messages through the mailbox. If a node does not receive a heartbeat message from the main control node within the specified time, it enters the election flow for a new main control node.
Although the present invention has been described by way of embodiments, those of ordinary skill in the art will appreciate that many variations and modifications are possible without departing from the spirit of the invention, and the appended claims are intended to cover such variations and modifications.

Claims (2)

1. A self-adaptive election algorithm for the main control node in a computer cluster, characterized in that: every node has a finger-guessing module and a heartbeat detection module; in the initial state, the finger-guessing module of each node starts: each node first generates a random integer as its finger-guessing value and announces this value in the cluster; each node receives the finger-guessing values of the other nodes and compares them with its own; the comparison rule is that if a node's own finger-guessing value is smaller than that of another node, it withdraws from the finger-guessing; if its value is larger than those of all other nodes, it announces to the cluster that it is the main control node; if another node's finger-guessing value is identical to its own, a new round of finger-guessing is carried out, until a main control node is selected; after the cluster enters the stable state with a main control node, the heartbeat detection module starts; the non-main-control nodes in the cluster periodically send detection packets to the main control node, and the main control node returns a response packet after receiving a detection packet; if a non-main-control node does not receive a response packet from the main control node within a certain period, the main control node is deemed to have gone down, and the finger-guessing module is restarted in the new cluster to select a new main control node.
2. The self-adaptive election algorithm for the main control node in a computer cluster as claimed in claim 1, characterized in that the finger-guessing module announces the finger-guessing value in the cluster using UDP multicast in the TCP/IP suite.
CNB2005100182320A 2005-02-01 2005-02-01 Self adaptable electing algorithm for main controlled node in group Expired - Fee Related CN100420251C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100182320A CN100420251C (en) 2005-02-01 2005-02-01 Self adaptable electing algorithm for main controlled node in group

Publications (2)

Publication Number Publication Date
CN1645862A CN1645862A (en) 2005-07-27
CN100420251C true CN100420251C (en) 2008-09-17

Family

ID=34875701

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100182320A Expired - Fee Related CN100420251C (en) 2005-02-01 2005-02-01 Self adaptable electing algorithm for main controlled node in group

Country Status (1)

Country Link
CN (1) CN100420251C (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998044733A1 (en) * 1997-03-31 1998-10-08 Broadband Associates Method and system for providing a presentation on a network
US20030002494A1 (en) * 2001-07-02 2003-01-02 Arttu Kuukankorpi Processing of data packets within a network element cluster
CN1423197A (en) * 2002-12-16 2003-06-11 华中科技大学 High usable system based on multi TCP linking map
WO2003061237A2 (en) * 2002-01-18 2003-07-24 International Business Machines Corporation Master node selection in clustered node configurations
US20040243709A1 (en) * 2003-05-27 2004-12-02 Sun Microsystems, Inc. System and method for cluster-sensitive sticky load balancing

Also Published As

Publication number Publication date
CN1645862A (en) 2005-07-27

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee