CN108600328B - Cluster election method and device - Google Patents

Publication number
CN108600328B
Authority
CN
China
Prior art keywords
node
cluster
peon
election
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810270749.6A
Other languages
Chinese (zh)
Other versions
CN108600328A (en)
Inventor
张岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou H3C Technologies Co Ltd
Original Assignee
Hangzhou H3C Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou H3C Technologies Co Ltd
Priority to CN201810270749.6A
Publication of CN108600328A
Application granted
Publication of CN108600328B
Legal status: Active
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Hardware Redundancy (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application provides a cluster election method and device. The method comprises: when a first node newly joins a cluster, detecting whether a leader node and a peon_master node have been elected in the cluster; if the leader node and the peon_master node have been elected, sending a first election request message to the nodes in the cluster, the first election request message being used to re-elect the peon_master node; and if the leader node and the peon_master node have not been elected, sending a second election request message to the nodes in the cluster, the second election request message being used to elect the leader node and the peon_master node. By subdividing the peon role into types and dividing the cluster into election areas, the method protects a cluster that is already running stably, narrows the scope affected by an election, and allows new nodes to join without threatening a stable leader role.

Description

Cluster election method and device
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method and an apparatus for cluster election.
Background
Ceph is currently a popular distributed storage system, and the monitor nodes in a Ceph cluster play an important role in managing, maintaining, and publishing the state information of the cluster. To avoid a single point of failure or a performance hot spot, a Ceph cluster generally has a plurality of monitor nodes. To keep these monitor nodes consistent, a leader must be elected from among them; the leader is responsible for updating the cluster state information and maintaining the quorum members (committee members, i.e., the nodes that supported the monitor node being elected as leader).
The current monitor leader election mechanism is as follows: at initialization, each monitor is assigned a rank value according to its IP (Internet Protocol) address; when a leader is elected, the monitor with the smallest rank value wins and becomes the leader; when the quorum membership changes (increases or decreases), the election process is triggered again and a new leader is elected.
The drawbacks of this election rule are as follows: once a new monitor joins the Ceph cluster, the election process is triggered again, and during the election the Ceph cluster cannot work because it has no leader node. In some special cases, when a monitor node has a false (half-open) connection, connects and disconnects repeatedly, or suffers large network delay, the monitor leader election is triggered repeatedly; the Ceph cluster then oscillates and cannot provide stable service to the outside.
Disclosure of Invention
In view of this, the present application provides a method and an apparatus for cluster election, so as to optimize an election mechanism and enhance stability, continuity, and robustness of a cluster.
Specifically, the method is realized through the following technical scheme:
In a first aspect of the present application, a cluster election method is provided, which is applied to a first node and includes:
when the first node newly joins the cluster, detecting whether a leader node and a peon_master node have been elected in the cluster, wherein the rank value of the leader node is the smallest among the nodes that supported its election as leader, and the rank value of the peon_master node is the largest among the nodes that supported its election as peon_master;
if the leader node and the peon_master node have been elected, sending a first election request message to the nodes in the cluster, wherein the first election request message is used to re-elect the peon_master node;
and if the leader node and the peon_master node have not been elected, sending a second election request message to the nodes in the cluster, wherein the second election request message is used to elect the leader node and the peon_master node.
In a second aspect of the present application, a cluster election device is provided, which is applied to a first node and has a function of implementing the method provided in the first aspect. The functions can be realized by hardware, and the functions can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules or units corresponding to the above functions.
In one possible implementation, the apparatus may include:
a detection unit, configured to detect, when the first node newly joins the cluster, whether a leader node and a peon_master node have been elected in the cluster, wherein the rank value of the leader node is the smallest among the nodes that supported its election as leader, and the rank value of the peon_master node is the largest among the nodes that supported its election as peon_master;
an election unit, configured to send a first election request message to the nodes in the cluster if the leader node and the peon_master node have been elected, where the first election request message is used to re-elect the peon_master node; and to send a second election request message to the nodes in the cluster if the leader node and the peon_master node have not been elected, where the second election request message is used to elect the leader node and the peon_master node.
In another possible implementation manner, the apparatus may include a communication interface, a processor, a memory, and a bus, where the communication interface, the processor, and the memory are connected to each other through the bus; the processor executes the method provided by the first aspect of the present application by reading the logic instructions stored in the memory.
In this scheme, subdividing the peon role into types and dividing the cluster into areas protects a cluster that is currently running stably and providing service. Introducing hierarchical areas narrows the "election domain" and the scope affected by an election; in effect, a barrier is placed around the leader node, so that adding a new node does not threaten a stable leader role. This ensures the service stability and continuity of the cluster and improves its robustness. The scheme inherits the original monitor election algorithm of the Ceph cluster, requires few modifications, and is easy to implement. It also addresses service interruption caused by frequent leader elections that result from network problems or attacks.
Drawings
FIG. 1 is a system architecture diagram of a Ceph cluster;
FIG. 2 is a schematic diagram of the role areas of a Ceph cluster provided herein;
FIG. 3 is a flow chart of a method provided herein;
FIG. 4 is a block diagram of functional blocks of the apparatus provided herein;
FIG. 5 is a hardware configuration diagram of the apparatus shown in FIG. 4 provided in the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.
Referring to FIG. 1, the architecture of a Ceph cluster can be divided into the following four parts: client, MDS (Metadata Server), OSD (Object Storage Device), and monitor node. When there are multiple monitor nodes in the Ceph cluster, a leader must be produced by election. For convenience of description, the monitor node that holds the leader role is referred to as the leader node, and the remaining monitor nodes are referred to as peon nodes. The leader node is responsible for coordinating updates to the cluster map (cluster information) of the other monitor nodes. The leader node issues a lease to the other monitor nodes at regular intervals; a monitor holding a valid lease can provide read service to the outside, otherwise the data on that monitor node is considered outdated and it cannot provide service. Here, the cluster map is a combination of several maps, including the monitor map, OSD map, PG (placement group) map, CRUSH map, MDS map, and the like.
The current monitor leader election roughly includes the following three steps:
1) A proposer initiates an election and sends a propose message to all monitor nodes according to the monitor list in the configuration file, the propose message carrying the proposer's own rank value; each monitor node that receives the propose message also initiates an election.
Here, a proposer is an election initiator, and this role can initiate only one election request at a time; the receiver of the request is called an acceptor, and it can accept only one election request at a time. During an election, each node may hold both roles.
2) The other monitor nodes (acceptors) receive the propose message and compare the rank value carried in the message with their own. If an acceptor accepts the proposer as leader (that is, the proposer's rank value is smaller than its own), it responds with an ACK message; otherwise it does not respond.
3) The proposer counts the ACK messages it receives. If the election has not timed out and the proposer has received ACK messages from all the other nodes, it sends a victory (election success) message to the other monitor nodes to announce that it has won; if the election has timed out, the proposer wins if the number of ACK messages it has counted exceeds half. If the proposer has itself responded with an ACK message to another proposer, it no longer counts the ACK messages it receives.
After the election, each monitor node becomes either the leader node or a peon node. The election not only produces the leader node but also determines the quorum members, which are the nodes that supported the new leader node in its election. Therefore, although the rank value of the leader node is not guaranteed to be the smallest among all nodes, it is guaranteed to be the smallest within the quorum. The quorum is a majority of the monitor nodes, and its member count must be greater than N/2+1, where N is the number of monitor nodes.
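The three steps above can be sketched as a small simulation. This is a minimal sketch and not Ceph's actual implementation: the function names, the `down` parameter, and the reading of "exceeds half" as more than half of the other monitors are all assumptions made for illustration.

```python
def should_ack(own_rank: int, proposer_rank: int) -> bool:
    # Step 2: an acceptor supports a proposer only if the proposer's rank is smaller.
    return proposer_rank < own_rank


def proposer_wins(acks: int, other_nodes: int, timed_out: bool) -> bool:
    # Step 3: all others must ACK before the timeout; after it, a majority suffices.
    # Assumption: "exceeds half" is read as more than half of the other monitors.
    if not timed_out:
        return acks == other_nodes
    return acks > other_nodes / 2


def elect_leader(ranks, down=frozenset()):
    """Return the rank of the elected leader, or None if nobody can win.

    ranks: rank values of all configured monitors.
    down:  ranks of monitors that never answer (hypothetical failure model).
    """
    alive = [r for r in ranks if r not in down]
    timed_out = bool(down)  # silent monitors force proposers to wait for the timeout
    for proposer in alive:
        others = [r for r in alive if r != proposer]
        # A node that has ACKed another proposer stops counting its own ACKs.
        if any(should_ack(proposer, o) for o in others):
            continue
        acks = sum(1 for o in others if should_ack(o, proposer))
        if proposer_wins(acks, len(ranks) - 1, timed_out):
            return proposer
    return None


print(elect_leader([3, 1, 2]))                   # lowest rank wins
print(elect_leader([0, 1, 2, 3, 4], down={0}))   # lowest reachable rank wins on timeout
```

The second call illustrates why the leader's rank is only guaranteed to be the smallest within the quorum: the monitor with rank 0 exists but did not take part.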
The following three cases may trigger re-election of the leader: 1) the leader node fails: if the lease on a peon node expires and the leader node does not renew the lease for that peon node, the leader is re-elected; 2) any peon node fails: if a peon node does not periodically return an ACK (Acknowledgement) message to the leader node, the leader is re-elected; 3) a new monitor node joins, or an existing monitor node powers off or fails.
Because the current monitor leader election mechanism relies solely on the principle of choosing the monitor node with the smallest rank value as leader, an election is triggered again whenever a new monitor joins the Ceph cluster; the Ceph cluster then oscillates and cannot provide stable service.
Aiming at the problem, the cluster stability is taken as a starting point, and the following optimization is made on the basis of the existing monitor leader election mechanism:
1) On the basis of the two original roles, leader and peon, the peon role is subdivided into peon_master and peon_slave.
During the initial election, all three roles need to be elected: the rank value of the leader node is the smallest among the nodes that supported its election as leader, the rank value of the peon_master node is the largest among the nodes that supported its election as peon_master, and all nodes other than the leader node and the peon_master node are peon_slave nodes.
After the roles are elected, two role areas are formed: a central election area and a local election area. As shown in FIG. 2, the local election area consists of the monitor nodes that hold the peon_master and peon_slave roles, and the central election area consists of all monitor nodes.
2) Two new election request messages are defined, as follows:
first election request message: used to trigger election of the peon_master role within the "local election area";
second election request message: used to trigger election of the leader role and the peon_master role within the "central election area".
The two election request messages can be distinguished by a TLV (type-length-value) field in the message body that carries different type values. For example, it may be defined that a propose message carrying a TLV field with type value 0x01 is the first election request message, and a propose message carrying a TLV field with type value 0x02 is the second election request message.
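The TLV distinction described above can be illustrated with a short encoding sketch. Only the type values 0x01 and 0x02 come from the text; the field widths (1-byte type, 2-byte length), the 4-byte rank payload, and all function names are assumptions chosen for illustration.

```python
import struct

FIRST_ELECTION_REQUEST = 0x01   # type value given in the text
SECOND_ELECTION_REQUEST = 0x02  # type value given in the text


def encode_tlv(type_value: int, payload: bytes) -> bytes:
    # Assumed layout: 1-byte type, 2-byte big-endian length, then the value.
    return struct.pack("!BH", type_value, len(payload)) + payload


def decode_tlv(buf: bytes):
    type_value, length = struct.unpack_from("!BH", buf, 0)
    value = buf[3:3 + length]
    if len(value) != length:
        raise ValueError("truncated TLV value")
    return type_value, value


def build_election_request(type_value: int, rank: int) -> bytes:
    # Assumption: the sender's rank value travels as a 4-byte unsigned integer.
    return encode_tlv(type_value, struct.pack("!I", rank))


def parse_election_request(buf: bytes):
    type_value, value = decode_tlv(buf)
    (rank,) = struct.unpack("!I", value)
    return type_value, rank


msg = build_election_request(FIRST_ELECTION_REQUEST, 7)
print(parse_election_request(msg))
```

A receiver would branch on the decoded type value to decide whether the request concerns only the local election area or the whole central election area.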
3) Four new ACK messages are defined, as follows:
first and second types of ACK message (hereinafter referred to as the first and second ACK messages): used to respond to an election request message. For example, on receiving a first or second election request message, if the receiving end determines that its own rank value is smaller than that of the sending end, it responds to the sending end with a first ACK message; on receiving a second election request message, if the receiving end determines that its own rank value is larger than that of the sending end, it responds to the sending end with a second ACK message;
third and fourth types of ACK message (hereinafter referred to as the third and fourth ACK messages): used to respond to a probe message, informing the sender of the probe message whether a leader node and a peon_master node have been elected in the current cluster. For example, it may be agreed in advance that the third ACK message indicates that the leader node and the peon_master node have already been elected in the current cluster, and the fourth ACK message indicates that they have not yet been elected.
These four types of ACK message can also be distinguished by TLV fields in the message body that carry different type values.
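The response rules for the four ACK types can be summarized as two small decision functions. This is a sketch under stated assumptions: the enum and function names are hypothetical, and the case of equal rank values, which the text does not address, is treated here as "no response".

```python
from enum import Enum
from typing import Optional


class Request(Enum):
    FIRST = 1    # re-elect peon_master only
    SECOND = 2   # elect leader and peon_master


class Ack(Enum):
    FIRST = 1    # receiver's rank is smaller than the sender's
    SECOND = 2   # receiver's rank is larger than the sender's
    THIRD = 3    # leader and peon_master already elected
    FOURTH = 4   # leader and peon_master not yet elected


def election_ack(request: Request, own_rank: int, sender_rank: int) -> Optional[Ack]:
    """Pick the ACK to send for an election request, or None for no response."""
    if own_rank < sender_rank:
        return Ack.FIRST
    if request is Request.SECOND and own_rank > sender_rank:
        return Ack.SECOND
    return None  # includes equal ranks, which the text leaves unspecified


def probe_ack(roles_elected: bool) -> Ack:
    """Pick the ACK to send for a probe message."""
    return Ack.THIRD if roles_elected else Ack.FOURTH


print(election_ack(Request.SECOND, own_rank=5, sender_rank=2))
```

Note that a first ACK supports the sender as peon_master (largest rank among supporters), while a second ACK supports the sender as leader (smallest rank among supporters).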
4) Two new victory messages are defined, as follows:
first type of victory message (hereinafter referred to as the first victory message): used to announce successful election as the peon_master node;
second type of victory message (hereinafter referred to as the second victory message): used to announce successful election as the leader node.
The two types of victory message can also be distinguished by TLV fields in the message body that carry different type values.
Based on the above description, the cluster election method provided by the present application is described below.
Referring to fig. 3, in one embodiment, a first monitor node (hereinafter referred to simply as a first node) that newly joins a Ceph cluster (hereinafter referred to simply as a cluster) may perform the following steps 301-303.
It should be noted that, in the embodiment of the present application, the first node does not refer to a fixed monitor node, but may refer to any monitor node, and the description of the embodiment of the present application will not be repeated later. Moreover, the cluster election method provided by the embodiment of the application is not only suitable for election of a monitor node inside a Ceph cluster, but also suitable for election of nodes inside other types of clusters, such as an SDN (Software defined network) cluster.
Step 301: when the first node newly joins the cluster, it detects whether a leader node and a peon_master node have been elected in the cluster.
Here, the first node newly joining the cluster may send a probe message to the nodes in the cluster according to the configuration file, and each node receiving the probe message replies to the first node with a third ACK message or a fourth ACK message according to the actual election state. Specifically, if a leader node and a peon_master node have been elected in the cluster, the node receiving the probe message may return a third ACK message to the first node; if they have not been elected, the node receiving the probe message may return a fourth ACK message to the first node.
The first node may determine the actual election state of the cluster from the type of ACK message received. Optionally, the first node may also assess its own network connectivity from the number of ACK messages that nodes in the cluster return in reply to the probe message. In one embodiment, if the first node determines that its network connectivity is poor, it may exit the cluster without performing the subsequent step 302 or step 303, so as to avoid cluster oscillation caused by its own network condition.
Step 302: if the leader node and the peon_master node have been elected, the first node sends a first election request message to the nodes in the cluster, the first election request message being used to re-elect the peon_master node.
Step 303: if the leader node and the peon_master node have not been elected, the first node sends a second election request message to the nodes in the cluster, the second election request message being used to elect the leader node and the peon_master node.
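Steps 301 to 303 amount to a small decision made by the joining node. The sketch below is illustrative only: the function name, the string labels, the 0.5 reply-ratio threshold for "poor connectivity", and the rule of following the more common reply type when replies are mixed are all assumptions, since the text does not fix these details.

```python
from collections import Counter


def choose_request_on_join(replies, peers, min_reply_ratio=0.5):
    """Decide what a newly joined node does after probing the cluster.

    replies: list of "third_ack" / "fourth_ack" strings, one per reply received.
    peers:   number of nodes the probe message was sent to.
    """
    # Optional connectivity check: too few replies suggests a poor network,
    # so the node leaves instead of risking cluster oscillation.
    if peers == 0 or len(replies) / peers < min_reply_ratio:
        return "leave"
    counts = Counter(replies)
    # Assumption: with mixed replies, follow the more common answer.
    if counts["third_ack"] > counts["fourth_ack"]:
        return "first_election_request"   # step 302: re-elect peon_master only
    return "second_election_request"      # step 303: elect leader and peon_master


print(choose_request_on_join(["third_ack"] * 3, peers=4))
```

The key point is that a healthy cluster with roles already elected only ever receives the first election request from a newcomer, which leaves the leader untouched.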
In the embodiment of the application, when a leader node and a peon_master node have already been elected in the cluster, a newly joined node triggers only a "small election" within the local election area; that is, only the peon_master role is elected and the leader role is not. The specific election process is as follows:
1) The first node sends a first election request message to the nodes in the cluster, the first election request message being used to re-elect the peon_master node and carrying the rank value of the first node.
2) Each peon node in the cluster (the peon_master node and the peon_slave nodes), after receiving the first election request message sent by the first node, compares the rank value of the first node carried in the message with its own rank value; if the rank value of the first node is larger than its own, it responds to the first node with a first ACK message, otherwise it does not respond.
Meanwhile, each peon node that receives the first election request message also initiates an election and sends a third election request message to the other nodes in the cluster (including the newly joined first node), the third election request message likewise being used to re-elect the peon_master node. During a "small election", each node can initiate only one election; that is, a node sends at most one election request message to any other node.
3) After receiving the first election request message from the first node, the leader node in the cluster ignores the message and does not process it.
4) The first node and the original peon nodes in the cluster each count the first ACK messages they receive; a node that has itself responded with a first ACK message to another node stops counting the first ACK messages it receives.
When a node determines that it has received first ACK messages from all nodes in the cluster other than the leader node before the election times out, or that it has received first ACK messages from at least half of the nodes other than the leader node once the election has timed out, the node may send a first victory message to the nodes in the cluster to announce that it has been elected as the peon_master node.
The small election process described above does not affect the leader node; the leader node continues to issue leases and synchronize the cluster map to each peon node, which ensures the stability of the cluster.
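The "small election" can be sketched as a simulation in which the leader never votes and never competes. This is a minimal sketch assuming distinct rank values; the function name, the `silent` parameter (nodes that never answer, which forces the timeout path), and the treatment of "at least half" as half of the other non-leader nodes are assumptions for illustration.

```python
def small_election(leader_rank, peon_ranks, new_rank, silent=frozenset()):
    """Return the rank of the new peon_master, or None if nobody wins.

    The leader ignores first election requests, so it is excluded entirely.
    """
    voters = [r for r in list(peon_ranks) + [new_rank] if r != leader_rank]
    active = [r for r in voters if r not in silent]
    timed_out = bool(silent)      # silent nodes mean the election runs to timeout
    others_total = len(voters) - 1
    for candidate in active:
        # A node that ACKed a larger-ranked node stops counting its own ACKs.
        if any(o > candidate for o in active):
            continue
        # Every active node with a smaller rank sends this candidate a first ACK.
        acks = sum(1 for o in active if o < candidate)
        if not timed_out and acks == others_total:
            return candidate
        if timed_out and acks >= others_total / 2:
            return candidate
    return None


# Leader has rank 1; peons are 2, 3, 5; a node with rank 4 joins.
print(small_election(1, [2, 3, 5], 4))   # existing peon_master (5) keeps its role
print(small_election(1, [2, 3, 5], 9))   # newcomer with the largest rank takes over
```

In both calls the leader with rank 1 is unaffected, which is the property the text attributes to the local election area.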
In the embodiment of the application, when a leader node and a peon_master node have not been elected in the cluster, a newly joined node triggers a "large election" within the central election area; that is, both the leader role and the peon_master role are elected. The specific election process is as follows:
1) The first node sends a second election request message to the nodes in the cluster, the second election request message being used to elect the leader node and the peon_master node and carrying the rank value of the first node.
2) Each node in the cluster, after receiving the second election request message sent by the first node, compares the rank value of the first node carried in the message with its own rank value; if the rank value of the first node is larger than its own, it responds to the first node with a first ACK message, and if the rank value of the first node is smaller than its own, it responds to the first node with a second ACK message.
Meanwhile, each node that receives the second election request message also initiates an election and sends a fourth election request message to the other nodes in the cluster (including the newly joined first node), the fourth election request message likewise being used to elect the leader node and the peon_master node. During a "large election", each node can initiate only one election; that is, a node sends at most one election request message to any other node.
3) The first node and the original nodes in the cluster each count the first ACK messages and the second ACK messages they receive. A node that has itself responded with a first ACK message stops counting the first ACK messages it receives; similarly, a node that has itself responded with a second ACK message stops counting the second ACK messages it receives.
When a node determines that it has received first ACK messages from all nodes in the cluster before the election times out, or that it has received first ACK messages from at least half of the nodes in the cluster once the election has timed out, the node may send a first victory message to the nodes in the cluster to announce that it has been elected as the peon_master node.
When a node determines that it has received second ACK messages from all nodes in the cluster before the election times out, or that it has received second ACK messages from at least half of the nodes in the cluster once the election has timed out, the node may send a second victory message to the nodes in the cluster to announce that it has been elected as the leader node.
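The outcome of a "large election" in the simple case where every node responds before the timeout can be sketched as follows. The function name and role labels are hypothetical, rank values are assumed to be distinct, and the timeout path (at least half of the nodes) is deliberately left out of this sketch.

```python
def big_election(ranks):
    """Assign roles after a large election in which no node is silent.

    Each node tallies the first and second ACK messages the other nodes
    would send it under the comparison rule described above.
    """
    roles = {}
    others = len(ranks) - 1
    for node in ranks:
        # Nodes with a smaller rank than `node` send it a first ACK.
        first_acks = sum(1 for o in ranks if o < node)
        # Nodes with a larger rank than `node` send it a second ACK.
        second_acks = sum(1 for o in ranks if o > node)
        if second_acks == others:
            roles[node] = "leader"        # would send the second victory message
        elif first_acks == others:
            roles[node] = "peon_master"   # would send the first victory message
        else:
            roles[node] = "peon_slave"
    return roles


print(big_election([4, 1, 3, 2]))
```

With all nodes responding, the smallest rank collects every second ACK and becomes leader, the largest rank collects every first ACK and becomes peon_master, and the rest remain peon_slave nodes.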
After the above "large election" or "small election" process, the first node becomes a leader node, a peon_master node, or a peon_slave node.
In the first case, when the first node joins the cluster as the peon_master node, the original peon_master node in the cluster steps down to become a peon_slave node. The new peon_master node, that is, the first node, may send an update message to the leader node in the cluster, informing the leader node that a node has newly joined the cluster and that its role is peon_master. After receiving the update message, the leader node adds the information of the first node to the monitor map and the monitor peon map, and synchronizes the cluster information, including the updated node information and peon node information, to all peon nodes without waiting for the next period.
In the second case, when the first node joins the cluster as a peon_slave node, the peon_master node in the cluster sends an update message to the leader node in the cluster, informing the leader node that a node has newly joined the cluster and that its role is peon_slave. After receiving the update message, the leader node adds the information of the first node to the node information and the peon node information, and synchronizes the cluster information, including the updated node information and peon node information, to all peon nodes.
In the third case, when the first node joins the cluster as the leader node, the original leader node in the cluster steps down to become a peon_slave node. The new leader node, that is, the first node, may add its own information to the node information and synchronize the cluster information, including the updated node information, to all peon nodes.
The election strategy when a new node joins a cluster is described above. In the existing monitor leader election mechanism, not only the addition of a new node can trigger the election again, but also the failure (or power failure) of any node in the cluster can trigger the election again; for the latter described node failure situation, the following strategies are also proposed by the present application:
when a peon _ master node or a peon _ slave node in a cluster detects that a leader node in the cluster fails, the node which detects the failure may trigger a large election in a central election area, that is, send a fifth election request message to other nodes in the cluster, where the fifth election request message is used for reselecting the leader node and the peon _ master node.
When a leader node in a cluster detects that a peon _ master node or a peon _ slave node in the cluster fails, calculating the current leader support rate of the leader node; if the current leader support rate of the leader node is greater than 1/2(1/2 is a preset first threshold value, namely 50%), the leader node may trigger small election in a "local election area", which indicates that any peon node sends a sixth election request message to the nodes in the cluster, where the sixth election request message is used for reselecting a peon _ master node; if the current leader support rate of the leader node is less than or equal to 1/2, the leader node can trigger large election in the central election area, and send a seventh election request message to the nodes in the cluster, wherein the seventh election request message is used for reselecting the leader node and the peon _ master node.
The current leader support rate of the leader node is (M-1)/N, M is the number of second ACK messages received when the leader node is elected as the leader node, and N is the number of nodes in the cluster (including the failed node).
When a peon _ master node in a cluster detects that a peon _ slave node in the cluster fails, the current peon _ master support rate of the peon _ master node can be calculated; if the current peon _ master support rate of the peon _ master node is greater than 1/2(1/2 is a preset second threshold value, namely 50%), the peon _ master node can trigger small election in a local election area, and send an eighth election request message to the nodes in the cluster, wherein the eighth election request message is used for reselecting the peon _ master node; if the current peon _ master support rate of the peon _ master node is not more than 1/2, the peon _ master node triggers large election in the central election area, and sends a ninth election request message to the nodes in the cluster, wherein the ninth election request message is used for reselecting the leader node and the peon _ master node.
The current peon_master support rate of the peon_master node is (K-1)/N, where K is the number of first ACK messages received when the peon_master node was elected as the peon_master node, and N is the number of nodes in the cluster (including the failed node).
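The support-rate rule described above can be sketched as follows. This is only an illustrative sketch: the names `ElectionScope`, `support_rate`, and `choose_scope` are hypothetical and do not appear in the application, and the default threshold of 0.5 simply mirrors the 1/2 example value given in the text.

```python
from enum import Enum


class ElectionScope(Enum):
    LOCAL = "local"      # small election: reselect the peon_master only
    CENTRAL = "central"  # large election: reselect leader and peon_master


def support_rate(acks_at_election: int, cluster_size: int) -> float:
    """Return (acks - 1) / N, where N includes the failed node."""
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    return (acks_at_election - 1) / cluster_size


def choose_scope(acks_at_election: int, cluster_size: int,
                 threshold: float = 0.5) -> ElectionScope:
    """Pick the election area after a failure is detected.

    A rate strictly greater than the threshold keeps the election local;
    a rate less than or equal to the threshold triggers a central election.
    """
    rate = support_rate(acks_at_election, cluster_size)
    return ElectionScope.LOCAL if rate > threshold else ElectionScope.CENTRAL
```

For instance, under these assumptions a leader elected with M = 4 second ACK messages in a cluster of N = 5 nodes has a support rate of 3/5, so a small election in the local election area would be triggered.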
It should be noted that, among the first to ninth election request messages appearing in the embodiment of the present application, the first, third, sixth, and eighth election request messages are the same type of election request message and are used to trigger election of the peon_master role in the "local election area"; the second, fourth, fifth, seventh, and ninth election request messages are the same type of election request message and are used to trigger election of the leader role and the peon_master role in the "central election area".
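The grouping of the nine election request messages into two types can be written down as a small lookup. This is a sketch only; the function name `roles_to_elect` and the use of integer ordinals to identify the messages are assumptions made for illustration.

```python
from typing import Tuple

# Ordinals of the election request messages, grouped by type as in the text.
LOCAL_REQUESTS = frozenset({1, 3, 6, 8})       # "local election area"
CENTRAL_REQUESTS = frozenset({2, 4, 5, 7, 9})  # "central election area"


def roles_to_elect(ordinal: int) -> Tuple[str, ...]:
    """Return the roles that an election request of this ordinal reselects."""
    if ordinal in LOCAL_REQUESTS:
        return ("peon_master",)
    if ordinal in CENTRAL_REQUESTS:
        return ("leader", "peon_master")
    raise ValueError(f"unknown election request ordinal: {ordinal}")
```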
In the embodiment of the application, the first threshold and the second threshold are set according to an actual scene, and the first threshold and the second threshold may be the same or different.
This completes the description of the flow shown in fig. 3.
As can be seen from the flow shown in fig. 3, the embodiment of the present application subdivides the peon role into types and divides the cluster into election areas, which protects a cluster that is currently running stably and providing services. Introducing hierarchical areas narrows the "election domain" and therefore the range affected by an election, which is equivalent to placing a barrier around the leader node: a newly added node does not threaten the stable leader role, so the service stability and continuity of the cluster are ensured and its robustness is improved. The embodiment of the application can inherit the original monitor election algorithm of the Ceph cluster, requires few changes, and is easy to implement. It can solve the problem of service interruption caused by frequent leader elections due to network problems or attacks.
The methods provided herein are described above. The apparatus provided in the present application is described below.
Referring to fig. 4, a functional block diagram of a cluster election device according to an embodiment of the present application is shown. The device is applied to a first node and comprises the following units:
a detecting unit 401, configured to detect whether a leader node and a peon_master node have been elected in a cluster when the first node newly joins the cluster, where the rank value of the leader node is the smallest among the nodes that support the election of the leader node, and the rank value of the peon_master node is the largest among the nodes that support the election of the peon_master node.
An election unit 402, configured to send a first election request message to the nodes in the cluster if a leader node and a peon_master node have been elected, where the first election request is used to reselect the peon_master node; and, if a leader node and a peon_master node have not been elected, to send a second election request message to the nodes in the cluster, where the second election request is used to elect the leader node and the peon_master node.
In one embodiment, the first election request includes a rank value of the first node; the apparatus may further include:
a receiving unit, configured to receive a first ACK message with which a peon node in the cluster responds to the first election request message, where the peon nodes are the nodes in the cluster other than the leader node, and the first ACK message is sent by the peon node when it determines that the rank value of the first node is greater than its own rank value.
A notification unit, configured to send a first victory message to the nodes in the cluster when the preset election time has not been exceeded and the number of received first ACK messages equals the number of all peon nodes in the cluster, or when the election time has been exceeded and the number of received first ACK messages exceeds half of the number of all peon nodes in the cluster, where the first victory message is used to indicate that the first node is elected as a peon_master node.
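The two winning conditions for the first victory message can be expressed as a single predicate. The sketch below is illustrative only; `has_won` and its parameter names are hypothetical, and how the election timer is tracked is left out.

```python
def has_won(acks_received: int, voters: int, timed_out: bool) -> bool:
    """Decide whether a victory message may be sent.

    Before the preset election time expires, every voter must have
    acknowledged; once it has expired, more than half of them is enough.
    """
    if voters <= 0:
        raise ValueError("voters must be positive")
    if not timed_out:
        return acks_received == voters
    # Integer comparison avoids rounding issues when voters is odd.
    return 2 * acks_received > voters
```

With four peon nodes, for example, three first ACK messages are not enough while the election time is still running, but they are enough once it has expired.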
In one embodiment, the apparatus may further include a comparison unit;
the receiving unit is further configured to receive a third election request message sent by a peon node in the cluster, where the third election request message is used to reselect a peon_master node, and the third election request message is triggered and sent by the peon node in the cluster after receiving the first election request message.
Correspondingly, the comparing unit is configured to compare the rank value of the peon node carried in the third election request message with the rank value of the first node; and if the rank value of the peon node is greater than the rank value of the first node, responding to the peon node with a first ACK message, and stopping counting the number of the first ACK messages received by the first node.
In one embodiment, the second election request includes a rank value of the first node;
a receiving unit, configured to receive a first ACK message or a second ACK message that is responded by a node in a cluster to the second election request message, where the first ACK message is sent by the node in the cluster when it is determined that the rank value of the first node is greater than the rank value of the node in the cluster, and the second ACK message is sent by the node in the cluster when it is determined that the rank value of the first node is less than the rank value of the node in the cluster.
Correspondingly, the notification unit is configured to send a first victory message to the nodes in the cluster when the preset election time has not been exceeded and the number of received first ACK messages equals the number of all nodes in the cluster, or when the election time has been exceeded and the number of received first ACK messages exceeds half of the number of all nodes in the cluster, where the first victory message is used to indicate that the first node is elected as a peon_master node; or,
the notification unit is further configured to send a second victory message to the nodes in the cluster when the preset election time has not been exceeded and the number of received second ACK messages equals the number of all nodes in the cluster, or when the election time has been exceeded and the number of received second ACK messages exceeds half of the number of all nodes in the cluster, where the second victory message is used to indicate that the first node is elected as a leader node.
In one embodiment, the receiving unit is further configured to receive a fourth election request message sent by a node in the cluster, where the fourth election request message is used to reselect a leader node and a peon_master node, and the fourth election request message is triggered and sent by the node in the cluster after receiving the second election request message.
Correspondingly, the comparing unit is configured to compare the rank value of the node in the cluster, which is carried in the fourth election request message, with the rank value of the first node; if the rank value of the node in the cluster is greater than the rank value of the first node, responding a first ACK message to the node in the cluster, and stopping counting the number of the first ACK messages received by the first node; and if the rank value of the node in the cluster is smaller than the rank value of the first node, responding a second ACK message to the node in the cluster, and stopping counting the number of the second ACK messages received by the first node.
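The rank comparison performed by the comparing unit can be sketched as below. The function name `reply_to_contender` and the string labels for the ACK types are hypothetical. The text does not say what happens when the two rank values are equal, so the sketch returns `None` in that case as an assumption.

```python
from typing import Optional


def reply_to_contender(own_rank: int, contender_rank: int) -> Optional[str]:
    """Choose the ACK type to send back to a node that also requested election.

    A larger contender rank earns a first ACK (support for peon_master), and
    the first node then stops counting its own first ACK messages. A smaller
    contender rank earns a second ACK (support for leader), and the first
    node then stops counting its own second ACK messages.
    """
    if contender_rank > own_rank:
        return "first_ack"
    if contender_rank < own_rank:
        return "second_ack"
    return None  # equal ranks: not covered by the description (assumption)
```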
In one embodiment, the detecting unit 401 is configured to send a detection message to the nodes in the cluster; if a third ACK message with which a node in the cluster responds to the detection message is received, determine that a leader node and a peon_master node have been elected in the cluster; and if a fourth ACK message with which a node in the cluster responds to the detection message is received, determine that a leader node and a peon_master node have not been elected in the cluster.
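How a newly joined node might turn the reply to its detection message into the choice of election request can be sketched as follows. The names `ProbeReply` and `request_for_new_node`, and the string labels returned, are hypothetical and serve only to illustrate the branching described for the detecting unit and the election unit.

```python
from enum import Enum


class ProbeReply(Enum):
    THIRD_ACK = 3   # leader and peon_master have already been elected
    FOURTH_ACK = 4  # leader and peon_master have not been elected yet


def request_for_new_node(reply: ProbeReply) -> str:
    """Map the reply to the detection message onto the request to send."""
    if reply is ProbeReply.THIRD_ACK:
        # Roles exist: only contest the peon_master role (local election).
        return "first_election_request"
    if reply is ProbeReply.FOURTH_ACK:
        # No roles yet: elect both leader and peon_master (central election).
        return "second_election_request"
    raise ValueError(f"unexpected reply: {reply!r}")
```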
In one embodiment, the election unit 402 is further configured to:
after the first node is elected as a peon_master node or a peon_slave node, if a failure of the leader node in the cluster is detected, sending a fifth election request message to the nodes in the cluster, wherein the fifth election request message is used for reselecting the leader node and the peon_master node;
wherein the peon_slave nodes are the nodes in the cluster other than the leader node and the peon_master node.
In one embodiment, the election unit 402 is further configured to: after the first node is elected as a leader node, if a failure of a peon node in the cluster is detected, calculate the current leader support rate of the first node;
if the current leader support rate of the first node is greater than a preset first threshold, instruct any peon node to send a sixth election request message to the nodes in the cluster, wherein the sixth election request message is used for reselecting a peon_master node;
and if the current leader support rate of the first node is not greater than the first threshold, send a seventh election request message to the nodes in the cluster, wherein the seventh election request message is used for reselecting the leader node and the peon_master node.
In one embodiment, the election unit 402 is further configured to: after the first node is elected as a peon_master node, if a failure of a peon_slave node in the cluster is detected, calculate the current peon_master support rate of the first node;
if the current peon_master support rate of the first node is greater than a preset second threshold, send an eighth election request message to the nodes in the cluster, wherein the eighth election request message is used for reselecting a peon_master node;
and if the current peon_master support rate of the first node is not greater than the second threshold, send a ninth election request message to the nodes in the cluster, wherein the ninth election request message is used for reselecting the leader node and the peon_master node.
For details that are not described in the embodiment, reference may be made to the description of the method shown in fig. 3, and further description is omitted here.
It should be noted that the division of the unit in the embodiment of the present invention is schematic, and is only a logic function division, and there may be another division manner in actual implementation. The functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
Correspondingly, the application also provides a hardware structure of the device shown in fig. 4. Referring to fig. 5, fig. 5 is a schematic diagram of a hardware structure of the cluster election device shown in fig. 4 provided in the present application, where the device includes: a communication interface 501, a processor 502, a memory 503, and a bus 504; the communication interface 501, the processor 502, and the memory 503 are configured to communicate with each other via a bus 504.
Wherein, the communication interface 501 is used for communicating with nodes in the cluster. Processor 502 may be a CPU, memory 503 may be a non-volatile memory, and cluster election logic instructions stored in memory 503 may be executed by processor 502 to implement the process illustrated in fig. 3 described above.
Thus, the description of the hardware structure of the apparatus shown in fig. 5 is completed.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (16)

1. A cluster election method applied to a first node comprises the following steps:
when the first node newly joins the cluster, detecting whether a leader node and a main worker (peon_master) node have been elected in the cluster, wherein the rank value of the leader node is the smallest among the nodes that support the election of the leader node, and the rank value of the peon_master node is the largest among the nodes that support the election of the peon_master node;
if the leader node and the peon_master node have been elected, sending a first election request message to the nodes in the cluster, wherein the first election request is used for reselecting the peon_master node;
and if the leader node and the peon_master node have not been elected, sending a second election request message to the nodes in the cluster, wherein the second election request is used for electing the leader node and the peon_master node.
2. The method of claim 1, wherein the first election request includes a rank value for the first node;
after sending the first election request message to the nodes in the cluster, the method further includes:
receiving a first ACK message responded by a peon node in a cluster aiming at the first election request message, wherein the peon node is the other nodes except for the leader node in the cluster, and the first ACK message is sent by the peon node when the rank value of the first node is determined to be greater than the rank value of the peon node;
and when the preset election time has not been exceeded and the number of received first ACK messages equals the number of all the peon nodes in the cluster, or when the election time has been exceeded and the number of received first ACK messages exceeds half of the number of all the peon nodes in the cluster, sending a first election-success (victory) message to the nodes in the cluster, wherein the first victory message is used for indicating that the first node is elected as a peon_master node.
3. The method of claim 2, wherein after sending the first election request message to the nodes in the cluster, the method further comprises:
receiving a third election request message sent by a peon node in the cluster, wherein the third election request message is used for reselecting a peon_master node, and the third election request message is triggered and sent after the peon node in the cluster receives the first election request message;
comparing the rank value of the peon node carried by the third election request message with the rank value of the first node;
and if the rank value of the peon node is greater than the rank value of the first node, responding to the peon node with a first ACK message, and stopping counting the number of the first ACK messages received by the first node.
4. The method of claim 1, wherein the second election request includes a rank value for the first node;
after sending the second election request message to the nodes in the cluster, the method further includes:
receiving a first ACK message or a second ACK message responded by the nodes in the cluster aiming at the second election request message, wherein the first ACK message is sent by the nodes in the cluster when the rank value of the first node is determined to be larger than the rank value of the nodes in the cluster, and the second ACK message is sent by the nodes in the cluster when the rank value of the first node is determined to be smaller than the rank value of the nodes in the cluster;
when the preset election time has not been exceeded and the number of received first ACK messages equals the number of all nodes in the cluster, or when the election time has been exceeded and the number of received first ACK messages exceeds half of the number of all nodes in the cluster, sending a first victory message to the nodes in the cluster, wherein the first victory message is used for indicating that the first node is elected as a peon_master node; or,
when the preset election time has not been exceeded and the number of received second ACK messages equals the number of all nodes in the cluster, or when the election time has been exceeded and the number of received second ACK messages exceeds half of the number of all nodes in the cluster, sending a second victory message to the nodes in the cluster, wherein the second victory message is used for indicating that the first node is elected as a leader node.
5. The method of claim 4, wherein after sending the second election request message to the nodes in the cluster, the method further comprises:
receiving a fourth election request message sent by a node in the cluster, wherein the fourth election request message is used for reselecting a leader node and a peon_master node, and the fourth election request message is triggered and sent by the node in the cluster after receiving the second election request message;
comparing the rank value of the node in the cluster carried by the fourth election request message with the rank value of the first node;
if the rank value of the node in the cluster is greater than the rank value of the first node, responding a first ACK message to the node in the cluster, and stopping counting the number of the first ACK messages received by the first node;
and if the rank value of the node in the cluster is smaller than the rank value of the first node, responding a second ACK message to the node in the cluster, and stopping counting the number of the second ACK messages received by the first node.
6. The method of claim 1, wherein the method further comprises:
when the first node is elected as a peon_master node or a peon_slave node, if a failure of the leader node in the cluster is detected, sending a fifth election request message to the nodes in the cluster, wherein the fifth election request message is used for reselecting the leader node and the peon_master node;
wherein the peon_slave nodes are the nodes in the cluster other than the leader node and the peon_master node.
7. The method of claim 6, further comprising:
after the first node is elected as a leader node, if a failure of a peon node in the cluster is detected, calculating the current leader support rate of the first node;
if the current leader support rate of the first node is greater than a preset first threshold, instructing any peon node to send a sixth election request message to the nodes in the cluster, wherein the sixth election request message is used for reselecting a peon_master node;
and if the current leader support rate of the first node is not greater than the first threshold, sending a seventh election request message to the nodes in the cluster, wherein the seventh election request message is used for reselecting the leader node and the peon_master node.
8. The method of claim 7, further comprising:
after the first node is elected as a peon_master node, if a failure of a peon_slave node in the cluster is detected, calculating the current peon_master support rate of the first node;
if the current peon_master support rate of the first node is greater than a preset second threshold, sending an eighth election request message to the nodes in the cluster, wherein the eighth election request message is used for reselecting a peon_master node;
and if the current peon_master support rate of the first node is not greater than the second threshold, sending a ninth election request message to the nodes in the cluster, wherein the ninth election request message is used for reselecting the leader node and the peon_master node.
9. A cluster election device applied to a first node comprises:
a detection unit, configured to detect whether a leader node and a main worker (peon_master) node have been elected in the cluster when the first node newly joins the cluster, wherein the rank value of the leader node is the smallest among the nodes that support the election of the leader node, and the rank value of the peon_master node is the largest among the nodes that support the election of the peon_master node;
an election unit, configured to send a first election request message to the nodes in the cluster if the leader node and the peon_master node have been elected, where the first election request is used to reselect the peon_master node; and, if the leader node and the peon_master node have not been elected, to send a second election request message to the nodes in the cluster, wherein the second election request is used for electing the leader node and the peon_master node.
10. The apparatus of claim 9, wherein the first election request includes a rank value for the first node;
the device further comprises:
a receiving unit, configured to receive a first ACK message responded by a peon node in a cluster to the first election request message, where the peon node is a node other than a leader node in the cluster, and the first ACK message is sent by the peon node when it is determined that a rank value of the first node is greater than a rank value of the peon node;
and a notification unit, configured to send a first election-success (victory) message to the nodes in the cluster when the preset election time has not been exceeded and the number of received first ACK messages equals the number of all the peon nodes in the cluster, or when the election time has been exceeded and the number of received first ACK messages exceeds half of the number of all the peon nodes in the cluster, wherein the first victory message is used for indicating that the first node is elected as a peon_master node.
11. The apparatus of claim 10, further comprising a comparison unit;
the receiving unit is further configured to receive a third election request message sent by a peon node in the cluster, where the third election request message is used to reselect a peon_master node, and the third election request message is triggered and sent by the peon node in the cluster after receiving the first election request message;
the comparing unit is configured to compare the rank value of the peon node carried in the third election request message with the rank value of the first node; and if the rank value of the peon node is greater than the rank value of the first node, responding to the peon node with a first ACK message, and stopping counting the number of the first ACK messages received by the first node.
12. The apparatus of claim 9, wherein the second election request includes a rank value for the first node;
the device further comprises:
a receiving unit, configured to receive a first ACK message or a second ACK message that is responded by a node in a cluster to the second election request message, where the first ACK message is sent by the node in the cluster when it is determined that the rank value of the first node is greater than the rank value of the node in the cluster, and the second ACK message is sent by the node in the cluster when it is determined that the rank value of the first node is less than the rank value of the node in the cluster;
a notification unit, configured to send a first victory message to the nodes in the cluster when the preset election time has not been exceeded and the number of received first ACK messages equals the number of all the nodes in the cluster, or when the election time has been exceeded and the number of received first ACK messages exceeds half of the number of all the nodes in the cluster, wherein the first victory message is used for indicating that the first node is elected as a peon_master node; or,
the notification unit is further configured to send a second victory message to the nodes in the cluster when the preset election time has not been exceeded and the number of received second ACK messages equals the number of all nodes in the cluster, or when the election time has been exceeded and the number of received second ACK messages exceeds half of the number of all nodes in the cluster, where the second victory message is used to indicate that the first node is elected as a leader node.
13. The apparatus of claim 12, further comprising a comparison unit:
the receiving unit is further configured to receive a fourth election request message sent by a node in the cluster, where the fourth election request message is used to reselect a leader node and a peon_master node, and the fourth election request message is triggered and sent by the node in the cluster after receiving the second election request message;
the comparing unit is configured to compare the rank value of the node in the cluster carried in the fourth election request message with the rank value of the first node; if the rank value of the node in the cluster is greater than the rank value of the first node, responding a first ACK message to the node in the cluster, and stopping counting the number of the first ACK messages received by the first node; and if the rank value of the node in the cluster is smaller than the rank value of the first node, responding a second ACK message to the node in the cluster, and stopping counting the number of the second ACK messages received by the first node.
14. The apparatus of claim 9, wherein the election unit is further configured to:
when the first node is elected as a peon_master node or a peon_slave node, if a failure of the leader node in the cluster is detected, sending a fifth election request message to the nodes in the cluster, wherein the fifth election request message is used for reselecting the leader node and the peon_master node;
wherein the peon_slave nodes are the nodes in the cluster other than the leader node and the peon_master node.
15. The apparatus of claim 14, wherein the election unit is further configured to:
after the first node is elected as a leader node, if a failure of a peon node in the cluster is detected, calculating the current leader support rate of the first node;
if the current leader support rate of the first node is greater than a preset first threshold, instructing any peon node to send a sixth election request message to the nodes in the cluster, wherein the sixth election request message is used for reselecting a peon_master node;
and if the current leader support rate of the first node is not greater than the first threshold, sending a seventh election request message to the nodes in the cluster, wherein the seventh election request message is used for reselecting the leader node and the peon_master node.
16. The apparatus of claim 15, wherein the election unit is further configured to:
after the first node is elected as a peon_master node, if a failure of a peon_slave node in the cluster is detected, calculating the current peon_master support rate of the first node;
if the current peon_master support rate of the first node is greater than a preset second threshold, sending an eighth election request message to the nodes in the cluster, wherein the eighth election request message is used for reselecting a peon_master node;
and if the current peon_master support rate of the first node is not greater than the second threshold, sending a ninth election request message to the nodes in the cluster, wherein the ninth election request message is used for reselecting the leader node and the peon_master node.
CN201810270749.6A 2018-03-29 2018-03-29 Cluster election method and device Active CN108600328B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810270749.6A CN108600328B (en) 2018-03-29 2018-03-29 Cluster election method and device

Publications (2)

Publication Number Publication Date
CN108600328A CN108600328A (en) 2018-09-28
CN108600328B CN108600328B (en) 2021-06-29

Family

ID=63623972

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810270749.6A Active CN108600328B (en) 2018-03-29 2018-03-29 Cluster election method and device

Country Status (1)

Country Link
CN (1) CN108600328B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109587218B (en) * 2018-11-07 2021-08-24 新华三技术有限公司 Cluster election method and device
CN109327544B (en) * 2018-11-21 2021-06-18 新华三技术有限公司 Leader node determination method and device
CN112379845B (en) * 2020-11-30 2024-05-28 深信服科技股份有限公司 Cluster capacity expansion method and device, computing equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106302170A (en) * 2016-09-22 2017-01-04 东南大学 A kind of resource allocation methods of wireless cloud computing system
CN106911524A (en) * 2017-04-27 2017-06-30 紫光华山信息技术有限公司 A kind of HA implementation methods and device
CN107579860A (en) * 2017-09-29 2018-01-12 新华三技术有限公司 Node electoral machinery and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8095492B2 (en) * 2005-12-30 2012-01-10 Frederick B. Cohen Method and/or system for providing and/or analyzing influence strategies
CN101217402B (en) * 2008-01-15 2012-01-04 杭州华三通信技术有限公司 A method to enhance the reliability of the cluster and a high reliability communication node
CN103491168A (en) * 2013-09-24 2014-01-01 浪潮电子信息产业股份有限公司 Cluster election design method
CN104933132B (en) * 2015-06-12 2019-11-19 深圳巨杉数据库软件有限公司 Distributed data base based on the sequence of operation number has the right to weigh electoral machinery

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Design and Implementation of a Distributed File System; Bai Yue (白钺); China Master's Theses Full-text Database; 2016-03-31; full text *

Also Published As

Publication number Publication date
CN108600328A (en) 2018-09-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant