WO2022116661A1 - Cluster arbitration method and apparatus, electronic device, and readable storage medium - Google Patents


Publication number
WO2022116661A1
Authority
WO
WIPO (PCT)
Prior art keywords
arbitration
nodes
power supply
network topology
cluster
Prior art date
Application number
PCT/CN2021/121209
Other languages
English (en)
Chinese (zh)
Inventor
李辉
赵鹏
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Priority to US18/034,554 (US11902095B2)
Publication of WO2022116661A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0893 Assignment of logical groups to network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/142 Reconfiguring to eliminate the error
    • G06F11/1425 Reconfiguring to eliminate the error by reconfiguration of node membership
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/18 Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
    • G06F11/187 Voting techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present application relates to the field of cluster technology, and in particular to a cluster arbitration method, a cluster arbitration device, an electronic device, and a computer-readable storage medium.
  • When a distributed cluster is running, cluster nodes may fail, or network communication between cluster nodes may be interrupted, either of which can cause loss or splitting of the cluster topology. If a topology loss occurs and the nodes remaining in the topology do not take over the cluster function, the cluster shuts down. If a topology split occurs and each part after the split tries to take over the cluster, state consistency is destroyed because the parts cannot communicate with each other.
  • The related technology uses majority arbitration: when either of the above faults occurs in the cluster, the network topology is checked. If a connected branch of the topology contains more than half of the nodes, that branch takes over the cluster, and the nodes in the other branches exit the cluster.
  • The related technology is severely limited. In many cases where the surviving nodes could continue to provide services as a cluster, service is refused only because the number of surviving nodes does not exceed half, which results in poor cluster survivability.
  • The purpose of the present application is to provide a cluster arbitration method, a cluster arbitration device, an electronic device and a computer-readable storage medium, so that a network topology can continue to work regardless of the number of nodes it contains, which improves cluster survivability.
  • the present application provides a cluster arbitration method, including:
  • obtaining the number of first nodes and power supply conditions by using the historical election set includes:
  • if the first authorized nodes are not all powered by one power supply, determining the power supply condition as the maximum single power supply node number, wherein the maximum single power supply node number is greater than half of the number of first nodes.
  • generating an arbitration parameter by using the first number of nodes according to the power supply situation includes:
  • if the power supply condition is the single power supply, generating an arbitration threshold by using the number of first nodes according to the parity of the number of first nodes;
  • if the power supply condition is the maximum single power supply node number, generating the arbitration threshold by using the maximum single power supply node number and the number of first nodes according to the arbitration disk situation of the historical election set;
  • the arbitration parameter is generated using the arbitration threshold.
  • generating an arbitration threshold by using the first number of nodes according to the parity of the number of first nodes includes:
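  • if the number of first nodes is even and an arbitration disk exists in the historical election set, dividing the number of first nodes by two to obtain the arbitration threshold;
  • if the number of first nodes is even and no arbitration disk exists in the historical election set, dividing the number of first nodes by two and then adding one to obtain the arbitration threshold.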
  • generating the arbitration threshold by using the maximum number of single power supply nodes and the first number of nodes according to the arbitration disk situation of the historical election set including:
  • if an arbitration disk exists in the historical election set, subtracting the maximum single power supply node number from the number of first nodes to obtain the arbitration threshold;
  • if no arbitration disk exists in the historical election set, subtracting the maximum single power supply node number from the number of first nodes and adding one to obtain the arbitration threshold.
  • the judging whether the current network topology satisfies the arbitration parameter includes:
  • determining whether the historical election set is a target election set;
  • the target election set is an election set with more than one power supply and an arbitration disk, or an election set with one power supply, an even number of first authorized nodes, and an arbitration disk;
  • if the historical election set is the target election set and the number of second authorized nodes is equal to the arbitration threshold, using the current arbitration disk situation of the current network topology to determine whether the arbitration parameter is satisfied.
  • determining whether the arbitration parameter is satisfied by using the current arbitration disk situation of the current network topology includes:
  • if the current network topology identifies the arbitration disk, performing arbitration disk contention, and determining that the arbitration parameter is satisfied after the contention succeeds;
  • if no arbitration node exists, determining that the arbitration parameter is not satisfied.
  • the application also provides a cluster arbitration device, including:
  • an acquisition module used to acquire a historical election set, and use the historical election set to obtain the number of first nodes and the power supply;
  • a generating module configured to generate an arbitration parameter by using the first number of nodes according to the power supply condition
  • a judgment module configured to obtain the current network topology, and judge whether the current network topology satisfies the arbitration parameter
  • a service module configured to provide a cluster service if the current network topology satisfies the arbitration parameter.
  • the application also provides an electronic device, including a memory and a processor, wherein:
  • the memory is configured to store a computer program;
  • the processor is configured to execute the computer program to implement the above cluster arbitration method.
  • the present application also provides a computer-readable storage medium for storing a computer program, wherein when the computer program is executed by a processor, the above-mentioned cluster arbitration method is implemented.
  • The cluster arbitration method provided in this application obtains a historical election set and uses it to obtain the number of first nodes and the power supply condition; generates an arbitration parameter from the number of first nodes according to the power supply condition; obtains the current network topology and judges whether it satisfies the arbitration parameter; and, if the arbitration parameter is satisfied, provides the cluster service.
  • The method obtains the number of first nodes and the power supply condition corresponding to the historical election set from before the split or loss, and uses them to determine whether service can continue as a cluster.
  • The arbitration parameter is generated from the number of first nodes, so it constrains the number of nodes in the post-failure network topology relative to the number of original nodes. This ensures that, if a network topology split occurs, only one new network topology can provide services as a cluster, which prevents the data inconsistency caused by two clusters providing services at the same time.
  • The nodes in that network topology can then serve as a new cluster to provide external services.
  • After the arbitration parameter is obtained, it is judged whether the current network topology satisfies it. If so, the cluster service can be provided to the outside, so the cluster continues to work. The method does not require the nodes in the new network topology to be a majority of the original cluster after a failure. While ensuring data consistency, the network topology can continue to work regardless of the number of nodes it contains, which improves cluster survivability.
  • the present application also provides a cluster arbitration device, an electronic device and a computer-readable storage medium, which also have the above beneficial effects.
  • FIG. 1 is a flowchart of a cluster arbitration method provided by an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a cluster arbitration apparatus according to an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • FIG. 1 is a flowchart of a cluster arbitration method provided by an embodiment of the present application.
  • the method includes:
  • S101 Obtain a historical election set, and use the historical election set to obtain the number of first nodes and the power supply situation.
  • All or part of the steps of the cluster arbitration method provided in this embodiment may be performed by a designated node in the current network topology. The designated node may be any node in the current network topology, such as the node with the smallest number, the node with the largest number, or the node with the smallest network address.
  • Arbitration is needed when the network topology is split or lost, that is, when network communication between some nodes in the original cluster and the other nodes is disconnected, resulting in a topology split, or when some nodes shut down or lose power due to a failure.
  • Each node can broadcast its own number or broadcast IP address in the current network topology. After obtaining the numbers or IP addresses of all nodes, determine whether it is the designated node, such as the node with the smallest number. If so, perform all or some of the steps in the cluster arbitration method.
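The designated-node check described above can be sketched as follows. This is a minimal illustration that assumes the "smallest number" rule and integer node numbers; the function name and parameters are hypothetical and do not come from the application.

```python
from typing import Iterable


def is_designated_node(own_number: int, heard_numbers: Iterable[int]) -> bool:
    """Return True if this node holds the smallest number in the current topology.

    `heard_numbers` stands for the numbers broadcast by the other nodes that
    are still reachable; both names are assumptions made for this sketch.
    """
    # Include our own number so the check also works when no peer is reachable.
    return own_number == min([own_number, *heard_numbers])


if __name__ == "__main__":
    print(is_designated_node(2, [5, 3]))  # node 2 would run the arbitration steps
    print(is_designated_node(5, [3]))     # node 5 would defer to node 3
```

The same shape works for the other rules mentioned in the text (largest number, smallest network address) by swapping the comparison key.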
  • The historical election set is the set of authorized nodes in the cluster at the time it was providing services as a cluster, before the network failure occurred.
  • An authorized node is a node whose cluster status is sufficiently new, and it can participate in cluster voting or provide services, such as voting as a quorum for whether the cluster continues to provide services to the outside world, or can vote for other nodes.
  • the number of authorized nodes is not limited. A node whose state is not new enough cannot be used as an authorized node and cannot vote. It needs to update the cluster state before it can become an authorized node.
  • the historical election set can be stored on each node in the cluster, or on each authorized node. By obtaining the historical election set, the cluster topology before the failure can be determined, and then the corresponding arbitration parameters can be generated based on this, and the arbitration parameters can be used to determine whether the cluster can continue to provide services.
  • the number of first nodes is the number of first authorized nodes in the historical election set, and its specific size is not limited.
  • the power supply situation is the power supply situation of each first authorized node in the historical election set.
  • The first authorized nodes may all be powered by one power supply, or may be powered by multiple power supplies. Based on this, this embodiment provides a specific method for obtaining the number of first nodes and the power supply condition. The step of obtaining the number of first nodes and the power supply condition by using the historical election set may include:
  • Step 11 Count the first authorized nodes in the historical election set to obtain the number of the first nodes.
  • Step 12 Acquire power supply information corresponding to each first authorization node, and determine whether all the first authorization nodes are powered by one power supply.
  • Step 13 If all the first authorized nodes are powered by one power supply, determine the power supply condition as a single power supply.
  • Step 14 If not all the first authorized nodes are powered by one power supply, determine the power supply condition as the maximum single power supply node number.
  • The historical election set contains only first authorized nodes, so counting the first authorized nodes is the same as counting the nodes in the historical election set. While the number of first nodes is counted, the power supply information corresponding to each first authorized node can also be obtained.
  • The power supply information indicates the identity of the power supply that powers the first authorized node; it may be a power supply ID, a power supply name, or the like.
  • The power supply information of each node is used to determine whether all the first authorized nodes are powered by one power supply, that is, whether all nodes in the historical election set are powered by one power supply.
  • If so, the power supply condition is determined to be a single power supply. If not, the nodes in the historical election set are powered by two or more power supplies.
  • The maximum single power supply node number is the largest number of nodes powered by the same power supply. Since the historical election set is powered by two or more power supplies, each power supply powers some number of nodes, and the largest of these counts is the maximum single power supply node number.
  • Each first authorized node is powered by only one power supply, and the maximum single power supply node number in this embodiment is greater than half of the number of first nodes. This embodiment does not discuss the case where the maximum single power supply node number is not greater than half of the number of first nodes.
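Steps 11 to 14 can be sketched as follows. The sketch assumes the historical election set is available as a mapping from node ID to power supply ID; the names `PowerCondition` and `analyze_election_set` are hypothetical, and rejecting the case the embodiment does not discuss with an exception is a choice made only for this illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class PowerCondition:
    single_supply: bool                     # True if one supply powers every node
    max_single_supply_nodes: Optional[int]  # N_p, only set for multiple supplies


def analyze_election_set(power_of: Mapping[str, str]) -> Tuple[int, PowerCondition]:
    """Return (number of first nodes, power supply condition).

    `power_of` maps each first authorized node to the ID of its power supply.
    """
    if not power_of:
        raise ValueError("historical election set is empty")
    n_e = len(power_of)                   # Step 11: count first authorized nodes
    per_supply = Counter(power_of.values())  # Step 12: group nodes by supply
    if len(per_supply) == 1:              # Step 13: single power supply
        return n_e, PowerCondition(True, None)
    n_p = max(per_supply.values())        # Step 14: largest group on one supply
    if 2 * n_p <= n_e:
        # The embodiment only covers N_p greater than half of N_e.
        raise ValueError("maximum single power supply node number is not above half")
    return n_e, PowerCondition(False, n_p)


if __name__ == "__main__":
    print(analyze_election_set({"a": "p1", "b": "p1", "c": "p1", "d": "p2"}))
```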
  • S102 Generate an arbitration parameter by using the first number of nodes according to the power supply situation.
  • the arbitration parameter may be generated by using the first number of nodes according to the power supply situation.
  • the specific content of the arbitration parameter is not limited, and can be set according to actual needs. Since the topology loss can be caused by node power down, it is understandable that the impact of the topology loss is different under different power supply conditions.
  • The arbitration parameter generated from the power supply condition and the number of first nodes serves two purposes. After a topology split, it selects one of the two network topologies to provide services as the new cluster and prevents the nodes in both network topologies from forming clusters at the same time, which ensures data consistency. After a topology loss, it lets the current network topology continue to provide services without being limited by the number of nodes, which improves cluster survivability.
  • For example, suppose the number of first authorized nodes in the cluster is even and, after a topology split, each of the two new network topologies holds half of the original authorized nodes.
  • In this case neither topology holds a majority, so the related technology cannot provide services to the outside world.
  • As another example, suppose most of the first authorized nodes in the cluster are powered by the same power supply. If that power supply is suddenly cut off, the related technology determines that the remaining nodes do not form a majority, so service cannot continue.
  • The present application considers the power supply condition when generating the arbitration parameter, so the parameter accounts for topology loss caused by power failure and allows accurate arbitration in the above and similar situations, thereby improving cluster survivability.
  • the arbitration parameter may include an arbitration threshold, and the arbitration threshold is used to limit the number of nodes in the topology for arbitration.
  • the step of using the first number of nodes to generate the arbitration parameter may include:
  • Step 21 If the power supply condition is a single power supply, the arbitration threshold is generated by using the first number of nodes according to the parity of the first number of nodes.
  • Step 22 If the power supply condition is the maximum number of single power supply nodes, the arbitration threshold is generated by using the maximum single power supply node number and the first number of nodes according to the arbitration disk situation in the historical election set.
  • Step 23 Use the arbitration threshold to generate arbitration parameters.
  • When the power supply condition is a single power supply, topology loss need not be considered: if a topology loss occurs, the entire cluster cannot work because all the first authorized nodes are powered by one power supply. Therefore, when arbitration is required, a topology split must have occurred.
  • The number of nodes in each new topology obtained after a topology split is related to the number of first nodes. When the number of first nodes is odd, the two topologies obtained after the split cannot hold equal numbers of nodes; when it is even, the two topologies may hold equal numbers of nodes. Therefore, the arbitration threshold must be generated according to the parity of the number of first nodes.
  • When the power supply condition is the maximum single power supply node number, the fault may be either a topology split or a topology loss.
  • In this case, whether an arbitration disk exists is determined from the arbitration disk situation in the historical election set, and the arbitration threshold is generated from the maximum single power supply node number and the number of first nodes.
  • the arbitration disk is the storage address of the core data of the cluster, which may be a hard disk. When there is an arbitration disk, arbitration can be performed by competing for the arbitration disk. After obtaining the arbitration threshold, use it to generate arbitration parameters, and the specific generation process is not limited.
  • The arbitration parameter may also include other arbitration rules according to the actual situation, so as to prevent the data inconsistency that would result if both topologies passed arbitration.
  • the step of using the first number of nodes to generate an arbitration threshold may include:
  • Step 31 If the number of the first nodes is an odd number, add one to the number of the first nodes and divide by two to obtain the arbitration threshold.
  • When the number of first nodes is odd, the arbitration threshold is obtained by adding one to the number of first nodes and dividing by two.
  • At this time, the arbitration threshold can be represented by $N_Q(\mathrm{Node})$, where:
  • $N_Q(\mathrm{Node}) = \frac{N_e + 1}{2}$.
  • $N_e$ is the number of first nodes.
  • Step 32 If the number of the first nodes is an even number, determine whether there is an arbitration disk in the historical election set.
  • If the number of first nodes is even, the two network topologies obtained after the split may hold the same number of nodes. In that case it is necessary to determine whether an arbitration disk exists in the historical election set, and thus whether a new network topology can be arbitrated by contending for the arbitration disk.
  • Step 33 If there is an arbitration disk, divide the number of the first nodes by two to obtain the arbitration threshold.
  • If an arbitration disk exists, the arbitration threshold is obtained by dividing the number of first nodes by two; that is, a network topology that passes arbitration must hold at least half of the nodes of the original cluster.
  • At this time, the arbitration threshold can be represented by $N_Q(\mathrm{Disk})$, where:
  • $N_Q(\mathrm{Disk}) = \frac{N_e}{2}$.
  • Step 34 If there is no arbitration disk, divide the number of the first nodes by two and add one to obtain the arbitration threshold.
  • At this time, the arbitration threshold can be represented by $N_Q(\mathrm{Node})$, where:
  • $N_Q(\mathrm{Node}) = \frac{N_e}{2} + 1$.
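Steps 31 to 34 for the single power supply condition can be sketched as follows. The function name and parameter names are hypothetical; only the arithmetic follows the steps above.

```python
def single_supply_threshold(n_e: int, has_arbitration_disk: bool) -> int:
    """Arbitration threshold when all first authorized nodes share one supply.

    `n_e` is the number of first nodes in the historical election set.
    """
    if n_e <= 0:
        raise ValueError("number of first nodes must be positive")
    if n_e % 2 == 1:
        # Step 31: odd count, the disk situation does not matter.
        return (n_e + 1) // 2
    if has_arbitration_disk:
        # Step 33: even count with an arbitration disk.
        return n_e // 2
    # Step 34: even count without an arbitration disk.
    return n_e // 2 + 1


if __name__ == "__main__":
    for n_e, disk in [(5, False), (4, True), (4, False)]:
        print(n_e, disk, single_supply_threshold(n_e, disk))
```

With four first nodes and an arbitration disk the threshold is two, so a two-node half of an even split is still eligible and the tie is left to arbitration disk contention.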
  • the step of generating the arbitration threshold by using the maximum number of single power supply nodes and the first number of nodes may include:
  • Step 41 If there is an arbitration disk in the historical election set, subtract the maximum number of single power supply nodes from the first node number to obtain the arbitration threshold.
  • When an arbitration disk exists, the number of nodes in a network topology participating in arbitration only needs to be no less than the number of first nodes minus the maximum single power supply node number. For a topology loss, even if the power supply feeding the largest group of nodes goes down, the remaining nodes can still provide services as a cluster. For a topology split, even if the two network topologies after the split have the same number of nodes, arbitration can be completed by arbitration disk contention based on the arbitration threshold. At this time, the arbitration threshold can be represented by $N_Q(\mathrm{Disk})$, where:
  • $N_Q(\mathrm{Disk}) = N_e - N_p$.
  • $N_p$ is the maximum single power supply node number.
  • Step 42 If there is no arbitration disk in the historical election set, subtract the maximum number of single power supply nodes from the first node number and add one to obtain the arbitration threshold.
  • At this time, the arbitration threshold can be represented by $N_Q(\mathrm{Node})$, where:
  • $N_Q(\mathrm{Node}) = N_e - N_p + 1$.
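Steps 41 and 42 for the multiple power supply condition can be sketched as follows. The function name and parameters are hypothetical, and the range check encodes the embodiment's stated assumption that the maximum single power supply node number exceeds half of the number of first nodes.

```python
def multi_supply_threshold(n_e: int, n_p: int, has_arbitration_disk: bool) -> int:
    """Arbitration threshold when the first authorized nodes use several supplies.

    `n_e` is the number of first nodes and `n_p` the maximum single power
    supply node number.
    """
    # With two or more supplies n_p is below n_e; the embodiment also
    # requires n_p to be greater than half of n_e.
    if not (n_e < 2 * n_p and n_p < n_e):
        raise ValueError("expected n_e / 2 < n_p < n_e")
    base = n_e - n_p
    # Step 41 uses the difference directly; Step 42 adds one when no disk exists.
    return base if has_arbitration_disk else base + 1


if __name__ == "__main__":
    print(multi_supply_threshold(5, 3, True))
    print(multi_supply_threshold(5, 3, False))
```

For five first nodes with three on the largest supply and an arbitration disk present, the threshold is two, so the two nodes on the other supply reach the threshold after the largest supply fails.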
  • S103 Acquire the current network topology, and determine whether the current network topology satisfies the arbitration parameter.
  • the current network topology is obtained, and it is judged whether the current network topology satisfies the arbitration parameters.
  • the current network topology is the network topology where the specified node is located, which may specifically be any one of two network topologies obtained after topology splitting, or may be a network topology composed of remaining nodes after topology loss.
  • From the current network topology, the number of second authorized nodes, the arbitration disk situation, and so on can be determined, so that it can subsequently be judged whether the current network topology satisfies the arbitration parameter.
  • This embodiment does not limit the specific judgment method for judging whether the current network topology satisfies the arbitration parameter, and the specific judgment method is related to the content of the arbitration parameter.
  • If the current network topology satisfies the arbitration parameter, step S104 is entered. If the current network topology does not satisfy the arbitration parameter, step S105 may be entered.
  • the step of judging whether the current network topology satisfies the arbitration parameter may include:
  • Step 51 Determine whether the number of second authorized nodes corresponding to the current network topology is less than the arbitration threshold.
  • Step 52 If the number of the second authorized nodes corresponding to the current network topology is less than the arbitration threshold, it is determined that the arbitration parameter is not satisfied.
  • If the number of second authorized nodes corresponding to the current network topology is smaller than the arbitration threshold, a topology split has occurred and the current network topology is the part with fewer nodes. It therefore cannot provide external services as a new cluster, and it is determined that it does not satisfy the arbitration parameter.
  • Step 53 If the number of the second authorized nodes corresponding to the current network topology is not less than the arbitration threshold, determine whether the historical election set is the target election set.
  • the target election set in this embodiment is an election set with more than one power supply and an arbitration disk, or an election set with one power supply, an even number of first authorized nodes, and an arbitration disk.
  • If the historical election set is the target election set, the corresponding arbitration threshold is represented by $N_Q(\mathrm{Disk})$ and an arbitration disk exists, so it may be necessary to use the arbitration disk for arbitration.
  • Step 54 If the historical election set is not the target election set, it is determined that the arbitration parameter is satisfied.
  • If the historical election set is not the target election set, arbitration cannot be performed through the arbitration disk and does not need to be. Because the number of second authorized nodes in the current network topology is not less than the arbitration threshold, the current network topology holds enough authorized nodes, so it is determined that the arbitration parameter is satisfied.
  • Step 55 If the historical election set is the target election set, determine whether the number of second authorized nodes is equal to the arbitration threshold.
  • If the historical election set is the target election set, arbitration may need to be performed by contending for the arbitration disk. It is therefore further judged whether the number of second authorized nodes is large enough that arbitration disk contention is unnecessary, that is, whether the number of second authorized nodes is less than the arbitration threshold plus one. Because the number of second authorized nodes is already known to be not less than the arbitration threshold, this is equivalent to judging whether the number of second authorized nodes is equal to the arbitration threshold.
  • Step 56 If the number of the second authorized nodes is not equal to the arbitration threshold, it is determined that the arbitration parameter is satisfied.
  • If the number of second authorized nodes is not equal to the arbitration threshold, it is greater than the arbitration threshold. The current network topology therefore holds enough second authorized nodes, arbitration disk contention is unnecessary, and it can be directly determined that the arbitration parameter is satisfied.
  • Step 57 If the number of the second authorized nodes is equal to the arbitration threshold, it is judged whether the arbitration parameter is satisfied by using the current arbitration disk situation of the current network topology.
  • If the number of second authorized nodes is equal to the arbitration threshold, the current network topology has the same number of second authorized nodes as another network topology, and whether the arbitration parameter is satisfied must be determined by contending for the arbitration disk.
  • Step 57 may include:
  • Step 61 If the current network topology identifies the arbitration disk, execute the arbitration disk contention, and determine that the arbitration parameters are satisfied after the contention is successful.
  • Step 62 If no arbitration disk is identified in the current network topology, it is determined whether there is an arbitration node.
  • Step 63 If there is an arbitration node, it is determined that the arbitration parameter is satisfied.
  • Step 64 If there is no arbitration node, it is determined that the arbitration parameter is not satisfied.
  • If the current network topology can identify the arbitration disk, the arbitration disk contention operation can be performed, that is, the current network topology competes with the other network topology for write permission on the arbitration disk. After the contention succeeds, it can be determined that the current network topology satisfies the arbitration parameter. If the current network topology cannot identify the arbitration disk, arbitration may be performed in a manner similar to competing for the arbitration disk. Specifically, it can be determined whether there is an arbitration node in the current network topology. The arbitration node is the node with the smallest ID among the first authorized nodes that are not in the leaving state. If all the first authorized nodes are in the leaving state, the arbitration node is the node with the smallest ID among all the first authorized nodes.
  • The leaving state means that the first authorized node is about to go offline in a controlled manner, for example after the next round of calculation, in order to receive maintenance.
  • Because the master node of the cluster is generally the authorized node with the smallest ID, if the master node is a leaving node, arbitration has to be performed again after the master node goes offline. Therefore, the arbitration node is the node with the smallest ID among the first authorized nodes that are not in the leaving state. If all the first authorized nodes are in the leaving state, the arbitration node is determined in the conventional manner as the node with the smallest ID among all the first authorized nodes. Therefore, when there is an arbitration node in the current network topology, that is, there is a node that can serve as the cluster master node, it is determined that the current network topology satisfies the arbitration parameter.
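The tie-breaking of steps 61 to 64 and the choice of the arbitration node can be sketched as follows. This is only an illustrative sketch: the `Node` type, the function names, and the `contend_for_disk` callback are hypothetical and do not come from the application, and treating a failed contention as "not satisfied" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Set


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a first authorized node."""
    node_id: int
    leaving: bool = False  # True if the node is about to go offline for maintenance


def pick_arbitration_node(first_authorized: Sequence[Node]) -> Optional[Node]:
    """Smallest-ID node not in the leaving state; if all are leaving, smallest ID overall."""
    if not first_authorized:
        return None
    staying = [n for n in first_authorized if not n.leaving]
    pool = staying if staying else list(first_authorized)
    return min(pool, key=lambda n: n.node_id)


def resolve_tie(first_authorized: Sequence[Node],
                current_node_ids: Set[int],
                disk_identified: bool,
                contend_for_disk: Callable[[], bool]) -> bool:
    """Steps 61 to 64: decide the tie when the node count equals the threshold."""
    if disk_identified:
        # Step 61: compete for write permission on the arbitration disk.
        # Assumption: losing the contention means the parameter is not satisfied.
        return contend_for_disk()
    # Steps 62 to 64: fall back to checking whether the arbitration node is present.
    arbitration_node = pick_arbitration_node(first_authorized)
    return arbitration_node is not None and arbitration_node.node_id in current_node_ids
```

For example, with first authorized nodes 1 (leaving), 2 and 3, the arbitration node is node 2, so only a topology that contains node 2 passes the fallback check.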
  • S104 Provide a cluster service.
  • If the current network topology satisfies the arbitration parameter, the current network topology can serve as a new cluster to provide external services. Therefore, after determining that the current network topology satisfies the arbitration parameter, the current network topology is used to provide the cluster service.
  • The specific method of providing cluster services is not limited. For example, operations such as initialization may be required, after which services are provided externally as a new cluster.
  • If the current network topology does not satisfy the arbitration parameter, the current network topology cannot provide services as a new cluster. In this case a topology split has occurred, and the other network topology produced by the split can provide services as the new cluster.
  • The specific content of the preset operation is not limited. For example, each node in the current network topology can be controlled to go offline or to exit the original cluster.
  • By applying the cluster arbitration method provided by the embodiment of the present application, after a split or loss occurs in the network topology of the cluster, the number of first nodes and the power supply situation corresponding to the historical election set before the split or loss are obtained, and they are used to determine the arbitration parameter that decides whether the remaining nodes can continue to serve as a cluster.
  • The arbitration parameter is generated from the number of first nodes and, based on the number of original nodes, limits the number of nodes required in the network topology after the failure. This ensures that if a network topology split occurs, only one new network topology can provide services as a cluster, which prevents the data inconsistency caused by two clusters providing services at the same time.
  • When the arbitration parameter is satisfied, the nodes in the network topology can serve as a new cluster to provide external services.
  • After the arbitration parameter is obtained, it is judged whether the current network topology satisfies the arbitration parameter. If it does, the cluster service can be provided to the outside, so that the cluster continues to work. This method does not require that the nodes in the new network topology be the majority of the original cluster after a failure. On the basis of ensuring data consistency, the network topology can continue to work regardless of the number of nodes it contains, which improves the survivability of the cluster.
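As a quick arithmetic check of the "only one new network topology" property, consider the single-power-supply thresholds described in this application. The symbols are introduced here purely for illustration: n is the number of first nodes, T is the arbitration threshold, and a and b are the sizes of two disjoint topologies after a split, so that a + b ≤ n.

```latex
% n: number of first nodes, T: arbitration threshold,
% a, b: node counts of two disjoint post-split topologies, a + b \le n
\begin{aligned}
&n \text{ odd:} & T &= \tfrac{n+1}{2}, & 2T &= n+1 > n \ge a+b,\\
&n \text{ even, no arbitration disk:} & T &= \tfrac{n}{2}+1, & 2T &= n+2 > n \ge a+b,\\
&n \text{ even, arbitration disk:} & T &= \tfrac{n}{2}, & 2T &= n \ge a+b.
\end{aligned}
```

In the first two cases, a ≥ T and b ≥ T cannot hold at the same time, so at most one topology reaches the threshold. In the third case both can reach the threshold only when a = b = n/2 = T, which is exactly the tie that the method hands over to arbitration disk contention.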
  • The following describes the cluster arbitration apparatus provided by the embodiments of the present application.
  • The cluster arbitration apparatus described below and the cluster arbitration method described above may be referred to in correspondence with each other.
  • FIG. 2 is a schematic structural diagram of a cluster arbitration apparatus provided by an embodiment of the present application, including:
  • an acquisition module 110, configured to acquire a historical election set, and use the historical election set to obtain the number of first nodes and the power supply situation;
  • a generating module 120, configured to generate an arbitration parameter by using the number of first nodes according to the power supply situation;
  • a judgment module 130, configured to obtain the current network topology, and judge whether the current network topology satisfies the arbitration parameter;
  • a service module 140, configured to provide a cluster service if the current network topology satisfies the arbitration parameter.
  • The acquisition module 110 includes:
  • a statistical unit, configured to count the first authorized nodes in the historical election set to obtain the number of first nodes;
  • a power supply determination unit, configured to obtain power supply information corresponding to each first authorized node, and determine whether all the first authorized nodes are powered by one power supply;
  • a single power supply unit, configured to determine the power supply situation as a single power supply if all the first authorized nodes are powered by one power supply;
  • a maximum single power supply node number determination unit, configured to determine the power supply situation as the maximum number of single power supply nodes if the first authorized nodes are not all powered by one power supply; the maximum number of single power supply nodes is greater than half of the number of first nodes.
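The units of the acquisition module can be sketched as a single function. This is a hedged illustration, not the application's implementation: the mapping from node ID to a power supply label is a hypothetical input format, and reading "the maximum number of single power supply nodes" as the largest number of first authorized nodes fed by any one power supply is an assumption.

```python
from collections import Counter
from typing import Mapping, Tuple


def power_supply_situation(power_of: Mapping[int, str]) -> Tuple[int, bool, int]:
    """Return (number_of_first_nodes, is_single_power_supply, max_single_supply_nodes).

    power_of maps each first authorized node ID to a power supply label
    (hypothetical representation of the power supply information).
    """
    if not power_of:
        raise ValueError("the historical election set has no first authorized nodes")
    number_of_first_nodes = len(power_of)
    nodes_per_supply = Counter(power_of.values())
    # Assumed interpretation: largest node count behind any one power supply.
    max_single_supply_nodes = max(nodes_per_supply.values())
    is_single = len(nodes_per_supply) == 1
    return number_of_first_nodes, is_single, max_single_supply_nodes
```

With four nodes of which three share supply "A" and one uses supply "B", the function reports four first nodes, a non-single power supply situation, and a maximum of three nodes on one supply.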
  • The generating module 120 includes:
  • a first arbitration threshold generating unit, configured to generate an arbitration threshold by using the number of first nodes according to the parity of the number of first nodes if the power supply situation is a single power supply;
  • a second arbitration threshold generating unit, configured to generate an arbitration threshold by using the maximum number of single power supply nodes and the number of first nodes according to the arbitration disk situation in the historical election set if the power supply situation is the maximum number of single power supply nodes;
  • an arbitration parameter generating unit, configured to generate the arbitration parameter by using the arbitration threshold.
  • The first arbitration threshold generating unit includes:
  • a first calculation subunit, configured to add one to the number of first nodes and divide the result by two to obtain the arbitration threshold if the number of first nodes is odd;
  • a judgment subunit, configured to judge whether there is an arbitration disk in the historical election set if the number of first nodes is even;
  • a second calculation subunit, configured to divide the number of first nodes by two to obtain the arbitration threshold if there is an arbitration disk in the historical election set;
  • a third calculation subunit, configured to divide the number of first nodes by two and add one to obtain the arbitration threshold if there is no arbitration disk in the historical election set.
  • The second arbitration threshold generating unit includes:
  • a fourth calculation subunit, configured to subtract the maximum number of single power supply nodes from the number of first nodes to obtain the arbitration threshold if there is an arbitration disk in the historical election set;
  • a fifth calculation subunit, configured to subtract the maximum number of single power supply nodes from the number of first nodes and add one to obtain the arbitration threshold if there is no arbitration disk in the historical election set.
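The five calculation subunits above amount to a small case analysis, which can be sketched as below. The function and parameter names are hypothetical and chosen only for this illustration.

```python
def arbitration_threshold(number_of_first_nodes: int,
                          is_single_power_supply: bool,
                          max_single_supply_nodes: int,
                          had_arbitration_disk: bool) -> int:
    """Compute the arbitration threshold from the historical election set."""
    if is_single_power_supply:
        if number_of_first_nodes % 2 == 1:
            # Odd node count: (n + 1) / 2
            return (number_of_first_nodes + 1) // 2
        half = number_of_first_nodes // 2
        # Even node count: n / 2 with an arbitration disk, n / 2 + 1 without
        return half if had_arbitration_disk else half + 1
    # More than one power supply: n - max, plus one if there is no arbitration disk
    base = number_of_first_nodes - max_single_supply_nodes
    return base if had_arbitration_disk else base + 1
```

For instance, five nodes on a single power supply give a threshold of 3, while six nodes with at most four on one power supply give 2 with an arbitration disk and 3 without.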
  • The judgment module 130 includes:
  • a first judging unit, configured to judge whether the number of second authorized nodes corresponding to the current network topology is less than the arbitration threshold;
  • a first determining unit, configured to determine that the arbitration parameter is not satisfied if the number of second authorized nodes corresponding to the current network topology is less than the arbitration threshold;
  • a target election set judging unit, configured to judge whether the historical election set is a target election set if the number of second authorized nodes corresponding to the current network topology is not less than the arbitration threshold; the target election set is an election set with more than one power supply and an arbitration disk, or an election set with one power supply, an even number of first authorized nodes, and an arbitration disk;
  • a second determining unit, configured to determine that the arbitration parameter is satisfied if the historical election set is not the target election set;
  • a second judging unit, configured to judge whether the number of second authorized nodes is equal to the arbitration threshold if the historical election set is the target election set;
  • a third determining unit, configured to determine that the arbitration parameter is satisfied if the number of second authorized nodes is not equal to the arbitration threshold;
  • an arbitration disk judging unit, configured to judge whether the arbitration parameter is satisfied by using the current arbitration disk situation of the current network topology if the number of second authorized nodes is equal to the arbitration threshold.
  • The arbitration disk judging unit includes:
  • a contention subunit, configured to execute the arbitration disk contention if the current network topology identifies the arbitration disk, and determine that the arbitration parameter is satisfied after the contention succeeds;
  • an arbitration node judging subunit, configured to judge whether there is an arbitration node if the current network topology does not identify the arbitration disk;
  • a satisfaction determining subunit, configured to determine that the arbitration parameter is satisfied if there is an arbitration node;
  • a non-satisfaction determining subunit, configured to determine that the arbitration parameter is not satisfied if there is no arbitration node.
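Taken together, the units of the judgment module form a short decision chain. The sketch below is one possible reading of it; the function names and the `tie_break` callback (standing in for the arbitration disk judging unit) are hypothetical.

```python
from typing import Callable


def is_target_election_set(power_supply_count: int,
                           number_of_first_nodes: int,
                           had_arbitration_disk: bool) -> bool:
    """Target election set: has an arbitration disk, and either more than one
    power supply, or one power supply with an even number of first authorized nodes."""
    if not had_arbitration_disk:
        return False
    return power_supply_count > 1 or number_of_first_nodes % 2 == 0


def satisfies_arbitration_parameter(number_of_second_nodes: int,
                                    threshold: int,
                                    target_election_set: bool,
                                    tie_break: Callable[[], bool]) -> bool:
    """Decide whether the current network topology may serve as the cluster."""
    if number_of_second_nodes < threshold:
        return False              # too few authorized nodes
    if not target_election_set:
        return True               # threshold reached and no disk contention is needed
    if number_of_second_nodes != threshold:
        return True               # strictly above the threshold
    return tie_break()            # exactly at the threshold: disk or arbitration node decides
```

As an example, a four-node, single-power-supply election set with an arbitration disk has a threshold of 2, so a two-node topology after a split lands in the tie case and the result depends on `tie_break`.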
  • The electronic device provided by the embodiment of the present application is introduced below. The electronic device described below and the cluster arbitration method described above may be referred to in correspondence with each other.
  • The electronic device 100 may include a processor 101 and a memory 102, and may further include one or more of a multimedia component 103, an information input/information output (I/O) interface 104 and a communication component 105.
  • The processor 101 is used to control the overall operation of the electronic device 100 to complete all or part of the steps in the above-mentioned cluster arbitration method.
  • The memory 102 is used to store various types of data to support the operation of the electronic device 100. These data may include, for example, instructions for any application or method operating on the electronic device 100, as well as application-related data.
  • The memory 102 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as one or more of Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
  • The multimedia component 103 may include a screen and an audio component.
  • The screen can be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals.
  • The audio component may include a microphone for receiving external audio signals.
  • The received audio signal may be further stored in the memory 102 or transmitted through the communication component 105.
  • The audio component also includes at least one speaker for outputting audio signals.
  • The I/O interface 104 provides an interface between the processor 101 and other interface modules, and the other interface modules may be a keyboard, a mouse, buttons, and the like. These buttons can be virtual buttons or physical buttons.
  • The communication component 105 is used for wired or wireless communication between the electronic device 100 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G or 4G, or a combination of one or more of them, so the corresponding communication component 105 may include a Wi-Fi part, a Bluetooth part and an NFC part.
  • The electronic device 100 may be implemented by one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), Digital Signal Processing Devices (DSPD), Programmable Logic Devices (PLD), Field Programmable Gate Arrays (FPGA), controllers, microcontrollers, microprocessors or other electronic components, for executing the cluster arbitration method given in the above embodiments.
  • The following describes the computer-readable storage medium provided by the embodiments of the present application.
  • The computer-readable storage medium described below and the cluster arbitration method described above may be referred to in correspondence with each other.
  • The present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the foregoing cluster arbitration method are implemented.
  • The computer-readable storage medium may include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
  • A software module can be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Hardware Redundancy (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Disclosed are a cluster quorum method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises: obtaining a historical election set, and using the historical election set to obtain the number of first nodes and a power supply situation; generating an arbitration parameter by using the number of first nodes according to the power supply situation; obtaining the current network topology, and determining whether the current network topology satisfies the arbitration parameter; and if the arbitration parameter is satisfied, providing a cluster service. With this method, the nodes in a new network topology after a failure need not be the majority of the original cluster, so that the network topology can continue to work regardless of the number of nodes it contains while data consistency is guaranteed, thereby improving the survivability of the cluster.
PCT/CN2021/121209 2020-12-02 2021-09-28 Procédé et appareil de quorum de grappe, dispositif électronique et support d'enregistrement lisible WO2022116661A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/034,554 US11902095B2 (en) 2020-12-02 2021-09-28 Cluster quorum method and apparatus, electronic device, and readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011391793.6A CN112468596B (zh) 2020-12-02 2020-12-02 一种集群仲裁方法、装置、电子设备及可读存储介质
CN202011391793.6 2020-12-02

Publications (1)

Publication Number Publication Date
WO2022116661A1 true WO2022116661A1 (fr) 2022-06-09

Family

ID=74805294

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/121209 WO2022116661A1 (fr) 2020-12-02 2021-09-28 Procédé et appareil de quorum de grappe, dispositif électronique et support d'enregistrement lisible

Country Status (3)

Country Link
US (1) US11902095B2 (fr)
CN (1) CN112468596B (fr)
WO (1) WO2022116661A1 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112468596B (zh) * 2020-12-02 2022-07-05 苏州浪潮智能科技有限公司 一种集群仲裁方法、装置、电子设备及可读存储介质
CN113489595B (zh) * 2021-06-28 2023-02-28 苏州浪潮智能科技有限公司 一种实现分离式mac和phy电磁兼容的系统、方法
CN114390052B (zh) * 2021-12-30 2023-10-10 武汉达梦数据技术有限公司 一种基于vrrp协议实现etcd双节点高可用方法和装置
CN114461141B (zh) * 2021-12-30 2023-08-18 苏州浪潮智能科技有限公司 一种etcd系统、节点仲裁方法及系统
CN115242704B (zh) * 2022-06-22 2023-08-11 中国电信股份有限公司 网络拓扑数据更新方法、装置和电子设备

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8024432B1 (en) * 2008-06-27 2011-09-20 Symantec Corporation Method and apparatus for partitioning a computer cluster through coordination point devices
CN102308559A (zh) * 2011-07-26 2012-01-04 华为技术有限公司 一种用于集群计算机系统的投票仲裁方法及装置
CN104378232A (zh) * 2014-11-10 2015-02-25 东软集团股份有限公司 主备集群组网模式下的脑裂发现、恢复方法及装置
WO2016107172A1 (fr) * 2014-12-31 2016-07-07 华为技术有限公司 Procédé de traitement de quorum après déconnexion de deux parties d'une grappe, et dispositif et système de stockage de quorum
CN106789193A (zh) * 2016-12-06 2017-05-31 郑州云海信息技术有限公司 一种集群投票仲裁方法及系统
CN106953914A (zh) * 2017-03-20 2017-07-14 郑州云海信息技术有限公司 一种用于控制器集群的仲裁方法及系统
CN112468596A (zh) * 2020-12-02 2021-03-09 苏州浪潮智能科技有限公司 一种集群仲裁方法、装置、电子设备及可读存储介质

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19704662A1 (de) * 1997-02-07 1998-08-13 Siemens Ag Verfahren zur Lastsymmetrierung mehrerer autark und unabhängig arbeitenden Stromversorgungsmodulen einer modularen Stromversorgungsanlage
US6502203B2 (en) * 1999-04-16 2002-12-31 Compaq Information Technologies Group, L.P. Method and apparatus for cluster system operation
US7478263B1 (en) * 2004-06-01 2009-01-13 Network Appliance, Inc. System and method for establishing bi-directional failover in a two node cluster
US8108715B1 (en) * 2010-07-02 2012-01-31 Symantec Corporation Systems and methods for resolving split-brain scenarios in computer clusters
CN102402395B (zh) * 2010-09-16 2014-07-16 中标软件有限公司 基于仲裁磁盘的高可用系统不间断运行方法
US8578204B1 (en) * 2010-12-29 2013-11-05 Emc Corporation Witness facility for distributed storage system
US9063787B2 (en) * 2011-01-28 2015-06-23 Oracle International Corporation System and method for using cluster level quorum to prevent split brain scenario in a data grid cluster
US9203900B2 (en) * 2011-09-23 2015-12-01 Netapp, Inc. Storage area network attached clustered storage system
US20230113718A1 (en) * 2017-04-26 2023-04-13 View, Inc. Self orchestrating network
CN109257195B (zh) * 2017-07-12 2021-01-15 华为技术有限公司 集群中节点的故障处理方法及设备
CN109729129B (zh) * 2017-10-31 2021-10-26 华为技术有限公司 存储集群系统的配置修改方法、存储集群及计算机系统
US11769592B1 (en) * 2018-10-07 2023-09-26 Cerner Innovation, Inc. Classifier apparatus with decision support tool
CN109491615A (zh) * 2018-11-13 2019-03-19 郑州云海信息技术有限公司 一种基于集群存储系统的仲裁系统
CN110597664A (zh) * 2019-09-17 2019-12-20 深信服科技股份有限公司 一种高可用集群资源部署方法、装置及相关组件
CN111694694A (zh) * 2020-05-22 2020-09-22 北京三快在线科技有限公司 数据库集群的处理方法、装置、存储介质和节点
CN111654402B (zh) * 2020-06-23 2023-08-01 中国平安财产保险股份有限公司 网络拓扑创建方法、装置、设备及存储介质
US20220300384A1 (en) * 2021-03-22 2022-09-22 EMC IP Holding Company LLC Enhanced fencing scheme for cluster systems without inherent hardware fencing
US20230298138A1 (en) * 2022-03-15 2023-09-21 Stryker Corporation Methods and systems for extracting medical images

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG YU , LIN YUN: "Research on Quorum and Data Protection of Veritas Cluster Server Cluster Members", DIGITAL TECHNOLOGY & APPLICATION, 15 November 2011 (2011-11-15), pages 84 - 85, XP055937191, DOI: 10.19695/j.cnki.cn12-1369.2011.11.057 *

Also Published As

Publication number Publication date
US11902095B2 (en) 2024-02-13
CN112468596B (zh) 2022-07-05
CN112468596A (zh) 2021-03-09
US20230396501A1 (en) 2023-12-07

Similar Documents

Publication Publication Date Title
WO2022116661A1 (fr) Procédé et appareil de quorum de grappe, dispositif électronique et support d'enregistrement lisible
CN108122165B (zh) 一种区块链共识方法及系统
EP4071610A1 (fr) Procédé, appareil et dispositif de traitement de transaction, et support de stockage informatique
US10020980B2 (en) Arbitration processing method after cluster brain split, quorum storage apparatus, and system
US10402115B2 (en) State machine abstraction for log-based consensus protocols
US11102084B2 (en) Fault rectification method, device, and system
US20100299447A1 (en) Data Replication
US20210320977A1 (en) Method and apparatus for implementing data consistency, server, and terminal
CN111314125A (zh) 用于容错通信的系统和方法
EP4006742A1 (fr) Procédé de traitement de fourche et noeud de chaîne de blocs
CN110855737B (zh) 一种一致性级别可控的自适应数据同步方法和系统
CN107666493B (zh) 一种数据库配置方法及其设备
US11544245B2 (en) Transaction processing method, apparatus, and device and computer storage medium
CN109495540A (zh) 一种数据处理的方法、装置、终端设备及存储介质
WO2017034898A1 (fr) Estampille temporelle logique globale
CN115088235A (zh) 主节点选取方法、装置、电子设备以及存储介质
US8719622B2 (en) Recording and preventing crash in an appliance
CN113395165B (zh) 共识流程处理方法、装置、存储介质及计算机设备
CN107040509B (zh) 一种报文发送方法及装置
KR20220013846A (ko) 블록체인 네트워크의 블록 합의 방법 및 장치
CN114500327B (zh) 一种服务器集群的检测方法、检测装置及计算设备
CN108199882B (zh) 分布式数据库的节点分配方法、装置、储存介质和设备
US10564665B2 (en) Performing scalable, causally consistent reads using a logical wall clock
US20170161969A1 (en) System and method for model-based optimization of subcomponent sensor communications
US20230261893A1 (en) Quality issue management for online meetings

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21899695

Country of ref document: EP

Kind code of ref document: A1