WO2017215430A1 - Node management method in cluster and node device - Google Patents

Node management method in cluster and node device

Info

Publication number
WO2017215430A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
management
determining
nodes
converted
Prior art date
Application number
PCT/CN2017/085935
Other languages
French (fr)
Chinese (zh)
Inventor
骆旭剑
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017215430A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0823 Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • H04L41/0836 Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability to enhance reliability, e.g. reduce downtime
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/51 Discovery or management thereof, e.g. service location protocol [SLP] or web services

Definitions

  • the present application relates to, but is not limited to, the field of communication technologies, and in particular, to a node management method and a node device in a cluster.
  • when a node fails, the cluster system should respond quickly and assign the system's tasks to other working nodes in the cluster; the faulty node's shared resources (such as IP addresses and disk arrays) are also taken over by other nodes.
  • nodes in a high-availability cluster use a heartbeat to detect whether other nodes are alive.
  • a split-brain problem may occur. Split-brain can leave data incomplete and can have a serious impact on services.
  • a high-availability cluster is inevitably faced with a split-brain problem.
  • there are some existing solutions to the split-brain problem:
  • the service party locks the shared disk.
  • the other party can completely rob the shared disk resource.
  • however, the other party may never obtain the shared disk: if the node occupying the shared disk suddenly crashes, the unlock command cannot be executed, and the backup node cannot take over the shared resources and application services;
  • the embodiments of the present invention provide a node management method and a node device in a cluster, which can effectively solve the split-brain problem by using existing resources of the cluster without introducing a third-party device, and ensure high availability and reliability of the cluster.
  • An embodiment of the present invention provides a node management method in a cluster, where the method is applied to a first node, and the method includes:
  • when detecting an abnormal heartbeat connection between nodes, determining, according to a first management policy, that the first node is the preliminary management node of the subgroup in which it is located;
  • before detecting the abnormal heartbeat connection between the nodes, the method further includes:
  • determining, according to a third management policy, that the second node is a management node, so that the second node performs resource configuration and task scheduling.
  • the determining, by the second management policy, whether the first node can be converted into a management node by the preliminary management node includes:
  • determining, based on a preset network detection manner, whether the first node has an external network connection; when the determination is yes, determining that the first node can be converted into a management node; when the determination is no, determining that the first node cannot be converted into a management node.
  • the shared storage device in the cluster supports multi-node common access
  • the determining, according to the second management policy, whether the first node can be converted into a management node by the preliminary management node includes:
  • when the shared storage device is not occupied, a placeholder file is created in a specific directory of the shared storage device, and after a preset time, it is detected whether a placeholder file created by another preliminary management node exists in the specific directory; if not, determining that the first node can be converted into a management node; if so, comparing the number of nodes of the subgroup in which the first node is located with the number of nodes of the subgroup in which the other preliminary management node is located, and determining, based on the comparison result, whether the first node can be converted from the preliminary management node into a management node.
  • comparing the number of nodes of the subgroup in which the first node is located with the number of nodes of the subgroup in which the other preliminary management node is located, and determining, according to the comparison result, whether the first node can be converted from the preliminary management node into a management node, includes:
  • the shared storage device in the cluster supports single node exclusive access
  • the determining, according to the second management policy, whether the first node can be converted into a management node by the preliminary management node includes:
  • the determining, by the first management policy, that the first node is a preliminary management node of a subgroup includes:
  • the first node is a node with the smallest node number in the subgroup where the first node is located, and the first node is a preliminary management node of the subgroup in which the first node is located.
  • An embodiment of the present invention further provides a node device, where the node device includes: a determining module and a determining module;
  • the determining module is configured to: when detecting an abnormal heartbeat connection between the nodes, determine, according to the first management policy, that the first node is a preliminary management node of the subgroup in which the node is located;
  • the determining module is configured to determine, according to the second management policy, whether the first node can be converted from the preliminary management node into a management node, and, when determining that it can, to perform, as the management node, inter-node reconfiguration and inter-node task scheduling of cluster resources.
  • the determining module is further configured to determine, according to the third management policy, that the second node is a management node, so that the second node performs resource configuration and task scheduling.
  • the determining module is configured to determine, according to a preset network detection manner, whether the first node has an external network connection, and if the determination is yes, determining that the first node can be converted into a management node; If not, it is determined that the first node cannot be converted into a management node.
  • the shared storage device in the cluster supports multi-node common access
  • the determining module is configured to: when the shared storage device is not occupied, create a placeholder file in a specific directory of the shared storage device, and detect, after a preset time, whether a placeholder file created by another preliminary management node exists in the specific directory; if not, determine that the first node can be converted into a management node; if so, compare the number of nodes of the subgroup in which the first node is located with the number of nodes of the subgroup in which the other preliminary management node is located, and determine, based on the comparison result, whether the first node can be converted from the preliminary management node into a management node.
  • the determining module is configured to: when determining that the number of nodes of the subgroup in which the first node is located is greater than the number of nodes of the subgroup in which the other preliminary management node is located, determine that the first node can be converted into a management node
  • the shared storage device in the cluster supports single node exclusive access
  • the determining module is configured to determine an access time of the first node to the first partition of the shared storage device, mount the first partition when the access time arrives, and determine whether a placeholder file exists in the first partition; when no placeholder file exists in the first partition, determine that the first node can be converted into a management node; when a placeholder file exists in the first partition, determine that the first node cannot be converted into a management node.
  • the determining module is configured to determine that the first node is a node with the smallest node number in the subgroup where the first node is located, and the first node is a preliminary management node of the subgroup in which the first node is located.
  • the embodiment of the invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the node management method in the cluster.
  • when the first node in the high-availability cluster detects that the heartbeat connection between nodes is abnormal, the first node is determined, according to the preset first management policy, to be the preliminary management node of its subgroup; it is then determined, according to a preset second management policy, whether the first node can be converted from the preliminary management node into a management node, and when the determination is yes, the first node performs inter-node reconfiguration and inter-node task scheduling of cluster resources as the management node. Thus, when the heartbeat connection between nodes is abnormal and the cluster is split into two or more subgroups, the first node, after determining that it is the preliminary management node of its subgroup, further determines whether it can become the management node of the cluster; when it can, it performs the inter-node reconfiguration and inter-node task scheduling of the cluster resources, which effectively avoids split-brain and ensures the high availability and reliability of the cluster. Moreover, the first node is a node in the cluster, so there is no need to introduce a third-party management device, which is simple to implement.
  • FIG. 1 is a schematic flowchart 1 of a node management method in a cluster according to an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a cluster in an embodiment of the present invention.
  • FIG. 3 is a second schematic flowchart of a node management method in a cluster according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of splitting a cluster into multiple subgroups and determining a preliminary management node according to an embodiment of the present invention
  • FIG. 5 is a schematic flowchart 3 of a node management method in a cluster according to an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart 4 of a node management method in a cluster according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a node device according to an embodiment of the present invention.
  • a common split-brain scenario can be described as follows:
  • node A and node B in the cluster confirm each other's existence through heartbeat detection.
  • when heartbeat detection fails to confirm that the other party exists, the corresponding shared resources are taken over. If the heartbeat between node A and node B is suddenly lost (for example, because of a network disconnection) while both nodes are actually active, node A takes over the resources of node B and, at the same time, node B takes over the resources of node A. This is split-brain.
  • the cluster may be split into multiple node groups, that is, subgroups, each of which takes over the service and accesses file system resources (for example, by writing to the file system concurrently), causing data corruption.
  • split-brain in the cluster can cause data incompleteness: if the nodes in the cluster access the same shared resource at the same time during split-brain and no lock mechanism controls access to the data, data integrity problems or other errors may occur.
  • node A and node B may both preempt the same IP resource, causing network data transmission to fail.
  • split-brain in the cluster may cause serious negative consequences.
  • when the first node in the cluster detects an abnormal heartbeat connection between nodes, the first node is determined, according to the preset first management policy, to be the preliminary management node of its subgroup; it is then determined, based on the preset second management policy, whether the first node can be converted from the preliminary management node into a management node, so that, when the determination is yes, the first node performs inter-node reconfiguration and inter-node task scheduling of cluster resources as the management node. Thus, when the heartbeat connection between nodes is abnormal and the cluster is split into two or more subgroups, the first node, after determining that it is the preliminary management node of its subgroup, further determines whether it can become the management node of the cluster.
  • the inter-node reconfiguration and inter-node task scheduling of the cluster resources can effectively avoid split-brain and ensure the high availability and reliability of the cluster; and since the first node is a node within the cluster, there is no need to introduce a third-party management device.
  • FIG. 1 is a schematic flowchart of a method for managing a node in a cluster according to an embodiment of the present invention. The method is applied to a first node, where the first node is a node in the cluster
  • FIG. 2 is a schematic structural diagram of a cluster in an embodiment of the present invention.
  • the node management method in the cluster in the embodiment of the present invention includes:
  • Step 101 When detecting that the heartbeat connection between the nodes is abnormal, determining that the first node is a preliminary management node of the subgroup according to the preset first management policy.
  • when the heartbeat connection between nodes is abnormal, the cluster is split into two or more subgroups, and the first node is in one of those subgroups.
  • the first node determines, according to the preset first management policy, that it is the preliminary management node of its subgroup. The first management policy applies to all nodes in the cluster, so this can be understood as all the nodes in the first node's subgroup electing the first node as the preliminary management node according to the first management policy; at the same time, the nodes in the other subgroups split off by the heartbeat abnormality also elect the preliminary management node of their own subgroup according to the first management policy.
  • the preset first management policy may be any preset election rule; for example, the node with the smallest or largest node number (each node in the cluster has a unique number) is used as the preliminary management node.
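The election rule described above can be illustrated with a minimal Python sketch. It assumes that node numbers are plain integers and that each node already knows the members of its own subgroup; the function name `elect_preliminary_manager` and the example subgroups are hypothetical and do not come from the application.

```python
from typing import Iterable


def elect_preliminary_manager(subgroup: Iterable[int], use_smallest: bool = True) -> int:
    """Return the node number elected as preliminary management node.

    The rule is the example policy from the text: the smallest (or largest)
    node number in the subgroup wins.
    """
    members = list(subgroup)
    if not members:
        raise ValueError("a subgroup must contain at least one node")
    return min(members) if use_smallest else max(members)


# Hypothetical cluster of five nodes that has split into two subgroups.
subgroups = [{1, 2}, {3, 4, 5}]
leaders = [elect_preliminary_manager(group) for group in subgroups]
print(leaders)  # [1, 3]
```

Because every node in a subgroup applies the same rule to the same member list, all of them arrive at the same preliminary management node without exchanging extra messages.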
  • before step 101, the method may further include:
  • determining, according to the third management policy, that the second node is a management node, so that the second node performs resource configuration and task scheduling; the third management policy may be the same as or different from the first management policy.
  • when the policies are the same, the nodes in the cluster elect the second node as the management node through the preset first management policy to perform resource configuration and task scheduling, for example, creating a specific partition, directory, or file on the shared storage device for negotiation between nodes when the cluster is abnormal. For instance, the node with the smallest node number in the cluster may be determined to be the second node, and the second node becomes the management node.
  • Step 102 Determine, according to a preset second management policy, whether the first node can be converted from the preliminary management node into a management node, so that, when the determination is yes, the first node performs inter-node reconfiguration and inter-node task scheduling of cluster resources as the management node.
  • the subgroups that are split due to abnormal heartbeat between nodes in the cluster respectively determine the preliminary management nodes of the respective subgroups based on the first management strategy.
  • one of the preliminary management nodes is further determined as the management node to perform inter-node reconfiguration and inter-node task scheduling on the cluster resources; correspondingly, the subgroup of the determined management node is the working subgroup, and the other subgroups stop working.
  • the first node determines whether it can become a management node based on the preset second management policy, and when the determination is yes, performs reconfiguration between nodes and inter-node task scheduling as a management node;
  • the second management policy is preset according to the actual situation of the cluster.
  • the second management policy may be set as follows: the first node determines whether it has an external network connection, that is, whether it is connected to an external network entity; if the determination is yes, it is determined that the first node can be converted into a management node; if the determination is no, it is determined that the first node cannot be converted into a management node;
  • the second management policy may be set as follows: the first node determines whether the shared storage device is occupied; when the shared storage device is not occupied, it creates a placeholder file in a specific directory of the shared storage device and, after a preset time, detects whether a placeholder file created by another preliminary management node exists in the specific directory; if none exists, it is determined that the first node can be converted into a management node; if one exists, the number of nodes of the first node's subgroup is compared with the number of nodes of the other preliminary management node's subgroup, and it is determined, based on the comparison result, whether the first node can be converted from a preliminary management node into a management node;
  • the second management policy may be set as follows: the first node determines its access time to the first partition of the shared storage device, mounts the first partition when the access time arrives, and determines whether a placeholder file exists in the first partition; if no placeholder file exists in the first partition, the first node can be converted into a management node.
  • when an inter-node heartbeat abnormality occurs in the cluster, the first node, after determining that it is the preliminary management node of its subgroup, further determines whether it can become the management node of the cluster; when it can, it performs inter-node reconfiguration and inter-node task scheduling on the cluster resources, so that the cluster system is load balanced, split-brain is effectively avoided, and the high availability and reliability of the cluster are ensured. Since the first node is a node in the cluster, it is not necessary to introduce a third-party management device, and the implementation is simple.
  • FIG. 3 is a schematic flowchart of a method for managing a node in a cluster according to an embodiment of the present invention.
  • the cluster has an external network connection.
  • the node management method in the cluster in the embodiment of the present invention includes:
  • Step 301 The first node determines, according to the preset first management policy, that the second node is a management node, so that the second node performs resource configuration and task scheduling.
  • the externally connected network plane may be preset as a redundant heartbeat plane. If there are multiple network planes, a critical network plane may be selected as the redundant heartbeat plane; the critical network plane may be a plane whose abnormality affects data processing.
  • the second node may create a specific partition, directory, or file on the shared storage device for negotiation between nodes when the cluster is abnormal.
  • the first node determines that the node with the smallest node number in the cluster is the management node.
  • Step 302 When the first node detects that the heartbeat connection between the nodes is abnormal, it determines, according to the first management policy, that it is the preliminary management node of the subgroup in which it is located.
  • when there is an abnormal heartbeat connection in the cluster, the cluster is split into two or more subgroups, and the first node is in one of them. The nodes in each subgroup determine the preliminary management node of their own subgroup according to the first management policy. FIG. 4 is a schematic diagram of splitting the cluster into multiple subgroups and determining a preliminary management node according to an embodiment of the present invention, in which the heartbeats between A and C, A and D, and A and E are abnormal, that is, node A and node B form one subgroup, and the remaining nodes form another subgroup.
  • the first node determining, according to the first management policy, that it is the preliminary management node of its subgroup includes: the first node determines that it is the node with the smallest node number in its subgroup, and then determines that it is the preliminary management node of that subgroup.
  • the subgroups that are split due to abnormal heartbeat between nodes in the cluster respectively determine the preliminary management nodes of the respective subgroups based on the first management strategy.
  • one of the preliminary management nodes is further determined as the management node to perform inter-node reconfiguration and inter-node task scheduling on the cluster resources; correspondingly, the subgroup of the determined management node is the working subgroup, and the other subgroups stop working.
  • Step 303 The first node determines whether there is an external network connection based on the preset network detection mode. If yes, step 304 is performed; if not, step 305 is performed.
  • the preset network detection mode may be ping or the Address Resolution Protocol (ARP);
  • the first node determines whether it has an external network connection, that is, whether it is connected to an external network entity.
  • Step 304 Determine that it can become a management node, and perform reconfiguration between nodes and task scheduling between nodes as a management node.
  • the first node's position as the management node may be maintained, or the management node may be re-determined according to the first management policy and the second management policy.
  • Step 305 Determine that it cannot become a management node, and end the current processing flow.
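Steps 303 to 305 can be sketched in Python as follows. This is only an illustration under stated assumptions: the default probe shells out to the system `ping` command with the Unix-style `-c 1` option, and the function names, return values, and target addresses are hypothetical; an ARP-based probe could be passed in instead.

```python
import subprocess
from typing import Callable, Iterable


def ping_once(host: str, timeout_s: float = 2.0) -> bool:
    """Send one echo request with the system `ping` (assumes Unix-style `-c 1`)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def decide_role(targets: Iterable[str], probe: Callable[[str], bool] = ping_once) -> str:
    """Step 303: check for an external connection; steps 304/305: pick the outcome."""
    if any(probe(target) for target in targets):
        return "management"      # step 304: act as the management node
    return "not_management"      # step 305: end the current processing flow


# Stand-in probe so the example generates no network traffic:
# only the hypothetical address 192.0.2.1 "answers".
fake_probe = lambda host: host == "192.0.2.1"
print(decide_role(["192.0.2.1", "192.0.2.2"], fake_probe))
```

Passing the probe as a parameter keeps the decision logic independent of the detection mode, so the same function works whether reachability is tested with ping, ARP, or a test double.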
  • the cluster uses its own internal nodes to implement self-management.
  • after determining that it is the preliminary management node of its subgroup, the first node determines whether it can become a management node by determining whether it has an external network connection.
  • the management node performs reconfiguration between nodes and task scheduling between nodes, so that the cluster system is load balanced, which effectively avoids split-brain and ensures the high availability and reliability of the cluster.
  • FIG. 5 is a schematic flowchart of a method for managing a node in a cluster according to an embodiment of the present invention.
  • the cluster does not have an external network connection, but a shared storage device that supports multi-node access is provided.
  • FIG. 5 the cluster in the embodiment of the present invention is shown in FIG.
  • Step 501 The first node determines, according to the preset first management policy, that the second node is a management node, so that the second node performs resource configuration and task scheduling.
  • since the cluster has a shared storage device, after this step, that is, after determining that the second node is the management node, the second node may create a specific partition, directory, or file on the shared storage device, used for negotiation between nodes in the case of cluster anomalies.
  • the first node determines that the node with the smallest node number in the cluster is the management node.
  • Step 502 When the first node detects that the heartbeat connection is abnormal between nodes, the first node determines that it is a preliminary management node of the subgroup according to the first management policy.
  • when there is an abnormal heartbeat connection in the cluster, the cluster is split into two or more subgroups, and the first node is in one of them. The nodes in each subgroup determine the preliminary management node of their own subgroup according to the first management policy; FIG. 4 is a schematic diagram of the cluster splitting into multiple subgroups and determining the preliminary management nodes in the embodiment of the present invention.
  • the first node determining, according to the first management policy, that it is the preliminary management node of its subgroup includes: the first node determines that it is the node with the smallest node number in its subgroup, and then determines that it is the preliminary management node of that subgroup.
  • the subgroups split off by the abnormal heartbeat between nodes in the cluster each determine their own preliminary management node based on the first management policy.
  • one of the preliminary management nodes is further determined as the management node to perform inter-node reconfiguration and inter-node task scheduling on the cluster resources; correspondingly, the subgroup of the determined management node is the working subgroup, and the other subgroups stop working.
  • Step 503 The first node determines whether the shared storage device of the cluster is occupied. If it is not occupied, step 504 is performed; if it is occupied, step 509 is performed.
  • when the heartbeat between nodes is normal, the second node creates an identification file on the shared storage device of the cluster as the occupation identifier of the shared storage device, and updates the identification file periodically (for example, every S seconds, where S can be set according to actual needs), for example by updating the creation time and/or content of the identification file;
  • the first node determines whether the shared storage device of the cluster is occupied by periodically detecting changes to the identification file created by the second node on the shared storage device, which may include: every W seconds (W ≥ S), detecting once whether the identification file has changed. If the identification file is found unchanged T consecutive times (T is a positive integer whose value can be set according to actual needs), it is determined that the shared storage device is not occupied; if the identification file has changed, it is determined that the shared storage device is occupied.
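The occupancy check with the W-second interval and the T consecutive detections can be sketched in Python as below. The sketch rests on assumptions that are not in the application: the identification file is summarised by its modification time and size (`file_fingerprint`), and the function and parameter names are hypothetical. The reader callback and the sleep function are injectable so the logic can be exercised without real waiting.

```python
import os
import time
from typing import Callable, Hashable, Optional, Tuple


def file_fingerprint(path: str) -> Optional[Tuple[int, int]]:
    """Summarise the identification file; None means the file is missing."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_mtime_ns, st.st_size)


def storage_is_occupied(
    read_fingerprint: Callable[[], Hashable],
    interval_s: float,       # W: seconds between two detections
    unchanged_limit: int,    # T: consecutive unchanged detections required
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as the file changes, False after T unchanged checks."""
    last = read_fingerprint()
    for _ in range(unchanged_limit):
        sleep(interval_s)
        if read_fingerprint() != last:
            return True   # the holder is still updating the file
    return False          # no update seen T times in a row


# Simulated readings instead of a real file: the third reading differs.
readings = iter(["t0", "t0", "t1"])
print(storage_is_occupied(lambda: next(readings), 0, 3, sleep=lambda s: None))
```

With a real file, the callback could be `lambda: file_fingerprint(path)`, where `path` is wherever the identification file lives on the shared storage device.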
  • Step 504 Create a placeholder file in a specific directory of the shared storage device, and detect, after a preset time, whether there is a placeholder file created by another preparatory management node in the specific directory, if yes, execute step 505; If not, step 508 is performed.
  • a placeholder file may be created in a specific directory of the shared storage device, where the placeholder file includes the node number information of the first node and the node number information of the subgroup in which the first node is located; correspondingly, the placeholder file created by another preliminary management node carries the node number information of that node and the node number information of its subgroup. The length of the preset time may be set according to actual needs, but needs to be greater than or equal to S seconds.
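A placeholder file of this kind could look like the following Python sketch. The JSON layout, the `placeholder_<node>.json` file name, and the function names are assumptions made for illustration; the application only states that the file carries the node number of its creator and the node numbers of the creator's subgroup.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def write_placeholder(directory: Path, node_id: int, subgroup: List[int]) -> Path:
    """Create this node's placeholder file in the agreed directory."""
    path = directory / f"placeholder_{node_id}.json"
    path.write_text(json.dumps({"node": node_id, "subgroup": sorted(subgroup)}))
    return path


def read_other_placeholders(directory: Path, node_id: int) -> List[Dict]:
    """Return the records written by other preliminary management nodes."""
    others = []
    for path in sorted(directory.glob("placeholder_*.json")):
        record = json.loads(path.read_text())
        if record["node"] != node_id:
            others.append(record)
    return others


# A temporary directory stands in for the specific directory on shared storage.
with tempfile.TemporaryDirectory() as tmp:
    shared_dir = Path(tmp)
    write_placeholder(shared_dir, 1, [1, 2])
    write_placeholder(shared_dir, 3, [3, 4, 5])
    print(read_other_placeholders(shared_dir, 1))
```

In a real deployment the read would happen only after the preset waiting time, so that every preliminary management node has had a chance to write its own file first.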
  • Step 505 Compare the number of nodes of the subgroup in which the first node is located with the number of nodes of the subgroup in which each other preliminary management node is located, and determine whether the first node's subgroup has the largest number of nodes. If yes, go to step 506; if not, go to step 509.
  • Step 506 Determine whether any subgroup of another preliminary management node has the same number of nodes as the subgroup in which the first node is located. If yes, go to step 507; if not, go to step 508.
  • Step 507 Determine whether the node number of the first node is smaller than the node number of the preliminary management node of the subgroup that has the same number of nodes as the first node's subgroup. If yes, go to step 508; if no, go to step 509.
  • Step 508 Determine that the first node can become a management node and, as the management node, perform reconfiguration between nodes and task scheduling between nodes.
  • The first node's position as the management node may be maintained, or the management node may be re-determined according to the first management policy and the second management policy.
  • Step 509 Determine that the first node cannot become a management node, and end the current processing flow.
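The decision in steps 504 to 509 can be sketched as a pure function over the placeholder files that were found. This is an illustrative sketch only: the `Placeholder` class and the function name are hypothetical, and reading the files from the shared directory is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placeholder:
    node_number: int    # node number of a preliminary management node
    subgroup_size: int  # number of nodes in that node's subgroup


def can_become_manager(own: Placeholder, others: list[Placeholder]) -> bool:
    """Apply steps 504-509 to the first node's own placeholder and the
    placeholders created by other preliminary management nodes."""
    if not others:
        return True  # step 504 -> step 508: no competing placeholder
    if own.subgroup_size < max(p.subgroup_size for p in others):
        return False  # step 505 -> step 509: a larger subgroup exists
    ties = [p for p in others if p.subgroup_size == own.subgroup_size]
    if not ties:
        return True  # step 506 -> step 508: strictly the largest subgroup
    # step 507: among equally sized subgroups, the smallest node number wins
    return own.node_number < min(p.node_number for p in ties)
```

Because every preliminary management node evaluates the same rule over the same set of files, at most one of them should reach step 508, provided all placeholders are visible when the check runs.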
  • With this embodiment, when a heartbeat abnormality occurs in the cluster, the first node determines whether it can become the management node based on whether the shared storage device is occupied, the creation and detection of placeholder files, the number of nodes in its subgroup, its own node number, and so on. When it can, it performs inter-node reconfiguration and inter-node task scheduling of cluster resources as the management node, so that the cluster load is balanced, split-brain is effectively avoided, and the high availability and reliability of the cluster are ensured.
  • FIG. 6 is a schematic flowchart of a method for managing a node in a cluster according to an embodiment of the present invention.
  • the cluster does not have an external network connection, and only a shared storage device that supports single-node exclusive access exists.
  • the cluster in the embodiment of the present invention is shown in FIG.
  • Step 601 The first node determines, according to the preset first management policy, that the second node is a management node, so that the second node performs resource configuration and task scheduling.
  • Since the cluster has a shared storage device, after this step, that is, after the second node is determined to be the management node, the second node may create a specific partition, directory, or file on the shared storage device, to be used for negotiation between nodes when the cluster becomes abnormal.
  • The first node determines that the node with the smallest node number in the cluster is the management node.
  • Step 602 When the first node detects that the heartbeat connection between the nodes is abnormal, the first node determines that it is the preliminary management node of the subgroup according to the first management policy.
  • When a heartbeat connection in the cluster is abnormal, the cluster splits into two or more subgroups, and the first node is in one of them. The nodes in each subgroup determine the preliminary management node of their own subgroup according to the first management policy.
  • FIG. 4 is a schematic diagram of a cluster splitting into multiple subgroups and determining preliminary management nodes in an embodiment of the present invention.
  • The first node determining, according to the first management policy, that it is the preliminary management node of its subgroup includes: the first node determines that it is the node with the smallest node number in its subgroup, and therefore determines that it is the preliminary management node of that subgroup.
  • Each subgroup produced by the split caused by abnormal heartbeats between nodes determines its own preliminary management node based on the first management policy.
  • One of the preliminary management nodes is then further determined to be the management node, which performs inter-node reconfiguration and inter-node task scheduling of cluster resources; correspondingly, the subgroup of that management node is the working subgroup, and the other subgroups stop working.
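The first management policy (smallest node number in the subgroup) can be sketched as below. The function name and the idea of representing a subgroup as the set of peers whose heartbeats are still received are assumptions made for illustration.

```python
def is_preliminary_manager(own_number: int, reachable_peers: set[int]) -> bool:
    """Return True if this node has the smallest node number in its subgroup.

    reachable_peers: node numbers of the peers this node can still reach by
                     heartbeat (hypothetical representation of the subgroup).
    """
    subgroup = reachable_peers | {own_number}
    return own_number == min(subgroup)


# Hypothetical split of a five-node cluster into subgroups {1, 2} and {3, 4, 5}.
first_subgroup_leader = is_preliminary_manager(1, {2})
second_subgroup_leader = is_preliminary_manager(3, {4, 5})
not_a_leader = is_preliminary_manager(4, {3, 5})
```

Each node can evaluate this locally, so no message exchange across the broken heartbeat link is needed to pick the preliminary management node of each subgroup.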
  • Step 603 The first node determines its own access time to the first partition of the shared storage device.
  • the first partition is a small partition that is created by the second node on the shared storage device for the disaster-tolerant node negotiation.
  • the first partition is in an empty state.
  • the shared storage device of the cluster only supports single-node exclusive access, that is, only one node is allowed to access the shared storage device at the same time.
  • The time range in which the node numbered N can access the device runs from n*M + N*T seconds to n*M + (N+1)*T seconds after 00:00 of the day, where n is an integer greater than or equal to 0, M is the total number of nodes, N is the node number, and T is the configurable safe access duration.
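The access window above can be checked with simple modular arithmetic. The sketch below rests on an assumption that is not stated in the text: it treats the repeat interval of the schedule as an explicit parameter `period_s` in seconds, which must be at least the number of nodes multiplied by T for the windows of different nodes not to overlap. All function and parameter names are hypothetical.

```python
def in_access_window(now_s: int, node_number: int, slot_s: int, period_s: int) -> bool:
    """Return True if `now_s` (seconds since 00:00) falls in the node's window.

    slot_s:   T in the text, the safe access duration per node.
    period_s: assumed length of one full cycle in seconds.
    """
    offset = now_s % period_s
    return node_number * slot_s <= offset < (node_number + 1) * slot_s


def seconds_until_window(now_s: int, node_number: int, slot_s: int, period_s: int) -> int:
    """Return how long the node must wait before its next window opens."""
    if in_access_window(now_s, node_number, slot_s, period_s):
        return 0
    offset = now_s % period_s
    # Python's % always returns a non-negative result for a positive modulus,
    # so this also covers the case where the window has already passed.
    return (node_number * slot_s - offset) % period_s
```

With `slot_s=10` and `period_s=40`, node 2 owns the interval from second 20 up to, but not including, second 30 of every 40-second cycle.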
  • Step 604 When the first node determines that its own access time has arrived, it mounts the first partition and determines whether a placeholder file exists in the first partition. If not, step 605 is performed; if so, step 606 is performed.
  • If the first node finds a placeholder file in the first partition, it determines that the shared storage device is occupied; if no placeholder file exists in the first partition, the shared storage device is not occupied. The placeholder file is a file created by a preliminary management node that carries its own node number and the number of nodes in its subgroup.
  • Step 605 Determine that the first node can become the management node, create a placeholder file in the first partition, and unmount the first partition.
  • A placeholder file is created in the first partition to mark the shared storage device as occupied, and the first node, as the management node, performs inter-node reconfiguration and inter-node task scheduling.
  • The first node's position as the management node may be maintained, or the management node may be re-determined according to the first management policy and the second management policy.
  • Step 606 Determine that it cannot become a management node, and end the current processing flow.
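Steps 604 to 606 can be sketched as follows. Mounting is platform-specific, so the sketch takes hypothetical `mount` and `unmount` callables and treats `mount_point` as the directory where the first partition appears; the marker file name and its text format are also assumptions. Unmounting on the failure path is a choice made in this sketch, not something the text specifies.

```python
from pathlib import Path
from typing import Callable


def try_claim_partition(
    mount_point: Path,
    node_number: int,
    subgroup_size: int,
    mount: Callable[[], None],
    unmount: Callable[[], None],
    marker_name: str = "placeholder",
) -> bool:
    """Return True if this node created the placeholder file and may become
    the management node, False if another node's placeholder already exists."""
    mount()  # step 604: mount the first partition in the node's own window
    try:
        marker = mount_point / marker_name
        if marker.exists():
            return False  # step 606: the shared storage is already occupied
        # step 605: record the node number and the subgroup's node count
        marker.write_text(f"{node_number} {subgroup_size}\n")
        return True
    finally:
        unmount()  # release the partition so the next node can mount it
```

The check-then-create sequence is not atomic by itself; in this scheme it relies on the time-slot rule to ensure that only one node has the partition mounted at any moment.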
  • With this embodiment, after determining that it is the preliminary management node of its subgroup, the first node determines its access time to the first partition of the shared storage device and, when that time arrives, determines whether it can become the management node based on whether the first partition is occupied. If it can, it performs inter-node reconfiguration and inter-node task scheduling as the management node, so that the cluster load is balanced, split-brain is effectively avoided, and the high availability and reliability of the cluster are ensured.
  • FIG. 7 is a schematic structural diagram of a node device according to an embodiment of the present invention.
  • The node device in the embodiment of the present invention includes a determining module 71 and a judging module 72;
  • the determining module 71 is configured to, when an abnormal heartbeat connection between nodes is detected, determine according to the preset first management policy that the first node is the preliminary management node of its subgroup;
  • the judging module 72 is configured to determine, based on the preset second management policy, whether the first node can be converted from a preliminary management node into a management node, so that, when the determination is yes, the first node, as the management node, performs inter-node reconfiguration and inter-node task scheduling of cluster resources.
  • the determining module 71 is further configured to determine, according to a third management policy, that the second node is the management node, so that the second node performs resource configuration and task scheduling.
  • the judging module 72 is configured to determine, according to a preset network detection manner, whether the first node has an external network connection; when the determination is yes, determine that the first node can be converted into a management node; when the determination is no, determine that the first node cannot be converted into a management node.
  • the shared storage device of the cluster supports multi-node common access;
  • the judging module 72 is configured to determine whether the shared storage device is occupied; when it is not occupied, create a placeholder file in a specific directory of the shared storage device and, after a preset time, detect whether a placeholder file created by another preliminary management node exists in the specific directory; if not, determine that the first node can be converted into a management node; if so, compare the number of nodes in the first node's subgroup with the number of nodes in the other preliminary management node's subgroup and determine, based on the comparison result, whether the first node can be converted from a preliminary management node into a management node.
  • the shared storage device of the cluster supports single-node exclusive access;
  • the judging module 72 is configured to determine the access time of the first node to the first partition of the shared storage device, mount the first partition when the access time arrives, and determine whether a placeholder file exists in the first partition; if no placeholder file exists in the first partition, determine that the first node can be converted into a management node.
  • the determining module 71 is configured to determine that the first node is the node with the smallest node number in its subgroup, in which case the first node is the preliminary management node of its subgroup.
  • the determining module 71 and the judging module 72 in the node device may be implemented by a central processing unit (CPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or an application-specific integrated circuit (ASIC) in a terminal or server.
  • An embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the node management method in a cluster described above.
  • the foregoing storage medium includes: a mobile storage device, a random access memory (RAM), a read-only memory (ROM), a magnetic disk, an optical disk, or another medium that can store program code.
  • The above integrated unit of the embodiment of the present invention, if implemented in the form of a software function module and sold or used as a stand-alone product, may also be stored in a computer-readable storage medium.
  • The technical solution of the embodiment of the present invention may be embodied in the form of a software product stored in a storage medium, including a plurality of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the various embodiments of the invention.
  • the foregoing storage medium includes various media that can store program codes, such as a mobile storage device, a RAM, a ROM, a magnetic disk, or an optical disk.
  • When the heartbeat connection between nodes is abnormal, the cluster splits into two or more subgroups. After determining that it is the preliminary management node of its subgroup, the first node further determines whether it can become the management node of the cluster and, when the determination is yes, performs inter-node reconfiguration and inter-node task scheduling of cluster resources as the management node. This effectively avoids split-brain and ensures the high availability and reliability of the cluster.
  • Because the first node is a node within the cluster, no third-party management device needs to be introduced, and the implementation is simple.

Abstract

Disclosed are a node management method in cluster and a node device. The method comprises: when a first node detects an internode heartbeat connecting abnormality, determining, according to a first management strategy, that the first node is a prepared management node of a cluster in which the first node is located; and determining whether the first node can be converted into a management node from the prepared management node on the basis of a second management strategy, and performing internode reconfiguration and internode task scheduling on cluster resources by the first node as the management node when determining that the first node can be converted into the management node from the prepared management node.

Description

Node management method and node device in a cluster
Technical Field
The present application relates to, but is not limited to, the field of communication technologies, and in particular to a node management method and a node device in a cluster.
Background
To keep the overall service of a cluster as available as possible, when a node in a high-availability cluster fails, the cluster system should respond quickly and assign the tasks of the failed system to other working nodes in the cluster for execution, while the failed node's shared resources (such as IP addresses and disk arrays) are taken over by other nodes.
Generally, nodes in a high-availability cluster use heartbeats to detect each other's status. However, when the heartbeat fails, a split-brain problem may occur. Split-brain can compromise data integrity and may seriously affect services. A high-availability cluster inevitably has to face the split-brain problem, and there are currently several solutions to it:
1) Add redundant heartbeats; however, this only reduces split-brain and cannot prevent it;
2) Monitor for split-brain and raise alarms, for example by email or SMS, so that people can step in to arbitrate when the problem occurs and reduce losses; however, this requires manual intervention;
3) Enable disk locks: the serving party locks the shared disk so that, when split-brain occurs, the other party cannot seize the shared disk resources at all. However, if the party holding the shared disk does not actively release the lock, the other party can never obtain the shared disk; if the node holding the shared disk suddenly hangs or crashes, the unlock command can no longer be executed, and the standby node cannot take over the shared resources and application services;
4) Add a third-party arbitration mechanism to determine which party obtains the resources; however, this requires introducing a third party, and if the third party fails, the split-brain problem cannot be resolved.
Summary of the Invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
Embodiments of the present invention provide a node management method and a node device in a cluster, which can effectively solve the split-brain problem by using the cluster's existing resources, without introducing a third-party device, and ensure the high availability and reliability of the cluster.
An embodiment of the present invention provides a node management method in a cluster. The method is applied to a first node and includes:
when an abnormal heartbeat connection between nodes is detected, determining, according to a first management policy, that the first node is the preliminary management node of its subgroup;
determining, based on a second management policy, whether the first node can be converted from a preliminary management node into a management node, and, when it is determined that it can, performing, as the management node, inter-node reconfiguration and inter-node task scheduling of cluster resources.
In an embodiment, before the abnormal heartbeat connection between nodes is detected, the method further includes:
determining, according to a third management policy, that a second node is the management node, so that the second node performs resource configuration and task scheduling.
In an embodiment, determining, based on the second management policy, whether the first node can be converted from a preliminary management node into a management node includes:
determining, based on a preset network detection manner, whether the first node has an external network connection; if so, determining that the first node can be converted into a management node; if not, determining that the first node cannot be converted into a management node.
In an embodiment, the shared storage device in the cluster supports multi-node common access;
determining, based on the second management policy, whether the first node can be converted from a preliminary management node into a management node includes:
when it is determined that the shared storage device is not occupied, creating a placeholder file on the shared storage device and, after a preset time, detecting whether a placeholder file created by another preliminary management node exists in the specific directory; if not, determining that the first node can be converted into a management node; if so, comparing the number of nodes in the first node's subgroup with the number of nodes in the other preliminary management node's subgroup, and determining, based on the comparison result, whether the first node can be converted from a preliminary management node into a management node.
In an embodiment, comparing the number of nodes in the first node's subgroup with the number of nodes in the other preliminary management node's subgroup, and determining, based on the comparison result, whether the first node can be converted from a preliminary management node into a management node includes:
when it is determined that the first node's subgroup has more nodes than the other preliminary management node's subgroup, determining that the first node can be converted into a management node;
when it is determined that the first node's subgroup has the most nodes and another subgroup has the same number of nodes as the first node's subgroup, determining whether the node number of the first node is smaller than the node number of the preliminary management node of the subgroup with the same number of nodes; if so, determining that the first node can be converted into a management node; if not, determining that the first node cannot be converted into a management node.
In an embodiment, the shared storage device in the cluster supports single-node exclusive access;
determining, based on the second management policy, whether the first node can be converted from a preliminary management node into a management node includes:
determining the access time of the first node to the first partition of the shared storage device, and mounting the first partition when the access time arrives; when it is determined that no placeholder file exists in the first partition, determining that the first node can be converted into a management node; when it is determined that a placeholder file exists in the first partition, determining that the first node cannot be converted into a management node.
In an embodiment, determining, according to the first management policy, that the first node is the preliminary management node of its subgroup includes:
determining that the first node is the node with the smallest node number in its subgroup, in which case the first node is the preliminary management node of its subgroup.
An embodiment of the present invention further provides a node device. The node device includes a determining module and a judging module;
the determining module is configured to, when an abnormal heartbeat connection between nodes is detected, determine, according to the first management policy, that the first node is the preliminary management node of its subgroup;
the judging module is configured to determine, based on the second management policy, whether the first node can be converted from a preliminary management node into a management node, and, when it is determined that it can, perform, as the management node, inter-node reconfiguration and inter-node task scheduling of cluster resources.
In an embodiment, the determining module is further configured to determine, according to a third management policy, that a second node is the management node, so that the second node performs resource configuration and task scheduling.
In an embodiment, the judging module is configured to determine, based on a preset network detection manner, whether the first node has an external network connection; if so, determine that the first node can be converted into a management node; if not, determine that the first node cannot be converted into a management node.
In an embodiment, the shared storage device in the cluster supports multi-node common access;
the judging module is configured to, when it is determined that the shared storage device is not occupied, create a placeholder file in a specific directory of the shared storage device and, after a preset time, detect whether a placeholder file created by another preliminary management node exists in the specific directory; if not, determine that the first node can be converted into a management node; if so, compare the number of nodes in the first node's subgroup with the number of nodes in the other preliminary management node's subgroup, and determine, based on the comparison result, whether the first node can be converted from a preliminary management node into a management node.
In an embodiment, the judging module is configured to, when it is determined that the first node's subgroup has more nodes than the other preliminary management node's subgroup, determine that the first node can be converted into a management node;
and, when it is determined that the first node's subgroup has the most nodes and another subgroup has the same number of nodes as the first node's subgroup, determine whether the node number of the first node is smaller than the node number of the preliminary management node of the subgroup with the same number of nodes; if so, determine that the first node can be converted into a management node; if not, determine that the first node cannot be converted into a management node.
In an embodiment, the shared storage device in the cluster supports single-node exclusive access;
the judging module is configured to determine the access time of the first node to the first partition of the shared storage device, mount the first partition when the access time arrives, and determine whether a placeholder file exists in the first partition; when it is determined that no placeholder file exists in the first partition, determine that the first node can be converted into a management node; when it is determined that a placeholder file exists in the first partition, determine that the first node cannot be converted into a management node.
In an embodiment, the determining module is configured to determine that the first node is the node with the smallest node number in its subgroup, in which case the first node is the preliminary management node of its subgroup.
An embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the node management method in a cluster described above.
By applying the node management method and node device in a cluster of the embodiments of the present invention, when the first node in a high-availability cluster detects an abnormal heartbeat connection between nodes, it determines, according to a preset first management policy, that the first node is a preliminary management node, and determines, based on a preset second management policy, whether the first node can be converted from a preliminary management node into a management node, so that, when the determination is yes, it performs inter-node reconfiguration and inter-node task scheduling of cluster resources as the management node. In this way, when the heartbeat connection between nodes is abnormal and the cluster splits into two or more subgroups, the first node, after determining that it is the preliminary management node of its subgroup, further determines whether it can become the management node of the cluster and, if so, performs inter-node reconfiguration and inter-node task scheduling of cluster resources as the management node. This effectively avoids split-brain and ensures the high availability and reliability of the cluster; moreover, because the first node is a node within the cluster, no third-party management device needs to be introduced, and the implementation is simple.
Other aspects will be apparent upon reading and understanding the drawings and the detailed description.
Brief Description of the Drawings
FIG. 1 is a first schematic flowchart of a node management method in a cluster according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the architecture of a cluster in an embodiment of the present invention;
FIG. 3 is a second schematic flowchart of a node management method in a cluster according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a cluster splitting into multiple subgroups and determining preliminary management nodes in an embodiment of the present invention;
FIG. 5 is a third schematic flowchart of a node management method in a cluster according to an embodiment of the present invention;
FIG. 6 is a fourth schematic flowchart of a node management method in a cluster according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a node device according to an embodiment of the present invention.
Detailed Description
The inventor found that a high-availability cluster inevitably has to face the split-brain problem. A common split-brain situation can be described as follows:
Node A and node B in a cluster use heartbeat detection to confirm each other's presence; when a node cannot confirm the other's presence through heartbeat detection, it takes over the corresponding shared resources. If the heartbeat between node A and node B suddenly disappears (for example, because the network is disconnected) while both nodes are in fact still in the Active state, node A tries to take over node B's resources and, at the same time, node B tries to take over node A's resources. This is split-brain.
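The scenario above can be illustrated with a toy model of naive failover, in which each node claims the resources of every peer whose heartbeat it no longer hears. The resource names and the function are hypothetical and only serve to show why both nodes end up claiming the same resources.

```python
def claimed_resources(
    node: str,
    heartbeats_heard: set[str],
    owned: dict[str, set[str]],
) -> set[str]:
    """Naive failover: keep own resources and take over those of every
    node whose heartbeat is not heard."""
    claimed = set(owned[node])
    for peer, resources in owned.items():
        if peer != node and peer not in heartbeats_heard:
            claimed |= resources
    return claimed


# Hypothetical ownership before the heartbeat link between A and B breaks.
owned = {"A": {"ip-1"}, "B": {"disk-1"}}

# Both nodes are alive, but neither hears the other's heartbeat.
claims_a = claimed_resources("A", set(), owned)
claims_b = claimed_resources("B", set(), owned)

# Every resource now has two claimants, which is the split-brain condition.
contested = claims_a & claims_b
```

With a working heartbeat, `claimed_resources("A", {"B"}, owned)` would return only node A's own resources and nothing would be contested.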
当集群的心跳网络出现异常时,集群可能分裂成多个节点组,即子群,每个子群分别接管服务并且访问文件系统资源(例如并发写入文件系统)导致数据损坏。When an abnormality occurs in the heartbeat network of the cluster, the cluster may be split into multiple node groups, that is, subgroups, each of which takes over the service and accesses file system resources (for example, concurrently written to the file system) to cause data corruption.
脑裂会引起数据的不完整性,并且可能会对服务造成严重影响。Brain splitting can cause data incompleteness and can have a serious impact on the service.
集群的脑裂会引起数据的不完整性:集群中的节点(在脑裂期间)同时访问同一共享资源,而此时并没有锁机制来控制针对该数据访问,那么就存在数据的不完整性或其他错误的可能。The brain splitting of the cluster can cause data incompleteness: the nodes in the cluster (during the brain splitting) access the same shared resource at the same time, and there is no lock mechanism to control access to the data, then there is data integrity. Or other possible errors.
集群的脑裂还会对服务造成严重影响,举例来说,可能节点A和节点B不停在抢占一个IP资源,造成网络数据不能正常传输。The splitting of the cluster will also have a serious impact on the service. For example, node A and node B may not be preempting an IP resource, causing network data to fail to transmit.
Split-brain therefore has serious negative consequences. In the embodiments of the present invention, when a first node in the cluster detects an abnormal heartbeat connection between nodes, it determines, according to a preset first management policy, that the first node is the preliminary management node of its subgroup. It then judges, based on a preset second management policy, whether the first node can be converted from a preliminary management node into the management node and, if so, acts as the management node to reconfigure cluster resources among nodes and schedule tasks among nodes. In this way, when an abnormal heartbeat connection splits the cluster into two or more subgroups, the first node, after determining that it is the preliminary management node of its subgroup, further judges whether it can become the management node of the cluster and, if so, reconfigures cluster resources and schedules tasks among nodes as the management node. This effectively avoids split-brain and ensures the high availability and reliability of the cluster. Moreover, because the first node is a node within the cluster, no third-party management device needs to be introduced, which keeps the implementation simple.
Further details are described below with reference to the accompanying drawings and embodiments.
Embodiment 1
FIG. 1 is a schematic flowchart of a node management method in a cluster according to an embodiment of the present invention. The method is applied to a first node, which is a node in the cluster. FIG. 2 is a schematic architecture diagram of a cluster according to an embodiment of the present invention. As shown in FIG. 1 and FIG. 2, the node management method in a cluster according to this embodiment includes:
Step 101: When an abnormal heartbeat connection between nodes is detected, determine, according to a preset first management policy, that the first node is the preliminary management node of its subgroup.
Here, when the first node determines through heartbeat detection that the heartbeat connection between nodes is abnormal, that is, some nodes are unreachable on the cluster's current heartbeat plane, the cluster has split into two or more subgroups, and the first node belongs to one of them.
The first node determines, according to the preset first management policy, that it is the preliminary management node of its subgroup. Because in practice the first management policy applies to every node in the cluster, this can be understood as all nodes of the first node's subgroup electing the first node as the preliminary management node under the first management policy. Likewise, the nodes in the other subgroups produced by the heartbeat anomaly elect a preliminary management node for their own subgroup according to the first management policy.
In practice, the preset first management policy may be any preset election rule. For example, the node with the smallest (or largest) node number may be designated as the preliminary management node; every node in the cluster has a unique number.
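The election rule in the example above can be sketched as follows. This is only an illustration under stated assumptions: the function name `elect_preliminary_manager` and the `prefer` parameter are hypothetical, and node numbers are assumed to be comparable integers.

```python
def elect_preliminary_manager(subgroup_node_numbers, prefer="min"):
    """Pick the preliminary management node of one subgroup.

    subgroup_node_numbers: unique node numbers of the nodes that can still
    exchange heartbeats with each other (one subgroup).
    prefer: "min" elects the smallest number, "max" the largest
    (hypothetical switch covering both variants of the example rule).
    """
    numbers = list(subgroup_node_numbers)
    if not numbers:
        raise ValueError("a subgroup must contain at least one node")
    return min(numbers) if prefer == "min" else max(numbers)


# Every node of a subgroup applies the same rule to the same membership,
# so all of them arrive at the same preliminary management node.
subgroup_one = elect_preliminary_manager({1, 2})
subgroup_two = elect_preliminary_manager([3, 4, 5])
```

Because the rule is deterministic and depends only on subgroup membership, no extra message exchange is needed inside a subgroup to agree on the result.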
Based on the above embodiment, in practical applications, before the first node detects the abnormal heartbeat connection between nodes, the method may further include:
determining, according to a third management policy, that a second node is the management node, so that the second node performs resource configuration and task scheduling. The third management policy may be the same as or different from the first management policy. When they are the same, while the cluster is operating normally the nodes in the cluster elect the second node as the management node through the preset first management policy to perform resource configuration and task scheduling, for example creating a specific partition, directory, or file on the shared storage device for negotiation between nodes when the cluster is abnormal. For instance, the node with the smallest node number in the cluster may be determined to be the second node, and the second node then becomes the management node.
Note that under all circumstances the cluster has only one management node at any given moment. Therefore, once a heartbeat anomaly occurs in the cluster, the management node that was determined while the cluster was normal is no longer the management node.
Step 102: Judge, based on a preset second management policy, whether the first node can be converted from a preliminary management node into the management node, and if so, act as the management node to reconfigure cluster resources among nodes and schedule tasks among nodes.
In practice, each subgroup produced by the heartbeat anomaly determines its own preliminary management node based on the first management policy. To avoid split-brain, however, only one subgroup may keep working. One of the preliminary management nodes is therefore further selected as the management node to reconfigure cluster resources among nodes and schedule tasks among nodes. The subgroup containing the selected management node is the working subgroup, and all other subgroups stop working.
The first node judges, based on the preset second management policy, whether it can become the management node and, if so, acts as the management node to reconfigure cluster resources among nodes and schedule tasks among nodes.
Here, the second management policy is preset according to the actual conditions of the cluster. In practical applications, if the cluster has an externally connected network, the second management policy may be set as follows: the first node judges whether it has an external network connection, that is, whether it can reach an external network entity. If so, it is determined that the first node can be converted into the management node; if not, it is determined that the first node cannot be converted into the management node.
When the cluster has no externally connected network and the cluster's shared storage device supports concurrent access by multiple nodes, the second management policy may be set as follows: the first node judges whether the shared storage device is occupied. If the shared storage device is not occupied, the first node creates a placeholder file in a specific directory of the shared storage device and, after a preset time, checks whether that directory contains placeholder files created by other preliminary management nodes. If none exists, it is determined that the first node can be converted into the management node. If one exists, the number of nodes in the first node's subgroup is compared with the number of nodes in the subgroups of the other preliminary management nodes, and whether the first node can be converted from a preliminary management node into the management node is decided based on the comparison result.
When the cluster has no externally connected network and the cluster's shared storage device supports only exclusive access by a single node, the second management policy may be set as follows: the first node determines its access time for a first partition of the shared storage device, mounts the first partition when that access time arrives, and judges whether a placeholder file exists in the first partition. If no placeholder file exists in the first partition, it is determined that the first node can be converted into the management node.
With the above embodiment, when a heartbeat anomaly occurs between nodes in the cluster, the first node, after determining that it is the preliminary management node of its subgroup, further judges whether it can become the management node of the cluster and, if so, acts as the management node to reconfigure cluster resources among nodes and schedule tasks among nodes so that the cluster load is balanced. This effectively avoids split-brain and ensures the high availability and reliability of the cluster. Moreover, because the first node is a node within the cluster, no third-party management device needs to be introduced, which keeps the implementation simple.
Embodiment 2
FIG. 3 is a schematic flowchart of a node management method in a cluster according to an embodiment of the present invention, in which the cluster has an external network connection. As shown in FIG. 3, the node management method in a cluster according to this embodiment includes:
Step 301: The first node determines, according to the preset first management policy, that the second node is the management node, so that the second node performs resource configuration and task scheduling.
In this embodiment, because the cluster has an externally connected network, the externally connected network plane may be preset as a redundant heartbeat plane. If there are multiple network planes, a critical network plane may be selected as the redundant heartbeat plane. A critical network plane may be one whose failure would affect data processing.
If the cluster has a shared storage device, then after this step, that is, after the second node is determined to be the management node, the second node may create a specific partition, directory, or file on the shared storage device for negotiation between nodes when the cluster is abnormal.
In this embodiment, the first node determines that the node with the smallest node number in the cluster is the management node.
Step 302: When the first node detects an abnormal heartbeat connection between nodes, it determines, according to the first management policy, that it is the preliminary management node of its subgroup.
Here, when a heartbeat connection in the cluster is abnormal, the cluster splits into two or more subgroups, and the first node belongs to one of them. The nodes in each subgroup determine the preliminary management node of their own subgroup according to the first management policy. FIG. 4 is a schematic diagram of a cluster splitting into multiple subgroups and determining preliminary management nodes according to an embodiment of the present invention. In FIG. 4 the heartbeats between A and C, between A and D, and between A and E are abnormal, so node A and node B form one subgroup and the remaining nodes form another.
In this embodiment, the first node determining, according to the first management policy, that it is the preliminary management node of its subgroup includes: when the first node determines that it has the smallest node number in its subgroup, determining that the first node is the preliminary management node of that subgroup.
In practice, each subgroup produced by the heartbeat anomaly determines its own preliminary management node based on the first management policy. To avoid split-brain, however, only one subgroup may keep working. One of the preliminary management nodes is therefore further selected as the management node to reconfigure cluster resources among nodes and schedule tasks among nodes. The subgroup containing the selected management node is the working subgroup, and all other subgroups stop working.
Step 303: The first node judges, using a preset network detection method, whether it has an external network connection. If so, go to step 304; if not, go to step 305.
In practical applications, the preset network detection method may be ping or the Address Resolution Protocol (ARP).
The first node judging whether it has an external network connection means judging whether it can reach an external network entity.
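The check in step 303 can be illustrated with a short sketch. The text names ping or ARP as detection methods; both usually need raw sockets or an external command, so the sketch below substitutes a plain TCP connection attempt as the probe, which is an assumption and not the method of the embodiment. The names `tcp_probe` and `has_external_connection` are hypothetical, and the probe is passed in as a parameter so that a ping-based or ARP-based probe could be supplied instead.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Stand-in for the ping/ARP detection named in the text.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def has_external_connection(external_targets, probe=tcp_probe):
    """Step 303: is at least one external network entity reachable?

    external_targets: iterable of (host, port) pairs outside the cluster
    (hypothetical configuration; the text does not say how they are chosen).
    """
    return any(probe(host, port) for host, port in external_targets)


# Replace the probe with fixed answers to show the two branches.
goes_to_step_304 = has_external_connection([("gateway", 80)], probe=lambda h, p: True)
goes_to_step_305 = has_external_connection([("gateway", 80)], probe=lambda h, p: False)
```

A preliminary management node for which the function returns True would proceed to step 304, and one for which it returns False would proceed to step 305.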
Step 304: The first node determines that it can become the management node and, as the management node, reconfigures cluster resources among nodes and schedules tasks among nodes.
In practice, if a node's heartbeat recovers, the first node may remain the management node, or the management node may be re-determined according to the first management policy and the second management policy.
Step 305: The first node determines that it cannot become the management node, and this processing flow ends.
With the above embodiment, the cluster manages itself using its own internal nodes. When a heartbeat anomaly occurs in the cluster, the first node, having determined that it is the preliminary management node of its subgroup, further uses the network plane, that is, whether it has an external network connection, to judge whether it can become the management node. If it can, it acts as the management node to reconfigure cluster resources among nodes and schedule tasks among nodes so that the cluster load is balanced. This effectively avoids split-brain and ensures the high availability and reliability of the cluster.
Embodiment 3
FIG. 5 is a schematic flowchart of a node management method in a cluster according to an embodiment of the present invention, in which the cluster has no external network connection but has a shared storage device that supports concurrent access by multiple nodes. As shown in FIG. 5, the node management method in a cluster according to this embodiment includes:
Step 501: The first node determines, according to the preset first management policy, that the second node is the management node, so that the second node performs resource configuration and task scheduling.
In this embodiment, because the cluster has a shared storage device, after this step, that is, after the second node is determined to be the management node, the second node may create a specific partition, directory, or file on the shared storage device for negotiation between nodes when the cluster is abnormal.
In this embodiment, the first node determines that the node with the smallest node number in the cluster is the management node.
Step 502: When the first node detects an abnormal heartbeat connection between nodes, it determines, according to the first management policy, that it is the preliminary management node of its subgroup.
Here, when a heartbeat connection in the cluster is abnormal, the cluster splits into two or more subgroups, and the first node belongs to one of them. The nodes in each subgroup determine the preliminary management node of their own subgroup according to the first management policy. FIG. 4 is a schematic diagram of a cluster splitting into multiple subgroups and determining preliminary management nodes according to an embodiment of the present invention.
In this embodiment, the first node determining, according to the first management policy, that it is the preliminary management node of its subgroup includes: when the first node determines that it has the smallest node number in its subgroup, determining that the first node is the preliminary management node of that subgroup.
In practice, each subgroup produced by the heartbeat anomaly determines its own preliminary management node based on the first management policy. To avoid split-brain, however, only one subgroup may keep working. One of the preliminary management nodes is therefore further selected as the management node to reconfigure cluster resources among nodes and schedule tasks among nodes. The subgroup containing the selected management node is the working subgroup, and all other subgroups stop working.
Step 503: The first node judges whether the cluster's shared storage device is occupied. If it is not occupied, go to step 504; if it is occupied, go to step 509.
While node heartbeats in the cluster are normal, the second node creates an identification file on the cluster's shared storage device to mark the shared storage device as occupied, and updates the identification file periodically (for example every S seconds, where S can be set as needed), for instance by updating the file's creation time and/or content.
When this step is performed, the first node judges whether the shared storage device is occupied by periodically checking whether the identification file created by the second node on the shared storage device has changed. This may include: the first node checks every W seconds (W ≥ S) whether the identification file has changed; if the identification file has not changed after T consecutive checks (T is a positive integer whose value can be set as needed), the shared storage device is determined to be unoccupied; if the identification file has changed, the shared storage device is determined to be occupied.
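The periodic check described for step 503 might look like the sketch below. It is a minimal illustration under assumptions: `read_stamp` is a hypothetical callable that returns something comparable describing the identification file (for example its modification time or a hash of its content), and `sleep` is injectable so that the logic can be exercised without waiting.

```python
import time


def storage_is_occupied(read_stamp, interval_w, checks_t, sleep=time.sleep):
    """Step 503: decide whether the shared storage device is still occupied.

    read_stamp: callable returning the current state of the identification file.
    interval_w: W, seconds between checks (the text requires W >= S).
    checks_t:   T, number of consecutive unchanged checks needed to declare
                the device unoccupied.
    """
    last_seen = read_stamp()
    for _ in range(checks_t):
        sleep(interval_w)
        current = read_stamp()
        if current != last_seen:
            # The old management node is still refreshing the file.
            return True
    # T consecutive checks without any change: nobody is refreshing the file.
    return False


# Simulated stamps: the file never changes in the first run and changes
# on the second check in the second run.
unchanged = iter([100, 100, 100, 100])
refreshed = iter([100, 100, 105])
idle_result = storage_is_occupied(lambda: next(unchanged), 5, 3, sleep=lambda s: None)
busy_result = storage_is_occupied(lambda: next(refreshed), 5, 3, sleep=lambda s: None)
```

Choosing W ≥ S ensures that a live management node has had time to refresh the file at least once between two checks.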
Step 504: Create a placeholder file in a specific directory of the shared storage device and, after a preset time, check whether that directory contains placeholder files created by other preliminary management nodes. If so, go to step 505; if not, go to step 508.
Here, once the first node determines that the shared storage device is not occupied, it can create a placeholder file in the specific directory of the shared storage device. The placeholder file contains the node number of the first node and the number of nodes in the first node's subgroup. Correspondingly, a placeholder file created by another preliminary management node carries that node's number and the number of nodes in its subgroup. The preset time can be set as needed but must be at least S seconds.
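One possible shape for the placeholder files of step 504 is sketched below. The text only states what a placeholder file carries (node number and subgroup node count); the JSON encoding, the `placeholder_<node>.json` file naming, and both function names are assumptions made for illustration. A temporary directory stands in for the specific directory on the shared storage device.

```python
import json
import os
import tempfile


def write_placeholder(directory, node_number, subgroup_size):
    """Create this node's placeholder file (hypothetical JSON layout)."""
    path = os.path.join(directory, f"placeholder_{node_number}.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"node_number": node_number, "subgroup_size": subgroup_size}, fh)
    return path


def read_other_placeholders(directory, own_node_number):
    """Return the records left by other preliminary management nodes."""
    others = []
    for name in sorted(os.listdir(directory)):
        if not (name.startswith("placeholder_") and name.endswith(".json")):
            continue
        with open(os.path.join(directory, name), encoding="utf-8") as fh:
            record = json.load(fh)
        if record["node_number"] != own_node_number:
            others.append(record)
    return others


# Node 1 (subgroup of 2 nodes) and node 3 (subgroup of 3 nodes) both write
# a placeholder; node 1 then looks for placeholders other than its own.
with tempfile.TemporaryDirectory() as shared_dir:
    write_placeholder(shared_dir, node_number=1, subgroup_size=2)
    write_placeholder(shared_dir, node_number=3, subgroup_size=3)
    others_seen_by_node_1 = read_other_placeholders(shared_dir, own_node_number=1)
```

In a real deployment the read would happen only after the preset waiting time, so that every other preliminary management node has had a chance to write its file.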
Step 505: Compare the number of nodes in the first node's subgroup with the number of nodes in the subgroups of the other preliminary management nodes, and judge whether the first node's subgroup has the most nodes. If so, go to step 506; if not, go to step 509.
Step 506: Judge whether any subgroup of the other preliminary management nodes has the same number of nodes as the first node's subgroup. If so, go to step 507; if not, go to step 508.
Step 507: Judge whether the node number of the first node is smaller than the node number of the preliminary management node of each subgroup that has the same number of nodes as the first node's subgroup. If so, go to step 508; if not, go to step 509.
Step 508: Determine that the first node can become the management node; the first node then, as the management node, reconfigures cluster resources among nodes and schedules tasks among nodes.
In practice, if a node's heartbeat recovers, the first node may remain the management node, or the management node may be re-determined according to the first management policy and the second management policy.
Step 509: Determine that the first node cannot become the management node, and end this processing flow.
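The decision made in steps 504 to 509 can be condensed into one function, shown here as a sketch. The record layout (dictionaries with `node_number` and `subgroup_size`) and the function name are assumptions; the branching follows the steps as described, with ties on subgroup size broken in favor of the smaller node number.

```python
def can_become_manager(own, others):
    """Apply steps 504-509 to the first node's record and the others' records.

    own, others[i]: dicts with "node_number" and "subgroup_size"
    (hypothetical layout of the placeholder file content).
    """
    if not others:
        return True  # step 504: no other placeholder, go to step 508

    largest_other = max(o["subgroup_size"] for o in others)
    if own["subgroup_size"] < largest_other:
        return False  # step 505: another subgroup is larger, go to step 509

    tied = [o for o in others if o["subgroup_size"] == own["subgroup_size"]]
    if not tied:
        return True  # step 506: strictly the largest subgroup, go to step 508

    # Step 507: among equally large subgroups the smallest node number wins.
    return all(own["node_number"] < o["node_number"] for o in tied)


node_1 = {"node_number": 1, "subgroup_size": 2}
node_3 = {"node_number": 3, "subgroup_size": 3}
node_5 = {"node_number": 5, "subgroup_size": 2}

smaller_subgroup_loses = can_become_manager(node_1, [node_3])
larger_subgroup_wins = can_become_manager(node_3, [node_1, node_5])
tie_goes_to_lower_number = can_become_manager(node_1, [node_5])
```

Since every preliminary management node evaluates the same records with the same rule, at most one of them reaches step 508.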
With the above embodiment, when a heartbeat anomaly occurs in the cluster, the first node, having determined that it is the preliminary management node of its subgroup, further judges whether it can become the management node based on whether the shared storage device is occupied, the creation and detection of placeholder files, the number of nodes in the first node's subgroup, the node number of the first node, and so on. If it can, it acts as the management node to reconfigure cluster resources among nodes and schedule tasks among nodes so that the cluster load is balanced. This effectively avoids split-brain and ensures the high availability and reliability of the cluster.
Embodiment 4
FIG. 6 is a schematic flowchart of a node management method in a cluster according to an embodiment of the present invention, in which the cluster has no external network connection and has only a shared storage device that supports exclusive access by a single node. As shown in FIG. 6, the node management method in a cluster according to this embodiment includes:
Step 601: The first node determines, according to the preset first management policy, that the second node is the management node, so that the second node performs resource configuration and task scheduling.
In this embodiment, because the cluster has a shared storage device, after this step, that is, after the second node is determined to be the management node, the second node may create a specific partition, directory, or file on the shared storage device for negotiation between nodes when the cluster is abnormal.
In this embodiment, the first node determines that the node with the smallest node number in the cluster is the management node.
Step 602: When the first node detects an abnormal heartbeat connection between nodes, it determines, according to the first management policy, that it is the preliminary management node of its subgroup.
Here, when a heartbeat connection in the cluster is abnormal, the cluster splits into two or more subgroups, and the first node belongs to one of them. The nodes in each subgroup determine the preliminary management node of their own subgroup according to the first management policy. FIG. 4 is a schematic diagram of a cluster splitting into multiple subgroups and determining preliminary management nodes according to an embodiment of the present invention.
In this embodiment, the first node determining, according to the first management policy, that it is the preliminary management node of its subgroup includes: when the first node determines that it has the smallest node number in its subgroup, determining that the first node is the preliminary management node of that subgroup.
In practice, each subgroup produced by the heartbeat anomaly determines its own preliminary management node based on the first management policy. To avoid split-brain, however, only one subgroup may keep working. One of the preliminary management nodes is therefore further selected as the management node to reconfigure cluster resources among nodes and schedule tasks among nodes. The subgroup containing the selected management node is the working subgroup, and all other subgroups stop working.
Step 603: The first node determines its access time for the first partition of the shared storage device.
Here, the first partition is a small partition created by the second node on the shared storage device for node negotiation during disaster recovery. While heartbeats between nodes in the cluster are normal, the first partition is kept empty.
Because the cluster's shared storage device supports only exclusive access by a single node, that is, only one node may access the shared storage device at any one time, the node numbered N may access it during the period from n*M+N*T to n*M+(N+1)*T seconds after midnight of the current day, where n is an integer greater than or equal to 0, M is the total number of nodes, N is the node number, and T is a configurable safe access duration.
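The time-slot scheme can be illustrated with a small calculation. One caveat applies: the sketch assumes that a full rotation over all M nodes lasts M*T seconds, so that the n-th window of node N runs from n*M*T + N*T to n*M*T + (N+1)*T seconds after midnight. This is an interpretation chosen so that windows of different nodes never overlap; the text itself writes the rotation offset as n*M. The function names are hypothetical.

```python
def slot_window(rotation_n, node_number, total_nodes, slot_seconds):
    """Return (start, end) in seconds after midnight of one access window.

    Assumes a rotation length of total_nodes * slot_seconds (see lead-in).
    """
    rotation = total_nodes * slot_seconds
    start = rotation_n * rotation + node_number * slot_seconds
    return start, start + slot_seconds


def in_access_slot(seconds_since_midnight, node_number, total_nodes, slot_seconds):
    """True if the given time falls inside one of the node's windows."""
    slot_index = int(seconds_since_midnight // slot_seconds)
    return slot_index % total_nodes == node_number % total_nodes


# Three nodes, 10-second windows: node 1 owns [10, 20), [40, 50), ...
first_window = slot_window(0, node_number=1, total_nodes=3, slot_seconds=10)
second_window = slot_window(1, node_number=1, total_nodes=3, slot_seconds=10)
inside = in_access_slot(45, node_number=1, total_nodes=3, slot_seconds=10)
outside = in_access_slot(25, node_number=1, total_nodes=3, slot_seconds=10)
```

The scheme relies on the node clocks being reasonably synchronized; T should be chosen large enough to cover clock skew plus the time needed to mount, inspect, and unmount the partition.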
步骤604:第一节点确定自身的访问时间到达时,挂载所述第一分区,并判断所述第一分区内是否存在占位文件,若不存在,执行步骤605;若存 在,则执行步骤606。Step 604: When the first node determines that its own access time arrives, the first partition is mounted, and it is determined whether there is a placeholder file in the first partition. If not, step 605 is performed; Then, step 606 is performed.
这里,当第一节点发现第一分区内存在占位文件时,确定共享存储设备被占用,如果第一分区内不存在占位文件,即共享存储设备未被占用;其中,所述占位文件指的是预备管理节点创建的携带自身节点编号及所在子群节点数的文件。Here, when the first node finds that there is a placeholder file in the first partition, it is determined that the shared storage device is occupied. If the placeholder file does not exist in the first partition, that is, the shared storage device is not occupied; wherein the placeholder file Refers to the file created by the preparatory management node that carries its own node number and the number of nodes in which it is located.
步骤605:确定自身可以成为管理节点,在所述第一分区内创建占位文件,并卸载第一分区。Step 605: Determine that it can become a management node, create a placeholder file in the first partition, and uninstall the first partition.
第一节点确定自身可以成为管理节点后,在所述第一分区内创建占位文件,以标识占用所述共享存储文件,并作为管理节点对集群资源进行节点间的重配置及节点间任务调度。After the first node determines that it can become the management node, a placeholder file is created in the first partition to identify the shared storage file, and the node is reconfigured as a management node and the inter-node task is scheduled. .
In implementation, if a node recovers its heartbeat, the first node may remain the management node, or the management node may be re-determined according to the first management policy and the second management policy.
Step 606: The first node determines that it cannot become the management node and ends the current processing flow.
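Steps 604 to 606 can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: an ordinary directory stands in for the mounted first partition, mounting and unmounting are left to the caller, and the file name, the JSON layout, and the function name are hypothetical. The check-then-create sequence is not atomic on its own; it relies on the exclusive access windows described above.

```python
import json
from pathlib import Path

PLACEHOLDER_NAME = "placeholder.json"  # hypothetical file name


def try_become_manager(partition_dir: Path, node_id: int, subgroup_size: int) -> bool:
    """Run steps 604-606 against an already mounted first partition."""
    placeholder = partition_dir / PLACEHOLDER_NAME
    if placeholder.exists():
        # Step 606: another preliminary management node already occupies the storage.
        return False
    # Step 605: occupy the storage by recording the node number and subgroup size.
    placeholder.write_text(json.dumps({"node_id": node_id, "subgroup_size": subgroup_size}))
    return True
```

The first preliminary management node whose window arrives gets `True` and creates the file; every later caller sees the file and gets `False`.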
With the above embodiment of the present invention, when a heartbeat abnormality occurs in the cluster, the first node, having determined that it is the preliminary management node of its subgroup, determines the time at which it may access the first partition of the shared storage device and, when that time arrives, judges whether it can become the management node based on whether the first partition is occupied. If it can, it acts as the management node and performs inter-node reconfiguration of cluster resources and inter-node task scheduling, so that the load of the cluster system is balanced, split-brain is effectively avoided, and the high availability and reliability of the cluster are ensured.
Embodiment 5
FIG. 7 is a schematic structural diagram of a node device according to an embodiment of the present invention. As shown in FIG. 7, the node device includes a determining module 71 and a judging module 72;
The determining module 71 is configured to, upon detecting an abnormal heartbeat connection between nodes, determine according to a preset first management policy that the first node is the preliminary management node of its subgroup;
The judging module 72 is configured to judge, based on a preset second management policy, whether the first node can be converted from a preliminary management node into a management node, so that, when the judgment is yes, the first node acts as the management node and performs inter-node reconfiguration of cluster resources and inter-node task scheduling.
In an embodiment, the determining module 71 is further configured to determine, according to a third management policy, that a second node is the management node, so that the second node performs resource configuration and task scheduling.
In an embodiment, the judging module 72 is configured to judge, based on a preset network detection method, whether the first node has an external network connection. If so, it determines that the first node can be converted into a management node; if not, it determines that the first node cannot be converted into a management node.
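The text does not say what the preset network detection method is. The sketch below assumes, purely for illustration, that it is a TCP connection attempt to a configured external address; the function name and the choice of probe are hypothetical.

```python
import socket


def has_external_connection(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

A preliminary management node for which this returns `True` would be allowed to convert into the management node, and one for which it returns `False` would not.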
In an embodiment, the shared storage device of the cluster supports concurrent access by multiple nodes;
The judging module 72 is configured to judge whether the shared storage device is occupied and, when it is not occupied, to create a placeholder file in a specific directory of the shared storage device and, after a preset time, to detect whether a placeholder file created by another preliminary management node exists in the specific directory. If none exists, it determines that the first node can be converted into a management node. If one exists, it compares the number of nodes in the subgroup of the first node with the number of nodes in the subgroup of the other preliminary management node and, based on the comparison result, judges whether the first node can be converted from a preliminary management node into a management node.
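The multi-node case can be sketched in Python as follows. The sketch assumes the shared storage is visible as a directory, that each placeholder file is JSON holding the node number and subgroup size, and that the comparison follows the rule recited in claim 5 (a larger subgroup wins, and a tie goes to the smaller node number). The file naming scheme and function names are hypothetical, and the initial check of whether the storage is occupied is omitted.

```python
import json
import time
from pathlib import Path
from typing import Dict, List


def publish_and_collect(claim_dir: Path, node_id: int, subgroup_size: int,
                        wait_seconds: float) -> List[Dict[str, int]]:
    """Write this node's placeholder file, wait, then return the other nodes' files."""
    own = claim_dir / f"node-{node_id}.json"  # hypothetical naming scheme
    own.write_text(json.dumps({"node_id": node_id, "subgroup_size": subgroup_size}))
    time.sleep(wait_seconds)  # the preset time that lets other candidates write theirs
    others = []
    for path in sorted(claim_dir.glob("node-*.json")):
        if path != own:
            others.append(json.loads(path.read_text()))
    return others


def wins_election(node_id: int, subgroup_size: int, others: List[Dict[str, int]]) -> bool:
    """Apply the subgroup-size comparison, then the smaller-node-number tie-break."""
    for other in others:
        if other["subgroup_size"] > subgroup_size:
            return False
        if other["subgroup_size"] == subgroup_size and other["node_id"] < node_id:
            return False
    return True
```

A node would call `publish_and_collect` first and pass the result to `wins_election`; an empty result means no competing placeholder file was found, so the node can convert directly.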
In an embodiment, the shared storage device of the cluster supports exclusive access by a single node;
The judging module 72 is configured to determine the access time of the first node to the first partition of the shared storage device, to mount the first partition when the access time arrives, and to judge whether a placeholder file exists in the first partition. If no placeholder file exists in the first partition, it determines that the first node can be converted into a management node.
In an embodiment, the determining module 71 is configured to determine that the first node is the preliminary management node of its subgroup when the first node is the node with the smallest node number in that subgroup.
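The smallest-node-number rule is simple enough to state in a few lines of Python. The sketch assumes each node knows the node numbers of the peers whose heartbeats it can still receive; the function and parameter names are hypothetical.

```python
from typing import Iterable


def is_preliminary_manager(node_id: int, reachable_peers: Iterable[int]) -> bool:
    """Return True if node_id is the smallest node number in its subgroup."""
    subgroup = set(reachable_peers) | {node_id}
    return node_id == min(subgroup)
```

For a subgroup made up of nodes 2, 3 and 5, only node 2 considers itself the preliminary management node; an isolated node is always the preliminary management node of its own one-node subgroup.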
In the embodiments of the present invention, the determining module 71 and the judging module 72 in the node device may each be implemented by a central processing unit (CPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or an application specific integrated circuit (ASIC) in a terminal or a server.
The above description of the node device is similar to the description of the method above and has the same beneficial effects, which are not repeated here. For technical details not disclosed in the node device embodiment of the present invention, refer to the description of the method embodiments of the present invention.
An embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the above node management method in a cluster.
Those skilled in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a removable storage device, a random access memory (RAM), a read-only memory (ROM), a magnetic disk, or an optical disc.
Alternatively, if the above integrated unit of the embodiments of the present invention is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a removable storage device, a RAM, a ROM, a magnetic disk, or an optical disc.
The above is merely an embodiment of the present invention, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be defined by the scope of protection of the claims.
Industrial Applicability
According to the embodiments of the present invention, when the heartbeat connection between nodes is abnormal, the cluster splits into two or more subgroups. After determining that it is the preliminary management node of its subgroup, the first node further judges whether it can become the management node of the cluster and, if so, acts as the management node and performs inter-node reconfiguration of cluster resources and inter-node task scheduling. This effectively avoids split-brain and ensures the high availability and reliability of the cluster. Moreover, because the first node is itself a node in the cluster, no third-party management device needs to be introduced, which keeps the implementation simple.

Claims (15)

  1. A node management method in a cluster, the method being applied to a first node, the method comprising:
    upon detecting an abnormal heartbeat connection between nodes, determining, according to a first management policy, that the first node is the preliminary management node of its subgroup;
    judging, based on a second management policy, whether the first node can be converted from a preliminary management node into a management node, and, when it is determined that the first node can be converted from a preliminary management node into a management node, performing, as the management node, inter-node reconfiguration of cluster resources and inter-node task scheduling.
  2. The method according to claim 1, wherein, before the abnormal heartbeat connection between nodes is detected, the method further comprises:
    determining, according to a third management policy, that a second node is the management node, so that the second node performs resource configuration and task scheduling.
  3. The method according to claim 1 or 2, wherein the judging, based on the second management policy, whether the first node can be converted from a preliminary management node into a management node comprises:
    judging, based on a preset network detection method, whether the first node has an external network connection; if so, determining that the first node can be converted into a management node; if not, determining that the first node cannot be converted into a management node.
  4. The method according to claim 1 or 2, wherein the shared storage device in the cluster supports concurrent access by multiple nodes;
    the judging, based on the second management policy, whether the first node can be converted from a preliminary management node into a management node comprises:
    when it is determined that the shared storage device is not occupied, creating a placeholder file on the shared storage device, and, after a preset time, detecting whether a placeholder file created by another preliminary management node exists in the specific directory; if none exists, determining that the first node can be converted into a management node; if one exists, comparing the number of nodes in the subgroup of the first node with the number of nodes in the subgroup of the other preliminary management node, and judging, based on the comparison result, whether the first node can be converted from a preliminary management node into a management node.
  5. The method according to claim 4, wherein the comparing the number of nodes in the subgroup of the first node with the number of nodes in the subgroup of the other preliminary management node, and judging, based on the comparison result, whether the first node can be converted from a preliminary management node into a management node comprises:
    when it is determined that the number of nodes in the subgroup of the first node is greater than the number of nodes in the subgroup of the other preliminary management node, determining that the first node can be converted into a management node;
    when it is determined that the subgroup of the first node has the largest number of nodes and that there is a subgroup with the same number of nodes as the subgroup of the first node, judging whether the node number of the first node is smaller than the node number of the preliminary management node in the subgroup having the same number of nodes as the subgroup of the first node; if so, determining that the first node can be converted into a management node; if not, determining that the first node cannot be converted into a management node.
  6. The method according to claim 1 or 2, wherein the shared storage device in the cluster supports exclusive access by a single node;
    the judging, based on the second management policy, whether the first node can be converted from a preliminary management node into a management node comprises:
    determining the access time of the first node to a first partition of the shared storage device, and mounting the first partition when the access time arrives; when it is determined that no placeholder file exists in the first partition, determining that the first node can be converted into a management node; when it is determined that a placeholder file exists in the first partition, determining that the first node cannot be converted into a management node.
  7. The method according to claim 1 or 2, wherein the determining, according to the first management policy, that the first node is the preliminary management node of its subgroup comprises:
    when it is determined that the first node is the node with the smallest node number in the subgroup of the first node, the first node is the preliminary management node of its subgroup.
  8. A node device, the node device comprising a determining module and a judging module;
    the determining module is configured to, upon detecting an abnormal heartbeat connection between nodes, determine, according to a first management policy, that the first node is the preliminary management node of its subgroup;
    the judging module is configured to judge, based on a second management policy, whether the first node can be converted from a preliminary management node into a management node, and, when it is determined that the first node can be converted from a preliminary management node into a management node, to perform, as the management node, inter-node reconfiguration of cluster resources and inter-node task scheduling.
  9. The node device according to claim 8, wherein
    the determining module is further configured to determine, according to a third management policy, that a second node is the management node, so that the second node performs resource configuration and task scheduling.
  10. The node device according to claim 8 or 9, wherein
    the judging module is configured to judge, based on a preset network detection method, whether the first node has an external network connection; if so, to determine that the first node can be converted into a management node; if not, to determine that the first node cannot be converted into a management node.
  11. The node device according to claim 8 or 9, wherein the shared storage device in the cluster supports concurrent access by multiple nodes;
    the judging module is configured to, when it is determined that the shared storage device is not occupied, create a placeholder file in a specific directory of the shared storage device, and, after a preset time, detect whether a placeholder file created by another preliminary management node exists in the specific directory; if none exists, to determine that the first node can be converted into a management node; if one exists, to compare the number of nodes in the subgroup of the first node with the number of nodes in the subgroup of the other preliminary management node, and to judge, based on the comparison result, whether the first node can be converted from a preliminary management node into a management node.
  12. The node device according to claim 11, wherein
    the judging module is configured to, when it is determined that the number of nodes in the subgroup of the first node is greater than the number of nodes in the subgroup of the other preliminary management node, determine that the first node can be converted into a management node;
    and, when it is determined that the subgroup of the first node has the largest number of nodes and that there is a subgroup with the same number of nodes as the subgroup of the first node, to judge whether the node number of the first node is smaller than the node number of the preliminary management node in the subgroup having the same number of nodes as the subgroup of the first node; if so, to determine that the first node can be converted into a management node; if not, to determine that the first node cannot be converted into a management node.
  13. The node device according to claim 8 or 9, wherein the shared storage device in the cluster supports exclusive access by a single node;
    the judging module is configured to determine the access time of the first node to a first partition of the shared storage device, to mount the first partition when the access time arrives, and to judge whether a placeholder file exists in the first partition; when it is determined that no placeholder file exists in the first partition, to determine that the first node can be converted into a management node; when it is determined that a placeholder file exists in the first partition, to determine that the first node cannot be converted into a management node.
  14. The node device according to claim 8 or 9, wherein
    the determining module is configured to determine that, when the first node is the node with the smallest node number in the subgroup of the first node, the first node is the preliminary management node of its subgroup.
  15. A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the node management method in a cluster according to any one of claims 1 to 7.
PCT/CN2017/085935 2016-06-14 2017-05-25 Node management method in cluster and node device WO2017215430A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610416731.3A CN107508694B (en) 2016-06-14 2016-06-14 Node management method and node equipment in cluster
CN201610416731.3 2016-06-14

Publications (1)

Publication Number Publication Date
WO2017215430A1 true WO2017215430A1 (en) 2017-12-21

Family

ID=60663429

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/085935 WO2017215430A1 (en) 2016-06-14 2017-05-25 Node management method in cluster and node device

Country Status (2)

Country Link
CN (1) CN107508694B (en)
WO (1) WO2017215430A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113472662A (en) * 2021-07-09 2021-10-01 武汉绿色网络信息服务有限责任公司 Path redistribution method and network service system
CN115269248A (en) * 2022-07-28 2022-11-01 江苏安超云软件有限公司 Method and device for preventing split brain under dual-node cluster, electronic equipment and storage medium

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108111337B (en) * 2017-12-06 2021-04-06 北京天融信网络安全技术有限公司 Method and equipment for arbitrating main nodes in distributed system
CN110290159B (en) * 2018-03-19 2022-06-28 中移(苏州)软件技术有限公司 Method and equipment for scheduling management
CN110858168B (en) * 2018-08-24 2023-08-18 浙江宇视科技有限公司 Cluster node fault processing method and device and cluster node
CN109614390A (en) * 2018-12-06 2019-04-12 无锡华云数据技术服务有限公司 Data base read-write separation method, device, service system, equipment and medium
CN112835915B (en) * 2019-11-25 2023-07-18 中国移动通信集团辽宁有限公司 MPP database system, data storage method and data query method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102402395A (en) * 2010-09-16 2012-04-04 上海中标软件有限公司 Quorum disk-based non-interrupted operation method for high availability system
CN104378232A (en) * 2014-11-10 2015-02-25 东软集团股份有限公司 Schizencephaly finding and recovering method and device under main joint and auxiliary joint cluster networking mode
CN104917792A (en) * 2014-03-12 2015-09-16 上海宝信软件股份有限公司 Democratic and autonomous cluster management method and system
US9146705B2 (en) * 2012-04-09 2015-09-29 Microsoft Technology, LLC Split brain protection in computer clusters

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9026860B2 (en) * 2012-07-31 2015-05-05 International Business Machines Corporation Securing crash dump files
CN105450717A (en) * 2014-09-29 2016-03-30 中兴通讯股份有限公司 Method and device for processing brain split in cluster
CN105046382A (en) * 2015-09-16 2015-11-11 浪潮(北京)电子信息产业有限公司 Heterogeneous system parallel random forest optimization method and system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102402395A (en) * 2010-09-16 2012-04-04 上海中标软件有限公司 Quorum disk-based non-interrupted operation method for high availability system
US9146705B2 (en) * 2012-04-09 2015-09-29 Microsoft Technology, LLC Split brain protection in computer clusters
CN104917792A (en) * 2014-03-12 2015-09-16 上海宝信软件股份有限公司 Democratic and autonomous cluster management method and system
CN104378232A (en) * 2014-11-10 2015-02-25 东软集团股份有限公司 Schizencephaly finding and recovering method and device under main joint and auxiliary joint cluster networking mode

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113472662A (en) * 2021-07-09 2021-10-01 武汉绿色网络信息服务有限责任公司 Path redistribution method and network service system
CN115269248A (en) * 2022-07-28 2022-11-01 江苏安超云软件有限公司 Method and device for preventing split brain under dual-node cluster, electronic equipment and storage medium
CN115269248B (en) * 2022-07-28 2023-08-08 安超云软件有限公司 Method and device for preventing brain fracture under double-node cluster, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN107508694A (en) 2017-12-22
CN107508694B (en) 2021-11-16

Similar Documents

Publication Publication Date Title
WO2017215430A1 (en) Node management method in cluster and node device
US11163653B2 (en) Storage cluster failure detection
US9830239B2 (en) Failover in response to failure of a port
US10983880B2 (en) Role designation in a high availability node
US11809291B2 (en) Method and apparatus for redundancy in active-active cluster system
CN109344014B (en) Main/standby switching method and device and communication equipment
US10127124B1 (en) Performing fencing operations in multi-node distributed storage systems
WO2015169199A1 (en) Anomaly recovery method for virtual machine in distributed environment
US11106556B2 (en) Data service failover in shared storage clusters
WO2016202051A1 (en) Method and device for managing active and backup nodes in communication system and high-availability cluster
US9992058B2 (en) Redundant storage solution
WO2017107827A1 (en) Method and apparatus for isolating environment
US20070180287A1 (en) System and method for managing node resets in a cluster
JP6866927B2 (en) Cluster system, cluster system control method, server device, control method, and program
US8370897B1 (en) Configurable redundant security device failover
WO2021139174A1 (en) Faas distributed computing method and apparatus
CN113596195B (en) Public IP address management method, device, main node and storage medium
CN114840495A (en) Database cluster split-brain prevention method, storage medium and device
US9374315B2 (en) Spare resource election in a computing system
CN112612652A (en) Distributed storage system abnormal node restarting method and system
US11947431B1 (en) Replication data facility failure detection and failover automation
KR100793446B1 (en) Method for processing fail-over and returning of duplication telecommunication system
JP6653250B2 (en) Computer system
CN117499210A (en) Dual-activity cluster arbitration method and device, computer equipment and storage medium
JP2022174535A (en) Cluster system, monitoring system, monitoring method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17812541

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17812541

Country of ref document: EP

Kind code of ref document: A1