CN111200518B - Decentralized HPC computing cluster management method and system based on paxos algorithm - Google Patents

Decentralized HPC computing cluster management method and system based on paxos algorithm

Info

Publication number
CN111200518B
Authority
CN
China
Prior art keywords
nodes
cluster
node
management
management node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911352764.6A
Other languages
Chinese (zh)
Other versions
CN111200518A (en)
Inventor
解文龙
张晋锋
张永生
刘瑞贤
李斌
历军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Sugon Information Industry Chengdu Co ltd
Dawning Information Industry Beijing Co Ltd
Original Assignee
Zhongke Sugon Information Industry Chengdu Co ltd
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Sugon Information Industry Chengdu Co ltd, Dawning Information Industry Beijing Co Ltd filed Critical Zhongke Sugon Information Industry Chengdu Co ltd
Priority to CN201911352764.6A priority Critical patent/CN111200518B/en
Publication of CN111200518A publication Critical patent/CN111200518A/en
Application granted granted Critical
Publication of CN111200518B publication Critical patent/CN111200518B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/04 Network management architectures or arrangements
    • H04L41/042 Network management architectures or arrangements comprising distributed management centres cooperatively managing the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H04L67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention discloses a decentralized HPC computing cluster management method and system based on the Paxos algorithm. The method comprises deploying a main management node and a plurality of standby management nodes and setting a cluster management election mechanism. The cluster management election mechanism comprises: when the heartbeat connection with the main management node fails beyond a preset value, the standby management nodes hold an election according to the Paxos algorithm to produce a new main management node; the original main management node goes offline, and the new main management node monitors the heartbeats of the remaining standby management nodes. The invention optimizes the HPC high-performance job scheduling cluster from a single-master centralized cluster mode to a decentralized cluster mode. This change greatly improves cluster availability: the cluster is no longer limited by the single point of failure of the single-master centralized mode, its fault tolerance is improved by several orders of magnitude, fault handling better fits real-world scenarios, and high availability is provided automatically without relying on a third-party tool.

Description

Decentralized HPC computing cluster management method and system based on paxos algorithm
Technical Field
The invention relates to the technical field of computer data processing, in particular to a decentralized HPC computing cluster management method and system based on paxos algorithm.
Background
With the nation's vigorous push for informatization innovation, Chinese supercomputers now rank among the best in the world: more national supercomputing centers are in use, their scale keeps growing, and their computing power readily reaches the exascale (E) level. This places ever higher demands on the software that runs on them, such as job scheduling systems and cluster monitoring systems, and High Performance Computing (HPC) software frameworks designed for the smaller scales of the past cannot adapt to larger-scale scheduling and compute-resource monitoring, so the hardware is mismatched with the software system and the actual computing performance of the whole cluster is limited at the software level. Conventional HPC cluster software is essentially a master-slave architecture; a typical single-master centralized cluster can achieve high availability against a single failure through third-party software, but once two or more failures occur the whole cluster becomes unavailable. In the single-master mode all jobs can only be submitted and scheduled through the main management node. When the cluster is small, queuing relieves the scheduling pressure; when the supercomputer is large enough, computing power is no longer the bottleneck and the scheduling capacity and availability of the main management node become the new bottleneck, especially when many small jobs are submitted with high concurrency and the scheduling pressure grows geometrically. Cluster compute-resource monitoring suffers the same problem: all collected data is handed to the management node for processing, which makes ultra-large-scale, highly concurrent monitoring scenarios difficult to support. Because the existing high-performance computing cluster uses a single-management-node job scheduling system, the job scheduler cannot balance load; extensibility is poor or unsupported, and management nodes cannot be added freely.
In view of the above, the present invention is particularly proposed.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a decentralized HPC computing cluster management method and system based on the Paxos algorithm, which improve cluster availability.
In order to achieve the purpose, the technical scheme of the invention is as follows:
a decentralized HPC computing cluster management method based on paxos algorithm includes
Deploying a main management node and a plurality of standby management nodes, and setting a cluster management election mechanism;
the cluster management election mechanism comprises: the main management node sends out a reply that the heartbeat connection exceeds a preset value, and the standby management node carries out election according to paxos algorithm to generate a new main management node;
and the original main management node is off line, and the new main management node monitors the heartbeat of the rest standby management nodes.
Further, in the decentralized HPC computing cluster management method based on the Paxos algorithm, the cluster management election mechanism specifically includes the following steps (a minimal code sketch follows step S6):
S1, the main management node sends heartbeat connection messages to monitor the other nodes in the cluster; heartbeat replies are repeatedly collected and counted, and whether a standby management node initiates an election request is determined according to the counting result;
S2, when one of the standby management nodes is the first to initiate an election request, the other nodes respond;
S3, if more than half of the nodes respond true, the original main management node is taken offline;
S4, if more than half of the nodes respond false, the original main management node continues to work;
S5, if the original main management node is offline, the election process is entered;
S6, after the node initiating the election sends an election notice, all the nodes enter election mode; a new management node is selected according to the Paxos election algorithm, and all the nodes are notified.
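The sketch below illustrates steps S1 to S6 in Python. The broadcast helper, the vote structure and the node bookkeeping are illustrative assumptions made for this example, not the patent's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class ElectionVote:
    node_id: str
    missed_heartbeat: bool   # True: this node also stopped receiving the main node's heartbeat

def should_depose_main(votes: list[ElectionVote], total_nodes: int) -> bool:
    """S3/S4: the original main management node is deposed only when more than
    half of all nodes report that they no longer receive its heartbeat."""
    true_replies = sum(1 for v in votes if v.missed_heartbeat)
    return true_replies > total_nodes // 2

def on_missed_heartbeat(self_id: str, peers: list[str], broadcast) -> bool:
    """S2/S5: a standby node that misses the main node's heartbeat beyond the
    preset threshold broadcasts an election request and tallies the replies."""
    votes = broadcast("election_request", sender=self_id, targets=peers)
    if should_depose_main(votes, total_nodes=len(peers) + 1):
        # S6: notify all nodes so they enter election mode; the Paxos election follows.
        broadcast("election_notice", sender=self_id, targets=peers)
        return True
    return False   # majority replied false: the main management node is considered healthy
```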
Further, the decentralized HPC computing cluster management method based on the Paxos algorithm also includes setting a multi-node job submission and resource management mechanism.
Further, in the decentralized HPC computing cluster management method based on the Paxos algorithm, the multi-node job submission and resource management mechanism includes the following (a resource-pool sketch follows this list):
job receiving, job scheduling, job monitoring, resource application and monitoring services are deployed on all nodes in the cluster;
all nodes in the cluster share one computing resource pool; when a job is submitted on any node, computing resources are selected at the same time; if the request can be satisfied from the computing resource pool, the corresponding computing resources are locked from the pool, the job is accepted, a job queue is created, and the job is scheduled and run, while the locked resources are invisible to the other nodes;
and when the job on the node finishes, the resources are immediately released back into the resource pool.
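A minimal sketch of the shared computing resource pool described above, assuming a single in-process lock stands in for whatever distributed locking the cluster actually uses; the class and method names are illustrative.

```python
import threading

class ResourcePool:
    """Compute resources shared by every node; locked resources are invisible to other nodes."""

    def __init__(self, total_cpus: int):
        self._lock = threading.Lock()
        self._free_cpus = total_cpus       # the only view other nodes ever see

    def try_lock(self, job_id: str, cpus: int) -> bool:
        """Lock resources at submission time; return False if the pool cannot satisfy the job."""
        with self._lock:
            if cpus <= self._free_cpus:
                self._free_cpus -= cpus    # job accepted: queue, schedule and run it
                return True
            return False

    def release(self, job_id: str, cpus: int) -> None:
        """Release the resources back into the pool as soon as the job finishes."""
        with self._lock:
            self._free_cpus += cpus
```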
Further, in the decentralized HPC computing cluster management method based on paxos algorithm, the multi-node job submission and resource management mechanism further includes: when the node fails and is confirmed to be offline, the main management node is responsible for updating the resource pool; and when the main management node is offline, the new management node updates the resource pool.
Further, in the decentralized HPC computing cluster management method based on the Paxos algorithm, the multi-node job submission and resource management mechanism further includes the following (a vote-tally sketch follows this list):
when the number of live nodes in the cluster falls below 1/2 of the total number of nodes, the main management node sends out a cluster shutdown request;
each node that receives the request checks all the nodes itself; if its result is consistent with that of the main management node it replies true to the main management node, otherwise it replies false;
when the number of nodes replying true to the main management node is greater than the number of nodes that sent out the shutdown request, the main management node sends a shutdown instruction, all nodes that receive it go offline, the main management node also goes offline automatically, and the cluster is dissolved.
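A small sketch of the shutdown decision described above. The translated text leaves the exact agreement threshold ambiguous, so this sketch assumes a simple majority of true replies is required; the function names are illustrative.

```python
def should_propose_shutdown(alive_nodes: int, total_nodes: int) -> bool:
    # The main management node proposes shutdown once fewer than half of the
    # recorded nodes still answer its heartbeat.
    return alive_nodes < total_nodes / 2

def shutdown_agreed(replies: dict[str, bool]) -> bool:
    """Each polled node replies True when its own view matches the main node's.
    Assumption: shutdown proceeds only if the agreeing replies form a majority."""
    agree = sum(1 for ok in replies.values() if ok)
    return agree > len(replies) / 2
```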
Furthermore, in the decentralized HPC computing cluster management method based on the Paxos algorithm, all services are in a stopped state after the cluster is dissolved; if the current computing resources are sufficient for the jobs that are already running to finish, the shutdown waits for those jobs to complete their computation;
after the main management node sends out the shutdown request, the job-receiving services of all nodes stop; if the shutdown request is refused, the job-receiving services of the available nodes are started again; and if the compute-node resources are less than the resources a running job requires, the job is forcibly cancelled and recorded in the shutdown log.
The invention also provides a cluster system for implementing the method.
Compared with the prior art, the invention has the following beneficial effects:
the invention can optimize the HPC high-performance job scheduling cluster from a single-master centralized cluster mode to a decentralized cluster mode, greatly improve the cluster availability due to the change of the mode, is not limited by single-point fault of the single-master centralized mode, improve the fault tolerance of the cluster by several orders of magnitude, make the fault more fit the actual scene, provide automatic high availability for the cluster, and do not need to finish the high availability by a third-party tool. Any node in the cluster can be a master node, so that the cluster can continue to work, automatic load balancing can be realized, a user can submit a job from any node, and any node can also schedule the job. The multi-node simultaneously provides services, automatic load balancing is realized, the bottleneck that the conventional HPC job scheduling software cannot adapt to a larger-scale cluster is solved, and the computing capability is exerted greatly.
Drawings
In order to more clearly illustrate the detailed description of the invention or the technical solutions in the prior art, the drawings used in the detailed description or the prior art description will be briefly described below. Throughout the drawings, like elements or portions are generally identified by like reference numerals. In the drawings, elements or portions are not necessarily drawn to scale.
FIG. 1 is a logical block diagram of one embodiment of a cluster implementation of a decentralized HPC computing cluster management method based on paxos algorithm in accordance with the present invention;
FIG. 2 is a flow chart of one embodiment of the method of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and therefore are only examples, and the protection scope of the present invention is not limited thereby.
It is to be noted that, unless otherwise specified, technical or scientific terms used herein shall have the ordinary meaning as understood by those skilled in the art to which the invention pertains.
Technical terms involved in the present invention are first explained:
(The table of technical term definitions is provided as figures in the original publication and is not reproduced here.)
Example 1
As shown in FIGS. 1-2, a decentralized HPC computing cluster management method based on the Paxos algorithm includes
deploying a main management node and a plurality of standby management nodes, and setting a cluster management election mechanism;
the cluster management election mechanism comprises: when the heartbeat connection with the main management node fails beyond a preset value, the standby management nodes hold an election according to the Paxos algorithm to produce a new main management node;
and the original main management node goes offline, and the new main management node monitors the heartbeats of the remaining standby management nodes.
The cluster management method of the invention mainly performs the multi-node cluster election based on the Paxos algorithm. A plurality of standby management nodes can be deployed at deployment time, and every management node can perform job scheduling; the main management node monitors the heartbeats of the other nodes, undertakes no management work on the scheduling-service side, and is only responsible for a small amount of communication and monitoring work.
Specifically, in the method of the present invention, a master management node and a plurality of standby management nodes are deployed, and the management of each node includes:
S1, the main management node sends heartbeat connection messages to monitor the other nodes in the cluster; heartbeat replies are collected and counted, and whether a standby management node initiates an election request is determined according to the counting result;
S2, when one of the standby management nodes is the first to initiate an election request, the other nodes respond;
S3, if more than half of the nodes respond true, the original main management node is taken offline;
S4, if more than half of the nodes respond false, the original main management node continues to work;
S5, if the original main management node is offline, the election process is entered;
S6, after the node initiating the election sends an election notice, all the nodes enter election mode; a new management node is selected according to the Paxos election algorithm, and all the nodes are notified.
In the invention, a decentralized cluster does not mean a cluster without a management node; rather, the management node and its function change so that it is essentially decoupled from the business services. When the current main management node fails, the other nodes can produce a new management node through election at any time, and the services are not affected.
When the current main management node has a problem, the other nodes stop receiving its heartbeat messages. A node that has not received the heartbeat message sends an election request to all nodes; if the number of false replies it receives exceeds 1/2 of the total number of nodes, the fault is proven to lie with this non-management node itself; the current main management node records the node's fault count, and the node is taken offline once the count reaches the limit.
If more than half of the replies it receives are true, the election process is carried out: the first two nodes from which the node that initiated the election receives election replies become candidate management nodes, and the remaining nodes act as decision nodes; a management node is finally decided by the Paxos algorithm, all nodes are notified, consensus is maintained, and the nodes accept the heartbeat monitoring of the new management node. The management information of the original management node is then deleted and the original management node is taken offline.
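The following sketch illustrates the decision step just described as a minimal single-value Paxos round: the two candidate management nodes act as proposers and the decision nodes act as acceptors. Message passing is collapsed into direct method calls for brevity, and the class and function names are illustrative assumptions rather than the patent's implementation.

```python
class Acceptor:                                   # a decision node
    def __init__(self):
        self.promised = -1                        # highest proposal number promised
        self.accepted_n = -1
        self.accepted_value = None                # the management node it has accepted

    def prepare(self, n):
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, None, None

    def accept(self, n, value):
        if n >= self.promised:
            self.promised = self.accepted_n = n
            self.accepted_value = value
            return True
        return False

def propose(candidate_id, n, acceptors):
    """One proposal round by a candidate; returns the chosen management node or None."""
    majority = len(acceptors) // 2 + 1
    # Phase 1: prepare -- gather promises from the decision nodes.
    promises = [a.prepare(n) for a in acceptors]
    granted = [(an, av) for ok, an, av in promises if ok]
    if len(granted) < majority:
        return None
    # If any acceptor already accepted a value, adopt the one with the highest number.
    prior = [(an, av) for an, av in granted if av is not None]
    value = max(prior)[1] if prior else candidate_id
    # Phase 2: accept -- the value is chosen once a majority of acceptors accept it.
    accepted = sum(1 for a in acceptors if a.accept(n, value))
    return value if accepted >= majority else None
```

A candidate that gets None back would retry with a higher proposal number; whichever value reaches a majority of the decision nodes becomes the new management node that all nodes are then notified of.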
The invention also includes setting a multi-node job submission and resource management mechanism, which includes:
job receiving, job scheduling, job monitoring, resource application and monitoring services are deployed on all nodes in the cluster.
A user can submit jobs from any node and does not have to submit them through the management node.
The cluster shares one computing resource pool. When a user submits a job, computing resources are selected at the same time; if the request can be satisfied from the computing resource pool, the corresponding computing resources are locked from the pool, the job is accepted, a job queue is created, and the job is scheduled and run; the other nodes cannot see the locked resources and only see the unlocked computing resources in the pool.
When a job finishes, its resources are immediately released into the resource pool. If jobs are configured with priorities, a low-priority job is suspended and its resources are released when needed; once no higher-priority job exists, the suspended job is resumed preferentially. When a node fails and is confirmed offline, the main management node is responsible for updating the resource pool; when the main management node goes offline, the new management node updates the resource pool.
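A minimal sketch of the priority handling just described: when the pool cannot satisfy a new job, lower-priority running jobs are suspended and their resources released, and suspended jobs are resumed once no higher-priority job is waiting. The ResourcePool from the earlier sketch is reused, and the job attributes (id, cpus, priority) and the suspend/resume methods are illustrative assumptions.

```python
def schedule_with_preemption(new_job, running, pool):
    """running: list of running jobs; pool: the shared ResourcePool sketched earlier."""
    running.sort(key=lambda j: j.priority)            # lowest priority first
    while not pool.try_lock(new_job.id, new_job.cpus):
        if not running or running[0].priority >= new_job.priority:
            return False                              # nothing lower-priority left to preempt
        victim = running.pop(0)
        victim.suspend()                              # suspend the low-priority job ...
        pool.release(victim.id, victim.cpus)          # ... and release its resources
    return True

def resume_suspended(suspended, waiting, pool):
    # Suspended jobs are resumed preferentially once no higher-priority job is waiting.
    for job in list(suspended):
        no_higher = not any(w.priority > job.priority for w in waiting)
        if no_higher and pool.try_lock(job.id, job.cpus):
            job.resume()
            suspended.remove(job)
```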
When the number of live nodes is less than 1/2 of the total number of nodes — that is, when the number of nodes from which the management node (namely the main management node, the same below) receives heartbeat replies is less than 1/2 of the total number of recorded nodes — the management node sends out a cluster shutdown request. Each node that receives the request checks all the nodes itself; if its result is consistent with that of the management node it replies true to the management node, otherwise it replies false.
When the number of nodes replying true to the management node is greater than the number of nodes that sent out the shutdown request, a consistent shutdown decision is reached: the management node sends a shutdown instruction, all nodes that receive it go offline, the management node also goes offline automatically, and the cluster is dissolved. At this point the job scheduling system no longer provides job computation services, and all services are in a stopped state.
If the current computing resources are sufficient for the running jobs to finish their computation, the shutdown waits for those jobs to complete. After the management node sends out the shutdown request, the job-receiving services of all nodes stop; if the shutdown request is refused, the job-receiving services of the available nodes are started again; and if the compute-node resources are less than the resources a running job requires, the job is forcibly cancelled and recorded in the shutdown log.
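A brief sketch of the shutdown sequence just described; the node and job methods (stop_job_receiving_service, go_offline, cancel and so on) and the enough_resources helper are illustrative names assumed for this example.

```python
def shut_down_cluster(nodes, running_jobs, enough_resources, shutdown_log):
    """enough_resources(job) -> bool: whether the remaining compute resources
    still cover what the running job needs (an assumed helper)."""
    for node in nodes:
        node.stop_job_receiving_service()          # no new jobs are accepted anywhere
    for job in running_jobs:
        if enough_resources(job):
            job.wait_until_done()                  # resources suffice: let the job finish
        else:
            job.cancel(force=True)                 # otherwise forcibly cancel the job ...
            shutdown_log.write(f"cancelled {job.id} during shutdown\n")   # ... and log it
    for node in nodes:
        node.go_offline()                          # finally the cluster is dissolved
```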
The cluster users in the invention are managed in a unified way: users can be added on any node, and the job users of the whole cluster stay synchronized with the system users; NIS, LDAP and the like can be used for this. Users cannot be duplicated, and users with the same ID or username are considered the same user. Computing resource limits can be set per user, for example 100 CPUs, 50 GPUs, 100 GB of memory and 10 TB of disk for user A. By default, a newly registered user has no resource quota limit when selecting from the total resource pool.
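A small sketch of the per-user resource quota check applied at job submission. The quota fields and the default of no limit follow the description above, while the data structures and field names themselves are illustrative.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Quota:
    cpus: Optional[int] = None       # None means no limit (the default for a new user)
    gpus: Optional[int] = None
    mem_gb: Optional[int] = None
    disk_tb: Optional[int] = None

def within_quota(used: dict, request: dict, quota: Quota) -> bool:
    """Users with the same ID or username share one quota; exceeding it blocks the submission."""
    for field in ("cpus", "gpus", "mem_gb", "disk_tb"):
        limit = getattr(quota, field)
        if limit is not None and used.get(field, 0) + request.get(field, 0) > limit:
            return False
    return True

# Example: user A limited to 100 CPUs, 50 GPUs, 100 GB memory and 10 TB disk.
user_a_quota = Quota(cpus=100, gpus=50, mem_gb=100, disk_tb=10)
```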
The cluster synchronizes job data through unified shared storage; every management node and compute node must mount the shared storage and hold the corresponding directory permissions. Storage resources can be allocated to users as quotas, and when a user exceeds the specified quota no job can be submitted; the storage resources are also added to the computing resource pool so that computing resources are managed uniformly.
The invention also manages the cluster computing resources as follows: a node joining the computing cluster can individually configure its computing-resource contribution to the computing resource pool, or the contribution amounts can be set uniformly in batch; the contribution must not exceed the hardware configuration of the node, otherwise the configuration fails. After a compute node starts taking part in computation, the error between the initially configured resources and the actual computing resources is automatically corrected, and the contribution configuration is modified based on the corrected error.
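A hedged sketch of these contribution rules. Taking the corrected contribution as the smaller of the configured and the measured resources is an assumption made for illustration, and the node fields are hypothetical.

```python
def configure_contribution(node, requested_cpus: int) -> bool:
    # The contribution may not exceed the node's hardware, otherwise configuration fails.
    if requested_cpus > node.hardware_cpus:
        return False
    node.contributed_cpus = requested_cpus
    return True

def correct_contribution(node, measured_cpus: int) -> None:
    """Once the node actually joins the computation, reconcile the configured
    contribution with the measured resources (assumed here to cap it at the
    measured value) and update the configuration."""
    node.contributed_cpus = min(node.contributed_cpus, measured_cpus)
```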
The invention optimizes the HPC high-performance job scheduling cluster from a single-master centralized cluster mode to a decentralized cluster mode. This change greatly improves cluster availability: the cluster is no longer limited by the single point of failure of the single-master centralized mode, its fault tolerance is improved by several orders of magnitude, fault handling better fits real-world scenarios, and high availability is provided automatically without relying on a third-party tool. Any node in the cluster can become the master node, so the cluster keeps working and load is balanced automatically; a user can submit a job from any node, and any node can schedule jobs. With multiple nodes providing services simultaneously, automatic load balancing is achieved, the bottleneck that prevents conventional HPC job scheduling software from adapting to larger clusters is removed, and the computing capability is fully exploited.
Example 2
The invention also provides a decentralized HPC computing cluster system based on the Paxos algorithm, which implements the method of Embodiment 1. As shown in FIG. 1, the cluster system includes a main management node and a plurality of standby management nodes, and the main management node and the managed nodes automatically generate the management node according to a cluster management election mechanism preset in the cluster system;
wherein the cluster management election mechanism comprises:
when the heartbeat connection with the main management node fails beyond a preset value, the standby management nodes hold an election according to the Paxos algorithm to produce a new main management node;
and the original main management node goes offline, and the new main management node performs heartbeat monitoring on the remaining standby management nodes.
Referring to FIG. 2, the cluster management election mechanism specifically includes:
S1, the main management node sends heartbeat connection messages to monitor the other nodes in the cluster; heartbeat replies are collected and counted, and whether a standby management node initiates an election request is determined according to the counting result;
S2, when one of the standby management nodes is the first to initiate an election request, the other nodes respond;
S3, if more than half of the nodes respond true, the original main management node is taken offline;
S4, if more than half of the nodes respond false, the original main management node continues to work;
S5, if the original main management node is offline, the election process is entered;
S6, after the node initiating the election sends an election notice, all the nodes enter election mode; a new management node is selected according to the Paxos election algorithm, and all the nodes are notified.
In the invention, a decentralized cluster does not mean a cluster without a management node; rather, the management node and its function change so that it is essentially decoupled from the business services. When the current main management node fails, the other nodes can produce a new management node through election at any time, and the services are not affected.
When the current main management node has a problem, the other nodes stop receiving its heartbeat messages. A node that has not received the heartbeat message sends an election request to all nodes; if the number of false replies it receives exceeds 1/2 of the total number of nodes, the fault is proven to lie with this non-management node itself; the current main management node records the node's fault count, and the node is taken offline once the count reaches the limit.
If more than half of the replies it receives are true, the election process is carried out: the first two nodes from which the node that initiated the election receives election replies become candidate management nodes, and the remaining nodes act as decision nodes; a management node is finally decided by the Paxos algorithm, all nodes are notified, consensus is maintained, and the nodes accept the heartbeat monitoring of the new management node. The management information of the original management node is then deleted and the original management node is taken offline.
The cluster system of the invention is also provided with a multi-node job submission and resource management mechanism for managing each node, which includes:
job receiving, job scheduling, job monitoring, resource application and monitoring services are deployed on all nodes in the cluster.
A user can submit jobs from any node and does not have to submit them through the management node.
The cluster shares one computing resource pool. When a user submits a job, computing resources are selected at the same time; if the request can be satisfied from the computing resource pool, the corresponding computing resources are locked from the pool, the job is accepted, a job queue is created, and the job is scheduled and run; the other nodes cannot see the locked resources and only see the unlocked computing resources in the pool.
When a job finishes, its resources are immediately released into the resource pool. If jobs are configured with priorities, a low-priority job is suspended and its resources are released when needed; once no higher-priority job exists, the suspended job is resumed preferentially. When a node fails and is confirmed offline, the main management node is responsible for updating the resource pool; when the main management node goes offline, the new management node updates the resource pool.
When the number of live nodes is less than 1/2 of the total number of nodes — that is, when the number of nodes from which the management node (namely the main management node, the same below) receives heartbeat replies is less than 1/2 of the total number of recorded nodes — the management node sends out a cluster shutdown request. Each node that receives the request checks all the nodes itself; if its result is consistent with that of the management node it replies true to the management node, otherwise it replies false.
When the number of nodes replying true to the management node is greater than the number of nodes that sent out the shutdown request, a consistent shutdown decision is reached: the management node sends a shutdown instruction, all nodes that receive it go offline, the management node also goes offline automatically, and the cluster is dissolved. At this point the job scheduling system no longer provides job computation services, and all services are in a stopped state.
If the current computing resources are sufficient for the running jobs to finish their computation, the shutdown waits for those jobs to complete. After the management node sends out the shutdown request, the job-receiving services of all nodes stop; if the shutdown request is refused, the job-receiving services of the available nodes are started again; and if the compute-node resources are less than the resources a running job requires, the job is forcibly cancelled and recorded in the shutdown log.
The cluster users in the invention are managed in a unified way: users can be added on any node, and the job users of the whole cluster stay synchronized with the system users; NIS, LDAP and the like can be used for this. Users cannot be duplicated, and users with the same ID or username are considered the same user. Computing resource limits can be set per user, for example 100 CPUs, 50 GPUs, 100 GB of memory and 10 TB of disk for user A. By default, a newly registered user has no resource quota limit when selecting from the total resource pool.
The cluster synchronizes job data through unified shared storage; every management node and compute node must mount the shared storage and hold the corresponding directory permissions. Storage resources can be allocated to users as quotas, and when a user exceeds the specified quota no job can be submitted; the storage resources are also added to the computing resource pool so that computing resources are managed uniformly.
The invention also manages the cluster computing resources as follows: a node joining the computing cluster can individually configure its computing-resource contribution to the computing resource pool, or the contribution amounts can be set uniformly in batch; the contribution must not exceed the hardware configuration of the node, otherwise the configuration fails. After a compute node starts taking part in computation, the error between the initially configured resources and the actual computing resources is automatically corrected, and the contribution configuration is modified based on the corrected error.
Embodiment 2 is a cluster system implementing the method of Embodiment 1. The HPC job scheduling cluster is decentralized and the management node is generated automatically, and the decentralized cluster solves the high-availability problem of the HPC cluster; job submission and scheduling are separated from the management node, and multiple nodes submit and schedule jobs in parallel.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the embodiments of the present invention, and they should be construed as being covered by the appended claims and their equivalents.

Claims (6)

1. A decentralized HPC computing cluster management method based on paxos algorithm is characterized by comprising the following steps
Deploying a main management node and a plurality of standby management nodes, and setting a cluster management election mechanism;
the cluster management election mechanism comprises: when the heartbeat connection with the main management node fails beyond a preset value, the standby management nodes hold an election according to the Paxos algorithm to produce a new main management node;
the original main management node goes offline, and the new main management node performs heartbeat monitoring on the remaining standby management nodes;
setting a multi-node job submission and resource management mechanism, wherein the multi-node job submission and resource management mechanism comprises the following steps:
all nodes in the cluster are deployed with job receiving, job scheduling, job monitoring, resource application and monitoring services;
all nodes in the cluster share one computing resource pool; when a job is submitted on any node, computing resources are selected at the same time; if the request can be satisfied from the computing resource pool, the corresponding computing resources are locked from the resource pool, the job is accepted, a job queue is created, and the job is scheduled and run, while the locked resources are invisible to the other nodes;
and when the job on the node finishes, the resources are immediately released into the resource pool.
2. The method for decentralized HPC computing cluster management based on paxos algorithm according to claim 1, wherein the cluster management election mechanism comprises:
S1, the main management node sends heartbeat connection messages to monitor the other nodes in the cluster; heartbeat replies are repeatedly collected and counted, and whether a standby management node initiates an election request is determined according to the counting result;
S2, when one of the standby management nodes is the first to initiate an election request, the other nodes respond;
S3, if more than half of the nodes do not receive the heartbeat message, the response is true, and the original main management node is taken offline;
S4, if more than half of the nodes receive the heartbeat message, the response is false, and the original main management node continues to work;
S5, if the original main management node is offline, the election process is entered;
S6, after the node initiating the election sends an election notice, all the nodes enter election mode; a new management node is selected according to the Paxos election algorithm, and all the nodes are notified.
3. The method for decentralized HPC computing cluster management based on paxos algorithm according to claim 1, wherein said multi-node job submission and resource management mechanism further comprises: when the node fails and is confirmed to be offline, the main management node is responsible for updating the resource pool; and when the main management node is offline, the new management node updates the resource pool.
4. The method for decentralized HPC computing cluster management based on a paxos algorithm according to claim 3, wherein said multi-node job submission and resource management mechanism further comprises:
when the number of live nodes in the cluster falls below 1/2 of the total number of nodes, the main management node sends out a cluster shutdown request;
each node that receives the request checks all the nodes itself; if its result is consistent with that of the main management node it replies true to the main management node, otherwise it replies false;
when the number of nodes replying true to the main management node is greater than the number of nodes that sent out the shutdown request, the main management node sends a shutdown instruction, all nodes that receive it go offline, the main management node also goes offline automatically, and the cluster is dissolved.
5. The method for decentralized HPC computing cluster management based on the paxos algorithm according to claim 4,
after the cluster is dissolved, all services are in a stopped state; if the current computing resources are sufficient for the jobs that are already running to finish, the shutdown waits for those jobs to complete their computation;
after the main management node sends out the shutdown request, the job-receiving services of all nodes stop; if the shutdown request is refused, the job-receiving services of the available nodes are started again; and if the compute-node resources are less than the resources a running job requires, the job is forcibly cancelled and recorded in the shutdown log.
6. A cluster system implementing the method of any one of claims 1 to 5, the cluster system comprising the primary management node and the plurality of standby management nodes.
CN201911352764.6A 2019-12-25 2019-12-25 Decentralized HPC computing cluster management method and system based on paxos algorithm Active CN111200518B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911352764.6A CN111200518B (en) 2019-12-25 2019-12-25 Decentralized HPC computing cluster management method and system based on paxos algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911352764.6A CN111200518B (en) 2019-12-25 2019-12-25 Decentralized HPC computing cluster management method and system based on paxos algorithm

Publications (2)

Publication Number Publication Date
CN111200518A CN111200518A (en) 2020-05-26
CN111200518B true CN111200518B (en) 2022-10-18

Family

ID=70746682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911352764.6A Active CN111200518B (en) 2019-12-25 2019-12-25 Decentralized HPC computing cluster management method and system based on paxos algorithm

Country Status (1)

Country Link
CN (1) CN111200518B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860413A (en) * 2021-03-29 2021-05-28 中信银行股份有限公司 Centralized job scheduling system, device, electronic equipment and computer readable storage medium
CN114039978B (en) * 2022-01-06 2022-03-25 天津大学四川创新研究院 Decentralized PoW computing power cluster deployment method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016058307A1 (en) * 2014-10-15 2016-04-21 中兴通讯股份有限公司 Fault handling method and apparatus for resource
CN107276839A (en) * 2017-08-24 2017-10-20 郑州云海信息技术有限公司 A kind of cloud platform from monitoring method and system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9201742B2 (en) * 2011-04-26 2015-12-01 Brian J. Bulkowski Method and system of self-managing nodes of a distributed database cluster with a consensus algorithm
CN106027634B (en) * 2016-05-16 2019-06-04 白杨 Message port Exchange Service system
US10243780B2 (en) * 2016-06-22 2019-03-26 Vmware, Inc. Dynamic heartbeating mechanism

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016058307A1 (en) * 2014-10-15 2016-04-21 中兴通讯股份有限公司 Fault handling method and apparatus for resource
CN107276839A (en) * 2017-08-24 2017-10-20 郑州云海信息技术有限公司 A kind of cloud platform from monitoring method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Edward Walker, "Continuous Adaptation for High Performance Throughput Computing across Distributed Clusters", IEEE, 2018-12-31, full text *

Also Published As

Publication number Publication date
CN111200518A (en) 2020-05-26

Similar Documents

Publication Publication Date Title
CN113014634B (en) Cluster election processing method, device, equipment and storage medium
CN101702721B (en) Reconfigurable method of multi-cluster system
CN104769919B (en) Load balancing access to replicated databases
CA3168286A1 (en) Data flow processing method and system
CN108810100B (en) Method, device and equipment for electing master node
CN102521044B (en) Distributed task scheduling method and system based on messaging middleware
TWI755417B (en) Computing task allocation method, execution method of stream computing task, control server, stream computing center server cluster, stream computing system and remote multi-active system
US9158589B2 (en) Method for dynamic migration of a process or services from one control plane processor to another
Pashkov et al. Controller failover for SDN enterprise networks
WO2016058307A1 (en) Fault handling method and apparatus for resource
WO2017128507A1 (en) Decentralized resource scheduling method and system
CN109814998A (en) A kind of method and device of multi-process task schedule
US20040243709A1 (en) System and method for cluster-sensitive sticky load balancing
CN111200518B (en) Decentralized HPC computing cluster management method and system based on paxos algorithm
JP2001306349A (en) Backup device and backup method
CN109802986B (en) Equipment management method, system, device and server
CN111459639B (en) Distributed task management platform and method supporting global multi-machine room deployment
JPWO2007072544A1 (en) Information processing apparatus, computer, resource allocation method, and resource allocation program
US20170228250A1 (en) Virtual machine service availability
CN111414241A (en) Batch data processing method, device and system, computer equipment and computer readable storage medium
EP3084603B1 (en) System and method for supporting adaptive busy wait in a computing environment
CN115391058B (en) SDN-based resource event processing method, resource creation method and system
CN100473065C (en) A network-oriented machine group working management system and realizing method thereof
CN116074315A (en) Multi-tenant scheduling system based on cloud native architecture
CN107644035B (en) Database system and deployment method thereof

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211009

Address after: 100193 building 36, yard 8, Dongbeiwang West Road, Haidian District, Beijing

Applicant after: Dawning Information Industry (Beijing) Co.,Ltd.

Applicant after: ZHONGKE SUGON INFORMATION INDUSTRY CHENGDU Co.,Ltd.

Address before: 100193 building 36, yard 8, Dongbeiwang West Road, Haidian District, Beijing

Applicant before: Dawning Information Industry (Beijing) Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant