CN101702721A - Reconfigurable method of multi-cluster system - Google Patents


Publication number
CN101702721A
Authority
CN
China
Prior art keywords
cluster
election
message
clusters
convenes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200910236550A
Other languages
Chinese (zh)
Other versions
CN101702721B (en)
Inventor
胡凯
丁毅
牛建伟
陈陆佳
那日苏
张伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN2009102365502A
Publication of CN101702721A
Application granted
Publication of CN101702721B
Expired - Fee Related
Anticipated expiration

Abstract

The invention discloses a reconfigurable method for a multi-cluster system, which addresses flexible-configuration problems such as combining and detaching clusters. The multi-cluster system comprises a plurality of member clusters, each comprising a management node and computing nodes. Each management node is provided with a scheduler and a job manager, and interacts with only one activated scheduler to schedule jobs. The member clusters are further divided into a master cluster and slave clusters: the master cluster is the member cluster that contains the activated scheduler, and a slave cluster is a member cluster that does not. The method further comprises the following processing: processing one, adding a cluster to be combined to the current multi-cluster according to system requirements; processing two, determining a new master cluster after the master cluster exits the current multi-cluster; and processing three, removing a slave cluster from the current multi-cluster and reorganizing the multi-cluster.

Description

A reconfigurable method for a multi-cluster system
Technical field
The present invention relates to the field of high-performance computer cluster technology, and in particular to reconfiguration techniques among multiple clusters.
Background art
A cluster is a computer system in which a group of loosely integrated computers, connected through software and hardware, cooperate closely to carry out computation. Clusters offer low cost, easy maintenance and flexible configuration, and generally provide a much better price/performance ratio than a single computer such as a workstation or supercomputer. Fig. 1 shows the physical structure of a cluster: several computers (a management node 1 and computing nodes 3) are connected into a network through a switch or other high-speed communication device 2, forming a simple computer cluster (hereinafter "cluster"). Fig. 2 shows the logical structure of a cluster.
However, as society develops, the demand for large-scale data computation and for solving complex problems keeps growing, placing higher requirements on the performance, availability and cost of computer systems. A single cluster often cannot meet the computing demand, and its shortcomings are becoming increasingly apparent.
When a parallel job needs more nodes than the cluster has in total, a single cluster cannot run it, so a single cluster cannot handle larger computing tasks. Several clusters at different geographic locations can be connected through a network to form one larger computing resource, usually called a multi-cluster. Multi-cluster technology can connect the existing clusters of an enterprise or organization into a larger computing resource. This not only greatly increases the computing capacity of the organization as a whole and allows larger tasks to run, but also balances load: it avoids the situation in which one department's cluster is overloaded because its users run many jobs while another department's cluster sits idle with very low utilization because its users run few. The scheme requires no additional hardware, only a system update, and can greatly increase computing capacity, so it is very effective.
In real-world applications, however, clusters must be flexibly configurable, for example to support vehicle-mounted or mobile parallel computing. For several clusters to form a flexible computing environment through combination and splitting, they must be organically combined into one logical whole that meets higher computing demands while retaining good scalability and availability. In other words, when several independent clusters face a computation-intensive task, they need to be joined together to provide computing capacity jointly; once the task is finished, they need to be separated again immediately for independent use. In some special cases, a single cluster or a group of clusters even needs to be split off from the multi-cluster and combined into a smaller multi-cluster for vehicle-mounted mobile computing. The prior art does not solve these problems effectively.
Summary of the invention
In view of this, the object of the present invention is to provide a reconfigurable method for a multi-cluster system that solves flexible-configuration problems such as combining and splitting multi-clusters, and meets the requirements for dynamic, adaptive behavior in special environments.
According to a first aspect of the invention, a reconfigurable method for a multi-cluster system is disclosed, comprising the following steps: interconnecting the management nodes of a plurality of independently working member clusters through a network so that the management nodes can communicate with one another, thereby forming a multi-cluster; having the management nodes forward the communication between the computing nodes of the member clusters; providing a scheduler and a job manager in the management node of each member cluster, responsible for job submission, scheduling and management; providing a resource and job monitor in each computing node of each member cluster, responsible for monitoring the resource status of the computing node and the execution of job tasks; the management node of each member cluster receives the jobs submitted by local users and dispatches them to the computing nodes; to avoid scheduling conflicts, the management node of each member cluster interacts with only one activated scheduler to schedule jobs, and the activated scheduler serves the management nodes of the plurality of member clusters; the member clusters of the multi-cluster are further divided into a master cluster and slave clusters, where the master cluster is the member cluster that has the activated scheduler and a slave cluster is a member cluster that does not contain the activated scheduler. The method further comprises the following three kinds of processing: processing one, adding a cluster to be combined to the current multi-cluster according to system requirements; processing two, determining a new master cluster after the master cluster exits the current multi-cluster according to system requirements; processing three, removing a slave cluster from the current multi-cluster and reorganizing the multi-cluster according to system requirements.
According to the first aspect of the invention, processing one further comprises a detection stage, a handshake stage, a competition stage and an update stage. The cluster to be combined may be a single cluster that has not joined any multi-cluster, or may itself be a multi-cluster. Specifically:
A. Detection stage
1. The current master cluster periodically broadcasts a probe message on the multi-cluster network and listens for replies;
2. After a cluster to be combined receives a probe message, it checks whether the message was sent by itself and, if so, discards it; otherwise it sends a reply containing its own information to the current master cluster that sent the probe message, and thereafter discards all probe messages it receives;
3. After the current master cluster receives the reply from the cluster to be combined, it sends a combination request message to the cluster to be combined and waits for the reply to that request;
B. Handshake stage
4. After the cluster to be combined receives the combination request message from the current master cluster, it sends a reply to that request to the current master cluster;
5. After the current master cluster receives the reply to the combination request from the cluster to be combined, it sends a combination consent message to the cluster to be combined, and the handshake succeeds;
C. Competition stage
6. The cluster to be combined sends a challenge message to the current master cluster, and the two begin competing for the master role;
7. The two sides compete according to the competition rule; the winner becomes the new master cluster, and the loser and its own slave clusters all become slave clusters of the winner;
D. Update stage
8. The losing cluster sends a master cluster change message containing the ID of the new master cluster to all of its own slave clusters, notifying them to change their master cluster information;
The competition rule is: the master cluster that has more slave clusters becomes the new master cluster.
According to the first aspect of the invention, processing two further comprises: 1) determining the election rule for the new master cluster, namely that the cluster with the largest number is elected as the new master cluster; 2) after the current master cluster exits, the cluster that first discovers the exit convenes an election by sending an election message to all clusters numbered higher than itself; 3) if a cluster receives an election message from a cluster numbered lower than itself, it replies to the convening cluster with a message containing its own number; 4) if the convening cluster receives no reply, it wins the election and sends a coordinator message to all clusters, announcing itself as the new master cluster; 5) if the convening cluster receives any reply, the reply necessarily comes from a cluster numbered higher than itself, and the convening cluster's election ends, because it can no longer win the election and become the new master cluster; 6) every cluster other than the convening cluster is either convening an election itself or may receive an election message from a lower-numbered cluster; after such a cluster receives an election message, it sends a response message to the cluster that sent the election message, and if it is not yet convening an election itself, it also starts one, that is, it carries out steps 1) to 4).
According to the first aspect of the invention, processing three further comprises: the master cluster periodically broadcasts to each slave cluster; when the master cluster finds that a slave cluster no longer responds, it sends a message to every cluster and updates the member-cluster list, indicating that the unresponsive slave cluster has exited or failed.
According to the first aspect of the invention, processing two may alternatively comprise: 1) determining the election rule for the new master cluster, namely that the cluster with the largest number is elected as the new master cluster; 2) after the current master cluster exits, the cluster that first discovers the exit convenes an election by sending an election message to all clusters numbered higher than itself; 3) if a cluster receives an election message from a cluster numbered lower than itself, it replies to the convening cluster with a message containing its own number; 4) if the convening cluster receives no reply, it wins the election and sends a coordinator message to all clusters, announcing itself as the elected new master cluster; 5) if the convening cluster does receive replies, it collects all response messages, selects the cluster with the largest number among them and elects it as the new master cluster, and the new master cluster sends a message notifying the other clusters of the election result.
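The variant just described, in which the convening cluster collects the replies and picks the winner itself, can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the function name `collect_and_pick`, the message-count dictionary, and the extra message by which the convener tells the chosen cluster that it has won are all assumptions introduced here.

```python
def collect_and_pick(alive, starter):
    """Election variant in which the convener picks the winner itself.

    alive   -- numbers of the clusters still reachable (old master excluded)
    starter -- cluster that first notices the master has exited
    Returns (new_master, message counts). Messages addressed to clusters
    that have already left are not counted.
    """
    alive = set(alive)
    higher = [c for c in alive if c > starter]
    counts = {"election": len(higher), "reply": len(higher), "result": 0}
    if not higher:
        # No reply at all: the convener wins the election.
        new_master = starter
    else:
        # The convener chooses the largest number among the replies.
        new_master = max(higher)
        # Assumption: one message tells the chosen cluster that it has won.
        counts["result"] += 1
    # The new master announces the result to every other cluster.
    counts["result"] += len(alive) - 1
    return new_master, counts
```

With seven surviving clusters numbered 0 to 6 and cluster 0 convening, the sketch exchanges 6 election messages and 6 replies before cluster 6 announces the result, because only the convener sends election messages.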
According to the first aspect of the invention, processing two may alternatively comprise: 1) determining the election rule for the new master cluster, namely that the cluster with the largest number is elected as the new master cluster; 2) after the current master cluster exits, the cluster that first discovers the exit convenes an election by sending an election message to all clusters numbered higher than itself; 3) if several clusters initiate elections at the same time, a cluster in the multi-cluster system will receive several election messages, and it sends a response message containing its own number only to the lower-numbered one among the convening clusters that sent those messages; 4) if a convening cluster receives no reply, it wins the election and sends a coordinator message to all clusters, announcing itself as the new master cluster; 5) if the convening cluster does receive replies, it collects all response messages, selects the cluster with the largest number among them and elects it as the new master cluster, and the new master cluster sends a message notifying the other clusters of the election result.
To make the objects, features and advantages of the present invention clearer, several embodiments are described in detail below with reference to the accompanying drawings, so that those of ordinary skill in the art can more clearly understand the spirit of the invention.
Brief description of the drawings
Fig. 1 shows the physical structure of a cluster;
Fig. 2 shows the logical structure of a cluster;
Fig. 3 shows the organizational structure of the reconfigurable multi-cluster computing environment;
Fig. 4 shows the architecture in which job management and scheduling are separated;
Fig. 5 shows the centralized management and centralized scheduling structure;
Fig. 6 shows the distributed management and centralized scheduling structure;
Fig. 7 shows the distributed management and distributed scheduling structure;
Fig. 8 shows the architecture of the reconfigurable multi-cluster system;
Fig. 9 is a schematic diagram of the Master cluster exit processing;
Fig. 10 shows optimization 1 of the Master cluster exit processing;
Fig. 11 shows optimization 2 of the Master cluster exit processing.
Embodiment
Embodiment 1
The reconfigurable multi-cluster computing environment consists of a plurality of member clusters 4. Each cluster has its own independent management domain and can work independently, and the clusters are peers. Combining and splitting the multi-cluster computing environment requires no change to the management domain of any cluster, so the environment can be reorganized flexibly and has good scalability and availability. Its organizational structure is shown in Fig. 3.
Suppose the multi-cluster computing environment consists of n member clusters $C_1, C_2, \ldots, C_n$, where each $C_i$ is made up of several homogeneous computing nodes 3. Its regrouping property can then be described as follows: the clusters in the multi-cluster $M = \{C_i \mid 1 \le i \le n\}$ can be combined arbitrarily into cluster groups $N_1, N_2, \ldots, N_m$ ($1 \le m \le n$), where each $N_j$ ($1 \le j \le m$) is a non-empty subset of $M$, and if $x \in N_p$, $y \in N_q$ and $p \ne q$, then $x \ne y$.
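The regrouping condition can be checked mechanically. The sketch below is only an illustration under assumed names (`is_valid_regrouping`, clusters identified by strings); it tests exactly the three conditions stated in the text and, because the text does not require the groups to cover every member cluster, it does not check coverage either.

```python
from itertools import combinations

def is_valid_regrouping(members, groups):
    """Return True if `groups` is a valid regrouping of the member clusters.

    Conditions taken from the text: 1 <= m <= n, every group is a
    non-empty subset of M, and no cluster belongs to two groups.
    """
    members = set(members)
    groups = [set(g) for g in groups]
    # The number of groups m must satisfy 1 <= m <= n.
    if not 1 <= len(groups) <= len(members):
        return False
    # Every group must be a non-empty subset of M.
    if any(not g or not g <= members for g in groups):
        return False
    # Groups must be pairwise disjoint.
    return all(a.isdisjoint(b) for a, b in combinations(groups, 2))
```

For example, splitting three clusters into the groups {C1, C2} and {C3} passes the check, while {C1, C2} and {C2, C3} fails because C2 would belong to two groups.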
The management nodes 1 of the clusters are interconnected through a network and can communicate with one another; communication between computing nodes 3 of different clusters must be forwarded by the management nodes 1.
Analysis shows that the following two problems must be solved to achieve regrouping:
1) How can one know which member clusters are in the current system?
2) Who stores the information about the member clusters?
When jobs are scheduled across the multi-cluster, the job and resource status of each member cluster 4 must be queried in order to make scheduling decisions. An ordinary multi-cluster job management system can use a configuration file to specify the member clusters 4. In a reconfigurable multi-cluster, however, member clusters 4 may join or leave dynamically at any time; requiring users to update every configuration file by hand each time would add to their workload and reduce the usability of the multi-cluster system. A new mechanism is therefore needed to maintain the list of all member clusters 4 in the current system for use by the scheduler.
The other problem is where to store the list of member clusters 4. The most intuitive solution is to have one node among the member clusters 4 maintain the cluster-list information. In a reconfigurable environment, however, any cluster node may leave at any time.
The present invention adopts an architecture in which job management and scheduling are separated. This reduces the coupling between the job management module and the scheduling module, simplifies the implementation of the scheduler, makes it easy to add new scheduling policies, and allows one scheduler to serve several job managers simultaneously.
As shown in Fig. 4, a job manager component 5 and a scheduler component 6 are installed on the management node 1 of the cluster, and a resource and job monitor component 7 is installed on every computing node 3. The job manager component 5 holds the member-cluster list, which stores information about the member clusters. It manages jobs throughout their life cycle and handles interaction with users, including job submission and monitoring. The resource and job monitor component 7 monitors the resource status of the computing node 3 and the execution of job tasks. The scheduler component 6 is the scheduling module and itself stores no job or resource information. When the job manager component 5 needs to schedule jobs, it sends a scheduling command to the scheduler component 6, which makes the scheduling decision and returns the result to the job manager component 5. The job manager component 5 and the scheduler component 6 communicate over standard TCP/IP, so they can run on different machines. Moreover, because the scheduler component 6 stores no job information, it can provide scheduling services to the job manager components on several clusters at the same time. The job manager component 5, scheduler component 6 and resource and job monitor component 7 are usually implemented in software or hardware in this field and are widely used.
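The separation described above, in which the job manager owns all state and the scheduler is a stateless decision function, might look like the following sketch. It is illustrative only: the names `JobManager`, `build_request` and `schedule`, the request layout and the first-fit policy are assumptions, and a direct function call stands in for the TCP/IP exchange the text describes.

```python
from dataclasses import dataclass, field

@dataclass
class JobManager:
    """Per-cluster job manager: owns all job and resource state."""
    cluster: str
    free_nodes: int
    queue: list = field(default_factory=list)  # (job_name, nodes_needed)

    def build_request(self):
        # Everything the scheduler needs travels inside the request.
        return {"cluster": self.cluster,
                "free_nodes": self.free_nodes,
                "jobs": list(self.queue)}

def schedule(request):
    """Stateless scheduling decision (hypothetical first-fit policy).

    Returns the names of the jobs that can start now. Nothing is stored
    between calls, so one scheduler can answer many job managers.
    """
    free = request["free_nodes"]
    started = []
    for name, needed in request["jobs"]:
        if needed <= free:
            started.append(name)
            free -= needed
    return started
```

Because `schedule` keeps nothing between calls, the same function can serve requests from the job managers of different clusters in any order, which is the property the architecture relies on when a backup scheduler has to take over.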
Once job management and job scheduling are separated and an event-driven scheduling model is established, a corresponding architecture can be chosen for the scheduling model. Three architectures are available:
1) Centralized management and centralized scheduling
2) Distributed management and centralized scheduling
3) Distributed management and distributed scheduling
Centralized management and centralized scheduling is in effect a simple extension of a single-cluster job management system. As shown in Fig. 5, it adds a higher-level job manager and scheduler above the single clusters, extending the two-level tree structure of a single cluster to three levels. Physically, the higher-level job manager and scheduler can be installed on a separate host or be hosted by the management node 1 of one of the unit clusters. As with a single cluster, users can still submit jobs only at the root of the tree. In this case the local job manager 5 on each unit cluster no longer performs job management and becomes an execution agent of the global job manager. This structure is simple to implement and suits ordinary multi-clusters. In a reconfigurable multi-cluster, however, any cluster may leave at any time, so there is no node on which the global manager can reliably be installed.
In distributed management and centralized scheduling there is no global job manager. As shown in Fig. 6, the local job manager 5 of each cluster receives the jobs submitted by local users and manages them. All local job managers 5 interact with the scheduler 6 to schedule jobs, so the scheduler 6 serves several job managers 5.
In distributed management and distributed scheduling, as shown in Fig. 7, each cluster has its own local job manager 5 and its own scheduler 6, and each scheduler 6 can schedule jobs globally. For a reconfigurable multi-cluster this is the ideal structure. However, because the system contains several schedulers 6, scheduling conflicts may occur and the schedulers 6 must be synchronized with one another, which is relatively difficult to implement.
The above analysis shows that distributed management and centralized scheduling is best suited to a reconfigurable environment. This structure still has a problem: if the cluster hosting the scheduler 6 leaves, the system loses its scheduler 6. Because the scheduler 6 itself stores no information, the following scheme can be adopted: a scheduler 6 is installed on every cluster, but only one is active and all the others serve as backup schedulers. As shown in Fig. 8, the scheduler 6 of cluster 1 is the active scheduler, and the schedulers 6 of cluster 2 and cluster 3 are backup schedulers. The cluster hosting the active scheduler is called the Master cluster (main cluster), and the other clusters are called Slave clusters (slave clusters). A single cluster that has not formed a multi-cluster with any other cluster is called an Idle cluster.
After the cluster hosting the active scheduler leaves, a backup scheduler can be started by some mechanism. Exactly one must be started, and all clusters must agree on which one.
A multi-cluster system contains one Master cluster and several Slave clusters, and a single cluster that has not joined any multi-cluster is called an Idle cluster. The reconfiguration mechanism must therefore handle the following situations:
1) An Idle cluster joins a multi-cluster: how does the Master cluster discover the new cluster and make it one of its own Slave clusters?
2) Two multi-clusters merge: two Master clusters now exist in the system, so which one becomes the sole Master cluster?
3) The Master cluster leaves: when the Master cluster exits or fails and no longer responds, the former Slave clusters lose their Master cluster, so how is a new Master cluster designated?
4) A Slave cluster leaves: a Slave cluster leaves the system, so how does the Master cluster handle this?
5) A multi-cluster splits: several Slave clusters in the multi-cluster system leave at the same time, so how does the original Master cluster handle this?
On further analysis, an Idle cluster can be regarded as a Master cluster with zero Slave clusters, so situations 1) and 2) do not differ in essence. Situation 5) can be regarded as several clusters each leaving and entering another communication domain, so situations 4) and 5) can be treated as the same situation. In the end, only the following three situations remain to be handled:
1) Merging of clusters
2) Exit of the Master cluster
3) Exit of a Slave cluster
For these situations, the present invention uses an aggregation algorithm to handle the merging of clusters and an election algorithm to handle the exit of the Master cluster.
For the exit of a Slave cluster, the Master cluster periodically broadcasts to each Slave cluster. When, in some period, the Master cluster finds that a Slave cluster no longer responds because it has exited or failed, it sends a message to every cluster to update the cluster-list information.
The processing of the three situations is described in detail below:
(1) Processing of cluster merging
The processing can be divided into four stages: detection, handshake, competition and update.
A. Detection stage
1. Master cluster A periodically broadcasts a Detect message (probe message) on the network and listens for replies.
2. Suppose a Master cluster B receives a Detect message. It checks whether the message was sent by itself and, if so, discards it. Otherwise it sends a reply containing its own information to cluster A, the sender of the Detect message; thereafter cluster B discards all Detect messages it receives.
3. After cluster A receives the reply from cluster B, it sends a Join message (request to combine) to cluster B and waits for a Join Ack message (reply to the request to combine).
B. Handshake stage
4. After cluster B receives the Join message from cluster A, it sends a Join Ack message to cluster A.
5. After cluster A receives the Join Ack message from cluster B, it sends a Join Con message (consent to combine) to cluster B, and the handshake succeeds.
C. Competition stage
6. Cluster B sends a Compete message (challenge message) to cluster A, and the two begin competing for the Master role.
7. The two sides compete according to a fixed rule. The winner becomes the Master cluster; the loser and its own Slave clusters all become Slave clusters of the winner.
D. Update stage
8. The losing cluster sends an UpdateMaster message (master cluster change message) containing the ID of the new Master cluster to all of its own Slave clusters, notifying them to change their Master cluster information.
Competition rule: to reduce the number of UpdateMaster messages sent in the update stage, as few clusters as possible should have to update their settings. The following rule is therefore chosen: the Master cluster with more Slave clusters wins the competition and becomes the new Master cluster.
The Detect, Join, Join Ack, Join Con, Compete and UpdateMaster messages above can typically be network identifiers or other binary codes with an identifying function.
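The outcome of the competition and update stages can be illustrated with a short sketch. It models only the result of a merge after a successful handshake, not the message exchange itself. The names `Group` and `merge` are hypothetical, and the tie-break used when both Master clusters have the same number of Slave clusters (the larger ID wins) is an assumption, since the text does not specify one.

```python
from dataclasses import dataclass, field

@dataclass
class Group:
    """A multi-cluster: one Master plus its Slave clusters.
    An Idle cluster is modelled as a Master with zero Slaves."""
    master: int
    slaves: set = field(default_factory=set)

def merge(a, b):
    """Merge two groups after a successful handshake.

    Rule from the text: the Master with more Slave clusters wins.
    Tie-break (assumption, not in the source): the larger master ID wins.
    Returns the merged group and the number of UpdateMaster messages
    the loser has to send to its former Slaves.
    """
    key = lambda g: (len(g.slaves), g.master)
    winner, loser = (a, b) if key(a) > key(b) else (b, a)
    update_messages = len(loser.slaves)
    merged = Group(winner.master,
                   winner.slaves | loser.slaves | {loser.master})
    return merged, update_messages
```

Merging a group with three Slave clusters and a group with one Slave cluster therefore costs a single UpdateMaster message, whereas letting the smaller group win would cost three; this is the saving the competition rule aims for.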
(2) processing withdrawed from of Master cluster
Suppose that the Master cluster has the redundant node of a Hot Spare state, it does not participate in calculating and management, when the Master cluster leaves or breaks down, system utilizes the principle cancellation operation of affairs, after the Master cluster that election makes new advances, the job state of backup is moved to new Master cluster, reappear the beginning operation.
Suppose that each cluster has a numbering that the overall situation is unique, this numbering can be the numbering that the network address or additive method produce.Always elect the maximum cluster of numbering as the Master cluster when supposing election.
If a cluster has started the once execution of election, and can only start the process of once convening election at every turn, then there is the cluster group of n cluster may convene n election concomitantly in principle.A basic demand of election process is that the selection to elected cluster must be unique, even there are a plurality of concurrent processes of convening carrying out, last result must guarantee that all clusters of convening election and participating in election reach common understanding to elected cluster.
Referring to Figure 9, in which the circles labeled 0, 1, 2, ..., 7 represent the clusters, the processing steps are as follows:
1) After the current Master cluster exits, the cluster that first discovers the exit convenes an election: it sends an election message to every cluster whose number is larger than its own;
2) If a cluster receives an election message from a cluster with a smaller number than its own, it replies to the convening cluster with an OK message (response message) containing its own number;
3) If the convening cluster receives no reply, it wins the election and sends a coordinator message (coordination message) to all clusters, announcing that it has become the new Master cluster;
4) If the convening cluster receives any reply, the reply necessarily comes from a cluster with a larger number, and the convening cluster's part in the election ends, because it can no longer win the election and become the new Master cluster;
5) Every other cluster is either convening an election itself or may receive an election message from a cluster with a smaller number. When a cluster other than the convening cluster receives an election message, it replies with an OK message (response message) to the sender; if it is not already convening an election, it also starts a convening process of its own, that is, it performs steps 1) to 3).
The election message, coordinator message (coordination message) and OK message (response message) described above can typically be network identifiers or other binary codes that serve to identify the message type.
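The cascade of steps 1) to 5) can be illustrated with a small synchronous simulation. This is a sketch under simplifying assumptions, not the disclosed implementation: messages are delivered instantly and reliably, only surviving clusters are addressed, and the function name `cascading_election` and its message-counting convention are hypothetical.

```python
from collections import deque


def cascading_election(alive, starter):
    """Simulate the election of Embodiment 1.

    alive   -- numbers of the clusters still running (the old Master excluded)
    starter -- number of the cluster that first notices the Master has exited
    Returns (new_master, total_messages).
    """
    alive = sorted(alive)
    queue = deque([starter])
    convened = set()          # each cluster convenes at most one election
    messages = 0
    master = None
    while queue:
        convener = queue.popleft()
        if convener in convened:
            continue
        convened.add(convener)
        higher = [c for c in alive if c > convener]
        messages += len(higher)          # step 1: election messages
        messages += len(higher)          # step 2: OK replies
        if not higher:
            # Step 3: no reply, so the convener wins and informs everyone else.
            master = convener
            messages += len(alive) - 1   # coordinator messages
        else:
            # Steps 4 and 5: the convener stops; every replier convenes in turn.
            queue.extend(higher)
    return master, messages
```

With clusters 0 to 6 surviving the exit of cluster 7, an election started by cluster 1 elects cluster 6 after 36 messages under this counting, while an election started by cluster 6 itself needs only the 6 coordinator messages.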
(3) Processing when a Slave cluster exits
The Master cluster periodically broadcasts to every Slave cluster. When the Master cluster finds that a Slave cluster has not responded within a given period, it sends a message to every cluster and updates the cluster list to show that the unresponsive Slave cluster has exited or failed.
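The periodic check can be sketched as follows, assuming the Master cluster records the time of the last reply from each Slave cluster and treats a Slave as departed once no reply has arrived within a timeout. The timeout, the data layout and the function names are illustrative assumptions; the disclosure does not specify them.

```python
def find_departed(last_reply, now, timeout):
    """Return the Slave clusters whose last reply is older than `timeout`.

    last_reply -- dict mapping Slave cluster number to time of its last reply
    """
    return sorted(c for c, t in last_reply.items() if now - t > timeout)


def update_cluster_list(cluster_list, departed):
    # The updated list is what the Master would then send to every cluster.
    gone = set(departed)
    return [c for c in cluster_list if c not in gone]
```

For instance, with last replies at times 100.0, 91.0 and 99.5 for Slave clusters 1, 2 and 3, a check at time 101.0 with a timeout of 5.0 reports cluster 2 as departed, and the updated list contains clusters 1 and 3.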
Embodiment 2
The advantage of the Master-cluster exit processing in Embodiment 1 is that it is simple to carry out. However, when there are many clusters, or when a low-numbered cluster is the first to discover that the Master cluster has exited, a large number of packets must be transmitted over the network and system performance suffers.
This embodiment improves on the Master-cluster exit processing of Embodiment 1 to obtain an optimized method. When a cluster P finds that the Master cluster no longer responds to requests, it initiates an election. Referring to Figure 10, in which the circles labeled 0, 1, 2, ..., 7 represent the clusters, the processing is as follows:
1) After the current Master cluster exits, the cluster that first discovers the exit convenes an election: it sends an election message to every cluster whose number is larger than its own;
2) If a cluster receives an election message from a cluster with a smaller number than its own, it replies to the convening cluster with an OK message (response message) containing its own number;
3) If the convening cluster receives no reply, it wins the election and sends a coordinator message (coordination message) to all clusters, announcing that it has been elected the new Master cluster;
4) If the convening cluster receives replies, it collects all the response messages, chooses the cluster with the largest number among them and elects it as the new Master cluster (that is, no repeated rounds are needed; the coordinator is chosen in a single pass); the new Master cluster then sends a message notifying the other clusters of the election result.
The optimized processing avoids repeated rounds of election: in every case a single round suffices to elect the new coordinator. Its advantage is most evident when there are many clusters or when a low-numbered cluster is the first to initiate the election. Because the number of messages exchanged between clusters is greatly reduced, overall system performance improves markedly.
Compared with the Master-cluster exit processing of Embodiment 1, the processing of Embodiment 2 has the following advantages:
1. It reduces the number of messages the election process has to transmit, and therefore the network traffic, avoiding the network congestion caused by sending a large number of packets over the network;
2. Because fewer messages are transmitted, the traffic and hence the response time are kept low, avoiding the network delay caused by excessive traffic and improving the efficiency and performance of the system.
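The single-round election of this embodiment can be sketched as below. The sketch assumes instant, reliable delivery and assumes that the convening cluster informs the elected cluster with one extra message before the elected cluster notifies the others; the description does not state how the elected cluster learns of its election, so that hand-off, the counting convention and the name `single_round_election` are illustrative assumptions.

```python
def single_round_election(alive, starter):
    """Simulate the one-round election of Embodiment 2.

    alive   -- numbers of the clusters still running (the old Master excluded)
    starter -- number of the cluster that convenes the election
    Returns (new_master, total_messages).
    """
    alive = set(alive)
    higher = [c for c in alive if c > starter]
    messages = 2 * len(higher)        # election messages plus OK replies
    if higher:
        master = max(higher)          # step 4: largest number among the replies
        messages += 1                 # assumed hand-off from convener to winner
    else:
        master = starter              # step 3: no reply, the convener wins
    messages += len(alive) - 1        # result sent to every other cluster
    return master, messages
```

With clusters 0 to 6 surviving and cluster 1 convening, cluster 6 is elected after 17 messages under this counting. The count grows linearly with the number of clusters, whereas a cascade in which every replying cluster convenes its own election grows quadratically in the worst case.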
Embodiment 3
In a distributed multi-cluster environment, if more than one cluster, or even all clusters, discover at the same time that the Master cluster has failed and initiate elections simultaneously, the network load also increases. To address this case, this embodiment further improves the Master-cluster exit processing of Embodiments 1 and 2.
Referring to Figure 11, in which the circles labeled 0, 1, 2, ..., 7 represent the clusters, the processing is as follows:
1) When a cluster finds that the Master cluster no longer responds to requests, it convenes an election by sending an election message to every cluster whose number is larger than its own;
2) If several clusters initiate elections simultaneously, a given cluster in the multi-cluster system receives several election messages; it replies with an OK message (response message) only to the lower-numbered of the convening clusters that sent them;
3) If a convening cluster receives no reply, it wins the election and sends a coordinator message (coordination message) to all clusters, announcing that it has become the new Master cluster;
4) If a convening cluster receives replies, it collects all the response messages, chooses the cluster with the largest number among them and elects it as the new Master cluster; the new Master cluster then sends a message notifying the other clusters of the election result.
Building on Embodiment 2, this method further reduces the number of messages that must be transmitted to produce the new Master cluster.
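The reply rule of step 2) can be sketched as follows. Two points are interpretations rather than statements of the disclosure: with more than two simultaneous conveners, the reply is assumed to go to the lowest-numbered one, and a convener that itself receives an election message from a lower-numbered convener is assumed to withdraw instead of declaring victory under step 3). Delivery is assumed instant and reliable, and the name `concurrent_election` is hypothetical.

```python
def concurrent_election(alive, conveners):
    """Simulate simultaneous elections under the reply rule of Embodiment 3.

    alive     -- numbers of the clusters still running
    conveners -- clusters that convene at the same time (at least one)
    Returns (new_master, election_messages, ok_replies).
    """
    alive = set(alive)
    conveners = set(conveners) & alive
    # Step 1: every convener sends an election message to all higher numbers.
    inbox = {c: [] for c in alive}
    for sender in conveners:
        for c in alive:
            if c > sender:
                inbox[c].append(sender)
    election_msgs = sum(len(senders) for senders in inbox.values())
    # Step 2: each receiver answers only the lowest-numbered convener it heard.
    replies = {s: [] for s in conveners}
    for receiver, senders in inbox.items():
        if senders:
            replies[min(senders)].append(receiver)
    ok_msgs = sum(len(r) for r in replies.values())
    # Assumed: only the convener that heard from no lower convener proceeds.
    leader = min(conveners)
    # Steps 3 and 4: no reply means the leader wins, else the largest replier.
    master = max(replies[leader]) if replies[leader] else leader
    return master, election_msgs, ok_msgs
```

With clusters 0 to 6 surviving and clusters 1 and 3 convening together, 8 election messages are sent but only 5 OK replies, all addressed to cluster 1, which then elects cluster 6. If every receiver answered every election message, 8 replies would be needed instead.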
The above are merely preferred embodiments of the present invention and do not limit the invention in any form. Although the invention has been disclosed above by way of preferred embodiments, these are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the invention, use the technical content disclosed above to make changes or modifications that yield equivalent embodiments. Any simple modification, equivalent change or refinement made to the above embodiments in accordance with the technical essence of the invention, without departing from the content of its technical solution, still falls within the scope of the technical solution of the invention.

Claims (6)

1. A reconfigurable method of a multi-cluster system, characterized by comprising the following steps:
Interconnecting the management nodes 1 of a plurality of independently operating member clusters 4 through a network so that the management nodes 1 can communicate with one another, thereby forming a multi-cluster;
Having the management node 1 relay communication among the computing nodes 3 of the member cluster 4;
Providing a scheduler 6 and a job manager 5 in the management node 1 of each member cluster 4, responsible for the submission, scheduling and management of jobs; providing a resource and job monitor 7 in the computing nodes 3 of each member cluster 4, responsible for monitoring the resource status of the computing node 3 and the execution of job tasks;
The management node 1 of each member cluster 4 is responsible for receiving the jobs submitted by local users and assigns the jobs to the computing nodes 3; to avoid scheduling conflicts, the management node 1 of each member cluster 4 interacts with only one activated scheduler to schedule jobs, and the activated scheduler serves the management nodes 1 of the plurality of member clusters 4;
The member clusters 4 in the multi-cluster are further divided into a master cluster and slave clusters, wherein the master cluster is the member cluster that has the activated scheduler and a slave cluster is a member cluster that does not contain the activated scheduler;
The method further comprises the following three kinds of processing:
Processing one: according to system requirements, adding a cluster to be merged to the current multi-cluster;
Processing two: according to system requirements, determining a new master cluster after the master cluster exits the current multi-cluster;
Processing three: according to system requirements, removing a slave cluster from the current multi-cluster and reorganizing the multi-cluster.
2. The reconfigurable method of a multi-cluster system as claimed in claim 1, characterized in that processing one further comprises a detection phase, a handshake phase, a competition phase and an update phase, and the cluster to be merged may be a single cluster that has not joined any multi-cluster or may itself be a multi-cluster; wherein,
A. Detection phase
1. The current master cluster periodically broadcasts a probe message in the network and listens for replies;
2. When the cluster to be merged receives such a probe message, it judges whether the message was sent by itself; if so, it discards the message; if not, it sends a reply containing its own information to the current master cluster that sent the probe message, and thereafter discards all probe messages it receives;
3. After the current master cluster receives the reply from the cluster to be merged, it sends a merge request to the cluster to be merged and waits for the reply to that merge request;
B. Handshake phase
4. After the cluster to be merged receives the merge request sent by the current master cluster, it sends a reply to the merge request to the current master cluster;
5. After the current master cluster receives the reply to the merge request from the cluster to be merged, it sends a merge confirmation to the cluster to be merged, and the handshake between the two sides succeeds;
C. Competition phase
6. The cluster to be merged sends a competition message to the current master cluster and begins to compete for the role of master cluster;
7. The two sides compete according to the competition rule; the winner becomes the new master cluster, and the loser and its own slave clusters all become slave clusters of the winner;
D. Update phase
8. The cluster that lost the competition sends a master cluster change message containing the ID of the new master cluster to all of its own slave clusters, notifying them to update their master cluster information;
The competition rule is: the master cluster that has the most slave clusters becomes the new master cluster.
3. The reconfigurable method of a multi-cluster system as claimed in claim 1, characterized in that processing two further comprises the following steps:
1) Determining the election rule for the new master cluster: the cluster with the largest number is elected as the new master cluster;
2) After the current master cluster exits, the cluster that first discovers the exit convenes an election: it sends an election message to every cluster whose number is larger than its own;
3) If a cluster receives an election message from a cluster with a smaller number than its own, it replies to the convening cluster with a message containing its own number;
4) If the convening cluster receives no reply, it wins the election and sends a coordination message to all clusters, announcing that it is the new master cluster;
5) If the convening cluster receives any reply, the reply necessarily comes from a cluster with a larger number, and the convening cluster's part in the election ends, because it can no longer win the election and become the new master cluster;
6) Every cluster other than the convening cluster is either convening an election itself or may receive an election message from a cluster with a smaller number; when such a cluster receives an election message, it replies with a response message to the cluster that sent it; if it is not already convening an election, it also starts a convening process of its own, that is, it performs steps 1) to 4).
4. The reconfigurable method of a multi-cluster system as claimed in claim 1, characterized in that processing three further comprises:
The master cluster periodically broadcasts to every slave cluster; when the master cluster finds that a slave cluster no longer responds, it sends a message to every cluster and updates the member cluster list to show that the unresponsive slave cluster has exited or failed.
5. The reconfigurable method of a multi-cluster system as claimed in claim 1, characterized in that processing two further comprises the following steps:
1) Determining the election rule for the new master cluster: the cluster with the largest number is elected as the new master cluster;
2) After the current master cluster exits, the cluster that first discovers the exit convenes an election: it sends an election message to every cluster whose number is larger than its own;
3) If a cluster receives an election message from a cluster with a smaller number than its own, it replies to the convening cluster with a message containing its own number;
4) If the convening cluster receives no reply, it wins the election and sends a coordination message to all clusters, announcing that it has been elected the new master cluster;
5) If the convening cluster receives replies, it collects all the response messages, chooses the cluster with the largest number among them and elects it as the new master cluster; the new master cluster then sends a message notifying the other clusters of the election result.
6. The reconfigurable method of a multi-cluster system as claimed in claim 1, characterized in that processing two further comprises the following steps:
1) Determining the election rule for the new master cluster: the cluster with the largest number is elected as the new master cluster;
2) After the current master cluster exits, the cluster that first discovers the exit convenes an election: it sends an election message to every cluster whose number is larger than its own;
3) If several clusters initiate elections simultaneously, a given cluster in the multi-cluster system receives several of the election messages; the cluster that receives them replies with a response message containing its own number only to the lower-numbered of the convening clusters that sent them;
4) If a convening cluster receives no reply, it wins the election and sends a coordination message to all clusters, announcing that it has become the new master cluster;
5) If the convening cluster receives replies, it collects all the response messages, chooses the cluster with the largest number among them and elects it as the new master cluster; the new master cluster then sends a message notifying the other clusters of the election result.
CN2009102365502A 2009-10-26 2009-10-26 Reconfigurable method of multi-cluster system Expired - Fee Related CN101702721B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009102365502A CN101702721B (en) 2009-10-26 2009-10-26 Reconfigurable method of multi-cluster system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009102365502A CN101702721B (en) 2009-10-26 2009-10-26 Reconfigurable method of multi-cluster system

Publications (2)

Publication Number Publication Date
CN101702721A true CN101702721A (en) 2010-05-05
CN101702721B CN101702721B (en) 2011-08-31

Family

ID=42157614

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009102365502A Expired - Fee Related CN101702721B (en) 2009-10-26 2009-10-26 Reconfigurable method of multi-cluster system

Country Status (1)

Country Link
CN (1) CN101702721B (en)

Also Published As

Publication number Publication date
CN101702721B (en) 2011-08-31

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110831

Termination date: 20151026

EXPY Termination of patent right or utility model