CN102073546A - Task-dynamic dispatching method under distributed computation mode in cloud computing environment

Info

Publication number: CN102073546A
Application number: CN2010105835979A
Authority: CN
Inventors: 肖利民; 毛宏; 祝明发; 阮利; 胡声秋
Original assignee: Beihang University
Current assignee: SHANGHAI JUNESH INFORMATION TECHNOLOGY CO., LTD.
Priority date: 2010-12-13
Filing date: 2010-12-13
Publication date: 2011-05-25
Anticipated expiration: 2030-12-13
Also published as: CN102073546B

Abstract

The invention provides a dynamic task scheduling method under a distributed computing model in a cloud computing environment, comprising four steps: (1) the master node receives and analyzes the heartbeat messages of the computing nodes; (2) the master node pre-assigns tasks according to a node state table and a task state table; (3) a computing node requests a task from the master node; and (4) the master node assigns a task to the computing node. The method first considers the resource requirements of the tasks and the performance information of the nodes, and dynamically controls task assignment subject to those requirements being met, thereby improving job response speed and node resource utilization. The method has practical value and application prospects in the field of distributed computing in cloud computing environments.

Description

A dynamic task scheduling method under a distributed computing model in a cloud computing environment

(1) Technical field

The present invention relates to a task scheduling method for a distributed computing platform, and specifically to a dynamic task scheduling method under a distributed computing model in a cloud computing environment. It is a dynamic, node-performance-based method for scheduling tasks within a task scheduling subsystem, and belongs to the field of computer technology.

(2) Background

At present, the rapid development of network applications keeps increasing the demand for computing power. Building on the development of grid computing, parallel computing and distributed computing, cloud computing has emerged; it has been listed as a technology direction to be prioritized nationally and has become a hot research topic in both academia and industry. As cloud computing becomes popular, more and more Web services and commercial applications are deployed in cloud computing environments. For the distributed nodes that handle application-layer computation requests in a cloud environment, a current research focus is how to schedule tasks so that upper-layer computation requests are processed efficiently, the resources of performance-heterogeneous distributed nodes are better utilized, and job response speed is improved.

When massive data is processed in a cloud environment, task scheduling based on distributed storage and distributed parallel processing is one of the key steps. Improving job and task scheduling methods is a current research focus. Research in China and abroad mainly covers scheduling among jobs when multiple jobs run in parallel, scheduling of subtasks when a single job runs, and optimizing the number of subtasks that run in parallel.

For job scheduling, the most basic approach is first-in-first-out (FIFO) job scheduling. This approach has many drawbacks; in particular, when there are many jobs, the overall response time is long. The Fair Scheduler addresses this problem well: when a single job is running, it uses the whole cluster. When other jobs are submitted, the system assigns free task slots to the new jobs so that each job gets roughly an equal share of CPU time, which lets small jobs respond quickly while guaranteeing the service level of large jobs. The Capacity Scheduler supports multiple queues: a job enters a queue after submission, resources are allocated per queue, and the jobs in each queue use that queue's resources. Within a queue, higher-priority jobs obtain resources before lower-priority jobs, but once a job starts executing it cannot be preempted by a higher-priority job. To prevent one or more users from monopolizing all resources, each queue is assigned a fixed proportion of the resources. The MR-Predict-based triple-queue scheduler proposed by the Institute of Computing Technology, Chinese Academy of Sciences, divides workloads into 3 classes according to CPU and I/O utilization, and can improve the utilization of both CPU and I/O resources under different types of workload.

For task scheduling, the LATE (Longest Approximate Time to End) scheduling algorithm proposed by researchers at the University of California, Berkeley focuses on optimizing the scheduling of backup tasks within a job. By estimating the time tasks need to finish, it ensures that backup tasks are launched only for the tasks estimated to finish latest, and only on fast nodes. Researchers at Purdue University proposed a method for optimizing the task-count configuration based on historical statistics. Their work focuses on how the number of tasks running concurrently on each node in the cloud environment affects performance when a job is executed; an optimized configuration is derived from historical statistics and applied to new, similar jobs.

However, in most cases different nodes differ in performance, and the load on each node varies over time. Determining a dynamic task assignment scheme according to the performance heterogeneity and dynamic load of the nodes is therefore significant for processing computing tasks efficiently, improving the resource utilization of distributed nodes, and improving job response speed.

(3) Summary of the invention

1. Purpose:

The main purpose of the present invention is to provide a dynamic task scheduling method under a distributed computing model in a cloud computing environment. It first considers the resource requirements of tasks and the performance information of nodes, and dynamically controls task assignment subject to those requirements being met, thereby improving job response speed and node resource utilization.

To achieve this purpose, the present invention proposes a dynamic task scheduling method, based on node performance and task execution status, under a distributed computing model in a cloud computing environment. The composition of the distributed computing nodes in the cloud computing environment is shown in Fig. 1. It mainly comprises one master node (host node) and a plurality of computing nodes (child nodes). A computing node may be a physical machine or a virtual machine; this is transparent to the master node. The nodes are interconnected by a network. The master node and the computing nodes interact by remote procedure call (RPC). The master node is mainly responsible for receiving the heartbeat messages of the computing nodes, and for analyzing them and giving feedback to control the scheduling and execution of tasks. Besides executing tasks, a computing node is mainly responsible for collecting its own performance information and task execution information and sending them to the master node.

2. Technical scheme:

The technical scheme of the present invention is as follows:

The flow of the dynamic task scheduling method under a distributed computing model in a cloud computing environment is shown in Fig. 2. The method comprises the following steps:

Step 201. Each computing node dynamically collects its own performance information and task execution information and reports them to the master node in the form of a heartbeat message.

Step 202. The master node receives and analyzes the heartbeat messages of the computing nodes, and creates and continually updates the node state table and the task state table. Based on these two tables, the master node pre-assigns tasks to computing nodes and updates the node prefetch table and the task pre-assignment table.

Step 203. If a computing node has a free task slot available, it adds a flag to its next heartbeat message to request a task from the master node.

Step 204. After receiving the task request of a computing node, the master node assigns it a task according to the scheduling policy and updates the node prefetch table and the task pre-assignment table.

The node performance information and task execution information described in step 201 are the main data sources from which the master node updates the node state table and the task state table. Node performance information may include CPU frequency, memory size, CPU usage, memory usage, I/O resource usage, and so on. Task execution information covers both tasks that have just finished and tasks in progress. For a task that has just finished, it comprises the task's TaskID, the JobID of the job it belongs to, the time spent on I/O (copying the data to be processed) and the time spent on CPU computation; data copying occurs only when the computing node does not hold the input data of the task. For a task in progress, it comprises the task's TaskID, the JobID of the job it belongs to, the execution progress and the time executed so far. Each computing node collects both kinds of information at regular intervals, packages them into a heartbeat message and sends it to the master node.
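The heartbeat contents listed above can be pictured as a small record type. The following Python sketch is illustrative only: the class names, the snake_case field names, the units and the `request_task` flag placement are assumptions made for the example, not part of the patent text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodePerformance:
    cpu_speed_mhz: float  # CPU frequency (unit is an assumption)
    mem_size_mb: int      # memory size (unit is an assumption)
    cpu_usage: float      # fraction in [0, 1]
    mem_usage: float      # fraction in [0, 1]
    io_usage: float       # fraction in [0, 1]


@dataclass
class FinishedTask:
    task_id: str
    job_id: str
    time_io: float   # seconds spent copying input data
    time_cpu: float  # seconds spent on CPU computation


@dataclass
class RunningTask:
    task_id: str
    job_id: str
    progress: float   # fraction in [0, 1]
    past_time: float  # seconds executed so far


@dataclass
class Heartbeat:
    node_name: str
    performance: NodePerformance
    finished: List[FinishedTask] = field(default_factory=list)  # may be empty
    running: List[RunningTask] = field(default_factory=list)
    request_task: bool = False  # set when a task slot is free (step 203)


# Hypothetical heartbeat from a node that is running one task.
hb = Heartbeat(
    node_name="node-1",
    performance=NodePerformance(2400.0, 8192, 0.35, 0.50, 0.10),
    running=[RunningTask("t7", "job-1", 0.4, 12.0)],
)
```

Keeping finished and running tasks in separate lists mirrors the two kinds of task execution information the text distinguishes.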

The node state table and task state table described in step 202 are the main reference information with which the master node formulates the task assignment scheme. The node state table describes the performance state of each computing node over a recent period; the task state table records how each computing node has processed tasks over a recent period. The master node creates both tables when it first receives a heartbeat message from a computing node, and updates them each time it receives a heartbeat message thereafter. The node state table contains the fields NodeName, CPU_Speed, MemSize, CPU_Usage, Mem_Usage and IO_Usage; the task state table contains the fields JobID, TaskID, NodeName, Time_IO, Time_CPU, Progress and PastTime. The node prefetch table and the task pre-assignment table record the pre-assignment of tasks in the current cluster. The node prefetch table records, per computing node, the task that the master node has pre-assigned to it, and contains the fields NodeName, preFetched and preFetchedTaskID. The task pre-assignment table records, per task, the computing node to which the master node has pre-assigned it, and contains the fields TaskID, preScheduled and preScheduledNodeName.
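As a rough illustration, the four tables can be modelled as dictionaries of rows that use the field names given above. The row class names, the Python types, the dictionary keys and the helper `mark_preassigned` are assumptions introduced for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class NodeStateRow:
    NodeName: str
    CPU_Speed: float
    MemSize: int
    CPU_Usage: float
    Mem_Usage: float
    IO_Usage: float


@dataclass
class TaskStateRow:
    JobID: str
    TaskID: str
    NodeName: str
    Time_IO: float
    Time_CPU: float
    Progress: float
    PastTime: float


@dataclass
class NodePrefetchRow:
    NodeName: str
    preFetched: bool = False
    preFetchedTaskID: Optional[str] = None


@dataclass
class TaskPreassignRow:
    TaskID: str
    preScheduled: bool = False
    preScheduledNodeName: Optional[str] = None


# Table keys are an assumption: the source names the fields, not the keys.
node_state: Dict[str, NodeStateRow] = {}
task_state: Dict[Tuple[str, str], TaskStateRow] = {}  # (NodeName, TaskID)
node_prefetch: Dict[str, NodePrefetchRow] = {}
task_preassign: Dict[str, TaskPreassignRow] = {}


def mark_preassigned(task_id: str, node_name: str) -> None:
    """Record a pre-assignment in both pre-assignment tables (hypothetical helper)."""
    task_preassign[task_id] = TaskPreassignRow(task_id, True, node_name)
    node_prefetch[node_name] = NodePrefetchRow(node_name, True, task_id)


mark_preassigned("t1", "node-1")
```

The two pre-assignment tables hold the same relation from opposite directions, which lets the master node look up either "which task is waiting for this node" or "which node is this task waiting for" directly.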

The size of the task slot of a computing node, as referred to in step 203, is the maximum number of tasks the computing node can execute in parallel at any one time; it is configured before the distributed node cluster starts. A computing node requests a task from the master node only when it has a free task slot. The request is carried in the heartbeat message, which contains a task-request flag; if the flag is true, the computing node has a free task slot and the master node may assign a task to it for execution.
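The rule for setting the request flag is simple enough to state as a one-line check. The function name and parameters in this Python sketch are hypothetical; it assumes the computing node knows its configured slot count and how many tasks it is currently running.

```python
def should_request_task(slot_count: int, running_tasks: int) -> bool:
    """Return True when the node has at least one free task slot."""
    if slot_count < 1:
        raise ValueError("slot_count must be at least 1")
    return running_tasks < slot_count


# A node configured with 2 slots and 1 running task sets the flag;
# the same node with 2 running tasks does not.
flag_when_free = should_request_task(slot_count=2, running_tasks=1)
flag_when_full = should_request_task(slot_count=2, running_tasks=2)
```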

The assignment of tasks by the master node to requesting computing nodes, as described in step 204, is decided by the distributed scheduling algorithm, which is implemented in the scheduler of the master node. Several computing nodes may request tasks at the same time. The scheduler reads the node state table, the task state table, the node prefetch table and the task pre-assignment table and, together with the queue of remaining tasks, uses the distributed scheduling algorithm to determine the priority order in which computing nodes are assigned tasks and the number of tasks assigned; it then updates the node prefetch table and the task pre-assignment table.

3. Advantages and effects: Compared with the prior art, the main advantages of the dynamic task scheduling method under a distributed computing model in a cloud computing environment are: (1) by analyzing the dynamic performance changes of the computing nodes and their historical task execution information, the master node assigns tasks more reasonably and makes fuller use of the better-performing computing nodes, whereas existing task scheduling methods do not consider the dynamic performance changes of the computing nodes; (2) it changes the convention in typical distributed computing models (such as MapReduce) that a computing node obtains a task simply by requesting one from the master node, and instead gives the master node the right to choose which computing node executes a task, which avoids the bottleneck caused by poorly performing computing nodes.

(4) Description of drawings

Fig. 1 is a schematic diagram of the composition of the distributed computing nodes in the cloud computing environment of the present invention.

Fig. 2 is a flow diagram of distributed task scheduling in the cloud environment based on distributed node performance and task execution status.

Fig. 3 is an interaction diagram of the three phases of the present invention (initialization, information update and task scheduling).

Fig. 4 is a detailed flowchart of the three phases of the present invention.

Fig. 5 is a flow diagram of the information update module of the present invention.

Fig. 6 is a flow diagram of the task scheduling module of the present invention.

The reference numerals in the figures are as follows:

201-204: step numbers; 501-505: step numbers; 601-604: step numbers.

(5) Embodiment

To make the purpose, technical scheme and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and specific embodiments.

The equipment environment required by the present invention is shown in Fig. 1. The distributed computing nodes in the cloud environment mainly comprise one master node (host node) and a plurality of computing nodes (child nodes). A computing node may be a physical machine or a virtual machine; this is transparent to the master node. The nodes are interconnected by a network. The master node and the computing nodes interact by remote procedure call (RPC). The master node is mainly responsible for receiving the heartbeat messages of the computing nodes, and for analyzing them and giving feedback to control the scheduling and execution of tasks. Its node analyzer receives and analyzes the performance information of the computing nodes and updates the node state table; its task analyzer receives and analyzes the task information of the computing nodes and updates the task state table. Besides executing tasks, a computing node is mainly responsible for collecting its own performance information and task execution information and sending them to the master node. Its node performance monitor collects the node's performance information over a recent period; its task monitor collects the records of the tasks the node executed over a recent period.

In terms of software, the present invention requires each node to run the (SuSE) Linux operating system with Java Development Kit 1.6 or later installed.

In terms of environment, the present invention requires that the nodes can access one another over ssh without a password.

The dynamic task scheduling flow based on node performance and task execution status is shown in Fig. 2 and mainly comprises two parts: (1) each computing node collects and packages its heartbeat message and sends it to the master node, and the master node creates and updates the node state table and the task state table according to the heartbeat messages received; (2) after receiving the task request of a computing node, the master node assigns a task to the computing node according to the scheduling algorithm and updates the node prefetch table and the task pre-assignment table.

The method comprises three phases: initialization, information update and task scheduling. Their interaction is shown in Fig. 3. In the initialization phase, the master node receives jobs and sets up the node state table and the task state table. In the information update phase, the master node receives the heartbeat messages of the computing nodes and updates the node state table, the task state table, the node prefetch table and the task pre-assignment table; if a computing node requests a task, the task scheduling phase is entered. In the task scheduling phase, the master node assigns a task to the computing node according to the node information and task information, and then returns to the information update phase to wait for heartbeat messages from the computing nodes.

An example is described below. As shown in Fig. 4, the method of the present invention comprises the following steps:

Step 401: The node performance monitor on a computing node collects the node's performance information and the task monitor collects the node's task execution information; these are packaged into a heartbeat message and sent to the master node. The period of information collection and heartbeat sending is 3 seconds.

Step 402: The master node receives and analyzes the heartbeat messages of the computing nodes. If this is the first heartbeat message received, it creates the node state table and the task state table; if the tables already exist, it updates them each time a heartbeat message is received. Based on the node state table and the task state table, the master node pre-assigns tasks to computing nodes and updates the node prefetch table and the task pre-assignment table. See the information update module in Fig. 5 for details.

Step 403: If a computing node has a free task slot available, it adds a flag to its next heartbeat message to request a task from the master node.

Step 404: After receiving the task request of a computing node, the master node assigns it a task according to the scheduling policy. See the task scheduling module in Fig. 6 for details.

The detailed flow of the information update module is shown in Fig. 5:

Step 501: The master node listens for RPC calls from the computing nodes and receives the heartbeat messages they send. The master node can receive the heartbeat message of only one computing node at a time; if other computing nodes send heartbeat messages while it is receiving one, the master node places the later-arriving computing nodes in a waiting queue. On each computing node, the node performance monitor monitors and collects the node's performance information over a recent period, and the task monitor monitors the tasks executing on the node and collects the records of the 3 most recently executed historical tasks; the computing node packages the performance information and task information into a heartbeat message. If the task information has not changed over the recent period, the heartbeat message may contain only the node's performance information. The computing node sends the heartbeat message to the master node by RPC at regular intervals; the heartbeat period is 3 seconds. Every heartbeat message should contain the node's performance information and the information on the tasks currently executing. In addition, whenever a computing node finishes a task, it adds the execution record of the just-finished task to its next heartbeat message. The task information therefore contains two kinds of records: tasks that have finished but have not yet been reported (possibly none), and tasks in progress.
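The one-heartbeat-at-a-time behaviour of step 501 can be sketched with a waiting queue. This Python example is a simplified, single-threaded illustration; the class `HeartbeatReceiver`, its methods, and the first-in-first-out ordering of waiting heartbeats are assumptions, since the source only states that later heartbeats wait in a queue.

```python
import queue


class HeartbeatReceiver:
    """Handles heartbeats strictly one at a time; later arrivals wait in a queue."""

    def __init__(self):
        self._waiting = queue.Queue()  # FIFO order is an assumption
        self.processed = []            # node names in the order they were handled

    def submit(self, node_name, payload):
        """Called when a computing node sends a heartbeat."""
        self._waiting.put((node_name, payload))

    def drain(self, handle):
        """Process every waiting heartbeat, one after another."""
        while True:
            try:
                node_name, payload = self._waiting.get_nowait()
            except queue.Empty:
                break
            handle(node_name, payload)
            self.processed.append(node_name)


seen = {}
receiver = HeartbeatReceiver()
receiver.submit("node-a", {"CPU_Usage": 0.2})
receiver.submit("node-b", {"CPU_Usage": 0.9})
receiver.drain(lambda name, payload: seen.update({name: payload}))
```

In a real deployment the RPC layer would call `submit` from several threads; `queue.Queue` is thread-safe, so the same structure would still serialize the handling.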

Step 502: The master node updates the node state table and the task state table according to the received heartbeat message. For the node state table, the computing node's status information in the heartbeat message overwrites the entry for that computing node. The task state table records, for each node, the 3 most recently executed historical tasks and the tasks in progress. Each time the master node receives new task status information, it first checks for tasks that have finished but have not yet been reported. If there is one, it obtains the task's TaskID and checks whether the task exists in the task state table: if it does, the task's entry is updated; otherwise the oldest task entry of that computing node is deleted and the completed task's entry is added. For a task in progress, it obtains the task's TaskID; if the task exists in the task state table, its entry is updated; otherwise an entry for the task is added.
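A minimal Python sketch of the task state table update follows. It keeps one ordered mapping per node and evicts the oldest finished record once more than 3 are held. Two points are assumptions that go slightly beyond the text: eviction happens only when the limit of 3 is exceeded, and a task that was already listed as running is treated as the newest finished record when it completes. The function name and the `done` key are hypothetical.

```python
from collections import OrderedDict

HISTORY_LIMIT = 3  # number of finished tasks kept per node, as in the text


def update_task_state(table, node, finished, running):
    """Apply one heartbeat's task records to the task state table.

    table: dict mapping node name -> OrderedDict of TaskID -> record dict.
    finished / running: lists of record dicts, each with a "TaskID" key.
    """
    rows = table.setdefault(node, OrderedDict())
    for rec in finished:
        tid = rec["TaskID"]
        row = rows.pop(tid, {})  # reuse the existing entry if the task is known
        row.update(rec)
        row["done"] = True
        rows[tid] = row          # re-insert as the most recent finished record
        done_ids = [t for t, r in rows.items() if r["done"]]
        for old in done_ids[:-HISTORY_LIMIT]:
            del rows[old]        # drop the oldest finished records
    for rec in running:
        row = rows.setdefault(rec["TaskID"], {"done": False})
        row.update(rec)


table = {}
update_task_state(table, "n1", [], [{"TaskID": "t0", "Progress": 0.5}])
for i in range(1, 4):
    update_task_state(
        table, "n1", [{"TaskID": f"t{i}", "Time_IO": 1.0, "Time_CPU": 2.0}], []
    )
# t0 now finishes; t1 is the oldest finished record and is evicted.
update_task_state(table, "n1", [{"TaskID": "t0", "Time_IO": 0.0, "Time_CPU": 3.0}], [])
```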

Step 503: Update the node prefetch table and the task pre-assignment table according to the node state table and the task state table. For a task m in the task list, the time each node would need to execute the task is predicted from the node state table and the task state table. The prediction algorithm is as follows:

T_{i j_{i}} = \frac{\sum_{j-h}^{j-1} (t_{s} + t_{io} + t_{cpu})}{h}, \quad i = 1, 2, \ldots, n \qquad (1)

where T_{i j_{i}} is the predicted time required for the i-th computing node to execute its j_{i}-th task, t_{s} is the time until an available task slot appears, t_{io} is the time required to copy data, t_{cpu} is the data processing time, h is the number of successfully executed tasks of the computing node used as reference, and n is the number of computing nodes in the cluster.
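Formula (1) is an average over the last h completed tasks of a node. The Python sketch below is one way to compute it; the function name, the tuple layout of the history, the default h = 3 (borrowed from the 3 historical tasks kept per node) and the fallback of averaging over fewer records when the node has less than h of them are all assumptions.

```python
def predicted_time(history, h=3):
    """Average of (t_s + t_io + t_cpu) over the last h completed tasks.

    history: list of (t_s, t_io, t_cpu) tuples for one node, oldest first.
    """
    recent = history[-h:]
    if not recent:
        raise ValueError("no completed tasks to average over")
    total = sum(t_s + t_io + t_cpu for t_s, t_io, t_cpu in recent)
    return total / len(recent)


# Three past tasks, each taking 6 seconds in total.
t_node = predicted_time([(1.0, 2.0, 3.0), (0.0, 1.0, 5.0), (2.0, 0.0, 4.0)])
# With h = 2 only the last two records count: (6 + 2) / 2.
t_short = predicted_time([(9.0, 9.0, 9.0), (1.0, 2.0, 3.0), (0.0, 0.0, 2.0)], h=2)
```

Because the estimate depends only on a node's own history, it ranks nodes by recent observed speed rather than by the content of the task being placed.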

After obtaining the values of T_{i j_{i}}, the master node selects the smallest one and pre-assigns the task to the corresponding computing node. Task m is marked as pre-assigned, and the computing node that will execute it is marked as prefetched.

The master node then continues: for the next task not marked as pre-assigned, it selects an executing computing node from among the computing nodes not marked as prefetched. Only num tasks are pre-assigned each time, where num = 5. After pre-assignment finishes, the task queue may still contain tasks not marked as pre-assigned, and the node list may still contain nodes not marked as prefetched.
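One pre-assignment round, as described above, can be sketched as a greedy loop: each pending task goes to the not-yet-prefetched node with the smallest predicted time, and the round stops after num tasks or when every node is prefetched. The function `preassign_round` and its argument shapes are hypothetical, and the sketch assumes one predicted time per node.

```python
NUM_PER_ROUND = 5  # "num = 5" in the text


def preassign_round(pending_tasks, predicted, prefetched, num=NUM_PER_ROUND):
    """Pre-assign up to `num` tasks to the fastest nodes not yet prefetched.

    pending_tasks: task IDs not yet pre-assigned, in queue order.
    predicted: dict mapping node name -> predicted time T for that node.
    prefetched: set of node names already holding a pre-assigned task (mutated).
    Returns a dict mapping task ID -> node name.
    """
    assignment = {}
    for task in pending_tasks:
        if len(assignment) >= num:
            break
        candidates = [n for n in predicted if n not in prefetched]
        if not candidates:
            break  # every node already has a pre-assigned task
        best = min(candidates, key=lambda n: predicted[n])
        assignment[task] = best
        prefetched.add(best)
    return assignment


predicted = {"a": 5.0, "b": 3.0, "c": 9.0}
prefetched = set()
result = preassign_round(["t1", "t2", "t3", "t4"], predicted, prefetched)
```

In this run the three nodes are used in order of predicted speed and `t4` stays unmarked, matching the remark that tasks may remain without a pre-assignment after a round.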

Step 504: If the computing node has set the task-request flag in the heartbeat message to true, the task scheduling module is entered.

The detailed flow of the task scheduling module is shown in Fig. 6:

Step 601: The master node first looks up the node prefetch table to determine whether a task has been pre-assigned to this node. If so, it assigns the pre-assigned task to the node, marks the task as assigned, and marks the node as not prefetched.

Step 602: If the master node has not pre-assigned a task to this node, then, by the prediction algorithm of step 503, a node not marked as prefetched is a node with poorer performance or weaker computing capacity. The master node selects a task from the task queue for this node to execute but does not mark the task as assigned. At the next pre-assignment, the task will be pre-assigned to a node with faster processing speed, so that a backup copy of the task runs on a fast node to ensure its smooth execution.

Step 603: After assigning the task to the computing node, the master node updates the node prefetch table and the task pre-assignment table.

Step 604: The task scheduling phase ends and the information update phase is entered; the master node continues to receive and process the heartbeat messages sent by the computing nodes.
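Steps 601 to 603 can be summarized in a short Python sketch. The function `schedule_on_request` and the simplified table shapes (a dict from node to pre-assigned task ID, a dict from task ID to node, a list as the task queue, a set of assigned tasks) are assumptions chosen to keep the example small.

```python
def schedule_on_request(node, node_prefetch, task_preassign, task_queue, assigned):
    """Return the task ID handed to `node`, or None if no task is available.

    node_prefetch: dict node -> pre-assigned task ID, or None.
    task_preassign: dict task ID -> node it is pre-assigned to.
    task_queue: list of task IDs still waiting, in order.
    assigned: set of task IDs marked as assigned.
    """
    task = node_prefetch.get(node)
    if task is not None:
        # Step 601: hand over the pre-assigned task and clear the markers.
        assigned.add(task)
        node_prefetch[node] = None
        task_preassign.pop(task, None)
        if task in task_queue:
            task_queue.remove(task)
        return task
    # Step 602: the node was not prefetched, so it is treated as slow. Give it
    # a task but leave the task unmarked, so that a later pre-assignment round
    # can place a backup copy on a faster node.
    for task in task_queue:
        if task not in assigned and task not in task_preassign:
            return task
    return None


node_prefetch = {"fast": "t1", "slow": None}
task_preassign = {"t1": "fast"}
task_queue = ["t1", "t2"]
assigned = set()
got_fast = schedule_on_request("fast", node_prefetch, task_preassign, task_queue, assigned)
got_slow = schedule_on_request("slow", node_prefetch, task_preassign, task_queue, assigned)
```

Leaving `t2` in the queue after handing it to the slow node is what allows the same task to be pre-assigned again later, which is the backup mechanism step 602 describes.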

Finally, it should be noted that the above embodiments only illustrate, and do not limit, the technical scheme of the present invention. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the invention may still be modified or equivalently substituted, and that any modification or partial substitution that does not depart from the spirit and scope of the invention shall fall within the scope of its claims.

Claims

1. A dynamic task scheduling method under a distributed computing model in a cloud computing environment, which dynamically obtains and analyzes the resource requirements of tasks and the performance information and historical task execution information of nodes, and dynamically controls task assignment subject to those requirements being met, thereby improving job response speed and node resource utilization, characterized in that the method comprises the following steps:

Step 1: Each computing node dynamically collects its own performance information and task execution information and reports them to the master node in the form of a heartbeat message; the master node receives and analyzes the heartbeat messages of the computing nodes and generates the node state table and the task state table;

Step 2: Based on the node state table and the task state table, the master node pre-assigns tasks to computing nodes and updates the node prefetch table and the task pre-assignment table;

Step 3: If a computing node has a free task slot available, it adds a flag to its next heartbeat message to request a task from the master node;

Step 4: After receiving the task request of a computing node, the master node assigns it a task according to the scheduling policy.

2. The dynamic task scheduling method under a distributed computing model in a cloud computing environment according to claim 1, characterized in that: the node performance information and task execution information described in step 1 are the main data sources from which the master node updates the node state table and the task state table; the node performance information comprises CPU frequency, memory size, CPU usage, memory usage and I/O resource usage; the task execution information covers both tasks that have just finished and tasks in progress; for a task that has just finished, it comprises the task's TaskID, the JobID of the job it belongs to, the time spent on I/O, that is, copying the data to be processed, and the time spent on CPU computation, wherein data copying occurs only when the computing node does not hold the input data of the task; for a task in progress, it comprises the task's TaskID, the JobID of the job it belongs to, the execution progress and the time executed so far; each computing node collects both kinds of information at regular intervals, packages them into a heartbeat message and sends it to the master node.

3. The dynamic task scheduling method under a distributed computing model in a cloud computing environment according to claim 1, characterized in that: the node state table and task state table described in step 2 are the main reference information with which the master node formulates the task assignment scheme; the node state table describes the performance state of each computing node over a recent period, and the task state table records how each computing node has processed tasks over a recent period; the master node creates both tables when it first receives a heartbeat message from a computing node, and updates them each time it receives a heartbeat message thereafter; the node state table contains the fields NodeName, CPU_Speed, MemSize, CPU_Usage, Mem_Usage and IO_Usage; the task state table contains the fields JobID, TaskID, NodeName, Time_IO, Time_CPU, Progress and PastTime; the node prefetch table records, per computing node, the task that the master node has pre-assigned to it, and contains the fields NodeName, preFetched and preFetchedTaskID; the task pre-assignment table records, per task, the computing node to which the master node has pre-assigned it, and contains the fields TaskID, preScheduled and preScheduledNodeName; the specific implementation process is as follows:

1) The master node listens for RPC calls from the computing nodes and receives the heartbeat messages they send; the master node can receive the heartbeat message of only one computing node at a time, and if other computing nodes send heartbeat messages while it is receiving one, the master node places the later-arriving computing nodes in a waiting queue; on each computing node, the node performance monitor monitors and collects the node's performance information over a recent period, and the task monitor monitors the tasks executing on the node and collects the records of the 3 most recently executed historical tasks, and the computing node packages the performance information and task information into a heartbeat message; if the task information has not changed over the recent period, the heartbeat message may contain only the node's performance information; the computing node sends the heartbeat message to the master node by RPC at regular intervals; the heartbeat period is 3 seconds, and every heartbeat message should contain the node's performance information and the information on the tasks currently executing; whenever a computing node finishes a task, it adds the execution record of the just-finished task to its next heartbeat message, so that the task information contains two kinds of records: tasks that have finished but have not yet been reported, and tasks in progress;

2) The main control node updates the node state table and the task state table according to the received heartbeat message. For the node state table, the computing node's status information in the heartbeat message overwrites the entry for that computing node. The task state table records, for each node, the 3 most recently executed historical tasks and the tasks in progress. Each time the main control node receives new task status information, it first checks for tasks that have finished but have not yet been reported. If there is one, it obtains the task's TaskID and checks whether the task exists in the task state table: if it exists, the task's entry is updated; otherwise the oldest task entry for that computing node is deleted and the completed task's information is added. For a task in progress, the main control node obtains its TaskID; if the task exists in the task state table, its entry is updated, and otherwise its information is added to the table;

3) The node prefetch table and the task pre-allocation table are updated according to the node state table and the task state table. For the m-th task in the task list, the time each node needs to execute the task is predicted from the node state table and the task state table, using the following prediction algorithm:

T_{i j_{i}} = \frac{\sum_{j-h}^{j-1} (t_{s} + t_{io} + t_{cpu})}{h}, \quad i = 1, 2, \ldots, n \qquad (1)

where T_{i j_{i}} is the predicted time for the i-th computing node to execute its j_{i}-th task, t_{s} is the time until a free task slot becomes available, t_{io} is the time required to copy the data, t_{cpu} is the data processing time, h is the number of successfully executed tasks used as the reference for the computing node, and n is the number of computing nodes in the cluster;

After obtaining the values of T_{i j_{i}}, the main control node selects the smallest one, pre-allocates the task to the corresponding computing node, marks task m as pre-allocated, and marks the computing node that is to execute it as prefetched;

Then, from the computing nodes not marked as prefetched, the main control node selects a node to execute the next task not marked as pre-allocated. Only num tasks are pre-allocated in each round, where num = 5. After pre-allocation finishes, the task queue may still contain tasks not marked as pre-allocated, and the node list may still contain nodes not marked as prefetched;

4) If the computing node has set the task-request flag in the heartbeat message to true, the task scheduling module is entered.
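The prediction of formula (1) and the pre-allocation round of step 3) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `TaskRecord`, `predict_time` and `preallocate`, the use of dictionaries for the two tables, and the rule that a node without history gets no prediction are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

NUM_PREALLOC = 5  # "num" in the claim: tasks pre-allocated per round


@dataclass
class TaskRecord:
    """One successfully executed task on a node (hypothetical layout)."""
    t_s: float    # wait until a free task slot appeared
    t_io: float   # time to copy the data
    t_cpu: float  # data processing time


def predict_time(history: List[TaskRecord]) -> Optional[float]:
    """Formula (1): mean of (t_s + t_io + t_cpu) over the h reference tasks."""
    if not history:
        return None  # assumption: no history, no prediction
    return sum(r.t_s + r.t_io + r.t_cpu for r in history) / len(history)


def preallocate(tasks: List[str],
                histories: Dict[str, List[TaskRecord]],
                prefetched: Dict[str, str],
                prescheduled: Dict[str, str]) -> Dict[str, str]:
    """Pre-allocate up to NUM_PREALLOC tasks, each to the fastest free node.

    prefetched maps node name -> pre-allocated task id (node prefetch table);
    prescheduled maps task id -> node name (task pre-allocation table).
    """
    assigned: Dict[str, str] = {}
    for task_id in tasks:
        if len(assigned) == NUM_PREALLOC:
            break
        if task_id in prescheduled:
            continue  # already pre-allocated in an earlier round
        # Predicted time for every node not yet marked as prefetched.
        candidates = {}
        for node, history in histories.items():
            if node in prefetched:
                continue
            predicted = predict_time(history)
            if predicted is not None:
                candidates[node] = predicted
        if not candidates:
            break  # no node left to prefetch for
        best = min(candidates, key=candidates.get)
        prefetched[best] = task_id
        prescheduled[task_id] = best
        assigned[task_id] = best
    return assigned
```

In this sketch a round stops early when every node with usable history is already marked as prefetched, which matches the remark that tasks and nodes may remain unmarked after pre-allocation.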

4. The dynamic task scheduling method under the distributed computing model in a cloud computing environment according to claim 1, characterized in that: the size of the task slot of the computing node of step 3 is the maximum number of tasks that the computing node can execute in parallel at one time, and is configured before the distributed node cluster starts. A computing node requests a task from the main control node only when it has a free task slot. The request is carried in the heartbeat message, which contains a task-request flag; if the flag is true, the computing node has a free task slot and the main control node may assign a task to it for execution.
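The relationship between slot capacity and the request flag can be sketched as follows. This is a hedged illustration only: the class `ComputeNode`, the method names, and the dictionary keys `performance`, `running_tasks` and `request_task` are hypothetical and do not come from the claims; only the field name NodeName is taken from the node state table.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ComputeNode:
    """A computing node with a fixed number of task slots (hypothetical)."""
    name: str
    slots: int  # configured before the cluster starts
    running: List[str] = field(default_factory=list)

    def has_free_slot(self) -> bool:
        # A slot is free while fewer tasks run than the configured maximum.
        return len(self.running) < self.slots

    def build_heartbeat(self, perf: Dict[str, float]) -> dict:
        """Package performance and task information plus the request flag."""
        return {
            "NodeName": self.name,
            "performance": perf,
            "running_tasks": list(self.running),
            # True only when a slot is free, so the node can accept a task.
            "request_task": self.has_free_slot(),
        }
```

For example, a node configured with two slots sets the flag to true until two tasks are running, after which its heartbeat messages carry only status information.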

5. The dynamic task scheduling method under the distributed computing model in a cloud computing environment according to claim 1, characterized in that: in step 4, the task that the main control node allocates to a requesting computing node is determined by the distributed scheduling algorithm, which is implemented in the scheduler of the main control node. Several computing nodes may request tasks at the same time. The scheduler reads the node state table, the task state table, the node prefetch table and the task pre-allocation table, combines them with the remaining task queue, determines according to the distributed scheduling algorithm the priority order in which computing nodes are allocated tasks and the number of tasks allocated, and then updates the node prefetch table and the task pre-allocation table. The specific implementation process is as follows:

1) The main control node first looks up the node prefetch table to determine whether a task has been pre-allocated to this node. If so, it allocates the pre-allocated task to the node, marks the task as allocated, and marks the node as not prefetched;

2) If the main control node has not pre-allocated a task to this node, then according to the prediction formula (1) of step 2 a node not marked as prefetched is a node with poorer performance or weaker computing power. The main control node selects a task from the task queue for this node to execute but does not mark the task as allocated. At the next pre-allocation the task will be pre-allocated to a faster node, so that a backup copy of the task runs on the fast node to ensure that the task completes smoothly;

3) After allocating the task to the computing node, the main control node updates the node prefetch table and the task pre-allocation table;

4) The task scheduling stage ends and the information update stage begins; the main control node continues to receive and process the heartbeat messages sent by the computing nodes.
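The two allocation branches of this claim can be sketched in Python under stated assumptions. The function name `schedule`, the dictionary and set representations of the tables, and the rule that a non-prefetched node receives the first queued task that is not yet pre-allocated are illustrative choices; the claim only says that a task is chosen from the task queue.

```python
from typing import Dict, List, Optional, Set


def schedule(node: str,
             prefetched: Dict[str, str],
             prescheduled: Dict[str, str],
             queue: List[str],
             distributed: Set[str]) -> Optional[str]:
    """Choose a task for a node whose heartbeat requested one.

    prefetched maps node name -> pre-allocated task id;
    prescheduled maps task id -> node name;
    queue holds the ids of tasks not yet marked as allocated.
    """
    task_id = prefetched.pop(node, None)  # node becomes "not prefetched"
    if task_id is not None:
        # Branch 1): hand over the pre-allocated task and mark it allocated.
        prescheduled.pop(task_id, None)
        if task_id in queue:
            queue.remove(task_id)
        distributed.add(task_id)
        return task_id
    # Branch 2): the node was not prefetched, so it is treated as slower.
    # It gets a task, but the task stays in the queue and is not marked
    # allocated, so a later pre-allocation round can give a backup copy
    # to a faster node.
    for candidate in queue:
        if candidate not in prescheduled:  # assumption: skip reserved tasks
            return candidate
    return None
```

A fuller version would also remember which tasks have already been handed to slower nodes, so that two slow nodes do not receive the same task; that bookkeeping is omitted here.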