CN102073546B - Task-dynamic dispatching method under distributed computation mode in cloud computing environment - Google Patents

Task-dynamic dispatching method under distributed computation mode in cloud computing environment

Info

Publication number
CN102073546B
Authority
CN
China
Prior art keywords
task
node
computing
main controlled
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201010583597
Other languages
Chinese (zh)
Other versions
CN102073546A (en)
Inventor
肖利民
毛宏
祝明发
阮利
胡声秋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI JUNESH INFORMATION TECHNOLOGY CO., LTD.
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN 201010583597
Publication of CN102073546A
Application granted
Publication of CN102073546B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Multi Processors (AREA)

Abstract

The invention provides a dynamic task scheduling method under a distributed computing model in a cloud computing environment. The method comprises four steps: (1) a master node receives and analyzes the heartbeat information of the computing nodes; (2) the master node pre-allocates tasks according to a node state table and a task state table; (3) a computing node requests a task from the master node; and (4) the master node allocates a task to the computing node. The method first considers the resource requirements of the tasks and the performance information of the nodes, and dynamically controls task allocation while those requirements are met, thereby improving job response speed and node resource utilization. The method has broad practical value and application prospects in the field of distributed computing in cloud computing environments.

Description

Dynamic task scheduling method under a distributed computing model in a cloud computing environment
(1) Technical field
The present invention relates to a task scheduling method for a distributed computing platform, and specifically to a dynamic task scheduling method under a distributed computing model in a cloud computing environment. It is a dynamic, node-performance-based method for scheduling tasks within a task scheduling subsystem, and belongs to the field of computer technology.
(2) Background art
At present, the rapid development of network applications keeps increasing the demand for computing power. Following the development of grid computing, parallel computing, and distributed computing, cloud computing has emerged; it has been listed as a technology direction to be prioritized nationally and has become a hot research topic in both academia and industry. As cloud computing grows popular, more and more network (Web) services and commercial applications are deployed in cloud computing environments. For the distributed nodes that handle application-layer computation requests in a cloud environment, a current research focus is how to schedule tasks so that upper-layer computation requests are processed efficiently, the resource utilization of nodes with heterogeneous performance is improved, and job response speed is increased.
When massive data is processed in a cloud environment, task scheduling built on distributed storage and distributed parallel processing is one of the key steps. Improving job and task scheduling methods is a current research focus. Research at home and abroad mainly covers scheduling among jobs when multiple jobs run in parallel, scheduling of subtasks while a single job runs, and optimizing the number of subtasks that run in parallel.
In job scheduling, the most basic approach is first-in, first-out (FIFO) scheduling. This approach has clear drawbacks; in particular, when there are many jobs, the overall response time is long. The Fair Scheduler addresses this problem well. When a single job is running, it uses the whole cluster. When other jobs are submitted, the system assigns free task slots to the new jobs so that each job gets roughly an equal share of CPU time, which lets small jobs get a fast response while guaranteeing the service level of large jobs. The Capacity Scheduler supports multiple queues: a submitted job enters one queue, resources are allocated per queue, and the jobs in each queue use that queue's resources. Within a queue, a higher-priority job gets resources before a lower-priority one, but once a job has started executing it is not preempted by a higher-priority job. To prevent one or more users from monopolizing all resources, a certain proportion of the resources is mandatorily assigned to each queue. The MR-Predict-based three-queue scheduler proposed by the Institute of Computing Technology, Chinese Academy of Sciences, divides workloads into three classes by CPU and I/O utilization, and can improve CPU and I/O resource utilization at the same time under different types of workload.
In task scheduling, the LATE (Longest Approximate Time to End) scheduling algorithm proposed by researchers at the University of California, Berkeley, focuses on optimizing the scheduling of backup tasks within a job. By estimating the time each task needs to finish, it runs backup tasks only for the tasks expected to finish last, and only on fast nodes. Researchers at Purdue University proposed a method for optimizing the task-count configuration based on historical statistics. Their work focuses on how the number of tasks running simultaneously on each node in the cloud environment affects performance during job execution; an optimized configuration is derived from historical statistics and applied to new, similar jobs.
However, in most cases different nodes have different performance, and the load on each node also varies over time. Determining a dynamic task allocation scheme from the performance heterogeneity and dynamic load of the nodes is therefore significant for processing computing tasks efficiently, improving the resource utilization of distributed nodes, and increasing job response speed.
(3) Summary of the invention
1. Purpose:
The main purpose of the present invention is to provide a dynamic task scheduling method under a distributed computing model in a cloud computing environment. The method first considers the resource requirements of the tasks and the performance information of the nodes, and dynamically controls task allocation while those requirements are met, thereby improving job response speed and node resource utilization.
To achieve this purpose, the present invention proposes a dynamic task scheduling method, under a distributed computing model in a cloud computing environment, based on node performance and task execution status. The structure of the distributed computing nodes in the cloud computing environment is shown in Fig. 1. It mainly comprises one master node (host node) and multiple computing nodes (child nodes). A computing node may be a physical machine or a virtual machine; the difference is transparent to the master node. The nodes are interconnected by a network. The master node and the computing nodes interact through remote procedure calls (RPC). The master node is mainly responsible for receiving the heartbeat messages of the computing nodes, analyzing them, and giving feedback to control task scheduling and execution. Besides executing tasks, a computing node is mainly responsible for collecting its own performance information and task execution information and sending them to the master node.
2. Technical scheme:
The technical scheme of the present invention is as follows:
The detailed flow of the dynamic task scheduling method under a distributed computing model in a cloud computing environment is shown in Fig. 2. The method comprises the following steps:
Step 201. Each computing node dynamically collects its own performance information and task execution information, and reports them to the master node in the form of a heartbeat message.
Step 202. The master node receives and analyzes the heartbeat messages of the computing nodes, and creates and continually updates the node state table and the task state table. Based on these two tables, the master node pre-allocates tasks to computing nodes and updates the node prefetch table and the task pre-allocation table.
Step 203. If a computing node has a free task slot available, it adds a flag requesting a task from the master node to its next heartbeat message.
Step 204. After receiving a task request from a computing node, the master node allocates a task to it according to the scheduling strategy, and updates the node prefetch table and the task pre-allocation table.
The node performance information and task execution information described in step 201 are the main data sources from which the master node updates the node state table and the task state table. Node performance information may include CPU frequency, memory size, CPU usage, memory usage, I/O resource utilization, and so on. Task execution information comprises the execution information of tasks that have just finished and of tasks that are in progress. For a task that has just finished, the information comprises the task's TaskID, the JobID of the job it belongs to, the time spent on I/O (copying the data to be processed), and the time spent on CPU computation; data copying takes place when the computing node does not hold the input data of the task. For a task in progress, the information comprises the task's TaskID, the JobID of the job it belongs to, the execution progress, and the elapsed execution time. Each computing node collects both kinds of information at regular intervals, encapsulates them as a heartbeat message, and sends it to the master node.
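As a minimal sketch, the heartbeat contents described above could be represented with Python dataclasses. The class names, attribute names, and units here are illustrative assumptions; the text only names the kinds of information carried, not a concrete layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodePerformance:
    cpu_frequency_mhz: float  # CPU frequency (unit is an assumption)
    memory_size_mb: int       # memory size (unit is an assumption)
    cpu_usage: float          # fraction in [0, 1]
    mem_usage: float          # fraction in [0, 1]
    io_usage: float           # fraction in [0, 1]


@dataclass
class FinishedTask:
    task_id: str     # TaskID of the task that just finished
    job_id: str      # JobID of the job the task belongs to
    time_io: float   # seconds spent copying input data (0 if the data was local)
    time_cpu: float  # seconds spent on CPU computation


@dataclass
class RunningTask:
    task_id: str
    job_id: str
    progress: float   # execution progress, fraction in [0, 1]
    past_time: float  # seconds the task has already been running


@dataclass
class Heartbeat:
    node_name: str
    performance: NodePerformance
    finished: List[FinishedTask] = field(default_factory=list)  # may be empty
    running: List[RunningTask] = field(default_factory=list)


# One heartbeat from a hypothetical node with one finished and one running task.
hb = Heartbeat(
    node_name="node-1",
    performance=NodePerformance(2400.0, 8192, 0.35, 0.50, 0.10),
    finished=[FinishedTask("task-7", "job-1", time_io=0.0, time_cpu=12.5)],
    running=[RunningTask("task-9", "job-1", progress=0.4, past_time=5.0)],
)
```

Keeping finished and running tasks in separate lists mirrors the two kinds of task execution information named in the text.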
The node state table and task state table described in step 202 are the main reference information the master node uses to decide the task allocation scheme. The node state table describes the performance state of each computing node over a recent period, and the task state table records how each computing node has processed tasks over a recent period. The master node creates the two tables when it receives the first heartbeat message from a computing node, and updates them each time it receives a heartbeat message thereafter. The node state table contains the fields NodeName, CPU_Speed, MemSize, CPU_Usage, Mem_Usage, and IO_Usage; the task state table contains the fields JobID, TaskID, NodeName, Time_IO, Time_CPU, Progress, and PastTime. The node prefetch table and the task pre-allocation table record the pre-allocation information of tasks in the current cluster. The node prefetch table records, for each computing node, the task that the master node has pre-allocated to it, and contains the fields NodeName, preFetched, and preFetchedTaskID. The task pre-allocation table records, for each task, the computing node to which the master node has pre-allocated it, and contains the fields TaskID, preScheduled, and preScheduledNodeName.
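The four tables can be sketched as keyed collections of rows. The field names below are taken from the text; the Python types, the dictionary keys, the `MasterTables` class, and its `pre_allocate` helper are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodeStateRow:
    NodeName: str
    CPU_Speed: float
    MemSize: int
    CPU_Usage: float
    Mem_Usage: float
    IO_Usage: float


@dataclass
class TaskStateRow:
    JobID: str
    TaskID: str
    NodeName: str
    Time_IO: float
    Time_CPU: float
    Progress: float
    PastTime: float


@dataclass
class NodePrefetchRow:
    NodeName: str
    preFetched: bool = False
    preFetchedTaskID: Optional[str] = None


@dataclass
class TaskPreScheduleRow:
    TaskID: str
    preScheduled: bool = False
    preScheduledNodeName: Optional[str] = None


class MasterTables:
    """Hypothetical container for the four tables kept by the master node."""

    def __init__(self) -> None:
        self.node_state: Dict[str, NodeStateRow] = {}           # keyed by NodeName
        self.task_state: Dict[str, TaskStateRow] = {}           # keyed by TaskID
        self.node_prefetch: Dict[str, NodePrefetchRow] = {}     # keyed by NodeName
        self.task_presched: Dict[str, TaskPreScheduleRow] = {}  # keyed by TaskID

    def pre_allocate(self, task_id: str, node_name: str) -> None:
        # Write both pre-allocation tables together so they stay consistent.
        self.node_prefetch[node_name] = NodePrefetchRow(node_name, True, task_id)
        self.task_presched[task_id] = TaskPreScheduleRow(task_id, True, node_name)


tables = MasterTables()
tables.pre_allocate("task-3", "node-1")
```

The two pre-allocation tables hold the same relation viewed from opposite sides, which is why the sketch writes them in a single method.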
The size of the task slot of a computing node, as referred to in step 203, is the maximum number of tasks the computing node can execute in parallel at one time; it is configured before the distributed node cluster starts. A computing node requests a task from the master node only when it has a free task slot. The request is carried in the heartbeat message, which contains a task-request flag; if the flag is true, the computing node has a free task slot and the master node may assign a task to it.
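The slot rule above amounts to simple bookkeeping, sketched here under the assumption that a node tracks only a configured capacity and a count of running tasks. `TaskSlots` and `request_flag` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class TaskSlots:
    capacity: int     # maximum number of parallel tasks, fixed before cluster start
    running: int = 0  # tasks currently executing on this node

    @property
    def free(self) -> int:
        return self.capacity - self.running

    def request_flag(self) -> bool:
        # Value of the task-request flag placed in the next heartbeat message.
        return self.free > 0


slots = TaskSlots(capacity=2, running=2)
full_flag = slots.request_flag()   # no free slot, so no task is requested
slots.running -= 1                 # one task finishes
free_flag = slots.request_flag()   # a slot is free, so the node asks for a task
```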
In step 204, which task the master node allocates to a requesting computing node is determined by the distributed scheduling algorithm, which is implemented in the scheduler of the master node. Several computing nodes may request tasks at the same time. The scheduler reads the node state table, the task state table, the node prefetch table, and the task pre-allocation table, combines them with the queue of remaining tasks, determines by the distributed scheduling algorithm the priority order in which computing nodes are allocated tasks and the number of tasks allocated, and then updates the node prefetch table and the task pre-allocation table.
3. Advantages and effects: compared with the prior art, the main advantages of the dynamic task scheduling method under a distributed computing model in a cloud computing environment of the present invention are: (1) by analyzing the dynamic performance changes of the computing nodes and the execution information of historical tasks, the master node allocates tasks more reasonably and makes fuller use of the better-performing computing nodes, whereas existing task scheduling methods do not consider the dynamic performance changes of the computing nodes; (2) it changes the convention in typical distributed computing platforms (such as MapReduce) that a computing node obtains a task to execute simply by requesting one from the master node, and instead gives the master node the right to choose which computing node executes a task, thereby avoiding the bottleneck caused by poorly performing computing nodes.
(4) Description of the drawings
Fig. 1 Schematic diagram of the structure of the distributed computing nodes in the cloud computing environment of the present invention
Fig. 2 Schematic flow of distributed task scheduling in the cloud environment based on distributed node performance and task execution status
Fig. 3 Interaction structure of the three phases of the present invention (initialization, information update, and task scheduling)
Fig. 4 Detailed flowchart of the three phases of the present invention
Fig. 5 Schematic flow of the information update module of the present invention
Fig. 6 Schematic flow of the task scheduling module of the present invention
The symbols in the figures are as follows:
201-204: step numbers; 501-505: step numbers; 601-604: step numbers.
(5) Embodiment
To make the purpose, technical scheme, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and a specific embodiment.
The equipment environment required by the present invention is shown in Fig. 1. The distributed computing nodes in the cloud environment mainly comprise one master node (host node) and multiple computing nodes (child nodes). A computing node may be a physical machine or a virtual machine; the difference is transparent to the master node. The nodes are interconnected by a network. The master node and the computing nodes interact through remote procedure calls (RPC). The master node is mainly responsible for receiving the heartbeat messages of the computing nodes, analyzing them, and giving feedback to control task scheduling and execution. Within the master node, the node analyzer receives and analyzes the performance information of the computing nodes and updates the node state table, and the task analyzer receives and analyzes the task information of the computing nodes and updates the task state table. Besides executing tasks, a computing node is mainly responsible for collecting its own performance information and task execution information and sending them to the master node. On a computing node, the node performance monitor collects the node's performance information over a recent period, and the task monitor collects the records of the tasks the node has executed over a recent period.
In terms of software, the present invention requires every node to run a Linux operating system with Java Development Kit 1.6 or later installed.
In terms of environment, the present invention requires that the nodes can access one another over ssh without a password.
The flow of dynamic task scheduling based on node performance and task execution status is shown in Fig. 2. It mainly comprises two parts: (1) each computing node collects and encapsulates its heartbeat message and sends it to the master node, and the master node creates and updates the node state table and the task state table from the heartbeat messages it receives; (2) after receiving a task request from a computing node, the master node allocates a task to it according to the scheduling algorithm and updates the node prefetch table and the task pre-allocation table.
The method comprises three phases: initialization, information update, and task scheduling. Their interaction is shown in Fig. 3. In the initialization phase, the master node receives jobs and creates the node state table and the task state table. In the information update phase, the master node receives heartbeat messages from the computing nodes and updates the node state table, task state table, node prefetch table, and task pre-allocation table; if a computing node requests a task, the method enters the task scheduling phase. In the task scheduling phase, the master node allocates tasks to the computing node according to node information and task information, and then returns to the information update phase to wait for further heartbeat messages.
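The transitions among the three phases can be written as a small state function. This is only a sketch of the control flow described above; `Phase` and `next_phase` are hypothetical names, and the work done inside each phase is omitted.

```python
from enum import Enum


class Phase(Enum):
    INITIALIZATION = "initialization"
    INFORMATION_UPDATE = "information update"
    TASK_SCHEDULING = "task scheduling"


def next_phase(phase: Phase, request_task: bool = False) -> Phase:
    """Return the phase that follows `phase`.

    `request_task` is the task-request flag of the heartbeat just processed;
    it only matters in the information update phase.
    """
    if phase is Phase.INITIALIZATION:
        return Phase.INFORMATION_UPDATE
    if phase is Phase.INFORMATION_UPDATE:
        return Phase.TASK_SCHEDULING if request_task else Phase.INFORMATION_UPDATE
    # After scheduling, the master goes back to waiting for heartbeats.
    return Phase.INFORMATION_UPDATE
```

For example, `next_phase(Phase.INFORMATION_UPDATE, request_task=True)` yields the task scheduling phase, and any call from the task scheduling phase yields the information update phase again.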
An example is described below. As shown in Fig. 4, the method of the present invention comprises the following steps:
Step 401: The node performance monitor on the computing node collects the node's performance information, and the task monitor collects the node's task execution information; these are packaged into a heartbeat message and sent to the master node. Information is collected and the heartbeat message is sent every 3 seconds.
Step 402: The master node receives and analyzes the heartbeat messages of the computing nodes. On the first heartbeat message, it creates the node state table and the task state table; once the tables exist, it updates them on every heartbeat message received. The master node pre-allocates tasks to computing nodes based on the node state table and the task state table, and updates the node prefetch table and the task pre-allocation table. The details are shown in the information update module of Fig. 5.
Step 403: If a computing node has a free task slot available, it adds a flag requesting a task from the master node to its next heartbeat message.
Step 404: After receiving a task request from a computing node, the master node allocates a task to it according to the scheduling strategy. The details are shown in the task scheduling module of Fig. 6.
The detailed flow of the information update module is shown in Fig. 5:
Step 501: The master node listens for RPC calls from the computing nodes and receives the heartbeat messages they send. The master node can receive the heartbeat message of only one computing node at a time; if other computing nodes send heartbeat messages while the master node is receiving one, the master node places the later-arriving computing nodes in a waiting queue. On each computing node, the node performance monitor monitors and collects the node's performance information over a recent period, and the task monitor monitors the tasks executing on the node and collects the records of the 3 most recently executed historical tasks; the computing node encapsulates the performance information and task information as a heartbeat message. If the task information has not changed over the recent period, the heartbeat message may contain only the node's performance information. The computing node sends the heartbeat message to the master node by RPC at regular intervals; the heartbeat period is 3 seconds. Every heartbeat message should contain the node's performance information and the information of the tasks currently executing. In addition, whenever a computing node finishes a task, it adds the execution record of that task to its next heartbeat message. The task information therefore contains two kinds of entries: tasks that have finished but have not yet been reported (possibly none), and tasks in progress.
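The one-at-a-time handling with a waiting queue could be sketched with Python's thread-safe `queue.Queue`. The class `HeartbeatInbox` and its methods are hypothetical, and the RPC layer itself is not modeled: `rpc_heartbeat` simply stands in for whatever entry point the RPC framework would call.

```python
import queue


class HeartbeatInbox:
    """Hypothetical inbox: heartbeats wait in FIFO order and are handled serially."""

    def __init__(self):
        self._waiting = queue.Queue()  # thread-safe waiting queue
        self.handled = []              # node names in the order they were processed

    def rpc_heartbeat(self, node_name, heartbeat):
        # Entry point for the RPC layer; may be called for many nodes at once.
        self._waiting.put((node_name, heartbeat))

    def process_pending(self, handler):
        # Handle queued heartbeats one by one, in arrival order.
        count = 0
        while True:
            try:
                node_name, heartbeat = self._waiting.get_nowait()
            except queue.Empty:
                return count
            handler(node_name, heartbeat)
            self.handled.append(node_name)
            count += 1


inbox = HeartbeatInbox()
for name in ("node-2", "node-1", "node-3"):
    inbox.rpc_heartbeat(name, {"CPU_Usage": 0.4})
processed = inbox.process_pending(lambda node_name, heartbeat: None)
```

A FIFO queue is one straightforward reading of "waiting queue"; the text does not say whether waiting nodes are served in arrival order, so that ordering is an assumption.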
Step 502: The master node updates the node state table and the task state table from the received heartbeat message. For the node state table, the computing node's status information in the heartbeat message overwrites the entry for that computing node. The task state table records, for each node, the information of the 3 most recently executed historical tasks and of the tasks in progress. Each time the master node receives new task status information, it first checks whether it contains a task that has finished but has not yet been reported. If so, it obtains the TaskID of the task and checks whether the task exists in the task state table: if it exists, the task's entry is updated; otherwise, the oldest task entry of that computing node is deleted from the task state table and the finished task's entry is added. For a task in progress, the master node obtains the TaskID; if the task exists in the task state table, its entry is updated; otherwise, an entry for the task is added.
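The update rules of step 502 can be sketched as follows. `StateTables`, its attributes, and the `finished`/`finished_seq` bookkeeping fields are assumptions for illustration; in particular, the sketch trims each node's finished history to 3 rows after every heartbeat, which is one way to realize "delete the oldest entry and add the new one".

```python
HISTORY_SIZE = 3  # number of historical tasks kept per node, as stated in the text


class StateTables:
    def __init__(self):
        self.node_state = {}  # NodeName -> latest performance record
        self.task_rows = {}   # TaskID -> task state row
        self._seq = 0         # counter used to order finished tasks by age

    def _upsert(self, node_name, record, finished):
        # Update the row if the TaskID is known, otherwise add a new row.
        row = self.task_rows.setdefault(record["TaskID"], {})
        row.update(record, NodeName=node_name, finished=finished)
        if finished:
            self._seq += 1
            row["finished_seq"] = self._seq

    def apply_heartbeat(self, node_name, performance, finished=(), running=()):
        # Node state: the new record simply overwrites the old one.
        self.node_state[node_name] = dict(performance)
        for record in finished:
            self._upsert(node_name, record, finished=True)
        for record in running:
            self._upsert(node_name, record, finished=False)
        # Keep only the most recent HISTORY_SIZE finished tasks of this node.
        done = sorted(
            (r for r in self.task_rows.values()
             if r["NodeName"] == node_name and r["finished"]),
            key=lambda r: r["finished_seq"],
        )
        for row in done[:-HISTORY_SIZE]:
            del self.task_rows[row["TaskID"]]


tables = StateTables()
perf = {"CPU_Usage": 0.3, "Mem_Usage": 0.5, "IO_Usage": 0.1}
tables.apply_heartbeat("node-1", perf,
                       running=[{"TaskID": "t1", "JobID": "j1",
                                 "Progress": 0.5, "PastTime": 4.0}])
for task_id in ("t1", "t2", "t3", "t4"):
    tables.apply_heartbeat("node-1", perf,
                           finished=[{"TaskID": task_id, "JobID": "j1",
                                      "Time_IO": 1.0, "Time_CPU": 6.0}])
```

After the four finished reports, only the rows for t2, t3, and t4 remain for node-1, since t1 is then the oldest of four finished tasks.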
Step 503: The node prefetch table and the task pre-allocation table are updated according to the node state table and the task state table. For the m-th task in the task list, the time each node needs to execute that task is predicted from the node state table and the task state table. The prediction formula is as follows:
$$T_{ij_i} = \frac{\sum_{j_i-h}^{j_i-1} \left(t_s + t_{io} + t_{cpu}\right)}{h}, \qquad i = 1, 2, \ldots, n \qquad (1)$$
where $T_{ij_i}$ is the predicted time for the i-th computing node to execute its $j_i$-th task, $t_s$ is the time until an available task slot appears, $t_{io}$ is the time needed to copy the data, $t_{cpu}$ is the data processing time, h is the number of successfully executed tasks on the computing node that are used as reference, and n is the number of computing nodes in the cluster.
After the values of $T_{ij_i}$ are obtained, the master node selects the smallest one and pre-allocates the task to the corresponding computing node. Task m is marked as pre-allocated, and the computing node that will execute it is marked as prefetched.
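A worked sketch of the prediction and selection: each node's predicted time is the mean of (t_s + t_io + t_cpu) over its last h executed tasks, and the node with the smallest prediction is chosen. The function names are hypothetical, the sample numbers are invented for illustration, and treating a node without history as infinitely slow is an assumption the text does not make.

```python
def predict_time(history, h=3):
    """Mean of (t_s + t_io + t_cpu) over the last h executed tasks of one node.

    `history` is a list of (t_s, t_io, t_cpu) tuples, oldest first.
    If fewer than h records exist, the mean is taken over those available.
    """
    recent = history[-h:]
    if not recent:
        return float("inf")  # assumption: a node with no history is never preferred
    return sum(t_s + t_io + t_cpu for t_s, t_io, t_cpu in recent) / len(recent)


def pick_node(histories, h=3):
    """Return the name of the node with the smallest predicted time."""
    return min(histories, key=lambda name: predict_time(histories[name], h))


histories = {
    "node-a": [(1.0, 2.0, 7.0), (0.0, 3.0, 9.0), (2.0, 1.0, 6.0)],  # (10 + 12 + 9) / 3
    "node-b": [(0.0, 0.0, 8.0), (1.0, 0.0, 8.0), (0.0, 0.0, 7.0)],  # (8 + 9 + 7) / 3
}
best = pick_node(histories)
```

With these numbers node-a is predicted at 31/3 seconds (about 10.33) and node-b at 8 seconds, so node-b would receive the pre-allocated task.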
Then, for the next task not marked as pre-allocated, the master node goes on to choose an executing node from among the computing nodes not yet marked as prefetched. Only num tasks are pre-allocated in each round, where num = 5. After a round of pre-allocation finishes, the task queue may still contain tasks not marked as pre-allocated, and the node list may still contain nodes not marked as prefetched.
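One pre-allocation round can be sketched as a loop over the task queue. The function `pre_allocate` and the plain dictionaries standing in for the node prefetch table and the task pre-allocation table are assumptions; the predicted times are taken as given inputs.

```python
NUM_PER_ROUND = 5  # num in the text


def pre_allocate(task_queue, predicted, prefetched, prescheduled, num=NUM_PER_ROUND):
    """Run one pre-allocation round and return the number of tasks pre-allocated.

    task_queue   : TaskIDs in queue order
    predicted    : NodeName -> predicted execution time
    prefetched   : NodeName -> TaskID (stand-in for the node prefetch table)
    prescheduled : TaskID -> NodeName (stand-in for the task pre-allocation table)
    """
    assigned = 0
    for task_id in task_queue:
        if assigned >= num:
            break
        if task_id in prescheduled:
            continue  # already marked as pre-allocated
        candidates = [n for n in predicted if n not in prefetched]
        if not candidates:
            break  # every node is already marked as prefetched
        best = min(candidates, key=lambda n: predicted[n])
        prefetched[best] = task_id
        prescheduled[task_id] = best
        assigned += 1
    return assigned


predicted = {"node-a": 8.0, "node-b": 10.0, "node-c": 12.0}
prefetched, prescheduled = {}, {}
count = pre_allocate(["t1", "t2", "t3", "t4"], predicted, prefetched, prescheduled)
```

Here three nodes serve a queue of four tasks, so the round stops after three pre-allocations and t4 stays unmarked, matching the remark that unmarked tasks may remain.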
Step 504: If the computing node has set the task-request flag in the heartbeat message to true, the flow enters the task scheduling module.
The detailed flow of the task scheduling module is shown in Fig. 6:
Step 601: The master node first looks up the node prefetch table to determine whether a task has been pre-allocated to this node. If so, it allocates the pre-allocated task to the node, marks the task as allocated, and marks the node as not prefetched.
Step 602: If the master node has not pre-allocated a task to this node, then, by the prediction algorithm described in step 503, a node not marked as prefetched is one with poorer performance or weaker computing power. The master node therefore picks a task from the task queue for this node to execute but does not mark the task as allocated. At the next pre-allocation, the task will be pre-allocated to a faster node, so the task will have a backup task running on a fast node to ensure that it completes smoothly.
Step 603: After allocating a task to the computing node, the master node updates the node prefetch table and the task pre-allocation table.
Step 604: The task scheduling phase ends and the flow returns to the information update phase, in which the master node continues to receive and process the heartbeat messages sent by the computing nodes.
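Steps 601 to 603 can be sketched as one function. The names are hypothetical, and two details are assumptions: in step 602 the sketch picks the first queued task that is neither allocated nor pre-allocated, and it records such hand-outs in a `tentative` mapping so the same task is not given to two slow nodes.

```python
def schedule(node_name, task_queue, prefetched, prescheduled, allocated, tentative):
    """Return the TaskID handed to `node_name`, or None if nothing is available.

    prefetched   : NodeName -> TaskID pre-allocated to that node
    prescheduled : TaskID -> NodeName it was pre-allocated to
    allocated    : set of TaskIDs marked as allocated
    tentative    : TaskID -> slow node running it without the allocated mark
    """
    # Step 601: a pre-allocated task exists for this node.
    task_id = prefetched.pop(node_name, None)  # node becomes "not prefetched"
    if task_id is not None:
        prescheduled.pop(task_id, None)        # step 603: keep both tables in sync
        allocated.add(task_id)                 # mark the task as allocated
        return task_id
    # Step 602: no pre-allocation, so the node is treated as a slower node.
    for task_id in task_queue:
        if task_id in allocated or task_id in prescheduled or task_id in tentative:
            continue
        tentative[task_id] = node_name         # handed out, but NOT marked allocated
        return task_id
    return None


task_queue = ["t1", "t2", "t3"]
prefetched = {"fast-node": "t1"}
prescheduled = {"t1": "fast-node"}
allocated, tentative = set(), {}
first = schedule("fast-node", task_queue, prefetched, prescheduled, allocated, tentative)
second = schedule("slow-node", task_queue, prefetched, prescheduled, allocated, tentative)
```

Because t2 is left unmarked after going to the slow node, a later pre-allocation round remains free to assign it to a faster node as a backup, which is the behavior step 602 describes.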
Finally, it should be noted that the above embodiment only illustrates, and does not limit, the technical scheme of the present invention. Although the present invention has been described in detail with reference to the above embodiment, those of ordinary skill in the art should understand that the present invention may still be modified or equivalently replaced, and that any modification or partial replacement that does not depart from the spirit and scope of the present invention shall fall within the scope of the claims of the present invention.

Claims (1)

1. A dynamic task scheduling method under a distributed computing model in a cloud computing environment, which dynamically obtains and analyzes the resource requirements of tasks and the performance information and historical task execution information of nodes, and dynamically controls task allocation while those requirements are met, thereby improving job response speed and node resource utilization, characterized in that the method comprises the following steps:
Step 1: each computing node dynamically collects its own performance information and task execution information and reports them to the master node in the form of a heartbeat message; the master node receives and analyzes the heartbeat messages of the computing nodes and generates a node state table and a task state table;
Step 2: the master node pre-allocates tasks to computing nodes according to the node state table and the task state table, and updates a node prefetch table and a task pre-allocation table;
Step 3: if a computing node has a free task slot available, it adds a flag requesting a task from the master node to its next heartbeat message;
Step 4: after receiving a task request from a computing node, the master node allocates a task to it according to the scheduling strategy;
Wherein the node performance information and task execution information described in step 1 are the main data sources from which the master node updates the node state table and the task state table; the node performance information comprises CPU frequency, memory size, CPU usage, memory usage, and I/O resource utilization; the task execution information comprises the execution information of tasks that have just finished and of tasks that are in progress; for a task that has just finished, the information comprises the task's TaskID, the JobID of the job it belongs to, the time spent on I/O, namely copying the data to be processed, and the time spent on CPU computation, wherein data copying takes place when the computing node does not hold the input data of the task; for a task in progress, the information comprises the task's TaskID, the JobID of the job it belongs to, the execution progress, and the elapsed execution time; each computing node collects both kinds of information at regular intervals, encapsulates them as a heartbeat message, and sends it to the master node;
Wherein the node state table and task state table described in step 2 are the main reference information the master node uses to decide the task allocation scheme; the node state table describes the performance state of each computing node over a recent period, and the task state table records how each computing node has processed tasks over a recent period; the master node creates the two tables when it receives the first heartbeat message from a computing node, and updates them each time it receives a heartbeat message thereafter; the node state table contains the fields NodeName, CPU_Speed, MemSize, CPU_Usage, Mem_Usage, and IO_Usage; the task state table contains the fields JobID, TaskID, NodeName, Time_IO, Time_CPU, Progress, and PastTime; the node prefetch table records, for each computing node, the task that the master node has pre-allocated to it, and contains the fields NodeName, preFetched, and preFetchedTaskID; the task pre-allocation table records, for each task, the computing node to which the master node has pre-allocated it, and contains the fields TaskID, preScheduled, and preScheduledNodeName; the specific implementation process is as follows:
1) the master node listens for RPC calls from the computing nodes and receives the heartbeat messages they send; the master node can receive the heartbeat message of only one computing node at a time, and if other computing nodes send heartbeat messages while the master node is receiving one, the master node places the later-arriving computing nodes in a waiting queue; on each computing node, the node performance monitor monitors and collects the node's performance information over a recent period, the task monitor monitors the tasks executing on the node and collects the records of the 3 most recently executed historical tasks, and the computing node encapsulates the performance information and task information as a heartbeat message; if the task information has not changed over the recent period, the heartbeat message contains only the node's performance information; the computing node sends the heartbeat message to the master node by RPC at regular intervals; the heartbeat period is 3 seconds, and every heartbeat message should contain the node's performance information and the information of the tasks currently executing; whenever a computing node finishes a task, it adds the execution record of that task to its next heartbeat message, so the task information contains two kinds of entries: tasks that have finished but have not yet been reported, and tasks in progress;
2) The master node updates the node state table and the task status table according to the received heartbeat message. For the node state table, the computing node status information in the heartbeat message overwrites the entry for that computing node in the master node's node state table. The task status table records, for each node, the information on the 3 most recently executed historical tasks and on the tasks in progress. Each time the master node receives new task status information, it first checks whether there is information on a task that has finished but has not been reported. If so, it obtains the TaskID of that task and checks whether the task exists in the task status table: if it exists, the master node updates the task's information in the table; otherwise it deletes the oldest task information of that computing node from the table and adds the information on the finished task. For a task in progress, the master node obtains its TaskID; if the task exists in the task status table, it updates the task's information in the table, and otherwise it adds the task's information to the table;
3) The master node updates the node prefetch table and the task pre-allocation table according to the node state table and the task status table. For task m in the task list, it predicts from the node state table and the task status table the time each node needs to execute the task; the prediction algorithm is as follows:
$$T_{ij}^{i} = \frac{\sum_{j-h}^{j-1} \left( t_s + t_{io} + t_{cpu} \right)}{h}, \quad i = 1, 2, \ldots, n \qquad (1)$$
Wherein $T_{ij}^{i}$ is the predicted time for the i-th computing node to execute its j-th task, $t_s$ is the time until an available task slot appears, $t_{io}$ is the time required to copy the data, $t_{cpu}$ is the data processing time, $h$ is the number of successfully executed tasks of the computing node that are used as reference, and $n$ is the number of computing nodes in the cluster;
After obtaining the predicted time of every computing node from formula (1), the master node selects the smallest value and pre-allocates the task to the corresponding computing node; it marks task m as pre-allocated and marks the computing node that will execute this task as prefetched;
Then, among the computing nodes not marked as prefetched, the master node goes on to select a computing node to execute the next task that is not marked as pre-allocated; only num tasks are pre-allocated each time, where num=5;
4) If the computing node has set the task-request flag field in the heartbeat message to true, the process enters the task scheduling module;
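The prediction and pre-allocation described in step 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `TaskRecord`, `record_finished`, `predict` and `preallocate` are hypothetical, the tables are reduced to plain dictionaries, and a node without any history is assumed to have an infinite predicted time. Only the history length of 3, the limit num = 5 and the averaging in formula (1) come from the text.

```python
from collections import deque
from dataclasses import dataclass

HISTORY = 3  # the text keeps the 3 most recent finished tasks per node
NUM = 5      # the text pre-allocates at most num = 5 tasks per round


@dataclass
class TaskRecord:
    task_id: str
    t_s: float    # time until an available task slot appeared
    t_io: float   # time spent copying data
    t_cpu: float  # data processing time


def record_finished(history, node, rec):
    """Update the record if its TaskID is known, otherwise append it.

    A deque with maxlen drops the oldest record automatically, which mirrors
    "delete the oldest task information and add the finished one".
    """
    q = history.setdefault(node, deque(maxlen=HISTORY))
    for i, old in enumerate(q):
        if old.task_id == rec.task_id:
            q[i] = rec
            return
    q.append(rec)


def predict(history, node):
    """Formula (1): mean of t_s + t_io + t_cpu over the node's h recorded tasks."""
    q = history.get(node)
    if not q:
        return float("inf")  # assumption: no history means no usable estimate
    return sum(r.t_s + r.t_io + r.t_cpu for r in q) / len(q)


def preallocate(history, nodes, pending, prefetched, prescheduled):
    """Pre-allocate up to NUM tasks, each to the fastest node not yet prefetched.

    prefetched maps node -> task id; prescheduled maps task id -> node.
    """
    done = 0
    for task in pending:
        if done == NUM:
            break
        if task in prescheduled:
            continue  # already marked as pre-allocated
        free = [n for n in nodes if n not in prefetched]
        if not free:
            break  # every node already holds a pre-allocated task
        best = min(free, key=lambda n: predict(history, n))
        prescheduled[task] = best
        prefetched[best] = task
        done += 1
    return prescheduled
```

Because each node holds at most one pre-allocated task in this sketch, a round ends as soon as either num tasks have been placed or every node is marked as prefetched.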
Wherein the size of the task slot of the computing node described in step 3 refers to the maximum number of tasks that the computing node can execute in parallel at the same time, and it is configured before the distributed node cluster starts. A computing node applies to the master node for a task only when it has a free task slot. The application is transmitted through the heartbeat message, which contains a flag bit for requesting a task; if the flag is true, the computing node has a free task slot, and the master node allocates a task to this computing node for execution;
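As a rough illustration of the task slot and the request flag, the sketch below shows a computing node that sets the flag in its heartbeat only while a slot is free. The class `ComputeNode`, its attributes and the dictionary keys other than NodeName are hypothetical names chosen for this example; the text does not specify a message layout.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    name: str
    slot_count: int  # maximum parallel tasks, fixed before the cluster starts
    running: list = field(default_factory=list)  # ids of tasks in progress

    def has_free_slot(self) -> bool:
        return len(self.running) < self.slot_count

    def build_heartbeat(self) -> dict:
        """Assemble a heartbeat; the request flag is true only with a free slot."""
        return {
            "NodeName": self.name,
            "running_tasks": list(self.running),
            "request_task": self.has_free_slot(),  # hypothetical field name
        }
```

In this sketch a node with `slot_count=2` and one running task reports `request_task` as true, and stops requesting once a second task occupies the remaining slot.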
Wherein, in step 4, which task the master node allocates to a computing node that applies for a task is decided by the distributed scheduling algorithm. The distributed scheduling algorithm is implemented in the scheduler of the master node. When several computing nodes apply for tasks at the same time, the scheduler reads the node state table, the task status table, the node prefetch table and the task pre-allocation table and, in combination with the remaining task queue, determines according to the distributed scheduling algorithm the priority order in which tasks are allocated to the computing nodes and the number of tasks allocated; it then updates the node prefetch table and the task pre-allocation table. The specific implementation process is as follows:
1) The master node first determines, by looking up the node prefetch table, whether a task has been pre-allocated to this node; if so, it allocates the pre-allocated task to this node, marks the task as allocated, and marks the node as not prefetched;
2) If the master node has not pre-allocated a task to this node, then, according to the prediction formula (1) described in step 2, a node that is not marked as prefetched is a node with poorer performance or weaker computing power. The master node picks a task from the task queue for this node to execute but does not mark the task as allocated; at the next pre-allocation, this task will be pre-allocated to a node with a faster processing speed, so that the task has a backup task executing on a fast node to ensure its smooth execution;
3) After allocating a task to the computing node, the master node updates the node prefetch table and the task pre-allocation table;
4) The task scheduling stage ends and the information update stage begins; the master node continues to receive and process the heartbeat messages sent by the computing nodes.
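The two scheduling cases above can be sketched as a single function. This is a simplified sketch under assumptions, not the scheduler of the patent: `schedule` and its parameters are hypothetical names, the node prefetch table is reduced to a dictionary from node to task id, the task pre-allocation table to a dictionary from task id to node, and the choice of the first eligible task in the queue is an assumption, since the text only says that a task is picked from the queue.

```python
def schedule(node, prefetched, prescheduled, queue, allocated):
    """Return the task id to hand to `node`, or None if no task is eligible.

    prefetched:   node -> pre-allocated task id (node prefetch table)
    prescheduled: task id -> node (task pre-allocation table)
    queue:        remaining task ids, in order
    allocated:    set of task ids already marked as allocated
    """
    task = prefetched.pop(node, None)  # also marks the node as not prefetched
    if task is not None:
        # Case 1: a task was pre-allocated to this node, so hand it over
        # and mark it as allocated.
        prescheduled.pop(task, None)
        allocated.add(task)
        return task
    # Case 2: nothing was pre-allocated, so this is treated as a slower node.
    # Pick a task but do NOT mark it as allocated, so that a later
    # pre-allocation round can still give it to a faster node as a backup.
    for candidate in queue:
        if candidate not in allocated and candidate not in prescheduled:
            return candidate
    return None
```

Leaving the task unmarked in the second case is what allows the same task to run twice: once on the slower node and, after the next pre-allocation round, as a backup on a faster node.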
CN 201010583597 2010-12-13 2010-12-13 Task-dynamic dispatching method under distributed computation mode in cloud computing environment Expired - Fee Related CN102073546B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010583597 CN102073546B (en) 2010-12-13 2010-12-13 Task-dynamic dispatching method under distributed computation mode in cloud computing environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010583597 CN102073546B (en) 2010-12-13 2010-12-13 Task-dynamic dispatching method under distributed computation mode in cloud computing environment

Publications (2)

Publication Number Publication Date
CN102073546A CN102073546A (en) 2011-05-25
CN102073546B true CN102073546B (en) 2013-07-10

Family

ID=44032092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010583597 Expired - Fee Related CN102073546B (en) 2010-12-13 2010-12-13 Task-dynamic dispatching method under distributed computation mode in cloud computing environment

Country Status (1)

Country Link
CN (1) CN102073546B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10257033B2 (en) 2017-04-12 2019-04-09 Cisco Technology, Inc. Virtualized network functions and service chaining in serverless computing infrastructure

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719931A (en) * 2009-11-27 2010-06-02 南京邮电大学 Multi-intelligent body-based hierarchical cloud computing model construction method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100287280A1 (en) * 2009-05-08 2010-11-11 Gal Sivan System and method for cloud computing based on multiple providers

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
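Wan Zhizhen et al. Design and Implementation of a Parallel Computing Platform Based on the MapReduce Model. China Master's Theses Full-text Database. 2008,
Design and Implementation of a Parallel Computing Platform Based on the MapReduce Model; Wan Zhizhen et al.; China Master's Theses Full-text Database; 20081231; 22-41 *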

Also Published As

Publication number Publication date
CN102073546A (en) 2011-05-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: SHANGHAI SHICONG INFORMATION TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: BEIHANG UNIVERSITY

Effective date: 20150512

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100191 HAIDIAN, BEIJING TO: 201401 FENGXIAN, SHANGHAI

TR01 Transfer of patent right

Effective date of registration: 20150512

Address after: 201401 Shanghai Fengxian District City Ring Road No. 2200 building 2128 room

Patentee after: Shanghai Shi Cong network information technology Co., Ltd

Address before: 100191 Beijing City, Haidian District Xueyuan Road No. 37 North College of computer

Patentee before: Beihang University

C56 Change in the name or address of the patentee
CP03 Change of name, title or address

Address after: 200233 room 202-35, Guiping Road, Shanghai, Xuhui District, 92

Patentee after: SHANGHAI JUNESH INFORMATION TECHNOLOGY CO., LTD.

Address before: 201401 Shanghai Fengxian District City Ring Road No. 2200 building 2128 room

Patentee before: Shanghai Shi Cong network information technology Co., Ltd

DD01 Delivery of document by public notice

Addressee: SHANGHAI JUNESH INFORMATION TECHNOLOGY CO., LTD.

Document name: Notification to Pay the Fees

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130710

Termination date: 20181213
