CN104102533A - Bandwidth aware based Hadoop scheduling method and system - Google Patents


Publication number
CN104102533A
Authority
CN
China
Prior art keywords
task
node
queue
computing node
job
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410270693.6A
Other languages
Chinese (zh)
Other versions
CN104102533B (en)
Inventor
戴彬
秦鹏
邵翔
邹云飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN201410270693.6A priority Critical patent/CN104102533B/en
Publication of CN104102533A publication Critical patent/CN104102533A/en
Application granted granted Critical
Publication of CN104102533B publication Critical patent/CN104102533B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a bandwidth-aware Hadoop scheduling method comprising the following steps: establishing a job completion time model for Hadoop task scheduling and a mathematical model of the Hadoop scheduling system, thereby converting the Hadoop task scheduling problem into the problem of finding, for the job to be scheduled, the task schedule that minimizes the job completion time; using the real-time network management and traffic control functions provided by SDN (Software Defined Networking) to provide a time-slot-based network bandwidth allocation mechanism that divides the occupation period of the remaining bandwidth of each link into equal time slots; and, on the basis of the job completion time model and the time-slot bandwidth allocation mechanism, jointly considering the locality of each task and the real-time network bandwidth before allocating a compute node to the task, so that each task is allocated the compute node that offers the earliest completion time. The invention solves the problem that existing methods cannot schedule tasks from a global perspective while also taking the actually available network bandwidth into account.

Description

Bandwidth-aware Hadoop scheduling method and system
Technical field
The invention belongs to the field of information processing and data computing, and more specifically relates to a bandwidth-aware Hadoop scheduling method and system.
Background
With the progress of science and technology, Internet technology has developed rapidly, which has promoted social development and greatly enriched people's online lives. The arrival of Web 2.0 in particular has transformed the Internet. A prominent feature of Web 2.0 is user-generated content, and the large volume of such content has caused explosive data growth. To meet the challenge of large-scale data processing, cloud computing has emerged as a new model for computing on and processing large-scale data, benefiting from the joint evolution of technologies such as distributed computing and virtualization. Cloud computing adopts the concept of cluster computing: computing tasks are distributed over a pool of computing power formed by a large-scale computer cluster, so that data processing workloads and application systems can dynamically obtain computing power, storage resources and so on according to their actual needs.
To date, most cloud computing systems worldwide are based on the MapReduce computation model and a distributed file storage system, largely imitating Google's core cloud computing technology. That core consists of three parts: the distributed structured data storage system BigTable, GFS (Google File System) and the distributed computing platform MapReduce. As a commercial company, however, Google does not disclose its technical details, so individuals or research groups who want to pursue cloud computing research cannot learn more about them. The open-source cloud computing system Hadoop fills this gap: in 2005 the Apache Foundation released it separately from the open-source project Nutch, of which it had been a part, and provided funding for it. Hadoop's design is derived from Google's core cloud computing technology; it is an open framework that supports running applications that process massive amounts of data. The core of Hadoop comprises the distributed file system HDFS and the parallel programming framework MapReduce. HDFS implements GFS on a cluster and supports operations such as reading, writing and transferring files within the cluster; MapReduce provides distributed computing on the cluster and, using the file handling capability of HDFS, implements task initialization, scheduling, execution and similar functions.
Building a Hadoop deployment does not require a supercomputer; Hadoop can be deployed on a cluster made up of a large number of inexpensive machines. The Hadoop platform encapsulates its complex low-level implementation and exposes only a stable API to the applications running on it. This hides the details of parallel data processing, such as splitting and replicating input data, cluster scheduling, fault tolerance and monitoring. Developers using Hadoop therefore do not need to attend to many low-level architectural details and can concentrate on the core of their program, developing cloud computing applications much as they would ordinary programs. This greatly reduces the development burden and significantly improves development efficiency. In addition, to make the whole framework easier to use, the open-source Hadoop libraries provide comprehensive fault tolerance at the application layer, so that each node in the cluster can independently handle failures that may occur while jobs run. Because it is stable, inexpensive and efficient, the Hadoop framework is popular with researchers and developers and is widely used in applications such as search engines, business data mining, advertising effectiveness analysis, bioinformatics, and web log analysis and storage.
Although Hadoop is currently a widely used open-source cloud computing platform, only a few years have passed since the Apache Foundation released it. Despite the attention it has received from both academia and industry, the platform still needs, and has room for, improvement in many respects. The most important of these is task scheduling. As a key technology in a Hadoop system, task scheduling is responsible for scheduling computing resources and job execution, and the scheduling result directly affects both the computing performance and the resource utilization of the system. However, research on job scheduling is still at an early stage. Faced with increasingly complex network environments and diverse application scenarios, existing job scheduling algorithms still suffer from slow job response times, poor execution and interaction capability of the Hadoop platform, and low utilization of Hadoop system resources.
Summary of the invention
In view of the above defects of, or needs for improvement in, the prior art, the invention provides a bandwidth-aware Hadoop scheduling method and system, whose purpose is to solve the technical problems of existing Hadoop scheduling algorithms, namely slow job response times and low overall performance of the Hadoop platform.
To achieve the above object, according to one aspect of the invention, a bandwidth-aware Hadoop scheduling method is provided, comprising the following steps:
(1) receiving a job submitted by a user, initializing the job, and creating a job ID object for it, the job ID object being responsible for encapsulating tasks and recording information so as to track the execution state and progress of the job;
(2) adding the initialized job to a job queue, the job queue being a queue that holds the jobs waiting to be scheduled and that manages and schedules all job objects in the in-memory map;
(3) receiving a heartbeat packet sent by a compute node, extracting from the heartbeat packet the current status information of the compute node, and extracting a job to be scheduled from the job queue;
(4) querying the job scheduling pool to determine whether the job to be scheduled already exists in the pool; if it does, proceeding to step (6), otherwise proceeding to step (5);
(5) performing a pre-allocation computation for the job to be scheduled and creating a task scheduling map for the job in the job scheduling pool;
(6) looking up the job to be scheduled in the job scheduling pool and extracting its task scheduling map; if the map is not empty, proceeding to step (7), otherwise proceeding to step (8);
(7) extracting, from the task scheduling map of the job to be scheduled, the task queue of the compute node of step (3); according to the computing capability of that compute node, packing all or part of the task queue into the heartbeat response and returning it to the compute node for execution; at the same time updating the task queue in the job scheduling pool by deleting the tasks that have been assigned to the compute node, and deleting the whole task queue once all of its tasks have been assigned; then returning to step (3);
(8) if the task scheduling map is empty, all tasks of the job have finished; performing the reduce computation on the results of all tasks and returning the reduce result to the user.
Preferably, in step (3), the job allocation pool maintains a task scheduling map for each job to be scheduled; the key of the map is the name of a compute node, and the value is the queue of computing tasks pre-allocated to that compute node. Whenever a compute node requests tasks, a job to be scheduled is extracted from the task management queue and the task scheduling map maintained for that job is queried. Using the name of the requesting compute node as the key, the corresponding value, i.e. the queue of computing tasks pre-allocated to that node, is extracted. According to the computing capability of the compute node, all or part of this task queue is packed into the response to the requesting compute node and returned to it for execution. At the same time the task queue is updated in the job scheduling pool: the tasks assigned to the compute node are deleted from the queue, and the whole queue is deleted once all of its tasks have been assigned.
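The dispatch described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `dispatch_on_heartbeat`, the use of a plain dictionary of deques for the task scheduling map, and the `free_slots` parameter standing in for the "computing capability" of a node are all assumptions made for illustration.

```python
from collections import deque


def dispatch_on_heartbeat(task_map, node_name, free_slots):
    """Hand out tasks pre-allocated to node_name (hypothetical helper).

    task_map   -- dict mapping node name -> deque of pre-allocated tasks
    free_slots -- assumed measure of the node's computing capability
    """
    queue = task_map.get(node_name)
    if queue is None:
        return []  # nothing was pre-allocated to this node

    assigned = []
    # Take all or part of the queue, limited by the node's capacity.
    while queue and len(assigned) < free_slots:
        assigned.append(queue.popleft())

    # Delete the whole queue once every task in it has been assigned.
    if not queue:
        del task_map[node_name]
    return assigned
```

A job is finished from the scheduler's point of view once `task_map` is empty, which corresponds to the condition tested in step (6).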
Preferably, step (5) comprises the following sub-steps:
(5-1) computing the current load of each node in the Hadoop compute cluster and estimating the remaining execution time of that load, thereby obtaining the time at which each node becomes free;
(5-2) communicating with the NameNode to obtain the replica placement of the input data of the job to be scheduled, and parsing and storing this information;
(5-3) communicating with the SDN controller to obtain real-time network bandwidth information and computing the data transfer time. The API of the SDN controller NOX is called to obtain and store the real-time bandwidth. The data transfer time is defined as the time taken to move the data of a task from the data source node to the compute node, and is given by $T_m = DS / BW$, where $T_m$ is the data transfer time, $DS$ is the data block size, which can be set in the configuration file, and $BW$ is the real-time bandwidth;
(5-4) receiving the node free-time information from step (5-1), the input block replica information from step (5-2), and the network bandwidth and data transfer time from step (5-3), and combining the three to allocate the currently optimal compute node to each task;
(5-5) creating a task scheduling map for the job to be scheduled in the job scheduling pool and updating that map.
Preferably, step (5-1) specifically comprises monitoring and recording the task currently running on each compute node in the cluster and obtaining the progress value of that task. Progress is the percentage of a task's data block that has already been processed, from which the task completion time can be estimated as $T_e = T_s + (T_n - T_s) / \mathit{progress}$, where $T_e$ is the estimated task completion time, $T_s$ is the time at which the task started executing, and $T_n$ is the current system time.
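As a rough illustration of the progress-based estimate, the sketch below evaluates the completion-time formula directly. The function name and the choice of seconds as the time unit are assumptions; progress is expressed as a fraction between 0 and 1 rather than a percentage.

```python
def estimate_completion_time(start_time, now, progress):
    """Estimate T_e = T_s + (T_n - T_s) / progress.

    start_time -- T_s, when the task started (seconds, assumed unit)
    now        -- T_n, the current system time (seconds)
    progress   -- fraction of the data block already processed, in (0, 1]
    """
    if not 0.0 < progress <= 1.0:
        raise ValueError("progress must be in (0, 1]")
    return start_time + (now - start_time) / progress


# A task that started at t=100 s and is 25% done at t=130 s
# is projected to finish at t=220 s.
print(estimate_completion_time(100.0, 130.0, 0.25))
```

The estimate assumes that the task keeps processing data at the average rate observed so far.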
Preferably, step (5-3) specifically comprises calling the API of the SDN controller to obtain and store the real-time network bandwidth. The data transfer time is defined as the time taken to move the data of a task from the data source node to the compute node, and is given by $T_m = DS / BW$, where $T_m$ is the data transfer time, $DS$ is the data block size, which can be set in the configuration file, and $BW$ is the real-time bandwidth.
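A small sketch of the transfer-time formula follows. The function name and the units (bytes and bytes per second) are assumptions; the text does not say how the bandwidth value is represented once it has been read from the SDN controller, so the bandwidth is simply passed in as a number here.

```python
def data_transfer_time(block_size_bytes, bandwidth_bytes_per_s):
    """Compute T_m = DS / BW for one data block."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return block_size_bytes / bandwidth_bytes_per_s


MIB = 1024 * 1024
# Moving a 64 MiB block over a link with 8 MiB/s available takes 8 seconds.
print(data_transfer_time(64 * MIB, 8 * MIB))
```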
Preferably, step (5-4) comprises the following sub-steps:
(5-4-1) finding the earliest-available remote node in the whole cluster as the optimal remote node and recording its free time $rI_{minnow}$;
(5-4-2) finding the earliest-available local node in the whole cluster as the optimal local node and recording its free time $rI_{minloc}$;
(5-4-3) determining whether the nodes found in steps (5-4-1) and (5-4-2) are the same node; if so, taking the optimal local node as the optimal node and proceeding to step (5-4-5); if not, proceeding to step (5-4-4);
(5-4-4) comparing the task completion times that would result from assigning the task to each of the two compute nodes, i.e. comparing $rI_{minnow} + T_m$ with $rI_{minloc}$; the smaller value means the task finishes earlier, and the corresponding node is taken as the optimal node;
(5-4-5) assigning the task to be allocated to the optimal node.
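The sub-steps above can be sketched as follows. This is one possible reading, not the patent's code: because step (5-4-3) checks whether the two candidates are the same node, the sketch searches all nodes in step (5-4-1). The `Node` class, its field names, the tie-break in favour of the local node, and the fallback when no node holds a replica are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_at: float      # time at which the node becomes free
    has_replica: bool   # True if the task's input block is stored here


def choose_node(nodes, transfer_time):
    """Pick the node that gives the earliest completion for one task."""
    # (5-4-1) earliest-available node overall: rI_minnow
    earliest = min(nodes, key=lambda n: n.free_at)
    local_nodes = [n for n in nodes if n.has_replica]
    if not local_nodes:
        return earliest  # assumed fallback, not covered by the text
    # (5-4-2) earliest-available local node: rI_minloc
    best_local = min(local_nodes, key=lambda n: n.free_at)
    # (5-4-3) same node: the local node is optimal
    if earliest is best_local:
        return best_local
    # (5-4-4) compare rI_minnow + T_m with rI_minloc
    if earliest.free_at + transfer_time < best_local.free_at:
        return earliest
    return best_local


nodes = [Node("a", 50.0, True), Node("b", 10.0, False)]
print(choose_node(nodes, 20.0).name)  # remote node wins: 10 + 20 < 50
print(choose_node(nodes, 45.0).name)  # local node wins: 10 + 45 > 50
```

With a fast link the idle remote node is preferred; as the transfer time grows, waiting for the busy local node becomes the better choice.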
Preferably, in step (5), when a remote task is assigned to a compute node, time-slot reservation is also performed. When a remote task requires data to be moved, the source node $ND_{dataSrc}$ and the destination node $ND_{minNow}$ of the transfer are recorded and encapsulated into a flow table FlowTable, whose header field records $ND_{dataSrc}$ and $ND_{minNow}$. The flow table is sent to the SDN controller, which installs it on the corresponding SDN switches; when an SDN switch detects a flow matching this flow table, it gives priority to that data transfer.
Preferably, in step (5), if task $TK_i$ is assigned to node $ND_j$ for computation and the input data of $TK_i$ is stored on node $ND_{dataSrc}$, then the input data must be moved from $ND_{dataSrc}$ to $ND_j$ when the task is executed; $TM_{i,j}$ is defined as this data transfer time. The computing time of a task is the time between the start and the end of its computation; $TP_{i,j}$ is defined as this computing time. A task occupies the computing resources of a compute node from the moment it is assigned to that node, so the time for which the task actually occupies computing resources runs from the time of assignment to the time its computation completes; $TE_{i,j}$ is defined as this actual execution time. These times satisfy $TE_{i,j} = TP_{i,j} + TM_{i,j}$. When the data to be processed is located on the node that executes the computing task, that compute node is defined as a local node; otherwise it is defined as a remote node.
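The relation between the three times can be written as a short helper. The function name is hypothetical, and treating the transfer time as zero on a local node is an inference from the definition of a local node rather than something the text states explicitly.

```python
def actual_execution_time(compute_time, transfer_time, is_local):
    """Return TE = TP + TM for one task on one node.

    On a local node the input data is already in place, so TM is
    assumed to be zero (an inference, see the lead-in).
    """
    tm = 0.0 if is_local else transfer_time
    return compute_time + tm


print(actual_execution_time(30.0, 8.0, is_local=False))  # 38.0
print(actual_execution_time(30.0, 8.0, is_local=True))   # 30.0
```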
Preferably, step (5-5) is specifically as follows: in the task scheduling map, if the key of the computed optimal compute node does not exist, a new key-value pair is created with the name of that node as the key, and the task is added to the value. If the key exists, the corresponding key-value pair is found and the task is appended to the task queue in the value.
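The map update of step (5-5) amounts to an insert-or-append on a dictionary of queues. The following sketch assumes the map is a plain Python dictionary keyed by node name with deques as values; `record_assignment` is a hypothetical name.

```python
from collections import deque


def record_assignment(task_map, node_name, task):
    """Append task to the queue of node_name, creating the entry if needed."""
    if node_name not in task_map:
        task_map[node_name] = deque()  # new key-value pair for this node
    task_map[node_name].append(task)   # add to the end of the task queue


task_map = {}
record_assignment(task_map, "node-a", "t1")
record_assignment(task_map, "node-a", "t2")
record_assignment(task_map, "node-b", "t3")
print({name: list(queue) for name, queue in task_map.items()})
```

Appending at the tail keeps tasks in the order in which they were pre-allocated, so a later dispatch from the head of the queue hands them out first-in, first-out.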
According to another aspect of the invention, a bandwidth-aware Hadoop scheduling system is provided, comprising:
a first module, which receives a job submitted by a user, initializes the job, and creates a job ID object for it, the job ID object being responsible for encapsulating tasks and recording information so as to track the execution state and progress of the job;
a second module, which adds the initialized job to a job queue, the job queue being a queue that holds the jobs waiting to be scheduled and that manages and schedules all job objects in the in-memory map;
a third module, which receives a heartbeat packet sent by a compute node, extracts from the heartbeat packet the current status information of the compute node, and extracts a job to be scheduled from the job queue;
a fourth module, which queries the job scheduling pool to determine whether the job to be scheduled already exists in the pool and, if it does, passes control to the sixth module, otherwise to the fifth module;
a fifth module, which performs a pre-allocation computation for the job to be scheduled and creates a task scheduling map for the job in the job scheduling pool;
a sixth module, which looks up the job to be scheduled in the job scheduling pool and extracts its task scheduling map and, if the map is not empty, passes control to the seventh module, otherwise to the eighth module;
a seventh module, which extracts, from the task scheduling map of the job to be scheduled, the task queue of the compute node of the third module; according to the computing capability of that compute node, packs all or part of the task queue into the heartbeat response and returns it to the compute node for execution; at the same time updates the task queue in the job scheduling pool by deleting the tasks that have been assigned to the compute node, and deletes the whole task queue once all of its tasks have been assigned; and then passes control to the third module;
an eighth module, which, if the task scheduling map is empty, meaning that all tasks of the job have finished, performs the reduce computation on the results of all tasks and returns the reduce result to the user.
In general, compared with the prior art, the above technical solution conceived by the invention achieves the following beneficial effects:
1. It improves job response speed: the scheduling method of the invention schedules the tasks of a job from a global perspective, abandoning the prior-art practice of scheduling and assigning tasks for a single compute node only when that node issues a task request. The invention ensures task locality from a global perspective by adopting a job pre-allocation mechanism: when a job is scheduled for the first time, the optimal compute node is determined for each task of the job and the allocation result is stored in the job allocation pool; when an optimal compute node later issues a task request, the task queue pre-allocated to it is handed over. When every task of the job completes on its optimal compute node, the completion of the job as a whole is also optimal.
2. It adapts to complex network environments: the invention uses the network bandwidth as a parameter, giving the scheduler a reference for task scheduling. If tasks are assigned as local tasks wherever possible, data transfer over network links is avoided, which speeds up task execution and reduces network congestion; this property is called task locality. However, when load is unbalanced across the compute nodes of the Hadoop cluster, blindly insisting on data locality and giving each task to a local node may assign tasks to heavily loaded local nodes, causing jobs to wait. The invention considers both task locality and network bandwidth and flexibly chooses the optimal node among local and non-local nodes, so that task scheduling remains efficient even in complex network environments.
Brief description of the drawings
Fig. 1 is a flowchart of the bandwidth-aware Hadoop scheduling method of the invention.
Fig. 2 is a detailed flowchart of step (5) of the method of the invention.
Fig. 3 is a detailed flowchart of step (5-4) of the method of the invention.
Detailed description
To make the object, technical solution and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it. In addition, the technical features involved in the embodiments described below can be combined with one another as long as they do not conflict.
The overall idea of the invention is a bandwidth-aware Hadoop scheduling algorithm that uses the bandwidth awareness of SDN (software-defined networking) to optimize Hadoop's own local-node-first algorithm. The link bandwidth obtained through the software-defined network is used to compute the data transfer time, the free time of each node is estimated, and the task completion time is then computed for every node. The node with the earliest task completion time is selected to execute the task.
As shown in Fig. 1, the bandwidth-aware Hadoop scheduling method of the invention comprises the following steps:
(1) receiving a Hadoop job submitted by a user, initializing the job, and creating a job ID object for it, the job ID object being used to encapsulate tasks and record information so as to track the execution state and progress of the Hadoop job;
(2) adding the initialized Hadoop job to a job queue, the job queue holding the Hadoop jobs waiting to be scheduled and being responsible for managing and scheduling all Hadoop jobs in the in-memory map;
(3) receiving a heartbeat packet sent by a compute node, extracting from the heartbeat packet the current status information of the compute node, and extracting the Hadoop job at the head of the job queue; specifically, the status information of a compute node includes the number of tasks it is executing and its free time;
(4) querying the job scheduling pool to determine whether the extracted Hadoop job already exists in it; if it does, proceeding to step (6), otherwise proceeding to step (5). The job scheduling pool stores job allocation results and consists of multiple queues, each of which holds the scheduling result of one job, i.e. the compute nodes to which that job has been assigned;
(5) performing a pre-allocation computation for the extracted Hadoop job: the scheduling algorithm divides the job into multiple tasks and assigns each task to a compute node, and a task scheduling map is created for the job in the job scheduling pool. As shown in Fig. 2, this step comprises the following sub-steps:
(5-1) determining the current Hadoop task status of each compute node in the Hadoop compute cluster and, from it, estimating the remaining execution time of the node's current Hadoop task so as to obtain the free time of the node. Specifically, the current load of each compute node in the cluster is monitored and recorded, and the progress value of the Hadoop task currently running on each node is obtained, where the progress value is the percentage of the task's data block that has already been processed. The remaining execution time of the current Hadoop task can then be estimated as $T_e = (T_n - T_s) \times (1/\mathit{progress} - 1)$, where $T_e$ is the remaining execution time of the current Hadoop task, $T_s$ is the time at which the task started executing, and $T_n$ is the current time of the Hadoop compute cluster; the free time of the compute node is $T_e + T_n$;
(5-2) communicating with the NameNode to obtain the replica placement of the input data of the extracted Hadoop job, and parsing and storing this information;
(5-3) communicating with the software-defined network (SDN) controller to obtain the real-time bandwidth information of the Hadoop compute cluster, and computing the data transfer time $T_m$ as the data block size divided by the real-time bandwidth;
(5-4) receiving the node free-time information from step (5-1), the input block replica information from step (5-2), and the network bandwidth and data transfer time from step (5-3), and combining the three to allocate the currently optimal compute node to each task.
As shown in Fig. 3, step (5-4) comprises the following sub-steps:
(5-4-1) finding the earliest-available remote node in the whole cluster as the optimal remote node and recording its free time $rI_{minnow}$;
(5-4-2) finding the earliest-available local node in the whole cluster as the optimal local node and recording its free time $rI_{minloc}$;
(5-4-3) determining whether the nodes found in steps (5-4-1) and (5-4-2) are the same node; if so, taking the optimal local node as the optimal node and proceeding to step (5-4-5); if not, proceeding to step (5-4-4);
(5-4-4) comparing the task completion times that would result from assigning the task to each of the two compute nodes, i.e. comparing $rI_{minnow} + T_m$ with $rI_{minloc}$; the smaller value means the task finishes earlier, and the corresponding node is taken as the optimal node;
(5-4-5) assigning the task to be allocated to the optimal node.
(5-5) creating a task scheduling map for the job to be scheduled in the job scheduling pool and updating that map. In the task scheduling map, if the key of the computed optimal compute node does not exist, a new key-value pair is created with the name of that node as the key, and the task is added to the value. If the key exists, the corresponding key-value pair is found and the task is appended to the task queue in the value.
(6) looking up the job to be scheduled in the job scheduling pool and extracting its task scheduling map; if the map is not empty, proceeding to step (7), otherwise proceeding to step (8);
(7) extracting, from the task scheduling map of the job to be scheduled, the task queue of the compute node of step (3); according to the computing capability of that compute node, packing all or part of the task queue into the heartbeat response and returning it to the compute node for execution; at the same time updating the task queue in the job scheduling pool by deleting the tasks that have been assigned to the compute node, and deleting the whole task queue once all of its tasks have been assigned; then returning to step (3);
(8) if the task scheduling map is empty, all tasks of the job have finished; performing the reduce computation on the results of all tasks and returning the reduce result to the user.
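The remaining-time formula of step (5-1) and the completion-time estimate given in the summary describe the same quantity. As a short check, writing $p$ for the progress value (a symbol introduced here only for brevity), the free time of a node is:

```latex
\begin{aligned}
T_n + T_e &= T_n + (T_n - T_s)\left(\frac{1}{p} - 1\right) \\
          &= T_n + \frac{T_n - T_s}{p} - (T_n - T_s) \\
          &= T_s + \frac{T_n - T_s}{p}
\end{aligned}
```

For instance, with assumed values $T_s = 100$ s, $T_n = 130$ s and $p = 0.25$, the remaining time is $30 \times 3 = 90$ s, and the node becomes free at $220$ s under either form.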
The bandwidth-aware Hadoop scheduling system of the invention comprises:
a first module, which receives a job submitted by a user, initializes the job, and creates a job ID object for it, the job ID object being responsible for encapsulating tasks and recording information so as to track the execution state and progress of the job;
a second module, which adds the initialized job to a job queue, the job queue being a queue that holds the jobs waiting to be scheduled and that manages and schedules all job objects in the in-memory map;
a third module, which receives a heartbeat packet sent by a compute node, extracts from the heartbeat packet the current status information of the compute node, and extracts a job to be scheduled from the job queue;
a fourth module, which queries the job scheduling pool to determine whether the job to be scheduled already exists in the pool and, if it does, passes control to the sixth module, otherwise to the fifth module;
a fifth module, which performs a pre-allocation computation for the job to be scheduled and creates a task scheduling map for the job in the job scheduling pool;
a sixth module, which looks up the job to be scheduled in the job scheduling pool and extracts its task scheduling map and, if the map is not empty, passes control to the seventh module, otherwise to the eighth module;
a seventh module, which extracts, from the task scheduling map of the job to be scheduled, the task queue of the compute node of the third module; according to the computing capability of that compute node, packs all or part of the task queue into the heartbeat response and returns it to the compute node for execution; at the same time updates the task queue in the job scheduling pool by deleting the tasks that have been assigned to the compute node, and deletes the whole task queue once all of its tasks have been assigned; and then passes control to the third module;
an eighth module, which, if the task scheduling map is empty, meaning that all tasks of the job have finished, performs the reduce computation on the results of all tasks and returns the reduce result to the user.
Those skilled in the art will readily understand that the foregoing describes only preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A bandwidth-aware Hadoop scheduling method, characterized in that it comprises the following steps:
(1) Receive a job submitted by a user and initialize it, creating for the job a job ID object that encapsulates the tasks and records information so that the job's execution state and progress can be tracked;
(2) Add the initialized job to the job queue; the job queue maintains the jobs waiting to be scheduled for execution and is responsible for managing and scheduling all job objects in the in-memory map;
(3) Receive the heartbeat sent by a computing node, extract the node's current status information from the heartbeat, and take a job to be scheduled from the job queue;
(4) Query the job scheduling pool to check whether the job to be scheduled already exists in the pool; if it does, go to step (6); otherwise, go to step (5);
(5) Perform the pre-allocation computation for the job to be scheduled and create a new task scheduling map for it in the job scheduling pool;
(6) Look up the job to be scheduled in the job scheduling pool and retrieve its task scheduling map; if the map is not empty, go to step (7); otherwise, go to step (8);
(7) From the task scheduling map of the job to be scheduled, retrieve the task queue of the computing node identified in step (3). According to the computing capacity of that node, encapsulate all or part of the task queue in the heartbeat response and return it to the node for execution. At the same time, update the task queue in the job scheduling pool by deleting the tasks that have been assigned to the node; if all tasks have been assigned, delete the whole task queue. Return to step (3);
(8) If the task scheduling map is empty, all tasks of the job have finished executing; perform the reduce computation on the task results obtained and return the reduce result to the user.
2. The Hadoop scheduling method according to claim 1, characterized in that the job scheduling pool maintains one task scheduling map for each job to be scheduled; the key of the map is the name of a computing node, and the value is the queue of computing tasks pre-allocated to that node. Whenever a computing node issues a task allocation request, a job to be scheduled is taken from the task management queue and the task scheduling map maintained for that job is queried. Using the name of the requesting computing node as the key, the corresponding value is retrieved, which is the queue of computing tasks pre-allocated to that node. According to the computing capacity of the node, all or part of this task queue is encapsulated in the response to the requesting node and returned to it for execution. At the same time, the task queue in the job scheduling pool is updated by deleting the tasks that have been assigned to the node; if all tasks have been assigned, the whole task queue is deleted.
3. Hadoop dispatching method according to claim 1 and 2, is characterized in that, step (5) specifically comprises following sub-step:
(5-1) calculate the present load situation that whole Hadoop calculates each node in cluster, and then estimate the residue execution time of present load, thereby obtain the free time of each node;
(5-2) communicate with name node, obtain the current data trnascription backup instances for the treatment of the input data of schedule job, resolve and this information of dump;
(5-3) communicate with SDN controller, obtain network implementation Time Bandwidth information, calculate the data-moving time,
(5-4) receive the clustered node free time information that (5-1) step is imported into, receive the input block copy information that (5-2) number of steps reportedly enters, receive the network bandwidth and data-moving temporal information that (5-3) step is imported into, comprehensive three carries out computing, computing node for a current optimum of each task distribution
(5-5), in job scheduling pond, for the newly-built task scheduling mapping of this operation to be scheduled, upgrade this task scheduling mapping.
4. The Hadoop scheduling method according to claim 3, characterized in that step (5-1) specifically comprises: monitoring and recording the execution status of the task currently running on each computing node in the cluster and obtaining its progress value, where progress is the amount of data the task has already processed as a percentage of the whole data block size. The task completion time can then be estimated as $T_e = T_s + (T_n - T_s)/\mathit{progress}$, where $T_e$ is the estimated task completion time, $T_s$ is the time at which the task started executing, and $T_n$ is the current system time.
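The estimate in claim 4 can be illustrated with a short Python sketch. It assumes that progress is expressed as a fraction in (0, 1] rather than as a percentage and that all times share one unit; the function name `estimate_completion_time` is hypothetical.

```python
def estimate_completion_time(start_time, now, progress):
    """Estimate T_e = T_s + (T_n - T_s) / progress.

    start_time: T_s, when the task started executing
    now:        T_n, the current system time (same unit as start_time)
    progress:   fraction of the data block already processed, in (0, 1]
    """
    if not 0.0 < progress <= 1.0:
        raise ValueError("progress must be in (0, 1]")
    return start_time + (now - start_time) / progress
```

For example, a task that started at time 100, has run until time 130, and has processed a quarter of its block is estimated to finish at 100 + 30 / 0.25 = 220, so the node is expected to become idle 90 time units from now.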
5. The Hadoop scheduling method according to claim 3, characterized in that step (5-3) specifically comprises: calling the API of the SDN controller to obtain real-time network bandwidth information and storing the real-time bandwidth obtained. The data transfer time is defined as the time needed to move the data of a task from the data source node to the computing node, and can be calculated by the formula $T_m = DS/BW$, where $T_m$ is the data transfer time, $DS$ is the data block size, which can be set in the configuration file, and $BW$ is the real-time bandwidth.
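A minimal Python sketch of the transfer-time formula in claim 5 follows. The claim does not fix units, so the sketch assumes the block size is given in megabytes and the bandwidth in megabytes per second; the function and parameter names are hypothetical, and no real SDN controller API is called.

```python
def transfer_time(block_size_mb, bandwidth_mb_per_s):
    """Compute T_m = DS / BW in seconds.

    block_size_mb:      DS, data block size in megabytes (assumed unit)
    bandwidth_mb_per_s: BW, currently available bandwidth in MB/s (assumed unit)
    """
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return block_size_mb / bandwidth_mb_per_s
```

With a 64 MB block and 16 MB/s of available bandwidth, the transfer would take 64 / 16 = 4 seconds.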
6. The Hadoop scheduling method according to claim 3, characterized in that step (5-4) specifically comprises the following sub-steps:
(5-4-1) Find the earliest-available remote node in the whole cluster as the optimal remote node, and record its idle time $rI_{minNow}$;
(5-4-2) Find the earliest-available local node in the whole cluster as the optimal local node, and record its idle time $rI_{minLoc}$;
(5-4-3) Check whether the nodes found in step (5-4-1) and step (5-4-2) are the same node; if so, take the optimal local node as the optimal node and go to step (5-4-5); if not, go to step (5-4-4);
(5-4-4) Determine on which of the two computing nodes the task would complete earlier by comparing $rI_{minNow} + T_m$ with $rI_{minLoc}$; the smaller value means the task finishes earlier, and the corresponding node is taken as the optimal node;
(5-4-5) Assign the task to be allocated to the optimal node.
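The selection in steps (5-4-1) to (5-4-5) can be sketched as follows. This is a minimal sketch under stated assumptions: idle times are given as a dictionary from node name to the time the node becomes free, `local_nodes` is the set of nodes holding a replica of the task's input block, a single transfer time is used for every remote node, step (5-4-1) is read as a search over the whole cluster (which is why it can return the same node as step (5-4-2)), and ties are resolved in favour of the local node. The name `pick_node` and these choices are illustrative and are not taken from the patent.

```python
def pick_node(idle_time, local_nodes, transfer_time):
    """Choose the node on which the task is expected to finish earliest.

    idle_time:     dict mapping node name -> time at which the node becomes free
    local_nodes:   set of node names that hold the task's input data
    transfer_time: T_m, time to move the input block to a non-local node
    """
    # (5-4-1) earliest-available node in the whole cluster
    min_now = min(idle_time, key=idle_time.get)
    local_candidates = [n for n in idle_time if n in local_nodes]
    if not local_candidates:
        # Assumption: with no local candidate, fall back to the earliest node.
        return min_now
    # (5-4-2) earliest-available node that already holds the data
    min_loc = min(local_candidates, key=idle_time.get)
    # (5-4-3) same node: the local node is optimal
    if min_now == min_loc:
        return min_loc
    # (5-4-4) compare rI_minNow + T_m with rI_minLoc
    if idle_time[min_now] + transfer_time < idle_time[min_loc]:
        return min_now
    return min_loc
```

With idle times `{"n1": 10, "n2": 2, "n3": 6}` and replicas on `n1` and `n3`, a transfer time of 3 favours the remote node `n2` (2 + 3 < 6), while a transfer time of 5 favours the local node `n3` (2 + 5 > 6).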
7. The Hadoop scheduling method according to claim 6, characterized in that, when a remote task is assigned to a computing node, time-slot reservation is also performed: when a remote task requires a data transfer, the source node $ND_{dataSrc}$ and the destination node $ND_{minNow}$ of the transfer are recorded and encapsulated into a flow table FlowTable, whose header fields record $ND_{dataSrc}$ and $ND_{minNow}$. The flow table is sent to the SDN controller, which installs it on the corresponding SDN switches; when an SDN switch detects a flow that matches the flow table, it gives priority to that data transfer.
8. The Hadoop scheduling method according to claim 8, characterized in that, if task $TK_i$ is assigned to node $ND_j$ for computation and the input data of $TK_i$ is stored on node $ND_{dataSrc}$, then the input data must be moved from $ND_{dataSrc}$ to $ND_j$ when the task is executed, and $TM_{i,j}$ is defined as this data transfer time. The computing time of a task is the interval between the start and the end of its computation, and $TP_{i,j}$ is defined as the computing time of the task. From the moment a task is assigned to a computing node it occupies that node's computing resources, so the time for which the task actually occupies computing resources is the interval from its assignment to the end of its computation; $TE_{i,j}$ is defined as this actual execution time. These times satisfy the formula $TE_{i,j} = TP_{i,j} + TM_{i,j}$. When the data to be computed is located on the node that processes the computing task, that computing node is defined as a local node; otherwise it is defined as a remote node.
9. The Hadoop scheduling method according to claim 3, characterized in that step (5-5) specifically comprises: in the task scheduling map, if the key for the computed optimal computing node does not exist, a new key-value pair is created with the name of that node as the key and the task is added to its value; if the key already exists, the corresponding key-value pair is located and the task is appended to the task queue in its value.
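The map update in claim 9 corresponds to a common "create the entry if missing, then append" pattern. A minimal Python sketch, assuming the map is a dictionary from node name to a queue of task identifiers (the name `add_task` is hypothetical):

```python
from collections import deque

def add_task(task_map, node, task):
    """Append `task` to the queue pre-allocated to `node`.

    If `node` has no entry yet, an empty queue is created first,
    so both branches of the claim are covered by one call.
    """
    task_map.setdefault(node, deque()).append(task)
    return task_map
```

Calling `add_task` twice for the same node leaves a single key whose queue holds both tasks in assignment order.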
10. A bandwidth-aware Hadoop scheduling system, characterized in that it comprises the following modules:
A first module, which receives a job submitted by a user and initializes it, creating for the job a job ID object that encapsulates the tasks and records information so that the job's execution state and progress can be tracked;
A second module, which adds the initialized job to the job queue; the job queue maintains the jobs waiting to be scheduled for execution and is responsible for managing and scheduling all job objects in the in-memory map;
A third module, which receives the heartbeat sent by a computing node, extracts the node's current status information from the heartbeat, and takes a job to be scheduled from the job queue;
A fourth module, which queries the job scheduling pool to check whether the job to be scheduled already exists in the pool; if it does, control passes to the sixth module; otherwise, to the fifth module;
A fifth module, which performs the pre-allocation computation for the job to be scheduled and creates a new task scheduling map for it in the job scheduling pool;
A sixth module, which looks up the job to be scheduled in the job scheduling pool and retrieves its task scheduling map; if the map is not empty, control passes to the seventh module; otherwise, to the eighth module;
A seventh module, which retrieves, from the task scheduling map of the job to be scheduled, the task queue of the computing node identified by the third module. According to the computing capacity of that node, it encapsulates all or part of the task queue in the heartbeat response and returns it to the node for execution. At the same time, it updates the task queue in the job scheduling pool by deleting the tasks that have been assigned to the node; if all tasks have been assigned, it deletes the whole task queue. Control then returns to the third module;
An eighth module, which, if the task scheduling map is empty (meaning that all tasks of the job have finished executing), performs the reduce computation on the task results obtained and returns the reduce result to the user.
CN201410270693.6A 2014-06-17 2014-06-17 Bandwidth-aware Hadoop scheduling method and system Expired - Fee Related CN104102533B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410270693.6A CN104102533B (en) 2014-06-17 2014-06-17 Bandwidth-aware Hadoop scheduling method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410270693.6A CN104102533B (en) 2014-06-17 2014-06-17 Bandwidth-aware Hadoop scheduling method and system

Publications (2)

Publication Number Publication Date
CN104102533A true CN104102533A (en) 2014-10-15
CN104102533B CN104102533B (en) 2017-07-18

Family

ID=51670705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410270693.6A Expired - Fee Related CN104102533B (en) 2014-06-17 2014-06-17 Bandwidth-aware Hadoop scheduling method and system

Country Status (1)

Country Link
CN (1) CN104102533B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105357124A (en) * 2015-11-22 2016-02-24 华中科技大学 MapReduce bandwidth optimization method
CN107621979A (en) * 2017-10-27 2018-01-23 郑金林 A kind of Development of Students archives big data algorithm and analysis system
CN107704069A (en) * 2017-06-15 2018-02-16 重庆邮电大学 A kind of Spark energy-saving scheduling methods perceived based on energy consumption
CN109960573A (en) * 2018-12-29 2019-07-02 天津南大通用数据技术股份有限公司 A kind of cross-domain calculating task dispatching method and system based on Intellisense
CN111813527A (en) * 2020-07-15 2020-10-23 江苏方天电力技术有限公司 Data-aware task scheduling method
CN107346262B (en) * 2017-06-06 2020-12-15 华为技术有限公司 Task migration method and controller
CN114500514A (en) * 2022-02-14 2022-05-13 京东科技信息技术有限公司 File transmission method and device, electronic equipment and computer readable storage medium
CN114510329A (en) * 2022-01-21 2022-05-17 北京火山引擎科技有限公司 Method, device and equipment for determining predicted output time of task node

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073546A (en) * 2010-12-13 2011-05-25 北京航空航天大学 Task-dynamic dispatching method under distributed computation mode in cloud computing environment
CN102739785A (en) * 2012-06-20 2012-10-17 东南大学 Method for scheduling cloud computing tasks based on network bandwidth estimation
CN103500119A (en) * 2013-09-06 2014-01-08 西安交通大学 Task allocation method based on pre-dispatch

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105357124A (en) * 2015-11-22 2016-02-24 华中科技大学 MapReduce bandwidth optimization method
CN107346262B (en) * 2017-06-06 2020-12-15 华为技术有限公司 Task migration method and controller
CN107704069A (en) * 2017-06-15 2018-02-16 重庆邮电大学 A kind of Spark energy-saving scheduling methods perceived based on energy consumption
CN107704069B (en) * 2017-06-15 2020-08-04 重庆邮电大学 Spark energy-saving scheduling method based on energy consumption perception
CN107621979A (en) * 2017-10-27 2018-01-23 郑金林 A kind of Development of Students archives big data algorithm and analysis system
CN109960573A (en) * 2018-12-29 2019-07-02 天津南大通用数据技术股份有限公司 A kind of cross-domain calculating task dispatching method and system based on Intellisense
CN109960573B (en) * 2018-12-29 2021-01-08 天津南大通用数据技术股份有限公司 Cross-domain computing task scheduling method and system based on intelligent perception
CN111813527A (en) * 2020-07-15 2020-10-23 江苏方天电力技术有限公司 Data-aware task scheduling method
CN111813527B (en) * 2020-07-15 2022-06-14 江苏方天电力技术有限公司 Data-aware task scheduling method
CN114510329A (en) * 2022-01-21 2022-05-17 北京火山引擎科技有限公司 Method, device and equipment for determining predicted output time of task node
CN114510329B (en) * 2022-01-21 2023-08-08 北京火山引擎科技有限公司 Method, device and equipment for determining estimated output time of task node
CN114500514A (en) * 2022-02-14 2022-05-13 京东科技信息技术有限公司 File transmission method and device, electronic equipment and computer readable storage medium
CN114500514B (en) * 2022-02-14 2023-12-12 京东科技信息技术有限公司 File transmission method and device for cloud storage, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN104102533B (en) 2017-07-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170718

Termination date: 20180617