CN103455375A

CN103455375A - Load-monitoring-based hybrid scheduling method under Hadoop cloud platform

Info

Publication number: CN103455375A
Application number: CN2013100387467A
Authority: CN
Inventors: 李千目; 陆路; 侯君
Original assignee: LIANYUNGANG RESEARCH INSTITUTE OF NANJING UNIVERSITY OF SCIENCE AND TECHNOLOGY
Current assignee: Golden number information technology (Suzhou) Co., Ltd.
Priority date: 2013-01-31
Filing date: 2013-01-31
Publication date: 2013-12-18
Anticipated expiration: 2033-01-31
Also published as: CN103455375B

Abstract

The invention discloses a load-monitoring-based hybrid scheduling method for the Hadoop cloud platform. The scheme is derived by analyzing the scheduling behavior and the respective application scenarios of the Max-D algorithm, the FIFO algorithm, and the fair scheduling algorithm. The system load is monitored in real time, and the algorithm best suited to the current load condition is selected from the three. Compared with using any single scheduling algorithm, the scheme offers clear advantages and adapts to load changes in a Hadoop system, so that the system retains good performance.

Description

Hybrid scheduling method based on load monitoring under the Hadoop cloud platform

Technical field

The present invention relates to job scheduling methods for cloud platforms, and in particular to a hybrid job scheduling method based on load monitoring under the Hadoop cloud platform.

Technical background

Wireless sensor network technology has risen gradually in recent years. It integrates sensor technology, embedded computing, distributed information processing, and communication technology, and is a brand-new information acquisition technology. It differs from traditional networks in theoretical design, application implementation, and development direction, so the theoretical foundation and architecture of sensor networks differ greatly from those of traditional network systems. Hadoop is a highly reliable and highly scalable distributed system. It runs applications on clusters built from large numbers of inexpensive hardware devices and provides applications with a stable and reliable set of interfaces.

The existing Hadoop scheduling methods each pursue a single scheduling goal and suit only one load condition. Their performance varies greatly under different load conditions, so job scheduling adapts poorly and cannot meet the complexity and diversity of the cloud platform. The job scheduling methods most widely used in Hadoop are the FIFO method and the fair scheduling method. The FIFO method allocates jobs according to their submission time, giving priority to the job submitted first. It is simple to implement and has low scheduling overhead, but when large volumes of data are processed, some computation-heavy jobs occupy resources for a long time, subsequent jobs wait a long time before they run, and both system performance and user experience suffer. The fair scheduling method guarantees that jobs obtain resources fairly, but the average job completion time is longer.

Summary of the invention

The purpose of the present invention is to provide an efficient hybrid scheduling method under the Hadoop cloud platform that copes with the complexity of the cloud platform.

The technical solution that achieves the purpose of the invention is as follows:

1. Each resource (taskTracker) in the Hadoop system continuously sends real-time information about its own node to the central node (jobTracker), including whether the node is idle, and the elapsed execution time and progress of the task it is executing.

2. The system load is monitored in real time, and the load level of the system is computed from the real-time information sent by the resources. System load refers to the amount of tasks the system is executing at a given moment.

Suppose the system has m resources, of which k are idle, and the number of pending tasks is numT. The load degree of the system is defined as:

HL = \frac{numT}{m}

If 0≤HL<1, the system is under low load;

If 1≤HL≤2, the system is in load balance;

If HL>2, the system is overloaded.

3. When a resource requests a task from the central node, a scheduling scheme is selected according to the real-time load level: under low load, the FIFO scheduling method is used to reduce scheduling overhead; under load balance, the fair scheduling method is used to improve the fairness of the system and guarantee that every job can run; under overload, the Max-D scheduling method is used to shorten the average job completion time.
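The three items above can be sketched as follows. This is a minimal illustration under stated assumptions, not Hadoop's actual API: the `Heartbeat` record and the functions `load_degree`, `choose_scheduler`, and `schedule_for` are hypothetical names, and the heartbeat fields simply mirror the information listed in item 1.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Heartbeat:
    """Hypothetical status record a taskTracker reports to the jobTracker."""
    node_id: str
    idle: bool
    elapsed_seconds: float = 0.0  # time the running task has executed so far
    progress: float = 0.0         # fraction of the running task completed, 0..1


def load_degree(num_pending_tasks: int, num_resources: int) -> float:
    """HL = numT / m, as defined in item 2."""
    if num_resources <= 0:
        raise ValueError("the cluster must contain at least one resource")
    return num_pending_tasks / num_resources


def choose_scheduler(hl: float) -> str:
    """Map the load degree to the scheduling method named in item 3."""
    if hl < 0:
        raise ValueError("load degree cannot be negative")
    if hl < 1:
        return "FIFO"    # low load
    if hl <= 2:
        return "FAIR"    # load balance
    return "MAX_D"       # overload


def schedule_for(heartbeats: Iterable[Heartbeat], num_pending_tasks: int) -> str:
    """Pick a scheduler, taking m as the number of reporting resources."""
    resources = list(heartbeats)
    return choose_scheduler(load_degree(num_pending_tasks, len(resources)))
```

Taking m as the number of resources that have reported a heartbeat is an assumption of this sketch; the source only states that m is the number of resources in the system.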

The Max-D scheduling method referred to in item 3 proceeds as follows:

First step: determine the set of all computing resources in the cloud environment and the set of idle resources.

Second step: jobs to be allocated are queued in order of submission, and newly submitted jobs are appended to the tail of the queue.

Third step: the queued jobs are scheduled, and the Max-D method selects a suitable resource to execute each of them.

The Max-D method used in the third step comprises the following steps:

Step 3.1: for every job to be allocated, compute its average estimated running time over all computing resources;

Step 3.2: for each job, compute the difference Di between its average estimated running time and its minimum running time on any single idle computing resource, and record that computing resource;

Step 3.3: among all jobs, find the job with the largest difference Di, and denote this Di as D;

Step 3.4: if D≥0, assign the job to the recorded resource for processing and remove that resource from the idle resource set; if D<0, re-determine the allocated resources and the idle resource set, add the resources that have finished their assigned jobs to the idle resource set, and then return to step 3.1;

Step 3.5: repeat steps 3.2 to 3.4 until every resource that has requested a job has been assigned one.

In step 3.1, the average estimated completion time on the computing resources is calculated as follows:

Suppose the cloud environment consists of n unallocated jobs T = {t_1, t_2, ..., t_n} and m resources R = {r_1, r_2, ..., r_m}, and each resource can process only one job at a time. The number of idle resources is k, and the idle set is denoted R' = {r_1', r_2', ..., r_k'}, where k<m. The estimated running time of job t_i on resource r_j is TCir_j, and the average running time of job t_i over all resources is

AvgTC_{i} = \frac{1}{m} \sum_{j = 1}^{m} {TCir}_{j}

The completion time of job t_i on resource r_j is the sum of the remaining completion time of the job currently running on r_j and the execution time of job t_i on r_j.

Suppose that in the cloud environment, for jobs of the same class, the processing speed of a resource is proportional to the amount of data it processes. The estimated completion time of job i on resource r is then the sum of the remaining completion time of the job currently running on r and the execution time of job i on r:

{TCir}_{j} (k + 1) = {RTCir}_{j} (k) \times \frac{1 - pro}{pro} + \left[ (1 - \rho) \frac{{TCir}_{j} (k)}{M (k)} + \rho \frac{{RTCir}_{j} (k)}{M (k) \, pro} \right] \times M (k + 1), \quad r_{j} \in R \qquad (1)

Here TCir_j(k+1) is the completion time required for resource r_j to process job t_i, and TCir_j(k) is the predicted completion time of the previous job on r_j. M(k) is the ratio of the time required to run this job to the time required to run a unit job. RTCir_j(k) is the actual running time of the previous job on r_j, and pro (0<pro≤1) is the fraction of the previous job that has been completed. If r_j is an idle resource, that is, the previous job has finished, then pro=1 and the formula reduces to

{TCir}_{j} (k + 1) = \left[ (1 - \rho) \frac{{TCir}_{j} (k)}{M (k)} + \rho \frac{{RTCir}_{j} (k)}{M (k)} \right] \times M (k + 1), \quad r_{j} \in R^{'} \qquad (2)
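Formulas (1) and (2) above can be written as one function. The sketch below is illustrative only: the function and parameter names are hypothetical, and ρ is treated as a plain weighting parameter because the source does not define it further.

```python
def estimate_completion(prev_estimate: float,
                        prev_actual: float,
                        m_prev: float,
                        m_next: float,
                        rho: float,
                        pro: float = 1.0) -> float:
    """Estimate TCir_j(k+1) from the previous job on resource r_j.

    prev_estimate -- TCir_j(k), predicted time of the previous job
    prev_actual   -- RTCir_j(k), time the previous job has actually run
    m_prev        -- M(k), size of the previous job in unit jobs
    m_next        -- M(k+1), size of the job being scheduled
    rho           -- weighting parameter (not defined further in the source)
    pro           -- completed fraction of the previous job, 0 < pro <= 1
    """
    if not 0.0 < pro <= 1.0:
        raise ValueError("pro must satisfy 0 < pro <= 1")
    if m_prev <= 0:
        raise ValueError("M(k) must be positive")
    # First term of formula (1): time the running job still needs.
    remaining = prev_actual * (1.0 - pro) / pro
    # Bracketed term: blended time per unit job.
    per_unit = ((1.0 - rho) * prev_estimate / m_prev
                + rho * prev_actual / (m_prev * pro))
    return remaining + per_unit * m_next
```

With the default `pro=1.0` the first term vanishes and the function computes formula (2) for an idle resource.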

Formula (1) thus estimates the execution time of an unscheduled job on a given resource from the estimated execution time TCir_j(k) and the actual execution time RTCir_j(k) of the previous job on that resource. When the system has just started, however, no job has yet run on any resource, so the execution time cannot be estimated from a previous job. Therefore, at system start, for all resources, let

{TCir}_{j} (0) = {RTCir}_{j} (0) = 0 \qquad (3)

In this way, the first pending jobs select resources that have not yet executed any job. After a resource finishes its first job, the actual execution time RTCir_j(1) of that job is known; TCir_j(1) is set equal to RTCir_j(1), and the running time of subsequent jobs can then be estimated according to formula (1).

In step 3.2, the difference D is calculated as follows:

The minimum running time of job t_i over all idle nodes is denoted mUTC_i = min{TCir_1', TCir_2', ..., TCir_k'}. The idle resource r_j' that satisfies TCir_j' = mUTC_i is recorded as BR_i = r_j'. The difference D_i of job i is then obtained from the formula D_i = AvgTC_i - mUTC_i.
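The difference calculation for a single job can be sketched as below. The function name and the dictionary layout are hypothetical; the estimated running times are assumed to have been computed already.

```python
from typing import Dict, List, Tuple


def difference_for_job(times: Dict[str, float],
                       idle: List[str]) -> Tuple[float, str]:
    """Return (D_i, BR_i) for one job.

    times -- estimated running time TCir_j of the job on every resource
    idle  -- identifiers of the idle resources (the set R')
    """
    if not idle:
        raise ValueError("at least one idle resource is required")
    avg_tc = sum(times.values()) / len(times)          # AvgTC_i over all resources
    best_resource = min(idle, key=lambda r: times[r])  # BR_i; first minimum wins ties
    m_utc = times[best_resource]                       # mUTC_i over idle resources
    return avg_tc - m_utc, best_resource
```

A positive D_i means the best idle resource is faster than the average over all resources, so assigning the job now is attractive; a negative D_i means every idle resource is slower than average for this job.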

Compared with the prior art, the invention has the following notable advantages: 1. by monitoring the system load, the invention selects a suitable scheduling method for the tasks to be allocated in real time, so that the system maintains efficient performance under different system states; 2. the job scheduling of the invention assigns jobs only to idle resources, which keeps the load balanced in the cloud environment and avoids the situation where some resources are overloaded while others are idle; 3. the invention uses the Max-D method to select the most suitable resource for each job, which reduces the average job completion time and improves system throughput.

Description of the drawing

The accompanying drawing is a schematic diagram of the load-based hybrid scheduling strategy of the present invention.

Embodiment

The present invention is further described below with reference to the accompanying drawing.

HL = \frac{numT}{m}

If 0≤HL<1, the system is under low load;

If 1≤HL≤2, the system is in load balance;

If HL>2, the system is overloaded.
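As a worked illustration of the thresholds above, take a hypothetical cluster of m = 10 resources. The numbers are chosen only for illustration and do not come from the source.

```latex
\begin{aligned}
numT = 4:  &\quad HL = \tfrac{4}{10}  = 0.4 < 1        &&\Rightarrow \text{low load} \\
numT = 15: &\quad HL = \tfrac{15}{10} = 1.5 \in [1, 2] &&\Rightarrow \text{load balance} \\
numT = 25: &\quad HL = \tfrac{25}{10} = 2.5 > 2        &&\Rightarrow \text{overload}
\end{aligned}
```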

The hybrid scheduling strategy based on load monitoring comprises a load monitor and a scheduling selector. The load monitor is responsible for monitoring and calculating the system load, and the scheduling selector schedules tasks according to the load condition provided by the load monitor.

When an idle resource requests a task, the scheduling selector obtains the real-time system load condition from the load monitor and then selects the corresponding job scheduling method according to the load state.

1. When the system is under low load, for example when the system has just started and the number of idle resources exceeds the number of pending tasks, the scheduling selector selects the FIFO scheduling method. When a resource requests a task, the FIFO algorithm sorts all pending jobs by submission time, selects the first job, and assigns a task of that job to the resource. When selecting a task within a job, the FIFO algorithm gives priority to tasks that failed and are waiting to be re-executed. In this state the FIFO scheduling method reduces both scheduling overhead and the average job completion time.

2. As the system runs, the load gradually increases and reaches the load-balanced state, but has not yet reached its peak. The performance of the FIFO method, in both average job completion time and fairness, has begun to decline, so the scheduling selector chooses the fair scheduling method to guarantee that users obtain resources fairly.

3. When the load increases further and reaches the overload level, the fair scheduling method causes the average job completion time to grow rapidly. The scheduling selector then chooses the Max-D scheduling method; although the fairness of resource allocation decreases, the average job completion time is reduced significantly.
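The FIFO behavior described in item 1 above can be sketched as follows. This is an assumption-laden illustration, not Hadoop's scheduler code: the `Job` record and `fifo_next_task` are hypothetical names, and skipping a job that has no task left to hand out is a choice made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Job:
    """Hypothetical job record used only in this sketch."""
    name: str
    submit_time: float
    failed_tasks: List[str] = field(default_factory=list)   # waiting for re-execution
    pending_tasks: List[str] = field(default_factory=list)  # not yet run


def fifo_next_task(jobs: List[Job]) -> Optional[Tuple[str, str]]:
    """Return (job name, task) for the requesting resource, or None."""
    # Earliest submitted job first.
    for job in sorted(jobs, key=lambda j: j.submit_time):
        # Within a job, previously failed tasks are handed out first.
        if job.failed_tasks:
            return job.name, job.failed_tasks.pop(0)
        if job.pending_tasks:
            return job.name, job.pending_tasks.pop(0)
    return None
```

Each call removes the returned task from its job, so repeated calls drain the earliest job before moving to the next one.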

The Max-D scheduling algorithm is implemented as follows:

AvgTC_i denotes the average estimated running time of job t_i over all resources. The minimum running time of job t_i over all idle nodes is denoted mUTC_i = min{TCir_1', TCir_2', ..., TCir_k'}, and the idle resource r_j' that satisfies TCir_j' = mUTC_i is recorded as BR_i = r_j'.

While the set of jobs to be scheduled is not empty, perform the following operations:

Step 1: compute AvgTC_i for every job in the job set T;

Step 2: for each job t_i, find mUTC_i and compute D_i = AvgTC_i - mUTC_i;

Step 3: find the job t_i such that D_i = Max{D_1, D_2, ..., D_n}; if several jobs satisfy the condition, select t_i in the order in which these jobs arrived;

Step 4: if D_i≥0, assign job t_i to resource BR_i for processing and remove BR_i from the idle resource set R'; if D_i<0, re-evaluate the allocated resources and the idle resource set, add the resources that have finished their assigned jobs to the idle resource set, and then return to step 1;

Step 5: repeat steps 2 to 4 until every resource that has requested a job has been assigned one.
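Steps 1 to 5 can be sketched as one scheduling round. The code below is a simplified illustration: `max_d_round` and its data layout are hypothetical, the estimated running times are taken as given, and the D_i<0 branch is reduced to ending the round so that the caller can refresh the idle resource set and call again.

```python
from typing import Dict, List, Tuple


def max_d_round(est: Dict[str, Dict[str, float]],
                queue: List[str],
                idle: List[str]) -> List[Tuple[str, str]]:
    """Assign queued jobs to idle resources by largest difference D_i.

    est   -- est[job][resource] is the estimated running time TCir_j
    queue -- job names in arrival order
    idle  -- identifiers of the idle resources (R')
    """
    pending = list(queue)
    free = list(idle)
    assignments: List[Tuple[str, str]] = []
    while pending and free:
        best = None  # (D_i, job, BR_i) of the current best candidate
        for job in pending:
            times = est[job]
            avg_tc = sum(times.values()) / len(times)   # step 1: AvgTC_i
            br = min(free, key=lambda r: times[r])      # step 2: BR_i
            d = avg_tc - times[br]                      # step 2: D_i
            # Step 3: strict '>' keeps the earliest arrival on ties.
            if best is None or d > best[0]:
                best = (d, job, br)
        d, job, br = best
        if d < 0:
            break  # step 4, simplified: wait until the idle set is refreshed
        assignments.append((job, br))                   # step 4: assign
        pending.remove(job)
        free.remove(br)
    return assignments
```

Returning early when the largest D_i is negative keeps the sketch free of any cluster-state model; a fuller implementation would add newly freed resources to the idle set and restart from step 1, as the text describes.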

The present invention assumes that, for jobs of the same class, the processing speed of a resource is proportional to the amount of data it processes. The estimated completion time of job i on resource r is then the sum of the remaining completion time of the job currently running on r and the execution time of job i on r:

{TCir}_{j} (k + 1) = {RTCir}_{j} (k) \times \frac{1 - pro}{pro} + \left[ (1 - \rho) \frac{{TCir}_{j} (k)}{M (k)} + \rho \frac{{RTCir}_{j} (k)}{M (k) \, pro} \right] \times M (k + 1), \quad r_{j} \in R

Here TCir_j(k+1) is the completion time required when job t_i is assigned to resource r_j, and TCir_j(k) is the predicted completion time of the previous job on r_j. M(k) is the ratio of the time required to run this job to the time required to run a unit job. RTCir_j(k) is the actual running time of the previous job on r_j, and pro (0<pro≤1) is the fraction of the previous job that has been completed. If r_j is an idle resource, that is, the previous job has finished, then pro=1 and the formula reduces to

{TCir}_{j} (k + 1) = \left[ (1 - \rho) \frac{{TCir}_{j} (k)}{M (k)} + \rho \frac{{RTCir}_{j} (k)}{M (k)} \right] \times M (k + 1), \quad r_{j} \in R^{'}

According to the formula, the execution time of a job to be scheduled on a given resource can be estimated from the estimated execution time TCir_j(k) and the actual execution time RTCir_j(k) of the previous job on that resource. When the system has just started, however, no job has yet run on any resource, so the execution time cannot be estimated from a previous job. Therefore, at system start, for all resources, let

{TCir}_{j} (0) = {RTCir}_{j} (0) = 0

In this way, the first pending jobs select resources that have not yet executed any job. After a resource finishes its first job, the actual execution time RTCir_j(1) of that job is known; TCir_j(1) is set equal to RTCir_j(1), and the running time of subsequent jobs can then be estimated according to formula (1).
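The start-up rule can be sketched as a small per-resource estimator. The class below is a hypothetical illustration limited to idle resources (pro = 1): `size` stands for M(k), ρ is treated as a plain weighting parameter, and returning 0 before any job has run is how this sketch realizes the zero initial values.

```python
from typing import Optional


class ResourceEstimator:
    """Tracks TCir_j(k), RTCir_j(k) and M(k) for one resource r_j."""

    def __init__(self, rho: float) -> None:
        self.rho = rho
        self.prev_estimate = 0.0                  # TCir_j(0) = 0
        self.prev_actual = 0.0                    # RTCir_j(0) = 0
        self.prev_size: Optional[float] = None    # M(k), unknown before first job

    def estimate(self, size: float) -> float:
        """Estimated running time of a job of the given size on this resource."""
        if self.prev_size is None:
            # No history yet: the zero initial values make an unused
            # resource look cheapest, so it is chosen first.
            return 0.0
        per_unit = ((1.0 - self.rho) * self.prev_estimate
                    + self.rho * self.prev_actual) / self.prev_size
        return per_unit * size

    def record(self, size: float, actual: float) -> None:
        """Store the outcome of a finished job of the given size."""
        if self.prev_size is None:
            estimate = actual                     # TCir_j(1) := RTCir_j(1)
        else:
            estimate = self.estimate(size)        # what was predicted for this job
        self.prev_estimate = estimate
        self.prev_actual = actual
        self.prev_size = size
```

After the first `record` call the estimator has one real measurement, and every later `estimate` blends the previous prediction with the previous actual time.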

Claims

1. A hybrid scheduling method based on load monitoring under the Hadoop cloud platform, characterized in that the method is as follows:

(1) each resource taskTracker in the Hadoop system continuously sends real-time information about its own node to the central node jobTracker, the real-time information including whether the node is idle, and the elapsed execution time and progress of the task being executed;

(2) the system load is monitored in real time, and the load level of the system is computed from the real-time information sent by the resources;

(3) when a resource requests a task from the central node, a scheduling scheme is selected according to the real-time load level: under low load, the FIFO scheduling method is used to reduce scheduling overhead; under load balance, the fair scheduling method is used to improve the fairness of the system and guarantee that every job can run; under overload, the Max-D scheduling method is used to shorten the average job completion time.

2. The hybrid scheduling method based on load monitoring under the Hadoop cloud platform according to claim 1, characterized in that: in (2), system load refers to the amount of tasks the system is executing at a given moment; the system load is monitored in real time, and the load level of the system is computed from the real-time information sent by the resources as follows: suppose the system has m resources, of which k are idle, and the number of pending tasks is numT; the load degree of the system is defined as:

HL = \frac{numT}{m}

If 0≤HL<1, the system is under low load;

If 1≤HL≤2, the system is in load balance;

If HL>2, the system is overloaded.

3. The hybrid scheduling method based on load monitoring under the Hadoop cloud platform according to claim 1, characterized in that the Max-D scheduling method referred to in (3) proceeds as follows:

First step: determine the set of all computing resources in the cloud environment and the set of idle resources;

Second step: jobs to be allocated are queued in order of submission, and newly submitted jobs are appended to the tail of the queue;

4. The hybrid scheduling method based on load monitoring under the Hadoop cloud platform according to claim 1 or 3, characterized in that the Max-D method used in the third step in (3) comprises the following steps:

Step 3.4: if D≥0, assign the job to the recorded resource for processing and remove that resource from the idle resource set; if D<0, re-determine the allocated resources and the idle resource set, add the resources that have finished their assigned jobs to the idle resource set, and then return to step 3.1;

5. The hybrid scheduling method based on load monitoring under the Hadoop cloud platform according to claim 4, characterized in that, in step 3.1, the average estimated completion time on the computing resources is calculated as follows:

The completion time of job t_i on resource r_j is the sum of the remaining completion time of the job currently running on r_j and the execution time of job t_i on r_j;

Suppose that in the cloud environment, for jobs of the same class, the processing speed of a resource is proportional to the amount of data it processes; the estimated completion time of job i on resource r is then the sum of the remaining completion time of the job currently running on r and the execution time of job i on r:

Formula (1) estimates the execution time of an unscheduled job on a given resource from the estimated execution time TCir_j(k) and the actual execution time RTCir_j(k) of the previous job on that resource;

At system start, for all resources, let

{TCir}_{j} (0) = {RTCir}_{j} (0) = 0 \qquad (3)

The first pending jobs select resources that have not yet executed any job; after a resource finishes its first job, the actual execution time RTCir_j(1) of that job is known, TCir_j(1) is set equal to RTCir_j(1), and the running time of subsequent jobs is estimated according to formula (1).

6. The hybrid scheduling method based on load monitoring under the Hadoop cloud platform according to claim 4, characterized in that, in step 3.2, the difference D is calculated as follows:

The minimum running time of job t_i over all idle nodes is denoted mUTC_i = min{TCir_1', TCir_2', ..., TCir_k'}; the idle resource r_j' that satisfies TCir_j' = mUTC_i is recorded as BR_i = r_j'; the difference D_i of job i is then obtained from the formula D_i = AvgTC_i - mUTC_i.