CN103246570A

CN103246570A - Hadoop scheduling method and system and management node

Info

Publication number: CN103246570A
Application number: CN2013101881806A
Authority: CN
Inventors: 孙垚光; 黎樵
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2013-05-20
Filing date: 2013-05-20
Publication date: 2013-08-14

Abstract

The invention discloses a Hadoop scheduling method. In the method, a management node obtains resource consumption information for completed tasks on a plurality of computing nodes; the management node generates a resource scheduling value from that resource consumption information; and the management node receives an allocation request for a new task and allocates resources to the new task according to the resource scheduling value. The method can raise the per-machine concurrency of Hadoop computing nodes (TaskTrackers) and thereby improve the resource utilization of the whole cluster (the plurality of computing nodes). The invention also discloses a Hadoop scheduling system and a management node.

Description

Hadoop scheduling method, system, and management node

Technical field

The present invention relates to the technical field of cloud computing, and in particular to a Hadoop scheduling method, system, and management node.

Background art

Apache Hadoop is a software platform for the distributed processing of massive data. As data-intensive services multiply, Hadoop is used more and more widely. As single clusters keep growing (a first-generation Hadoop cluster can support roughly 4,000 machines), how to improve cluster resource utilization has become a topic of increasing concern. The key to improving cluster resource utilization is cluster scheduling.

Hadoop currently supports several schedulers. Essentially all of them assign each TaskTracker a fixed number of slots based on the machine's configuration. For example, a slot count of 16 means that a single TaskTracker machine can execute at most 16 Tasks at the same time. The JobTracker schedules according to these slot counts, and each Task occupies at least one slot.

This scheme of configuring a fixed number of slots has two shortcomings:

(1) The number of slots each machine holds is fixed, and so are the resources behind each slot. By default, Hadoop associates 800 MB of memory with each slot. A Task that needs only 100 MB of memory at run time still occupies one slot on both the JobTracker and the TaskTracker, and is therefore still charged 800 MB of memory;

(2) The number of slots a given Task occupies is derived entirely from the configuration of the submitted job. In most cases, users cannot estimate very accurately how many resources their own programs will need at run time.

Therefore, if too few slots are configured per machine, cluster resources cannot be fully used; if too many are configured, a single machine can run short of resources when resource-hungry jobs arrive (for example, the machine may crash because its memory is exhausted).
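The trade-off described above can be illustrated with a minimal sketch. The function name `slot_outcome`, the 16,000 MB node size, and the sample workloads are hypothetical and chosen only for illustration; the sketch simply runs one Task per configured slot and compares real memory use with physical memory.

```python
def slot_outcome(node_mem_mb, slots, actual_task_mb):
    """Run one Task per slot and compare real usage with physical memory.

    Returns (running_tasks, used_mb, idle_mb, over_committed).
    All names and figures here are illustrative assumptions.
    """
    running = actual_task_mb[:slots]        # at most one Task per slot
    used = sum(running)                     # memory the Tasks really consume
    idle = max(node_mem_mb - used, 0)       # physical memory left unused
    over_committed = used > node_mem_mb     # risk of machine-wide memory exhaustion
    return len(running), used, idle, over_committed


# Few slots, light Tasks: most of the machine sits idle.
print(slot_outcome(16_000, 10, [100] * 40))
# Many slots, heavy Tasks: real usage exceeds physical memory.
print(slot_outcome(16_000, 30, [900] * 40))
```

In the first call only 1,000 MB of 16,000 MB is used; in the second, 30 Tasks of 900 MB each need 27,000 MB, which the machine does not have.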

Summary of the invention

The present invention aims to solve at least one of the technical defects described above.

To this end, one object of the present invention is to propose a Hadoop scheduling method that improves resource utilization on computing nodes.

Another object of the present invention is to propose a Hadoop scheduling system.

A further object of the present invention is to propose a management node.

To achieve the above objects, an embodiment of the first aspect of the present invention discloses a Hadoop scheduling method comprising the following steps: a management node obtains resource consumption information for completed tasks on a plurality of computing nodes; the management node generates a resource scheduling value from the resource consumption information for the completed tasks on the plurality of computing nodes; and the management node receives an allocation request for a new task and allocates resources to the new task according to the resource scheduling value.

The Hadoop scheduling method according to the embodiment of the present invention can raise the per-machine concurrency of Hadoop computing nodes (TaskTrackers), thereby improving the resource utilization of the whole cluster (the plurality of computing nodes).

In addition, the Hadoop scheduling method according to the above embodiment of the present invention may have the following additional technical features:

In some examples, a plurality of tasks run on the computing node.

In some examples, after a task on the computing node ends, the resource consumption information corresponding to that task is sent to the management node in a heartbeat message.

In some examples, the management node generates the resource scheduling value by the following formula:

Latest resource scheduling value = latest sample value × p + current resource scheduling value × (1 - p), where p takes a value in the interval (0, 1).
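The update rule above can be written as a recurrence and unrolled to show how much weight each sample carries. The symbols are introduced here for illustration and do not appear in the original text: $V_n$ is the resource scheduling value after the $n$-th sample, $s_n$ is the $n$-th sample value, and $V_0$ is the initial (default) value.

```latex
V_n = p\,s_n + (1-p)\,V_{n-1}, \qquad 0 < p < 1

V_n = p\sum_{k=0}^{n-1} (1-p)^{k}\, s_{n-k} \;+\; (1-p)^{n}\, V_0
```

Because $0 < 1-p < 1$, the weight of an older sample decays geometrically and the influence of the initial value $V_0$ vanishes as $n$ grows. A larger $p$ makes the value follow recent samples more closely; a smaller $p$ smooths it more.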

An embodiment of the second aspect of the present invention discloses a Hadoop scheduling system comprising a management node and a plurality of computing nodes, wherein the management node is configured to obtain resource consumption information for completed tasks on the plurality of computing nodes, to generate a resource scheduling value from that resource consumption information, and, after receiving an allocation request for a new task, to allocate resources to the new task according to the resource scheduling value.

The Hadoop scheduling system according to the embodiment of the present invention can raise the per-machine concurrency of Hadoop computing nodes (TaskTrackers), thereby improving the resource utilization of the whole cluster (the plurality of computing nodes).

In addition, the Hadoop scheduling system according to the above embodiment of the present invention may have the following additional technical features:

An embodiment of the third aspect of the present invention discloses a management node comprising: an acquisition module configured to obtain resource consumption information for completed tasks on a plurality of computing nodes; a generation module configured to generate a resource scheduling value from the resource consumption information for the completed tasks on the plurality of computing nodes; and a resource allocation module configured to allocate resources to a new task according to the resource scheduling value after an allocation request for the new task is received.

The management node according to the embodiment of the present invention can raise the per-machine concurrency of Hadoop computing nodes, thereby improving the resource utilization of the whole cluster (the plurality of computing nodes).

In addition, the management node according to the above embodiment of the present invention may have the following additional technical features:

Additional aspects and advantages of the present invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.

Description of drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a flowchart of a Hadoop scheduling method according to an embodiment of the present invention;

Fig. 2 is a detailed flowchart of a Hadoop scheduling method according to an embodiment of the present invention;

Fig. 3 is a structural diagram of a Hadoop scheduling system according to an embodiment of the present invention; and

Fig. 4 is a structural diagram of a management node according to an embodiment of the present invention.

Detailed description of the embodiments

Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the drawings, in which identical or similar reference numerals denote identical or similar elements, or elements with identical or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it.

In the description of the present invention, it should be understood that orientation or positional terms such as "longitudinal", "lateral", "up", "down", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they are therefore not to be construed as limiting the present invention.

In the description of the present invention, it should be noted that, unless otherwise specified and limited, the terms "mounted", "coupled", and "connected" are to be understood broadly. For example, a connection may be mechanical or electrical, may be internal to two elements, and may be direct or indirect through an intermediary. Those of ordinary skill in the art can understand the specific meaning of these terms according to the circumstances.

The Hadoop scheduling method, system, and management node according to embodiments of the present invention are described below with reference to the drawings.

Fig. 1 is a flowchart of a Hadoop scheduling method according to an embodiment of the present invention. As shown in Fig. 1, the Hadoop scheduling method comprises the following steps:

Step S101: the management node obtains resource consumption information for completed tasks on a plurality of computing nodes.

A plurality of tasks run on a computing node; that is, each computing node can run several tasks. After a task on a computing node ends, the resource consumption information corresponding to that task can be sent to the management node in a heartbeat message. In this example, if several tasks run on a computing node, the resource consumption information of that node is the total resource consumption information of all tasks running on it.

With reference to Fig. 2, the management node is the Master node and scheduler. As indicated by mark (1) in Fig. 2, the Master node and scheduler schedule a particular job and start a batch of Tasks according to the resource information configured for that job, for example a default allocation of 800 MB of memory per Task.

While the Tasks execute on a computing node, the computing node collects the resource information consumed by its own Task groups and reports it to the Master node and scheduler in a heartbeat when a Task ends. In this example, "a Task ends" means either that all tasks on the computing node have finished processing or that some particular task has completed.
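The reporting step can be sketched as follows. This is only an illustration under stated assumptions: the class names `TaskUsage`, `Heartbeat`, and `ComputeNode`, the field names, and the choice of peak memory as the reported quantity are all hypothetical, and the sketch does not reproduce the actual Hadoop heartbeat protocol.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskUsage:
    task_id: str          # hypothetical identifier of a finished Task
    peak_memory_mb: int   # memory the Task actually consumed while it ran


@dataclass
class Heartbeat:
    node_id: str                                    # reporting computing node
    finished: List[TaskUsage] = field(default_factory=list)

    def total_memory_mb(self):
        """Total consumption of all finished Tasks carried by this heartbeat."""
        return sum(t.peak_memory_mb for t in self.finished)


class ComputeNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self._pending = []      # usage records not yet reported

    def task_finished(self, task_id, peak_memory_mb):
        """Record the consumption of a Task at the moment it ends."""
        self._pending.append(TaskUsage(task_id, peak_memory_mb))

    def next_heartbeat(self):
        """Build the next heartbeat and clear the records it carries."""
        hb = Heartbeat(self.node_id, self._pending)
        self._pending = []
        return hb


node = ComputeNode("tasktracker-01")
node.task_finished("task_0001", 420)
node.task_finished("task_0002", 380)
hb = node.next_heartbeat()
print(hb.node_id, hb.total_memory_mb())
```

Clearing the pending list after each heartbeat ensures that every finished Task is reported once, so the management node never counts the same sample twice.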

Step S102: the management node generates a resource scheduling value from the resource consumption information for the completed tasks on the plurality of computing nodes.

As a concrete example, as indicated by mark (2) in Fig. 2, the Master node and scheduler collect the resource consumption information reported by the Tasks of all computing nodes and compute an average per-Task resource consumption (that is, the resource scheduling value). Each time the management node receives the resource consumption information for completed tasks reported by a computing node, that is, each reported Task memory consumption, it treats the report as a new sample and updates the resource scheduling value.

For example, the management node generates the resource scheduling value by the following formula:

Latest resource scheduling value = latest sample value × p + current resource scheduling value × (1 - p), where p takes a value in the interval (0, 1). In this example, the management node can configure p flexibly according to the running characteristics of the plurality of computing nodes. In other words, latest average per-Task resource consumption = latest sample value × p + current average per-Task resource consumption × (1 - p).
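The update formula can be sketched in a few lines. The class name `SchedulingValue`, the starting value of 800 MB, the choice p = 0.5, and the sample values are assumptions made for illustration only.

```python
class SchedulingValue:
    """Running estimate of per-Task resource consumption (illustrative)."""

    def __init__(self, initial_mb, p):
        if not 0.0 < p < 1.0:
            raise ValueError("p must lie strictly between 0 and 1")
        self.value_mb = float(initial_mb)
        self.p = p

    def update(self, sample_mb):
        # latest value = latest sample * p + current value * (1 - p)
        self.value_mb = sample_mb * self.p + self.value_mb * (1.0 - self.p)
        return self.value_mb


estimate = SchedulingValue(initial_mb=800, p=0.5)
for sample in (400, 600, 500):       # hypothetical reported consumptions
    estimate.update(sample)
print(estimate.value_mb)             # 550.0
```

Starting from 800 MB, the value moves to 600 MB, stays at 600 MB, and then reaches 550 MB. Only the current value has to be stored, so the management node needs no history of past samples.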

Step S103: the management node receives an allocation request for a new task and allocates resources to the new task according to the resource scheduling value.

As indicated by mark (3) in Fig. 2, in subsequent scheduling the Master node and scheduler no longer use the default job-configured resource information described above (such as the default of 800 MB of memory per Task), but instead use the latest average per-Task resource consumption as the resource scheduling value for Tasks on the computing nodes. For example, if the computed resource scheduling value is 500 MB, the management node allocates 500 MB of memory to each Task on the computing node.
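A minimal sketch of this allocation step follows. It assumes a hypothetical node with 4,000 MB of free memory and a hypothetical helper `assign_tasks`; the 800 MB and 500 MB figures come from the text.

```python
def assign_tasks(free_mb, scheduling_value_mb, pending_tasks):
    """Start pending Tasks while the node still has room for one more.

    Each started Task is granted scheduling_value_mb of memory.
    Returns (started_tasks, remaining_free_mb).
    """
    started = []
    for task in pending_tasks:
        if free_mb < scheduling_value_mb:
            break                       # node is full at the current estimate
        started.append(task)
        free_mb -= scheduling_value_mb
    return started, free_mb


tasks = [f"task_{i}" for i in range(10)]
by_default, _ = assign_tasks(4000, 800, tasks)    # job-configured default
by_estimate, _ = assign_tasks(4000, 500, tasks)   # measured average
print(len(by_default), len(by_estimate))          # 5 8
```

With the same free memory, granting 500 MB instead of 800 MB per Task lets the node start eight Tasks rather than five.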

An application example of the Hadoop scheduling method of the embodiment of the present invention follows:

A TaskTracker machine (computing node) was selected at random. The machine has 24 GB of memory available for scheduling. At the fixed ratio of 800 MB per slot, it could be configured with at most 30 slots (in practice, memory overhead outside the Tasks must also be considered, so the number of slots actually configured on such a machine is well below 30, typically 10 to 20). The machine was then scheduled using the resource scheduling value generated by the Hadoop scheduling method of the embodiment. After resource collection and dynamic adjustment were applied to memory, the Tasks running on the machine were as shown in Table 1: 21 Tasks were running in total, which, converted into Hadoop slots, occupy 38 slots (in Hadoop, a Task that needs more than 800 MB generally occupies several slots; for example, 1500 MB occupies 2 slots and 2100 MB occupies 3 slots). As Table 1 shows, the Hadoop scheduling method of the embodiment of the present invention greatly improves both the concurrency and the memory utilization of a single machine (computing node).
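The slot arithmetic used in this example can be checked with a short sketch. The helper names `max_slots` and `slots_for_task` are hypothetical, and the sketch assumes that 1 GB equals 1024 MB and that a Task's slot count is its memory requirement divided by 800 MB, rounded up, which matches the two conversions quoted above.

```python
import math

SLOT_MB = 800  # fixed memory per slot used in the example


def max_slots(node_mem_mb):
    """Upper bound on slots if all node memory went to Tasks."""
    return node_mem_mb // SLOT_MB


def slots_for_task(task_mem_mb):
    """Slots a Task occupies under fixed-slot accounting (rounded up)."""
    return max(1, math.ceil(task_mem_mb / SLOT_MB))


print(max_slots(24 * 1024))     # 30
print(slots_for_task(1500))     # 2
print(slots_for_task(2100))     # 3
```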

Table 1

In the above description, Hadoop is the open-source distributed computing platform of the Apache Foundation; the JobTracker is the Master node (management node) of a Hadoop cluster; a TaskTracker is an execution computing node of a Hadoop cluster; a Slot is a slot, and one slot can execute one Task; and a Task is the execution unit of a Hadoop job.

According to the Hadoop scheduling method of the embodiment of the present invention, the Tasks on the same machine (computing node) are grouped, for example (but not only) by process group ID (pgid) or by Task ID. The management node no longer configures a per-machine slot count for each TaskTracker, but directly configures the per-machine available resources. Information from the ps tool or the proc file system can be used to measure the resources each Task group actually consumes while running. As Tasks keep running and finishing, the management node learns the resources that a single Task group of the job needs, and adjusts the resources reserved for subsequent Tasks according to the resource scheduling value generated from the resource consumption information for completed tasks on the plurality of computing nodes. The scheduling system therefore does not always schedule by the default resources, but by the actual resource consumption of the Tasks concerned (the resource scheduling value). In addition, while a TaskTracker executes Tasks, certain techniques are used to prevent the machine from running out of memory (OOM) because of excessive memory consumption.
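Per-group measurement from the proc file system can be sketched as follows. The text does not say which proc entries are read, so this sketch assumes the process group ID is taken from `/proc/<pid>/stat` and resident memory from the `VmRSS` line of `/proc/<pid>/status` on Linux. The function names and the sample text are hypothetical; the functions take file contents as strings so that the example runs on any platform.

```python
from collections import defaultdict


def parse_pgid(stat_text):
    """Extract the process group ID from the text of /proc/<pid>/stat."""
    # The command name is parenthesised and may contain spaces,
    # so split only what follows the last ')'.
    fields = stat_text[stat_text.rfind(")") + 1:].split()
    return int(fields[2])           # fields are: state, ppid, pgrp, ...


def parse_rss_kb(status_text):
    """Extract resident memory in kB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0                        # no VmRSS line (e.g. kernel threads)


def memory_by_group(processes):
    """Sum resident memory per process group.

    processes: iterable of (stat_text, status_text) pairs, one per process.
    """
    totals = defaultdict(int)
    for stat_text, status_text in processes:
        totals[parse_pgid(stat_text)] += parse_rss_kb(status_text)
    return dict(totals)


# Hypothetical contents for three processes in two Task groups.
sample = [
    ("101 (java) S 1 100 100 0 -1", "Name:\tjava\nVmRSS:\t  409600 kB\n"),
    ("102 (sh -c x) S 101 100 100 0 -1", "Name:\tsh\nVmRSS:\t    2048 kB\n"),
    ("201 (python) R 1 200 200 0 -1", "Name:\tpython\nVmRSS:\t  102400 kB\n"),
]
print(memory_by_group(sample))
```

Grouping by process group ID attributes the memory of child processes started by a Task to that Task, which a per-process measurement would miss.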

The method of the embodiment of the present invention overcomes the scheduling defect caused by configuring a fixed number of slots on Hadoop TaskTracker machines. For example, the concept of a slot is removed: what is configured on a TaskTracker is directly the usable resources, including but not limited to memory, CPU, and IO. During scheduling, the scheduler (management node) does not always schedule by the resources the user has set; instead, it dynamically adjusts the resources allocated to subsequently started Tasks according to the resource consumption of Tasks that have actually run to completion.
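Configuring usable resources directly, rather than a slot count, might look like the sketch below. The `Resources` class, its three fields, the units (MB, cores, MB/s), and all figures are assumptions made for illustration; the text only names memory, CPU, and IO as examples of resources.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    memory_mb: int
    cpu_cores: float
    io_mbps: float

    def fits_in(self, capacity):
        """True if this demand fits within the given free capacity."""
        return (self.memory_mb <= capacity.memory_mb
                and self.cpu_cores <= capacity.cpu_cores
                and self.io_mbps <= capacity.io_mbps)

    def minus(self, demand):
        """Capacity left after granting the given demand."""
        return Resources(self.memory_mb - demand.memory_mb,
                         self.cpu_cores - demand.cpu_cores,
                         self.io_mbps - demand.io_mbps)


# Hypothetical node capacity and per-Task scheduling values.
free = Resources(memory_mb=16_000, cpu_cores=8, io_mbps=400)
per_task = Resources(memory_mb=500, cpu_cores=0.5, io_mbps=50)

started = 0
while per_task.fits_in(free):
    free = free.minus(per_task)
    started += 1
print(started)   # 8
```

In this illustration IO is the limiting resource: memory alone would admit 32 Tasks and CPU 16, but IO bandwidth admits only 8. A single slot count cannot express such a limit.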

The method of the embodiment of the present invention can raise the per-machine concurrency of Hadoop computing nodes (TaskTrackers), thereby improving the resource utilization of the whole cluster. In general, the number of slots configured on a Hadoop computing node varies with its memory configuration and is typically between 10 and 20; with the method of the embodiment of the present invention, resource utilization on the computing node improves. For example, in observations of a production deployment, a Hadoop machine with 16 GB of memory was configured with 16 slots, that is, it ran at most 16 Tasks at the same time; after this technical solution was adopted, the machine could run more than 20 Tasks concurrently, a 20% improvement in resource utilization.

In addition, the method of the embodiment of the present invention applies to most Hadoop usage scenarios.

(1) For jobs in which a single Task's resource consumption exactly fills an integer number of slots, the effect is not significant (in the worst case the result is on par with stock Hadoop, for example when a single Task consumes exactly 800 MB of memory).

(2) For jobs in which a single Task's resource consumption differs considerably from the resources of N slots, the effect is more pronounced.
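The two cases above can be compared numerically with a rough sketch. The function `concurrency_gain`, the 16,000 MB node size, and the Task sizes are hypothetical, and the sketch ignores memory overhead outside the Tasks; it only contrasts how many Tasks fit by slot accounting with how many fit by real consumption.

```python
import math

SLOT_MB = 800  # per-slot memory quoted in the text


def concurrency_gain(node_mem_mb, task_mem_mb):
    """Ratio of Tasks that fit by real usage to Tasks that fit by slots."""
    slots_per_task = max(1, math.ceil(task_mem_mb / SLOT_MB))
    by_slots = (node_mem_mb // SLOT_MB) // slots_per_task
    by_usage = node_mem_mb // task_mem_mb
    return by_usage / by_slots


print(concurrency_gain(16_000, 800))    # exactly one slot
print(concurrency_gain(16_000, 1600))   # exactly two slots
print(concurrency_gain(16_000, 500))    # well under one slot
print(concurrency_gain(16_000, 900))    # just over one slot
```

Tasks that fill exactly one or two slots show no gain (a ratio of 1.0), while Tasks of 500 MB or 900 MB, which leave much of their slots unused, fit noticeably more densely when scheduled by real consumption.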

Fig. 3 is a structural diagram of a Hadoop scheduling system according to an embodiment of the present invention. As shown in Fig. 3, the Hadoop scheduling system 300 of the embodiment of the present invention comprises a management node 310 and a plurality of computing nodes 320.

The management node 310 is configured to obtain resource consumption information for completed tasks on the plurality of computing nodes 320, to generate a resource scheduling value from that resource consumption information, and, after receiving an allocation request for a new task, to allocate resources to the new task according to the resource scheduling value.

Specifically, a plurality of tasks run on a computing node 320; that is, each computing node 320 can run several tasks. After a task on a computing node 320 ends, the resource consumption information corresponding to that task can be sent to the management node 310 in a heartbeat message. In this example, if several tasks run on a computing node 320, the resource consumption information of that computing node 320 is the total resource consumption information of all tasks running on it.

With reference to Fig. 2, the management node 310 is the Master node and scheduler. As indicated by mark (1) in Fig. 2, the Master node and scheduler schedule a particular job and start a batch of Tasks according to the resource information configured for that job, for example a default allocation of 800 MB of memory per Task.

While the Tasks execute on a computing node 320, the computing node 320 collects the resource information consumed by its own Task groups and reports it to the Master node and scheduler in a heartbeat when a Task ends. In this example, "a Task ends" means either that all tasks on the computing node 320 have finished processing or that some particular task has completed.

As indicated by mark (2) in Fig. 2, the Master node and scheduler collect the resource consumption information reported by the Tasks of all computing nodes 320 and compute an average per-Task resource consumption (that is, the resource scheduling value). Each time the management node 310 receives the resource consumption information for completed tasks reported by a computing node 320, that is, each reported Task memory consumption, it treats the report as a new sample and updates the resource scheduling value.

For example, the management node 310 generates the resource scheduling value by the following formula:

Latest resource scheduling value = latest sample value × p + current resource scheduling value × (1 - p), where p takes a value in the interval (0, 1). In this example, the management node 310 can configure p flexibly according to the running characteristics of the plurality of computing nodes 320. In other words, latest average per-Task resource consumption = latest sample value × p + current average per-Task resource consumption × (1 - p).

As indicated by mark (3) in Fig. 2, in subsequent scheduling the Master node and scheduler no longer use the default job-configured resource information described above (such as the default of 800 MB of memory per Task), but instead use the latest average per-Task resource consumption as the resource scheduling value for Tasks on the computing nodes 320. For example, if the computed resource scheduling value is 500 MB, the management node 310 allocates 500 MB of memory to each Task on the computing node 320.

In the above description, Hadoop is the open-source distributed computing platform of the Apache Foundation; the JobTracker is the Master node (management node 310) of a Hadoop cluster; a TaskTracker is an execution computing node 320 of a Hadoop cluster; a Slot is a slot, and one slot can execute one Task; and a Task is the execution unit of a Hadoop job.

According to the Hadoop scheduling system of the embodiment of the present invention, the Tasks on the same machine (computing node 320) are grouped, for example (but not only) by process group ID (pgid) or by Task ID. The management node no longer configures a per-machine slot count for each TaskTracker, but directly configures the per-machine available resources. Information from the ps tool or the proc file system can be used to measure the resources each Task group actually consumes while running. As Tasks keep running and finishing, the management node learns the resources that a single Task group of the job needs, and adjusts the resources reserved for subsequent Tasks according to the resource scheduling value generated from the resource consumption information for completed tasks on the plurality of computing nodes. The scheduling system therefore does not always schedule by the default resources, but by the actual resource consumption of the Tasks concerned (the resource scheduling value). In addition, while a TaskTracker executes Tasks, certain techniques are used to prevent the machine from running out of memory (OOM) because of excessive memory consumption.

The system of the embodiment of the present invention overcomes the scheduling defect caused by configuring a fixed number of slots on Hadoop TaskTracker machines. For example, the concept of a slot is removed: what is configured on a TaskTracker is directly the usable resources, including but not limited to memory, CPU, and IO. During scheduling, the scheduler (management node) does not always schedule by the resources the user has set; instead, it dynamically adjusts the resources allocated to subsequently started Tasks according to the resource consumption of Tasks that have actually run to completion.

The system of the embodiment of the present invention can raise the per-machine concurrency of Hadoop computing nodes (TaskTrackers), thereby improving the resource utilization of the whole cluster. In general, the number of slots configured on a Hadoop computing node varies with its memory configuration and is typically between 10 and 20; with the approach of the embodiment of the present invention, resource utilization on the computing node improves.

In addition, the system of the embodiment of the present invention applies to most Hadoop usage scenarios.

Fig. 4 is a structural diagram of a management node according to an embodiment of the present invention. As shown in Fig. 4, the management node 310 of the embodiment of the present invention comprises an acquisition module 311, a generation module 312, and a resource allocation module 313.

The acquisition module 311 is configured to obtain resource consumption information for completed tasks on a plurality of computing nodes 320. The generation module 312 is configured to generate a resource scheduling value from the resource consumption information for the completed tasks on the plurality of computing nodes 320. The resource allocation module 313 is configured to allocate resources to a new task according to the resource scheduling value after an allocation request for the new task is received.
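The division into three modules can be sketched as three small classes wired together by a management node object. This is only an illustration: the class and method names, the 800 MB initial value, and p = 0.5 are assumptions, and the generation module reuses the update formula given earlier in the description.

```python
class AcquisitionModule:
    """Collects consumption samples reported by computing nodes."""

    def __init__(self):
        self.samples = []

    def receive(self, sample_mb):
        self.samples.append(sample_mb)

    def drain(self):
        """Hand over the collected samples and start a new batch."""
        samples, self.samples = self.samples, []
        return samples


class GenerationModule:
    """Maintains the resource scheduling value."""

    def __init__(self, initial_mb, p):
        self.value_mb = float(initial_mb)
        self.p = p

    def update(self, samples):
        for s in samples:
            # latest value = latest sample * p + current value * (1 - p)
            self.value_mb = s * self.p + self.value_mb * (1 - self.p)
        return self.value_mb


class AllocationModule:
    """Answers allocation requests for new Tasks."""

    def allocate(self, value_mb):
        return {"memory_mb": round(value_mb)}


class ManagementNode:
    def __init__(self, initial_mb=800, p=0.5):
        self.acquisition = AcquisitionModule()
        self.generation = GenerationModule(initial_mb, p)
        self.allocation = AllocationModule()

    def on_heartbeat(self, sample_mb):
        self.acquisition.receive(sample_mb)

    def on_new_task(self):
        value = self.generation.update(self.acquisition.drain())
        return self.allocation.allocate(value)


master = ManagementNode()
print(master.on_new_task())    # no samples yet, so the default applies
master.on_heartbeat(400)
print(master.on_new_task())    # the estimate has moved toward the sample
```

Keeping acquisition, generation, and allocation separate means the update rule or the allocation policy can be changed without touching how samples arrive.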

The management node according to the embodiment of the present invention can measure the resources each Task group actually consumes while running. As Tasks keep running and finishing, the management node learns the resources that a single Task group of the job needs, and adjusts the resources reserved for subsequent Tasks according to the resource scheduling value generated from the resource consumption information for completed tasks on the plurality of computing nodes. The scheduling system therefore does not always schedule by the default resources, but by the actual resource consumption of the Tasks concerned (the resource scheduling value). In addition, while a TaskTracker executes Tasks, certain techniques are used to prevent the machine from running out of memory (OOM) because of excessive memory consumption.

The management node of the embodiment of the present invention overcomes the scheduling defect caused by configuring a fixed number of slots on Hadoop TaskTracker machines. For example, the concept of a slot is removed: what is configured on a TaskTracker is directly the usable resources, including but not limited to memory, CPU, and IO. During scheduling, the scheduler (management node) does not always schedule by the resources the user has set; instead, it dynamically adjusts the resources allocated to subsequently started Tasks according to the resource consumption of Tasks that have actually run to completion.

The management node of the embodiment of the present invention can raise the per-machine concurrency of Hadoop computing nodes (TaskTrackers), thereby improving the resource utilization of the whole cluster. In general, the number of slots configured on a Hadoop computing node varies with its memory configuration and is typically between 10 and 20; with the approach of the embodiment of the present invention, resource utilization on the computing node improves.

In addition, the management node of the embodiment of the present invention applies to most Hadoop usage scenarios.

In this specification, a description that refers to the terms "an embodiment", "some embodiments", "an example", "a concrete example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such illustrative uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principles and spirit of the present invention. The scope of the present invention is defined by the claims and their equivalents.

Claims

1. A Hadoop scheduling method, characterized by comprising the following steps:

a management node obtains resource consumption information for completed tasks on a plurality of computing nodes;

the management node generates a resource scheduling value from the resource consumption information for the completed tasks on the plurality of computing nodes; and

the management node receives an allocation request for a new task and allocates resources to the new task according to the resource scheduling value.

2. The method of claim 1, characterized in that a plurality of tasks run on the computing node.

3. The method of claim 1 or 2, characterized in that, after a task on the computing node ends, the resource consumption information corresponding to that task is sent to the management node in a heartbeat message.

4. The method of any one of claims 1 to 3, characterized in that the management node generates the resource scheduling value by the following formula: latest resource scheduling value = latest sample value × p + current resource scheduling value × (1 - p), where p takes a value in the interval (0, 1).

5. A Hadoop scheduling system, characterized by comprising a management node and a plurality of computing nodes, wherein

the management node is configured to obtain resource consumption information for completed tasks on the plurality of computing nodes, to generate a resource scheduling value from the resource consumption information for the completed tasks on the plurality of computing nodes, and, after receiving an allocation request for a new task, to allocate resources to the new task according to the resource scheduling value.

6. The system of claim 5, characterized in that a plurality of tasks run on the computing node.

7. The system of claim 5, characterized in that, after a task on the computing node ends, the resource consumption information corresponding to that task is sent to the management node in a heartbeat message.

8. The system of claim 5, characterized in that the management node generates the resource scheduling value by the following formula: latest resource scheduling value = latest sample value × p + current resource scheduling value × (1 - p), where p takes a value in the interval (0, 1).

9. A management node, characterized by comprising:

an acquisition module configured to obtain resource consumption information for completed tasks on a plurality of computing nodes;

a generation module configured to generate a resource scheduling value from the resource consumption information for the completed tasks on the plurality of computing nodes; and

a resource allocation module configured to allocate resources to a new task according to the resource scheduling value after an allocation request for the new task is received.

10. The management node of claim 9, characterized in that a plurality of tasks run on the computing node.

11. The management node of claim 9, characterized in that, after a task on the computing node ends, the resource consumption information corresponding to that task is sent to the management node in a heartbeat message.

12. The management node of claim 9, characterized in that the management node generates the resource scheduling value by the following formula: latest resource scheduling value = latest sample value × p + current resource scheduling value × (1 - p), where p takes a value in the interval (0, 1).