CN114077486A - MapReduce task scheduling method and system - Google Patents
- Publication number
- CN114077486A (application number CN202111386374.8A)
- Authority
- CN
- China
- Prior art keywords: job, resource, task, central, resources
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5011—Allocation of resources to service a request, the resources being hardware resources other than CPUs, servers and terminals
- G06F9/5016—Allocation of resources, the resource being the memory
- G06F9/5038—Allocation of resources, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
- G06F9/5044—Allocation of resources, considering hardware capabilities
Abstract
The invention provides a MapReduce task scheduling method and system that introduce a preemption mechanism based on Docker containers, remedying the shortcoming of Yarn's existing kill-based preemption mechanism, which terminates tasks outright. The Docker-container-based preemption mechanism releases the resources occupied by a task while preserving its progress; combined with a task policy aware of the service level agreement, it enables high-priority tasks to preempt the running resources of other tasks and ensures that job completion times meet the Service Level Agreement (SLA) target.
Description
Technical Field
The invention relates to the technical field of task scheduling in heterogeneous cluster environments, and in particular to a MapReduce task scheduling method and system.
Background
With the development of Internet technology, the scale of data to be processed in daily production keeps growing, and distributed computing systems are now widely used to process large-scale data. Within such systems, the scheduler is a vital component: a well-designed scheduling strategy can match program demands to available cluster resources effectively and reduce a data center's operating cost. The most widely used distributed computing framework today is Apache's flagship project Hadoop, whose programming model is MapReduce. Hadoop extracted its resource-management component into an independent framework, Yarn, a general-purpose resource management platform that provides the resources needed to run computing programs such as MapReduce.
Yarn currently implements three schedulers based on different scheduling strategies: FIFO, Capacity, and Fair. Although these three strategies can improve cluster utilization and optimize cluster performance to a certain extent, scheduling jobs with different resource requirements and QoS constraints in a complex heterogeneous cluster environment remains a difficult problem that urgently needs solving. By completion time, jobs can be divided into short jobs and long jobs. Short jobs generally have low-latency requirements, while long jobs can tolerate higher latency but have quality-of-service requirements. Short jobs therefore need to be scheduled immediately after submission to avoid queuing delays; for long jobs, the scheduler should let them use cluster resources whenever the cluster has free capacity, which can improve cluster resource utilization.
In real working environments, many long jobs and short jobs are usually scheduled together, and existing solutions either forcibly terminate a running long job to guarantee low latency for short jobs, or forbid resource preemption entirely to improve cluster resource utilization. Such simple scheduling strategies cannot handle jobs with different resource requirements in a complex heterogeneous environment. The goal is to trade off resource utilization against job queuing delay: to improve hardware resource utilization and efficiency while reducing job queuing delay as much as possible, thereby meeting the service level agreement.
To solve this problem, a new scheduling strategy is urgently needed to meet the needs of actual work.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a MapReduce task scheduling method and system for heterogeneous Yarn cluster environments, which ensure high cluster resource utilization while also providing low latency and fast response for jobs.
The MapReduce task scheduling method provided by the invention comprises the following steps:
s1: the client creates a JobSubmitter instance, computes the job's input splits via an internal method of the JobSubmitter, copies the resources required to run the job into a distributed file system, and submits the MapReduce job to a resource scheduler;
s2: after receiving the job submission message, the resource scheduler forwards the request to the central resource scheduler, which analyzes the job's details via an internal job-analysis method and derives the latest deadline required to satisfy the service level agreement;
s3: the central resource scheduler adds the new task to a central task queue and reorders all tasks by deadline, from nearest to furthest;
s4: the central resource scheduler receives heartbeat information from the node resource schedulers, obtains the number of tasks assigned to each node resource scheduler, selects the node with the fewest tasks, and dispatches the task with the nearest deadline to that node for execution;
s5: after receiving the new task, the node resource scheduler adds it to the local task queue and reorders the queue by deadline;
s6: the node resource scheduler checks the new task's position in the task queue; if the new task's deadline is nearer than that of the task being executed, the new task preempts the executing task.
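Steps s3, s4, and s6 amount to earliest-deadline-first (EDF) ordering in the central queue, least-loaded node selection, and a deadline comparison for preemption. The following minimal sketch illustrates that logic; the class and function names are illustrative and not taken from the patent.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    deadline: float                    # latest completion time derived from the SLA
    name: str = field(compare=False)   # excluded from ordering

class CentralScheduler:
    def __init__(self, node_task_counts):
        self.queue = []                            # EDF heap (step s3)
        self.node_task_counts = node_task_counts   # learned from heartbeats (step s4)

    def submit(self, task):
        heapq.heappush(self.queue, task)           # s3: queue stays deadline-ordered

    def dispatch(self):
        """s4: assign the earliest-deadline task to the least-loaded node."""
        if not self.queue:
            return None
        node = min(self.node_task_counts, key=self.node_task_counts.get)
        task = heapq.heappop(self.queue)
        self.node_task_counts[node] += 1
        return node, task

def should_preempt(running, incoming):
    """s6: the new task preempts only if its deadline is strictly nearer."""
    return incoming.deadline < running.deadline
```

For example, with two nodes carrying 2 and 0 tasks, submitting tasks with deadlines 5 and 10 sees the deadline-5 task dispatched first, to the empty node.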
Further, in step S3, the central resource scheduler obtains the total amount of CPU resources C and the total amount of memory resources M, derives the job share of long jobs from the number of jobs, and, according to the fairness principle, periodically computes the resource share of each job in the central task queue.
Further, in step S4, after receiving the resource request of a job, the central resource scheduler analyzes whether the job can be completed before the deadline by combining the deadline constraint, the resource condition of the cluster, and the resource requirement of the job, and adds the job into the central task queue if the central resource scheduler determines that the job can be completed before the deadline; otherwise, the central resource scheduler may deny execution of the job.
Further, in step S4, when a job arrives, the central resource scheduler determines the current cluster resource amount from the heartbeat information sent by each node resource scheduler and analyzes the amount of resources the job requests from the history logs of previous runs; if the job has never been executed on the cluster, the scheduler runs it on a small portion of the original data set as a pre-test set.
Further, in step S4, if the resource amount of the job request does not exceed the available resource amount in the cluster, the central resource scheduler adds the job to the central task queue;
otherwise, two cases must be distinguished: in the first, the job directly preempts the resources of currently running jobs and, if executed immediately, can still be completed in time, so the job is executed;
in the second, the job cannot meet its deadline even after preempting other jobs' resources, so the central resource scheduler directly denies its execution.
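The three-way admission decision above (enqueue, enqueue with preemption, reject) can be sketched as follows; the function name and the boolean standing in for the deadline-feasibility analysis are illustrative assumptions, since the patent leaves that analysis to the central resource scheduler.

```python
def admission_decision(job_demand, free_resources, feasible_with_preemption):
    """Decide a job's fate per the rules above.

    job_demand / free_resources: scalar resource amounts (illustrative).
    feasible_with_preemption: outcome of the deadline analysis assuming the
    job immediately preempts running jobs' resources.
    """
    if job_demand <= free_resources:
        return "enqueue"                  # fits in the cluster's free capacity
    if feasible_with_preemption:
        return "enqueue-and-preempt"      # case 1: preempt and still finish in time
    return "reject"                       # case 2: deadline unreachable regardless
```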
Further, in step S4, the amount of resources to be preempted under the service level agreement is determined as follows:
when the W map tasks have finished executing, the reduce task starts; T_up denotes the upper limit of the execution time of the W map tasks,
where M_avg is the average execution time of the map tasks in job j, M_max is the maximum execution time of a map task in job j, and the number of map tasks in job j also enters the calculation. There are Q jobs that can be executed before the upper time limit T_up; the amount of resources released after these jobs complete is R.
The amount of resources required in the reduce phase is E,
where C_r denotes the amount of resources available in the cluster at the current time and the map-task resource demand of job j is also taken into account. (The formulas for T_up, R, and E are given as figures in the original publication and are not reproduced in this text.)
Further, in step S6, when job preemption is required, first compute the resource share additionally occupied by the job k to be preempted, i.e. the difference between the resources job k actually occupies during execution and the amount it should receive under the fair resource-allocation principle; then obtain the resource share requested by the job j that needs to run. If job j's requested share does not exceed job k's additionally occupied share, the resources to be preempted can be computed directly.
Further, in step S6, if job j's requested share exceeds job k's additionally occupied share, the resources to be preempted are computed by an algorithm: the CPU resource and the memory resource are compared and divided into a dominant resource and a secondary resource, and the amount of the secondary resource to reclaim is derived from the reclamation of the dominant resource.
Here C_j and M_j denote the amounts of CPU and memory resources requested by job j, and C_a and M_a denote the amounts of CPU and memory resources actually additionally occupied by the current job k in the cluster.
If the CPU resource requested by job j is the dominant resource, all CPU resources additionally occupied by job k are preempted, and the memory resources additionally occupied by job k are preempted proportionally; otherwise, memory is the dominant resource requested by job j, so all memory resources additionally occupied by job k are preempted, and the CPU resources additionally occupied by job k are preempted proportionally.
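A minimal sketch of the dominant-resource rule above, under two stated assumptions: the resource demanded in the larger ratio relative to job k's extra holdings is the dominant one, and "proportional" preemption of the secondary resource means reclaiming job j's demand capped at what job k additionally holds (the exact proportionality constant is not spelled out in the text).

```python
def preemption_amounts(cj, mj, ca, ma):
    """cj, mj: CPU / memory requested by job j.
    ca, ma: CPU / memory additionally occupied by job k.
    Returns (cpu_to_preempt, mem_to_preempt)."""
    if cj / ca >= mj / ma:
        # CPU is job j's dominant resource: reclaim all of job k's extra CPU,
        # and memory proportionally (assumed: capped at k's extra memory).
        return ca, min(mj, ma)
    # Otherwise memory is dominant: reclaim all extra memory, CPU proportionally.
    return min(cj, ca), ma
```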
Further, the MapReduce task scheduling method is performed according to a scheduling policy based on a service level agreement, and the scheduling policy based on the service level agreement comprises the following steps:
when job j arrives, its deadline, required throughput, and required resource amount are analyzed;
the central resource scheduler analyzes whether the cluster's current resources meet job j's resource demand; if so, job j is added to the central task queue;
if not, it judges whether the cluster's resources can meet the resource demand of job j's map tasks, and whether the resources released after the map tasks finish can meet the resource demand of the reduce tasks;
if both conditions are met, job j is added to the central task queue and marked as high priority, so that it may preempt other jobs' resources during execution; if the two conditions cannot be met simultaneously, the central resource scheduler refuses to execute job j;
the central resource scheduler sorts the jobs in the central task queue by deadline and traverses them in turn. For a job j in the central task queue, the central resource scheduler checks whether all of job j's map tasks have finished executing. If not, it checks job j's priority: for a high-priority job, it immediately communicates with the node resource schedulers to preempt the designated resources from the cluster and execute job j's map tasks; otherwise it waits for the cluster to free idle resources and allocates them to job j's map tasks;
the node resource schedulers report task execution state to the central resource scheduler via heartbeat information; when map tasks complete, the central resource scheduler judges whether the number of executed map tasks exceeds the threshold W. If so, it starts executing job j's reduce tasks, again checking job j's priority: a high-priority job completes preemption together with the node resource scheduler, while other jobs wait for idle resources to be allocated.
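The heartbeat-driven decision in the last step can be condensed into a small dispatch function; the state flags below are illustrative stand-ins for information the node resource schedulers report.

```python
def reduce_launch_decision(completed_maps, threshold_w, high_priority, idle_resources):
    """Decide whether job j's reduce tasks may start, per the policy above."""
    if completed_maps < threshold_w:
        return "wait-for-maps"             # not enough map output yet
    if high_priority:
        return "preempt-and-start-reduce"  # node schedulers carry out preemption
    return "start-reduce" if idle_resources else "wait-for-idle-resources"
```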
The invention also provides a MapReduce task scheduling system adopting the above MapReduce task scheduling method, which comprises a distributed data center cluster containing a central resource scheduler and a plurality of node resource schedulers;
a central task queue is maintained in the central resource scheduler, and when a new job arrives, the central resource scheduler analyzes job characteristics to obtain the operation time and the deadline of the job;
the node resource scheduler maintains a running-task queue and a paused-task queue, both sorted by deadline, and continuously reports the task information and resource usage on its node to the central resource scheduler through a heartbeat mechanism.
The invention provides a MapReduce task scheduling method for heterogeneous Yarn cluster environments. The central resource scheduler runs as a daemon on the ResourceManager and is responsible for receiving task information from the node resource schedulers; it periodically checks the current task scheduling policy, each worker node's resource availability, and the resource demand of newly arrived tasks, infers which queues occupy surplus resources and which are under-allocated, computes the amount of resources to preempt, derives the optimal resource allocation for the task queues in each time period, and sends the scheduling decisions to the node resource schedulers for execution.
The scheduling method introduces a preemption mechanism based on Docker containers, overcoming the shortcoming of Yarn's existing kill-based preemption mechanism, which terminates tasks outright. The Docker-container-based preemption mechanism releases the resources occupied by a task while preserving its progress; combined with a task policy aware of the service level agreement, it enables high-priority tasks to preempt the running resources of other tasks and ensures that job completion times meet the Service Level Agreement (SLA) target.
Drawings
The invention will be described in more detail hereinafter on the basis of embodiments and with reference to the accompanying drawings. Wherein:
FIG. 1 is a schematic flow chart of a MapReduce task scheduling method in the present invention;
FIG. 2 is a system architecture diagram of a MapReduce task scheduling method in the present invention;
FIG. 3 is a deployment diagram of an example of the MapReduce task scheduling method in the present invention.
Detailed Description
In order to clearly illustrate the inventive content of the present invention, the present invention will be described below with reference to examples.
In the description of the present invention, it should be noted that the terms "upper", "lower", "horizontal", "top", "bottom", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus should not be construed as limiting the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Referring to figs. 1-3, the central resource scheduler in the present invention is a daemon running on the ResourceManager. It is responsible for receiving task information from the node resource schedulers; it periodically checks the current task scheduling policy and each worker node's resource availability, and obtains the resource demand of newly arrived tasks, so as to infer which queues occupy surplus resources and which are under-allocated, compute the amount of resources to preempt, derive the optimal resource allocation for the task queues in each time period, and send the scheduling decisions to the node resource schedulers for execution.
The node resource scheduler is a daemon running on a worker NodeManager. It integrates Docker containers with the Yarn framework, solving the problem that native Yarn preemption directly kills task containers. After receiving a task request, the node resource scheduler loads the task into a Docker container and configures the container according to the task's resource request. The node resource scheduler is also responsible for container suspension and container recovery, suspending or resuming containers and reclaiming their resources as needed.
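The progress-preserving property that distinguishes container suspension from kill-based preemption can be illustrated with a toy model. A real node resource scheduler would drive `docker pause` / `docker unpause` (which freeze and thaw the container's processes); the simulation below only demonstrates the invariant that pausing retains progress.

```python
class PreemptableTask:
    """Toy stand-in for a task running inside a Docker container."""
    def __init__(self, total_work):
        self.total_work = total_work
        self.progress = 0
        self.paused = False

    def run_step(self):
        """One unit of work; a paused container makes no progress."""
        if not self.paused and self.progress < self.total_work:
            self.progress += 1

    def pause(self):   # analogous to `docker pause`: progress is retained
        self.paused = True

    def resume(self):  # analogous to `docker unpause`
        self.paused = False
```

Under kill-based preemption the task would restart from zero progress; here, pausing at progress 3 and resuming later continues from 3.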
In actual job scheduling, which job to preempt must be determined before any preemption operation, and the invention designs a preemptive job-scheduling strategy that preserves QoS quality of service to the greatest extent and satisfies the SLA service level agreement. The idea is to execute the job with the earliest deadline first, which minimizes the number of jobs that miss their deadlines and greatly improves execution outcomes. Specifically, after receiving a job's resource request, the central resource scheduler combines the deadline constraint, the cluster's resource status, and the job's resource requirements to analyze whether the job can be completed before its deadline. If the central resource scheduler determines that the job can be completed before its deadline, it adds the job to the job queue; otherwise it denies execution of the job.
When a job j arrives, the central resource scheduler determines the current cluster resource amount from the heartbeat information sent by each node resource scheduler and analyzes the resource amount requested by job j from the history logs of previous runs; if job j has never been executed on the cluster, the scheduler runs it on a small portion of the original data set as a pre-test set to obtain the job's performance indicators. The total resources required by job j, the resources required by its map tasks, and the resources required by its reduce tasks are distinguished, and C_r denotes the amount of resources available in the cluster at the current time. If the amount of resources requested by the job does not exceed the amount available in the cluster, the central resource scheduler adds job j to the job queue. If not, two cases must be distinguished: if job j, by directly preempting the resources of a currently running job k and being executed immediately, can still be completed in time, job j is executed; if instead job j cannot meet its deadline even after preempting other jobs' resources, the central resource scheduler directly denies execution of job j.
The MapReduce task scheduling method is carried out according to a scheduling strategy based on a Service Level Agreement (SLA), and the specific deployment mode based on the Service Level Agreement (SLA) is divided into the following steps:
step 1: in the Yarn cluster, the resource manager node is integrated with the central resource scheduler provided by the invention, and the rest node manager nodes are integrated with the node resource scheduler provided by the invention.
Step 2: when a user submits a batch of jobs to the cluster, the central resource scheduler analyzes each job: the size of its input data, the CPU, memory, and other resources it requires, and the deadline specified by the user.
Step 3: the central resource scheduler collects the node state information sent by each node resource scheduler and tallies the execution progress of the currently running jobs and the utilization of the cluster's various resources.
Step 4: the central resource scheduler combines the cluster's currently available resources with the characteristics of the jobs to be executed and analyzes whether the current cluster resources can meet job j's resource demand; if so, job j is added to the job queue to be executed. If not, it judges whether the cluster resources can meet the resource demand of job j's map tasks, and whether the resources released after the map tasks finish can meet the resource demand of the reduce tasks. If both conditions are met, job j is added to the job queue and set to high priority.
Step 5: the central resource scheduler sorts the jobs in the job queue by deadline and traverses them in turn. For a job j in the job queue, it checks whether all of job j's map tasks have finished executing. If not, it checks job j's priority: for a high-priority job, it immediately communicates with the node resource schedulers to preempt the designated resources from the cluster and execute job j's map tasks; otherwise it waits for the cluster to free idle resources and allocates them to job j's map tasks.
Step 6: the node resource scheduler reports task execution state to the central resource scheduler via heartbeat information. When map tasks complete, the central resource scheduler judges whether the number of executed map tasks exceeds the threshold W; if so, it starts executing job j's reduce tasks and checks job j's priority: a high-priority job completes preemption together with the node resource scheduler, while other jobs wait for idle resources to be allocated.
Based on the Service Level Agreement (SLA) scheduling strategy, the scheduling method enables high-priority tasks to preempt the running resources of other tasks and ensures that job completion times meet the SLA target. The MapReduce task scheduling system balances resource utilization against job queuing delay, effectively improving hardware resource utilization and efficiency while greatly reducing job queuing delay, so as to meet the service level agreement target.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present description refers to embodiments, not every embodiment may contain only a single embodiment, and such description is for clarity only, and those skilled in the art should integrate the description, and the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.
Claims (10)
1. A MapReduce task scheduling method is characterized by comprising the following steps:
s1: the client creates a JobSubmitter instance, computes the job's input splits via an internal method of the JobSubmitter, copies the resources required to run the job into a distributed file system, and submits the MapReduce job to the resource scheduler ResourceManager;
s2: after receiving the job submission message, the resource scheduler transmits the request message to the central resource scheduler, and the central resource scheduler analyzes the detailed information related to the job by an internal job analysis method and analyzes the latest deadline required by reaching a service level agreement;
s3: the central resource scheduler adds the new task to a central task queue and reorders all tasks by deadline, from nearest to farthest;
s4: the central resource scheduler receives heartbeat information from the node resource schedulers, obtains the number of tasks assigned to each node resource scheduler, selects the node with the fewest assigned tasks, and dispatches to it the task whose deadline is currently most imminent for execution;
s5: after receiving the new task, the node resource scheduler adds the new task into a local task queue and reorders the task queue according to the deadline;
s6: the node resource scheduler checks the position of the new task in the task queue; if the deadline of the new task is nearer than that of the task currently executing, the new task preempts the executing task.
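Steps S3 through S6 amount to earliest-deadline-first queue ordering with deadline-driven preemption. The following is a minimal sketch of that ordering and the preemption test; the `Task` type and function names are illustrative, not taken from the patent:

```python
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    deadline: float               # time remaining until the SLA deadline
    job_id: str = field(compare=False)

def enqueue(queue, task):
    """Insert a task and keep the queue sorted nearest-deadline-first
    (the reordering of steps S3 and S5)."""
    queue.append(task)
    queue.sort()                  # order=True compares by deadline only
    return queue

def should_preempt(queue, running):
    """Step S6: the head of the queue preempts the running task only if
    its deadline is strictly nearer."""
    return bool(queue) and queue[0].deadline < running.deadline
```

A fresh queue and a running task are enough to exercise the rule: the most urgent queued task sits at the head, and preemption fires only when it beats the running task's deadline.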
2. The MapReduce task scheduling method of claim 1, wherein in step S2, the central resource scheduler obtains the total amount of CPU resources C and the total amount of memory resources M of the cluster, obtains the share allotted to long jobs from the number of jobs, and, according to the fairness principle, periodically computes the resource share of each job in the central task queue.
3. The MapReduce task scheduling method according to claim 2, wherein in step S3, after receiving the resource request of a job, the central resource scheduler analyzes, from the deadline constraint, the resource condition of the cluster, and the resource requirement of the job, whether the job can be completed before its deadline; if it determines that the job can be completed before the deadline, the job is added to the central task queue; otherwise, the central resource scheduler denies execution of the job.
4. The MapReduce task scheduling method according to claim 1, wherein in step S4, when a job arrives, the central resource scheduler determines the current amount of cluster resources from the heartbeat information sent by each node resource scheduler and estimates the amount of resources the job requests from the history log of its previous runs; if the job has never been executed on the cluster, the scheduler runs it on a small part of the original data set as a pre-test set.
5. The MapReduce task scheduling method of claim 4, wherein in step S4, if the amount of resources the job requests does not exceed the amount of resources available in the cluster, the central resource scheduler adds the job to the central task queue;
otherwise, two cases are distinguished: in the first case, the job directly preempts resources from currently running jobs and, if executed immediately, can still be completed on time, so it is executed;
in the second case, the job cannot meet its deadline even after preempting the resources of other jobs, so the central resource scheduler directly denies its execution.
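The three-way admission decision of claims 4 and 5 can be sketched as follows, using a simplified model in which resource amounts are scalars; the function name and the `finishes_in_time` flag are illustrative assumptions, not the patent's notation:

```python
def admit(requested, available, preemptable, finishes_in_time):
    """Admission decision of claims 4-5: queue the job outright if the
    cluster has capacity; otherwise allow it to run by preemption only if
    enough preemptable resources exist AND immediate execution would still
    meet the deadline; in every other case, reject.
    Returns "queue", "preempt", or "reject"."""
    if requested <= available:
        return "queue"
    if requested <= available + preemptable and finishes_in_time:
        return "preempt"
    return "reject"
```

The second branch mirrors the claim's split: preemption alone is not enough, the job must also be able to finish on time when started immediately.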
6. The MapReduce task scheduling method of claim 4, wherein in step S4, the amount of preempted resources based on the service level agreement is determined by the following scheme:
when the W map tasks have finished executing, the reduce tasks begin execution; T_up represents the upper limit on the execution time of the W map tasks and is computed from M_avg, the average execution time of the map tasks in job j, n_j^map, the number of map tasks in job j, and M_max, the maximum execution time of a map task in job j; there are Q jobs that can finish before the upper time limit T_up, and the amount of resources released after these jobs complete is R; the amount of resources required in the reduce phase is E.
7. The MapReduce task scheduling method of claim 6, wherein in step S6, when job preemption is required, the method calculates the extra resource share occupied by each job k to be preempted, ΔR_k = R_k^used − R_k^fair, where R_k^used represents the resources actually occupied by job k during execution and R_k^fair represents the amount of resources job k should obtain under the fair resource allocation principle; it then obtains the resource share R_j^req requested by the job j to be executed; if R_j^req does not exceed the total extra share available, the resources to be preempted can be calculated directly from the ΔR_k.
8. The MapReduce task scheduling method of claim 7, wherein in step S6, if the condition of claim 7 does not hold, the amount of resources that must be preempted is computed by an algorithm as follows: the CPU resource and the memory resource are compared and classified into a dominant (main) resource and a secondary resource, and the reclamation of the secondary resource follows the reclamation of the dominant resource;
here C_j and M_j respectively denote the CPU and memory amounts requested by job j, and C_a and M_a respectively denote the CPU and memory amounts actually additionally occupied by the current job k in the cluster;
if C_j/C_a ≥ M_j/M_a, the CPU resource requested by job j is the dominant resource, so all of the CPU resources additionally occupied by job k are preempted and the memory resources additionally occupied by job k are preempted in the same proportion; otherwise, memory is regarded as the dominant resource requested by job j, so all of the memory resources additionally occupied by job k are preempted and the CPU resources additionally occupied by job k are preempted in the same proportion.
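Treating the CPU and memory amounts as scalars, the dominant-resource preemption rule of claim 8 might be sketched as below. The patent's exact formulas are not reproduced in this text, so both the ratio test and the fraction-capped proportional reclamation are assumptions consistent with the symbols defined above (one plausible reading: reclaim only the fraction of job k's surplus that job j actually needs in the dominant resource):

```python
def preempt_amounts(cj, mj, ca, ma):
    """Sketch of claim 8's dominant-resource preemption (assumed formula).
    cj, mj: CPU / memory requested by job j.
    ca, ma: CPU / memory additionally occupied by job k (its surplus).
    The resource with the larger request-to-surplus ratio is treated as
    dominant; both resources are reclaimed at the same fraction, capped
    at the whole surplus."""
    if cj / ca >= mj / ma:            # CPU is the dominant resource
        frac = min(1.0, cj / ca)      # fraction of the surplus reclaimed
    else:                             # memory is the dominant resource
        frac = min(1.0, mj / ma)
    return ca * frac, ma * frac
```

Reclaiming both resources at one fraction keeps the secondary resource "in proportion" to the dominant one, which is the qualitative behavior the claim describes.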
9. The MapReduce task scheduling method of claim 8, wherein the MapReduce task scheduling method is performed according to a service level agreement-based scheduling policy, and the service level agreement-based scheduling policy comprises:
when job j arrives, its deadline, required throughput, and required amount of resources are analyzed;
the central resource scheduler analyzes whether the current amount of cluster resources meets the resource demand of job j; if so, job j is added to the central task queue;
if not, it determines whether the cluster's resources can meet the resource demand of the map tasks of job j, and whether the resources released after the map tasks finish can meet the resource demand of the reduce tasks;
if both conditions are met, job j is added to the central task queue and marked as high priority, so that it may occupy the resources of other jobs during execution; if the two conditions cannot be met simultaneously, the central resource scheduler refuses to execute job j;
the central resource scheduler sorts the jobs in the central task queue by deadline and traverses them one by one; for a job j in the central task queue, the central resource scheduler checks whether all map tasks of job j have finished executing; if not, it checks the priority of job j: if job j has high priority, the central resource scheduler immediately communicates with the node resource schedulers to preempt the designated resources from the cluster to execute the map tasks of job j; otherwise, it waits for the cluster to free idle resources and allocates them to the map tasks of job j;
the node resource scheduler reports the task execution state to the central resource scheduler through heartbeat information; as map tasks finish executing, the central resource scheduler checks whether the number of completed map tasks exceeds the threshold W; if so, it begins executing the reduce tasks of job j and again checks the priority of job j: if job j has high priority, the node resource scheduler completes the job preemption; otherwise, job j waits for idle resources to be allocated.
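The admission part of this SLA policy (admit normally when the whole job fits; admit at high priority when only the map phase fits but its released resources cover the reduce phase; otherwise reject) can be sketched as follows, with all names being illustrative scalar simplifications:

```python
def sla_admit(cluster_free, job_need, map_need, map_release, reduce_need):
    """Sketch of the claim-9 admission flow.
    cluster_free: resources currently free in the cluster.
    job_need:     total resources job j requires.
    map_need:     resources its map phase requires.
    map_release:  resources freed once the map tasks finish.
    reduce_need:  resources its reduce phase requires."""
    if cluster_free >= job_need:
        return ("admit", "normal")
    if cluster_free >= map_need and map_release >= reduce_need:
        return ("admit", "high")      # high priority: entitled to preempt
    return ("reject", None)
```

The high-priority branch is what later licenses the scheduler to preempt resources for job j's map tasks instead of waiting for idle capacity.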
10. A MapReduce task scheduling system using the MapReduce task scheduling method of any one of claims 1 to 9, comprising: the distributed data center cluster comprises a central resource scheduler and a plurality of node resource schedulers;
a central task queue is maintained in the central resource scheduler; when a new job arrives, the central resource scheduler analyzes the job's characteristics to obtain its running time and deadline;
the node resource scheduler maintains a running-task queue and a suspended-task queue, both sorted by deadline, and continuously reports the task information and resource usage on its node to the central resource scheduler through a heartbeat mechanism.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111386374.8A CN114077486B (en) | 2021-11-22 | 2021-11-22 | MapReduce task scheduling method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111386374.8A CN114077486B (en) | 2021-11-22 | 2021-11-22 | MapReduce task scheduling method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114077486A true CN114077486A (en) | 2022-02-22 |
CN114077486B CN114077486B (en) | 2024-03-29 |
Family
ID=80284249
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111386374.8A Active CN114077486B (en) | 2021-11-22 | 2021-11-22 | MapReduce task scheduling method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114077486B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114860397A (en) * | 2022-04-14 | 2022-08-05 | 深圳清华大学研究院 | Task scheduling method, device and equipment |
WO2024103463A1 (en) * | 2022-11-18 | 2024-05-23 | 深圳先进技术研究院 | Elastic deep learning job scheduling method and system, and computer device |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104991830A (en) * | 2015-07-10 | 2015-10-21 | 山东大学 | YARN resource allocation and energy-saving scheduling method and system based on service level agreement |
WO2020248226A1 (en) * | 2019-06-13 | 2020-12-17 | 东北大学 | Initial hadoop computation task allocation method based on load prediction |
CN112395052A (en) * | 2020-12-03 | 2021-02-23 | 华中科技大学 | Container-based cluster resource management method and system for mixed load |
Non-Patent Citations (1)
Title |
---|
WANG Yuefeng; WANG Xibo: "Design of a locality scheduling algorithm integrating a preemptive scheduling strategy in a Hadoop cluster environment", Computer Science, no. 1, 31 December 2017 (2017-12-31), pages 567 - 570 *
Also Published As
Publication number | Publication date |
---|---|
CN114077486B (en) | 2024-03-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11099892B2 (en) | Utilization-aware resource scheduling in a distributed computing cluster | |
US9367357B2 (en) | Simultaneous scheduling of processes and offloading computation on many-core coprocessors | |
US11275609B2 (en) | Job distribution within a grid environment | |
Yao et al. | Haste: Hadoop yarn scheduling based on task-dependency and resource-demand | |
US20200174844A1 (en) | System and method for resource partitioning in distributed computing | |
US9152467B2 (en) | Method for simultaneous scheduling of processes and offloading computation on many-core coprocessors | |
US9141432B2 (en) | Dynamic pending job queue length for job distribution within a grid environment | |
Dumitrescu et al. | GRUBER: A Grid resource usage SLA broker | |
CN109564528B (en) | System and method for computing resource allocation in distributed computing | |
CN114077486B (en) | MapReduce task scheduling method and system | |
Sun et al. | Rose: Cluster resource scheduling via speculative over-subscription | |
CN111258745B (en) | Task processing method and device | |
CN109992418B (en) | SLA-aware resource priority scheduling method and system for multi-tenant big data platform | |
WO2024021489A1 (en) | Task scheduling method and apparatus, and kubernetes scheduler | |
CN112506634B (en) | Fairness operation scheduling method based on reservation mechanism | |
Dimopoulos et al. | Justice: A deadline-aware, fair-share resource allocator for implementing multi-analytics | |
Guo et al. | Delay-optimal scheduling of VMs in a queueing cloud computing system with heterogeneous workloads | |
CN116010064A (en) | DAG job scheduling and cluster management method, system and device | |
CN113391911B (en) | Dynamic scheduling method, device and equipment for big data resources | |
Gao et al. | Deadline-aware preemptive job scheduling in hadoop yarn clusters | |
KR20150089665A (en) | Appratus for workflow job scheduling | |
Nzanywayingoma et al. | Task scheduling and virtual resource optimising in Hadoop YARN-based cloud computing environment | |
Shrivastava et al. | CRI: A Novel Rating Based Leasing Policy and Algorithm for Efficient Resource Management in IaaS Clouds | |
CN118093096A (en) | Ultra-large scale cluster resource scheduling method, device and equipment | |
Louis et al. | A best effort heuristic algorithm for scheduling timely constrained tasks in the cloud |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||