CN114077486A - MapReduce task scheduling method and system - Google Patents


Info

Publication number
CN114077486A
Authority
CN
China
Prior art date
Legal status
Granted
Application number
CN202111386374.8A
Other languages
Chinese (zh)
Other versions
CN114077486B (en)
Inventor
高永强
张凯丰
Current Assignee
Inner Mongolia University
Original Assignee
Inner Mongolia University
Priority date
Filing date
Publication date
Application filed by Inner Mongolia University filed Critical Inner Mongolia University
Priority to CN202111386374.8A
Publication of CN114077486A
Application granted
Publication of CN114077486B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources to service a request
    • G06F9/5011 Allocation of resources to service a request, the resources being hardware resources other than CPUs, servers and terminals
    • G06F9/5016 Allocation of resources to service a request, the resource being the memory
    • G06F9/5027 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F9/5038 Allocation of resources to service a request, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F9/5044 Allocation of resources to service a request, considering hardware capabilities


Abstract

The invention provides a MapReduce task scheduling method and system. By introducing a preemption mechanism based on Docker containers, it remedies the drawback of Yarn's existing Kill-based preemption mechanism, which simply kills running tasks. The Docker-based mechanism releases the resources a task occupies while preserving its progress. Combined with a service-level-agreement-aware task policy, it allows high-priority tasks to preempt the running resources of other tasks and ensures that job completion times meet the Service Level Agreement (SLA) target.

Description

MapReduce task scheduling method and system
Technical Field
The invention relates to the technical field of task scheduling in a heterogeneous cluster environment, in particular to a MapReduce task scheduling method and system.
Background
With the development of Internet technology, the scale of data that must be computed and processed in daily production and life keeps growing, and distributed computing systems are widely used to process large-scale data. Within such systems the scheduler is a vital component: a well-designed scheduling strategy can match program demands to available cluster resources effectively and reduce the operating cost of a data center. The most widely used distributed computing framework today is Apache's flagship project Hadoop, whose programming model is MapReduce. Hadoop extracted its resource-management part into an independent framework, Yarn, a general-purpose resource management platform that provides the resources needed to run computing programs such as MapReduce.
Yarn currently implements three schedulers based on different scheduling strategies: FIFO, Capacity, and Fair. Although these three strategies can improve cluster utilization and optimize cluster performance to a certain extent, scheduling jobs with different resource requirements and QoS constraints in a complex heterogeneous cluster environment remains an urgent open problem. By completion time, jobs can be divided into short jobs and long jobs. Short jobs generally require low latency, while long jobs can tolerate higher latency but carry quality-of-service requirements. Short jobs therefore need to be scheduled immediately after submission to avoid queuing delay; for long jobs, the scheduler should let them use cluster resources whenever the cluster has free resources, which improves the cluster's resource utilization.
In real working environments, long and short jobs are usually mixed together for scheduling, and existing solutions either forcibly terminate a running long job to guarantee low latency for short jobs, or forbid resource preemption entirely to raise cluster resource utilization. Such simple scheduling strategies cannot handle jobs with different resource requirements in a complex heterogeneous environment. The goal is to trade off resource utilization against job queuing delay: to improve hardware resource utilization and efficiency while reducing job queuing delay as much as possible, thereby meeting the service level agreement.
To solve this problem, a brand-new scheduling strategy urgently needs to be developed to meet the needs of actual work.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a MapReduce task scheduling method and system for heterogeneous Yarn cluster environments, which maintain high cluster resource utilization while also providing low latency and immediate response for jobs.
The MapReduce task scheduling method provided by the invention comprises the following steps:
S1: the client creates a JobSubmitter instance, which computes the job's input splits with its internal methods, copies the resources needed to run the job into a distributed file system, and submits the MapReduce job to the resource scheduler;
S2: after receiving the job-submission message, the resource scheduler forwards the request to the central resource scheduler, which analyzes the job's details with an internal job-analysis method and derives the latest deadline that still meets the service level agreement;
S3: the central resource scheduler adds the new tasks to a central task queue and reorders all tasks by deadline, from nearest to farthest;
S4: the central resource scheduler receives heartbeat information from the node resource schedulers, obtains the number of tasks assigned to each of them, selects the node with the fewest tasks, and dispatches to it the task whose deadline is currently nearest for execution;
S5: after receiving the new task, the node resource scheduler adds it to a local task queue and reorders that queue by deadline;
S6: the node resource scheduler checks the new task's position in the queue; if the new task's deadline is nearer than that of the task currently executing, the new task preempts it.
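Steps S3 to S6 combine earliest-deadline-first ordering at the central scheduler with least-loaded node selection. The sketch below illustrates that core logic; the class and function names (`CentralScheduler`, `submit`, `dispatch`) are illustrative and not taken from the patent:

```python
import heapq

class CentralScheduler:
    """Keeps a central task queue ordered by deadline (S3) and
    dispatches to the node with the fewest assigned tasks (S4)."""

    def __init__(self):
        self.queue = []  # min-heap of (deadline, task_id)

    def submit(self, task_id, deadline):
        # S3: the heap keeps tasks ordered from nearest to farthest deadline
        heapq.heappush(self.queue, (deadline, task_id))

    def dispatch(self, node_task_counts):
        # S4: using heartbeat task counts, pick the least-loaded node and
        # hand it the task at the head of the deadline queue
        if not self.queue:
            return None
        node = min(node_task_counts, key=node_task_counts.get)
        deadline, task_id = heapq.heappop(self.queue)
        node_task_counts[node] += 1
        return node, task_id

sched = CentralScheduler()
sched.submit("t1", deadline=300)
sched.submit("t2", deadline=100)   # nearer deadline, dispatched first
nodes = {"n1": 4, "n2": 1}
print(sched.dispatch(nodes))       # ('n2', 't2')
```

The node-local reordering of S5/S6 would apply the same deadline ordering to each node's own queue.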
Further, in step S3, the central resource scheduler obtains the total amount of CPU resources C and the total amount of memory resources M, derives the job share of long jobs from the number of jobs, and periodically calculates the resource share of each job in the central task queue according to the fairness principle. (The two formulas appear in the source only as image placeholders.)
Further, in step S4, after receiving a job's resource request, the central resource scheduler analyzes whether the job can finish before its deadline, combining the deadline constraint, the cluster's resource situation, and the job's resource requirements. If it determines that the job can finish before its deadline, it adds the job to the central task queue; otherwise it refuses to execute the job.
Further, in step S4, when a job arrives, the central resource scheduler determines the current amount of cluster resources from the heartbeat information sent by each node resource scheduler and estimates the amount of resources the job requests from the history logs of previous runs; if the job has never run on the cluster, the scheduler executes it on a small part of the original data set as a pre-test set.
Further, in step S4, if the amount of resources the job requests does not exceed the amount available in the cluster, the central resource scheduler adds the job to the central task queue;
otherwise two cases must be distinguished: in one case, if the job, by directly preempting the resources of currently running jobs and executing immediately, can still finish in time, it is executed;
in the other case, the job cannot meet its deadline even after preempting other jobs' resources, and the central resource scheduler directly refuses to execute it.
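The admission decision described in the preceding paragraphs reduces to a three-way choice: enqueue, enqueue with preemption, or reject. The sketch below is an illustrative reading; the names and the simplified `can_meet_deadline` test are placeholders for the patent's deadline analysis, not its actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Job:
    requested: float     # total resources the job asks for
    deadline: float      # SLA deadline (seconds from now)
    est_runtime: float   # estimated runtime if started immediately

def can_meet_deadline(job: Job) -> bool:
    # Simplified stand-in for the scheduler's deadline analysis:
    # the job meets its SLA if running it right now finishes in time.
    return job.est_runtime <= job.deadline

def admit(job: Job, available: float) -> str:
    if job.requested <= available:
        return "enqueue"                  # fits in currently free resources
    if can_meet_deadline(job):
        return "enqueue-with-preemption"  # preempt running jobs, still on time
    return "reject"                       # even preemption cannot meet the SLA

print(admit(Job(requested=4,  deadline=60, est_runtime=50), available=8))  # enqueue
print(admit(Job(requested=16, deadline=60, est_runtime=50), available=8))  # enqueue-with-preemption
print(admit(Job(requested=16, deadline=40, est_runtime=50), available=8))  # reject
```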
Further, in step S4, the amount of resources to be preempted under the service level agreement is determined as follows. Once W map tasks have finished executing, the reduce tasks begin, and T_up denotes the upper bound on the execution time of those W map tasks, where M_avg is the average execution time of the map tasks in job j, N_j^map is the number of map tasks in job j, and M_max is the maximum execution time of a map task in job j. Suppose Q jobs can finish before the time bound T_up; the amount of resources released after these jobs complete is R, where j denotes a job and N_j^reduce denotes the number of reduce tasks of job j. The amount of resources required in the reduce phase is E, where C_r denotes the amount of resources available in the cluster at the current time and R_j^map denotes the amount of resources required by the map tasks of job j. (The formulas for T_up, R, and E appear in the source only as image placeholders.)
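Because the formulas for T_up, R, and E survive only as image placeholders, the exact expressions cannot be recovered from this text. The sketch below is therefore only one plausible reading of the surrounding prose: jobs expected to finish before the bound contribute their held resources to R, and whatever the reduce phase needs beyond free plus soon-released resources must be preempted. All names and the shortfall formula are assumptions, not the patent's equations:

```python
def resources_released_before(t_up, running_jobs):
    """R: resources freed by jobs expected to finish before the bound t_up.

    running_jobs: list of (expected_finish_time, held_resources) pairs.
    """
    return sum(held for finish, held in running_jobs if finish <= t_up)

def preemption_needed(reduce_demand, available, released):
    """Resources to preempt so the reduce phase can start on time: the part
    of the demand not covered by free plus soon-released resources."""
    return max(0.0, reduce_demand - (available + released))

running = [(30.0, 4.0), (80.0, 6.0), (50.0, 2.0)]
r = resources_released_before(60.0, running)   # jobs finishing by 60s free 4 + 2 = 6.0
print(preemption_needed(reduce_demand=12.0, available=3.0, released=r))  # 3.0
```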
Further, in step S6, when job preemption is required, the scheduler first calculates the extra resource share occupied by the job k to be preempted, Δ_k = U_k − F_k, where U_k denotes the resources job k actually occupies during execution and F_k denotes the amount of resources job k should receive under the fair resource-allocation principle; it then obtains the resource share R_j requested by the job j that must be executed. If R_j ≤ Δ_k, the resources to be preempted are simply R_j. (The symbols here stand in for formulas that appear in the source only as image placeholders.)
Further, in step S6, if the requested share R_j exceeds the extra share Δ_k that the job k to be preempted additionally occupies, the resources to be preempted are calculated by an algorithm: the CPU and memory resources are compared and classified into a main resource and a secondary resource, and the amount of the secondary resource to reclaim is derived from the reclamation of the main resource. Let C_j, M_j denote the amounts of CPU and memory resources requested by job j, and C_a, M_a the amounts of CPU and memory resources actually additionally occupied in the cluster by the current job k. If the CPU resource requested by job j is the main resource, all the CPU additionally occupied by job k is preempted and the memory additionally occupied by job k is preempted in proportion; otherwise memory is the main resource requested by job j, so all the memory additionally occupied by job k is preempted and the CPU additionally occupied by job k is preempted in proportion. (The dominance test and the proportional-reclamation formulas appear in the source only as image placeholders.)
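The main/secondary resource rule described above can be sketched as follows. Since the source shows the formulas only as image placeholders, the dominance test and the proportional reclamation below are one plausible interpretation; the symbols c_j, m_j, c_a, m_a mirror the text's Cj, Mj, Ca, Ma, and everything else is an assumption:

```python
def preempt_split(c_j, m_j, c_a, m_a):
    """How much CPU and memory to preempt from job k on behalf of job j.

    c_j, m_j: CPU / memory requested by job j (Cj, Mj in the patent).
    c_a, m_a: CPU / memory additionally occupied by job k (Ca, Ma).
    The dominance test and the proportionality rule are assumptions,
    since the patent's formulas are not reproduced in this text.
    """
    if c_j / c_a >= m_j / m_a:
        # CPU is job j's main resource: reclaim all the extra CPU, and
        # memory in proportion to job j's memory-per-CPU demand ratio.
        return c_a, min(m_a, c_a * m_j / c_j)
    # Memory is the main resource: reclaim all extra memory, CPU proportionally.
    return min(c_a, m_a * c_j / m_j), m_a

print(preempt_split(c_j=8, m_j=4, c_a=4, m_a=16))  # CPU dominant: (4, 2.0)
```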
Further, the MapReduce task scheduling method follows a scheduling policy based on the service level agreement, which comprises the following steps:
when a job j arrives, its deadline, required throughput, and required amount of resources are analyzed;
the central resource scheduler analyzes whether the cluster's current amount of resources meets job j's resource demand; if so, job j is added to the central task queue;
if not, it judges whether the cluster's resources can meet the resource demand of job j's map tasks, and whether the resources released after the map tasks finish can meet the resource demand of the reduce tasks;
if both conditions are met, job j is added to the central task queue and marked high priority, so that during execution it may preempt the resources of other jobs; if the two conditions cannot both be met, the central resource scheduler refuses to execute job j;
the central resource scheduler sorts the jobs in the central task queue by deadline and traverses them one by one. For a job j in the queue, it checks whether all of j's map tasks have finished; if not, it checks j's priority: if j is a high-priority job, the central resource scheduler immediately communicates with the node resource schedulers to preempt the designated resources from the cluster and execute j's map tasks, otherwise it waits for the cluster to free idle resources and allocates them to j's map tasks;
the node resource schedulers report task execution states to the central resource scheduler via heartbeat information. Once the map tasks have finished, the central resource scheduler checks whether the number of completed map tasks exceeds the threshold W; if so, it starts executing job j's reduce tasks and again checks j's priority: if j executes with priority, the node resource scheduler completes the job preemption, otherwise the job waits for idle resources to be allocated.
The invention also provides a MapReduce task scheduling system that applies the above MapReduce task scheduling method, comprising a distributed data-center cluster that contains a central resource scheduler and a plurality of node resource schedulers;
the central resource scheduler maintains a central task queue; when a new job arrives, it analyzes the job's characteristics to obtain its running time and deadline;
each node resource scheduler maintains a running-task queue and a paused-task queue, both sorted by deadline, and continuously reports the task information and resource usage on its node to the central resource scheduler through a heartbeat mechanism.
In the MapReduce task scheduling method for heterogeneous Yarn cluster environments provided by the invention, the central resource scheduler runs as a daemon on the ResourceManager. It receives task information from the node resource schedulers; periodically checks the current task scheduling policy, the resource availability of each worker node, and the resource demands of newly arrived tasks; infers which queues occupy surplus resources and which are under-allocated; calculates the amount of resources to preempt; derives the optimal resource allocation for the task queues in each time period; and sends the scheduling decisions to the node resource schedulers for execution.
The scheduling method introduces a Docker-container-based preemption mechanism that overcomes the drawback of Yarn's existing Kill-based mechanism, which simply kills tasks. The Docker-based mechanism releases the resources a task occupies while preserving its progress; combined with a service-level-agreement-aware task policy, it lets high-priority tasks preempt the running resources of other tasks and ensures that job completion times meet the Service Level Agreement (SLA) target.
Drawings
The invention will be described in more detail hereinafter on the basis of embodiments and with reference to the accompanying drawings. Wherein:
FIG. 1 is a schematic flow chart of a MapReduce task scheduling method in the present invention;
FIG. 2 is a system architecture diagram of a MapReduce task scheduling method in the present invention;
FIG. 3 is a deployment diagram of an example of the MapReduce task scheduling method in the present invention.
Detailed Description
In order to clearly illustrate the inventive content of the present invention, the present invention will be described below with reference to examples.
In the description of the present invention, it should be noted that the terms "upper", "lower", "horizontal", "top", "bottom", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus should not be construed as limiting the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Referring to FIGS. 1-3, the central resource scheduler in the present invention is a daemon running on the ResourceManager. It receives task information from the node resource schedulers, periodically checks the current task scheduling policy and the resource availability of each worker node, and acquires the resource demands of newly arrived tasks, so as to infer which queues occupy surplus resources and which are under-allocated; it then calculates the amount of resources to preempt, derives the optimal resource allocation for the task queues in each time period, and sends the scheduling decisions to the node resource schedulers for execution.
The node resource scheduler is a daemon running on a worker NodeManager. It integrates Docker containers with the Yarn framework, solving the problem that native Yarn preemption simply kills a task's container. After receiving a task request, the node resource scheduler loads the task into a Docker container and configures the container according to the task's resource request. The node resource scheduler is also responsible for container suspension and container resumption, pausing or resuming containers and reclaiming their resources as needed.
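The key difference from Yarn's Kill-based preemption is that freezing a container (for example with the real Docker CLI commands `docker pause` / `docker unpause`, which use the cgroup freezer) releases CPU while keeping the task's state. The toy model below only illustrates this effect on task progress; it is not the patent's implementation:

```python
class Task:
    """A running task with some completed progress."""
    def __init__(self):
        self.progress = 0.6      # fraction of work already done
        self.state = "running"

def kill_preempt(task):
    # Yarn's Kill-based preemption: resources are freed, progress is lost.
    task.state = "killed"
    task.progress = 0.0

def pause_preempt(task):
    # Container-based preemption (cf. `docker pause`, which freezes the
    # container's processes): resources are released, progress is kept.
    task.state = "paused"

def resume(task):
    # cf. `docker unpause`: the task continues where it stopped.
    if task.state == "paused":
        task.state = "running"

a, b = Task(), Task()
kill_preempt(a)                 # restarting a would redo all of its work
pause_preempt(b); resume(b)     # b continues from 60% completion
print(a.progress, b.progress)   # 0.0 0.6
```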
In actual job scheduling, which job to preempt must be decided before the preemption operation; the invention designs a preemptive job-scheduling strategy that preserves QoS to the greatest extent while satisfying the SLA service level agreement. The idea of the strategy is to execute the job with the earliest deadline first, which minimizes the number of jobs missing their deadlines and greatly improves execution outcomes. Specifically, after receiving a job's resource request, the central resource scheduler analyzes whether the job can finish before its deadline, combining the deadline constraint, the cluster's resource status, and the job's resource requirements. If the central resource scheduler determines that a job can finish before its deadline, it adds the job to the job queue; otherwise it refuses to execute the job.
When a job j arrives, the central resource scheduler determines the current amount of cluster resources from the heartbeat information sent by each node resource scheduler and estimates the amount of resources job j requests from the history logs of previous runs; if job j has never run on the cluster, the scheduler executes it on a small part of the original data set as a pre-test set, so as to obtain the job's performance indicators.
Let R_j^total denote the total amount of resources required by job j, R_j^map the amount required by job j's map tasks, and R_j^reduce the amount required by job j's reduce tasks; C_r denotes the amount of resources available in the cluster at the current time. If the amount of resources the job requests does not exceed the amount available in the cluster, i.e. R_j^total ≤ C_r, the central resource scheduler adds job j to the job queue. Otherwise (R_j^total > C_r), two cases must be distinguished: if job j, by directly preempting the resources of the currently running job k and executing immediately, can still finish in time, job j is executed; alternatively, if job j cannot meet its deadline even after preempting other jobs' resources, the central resource scheduler directly refuses to execute job j.
The MapReduce task scheduling method follows a scheduling strategy based on the Service Level Agreement (SLA); the specific deployment proceeds in the following steps:
Step 1: in the Yarn cluster, the ResourceManager node is integrated with the central resource scheduler provided by the invention, and the remaining NodeManager nodes are integrated with the node resource schedulers provided by the invention.
Step 2: when a user submits a batch of jobs to the cluster, the central resource scheduler analyzes each job: the size of its input data, the amount of CPU, memory, and other resources it requires, and the deadline the user has specified.
Step 3: the central resource scheduler collects the node state information sent by each node resource scheduler and tallies the execution progress of the currently running jobs and the utilization of the various resources in the cluster.
Step 4: the central resource scheduler combines the cluster's currently available resources with the characteristics of the jobs to be executed and analyzes whether the current cluster resources can meet job j's resource demand; if so, job j is added to the queue of jobs to be executed. Otherwise it judges whether the cluster resources can meet the resource demand of job j's map tasks, and whether the resources released after the map tasks finish can meet the resource demand of the reduce tasks; if both conditions are met, job j is added to the job queue and set to high priority.
Step 5: the central resource scheduler sorts the jobs in the job queue by deadline and traverses them one by one. For a job j in the queue, it checks whether all of j's map tasks have finished; if not, it checks j's priority: if j is a high-priority job, the central resource scheduler immediately communicates with the node resource schedulers to preempt the designated resources from the cluster and execute j's map tasks; otherwise it waits for the cluster to free idle resources and allocates them to j's map tasks.
Step 6: the node resource schedulers report task execution states to the central resource scheduler through heartbeat information. Once the map tasks have finished, the central resource scheduler checks whether the number of completed map tasks exceeds the threshold W. If it does, the reduce tasks of job j start executing and j's priority is checked again: if the job executes with priority, the preemption is completed together with the node resource scheduler; otherwise the job waits for idle resources to be allocated.
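Steps 5 and 6 gate the reduce phase on a threshold W of completed map tasks and treat high-priority jobs differently from ordinary ones. A compact sketch of that decision (all names are illustrative, and the logic is a simplification of the steps above):

```python
def next_action(maps_done, maps_total, w, high_priority, idle):
    """Scheduler's next move for one job, per steps 5-6 (names illustrative).

    Below the threshold w of finished map tasks the job stays in its map
    phase; once w map tasks are done, the reduce phase may start.  In either
    phase a high-priority job preempts resources; others wait for idle ones.
    """
    if maps_done < maps_total and maps_done < w:
        # still in the map phase, below the reduce threshold
        if high_priority:
            return "preempt-for-map"
        return "run-map" if idle else "wait-for-map-resources"
    # at least w map tasks finished: the reduce phase may begin
    if high_priority:
        return "preempt-for-reduce"
    return "run-reduce" if idle else "wait-for-reduce-resources"

print(next_action(maps_done=2, maps_total=10, w=5, high_priority=True, idle=False))   # preempt-for-map
print(next_action(maps_done=6, maps_total=10, w=5, high_priority=False, idle=True))   # run-reduce
```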
Based on its Service Level Agreement (SLA) scheduling strategy, the method enables high-priority tasks to preempt the running resources of other tasks and ensures that job completion times reach the SLA target. The MapReduce task scheduling system balances resource utilization against job queuing delay, effectively improving hardware resource utilization and efficiency while greatly reducing job queuing delay, thereby achieving the service-level-agreement target.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present description refers to embodiments, not every embodiment may contain only a single embodiment, and such description is for clarity only, and those skilled in the art should integrate the description, and the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.

Claims (10)

1. A MapReduce task scheduling method is characterized by comprising the following steps:
s1: the client creates a JobSummiter instance, calculates the input fragment of the job by an internal method of the JobSummiter, copies the resource required by the job operation into a distributed file system, and submits the MapReduce job to a resource scheduler Resourcemanager;
s2: after receiving the job submission message, the resource scheduler transmits the request message to the central resource scheduler, and the central resource scheduler analyzes the detailed information related to the job by an internal job analysis method and analyzes the latest deadline required by reaching a service level agreement;
s3: the central resource scheduler adds the new tasks into a central task queue, and reorders all the tasks from near to far according to the deadline of each task according to the different deadline dates;
s4: the central resource scheduler receives the heartbeat information from the node resource scheduler, obtains the number of tasks allocated by each node resource scheduler, sequentially selects the node with the minimum task number from the number of tasks, and assigns the task with the latest current deadline to be executed in the past;
s5: after receiving the new task, the node resource scheduler adds the new task into a local task queue and reorders the task queue according to the deadline;
s6: the node resource scheduler checks the position of the new task in the task queue and if the deadline of the new task is closer than the task being executed, the new task preempts the task being executed.
2. The MapReduce task scheduling method of claim 1, wherein in step S2, the central resource scheduler obtains the total amount of CPU resources C and the total amount of memory resources M, and obtains the job share of long jobs according to the number of jobs [formula shown as an image in the original]; according to the fairness principle, it periodically computes the resource share of each job in the central task queue [formulas shown as images in the original].
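Since the claimed formulas appear only as images, the following sketch assumes the simplest reading of the fairness principle, an even division of the cluster totals C and M over the jobs in the central queue; the actual claimed formulas may differ:

```python
def fair_share(total_cpu, total_mem, n_jobs):
    # Assumed reading of claim 2: cluster totals C and M divided evenly
    # over the n jobs currently in the central task queue.
    if n_jobs == 0:
        return (0.0, 0.0)
    return (total_cpu / n_jobs, total_mem / n_jobs)
```
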
3. The MapReduce task scheduling method according to claim 2, wherein in step S3, after receiving a resource request of a job, the central resource scheduler analyzes whether the job can be completed before its deadline by combining the deadline constraint, the resource condition of the cluster, and the resource requirement of the job; if the central resource scheduler determines that the job can be completed before the deadline, it adds the job to the central task queue; otherwise, it refuses to execute the job.
4. The MapReduce task scheduling method according to claim 1, wherein in step S4, when a job arrives, the central resource scheduler determines the current amount of cluster resources from the heartbeat information sent by each node resource scheduler and estimates the amount of resources the job requests from the history log of previous runs of the job; if the job has not been executed on the cluster before, the scheduler runs the job on a small portion of the original data set as a pre-test set.
5. The MapReduce task scheduling method of claim 4, wherein in step S4, if the resource amount of the job request does not exceed the available resource amount in the cluster, the central resource scheduler adds the job to a central task queue;
otherwise, two cases are distinguished: in one case, if the job, executed immediately by directly preempting the resources of currently running jobs, can still be completed on time, the job is executed;
in the other case, the job cannot meet its deadline even if it preempts the resources of other jobs, and the central resource scheduler directly refuses to execute it.
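The claim-4/claim-5 admission decision can be sketched as a three-way branch; the scalar `job_demand` and the boolean `can_finish_with_preemption` are illustrative stand-ins for the analysis described above:

```python
def admit(job_demand, available, can_finish_with_preemption):
    # Sketch of the claim-5 admission decision; names are illustrative.
    if job_demand <= available:
        return "enqueue"                  # fits into idle cluster resources
    if can_finish_with_preemption:
        return "enqueue-with-preemption"  # preempt running jobs, still on time
    return "reject"                       # deadline unreachable even with preemption
```
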
6. The MapReduce task scheduling method of claim 4, wherein in step S4, the amount of resources preempted based on the service level agreement is determined by the following scheme:
when the W map tasks have finished executing, the reduce tasks start executing; with T_up denoting the upper limit of the execution time of the W map tasks, the following can be obtained:
[formula shown as an image in the original]
where M_avg is the average execution time of the map tasks in job j, N_m^j is the number of map tasks in job j, and M_max is the maximum execution time of the map tasks in job j (symbols that appear only as images in the original are written here as N_m^j, N_r^j and R_m^j); suppose there are Q jobs that can finish before the time upper limit T_up, and the amount of resources released after these jobs complete is R, whose value can be calculated by the following formula:
[formula shown as an image in the original]
where j denotes a job and N_r^j denotes the number of reduce tasks of job j;
the amount of resources required in the reduce phase is E, and the value of E can be calculated by the following formula:
[formula shown as an image in the original]
where C_r denotes the amount of resources available in the cluster at the current time and R_m^j denotes the amount of resources required by the map tasks of job j.
7. The MapReduce task scheduling method of claim 6, wherein in step S6, when job preemption is required, the method calculates the extra resource share occupied by the job k to be preempted:
[formula shown as an image in the original]
where the two terms denote, respectively, the resources actually occupied by job k during execution and the amount of resources job k should receive under the fair resource allocation principle; the method then obtains the resource share requested by the job j that needs to be executed; if the condition [shown as an image in the original] holds, the resources that need to be preempted can be calculated:
[formula shown as an image in the original]
8. The MapReduce task scheduling method of claim 7, wherein in step S6, if the condition [shown as an image in the original] holds, the resources that need to be preempted are then calculated by the algorithm. The calculation of the resources that need to be preempted includes: comparing the CPU resources and the memory resources and dividing them into a primary resource and a secondary resource, then deriving the amount of the secondary resource to reclaim from the amount of the primary resource reclaimed; the calculation formulas include:
[formulas shown as images in the original]
where C_j and M_j respectively denote the amount of CPU resources and the amount of memory resources requested by job j, C_a and M_a respectively denote the amount of CPU resources and the amount of memory resources actually additionally occupied by the current job k in the cluster, and a symbol [shown as an image in the original] denotes the amount of resources that need to be preempted; if the condition [shown as an image in the original] holds, the CPU resources requested by job j are the primary resource, all CPU resources additionally occupied by job k are preempted, and the memory resources additionally occupied by job k are preempted in proportion; otherwise, memory is regarded as the primary resource requested by job j, all memory resources additionally occupied by job k are preempted, and the CPU resources additionally occupied by job k are preempted in proportion.
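Because the comparison and proportionality formulas in claim 8 appear only as images, the sketch below assumes one plausible reading: the primary resource is the one job j demands more of relative to job k's surplus, all of the primary surplus is preempted, and the secondary resource is preempted in job j's requested CPU:memory proportion. The actual claimed formulas may differ:

```python
def preemption_amounts(cj, mj, ca, ma):
    # cj, mj: CPU / memory requested by job j.
    # ca, ma: CPU / memory additionally occupied by job k.
    # Assumption: primary resource = the one job j demands more of relative
    # to job k's surplus; take all of the primary surplus, and take the
    # secondary resource in job j's requested CPU:memory proportion.
    if cj / ca >= mj / ma:                   # CPU is the primary resource
        return ca, min(ma, ca * mj / cj)
    return min(ca, ma * cj / mj), ma         # memory is the primary resource
```
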
9. The MapReduce task scheduling method of claim 8, wherein the MapReduce task scheduling method is performed according to a service level agreement-based scheduling policy, and the service level agreement-based scheduling policy comprises:
when job j arrives, analyzing the expiration date of the job, the required throughput and the required resource amount;
the central resource scheduler analyzes whether the current resource amount of the cluster meets the resource demand of job j, and if so, adds job j to the central task queue;
if not, judging whether the resource quantity of the cluster can meet the resource demand quantity of the map task of the job j or not, and whether the resource released after the execution of the map task can meet the resource demand quantity of the reduce task or not;
if the two conditions are met, adding the job j into the central task queue, and marking the job j as a high priority, so that the job j can occupy the resources of other jobs in the execution process; if the two conditions cannot be met simultaneously, the central resource scheduler refuses to execute the job j;
the central resource scheduler sorts the jobs in the central task queue according to the deadline, and performs traversal processing on each job respectively; for the job j in the central task queue, the central resource scheduler judges whether the map task of the job j is completely executed or not, if not, the priority of the job j is judged, if the job is a high-priority job, the central resource scheduler immediately communicates with the node resource scheduler to preempt the appointed resource from the cluster so as to execute the map task of the job j, otherwise, the central resource scheduler waits for the cluster to generate idle resources and allocates the idle resources to the map task of the job j;
the node resource scheduler reports the task execution state to the central resource scheduler through heartbeat information; as map tasks finish executing, the central resource scheduler judges whether the number of executed map tasks exceeds the threshold W; if so, it starts executing the reduce tasks of job j and likewise judges the priority of job j: if it is a high-priority job, job preemption is completed with the node resource scheduler; otherwise the job waits for idle resources to be allocated.
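A hedged sketch of the claim-9 admission rules, treating resources as a single scalar amount for simplicity (parameter names are illustrative, and the reduce-phase check is one possible reading of the released-resource condition):

```python
def sla_admission(job_demand, map_demand, reduce_demand, cluster_free,
                  released_by_maps):
    # Claim-9 sketch: admit normally if the whole demand fits; otherwise
    # admit as high priority if the map phase fits now and the resources
    # released by finished map tasks cover the reduce phase; else reject.
    if job_demand <= cluster_free:
        return ("enqueue", "normal")
    if map_demand <= cluster_free and reduce_demand <= released_by_maps:
        return ("enqueue", "high")   # may preempt other jobs while running
    return ("reject", None)
```
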
10. A MapReduce task scheduling system using the MapReduce task scheduling method of any one of claims 1 to 9, comprising: the distributed data center cluster comprises a central resource scheduler and a plurality of node resource schedulers;
a central task queue is maintained in the central resource scheduler, and when a new job arrives, the central resource scheduler analyzes job characteristics to obtain the operation time and the deadline of the job;
the node resource scheduler maintains a running task queue and a paused task queue, both sorted by deadline, and continuously reports the task information and resource usage on its node to the central resource scheduler through a heartbeat mechanism.
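The claim-10 node-scheduler state might be sketched as follows; the field and payload names are assumptions, not the patent's:

```python
from dataclasses import dataclass, field

@dataclass
class NodeState:
    # Claim-10 sketch: a running queue and a paused queue, both kept
    # sorted by deadline; entries are (deadline, task) pairs.
    running: list = field(default_factory=list)
    paused: list = field(default_factory=list)

    def heartbeat(self):
        # Payload periodically reported to the central resource scheduler.
        self.running.sort()
        self.paused.sort()
        return {"running": len(self.running), "paused": len(self.paused)}
```
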
CN202111386374.8A 2021-11-22 2021-11-22 MapReduce task scheduling method and system Active CN114077486B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111386374.8A CN114077486B (en) 2021-11-22 2021-11-22 MapReduce task scheduling method and system


Publications (2)

Publication Number Publication Date
CN114077486A true CN114077486A (en) 2022-02-22
CN114077486B CN114077486B (en) 2024-03-29

Family

ID=80284249

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111386374.8A Active CN114077486B (en) 2021-11-22 2021-11-22 MapReduce task scheduling method and system

Country Status (1)

Country Link
CN (1) CN114077486B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114860397A (en) * 2022-04-14 2022-08-05 深圳清华大学研究院 Task scheduling method, device and equipment
WO2024103463A1 (en) * 2022-11-18 2024-05-23 深圳先进技术研究院 Elastic deep learning job scheduling method and system, and computer device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104991830A (en) * 2015-07-10 2015-10-21 山东大学 YARN resource allocation and energy-saving scheduling method and system based on service level agreement
WO2020248226A1 (en) * 2019-06-13 2020-12-17 东北大学 Initial hadoop computation task allocation method based on load prediction
CN112395052A (en) * 2020-12-03 2021-02-23 华中科技大学 Container-based cluster resource management method and system for mixed load


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Yuefeng; Wang Xibo: "Design of a locality scheduling algorithm integrating a preemptive scheduling policy in a Hadoop cluster environment", Computer Science, no. 1, 31 December 2017 (2017-12-31), pages 567-570 *


Also Published As

Publication number Publication date
CN114077486B (en) 2024-03-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant