CN108428051B - MapReduce job scheduling method and device facing big data platform and based on maximized benefits - Google Patents

Info

Publication number
CN108428051B
Authority
CN
China
Prior art keywords
job
reward
maximum
time
punishment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810172166.XA
Other languages
Chinese (zh)
Other versions
CN108428051A (en)
Inventor
史玉良
胡静
李庆忠
孔兰菊
闫中敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN201810172166.XA priority Critical patent/CN108428051B/en
Publication of CN108428051A publication Critical patent/CN108428051A/en
Application granted granted Critical
Publication of CN108428051B publication Critical patent/CN108428051B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06312Adjustment or analysis of established resource schedule, e.g. resource or task levelling, or dynamic rescheduling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207Discounts or incentives, e.g. coupons or rebates

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a MapReduce job scheduling method and device under a cloud service provider reward and penalty revenue model. The method comprises the following steps: receiving jobs submitted by users, and acquiring, for each job, the per-round execution time of its Map tasks and Reduce tasks and the number of those tasks; determining, according to the Map and Reduce task execution times and task counts of each job and the reward and penalty revenue model, a maximum round number combination scheme set and a maximum standard time for each job at each reward and penalty stage; and acquiring a job scheduling strategy based on the maximum round number scheme of each job according to the reward and penalty revenue model. With the goal of maximizing the service provider's profit, the invention measures and evaluates the revenue and compensation cost of each job, and thereby meets the job scheduling goals of maximum profit for the service provider, maximum resource utilization for the platform, and shortest job completion time.

Description

MapReduce job scheduling method and device facing big data platform and based on maximized benefits
Technical Field
The invention belongs to the field of job scheduling optimization for cloud platforms, and particularly relates to a MapReduce job scheduling method and device under a cloud service provider reward and penalty revenue model.
Background
In recent years, as data of all kinds has grown explosively, the demand for more efficient analysis and processing of massive data has become increasingly urgent. Traditional data processing techniques and tools can no longer meet current analysis and processing requirements, and emerging big data computing platforms provide powerful support for addressing the new requirements. Because efficient processing of massive data conflicts with processing cost, big data computing platform service providers have emerged to offer users convenient, low-cost computing services. A service provider builds a public big data computing platform on existing big data technology; a user only needs to submit a self-defined job to the platform, and the specific details of the service are set out in a Service Level Agreement (SLA) signed by the service provider and the user. Generally, the SLA defines the service type, the service quality, and the revenue model. Under the common revenue model, the user pays according to the completion result or the completion time limit: the user states a required completion time when submitting the job, and the service provider receives the corresponding payment only if it completes the job within the specified time; otherwise, the service provider compensates the user as stipulated in the agreement.
However, when multiple users share platform resources, maximizing platform resource utilization conflicts with best satisfying the users' QoS requirements. As a result, the service provider cannot obtain the maximum profit and platform resource utilization drops, so it is very important for the platform service provider to formulate an efficient job scheduling strategy.
The research object here is mainly a large number of offline jobs with deadlines, submitted by multiple users under the platform's existing computing resources. Under the original revenue model, a job submitted by a user includes the following parts: 1) the user-defined application program, i.e. the specific content of the submitted job; 2) the deadline of the job, i.e. the user's requirement for its final completion time; 3) the revenue the service provider obtains for completing the job within the specified time; 4) the compensation, proportional to that revenue, that the service provider pays when the job completion time exceeds the deadline. The original revenue model and existing job scheduling strategies consider neither how accurately the user has set the job deadline nor how urgently the user needs the job executed, nor do they consider the influence of the scheduling result on platform resource utilization.
How to formulate an efficient job scheduling strategy in a given big data computing resource environment, so that the service provider obtains the maximum profit while an accurate completion time is determined for each user job and the requirement of maximum platform resource utilization is met, still lacks an effective solution.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a MapReduce job scheduling method under a cloud service provider reward and penalty revenue model, built on the Yarn resource management system of Hadoop 2.x. With the goal of maximizing the service provider's profit, the revenue and compensation cost of each job are measured and evaluated, and after weighing the overall profit, jobs with low revenue and low compensation cost are selected to be abandoned, so that the job scheduling goals of maximum profit for the service provider, maximum resource utilization for the platform, and shortest job completion time are met. The method generates a corresponding job scheduling strategy for the jobs submitted by all current users according to the job information submitted by the users and the resource information available in the cluster. The specific task allocation still follows the dynamic allocation principle of MapReduce, so load balancing and other performance characteristics of MapReduce are not affected.
In order to achieve the purpose, the invention adopts the following technical scheme:
a MapReduce job scheduling method under a cloud service provider reward and penalty revenue model comprises the following steps:
receiving jobs submitted by a user, and acquiring the execution time of each round of Map tasks and Reduce tasks of each job and the number of the tasks;
according to the Map and Reduce task execution time and the task quantity of each operation and a reward and punishment gain mode, determining a maximum round number combination scheme set and a maximum standard time of each operation at different reward and punishment stages;
and acquiring a job scheduling strategy based on the maximum round number scheme of each job according to the reward and punishment profit mode.
Further, the method for determining the task execution time and the number corresponding to the job comprises the following steps: after receiving the jobs submitted by the users, the jobs are pre-executed, and the execution time and the corresponding number of the tasks are determined.
Further, the operation reward and punishment gain mode is as follows:
Figure BDA0001586171920000021
Figure BDA0001586171920000022
wherein j_profit represents the revenue of a job completed before its deadline, and f(j_profit) is the reward and penalty coexistence function, representing the additional reward or penalty; j_Nm and j_Nr respectively represent the number of Map tasks and the number of Reduce tasks of the job; R is the total computing resources available in the platform, and J is the total number of jobs to be scheduled.
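Read together with the definitions above, the objective behind the first formula can plausibly be written as below. The formula images are not reproduced in this text, so this is a reconstruction from the prose only: the symbol P for the total revenue is an assumption, the job's on-time revenue is written as j_profit, and the resource constraint is left in words because its exact form cannot be recovered.

```latex
% Assumed reading of the objective: total revenue over all J jobs, to be maximized.
P \;=\; \sum_{j=1}^{J} \Bigl( \mathit{j\_profit} + f(\mathit{j\_profit}) \Bigr)
% subject to: the resources occupied by tasks running in parallel never exceed R.
```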
Further, according to the reward and penalty segment ranges of the reward and penalty coexistence function, the set T of maximum standard times is obtained:
$$T = \{\, \mathit{j\_deadline}/a,\ \mathit{j\_deadline},\ b \times \mathit{j\_deadline} \,\}$$
When t_standard = j_deadline / a, the corresponding reward is obtained once job j is completed; when t_standard = j_deadline, the reward for completing job j is 0; when t_standard = b × j_deadline, the penalty after job j is completed reaches its maximum value.
Further, the method for determining the maximum round number combination scheme set of each job at different reward and punishment stages includes:
calculating the minimum execution round number of Map and Reduce tasks of each job, the shortest execution time of each round of tasks and the maximum standard time at each reward and punishment stage according to the known information of the jobs;
and obtaining a maximum round number combination scheme set of the operation according to the shortest execution time of each round of the task and the maximum standard time of each reward and punishment stage.
Further, the method for calculating the minimum number of execution rounds of the task comprises the following steps:
$$\mathit{RN\_m} = \left\lceil \frac{\mathit{j\_Nm}}{R} \right\rceil$$
$$\mathit{RN\_r} = \left\lceil \frac{\mathit{j\_Nr}}{R} \right\rceil$$
$$t_{least} = \mathit{RN\_m} \times \mathit{j\_Tm} + \mathit{RN\_r} \times \mathit{j\_Tr}$$
wherein, assuming all dynamically allocatable resources are allocated to a single job, RN_m is the minimum number of execution rounds of the Map tasks of job j; RN_r is the minimum number of execution rounds of its Reduce tasks; and t_least is the shortest execution time of job j.
Further, the method for determining the maximum round number combination scheme set at each reward and punishment stage of the operation comprises the following steps:
for each reward and penalty stage, if the maximum standard time is greater than or equal to the shortest execution time of the job, acquiring a round number combination scheme A of the Map and Reduce tasks;
judging whether the remaining time under scheme A is greater than or equal to the execution time of k rounds of Map tasks; if so, obtaining a round number combination scheme B and carrying out the next step with the remaining time under schemes A and B; otherwise, obtaining the round number combination scheme set of the reward and penalty stage, plan_jm_t = {A, B};
further judging whether the remaining time under schemes A and B is greater than or equal to the execution time of i rounds of Reduce tasks; if so, obtaining a round number combination scheme C, the schemes A, B and C forming the round number combination scheme set of the reward and penalty stage, plan_jm_t = {A, B, C};
judging whether the remaining time under scheme A is greater than or equal to the execution time of i rounds of Reduce tasks; if so, obtaining a round number combination scheme D and carrying out the next step; otherwise, obtaining the round number combination scheme set of the reward and penalty stage, plan_jr_t = {A, D};
further judging whether the remaining time under schemes A and D is greater than or equal to the execution time of k rounds of Map tasks; if so, obtaining a round number combination scheme E, giving the round number combination scheme set of the reward and penalty stage, plan_jr_t = {A, D, E};
the round number combination scheme set of the job is Plan_j_t = {A} ∪ plan_jm_t ∪ plan_jr_t;
and, on the principle that the number of rounds does not exceed the number of tasks within the same reward and penalty stage, taking the maximum values of the parameters i and k to obtain the maximum round number combination scheme set.
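The enumeration of schemes A to E can be sketched as follows. This is one reading of the steps above, not the patented procedure itself: the if/otherwise wording in the text is ambiguous, so the sketch simply adds a scheme whenever at least one extra round fits. The function and parameter names are hypothetical, and i and k are taken as the largest values allowed by the remaining time and by the rule that rounds may not exceed the task count.

```python
def round_schemes(t_standard, rn_m, rn_r, t_map, t_reduce, n_map, n_reduce):
    """Return the set of (map_rounds, reduce_rounds) schemes for one stage.

    rn_m / rn_r are the minimum round counts; t_map / t_reduce are the
    per-round execution times; n_map / n_reduce are the task counts.
    """
    t_least = rn_m * t_map + rn_r * t_reduce
    if t_standard < t_least:
        return set()  # the job cannot finish within this stage at all

    spare = t_standard - t_least
    schemes = {(rn_m, rn_r)}  # scheme A: minimum rounds for both phases

    def extra_rounds(spare_time, per_round, used_rounds, n_tasks):
        # Largest number of extra rounds that fits in the spare time
        # without letting the round count exceed the task count.
        return max(0, min(int(spare_time // per_round), n_tasks - used_rounds))

    # Spend spare time on Map rounds first (scheme B), then Reduce (scheme C).
    k = extra_rounds(spare, t_map, rn_m, n_map)
    if k > 0:
        schemes.add((rn_m + k, rn_r))
        i = extra_rounds(spare - k * t_map, t_reduce, rn_r, n_reduce)
        if i > 0:
            schemes.add((rn_m + k, rn_r + i))

    # Spend spare time on Reduce rounds first (scheme D), then Map (scheme E).
    i = extra_rounds(spare, t_reduce, rn_r, n_reduce)
    if i > 0:
        schemes.add((rn_m, rn_r + i))
        k = extra_rounds(spare - i * t_reduce, t_map, rn_m, n_map)
        if k > 0:
            schemes.add((rn_m + k, rn_r + i))

    return schemes
```

Using more rounds than the minimum means each round needs fewer containers, which is what leaves room for other jobs to run alongside.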
Further, the job scheduling method includes:
determining the jobs to be scheduled according to the maximum reward value of all jobs at j_t_standard = j_deadline/a, and selecting the corresponding maximum round number combination scheme set;
in the job scheduling process, when the resource utilization rate is greater than a given threshold value, resources are idled until the execution of the previous task is completed, and a task with the maximum local profit is selected from the rest tasks for serial scheduling;
when the resource utilization rate is less than a given threshold value, executing the task to be scheduled and the previous task in parallel;
and adding all the jobs into a scheduling queue, and calculating a global profit value according to a task round number combination scheme corresponding to each job, wherein the combination scheme with the maximum global profit value is a scheduling strategy.
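Two of the decisions in the steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the scheduler described in the patent: the names `next_step` and `best_combination` are hypothetical, the local profit of each pending job is assumed to be precomputed, the case where utilization equals the threshold is treated as parallel, and the global profit is supplied by the caller as a function of the chosen schemes.

```python
from itertools import product


def next_step(utilization, threshold, pending, local_profit):
    """Decide how to place the next job; `pending` must be non-empty.

    Above the threshold: wait for the previous task and pick the pending
    job with the largest local profit (serial scheduling).
    Otherwise: run the next pending job alongside the previous task.
    """
    if utilization > threshold:
        chosen = max(pending, key=local_profit.__getitem__)
        return ("serial", chosen)
    return ("parallel", pending[0])


def best_combination(candidates, global_profit):
    """Brute-force the scheme choice with the largest global profit.

    candidates: dict mapping job name -> list of candidate round schemes.
    global_profit: callable taking {job: scheme} and returning a number.
    Returns (plan, value) for the best plan found.
    """
    jobs = sorted(candidates)
    best = None
    for choice in product(*(candidates[job] for job in jobs)):
        plan = dict(zip(jobs, choice))
        value = global_profit(plan)
        if best is None or value > best[1]:
            best = (plan, value)
    return best
```

The exhaustive search grows exponentially with the number of jobs, so it only serves to make the "largest global profit wins" rule concrete.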
Further, when the job with the maximum local profit is selected, the reward or penalty value that each job can obtain is compared on the basis of the difference between each job's deadline range and the completion time of the previous task, and the job that maximizes the local profit, together with its task round number combination scheme, is selected.
Further, before the selected job is added to the scheduling queue, the starting time point at which the previous task leaves resources vacant is recorded and compared with the maximum standard time of each job. When the remaining resources can satisfy the maximum round number of the tasks within the maximum standard time range, the job with the maximum reward and penalty value is selected and added to the scheduling queue, and the job with the maximum penalty value is placed last in the scheduling queue.
According to a second object of the present invention, the present invention further provides a MapReduce job scheduling optimization apparatus in a cloud service reward and penalty mode, including a memory, a processor, and a computer program stored on the memory and operable on the processor, where the processor implements the following steps when executing the program, including:
receiving jobs submitted by a user, and acquiring the execution time of each round of Map tasks and Reduce tasks of each job and the number of the tasks;
according to the Map and Reduce task execution time and the task quantity of each operation and a reward and punishment gain mode, determining a maximum round number combination scheme set and a maximum standard time of each operation at different reward and punishment stages;
and acquiring a job scheduling strategy based on the maximum round number scheme of each job according to the reward and punishment profit mode.
According to a third object of the present invention, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
receiving jobs submitted by a user, and acquiring the execution time of each round of Map tasks and Reduce tasks of each job and the number of the tasks;
according to the Map and Reduce task execution time and the task quantity of each operation and a reward and punishment gain mode, determining a maximum round number combination scheme set and a maximum standard time of each operation at different reward and punishment stages;
and acquiring a job scheduling strategy based on the maximum round number scheme of each job according to the reward and punishment profit mode.
Advantageous Effects of the Invention
1. The invention provides a MapReduce job scheduling method under a cloud service provider reward and penalty revenue model. Taking the profit value in the RP Model as the criterion, it determines the maximum round number combination of the Map and Reduce tasks and the maximum standard time of each job at each reward and penalty stage according to the Map and Reduce task execution time of each job. On the premise of meeting the platform resource utilization requirement, it selects the jobs with the globally maximum profit and the maximum task round number scheme of each, thereby producing a TS strategy. The TS strategy generated by this scheduling optimization method shortens the completion time of each job as far as possible and improves platform resource utilization.
2. The invention considers, for the first time, a reward policy from the user to the service provider: the service provider receives a corresponding reward when it completes a job within a certain time range before the job deadline. This revenue model targets users who are unsure whether their job deadline is accurate and who need the submitted job completed quickly. Because the deadline provided by such a user is inaccurate, the service provider tries to shorten the completion time of each job to meet the user's need for fast completion. When the completion time is shorter than the deadline provided by the user, the service provider feeds the completion time back to the user as a more accurate deadline, so that the user can provide a more accurate deadline when submitting the same job later, which benefits the user's overall handling of its jobs. By proposing the reward and penalty coexistence revenue model, a win-win between the service provider and the user is achieved for the first time.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is a flowchart of the MapReduce job scheduling method under the cloud service provider reward and penalty revenue model;
FIG. 2 shows the results of 4 comparison algorithms for FIFO, Fair, EDF and MPCRS on the average completion time indicator of all jobs;
FIG. 3 is a result of completion time after each job has been executed under the scheduling of a different scheduler;
FIG. 4 is a diagram of the job delay ratio when scheduling a job using each scheduler;
FIG. 5 is a diagram of job delay times when scheduling jobs using various schedulers;
FIG. 6 is a graph of the maximum benefit when using different schedulers;
FIG. 7 is a diagram of resource utilization for different schedulers for each job at 9 statistical intervals;
FIG. 8 is a table of resource utilization for a job at various statistical intervals after allocating resources for the job using different schedulers;
FIG. 9 is a graph of job delay rate for different data size scenarios;
FIG. 10 is a graph illustrating job completion times for different data sizes;
FIG. 11 illustrates resource utilization for different data sizes;
FIG. 12 illustrates the maximum benefit for different data size scenarios;
FIG. 13 is a graph of job delay ratios for different threshold values of platform resource utilization.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments of the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
The embodiments and features of the embodiments in the present application may be combined with each other without conflict.
The method targets users whose job deadlines are set inaccurately and whose jobs need to be executed urgently. Taking the profit value in the reward and penalty coexistence revenue model (RP Model) as the evaluation criterion, and according to the Map and Reduce task execution time of each job, it determines the maximum round number combination of the Map and Reduce tasks and the maximum standard time of each job at each reward and penalty stage, which together form the shortest completion time scheme of the job. On this basis, according to the job set submitted by multiple users and the known information of each job, a corresponding task-based job scheduling strategy (TS) is generated, so that the service provider obtains the maximum profit and the requirement of maximum platform resource utilization is met.
Example one
The embodiment discloses a MapReduce job scheduling method under a cloud service reward and punishment yield mode, which comprises the following steps of:
receiving jobs submitted by a user, and acquiring the execution time of each round of Map tasks and Reduce tasks of each job and the number of the tasks;
according to the Map and Reduce task execution time and the task quantity of each operation and a reward and punishment gain mode, determining a maximum round number combination scheme set and a maximum standard time of each operation at different reward and punishment stages;
and acquiring a job scheduling strategy based on the maximum round number scheme of each job according to the reward and punishment profit mode.
Operational yield model
According to the income and payment information about the operation and the number of resources which can be distributed in the platform, the evaluation standard RP Model for maximizing the income is as follows:
Figure BDA0001586171920000071
Figure BDA0001586171920000072
Equation (1) represents the total revenue obtained after all jobs are completed, where the actual revenue of each job consists of the revenue j_profit for completing the job before its deadline and an additional reward or penalty f(j_profit). Equation (2) ensures that, whenever tasks request resources in parallel, the number of resources in use does not exceed the total number of dynamically allocatable resources in the platform. The reward and penalty coexistence function f(j_profit) is expressed as follows:
Figure BDA0001586171920000073
wherein α is the reward rate obtained when a times the actual completion time of the job is still shorter than the specified deadline; β is the compensation rate when the actual completion time is later than the specified deadline but does not exceed b times the deadline; and γ is the multiple of the revenue to be paid in compensation when the actual completion time exceeds b times the specified deadline. So that every job is completed before its deadline as far as possible, the compensation rate is set no smaller than the reward rate, i.e. α ≤ β ≤ 1, and γ ≥ 1 is set as the penalty for exceeding the deadline by a long time. a, b > 1 is set because the reward or compensation should not be triggered merely by whether the actual completion time is earlier or later than the deadline: the user pays an additional reward, or the service provider pays additional compensation, only after the difference between the two exceeds a certain range.
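The piecewise behaviour described for the reward and penalty coexistence function can be sketched as below. The formula image itself is not reproduced in this text, so the branches are an assumed reading of the prose: the function name, the argument names, the treatment of the exact boundaries, and the zero branch between deadline/a and the deadline are all assumptions.

```python
def reward_penalty(profit, completion, deadline, alpha, beta, gamma, a, b):
    """Extra reward (positive) or penalty (negative) for one job.

    profit     on-time revenue of the job
    completion actual completion time
    deadline   deadline given by the user
    alpha      reward rate, beta: compensation rate (alpha <= beta <= 1)
    gamma      revenue multiple paid when very late (gamma >= 1)
    a, b       range factors, both greater than 1
    """
    if not (a > 1 and b > 1):
        raise ValueError("a and b must both be greater than 1")
    if completion <= deadline / a:
        # Finished well ahead of the deadline: the user pays a bonus.
        # (Whether the boundary itself earns the bonus is an assumption.)
        return profit * alpha * a
    if completion <= deadline:
        # On time, but not early enough for a bonus.
        return 0.0
    if completion <= b * deadline:
        # Late by at most a factor b: the provider compensates the user.
        return -profit * beta * b
    # Later than b times the deadline: gamma times the revenue is paid.
    return -profit * gamma
```

For instance, with a = 2 a job due at time 10 earns the bonus only if it finishes by time 5; finishing at time 8 earns the plain revenue with no extra term.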
Map and Reduce task execution times for each job are determined.
The information known when the user submits a job includes the execution time j_Tm of each round of Map tasks and the execution time j_Tr of each round of Reduce tasks in job j, as well as the number of Map tasks j_Nm and the number of Reduce tasks j_Nr in j. However, the user may not know the task execution times and task counts. When this information is incomplete, the service provider first executes the job once on the platform after the user submits it to obtain the task execution times and counts, and stores the result in the configuration file belonging to the job.
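The "run once, then store the measurements" step might look like the sketch below. Everything concrete here is an assumption made for illustration: the text does not specify the configuration file format (JSON is used here), the key names are hypothetical, and `trial_run` stands in for whatever mechanism actually pre-executes the job on the platform.

```python
import json
from pathlib import Path

# Hypothetical key names for the four measured values.
REQUIRED_KEYS = ("t_map", "t_reduce", "n_map", "n_reduce")


def ensure_profile(job_conf_path, trial_run):
    """Return the job's task profile, measuring it only if it is incomplete.

    job_conf_path: path of the job's configuration file (JSON, assumed).
    trial_run: zero-argument callable that pre-executes the job and returns
               a dict containing every key in REQUIRED_KEYS.
    """
    path = Path(job_conf_path)
    conf = json.loads(path.read_text()) if path.exists() else {}

    if any(key not in conf for key in REQUIRED_KEYS):
        measured = trial_run()
        for key in REQUIRED_KEYS:
            # Keep values the user already supplied; fill in the rest.
            conf.setdefault(key, measured[key])
        path.write_text(json.dumps(conf, indent=2))

    return conf
```

Because the result is written back, a second call for the same job reads the file and skips the trial run.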
The number of task rounds is determined according to the task execution time, in order to calculate the maximum round number combination scheme and the maximum standard time of the tasks.
The maximum round number combination of the Map and Reduce tasks and the maximum standard time of each job at each reward and penalty stage are determined according to the Map and Reduce task execution time of each job and the profit value obtained from the Profit Model.
The method is built on the MapReduce framework. The processing capacity of every node in the computing platform is assumed to be approximately the same, and because the resources in the platform are managed and allocated uniformly, the dynamically allocatable Container resources are denoted by R.
This revenue model targets users who are unsure whether their job deadline is accurate and who need the submitted job completed quickly. For a job j submitted by any such user, the following information is known:
1) the deadline of j, j_deadline;
2) the revenue j_profit that the service provider obtains when j is completed before j_deadline;
3) if a times the actual completion time j_completion of j is shorter than j_deadline, the user pays, in addition to the revenue, a reward of j_profit × α × a at the reward rate α specified in the SLA;
4) when the actual completion time j_completion of j exceeds j_deadline, compensation is paid at the rate β specified in the SLA signed by the service provider and the user; that is, when j_deadline < j_completion ≤ b × j_deadline, the service provider must compensate the user j_profit × β × b;
5) the number of Map tasks j_Nm and the number of Reduce tasks j_Nr in j;
6) the execution time j_Tm of each round of Map tasks and the execution time j_Tr of each round of Reduce tasks in j.
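The six known items can be gathered into a small record, as in the sketch below. The class and field names are hypothetical; the validation encodes the parameter ranges stated in the description (α ≤ β ≤ 1, γ ≥ 1, a, b > 1), and the two methods compute the reward and compensation amounts named in items 3) and 4).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaTerms:
    """Reward and penalty parameters agreed in the SLA (names assumed)."""
    alpha: float  # reward rate
    beta: float   # compensation rate
    gamma: float  # revenue multiple paid for very late jobs
    a: float      # early-completion factor
    b: float      # late-completion factor

    def __post_init__(self):
        if not (self.alpha <= self.beta <= 1):
            raise ValueError("expected alpha <= beta <= 1")
        if self.gamma < 1:
            raise ValueError("expected gamma >= 1")
        if self.a <= 1 or self.b <= 1:
            raise ValueError("expected a > 1 and b > 1")


@dataclass(frozen=True)
class Job:
    """Information known for a submitted job j."""
    deadline: float  # j_deadline
    profit: float    # revenue when finished before the deadline
    n_map: int       # j_Nm
    n_reduce: int    # j_Nr
    t_map: float     # j_Tm, time of one round of Map tasks
    t_reduce: float  # j_Tr, time of one round of Reduce tasks

    def early_reward(self, sla):
        # Item 3): bonus paid by the user for a sufficiently early finish.
        return self.profit * sla.alpha * sla.a

    def late_compensation(self, sla):
        # Item 4): amount the provider pays when late by at most factor b.
        return self.profit * sla.beta * sla.b
```

Keeping the SLA terms separate from the job record reflects that the rates come from the agreement, while the task counts and times come from the job itself.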
To improve job execution efficiency, as many tasks of the same job as possible are executed in parallel when platform resources are sufficient. The time taken by each such batch of parallel tasks is called the execution time of one round of tasks. Data skew and similar problems are not considered here, so the execution time of each round equals the execution time of a single task, and the rounds are assumed to take equal time.
The minimum number of execution rounds of the Map and Reduce tasks of each job, the shortest execution time t_least of the job, and the maximum standard time t_standard at each reward and penalty stage are calculated according to the known information of the job.
The factors related to determining the number of task rounds specifically include:
$$\mathit{RN\_m} = \left\lceil \frac{\mathit{j\_Nm}}{R} \right\rceil \tag{4}$$
$$\mathit{RN\_r} = \left\lceil \frac{\mathit{j\_Nr}}{R} \right\rceil \tag{5}$$
$$t_{least} = \mathit{RN\_m} \times \mathit{j\_Tm} + \mathit{RN\_r} \times \mathit{j\_Tr} \tag{6}$$
Here, assuming all dynamically allocatable resources are allocated to a single job, RN_m is the minimum number of execution rounds of the Map tasks of job j and RN_r is the minimum number of execution rounds of its Reduce tasks. Equation (6) states that the shortest execution time t_least of job j is determined by the minimum number of rounds of its tasks and the execution time of each round.
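The minimum round counts and the shortest execution time can be computed as in this sketch. It assumes that each task occupies exactly one of the R allocatable containers, so that one round runs at most R tasks; the function names are hypothetical.

```python
def min_rounds(n_tasks, resources):
    """Fewest rounds needed to run n_tasks with `resources` slots per round."""
    if resources <= 0:
        raise ValueError("resources must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-n_tasks // resources)


def shortest_execution_time(n_map, n_reduce, t_map, t_reduce, resources):
    """Return (RN_m, RN_r, t_least) when the job gets every resource."""
    rn_m = min_rounds(n_map, resources)
    rn_r = min_rounds(n_reduce, resources)
    t_least = rn_m * t_map + rn_r * t_reduce
    return rn_m, rn_r, t_least
```

As a worked case, 10 Map tasks and 4 Reduce tasks on 4 containers need 3 Map rounds and 1 Reduce round; with round times of 3 and 5 the shortest execution time is 3 × 3 + 1 × 5 = 14.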
Corresponding maximum standard time t at different reward and punishment stagesstandardNamely, according to the reward and punishment segment range of the reward and punishment coexistence function f (j _ config), the T set related to the maximum standard time can be obtained.
Figure BDA0001586171920000091
When
Figure BDA0001586171920000092
holds, the corresponding reward is obtained after job j is completed; when t_standard = j_deadline, the reward for completing job j is 0; when t_standard = b × j_deadline, the penalty after job j is completed reaches its maximum value. To obtain the maximum round number combination scheme of the tasks at each reward and punishment stage, the maximum standard time serves as the criterion for judging whether the number of task rounds has reached its maximum.
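As a hedged illustration of the set T, the sketch below assumes that the reward stage ends at j_deadline / a (the claims mention the maximum reward value at j_deadline/a), followed by the two boundaries named in the text. The function name and the value b = 2 are hypothetical; the text gives a = 1.5 but no value for b.

```python
def max_standard_times(j_deadline, a, b):
    """Return the assumed set T of maximum standard times for one job."""
    return [
        j_deadline / a,   # assumed end of the reward stage
        j_deadline,       # reward is 0 at the deadline
        b * j_deadline,   # penalty reaches its maximum here
    ]

# Hypothetical job with a 60 min deadline, a = 1.5, b = 2.
print(max_standard_times(60, 1.5, 2))
```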
A maximum round number combination scheme of the job is obtained from the shortest execution time of the tasks and the maximum standard time of each reward and punishment stage.
The maximum round number combination scheme of a job is calculated as follows:
for each job, at each reward and punishment stage, perform the following operations:
if the maximum standard time is greater than or equal to the shortest execution time of the tasks, obtain the round number combination scheme A (RN_m, RN_r) of the Map and Reduce tasks and the remaining time (t_standard - t_least) under this scheme;
determine whether the remaining time (t_standard - t_least) is greater than or equal to the execution time of k rounds of Map tasks; if so, obtain the round number combination scheme B (RN_m + k, RN_r) and the remaining time (t_standard - t_least - j_Tm × k) after adopting schemes A and B, and perform the next step; otherwise, schemes A and B form the round number combination scheme set plan_jm_t = {A, B};
further determine whether the remaining time is greater than or equal to the execution time of i rounds of Reduce tasks; if so, obtain the round number combination scheme C (RN_m + k, RN_r + i) and the remaining time (t_standard - t_least - j_Tm × k - j_Tr × i) after adopting schemes A, B and C, and schemes A, B and C form the round number combination scheme set plan_jm_t = {A, B, C};
determine whether the remaining time (t_standard - t_least) is greater than or equal to the execution time of i rounds of Reduce tasks; if so, obtain the round number combination scheme D (RN_m, RN_r + i) and the remaining time (t_standard - t_least - j_Tr × i) after adopting schemes A and D, and perform the next step; otherwise, schemes A and D form the round number combination scheme set plan_jr_t = {A, D};
further determine whether the remaining time is greater than or equal to the execution time of k rounds of Map tasks; if so, obtain the round number combination scheme E (RN_m + k, RN_r + i) and the remaining time (t_standard - t_least - j_Tm × k - j_Tr × i) after adopting schemes A, D and E, and schemes A, D and E form the round number combination scheme set plan_jr_t = {A, D, E};
the round number combination scheme set of the tasks is Plan_j_t = {A} ∪ plan_jm_t ∪ plan_jr_t.
Within the same reward and punishment stage, under the constraint that the number of rounds does not exceed the number of tasks, namely (RN_m + k) < j_Nm and (RN_r + i) < j_Nr, the maximum round number combination scheme is obtained.
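The enumeration of schemes A to E for a single reward and punishment stage can be sketched as follows. This is an illustrative reading rather than the Algorithm 1 of the patent, whose listing is only available as an image. It assumes that k and i are taken as the largest values that fit into the remaining time, that a scheme is recorded only when at least one extra round fits, and that the round bound is applied as "does not exceed" (the text writes it as a strict inequality). The names `round_schemes` and `extra` are hypothetical.

```python
def round_schemes(rn_m, rn_r, j_tm, j_tr, j_nm, j_nr, t_standard):
    """Enumerate round number schemes A-E for one reward/punishment stage."""
    t_least = rn_m * j_tm + rn_r * j_tr
    remaining = t_standard - t_least
    if remaining < 0:
        return {}                          # stage cannot be reached by this job
    plan = {"A": (rn_m, rn_r)}

    def extra(rem, per_round, current, limit):
        # Largest number of extra rounds that fits into rem while keeping
        # the round count at or below the task count (assumed bound).
        return max(0, min(int(rem // per_round), limit - current))

    # Map-first branch: scheme B, then scheme C.
    k = extra(remaining, j_tm, rn_m, j_nm)
    if k > 0:
        plan["B"] = (rn_m + k, rn_r)
        i = extra(remaining - k * j_tm, j_tr, rn_r, j_nr)
        if i > 0:
            plan["C"] = (rn_m + k, rn_r + i)

    # Reduce-first branch: scheme D, then scheme E.
    i = extra(remaining, j_tr, rn_r, j_nr)
    if i > 0:
        plan["D"] = (rn_m, rn_r + i)
        k = extra(remaining - i * j_tr, j_tm, rn_m, j_nm)
        if k > 0:
            plan["E"] = (rn_m + k, rn_r + i)
    return plan

# Hypothetical job: t_least = 2*3 + 1*4 = 10, stage limit 20.
print(round_schemes(2, 1, 3, 4, 10, 5, 20))
```

In this hypothetical case the 10 remaining time units admit 3 extra Map rounds (scheme B) or 2 extra Reduce rounds (scheme D), but neither branch leaves enough time for a further round of the other kind, so C and E are absent.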
Algorithm 1 shows how to obtain the maximum round number combination scheme and the maximum standard time of the tasks at each reward and punishment stage. First, the algorithm calculates, from the known information of each job, the minimum number of execution rounds of its Map and Reduce tasks, the shortest execution time t_least, and the maximum standard time t_standard at each reward and punishment stage (lines 2-5). Second, at each maximum standard time, the shortest execution time of the job is compared with the maximum standard time; the reward and punishment coexistence function f(j_profit) requires that the smallest maximum standard time is always greater than or equal to the shortest execution time, so at each reward and punishment stage the round number combination scheme A, which trades space for time to the greatest extent, can be obtained, namely (RN_m, RN_r, t_standard - t_least) (line 8). Finally, according to the remaining time, the algorithm finds the largest values of k and i for which the remaining time still covers the corresponding rounds of Map and Reduce tasks, thereby obtaining from the scheme set (plan_jm_t ∪ plan_jr_t) the scheme that trades time for space to the greatest extent. Because the service provider must compensate the user once the job completion time exceeds the deadline, the completion time of each job should be kept as short as possible, and the number of execution rounds of the tasks directly affects the completion time of the job. Within the same reward and punishment stage, provided that the number of rounds does not exceed the number of tasks, namely (RN_m + k) < j_Nm and (RN_r + i) < j_Nr, the more rounds each job uses, the more tasks of different jobs can be executed in parallel, which shortens the completion time of each job and secures the maximum benefit of the service provider.
Figure BDA0001586171920000101
Figure BDA0001586171920000111
Job scheduling based on maximum round scheme
The job scheduling method mainly comprises the following steps (see Algorithm 2):
determining the jobs to be scheduled according to the j_t_standard of all jobs and selecting their maximum round number combination scheme sets (lines 1-7);
selecting, from all round number combination scheme sets of the previous job, the round number combination scheme with the maximum local profit for scheduling (lines 8-49);
after all jobs have been added to the scheduling queue, calculating the global maximum profit value according to the task round number combination scheme corresponding to each job (line 50).
And when selecting the round number combination scheme with the local maximum profit, giving priority to the resource utilization rate of the platform.
When the resource utilization rate is greater than a given threshold, the remaining resources are left idle until the previous task finishes executing, and the task with the maximum local profit is then selected from the remaining tasks for serial scheduling (lines 11-34):
when selecting the task with the maximum local profit, the reward or penalty value that each task can obtain within the time range is compared, based on the deadline range of each task and its difference from the completion time of the previous task, and the task with the maximum local profit and its corresponding task round number combination scheme are selected.
Suppose there are 4 candidate tasks. When f(j1_profit) > f(j2_profit) > f(j3_profit) > f(j4_profit) > 0, task j1 is added to the scheduling queue and the round number combination scheme in the corresponding time range is selected;
when f(j1_profit) > f(j2_profit) > 0 ≥ f(j3_profit) > f(j4_profit) and f(j1_profit) ≥ |-j3_profit × γ| + |-j4_profit × γ| is satisfied, task j1 is added to the scheduling queue; if the second condition is not met, the task with the largest penalty value is selected and added to the scheduling queue;
when 0 > f(j1_profit) > f(j2_profit) > f(j3_profit) > f(j4_profit), the maximum penalty values of all tasks are evaluated and the task with the largest penalty value is selected and added to the scheduling queue, so as to minimize the cost paid by the service provider.
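The three cases above can be condensed into one selection rule. The sketch below is an interpretation, not the Algorithm 2 of the patent: it assumes that the comparison in the mixed case sums the maximum penalty |j_profit × γ| over every candidate whose f value is not positive, and that "largest penalty value" means the smallest f value. The function name `pick_job` and the tuple layout are hypothetical.

```python
def pick_job(candidates, gamma):
    """Pick the next job for serial scheduling.

    candidates: list of (job_id, f_value, j_profit) tuples, where f_value is
    the reward (> 0) or penalty (<= 0) obtainable in the current time range.
    """
    rewarded = [c for c in candidates if c[1] > 0]
    penalised = [c for c in candidates if c[1] <= 0]
    worst = min(candidates, key=lambda c: c[1])      # largest penalty
    if not penalised:                                # case 1: all rewards
        return max(rewarded, key=lambda c: c[1])[0]
    if not rewarded:                                 # case 3: all penalties
        return worst[0]
    best = max(rewarded, key=lambda c: c[1])         # case 2: mixed
    max_penalty_sum = sum(gamma * c[2] for c in penalised)
    return best[0] if best[1] >= max_penalty_sum else worst[0]

# Hypothetical candidates with gamma = 2.
print(pick_job([("j1", 30, 100), ("j3", -10, 10), ("j4", -5, 2)], gamma=2))
```

In the example the best reward (30) covers the summed maximum penalties of j3 and j4 (2 × 10 + 2 × 2 = 24), so j1 is chosen.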
When the resource utilization rate is less than a given threshold (lines 35-48), the task to be scheduled is executed in parallel with the previous task, so that the platform resources are fully used and the resource utilization rate is maximized.
Before a suitable job is selected and added to the scheduling queue, the starting time point t_r at which the previous task has vacant resources is recorded, and t_r is compared with the maximum standard time t_standard of each job. When the remaining resources can meet the maximum round number requirement of the tasks within the maximum standard time range, the job with the largest |f(j_profit)| value is selected and added to the scheduling queue, and a job whose penalty has already reached the maximum (-j_profit × γ) is placed at the end of the scheduling queue to be scheduled last. The job with the largest |f(j_profit)| value is either the job with the largest reward value or, excluding jobs that have already reached the maximum penalty, the job with the largest penalty value (-j_profit × β × b).
Because each selected task has a plurality of maximum round number combination schemes, after each scheduling combination scheme is selected, a corresponding local profit value is calculated, and finally, a scheduling queue with the maximum global profit value and a scheduling strategy are selected for scheduling.
The scheduling policy includes the number of task rounds and the starting time points, so the amount of resources allocated to the tasks of each job and the allocation time points are specified before scheduling.
Figure BDA0001586171920000121
Figure BDA0001586171920000131
Figure BDA0001586171920000141
The finally determined scheduling queue and scheduling policy are written into the configuration file. According to the configuration file and the job resource request, the ASM starts the AM of the corresponding job and allocates Container resources to the AM, so that the job can start running. After computing resources have been allocated to a job, the job is divided into Map and Reduce tasks. Because the scheduling policy generated by MPCRS works at the task level, resources allocated to a job are in fact allocated to its corresponding Map or Reduce tasks.
Example two
An object of the present embodiment is to provide a computing device.
A MapReduce job scheduling optimization device under the cloud service provider reward and penalty profit mode comprises a memory, a processor and a computer program that is stored on the memory and can run on the processor, wherein the processor implements the following steps when executing the program:
receiving jobs submitted by a user, and acquiring the execution time of each round of Map tasks and Reduce tasks of each job and the number of the tasks;
according to the Map and Reduce task execution time and the task quantity of each operation and a reward and punishment gain mode, determining a maximum round number combination scheme set and a maximum standard time of each operation at different reward and punishment stages;
and acquiring a job scheduling strategy based on the maximum round number scheme of each job according to the reward and punishment profit mode.
Example three
An object of the present embodiment is to provide a computer-readable storage medium.
A computer-readable storage medium on which a computer program for MapReduce job scheduling is stored, wherein the program, when executed by a processor, performs the following steps:
receiving jobs submitted by a user, and acquiring the execution time of each round of Map tasks and Reduce tasks of each job and the number of the tasks;
according to the Map and Reduce task execution time and the task quantity of each operation and a reward and punishment gain mode, determining a maximum round number combination scheme set and a maximum standard time of each operation at different reward and punishment stages;
and acquiring a job scheduling strategy based on the maximum round number scheme of each job according to the reward and punishment profit mode.
The steps involved in the second and third embodiments correspond to the first embodiment of the method, and the detailed description thereof can be found in the relevant description of the first embodiment. The term "computer-readable storage medium" should be taken to include a single medium or multiple media containing one or more sets of instructions; it should also be understood to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by a processor and that cause the processor to perform any of the methods of the present invention.
Experimental verification
Experimental Environment and settings
The algorithm provided by the invention is verified on a computing platform based on the Hadoop framework, version Hadoop 2.7.1. All nodes in the platform are homogeneous, comprising 1 master node and 20 slave nodes. Each node is configured with an 8-core CPU, 16 GB RAM, a 1 TB hard disk and Red Hat Enterprise Linux 6.2. Resource management and scheduling use the Yarn resource framework, with the Container size set to 1 core and 2 GB RAM, so there are 8 Containers on each node and 160 Containers in the whole platform. The block size was set to the default 64 MB in the experiment.
Experimental data set and experimental indices
To evaluate the performance of the algorithm, we used the standard MapReduce applications in the PUMA benchmark suite [ ], ran several jobs with different input data sizes for each application, and gave every job the same priority. Table 1 lists, for each program, the job numbers, input data sizes, deadlines and data source. The deadline of a job can be derived from its start time and maximum execution time.
TABLE 1
MapReduce program | Job number | Input data size (GB) | Deadline (min) | Data source
WordCount | J1/J2/J3/J4/J5 | 20/50/100/200/500 | 1/5/15/35/60 | Wikipedia
TeraSort | J6/J7/J8/J9/J10 | 20/50/100/200/500 | 1/5/15/35/60 | TeraGen
Inverted-Index | J11/J12/J13/J14/J15 | 20/50/100/200/500 | 1/5/15/35/60 | Wikipedia
These programs are typical MapReduce program types: WordCount is CPU intensive, TeraSort is I/O intensive, and Inverted-Index is both CPU and I/O intensive.
In evaluating the efficiency of MPCRS, MRNS is first compared to FIFO, Fair, EDF scheduling algorithms, respectively, in terms of job completion time and job delay ratio. Then, the maximum gain and the platform resource utilization rate which can be obtained when different algorithms are used are evaluated, and the influence of the input data scale of the operation on the delay rate, the completion time and the platform resource utilization rate of the operation is discussed. Finally, the influence of the optimal setting of the parameters on the MPCRS is researched. The performance of the algorithm was evaluated by the following performance indicators:
● job completion time (j _ completion)
● job delay Rate (P)
● maximum Profit (Profit)
● platform resource Utilization (Utilization)
The completion time of a job is obtained from its start time and execution time. Let j_start be the time at which job j starts executing and j_execute the time taken to execute all tasks of job j. All jobs arrive at the same time by default, and the completion time j_completion of the job is expressed as:
j_completion=j_start+j_execute (8)
If the completion time j_completion of job j exceeds the specified deadline j_deadline, job j is considered delayed. With N_late the number of delayed jobs and N_total the total number of completed jobs, the job delay rate P is expressed as:
Figure BDA0001586171920000163
j_profit is the profit gained by completing job j before the deadline specified in the SLA. The maximum profit that the service provider can obtain is measured by the reward and punishment coexistence function f(j_profit) of the RP Model:
Figure BDA0001586171920000161
In formula (3), α, β and γ are the reward rate, the penalty rate and the maximum penalty rate, respectively, with α = 0.3, β = 0.5 and γ = 2; a and b are deadline multiples, with a = 1.5.
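Since formula (3) survives only as an image, the following is a hedged sketch of a piecewise function that is consistent with the surrounding description, not the formula itself. It assumes a reward of α × j_profit when the job finishes by j_deadline / a, no extra reward up to the deadline, the SLA compensation j_profit × β × b up to b × j_deadline, and the maximum penalty j_profit × γ beyond that. The default b = 2.0 is purely hypothetical, as the text does not state a value for b.

```python
def reward_or_penalty(j_profit, j_deadline, j_completion,
                      alpha=0.3, beta=0.5, gamma=2.0, a=1.5, b=2.0):
    """Assumed shape of f(j_profit); b = 2.0 is a hypothetical value."""
    if j_completion <= j_deadline / a:
        return alpha * j_profit         # finished early: reward
    if j_completion <= j_deadline:
        return 0.0                      # finished on time: no extra reward
    if j_completion <= b * j_deadline:
        return -beta * b * j_profit     # late: compensation under the SLA
    return -gamma * j_profit            # very late: maximum penalty

# Hypothetical job: profit 100, deadline 60 min.
for t in (30, 50, 90, 200):
    print(t, reward_or_penalty(100, 60, t))
```

Positive values are rewards added to j_profit and negative values are penalties paid by the service provider.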
The platform resource utilization rate Utilization is defined as the ratio of the sum, over all completed jobs, of the products of occupied resources and occupied time to the product of the total resources and the total time in the platform:
Figure BDA0001586171920000162
where R_total is the total number of resources in the platform, T_total is the total time after all jobs have been executed, R_task_i is the amount of resources occupied by the execution of the i-th round of tasks in job j, and T_task_i is the execution time required by the i-th round of tasks.
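A minimal sketch of this ratio is given below, assuming that the numerator sums R_task_i × T_task_i over every task round of every job. The function name `utilization` and the sample values are hypothetical.

```python
def utilization(rounds, r_total, t_total):
    """Platform resource utilization.

    rounds: iterable of (resources, duration) pairs, one per task round,
    collected across all jobs.
    """
    used = sum(r * t for r, t in rounds)    # resource-time actually occupied
    return used / (r_total * t_total)       # share of available resource-time

# Hypothetical run: two task rounds on a 160-Container platform over 5 minutes.
print(utilization([(160, 2.0), (80, 3.0)], r_total=160, t_total=5.0))
```

Here the rounds occupy 560 Container-minutes out of 800 available, which corresponds to a utilization of 0.7.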
Results and analysis of the experiments
High efficiency of the process
This section uses the FIFO, Fair and EDF schedulers as comparison algorithms and compares them with the job scheduling optimization method provided by the invention in terms of job execution, profit and platform resource utilization, thereby verifying the efficiency of the proposed method.
FIFO scheduling arranges all jobs into a queue in order of priority and allocates platform resources to the jobs one by one; while one job is executing, it occupies all resources in the platform. Fair scheduling allocates a relatively fair amount of resources to each job. The EDF algorithm forms a run queue according to the deadline requirements of the jobs and preferentially allocates the required resources to the job with the earlier deadline; a newly arrived job with the earliest deadline cannot preempt resources before the running job completes and can only use them afterwards. The MapReduce job scheduling method under the cloud service provider reward and penalty profit mode differs from these scheduling ideas: it is a preemptive scheduling mode based on the maximum number of task rounds, but it does not kill a running task round to take its resources; the resources occupied by a task round can only be preempted after that round completes.
Efficiency of job execution
The job execution effect is reflected from the results of the average completion time of all jobs, the completion time of each job, the job delay ratio, and the delay multiple of each job.
Fig. 2 shows the results of the 4 compared algorithms on the average completion time of all jobs. The average job completion time is 24.3 min for the FIFO scheduler, and it decreases in turn to 22.3 min for the EDF scheduler, 20.8 min for the Fair scheduler and 18 min for the job scheduling optimization method provided by the invention. The FIFO scheduler has the longest average job completion time because a running job occupies all resources in the cluster and no other job can start, which extends the completion time of most jobs and therefore the average completion time of all jobs. Relative to FIFO, EDF and Fair, the proposed method reduces the average job completion time by about 25.9%, 19.3% and 13.5%, respectively.
Fig. 3 shows the completion time of each job under the different schedulers. First, the completion time of all three types of jobs grows as the data volume increases. Second, for the same data volume, the completion times of different job types differ very little, which shows that uniform allocation of resources does not affect the performance of different job types. Finally, comparing the execution performance of jobs under the 4 schedulers shows that the job completion time under the proposed job scheduling optimization method is significantly shorter than under the other three schedulers, except for Job9, whose completion time under the proposed method is longer than under EDF and Fair. The reason is that, when the maximum round number execution scheme of Job9 is selected, the largest number of task rounds within the maximum standard time range is chosen, which lengthens the overall completion time of the job compared with a scheme with fewer rounds; the calculation itself also consumes some time, so a certain extension of the completion time during job execution is unavoidable. This extension is within the allowable range and can be tolerated.
Fig. 4 shows the job delay rate under each scheduler. The delay rate is 46.7% with the FIFO scheduler, 26.7% with EDF, 20% with Fair and 13.3% with the job scheduling optimization method provided by the invention. The FIFO scheduler's occupation of all platform resources makes its job delay rate high; Fig. 5 shows that the delay multiples of Job3, Job5, Job9, Job10 and Job13 exceed 1 times the specified deadline, and those of Job11 and Job12 exceed 1.5 times the deadline. Although EDF orders jobs by deadline, while a job is executing the other jobs still cannot use the free resources of the platform, and the next job in the order runs only after the current job completes, so some jobs are delayed, for example Job3, Job10, Job12 and Job14 in Fig. 5. Fair allocates the same resources to each job without considering their different data volumes, so jobs with large data volumes are delayed, and the delay of each such job exceeds 1 times the specified deadline, as for Job8, Job13 and Job15 in Fig. 5. Under the proposed job scheduling optimization method, although the completion times of Job9 and Job10 exceed their deadlines, the overall job execution performance is better than that of FIFO, EDF and Fair.
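The reported delay rates can be cross-checked against the delayed jobs named in the text. The sketch below assumes that the job delay rate is the plain ratio N_late / N_total (the formula itself is only available as an image) and that the jobs listed above are the complete set of delayed jobs among the 15 jobs of Table 1. The function name `delay_rate` is hypothetical.

```python
def delay_rate(delayed_jobs, n_total):
    """Job delay rate P, assumed to be N_late / N_total."""
    return len(set(delayed_jobs)) / n_total

# Delayed job numbers as named in the text for each scheduler.
reported = {
    "FIFO": [3, 5, 9, 10, 13, 11, 12],
    "EDF": [3, 10, 12, 14],
    "Fair": [8, 13, 15],
    "Proposed": [9, 10],
}
for name, jobs in reported.items():
    print(name, f"{delay_rate(jobs, 15):.1%}")
```

Under these assumptions the counts 7, 4, 3 and 2 out of 15 reproduce the stated 46.7%, 26.7%, 20% and 13.3%.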
Revenue maximization
The job execution efficiency experiments above show the execution performance of each job, but the final goal of the job scheduling optimization algorithm designed here is to obtain the maximum profit for the service provider. The maximum profit obtained with each scheduler is therefore computed according to equations (1) and (3) and shown in Fig. 6, where Ideal is the ideal value of the total profit,
Figure BDA0001586171920000181
The figure shows that the maximum profit obtainable by the service provider differs greatly between schedulers. The total profit obtained with FIFO is furthest from the ideal value Ideal, and the total profit obtained with the proposed job scheduling optimization method is closest to it. The total profit of the proposed method still falls short of the ideal value because the reward value of most jobs is still 0 and some jobs are delayed.
Efficiency of platform resource utilization
Fig. 7 shows the resource utilization rate of each job at 9 statistical intervals after the 4 schedulers have allocated resources fairly to each job. Fig. 8 shows the platform resource utilization rate at each statistical interval after the different schedulers have allocated resources to the jobs.
Effect of job size on MPCRS
For each of the data volumes 20 GB, 50 GB, 100 GB, 200 GB and 500 GB, 60 jobs mixed from the three different types of application programs are run and the job delay rate is calculated; the 60 jobs have the same deadline under the different data volumes. The results are shown in Figs. 9-12.
Threshold tuning of platform resource utilization
Fig. 13 plots the job delay rate (Y axis) against the size of the threshold (X axis), measured with 60 jobs for each data volume.
The invention provides a MapReduce job scheduling optimization method under the cloud service provider reward and penalty profit mode that takes into account the interests of both the service provider and the user. Taking the profit value in the RP Model as the criterion, the maximum round number combinations of the Map and Reduce tasks of each job and the maximum standard times at the different reward and punishment stages are determined from the Map and Reduce task execution times of each job. On the premise that the platform resource utilization requirement is met, the job with the globally maximum profit and its maximum task round number scheme are selected, and the TS strategy is formulated accordingly. Experimental results show that the TS strategy generated by the scheduling optimization method shortens the completion time of each job as far as possible and improves platform resource utilization; the service provider obtains the maximum profit while the user obtains a more accurate deadline, so both the service provider and the user benefit.
Those skilled in the art will appreciate that the modules or steps of the present invention described above can be implemented using general purpose computer means, or alternatively, they can be implemented using program code that is executable by computing means, such that they are stored in memory means for execution by the computing means, or they are separately fabricated into individual integrated circuit modules, or multiple modules or steps of them are fabricated into a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, it is not intended to limit the scope of the present invention, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive efforts by those skilled in the art based on the technical solution of the present invention.

Claims (7)

1. A MapReduce job scheduling method under a cloud service provider reward and penalty profit mode, characterized by comprising the following steps:
receiving jobs submitted by a user, and acquiring the execution time of each round of Map tasks and Reduce tasks of each job and the number of the tasks;
according to the Map and Reduce task execution time and the task quantity of each operation and a reward and punishment gain mode, determining a maximum round number combination scheme set and a maximum standard time of each operation at different reward and punishment stages;
according to the reward and punishment profit mode, acquiring an operation scheduling strategy based on the maximum round number scheme of each operation;
the reward and punishment income mode of the operation is as follows:
Figure FDA0002382234340000011
Figure FDA0002382234340000012
wherein j_profit represents the profit of a job completed before the deadline, and f(j_profit) is the reward and punishment coexistence function and represents the additional reward or penalty; j_Nm and j_Nr respectively represent the number of Map tasks and the number of Reduce tasks of the job; R is the total computing resources available in the platform, and J is the total number of jobs to be scheduled;
according to the reward and punishment subsection range of the reward and punishment coexistence function, obtaining a set of the maximum standard time T:
Figure FDA0002382234340000013
when
Figure FDA0002382234340000014
holds, the corresponding reward is obtained after job j is completed; when t_standard = j_deadline, the reward for completing job j is 0; when t_standard = b × j_deadline, the penalty after job j is completed reaches its maximum value; t_standard represents the maximum standard time at each reward and punishment stage;
the method for determining the maximum round number combination scheme set of each job at different reward and punishment stages comprises the following steps:
calculating the minimum execution round number of Map and Reduce tasks of each job, the shortest execution time of each round of tasks and the maximum standard time at each reward and punishment stage according to the known information of the jobs;
and obtaining a maximum round number combination scheme set of the operation according to the shortest execution time of each round of the task and the maximum standard time of each reward and punishment stage.
2. The MapReduce job scheduling method under the cloud service provider reward and penalty profit mode according to claim 1, wherein the minimum number of execution rounds of the tasks is calculated as follows:
Figure FDA0002382234340000015
Figure FDA0002382234340000016
Figure FDA0002382234340000021
in the case where all dynamically allocatable resources are allocated to only one job, RN _ m is the minimum number of execution rounds of the Map task of job j; RN _ r is the minimum number of execution rounds of Reduce tasks; t is tleastIndicating the shortest execution time for job j.
3. The MapReduce job scheduling method under the cloud service provider reward and penalty profit mode according to claim 1, wherein the method for determining the maximum round number combination scheme set of the job at each reward and punishment stage comprises:
for each reward and punishment stage, if the maximum standard time is greater than or equal to the shortest execution time of the task, acquiring a round number combination scheme A of the Map and Reduce tasks;
judging whether the remaining time under scheme A is greater than or equal to the execution time of k rounds of Map tasks; if so, obtaining a round number combination scheme B and the remaining time after adopting schemes A and B, and executing the next step; otherwise, obtaining the round number combination scheme set of the reward and punishment stage, plan_jm_t = {A, B};
further judging whether the remaining time after adopting schemes A and B is greater than or equal to the execution time of i rounds of Reduce tasks; if so, obtaining a round number combination scheme C, schemes A, B and C forming the round number combination scheme set of the reward and punishment stage, plan_jm_t = {A, B, C};
judging whether the remaining time under scheme A is greater than or equal to the execution time of i rounds of Reduce tasks; if so, obtaining a round number combination scheme D and executing the next step; otherwise, obtaining the round number combination scheme set of the reward and punishment stage, plan_jr_t = {A, D};
further judging whether the remaining time after adopting schemes A and D is greater than or equal to the execution time of k rounds of Map tasks; if so, obtaining a round number combination scheme E, and obtaining the round number combination scheme set of the reward and punishment stage, plan_jr_t = {A, D, E};
the round number combination scheme set of the tasks is Plan_j_t = {A} ∪ plan_jm_t ∪ plan_jr_t;
and, according to the principle that within the same reward and punishment stage the number of rounds does not exceed the number of tasks, obtaining the maximum values of the parameters i and k to obtain the maximum round number combination scheme set.
4. The MapReduce job scheduling method under the cloud service provider reward and penalty profit mode according to claim 1, wherein the job scheduling method comprises:
determining the jobs to be scheduled according to the j_t_standard of all jobs and selecting the maximum round number combination scheme set, the maximum reward value being obtained at j_deadline/a;
in the job scheduling process, when the resource utilization rate is greater than a given threshold value, leaving the resources idle until execution of the previous task is completed, and selecting the task with the maximum local profit from the remaining tasks for serial scheduling;
when the resource utilization rate is less than the given threshold value, executing the task to be scheduled in parallel with the previous task;
and adding all the jobs to a scheduling queue and calculating a global profit value according to the task round number combination scheme corresponding to each job, wherein the combination scheme with the maximum global profit value is taken as the scheduling strategy.
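For illustration only, the threshold rule and the global profit search described above might be sketched as follows. The names `placement` and `best_global_plan` are hypothetical. Two simplifying assumptions are made that the claims do not state: a utilization exactly equal to the threshold is treated as the parallel case, and the global profit is taken to be the plain sum of per-job values supplied by a caller-provided `profit` function.

```python
from itertools import product


def placement(utilization, threshold):
    """Serial scheduling above the threshold, parallel otherwise.

    Assumption: utilization == threshold falls into the parallel case,
    since the claim leaves that boundary unspecified.
    """
    return "serial" if utilization > threshold else "parallel"


def best_global_plan(job_schemes, profit):
    """Pick one round number combination scheme per job to maximize total profit.

    job_schemes: dict mapping job name -> list of candidate scheme labels
    profit:      function (job, scheme) -> reward (positive) or penalty (negative)
    Assumption: global profit = sum of per-job values (exhaustive search).
    """
    jobs = list(job_schemes)
    candidates = product(*(job_schemes[j] for j in jobs))
    best = max(
        candidates,
        key=lambda combo: sum(profit(j, s) for j, s in zip(jobs, combo)),
    )
    return dict(zip(jobs, best))


if __name__ == "__main__":
    table = {("j1", "A"): 3, ("j1", "B"): 5, ("j2", "A"): 2, ("j2", "D"): -1}
    schemes = {"j1": ["A", "B"], "j2": ["A", "D"]}
    print(placement(0.9, 0.8))
    print(best_global_plan(schemes, lambda j, s: table[(j, s)]))
```

Exhaustive enumeration grows exponentially with the number of jobs, so this form is only practical for small queues; it is meant to make the selection criterion concrete rather than to suggest how the search is carried out.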
5. The MapReduce job scheduling method in the cloud service provider reward and penalty mode as claimed in claim 1, wherein, when the job with the maximum local profit is selected, the reward or penalty value obtainable by each job is compared over the time range given by the difference between each job's deadline and the completion time of the previous task, and the job with the maximum local profit and its corresponding task round number combination scheme are selected; or
before the selected job is added to the scheduling queue, the starting time point at which the previous task has idle resources is recorded and compared with the maximum standard time of each job; when the remaining resources can satisfy the maximum round number requirement of the task within the maximum standard time range, the job with the maximum reward and punishment value is selected and added to the scheduling queue, and the job with the maximum penalty value is placed last in the scheduling queue.
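For illustration only, the first alternative above (choosing the job and scheme with the maximum local profit relative to the previous task's completion time) might be sketched as follows. The reward and penalty function `linear_value`, its default constants, and the name `pick_max_local_profit` are hypothetical; the actual reward and punishment function is defined elsewhere in the patent and is not reproduced here.

```python
def linear_value(finish, deadline, reward=10.0, penalty_rate=2.0):
    """Hypothetical value model: flat reward on time, linear penalty when late."""
    if finish <= deadline:
        return reward
    return -penalty_rate * (finish - deadline)


def pick_max_local_profit(candidates, prev_finish):
    """Return (job, scheme, value) with the highest local profit.

    candidates:  dict job -> (deadline, {scheme label: run time})
    prev_finish: completion time of the previously scheduled task
    Assumption: a job starts as soon as the previous task finishes.
    """
    best = None
    for job, (deadline, schemes) in candidates.items():
        for scheme, run_time in schemes.items():
            value = linear_value(prev_finish + run_time, deadline)
            if best is None or value > best[2]:
                best = (job, scheme, value)
    return best


if __name__ == "__main__":
    jobs = {"j1": (20, {"A": 12, "B": 8}), "j2": (15, {"A": 9})}
    print(pick_max_local_profit(jobs, prev_finish=10))
```

Ties are resolved in favor of the first candidate encountered, which is an arbitrary choice made for the sketch.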
6. A MapReduce job scheduling optimization device in a cloud service provider reward and penalty mode, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the MapReduce job scheduling method in the cloud service provider reward and penalty mode according to any one of claims 1 to 5.
7. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the MapReduce job scheduling method in the cloud service provider reward and penalty mode according to any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810172166.XA CN108428051B (en) 2018-03-01 2018-03-01 MapReduce job scheduling method and device facing big data platform and based on maximized benefits

Publications (2)

Publication Number Publication Date
CN108428051A CN108428051A (en) 2018-08-21
CN108428051B true CN108428051B (en) 2020-06-05

Family

ID=63157440

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810172166.XA Active CN108428051B (en) 2018-03-01 2018-03-01 MapReduce job scheduling method and device facing big data platform and based on maximized benefits

Country Status (1)

Country Link
CN (1) CN108428051B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111143871A (en) * 2018-11-03 2020-05-12 广州市明领信息科技有限公司 Big data integration system
CN110188975B (en) * 2019-04-02 2023-08-04 创新先进技术有限公司 Resource acquisition method and device
CN113626159A (en) * 2020-05-09 2021-11-09 比特大陆科技有限公司 Task switching method, device and system
CN112016812B (en) * 2020-08-06 2022-07-12 中南大学 Multi-unmanned aerial vehicle task scheduling method, system and storage medium
CN112650687B (en) * 2020-12-30 2024-03-19 绿盟科技集团股份有限公司 Method, device, equipment and medium for testing execution priority of engine scheduling action

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104317650A (en) * 2014-10-10 2015-01-28 北京工业大学 Map/Reduce type mass data processing platform-orientated job scheduling method
CN104835026A (en) * 2015-05-15 2015-08-12 重庆大学 Automatic stereoscopic warehouse selection operation scheduling modeling and optimizing method based on Petri network and improved genetic algorithm
CN107273209A (en) * 2017-06-09 2017-10-20 北京工业大学 Hadoop task scheduling method based on a genetic algorithm improved by minimum spanning tree clustering

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
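Improving Performance of Heterogeneous MapReduce Clusters with Adaptive Task Tuning; Dazhao Cheng et al.; IEEE Transactions on Parallel and Distributed Systems; 2017-03-01; Vol. 28 (No. 3); 774-786 *
Research on the maximum profit problem in MapReduce clusters (MapReduce集群中最大收益问题的研究); 王习特 et al.; Chinese Journal of Computers (计算机学报); 2015-01-31; Vol. 38 (No. 1); 109-121 *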

Also Published As

Publication number Publication date
CN108428051A (en) 2018-08-21

Similar Documents

Publication Publication Date Title
CN108428051B (en) MapReduce job scheduling method and device facing big data platform and based on maximized benefits
CN107659433B (en) Cloud resource scheduling method and equipment
CN110096349B (en) Job scheduling method based on cluster node load state prediction
CN107291545B (en) Task scheduling method and device for multiple users in computing cluster
US9602426B2 (en) Dynamic allocation of resources while considering resource reservations
CN107239336B (en) Method and device for realizing task scheduling
US9218213B2 (en) Dynamic placement of heterogeneous workloads
US7730119B2 (en) Sub-task processor distribution scheduling
US9740526B2 (en) Job scheduling method
CN109861850B (en) SLA-based stateless cloud workflow load balancing scheduling method
US7594016B1 (en) Calculating numbers of servers for tiers of a multi-tiered system
US20080276242A1 (en) Method For Dynamic Scheduling In A Distributed Environment
CN110413412B (en) GPU (graphics processing Unit) cluster resource allocation method and device
CN111614754B (en) Fog-calculation-oriented cost-efficiency optimized dynamic self-adaptive task scheduling method
US20140188532A1 (en) Multitenant Database Placement with a Cost Based Query Scheduler
CN111258745B (en) Task processing method and device
CN111352736A (en) Method and device for scheduling big data resources, server and storage medium
CN109189572B (en) Resource estimation method and system, electronic equipment and storage medium
Liu et al. Strategy-proof mechanism for provisioning and allocation virtual machines in heterogeneous clouds
CN110502321A (en) A kind of resource regulating method and system
CN105373426A (en) Method for memory ware real-time job scheduling of car networking based on Hadoop
CN111176840A (en) Distributed task allocation optimization method and device, storage medium and electronic device
US7886055B1 (en) Allocating resources in a system having multiple tiers
US9021094B1 (en) Allocation of resources for tiers of a multi-tiered system based on selecting items from respective sets
CN104731662B (en) A kind of resource allocation methods of variable concurrent job

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant