CN106354553A - Task scheduling method and device based on resource estimation in big data system - Google Patents

Info

Publication number: CN106354553A
Application number: CN201510411512.1A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Pending (as listed; not a legal conclusion)
Inventors: 朱泓, 钟咏, 曾东, 张聪, 夏峻峰, 李小东
Current Assignee: MIGU Music Co Ltd
Original Assignee: MIGU Music Co Ltd
Classification: Debugging And Monitoring

Abstract

The invention discloses a task scheduling method based on resource estimation in a big data system. The method includes: performing resource estimation on a received task and adding the task to a task list; estimating the current system idle resources; and scheduling the tasks in the task list according to how the total amount of resources required by those tasks compares with the current system idle resources. The invention further discloses a task scheduling device based on resource estimation in a big data system.

Description

Task scheduling method and device based on resource estimation in a big data system
Technical field
The present invention relates to the field of task planning and scheduling, and in particular to a task scheduling method and device based on resource estimation in a big data system.
Background
A big data system stores large volumes of data with complex structure, runs many kinds of tasks, processes large amounts of data per task, and has complex dependencies between tasks. Such systems are powerful in both computation and storage, but for any specific big data system the time resources and storage resources available within a given period are fixed. Only by scheduling the tasks in the system sensibly, so that they execute in a coordinated way, can the system's limited resources be used to full effect and the real value of the big data system be realized.
A complete task scheduling process includes at least two parts: task resource estimation and task execution planning. Because the data volume of a big data system is huge and its tasks are numerous and varied, there is currently no effective method for resource estimation, and in practice task resource estimation is usually abandoned. Task execution planning is usually implemented on a control-flow basis. That approach is workable when the number of tasks is small, but as the number of tasks grows and the dependencies between them become complex, its efficiency falls and it becomes very difficult to implement.
In summary, providing a task scheduling scheme based on resource estimation, one that can estimate task resources and complete task planning and scheduling accurately and efficiently, has become a problem in urgent need of a solution.
Summary of the invention
In view of this, embodiments of the present invention aim to provide a task scheduling method and device based on resource estimation in a big data system that can estimate task resources, complete task planning and scheduling accurately and efficiently, and are simple to implement and highly reliable.
To achieve the above purpose, the technical solution of the embodiments of the present invention is realized as follows:
An embodiment of the present invention provides a task scheduling method based on resource estimation in a big data system, the method including:
performing resource estimation on a received task, and adding the task to a task list;
estimating the current system idle resources, and scheduling the tasks in the task list according to how the total amount of resources required by the tasks in the task list compares with the current system idle resources.
In the above scheme, performing resource estimation on the received task includes:
obtaining the data source information of the received task; when the scale of the obtained data source is determined to meet a first condition, choosing n data blocks from the data blocks contained in the data source as the data source of an estimation task; running the estimation task and recording the resources it consumes; and estimating the resources required by the received task from the resources consumed by the estimation task; where n is a positive integer.
In the above scheme, choosing n data blocks from the data blocks contained in the data source as the data source of the estimation task includes:
sorting the data blocks contained in the data source, randomly choosing one data block as the first data block, and then choosing one data block every $\lfloor m/n \rfloor$ data blocks until n data blocks have been chosen; where m is the number of data blocks contained in the data source, and m is a positive integer.
In the above scheme, scheduling the tasks in the task list according to how the total amount of resources required by the tasks in the task list compares with the current system idle resources includes:
when the total amount of resources required by the tasks in the task list is determined to be not greater than the current system idle resources, starting all tasks in the task list;
when the total amount of resources required by the tasks in the task list is determined to be greater than the current system idle resources, starting the tasks in the task list in order of task priority and, among tasks of equal priority, starting first the task that occupies fewer resources.
In the above scheme, starting the tasks in the task list in order of task priority includes:
applying for resource occupation for the tasks in the task list in order of task priority, and starting the tasks whose resource occupation applications succeed, in order of task priority.
An embodiment of the present invention further provides a task scheduling device based on resource estimation in a big data system, the device including a processing module and a scheduling module, where:
the processing module is configured to perform resource estimation on a received task and add the task to a task list;
the scheduling module is configured to estimate the current system idle resources, and to schedule the tasks in the task list according to how the total amount of resources required by the tasks in the task list compares with the current system idle resources.
In the above scheme, the processing module is specifically configured to: obtain the data source information of the received task; when the scale of the obtained data source is determined to meet a first condition, choose n data blocks from the data blocks contained in the data source as the data source of an estimation task; run the estimation task and record the resources it consumes; and estimate the resources required by the received task from the resources consumed by the estimation task; where n is a positive integer.
In the above scheme, the processing module is specifically configured to sort the data blocks contained in the data source, randomly choose one data block as the first data block, and then choose one data block every $\lfloor m/n \rfloor$ data blocks until n data blocks have been chosen; where m is the number of data blocks contained in the data source, and m is a positive integer.
In the above scheme, the scheduling module is specifically configured to start all tasks in the task list when the total amount of resources required by the tasks in the task list is determined to be not greater than the current system idle resources;
and, when the total amount of resources required by the tasks in the task list is determined to be greater than the current system idle resources, to start the tasks in the task list in order of task priority and, among tasks of equal priority, to start first the task that occupies fewer resources.
In the above scheme, the scheduling module is specifically configured to apply for resource occupation for the tasks in the task list in order of task priority, and to start the tasks whose resource occupation applications succeed, in order of task priority.
With the task scheduling method and device based on resource estimation in a big data system provided by the embodiments of the present invention, resource estimation is performed on a received task and the task is added to a task list; the current system idle resources are estimated, and the tasks in the task list are scheduled according to how the total amount of resources required by those tasks compares with the current system idle resources. In this way, task resources can be estimated and task planning and scheduling completed accurately and efficiently, with a simple and highly reliable implementation.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the task scheduling method based on resource estimation in a big data system according to Embodiment 1 of the present invention;
Fig. 2 is a schematic flowchart of the task scheduling method based on resource estimation in a big data system according to Embodiment 2 of the present invention;
Fig. 3 is a schematic structural diagram of the task scheduling device based on resource estimation in a big data system according to an embodiment of the present invention.
Detailed description
The storage strategy of a big data system is to distribute data as randomly and evenly as possible across the nodes of the cluster. Task planning and scheduling are generally based on two things: the run-time cost and the storage cost that a task requires. When the system environment is unchanged, time cost and storage cost depend mainly on the amount of data the task processes, its computation logic, and the time complexity of its algorithm. For a given task, the processing logic and algorithm complexity are fixed, so the time cost and storage cost of the task are proportional to the amount of data the task processes.
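The proportionality described above can be written out as a small worked relation. This is only an illustrative sketch: the symbols $D$, $T$, $S$, $k_T$, and $k_S$ are assumptions introduced here and do not appear in the source.

```latex
% D: amount of data the task processes (assumed symbol)
% T(D), S(D): time cost and storage cost of the task
% k_T, k_S: task-specific constants fixed by the processing logic and algorithm
T(D) \approx k_T \, D, \qquad S(D) \approx k_S \, D
% Consequence: measuring the cost on a small sample of size d gives
% T(D) \approx T(d) \cdot \frac{D}{d}, \qquad S(D) \approx S(d) \cdot \frac{D}{d}
```

Under this reading, running the task on a small, representative portion of the data is enough to extrapolate the cost of the full run.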
In the embodiments of the present invention, resource estimation is performed on a received task and the task is added to a task list; the current system idle resources are estimated, and the tasks in the task list are scheduled according to how the total amount of resources required by the tasks in the task list compares with the current system idle resources.
Fig. 1 is a schematic flowchart of the task scheduling method based on resource estimation in a big data system according to Embodiment 1 of the present invention. As shown in Fig. 1, the method includes:
Step 101: perform resource estimation on the received task, and add the task to the task list;
Here, the task may be a data processing task, and one or more tasks may be received;
Performing resource estimation on the received task includes:
obtaining the data source information of the received task; when the scale of the obtained data source is determined to meet a first condition, choosing n data blocks from the data blocks contained in the data source as the data source of an estimation task; running the estimation task and recording the resources it consumes; and estimating the resources required by the received task from the resources consumed by the estimation task; where n is a positive integer;
Here, the resources include time resources and storage resources.
Further, obtaining the data source information of the received task includes:
parsing the task description file of the received task to obtain the data source information of the task.
Further, determining that the scale of the obtained data source meets the first condition includes:
determining that the total number of data blocks contained in the obtained data source reaches a preset data block threshold; the data block threshold can be set according to actual needs.
Further, choosing n data blocks from the data blocks contained in the data source as the data source of the estimation task includes:
sorting the data blocks contained in the data source, randomly choosing one data block as the first data block, and then choosing one data block every $\lfloor m/n \rfloor$ data blocks, wrapping around to the head of the queue and continuing the count when the tail is reached, until n data blocks have been chosen; where m is the number of data blocks contained in the data source, and m is a positive integer;
Here, sorting the data blocks contained in the data source may mean sorting them randomly, or sorting them according to a set rule;
$\lfloor m/n \rfloor$ denotes the largest integer not greater than $m/n$; for example, if $m/n$ lies between 5 and 6, then $\lfloor m/n \rfloor$ is 5;
The size of n can be set as needed; preferably, n takes the floor of a specified quantity, that is, the largest integer not greater than that quantity [expression missing in source];
In this embodiment, choosing one data block every $\lfloor m/n \rfloor$ data blocks amounts to equal-interval sampling, which effectively reduces the systematic error of the resource estimate. The sampling error can be calculated from the law of large numbers as the sampling error rate $z_{\alpha/2}\,\sigma/\sqrt{n}$, where $z_{\alpha/2}$ is the reliability coefficient and $1-\alpha$ is the confidence level: at a confidence level of 95% the reliability coefficient is 1.96, and at 90% it is 1.645, so a higher confidence level requires more samples; $\sigma$ is the standard deviation, reflecting how far individual sampled values deviate from the overall mean, so the more dispersed the sample values, the larger $\sigma$ and the more samples are needed; n is the sample size, and the more samples, the smaller the error;
Further, a threshold for n can be set according to actual needs; preferably, the threshold of n can be 10.
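The equal-interval selection with wrap-around described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `choose_sample_blocks`, the use of string block identifiers, lexicographic sorting as the "set rule", and an interval of floor(m / n) are all assumptions made for the example.

```python
import random


def choose_sample_blocks(blocks, n, seed=None):
    """Pick n blocks at a fixed interval, wrapping around at the end of the queue.

    Assumption: the interval is floor(m / n), where m is the number of blocks.
    """
    ordered = sorted(blocks)          # "sort by a set rule"; random order is also allowed
    m = len(ordered)
    if not 1 <= n <= m:
        raise ValueError("n must be between 1 and the number of blocks")
    step = m // n                     # assumed sampling interval, at least 1
    rng = random.Random(seed)
    start = rng.randrange(m)          # the randomly chosen first block
    # Indices wrap around to the head of the queue once the tail is passed.
    return [ordered[(start + i * step) % m] for i in range(n)]


if __name__ == "__main__":
    block_ids = [f"block-{i:02d}" for i in range(52)]
    print(choose_sample_blocks(block_ids, 10, seed=1))
```

Because the last offset, (n - 1) * floor(m / n), is always smaller than m, the n chosen blocks are distinct even when the count wraps around.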
Further, running the estimation task and recording the resources it consumes includes:
executing the received task on each of the n chosen data blocks, and collecting and recording run information such as CPU consumption and storage consumption for the n data blocks, from task submission to task completion.
Here, CPU consumption is the time resource occupied by running the task, and storage consumption is the storage resource occupied by running the task; CPU consumption belongs to time resources, and storage consumption belongs to storage resources.
Further, estimating the resources required by the received task from the resources consumed by the estimation task includes:
determining, from the resources consumed by the estimation task, the average resources required per data block among the n chosen data blocks, and determining the resources required by the received task from that per-block average and the total number of data blocks of the data source.
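The per-block average and its extrapolation to the whole data source can be sketched as below. The `Usage` record, its field names and units, and the function name `estimate_task_resources` are hypothetical; the source only states that an average per data block is combined with the total block count.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Resources recorded for one sampled data block (units are assumptions)."""
    cpu_seconds: float     # time resource
    storage_bytes: float   # storage resource


def estimate_task_resources(sample_usages, total_blocks):
    """Scale the mean per-block usage of the samples up to all blocks."""
    n = len(sample_usages)
    if n == 0:
        raise ValueError("at least one sampled block is required")
    mean_cpu = sum(u.cpu_seconds for u in sample_usages) / n
    mean_storage = sum(u.storage_bytes for u in sample_usages) / n
    return Usage(mean_cpu * total_blocks, mean_storage * total_blocks)


if __name__ == "__main__":
    samples = [Usage(2.0, 100.0), Usage(4.0, 300.0)]
    print(estimate_task_resources(samples, total_blocks=50))
```

With the two sample records shown, the per-block means are 3.0 CPU seconds and 200 bytes, so a 50-block source is estimated at 150 CPU seconds and 10000 bytes.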
Step 102: estimate the current system idle resources, and schedule the tasks in the task list according to how the total amount of resources required by the tasks in the task list compares with the current system idle resources;
Here, estimating the current system idle resources includes:
obtaining the current system idle resources by querying the system; this is prior art and is not described further here.
Further, scheduling the tasks in the task list according to how the total amount of resources required by the tasks in the task list compares with the current system idle resources includes:
when the total amount of resources required by the tasks in the task list is determined to be not greater than the current system idle resources, starting all tasks in the task list;
when the total amount of resources required by the tasks in the task list is determined to be greater than the current system idle resources, starting the tasks in the task list in order of task priority and, among tasks of equal priority, starting first the task that occupies fewer resources; this prevents large tasks from blocking small ones and improves scheduling efficiency; here, task priority can be set according to actual needs.
Further, starting the tasks in the task list in order of task priority includes:
applying for resource occupation for the tasks in the task list in order of task priority, and starting the tasks whose resource occupation applications succeed, in order of task priority.
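The two scheduling branches above can be sketched as a single ordering function. This is an illustration under stated assumptions: the `Task` record, the single scalar `required` amount, and the convention that a larger `priority` number means higher priority are all introduced here and are not specified in the source.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int     # assumption: larger value means higher priority
    required: float   # estimated resource amount, in arbitrary units


def launch_order(task_list, idle):
    """Return the order in which tasks would be started."""
    total = sum(t.required for t in task_list)
    if total <= idle:
        # Everything fits: all tasks in the list are started.
        return list(task_list)
    # Otherwise: higher priority first, and the smaller task first on ties.
    return sorted(task_list, key=lambda t: (-t.priority, t.required))


if __name__ == "__main__":
    tasks = [Task("a", 1, 50), Task("b", 2, 30), Task("c", 2, 10)]
    print([t.name for t in launch_order(tasks, idle=60)])
```

Sorting ties by the smaller requirement is what keeps a large task from holding up a small one of the same priority.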
Further, applying for resource occupation for the tasks in the task list in order of task priority includes:
when the current system idle resources are determined to meet the demand of the current task, pre-allocating the amount of resources the task requires, deducting that amount from the current system idle resources, and determining that the resource occupation application of the current task has succeeded, until the resource occupation applications of all tasks in the task list have succeeded;
when the current system idle resources are determined not to meet the demand of the current task, checking at fixed intervals whether the current system idle resources meet the demand of the current task, until they are determined to do so; the interval can be set according to actual needs.
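The apply-and-wait loop above can be sketched as follows. The function name, the `(name, required)` task tuples, and the injected `query_idle` and `sleep` callables are hypothetical; the sketch also assumes that the queried idle amount does not yet reflect earlier pre-allocations, so it tracks them in a local `reserved` total.

```python
import time


def apply_in_priority_order(tasks, query_idle, interval_s=1.0, sleep=time.sleep):
    """Grant resource applications one task at a time, polling while short.

    tasks: (name, required) pairs, already sorted by priority.
    query_idle: callable returning the idle amount reported by the system.
    """
    reserved = 0.0   # resources pre-allocated to tasks granted so far
    granted = []
    for name, required in tasks:
        # Re-check at a fixed interval until the unreserved idle amount suffices.
        while query_idle() - reserved < required:
            sleep(interval_s)
        reserved += required      # deduct the grant from what remains idle
        granted.append(name)
    return granted


if __name__ == "__main__":
    readings = iter([10, 10, 40])
    print(apply_in_priority_order([("a", 8), ("b", 20)],
                                  lambda: next(readings),
                                  interval_s=0.1))
```

As written, the loop waits indefinitely if a task's demand is never met; a production scheduler would presumably add a timeout or a cancellation path, which the source does not describe.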
Further, after this step, the method also includes feeding back the run result of the task, specifically: outputting the run result of the task as a file.
Fig. 2 is a schematic flowchart of the task scheduling method based on resource estimation in a big data system according to Embodiment 2 of the present invention. As shown in Fig. 2, the method includes:
Step 201: receive a task submitted by a user and perform resource estimation on the task;
Here, the task may be a data processing task, and one or more tasks may be received;
Performing resource estimation on the received task includes:
obtaining the data source information of the received task; when the scale of the obtained data source is determined to meet a first condition, choosing n data blocks from the data blocks contained in the data source as the data source of an estimation task; running the estimation task and recording the resources it consumes; and estimating the resources required by the received task from the resources consumed by the estimation task; where n is a positive integer;
Here, the resources include time resources and storage resources.
Further, obtaining the data source information of the received task includes:
parsing the task description file of the received task to obtain the data source information of the task.
Further, determining that the scale of the obtained data source meets the first condition includes:
determining that the total number of data blocks contained in the obtained data source reaches a preset data block threshold; the data block threshold can be set according to actual needs.
Further, choosing n data blocks from the data blocks contained in the data source as the data source of the estimation task includes:
sorting the data blocks contained in the data source, randomly choosing one data block as the first data block, and then choosing one data block every $\lfloor m/n \rfloor$ data blocks, wrapping around to the head of the queue and continuing the count when the tail is reached, until n data blocks have been chosen; where m is the number of data blocks contained in the data source, and m is a positive integer;
Here, sorting the data blocks contained in the data source may mean sorting them randomly, or sorting them according to a set rule;
The size of n can be set as needed; preferably, n takes the floor of a specified quantity, that is, the largest integer not greater than that quantity [expression missing in source];
In this embodiment, choosing one data block every $\lfloor m/n \rfloor$ data blocks amounts to equal-interval sampling, which effectively reduces the systematic error of the resource estimate. The sampling error can be calculated from the law of large numbers as the sampling error rate $z_{\alpha/2}\,\sigma/\sqrt{n}$, where $z_{\alpha/2}$ is the reliability coefficient and $1-\alpha$ is the confidence level: at a confidence level of 95% the reliability coefficient is 1.96, and at 90% it is 1.645, so a higher confidence level requires more samples. $\sigma$ is the standard deviation, reflecting how far individual sampled values deviate from the overall mean, so the more dispersed the sample values, the larger $\sigma$ and the more samples are needed; n is the sample size, and the more samples, the smaller the error.
Further, a threshold for n can be set according to actual needs; preferably, the threshold of n can be 10.
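The relationship between confidence level, dispersion, and sample size can be checked numerically. The sketch below assumes the standard margin-of-error form z * sigma / sqrt(n), which may differ from the exact expression in the original filing; the function names are hypothetical, and `statistics.NormalDist` comes from the Python standard library.

```python
import math
from statistics import NormalDist


def reliability_coefficient(confidence):
    """Two-sided z value for a confidence level such as 0.95."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def sampling_error(confidence, sigma, n):
    """Assumed margin of error: z * sigma / sqrt(n), sigma a standard deviation."""
    return reliability_coefficient(confidence) * sigma / math.sqrt(n)


if __name__ == "__main__":
    print(round(reliability_coefficient(0.95), 2))   # 1.96
    print(round(reliability_coefficient(0.90), 3))   # 1.645
    print(sampling_error(0.95, sigma=2.0, n=16))
```

The computed coefficients match the values quoted in the text (1.96 at 95% and 1.645 at 90%), and the error shrinks as n grows or as the per-block measurements become less dispersed.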
Further, running the estimation task and recording the resources it consumes includes:
executing the received task on each of the n chosen data blocks, and collecting and recording run information such as CPU consumption and storage consumption for the n data blocks, from task submission to task completion.
Further, estimating the resources required by the received task from the resources consumed by the estimation task includes:
determining, from the resources consumed by the estimation task, the average resources required per data block among the n chosen data blocks, and determining the resources required by the received task from that per-block average and the total number of data blocks of the data source.
Step 202: add the task to the task list and estimate the current system idle resources;
Here, estimating the current system idle resources includes:
CPU consumption is the time resource occupied by running the task, and storage consumption is the storage resource occupied by running the task; CPU consumption belongs to time resources, and storage consumption belongs to storage resources.
Step 203: judge whether the total amount of resources required by the tasks in the task list exceeds the current system idle resources; if it does, execute step 204; if it does not, execute step 205;
Step 204: start the tasks in the task list in order of task priority and, among tasks of equal priority, start first the task that occupies fewer resources;
Here, task priority can be set according to actual needs;
Starting the tasks in the task list in order of task priority includes:
applying for resource occupation for the tasks in the task list in order of task priority, and starting the tasks whose resource occupation applications succeed, in order of task priority.
Further, applying for resource occupation for the tasks in the task list in order of task priority includes:
when the current system idle resources are determined to meet the demand of the current task, pre-allocating the amount of resources the task requires, deducting that amount from the current system idle resources, and determining that the resource occupation application of the current task has succeeded, until the resource occupation applications of all tasks in the task list have succeeded;
when the current system idle resources are determined not to meet the demand of the current task, checking at fixed intervals whether the current system idle resources meet the demand of the current task, until they are determined to do so; the interval can be set according to actual needs.
Step 205: start all tasks in the task list.
Step 206: feed back the task run result and end this processing flow;
Here, feeding back the task run result includes outputting the run result of the task as a file.
Fig. 3 is a schematic structural diagram of the task scheduling device based on resource estimation in a big data system according to an embodiment of the present invention. As shown in Fig. 3, the device includes a processing module 31 and a scheduling module 32, where:
the processing module 31 is configured to perform resource estimation on a received task and add the task to the task list;
the scheduling module 32 is configured to estimate the current system idle resources, and to schedule the tasks in the task list according to how the total amount of resources required by the tasks in the task list compares with the current system idle resources.
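The division of work between the two modules can be sketched as two small classes. The class and method names, the injected `estimator` and `query_idle` callables, and the representation of the task list as `(task, required)` pairs are all assumptions for illustration; the source only names the two modules and their responsibilities.

```python
class ProcessingModule:
    """Estimates each received task and keeps the task list (cf. module 31)."""

    def __init__(self, estimator):
        self.estimator = estimator   # callable: task -> required resource amount
        self.task_list = []

    def receive(self, task):
        required = self.estimator(task)
        self.task_list.append((task, required))


class SchedulingModule:
    """Compares total demand with the idle resources (cf. module 32)."""

    def __init__(self, query_idle):
        self.query_idle = query_idle  # callable returning the current idle amount

    def fits_entirely(self, task_list):
        total = sum(required for _, required in task_list)
        return total <= self.query_idle()


if __name__ == "__main__":
    processing = ProcessingModule(estimator=len)   # toy estimator for the demo
    processing.receive("abc")
    processing.receive("de")
    scheduling = SchedulingModule(query_idle=lambda: 5)
    print(scheduling.fits_entirely(processing.task_list))
```

Passing the estimator and the idle-resource query in as callables keeps the sketch independent of any particular cluster manager.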
Further, the processing module 31 performing resource estimation on the received task includes:
the processing module 31 obtaining the data source information of the received task; when the scale of the obtained data source is determined to meet a first condition, choosing n data blocks from the data blocks contained in the data source as the data source of an estimation task; running the estimation task and recording the resources it consumes; and estimating the resources required by the received task from the resources consumed by the estimation task; where n is a positive integer;
Here, the resources include time resources and storage resources.
Further, the processing module 31 obtaining the data source information of the received task includes:
the processing module 31 parsing the task description file of the received task to obtain the data source information of the task.
Further, the processing module 31 determining that the scale of the obtained data source meets the first condition includes:
the processing module 31 determining that the total number of data blocks contained in the obtained data source reaches a preset data block threshold; the data block threshold can be set according to actual needs.
Further, the processing module 31 choosing n data blocks from the data blocks contained in the data source as the data source of the estimation task includes:
the processing module 31 sorting the data blocks contained in the data source, randomly choosing one data block as the first data block, and then choosing one data block every $\lfloor m/n \rfloor$ data blocks, wrapping around to the head of the queue and continuing the count when the tail is reached, until n data blocks have been chosen; where m is the number of data blocks contained in the data source, and m is a positive integer;
Here, sorting the data blocks contained in the data source may mean sorting them randomly, or sorting them according to a set rule;
The size of n can be set as needed; preferably, n takes the floor of a specified quantity, that is, the largest integer not greater than that quantity [expression missing in source];
In this embodiment, choosing one data block every $\lfloor m/n \rfloor$ data blocks amounts to equal-interval sampling, which effectively reduces the systematic error of the resource estimate. The sampling error can be calculated from the law of large numbers as the sampling error rate $z_{\alpha/2}\,\sigma/\sqrt{n}$, where $z_{\alpha/2}$ is the reliability coefficient and $1-\alpha$ is the confidence level: at a confidence level of 95% the reliability coefficient is 1.96, and at 90% it is 1.645, so a higher confidence level requires more samples. $\sigma$ is the standard deviation, reflecting how far individual sampled values deviate from the overall mean, so the more dispersed the sample values, the larger $\sigma$ and the more samples are needed; n is the sample size, and the more samples, the smaller the error;
Further, a threshold for n can be set according to actual needs; preferably, the threshold of n can be 10.
Further, the processing module 31 running the estimation task and recording the resources it consumes includes:
the processing module 31 executing the received task on each of the n chosen data blocks, and collecting and recording run information such as CPU consumption and storage consumption for the n data blocks, from task submission to task completion.
Further, the processing module 31 estimating the resources required by the received task from the resources consumed by the estimation task includes:
the processing module 31 determining, from the resources consumed by the estimation task, the average resources required per data block among the n chosen data blocks, and determining the resources required by the received task from that per-block average and the total number of data blocks of the data source.
Further, the scheduling module 32 estimating the current system idle resources includes:
The scheduling module 32 can obtain the current system idle resources by querying the system; this is prior art and is not described in detail here.
Further, the scheduling module 32 scheduling the tasks in the task list according to the relationship between the total resources required by the tasks in the task list and the current system idle resources includes:
When the scheduling module 32 determines that the total resources required by the tasks in the task list are not greater than the current system idle resources, it starts all tasks in the task list;
When it determines that the total resources required by the tasks in the task list are greater than the current system idle resources, it starts the tasks in the task list in order of task priority and, among tasks of equal priority, preferentially starts the task that occupies fewer resources. In this way, large tasks are prevented from blocking small tasks and task scheduling efficiency is improved. Here, the task priority can be set according to actual needs.
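The two scheduling cases above can be sketched as follows. This is a simplified illustration that treats resources as a single scalar and assumes a larger `priority` value means higher priority; the `Task` class and `start_order` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Task:
    name: str
    priority: int      # assumption: larger value means higher priority
    required: float    # estimated resource amount, as a single scalar


def start_order(tasks: Sequence[Task], idle: float) -> List[Task]:
    """Return the order in which the tasks in the task list would be started."""
    total = sum(t.required for t in tasks)
    if total <= idle:
        # Everything fits: all tasks in the list can be started.
        return list(tasks)
    # Otherwise go by priority; among equal priorities, smaller tasks first.
    return sorted(tasks, key=lambda t: (-t.priority, t.required))


if __name__ == "__main__":
    task_list = [Task("a", 1, 50), Task("b", 2, 30), Task("c", 1, 10)]
    print([t.name for t in start_order(task_list, idle=60)])
```

Sorting on the pair (priority, required amount) is what keeps a large task from being placed ahead of a small task of the same priority.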
Further, the scheduling module 32 starting the tasks in the task list in order of task priority includes:
The scheduling module 32 applies for resource occupation for the tasks in the task list in order of task priority, and starts the tasks whose resource occupation applications succeed, again in order of task priority.
Further, the scheduling module 32 applying for resource occupation for the tasks in the task list in order of task priority includes:
When the scheduling module 32 determines that the current system idle resources meet the demand of the current task, it pre-allocates the amount of resources required by the task, deducts that amount from the current system idle resources, and determines that the resource occupation application of the current task has succeeded; this continues until the resource occupation applications of all tasks in the task list have succeeded;
When it determines that the current system idle resources do not meet the demand of the current task, it checks at fixed intervals whether the current system idle resources meet the demand of the current task, until it determines that they do; the interval can be set according to actual needs.
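The application loop above can be sketched as follows. It assumes that the idle-resource query does not yet reflect amounts pre-allocated by this loop, so those are tracked locally and deducted; that bookkeeping detail is an assumption. `apply_for_resources`, `query_idle` and the injectable `sleep` parameter are hypothetical names.

```python
import time
from typing import Callable, List, Sequence, Tuple


def apply_for_resources(tasks: Sequence[Tuple[str, float]],
                        query_idle: Callable[[], float],
                        interval_seconds: float = 1.0,
                        sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Pre-allocate resources for tasks given in priority order.

    tasks holds (name, required amount) pairs. Returns the task names in the
    order in which their resource occupation applications succeeded.
    """
    reserved = 0.0            # amount already pre-allocated by this loop
    granted: List[str] = []
    for name, required in tasks:
        # Re-check at a fixed interval until the idle resources are sufficient.
        while query_idle() - reserved < required:
            sleep(interval_seconds)
        reserved += required  # deduct the pre-allocated amount
        granted.append(name)
    return granted


if __name__ == "__main__":
    readings = iter([100.0, 100.0, 120.0, 200.0])
    order = apply_for_resources([("b", 60.0), ("a", 80.0)],
                                lambda: next(readings),
                                interval_seconds=0.0)
    print(order)
```

As in the text, the loop waits indefinitely for a task whose demand is not met; a production scheduler would likely add a timeout, which is outside what the source describes.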
Further, the device also includes a feedback module 33 for feeding back the running result of the task.
In the embodiments of the present invention, the processing module 31, the scheduling module 32 and the feedback module 33 can each be implemented by a central processing unit (CPU), a digital signal processor (DSP) or a field programmable gate array (FPGA) in a server.
The above are only preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention.

Claims (10)

1. A task scheduling method based on resource estimation in a big data system, characterized in that the method includes:
Performing resource estimation on a received task, and adding the task to a task list;
Estimating the current system idle resources, and scheduling the tasks in the task list according to the relationship between the total resources required by the tasks in the task list and the current system idle resources.
2. The method according to claim 1, characterized in that performing resource estimation on the received task includes:
Obtaining the data source information of the received task; when it is determined that the scale of the obtained data source meets a first condition, choosing n data blocks from the data blocks contained in the data source as the data source of an estimation task, running the estimation task and recording the resources consumed by the estimation task, and estimating the resources required by the received task according to the resources consumed by the estimation task; where n is a positive integer.
3. The method according to claim 2, characterized in that choosing n data blocks from the data blocks contained in the data source as the data source of the estimation task includes:
Sorting the data blocks contained in the data source, randomly choosing one data block as the first data block, and then choosing one data block at each fixed interval of data blocks until n data blocks have been chosen; where m is the number of data blocks contained in the data source, and m is a positive integer.
4. The method according to claim 1 or 2, characterized in that scheduling the tasks in the task list according to the relationship between the total resources required by the tasks in the task list and the current system idle resources includes:
When it is determined that the total resources required by the tasks in the task list are not greater than the current system idle resources, starting all tasks in the task list;
When it is determined that the total resources required by the tasks in the task list are greater than the current system idle resources, starting the tasks in the task list in order of task priority and, among tasks of equal priority, preferentially starting the task that occupies fewer resources.
5. The method according to claim 4, characterized in that starting the tasks in the task list in order of task priority includes:
Applying for resource occupation for the tasks in the task list in order of task priority, and starting the tasks whose resource occupation applications succeed in order of task priority.
6. A task scheduling device based on resource estimation in a big data system, characterized in that the device includes a processing module and a scheduling module; wherein,
The processing module is configured to perform resource estimation on a received task and add the task to a task list;
The scheduling module is configured to estimate the current system idle resources, and to schedule the tasks in the task list according to the relationship between the total resources required by the tasks in the task list and the current system idle resources.
7. The device according to claim 6, characterized in that the processing module is specifically configured to obtain the data source information of the received task; when it is determined that the scale of the obtained data source meets a first condition, choose n data blocks from the data blocks contained in the data source as the data source of an estimation task, run the estimation task and record the resources consumed by the estimation task, and estimate the resources required by the received task according to the resources consumed by the estimation task; where n is a positive integer.
8. The device according to claim 7, characterized in that the processing module is specifically configured to sort the data blocks contained in the data source, randomly choose one data block as the first data block, and then choose one data block at each fixed interval of data blocks until n data blocks have been chosen; where m is the number of data blocks contained in the data source, and m is a positive integer.
9. The device according to claim 6 or 7, characterized in that the scheduling module is specifically configured to start all tasks in the task list when it is determined that the total resources required by the tasks in the task list are not greater than the current system idle resources;
And, when it is determined that the total resources required by the tasks in the task list are greater than the current system idle resources, to start the tasks in the task list in order of task priority and, among tasks of equal priority, preferentially start the task that occupies fewer resources.
10. The device according to claim 9, characterized in that the scheduling module is specifically configured to apply for resource occupation for the tasks in the task list in order of task priority, and to start the tasks whose resource occupation applications succeed in order of task priority.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510411512.1A CN106354553A (en) 2015-07-14 2015-07-14 Task scheduling method and device based on resource estimation in big data system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510411512.1A CN106354553A (en) 2015-07-14 2015-07-14 Task scheduling method and device based on resource estimation in big data system

Publications (1)

Publication Number Publication Date
CN106354553A true CN106354553A (en) 2017-01-25

Family

ID=57842180

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510411512.1A Pending CN106354553A (en) 2015-07-14 2015-07-14 Task scheduling method and device based on resource estimation in big data system

Country Status (1)

Country Link
CN (1) CN106354553A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991012A (en) * 2017-04-11 2017-07-28 广东浪潮大数据研究有限公司 A kind of computer resource compression is reserved and dynamic dispatching method
CN108733473A (en) * 2018-05-11 2018-11-02 北京航天发射技术研究所 A kind of positioning based on VxWorks aims at the control method of integration apparatus task
CN112540854A (en) * 2020-12-28 2021-03-23 上海体素信息科技有限公司 Deep learning model scheduling deployment method and system under condition of limited hardware resources

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020078028A1 (en) * 2000-12-18 2002-06-20 Trevalon Inc. Network server
CN1879086A (en) * 2003-11-13 2006-12-13 皇家飞利浦电子股份有限公司 Method and system for restrained budget use
CN102111337A (en) * 2011-03-14 2011-06-29 浪潮(北京)电子信息产业有限公司 Method and system for task scheduling
CN103699445A (en) * 2013-12-19 2014-04-02 北京奇艺世纪科技有限公司 Task scheduling method, device and system
CN104520815A (en) * 2014-03-17 2015-04-15 华为技术有限公司 Method, device and equipment for task scheduling

Similar Documents

Publication Publication Date Title
CN105718479B (en) Execution strategy generation method and device under cross-IDC big data processing architecture
CN111381950B (en) Multi-copy-based task scheduling method and system for edge computing environment
CN111176419B (en) Method and apparatus for estimating power performance of jobs running on multiple nodes of a distributed computer system
US8812639B2 (en) Job managing device, job managing method and job managing program
WO2017045553A1 (en) Task allocation method and system
CN106209682B (en) Business scheduling method, device and system
Reda et al. Rein: Taming tail latency in key-value stores via multiget scheduling
CN107656813A (en) The method, apparatus and terminal of a kind of load dispatch
CN105718317A (en) Task scheduling method and task scheduling device
CN110389816B (en) Method, apparatus and computer readable medium for resource scheduling
CN103595651B (en) Distributed data stream processing method and system
CN111104211A (en) Task dependency based computation offload method, system, device and medium
CN103401939A (en) Load balancing method adopting mixing scheduling strategy
CN109582448A (en) A kind of edge calculations method for scheduling task towards criticality and timeliness
CN105320570B (en) Method for managing resource and system
CN104765640A (en) Intelligent service scheduling method
CN108132840B (en) Resource scheduling method and device in distributed system
CN105389206A (en) Method for rapidly configuring virtual machine resources in cloud computing data center
CN103257896B (en) A kind of Max-D job scheduling method under cloud environment
CN106293947B (en) GPU-CPU (graphics processing Unit-Central processing Unit) mixed resource allocation system and method in virtualized cloud environment
CN109039929A (en) Business scheduling method and device
CN106354553A (en) Task scheduling method and device based on resource estimation in big data system
CN106776025A (en) A kind of computer cluster job scheduling method and its device
CN104461722B (en) A kind of job scheduling method for cloud computing system
Huang et al. Rush: A robust scheduler to manage uncertain completion-times in shared clouds

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170125