CN107766143A - Data processing management system, and task management and task scheduling methods and devices - Google Patents


Info

Publication number
CN107766143A
Authority
CN
China
Prior art keywords
task
data processing
peak period
target data
platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610677839.8A
Other languages
Chinese (zh)
Inventor
王勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201610677839.8A
Publication of CN107766143A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00: Indexing scheme relating to G06F9/00
    • G06F2209/50: Indexing scheme relating to G06F9/50
    • G06F2209/504: Resource capping

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

This application provides a data processing management system, a data processing task management method and device, and a task scheduling method and device. The system includes a data processing task management device and a task scheduling platform. The data processing task management device identifies the data processing peak period of a data processing platform, extracts multiple data processing tasks in that peak period, screens out target data processing tasks whose processing time is adjustable according to the task attributes of each data processing task, and notifies the task scheduling platform. The task scheduling platform, according to the notification from the data processing task management device, adjusts at least one target data processing task to a target period outside the data processing peak period for processing. The application can thus shift peak-period tasks to off-peak periods, which achieves peak load shifting, lowers the demand that the peak period places on processing resources, and therefore reduces processing cost.

Description

Data processing management system, and task management and task scheduling methods and devices
Technical field
This application relates to the technical field of data processing, and in particular to a data processing management system, a data processing task management method, a data processing task management device, a task scheduling method, and a task scheduling device.
Background
Big data processing is usually carried out jointly by two relatively independent systems: a scheduling system and a big data processing platform. The scheduling system receives predefined data processing tasks and issues them in batches, according to rules set by business personnel, to the big data processing platform for execution. The big data processing platform executes all issued tasks and outputs the data, so that data is processed in an orderly and controllable way.
In data processing scenarios, tasks usually depend on one another because of the data processing logic. The scheduling system therefore performs state control and task triggering based on a workflow or a DAG (directed acyclic graph) in order to support task dependencies. The directed acyclic graph in Fig. 1, which represents task dependencies, contains tasks A to F. Following the direction of the edges between tasks, the scheduling system can trigger the issuing of task C only after both A and B have finished computing.
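The triggering rule described above can be sketched in Python as follows. This is only an illustration: the function name `ready_tasks` is hypothetical, and apart from "C depends on A and B" the edges of the graph are assumptions, since Fig. 1 is not reproduced here.

```python
def ready_tasks(deps, done):
    """Return the unfinished tasks whose dependencies have all finished."""
    return sorted(
        task for task, needed in deps.items()
        if task not in done and needed <= done
    )

# Only "C depends on A and B" comes from the text; the other edges are
# hypothetical placeholders for the rest of the graph in Fig. 1.
deps = {
    "A": set(),
    "B": set(),
    "C": {"A", "B"},
    "D": {"C"},
    "E": {"C"},
    "F": {"D", "E"},
}

print(ready_tasks(deps, {"A"}))       # C is not ready while B is unfinished
print(ready_tasks(deps, {"A", "B"}))  # C becomes ready
```

A set-inclusion test (`needed <= done`) is enough here because a task is triggered only when every one of its upstream tasks has completed.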
In practice, because of task dependencies and the inconsistent arrival times of upstream data sources, the consumption of computing resources during task processing often looks as shown in Fig. 2. According to statistics, the main period in which the data is consumed falls on working-day mornings, and Fig. 2 shows that data processing tasks pile up heavily in the period from 0:00 to 9:00.
When a large number of data processing tasks pile up in the data processing peak period, they place higher demands on data processing resources and greatly increase processing cost. In addition, with too many tasks the platform is overloaded, so some tasks are not processed in time and timeliness suffers. Once the peak period has passed, resource consumption drops sharply and some processing resources sit idle, so resource consumption is unbalanced.
Summary of the invention
The technical problem to be solved by this application is to provide a data processing management system, a data processing task management method and device, and a task scheduling method and device that partly or fully solve the above technical problems.
To solve the above problems, this application discloses a data processing management system comprising a data processing task management device and a task scheduling platform;
The data processing task management device is configured to identify the data processing peak period of a data processing platform; extract multiple data processing tasks in the data processing peak period; screen out, according to the task attributes of each data processing task, target data processing tasks whose processing time is adjustable; and notify the task scheduling platform;
The task scheduling platform is configured to adjust, according to the notification from the data processing task management device, at least one target data processing task to a target period outside the data processing peak period for processing.
This application also provides a data processing task management method, comprising:
Identifying the data processing peak period of a data processing platform;
Extracting multiple data processing tasks in the data processing peak period;
Screening out, according to the task attributes of each data processing task, target data processing tasks whose processing time is adjustable;
Notifying a task scheduling platform to adjust at least one target data processing task to a target period outside the data processing peak period for processing.
Preferably, identifying the data processing peak period of the data processing platform comprises:
Selecting, as the data processing peak period, at least one data processing period whose historical task processing volume exceeds a set threshold.
Preferably, before the data processing peak period of the data processing platform is identified, the method further comprises:
Extracting history processing records of multiple data processing tasks from the data processing platform;
Analyzing the history processing records to determine the historical task processing volume of each data processing period.
Preferably, determining the historical task processing volume of each data processing period comprises:
Dividing a unit time into multiple data processing periods;
Computing, for each data processing period, the sum of its historical task processing volume over multiple unit times.
Preferably, identifying the data processing peak period of the data processing platform further comprises:
Counting the current task processing volume of the data processing peak period;
If comparing the current task processing volume with the set threshold and comparing the historical task processing volume with the set threshold give inconsistent results, screening at least one data processing peak period again according to the current task processing total.
Preferably, notifying the task scheduling platform to adjust at least one target data processing task to a target period outside the data processing peak period for processing comprises:
Looking up the delay duration corresponding to the set threshold that the historical data computation total exceeds;
Notifying the task scheduling platform to delay, according to the delay duration found, the target data processing tasks in the data processing peak period.
Preferably, there are multiple set thresholds;
Looking up the delay duration corresponding to the set threshold that the historical data computation total exceeds comprises:
Looking up the delay duration corresponding to the largest set threshold that the historical data computation total exceeds, where a larger set threshold corresponds to a longer delay duration.
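As a rough illustration of this lookup, the following Python sketch maps each set threshold to a delay duration and returns the delay for the largest threshold that the total exceeds. The function name `lookup_delay`, the threshold values, and the use of minutes as the unit are all assumptions made for the example; the application does not specify them.

```python
def lookup_delay(total, delay_by_threshold):
    """Return the delay for the largest threshold that `total` exceeds.

    Returns None when no threshold is exceeded, in which case no delay applies.
    """
    exceeded = [threshold for threshold in delay_by_threshold if total > threshold]
    if not exceeded:
        return None
    return delay_by_threshold[max(exceeded)]

# Hypothetical table: a larger threshold maps to a longer delay (minutes).
delays = {1000: 30, 2000: 60, 5000: 120}

print(lookup_delay(2500, delays))  # exceeds 1000 and 2000, so the 2000 entry applies
```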
Preferably, the task attributes include the data service to which the data processing task belongs and/or the output timeliness requirement of the data processing task.
Preferably, before the target data processing tasks whose processing time is adjustable are screened out according to the task attributes of each data processing task, the method further comprises:
Determining the data service to which each data processing task belongs and/or the output timeliness requirement of the data processing task.
Preferably, determining the data service to which each data processing task belongs and/or the output timeliness requirement of the data processing task comprises:
For each data processing task, finding at least one dependency task on which the data processing task depends;
Determining the business weight value of the data processing task from the business weight values of the data services to which its dependency tasks belong, and/or determining the output timeliness requirement of the data processing task from the output timeliness requirements of its dependency tasks.
Preferably, determining the business weight value of the data processing task from the business weight values of the data services to which its dependency tasks belong comprises:
Selecting the highest business weight value among all the dependency tasks as the business weight value of the data processing task;
Determining the output timeliness requirement of the data processing task from the output timeliness requirements of its dependency tasks comprises:
Selecting the highest output timeliness requirement among all the dependency tasks as the output timeliness requirement of the data processing task.
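The "take the highest value" rule can be sketched as below. The sketch assumes that both the business weight and the output timeliness requirement are encoded as integers where a larger number means more important or stricter; that encoding, and the name `derive_attributes`, are illustrative assumptions rather than part of the application.

```python
def derive_attributes(dependencies):
    """Derive (business_weight, timeliness_level) for a task.

    `dependencies` is a non-empty list of (business_weight, timeliness_level)
    pairs, one per dependency task. Each attribute is taken independently as
    the maximum over all dependency tasks.
    """
    if not dependencies:
        raise ValueError("at least one dependency task is required")
    weight = max(w for w, _ in dependencies)
    timeliness = max(t for _, t in dependencies)
    return weight, timeliness

print(derive_attributes([(2, 1), (5, 3), (4, 2)]))
```

Note that the two maxima may come from different dependency tasks, since the text selects each attribute separately.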
Preferably, screening out, according to the task attributes of each data processing task, the target data processing tasks whose processing time is adjustable comprises:
Extracting the data processing tasks for which the business weight value of the data service to which the task belongs is below a first set level and/or the output timeliness requirement is below a first set requirement.
Preferably, notifying the task scheduling platform to adjust at least one target data processing task to a target period outside the data processing peak period for processing comprises:
Notifying the task scheduling platform to modify the start time of the selected target data processing task so that the target data processing task starts later.
Preferably, screening out, according to the task attributes of each data processing task, the target data processing tasks whose processing time is adjustable comprises:
Extracting the data processing tasks for which the business weight value of the data service to which the task belongs is above a second set level and/or the output timeliness requirement is above a second set requirement.
Preferably, notifying the task scheduling platform to adjust at least one target data processing task to a target period outside the data processing peak period for processing comprises:
Notifying the task scheduling platform to modify the start time of the selected target data processing task so that the target data processing task starts earlier.
Preferably, before the task scheduling platform is notified to modify the start time of the selected target data processing task, notifying the task scheduling platform to adjust at least one target data processing task to a target period outside the data processing peak period for processing further comprises:
Finding at least one dependency task on which the target data processing task to be adjusted depends;
Determining, according to the start time of the dependency task, the start time of the target data processing task to be within the target period outside the data processing peak period and after the completion time of the dependency task.
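For the delayed-start case, one simple way to satisfy both conditions is to start no earlier than the end of the peak period and no earlier than the latest dependency completion time. The Python sketch below assumes times are expressed as hours of the day and that the peak period is a single interval ending at `peak_end`; the function name and this representation are hypothetical, and the sketch ignores the task's own running time.

```python
def delayed_start(dependency_finish_times, peak_end):
    """Earliest start that is after the peak period and after all dependencies.

    `dependency_finish_times` holds the completion time (hour of day) of each
    dependency task; `peak_end` is the hour at which the peak period ends.
    """
    latest_dependency = max(dependency_finish_times)
    return max(latest_dependency, peak_end)

# Peak assumed to end at 9:00; the dependencies finish at 3:00 and 5:00.
print(delayed_start([3, 5], peak_end=9))
```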
Preferably, before the task scheduling platform is notified to adjust at least one target data processing task to a target period outside the data processing peak period for processing, the method further comprises:
Selecting at least one target data processing task according to the resource consumption of each target data processing task.
Preferably, before at least one target data processing task is selected according to the resource consumption of each target data processing task, the method further comprises:
Determining the resource consumption of each target data processing task;
Notifying the task scheduling platform to adjust at least one target data processing task to a target period outside the data processing peak period for processing comprises:
Selecting at least one target data processing task ranked highest when resource consumption is sorted in descending order, and notifying the task scheduling platform to adjust the selected target data processing task to a target period outside the data processing peak period for processing.
Preferably, before the task scheduling platform is notified to adjust at least one target data processing task to a target period outside the data processing peak period for processing, the method further comprises:
Generating an adjustment notification for the target data processing task and sending it to a task management client;
Receiving a confirmation instruction from the task management client for the adjustment notification.
This application also provides a task scheduling method, comprising:
Receiving an adjustment notification, the adjustment notification instructing that at least one target data processing task be adjusted to a target period outside the data processing peak period for processing, where the target data processing task is obtained by identifying the data processing peak period of a data processing platform, extracting multiple data processing tasks in the data processing peak period, and screening out, according to the task attributes of each data processing task, the target data processing tasks whose processing time is adjustable;
Adjusting, according to the notification, the at least one target data processing task to a target period outside the data processing peak period for processing;
Scheduling the adjusted target data processing task.
This application also provides a data processing task management device, comprising:
A peak period identification module, configured to identify the data processing peak period of a data processing platform;
A task extraction module, configured to extract multiple data processing tasks in the data processing peak period;
A task screening module, configured to screen out, according to the task attributes of each data processing task, target data processing tasks whose processing time is adjustable;
A task adjustment notification module, configured to notify a task scheduling platform to adjust at least one target data processing task to a target period outside the data processing peak period for processing.
Preferably, the peak period identification module is specifically configured to select, as the data processing peak period, at least one data processing period whose historical task processing volume exceeds a set threshold.
Preferably, the device further comprises:
A record extraction module, configured to extract, before the data processing peak period of the data processing platform is identified, history processing records of multiple data processing tasks from the data processing platform;
A processing volume analysis module, configured to analyze the history processing records and determine the historical task processing volume of each data processing period.
Preferably, the processing volume analysis module comprises:
A processing period division submodule, configured to divide a unit time into multiple data processing periods;
A task volume statistics submodule, configured to compute, for each data processing period, the sum of its historical task processing volume over multiple unit times.
Preferably, the peak period identification module further comprises:
A current task volume statistics submodule, configured to count the current task processing volume of the data processing peak period;
A peak period re-screening submodule, configured to screen at least one data processing peak period again according to the current task processing total if comparing the current task processing volume with the set threshold and comparing the historical task processing volume with the set threshold give inconsistent results.
Preferably, the task adjustment notification module comprises:
A delay duration lookup submodule, configured to look up the delay duration corresponding to the set threshold that the historical data computation total exceeds;
A delay duration notification submodule, configured to notify the task scheduling platform to delay, according to the delay duration found, the target data processing tasks in the data processing peak period.
Preferably, there are multiple set thresholds;
The delay duration lookup submodule is specifically configured to look up the delay duration corresponding to the largest set threshold that the historical data computation total exceeds, where a larger set threshold corresponds to a longer delay duration.
This application also provides a task scheduling device, comprising:
A notification receiving module, configured to receive a notification, where the target data processing task is obtained by identifying the data processing peak period of a data processing platform, extracting multiple data processing tasks in the data processing peak period, and screening out, according to the task attributes of each data processing task, the target data processing tasks whose processing time is adjustable;
An adjustment module, configured to adjust, according to the notification, at least one target data processing task to a target period outside the data processing peak period for processing;
A scheduling module, configured to schedule the adjusted target data processing task.
Compared with the prior art, this application has the following advantages:
According to the embodiments of this application, the data processing peak period of the data processing platform is identified, the data processing tasks in the data processing peak period are extracted, the target data processing tasks whose processing time is adjustable are screened out according to task attributes, and at least one of the screened target data processing tasks is adjusted to a target period outside the data processing peak period for processing. Peak-period tasks are thereby shifted to off-peak periods, which achieves peak load shifting, lowers the demand that the peak period places on processing resources, and reduces processing cost. In addition, because the peak period has fewer tasks, tasks with higher timeliness requirements can be processed in time. Because target data processing tasks are moved to off-peak periods for processing, the utilization of idle resources increases and resource consumption is balanced across periods.
Brief description of the drawings
Fig. 1 is the directed acyclic graph representing task dependencies referred to in the background;
Fig. 2 is a schematic diagram of computing resource consumption during task processing referred to in the background;
Fig. 3 is an application schematic of a data processing management system of this application;
Fig. 4 is a flowchart of embodiment 1 of a data processing task management method of this application;
Fig. 5 is a flowchart of embodiment 2 of a data processing task management method of this application;
Fig. 6 is a flowchart of an embodiment of a task scheduling method of this application;
Fig. 7 is an architecture diagram of a data processing task management system implementing an embodiment of this application;
Fig. 8 is a data flow diagram of a data processing task management system implementing an embodiment of this application;
Fig. 9 is a flowchart of the steps of data processing task management implementing an embodiment of this application;
Fig. 10 is a schematic diagram of computing resource consumption after tuning by an embodiment of this application;
Fig. 11 is a structural block diagram of an embodiment of a data processing task management device of this application;
Fig. 12 is a structural block diagram of an embodiment of a task scheduling device of this application.
Detailed description
To make the above objects, features, and advantages of this application clearer and easier to understand, the application is described in further detail below with reference to the accompanying drawings and specific embodiments.
This application provides a data processing management system comprising a data processing task management device and a task scheduling platform. Fig. 3 shows an application schematic of the data processing management system. The data processing task management device identifies the data processing peak period of the data processing platform, extracts multiple data processing tasks in the data processing peak period, screens out the target data processing tasks whose processing time is adjustable according to the task attributes of each data processing task, and finally notifies the task scheduling platform. The task scheduling platform, according to the notification from the data processing task management device, adjusts at least one target data processing task to a target period outside the data processing peak period for processing. The following method steps can be used:
Referring to Fig. 4, a flowchart of embodiment 1 of a data processing task management method of this application is shown. The method may specifically include the following steps:
Step 101: identify the data processing peak period of the data processing platform.
In the data processing peak period, the backlog of data processing tasks and the utilization of processing resources far exceed those of other data processing periods. The data processing peak period can be found by comparing multiple data processing periods, and there are several specific ways to identify it. For example, the task processing volumes of multiple data processing periods can be compared, and the periods whose task processing volume is relatively high or exceeds a set threshold are taken as the data processing peak period; the task processing volume can be expressed as the number of tasks processed, the amount of computation of the tasks processed, and so on. Alternatively, the resource utilization of multiple data processing periods can be compared, and the periods whose resource utilization is relatively high or exceeds a set threshold are taken as the data processing peak period.
Step 102: extract multiple data processing tasks in the data processing peak period.
The embodiment analyzes the data processing peak period and obtains the data processing tasks in it; these data processing tasks are what cause the data processing peak period.
Specifically, the history processing records can be parsed to extract the data processing tasks recorded in the data processing peak period.
Step 103: according to the task attributes of each data processing task, screen out the target data processing tasks whose processing time is adjustable.
The task attributes of a data processing task may include the service to which the task belongs. Service types can be divided according to actual needs, for example into mobile services and PC services, or into transaction services and login services. The task attributes may also include the output timeliness requirement of the data processing task, or any other suitable type; this application places no limitation on this.
The embodiment judges from the task attributes whether the processing time of a data processing task can be adjusted by a delayed start. Rules can be preset: if one or more task attributes satisfy preset requirements, the task's processing time is determined to be adjustable by delayed start. For example, if the service to which a data processing task belongs is a preset service, belongs to a preset service type, or has a business weight value (that is, an importance level) below a set level, the task can serve as a target data processing task whose processing time is adjustable by delayed start. Alternatively, a data processing task whose output timeliness requirement is below a set requirement can serve as such a target data processing task. These conditions can be set according to actual needs, and several preset conditions can be combined, so that a data processing task is treated as a target data processing task adjustable by delayed start when it satisfies the multiple preset conditions.
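A minimal Python sketch of such a screening rule follows. The `Task` fields, the integer encoding of business weight and timeliness (larger means more important or stricter), and the `require_both` switch for combining conditions are assumptions introduced for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    business_weight: int    # assumed encoding: larger means more important
    timeliness_level: int   # assumed encoding: larger means stricter deadline

def screen_delayable(tasks, weight_limit, timeliness_limit, require_both=False):
    """Return names of tasks whose processing time may be delayed.

    A task qualifies when its business weight is below `weight_limit` and/or
    its timeliness level is below `timeliness_limit`. With `require_both`
    set, both conditions must hold (the combined-conditions variant).
    """
    combine = all if require_both else any
    return [
        t.name for t in tasks
        if combine((t.business_weight < weight_limit,
                    t.timeliness_level < timeliness_limit))
    ]

tasks = [Task("report", 1, 1), Task("billing", 5, 5), Task("log_merge", 1, 4)]
print(screen_delayable(tasks, weight_limit=3, timeliness_limit=3))
print(screen_delayable(tasks, weight_limit=3, timeliness_limit=3, require_both=True))
```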
Step 104: notify the task scheduling platform to adjust at least one target data processing task to a target period outside the data processing peak period for processing.
From the target data processing tasks determined to have adjustable processing times, one or more are selected according to actual needs and adjusted to a target period outside the peak period for processing. For example, when the processing times of all data processing tasks are adjustable, there is no need in practice to adjust every task at once; only some of them need to be selected.
In the embodiment, adjusting the processing time can mean bringing it forward or delaying it. The specific adjustment mode can be set according to actual needs and can be determined from the task attributes. For example, a task with a higher output timeliness requirement can be processed earlier but cannot be delayed, whereas a task with a lower output timeliness requirement can be either processed earlier or delayed.
Preferably, for the delay adjustment mode, a corresponding delay duration can be preset for each set threshold that is compared with the historical data computation total. After the data processing peak period has been determined by the comparison, the delay duration set for that threshold is extracted and the target data processing task is delayed by that duration. Correspondingly, for the advance adjustment mode, a corresponding advance duration can be preset for each set threshold that is compared with the historical data computation total. After the data processing peak period has been determined by the comparison, the advance duration set for that threshold is extracted and the target data processing task is brought forward by that duration.
How the tasks are selected can be set according to actual needs: for example randomly, according to a business attribute of the data processing task (such as service type), or according to the resource consumption of the data processing task. Taking selection by resource consumption as an example, at least one data processing task ranked highest by resource consumption can be chosen; at least one data processing task whose resource consumption falls within one or more preset intervals can be chosen; or the average resource consumption of all data processing periods can be calculated, the difference between the resource consumption of the data processing peak period and that average determined, and at least one data processing task chosen whose total resource consumption is close to the difference.
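The last option, choosing tasks whose total consumption is close to the gap between peak and average consumption, can be approximated with a greedy pass over the tasks in descending order of consumption. This is a heuristic sketch under assumptions, not a method stated in the application: the function name, the dictionary input, and the rule of never exceeding the gap are all illustrative choices.

```python
def select_by_surplus(consumption, peak_usage, average_usage):
    """Greedily pick tasks whose total consumption approaches the surplus.

    `consumption` maps task name to resource consumption. Tasks are visited
    from largest to smallest and added while the running total stays within
    the surplus (peak usage minus average usage).
    """
    surplus = peak_usage - average_usage
    chosen, total = [], 0
    for name, cost in sorted(consumption.items(), key=lambda kv: kv[1], reverse=True):
        if total + cost <= surplus:
            chosen.append(name)
            total += cost
    return chosen, total

consumption = {"a": 50, "b": 30, "c": 20, "d": 10}
print(select_by_surplus(consumption, peak_usage=160, average_usage=100))
```

A greedy pass does not always find the subset closest to the gap; an exact answer would need a subset-sum style search, which is usually unnecessary when "close to the difference" is all that is required.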
To adjust a target data processing task to a target period outside the data processing peak period, the start time of the data processing task can be modified, or the target data processing task can be added to a preset queue and the target data processing tasks in the queue started when the start time arrives.
In the embodiment, the task scheduling platform is responsible for task scheduling for the data platform. A start time is configured in advance for each task, and when the start time arrives the task scheduling system dispatches the task to the big data processing platform. A notification can be sent to the task scheduling platform instructing it to adjust the processing time of the target data processing task; after receiving the notification, the task scheduling platform adjusts the processing time of the target data processing task indicated in the notification.
According to the embodiments of this application, the data processing peak period of the data processing platform is identified, the data processing tasks in the data processing peak period are extracted, the target data processing tasks whose processing time is adjustable are screened out according to task attributes, and at least one of the screened target data processing tasks is adjusted to another period outside the data processing peak period for processing. Peak-period tasks are thereby shifted to off-peak periods, which achieves peak load shifting, lowers the demand that the peak period places on processing resources, and reduces processing cost. In addition, because the peak period has fewer tasks, tasks with higher timeliness requirements can be processed in time. Because target data processing tasks are moved to off-peak periods for processing, the utilization of idle resources increases and resource consumption is balanced across periods.
In the embodiment of the present application, preferably, when identifying the data processing peak period of the data processing platform, a data processing period whose historical task processing volume exceeds a set threshold may be selected as a data processing peak period. One or more data processing peak periods may be selected according to actual requirements. The historical task processing volume is the task processing volume of each data processing period during the historical time preceding the current time.
The task processing volume may be the number of tasks processed, the amount of computation of the tasks processed, and so on, and may be counted according to actual requirements. Preferably, a unit time (for example one day) may be divided into multiple data processing periods, the length of each being set according to actual requirements; for example, with 4 hours per data processing period, one day is divided into 6 data processing periods. Further, the total amount of data computation of each data processing period over multiple unit times may be counted. For example, over the 7 unit times of one week, the total task processing volume of the 0 to 4 am period over these 7 days is counted, that is, the task processing volumes of 0 to 4 am on the seven days are summed.
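The bucketing and summation just described can be sketched as follows. The record layout (a start timestamp plus a processing amount) and the function name are assumptions made for illustration; the source does not specify a record format.

```python
from collections import defaultdict
from datetime import datetime


def period_totals(records, hours_per_period=4):
    """Sum processing volume per period-of-day across all days in `records`.

    records: iterable of (started_at: datetime, amount: float).
    Returns {period index: total}, where index 0 is 00:00 to 04:00 when
    hours_per_period is 4, index 1 is 04:00 to 08:00, and so on.
    """
    totals = defaultdict(float)
    for started_at, amount in records:
        # The date is ignored on purpose: the same period on different
        # days lands in the same bucket, which performs the summation.
        totals[started_at.hour // hours_per_period] += amount
    return dict(totals)
```

For instance, two records at 01:30 on one day and 03:00 on the next both fall into period 0, so their amounts are added together.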
Because the above scheme screens data processing peak periods by historical total task processing volume, it reflects historical temporal patterns, but the actual task processing volume at the current time may differ, so the screened data processing peak periods may change. In view of this, the screened data processing peak periods may be further checked. Specifically, the current task processing volume, at the current time, of each data processing peak period obtained by the above screening may be counted and compared again with the set threshold. If, for some data processing peak period, this comparison result is inconsistent with the result of comparing the historical task processing volume with the set threshold (in other words, the historical task processing volume exceeds the set threshold but the current task processing volume does not), then that data processing period is not actually a data processing peak period at the current time. Further, at least one data processing peak period may be screened again using the current task processing volume: either one or more data processing peak periods are re-screened from all data processing periods according to their current task processing volume, or they are re-screened, according to current task processing volume, only from the data processing peak periods already screened by historical task processing volume.
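A minimal sketch of this check, covering the second variant (re-screening only among the historically screened peaks). The function name is hypothetical, and it is assumed that the historical and current volumes are expressed in the same unit (for example a per-day figure), so that a single threshold is meaningful for both.

```python
def confirm_peaks(historical, current, threshold):
    """Keep only historical peaks that are still peaks at the current time.

    historical / current: period index -> task processing volume.
    """
    # First pass: periods whose historical volume exceeds the threshold.
    candidates = [p for p, v in historical.items() if v > threshold]
    # Second pass: drop candidates whose current volume no longer exceeds it.
    return [p for p in candidates if current.get(p, 0) > threshold]
```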
In the embodiment of the present application, preferably, multiple thresholds may be preset, that is, the set threshold comprises multiple thresholds. When the historical task processing volume exceeds at least one threshold, the period is determined to be a data processing peak period. If the historical task processing volume exceeds several thresholds, the largest of them is determined, the delay duration corresponding to that largest set threshold is looked up, and the tasks in the data processing peak period are delayed by the looked-up duration. It can be understood that the larger the historical task processing volume, the longer the corresponding delay duration; accordingly, the larger the threshold value, the longer the delay duration configured for it.
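The threshold-to-delay lookup can be sketched as a small table search. The table contents and the function name are illustrative assumptions; the source only states that a larger threshold maps to a longer delay.

```python
def lookup_delay(volume, delay_by_threshold):
    """Return the delay configured for the largest threshold that `volume`
    exceeds, or None if no threshold is exceeded (not a peak period).

    delay_by_threshold: threshold -> delay duration (for example, hours).
    """
    exceeded = [t for t in delay_by_threshold if volume > t]
    if not exceeded:
        return None
    return delay_by_threshold[max(exceeded)]
```

With a hypothetical table `{100: 1, 200: 2, 400: 4}`, a volume of 250 exceeds the thresholds 100 and 200, so the delay for 200 is used.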
In the embodiment of the present application, preferably, the task attributes may include the data service to which the data processing task belongs and the output timeliness requirement of the data processing task. Accordingly, target data processing tasks may be selected according to the data service or the output timeliness requirement, or both attributes may be considered together. How target data processing tasks are screened according to task attributes can be set according to actual requirements; for example, different weight values may be set for different data services together with a condition for screening target data processing tasks by weight value, and a condition for screening target data processing tasks may be set according to the output timeliness requirement.
Task attributes may be data carried by each data processing task, or may be determined before screening. In the embodiment of the present invention, preferably, task attributes may be determined from the dependencies among data processing tasks. Specifically, a data processing task may have one or more dependency tasks (tasks on which it depends). The dependency tasks of each data processing task are looked up, and the attributes of the current data processing task are determined from the attributes of its dependency tasks, such as the data service to which they belong and their output timeliness requirements. Specifically: for the business weight value, the business weight value of the data processing task is determined from the business weight values of the data services to which its dependency tasks belong; for the output timeliness requirement, the output timeliness requirement of the data processing task is determined from the output timeliness requirements of its dependency tasks.
When the task attributes of the current data processing task are determined from those of the tasks on which it depends, task attributes need only be set for the start nodes of the dependency chain, that is, data processing tasks that have no dependency tasks; the task attributes of the other data processing tasks can then be derived step by step. This reduces the work of setting task attributes and ensures that the attributes of mutually dependent tasks satisfy the configured logic.
In the embodiment of the present application, the highest business weight value among the dependency tasks is preferably used as the business weight value of the current data processing task, so that business importance stays consistent among data processing tasks with dependencies and the dependency task with the highest business weight value is not adversely affected. For example, if task A depends on tasks B, C and D, and task C has the highest business weight value, then task C's business weight value is used as task A's business weight value. The relationship between the business weight values of two data processing tasks with a dependency can be set according to actual requirements; for example, the mean of the dependency tasks' business weight values may instead be used as the business weight value of the current data processing task. The application does not limit this.
In the embodiment of the present application, the highest output timeliness requirement among the dependency tasks is preferably used as the output timeliness requirement of the current data processing task, so that output timeliness stays consistent among data processing tasks with dependencies and the dependency task with the highest output timeliness requirement is not adversely affected. The relationship between the output timeliness requirements of two data processing tasks with a dependency can be set according to actual requirements; for example, the mean of the dependency tasks' output timeliness requirements may instead be used as the output timeliness requirement of the current data processing task. The application does not limit this.
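The step-by-step derivation from start nodes, using the preferred "take the highest" rule for both attributes, can be sketched as follows. The data layout (a dependency map plus hand-set seed attributes as numeric pairs) and all names are assumptions for illustration, and the dependency graph is assumed to be acyclic.

```python
def derive_attributes(depends_on, seeds):
    """Propagate (business_weight, timeliness) from start nodes to dependents.

    depends_on: task -> list of tasks it depends on (assumed acyclic).
    seeds: attributes set by hand for tasks that depend on nothing.
    """
    resolved = {}

    def resolve(task):
        if task not in resolved:
            deps = depends_on.get(task, [])
            if not deps:
                # Start node: its attributes were configured manually.
                resolved[task] = seeds[task]
            else:
                inherited = [resolve(d) for d in deps]
                # Preferred rule in the text: highest value of each attribute.
                resolved[task] = (
                    max(w for w, _ in inherited),
                    max(t for _, t in inherited),
                )
        return resolved[task]

    for task in set(depends_on) | set(seeds):
        resolve(task)
    return resolved
```

Replacing `max` with an average would give the alternative rule the text mentions.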
Depending on how the task processing time is adjusted, the way of screening target data processing tasks with adjustable processing time also differs.
If the adjustment is to postpone the processing time, screening target data processing tasks with adjustable processing time may include: extracting data processing tasks whose data service has a business weight value below a set level, or data processing tasks whose output timeliness requirement is below a set requirement, or data processing tasks whose business weight value is below the set level and whose output timeliness requirement is below the set requirement, as target data processing tasks whose start can be delayed.
Accordingly, preferably, notifying the task scheduling platform to adjust the at least one target data processing task to a target period outside the data processing peak period for processing may specifically be: notifying the task scheduling system to modify the start time of the selected target data processing task so that its start is delayed. The start time of the target data processing task is postponed, and once the start time arrives the scheduling system dispatches the target data processing task to the big data processing platform for processing.
If the adjustment is to bring the processing time forward, screening target data processing tasks with adjustable processing time may include: extracting data processing tasks whose data service has a business weight value above a second set level, or data processing tasks whose output timeliness requirement is above a second set requirement, or data processing tasks whose business weight value is above the second set level and whose output timeliness requirement is above the second set requirement, as data processing tasks whose start can be advanced.
Accordingly, preferably, notifying the task scheduling platform to adjust the at least one target data processing task to a target period outside the data processing peak period for processing may specifically be: notifying the task scheduling platform to modify the start time of the selected target data processing task so that its start is advanced.
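The two screening directions can be sketched with one function. This illustration implements only the "both conditions" variant of each rule; the numeric encoding of business weight and timeliness, and all names, are assumptions.

```python
def screen_adjustable(tasks, direction, weight_level, timeliness_level):
    """Return the names of tasks whose processing time may be shifted.

    tasks: name -> (business_weight, timeliness_requirement), both numeric.
    direction: "delay" keeps tasks below both levels;
               "advance" keeps tasks above both levels.
    """
    if direction == "delay":
        def qualifies(w, t):
            return w < weight_level and t < timeliness_level
    elif direction == "advance":
        def qualifies(w, t):
            return w > weight_level and t > timeliness_level
    else:
        raise ValueError("direction must be 'delay' or 'advance'")
    return sorted(name for name, (w, t) in tasks.items() if qualifies(w, t))
```

Using `or` instead of `and` inside `qualifies` would give the single-attribute variants the text also allows.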
Because of the dependencies among data processing tasks, their processing times have an order: the current data processing task can start only after its dependency tasks have been processed, and cannot be moved ahead of them. Therefore, for a data processing task whose processing time is to be brought forward, before its start is advanced, the at least one dependency task on which the target data processing task to be adjusted depends must be looked up, and the start time of the target data processing task to be adjusted is then set according to the start time of the dependency tasks, ensuring that its advanced processing time falls after the dependency tasks. The adjusted target period must lie outside the data processing peak period and after the completion time of the tasks depended on.
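One way to honor both constraints is sketched below. Times are plain numbers (for example hours of the day), peak periods are half-open intervals, and the policy of moving a candidate start to the end of any peak it falls into is an assumption of this sketch rather than something the source prescribes.

```python
def advanced_start(dep_finish_times, peaks):
    """Earliest start that is after every dependency and outside every peak.

    dep_finish_times: completion times of the tasks depended on.
    peaks: list of (begin, end) half-open peak intervals.
    """
    start = max(dep_finish_times, default=0)
    moved = True
    while moved:
        moved = False
        for begin, end in peaks:
            if begin <= start < end:
                # Candidate falls inside a peak: move it to the peak's end
                # and re-check, since peaks may be adjacent.
                start = end
                moved = True
    return start
```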
Referring to Figure 5, a flowchart of embodiment 2 of a task management method for a big data processing platform of the present application is shown, which may specifically include the following steps:
Step 201: identify the data processing peak period of the data processing platform.
Step 202: extract multiple data processing tasks in the data processing peak period.
Step 203: according to the task attributes of each data processing task, screen target data processing tasks whose processing time is adjustable.
Step 204: determine the resource consumption of each target data processing task.
Step 205: select at least one target data processing task ranked highest when sorted by resource consumption from largest to smallest.
In this embodiment, when target data processing tasks whose start is to be delayed are selected from among the target data processing tasks, resource consumption is used as the basis for selection.
The resources consumed when executing a data processing task include storage space, CPU, network traffic and the like; the resource consumption may be the total consumption of at least one of these resources.
The resource consumption of each target data processing task is calculated in advance and selection is made accordingly. Specifically, the target data processing tasks are sorted by resource consumption from largest to smallest, and the one or more highest-ranked target data processing tasks are selected for delayed start. The number selected can be set according to actual requirements.
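A minimal sketch of this ranking. It assumes, for illustration, that each task's per-resource figures have already been normalized to a common unit so that summing them is meaningful; the function and field names are hypothetical.

```python
def top_n_by_consumption(usage, n):
    """Return the n task names with the largest total resource consumption.

    usage: task name -> {resource name: normalized amount consumed}.
    """
    totals = {task: sum(resources.values()) for task, resources in usage.items()}
    # Sort from largest to smallest total and keep the first n names.
    return sorted(totals, key=totals.get, reverse=True)[:n]
```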
In the embodiment of the present application, preferably, extracting the data processing peak period from multiple data processing periods may include:
Extracting the data processing peak period from the multiple data processing periods using a three-times standard deviation (three-sigma) algorithm, according to the total amount of data computation of each data processing period.
Specifically, the data processing peak period is extracted from the data processing periods on the basis of total data computation; the total data computation of a data processing peak period is higher than that of the other data processing periods.
The three-sigma rule is commonly used to reject special data points that deviate strongly from the reference data in a batch of data: if the difference between a single value and the mean exceeds three times the standard deviation, the value is considered abnormal. It is applicable here because, among multiple data processing periods, a data processing peak period is far more anomalous than the others. Applying it to the present application, the standard deviation of the total data computation of all data processing periods is calculated, as is the mean of those totals; three times the standard deviation is then calculated, and any data processing period whose total differs from the mean by more than three times the standard deviation is taken as a data processing peak period.
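The three-sigma screening can be sketched as follows. Two points are assumptions of this sketch: the population standard deviation is used, and only deviations above the mean are treated as peaks. The multiplier is exposed as a parameter `k`.

```python
from statistics import mean, pstdev


def peak_periods(totals, k=3.0):
    """Return periods whose total exceeds the mean by more than k sigma.

    totals: period index -> total data computation for that period.
    """
    values = list(totals.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    # Only high-side outliers count as peaks.
    return [p for p, v in totals.items() if v - mu > k * sigma]
```

One caveat worth noting: among n values, no single value can lie more than sqrt(n - 1) population standard deviations from the mean, so with only 6 periods per day a 3-sigma deviation cannot occur. A finer division (for example 24 hourly periods) or a smaller `k` is needed for the rule to flag anything.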
Step 206: generate an adjustment notification for the target data processing tasks and deliver it to the task management client.
The one or more target data processing tasks selected by resource consumption may, before their processing time is adjusted, first be sent to the task management client corresponding to the task scheduling system. The task management client displays the at least one target data processing task to be adjusted, and the user of the task management client can review the displayed target data processing tasks.
Step 207: receive the task management client's confirmation indication for the adjustment notification.
While the target data processing tasks are displayed, a feedback entry may also be provided, through which the user submits an instruction on whether to adjust the displayed target data processing tasks. The feedback entry may include a yes/no option, which may be provided once for all tasks or separately for each task.
After the user's confirmation of the adjustment is received through the feedback entry, the task scheduling platform may then be notified to carry out the task adjustment.
Step 208: notify the task scheduling platform to adjust the selected target data processing tasks to a target period outside the data processing peak period for processing.
According to the embodiment of the present application, the data processing peak period of the data processing platform is identified, the data processing tasks in the data processing peak period are extracted, target data processing tasks whose processing time is adjustable are screened according to task attributes, and at least one target data processing task obtained by the screening is adjusted to a target period outside the data processing peak period for processing. Tasks from the peak period are thereby moved to off-peak periods, which achieves peak load shifting, lowers the peak period's high demand for processing resources, and reduces processing cost. At the same time, because the peak period carries fewer tasks, tasks with higher timeliness requirements can be processed promptly; and because the target data processing tasks are deferred to off-peak periods, the utilization of idle resources increases and resource consumption is balanced across periods.
Referring to Figure 6, a flowchart of an embodiment of a task scheduling method of the present application is shown, which may specifically include:
Step 301: receive an adjustment notification, the adjustment notification indicating that at least one target data processing task is to be adjusted to a target period outside the data processing peak period for processing, where the target data processing task is obtained by identifying the data processing peak period of the data processing platform, extracting multiple data processing tasks in the data processing peak period, and screening, according to the task attributes of each data processing task, the target data processing tasks whose processing time is adjustable.
Step 302: according to the notification, adjust the at least one target data processing task to a target period outside the data processing peak period for processing.
Step 303: schedule the adjusted target data processing task.
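Steps 301 to 303 can be sketched from the scheduler's side as follows. The `AdjustmentNotice` and `Scheduler` classes, their fields, and the representation of an adjustment as a time shift are all hypothetical; the source does not define a notification format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class AdjustmentNotice:
    task_ids: list      # target data processing tasks to adjust
    shift: timedelta    # positive delays the start, negative advances it


@dataclass
class Scheduler:
    start_times: dict = field(default_factory=dict)  # task id -> start time

    def apply(self, notice):
        """Steps 301 and 302: receive the notice and move the start times."""
        for task_id in notice.task_ids:
            self.start_times[task_id] += notice.shift

    def due(self, now):
        """Step 303: tasks whose (possibly adjusted) start time has arrived."""
        return sorted(t for t, s in self.start_times.items() if s <= now)
```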
According to the embodiment of the present application, the data processing peak period of the data processing platform is identified, the data processing tasks in the data processing peak period are extracted, target data processing tasks whose processing time is adjustable are screened according to task attributes, and at least one target data processing task obtained by the screening is adjusted to a target period outside the data processing peak period for processing. Tasks from the peak period are thereby moved to off-peak periods, which achieves peak load shifting, lowers the peak period's high demand for processing resources, and reduces processing cost. At the same time, because the peak period carries fewer tasks, tasks with higher timeliness requirements can be processed promptly; and because the target data processing tasks are deferred to off-peak periods, the utilization of idle resources increases and resource consumption is balanced across periods.
To help those skilled in the art better understand the solution of the embodiment of the present application, it is described below by way of a specific example.
The embodiment of the present application analyzes the historical processing records of the big data processing platform to obtain the target data processing tasks that cause the data processing peak period, screens the task definitions in the scheduling system by combining the importance of the business to which these tasks belong with their output timeliness requirements, and adjusts the processing time of the screened target data processing tasks to other periods outside the data processing peak period, so that overall resource consumption is relatively balanced.
The system architecture implementing the embodiment of the present application is shown in Figure 7. The scheduling system is responsible for issuing tasks to the big data processing platform, and the big data processing platform outputs task run records. The apparatus implementing the embodiment reads these task run records, identifies the data processing peak period and the target data processing tasks with adjustable processing time, and then modifies the task start times recorded in the scheduling system.
Figure 8 shows a data flow diagram of a system implementing the embodiment of the present application. The scheduling system records the defined tasks in a list, Schedule, and issues them to the clients (Client) of the big data processing platform, which performs the task processing and generates task run records afterwards; these records are the analysis source for the present application. By parsing and analyzing the task run records, tasks whose processing can be deferred are obtained, and the start times of the defined tasks are then modified to delay their start.
Figure 9 shows a step flowchart of the above process, which may specifically include:
A. Divide one day into N periods.
B. Count the amount of computation of each period from the logs of 1 day or N days of history.
C. Use the three-sigma algorithm to obtain the high outlier, and denote that period as period T.
D. Analyze with a business rule engine: specifically, take all task run records in period T and obtain, by summation, the computing resource consumption of each task in that period.
E. Obtain the business attributes of all tasks in the period, and screen a list of adjustable tasks by combining the data service to which each task belongs with its output timeliness requirement.
F. Sort the task list by amount of computation in descending order and take the top N tasks.
G. Adjust the qualifying tasks to run outside the data processing peak period by modifying their task start times in the scheduling system, thereby achieving peak load shifting.
The above steps can be set to repeat (for example once a day) without human intervention, and they adapt well to changes and disturbances in the scheduling system and the data processing platform themselves, so that the system can change adaptively.
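Steps A to F can be combined into one short sketch that returns the tasks whose start times step G would then modify. Everything concrete here is an assumption for illustration: the record layout `(task, hour_of_day, cost)`, the use of a single numeric business weight as the screening attribute, and the default of 24 hourly periods (chosen because a 3-sigma outlier cannot occur among very few periods).

```python
from collections import defaultdict
from statistics import mean, pstdev


def plan_shifts(run_records, weights, periods_per_day=24, top_n=2,
                max_weight=2, k=3.0):
    """Return up to top_n low-priority tasks that dominate the peak period(s).

    run_records: iterable of (task, hour_of_day, cost) from historical logs.
    weights: task -> business weight; tasks at or below max_weight may move.
    """
    records = list(run_records)
    hours = 24 // periods_per_day
    # Steps A and B: bucket the day into periods and total the cost of each.
    totals = defaultdict(float)
    for _, hour, cost in records:
        totals[hour // hours] += cost
    values = [totals.get(p, 0.0) for p in range(periods_per_day)]
    # Step C: periods more than k standard deviations above the mean.
    mu, sigma = mean(values), pstdev(values)
    peaks = {p for p, v in enumerate(values) if v - mu > k * sigma}
    # Step D: per-task cost inside the peak period(s), by summation.
    per_task = defaultdict(float)
    for task, hour, cost in records:
        if hour // hours in peaks:
            per_task[task] += cost
    # Step E: keep only tasks whose business attribute allows adjustment.
    adjustable = [t for t in per_task if weights[t] <= max_weight]
    # Step F: largest consumers first, top N.
    return sorted(adjustable, key=per_task.get, reverse=True)[:top_n]
```

Running such a function on a daily schedule and feeding its result to the scheduler corresponds to the repetition described above.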
Taking the computing resource consumption during task processing shown in Figure 2 as the baseline, after tuning by the above steps of the embodiment of the present application, the computing resource consumption over the same time span is as shown in Figure 10. It can be seen that tasks from the peak period have been moved to off-peak periods, achieving peak load shifting and rational use of resources.
For brevity, the method embodiments are described as a series of combined actions, but those skilled in the art should understand that the application is not limited by the described order of actions, because according to the application some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions and modules involved are not necessarily required by the application.
Referring to Figure 11, a structural block diagram of an embodiment of a data processing task management apparatus of the present application is shown, which may specifically include the following modules:
A peak period identification module 301, configured to identify the data processing peak period of the data processing platform;
A task extraction module 302, configured to extract multiple data processing tasks in the data processing peak period;
A task screening module 303, configured to screen, according to the task attributes of each data processing task, target data processing tasks whose processing time is adjustable;
A task adjustment notification module 304, configured to notify the task scheduling platform to adjust at least one target data processing task to a target period outside the data processing peak period for processing.
In the embodiment of the present invention, preferably, the peak period identification module is specifically configured to select, according to the historical task processing volume of each data processing period, at least one data processing peak period whose historical task processing volume exceeds a set threshold.
In the embodiment of the present invention, preferably, the apparatus further includes:
A record extraction module, configured to extract the historical processing records of multiple data processing tasks from the data processing platform before the data processing peak period of the data processing platform is identified;
A processing volume analysis module, configured to analyze the historical processing records and determine the historical task processing volume of each data processing period.
In the embodiment of the present invention, preferably, the processing volume analysis module includes:
A processing period division submodule, configured to divide a unit time into multiple data processing periods;
A task volume statistics submodule, configured to count the sum of the historical task processing volumes of each data processing period over multiple unit times.
In the embodiment of the present invention, preferably, the peak period identification module further includes:
A current task volume statistics submodule, configured to count the current task processing volume of the data processing peak period;
A peak period re-screening submodule, configured to screen at least one data processing peak period again according to the current task processing volume if the results of comparing the current task processing volume and the historical task processing volume respectively with the set threshold are inconsistent.
In the embodiment of the present invention, preferably, the task adjustment notification module includes:
A delay duration lookup submodule, configured to look up the delay duration corresponding to the set threshold exceeded by the historical total data computation;
A delay duration notification submodule, configured to notify the task scheduling platform to delay, by the looked-up delay duration, the target data processing tasks in the data processing peak period.
In the embodiment of the present invention, preferably, the set threshold comprises multiple thresholds; the delay duration lookup submodule is specifically configured to look up the delay duration corresponding to the largest set threshold exceeded by the historical total data computation, where the larger the set threshold, the longer the corresponding delay duration.
In the embodiment of the present application, preferably, the task attributes include the data service to which the data processing task belongs and/or the output timeliness requirement of the data processing task.
In the embodiment of the present application, preferably, the apparatus further includes:
A task attribute determination module, configured to determine the data service to which each data processing task belongs and/or the output timeliness requirement of the data processing task, before target data processing tasks with adjustable processing time are screened according to the task attributes of each data processing task.
In the embodiment of the present application, preferably, the task attribute determination module includes:
A dependency task determination submodule, configured to look up, for each data processing task, at least one dependency task on which the data processing task depends;
A business weight value determination submodule, configured to determine the business weight value of the data processing task according to the business weight values of the data services to which its dependency tasks belong; and/or a timeliness requirement determination submodule, configured to determine the output timeliness requirement of the data processing task according to the output timeliness requirements of its dependency tasks.
In the embodiment of the present application, preferably, the business weight value determination submodule is specifically configured to select the highest business weight value among all dependency tasks as the business weight value of the data processing task;
The timeliness requirement determination submodule is specifically configured to select the highest output timeliness requirement among all dependency tasks as the output timeliness requirement of the data processing task.
In the embodiment of the present application, preferably, the task screening module is specifically configured to extract data processing tasks for which the importance level of the data service to which the task belongs is below a first set level and/or the output timeliness requirement is below a first set requirement.
In the embodiment of the present application, preferably, the task adjustment notification module is specifically configured to notify the task scheduling platform to modify the start time of the selected target data processing task so that the start of the target data processing task is delayed.
In the embodiment of the present application, preferably, the task screening module is specifically configured to extract data processing tasks for which the importance level of the data service to which the task belongs is above a second set level and/or the output timeliness requirement is above a second set requirement.
In the embodiment of the present application, preferably, the task adjustment notification module is specifically configured to notify the task scheduling platform to modify the start time of the selected target data processing task so that the start of the target data processing task is advanced.
In the embodiment of the present application, preferably, the task adjustment notification module further includes:
A dependency task lookup module, configured to look up at least one dependency task on which the target data processing task to be adjusted depends, before the task scheduling platform is notified to modify the start time of the selected target data processing task;
A start time adjustment module, configured to determine, according to the start time of the dependency task, the start time of the target data processing task to be within a target period that is outside the data processing peak period and after the completion time of the dependency task.
In the embodiment of the present application, preferably, the apparatus further includes:
A resource consumption selection module, configured to select at least one target data processing task according to the resource consumption of each target data processing task, before the task scheduling platform is notified to adjust the at least one target data processing task to a target period outside the data processing peak period for processing.
In the embodiment of the present application, preferably, the apparatus further includes:
A resource consumption determination module, configured to determine the resource consumption of each target data processing task before at least one target data processing task is selected according to the resource consumption of each target data processing task;
The task adjustment notification module is specifically configured to select at least one target data processing task ranked highest when sorted by resource consumption from largest to smallest, and to notify the task scheduling platform to adjust the selected target data processing task to a target period outside the data processing peak period for processing.
In the embodiment of the present application, preferably, the apparatus further includes:
A notification delivery module, configured to generate an adjustment notification for the target data processing task and deliver it to the task management client, before the task scheduling platform is notified to adjust the at least one target data processing task to a target period outside the data processing peak period for processing;
An indication receiving module, configured to receive the task management client's confirmation indication for the adjustment notification.
According to the embodiment of the present application, the data processing peak period of the data processing platform is identified, the data processing tasks in the data processing peak period are extracted, target data processing tasks whose processing time is adjustable are screened according to task attributes, and at least one target data processing task obtained by the screening is adjusted to a target period outside the data processing peak period for processing. Tasks from the peak period are thereby moved to off-peak periods, which achieves peak load shifting, lowers the peak period's high demand for processing resources, and reduces processing cost. At the same time, because the peak period carries fewer tasks, tasks with higher timeliness requirements can be processed promptly; and because the target data processing tasks are deferred to off-peak periods, the utilization of idle resources increases and resource consumption is balanced across periods.
Referring to FIG. 12, a structural block diagram of an embodiment of a task scheduling apparatus of the present application is shown, which may specifically include the following modules:
A notification receiving module 401, configured to receive a notification, wherein the target data processing task is obtained by identifying the data processing peak period of the data processing platform, extracting multiple data processing tasks in the data processing peak period, and screening out, according to the task attribute of each data processing task, target data processing tasks whose processing time is adjustable;
An adjusting module 402, configured to move, according to the notification, at least one target data processing task to a target period outside the data processing peak period for processing;
A scheduling module 403, configured to schedule the adjusted target data processing task.
Because the apparatus embodiments essentially correspond to the method embodiments shown in FIGS. 4-6 above, details not covered in the description of these embodiments can be found in the related description of the foregoing embodiments and are not repeated here.
The present application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, laptop devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.
The present application may be described in the general context of computer-executable instructions, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
Finally, it should be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes that element.
The data processing management system, the data processing task management method and apparatus, and the task scheduling method and apparatus provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. Meanwhile, those of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (29)

1. A data processing management system, characterized by comprising a data processing task management apparatus and a task scheduling platform;
The data processing task management apparatus is configured to identify the data processing peak period of a data processing platform; extract multiple data processing tasks in the data processing peak period; screen out, according to the task attribute of each data processing task, target data processing tasks whose processing time is adjustable; and notify the task scheduling platform;
The task scheduling platform is configured to move, according to the notification from the data processing task management apparatus, at least one of the target data processing tasks to a target period outside the data processing peak period for processing.
2. A data processing task management method, characterized by comprising:
Identifying the data processing peak period of a data processing platform;
Extracting multiple data processing tasks in the data processing peak period;
Screening out, according to the task attribute of each data processing task, target data processing tasks whose processing time is adjustable;
Notifying a task scheduling platform to move at least one target data processing task to a target period outside the data processing peak period for processing.
3. The method according to claim 2, characterized in that identifying the data processing peak period of the data processing platform comprises:
Selecting at least one data processing period whose historical task processing volume exceeds a set threshold as the data processing peak period.
4. The method according to claim 3, characterized in that, before identifying the data processing peak period of the data processing platform, the method further comprises:
Extracting historical processing records of multiple data processing tasks from the data processing platform;
Analyzing the historical processing records to determine the historical task processing volume of each data processing period.
5. The method according to claim 4, characterized in that determining the historical task processing volume of each data processing period comprises:
Dividing a unit time into multiple data processing periods;
Counting the sum of the historical task processing volume of each data processing period over multiple unit times.
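The peak identification recited in claims 3 to 5 can be sketched as follows. This is a hedged illustration only: the record format `(day, hour, task_count)`, the function names, the choice of a day as the unit time, and the sample figures are assumptions, not details taken from the application.

```python
from collections import defaultdict


def historical_volume(records, period_hours=1):
    """Sum the tasks processed in each period of the day across all recorded days.

    records: iterable of (day, hour, task_count) tuples (hypothetical log format).
    period_hours: length of one data processing period, in hours.
    """
    totals = defaultdict(int)
    for _day, hour, count in records:
        totals[hour // period_hours] += count  # same period index on every day
    return dict(totals)


def peak_periods(volume_by_period, threshold):
    """Return the periods whose historical volume exceeds the set threshold."""
    return sorted(p for p, v in volume_by_period.items() if v > threshold)


records = [
    ("2016-08-01", 2, 120), ("2016-08-01", 14, 30),
    ("2016-08-02", 2, 140), ("2016-08-02", 14, 25),
]
volume = historical_volume(records)
print(volume)
print(peak_periods(volume, threshold=200))
```

With these sample records, the 02:00 period accumulates 260 tasks over two days and is the only period above the threshold of 200.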
6. The method according to claim 3, characterized in that identifying the data processing peak period of the data processing platform further comprises:
Counting the current task processing volume of the data processing peak period;
If the results obtained by comparing the current task processing volume and the historical task processing volume, respectively, with the set threshold are inconsistent, re-screening at least one data processing peak period according to the current task processing total.
7. The method according to claim 3, characterized in that notifying the task scheduling platform to move the at least one target data processing task to a target period outside the data processing peak period for processing comprises:
Looking up the delay duration corresponding to the set threshold exceeded by the historical data computation total;
Notifying the task scheduling platform to delay, according to the delay duration found, the target data processing tasks in the data processing peak period.
8. The method according to claim 6, characterized in that there are multiple set thresholds;
Looking up the delay duration corresponding to the set threshold exceeded by the historical data computation total comprises:
Looking up the delay duration corresponding to the largest set threshold exceeded by the historical data computation total, wherein the larger the set threshold, the longer the corresponding delay duration.
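The threshold-to-delay lookup in claims 7 and 8 can be sketched as a small table lookup. The function name, the threshold values, the minute unit, and the choice to return zero when no threshold is exceeded are all assumptions made for this illustration.

```python
def lookup_delay(total, delay_by_threshold):
    """Return the delay for the largest set threshold that `total` exceeds.

    delay_by_threshold maps each set threshold to a delay duration in minutes
    (hypothetical values). Returns 0 when no threshold is exceeded.
    """
    exceeded = [t for t in delay_by_threshold if total > t]
    if not exceeded:
        return 0
    return delay_by_threshold[max(exceeded)]


# Larger thresholds map to longer delays, as the claim requires.
DELAYS = {1000: 30, 2000: 60, 5000: 120}

print(lookup_delay(800, DELAYS))
print(lookup_delay(2500, DELAYS))
print(lookup_delay(9000, DELAYS))
```

A total of 2500 exceeds both 1000 and 2000, so the delay for 2000 is chosen; a total of 800 exceeds nothing and the task is not delayed in this sketch.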
9. The method according to claim 2, characterized in that the task attribute comprises the data service to which the data processing task belongs and/or the output timeliness requirement of the data processing task.
10. The method according to claim 9, characterized in that, before screening out, according to the task attribute of each data processing task, target data processing tasks whose processing time is adjustable, the method further comprises:
Determining the data service to which each data processing task belongs and/or the output timeliness requirement of the data processing task.
11. The method according to claim 10, characterized in that determining the data service to which each data processing task belongs and/or the output timeliness requirement of the data processing task comprises:
For each data processing task, looking up at least one dependency task on which the data processing task depends;
Determining the business weight value of the data processing task according to the business weight value of the data service to which each dependency task belongs, and/or determining the output timeliness requirement of the data processing task according to the output timeliness requirement of each dependency task.
12. The method according to claim 11, characterized in that determining the business weight value of the data processing task according to the business weight value of the data service to which each dependency task belongs comprises:
Selecting the highest business weight value among all the dependency tasks as the business weight value of the data processing task;
Determining the output timeliness requirement of the data processing task according to the output timeliness requirement of each dependency task comprises:
Selecting the highest output timeliness requirement among all the dependency tasks as the output timeliness requirement of the data processing task.
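The attribute derivation in claims 11 and 12 amounts to taking a maximum over a task's dependency tasks. The sketch below assumes numeric scales on which a larger number means a more important data service or a stricter timeliness requirement; the dictionary layout, the function name, and the sample task names are hypothetical.

```python
def inherit_attributes(task, depends_on, weight, timeliness):
    """Derive (business weight, timeliness requirement) for `task` from its dependency tasks.

    depends_on: maps a task name to the list of its dependency tasks.
    weight / timeliness: map a task name to a number; larger means more
    important or stricter (an assumed encoding).
    """
    deps = depends_on.get(task, [])
    if not deps:
        # No dependency tasks recorded: fall back to the task's own values, if any.
        return weight.get(task), timeliness.get(task)
    return max(weight[d] for d in deps), max(timeliness[d] for d in deps)


depends_on = {"sales_summary": ["orders_etl", "refunds_etl"]}
weight = {"orders_etl": 5, "refunds_etl": 2}
timeliness = {"orders_etl": 1, "refunds_etl": 3}

print(inherit_attributes("sales_summary", depends_on, weight, timeliness))
```

Note that the two maxima are taken independently, so the weight and the timeliness requirement may come from different dependency tasks, as they do here.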
13. The method according to claim 9, characterized in that screening out, according to the task attribute of each data processing task, target data processing tasks whose processing time is adjustable comprises:
Extracting data processing tasks whose data service has a business weight value lower than a first set level, and/or whose output timeliness requirement is lower than a first set requirement.
14. The method according to claim 13, characterized in that notifying the task scheduling platform to move the at least one target data processing task to a target period outside the data processing peak period for processing comprises:
Notifying the task scheduling platform to modify the start time of the selected target data processing task so that the target data processing task starts later.
15. The method according to claim 9, characterized in that screening out, according to the task attribute of each data processing task, target data processing tasks whose processing time is adjustable comprises:
Extracting data processing tasks whose data service has a business weight value higher than a second set level, and/or whose output timeliness requirement is higher than a second set requirement.
16. The method according to claim 15, characterized in that notifying the task scheduling platform to move the at least one target data processing task to a target period outside the data processing peak period for processing comprises:
Notifying the task scheduling platform to modify the start time of the selected target data processing task so that the target data processing task starts earlier.
17. The method according to claim 16, characterized in that, before notifying the task scheduling platform to modify the start time of the selected target data processing task, notifying the task scheduling platform to move the at least one target data processing task to a target period outside the data processing peak period for processing further comprises:
Looking up at least one dependency task on which the target data processing task to be adjusted depends;
Determining, according to the start time of the dependency task, the start time of the target data processing task to be within a target period outside the data processing peak period and after the end time of the dependency task.
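One way to pick a start time that satisfies both conditions of claim 17 (outside the peak period, and not before the dependency tasks have ended) is a simple forward search. The hour-based timeline, the `horizon` limit, and the function name are assumptions of this sketch, not part of the claimed method.

```python
def next_start_hour(dep_end_hours, peak_hours, horizon=48):
    """Return the earliest hour at or after every dependency end time that is off-peak.

    dep_end_hours: end times of the dependency tasks, in hours counted from
    midnight of the first day (assumed representation).
    peak_hours: set of hours of the day (0-23) that belong to the peak period.
    horizon: how far ahead to search, in hours.
    """
    earliest = max(dep_end_hours, default=0)
    for hour in range(earliest, horizon):
        if hour % 24 not in peak_hours:  # wrap into the next day after hour 23
            return hour
    raise ValueError("no off-peak hour found within the search horizon")


print(next_start_hour([3, 5], peak_hours={4, 5, 6}))
print(next_start_hour([23], peak_hours={23, 0, 1}))
```

In the first call the dependency tasks end at hours 3 and 5, and hours 5 and 6 are in the peak period, so hour 7 is chosen. In the second call the search wraps past midnight and returns hour 26, that is, 02:00 on the following day.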
18. The method according to claim 2, characterized in that, before notifying the task scheduling platform to move the at least one target data processing task to a target period outside the data processing peak period for processing, the method further comprises:
Selecting at least one target data processing task according to the resource consumption of each target data processing task.
19. The method according to claim 18, characterized in that, before selecting at least one target data processing task according to the resource consumption of each target data processing task, the method further comprises:
Determining the resource consumption of each target data processing task;
Notifying the task scheduling platform to move the at least one target data processing task to a target period outside the data processing peak period for processing comprises:
Selecting at least one target data processing task that ranks highest when resource consumption is sorted in descending order, and notifying the task scheduling platform to move the selected target data processing task to a target period outside the data processing peak period for processing.
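The selection step of claim 19 is a descending sort by resource consumption followed by taking the first few entries. In the sketch below the consumption unit, the cutoff `k`, and the task names are hypothetical.

```python
def select_heaviest(consumption, k):
    """Return the k task names with the largest resource consumption, largest first.

    consumption: maps a task name to its resource consumption (for example
    CPU-hours; the unit is an assumption of this sketch).
    """
    return sorted(consumption, key=consumption.get, reverse=True)[:k]


consumption = {"daily_report": 5, "user_profile_build": 20, "log_cleanup": 12}
print(select_heaviest(consumption, k=2))
```

Moving the heaviest consumers first frees the most peak-period resources per rescheduled task, which is presumably why the claim ranks by consumption.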
20. The method according to claim 2, characterized in that, before notifying the task scheduling platform to move the at least one target data processing task to a target period outside the data processing peak period for processing, the method further comprises:
Generating an adjustment notification for the target data processing task and issuing it to a task management client;
Receiving the task management client's confirmation instruction for the adjustment notification.
21. A task scheduling method, characterized by comprising:
Receiving an adjustment notification, the adjustment notification instructing that at least one target data processing task be moved to a target period outside a data processing peak period for processing, wherein the target data processing task is obtained by identifying the data processing peak period of a data processing platform, extracting multiple data processing tasks in the data processing peak period, and screening out, according to the task attribute of each data processing task, target data processing tasks whose processing time is adjustable;
Moving, according to the notification, the at least one target data processing task to the target period outside the data processing peak period for processing;
Scheduling the adjusted target data processing task.
22. A data processing task management apparatus, characterized by comprising:
A peak period identification module, configured to identify the data processing peak period of a data processing platform;
A task extraction module, configured to extract multiple data processing tasks in the data processing peak period;
A task screening module, configured to screen out, according to the task attribute of each data processing task, target data processing tasks whose processing time is adjustable;
A task adjustment notification module, configured to notify a task scheduling platform to move at least one target data processing task to a target period outside the data processing peak period for processing.
23. The apparatus according to claim 22, characterized in that the peak period identification module is specifically configured to select at least one data processing period whose historical task processing volume exceeds a set threshold as the data processing peak period.
24. The apparatus according to claim 23, characterized in that the apparatus further comprises:
A record extraction module, configured to extract historical processing records of multiple data processing tasks from the data processing platform before the data processing peak period of the data processing platform is identified;
A processing volume analysis module, configured to analyze the historical processing records and determine the historical task processing volume of each data processing period.
25. The apparatus according to claim 24, characterized in that the processing volume analysis module comprises:
A processing period division submodule, configured to divide a unit time into multiple data processing periods;
A task volume statistics submodule, configured to count the sum of the historical task processing volume of each data processing period over multiple unit times.
26. The apparatus according to claim 23, characterized in that the peak period identification module further comprises:
A current task volume statistics submodule, configured to count the current task processing volume of the data processing peak period;
A peak period re-screening submodule, configured to re-screen at least one data processing peak period according to the current task processing total if the results obtained by comparing the current task processing volume and the historical task processing volume, respectively, with the set threshold are inconsistent.
27. The apparatus according to claim 23, characterized in that the task adjustment notification module comprises:
A delay duration lookup submodule, configured to look up the delay duration corresponding to the set threshold exceeded by the historical data computation total;
A delay duration notification submodule, configured to notify the task scheduling platform to delay, according to the delay duration found, the target data processing tasks in the data processing peak period.
28. The apparatus according to claim 26, characterized in that there are multiple set thresholds;
The delay duration lookup submodule is specifically configured to look up the delay duration corresponding to the largest set threshold exceeded by the historical data computation total, wherein the larger the set threshold, the longer the corresponding delay duration.
29. A task scheduling apparatus, characterized by comprising:
A notification receiving module, configured to receive a notification, wherein the target data processing task is obtained by identifying the data processing peak period of a data processing platform, extracting multiple data processing tasks in the data processing peak period, and screening out, according to the task attribute of each data processing task, target data processing tasks whose processing time is adjustable;
An adjusting module, configured to move, according to the notification, at least one target data processing task to a target period outside the data processing peak period for processing;
A scheduling module, configured to schedule the adjusted target data processing task.
CN201610677839.8A 2016-08-16 2016-08-16 Data processing management system and task management, method for scheduling task and device Pending CN107766143A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610677839.8A CN107766143A (en) 2016-08-16 2016-08-16 Data processing management system and task management, method for scheduling task and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610677839.8A CN107766143A (en) 2016-08-16 2016-08-16 Data processing management system and task management, method for scheduling task and device

Publications (1)

Publication Number Publication Date
CN107766143A true CN107766143A (en) 2018-03-06

Family

ID=61260185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610677839.8A Pending CN107766143A (en) 2016-08-16 2016-08-16 Data processing management system and task management, method for scheduling task and device

Country Status (1)

Country Link
CN (1) CN107766143A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109597681A (en) * 2018-10-22 2019-04-09 平安科技(深圳)有限公司 Cloud control method, device, computer equipment and storage medium
CN110728588A (en) * 2018-07-17 2020-01-24 新智数字科技有限公司 Meter reading method, remote management platform and business system
CN110781180A (en) * 2019-09-05 2020-02-11 腾讯科技(深圳)有限公司 Data screening method and data screening device
CN110941513A (en) * 2019-11-22 2020-03-31 浪潮电子信息产业股份有限公司 Data reconstruction method and related device
CN111782377A (en) * 2020-07-24 2020-10-16 Oppo广东移动通信有限公司 Task execution control method and device and computer readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001318711A (en) * 2000-05-11 2001-11-16 Nkk Corp Scheduling device
CN101446910B (en) * 2008-12-08 2011-06-22 哈尔滨工程大学 AEDF task scheduling method based on SMP
CN102455932A (en) * 2010-10-22 2012-05-16 金蝶软件(中国)有限公司 Serial execution method, device and system for task instances
CN103365708A (en) * 2012-04-06 2013-10-23 阿里巴巴集团控股有限公司 Method and device for scheduling tasks
CN104731649A (en) * 2015-04-21 2015-06-24 中国建设银行股份有限公司 Multi-task processing method and multi-task processing device
CN104935633A (en) * 2015-04-24 2015-09-23 北京金山安全软件有限公司 Information publishing method and service equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHU JIANGHAN: "Research on the Approaches of Reconstructing Imaging Reconnaissance Task Flow and Adjusting Task Flow Peaks", 《JOURNAL OF THE ACADEMY OF EQUIPMENT COMMAND & TECHNOLOGY》 *
邵安兆: "《现代企业经营与管理》", 31 August 1995 *
黄日胜: "异构并行系统中高时效性任务的节能调度方法", 《计算机应用与软件》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728588A (en) * 2018-07-17 2020-01-24 新智数字科技有限公司 Meter reading method, remote management platform and business system
CN109597681A (en) * 2018-10-22 2019-04-09 平安科技(深圳)有限公司 Cloud control method, device, computer equipment and storage medium
CN109597681B (en) * 2018-10-22 2024-05-07 平安科技(深圳)有限公司 Cloud control method and device, computer equipment and storage medium
CN110781180A (en) * 2019-09-05 2020-02-11 腾讯科技(深圳)有限公司 Data screening method and data screening device
CN110781180B (en) * 2019-09-05 2022-08-30 腾讯科技(深圳)有限公司 Data screening method and data screening device
CN110941513A (en) * 2019-11-22 2020-03-31 浪潮电子信息产业股份有限公司 Data reconstruction method and related device
CN110941513B (en) * 2019-11-22 2022-03-22 浪潮电子信息产业股份有限公司 Data reconstruction method and related device
CN111782377A (en) * 2020-07-24 2020-10-16 Oppo广东移动通信有限公司 Task execution control method and device and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN107766143A (en) Data processing management system and task management, method for scheduling task and device
Petrovic et al. Fuzzy job shop scheduling with lot-sizing
US20040122647A1 (en) Apparatus and method for managing the performance of an electronic device
EP3188096A1 (en) Data analysis for predictive scheduling optimization for product production
CN104679595B (en) A kind of application oriented IaaS layers of dynamic resource allocation method
Tirkel Forecasting flow time in semiconductor manufacturing using knowledge discovery in databases
CN109885397A (en) The loading commissions migration algorithm of time delay optimization in a kind of edge calculations environment
CN103685347B (en) Method and device for allocating network resources
Crist et al. Prioritising production and engineering lots in wafer fabrication facilities: a simulation study
US20190347593A1 (en) Method for improving semiconductor back-end factories
Lin et al. Integrating analytical hierarchy process to genetic algorithm for re-entrant flow shop scheduling problem
CN113191533A (en) Warehouse employment prediction method, device, equipment and storage medium
Zanjani et al. Robust multi-objective hybrid flow shop scheduling
Guirguis et al. Adaptive scheduling of web transactions
Koskinen et al. Rolling horizon production scheduling of multi-model PCBs for several assembly lines
Chia Yee et al. Weighted grey relational analysis to evaluate multilevel dispatching rules in wafer fabrication
Rabbani et al. A novel bi-level hierarchy towards available-to-promise in mixed-model assembly line sequencing problems
Bayati Power management policy for heterogeneous data center based on histogram and discrete-time mdp
Kim et al. A due date-based approach to part type selection in flexible manufacturing systems
US10496081B2 (en) Method for fulfilling demands in a plan
Wang et al. A hybrid flowshop scheduling model considering dedicated machines and lot-splitting for the solar cell industry
Lohmer et al. Order release methods in semiconductor manufacturing: State-of-the-art in science and lessons from industry
Lee et al. Iterative procedures for multi-period order selection and loading problems in flexible manufacturing systems
Yang et al. Resource allocation algorithm and job scheduling of virtual manufacturing workshop
Guo et al. Green data analytics of supercomputing from massive sensor networks: Does workload distribution matter?

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination