CN102096687B - Method and platform for scheduling tasks - Google Patents

Method and platform for scheduling tasks

Info

Publication number: CN102096687B
Authority: CN (China)
Prior art keywords: task, pending, virtual, current task, current
Legal status: Expired - Fee Related (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN 200910254276
Other languages: Chinese (zh)
Other versions: CN102096687A (en)
Inventor: 王玮晨
Current Assignee: Alibaba Cloud Computing Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Alibaba Group Holding Ltd
Application filed by: Alibaba Group Holding Ltd
Priority: CN 200910254276 (CN102096687B); HK11108717.2A (HK1154677A1)
Publications: CN102096687A; CN102096687B (application granted)

Abstract

The invention provides a method and a platform for scheduling tasks. The method comprises: obtaining the current task identifier of a current task and the virtual task identifier of a virtual task; dynamically generating, according to the current task identifier and the number of data storage files that the original data source occupies in a data warehouse, the same number of tasks to be executed, the tasks to be executed being subtasks of the current task; setting the virtual task as the subtask of the tasks to be executed; and scheduling the tasks according to the relationships among the current task, the tasks to be executed, and the virtual task. Because the subtasks are generated dynamically at the moment the current task runs, according to the number of storage files, and are set as parent tasks of the virtual task, the scheduling method can also handle application scenarios in which the content of the current task is not known in advance, which improves the flexibility and adaptability of the scheduling platform.

Description

A method and platform for scheduling tasks
Technical field
The present application relates to the field of databases, and in particular to a method and platform for scheduling tasks.
Background
ETL (Extract, Transform, Load) denotes the process of extracting, transforming, and loading data in the data warehouse of a back-end system. In the ETL scheduling system of a data warehouse, the execution order of tasks is determined by the parent-child relationships between them: the parent task is executed first, and once it completes, its subtasks are executed automatically.
In the prior art, the parent-child relationships between tasks are configured statically before the tasks are executed. That is, all subtasks of the current task are configured before the current task runs, and the scheduling system simply invokes the program module of each task in turn according to the statically configured relationships.
As this shows, the relationships between tasks in the prior art are generated before each scheduling run, so the prior art suits only static scheduling situations in which the subtasks of the current task are known before scheduling. In practice, however, if the data to be processed is updated in real time, the prior art cannot generate the tasks needed to process that real-time data, and the existing technique of presetting the parent-child relationships between tasks does not meet the needs of such real-time scenarios.
In short, a technical problem that those skilled in the art urgently need to solve is how to devise a new task scheduling method that overcomes the inability of prior-art scheduling methods to handle such special application scenarios.
Summary of the invention
The technical problem to be solved by the present application is to provide a method for scheduling tasks that overcomes the inability of prior-art scheduling methods to handle special application scenarios.
The present application also provides a platform for scheduling tasks, to ensure that the above method can be implemented and applied in practice.
To solve the above problem, the present application discloses a method for scheduling tasks, comprising:
obtaining the current task identifier of a current task and the virtual task identifier of a virtual task;
dynamically generating, according to the current task identifier and the number of data storage files of the raw data source in the data warehouse, the same number of pending tasks, the pending tasks being subtasks of the current task;
setting the virtual task as the subtask of the pending tasks;
scheduling the tasks according to the relationships among the current task, the pending tasks, and the virtual task.
The present application also discloses a platform for scheduling tasks, comprising:
an acquiring unit, configured to obtain the current task identifier of a current task and the virtual task identifier of a virtual task;
a dynamic generation unit, configured to dynamically generate, according to the current task identifier and the number of data storage files of the raw data source in the data warehouse, the same number of pending tasks, the pending tasks being subtasks of the current task;
a setting unit, configured to set the virtual task as the subtask of the pending tasks;
a scheduling unit, configured to schedule the tasks according to the relationships among the current task, the pending tasks, and the virtual task.
Compared with the prior art, the present application has the following advantages:
In the present application, the subtasks of the current task are set dynamically during scheduling. First, the current task identifier of the current task and the virtual task identifier of the virtual task are obtained. Next, according to the current task identifier and the number of data storage files of the raw data source in the data warehouse, the same number of pending tasks are dynamically generated as subtasks of the current task. At the same time, the virtual task is set as the subtask of the pending tasks. Finally, the tasks are scheduled according to the relationships among the current task, the pending tasks, and the virtual task. Because the subtasks are generated dynamically at the moment the current task runs, according to the number of storage files, and are set as parent tasks of the virtual task, the scheduling method of the present application can also handle special application scenarios in which the content of the current task is not known in advance, which improves the flexibility and adaptability of the scheduling platform. Of course, a product implementing the present application need not achieve all of the above advantages at the same time.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below obviously illustrate only some embodiments of the present application; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of method embodiment 1 of the task scheduling method of the present application;
Fig. 2 is a flowchart of method embodiment 2 of the task scheduling method of the present application;
Fig. 3 is a schematic diagram of the current task and the virtual task in method embodiment 2 of the present application;
Fig. 4 is a schematic diagram of the current task and the pending tasks after the pending tasks have been dynamically generated as subtasks in method embodiment 2 of the present application;
Fig. 5 is a schematic diagram of the relationships among the current task, the pending tasks, and the virtual task in method embodiment 2 of the present application;
Fig. 6 is a flowchart of method embodiment 3 of the task scheduling method of the present application;
Fig. 7 is a structural block diagram of platform embodiment 1 of the task scheduling platform of the present application;
Fig. 8 is a structural block diagram of platform embodiment 2 of the task scheduling platform of the present application;
Fig. 9 is a structural block diagram of platform embodiment 3 of the task scheduling platform of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are obviously only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present application.
The present application can be used in numerous general-purpose or special-purpose computing environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor devices, and distributed computing environments that include any of the above devices.
The present application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
One of the main ideas of the present application is as follows. The current task identifier and the virtual task identifier are obtained first, and the current task is then executed. The content of the current task is to generate a number of pending subtasks according to the state of the data at the current moment; that is, the current task is executed precisely in order to generate several subtasks (pending tasks), each of which is assigned a task identifier. The identifiers of the generated subtasks are then obtained, and the virtual task is set as the subtask of these subtasks, so that during scheduling, whether the generated subtasks have finished can be judged from whether the virtual task has finished. In the embodiments of the present application, the subtasks are generated dynamically at the moment the current task runs, according to the number of storage files, and are set as parent tasks of the virtual task. The scheduling method can therefore also handle application scenarios in which the content of the current task is not known in advance, which improves the flexibility and adaptability of the scheduling platform.
Referring to Fig. 1, which shows a flowchart of method embodiment 1 of the task scheduling method of the present application, the method can comprise the following steps:
Step 101: obtain the current task identifier of the current task and the virtual task identifier of the virtual task.
In practice, the executing entity of this embodiment can be a scheduling platform. The program modules corresponding to the current task and the virtual task are two pieces of code in the scheduling platform system; the current task identifier and the virtual task identifier are the unique names of these two pieces of code, and the two tasks are not directly associated. The current task identifier is obtained in order to call the program module of the current task, so that running the current task generates the subtasks (pending tasks) that need to be executed; the content of each pending task is the processing of the data to be processed. The virtual task identifier is obtained in order to set the virtual task as the subtask of all pending tasks: after the pending tasks have run, if the program module of the virtual task can be called, all pending tasks have finished.
Step 102: according to the current task identifier and the number of data storage files of the raw data source in the data warehouse, dynamically generate the same number of pending tasks, the pending tasks being subtasks of the current task.
This embodiment applies to task scheduling when the data in a raw data table of the front-end database is synchronized to the data warehouse of the back-end system. The data in the front-end database resembles the real-time data of a production system and exists in the form of raw data tables. When the real-time data in a raw data table is synchronized to the data warehouse of the back-end system, the raw data table must be split into multiple files for storage.
In this step, the program module of the current task is obtained according to the current task identifier. The content of the current task is to dynamically generate multiple pending tasks, so when the scheduling platform calls and executes the program module, the pending tasks are generated. The scheduling platform queries the number of storage files of the raw data table in the data warehouse, that is, how many files the raw data table is split into when its data is stored in the data warehouse, and dynamically generates the same number of pending tasks (which are subtasks of the current task). The number of storage files is the number of individual files in which the real-time data of the raw data table must be stored when it is synchronized to the back-end system; the number of pending tasks generated in this step equals that number. The content of each pending task is to synchronize the data in the raw data table into the corresponding data storage file in the data warehouse.
Step 103: set the virtual task as the subtask of the pending tasks.
At the same time, the virtual task is set as the subtask of the pending tasks according to the virtual task identifier. Note that the parent-child relationships among the current task, the pending tasks, and the virtual task can be represented in a preset data table; for example, the value of the field Parentid in that table can denote the parent task. In that case, this step is implemented by writing the relationship between the virtual task identifier and the pending task identifiers into the data table.
Step 104: schedule the tasks according to the relationships among the current task, the pending tasks, and the virtual task.
This step schedules and executes the tasks according to the parent-child relationships among the current task, the pending tasks, and the virtual task. First, the corresponding execution module is called according to the current task identifier; executing the current task is the process of generating multiple pending tasks according to the number of storage files. The pending tasks are then run. Once they have all finished, the virtual task is executed, and the scheduling platform can determine whether the pending tasks have completed by checking whether the virtual task has been executed.
In this embodiment, the ETL scheduling system can dynamically generate pending tasks at run time according to the number of storage files, so the relationships between tasks can change dynamically during operation. The real-time data in the raw data tables of the front-end database is updated continuously, so the number of storage files in the data warehouse of the back-end system also changes in real time, and the embodiment determines the number of pending tasks dynamically from that changing number. The scheduling method of this embodiment therefore also suits special scenarios, makes a scheduling platform that uses it more flexible and controllable, and improves the platform's efficiency in determining the parent-child relationships between tasks.
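Steps 102 and 103 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the names `TaskGraph` and `build_graph`, the in-memory dictionary standing in for the relation table, and the starting pending id 1001 are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TaskGraph:
    """Parent -> children edges; a stand-in for the task relation table."""
    children: dict = field(default_factory=dict)

    def add_edge(self, parent_id, child_id):
        self.children.setdefault(parent_id, []).append(child_id)


def build_graph(current_id, virtual_id, file_count, first_pending_id=1001):
    """Create one pending task per storage file and wire up both relationships."""
    graph = TaskGraph()
    pending_ids = [first_pending_id + k for k in range(file_count)]
    for pending_id in pending_ids:
        graph.add_edge(current_id, pending_id)  # pending task is a child of the current task
        graph.add_edge(pending_id, virtual_id)  # virtual task is a child of every pending task
    return graph, pending_ids


# Hypothetical run: the raw data table is currently split into 3 storage files.
graph, pending = build_graph(current_id=100, virtual_id=101, file_count=3)
```

The number of pending tasks is decided only when `build_graph` is called, which mirrors the point of the embodiment: the relationships are not configured before the current task runs.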
Referring to Fig. 2, which shows a flowchart of method embodiment 2 of the task scheduling method of the present application, the method can comprise the following steps:
Step 201: set up a task information table and a task relation table. The task information table stores the details of the current task, the pending tasks, and the virtual task; the task relation table stores the relationships among the current task, the pending tasks, and the virtual task.
In this embodiment, the task information table and the task relation table must be set up first. The task information table stores the details of each task, which can include the task identifier, the name of the corresponding program module, the parameters that need to be preset, and so on. An example of the task information table is shown in Table 1:
Table 1
| Field name | Field type | Description | Sample |
| --- | --- | --- | --- |
| Id | Number | Task id, a unique identifier. Task ids in the range 1000 to 9999 change; all other task ids are fixed. | 78778 |
| Prgname | Varchar2(100) | Name of the task; the scheduling system executes the module named in this field to schedule the task. | Tbods.mission_create_table |
| ... | ... | (Intermediate fields not used by this scheme are omitted.) | |
| Paravalue (parameter value) | Varchar2(400) | Parameters that must be passed to the execution module. | ('crm_scheme_satisfaction','tbods') |
The field "prgname" identifies the name of the program module corresponding to a task. The sample row in Table 1 indicates that the program module corresponding to that task is named Tbods.mission_create_table; that is, when the task is executed, the program module Tbods.mission_create_table is called. The field "parameter value" gives the parameters needed to execute this program module, for example the name of the raw data table and the user name of the raw data table. The raw data table name indicates which raw data table the real-time data of the front-end system is taken from, and the user name indicates under which user that raw data table is found; together, the table name and the user name uniquely determine a raw data table.
In practice, for example, the parameters that the program module takes can be expressed as follows:
table_name_para (raw data table name), owner (user name of this raw data table)
(table_name_para varchar2,  -- name of the raw data table
 owner           varchar2,  -- user name of this raw data table
 jobid           number,    -- fixed value 1
 auctionid       number,    -- also fixed value 1
 n               number,    -- defaults to 1
 i           out number)    -- return value: 0 is success, 1 is failure
The task relation table stores the parent-child relationships among the current task, the pending tasks, and the virtual task. In practice, the structure of the task relation table can be as shown in Table 2:
Table 2
| Field name | Field type | Description | Sample |
| --- | --- | --- | --- |
| Id | Number | Task id; corresponds one-to-one with the id in the task information table. | 78778 |
| Parentid | Number | Parent task id; corresponds one-to-one with the id in the task information table. | 78779 (the task to run after task 78779 completes is 78778) |
| Gmtdate | Date | Creation date of the task. | 20090807 |
| SKEDID | Number | Tree id, used to distinguish relationship scopes. The parent-child relationship between the fields Id and Parentid holds only when the value is 1. | 1 |
The "ID" in the task relation table corresponds one-to-one with the ID in the task information table; in Table 2, this means that the details of task "78778" can be obtained from the task information table (Table 1). The "ID" and "Parentid" fields of the task relation table together express the hierarchy between tasks: the task named by "Parentid" is the parent of the task named by "ID". The field "SKEDID" is 1 by default, which means that the parent-child relationship between the "ID" and "Parentid" fields holds. In Table 2, the subtask of 78779 is 78778; that is, the task to run after task 78779 completes is 78778.
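As a rough illustration of how the task relation table could be read, the following Python sketch looks up the subtasks of a parent while honoring the SKEDID rule. The `RelationRow` type, the `children_of` helper, and the second (inactive) row are hypothetical and exist only for this example; the first row uses the sample values from Table 2.

```python
from typing import NamedTuple


class RelationRow(NamedTuple):
    id: int        # child task id
    parentid: int  # parent task id
    gmtdate: str   # creation date of the task, e.g. "20090807"
    skedid: int    # the relationship holds only when this is 1


def children_of(rows, parent_id):
    """Return the ids of tasks to run after `parent_id` completes."""
    return [row.id for row in rows if row.parentid == parent_id and row.skedid == 1]


rows = [
    RelationRow(78778, 78779, "20090807", 1),  # sample row from Table 2
    RelationRow(78780, 78779, "20090807", 0),  # hypothetical row whose relationship does not hold
]
next_tasks = children_of(rows, 78779)
```

Only the row whose SKEDID is 1 contributes a subtask, so `next_tasks` contains task 78778 alone.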
For example, the identifiers of the pending tasks can be generated with the following code:
create sequence MUL_FILE_MISSION
minvalue 1000
maxvalue 9999
start with 3736
increment by 1
nocache
cycle;
The above code creates a sequence named "MUL_FILE_MISSION"; this sequence is then used to generate incrementing task identifiers for the pending tasks, and each identifier is the value of the field "ID" in the task information table. The identifier values range from 1000 to 9999, which avoids conflicts with the identifiers of other, static tasks.
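For readers unfamiliar with Oracle sequences, the behavior of a cycling sequence can be approximated in Python as below. This is only a sketch of the wrap-around semantics under the parameters given above (minimum 1000, maximum 9999, start 3736, increment 1); the class name `CyclingSequence` is an assumption, and the sketch does not model caching, concurrency, or persistence.

```python
class CyclingSequence:
    """Approximates an ascending sequence that wraps to minvalue after maxvalue."""

    def __init__(self, minvalue, maxvalue, start, increment=1):
        if not minvalue <= start <= maxvalue:
            raise ValueError("start must lie between minvalue and maxvalue")
        self.minvalue = minvalue
        self.maxvalue = maxvalue
        self.increment = increment
        self._next = start

    def nextval(self):
        """Return the next identifier, wrapping around once maxvalue is passed."""
        value = self._next
        candidate = value + self.increment
        self._next = candidate if candidate <= self.maxvalue else self.minvalue
        return value


seq = CyclingSequence(minvalue=1000, maxvalue=9999, start=3736)
first_ids = [seq.nextval() for _ in range(3)]
```

Because the identifiers cycle within a fixed range, old pending-task rows must be cleared before their ids come around again, which is consistent with the cleanup described for step 601.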
Note that the task relation table and the task information table set up in this step can be reused after being created once; the scheduling platform does not need to create them each time it schedules tasks.
Step 202: obtain the current task identifier of the current task and the virtual task identifier of the virtual task from the preset task information table.
When the preset task relation table and task information table are set up, the values of the field "prgname" are specified; these field values represent the current task identifier and the virtual task identifier, which in practice correspond to the program modules of the current task and the virtual task. In this step, the current task identifier and the virtual task identifier can therefore be obtained from the preset task relation table and task information table. The back-end program modules corresponding to these two identifiers need not be related in any way.
For example, in this step the current task identifier obtained corresponds to program module a, whose task identifier is 100, and the virtual task identifier corresponds to program module b, whose task identifier is 101. A schematic of the program modules is shown in Fig. 3.
Step 203: call the program module corresponding to the current task according to the current task identifier.
Calling the program module corresponding to the current task identifier means, in this embodiment, calling module a; the content of module a is to generate the pending tasks and their identifiers.
Step 204: the program module obtains the number of data storage files used when the raw data source of the front-end system is synchronized to the data warehouse of the back-end system.
When program module a is executed, it obtains the number of data storage files used when the raw data source of the front-end system is synchronized to the data warehouse of the back-end system, so that the same number of pending tasks can subsequently be generated. In practice, this step can also obtain the location parameters of the data storage files. A location parameter comprises a file identifier and the position information of the data object. The file identifier is the file ID used when a table of the ORACLE database is stored, and uniquely determines a data storage file. The position information describes where the raw data lies within the data storage file, for example its row numbers, and is used to determine the raw data in the file, that is, the data object that a pending task must fetch when it executes. Because this embodiment is concerned specifically with how the execution order among the current task, the pending tasks, and the virtual task is determined, the location parameters need not be obtained first; they can also be obtained later, during synchronization.
Step 205: generate the same number of pending tasks, and their identifiers, according to the number of data storage files.
Note that in this embodiment the number of pending tasks must equal the number of data storage files, with one pending task per data storage file. The identifiers of the pending tasks are generated at the same time; each is an ID that uniquely determines one pending task.
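The one-to-one pairing of pending tasks with data storage files in steps 204 and 205 might be modeled as follows. The types `FileLocation` and `PendingTask`, the row-range fields, and the sample file ids are hypothetical; the source states only that a location parameter holds a file identifier plus position information such as row numbers.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class FileLocation:
    file_id: int    # uniquely identifies one data storage file
    first_row: int  # assumed form of the position information: a row range
    last_row: int


@dataclass(frozen=True)
class PendingTask:
    task_id: int
    location: FileLocation


def make_pending_tasks(locations, id_source):
    """Generate exactly one pending task per data storage file."""
    return [PendingTask(next(id_source), location) for location in locations]


# Hypothetical state at run time: the raw data table is split across two files.
locations = [FileLocation(7, 1, 500), FileLocation(8, 1, 320)]
tasks = make_pending_tasks(locations, count(1001))
```

Here `itertools.count(1001)` merely stands in for whatever generates the identifiers; any iterator of unique ids would serve.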
Step 206: set the pending tasks as subtasks of the current task.
The pending tasks are set as subtasks of the current task. In a concrete implementation, the parent-child relationship can be set in the preset task relation table. For example, with program module a corresponding to the current task and program module b corresponding to the virtual task as above, after pending tasks c and d are generated, the relationships can be as shown in Fig. 4. As can be seen, the running order is as follows: running module a generates subtasks c and d; c and d are executed after task a has finished; and the subtask of pending tasks c and d is virtual task b.
Step 207: store the relationship between the pending tasks and the current task in the preset task relation table.
Storing the relationship between a pending task and the current task in the task relation table means, in practice, saving the two task identifiers as a parent-child pair in that table.
Step 208: set the virtual task as the subtask of the pending tasks.
In this step, after the relationships between the tasks have been established, the relationships among program modules a and b and pending tasks c and d can be as shown in Fig. 5.
Step 209: save the parent-child relationships between the pending tasks and the virtual task in the preset task relation table.
After the parent-child relationships between the pending tasks and the virtual task have been saved in this step, the data in the preset task relation table is as shown in Table 3:
Table 3
| Subtask identifier (Id) | Parent task identifier (Parentid) | Gmtdate | Skedid |
| --- | --- | --- | --- |
| 1001 | 100 | Current date | 1 (default 1) |
| 1002 | 100 | Current date | 1 (default 1) |
| 101 | 1001 | Current date | 1 (default 1) |
| 101 | 1002 | Current date | 1 (default 1) |
The data in Table 3 is inserted into the task relation table by program module a while it runs. The scheduling platform determines the execution order of tasks from the relationships recorded in the task relation table. As Table 3 shows, the subtask of task 1001 is 101; that is, after task 1001 has finished, the scheduling system executes the task with identifier 101. All four records in Table 3 are relationships produced while program module a (the current task, with identifier 100) executes, and the pending tasks with identifiers 1001 and 1002 are generated by module a.
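The four records of Table 3 can be reproduced with a short Python sketch. The function name `relation_rows` and the tuple layout `(id, parentid, gmtdate, skedid)` are assumptions for illustration, and the date passed in is arbitrary (the source says only "current date").

```python
from datetime import date


def relation_rows(current_id, virtual_id, pending_ids, today):
    """Build the rows written while the current task's module runs."""
    gmtdate = today.strftime("%Y%m%d")
    # Steps 206-207: each pending task is a subtask of the current task.
    rows = [(pending_id, current_id, gmtdate, 1) for pending_id in pending_ids]
    # Steps 208-209: the virtual task is a subtask of each pending task.
    rows += [(virtual_id, pending_id, gmtdate, 1) for pending_id in pending_ids]
    return rows


rows = relation_rows(100, 101, [1001, 1002], date(2009, 8, 7))
```

With two pending tasks the function yields four rows in the same order as Table 3; with N storage files it would yield 2N rows.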
Step 2010: the program module corresponding to the current task executes the pending tasks according to the identifiers of the data storage files.
At this point, program module a executes the pending tasks c and d. It locates each data storage file by its file identifier and retrieves the content of the data object according to the previously obtained position information, completing the ETL process for the raw data.
Step 2011: when the pending tasks have finished, execute the virtual task according to the parent-child relationships between the pending tasks and the virtual task, completing the current round of task scheduling.
In this step, once the pending tasks have finished, the task relation table shows that the next subtask to execute is the virtual task; at that point the current round of task scheduling can be considered complete.
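The resulting execution order can be illustrated with a small dependency-driven traversal in Python. This sketch assumes that the scheduler starts a task only after all of its parent tasks have finished, which is what lets the virtual task signal completion; the function `run_order` and its `(child, parent)` input format are hypothetical.

```python
from collections import defaultdict, deque


def run_order(relation_pairs, root_id):
    """Return the order in which tasks run, given (child_id, parent_id) pairs."""
    children = defaultdict(list)
    unfinished_parents = defaultdict(int)
    for child_id, parent_id in relation_pairs:
        children[parent_id].append(child_id)
        unfinished_parents[child_id] += 1

    order = []
    ready = deque([root_id])
    while ready:
        task_id = ready.popleft()
        order.append(task_id)  # "execute" the task
        for child_id in children[task_id]:
            unfinished_parents[child_id] -= 1
            if unfinished_parents[child_id] == 0:  # all parents have finished
                ready.append(child_id)
    return order


# Relationships from Table 3: current task 100, pending 1001/1002, virtual 101.
order = run_order([(1001, 100), (1002, 100), (101, 1001), (101, 1002)], root_id=100)
```

The virtual task 101 appears last in `order`, after both pending tasks, so reaching it indicates that every pending task has completed.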
This embodiment uses the preset task relation table and task information table to set and store the relationships between tasks, and the scheduling system executes tasks by consulting the parent-child relationships in the task relation table. Dynamically generated pending tasks can thus meet the needs of real application scenarios: at run time, tasks are scheduled according to the number of pending tasks actually generated, which improves the scheduling efficiency of the scheduling platform as well as its flexibility and adaptability.
Referring to Fig. 6, a flowchart of method embodiment 3 of the task scheduling method of the present application is shown. This embodiment can be understood as a concrete example of applying the task scheduling method in practice, taking as an example the scheduling of the tasks that synchronize the commodity table information of the Taobao foreground system into the data warehouse. The method may include the following steps:
Step 601: delete, from the preset task information table and task relation table, the data related to the previous execution of the current task.
In this embodiment, a preset task information table and a preset task relation table are used to store the relationships among the current task, the pending tasks and the virtual task, as well as the details of each task. The data in the task relation table represents the parent-child relationships among the current task, the pending tasks and the virtual task, which can be expressed with the fields "parentid" and "id" respectively. The task information table stores the details of each task. In practice, the current task identifier (for example 10445) and the virtual task identifier (for example 10762) are uniquely determined; that is, the names of the program modules corresponding to the current task and the virtual task are fixed. However, each time tasks are scheduled, the pending tasks and their contents must be determined from the number of storage files that exist in the background system at that moment. Therefore, each time the raw data table of the foreground system is synchronized into the data warehouse of the background system, the related data left in the preset task information table and task relation table by the previous run must first be deleted. For example, if the current task identifier (parent_id) of the previous run is 10445, all subtask information under this task can be found by this id in the task relation table and the task information table. These subtasks were generated by the previous run, so their information is deleted in full, allowing new subtask (pending task) information to be written into the task information table and the task relation table during the current scheduling run.
The purpose of this step is to delete the details of the pending tasks generated in the previous run from the task information table, and to delete the parent-child task relationships generated in the previous run from the task relation table.
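The cleanup of step 601 can be sketched as follows. This is only a minimal illustration using an in-memory SQLite database: the table names `task_info` and `task_relation`, the column layout, the placeholder module name `sub_task`, and the helper `delete_previous_run` are assumptions made for the example; the application itself only names the fields "parentid" and "id".

```python
import sqlite3


def delete_previous_run(conn: sqlite3.Connection, current_task_id: int) -> int:
    """Delete the subtasks generated by the previous run of `current_task_id`.

    Returns the number of stale subtasks that were removed.
    """
    # The subtasks of the previous run are the rows whose parent is the current task.
    child_ids = [row[0] for row in conn.execute(
        "SELECT id FROM task_relation WHERE parentid = ?", (current_task_id,))]
    for child_id in child_ids:
        # Remove the details of the stale subtask.
        conn.execute("DELETE FROM task_info WHERE id = ?", (child_id,))
        # Remove every relation row the stale subtask takes part in
        # (as child of the current task or as parent of the virtual task).
        conn.execute(
            "DELETE FROM task_relation WHERE id = ? OR parentid = ?",
            (child_id, child_id))
    conn.commit()
    return len(child_ids)


# Hypothetical state left behind by a previous run with two subtasks.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_info (id INTEGER PRIMARY KEY, prgname TEXT)")
conn.execute("CREATE TABLE task_relation (parentid INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO task_info VALUES (?, ?)",
    [(10445, "tbods.file_mul_table_new"), (10762, "tbodspart_create_exchange"),
     (1000, "sub_task"), (1001, "sub_task")])
conn.executemany(
    "INSERT INTO task_relation VALUES (?, ?)",
    [(10445, 1000), (10445, 1001), (1000, 10762), (1001, 10762)])

removed = delete_previous_run(conn, 10445)
remaining_info = [r[0] for r in conn.execute("SELECT id FROM task_info ORDER BY id")]
remaining_relations = conn.execute("SELECT COUNT(*) FROM task_relation").fetchone()[0]
```

In this sketch the fixed records of the current task and the virtual task survive the cleanup, while every row that mentions a stale subtask is removed.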
Step 602: obtain the current task identifier of the current task and the virtual task identifier of the virtual task.
In this step, the current task identifier (that is, the task id of the parent task of the pending tasks) needs to be obtained. For example, a real record obtained in practice from the Taobao data warehouse is as follows:
ID     PRGNAME                   PARAVALUE
10445  tbods.file_mul_table_new  ('auction_auctions_stb','tbods',$jobid,$actionid,$i)
Here, the id field is the identifier of the task in the scheduling system, prgname is the name of the corresponding program module, and paravalue holds the parameters that must be passed in when the program module is executed. The purpose of this record is to make the pending tasks that will be generated the subtasks of task 10445.
At the same time, the virtual task identifier (that is, the task id of the subtask of the pending tasks) needs to be obtained. For example, a real record obtained in practice from the data warehouse is as follows:
ID     PRGNAME                    PARAVALUE
10762  tbodspart_create_exchange  ('auction_auctions_stb','tbods')
The purpose of this record is to make the pending tasks that will be generated the parent tasks of task 10762. After all pending tasks have finished running, the execution of virtual task 10762 indicates that the synchronization of the raw data table of the foreground system is complete.
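The two records above can be modelled as follows. This is a hedged sketch: the `TaskRecord` class, the `TASK_INFO` dictionary standing in for the task information table, and the helper `render_parameters` are hypothetical, and treating `$jobid`, `$actionid` and `$i` as placeholders that are filled in at run time is an assumption that the text does not confirm.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class TaskRecord:
    id: int          # identifier of the task in the scheduling system
    prgname: str     # name of the corresponding program module
    paravalue: str   # parameters passed in when the module is executed


# Stand-in for the task information table, holding the two fixed records.
TASK_INFO = {
    10445: TaskRecord(10445, "tbods.file_mul_table_new",
                      "('auction_auctions_stb','tbods',$jobid,$actionid,$i)"),
    10762: TaskRecord(10762, "tbodspart_create_exchange",
                      "('auction_auctions_stb','tbods')"),
}


def render_parameters(record: TaskRecord, **values: object) -> str:
    """Fill the $-prefixed placeholders of a record (assumed semantics)."""
    return Template(record.paravalue).substitute(**values)


current = TASK_INFO[10445]
virtual = TASK_INFO[10762]
rendered = render_parameters(current, jobid=1000, actionid=7, i=0)
```

With the example values, `rendered` is `('auction_auctions_stb','tbods',1000,7,0)`; the record of the virtual task contains no placeholders and is used unchanged.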
Step 603: dynamically generate a plurality of pending tasks according to the current task identifier, the pending tasks being subtasks of the current task.
In this example, both the generation of the pending tasks and the dynamic generation of the parent-child relationships between the current task and the pending tasks take place in the program module corresponding to task 10445. Once the module corresponding to 10445 has finished executing, the dynamic parent-child relationships between the current task and the pending tasks have been generated and saved by that module. The ids of the pending tasks are generated cyclically within the range 1000 to 9999.
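A cyclic id allocator of the kind described for the pending tasks might look like the sketch below. The text only states that the ids are generated cyclically between 1000 and 9999; the wrap-around back to the lower bound and the helper name `make_id_allocator` are assumptions.

```python
from itertools import count
from typing import Callable


def make_id_allocator(low: int = 1000, high: int = 9999) -> Callable[[], int]:
    """Return a function that hands out ids cyclically in the range [low, high]."""
    span = high - low + 1
    counter = count()

    def next_id() -> int:
        # After `high` has been handed out, allocation wraps around to `low`.
        return low + next(counter) % span

    return next_id


# A deliberately small range makes the wrap-around visible.
next_small_id = make_id_allocator(low=1000, high=1002)
small_ids = [next_small_id() for _ in range(5)]

# The range quoted in the text.
next_task_id = make_id_allocator()
first_three = [next_task_id() for _ in range(3)]
```

Because step 601 deletes the subtasks of the previous run before new ones are written, reusing ids from the same range on every run does not collide with stale rows in this model.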
Step 604: set the subtask of the pending tasks to be the virtual task.
After all pending tasks have been generated, the pending tasks, the current task and the virtual task need to be inserted into the task relation table to determine the parent-child relationships of the current task, the pending tasks and the virtual task in the scheduling system.
Step 605: save the parent-child relationships between the pending tasks and the virtual task into the preset task relation table.
Step 606: schedule the tasks according to the relationships among the current task, the pending tasks and the virtual task.
The scheduling system of the data warehouse executes the corresponding program modules according to the generated relationships among the tasks. This embodiment applies to any scheduling run. If the current run is the first one, step 601 need not be performed, because the task relation table and the task information table are still empty. Deleting the task relation information and the related task information produced by the previous scheduling run ensures that the program modules corresponding to the current task and the virtual task can be called normally in the current run, which guarantees normal operation of the scheduling platform and can also improve its scheduling efficiency.
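Steps 604 to 606 can be illustrated by building the relation rows and deriving an execution order from them. This is a sketch under assumptions: the helpers `build_relations` and `execution_order` are hypothetical, and the use of the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later) is a stand-in for whatever ordering logic the scheduling system actually applies.

```python
from graphlib import TopologicalSorter


def build_relations(current_id, pending_ids, virtual_id):
    """Return (parentid, id) rows: current -> each pending task -> virtual."""
    rows = []
    for pending_id in pending_ids:
        rows.append((current_id, pending_id))   # pending task is a subtask of the current task
        rows.append((pending_id, virtual_id))   # virtual task is the subtask of every pending task
    return rows


def execution_order(rows):
    """Order the tasks so that every parent runs before its children."""
    predecessors = {}
    for parent_id, child_id in rows:
        predecessors.setdefault(child_id, set()).add(parent_id)
    return list(TopologicalSorter(predecessors).static_order())


rows = build_relations(10445, [1000, 1001, 1002], 10762)
order = execution_order(rows)
```

The current task always comes first and the virtual task last; the relative order of the pending tasks in between is not constrained by the relation table.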
For simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. However, those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
Corresponding to the method provided by method embodiment 1 of the task scheduling method of the present application, and referring to Fig. 7, the present application also provides device embodiment 1 for scheduling tasks. In this embodiment, the device may include:
Acquiring unit 701, configured to obtain the current task identifier of the current task and the virtual task identifier of the virtual task.
In practice, the program modules corresponding to the current task and the virtual task in this embodiment are two pieces of code in the scheduling platform system. The current task identifier and the virtual task identifier are the unique names of these two pieces of code, and there is no direct association between the two tasks. The current task identifier is obtained in order to call the program module corresponding to the current task; running the current task generates the plurality of subtasks (pending tasks) that need to be processed, and the content of these pending tasks is the processing of the data to be handled. The virtual task identifier is obtained in order to set the virtual task as the subtask of all pending tasks. If the program module corresponding to the virtual task can be called after the pending tasks have run, all pending tasks have finished executing.
Dynamic generation unit 702, configured to dynamically generate, according to the current task identifier and the number of data storage files of the raw data source in the data warehouse, the same number of pending tasks, the pending tasks being subtasks of the current task.
The program module corresponding to the current task can be obtained from the current task identifier. The content of the current task is to dynamically generate the plurality of pending tasks, so the pending tasks are generated after the scheduling platform calls and executes this program module. The scheduling platform needs to query the number of storage files that the raw data table of the foreground database occupies in the data warehouse, that is, into how many files one raw data table is split when its data is stored in the data warehouse, and dynamically generates as many pending tasks as there are storage files (these pending tasks are subtasks of the current task). The storage file quantity is the number of individual files needed to store the real-time data when it is synchronized to the background system; the number of pending tasks generated equals this number of files. The content of each pending task is to synchronize the data of the corresponding raw data table into one data storage file in the data warehouse.
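The one-task-per-storage-file rule can be sketched as below. How the data warehouse exposes its storage files is not described in the text, so a local temporary directory stands in for it here; the file names, the dictionary layout of a task, and the helper `generate_pending_tasks` are all hypothetical.

```python
import tempfile
from pathlib import Path


def generate_pending_tasks(storage_dir: Path, first_id: int = 1000):
    """Create one pending task per storage file found at call time."""
    files = sorted(p for p in storage_dir.iterdir() if p.is_file())
    # The number of tasks always equals the number of files counted above.
    return [{"id": first_id + index, "target_file": path.name}
            for index, path in enumerate(files)]


with tempfile.TemporaryDirectory() as tmp:
    storage = Path(tmp)
    # Pretend the raw data table is currently split into three storage files.
    for n in range(3):
        (storage / f"part_{n}.dat").write_text("")
    tasks = generate_pending_tasks(storage)
```

Because the files are counted when the current task runs, a later run that finds a different number of files produces a different number of pending tasks without any change to the fixed current and virtual tasks.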
Setting unit 703, configured to set the subtask of the pending tasks to be the virtual task.
The subtask of the pending tasks can be set to the virtual task according to the virtual task identifier. It should be noted that the parent-child relationships among the current task, the pending tasks and the virtual task can be represented in the form of preset data tables.
Scheduling unit 704, configured to schedule the tasks according to the relationships among the current task, the pending tasks and the virtual task.
Scheduling unit 704 schedules and executes the tasks according to the parent-child relationships among the current task, the pending tasks and the virtual task. It first calls the corresponding execution module according to the current task identifier; after a plurality of pending tasks have been generated according to the current number of storage files, it runs these pending tasks; after the pending tasks have finished, it executes the virtual task. The scheduling platform can therefore determine whether the pending tasks have completed by checking whether the virtual task has been executed.
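The role of the virtual task as a completion marker can be sketched as follows. Running the pending tasks in a thread pool is an assumption made for illustration (the text does not say how they are executed), and `run_schedule` together with the task callables is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_schedule(current_task, virtual_task, max_workers: int = 4):
    """Run the current task, then its pending tasks, then the virtual task."""
    log = ["current"]
    pending = current_task()  # running the current task yields the pending tasks
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in submission order; leaving the `with`
        # block guarantees that every pending task has finished.
        log.extend(pool.map(lambda task: task(), pending))
    virtual_task()  # reached only after all pending tasks have completed
    log.append("virtual")
    return log


state = {"sync_finished": False}


def current_task():
    # Three pending tasks, as if three storage files had been found.
    return [lambda i=i: f"pending-{i}" for i in range(3)]


def virtual_task():
    state["sync_finished"] = True


log = run_schedule(current_task, virtual_task)
```

Checking `state["sync_finished"]` afterwards corresponds to checking whether the virtual task has been executed: it is set only once all pending tasks are done.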
In this embodiment, the scheduling platform dynamically generates the pending tasks at run time according to the number of storage files, so the relationships among the scheduled tasks can change dynamically during operation. Because the real-time data in the raw data table of the foreground database is constantly updated, the number of storage files in the data warehouse of the background system also changes in real time, and the embodiment of the present application determines the number of pending tasks from this changing number of storage files. The task scheduling device of this embodiment can therefore serve such special scenarios, which makes the scheduling platform more flexible and controllable and also improves its efficiency.
Corresponding to the method provided by method embodiment 2 of the task scheduling method of the present application, and referring to Fig. 8, the present application also provides preferred device embodiment 2 for scheduling tasks. In this embodiment, the device may specifically include:
Establishing unit 801, configured to establish the task information table and the task relation table, where the task information table stores the details of the current task, the pending tasks and the virtual task, and the task relation table stores the relationships among the current task, the pending tasks and the virtual task.
In this embodiment, the task information table and the task relation table need to be established first. The task information table stores the task details, which may include the task identifier, the name of the corresponding program module, the parameters that need to be preset, and so on. The task relation table stores the parent-child relationships among the current task, the pending tasks and the virtual task.
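One possible layout of the two tables is sketched below with SQLite. The column names follow the fields mentioned in the text (id, prgname, paravalue, parentid), but the table names, the column types, the keys, and the helpers `create_tables` and `children_of` are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE task_info (
    id        INTEGER PRIMARY KEY,  -- task identifier in the scheduling system
    prgname   TEXT NOT NULL,        -- name of the corresponding program module
    paravalue TEXT                  -- parameters passed to the module
);
CREATE TABLE task_relation (
    parentid INTEGER NOT NULL,      -- identifier of the parent task
    id       INTEGER NOT NULL,      -- identifier of the subtask
    PRIMARY KEY (parentid, id)
);
"""


def create_tables(conn: sqlite3.Connection) -> None:
    conn.executescript(SCHEMA)


def children_of(conn: sqlite3.Connection, parent_id: int):
    """Return the ids of the direct subtasks of `parent_id`, in ascending order."""
    return [row[0] for row in conn.execute(
        "SELECT id FROM task_relation WHERE parentid = ? ORDER BY id",
        (parent_id,))]


conn = sqlite3.connect(":memory:")
create_tables(conn)
conn.executemany(
    "INSERT INTO task_relation (parentid, id) VALUES (?, ?)",
    [(10445, 1000), (10445, 1001), (1000, 10762), (1001, 10762)])

subtasks_of_current = children_of(conn, 10445)
subtasks_of_pending = children_of(conn, 1000)
```

Keeping the relations in their own table lets the same virtual task appear as the subtask of any number of pending tasks without changing the task details.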
Acquiring unit 701, configured to obtain the current task identifier of the current task and the virtual task identifier of the virtual task.
When the preset task relation table and task information table are established, the value of the field "prgname" is specified. This field value represents the current task identifier and the virtual task identifier, that is, the program modules that correspond in practice to the current task and the virtual task. The current task identifier and the virtual task identifier can thus be obtained from the preset task relation table and task information table. The background program modules corresponding to these two identifiers are unrelated to each other.
Calling subunit 802, configured to call the program module corresponding to the current task according to the current task identifier.
Calling subunit 802 calls the program module corresponding to the current task identifier, which in this embodiment is module a. The content of module a is to generate the subsequent plurality of pending tasks and their identifiers.
Obtaining subunit 803, configured to obtain the number of data storage files used when the raw data source of the foreground system is synchronized into the data warehouse of the background system.
Generating subunit 804, configured to generate, according to the number of data storage files, the same number of pending tasks and their identifiers.
Setting subunit 805, configured to set the pending tasks as subtasks of the current task.
First storing subunit 806, configured to store the relationships between the pending tasks and the current task in the preset task relation table.
Second storing subunit 807, configured to store the relationships of the pending tasks with the current task and with the virtual task, together with the pending task identifiers, in the preset task information table.
Setting unit 703, configured to set the subtask of the pending tasks to be the virtual task.
Saving unit 808, configured to save the parent-child relationships between the pending tasks and the virtual task into the preset task relation table.
Executing subunit 809, configured to execute the program module corresponding to the current task so as to carry out the plurality of pending tasks.
Scheduling submodule 8010, configured to execute, when the plurality of pending tasks have finished, the virtual task according to the parent-child relationships between the pending tasks and the virtual task, thereby completing the scheduling of the current task.
In this embodiment, the preset task relation table and task information table are used to set and store the relationships among the tasks. When scheduling, the scheduling system executes the tasks with reference to the parent-child relationships recorded in the task relation table. The dynamically generated pending tasks can thus meet the needs of the actual application scenario: tasks are scheduled at run time according to the number of pending tasks actually generated, which improves the scheduling efficiency of the scheduling platform and also increases its flexibility and adaptability.
Corresponding to method embodiment 3 of the task scheduling method of the present application, the present application also provides platform embodiment 3 for scheduling tasks. In this embodiment, the scheduling platform may specifically include:
Deleting unit 901, configured to delete, from the preset task information table and task relation table, the data related to the previous execution of the current task;
Acquiring unit 701, configured to obtain the current task identifier of the current task and the virtual task identifier of the virtual task;
Dynamic generation unit 702, configured to dynamically generate a plurality of pending tasks according to the current task identifier, the pending tasks being subtasks of the current task;
Setting unit 703, configured to set the subtask of the pending tasks to be the virtual task;
Scheduling unit 705, configured to schedule the tasks according to the relationships among the current task, the pending tasks and the virtual task.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the identical or similar parts of the embodiments may be cross-referenced. Because the device embodiments are substantially similar to the method embodiments, they are described briefly; for the relevant parts, refer to the description of the method embodiments.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that comprises a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device that comprises the element.
The method and platform for scheduling tasks provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A method for scheduling tasks, characterized in that the method comprises:
Obtaining a current task identifier of a current task and a virtual task identifier of a virtual task;
Dynamically generating, according to the current task identifier and the number of data storage files of a raw data source in a data warehouse, the same number of pending tasks, the pending tasks being subtasks of the current task; wherein dynamically generating the same number of pending tasks according to the current task identifier and the number of data storage files of the raw data source in the data warehouse specifically comprises: calling the program module corresponding to the current task according to the current task identifier; obtaining, by the corresponding program module, the number of data storage files used when the raw data source of the foreground system is synchronized into the data warehouse of the background system; generating, according to the number of data storage files, the same number of pending tasks and their identifiers; setting the plurality of pending tasks as subtasks of the current task; and storing the relationships between the plurality of pending tasks and the current task in a preset task relation table;
Setting the subtask of the pending tasks to be the virtual task;
Scheduling the tasks according to the relationships among the current task, the pending tasks and the virtual task.
2. The method according to claim 1, characterized in that, before obtaining the current task identifier of the current task and the virtual task identifier of the virtual task, the method further comprises:
Establishing a task information table and a task relation table, the task information table being used to store the details of the current task, the pending tasks and the virtual task, and the task relation table being used to store the relationships among the current task, the pending tasks and the virtual task.
3. The method according to claim 2, characterized in that, before obtaining the current task identifier of the current task and the virtual task identifier of the virtual task, the method further comprises:
Deleting, from the preset task information table and task relation table, the data related to the previous execution of the current task.
4. The method according to claim 2, characterized in that, after setting the subtask of the pending tasks to be the virtual task, the method further comprises:
Saving the parent-child relationships between the pending tasks and the virtual task into the preset task relation table.
5. The method according to claim 4, characterized in that scheduling the tasks according to the relationships among the current task, the pending tasks and the virtual task specifically comprises:
Executing, by the program module corresponding to the current task, the plurality of pending tasks;
When the plurality of pending tasks have finished, executing the virtual task according to the parent-child relationships between the pending tasks and the virtual task, thereby completing the scheduling of the current task.
6. A platform for scheduling tasks, characterized in that the platform comprises:
An acquiring unit, configured to obtain a current task identifier of a current task and a virtual task identifier of a virtual task;
A dynamic generation unit, configured to dynamically generate, according to the current task identifier and the number of data storage files of a raw data source in a data warehouse, the same number of pending tasks, the pending tasks being subtasks of the current task; the dynamic generation unit specifically comprising: a calling subunit, configured to call the program module corresponding to the current task according to the current task identifier; an obtaining subunit, configured to obtain the number of data storage files used when the raw data source of the foreground system is synchronized into the data warehouse of the background system; a generating subunit, configured to generate, according to the number of data storage files, the same number of pending tasks and their identifiers; a setting subunit, configured to set the plurality of pending tasks as subtasks of the current task; a first storing subunit, configured to store the relationships between the plurality of pending tasks and the current task in a preset task relation table; and a second storing subunit, configured to store the relationships of the pending tasks with the current task and with the virtual task, together with the pending task identifiers, in a preset task information table;
A setting unit, configured to set the subtask of the pending tasks to be the virtual task;
A scheduling unit, configured to schedule the tasks according to the relationships among the current task, the pending tasks and the virtual task.
7. The platform according to claim 6, characterized in that the platform further comprises:
An establishing unit, configured to establish a task information table and a task relation table, the task information table being used to store the details of the current task, the pending tasks and the virtual task, and the task relation table being used to store the relationships among the current task, the pending tasks and the virtual task.
8. The platform according to claim 7, characterized in that the platform further comprises:
A deleting unit, configured to delete, from the preset task information table and task relation table, the data related to the previous execution of the current task.
9. The platform according to claim 6, characterized in that the platform further comprises:
A saving unit, configured to save the parent-child relationships between the pending tasks and the virtual task into the preset task relation table.
10. The platform according to claim 6, characterized in that the scheduling unit specifically comprises:
An executing subunit, configured to execute the program module corresponding to the current task so as to carry out the plurality of pending tasks;
A scheduling submodule, configured to execute, when the plurality of pending tasks have finished, the virtual task according to the parent-child relationships between the pending tasks and the virtual task, thereby completing the scheduling of the current task.
CN 200910254276 2009-12-14 2009-12-14 Method and platform for scheduling tasks Expired - Fee Related CN102096687B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN 200910254276 CN102096687B (en) 2009-12-14 2009-12-14 Method and platform for scheduling tasks
HK11108717.2A HK1154677A1 (en) 2009-12-14 2011-08-18 A method and platform for task dispatching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200910254276 CN102096687B (en) 2009-12-14 2009-12-14 Method and platform for scheduling tasks

Publications (2)

Publication Number Publication Date
CN102096687A CN102096687A (en) 2011-06-15
CN102096687B true CN102096687B (en) 2013-08-07

Family

ID=44129783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200910254276 Expired - Fee Related CN102096687B (en) 2009-12-14 2009-12-14 Method and platform for scheduling tasks

Country Status (2)

Country Link
CN (1) CN102096687B (en)
HK (1) HK1154677A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049326B (en) * 2013-01-16 2015-04-15 浪潮(北京)电子信息产业有限公司 Method and system for managing job program of job management and scheduling system
CN106528070B (en) * 2015-09-15 2019-09-03 阿里巴巴集团控股有限公司 A kind of data table generating method and equipment
CN106919443A (en) * 2015-12-25 2017-07-04 阿里巴巴集团控股有限公司 Perform method, the apparatus and system of calculating task
CN107220117B (en) * 2017-05-25 2020-12-01 深信服科技股份有限公司 Virtual task synthesis method and system under NUMA (non Uniform memory Access) architecture
CN108710532B (en) * 2018-05-21 2023-05-30 平安科技(深圳)有限公司 Dependency realization method, device, equipment and storage medium of cross-dispatching platform
CN109213586A (en) * 2018-08-23 2019-01-15 北京奇虎科技有限公司 A kind of dispatching method and device of task

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5875487A (en) * 1995-06-07 1999-02-23 International Business Machines Corporation System and method for providing efficient shared memory in a virtual memory system
CN101419615A (en) * 2008-12-10 2009-04-29 阿里巴巴集团控股有限公司 Method and apparatus for synchronizing foreground and background databases
CN101470631A (en) * 2007-12-27 2009-07-01 新奥特(北京)视频技术有限公司 Task ranking apparatus

Also Published As

Publication number Publication date
CN102096687A (en) 2011-06-15
HK1154677A1 (en) 2012-04-27

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1154677

Country of ref document: HK

C14 Grant of patent or utility model
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: GR

Ref document number: 1154677

Country of ref document: HK

TR01 Transfer of patent right

Effective date of registration: 20200422

Address after: Building 8, No. 16, Zhuantang science and technology economic block, Xihu District, Hangzhou City, Zhejiang Province

Patentee after: ALIYUN COMPUTING Co.,Ltd.

Address before: Cayman Islands Grand Cayman capital building, a four storey No. 847 mailbox

Patentee before: Alibaba Group Holding Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130807

Termination date: 20201214