Detailed Description
Fig. 1 illustrates some of the tasks in a data warehouse, with one task in each block; a task may be, for example, a program that extracts a data set or a program that generates a report. The arrows in Fig. 1 represent the dependency relationships between the tasks. For example, an arrow pointing from task 1 to task 4 indicates that task 4 depends on task 1 and that the output of task 1 is used as an input of task 4. The dependency relationships also determine the execution order: if task 4 depends on task 1, task 1 is executed first and task 4 afterwards. In this case task 1 may be referred to as an upstream task of task 4 and, conversely, task 4 as a downstream task of task 1. A task may begin execution only after all of its upstream tasks have completed, and in many cases a task requires multiple inputs; as illustrated in Fig. 1, task 4 takes the outputs of both task 1 and task 2 as its inputs.
Tasks connected by arrows constitute a task flow. For example, Fig. 1 illustrates a task flow "[task 1, task 2] -> task 4 -> task 8" and another task flow "task 2 -> task 5 -> task 9"; the other task flows are not described in detail. The task scheduling system may schedule and execute the tasks of a task flow in sequence according to the dependency relationships between them, for example executing task 2 first, then task 5, and then task 9. Multiple task flows may be executed in parallel at the same time.
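The ordering rule above (a task runs only after all of its upstream tasks) can be illustrated with a short sketch. The dictionary below is a hypothetical encoding of the two task flows named in the text, not a data structure defined by the application, and the sketch assumes Python 3.9 or later for the standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical encoding of two task flows from Fig. 1:
# each task maps to the set of its upstream tasks.
UPSTREAM = {
    "task4": {"task1", "task2"},
    "task8": {"task4"},
    "task5": {"task2"},
    "task9": {"task5"},
}


def execution_order(upstream):
    """Return one order in which every task runs after all its upstream tasks."""
    return list(TopologicalSorter(upstream).static_order())


order = execution_order(UPSTREAM)
print(order)
```

Several valid orders exist, so the sketch only guarantees that each task appears after its upstream tasks, not one particular sequence.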
In addition, in the task scheduling system of the data warehouse, attributes and a scheduling policy are set for each task, including but not limited to a trigger mode, a running time, task dependencies, and a task level. For example, when the multiple task flows in Fig. 1 are scheduled for execution, the task scheduling system may execute multiple tasks simultaneously, and these tasks may belong to different task flows. When the task scheduling system selects which pending tasks enter the running queue, it does so according to their task levels: generally, the higher the task level, the earlier the corresponding task is scheduled, so the scheduling system preferentially places tasks with a high task level into the running queue and may also allocate them more system resources. More important tasks are generally given a higher task level.
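As a minimal sketch of level-based selection, the following picks tasks for the running queue in descending order of task level. The function name, the `slots` parameter, and the sample levels are assumptions made for illustration; the application does not specify how the running queue is implemented.

```python
import heapq


def pick_for_run_queue(ready, levels, slots):
    """Pick up to `slots` ready tasks, highest task level first."""
    # heapq is a min-heap, so levels are negated to pop the highest level first.
    heap = [(-levels[task], task) for task in ready]
    heapq.heapify(heap)
    picked = []
    while heap and len(picked) < slots:
        _, task = heapq.heappop(heap)
        picked.append(task)
    return picked


# Hypothetical levels for three ready tasks from different task flows.
levels = {"task4": 1, "task5": 7, "task6": 5}
picked = pick_for_run_queue(["task4", "task5", "task6"], levels, slots=2)
```

With only two slots, the level 1 task is left waiting, which is the situation the following paragraphs address.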
The task level of a task is usually set when the task is created, and may also be adjusted manually afterwards. One situation that may arise is that, in a certain task flow, the task level of a downstream task is higher than that of an upstream task. This may happen, for example, when a newly added downstream task is important and is therefore given a high task level, while the upstream task was given a lower task level when it was originally created. For such a task flow, if the task scheduling system schedules and executes the tasks in sequence, the upstream task, because of its low task level, may not obtain enough resources and may not enter the running queue preferentially, so its output is delayed and the output of the downstream task is delayed in turn. That is, although the important downstream task has been given a higher task level, its output is still delayed because of the upstream task.
To reduce the delay that an upstream task imposes on a downstream task in the above situation, embodiments of the present application provide a task management method that adjusts the task level of the upstream task when the task level of a downstream task in a task flow is higher than that of the upstream task. Fig. 2 is a flow chart of the method. As shown in Fig. 2, the method may include:
in step 201, the task levels of an upstream task and a downstream task in a task flow are compared, and it is determined that the task level of the downstream task is higher than the task level of the upstream task.
The upstream task and the downstream task in this step are in the same task flow, and "upstream" and "downstream" are relative concepts. For example, in Fig. 1, task 1 is an upstream task of task 4, and task 4 is a downstream task of task 1.
In this step, the task levels of the upstream task and the downstream task are compared to determine whether the task level of the downstream task is higher than that of the upstream task. In this example, the task level of every task in the task flow may be adjusted according to the method, and the adjustment may be implemented in various ways. For example, the level of an upstream task may be compared with the level of each of its downstream tasks in turn: if the level of a downstream task is higher, the level of the current upstream task is increased, and the upstream task is then compared with the next downstream task. As another example, the maximum of the task levels of all the downstream tasks of the upstream task may be determined and compared with the level of the current upstream task. As a further example, the level of the upstream task may be compared only with the level of a downstream task newly added to the task flow.
If none of the downstream tasks of the upstream task has a task level higher than that of the upstream task, the task level of the upstream task need not be adjusted. Conversely, if there is a downstream task whose task level is higher than that of the upstream task, the method proceeds to step 202, in which the task level of the upstream task is increased to reduce the delay that the upstream task imposes on the downstream task.
In step 202, the task level of the upstream task is increased.
This example does not limit the amount by which the task level of the upstream task is increased. For example, assume that there are ten task levels, from 1 to 10, that a larger number indicates a more important task, that the task level given to the upstream task when it was created is 1, and that the task level of the downstream task is 7. When the task level of the upstream task is increased, it may be raised from 1 to 5, or from 1 to 7, and so on; in any case the new task level is higher than the current one.
After the task level of the upstream task has been increased, the task scheduling system may schedule the upstream task according to the increased task level. Compared with before the adjustment, the upstream task enters the running queue sooner and may be allocated more system resources, so it completes sooner and the downstream task can start as early as possible, which reduces the delay that the upstream task imposes on the downstream task.
In the task management method of this example, when the task level of the downstream task is higher than that of the upstream task, the task level of the upstream task is increased. When the task scheduling system then schedules the upstream task according to the increased task level, the upstream task is put into the running queue sooner than before the adjustment and is allocated more system resources, so that it completes sooner, the downstream task is executed as early as possible, and the delay that the upstream task imposes on the downstream task is reduced.
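Steps 201 and 202 can be sketched as a single function. The name `raised_level` and the optional `target` parameter are hypothetical; `target` models the statement that the amount of the increase is not limited, as long as the new level is higher than the current one.

```python
def raised_level(upstream_level, downstream_level, target=None):
    """Return the task level of the upstream task after steps 201 and 202."""
    # Step 201: compare the two task levels.
    if downstream_level <= upstream_level:
        return upstream_level  # no downstream task is higher: no adjustment
    # Step 202: increase the upstream level. By default it is raised to the
    # downstream level; any other target above the current level is allowed.
    if target is None:
        target = downstream_level
    if target <= upstream_level:
        raise ValueError("the raised level must exceed the current level")
    return target
```

With the levels from the example in the text, `raised_level(1, 7)` gives 7 and `raised_level(1, 7, target=5)` gives 5.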
In an example, each time the task scheduling system of the data warehouse schedules tasks, it may determine the tasks to be executed in each task flow, perform the process illustrated in Fig. 3 on those tasks to adjust their task levels, and then schedule them according to the adjusted levels. As shown in Fig. 3, the process may include:
in step 301, the task levels of all the tasks currently to be executed are obtained.
For example, the task scheduling system may extract the task currently to be executed in each task flow. If one task flow in Fig. 1 has been executed up to task 4, that is, task 4 is the next task to run, then task 4 is a task to be executed. If another task flow has been executed up to task 6, that is, task 6 is the next task to run (its upstream task 3 having finished executing), then task 6 is also a task to be executed. The task scheduling system may obtain the task levels of task 4 and task 6.
In step 302, the maximum value of the task levels of all the downstream tasks of the current task is determined.
The number of tasks to be executed acquired in step 301 may be more than one, and the processing of steps 302 to 306 in this example may be performed for each of them. In this step, taking one of the tasks to be executed as an example, the maximum value of the task levels of all of its downstream tasks is determined. For example, for task 6 in Fig. 1, whose only downstream task is task 9, the task level of task 9 is obtained. As another example, if a task to be executed has multiple downstream tasks, including task A, task B, and task C, the maximum of the task levels of task A to task C is determined.
In step 303, the task level of the current task is compared with the maximum value to determine whether the maximum value is higher than the task level of the current task.
If yes, go to step 304; otherwise, step 305 is performed.
In step 304, the maximum value is determined as the adjustment level of the current task.
For example, again taking task 6 described above, if the task level of task 9, the downstream task of task 6, is higher than that of task 6, the level of task 9 may be set as the adjustment level of task 6. In another example, when the level of the downstream task is higher, the adjustment level of the upstream task need not be equal to the maximum level of the downstream tasks; the task level of the upstream task may instead be increased by a certain amount.
In step 305, the task level of the current task is kept unchanged.
In step 306, the current task is scheduled according to the adjustment level.
For example, after adjusting the task level of each task to be executed, the task scheduling system may schedule the tasks according to their adjustment levels, preferentially placing tasks with a high level into the running queue and preferentially allocating running resources to them, so that high-level tasks can be executed sooner.
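The flow of steps 301 to 306 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of the application: the function names are hypothetical, "all downstream tasks" is taken to mean every task reachable along the dependency arrows, and the sample task levels are invented for the example.

```python
def all_downstream(task, downstream):
    """Collect every task reachable from `task` by following dependency arrows."""
    seen, stack = set(), list(downstream.get(task, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(downstream.get(current, ()))
    return seen


def adjustment_level(task, downstream, levels):
    """Steps 302 to 305 for one task to be executed."""
    below = all_downstream(task, downstream)
    if not below:
        return levels[task]                      # nothing downstream: unchanged
    highest = max(levels[t] for t in below)      # step 302
    if highest > levels[task]:                   # step 303
        return highest                           # step 304
    return levels[task]                          # step 305


def schedule(pending, downstream, levels):
    """Step 301 supplies `pending`; step 306 orders it by adjustment level."""
    adjusted = {t: adjustment_level(t, downstream, levels) for t in pending}
    return sorted(pending, key=lambda t: adjusted[t], reverse=True), adjusted


# Hypothetical levels: task 4 is below its downstream task 8, task 6 is not.
downstream = {"task4": ["task8"], "task6": ["task9"]}
levels = {"task4": 1, "task8": 5, "task6": 3, "task9": 2}
order, adjusted = schedule(["task6", "task4"], downstream, levels)
```

In this sketch task 4 is raised from 1 to 5 and is therefore scheduled ahead of task 6, whose level stays at 3.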
An example implementation of adjusting the task level of a task is described below in conjunction with Fig. 4. As shown in Fig. 4, each box contains one task, and the task level of each task is labeled in its box. For example, in the task flow "[task 1, task 2] -> task 4 -> task 8", the task level of both task 1 and task 2 is 7, the task level of task 4 is 1, and the task level of task 8 is 5.
As shown in Fig. 4, in this example a task 11 is newly added and, because task 11 is important, its task level is set to 10, the highest level. Task 11 depends on task 8 and is located in the task flow "[task 1, task 2] -> task 4 -> task 8 -> task 11".
The task scheduling system may perform the task management method shown in Fig. 3 to adjust the task levels. For example, assuming that task 4 in Fig. 4 is the current task to be executed, the maximum value of the task levels of all the downstream tasks of task 4 is determined. The downstream tasks of task 4 are task 8 and task 11; the level of task 8 is 5 and the level of task 11 is 10, so the maximum value is 10. The task level of task 4 can therefore be increased from the original level 1 to level 10, that is, to the maximum task level of its downstream tasks. The task level of every other task in the task flow can be adjusted according to the flow of Fig. 3 in the same way. For example, the task level of task 8 is adjusted from level 5 to level 10, and the task levels of task 1 and task 2 are adjusted from level 7 to level 10. The adjusted task levels are shown in the example of Fig. 5.
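The worked example can be reproduced with the levels given for Fig. 4. The sketch below is one possible way to adjust a whole task flow in a single pass and is an assumption, not the procedure of Fig. 3 itself: it visits tasks downstream-first, so each task only needs to look at the already adjusted levels of its direct downstream tasks. It assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Task flow "[task 1, task 2] -> task 4 -> task 8 -> task 11" with the
# task levels labeled in Fig. 4.
DOWNSTREAM = {
    "task1": ["task4"],
    "task2": ["task4"],
    "task4": ["task8"],
    "task8": ["task11"],
}
LEVELS = {"task1": 7, "task2": 7, "task4": 1, "task8": 5, "task11": 10}


def propagate_levels(downstream, levels):
    """Raise each task to the highest level found among its downstream tasks."""
    adjusted = dict(levels)
    # Passing the downstream map as the "predecessor" map makes static_order()
    # yield every downstream task before the tasks that feed it.
    for task in TopologicalSorter(downstream).static_order():
        for nxt in downstream.get(task, ()):
            adjusted[task] = max(adjusted[task], adjusted[nxt])
    return adjusted


adjusted = propagate_levels(DOWNSTREAM, LEVELS)
```

For these inputs every task ends at level 10, which agrees with the adjusted levels described for Fig. 5.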
In addition, the present application does not limit when the task management method is executed. The task scheduling system may execute the method as part of its routine daily scheduling, or at other times. In either case, for each task in a task flow, the task level is adjusted according to the method of the present application before the scheduling that allocates a queue position and resources to the task according to its task level is performed.
In another example, assume that the task scheduling system is configured to update the task levels of the tasks in a task flow each time a new task is added to that task flow. If the task scheduling system performs task scheduling once a day, not every task flow will have new tasks added every day. Taking Fig. 4 as an example, only the task flow "[task 1, task 2] -> task 4 -> task 8 -> task 11" has a newly added task, namely task 11, while other task flows, such as "task 3 -> task 7 -> task 10", have none. To improve the efficiency of task level adjustment, the task levels may be adjusted only in those task flows that include a newly added task. As shown in Fig. 6, this example may include:
in step 601, a task flow including the newly added downstream task is determined.
For example, in Fig. 4, the task flow that includes the newly added task is "[task 1, task 2] -> task 4 -> task 8 -> task 11".
In step 602, the task level of the upstream task is compared with the task level of a newly added downstream task.
For example, each time a downstream task is added, the task level of the upstream task may be compared with the task level of the newly added downstream task. When multiple downstream tasks are newly added, the upstream task may be compared with each of the newly added tasks in turn, or with the maximum of the levels of the newly added tasks, as in the example shown in Fig. 3.
In step 603, if the task level of the downstream task is higher than the task level of the upstream task, the task level of the downstream task is determined as the adjustment level of the upstream task.
For example, in the task flow "[task 1, task 2] -> task 4 -> task 8 -> task 11", if the level of the newly added task 11 is 10 and the level of the current task 4 is 1, the level of task 4 may be adjusted to 10. Similarly, when the task level of task 8 in the task flow is compared with the level of the newly added task 11, level 5 of task 8 is lower than level 10 of task 11, so the level of task 8 may be adjusted to 10. This processing is equivalent to propagating the level of task 11 backwards through the task flow: after the higher-level task 11 is added, every upstream task whose level is lower than that of the newly added task has its level adjusted to that of the newly added task.
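The backward propagation of steps 601 to 603 can be sketched as a walk from the newly added task through its upstream tasks. The function name and the upstream map are hypothetical, and the levels of task 3, task 7, and task 10 are invented for illustration, since the text does not state them.

```python
def backpropagate_new_task(new_task, upstream, levels):
    """Raise every upstream task of `new_task` that is below its level."""
    adjusted = dict(levels)
    new_level = levels[new_task]
    visited, stack = set(), list(upstream.get(new_task, ()))
    # Step 601 is implicit: only tasks upstream of the new task are visited,
    # so task flows without a newly added task are never touched.
    while stack:
        task = stack.pop()
        if task in visited:
            continue
        visited.add(task)
        if adjusted[task] < new_level:   # step 602: compare with the new task
            adjusted[task] = new_level   # step 603: take the new task's level
        stack.extend(upstream.get(task, ()))
    return adjusted


upstream = {
    "task11": ["task8"], "task8": ["task4"], "task4": ["task1", "task2"],
    "task10": ["task7"], "task7": ["task3"],
}
levels = {"task1": 7, "task2": 7, "task4": 1, "task8": 5, "task11": 10,
          "task3": 2, "task7": 3, "task10": 4}  # levels of 3, 7, 10 are assumed
adjusted = backpropagate_new_task("task11", upstream, levels)
```

Tasks 1, 2, 4, and 8 are raised to level 10, while the flow "task 3 -> task 7 -> task 10" keeps its original levels.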
Of course, in other examples, the adjustment may be performed regardless of whether a task flow includes a newly added task. For example, before the task scheduling system schedules and executes a certain task flow, it may execute the method of this example to determine whether the task level of each task in the task flow should be adjusted; if the condition of this example is satisfied, the adjustment is performed, and otherwise it is not.
In another example, instead of comparing the task level of the current upstream task with the level of a particular downstream task or with the maximum of the downstream levels, the task levels of the upstream task and all of its downstream tasks may be grouped into a set, and the maximum level in the set may be taken as the adjustment level of the upstream task. As shown in Fig. 7, this example may include:
in step 701, for the current upstream task, the task levels of all its downstream tasks are obtained.
For example, if task 4 in Fig. 4 is the current upstream task, its downstream tasks are task 8 and task 11.
In step 702, the maximum value of the task levels of the upstream task and all of its downstream tasks is determined.
For example, task 4 has a task level of 1, task 8 has a task level of 5, and task 11 has a task level of 10. The levels of task 4, task 8, and task 11 may be grouped into a set, and the maximum of the task levels in the set is determined to be 10.
In step 703, the task level of the upstream task is adjusted to the maximum value.
In this step, the maximum value obtained in step 702 is determined as the adjustment level of the upstream task, and the task level of task 4 is changed to 10. Similarly, task 8 may be grouped into a set with task 11; the maximum of the task level set [5, 10] is 10, so the task level of task 8 is also set to 10.
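Steps 701 to 703 can be sketched with the Fig. 4 levels. The function names are hypothetical. Because the set contains the upstream task's own level, taking the maximum never lowers that level and no separate comparison is needed.

```python
def levels_in_set(task, downstream, levels):
    """Steps 701 and 702: levels of `task` and of every task downstream of it."""
    group, stack = {task}, [task]
    while stack:
        for nxt in downstream.get(stack.pop(), ()):
            if nxt not in group:
                group.add(nxt)
                stack.append(nxt)
    return sorted(levels[t] for t in group)


def adjust_to_set_maximum(task, downstream, levels):
    """Step 703: the adjustment level is the maximum level in the set."""
    return max(levels_in_set(task, downstream, levels))


downstream = {"task4": ["task8"], "task8": ["task11"]}
levels = {"task4": 1, "task8": 5, "task11": 10}
level_task4 = adjust_to_set_maximum("task4", downstream, levels)
level_task8 = adjust_to_set_maximum("task8", downstream, levels)
```

For task 8 the set of levels is [5, 10], as in the text, and both task 4 and task 8 end at level 10.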
In the task management method of this example, when the task level of a downstream task is higher than that of the upstream task, the task level of the upstream task is increased to the maximum of the downstream task levels. When the task scheduling system then schedules the upstream task according to the increased task level, the upstream task is put into the running queue sooner than before the adjustment and is allocated more system resources, so that it completes sooner and the downstream task is executed as early as possible. This reduces the delay that the upstream task imposes on the downstream task and helps the important downstream task produce its output on time.
The embodiments of the present application further provide a task management device that can execute the task management method of any of the above embodiments. As shown in Fig. 8, the device may include a level determining module 81 and a level adjustment module 82, wherein:
a level determining module 81, configured to compare task levels of an upstream task and a downstream task in a task flow, and determine that the task level of the downstream task is higher than the task level of the upstream task;
a level adjustment module 82, configured to increase a task level of the upstream task.
In one example, the level adjustment module 82 is specifically configured to increase the task level of the upstream task to the task level of the downstream task.
In one example, the task level of the downstream task includes: a maximum value among task levels of all downstream tasks of the upstream task.
In one example, the task flow includes a newly added downstream task.
In one example, as shown in fig. 9, the task management device provided by the present invention may include:
a level obtaining module 91, configured to obtain the maximum value of the task levels of all downstream tasks of a current task, where a higher task level means a higher scheduling priority for the corresponding task;
a level adjusting module 92, configured to compare the task level of the current task with the maximum value and, if the maximum value is higher than the task level of the current task, adjust the task level of the current task to the maximum value; and
a scheduling processing module 93, configured to schedule the current task according to the adjusted task level.
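The division of work among modules 91 to 93 can be sketched as three small classes. The class and method names are hypothetical, the `all_downstream` map is assumed to be precomputed and to list every downstream task of each task, and the levels of task 6 and task 9 are invented for illustration.

```python
class LevelObtainingModule:
    """Module 91: maximum task level among all downstream tasks of a task."""

    def __init__(self, all_downstream, levels):
        self.all_downstream = all_downstream
        self.levels = levels

    def max_downstream_level(self, task):
        below = self.all_downstream.get(task, ())
        # None signals that the task has no downstream tasks.
        return max((self.levels[t] for t in below), default=None)


class LevelAdjustingModule:
    """Module 92: raise the current level to the maximum if it is higher."""

    def adjust(self, current_level, max_level):
        if max_level is not None and max_level > current_level:
            return max_level
        return current_level


class SchedulingProcessingModule:
    """Module 93: order tasks so that higher adjusted levels come first."""

    def schedule(self, adjusted):
        return sorted(adjusted, key=adjusted.get, reverse=True)


levels = {"task4": 1, "task8": 5, "task11": 10, "task6": 3, "task9": 2}
all_downstream = {"task4": {"task8", "task11"}, "task6": {"task9"}}
obtain = LevelObtainingModule(all_downstream, levels)
adjust = LevelAdjustingModule()
adjusted = {t: adjust.adjust(levels[t], obtain.max_downstream_level(t))
            for t in ("task6", "task4")}
run_order = SchedulingProcessingModule().schedule(adjusted)
```

Here task 4 is adjusted to level 10 and scheduled before task 6, which stays at level 3.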
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.