CN110895484A

CN110895484A - Task scheduling method and device

Info

Publication number: CN110895484A
Application number: CN201811061150.8A
Authority: CN
Inventors: 李铮; 朱俊; 刘彤
Original assignee: Beijing Qihoo Technology Co Ltd
Current assignee: Beijing Qihoo Technology Co Ltd
Priority date: 2018-09-12
Filing date: 2018-09-12
Publication date: 2020-03-20

Abstract

The invention discloses a task scheduling method and a device, comprising the following steps: running an independent task in the tasks to be executed according to a preset task topology table; the task topology table is used for storing the dependency relationship among tasks to be executed; when the independent task runs successfully, determining whether the independent task has a corresponding subtask according to the task topology table; if the independent task has a corresponding subtask, determining a parent task of the subtask corresponding to the independent task according to the task topology table; judging whether the parent task of the sub task corresponding to the independent task is successfully operated; and if so, running the subtasks corresponding to the independent tasks. According to the method, the dependence relationship among the tasks does not need to be judged manually by a user, the tasks with the dependence relationship can automatically run when the conditions are met, the efficiency is improved, and the misjudgment is avoided.

Description

Task scheduling method and device

Technical Field

The invention relates to the technical field of computers, in particular to a task scheduling method and a task scheduling device.

Background

At present, with the rapid development of the internet, the tasks that need to be run over the network are increasingly diverse. In a traditional task running mode, dependency relationships among tasks cannot be defined, so the tasks are independent of one another. That is, the running state of one task does not affect when other tasks run, and each task determines whether it can run solely according to its own state.

In the course of implementing the invention, the inventor found that the above prior-art approach has at least the following problem: in practice, some tasks depend on one another, so that one task can be run only after another task has finished executing.

Disclosure of Invention

In view of the above, the present invention is proposed to provide a task scheduling method and apparatus that overcomes or at least partially solves the above problems.

According to an aspect of the present invention, there is provided a task scheduling method, including: running an independent task in the tasks to be executed according to a preset task topology table; the task topology table is used for storing the dependency relationship among tasks to be executed; when the independent task runs successfully, determining whether the independent task has a corresponding subtask according to the task topology table; if the independent task has a corresponding subtask, determining a parent task of the subtask corresponding to the independent task according to the task topology table; judging whether the parent task of the sub task corresponding to the independent task is successfully operated; and if so, running the subtasks corresponding to the independent tasks.

According to still another aspect of the present invention, there is provided a task scheduling apparatus including: a running module, adapted to run an independent task among the tasks to be executed according to a preset task topology table, the task topology table being used for storing the dependency relationship among the tasks to be executed; a subtask determining module, adapted to determine, when the independent task runs successfully, whether the independent task has a corresponding subtask according to the task topology table; a parent task determining module, adapted to determine, if the independent task has a corresponding subtask, a parent task of that subtask according to the task topology table; and a judging module, adapted to judge whether the parent task of the subtask corresponding to the independent task has run successfully and, if so, to run the subtask corresponding to the independent task.

According to yet another aspect of the present invention, there is provided a computing device comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus; the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the task scheduling method.

According to still another aspect of the present invention, a computer storage medium is provided, where at least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to perform operations corresponding to the task scheduling method described above.

According to the task scheduling method and device provided by the invention, the independent task in the task to be executed can be operated according to the preset task topology table, and when the independent task is successfully operated, the sub task corresponding to the independent task and the parent task of the sub task are determined according to the task topology table; and if the parent task of the sub task corresponding to the independent task is judged to be successfully operated, automatically operating the sub task corresponding to the independent task. Therefore, the independent task can be operated firstly according to the task topology table, when the independent task is successfully operated, the subtask corresponding to the independent task and the parent task of the subtask are automatically searched, whether the parent task of the subtask corresponding to the independent task is successfully operated or not is automatically judged, and if the parent task of the subtask corresponding to the independent task is successfully operated, the subtask corresponding to the independent task is automatically operated. According to the method, the dependence relationship among the tasks does not need to be judged manually by a user, the tasks with the dependence relationship can automatically run when the conditions are met, the efficiency is improved, and the misjudgment is avoided.
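The flow summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the dictionary form of the task topology table, the function name schedule and the run_task callback are all assumptions made for illustration.

```python
from collections import defaultdict


def schedule(topology, run_task):
    """Run tasks in dependency order and return the order in which they started.

    topology: hypothetical task topology table, mapping each task name to the
              list of its parent tasks (an empty list marks an independent task).
    run_task: hypothetical callback that returns True when a task succeeds.
    """
    # Invert the table once so the subtasks of a task can be looked up directly.
    children = defaultdict(list)
    for task, parents in topology.items():
        for parent in parents:
            children[parent].append(task)

    succeeded = set()
    started = []
    # Independent tasks are those that depend on no parent task.
    queue = [task for task, parents in topology.items() if not parents]
    while queue:
        task = queue.pop(0)
        started.append(task)
        if not run_task(task):
            continue  # subtasks of a failed task are not released
        succeeded.add(task)
        for child in children[task]:
            already_known = child in started or child in queue
            # A subtask runs only once every one of its parent tasks has succeeded.
            if not already_known and all(p in succeeded for p in topology[child]):
                queue.append(child)
    return started
```

With the table {"a": [], "b": [], "c": ["a", "b"], "d": ["c"]}, task c is released only after both a and b succeed, and d only after c.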

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

fig. 1 is a schematic structural diagram illustrating a distributed task scheduling system according to an embodiment of the present invention;

fig. 2 is a flowchart illustrating a task recovery method according to a second embodiment of the present invention;

fig. 3 is a flowchart illustrating a task scheduling method according to a third embodiment of the present invention;

fig. 4 is a flowchart illustrating a task scheduling method according to a fourth embodiment of the present invention;

fig. 5 is a schematic structural diagram illustrating a distributed task scheduling system according to a fifth embodiment of the present invention;

fig. 6 is a schematic structural diagram illustrating a task scheduling system according to a sixth embodiment of the present invention;

fig. 7 is a schematic structural diagram illustrating a task scheduling apparatus according to an embodiment of the present invention;

fig. 8 is a schematic structural diagram of a computing device according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

Example I,

Fig. 1 shows a schematic structural diagram of a distributed task scheduling system according to an embodiment of the present invention. Wherein, distributed task scheduling system includes: a front-end interaction module 11, a task scheduling module 12, a plurality of distributed task execution modules 13 (only one shown in the figure), and a calculation engine module 14; the front-end interaction module 11 is configured to send a task interaction request to the task scheduling module 12 according to a received task interaction operation related to a task, and receive a task interaction result returned by the task scheduling module 12; the task scheduling module 12 is configured to determine each task to be executed according to the task interaction request sent by the front-end interaction module 11, and distribute each task to be executed to the task execution module 13; the task execution module 13 is configured to execute the task distributed by the task scheduling module and return a task execution result to the task scheduling module; and the calculation engine module 14 is used for being called by the task execution module 13 to implement calculation processing in the task execution process.

The front-end interaction module, also called the user interface module (UI module), provides operations such as database access, task submission and task killing. There may be multiple task scheduling modules 12; each is mainly used for scheduling tasks and also provides functions such as logging, task recovery, collection of machine monitoring information and dispatch of scheduling resources. The task execution module is responsible for receiving and executing tasks, updating task states, and providing topology functions and other services. The calculation engine module comprises a plurality of sub-modules that are decoupled from the task execution module, so that each task execution module calls the sub-module corresponding to the task type of the task to be executed. Through the calculation engine module 14, the other modules of the system are decoupled from the calculation engine, so that the system does not depend on any particular calculation engine and new engines are easy to add. For example, the sub-modules decoupled from the task execution module may include a Spark sub-module, a MapReduce sub-module, a Flink sub-module and the like.
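One way to picture this decoupling is a registry that maps a task type to an engine sub-module, so that the task execution module never references a specific engine. The sketch below is an illustration only: the registry, the register_engine decorator and the placeholder sub-modules are hypothetical, and no real Spark or Flink API is called.

```python
from typing import Callable, Dict

# Hypothetical registry: task type -> engine sub-module entry point.
ENGINE_SUBMODULES: Dict[str, Callable[[dict], str]] = {}


def register_engine(task_type: str):
    """Register a sub-module for one task type (assumed mechanism)."""
    def decorator(func):
        ENGINE_SUBMODULES[task_type] = func
        return func
    return decorator


@register_engine("spark")
def run_spark(task: dict) -> str:
    # Placeholder: a real sub-module would hand the job to Spark here.
    return f"spark:{task['id']}"


@register_engine("flink")
def run_flink(task: dict) -> str:
    # Placeholder: a real sub-module would hand the job to Flink here.
    return f"flink:{task['id']}"


def execute(task: dict) -> str:
    """Task execution module side: pick the sub-module by task type."""
    try:
        submodule = ENGINE_SUBMODULES[task["type"]]
    except KeyError:
        raise ValueError(f"no engine sub-module for type {task['type']!r}") from None
    return submodule(task)
```

Adding a new engine then only requires registering one more sub-module; the execute function stays unchanged.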

Therefore, according to the distributed task scheduling system provided by the invention, by means of the scheduling function of the task scheduling module, a plurality of tasks can be executed in parallel by a plurality of distributed task execution modules, so that the concurrency of the tasks is greatly improved. In addition, by means of the calculation engine module, a complex task calculation process can be separated from the task execution module, so that the load of the task execution module is favorably reduced, and the task execution module can conveniently run more tasks.

Example II,

Fig. 2 is a flowchart illustrating a task recovery method according to a second embodiment of the present invention. Preferably, the task recovery method is applied to the distributed task scheduling system in the first embodiment of the present invention. Of course, those skilled in the art can know that the task recovery method can also be applied to other forms of systems or devices, and the invention is not limited to the application scenario of the task recovery method.

For convenience of understanding, the task recovery method is applied to the distributed task scheduling system in the first embodiment as an example for explanation. As shown in fig. 2, the task recovery method includes the steps of:

Step S210: for each successfully started task, create a metadata file corresponding to the task, and record the task-related information of the task in that metadata file while the task runs.

This step and the subsequent steps may be performed by any task execution module in fig. 1. A task execution module may suspend service for various reasons (for example, the scheduling service may be suspended during a system upgrade), but the tasks being executed on that module need not be suspended with it and can continue executing in the background. So that the states of the tasks that ran in the background during the suspension can be updated after the task execution module resumes service, and so that the tasks on the module can run without interruption, this embodiment monitors, for each task to be started, whether the task starts successfully. If not, the task is cleared and reset; if so, a metadata file corresponding to the task is created, and the task-related information of the task is recorded in that metadata file while the task runs.

The task-related information recorded by the metadata file comprises at least one of: the task running time, the system to which the task belongs, a task identifier, a task running timestamp, a task template, process information of the running task, an error state code, a template check code, log offset information and a log callback interface. The task running timestamp is used for removing repeated tasks, the error state code is used for determining the execution state of the task, and the log offset information is used for resuming log transfer from the point of interruption.

Therefore, the metadata file is created when the task is successfully started, and the task related information recorded in the metadata file is continuously updated along with the continuous running process of the task so as to reflect the current running state of the task in real time.

Step S220: when a preset task recovery condition is met, query the metadata file corresponding to each task.

The preset task recovery condition may be any of various conditions related to resuming service after the service has been suspended. For example, the condition may be a restart performed after a system upgrade, or a startup performed after recovery from a system failure. Whenever a preset task recovery condition is met, the metadata files corresponding to the respective tasks are queried automatically so that the task recovery process can be performed.

Step S230: determine the execution state of each task according to the query result, and recover each task according to its execution state.

Specifically, when the tasks are recovered according to their execution states, the state of a task that has already finished running can be updated, and a task that is still running can be taken over so that it continues to run under the management of the scheduling system until it finishes. For example, a task state table may be maintained in the scheduling system. The task state table may be stored in the corresponding task execution module or task scheduling module, and records the running state of each task. Accordingly, when the tasks are recovered, if a task has finished executing, its state in the task state table is changed to an end state and recorded according to the task execution result, such as success or failure; if a task has not finished executing, its current running state is determined from the metadata file, its state in the task state table is updated accordingly, and the task is taken over so that it continues to run under the management of the scheduling system. While a task runs under the management of the scheduling system, the task state table is updated in real time as the running state changes.

In addition, optionally, after the tasks are recovered according to their execution states, a state result indicating the running state is generated for each task that has finished running, and the metadata file corresponding to that task is deleted to reduce resource occupation. This embodiment can therefore provide a system upgrade that users do not perceive: tasks do not need to be run again after the scheduling system is upgraded, which saves computing resources.
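The recovery pass of step S230 and the optional cleanup can be sketched as follows. This is an illustration under stated assumptions: metadata files and the task state table are modelled as in-memory dictionaries, the is_running callback and the state names are hypothetical, and the convention that an errCode of 0 means success is taken from the metadata field description in this embodiment.

```python
def recover_tasks(metadata_by_task, state_table, is_running):
    """Update the task state table from metadata and drop finished metadata.

    metadata_by_task: dict of task id -> metadata dict (stand-in for the files).
    state_table:      dict of task id -> state string, updated in place.
    is_running:       hypothetical callback telling whether the task described
                      by a metadata dict is still executing.
    Returns the ids of the tasks found to have finished.
    """
    finished = []
    for task_id, meta in metadata_by_task.items():
        if is_running(meta):
            # Still running: take the task over and keep managing it.
            state_table[task_id] = "running"
        else:
            # Finished: record the end state from the error code (0 = success).
            state_table[task_id] = "success" if meta["errCode"] == 0 else "failed"
            finished.append(task_id)
    # Delete metadata of finished tasks to reduce resource occupation.
    for task_id in finished:
        del metadata_by_task[task_id]
    return finished
```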

In specific implementation, the invention provides a local task recovery mode and a cloud task recovery mode. Two recovery methods are described below by two examples:

Example one,

The local task recovery mode is mainly used for performing task recovery on the local of each task execution module, each task execution module is responsible for recovery operation of each task in the local module, and the mode has the advantages of high transmission speed, convenience in operation and the like.

Specifically, when the local task recovery mode is adopted, step S210 specifically includes: when a metadata file is created for a successfully started task, the metadata file is created in the local storage space of the task execution module that runs the task. That is, each task execution module creates the metadata files for its own tasks in its own local storage space; such a metadata file can generally be accessed only by that module, and other task execution modules cannot access it without authorization.

Correspondingly, step S220 specifically includes: when the preset task recovery condition is met, the task execution module that runs the successfully started tasks queries the metadata files corresponding to those tasks in its local storage space. For example, when a task execution module restarts after being upgraded, it queries the metadata files stored in its local storage space, determines the execution state of each task from the query result, and recovers each task according to its execution state. In this way the local tasks of each task execution module can be recovered effectively.

Therefore, in the local task recovery mode, a metadata file is created after the task is successfully started. The file name has the format date_sysId_taskId_timestamp_tempId_pid_startTime, and the following content is written into the file: {"errCode": 1, "checksum": "", "lastOffset": 0, "callback": {"report_log_url": "xxx", "sys_token": "xxxxx"}}.

The metadata file name fields are described as follows:
date: the running time of the task;
sysId: the system to which the task belongs;
taskId: the task Id;
timestamp: the task running timestamp, mainly used for filtering redundant task reports;
tempId: the template used for running the task;
pid: the process number of the running task;
startTime: the start time of the process.

The metadata file content fields are described as follows:
errCode: the error code of the task run, where 0 indicates success;
checksum: the check code of the template used for running the task;
lastOffset: the line offset of the last log request;
callback: the interface for returning the log;
report_log_url: the log interface URL;
sys_token: the log interface authentication code.

While the task runs, if the running log needs to be returned in real time, the log is returned through the interface in callback, and the value of lastOffset is updated continuously so that it marks the starting position of the next log request. If the task execution module restarts while the task is running, the task execution module uses the metadata files to reload the tasks that need to be recovered. If the task finishes while the task execution module is not running, the task writes errCode into the metadata file; after starting, the task execution module reads that value and performs the corresponding recovery action. After the task finishes running, the recovered task returns the task running state according to the value of errCode in the metadata file, and the related metadata is deleted. The fields report_log_url, lastOffset and sys_token together implement resumable transfer of the task log: when the scheduler restarts, the log-return module sends the log lines after lastOffset to report_log_url and supplies sys_token for verification, so that report_log_url accepts the request as legitimate.
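The file name format and the role of lastOffset can be illustrated with a short Python sketch. The helper names are hypothetical, the sketch assumes that no field value itself contains an underscore, and the initial content simply mirrors the example content given above; no real log interface is called.

```python
# Field order of the metadata file name, as described in the text.
FIELDS = ("date", "sysId", "taskId", "timestamp", "tempId", "pid", "startTime")


def metadata_file_name(values):
    """Join the name fields with underscores (values: dict keyed by FIELDS)."""
    return "_".join(str(values[name]) for name in FIELDS)


def parse_metadata_file_name(file_name):
    """Split a file name back into its fields; all values come back as strings."""
    parts = file_name.split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(f"unexpected metadata file name: {file_name!r}")
    return dict(zip(FIELDS, parts))


def initial_content(report_log_url, sys_token):
    """Content written when the task starts, mirroring the example in the text."""
    return {
        "errCode": 1,
        "checksum": "",
        "lastOffset": 0,
        "callback": {"report_log_url": report_log_url, "sys_token": sys_token},
    }


def pending_log_lines(content, log_lines):
    """Return the log lines after lastOffset and advance the offset.

    A real log-return module would send the batch to report_log_url together
    with sys_token; this sketch only computes the batch.
    """
    start = content["lastOffset"]
    batch = log_lines[start:]
    content["lastOffset"] = start + len(batch)
    return batch
```

Because lastOffset is persisted in the metadata file, a restarted module can call pending_log_lines again and send only the lines that were not yet returned.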

Example two,

The cloud task recovery mode is mainly used for recovering tasks on an abnormal task execution module by other task execution modules by using the cloud metadata file when a certain task execution module is abnormal. Specifically, when a cloud task recovery mode is adopted, the task scheduling system generally includes a plurality of task execution modules, and step S210 specifically includes: when the metadata file corresponding to the task is created for the task which is successfully started, the metadata file corresponding to the task is created in the cloud storage space which can be shared by the plurality of task execution modules. That is, each task execution module creates a metadata file in the cloud storage space, and the metadata file can be accessed not only by the module but also by other task execution modules.

Correspondingly, step S220 specifically includes: when the preset task recovery condition is met, querying the metadata file corresponding to each task in the cloud storage space. There are two query modes. In the first query mode, a task execution module queries the metadata files corresponding to the tasks on that module in order to recover those tasks. For example, when a task execution module goes down and cannot be recovered immediately, it must resume its scheduling service after it restarts. Storing the metadata files in the cloud storage space rather than locally prevents them from being affected by the downtime and thus improves their safety.

In the second query mode, when a certain task execution module is down, another task execution module is responsible for recovering the tasks on the module that is down. Correspondingly, a task execution module that meets the preset task recovery condition is determined to be the first execution module, and a task execution module that meets a preset abnormal condition is determined to be the second execution module; when the preset task recovery condition is met, the first execution module queries the metadata files, stored in the cloud storage space, of the tasks corresponding to both the first execution module and the second execution module. For example, if one task execution module stays down for longer than a preset time during a system upgrade, that module is determined to meet the preset abnormal condition and is the second execution module. If another task execution module restarts normally after the system upgrade, that module is determined to meet the task recovery condition and is the first execution module. Correspondingly, after the first execution module restarts successfully, it queries the metadata files of the tasks corresponding to the first execution module and the second execution module in the cloud storage space, so as to recover the tasks of both modules. In effect, after one task execution module fails, another task execution module takes over and recovers its tasks, which improves the robustness of the system.
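The second query mode can be sketched as a filter over the shared metadata. The sketch is only an illustration: the cloud storage space is modelled as a dictionary, and the "module" field recording which task execution module owns a task is an assumption, since the text does not say how tasks are associated with modules.

```python
def metadata_to_recover(cloud_store, first_module, abnormal_modules):
    """Select the metadata the first execution module should recover.

    cloud_store:      dict of task id -> metadata dict; each metadata dict is
                      assumed to carry a "module" field naming its owner.
    first_module:     the module that met the task recovery condition.
    abnormal_modules: modules that met the preset abnormal condition.
    """
    owners = {first_module} | set(abnormal_modules)
    return {
        task_id: meta
        for task_id, meta in cloud_store.items()
        if meta["module"] in owners
    }
```

A module that restarted normally would call this with its own name and the names of the modules that are still down, then recover every task returned.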

The cloud storage space can be implemented with ZooKeeper (ZK). In this implementation, the cloud task recovery mode applies only to yarn-cluster tasks; logs cannot be returned and are stored in the cluster. When the task starts successfully, the following metadata is written into the ZK directory identified by date, sysId, taskId and timestamp: {"errCode": 1, "checksum": "", "driver": "xxxxx"}.

The metadata file name fields are described as follows:
date: the running time of the task;
sysId: the system to which the task belongs;
taskId: the task Id;
timestamp: the task running timestamp, mainly used for filtering redundant task reports.

The metadata file content fields are described as follows:
errCode: the error code of the task run, where 0 indicates success;
checksum: the check code of the template used for running the task;
driver: the address returned by the yarn resource manager (yarn needs to be modified to support this function).

While the task runs, if the task execution module restarts or crashes, the task execution module uses the metadata stored in ZK to reload the tasks that need to be recovered. If the task finishes while the task execution module is not running, the task writes errCode into the metadata file through ZK; after starting, the task execution module reads that value and performs the corresponding recovery action. After the task finishes running, the recovered task returns the task running state according to the value of errCode in the metadata file, and the related metadata is deleted. In addition, in this embodiment, after the task finishes running, the task state may also be reported to the task scheduling module so that the task scheduling module can record it.

The task recovery method provided by the embodiment is particularly suitable for recovery processing of real-time tasks. Of course, the recovery method may also be applied to perform recovery processing on the offline task, which is not limited in the present invention. Therefore, the execution state of the tasks is recorded through the metadata file, so that the recovery processing is carried out on each task according to the execution state of each task. By adopting the mode, under special conditions such as system upgrading and the like, the running tasks can continue to run in the background without being suspended, and when the task recovery conditions are met, the tasks which originally run in the background are automatically recovered according to the metadata file, so that the computing resources are saved, and the task running efficiency is improved.

Example III,

Fig. 3 is a flowchart illustrating a task scheduling method according to a third embodiment of the present invention. Preferably, the task scheduling method is applied to the distributed task scheduling system in the first embodiment of the present invention. Of course, those skilled in the art can know that the task scheduling method can also be applied to other forms of systems or devices, and the invention is not limited to the application scenario of the task scheduling method.

For convenience of understanding, the task scheduling method is applied to the distributed task scheduling system in the first embodiment as an example for description. As shown in fig. 3, the task scheduling method includes the following steps:

step S300: a plurality of task dimensions are preset, the size relation among the task dimensions is defined, and the task dimensions are used for describing task execution time information.

This step is an optional step. Wherein the plurality of task dimensions includes at least one of: a minute dimension, an hour dimension, a day dimension, a week dimension, and a month dimension; and the month dimension is less than the week dimension, the week dimension is less than the day dimension, the day dimension is less than the hour dimension, and the hour dimension is less than the minute dimension.

For example, this embodiment supports tasks that run by minute, hour, day, week or month, as well as tasks that run immediately. The supported task types are as follows:
Minute tasks: Fxx (e.g., F10 means running once every 10 minutes).
Hour tasks: the hourly task HHH, which runs every hour, and the cross-hour task Hxx (e.g., H01 means running across one hour), supporting at most 99 hours.
Day tasks: the daily task DDD, which runs every day, and the cross-day task Dxx (e.g., D01 means running across one day), supporting at most 99 days.
Week tasks: Wxx, including Monday running tasks (W01) and weekend running tasks (W07).
Month tasks: the month-end task MMM, which runs at the end of the month, and the in-month task Mxx, which runs during the month.
Immediate tasks: III.

Accordingly, the task dimensions are ordered as follows: cross-hour task Hxx > hourly task HHH > cross-day task Dxx > daily task DDD > weekly task Wxx > in-month task Mxx > month-end task MMM. That is, the shorter the task execution period, the larger the task dimension; conversely, the longer the task execution period, the smaller the task dimension.
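The dimension codes and their ordering can be sketched in Python as follows. This is an illustration, not the patented implementation: the function names are hypothetical, placing the minute task Fxx above Hxx follows the statement that the hour dimension is less than the minute dimension, and the immediate task III is rejected because the text assigns it no position in the ordering.

```python
import re

# Dimension classes from largest to smallest, following the ordering in the text.
ORDER = ["Fxx", "Hxx", "HHH", "Dxx", "DDD", "Wxx", "Mxx", "MMM"]


def dimension_class(code):
    """Map a concrete code such as 'H01' or 'DDD' to its dimension class."""
    if code in ("HHH", "DDD", "MMM"):
        return code
    match = re.fullmatch(r"([FHDWM])(\d{2})", code)
    if match is None:
        raise ValueError(f"unsupported task dimension code: {code!r}")
    return match.group(1) + "xx"


def dimension_rank(code):
    """Larger rank means larger task dimension."""
    return len(ORDER) - ORDER.index(dimension_class(code))


def is_larger(code_a, code_b):
    """True when code_a has a larger task dimension than code_b."""
    return dimension_rank(code_a) > dimension_rank(code_b)
```

For instance, is_larger("H01", "DDD") holds because the cross-hour dimension precedes the daily dimension in the ordering.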

Step S310: running an independent task in the tasks to be executed according to a preset task topology table; the task topology table is used for storing the dependency relationship among tasks to be executed.

In this embodiment, dependency relationships among tasks are flexible: a task may depend on tasks of the same dimension as well as on tasks of different dimensions, which makes it convenient to create tasks that meet various practical requirements and widens the range of application scenarios. The task topology table stores the dependency relationships among the tasks to be executed and shows clearly whether a dependency exists between tasks. Specifically, according to the task topology table, a task among the tasks to be executed that does not depend on any parent task is determined to be an independent task (or a root task is determined to be an independent task); the independent task is initialized and the corresponding resources are applied for; and the independent task is sent to one of the task execution modules, which is responsible for running it.
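A minimal sketch of this step follows, assuming the task topology table is stored as (parent, child) rows. The round-robin choice of task execution module is purely an assumption, since the text only says that the task is sent to one of the modules; initialization and resource application are omitted.

```python
from itertools import cycle


def dispatch_independent_tasks(tasks, dependency_rows, modules):
    """Assign every independent task to a task execution module.

    tasks:           list of task names to be executed.
    dependency_rows: iterable of (parent, child) rows from the topology table.
    modules:         names of the available task execution modules.
    Returns a dict of independent task -> module (round-robin, assumed policy).
    """
    if not modules:
        raise ValueError("at least one task execution module is required")
    # Any task that appears as a child depends on a parent task.
    has_parent = {child for _parent, child in dependency_rows}
    targets = cycle(modules)
    return {task: next(targets) for task in tasks if task not in has_parent}
```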

In addition, optionally, when there are a plurality of independent tasks, an execution order among the independent tasks is determined according to task priority, and the independent tasks are run in sequence according to that order. The task priority of each task is set according to a preset priority setting rule. The priority setting rule includes at least one of the following: dynamically detecting the waiting time of each task and adjusting the task priority of the corresponding task according to the waiting time; and dividing the preset value range of the priority value into a first interval and a second interval, wherein the values in the second interval are greater than those in the first interval and the second interval corresponds to tasks of a preset type. For example, in the task scheduling system of the present invention, the task priority ranges over [10, 90], the priority of a regular task is set within [10, 80], and the priority of a task of the preset type, such as a system task or a special task, is set within [80, 90]. To prevent a low-priority task from never being run when a large number of tasks need to run, the invention provides a task priority updating module, which detects the waiting time of each task and dynamically adjusts the task priority according to the detection result: the longer a task waits to run, the more its priority is raised, up to the maximum upper limit (e.g., a priority of 80). If the priorities of all tasks have reached the maximum upper limit, tasks are selected in a first-in, first-out manner, that is, the earlier a task was initialized, the earlier it is run. In addition, so that urgent tasks (in particular, tasks such as system upgrades and model updates) can be executed as early as possible, the present invention sets the priority of a task of the preset type to the highest value (e.g., 90), a value to which the priority of a regular task generally cannot be set.
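The priority adjustment described above can be illustrated with the following minimal sketch. The limits 80 and 90 are taken from the example in the text; the rate of increase (one priority point per 10 minutes of waiting), the class `PendingTask`, and the function names are assumptions made for illustration only.

```python
from dataclasses import dataclass

REGULAR_MAX = 80      # upper limit for regular tasks (from the example in the text)
PRESET_PRIORITY = 90  # highest value, reserved for preset-type tasks (from the text)


@dataclass
class PendingTask:
    task_id: str
    base_priority: int   # 10..80 for regular tasks
    init_time: float     # initialization time, in seconds
    preset_type: bool = False


def effective_priority(task, now, step_seconds=600.0):
    """Raise the priority as the task waits; the step size is an assumption."""
    if task.preset_type:
        return PRESET_PRIORITY
    waited = max(0.0, now - task.init_time)
    boost = int(waited // step_seconds)
    return min(REGULAR_MAX, task.base_priority + boost)


def pick_next(tasks, now):
    """Highest priority first; ties are broken first-in, first-out by init time."""
    return min(tasks, key=lambda t: (-effective_priority(t, now), t.init_time))
```

Once every regular task has reached the cap of 80, the tie-break on `init_time` reproduces the first-in, first-out selection described in the text.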

Step S320: when the independent task runs successfully, determining whether the independent task has a corresponding subtask according to the task topology table.

If the independent task has a plurality of corresponding subtasks, each of the subsequent steps is executed for each subtask respectively.

Step S330: if the independent task has a corresponding subtask, determining the parent tasks of the subtask corresponding to the independent task according to the task topology table.

The dependencies between tasks include implicit dependencies and explicit dependencies. An implicit dependency means that the dependency between tasks is not explicitly specified and exists only at the data level; for example, a task depends on its own output from yesterday as input, so if yesterday's run of the task failed, today's run will not take place. An explicit dependency means that the dependency between two tasks is explicitly specified and the relationship between the parent task and the child task is stored. The present embodiment is primarily directed to explicit dependencies. Accordingly, for a subtask corresponding to an independent task, the parent tasks of the subtask can be determined according to the task topology table. In many cases, one subtask has a plurality of parent tasks at the same time, and the subtask can run only after all of its parent tasks have run successfully; the main purpose of this step is therefore to find the other parent tasks of the subtask corresponding to the independent task.

Step S340: judging whether the parent tasks of the subtask corresponding to the independent task have run successfully; and if so, running the subtask corresponding to the independent task.

When the subtask corresponding to the independent task has a plurality of parent tasks, whether each parent task of the subtask has run successfully is judged one by one. In a specific implementation, the task dimension of the subtask and the task dimension of the parent task are determined, and whether the parent task has run successfully is judged according to these two task dimensions. In this embodiment, it is stipulated that the task dimension of a subtask must be less than or equal to the task dimension of its parent task; otherwise, the dependency is invalid.

When the task dimension of the subtask is equal to the task dimension of the parent task, the two tasks have the same dimension, and whether the parent task has run successfully is determined directly. When the task dimension of the subtask is smaller than the task dimension of the parent task, whether the parent task has run successfully is judged as follows: the theoretical number of runs of the parent task is calculated according to the life cycle of the parent task and the task dimension of the parent task; whether the actual number of runs of the parent task matches the theoretical number of runs is judged; and if so, the parent task is determined to have run successfully. For example, suppose the subtask is executed every day and the parent task is executed every hour. Before the subtask is executed, the theoretical number of runs of the parent task needs to be determined: assuming that the life cycle of the parent task is three days and its task dimension is hourly execution, the theoretical number of runs within the life cycle is 3 × 24 = 72. It is therefore necessary to judge whether the actual number of runs of the parent task is 72; if so, the parent task is determined to have run successfully. In this embodiment, defining the life cycle of the parent task constrains how long the parent task exists in the system, which flexibly meets the needs of tasks that are executed only for a short term. Of course, the life cycle of a task may also be set to be infinitely long, so that the task runs in a loop indefinitely.
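The run-count check in the example above (a three-day life cycle with hourly execution gives 72 theoretical runs) can be written as a short sketch. The function names are hypothetical, and representing the task dimension as an execution period (`timedelta`) is an assumption made for this illustration.

```python
from datetime import timedelta


def theoretical_runs(lifecycle: timedelta, period: timedelta) -> int:
    """Number of runs expected within the life cycle for a given execution period."""
    return lifecycle // period


def parent_succeeded(actual_successful_runs: int, lifecycle: timedelta, period: timedelta) -> bool:
    """The parent counts as successful only if actual runs match the theoretical count."""
    return actual_successful_runs == theoretical_runs(lifecycle, period)
```

With `lifecycle=timedelta(days=3)` and `period=timedelta(hours=1)`, `theoretical_runs` returns 72, matching the calculation in the text.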

Additionally, optionally, the method further comprises: when a topology suspension message triggered through a preset topology suspension entry is received, setting the topology state to a suspended state; and/or when a topology recovery message triggered through a preset topology recovery entry is received, restoring the topology state from the suspended state to an activated state. Correspondingly, in step S320, when the independent task runs successfully, whether the topology state is the activated state is queried first. If so, whether the independent task has a corresponding subtask is determined according to the task topology table, and step S330 and the subsequent steps are executed. If not, the operation of determining whether the independent task has a corresponding subtask is not executed, and step S330 and the subsequent steps are not executed either. For example, in a special process such as task debugging, the corresponding subtask does not need to run after the independent task runs successfully; automatic execution of the subtask can be avoided through the topology suspension function, which in a specific implementation sets the dependency state field of the task to 0 (inactive state). After the debugging process is finished, automatic execution of the subtask can be restored through the topology recovery function, which in a specific implementation sets the dependency state field of the task to 2 (activated state).

In short, in the present embodiment, only the root tasks are initialized at task initialization time, and other tasks are not initialized. When a task runs successfully and its dependency state is activated, whether the task has subtasks is checked. If it has a subtask, whether the parent tasks of that subtask have run successfully is checked one by one; if all the parent tasks of the subtask have run successfully, the subtask is initialized; otherwise, the subtask is not processed. When judging whether a parent task of a subtask has run successfully, two cases are distinguished: if the dimension of the subtask is the same as that of the parent task, the subtask is created directly once the parent task has run successfully; if the dimensions differ, the theoretical number of runs of the parent task is compared with its actual number of runs, and if the two numbers are the same, the subtask is initialized; otherwise, the subtask is not processed.
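The overall flow summarized above can be sketched as a single handler that is called whenever a task succeeds. This is a simplified illustration: the function name `on_task_success` and the dictionary-based topology are assumptions, and membership in the `succeeded` set stands in for the dimension-aware success check described earlier.

```python
def on_task_success(task, children, parents, succeeded, topology_active=True):
    """Return the subtasks that may be initialized after `task` succeeds.

    children / parents: dicts mapping a task to the set of its subtasks / parent tasks.
    succeeded: set of tasks already known to have run successfully (updated in place).
    """
    succeeded.add(task)
    if not topology_active:
        # Topology suspended (e.g. during debugging): subtasks are not triggered.
        return []
    ready = []
    for child in sorted(children.get(task, ())):
        # A subtask may run only when every one of its parent tasks has succeeded.
        if all(parent in succeeded for parent in parents.get(child, ())):
            ready.append(child)
    return ready
```

For a subtask C with parents A and B, the handler returns nothing when only A has succeeded and returns C once B has succeeded as well.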

In this way, the independent task is run first according to the task topology table; when the independent task runs successfully, the subtask corresponding to the independent task and the parent tasks of that subtask are looked up automatically, whether those parent tasks have run successfully is judged automatically, and if so, the subtask corresponding to the independent task is run automatically. With this method, the user does not need to judge the dependency relationships among tasks manually, tasks with dependency relationships run automatically once their conditions are met, efficiency is improved, and misjudgment is avoided. In addition, the embodiment supports dependencies not only between tasks of the same dimension but also between tasks of different dimensions, so that the application scenarios are more flexible and various types of tasks can be supported.

In addition, the task scheduling method in this embodiment may be executed by the task scheduling module in fig. 1, so as to implement reasonable scheduling of the whole tasks within the entire system. Alternatively, it may be executed by a task execution module to implement reasonable scheduling of each task in the module, which is not limited by the present invention.

Example four,

Fig. 4 is a flowchart illustrating a task scheduling method according to a fourth embodiment of the present invention. Preferably, the task scheduling method is applied to the distributed task scheduling system in the first embodiment of the present invention. Of course, those skilled in the art can know that the task scheduling method can also be applied to other forms of systems or devices, and the invention is not limited to the application scenario of the task scheduling method.

For convenience of understanding, the task scheduling method is applied to the distributed task scheduling system in the first embodiment as an example for description. As shown in fig. 4, the task scheduling method includes the following steps:

Step S410: distributing the tasks to be executed to the distributed task execution modules, and recording in the database the distributed tasks corresponding to each task execution module.

The step can be executed by a task scheduling module, and specifically, the task scheduling module reads the tasks to be executed from the database and distributes the read tasks to be executed to the distributed task execution modules. In order to distribute tasks more reasonably, the task scheduling module further monitors the system running state of each task execution module and issues the tasks to be executed to each task execution module according to the monitoring result. Wherein the system operation state comprises at least one of the following: CPU state, memory state, disk usage, task concurrency, etc. In order to facilitate maintenance of the distributed tasks, the distributed tasks corresponding to the task execution modules are further recorded in the database. During specific recording, the corresponding relationship between the task execution module and the distributed tasks needs to be further recorded so as to determine which task execution module a certain task is specifically run by.

Step S420: receiving the task state list returned by each task execution module at preset intervals, and determining the task execution module that sent the task state list according to the module identification information contained in the task state list.

At each preset interval, each task execution module collects the state of every task handled during that interval and returns to the task scheduling module a task state list reflecting the execution state of each task in that task execution module. Correspondingly, after receiving the task state list returned by a task execution module, the task scheduling module determines which task execution module sent the list according to the module identification information contained in it. The preset interval can be flexibly configured by those skilled in the art.

Specifically, in addition to the module identification information of the task execution module, the task state list further includes the task identifier of each task on the task execution module and the state information corresponding to each task. The state information may include at least one of the following: initialization state (I), waiting-to-run state (W), running state (R), task kill requested (A), task being killed (B), task run succeeded (S), task run failed (F), and task killed (K).

Step S430: querying the database for the distributed tasks corresponding to the task execution module that sent the task state list.

Correspondingly, in this step, after the task execution module that sends the task state list is determined according to the module identification information included in the task state list, the distributed tasks corresponding to the task execution module are further inquired in the database so as to determine the number and types of the tasks distributed to the task execution module.

Step S440: comparing the task state list with the queried distributed tasks, determining the tasks that need to be retransmitted according to the comparison result, and distributing the tasks that need to be retransmitted to at least one task execution module.

Specifically, the received task state list is compared with the corresponding distributed tasks that were queried. In the comparison, whether the number of tasks in the task state list matches the number of queried distributed tasks is checked. If the number of tasks contained in the task state list is less than the number of queried distributed tasks, the missing tasks are determined according to the task identifier of each task, a missing task being a queried distributed task that is not contained in the task state list, and each missing task is determined to be a task that needs to be retransmitted.

In addition, some tasks may be executed more than once, which could make the task count inaccurate. To prevent this, in the embodiment each task in the task state list has corresponding timestamp information, which describes the initialization time, the running time, and/or the completion time of the task. Correspondingly, when the number of tasks in the task state list is compared with the number of queried distributed tasks, the tasks are first deduplicated according to task identifier and timestamp information: when several entries correspond to the same task identifier, one of them is selected as the valid task according to their timestamp information, and the number of tasks contained in the task state list is counted over the selected valid tasks. This effectively prevents inaccurate task counts and also avoids the error of recording the same task multiple times. In a specific implementation, the entry with the latest timestamp, or the entry whose running state is not abnormal, may be selected as the valid task.
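The comparison and deduplication described above can be sketched as follows. The sketch assumes that each entry of the task state list is a dictionary with the hypothetical keys `task_id`, `timestamp`, and `state`, and it uses the "latest timestamp" rule mentioned in the text to pick the valid entry.

```python
def deduplicate(status_list):
    """Keep, for each task identifier, the entry with the latest timestamp."""
    latest = {}
    for entry in status_list:
        current = latest.get(entry["task_id"])
        if current is None or entry["timestamp"] > current["timestamp"]:
            latest[entry["task_id"]] = entry
    return latest


def missing_tasks(distributed_ids, status_list):
    """Distributed tasks that do not appear in the reported task state list."""
    reported = deduplicate(status_list)
    return sorted(set(distributed_ids) - set(reported))
```

Tasks returned by `missing_tasks` would then be treated as tasks that need to be retransmitted.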

In addition, since each task in the task state list has corresponding state information (such as the states listed above), the tasks that need to be retransmitted can also be determined according to the state information of each task in the task state list when the comparison result is evaluated. For example, tasks that failed to run or that otherwise ran abnormally may be retransmitted. When a task that needs to be retransmitted is distributed to at least one task execution module, it may be distributed to the task execution module recorded in the database as corresponding to that task, or a task execution module may be newly selected according to the load of each task execution module.

In this embodiment, the task scheduling module and each task execution module can thus communicate asynchronously. Correspondingly, after sending a task to a task execution module, the task scheduling module does not need to wait for a response to that task, and this asynchronous communication helps improve the operating efficiency of the system. However, in the course of implementing the invention, the inventors found that asynchronous communication may cause two problems. On the one hand, a task sent by the task scheduling module may not be received by the task execution module (e.g., because of a network exception or a restart of the task execution module), and such a task needs to be sent to a task execution module again for execution. On the other hand, the task scheduling module may not receive the message fed back by the task execution module. To solve these problems, in this embodiment the task execution module sends to the task scheduling module a list of the states of the tasks it ran during a past period, and the task scheduling module determines from this task state list whether a task has been lost or a message has failed to be received, thereby increasing the robustness of the system.

Optionally, when determining a task to be retransmitted according to the state information of each task in the task state list, the last update time included in the state information of the task is obtained; whether the time difference between the current system time and the last update time is greater than a preset downtime threshold is judged; and if so, the task is determined to be a task that needs to be retransmitted. For example, if the state of a task is running and the task has remained in that state for longer than the preset downtime threshold, such as 2 hours (which may be changed through a configuration parameter), the machine (task execution module) is considered to be down. That is, if the current system time > the last update time + 2 hours, the tasks on that machine are considered to need to be re-run. To prevent a task whose state is not updated while the task execution module is being upgraded from being misjudged as a machine downtime, the preset downtime threshold is set greater than the system upgrade duration. In this way, the task scheduling module can effectively detect a task execution module that is down and redistribute its tasks to other task execution modules that are not down, so that task delays are avoided.
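The downtime check can be illustrated with the following minimal sketch. The 2-hour default comes from the example in the text; the function name `needs_rerun` and the use of the state code "R" for a running task are assumptions made for this illustration.

```python
from datetime import datetime, timedelta


def needs_rerun(state, last_update, now, downtime_threshold=timedelta(hours=2)):
    """A running task whose state has not been updated for longer than the
    threshold is treated as stranded on a downed execution module."""
    return state == "R" and (now - last_update) > downtime_threshold
```

In a real deployment the threshold would be chosen to exceed the system upgrade duration, as the text requires, so that an upgrade is not mistaken for downtime.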

Optionally, the method is performed by a task scheduling module, and the method further comprises: when a restart of the task scheduling module is completed, checking each task to be executed within a preset period; and determining the omitted tasks according to the check result and distributing the omitted tasks to at least one task execution module. Specifically, to improve reliability, when the task scheduling module is restarted, the tasks within a preset period, for example tasks whose life cycle falls within the last day or the last 3 hours (the specific period may be configured through a parameter), are checked. If a task is found not to have been initialized, this indicates that the task was lost during the restart or upgrade; the task is determined to be an omitted task and is initialized and run. The check can be performed by comparison with the database. Note that an omitted task is different from the missing task mentioned above. An omitted task is typically a task that was never distributed to a task execution module because of the restart of the task scheduling module. A missing task is typically a task that the task scheduling module has already distributed to a task execution module through the normal flow, but for which the task scheduling module has not received an execution result, because either the message issued by the task scheduling module to the task execution module or the message returned by the task execution module to the task scheduling module was lost.
In other words, an omitted task is a task for which the task scheduling module never performed the issuing operation at all, whereas a missing task is a task for which the task scheduling module performed the issuing operation but a message related to that operation was lost during asynchronous communication. With the approach of the invention, a task can therefore be recovered effectively no matter at which stage it was lost.
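The check performed after a restart of the task scheduling module can be sketched as follows. The sketch assumes that the database yields a mapping from task identifier to planned start time and a set of identifiers of tasks that were already initialized; the function name and these data shapes are hypothetical, and the one-day window is one of the example values given in the text.

```python
from datetime import datetime, timedelta


def tasks_lost_during_restart(scheduled, initialized_ids, now, window=timedelta(days=1)):
    """Return tasks planned within the recent window that were never initialized.

    scheduled: dict mapping task id -> planned start time (as read from the database).
    initialized_ids: set of task ids that have already been initialized.
    """
    start = now - window
    return sorted(
        task_id
        for task_id, planned in scheduled.items()
        if start <= planned <= now and task_id not in initialized_ids
    )
```

The returned tasks would then be initialized and distributed to at least one task execution module.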

The task scheduling method can therefore apply several fault-tolerance mechanisms, which improves task reliability and effectively prevents tasks from being lost, from becoming unmonitorable because of a system abnormality, or from being dropped during a system upgrade. For example, the method provides fault tolerance for three situations: a restart of the task scheduling module, downtime of a task execution module, and abnormal message transmission between the task scheduling module and a task execution module. The fault-tolerance mechanism is comprehensive and reliable, and zero task loss can be achieved.

Example five,

Fig. 5 shows a schematic structural diagram of a distributed task scheduling system according to a fifth embodiment of the present invention. The distributed task scheduling system shown in fig. 5 and the distributed task scheduling system shown in fig. 1 may be the same system: the embodiment corresponding to fig. 5 describes the system from the perspective of service stability, whereas the embodiment corresponding to fig. 1 describes it from the perspective of the overall architecture. Accordingly, the technical features described in the embodiment corresponding to fig. 5 and those described in the embodiment corresponding to fig. 1 may be combined with each other. Of course, those skilled in the art will understand that the distributed task scheduling system shown in fig. 5 can also be implemented with an architecture different from that of the system shown in fig. 1; the present invention does not limit the specific architecture of the distributed task scheduling system shown in fig. 5.

As shown in fig. 5, the distributed task scheduling system includes: a database 50 for storing task information of tasks to be executed, a plurality of task scheduling modules 51, and a plurality of task execution modules 52; the task scheduling module 51 is adapted to determine whether a task scheduling message to be issued to a task execution module belongs to a preset type of message and, if so, to issue the task scheduling message to the task execution module 52 according to a preset consistency policy; the task execution module 52 is adapted to execute the corresponding task according to the received task scheduling message and to return a task response message to the task scheduling module 51; wherein the preset type of message comprises: a message whose generation requires access to the task information stored in the database.

The database may be located between the front-end interaction module and the task scheduling module in the first embodiment, and is used to provide a data storage function.

Therefore, in the distributed task scheduling system provided in this embodiment, a database for storing task information of tasks to be executed, a plurality of task scheduling modules, and a plurality of task execution modules are provided. The task scheduling modules and the task execution modules work in parallel, so that the concurrency of the whole system is improved, and the performance bottleneck caused by insufficient resources of a single machine is solved. And each task scheduling module can issue the task scheduling message to the task execution module according to a preset consistency strategy when judging that the task scheduling message to be issued to the task execution module belongs to the preset type message. The preset type of message comprises a message which needs to access task information stored in a database in the message generation process. Therefore, the consistency strategy can prevent the problem of data inconsistency caused by the fact that a plurality of task scheduling modules access the database at the same time, and service stability is improved.

Specifically, in this embodiment, the various types of messages transmitted between the task scheduling module and the task execution module are split and classified in advance. In practice, these messages may be of various types, for example, messages for synchronizing task computation models between the task scheduling module and the task execution module, messages for monitoring the system running state of the task execution module, messages for issuing tasks to the task execution module, and messages for receiving task execution results from the task execution module. These messages are divided into two categories, preset-type messages and non-preset-type messages, according to whether the message generation process needs to access the task information stored in the database. A preset-type message is thus a message that requires access to the database. Accessing the database is an operation of high importance, and the invention has a plurality of task scheduling modules. To prevent data inconsistency caused by several task scheduling modules accessing the database at the same time, this embodiment sets a preset consistency policy for preset-type messages; the consistency policy ensures the consistency of the data in the database and thereby the stability of the service.

In this embodiment, the consistency policy includes: selecting one master scheduling module from the plurality of task scheduling modules, the master scheduling module accessing the database and issuing the messages belonging to the preset type to the task execution modules. Specifically, a master scheduling module is selected in advance from the plurality of task scheduling modules, and all messages belonging to the preset type are processed by the master scheduling module. That is, at any given time only the one master scheduling module is allowed to access the database, and the other task scheduling modules are not. Accordingly, messages read from the database are sent to the task execution modules in a single-point mode, that is, only one task scheduling module provides this service and the remaining task scheduling modules do not. If the master scheduling module fails, a new master scheduling module is selected from the remaining task scheduling modules.
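The single-master policy can be illustrated with the following minimal sketch. The text does not specify how the master scheduling module is selected; choosing the available module with the smallest identifier is purely an assumption for this example, as are the function names.

```python
def select_master(scheduler_ids, alive, current_master=None):
    """Keep the current master while it is alive; otherwise pick a new one.

    The selection rule (smallest identifier among the live modules) is an
    illustrative assumption, not a rule taken from the description.
    """
    if current_master is not None and current_master in alive:
        return current_master
    candidates = sorted(s for s in scheduler_ids if s in alive)
    if not candidates:
        raise RuntimeError("no task scheduling module is available")
    return candidates[0]


def may_access_database(scheduler_id, master_id):
    """Only the master scheduling module reads the database for preset-type messages."""
    return scheduler_id == master_id
```

When the current master is no longer in the `alive` set, calling `select_master` again yields a replacement from the remaining task scheduling modules.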

Correspondingly, when a task execution module sends messages such as task response messages to the task scheduling modules, it sends each task response message to one of the task scheduling modules according to a preset rule, so that the receiving task scheduling module can process the task response message according to the task identifier contained in it. The preset rule comprises: a random sending rule, and/or a rule of sending according to the task identifier. In practice, a publish/subscribe mode is adopted: the task execution module sends each message to a randomly chosen task scheduling module, and each message is sent only once. The task scheduling module that receives the message performs the related operations on it, and the same message is never sent to two or more task scheduling modules at the same time.

In this embodiment, messages are thus split and classified, and preset-type messages are sent by the master scheduling module, which avoids the potential data tampering or inconsistency that could arise if several task scheduling modules accessed the database simultaneously. In addition, response messages can be received by the plurality of task scheduling modules in parallel, and the module that receives a response message can process it directly according to the related records in the database.

In addition, in order to further improve the system reliability, each task scheduling module in this embodiment is composed of two task scheduling sub-modules that are backup to each other, and at the same time, one of the task scheduling sub-modules is responsible for working, and the other task scheduling sub-module is in a redundant backup state. Similarly, each task execution module in this embodiment is also composed of two task execution submodules that are backed up with each other, and at the same time, one of the task execution submodules is responsible for working, and the other task execution submodule is in a redundant backup state. Once a certain sub-module originally in the working state is down or has an error, the sub-module in the redundant backup state can immediately take over the work of the down or error sub-module, so that the normal operation of the system is not influenced.

In addition, in other implementations of this embodiment, the following consistency policy may also be adopted. The consistency policy includes: obtaining, from a database, task information of the tasks to be executed that corresponds to the hash value of the task scheduling module; selecting, from a plurality of task execution modules, the task execution module matching the task identifier contained in the obtained task information as the target execution module; and issuing the task scheduling message corresponding to the obtained task information to the target execution module. The hash value corresponding to each task scheduling module is unique, and each task scheduling module obtains only the information corresponding to its own hash value when reading from the database. That is, all information in the database is divided in advance into a plurality of parts according to hash value, each part corresponding to one hash value. Each task scheduling module can therefore access only the part of the database corresponding to its hash value and has no access to the rest, which improves the security of the database contents and prevents the same content from being accessed by several task scheduling modules at the same time.
In addition, the obtained task information of each task to be executed includes a task identifier, and the task execution module matching that identifier can be selected from the plurality of task execution modules as the target execution module. For example, a remainder (modulo) operation can be performed on the task identifier and the corresponding task execution module determined from the result; as another example, a correspondence between task identifiers and task execution modules can be preset and the matching task execution module determined from that correspondence.
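The consistency policy above can be illustrated with a minimal sketch. It assumes that the "hash value" of a scheduling module is a shard index, that CRC32 is the hash function, and that task identifiers are numeric when the remainder operation is applied; the names `shard_of`, `tasks_for_scheduler` and `pick_executor` are hypothetical and do not come from the specification.

```python
import zlib


def shard_of(task_id: str, num_shards: int) -> int:
    """Map a task identifier to a stable shard index (CRC32 is an assumed choice)."""
    return zlib.crc32(task_id.encode("utf-8")) % num_shards


def tasks_for_scheduler(scheduler_shard: int, all_tasks: list, num_shards: int) -> list:
    """Return only the tasks whose shard matches this scheduler's hash value."""
    return [t for t in all_tasks
            if shard_of(t["task_id"], num_shards) == scheduler_shard]


def pick_executor(numeric_task_id: int, executors: list) -> str:
    """Select the target execution module by a remainder operation on the task id."""
    return executors[numeric_task_id % len(executors)]
```

Because every task maps to exactly one shard, no two scheduling modules fetch the same task, which is the property the policy relies on.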

In short, the method in this embodiment ensures the security of the database contents and prevents several task scheduling modules from obtaining the same task at the same time and sending it to different task execution modules. Moreover, having several task scheduling modules receive messages simultaneously can greatly improve the concurrency of the system.

Optionally, the system further comprises a consistency module 53, which is connected to each task scheduling module and each task execution module and is configured to detect and maintain consistency between them. Specifically, the consistency module can access the data on each task scheduling module and each task execution module and judge, according to the system log and the records in the database, whether their states are consistent. If they are not, it sends an update message to the corresponding task scheduling module or task execution module to update its state so that the consistency requirement is met.

Optionally, the task scheduling module is further adapted to: after distributing the task scheduling information corresponding to the tasks to be executed to the task execution modules, record in a database the tasks distributed to each task execution module; receive the task state list returned by each task execution module at preset intervals, and determine which task execution module sent a task state list from the module identification information it contains; query the database for the distributed tasks corresponding to that task execution module; and compare the task state list with the queried distributed tasks, determine from the comparison result the tasks that need to be retransmitted, and distribute those tasks to at least one task execution module. Optionally, the task scheduling module is specifically adapted to: if the number of tasks in the task state list is less than the number of queried distributed tasks, determine the missing tasks according to the task identifier of each task, a missing task being a queried distributed task that does not appear in the task state list, and treat each missing task as a task to be retransmitted.
Each task in the task state list has corresponding timestamp information, which describes the initialization time, the running time and/or the completion time of the task. The task scheduling module is specifically adapted to: when several tasks correspond to the same task identifier, select one of them as the effective task according to their timestamp information; and count the number of tasks contained in the task state list on the basis of the selected effective tasks. Each task in the task state list also has corresponding state information, and the task scheduling module is specifically adapted to determine the tasks to be retransmitted according to the state information of each task in the task state list, where the state information includes: initialization information, run-success information, run-failure information and last update time information. Optionally, the task scheduling module is specifically adapted to: obtain the last update time contained in the state information of a task; judge whether the difference between the current system time and the last update time is greater than a preset downtime threshold; and if so, determine that the task needs to be retransmitted, where the preset downtime threshold is greater than the system upgrade duration. Optionally, the task scheduling module is further adapted to: when a restart operation has finished, check each task to be executed within a preset time period; and determine the missing tasks according to the check result and distribute them to at least one task execution module.
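The comparison between the task state list and the distributed tasks can be sketched as follows. This is only an illustration under stated assumptions: the record type `TaskStatus`, the state names, the rule that the most recently updated record is the effective one, and the restriction of the downtime check to unfinished tasks are all hypothetical choices that the specification does not fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskStatus:
    task_id: str
    state: str          # assumed values: "init", "running", "success", "failed"
    last_update: float  # last update time, in seconds


def effective_tasks(status_list: list) -> dict:
    """Keep one record per task id: the most recently updated one (assumed rule)."""
    latest = {}
    for status in status_list:
        current = latest.get(status.task_id)
        if current is None or status.last_update > current.last_update:
            latest[status.task_id] = status
    return latest


def tasks_to_resend(assigned_ids, status_list, now, downtime_threshold) -> set:
    """Return ids of missing tasks plus unfinished tasks that have gone stale."""
    latest = effective_tasks(status_list)
    # Missing: distributed according to the database but absent from the list.
    missing = set(assigned_ids) - set(latest)
    # Stale: not finished and silent for longer than the downtime threshold.
    stale = {
        task_id
        for task_id, status in latest.items()
        if status.state in ("init", "running")
        and now - status.last_update > downtime_threshold
    }
    return missing | stale
```

Choosing a downtime threshold longer than the system upgrade duration, as the text requires, keeps a task from being retransmitted merely because the executor was briefly offline for an upgrade.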

The distributed task scheduling system in this embodiment can be used in combination with the task recovery method in the second embodiment and the task scheduling methods in the third and fourth embodiments. Particularly, the method can be combined with a task scheduling method including multiple fault-tolerant mechanisms in the fourth embodiment, so as to improve the stability and reliability of the service.

Example six,

Fig. 6 is a schematic structural diagram illustrating a task scheduling system according to a sixth embodiment of the present invention. The task scheduling system shown in fig. 6 and the distributed task scheduling system shown in fig. 1 may be the same system, the embodiment corresponding to fig. 6 focuses on describing the system from the perspective of a task computation model, the embodiment corresponding to fig. 1 focuses on describing the system from the perspective of the overall architecture of the system, and accordingly, various technical features described in the embodiment corresponding to fig. 6 and various technical features described in the embodiment corresponding to fig. 1 may be combined with each other. Of course, it can be understood by those skilled in the art that the task scheduling system shown in fig. 6 can also be implemented in a manner different from the architecture of the distributed task scheduling system shown in fig. 1, and the specific architecture of the task scheduling system shown in fig. 6 is not limited by the present invention.

As shown in fig. 6, the task scheduling system includes a task scheduling module 61 and a plurality of task execution modules 62. The task scheduling module 61 is used for generating a plurality of preset task computing models of different types in advance and issuing them to the task execution modules, and is further used for issuing the acquired tasks to be executed to the task execution modules. The task execution module 62 is configured to receive and store in advance the plurality of preset task computing models of different types from the task scheduling module, and is further configured to call, according to the task type of a task to be executed issued by the task scheduling module, the task computing model matching that task type, so as to parse out the task parameters of the task to be executed and run the task according to those parameters. Each task computing model stored on the task execution module can be synchronized with the corresponding task computing model generated in the task scheduling module.

Therefore, in the task scheduling system provided in this embodiment, the task scheduling module generates a plurality of preset task computing models of different types in advance and issues them to each task execution module, and the task execution module calls, according to the task type of a task to be executed, the matching task computing model, so as to parse out the task parameters and run the task according to them. Moreover, each task computing model stored on the task execution module can be synchronized with the corresponding task computing model generated in the task scheduling module. This approach creates preset task computing models of different types in advance, each used to parse and run tasks of the corresponding type. Because the running flows of tasks of the same type are substantially similar, tasks of the same type can be processed quickly through the task computing model. This avoids the cumbersome process in which a task execution module parses tasks one by one and creates a task execution program in real time, facilitates the concurrent execution of a large number of tasks, and greatly improves task running efficiency.
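One way to picture the dispatch by task type is a registry of handlers on the task execution module. The sketch below is an assumption-laden illustration: tasks are taken to be JSON strings with `type` and `params` fields, and the registry `MODELS`, the decorator `model` and the two example handlers are hypothetical names, not part of the specification.

```python
import json
from typing import Callable, Dict

# Registry of preset task computing models, keyed by task type (hypothetical).
MODELS: Dict[str, Callable[[dict], str]] = {}


def model(task_type: str):
    """Register a function as the computing model for one task type."""
    def register(fn):
        MODELS[task_type] = fn
        return fn
    return register


@model("offline")
def run_offline(params: dict) -> str:
    return f"batch job over {params['input']}"


@model("realtime")
def run_realtime(params: dict) -> str:
    return f"stream job on {params['topic']}"


def execute(raw_task: str) -> str:
    """Parse a task, pick the model matching its type, and run it."""
    task = json.loads(raw_task)
    handler = MODELS.get(task["type"])
    if handler is None:
        raise ValueError(f"no task computing model for type {task['type']!r}")
    return handler(task["params"])
```

Tasks of the same type differ only in `params`, so no per-task program has to be built at run time; adding or upgrading a model amounts to replacing one registry entry.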

In a specific implementation, every task running in the scheduling system depends on a task computing model, and each task computing model is used to process tasks of one type. The type of a task can be determined flexibly according to factors such as its execution flow. In the course of implementing the invention, the inventors found that, among tasks of the same type, different tasks differ only in their parameters, while the parsing method and execution flow are generally similar. Based on this finding, tasks with similar parsing methods and execution flows are grouped into the same type, and a corresponding task computing model is set for that type to parse or execute those tasks. Each task computing model can therefore be set according to the task type and/or the task execution flow. The task computing model in this embodiment can be used not only to run task-related operations but also to upgrade the task execution module. For example, when the task execution module needs to be upgraded because actual service requirements have changed, it can be upgraded by upgrading the task computing models on it. To this end, this embodiment provides two ways of synchronizing the task computing models stored on the task execution module with the task computing models generated in the task scheduling module, so as to update the task computing models. In this way, the task computing models on all task execution modules that need upgrading, or on a specific group of task execution modules, can be upgraded.

The first model synchronization mode is a model issuing mode and is mainly applied to a synchronization process after a new model package is manually uploaded. In a first model synchronization mode, the task scheduling module is mainly used for executing the following operations:

Step one, after a task computing model to be issued is generated, a data record corresponding to it is inserted into a preset template state table, and the state of the data record is set to the to-be-synchronized state.

The task computing model to be issued may be generated by uploading a template through a page. For example, a template of the latest version of the task computing model may be uploaded through the front-end interaction module shown in fig. 1. The front-end server (e.g., a WEB server) then stores the uploaded template in a preset directory and generates a download address for it, after which a data record is inserted into the template state table to record the template, and the state of the record is set to the to-be-synchronized state.

The template state table is used for storing all task computing models in the system so as to realize the unified supervision of each task computing model. Each data record in the template state table stores at least an identification of the corresponding task computing model and the corresponding state (e.g., to be synchronized, used, etc.).

Step two, the template state table is scanned, and a corresponding synchronization task is generated for each scanned data record in the to-be-synchronized state.

Because new templates are continuously uploaded to the system, the template state table can be scanned periodically, and a corresponding synchronization task is generated for each scanned data record in the to-be-synchronized state. Specifically, at least one task execution module corresponding to the data record is determined from the plurality of task execution modules, and at least one synchronization task corresponding to the at least one task execution module is generated for the data record. When several task execution modules correspond to the data record, several synchronization tasks are generated for it, one for each of those task execution modules. In practice, the task computing model in a to-be-synchronized data record may be intended only for a specific task execution module, in which case a single synchronization task is generated for that task execution module. Alternatively, the task computing model may be common to all task execution modules, in which case one synchronization task is generated for each task execution module, that is, the number of synchronization tasks equals the number of task execution modules.

Step three, the synchronization tasks are issued to the corresponding task execution modules, so that the task execution modules synchronize the corresponding task computing models according to the synchronization tasks.

Optionally, after a synchronization task is issued to the corresponding task execution module, the state of the data record corresponding to the synchronization task is updated to the synchronizing state. The data records in the synchronizing state contained in the template state table are then scanned periodically; for each scanned data record, the execution result of the synchronization task on the corresponding task execution module is queried; and the state of the data record is updated to the synchronization-success state or the synchronization-failure state according to the queried result. In this way the execution result of each synchronization task is tracked, so that failed tasks can be rerun.
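The three steps of the model issuing mode and the optional follow-up scan can be sketched as a small state machine over the template state table. The record layout (`model_id`, `state`, optional `targets`), the enum `SyncState` and the function names are assumptions made for illustration only.

```python
from enum import Enum


class SyncState(Enum):
    PENDING = "to_be_synchronized"
    IN_PROGRESS = "synchronizing"
    SUCCESS = "sync_success"
    FAILED = "sync_failed"


def create_sync_tasks(state_table: list, executors: list) -> list:
    """Steps two and three: one sync task per (pending record, target executor)."""
    tasks = []
    for record in state_table:
        if record["state"] is not SyncState.PENDING:
            continue
        # A record without explicit targets is treated as common to all executors.
        targets = record.get("targets") or executors
        tasks.extend({"model_id": record["model_id"], "executor": e} for e in targets)
        record["state"] = SyncState.IN_PROGRESS
    return tasks


def follow_up(state_table: list, results: dict) -> None:
    """Update synchronizing records from the queried results (model_id -> list of bool)."""
    for record in state_table:
        outcome = results.get(record["model_id"])
        if record["state"] is SyncState.IN_PROGRESS and outcome is not None:
            record["state"] = SyncState.SUCCESS if all(outcome) else SyncState.FAILED
```

Records left in the `FAILED` state are the ones a later scan could pick up for a rerun.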

The second model synchronization mode is an active reporting mode, also called the self-checking mode of the task execution modules, in which each task execution module periodically sends the version information of each task computing model on its own machine to the task scheduling module. After receiving the version information reported by a task execution module, the task scheduling module compares it with the version information recorded in the database and, if the two are inconsistent, creates a synchronization task that synchronizes the task computing model specifically to that task execution module.

It can be seen that, in the second model synchronization mode, each task execution module is further configured to periodically acquire and report the version information of each task computing model stored on it, and the task scheduling module is further configured to compare the version information reported by each task execution module with the version information of each task computing model recorded in the database and, when the comparison shows an inconsistency, to generate a synchronization task for synchronizing the task computing model to that task execution module. To ease transmission and speed up comparison, the version information of each task computing model stored on the task execution module includes the MD5 value of the version number of that task computing model. Correspondingly, each task execution module periodically reports the template MD5 of each task computing model present on the module to the task scheduling module. If the task scheduling module finds that the template MD5 reported by a task execution module is inconsistent with the content stored in the template table, for example because the version information differs or the task execution module lacks a certain template, the corresponding template needs to be synchronized to that task execution module again.
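The self-checking comparison can be sketched with the standard `hashlib` module. The mapping layout (model identifier to MD5 digest) and the function names `md5_of` and `models_to_sync` are illustrative assumptions.

```python
import hashlib


def md5_of(version: str) -> str:
    """MD5 digest of a model version number, as reported by an execution module."""
    return hashlib.md5(version.encode("utf-8")).hexdigest()


def models_to_sync(reported: dict, recorded: dict) -> list:
    """Return ids of models whose reported digest is missing or differs from the record."""
    return sorted(
        model_id
        for model_id, digest in recorded.items()
        if reported.get(model_id) != digest
    )
```

A model absent from the report and a model with a different digest are handled by the same comparison, which matches the two inconsistency cases named in the text.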

The task types in this embodiment include a real-time task type and/or an offline task type. Accordingly, the task computing models include a real-time task computing model and/or an offline task computing model, and each type of model is set according to the characteristics of the corresponding tasks, so that those tasks can be processed efficiently.

In addition, the task scheduling module in this embodiment may be further configured to monitor the system running state of each task execution module and issue tasks to be executed to the task execution modules according to the monitoring result, where the system running state includes at least one of the following: CPU state, memory state, disk utilization and task concurrency. As described above, this approach creates preset task computing models of different types in advance, each used to parse and run tasks of the corresponding type. Because the running flows of tasks of the same type are substantially similar, tasks of the same type can be processed quickly through the task computing model, which avoids the cumbersome process in which a task execution module parses tasks one by one and creates a task execution program in real time, facilitates the concurrent execution of a large number of tasks, and greatly improves task running efficiency. In addition, the upgrade of the task computing models can be carried out through the two synchronization modes, which in turn upgrades the task execution module. That is, this embodiment converts the upgrade of the task execution module into a synchronization of task computing models, which simplifies the upgrade process and improves upgrade efficiency.
If the task execution module were upgraded in the conventional way, it would have to be upgraded as a whole with a third-party tool, which interrupts the running of the task execution module. By upgrading the task execution module indirectly through the synchronization of task computing models, only some of the models in the task execution module need to be upgraded (no whole-module upgrade is required), the tasks running on the task execution module need not all be interrupted, and the upgrade is quick and convenient.

Those skilled in the art can understand that, in the first to sixth embodiments, the technical features of any embodiment can be combined with the technical features of one or more other embodiments. In addition, the system provided by the invention further provides the following auxiliary functions. When other platforms submit tasks to the scheduling system through the service API, the task running states and log information must be kept synchronized between the two platforms. To solve this synchronization problem, the system of the invention provides the platforms with a task state query interface and a log acquisition interface. In addition, so that task-related notification messages can be actively sent to a platform, a task state and log callback mechanism is also provided. Specifically, when a task is created, two callback interfaces are supplied with the task: a task state interface and a task log interface. The task state interface is used by the task scheduling module, when it updates the database, to send the updated task state information to the callback interface so that other platforms can receive it. The task log interface is used by the task scheduling module to send the log to the callback interface incrementally, and it supports resuming from a breakpoint: if the scheduler restarts, the task log continues to be uploaded from the last upload point until the task finishes.
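The incremental log callback with breakpoint resume can be illustrated as follows. This is a minimal sketch under assumptions: the log is modeled as a growing string, the upload point is a character offset that a real system would persist, and the class `LogUploader` and its `send` callback are hypothetical names.

```python
class LogUploader:
    """Send only the part of a task log that has not been uploaded yet."""

    def __init__(self, send, offset: int = 0):
        self.send = send      # callback that delivers one log chunk (assumed interface)
        self.offset = offset  # last upload point; restored from storage after a restart

    def upload(self, log_text: str) -> None:
        chunk = log_text[self.offset:]
        if chunk:
            self.send(chunk)
            self.offset += len(chunk)
```

After a restart, constructing a new `LogUploader` with the saved offset continues from the last upload point instead of resending the whole log.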

In summary, the systems and methods described in the first to sixth embodiments of the present invention provide a complete task scheduling mechanism, which has at least the following characteristics. Distributed architecture: both the task scheduling module and the task execution module run on multiple machines, which improves task stability. High concurrency: each task execution module can run more than 50 tasks concurrently. Task fault tolerance: mechanisms such as task state checking, task rescheduling and rescheduling of tasks when a machine goes offline are included. Task recovery: task execution is not affected during a system upgrade. Multiple task types: different computing models (supporting offline and real-time tasks) can be specified according to business needs. Task routing: tasks can be executed on a random machine or on a designated machine. System monitoring: indicators such as machine memory, CPU, load and disk can be monitored, and a corresponding task scheduling strategy is adopted according to the monitoring result. Task topology: a task dependency topology is provided (supporting task dependencies across different dimensions), together with functions such as topology suspension and topology recovery. Model upgrading: the system supports issuing task computing models and a model self-checking mechanism, and can update the task computing models, thereby indirectly updating the task execution module.

Fig. 7 is a functional structure diagram of a task scheduling device according to yet another embodiment of the present invention. As shown in fig. 7, the apparatus includes:

the running module 71 is adapted to run an independent task in the tasks to be executed according to a preset task topology table; the task topology table is used for storing the dependency relationship among tasks to be executed;

the subtask determining module 72 is adapted to determine whether the independent task has a corresponding subtask according to the task topology table when the independent task is successfully operated;

a parent task determining module 73, adapted to determine, according to the task topology table, a parent task of a child task corresponding to the independent task if the independent task has a corresponding child task;

a judging module 74, adapted to judge whether a parent task of a child task corresponding to the independent task has been successfully executed; and if so, running the subtasks corresponding to the independent tasks.
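The cooperation of the four modules can be sketched as a single scheduling loop. The sketch assumes the task topology table is a mapping from each task to the list of its parent tasks, that every task appears as a key, and that `run` is a caller-supplied function returning whether the task succeeded; `run_topology` is a hypothetical name.

```python
from collections import defaultdict


def run_topology(topology: dict, run) -> list:
    """Run independent tasks first, then each child once all its parents succeeded."""
    children = defaultdict(list)
    for task, parents in topology.items():
        for parent in parents:
            children[parent].append(task)

    succeeded = set()
    scheduled = set()
    order = []
    # Independent tasks are those without any parent task.
    ready = [task for task, parents in topology.items() if not parents]
    scheduled.update(ready)
    while ready:
        task = ready.pop(0)
        order.append(task)
        if not run(task):
            continue  # children of a failed task are never released
        succeeded.add(task)
        for child in children[task]:
            all_parents_ok = all(p in succeeded for p in topology[child])
            if all_parents_ok and child not in scheduled:
                scheduled.add(child)
                ready.append(child)
    return order
```

The user never has to decide when a dependent task may start: a child is released automatically at the moment its last parent succeeds.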

Optionally, the apparatus further comprises:

the setting module is suitable for presetting a plurality of task dimensions and defining the size relationship among the task dimensions, wherein the task dimensions are used for describing the time information of task execution;

the judging module is specifically adapted to: determine the task dimension of the subtask and the task dimension of the parent task, and judge, according to these two task dimensions, whether the parent task of the subtask corresponding to the independent task has run successfully.

Optionally, the task dimension of the subtask is smaller than or equal to the task dimension of the parent task; wherein,

when the task dimension of the subtask is smaller than the task dimension of the parent task, the judging module is specifically adapted to:

calculating theoretical operation times corresponding to the parent task according to the life cycle of the parent task and the task dimension of the parent task;

judging whether the actual running times of the parent task are matched with the theoretical running times or not; and if so, determining that the parent task is successfully operated.

Optionally, the plurality of task dimensions comprises at least one of: a minute dimension, an hour dimension, a day dimension, a week dimension, and a month dimension; and, the month dimension is less than the week dimension, the week dimension is less than the day dimension, the day dimension is less than the hour dimension, and the hour dimension is less than the minute dimension.
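The dimension-based check can be made concrete with a short sketch. It follows the ordering stated above, in which a coarser period is the smaller dimension (day is smaller than hour). Two assumptions are made for illustration: the life cycle of the parent task is taken to be one period of the subtask, and the month dimension is left out because its length varies. All names are hypothetical.

```python
# Period of each dimension in minutes (month omitted: its length is not fixed).
DIMENSION_MINUTES = {"minute": 1, "hour": 60, "day": 1440, "week": 10080}


def is_smaller_dimension(a: str, b: str) -> bool:
    """True if dimension a is smaller than b, i.e. a has the longer period."""
    return DIMENSION_MINUTES[a] > DIMENSION_MINUTES[b]


def theoretical_runs(parent_dimension: str, life_cycle_minutes: int) -> int:
    """Number of times the parent should have run during its life cycle."""
    return life_cycle_minutes // DIMENSION_MINUTES[parent_dimension]


def parent_succeeded(child_dimension: str, parent_dimension: str, actual_runs: int) -> bool:
    """Compare the parent's actual run count with its theoretical run count."""
    if is_smaller_dimension(parent_dimension, child_dimension):
        raise ValueError("subtask dimension must not be larger than the parent's")
    # Assumption: the parent's life cycle equals one period of the subtask.
    life_cycle = DIMENSION_MINUTES[child_dimension]
    return actual_runs == theoretical_runs(parent_dimension, life_cycle)
```

For a day-dimension subtask that depends on an hour-dimension parent, the parent counts as successful only after it has run 24 times.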

Optionally, the apparatus further comprises:

the topology management module is suitable for setting the topology state to the suspended state when receiving a topology suspension message triggered through a preset topology suspension entry; and/or restoring the topology state from the suspended state to the activated state when receiving a topology recovery message triggered through a preset topology recovery entry;

the subtask determination module is specifically adapted to:

when the independent task is successfully operated, inquiring whether the topological state is an activated state;

and if so, determining whether the independent task has a corresponding subtask according to the task topology table.
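The topology suspension and recovery behaviour amounts to a gate in front of the subtask lookup. In the sketch below the message names `"suspend"` and `"recover"`, the class `TopologyState` and the function `subtasks_to_check` are illustrative assumptions.

```python
class TopologyState:
    """Tracks whether the topology is activated or suspended."""

    def __init__(self):
        self.active = True

    def on_message(self, message: str) -> None:
        if message == "suspend":
            self.active = False
        elif message == "recover":
            self.active = True


def subtasks_to_check(state: TopologyState, finished_task: str, children: dict) -> list:
    """Look up subtasks of a finished task only while the topology is activated."""
    if not state.active:
        return []
    return children.get(finished_task, [])
```

While the topology is suspended, a successfully finished independent task triggers no downstream tasks; the lookup returns results again once the recovery message arrives.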

Optionally, the operating module is specifically adapted to:

determining, according to a preset task topology table, the tasks to be executed that do not depend on any parent task as independent tasks;

initializing the independent task and applying for a corresponding resource for the independent task;

and sending the independent task to one of a plurality of task execution modules, wherein the task execution module is responsible for running the independent task.

Optionally, the judging module is specifically adapted to:

and when the parent tasks of the sub tasks corresponding to the independent tasks are multiple, judging whether each parent task of the sub task is successfully operated one by one.

Optionally, the operating module is specifically adapted to:

when the number of the independent tasks is multiple, determining the execution sequence among the independent tasks according to the task priority, and sequentially operating the independent tasks according to the execution sequence among the independent tasks; wherein, the task priority of each task is set according to a preset priority setting rule;

the priority setting rule includes at least one of:

dynamically detecting the waiting time of each task, and adjusting the task priority of the corresponding task according to the waiting time;

the preset value range of the priority value comprises a first interval and a second interval, wherein the second interval is larger than the first interval and corresponds to a preset type of task.
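The two priority setting rules can be combined in a short sketch. Several details are assumptions made for illustration: a larger value means earlier execution, the first interval is 0 to 99 and the second 100 to 199, the priority rises by one per minute of waiting, and a task never leaves its own interval. The function names are hypothetical.

```python
FIRST_INTERVAL = range(0, 100)     # ordinary tasks (assumed bounds)
SECOND_INTERVAL = range(100, 200)  # reserved for the preset task type (assumed bounds)


def effective_priority(base: int, waited_seconds: float, boost_per_minute: int = 1) -> int:
    """Raise the priority with waiting time, capped at the top of the task's interval."""
    interval = SECOND_INTERVAL if base in SECOND_INTERVAL else FIRST_INTERVAL
    boosted = base + int(waited_seconds // 60) * boost_per_minute
    return min(boosted, interval[-1])


def execution_order(tasks: list, now: float) -> list:
    """Order independent tasks from highest to lowest effective priority."""
    ranked = sorted(
        tasks,
        key=lambda t: effective_priority(t["priority"], now - t["submitted"]),
        reverse=True,
    )
    return [t["name"] for t in ranked]
```

Capping the boost at the interval boundary keeps long-waiting ordinary tasks from overtaking the preset task type, while still letting them overtake newer ordinary tasks.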

The task scheduling device may be the aforementioned task scheduling module, and the specific implementation of each module in this embodiment may refer to the description of the corresponding step in the third embodiment, which is not limited herein.

According to an embodiment of the present invention, a non-volatile computer storage medium is provided, where at least one executable instruction is stored in the computer storage medium, and the computer executable instruction may perform the task recovery method in any of the above method embodiments.

Fig. 8 is a schematic structural diagram of a computing device according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the computing device.

As shown in fig. 8, the computing device may include: a processor 802, a communication interface 804, a memory 806, and a communication bus 808.

Wherein:

the processor 802, communication interface 804, and memory 806 communicate with one another via a communication bus 808.

The communication interface 804 is used for communicating with network elements of other devices, such as clients or other servers.

The processor 802 is configured to execute the program 810, and may specifically execute the relevant steps in the foregoing task scheduling method embodiments.

In particular, the program 810 may include program code comprising computer operating instructions.

The processor 802 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the invention. The computing device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.

The memory 806 stores a program 810. The memory 806 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.

The program 810 may be specifically configured to cause the processor 802 to perform the operations of the task scheduling method described above.

The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.

In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.

Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.

The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components in a task scheduling apparatus according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering; these words may be interpreted as names.

The invention also discloses: A1. A task scheduling method, comprising the following steps:

running an independent task in the tasks to be executed according to a preset task topology table; the task topology table is used for storing the dependency relationship among tasks to be executed;

when the independent task runs successfully, determining whether the independent task has a corresponding subtask according to the task topology table;

if the independent task has a corresponding subtask, determining a parent task of the subtask corresponding to the independent task according to the task topology table;

judging whether the parent task of the subtask corresponding to the independent task has run successfully; and if so, running the subtask corresponding to the independent task.
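As a non-limiting illustration, the steps of A1 can be sketched in Python as follows. The sketch assumes that the task topology table is represented as a dictionary mapping each task name to the list of its parent task names, that every task appears as a key, and that a caller-supplied function reports whether a task ran successfully. The names `schedule`, `topology` and `run` are hypothetical and are not taken from the embodiments.

```python
from collections import deque


def schedule(topology, run):
    """Run tasks in dependency order and return those that succeeded.

    topology: dict mapping each task name to a list of its parent task
              names (assumed layout; every task must appear as a key).
    run:      callable taking a task name and returning True on success.
    """
    # Derive the parent -> subtasks lookup from the topology table.
    children = {task: [] for task in topology}
    for task, parents in topology.items():
        for parent in parents:
            children[parent].append(task)

    succeeded = set()
    order = []
    # Independent tasks are the tasks that have no parent task.
    queue = deque(task for task, parents in topology.items() if not parents)
    while queue:
        task = queue.popleft()
        if not run(task):
            continue  # a failed task never releases its subtasks
        succeeded.add(task)
        order.append(task)
        for child in children[task]:
            # Run a subtask only once all of its parent tasks have succeeded.
            if all(parent in succeeded for parent in topology[child]):
                queue.append(child)
    return order
```

For example, with `topology = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"]}`, task `c` is released only after both `a` and `b` have succeeded, and `d` only after `c`.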

A2. The method of A1, wherein the method further comprises: presetting a plurality of task dimensions and defining the size relationship among the task dimensions, wherein the task dimensions are used for describing the time information of task execution;

the judging whether the parent task of the subtask corresponding to the independent task has run successfully comprises:

determining the task dimension of the subtask and the task dimension of the parent task, and judging, according to the task dimension of the subtask and the task dimension of the parent task, whether the parent task of the subtask corresponding to the independent task has run successfully.

A3. The method of A2, wherein the task dimension of the subtask is less than or equal to the task dimension of the parent task; wherein,

when the task dimension of the subtask is smaller than the task dimension of the parent task, the determining whether the parent task of the subtask corresponding to the independent task has been successfully executed includes:

A4. The method of A2 or A3, wherein the plurality of task dimensions includes at least one of: a minute dimension, an hour dimension, a day dimension, a week dimension, and a month dimension; and the month dimension is less than the week dimension, the week dimension is less than the day dimension, the day dimension is less than the hour dimension, and the hour dimension is less than the minute dimension.
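As a non-limiting illustration, the size relationship of A4 and the constraint of A3 can be sketched in Python as follows. The numeric values are an assumption chosen only so that the ordering month < week < day < hour < minute holds. The names `TaskDimension` and `is_valid_dependency` are hypothetical.

```python
from enum import IntEnum


class TaskDimension(IntEnum):
    # Only the relative order is taken from the text; the values are assumed.
    MONTH = 1
    WEEK = 2
    DAY = 3
    HOUR = 4
    MINUTE = 5


def is_valid_dependency(subtask_dim, parent_dim):
    """Check the A3 constraint: the subtask's dimension must be less than
    or equal to the parent task's dimension."""
    return subtask_dim <= parent_dim
```

Under this ordering a day-dimension subtask may depend on an hour-dimension parent task, whereas a minute-dimension subtask may not depend on an hour-dimension parent task.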

A5. The method according to any one of A1-A4, wherein the method further comprises:

when receiving a topology suspension message triggered by a preset topology suspension entry, setting the topology state to a suspended state; and/or when receiving a topology recovery message triggered by a preset topology recovery entry, recovering the topology state from the suspended state to the activated state;

the determining, when the independent task runs successfully, whether the independent task has a corresponding subtask according to the task topology table includes:

A6. The method according to any one of A1-A5, wherein the running of the independent task in the tasks to be executed according to the preset task topology table includes:

A7. The method according to any one of A1-A6, wherein the judging whether the parent task of the subtask corresponding to the independent task has run successfully includes:

A8. The method according to any one of A1-A7, wherein the running of the independent task in the tasks to be executed according to the preset task topology table includes:

the priority setting rule includes at least one of:

B9. A task scheduling apparatus comprising:

an operation module, adapted to run an independent task in the tasks to be executed according to a preset task topology table; the task topology table is used for storing the dependency relationship among tasks to be executed;

a subtask determining module, adapted to determine whether the independent task has a corresponding subtask according to the task topology table when the independent task runs successfully;

a parent task determining module, adapted to determine a parent task of the subtask corresponding to the independent task according to the task topology table if the independent task has a corresponding subtask;

a judging module, adapted to judge whether the parent task of the subtask corresponding to the independent task has run successfully; and if so, to run the subtask corresponding to the independent task.

B10. The apparatus of B9, wherein the apparatus further comprises:

B11. The apparatus of B10, wherein the task dimension of the subtask is less than or equal to the task dimension of the parent task; wherein,

B12. The apparatus of B10 or B11, wherein the plurality of task dimensions includes at least one of: a minute dimension, an hour dimension, a day dimension, a week dimension, and a month dimension; and the month dimension is less than the week dimension, the week dimension is less than the day dimension, the day dimension is less than the hour dimension, and the hour dimension is less than the minute dimension.

B13. The apparatus of any one of B9-B12, wherein the apparatus further comprises:

the subtask determining module is specifically adapted to:

B14. The apparatus according to any one of B9-B13, wherein the operation module is specifically adapted to:

B15. The apparatus according to any one of B9-B14, wherein the judging module is specifically adapted to:

B16. The apparatus according to any one of B9-B15, wherein the operation module is specifically adapted to:

the priority setting rule includes at least one of:

C17. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;

the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operations corresponding to the task scheduling method according to any one of A1-A8.

D18. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the task scheduling method of any one of A1-A8.

Claims

1. A task scheduling method, comprising the following steps:

2. The method of claim 1, wherein the method further comprises: presetting a plurality of task dimensions and defining the size relationship among the task dimensions, wherein the task dimensions are used for describing the time information of task execution;

3. The method of claim 2, wherein the task dimension of the subtask is less than or equal to the task dimension of the parent task; wherein,

4. The method of claim 2 or 3, wherein the plurality of task dimensions comprises at least one of: a minute dimension, an hour dimension, a day dimension, a week dimension, and a month dimension; and, the month dimension is less than the week dimension, the week dimension is less than the day dimension, the day dimension is less than the hour dimension, and the hour dimension is less than the minute dimension.

5. The method of any of claims 1-4, wherein the method further comprises:

6. The method according to any one of claims 1 to 5, wherein the running of the independent task of the tasks to be executed according to the preset task topology table comprises:

7. The method according to any one of claims 1 to 6, wherein the judging whether the parent task of the subtask corresponding to the independent task has run successfully comprises:

8. A task scheduling apparatus comprising:

9. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;

the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the task scheduling method according to any one of claims 1-7.

10. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the task scheduling method of any one of claims 1-7.