CN112130966A - Task scheduling method and system - Google Patents

Publication number
CN112130966A
Authority
CN
China
Prior art keywords
task
information
execution
current
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910550364.XA
Other languages
Chinese (zh)
Inventor
江鹤
赵鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a task scheduling method and a task scheduling system, and relates to the technical field of computers. One embodiment of the method comprises: acquiring attribute information and upstream and downstream dependency information of a current task; determining global task dependency data according to the attribute information and the upstream and downstream dependency information; the global task dependency data comprises nodes for representing the current task, and the nodes comprise predicted execution time information determined for the current task; sending the global task dependency data to a resource manager; submitting the current task to a resource manager; in a resource manager, allocating resources to the current task using the global task dependency data. The implementation mode can realize the dynamic resource scheduling and the optimal task execution mode of the distributed system.

Description

Task scheduling method and system
Technical Field
The invention relates to the technical field of computers, in particular to a task scheduling method and system.
Background
In the distributed system architecture Hadoop, YARN (Yet Another Resource Negotiator) serves as the resource manager and provides unified resource management and scheduling for upper-layer applications. In practice, when a user schedules a task, the user first maintains an execution queue manually, sets the upstream and downstream task dependencies and the execution rules, and then submits the task to YARN, which allocates resources according to the configured information and then executes the task.
In the prior art, the only relationship between a submitted task and YARN is that of resource application and resource allocation. The task itself cannot influence YARN's resource scheduling process, and YARN cannot know whether the current execution mode of a task is optimal. YARN therefore cannot dynamically adjust and optimize resources according to specific rules or the upstream and downstream dependencies of the tasks, which leads to the following problems:
1. As the number of tasks grows and execution rules are set unreasonably, the queue resources of YARN may be in short supply during one time period and idle, and thus not fully utilized, during another.
2. As task dependencies become increasingly complex, setting them manually has obvious limitations. Because YARN cannot obtain the global dependency among tasks, it cannot achieve dynamic task scheduling and resource allocation when multiple batches of tasks are submitted continuously, so resource utilization is difficult to improve.
Disclosure of Invention
In view of this, embodiments of the present invention provide a task scheduling method and system, which can collect attribute information and upstream and downstream dependency information of a task before the task is submitted to a resource manager, generate global task dependency data, and send the global task dependency data to the resource manager for use in scheduling the task, thereby implementing dynamic resource scheduling and an optimal task execution mode.
To achieve the above object, according to one aspect of the present invention, a task scheduling method is provided.
The task scheduling method is used for submitting at least one current task to a resource manager which distributes resources for the task in a distributed system; the method comprises the following steps: acquiring attribute information and upstream and downstream dependency information of the current task; determining global task dependency data according to the attribute information and the upstream and downstream dependency information; the global task dependency data comprises nodes for representing the current task, and the nodes comprise predicted execution time information determined for the current task; sending the global task dependency data to a resource manager; and submitting the current task to a resource manager; in a resource manager, allocating resources to the current task using the global task dependency data.
Optionally, the global task dependency data characterizes the global dependency between the current task and tasks already submitted to the resource manager; the attribute information includes the execution queue information and the execution rule information of the current task; and the node contains the execution queue information of the current task.
Optionally, the method further comprises: acquiring data of any one of the current tasks in preset dimensions, and inputting the data into a pre-trained execution duration prediction model to obtain an execution duration prediction value of that task, the preset dimensions being related to the execution duration of the task. Determining global task dependency data according to the attribute information and the upstream and downstream dependency information then specifically comprises: determining the global task dependency data according to the execution queue information, the execution rule information, the upstream and downstream dependency information and the execution duration prediction value of the at least one current task. The predicted execution time information of any current task is determined according to the execution queue information, the execution rule information and the execution duration prediction value of that task, together with the predicted execution time information of its upstream task and/or the predicted execution time information of its downstream task.
Optionally, the method further comprises: acquiring specific rule information configured for a current task before determining the global task dependency data; in the global task dependency data, a node representing a current task configured with specific rule information contains that specific rule information.
Optionally, the specific rule information includes specified execution time information configured for the current task; and allocating resources to the current task using the global task dependency data specifically includes: determining, by using the global dependency, a task to be executed among the current tasks submitted to the resource manager, and matching the task to be executed with one node in the global task dependency data; and judging whether the node contains specific rule information: if so, allocating resources to the task to be executed according to the specified execution time information of the node; otherwise, allocating resources to the task to be executed according to the predicted execution time information of the node.
Optionally, the global task dependency data is linked list data; and the preset dimensions comprise at least one of the following: submission unit identifier, current resource information of the submission unit, monitoring conditions, task name, task type, task owner identifier, the execution rules, running account, identifier of the business to which the task belongs, upstream task data volume and its change information, current cluster state information, and task script modification information.
To achieve the above object, according to another aspect of the present invention, there is provided a task scheduling system.
The task scheduling system of the embodiment of the invention is used for submitting at least one current task to a resource manager that allocates resources for tasks in a distributed system. The task scheduling system comprises: a task submission system, a data acquisition system, a real-time computing system, and an execution system running in the resource manager. The data acquisition system is configured to acquire attribute information and upstream and downstream dependency information of the current task. The real-time computing system is configured to determine global task dependency data according to the attribute information and the upstream and downstream dependency information, and to send the global task dependency data to the resource manager; the global task dependency data comprises nodes representing the current task, and the nodes comprise predicted execution time information determined for the current task. The task submission system is configured to submit the current task to the resource manager. The execution system is configured to allocate resources to the current task using the global task dependency data.
Optionally, the global task dependency data characterizes the global dependency between the current task and tasks already submitted to the resource manager; the attribute information includes the execution queue information and the execution rule information of the current task; and the node contains the execution queue information of the current task. The task scheduling system further comprises a prediction system, which is configured to acquire data of any one of the current tasks in preset dimensions, input the data into a pre-trained execution duration prediction model to obtain an execution duration prediction value of that task, and send the execution duration prediction value to the real-time computing system, the preset dimensions being related to the execution duration of the task. The real-time computing system is further configured to determine the global task dependency data according to the execution queue information, the execution rule information, the upstream and downstream dependency information and the execution duration prediction value of the at least one current task. The predicted execution time information of any current task is determined according to the execution queue information, the execution rule information and the execution duration prediction value of that task, together with the predicted execution time information of its upstream task and/or the predicted execution time information of its downstream task.
Optionally, the data acquisition system is further configured to acquire specific rule information configured for a current task before the global task dependency data is determined; the specific rule information comprises specified execution time information configured for the current task; in the global task dependency data, a node representing a current task configured with specific rule information contains that specific rule information. The execution system is further configured to: determine, by using the global dependency, a task to be executed among the current tasks submitted to the resource manager, and match it with one node in the global task dependency data; and judge whether the node contains specific rule information: if so, allocate resources to the task to be executed according to the specified execution time information of the node; otherwise, allocate resources to the task to be executed according to the predicted execution time information of the node.
Optionally, the task scheduling system further includes: a cache unit or a data bus for data transmission among the data acquisition system, the prediction system and the real-time computing system, and a message queue for data transmission between the real-time computing system and the resource manager. The global task dependency data is linked list data, and the preset dimensions comprise at least one of the following: submission unit identifier, current resource information of the submission unit, monitoring conditions, task name, task type, task owner identifier, the execution rules, running account, identifier of the business to which the task belongs, upstream task data volume and its change information, current cluster state information, and task script modification information.
To achieve the above object, according to still another aspect of the present invention, there is provided a computer-readable storage medium.
A computer-readable storage medium of the present invention has stored thereon a computer program which, when executed by a processor, implements the task scheduling method provided by the present invention.
According to the technical scheme of the invention, one embodiment of the invention has the following advantages or beneficial effects:
First, in the prior art, submitting a task to the resource manager is a one-way process, and the resource manager cannot grasp the actual state of the task in real time. To address this defect, the invention collects the attribute information and the upstream and downstream dependency information of the tasks before they are submitted, generates global task dependency data from this information, and sends the data to the resource manager for use in task scheduling. The resource manager can thus grasp the global dependency of the tasks and know the predicted execution time information of each task in its execution queue, which enables dynamic resource scheduling and optimization of the task execution mode, ensures task execution efficiency, and improves the resource utilization of the distributed system.
Second, in order to obtain more accurate global task dependency data, the invention trains an execution duration prediction model in advance on multi-dimensional data of the tasks, such as the submission unit identifier, the monitoring conditions, the task name and the task type, and uses the model before a task is submitted to obtain an execution duration prediction value for computing the global task dependency data. In addition, for tasks that require a customized execution mode, the invention supports configuring specific rule information for the task; this information is retained in the global task dependency data and takes priority when the resource manager executes the task, thereby meeting practical application requirements.
Further effects of the above optional, non-conventional implementations are described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram illustrating the main steps of a task scheduling method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating specific steps performed by a task scheduling method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating an architecture of a task scheduling system according to an embodiment of the present invention.
Description of reference numerals:
300-resource manager, 301-task submission system, 302-data acquisition system, 303-real-time computing system, 304-execution system, 305-prediction system, 306-cache unit or data bus, 307-message queue.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments of the present invention and the technical features of the embodiments may be combined with each other without conflict.
Fig. 1 is a schematic diagram of main steps of a task scheduling method according to an embodiment of the present invention.
As shown in fig. 1, the task scheduling method according to the embodiment of the present invention may be specifically executed according to the following steps:
Step S101: acquire attribute information and upstream and downstream dependency information of the current task.
In the embodiment of the invention, at least one current task needs to be submitted to a resource manager in the distributed system, and the resource manager allocates resources for each current task to execute. In practical application, the distributed system may be a Hadoop architecture, or may be other architectures. In the Hadoop system, the resource manager may be YARN, or may be other platforms. In this step, the current task refers to a task waiting to be submitted to the resource manager.
In an embodiment of the present invention, the attribute information may include execution queue information and execution rule information of the current task. The execution queue is divided by the distributed system, and provides resources for different services. The execution rule may be various rules configured for the task in advance, for example, to instruct the task to run at a specific time, or run within a certain time period, or run once every certain time. Generally, the execution rule is not a mandatory rule, and the actual execution of the task also needs to consider the execution situation of the upstream and downstream tasks, the current resource situation of the execution queue, and other factors.
It can be understood that, for a certain task, the execution process often needs to depend on the execution results of other tasks, and the execution results can also be used as a premise for executing other tasks, and for the task, the former is the upstream task, and the latter is the downstream task. In this step, the upstream and downstream dependency information of the current task refers to the upstream task information and the downstream task information of the current task.
As a preferred solution, data of the current task in preset dimensions may also be obtained in this step, for use in the execution duration prediction described later. The preset dimensions are all dimensions related to the execution duration of the task, for example one or more of the following: submission unit identifier, current resource information of the submission unit, monitoring conditions, task name, task type, task owner identifier, execution rules, running account, identifier of the business to which the task belongs, upstream task data volume and its change information, current cluster state information, and task script modification information. Here, a submission unit refers to a device for receiving tasks and submitting them to the resource manager. The monitoring conditions may include a timeout setting, which stops the execution of a task that has timed out by more than a threshold. The running account refers to the account used when a task is submitted, and the current cluster state information refers to the current resource state information of the distributed cluster. Preferably, variable information passed in during execution of the task, such as the variable name, variable type, value and description, may also be obtained in this step; this variable information can be used to find the cause when a subsequently predicted execution duration deviates.
In an actual scene, an application program for collecting data in real time can be established to acquire the attribute information, the upstream and downstream dependency information, the preset dimension data and the variable information of the task. After the application program collects the information, the information can be sent to a data bus or a cache unit in real time in a data stream mode for caching.
Step S102: determine global task dependency data according to the attribute information and the upstream and downstream dependency information, and send the global task dependency data to the resource manager.
In this step, the global task dependency data may be data for the current task, or may be data for the current task and a task that has been submitted to the resource manager in the distributed system. The global task dependency data may be one or more linked list data that may characterize global dependencies between the plurality of tasks (i.e., all dependencies between the plurality of tasks). It is to be understood that the above global dependency may be obtained by the upstream and downstream dependency information of each task. Generally, the global task dependency data is formed by connecting a plurality of nodes with each other, each node represents a task, and the connection between the nodes is a directed connection and represents the dependency relationship between the tasks.
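The node-and-directed-connection structure described above can be pictured with a minimal Python sketch. The class name `TaskNode`, its field names, and the sample task names are hypothetical and chosen only for illustration; the source does not prescribe a concrete representation beyond linked list data whose nodes represent tasks.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaskNode:
    """One node of the global task dependency data (field names are hypothetical)."""
    task_id: str
    queue: str                                      # execution queue information
    upstream: List[str] = field(default_factory=list)
    downstream: List[str] = field(default_factory=list)
    predicted_start: Optional[float] = None         # predicted execution time information


def build_dependency_data(tasks: Dict[str, dict]) -> Dict[str, TaskNode]:
    """Create one node per task and add a directed link for each upstream dependency."""
    nodes = {tid: TaskNode(tid, info["queue"]) for tid, info in tasks.items()}
    for tid, info in tasks.items():
        for up in info.get("upstream", []):
            if up not in nodes:
                continue  # upstream task not known in this batch; ignored in this sketch
            nodes[tid].upstream.append(up)
            nodes[up].downstream.append(tid)
    return nodes


# Hypothetical sample tasks with their upstream dependency information.
tasks = {
    "extract": {"queue": "etl", "upstream": []},
    "transform": {"queue": "etl", "upstream": ["extract"]},
    "report": {"queue": "bi", "upstream": ["transform"]},
}
graph = build_dependency_data(tasks)
```

Each task only declares its upstream tasks here; the downstream links are derived, which is one way the global dependency can be obtained from per-task dependency information.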
Preferably, each node in the global task dependency data also contains one or more types of data, including the execution queue information of the task corresponding to the node and the predicted execution time information determined for that task. Specifically, the predicted execution time information of a task indicates the optimal execution time or optimal execution time period of the task; when no mandatory execution time requirement is configured for the task, the resource manager can execute the task according to this information. The predicted execution time information of a task can be calculated from the execution queue information and execution rule information of the task and the predicted execution time information of its upstream and downstream tasks, where the latter may be that of the upstream task only, that of the downstream task only, or both. In a specific application, a real-time computing engine can be established in advance to compute the global task dependency data.
As a preferred scheme, the execution duration of each current task can be predicted, and the execution duration prediction value can be loaded into the real-time computing engine to compute the global task dependency data, so that more accurate global task dependency data is obtained. In a specific application, an execution duration prediction model can be established based on a multiple linear regression algorithm and trained on historical task data, which may include several of the following: submission unit identifier, current resource information of the submission unit, monitoring conditions, task name, task type, task owner identifier, execution rules, running account, identifier of the business to which the task belongs, upstream task data volume and its change information, current cluster state information (namely the state information of the distributed cluster when the historical task was executed), task script modification information, task execution time, task ending time, execution duration, and completion information. In a specific application, these data are continuously accumulated, processed through incremental iteration and offline computation, and input into the execution duration prediction model after data cleaning, processing and conversion, so that the model is trained and the key data influencing task execution duration is extracted. The trained execution duration prediction model can also be continuously optimized and refined as new historical task data accumulates. Finally, the trained model data can be written into a high-performance distributed storage system. It is understood that the execution duration prediction model may also be established using algorithms other than linear regression, and the present invention is not limited in this respect.
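As an illustration of the multiple linear regression idea, the following sketch fits an execution duration model by ordinary least squares using only the Python standard library. The two features (upstream data volume and cluster load), the sample values, and the function names are assumptions made for this example; the source does not specify feature encoding or training procedure.

```python
from typing import List


def fit_linear_regression(rows: List[List[float]], targets: List[float]) -> List[float]:
    """Ordinary least squares via the normal equations; returns [bias, w1, w2, ...]."""
    X = [[1.0] + list(r) for r in rows]           # prepend a constant term for the bias
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * t for x, t in zip(X, targets))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict_duration(weights: List[float], features: List[float]) -> float:
    """Apply the fitted model to one task's feature vector."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))


# Made-up history: [upstream data volume in GB, cluster load in 0..1] -> minutes.
history = [[10.0, 0.2], [20.0, 0.5], [5.0, 0.9], [40.0, 0.1]]
durations = [27.0, 50.0, 24.0, 86.0]

weights = fit_linear_regression(history, durations)
estimate = predict_duration(weights, [30.0, 0.4])
```

Categorical dimensions such as task type or running account would need to be encoded numerically before a model of this kind could use them; that step is omitted here.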
In the embodiment of the invention, after the preset dimension data of the current tasks is collected, the data can be input into the trained execution duration prediction model to obtain the execution duration prediction value of each current task. Preferably, when the global task dependency data is computed in this step, the execution duration prediction value is input into the real-time computing engine and used together with the execution queue information, the execution rule information and the upstream and downstream dependency information, so as to improve the accuracy of the global task dependency data. Specifically, in the global task dependency data, the predicted execution time information of each node may be calculated from the execution queue information, the execution rule information and the execution duration prediction value of the node's task, and the predicted execution time information of the tasks upstream and downstream of that task, where the latter may be that of the upstream task only, that of the downstream task only, or both.
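One simple way to combine these inputs is to walk the tasks in dependency order and start each task as soon as its upstream tasks are predicted to finish, but no earlier than its execution rule allows. The sketch below makes exactly that assumption; it ignores execution queue capacity and downstream constraints, and the function name, the minute-based time scale, and the sample values are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def predict_start_times(upstream: Dict[str, List[str]],
                        duration: Dict[str, float],
                        earliest_allowed: Dict[str, float]) -> Dict[str, float]:
    """Return a predicted start time (minutes from now) for every task.

    upstream maps each task to the tasks it depends on, duration holds the
    execution duration prediction values, and earliest_allowed is a simplified
    stand-in for execution rules such as "do not run before minute 120".
    """
    start: Dict[str, float] = {}
    # static_order() yields every task after all of its upstream tasks.
    for task in TopologicalSorter(upstream).static_order():
        upstream_done = max(
            (start[u] + duration[u] for u in upstream.get(task, ())),
            default=0.0,
        )
        start[task] = max(upstream_done, earliest_allowed.get(task, 0.0))
    return start


upstream = {"extract": [], "transform": ["extract"], "report": ["transform"]}
duration = {"extract": 30.0, "transform": 45.0, "report": 10.0}
earliest_allowed = {"report": 120.0}

start_times = predict_start_times(upstream, duration, earliest_allowed)
```

In this example the report task could start at minute 75 on dependencies alone, but its rule pushes the predicted start to minute 120. A cyclic dependency would make `static_order()` raise `graphlib.CycleError`.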
In practical applications, a task often must be forced to execute according to a preset specific rule, for example to run at a certain time, which requires providing an entry point for configuring specific rules for the task. In the embodiment of the invention, a rule base can be provided for configuring specific rule information for tasks, and the specific rule information can be collected in real time and input into the real-time computing engine to take part in the computation of the global task dependency data. In the computed global task dependency data, a node representing a current task configured with specific rule information contains that specific rule information. Illustratively, the specific rule information may include specified execution time information configured for the current task, which may be a point in time or a time period. Unlike the predicted execution time information, the specified execution time information is a mandatory requirement for task execution.
In the embodiment of the invention, the computed global task dependency data can be cached in real time and sent to the resource manager via a message queue.
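The hand-off through a message queue can be sketched as follows. Python's in-process `queue.Queue` stands in for a real message broker, and the JSON message layout, field names and function names are assumptions for illustration only; the source does not name a queue product or a wire format.

```python
import json
import queue
from typing import Dict, List

# In-process stand-in for the message queue between the real-time
# computing system and the resource manager.
message_queue: "queue.Queue[str]" = queue.Queue()


def publish_dependency_data(nodes: List[dict], mq: "queue.Queue[str]") -> None:
    """Producer side: serialize the nodes and put them on the queue."""
    mq.put(json.dumps({"type": "global_task_dependency_data", "nodes": nodes}))


def receive_dependency_data(mq: "queue.Queue[str]") -> Dict[str, dict]:
    """Consumer side: read one message and index its nodes by task id."""
    message = json.loads(mq.get_nowait())
    return {node["task_id"]: node for node in message["nodes"]}


nodes = [
    {"task_id": "extract", "queue": "etl", "upstream": [], "predicted_time": 0.0},
    {"task_id": "transform", "queue": "etl", "upstream": ["extract"], "predicted_time": 30.0},
]
publish_dependency_data(nodes, message_queue)
index = receive_dependency_data(message_queue)
```

Indexing the received nodes by task identifier lets the resource manager side look up the node for a submitted task directly.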
Step S103: submitting the current task to a resource manager; in the resource manager, resources are allocated to the current task using global task dependency data.
In this step, the submission of the current task may be performed before, after, or simultaneously with the "sending of the global task dependency data to the resource manager". After submitting the current task to the resource manager, the resource manager may sequentially select the task to be executed from the submitted current task according to the global dependency relationship between the tasks in the global task dependency data, and match the task to be executed with one node in the global task dependency data (i.e., match the identifier of the task to be executed with the node identifier in the global task dependency data according to a preset correspondence). And then, judging whether the matched nodes contain specific rule information: if so, allocating resources to the task to be executed according to the appointed execution time information of the node for further execution (when the appointed execution time information is time information, indicating that the task to be executed must be executed at the time, and when the appointed execution time information is time slot information, arranging the task to be executed in the time slot for execution according to the resource condition of the execution queue); otherwise, allocating resources to the task to be executed for execution according to the expected execution time information of the node (when the expected execution time information is time information, if the resources of the execution queue at the time are sufficient, the task to be executed can be indicated to be executed at the time; when the specified execution time information is time slot information, the task to be executed can be arranged to be executed in the time slot according to the resource condition of the execution queue). 
If a task cannot be matched with any node in the global task dependency data, the task was not submitted to the resource manager through this method; in that case, resources can be allocated to the task in the manner of the prior art before it is executed.
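The matching and allocation decision described above can be sketched as follows. This is a minimal sketch under stated assumptions: the `Node` class, the `plan_allocation` function and the returned labels are hypothetical, the "preset correspondence" is assumed to be identity between task identifier and node identifier, and the actual reservation of queue resources is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union

# A time point is a single number; a time slot is a (start, end) pair.
TimeInfo = Union[float, Tuple[float, float]]


@dataclass
class Node:
    task_id: str
    predicted_time: TimeInfo
    # Present only when specific rule information was configured.
    specified_time: Optional[TimeInfo] = None


def plan_allocation(
    task_id: str, nodes: Dict[str, Node]
) -> Tuple[str, Optional[TimeInfo]]:
    """Decide which time information drives allocation for one task."""
    node = nodes.get(task_id)  # assumed: task id equals node id
    if node is None:
        # No matching node: fall back to the conventional allocation path.
        return ("fallback", None)
    if node.specified_time is not None:
        # Specific rule information takes priority.
        return ("specified", node.specified_time)
    return ("predicted", node.predicted_time)
```

A caller would then interpret a single number as a required or preferred time point and a pair as a time slot within which the task is placed according to queue resources.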
With the above arrangement, the resource manager can grasp the actual state of the current task through the global task dependency data and thereby schedule the current task in an optimized manner. Meanwhile, when task information changes or newly added tasks continue to be submitted to the resource manager, the global task dependency data can be dynamically updated, so that the resource manager can grasp the global execution condition of the tasks and realize dynamic resource scheduling.
Fig. 2 is a schematic diagram illustrating specific execution steps of a task scheduling method according to an embodiment of the present invention. As shown in fig. 2, the specific execution steps of the task scheduling method are as follows. In step S201, the attribute information, the upstream and downstream dependency information, and the current data of the prediction dimensions of the current task are collected in real time. In step S202, the current data of the prediction dimensions is input into the execution duration prediction model to obtain the execution duration prediction value of the current task. The execution duration prediction model is trained in advance (step S200) using the historical data of the prediction dimensions and the historical running information of tasks (i.e., the execution time, end time, execution duration and completion information of the tasks). In step S203, the global task dependency data is obtained according to the attribute information, the upstream and downstream dependency information, the execution duration prediction value, and the like. In step S204, the global task dependency data is sent to the resource manager via the message queue. Finally, in step S205, the resource manager identifies the current task and allocates resources to it using the global task dependency data.
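Steps S200 to S204 can be sketched end to end as follows. This is a hedged illustration only: the function names and record layouts are hypothetical, and a per-task-type average stands in for the execution duration prediction model, whose actual form the text does not specify.

```python
from collections import defaultdict
from statistics import mean


def train_duration_model(history):
    """S200: history is a list of (task_type, duration_seconds) records.

    Stand-in model: the average historical duration per task type,
    falling back to the overall average for unseen types.
    """
    by_type = defaultdict(list)
    for task_type, duration in history:
        by_type[task_type].append(duration)
    averages = {t: mean(d) for t, d in by_type.items()}
    overall = mean([d for _, d in history])
    return lambda features: averages.get(features["task_type"], overall)


def schedule(tasks, model, send):
    """S201-S204 for tasks given as dicts with id, task_type, upstream."""
    dependency_data = {}
    for task in tasks:  # S201: collected task information
        duration = model({"task_type": task["task_type"]})  # S202
        dependency_data[task["id"]] = {  # S203
            "upstream": list(task["upstream"]),
            "predicted_duration": duration,
        }
    send(dependency_data)  # S204: hand over to the resource manager
    return dependency_data
```

Passing `send` as a callable keeps the sketch independent of any particular message queue; step S205 happens on the resource manager side and is not shown.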
In the technical scheme of the embodiment of the invention, the following defects of the prior art are considered: submitting a task to the resource manager is a one-way process, and the resource manager cannot grasp the actual state of the task in real time. Therefore, before a task is submitted, the invention collects the attribute information and the upstream and downstream dependency information of the task to generate global task dependency data, and sends the global task dependency data to the resource manager for use when scheduling the task. The resource manager can thus grasp the global dependencies of the tasks and know the predicted execution time information of each task in its execution queue, which enables dynamic resource scheduling and optimization of the task execution mode, ensures task execution efficiency, and improves the resource utilization rate of the distributed system. In addition, to obtain more accurate global task dependency data, the invention trains an execution duration prediction model in advance using multi-dimensional data such as the submitting unit identification, monitoring conditions, task name and task type of the tasks, and uses the model before a task is submitted to obtain the execution duration prediction value of the task, which is then used to calculate the global task dependency data. Furthermore, for tasks that require a customized execution mode, the invention supports configuring specific rule information for those tasks; the specific rule information is retained in the global task dependency data and is given priority in the resource manager when executing the tasks, thereby meeting practical application requirements.
It should be noted that, for convenience of description, the foregoing method embodiments are described as a series of acts, but those skilled in the art will appreciate that the present invention is not limited by the order of acts described, and that some steps may in fact be performed in other orders or concurrently. Moreover, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules involved are not necessarily required to implement the invention.
To facilitate a better implementation of the above-described aspects of embodiments of the present invention, the following also provides related systems for implementing the above-described aspects.
Referring to fig. 3, a task scheduling system according to an embodiment of the present invention is configured to submit at least one current task to a resource manager 300 that allocates resources for tasks in a distributed system. The task scheduling system includes: a task submission system 301, a data acquisition system 302, a real-time computing system 303, and an execution system 304 running in the resource manager 300.
Specifically, the task submission system 301 is a device for submitting a current task to the resource manager 300; it may be a cluster of computers and includes pages for configuring task information. The data acquisition system 302 may be an application with a real-time data collection function, used to obtain the attribute information and the upstream and downstream dependency information of a current task. The real-time computing system 303 is configured to determine global task dependency data according to the attribute information and the upstream and downstream dependency information of the current task, and is further configured to send the global task dependency data to the resource manager 300. The global task dependency data includes nodes representing the current task, and the nodes include predicted execution time information determined for the current task. The execution system 304 is used to allocate resources to the current task using the global task dependency data.
In the embodiment of the present invention, the task scheduling system may further include: a cache unit or data bus 306 for data transfer among the data acquisition system 302, the prediction system 305 (described below) and the real-time computing system 303, and a message queue 307 for data transfer between the real-time computing system 303 and the resource manager 300.
In practical application, the global task dependency data characterizes the global dependency relationship between the current task and the tasks already submitted to the resource manager 300; the attribute information includes the execution queue information and the execution rule information of the current task; and the node contains the execution queue information of the current task.
As a preferred solution, the task scheduling system may further include a prediction system 305, configured to acquire data of any one of the current tasks in a preset dimension, input the data into a pre-trained execution duration prediction model to obtain an execution duration prediction value of that task, and send the execution duration prediction value to the real-time computing system 303, wherein the preset dimension is related to the execution duration of the task.
Preferably, the real-time computing system 303 is further configured to determine the global task dependency data according to the execution queue information, the execution rule information, the upstream and downstream dependency information and the execution duration prediction value of the at least one current task. The predicted execution time information of any current task is determined according to the execution queue information, the execution rule information and the execution duration prediction value of that task, together with the predicted execution time information of its upstream task and/or its downstream task.
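One way the predicted execution time of a task might be derived from its upstream tasks is sketched below. This is a simplification under stated assumptions: the function name `predict_times` and the `queue_ready_at` parameter are hypothetical, only upstream dependencies and predicted durations are considered, and execution rules, downstream tasks and queue resource contention are ignored. The sketch uses `graphlib`, available in Python 3.9 and later.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def predict_times(upstream, durations, queue_ready_at=0.0):
    """Return {task: (predicted_start, predicted_end)}.

    upstream maps each task to the set of tasks it depends on;
    durations maps each task to its predicted execution duration.
    """
    times = {}
    # static_order() yields every task after all of its upstream tasks.
    for task in TopologicalSorter(upstream).static_order():
        # A task can start once the queue is ready and every upstream
        # task is predicted to have finished.
        start = max(
            [queue_ready_at] + [times[u][1] for u in upstream.get(task, ())]
        )
        times[task] = (start, start + durations[task])
    return times
```

For example, with `upstream = {"b": {"a"}, "c": {"a", "b"}}` and durations of 5, 3 and 2 for `a`, `b` and `c`, task `c` is predicted to start at 8 and end at 10.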
In one embodiment, the data acquisition system 302 may be further configured to acquire specific rule information configured for a current task before the global task dependency data is determined, wherein the specific rule information includes specified execution time information configured for the current task. Accordingly, in the global task dependency data, a node representing a current task configured with specific rule information contains that specific rule information.
In particular applications, the execution system 304 may be further configured to: determine, using the global dependency relationship, a task to be executed among the current tasks submitted to the resource manager 300, and match the task to be executed with one node in the global task dependency data; and judge whether the node contains specific rule information: if so, allocate resources to the task to be executed according to the specified execution time information of the node; otherwise, allocate resources to the task to be executed according to the predicted execution time information of the node.
In addition, in the embodiment of the present invention, the global task dependency data is linked list data. The preset dimension comprises at least one of: submitting unit identification, submitting unit current resource information, monitoring conditions, task names, task types, task principal identification, the execution rules, running accounts, affiliated business identification, upstream task data volume and change information thereof, cluster current state information and task script modification information.
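As an illustration of global task dependency data held as linked list data, the sketch below chains task nodes through a `next` reference. The `TaskNode` fields, the helper names, and the choice to order nodes by predicted start time are all assumptions made for the example; the text only states that the data is linked list data.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class TaskNode:
    task_id: str
    queue: str  # execution queue information
    predicted_start: float  # predicted execution time information
    next: Optional["TaskNode"] = None


def build_chain(records) -> Optional[TaskNode]:
    """Link (task_id, queue, predicted_start) records by predicted start."""
    head = None
    # Prepend in descending order so the earliest task ends up at the head.
    for task_id, queue, start in sorted(
        records, key=lambda r: r[2], reverse=True
    ):
        head = TaskNode(task_id, queue, start, head)
    return head


def walk(head: Optional[TaskNode]) -> Iterator[TaskNode]:
    """Iterate over the nodes from head to tail."""
    node = head
    while node is not None:
        yield node
        node = node.next
```

A linked layout of this kind makes it cheap to insert or remove a node when task information changes, which fits the dynamic updating described in the text.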
Further, in an embodiment of the present invention, there is also provided a computer-readable medium, which may be contained in a computing device; or may exist separately and not be incorporated into a computing device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to perform steps comprising: acquiring attribute information and upstream and downstream dependency information of a current task; determining global task dependency data according to the attribute information and the upstream and downstream dependency information; the global task dependency data comprises nodes for representing the current task, and the nodes comprise predicted execution time information determined for the current task; sending the global task dependency data to a resource manager; submitting the current task to a resource manager; in a resource manager, allocating resources to the current task using the global task dependency data.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A task scheduling method for submitting at least one current task to a resource manager that allocates resources for tasks in a distributed system, characterized in that the method comprises:
acquiring attribute information and upstream and downstream dependency information of the current task;
determining global task dependency data according to the attribute information and the upstream and downstream dependency information, wherein the global task dependency data comprises nodes representing the current task, and the nodes comprise predicted execution time information determined for the current task; sending the global task dependency data to the resource manager; and
submitting the current task to a resource manager; in a resource manager, allocating resources to the current task using the global task dependency data.
2. The method of claim 1,
the global task dependency data characterizes the global dependency relationship between the current task and the tasks already submitted to the resource manager;
the attribute information includes: the execution queue information and the execution rule information of the current task; the node contains execution queue information for the current task.
3. The method of claim 2,
the method further comprises: acquiring data of any one of the current tasks in a preset dimension, and inputting the data into a pre-trained execution duration prediction model to obtain an execution duration prediction value of that current task, wherein the preset dimension is related to the execution duration of the task;
determining global task dependency data according to the attribute information and the upstream and downstream dependency information specifically comprises: determining the global task dependency data according to the execution queue information, the execution rule information, the upstream and downstream dependency information and the execution duration prediction value of the at least one current task; and
the predicted execution time information of any current task is determined according to the execution queue information, the execution rule information, the execution duration prediction value of the current task, the predicted execution time information of an upstream task of the current task and/or the predicted execution time information of a downstream task.
4. The method of claim 1, further comprising: acquiring specific rule information configured for a current task before determining the global task dependency data; and
in the global task dependency data, a node representing a current task configured with specific rule information contains the specific rule information.
5. The method according to claim 4, wherein the specific rule information includes specified execution time information configured for a current task; and allocating resources to the current task using the global task dependency data, specifically including:
determining a task to be executed in the current task submitted to the resource manager by using the global dependency relationship, and matching the task to be executed with one node in the global task dependency data;
judging whether the node contains specific rule information: if so, distributing resources to the task to be executed according to the specified execution time information of the node; otherwise, resources are distributed to the tasks to be executed according to the predicted execution time information of the node.
6. The method of claim 3, wherein the global task dependency data is linked list data; and
the preset dimension comprises at least one of: submitting unit identification, submitting unit current resource information, monitoring conditions, task names, task types, task principal identification, the execution rules, running accounts, affiliated business identification, upstream task data volume and change information thereof, cluster current state information and task script modification information.
7. A task scheduling system for submitting at least one current task to a resource manager that allocates resources for tasks in a distributed system, wherein the task scheduling system comprises: a task submission system, a data acquisition system, a real-time computing system, and an execution system running in the resource manager; wherein
the data acquisition system is configured to: acquiring attribute information and upstream and downstream dependency information of the current task;
the real-time computing system is to: determining global task dependency data according to the attribute information and the upstream and downstream dependency information, and sending the global task dependency data to a resource manager; the global task dependency data comprises nodes for representing the current task, and the nodes comprise predicted execution time information determined for the current task;
the task submission system is to: submitting the current task to a resource manager;
the execution system is to: allocating resources to the current task using the global task dependency data.
8. The task scheduling system of claim 7,
the global task dependency data characterizes the global dependency relationship between the current task and the tasks already submitted to the resource manager; the attribute information includes: the execution queue information and the execution rule information of the current task; the node contains the execution queue information of the current task;
the task scheduling system further comprises a prediction system, wherein the prediction system is configured to acquire data of any one of the current tasks in a preset dimension, input the data into a pre-trained execution duration prediction model to obtain an execution duration prediction value of that current task, and send the execution duration prediction value to the real-time computing system; the preset dimension is related to the execution duration of the task; and
the real-time computing system is further to: determining global task dependency data according to the execution queue information, the execution rule information, the upstream and downstream dependency information and the execution duration prediction value of the at least one current task; the predicted execution time information of any current task is determined according to the execution queue information, the execution rule information, the execution duration prediction value of the current task, the predicted execution time information of an upstream task of the current task and/or the predicted execution time information of a downstream task.
9. The task scheduling system of claim 8, wherein the data acquisition system is further configured to:
acquire specific rule information configured for a current task before the global task dependency data is determined; the specific rule information comprises specified execution time information configured for the current task; in the global task dependency data, a node representing a current task configured with specific rule information contains the specific rule information; and
the execution system is further to: determining a task to be executed in the current task submitted to the resource manager by using the global dependency relationship, and matching the task to be executed with one node in the global task dependency data; judging whether the node contains specific rule information: if so, distributing resources to the task to be executed according to the specified execution time information of the node; otherwise, resources are distributed to the tasks to be executed according to the predicted execution time information of the node.
10. A task scheduling system according to claim 8 or 9, characterized in that the task scheduling system further comprises: a cache unit or a data bus for data transmission among the data acquisition system, the prediction system and the real-time computing system, and a message queue for data transmission between the real-time computing system and the resource manager; and
the global task dependency data is linked list data; the preset dimension comprises at least one of: submitting unit identification, submitting unit current resource information, monitoring conditions, task names, task types, task principal identification, the execution rules, running accounts, affiliated business identification, upstream task data volume and change information thereof, cluster current state information and task script modification information.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-6.
CN201910550364.XA 2019-06-24 2019-06-24 Task scheduling method and system Pending CN112130966A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910550364.XA CN112130966A (en) 2019-06-24 2019-06-24 Task scheduling method and system

Publications (1)

Publication Number Publication Date
CN112130966A true CN112130966A (en) 2020-12-25

Family

ID=73849070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910550364.XA Pending CN112130966A (en) 2019-06-24 2019-06-24 Task scheduling method and system

Country Status (1)

Country Link
CN (1) CN112130966A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022151668A1 (en) * 2021-01-15 2022-07-21 长鑫存储技术有限公司 Data task scheduling method and apparatus, storage medium, and scheduling tool
CN113220542A (en) * 2021-04-01 2021-08-06 深圳市云网万店科技有限公司 Early warning method and device for computing task, computer equipment and storage medium
CN112988362A (en) * 2021-05-14 2021-06-18 南京蓝洋智能科技有限公司 Task processing method and device, electronic equipment and storage medium
CN113254177A (en) * 2021-05-31 2021-08-13 广州虎牙科技有限公司 Cluster-based task submission method, computer program product and electronic device
CN113360270A (en) * 2021-06-30 2021-09-07 杭州数梦工场科技有限公司 Data cleaning task processing method and device
CN113360270B (en) * 2021-06-30 2024-02-27 杭州数梦工场科技有限公司 Data cleaning task processing method and device
CN113946431A (en) * 2021-12-22 2022-01-18 北京瑞莱智慧科技有限公司 Resource scheduling method, system, medium and computing device
CN113946431B (en) * 2021-12-22 2022-03-04 北京瑞莱智慧科技有限公司 Resource scheduling method, system, medium and computing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination