CN116610422A - Task scheduling method, device and system - Google Patents

Task scheduling method, device and system

Info

Publication number
CN116610422A
Authority
CN
China
Prior art keywords
task
resource
processed
user information
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210119120.8A
Other languages
Chinese (zh)
Inventor
骆雨
朱小坤
包勇军
王龙辉
牛文杰
李开荣
高新
郭锦荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202210119120.8A priority Critical patent/CN116610422A/en
Publication of CN116610422A publication Critical patent/CN116610422A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5072Grid computing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/548Queue
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a task scheduling method, device and system, relating to the technical field of cloud computing. One embodiment of the method comprises the following steps: receiving a task to be processed, wherein the task to be processed comprises user information applying for task processing and resource information required for task processing; counting the resource usage of the user information in a plurality of clusters; judging whether the user information meets a resource allocation condition based on the resource information, a preset resource quota corresponding to the user information and the counted resource usage, and if so, determining a target cluster for the task to be processed from the plurality of clusters; and scheduling the task to be processed to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process the task to be processed. According to this embodiment, resources can be scheduled evenly among different users through the resource allocation condition, and scheduling covers a plurality of clusters, so that resources can be allocated to users across clusters.

Description

Task scheduling method, device and system
Technical Field
The present invention relates to the field of cloud computing technologies, and in particular, to a task scheduling method, device, and system.
Background
In recent years, with the rapid development of cloud computing, running workloads on shared resource pools in the cloud has become a mainstream trend. To cope with diversified demands, various cloud platforms have been developed, such as algorithm platforms for machine learning/deep learning tasks, cloud platforms for data-intensive tasks, and cloud platforms for microservice architectures; the cloud platform mode can effectively improve resource utilization and reduce maintenance costs. One of the core functions of a cloud platform is to schedule its resources for different users.
At present, cloud-platform resource scheduling is mainly handled by an independent scheduling system per cluster: as long as a cluster has remaining resources, it schedules them to whichever user submits a task. This scheduling mode easily allows some users to occupy excessive resources while other users must wait a long time before resources are allocated to them, so resource scheduling on the cloud platform becomes unbalanced.
Disclosure of Invention
In view of this, the embodiments of the present invention provide a task scheduling method, apparatus, and system, which can schedule resources evenly among different users through a resource allocation condition and schedule across a plurality of clusters, so that resources can be allocated to users evenly in a cross-cluster manner.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a task scheduling method including:
receiving a task to be processed, wherein the task to be processed comprises: user information for applying task processing and resource information required by task processing;
counting the resource use condition of the user information in a plurality of clusters;
judging whether the user information meets a resource allocation condition or not based on the resource information, a preset resource quota corresponding to the user information and the counted resource use condition, and if so, determining a target cluster for the task to be processed from a plurality of clusters;
and scheduling the task to be processed to one or more nodes in the target cluster so that the one or more nodes in the target cluster process the task to be processed.
Optionally, the task scheduling method further includes:
placing the user information into a preset user queue;
if the user queue includes a plurality of pieces of user information, calculating a user score corresponding to the user information;
determining the priority of the user information in the user queue based on the user scores corresponding to the user information and the user scores corresponding to other user information in the user queue;
When the priority of the user information in the user queue reaches the highest, distributing resources for the user information;
and executing the step of determining a target cluster for the task to be processed from a plurality of clusters based on the resources allocated by the user information.
Optionally, the determining whether the user information meets a resource allocation condition includes:
calculating the resource occupation amount corresponding to the user information according to the resource information and the counted resource use condition;
and if the occupied amount of the resources is not larger than the resource quota, determining that the user information meets the resource allocation condition.
Optionally, the task scheduling method further includes: distributing the task to be processed to a task queue corresponding to the user information;
after judging that the user information meets the resource allocation condition, the method further comprises the following steps:
judging whether the task to be processed meets the task processing conditions or not according to the determined task basic information of the task to be processed and the task basic information of other tasks in the task queue, and executing the step of determining a target cluster for the task to be processed from a plurality of clusters if the task to be processed meets the task processing conditions.
Optionally, the resource usage comprises a plurality of resource usage indexes;
the resource quota comprises an index quota corresponding to each of the resource usage indexes;
the calculating the user score corresponding to the user information comprises the following steps:
calculating the ratio of each resource use index to the corresponding index quota;
and determining the user score corresponding to the user information based on the calculated multiple ratios.
Optionally, the resource information includes: the CPU, memory and GPU amounts pre-occupied by the task to be processed;
the resource usage includes: the current usage amounts of the CPU, the memory and the GPU by the other tasks of the user information;
the calculating the resource occupation amount corresponding to the user information comprises the following steps:
calculating the CPU occupation amount corresponding to the user information by using the pre-occupied CPU and the current usage amount of the CPU;
calculating the memory occupation amount corresponding to the user information by utilizing the pre-occupied memory and the current use amount of the memory;
and calculating the GPU occupation amount corresponding to the user information by utilizing the pre-occupation GPU and the current use amount aiming at the GPU.
Optionally, the task processing conditions include:
in the task queue, the task to be processed has the highest priority and there are one or more clusters satisfying the resource information;
or,
the waiting time of the task to be processed reaches a preset time threshold and there are one or more clusters satisfying the resource information.
Optionally, the task scheduling method further includes:
if the task queue comprises a plurality of tasks, calculating task scores corresponding to the tasks to be processed;
determining the priority of the task to be processed in the task queue based on the task score and task scores corresponding to other tasks in the task queue;
and executing the step of judging whether the task to be processed meets the task processing condition when the priority of the task to be processed in the task queue is highest.
Optionally, the resource quota includes: the CPU quota, the memory quota and the GPU quota allocated to the user information; correspondingly,
the task scheduling method further comprises the following steps:
comparing the CPU occupation amount, the memory occupation amount and the GPU occupation amount with the CPU quota, the memory quota and the GPU quota respectively;
And if the comparison result is that the CPU occupation amount is not greater than the CPU quota, the memory occupation amount is not greater than the memory quota and the GPU occupation amount is not greater than the GPU quota, the resource occupation amount is not greater than the resource quota.
Optionally, for the case where the task includes a plurality of subtasks,
the determining a target cluster for the task to be processed includes:
searching one or more clusters capable of processing a plurality of the subtasks;
and selecting the target cluster from the one or more found clusters.
Optionally, the plurality of subtasks are scheduled to one or more nodes in the target cluster according to preset priorities of the plurality of subtasks.
Optionally, the scheduling the plurality of subtasks to one or more nodes in the target cluster includes:
for each of the subtasks, performing:
screening a plurality of nodes with subtask processing capability from the target cluster;
calculating node scores of the screened nodes;
and selecting the node with the highest node score to process the subtasks.
Optionally, the task scheduling method further includes:
In case that the resource usage rate of the user information satisfies the adjustment condition within the set period of time,
directly adjusting the resource quota; or,
and sending adjustment information to the communication address included in the user information, and adjusting the resource quota after responding to adjustment confirmation fed back by the user.
Optionally,
and if the user information does not meet the resource allocation condition, placing the task to be processed into a waiting queue.
In a second aspect, an embodiment of the present invention provides a task scheduling server, including:
the receiving module is used for receiving a task to be processed, wherein the task to be processed comprises: user information for applying task processing and resource information required by task processing;
the control module is used for counting the resource use conditions of the user information in a plurality of clusters; judging whether the user information meets a resource allocation condition or not based on the resource information, a preset resource quota corresponding to the user information and the counted resource use condition, and if so, determining a target cluster for the task to be processed from a plurality of clusters;
and the scheduling module is used for scheduling the task to be processed to one or more nodes in the target cluster so as to enable the one or more nodes in the target cluster to process the task to be processed.
In a third aspect, an embodiment of the present application provides a task scheduling system, including: a plurality of clusters, and a task scheduling server provided by the above embodiment, wherein,
and the nodes in each cluster are used for processing the task to be processed after receiving the task to be processed scheduled by the task scheduling server.
One embodiment of the above application has the following advantages or benefits: according to the application, the resource use condition of the user information in the clusters is counted, and whether the user information meets the resource allocation condition or not is judged based on the resource information, the preset resource quota corresponding to the user information and the counted resource use condition, namely, only if the resource allocation condition is met, a target cluster is determined for the task to be processed from the clusters, and the task to be processed is scheduled to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process the task to be processed. In addition, the resource allocation condition is related to information related to user resources, such as user resource information, preset resource quota corresponding to the user information, and statistical resource use conditions, so that the resource amount of each user scheduling resource can be effectively regulated and controlled through the resource allocation condition, and the resource scheduling of each user is relatively balanced.
In addition, the technical scheme provided by the application determines the target cluster for the task to be processed from the plurality of clusters and schedules the task to be processed to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process the task to be processed, and task resource scheduling across the plurality of clusters is realized.
Further effects of the above optional implementations are described below in connection with specific embodiments.
Drawings
The drawings are included to provide a better understanding of the application and are not to be construed as unduly limiting the application. Wherein:
FIG. 1 is a schematic diagram of the main flow of a task scheduling method according to one embodiment of the application;
FIG. 2 is a schematic diagram of the main flow of determining user priority according to one embodiment of the application;
FIG. 3 is a schematic diagram of the main flow of calculating user scores corresponding to user information according to one embodiment of the application;
FIG. 4 is a schematic diagram of a main process of calculating a resource occupancy amount corresponding to user information according to an embodiment of the present application;
FIG. 5 is a schematic diagram of the main flow of determining task priorities in accordance with an embodiment of the present application;
FIG. 6 is a schematic diagram of the main flow of scheduling a plurality of subtasks to one or more nodes in a target cluster, according to one embodiment of the application;
FIG. 7 is a schematic diagram of the main flow of a task scheduling method according to another embodiment of the present invention;
FIG. 8 is a schematic diagram of the major modules of a task scheduling server according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of the primary devices of a task scheduling system according to an embodiment of the invention;
FIG. 10 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
FIG. 11 is a schematic diagram of a computer system suitable for use in implementing the task scheduling server of an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Currently, unified resource management and scheduling systems applied to cloud platform construction include Yarn and Mesos from Apache Software Foundation projects, and Kubernetes, which Google improved and open-sourced based on its internal system Borg. As an open-source, industrial-grade container orchestration platform, Kubernetes is widely applied in cloud platform construction and can provide scheduling, orchestration and other functions for running various kinds of tasks, including high-performance task scenarios such as machine learning, deep learning and big data applications.
Current schedulers developed for Kubernetes clusters include the Kubernetes native scheduler and Kube-batch, the scheduler used in Volcano. Both support scheduling tasks onto appropriate nodes according to the tasks' resource requirements, but they still have the following drawbacks:
For the Kubernetes native scheduler, scheduling is performed per subtask, and the relations among subtasks are not considered during scheduling. In some scenarios, such as TensorFlow distributed training, the training task can run only if all subtasks under the same task are started at the same time. Therefore, when the cluster resources cannot satisfy all the resource requests of the whole task, some subtasks cannot be started, the subtasks that were created cannot run, and resources are wasted or even deadlocked.
As for the Kube-batch scheduler used in Volcano, compared with the Kubernetes native scheduler it supports batch scheduling, i.e. scheduling subtasks in groups: subtasks are bound to nodes and containers are launched only when the resource requests of all subtasks of the same group are satisfied (subtasks belonging to the same group have dependencies; such a group of subtasks is called a task). Volcano also supports multi-user-oriented task scheduling. However, while supporting batch scheduling and multiple users, it does not support allocating resources to users across clusters and per resource quota. When multiple users share one cluster with a fixed amount of resources, scheduling by Kube-batch may let one user use excessive resources while other users obtain fewer, so resources are allocated unevenly among users; moreover, allocating resources to users within only one cluster is very limited. In addition, existing schedulers for Kubernetes clusters cannot allocate resources across multiple clusters.
To solve the above problems, the embodiment of the present invention schedules resources for tasks across a plurality of Kubernetes clusters, and during scheduling limits the amount of resources used by each user based on a resource allocation condition, so as to avoid excessive resource use by any single user and to ensure that each user obtains resources as evenly as possible.
Terms used in the embodiments of the invention:
subtask: the basic scheduling unit; the scheduling process allocates a corresponding node to each subtask and starts running the subtask on that node in a container;
task: the basic unit in which a user submits work to a cluster; one task usually comprises a plurality of subtasks, for example, a deep learning training task may comprise a plurality of PS and Worker processes, each PS or Worker being a subtask, and the task can start running only when all of its subtasks are bound to nodes;
user: a cluster is shared by multiple organizations or users; Kubernetes typically creates an independent namespace for each user;
resource quota: the maximum amount of available resources (e.g., CPU, memory, GPU) configured for a user;
user queue: all users are put into a user queue according to a certain priority, and users with high priority are dequeued first; the user queue is globally unique;
task queue: all tasks belonging to the same user are put into a task queue according to a certain priority, and tasks with high priority are dequeued first; each user has its own task queue.
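The queue and quota terms above can be sketched as small data structures. The following Python sketch is purely illustrative (the class and field names are ours, not from the patent): a quota record per user, and a priority queue in which the entry with the best (lowest) priority value dequeues first — matching the convention, used later in the description, that a lower user score means a higher priority.

```python
import heapq
from dataclasses import dataclass, field

@dataclass
class ResourceQuota:
    # maximum available amounts configured for a user (CPU cores, memory GB, GPUs)
    cpu: float
    mem: float
    gpu: float

@dataclass(order=True)
class QueueEntry:
    priority: float               # lower value = higher priority = dequeued first
    name: str = field(compare=False)

class PriorityQueue:
    """Dequeues the entry with the highest priority (lowest value) first."""
    def __init__(self):
        self._heap = []

    def push(self, name, priority):
        heapq.heappush(self._heap, QueueEntry(priority, name))

    def pop(self):
        return heapq.heappop(self._heap).name

# the globally unique user queue: the user with the lower score dequeues first
user_queue = PriorityQueue()
user_queue.push("userA", 0.5)
user_queue.push("userB", 0.17)
print(user_queue.pop())  # userB
```

In the patent's scheme the user queue is globally unique while each user owns one task queue; both can be instances of such a structure.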
Fig. 1 is a main flow chart of a task scheduling method according to an embodiment of the present invention, as shown in fig. 1, the task scheduling method may include the following steps:
step S101: receiving a task to be processed, wherein the task to be processed comprises: user information for applying task processing and resource information required by task processing;
the task to be processed may be a task or a job that is sent by the user through the terminal or the service server and needs to be processed on the cluster or the cloud platform. The task may include a plurality of subtasks, and an association relationship may exist between the plurality of subtasks. For example, the task may be training a neural network or deep learning, etc., to deploy a trained or deep-learned model on a business server or cloud platform. Data required by these subtasks, such as sample data, etc., may be stored on the cloud platform for use in subsequent training or deep learning.
The user information applying for task processing may be a user identifier assigned to the user by the cluster or cloud platform, or other information that can uniquely identify the user, such as the user's username.
In addition, the task to be processed may further include task basic information, which may describe the requirements of the task, such as the task type, certain constraints on task processing, the plurality of subtasks included in the task, and descriptions of those subtasks.
The resource information required for task processing refers to the amount of CPU, the amount of memory, the amount of GPU, etc. required for processing tasks that a user applies to a platform or cluster.
Step S102: counting the resource use condition of the user information in a plurality of clusters;
the resource use condition refers to the amount of CPU, the amount of memory, the amount of GPU and the like occupied by the same user in different clusters. In addition to the amount of CPU, the amount of memory, the amount of GPU, the resource usage may also include other parameters of the task related to the user information for hardware, software usage.
Step S103: judging whether the user information meets the resource allocation condition or not based on the resource information, the preset resource quota corresponding to the user information and the counted resource use condition, and executing step S104 if the user information meets the resource allocation condition; if not, executing step S106;
the resource quota is a resource quota allocated to the user by the cluster or the cloud platform before the user uses the cluster or the cloud platform, or a quota applied by the user to the cluster or the cloud platform. The indexes included by the resource quota are generally consistent with the indexes included by the resource use condition, for example, the resource quota comprises a CPU quota, a memory quota and a GPU quota, and correspondingly, the resource use condition comprises a CPU quantity, a memory quantity and a GPU quantity.
Step S104: determining a target cluster for a task to be processed from a plurality of clusters;
the target cluster is generally a cluster in which a CPU, a memory and a GPU which are idle for each node in the cluster can meet the requirements of processing each subtask in the task to be processed.
Step S105: scheduling the task to be processed to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process the task to be processed, and ending the current flow;
step S106: the task to be processed is not processed temporarily.
A specific implementation of this step S106 may be to put the task to be processed into a waiting queue, and to schedule resources for the tasks in the waiting queue at regular intervals or periodically.
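The flow of steps S101 to S106 can be summarized in a short sketch. The following Python code is a minimal illustration under our own assumptions: the dictionary layout, the function names, and the "first cluster whose free resources fit" rule are ours — the patent leaves the concrete target-cluster choice open.

```python
# Sketch of steps S101-S106 (illustrative names and data layout, not the
# patent's actual implementation).

def usage_across_clusters(user, clusters):
    """S102: sum the user's resource usage over all clusters."""
    total = {"cpu": 0, "mem": 0, "gpu": 0}
    for cluster in clusters:
        for k in total:
            total[k] += cluster["usage"].get(user, {}).get(k, 0)
    return total

def schedule(task, clusters, quotas, waiting_queue):
    user, requested = task["user"], task["resources"]       # S101
    used = usage_across_clusters(user, clusters)            # S102
    quota = quotas[user]
    # S103: resource allocation condition -- the occupation (already-used plus
    # newly requested amount) must not exceed the user's quota for any index
    if all(used[k] + requested[k] <= quota[k] for k in requested):
        # S104: pick a target cluster able to satisfy the request
        for cluster in clusters:
            if all(cluster["free"][k] >= requested[k] for k in requested):
                return cluster["name"]                      # S105: schedule here
    waiting_queue.append(task)                              # S106: defer the task
    return None
```

Here S105 is reduced to returning the chosen cluster's name; in the patent the task is further scheduled onto one or more nodes inside that cluster.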
In the embodiment shown in fig. 1, the method and the device determine whether the user information meets the resource allocation condition by counting the resource usage of the user information in a plurality of clusters and based on the resource information, the preset resource quota corresponding to the user information and the counted resource usage, that is, only if the resource allocation condition is met, a target cluster is determined for the task to be processed from the plurality of clusters, and the task to be processed is scheduled to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process the task to be processed. In addition, the resource allocation condition is related to information related to user resources, such as user resource information, preset resource quota corresponding to the user information, and statistical resource use conditions, so that the resource amount of each user scheduling resource can be effectively regulated and controlled through the resource allocation condition, and the resource scheduling of each user is relatively balanced.
In addition, the technical scheme provided by the application determines the target cluster for the task to be processed from the plurality of clusters and schedules the task to be processed to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process the task to be processed, and task resource scheduling across the plurality of clusters is realized.
In the embodiment of the present application, as shown in fig. 2, the task scheduling method may further perform priority processing on the user information, so as to process the task of the user according to the user priority, and specifically includes the following steps:
step S201: putting the user information into a preset user queue;
this step S201 may be completed before the above step S101 or before or after any one of the above steps S104. The user queue is mainly used for managing users using a cloud platform or a cluster.
Step S202: if the user queue includes a plurality of pieces of user information, calculating a user score corresponding to the user information;
the calculation is generally performed based on the occupied resources of the user, the resource quota of the user, and the like, and can reflect the equilibrium degree of the resources allocated to the user to a certain extent.
Step S203: determining the priority of the user information in the user queue based on the user scores corresponding to the user information and the user scores corresponding to other user information in the user queue;
Step S204: when the priority of the user information in the user queue reaches the highest, distributing resources for the user information;
step S205: and executing the step of determining a target cluster for the task to be processed from the plurality of clusters based on the resources allocated by the user information.
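Steps S202 to S204 amount to serving the queued user whose score indicates the fewest allocated resources. A minimal illustrative sketch (the function name is ours):

```python
# Sketch of steps S202-S204: when several users are queued, the user with the
# lowest score has been allocated the fewest resources relative to quota, so
# that user reaches the highest priority and is allocated resources first.

def highest_priority_user(user_scores):
    """user_scores: mapping user -> score; the lowest score wins."""
    return min(user_scores, key=user_scores.get)

print(highest_priority_user({"userA": 0.5, "userB": 0.17}))  # userB
```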
Wherein the resource usage includes a plurality of resource usage indicators; the resource quota comprises an index quota corresponding to each of the resource use indexes; accordingly, as shown in fig. 3, the specific embodiment for calculating the user score corresponding to the user information may include:
step S301: calculating the ratio of each resource use index to the corresponding index quota;
For example, the resource usage indexes may include the CPU, the memory and the GPU; correspondingly, the resource usage includes a CPU usage amount, a memory usage amount and a GPU usage amount, and the resource quota includes a CPU quota, a memory quota and a GPU quota.
The ratio is calculated by the following calculation formula (1):

P_k = U_k / Q_k (1)

wherein P_k characterizes the ratio corresponding to the resource usage index k; U_k characterizes the resource usage of the resource usage index k; Q_k characterizes the index quota corresponding to the resource usage index k; the resource usage index k may be any one of the CPU, the memory and the GPU.
Step S302: and determining the user score corresponding to the user information based on the calculated multiple ratios.
This step S302 can be obtained based on the following calculation formula (2).
share_i = max(P_ki), where the resource usage index k ranges over CPU, memory and GPU (2)

wherein share_i characterizes the user score of user i, and P_ki characterizes the ratio of the resource usage index k for user i.
A lower share_i indicates that user i has been allocated fewer resources and therefore has higher priority, so resources are preferentially allocated to the user i with the smaller share_i. For example, the CPU resource limit of user A is 9, the memory (Mem) resource limit is 18G, and the GPU resource limit is 20; user A's CPU usage is 3, Mem usage is 9G, and GPU usage is 5. Then share_A is the maximum of 3/9, 9/18 and 5/20, namely 0.5, the ratio of the used to the total amount of the resource Mem. The CPU resource limit of user B is 20, the Mem resource limit is 30G, and the GPU resource limit is 30; user B's CPU usage is 3, Mem usage is 9G, and GPU usage is 5. Then share_B is the maximum of 3/20, 9/30 and 5/30, namely 0.3, the ratio of the used to the total amount of the resource Mem. Since the score of user A is greater than the score of user B, the priority of user B is higher than the priority of user A.
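The score in formula (2) and the worked example above can be sketched as a few lines of Python. This is an illustrative sketch, not code from the patent; the resource index names "cpu", "mem" and "gpu" are assumptions chosen to match the three indexes in the text.

```python
def user_score(usage, quota):
    """Formulas (1)/(2): the score is the maximum, over the resource
    usage indexes, of the ratio usage/quota. A lower score means the
    user consumes less of its quota and so has higher priority."""
    return max(usage[k] / quota[k] for k in ("cpu", "mem", "gpu"))

# User A from the example: quotas CPU 9, Mem 18G, GPU 20; usage CPU 3, Mem 9G, GPU 5.
share_a = user_score({"cpu": 3, "mem": 9, "gpu": 5},
                     {"cpu": 9, "mem": 18, "gpu": 20})
# User B: quotas CPU 20, Mem 30G, GPU 30; same usage as user A.
share_b = user_score({"cpu": 3, "mem": 9, "gpu": 5},
                     {"cpu": 20, "mem": 30, "gpu": 30})
```

Since share_b is lower than share_a, user B would be served before user A, matching the example's conclusion.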
The above process analyzes the differences in resource usage among users through the user score and determines priority accordingly: in general, users that use fewer resources get higher priority and users that use more resources get lower priority, and allocating resources preferentially to higher-priority users keeps resource allocation among users relatively balanced.
In the embodiment of the present invention, the specific implementation manner of determining whether the user information satisfies the resource allocation condition in the step S103 may include: calculating the resource occupation amount corresponding to the user information according to the resource information and the counted resource use condition; and if the occupied amount of the resources is not greater than the resource quota, determining that the user information meets the resource allocation condition. The task to be processed is managed or controlled according to the resource quota of the user through the process, so that excessive allocation of resources is further avoided.
Wherein, the resource information may include: the amounts of CPU, memory and GPU pre-occupied by the task to be processed; the resource usage may include: the current usage amounts of the CPU, the memory and the GPU by other tasks of the user information. Accordingly, as shown in fig. 4, the above embodiment for calculating the resource occupation amount corresponding to the user information may include the following steps:
Step S401: calculating the CPU occupation amount corresponding to the user information by utilizing the CPU occupation and the current use amount aiming at the CPU;
step S402: calculating the memory occupation amount corresponding to the user information by utilizing the pre-occupied memory and the current use amount of the memory;
step S403: and calculating the GPU occupation amount corresponding to the user information by utilizing the pre-occupied GPU and the current usage amount of the GPU. It should be noted that steps S401 to S403 need not be executed in a strict order.
The resource occupation amount can be calculated by the following calculation formula (3):

Z_i = (CPU_req + CPU_used, Mem_req + Mem_used, GPU_req + GPU_used) (3)

wherein Z_i characterizes the resource occupation of user i; CPU_req characterizes the amount of CPU pre-occupied by the task to be processed that user i requests; CPU_used characterizes the current CPU usage of all tasks of user i currently in the cluster or cloud platform; Mem_req characterizes the amount of memory pre-occupied by the requested task; Mem_used characterizes the current memory usage of all tasks of user i currently in the cluster or cloud platform; GPU_req characterizes the amount of GPU pre-occupied by the requested task; GPU_used characterizes the current GPU usage of all tasks of user i currently in the cluster or cloud platform. The current usage of the CPU refers to the amount of CPU occupied by the tasks of user i currently being processed by the cluster or cloud platform, or to the total CPU demand of all tasks of user i already stored by the cluster or cloud platform (both tasks currently being processed and tasks not yet processed); the current usage of the memory and the current usage of the GPU are defined in the same way for the memory and the GPU respectively.
Wherein the resource quota may include: the CPU quota, the memory quota and the GPU quota allocated to the user information. That the resource occupation amount is not greater than the resource quota specifically includes: comparing the CPU occupation amount, the memory occupation amount and the GPU occupation amount with the CPU quota, the memory quota and the GPU quota respectively; if the comparison result is that the CPU occupation amount is not greater than the CPU quota, the memory occupation amount is not greater than the memory quota, and the GPU occupation amount is not greater than the GPU quota, the resource occupation amount is not greater than the resource quota. That is, CPU_req + CPU_used is compared with the CPU quota, Mem_req + Mem_used with the memory quota, and GPU_req + GPU_used with the GPU quota; if CPU_req + CPU_used is not greater than the CPU quota, Mem_req + Mem_used is not greater than the memory quota, and GPU_req + GPU_used is not greater than the GPU quota, then the resource occupation amount is not greater than the resource quota, it can be determined that the user information satisfies the resource allocation condition, and the task to be processed can be stored in the task queue corresponding to the user or processed.
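The component-wise quota check of formula (3) can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation; the dictionary keys and argument names are assumptions.

```python
def meets_resource_allocation_condition(requested, used, quota):
    """Formula (3) check: occupation Z_i = requested + current usage,
    per resource. The resource allocation condition holds only if the
    occupation does not exceed the quota for every one of CPU, memory
    and GPU (a component-wise comparison)."""
    return all(requested[k] + used[k] <= quota[k] for k in ("cpu", "mem", "gpu"))
```

If any single component exceeds its quota, the task is not admitted (in the flow of fig. 7 it would go to the temporary storage queue instead).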
In an embodiment of the present invention, the task scheduling method may further include: distributing the task to be processed to the task queue corresponding to the user information. Accordingly, after it is determined that the user information satisfies the resource allocation condition, the method may further include: judging, according to the determined task basic information of the task to be processed and the task basic information of other tasks in the task queue, whether the task to be processed satisfies the task processing condition, and if so, executing the step of determining a target cluster for the task to be processed from the plurality of clusters. The priority between the task to be processed and the other tasks can be determined through their task basic information; for example, the basic information may include the task processing time limit, or the ratio between the amount of resources pre-occupied by the task and the amount of resources already used by the user, according to which a task that pre-occupies fewer resources is executed preferentially. The task processing condition may include: the task to be processed has the highest priority in the task queue and there are one or more clusters satisfying the resource information; or the waiting time of the task to be processed reaches a preset time threshold and there are one or more clusters satisfying the resource information. The task processing condition ensures that a task can be processed completely and in time, and avoids the waste of cluster resources caused by a node being unable to process the task to be processed completely.
In an embodiment of the present invention, as shown in fig. 5, the task scheduling method may further include:
step S501: if the task queue comprises a plurality of tasks, calculating task scores corresponding to the tasks to be processed;
The task score may be: the ratio of the resources allocated to a task in the task queue to the total resources of the cluster where the task is located; if a task has not yet been assigned to any cluster, its task score is 0.
Step S502: determining the priority of the task to be processed in the task queue based on the task scores and the task scores corresponding to other tasks in the task queue;
wherein, the lower the task score, the higher the task priority.
Step S503: and executing the step of judging whether the task to be processed satisfies the task processing condition when the priority of the task to be processed in the task queue is the highest.
In step S503, judging whether the task to be processed satisfies the task processing condition actually means: judging whether there are one or more clusters satisfying the resource information; if so, determining that the task to be processed satisfies the task processing condition, and if not, determining that it does not.
Through the above process, different tasks of the same user are managed through the task queue, so that each task can be processed in time and the resource waste caused by tasks that cannot be processed completely is avoided.
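The task-queue ordering of steps S501 to S503 can be sketched as follows. This is an illustrative Python sketch under the scoring rule stated above (score = allocated resources / cluster total, 0 for an unplaced task); the tuple layout is an assumption.

```python
def task_score(allocated, cluster_total):
    # Ratio of the task's allocated resources to the total resources of
    # its cluster; a task not yet assigned to any cluster scores 0 and
    # therefore sorts ahead of already-placed tasks.
    if cluster_total is None:
        return 0.0
    return allocated / cluster_total

def next_task(task_queue):
    # task_queue: iterable of (task_id, allocated, cluster_total).
    # The lowest score means the highest priority, so it dequeues first.
    return min(task_queue, key=lambda t: task_score(t[1], t[2]))[0]
```

The dequeued task then goes through the task processing condition check of step S503 before a target cluster is determined.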
In an embodiment of the present invention, for the case where a task includes a plurality of subtasks, determining the target cluster for the task to be processed may include: searching for one or more clusters capable of processing the plurality of subtasks, and selecting the target cluster from the one or more found clusters. A cluster capable of processing the plurality of subtasks is one whose nodes can process all of the subtasks within a certain time; this avoids the situation where some subtasks are processed while others cannot be processed due to resource limitations, and thereby avoids wasting resources.
Wherein, the specific implementation of scheduling the task to be processed to one or more nodes in the target cluster may include: scheduling the plurality of subtasks to one or more nodes in the target cluster according to preset priorities of the plurality of subtasks. The preset priority may be a designated priority field of the subtask, for example, the spec.priorityClassName field in a yaml file, where the higher the Priority value corresponding to the priority class, the higher the priority of the subtask. Another way of assigning priorities is to directly assign a Priority value in the yaml file, with a larger value taking precedence over a smaller one, and so on.
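The priority-ordered dispatch of subtasks can be sketched as below. This is an illustrative Python sketch, not the patent's code; the key "priority_class" and the class-to-value mapping are assumptions standing in for the spec.priorityClassName field and its resolved Priority value.

```python
def order_subtasks(subtasks, class_values):
    """Order subtasks for scheduling: the highest priority value first.

    subtasks: list of dicts, each carrying a 'priority_class' name taken
    from its yaml spec; class_values maps each class name to its numeric
    Priority value.
    """
    return sorted(subtasks,
                  key=lambda s: class_values[s["priority_class"]],
                  reverse=True)
```

Subtasks dequeued in this order are then placed on nodes one by one, as described next.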
In an embodiment of the present invention, as shown in fig. 6, the specific implementation manner of scheduling a plurality of subtasks to one or more nodes in a target cluster may include:
for each subtask, steps S601 to S603 are performed:
step S601: screening a plurality of nodes with subtask processing capability from a target cluster;
step S602: calculating node scores of the screened multiple nodes;
The node score is assigned to each screened node according to a certain rule: for example, the node with the most balanced resource allocation may score highest, or the node with the most idle resources may score highest; multiple scoring strategies can also be combined by giving each strategy a weight and computing a total score, and the scoring strategies and rules can be set according to user requirements. In addition, the node with fewer remaining resources may be given the highest score, so that subtasks with small resource requirements are preferentially placed on nodes with fewer remaining resources, leaving the nodes with more remaining resources to process subtasks that require more resources.
Step S603: and selecting the node with the highest node score to process the subtasks.
For a plurality of subtasks, the subtasks of the same task can be put into a subtask queue, so that all subtasks belonging to the same task are queued according to a certain priority and the subtasks with high priority are dequeued first; each task is provided with its own subtask queue.
By calculating the strategy used by the node scores of the screened multiple nodes, each node can be allocated according to requirements (such as balanced allocation requirements or requirements of preferential allocation of the nodes with less residual resources or requirements of preferential allocation of the nodes with the closest residual resources to the resources required by the subtasks, and the like), and flexible allocation of the nodes is realized.
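Steps S601 to S603 can be sketched as a filter-score-select loop. This is an illustrative Python sketch, not the patent's implementation; the node and subtask dictionary layout, and the "fewest remaining resources first" scoring strategy shown, are assumptions drawn from the strategies named above.

```python
def schedule_subtask(subtask, nodes, score_fn):
    # Step S601: screen the nodes that can hold the subtask's resource request.
    feasible = [n for n in nodes
                if all(n["free"][k] >= v for k, v in subtask["req"].items())]
    if not feasible:
        return None  # no node in the target cluster can take this subtask
    # Steps S602/S603: score the candidates and pick the highest-scoring node.
    return max(feasible, key=score_fn)

def least_free_score(node):
    # One strategy from the text: prefer the node with the fewest free
    # resources, leaving larger nodes available for bigger subtasks.
    return -sum(node["free"].values())
```

Swapping in a different score_fn (e.g. most idle resources, or a weighted combination of strategies) changes the placement policy without touching the screening step.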
In an embodiment of the present invention, the task scheduling method may further include: and under the condition that the resource utilization rate of the user information meets the adjustment condition in the set time period, directly adjusting the resource quota or sending adjustment information to the communication address included in the user information, and adjusting the resource quota after responding to the adjustment confirmation fed back by the user. The resource quota of the user can be dynamically adjusted through the process, so that resource waste is avoided, and meanwhile, the resource quota of each user can be further ensured to be allocated as required.
The following describes in detail a scheduling process of the task scheduling method according to an embodiment of the present invention, where the task scheduling method is directed to a task having a plurality of subtasks.
As shown in fig. 7, the task scheduling method may include the steps of:
step S701: receiving a task to be processed, wherein the task to be processed comprises user information for applying for task processing, task basic information and resource information required by task processing;
Wherein the resource information includes: the amounts of CPU, memory and GPU pre-occupied by the task to be processed;
the resource usage includes: the current usage amounts of the CPU, the memory and the GPU by other tasks of the user information;
the resource quota includes: CPU quota, memory quota and GPU quota allocated by user information;
aiming at the resource quota, under the condition that the resource utilization rate of the user information meets the adjustment condition within a set time period, directly adjusting the resource quota; or, sending the adjustment information to the communication address included in the user information, and adjusting the resource quota after responding to the adjustment confirmation fed back by the user.
Step S702: calculating the resource occupation amount corresponding to the user information according to the resource information and the counted resource use condition;
specifically, calculating the CPU occupation amount corresponding to the user information by utilizing the CPU occupation and the current use amount aiming at the CPU; calculating the memory occupation amount corresponding to the user information by utilizing the pre-occupied memory and the current use amount of the memory; and calculating the GPU occupation amount corresponding to the user information by utilizing the pre-occupied GPU and the current use amount aiming at the GPU.
Comparing the CPU occupation amount, the memory occupation amount and the GPU occupation amount with a CPU quota, a memory quota and a GPU quota respectively; if the comparison result is that the CPU occupation amount is not greater than the CPU quota, the memory occupation amount is not greater than the memory quota and the GPU occupation amount is not greater than the GPU quota, the resource occupation amount is not greater than the resource quota.
Step S703: judging whether the resource occupation amount is smaller than or equal to the resource quota, if so, executing step 704; otherwise, go to step 705;
step 704: determining that the user information satisfies the resource allocation condition, and executing step S706;
step S705: putting the task to be processed into a temporary storage queue, and re-executing step S702 periodically;
step S706: distributing the task to be processed to a task queue corresponding to the user information;
searching the user information in the user queue, and if the user information is not found, putting the user information into a preset user queue.
Step S707: judging whether the number of users in the user queue is a plurality of users, if so, executing step S708, otherwise, executing step S710;
step S708: calculating a user score corresponding to the user information;
the resource usage includes a plurality of resource usage indicators; the resource quota includes an index quota corresponding to each of the resource usage indexes; the calculation process comprises the following steps: calculating the ratio of each resource use index to the corresponding index quota; and determining the user score corresponding to the user information based on the calculated multiple ratios.
Step S709: determining the priority of the user information in the user queue based on the user scores corresponding to the user information and the user scores corresponding to other user information in the user queue, and executing step S710 when the priority of the user information in the user queue is highest;
Step S710: allocating resources for user information;
step S711: calculating task scores corresponding to the tasks to be processed;
step S712: determining the priority of the task to be processed in the task queue based on the task scores and the task scores corresponding to other tasks in the task queue;
step S713: judging whether the task to be processed meets the task processing condition according to the priority, if so, executing step S714; otherwise, step S715 is performed;
Besides judging according to the priority whether the task to be processed satisfies the task processing condition, the judgment can also be made according to the waiting time of the task to be processed: if the waiting time reaches a preset time threshold and there are one or more clusters satisfying the resource information, the task to be processed is determined to satisfy the task processing condition.
Step S714: searching one or more clusters capable of processing a plurality of subtasks included in the task from the plurality of clusters, and executing step S716;
step S715: the task to be processed continues waiting in the task queue and performs step S711;
step S716: selecting a target cluster from the one or more found clusters;
step S717: and scheduling the plurality of subtasks to one or more nodes in the target cluster according to the preset priorities of the plurality of subtasks.
Specific embodiments of this step S717: for each subtask, performing:
screening a plurality of nodes with subtask processing capability from a target cluster; calculating node scores of the screened multiple nodes; and selecting the node with the highest node score to process the subtasks.
It should be noted that the execution sequence of the above steps S701 to S717 is only an example, and many steps may be executed in synchronization or in a permuted order. For example, step S707 may be performed before or after any one of the steps S701 to S706 described above, and the like.
As shown in fig. 8, an embodiment of the present invention provides a task scheduling server 800, and the task scheduling server 800 may include:
a receiving module 801, configured to receive a task to be processed, where the task to be processed includes: user information for applying task processing and resource information required by task processing;
a control module 802, configured to count resource usage of user information in a plurality of clusters; judging whether the user information meets the resource allocation condition or not based on the resource information, the preset resource quota corresponding to the user information and the counted resource use condition, and if so, determining a target cluster for the task to be processed from a plurality of clusters;
A scheduling module 803, configured to schedule the task to be processed to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process the task to be processed.
In the embodiment of the present invention, the control module 802 is further configured to put the user information into a preset user queue; if the user queue includes a plurality of pieces of user information, calculate the user score corresponding to the user information; determine the priority of the user information in the user queue based on its user score and the user scores corresponding to the other user information in the user queue; when the priority of the user information in the user queue is the highest, allocate resources for the user information; and execute, based on the resources allocated to the user information, the step of determining a target cluster for the task to be processed from the plurality of clusters.
In the embodiment of the present invention, the control module 802 is further configured to calculate, according to the resource information and the counted resource usage, a resource occupation amount corresponding to the user information; and if the occupied amount of the resources is not greater than the resource quota, determining that the user information meets the resource allocation condition.
In the embodiment of the present invention, the control module 802 is further configured to allocate a task to be processed to a task queue corresponding to user information; judging whether the task to be processed meets the task processing conditions or not according to the determined task basic information of the task to be processed and the task basic information of other tasks in the task queue, and executing the step of determining a target cluster for the task to be processed from a plurality of clusters if the task to be processed meets the task processing conditions.
In the embodiment of the invention, the resource use condition comprises a plurality of resource use indexes; the resource quota includes an index quota corresponding to each of the resource usage indexes; the control module 802 is further configured to calculate a ratio of each resource usage index to a corresponding index quota; and determining the user score corresponding to the user information based on the calculated multiple ratios.
In the embodiment of the invention, the resource information comprises: the amounts of CPU, memory and GPU pre-occupied by the task to be processed; the resource usage includes: the current usage amounts of the CPU, the memory and the GPU by other tasks of the user information. The control module 802 is further configured to calculate the CPU occupation amount corresponding to the user information by using the pre-occupied CPU and the current CPU usage amount; calculate the memory occupation amount by using the pre-occupied memory and the current memory usage amount; and calculate the GPU occupation amount by using the pre-occupied GPU and the current GPU usage amount.
In the embodiment of the invention, the task processing conditions comprise: in the task queue, the task to be processed has the highest priority and one or more clusters meeting the resource information; or the waiting time of the task to be processed reaches a preset time threshold and is provided with one or more clusters meeting the resource information.
In the embodiment of the present invention, the control module 802 is further configured to calculate a task score corresponding to a task to be processed if the task queue includes a plurality of tasks; determining the priority of the task to be processed in the task queue based on the task scores and the task scores corresponding to other tasks in the task queue; and executing the step of judging whether the task to be processed meets the task processing condition or not when the priority of the task score of the task to be processed in the task queue reaches the highest.
In an embodiment of the present invention, the resource quota includes: CPU quota, memory quota and GPU quota allocated by user information; the control module 802 is further configured to compare the CPU occupation amount, the memory occupation amount, and the GPU occupation amount with a CPU quota, a memory quota, and a GPU quota, respectively; if the comparison result is that the CPU occupation amount is not greater than the CPU quota, the memory occupation amount is not greater than the memory quota and the GPU occupation amount is not greater than the GPU quota, the resource occupation amount is not greater than the resource quota.
In the embodiment of the present invention, the control module 802 is further configured to, for a case where the task includes a plurality of subtasks, find one or more clusters capable of processing the plurality of subtasks; a target cluster is selected from the one or more found clusters.
In the embodiment of the present invention, the scheduling module 803 is further configured to schedule the plurality of subtasks to one or more nodes in the target cluster according to the preset priorities of the plurality of subtasks.
In the embodiment of the present invention, the scheduling module 803 is further configured to perform, for each subtask: screening a plurality of nodes with subtask processing capability from a target cluster; calculating node scores of the screened multiple nodes; and selecting the node with the highest node score to process the subtasks.
In the embodiment of the present invention, the control module 802 is further configured to directly adjust the resource quota when the resource usage of the user information in the set period of time meets the adjustment condition; or, sending the adjustment information to the communication address included in the user information, and adjusting the resource quota after responding to the adjustment confirmation fed back by the user.
In the embodiment of the present invention, the scheduling module 803 is further configured to put the task to be processed into the waiting queue if the user information does not meet the resource allocation condition.
As shown in fig. 9, an embodiment of the present invention provides a task scheduling system 900, where the task scheduling system 900 may include a plurality of clusters 901 and the task scheduling server 800 provided in any of the foregoing embodiments, where,
The nodes 9011 in each cluster 901 are configured to process the task to be processed after receiving the task to be processed scheduled by the task scheduling server 800.
Fig. 10 illustrates an exemplary system architecture 1000 to which a task scheduling method or task scheduling server of an embodiment of the present invention may be applied.
As shown in fig. 10, a system architecture 1000 may include a task scheduling server 1001, a network 1002, and a plurality of clusters 1003, where each cluster 1003 has a plurality of nodes 10031 therein. The network 1002 is the medium used to provide the communication links between the task scheduling server 1001 and the cluster 1003. The network 1002 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The task scheduling server 1001 may be a server that provides various services, for example, analyzes user information corresponding to a task to be processed or a task to be processed, and schedules nodes in a cluster based on the result of the analysis (merely by way of example).
Node 10031 in cluster 1003 processes the received task or subtask.
It should be noted that, the task scheduling method provided by the embodiment of the present invention is generally executed by the task scheduling server 1001.
It should be understood that the number of task scheduling servers, networks, and clusters in fig. 10 is merely illustrative. There may be any number of task scheduling servers, networks, and clusters, as desired for implementation.
Referring now to FIG. 11, a schematic diagram of a computer system 1100 suitable for implementing the task scheduling server of an embodiment of the present invention is illustrated. The task scheduling server shown in fig. 11 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in fig. 11, the computer system 1100 includes a Central Processing Unit (CPU) 1101, which can execute various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1102 or a program loaded from a storage section 1108 into a Random Access Memory (RAM) 1103. In the RAM 1103, various programs and data required for the operation of the system 1100 are also stored. The CPU 1101, ROM 1102, and RAM 1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to bus 1104.
The following components are connected to the I/O interface 1105: an input section 1106 including a keyboard, a mouse, and the like; an output portion 1107 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 1108 including a hard disk or the like; and a communication section 1109 including a network interface card such as a LAN card, a modem, and the like. The communication section 1109 performs communication processing via a network such as the internet. The drive 1110 is also connected to the I/O interface 1105 as needed. Removable media 1111, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is installed as needed in drive 1110, so that a computer program read therefrom is installed as needed in storage section 1108.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network via the communication portion 1109, and/or installed from the removable media 1111. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 1101.
The computer-readable medium shown in the present invention may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium that is not a computer-readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber-optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, for example, described as: a processor including a receiving module, a control module, and a scheduling module. In some cases, the names of these modules do not constitute a limitation on the modules themselves; for example, the scheduling module may also be described as "a module that schedules a task to be processed to one or more nodes in a target cluster".
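As a non-authoritative illustration only, the receiving / control / scheduling module split described above could be sketched as follows; all names, signatures, and data shapes here are assumptions for the sketch, not part of the claims:

```python
from typing import Callable, Optional

class TaskSchedulingServer:
    """A minimal sketch of the receiving / control / scheduling module
    split; each module is injected as a callable for illustration."""

    def __init__(self,
                 receiving: Callable[[dict], dict],
                 control: Callable[[dict], Optional[str]],
                 scheduling: Callable[[dict, str], None]) -> None:
        self.receiving = receiving    # accepts a task to be processed
        self.control = control        # checks quotas, picks a target cluster
        self.scheduling = scheduling  # dispatches the task to cluster nodes

    def handle(self, raw_task: dict) -> Optional[str]:
        task = self.receiving(raw_task)
        target = self.control(task)        # None if the allocation condition fails
        if target is not None:
            self.scheduling(task, target)  # "schedules a task to ... nodes"
        return target
```

The control module acts as a gate: the scheduling module runs only when a target cluster was actually determined.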
As another aspect, the present application also provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to: receive a task to be processed, wherein the task to be processed includes user information applying for task processing and resource information required for task processing; count resource usage of the user information across a plurality of clusters; judge whether the user information satisfies a resource allocation condition based on the resource information, a preset resource quota corresponding to the user information, and the counted resource usage, and if so, determine a target cluster for the task to be processed from the plurality of clusters; and schedule the task to be processed to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process the task to be processed.
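The steps the program causes the device to perform can be sketched as below. The dict-based quota and usage tables and the two-resource (CPU, memory) simplification are assumptions made for the sketch; the application itself does not fix these data shapes:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Task:
    user: str      # user information applying for task processing
    cpu: float     # resource information required for task processing
    memory: float

@dataclass
class Cluster:
    name: str
    free_cpu: float
    free_memory: float

def schedule(task: Task, clusters: list,
             quotas: dict, usage: dict) -> Optional[Cluster]:
    """Return a target cluster for the task, or None if the user's
    resource allocation condition fails or no cluster fits."""
    # Count the user's resource usage across the clusters.
    used_cpu, used_mem = usage.get(task.user, (0.0, 0.0))
    # Allocation condition: occupation (requested + current usage)
    # must not exceed the user's preset resource quota.
    quota_cpu, quota_mem = quotas[task.user]
    if used_cpu + task.cpu > quota_cpu or used_mem + task.memory > quota_mem:
        return None
    # Determine a target cluster with enough free capacity; the
    # dispatch to individual nodes is omitted in this sketch.
    for cluster in clusters:
        if cluster.free_cpu >= task.cpu and cluster.free_memory >= task.memory:
            return cluster
    return None
```

Note that the quota check happens before any cluster is considered, matching the "only if the allocation condition is satisfied" ordering of the embodiments.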
According to the technical solution of the embodiments of the present application, the resource usage of the user information across the plurality of clusters is counted, and whether the user information satisfies the resource allocation condition is judged based on the resource information, the preset resource quota corresponding to the user information, and the counted resource usage. Only when the resource allocation condition is satisfied is a target cluster determined for the task to be processed from the plurality of clusters and the task scheduled to one or more nodes in the target cluster, so that those nodes process it. Moreover, because the resource allocation condition is tied to user-related resource data (the requested resource information, the preset resource quota corresponding to the user information, and the counted resource usage), the amount of resources each user may schedule can be effectively regulated through the resource allocation condition, so that resource scheduling among users remains relatively balanced.
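For the allocation condition itself, one plausible reading, following the per-index (CPU / memory / GPU) breakdown recited in the claims, is that each usage index is checked against its own index quota; the dict key names below are assumptions for illustration:

```python
def satisfies_allocation(requested: dict, current: dict, quota: dict) -> bool:
    """Resource allocation condition: for every usage index (cpu,
    memory, gpu), the occupation amount (requested + current usage)
    must not exceed the user's preset index quota."""
    return all(requested[k] + current[k] <= quota[k]
               for k in ("cpu", "memory", "gpu"))
```

A single over-quota index is enough to reject the allocation, which prevents one user from exhausting, say, the GPUs while staying under the CPU quota.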
In addition, the technical solution provided by the present application determines a target cluster for the task to be processed from among the plurality of clusters and schedules the task to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process it, thereby realizing task scheduling across the plurality of clusters.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.
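The user-score mechanism recited in the claims (a score computed from the ratio of each resource usage index to its index quota, used to order a user queue) admits a simple sketch. Combining the ratios by taking the maximum, and treating a lower score (less of the quota consumed) as higher priority, are both assumptions; the claims only state that the score is based on the calculated ratios:

```python
def user_score(usage: dict, quota: dict) -> float:
    """Per-user score: ratio of each resource usage index to its
    index quota, combined by taking the maximum ratio (assumption)."""
    return max(usage[k] / quota[k] for k in quota)

def next_user(queue: list, usages: dict, quotas: dict) -> str:
    # The user who has consumed the least of their quota is given the
    # highest priority and is allocated resources first (assumption).
    return min(queue, key=lambda u: user_score(usages[u], quotas[u]))
```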

Claims (18)

1. A method for task scheduling, comprising:
receiving a task to be processed, wherein the task to be processed includes: user information applying for task processing and resource information required for task processing;
counting resource usage of the user information across a plurality of clusters;
judging whether the user information satisfies a resource allocation condition based on the resource information, a preset resource quota corresponding to the user information, and the counted resource usage, and if so, determining a target cluster for the task to be processed from the plurality of clusters;
and scheduling the task to be processed to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process the task to be processed.
2. The task scheduling method according to claim 1, further comprising:
placing the user information into a preset user queue;
if the user queue includes a plurality of pieces of user information, calculating a user score corresponding to the user information;
determining a priority of the user information in the user queue based on the user score corresponding to the user information and user scores corresponding to other user information in the user queue;
allocating resources to the user information when the priority of the user information in the user queue is the highest;
and executing, based on the resources allocated to the user information, the step of determining a target cluster for the task to be processed from a plurality of clusters.
3. The task scheduling method according to claim 1, wherein the determining whether the user information satisfies a resource allocation condition includes:
calculating a resource occupation amount corresponding to the user information according to the resource information and the counted resource usage;
and if the resource occupation amount is not greater than the resource quota, determining that the user information satisfies the resource allocation condition.
4. A task scheduling method according to claim 1 or 3, characterized in that,
the method further comprises: distributing the task to be processed to a task queue corresponding to the user information;
and after judging that the user information satisfies the resource allocation condition, the method further comprises:
judging, according to determined task basic information of the task to be processed and task basic information of other tasks in the task queue, whether the task to be processed satisfies a task processing condition, and if so, executing the step of determining a target cluster for the task to be processed from a plurality of clusters.
5. The task scheduling method of claim 2, wherein,
the resource usage comprises a plurality of resource usage indexes;
the resource quota comprises an index quota corresponding to each of the resource usage indexes;
and the calculating the user score corresponding to the user information comprises:
calculating a ratio of each resource usage index to its corresponding index quota;
and determining the user score corresponding to the user information based on the calculated ratios.
6. The task scheduling method according to claim 3, wherein,
the resource information includes: a pre-occupied CPU amount, a pre-occupied memory amount, and a pre-occupied GPU amount of the task to be processed;
the resource usage includes: current usage amounts of the CPU, the memory, and the GPU by other tasks of the user information;
and the calculating the resource occupation amount corresponding to the user information comprises:
calculating a CPU occupation amount corresponding to the user information by using the pre-occupied CPU amount and the current CPU usage amount;
calculating a memory occupation amount corresponding to the user information by using the pre-occupied memory amount and the current memory usage amount;
and calculating a GPU occupation amount corresponding to the user information by using the pre-occupied GPU amount and the current GPU usage amount.
7. The task scheduling method according to claim 4, wherein the task processing condition comprises:
the task to be processed having the highest priority in the task queue, with one or more clusters satisfying the resource information;
or,
the waiting time of the task to be processed reaching a preset time threshold, with one or more clusters satisfying the resource information.
8. The task scheduling method according to claim 4, further comprising:
if the task queue comprises a plurality of tasks, calculating a task score corresponding to the task to be processed;
determining a priority of the task to be processed in the task queue based on the task score and task scores corresponding to other tasks in the task queue;
and executing the step of judging whether the task to be processed satisfies the task processing condition when the priority of the task to be processed in the task queue is the highest.
9. The task scheduling method of claim 6, wherein,
the resource quota includes: a CPU quota, a memory quota, and a GPU quota allocated to the user information;
and the method further comprises:
comparing the CPU occupation amount, the memory occupation amount, and the GPU occupation amount with the CPU quota, the memory quota, and the GPU quota, respectively;
and if the comparison result is that the CPU occupation amount is not greater than the CPU quota, the memory occupation amount is not greater than the memory quota, and the GPU occupation amount is not greater than the GPU quota, determining that the resource occupation amount is not greater than the resource quota.
10. A task scheduling method according to any one of claims 1 to 3 and 5 to 9,
wherein, in a case where the task to be processed includes a plurality of subtasks,
the determining a target cluster for the task to be processed comprises:
searching for one or more clusters capable of processing the plurality of subtasks;
and selecting the target cluster from the one or more found clusters.
11. The task scheduling method of claim 10, wherein the scheduling the task to be processed to one or more nodes in the target cluster comprises:
and scheduling the plurality of subtasks to one or more nodes in the target cluster according to the preset priorities of the plurality of subtasks.
12. The task scheduling method of claim 11, wherein the scheduling the plurality of subtasks to one or more nodes in the target cluster comprises:
for each of the subtasks, performing:
screening, from the target cluster, a plurality of nodes having the capability of processing the subtask;
calculating a node score for each of the screened nodes;
and selecting the node with the highest node score to process the subtask.
13. The task scheduling method according to any one of claims 1 to 3, 5 to 9, 11, and 12, characterized by further comprising:
in a case where the resource usage rate of the user information satisfies an adjustment condition within a set period of time,
directly adjusting the resource quota; or,
sending adjustment information to a communication address included in the user information, and adjusting the resource quota in response to an adjustment confirmation fed back by the user.
14. The task scheduling method according to any one of claims 1 to 3, 5 to 9, 11 and 12,
and if the user information does not meet the resource allocation condition, placing the task to be processed into a waiting queue.
15. A task scheduling server, comprising:
a receiving module configured to receive a task to be processed, wherein the task to be processed includes: user information applying for task processing and resource information required for task processing;
a control module configured to count resource usage of the user information across a plurality of clusters, judge whether the user information satisfies a resource allocation condition based on the resource information, a preset resource quota corresponding to the user information, and the counted resource usage, and if so, determine a target cluster for the task to be processed from the plurality of clusters;
and a scheduling module configured to schedule the task to be processed to one or more nodes in the target cluster, so that the one or more nodes in the target cluster process the task to be processed.
16. A task scheduling system, comprising: a plurality of clusters and the task scheduling server of claim 15, wherein,
wherein a node in each cluster is configured to process a task to be processed after receiving the task to be processed scheduled by the task scheduling server.
17. An electronic device, comprising:
one or more processors;
a storage device storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-14.
18. A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1-14.
CN202210119120.8A 2022-02-08 2022-02-08 Task scheduling method, device and system Pending CN116610422A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210119120.8A CN116610422A (en) 2022-02-08 2022-02-08 Task scheduling method, device and system


Publications (1)

Publication Number Publication Date
CN116610422A 2023-08-18

Family

ID=87678749

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210119120.8A Pending CN116610422A (en) 2022-02-08 2022-02-08 Task scheduling method, device and system

Country Status (1)

Country Link
CN (1) CN116610422A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination