WO2022247105A1 - Task scheduling method and apparatus, computer device and storage medium

Info

Publication number: WO2022247105A1
Application number: PCT/CN2021/124136
Authority: WO
Inventors: 叶志晟; 孙鹏; 吴保东; 颜深根
Original assignee: 上海商汤科技开发有限公司
Priority date: 2021-05-27
Filing date: 2021-10-15
Publication date: 2022-12-01
Also published as: CN113238848A; TW202246977A

Abstract

The present disclosure provides a task scheduling method and apparatus, a computer device and a storage medium. The method comprises: acquiring task information of a task to be processed; determining the task type of the task on the basis of the task information; determining a target node server for executing the task and a target resource on the target node server on the basis of the task type; and allocating the target node server and the target resource on the target node server to execute the task. Alternatively, the method comprises: determining the task type of a task to be processed; determining a target node server for executing the task and a target resource on the target node server on the basis of the task type of said task; and sending information of the target node server and information of the target resource to a scheduler, such that the scheduler allocates the target node server and the target resource to said task.

Description

A task scheduling method, device, computer equipment and storage medium

This application claims priority to Chinese patent application No. 202110586526.2, filed on May 27, 2021 and entitled "A task scheduling method, device, computer equipment, and storage medium", the entire contents of which are incorporated herein by reference.

Technical field

The present disclosure relates to the field of computer technology, in particular, to a task scheduling method, device, computer equipment and storage medium.

Background art

To meet the differing resource requirements of different deep learning tasks, task scheduling systems play an increasingly important role. In the prior art, the task scheduling methods applied in such systems mostly schedule resources for deep learning tasks based on the tasks' resource requirements, with the goal of allocating as few node servers as possible.

Summary of the invention

Embodiments of the present disclosure at least provide a task scheduling method, device, computer equipment, and storage medium.

In a first aspect, an embodiment of the present disclosure provides a task scheduling method, including:

acquiring task information of a task to be processed;

determining a task type of the task based on the task information;

determining, based on the task type, a target node server for executing the task and a target resource on the target node server;

allocating the target node server and the target resource on the target node server to execute the task.

The task information can reflect the resource demand and the traffic demand information of the task to be processed, so the task type matching the task information can be determined from it. The target node server and target resources are determined differently for different task types. Once the task type is determined, the resource determination method matching that task type can be selected and used to assign target resources to the task to be processed. This improves the rationality of the allocated target resources, and executing the task with those resources improves the speed and efficiency of task execution.
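
The four steps above can be read as a small dispatch pipeline. The following is a minimal sketch only: `TaskInfo`, `classify`, `first_fit`, and the `STRATEGIES` table are hypothetical names that do not appear in the disclosure, the classifier is a stub, and every task type is mapped to the same placeholder strategy purely to show where a type-specific method would plug in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class TaskInfo:
    resource_demand: int  # number of resources the task asks for
    comm_heavy: bool      # True if communication demand exceeds computation demand


# A placement is (node name, resource indices on that node). Hypothetical shape.
Placement = Tuple[str, List[int]]
Strategy = Callable[[TaskInfo, Dict[str, List[int]]], Optional[Placement]]


def first_fit(task: TaskInfo, idle: Dict[str, List[int]]) -> Optional[Placement]:
    """Placeholder strategy: first node with enough idle resources."""
    for node, free in idle.items():
        if len(free) >= task.resource_demand:
            return node, free[: task.resource_demand]
    return None


def classify(task: TaskInfo) -> str:
    """Stub classifier; the disclosure's rules also use the node capacity."""
    if task.resource_demand == 1:
        return "single_resource"
    return "single_node_comm" if task.comm_heavy else "single_node_compute"


# One resource determination method per task type (all placeholders here).
STRATEGIES: Dict[str, Strategy] = {
    "single_resource": first_fit,
    "single_node_comm": first_fit,
    "single_node_compute": first_fit,
}


def schedule(task: TaskInfo, idle: Dict[str, List[int]]):
    task_type = classify(task)              # determine the task type
    strategy = STRATEGIES[task_type]        # pick the method matching the type
    return task_type, strategy(task, idle)  # caller allocates the result
```

Keeping the strategies in a table means a new task type only needs a new entry, not a change to `schedule`.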

In a possible implementation manner, the determining the task type of the task according to the task information includes:

According to the task information, determine the resource requirement of the task and the traffic requirement information of the task;

Determine the number of resources in the node server;

The task type of the task is determined based on the determined resource demand, communication traffic demand information of the task, and resource quantity in the node server.

In this way, the traffic demand information reflects the task's demands for communication traffic and computation, the resource demand reflects the number of idle resources required to execute the task, and the number of resources in the node server reflects the maximum number of idle resources in one node server. The task type of the task to be processed can therefore be accurately determined based on the resource demand, the traffic demand information of the task, and the number of resources in the node server.

In a possible implementation manner, the determining the target node server for executing the task and the target resource on the target node server based on the task type includes:

A target node server for executing the task and target resources on the target node server are determined based on the task type of the task, the resource requirement of the task, and the idle resources on the node server.

In this way, tasks of different task types have different resource allocation priorities. For example, tasks of the single-node multi-resource communication type, which require high communication traffic, have a higher resource allocation priority, so their target node server can be selected from the node servers whose number of idle resources is greater than the resource demand. Tasks of the single-node multi-resource computing type, which involve a large amount of computation, have a lower resource allocation priority than the communication type, so their target node server can be selected from the node servers whose number of idle resources is smaller than the resource demand. Assigning the target node server and its target resources according to task type allocates the idle resources in the node servers rationally among different tasks, thereby improving the speed and efficiency of task scheduling and execution.

In a possible implementation manner, the determining, based on the task type of the task, the resource demand of the task, and the idle resources on the node server, the target node server for executing the task and the target resources on the target node server includes:

determining at least one initial node server matching the task type based on the task type of the task;

Based on the resource requirements of the task and idle resources on the initial node server, determine a target node server for executing the task and target resources on the target node server from the at least one initial node server.

In this way, the initial node servers that can execute the task to be processed are first selected based on the task type, and the final target node server and target resource for executing the task are then selected according to the resource demand of the task and the idle resources on the initial node servers. The selected target node server and target resource can therefore not only execute the task to be processed but also execute it faster, improving the speed and efficiency of task execution.

In a possible implementation manner, the task type includes a single-node multi-resource communication type;

The determining the target node server for performing the task from the at least one initial node server includes:

Based on the number of idle resources on the initial node server in the at least one initial node server, select the initial node server whose number of idle resources is greater than or equal to the resource demand as the first node server;

A target node server for executing the task is determined based on idle resources on the first node server and resource requirements of the task.

In this way, because the single-node multi-resource communication type has a high demand for communication traffic, the first node server is selected from those whose number of idle resources is greater than or equal to the resource demand, instead of waiting for node servers whose number of idle resources is less than the resource demand to release resources. Initial node servers with a large number of idle resources are thus preferentially allocated to this type of task to meet its traffic demand, and selecting the target node server for executing the task from the first node servers effectively improves the scheduling and execution efficiency of this type of task.

In a possible implementation manner, the determining the target node server for executing the task based on the idle resources on the first node server and the resource requirement of the task includes:

determining at least one preset topology structure corresponding to the resource requirement;

Selecting, among the first node servers, the first node server having any one of the preset topology structures and the least number of idle resources as the target node server.

In this way, resources with a preset topology have higher communication quality, and using a target node server with a preset topology can further increase the speed of executing single-node multi-resource communication tasks. In addition, selecting the first node server with the fewest idle resources reduces the splitting of the original topology of idle resources in first node servers with many idle resources, which protects the optional topology structures so that they can be reasonably applied to other tasks, such as other tasks of the single-node multi-resource communication type.

In a possible implementation manner, determining the target resource on the target node server includes:

Taking the preset topology in the target node server as the target topology;

Taking idle resources constituting the target topology in the target node server as the target resources.

In this way, it is possible to use the target resources with the target topology to execute single-node multi-resource communication tasks, and improve the scheduling and execution efficiency of this type of tasks.
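
The selection rule for the single-node multi-resource communication type (keep nodes with enough idle resources, require a preset topology, prefer the node with the fewest idle resources) might be sketched as follows. This is an illustration under assumptions: idle resources are modelled as sets of integer indices, a preset topology is modelled as a frozenset of indices, and a node "has" a preset topology when all of its indices are idle. None of these representations, nor the function name, come from the disclosure.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple


def pick_comm_node(
    idle: Dict[str, Set[int]],
    demand: int,
    presets: List[FrozenSet[int]],
) -> Optional[Tuple[str, FrozenSet[int]]]:
    """Return (target node, target resources) or None if nothing fits."""
    # First node servers: idle resource count >= resource demand.
    first = {node: free for node, free in idle.items() if len(free) >= demand}

    best: Optional[Tuple[str, FrozenSet[int]]] = None
    for node, free in first.items():
        # Target topology: a preset topology of the demanded size that is
        # made up entirely of idle resources on this node.
        topo = next((p for p in presets if len(p) == demand and p <= free), None)
        if topo is None:
            continue
        # Prefer the node with the fewest idle resources, so that nodes with
        # many idle resources keep their topologies intact for later tasks.
        if best is None or len(free) < len(idle[best[0]]):
            best = (node, topo)
    return best
```

The returned frozenset plays the role of the target resources: the idle resources that make up the target topology on the chosen node.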

In a possible implementation manner, the task type includes a single-node multi-resource computing type;

Based on the number of idle resources on the initial node server in the at least one initial node server, screening the second node server whose number of idle resources is less than or equal to the resource requirement;

A target node server for executing the task is determined based on idle resources on the second node server and resource requirements of the task.

In this way, the single-node multi-resource computing type has a high demand for computation and a small demand for communication. Selecting a second node server whose number of idle resources is smaller than or equal to the resource demand can therefore still meet the communication requirements of this type of task and improve its scheduling and execution efficiency to a certain extent. Furthermore, this type of task does not occupy the initial node servers whose number of idle resources is greater than the resource demand, so those node servers can be reserved for tasks of the single-node multi-resource communication type, which have a high resource demand. As a result, the scheduling and execution efficiency of single-node multi-resource communication tasks is improved along with that of single-node multi-resource computing tasks. This embodiment also reduces the probability of splitting the original topology of idle resources in initial node servers with a large number of idle resources, protecting the optional topology structures so that they can be reasonably applied to tasks of the single-node multi-resource communication type and other types.

In a possible implementation manner, the determining the target node server for executing the task based on the idle resources on the second node server and the resource requirement of the task includes:

Selecting, among the second node servers, any second node server having the preset topology as the target node server.

In this way, resources with a preset topology have higher communication quality, and using the target node server with a preset topology can improve the scheduling and execution efficiency of single-node multi-resource computing tasks.

In a possible implementation manner, the determining the target node server for executing the task based on the idle resources on the second node server and the resource requirement of the task further includes:

In the case that no second node server having any one of the preset topology structures is found among the second node servers, after a preset period of time elapses or the idle resources on at least one node server change, reacquiring the number of idle resources on the node servers and returning to the step of screening, based on the number of idle resources on the initial node server among the at least one initial node server, the second node server whose number of idle resources is less than or equal to the resource requirement.

In this way, by reacquiring the number of idle resources on the node servers, the required idle resources with a preset topology can be allocated to tasks in a timely manner, so tasks of the single-node multi-resource computing type can be executed in a timely manner. In addition, this reduces the probability of splitting the original topology of idle resources in initial node servers with a large number of idle resources, protecting the optional topology structures so that they can be reasonably applied to tasks of the single-node multi-resource communication type and other types.
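
The computing-type rule, including the "wait and look again" step, could be sketched as below. The same assumed model is used as elsewhere in these sketches (idle resources as index sets, a preset topology as a frozenset that must be fully idle); under that model a preset topology of the demanded size can only be found on a node with exactly that many idle resources, which is this sketch's reading rather than a statement from the disclosure. The `max_attempts` bound and the `get_idle` callback are also assumptions; the text only says to look again after a preset period or when idle resources change.

```python
import time
from typing import Callable, Dict, FrozenSet, List, Optional, Set, Tuple

Idle = Dict[str, Set[int]]
Choice = Optional[Tuple[str, FrozenSet[int]]]


def pick_compute_node(idle: Idle, demand: int, presets: List[FrozenSet[int]]) -> Choice:
    """Pick any second node server that has a preset topology."""
    for node, free in idle.items():
        # Second node servers: idle count <= demand. Nodes with more idle
        # resources are left for communication-type tasks.
        if len(free) > demand:
            continue
        for p in presets:
            if len(p) == demand and p <= free:
                return node, p
    return None


def pick_with_retry(
    get_idle: Callable[[], Idle],
    demand: int,
    presets: List[FrozenSet[int]],
    wait_s: float = 1.0,
    max_attempts: int = 3,  # assumed bound, not from the disclosure
) -> Choice:
    """Re-read the idle resources after a preset period and try again."""
    for attempt in range(max_attempts):
        choice = pick_compute_node(get_idle(), demand, presets)
        if choice is not None:
            return choice
        if attempt + 1 < max_attempts:
            time.sleep(wait_s)
    return None
```

A production scheduler would more likely react to a resource-change event than sleep in a loop; polling is used here only to keep the sketch short.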

In a possible implementation manner, the task type includes a multi-node type;

Based on the resource requirement of the task, from the at least one initial node server, determine the maximum number of third node servers required to perform the task; wherein, the third node server is an empty initial node server;

The third node server with the maximum number is used as the target node server.

In this way, the idle resources in an unloaded (empty) third node server have an optional topology structure. Using the maximum number of third node servers as target node servers ensures that the target node servers used to perform multi-node tasks have optional topology structures, which improves the efficiency of executing multi-node type tasks.

In a possible implementation manner, the determining the target node server for executing the task from the at least one initial node server further includes:

When the total amount of resources on the largest number of third node servers is less than the resource requirement, determine the difference between the resource total and the resource requirement;

Based on the number of idle resources on the initial node server, screening a fourth node server whose number of idle resources is less than or equal to the difference;

Based on the free resources on the fourth node server and the difference, determine a target node server for executing the task.

In this way, the difference between the total amount of resources and the resource demand reflects how many target resources the task still needs from other initial node servers, beyond the idle resources on the empty target node servers. Selecting the fourth node server based on this difference allocates the idle resources required by the task reasonably, thereby improving the speed of executing the task.
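
For the multi-node type, the combination of whole empty nodes plus a remainder taken from "fourth" node servers might look like the sketch below. Several points are assumptions made only for illustration: all node servers have the same capacity, the maximum number of third node servers is taken to be the integer quotient of demand and capacity, and the remainder is covered greedily, largest candidate first. The disclosure states only that fourth node servers have an idle count less than or equal to the difference.

```python
from typing import Dict, Optional


def pick_multi_node(
    idle_count: Dict[str, int],
    capacity: int,
    demand: int,
) -> Optional[Dict[str, int]]:
    """Return {node: resources taken} or None if the demand cannot be met."""
    # Third node servers: completely empty initial node servers.
    empty = [node for node, count in idle_count.items() if count == capacity]
    full_needed = demand // capacity  # whole empty nodes the task can fill
    if len(empty) < full_needed:
        return None
    plan = {node: capacity for node in empty[:full_needed]}

    # Difference between the resource demand and what the empty nodes supply.
    remainder = demand - full_needed * capacity
    if remainder == 0:
        return plan

    # Fourth node servers: idle count <= the difference, not already chosen.
    fourth = sorted(
        ((count, node) for node, count in idle_count.items()
         if node not in plan and 0 < count <= remainder),
        reverse=True,
    )
    for count, node in fourth:
        take = min(count, remainder)
        plan[node] = take
        remainder -= take
        if remainder == 0:
            return plan
    return None
```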

In a possible implementation manner, the task type includes a single resource type;

Based on the number of idle resources on the initial node server in the at least one initial node server, the initial node server whose idle resource quantity is equal to the resource requirement is used as the target node server.

In this way, a task of a single resource type can be executed only by one idle resource. Therefore, the task can be executed quickly and efficiently by using the target node server whose number of idle resources is equal to the resource demand.
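
The single-resource rule reduces to an exact-match lookup. A minimal sketch, assuming idle resources are tracked as a per-node count (the function name is hypothetical):

```python
from typing import Dict, Optional


def pick_single_resource_node(idle_count: Dict[str, int], demand: int = 1) -> Optional[str]:
    """Return the first node whose idle count equals the demand, else None."""
    for node, count in idle_count.items():
        if count == demand:
            return node
    return None
```

Choosing an exact match rather than any node with spare capacity leaves larger groups of idle resources untouched for multi-resource tasks.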

In a possible implementation manner, it also includes a step of determining a preset topology:

For the idle resources in any node server to be matched, determine the topology structure corresponding to any number of idle resources in the node server to be matched;

Determine the communication waiting time of the topology based on the communication medium between idle resources;

Based on the communication waiting time, a preset topology corresponding to any number of idle resources in the node server to be matched is determined.

In this way, the communication waiting time of a topology represents the waiting time incurred when tasks are executed using that topology. Preset topologies with high task execution efficiency can be screened out by communication waiting time, so executing tasks on idle resources that have a preset topology improves the efficiency of task execution.

In a possible implementation manner, the determining a preset topology corresponding to any number of idle resources in the node server to be matched based on the communication waiting time includes:

Sorting the topological structures according to the order of the communication waiting time from low to high;

The topological structure whose sort order is smaller than the preset order is used as the preset topology structure corresponding to the node server to be matched.

In this way, using the preset order to filter the preset topology can ensure that the communication waiting time of the filtered preset topology is relatively short, and then, using the preset topology to execute tasks can realize task execution at a faster speed.

In a possible implementation manner, the determining a preset topology corresponding to any number of idle resources in the node server to be matched based on the communication waiting time includes:

For any number of idle resources, the topology structure whose communication waiting time is shorter than the preset waiting threshold is taken as the preset topology structure corresponding to the number of idle resources in the node server to be matched.

In this way, filtering by the preset waiting threshold ensures that the communication waiting time of each selected preset topology is less than the preset waiting threshold, so using the preset topology to execute tasks reasonably reduces the communication waiting time required for task execution.
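
The steps for determining preset topologies (enumerate candidate topologies for a given number of idle resources, score each by communication waiting time derived from the communication medium, then keep the best by rank or by threshold) can be sketched as follows. The medium names and their waiting-time values in `MEDIUM_WAIT` are invented placeholders, and scoring a topology by its slowest pairwise link is an assumed metric; the disclosure does not specify either.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

# Hypothetical waiting time per communication medium (arbitrary units).
MEDIUM_WAIT = {"fast_link": 1.0, "bus": 5.0, "cross_socket": 20.0}

Links = Dict[FrozenSet[int], str]  # unordered resource pair -> medium name


def wait_time(group: Tuple[int, ...], links: Links) -> float:
    """Assumed metric: the slowest pairwise link inside the group."""
    return max(
        (MEDIUM_WAIT[links[frozenset(pair)]] for pair in combinations(group, 2)),
        default=0.0,  # a single resource has no links to wait on
    )


def preset_topologies(
    idle: Iterable[int],
    k: int,
    links: Links,
    top_n: Optional[int] = None,
    max_wait: Optional[float] = None,
) -> List[Tuple[int, ...]]:
    """Topologies of k idle resources, filtered by rank and/or threshold."""
    # Sort candidate topologies by communication waiting time, low to high.
    scored = sorted((wait_time(g, links), g) for g in combinations(sorted(idle), k))
    if top_n is not None:        # variant 1: keep those ranked before a preset order
        scored = scored[:top_n]
    if max_wait is not None:     # variant 2: keep those below a preset waiting threshold
        scored = [(w, g) for w, g in scored if w < max_wait]
    return [g for _, g in scored]
```

Enumerating every combination is only practical for the small resource counts of a single node server; a larger system would need a cheaper candidate generator.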

Obtain idle resources on the node server;

Determine a target node server for executing the task and target resources on the target node server based on idle resources on the node server, resource requirements of the task, and traffic demand information of the task;

The task type of the task is determined based on the target node server executing the task and the target resource on the target node server.

In this way, the target node server and target resource are determined at the same time as the task type, so the target node server and target resource corresponding to the task type can be assigned directly to the task to be processed. This omits the time for selecting the initial node server and improves the speed and efficiency of resource scheduling, thereby improving the speed and efficiency of task execution.

In a possible implementation manner, the method also includes:

When the obtained tasks to be processed include multiple tasks, based on the task type of each task, determine the resource allocation priority of each task;

Based on the resource allocation priority of each task, a target node server for executing the task and target resources on the target node server are determined.

In this way, resources are allocated to different tasks based on resource allocation priorities, so that idle resources in the node server can be reasonably allocated to different tasks.

In a possible implementation manner, the task type includes single-node multi-resource communication type, single-node multi-resource computing type, multi-node type, and single-resource type;

The determining the resource allocation priority of each task based on the task type of each task includes:

Setting the resource allocation priority of the task of the single-node multi-resource communication type to the first priority;

Setting the resource allocation priority of the task of the single-node multi-resource computing type to the second priority;

Setting the resource allocation priority of the task of the multi-node type to the third priority;

Setting the resource allocation priority of the single-resource type task to the fourth priority.

In this way, tasks of the single-node multi-resource communication type have a high demand for communication traffic; setting them to the first priority meets the required communication traffic and increases the speed of executing tasks of this type. Tasks of the single-node multi-resource computing type have a high demand for computation and a relatively small demand for communication traffic; setting them to the second priority meets the required amount of computation and reduces the probability of splitting the original topology of idle resources in node servers with a large number of idle resources. Multi-node tasks require empty node servers, and there are relatively few tasks of this type; setting them to the third priority ensures that resources are allocated to single-node multi-resource communication tasks and single-node multi-resource computing tasks in a timely manner. Single-resource tasks need only one resource to execute; setting them to the fourth priority does not affect the execution speed of this task type while still ensuring that resources are allocated in a timely manner to single-node multi-resource communication tasks and single-node multi-resource computing tasks.
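
The four priority levels translate directly into a sort key. In this sketch the type names are illustrative string labels and tasks are modelled as `(task_id, task_type)` pairs; both are assumptions made for the example.

```python
from typing import List, Tuple

# Lower number = allocated earlier, following the order described above.
PRIORITY = {
    "single_node_multi_resource_communication": 1,
    "single_node_multi_resource_computing": 2,
    "multi_node": 3,
    "single_resource": 4,
}


def order_by_priority(tasks: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Order pending tasks by the resource allocation priority of their type.

    sorted() is stable, so tasks of the same type keep their arrival order.
    """
    return sorted(tasks, key=lambda task: PRIORITY[task[1]])
```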

In a possible implementation manner, the task information includes the resource requirement, and the method further includes the step of obtaining the resource requirement:

Obtain the configuration file of the task to be processed;

Obtain the resource requirement of the task to be processed from the configuration file.

In this way, the resource requirement is obtained directly from the configuration file, which improves the speed of determining the resource requirement.
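
Reading the resource requirement from a configuration file might look like the sketch below. The disclosure does not specify a file format or field name, so the JSON encoding and the `resource_demand` key are assumptions chosen only for illustration.

```python
import json


def resource_demand_from_config(text: str) -> int:
    """Parse a (hypothetical) JSON task configuration and return its demand."""
    config = json.loads(text)
    demand = int(config["resource_demand"])  # assumed key name
    if demand < 1:
        raise ValueError("resource demand must be at least 1")
    return demand
```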

In a possible implementation manner, the task information includes the traffic demand information, and the method further includes the step of acquiring the traffic demand information:

Acquiring calculation amount information and communication amount information of the task to be processed;

Based on the calculation amount information and communication amount information, the communication amount requirement information is determined.

In this way, the traffic demand information is determined based on the acquired calculation amount information and communication amount information, which can ensure the accuracy of the determined traffic demand information.
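
One way to turn calculation amount and communication amount into a comparable traffic demand indicator is to convert both into estimated time per training step. This is an assumed approach, not the disclosure's formula: the units (floating-point operations and bytes per step) and the default device and link throughput figures are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficDemand:
    compute_seconds: float  # estimated computation time per step
    comm_seconds: float     # estimated communication time per step

    @property
    def comm_heavy(self) -> bool:
        """True when communication demand is higher than computation demand."""
        return self.comm_seconds > self.compute_seconds


def traffic_demand(
    flops_per_step: float,
    bytes_per_step: float,
    device_flops: float = 1e13,      # assumed device throughput
    link_bytes_per_s: float = 1e10,  # assumed link bandwidth
) -> TrafficDemand:
    return TrafficDemand(
        compute_seconds=flops_per_step / device_flops,
        comm_seconds=bytes_per_step / link_bytes_per_s,
    )
```

The `comm_heavy` flag is the piece of information the task type determination needs: whether the communication traffic demand or the computation demand dominates.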

In a possible implementation manner, the acquiring the calculation amount information and the communication amount information of the task to be processed includes:

Based on the obtained configuration file of the task to be processed or the obtained task request of the task to be processed, determine the calculation amount information and communication amount information of the task to be processed; wherein the task request includes the calculation amount information and the communication amount information.

In this way, the calculation amount information and the communication amount information are determined based on the analysis of the configuration file or the task request, and the accuracy of the determined calculation amount information and the communication amount information can be improved.

Obtain the configuration file of the task to be processed;

Acquiring traffic demand information of the task to be processed from the configuration file.

In this way, the communication traffic requirement information is obtained directly from the configuration file, the step of calculating the information is omitted, and the speed of determining the communication traffic requirement information is improved.

In a possible implementation manner, the determining the task type of the task based on the determined resource demand, task traffic demand information, and resource quantity in the node server includes:

Determine the number of target node servers corresponding to the task to be processed based on the resource demand and the number of resources in the node server;

The task type of the task to be processed is determined based on the number of target node servers corresponding to the task to be processed and the traffic demand information.

In this way, the task type of the task can be accurately determined based on the resource demand, the number of resources in the node server, and the traffic demand information.

The determining the task type of the task to be processed based on the number of target node servers corresponding to the task to be processed and the traffic demand information includes:

When the number of target node servers corresponding to the task to be processed is equal to 1, the number of target resources corresponding to the task to be processed is greater than 1, and the traffic demand information indicates that the communication traffic demand of the task to be processed is higher than its computation demand, determining that the task type of the task to be processed is the single-node multi-resource communication type;

When the number of target node servers corresponding to the task to be processed is equal to 1, the number of target resources corresponding to the task to be processed is greater than 1, and the traffic demand information indicates that the computation demand of the task to be processed is higher than its communication traffic demand, determining that the task type of the task to be processed is the single-node multi-resource computing type;

When the number of target node servers corresponding to the task to be processed is greater than 1, determining that the task type of the task to be processed is the multi-node type;

When the number of target resources corresponding to the task to be processed is equal to 1, determining that the task type of the task to be processed is the single resource type.

In this way, the task type of the task can be accurately determined.
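
The four classification rules can be written as a single function. The rules themselves follow the text above; what is assumed here is how the number of target node servers is derived (the demand divided by the per-node resource count, rounded up), since the disclosure only says it is based on those two quantities. The string labels are illustrative.

```python
import math


def classify_task(resource_demand: int, comm_heavy: bool, node_capacity: int) -> str:
    """Map (demand, traffic indicator, per-node capacity) to a task type."""
    if resource_demand < 1 or node_capacity < 1:
        raise ValueError("demand and capacity must be positive")
    if resource_demand == 1:
        return "single_resource"
    # Assumed derivation of the number of target node servers.
    nodes_needed = math.ceil(resource_demand / node_capacity)
    if nodes_needed > 1:
        return "multi_node"
    if comm_heavy:
        return "single_node_multi_resource_communication"
    return "single_node_multi_resource_computing"
```

For example, with 8 resources per node server, a 4-resource communication-dominated task is classified as the communication type, while a 16-resource task is multi-node regardless of its traffic profile.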

In the second aspect, the embodiment of the present disclosure also provides a task scheduling method applied to a client, including:

Determine the task type of the pending task;

Based on the task type of the task, determine a target node server for executing the task and target resources on the target node server;

Sending the information of the target node server and the information of the target resource to a scheduler, so that the scheduler allocates the target node server and the target resource to the task to be processed.

In this way, the client first determines the target node server and the target resource based on the determined task type and then sends them to the scheduler. The scheduler can skip the step of determining the target node server and target resource and directly allocate them to the task, which improves the speed of resource scheduling and the speed and efficiency of task execution.
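
The client-to-scheduler handoff in the second aspect might be sketched as a small message exchange. Everything here is hypothetical: the `PlacementRequest` fields, the JSON encoding, and the `Scheduler.handle` method are invented for illustration, and the sketch carries a single target node server even though a multi-node task would list several.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple


@dataclass
class PlacementRequest:
    task_id: str
    task_type: str
    node: str              # information of the target node server
    resources: List[int]   # information of the target resources


def encode_request(request: PlacementRequest) -> str:
    """Client side: serialise the placement chosen by the client."""
    return json.dumps(asdict(request), sort_keys=True)


class Scheduler:
    """Scheduler side: allocates what the client chose, without re-selecting."""

    def __init__(self) -> None:
        self.allocations: Dict[str, Tuple[str, List[int]]] = {}

    def handle(self, payload: str) -> None:
        message = json.loads(payload)
        self.allocations[message["task_id"]] = (message["node"], message["resources"])
```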

In a possible implementation manner, the determining the task type of the task to be processed includes:

determining the resource requirement of the task, the traffic requirement information of the task, and the resource quantity in the node server;

In this way, the traffic demand information reflects the task's demands for communication traffic and computation, the resource demand reflects the number of idle resources required to execute the task, and the number of resources in the node server reflects the maximum number of idle resources in one node server. The task type of the task to be processed can therefore be accurately determined based on the resource demand, the traffic demand information of the task, and the number of resources in the node server.

Obtain idle resources on the node server;

Determine at least one target node server for executing the task and target resources on the target node server based on idle resources on the node server, resource requirements of the task, and traffic demand information of the task;

A task type of the task is determined based on the at least one target node server and target resources on the target node server.

In this way, determining the task type based on the idle resources can ensure that the currently determined target node server and target resources can be directly used to execute the task.

In the third aspect, the embodiment of the present disclosure also provides a task scheduling device, which is applied to the scheduler side, including:

An acquisition module, configured to acquire task information of tasks to be processed;

A first determining module, configured to determine the task type of the task based on the task information;

A second determining module, configured to determine a target node server for executing the task and target resources on the target node server based on the task type;

An allocating module, configured to allocate the target node server and target resources on the target node server to execute the task.

In the fourth aspect, an embodiment of the present disclosure further provides a task scheduling device, which is applied to the client, including:

The third determining module is used to determine the task type of the task to be processed;

A fourth determining module, configured to determine a target node server for executing the task and target resources on the target node server based on the task type of the task;

A sending module, configured to send the information of the target node server and the information of the target resource to a scheduler, so that the scheduler allocates the target node server and the target resource to the task to be processed.

In the fifth aspect, an optional implementation manner of the present disclosure further provides a computer device, including a processor and a memory, where the memory stores machine-readable instructions executable by the processor, and the processor is configured to execute the machine-readable instructions stored in the memory. When the machine-readable instructions are executed by the processor, the steps in the above first aspect or any possible implementation manner of the first aspect are performed, or the steps in the above second aspect or any possible implementation manner of the second aspect are performed.

In the sixth aspect, an optional implementation manner of the present disclosure further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed, the steps in the above first aspect or any possible implementation manner of the first aspect are performed, or the steps in the above second aspect or any possible implementation manner of the second aspect are performed.

In the seventh aspect, an optional implementation manner of the present disclosure further provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code. When the computer-readable code runs in a processor of an electronic device, the processor in the electronic device performs the steps in the above first aspect or any possible implementation manner of the first aspect, or the steps in the above second aspect or any possible implementation manner of the second aspect.

In order to make the above objects, features and advantages of the present disclosure more comprehensible, optional embodiments are given below and described in detail in conjunction with the accompanying drawings.

Description of drawings

In order to illustrate the technical solutions of the embodiments of the present disclosure more clearly, the accompanying drawings used in the embodiments are briefly introduced below. The accompanying drawings are incorporated into and constitute a part of the specification. They show embodiments consistent with the present disclosure and are used together with the description to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting its scope. Those skilled in the art can also obtain other related drawings from these drawings.

FIG. 1 shows a flowchart of a task scheduling method applied to a scheduler provided by an embodiment of the present disclosure;

FIG. 2 shows a schematic structural diagram of a node server provided by an embodiment of the present disclosure;

FIG. 3 shows a flowchart of a method for determining a preset topology provided by an embodiment of the present disclosure;

FIG. 4 shows a schematic diagram of a topology relationship between node servers provided by an embodiment of the present disclosure;

FIG. 5 shows a schematic diagram of a topology structure of a multi-node task provided by an embodiment of the present disclosure;

FIG. 6a shows a schematic diagram of a preset topology structure corresponding to 8 resources provided by an embodiment of the present disclosure;

FIG. 6b shows a schematic diagram of a preset topology structure corresponding to 7 resources provided by an embodiment of the present disclosure;

FIG. 6c shows a schematic diagram of a preset topology structure corresponding to 6 resources provided by an embodiment of the present disclosure;

FIG. 6d shows a schematic diagram of a preset topology structure corresponding to 5 resources provided by an embodiment of the present disclosure;

FIG. 6e shows a schematic diagram of a preset topology structure corresponding to 4 resources provided by an embodiment of the present disclosure;

FIG. 6f shows a schematic diagram of a preset topology structure corresponding to 3 resources provided by an embodiment of the present disclosure;

FIG. 6g shows a schematic diagram of a preset topology structure corresponding to 2 resources provided by an embodiment of the present disclosure;

FIG. 7 shows a schematic diagram of splitting an initial node server including different idle resources provided by an embodiment of the present disclosure;

FIG. 8 shows a flow chart of a task scheduling method applied to a client provided by an embodiment of the present disclosure;

FIG. 9 shows a schematic diagram of a task scheduling device applied to the scheduler end provided by an embodiment of the present disclosure;

FIG. 10 shows a schematic diagram of a task scheduling device applied to a client provided by an embodiment of the present disclosure;

FIG. 11 shows a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.

Detailed description

In order to make the purpose, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure will be described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, but not all, of the embodiments of the present disclosure. The components of the disclosed embodiments generally described and illustrated herein may be arranged and designed in different configurations. Accordingly, the following detailed description of the embodiments of the present disclosure is not intended to limit the scope of the claimed disclosure, but represents selected embodiments of the present disclosure. Based on the embodiments of the present disclosure, all other embodiments obtained by those skilled in the art without creative effort fall within the protection scope of the present disclosure.

In addition, the terms "first", "second" and the like in the description and claims in the embodiments of the present disclosure and the above drawings are used to distinguish similar objects, and not necessarily used to describe a specific order or sequence. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments described herein can be practiced in sequences other than those illustrated or described herein.

"Plural or several" mentioned herein means two or more. "And/or" describes the association relationship of associated objects, indicating that there may be three types of relationships, for example, A and/or B may indicate: A exists alone, A and B exist simultaneously, and B exists independently. The character "/" generally indicates that the contextual objects are an "or" relationship.

Research has found that, to meet the different resource requirements of different deep learning tasks, task scheduling systems are playing an increasingly important role. In the prior art, the task scheduling methods applied in task scheduling systems mostly schedule resources for deep learning tasks based on their resource demand, with the goal of allocating as few node servers as possible. However, when the resources scheduled in this way are used to execute deep learning tasks, execution is slow and inefficient.

Based on the above research, the present disclosure provides a task scheduling method, apparatus, computer device, and storage medium. The task information reflects information such as the resource demand and the traffic demand information of the task to be processed, so the task type matching the task information can be determined based on it. The way of determining the target node server and the target resource differs between task types. Based on the determined task type, a resource determination method matching that type can be determined and then used to allocate target resources to the task to be processed, which makes the allocated target resources more reasonable. Executing the task to be processed with the target resources so determined in turn improves the speed and efficiency of task execution.

The defects in the above solutions are all results obtained by the inventor after practice and careful research. Therefore, the process of discovering the above problems, and the solutions proposed below by the present disclosure for those problems, should all be regarded as contributions made by the inventor in the course of the present disclosure.

It should be noted that like numerals and letters denote similar items in the following figures, therefore, once an item is defined in one figure, it does not require further definition and explanation in subsequent figures.

It should be noted that the specific nouns mentioned in the embodiments of the present disclosure include:

GPU: Graphics Processing Unit, also known as display core, visual processor, or display chip, is a microprocessor specialized in graphics-related operations on personal computers, workstations, game consoles and some mobile devices (such as tablets and smartphones);

CPU: Central Processing Unit, the central processing unit, as the computing and control core of the computer system, is the final execution unit for information processing and program operation;

NIC: Network Interface Controller, also known as network adapter, network card, or LAN adapter, is a piece of computer hardware designed to allow computers to communicate on a computer network;

PCIe: peripheral component interconnect express, is a high-speed serial computer expansion bus standard;

QPI: Quick Path Interconnect, also known as CSI (Common System Interface), is an architecture that can realize direct interconnection between chips;

NVLink: a bus and its communication protocol developed and launched by Nvidia.

In order to facilitate the understanding of this embodiment, a task scheduling method disclosed in the embodiments of the present disclosure is first introduced in detail. The execution subject of the task scheduling method provided in the embodiments of the present disclosure is generally a computer device with a certain computing capability. In some possible implementations, the task scheduling method may be implemented by a processor calling computer-readable instructions stored in a memory.

The task scheduling method provided by the embodiments of the present disclosure will be described below by taking the execution subject as a computer device as an example.

As shown in FIG. 1, a flowchart of a task scheduling method applied to the scheduler side provided by an embodiment of the present disclosure may include the following steps:

S101: Obtain task information of tasks to be processed.

Here, the task to be processed can be a deep learning task, obtained by the task scheduling system from a predefined configuration file submitted by the user on the client. The task scheduling system is essentially a scheduler for resource scheduling. It can include several node servers, each of which includes a certain number of computing resources and a certain number of CPUs. The node servers can be GPU node servers, and the computing resources can be the GPUs in the GPU node servers. The CPUs in one node server can communicate over a QPI connection, a GPU and a CPU in one node server can communicate over PCIe, a CPU in one node server can communicate with CPUs in other node servers through a NIC, and the GPUs in one node server can communicate over an NVLink connection.

FIG. 2 is a schematic structural diagram of a node server provided by an embodiment of the present disclosure. The node server may include 2 CPUs (CPU0 and CPU1) and 8 GPUs (V100GPU0, V100GPU1, V100GPU2, V100GPU3, V100GPU4, V100GPU5, V100GPU6, V100GPU7); PCIe Switches indicates that PCIe is used for communication.

While obtaining a task, the task scheduling system can also obtain task information of the task. Wherein, the task information is a kind of information used to represent the resource requirement and the traffic requirement information of the task to be processed.

S102: Based on the task information, determine a task type of the task.

Here, the task type can reflect the number of node servers required by the task, the number of GPU resources, and the traffic requirements, and the number of node servers selected for executing tasks of different task types is different. Different tasks may have different task types. For the same task, if the number of selected node servers is different, the task types of the task are also different.

Therefore, during specific implementation, the task type of the task may be determined based on information such as resource requirements and communication traffic demand information represented by the acquired task information.

S103: Based on the task type, determine a target node server for executing the task and target resources on the target node server.

Here, for tasks of different task types, the ways of determining the target node server and the target resource are different.

After the task type is determined, the target node server that can be used to execute the task and the target resource on the target node server can be determined according to the idle resources on the node server and the task type of the task. Wherein, the target resource is an idle resource.

S104: Allocate the target node server and the target resource on the target node server to execute the task.

Here, after determining the target node server for executing the task and the target resource on the target node server, the task scheduling system may allocate the target node server and the target resource on the target node server to the task to execute the task.

In this way, the task information reflects information such as the resource demand and the traffic demand information of the task to be processed, so the task type matching the task information can be determined based on it. The way of determining the target node server and the target resource differs between task types. Based on the determined task type, a resource determination method matching that type can be determined and then used to allocate target resources to the task to be processed, which makes the allocated target resources more reasonable. Executing the task to be processed with the target resources so determined in turn improves the speed and efficiency of task execution.

In one embodiment, the task type of the task can be determined according to the following steps:

Step 1. According to the task information, determine the resource demand amount of the task and the traffic demand information of the task.

Here, different tasks require different numbers of idle GPUs to execute, and the resource requirement of a task may be the number of GPUs required to execute the task. The communication volume requirement information is used to represent the ratio between the calculation volume and the communication volume corresponding to the task. After a task is determined, the number of GPUs it needs will also be determined, that is, the resource requirements of the task can be determined. At the same time, after a task is determined, its corresponding traffic demand information will also be determined. Therefore, after a task is determined, the task information used to characterize resource demand and traffic demand information is also determined.

Further, the task scheduling system can determine the resource requirement and traffic requirement information of the task represented by it according to the acquired task information of the task.

Step 2: Determine the resource quantity in the node server.

Here, the number of resources is the number of GPUs; that is, the number of resources in a node server is the number of GPUs in that node server.

During specific implementation, the number of GPUs included in the node servers in the task scheduling system may be different. For example, the node servers in the task scheduling system may include 8GPU node servers and 4GPU node servers, which are not limited here.

Step 3: Determine the task type of the task based on the determined resource demand, the traffic demand information of the task, and the resource quantity in the node server.

During specific implementation, the number of node servers required to execute the task to be processed, that is, the number of target node servers corresponding to the task to be processed, can be determined based on the resource demand and the number of resources in the node server:

$$\text{number of target node servers} = \left\lceil \frac{\text{resource demand}}{\text{number of resources in the node server}} \right\rceil$$

where $\lceil \cdot \rceil$ indicates rounding up.

Furthermore, the task type of the task to be processed may be determined based on the number of target node servers corresponding to the task to be processed and the determined traffic demand information of the task.

In an embodiment, the task types may include single-node multi-resource communication type, single-node multi-resource computing type, multi-node type and single resource type.

Here, based on the resource demand of the task, tasks can be divided into single-node type tasks and multi-node type tasks. A single-node type task can be completed using the resources in one target node server, whereas a multi-node type task needs the resources in at least two target node servers. Single-node type tasks can be further divided, based on the resource demand of the task, into single-node multi-resource type tasks and single-resource type tasks: a single-resource type task needs only one idle resource in one target node server, and a single-node multi-resource type task needs multiple idle resources in one target node server. In addition, according to the traffic demand information of the task, single-node multi-resource type tasks are divided into single-node multi-resource communication type tasks, which have a high demand for communication volume, and single-node multi-resource computing type tasks, which have a high demand for computation.

During specific implementation, when it is determined that the number of target node servers corresponding to the task to be processed is equal to 1, the number of target resources corresponding to the task to be processed is greater than 1, and the traffic demand information indicates that the communication volume demand of the task to be processed is higher than its computation demand, the task type of the task to be processed is determined to be the single-node multi-resource communication type.

When it is determined that the number of target node servers corresponding to the task to be processed is equal to 1, the number of target resources corresponding to the task to be processed is greater than 1, and the traffic demand information indicates that the computation demand of the task to be processed is higher than its communication volume demand, the task type of the task to be processed is determined to be the single-node multi-resource computing type.

When it is determined that the number of target node servers corresponding to the task to be processed is greater than 1, determine that the task type of the task to be processed is a multi-node type;

In a case where it is determined that the number of target resources corresponding to the task to be processed is equal to 1, it is determined that the task type of the task to be processed is a single resource type.

For example, if the resource demand of the task is 7, the number of resources on the node server is 8, and the ratio between the computation volume and the communication volume is greater than 1, it can be determined that the number of target node servers required by the task is 1. Then, because the ratio is greater than 1 and the number of target node servers is 1, the task type of the task can be determined to be the single-node multi-resource computing type.
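The classification rules above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function names, the string labels, and the treatment of a ratio of exactly 1 as communication-bound are assumptions made for the example.

```python
def target_node_count(resource_demand: int, gpus_per_node: int) -> int:
    """Resource demand divided by GPUs per node server, rounded up."""
    return -(-resource_demand // gpus_per_node)  # integer ceiling division


def classify_task(resource_demand: int, gpus_per_node: int,
                  compute_to_comm_ratio: float) -> str:
    """Return one of the four task types described in the text."""
    if resource_demand == 1:
        return "single-resource"
    if target_node_count(resource_demand, gpus_per_node) > 1:
        return "multi-node"
    # One node server, several GPUs: decide on the computation/communication ratio.
    # Assumption: a ratio of exactly 1 is treated as communication-bound.
    if compute_to_comm_ratio > 1:
        return "single-node multi-resource computing"
    return "single-node multi-resource communication"


# The example from the text: demand 7, 8 GPUs per node server, ratio above 1.
print(classify_task(7, 8, 1.5))  # single-node multi-resource computing
```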

In one embodiment, for S103, it can be implemented according to the following steps:

Based on the task type of the task, the resource demand of the task, and the idle resources on the node server, the target node server for executing the task and the target resource on the target node server are determined.

Here, the GPUs on the node server can be used independently, and the idle GPUs on the node server are currently unused GPUs.

During specific implementation, the node servers may be divided into different idle resource linked lists according to the number of idle GPUs on them, where each idle resource linked list consists only of node servers that have the same number of idle GPUs. For example, the idle resource linked lists may be a 1-idle-GPU linked list, a 2-idle-GPU linked list, and so on, where the 1-idle-GPU linked list means that every node server in that list has exactly one idle GPU.
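As a rough sketch of this grouping, the snippet below buckets node servers by their idle GPU count. Plain Python lists stand in for the linked lists, and the function name, the node identifiers and the choice to skip node servers with no idle GPU are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def build_idle_lists(idle_gpus_by_node: Dict[str, int]) -> Dict[int, List[str]]:
    """Group node servers by how many idle GPUs each one currently has."""
    lists: Dict[int, List[str]] = defaultdict(list)
    for node, idle in idle_gpus_by_node.items():
        if idle > 0:  # assumption: fully busy node servers are not listed
            lists[idle].append(node)
    return dict(lists)


idle_lists = build_idle_lists({"node-a": 1, "node-b": 2, "node-c": 1, "node-d": 0})
print(idle_lists)  # {1: ['node-a', 'node-c'], 2: ['node-b']}
```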

After the task type of the task is determined, at least one initial node server matching the task type may be determined based on the task type first.

Here, an initial node server may be a node server selected from the idle resource linked lists; it may be a single node server, or a group of multiple node servers, that can be used to execute the task to be processed. For example, if a task is of the single-node multi-resource communication type and its resource demand is 3, multiple node servers each with 3 or more idle resources may be selected from the idle resource linked lists as initial node servers, or multiple groups of node servers may be selected, where the number of resources in each group of node servers is greater than or equal to 3.

Furthermore, the target node server for executing the task and the target resource on the target node server may be determined from at least one initial node server based on the resource requirement of the task and the idle resources on the initial node server.

During specific implementation, based on the resource demand of the task and the idle resources on the initial node servers, a target number of target node servers can be selected from the at least one initial node server, and the target resources used to execute the task can be determined on the target node servers based on the resource demand of the task. Here, the target number is the number of target node servers determined in the above embodiment.

For example, if the task type of the task is determined to be the multi-node type, the resource demand is 13, and the number of resources on a node server is 8, two target node servers can be selected from the node servers in the task scheduling system: one target node server that is entirely idle, and one target node server that includes five idle resources.
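The arithmetic behind this example can be sketched as below. The sketch assumes that whole node servers are filled first and the remainder goes to one further node server, which matches the example but is not stated in the text as a general rule; the function name is hypothetical.

```python
from typing import List


def split_demand(resource_demand: int, gpus_per_node: int) -> List[int]:
    """Per-node GPU counts when whole node servers are filled first (assumption)."""
    full_nodes, remainder = divmod(resource_demand, gpus_per_node)
    return [gpus_per_node] * full_nodes + ([remainder] if remainder else [])


# Demand of 13 GPUs with 8 GPUs per node server: one full node plus 5 GPUs.
print(split_demand(13, 8))  # [8, 5]
```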

In an embodiment, when the task type includes single-node multi-resource communication type, the target node server for executing the task may be determined from at least one initial node server according to the following steps:

Step 1. Based on the number of idle resources on the initial node server in at least one initial node server, select the initial node server whose number of idle resources is greater than or equal to the resource requirement as the first node server;

Step 2: Determine the target node server for executing the task based on the idle resources on the first node server and the resource requirements of the task.

Here, since single-node multi-resource communication type tasks have a high demand for communication volume, the first node servers, whose number of idle resources is greater than or equal to the resource demand of the task, can be screened out based on the number of idle resources on the initial node servers. This ensures that a first node server has enough idle resources to meet the communication demand of the task. During specific implementation, one or more first node servers may be screened out.

Furthermore, the target node server for executing the task may be selected from the screened first node servers according to the resource requirement.
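Step 1 of this screening can be sketched as a simple filter. The function name and the node identifiers are illustrative assumptions; how a target node server is then chosen among the results is left open here.

```python
from typing import Dict, List


def select_first_node_servers(idle_gpus_by_node: Dict[str, int],
                              resource_demand: int) -> List[str]:
    """Keep the initial node servers with at least `resource_demand` idle GPUs."""
    return [node for node, idle in idle_gpus_by_node.items()
            if idle >= resource_demand]


print(select_first_node_servers({"n0": 2, "n1": 3, "n2": 8}, 3))  # ['n1', 'n2']
```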

During specific implementation, after the first node servers are screened out, at least one preset topology corresponding to the resource demand can be determined according to the resource demand of the task. A preset topology is a preferred topology for the number of idle resources required to execute the task; connecting the idle resources in the preset topology to execute the task reduces the communication waiting time and improves the speed of task execution.

In one embodiment, the preset topology corresponding to any number of idle resources can be determined according to the method shown in FIG. 3. FIG. 3 is a flowchart of a method for determining a preset topology provided by an embodiment of the present disclosure, and the method may include the following steps:

S301: For idle resources in any node server to be matched, determine a topology structure corresponding to any number of idle resources in the node server to be matched.

S302: Based on the communication medium between idle resources, determine the communication waiting time of the topology.

S303: Based on the communication waiting time, determine a preset topology corresponding to any number of idle resources in the node server to be matched.

Here, the node server to be matched includes any number of idle resources, and the node server to be matched may be any node server. Resource demands correspond to idle resources; specifically, one resource demand may correspond to the idle resources in at least one node server to be matched. For example, if the resource demand is 7, the node server to be matched can be a node server with 7 idle resources, or it can be a group of multiple node servers with 7 idle resources in total, such as a group consisting of a node server including 3 idle resources and a node server including 4 idle resources. Therefore, this method is also a method for determining the preset topology corresponding to any resource demand.

Communication quality differs between different node servers in the task scheduling system and between the GPUs and CPUs within the same node server. The common communication modes are the four mentioned in the above embodiments, namely QPI, PCIe, NIC and NVLink, where the communication rate of NVLink is greater than that of QPI, and the communication rate of QPI is greater than that of PCIe, NIC and the network. The communication medium may be determined according to the medium corresponding to the communication mode.

For any idle resource in the node server to be matched, the topology structure corresponding to the idle resource can be determined first, and then the communication waiting time of the topology structure can be determined based on the communication medium between the idle resources in the node server to be matched.

Here, when the node server to be matched is a group of node servers, the topology that can be formed by the idle resources in the group can be determined first, and the communication waiting time of that topology can then be determined based on the communication medium between the node servers and the communication medium between the idle resources.

Further, the topological structures may be screened according to the communication waiting time of the topological structures, and the topological structures meeting the preset conditions are screened out as the preset topology structures.

During specific implementation, an a priori method can be used to determine the communication waiting time of a given number of idle resources under different topologies, where one number of idle resources can correspond to multiple topologies. FIG. 4 is a schematic diagram of a topology relationship between node servers provided by an embodiment of the present disclosure, in which GPUs communicate over NVLINK/NVSWITCH and different node servers communicate over the network (Network) and NICs.

For a given topology, the GPUs connected in that topology execute, based on the ring allreduce model, the task whose resource demand corresponds to the number of idle resources. When a task is executed based on the ring allreduce model, the GPUs communicate in sequence, so the total communication latency is limited by the minimum communication bandwidth between GPUs in the topology.

During specific implementation, the waiting time for allreduce communication in a single node server can be determined according to Formula 1:

Here, T represents the communication waiting time, n represents the number of GPUs required, α represents the constant latency of establishing communication, s represents the size of the ring allreduce model (that is, the communication scale), BW represents the set of communication bandwidths between GPUs under the topology, and min(BW) represents the minimum communication bandwidth between GPUs in the topology.
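To make these symbols concrete, the sketch below uses the textbook latency-bandwidth cost model of ring allreduce: 2(n-1) steps, each paying the latency α and moving s/n of the data over the slowest link. This model is an assumption chosen for illustration and may differ from Formula 1 of the disclosure; the function name and units are likewise hypothetical.

```python
from typing import Sequence


def ring_allreduce_time(n: int, s: float, bandwidths: Sequence[float],
                        alpha: float = 0.0) -> float:
    """Textbook ring-allreduce estimate (assumed here, not taken from Formula 1).

    n          -- number of GPUs in the ring
    s          -- amount of data to reduce
    bandwidths -- link bandwidths in the topology; the slowest one dominates
    alpha      -- constant latency paid at each communication step
    """
    if n <= 1:
        return 0.0  # a single GPU has nothing to exchange
    steps = 2 * (n - 1)  # reduce-scatter phase plus all-gather phase
    return steps * alpha + steps / n * s / min(bandwidths)


# Four GPUs, data size 8, slowest link bandwidth 2, latency 0.5 per step.
print(ring_allreduce_time(4, 8.0, [2.0, 4.0], alpha=0.5))  # 9.0
```

Because only min(bandwidths) enters the result, a single slow link (for example PCIe among NVLink links) raises the estimate for the whole ring, which is the effect described in the text.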

In one embodiment, when the task corresponding to the idle resources is a multi-node task, the waiting time of one allreduce communication required to execute the task may include the allreduce communication waiting time within a single node server and the allreduce communication waiting time between node servers, where the allreduce communication waiting time within a single node server may include one reduce operation in the node server and one broadcast operation in the node server.

FIG. 5 is a schematic diagram of a topology of a multi-node task provided by an embodiment of the present disclosure, where $bw_0$ represents the communication bandwidth between GPUs and $bw_1$ represents the minimum communication bandwidth between different node servers. The four dark-colored GPUs shown in the lower right corner of FIG. 5 may represent GPUs that are in use.

During specific implementation, the communication waiting time of one reduce operation in the node server and one broadcast operation in the node server can be determined according to Formula 2:

The waiting time for allreduce communication between node servers can be determined according to Formula 3:

Among them, m represents the number of node servers used.

In addition, for the ring allreduce model, a two-layer communication method is usually used when the node servers and the GPUs in them execute a task. Therefore, the communication waiting time of one two-layer allreduce required to execute the task can be determined as shown in Formula 4:

$$\max_i 2T(n_i, s, bw_0) + T(m, s, bw_1) \qquad \text{(Formula 4)}$$

Here, $\max_i 2T(n_i, s, bw_0)$ represents the communication waiting time of the node server with the highest communication time consumption among the multiple node servers when a multi-node task is executed.

Furthermore, according to Formulas 1 to 4 above, the communication waiting time $c_i$ of the task $j_i$ corresponding to any number of idle resources can be determined as shown in Formula 5:

$$c_i = \max_j T(N_{ij}, S_i, bw_0) + F(M_i) \times \left( \max_j T(N_{ij}, S_i, bw_0) + T(M_i, S_i, \min_j BW_{ij}) \right) \qquad \text{(Formula 5)}$$

Here, $N_{ij}$ represents the number of GPUs required by task $j_i$ on the $j$-th node server, and $M_i$ represents the number of node servers required to execute task $j_i$. $F$ is a function determined by $M_i$: if $M_i \geq 2$, $F(M_i) = 1$; otherwise, $F(M_i) = 0$. $BW_{ij}$ represents the communication bandwidth between the $j$-th node server and other node servers for task $j_i$, and $\min_j BW_{ij}$ represents the minimum of these bandwidths. $bw_0$ represents the communication bandwidth of task $j_i$ within the $j$-th node server, and $S_i$ represents the ratio information between the communication volume and the computation volume of task $j_i$.
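Formula 5 can be evaluated with a few lines of code once a single-ring waiting-time function T is available. In the sketch below T is passed in by the caller, so no particular form of Formula 1 is assumed; the function and parameter names are hypothetical, and the toy T in the usage example is chosen only to make the arithmetic easy to follow.

```python
from typing import Callable, Sequence

# T(n, s, bw): waiting time of one ring over n participants; supplied by the caller.
LatencyFn = Callable[[int, float, float], float]


def comm_wait(n_per_node: Sequence[int], s: float, bw0: float,
              inter_node_bw: Sequence[float], T: LatencyFn) -> float:
    """Formula 5: intra-node term, plus the two-layer terms when M_i >= 2."""
    m = len(n_per_node)                            # M_i, node servers used
    intra = max(T(n, s, bw0) for n in n_per_node)  # max_j T(N_ij, S_i, bw0)
    if m < 2:                                      # F(M_i) = 0
        return intra
    inter = T(m, s, min(inter_node_bw))            # T(M_i, S_i, min_j BW_ij)
    return intra + (intra + inter)                 # F(M_i) = 1


def toy_T(n: int, s: float, bw: float) -> float:
    """Placeholder cost, for illustration only."""
    return n * s / bw


print(comm_wait([4], 8.0, 2.0, [], toy_T))               # 16.0
print(comm_wait([4, 2], 8.0, 2.0, [1.0, 4.0], toy_T))    # 48.0
```

With two or more node servers the result equals twice the intra-node maximum plus the inter-node term, which is the same shape as Formula 4.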

For task $j_i$, a total of $\sum_j N_{ij}$ GPUs are needed, and each GPU spends a communication waiting time of $c_i$ when executing task $j_i$. The total number of tasks corresponding to the resource demand under this topology can then be determined, and from it the total communication waiting time under the topology corresponding to the resource demand. To meet the requirement of improving task execution speed, the preset topology can be selected according to Formula 6:

$$\min \sum_i \sum_j \left( N_{ij} \, c_i \right) \qquad \text{(Formula 6)}$$

Here, $\min \sum_i \sum_j (N_{ij} c_i)$ indicates that the topology with the minimum total communication waiting time is selected as the preset topology.
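The selection in Formula 6 amounts to summing N_ij times c_i over all tasks and node servers for each candidate topology and keeping the smallest total. The sketch below assumes the c_i values have already been computed; the data layout, function names and topology labels are illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple

# One task is (GPUs used on each node server, communication waiting time c_i).
Task = Tuple[Sequence[int], float]


def total_wait(tasks: List[Task]) -> float:
    """Sum of N_ij * c_i over all tasks i and node servers j."""
    return sum(n * c for n_per_node, c in tasks for n in n_per_node)


def pick_topology(candidates: Dict[str, List[Task]]) -> str:
    """Return the candidate topology with the smallest total waiting time."""
    return min(candidates, key=lambda name: total_wait(candidates[name]))


candidates = {
    "split-4-4": [([4, 4], 3.0)],   # 8 GPUs over two node servers, c_i = 3.0
    "single-8": [([8], 2.0)],       # 8 GPUs on one node server, c_i = 2.0
}
print(pick_topology(candidates))  # single-8
```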

In one embodiment, based on Formula 5, the total communication waiting time of each topology under a given number of idle resources can be determined. The topologies can then be sorted by total communication waiting time from low to high, and the topologies whose rank is smaller than a preset order are used as the preset topologies corresponding to that number of idle resources, that is, as the preset topologies corresponding to the node server to be matched. The preset order can be set according to actual needs and is not limited here.

In this way, the topologies are screened using the preset order, so that the communication waiting time of the preset topologies meets the time requirement for executing a task whose resource demand corresponds to that number of idle resources, and the execution time of the task is reduced as much as possible.

In another embodiment, after the communication waiting time of a topology under a resource demand is determined based on Formula 5, a topology whose communication waiting time is less than a preset waiting threshold can be used as the preset topology corresponding to that resource demand, that is, as the preset topology corresponding to the node server to be matched. Executing tasks with such a preset topology reasonably reduces the communication waiting time required for task execution. The preset waiting threshold can be set according to actual needs and is not limited here.
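The two screening embodiments above (by preset order and by preset waiting threshold) could be sketched as follows. The function names are hypothetical, and the sketch assumes that ranks are counted from 0, so that a preset order of k keeps the k topologies with the lowest waiting time.

```python
from typing import Dict, List


def by_rank(wait_by_topology: Dict[str, float], preset_order: int) -> List[str]:
    """Keep topologies whose rank (ascending waiting time) is below preset_order."""
    ranked = sorted(wait_by_topology, key=lambda name: wait_by_topology[name])
    return ranked[:preset_order]


def by_threshold(wait_by_topology: Dict[str, float], max_wait: float) -> List[str]:
    """Keep topologies whose waiting time is below the preset waiting threshold."""
    return [name for name, wait in wait_by_topology.items() if wait < max_wait]
```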

In addition, due to the high communication rate of NVLink, the communication mode between resources in the optional topology should be NVLink.

During specific implementation, when the number of resources on a node server is 8, the following preset topologies of tasks under different resource demands can be determined, where the preset topologies of tasks under different resource demands refer to the preset topologies corresponding to different resource demands on a single node server. Figures 6a to 6g are schematic diagrams, provided by embodiments of the present disclosure, of the preset topologies corresponding to 8, 7, 6, 5, 4, 3 and 2 resources, respectively.

Further, after at least one preset topology corresponding to the resource demand and the first node servers are determined based on the above method, the first node servers having any of the preset topologies can be screened out from the first node servers. Executing the task on such a first node server reduces the communication waiting time. Then, from the screened-out first node servers, the one with the fewest idle resources is selected as the target node server. Using the first node server with the fewest idle resources as the target node server reduces the splitting of the original topology on first node servers that have a preset topology and a large number of idle resources, which protects the optional topologies so that they remain reasonably available to other tasks such as single-node multi-resource communication tasks.

During specific implementation, in the process of screening the target node server, if there is a first node server with the number of idle resources equal to the resource demand and having any preset topology, the first node server can be directly used as the target node server.

If there are only first node servers whose number of idle resources is greater than the resource demand and which have any preset topology, the first node server with the fewest idle resources is selected as the target node server, and the idle resources in the target node server are then split to obtain resources with a preset topology for executing the task. FIG. 7 is a schematic diagram, provided by an embodiment of the present disclosure, of splitting initial node servers that include different numbers of idle resources, where Slot represents an initial node server. As can be seen from FIG. 7, when an initial node server includes 8 resources, splitting 6 idle resources yields an optional topology of 4 idle resources plus 2 idle resources, and splitting 5 idle resources yields an optional topology of 3 idle resources plus 2 idle resources. Job represents a task, "8GPU jobs" represents tasks that require 8 idle GPUs to complete, and "Empty 7 GPU Slot" represents a node server with 7 idle GPUs.

Here, if there are multiple first node servers with any preset topology and the smallest number of idle resources, any one of the first node servers may be randomly selected as the target node server.

In addition, after the target node server is determined, the preset topology of the target node server may be used as the target topology, and the idle resources constituting the target topology in the target node server may be used as the target resources. During specific implementation, when the number of idle resources in the target node server is equal to the resource demand, those idle resources can be used as the target resources; when the number of idle resources in the target node server is greater than the resource demand and is the smallest among the candidates, the split resources with the target topology can be used as the target resources to execute the task.

In another embodiment, if no first node server that can meet the resource demand is screened out, execution of the task is suspended until it is determined that there is a first node server that meets the resource demand and has at least one preset topology, and the task is then executed.
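The target node server selection for the single-node multi-resource communication type described above can be sketched as follows. The `Node` class and the `has_preset_topology` predicate are hypothetical stand-ins; how a preset topology is actually detected on a node server is not modelled here.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    name: str
    idle: int  # number of idle resources (for example, GPUs)


def select_target(nodes: List[Node], demand: int,
                  has_preset_topology: Callable[[Node, int], bool]) -> Optional[Node]:
    # First node servers: idle resources greater than or equal to the demand.
    first = [n for n in nodes if n.idle >= demand]
    # Keep only those whose idle resources can form a preset topology.
    fitting = [n for n in first if has_preset_topology(n, demand)]
    if not fitting:
        return None  # the caller suspends the task and retries later
    fewest = min(n.idle for n in fitting)
    # Ties between equally small node servers are broken at random.
    return random.choice([n for n in fitting if n.idle == fewest])
```

A node server whose idle count equals the demand is automatically the smallest candidate, so the exact-match case needs no separate branch in this sketch.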

In one embodiment, when the task type includes single-node multi-resource computing type, the target node server for executing the task may be determined from at least one initial node server according to the following steps:

Step 1. Based on the number of idle resources on the initial node server in at least one initial node server, screen the second node server whose idle resource quantity is less than or equal to the resource demand;

Step 2: Determine the target node server for executing the task based on the idle resources on the second node server and the resource requirements of the task.

Here, since a single-node multi-resource computing task requires more computation than communication, second node servers whose number of idle resources is less than or equal to the resource demand can be screened out based on the number of idle resources on the at least one initial node server. At least one preset topology corresponding to the resource demand can then be determined, and the second node servers are checked for one having any of the preset topologies. If such a second node server exists, it is used as the target node server and the idle resources in it are used as the target resources. Because the screened second node servers have no more idle resources than the resource demand, a second node server that has any of the preset topologies must include a number of idle resources equal to the resource demand.

If no second node server with any preset topology is screened out from the second node servers, execution of the task waits. After a preset period of time, the idle resources on the node servers are reacquired and second node servers whose number of idle resources is less than or equal to the resource demand are screened out again, until a target node server capable of executing the task can be selected from the second node servers and resources are allocated to the task.

Alternatively, in actual implementation, the task scheduling system contains many single-resource tasks, so resources in a node server are often used individually, and because single-resource tasks execute quickly, a node server often has a single resource that has just been released. Therefore, when it is determined that the number of idle resources on at least one node server has changed, it can be determined whether the idle resources already on that node server together with the newly released single resource can form any preset topology corresponding to the resource demand. If so, the node server is used as the target node server, the idle resources in the target node server are used as the target resources, and the target resources in the target node server can then be allocated to the task.
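A minimal sketch of the screening for the single-node multi-resource computing type is given below. The mapping from node name to idle count and the `forms_preset_topology` predicate are hypothetical; a caller would invoke the function again after a preset period of time, or when the idle count of a node server changes, as described above.

```python
from typing import Callable, Dict, Optional


def select_compute_target(idle_by_node: Dict[str, int], demand: int,
                          forms_preset_topology: Callable[[str, int], bool]
                          ) -> Optional[str]:
    # Second node servers: idle resources less than or equal to the demand.
    second = [name for name, idle in idle_by_node.items() if idle <= demand]
    for name in second:
        # A preset topology for `demand` resources can only be formed when the
        # node server holds exactly `demand` idle resources.
        if forms_preset_topology(name, demand):
            return name
    return None  # no match yet; the task waits and the screening is repeated
```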

In one embodiment, when the task type includes a multi-node type, multiple node servers need to be selected to execute a task of this type. In addition, it can be concluded from Formulas 1 to 6 that, for a multi-node task, the maximum number of empty third node servers should first be allocated from the initial node servers according to the resource demand of the task and the maximum number of idle resources that a node server can include (that is, the number of resources included on the node server), and these empty third node servers are used as target node servers. For example, if the resource demand is 35 and each node server includes 8 resources, 4 empty third node servers are allocated to the task first.

Furthermore, if the total amount of resources included in the maximum number of third node servers is less than the resource demand, the task cannot be completed using only these third node servers, and one more initial node server is needed. The difference between the resource demand and the total amount of resources is therefore determined first; this difference is the number of idle resources required on the additional initial node server. In the example above, the difference is 35 - 4 × 8 = 3.

After the difference is determined, a target node server can be selected for the multi-node task according to the method of allocating resources for single-node multi-resource computing tasks. During specific implementation, fourth node servers whose number of idle resources is less than or equal to the difference are screened out based on the number of idle resources on the initial node servers, and the target node server is then selected from the fourth node servers using the method for determining the target node server of a single-node multi-resource computing task described in the above embodiment. The selected empty target node servers and this target node server can then be assigned to the multi-node task.

In addition, if there are not enough empty node servers when selecting empty target node servers for a multi-node task, execution of the task waits until there are enough empty node servers, and node servers and resources are then allocated to the task.
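The multi-node allocation described above amounts to an integer division of the resource demand by the per-node resource count. The sketch below uses hypothetical function names and assumes every node server includes the same number of resources.

```python
from typing import Dict, List, Optional, Tuple


def plan_multi_node(demand: int, per_node: int) -> Tuple[int, int]:
    """Return (number of empty node servers, remaining resources still needed)."""
    return divmod(demand, per_node)


def pick_empty_nodes(idle_by_node: Dict[str, int], per_node: int,
                     count: int) -> Optional[List[str]]:
    """Pick `count` fully idle node servers, or None if there are too few."""
    empty = [name for name, idle in idle_by_node.items() if idle == per_node]
    return empty[:count] if len(empty) >= count else None
```

For a demand of 35 with 8 resources per node server, `plan_multi_node` yields 4 empty node servers and a remainder of 3; the remainder would then be placed with the single-node multi-resource computing rule.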

In one embodiment, when the task type includes a single-resource type, the resource demand of the task is 1 and only a single resource is required to complete it. Therefore, based on the number of idle resources on the initial node servers, an initial node server whose number of idle resources is equal to the resource demand can be used directly as the target node server. If there are multiple such initial node servers, any one of them may be randomly selected as the target node server.

In addition, if there is no initial node server whose number of idle resources is equal to the resource demand, execution of the task waits until such a node server exists, and resources are then allocated to the task and the task is executed.

In one embodiment, for S102, the task type of the task may also be determined according to the following steps:

Step 1: Obtain idle resources on the node server.

Here, after obtaining the task information of the task to be processed, idle resources on the node server can be obtained at the same time.

Step 2. According to the task information, determine the resource demand amount of the task and the traffic demand information of the task.

For the specific implementation process of this step, reference may be made to the steps of determining resource demand and communication traffic demand information in the above-mentioned embodiments, which will not be repeated here.

Step 3: Determine the target node server for executing the task and the target resource on the target node server based on the idle resources on the node server, the resource demand of the task, and the traffic demand information of the task.

Here, based on the resource demand of the task to be processed, the idle resources on the node servers, and the traffic demand information, the task scheduling system can directly determine a target node server and target resources that satisfy both the resource demand and the traffic demand indicated by the traffic demand information. The target resources in the target node server have an optional preset topology.

Step 4: Determine the task type of the task based on the target node server executing the task and the target resource on the target node server.

Here, the task type of the task may be determined based on the determined quantity of the target node server, the quantity of the target resource on the target node server, and the traffic requirement information.

In another embodiment, the user can also directly mark the task type of the task to be processed in the submitted configuration file. The task scheduling system can then determine the task type directly from the obtained configuration file and use it as the obtained task information of the task.

In one embodiment, tasks of different task types have different resource allocation priorities. When there are multiple tasks to be processed, the task type of each task can be determined first, and the resource allocation priority of each task is then determined based on its task type.

Furthermore, based on the resource allocation priority of the task, the target node server for executing the task and the target resource on the target node server can be determined, so as to allocate resources to the task reasonably.

During specific implementation, the resource allocation priority of single-node multi-resource communication tasks can be set to the first priority, that of single-node multi-resource computing tasks to the second priority, that of multi-node tasks to the third priority, and that of single-resource tasks to the fourth priority.
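The priority scheme above can be sketched with an ordered enumeration. The enum and function names are hypothetical, and the sketch assumes that the first priority is served first and that tasks of equal priority keep their submission order.

```python
from enum import IntEnum
from typing import List, Tuple


class TaskType(IntEnum):
    # A lower value means resources are allocated earlier.
    SINGLE_NODE_MULTI_RESOURCE_COMMUNICATION = 1
    SINGLE_NODE_MULTI_RESOURCE_COMPUTING = 2
    MULTI_NODE = 3
    SINGLE_RESOURCE = 4


def order_by_priority(tasks: List[Tuple[str, TaskType]]) -> List[Tuple[str, TaskType]]:
    """Sort (task name, task type) pairs by allocation priority (stable sort)."""
    return sorted(tasks, key=lambda task: task[1])
```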

In an embodiment, the task information may include the resource requirement, therefore, the resource requirement may be obtained according to the following steps:

Step 1: Obtain the configuration file of the task to be processed.

Step 2, obtaining the resource requirement of the task to be processed from the configuration file.

During specific implementation, when the user submits a predefined configuration file related to the task to be processed at the client, the resource demand of the task can be specified directly in the configuration file. The task scheduling system can then acquire the resource demand of the task directly from the acquired configuration file and use it as the acquired task information of the task.

In this way, the resource demand can be obtained without parsing the configuration file, which improves the speed of determining the resource demand.

In an embodiment, the task information may include traffic demand information, therefore, the traffic demand information may be obtained according to the following steps:

Step 1. Obtain calculation volume information and communication volume information of tasks to be processed.

During specific implementation, the calculation amount information and communication amount information of the task to be processed can be determined by analyzing the obtained configuration file of the task, or by analyzing the obtained task request for the task. The task request includes the calculation amount information and communication amount information, and may be a request, submitted by a user on the client, to allocate resources for the task.

Step 2. Based on the calculation amount information and the communication amount information, determine the communication amount requirement information.

Here, a ratio between the calculation amount corresponding to the calculation amount information and the communication amount corresponding to the communication amount information may be determined, and the communication amount demand information may be determined based on this ratio.
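As a hedged illustration of this step, the ratio could be computed as communication volume divided by computation volume, matching the description of the ratio information used in Formula 5. The direction of the ratio, the function names, and the threshold of 1.0 used to call a task communication-heavy are all assumptions made for this sketch.

```python
def traffic_demand_ratio(comm_volume: float, compute_volume: float) -> float:
    """Ratio of communication volume to computation volume (assumed direction)."""
    if compute_volume <= 0:
        raise ValueError("compute_volume must be positive")
    return comm_volume / compute_volume


def is_communication_heavy(ratio: float, threshold: float = 1.0) -> bool:
    """Hypothetical rule: the task is communication-heavy above the threshold."""
    return ratio > threshold
```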

In another embodiment, in the case that the task information may include traffic demand information, the traffic demand information may also be obtained according to the following steps:

Step 1. Obtain the configuration file of the task to be processed;

Step 2: Obtain the communication traffic requirement information of the task to be processed from the configuration file.

Here, the configuration file submitted by the user may directly include the traffic demand information of the task to be processed. Furthermore, the task scheduling system may directly acquire the traffic demand information of the task to be processed from the acquired configuration file, and use it as the acquired task information of the task.

In this way, the configuration file does not need to be parsed to obtain the traffic demand information, the step of calculating the information is omitted, and the speed of determining the traffic demand information is improved.
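As a minimal sketch of reading both values directly from a submitted configuration file, the example below assumes a JSON file with the keys `resource_demand` and `traffic_demand`. The file format and the key names are hypothetical, since the disclosure does not specify them.

```python
import json
from typing import Tuple


def read_demands(config_text: str) -> Tuple[int, float]:
    """Return (resource demand, traffic demand) stored in the configuration."""
    config = json.loads(config_text)
    # Both key names are assumptions made for this illustration.
    return int(config["resource_demand"]), float(config["traffic_demand"])
```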

In addition, an embodiment of the present disclosure provides a task scheduling method applied to a client. FIG. 8 is a flow chart of the task scheduling method applied to a client provided by an embodiment of the present disclosure, which may include the following steps:

S801: Determine the task type of the task to be processed.

Here, after the user submits a predefined configuration file related to the task to be processed at the client, the client can directly determine the task type of the task based on the configuration file.

S802: Based on the task type of the task, determine the target node server for executing the task and the target resource on the target node server.

After determining the task type of the task, the client can determine the target node server that can be used to execute the task and the target resource on the target node server according to the idle resources on the node server and the task type of the task. Wherein, the target resource is an idle resource.

For the specific implementation of S802, reference may be made to the steps of determining the target node server and the target resource based on the task type in the task scheduling method applied to the scheduler; the only difference is that the execution subject changes from the scheduler to the client corresponding to the scheduler. Details are therefore not repeated here.

S803: Send the information of the target node server and the information of the target resource to the scheduler, so that the scheduler allocates the target node server and the target resource to the task to be processed.

Here, the client may directly send the determined target node server information and target resource information to the scheduler, that is, to the task scheduling system. The scheduler can then directly determine, based on this information, the target node server and the target resource on the target node server for executing the task to be processed, and the target node server and the target resource can then be assigned to the task to be processed.
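The client-side flow S801 to S803 can be sketched as three calls in sequence. The `Placement` record and the three callables are hypothetical placeholders for the type determination, the placement decision, and the transport to the scheduler, none of which are specified in this form by the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Placement:
    node_server: str            # information of the target node server
    resources: Tuple[int, ...]  # information of the target resources


def client_schedule(task_config: dict,
                    determine_type: Callable[[dict], str],
                    choose_placement: Callable[[str, dict], Placement],
                    send_to_scheduler: Callable[[Placement], None]) -> Placement:
    task_type = determine_type(task_config)               # S801
    placement = choose_placement(task_type, task_config)  # S802
    send_to_scheduler(placement)                          # S803
    return placement
```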

In one embodiment, for S801, the task type may be determined according to the following steps:

Step 1. Determine the resource requirement of the task, the communication traffic requirement information of the task, and the resource quantity in the node server.

Here, the client can determine the resource requirement of the task and the communication traffic requirement information of the task based on the analysis of the configuration file.

Alternatively, the task's resource requirement and the task's traffic requirement information can be obtained directly from the configuration file, which is not limited here.

Also, the client can determine the resource quantity in the node server in the task scheduling system. Here, the number of resources is the number of GPUs.

Step 2: Determine the task type of the task based on the determined resource demand, the communication traffic demand information of the task, and the resource quantity in the node server.

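Here, the number of target node servers corresponding to the task may be determined based on the resource demand and the resource quantity in the node server, where ⌈ ⌉ indicates rounding up.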
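A possible classification along these lines is sketched below. The number of node servers is taken as the resource demand divided by the per-node resource quantity, rounded up. Only the single-node multi-resource communication branch follows a rule stated explicitly in this description; the remaining branches, the function name, and the boolean `comm_heavy` input are assumptions made for illustration.

```python
def classify(demand: int, per_node: int, comm_heavy: bool) -> str:
    """Classify a task from its resource demand and traffic demand."""
    nodes_needed = -(-demand // per_node)  # integer ceiling of demand / per_node
    if nodes_needed > 1:
        return "multi-node"
    if demand == 1:
        return "single-resource"
    if comm_heavy:
        # One node server, more than one resource, communication dominates.
        return "single-node multi-resource communication"
    return "single-node multi-resource computing"
```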

In another embodiment, for S801, the task type may also be determined according to the following steps:

Step 1: Obtain idle resources on the node server.

Here, after the client obtains the configuration file submitted by the user, it can simultaneously obtain idle resources on the node server.

Step 2: Determine at least one target node server for executing the task and target resources on the target node server based on idle resources on the node server, resource requirements of the task, and traffic requirements of the task.

Here, based on the resource demand of the task to be processed, the idle resources on the node servers, and the traffic demand information, the client can directly determine a target node server and target resources that satisfy both the resource demand and the traffic demand indicated by the traffic demand information. The target resources in the target node server have an optional preset topology.

Step 3: Determine the task type of the task based on at least one target node server and target resources on the target node server.

Here, the task type of the task may be determined based on the determined number of target node servers, the number of target resources on the target node server, and traffic demand information.

During specific implementation, for the step of determining the task type mentioned in the above embodiments, reference may be made to the step of determining the task type in the task scheduling method applied to the scheduler side; the only difference is that the execution subject changes from the scheduler to the client corresponding to the scheduler. Details are therefore not repeated here.

Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.

Based on the same inventive concept, the embodiment of the present disclosure also provides a task scheduling device corresponding to the task scheduling method applied to the scheduler. Because the problem-solving principle of the device in the embodiment of the present disclosure is similar to that of the above-mentioned task scheduling method, reference may be made to the implementation of the method for the implementation of the device, and repeated details are omitted.

As shown in FIG. 9 , it is a schematic diagram of a task scheduling device applied to the scheduler side provided by an embodiment of the present disclosure, including:

An acquisition module 901, configured to acquire task information of tasks to be processed;

The first determining module 902 is configured to determine the task type of the task based on the task information;

The second determination module 903 is configured to determine the target node server for executing the task and the target resource on the target node server based on the task type;

An allocating module 904, configured to allocate the target node server and target resources on the target node server to execute the task.

In a possible implementation manner, the first determining module 902 is configured to determine the resource requirement of the task and the traffic requirement information of the task according to the task information;

Determine the number of resources in the node server;

In a possible implementation manner, the second determination module 903 is configured to determine the target node for executing the task based on the task type of the task, the resource demand of the task, and the idle resources on the node server server and the target resource on the target node server.

In a possible implementation manner, the second determining module 903 is configured to determine at least one initial node server matching the task type based on the task type of the task;

The second determination module 903 is configured to, based on the number of idle resources on the initial node servers in the at least one initial node server, select the initial node server whose number of idle resources is greater than or equal to the resource requirement as the first node server;

In a possible implementation manner, the second determination module 903 is configured to determine at least one preset topology corresponding to the resource requirement;

In a possible implementation manner, the second determination module 903 is configured to use a preset topology in the target node server as the target topology;

The second determining module 903 is configured to, based on the number of idle resources on the initial node servers in the at least one initial node server, screen a second node server whose number of idle resources is less than or equal to the resource requirement;

In a possible implementation manner, the second determining module 903 is further configured to, when no second node server having any one of the preset topologies is screened out from the second node servers, reacquire the number of idle resources on the node servers after a preset period of time or after the idle resources on at least one node server change, and return to the step of screening, based on the number of idle resources on the initial node servers in the at least one initial node server, second node servers whose number of idle resources is less than or equal to the resource requirement.

In a possible implementation manner, the task type includes a multi-node type;

The second determination module 903 is configured to determine, from the at least one initial node server, the maximum number of third node servers required to execute the task based on the resource requirement of the task; wherein, the third node The server is an empty initial node server;

In a possible implementation manner, the second determination module 903 is further configured to determine the difference between the total amount of resources and the resource requirement;

The second determining module 903 is configured to, based on the number of idle resources on the initial node servers in the at least one initial node server, use the initial node server whose number of idle resources is equal to the resource requirement as the target node server.

In a possible implementation manner, the second determination module 903 is further configured to determine a preset topology according to the following steps:

In a possible implementation manner, the second determining module 903 is configured to sort the topological structures according to the order of the communication waiting time from low to high;

In a possible implementation manner, the second determination module 903 is configured to, for any number of idle resources, use the topology whose communication waiting time is less than a preset waiting threshold as the preset topology corresponding to that number of idle resources.

In a possible implementation manner, the first determining module 902 is configured to acquire idle resources on the node server;

In a possible implementation manner, the second determination module 903 is further configured to determine the resource allocation priority of each task based on the task type of each task when multiple tasks to be processed are obtained;

The second determination module 903 is configured to set the resource allocation priority of the task of the single-node multi-resource communication type as the first priority;

In a possible implementation manner, the task information includes the resource requirement, and the first determining module 902 is further configured to obtain the resource requirement according to the following steps:

Obtain the configuration file of the task to be processed;

In a possible implementation manner, the task information includes the traffic demand information, and the first determining module 902 is further configured to acquire the traffic demand information according to the following steps:

In a possible implementation manner, the first determining module 902 is configured to determine, based on the obtained configuration file of the task to be processed or the obtained task request of the task to be processed, the calculation amount information and communication amount information; wherein the task request includes the calculation amount information and communication amount information.

Obtain the configuration file of the task to be processed;

In a possible implementation manner, the first determining module 902 is configured to determine the number of target node servers corresponding to the task to be processed based on the resource requirement and the number of resources in the node server;

The first determining module 902 is configured to: when the number of target node servers corresponding to the task to be processed is equal to 1, the number of target resources corresponding to the task to be processed is greater than 1, and the traffic demand information indicates When the communication volume requirement of the task to be processed is higher than the calculation volume requirement, determine that the task type of the task to be processed is a single-node multi-resource communication type;

Based on the same inventive concept, the embodiment of the present disclosure also provides a task scheduling device corresponding to the task scheduling method applied to the client. Because the problem-solving principle of the device in the embodiment of the present disclosure is similar to that of the above-mentioned task scheduling method, reference may be made to the implementation of the method for the implementation of the device, and repeated details are omitted.

FIG. 10 is a schematic diagram of a task scheduling device applied to a client provided by an embodiment of the present disclosure. The device includes:

A third determining module 1001, configured to determine the task type of the task to be processed;

A fourth determining module 1002, configured to determine a target node server for executing the task and target resources on the target node server based on the task type of the task;

A sending module 1003, configured to send the information of the target node server and the information of the target resource to a scheduler, so that the scheduler allocates the target node server and the target resource to the task to be processed.

In a possible implementation manner, the third determining module 1001 is configured to determine the resource requirement of the task, the traffic demand information of the task, and the resource quantity in the node server;

In a possible implementation manner, the third determining module 1001 is configured to acquire idle resources on the node server;

For the description of the processing flow of each module in the device and the interaction flow between the modules, reference may be made to the relevant description in the above method embodiment, and details will not be described here.

An embodiment of the present disclosure also provides a computer device, as shown in FIG. 11 , which is a schematic structural diagram of a computer device provided by an embodiment of the present disclosure, including:

A processor 1101 and a memory 1102; the memory 1102 stores machine-readable instructions executable by the processor 1101, and the processor 1101 is configured to execute the machine-readable instructions stored in the memory 1102. When the machine-readable instructions are executed by the processor 1101, the processor 1101 performs the following steps: S101: obtain the task information of the task to be processed; S102: determine the task type of the task based on the task information; S103: determine, based on the task type, the target node server for executing the task and the target resource on the target node server; and S104: allocate the target node server and the target resource on the target node server to execute the task.

The above-mentioned memory 1102 includes an internal memory 1121 and an external memory 1122. The internal memory 1121 is used to temporarily store the calculation data of the processor 1101 and the data exchanged with the external memory 1122, such as a hard disk. The processor 1101 exchanges data with the external memory 1122 through the internal memory 1121.

For the specific execution process of the above instructions, reference may be made to the steps of the task scheduling method described in the embodiments of the present disclosure, which will not be repeated here.

Embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the steps of the task scheduling method described in the foregoing method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.

The computer program product of the task scheduling method provided by the embodiments of the present disclosure includes a computer-readable storage medium storing program code. The instructions included in the program code can be used to execute the steps of the task scheduling method described in the above method embodiments. For details, reference may be made to the foregoing method embodiments, which are not repeated here.

The computer program product can be realized by means of hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).

Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the device described above can refer to the corresponding process in the foregoing method embodiments, and details are not repeated here. In the several embodiments provided in the present disclosure, it should be understood that the disclosed devices and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components can be combined, or some features can be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be realized through some communication interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical, or other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.

If the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present disclosure, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure. The aforementioned storage media include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Finally, it should be noted that the above-mentioned embodiments are only specific implementations of the present disclosure, used to illustrate rather than limit its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features. Such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and should be included within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure should be defined by the protection scope of the claims.

Claims

1. A task scheduling method, characterized in that it is applied to the scheduler side, comprising:

obtaining task information of a task to be processed;

determining a task type of the task based on the task information;

determining, based on the task type, a target node server for executing the task and target resources on the target node server;

allocating the target node server and the target resources on the target node server to execute the task.
2. The method according to claim 1, wherein determining the task type of the task based on the task information comprises:

According to the task information, determine the resource requirement of the task and the traffic requirement information of the task;

Determine the number of resources in the node server;

The task type of the task is determined based on the determined resource demand, communication traffic demand information of the task, and resource quantity in the node server.
3. The method according to claim 1, wherein determining the target node server for executing the task and the target resources on the target node server based on the task type comprises:

A target node server for executing the task and target resources on the target node server are determined based on the task type of the task, the resource requirement of the task, and the idle resources on the node server.
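4. The method according to claim 3, wherein determining the target node server for executing the task and the target resources on the target node server based on the task type of the task, the resource requirement of the task, and the idle resources on the node server comprises: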

determining at least one initial node server matching the task type based on the task type of the task;

Based on the resource requirements of the task and idle resources on the initial node server, determine a target node server for executing the task and target resources on the target node server from the at least one initial node server.
5. The method according to claim 4, wherein the task type includes a single-node multi-resource communication type;

The determining the target node server for performing the task from the at least one initial node server includes:

Based on the number of idle resources on the initial node server in the at least one initial node server, select the initial node server whose number of idle resources is greater than or equal to the resource demand as the first node server;

A target node server for executing the task is determined based on idle resources on the first node server and resource requirements of the task.
6. The method according to claim 5, wherein determining the target node server for executing the task based on the idle resources on the first node server and the resource requirement of the task comprises:

determining at least one preset topology structure corresponding to the resource requirement;

Selecting, among the first node servers, the first node server having any one of the preset topology structures and the least number of idle resources as the target node server.
7. The method according to claim 6, wherein determining the target resource on the target node server comprises:

Taking the preset topology in the target node server as the target topology;

Taking idle resources constituting the target topology in the target node server as the target resources.
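The selection described in claims 5 to 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `Node` class, the function name `pick_comm_node`, and the representation of a preset topology as a set of resource indices are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    idle: frozenset  # indices of idle resources (hypothetical representation)


def pick_comm_node(nodes, demand, preset_topologies):
    """Return (node, target_resources), or None if no node qualifies."""
    # Claim 5: keep nodes whose idle-resource count covers the demand.
    first_nodes = [n for n in nodes if len(n.idle) >= demand]
    # Claim 6: among those, prefer the node with the fewest idle resources
    # whose idle set still contains one of the preset topologies.
    for node in sorted(first_nodes, key=lambda n: len(n.idle)):
        for topo in preset_topologies:
            if len(topo) == demand and topo <= node.idle:
                # Claim 7: the idle resources forming the topology are the target.
                return node, topo
    return None


nodes = [
    Node("a", frozenset({0, 1, 2, 3})),
    Node("b", frozenset({0, 1})),
    Node("c", frozenset({2})),
]
presets = [frozenset({0, 1}), frozenset({2, 3})]
print(pick_comm_node(nodes, 2, presets))
```

Iterating over the candidates in ascending order of idle-resource count makes this a best-fit choice: a node with more idle resources is used only when no smaller one contains a preset topology.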
8. The method according to claim 4, wherein the task type includes a single-node multi-resource computing type;

The determining the target node server for performing the task from the at least one initial node server includes:

Based on the number of idle resources on the initial node server in the at least one initial node server, screening the second node server whose number of idle resources is less than or equal to the resource requirement;

A target node server for executing the task is determined based on idle resources on the second node server and resource requirements of the task.
9. The method according to claim 8, wherein determining the target node server for executing the task based on the idle resources on the second node server and the resource requirement of the task comprises:

determining at least one preset topology structure corresponding to the resource requirement;

selecting, from among the second node servers, any second node server having the preset topology structure as the target node server.
10. The method according to claim 9, wherein determining the target node server for executing the task based on the idle resources on the second node server and the resource requirement of the task further comprises:

in the case that no second node server having any one of the preset topology structures is screened out from the second node servers, after a preset period of time has elapsed or after at least one idle resource on the node server has changed, reacquiring the node server and returning to the step of screening, based on the number of idle resources on the initial node server among the at least one initial node server, the second node server whose number of idle resources is less than or equal to the resource requirement.
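The retry behaviour of claim 10 can be illustrated with a simple polling loop. This is only a sketch: the names `pick_with_retry`, `get_nodes`, and `try_pick` are hypothetical, the `max_rounds` bound is an added assumption (the claim sets no limit), and only the timer trigger is modelled, not the trigger on a change in idle resources.

```python
import time


def pick_with_retry(get_nodes, try_pick, wait_s=1.0, max_rounds=10):
    """Call try_pick on freshly acquired node data until it succeeds."""
    for _ in range(max_rounds):
        nodes = get_nodes()       # reacquire the current node servers
        choice = try_pick(nodes)  # screening and topology match (claims 8 and 9)
        if choice is not None:
            return choice
        time.sleep(wait_s)        # the "preset period of time" of claim 10
    return None                   # assumption: give up after max_rounds


# Hypothetical usage: idle counts per node change between polls.
snapshots = iter([{"a": 1}, {"a": 1}, {"a": 4}])
result = pick_with_retry(
    lambda: next(snapshots),
    lambda nodes: next((n for n, idle in nodes.items() if idle >= 4), None),
    wait_s=0.0,
)
print(result)
```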
11. The method according to claim 4, wherein the task type includes a multi-node type;

The determining the target node server for performing the task from the at least one initial node server includes:

Based on the resource requirement of the task, from the at least one initial node server, determine the maximum number of third node servers required to perform the task; wherein, the third node server is an empty initial node server;

The third node server with the maximum number is used as the target node server.
12. The method according to claim 11, wherein determining the target node server for performing the task from the at least one initial node server further comprises:

When the total amount of resources on the largest number of third node servers is less than the resource requirement, determine the difference between the resource total and the resource requirement;

Based on the number of idle resources on the initial node server, screening a fourth node server whose number of idle resources is less than or equal to the difference;

Based on the free resources on the fourth node server and the difference, determine a target node server for executing the task.
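The first part of the multi-node selection in claims 11 and 12 can be sketched as below. The sketch rests on assumptions not stated in the claims: all node servers are taken to have the same number of resources (`per_node`), "the maximum number of third node servers" is read as the number of whole servers the demand can fill, and the function name `pick_whole_nodes` is hypothetical. Placement of the remaining difference on a fourth node server is left out.

```python
def pick_whole_nodes(idle_by_node, demand, per_node):
    """Return (chosen_empty_nodes, difference_still_to_place)."""
    # Third node servers: initial node servers whose resources are all idle.
    empty = [name for name, idle in idle_by_node.items() if idle == per_node]
    # Assumed reading of claim 11: as many whole servers as the demand fills.
    wanted = demand // per_node
    chosen = empty[:wanted]
    # Claim 12: difference between the demand and the resources on the chosen servers.
    difference = demand - len(chosen) * per_node
    return chosen, difference


idle_by_node = {"a": 8, "b": 8, "c": 3, "d": 8}
print(pick_whole_nodes(idle_by_node, 20, 8))
```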
13. The method according to claim 4, wherein the task type includes a single-resource type;

The determining the target node server for performing the task from the at least one initial node server includes:

Based on the number of idle resources on the initial node server in the at least one initial node server, the initial node server whose idle resource quantity is equal to the resource requirement is used as the target node server.
14. The method according to claim 1, further comprising the step of determining a preset topology:

For the idle resources in any node server to be matched, determine the topology structure corresponding to any number of idle resources in the node server to be matched;

Determine the communication waiting time of the topology based on the communication medium between idle resources;

Based on the communication waiting time, a preset topology corresponding to any number of idle resources in the node server to be matched is determined.
15. The method according to claim 14, wherein determining the preset topology corresponding to any number of idle resources in the node server to be matched based on the communication waiting time comprises:

Sorting the topological structures according to the order of the communication waiting time from low to high;

The topological structure whose sort order is smaller than the preset order is used as the preset topology structure corresponding to the node server to be matched.
16. The method according to claim 14, wherein determining the preset topology corresponding to any number of idle resources in the node server to be matched based on the communication waiting time comprises:

For any number of idle resources, the topology structure whose communication waiting time is shorter than the preset waiting threshold is taken as the preset topology structure corresponding to the number of idle resources in the node server to be matched.
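Claims 14 to 16 can be illustrated by enumerating candidate topologies and ranking them by communication waiting time. The cost model here is an assumption made for the example: the waiting time of a topology is taken as the slowest pairwise link among its resources, and `link_cost` is a hypothetical table keyed by resource pairs. The function name and parameters are likewise illustrative.

```python
from itertools import combinations


def preset_topologies(idle, k, link_cost, top_n=None, max_wait=None):
    """Rank k-resource topologies of the idle set by communication waiting time."""
    scored = []
    for combo in combinations(sorted(idle), k):
        # Assumed model: waiting time is the slowest link between any two members.
        wait = max(
            (link_cost[frozenset(pair)] for pair in combinations(combo, 2)),
            default=0.0,
        )
        scored.append((wait, combo))
    scored.sort()  # claim 15: order from low to high waiting time
    if top_n is not None:
        scored = scored[:top_n]  # claim 15: keep those ranked before the preset order
    if max_wait is not None:
        scored = [s for s in scored if s[0] < max_wait]  # claim 16: threshold
    return [combo for _, combo in scored]


link_cost = {
    frozenset({0, 1}): 1.0,  # e.g. a fast direct link (hypothetical values)
    frozenset({0, 2}): 5.0,
    frozenset({1, 2}): 5.0,
}
print(preset_topologies({0, 1, 2}, 2, link_cost, top_n=1))
```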
17. The method according to claim 1, wherein determining the task type of the task based on the task information comprises:

Obtain idle resources on the node server;

According to the task information, determine the resource requirement of the task and the traffic requirement information of the task;

Determine a target node server for executing the task and target resources on the target node server based on idle resources on the node server, resource requirements of the task, and traffic demand information of the task;

The task type of the task is determined based on the target node server executing the task and the target resource on the target node server.
18. The method according to any one of claims 1 to 17, further comprising:

When the obtained tasks to be processed include multiple tasks, based on the task type of each task, determine the resource allocation priority of each task;

Based on the resource allocation priority of each task, a target node server for executing the task and target resources on the target node server are determined.
19. The method according to claim 18, wherein the task type includes a single-node multi-resource communication type, a single-node multi-resource computing type, a multi-node type, and a single-resource type;

The determining the resource allocation priority of each task based on the task type of each task includes:

Setting the resource allocation priority of the task of the single-node multi-resource communication type to the first priority;

Setting the resource allocation priority of the task of the single-node multi-resource computing type to the second priority;

Setting the resource allocation priority of the task of the multi-node type to the third priority;

Setting the resource allocation priority of the single-resource type task to the fourth priority.
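The priority assignment of claims 18 and 19 can be sketched with an ordered enumeration. The enum name, the member names, and the assumption that the first priority is served before the fourth are illustrative choices, not statements from the claims.

```python
from enum import IntEnum


class TaskType(IntEnum):
    # Values mirror the first to fourth priorities of claim 19.
    SINGLE_NODE_MULTI_COMM = 1
    SINGLE_NODE_MULTI_COMPUTE = 2
    MULTI_NODE = 3
    SINGLE_RESOURCE = 4


def order_by_priority(tasks):
    """Sort (name, TaskType) pairs; assumes a lower value is allocated first."""
    return sorted(tasks, key=lambda task: task[1])


tasks = [
    ("t1", TaskType.MULTI_NODE),
    ("t2", TaskType.SINGLE_RESOURCE),
    ("t3", TaskType.SINGLE_NODE_MULTI_COMM),
    ("t4", TaskType.SINGLE_NODE_MULTI_COMPUTE),
]
print([name for name, _ in order_by_priority(tasks)])
```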
20. The method according to claim 2 or 17, wherein the task information includes the resource requirement, and the method further includes the step of obtaining the resource requirement:

Obtain the configuration file of the task to be processed;

Obtain the resource requirement of the task to be processed from the configuration file.
21. The method according to claim 2 or 17, wherein the task information includes the traffic demand information, and the method further includes the step of obtaining the traffic demand information:

Acquiring calculation amount information and communication amount information of the task to be processed;

Based on the calculation amount information and communication amount information, the communication amount requirement information is determined.
22. The method according to claim 21, wherein obtaining the calculation amount information and the communication amount information of the task to be processed comprises:

determining the calculation amount information and the communication amount information of the task to be processed based on the obtained configuration file of the task to be processed or the obtained request for the task to be processed, wherein the task request includes the calculation amount information and the communication amount information.
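Claims 21 and 22 do not state how the traffic demand information is derived from the calculation amount and the communication amount. One possible reading, given purely as an assumption, is to convert both amounts into estimated times and compare them. The function name, the parameters, and the throughput figures below are all hypothetical.

```python
def comm_dominates(compute_flops, comm_bytes, flops_per_s, bytes_per_s):
    """Return True if estimated communication time exceeds compute time.

    Assumed model: time = amount / throughput for both quantities.
    """
    compute_time = compute_flops / flops_per_s
    comm_time = comm_bytes / bytes_per_s
    return comm_time > compute_time


# Hypothetical task: 1e12 FLOPs of compute, 1e10 bytes exchanged,
# on hardware with 1e13 FLOP/s and a 1e10 byte/s interconnect.
print(comm_dominates(1e12, 1e10, 1e13, 1e10))
```

Under this reading, a task for which the function returns True would be treated as communication-bound when its type is later determined.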
23. The method according to claim 2 or 17, wherein the task information includes the traffic demand information, and the method further includes the step of obtaining the traffic demand information:

Obtain the configuration file of the task to be processed;

Acquiring traffic demand information of the task to be processed from the configuration file.
24. The method according to claim 2, wherein determining the task type of the task based on the determined resource requirement, the traffic demand information of the task, and the resource quantity in the node server comprises:

Determine the number of target node servers corresponding to the task to be processed based on the resource demand and the number of resources in the node server;

The task type of the task to be processed is determined based on the number of target node servers corresponding to the task to be processed and the traffic demand information.
25. The method according to claim 24, wherein the task type includes a single-node multi-resource communication type, a single-node multi-resource computing type, a multi-node type, and a single-resource type;

The determining the task type of the task to be processed based on the number of target node servers corresponding to the task to be processed and the traffic demand information includes:

when the number of target node servers corresponding to the task to be processed is equal to 1, the number of target resources corresponding to the task to be processed is greater than 1, and the traffic demand information indicates that the communication volume requirement of the task to be processed is higher than the calculation volume requirement, determining that the task type of the task to be processed is a single-node multi-resource communication type;

when the number of target node servers corresponding to the task to be processed is equal to 1, the number of target resources corresponding to the task to be processed is greater than 1, and the traffic demand information indicates that the calculation volume requirement of the task to be processed is higher than the communication volume requirement, determining that the task type of the task to be processed is a single-node multi-resource computing type;

When the number of target node servers corresponding to the task to be processed is greater than 1, determine that the task type of the task to be processed is a multi-node type;

In a case where the number of target resources corresponding to the task to be processed is equal to 1, it is determined that the task type of the task to be processed is a single resource type.
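The four-way classification of claims 24 and 25 can be sketched as a small function. Two points are assumptions rather than claim language: the number of target node servers is computed as the resource requirement divided by the per-server resource count, rounded up, and the traffic demand information is reduced to a single boolean. The function name and the returned strings are illustrative.

```python
import math


def classify(resource_demand, per_node, comm_higher_than_compute):
    """Map a task to one of the four task types named in claim 25."""
    # Assumed reading of claim 24: servers needed = ceil(demand / per-server count).
    nodes_needed = math.ceil(resource_demand / per_node)
    if resource_demand == 1:
        return "single-resource"
    if nodes_needed > 1:
        return "multi-node"
    if comm_higher_than_compute:
        return "single-node multi-resource communication"
    return "single-node multi-resource computing"


print(classify(4, 8, True))
```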
26. A task scheduling method, characterized in that it is applied to a client, comprising:

Determine the task type of the pending task;

Based on the task type of the task, determine a target node server for executing the task and target resources on the target node server;

Sending the information of the target node server and the information of the target resource to a scheduler, so that the scheduler allocates the target node server and the target resource to the task to be processed.
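The sending step of claim 26 can be illustrated by the message a client might build for the scheduler. The claims do not define a message format or transport, so the JSON encoding, the field names, and the function name below are all hypothetical; the transport itself is not shown.

```python
import json


def build_allocation_request(task_id, node_name, resource_ids):
    """Serialize the client's choice of target node server and target resources."""
    return json.dumps(
        {
            "task_id": task_id,                         # hypothetical field
            "target_node": node_name,                   # information of the target node server
            "target_resources": sorted(resource_ids),   # information of the target resources
        }
    )


message = build_allocation_request("t1", "node-3", {2, 0})
print(message)
```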
27. The task scheduling method according to claim 26, wherein determining the task type of the task to be processed comprises:

determining the resource requirement of the task, the traffic requirement information of the task, and the resource quantity in the node server;

The task type of the task is determined based on the determined resource demand, communication traffic demand information of the task, and resource quantity in the node server.
28. The task scheduling method according to claim 26, wherein determining the task type of the task to be processed comprises:

Obtain idle resources on the node server;

Determine at least one target node server for executing the task and target resources on the target node server based on idle resources on the node server, resource requirements of the task, and traffic demand information of the task;

A task type of the task is determined based on the at least one target node server and target resources on the target node server.
29. A task scheduling device, characterized in that it comprises:

An acquisition module, configured to acquire task information of tasks to be processed;

A first determining module, configured to determine the task type of the task based on the task information;

A second determining module, configured to determine a target node server for executing the task and target resources on the target node server based on the task type;

An allocating module, configured to allocate the target node server and target resources on the target node server to execute the task.
30. A task scheduling device, characterized in that it is applied to a client, comprising:

The third determining module is used to determine the task type of the task to be processed;

A fourth determining module, configured to determine a target node server for executing the task and target resources on the target node server based on the task type of the task;

A sending module, configured to send the information of the target node server and the information of the target resource to a scheduler, so that the scheduler allocates the target node server and the target resource to the task to be processed.
31. A computer device, characterized by comprising a processor and a memory, wherein the memory stores machine-readable instructions executable by the processor, the processor is configured to execute the machine-readable instructions stored in the memory, and when the machine-readable instructions are executed by the processor, the processor executes the steps of the task scheduling method according to any one of claims 1 to 25, or the steps of the task scheduling method according to any one of claims 26 to 28.
32. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is run by a computer device, the computer device executes the steps of the task scheduling method according to any one of claims 1 to 25, or the steps of the task scheduling method according to any one of claims 26 to 28.
33. A computer program product, comprising computer-readable code, or a non-volatile computer-readable storage medium bearing computer-readable code, wherein when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the steps of the task scheduling method according to any one of claims 1 to 25, or the steps of the task scheduling method according to any one of claims 26 to 28.