WO2022262167A1 - Cluster resource scheduling method and apparatus, electronic device and storage medium - Google Patents

Cluster resource scheduling method and apparatus, electronic device and storage medium Download PDF

Info

Publication number
WO2022262167A1
WO2022262167A1 PCT/CN2021/126478 CN2021126478W WO2022262167A1 WO 2022262167 A1 WO2022262167 A1 WO 2022262167A1 CN 2021126478 W CN2021126478 W CN 2021126478W WO 2022262167 A1 WO2022262167 A1 WO 2022262167A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
gpu
cluster
scheduling
deep learning
Prior art date
Application number
PCT/CN2021/126478
Other languages
French (fr)
Chinese (zh)
Inventor
孙鹏
梁若凡
颜深根
Original Assignee
上海商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤科技开发有限公司 filed Critical 上海商汤科技开发有限公司
Publication of WO2022262167A1 publication Critical patent/WO2022262167A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/548Queue

Definitions

  • the present application relates to the technical field of distributed systems, in particular to a cluster resource scheduling method and device, electronic equipment and storage media.
  • Embodiments of the present application provide a cluster resource scheduling method and device, electronic equipment, and a storage medium. By developing the task scheduling strategy and the resource allocation strategy in the first running environment, it is beneficial to reduce the development cost of the resource scheduling algorithm.
  • the embodiment of the present application provides a cluster resource scheduling method, the method including:
  • the resource scheduling request includes request parameters
  • Adjusting the task scheduling policy and the preset resource allocation policy and deploying the adjusted task scheduling policy and the preset resource allocation policy in the second operating environment.
  • the request parameter includes the task type of the deep learning task
  • the task scheduling policy is executed according to the request parameter to schedule the resource to request the corresponding deep learning task Added to the task queue, including:
  • the request parameters also include the average completion time and average waiting time of historical deep learning tasks
  • the implementation of the preset resource allocation strategy is obtained from the GPU cluster Identify at least one target GPU, including:
  • the target task to be requested by the deep learning task is determined from at least one task partition of the GPU cluster according to the task type of the deep learning task Before partitioning, the method also includes:
  • Nodes are classified according to the switches connected to the nodes in the GPU cluster to obtain at least one network topology.
  • a first preset resource allocation strategy or a second preset resource allocation strategy is executed according to the amount of GPU resources to be requested, so as to determine from the target task partition After obtaining the at least one target GPU, the method also includes:
  • the second operating environment also includes the graphics processing unit GPU cluster, and a cluster manager SLURM is used to monitor the GPU resources in the graphics processing unit GPU cluster
  • the management, deploying the adjusted task scheduling policy and the preset resource allocation policy in the second operating environment includes:
  • the task scheduling strategy includes a combination of one or more of a preemptive scheduling strategy, a non-preemptive scheduling strategy, and a learning scheduling strategy.
  • the obtaining a resource scheduling request for a GPU in a graphics processor GPU cluster includes:
  • the resource scheduling request through the preset interface sacct API provided by the cluster manager SLURM; the resource scheduling request is historical deep learning processed by the graphics processor GPU cluster in the second operating environment
  • the task record for the task Obtain the resource scheduling request through the preset interface sacct API provided by the cluster manager SLURM; the resource scheduling request is historical deep learning processed by the graphics processor GPU cluster in the second operating environment
  • the task record for the task Obtain the resource scheduling request through the preset interface sacct API provided by the cluster manager SLURM; the resource scheduling request is historical deep learning processed by the graphics processor GPU cluster in the second operating environment The task record for the task.
  • the embodiment of the present application provides a cluster resource scheduling device, which includes:
  • a transceiver unit configured to acquire a resource scheduling request for a GPU in a GPU cluster in a first operating environment; the resource scheduling request includes request parameters;
  • the processing unit is configured to execute a task scheduling strategy according to the request parameter, add the deep learning task corresponding to the resource scheduling request to the task queue, and execute a preset resource allocation strategy to determine at least a target GPU;
  • the processing unit is further configured to dispatch the deep learning task to the at least one target GPU for processing
  • the processing unit is further configured to adjust the task scheduling policy and the preset resource allocation policy, and deploy the adjusted task scheduling policy and the preset resource allocation policy in the second running environment.
  • an embodiment of the present application provides an electronic device, including: a processor, the processor is connected to a memory, the memory is used to store a computer program, and the processor is used to execute the computer program stored in the memory , so that the electronic device executes the method described in the first aspect.
  • an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and the computer program causes a computer to execute the method as described in the first aspect.
  • the embodiment of the present application provides a computer program product, including computer readable codes, or a non-volatile computer readable storage medium carrying computer readable codes, when the computer readable codes are stored in the electronic device
  • the processor in the electronic device executes to implement the method as described in the first aspect.
  • the resource scheduling request for the GPU in the graphics processor GPU cluster can be obtained in the first operating environment, and then the task scheduling policy can add the deep learning task corresponding to the resource scheduling request to the task In the queue, and execute the preset resource allocation strategy to determine at least one target GPU from the graphics processor GPU cluster, then schedule the deep learning task to at least one target GPU for processing, adjust the task scheduling strategy and the preset resource allocation strategy, Deploy the adjusted task scheduling policy and preset resource allocation policy in the second operating environment.
  • the first operating environment is used to test and adjust the task scheduling strategy and resource allocation strategy, and the tested and adjusted task scheduling strategy and resource allocation strategy are deployed in the second operating environment for resource scheduling, which is conducive to reducing the number of resources directly in the second operating environment.
  • the deployment process consumption caused by the testing and adjustment of task scheduling strategy and resource allocation strategy in the environment can reduce the development cost of resource scheduling algorithm.
  • FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of the architecture of a cluster resource scheduling system provided by an embodiment of the present application
  • FIG. 3 is a schematic diagram of a visualization array provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a cluster resource scheduling method provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of another cluster resource scheduling method provided by the embodiment of the present application.
  • FIG. 6 is a block diagram of functional units of a cluster resource scheduling device provided in an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application.
  • the application environment includes a user terminal, an algorithm node, a resource scheduling node, and a graphics processor GPU cluster.
  • the user terminal includes but not limited to smart phones, tablet computers, desktop computers and other devices
  • the graphics processing unit GPU cluster is a computer cluster, which includes multiple computing nodes, and each computer point is equipped with at least one GPU.
  • the user terminal is used to submit a deep learning task processing request to the algorithm node, such as the training of the neural network model, etc.
  • the algorithm node When the algorithm node receives the deep learning task processing request, it submits the resource scheduling request to the resource scheduling node, and the resource scheduling node Execute resource scheduling algorithms when receiving resource scheduling requests, such as task scheduling policies, resource allocation policies, etc., to search for GPU resources in the graphics processor GPU cluster, and return the found available GPU resources to the algorithm node , and dispatch deep learning tasks to the found available GPU resources for execution or processing.
  • resource scheduling requests such as task scheduling policies, resource allocation policies, etc.
  • the algorithm node may be a device storing an AI (Artificial Intelligence, artificial intelligence) algorithm
  • the device may be a server of a user terminal
  • the algorithm node and the resource scheduling node may be the same device or different devices .
  • the simulator running in the resource scheduling node, which can maintain the nodes in the graphics processor GPU cluster, such as the usage of the GPU in the node, the CPU (central processing unit, central processing unit) Usage, memory usage, list of running tasks in the node, etc.
  • the simulator can support the development of scheduling algorithms, such as testing and adjusting the scheduling algorithm in the simulator, which can reduce the scheduling algorithm directly in the actual cluster Deployment process consumption caused by testing and adjustment, thereby reducing the development cost of resource scheduling algorithms.
  • Figure 2 is a schematic diagram of the architecture of a cluster resource scheduling system provided by the embodiment of this application, as shown in Figure 2, the architecture mainly includes the actual SLURM cluster and cluster simulation
  • the cluster simulator can be a simulator running on a resource scheduling node.
  • the cluster simulator can maintain the usage of each node in the actual SLURM cluster, that is, the same as the actual SLURM cluster, the cluster simulator also uses task partitions- Three-level management model for nodes-resource GPUs.
  • the deep learning tasks executed on the actual SLURM cluster are submitted to the actual scheduler Real Scheduler (scheduler in the actual SLURM cluster), and the actual scheduler Real Scheduler makes resource scheduling requests to the actual SLURM cluster, and the actual SLURM cluster allocates resources to the available nodes , and return to the actual scheduler Real Scheduler, and the actual scheduler Real Scheduler performs task scheduling.
  • the cluster simulator is driven by the configuration file slurm.conf of the cluster manager SLURM for cluster simulation, that is, the cluster simulator maintains the same graphics processor GPU cluster as the actual SLURM cluster, and the scheduling on the cluster simulator is determined by the actual SLURM cluster Driven by the task record trace of the historical deep learning tasks executed on the network, the simulation scheduler Sim Scheduler (the scheduler in the cluster simulator) sends a resource scheduling request to the cluster simulator according to the task record trace.
  • the simulation scheduler Sim Scheduler will Execute the task scheduling strategy to queue the deep learning tasks corresponding to the task record trace, and the cluster simulator will execute the resource allocation strategy to determine the available GPU resources from the graphics processor GPU cluster for resource allocation, and return to the simulation scheduler Sim Scheduler , and the task scheduling is performed by the simulation scheduler Sim Scheduler.
  • the task scheduling strategy in the above system can be first come first service (First Come First Service, FCFS), multi-level feedback queue (Multi-Level Feedback Queue, MLFQ), short job priority (Shortest Job First, SJF) , Reinforcement Learning (Reinforcement Learning, RL), and so on.
  • the resource allocation strategy can be the first-fit algorithm first-fit, the best-fit algorithm best-fit, the computing power platform free-gpu provided by Google, and so on.
  • the simulation scheduler Sim Scheduler also visualizes the simulation results of various scheduling algorithms on the cluster simulator.
  • the GPU usage in the graphics processor GPU cluster can be dynamically displayed in the form of a histogram, or it can also be combined with The array shown in Figure 3 is visualized.
  • each large rectangle represents a node in the graphics processor GPU cluster
  • the numbers in the large rectangle represent node identifiers, such as: 43, 55, 143, etc.
  • the large rectangle represents the GPU usage in that node.
  • the visual display on the cluster simulator may also be presented in other forms, and the histogram and the array are merely examples and do not impose any limitation on the embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a cluster resource scheduling method provided by an embodiment of the present application. This method is applied to resource scheduling nodes. As shown in Figure 4, the method includes the following steps:
  • 401 In a first running environment, acquire a resource scheduling request for a GPU in a GPU cluster; the resource scheduling request includes a request parameter.
  • the first operating environment refers to the cluster simulator, that is, the solution is to test and simulate the resource scheduling algorithm on the cluster simulator.
  • the resource scheduling request is the task record of the historical deep learning tasks processed on the graphics processor GPU cluster in the second operating environment
  • the second operating environment refers to the actual SLURM cluster
  • the cluster simulator and the actual SLURM cluster maintain the same graphics processor GPU cluster (or node), therefore, the resource scheduling on the cluster simulator can be processed on the graphics processor GPU cluster in the actual SLURM cluster
  • the task record trace records the relevant parameters of the historical deep learning task, such as the average completion time, average waiting time, GPU usage, task volume, and task type.
  • the task record trace of the task is used as training data to drive the cluster simulator, which can make the testing and simulation of the scheduling algorithm closer to the actual situation.
  • the resource scheduling request can be obtained through the preset interface sacct API provided by the cluster manager SLURM, that is, the above-mentioned task record trace can be obtained through the preset interface sacct API.
  • the request parameter includes the task type of the deep learning task, for example, the task type may be deep learning model training or online prediction.
  • the above-mentioned execution of the task scheduling strategy according to the request parameters adds the deep learning task corresponding to the resource scheduling request to the task queue, including:
  • the operation of classifying the nodes in the GPU cluster is performed.
  • the nodes are classified according to the task types of the nodes in the GPU cluster to obtain at least one Task partitioning, such as taking the node that performs model training as a task partition, taking the node that performs online prediction as a task partition, and so on.
  • Each task partition has its own independent resource pool and task queue, and each task partition is preset with different task scheduling policies.
  • the nodes are classified according to the switches connected to the nodes in the GPU cluster to obtain at least one network topology.
  • nodes 43, 55, 46, and 52 are connected to a switch, and the four nodes are regarded as a network topology
  • the nodes 94, 97, 100, and 101 are connected to a switch
  • the four nodes are regarded as a network topology, and so on.
  • the operation of classifying the nodes according to the task types of the nodes in the GPU cluster can be completed by calling the configuration file slurm.conf of the cluster manager SLURM or the preset interface sinfo API.
  • the operation of classifying the nodes according to the switches connected to the nodes in the GPU cluster can be completed by calling the preset interface iblinkinfo API of the wireless bandwidth infiniband.
  • the task partition whose task type is the same as the task type of the deep learning task corresponding to the resource scheduling request is used as the target task partition, and the preset task scheduling strategy of the target task partition is executed to make the deep learning task Added to the task queue of the target task partition to wait.
  • the task scheduling strategy includes a combination of one or more of a preemptive scheduling strategy, a non-preemptive scheduling strategy and a learning scheduling strategy.
  • a deep learning task can be executed using only one task scheduling strategy, or it can be executed in parts using different task scheduling strategies.
  • the preemptive scheduling strategy can be MLFQ, etc.
  • This type of task scheduling strategy allows the suspension and recovery of running tasks, which can be realized through the access interface provided by the cluster simulator.
  • developers can also configure different configurations for this type of task scheduling strategy.
  • the non-preemptive scheduling strategy can be FCFS, SJF, etc.
  • the learning scheduling strategy can be based on machine learning, reinforcement learning, reverse reinforcement Learning strategy, the development of this type of task scheduling strategy often requires a large number of task record traces as training data, after multiple decision iterations to achieve a better decision result, which is often difficult to achieve in the actual environment, through the cluster Simulating with a simulator helps to reduce the development difficulty of this type of task scheduling strategy.
  • developers can use various task scheduling strategies for testing and simulation on the cluster simulator, so it is more flexible.
  • the aforementioned implementation of the preset resource allocation strategy determines at least one target GPU from the GPU cluster, including:
  • the amount of GPU resources to be requested refers to how many GPUs are required to execute the deep learning task, such as 4 GPUs with a computing power of 3.7, 8 GPUs with a computing power of 2.5, etc., that is, by recording the parameters in the task trace Learning can calculate the GPU resources required for a deep learning task.
  • a first preset resource allocation strategy or a second preset resource allocation strategy may be executed according to the amount of GPU resources to be requested, wherein the first preset resource allocation strategy may be the first
  • the adaptation algorithm first-fit, the second preset resource allocation strategy can be the best adaptation algorithm best-fit, for example, deep learning tasks that require less GPU resources can use the first adaptation algorithm first-fit, which requires less GPU resources High-level deep learning tasks can use the best-fit algorithm best-fit.
  • the aforementioned idle GPU resources satisfying the calculation conditions refer to meeting the calculation requirements of the best-fit algorithm, that is, finding the best GPU resources.
  • the cluster simulator also allows multiple nodes in the GPU cluster to provide GPU resource support for a single node. For example, a node is performing deep learning tasks, but its GPU performance is low, while other nodes happen to have If there are idle GPU resources, the unexecuted part of the deep learning task can be scheduled to other nodes for execution.
  • the cluster simulator supports operations such as segmentation, migration, and reconstruction of deep learning tasks.
  • the cluster simulator also supports dynamic resource migration and reallocation of resources. For example, two nodes have 8 GPUs, 4 of which are already occupied, and the current task requires 8 GPU resources to execute, then the task It has to be scheduled to execute on the two nodes, that is, the resources of the two nodes are fragmented. In the cluster simulator, for this situation, when the 4 GPU resources of a certain node are released, they can be used to execute another part of the current task. Through such resource migration or reallocation, the existing GPU resources can be reduced. Fragmentation of allocated resources.
  • the method after the first preset resource allocation strategy or the second preset resource allocation strategy is executed according to the amount of GPU resources to be requested to determine the at least one target GPU from the target task partition , the method also includes:
  • the performance of deep learning tasks will be affected by the propensity of GPU resources, for example, the same deep learning task is more likely to be executed on the same node or GPU of the same network topology.
  • the same deep learning task is more likely to be executed on the same node or GPU of the same network topology.
  • At least one target GPU may belong to different nodes or different network topologies.
  • additional communication overhead will be added for deep learning tasks that are not executed on the same node or in the same network topology. , to guarantee its performance.
  • the deep learning tasks waiting in the task queue after at least one target GPU is determined, they can be dispatched to at least one target GPU for processing.
  • the above steps 401-403 are the testing and simulation of the scheduling algorithm (including task scheduling strategy and resource allocation strategy) on the cluster simulator.
  • the task scheduling strategy and preset resource allocation strategy can be adjusted or modified in response to developer input (such as program code or parameters), and the adjusted or modified task scheduling strategy and preset resource allocation strategy can be added to the cluster
  • developer input such as program code or parameters
  • the adjusted or modified task scheduling strategy and preset resource allocation strategy can be added to the cluster
  • the source code modules plugin/select and plugin/sched of the manager SLURM the deployment of the adjusted or modified task scheduling strategy and preset resource allocation strategy in the second operating environment is completed.
  • the resource scheduling request for the GPU in the graphics processor GPU cluster can be obtained, and then the task scheduling policy will assign the depth corresponding to the resource scheduling request to Add the learning task to the task queue, and execute the preset resource allocation strategy to determine at least one target GPU from the graphics processor GPU cluster, then schedule the deep learning task to at least one target GPU for processing, adjust the task scheduling strategy and pre-set A resource allocation policy is set, and the adjusted task scheduling policy and the preset resource allocation policy are deployed in the second operating environment.
  • the first operating environment is used to test and adjust the task scheduling strategy and resource allocation strategy, and the tested and adjusted task scheduling strategy and resource allocation strategy are deployed in the second operating environment for resource scheduling, which is conducive to reducing the number of resources directly in the second operating environment.
  • the deployment process consumption caused by testing and adjusting the task scheduling strategy and resource allocation strategy in the environment, thereby reducing the development cost of the resource scheduling algorithm, reducing the risk of developing the resource scheduling algorithm in the second operating environment, and speeding up the scheduling algorithm development iteration speed.
  • the defects and bottlenecks of the resource scheduling algorithm can be found by testing the resource scheduling algorithm in the first running environment, so as to explore possible solutions for improvement.
  • FIG. 5 is a schematic flowchart of another cluster resource scheduling method provided by the embodiment of the present application. This method is also applied to resource scheduling nodes. As shown in Figure 5, the method includes the following steps:
  • 501 In the first running environment, obtain a resource scheduling request for a GPU in a graphics processor GPU cluster; the resource scheduling request includes a request parameter, and the request parameter includes a task of a deep learning task corresponding to the resource scheduling request Types of;
  • 506 Adjust the task scheduling policy and the preset resource allocation policy, and deploy the adjusted task scheduling policy and the preset resource allocation policy in the second running environment.
  • Fig. 6 is a block diagram of functional units of a cluster resource scheduling device provided by the embodiment of the present application.
  • the cluster resource scheduling apparatus 600 includes: a transceiver unit 601 and a processing unit 602, wherein:
  • the transceiver unit 601 is configured to obtain a resource scheduling request for GPUs in the GPU cluster in the first operating environment; the resource scheduling request includes request parameters;
  • the processing unit 602 is configured to execute a task scheduling strategy according to the request parameter, add the deep learning task corresponding to the resource scheduling request to the task queue, and execute a preset resource allocation strategy determined from the graphics processor GPU cluster. At least one target GPU;
  • the processing unit 602 is further configured to dispatch the deep learning task to the at least one target GPU for processing;
  • the processing unit 602 is further configured to adjust the task scheduling policy and the preset resource allocation policy, and deploy the adjusted task scheduling policy and the preset resource allocation policy in the second running environment.
  • the request parameter includes the task type of the deep learning task, and in terms of executing the task scheduling policy according to the request parameter and adding the deep learning task corresponding to the resource scheduling request to the task queue, the The processing unit 602 is specifically used for:
  • the request parameters also include the average completion time and average waiting time of historical deep learning tasks
  • at least one target GPU is determined from the graphics processor GPU cluster by executing the preset resource allocation strategy , the processing unit 602 is specifically used for:
  • processing unit 602 is specifically further configured to:
  • Nodes are classified according to the switches connected to the nodes in the GPU cluster to obtain at least one network topology.
  • processing unit 602 is specifically further configured to:
  • the second operating environment also includes the graphics processor GPU cluster, and the cluster manager SLURM is used to manage the GPU resources in the graphics processor GPU cluster.
  • the task scheduling policy and the preset resource allocation policy are deployed in the second operating environment, and the processing unit 602 is specifically used for:
  • the task scheduling strategy includes a combination of one or more of a preemptive scheduling strategy, a non-preemptive scheduling strategy, and a learning scheduling strategy.
  • the processing unit 602 is specifically configured to:
  • the resource scheduling request through the preset interface sacct API provided by the cluster manager SLURM; the resource scheduling request is the historical deep learning processed on the graphics processor GPU cluster in the second operating environment
  • the task record for the task is the resource scheduling request through the preset interface sacct API provided by the cluster manager SLURM; the resource scheduling request is the historical deep learning processed on the graphics processor GPU cluster in the second operating environment
  • the task record for the task is the resource scheduling request through the preset interface sacct API provided by the cluster manager SLURM; the resource scheduling request is the historical deep learning processed on the graphics processor GPU cluster in the second operating environment.
  • FIG. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
  • an electronic device 700 includes a transceiver 701 , a processor 702 and a memory 703 . They are connected through a bus 704 .
  • the memory 703 is used to store computer programs and data, and can transmit the data stored in the storage 503 to the processor 702 .
  • the processor 702 is used to read the computer program in the memory 703 to perform the following operations:
  • the resource scheduling request includes request parameters
  • Adjusting the task scheduling policy and the preset resource allocation policy and deploying the adjusted task scheduling policy and the preset resource allocation policy in the second operating environment.
  • the request parameter includes the task type of the deep learning task
  • the processing The device 702 is specifically configured to perform the following operations:
  • the request parameters also include the average completion time and average waiting time of historical deep learning tasks
  • at least one target GPU is determined from the graphics processor GPU cluster by executing the preset resource allocation strategy
  • the processor 702 is specifically configured to perform the following operations:
  • processor 702 is specifically further configured to perform the following operations:
  • Nodes are classified according to the switches connected to the nodes in the GPU cluster to obtain at least one network topology.
  • processor 702 is specifically further configured to perform the following operations:
  • the second operating environment also includes the graphics processor GPU cluster, and the cluster manager SLURM is used to manage the GPU resources in the graphics processor GPU cluster.
  • the task scheduling policy and the preset resource allocation policy are deployed in the second operating environment, and the processor 702 is specifically configured to perform the following operations:
  • the task scheduling strategy includes a combination of one or more of a preemptive scheduling strategy, a non-preemptive scheduling strategy, and a learning scheduling strategy.
  • the processor 702 is specifically configured to perform the following operations:
  • the resource scheduling request through the preset interface sacct API provided by the cluster manager SLURM; the resource scheduling request is the historical deep learning processed on the graphics processor GPU cluster in the second operating environment
  • the task record for the task is the resource scheduling request through the preset interface sacct API provided by the cluster manager SLURM; the resource scheduling request is the historical deep learning processed on the graphics processor GPU cluster in the second operating environment
  • the task record for the task is the resource scheduling request through the preset interface sacct API provided by the cluster manager SLURM; the resource scheduling request is the historical deep learning processed on the graphics processor GPU cluster in the second operating environment.
  • the above-mentioned transceiver 701 may be the transceiver unit 601 of the cluster resource scheduling device 600 of the embodiment shown in FIG. 6
  • the above-mentioned processor 702 may be the processing unit 602 of the cluster resource scheduling device 600 of the embodiment shown in FIG. 6 .
  • the above-mentioned electronic devices may be independent physical servers, or server clusters or distributed systems, and may also provide cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, intermediate Cloud servers for basic cloud computing services such as software services, domain name services, security services, and big data and artificial intelligence platforms.
  • the electronic device includes but is not limited to a transceiver 701 , a processor 702 memory 703 and a bus 704 .
  • a transceiver 701 a processor 702 memory 703 and a bus 704 .
  • the schematic diagram is only an example of the electronic device, and does not constitute a limitation to the electronic device, and may include more or less components than those shown in the figure, or combine certain components, or different components.
  • the processor 702 of the electronic device executes the computer program to implement the steps in the above cluster resource scheduling method
  • the above embodiments of the cluster resource scheduling method are all applicable to the electronic device, and all of them can achieve the same or similar beneficial effect.
  • the embodiment of the present application also provides a computer-readable storage medium (Memory), where the computer-readable storage medium is a memory device in an electronic device, and is used to store programs and data.
  • the computer-readable storage medium here may include a built-in storage medium in the terminal, and of course may also include an extended storage medium supported by the terminal.
  • the computer-readable storage medium provides storage space, and the storage space stores the operating system of the terminal.
  • one or more instructions suitable for being loaded and executed by the processor 702 are also stored in the storage space, and these instructions may be one or more computer programs (including program codes).
  • the computer-readable storage medium here can be a high-speed RAM memory, or a non-volatile memory (non-volatile memory), such as at least one disk memory;
  • the processor 702 can load and execute one or more instructions stored in the computer storage medium, so as to implement the corresponding steps of the cluster resource scheduling method described above.
  • the computer program on the computer-readable storage medium includes computer program code
  • the computer program code may be in the form of source code, object code, executable file or some intermediate form.
  • the computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory) ), Random Access Memory (RAM, Random Access Memory), electrical carrier signal, telecommunication signal, and software distribution medium, etc.
  • a computer readable storage medium may be a volatile storage medium or a nonvolatile storage medium.
  • the embodiment of the present application also provides a computer program product, including computer-readable codes, or a non-volatile computer-readable storage medium carrying computer-readable codes, when the computer-readable codes are stored in a processor of an electronic device During operation, the processor in the electronic device is used to implement the above method.
  • the disclosed device can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components can be combined or can be Integrate into another system, or some features may be ignored, or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented not only in the form of hardware, but also in the form of software program modules.
  • the integrated units may be stored in a computer-readable memory if implemented in the form of a software program module and sold or used as an independent product.
  • the technical solution of the present application is essentially or part of the contribution to the prior art, or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a memory.
  • Several instructions are included to make a computer device (which may be a personal computer, server or network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned memory includes: various media capable of storing program codes such as U disk, read-only memory, random access memory, mobile hard disk, magnetic disk or optical disk.

Abstract

Disclosed in embodiments of the present application are a cluster resource scheduling method and apparatus, an electronic device, and a storage medium. The method comprises: in a first operating environment, acquiring a resource scheduling request for Graphics Processing Units (GPUs) in a GPU cluster; executing a task scheduling strategy according to a request parameter to add a deep learning task to a task queue, and executing a preset resource allocation strategy to determine at least one target GPU from the GPU cluster; scheduling the deep learning task to the at least one target GPU for processing; and adjusting the task scheduling strategy and the preset resource allocation strategy, and deploying the adjusted task scheduling strategy and preset resource allocation strategy in a second operating environment. The embodiments of the present application facilitate reducing development costs of a resource scheduling algorithm.

Description

集群资源调度方法及装置、电子设备和存储介质Cluster resource scheduling method and device, electronic device and storage medium
本申请要求2021年06月15日提交、申请号为202110664041.0,发明名称为“集群资源调度方法及装置、电子设备和存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims the priority of the Chinese patent application filed on June 15, 2021, with the application number 202110664041.0, and the title of the invention is "cluster resource scheduling method and device, electronic equipment and storage medium", the entire content of which is incorporated by reference in this application middle.
技术领域technical field
本申请涉及分布式系统技术领域,具体涉及一种集群资源调度方法及装置、电子设备和存储介质。The present application relates to the technical field of distributed systems, in particular to a cluster resource scheduling method and device, electronic equipment and storage media.
背景技术Background technique
随着人工智能的发展,深度学习成为研究人员的关注重点,在目标识别、目标检测任务上有着广泛的应用。深度学习算法的实现离不开有效的训练,为了满足其训练对算力的要求,大规模的GPU(graphics processing unit,图形处理器)集群成为了深度学习算法研发工作的支撑。谈及集群就无可避免会涉及到资源管理和任务调度,目前集群中的任务调度多依赖于任务调度算法,而任务调度算法在开发完成后,需要在集群中进行测试,以验证其有效性和可靠性,但就深度学习训练周期长、计算密度大的特点而言,目前的任务调度算法的开发部署流程较为费时费力,这就使得开发成本居高不下。With the development of artificial intelligence, deep learning has become the focus of researchers, and it has a wide range of applications in target recognition and target detection tasks. The implementation of deep learning algorithms is inseparable from effective training. In order to meet the computing power requirements of its training, large-scale GPU (graphics processing unit, graphics processing unit) clusters have become the support for the research and development of deep learning algorithms. When it comes to clusters, it will inevitably involve resource management and task scheduling. At present, task scheduling in clusters mostly depends on task scheduling algorithms. After the task scheduling algorithm is developed, it needs to be tested in the cluster to verify its effectiveness. and reliability, but in terms of the characteristics of long deep learning training cycle and high computing density, the development and deployment process of the current task scheduling algorithm is time-consuming and laborious, which makes the development cost high.
发明内容Contents of the invention
本申请实施例提供了一种集群资源调度方法及装置、电子设备和存储介质。通过在第一运行环境中进行任务调度策略和资源分配策略的开发,有利于降低资源调度算法的开发成本。Embodiments of the present application provide a cluster resource scheduling method and device, electronic equipment, and a storage medium. By developing the task scheduling strategy and the resource allocation strategy in the first running environment, it is beneficial to reduce the development cost of the resource scheduling algorithm.
第一方面,本申请实施例提供一种集群资源调度方法,该方法包括:In the first aspect, the embodiment of the present application provides a cluster resource scheduling method, the method including:
在第一运行环境中,获取对图形处理器GPU集群中GPU的资源调度请求;所述资源调度请求中包括请求参数;In the first operating environment, obtain a resource scheduling request to the GPU in the graphics processor GPU cluster; the resource scheduling request includes request parameters;
根据所述请求参数执行任务调度策略将所述资源调度请求对应的深度学习任务添加到任务队列中,以及执行预设资源分配策略从所述图形处理器GPU集群中确定出至少一个目标GPU;Executing a task scheduling strategy according to the request parameters, adding the deep learning task corresponding to the resource scheduling request to a task queue, and executing a preset resource allocation strategy to determine at least one target GPU from the graphics processor GPU cluster;
将所述深度学习任务调度至所述至少一个目标GPU上进行处理;Scheduling the deep learning task to the at least one target GPU for processing;
调整所述任务调度策略和所述预设资源分配策略,将调整后的所述任务调度策略和所述预设资源分配策略部署在第二运行环境中。Adjusting the task scheduling policy and the preset resource allocation policy, and deploying the adjusted task scheduling policy and the preset resource allocation policy in the second operating environment.
结合第一方面,在一种可能的实施方式中,所述请求参数包括所述深度学习任务的任务类型,所述根据所述请求参数执行任务调度策略将所述资源调度请求对应的深度学习任务添加到任务队列中,包括:With reference to the first aspect, in a possible implementation manner, the request parameter includes the task type of the deep learning task, and the task scheduling policy is executed according to the request parameter to schedule the resource to request the corresponding deep learning task Added to the task queue, including:
根据所述深度学习任务的任务类型从所述图形处理器GPU集群的至少一个任务分区中确定出所述深度学习任务待请求的目标任务分区;Determine the target task partition to be requested by the deep learning task from at least one task partition of the graphics processor GPU cluster according to the task type of the deep learning task;
执行所述目标任务分区对应的所述任务调度策略将所述深度学习任务添加到所述目标任务分区的任务队列中。Execute the task scheduling policy corresponding to the target task partition to add the deep learning task to the task queue of the target task partition.
结合第一方面,在一种可能的实施方式中,所述请求参数还包括历史深度学习任务的平均完成时长和平均等待时长,所述执行预设资源分配策略从所述图形处理器GPU集群中确定出至少一个目标GPU,包括:With reference to the first aspect, in a possible implementation manner, the request parameters also include the average completion time and average waiting time of historical deep learning tasks, and the implementation of the preset resource allocation strategy is obtained from the GPU cluster Identify at least one target GPU, including:
根据所述平均完成时长和所述平均等待时长计算出所述深度学习任务待请求的GPU资源量;Calculate the amount of GPU resources to be requested by the deep learning task according to the average completion time and the average waiting time;
根据所述待请求的GPU资源量执行第一预设资源分配策略或第二预设资源分配策略,以从所述目标任务分区中确定出所述至少一个目标GPU;所述第一预设资源分配策略用于查找到所述目标任务分区中的空闲GPU资源,则将所述空闲GPU资源确定为目标GPU,所述第二预设资源分配策略用于查找到所述目标任务分区中满足计算条件的空闲GPU资源,则将所述满足计算条件的空闲GPU资源确定为目标GPU。Executing a first preset resource allocation strategy or a second preset resource allocation strategy according to the amount of GPU resources to be requested, so as to determine the at least one target GPU from the target task partition; the first preset resource The allocation strategy is used to find an idle GPU resource in the target task partition, then the idle GPU resource is determined as the target GPU, and the second preset resource allocation strategy is used to find the idle GPU resource in the target task partition that satisfies the calculation conditional idle GPU resource, then determine the idle GPU resource satisfying the calculation condition as the target GPU.
结合第一方面,在一种可能的实施方式中,在根据所述深度学习任务的任务类型从所述图形处理器GPU集群的至少一个任务分区中确定出所述深度学习任务待请求的目标任务分区之前,所述方法还包括:With reference to the first aspect, in a possible implementation manner, the target task to be requested by the deep learning task is determined from at least one task partition of the GPU cluster according to the task type of the deep learning task Before partitioning, the method also includes:
按照所述图形处理器GPU集群中节点的任务类型对节点进行分类,得到所述至少一个任务分区;Classify the nodes according to the task types of the nodes in the GPU cluster to obtain the at least one task partition;
按照所述图形处理器GPU集群中节点所连接的交换机对节点进行分类,得到至少一个网络拓扑。Nodes are classified according to the switches connected to the nodes in the GPU cluster to obtain at least one network topology.
结合第一方面,在一种可能的实施方式中,在根据所述待请求的GPU资源量执行第一预设资源分配策略或第二预设资源分配策略,以从所述目标任务分 区中确定出所述至少一个目标GPU之后,所述方法还包括:With reference to the first aspect, in a possible implementation manner, a first preset resource allocation strategy or a second preset resource allocation strategy is executed according to the amount of GPU resources to be requested, so as to determine from the target task partition After obtaining the at least one target GPU, the method also includes:
确定所述至少一个目标GPU所属的节点是否在所述至少一个网络拓扑的不同网络拓扑中;determining whether the node to which the at least one target GPU belongs is in a different network topology of the at least one network topology;
若是,则为所述深度学习任务增加额外的通信开销。If yes, add additional communication overhead for the deep learning task.
结合第一方面,在一种可能的实施方式中,所述第二运行环境中同样包括所述图形处理器GPU集群,并采用集群管理器SLURM对所述图形处理器GPU集群中的GPU资源进行管理,所述将调整后的所述任务调度策略和所述预设资源分配策略部署在第二运行环境中,包括:With reference to the first aspect, in a possible implementation manner, the second operating environment also includes the graphics processing unit GPU cluster, and a cluster manager SLURM is used to monitor the GPU resources in the graphics processing unit GPU cluster The management, deploying the adjusted task scheduling policy and the preset resource allocation policy in the second operating environment includes:
将调整后的所述任务调度策略和所述预设资源分配策略添加到所述集群管理器SLURM的源码模块中,以完成调整后的所述任务调度策略和所述预设资源分配策略在所述第二运行环境中的部署;所述任务调度策略包括抢占式调度策略、非抢占式调度策略和学习型调度策略中的一种或多种的组合。Adding the adjusted task scheduling strategy and the preset resource allocation strategy to the source code module of the cluster manager SLURM, so as to complete the adjusted task scheduling strategy and the preset resource allocation strategy in the The deployment in the second operating environment; the task scheduling strategy includes a combination of one or more of a preemptive scheduling strategy, a non-preemptive scheduling strategy, and a learning scheduling strategy.
结合第一方面,在一种可能的实施方式中,所述获取对图形处理器GPU集群中GPU的资源调度请求,包括:With reference to the first aspect, in a possible implementation manner, the obtaining a resource scheduling request for a GPU in a graphics processor GPU cluster includes:
通过所述集群管理器SLURM提供的预设接口sacct API获取所述资源调度请求;所述资源调度请求为在所述第二运行环境中,由所述图形处理器GPU集群处理过的历史深度学习任务的任务记录。Obtain the resource scheduling request through the preset interface sacct API provided by the cluster manager SLURM; the resource scheduling request is historical deep learning processed by the graphics processor GPU cluster in the second operating environment The task record for the task.
第二方面,本申请实施例提供一种集群资源调度装置,该装置包括:In the second aspect, the embodiment of the present application provides a cluster resource scheduling device, which includes:
收发单元,用于在第一运行环境中,获取对图形处理器GPU集群中GPU的资源调度请求;所述资源调度请求中包括请求参数;A transceiver unit, configured to acquire a resource scheduling request for a GPU in a GPU cluster in a first operating environment; the resource scheduling request includes request parameters;
处理单元,用于根据所述请求参数执行任务调度策略将所述资源调度请求对应的深度学习任务添加到任务队列中,以及执行预设资源分配策略从所述图形处理器GPU集群中确定出至少一个目标GPU;The processing unit is configured to execute a task scheduling strategy according to the request parameter, add the deep learning task corresponding to the resource scheduling request to the task queue, and execute a preset resource allocation strategy to determine at least a target GPU;
所述处理单元,还用于将所述深度学习任务调度至所述至少一个目标GPU上进行处理;The processing unit is further configured to dispatch the deep learning task to the at least one target GPU for processing;
所述处理单元,还用于调整所述任务调度策略和所述预设资源分配策略,将调整后的所述任务调度策略和所述预设资源分配策略部署在第二运行环境中。The processing unit is further configured to adjust the task scheduling policy and the preset resource allocation policy, and deploy the adjusted task scheduling policy and the preset resource allocation policy in the second running environment.
第三方面,本申请实施例提供一种电子设备,包括:处理器,所述处理器与存储器相连,所述存储器用于存储计算机程序,所述处理器用于执行所述存 储器中存储的计算机程序,以使得所述电子设备执行如第一方面所述的方法。In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, the processor is connected to a memory, the memory is used to store a computer program, and the processor is used to execute the computer program stored in the memory , so that the electronic device executes the method described in the first aspect.
第四方面,本申请实施例提供一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序使得计算机执行如第一方面所述的方法。In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and the computer program causes a computer to execute the method as described in the first aspect.
第五方面,本申请实施例提供一种计算机程序产品,包括计算机可读代码,或者承载有计算机可读代码的非易失性计算机可读存储介质,当所述计算机可读代码在电子设备的处理器中运行时,所述电子设备中的处理器执行用于实现如第一方面所述的方法。In the fifth aspect, the embodiment of the present application provides a computer program product, including computer readable codes, or a non-volatile computer readable storage medium carrying computer readable codes, when the computer readable codes are stored in the electronic device When running in the processor, the processor in the electronic device executes to implement the method as described in the first aspect.
可以看出,在本申请实施例中,可通过在第一运行环境中,获取对图形处理器GPU集群中GPU的资源调度请求,然后任务调度策略将资源调度请求对应的深度学习任务添加到任务队列中,以及执行预设资源分配策略从图形处理器GPU集群中确定出至少一个目标GPU,接着将深度学习任务调度至至少一个目标GPU上进行处理,调整任务调度策略和预设资源分配策略,将调整后的任务调度策略和预设资源分配策略部署在第二运行环境中。这样采用第一运行环境进行任务调度策略和资源分配策略的测试和调整,将经过测试和调整的任务调度策略和资源分配策略部署在第二运行环境进行资源调度,有利于降低直接在第二运行环境中进行任务调度策略和资源分配策略的测试和调整所带来的部署流程消耗,从而降低资源调度算法的开发成本。It can be seen that in the embodiment of the present application, the resource scheduling request for the GPU in the graphics processor GPU cluster can be obtained in the first operating environment, and then the task scheduling policy can add the deep learning task corresponding to the resource scheduling request to the task In the queue, and execute the preset resource allocation strategy to determine at least one target GPU from the graphics processor GPU cluster, then schedule the deep learning task to at least one target GPU for processing, adjust the task scheduling strategy and the preset resource allocation strategy, Deploy the adjusted task scheduling policy and preset resource allocation policy in the second operating environment. In this way, the first operating environment is used to test and adjust the task scheduling strategy and resource allocation strategy, and the tested and adjusted task scheduling strategy and resource allocation strategy are deployed in the second operating environment for resource scheduling, which is conducive to reducing the number of resources directly in the second operating environment. The deployment process consumption caused by the testing and adjustment of task scheduling strategy and resource allocation strategy in the environment can reduce the development cost of resource scheduling algorithm.
附图说明Description of drawings
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the technical solutions in the embodiments of the present application, the following will briefly introduce the drawings that need to be used in the description of the embodiments. Obviously, the drawings in the following description are some embodiments of the present application. For Those of ordinary skill in the art can also obtain other drawings based on these drawings without making creative efforts.
FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application;

FIG. 2 is a schematic architectural diagram of a cluster resource scheduling system provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of a visualization array provided by an embodiment of the present application;

FIG. 4 is a schematic flowchart of a cluster resource scheduling method provided by an embodiment of the present application;

FIG. 5 is a schematic flowchart of another cluster resource scheduling method provided by an embodiment of the present application;

FIG. 6 is a block diagram of the functional units of a cluster resource scheduling apparatus provided by an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, rather than all of them. Based on the embodiments in the present application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.

The terms "first", "second", "third", "fourth" and the like in the specification, claims and drawings of the present application are used to distinguish different objects, rather than to describe a specific order. In addition, the terms "include" and "have" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, system, product or device that includes a series of steps or units is not limited to the listed steps or units, but optionally further includes steps or units that are not listed, or optionally further includes other steps or units inherent to the process, method, product or device.

Reference herein to an "embodiment" means that a particular feature, result or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor are they independent or alternative embodiments mutually exclusive with other embodiments. A person skilled in the art understands, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
Referring to FIG. 1, FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application. As shown in FIG. 1, the application environment includes a user terminal, an algorithm node, a resource scheduling node, and a graphics processing unit (GPU) cluster. The user terminal includes, but is not limited to, devices such as smartphones, tablet computers, and desktop computers. The GPU cluster is a computer cluster that includes multiple computing nodes, each of which is equipped with at least one GPU. The user terminal is configured to submit a deep learning task processing request, such as the training of a neural network model, to the algorithm node. Upon receiving the deep learning task processing request, the algorithm node submits a resource scheduling request to the resource scheduling node. Upon receiving the resource scheduling request, the resource scheduling node executes resource scheduling algorithms, such as a task scheduling strategy and a resource allocation strategy, to search for GPU resources in the GPU cluster, returns the available GPU resources found to the algorithm node, and schedules the deep learning task onto the available GPU resources found for execution or processing.
In some scenarios, the algorithm node may be an apparatus storing artificial intelligence (AI) algorithms; the apparatus may be a server of the user terminal, and the algorithm node and the resource scheduling node may be the same apparatus or different apparatuses. In other scenarios, a simulator runs on the resource scheduling node. The simulator can maintain the state of the nodes in the GPU cluster, such as the GPU usage, central processing unit (CPU) usage and memory usage of each node, and the list of tasks running on it. The simulator can support the development of scheduling algorithms; for example, testing and adjusting a scheduling algorithm in the simulator can reduce the deployment overhead that would be incurred by testing and adjusting the scheduling algorithm directly in the actual cluster, thereby reducing the development cost of resource scheduling algorithms.
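By way of a non-limiting illustration, the per-node state that such a simulator maintains could be modeled as in the following Python sketch; the class and field names (NodeState, free_gpus, and so on) are assumptions made for the example and are not part of the embodiment.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of the per-node state a cluster simulator might keep:
# GPU usage, CPU usage, memory usage, and the list of running tasks.
@dataclass
class NodeState:
    node_id: int
    total_gpus: int
    used_gpus: int = 0
    cpu_usage: float = 0.0        # fraction of CPU in use, 0.0 to 1.0
    mem_usage: float = 0.0        # fraction of memory in use, 0.0 to 1.0
    running_tasks: List[str] = field(default_factory=list)

    @property
    def free_gpus(self) -> int:
        return self.total_gpus - self.used_gpus
```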
Based on the application environment shown in FIG. 1, referring to FIG. 2, FIG. 2 is a schematic architectural diagram of a cluster resource scheduling system provided by an embodiment of the present application. As shown in FIG. 2, the architecture mainly includes an actual SLURM cluster and a cluster simulator. The cluster simulator may be a simulator running on the resource scheduling node and can maintain the usage of each node in the actual SLURM cluster; that is, like the actual SLURM cluster, the cluster simulator also adopts the three-level partitions-nodes-GPUs management model. Deep learning tasks executed on the actual SLURM cluster are submitted to the Real Scheduler (the scheduler in the actual SLURM cluster), which makes resource scheduling requests to the actual SLURM cluster; the actual SLURM cluster allocates resources on available nodes and returns the result to the Real Scheduler, which then performs task scheduling. The cluster simulator is driven by the configuration file slurm.conf of the cluster manager SLURM to perform cluster simulation; that is, the cluster simulator maintains the same GPU cluster as the actual SLURM cluster. Scheduling on the cluster simulator is driven by the task records (traces) of historical deep learning tasks executed on the actual SLURM cluster: the Sim Scheduler (the scheduler in the cluster simulator) makes resource scheduling requests to the cluster simulator according to the task traces. Meanwhile, the Sim Scheduler executes the task scheduling strategy to queue the deep learning tasks corresponding to the task traces, while the cluster simulator executes the resource allocation strategy to determine available GPU resources from the GPU cluster for resource allocation and returns them to the Sim Scheduler, which then performs task scheduling.
Exemplarily, the task scheduling strategy in the above system may be First Come First Served (FCFS), Multi-Level Feedback Queue (MLFQ), Shortest Job First (SJF), Reinforcement Learning (RL), and so on. The resource allocation strategy may be the first-fit algorithm, the best-fit algorithm, the free-gpu strategy of the computing platform provided by Google, and so on. The Sim Scheduler also visualizes the simulation results of the various scheduling algorithms on the cluster simulator. Specifically, the GPU usage in the GPU cluster may be dynamically displayed in the form of a histogram, or the display may be combined with the array shown in FIG. 3. As shown in FIG. 3, each large rectangle represents a node in the GPU cluster, the number in a large rectangle represents the node identifier, such as 43, 55 or 143, and the small rectangles inside a large rectangle represent the GPU usage of that node. Of course, the visual display on the cluster simulator may also be presented in other forms; the histogram and the array are merely examples and do not impose any limitation on the embodiments of the present application.
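As a hedged illustration of the histogram display described above, per-node GPU usage could be plotted as follows, building on the hypothetical NodeState sketch from earlier; the use of matplotlib here is an assumption for the example, not the embodiment's visualizer.

```python
import matplotlib.pyplot as plt

# Hedged sketch: one bar per node, bar height = GPUs currently in use.
def plot_gpu_usage(nodes):
    ids = [str(n.node_id) for n in nodes]
    used = [n.used_gpus for n in nodes]
    plt.bar(ids, used)
    plt.xlabel("node id")
    plt.ylabel("GPUs in use")
    plt.title("GPU usage across the cluster")
    plt.show()
```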
Referring to FIG. 4, FIG. 4 is a schematic flowchart of a cluster resource scheduling method provided by an embodiment of the present application. The method is applied to a resource scheduling node. As shown in FIG. 4, the method includes the following steps:

401: In a first running environment, acquire a resource scheduling request for GPUs in a graphics processing unit (GPU) cluster, where the resource scheduling request includes request parameters.
In the embodiments of the present application, the first running environment refers to the cluster simulator; that is, in this solution, the testing and simulation of the resource scheduling algorithm are performed on the cluster simulator. The resource scheduling request is a task record of a historical deep learning task processed on the GPU cluster in a second running environment, where the second running environment refers to the actual SLURM cluster. Based on the cluster resource scheduling system architecture shown in FIG. 2, the cluster simulator and the actual SLURM cluster maintain the same GPU cluster (or nodes); therefore, resource scheduling on the cluster simulator can be driven by the task traces of historical deep learning tasks processed on the GPU cluster in the actual SLURM cluster. A task trace records the relevant parameters of a historical deep learning task, such as the average completion time, the average waiting time, the GPU utilization, the task volume, and the task type. Using the task traces of historical deep learning tasks as training data to drive the cluster simulator makes the testing and simulation of scheduling algorithms closer to the actual situation.
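A single task-trace record of the kind described above might, as a non-limiting sketch, be represented as follows; every field name here is hypothetical and chosen only to mirror the parameters listed in this paragraph.

```python
from dataclasses import dataclass

# Hypothetical layout of one task-trace record; field names are illustrative.
@dataclass
class TraceRecord:
    job_id: str
    task_type: str           # e.g. "train" or "predict"
    avg_completion_s: float  # average completion time, in seconds
    avg_wait_s: float        # average waiting time, in seconds
    gpu_utilization: float   # 0.0 to 1.0
    num_gpus: int            # GPUs the historical task occupied
```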
Exemplarily, the resource scheduling request may be acquired through the preset sacct API provided by the cluster manager SLURM; that is, the above task traces may be obtained through this preset sacct API.
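For example, a minimal sketch of retrieving historical job records through SLURM's sacct command line, which typically backs such an interface, might read as follows; the queried fields are standard sacct format fields, but the embodiment's preset sacct API may expose different ones.

```python
import subprocess

# Hedged sketch: query SLURM's accounting database for job records.
def fetch_task_trace():
    out = subprocess.run(
        ["sacct", "--parsable2", "--noheader",
         "--format=JobID,JobName,Elapsed,State"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Each line is pipe-separated: JobID|JobName|Elapsed|State
    return [line.split("|") for line in out.splitlines() if line]
```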
402: Execute a task scheduling strategy according to the request parameters to add the deep learning task corresponding to the resource scheduling request to a task queue, and execute a preset resource allocation strategy to determine at least one target GPU from the GPU cluster.

In the embodiments of the present application, the request parameters include the task type of the deep learning task; for example, the task type may be deep learning model training or online prediction.
Exemplarily, executing the task scheduling strategy according to the request parameters to add the deep learning task corresponding to the resource scheduling request to the task queue includes:

determining, according to the task type of the deep learning task, a target task partition to be requested by the deep learning task from at least one task partition of the GPU cluster;

executing the task scheduling strategy corresponding to the target task partition to add the deep learning task to the task queue of the target task partition.
Specifically, when the resource scheduling request is acquired, an operation of classifying the nodes in the GPU cluster is performed. On the one hand, the nodes are classified according to the task types of the nodes in the GPU cluster to obtain at least one task partition; for example, the nodes that perform model training are taken as one task partition, the nodes that perform online prediction are taken as another task partition, and so on. Each task partition has its own independent resource pool and task queue, and each task partition is preset with a different task scheduling strategy. On the other hand, the nodes are classified according to the switches to which the nodes in the GPU cluster are connected to obtain at least one network topology; for example, if nodes 43, 55, 46 and 52 are connected to one switch, these four nodes are taken as one network topology; if nodes 94, 97, 100 and 101 are connected to another switch, those four nodes are taken as another network topology; and so on.

Exemplarily, the operation of classifying the nodes according to the task types of the nodes in the GPU cluster may be completed by invoking the configuration file slurm.conf of the cluster manager SLURM or the preset sinfo API. The operation of classifying the nodes according to the switches to which the nodes in the GPU cluster are connected may be completed by invoking the preset iblinkinfo API of InfiniBand.
Among the at least one task partition obtained by classification, the task partition whose task type is the same as the task type of the deep learning task corresponding to the resource scheduling request is taken as the target task partition, and the task scheduling strategy preset for the target task partition is executed to add the deep learning task to the task queue of that target task partition, where it waits.
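A minimal sketch of this partition selection and enqueueing step might look as follows, assuming each partition object carries a task type key, its own queue, and its own strategy; all names are hypothetical.

```python
from collections import deque

# Hypothetical partition table: each partition has its own independent
# queue and its own preset scheduling strategy, as described above.
partitions = {
    "train":   {"queue": deque(), "policy": "MLFQ"},
    "predict": {"queue": deque(), "policy": "FCFS"},
}

def enqueue_task(task):
    # Pick the partition whose task type matches the request's task type.
    target = partitions[task["task_type"]]
    # Executing the partition's strategy reduces, for FCFS, to appending.
    target["queue"].append(task)
    return target
```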
Exemplarily, the task scheduling strategy includes one or a combination of more of a preemptive scheduling strategy, a non-preemptive scheduling strategy, and a learning-based scheduling strategy. For example, a deep learning task may be executed using only one task scheduling strategy, or different parts of it may be executed using different task scheduling strategies. The preemptive scheduling strategy may be MLFQ or the like; this type of strategy allows running tasks to be suspended and resumed, which can be implemented through the access interface provided by the cluster simulator. In addition, developers can configure different parameters for this type of strategy, such as the running time of the algorithm and the hierarchical relationship between queue levels, so as to improve the performance of the algorithm. The non-preemptive scheduling strategy may be FCFS, SJF, or the like. The learning-based scheduling strategy may be a strategy based on machine learning, reinforcement learning, or inverse reinforcement learning; the development of this type of strategy often requires a large number of task traces as training data and many decision iterations to reach a better decision result, which is often difficult to achieve in a real environment. Simulating it on the cluster simulator helps reduce the development difficulty of this type of task scheduling strategy. In addition, developers can test and simulate various task scheduling strategies on the cluster simulator, which offers greater flexibility.
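As a hedged example of a non-preemptive strategy, SJF can be sketched as a priority queue ordered by estimated run time, for instance derived from the average completion time in the task trace; this is an illustration, not the embodiment's implementation, and "estimated_s" is a hypothetical field.

```python
import heapq

# Hedged sketch of non-preemptive Shortest Job First over queued tasks.
class SJFQueue:
    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal estimates stay FIFO

    def push(self, task):
        # "estimated_s": assumed per-task run-time estimate, in seconds.
        heapq.heappush(self._heap, (task["estimated_s"], self._seq, task))
        self._seq += 1

    def pop(self):
        # Returns the waiting task with the shortest estimated run time.
        return heapq.heappop(self._heap)[2]
```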
Exemplarily, executing the preset resource allocation strategy to determine at least one target GPU from the GPU cluster includes:

calculating, according to the above average completion time and average waiting time, the amount of GPU resources to be requested by the deep learning task;

executing a first preset resource allocation strategy or a second preset resource allocation strategy according to the amount of GPU resources to be requested, so as to determine the at least one target GPU from the target task partition, where the first preset resource allocation strategy is used to find idle GPU resources in the target task partition, in which case the idle GPU resources found are determined as target GPUs, and the second preset resource allocation strategy is used to find idle GPU resources satisfying a computing condition in the target task partition, in which case the idle GPU resources satisfying the computing condition are determined as target GPUs.
Specifically, the amount of GPU resources to be requested refers to how many GPUs the deep learning task needs for execution, for example, 4 GPUs with a computing power of 3.7, or 8 GPUs with a computing power of 2.5; that is, by learning from the parameters in the task trace, the GPU resources required by a deep learning task can be calculated. After the amount of GPU resources to be requested is determined, the first preset resource allocation strategy or the second preset resource allocation strategy may be executed depending on that amount. The first preset resource allocation strategy may be the first-fit algorithm, and the second preset resource allocation strategy may be the best-fit algorithm; for example, the first-fit algorithm may be used for deep learning tasks with lower GPU resource requirements, and the best-fit algorithm may be used for deep learning tasks with higher GPU resource requirements. The idle GPU resources satisfying the computing condition are those meeting the computing requirement of the best-fit algorithm, that is, the best GPU resources found.
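The two allocation strategies can be sketched over the hypothetical NodeState objects introduced earlier: first-fit returns the first node with enough idle GPUs, while best-fit returns the feasible node that would leave the fewest GPUs idle, which is one plausible reading of the computing condition.

```python
def first_fit(nodes, gpus_needed):
    # Return the first node that can host the request, or None.
    for node in nodes:
        if node.free_gpus >= gpus_needed:
            return node
    return None

def best_fit(nodes, gpus_needed):
    # Among feasible nodes, pick the one leaving the fewest GPUs idle;
    # this is an assumed interpretation of the "computing condition".
    feasible = [n for n in nodes if n.free_gpus >= gpus_needed]
    return min(feasible, key=lambda n: n.free_gpus - gpus_needed, default=None)
```

First-fit scans less and so decides faster; best-fit spends more effort per decision but tends to leave larger contiguous blocks of idle GPUs, which matches the text's suggestion of reserving it for more demanding tasks.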
Further, the cluster simulator also allows multiple nodes in the GPU cluster to provide GPU resource support for a single node. For example, if a node is executing a deep learning task but the performance of its GPUs is low while other nodes happen to have idle GPU resources, the unexecuted part of the deep learning task can be scheduled to the other nodes for execution. In other words, the cluster simulator supports operations such as splitting, migration, and reconstruction of deep learning tasks.

Further, the cluster simulator also supports dynamic resource migration and resource reallocation. For example, suppose two nodes each have 8 GPUs, 4 of which are already occupied on each node, while the current task requires 8 GPU resources for execution; the task then has to be scheduled onto both nodes, that is, the resources of both nodes become fragmented. In the cluster simulator, for this situation, once the 4 occupied GPU resources of one of the nodes are released, they can be used to execute another part of the current task. Through such resource migration or reallocation, the fragmentation of allocated resources can be reduced.
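A hedged sketch of this defragmentation idea follows: once a single node can again host the whole task, the task's scattered placement is consolidated onto it; the structures and field names are hypothetical.

```python
def try_defragment(task, nodes):
    # Hypothetical sketch: if some single node can now host the whole task,
    # consolidate the task's scattered parts onto it to reduce fragmentation.
    for node in nodes:
        if node.free_gpus >= task["num_gpus"]:
            task["placement"] = [node.node_id] * task["num_gpus"]
            return node
    return None
```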
403: Schedule the deep learning task to the at least one target GPU for processing.

In the embodiments of the present application, after the first preset resource allocation strategy or the second preset resource allocation strategy is executed according to the amount of GPU resources to be requested so as to determine the at least one target GPU from the target task partition, the method further includes:

determining whether the nodes to which the at least one target GPU belongs are in different network topologies of the at least one network topology;

if so, adding additional communication overhead for the deep learning task.
Specifically, considering that on the actual SLURM cluster the performance of a deep learning task is affected by the affinity of GPU resources, for example, the same deep learning task tends to execute on GPUs of the same node or the same network topology, the determined at least one target GPU may belong to different nodes or different network topologies. In the embodiments of the present application, in the cluster simulator, additional communication overhead is added for deep learning tasks that are not executed on the same node or within the same network topology, so that their simulated performance remains faithful to the actual behavior.
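As a non-limiting sketch, the penalty could be applied after allocation by checking whether the chosen nodes span more than one switch group and, if so, inflating the task's simulated run time; the overhead factor and all names are assumptions for the example.

```python
# topology_of maps a node id to its switch group, as produced by the
# switch-based classification described earlier; names are hypothetical.
CROSS_TOPOLOGY_OVERHEAD = 1.15  # assumed 15% penalty, for illustration only

def apply_comm_overhead(task, target_nodes, topology_of):
    groups = {topology_of[n] for n in target_nodes}
    if len(groups) > 1:
        # GPUs span different network topologies: add communication overhead.
        task["sim_runtime_s"] *= CROSS_TOPOLOGY_OVERHEAD
    return task
```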
For a deep learning task waiting in the task queue, after the at least one target GPU is determined, the task can be scheduled to the at least one target GPU for processing.

404: Adjust the task scheduling strategy and the preset resource allocation strategy, and deploy the adjusted task scheduling strategy and preset resource allocation strategy in a second running environment.

In the embodiments of the present application, the above steps 401-403 constitute the testing and simulation of the scheduling algorithm (including the task scheduling strategy and the resource allocation strategy) on the cluster simulator. For a task scheduling strategy and a resource allocation strategy whose effect or performance does not meet the requirements, the task scheduling strategy and the preset resource allocation strategy can be adjusted or modified in response to developer input (such as program code or parameters). The adjusted or modified task scheduling strategy and preset resource allocation strategy are added to the source code modules plugin/select and plugin/sched of the cluster manager SLURM, so as to complete the deployment of the adjusted or modified task scheduling strategy and preset resource allocation strategy in the second running environment.

It can be seen that, in the embodiments of the present application, a resource scheduling request for GPUs in the GPU cluster can be acquired in the first running environment (the cluster simulator); the task scheduling strategy then adds the deep learning task corresponding to the resource scheduling request to the task queue, and the preset resource allocation strategy is executed to determine at least one target GPU from the GPU cluster; the deep learning task is then scheduled to the at least one target GPU for processing; and the task scheduling strategy and the preset resource allocation strategy are adjusted and then deployed in the second running environment. In this way, the first running environment is used to test and adjust the task scheduling strategy and the resource allocation strategy, and the tested and adjusted strategies are deployed in the second running environment for resource scheduling. This helps reduce the deployment overhead that would be incurred by testing and adjusting the task scheduling strategy and the resource allocation strategy directly in the second running environment, thereby reducing the development cost of resource scheduling algorithms, lowering the risk of developing resource scheduling algorithms in the second running environment, and accelerating the development iteration of scheduling algorithms. For administrators of the GPU cluster, defects and bottlenecks of a resource scheduling algorithm can be discovered by testing the algorithm in the first running environment, so as to explore possible improvements.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of another cluster resource scheduling method provided by an embodiment of the present application. This method is also applied to a resource scheduling node. As shown in FIG. 5, the method includes the following steps:

501: In a first running environment, acquire a resource scheduling request for GPUs in a GPU cluster, where the resource scheduling request includes request parameters, and the request parameters include the task type of the deep learning task corresponding to the resource scheduling request;

502: Determine, according to the task type of the deep learning task, a target task partition to be requested by the deep learning task from at least one task partition of the GPU cluster;

503: Execute the task scheduling strategy corresponding to the target task partition to add the deep learning task to the task queue of the target task partition;

504: Execute a preset resource allocation strategy to determine at least one target GPU from the GPU cluster;

505: Schedule the deep learning task to the at least one target GPU for processing;

506: Adjust the task scheduling strategy and the preset resource allocation strategy, and deploy the adjusted task scheduling strategy and preset resource allocation strategy in a second running environment.

The specific implementations of the above steps 501-506 have already been described in the embodiment shown in FIG. 4 and can achieve the same or similar beneficial effects; to avoid repetition, details are not repeated here.
Referring to FIG. 6, FIG. 6 is a block diagram of the functional units of a cluster resource scheduling apparatus provided by an embodiment of the present application. The cluster resource scheduling apparatus 600 includes a transceiver unit 601 and a processing unit 602, where:

the transceiver unit 601 is configured to acquire, in a first running environment, a resource scheduling request for GPUs in a GPU cluster, where the resource scheduling request includes request parameters;

the processing unit 602 is configured to execute a task scheduling strategy according to the request parameters to add the deep learning task corresponding to the resource scheduling request to a task queue, and to execute a preset resource allocation strategy to determine at least one target GPU from the GPU cluster;

the processing unit 602 is further configured to schedule the deep learning task to the at least one target GPU for processing;

the processing unit 602 is further configured to adjust the task scheduling strategy and the preset resource allocation strategy, and to deploy the adjusted task scheduling strategy and preset resource allocation strategy in a second running environment.
In some possible implementations, the request parameters include the task type of the deep learning task, and in terms of executing the task scheduling strategy according to the request parameters to add the deep learning task corresponding to the resource scheduling request to the task queue, the processing unit 602 is specifically configured to:

determine, according to the task type of the deep learning task, a target task partition to be requested by the deep learning task from at least one task partition of the GPU cluster;

execute the task scheduling strategy corresponding to the target task partition to add the deep learning task to the task queue of the target task partition.

In some possible implementations, the request parameters further include the average completion time and average waiting time of historical deep learning tasks, and in terms of executing the preset resource allocation strategy to determine at least one target GPU from the GPU cluster, the processing unit 602 is specifically configured to:

calculate, according to the average completion time and the average waiting time, the amount of GPU resources to be requested by the deep learning task;

execute a first preset resource allocation strategy or a second preset resource allocation strategy according to the amount of GPU resources to be requested, so as to determine the at least one target GPU from the target task partition, where the first preset resource allocation strategy is used to find idle GPU resources in the target task partition, in which case the idle GPU resources found are determined as target GPUs, and the second preset resource allocation strategy is used to find idle GPU resources satisfying a computing condition in the target task partition, in which case the idle GPU resources satisfying the computing condition are determined as target GPUs.
In some possible implementations, the processing unit 602 is further specifically configured to:

classify the nodes according to the task types of the nodes in the GPU cluster to obtain the at least one task partition;

classify the nodes according to the switches to which the nodes in the GPU cluster are connected to obtain at least one network topology.

In some possible implementations, the processing unit 602 is further specifically configured to:

determine whether the nodes to which the at least one target GPU belongs are in different network topologies of the at least one network topology;

if so, add additional communication overhead for the deep learning task.
In some possible implementations, the second running environment also includes the GPU cluster, and the cluster manager SLURM is used to manage the GPU resources in the GPU cluster; in terms of deploying the adjusted task scheduling strategy and preset resource allocation strategy in the second running environment, the processing unit 602 is specifically configured to:

add the adjusted task scheduling strategy and preset resource allocation strategy to the source code modules of the cluster manager SLURM, so as to complete the deployment of the adjusted task scheduling strategy and preset resource allocation strategy in the second running environment, where the task scheduling strategy includes one or a combination of more of a preemptive scheduling strategy, a non-preemptive scheduling strategy, and a learning-based scheduling strategy.

In some possible implementations, in terms of acquiring the resource scheduling request for GPUs in the GPU cluster, the processing unit 602 is specifically configured to:

acquire the resource scheduling request through the preset sacct API provided by the cluster manager SLURM, where the resource scheduling request is a task record of a historical deep learning task processed on the GPU cluster in the second running environment.
Referring to FIG. 7, FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 7, the electronic device 700 includes a transceiver 701, a processor 702 and a memory 703, which are connected to one another through a bus 704. The memory 703 is configured to store computer programs and data, and can transmit the data it stores to the processor 702.

The processor 702 is configured to read the computer program in the memory 703 to perform the following operations:

acquiring, in a first running environment, a resource scheduling request for GPUs in a GPU cluster, where the resource scheduling request includes request parameters;

executing a task scheduling strategy according to the request parameters to add the deep learning task corresponding to the resource scheduling request to a task queue, and executing a preset resource allocation strategy to determine at least one target GPU from the GPU cluster;

scheduling the deep learning task to the at least one target GPU for processing;

adjusting the task scheduling strategy and the preset resource allocation strategy, and deploying the adjusted task scheduling strategy and preset resource allocation strategy in a second running environment.
In some possible implementations, the request parameters include the task type of the deep learning task, and in terms of executing the task scheduling strategy according to the request parameters to add the deep learning task corresponding to the resource scheduling request to the task queue, the processor 702 is specifically configured to perform the following operations:

determining, according to the task type of the deep learning task, a target task partition to be requested by the deep learning task from at least one task partition of the GPU cluster;

executing the task scheduling strategy corresponding to the target task partition to add the deep learning task to the task queue of the target task partition.

In some possible implementations, the request parameters further include the average completion time and average waiting time of historical deep learning tasks, and in terms of executing the preset resource allocation strategy to determine at least one target GPU from the GPU cluster, the processor 702 is specifically configured to perform the following operations:

calculating, according to the average completion time and the average waiting time, the amount of GPU resources to be requested by the deep learning task;

executing a first preset resource allocation strategy or a second preset resource allocation strategy according to the amount of GPU resources to be requested, so as to determine the at least one target GPU from the target task partition, where the first preset resource allocation strategy is used to find idle GPU resources in the target task partition, in which case the idle GPU resources found are determined as target GPUs, and the second preset resource allocation strategy is used to find idle GPU resources satisfying a computing condition in the target task partition, in which case the idle GPU resources satisfying the computing condition are determined as target GPUs.
In some possible implementations, the processor 702 is further specifically configured to perform the following operations:

classifying the nodes according to the task types of the nodes in the GPU cluster to obtain the at least one task partition;

classifying the nodes according to the switches to which the nodes in the GPU cluster are connected to obtain at least one network topology.

In some possible implementations, the processor 702 is further specifically configured to perform the following operations:

determining whether the nodes to which the at least one target GPU belongs are in different network topologies of the at least one network topology;

if so, adding additional communication overhead for the deep learning task.
In some possible implementations, the second running environment also includes the GPU cluster, and the cluster manager SLURM is used to manage the GPU resources in the GPU cluster; in terms of deploying the adjusted task scheduling strategy and preset resource allocation strategy in the second running environment, the processor 702 is specifically configured to perform the following operations:

adding the adjusted task scheduling strategy and preset resource allocation strategy to the source code modules of the cluster manager SLURM, so as to complete the deployment of the adjusted task scheduling strategy and preset resource allocation strategy in the second running environment, where the task scheduling strategy includes one or a combination of more of a preemptive scheduling strategy, a non-preemptive scheduling strategy, and a learning-based scheduling strategy.

In some possible implementations, in terms of acquiring the resource scheduling request for GPUs in the GPU cluster, the processor 702 is specifically configured to perform the following operations:

acquiring the resource scheduling request through the preset sacct API provided by the cluster manager SLURM, where the resource scheduling request is a task record of a historical deep learning task processed on the GPU cluster in the second running environment.
Specifically, the above transceiver 701 may be the transceiver unit 601 of the cluster resource scheduling apparatus 600 of the embodiment shown in FIG. 6, and the above processor 702 may be the processing unit 602 of the cluster resource scheduling apparatus 600 of the embodiment shown in FIG. 6.

Exemplarily, the above electronic device may be an independent physical server, a server cluster or a distributed system, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, and big data and artificial intelligence platforms. The electronic device includes, but is not limited to, the transceiver 701, the processor 702, the memory 703 and the bus 704. A person skilled in the art can understand that the schematic diagram is merely an example of the electronic device and does not constitute a limitation on the electronic device, which may include more or fewer components than shown, a combination of certain components, or different components.

It should be noted that, since the processor 702 of the electronic device implements the steps of the above cluster resource scheduling method when executing the computer program, the embodiments of the above cluster resource scheduling method are all applicable to the electronic device and can achieve the same or similar beneficial effects.
An embodiment of the present application further provides a computer-readable storage medium (memory), which is a memory device in an electronic device and is used to store programs and data. It can be understood that the computer-readable storage medium here may include a built-in storage medium in the terminal, and certainly may also include an extended storage medium supported by the terminal. The computer-readable storage medium provides storage space, which stores the operating system of the terminal. Moreover, one or more instructions suitable for being loaded and executed by the processor 702 are also stored in the storage space, and these instructions may be one or more computer programs (including program code). It should be noted that the computer-readable storage medium here may be a high-speed RAM memory, or a non-volatile memory, such as at least one magnetic disk memory; optionally, it may also be at least one computer storage medium located remotely from the aforementioned processor 702. In one embodiment, the processor 702 may load and execute the one or more instructions stored in the computer storage medium to implement the corresponding steps of the above cluster resource scheduling method.

Exemplarily, the computer program on the computer-readable storage medium includes computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable storage medium may include any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like.

It should be noted that, since the computer program on the computer-readable storage medium implements the steps of the above cluster resource scheduling method when executed by the processor, all embodiments of the above cluster resource scheduling method are applicable to the computer-readable storage medium and can achieve the same or similar beneficial effects. The computer-readable storage medium may be a volatile storage medium or a non-volatile storage medium.

An embodiment of the present application further provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code. When the computer-readable code runs in a processor of an electronic device, the processor in the electronic device is configured to implement the above method.
It should be noted that, for the foregoing method embodiments, for the sake of brief description, they are all expressed as a series of action combinations; however, a person skilled in the art should know that the present application is not limited by the described order of actions, because according to the present application, certain steps may be performed in other orders or simultaneously. Secondly, a person skilled in the art should also know that the embodiments described in the specification are all optional embodiments, and the actions and modules involved are not necessarily required by the present application.

In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units is merely a division of logical functions, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses or units, and may be in electrical or other forms.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software program module.

If the integrated unit is implemented in the form of a software program module and sold or used as an independent product, it may be stored in a computer-readable memory. Based on such an understanding, the technical solution of the present application, or the part thereof that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a read-only memory, a random access memory, a removable hard disk, a magnetic disk, or an optical disc.

A person of ordinary skill in the art can understand that all or some of the steps in the various methods of the above embodiments may be completed by a program instructing relevant hardware. The program may be stored in a computer-readable memory, and the memory may include a flash disk, a read-only memory, a random access memory, a magnetic disk, an optical disc, or the like.

The embodiments of the present application have been described in detail above, and specific examples have been used herein to explain the principles and implementations of the present application. The descriptions of the above embodiments are merely intended to help understand the method and core idea of the present application. Meanwhile, a person of ordinary skill in the art will make changes to the specific implementations and application scope based on the idea of the present application. In summary, the content of this specification should not be construed as a limitation on the present application.

Claims (11)

  1. A cluster resource scheduling method, characterized by comprising:
    acquiring, in a first running environment, a resource scheduling request for GPUs in a graphics processing unit (GPU) cluster, wherein the resource scheduling request includes request parameters;
    executing a task scheduling strategy according to the request parameters to add a deep learning task corresponding to the resource scheduling request to a task queue, and executing a preset resource allocation strategy to determine at least one target GPU from the GPU cluster;
    scheduling the deep learning task to the at least one target GPU for processing;
    adjusting the task scheduling strategy and the preset resource allocation strategy, and deploying the adjusted task scheduling strategy and preset resource allocation strategy in a second running environment.
  2. The method according to claim 1, wherein the request parameters include a task type of the deep learning task, and executing the task scheduling strategy according to the request parameters to add the deep learning task corresponding to the resource scheduling request to the task queue comprises:
    determining, according to the task type of the deep learning task, a target task partition to be requested by the deep learning task from at least one task partition of the GPU cluster;
    executing the task scheduling strategy corresponding to the target task partition to add the deep learning task to a task queue of the target task partition.
  3. The method according to claim 1 or 2, wherein the request parameters further include an average completion time and an average waiting time of historical deep learning tasks, and executing the preset resource allocation strategy to determine at least one target GPU from the GPU cluster comprises:
    calculating, according to the average completion time and the average waiting time, an amount of GPU resources to be requested by the deep learning task;
    executing a first preset resource allocation strategy or a second preset resource allocation strategy according to the amount of GPU resources to be requested, so as to determine the at least one target GPU from the target task partition, wherein the first preset resource allocation strategy is used to find idle GPU resources in the target task partition, in which case the idle GPU resources found are determined as target GPUs, and the second preset resource allocation strategy is used to find idle GPU resources satisfying a computing condition in the target task partition, in which case the idle GPU resources satisfying the computing condition are determined as target GPUs.
  4. The method according to claim 2 or 3, wherein before determining, according to the task type of the deep learning task, the target task partition to be requested by the deep learning task from the at least one task partition of the GPU cluster, the method further comprises:
    classifying nodes in the GPU cluster according to task types of the nodes to obtain the at least one task partition; and
    classifying the nodes in the GPU cluster according to switches to which the nodes are connected to obtain at least one network topology.
  5. The method according to claim 4, wherein after executing the first preset resource allocation strategy or the second preset resource allocation strategy according to the amount of GPU resources to be requested to determine the at least one target GPU from the target task partition, the method further comprises:
    determining whether nodes to which the at least one target GPU belongs are located in different network topologies of the at least one network topology; and
    if so, adding an additional communication overhead for the deep learning task.
  6. The method according to any one of claims 1-5, wherein the second running environment also comprises the GPU cluster and uses a cluster manager SLURM to manage GPU resources in the GPU cluster, and deploying the adjusted task scheduling strategy and the adjusted preset resource allocation strategy in the second running environment comprises:
    adding the adjusted task scheduling strategy and the adjusted preset resource allocation strategy to a source code module of the cluster manager SLURM, so as to complete deployment of the adjusted task scheduling strategy and the adjusted preset resource allocation strategy in the second running environment, wherein the task scheduling strategy comprises one of, or a combination of, a preemptive scheduling strategy, a non-preemptive scheduling strategy, and a learning-based scheduling strategy.
  7. The method according to claim 6, wherein obtaining the resource scheduling request for GPUs in the GPU cluster comprises:
    obtaining the resource scheduling request through a preset interface sacct API provided by the cluster manager SLURM, wherein the resource scheduling request is a task record of historical deep learning tasks processed on the GPU cluster in the second running environment.
  8. A cluster resource scheduling apparatus, characterized in that it comprises:
    a transceiver unit, configured to obtain, in a first running environment, a resource scheduling request for GPUs in a graphics processing unit (GPU) cluster, wherein the resource scheduling request comprises request parameters;
    a processing unit, configured to execute a task scheduling strategy according to the request parameters to add a deep learning task corresponding to the resource scheduling request to a task queue, and to execute a preset resource allocation strategy to determine at least one target GPU from the GPU cluster;
    the processing unit being further configured to schedule the deep learning task to the at least one target GPU for processing; and
    the processing unit being further configured to adjust the task scheduling strategy and the preset resource allocation strategy, and to deploy the adjusted task scheduling strategy and the adjusted preset resource allocation strategy in a second running environment.
  9. An electronic device, characterized in that it comprises: a processor connected to a memory, wherein the memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory, so that the electronic device performs the method according to any one of claims 1-7.
  10. A computer-readable storage medium, characterized in that it stores a computer program, wherein the computer program is executed by a processor to implement the method according to any one of claims 1-7.
  11. A computer program product, comprising computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code, wherein when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device performs the method according to any one of claims 1-7.
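To make the flow of claims 1-3 concrete, the following is a minimal Python sketch of the enqueue-then-allocate step: a deep learning task is queued for its task partition, the amount of GPU resources to request is derived from historical average completion and waiting times, and either the first allocation strategy (any idle GPUs) or the second (idle GPUs satisfying a computation condition) selects the target GPUs. The patent publishes no code, so every name here (`Task`, `GPU`, `requested_gpu_amount`, the free-memory computation condition, and the heuristic formula) is a hypothetical stand-in, not the claimed implementation.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    task_type: str            # e.g. "training" or "inference"
    avg_completion_s: float   # average completion time of historical tasks
    avg_wait_s: float         # average waiting time of historical tasks

@dataclass
class GPU:
    gpu_id: str
    node: str
    busy: bool = False
    free_mem_gb: float = 32.0

def requested_gpu_amount(task: Task, base: int = 1, cap: int = 8) -> int:
    # Hypothetical heuristic: the claim only says the amount is computed
    # from average completion and waiting time; no formula is given.
    ratio = task.avg_wait_s / max(task.avg_completion_s, 1e-6)
    return max(base, min(cap, round(base * (1.0 + ratio))))

def first_strategy(partition_gpus: list, amount: int) -> list:
    # First preset strategy: take any idle GPUs in the target partition.
    idle = [g for g in partition_gpus if not g.busy]
    return idle[:amount] if len(idle) >= amount else []

def second_strategy(partition_gpus: list, amount: int,
                    min_free_mem_gb: float = 16.0) -> list:
    # Second preset strategy: idle GPUs that also satisfy a computation
    # condition, modeled here as a minimum amount of free memory.
    idle = [g for g in partition_gpus
            if not g.busy and g.free_mem_gb >= min_free_mem_gb]
    return idle[:amount] if len(idle) >= amount else []

def schedule(task: Task, partitions: dict, queues: dict) -> list:
    # partitions maps task type -> list of GPUs in that task partition.
    queues.setdefault(task.task_type, deque()).append(task)  # enqueue (FIFO here)
    gpus = partitions[task.task_type]                        # target partition
    amount = requested_gpu_amount(task)
    targets = second_strategy(gpus, amount) or first_strategy(gpus, amount)
    for g in targets:
        g.busy = True                                        # dispatch
    return targets
```

Under this stand-in heuristic, for example, a task with an average completion time of 600 s and an average waiting time of 300 s would request round(1 × 1.5) = 2 GPUs; the real policy developed in the first running environment could substitute any other function of the same two inputs.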
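Claims 4 and 5 classify the cluster twice: by task type into task partitions, and by the switch each node connects to into network topologies, then charge an extra communication cost when a placement straddles switches. A sketch under the same caveats follows; the overhead magnitude is arbitrary, since the claims do not fix it.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    task_type: str   # workload class served by this node
    switch: str      # switch the node is connected to

def build_task_partitions(nodes: list) -> dict:
    # Classify nodes by task type to obtain the task partitions (claim 4).
    parts = defaultdict(list)
    for n in nodes:
        parts[n.task_type].append(n)
    return parts

def build_network_topologies(nodes: list) -> dict:
    # Classify nodes by connected switch, one network topology per switch.
    topos = defaultdict(set)
    for n in nodes:
        topos[n.switch].add(n.name)
    return topos

def crosses_topologies(target_nodes: list, topos: dict) -> bool:
    # True when the target GPUs' nodes fall under different switches.
    owning = {name: sw for sw, names in topos.items() for name in names}
    return len({owning[name] for name in target_nodes}) > 1

def estimated_cost(base_s: float, target_nodes: list, topos: dict,
                   overhead_s: float = 30.0) -> float:
    # Claim 5: add an additional communication overhead for the deep
    # learning task when its placement spans different topologies.
    return base_s + (overhead_s if crosses_topologies(target_nodes, topos) else 0.0)
```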
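Claim 7 feeds the first running environment with records of historical deep learning tasks already processed on the real (second) environment, obtained "through a preset interface sacct API provided by the cluster manager SLURM". The patent does not show that interface; one plausible reading, sketched below, shells out to SLURM's standard sacct accounting command and parses its pipe-delimited output into task records. The field list and start date are illustrative choices, not values from the patent.

```python
import csv
import io
import subprocess

SACCT_FIELDS = "JobID,JobName,Partition,AllocTRES,Submit,Start,End,State"

def fetch_task_records(since: str = "2021-01-01") -> list:
    # Query SLURM accounting for historical jobs; --parsable2 emits
    # pipe-delimited rows without a trailing delimiter, --noheader
    # suppresses the column header row.
    out = subprocess.run(
        ["sacct", "--allusers", "--noheader", "--parsable2",
         "--starttime", since, "--format", SACCT_FIELDS],
        check=True, capture_output=True, text=True,
    ).stdout
    names = SACCT_FIELDS.split(",")
    reader = csv.reader(io.StringIO(out), delimiter="|")
    return [dict(zip(names, row)) for row in reader]
```

Each returned dictionary can then be replayed as a resource scheduling request against the candidate scheduling and allocation strategies before they are deployed back into SLURM's source code module (claim 6).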
PCT/CN2021/126478 2021-06-15 2021-10-26 Cluster resource scheduling method and apparatus, electronic device and storage medium WO2022262167A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110664041.0 2021-06-15
CN202110664041.0A CN113377540A (en) 2021-06-15 2021-06-15 Cluster resource scheduling method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022262167A1 (en)

Family

ID=77574472

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/126478 WO2022262167A1 (en) 2021-06-15 2021-10-26 Cluster resource scheduling method and apparatus, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN113377540A (en)
WO (1) WO2022262167A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116185645A (en) * 2023-04-28 2023-05-30 联通沃音乐文化有限公司 Cluster resource intelligent scheduling method, system and storage medium based on neural network
CN116542334A (en) * 2023-05-12 2023-08-04 北京大学 Deep neural network reasoning scheduling method and device based on Web browser
CN117155928A (en) * 2023-10-31 2023-12-01 浪潮电子信息产业股份有限公司 Communication task processing method, system, equipment, cluster and readable storage medium

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113377540A (en) * 2021-06-15 2021-09-10 上海商汤科技开发有限公司 Cluster resource scheduling method and device, electronic equipment and storage medium
CN114629906B (en) * 2022-03-14 2023-09-29 浙江大学 Reliable cloud container cluster resource scheduling method and device based on deep reinforcement learning
CN114911612A (en) * 2022-04-29 2022-08-16 中国航空无线电电子研究所 Task scheduling method for CPU-GPU heterogeneous resources
CN117539595A (en) * 2022-08-01 2024-02-09 华为技术有限公司 Cooperative scheduling method and related equipment
CN115080248B (en) * 2022-08-19 2023-01-10 中兴通讯股份有限公司 Scheduling optimization method for scheduling device, and storage medium
CN115525425A (en) * 2022-09-16 2022-12-27 中国电信股份有限公司 Federal learning calculation engine arrangement method and device based on cloud native technology
CN115220921B (en) * 2022-09-19 2023-01-03 浙江大华技术股份有限公司 Resource scheduling method, image processor, image pickup device, and medium
CN115421930B (en) * 2022-11-07 2023-03-24 山东海量信息技术研究院 Task processing method, system, device, equipment and computer readable storage medium
CN116739090B (en) * 2023-05-12 2023-11-28 北京大学 Deep neural network reasoning measurement method and device based on Web browser
CN117032937B (en) * 2023-09-28 2024-01-09 之江实验室 Task scheduling method based on GPU, electronic device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106992901A (en) * 2016-01-20 2017-07-28 阿里巴巴集团控股有限公司 Method and apparatus for scheduling of resource simulated pressure
US20190197655A1 (en) * 2017-04-14 2019-06-27 EMC IP Holding Company LLC Managing access to a resource pool of graphics processing units under fine grain control
CN111159004A (en) * 2018-11-07 2020-05-15 中移(苏州)软件技术有限公司 Hadoop cluster simulation test method and device and storage medium
CN111966484A (en) * 2020-06-23 2020-11-20 北京大学 Cluster resource management and task scheduling method and system based on deep reinforcement learning
CN112416585A (en) * 2020-11-20 2021-02-26 南京大学 GPU resource management and intelligent scheduling method for deep learning
CN113377540A (en) * 2021-06-15 2021-09-10 上海商汤科技开发有限公司 Cluster resource scheduling method and device, electronic equipment and storage medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102541640B (en) * 2011-12-28 2014-10-29 厦门市美亚柏科信息股份有限公司 Cluster GPU (graphic processing unit) resource scheduling system and method
CN107291546B (en) * 2016-03-30 2020-07-14 华为技术有限公司 Resource scheduling method and device
CN108733464B (en) * 2017-04-18 2021-09-14 华为技术有限公司 Method and device for determining scheduling scheme of computing task
CN109144716A (en) * 2017-06-28 2019-01-04 中兴通讯股份有限公司 Operating system dispatching method and device, equipment based on machine learning
CN110297699B (en) * 2018-03-23 2021-09-14 华为技术有限公司 Scheduling method, scheduler, storage medium and system
CN109614236B (en) * 2018-12-07 2023-04-18 深圳前海微众银行股份有限公司 Cluster resource dynamic adjustment method, device and equipment and readable storage medium
CN109634748A (en) * 2018-12-12 2019-04-16 深圳前海微众银行股份有限公司 Cluster resource dispatching method, device, equipment and computer readable storage medium
CN111258734B (en) * 2020-01-16 2022-09-23 中国人民解放军国防科技大学 Deep learning task scheduling method based on reinforcement learning
CN111736987B (en) * 2020-05-29 2023-08-04 山东大学 Task scheduling method based on GPU space resource sharing
CN112433819B (en) * 2020-11-30 2024-04-19 中国科学院深圳先进技术研究院 Simulation method and device for heterogeneous cluster scheduling, computer equipment and storage medium
CN112882828B (en) * 2021-01-25 2023-09-05 北京大学 Method for managing and scheduling a processor in a processor-based SLURM operation scheduling system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106992901A (en) * 2016-01-20 2017-07-28 阿里巴巴集团控股有限公司 Method and apparatus for scheduling of resource simulated pressure
US20190197655A1 (en) * 2017-04-14 2019-06-27 EMC IP Holding Company LLC Managing access to a resource pool of graphics processing units under fine grain control
CN111159004A (en) * 2018-11-07 2020-05-15 中移(苏州)软件技术有限公司 Hadoop cluster simulation test method and device and storage medium
CN111966484A (en) * 2020-06-23 2020-11-20 北京大学 Cluster resource management and task scheduling method and system based on deep reinforcement learning
CN112416585A (en) * 2020-11-20 2021-02-26 南京大学 GPU resource management and intelligent scheduling method for deep learning
CN113377540A (en) * 2021-06-15 2021-09-10 上海商汤科技开发有限公司 Cluster resource scheduling method and device, electronic equipment and storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116185645A (en) * 2023-04-28 2023-05-30 联通沃音乐文化有限公司 Cluster resource intelligent scheduling method, system and storage medium based on neural network
CN116185645B (en) * 2023-04-28 2023-08-04 联通沃音乐文化有限公司 Cluster resource intelligent scheduling method, system and storage medium based on neural network
CN116542334A (en) * 2023-05-12 2023-08-04 北京大学 Deep neural network reasoning scheduling method and device based on Web browser
CN116542334B (en) * 2023-05-12 2023-10-20 北京大学 Deep neural network reasoning scheduling method and device based on Web browser
CN117155928A (en) * 2023-10-31 2023-12-01 浪潮电子信息产业股份有限公司 Communication task processing method, system, equipment, cluster and readable storage medium
CN117155928B (en) * 2023-10-31 2024-02-09 浪潮电子信息产业股份有限公司 Communication task processing method, system, equipment, cluster and readable storage medium

Also Published As

Publication number Publication date
CN113377540A (en) 2021-09-10

Similar Documents

Publication Publication Date Title
WO2022262167A1 (en) Cluster resource scheduling method and apparatus, electronic device and storage medium
CN110869909B (en) System and method for applying machine learning algorithms to calculate health scores for workload scheduling
CN110301128B (en) Learning-based resource management data center cloud architecture implementation method
US11847554B2 (en) Data processing method and related products
Jyoti et al. Dynamic provisioning of resources based on load balancing and service broker policy in cloud computing
US10217053B2 (en) Provisioning service requests in a computer system
Zhao et al. Locality-aware scheduling for containers in cloud computing
Di et al. Characterizing and modeling cloud applications/jobs on a Google data center
US11429434B2 (en) Elastic execution of machine learning workloads using application based profiling
Patel et al. Survey on resource allocation strategies in cloud computing
TW201820165A (en) Server and cloud computing resource optimization method thereof for cloud big data computing architecture
Shi et al. Energy-aware container consolidation based on PSO in cloud data centers
CN112465146B (en) Quantum and classical hybrid cloud platform and task execution method
Muthusamy et al. Cluster-based task scheduling using K-means clustering for load balancing in cloud datacenters
CN115586961A (en) AI platform computing resource task scheduling method, device and medium
US10853137B2 (en) Efficient resource allocation for concurrent graph workloads
Li et al. Performance optimization of computing task scheduling based on the Hadoop big data platform
CN109582461B (en) Resource deployment method and system for Linux container
Beni et al. Adaptive and reflective middleware for the cloudification of simulation & optimization workflows
CN109117247A (en) A kind of virtual resource management system and method based on heterogeneous polynuclear topology ambiguity
Tang et al. Edge computing energy-efficient resource scheduling based on deep reinforcement learning and imitation learning
Hu et al. Hydro: Surrogate-Based Hyperparameter Tuning Service in Datacenters
Pilla Topology-aware load balancing for performance portability over parallel high performance systems
Grozev et al. Dynamic selection of virtual machines for application servers in cloud environments
Lin et al. Dynamic system allocation and application of cloud computing virtual resources based on system architecture

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21945747

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE