CN110413412B - GPU (Graphics Processing Unit) cluster resource allocation method and device - Google Patents


Info

Publication number
CN110413412B
CN110413412B · Application CN201910654395.XA
Authority
CN
China
Prior art keywords
task
gpu
processed
tasks
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910654395.XA
Other languages
Chinese (zh)
Other versions
CN110413412A (en)
Inventor
姬贵阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN201910654395.XA
Publication of CN110413412A
Application granted
Publication of CN110413412B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F 9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a resource allocation method based on a GPU cluster, the GPU cluster comprising a plurality of GPU cards. The method comprises the following steps: acquiring tasks to be processed, the tasks to be processed comprising large tasks and small tasks, wherein a large task is a task to be processed whose required resource amount is greater than or equal to one GPU card, and a small task is a task to be processed whose required resource amount is less than one GPU card; allocating one or more GPU cards to execute the large tasks according to the priority order of the tasks to be processed, each GPU card being allocated at most one large task; acquiring the remaining resource amount of each GPU card that is executing a task; and, for each GPU card with a remaining resource amount, traversing the unexecuted small tasks in priority order, and, if the remaining resource amount of the GPU card is found to satisfy an unexecuted small task, allocating the remaining resource amount of the GPU card to that small task and updating the remaining resource amount of the GPU card. The scheme of the invention improves the utilization rate of GPU cluster resources.

Description

GPU (Graphics Processing Unit) cluster resource allocation method and device
Technical Field
The invention relates to the field of cloud computing, in particular to a method and a device for distributing cluster resources based on a GPU.
Background
Currently, resource scheduling software commonly used for GPU clusters (systems that manage large numbers of GPU cards) includes PBS (Portable Batch System) and Slurm (Simple Linux Utility for Resource Management). With this existing resource scheduling software, the utilization rate of the GPU cluster is low: the utilization rate of a GPU (Graphics Processing Unit) card allocated to processing tasks is only about 30% to 60%, so roughly 30% of the resources go unused and are wasted.
The unit in which the scheduler software PBS and Slurm allocate GPU cards is the number of cards: when GPU resources are scheduled, one or more whole GPU cards are allocated to a task, so GPU utilization is not high while the task runs, and part of the GPU resources is wasted. Meanwhile, the resource scheduling of the existing scheduler software PBS and Slurm multiplexes GPU cards, i.e., two or more tasks can run on one GPU card. Although this uses GPU resources more fully, running multiple tasks on one GPU card at the same time reduces task execution speed, makes work efficiency very low, and lengthens the task completion period. In addition, the prior art only implements resource scheduling and lacks a related technology for reasonably allocating resources. In view of these problems in conventional GPU cluster resource scheduling, there is an urgent need for a GPU-cluster-based resource allocation method, and for a method and a device capable of improving GPU cluster utilization.
Disclosure of Invention
In order to solve the technical problem, the invention provides a method and a device for allocating resources based on a GPU cluster, which improve the utilization rate of the GPU cluster.
In order to achieve the aim of the invention, the invention provides a resource allocation method based on a GPU cluster, wherein the GPU cluster comprises a plurality of GPU cards; the method comprises the following steps:
acquiring tasks to be processed, wherein the tasks to be processed comprise large tasks and small tasks; a large task refers to a task to be processed whose required resource amount is greater than or equal to one GPU card; a small task refers to a task to be processed whose required resource amount is less than one GPU card;
allocating one or more GPU cards to execute the large tasks according to the priority order of the tasks to be processed, wherein each GPU card is allocated at most one large task;
acquiring the remaining resource amount of each GPU card that is executing a task;
and, for each GPU card with a remaining resource amount, traversing the unexecuted small tasks in priority order; if the remaining resource amount of the GPU card is found to satisfy an unexecuted small task, allocating the remaining resource amount of the GPU card to that small task and updating the remaining resource amount of the GPU card.
In an exemplary embodiment, after the acquiring the task to be processed, the method further includes:
calculating the attribute of the task to be processed by adopting a custom rule to obtain a priority weight value;
sorting according to the calculated priority weight values, wherein the higher the weight value is, the higher the priority is;
wherein the attributes of the task to be processed comprise one or more of the following: the time range for executing the task, the number of GPU cards required for executing the task, and the type of GPU card.
In an exemplary embodiment, the allocating of one or more GPU cards to execute the large tasks according to the priority order of the tasks to be processed includes:
step 31, according to the priority sequence of the tasks to be processed, determining the first unexecuted big task as the current task to be processed;
step 32, distributing one or more GPU cards for the current task to be processed;
step 33, judging whether large tasks which are not executed exist, and if not, acquiring the resource residual quantity of each GPU card for executing the tasks; if so, determining the next unexecuted big task as the current task to be processed;
step 34, judging whether an idle GPU card exists, if so, performing step 35, and if not, acquiring the resource residual quantity of each GPU card for executing the task;
step 35, judging whether the resource amount of the currently idle GPU cards is greater than or equal to the resource amount required by the current task to be processed;
if it is greater than or equal to, performing step 32;
if it is less than, performing step 36;
step 36, judging whether there is a large task that has not been executed;
if there is an unexecuted large task, determining, in priority order, the next unexecuted large task as the current task to be processed, and performing step 35;
if there is no unexecuted large task, the allocation of large tasks is complete.
In an exemplary embodiment, the amount of resources includes a time frame for executing the task and a resource requirement for executing the task.
In an exemplary embodiment, for each GPU card with a resource residual amount, traversing the small tasks that are not executed respectively according to a priority order, and if the resource residual amount of the GPU card is found to satisfy the small tasks that are not executed, allocating the resource residual amount of the GPU card to the small task, and updating the resource residual amount of the GPU card, the method includes:
step 51, traversing the small tasks which are not executed according to the priority sequence, and determining the first small task which is not executed as the current task to be processed;
step 52, judging whether the resource residual quantity of the GPU card is larger than or equal to the resource quantity required by the current task to be processed; if so, go to step 53, if not, go to step 54;
step 53, judging whether the execution time range of the current task to be processed is within the time range of the GPU card executing the large task;
if the current task to be processed is within the time range of executing the large task, the resource residual amount of the GPU card is distributed to the current task to be processed, and the resource residual amount is updated; if not, go to step 54;
step 54, judging whether small tasks which are not executed exist, and if not, ending the process;
if there are small tasks that have not been executed, the next small task that has not been executed is determined to be the current task to be processed according to the priority order, and the process returns to step 52.
In order to solve the above problem, the present invention further provides a device for allocating resources based on a GPU cluster, including: a memory and a processor, the GPU cluster including a plurality of GPU cards;
the memory is used for storing programs for allocating resources based on GPU clusters;
the processor is configured to read and execute the program for allocating resources based on the GPU cluster, and perform the following operations:
acquiring tasks to be processed, wherein the tasks to be processed comprise large tasks and small tasks; a large task refers to a task to be processed whose required resource amount is greater than or equal to one GPU card; a small task refers to a task to be processed whose required resource amount is less than one GPU card;
allocating one or more GPU cards to execute the large tasks according to the priority order of the tasks to be processed, wherein each GPU card is allocated at most one large task;
acquiring the remaining resource amount of each GPU card that is executing a task;
and, for each GPU card with a remaining resource amount, traversing the unexecuted small tasks in priority order; if the remaining resource amount of the GPU card is found to satisfy an unexecuted small task, allocating the remaining resource amount of the GPU card to that small task and updating the remaining resource amount of the GPU card.
In an exemplary embodiment, after acquiring the task to be processed, the processor reads and executes the program for allocating resources based on the GPU cluster, and further performs the following operations:
calculating the attribute of the task to be processed by adopting a custom rule to obtain a priority weight value;
sorting according to the calculated priority weight values, wherein the higher the weight value is, the higher the priority is;
wherein the attributes of the task to be processed comprise one or more of the following: the time range for executing the task, the number of GPU cards required for executing the task, and the type of GPU card.
In an exemplary embodiment, the processor reads and executes the program for GPU cluster resource based allocation, and further performs the following operations:
the executing of the one or more GPU cards distributed to the big tasks according to the priority sequence of the tasks to be processed comprises the following steps:
step 31, according to the priority sequence of the tasks to be processed, determining the first unexecuted big task as the current task to be processed;
step 32, distributing one or more GPU cards for the current task to be processed;
step 33, judging whether large tasks which are not executed exist, and if not, acquiring the resource residual quantity of each GPU card for executing the tasks; if so, determining the next unexecuted big task as the current task to be processed;
step 34, judging whether an idle GPU card exists, if so, performing step 35, and if not, acquiring the resource residual quantity of each GPU card for executing the task;
step 35, judging whether the resource amount of the currently idle GPU cards is greater than or equal to the resource amount required by the current task to be processed;
if it is greater than or equal to, performing step 32;
if it is less than, performing step 36;
step 36, judging whether there is a large task that has not been executed;
if there is an unexecuted large task, determining, in priority order, the next unexecuted large task as the current task to be processed, and performing step 35;
if there is no unexecuted large task, the allocation of large tasks is complete.
In an exemplary embodiment, the amount of resources includes a time frame for executing the task and a resource requirement for executing the task.
In an exemplary embodiment, the processor reads and executes the program for GPU cluster resource based allocation, and further performs the following operations:
for each GPU card with resource residual quantity, respectively traversing the small tasks which are not executed according to the priority sequence, if the resource residual quantity of the GPU card is found to meet the small tasks which are not executed, distributing the resource residual quantity of the GPU card to the small tasks, and updating the resource residual quantity of the GPU card, wherein the method comprises the following steps:
step 51, traversing the small tasks which are not executed according to the priority sequence, and determining the first small task which is not executed as the current task to be processed;
step 52, judging whether the resource residual quantity of the GPU card is larger than or equal to the resource quantity required by the current task to be processed; if so, go to step 53, if not, go to step 54;
step 53, judging whether the execution time range of the current task to be processed is within the time range of the GPU card executing the large task;
if the current task to be processed is within the time range of executing the large task, the resource residual amount of the GPU card is distributed to the current task to be processed, and the resource residual amount is updated; if not, go to step 54;
step 54, judging whether small tasks which are not executed exist, and if not, ending the process;
if there are small tasks that have not been executed, the next small task that has not been executed is determined to be the current task to be processed according to the priority order, and the process returns to step 52.
Compared with the prior art, the invention provides a resource allocation method based on a GPU cluster, the GPU cluster comprising a plurality of GPU cards. Tasks to be processed are acquired, the tasks to be processed comprising large tasks and small tasks, wherein a large task is a task to be processed whose required resource amount is greater than or equal to one GPU card, and a small task is a task to be processed whose required resource amount is less than one GPU card. One or more GPU cards are allocated to execute the large tasks according to the priority order of the tasks to be processed, each GPU card being allocated at most one large task. The remaining resource amount of each GPU card executing a task is acquired, and, for each GPU card with a remaining resource amount, the unexecuted small tasks are traversed in priority order; if the remaining resource amount of the GPU card is found to satisfy an unexecuted small task, the remaining resource amount of the GPU card is allocated to that small task, and the remaining resource amount of the GPU card is updated. The scheme of the invention improves the utilization rate of GPU cluster resources.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention, and together with the description serve to explain the principles of the invention and not to limit the invention.
FIG. 1 is a flowchart of a GPU-based cluster resource allocation method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a GPU cluster resource allocation device in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
The steps illustrated in the flow charts of the figures may be performed in a computer system such as a set of computer-executable instructions. Also, while a logical order is shown in the flow diagrams, in some cases, the steps shown or described may be performed in an order different than here.
FIG. 1 is a flowchart of a method for allocating resources based on a GPU cluster according to an embodiment of the present invention, where the GPU cluster includes multiple GPU cards; the method comprises the following steps:
step 100, obtaining tasks to be processed, wherein the tasks to be processed comprise large tasks and small tasks.
In this embodiment, a plurality of GPU cards are included in a GPU cluster, each GPU card being an independent processor.
A task to be processed is acquired and classified as a large task or a small task: a large task is a task to be processed whose required resource amount is greater than or equal to one GPU card, and a small task is a task to be processed whose required resource amount is less than one GPU card. A processing task may be, for example, a deep learning task trained in a deep learning platform. By dividing the processing tasks in this way, resources can first be allocated according to the resource amount required by the large tasks, and the small tasks can then be backfilled into the remaining resources. For example, the resource amount of a large task may be 2 GPU cards, while the resource amount of a small task may be 20% of the memory of one GPU card.
In an exemplary embodiment, after a task to be processed is acquired, a custom rule is applied to the attributes of the task to be processed to calculate a priority weight value, and the tasks are sorted by the calculated priority weight values, a higher weight value meaning a higher priority. The attributes of the task to be processed comprise one or more of the following: the expected start time of the task, the expected end time of the task, the time range for executing the task, the number of GPU cards required for executing the task, and the type of GPU card. After the tasks to be processed are sorted by priority, the large tasks come before the small tasks in the priority order. For example, the sorted priority order of the tasks to be processed may be: large task 1, large task 2, large task 3 … …, small task 1, small task 2, small task 3, small task 4 … ….
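The custom weighting rule itself is left unspecified by the embodiment. Purely as an illustration, the sketch below assumes a simple linear rule over the attributes listed above; the `Task` fields, the coefficients, and the function names are hypothetical, not taken from the patent:

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    cards_needed: float    # e.g. 2.0 = two whole GPU cards (large), 0.2 = 20% of one card (small)
    duration_hours: float  # length of the time range for executing the task

    @property
    def is_large(self) -> bool:
        # A large task needs at least one whole GPU card.
        return self.cards_needed >= 1.0

def priority_weight(task: Task) -> float:
    # Hypothetical custom rule: every large task is weighted ahead of every
    # small task, then by resource demand, then shorter tasks slightly first.
    base = 1000.0 if task.is_large else 0.0
    return base + 10.0 * task.cards_needed - task.duration_hours

def sort_by_priority(tasks: list[Task]) -> list[Task]:
    # A higher weight value means a higher priority, so sort in descending order.
    return sorted(tasks, key=priority_weight, reverse=True)
```

With this rule all large tasks precede all small tasks after sorting, reproducing the example ordering above (large task 1, large task 2, …, small task 1, small task 2, …).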
Step 101, allocating one or more GPU cards to execute the large tasks according to the priority order of the tasks to be processed.
In this embodiment, GPU cards are first allocated to the large tasks according to the priority order of the tasks to be processed: one or more GPU cards are allocated to execute each large task, and during allocation each GPU card is allocated at most one large task. One card runs only one large task, which ensures the speed and efficiency of task execution.
In an exemplary embodiment, the allocation of one or more GPU cards to execute the large tasks according to the priority order of the tasks to be processed may be implemented as follows:
step 31, according to the priority sequence of the tasks to be processed, determining the first unexecuted big task as the current task to be processed;
step 32, distributing one or more GPU cards for the current task to be processed;
step 33, judging whether large tasks which are not executed exist, and if not, acquiring the resource residual quantity of each GPU card for executing the tasks; if so, determining the next unexecuted big task as the current task to be processed;
step 34, judging whether an idle GPU card exists, if so, performing step 35, and if not, acquiring the resource residual quantity of each GPU card for executing the task;
step 35, judging whether the resource amount of the currently idle GPU cards is greater than or equal to the resource amount required by the current task to be processed;
if it is greater than or equal to, performing step 32;
if it is less than, performing step 36;
step 36, judging whether there is a large task that has not been executed;
if there is an unexecuted large task, determining, in priority order, the next unexecuted large task as the current task to be processed, and performing step 35;
if there is no unexecuted large task, the allocation of large tasks is complete.
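Steps 31 to 36 amount to a single greedy pass over the priority-ordered large tasks: each task either receives the whole idle cards it needs or is skipped so the next large task can be tried. A minimal sketch under that reading (the `(name, cards_needed)` tuple shape and the card identifiers are assumptions, not part of the patent):

```python
def allocate_large_tasks(large_tasks, idle_cards):
    """Greedily allocate whole idle GPU cards to priority-ordered large tasks.

    large_tasks: list of (name, cards_needed) tuples, sorted by descending
                 priority, where cards_needed is an integer >= 1.
    idle_cards:  list of identifiers of currently idle GPU cards.
    Returns (assignments, remaining_idle_cards); each card receives at most
    one large task, so assigned cards are removed from the idle list.
    """
    assignments = {}
    idle = list(idle_cards)
    for name, cards_needed in large_tasks:
        # Step 35: are there enough idle cards for the current task?
        if len(idle) >= cards_needed:
            # Step 32: allocate whole cards, one large task per card.
            assignments[name] = idle[:cards_needed]
            idle = idle[cards_needed:]
        # Steps 35-36 (not enough cards): skip to the next unexecuted
        # large task in priority order.
    return assignments, idle
```

A large task that cannot be placed does not block lower-priority large tasks that still fit, which is what the jump from step 35 to step 36 achieves.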
Step 102, acquiring the remaining resource amount of each GPU card that is executing a task.
In this embodiment, the remaining resource amount of each GPU card after the large tasks are allocated may be obtained using existing techniques; the specific technical means used to obtain it is not limited in this embodiment.
Step 103, for each GPU card with a remaining resource amount, traversing the unexecuted small tasks in priority order; if the remaining resource amount of the GPU card is found to satisfy an unexecuted small task, allocating the remaining resource amount of the GPU card to that small task and updating the remaining resource amount of the GPU card.
In this embodiment, for each GPU card with a remaining resource amount obtained in step 102, small tasks may additionally be allocated to the GPU card, which improves the resource utilization rate of the GPU card.
In an exemplary embodiment, the resource amount includes a time range for executing the task and a resource requirement for executing the task.
In an exemplary embodiment, for each GPU card with resource residual, traversing the small tasks that are not executed according to a priority order, if the resource residual of the GPU card is found to satisfy the small tasks that are not executed, allocating the resource residual of the GPU card to the small tasks, and updating the resource residual of the GPU card, includes:
step 51, traversing the small tasks which are not executed according to the priority sequence, and determining the first small task which is not executed as the current task to be processed;
step 52, judging whether the resource residual quantity of the GPU card is larger than or equal to the resource quantity required by the current task to be processed; if so, go to step 53, if not, go to step 54;
step 53, judging whether the execution time range of the current task to be processed is within the time range of the GPU card executing the large task;
if the current task to be processed is within the time range of executing the large task, the resource residual amount of the GPU card is distributed to the current task to be processed, and the resource residual amount is updated; if not, go to step 54;
step 54, judging whether small tasks which are not executed exist, and if not, ending the process;
if there are small tasks that have not been executed, the next small task that has not been executed is determined to be the current task to be processed according to the priority order, and the process returns to step 52.
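Steps 51 to 54 describe a first-fit backfill over the priority-ordered small tasks, gated by both the card's leftover resource fraction (step 52) and the execution window of the large task already on the card (step 53). A sketch under those assumptions (the dict keys `free`, `window`, and `need` are hypothetical names, not from the patent):

```python
def backfill_small_tasks(cards, small_tasks):
    """First-fit backfill of small tasks onto cards with leftover resources.

    cards:       list of dicts like {"name": "g0", "free": 0.4, "window": (0, 10)},
                 where "free" is the unused fraction of the card and "window"
                 is the (start, end) of the large task already running on it.
    small_tasks: list of dicts like {"name": "s1", "need": 0.2, "window": (1, 5)},
                 sorted by descending priority.
    Returns {card_name: [small task names]}; updates each card's "free" in place.
    """
    placed = {c["name"]: [] for c in cards}
    for card in cards:
        for task in small_tasks:
            if task.get("placed"):
                continue  # this small task was already backfilled onto a card
            fits = card["free"] >= task["need"]                # step 52
            start_ok = card["window"][0] <= task["window"][0]  # step 53: task window
            end_ok = task["window"][1] <= card["window"][1]    # inside the large task's window
            if fits and start_ok and end_ok:
                placed[card["name"]].append(task["name"])
                card["free"] -= task["need"]                   # update remaining amount
                task["placed"] = True
    return placed
```

Because the remaining amount is updated after each placement, several small tasks can share one card's leftover resources, as long as each fits within both the remaining fraction and the large task's time range.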
In addition, the present application provides an embodiment of an apparatus based on GPU cluster resource allocation, which corresponds to the embodiment of the method shown in fig. 1.
In order to solve the above problem, the present invention further provides a device for allocating resources based on a GPU cluster, including: a memory and a processor, the GPU cluster including a plurality of GPU cards;
the memory is used for storing programs for allocating resources based on GPU clusters;
the processor is configured to read and execute the program for allocating resources based on the GPU cluster, and perform the following operations:
acquiring tasks to be processed, wherein the tasks to be processed comprise large tasks and small tasks; a large task refers to a task to be processed whose required resource amount is greater than or equal to one GPU card; a small task refers to a task to be processed whose required resource amount is less than one GPU card;
allocating one or more GPU cards to execute the large tasks according to the priority order of the tasks to be processed, wherein each GPU card is allocated at most one large task;
acquiring the remaining resource amount of each GPU card that is executing a task;
and, for each GPU card with a remaining resource amount, traversing the unexecuted small tasks in priority order; if the remaining resource amount of the GPU card is found to satisfy an unexecuted small task, allocating the remaining resource amount of the GPU card to that small task and updating the remaining resource amount of the GPU card.
In an exemplary embodiment, after acquiring the task to be processed, the processor reads and executes the program for allocating resources based on the GPU cluster, and further performs the following operations:
calculating the attribute of the task to be processed by adopting a custom rule to obtain a priority weight value;
sorting according to the calculated priority weight values, wherein the higher the weight value is, the higher the priority is;
wherein the attributes of the task to be processed comprise one or more of the following: the time range for executing the task, the number of GPU cards required for executing the task, and the type of GPU card.
In an exemplary embodiment, the processor reads and executes the program for GPU cluster resource based allocation, and further performs the following operations:
the executing of the one or more GPU cards distributed to the big tasks according to the priority sequence of the tasks to be processed comprises the following steps:
step 31, according to the priority sequence of the tasks to be processed, determining the first unexecuted big task as the current task to be processed;
step 32, distributing one or more GPU cards for the current task to be processed;
step 33, judging whether large tasks which are not executed exist, and if not, acquiring the resource residual quantity of each GPU card for executing the tasks; if so, determining the next unexecuted big task as the current task to be processed;
step 34, judging whether an idle GPU card exists, if so, performing step 35, and if not, acquiring the resource residual quantity of each GPU card for executing the task;
step 35, judging whether the resource amount of the currently idle GPU cards is greater than or equal to the resource amount required by the current task to be processed;
if it is greater than or equal to, performing step 32;
if it is less than, performing step 36;
step 36, judging whether there is a large task that has not been executed;
if there is an unexecuted large task, determining, in priority order, the next unexecuted large task as the current task to be processed, and performing step 35;
if there is no unexecuted large task, the allocation of large tasks is complete.
In an exemplary embodiment, the amount of resources includes a time frame for executing the task and a resource requirement for executing the task.
In an exemplary embodiment, the processor reads and executes the program for GPU cluster resource based allocation, and further performs the following operations:
for each GPU card with resource surplus, traversing the small tasks which are not executed according to the priority sequence, if the resource surplus of the GPU card is found to meet the small tasks which are not executed, distributing the resource surplus of the GPU card to the small tasks, and updating the resource surplus of the GPU card, wherein the method comprises the following steps:
step 51, traversing the unexecuted small tasks in priority order, and determining the first unexecuted small task as the current task to be processed;
step 52, judging whether the remaining resource amount of the GPU card is greater than or equal to the resource amount required by the current task to be processed; if so, go to step 53; if not, go to step 54;
step 53, judging whether the execution time range of the current task to be processed falls within the time range during which the GPU card executes its large task;
if it does, allocating the remaining resource amount of the GPU card to the current task to be processed and updating the remaining resource amount; if not, go to step 54;
step 54, judging whether any unexecuted small tasks remain; if not, ending;
if there are unexecuted small tasks, determining the next unexecuted small task as the current task to be processed according to the priority order, and returning to step 52.
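Steps 51 to 54 above describe a per-card packing loop. As an illustrative sketch only (the patent provides no code; the dictionary field names `residual`, `busy_start`, `demand`, `done`, and `priority` are assumptions):

```python
def fill_card_with_small_tasks(card, small_tasks):
    """Steps 51-54: walk unexecuted small tasks in priority order and pack
    them into this card's remaining resources, subject to the constraint
    that a small task's time range lies within the time range during which
    the card executes its large task.

    `card` is a dict with keys residual, busy_start, busy_end; each task is
    a dict with keys demand, start, end, done, priority (names illustrative).
    Returns the card's remaining resource amount after packing."""
    for task in sorted(small_tasks, key=lambda t: -t["priority"]):
        if task["done"]:
            continue
        # step 52: the remaining resources must cover the task's demand
        if card["residual"] < task["demand"]:
            continue
        # step 53: the task's window must fall inside the card's busy window
        if not (card["busy_start"] <= task["start"]
                and task["end"] <= card["busy_end"]):
            continue
        # allocate the remaining resources to the task and update them
        card["residual"] -= task["demand"]
        task["done"] = True
    return card["residual"]
```

For a card with 0.5 of a card's resources left and a busy window of 0-10, a 0.3-demand task inside that window is packed (leaving roughly 0.2), while a task whose window extends past 10 is skipped.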
In order to solve the above problem, the present invention further provides a specific embodiment of a method for allocating resources based on a GPU cluster, and an implementation process of an exemplary embodiment is as follows:
step 31, sorting the tasks to be processed using a custom rule to obtain their priority order;
step 32, determining, according to the priority order of the tasks to be processed, the first unexecuted large task as the current task to be processed;
in this embodiment, during task allocation, resources are first allocated to the large tasks.
Step 33, allocating one or more GPU cards to the current large task to be processed, wherein each GPU card is allocated to only one large task;
step 34, judging whether there is an unexecuted large task; if not, go to step 38; if so, determining the next unexecuted large task as the current task to be processed;
step 35, judging whether an idle GPU card exists, if so, performing step 36, and if not, performing step 38;
step 36, judging whether the resource amount of the currently idle GPU cards is greater than or equal to the resource amount required by the current task to be processed;
if greater than or equal to, go to step 33;
if less than, go to step 37;
step 37, judging whether there is an unexecuted large task;
if there is an unexecuted large task, determining the next unexecuted large task as the current task to be processed according to the priority order, and returning to step 36;
if there is no unexecuted large task, the process ends.
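Steps 31 to 37 amount to a greedy pass that hands whole idle cards to large tasks in priority order, with each card serving exactly one large task. A minimal sketch, under the assumptions that demand is expressed in card units and that the function name and dict fields are illustrative rather than from the disclosure:

```python
import math

def allocate_large_tasks(large_tasks, free_cards):
    """Steps 31-37: assign whole idle GPU cards to large tasks in priority
    order; each card is allocated to exactly one large task. Returns a
    mapping from task name to the list of card ids it received.

    Each task is a dict with 'name', 'demand' (in card units), 'priority';
    free_cards is a list of idle card ids (names are assumptions)."""
    assignment = {}
    for task in sorted(large_tasks, key=lambda t: -t["priority"]):
        needed = math.ceil(task["demand"])  # whole cards required
        # steps 35-37: if the idle pool cannot satisfy the current task,
        # skip it and try the next unexecuted large task in priority order
        if len(free_cards) < needed:
            continue
        assignment[task["name"]] = [free_cards.pop(0) for _ in range(needed)]
    return assignment
```

With three idle cards and tasks A (2 cards), B (3 cards), C (1 card) in descending priority, A takes two cards, B is skipped because only one card remains, and C takes the last card.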
Step 38, acquiring the remaining resource amount of each GPU card.
Step 39, for each GPU card with remaining resources, traversing the unexecuted small tasks in priority order; if the remaining resource amount of the GPU card is found to satisfy an unexecuted small task, allocating the remaining resource amount of the GPU card to that small task and updating the remaining resource amount of the GPU card.
In this embodiment, after the large tasks are allocated, the GPU cards executing them may still have remaining resources; allocating each card's remaining resources to small tasks that satisfy the conditions improves the resource utilization of the GPU cards.
A specific implementation of step 39 may include:
step 391, traversing the unexecuted small tasks in priority order, and determining the first unexecuted small task as the current task to be processed;
step 392, judging whether there is a GPU card whose remaining resource amount satisfies the resource amount required by the current task to be processed; if so, go to step 393; if not, go to step 394;
step 393, judging whether the execution time range of the current task to be processed falls within the time range during which that GPU card executes its large task;
if it does, allocating the remaining resource amount of the GPU card to the current task to be processed and updating the remaining resource amount; if not, go to step 394;
step 394, judging whether any unexecuted small tasks remain; if not, ending;
if there are unexecuted small tasks, determining the next unexecuted small task as the current task to be processed according to the priority order, and returning to step 392.
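Steps 391 to 394 invert the per-card loop: here each small task, in priority order, searches for a card that can host it. A hedged sketch under the same assumed field names as before (none of these identifiers come from the patent):

```python
def place_small_tasks(small_tasks, cards):
    """Steps 391-394: walk unexecuted small tasks in priority order; for
    each, look for a GPU card whose remaining resources cover the task's
    demand AND whose large-task time range contains the task's execution
    range. On success, allocate and update the card's remaining resources.

    Tasks and cards are dicts with illustrative field names: tasks carry
    demand/start/end/done/priority, cards carry residual/busy_start/busy_end."""
    for task in sorted(small_tasks, key=lambda t: -t["priority"]):
        for card in cards:
            fits = card["residual"] >= task["demand"]
            in_window = (card["busy_start"] <= task["start"]
                         and task["end"] <= card["busy_end"])
            if fits and in_window:
                card["residual"] -= task["demand"]  # allocate and update
                task["done"] = True
                break  # step 394: move on to the next small task
    return cards
```

A 0.4-demand task passes over a card with only 0.2 of a card's resources left and lands on the next card that has 0.5 remaining, leaving it with roughly 0.1.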
In this specific example, the tasks to be processed are divided into large tasks and small tasks, and a different resource allocation method is applied to each. After the large tasks are allocated, the GPU cards executing them still have remaining resources, and each card's remaining resources are allocated to small tasks, so that the resource amount of every GPU card is fully used and the utilization rate of the GPU cluster is improved.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, and functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and includes any information delivery media as known to those skilled in the art.

Claims (8)

1. A method for GPU cluster resource allocation, wherein the GPU cluster comprises a plurality of GPU cards, characterized in that the method comprises:
acquiring tasks to be processed, wherein the tasks to be processed comprise large tasks and small tasks; a large task is a task to be processed whose required resource amount is greater than or equal to one GPU card; a small task is a task to be processed whose required resource amount is less than one GPU card;
allocating one or more GPU cards to execute the large tasks according to the priority order of the tasks to be processed, wherein each GPU card is allocated to only one large task;
acquiring the resource residual quantity of each GPU card for executing the task;
for each GPU card with the resource residual quantity, traversing the small tasks which are not executed according to the priority sequence, if the resource residual quantity of the GPU card is found to meet the small tasks which are not executed, distributing the resource residual quantity of the GPU card to the small tasks, and updating the resource residual quantity of the GPU card;
after the task to be processed is obtained, the method further comprises the following steps:
calculating the attribute of the task to be processed by adopting a custom rule to obtain a priority weight value;
sorting according to the calculated priority weight values, wherein the higher the weight value is, the higher the priority is;
wherein the attributes of the task to be processed comprise one or more of the following: the time range for executing the task, the number of GPU cards required for executing the task and the type of GPU cards are calculated.
2. The method for GPU cluster resource allocation according to claim 1, wherein the allocating one or more GPU cards to execute the large tasks according to the priority order of the tasks to be processed comprises:
step 31, according to the priority sequence of the tasks to be processed, determining the first unexecuted big task as the current task to be processed;
step 32, distributing one or more GPU cards for the current task to be processed;
step 33, judging whether there are unexecuted large tasks; if not, acquiring the resource residual quantity of each GPU card for executing tasks; if so, determining the next unexecuted large task as the current task to be processed;
step 34, judging whether there is an idle GPU card; if so, go to step 35; if not, acquiring the resource residual quantity of each GPU card for executing tasks;
step 35, judging whether the resource amount of the currently idle GPU cards is greater than or equal to the resource amount required by the current task to be processed;
if greater than or equal to, go to step 32;
if less than, go to step 36;
step 36, judging whether there is an unexecuted large task;
if there is an unexecuted large task, determining the next unexecuted large task as the current task to be processed according to the priority order, and returning to step 35;
if there is no unexecuted large task, the process ends.
3. The method of claim 1, wherein the amount of resources comprises a time frame for executing a task and a resource requirement for executing a task.
4. The method according to claim 3, wherein for each GPU card with resource residual, traversing the small tasks that are not executed according to the priority order, if the resource residual of the GPU card is found to satisfy the small tasks that are not executed, allocating the resource residual of the GPU card to the small tasks, and updating the resource residual of the GPU card comprises:
step 51, traversing the small tasks which are not executed according to the priority sequence, and determining the first small task which is not executed as the current task to be processed;
step 52, judging whether the resource residual quantity of the GPU card is larger than or equal to the resource quantity required by the current task to be processed; if so, go to step 53, if not, go to step 54;
step 53, judging whether the execution time range of the current task to be processed falls within the time range during which the GPU card executes its large task;
if it does, allocating the resource residual amount of the GPU card to the current task to be processed and updating the resource residual amount; if not, go to step 54;
step 54, judging whether small tasks which are not executed exist, and if not, ending the process;
if there are small tasks that have not been executed, the next small task that has not been executed is determined to be the current task to be processed according to the priority order, and the process returns to step 52.
5. An apparatus for GPU cluster resource allocation, comprising a memory and a processor, the GPU cluster including a plurality of GPU cards, characterized in that:
the memory is used for storing programs for allocating resources based on GPU clusters;
the processor is configured to read and execute the program for allocating resources based on the GPU cluster, and perform the following operations:
acquiring tasks to be processed, wherein the tasks to be processed comprise large tasks and small tasks; the large task refers to a task to be processed, wherein the required resource amount is greater than or equal to one GPU card; the small task is a task to be processed, the required resource amount of which is less than that of one GPU card;
distributing one or more GPU cards to execute the large tasks according to the priority sequence of the tasks to be processed; each GPU card is only distributed with one large task;
acquiring the resource residual quantity of each GPU card for executing the task;
for each GPU card with the resource residual quantity, traversing the small tasks which are not executed according to the priority sequence, if the resource residual quantity of the GPU card is found to meet the small tasks which are not executed, distributing the resource residual quantity of the GPU card to the small tasks, and updating the resource residual quantity of the GPU card;
after the task to be processed is obtained, the following operations are also executed:
calculating the attribute of the task to be processed by adopting a custom rule to obtain a priority weight value;
sorting according to the calculated priority weight values, wherein the higher the weight value is, the higher the priority is;
wherein the attributes of the task to be processed comprise one or more of the following: the time range for executing the task, the number of GPU cards required for executing the task and the type of GPU cards are calculated.
6. The apparatus according to claim 5, wherein the processor reads and executes the program for GPU cluster resource allocation, and further performs the following operations:
the allocating one or more GPU cards to execute the large tasks according to the priority order of the tasks to be processed comprises the following steps:
step 31, according to the priority sequence of the tasks to be processed, determining the first unexecuted big task as the current task to be processed;
step 32, distributing one or more GPU cards for the current task to be processed;
step 33, judging whether there are unexecuted large tasks; if not, acquiring the resource residual quantity of each GPU card for executing tasks; if so, determining the next unexecuted large task as the current task to be processed;
step 34, judging whether there is an idle GPU card; if so, go to step 35; if not, acquiring the resource residual quantity of each GPU card for executing tasks;
step 35, judging whether the resource amount of the currently idle GPU cards is greater than or equal to the resource amount required by the current task to be processed;
if greater than or equal to, go to step 32;
if less than, go to step 36;
step 36, judging whether there is an unexecuted large task;
if there is an unexecuted large task, determining the next unexecuted large task as the current task to be processed according to the priority order, and returning to step 35;
if there is no unexecuted large task, the process ends.
7. The apparatus for GPU cluster resource allocation according to claim 5, wherein
the resource amount includes a time range for executing the task and a resource demand amount for executing the task.
8. The apparatus for GPU cluster resource allocation according to claim 7, wherein the processor reads and executes the program for GPU cluster resource allocation and further performs the following operations:
for each GPU card with a resource residual amount, traversing the unexecuted small tasks in priority order; if the resource residual amount of the GPU card is found to satisfy an unexecuted small task, allocating the resource residual amount of the GPU card to that small task and updating the resource residual amount of the GPU card, comprising the following steps:
step 51, traversing the small tasks which are not executed according to the priority sequence, and determining the first small task which is not executed as the current task to be processed;
step 52, judging whether the resource residual quantity of the GPU card is larger than or equal to the resource quantity required by the current task to be processed; if so, go to step 53, if not, go to step 54;
step 53, judging whether the execution time range of the current task to be processed falls within the time range during which the GPU card executes its large task;
if it does, allocating the resource residual amount of the GPU card to the current task to be processed and updating the resource residual amount; if not, go to step 54;
step 54, judging whether small tasks which are not executed exist, and if not, ending the process;
if there are small tasks that have not been executed, the next small task that has not been executed is determined to be the current task to be processed according to the priority order, and the process returns to step 52.
CN201910654395.XA 2019-07-19 2019-07-19 GPU (graphics processing Unit) cluster resource allocation method and device Active CN110413412B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910654395.XA CN110413412B (en) 2019-07-19 2019-07-19 GPU (graphics processing Unit) cluster resource allocation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910654395.XA CN110413412B (en) 2019-07-19 2019-07-19 GPU (graphics processing Unit) cluster resource allocation method and device

Publications (2)

Publication Number Publication Date
CN110413412A CN110413412A (en) 2019-11-05
CN110413412B true CN110413412B (en) 2022-03-25

Family

ID=68362046

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910654395.XA Active CN110413412B (en) 2019-07-19 2019-07-19 GPU (graphics processing Unit) cluster resource allocation method and device

Country Status (1)

Country Link
CN (1) CN110413412B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111176852B (en) * 2020-01-15 2024-04-16 上海依图网络科技有限公司 Resource allocation method, device, chip and computer readable storage medium
CN111381970B (en) * 2020-03-16 2023-07-25 第四范式(北京)技术有限公司 Cluster task resource allocation method and device, computer device and storage medium
CN111708799B (en) * 2020-04-30 2023-09-05 咪咕文化科技有限公司 Spark task processing method and device, electronic equipment and storage medium
CN111722928A (en) * 2020-06-12 2020-09-29 北京字节跳动网络技术有限公司 Resource scheduling method and device, electronic equipment and storage medium
CN112148481B (en) * 2020-09-10 2022-11-22 苏州浪潮智能科技有限公司 Method, system, equipment and medium for executing simulation test task
CN113742064B (en) * 2021-08-06 2023-08-04 苏州浪潮智能科技有限公司 Resource arrangement method, system, equipment and medium of server cluster
CN115328665B (en) * 2022-10-12 2023-02-28 中瓴智行(成都)科技有限公司 Hypervisor-based GPU virtualization method and device and electronic equipment
CN117950815A (en) * 2022-10-21 2024-04-30 华为技术有限公司 Method for executing tasks and heterogeneous server

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107797853A (en) * 2016-09-07 2018-03-13 深圳市中兴微电子技术有限公司 A kind of method for scheduling task, device and polycaryon processor
CN109144710A (en) * 2017-06-16 2019-01-04 中国移动通信有限公司研究院 Resource regulating method, device and computer readable storage medium
CN109819057A (en) * 2019-04-08 2019-05-28 科大讯飞股份有限公司 A kind of load-balancing method and system
CN109992407A (en) * 2018-01-02 2019-07-09 中国移动通信有限公司研究院 A kind of YARN cluster GPU resource dispatching method, device and medium
CN109995862A (en) * 2019-03-29 2019-07-09 北京百度网讯科技有限公司 A kind of resource regulating method and terminal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018234869A2 (en) * 2017-06-22 2018-12-27 Banuba Limited Improving operation of computing devices by dynamically adaptive distribution of workload between central processing unit(s) and graphics processing unit(s), and computer systems and computer-implemented methods in accordance with thereof

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107797853A (en) * 2016-09-07 2018-03-13 深圳市中兴微电子技术有限公司 A kind of method for scheduling task, device and polycaryon processor
CN109144710A (en) * 2017-06-16 2019-01-04 中国移动通信有限公司研究院 Resource regulating method, device and computer readable storage medium
CN109992407A (en) * 2018-01-02 2019-07-09 中国移动通信有限公司研究院 A kind of YARN cluster GPU resource dispatching method, device and medium
CN109995862A (en) * 2019-03-29 2019-07-09 北京百度网讯科技有限公司 A kind of resource regulating method and terminal
CN109819057A (en) * 2019-04-08 2019-05-28 科大讯飞股份有限公司 A kind of load-balancing method and system

Also Published As

Publication number Publication date
CN110413412A (en) 2019-11-05

Similar Documents

Publication Publication Date Title
CN110413412B (en) GPU (graphics processing Unit) cluster resource allocation method and device
CN109213600B (en) GPU resource scheduling method and device based on AI cloud
US20170255496A1 (en) Method for scheduling data flow task and apparatus
CN110389816B (en) Method, apparatus and computer readable medium for resource scheduling
CN111400022A (en) Resource scheduling method and device and electronic equipment
CN113448743B (en) Method, electronic device and computer program product for task processing
US10884667B2 (en) Storage controller and IO request processing method
CN112035238A (en) Task scheduling processing method and device, cluster system and readable storage medium
CN107577534A (en) A kind of resource regulating method and device
CN107515781B (en) Deterministic task scheduling and load balancing system based on multiple processors
CN103503412A (en) Method and device for scheduling resources
CN112540841A (en) Task scheduling method and device, processor and electronic equipment
CN112860387A (en) Distributed task scheduling method and device, computer equipment and storage medium
CN106775975B (en) Process scheduling method and device
CN115586961A (en) AI platform computing resource task scheduling method, device and medium
CN111367655B (en) Method, system and storage medium for GPU resource scheduling in cloud computing environment
CN109189581B (en) Job scheduling method and device
CN115878910A (en) Line query method, device and storage medium
CN108429704B (en) Node resource allocation method and device
CN115952054A (en) Simulation task resource management method, device, equipment and medium
CN110928649A (en) Resource scheduling method and device
CN111796934B (en) Task issuing method and device, storage medium and electronic equipment
CN114924848A (en) IO (input/output) scheduling method, device and equipment
US20210271520A1 (en) Application aware resource allocation for deep learning job scheduling
CN110750330A (en) Virtual machine creating method, system, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant