CN115145734B - Data processing system for distributing GPU - Google Patents

Data processing system for distributing GPU

Info

Publication number
CN115145734B
CN115145734B CN202211063127.9A
Authority
CN
China
Prior art keywords
gpu
space
value
target
micro
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211063127.9A
Other languages
Chinese (zh)
Other versions
CN115145734A (en)
Inventor
赵洲洋
于伟
靳雯
石江枫
王全修
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rizhao Ruian Information Technology Co ltd
Beijing Rich Information Technology Co ltd
Original Assignee
Rizhao Ruian Information Technology Co ltd
Beijing Rich Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rizhao Ruian Information Technology Co ltd and Beijing Rich Information Technology Co ltd
Priority to CN202211063127.9A
Publication of CN115145734A
Application granted
Publication of CN115145734B
Legal status: Active (current)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5072Grid computing

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data processing system for allocating a GPU (graphics processing unit). A micro-service information set is obtained, comprising the minimum GPU storage space and the minimum CPU storage space the micro-service needs to occupy. A target GPU list is then obtained: the remaining storage space of each target GPU is larger than the minimum GPU storage space the micro-service needs, and the remaining storage space of the CPU in the processor where the target GPU resides is larger than the minimum CPU storage space the micro-service needs. A scheduling priority index list of the target GPUs is obtained, and the GPU corresponding to the maximum value in that list is selected to run the micro-service. Because the priority index considers not only the remaining video memory and main memory but also the average utilization value of each GPU, the running speed of the micro-service is guaranteed, its running efficiency is improved, and time resources are saved.

Description

Data processing system for distributing GPU
Technical Field
The invention relates to the technical field of GPU scheduling, in particular to a data processing system for distributing a GPU.
Background
In the prior art, when a multi-core central processing unit (CPU) and a graphics processing unit (GPU) cooperatively process a large number of micro-services, a conventional graphics-processing task scheduler selects a GPU to run a micro-service based only on whether the CPU storage space and the GPU storage space the micro-service needs are available. Only the storage conditions of the GPU and the CPU are considered; the current load of the GPU is not. As a result, when a GPU has a large remaining storage space but a high load, the running speed of the micro-service cannot be guaranteed and the operating efficiency of the GPU drops.
Disclosure of Invention
Aiming at the technical problems, the technical scheme adopted by the invention is as follows:
a data processing system that allocates GPUs, comprising: a database, a processor, and a memory storing a computer program, wherein the database comprises: initial GPU list G = { G 1 ,……,G i ,……,G n },G i =(G i 0 ,X i ,C i ,L i ),G i 0 Is ID, X of ith initial GPU i Is G i 0 Corresponding size of the first space, C i Is G i 0 Corresponding size of the second space, L i Is G i 0 And the corresponding utilization rate, i is 1 to n, n is the number of the GPUs, the first space is the storage space of the GPU, the second space is the storage space of the CPU, and when a computer program is executed by a processor, the following steps are realized:
S100: acquire a target micro-service information set F_0 = (X_0^min, C_0^min), where F_0 is the target micro-service ID, X_0^min is the minimum first space that F_0 needs to occupy, and C_0^min is the minimum second space that F_0 needs to occupy;
S200: based on X_i and X_0^min, obtain an intermediate GPU list G' = {G'_1, …, G'_e, …, G'_h}, with G'_e = (G'_e^0, X'_e, C'_e, L'_e), where G'_e^0 is the ID of the e-th intermediate GPU, X'_e is the first-space size corresponding to G'_e^0, C'_e is the second-space size corresponding to G'_e^0, and L'_e is the utilization value corresponding to G'_e^0; e ranges from 1 to h, and h is the number of intermediate GPUs; an intermediate GPU is an initial GPU satisfying the condition X'_e > X_0^min;
S300: based on C'_e and C_0^min, acquire a target GPU list A = {A_1, …, A_r, …, A_s}, with A_r = (A_r^0, X_r, C_r, L_r), where A_r^0 is the ID of the r-th target GPU, X_r is the first-space size corresponding to A_r^0, C_r is the second-space size corresponding to A_r^0, and L_r is the utilization value corresponding to A_r^0; r ranges from 1 to s, and s is the number of target GPUs; a target GPU is an intermediate GPU satisfying the condition C_r > C_0^min;
S400: according to A and the micro-service information set, obtain a GPU scheduling priority list R = (R_1, …, R_r, …, R_s), where R_r is the scheduling priority of the r-th target GPU and satisfies the following condition:
[Equation rendered as an image in the original publication and not preserved in this extraction; per the variables below, R_r is a weighted combination of the average utilization value E_r^0, the ratio X'_r / X_0^max, and the ratio C'_r / C_0^max, with weights w1, w2, and w3.]
where E_r^0 is the average utilization value of A_r, X'_r is the r-th critical first-space value, C'_r is the r-th critical second-space value, X_0^max is the maximum first space that F_0 needs to occupy, C_0^max is the maximum second space that F_0 needs to occupy, w1 is the first weight, w2 is the second weight, and w3 is the third weight;
S500: traverse R, and take the target GPU corresponding to the maximum value R_max in R as the key GPU that runs F_0.
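As a concrete illustration, the selection flow of S100–S500 can be sketched in Python. This is an illustrative reconstruction, not the patented implementation: the published priority equation survives only as an image, so the weighted score below (and every name, weight, and sample value) is an assumption consistent with the described variables.

```python
# Illustrative sketch of steps S100-S500. The weighted score is a hypothetical
# stand-in for the patent's priority formula, which is published only as an
# image; variable names, weights, and sample figures are assumptions.

def select_gpu(gpus, x0_min, c0_min, x0_max, c0_max, w1=0.5, w2=0.3, w3=0.2):
    """gpus: dicts with id, X (free GPU space), C (free CPU space of its host),
    E (average utilization in [0, 1])."""
    # S200: intermediate GPUs -- enough remaining first space (GPU storage)
    mid = [g for g in gpus if g["X"] > x0_min]
    # S300: target GPUs -- host CPU also has enough remaining second space
    targets = [g for g in mid if g["C"] > c0_min]
    if not targets:
        return None

    def priority(g):
        # S403/S406: clamp remaining space at the service's stated maxima
        x_crit = min(g["X"], x0_max)
        c_crit = min(g["C"], c0_max)
        # Hypothetical S400 score: low utilization and ample space score high
        return w1 * (1 - g["E"]) + w2 * x_crit / x0_max + w3 * c_crit / c0_max

    # S500: the target GPU with the maximum priority runs the micro-service
    return max(targets, key=priority)["id"]

gpus = [
    {"id": "gpu0", "X": 10, "C": 32, "E": 0.9},   # roomy but heavily loaded
    {"id": "gpu1", "X": 12, "C": 16, "E": 0.2},   # roomy and lightly loaded
    {"id": "gpu2", "X": 2,  "C": 64, "E": 0.1},   # too little GPU space (S200)
]
print(select_gpu(gpus, x0_min=4, c0_min=8, x0_max=8, c0_max=16))  # gpu1
```

With these illustrative numbers the lightly loaded gpu1 is chosen over the roomier but busy gpu0, and gpu2 is filtered out at S200.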
The invention has at least the following beneficial effects. A micro-service information set is obtained, comprising the minimum GPU storage space and the minimum CPU storage space the micro-service needs to occupy; a target GPU list is derived from it; a scheduling priority index list of the target GPUs is obtained; and the GPU corresponding to the maximum value in that list is selected to run the micro-service. Because the priority index considers not only the remaining video memory and main memory but also the average utilization value of each GPU, the running speed of the micro-service is guaranteed, its running efficiency is improved, and time resources are saved.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described here are only some embodiments of the invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart illustrating a computer program executed by a data processing system for allocating GPUs according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiment of the invention provides a data processing system for allocating a GPU, comprising: a database, a processor, and a memory storing a computer program, wherein the database comprises an initial GPU list G = {G_1, …, G_i, …, G_n}, with G_i = (G_i^0, X_i, C_i, L_i), where G_i^0 is the ID of the i-th initial GPU, X_i is the first-space size corresponding to G_i^0, C_i is the second-space size corresponding to G_i^0, and L_i is the utilization value corresponding to G_i^0; i ranges from 1 to n, and n is the number of GPUs; the first space is the storage space of a GPU and the second space is the storage space of a CPU.
Specifically, the ID of the GPU is a unique identifier of the GPU, and the identifier of the GPU may be an index number of the GPU or a user-defined serial number.
Further, the processor includes a GPU shared-device plug-in, which queries G_i^0 to obtain X_i.
Further, the GPU shared-device plug-in queries the remaining storage space of G_i^0 through the NVML library. As those skilled in the art know, any method of querying the remaining storage space of a GPU through the NVML library falls within the protection scope of the present invention and is not described again here.
When executed by a processor, the computer program performs the following steps, as shown in fig. 1:
S100: acquire a target micro-service information set F_0 = (X_0^min, C_0^min), where F_0 is the target micro-service ID, X_0^min is the minimum first space that F_0 needs to occupy, and C_0^min is the minimum second space that F_0 needs to occupy.
Specifically, the target microservice ID is a unique identity of the target microservice, where the target microservice is a to-be-processed computing task that needs to be run using a GPU.
Further, X_0^min and C_0^min are expressed in the same unit.
Further, the unit of X_0^min is GB (gigabytes).
Further, before S100, the following steps are also included:
S110: obtain the ID list P = {P_1, …, P_i, …, P_n} of the target CPUs corresponding to the initial GPUs, where P_i is the ID of the target CPU corresponding to G_i.
Specifically, the ID of the target CPU corresponding to G_i is the ID of the CPU in the processor where G_i resides.
Further, the ID of the target CPU is a unique identifier of the CPU, and the CPU identifier may be a factory-owned identifier of the CPU or a user-defined serial number.
S130, according to P, obtaining C i
Specifically, the processor further comprises a CPU shared-device plug-in, which is a tool for viewing CPU utilization and performance, for example ps or top; the CPU shared-device plug-in is used to query the remaining storage space of P_i. Those skilled in the art will appreciate that any method of querying the remaining storage space of a CPU falls within the protection scope of the present invention.
Preferably, ps is used to query the remaining storage space of P_i: on the one hand, the occupation of the CPU's total storage space can be queried, and on the other hand, the storage space occupied by any individual micro-service on the CPU can be queried.
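As a rough illustration of deriving per-process memory occupation from ps output, the sketch below parses a captured sample of `ps -o pid,rss,comm` text rather than invoking ps itself, so it runs anywhere; the sample rows and the KiB-to-MiB conversion are illustrative assumptions, not part of the patent.

```python
# Sketch of the S130 idea: per-process memory occupation from ps-style output.
# The sample text stands in for live `ps -o pid,rss,comm` output; rows are
# made up for the example.

sample = """\
  PID   RSS COMMAND
 1201 204800 microservice-a
 1377  51200 microservice-b
"""

def rss_by_command(ps_text):
    rows = ps_text.strip().splitlines()[1:]   # skip the header line
    usage = {}
    for row in rows:
        _pid, rss_kib, comm = row.split(None, 2)
        usage[comm] = int(rss_kib) / 1024     # resident set size, KiB -> MiB
    return usage

usage = rss_by_command(sample)
print(usage)  # {'microservice-a': 200.0, 'microservice-b': 50.0}
```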
S200, based on X i And X 0 min Obtaining an intermediate GPU list G ʹ = { G ʹ = 1 ,……,Gʹ e ,……,Gʹ h },Gʹ e =(Gʹ e 0 ,Xʹ e ,Cʹ e ,Lʹ e ),Gʹ e 0 Is the ID of the e-th intermediate GPU, X ʹ e Is G ʹ e 0 Corresponding to the size of the first space, C ʹ e Is G ʹ e 0 Corresponding to the size of the second space, L ʹ e Is G ʹ e 0 The value of e is 1 to h, h is the number of intermediate GPUs which meet the requirement of X ʹ e >X 0 min Initial GPU of the condition.
S300, C ʹ -based e min And C 0 min Acquiring a target GPU list A = { A = { (A) } 1 ,……,A r ,……,A s },A r =(A r 0 ,X r ,C r ,L r ),A r 0 Is ID, X of the r-th target GPU r Is A r 0 Corresponding size of the first space, C r Is A r 0 Corresponding size of the second space, L r Is A r 0 The value of r is 1 s, s is the number of target GPUs, and the target GPUs satisfy C r >C 0 min Conditional intermediate GPU.
S400, according to the A and the micro service information set, obtaining a GUP scheduling priority list R = (R) 1 ,……,R r ,……R s ),R r For scheduling priority, R, corresponding to the R-th GPU r The following conditions are met:
Figure 610769DEST_PATH_IMAGE002
wherein E is 0 r Is A r X ʹ r Is the r critical first space value, C ʹ r Is the r critical second spatial value, X 0 max Is F 0 Corresponding maximum value of first space required to be occupied, C 0 max Is F 0 Corresponding second space required to be occupiedW1 is the first weight, w2 is the second weight, and w3 is the third weight.
Specifically, w1+ w2+ w3=1.
Further, w1 > w2 > w3. When both the first-space size and the second-space size of a GPU meet the micro-service's running requirements, the GPU's current utilization determines how efficiently it will run the target micro-service. Setting w1 > w2 > w3 therefore gives the utilization value the largest weight, so that the target micro-service is assigned a more suitable GPU and the working efficiency of the GPU is improved.
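A small numeric check of this design choice, using a hypothetical weighted score (the patent's exact formula is published only as an image): with the utilization weight dominant, a lightly loaded GPU outranks a heavily loaded one even when the loaded GPU has slightly more free space.

```python
# Numeric check of the weight ordering w1 > w2 > w3. The scoring function is
# a hypothetical stand-in for the image-only formula; weights and inputs are
# assumptions for the example.

def score(util, x_frac, c_frac, w1=0.6, w2=0.25, w3=0.15):
    # x_frac / c_frac: clamped remaining space divided by the service's maxima
    return w1 * (1 - util) + w2 * x_frac + w3 * c_frac

busy = score(util=0.95, x_frac=1.0, c_frac=1.0)   # ample space, 95% loaded
idle = score(util=0.10, x_frac=0.8, c_frac=0.9)   # a bit less space, 10% loaded
print(idle > busy)  # True: the lightly loaded GPU is preferred
```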
Further, the step S400 further includes the following steps:
S401: obtain the target first-space value list X_0 = (X_0^1, …, X_0^r, …, X_0^s) corresponding to the target GPUs, where X_0^r satisfies the following condition:
[Equation rendered as an image in the original publication; per the description below, X_0^r is the average of the sampled values: X_0^r = (1/z) Σ_{g=1}^{z} X_0^rg.]
where X_0^rg is the first-space size of A_r at the g-th time node in a preset time period T_0; g ranges from 1 to z, and z is the number of time nodes in T_0.
Specifically, the target first-space value corresponding to A_r is the average remaining storage space of A_r over the preset time period T_0.
Further, T_0 = 48 h.
Further, z satisfies the following condition:
z = T_0 / t;
where t is a first preset query time threshold.
Specifically, the first preset query time threshold is the interval at which the first space corresponding to A_r is queried within the preset time period T_0; those skilled in the art can set the value of t according to the actual situation, and it is not described again here.
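The relationship z = T_0 / t and the averaging in S401 can be illustrated with toy figures; T_0 = 48 h is as stated above, while the 30-minute query interval and the sampled free-space values are assumptions for the example.

```python
# Toy illustration of z = T0 / t and the S401 average.

T0 = 48 * 60                      # stated window length of 48 h, in minutes
t = 30                            # hypothetical query interval t, in minutes
z = T0 // t                       # number of time nodes in the window
print(z)                          # 96

samples = [10.0, 9.5, 8.0, 9.0]   # free first space (GB) at four of the nodes
x0_r = sum(samples) / len(samples)
print(x0_r)                       # 9.125 -- average remaining first space
```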
S402, acquiring F according to the micro-service information set 0 Corresponding maximum value X of first space required to be occupied 0 max
Specifically, X_0^max can be understood as the upper limit of the GPU storage space that F_0 needs to occupy while running. As those skilled in the art know, any method of obtaining the maximum GPU storage space a micro-service needs to occupy falls within the protection scope of the present invention and is not described again here.
S403, mixing X 0 And X 0 max And comparing to obtain a key first space value list X '= (X' 1 ,……,X' r ,……,X' s ),X' r Is the r-th key first space value, X' r =min(X 0 r ,X 0 max );
S404, obtaining A r Corresponding target second spatial value list C 0 =(C 0 1 ,……,C 0 r ,……,C 0 s ),C 0 r The following conditions are met:
Figure 388287DEST_PATH_IMAGE004
wherein, C rx Is A r At a preset time period T 0 The size of the second space of the xth time node, the value range of x is 1 to q, and q is a preset time period T 0 The number of medium time nodes.
Specifically, q satisfies the following condition:
q=T 0 /t';
wherein t' is a second preset query time threshold.
Specifically, the second preset query time threshold is the interval at which the second space corresponding to A_r is queried within the preset time period T_0; those skilled in the art can set the value of t' according to the actual situation, and it is not described again here.
Preferably, t = t'. Setting the second preset query interval equal to the first makes the query frequencies for the GPU and the CPU identical, which avoids errors caused by mismatched query frequencies and improves accuracy.
S405, according to the micro service information set, obtaining F 0 Corresponding maximum value C of the second space required to be occupied 0 max
Specifically, C_0^max can be understood as the upper limit of the CPU storage space that F_0 needs to occupy while running. As those skilled in the art know, any method of obtaining the maximum CPU storage space a micro-service needs to occupy falls within the protection scope of the present invention and is not described again here.
S406, adding C 0 And C max j And comparing to obtain a key second space value list C '= (C' 1 ,……,C' r ,……,C' s ),C' r Is the r critical second space value, C' r =min(C 0 ,C max j );
S407, obtaining GUP scheduling priority index L r
In S401 to S407, the target video-memory list and target memory list corresponding to each GPU, the maximum GPU storage space the micro-service needs to occupy, the maximum CPU storage space it needs to occupy, and the average GPU utilization are obtained, and the GPU scheduling priority index is calculated from these data.
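The clamping used in S403 and S406 can be illustrated as follows; `critical` is a hypothetical helper name, and the figures are made up. The "critical" value that enters the priority index is the average remaining space capped at the maximum the micro-service can actually use, so surplus beyond that cap earns no extra score.

```python
# Sketch of the min-clamp in S403/S406: critical value = remaining space
# capped at the micro-service's stated maximum requirement.

def critical(avg_remaining, required_max):
    return min(avg_remaining, required_max)

print(critical(12.0, 8.0))  # 8.0 -- capped: extra free space is ignored
print(critical(6.0, 8.0))   # 6.0 -- below the cap the average passes through
```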
S500, traversing R, and obtaining the maximum value R from L max Corresponding target GPU as run F 0 The key GPU of (1).
The invention provides a data processing system for allocating a GPU (graphics processing unit). A micro-service information set is obtained, comprising the minimum GPU storage space and the minimum CPU storage space the micro-service needs to occupy. A target GPU list is obtained: the remaining storage space of each target GPU is larger than the minimum GPU storage space the micro-service needs, and the remaining storage space of the CPU in the processor where the target GPU resides is larger than the minimum CPU storage space the micro-service needs. A scheduling priority index list of the target GPUs is obtained, and the GPU corresponding to the maximum value in that list is selected to run the micro-service. In computing the priority index, not only the remaining video memory and main memory but also the average utilization value of each GPU is considered, so the running speed of the micro-service is guaranteed, its running efficiency is improved, and time resources are saved.
The present specification provides method steps as described in the examples or flowcharts, but may include more or fewer steps based on routine or non-inventive labor. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. In actual system or server product execution, sequential execution or parallel execution (e.g., parallel processor or multithreaded processing environments) may occur according to the embodiments or methods shown in the figures.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus and computer device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference may be made to some descriptions of the method embodiments for relevant points.
Although some specific embodiments of the present invention have been described in detail by way of example, it should be understood by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. It will also be appreciated by those skilled in the art that various modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (8)

1. A data processing system for allocating a GPU, characterized in that the system comprises: a database, a processor, and a memory storing a computer program, wherein the database comprises an initial GPU list G = {G_1, …, G_i, …, G_n}, with G_i = (G_i^0, X_i, C_i, L_i), where G_i^0 is the ID of the i-th initial GPU, X_i is the first-space size corresponding to G_i^0, C_i is the second-space size corresponding to G_i^0, and L_i is the utilization value corresponding to G_i^0; i ranges from 1 to n, and n is the number of GPUs; the first space is the storage space of a GPU and the second space is the storage space of a CPU; when the computer program is executed by the processor, the following steps are realized:
S100: acquire a target micro-service information set F_0 = (X_0^min, C_0^min), where F_0 is the target micro-service ID, X_0^min is the minimum first space that F_0 needs to occupy, and C_0^min is the minimum second space that F_0 needs to occupy;
S200: based on X_i and X_0^min, obtain an intermediate GPU list G' = {G'_1, …, G'_e, …, G'_h}, with G'_e = (G'_e^0, X'_e, C'_e, L'_e), where G'_e^0 is the ID of the e-th intermediate GPU, X'_e is the first-space size corresponding to G'_e^0, C'_e is the second-space size corresponding to G'_e^0, and L'_e is the utilization value corresponding to G'_e^0; e ranges from 1 to h, and h is the number of intermediate GPUs; an intermediate GPU is an initial GPU satisfying the condition X'_e > X_0^min;
S300: based on C'_e and C_0^min, obtain a target GPU list A = {A_1, …, A_r, …, A_s}, with A_r = (A_r^0, X_r, C_r, L_r), where A_r^0 is the ID of the r-th target GPU, X_r is the first-space size corresponding to A_r^0, C_r is the second-space size corresponding to A_r^0, and L_r is the utilization value corresponding to A_r^0; r ranges from 1 to s, and s is the number of target GPUs; a target GPU is an intermediate GPU satisfying the condition C_r > C_0^min;
S400: according to A and the micro-service information set, obtain a GPU scheduling priority list R = (R_1, …, R_r, …, R_s), where R_r is the scheduling priority corresponding to the r-th GPU and satisfies the following condition:
[Equation rendered as an image in the original publication and not preserved in this extraction; per the variables below, R_r is a weighted combination of the average utilization value E_r^0, the ratio X'_r / X_0^max, and the ratio C'_r / C_0^max, with weights w1, w2, and w3.]
where E_r^0 is the average utilization value of A_r, X'_r is the r-th critical first-space value, C'_r is the r-th critical second-space value, X_0^max is the maximum first space that F_0 needs to occupy, C_0^max is the maximum second space that F_0 needs to occupy, w1 is the first weight, w2 is the second weight, and w3 is the third weight;
S500: traverse R, and take the target GPU corresponding to the maximum value R_max in R as the key GPU that runs F_0.
2. The system according to claim 1, characterized in that S400 further comprises the following steps:
S401: obtain the target first-space value list X_0 = (X_0^1, …, X_0^r, …, X_0^s) corresponding to A_r, where X_0^r satisfies the following condition:
[Equation rendered as an image in the original publication; per the wherein clause, X_0^r is the average of the sampled values: X_0^r = (1/z) Σ_{g=1}^{z} X_rg.]
where X_rg is the first-space size of A_r at the g-th time node in a preset time period T_0; g ranges from 1 to z, and z is the number of time nodes in T_0;
S402: according to the micro-service information set, obtain the maximum first space X_0^max that F_0 needs to occupy;
S403: compare X_0 with X_0^max to obtain the critical first-space value list X' = (X'_1, …, X'_r, …, X'_s), where X'_r is the r-th critical first-space value and X'_r = min(X_0^r, X_0^max);
S404, obtaining A r Corresponding target second spatial value list C 0 =(C 0 1 ,……,C 0 r ,……,C 0 s ),C 0 r The following conditions are met:
Figure DEST_PATH_IMAGE006
wherein, C rx Is A r At a preset time period T 0 The size of the second space of the xth time node, the value range of x is 1 to q, and q is a preset time period T 0 The number of medium time nodes;
S405: according to the micro-service information set, obtain the maximum second space C_0^max that F_0 needs to occupy;
S406: compare C_0 with C_0^max to obtain the critical second-space value list C' = (C'_1, …, C'_r, …, C'_s), where C'_r is the r-th critical second-space value and C'_r = min(C_0^r, C_0^max);
S407: obtain the GPU scheduling priority index R_r.
3. The system of claim 1, wherein w1+ w2+ w3=1.
4. The system of claim 3, wherein w1 > w2 > w3.
5. The system of claim 2, wherein z satisfies the following condition:
z=T 0 /t;
wherein t is a first preset query time threshold.
6. The system of claim 2, wherein q satisfies the following condition:
q=T 0 /tʹ;
wherein t' is a second preset query time threshold.
7. The system of claim 6, wherein t = t ʹ.
8. The system of claim 1, wherein in S400, E_r^0 satisfies the following condition:
[Equation rendered as an image in the original publication; per the wherein clause, E_r^0 is the average of the sampled utilization values: E_r^0 = (1/z) Σ_{g=1}^{z} E_rg.]
where E_rg is the utilization value of A_r at the g-th time node in the preset time period T_0.
CN202211063127.9A 2022-08-31 2022-08-31 Data processing system for distributing GPU Active CN115145734B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211063127.9A CN115145734B (en) 2022-08-31 2022-08-31 Data processing system for distributing GPU

Publications (2)

Publication Number Publication Date
CN115145734A CN115145734A (en) 2022-10-04
CN115145734B (en) 2022-11-25

Family

ID=83415577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211063127.9A Active CN115145734B (en) 2022-08-31 2022-08-31 Data processing system for distributing GPU

Country Status (1)

Country Link
CN (1) CN115145734B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110806930A (en) * 2019-10-29 2020-02-18 浙江大华技术股份有限公司 Micro-service scheduling method, device, equipment and storage device
CN111324471A (en) * 2020-01-22 2020-06-23 远景智能国际私人投资有限公司 Service adjusting method, device, equipment and storage medium
CN111367642A (en) * 2020-03-09 2020-07-03 中国铁塔股份有限公司 Task scheduling execution method and device
US11196641B1 (en) * 2021-01-08 2021-12-07 Optumi, Inc. Using cost and performance constraint information to identify networked equipment on which to place a computer software application

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10223109B2 (en) * 2016-12-22 2019-03-05 Juniper Networks, Inc. Automatic scaling of microservices applications


Also Published As

Publication number Publication date
CN115145734A (en) 2022-10-04

Similar Documents

Publication Publication Date Title
CN111046045B (en) Method, device, equipment and storage medium for processing data inclination
US20050177833A1 (en) Method and apparatus for reassigning objects to processing units
EP1253516A2 (en) Apparatus and method for scheduling processes on a fair share basis
EP3451727A1 (en) Access scheduling method and device for terminal, and computer storage medium
US7725900B2 (en) Method of assigning objects to processing units
JP6129290B1 (en) Method and system for recommending application parameter settings and system specification settings in distributed computing
US8549058B2 (en) Multi-dimensional transform for distributed memory network
CN112506650A (en) Resource allocation method, system, computer device and storage medium
US7664858B2 (en) Method for balancing load between processors in a multi-processor environment
CN114036031B (en) Scheduling system and method for resource service application in enterprise digital middleboxes
US20130166751A1 (en) Distributed resource management systems and methods for resource management thereof
CN115145734B (en) Data processing system for distributing GPU
CN114327884A (en) Automatic capacity expansion and reduction method, device, equipment and readable storage medium
US7647592B2 (en) Methods and systems for assigning objects to processing units
JP5577745B2 (en) Cluster system, process allocation method, and program
CN114338694B (en) One-stop cloud data center server scheduling method and system
US20170346889A1 (en) Co-locating application instances
Borst et al. Task allocation in a multi-server system
JP2004302525A (en) Device and method for carrying out performance balance evaluation and sizing of component of computer system
CN115629853A (en) Task scheduling method and device
CN102656564B (en) Reduce the expense in application process
CN115658292A (en) Resource scheduling method, device, computer equipment and storage medium
JP4594877B2 (en) Computer resource allocation management method and computer resource allocation management apparatus
CN110046040B (en) Distributed task processing method and system and storage medium
CN113608847A (en) Task processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant