WO2022088659A1 - Resource scheduling method and apparatus, electronic device, storage medium, and program product - Google Patents


Info

Publication number
WO2022088659A1
WO2022088659A1 · PCT/CN2021/095292 · CN2021095292W
Authority
WO
WIPO (PCT)
Prior art keywords
gpus
gpu
resource scheduling
virtual
screening
Application number
PCT/CN2021/095292
Other languages
French (fr)
Chinese (zh)
Inventor
霍明明
张炜
陈界
朴元奎
陈宇恒
Original Assignee
北京市商汤科技开发有限公司 (Beijing SenseTime Technology Development Co., Ltd.)
Application filed by 北京市商汤科技开发有限公司 (Beijing SenseTime Technology Development Co., Ltd.)
Priority to KR1020217037982A, published as KR20220058844A
Publication of WO2022088659A1


Classifications

    • G06F9/5027: Allocation of resources, e.g. of the central processing unit [CPU], to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F9/5077: Logical partitioning of resources; management or configuration of virtualized resources
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

A resource scheduling method and apparatus, an electronic device, a storage medium, and a program product. The method comprises: receiving a resource scheduling request for a graphics processing unit (GPU) in a GPU cluster (S201), the resource scheduling request comprising grouping information of a GPU to be requested, and the grouping information of the GPU to be requested being determined according to a task type of a task processing request corresponding to the resource scheduling request; according to the grouping information of the GPU to be requested, matching, in all GPUs of the GPU cluster, a GPU having the grouping information of the GPU to be requested, so as to obtain a matching result (S202), the matching result comprising at least one target GPU corresponding to the grouping information of the GPU to be requested; and returning the matching result (S203).

Description

Resource scheduling method and apparatus, electronic device, storage medium and program product
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based on, and claims priority to, Chinese patent application No. 202011158231.7, filed on October 26, 2020 and entitled "Resource Scheduling Method and Device, Electronic Equipment and Storage Medium", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present application relates to the technical field of artificial intelligence, and in particular to a resource scheduling method and apparatus, an electronic device, a storage medium, and a program product.
BACKGROUND
Artificial intelligence (AI) is currently a mainstream field that aims to make machines more intelligent, so that they can handle complex work that would otherwise require human intelligence, thereby facilitating human life and production. For example, a smartphone no longer needs a manually entered passcode; the screen can be unlocked simply by face recognition. An important way to make machines more intelligent is machine learning. At present, machine learning can be divided into two categories: one makes computers simulate human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their own performance; the other extracts hidden, valid, and understandable knowledge from large amounts of data.
The second category of machine learning above requires data, algorithms, and computing power; computing power in turn requires the support of computer hardware resources such as graphics processing units (GPUs), so that it can better bring the algorithms and data into play. A large-scale cluster often comprises multiple physical machines, each of which includes multiple GPUs. When a scheduling apparatus receives a resource scheduling request, it schedules resources among the GPUs of all these physical machines; however, current scheduling approaches are all random, making it impossible to precisely control resource usage.
SUMMARY
Embodiments of the present application provide a resource scheduling method and apparatus, an electronic device, a storage medium, and a program product, so as to precisely control resource usage and improve resource scheduling efficiency and resource utilization.
In a first aspect, an embodiment of the present application provides a resource scheduling method, including: receiving a resource scheduling request for a GPU in a graphics processing unit (GPU) cluster, where the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information of the GPU to be requested is determined according to a task type of a task processing request corresponding to the resource scheduling request; matching, according to the grouping information of the GPU to be requested, GPUs having the grouping information of the GPU to be requested among all GPUs of the GPU cluster, to obtain a matching result, where the matching result includes at least one target GPU corresponding to the grouping information of the GPU to be requested; and returning the matching result.
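The core matching flow of the first aspect can be sketched as follows. This is a minimal illustration: the `Gpu` record, its field names, and the group label strings are assumptions made for the example, not structures defined in the application.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Gpu:
    node: str    # physical machine hosting this card (illustrative field)
    index: int   # card index on that machine (illustrative field)
    group: str   # grouping information, set in advance from the task type

def match_gpus(cluster: List[Gpu], requested_group: str) -> List[Gpu]:
    """Match, among all GPUs of the cluster, the GPUs whose grouping
    information equals that carried by the scheduling request."""
    return [gpu for gpu in cluster if gpu.group == requested_group]

# A 9-card machine grouped as in FIG. 3: cards 0-3 for model training,
# cards 4-8 for online prediction.
cluster = [Gpu("node-0", i, "model-training") for i in range(4)] \
        + [Gpu("node-0", i, "online-prediction") for i in range(4, 9)]
matching_result = match_gpus(cluster, "model-training")
```

Here the matching result contains the four target GPUs whose grouping information is the model training grouping information.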
In an optional implementation, each GPU includes at least one virtual GPU (vGPU), and the resource scheduling request further includes computing parameters and a quantity of vGPUs. After matching, according to the grouping information of the GPU to be requested, GPUs having the grouping information of the GPU to be requested among all GPUs of the GPU cluster, the method further includes: screening, according to the computing parameters and quantity of vGPUs, the matching result for vGPUs that satisfy the computing parameters and quantity; and returning the vGPUs that satisfy the computing parameters and quantity.
In an optional implementation, screening the matching result, according to the computing parameters and quantity of vGPUs, for vGPUs that satisfy the resource scheduling request includes: screening the matching result for vGPUs that satisfy the computing parameters, to obtain a first screening result; and screening the first screening result for vGPU resources that satisfy the required quantity of vGPUs.
In an optional implementation, the computing parameters include at least one of computing power and video memory. Screening the matching result for vGPUs that satisfy the computing parameters to obtain the first screening result includes: obtaining the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is greater than the priority of the video memory, screening each target GPU for vGPUs that satisfy the computing power requirement of the vGPUs of the resource scheduling request, to obtain a second screening result; and screening the second screening result for vGPUs that satisfy the video memory requirement of the vGPUs of the resource scheduling request, to obtain the first screening result.
In an optional implementation, the computing parameters include at least one of computing power and video memory. Screening the matching result for vGPUs that satisfy the computing parameters to obtain the first screening result includes: obtaining the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is less than the priority of the video memory, screening each target GPU for vGPUs that satisfy the video memory requirement of the vGPUs of the resource scheduling request, to obtain a third screening result; and screening the third screening result for vGPUs that satisfy the computing power requirement of the vGPUs of the resource scheduling request, to obtain the first screening result.
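The two symmetric screening orders above can be sketched as follows. The dictionary fields and the threshold semantics (a candidate satisfies a requirement when it offers at least the requested computing power or video memory) are assumptions made for the illustration, not definitions from the application.

```python
def first_screening(vgpus, req_power, req_memory, power_priority, memory_priority):
    """Screen candidate vGPUs in two passes, the pass order decided by
    whichever of computing power and video memory has the higher priority."""
    meets_power = lambda v: v["power"] >= req_power
    meets_memory = lambda v: v["memory"] >= req_memory
    if power_priority > memory_priority:
        second = [v for v in vgpus if meets_power(v)]   # second screening result
        return [v for v in second if meets_memory(v)]   # first screening result
    third = [v for v in vgpus if meets_memory(v)]       # third screening result
    return [v for v in third if meets_power(v)]         # first screening result

vgpus = [{"power": 50, "memory": 8},
         {"power": 30, "memory": 16},
         {"power": 60, "memory": 4}]
result = first_screening(vgpus, req_power=40, req_memory=8,
                         power_priority=2, memory_priority=1)
```

With simple threshold filters the two orders yield the same final set; the priority mainly determines which intermediate (second or third) screening result is produced.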
In an optional implementation, screening the first screening result for vGPU resources that satisfy the required quantity of vGPUs includes: if the quantity of vGPUs in the first screening result is greater than the quantity of vGPU resources required by the resource scheduling request, selecting, from the first screening result, vGPU resources in the required quantity in ascending order of the computing parameters; if the quantity of vGPUs in the first screening result is equal to the quantity of vGPU resources required by the resource scheduling request, returning the first screening result; and if the quantity of vGPUs in the first screening result is less than the quantity of vGPU resources required by the resource scheduling request, returning prompt information indicating that the screening result is empty.
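The three quantity cases above can be sketched as one selection function; returning `None` to stand in for the "screening result is empty" prompt is an assumption of this sketch, as are the field names.

```python
def select_by_quantity(first_screening, required, key="power"):
    """Apply the quantity rule to the first screening result:
    more candidates than required -> take them in ascending order of the
    computing parameter (leaving larger vGPUs free for later requests);
    exactly enough -> return them all;
    too few -> the screening result is empty (modelled here as None)."""
    if len(first_screening) > required:
        return sorted(first_screening, key=lambda v: v[key])[:required]
    if len(first_screening) == required:
        return first_screening
    return None

candidates = [{"id": "vgpu-a", "power": 60},
              {"id": "vgpu-b", "power": 30},
              {"id": "vgpu-c", "power": 50}]
picked = select_by_quantity(candidates, 2)  # smallest two by computing power
```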
In an optional implementation, the resource scheduling request includes the task type of the task processing request corresponding to the resource scheduling request; vGPUs in different GPUs have corresponding tags, and the tag corresponding to a vGPU is determined according to the task type of the task processing request corresponding to the resource scheduling request. The method further includes: matching, according to the task type of the task processing request corresponding to the resource scheduling request, at least one tag corresponding to that task type; and taking the vGPUs corresponding to the at least one tag as the matching result.
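The tag-based matching can be sketched as a lookup over pre-assigned tags; the tag scheme and vGPU identifiers below are illustrative assumptions, since the application does not fix a concrete representation.

```python
def match_by_task_type(vgpu_tags, task_type):
    """Return the vGPUs whose pre-assigned tags include a tag matching
    the task type of the task processing request."""
    return [vgpu for vgpu, tags in vgpu_tags.items() if task_type in tags]

# Tags were assigned in advance when the cluster was grouped.
vgpu_tags = {
    "node-0/gpu-0/vgpu-0": {"model-training"},
    "node-0/gpu-4/vgpu-0": {"online-prediction"},
    "node-0/gpu-4/vgpu-1": {"online-prediction"},
}
matching_result = match_by_task_type(vgpu_tags, "online-prediction")
```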
In a second aspect, an embodiment of the present application provides a resource scheduling apparatus, including: a receiving module, configured to receive a resource scheduling request for a GPU in a GPU cluster, where the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information of the GPU to be requested is determined according to a task type of a task processing request corresponding to the resource scheduling request; a first matching module, configured to match, according to the grouping information of the GPU to be requested, GPUs having the grouping information of the GPU to be requested among all GPUs of the GPU cluster, to obtain a matching result, where the matching result includes at least one target GPU corresponding to the grouping information of the GPU to be requested; and a first returning module, configured to return the matching result.
In an optional implementation, each GPU includes at least one vGPU, and the resource scheduling request further includes computing parameters and a quantity of vGPUs. The apparatus further includes: a screening module, configured to screen, according to the computing parameters and quantity of vGPUs, the matching result for vGPUs that satisfy the computing parameters and quantity; and a second returning module, configured to return the vGPUs that satisfy the computing parameters and quantity.
In an optional implementation, the screening module includes: a first screening unit, configured to screen the matching result for vGPUs that satisfy the computing parameters, to obtain a first screening result; and a second screening unit, configured to screen the first screening result for vGPU resources that satisfy the required quantity of vGPUs.
In an optional implementation, the computing parameters include at least one of computing power and video memory. The first screening unit is configured to: obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is greater than the priority of the video memory, screen each target GPU for vGPUs that satisfy the computing power requirement of the vGPUs of the resource scheduling request, to obtain a second screening result; and screen the second screening result for vGPUs that satisfy the video memory requirement of the vGPUs of the resource scheduling request, to obtain the first screening result.
In an optional implementation, the computing parameters include at least one of computing power and video memory. The first screening unit is configured to: obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is less than the priority of the video memory, screen each target GPU for vGPUs that satisfy the video memory requirement of the vGPUs of the resource scheduling request, to obtain a third screening result; and screen the third screening result for vGPUs that satisfy the computing power requirement of the vGPUs of the resource scheduling request, to obtain the first screening result.
In an optional implementation, the second screening unit is configured to, if the quantity of vGPUs in the first screening result is greater than the quantity of vGPU resources required by the resource scheduling request, select, from the first screening result, vGPU resources in the required quantity in ascending order of the computing parameters.
In an optional implementation, the second screening unit is configured to, if the quantity of vGPUs in the first screening result is equal to the quantity of vGPU resources required by the resource scheduling request, return the first screening result.
In an optional implementation, the second screening unit is configured to, if the quantity of vGPUs in the first screening result is less than the quantity of vGPU resources required by the resource scheduling request, return prompt information indicating that the screening result is empty.
In an optional implementation, the resource scheduling request includes the task type of the task processing request corresponding to the resource scheduling request; vGPUs in different GPUs have corresponding tags, and the tag corresponding to a vGPU is determined according to the task type of the task processing request corresponding to the resource scheduling request. The apparatus further includes: a second matching module, configured to match, according to the task type of the task processing request corresponding to the resource scheduling request, at least one tag corresponding to that task type, and to take the vGPUs corresponding to the at least one tag as the matching result.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory; a processor; and a computer program, where the computer program is stored in the memory and configured to be executed by the processor to implement the method of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program is executed by a processor to implement the method of the first aspect.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product, including computer-readable code, where, when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method of the first aspect.
In the resource scheduling method and apparatus, electronic device, storage medium, and program product provided by the embodiments of the present application, a resource scheduling request for a GPU in a GPU cluster is received, where the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information is determined according to a task type of a task processing request corresponding to the resource scheduling request; then, according to the grouping information, GPUs having that grouping information are matched among all GPUs of the GPU cluster; and finally a matching result including at least one target GPU corresponding to the grouping information is returned. Since the resource scheduling request includes the grouping information of the GPU to be requested, and that grouping information is determined according to the task type of the corresponding task processing request, the corresponding GPU can be matched according to the grouping information during GPU resource scheduling, thereby achieving finer-grained resource scheduling and precise control over GPU usage.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram of an application scenario provided by an embodiment of the present application;
FIG. 2 is a flowchart of a resource scheduling method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of grouping the GPUs of a physical machine provided by an embodiment of the present application;
FIG. 4A is a schematic diagram of a single online prediction task provided by an embodiment of the present application;
FIG. 4B is a schematic diagram of multiple online prediction tasks provided by an embodiment of the present application;
FIG. 5 is a flowchart of a resource scheduling method provided by another embodiment of the present application;
FIG. 6 is a schematic diagram of vGPUs in a physical machine provided by an embodiment of the present application;
FIG. 8 is a block diagram of an electronic device provided by an embodiment of the present application. FIG. 7 is a schematic structural diagram of a resource scheduling apparatus provided by an embodiment of the present application.
The above drawings show explicit embodiments of the present disclosure, which are described in more detail below. These drawings and written descriptions are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the present disclosure to those skilled in the art with reference to specific embodiments.
DETAILED DESCRIPTION
Exemplary embodiments will be described in detail here, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
FIG. 1 is a diagram of an application scenario provided by an embodiment of the present application. As shown in FIG. 1, the application scenario includes a user terminal 11, an AI algorithm apparatus 12, a scheduling apparatus 13, and a GPU cluster 14. The user terminal 11 may include electronic devices such as a smartphone, a tablet computer, or a personal computer. The GPU cluster 14 is a computer cluster including multiple computer nodes, each of which is equipped with one or more GPUs.
In some optional scenarios, a user may submit a task processing request through the user terminal 11, for example a model training task or an online prediction task in an AI scenario. The task processing request submitted by the user is sent to the AI algorithm apparatus 12, which generates a resource scheduling request according to the task processing request and sends it to the scheduling apparatus 13. The scheduling apparatus 13 then performs resource scheduling in the GPU cluster 14 according to the resource scheduling request and returns the resource scheduling result to the AI algorithm apparatus 12; that is, it allocates the resources required by the task processing request among the GPUs of the GPU cluster 14, so that each GPU completes its assigned work and the task processing request submitted by the user is ultimately processed.
In the above resource scheduling process, the minimum scheduling unit of resources in the prior art is a physical machine. For example, assuming that the GPU cluster 14 includes four physical machines, the prior art can only schedule at the granularity of physical machines.
To address the above technical problem, the embodiments of the present application adopt the following technical solution: the minimum scheduling unit (a physical machine) of the GPU cluster 14 is divided at a finer granularity, and all GPUs in the GPU cluster 14 are tagged in advance according to the types of tasks that the GPU cluster 14 needs to process. In this way, when a task processing request from a user is subsequently received, GPUs with the corresponding tags can be screened according to the task type of the request, thereby achieving finer-grained resource scheduling and precise control over GPU usage.
It should be noted that the AI algorithm apparatus 12 may be an independent apparatus or device, or may be a module or component integrated into the user terminal 11, which is not specifically limited in this embodiment.
The embodiments of the present application can be applied to any artificial intelligence scenario, for example the field of intelligent video analysis.
The technical solutions of the present application, and how they solve the above technical problems, are described in detail below with specific embodiments. The following specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present application are described below with reference to the accompanying drawings.
FIG. 2 is a flowchart of a resource scheduling method provided by an embodiment of the present application. As shown in FIG. 2, the resource scheduling method includes the following steps S201 to S203:
Step S201: receive a resource scheduling request for a GPU in the graphics processing unit (GPU) cluster 14.
The execution body of this embodiment is the scheduling apparatus 13 shown in FIG. 1. The scheduling apparatus 13 receives a resource scheduling request from the AI algorithm apparatus 12; the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information is determined according to the task type of the task processing request corresponding to the resource scheduling request. Task types may be divided according to the purpose of the task. For example, in an AI scenario, the task types include model training and online prediction; correspondingly, the grouping information of the GPU to be requested includes model training grouping information and online prediction grouping information.
For example, a user submits a task processing request whose task type is model training to the AI algorithm apparatus 12. The AI algorithm apparatus 12 generates a resource scheduling request according to the task processing request and, according to the task type corresponding to the task processing request, determines that the grouping information of the GPU to be requested is the model training grouping information.
在一种可选的实施方式中,待请求GPU的分组信息可以由AI算法装置12指定,若AI算法装置12未指定待请求GPU的分组信息,则默认为GPU集群14中的所有GPU都是可用的。In an optional implementation manner, the grouping information of the GPUs to be requested may be specified by the AI algorithm device 12. If the AI algorithm device 12 does not specify the grouping information of the GPUs to be requested, the default is that all GPUs in the GPU cluster 14 are usable.
Step S202: according to the grouping information of the GPUs to be requested, match, among all GPUs in the GPU cluster 14, the GPUs having that grouping information to obtain a matching result.

The matching result includes at least one target GPU corresponding to the grouping information of the GPUs to be requested.

The GPU cluster 14 includes multiple physical machines, and each physical machine includes multiple GPUs. In this embodiment, all GPUs in the GPU cluster 14 need to be grouped before step S201. During grouping, the GPUs may be grouped by purpose, and the purpose of a GPU may be determined according to the task types of the task processing requests that the GPU cluster 14 needs to execute. Taking one physical machine as an example, the GPU grouping process is described in detail below:
FIG. 3 is a schematic diagram of grouping the GPUs of one physical machine according to an embodiment of the present application. As shown in FIG. 3, the physical machine 31 is a 9-card machine (a physical machine including 9 GPU cards), numbered card 0 to card 8. Suppose the user plans to run model training and online prediction tasks on this physical machine at the same time, using cards 0 to 3 for model training and cards 4 to 8 for online prediction. The grouping information of cards 0 to 3 can then be set to the model training grouping information, and the grouping information of cards 4 to 8 to the online prediction grouping information. For example, the model training grouping information may be recorded as label A (Label-A), and the online prediction grouping information as label B (Label-B).

In an optional implementation, all GPUs in the GPU cluster 14 may be represented as a list, with each GPU corresponding to its grouping information. Taking a physical machine including 9 GPU cards as an example, the list of all its GPUs takes the form of Table 1 below:

Table 1: Grouping information of all GPUs in one physical machine
GPU card number    Grouping information
Card 0             Model training
Card 1             Model training
Card 2             Model training
Card 3             Model training
Card 4             Online prediction
Card 5             Online prediction
Card 6             Online prediction
Card 7             Online prediction
Card 8             Online prediction
As shown in Table 1, when a resource scheduling request is subsequently received: if the GPU grouping information carried in the request is the model training grouping information, the GPUs of cards 0 to 3 are matched; if it is the online prediction grouping information, the GPUs of cards 4 to 8 are matched.

Of course, GPUs in different physical machines may also be divided into one group. For example, the GPU cluster 14 includes physical machine 1, physical machine 2, and physical machine 3, where physical machine 1 includes GPU0, GPU1, and GPU2; physical machine 2 includes GPU3, GPU4, and GPU5; and physical machine 3 includes GPU6, GPU7, and GPU8. Then GPU1 and GPU2 in physical machine 1, GPU5 in physical machine 2, and GPU8 in physical machine 3 may be divided into the same group.

By grouping all GPUs in the GPU cluster 14, each group can be regarded as a resource pool, which achieves logical isolation between GPU resources.
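The grouping-based matching of steps S201 to S203 reduces to a label lookup over the cluster's GPU list. A minimal sketch follows; the names `Gpu` and `match_by_group` and the label strings are illustrative assumptions, not part of the original disclosure:

```python
from dataclasses import dataclass

@dataclass
class Gpu:
    card_id: int   # card number within its physical machine
    machine: str   # physical machine the card belongs to
    group: str     # grouping information, e.g. "model-training"

def match_by_group(cluster: list[Gpu], requested_group: str) -> list[Gpu]:
    """Return the GPUs whose grouping information matches the request.

    If the request carries no grouping information, all GPUs in the
    cluster are considered available by default (as described above).
    """
    if not requested_group:
        return list(cluster)
    return [gpu for gpu in cluster if gpu.group == requested_group]

# The 9-card physical machine of FIG. 3: cards 0-3 labeled for model
# training, cards 4-8 labeled for online prediction.
cluster = [Gpu(i, "machine-31",
               "model-training" if i <= 3 else "online-prediction")
           for i in range(9)]

matched = match_by_group(cluster, "model-training")
print([gpu.card_id for gpu in matched])  # cards 0 to 3
```

Because the match is a pure label comparison, groups may span physical machines, as in the GPU1/GPU2/GPU5/GPU8 example above, without changing the lookup.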
Step S203: return the matching result.

The matching result includes at least one target GPU corresponding to the grouping information of the GPUs to be requested.

In an optional implementation, the matching result may be expressed in the form of a list. After obtaining the matching result, the scheduling device 13 generates a GPU list according to the matching result and returns the GPU list to the AI algorithm device 12. In one example, assuming the matching result is cards 0 to 3, the GPU list may take the form of Table 2 below:

Table 2: GPU list
Card 0
Card 1
Card 2
Card 3
In this embodiment of the present application, a resource scheduling request for GPUs in the GPU cluster 14 is received, where the request includes grouping information of the GPUs to be requested, and that grouping information is determined according to the task type of the task processing request corresponding to the resource scheduling request. Then, according to the grouping information, GPUs having that grouping information are matched among all GPUs in the GPU cluster 14; finally, a matching result including at least one target GPU corresponding to the grouping information is returned. Because the resource scheduling request includes the grouping information of the GPUs to be requested, and that grouping information is determined by the task type, the corresponding GPUs can be matched according to the grouping information during GPU resource scheduling, thereby achieving finer-grained resource scheduling and precise control over GPU usage.

The present application can improve the controllability of resource scheduling for AI algorithm applications in vGPU mode. For example, consider one 8-card GPU machine, where cards 0 to 3 use vGPU mode for resource allocation and cards 4 to 7 use non-vGPU mode. In the prior art, the GPU selection is random, and it is impossible to ensure that vGPU-mode requests are scheduled onto cards 0 to 3. With the resource scheduling method of the embodiments of the present application, cards 0 to 3 are given a vGPU label; when resources are requested, the scheduling device 13 allocates resources among the GPUs carrying the vGPU label, so resource usage can be controlled precisely.

In addition, the resource scheduling method of the present application supports the isolation and classified use of GPU resources on a single GPU machine, maximizing resource utilization under different requirements. For example, a user with tight resources has only one 8-card GPU machine but wants to run model training tasks and online prediction tasks on it at the same time, with good isolation so that they do not affect each other. In this scenario, static assignment is usually used, but static assignment is time-consuming and labor-intensive. With the resource scheduling method of the present application, some GPU cards are labeled for model training and the others for online prediction; when the two types of tasks (model training tasks and online prediction tasks) subsequently request resources, the AI algorithm device 12 simply notifies the scheduling device 13 to use GPU cards with the corresponding label. The choice of which GPU card to use is made by the scheduling device 13 without user participation, which improves usability to a certain extent.
The above embodiments describe resource scheduling at the GPU level. In a single-task scheduling scenario, one task can be served by one GPU card; in a multi-task parallel scheduling scenario, however, more GPU cards are needed to satisfy the concurrency requirements. For example, a city restricts motor vehicle traffic, and many cameras are installed on a city road to monitor the vehicles driving on it. When a vehicle violating the restriction rules is detected, the camera photographs the vehicle, and a notification is then sent to the owner prompting payment of a fine. In this process, after the camera captures an image, the vehicles in the image need to be recognized and enclosed in rectangular boxes, and then the license plate information is recognized. License plate recognition requires an online prediction task. As shown in FIG. 4A, if the image captured by the camera includes one vehicle, there is only one online prediction task, and one GPU card is sufficient. In practice, however, as shown in FIG. 4B, the captured image often includes multiple vehicles, and there are correspondingly multiple online prediction tasks. If GPU-level resource scheduling were used, these multiple online prediction tasks would be allocated to multiple GPUs, so GPU resources would not be fully utilized, wasting expensive GPU resources. Therefore, each GPU can be further divided into smaller scheduling units: virtualization technology is used to virtualize each GPU in FIG. 1 into multiple vGPUs, and the multiple parallel online prediction tasks are allocated to different vGPUs, so that multiple tasks share the same GPU, thereby improving the resource utilization of a single GPU. On the basis of the above embodiments, the present application can also implement resource scheduling in a GPU sharing scenario, as follows:
FIG. 5 is a flowchart of a resource scheduling method provided by another embodiment of the present application. On the basis of the above embodiments, the resource scheduling request may further include the computing parameters of the vGPUs and the number of vGPUs, where the number of vGPUs is N, and N is a positive integer greater than 0. As shown in FIG. 5, the resource scheduling method provided by this embodiment includes the following steps:

Step S501: according to the computing parameters and the number of vGPUs, screen vGPUs satisfying the resource scheduling request from the matching result.

In an optional implementation, this step may screen, from the matching result, vGPUs satisfying the computing-parameter and quantity requirements of the vGPUs corresponding to the resource scheduling request.

FIG. 6 is a schematic diagram of vGPUs in one physical machine according to an embodiment of the present application. As shown in FIG. 6, each GPU may in turn be divided into multiple vGPUs (shown as circles in FIG. 6). It should be noted that each GPU in FIG. 6 including 3 vGPUs is merely an example and does not limit the number of vGPUs.

Step S501 is executed after the matching result is obtained in step S202. The matching result in this embodiment may be represented by a GPU list, and may further include computing parameters such as the computing power (vcore) and/or the video memory (vmemory) of each vGPU of each target GPU, where the computing power of a vGPU refers to its computing capability.

Assuming that the GPU list includes cards 0 to 3, another form of the GPU list may be as shown in Table 3 below:

Table 3: GPU list
[Table 3, listing the computing power and video memory of each vGPU on cards 0 to 3, appears only as an image in the original publication and is not reproduced here.]
In an optional implementation, step S501 may further include the following steps:

Step S501a: screen vGPUs satisfying the computing parameters from the matching result to obtain a first screening result.

In an optional implementation, the vGPUs satisfying the computing parameters may be shown in the form of a list including at least one such vGPU. If the computing parameters required by the task processing request submitted by the user include computing power, and the computing power requested for the vGPUs in the resource scheduling request is 3.5, 3.0, 5.2, and 6.1 respectively, then the vGPUs in Table 3 satisfying the computing power requirement of the resource scheduling request (the first screening result) include: vGPU-2, vGPU-4, vGPU-8, vGPU-9, vGPU-10, vGPU-11, and vGPU-12. The first screening result may likewise be given in the form of a list, as in Table 4 below:

Table 4: First screening result
[Table 4, listing the first screening result (vGPU-2, vGPU-4, vGPU-8, vGPU-9, vGPU-10, vGPU-11, vGPU-12), appears only as an image in the original publication and is not reproduced here.]
If the computing parameters required by the task processing request submitted by the user include video memory, and the video memory requested for each vGPU in the resource scheduling request is 6 GB, 8 GB, 8 GB, and 6 GB respectively, then the vGPUs satisfying the resource scheduling request include: vGPU-3, vGPU-6, vGPU-7, vGPU-8, vGPU-10, vGPU-11, and vGPU-12.

If the computing parameters required by the task processing request include both computing power and video memory, with the requested computing power of each vGPU being 3.5, 3.0, 5.2, and 6.1 and the requested video memory being 6 GB, 8 GB, 8 GB, and 6 GB, then the vGPUs satisfying the resource scheduling request include: vGPU-2, vGPU-3, vGPU-4, vGPU-6, vGPU-7, vGPU-8, vGPU-9, vGPU-10, vGPU-11, and vGPU-12.

Step S501b: from the first screening result, screen vGPU resources satisfying the vGPU quantity requirement in the resource scheduling request.

That is, this step selects N vGPUs from the first screening result.

Assuming the number of vGPUs required by the task processing request submitted by the user is 4, 4 vGPUs further need to be selected from Table 4. In an optional implementation, the 4 vGPUs may be selected randomly from Table 4. In another optional implementation, the first 4 vGPUs may be selected from Table 4 in ascending order of computing power or video memory. Taking the case where the computing parameters include computing power as an example, the vGPUs satisfying the computing power requirement include: vGPU-2, vGPU-4, vGPU-8, vGPU-9, vGPU-10, vGPU-11, and vGPU-12; further, 4 vGPUs may be selected at random from these 7 vGPUs as the vGPUs satisfying the computing parameters and quantity.
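One plausible reading of the two-stage screening in steps S501a and S501b can be sketched as follows. The function names, the threshold interpretation (a vGPU qualifies if its computing power reaches the smallest requested value, which is consistent with the Table 4 example), and the random-selection policy are illustrative assumptions, not part of the original disclosure:

```python
import random

def screen_by_compute(vgpus: dict[str, float],
                      required_powers: list[float]) -> list[str]:
    """Step S501a: keep the vGPUs whose computing power can satisfy at
    least one of the per-vGPU computing-power requirements in the
    request, i.e. power >= the smallest requested value."""
    threshold = min(required_powers)
    return [name for name, power in vgpus.items() if power >= threshold]

def select_n(candidates: list[str], n: int) -> list[str]:
    """Step S501b: pick N vGPUs from the first screening result; random
    selection is one of the optional implementations described above."""
    if len(candidates) < n:
        return []  # the cluster cannot satisfy the quantity requirement
    return random.sample(candidates, n)

# Hypothetical per-vGPU computing power, including one vGPU below the
# smallest requested value so that the first screening drops it.
vgpus = {"vGPU-1": 2.0, "vGPU-2": 3.5, "vGPU-4": 5.2, "vGPU-9": 3.0}
first = screen_by_compute(vgpus, [3.5, 3.0, 5.2, 6.1])
print(first)               # vGPU-1 is screened out
print(select_n(first, 3))  # any 3 of the remaining vGPUs
```

A screen on video memory, or on both parameters, follows the same shape with a different predicate.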
Step S502: return the vGPUs satisfying the resource scheduling request.

In an optional implementation, step S502 may return, to the AI algorithm device 12, the vGPUs satisfying both the computing-parameter requirements and the quantity requirement.

In this embodiment, the matching result is filtered and screened a second time. The first round filters by grouping information: when the GPU cluster 14 is very large, the grouping information filters out many GPUs outside the screening scope, so the second round of screening operates over a much smaller range, which can greatly improve resource scheduling efficiency. By contrast, in the prior art the scheduling device 13 must screen all GPUs in the GPU cluster 14 one by one for resources satisfying the computing-parameter and quantity requirements; if the cluster is large, the screening range and screening time are correspondingly large, making resource scheduling inefficient.

The above embodiments describe determining the vGPUs jointly according to the computing parameters and the number N. If the computing parameters include both computing power and video memory, determining the vGPUs according to both may take either of the following two forms:
In one optional implementation: a first screening is performed on the matching result according to the computing power requested by the resource scheduling request, and a second screening is then performed on that result according to the video memory requested by the resource scheduling request. In an optional implementation, the computing parameters include at least one of computing power and video memory, and the screening of vGPUs satisfying the computing parameters from the matching result to obtain the first screening result, introduced in step S501a, includes the following steps:

Step a1: obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU.

Step a2: if the priority of the computing power is greater than the priority of the video memory, screen, in each target GPU, the vGPUs satisfying the computing power requirement of the resource scheduling request to obtain a second screening result.

Step a3: screen, in the second screening result, the vGPUs satisfying the video memory requirement of the resource scheduling request to obtain the first screening result.
In another optional implementation: a first screening is performed on the matching result according to the video memory requested by the resource scheduling request, and a second screening is then performed on that result according to the computing power requested by the resource scheduling request. In an optional implementation, the computing parameters include at least one of computing power and video memory, and determining the vGPUs satisfying the computing power and the video memory in the matching result, introduced in step S501a, includes:

Step b1: obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU.

Step b2: if the priority of the computing power is less than the priority of the video memory, screen, in each target GPU, the vGPUs satisfying the video memory requirement of the resource scheduling request to obtain a third screening result.

Step b3: screen, in the third screening result, the vGPUs satisfying the computing power requirement of the resource scheduling request to obtain the first screening result.
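Steps a1–a3 and b1–b3 differ only in which parameter is screened first. This can be sketched as below; the function name and the per-vGPU record layout `(computing power, video memory)` are assumptions for illustration:

```python
def screen_by_priority(vgpus: dict[str, tuple[float, int]],
                       min_power: float, min_memory: int,
                       power_first: bool) -> list[str]:
    """Screen in priority order: the higher-priority parameter is
    applied first (steps a2/b2), and the other parameter is applied to
    its result (steps a3/b3).

    vgpus maps a vGPU name to (computing power, video memory in GB).
    """
    def by_power(names):
        return [n for n in names if vgpus[n][0] >= min_power]

    def by_memory(names):
        return [n for n in names if vgpus[n][1] >= min_memory]

    names = list(vgpus)
    if power_first:  # computing power has the higher priority (a1-a3)
        return by_memory(by_power(names))
    return by_power(by_memory(names))  # video memory first (b1-b3)

vgpus = {"vGPU-2": (3.5, 4), "vGPU-3": (2.0, 8), "vGPU-8": (6.1, 8)}
print(screen_by_priority(vgpus, 3.0, 6, power_first=True))   # ['vGPU-8']
```

Note that with simple threshold predicates the final set is the same in either order; the priority mainly controls which screening shrinks the candidate set first.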
On the basis of the above embodiments, after the vGPUs matching the computing power and/or the video memory have been screened out of the matching result, the following cases may arise:

In a first optional implementation, the number of vGPUs in the first screening result is greater than the number of vGPUs requested by the resource scheduling request. In this case, the requested number of vGPUs further needs to be screened out of the first screening result (i.e., N vGPUs are selected from the first screening result). For example, the first screening result includes 5 vGPUs; if the number of vGPUs requested is 4, then 4 vGPUs need to be further screened out of these 5, and the scheduling device 13 returns these 4 vGPUs to the AI algorithm device 12.

In a second optional implementation, the number of vGPUs in the first screening result is equal to the number of vGPUs requested; in this case, the vGPUs in the first screening result are directly returned as the target vGPUs. For example, the first screening result includes 5 vGPUs; if the number requested is 5, these 5 vGPUs are directly returned to the AI algorithm device 12.

In a third optional implementation, the number of vGPUs in the first screening result is less than the number of vGPUs requested; in this case, a message with an empty result is returned. For example, the first screening result includes 5 vGPUs; if the number requested is 7, the first screening result cannot satisfy the quantity requirement of the resource scheduling request, meaning the GPU cluster 14 cannot satisfy the request, and the scheduling device 13 returns a message with an empty result to the AI algorithm device 12 to notify it that the GPU cluster 14 cannot satisfy the resource scheduling request.

In the first optional implementation above, when N vGPUs are screened out of the first screening result, the first screening result may be sorted in ascending order of the computing parameter, and the number of vGPU resources required by the resource scheduling request is then selected in that order; that is, the first N vGPUs in the sorted result are selected.
For example, in an implementation where the computing parameters include computing power, the first screening result may be sorted in ascending order of computing power, and the first N vGPUs are selected from it. Suppose the first screening result is as shown in Table 5 below:

Table 5: First screening result
vGPU number         Computing power
Card 0: vGPU-2      3.5
Card 1: vGPU-4      5.2
Card 2: vGPU-8      6.1
Card 2: vGPU-9      3.0
Card 3: vGPU-10     6.1
Card 3: vGPU-11     5.2
Card 3: vGPU-12     3.0
Sorting the first screening result in ascending order of computing power gives Table 6 below:

Table 6: First screening result after sorting
vGPU number         Computing power
Card 3: vGPU-12     3.0
Card 2: vGPU-9      3.0
Card 0: vGPU-2      3.5
Card 1: vGPU-4      5.2
Card 3: vGPU-11     5.2
Card 2: vGPU-8      6.1
Card 3: vGPU-10     6.1
As can be seen from Table 6, 7 vGPUs satisfy the computing power requirement. Assuming the number of vGPUs requested by the resource scheduling request is 5, the first 5 vGPUs in Table 6 can be selected and returned to the AI algorithm device 12.

In an optional implementation, if the computing parameters include video memory, the number of vGPU resources required by the resource scheduling request is selected from the first screening result in ascending order of video memory. The implementation where the computing parameters include video memory is similar to that where they include computing power; refer to the implementation of selecting the required number of vGPU resources from the first screening result in ascending order of computing power, which is not repeated here.

In an optional implementation, if the computing parameters include both computing power and video memory, it may further be decided, according to preset priorities of computing power and video memory, whether the N vGPU resources are selected from the first screening result in ascending order of computing power or of video memory.

In this embodiment, during the second screening according to the computing parameters and quantity, the available vGPUs obtained from the first screening are sorted in ascending order of the computing parameter, and screening preferentially selects the smallest resources that can satisfy the demand (small jobs). This maximizes the utilization of existing resources and reduces fragmentation, and the remaining resources can satisfy the needs of long jobs as far as possible, thereby improving resource utilization.
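The smallest-fit selection described above — sort ascending by the computing parameter, take the first N, and signal an empty result when fewer than N candidates exist — can be sketched as follows (the helper name is hypothetical, and the tie order between equal computing-power values is not specified in the original, so equal entries may come out in either order):

```python
def select_smallest_fit(first_screening: dict[str, float], n: int) -> list[str]:
    """Sort the first screening result in ascending order of the
    computing parameter (as in Table 6) and select the first N vGPUs.
    An empty list corresponds to the empty-result message, i.e. the
    cluster cannot satisfy the quantity requirement."""
    if len(first_screening) < n:
        return []
    ranked = sorted(first_screening, key=first_screening.get)
    return ranked[:n]

# Table 5 data: the first screening result before sorting.
table5 = {"vGPU-2": 3.5, "vGPU-4": 5.2, "vGPU-8": 6.1, "vGPU-9": 3.0,
          "vGPU-10": 6.1, "vGPU-11": 5.2, "vGPU-12": 3.0}
print(select_smallest_fit(table5, 5))  # the five smallest vGPUs
```

With N = 5 this returns the five entries with the lowest computing power (3.0, 3.0, 3.5, 5.2, 5.2), leaving the two 6.1 vGPUs free for larger jobs, which is the fragmentation-reducing behavior described above.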
On the basis of the above embodiments, the resource scheduling request further includes the task type of the task processing request corresponding to the resource scheduling request; the vGPUs in different GPUs carry corresponding labels, and a vGPU's label is determined according to the task type of the task processing request corresponding to the resource scheduling request. The method of the embodiments of the present application further includes the following steps:

According to the task type of the task processing request corresponding to the resource scheduling request, match at least one label corresponding to that task type; and use the vGPUs corresponding to the at least one label as the matching result.

In this embodiment, the label corresponding to a vGPU in a GPU can be understood as the task type of the task processing request corresponding to the resource scheduling request. For example, referring again to FIG. 6, suppose that among the 27 vGPUs on cards 0 to 8 in FIG. 6, a subset — say 13 vGPUs — carry the model training task label, and these 13 vGPUs may be distributed over any at least two of cards 0 to 8, while the remaining 14 vGPUs carry the online prediction task label. Then, if the task type of the task processing request corresponding to the resource scheduling request is a model training task, the matching result is some or all of those 13 vGPUs distributed over any at least two of cards 0 to 8.
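The label-based vGPU matching mirrors the GPU-level grouping above, just at vGPU granularity. A minimal sketch, with hypothetical names and an illustrative 13/14 label assignment:

```python
def match_vgpus_by_label(vgpu_labels: dict[str, str], task_type: str) -> list[str]:
    """Return the vGPUs whose label matches the task type of the task
    processing request; these form the matching result."""
    return [name for name, label in vgpu_labels.items() if label == task_type]

# The 27 vGPUs of FIG. 6 (cards 0-8, 3 vGPUs each): 13 labeled for
# model training, the remaining 14 for online prediction.
vgpu_labels = {f"vGPU-{i}": ("model-training" if i < 13 else "online-prediction")
               for i in range(27)}

matched = match_vgpus_by_label(vgpu_labels, "model-training")
print(len(matched))  # 13
```

The computing-parameter and quantity screening of steps S501a/S501b would then run over this label-matched set rather than over all vGPUs of the cluster.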
FIG. 7 is a schematic structural diagram of a resource scheduling apparatus provided by an embodiment of the present application. The resource scheduling apparatus provided by this embodiment can execute the processing flow provided by the resource scheduling method embodiments. As shown in FIG. 7, the resource scheduling apparatus 70 includes a receiving module 71, a first matching module 72, and a first returning module 73. The receiving module 71 is configured to receive a resource scheduling request for GPUs in the graphics processing unit (GPU) cluster 14, where the resource scheduling request includes grouping information of the GPU to be requested, and the grouping information of the GPU to be requested is determined according to the task type of the task processing request corresponding to the resource scheduling request. The first matching module 72 is configured to match, among all GPUs in the GPU cluster 14 and according to the grouping information of the GPU to be requested, GPUs having that grouping information, to obtain a matching result that includes at least one target GPU corresponding to the grouping information of the GPU to be requested. The first returning module 73 is configured to return the matching result.
In an optional implementation, each GPU includes at least one vGPU, and the resource scheduling request further includes computing parameters and a quantity of vGPUs. The apparatus further includes: a screening module 74, configured to screen, in the matching result and according to the computing parameters and quantity of the vGPUs, vGPUs that satisfy the computing parameters and quantity; and a second returning module 75, configured to return the vGPUs that satisfy the computing parameters and quantity.
In an optional implementation, the screening module 74 includes: a first screening unit 741, configured to screen, in the matching result, vGPUs that satisfy the computing parameters to obtain a first screening result; and a second screening unit 742, configured to screen, in the first screening result, vGPU resources that satisfy the required quantity of vGPUs.
In an optional implementation, the computing parameters include at least one of computing power and video memory. The first screening unit 741 is configured to obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is higher than the priority of the video memory, to screen, in each target GPU, vGPUs that satisfy the computing-power requirement of the resource scheduling request to obtain a second screening result; and to screen, in the second screening result, vGPUs that satisfy the video-memory requirement of the resource scheduling request to obtain the first screening result.
In an optional implementation, the computing parameters include at least one of computing power and video memory. The first screening unit 741 is configured to obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is lower than the priority of the video memory, to screen, in each target GPU, vGPUs that satisfy the video-memory requirement of the resource scheduling request to obtain a third screening result; and to screen, in the third screening result, vGPUs that satisfy the computing-power requirement of the resource scheduling request to obtain the first screening result.
In an optional implementation, the second screening unit 742 is configured to: if the quantity of vGPUs in the first screening result is greater than the quantity of vGPU resources required by the resource scheduling request, select, from the first screening result and in ascending order of the computing parameters, a quantity of vGPU resources equal to the quantity required by the resource scheduling request; if the quantity of vGPUs in the first screening result is equal to the required quantity of vGPU resources, return the first screening result; and if the quantity of vGPUs in the first screening result is less than the required quantity of vGPU resources, return a prompt indicating that the screening result is empty.
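The priority-ordered, two-stage screening and the quantity rule described in the embodiments above can be sketched as follows. This is an illustrative simplification under assumed record fields (`power`, `mem`) and threshold values, not the patented implementation:

```python
def screen_vgpus(candidates, need_power, need_mem, power_first, count):
    """Two-stage screen: filter by the higher-priority parameter first,
    then by the other; finally apply the quantity rule."""
    # The order of the two filters depends on which computing parameter
    # has the higher priority for this request.
    first_key, first_need = (("power", need_power) if power_first
                             else ("mem", need_mem))
    second_key, second_need = (("mem", need_mem) if power_first
                               else ("power", need_power))
    stage = [v for v in candidates if v[first_key] >= first_need]
    stage = [v for v in stage if v[second_key] >= second_need]

    # Quantity rule: too few -> empty result (the "screening result is
    # empty" prompt); enough -> take the smallest-capacity vGPUs first,
    # i.e. ascending order of the computing parameters.
    if len(stage) < count:
        return []
    stage.sort(key=lambda v: (v[first_key], v[second_key]))
    return stage[:count]

pool = [
    {"id": "a", "power": 30, "mem": 8},
    {"id": "b", "power": 50, "mem": 16},
    {"id": "c", "power": 40, "mem": 4},
    {"id": "d", "power": 60, "mem": 12},
]
picked = screen_vgpus(pool, need_power=35, need_mem=8,
                      power_first=True, count=2)
print([v["id"] for v in picked])  # ['b', 'd']
```

Selecting in ascending order of the computing parameters leaves the larger-capacity vGPUs free for later, more demanding requests, which matches the selection order stated above.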
In an optional implementation, the resource scheduling request includes the task type of the task processing request corresponding to the resource scheduling request; vGPUs in different GPUs have corresponding tags, and the tag corresponding to a vGPU is determined according to the task type of the task processing request corresponding to the resource scheduling request. A second matching module 76 is configured to match, according to the task type of the task processing request corresponding to the resource scheduling request, at least one tag corresponding to that task type, and to take the vGPUs corresponding to the at least one tag as the matching result.
The resource scheduling apparatus in the embodiment shown in FIG. 7 can be used to execute the technical solutions of the foregoing method embodiments; its implementation principles and technical effects are similar and are not repeated here.
FIG. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. The electronic device provided by this embodiment can execute the processing flow provided by the resource scheduling method embodiments. As shown in FIG. 8, the electronic device 80 includes a memory 81, a processor 82, a computer program, and a communication interface 83, where the computer program is stored in the memory 81 and is configured to be executed by the processor 82 to perform the method steps of the foregoing method embodiments.
The electronic device in the embodiment shown in FIG. 8 can be used to execute the technical solutions of the foregoing method embodiments; its implementation principles and technical effects are similar and are not repeated here.
In addition, an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; the computer program is executed by a processor to implement the resource scheduling method described in the foregoing embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of the units is only a logical functional division, and in actual implementation there may be other divisions: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional modules is used as an example; in practical applications, the above functions may be assigned to different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or some of the functions described above. For the working process of the apparatus described above, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.
Industrial Applicability
In the embodiments of the present application, a resource scheduling request for GPUs in a graphics processing unit (GPU) cluster is received, where the resource scheduling request includes grouping information of the GPU to be requested, and the grouping information is determined according to the task type of the task processing request corresponding to the resource scheduling request. Then, according to the grouping information of the GPU to be requested, GPUs having that grouping information are matched among all GPUs in the GPU cluster; finally, a matching result including at least one target GPU corresponding to the grouping information is returned. Because the resource scheduling request includes the grouping information of the GPU to be requested, and that grouping information is determined according to the task type of the corresponding task processing request, GPU resource scheduling can match the corresponding GPUs according to this grouping information, thereby achieving finer-grained resource scheduling and precise control over GPU usage.
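The overall flow summarized above — receive a request carrying grouping information, match GPUs in the cluster by that grouping, and return the targets — can be sketched as follows. The group names and record fields are illustrative assumptions, not part of the disclosed system:

```python
# Hypothetical cluster inventory: each GPU carries grouping information
# that was assigned according to the task types it serves.
cluster = [
    {"gpu": "gpu-0", "group": "training"},
    {"gpu": "gpu-1", "group": "inference"},
    {"gpu": "gpu-2", "group": "training"},
]

def schedule(request, inventory):
    """Match, among all GPUs, those whose grouping information equals
    the grouping information carried by the resource scheduling request."""
    return [g for g in inventory if g["group"] == request["group"]]

result = schedule({"group": "training"}, cluster)
print([g["gpu"] for g in result])  # ['gpu-0', 'gpu-2']
```

A request whose task type maps to the "training" group is thus matched only to GPUs dedicated to that group, which is the fine-grained control the summary describes.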

Claims (21)

  1. A resource scheduling method, comprising:
    receiving a resource scheduling request for GPUs in a graphics processing unit (GPU) cluster, wherein the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information of the GPU to be requested is determined according to a task type of a task processing request corresponding to the resource scheduling request;
    matching, among all GPUs in the GPU cluster and according to the grouping information of the GPU to be requested, GPUs having the grouping information of the GPU to be requested, to obtain a matching result, wherein the matching result includes at least one target GPU corresponding to the grouping information of the GPU to be requested;
    returning the matching result.
  2. The method according to claim 1, wherein each GPU includes at least one virtual GPU, and the resource scheduling request further includes computing parameters and a quantity of virtual GPUs;
    after the matching, among all GPUs in the GPU cluster and according to the grouping information of the GPU to be requested, of GPUs having the grouping information of the GPU to be requested, the method further comprises:
    screening, in the matching result and according to the computing parameters and quantity of the virtual GPUs, virtual GPUs that satisfy the computing parameters and quantity;
    returning the virtual GPUs that satisfy the computing parameters and quantity.
  3. The method according to claim 2, wherein the screening, in the matching result and according to the computing parameters and quantity of the virtual GPUs, of virtual GPUs that satisfy the resource scheduling request comprises:
    screening, in the matching result, virtual GPUs that satisfy the computing parameters to obtain a first screening result;
    screening, in the first screening result, virtual GPU resources that satisfy the required quantity of virtual GPUs.
  4. The method according to claim 3, wherein the computing parameters include at least one of computing power and video memory; and the screening, in the matching result, of virtual GPUs that satisfy the computing parameters to obtain a first screening result comprises:
    obtaining priorities corresponding to the computing power and the video memory of each virtual GPU in each target GPU;
    if the priority of the computing power is higher than the priority of the video memory, screening, in each target GPU, virtual GPUs that satisfy the computing-power requirement of the resource scheduling request to obtain a second screening result;
    screening, in the second screening result, virtual GPUs that satisfy the video-memory requirement of the resource scheduling request to obtain the first screening result.
  5. The method according to claim 3, wherein the computing parameters include at least one of computing power and video memory; and the screening, in the matching result, of virtual GPUs that satisfy the computing parameters to obtain a first screening result comprises:
    obtaining priorities corresponding to the computing power and the video memory of each virtual GPU in each target GPU;
    if the priority of the computing power is lower than the priority of the video memory, screening, in each target GPU, virtual GPUs that satisfy the video-memory requirement of the resource scheduling request to obtain a third screening result;
    screening, in the third screening result, virtual GPUs that satisfy the computing-power requirement of the resource scheduling request to obtain the first screening result.
  6. The method according to any one of claims 3 to 5, wherein the screening, in the first screening result, of virtual GPU resources that satisfy the required quantity of virtual GPUs comprises:
    if the quantity of virtual GPUs in the first screening result is greater than the quantity of virtual GPU resources required by the resource scheduling request, selecting, from the first screening result and in ascending order of the computing parameters, a quantity of virtual GPU resources equal to the quantity required by the resource scheduling request.
  7. The method according to any one of claims 3 to 5, wherein the screening, in the first screening result, of virtual GPU resources that satisfy the required quantity of virtual GPUs comprises:
    if the quantity of virtual GPUs in the first screening result is equal to the quantity of virtual GPU resources required by the resource scheduling request, returning the first screening result.
  8. The method according to any one of claims 3 to 5, wherein the screening, in the first screening result, of virtual GPU resources that satisfy the required quantity of virtual GPUs comprises:
    if the quantity of virtual GPUs in the first screening result is less than the quantity of virtual GPU resources required by the resource scheduling request, returning a prompt indicating that the screening result is empty.
  9. The method according to any one of claims 3 to 5, wherein the resource scheduling request includes the task type of the task processing request corresponding to the resource scheduling request; virtual GPUs in different GPUs have corresponding tags, and the tag corresponding to a virtual GPU is determined according to the task type of the task processing request corresponding to the resource scheduling request; and the method further comprises:
    matching, according to the task type of the task processing request corresponding to the resource scheduling request, at least one tag corresponding to that task type;
    taking the virtual GPUs corresponding to the at least one tag as the matching result.
  10. A resource scheduling apparatus, comprising:
    a receiving module, configured to receive a resource scheduling request for GPUs in a graphics processing unit (GPU) cluster, wherein the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information of the GPU to be requested is determined according to a task type of a task processing request corresponding to the resource scheduling request;
    a first matching module, configured to match, among all GPUs in the GPU cluster and according to the grouping information of the GPU to be requested, GPUs having the grouping information of the GPU to be requested, to obtain a matching result, wherein the matching result includes at least one target GPU corresponding to the grouping information of the GPU to be requested;
    a first returning module, configured to return the matching result.
  11. The apparatus according to claim 10, wherein each GPU includes at least one virtual GPU, and the resource scheduling request further includes computing parameters and a quantity of virtual GPUs; and after the matching, among all GPUs in the GPU cluster and according to the grouping information of the GPU to be requested, of GPUs having the grouping information of the GPU to be requested, the apparatus further comprises:
    a screening module, configured to screen, in the matching result and according to the computing parameters and quantity of the virtual GPUs, virtual GPUs that satisfy the computing parameters and quantity;
    a second returning module, configured to return the virtual GPUs that satisfy the computing parameters and quantity.
  12. The apparatus according to claim 11, wherein the screening module comprises: a first screening unit, configured to screen, in the matching result, virtual GPUs that satisfy the computing parameters to obtain a first screening result; and a second screening unit, configured to screen, in the first screening result, virtual GPU resources that satisfy the required quantity of virtual GPUs.
  13. The apparatus according to claim 12, wherein the computing parameters include at least one of computing power and video memory; and the first screening unit is configured to obtain priorities corresponding to the computing power and the video memory of each virtual GPU in each target GPU; if the priority of the computing power is higher than the priority of the video memory, to screen, in each target GPU, virtual GPUs that satisfy the computing-power requirement of the resource scheduling request to obtain a second screening result; and to screen, in the second screening result, virtual GPUs that satisfy the video-memory requirement of the resource scheduling request to obtain the first screening result.
  14. The apparatus according to claim 12, wherein the computing parameters include at least one of computing power and video memory; and the first screening unit is configured to obtain priorities corresponding to the computing power and the video memory of each virtual GPU in each target GPU; if the priority of the computing power is lower than the priority of the video memory, to screen, in each target GPU, virtual GPUs that satisfy the video-memory requirement of the resource scheduling request to obtain a third screening result; and to screen, in the third screening result, virtual GPUs that satisfy the computing-power requirement of the resource scheduling request to obtain the first screening result.
  15. The apparatus according to any one of claims 12 to 14, wherein the second screening unit is configured to: if the quantity of virtual GPUs in the first screening result is greater than the quantity of virtual GPU resources required by the resource scheduling request, select, from the first screening result and in ascending order of the computing parameters, a quantity of virtual GPU resources equal to the quantity required by the resource scheduling request.
  16. The apparatus according to any one of claims 12 to 14, wherein the second screening unit is configured to: if the quantity of virtual GPUs in the first screening result is equal to the quantity of virtual GPU resources required by the resource scheduling request, return the first screening result.
  17. The apparatus according to any one of claims 12 to 14, wherein the second screening unit is configured to: if the quantity of virtual GPUs in the first screening result is less than the quantity of virtual GPU resources required by the resource scheduling request, return a prompt indicating that the screening result is empty.
  18. The apparatus according to any one of claims 12 to 14, wherein the resource scheduling request includes a task type of a task processing request corresponding to the resource scheduling request; virtual GPUs in different GPUs have corresponding tags, and the tag corresponding to a virtual GPU is determined according to the task type of the task processing request corresponding to the resource scheduling request; and the apparatus further comprises: a second matching module, configured to match, according to the task type of the task processing request corresponding to the resource scheduling request, at least one tag corresponding to that task type, and to take the virtual GPUs corresponding to the at least one tag as the matching result.
  19. An electronic device, comprising:
    a memory;
    a processor; and
    a computer program;
    wherein the computer program is stored in the memory and is configured to be executed by the processor to implement the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 9.
  21. A computer program product comprising computer-readable code, wherein, when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method according to any one of claims 1 to 9.
PCT/CN2021/095292 2020-10-26 2021-05-21 Resource scheduling method and apparatus, electronic device, storage medium, and program product WO2022088659A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020217037982A KR20220058844A (en) 2020-10-26 2021-05-21 Resource scheduling method and apparatus, electronic device, storage medium and program product

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011158231.7A CN112346859B (en) 2020-10-26 2020-10-26 Resource scheduling method and device, electronic equipment and storage medium
CN202011158231.7 2020-10-26

Publications (1)

Publication Number Publication Date
WO2022088659A1 true WO2022088659A1 (en) 2022-05-05

Family

ID=74358745

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/095292 WO2022088659A1 (en) 2020-10-26 2021-05-21 Resource scheduling method and apparatus, electronic device, storage medium, and program product

Country Status (3)

Country Link
KR (1) KR20220058844A (en)
CN (1) CN112346859B (en)
WO (1) WO2022088659A1 (en)


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112346859B (en) * 2020-10-26 2023-06-16 北京市商汤科技开发有限公司 Resource scheduling method and device, electronic equipment and storage medium
CN113204428B (en) * 2021-05-28 2023-01-20 北京市商汤科技开发有限公司 Resource scheduling method, device, electronic equipment and computer readable storage medium
CN114968272A (en) * 2022-05-31 2022-08-30 京东方科技集团股份有限公司 Algorithm operation method, device, equipment and storage medium
CN116302568A (en) * 2023-05-17 2023-06-23 算力互联(北京)科技有限公司 Computing power resource scheduling method and system, scheduling center and data center
CN116643893B (en) * 2023-07-27 2023-10-20 合肥中科类脑智能技术有限公司 Method and device for scheduling computing task, storage medium and server
CN116757915B (en) * 2023-08-16 2023-11-28 北京蓝耘科技股份有限公司 Cluster GPU resource scheduling method
CN117539639A (en) * 2024-01-05 2024-02-09 北京趋动智能科技有限公司 Video memory resource scheduling method, device, system, storage medium and electronic equipment

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2620873A1 (en) * 2012-01-27 2013-07-31 Samsung Electronics Co., Ltd Resource allocation method and apparatus of GPU
CN109144710A (en) * 2017-06-16 2019-01-04 中国移动通信有限公司研究院 Resource regulating method, device and computer readable storage medium
CN109375992A (en) * 2018-08-17 2019-02-22 华为技术有限公司 A kind of resource regulating method and device
CN109376011A (en) * 2018-09-26 2019-02-22 郑州云海信息技术有限公司 The method and apparatus of resource are managed in virtualization system
CN109634748A (en) * 2018-12-12 2019-04-16 深圳前海微众银行股份有限公司 Cluster resource dispatching method, device, equipment and computer readable storage medium
CN110503593A (en) * 2018-05-18 2019-11-26 微软技术许可有限责任公司 The scheduling of multiple graphics processing units
CN110941481A (en) * 2019-10-22 2020-03-31 华为技术有限公司 Resource scheduling method, device and system
CN112346859A (en) * 2020-10-26 2021-02-09 北京市商汤科技开发有限公司 Resource scheduling method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109074281B (en) * 2016-04-28 2022-05-24 华为技术有限公司 Method and device for distributing graphics processor tasks
US10262390B1 (en) * 2017-04-14 2019-04-16 EMC IP Holding Company LLC Managing access to a resource pool of graphics processing units under fine grain control
CN110688218B (en) * 2019-09-05 2022-11-04 广东浪潮大数据研究有限公司 Resource scheduling method and device
CN111158879B (en) * 2019-12-31 2024-03-22 上海依图网络科技有限公司 Scheduling method, device, machine-readable medium and system for system resources


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114820279A (en) * 2022-05-18 2022-07-29 北京百度网讯科技有限公司 Distributed deep learning method and device based on multiple GPUs and electronic equipment
CN114820279B (en) * 2022-05-18 2023-03-24 北京百度网讯科技有限公司 Distributed deep learning method and device based on multiple GPUs and electronic equipment
CN115965517A (en) * 2023-01-09 2023-04-14 摩尔线程智能科技(北京)有限责任公司 Graphics processor resource management method and device, electronic device and storage medium
CN115965517B (en) * 2023-01-09 2023-10-20 摩尔线程智能科技(北京)有限责任公司 Graphics processor resource management method and device, electronic equipment and storage medium
CN115981871A (en) * 2023-03-17 2023-04-18 苏州万店掌网络科技有限公司 GPU resource scheduling method, device, equipment and storage medium
CN115981871B (en) * 2023-03-17 2024-01-26 苏州万店掌网络科技有限公司 GPU resource scheduling method, device, equipment and storage medium
CN117687802A (en) * 2024-02-02 2024-03-12 湖南马栏山视频先进技术研究院有限公司 Deep learning parallel scheduling method and device based on cloud platform and cloud platform
CN117687802B (en) * 2024-02-02 2024-04-30 湖南马栏山视频先进技术研究院有限公司 Deep learning parallel scheduling method and device based on cloud platform and cloud platform

Also Published As

Publication number Publication date
CN112346859B (en) 2023-06-16
KR20220058844A (en) 2022-05-10
CN112346859A (en) 2021-02-09

Similar Documents

Publication Publication Date Title
WO2022088659A1 (en) Resource scheduling method and apparatus, electronic device, storage medium, and program product
CN109983480B (en) Training neural networks using cluster loss
CN110837410B (en) Task scheduling method and device, electronic equipment and computer readable storage medium
US10841236B1 (en) Distributed computer task management of interrelated network computing tasks
CN114741207B (en) GPU resource scheduling method and system based on multi-dimensional combination parallelism
CN110389816B (en) Method, apparatus and computer readable medium for resource scheduling
CN109657793B (en) Model training method and device, storage medium and electronic equipment
CN113946431B (en) Resource scheduling method, system, medium and computing device
CN103493088A (en) Label privileges
CN114416352A (en) Computing resource allocation method and device, electronic equipment and storage medium
CN112416585A (en) GPU resource management and intelligent scheduling method for deep learning
CN109598250A Feature extraction method, device, electronic equipment and computer-readable medium
CN115292014A (en) Image rendering method and device and server
CN111124644B (en) Method, device and system for determining task scheduling resources
CN115586961A (en) AI platform computing resource task scheduling method, device and medium
CN112148467A (en) Dynamic allocation of computing resources
CN116820714A (en) Scheduling method, device, equipment and storage medium of computing equipment
US11941519B2 (en) Machine learning training platform
CN109753353A Virtual machine resource allocation method, device and electronic equipment
CN115909009A (en) Image recognition method, image recognition device, storage medium and electronic equipment
CN111813541B (en) Task scheduling method, device, medium and equipment
CN111796934B (en) Task issuing method and device, storage medium and electronic equipment
CN115378806A (en) Flow distribution method and device, computer equipment and storage medium
CN112988383A (en) Resource allocation method, device, equipment and storage medium
CN115080242A (en) Method, device and medium for unified scheduling of PCI equipment resources

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021569056

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21884402

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21884402

Country of ref document: EP

Kind code of ref document: A1