CN111047505A - GPU multiplexing method, device, equipment and readable storage medium - Google Patents

GPU multiplexing method, device, equipment and readable storage medium

Info

Publication number
CN111047505A
Authority
CN
China
Prior art keywords
gpu
virtual
container
physical
virtualized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201911330448.9A
Other languages
Chinese (zh)
Inventor
王延家
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Inspur Data Technology Co Ltd
Original Assignee
Beijing Inspur Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Inspur Data Technology Co Ltd filed Critical Beijing Inspur Data Technology Co Ltd
Priority to CN201911330448.9A
Publication of CN111047505A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G06T 1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G06F 2009/45562 Creating, deleting, cloning virtual machine instances
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G06F 2009/45575 Starting, stopping, suspending or resuming virtual machine instances

Abstract

The invention discloses a GPU multiplexing method, device, equipment and readable storage medium. The method comprises the following steps: after receiving a request for multiplexing a physical GPU, determining the corresponding relationship between each virtualized container to be created and a virtual GPU, the virtual GPU being obtained by virtualizing the physical GPU; creating the virtualized containers according to the corresponding relationship; loading the device driver of the corresponding virtual GPU in each virtualized container, and mounting the virtual GPU into the corresponding virtualized container; and starting the virtualized containers so that they use the physical GPU in parallel. With this method, the virtualized containers use the physical GPU without time-slice restrictions and can use it simultaneously in parallel, so the processing capacity of the physical GPU is utilized more fully and the processing performance of the virtualized containers is improved.

Description

GPU multiplexing method, device, equipment and readable storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a GPU multiplexing method, apparatus, device, and readable storage medium.
Background
A Graphics Processing Unit (GPU), also called a display core, visual processor, display chip, or graphics chip, is a microprocessor dedicated to graphics operations on personal computers, workstations, game consoles, and some mobile devices (e.g., tablet computers, smartphones, etc.).
Currently, when GPU functions (i.e., those of a physical GPU device) are used in containers, it is common, in order to exploit the processing performance of the physical GPU as much as possible, to map and mount the physical GPU on the host into multiple containers. The multiple containers then time-division multiplex the same physical GPU. Because this approach is time-division multiplexed, only one container can actually use the physical GPU at any given moment, and the other containers cannot truly use the same physical GPU at the same time: only after the time slice of the current container is used up can one of the other containers actually use the physical GPU in the next time slice. In other words, existing physical GPU multiplexing is essentially parallel at the macro level but serial at the micro level, and in a multi-container scenario it significantly degrades the operating efficiency of the containers.
In summary, how to effectively solve the problems of physical GPU multiplexing and the like is a technical problem that needs to be solved urgently by those skilled in the art at present.
Disclosure of Invention
The invention aims to provide a GPU multiplexing method, device, equipment and readable storage medium, so that a physical GPU can be multiplexed in parallel and the service processing performance of containers is improved.
In order to solve the technical problems, the invention provides the following technical scheme:
a GPU multiplexing method comprises the following steps:
after receiving a request for multiplexing a physical GPU, determining the corresponding relationship between each virtualized container to be created and a virtual GPU; the virtual GPU is obtained by virtualizing the physical GPU;
creating the virtualized container according to the corresponding relation;
loading a corresponding device driver of the virtual GPU in each virtualization container, and mounting the virtual GPU into the corresponding virtualization container;
starting each of the virtualized containers so that the virtualized containers use the physical GPU in parallel.
Preferably, the virtualizing the physical GPU includes:
and virtualizing the physical GPU according to the physical GPU virtual parameters by using a virtualization technology to obtain at least two virtual GPUs.
Preferably, before virtualizing the physical GPU according to the physical GPU virtualization parameters by using the virtualization technology to obtain at least two virtual GPUs, the method further includes:
receiving and parsing a request for virtualizing the physical GPU, and determining the physical GPU virtualization parameters; the physical GPU virtualization parameters comprise a video memory size, a resolution, and an application scenario.
Preferably, creating the virtualized container comprises:
creating a virtual machine corresponding to each virtualized container;
and respectively creating the virtualization container in each virtual machine according to the corresponding relation.
Preferably, before creating the virtual container, the method further comprises:
compiling and manufacturing a virtual machine file system mirror image of the virtualization container; and the virtual machine file system image comprises a corresponding device driver of the virtual GPU.
Preferably, the determining the corresponding relationship between each to-be-created virtualization container and the virtual GPU includes:
assigning at least one of said virtual GPUs to each of said virtualized containers according to the processing performance requirements of the services to be carried on said virtualized containers; wherein one of said virtual GPUs is assignable to only one of said virtualized containers.
Preferably, the method further comprises the following steps:
shutting down the virtualized container when the service processing performance requirement of the virtualized container changes;
re-determining the corresponding relation according to the new processing performance requirement;
and after the corresponding relation is determined again, executing the step of creating the virtualized container according to the corresponding relation.
A GPU multiplexing device, comprising:
the corresponding relationship determining module is used for determining the corresponding relationship between each virtualized container to be created and a virtual GPU after receiving a request for multiplexing a physical GPU; the virtual GPU is obtained by virtualizing the physical GPU;
a virtualized container creating module, configured to create the virtualized container according to the corresponding relationship;
the virtual GPU mounting module is used for loading the device driver of the corresponding virtual GPU in each virtualized container and mounting the virtual GPU into the corresponding virtualized container;
and the physical GPU multiplexing module is used for starting each virtualization container so that each virtualization container can use the physical GPU in parallel.
A GPU multiplexing device, comprising:
a memory for storing a computer program;
and the processor is used for realizing the steps of the GPU multiplexing method when the computer program is executed.
A readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, realizes the steps of the above-mentioned GPU multiplexing method.
By applying the method provided by the embodiment of the invention, after a request for multiplexing a physical GPU is received, the corresponding relationship between each virtualized container to be created and a virtual GPU is determined, the virtual GPU being obtained by virtualizing the physical GPU; the virtualized containers are created according to the corresponding relationship; the device driver of the corresponding virtual GPU is loaded in each virtualized container, and the virtual GPU is mounted into the corresponding virtualized container; and the virtualized containers are started so that they use the physical GPU in parallel.
In this method, virtual GPUs obtained by virtualizing the same physical GPU are mounted into the containers. Because the virtual GPUs do not interfere with each other, once the virtualized containers with mounted virtual GPUs are started, at least two virtualized containers are allowed to use the physical GPU simultaneously. That is, the physical GPU is not multiplexed by time division; instead, it is virtualized into several virtual GPUs, i.e., divided into several sub-GPUs at the hardware level. The virtual GPUs can thus be used separately by the containers, without the time restrictions of time-division multiplexing. Therefore, the virtualized containers use the physical GPU without time limits, can use it in parallel, the processing capacity of the physical GPU is utilized more fully, and the processing performance of the virtualized containers is improved.
Accordingly, embodiments of the present invention further provide a GPU multiplexing apparatus, device, and readable storage medium corresponding to the GPU multiplexing method, which have the above technical effects and are not described herein again.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating an embodiment of a GPU multiplexing method;
FIG. 2 is a schematic structural diagram of a GPU multiplexing apparatus according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a GPU multiplexing device according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a specific structure of a GPU multiplexing device according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the solution of the present invention, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are only some embodiments of the invention rather than all of them. All other embodiments obtained by those skilled in the art based on the embodiments herein without creative effort shall fall within the protection scope of the present invention.
Example one:
referring to fig. 1, fig. 1 is a flowchart illustrating a GPU multiplexing method according to an embodiment of the present invention, where the method is applicable to a computer capable of performing accelerated processing by using a physical GPU, and the method includes the following steps:
s101, after receiving a request of multiplexing a physical GPU, determining the corresponding relation between each virtual container to be created and the virtual GPU.
The virtual GPU is obtained by virtualizing the physical GPU.
The request for multiplexing the physical GPU may specifically be a request for multiplexing the physical GPU among a plurality of services. The request may specify the physical GPU to be multiplexed and the service groups that the physical GPU needs to support (e.g., one group corresponds to one virtualized container, and a service group includes at least one specific service). Based on the request, the number of virtualized containers to be created and the corresponding relationship between each virtualized container and the virtual GPUs can be determined.
Determining the corresponding relationship between each virtualized container to be created and the virtual GPUs may specifically be allocating at least one virtual GPU to each virtualized container according to the processing performance requirement of the service to be carried on that virtualized container, where one virtual GPU can only be allocated to one virtualized container. That is, when the number of vGPUs is larger than the number of virtualized containers, the surplus vGPUs may be preferentially allocated to the services with higher processing performance requirements. The processing performance of a vGPU depends on the physical GPU resources it occupies (including video memory size, resolution, and the specific application scenario).
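As an illustration of this allocation step, the following minimal Python sketch (not part of the patent; the names VGpu, ContainerSpec, and allocate_vgpus are assumptions) gives every virtualized container at least one vGPU and hands any surplus vGPUs to the containers with the highest processing performance requirements.

```python
# Minimal sketch of the correspondence-determination step described above. All
# names here (VGpu, ContainerSpec, allocate_vgpus) are illustrative assumptions,
# not APIs from the patent: every container gets at least one vGPU, no vGPU is
# shared, and surplus vGPUs go to the containers with the highest requirements.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VGpu:
    name: str
    memory_mb: int           # share of the physical GPU's video memory


@dataclass
class ContainerSpec:
    name: str
    perf_requirement: int    # relative processing-performance requirement of its service
    vgpus: List[VGpu] = field(default_factory=list)


def allocate_vgpus(containers: List[ContainerSpec], vgpus: List[VGpu]) -> Dict[str, List[str]]:
    """Return a container -> vGPU correspondence; one vGPU serves one container only."""
    if len(vgpus) < len(containers):
        raise ValueError("need at least one vGPU per virtualized container")
    pool = sorted(vgpus, key=lambda v: v.memory_mb, reverse=True)
    ordered = sorted(containers, key=lambda c: c.perf_requirement, reverse=True)
    # First pass: guarantee each container one vGPU, largest vGPU to highest requirement.
    for c in ordered:
        c.vgpus.append(pool.pop(0))
    # Second pass: hand any surplus vGPUs to the most demanding containers first.
    for i, v in enumerate(pool):
        ordered[i % len(ordered)].vgpus.append(v)
    return {c.name: [v.name for v in c.vgpus] for c in containers}


if __name__ == "__main__":
    vgpus = [VGpu(f"vgpu{i}", 4096) for i in range(4)]
    containers = [ContainerSpec("inference", 3), ContainerSpec("rendering", 1)]
    print(allocate_vgpus(containers, vgpus))
    # e.g. {'inference': ['vgpu0', 'vgpu2'], 'rendering': ['vgpu1', 'vgpu3']}
```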
The virtual GPUs are obtained by virtualizing the physical GPU through a virtualization technology. Specifically, a GPU virtualization technology may be used to create multiple virtual GPUs on a single physical GPU, and the created virtual GPUs divide up the threads and processing capability of the physical GPU. For convenience of description, vGPU (virtual Graphics Processing Unit) is used herein to denote a virtual GPU.
It should be noted that, when the GPU is used from a container, the GPU device driver and the GPU device are passed in as device parameters of the container, so the vGPUs only need to exist before the virtualized containers are created. That is, in this embodiment, the physical GPU may be virtualized in advance, or it may be virtualized after the request for multiplexing the physical GPU is received and before the virtualized containers are created.
Virtualizing the physical GPU may specifically mean using a virtualization technology to virtualize the physical GPU according to physical GPU virtualization parameters, obtaining at least two virtual GPUs. To multiplex the GPU, therefore, at least two vGPUs can be created when virtualizing the physical GPU. Of course, in practical applications the number of vGPUs may be determined according to the number of containers; in particular, since one vGPU belongs to only one container while one container may mount one or more vGPUs, if the number of vGPUs is determined from the number of containers, it must be ensured that the number of vGPUs is greater than or equal to the number of containers.
The physical GPU virtualization parameters can be preset in advance, i.e., when the physical GPU is virtualized, it is virtualized directly according to the preset parameters. Alternatively, the parameters may be determined from a request received when the physical GPU is actually virtualized: before the physical GPU is virtualized according to the physical GPU virtualization parameters to obtain at least two virtual GPUs, a request for virtualizing the physical GPU can be received and parsed, and the physical GPU virtualization parameters determined from it. The physical GPU virtualization parameters include the video memory size, resolution, and application scenario of each vGPU, so that the internal physical resources of the physical GPU can be partitioned among the vGPUs based on these parameters. In particular, when the physical GPU is virtualized, the resources of the resulting vGPUs may or may not be identical: the physical resources of the physical GPU may be divided evenly among the several vGPUs, or divided unevenly, so as to meet the requirements of different application scenarios.
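For concreteness, the sketch below shows one way this virtualization step could be driven on a Linux host whose GPU driver exposes VFIO mediated devices (mdev), as NVIDIA vGPU and Intel GVT-g do. The PCI address, the profile-matching heuristic, and the function names are assumptions for illustration only; the patent does not prescribe a particular virtualization technology.

```python
# Hedged sketch of the "virtualize the physical GPU into vGPUs" step. It assumes
# a Linux host whose GPU driver exposes VFIO mediated devices (mdev); the PCI
# address below and the profile-matching heuristic are illustrative assumptions.
import os
import uuid

MDEV_ROOT = "/sys/bus/pci/devices/0000:3b:00.0/mdev_supported_types"  # assumed GPU PCI address


def pick_mdev_type(min_memory_mb: int) -> str:
    """Pick an mdev type (vGPU profile) whose name suggests enough video memory.

    A real deployment would match the requested video memory size, resolution and
    application scenario against the vendor's published vGPU profiles.
    """
    for type_id in sorted(os.listdir(MDEV_ROOT)):
        with open(os.path.join(MDEV_ROOT, type_id, "name")) as f:
            profile_name = f.read().strip()
        if str(min_memory_mb // 1024) in profile_name:   # crude heuristic on the profile name
            return type_id
    raise RuntimeError("no mdev type matches the requested vGPU parameters")


def create_vgpus(count: int, min_memory_mb: int) -> list:
    """Create `count` vGPUs (mediated devices) and return their UUIDs."""
    type_id = pick_mdev_type(min_memory_mb)
    created = []
    for _ in range(count):
        dev_uuid = str(uuid.uuid4())
        # Writing a UUID to the type's "create" node asks the driver to instantiate a vGPU.
        with open(os.path.join(MDEV_ROOT, type_id, "create"), "w") as f:
            f.write(dev_uuid)
        created.append(dev_uuid)
    return created


if __name__ == "__main__":
    # At least two vGPUs are needed before the physical GPU can be multiplexed.
    print(create_vgpus(count=2, min_memory_mb=4096))
```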
S102, creating the virtualized containers according to the corresponding relationship.
Multiple virtualized containers may be created using a virtualized-container creation technology. When a virtualized container is created, the vGPU must be specified as a device at creation time according to the corresponding relationship.
In order for the vGPU to serve as a device of the virtualized container at creation time, a virtual machine file system image of the virtualized container can be built before the container is created; the virtual machine file system image includes the device driver of the corresponding virtual GPU. In this way, when the virtualized container is created based on the virtual machine file system image, the vGPU can be used as a device of the container.
Specifically, since each created container is backed by a virtual machine, creating the virtualized containers may include:
step one, creating a virtual machine corresponding to each virtualized container;
and step two, respectively creating a virtualized container in each virtual machine according to the corresponding relation.
For convenience of description, the above two steps will be described in combination.
The virtualized containers correspond to the virtual machines one to one, that is, one container is created in each virtual machine. The virtual machine may be a mini, lightweight virtual machine. The virtualized container treats the resources of its virtual machine as its own resources.
After the virtual machines are created, a virtualized container can be created in each virtual machine according to the corresponding relationship between the virtualized containers and the vGPUs.
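This per-correspondence creation loop can be pictured with the following sketch; launch_micro_vm and create_container_in_vm are hypothetical placeholders for whatever mini-VM container runtime is actually used, since the patent does not name a specific implementation.

```python
# Illustrative sketch of the one-container-per-lightweight-VM creation loop. The
# helpers launch_micro_vm and create_container_in_vm are hypothetical placeholders.
from typing import Dict, List


def launch_micro_vm(image: str, passthrough_devices: List[str]) -> str:
    raise NotImplementedError("placeholder: boot a lightweight VM with the vGPUs attached")


def create_container_in_vm(vm_id: str, name: str) -> None:
    raise NotImplementedError("placeholder: create the virtualized container inside the VM")


def create_virtualized_containers(correspondence: Dict[str, List[str]],
                                  rootfs_image: str) -> Dict[str, str]:
    """For each container in the correspondence, boot a mini VM from the prepared
    file-system image (which already contains the vGPU driver) and create the
    container inside it. Returns container name -> VM identifier."""
    vms = {}
    for container_name, vgpu_uuids in correspondence.items():
        vm_id = launch_micro_vm(image=rootfs_image, passthrough_devices=vgpu_uuids)
        create_container_in_vm(vm_id, name=container_name)
        vms[container_name] = vm_id
    return vms
```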
S103, loading the device driver of the corresponding virtual GPU in each virtualization container, and mounting the virtual GPU into the corresponding virtualization container.
After the virtualization containers are created, the device drivers for the corresponding vGPU may be loaded in each virtualization container and then mounted into the corresponding virtualization container.
For how to load the device driver of the vGPU and how to mount the vGPU into the corresponding virtualized container, reference may be made to the existing process of loading a device driver in a container and of mounting a GPU in a container, which is not described herein again.
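As one hedged illustration of mounting a GPU into a container through device parameters, the sketch below uses Docker's --device flag; the /dev/nvidia* device-node names and the driver-library path are assumptions that hold for a typical NVIDIA setup and would differ for other vendors or for mdev-backed vGPUs passed into a mini VM.

```python
# Hedged sketch of handing a GPU device node to a container through device
# parameters. Docker's --device flag is real; the /dev/nvidia* node names and the
# driver-library bind-mount path are assumptions for a typical NVIDIA setup.
import subprocess
from typing import List


def run_container_with_gpu(name: str, image: str, device_nodes: List[str]) -> None:
    cmd = ["docker", "run", "-d", "--name", name]
    for node in device_nodes:
        cmd += ["--device", node]        # mount the (v)GPU device node into the container
    # Assumed path: expose the host's driver user-space libraries read-only so the
    # container can load the matching device driver.
    cmd += ["-v", "/usr/lib/x86_64-linux-gnu:/usr/lib/host-gpu-libs:ro"]
    cmd += [image]
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_container_with_gpu(
        name="vgpu-container-0",
        image="cuda-workload:latest",    # assumed application image
        device_nodes=["/dev/nvidia0", "/dev/nvidiactl", "/dev/nvidia-uvm"],
    )
```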
S104, starting each virtualized container so that the virtualized containers can use the physical GPU in parallel.
After the virtualized container has been created, the corresponding device driver loaded, and the vGPU mounted, the virtualized container can be started. Because a vGPU is mounted in each virtualized container, the containers' use of their respective vGPUs does not interfere with or affect one another, i.e., the virtualized containers can use the physical GPU in parallel. That is, in this embodiment, when the virtualized containers use the physical GPU through their vGPUs, they do not use it sequentially in a time-division-multiplexing manner; instead, multiple virtualized containers can use the physical GPU simultaneously through the vGPUs.
By applying the method provided by the embodiment of the invention, after a request for multiplexing a physical GPU is received, the corresponding relationship between each virtualized container to be created and a virtual GPU is determined, the virtual GPU being obtained by virtualizing the physical GPU; the virtualized containers are created according to the corresponding relationship; the device driver of the corresponding virtual GPU is loaded in each virtualized container, and the virtual GPU is mounted into the corresponding virtualized container; and the virtualized containers are started so that they use the physical GPU in parallel.
In this method, virtual GPUs obtained by virtualizing the same physical GPU are mounted into the containers. Because the virtual GPUs do not interfere with each other, once the virtualized containers with mounted virtual GPUs are started, at least two virtualized containers are allowed to use the physical GPU simultaneously. That is, the physical GPU is not multiplexed by time division; instead, it is virtualized into several virtual GPUs, i.e., divided into several sub-GPUs at the hardware level. The virtual GPUs can thus be used separately by the containers, without the time restrictions of time-division multiplexing. Therefore, the virtualized containers use the physical GPU without time limits, can use it in parallel, the processing capacity of the physical GPU is utilized more fully, and the processing performance of the virtualized containers is improved.
It should be noted that, based on the above embodiment, the embodiments of the present invention also provide corresponding improvements. In the preferred/improved embodiments below, steps that are the same as or correspond to steps in the above embodiment, and the corresponding beneficial effects, can be cross-referenced and are not described in detail again here.
Preferably, considering that the processing performance requirements of the services carried on a virtualized container may change over time, and in order to avoid a situation in which a higher service processing performance requirement is not adequately supported by the physical GPU, the resource allocation used when multiplexing the physical GPU may be changed by re-creating the virtualized container.
Specifically, on the basis of Example one, the following steps are performed:
step one, when the service processing performance requirement of a virtualized container changes, shutting down the virtualized container;
step two, re-determining the corresponding relation according to the new processing performance requirement;
and step three, after the corresponding relation is determined again, the step of creating the virtualized container according to the corresponding relation is executed.
For convenience of explanation, the above three steps will be described in combination.
The service processing performance requirement of a virtualized container may change when a service is newly deployed (service replacement), when a more demanding service appears over time (a newly added service), when a service is removed (a deleted service), or when the demand of the service itself changes (e.g., user demand rises or falls). For example, the existing virtualized container may no longer be able to support the service requirements. In that case, the virtualized container can first be shut down; specifically, the service running on it is stopped first, and when the service is stopped its key data can be saved so that the service can be recovered quickly when it is restarted. The corresponding relationship is then re-determined according to the new processing performance requirement, the virtualized container is created again based on the newly determined corresponding relationship, and the subsequent processing steps are executed in turn until the new virtualized container is started and multiplexes the physical GPU. Further, if the original vGPU resource allocation cannot meet the new service requirement, the physical GPU can be virtualized again.
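The shutdown, re-determine, re-create flow can be summarized in the following sketch; save_service_state, shutdown_container, rebuild_containers, and the allocate callable are hypothetical placeholders (the allocate callable could be allocate_vgpus from the allocation sketch above), since the patent specifies the ordering of the steps rather than any concrete API.

```python
# Sketch of the re-allocation flow described above. save_service_state,
# shutdown_container and rebuild_containers are hypothetical placeholders; the
# allocate callable stands in for the correspondence-determination step.
from typing import Callable, Dict, List


def save_service_state(container) -> None:
    raise NotImplementedError("placeholder: persist the service's key data before shutdown")


def shutdown_container(container) -> None:
    raise NotImplementedError("placeholder: stop the service, then shut the container down")


def rebuild_containers(correspondence: Dict[str, List[str]]) -> None:
    raise NotImplementedError("placeholder: re-run S102-S104 with the new correspondence")


def reallocate_physical_gpu(containers, vgpus, new_requirements: Dict[str, int],
                            allocate: Callable) -> Dict[str, List[str]]:
    # 1. Shut down the affected containers, persisting key service data first so
    #    the services can be resumed quickly after the rebuild.
    for c in containers:
        save_service_state(c)
        shutdown_container(c)
    # 2. Re-determine the container/vGPU correspondence from the new requirements.
    for c in containers:
        c.perf_requirement = new_requirements[c.name]
        c.vgpus.clear()
    correspondence = allocate(containers, vgpus)
    # 3. Re-create, re-mount and restart the virtualized containers (S102-S104).
    rebuild_containers(correspondence)
    return correspondence
```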
Example two:
corresponding to the above method embodiments, the present invention further provides a GPU multiplexing apparatus, and the GPU multiplexing apparatus described below and the GPU multiplexing method described above may be referred to correspondingly.
Referring to fig. 2, the apparatus includes the following modules:
the corresponding relation determining module 101 is configured to determine a corresponding relation between each to-be-created virtualization container and the virtual GPU after receiving a request for multiplexing the physical GPU; the virtual GPU is obtained by virtualizing a physical GPU;
a virtualized container creating module 102, configured to create a virtualized container according to the corresponding relationship;
the virtual GPU mounting module 103 is configured to load a device driver of a corresponding virtual GPU in each virtualization container, and mount the virtual GPU to the corresponding virtualization container;
and the physical GPU multiplexing module 104 is used for starting each virtualization container so that each virtualization container can use the physical GPU in parallel.
By applying the apparatus provided by the embodiment of the invention, after a request for multiplexing a physical GPU is received, the corresponding relationship between each virtualized container to be created and a virtual GPU is determined, the virtual GPU being obtained by virtualizing the physical GPU; the virtualized containers are created according to the corresponding relationship; the device driver of the corresponding virtual GPU is loaded in each virtualized container, and the virtual GPU is mounted into the corresponding virtualized container; and the virtualized containers are started so that they use the physical GPU in parallel.
In the present apparatus, virtual GPUs obtained by virtualizing the same physical GPU are mounted into the containers. Because the virtual GPUs do not interfere with each other, once the virtualized containers with mounted virtual GPUs are started, at least two virtualized containers are allowed to use the physical GPU simultaneously. That is, the physical GPU is not multiplexed by time division; instead, it is virtualized into several virtual GPUs, i.e., divided into several sub-GPUs at the hardware level. The virtual GPUs can thus be used separately by the containers, without the time restrictions of time-division multiplexing. Therefore, the virtualized containers use the physical GPU without time limits, can use it in parallel, the processing capacity of the physical GPU is utilized more fully, and the processing performance of the virtualized containers is improved.
In one embodiment of the present invention, the apparatus further includes a physical GPU virtualization module configured to virtualize the physical GPU, specifically to virtualize the physical GPU according to physical GPU virtualization parameters by using a virtualization technology to obtain at least two virtual GPUs.
In an embodiment of the present invention, the physical GPU virtualization module is further configured to, before the physical GPU is virtualized according to the physical GPU virtualization parameters to obtain at least two virtual GPUs, receive and parse a request for virtualizing the physical GPU and determine the physical GPU virtualization parameters; the physical GPU virtualization parameters comprise a video memory size, a resolution, and an application scenario.
In a specific embodiment of the present invention, the virtualized container creating module 102 is specifically configured to create a virtual machine corresponding to each virtualized container;
and respectively creating a virtualized container in each virtual machine according to the corresponding relation.
In an embodiment of the present invention, the virtualized container creating module 102 is further configured to build a virtual machine file system image of the virtualized container before creating the virtualized container; the virtual machine file system image includes the device driver of the corresponding virtual GPU.
In a specific embodiment of the present invention, the correspondence determining module 101 is specifically configured to allocate at least one virtual GPU to each virtualized container according to a processing performance requirement of a service to be carried on the virtualized container; wherein one virtual GPU can only be allocated to one virtualized container.
In one embodiment of the present invention, the apparatus further comprises:
a multiplexing resource reallocation module, configured to shut down a virtualized container when the service processing performance requirement of the virtualized container changes; re-determine the corresponding relationship according to the new processing performance requirement; and, after the corresponding relationship is re-determined, trigger the virtualized container creating module 102, the virtual GPU mounting module 103, and the physical GPU multiplexing module 104 in sequence to perform the corresponding steps.
Example three:
corresponding to the above method embodiment, the embodiment of the present invention further provides a GPU multiplexing device, and a GPU multiplexing device described below and a GPU multiplexing method described above may be referred to in correspondence with each other.
Referring to fig. 3, the GPU multiplexing device includes:
a memory D1 for storing computer programs;
a processor D2, configured to implement the steps of the GPU multiplexing method of the above-described method embodiments when executing the computer program.
Specifically, referring to fig. 4, which is a schematic diagram of a specific structure of the GPU multiplexing device provided in this embodiment, the GPU multiplexing device may vary considerably in configuration or performance and may include one or more central processing units (CPUs) 322 (e.g., one or more processors), a memory 332, and one or more storage media 330 (e.g., one or more mass storage devices) storing an application 342 or data 344. The memory 332 and the storage medium 330 may be transient storage or persistent storage. The program stored on the storage medium 330 may include one or more modules (not shown), and each module may include a series of instruction operations on the data processing device. Further, the central processing unit 322 may be configured to communicate with the storage medium 330 and execute, on the GPU multiplexing device 301, the series of instruction operations in the storage medium 330.
The GPU multiplexing device 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like.
The steps in the GPU multiplexing method described above may be implemented by the structure of the GPU multiplexing device.
Example four:
corresponding to the above method embodiment, an embodiment of the present invention further provides a readable storage medium, and a readable storage medium described below and a GPU multiplexing method described above may be referred to correspondingly.
A readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the GPU multiplexing method of the above-described method embodiments.
The readable storage medium may be a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or various other readable storage media capable of storing program code.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

Claims (10)

1. A GPU multiplexing method is characterized by comprising the following steps:
after receiving a request for multiplexing a physical GPU, determining the corresponding relationship between each virtualized container to be created and a virtual GPU; the virtual GPU is obtained by virtualizing the physical GPU;
creating the virtualized container according to the corresponding relation;
loading a corresponding device driver of the virtual GPU in each virtualization container, and mounting the virtual GPU into the corresponding virtualization container;
starting each of the virtualized containers so that each of the virtualized containers uses the physical GPU in parallel.
2. A GPU multiplexing method according to claim 1, wherein said virtualizing the physical GPU comprises:
and virtualizing the physical GPU according to the physical GPU virtual parameters by using a virtualization technology to obtain at least two virtual GPUs.
3. A GPU multiplexing method according to claim 2, wherein before virtualizing the physical GPU according to physical GPU virtualization parameters by using virtualization technology to obtain at least two virtual GPUs, the method further comprises:
receiving and parsing a request for virtualizing the physical GPU, and determining the physical GPU virtualization parameters; the physical GPU virtualization parameters comprise a video memory size, a resolution, and an application scenario.
4. A GPU multiplexing method according to claim 1, wherein creating the virtualized container comprises:
creating a virtual machine corresponding to each virtualized container;
and respectively creating the virtualization container in each virtual machine according to the corresponding relation.
5. A GPU multiplexing method according to claim 1, further comprising, before creating the virtualized container:
building a virtual machine file system image of the virtualized container, wherein the virtual machine file system image comprises the device driver of the corresponding virtual GPU.
6. A GPU multiplexing method according to any of claims 1 to 5, wherein the determining the correspondence between each virtualized container to be created and a virtual GPU comprises:
assigning at least one of said virtual GPUs to each of said virtualized containers according to the processing performance requirements of the services to be carried on said virtualized containers; wherein one of said virtual GPUs is assignable to only one of said virtualized containers.
7. The GPU multiplexing method of claim 6, further comprising:
shutting down the virtualized container when the service processing performance requirement of the virtualized container changes;
re-determining the corresponding relation according to the new processing performance requirement;
and after the corresponding relation is determined again, executing the step of creating the virtualized container according to the corresponding relation.
8. A GPU multiplexing device, comprising:
the corresponding relationship determining module is used for determining the corresponding relationship between each virtualized container to be created and a virtual GPU after receiving a request for multiplexing a physical GPU; the virtual GPU is obtained by virtualizing the physical GPU;
a virtualized container creating module, configured to create the virtualized container according to the corresponding relationship;
the virtual GPU mounting module is used for loading the device driver of the corresponding virtual GPU in each virtualized container and mounting the virtual GPU into the corresponding virtualized container;
and the physical GPU multiplexing module is used for starting each virtualization container so that each virtualization container can use the physical GPU in parallel.
9. A GPU multiplexing device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the GPU multiplexing method of any of claims 1 to 7 when executing the computer program.
10. A readable storage medium, characterized in that it has stored thereon a computer program which, when being executed by a processor, carries out the steps of the GPU multiplexing method according to any of claims 1 to 7.
CN201911330448.9A 2019-12-20 2019-12-20 GPU multiplexing method, device, equipment and readable storage medium Withdrawn CN111047505A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911330448.9A CN111047505A (en) 2019-12-20 2019-12-20 GPU multiplexing method, device, equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911330448.9A CN111047505A (en) 2019-12-20 2019-12-20 GPU multiplexing method, device, equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN111047505A true CN111047505A (en) 2020-04-21

Family

ID=70237325

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911330448.9A Withdrawn CN111047505A (en) 2019-12-20 2019-12-20 GPU multiplexing method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN111047505A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552550A (en) * 2020-04-26 2020-08-18 星环信息科技(上海)有限公司 Task scheduling method, device and medium based on GPU (graphics processing Unit) resources
CN111913794A (en) * 2020-08-04 2020-11-10 北京百度网讯科技有限公司 Method and device for sharing GPU, electronic equipment and readable storage medium
CN113342534A (en) * 2021-06-29 2021-09-03 中国电信股份有限公司 Method, device and equipment for allocating graphics processing resources and storage medium
CN113656143A (en) * 2021-08-16 2021-11-16 深圳市瑞驰信息技术有限公司 Method and system for realizing direct display card of android container
CN114296945A (en) * 2022-03-03 2022-04-08 北京蚂蚁云金融信息服务有限公司 Method and device for multiplexing GPU video memory

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677479A (en) * 2015-12-30 2016-06-15 北京奇艺世纪科技有限公司 Method and device for implementing parallel running of GPU operation programs
CN107783818A (en) * 2017-10-13 2018-03-09 北京百度网讯科技有限公司 Deep learning task processing method, device, equipment and storage medium
CN108897601A (en) * 2018-06-29 2018-11-27 郑州云海信息技术有限公司 A kind of FPGA application method, system and relevant apparatus based on virtualization
US20180349204A1 (en) * 2017-06-02 2018-12-06 Alibaba Group Holding Limited Method and apparatus for implementing virtual gpu and system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677479A (en) * 2015-12-30 2016-06-15 北京奇艺世纪科技有限公司 Method and device for implementing parallel running of GPU operation programs
US20180349204A1 (en) * 2017-06-02 2018-12-06 Alibaba Group Holding Limited Method and apparatus for implementing virtual gpu and system
CN107783818A (en) * 2017-10-13 2018-03-09 北京百度网讯科技有限公司 Deep learning task processing method, device, equipment and storage medium
CN108897601A (en) * 2018-06-29 2018-11-27 郑州云海信息技术有限公司 A kind of FPGA application method, system and relevant apparatus based on virtualization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chen Zhijia et al., "An Improved GPU Virtualization Implementation Method", Computer Engineering & Science (计算机工程与科学) *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552550A (en) * 2020-04-26 2020-08-18 星环信息科技(上海)有限公司 Task scheduling method, device and medium based on GPU (graphics processing Unit) resources
CN111913794A (en) * 2020-08-04 2020-11-10 北京百度网讯科技有限公司 Method and device for sharing GPU, electronic equipment and readable storage medium
JP2022023772A (en) * 2020-08-04 2022-02-08 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Method for sharing gpu, device, electronic device, computer readable storage medium and computer program
CN113342534A (en) * 2021-06-29 2021-09-03 中国电信股份有限公司 Method, device and equipment for allocating graphics processing resources and storage medium
CN113342534B (en) * 2021-06-29 2024-01-02 天翼云科技有限公司 Graphics processing resource allocation method, device, equipment and storage medium
CN113656143A (en) * 2021-08-16 2021-11-16 深圳市瑞驰信息技术有限公司 Method and system for realizing direct display card of android container
CN114296945A (en) * 2022-03-03 2022-04-08 北京蚂蚁云金融信息服务有限公司 Method and device for multiplexing GPU video memory
CN114296945B (en) * 2022-03-03 2022-05-20 北京蚂蚁云金融信息服务有限公司 Method and device for multiplexing GPU video memory

Similar Documents

Publication Publication Date Title
CN111047505A (en) GPU multiplexing method, device, equipment and readable storage medium
CN108984264B (en) Virtual GPU (graphics processing Unit) implementation method, device and system
EP3514689A1 (en) Memory management method and apparatus
CN106844007B (en) Virtualization method and system based on spatial multiplexing
US11231955B1 (en) Dynamically reallocating memory in an on-demand code execution system
EP3481007B1 (en) Method, apparatus and management server for processing resource pool
EP3073374A1 (en) Thread creation method, service request processing method and related device
US20090089780A1 (en) Method and apparatus to convey physical resource relationships
CN108369529B (en) Host allocation for instances via de-association rules
US11579908B2 (en) Containerized workload scheduling
US11740921B2 (en) Coordinated container scheduling for improved resource allocation in virtual computing environment
CN111450524A (en) Information processing method and device in cloud game, cloud game server and medium
US11263054B2 (en) Memory-aware placement for virtual GPU enabled systems
CN113296926B (en) Resource allocation method, computing device and storage medium
US20220253341A1 (en) Memory-aware placement for virtual gpu enabled systems
US20120144389A1 (en) Optimizing virtual image deployment for hardware architecture and resources
CN112130960A (en) Lightweight mobile edge computing node and construction method
CN111857951A (en) Containerized deployment platform and deployment method
CN115280285A (en) Scheduling workloads on a common set of resources by multiple schedulers operating independently
CN113674131A (en) Hardware accelerator equipment management method and device, electronic equipment and storage medium
CN114546587A (en) Capacity expansion and reduction method of online image recognition service and related device
US9158551B2 (en) Activating and deactivating Operating System (OS) function based on application type in manycore system
CN111522659B (en) Space use method and device
CN107766122B (en) Method and device for setting available memory space of host machine
CN111857972A (en) Deployment method, deployment device and deployment equipment of virtual network function VNF

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200421