CN115328665B - Hypervisor-based GPU virtualization method and device and electronic equipment - Google Patents

Hypervisor-based GPU virtualization method and device and electronic equipment

Info

Publication number
CN115328665B
CN115328665B (application number CN202211243769.7A)
Authority
CN
China
Prior art keywords
gpu
virtual machine
quota ratio
access request
ratio difference
Prior art date
Legal status
Active
Application number
CN202211243769.7A
Other languages
Chinese (zh)
Other versions
CN115328665A (en)
Inventor
Name withheld at the inventor's request
Current Assignee
Zhongling Zhixing Chengdu Technology Co ltd
Original Assignee
Zhongling Zhixing Chengdu Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Zhongling Zhixing Chengdu Technology Co ltd
Priority to CN202211243769.7A
Publication of CN115328665A
Application granted
Publication of CN115328665B
Legal status: Active (current)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining

Abstract

The disclosure relates to a Hypervisor-based GPU virtualization method and device and to electronic equipment. The method comprises: receiving access requests initiated by a plurality of virtual machines to a GPU control module; calculating a quota ratio difference for each of the plurality of virtual machines; and selecting the access request with the largest quota ratio difference for authorization, so that the access request with the largest quota ratio difference is preferentially executed by the GPU. The scheme addresses the problems of prior-art GPU virtualization technology, namely low efficiency, CPU time consumption, and lack of extensibility. The scheme allows flexible configuration of resources such as the number of virtual machines and GPU priorities; apart from the interaction with the GPU control module of the Hypervisor it incurs no extra CPU overhead, so CPU utilization is high and performance can approach that of a physical GPU.

Description

Hypervisor-based GPU virtualization method and device and electronic equipment
Technical Field
The present disclosure relates to the field of information processing, and in particular, to a Hypervisor-based GPU virtualization method, apparatus, and electronic device.
Background
A GPU (graphics processing unit), also called a display core, visual processor, or display chip, is a microprocessor dedicated to image- and graphics-related operations on personal computers, workstations, game consoles, and some mobile devices (e.g., tablet computers and smartphones). GPUs play an irreplaceable role in UI (User Interface) display and in gaming and entertainment.
A GPU typically works as follows: the driver layer or user mode encodes API (Application Programming Interface) commands (such as OpenGL or Vulkan) and state into a task that the hardware can recognize, and then submits the task to the hardware for execution.
A Hypervisor (Virtual Machine Monitor) is a virtualization technology for running other operating systems. It can virtualize one chip into several, so that multiple different operating systems, also called virtual machines, can run on the same chip to meet the needs of different scenarios. Virtualization also improves chip utilization and reduces cost, so it plays a crucial role in many fields of information technology.
Although virtualization brings these advantages, only the CPU and memory of a given architecture share a common virtualization implementation; there is no unified virtualization standard for peripherals. The purpose of peripheral virtualization is to let multiple virtual machines use the same external device, but in many cases a peripheral cannot be virtualized at all and can only be allocated exclusively to a single virtual machine, which then monopolizes it.
Because the GPU is useful in many scenarios, GPU virtualization technologies have been proposed so that multiple virtual machines can use the GPU for computing. However, these techniques all have limitations of one kind or another, for example:
the API forwarding technology is that only one entity occupies GPU access permission, and if other virtual machines use the GPU, each API is forwarded to the entity occupying the GPU to enable the entity to access the GPU instead. The disadvantage of this approach is that each API needs to be forwarded, which is inefficient.
Proxy technology: again only one entity holds GPU access authority; implementations include virglrenderer and the like. When other virtual machines need to access the GPU, all API calls and the relevant state are encoded and transmitted in a unified way to the entity occupying the GPU, which decodes them, restores the API calls, and submits them to the GPU. The advantage of this approach is a reduced number of forwarding operations; the drawbacks are that encoding and decoding the API calls and state consumes CPU time, and the virtual machines suffer a large loss of GPU performance.
GPU hardware virtualization: some manufacturers implement virtualization in the GPU hardware itself by designing multiple resource groups on the GPU. Each virtual machine submits GPU tasks through one resource group, and the hardware either automatically sequences the execution of the received GPU tasks or directly allocates a GPU time slice to each resource group. This avoids the drawbacks of the two approaches above, but resource-group partitioning, priority setting, time-slice division, and similar operations are fixed by the manufacturer and cannot be extended, and the approach requires hardware-level support.
Existing GPU virtualization technologies therefore suffer from low efficiency, CPU time consumption, lack of extensibility, and similar problems.
Disclosure of Invention
The invention aims to provide a Hypervisor-based GPU virtualization method and device and electronic equipment that address the problems of prior-art GPU virtualization technology, namely low efficiency, CPU time consumption, and lack of extensibility.
In order to achieve the above object, a first aspect of the present disclosure provides a Hypervisor-based GPU virtualization method, where the method includes:
receiving access requests sent by a plurality of virtual machines to a GPU control module;
calculating the difference between the current quota ratio and the preset quota ratio of each virtual machine;
and selecting the access request with the largest quota ratio difference for authorization, so that the access request with the largest quota ratio difference is preferentially executed by the GPU.
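For concreteness, the selection rule above admits the following reading, under which the most under-served virtual machine is chosen. The disclosure does not fix a formula for the current quota ratio, so the expression below, in which each virtual machine's current quota ratio is its share of the GPU time consumed so far, is only an assumed example:

$$\Delta_i = Q_i^{\mathrm{preset}} - Q_i^{\mathrm{current}},\qquad Q_i^{\mathrm{current}} = \frac{t_i}{\sum_j t_j},$$

where $t_i$ denotes the accumulated GPU task execution time of virtual machine $i$; the access request of the virtual machine with the largest $\Delta_i$ is authorized first.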
Optionally, selecting the access request with the largest quota ratio difference for authorization, so that the access request with the largest quota ratio difference is preferentially executed by the GPU, includes:
notifying the target virtual machine that submitted the access request with the largest quota ratio difference, and having the target virtual machine access the GPU, so that the access request with the largest quota ratio difference is preferentially executed by the GPU.
Optionally, the method further includes:
the GPU control module monitors for a notification that the target virtual machine has finished accessing the GPU;
if the notification is not observed within a preset duration, determining that the target virtual machine has timed out while using the GPU; and after determining that the target virtual machine has timed out, cleaning up the target virtual machine and resetting the GPU.
Optionally, selecting the access request with the largest quota ratio difference for authorization, so that the access request with the largest quota ratio difference is preferentially executed by the GPU, includes:
accessing the GPU by the GPU control module based on the access operation encapsulated in the access request with the largest quota ratio difference, so that the access request with the largest quota ratio difference is preferentially executed by the GPU.
Optionally, the method further includes:
the GPU control module sends the access result to the target virtual machine corresponding to the access request with the largest quota ratio difference; or
the target virtual machine corresponding to the access request with the largest quota ratio difference, after being authorized, accesses the GPU itself to obtain the access result.
Optionally, the quota ratio difference of the access requests initiated by the multiple virtual machines is calculated according to one or more of a preset virtual machine GPU priority, a preset virtual machine GPU time slice, and a virtual machine GPU task execution time.
Optionally, the access request initiated by the virtual machine includes one or more of submitting a GPU task, mapping a GPU memory, accessing a GPU state, and setting the GPU state.
Optionally, the method is implemented by a program in the GPU driver that is closest to the hardware level.
A second aspect of the present disclosure provides a Hypervisor-based GPU virtualization apparatus, including:
the receiving module is used for receiving access requests sent by the virtual machines to the GPU control module;
the GPU control module is used for calculating the difference between the current quota ratio of each virtual machine and its preset quota ratio, and for selecting the access request with the largest quota ratio difference for authorization, so that the access request with the largest quota ratio difference is preferentially executed by the GPU.
A third aspect of the disclosure provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the first aspect.
A fourth aspect of the present disclosure provides an electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the method of the first aspect.
In the embodiments of the disclosure, the GPU time of each virtual machine is managed in a time-sharing manner by the GPU control module in the Hypervisor, and a time-sharing management algorithm guarantees the priority with which different virtual machines use the GPU. GPU virtualization is thus achieved while solving the prior-art problems of low efficiency, CPU time consumption, and lack of extensibility. The scheme of the embodiments of the disclosure is realized at the software level, that is, GPU virtualization is implemented in pure software inside the Hypervisor; it needs no hardware virtualization support and is therefore highly general. In addition, because the scheme is implemented in the Hypervisor and in the driver layer of the virtual machine operating system, user mode does not need to be modified: the existing user-mode ecosystem remains fully compatible, user-mode porting cost is reduced, the user-mode software development cycle is shortened, and performance can be optimal.
Furthermore, the scheme allows flexible configuration of resources such as the number of virtual machines and GPU priorities, so it is highly flexible. Apart from the interaction with the GPU control module of the Hypervisor there is no extra CPU overhead, CPU utilization is high, and performance can approach that of a physical GPU.
Additional features and advantages of the present disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a flowchart illustrating a Hypervisor-based GPU virtualization method in accordance with an exemplary embodiment;
FIG. 2 is a schematic diagram illustrating one possible implementation according to an exemplary embodiment;
FIG. 3 is a block diagram illustrating modules of one possible system according to an exemplary embodiment;
FIG. 4 is a schematic diagram illustrating another possible implementation according to an exemplary embodiment;
FIG. 5 is a block diagram illustrating modules of another possible system in accordance with an exemplary embodiment;
FIG. 6 is a block diagram illustrating a Hypervisor-based GPU virtualization apparatus in accordance with an exemplary embodiment;
FIG. 7 is a block diagram of an electronic device shown in accordance with an example embodiment.
Detailed Description
The following detailed description of the embodiments of the disclosure refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present disclosure, are given by way of illustration and explanation only, not limitation.
The embodiment of the disclosure provides a Hypervisor-based GPU virtualization method, which comprises the following steps as shown in FIG. 1.
Step 101, receiving access requests initiated by a plurality of virtual machines to a GPU control module. The access requests initiated by the virtual machines include, but are not limited to, submitting a GPU task, mapping GPU memory, accessing GPU state, setting GPU state, and the like.
Step 102, calculating a quota ratio difference for each of the virtual machines.
In the embodiments of the present disclosure, the quota ratio difference may be calculated from the GPU priority of the virtual machine together with other parameters of the virtual machine, including but not limited to the preset GPU time slice of the virtual machine and the GPU task execution time of the virtual machine. This ensures both that GPU tasks of high-priority virtual machines are executed first and that GPU tasks of low-priority virtual machines are not starved.
Step 103, selecting the access request with the largest quota ratio difference for authorization, so that the access request with the largest quota ratio difference is preferentially executed by the GPU.
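As an illustration only, the C sketch below shows one way the selection in steps 102 and 103 could be organized. The structure and function names (vm_state, gpu_pick_next) are hypothetical, and the current quota ratio is assumed here to be a virtual machine's accumulated GPU task execution time divided by the total GPU time consumed so far, which is one possible choice rather than the formula of the disclosure.

```c
/* Hedged sketch of the quota-ratio selection step (steps 102-103).
 * All names and the current-quota-ratio formula are illustrative assumptions. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct vm_state {
    double   preset_quota;    /* configured share of GPU time, e.g. 0.25 */
    uint64_t gpu_time_used;   /* accumulated GPU task execution time (us) */
    bool     request_pending; /* does this VM have an outstanding access request? */
};

/* Return the index of the pending request with the largest quota ratio
 * difference (preset quota ratio minus current quota ratio), or -1 if no
 * request is pending. */
static int gpu_pick_next(const struct vm_state *vms, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += vms[i].gpu_time_used;

    int best = -1;
    double best_diff = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (!vms[i].request_pending)
            continue;
        double current = total ? (double)vms[i].gpu_time_used / (double)total : 0.0;
        double diff = vms[i].preset_quota - current;   /* quota ratio difference */
        if (best < 0 || diff > best_diff) {
            best = (int)i;
            best_diff = diff;
        }
    }
    return best;
}
```

Under this assumed formula, a virtual machine that has so far received less than its preset share has a positive and growing difference while it waits, so a virtual machine configured with a larger preset quota (a higher GPU priority) is served first, while lower-priority virtual machines still accumulate a difference over time and are not starved, matching the guarantee described above.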
With this scheme, the GPU time of each virtual machine is managed in a time-sharing manner inside the Hypervisor, and the time-sharing management algorithm guarantees the priority with which different virtual machines use the GPU, so GPU virtualization is achieved while solving the prior-art problems of low efficiency, CPU time consumption, and lack of extensibility.
Compared with the prior art, the scheme of the embodiments of the disclosure is realized at the software level, requires no hardware virtualization support, and is highly general. Because it is implemented in the Hypervisor and in the driver layer of the virtual machine operating system, user mode does not need to be modified: the user-mode ecosystem remains fully compatible, porting cost is reduced, the user-mode software development cycle is shortened, and performance can be optimal. By modifying only the code in the native GPU driver that is closest to the hardware level, the characteristics of the native GPU are retained to the greatest extent.
Furthermore, the scheme of the embodiments of the disclosure allows flexible configuration of resources such as the number of virtual machines and GPU priorities, so it is highly flexible. Apart from the interaction with the GPU control module of the Hypervisor, there is no extra CPU overhead, CPU utilization is high, and performance can approach that of a physical GPU.
Next, a GPU virtualization method in the embodiment of the present disclosure is explained by two embodiments.
Example one
As shown in fig. 2 and 3, assuming that the virtual machine N needs to initiate an access request to the GPU control module, the method includes the following steps.
Step 201, virtual machine N wants to access the GPU and initiates an access request to the GPU control module of the Hypervisor;
Step 202, the GPU control module calculates the quota ratio differences of all requesting virtual machines, including virtual machine N, from the GPU priority of each virtual machine and its other parameters;
Step 203, the GPU control module of the Hypervisor selects the request with the largest quota ratio difference for authorization. Assuming the request with the largest quota ratio difference was submitted by virtual machine M, the GPU control module informs virtual machine M that it may access the GPU;
Step 204, virtual machine M receives the access-permission notification sent by the GPU control module of the Hypervisor and accesses the GPU;
Step 205, after virtual machine M finishes accessing the GPU, it notifies the GPU control module of the Hypervisor;
Step 206, if the GPU control module of the Hypervisor receives no such notification for a long time, the Hypervisor considers that virtual machine M has timed out, forcibly switches the GPU away from it, and resets the GPU.
In the embodiments of the present disclosure, the timeout duration can be configured: if the GPU control module does not observe virtual machine M's completion notification within the preset duration, it determines that virtual machine M has timed out while using the GPU.
The GPU control module then returns to step 203 to select the next access request.
In the embodiments of the disclosure, the GPU control module of the Hypervisor monitors the requests of all virtual machines and cleans up any virtual machine whose use of the GPU has timed out. The GPU request state of every virtual machine can therefore be checked promptly, timed-out or dead virtual machines are detected in time and removed from the queue, and safety is good.
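A possible shape of the control loop of Example one is sketched below, continuing the earlier sketch (struct vm_state, gpu_pick_next). The helper functions declared at the top and the timeout value are hypothetical stand-ins for whatever notification and reset facilities the Hypervisor actually provides; this is not the patented implementation.

```c
/* Hypothetical Hypervisor hooks, declared only to make the sketch self-contained. */
extern void     wait_for_any_request(struct vm_state *vms, size_t n);
extern void     notify_vm_access_granted(int vm);                  /* step 203 */
extern bool     wait_for_completion(int vm, uint64_t timeout_us);  /* step 205 */
extern void     cleanup_vm(int vm);                                /* step 206 */
extern void     gpu_reset(void);                                   /* step 206 */
extern uint64_t gpu_time_consumed(int vm);

#define GPU_ACCESS_TIMEOUT_US (50u * 1000u)   /* assumed preset timeout */

static void gpu_control_loop(struct vm_state *vms, size_t n)
{
    for (;;) {
        int target = gpu_pick_next(vms, n);   /* steps 202-203: pick the largest difference */
        if (target < 0) {                     /* no pending request: wait for one */
            wait_for_any_request(vms, n);
            continue;
        }

        notify_vm_access_granted(target);     /* step 203: authorize the target VM */
        vms[target].request_pending = false;

        /* Steps 205-206: wait for the completion notification; if it does not
         * arrive within the preset duration, treat the VM as timed out, clean
         * it up and reset the GPU before returning to the selection step. */
        if (!wait_for_completion(target, GPU_ACCESS_TIMEOUT_US)) {
            cleanup_vm(target);
            gpu_reset();
        }
        vms[target].gpu_time_used += gpu_time_consumed(target);
    }
}
```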
Example two
As shown in fig. 4 and 5, assuming that the virtual machine N needs to initiate an access request to the GPU control module, the method includes the following steps.
Step 401, virtual machine N wants to access the GPU; it encapsulates the access operation in an access request and submits the request to the GPU control module of the Hypervisor;
Step 402, the GPU control module calculates the quota ratio difference of virtual machine N from the GPU priority of the virtual machine and its other parameters;
Step 403, the GPU control module of the Hypervisor selects the request with the largest quota ratio difference for authorization. Assume this request was submitted by virtual machine X. Because every virtual machine encapsulates its access operation in the request it submits to the GPU control module of the Hypervisor, virtual machine X does not access the GPU itself; the GPU control module of the Hypervisor accesses the GPU on its behalf;
Step 404, the GPU control module informs virtual machine X of the access result.
In other embodiments, virtual machine X may instead, after being authorized, access the GPU itself to obtain the result.
The GPU control module then selects the next access request per step 403.
In the embodiments of the disclosure, every virtual machine must apply to the GPU control module of the Hypervisor before accessing the GPU, which prevents the working sequence of the GPU hardware from being disrupted.
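For Example two, the sketch below illustrates how the GPU control module might execute the access operation encapsulated in the winning request on the virtual machine's behalf. The operation codes, the gpu_request layout, and the hw_* and notify_vm_result helpers are assumptions made for illustration; they are not the actual interface of the disclosure or of any particular GPU driver.

```c
/* Hypothetical low-level GPU and notification hooks. */
extern int  hw_submit_task(void *payload);
extern int  hw_map_memory(void *payload);
extern int  hw_read_state(void *payload);
extern int  hw_set_state(void *payload);
extern void notify_vm_result(int vm_id, int result);   /* step 404 */

/* Kinds of access a virtual machine may encapsulate in its request
 * (cf. submitting a GPU task, mapping GPU memory, accessing or setting GPU state). */
enum gpu_op { GPU_SUBMIT_TASK, GPU_MAP_MEMORY, GPU_READ_STATE, GPU_SET_STATE };

struct gpu_request {
    int         vm_id;    /* requesting virtual machine */
    enum gpu_op op;       /* which kind of access is requested */
    void       *payload;  /* encapsulated task / mapping / state data */
};

/* Step 403: the GPU control module, not the virtual machine, touches the GPU;
 * step 404: the result is returned to the requesting virtual machine. */
static void gpu_execute_on_behalf(const struct gpu_request *req)
{
    int result;

    switch (req->op) {
    case GPU_SUBMIT_TASK: result = hw_submit_task(req->payload); break;
    case GPU_MAP_MEMORY:  result = hw_map_memory(req->payload);  break;
    case GPU_READ_STATE:  result = hw_read_state(req->payload);  break;
    case GPU_SET_STATE:   result = hw_set_state(req->payload);   break;
    default:              result = -1;                           break;
    }
    notify_vm_result(req->vm_id, result);
}
```

Because all four kinds of access pass through this single dispatch point, the GPU hardware only ever sees operations issued by the control module, which is what keeps its working sequence from being disturbed.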
Based on the same inventive concept, an embodiment of the present disclosure further provides a Hypervisor-based GPU virtualization apparatus 600. As shown in fig. 6, the apparatus includes: a receiving module 601, configured to receive access requests initiated by a plurality of virtual machines to a GPU control module; and a GPU control module 602, configured to calculate the quota ratio differences of the plurality of virtual machines, and to select the access request with the largest quota ratio difference for authorization, so that the access request with the largest quota ratio difference is preferentially executed by the GPU.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 7 is a block diagram of an electronic device 700 shown in accordance with an example embodiment. As shown in fig. 7, the electronic device 700 may include: a processor 701 and a memory 702. The electronic device 700 may also include one or more of a multimedia component 703, an input/output (I/O) interface 704, and a communication component 705.
The processor 701 is configured to control the overall operation of the electronic device 700, so as to complete all or part of the steps of the Hypervisor-based GPU virtualization method described above. The memory 702 is used to store various types of data to support operation of the electronic device 700, such as instructions for any application or method operating on the electronic device 700 and application-related data, for example contact data, transmitted and received messages, pictures, audio, video, and the like. The memory 702 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk. The multimedia components 703 may include a screen and audio components. The screen may be, for example, a touch screen, and the audio components are used for outputting and/or inputting audio signals. For example, the audio components may include a microphone for receiving external audio signals; the received audio signal may further be stored in the memory 702 or transmitted through the communication component 705. The audio components also include at least one speaker for outputting audio signals. The I/O interface 704 provides an interface between the processor 701 and other interface modules, such as a keyboard, a mouse, or buttons. These buttons may be virtual buttons or physical buttons. The communication component 705 is used for wired or wireless communication between the electronic device 700 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, 4G, NB-IoT, eMTC, 5G, or a combination thereof, which is not limited herein. Accordingly, the communication component 705 may include a Wi-Fi module, a Bluetooth module, an NFC module, and the like.
In an exemplary embodiment, the electronic device 700 may be implemented by one or more Application-Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field-Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the above-described Hypervisor-based GPU virtualization method.
In another exemplary embodiment, a computer-readable storage medium is also provided, which includes program instructions that, when executed by a processor, implement the steps of the Hypervisor-based GPU virtualization method described above. For example, the computer-readable storage medium may be the memory 702 described above, which includes program instructions executable by the processor 701 of the electronic device 700 to perform the Hypervisor-based GPU virtualization method described above.
In another exemplary embodiment, a computer program product is also provided, which contains a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-mentioned Hypervisor-based GPU virtualization method when executed by the programmable apparatus.
The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings. However, the present disclosure is not limited to the specific details of the above embodiments; various simple modifications may be made to the technical solution within the technical idea of the present disclosure, and all such simple modifications fall within its protection scope.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner; to avoid unnecessary repetition, the possible combinations are not described again.
In addition, the various embodiments of the present disclosure may be combined with one another in any way, and such combinations should likewise be regarded as part of this disclosure as long as they do not depart from its spirit.

Claims (9)

1. A hypervisor-based GPU virtualization method is characterized by comprising the following steps:
calculating a predetermined quota ratio for each virtual machine;
receiving access requests sent to a GPU control module by a plurality of virtual machines;
calculating the current quota ratio of each virtual machine, and selecting the access request with the largest quota ratio difference for authorization, so that the access request with the largest quota ratio difference is preferentially executed by the GPU;
the quota ratio of the access requests initiated by the virtual machines is obtained by calculation according to one or more of preset virtual machine GPU priority, preset virtual machine GPU time slices and virtual machine GPU task execution time; the quota ratio difference is a difference between a current quota ratio and a predetermined quota ratio.
2. The method of claim 1, wherein selecting the access request with the largest quota ratio difference for authorization such that the access request with the largest quota ratio difference is preferentially executed by the GPU comprises:
and informing a target virtual machine submitting the access request with the maximum quota ratio difference, and accessing the GPU by the target virtual machine so as to enable the access request with the maximum quota ratio difference to be preferentially executed by the GPU.
3. The method of claim 2, wherein the method further comprises:
the GPU control module monitors for a notification that the target virtual machine has finished accessing the GPU;
if the notification is not observed within a preset duration, determining that the target virtual machine has timed out while using the GPU; and after determining that the target virtual machine has timed out, cleaning up the target virtual machine and resetting the GPU.
4. The method of claim 1, wherein selecting the access request with the largest quota ratio difference for authorization so that the access request with the largest quota ratio difference is preferentially executed by the GPU comprises:
and accessing the GPU by the GPU control module based on the access operation packaged in the access request with the maximum quota ratio difference, so that the access request with the maximum quota ratio difference is preferentially executed by the GPU.
5. The method of claim 4, wherein the method further comprises:
the GPU control module sends the access result to the target virtual machine corresponding to the access request with the largest quota ratio difference; or
the target virtual machine corresponding to the access request with the largest quota ratio difference, after being authorized, accesses the GPU itself to obtain the access result.
6. The method of claim 1, wherein the virtual machine initiated access request comprises one or more of submitting a GPU task, mapping GPU memory, accessing a GPU state, setting a GPU state.
7. The method of any of claims 1-6, wherein the method is implemented by a program in the GPU driver that is closest to the hardware level.
8. A hypervisor-based GPU virtualization device, comprising:
the receiving module is used for receiving access requests sent by the virtual machines to the GPU control module;
the GPU control module is used for calculating quota ratio differences of the virtual machines; selecting the access request with the largest quota ratio difference for authorization, so that the access request with the largest quota ratio difference is preferentially executed by the GPU; the quota ratio of the access requests initiated by the virtual machines is calculated according to one or more of a preset virtual machine GPU priority, a preset virtual machine GPU time slice and a virtual machine GPU task execution time; the quota ratio difference is a difference between a current quota ratio and a predetermined quota ratio.
9. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to carry out the steps of the method of any one of claims 1 to 7.
CN202211243769.7A 2022-10-12 2022-10-12 Hypervisor-based GPU virtualization method and device and electronic equipment Active CN115328665B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211243769.7A CN115328665B (en) 2022-10-12 2022-10-12 Hypervisor-based GPU virtualization method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211243769.7A CN115328665B (en) 2022-10-12 2022-10-12 Hypervisor-based GPU virtualization method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN115328665A (en) 2022-11-11
CN115328665B (en) 2023-02-28

Family

ID=83913291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211243769.7A Active CN115328665B (en) 2022-10-12 2022-10-12 Hypervisor-based GPU virtualization method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN115328665B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107870800A (en) * 2016-09-23 2018-04-03 超威半导体(上海)有限公司 Virtual machine activity detection
CN110413412A (en) * 2019-07-19 2019-11-05 苏州浪潮智能科技有限公司 A kind of method and apparatus based on GPU cluster resource allocation
CN114647527A (en) * 2020-12-17 2022-06-21 辉达公司 Measuring and detecting idle processing periods and determining root causes thereof in cloud-based streaming applications
CN114816741A (en) * 2022-04-15 2022-07-29 咪咕文化科技有限公司 GPU resource management method, device and system and readable storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090160867A1 (en) * 2007-12-19 2009-06-25 Advance Micro Devices, Inc. Autonomous Context Scheduler For Graphics Processing Units
US9514507B2 (en) * 2011-11-29 2016-12-06 Citrix Systems, Inc. Methods and systems for maintaining state in a virtual machine when disconnected from graphics hardware
US9298490B2 (en) * 2012-12-20 2016-03-29 Vmware, Inc. Managing a data structure for allocating graphics processing unit resources to virtual machines
US9898794B2 (en) * 2014-06-19 2018-02-20 Vmware, Inc. Host-based GPU resource scheduling
US10572292B2 (en) * 2017-10-03 2020-02-25 Vmware, Inc. Platform independent GPU profiles for more efficient utilization of GPU resources
CN110098946B (en) * 2018-01-31 2021-09-03 华为技术有限公司 Method and device for deploying virtualized network element equipment
US11243799B2 (en) * 2019-08-30 2022-02-08 Advanced Micro Devices, Inc. Adaptive world switching
CN113282369B (en) * 2021-05-26 2023-05-12 上海仪电(集团)有限公司中央研究院 Virtual machine scheduling method, device, medium and equipment


Also Published As

Publication number Publication date
CN115328665A (en) 2022-11-11

Similar Documents

Publication Publication Date Title
EP3385835B1 (en) Method and apparatus for configuring accelerator
US20190258514A1 (en) I/O Request Scheduling Method and Apparatus
US9286130B2 (en) Optimizing virtual machine deployment time by temporarily allocating more processing resources during the initial deployment time of the virtual machine
US20120216212A1 (en) Assigning a portion of physical computing resources to a logical partition
CN106776067B (en) Method and device for managing system resources in multi-container system
US20190188030A1 (en) Terminal background application management method and apparatus
CN115988218B (en) Virtualized video encoding and decoding system, electronic equipment and storage medium
WO2019119315A1 (en) Input processing method and apparatus based on multiple operating systems, and electronic device
US20140164662A1 (en) Methods and apparatus for interleaving priorities of a plurality of virtual processors
JP2020053013A (en) Request processing method and device
CN116320469B (en) Virtualized video encoding and decoding system and method, electronic equipment and storage medium
US11379255B2 (en) Acceleration capacity adjustment method and apparatus for adjusting acceleration capacity of virtual machine
CN113037795B (en) Thin terminal system and processing method thereof
CN115904761A (en) System on chip, vehicle and video processing unit virtualization method
JP2015517159A (en) Method for controlling the use of hardware resources in a computer system, system and piece of code method
CN114900699A (en) Video coding and decoding card virtualization method and device, storage medium and terminal
CN104702534A (en) Method and device for processing data of multi-process sharing port
EP3358795B1 (en) Method and apparatus for allocating a virtual resource in network functions virtualization (nfv) network
US20140229940A1 (en) Methods and apparatus for synchronizing multiple processors of a virtual machine
CN115328665B (en) Hypervisor-based GPU virtualization method and device and electronic equipment
CN109154895B (en) Contextual data control
US9225818B2 (en) Mobile terminal
CN111796939A (en) Processing method and device and electronic equipment
US10528397B2 (en) Method, device, and non-transitory computer readable storage medium for creating virtual machine
JP6870390B2 (en) Resource allocation method, connection management server and connection management program in a system based on a virtual infrastructure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant