CN107003892B - GPU virtualization method, device and system, electronic equipment and computer program product - Google Patents
- Publication number
- CN107003892B (application CN201680002845.1A)
- Authority
- CN
- China
- Prior art keywords
- operating system
- shared memory
- graphics processing
- memory
- processing instruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45554—Instruction set architectures of guest OS and hypervisor or native processor differ, e.g. Bochs or VirtualPC on PowerPC MacOS
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45583—Memory management, e.g. access or allocation
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Stored Programmes (AREA)
- Digital Computer Display Output (AREA)
Abstract
The embodiments of the application provide a GPU virtualization method, apparatus, and system, an electronic device, and a computer program product. The method comprises the following steps: receiving a graphics processing operation at a first operating system and determining a corresponding graphics processing instruction according to the graphics processing operation; and transmitting the graphics processing instruction to a second operating system through a shared memory, where the shared memory is in a readable and writable state to both the first operating system and the second operating system. With the scheme of the embodiments of the application, virtualization of the GPU can be realized.
Description
Technical Field
The present application relates to computer technologies, and in particular, to a method, an apparatus, a system, an electronic device, and a computer program product for virtualizing a graphics processing unit (GPU).
Background
A virtualization architecture based on Qemu/KVM (Kernel-based Virtual Machine) technology is shown in fig. 1.
As shown in fig. 1, the virtualization architecture based on Qemu/KVM technology consists of a Host operating system and a plurality of virtualized Guest operating systems. The Host operating system comprises a number of Host user space programs and the Host Linux kernel. Each Guest operating system comprises its own user space, Guest Linux kernel, and Qemu. These operating systems run on the same hardware processor chip, sharing the processor and peripheral resources. An ARM processor supporting this virtualization architecture provides at least three modes: EL2, EL1, and EL0. The Hypervisor (virtual machine manager) program runs in EL2 mode; the Linux kernel programs run in EL1 mode; and user space programs run in EL0 mode. The Hypervisor layer manages hardware resources such as the CPU, memory, timers, and interrupts, and by virtualizing these resources it can load different operating systems onto the physical processor in a time-shared manner, thereby realizing system virtualization.
The KVM/Hypervisor module spans the Host Linux kernel and the Hypervisor layers. On one hand, it provides a driver node for the emulator Qemu, allowing Qemu to create virtual CPUs through the KVM node and to manage the virtualized resources; on the other hand, the KVM/Hypervisor can switch the Host Linux system off the physical CPU, load the Guest Linux system onto the physical processor for running, and handle the subsequent transactions when the Guest Linux system exits abnormally.
Qemu runs as an application of Host Linux and provides virtual hardware device resources for running Guest Linux. Through the KVM device node of the KVM/Hypervisor module, it creates virtual CPUs, allocates physical hardware resources, and loads an unmodified Guest Linux onto the physical hardware to run.
To realize this virtualization architecture on a terminal device such as a mobile phone or a tablet computer, the virtualization of all hardware devices needs to be solved, so that a virtualized operating system can use the real hardware devices. At present, however, there is no virtualization method for the Graphics Processing Unit (GPU).
Disclosure of Invention
The embodiments of the present application provide a GPU virtualization method, apparatus, and system, an electronic device, and a computer program product, which are used for realizing GPU virtualization.
According to a first aspect of embodiments of the present application, there is provided a virtualization method of a GPU, including: receiving a graphics processing operation at a first operating system and determining a corresponding graphics processing instruction according to the graphics processing operation; transmitting the graphic processing instruction to a second operating system through a shared memory; the shared memory is in a readable and writable state to both the first operating system and the second operating system.
According to a second aspect of the embodiments of the present application, there is provided a virtualization method of a GPU, including: obtaining a graphics processing instruction from a first operating system through a shared memory; executing the graphics processing instruction at the second operating system to obtain a processing result, and displaying the processing result as a response to the graphics processing operation, wherein the graphics processing operation is received at the first operating system; the shared memory is in a readable and writable state to both the first operating system and the second operating system.
According to a third aspect of the embodiments of the present application, there is provided a virtualization apparatus for a GPU, including: a first receiving module, configured to receive a graphics processing operation at the first operating system and determine a corresponding graphics processing instruction according to the graphics processing operation; and a first transmission module, configured to transmit the graphics processing instruction to the second operating system through a shared memory; the shared memory is in a readable and writable state to both the first operating system and the second operating system.
According to a fourth aspect of the embodiments of the present application, there is provided a virtualization apparatus for a GPU, including: the acquisition module is used for acquiring the graphic processing instruction from the first operating system through the shared memory; an execution module, configured to execute the graphics processing instruction at the second operating system to obtain a processing result, and display the processing result as a response to a graphics processing operation, where the graphics processing operation is received at the first operating system; the shared memory is in a readable and writable state to both the first operating system and the second operating system.
According to a fifth aspect of the embodiments of the present application, there is provided a virtualization system of a GPU, including: a first operating system comprising the virtualization apparatus of a GPU according to the third aspect of the embodiments of the present application; a shared memory, configured to store the graphics processing instruction from the first operating system and the processing result from the second operating system, where the shared memory is in a readable and writable state to the first operating system and the second operating system; and a second operating system comprising the virtualization apparatus of the GPU according to the fourth aspect of the embodiments of the present application.
According to a sixth aspect of embodiments of the present application, there is provided an electronic apparatus, including: a display, a memory, one or more processors; and one or more modules stored in the memory and configured to be executed by the one or more processors, the one or more modules including instructions for performing the steps of the method for virtualizing a GPU of the first aspect of the embodiments of the present application.
According to a seventh aspect of embodiments of the present application, there is provided an electronic apparatus, including: a display, a memory, one or more processors; and one or more modules stored in the memory and configured to be executed by the one or more processors, the one or more modules including instructions for performing the steps of the method for virtualizing a GPU of the second aspect of the embodiments of the present application.
According to an eighth aspect of embodiments herein, there is provided a computer program product for use in conjunction with an electronic device including a display, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising instructions for performing the steps of the virtualization method of the GPU of the first aspect of embodiments herein.
According to a ninth aspect of embodiments herein, there is provided a computer program product for use in conjunction with an electronic device comprising a display, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising instructions for performing the steps of the method for virtualizing a GPU of the second aspect of embodiments herein.
With the GPU virtualization method, apparatus, and system, the electronic device, and the computer program product described above, graphics processing instructions and execution results are transferred through the shared memory between the first operating system and the second operating system, thereby realizing virtualization of the GPU.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a diagram of a virtualization architecture based on Qemu/KVM technology;
FIG. 2 is a schematic diagram of a system architecture for implementing the virtualization method of the GPU in the embodiments of the present application;
FIG. 3 is a flowchart illustrating a method for virtualizing a GPU according to the first embodiment of the present application;
FIG. 4 is a flowchart illustrating a method for virtualizing a GPU according to the second embodiment of the present application;
FIG. 5 is a flowchart illustrating a method for virtualizing a GPU according to the third embodiment of the present application;
FIG. 6 is a schematic structural diagram illustrating a virtualization apparatus of a GPU according to the fourth embodiment of the present application;
FIG. 7 is a schematic structural diagram illustrating a virtualization apparatus of a GPU according to the fifth embodiment of the present application;
FIG. 8 is a schematic structural diagram illustrating a virtualization system of a GPU according to the sixth embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device according to the seventh embodiment of the present application;
FIG. 10 is a schematic structural diagram of an electronic device according to the eighth embodiment of the present application.
Detailed Description
In the process of implementing the present application, the inventors found that realizing the virtualization architecture on a terminal device such as a mobile phone or a tablet computer requires solving the virtualization of all hardware devices, so that a virtualized operating system can use the real hardware devices. It is therefore desirable to provide a method for virtualizing a GPU.
In view of the foregoing problems, embodiments of the present application provide a GPU virtualization method, apparatus, and system, an electronic device, and a computer program product, in which graphics processing instructions and execution results are transferred through a shared memory between a first operating system and a second operating system, thereby realizing virtualization of the GPU.
The scheme in the embodiments of the present application can be applied to various scenarios, for example, intelligent terminals and Android emulators that adopt a virtualization architecture based on Qemu/KVM technology.
The scheme in the embodiment of the present application may be implemented by using various computer languages, for example, object-oriented programming language Java, and the like.
In order to make the technical solutions and advantages of the embodiments of the present application more apparent, the following further detailed description of the exemplary embodiments of the present application with reference to the accompanying drawings makes it clear that the described embodiments are only a part of the embodiments of the present application, and are not exhaustive of all embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
Example one
Fig. 2 shows a system architecture for implementing the virtualization method of the GPU in the embodiment of the present application. As shown in fig. 2, the GPU virtualization system according to the embodiment of the present application includes a first operating system 201, a second operating system 202, and a shared memory 203. In particular, the first operating system may be a Guest operating system; the second operating system may be a Host operating system. It should be understood that, in implementation, the first operating system may also be a Host operating system, and the second operating system may also be a Guest operating system, which is not limited in this application.
Next, a detailed description will be given of a specific embodiment of the present application, taking as an example that the first operating system is a Guest operating system and the second operating system is a Host operating system.
Specifically, the Guest operating system may include a user space 2011, a Guest Linux Kernel 2012, and a Qemu 2013. A virtual graphics program interface exists in the user space of the Guest operating system; specifically, the graphics program interface may be the OpenGL (Open Graphics Library) API (Application Program Interface), or another graphics program interface such as Direct3D or QuickDraw 3D, which is not limited in this application.
Specifically, the Host operating system may include a user space 2021 and a Host Linux Kernel 2022; a graphics program Backend Server corresponding to a graphics program interface in a Guest operating system can be installed in a user space of a Host operating system, and specifically can be an OpenGL Backend Server; the back-end server may operate GPU device 204 via a GPU driver in the Host Linux Kernel.
Specifically, the shared memory 203 is a memory region visible to both the Guest operating system and the Host operating system, and it is in a readable and writable state for both of them; that is, both the Guest operating system and the Host operating system can perform read and write operations on the shared memory.
In a specific implementation, the shared memory 203 may include only the first storage area 2031, or it may be divided into a first storage area 2031 and a second storage area 2032. The first storage area may also be referred to as private memory, and the second storage area as public memory. There is no fixed rule for dividing the first and second storage areas: they may be sized according to the amount of data typically stored in each, based on the designer's experience, or divided according to another preset strategy, which is not limited in this application.
Specifically, the first storage area may be used for transmitting functions and parameters, and/or synchronization information, between each thread of the Guest operating system and the Backend Server threads. The private memory may further be divided into a plurality of blocks, where one block is defined as a channel and one channel corresponds to one thread of the Guest operating system. When the private memory is divided into a plurality of channels, the channels may be of equal size, or they may be sized according to the typical size of the GPU functions, parameters, and/or synchronization information used by common threads in the system, which is not limited in this application. In a specific implementation, the user program of the Guest operating system can dynamically manage the channels in the private memory; that is, the user program can allocate, reallocate, and release channels in the private memory at any time.
Specifically, the second storage area may be used for transmitting large data blocks, e.g., graphics content data, between all threads of the Guest operating system and the Backend Server threads. In one implementation, the public memory may be divided into a number of blocks of unequal size. Specifically, the user program in the Guest operating system may manage the blocks in the public memory; that is, the user program may allocate and release blocks in the public memory at any time, with each allocation and release handled as a whole block.
In a specific implementation, the sizing of the blocks in the public memory may be adapted to commonly used GPU graphics processing data. For example, in the process of implementing the present application, the developers found that in the GPU virtualization process the first operating system generally transfers graphics content data of about 2M to 16M to the second operating system, which meets the GPU graphics virtualization processing requirement; therefore, when sizing the blocks in the public memory, the public memory may be divided into memory blocks of, for example, 2M, 4M, 8M, and 16M.
For example, if the total public memory size is 32M and it is divided into memory blocks such as 2M, 4M, 8M, and 16M, then when a user program applies for a 3M space, the 4M memory block may be directly allocated to the corresponding thread, and a free flag may be set on the 4M block when the thread releases it.
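The whole-block, size-class allocation just described can be sketched as a tiny best-fit allocator. This is only an illustration: the `PublicMemory` name, the data structure, and the exact block sizes are assumptions, not details prescribed by the patent.

```python
# Hypothetical sketch of the size-class allocation described above: the
# public memory is pre-divided into fixed blocks (2M, 4M, 8M, 16M), and a
# request is served by the smallest free block that fits.  Release is
# whole-block, matching the description in the text.

M = 1024 * 1024

class PublicMemory:
    def __init__(self, block_sizes=(2 * M, 4 * M, 8 * M, 16 * M)):
        # Blocks are laid out back to back; each tracks its offset and state.
        self.blocks = []
        offset = 0
        for size in block_sizes:
            self.blocks.append({"offset": offset, "size": size, "free": True})
            offset += size

    def alloc(self, nbytes):
        """Return the offset of the smallest free block that fits, or None."""
        candidates = [b for b in self.blocks if b["free"] and b["size"] >= nbytes]
        if not candidates:
            return None
        best = min(candidates, key=lambda b: b["size"])
        best["free"] = False
        return best["offset"]

    def free(self, offset):
        # Whole-block release: just set the free flag again.
        for b in self.blocks:
            if b["offset"] == offset:
                b["free"] = True
                return

mem = PublicMemory()
off = mem.alloc(3 * M)  # a 3M request lands in the 4M block
```

Under this layout the 4M block starts right after the 2M block, so the 3M request is served at offset 2M, mirroring the example in the text.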
It should be understood that, for purposes of example, only one Guest operating system, one Host operating system, and one shared memory are shown in FIG. 2. In a specific implementation, however, there may be one or more Guest operating systems, one or more Host operating systems, and one or more shared memories; that is, the Guest operating systems, Host operating systems, and shared memories may each be of any number, which is not limited in this application.
It should be understood that the shared memory shown in FIG. 2 includes both private and public memory storage areas for exemplary purposes, with the private memory divided into 3 channels of equal size and the public memory divided into 4 blocks of unequal size. In a specific implementation, the shared memory may be a memory area containing only private memory; the private memory may be left undivided, or divided into channels of unequal size; and the public memory may be absent, or divided into channels of equal size, and so on, none of which is limited in this application.
Next, a method for virtualizing a GPU according to an embodiment of the present application will be described with reference to the system architecture shown in fig. 2.
Fig. 3 is a flowchart illustrating a virtualization method of a GPU according to a first embodiment of the present application. In the first embodiment of the present application, the steps of the GPU virtualization method using the Guest operating system as the execution subject are described. As shown in fig. 3, the virtualization method of the GPU according to the embodiment of the present application includes the following steps:
S301, receiving a graphics processing operation at the Guest operating system, and determining a corresponding graphics processing instruction according to the graphics processing operation.
In specific implementation, before S301, the shared memory corresponding to the GPU device may be created when Qemu corresponding to the Guest system is started. Specifically, Qemu may create a corresponding shared memory through a system call. Specifically, a specific block of address space may be partitioned from memory as shared memory for the GPU device. The size of the shared memory can be set by a developer and adapted to the GPU. For example, the shared memory corresponding to the GPU device may be set to 128M, and the like, which are not limited in this application.
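As a rough model of this creation step (not Qemu's actual implementation, which would use a host system call during guest startup), an anonymous shared mapping of the example 128M size can be created and then read and written from both sides. The `GPU_SHMEM_SIZE` value is the example figure from the text; the tag bytes are invented for the demonstration.

```python
# Minimal sketch: carve out a fixed-size shared memory region for the
# virtual GPU device.  Python's anonymous mmap stands in for the system
# call Qemu would make; pages are only committed when touched.
import mmap

GPU_SHMEM_SIZE = 128 * 1024 * 1024  # 128M, the example size from the text

shmem = mmap.mmap(-1, GPU_SHMEM_SIZE)  # anonymous shared mapping

shmem.seek(0)
shmem.write(b"GLES")   # one side writes...
shmem.seek(0)
tag = shmem.read(4)    # ...the other side reads the same bytes back
```

In the real architecture this region would then be exposed to the Guest as PCI device memory, as the following paragraphs describe.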
It should be understood that when there are multiple Guest systems, a shared memory may be created for the GPU by the Qemu of each Guest system, or a shared memory corresponding to the GPU may be shared by the multiple Guest systems; this is not a limitation of the present application.
Qemu further maps the shared memory to a PCI (Peripheral Component Interconnect) device memory space of the Guest system; and provides a virtual PCI register for the Guest system as a PCI configuration space.
Then, Guest Linux Kernel divides the shared memory into private memory and public memory.
Specifically, the Guest Linux Kernel can divide the shared memory when the GPU device is initialized, so that the shared memory supports access by multiple processes or threads. Specifically, the private memory, i.e., the first storage area, may be divided into a first preset number of channels, and the public memory, i.e., the second storage area, may be divided into a second preset number of blocks. The first and second preset numbers may be set by developers. The channels of the private memory may be of equal size, and the sizes of the blocks of the public memory may be adapted to the processing data of the physical device corresponding to the shared memory.
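One way to picture this kernel-side division is as a simple layout computation: a private area split into equally sized channels, followed by the public area. All counts and sizes below are invented for the sketch; the patent leaves them to the developer.

```python
# Illustrative layout for the initialization step described above: the
# first `private_size` bytes become n_channels equal channels, and the
# remainder is the public area.

def divide_shared_memory(total_size, private_size, n_channels):
    """Return (channel_offsets, channel_size, public_offset, public_size)."""
    assert private_size <= total_size and private_size % n_channels == 0
    channel_size = private_size // n_channels
    channel_offsets = [i * channel_size for i in range(n_channels)]
    return channel_offsets, channel_size, private_size, total_size - private_size

M = 1024 * 1024
# Example: a 128M region with a 32M private area split into 8 channels.
offsets, chan_size, pub_off, pub_size = divide_shared_memory(128 * M, 32 * M, 8)
```

Each channel offset would then be handed to one Guest thread, while the public area beyond `pub_off` is shared by all threads.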
Further, before S301, a step of allocating corresponding shared memory address spaces to the front-end thread and the corresponding back-end thread when the front-end thread is started may also be included.
In a specific implementation, when an API call instruction is received, a front-end thread corresponding to the API call instruction may be created, and a thread creation instruction corresponding to the API call instruction may be sent to the Host operating system to trigger the Host operating system to create the corresponding back-end thread. In the process of creating the front-end and back-end threads, the address space of the private memory channel corresponding to the front-end thread, together with the public memory address space allocated to it, can be acquired from the Guest Linux Kernel and mapped into the address space of the front-end thread, thereby establishing a synchronization control channel with Qemu. Specifically, one channel in the private memory is typically allocated to the front-end thread, while the public memory is allocated to the front-end thread in its entirety.
Next, the address space of the private memory channel corresponding to the front-end thread and the address space of the public memory may be transmitted to Qemu through the PCI configuration space; then Qemu sends the address space of the private memory channel corresponding to the front-end thread and the address space of the public memory to a back-end server through an interprocess communication mechanism; and maps it to the address space of the back-end thread.
At this point, the initialization of the shared memory between the front-end thread and the back-end thread is completed.
In a specific implementation, a user typically performs a graphics processing operation on a thread in the Guest operating system; the operation may be, for example, opening a new window or opening a new page. It should be understood that, before this step, a step in which the user creates a new thread in the user space of the Guest operating system may be included. In a specific implementation, the new thread may belong to an application, e.g., QQ or WeChat, and the action of creating a new thread may be, for example, the user opening WeChat.
Specifically, the first storage area may be divided into one or more channels. If the first storage area includes a plurality of channels, then before the graphics processing instruction is written into the shared memory, the method further includes: determining the channel corresponding to the graphics processing instruction according to the thread corresponding to the graphics processing instruction.
When a user creates a new thread in the user space of the Guest operating system, a channel of the first storage area may be allocated to the thread according to a preset rule, for example in the order in which threads are created. When a new thread is created, the Guest Linux kernel allocates a unique channel number to the thread and maps the private memory corresponding to that channel number, together with the whole public memory, into the user program; the Guest user program then notifies the OpenGL Backend Server through Qemu to create a corresponding thread and to map the same private memory channel and the whole public memory space into that thread. It should be understood that, in implementation, if the private memory has only one channel, the step of assigning a channel number may be omitted, and the step of determining the channel corresponding to the graphics processing instruction according to its thread need not be performed.
S302, the graphics processing instruction is transferred to a second operating system through a shared memory, so that the second operating system executes the graphics processing instruction to obtain a processing result.
In a specific implementation, transferring the graphics processing instruction to the second operating system through the shared memory may be implemented as follows: writing the graphics processing instruction into the shared memory, and sending the offset address of the graphics processing instruction in the shared memory to the second operating system. Specifically, the Guest user program may keep an offset record for each allocated memory block, i.e., record the offset address at which the graphics processing instruction is currently written within the memory block corresponding to the current thread, and then send the offset address of the current memory block to the corresponding thread in the Host operating system. The Host operating system can then read the graphics processing instruction from the corresponding position of the shared memory using the channel number and the offset address, and immediately execute the function to obtain a processing result.
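A minimal sketch of this offset handoff follows. The length-prefixed framing, the channel size, and the function names are assumptions for illustration; the patent does not specify a wire format, only that the guest writes into its channel and passes the offset to the host.

```python
# Both sides map the same region; the guest writes an instruction into its
# channel and hands over only (channel_number, offset), from which the
# host locates and reads the bytes back.
import mmap
import struct

CHANNEL_SIZE = 4096
shared = mmap.mmap(-1, 4 * CHANNEL_SIZE)   # 4 private channels, shared map

def guest_write(channel, offset, payload):
    base = channel * CHANNEL_SIZE
    shared.seek(base + offset)
    shared.write(struct.pack("<I", len(payload)) + payload)
    return offset + 4 + len(payload)        # next free offset in the channel

def host_read(channel, offset):
    base = channel * CHANNEL_SIZE
    shared.seek(base + offset)
    (length,) = struct.unpack("<I", shared.read(4))
    return shared.read(length)

next_off = guest_write(1, 0, b"glClear(0x4000)")  # guest side
cmd = host_read(1, 0)                              # host side
```

Because only the pair (channel, offset) crosses between the systems, the instruction bytes themselves are never copied out of the shared region.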
In a first embodiment, the graphics processing instructions may include only graphics processing functions and parameters; the graphics processing function and parameters may be stored to a first storage area of the shared memory, i.e., private memory. After acquiring the corresponding graphic processing function and parameter, the Host operating system can immediately execute the function to obtain a processing result. Specifically, in order to save data transmission amount, a number corresponding to the graphics processing function may be determined first; the graphics processing function number and parameters are then written to the first memory area. After the Host operating system acquires the corresponding graphics processing function number, the corresponding graphics processing function is determined according to the number, and then the function is executed according to the graphics processing function and the parameters, so that a processing result is obtained. In particular, the graphics processing function may be an OpenGL function.
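The function-numbering optimization above could look like the following sketch. The function table, the IDs, and the packed-float argument encoding are invented for illustration; real traffic would carry properly typed parameters per function.

```python
# Map function names to small integer IDs so that only (id, packed args)
# travels through the private memory, saving transmission volume.
import struct

FUNC_IDS = {"glClearColor": 1, "glViewport": 2}   # illustrative table
FUNC_NAMES = {v: k for k, v in FUNC_IDS.items()}

def encode(func_name, *args):
    # u32 function number followed by float32 arguments.
    return struct.pack("<I", FUNC_IDS[func_name]) + struct.pack(
        f"<{len(args)}f", *args)

def decode(buf):
    # The host recovers the function from its number, then the arguments.
    (fid,) = struct.unpack("<I", buf[:4])
    n = (len(buf) - 4) // 4
    args = struct.unpack(f"<{n}f", buf[4:])
    return FUNC_NAMES[fid], list(args)

name, args = decode(encode("glClearColor", 0.0, 0.0, 0.0, 1.0))
```

On the host side, `decode` corresponds to looking up the graphics processing function by its number before executing it with the unpacked parameters.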
In a second specific embodiment, the graphics processing instruction includes, in addition to the graphics processing function and the parameter, synchronization information for indicating a time at which the second operating system executes the graphics processing instruction; the graphics processing functions and parameters, as well as synchronization information, may be stored to a first storage area of the shared memory, i.e., private memory. After acquiring the corresponding graphic processing function and parameter, the Host operating system may execute the function at the time indicated by the synchronization information to obtain a processing result.
In a third specific embodiment, the graphics processing instruction includes graphics content data in addition to the graphics processing function and parameters; the graphics processing function and parameters may be stored in the private memory of the shared memory, and the graphics content data written into a second storage area, i.e., the public memory. The Guest user program sends both the offset address within the private memory block and the offset address within the public memory block to the corresponding thread in the Host operating system. The Host operating system can then locate the corresponding position of the private memory through the channel number and the private memory offset address and read the graphics processing function and parameters from it, read the graphics content data from the corresponding position of the public memory through the public memory offset address, and immediately execute the function after reading to obtain a processing result. Specifically, the graphics content data may be an image frame on which image processing is to be performed.
Specifically, the public memory may be further divided into a plurality of blocks whose sizes are adapted to GPU graphics content data. If the second storage area includes a plurality of blocks, the method may further include, before writing the graphics content data into the second storage area: determining the block corresponding to the graphics content data according to the size of the graphics content data.
For example, if the total public memory size is 32M and it is divided into five memory blocks of 2M, 2M, 4M, 8M, and 16M, then when the user program requests to transfer 3M of graphics content data, the 4M public memory block may be directly allocated to the corresponding thread.
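The selection rule in the example above (allocate the smallest free block that fits the request, release whole blocks) can be sketched as follows. The block sizes come from the example; the function names and the free-flag list are illustrative assumptions.

```python
BLOCKS = [2, 2, 4, 8, 16]  # block sizes in MB; total 32M, as in the example
free = [True] * len(BLOCKS)

def allocate(size_mb):
    """Return the index of the smallest free block that fits, or None."""
    candidates = [i for i, s in enumerate(BLOCKS) if free[i] and s >= size_mb]
    if not candidates:
        return None
    best = min(candidates, key=lambda i: BLOCKS[i])
    free[best] = False
    return best

def release(i):
    """Whole-block release: simply set the block's free flag."""
    free[i] = True
```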
In a fourth specific embodiment, the graphics processing instruction includes graphics content data in addition to the graphics processing function, parameters, and synchronization information; the graphics processing function, parameters, and synchronization information may be stored in the private memory of the shared memory, and the graphics content data written into the second storage area, i.e., the public memory. The Guest user program sends both the offset address within the private memory block and the offset address within the public memory block to the corresponding thread in the Host operating system. The Host operating system can then locate the corresponding position of the private memory through the channel number and the private memory offset address and read the graphics processing function, parameters, and synchronization information from it, read the graphics content data from the corresponding position of the public memory through the public memory offset address, and execute the function at the time indicated by the synchronization information to obtain a processing result.
It should be appreciated that, in specific implementations, the shared memory may be used one or more times between the first operating system and the second operating system to transfer any one or more of the following: the graphics processing function or its number, the parameters, the synchronization information, and the graphics content data. Specifically, the first operating system may transfer the graphics processing instruction to the second operating system through the shared memory in a single pass, or may split the instruction into suitably sized pieces and transfer them through the shared memory in multiple passes; the splitting policy for the graphics processing instruction may adopt common technical means known to those skilled in the art, and the present application does not limit this.
S303, the second operating system displays the processing result as a response to the graphics processing operation.
In specific implementation, after the second operating system obtains the processing result, the processing result may be displayed on a screen through the GPU device.
S304, the first operating system receives the execution result from the second operating system.
In specific implementation, the second operating system may generate an execution result according to the outcome of executing the function. Specifically, the execution result may include a message indicating that execution of the graphics processing function succeeded or failed, and/or software version information, and the like, and is returned to the first operating system, so that the corresponding thread in the first operating system can obtain the execution result of the function.
Specifically, the Host operating system may write the execution result into the shared memory; record the offset address, within the memory block corresponding to the current thread, at which the execution result is currently written; and then send the offset address to the corresponding thread in the Guest operating system. The Guest operating system can then read the data from the corresponding position of the shared memory through the offset address.
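The result write-back just described uses the same offset-passing style as the forward path. A minimal illustrative sketch, in which the one-word status layout and all names are assumptions:

```python
import struct

# Slice of a channel reserved for execution results (size illustrative).
result_mem = bytearray(256)

def host_write_result(ok, offset=0):
    """Host side: write a success/failure word and return its offset address."""
    struct.pack_into("<I", result_mem, offset, 1 if ok else 0)
    return offset  # sent back to the corresponding Guest thread

def guest_read_result(offset):
    """Guest side: read the status word at the received offset address."""
    (status,) = struct.unpack_from("<I", result_mem, offset)
    return status == 1
```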
In this way, remote calls by a user program in the Guest operating system to the GPU device are realized; that is, virtualization of the GPU is achieved.
By adopting the GPU virtualization method in the embodiment of the application, remote calls of the OpenGL API are realized on the basis of the shared memory, thereby realizing virtualization of the GPU.
Example two
Fig. 4 is a flowchart illustrating a virtualization method of a GPU according to a second embodiment of the present application. In the second embodiment of the present application, the steps of the GPU virtualization method using the Host operating system as the execution subject are described. The system architecture in the embodiment of the present application can be implemented with reference to the system architecture shown in fig. 2 in the first embodiment, and repeated descriptions are omitted here.
As shown in fig. 4, the virtualization method of the GPU according to the embodiment of the present application includes the following steps:
S401, the Host operating system obtains the graphics processing instruction from the Guest operating system through the shared memory.
In specific implementation, the shared memory may be divided into a private memory and a public memory, and the private memory may be further divided into a plurality of channels corresponding to different threads. If the private memory includes a plurality of channels, the method further includes, before S401: determining the channel corresponding to the graphics processing instruction according to the thread corresponding to the graphics processing instruction.
Specifically, when a user creates a new thread in the user space of the Guest operating system, a channel of the corresponding first storage area may be allocated to the thread according to a preset rule. Specifically, the rule may follow the order in which threads are created. For example, when a new thread is created, the Guest Linux kernel allocates a unique channel number to the thread and simultaneously maps the private memory corresponding to the channel number and the whole public memory into the user program; the Guest user program then notifies the OpenGL backend Server through Qemu to create a thread, and maps the corresponding private memory channel number and the whole public memory space to that thread. It should be understood that, in implementation, if the private memory has only one channel, the step of assigning a channel number may be omitted, as may the step of determining the channel corresponding to the graphics processing instruction according to the thread corresponding to the graphics processing instruction.
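The channel-number allocation in thread-creation order might be sketched as follows; the function name and the mapping table are hypothetical, standing in for the Guest Linux kernel bookkeeping described above.

```python
import itertools

_next_channel = itertools.count()  # kernel-side counter: channels in creation order
thread_channel = {}                # thread id -> unique channel number

def on_thread_create(tid):
    """Allocate a unique channel number to a newly created thread
    (idempotent: an existing thread keeps its channel)."""
    if tid not in thread_channel:
        thread_channel[tid] = next(_next_channel)
    return thread_channel[tid]
```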
In specific implementation, the Guest operating system may send the offset address of the graphics processing instruction in the shared memory to the Host operating system, and the Host operating system reads the graphics processing instruction from the shared memory according to that offset address.
In a first specific embodiment, the graphics processing instruction includes only the graphics processing function and parameters; the Host operating system may acquire the corresponding graphics processing function and parameters from the private memory. If what is acquired from the private memory is the number of the graphics processing function, the corresponding graphics processing function can first be determined according to the number, thereby obtaining the graphics processing function and the parameters.
In a second specific embodiment, the graphics processing instruction includes, in addition to the graphics processing function and the parameters, synchronization information indicating the time at which the second operating system is to execute the graphics processing instruction; the Host operating system can acquire the corresponding graphics processing function, parameters, and synchronization information from the private memory.
In a third specific embodiment, the graphics processing instruction includes graphics content data in addition to the graphics processing function and parameters; the Host operating system can locate the corresponding position of the private memory through the channel number and the private memory offset address and read the graphics processing function and parameters from it, and read the graphics content data from the corresponding position of the public memory through the public memory offset address.
In a fourth specific embodiment, the graphics processing instruction includes graphics content data in addition to the graphics processing function, parameters, and synchronization information; the Host operating system can locate the corresponding position of the private memory through the channel number and the private memory offset address and read the graphics processing function, parameters, and synchronization information from it, and read the graphics content data from the corresponding position of the public memory through the public memory offset address.
S402, the Host operating system executes the graphics processing instruction to obtain a processing result.
In specific implementation, if the graphics processing instruction includes synchronization information, the Host operating system may, after acquiring the graphics processing instruction, execute the graphics processing function with the parameters at the time indicated by the synchronization information to obtain a processing result.
In specific implementation, if the graphics processing instruction does not include synchronization information, the Host operating system can execute the graphics processing function with the parameters immediately after acquiring the graphics processing instruction to obtain a processing result.
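The two dispatch behaviours just described (immediate execution when no synchronization information is present, deferred execution otherwise) can be sketched as follows. The sync-point representation and all names are illustrative assumptions.

```python
pending = []  # deferred calls: (sync_point, function, args)

def host_dispatch(func, args, sync_point=None):
    """Execute immediately when there is no synchronization info;
    otherwise queue the call until the indicated sync point is reached."""
    if sync_point is None:
        return func(*args)
    pending.append((sync_point, func, args))
    return None

def reach_sync_point(point):
    """Run every deferred call whose indicated time has arrived."""
    results = [f(*a) for (p, f, a) in pending if p <= point]
    pending[:] = [(p, f, a) for (p, f, a) in pending if p > point]
    return results
```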
S403, the Host operating system displays the processing result as a response to the graphics processing operation received at the first operating system.
In specific implementation, the process of displaying the function processing result by the Host operating system may adopt a conventional technical means of those skilled in the art, which is not described in detail herein.
S404, the Host operating system transfers the execution result to the first operating system through the shared memory.
In specific implementation, after obtaining the processing result, the Host operating system may write the execution result of the function, for example a message identifying whether execution of the function succeeded or failed, into the shared memory, and send the offset address of that message in the shared memory to the first operating system, so that the first operating system obtains the function execution result according to the offset address.
In this way, the Host operating system cooperates with the user program in the Guest operating system to realize remote calls to the GPU device; that is, virtualization of the GPU is achieved.
By adopting the GPU virtualization method in the embodiment of the application, remote calls of the OpenGL API are realized on the basis of the shared memory, thereby realizing virtualization of the GPU.
EXAMPLE III
Fig. 5 is a flowchart illustrating a virtualization method of a GPU according to a third embodiment of the present application. In the third embodiment of the present application, a step of implementing a GPU virtualization method by using an OpenGL graphics processing interface as an example and matching a Guest operating system with a Host operating system is described. The system architecture in the embodiment of the present application can be implemented with reference to the system architecture shown in fig. 2 in the first embodiment, and repeated descriptions are omitted here.
In the embodiment of the application, the initiator of an OpenGL API remote call is the Guest operating system, and the executor of the function is the Host operating system. The downlink synchronization process from the Guest operating system to the Host operating system passes through the Guest Linux kernel and Qemu to reach the OpenGL backend Server; the uplink synchronization process from the Host operating system to the Guest operating system starts from the OpenGL backend Server and reaches the OpenGL frontend API through Qemu and the Guest Linux kernel.
In the embodiment of the application, each time the Guest operating system creates a new display window, a thread is correspondingly created to initialize and call OpenGL functions; during initialization, the OpenGL backend Server also creates threads in one-to-one correspondence with the Guest-side threads.
Next, a detailed description will be given of an implementation process of the GPU virtualization method based on the above application scenario.
As shown in fig. 5, the virtualization method of the GPU according to the third embodiment of the present application includes the following steps:
s501, shared memory initialization.
In a specific implementation, the Guest Linux kernel may divide the shared memory into two blocks, which are respectively defined as a private memory and a public memory.
Specifically, the private memory may be divided evenly into a plurality of equal-size blocks, each block being a channel; each channel is used for transferring data and synchronization information from one thread of the Guest operating system to the corresponding OpenGL backend Server thread. In particular, the data may include the graphics processing function number and parameters.
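Because the channels are equal-size slices of the private memory, a channel's base address follows directly from its number. A minimal sketch with illustrative sizes (the constants and function name are assumptions, not from the patent):

```python
PRIVATE_BASE = 0          # start of the private memory within the shared region
CHANNEL_SIZE = 64 * 1024  # equal-size channels, one per Guest thread

def channel_base(channel_number):
    """Base address of a channel inside the private memory."""
    return PRIVATE_BASE + channel_number * CHANNEL_SIZE
```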
Specifically, the public memory may be divided into several large blocks of unequal size, used by all threads of the Guest operating system for bulk transfers to the OpenGL backend Server threads.
S502, establishing the mapping between the shared memory and the thread.
In specific implementation, the Guest Linux kernel manages the private channel numbers: each time the Guest user program creates a new thread, the kernel is responsible for allocating a unique channel number and simultaneously mapping the private memory corresponding to the channel and the whole public memory into the user program.
The Guest user program then notifies the OpenGL backend Server through Qemu to create a thread that uses the corresponding private channel memory and the whole public memory space.
The Guest user program dynamically manages the private channel memory, and can allocate, reallocate, and free space within the private channel memory at any time.
The Guest user program manages the public memory in fixed-size units, with each allocation and release handled in whole blocks. For example, if the total public memory size is 32M, it is divided into five memory blocks of 2M, 2M, 4M, 8M, and 16M; when a user applies for 3M of space, the 4M memory block is directly allocated, and upon release a free flag is set on the 4M block.
The Guest user program performs offset recording on each allocated memory block, that is, records the offset address of the currently allocated memory block within the whole cross-system shared memory region.
S503, the Guest user program responds to the graphic processing operation of the user and determines a corresponding graphic processing instruction.
The implementation of step S503 may refer to the implementation of S301 in the first embodiment, and repeated descriptions are omitted here.
S504, after writing the function number and its parameters into the allocated memory block, the Guest user program transfers their offset address in the current memory block to the corresponding thread in the Host operating system.
The implementation of step S504 may refer to the implementation of the transfer process of the function number and the parameter in S302 in the first embodiment, and repeated details are not repeated.
S505, the Host operating system acquires the transferred function number and parameters from the shared memory and starts executing the function.
The implementation of step S505 may refer to the obtaining process of the function number and the parameter in embodiment two S401 and the implementation of the function executing process in embodiment two S402, and repeated details are not repeated.
S506, after finishing executing the function, the Host operating system displays the processing result, writes a message identifying the success or failure of the function execution into the shared memory in the same manner, and returns the corresponding offset address to the Guest operating system to complete the function call.
The implementation of step S506 may refer to the implementation of S403 and S404 in the second embodiment, and repeated descriptions are omitted.
Therefore, remote calling of the OpenGL API between the Guest operating system and the Host operating system is achieved, and therefore virtualization of the GPU is achieved.
By adopting the GPU virtualization method in the embodiment of the application, a cross-operating-system shared memory is used, that is, both operating systems can see the read and write operations performed on the same memory; remote calls of the OpenGL API are realized on the basis of this shared memory, thereby realizing virtualization of the GPU.
Based on the same inventive concept, the embodiment of the present application further provides a virtualization device for a GPU, and as the principle of the device for solving the problem is similar to that of the virtualization method for the GPU provided in the embodiment of the present application, the implementation of the device may refer to the implementation of the method, and repeated details are not repeated.
Example four
Fig. 6 is a schematic structural diagram illustrating a virtualization apparatus of a GPU according to a fourth embodiment of the present application.
As shown in fig. 6, a virtualization apparatus 600 of a GPU according to the fourth embodiment of the present application includes: a first receiving module 601, configured to receive a graphics processing operation at a first operating system and determine a corresponding graphics processing instruction according to the graphics processing operation; and a first transfer module 602, configured to transfer the graphics processing instruction to the second operating system through a shared memory, so that the second operating system executes the graphics processing instruction to obtain a processing result and displays the processing result as a response to the graphics processing operation; the shared memory is readable and writable by both the first operating system and the second operating system.
Specifically, the first operating system may be a Guest operating system, and the second operating system may be a Host operating system.
Specifically, the first transfer module may include: a first writing submodule, configured to write the graphics processing instruction into the shared memory; and a first sending submodule, configured to send the offset address of the graphics processing instruction in the shared memory to the second operating system.
In particular, the graphics processing instructions may include graphics processing functions and parameters; the first write submodule may be specifically configured to: and storing the graphic processing instruction to a first storage area of the shared memory.
In particular, the graphics processing instructions may also include synchronization information that may be used to indicate a time at which the second operating system executed the graphics processing instructions.
In particular, the graphics processing instructions may also include graphics content data; the shared memory may further include a second storage area; the first write submodule may be further configured to: and writing the graphic content data into the second storage area.
Specifically, the second storage area includes a plurality of blocks, each block having a preset size adapted to the GPU graphics content data; the apparatus may further include: a first determining module, configured to determine the block corresponding to the graphics content data according to the size of the graphics content data.
Specifically, the first storage area includes a plurality of channels, wherein each channel corresponds to a different thread; the apparatus may further include: and the second determining module is used for determining the channel corresponding to the graphics processing instruction according to the thread corresponding to the graphics processing instruction.
Specifically, the graphics processing instruction may include a number and a parameter corresponding to a graphics processing function; the first write submodule may be specifically configured to: determining the number corresponding to the graphic processing function; and writing the graphics processing function number and the parameters into the first storage area.
Specifically, the GPU virtualization device according to the embodiment of the present application further includes: a second receiving module 603, configured to receive an execution result from the second operating system.
Specifically, the second receiving module may specifically include: the first address receiving submodule is used for receiving an offset address of an execution result from the second operating system in the shared memory; and the first reading submodule is used for reading the execution result from the shared memory according to the offset address of the execution result in the shared memory.
By adopting the GPU virtualization apparatus in the embodiment of the application, remote calls of the OpenGL API are realized on the basis of the shared memory, thereby realizing virtualization of the GPU.
Based on the same inventive concept, the embodiment of the present application further provides a virtualization device for a GPU, and as the principle of the device for solving the problem is similar to that of the virtualization method for the GPU provided in the second embodiment of the present application, the implementation of the device may refer to the implementation of the method, and repeated details are not repeated.
EXAMPLE five
Fig. 7 is a schematic structural diagram illustrating a virtualization apparatus of a GPU according to a fifth embodiment of the present application.
As shown in fig. 7, a virtualization device 700 of a GPU according to the fifth embodiment of the present application includes: an obtaining module 701, configured to obtain a graphics processing instruction from a first operating system through a shared memory; an execution module 702, configured to execute the graphics processing instruction at the second operating system to obtain a processing result; a display module 703, configured to display a processing result as a response to the graphics processing operation; wherein the graphics processing operation is received at the first operating system; the shared memory is in a readable and writable state to both the first operating system and the second operating system.
Specifically, the first operating system may be a Guest operating system, and the second operating system may be a Host operating system.
Specifically, the obtaining module may specifically include: a second address receiving submodule, configured to receive an offset address of a graphics processing instruction from the first operating system in the shared memory; and the second reading submodule is used for reading the graphics processing instruction from the shared memory according to the offset address of the graphics processing instruction in the shared memory.
In particular, the graphics processing instructions may include graphics processing functions and parameters; the second read submodule may be specifically configured to: the graphics processing instruction is read from a first storage area of a shared memory.
Specifically, the graphics processing instruction may further include synchronization information, which may be used to indicate a time when the second operating system executes the graphics processing instruction; the execution module may specifically be configured to: and executing the graphics processing instruction at the time indicated by the synchronization information.
In particular, the graphics processing instructions may also include graphics content data; the shared memory may further include a second storage area; a second read submodule, further operable to: and reading the graphic content data from a second storage area of the shared memory.
Specifically, the first storage area includes a plurality of channels, wherein each channel corresponds to a different thread; the apparatus may further include: and the second determining module is used for determining the channel corresponding to the graphics processing instruction according to the thread corresponding to the graphics processing instruction.
Specifically, the graphics processing instruction may include a number and a parameter corresponding to a graphics processing function; the second read submodule may be specifically configured to: reading the graphics processing function number and parameters from the first memory area; and determining the corresponding graphics processing function according to the graphics processing function number.
Specifically, the GPU virtualization device according to the embodiment of the present application may further include: and the second transmission module is used for transmitting the execution result to the first operating system through the shared memory.
Specifically, the second transmission module may include: a second writing submodule, configured to write the execution result into the shared memory; and a second sending submodule, configured to send the offset address of the execution result in the shared memory to the first operating system, so that the first operating system can obtain the execution result according to the offset address.
By adopting the GPU virtualization apparatus in the embodiment of the application, remote calls of the OpenGL API are realized on the basis of the shared memory, thereby realizing virtualization of the GPU.
Based on the same inventive concept, the embodiment of the present application further provides a virtualization system of a GPU, and since the principle of solving the problem of the system is similar to the virtualization method of the GPU provided in the first and second embodiments of the present application, the implementation of the system may refer to the implementation of the method, and repeated details are not repeated.
EXAMPLE six
Fig. 8 is a schematic structural diagram illustrating a virtualization system of a GPU according to a sixth embodiment of the present application.
As shown in fig. 8, a virtualization system 800 of a GPU according to a sixth embodiment of the present application includes: a first operating system 801 including the GPU virtualization apparatus 600; a shared memory 802 for storing graphics processing instructions from the first operating system and processing results from the second operating system, the shared memory being readable and writable by both the first operating system and the second operating system; and a second operating system 803 including the GPU virtualization apparatus 700.
In specific implementation, the implementation of the first operating system 801 may refer to the implementation of the first operating system 201 in the first embodiment of the present application, and repeated details are not repeated.
In specific implementation, the implementation of the shared memory 802 may refer to the implementation of the shared memory 203 in the first embodiment of the present application, and repeated details are not described herein.
In specific implementation, the implementation of the second operating system 803 may refer to the implementation of the second operating system 202 in the first embodiment of the present application, and repeated descriptions are omitted here.
Specifically, the first operating system may be a Guest operating system, and the second operating system may be a Host operating system.
By adopting the GPU virtualization system in the embodiment of the application, remote calls of the OpenGL API are realized on the basis of the shared memory, thereby realizing virtualization of the GPU.
EXAMPLE seven
Based on the same inventive concept, an electronic device 900 as shown in fig. 9 is also provided in the embodiments of the present application.
As shown in fig. 9, an electronic device 900 according to a seventh embodiment of the present application includes: a display 901, a memory 902, one or more processors 903; a bus 904; and one or more modules stored in the memory and configured to be executed by the one or more processors, the one or more modules including instructions for performing the steps of any of the methods according to the first embodiment of the present application.
Based on the same inventive concept, a computer program product for use with an electronic device 900 including a display is also provided in the embodiments of the present application, the computer program product including a computer-readable storage medium and a computer program mechanism embedded therein, the computer program mechanism including instructions for performing the steps of any of the methods according to the first embodiment of the present application.
Example eight
Based on the same inventive concept, an electronic device 1000 as shown in fig. 10 is also provided in the embodiments of the present application.
As shown in fig. 10, an electronic device 1000 according to an eighth embodiment of the present application includes: a display 1001, a memory 1002, one or more processors 1003; a bus 1004; and one or more modules stored in the memory and configured to be executed by the one or more processors, the one or more modules including instructions for performing the steps of any of the methods according to the second embodiment of the present application.
Based on the same inventive concept, a computer program product for use with an electronic device 1000 including a display is also provided in the embodiments of the present application, the computer program product including a computer-readable storage medium and a computer program mechanism embedded therein, the computer program mechanism including instructions for performing the steps of any of the methods according to the second embodiment of the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
Claims (35)
1. A method for virtualizing a graphics processing unit (GPU), comprising:
receiving a graphics processing operation at a first operating system and determining a corresponding graphics processing instruction according to the graphics processing operation;
writing the graphics processing instruction into a shared memory, wherein the shared memory is a shared memory corresponding to the GPU;
transmitting the graphics processing instruction to a second operating system through the shared memory; wherein the shared memory is in a readable and writable state to both the first operating system and the second operating system;
wherein the shared memory comprises a first storage area, the first storage area is divided into a plurality of blocks, one block is defined as a channel, one channel corresponds to one thread of the first operating system, and the graphics processing instruction comprises a graphics processing function and parameters; the writing the graphics processing instruction into the shared memory specifically comprises: storing the graphics processing function and the parameters in the first storage area of the shared memory;
the shared memory further comprises a second storage area, the second storage area is divided into a plurality of blocks, each block has a preset size, the preset size is adapted to GPU graphics content data, and the graphics processing instruction further comprises the graphics content data; the writing the graphics processing instruction into the shared memory further comprises: writing the graphics content data into the second storage area.
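The two-area shared memory recited in claim 1 can be sketched as follows. This is an illustrative model only: the channel size, block size, and counts are assumptions, since the claims do not fix concrete values.

```python
# Hypothetical sizes -- the claims leave these unspecified.
CHANNEL_SIZE = 4096        # one channel per first-OS thread (first storage area)
NUM_CHANNELS = 8
BLOCK_SIZE = 64 * 1024     # preset block size for graphics content data (second storage area)
NUM_BLOCKS = 16


class SharedMemoryLayout:
    """Models the two-area shared memory described in claim 1."""

    def __init__(self):
        # First storage area (channels) followed by second storage area (blocks).
        self.buf = bytearray(NUM_CHANNELS * CHANNEL_SIZE + NUM_BLOCKS * BLOCK_SIZE)
        self.second_area_base = NUM_CHANNELS * CHANNEL_SIZE

    def channel_offset(self, thread_index: int) -> int:
        # One channel corresponds to one thread of the first operating system.
        return thread_index * CHANNEL_SIZE

    def block_offset(self, block_index: int) -> int:
        # Fixed-size blocks holding GPU graphics content data.
        return self.second_area_base + block_index * BLOCK_SIZE


layout = SharedMemoryLayout()
```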
2. The method of claim 1, wherein transferring the graphics processing instruction to the second operating system via a shared memory comprises:
sending the offset address of the graphics processing instruction in the shared memory to the second operating system.
3. The method of claim 1, wherein the graphics processing instructions further comprise synchronization information indicating a time at which the second operating system executed the graphics processing instructions.
4. The method of claim 1, further comprising, prior to writing the graphics content data to the second storage area:
determining a block corresponding to the graphics content data according to the size of the graphics content data.
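Claim 4's size-based block selection can be illustrated with a simple ceiling computation. This is a sketch only; the claim requires that the block be determined according to the data size but does not prescribe an allocation policy.

```python
def blocks_needed(data_len: int, block_size: int) -> int:
    """Number of preset-size blocks the graphics content data occupies.

    Illustrative only: claim 4 leaves the selection policy open;
    here we simply round up to whole blocks.
    """
    return -(-data_len // block_size)  # ceiling division
```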
5. The method of claim 1, further comprising, prior to said storing said graphics processing functions and parameters to said first memory area of said shared memory:
determining the channel corresponding to the graphics processing function and the parameters according to the thread corresponding to the graphics processing function and the parameters.
6. The method of claim 1, wherein the graphics processing instruction comprises a number and a parameter corresponding to a graphics processing function; writing the graphics processing instruction to the shared memory, specifically including:
determining the number corresponding to the graphic processing function;
writing the number and the parameter corresponding to the graphics processing function into the first storage area.
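Claim 6's number-plus-parameters encoding might look like the sketch below. The function-number table and the wire format are hypothetical; the claim only requires that each graphics processing function be identified by a number rather than by name.

```python
import struct

# Hypothetical function-number table (not part of the claims).
FUNCTION_NUMBERS = {"glClear": 1, "glDrawArrays": 2}


def encode_call(func_name: str, params: list) -> bytes:
    """Pack (number, parameter count, parameters) for the first storage area."""
    number = FUNCTION_NUMBERS[func_name]
    return struct.pack(f"<II{len(params)}i", number, len(params), *params)


def decode_call(blob: bytes):
    """Reverse of encode_call, as the second OS would do (cf. claim 13)."""
    number, argc = struct.unpack_from("<II", blob)
    params = list(struct.unpack_from(f"<{argc}i", blob, 8))
    return number, params
```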
7. The method of claim 1, further comprising: receiving an execution result from the second operating system.
8. The method of claim 7, wherein receiving the execution result from the second operating system specifically comprises:
receiving, from the second operating system, an offset address of the execution result in the shared memory;
reading the execution result from the shared memory according to the offset address of the execution result in the shared memory.
9. A method for virtualizing a GPU, comprising:
obtaining a graphics processing instruction from a first operating system through a shared memory;
reading the graphics processing instruction from the shared memory, wherein the shared memory is a shared memory corresponding to the GPU;
executing the graphics processing instruction at a second operating system to obtain a processing result;
displaying the processing result as a response to the graphics processing operation; wherein the graphics processing operation is received at the first operating system;
wherein the shared memory is in a readable and writable state to both the first operating system and the second operating system;
wherein the shared memory comprises a first storage area, the first storage area is divided into a plurality of blocks, one block is defined as a channel, one channel corresponds to one thread of the first operating system, and the graphics processing instruction comprises a graphics processing function and parameters; the reading the graphics processing instruction from the shared memory specifically comprises: reading the graphics processing function and the parameters from the first storage area of the shared memory;
the shared memory further comprises a second storage area, the second storage area is divided into a plurality of blocks, each block has a preset size, the preset size is adapted to GPU graphics content data, and the graphics processing instruction further comprises the graphics content data; the reading the graphics processing instruction from the shared memory further comprises: reading the graphics content data from the second storage area of the shared memory.
10. The method of claim 9, wherein obtaining graphics processing instructions from the first operating system via the shared memory comprises:
receiving, from the first operating system, an offset address of the graphics processing instruction in the shared memory;
reading the graphics processing instruction from the shared memory according to the offset address of the graphics processing instruction in the shared memory.
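Claims 2 and 10 together describe an offset-address handoff: the instruction bytes stay in shared memory, and only the offset crosses the boundary between the two operating systems. A minimal sketch, with the shared region simulated by a local buffer and the transport for the offset itself left abstract:

```python
# Simulated shared memory; in a real system this would be one region
# mapped into both operating systems.
shared = bytearray(1 << 16)


def first_os_write(instruction: bytes, offset: int) -> int:
    """First OS writes the instruction and returns the offset to send (claim 2)."""
    shared[offset:offset + len(instruction)] = instruction
    return offset


def second_os_read(offset: int, length: int) -> bytes:
    """Second OS reads the instruction back using the received offset (claim 10)."""
    return bytes(shared[offset:offset + length])
```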
11. The method of claim 9, wherein the graphics processing instructions further comprise synchronization information indicating a time at which the second operating system executed the graphics processing instructions; executing the graphics processing instruction at the second operating system, specifically including:
executing the graphics processing instruction at the time indicated by the synchronization information.
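Claim 11 can be read as "run this instruction no earlier than the indicated time." One possible interpretation in code; treating the synchronization information as a `time.monotonic()` timestamp is an assumption, since the claims do not define the time base:

```python
import time


def execute_at(sync_time: float, instruction):
    """Execute `instruction` at the time indicated by the synchronization
    information (claim 11). The time base is assumed to be time.monotonic()."""
    delay = sync_time - time.monotonic()
    if delay > 0:
        time.sleep(delay)
    return instruction()
```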
12. The method of claim 9, further comprising, prior to said reading said graphics processing function and parameters from said first memory area of said shared memory:
determining the channel corresponding to the graphics processing function and the parameters according to the thread corresponding to the graphics processing function and the parameters.
13. The method of claim 9, wherein the graphics processing instruction comprises a number and a parameter corresponding to a graphics processing function; reading the graphics processing instruction from a first storage area of a shared memory, specifically comprising:
reading the number and the parameter corresponding to the graphics processing function from the first storage area;
determining the corresponding graphics processing function according to the number corresponding to the graphics processing function.
14. The method of claim 9, further comprising:
transmitting the execution result to the first operating system through the shared memory.
15. The method of claim 14, wherein transferring the execution result to the first operating system via a shared memory comprises:
writing the execution result into the shared memory;
sending the offset address of the execution result in the shared memory to the first operating system, so that the first operating system obtains the execution result according to the offset address of the execution result in the shared memory.
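The result path of claims 14 and 15 mirrors the instruction path: the second OS writes the execution result into shared memory and hands back only its offset. A sketch under the same assumptions as the claims themselves; the fixed result offset is hypothetical:

```python
shared = bytearray(4096)

RESULT_OFFSET = 2048  # hypothetical fixed location for execution results


def second_os_publish_result(result: bytes) -> int:
    """Second OS writes the result and returns its offset (claim 15)."""
    shared[RESULT_OFFSET:RESULT_OFFSET + len(result)] = result
    return RESULT_OFFSET


def first_os_fetch_result(offset: int, length: int) -> bytes:
    """First OS reads the result according to the received offset (claim 8)."""
    return bytes(shared[offset:offset + length])
```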
16. An apparatus for virtualizing a GPU, comprising:
a first receiving module, configured to receive a graphics processing operation at a first operating system and determine a corresponding graphics processing instruction according to the graphics processing operation;
a first write submodule, specifically configured to store the graphics processing instruction into a shared memory, wherein the shared memory is a shared memory corresponding to the GPU;
a first transmission module, configured to transmit the graphics processing instruction to a second operating system through the shared memory; wherein the shared memory is in a readable and writable state to both the first operating system and the second operating system;
wherein the shared memory comprises a first storage area, the first storage area is divided into a plurality of blocks, one block is defined as a channel, one channel corresponds to one thread of the first operating system, and the graphics processing instruction comprises a graphics processing function and parameters; the first write submodule is specifically configured to: store the graphics processing function and the parameters in the first storage area of the shared memory;
the shared memory further comprises a second storage area, the second storage area is divided into a plurality of blocks, each block has a preset size, the preset size is adapted to GPU graphics content data, and the graphics processing instruction further comprises the graphics content data; the first write submodule is further configured to: write the graphics content data into the second storage area.
17. The apparatus of claim 16, wherein the first transfer module comprises:
a first sending submodule, configured to send the offset address of the graphics processing instruction in the shared memory to the second operating system.
18. The apparatus of claim 16, wherein the graphics processing instructions further comprise synchronization information indicating a time at which the second operating system executed the graphics processing instructions.
19. The apparatus of claim 16, further comprising:
a first determining module, configured to determine the block corresponding to the graphics content data according to the size of the graphics content data.
20. The apparatus of claim 16, further comprising:
a second determining module, configured to determine the channel corresponding to the graphics processing function and the parameters according to the thread corresponding to the graphics processing function and the parameters.
21. The apparatus of claim 16, wherein the graphics processing instruction comprises a number and a parameter corresponding to a graphics processing function; the first write submodule is specifically configured to:
determine the number corresponding to the graphics processing function; and
write the number and the parameter corresponding to the graphics processing function into the first storage area.
22. The apparatus of claim 16, further comprising: a second receiving module, configured to receive the execution result from the second operating system.
23. The apparatus of claim 22, wherein the second receiving module specifically comprises:
a first address receiving submodule, configured to receive, from the second operating system, an offset address of the execution result in the shared memory;
a first reading submodule, configured to read the execution result from the shared memory according to the offset address of the execution result in the shared memory.
24. An apparatus for virtualizing a GPU, comprising:
an acquisition module, configured to acquire a graphics processing instruction from a first operating system through a shared memory;
a second reading submodule, configured to read the graphics processing instruction from the shared memory, wherein the shared memory is a shared memory corresponding to the GPU;
an execution module, configured to execute the graphics processing instruction at a second operating system to obtain a processing result;
a display module, configured to display the processing result as a response to the graphics processing operation; wherein the graphics processing operation is received at the first operating system; and the shared memory is in a readable and writable state to both the first operating system and the second operating system;
wherein the shared memory comprises a first storage area, the first storage area is divided into a plurality of blocks, one block is defined as a channel, one channel corresponds to one thread of the first operating system, and the graphics processing instruction comprises a graphics processing function and parameters; the second reading submodule is specifically configured to: read the graphics processing function and the parameters from the first storage area of the shared memory;
the shared memory further comprises a second storage area, the second storage area is divided into a plurality of blocks, each block has a preset size, the preset size is adapted to GPU graphics content data, and the graphics processing instruction further comprises the graphics content data; the second reading submodule is further configured to: read the graphics content data from the second storage area of the shared memory.
25. The apparatus according to claim 24, wherein the obtaining module specifically includes:
a second address receiving submodule, configured to receive, from the first operating system, an offset address of the graphics processing instruction in the shared memory;
a second reading submodule, configured to read the graphics processing instruction from the shared memory according to the offset address of the graphics processing instruction in the shared memory.
26. The apparatus of claim 24, wherein the graphics processing instructions further comprise synchronization information indicating a time at which the graphics processing instructions are executed by the second operating system; the execution module is specifically configured to:
execute the graphics processing instruction at the time indicated by the synchronization information.
27. The apparatus of claim 24, further comprising:
a second determining module, configured to determine the channel corresponding to the graphics processing function and the parameters according to the thread corresponding to the graphics processing function and the parameters.
28. The apparatus of claim 24, wherein the graphics processing instruction comprises a number and a parameter corresponding to a graphics processing function; a second read submodule, configured to:
read the number and the parameter corresponding to the graphics processing function from the first storage area; and
determine the corresponding graphics processing function according to the number corresponding to the graphics processing function.
29. The apparatus of claim 24, further comprising:
a second transmission module, configured to transmit the execution result to the first operating system through the shared memory.
30. The apparatus of claim 29, wherein the second transfer module specifically comprises:
a second writing submodule, configured to write the execution result into the shared memory;
a second sending submodule, configured to send the offset address of the execution result in the shared memory to the first operating system, so that the first operating system obtains the execution result according to the offset address of the execution result in the shared memory.
31. A virtualization system for a GPU, comprising:
a first operating system comprising a virtualization device for a GPU as claimed in any of claims 16 to 23;
a shared memory, configured to store the graphics processing instruction from the first operating system and the processing result from the second operating system; wherein the shared memory is in a readable and writable state to both the first operating system and the second operating system; and
a second operating system comprising a virtualization device for a GPU as claimed in any of claims 24 to 30.
32. An electronic device, characterized in that the electronic device comprises: a display, a memory, one or more processors; and one or more modules stored in the memory and configured to be executed by the one or more processors, the one or more modules including instructions for performing the steps of the method of any of claims 1-8.
33. An electronic device, characterized in that the electronic device comprises: a display, a memory, one or more processors; and one or more modules stored in the memory and configured to be executed by the one or more processors, the one or more modules including instructions for performing the steps of the method of any of claims 9-15.
34. A computer program product for use in conjunction with an electronic device that includes a display, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising instructions for performing the steps of the method of any of claims 1-8.
35. A computer program product for use in conjunction with an electronic device that includes a display, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising instructions for performing the steps of the method of any of claims 9-15.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2016/113260 WO2018119951A1 (en) | 2016-12-29 | 2016-12-29 | Gpu virtualization method, device, system, and electronic apparatus, and computer program product |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107003892A CN107003892A (en) | 2017-08-01 |
CN107003892B true CN107003892B (en) | 2021-10-08 |
Family
ID=59431118
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201680002845.1A Active CN107003892B (en) | 2016-12-29 | 2016-12-29 | GPU virtualization method, device and system, electronic equipment and computer program product |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107003892B (en) |
WO (1) | WO2018119951A1 (en) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107436797A (en) * | 2017-08-14 | 2017-12-05 | 深信服科技股份有限公司 | A kind of director data processing method and processing device based on virtualized environment |
WO2019127476A1 (en) * | 2017-12-29 | 2019-07-04 | 深圳前海达闼云端智能科技有限公司 | Virtual system bluetooth communication method and device, virtual system, storage medium, and electronic apparatus |
CN109542829B (en) * | 2018-11-29 | 2023-04-25 | 北京元心科技有限公司 | Control method and device of GPU (graphics processing Unit) equipment in multiple systems and electronic equipment |
CN110442389B (en) * | 2019-08-07 | 2024-01-09 | 北京技德系统技术有限公司 | Method for sharing GPU (graphics processing Unit) in multi-desktop environment |
CN111114320B (en) * | 2019-12-27 | 2022-11-18 | 深圳市众鸿科技股份有限公司 | Vehicle-mounted intelligent cabin sharing display method and system |
CN111522670A (en) * | 2020-05-09 | 2020-08-11 | 中瓴智行(成都)科技有限公司 | GPU virtualization method, system and medium for Android system |
CN112581650A (en) * | 2020-11-12 | 2021-03-30 | 江苏北斗星通汽车电子有限公司 | Video data processing method and device based on intelligent cabin and electronic terminal |
CN112925737B (en) * | 2021-03-30 | 2022-08-05 | 上海西井信息科技有限公司 | PCI heterogeneous system data fusion method, system, equipment and storage medium |
CN113793246B (en) * | 2021-11-16 | 2022-02-18 | 北京壁仞科技开发有限公司 | Method and device for using graphics processor resources and electronic equipment |
CN114579072A (en) * | 2022-03-02 | 2022-06-03 | 南京芯驰半导体科技有限公司 | Display screen projection method and device across multiple operating systems |
CN115344226B (en) * | 2022-10-20 | 2023-03-24 | 亿咖通(北京)科技有限公司 | Screen projection method, device, equipment and medium under virtualization management |
CN115686748B (en) * | 2022-10-26 | 2023-11-17 | 亿咖通(湖北)技术有限公司 | Service request response method, device, equipment and medium under virtualization management |
CN115775199B (en) * | 2022-11-23 | 2024-04-16 | 海光信息技术股份有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
CN116597025B (en) * | 2023-04-24 | 2023-09-26 | 北京麟卓信息科技有限公司 | Compressed texture decoding optimization method based on heterogeneous instruction penetration |
CN116243872B (en) * | 2023-05-12 | 2023-07-21 | 南京砺算科技有限公司 | Private memory allocation addressing method and device, graphics processor and medium |
CN116485628B (en) * | 2023-06-15 | 2023-12-29 | 摩尔线程智能科技(北京)有限责任公司 | Image display method, device and system |
CN118535356A (en) * | 2024-07-25 | 2024-08-23 | 山东浪潮科学研究院有限公司 | Dynamic shared memory multiplexing method and device applicable to GPGPU |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100417077C (en) * | 2002-10-11 | 2008-09-03 | 中兴通讯股份有限公司 | Method for storage area management with static and dynamic joint |
KR100592105B1 (en) * | 2005-03-25 | 2006-06-21 | 엠텍비젼 주식회사 | Method for controlling access to partitioned blocks of shared memory and portable terminal having shared memory |
US8463980B2 (en) * | 2010-09-30 | 2013-06-11 | Microsoft Corporation | Shared memory between child and parent partitions |
CN102541618B (en) * | 2010-12-29 | 2015-05-27 | 中国移动通信集团公司 | Implementation method, system and device for virtualization of universal graphic processor |
US9047686B2 (en) * | 2011-02-10 | 2015-06-02 | Qualcomm Incorporated | Data storage address assignment for graphics processing |
US9158569B2 (en) * | 2013-02-11 | 2015-10-13 | Nvidia Corporation | Virtual interrupt delivery from a graphics processing unit (GPU) of a computing system without hardware support therefor |
CN104754464A (en) * | 2013-12-31 | 2015-07-01 | 华为技术有限公司 | Audio playing method, terminal and system |
CN104503731A (en) * | 2014-12-15 | 2015-04-08 | 柳州职业技术学院 | Quick identification method for binary image connected domain marker |
CN105487915B (en) * | 2015-11-24 | 2018-11-27 | 上海君是信息科技有限公司 | A method of the GPU vitualization performance boost based on retard transmitter |
- 2016-12-29 CN CN201680002845.1A patent/CN107003892B/en active Active
- 2016-12-29 WO PCT/CN2016/113260 patent/WO2018119951A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN107003892A (en) | 2017-08-01 |
WO2018119951A1 (en) | 2018-07-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107003892B (en) | GPU virtualization method, device and system, electronic equipment and computer program product | |
CN107077377B (en) | Equipment virtualization method, device and system, electronic equipment and computer program product | |
US8966477B2 (en) | Combined virtual graphics device | |
CN103034524B (en) | Half virtualized virtual GPU | |
US9176765B2 (en) | Virtual machine system and a method for sharing a graphics card amongst virtual machines | |
EP2622470B1 (en) | Techniques for load balancing gpu enabled virtual machines | |
CN102177509B (en) | Virtualized storage assignment method | |
US9798565B2 (en) | Data processing system and method having an operating system that communicates with an accelerator independently of a hypervisor | |
US12117947B2 (en) | Information processing method, physical machine, and PCIE device | |
CN106796530B (en) | A kind of virtual method, device and electronic equipment, computer program product | |
CN107077375B (en) | Display method and device for multiple operating systems and electronic equipment | |
CN107077376B (en) | Frame buffer implementation method and device, electronic equipment and computer program product | |
US9910690B2 (en) | PCI slot hot-addition deferral for multi-function devices | |
CN114138423B (en) | Virtualized construction system and method based on domestic GPU graphics card | |
US20170024231A1 (en) | Configuration of a computer system for real-time response from a virtual machine | |
US12105648B2 (en) | Data processing method, apparatus, and device | |
CN106598696B (en) | Method and device for data interaction between virtual machines | |
CN115904617A (en) | GPU virtualization implementation method based on SR-IOV technology | |
CN113485791B (en) | Configuration method, access method, device, virtualization system and storage medium | |
US20160246629A1 (en) | Gpu based virtual system device identification | |
CN110941408B (en) | KVM virtual machine graphical interface output method and device | |
EP3198406B1 (en) | Facilitation of guest application display from host operating system | |
US9459910B1 (en) | Controlling a layered driver | |
US20160026567A1 (en) | Direct memory access method, system and host module for virtual machine | |
CN117331704B (en) | Graphics processor GPU scheduling method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |