CN117785451A - GPU video memory allocation method and virtual machine - Google Patents


Info

Publication number
CN117785451A
Authority
CN
China
Prior art keywords
video memory
mode module
limit value
target process
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311753467.9A
Other languages
Chinese (zh)
Inventor
王真
李焱
杨偲乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd
Priority to CN202311753467.9A
Publication of CN117785451A
Legal status: Pending


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Stored Programmes (AREA)

Abstract

The application discloses a GPU video memory allocation method and a virtual machine, wherein the allocation method comprises the following steps: receiving a task request of a target process; determining, by the user mode module, a video memory limit value corresponding to the target process based on the allocation state of the total video memory and the priority level of each process; transmitting the video memory limit value to a kernel mode module; storing, by the kernel mode module, the video memory limit value in an environment block memory area of the target process; and performing video memory allocation on the target process through the kernel mode module according to the stored video memory limit value and the task request.

Description

GPU video memory allocation method and virtual machine
Technical Field
The present disclosure relates to the field of resource allocation technologies, and in particular, to a method for allocating GPU video memory and a virtual machine.
Background
In graphics processor (Graphics Processing Unit, GPU) programming, each process or application may require an amount of memory to be allocated on the GPU to store data and computation results. However, when multiple processes share the same GPU, there is no explicit mechanism to limit the maximum amount of memory that each process can use. As a result, some processes may take up excessive memory resources while other processes are unable to obtain sufficient memory or suffer performance degradation.
Currently, the above problem is addressed in the following two ways. The first is the CUDA Runtime API: CUDA provides a set of APIs that allow developers to explicitly specify and control the allocation and release of video memory in a program. This allows the developer to manage the memory usage of each process or application as desired, so that limitation and management of the video memory can be realized at the programming level by calling the related API functions. The second is GPU virtualization: some GPU virtualization techniques allow multiple virtual GPUs to be created on the same GPU and a different memory size to be assigned to each virtual GPU, so that memory management and restriction are achieved by configuring the memory limits of the virtual GPUs.
However, the first way requires a developer to manually manage the application for and release of the video memory, so the management efficiency is low and the management cost is high. The second way generally fixes the value of the memory limit when the process starts or when the environment (container, virtual machine, etc.) is initialized, and the limit cannot be dynamically adjusted in real time according to actual usage while the process is running, which limits the flexibility and efficiency of memory management.
Disclosure of Invention
The embodiments of the application aim to provide a GPU video memory allocation method and a virtual machine.
In a first aspect, an embodiment of the present application provides a method for allocating GPU video memory, which is applied to a virtual machine of GPU video memory, where the virtual machine includes a user mode module and a kernel mode module, and the allocation method includes:
receiving a task request of a target process;
the user mode module determines a video memory limit value corresponding to the target process based on the allocation state of the total video memory and the priority level of each process;
transmitting the video memory limit value to a kernel mode module;
the kernel mode module stores the video memory limit value to an environment block memory area of the target process;
and performing video memory allocation on the target process through the kernel mode module according to the stored video memory limit value and the task request.
In one possible implementation manner, after receiving a task request of a target process and before performing video memory allocation on the target process through the kernel mode module, the method includes:
creating a first process file of the target process through the user mode module based on the task request;
and transmitting the first process file to the kernel mode module.
In a possible implementation manner, the allocating, by the kernel mode module, the video memory of the target process according to the stored video memory limit value and the task request includes:
analyzing a first process file transmitted by the user mode module through the kernel mode module to obtain a video memory request value included in the task request;
and performing video memory allocation on the target process through the kernel mode module based on the video memory request value, the allocated video memory value included in the allocation state and the video memory limit value, wherein the allocated video memory value is the video memory value allocated to the target process.
In one possible implementation manner, the allocating, by the kernel mode module, the video memory to the target process based on the video memory request value, the allocated video memory value included in the allocation state, and the video memory limit value includes:
determining, by the kernel mode module, a video memory allocation value of the target process when a sum of the video memory request value and the allocated video memory value is less than the video memory limit value;
and based on the video memory allocation value, performing video memory allocation on the target process through the kernel mode module.
In one possible implementation manner, the allocating, by the kernel mode module, the video memory to the target process based on the video memory request value, the allocated video memory value included in the allocation state, and the video memory limit value includes:
generating a prompt message of insufficient memory through the kernel mode module under the condition that the sum value of the video memory request value and the allocated video memory value is larger than or equal to the video memory limit value;
and the kernel mode module transmits the prompt information to the user mode module.
In one possible implementation, the storing, by the kernel mode module, the video memory limit value to an environment block memory area of the target process includes:
according to the target address, determining a target area of the memory area of the environment block of the target process through the kernel mode module;
and storing the video memory limit value to the target area.
In one possible implementation manner, before determining the memory limit value corresponding to the target process, the method further includes:
the user mode module retrieves driver information from the GPU driver, wherein the driver information comprises the video memory value of each running process and the unallocated video memory value;
and obtaining the allocation state of the total video memory based on the driver information.
In one possible implementation manner, before determining the memory limit value corresponding to the target process, the method further includes:
the user mode module determines a running process;
the priority level of each process in operation is obtained.
In one possible implementation manner, after the memory allocation of the target process by the kernel mode module, the method includes:
the user mode module determines whether an update condition is met, wherein the update condition comprises an update period and a trigger action;
if yes, acquiring the allocation state of the total video memory and the priority level of each process through the user mode module;
and updating the video memory limit value of the running process through the user mode module based on the allocation state and the priority level.
In a second aspect, an embodiment of the present application further provides a virtual machine for allocating GPU video memory, including a receiving module, a user mode module, and a kernel mode module, where the user mode module is connected to both the receiving module and the kernel mode module;
the receiving module is used for receiving a task request of the target process;
the user mode module is used for obtaining a video memory limit value corresponding to the target process according to the distribution state of the total video memory and the priority level of each process, and transmitting the video memory limit value to the kernel mode module;
the kernel mode module is used for storing the received video memory limit value to an environment block memory area of the target process;
the kernel mode module is further configured to allocate the video memory to the target process according to the stored video memory limit value and the task request.
According to the embodiments of the application, the video memory limit value corresponding to the target process can be determined in real time according to the allocation state of the total video memory and the priority level of each process. That is, the video memory limit value corresponding to a process can be dynamically adjusted in real time according to the usage of the GPU video memory, which improves the flexibility and efficiency of GPU video memory management.
Drawings
In order to more clearly illustrate the technical solutions of the present application or the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 shows a flowchart of a GPU video memory allocation method provided by the present application;
FIG. 2 is a schematic diagram of a virtual machine according to the present application;
FIG. 3 is a flowchart of a method for performing video memory allocation on a target process by using a kernel mode module;
FIG. 4 shows a schematic diagram of one NameSpace provided herein;
FIG. 5 illustrates a schematic diagram of a process environment block provided herein;
FIG. 6 shows a schematic diagram of a virtual machine for allocating GPU video memory provided in the present application.
Detailed Description
Various aspects and features of the present application are described herein with reference to the accompanying drawings.
It should be understood that various modifications may be made to the embodiments of the application herein. Therefore, the above description should not be taken as limiting, but merely as exemplification of the embodiments. Other modifications within the scope and spirit of this application will occur to those skilled in the art.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the application and, together with a general description of the application given above and the detailed description of the embodiments given below, serve to explain the principles of the application.
These and other characteristics of the present application will become apparent from the following description of a preferred form of embodiment, given as a non-limiting example, with reference to the accompanying drawings.
It is also to be understood that, although the present application has been described with reference to some specific examples, a person skilled in the art will certainly be able to achieve many other equivalent forms of the present application, having the characteristics as set forth in the claims and hence all coming within the field of protection defined thereby.
The foregoing and other aspects, features, and advantages of the present application will become more apparent in light of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present application will be described hereinafter with reference to the accompanying drawings; however, it is to be understood that the disclosed embodiments are merely exemplary of the application, which can be embodied in various forms. Well-known and/or repeated functions and constructions are not described in detail to avoid obscuring the application with unnecessary or excessive detail. Therefore, specific structural and functional details disclosed herein are not intended to be limiting, but merely serve as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present application in virtually any appropriately detailed structure.
The specification may use the word "in one embodiment," "in another embodiment," "in yet another embodiment," or "in other embodiments," which may each refer to one or more of the same or different embodiments as per the application.
For the sake of understanding the present application, a detailed description will be given of a GPU video memory allocation method provided in the present application, where the GPU video memory allocation method in the present application is applied to a virtual machine of a graphics processor (Graphics Processing Unit, GPU) video memory, where the virtual machine includes a user mode module and a kernel mode module.
Fig. 1 shows a flowchart of a GPU video memory allocation method according to an embodiment of the present application, where specific steps include S101-S105.
S101, receiving a task request of a target process.
As one example, fig. 2 shows a schematic structural diagram of a virtual machine, in which the user mode module and the kernel mode module included in the virtual machine are connected through the device control interface function ioctl. The user mode module in fig. 2 includes a processor, a first GPU driver, a first GPU video memory manager, a GPU video memory controller, and the like, and the kernel mode module includes a second GPU video memory manager, a second GPU driver, and the like. It should be noted that the embodiments of the present application are not limited to a virtual machine; a virtual container or the like may also be used, and different virtual machines or virtual containers may be set according to different actual application scenarios and operating system environments, so as to improve the flexibility of implementing the allocation method and the applicability of the virtual machine.
In a specific implementation, the cloud platform is in communication connection with the client terminals and the server. It serves as a relay between the client terminals and the server: it receives all task requests sent by the client terminals and transmits each task request to the server so that the server can respond to it. The server comprises a plurality of virtual machines, and each virtual machine responds to the task requests of different client terminals. That is, after receiving the task request of the target process sent by a client terminal, the cloud platform transmits the task request to the corresponding virtual machine.
Accordingly, the virtual machine receives the task request of the target process. Optionally, the virtual machine may receive the task request directly through the user mode module, or through a preset receiving module. Here, the task request includes identification information of the target process, the task content, the GPU video memory request value required to execute the task, and the like.
S102, the user mode module determines a video memory limit value corresponding to the target process based on the allocation state of the total video memory and the priority level of each process.
After receiving the task request of the target process, the processor may generate a response instruction based on the task request, and transmit the response instruction to the user mode module, where the user mode module determines a video memory limit value corresponding to the target process based on the response instruction, where the response instruction at least includes identification information of the target process and the video memory request value, and is used to instruct the user mode module to determine the video memory limit value corresponding to the target process. Of course, the processor may also directly transmit the task request of the target process to the user mode module, where the user mode module determines the video memory limit value corresponding to the target process based on the task request of the target process. The embodiment of the present application is not particularly limited thereto.
Optionally, when determining the video memory limit value corresponding to the target process, the user mode module obtains the allocation state of the total video memory and the priority level of each process, and then determines the video memory limit value corresponding to the target process based on the allocation state of the total video memory and the priority level of each process.
In the embodiment of the application, the video memory limit value corresponding to the target process is determined based on the allocation state of the total video memory, so that reasonable allocation of the GPU video memory can be realized, the situation that other processes cannot normally run due to the fact that a certain process excessively occupies GPU video memory resources is avoided, and the utilization rate of the GPU video memory is improved. Meanwhile, the video memory limit value corresponding to the target process is determined based on the priority level of each process, so that different response requirements of each process can be met, and the flexibility is high.
The user mode module may determine the video memory limit value corresponding to the target process as follows. Priority levels are numbered upward from 0, and the higher the number, the lower the priority. The module calculates a first sum value limit-total of the video memory limit values of all running processes and judges whether limit-total is larger than the total video memory value GPU-total-mem of the GPU. If it is, that is, if limit-total > GPU-total-mem, the module calculates a second sum value total-pri = Σ per-pri of the priority levels of all running processes. It then calculates the difference value diff-value = limit-total - GPU-total-mem between the first sum value and the total video memory value of the GPU. Taking the difference value as the dividend and the second sum value as the divisor, the quotient obtained is the "single-part" difference value, i.e., per-diff = diff-value / total-pri. Finally, the video memory limit value of each process is changed to its original limit value minus the single-part difference value multiplied by the priority level of that process, i.e., new-value = old-value - per-diff × priority. Summed over all running processes, these reductions equal diff-value, so the updated limit values add up to GPU-total-mem.
Therefore, the user mode module can determine the video memory limit value corresponding to the target process.
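The calculation described above can be sketched in Python as follows. This is a minimal illustration rather than the applicant's implementation: the function name `rebalance_limits`, the dictionary-based inputs, and the handling of the case where every priority level is 0 (which the description does not address) are assumptions made for the example.

```python
def rebalance_limits(limits, priorities, gpu_total_mem):
    """Shrink per-process limits when their sum exceeds total GPU memory.

    limits:      dict mapping process id -> current limit (old-value)
    priorities:  dict mapping process id -> priority level (0 = highest)
    Returns a new dict of limits (new-value); inputs are not modified.
    """
    limit_total = sum(limits.values())
    if limit_total <= gpu_total_mem:
        return dict(limits)  # limits already fit, nothing to reclaim

    total_pri = sum(priorities[pid] for pid in limits)
    if total_pri == 0:
        # Assumption: the description does not cover the case where every
        # running process has priority 0; this sketch leaves limits as-is.
        return dict(limits)

    diff_value = limit_total - gpu_total_mem
    per_diff = diff_value / total_pri  # the "single-part" difference
    return {pid: limits[pid] - per_diff * priorities[pid] for pid in limits}


limits = {"a": 6000, "b": 4000, "c": 2000}
priorities = {"a": 0, "b": 1, "c": 2}
new_limits = rebalance_limits(limits, priorities, gpu_total_mem=10500)
print(new_limits)  # {'a': 6000.0, 'b': 3500.0, 'c': 1000.0}
```

In this example the highest-priority process (level 0) keeps its full limit, while lower-priority processes give up proportionally more. Note that the formula as described can drive a low-priority limit to zero or below when the overcommitment is large; a real system would presumably need a floor, which the description does not specify.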
S103, transmitting the video memory limit value to the kernel mode module.
After determining the video memory limit value corresponding to the target process, the user mode module transmits the video memory limit value to the kernel mode module. Referring to the structural schematic diagram shown in fig. 2, the GPU video memory controller of the user mode module transmits the video memory limit value to the second GPU video memory manager of the kernel mode module through the ioctl call interface.
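As a rough illustration of this hand-off, the sketch below packs a process identifier and a limit value into a fixed binary layout and passes it to an ioctl call. The payload layout, the function names, and the request code are all hypothetical; the description only states that ioctl is the interface used. `fcntl.ioctl` is the standard-library wrapper for the system call on Unix-like systems.

```python
import struct

# Hypothetical payload layout: 32-bit signed pid followed by a 64-bit
# unsigned limit in bytes, little-endian with no padding.
PAYLOAD_FORMAT = "<iQ"


def pack_limit_payload(pid, limit_value):
    """Serialize (pid, limit) into the assumed binary layout."""
    return struct.pack(PAYLOAD_FORMAT, pid, limit_value)


def unpack_limit_payload(payload):
    """Inverse of pack_limit_payload; returns (pid, limit)."""
    return struct.unpack(PAYLOAD_FORMAT, payload)


def send_limit(device_fd, request_code, pid, limit_value):
    """Pass the payload to a driver through ioctl.

    device_fd and request_code are placeholders: the real device node and
    request number would be defined by the kernel mode module.
    """
    import fcntl  # Unix-only, imported here so packing works anywhere

    return fcntl.ioctl(device_fd, request_code, pack_limit_payload(pid, limit_value))
```

Only the packing helpers can be exercised without a real device; `send_limit` is shown to indicate where the ioctl call would sit.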
S104, the kernel mode module stores the video memory limit value into an environment block memory area of the target process.
Optionally, the second GPU video memory manager of the kernel mode module is configured to receive data transmitted by the user mode module to implement data interaction between the user mode module and the kernel mode module. Based on the above, after the kernel mode module receives the video memory limit value corresponding to the target process, the second GPU video memory manager of the kernel mode module stores the video memory limit value into the environment block memory area of the target process.
Each process corresponds to its own process environment block (Process Environment Block, PEB), which is a structure used to store process information, and the process environment blocks of different processes are independent of one another. Storing the video memory limit value in the environment block memory area of the target process therefore avoids maintaining a correspondence table between processes and video memory limit values, as in the prior art, where the table holds the correspondences between multiple processes and their limit values. It also improves the efficiency of querying the video memory limit value of the target process and the accuracy of the query result.
S105, according to the stored video memory limit value and the task request, video memory allocation is carried out on the target process through the kernel mode module.
Once the kernel mode module has stored the video memory limit value in the environment block memory area of the target process, the kernel mode module holds the video memory limit value of the target process. On this basis, video memory is allocated to the target process through the kernel mode module according to the stored video memory limit value and the task request.
In the embodiments of the application, the user mode module determines the video memory limit value corresponding to the target process according to the allocation state of the total video memory and the priority level of each process. That is, the video memory limit value corresponding to a process can be dynamically adjusted in real time according to the usage of the GPU video memory, which improves the flexibility and efficiency of GPU video memory management as well as the rationality of video memory allocation and the utilization rate of the GPU video memory.
In addition, the embodiment of the application carries out video memory allocation on the target process through the kernel mode module, and further effectively monitors and controls the video memory of the process through the mutual coordination between the user mode module and the kernel mode module, thereby preventing the situation of system breakdown and abnormality of the server side caused by video memory overflow and further improving the stability and reliability of the system.
As one example, fig. 3 shows a flowchart of a method for performing video memory allocation on a target process by a kernel mode module according to a stored video memory limit value and a task request, where specific steps include S301 and S302.
S301, analyzing the first process file transmitted by the user mode module through the kernel mode module to obtain a video memory request value included in the task request.
S302, performing video memory allocation on the target process through the kernel mode module based on the video memory request value, the allocated video memory value and the video memory limit value, wherein the allocated video memory value is the video memory value allocated to the target process.
In an implementation, the user mode module transmits the process file of the target process to the kernel mode module. In the embodiments of the application, after the task request of the target process is received and before video memory is allocated to the target process through the kernel mode module, the user mode module creates a first process file of the target process based on the task request, using the namespace (Name Space) mechanism of Linux. The first process file is identical to the target process file of the target process, but the two files are in different namespaces. This achieves isolation and limitation of resources: different processes or process groups are isolated so that they run in independent virtual environments, and each process file has its own view of resources. As shown in the schematic diagram of Name Space in fig. 4, the process files in Name Space1 are identical to the process files in Name Space2, but Name Space1 and Name Space2 are two different namespaces.
Optionally, after the first process file is created, the first process file is transmitted to the kernel mode module through the ioctl call interface, and specifically, the first process file is transmitted to the second GPU video memory manager of the kernel mode module.
After the kernel mode module receives the first process file, the second GPU video memory manager analyzes the first process file transmitted by the user mode module to obtain a video memory request value included in the task request; and then, based on the memory request value, the allocated memory value and the memory limit value included in the allocation state, performing memory allocation on the target process through the kernel mode module, wherein the allocated memory value is the memory value allocated to the target process.
Further, when the memory allocation is performed on the target process by the kernel mode module based on the memory request value, the allocated memory value and the memory limit value included in the allocation state, the kernel mode module calculates a sum of the memory request value and the allocated memory value, and determines a magnitude relation between the sum and the memory limit value.
When the sum of the video memory request value and the allocated video memory value is smaller than the video memory limit value, that is, when sufficient unallocated video memory exists in the GPU video memory, the kernel mode module determines the video memory allocation value of the target process. Then, based on the video memory allocation value, the kernel mode module allocates video memory to the target process, generates feedback information indicating that the allocation has been completed, and transmits the feedback information to the user mode module.
When the sum of the video memory request value and the allocated video memory value is larger than or equal to the video memory limit value, which indicates that no unallocated video memory is available in the GPU video memory, the kernel mode module generates prompt information indicating insufficient memory and transmits the prompt information to the user mode module.
The environment block memory area is a region included in the process address space. As shown in the schematic diagram of the process environment block in fig. 5, text (program code), initialized data, uninitialized data, unallocated memory, and the like are stored in the environment block memory area, and each sub-area in the environment block memory area corresponds to an address.
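The two branches above reduce to a single comparison, sketched here in Python. The names `AllocationResult` and `try_allocate` and the message strings are illustrative assumptions; only the comparison itself (strictly less than the limit to succeed, greater than or equal to fail) comes from the description.

```python
from dataclasses import dataclass


@dataclass
class AllocationResult:
    granted: bool   # True if the request fits under the limit
    message: str    # feedback or prompt text returned to user mode


def try_allocate(request_value, allocated_value, limit_value):
    """Apply the limit check described for the kernel mode module."""
    if request_value + allocated_value < limit_value:
        return AllocationResult(True, "allocated")
    # Sum is greater than or equal to the limit: reject the request.
    return AllocationResult(False, "insufficient memory")


print(try_allocate(100, 200, 400))  # granted: 300 < 400
print(try_allocate(200, 200, 400))  # rejected: 400 >= 400
```

Note that a request which would bring the process exactly to its limit is rejected, because the description places the equality case on the failure branch.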
Based on the target address, determining a target area of the memory area of the environment block of the target process by the kernel mode module, and storing the video memory limit value into the target area. The target address is the first address of the memory area in the environment block.
In this embodiment of the present application, the video memory limit value of a process is stored in the target area of its environment block memory area. When the limit value of the process is read, the task_struct structure of the current process in the Linux kernel is used directly: the pointer member env_start of the memory descriptor mm_struct points to the first address of the environment block memory area of the process, and the video memory limit value corresponding to the process is read from there. Because the first address is a fixed address, the efficiency of reading the video memory limit value is improved. In addition, processes are independent of one another, so the environment block memory areas corresponding to different processes are relatively independent. This avoids, to a certain extent, confusing the video memory limit values of different processes and ensures the accuracy of the limit value that is read.
Meanwhile, when the limit value of the video memory of each process is adjusted and updated, the efficiency can be improved.
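The idea of keeping the limit at a fixed offset of a per-process region can be modelled in user space as follows. This is only a simulation: a `bytearray` stands in for the environment block memory area, and the 64-bit little-endian encoding and the function names are assumptions, since the description does not state how the value is encoded. Real kernel code would operate on the address given by env_start instead.

```python
import struct

# Assumption: the limit is stored as an unsigned 64-bit little-endian value.
LIMIT_FORMAT = "<Q"


def store_limit(env_block, limit_value, target_offset=0):
    """Write the limit at the target offset (0 models the first address)."""
    struct.pack_into(LIMIT_FORMAT, env_block, target_offset, limit_value)


def load_limit(env_block, target_offset=0):
    """Read the limit back from the same fixed offset."""
    return struct.unpack_from(LIMIT_FORMAT, env_block, target_offset)[0]


# One independent block per process, so no shared lookup table is needed.
blocks = {101: bytearray(64), 102: bytearray(64)}
store_limit(blocks[101], 2 * 1024**3)
store_limit(blocks[102], 512 * 1024**2)
print(load_limit(blocks[101]), load_limit(blocks[102]))
```

Because each process has its own block and the offset is fixed, reading or updating one limit touches no other process's data, which mirrors the efficiency argument made above.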
In the embodiment of the application, before determining the video memory limit value corresponding to the target process, the user mode module obtains the allocation state of the total video memory. As one example, when the user mode module obtains the allocation status of the total video memory, the user mode module invokes the driving information of the first GPU driver, where the driving information includes the video memory value of each process in operation and the video memory value that is not allocated. And then, based on the driving information, obtaining the distribution state of the total video memory. The allocation state of the total memory includes sufficient or insufficient total memory and the like.
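A minimal sketch of deriving the allocation state from the driver information is given below. The function name, the dictionary layout, and the `min_free` threshold used to decide "sufficient" are assumptions for illustration; the description does not define how sufficiency is judged.

```python
def allocation_state(per_process_used, unallocated, min_free=0):
    """Summarize total video memory from driver-reported figures.

    per_process_used: dict mapping process id -> video memory in use
    unallocated:      video memory not allocated to any process
    min_free:         hypothetical threshold for the "sufficient" flag
    """
    allocated = sum(per_process_used.values())
    return {
        "total": allocated + unallocated,
        "allocated": allocated,
        "unallocated": unallocated,
        "sufficient": unallocated > min_free,
    }


state = allocation_state({"p1": 3000, "p2": 1000}, unallocated=4000)
print(state)
# {'total': 8000, 'allocated': 4000, 'unallocated': 4000, 'sufficient': True}
```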
It should be noted that the user mode module may obtain the allocation state of the total video memory according to a preset period, or may obtain it after receiving the task request of the target process. Obtaining the allocation state according to a preset period allows it to be obtained quickly without affecting the operation of the driver, and prevents task processing delays caused by resource occupation. Obtaining the allocation state after the task request of the target process is received ensures that the obtained allocation state is up to date, which further improves the accuracy of the determined video memory limit value.
Correspondingly, before determining the video memory limit value corresponding to the target process, the user mode module acquires the priority level of each process. Optionally, the user mode module may acquire the priority level of each process in any of the following three acquisition modes:
First acquisition mode
The user mode module queries the first GPU driver to determine the running processes, and then obtains the priority level of each running process. A level table is formed in advance from all processes supported by the GPU and their corresponding priority levels and is stored in advance in the user mode module, for example, in a processor, the first GPU video memory manager, or a GPU video memory controller. After the running processes are determined, their priority levels are looked up in the level table.
Second acquisition mode
The user mode module queries the first GPU driver to determine the running processes, and generates a level acquisition instruction based on the running processes, where the level acquisition instruction includes the name or unique identifier of each running process. The user mode module transmits the level acquisition instruction to the cloud platform. After receiving the level acquisition instruction, the cloud platform looks up the priority level corresponding to each running process named in the instruction and returns these priority levels to the user mode module.
Third acquisition mode
When starting a process, the user also sends the priority level corresponding to the process to the user mode module of the virtual machine, so that the user mode module can acquire the priority level of the running process.
In practice, the acquisition mode is not limited, as long as the user mode module can obtain the priority level of each process.
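The three acquisition modes above can be viewed as interchangeable sources of priority levels. The following Python sketch treats each mode as a lookup function and tries them in order. The table contents, the function names, the stand-in for the cloud platform, and the default level 0 are hypothetical; a real second mode would involve a network request whose protocol the application does not describe.

```python
LEVEL_TABLE = {"render": 3, "encode": 2, "compute": 1}  # hypothetical pre-stored level table


def from_table(name):
    """First mode: look the process up in the locally stored level table."""
    return LEVEL_TABLE.get(name)


def from_cloud(cloud_levels):
    """Second mode: stand-in for a cloud platform answering a level acquisition instruction."""
    return lambda name: cloud_levels.get(name)


def from_user(user_levels):
    """Third mode: levels sent by the user when the process was started."""
    return lambda name: user_levels.get(name)


def priority_levels(running, sources, default=0):
    """Return {process name: level}, using the first source that knows the process."""
    levels = {}
    for name in running:
        level = default
        for source in sources:
            found = source(name)
            if found is not None:
                level = found
                break
        levels[name] = level
    return levels


levels = priority_levels(
    ["render", "custom", "unknown"],
    [from_table, from_cloud({}), from_user({"custom": 5})],
)
```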
In this embodiment of the present application, after video memory allocation is performed on the target process through the kernel mode module, the user mode module may further determine whether an update condition is satisfied, where the update condition includes an update period and a trigger action. For example, an update timer with a set time interval is provided, and the update condition is considered satisfied when the update timer reaches the time interval; alternatively, the trigger action is set to a new-process start signal, and the update condition is considered satisfied when the new-process start signal is received.
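A minimal Python sketch of such an update condition is shown below, combining a periodic timer with a new-process trigger. The class name `UpdateCondition`, its method names, and the injectable clock are assumptions introduced for illustration and for testability; they do not come from the application.

```python
import time


class UpdateCondition:
    """Satisfied when the interval has elapsed or a new process has started."""

    def __init__(self, interval_s, clock=time.monotonic):
        self.interval_s = interval_s
        self.clock = clock
        self.last = clock()          # time of the last update
        self.new_process = False     # pending trigger action

    def notify_process_started(self):
        # Called when a new-process start signal is received.
        self.new_process = True

    def satisfied(self):
        now = self.clock()
        if self.new_process or now - self.last >= self.interval_s:
            self.last = now          # restart the period
            self.new_process = False
            return True
        return False
```

A caller would poll `satisfied()` and, whenever it returns `True`, re-read the allocation state and priority levels before recomputing the limits.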
In a specific implementation, if the update condition is satisfied, the user mode module obtains the allocation state of the total video memory and the priority level of each process, and then updates the video memory limit value of each running process based on the allocation state and the priority level. Here, the video memory limit value may be updated when the process is started, when a task request corresponding to the process is received, or dynamically while the process is running, as long as the requirements of the process can be met; this is not particularly limited in the embodiments of the present application.
Here, because the video memory limit value is stored in the environment block memory area of the target process and the environment blocks of different processes are independent of one another, an updated video memory limit value cannot be associated with the wrong process, so the correspondence between each updated limit value and its process remains accurate. In addition, because the video memory limit value is stored at the first address of the environment block memory area, the efficiency of updating the limit value can be improved to a certain extent.
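This passage does not give a formula for deriving the updated limit values from the allocation state and the priority levels. Purely as an assumed policy, the Python sketch below splits the total video memory among running processes in proportion to their priority levels; the function name `recompute_limits` and the proportional rule are illustrative and should not be read as the method of the application.

```python
def recompute_limits(total_mib, levels):
    """Split the total video memory across processes in proportion to priority level.

    levels: {pid: priority level}, where a higher level receives a larger share.
    """
    weight_sum = sum(levels.values())
    if weight_sum == 0:
        return {pid: 0 for pid in levels}
    # Integer division keeps the sum of the limits at or below the total.
    return {pid: total_mib * level // weight_sum for pid, level in levels.items()}


# Limits are keyed per process, so an update only replaces the entry of that process.
limits = {101: 4000}
limits.update(recompute_limits(8000, {101: 3, 202: 1}))
```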
Based on the same inventive concept, the second aspect of the present application further provides a virtual machine for allocating GPU video memory. Since the principle by which the virtual machine solves the problem is similar to that of the allocation method described in the present application, the implementation of the virtual machine may refer to the implementation of the method, and repeated description is omitted.
Fig. 6 shows a schematic diagram of a virtual machine for allocating GPU video memory according to an embodiment of the present application, which specifically includes a receiving module 601, a user mode module 602, and a kernel mode module 603, where the user mode module 602 is connected to both the receiving module 601 and the kernel mode module 603;
the receiving module 601 is configured to receive a task request of a target process;
the user mode module 602 is configured to obtain a video memory limit value corresponding to the target process according to the allocation status of the total video memory and the priority level of each process, and transmit the video memory limit value to the kernel mode module 603;
the kernel mode module 603 is configured to store the received video memory limit value to an environment block memory area of the target process;
the kernel mode module 603 is further configured to allocate a video memory to the target process according to the stored video memory limit value and the task request.
In yet another embodiment, the user mode module 602 is further configured to:
creating a first process file of the target process based on the task request;
the first process file is transferred to the kernel mode module 603.
In yet another embodiment, the kernel mode module 603 is specifically configured to:
parsing the first process file transmitted by the user mode module 602 to obtain the video memory request value included in the task request;
and performing video memory allocation on the target process based on the video memory request value, the allocated video memory value included in the allocation state, and the video memory limit value, wherein the allocated video memory value is the video memory value already allocated to the target process.
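The format of the first process file is not specified in this passage. As a sketch under that caveat, the Python example below assumes a small JSON document carrying the process identifier and the video memory request value; the field names `pid` and `vram_request_mib` and both function names are hypothetical.

```python
import json


def create_process_file(pid, request_mib):
    """User mode side: serialise the task request into a process file (assumed JSON)."""
    return json.dumps({"pid": pid, "vram_request_mib": request_mib})


def parse_process_file(text):
    """Kernel mode side: recover the video memory request value from the file."""
    data = json.loads(text)
    request = int(data["vram_request_mib"])
    if request < 0:
        raise ValueError("video memory request value must be non-negative")
    return data["pid"], request


pid, request_mib = parse_process_file(create_process_file(101, 256))
```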
In yet another embodiment, the kernel mode module 603 is further configured to:
determining a video memory allocation value of the target process under the condition that the sum of the video memory request value and the allocated video memory value is smaller than the video memory limit value;
and based on the video memory allocation value, performing video memory allocation on the target process.
In yet another embodiment, the kernel mode module 603 is further configured to:
generating a prompt message indicating insufficient video memory under the condition that the sum of the video memory request value and the allocated video memory value is greater than or equal to the video memory limit value;
transmitting the prompt message to the user mode module 602.
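The two branches described in the preceding embodiments (allocate when the sum is smaller than the limit, otherwise report insufficient video memory) can be sketched in Python as follows. The function name `try_allocate`, the tuple return shape, and the message text are assumptions; the comparison itself follows the conditions stated above, including rejection when the sum equals the limit.

```python
def try_allocate(request_mib, allocated_mib, limit_mib):
    """Return (granted, new_allocated_mib, prompt_message).

    request_mib:   video memory request value from the task request
    allocated_mib: video memory value already allocated to the target process
    limit_mib:     video memory limit value read from the environment block
    """
    if request_mib + allocated_mib < limit_mib:
        # Strictly below the limit: the request can be granted.
        return True, allocated_mib + request_mib, None
    # Sum is greater than or equal to the limit: reject and report.
    return False, allocated_mib, "insufficient video memory"


granted = try_allocate(100, 200, 400)
rejected = try_allocate(200, 200, 400)
```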
In yet another embodiment, the kernel mode module 603 is further configured to:
determining a target area of the memory area of the environment block of the target process according to the target address;
and storing the video memory limit value to the target area.
In yet another embodiment, the user mode module 602 is further configured to:
retrieving driving information of a GPU driver, wherein the driving information comprises a video memory value of each process in operation and a video memory value which is not allocated;
and based on the driving information, obtaining the distribution state of the total video memory.
In yet another embodiment, the user mode module 602 is further configured to:
determining the running processes;
obtaining the priority level of each running process.
In yet another embodiment, the user mode module 602 is further configured to:
determining whether an update condition is satisfied, the update condition including an update period and a trigger action;
if yes, acquiring the allocation state of the total video memory and the priority level of each process;
and updating the video memory limit value of the running process based on the allocation state and the priority level.
According to the method and the virtual machine of the present application, the video memory limit value corresponding to the target process can be determined in real time according to the allocation state of the total video memory and the priority level of each process. That is, the video memory limit value corresponding to a process can be dynamically adjusted in real time according to the usage of the GPU video memory, which greatly improves the flexibility and efficiency of GPU video memory management.
Furthermore, although exemplary embodiments have been described herein, the scope thereof includes any and all embodiments having equivalent elements, modifications, omissions, combinations (e.g., of aspects across various embodiments), adaptations, or alterations based on the present application. Elements in the claims are to be construed broadly based on the language employed in the claims and are not limited to examples described in the present specification or during the prosecution of the present application, which examples are to be construed as non-exclusive. It is intended, therefore, that the specification and examples be considered as exemplary only, with a true scope and spirit being indicated by the following claims and their full scope of equivalents.
The above description is intended to be illustrative and not restrictive. For example, the above-described examples (or one or more aspects thereof) may be used in combination with each other, and other embodiments may be used by those of ordinary skill in the art upon reading the above description. In addition, in the above detailed description, various features may be grouped together to streamline the application. This is not to be interpreted as an intention that an unclaimed disclosed feature is essential to any claim. Rather, the subject matter of the present application may lie in less than all of the features of a particular disclosed embodiment. Thus, the following claims are hereby incorporated into the detailed description as examples or embodiments, with each claim standing on its own as a separate embodiment, and it is contemplated that these embodiments may be combined with one another in various combinations or permutations. The scope of the application should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
While various embodiments of the present application have been described in detail, the present application is not limited to these specific embodiments. Those skilled in the art can make various modifications and variations based on the conception of the present application, and such modifications and variations fall within the scope of the present application as defined in the appended claims.

Claims (10)

1. A method for allocating GPU video memory, applied to a virtual machine for GPU video memory, wherein the virtual machine comprises a user mode module and a kernel mode module, and the allocation method comprises:
receiving a task request of a target process;
the user mode module determines a video memory limit value corresponding to the target process based on the allocation state of the total video memory and the priority level of each process;
transmitting the video memory limit value to a kernel mode module;
the kernel mode module stores the video memory limit value to an environment block memory area of the target process;
and performing video memory allocation on the target process through the kernel mode module according to the stored video memory limit value and the task request.
2. The allocation method according to claim 1, further comprising, after receiving the task request of the target process and before performing video memory allocation on the target process through the kernel mode module:
creating a first process file of the target process through the user mode module based on the task request;
and transmitting the first process file to the kernel mode module.
3. The allocation method according to claim 1, wherein performing video memory allocation on the target process through the kernel mode module according to the stored video memory limit value and the task request comprises:
parsing, through the kernel mode module, a first process file transmitted by the user mode module to obtain a video memory request value included in the task request;
and performing video memory allocation on the target process through the kernel mode module based on the video memory request value, the allocated video memory value included in the allocation state and the video memory limit value, wherein the allocated video memory value is the video memory value allocated to the target process.
4. The allocation method according to claim 3, wherein performing video memory allocation on the target process through the kernel mode module based on the video memory request value, the allocated video memory value included in the allocation state, and the video memory limit value comprises:
determining, by the kernel mode module, a video memory allocation value of the target process when a sum of the video memory request value and the allocated video memory value is less than the video memory limit value;
and based on the video memory allocation value, performing video memory allocation on the target process through the kernel mode module.
5. The allocation method according to claim 3, wherein performing video memory allocation on the target process through the kernel mode module based on the video memory request value, the allocated video memory value included in the allocation state, and the video memory limit value comprises:
generating, through the kernel mode module, a prompt message indicating insufficient video memory under the condition that the sum of the video memory request value and the allocated video memory value is greater than or equal to the video memory limit value;
and the kernel mode module transmits the prompt information to the user mode module.
6. The allocation method according to claim 1, wherein the kernel mode module stores the video memory limit value to an environment block memory area of the target process, comprising:
according to the target address, determining a target area of the memory area of the environment block of the target process through the kernel mode module;
and storing the video memory limit value to the target area.
7. The allocation method according to claim 1, further comprising, before determining the video memory limit value corresponding to the target process:
the user mode module invokes driving information of the GPU driver, wherein the driving information comprises a video memory value of each process in operation and a video memory value which is not allocated;
and based on the driving information, obtaining the distribution state of the total video memory.
8. The allocation method according to claim 1, further comprising, before determining the video memory limit value corresponding to the target process:
the user mode module determines a running process;
the priority level of each process in operation is obtained.
9. The allocation method according to claim 1, further comprising, after performing video memory allocation on the target process through the kernel mode module:
the user mode module determines whether an update condition is met, wherein the update condition comprises an update period and a trigger action;
if so, acquiring the allocation state of the total video memory and the priority level of each process through the user mode module;
and updating the video memory limit value of the running process through the user mode module based on the allocation state and the priority level.
10. A virtual machine for allocating GPU video memory, comprising a receiving module, a user mode module, and a kernel mode module, wherein the user mode module is connected to both the receiving module and the kernel mode module;
the receiving module is used for receiving a task request of the target process;
the user mode module is used for obtaining a video memory limit value corresponding to the target process according to the distribution state of the total video memory and the priority level of each process, and transmitting the video memory limit value to the kernel mode module;
the kernel mode module is used for storing the received video memory limit value to an environment block memory area of the target process;
the kernel mode module is further configured to allocate the video memory to the target process according to the stored video memory limit value and the task request.
CN202311753467.9A 2023-12-19 2023-12-19 GPU video memory allocation method and virtual machine Pending CN117785451A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311753467.9A CN117785451A (en) 2023-12-19 2023-12-19 GPU video memory allocation method and virtual machine

Publications (1)

Publication Number Publication Date
CN117785451A true CN117785451A (en) 2024-03-29

Family

ID=90382781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311753467.9A Pending CN117785451A (en) 2023-12-19 2023-12-19 GPU video memory allocation method and virtual machine

Country Status (1)

Country Link
CN (1) CN117785451A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination