CN111223036B - GPU (graphics processing unit) virtualization sharing method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN111223036B (application CN201911386438.7A)
- Authority
- CN
- China
- Prior art keywords
- gpu
- target
- virtual
- library
- hijacking
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/4555—Para-virtualisation, i.e. guest operating system has to be modified
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
Abstract
The application discloses a GPU virtualization sharing method and device, an electronic device and a computer-readable storage medium, wherein the method comprises the following steps: determining a target physical GPU and dividing the target physical GPU into a plurality of virtual GPUs; when a target task is received, selecting a target virtual GPU from the target physical GPU so as to execute the target task by using the target virtual GPU; and in the process of executing the target task, when a call request for a CUDA function is received, redirecting the call request to a hijacking library, wherein the hijacking library comprises a plurality of same-name functions corresponding to CUDA functions, and the return value of each same-name function is determined based on the information of the target virtual GPU. According to the GPU virtualization sharing method provided by the application, one physical GPU resource is partitioned into a plurality of virtual GPU resources through the hijacking technology and the isolation technology, so that multi-user and multi-task sharing of the GPU is realized.
Description
Technical Field
The present application relates to the field of computer technology, and more particularly, to a GPU virtualization sharing method and apparatus, and an electronic device and a computer readable storage medium.
Background
With the technical progress of deep learning, the rapid development of artificial intelligence has been greatly promoted, and the training and inference of deep learning depend heavily on the GPU (Graphics Processing Unit). Some deep learning models are small, so the GPU cannot run at full load; however, the current mainstream frameworks execute deep learning tasks in a mode that monopolizes the GPU, which is likely to waste GPU resources.
Therefore, how to implement sharing of GPUs is a technical problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The application aims to provide a GPU (graphics processing unit) virtualization sharing method and device, an electronic device and a computer-readable storage medium that achieve sharing of a GPU.
In order to achieve the above object, the present application provides a GPU virtualization sharing method, including:
determining a target physical GPU and dividing the target physical GPU into a plurality of virtual GPUs;
when a target task is received, selecting a target virtual GPU from the target physical GPU so as to execute the target task by using the target virtual GPU;
in the process of executing the target task, when a call request for a CUDA function is received, redirecting the call request to a hijacking library; the hijacking library comprises a plurality of same-name functions corresponding to CUDA functions, and the return value of each same-name function is determined based on the information of the target virtual GPU.
Wherein, still include:
and determining a file path of the hijacking library, and setting the LD_PRELOAD environment variable to the file path.
Wherein redirecting the call request to a hijacking library comprises:
loading the hijacking library through a loader based on the LD_PRELOAD environment variable, and searching the hijacking library for a target function corresponding to the call request;
and executing the target function based on the information of the target virtual GPU to obtain a response result of the call request.
The dividing the target physical GPU into a plurality of virtual GPUs includes:
determining a ratio of the target physical GPU to the virtual GPU and dividing the target physical GPU into a plurality of virtual GPUs based on the ratio.
To achieve the above object, the present application provides a GPU virtualization sharing device, including:
the dividing module is used for determining a target physical GPU and dividing the target physical GPU into a plurality of virtual GPUs;
the selecting module is used for selecting a target virtual GPU from the target physical GPU when a target task is received, so that the target task is executed by the target virtual GPU;
the redirection module is used for redirecting the call request to the hijacking library when a call request for a CUDA function is received in the process of executing the target task; the hijacking library comprises a plurality of same-name functions corresponding to CUDA functions, and the return value of each same-name function is determined based on the information of the target virtual GPU.
Wherein, still include:
and the setting module is used for determining the file path of the hijacking library and setting the LD_PRELOAD environment variable to the file path.
Wherein the redirection module comprises:
the searching unit is used for loading the hijacking library through a loader based on the LD_PRELOAD environment variable when receiving a call request for the CUDA function in the process of executing the target task, and searching the target function corresponding to the call request in the hijacking library;
and the execution unit is used for executing the target function based on the information of the target virtual GPU to obtain a response result of the call request.
The dividing module is specifically configured to determine a target physical GPU, determine a ratio of the target physical GPU to the virtual GPUs, and divide the target physical GPU into a plurality of virtual GPUs based on the ratio.
To achieve the above object, the present application provides an electronic device including:
a memory for storing a computer program;
and the processor is used for realizing the steps of the GPU virtualization sharing method when executing the computer program.
To achieve the above object, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the GPU virtualization sharing method as described above.
According to the scheme, the GPU virtualization sharing method provided by the application comprises the following steps: determining a target physical GPU and dividing the target physical GPU into a plurality of virtual GPUs; when a target task is received, selecting a target virtual GPU from the target physical GPU so as to execute the target task by using the target virtual GPU; and in the process of executing the target task, when a call request for a CUDA function is received, redirecting the call request to a hijacking library; the hijacking library comprises a plurality of same-name functions corresponding to CUDA functions, and the return value of each same-name function is determined based on the information of the target virtual GPU.
According to the GPU virtualization sharing method provided by the application, through the hijacking technology, a user's calls to CUDA (Compute Unified Device Architecture) functions are redirected into calls to hijacking library functions. Through the isolation technology, one physical GPU resource is partitioned into a plurality of virtual GPU resources; the virtual GPUs are isolated from each other, and a user can use a virtual GPU as a complete physical GPU, so that multi-user and multi-task sharing of the GPU is realized. The application also discloses a GPU virtualization sharing device, an electronic device and a computer-readable storage medium, which can achieve the same technical effects.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate the disclosure and together with the description serve to explain, but do not limit the disclosure. In the drawings:
FIG. 1 is a flowchart illustrating a method for GPU virtualization sharing, according to an example embodiment;
FIG. 2 is a flowchart illustrating another method of GPU virtualization sharing, according to an example embodiment;
FIG. 3 is a flow chart of an embodiment of an application provided by the present application;
FIG. 4 is a block diagram of a GPU virtualization sharing device, according to an example embodiment;
fig. 5 is a block diagram of an electronic device, according to an example embodiment.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The embodiment of the application discloses a GPU virtualization sharing method, which realizes the sharing of GPUs.
Referring to fig. 1, a flowchart of a GPU virtualization sharing method is shown according to an exemplary embodiment, as shown in fig. 1, including:
s101: determining a target physical GPU and dividing the target physical GPU into a plurality of virtual GPUs;
the objective of the present embodiment is to achieve sharing of a target physical GPU, in this step, the target physical GPU is first divided into multiple virtual GPUs, each virtual GPU is isolated from each other, and a user can use each virtual GPU as a complete physical GPU. It will be appreciated that, most important for a GPU is memory resources, and GPU virtualization necessarily isolates memory resources separately for use by different virtual GPUs, and that different virtual GPUs cannot use memory resources of other virtual GPUs. All GPU memories are managed through CUDA functions, and in the embodiment, the CUDA functions are redefined in a hijacking library, so that the memory of the target physical GPU is divided into a plurality of parts, and each virtual GPU obtains a part to provide service for one user.
It should be noted that the specific partitioning manner is not limited in this embodiment. Preferably, the step of dividing the target physical GPU into a plurality of virtual GPUs includes: determining a ratio of the target physical GPU to the virtual GPUs, and dividing the target physical GPU into a plurality of virtual GPUs based on the ratio. In a specific implementation, the ratio may be determined based on the memory size of the target physical GPU; for example, if the ratio of the target physical GPU to the virtual GPUs is 1:5, the target physical GPU is divided into 5 virtual GPUs. The memory size of each virtual GPU is not limited here, and the memory sizes of the virtual GPUs may be the same or different.
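The ratio-based partitioning described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the `VirtualGPU` type, equal-sized quotas, and the 16 GiB / 1:5 figures are assumptions for the example.

```python
from dataclasses import dataclass

@dataclass
class VirtualGPU:
    index: int
    memory_bytes: int  # memory quota carved out of the physical GPU

def partition_physical_gpu(total_memory_bytes: int, ratio: int) -> list[VirtualGPU]:
    """Split one physical GPU's memory into `ratio` virtual GPUs of equal size."""
    quota = total_memory_bytes // ratio
    return [VirtualGPU(index=i, memory_bytes=quota) for i in range(ratio)]

# A hypothetical 16 GiB physical GPU split at a 1:5 ratio.
vgpus = partition_physical_gpu(16 * 1024**3, 5)
```

As the description notes, the quotas need not be equal; a real configuration could assign each virtual GPU a different share.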
S102: when a target task is received, selecting a target virtual GPU from the target physical GPU so as to execute the target task by using the target virtual GPU;
in this step, when a target task is received, a target virtual GPU is selected from the target physical GPU, and the target task is implemented by using the resources of the target virtual GPU.
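The selection step can be sketched as below. The patent does not fix a selection policy, so the first-free policy here is purely an assumption for illustration.

```python
def select_target_vgpu(vgpu_indices: list[int], busy: set[int]):
    """First-free selection policy: return the first idle virtual GPU index,
    or None when every virtual GPU is already executing a task."""
    for idx in vgpu_indices:
        if idx not in busy:
            return idx
    return None  # all virtual GPUs busy; the target task must wait
```

A scheduler using this helper would mark the returned index busy, run the target task on that virtual GPU, and release the index on completion.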
S103: in the process of executing the target task, when a call request for a CUDA function is received, redirecting the call request to a hijacking library; the hijacking library comprises a plurality of same-name functions corresponding to CUDA functions, and the return value of each same-name function is determined based on the information of the target virtual GPU.
It will be appreciated that all deep learning software frameworks implement parallel computation on the GPU via the CUDA toolkit. Taking TensorFlow as an example, TensorFlow calls a large number of CUDA toolkit functions in its source code. If the CUDA toolkit is hijacked, the requests from TensorFlow to the GPU and the feedback from the GPU to TensorFlow can be completely taken over at the bottom layer, so that the data can be masked and modified, thereby realizing virtualization of the GPU.
During the start-up of any task process, dynamic link libraries are loaded through the loader. The loader searches the file system for the required dynamic link libraries according to the environment variables and the system settings. Function calls to CUDA can be redirected into the hijacking library through the LD_PRELOAD environment variable. That is, this embodiment further includes: determining a file path of the hijacking library, and setting the LD_PRELOAD environment variable to the file path.
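Launching a task process with LD_PRELOAD set can be sketched as follows. The library path `/opt/vgpu/libhijack.so` and the `VGPU_INDEX` variable are hypothetical names invented for this example; the patent only specifies that LD_PRELOAD points at the hijacking library's file path.

```python
import os
import subprocess

HIJACK_LIB = "/opt/vgpu/libhijack.so"  # hypothetical file path of the hijacking library

def task_env(vgpu_index: int) -> dict:
    """Build the environment for a task process: LD_PRELOAD makes the loader
    load the hijacking library before every other dynamic link library."""
    env = dict(os.environ)
    env["LD_PRELOAD"] = HIJACK_LIB
    env["VGPU_INDEX"] = str(vgpu_index)  # assumed variable telling the library which virtual GPU to enforce
    return env

def launch_task(cmd: list[str], vgpu_index: int) -> subprocess.Popen:
    """Start the task; its calls to CUDA functions then resolve to the hijacking library."""
    return subprocess.Popen(cmd, env=task_env(vgpu_index))
```

Equivalently, an operator could export the variable in the shell before starting the framework process.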
The loader reads the LD_PRELOAD environment variable and loads the dynamic link library it specifies, i.e., the hijacking library, before loading all other dynamic link libraries. When a function is called and several loaded dynamic link libraries contain a function of the same name, the task process uses the one from the library loaded first, i.e., the hijacking library in this step. The hijacking library implements a large number of functions with the same names as CUDA functions, such as cuMemAlloc (allocates device memory), cuMemAllocManaged (allocates memory automatically managed by the unified memory system), cuMemAllocPitch (allocates pitched device memory), cuDeviceTotalMem (returns the total amount of memory on the device), cuMemGetInfo (gets the free and total memory), and the like. In this way, hijacking of the CUDA toolkit is realized, and the user's calls to CUDA functions are redirected to calls to hijacking library functions. The step of redirecting the call request to the hijacking library includes: loading the hijacking library through the loader based on the LD_PRELOAD environment variable, and searching the hijacking library for the target function corresponding to the call request; and executing the target function based on the information of the target virtual GPU to obtain a response result of the call request.
For example, when the target task process calls the cuMemGetInfo function to obtain the memory size, the memory size of the virtual GPU is returned rather than that of the physical GPU. For another example, when the target task process calls the cuMemAlloc function, an error is returned if the requested allocation would exceed the memory size of the virtual GPU.
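The behavior of those two same-name functions can be illustrated with a pure-Python mock of the per-virtual-GPU accounting. The real hijacking library would be a C shared object exporting the actual CUDA driver symbols; the class below only simulates the quota logic, and the error code value mirrors the CUDA driver API's `CUDA_ERROR_OUT_OF_MEMORY`.

```python
class VirtualGPUContext:
    """Simulates the memory accounting a hijacking library keeps per virtual GPU."""
    CUDA_SUCCESS = 0
    CUDA_ERROR_OUT_OF_MEMORY = 2  # value taken from the CUDA driver API CUresult enum

    def __init__(self, quota_bytes: int):
        self.quota = quota_bytes  # the virtual GPU's share of physical memory
        self.used = 0

    def cuMemGetInfo(self):
        """Return (free, total) of the *virtual* GPU, not the physical one."""
        return self.quota - self.used, self.quota

    def cuMemAlloc(self, size: int) -> int:
        """Admit the allocation only while it fits inside the virtual quota."""
        if self.used + size > self.quota:
            return self.CUDA_ERROR_OUT_OF_MEMORY
        self.used += size
        return self.CUDA_SUCCESS
```

A framework querying cuMemGetInfo through such a shim sees only its own quota, which is what makes each virtual GPU look like a complete, smaller physical GPU.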
According to the GPU virtualization sharing method provided by the embodiment of the application, the user's calls to CUDA functions are redirected into calls to hijacking library functions through the hijacking technology. Through the isolation technology, one physical GPU resource is partitioned into a plurality of virtual GPU resources; the virtual GPUs are isolated from each other, and a user can use a virtual GPU as a complete physical GPU, so that multi-user and multi-task sharing of the GPU is realized.
The embodiment of the application discloses a GPU virtualization sharing method, which further describes and optimizes the technical scheme relative to the previous embodiment. Specifically:
referring to fig. 2, a flowchart of another GPU virtualization sharing method is shown according to an exemplary embodiment, as shown in fig. 2, including:
S201: determining a file path of the hijacking library, and setting the LD_PRELOAD environment variable to the file path;
S202: determining a target physical GPU, determining a ratio of the target physical GPU to the virtual GPUs, and dividing the target physical GPU into a plurality of virtual GPUs based on the ratio;
S203: when a target task is received, selecting a target virtual GPU from the target physical GPU so as to execute the target task by using the target virtual GPU;
S204: in the process of executing the target task, when a call request for a CUDA function is received, loading the hijacking library through a loader based on the LD_PRELOAD environment variable, and searching a target function corresponding to the call request in the hijacking library;
s205: and executing the target function based on the information of the target virtual GPU to obtain a response result of the call request.
An embodiment of the present application, as shown in fig. 3, may include the following steps:
step one: configuring an LD_PRELOAD environment variable in the system to point to a file path of the hijacking library;
step two: configuring the ratio of the physical GPU to the virtual GPUs, which determines how many users share one physical GPU;
step three: starting a deep learning task through a software framework such as TensorFlow, and configuring the virtual GPU for the corresponding task.
step four: after TensorFlow starts, its calls to CUDA functions are hijacked, and the task is executed using the provided virtual GPU.
The following describes a GPU virtualization sharing device according to an embodiment of the present application, and the GPU virtualization sharing device and the GPU virtualization sharing method described above may be referred to with each other.
Referring to fig. 4, a structural diagram of a GPU virtualized sharing device is shown according to an exemplary embodiment, as shown in fig. 4, including:
the dividing module 401 is configured to determine a target physical GPU and divide the target physical GPU into a plurality of virtual GPUs;
a selecting module 402, configured to, when a target task is received, select a target virtual GPU from the target physical GPU so as to execute the target task by using the target virtual GPU;
a redirecting module 403, configured to redirect a call request to a hijacking library when the call request for a CUDA function is received during execution of the target task; the hijacking library comprises a plurality of same-name functions corresponding to CUDA functions, and the return value of each same-name function is determined based on the information of the target virtual GPU.
According to the GPU virtualization sharing device provided by the embodiment of the application, the user's calls to CUDA functions are redirected into calls to hijacking library functions through the hijacking technology. Through the isolation technology, one physical GPU resource is partitioned into a plurality of virtual GPU resources; the virtual GPUs are isolated from each other, and a user can use a virtual GPU as a complete physical GPU, so that multi-user and multi-task sharing of the GPU is realized.
On the basis of the above embodiment, as a preferred implementation manner, the method further includes:
and the setting module is used for determining the file path of the hijacking library and setting the LD_PRELOAD environment variable to the file path.
Based on the above embodiment, as a preferred implementation manner, the redirecting module 403 includes:
the searching unit is used for loading the hijacking library through a loader based on the LD_PRELOAD environment variable when receiving a call request for the CUDA function in the process of executing the target task, and searching the target function corresponding to the call request in the hijacking library;
and the execution unit is used for executing the target function based on the information of the target virtual GPU to obtain a response result of the call request.
On the basis of the foregoing embodiment, as a preferred implementation manner, the dividing module 401 is specifically configured to determine a target physical GPU, determine a ratio of the target physical GPU to the virtual GPUs, and divide the target physical GPU into a plurality of virtual GPUs based on the ratio.
The specific manner in which the various modules perform operations in the apparatus of the above embodiments has been described in detail in the embodiments of the method, and will not be described in detail here.
The present application also provides an electronic device, referring to fig. 5, and a block diagram of an electronic device 500 provided in an embodiment of the present application, as shown in fig. 5, may include a processor 11 and a memory 12. The electronic device 500 may also include one or more of a multimedia component 13, an input/output (I/O) interface 14, and a communication component 15.
The processor 11 is configured to control the overall operation of the electronic device 500 to complete all or part of the steps of the GPU virtualization sharing method. The memory 12 is used to store various types of data to support the operation of the electronic device 500; such data may include, for example, instructions for any application or method operating on the electronic device 500, as well as application-related data such as contact data, sent and received messages, pictures, audio, video, and so forth. The memory 12 may be implemented by any type of volatile or non-volatile memory device, or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk. The multimedia component 13 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals; the received audio signals may be further stored in the memory 12 or transmitted through the communication component 15. The audio component further comprises at least one speaker for outputting audio signals. The I/O interface 14 provides an interface between the processor 11 and other interface modules, such as a keyboard, a mouse, or buttons; these buttons may be virtual buttons or physical buttons. The communication component 15 is used for wired or wireless communication between the electronic device 500 and other devices.
The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G or 4G, or a combination of one or more of them; accordingly, the communication component 15 may include a Wi-Fi module, a Bluetooth module, and an NFC module.
In an exemplary embodiment, the electronic device 500 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the GPU virtualization sharing method described above.
In another exemplary embodiment, a computer readable storage medium is also provided that includes program instructions that, when executed by a processor, implement the steps of the GPU virtualization sharing method described above. For example, the computer readable storage medium may be the memory 12 described above including program instructions executable by the processor 11 of the electronic device 500 to perform the GPU virtualization sharing method described above.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and identical or similar parts among the embodiments may be referred to each other. For the device disclosed in the embodiments, since it corresponds to the method disclosed in the embodiments, the description is relatively simple, and relevant points may refer to the description of the method section. It should be noted that those skilled in the art can make various modifications and adaptations of the application without departing from its principles, and these modifications and adaptations are also intended to fall within the scope of the appended claims.
It should also be noted that in this specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Claims (6)
1. A method for GPU virtualization sharing, comprising:
determining a target physical GPU and dividing the target physical GPU into a plurality of virtual GPUs;
starting a deep learning task through a TensorFlow software framework, and configuring a virtual GPU for the corresponding task;
when a target task is received, selecting a target virtual GPU from the target physical GPU so as to execute the target task by using the target virtual GPU;
in the process of executing the target task, when a call request for a CUDA function is received, redirecting the call request to a hijacking library; the hijacking library comprises a plurality of same-name functions corresponding to CUDA functions, and the return value of each same-name function is determined based on the information of the target virtual GPU;
wherein, still include:
determining a file path of the hijacking library, and setting the LD_PRELOAD environment variable to the file path;
correspondingly, redirecting the call request to the hijacking library comprises:
loading the hijacking library through the loader based on the LD_PRELOAD environment variable, and searching the hijacking library for a target function corresponding to the call request;
and executing the target function based on the information of the target virtual GPU to obtain a response result of the call request.
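The hijacking mechanism of claim 1 can be illustrated with a conceptual sketch. The names below (`hijacked_mem_get_info`, the 16 GiB and 12 GiB figures, the 4 GiB quota) are hypothetical and chosen only for illustration; the patent's actual hijacking library would be a compiled shared library whose renamed functions shadow the CUDA driver calls once the loader resolves symbols through LD_PRELOAD.

```python
# Conceptual sketch of one renamed function in the hijacking library.
# In practice this would live in a C shared library preloaded via
# LD_PRELOAD, so the dynamic loader resolves the CUDA symbol to it
# before the real driver library is searched.

PHYS_TOTAL = 16 * 2**30  # pretend the physical GPU has 16 GiB
PHYS_FREE = 12 * 2**30   # of which 12 GiB are currently free

def real_mem_get_info():
    """Stands in for the real CUDA driver call the wrapper forwards to."""
    return PHYS_FREE, PHYS_TOTAL

def hijacked_mem_get_info(vgpu_quota):
    """Renamed wrapper with the same signature as the original call.

    Its return value is determined from the target virtual GPU's
    information: the reported total is the virtual GPU's quota, and
    the reported free memory is clamped to that quota.
    """
    phys_free, _phys_total = real_mem_get_info()
    return min(phys_free, vgpu_quota), vgpu_quota

# A task running on a 4 GiB virtual GPU sees only its own slice.
free_b, total_b = hijacked_mem_get_info(4 * 2**30)
print(free_b, total_b)
```

Because the wrapper keeps the original signature, the deep learning framework needs no modification: it believes it is talking to a small dedicated GPU, which is what gives the scheme its isolation property.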
2. The GPU virtualization sharing method of claim 1, wherein the partitioning the target physical GPU into a plurality of virtual GPUs comprises:
determining a ratio of the target physical GPU to the virtual GPU and dividing the target physical GPU into a plurality of virtual GPUs based on the ratio.
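The ratio-based division of claim 2 can be sketched as follows; the function name and the even-split policy are assumptions for illustration, not details taken from the patent.

```python
def partition_gpu(total_mem_bytes, ratio):
    """Split one physical GPU's memory into `ratio` equally sized
    virtual GPUs; any remainder from integer division is left unused."""
    if ratio <= 0:
        raise ValueError("ratio must be a positive integer")
    per_vgpu = total_mem_bytes // ratio
    return [per_vgpu] * ratio

# A 16 GiB physical card divided at a 1:4 ratio
# yields four 4 GiB virtual GPUs.
print(partition_gpu(16 * 2**30, 4))
```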
3. A GPU virtualization sharing device, comprising:
the dividing module is used for determining a target physical GPU and dividing the target physical GPU into a plurality of virtual GPUs;
the configuration module is used for starting a deep learning task through the Tensorflow software framework and assigning a virtual GPU to the corresponding task;
the selecting module is used for selecting a target virtual GPU from the target physical GPU when a target task is received, so that the target task is executed by the target virtual GPU;
the redirection module is used for redirecting the call request to the hijacking library when a call request for a CUDA function is received in the process of executing the target task; wherein the hijacking library comprises a plurality of renamed functions corresponding to the CUDA functions, and the return value of each renamed function is determined based on information of the target virtual GPU;
wherein the device further comprises:
the setting module is used for determining a file path of the hijacking library and setting an LD_PRELOAD environment variable as the file path;
correspondingly, the redirection module comprises:
the searching unit is used for loading the hijacking library through a loader based on the LD_PRELOAD environment variable when receiving a call request for the CUDA function in the process of executing the target task, and searching the target function corresponding to the call request in the hijacking library;
and the execution unit is used for executing the target function based on the information of the target virtual GPU to obtain a response result of the call request.
4. The GPU virtualization sharing device of claim 3, wherein the partitioning module is specifically a module that determines a target physical GPU, determines a ratio of the target physical GPU to the virtual GPU, and partitions the target physical GPU into a plurality of virtual GPUs based on the ratio.
5. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the GPU virtualization sharing method as claimed in claim 1 or 2 when executing the computer program.
6. A computer readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the steps of the GPU virtualization sharing method as claimed in claim 1 or 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911386438.7A CN111223036B (en) | 2019-12-29 | 2019-12-29 | GPU (graphics processing unit) virtualization sharing method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111223036A CN111223036A (en) | 2020-06-02 |
CN111223036B true CN111223036B (en) | 2023-11-03 |
Family
ID=70829170
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911386438.7A Active CN111223036B (en) | 2019-12-29 | 2019-12-29 | GPU (graphics processing unit) virtualization sharing method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111223036B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111913794B (en) * | 2020-08-04 | 2024-08-09 | 北京百度网讯科技有限公司 | Method, apparatus, electronic device and readable storage medium for sharing GPU |
CN113360185B (en) * | 2021-05-10 | 2023-06-23 | Tcl空调器(中山)有限公司 | Processing method and device of micro control unit of air conditioner external unit and micro control unit |
CN114595065A (en) * | 2022-03-15 | 2022-06-07 | 北京有竹居网络技术有限公司 | Data acquisition method and device, storage medium and electronic equipment |
CN117632447A (en) * | 2022-08-09 | 2024-03-01 | 第四范式(北京)技术有限公司 | GPU resource using method, GPU virtualization method, job scheduling device and cluster |
CN115601221B (en) * | 2022-11-28 | 2023-05-23 | 苏州浪潮智能科技有限公司 | Resource allocation method and device and artificial intelligent training system |
CN116578416B (en) * | 2023-04-26 | 2024-07-30 | 中国人民解放军92942部队 | Signal-level simulation acceleration method based on GPU virtualization |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103761139A (en) * | 2014-01-25 | 2014-04-30 | 湖南大学 | General purpose computation virtualization implementation method based on dynamic library interception |
CN104380256A (en) * | 2012-04-19 | 2015-02-25 | 加泰罗尼亚理工大学 | Method, system and executable piece of code for virtualisation of hardware resource associated with computer system |
CN108984264A (en) * | 2017-06-02 | 2018-12-11 | 阿里巴巴集团控股有限公司 | The implementation method of virtual GPU, apparatus and system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8643656B2 (en) * | 2010-09-30 | 2014-02-04 | Nec Laboratories America, Inc. | Energy-aware task consolidation on graphics processing unit (GPU) |
- 2019-12-29 CN CN201911386438.7A patent/CN111223036B/en active Active
Non-Patent Citations (2)
Title |
---|
A virtualization-based multi-GPU deep neural network training framework; Yang Zhigang et al.; Computer Engineering; 2017-04-21 (No. 02); full text *
Research on multi-task-oriented GPU general-purpose computing virtualization technology; Zhang Yunzhou et al.; Computer Engineering and Science; 2013-11-15 (No. 11); full text *
Also Published As
Publication number | Publication date |
---|---|
CN111223036A (en) | 2020-06-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111223036B (en) | GPU (graphics processing unit) virtualization sharing method and device, electronic equipment and storage medium | |
US9946525B2 (en) | Extracting source code | |
KR102000266B1 (en) | Identifiers across application instances | |
CA2768752C (en) | Terminal device of non-android platform for executing android applications, and computer readable recording medium for storing program of executing android applications on non-android platform | |
EP3035191B1 (en) | Identifying source code used to build executable files | |
CN105573734B (en) | method and equipment for providing SDK file | |
CN113297566B (en) | Sandbox implementation method, device, equipment and storage medium | |
US9678767B2 (en) | Unified extensible firmware interface (UEFI) driver and protocol | |
US10664278B2 (en) | Method and apparatus for hardware acceleration in heterogeneous distributed computing | |
CN106469071B (en) | Application theme changing method and device | |
CN110362356B (en) | Function data processing method and device, computer equipment and storage medium | |
US11610155B2 (en) | Data processing system and data processing method | |
EP3021216A1 (en) | Incremental source code analysis | |
CN112424765A (en) | Container framework for user-defined functions | |
CN113986402A (en) | Function calling method and device, electronic equipment and storage medium | |
JP2024536659A (en) | Task execution method, apparatus, storage medium and electronic device | |
US20110167405A1 (en) | Application building system, method and computer-readable medium | |
CN112235132A (en) | Method, device, medium and server for dynamically configuring service | |
US20220156363A1 (en) | Multi -tenant actor systems with web assembly | |
CN117555563A (en) | Method and device for constructing platform mirror image, computer equipment and storage medium | |
US20230267005A1 (en) | Thread management | |
CN114860401A (en) | Heterogeneous cloud desktop scheduling system, method, service system, device and medium | |
CN113918290A (en) | API calling method and device | |
CN110427224B (en) | EJB module loading method and device, server and readable storage medium | |
US20230359440A1 (en) | Externally-initiated runtime type extension |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |