CN116860391A - GPU computing power resource scheduling method, device, equipment and medium - Google Patents

GPU computing power resource scheduling method, device, equipment and medium

Info

Publication number
CN116860391A
CN116860391A (application number CN202310801620.4A)
Authority
CN
China
Prior art keywords
gpu
target
virtual machine
power resource
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310801620.4A
Other languages
Chinese (zh)
Inventor
田玉凯
刘昌松
徐莉芳
曹绍猛
靳新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peng Cheng Laboratory
Original Assignee
Peng Cheng Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peng Cheng Laboratory filed Critical Peng Cheng Laboratory
Priority to CN202310801620.4A priority Critical patent/CN116860391A/en
Publication of CN116860391A publication Critical patent/CN116860391A/en
Pending legal-status Critical Current

Classifications

    • G06F9/45558 — Hypervisor-specific management and integration aspects
    • G06F9/44505 — Configuring for program initiating, e.g. using registry, configuration files
    • G06F9/5027 — Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F9/5077 — Logical partitioning of resources; management or configuration of virtualized resources
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Power Sources (AREA)

Abstract

The disclosure provides a GPU (graphics processing unit) computing power resource scheduling method, apparatus, device and medium. The GPU computing power resource scheduling method comprises the following steps: receiving a GPU computing power resource request from a target virtual machine; selecting a target GPU device from a plurality of GPU devices according to the GPU computing power resource request; generating a configuration file based on the target GPU device; loading the configuration file with a device management driver, and establishing a passthrough connection between the target virtual machine and the target GPU device with a device transmission driver; transparently transmitting the resource data in the target GPU device to the target virtual machine with the device transmission driver; and, after the target virtual machine finishes computing with the computing power resources in the target GPU device, modifying the configuration file, loading the configuration file with the device management driver, and releasing the passthrough connection between the target virtual machine and the target GPU device. The method can reduce resource loss in the GPU computing power resource scheduling process and improve scheduling efficiency. The embodiments of the disclosure can be applied to cloud computing, artificial intelligence and the like.

Description

GPU computing power resource scheduling method, device, equipment and medium
Technical Field
The disclosure relates to the field of cloud computing, and in particular relates to a method, a device, equipment and a medium for scheduling GPU (graphics processing Unit) computational resources.
Background
In current cloud computing, to increase computing efficiency, additional computing resources, such as GPU computing resources, are required. If the GPU computing power resources are directly bound to the computing device, a high computing cost is incurred. Thus, in the prior art, the computing power resources of multiple GPU devices are pooled to form a resource pool on a server. The user may invoke the computing resources in the resource pool as required by the task. However, when the resource pooling is performed, it is also necessary to perform memory division on all GPU cards, which may cause a loss of computing power resources. And when the user uses the GPU computing power resource each time, the user needs to pass through the server to read the storage position of the computing power resource, so that the data transmission efficiency is reduced, and the loss of the computing power resource can occur in the transmission process.
Disclosure of Invention
The embodiments of the disclosure provide a method, an apparatus, a device and a medium for scheduling GPU (graphics processing unit) computing power resources, which can reduce computing power resource loss and improve data transmission efficiency while realizing automatic invocation of GPU computing power resources.
According to an aspect of the present disclosure, there is provided a GPU computing power resource scheduling method, including:
receiving a GPU computing power resource request from a target virtual machine;
selecting a target GPU device from a plurality of GPU devices according to the GPU computing power resource request;
generating a configuration file based on the target GPU device;
loading the configuration file with a device management driver, and establishing a passthrough connection between the target virtual machine and the target GPU device with a device transmission driver;
transparently transmitting the resource data in the target GPU device to the target virtual machine with the device transmission driver;
and after the target virtual machine finishes computing with the computing power resources in the target GPU device, modifying the configuration file, loading the configuration file with the device management driver, and releasing the passthrough connection between the target virtual machine and the target GPU device.
According to an aspect of the present disclosure, there is provided a GPU computing power resource scheduling apparatus, including:
a receiving unit, configured to receive a GPU computing power resource request from a target virtual machine;
an allocation unit, configured to select a target GPU device from a plurality of GPU devices according to the GPU computing power resource request;
a generating unit, configured to generate a configuration file based on the target GPU device;
a passthrough establishment unit, configured to load the configuration file with a device management driver, and to establish a passthrough connection between the target virtual machine and the target GPU device with a device transmission driver;
a transparent transmission unit, configured to transparently transmit the resource data in the target GPU device to the target virtual machine with the device transmission driver;
and a passthrough unbinding unit, configured to, after the target virtual machine finishes computing with the computing power resources in the target GPU device, modify the configuration file, load the configuration file with the device management driver, and release the passthrough connection between the target virtual machine and the target GPU device.
Optionally, the plurality of GPU devices are bound to the server;
the passthrough establishment unit is further configured to:
acquire a node address of the target virtual machine;
unbind the target GPU device from the server;
and load the configuration file with the device management driver, and establish the passthrough connection between the target virtual machine and the target GPU device with the device transmission driver according to the node address.
Optionally, the passthrough establishment unit is further configured to:
acquire a total port address of the target GPU device;
convert the total port address into a first virtual address;
acquire a functional port address of the target GPU device;
convert the functional port address into a second virtual address;
and establish a mapping relation between the total port address and the first virtual address, and between the functional port address and the second virtual address.
Optionally, the passthrough establishment unit is further configured to:
create a virtual GPU device in the target virtual machine with the device transmission driver, based on the first virtual address and the second virtual address;
and establish a passthrough connection between the virtual GPU device and the target GPU device.
Optionally, the transparent transmission unit is further configured to:
encapsulate the resource data in the target GPU device into a resource packet, and encrypt the resource packet;
and transmit the encrypted resource packet to the target virtual machine.
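The encapsulate-and-encrypt step might look like the sketch below. The XOR keystream is a deliberately simple stand-in cipher for illustration only (the disclosure prescribes no particular cipher; a real deployment would use an authenticated cipher such as AES-GCM), and all names are hypothetical.

```python
import json
import zlib


def pack_and_encrypt(resource_data, key: bytes) -> bytes:
    """Encapsulate resource data into a compressed packet, then obscure it
    with a repeating-key XOR (illustrative stand-in for real encryption)."""
    payload = zlib.compress(json.dumps(resource_data).encode())
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(payload))


def decrypt_and_unpack(packet: bytes, key: bytes):
    """Reverse of pack_and_encrypt: undo the XOR, decompress, parse."""
    payload = bytes(b ^ key[i % len(key)] for i, b in enumerate(packet))
    return json.loads(zlib.decompress(payload))
```

A round trip recovers the original resource data, which is the property the transparent transmission step relies on.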
Optionally, the passthrough unbinding unit is further configured to:
detect the computing power resource usage of the target GPU device at a preset period;
if the target GPU device has stopped computing, detect again after a preset time period, and if the target GPU device has still stopped computing, modify the configuration file;
and load the configuration file with the device management driver, and release the passthrough connection between the target GPU device and the target virtual machine.
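The two-stage idle check (detect, wait a preset period, detect again, and only then unbind) can be expressed as a small function. The function name and the injected `wait` callback are illustrative, not part of the disclosure; injecting `wait` just makes the logic testable without real delays.

```python
def confirm_idle(sample_usage, wait, grace_period: float = 30.0) -> bool:
    """Two-stage idle check before releasing the passthrough connection.

    sample_usage: callable returning the GPU's current utilization.
    wait:         callable that sleeps for the given number of seconds.
    Returns True only if the device is idle on both checks.
    """
    if sample_usage() > 0:
        return False              # still computing; keep the passthrough
    wait(grace_period)            # re-detect after the preset time period
    return sample_usage() == 0    # unbind only if it is still idle
```

The second check guards against unbinding a device that merely paused between computation phases.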
Optionally, the GPU computing power resource scheduling apparatus further includes:
a monitoring unit, configured to monitor the computing power resource usage of the target GPU device in real time;
a visualization unit, configured to visually present the computing power resource usage;
and a characteristic generating unit, configured to generate computing power resource usage characteristics of the GPU devices based on the computing power resource usage.
Optionally, the GPU computing power resource scheduling apparatus further includes:
a training unit, configured to train a target GPU device determination model with the computing power resource usage characteristics;
the allocation unit is further configured to: input the GPU computing power resource request into the target GPU device determination model to obtain the target GPU device.
According to an aspect of the present disclosure, there is provided an electronic device including a memory storing a computer program and a processor implementing the GPU power resource scheduling method as described above when executing the computer program.
According to an aspect of the present disclosure, there is provided a computer readable storage medium storing a computer program which, when executed by a processor, implements a GPU power resource scheduling method as described above.
According to an aspect of the present disclosure, there is provided a computer program product comprising a computer program that is read and executed by a processor of a computer device, causing the computer device to perform the GPU power resource scheduling method as described above.
In the embodiments of the disclosure, after receiving a GPU computing power resource request from a target virtual machine, the server selects one of a plurality of GPU devices as the target GPU device allocated to the target virtual machine, generates a configuration file for the target GPU device, loads the configuration file with a device management driver, and establishes a passthrough connection between the target virtual machine and the target GPU device. The device management driver manages the target virtual machine and detects the usage of the target GPU device, so that the GPU computing power resource scheduling process is monitorable and emergencies can be handled. Establishing the connection between the target virtual machine and the target GPU device through the configuration file improves connection stability. The device transmission driver improves the efficiency of establishing the passthrough connection, and also transparently transmits the resource data in the target GPU device to the target virtual machine, improving the accuracy of data transmission. Through this framework, the target virtual machine can use a complete GPU device; the GPU device does not need to undergo video memory partitioning, so its complete computing performance is retained. Meanwhile, the target virtual machine can directly access the storage location of the target GPU device and find the allocated GPU computing power resources without going through the server, which improves data transmission efficiency and reduces computing power resource loss. After the target virtual machine finishes computing, the passthrough connection with the target GPU device is released, avoiding the increased computing cost and resource waste that continuous binding would cause.
Therefore, the embodiments of the disclosure realize automatic GPU resource scheduling, save computing cost, improve data transmission efficiency in the GPU computing power resource scheduling process, and reduce computing power resource loss.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the disclosure. The objectives and other advantages of the disclosure will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosed embodiments and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain, without limitation, the disclosed embodiments.
FIG. 1 is an architecture diagram of a GPU computing power resource scheduling method provided by embodiments of the present disclosure;
FIG. 2 is a flow chart of a GPU power resource scheduling method of an embodiment of the present disclosure;
FIG. 3 is a specific flowchart of step 240 of FIG. 2;
FIG. 4 is another specific flowchart of step 240 of FIG. 2;
FIG. 5 is another specific flowchart of step 240 of FIG. 2;
FIG. 6 is a schematic diagram of invoking a GPU computing power resource from the perspective of the target virtual machine;
FIG. 7 is a specific flowchart of detecting computing power resource usage, added after step 240;
FIG. 8 is an architecture diagram of an embodiment of the present disclosure with a visualization server added;
FIG. 9 is a specific flowchart of step 250 of FIG. 2;
FIG. 10 is a specific flowchart of step 260 of FIG. 2;
FIG. 11 is an exemplary structural schematic of an embodiment of the present disclosure;
FIG. 12 is a block diagram of a GPU computing power resource scheduling apparatus in accordance with an embodiment of the present disclosure;
FIG. 13 is a block diagram of a terminal implementing the GPU computing power resource scheduling method shown in FIG. 2, in accordance with an embodiment of the present disclosure;
FIG. 14 is a block diagram of a server implementing the GPU computing power resource scheduling method shown in FIG. 2, in accordance with an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more apparent, the present disclosure will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present disclosure.
Before proceeding to a further detailed description of the disclosed embodiments, the terms involved in the disclosed embodiments are explained; the following explanations apply throughout:
cloud computing: a dynamically scalable computing service is provided on demand over a network. According to the user's needs, available, convenient, on-demand network access is provided, and a configurable pool of computing resources (resources including networks, servers, storage, applications, services) is entered, which can be provided quickly with little effort to manage or interact with the service provider.
Graphics Processing Unit (GPU): a GPU is a graphics processor that performs image and graphics-related operations on personal computers, workstations, gaming machines, and mobile devices.
Video memory: mainly used to store the graphics information to be processed, cooperating with the GPU for graphics processing.
Virtual machine: a complete computer system simulated by software. Work that can be done in a physical computer can be done in a virtual machine. When creating a virtual machine in a computer, a part of hard disk and memory capacity of the physical machine are required to be used as the hard disk and memory capacity of the virtual machine. Each virtual machine has an independent hard disk and operating system, and can operate as if a physical machine were used.
Peripheral Component Interconnect Express (PCIE): a high-speed serial computer expansion bus standard. It provides high-speed serial point-to-point dual-channel high-bandwidth transmission; each connected device is allocated exclusive channel bandwidth rather than sharing bus bandwidth. PCIE exists in two forms: the M.2 interface and the standard PCIE slot. PCIE is highly extensible and supports the insertion of many kinds of devices, for example graphics cards, wireless network cards and sound cards.
API interface: application programming interface. An API is a set of definitions, programs and protocols through which computer software communicates. One of the main functions of an API is to provide a general set of functions; a programmer can reduce the programming workload by calling API functions when developing an application. An API also serves as middleware providing data sharing between platforms.
Direct Memory Access (DMA): provides high-speed data transfer between a peripheral and memory, or between two memory regions. Data can be transferred via DMA without CPU control: hardware opens a channel for transferring data directly between RAM and the I/O device, saving CPU resources and improving CPU efficiency.
virsh: a command-line tool, written in the C language, for managing virtual machines; a system administrator can operate virtual machines through virsh commands.
Fig. 1 is a system architecture diagram to which the GPU computing power resource scheduling method according to an embodiment of the present disclosure is applied. It includes a target virtual machine 110, the internet 120, a server 130, PCIE140, and GPU devices 141.
The target virtual machine 110 is a virtual machine that needs to invoke GPU computing power resources for computing. It may be hosted on devices of various forms, such as desktop computers, laptop computers, PDAs (personal digital assistants), dedicated terminals and other computer devices. It may consist of a single device or of a set of multiple devices; for example, multiple devices connected through a LAN and sharing a display device for cooperative work may together constitute the target virtual machine 110. The target virtual machine 110 communicates with the internet 120 in a wired or wireless manner to exchange data.
The server 130 is a computer system that provides GPU computing power resource scheduling services to the target virtual machine 110. The server 130 may be a high-performance computer on a network platform, a cluster of multiple high-performance computers, or the like. The server 130 exchanges data with the internet 120 through wired or wireless communication.
PCIE140 is a slot on server 130 for inserting GPU device 141. Each GPU device 141 has a unique interface address on PCIE 140.
According to one embodiment of the present disclosure, a method for scheduling GPU power resources is provided.
GPU computing power resources are managed by the server 130 and support the computation of tasks in the target virtual machine 110. When the target virtual machine 110 sends a GPU computing power resource request, GPU computing power resources are allocated to it according to the request. In existing schemes, the computing power resources of multiple GPU devices 141 form a resource pool in the server 130, and the computing power resources in the pool are invoked according to GPU computing power resource requests. Meanwhile, all GPU devices 141 undergo video memory partitioning, during which computing power resources are lost. Moreover, when the target virtual machine 110 invokes GPU data, the allocated GPU computing power resources must be found through the server 130, which not only lowers data transmission efficiency but also causes further loss of computing power resources. The GPU computing power resource scheduling method of the disclosure can improve data transmission efficiency and reduce computing power resource loss.
The GPU computing power resource scheduling method provided by the embodiments of the present disclosure is applied to computing power resource scheduling of the server 130 on the plurality of GPU devices 141, as shown in fig. 2, and includes:
step 210, receiving a GPU computing power resource request from a target virtual machine;
step 220, selecting a target GPU device from a plurality of GPU devices according to the GPU computing power resource request;
step 230, generating a configuration file based on the target GPU device;
step 240, loading the configuration file with a device management driver, and establishing a passthrough connection between the target virtual machine and the target GPU device with a device transmission driver;
step 250, transparently transmitting the resource data in the target GPU device to the target virtual machine with the device transmission driver;
and step 260, after the target virtual machine finishes computing with the computing power resources in the target GPU device, modifying the configuration file, loading the configuration file with the device management driver, and releasing the passthrough connection between the target virtual machine and the target GPU device.
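As a rough illustration, the scheduling loop of steps 210-260 can be sketched in Python. Every class, field and device attribute below is hypothetical, and the real driver and passthrough operations are reduced to bookkeeping; this is a sketch of the control flow, not the disclosed implementation.

```python
class GpuScheduler:
    """Illustrative sketch of steps 210-260; all names are hypothetical."""

    def __init__(self, gpu_devices):
        self.gpu_devices = gpu_devices   # list of dicts describing GPU devices
        self.passthrough = {}            # vm_id -> device currently passed through

    def handle_request(self, request):
        # Steps 210/220: receive the request and pick a free device whose
        # computing power covers the requested amount (raises if none fits).
        device = next(d for d in self.gpu_devices
                      if d["free"] and d["tflops"] >= request["tflops"])
        # Step 230: generate a configuration file for the chosen device.
        config = {"bus_address": device["bus_address"], "vm": request["vm_id"]}
        # Step 240: loading the config with the device management driver and
        # establishing the passthrough connection is simulated as bookkeeping.
        device["free"] = False
        self.passthrough[request["vm_id"]] = device
        return config

    def release(self, vm_id):
        # Step 260: modify/reload the config and tear down the passthrough.
        device = self.passthrough.pop(vm_id)
        device["free"] = True
        return device
```

The point of the sketch is the lifecycle: a device leaves the free pool for exactly the duration of one virtual machine's computation and returns afterwards.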
In order to ensure that the GPU power resource scheduling method provided by the embodiments of the present disclosure may be implemented smoothly, in one embodiment, before receiving a GPU power resource request from the target virtual machine 110, the server 130 system needs to be configured, which specifically includes:
It is detected whether the server 130 can connect normally to the target virtual machine 110. For example, it is necessary to check whether the server 130 can normally create, start and connect to the virtual machine. It is also checked whether the server 130 supports virtualization of I/O devices, to ensure that the resources of the target GPU device can be allocated directly to the target virtual machine 110.
Kernel parameters are enabled to ensure that, when performing memory mapping, the target virtual machine 110 can quickly and accurately access large amounts of contiguous physical memory in the server 130, including the server 130's high-end memory addresses. If the kernel parameters are not enabled, the server 130 cannot allocate large physically contiguous memory regions to devices when needed. Moreover, most servers 130 can only use 32-bit addressing for direct memory access; to access data at a high-end memory address, a temporary data cache must be allocated in low-end memory, the high-end data copied into it, and every access to the high-end address routed through that cache. The temporary data cache occupies considerable memory and places an extra burden on the server 130's memory when copying data. After the kernel parameters are enabled, when the target virtual machine 110 accesses the target GPU device in a subsequent process, it can directly access the physical address of the target GPU device in the server 130, whether a low-end or a high-end memory address, according to the mapping table between virtual memory and physical memory. In general, enabling the kernel parameters improves the efficiency and accuracy of memory address access, saves memory space, and improves memory operation efficiency.
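The disclosure does not name the kernel parameters. On Linux hosts, hugepage and IOMMU passthrough settings are one plausible reading of "large contiguous physical memory" and direct device access; the helper below, whose name and defaults are purely illustrative, assembles such a boot-parameter string.

```python
def kernel_cmdline(hugepage_sz: str = "1G", hugepages: int = 16,
                   iommu: bool = True) -> str:
    """Build a kernel boot-parameter string of the kind the text alludes to.

    hugepage settings reserve large contiguous physical memory; the IOMMU
    flags enable direct assignment of PCIe devices to virtual machines.
    This is an assumed configuration, not one fixed by the patent.
    """
    params = [f"default_hugepagesz={hugepage_sz}",
              f"hugepagesz={hugepage_sz}",
              f"hugepages={hugepages}"]
    if iommu:
        params += ["intel_iommu=on", "iommu=pt"]
    return " ".join(params)
```

The resulting string would typically be appended to the host's boot-loader command line before GPU passthrough is attempted.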
The bus address of the GPU device 141 on the server 130 is obtained. The bus address is the address at which the server 130 can directly access the GPU device 141. When multiple GPU devices 141 share a large memory space on the server 130, obtaining the bus address of a GPU device 141 further includes: acquiring the address of the memory group in which the GPU device 141 is located, and then obtaining the bus address of the GPU. The port address of the GPU device 141 stored in its registers can be determined from the bus address of the GPU device 141.
The number of functional ports of the GPU device 141 is determined based on its port address. The number of functional ports indicates how many functions the GPU device 141 includes; for example, some GPU devices 141 can exchange data with other devices through USB ports. Each GPU device 141 has only one bus address but may contain multiple functional ports. The functional ports are not independent of each other; all of them cooperate for the GPU device 141 to operate normally. Therefore, in the subsequent address mapping process, all functional port addresses must be mapped and passed through, to ensure that the GPU device 141 can operate normally.
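Assuming the bus and functional port addresses follow standard PCIe domain:bus:device.function notation (the patent does not say so explicitly), the relationship between one bus address and its several functional ports can be illustrated as follows; both function names are hypothetical.

```python
import re

# Standard PCIe BDF notation: dddd:bb:dd.f (hex fields, function 0-7).
BDF = re.compile(r"^(?P<domain>[0-9a-f]{4}):(?P<bus>[0-9a-f]{2}):"
                 r"(?P<dev>[0-9a-f]{2})\.(?P<fn>[0-7])$")


def parse_bdf(addr: str) -> dict:
    """Split a PCIe address like '0000:3b:00.0' into its components."""
    m = BDF.match(addr.lower())
    if not m:
        raise ValueError(f"not a PCIe BDF address: {addr}")
    return m.groupdict()


def sibling_functions(addr: str, n_functions: int) -> list:
    """List all functional-port addresses of one device: the same
    domain:bus:device with function numbers 0..n_functions-1."""
    d = parse_bdf(addr)
    return [f"{d['domain']}:{d['bus']}:{d['dev']}.{fn}"
            for fn in range(n_functions)]
```

This mirrors the text's point that one bus address can fan out into several functional ports, all of which must be mapped for passthrough.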
Before the GPU power resource scheduling is performed, the configuration process is performed on the server 130 system, so that the server 130 can be ensured to operate efficiently and stably when the GPU power resource scheduling is performed subsequently.
In step 210, the GPU power resource request of the target virtual machine 110 includes the node address of the target virtual machine 110, the time of initiating the GPU power resource request, and the type of the transaction to be processed in the target virtual machine 110.
A monitoring agent node is deployed in the target virtual machine 110. It generates a GPU computing power resource request according to the transactions to be processed by the target virtual machine 110, and sends the request to the server 130 through the API interface on the server 130 for receiving requests. After the server 130 establishes the passthrough connection between the target virtual machine 110 and the target GPU device, the monitoring agent node may be used to monitor the usage of the target GPU device. After the target virtual machine 110 completes its computation with the target GPU device, the monitoring agent node generates a GPU computing power resource release request and sends it to the server 130 through the same API interface, and the server 130 performs the subsequent disconnection operation. The advantage of the monitoring agent node is that it manages the target GPU device on behalf of the target virtual machine 110, avoiding occupation of the target virtual machine 110's computing space and improving the efficiency with which the target virtual machine 110 processes its transactions using the target GPU device.
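A monitoring agent's request, carrying the three fields named above (the node address, the initiation time, and the type of the transaction to be processed), might be assembled as in the sketch below; the function and field names are illustrative, not part of the disclosure.

```python
import json
import time


def build_power_request(node_address: str, transaction_type: str) -> str:
    """Assemble a GPU computing power resource request with the fields the
    text describes; the JSON encoding and field names are assumptions."""
    return json.dumps({
        "node_address": node_address,        # where to establish passthrough
        "request_time": int(time.time()),    # time the request was initiated
        "transaction_type": transaction_type,  # e.g. "3d_render", "training"
    })
```

The server-side API endpoint would parse this payload before entering the device-selection step.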
In step 220, upon receiving the GPU computing power resource request from the target virtual machine 110 through the API interface for receiving requests, the server 130 selects one of the plurality of GPU devices 141 as the target GPU device.
The target GPU device is selected mainly based on the transaction to be processed in the GPU computing power resource request. A suitable GPU device 141 is chosen according to the type of the transaction, the amount of computation, and so on. For example, if the transaction to be processed is 3D rendering of a scene interface in a game, the target GPU device must support 3D rendering while ensuring that its computing power resources can handle the large amount of data in the scene interface.
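A rule-based version of this selection step could look like the sketch below. The device fields and the tightest-fit tie-break are assumptions for illustration; the disclosure itself also describes a learned model for this step.

```python
def select_target_gpu(request: dict, gpu_devices: list) -> dict:
    """Pick a GPU whose capabilities cover the transaction type and whose
    free computing power covers the requested amount (illustrative rule)."""
    candidates = [d for d in gpu_devices
                  if request["transaction_type"] in d["capabilities"]
                  and d["free_tflops"] >= request["tflops"]]
    if not candidates:
        raise RuntimeError("no GPU device satisfies the request")
    # Prefer the tightest fit so larger devices stay free for bigger jobs.
    return min(candidates, key=lambda d: d["free_tflops"])
```

For a 3D-rendering request, any device lacking that capability is filtered out before capacity is even considered, matching the two criteria named in the text.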
In one embodiment, selecting a target GPU device from a plurality of GPU devices according to a GPU power resource request, comprises: and inputting the GPU computing power resource request into a target GPU equipment determination model to obtain the target GPU equipment.
The target GPU device determination model may be a machine learning model for determining a target GPU device from the plurality of GPU devices based on characteristics of the GPU computing power resource request. The model is trained on GPU usage characteristics from historical scheduling, as described in detail in subsequent steps.
When the GPU computing power resource request is input into the target GPU device determination model, the model first extracts from the request a plurality of characteristics it can identify, for example the type of the transaction to be processed, the data size, and the required precision. Based on these characteristics, the model determines a target GPU device suited to the GPU computing power resource request of the target virtual machine 110.
Because the target GPU device determination model is trained on the usage of the GPU devices 141 in historical scheduling, determining the target GPU device with the model has the advantage that the selected device can meet the processing requirements of the transaction to be processed in the target virtual machine 110, improving the accuracy of the determination.
In step 230, after determining the target GPU device, server 130 needs to generate a configuration file for the target GPU device.
The purpose of generating the configuration file is to allow the target virtual machine 110 to use the target GPU device. The configuration file therefore needs to be generated from a plurality of attribute values of the target GPU device, so that the target virtual machine 110 and the target GPU device can connect normally.
The plurality of attribute values used to generate the configuration file include at least: the network domain address, the bus address, the interface position, and the number of functional ports of the target GPU device.
The network domain address is an identifier used to locate the target GPU device during data transmission. The bus address is the address through which the server 130 can directly access the GPU device 141. The interface position refers to the slot number of the target GPU device on the PCIE device 140; since the PCIE device 140 has a plurality of slots, each corresponding to a number, the interface position indicates in which slot the target GPU device is plugged.
The number of functional ports indicates how many functions the target GPU device contains. When generating the configuration file, the number of ports of the target GPU device must be determined and the address of each port added to the configuration file, so that the target virtual machine 110 can identify all functional ports on the target GPU device and the device can operate normally.
The four attribute values described above can be expressed in an xml file as:
domain="0x"${result:4:4}    # 4 characters
bus="0x"${result:9:2}       # 2 characters
slot="0x"${result:12:2}     # 2 characters
function="0x"${result: -1}  # 1 character
Here, domain is the network domain address of the target GPU device, with an attribute value length of 4 characters; bus is the bus address, with a length of 2 characters; slot is the interface position, with a length of 2 characters; function is the number of functional ports, with a length of 1 character. (Note the space in ${result: -1}: without it, bash interprets the expression as a default-value expansion rather than taking the last character.)
In the configuration process of the server 130 in the foregoing embodiment, the above attribute values of the GPU devices 141 connected to the server 130 have already been obtained, so the attribute values of the target GPU device can be invoked directly to generate the configuration file. The configuration file may be an xml file.
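The extraction of the four attribute values can be sketched as below. The device name format (pci_0000_3b_00_0, as produced by a command such as `virsh nodedev-list`), the variable `result`, and the output file name are assumptions for illustration, not values given by this disclosure.

```shell
#!/bin/bash
# Sketch: cut the four attribute values out of a PCI device name and
# write them into an xml configuration fragment for the target GPU.

result="pci_0000_3b_00_0"    # hypothetical target GPU device name

domain="0x${result:4:4}"     # 4 hex characters, e.g. 0x0000
bus="0x${result:9:2}"        # 2 hex characters, e.g. 0x3b
slot="0x${result:12:2}"      # 2 hex characters, e.g. 0x00
function="0x${result: -1}"   # 1 hex character; the space avoids ${var:-word}

cat > gpu-hostdev.xml <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='$domain' bus='$bus' slot='$slot' function='$function'/>
  </source>
</hostdev>
EOF
```

The resulting `<hostdev>` fragment is the shape libvirt expects for PCI passthrough, which matches the xml configuration file described above.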
In step 240, the configuration file is first loaded using the device management driver. The device management driver manages the target virtual machines 110 that establish connections with the server 130. For the target virtual machine 110, the device management driver may be Libvirt, a tool and API for managing virtual machines. This step is described below using Libvirt as an example.
Libvirt provides APIs for the connected target virtual machine 110 to enable its operation and management on the server 130. Libvirt may be invoked locally, managing virtual machines local to the server 130, or remotely, where the running program and domain of the virtual machine are not on the server 130. Libvirt supports a variety of common remote protocols; that is, while the target virtual machine 110 is in a pass-through connection with the GPU device 141, Libvirt can provide it with a variety of management services, such as command-line interface management, file transfer management, and data encryption management.
Libvirt also includes a daemon, libvirtd, which monitors the target virtual machine 110 in real time and prevents it from being interrupted by other information generated in the server 130. When the pass-through connection between the target virtual machine 110 and the GPU device 141 has a problem, the daemon can perceive the error in time and resolve it. Other upper-layer management tools, such as interface management tools and data management tools, may also connect to the target virtual machine 110 through the daemon, which executes their operation instructions on the target virtual machine 110.
The command-line tool virsh is used with libvirtd to manage the target virtual machine 110, for example starting, shutting down, restarting, and migrating the virtual machine; it can also collect the configuration and resource usage of the target virtual machine 110 and the server 130.
When the device management driver loads the configuration file, a virsh command may be used to establish the pass-through connection between the target virtual machine 110 and the target GPU device by applying the loaded configuration.
In one embodiment, the plurality of GPU devices 141 are directly bound to the server 130, and thus, as shown in fig. 3, loading the configuration file with the device management driver, establishing a pass-through connection of the target virtual machine with the target GPU device with the device transfer driver, includes:
Step 310, obtaining a node address of a target virtual machine;
step 320, unbinding the target GPU device from the server;
and 330, loading a configuration file by using a device management driver, and establishing a through connection between the target virtual machine and the target GPU device according to the node address by using a device transmission driver.
The node address of the target virtual machine 110 may be directly obtained through the GPU power resource request of the target virtual machine 110.
The total port address of the target GPU device is found through the obtained bus address, and the total port address is unbound from the server 130. After unbinding, the target GPU device can establish a pass-through relationship directly with the target virtual machine 110.
When the device management driver loads the configuration file, the domain address, the bus address, the interface position, and the number of functional ports of the target GPU device are loaded, and the position of the target GPU device is acquired based on the domain address, the bus address, and the interface position, so as to map the target GPU device to the target virtual machine 110. Based on the number of functional ports, the functional ports on the target GPU are traversed in turn, and the location of each functional port is obtained, so as to map all the functional ports to the target virtual machine 110.
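One common way to release a PCI device from its host driver is through sysfs, sketched below; whether this is the exact mechanism used in this disclosure is not specified. The bus address and vendor/device ID pair are hypothetical (real values come from a tool such as `lspci -n`), and the DRY_RUN guard only prints each action instead of writing to sysfs.

```shell
#!/bin/bash
# Sketch: release the binding between the target GPU device and the
# server's host driver, then hand the device to the vfio-pci driver.

DRY_RUN=${DRY_RUN:-1}

act() {  # act <value> <sysfs-file>: write value to file, or print in dry-run
    if [ "$DRY_RUN" = "1" ]; then
        echo "echo $1 > $2"
    else
        echo "$1" > "$2"
    fi
}

addr="0000:3b:00.0"   # hypothetical bus address of the target GPU device
ids="10de 1db6"       # hypothetical vendor and device ID pair

# 1. unbind the device from whatever host driver currently owns it
act "$addr" "/sys/bus/pci/devices/$addr/driver/unbind"
# 2. tell vfio-pci to claim devices with this vendor/device ID
act "$ids" "/sys/bus/pci/drivers/vfio-pci/new_id"
```

After these two steps the device no longer serves the server's own processes and can be mapped into the target virtual machine.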
The source code is implemented as follows:
In the source code, the configuration file is loaded through a result_real() method. Node refers to the node address of the target virtual machine, for example: i-0000002D. The address of the target virtual machine node is read, the xml configuration file is obtained, and the target virtual machine node is connected with the configuration file through a virsh attach command.
After the pass-through connection is established, a startup parameter is set in the configuration file to indicate that the target virtual machine is bound to the target GPU device. If the server or the virtual machine is subsequently restarted, the target virtual machine and the target GPU device are bound again according to the configuration file. This improves the stability of the pass-through connection between the target virtual machine and the target GPU device.
The advantage of this embodiment is that by releasing the binding relationship between the target GPU device and the server 130, the server 130 is prevented from occupying the computing power resource of the target GPU device, and the influence of the server 130 on the process of accessing the target GPU device by the target virtual machine 110 is avoided.
In one embodiment, as shown in fig. 4, loading a configuration file with a device management driver, establishing a pass-through connection of a target virtual machine with a target GPU device with a device transport driver, includes:
Step 410, obtaining a total port address of the target GPU device;
step 420, converting the total port address into a first virtual address;
step 430, obtaining a functional port address of the target GPU device;
step 440, converting the functional port address into a second virtual address;
step 450, a mapping relationship between the total port address and the first virtual address, and a mapping relationship between the functional port address and the second virtual address are established.
The total port address of the target GPU device is the physical address used to exchange data directly with the target GPU device; it is converted into a first virtual address by a memory management unit. A functional port address is the port address corresponding to one function in the target GPU device and may be used to exchange data with that function. Since a virtual machine cannot recognize physical addresses, directly passing a physical address to the virtual machine means the virtual machine cannot store it, so the virtual machine would not know which specific GPU device 141 it is communicating with; when direct memory access is performed, memory may be corrupted and the wrong GPU device 141 accessed. Therefore, the total port address and the functional port addresses are converted by the memory management unit into a first virtual address and second virtual addresses that the target virtual machine 110 can identify, and mapping relationships are established between the total port address and the first virtual address and between each functional port address and its second virtual address, so that the target virtual machine 110 can access the target GPU device correctly and efficiently according to these mappings.
When the through connection is established, the device transmission driver corresponding to the target virtual machine 110 is started. The device transfer driver is used to map the target GPU device with all of its functional ports onto the target virtual machine 110. The device transfer driver may be a VFIO driver. In one embodiment, as shown in fig. 5, establishing a pass-through connection of the target virtual machine 110 with the target GPU device using the device transfer driver includes:
step 510, creating a virtual GPU device in the target virtual machine using the device transport driver based on the first virtual address and the second virtual address;
step 520, establishing a through connection between the virtual GPU device and the target GPU device.
Based on the first virtual address and the second virtual address, the device transfer driver may create a virtual GPU device 141 on the target virtual machine 110. The address of the virtual GPU device on the target virtual machine 110 is effectively the first virtual address, and the virtual GPU device also includes the same functional ports as the target GPU device, whose addresses on the target virtual machine 110 are effectively the second virtual addresses. Establishing a pass-through connection between the target virtual machine 110 and the target GPU device is in effect establishing a pass-through connection between the virtual GPU device 141 and the target GPU device.
The process of establishing the through connection of the virtual GPU device and the target GPU device is realized by the following source codes:
In the source code, the bus address of the target GPU device is first acquired; the number of functional ports of the target GPU device is obtained by accessing the bus address; and for each functional port, a corresponding xml configuration is generated, with domain (network domain address), bus (bus address), slot (interface position), and function (functional port number) configured. The generated xml configuration for each functional port is added to the xml file of the memory section where the target GPU device is located (iommu_group_number.xml). After the functional ports are configured, they are mapped into the target virtual machine using the device transfer driver (vfio), so that each attribute value address of the generated virtual GPU device is the same as that of the corresponding functional port.
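The per-functional-port traversal described above can be sketched as follows. The address values, the function count, and the IOMMU-group file name are hypothetical stand-ins for the values the source code obtains at runtime.

```shell
#!/bin/bash
# Sketch: traverse the functional ports of the target GPU device and
# generate one xml <address> entry per function, appended to the xml
# file of the device's IOMMU group.

domain="0x0000"; bus="0x3b"; slot="0x00"   # hypothetical address values
nfunc=2                                    # hypothetical functional port count

outfile="iommu_group_7.xml"                # hypothetical IOMMU-group file
: > "$outfile"                             # start with an empty file
f=0
while [ "$f" -lt "$nfunc" ]; do
    printf "<address domain='%s' bus='%s' slot='%s' function='0x%x'/>\n" \
        "$domain" "$bus" "$slot" "$f" >> "$outfile"
    f=$((f + 1))
done
```

Each generated entry shares the device's domain/bus/slot and differs only in the function field, which is what lets the device transfer driver map every function onto the virtual GPU device.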
When the target virtual machine 110 wants to invoke the computing power resources of the connected target GPU device to perform computation, it can directly access the first and second virtual addresses of the virtual GPU device 141 in its own space; according to the mapping relationships between the first virtual address and the total port address and between the second virtual addresses and the functional port addresses, it can access the target GPU device directly, without sending an access request to the server 130.
In this embodiment, through the through connection between the virtual GPU device 141 and the target GPU device, the target virtual machine 110 can directly access the real hardware device of the target GPU, and no access request needs to be sent to the server 130, so that the resource loss generated in the process of accessing the target GPU by the target virtual machine 110 is reduced, and the access efficiency is improved. Meanwhile, the virtual GPU device 141 on each target virtual machine 110 is directly connected with only one target GPU device, the physical address of the target GPU device can be directly accessed through the address of the virtual GPU device 141, the target virtual machine 110 cannot operate the GPU device 141 connected with other virtual machines, the virtual machine is prevented from accessing the wrong GPU device 141, and the independence of data transmission in the process of dispatching the GPU computing power resources is improved.
From the perspective of the target virtual machine 110, the process from initiating a GPU power resource request to establishing a through connection with the target GPU device is illustrated in fig. 6, the target virtual machine 110 initiates the GPU power resource request to the device management driver on the server 130; the device management driver obtains the determined configuration file of the target GPU device and loads the configuration file of the target GPU device; after loading is completed, starting a device transmission drive corresponding to the target virtual machine 110; the device transfer driver maps the target GPU device on the target virtual machine 110, establishing a pass-through connection of the target GPU device with the target virtual machine 110.
In one embodiment, as shown in fig. 7, after loading the configuration file by using the device management driver and establishing the through connection between the target virtual machine and the target GPU device by using the device transmission driver, the GPU power resource scheduling method further includes:
step 710, monitoring the computing power resource use condition of the target GPU equipment in real time;
step 720, visually presenting the use condition of the computing power resource;
step 730, generating a GPU device computing power resource usage feature based on the computing power resource usage.
In this embodiment, after establishing the through connection between the target virtual machine 110 and the target GPU device, the computing power resource usage of the target GPU device is monitored in real time.
The computing power resource usage is visually displayed to background operators so that they can monitor the target GPU device at any time. The visual presentation may be provided through a web server, which may be the same server that performs GPU computing power resource scheduling, or a separate web server used for the visualization processing.
And carrying out data statistics on the computing power resource use condition by utilizing a database processing tool, and generating the computing power resource use characteristics of the GPU equipment 141. The database processing tool may be a MySQL database. The GPU device 141 computational resource usage characteristics may include dimensions of computational power consumption, processing speed, processing accuracy, etc. of the target GPU device when handling the task to be processed on the target virtual machine 110.
As shown in fig. 8, the system architecture of the present embodiment is configured to monitor the use condition of computing power resources in real time by using a server 130; the database processing tool 150 is used for performing data processing on the computing power resource use condition to generate computing power resource use characteristics; the visualization server 160 is used to visualize the computing resource usage.
The method has the advantages that background operators can acquire the service condition of the computing power resources of the target GPU equipment in real time, more experience data are provided for the subsequent GPU computing power resource scheduling, and further the scheduling accuracy of the GPU computing power resource scheduling method is gradually improved.
Based on the GPU device 141 computing power resource usage characteristics generated in this embodiment, in one embodiment, after generating the usage characteristics, the GPU computing power resource scheduling method further includes: training the target GPU device determination model using the computing power resource usage characteristics. In the embodiment of step 220, the target GPU device determination model is used to determine the target GPU device from the GPU computing power resource request. Since the model is trained on the usage characteristic data, it is grounded in the experience of actual GPU computing power resource scheduling and better fits the scheduling scenario in the server 130, improving the accuracy of determining the target GPU device.
In step 250, after the pass-through connection between the target virtual machine 110 and the target GPU device is established, the target GPU device transparently transmits resource data to the target virtual machine 110 via direct memory access. Transparent transmission means that during transmission the data is delivered to the target virtual machine 110 as-is, without any modification, which ensures the integrity and accuracy of the data.
In one embodiment, as shown in fig. 9, using the device transfer driver to transparently pass resource data in the target GPU device to the target virtual machine, includes:
step 910, encapsulating the resource data in the target GPU device into a resource packet, and encrypting the resource packet;
and step 920, transmitting the encrypted data to the target virtual machine.
In this embodiment, in order to further ensure the security of data transmission, the resource data is encrypted for transmission; in order to ensure that the data is not modified in the encryption process, the resource data is packaged into a resource packet before encryption, and then the resource packet is encrypted. There are various encryption methods, for example, encryption using a hash algorithm, symmetric encryption, or the like.
After the target virtual machine 110 receives the encrypted resource packet, the resource packet is decrypted first based on the encryption algorithm, and then the resource data from the target GPU is obtained.
The embodiment has the advantage that the safety in the data transparent transmission process is improved.
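The pack-encrypt-decrypt round trip can be sketched with a symmetric cipher. The disclosure leaves the encryption method open, so openssl, the AES mode, the key, and the file names below are illustrative assumptions only.

```shell
#!/bin/bash
# Sketch: pack the resource data into a "resource packet" file, encrypt
# it for transmission, and decrypt it on the target virtual machine side.

key="hypothetical-shared-key"

printf 'GPU resource data' > packet.bin            # pack into a resource packet
openssl enc -aes-256-cbc -pbkdf2 -k "$key" \
    -in packet.bin -out packet.enc                 # encrypt before sending
openssl enc -d -aes-256-cbc -pbkdf2 -k "$key" \
    -in packet.enc -out packet.dec                 # VM side: decrypt on receipt

cmp -s packet.bin packet.dec && echo "round trip ok"
```

Packaging before encryption, as the embodiment requires, means the cipher operates on one sealed unit, so the decrypted packet is byte-for-byte identical to the original data.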
In step 260, after the target virtual machine 110 completes the transaction to be processed using the target GPU device, it sends an unbinding request to the server 130. The server 130 modifies the instruction in the configuration file that indicates binding between the target GPU device and the target virtual machine 110 into an instruction indicating unbinding. After the configuration file is modified, the device management driver runs it again, releasing the pass-through connection between the target virtual machine 110 and the target GPU device. After unbinding from the target virtual machine 110, the target GPU device needs to be bound to the server 130 again so that it can serve subsequent GPU computing power resource requests.
However, it may happen that after the target virtual machine 110 completes the pending transaction with the target GPU device, no unbinding request is sent to the server 130 immediately, resulting in idle computational resources of the target GPU device. Thus, in one embodiment, as shown in fig. 10, after the computing power resource in the target GPU device is used by the target virtual machine 110, modifying the configuration file, and loading the configuration file with the device management driver, the through connection between the target virtual machine 110 and the target GPU device is released, including:
Step 1010, detecting the computing power resource use condition of the target GPU equipment according to a preset period;
step 1020, if the target GPU device stops computing, detecting again after a predetermined period of time, and if the target GPU device still stops computing, modifying the configuration file;
and 1030, loading the configuration file by using the device management driver, and releasing the through connection between the target GPU device and the target virtual machine.
In this embodiment, the computing power resource usage of the target GPU device is detected periodically, for example every hour. If the target GPU device has stopped computing, its use may have been interrupted or may have ended. A predetermined period then starts, for example 20 minutes, after which the usage is detected again. If the target GPU device is found to have resumed computation, it is still in use; if it is still not computing, the computing task has with high probability been completed, and the configuration file is modified.
The device management driver loads the modified configuration file and releases the pass-through connection between the target virtual machine 110 and the target GPU device.
This avoids the waste of GPU computing power resources caused by the target virtual machine 110 not actively sending an unbinding request, while the two rounds of detection ensure that the target GPU device has indeed completed its computing task, avoiding any impact on transaction processing in the target virtual machine 110.
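The two-round idle check can be reduced to a small decision function. The utilization samples are passed in directly here; a real monitor would read them from the GPU driver, with the predetermined delay between the two reads, so the sampling mechanism is an assumption left open by this disclosure.

```shell
#!/bin/bash
# Sketch: decide whether to unbind the target GPU device based on two
# utilization samples taken one predetermined period apart. The device
# is released only if it was idle at both checks.

# should_release <util-at-check-1> <util-at-check-2>  (percent, integers)
should_release() {
    [ "$1" -eq 0 ] && [ "$2" -eq 0 ]
}

if should_release 0 0; then
    echo "modify config file and unbind"     # idle twice: task finished
fi
if should_release 0 35; then
    echo "unbind"
else
    echo "still in use"                       # computation resumed: keep binding
fi
```

Requiring both samples to be zero is exactly what protects a virtual machine that merely paused between computation phases.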
As shown in fig. 11, the overall architecture of an embodiment of the present disclosure is illustratively shown. After the server 130 determines the target GPU device according to the GPU computing power resource request of the target virtual machine 110B, the device management driver loads the configuration file, the device transfer driver generates the virtual GPU device 141 and establishes a pass-through connection between the virtual GPU device 141 and the target GPU device, and the target GPU device is unbound from the server 130. When the target virtual machine 110B accesses the virtual GPU device 141, it is accessing the target GPU device. The disclosed embodiments also apply a database processing tool to count the computing power resource usage of the target GPU device in real time, and visually present this data through the visualization server 160.
The apparatus and devices of embodiments of the present disclosure are described below.
It will be appreciated that, although the steps in the flowcharts above are shown in succession in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated in this embodiment, the order of the steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily executed sequentially, and may be performed in turn or alternately with at least part of the sub-steps or stages of other steps.
In the embodiments of the present application, when related processing is performed on data related to the characteristics of the target virtual machine according to attribute information or attribute information set of the target virtual machine, permission or agreement of the target virtual machine is obtained first, and the collection, use, processing, etc. of the data complies with relevant laws and regulations and standards of relevant countries and regions. In addition, when the embodiment of the application needs to acquire the attribute information of the target virtual machine, the independent permission or independent consent of the target virtual machine is acquired through a popup window or a jump to a confirmation page and the like, and after the independent permission or independent consent of the target virtual machine is definitely acquired, the necessary related data of the target virtual machine for enabling the embodiment of the application to normally operate is acquired.
Fig. 12 is a block diagram of a GPU power resource scheduling apparatus 1200 according to an embodiment of the present disclosure. The GPU power resource scheduling apparatus 1200 includes:
a receiving unit 1210, configured to receive a GPU power resource request from a target virtual machine;
an allocation unit 1220 configured to select a target GPU device from the plurality of GPU devices according to the GPU power resource request;
A generating unit 1230, configured to generate a configuration file based on the target GPU device;
a through establishing unit 1240, configured to load the configuration file by using the device management driver, and establish a through connection between the target virtual machine and the target GPU device by using the device transmission driver;
a pass-through unit 1250, configured to pass through the resource data in the target GPU device to the target virtual machine by using the device transmission driver;
and the through unbinding unit 1260 is configured to modify the configuration file after the target virtual machine finishes using the computing power resources in the target GPU device, and load the configuration file with the device management driver, so as to release the through connection between the target virtual machine and the target GPU device.
Optionally, a plurality of GPU devices are bound to a server;
the pass-through establishment unit 1240 is also configured to:
acquiring a node address of a target virtual machine;
the binding between the target GPU equipment and the server is released;
and loading the configuration file by using the device management driver, and establishing the through connection of the target virtual machine and the target GPU device according to the node address by using the device transmission driver.
Optionally, the through establishment unit 1240 is further configured to:
acquiring a total port address of the target GPU equipment;
converting the total port address into a first virtual address;
Acquiring a functional port address of the target GPU equipment;
converting the functional port address into a second virtual address;
and establishing a mapping relation between the total port address and the first virtual address, and between the functional port address and the second virtual address.
Optionally, the through establishment unit 1240 is further configured to:
creating a virtual GPU device in the target virtual machine by using the device transmission driver based on the first virtual address and the second virtual address;
and establishing a through connection between the virtual GPU equipment and the target GPU equipment.
Optionally, the pass-through unit 1250 includes:
encapsulating the resource data in the target GPU equipment into a resource packet, and encrypting the resource packet;
and transmitting the encrypted resource package to the target virtual machine.
Optionally, the through unbinding unit 1260 is further configured to:
detecting the computing power resource use condition of the target GPU equipment according to a preset period;
if the target GPU equipment stops calculating, detecting again after a preset time period, and if the target GPU equipment still stops calculating, modifying the configuration file;
and loading the configuration file by using the device management driver, and removing the through connection between the target GPU device and the target virtual machine.
Optionally, the GPU power resource scheduling apparatus 1200 further includes:
A monitoring unit (not shown) for monitoring the computing power resource usage of the target GPU device in real time;
a visualization unit (not shown) for visualizing the computing power resource usage;
a feature generation unit (not shown) for generating GPU device power resource usage features based on the power resource usage situation.
Optionally, the GPU power resource scheduling apparatus 1200 further includes:
a training unit (not shown) for training a target GPU device determination model by using the computing power resource usage characteristics;
the allocation unit 1220 is further configured to: input the GPU computing power resource request into the target GPU device determination model to obtain the target GPU device.
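The allocation step backed by the determination model can be sketched with a simple stand-in scoring rule: filter candidates that satisfy the requested memory, then pick the least-loaded one. The request and device fields are hypothetical; a trained model would replace the `min(...)` rule.

```python
# Sketch of model-driven allocation: the GPU computing power resource
# request is matched against candidate GPU devices, and the target GPU
# device is the least-utilized one that satisfies the request. The
# scoring rule stands in for the trained determination model.

def determine_target_gpu(request: dict, gpus: list) -> dict:
    """request: {'mem_gb': ...}; gpus: [{'id', 'free_mem_gb', 'util'}, ...]"""
    candidates = [g for g in gpus if g["free_mem_gb"] >= request["mem_gb"]]
    if not candidates:
        raise RuntimeError("no GPU device satisfies the request")
    return min(candidates, key=lambda g: g["util"])

gpus = [
    {"id": "gpu-0", "free_mem_gb": 8, "util": 70},
    {"id": "gpu-1", "free_mem_gb": 24, "util": 10},
    {"id": "gpu-2", "free_mem_gb": 16, "util": 5},
]
target = determine_target_gpu({"mem_gb": 20}, gpus)
```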
Referring to fig. 13, fig. 13 is a block diagram of a portion of a terminal implementing an embodiment of the present disclosure. The terminal includes: radio frequency (RF) circuitry 1310, a memory 1315, an input unit 1330, a display unit 1340, a sensor 1350, audio circuitry 1360, a wireless fidelity (WiFi) module 1370, a processor 1380, and a power supply 1390. Those skilled in the art will appreciate that the terminal structure shown in fig. 13 does not limit the terminal to a cell phone or a computer; the terminal may include more or fewer components than shown, combine certain components, or arrange the components differently.
The RF circuit 1310 may be used for receiving and transmitting signals during messaging or a call. In particular, after receiving downlink information from a base station, the RF circuit 1310 delivers it to the processor 1380 for processing; in addition, uplink data is sent to the base station.
The memory 1315 may be used to store software programs and modules, and the processor 1380 performs various functional applications and data processing of the terminal by executing the software programs and modules stored in the memory 1315.
The input unit 1330 may be used to receive input numerical or character information and to generate key signal inputs related to the setting and function control of the terminal. Specifically, the input unit 1330 may include a touch panel 1331 and other input devices 1332.
The display unit 1340 may be used to display information input by the user or information provided to the user, as well as the various menus of the terminal. The display unit 1340 may include a display panel 1341.
The audio circuitry 1360, a speaker 1361, and a microphone 1362 may provide an audio interface.
In this embodiment, the processor 1380 included in the terminal may perform the GPU power resource scheduling method of the previous embodiment.
Terminals of embodiments of the present disclosure include, but are not limited to, cell phones, computers, intelligent voice interaction devices, intelligent home appliances, vehicle-mounted terminals, aircraft, and the like. Embodiments of the present disclosure may be applied to a variety of scenarios, including, but not limited to, cloud computing, artificial intelligence, and the like.
Fig. 14 is a block diagram of a portion of a server 130 implementing an embodiment of the present disclosure. The server 130 may vary considerably in configuration or performance and may include one or more central processing units (CPU) 1422 (e.g., one or more processors), a memory 1432, and one or more storage media 1430 (e.g., one or more mass storage devices) storing applications 1442 or data 1444. The memory 1432 and the storage medium 1430 may be transitory or persistent storage. The program stored in the storage medium 1430 may include one or more modules (not shown), each of which may include a series of instruction operations on the server 130. Further, the central processor 1422 may be configured to communicate with the storage medium 1430 and execute, on the server 130, the series of instruction operations in the storage medium 1430.
The server 130 may also include one or more power supplies 1426, one or more wired or wireless network interfaces 1450, one or more input/output interfaces 1458, and/or one or more operating systems 1441, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
The central processor 1422 in the server 130 may be used to perform the GPU power resource scheduling methods of embodiments of the present disclosure.
Embodiments of the present disclosure also provide a computer readable storage medium storing program code for executing the GPU power resource scheduling method of the foregoing embodiments.
The disclosed embodiments also provide a computer program product comprising a computer program. The processor of a computer device reads and executes the computer program, causing the computer device to perform the GPU computing power resource scheduling method described above.
The terms "first," "second," "third," "fourth," and the like in the description of the present disclosure and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein, for example. Furthermore, the terms "comprises," "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in this disclosure, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may represent: only A is present, only B is present, or both A and B are present, where A and B may be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship. "At least one of" and similar expressions refer to any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or plural.
It should be understood that in the description of the embodiments of the present disclosure, "a plurality" (or "multiple") means two or more; terms such as "greater than", "less than", and "exceeding" are understood to exclude the stated number, while terms such as "above", "below", and "within" are understood to include the stated number.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present disclosure may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the various embodiments of the present disclosure. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
It should also be appreciated that the various implementations provided by the embodiments of the present disclosure may be arbitrarily combined to achieve different technical effects.
The above is a specific description of the embodiments of the present disclosure, but the present disclosure is not limited to the above embodiments, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present disclosure, and are included in the scope of the present disclosure as defined in the claims.

Claims (12)

1. A GPU computing power resource scheduling method, characterized in that the GPU computing power resource scheduling method is applied to scheduling, by a server, computing power resources of a plurality of GPU devices; the GPU computing power resource scheduling method comprises the following steps:
receiving a GPU computing power resource request from a target virtual machine;
selecting a target GPU device from a plurality of GPU devices according to the GPU computing power resource request;
generating a configuration file based on the target GPU device;
loading the configuration file by using a device management driver, and establishing a pass-through connection between the target virtual machine and the target GPU device by using a device transmission driver;
transparently transmitting, by using the device transmission driver, the resource data in the target GPU device to the target virtual machine;
and after the target virtual machine finishes using the computing power resources in the target GPU device, modifying the configuration file, loading the configuration file by using the device management driver, and releasing the pass-through connection between the target virtual machine and the target GPU device.
2. The GPU computing power resource scheduling method of claim 1, wherein the plurality of GPU devices are bound to the server;
the loading the configuration file by using a device management driver, and establishing a pass-through connection between the target virtual machine and the target GPU device by using a device transmission driver, comprises:
acquiring a node address of the target virtual machine;
unbinding the target GPU device from the server;
and loading the configuration file by using the device management driver, and establishing the pass-through connection between the target virtual machine and the target GPU device by using the device transmission driver according to the node address.
3. The GPU computing power resource scheduling method of claim 1, wherein the loading the configuration file with a device management driver and establishing a pass-through connection between the target virtual machine and the target GPU device with the device transmission driver comprises:
acquiring a total port address of the target GPU device;
converting the total port address into a first virtual address;
acquiring a functional port address of the target GPU device;
converting the functional port address into a second virtual address;
and establishing a mapping relation between the total port address and the first virtual address, and between the functional port address and the second virtual address.
4. The GPU computing power resource scheduling method according to claim 3, wherein the establishing a pass-through connection between the target virtual machine and the target GPU device by using the device transmission driver comprises:
creating a virtual GPU device in the target virtual machine by using the device transmission driver based on the first virtual address and the second virtual address;
and establishing a pass-through connection between the virtual GPU device and the target GPU device.
5. The GPU computing power resource scheduling method of claim 1, wherein the transparently transmitting, by using the device transmission driver, the resource data in the target GPU device to the target virtual machine comprises:
encapsulating the resource data in the target GPU device into a resource packet, and encrypting the resource packet;
and transmitting the encrypted resource packet to the target virtual machine.
6. The GPU computing power resource scheduling method of claim 1, wherein the modifying the configuration file after the target virtual machine finishes using the computing power resources in the target GPU device, loading the configuration file with the device management driver, and releasing the pass-through connection between the target virtual machine and the target GPU device, comprises:
detecting the computing power resource usage of the target GPU device according to a preset period;
if the target GPU device has stopped computing, detecting again after a preset time period, and if the target GPU device is still not computing, modifying the configuration file;
and loading the configuration file by using the device management driver, and releasing the pass-through connection between the target GPU device and the target virtual machine.
7. The GPU computing power resource scheduling method of claim 1, wherein after the loading the configuration file with the device management driver and establishing a pass-through connection between the target virtual machine and the target GPU device with the device transmission driver, the GPU computing power resource scheduling method further comprises:
monitoring the computing power resource usage of the target GPU device in real time;
visually presenting the computing power resource usage;
and generating computing power resource usage characteristics of the GPU device based on the computing power resource usage.
8. The GPU computing power resource scheduling method of claim 7, wherein after the generating computing power resource usage characteristics of the GPU device based on the computing power resource usage, the GPU computing power resource scheduling method further comprises: training a target GPU device determination model by utilizing the computing power resource usage characteristics;
the selecting a target GPU device from a plurality of GPU devices according to the GPU computing power resource request comprises: inputting the GPU computing power resource request into the target GPU device determination model to obtain the target GPU device.
9. A GPU computing power resource scheduling apparatus, characterized by comprising:
a receiving unit, configured to receive a GPU computing power resource request from a target virtual machine;
an allocation unit, configured to select a target GPU device from a plurality of GPU devices according to the GPU computing power resource request;
a generating unit, configured to generate a configuration file based on the target GPU device;
a pass-through establishment unit, configured to load the configuration file by using a device management driver, and establish a pass-through connection between the target virtual machine and the target GPU device by using a device transmission driver;
a transparent transmission unit, configured to transparently transmit, by using the device transmission driver, the resource data in the target GPU device to the target virtual machine;
and a pass-through unbinding unit, configured to modify the configuration file after the target virtual machine finishes using the computing power resources in the target GPU device, load the configuration file by using the device management driver, and release the pass-through connection between the target virtual machine and the target GPU device.
10. An electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the GPU computing power resource scheduling method according to any one of claims 1 to 8 when executing the computer program.
11. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the GPU computing power resource scheduling method according to any one of claims 1 to 8.
12. A computer program product comprising a computer program, characterized in that the computer program is read and executed by a processor of a computer device, causing the computer device to perform the GPU computing power resource scheduling method according to any one of claims 1 to 8.
CN202310801620.4A 2023-06-30 2023-06-30 GPU computing power resource scheduling method, device, equipment and medium Pending CN116860391A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310801620.4A CN116860391A (en) 2023-06-30 2023-06-30 GPU computing power resource scheduling method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310801620.4A CN116860391A (en) 2023-06-30 2023-06-30 GPU computing power resource scheduling method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN116860391A true CN116860391A (en) 2023-10-10

Family

ID=88226109

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310801620.4A Pending CN116860391A (en) 2023-06-30 2023-06-30 GPU computing power resource scheduling method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN116860391A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117573296A (en) * 2024-01-17 2024-02-20 腾讯科技(深圳)有限公司 Virtual machine equipment straight-through control method, device, equipment and storage medium
CN117573296B (en) * 2024-01-17 2024-05-28 腾讯科技(深圳)有限公司 Virtual machine equipment straight-through control method, device, equipment and storage medium
CN117873735A (en) * 2024-03-11 2024-04-12 湖南马栏山视频先进技术研究院有限公司 GPU scheduling system under virtualized environment
CN117873735B (en) * 2024-03-11 2024-05-28 湖南马栏山视频先进技术研究院有限公司 GPU scheduling system under virtualized environment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination