CN113674131A - Hardware accelerator equipment management method and device, electronic equipment and storage medium - Google Patents

Hardware accelerator equipment management method and device, electronic equipment and storage medium

Publication number
CN113674131A
Authority
CN
China
Prior art keywords
information
hardware accelerator
resource
accelerator device
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110825190.0A
Other languages
Chinese (zh)
Inventor
张百林
亓开元
苏志远
宋文平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Mass Institute Of Information Technology
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Shandong Mass Institute Of Information Technology
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Mass Institute Of Information Technology, Zhengzhou Yunhai Information Technology Co Ltd filed Critical Shandong Mass Institute Of Information Technology
Priority to CN202110825190.0A priority Critical patent/CN113674131A/en
Publication of CN113674131A publication Critical patent/CN113674131A/en
Priority to PCT/CN2022/078281 priority patent/WO2023000673A1/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities

Abstract

The application discloses a hardware accelerator equipment management method, a hardware accelerator equipment management device, an electronic equipment and a computer readable storage medium, wherein the method comprises the following steps: establishing a resource pool in a cloud platform, and allocating hardware accelerator equipment in the cloud platform to a corresponding resource pool; acquiring basic information, state information and resource pooling information of hardware accelerator equipment in the cloud platform; wherein the state information comprises a usable state, a using state and a maintenance state, and the resource pooling information is used for representing a resource pool to which the hardware accelerator device belongs; storing the basic information, the state information and the resource pooling information of the hardware accelerator equipment into a database and reporting to a resource manager; displaying, by the resource manager, basic information, state information, and resource pooling information of the hardware accelerator device. The hardware accelerator equipment management method improves the flexibility, maintainability and operability of the accelerator equipment.

Description

Hardware accelerator equipment management method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for managing a hardware accelerator device, an electronic device, and a computer-readable storage medium.
Background
In an era in which cloud computing, artificial intelligence, and 5G technologies prevail, accelerator devices such as GPUs (Graphics Processing Units), FPGAs (Field Programmable Gate Arrays), and SmartNICs (smart network interface cards) have emerged. Current cloud platforms manage such accelerator devices only at a basic level: the core technique is to bind a physical device of the host directly to a cloud host through the PCI passthrough technology of the virtualization platform. With the rapid development of hardware devices and the continuous improvement of hardware-assisted virtualization, one physical accelerator device can now derive a plurality of virtual devices. For example, a GPU card that supports virtualization can generally be divided into time slices of different specifications as required and provided simultaneously to a plurality of cloud hosts on the cloud platform, which improves the utilization of the hardware accelerator device and greatly improves the computing capability of the cloud platform.
Cyborg is an accelerator device management project that is active in the OpenStack international open source community. Its currently implemented functions mainly include discovery, resource reporting, and display of acceleration device resources such as GPUs, FPGAs, and SSDs (Solid State Disks), as well as interaction between the Nova project and the Cyborg project. In the related art, accelerator devices on the cloud platform cannot be reserved or protected. In addition, when a cloud host is created through Nova, accelerator device resources must be specified explicitly, and accelerator devices cannot be scheduled in a customized manner according to requirements.
Therefore, how to implement reservation of the accelerator device and improve flexibility of the accelerator device are technical problems to be solved by those skilled in the art.
Disclosure of Invention
The present application aims to provide a hardware accelerator device management method, an apparatus, an electronic device, and a computer-readable storage medium, which implement reservation of an accelerator device and improve flexibility of the accelerator device.
In order to achieve the above object, the present application provides a hardware accelerator device management method, including:
establishing a resource pool in a cloud platform, and allocating hardware accelerator equipment in the cloud platform to a corresponding resource pool; wherein the hardware accelerator device comprises a physical accelerator device and/or a virtualized accelerator device;
acquiring basic information, state information and resource pooling information of hardware accelerator equipment in the cloud platform; wherein the state information comprises a usable state, a using state and a maintenance state, and the resource pooling information is used for representing a resource pool to which the hardware accelerator device belongs;
storing the basic information, the state information and the resource pooling information of the hardware accelerator equipment into a database and reporting to a resource manager;
displaying, by the resource manager, basic information, state information, and resource pooling information of the hardware accelerator device.
The resource pool has a device virtualization attribute, the device virtualization attribute is used for indicating whether hardware accelerator devices in the resource pool support virtualization, and if the device virtualization attribute is opened, the resource pool comprises physical accelerator devices and corresponding virtualization accelerator devices.
Wherein allocating the hardware accelerator device to a corresponding resource pool comprises:
hardware accelerators of the same type are allocated to the same resource pool.
Wherein allocating the hardware accelerator device to a corresponding resource pool comprises:
and allocating the hardware accelerators of the same physical host configuration to the same resource pool.
Wherein the method further comprises:
receiving a creation request of a cloud host; wherein the creation request comprises a requested target accelerator device type, the target accelerator device type comprising a target physical accelerator device type and/or a target virtualized accelerator device type;
determining a target physical host conforming to the type of the target accelerator, and acquiring performance parameters of the target physical host through information displayed by the resource manager;
determining an optimal target physical host by using a preset scheduling algorithm based on the performance parameters of the target physical host;
and starting the cloud host on the optimal target physical host to complete the creation operation of the cloud host.
Wherein the determining a target physical host that conforms to the target accelerator type comprises:
and determining a target resource pool which accords with the type of the target accelerator, and determining a target physical host to which a hardware accelerator contained in the target resource pool belongs.
Wherein storing the basic information, the state information and the resource pooling information of the hardware accelerator device into a database and reporting them to a resource manager comprises:
storing the basic information, the state information and the resource pooling information of the hardware accelerator device into a database;
reporting the basic information, the state information and the resource pooling information of the hardware accelerator equipment with the state information being in a usable state and a using state to a resource manager;
correspondingly, the displaying, by the resource manager, the basic information, the state information, and the resource pooling information of the hardware accelerator device includes:
displaying, by the resource manager, basic information, state information, and resource pooling information of the hardware accelerator device for which the state information is a usable state and a using state.
To achieve the above object, the present application provides a hardware accelerator device management apparatus, including:
an allocation module, used for creating a resource pool in a cloud platform and allocating hardware accelerator equipment in the cloud platform to the corresponding resource pool; wherein the hardware accelerator device comprises a physical accelerator device and/or a virtualized accelerator device;
the acquisition module is used for acquiring basic information, state information and resource pooling information of hardware accelerator equipment in the cloud platform; wherein the state information comprises a usable state, a using state and a maintenance state, and the resource pooling information is used for representing a resource pool to which the hardware accelerator device belongs;
the reporting module is used for storing the basic information, the state information and the resource pooling information of the hardware accelerator equipment into a database and reporting the basic information, the state information and the resource pooling information to a resource manager;
and the display module is used for displaying the basic information, the state information and the resource pooling information of the hardware accelerator equipment through the resource manager.
To achieve the above object, the present application provides an electronic device including:
a memory for storing a computer program;
a processor for implementing the steps of the hardware accelerator device management method as described above when executing the computer program.
To achieve the above object, the present application provides a computer-readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the hardware accelerator device management method as described above.
According to the above scheme, the hardware accelerator device management method provided by the application includes: establishing a resource pool in a cloud platform, and allocating hardware accelerator equipment in the cloud platform to a corresponding resource pool; wherein the hardware accelerator device comprises a physical accelerator device and/or a virtualized accelerator device; acquiring basic information, state information and resource pooling information of hardware accelerator equipment in the cloud platform; wherein the state information comprises a usable state, a using state and a maintenance state, and the resource pooling information is used for representing a resource pool to which the hardware accelerator device belongs; storing the basic information, the state information and the resource pooling information of the hardware accelerator equipment into a database and reporting to a resource manager; displaying, by the resource manager, basic information, state information, and resource pooling information of the hardware accelerator device.
The hardware accelerator equipment management method performs resource pooling management on the hardware accelerator equipment and maintains the state information of each individual hardware accelerator, including a usable state, a using state and a maintenance state. By setting a hardware accelerator device to the maintenance state, hardware device resources can be reserved at any time for a user's special services, which effectively improves the flexibility, maintainability and operability of the accelerator equipment. The application also discloses a hardware accelerator equipment management device, an electronic device and a computer readable storage medium, which achieve the same technical effects.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort. The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting it. In the drawings:
FIG. 1 is a flow diagram illustrating a method for hardware accelerator device management in accordance with an illustrative embodiment;
FIG. 2 is a schematic diagram of the Cyborg devices database table before and after modification;
FIG. 3 is a diagram illustrating a Cyborg accelerator device management module architecture in accordance with an exemplary embodiment;
FIG. 4 is a flow diagram illustrating another method of hardware accelerator device management in accordance with an illustrative embodiment;
FIG. 5 is a diagram illustrating a Nova and Cyborg interaction architecture in accordance with an exemplary embodiment;
FIG. 6 is a block diagram illustrating a hardware accelerator device management apparatus in accordance with an illustrative embodiment;
FIG. 7 is a block diagram illustrating an electronic device in accordance with an exemplary embodiment.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application. In addition, in the embodiments of the present application, "first", "second", and the like are used for distinguishing similar objects, and are not necessarily used for describing a specific order or a sequential order.
The embodiment of the application discloses a hardware accelerator equipment management method, which realizes reservation of accelerator equipment and improves the flexibility of the accelerator equipment.
Referring to FIG. 1, a flowchart of a hardware accelerator device management method according to an exemplary embodiment is shown. As shown in FIG. 1, the method comprises:
S101: establishing a resource pool in a cloud platform, and allocating hardware accelerator equipment in the cloud platform to a corresponding resource pool; wherein the hardware accelerator device comprises a physical accelerator device and/or a virtualized accelerator device;
This embodiment can be applied to an OpenStack cloud platform. After deployment of the cloud platform is completed, a user needs to configure resource pooling of the hardware accelerator devices on each server that carries accelerator devices; the hardware accelerator devices in the cloud platform may include physical accelerator devices and virtualized accelerator devices. Once a hardware accelerator device has been assigned to a resource pool, it cannot be exposed independently.
Taking a GPU and a virtual GPU device as examples, initializing a physical GPU device:
[devices]
enabled_gpu_types=gpu-device-driver
Initializing a single virtualized device:
[devices]
enabled_vgpu_types=vgpu-device-1
Initializing a plurality of virtualized devices:
[devices]
enabled_vgpu_types=vgpu-device-1,vgpu-device-2
pool_name=<resource pool name>
[vgpu_gpu-device-1]
device_addresses=0000:58:00.0,0000:76:00.0
[vgpu_gpu-device-2]
device_addresses=0000:89:00.0
The resource pool name pool_name is optional. If it is set, the two virtual GPU accelerator devices vgpu-device-1 and vgpu-device-2 are allocated to that resource pool for management.
As a possible implementation, the allocating the hardware accelerator devices to the corresponding resource pools includes: hardware accelerators of the same type are allocated to the same resource pool. In a specific implementation, hardware accelerator devices of the same type may be managed centrally in the same resource pool, for example GPU devices from Intel and NVIDIA.
As another possible implementation, the allocating the hardware accelerator devices to the corresponding resource pools includes: allocating the hardware accelerators configured on the same physical host to the same resource pool. In a specific implementation, hardware accelerators of different types that are configured on the same physical host may be managed through the same resource pool, for example an Inspur NVMe SSD device and an Intel GPU device.
Further, as a preferred embodiment, the resource pool has a device virtualization attribute, which indicates whether the hardware accelerator devices in the resource pool support virtualization. If the device virtualization attribute is turned on, the resource pool includes physical accelerator devices and the corresponding virtualized accelerator devices, and all devices in the resource pool support virtualization; for example, the resource pool may include virtualized GPUs, virtualized smart network cards (SmartNICs), and the like.
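As a rough illustration of how such a configuration could be read, the following Python sketch parses the multi-device example with the standard `configparser` module. The function name `load_device_config`, the returned dictionary layout, and the sample pool name `pool-a` are assumptions made for illustration; the section and option names are copied from the listing above, and the patent does not describe the actual parsing code.

```python
import configparser

# Sample text mirroring the multi-device configuration shown above.
# "pool-a" is a hypothetical pool name used only for this example.
SAMPLE = """
[devices]
enabled_vgpu_types = vgpu-device-1,vgpu-device-2
pool_name = pool-a

[vgpu_gpu-device-1]
device_addresses = 0000:58:00.0,0000:76:00.0

[vgpu_gpu-device-2]
device_addresses = 0000:89:00.0
"""


def split_csv(value):
    """Split a comma-separated option value into a clean list."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_device_config(text):
    """Return enabled types, the optional pool name, and PCI addresses per section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    devices = parser["devices"]
    addresses = {
        section: split_csv(parser[section]["device_addresses"])
        for section in parser.sections()
        if section.startswith("vgpu_")
    }
    return {
        "types": split_csv(devices.get("enabled_vgpu_types", "")),
        "pool_name": devices.get("pool_name"),  # None when the option is omitted
        "addresses": addresses,
    }


config = load_device_config(SAMPLE)
```

Because `pool_name` is optional, the sketch returns `None` for it when the option is absent rather than raising an error.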
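The two allocation strategies described above can be sketched as two grouping functions. This is a minimal illustration, assuming a hypothetical `Device` record and a hypothetical pool naming scheme (`pool-<key>`); neither is specified in the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical device record used only for this illustration."""
    uuid: str
    device_type: str  # e.g. "GPU", "SSD"
    host: str         # physical host the device is installed in


def pools_by_type(devices):
    """Strategy 1: devices of the same type share one resource pool."""
    pools = defaultdict(list)
    for device in devices:
        pools[f"pool-{device.device_type.lower()}"].append(device.uuid)
    return dict(pools)


def pools_by_host(devices):
    """Strategy 2: devices configured on the same physical host share one pool."""
    pools = defaultdict(list)
    for device in devices:
        pools[f"pool-{device.host}"].append(device.uuid)
    return dict(pools)


inventory = [
    Device("gpu-1", "GPU", "host-a"),
    Device("gpu-2", "GPU", "host-b"),
    Device("ssd-1", "SSD", "host-a"),
]
```

With this inventory, grouping by type puts both GPUs in one pool regardless of host, while grouping by host puts `gpu-1` and `ssd-1` together.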
S102: acquiring basic information, state information and resource pooling information of hardware accelerator equipment in the cloud platform; wherein the state information comprises a usable state, a using state and a maintenance state, and the resource pooling information is used for representing a resource pool to which the hardware accelerator device belongs;
In this embodiment, fields for resource pooling information (pool_name) and state information (status) are added to the Cyborg devices database table. The resource pooling information represents the resource pool to which the hardware accelerator device belongs, and the state information represents the state of the hardware accelerator device, which is one of an available state (available), an in-use state (in-use) and a maintenance state (maintaining). The Cyborg devices database table before and after modification is shown in FIG. 2, with the table before modification on the left and the table after modification on the right.
In a specific implementation, the architecture of the Cyborg accelerator device management module is shown in FIG. 3. Cyborg maintains the device state and the device resource pooling information in the database table through the cyborg-conductor service. Cyborg is deployed on the compute node (compute), where cyborg-agent collects accelerator device information, including the device manufacturer, UUID, device attributes, device name, device state, and the like, through each accelerator device driver and stores it in the database table through cyborg-conductor. The newly added cyborg-api interface allows the state information and resource pooling information of a hardware accelerator device to be set through an Application Programming Interface (API). When accelerator resource pooling information is set through the API, it is checked whether the selected hardware accelerator devices have the same attribute; if so, the attribute is marked on the resource pool, so that accurate device resource pool information is provided to the resource manager (Placement).
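The table change can be pictured with a small in-memory SQLite example. This is only a sketch: the pre-existing columns (`uuid`, `type`, `vendor`) and the default status are assumptions, and the actual Cyborg schema shown in FIG. 2 is not reproduced here. Only the two added field names and the three status values come from the text.

```python
import sqlite3

# The three status values named in the text.
VALID_STATUS = ("available", "in-use", "maintaining")


def migrate(conn):
    """Add the two new fields described above to a hypothetical devices table."""
    conn.execute("ALTER TABLE devices ADD COLUMN pool_name TEXT")
    conn.execute(
        "ALTER TABLE devices ADD COLUMN status TEXT NOT NULL DEFAULT 'available'"
    )


def set_status(conn, uuid, status):
    """Update one device's status, rejecting values outside the known set."""
    if status not in VALID_STATUS:
        raise ValueError(f"unknown status: {status!r}")
    conn.execute("UPDATE devices SET status = ? WHERE uuid = ?", (status, uuid))


conn = sqlite3.connect(":memory:")
# Simplified stand-in for the table "before modification".
conn.execute("CREATE TABLE devices (uuid TEXT PRIMARY KEY, type TEXT, vendor TEXT)")
conn.execute("INSERT INTO devices VALUES ('dev-1', 'GPU', 'vendor-x')")
migrate(conn)
```

Existing rows pick up the default status after the migration, and `pool_name` stays empty until a pool is assigned.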
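The attribute check performed when pooling information is set can be sketched as follows. The function `assign_pool`, the attribute keys, and the choice to mark only those attributes on which every selected device agrees are assumptions; the text states only that the pool attribute is marked when the selected devices share it.

```python
def assign_pool(devices, pool_name, attribute_keys=("type", "supports_virtualization")):
    """Assign devices to a pool and mark the attributes they all share.

    `devices` is a list of dicts (a hypothetical in-memory representation).
    Attributes on which the devices disagree are left unmarked.
    """
    if not devices:
        raise ValueError("no devices selected")
    shared = {}
    for key in attribute_keys:
        values = {device[key] for device in devices}
        if len(values) == 1:  # every selected device has the same value
            shared[key] = values.pop()
    for device in devices:
        device["pool_name"] = pool_name
    return {"name": pool_name, "attributes": shared}


selected = [
    {"uuid": "gpu-1", "type": "GPU", "supports_virtualization": True},
    {"uuid": "gpu-2", "type": "GPU", "supports_virtualization": True},
]
pool = assign_pool(selected, "pool-gpu")
```

A stricter variant could reject a selection whose devices disagree on an attribute instead of leaving it unmarked; the text does not say which behavior is intended.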
S103: storing the basic information, the state information and the resource pooling information of the hardware accelerator equipment into a database and reporting to a resource manager;
In a specific implementation, the basic information, state information, and resource pooling information of the hardware accelerator device are stored in a database table by cyborg-conductor, and the state information and the resource pooling information of the hardware accelerator device are synchronously reported to the Placement resource manager through cyborg-api. Meanwhile, the usage of the device resources can be reported to the Placement resource manager periodically through a timed task, so that accurate resource information is available when Nova schedules resources to create a cloud host.
As a preferred embodiment, the present step comprises: storing the basic information, the state information and the resource pooling information of the hardware accelerator device into a database; and reporting the basic information, the state information and the resource pooling information of the hardware accelerator equipment with the state information being in a usable state and a using state to a resource manager.
In a specific implementation, if a hardware accelerator device is set to the maintenance state through cyborg-api, the device is in maintenance mode and cannot be accessed. When the Cyborg timed task reports resources to the Placement resource manager, the device is removed from the reported resources, which reduces its influence on the computing resources. A cloud platform administrator can query the list of hardware accelerator devices in the maintenance state through cyborg-api and update the state of the corresponding accelerator devices.
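The filtering applied by the periodic report can be sketched in a few lines. The dictionary-based device records and the function names are assumptions for illustration; only the three status values and the rule that maintenance devices are withheld from the report come from the text.

```python
REPORTABLE = ("available", "in-use")


def devices_to_report(devices):
    """Devices the timed task would report: everything not in maintenance."""
    return [device for device in devices if device["status"] in REPORTABLE]


def devices_in_maintenance(devices):
    """Devices an administrator could list and later return to service."""
    return [device for device in devices if device["status"] == "maintaining"]


fleet = [
    {"uuid": "gpu-1", "status": "available"},
    {"uuid": "gpu-2", "status": "maintaining"},
    {"uuid": "fpga-1", "status": "in-use"},
]
reported = devices_to_report(fleet)
reserved = devices_in_maintenance(fleet)
```

Since a maintenance device never reaches the resource manager, the scheduler cannot select it, which is how the reservation effect described in the text arises.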
S104: displaying, by the resource manager, basic information, state information, and resource pooling information of the hardware accelerator device.
In this step, the basic information, the state information and the resource pooling information of the hardware accelerator device are displayed through the resource manager, which provides effective scheduling information for the creation of cloud hosts. As a preferred embodiment, this step may include: displaying, by the resource manager, basic information, state information, and resource pooling information of the hardware accelerator device for which the state information is a usable state and a using state. It can be understood that, because a device in the maintenance state is not reported to the resource manager, it cannot be scheduled when accelerators are scheduled for a cloud host being created, and the target accelerator device is thereby effectively protected or reserved.
The hardware accelerator device management method provided by the embodiment of the application performs resource pooling management on hardware accelerator devices and maintains the state information of each individual hardware accelerator, including a usable state, a using state and a maintenance state. By setting a hardware accelerator device to the maintenance state, hardware device resources can be reserved at any time for a user's special services, which effectively improves the flexibility, maintainability and operability of the accelerator equipment.
In this embodiment, a process of creating a cloud host will be described, specifically:
Referring to FIG. 4, a flowchart of another hardware accelerator device management method according to an exemplary embodiment is shown. As shown in FIG. 4, the method comprises:
S201: receiving a creation request of a cloud host; wherein the creation request comprises a requested target accelerator device type, the target accelerator device type comprising a target physical accelerator device type and/or a target virtualized accelerator device type;
In this embodiment, the request for creating the cloud host may include information such as the name, description, resource specification, network card information, user, and project of the cloud host, and the requested target accelerator device type is set in the resource specification. The Nova and Cyborg interaction architecture is shown in FIG. 5. Nova may request a single physical accelerator device, multiple physical accelerator devices, a single virtualized accelerator device, or multiple virtualized accelerator devices; that is, the target accelerator device type includes a target physical accelerator device type and/or a target virtualized accelerator device type. Nova may also directly request all physical accelerator devices within a certain resource pool.
Taking a request to create a cloud host bound to one GPU device as an example, the format of the creation request is: {"name": "cloud host name", "description": "cloud host description information", "navigator": {"device_profile_name1": "DP_GPU"}, "network": "network_id", "project_id": "project ID", "user_id": "user ID"}.
If a GPU device and an FPGA device are requested, the format of the creation request is: {"name": "cloud host name", "description": "cloud host description information", "navigator": {"device_profile_gpu": "DP_GPU", "device_profile_fpga": "DP_FPGA"}}.
If a virtual GPU device and a virtual FPGA device are requested, the format of the creation request is: {"name": "cloud host name", "description": "cloud host description information", "navigator": {"device_profile_gpu": "DP_vGPU", "device_profile_fpga": "DP_vFPGA"}}.
If a virtual GPU device and a physical FPGA device are requested, the format of the creation request is: {"name": "cloud host name", "description": "cloud host description information", "navigator": {"device_profile_gpu": "DP_vGPU", "device_profile_fpga": "DP_FPGA"}}.
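The request variants above differ only in the device profile mapping, so they can be produced by one small helper. In this sketch the function `build_create_request` is hypothetical, and the `navigator` key is copied from the request examples as printed in this text; it appears to carry the resource specification, and the real key name in a deployment may differ.

```python
import json


def build_create_request(name, description, device_profiles, **extra):
    """Build a creation request body in the shape shown in the examples above.

    `device_profiles` maps a profile slot name to a device profile,
    e.g. {"device_profile_gpu": "DP_vGPU"}. Extra keyword arguments such as
    network, project_id, or user_id are copied into the body unchanged.
    """
    body = {
        "name": name,
        "description": description,
        # Key name taken from the text as printed; treat it as an assumption.
        "navigator": dict(device_profiles),
    }
    body.update(extra)
    return body


request = build_create_request(
    "cloud host name",
    "cloud host description information",
    {"device_profile_gpu": "DP_vGPU", "device_profile_fpga": "DP_FPGA"},
    network="network_id",
)
payload = json.dumps(request)
```

Mixing physical and virtualized devices in one request then only requires changing the profile values, as in the last example above.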
S202: determining a target physical host conforming to the type of the target accelerator, and acquiring performance parameters of the target physical host through information displayed by the resource manager;
S203: determining an optimal target physical host by using a preset scheduling algorithm based on the performance parameters of the target physical host;
S204: starting the cloud host on the optimal target physical host to complete the creation operation of the cloud host.
In this embodiment, the Nova API receives the request for creating a cloud host, and nova-conductor analyzes the request against the resources provided by Placement. Physical hosts meeting the conditions are screened out first; if a GPU device and an FPGA device are requested, the target physical hosts are those equipped with both a GPU device and an FPGA device. The nova-scheduler then sets the weight of each target physical host through an accelerator intelligent scheduling algorithm in combination with performance parameters such as the number of CPU cores of the cloud host, the memory size, and the hard disk size, sorts the hosts by weight, and takes the target physical host with the highest weight as the optimal target physical host for the requested cloud host. By calling cyborg-api, it establishes the binding information between the cloud host and the target accelerator device configured on the optimal target physical host, sets the state information of the target accelerator device in the database table to the in-use state (in-use), reports the state information to the Placement resource manager, and updates the usage of the device. The cloud host is then started on the optimal target physical host to complete the creation operation of the cloud host.
For the case in which different types of hardware accelerators configured on the same physical host are managed through the same resource pool, a target resource pool conforming to the target accelerator type is determined first, and then the target physical host to which the hardware accelerators contained in the target resource pool belong is determined. For the above example, the target resource pool contains both GPU devices and FPGA devices.
Therefore, in this embodiment the requested accelerator devices are selected automatically when the cloud host is created, which effectively ensures the flexibility of creating cloud hosts with accelerator devices on the cloud platform and allows accelerator device resources to be planned as required.
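The filter-then-weigh flow described above can be sketched as follows. The text does not disclose the actual accelerator intelligent scheduling algorithm, so the linear weighting, its coefficients, the use of free host capacity as the performance parameters, and the `Host` record are all assumptions chosen only to make the flow concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical view of a physical host as seen by the scheduler."""
    name: str
    free_vcpus: int
    free_memory_mb: int
    free_disk_gb: int
    available_devices: dict = field(default_factory=dict)  # device type -> count


def filter_hosts(hosts, requested):
    """Keep hosts that can supply every requested device type in the needed count."""
    return [
        host for host in hosts
        if all(host.available_devices.get(kind, 0) >= count
               for kind, count in requested.items())
    ]


def weigh(host, w_cpu=1.0, w_mem=1.0 / 1024, w_disk=0.1):
    """Assumed linear weight over CPU, memory, and disk capacity."""
    return (w_cpu * host.free_vcpus
            + w_mem * host.free_memory_mb
            + w_disk * host.free_disk_gb)


def pick_best_host(hosts, requested):
    """Return the highest-weighted candidate, or None if no host qualifies."""
    candidates = filter_hosts(hosts, requested)
    return max(candidates, key=weigh) if candidates else None


hosts = [
    Host("host-a", 64, 262144, 2000, {"GPU": 4}),
    Host("host-b", 8, 16384, 100, {"GPU": 1, "FPGA": 1}),
    Host("host-c", 16, 32768, 200, {"GPU": 2, "FPGA": 1}),
]
best = pick_best_host(hosts, {"GPU": 1, "FPGA": 1})
```

Here `host-a` has the most capacity but is filtered out because it has no FPGA, so the choice falls between `host-b` and `host-c` on weight alone.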
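For the pool-based lookup, a minimal sketch is given below. The data layout (pool name mapped to a list of `(device_type, host)` pairs) and the function name are assumptions; the text says only that a matching pool is found first and the hosts of its accelerators are then determined.

```python
def hosts_for_pool_request(pools, requested_types):
    """Find hosts behind the pools that contain every requested device type.

    `pools` maps a pool name to a list of (device_type, host) pairs.
    """
    wanted = set(requested_types)
    hosts = set()
    for members in pools.values():
        pool_types = {device_type for device_type, _ in members}
        if wanted <= pool_types:  # the pool conforms to the requested types
            hosts.update(host for device_type, host in members
                         if device_type in wanted)
    return sorted(hosts)


pools = {
    "pool-host-a": [("GPU", "host-a"), ("FPGA", "host-a")],
    "pool-host-b": [("GPU", "host-b")],
}
targets = hosts_for_pool_request(pools, ["GPU", "FPGA"])
```

Only `pool-host-a` contains both requested types, so its host becomes the candidate passed on to the weighting step.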
A hardware accelerator device management apparatus provided in an embodiment of the present application is introduced below; the apparatus described below and the hardware accelerator device management method described above may be cross-referenced.
Fig. 6 is a block diagram of a hardware accelerator device management apparatus according to an exemplary embodiment. As shown in fig. 6, the apparatus includes:
an allocation module 601, configured to create a resource pool in a cloud platform, and allocate hardware accelerator devices in the cloud platform to corresponding resource pools; wherein the hardware accelerator device comprises a physical accelerator device and/or a virtualized accelerator device;
an obtaining module 602, configured to obtain basic information, state information, and resource pooling information of a hardware accelerator device in the cloud platform; wherein the state information comprises a usable state, an in-use state, and a maintenance state, and the resource pooling information is used for representing the resource pool to which the hardware accelerator device belongs;
a reporting module 603, configured to store the basic information, the state information, and the resource pooling information of the hardware accelerator device in a database and report the basic information, the state information, and the resource pooling information to a resource manager;
a display module 604, configured to display, through the resource manager, the basic information, the state information, and the resource pooling information of the hardware accelerator device.
The hardware accelerator device management apparatus provided in the embodiment of the present application performs resource pooling management on hardware accelerator devices, and maintains state information for each individual hardware accelerator, including a usable state, an in-use state, and a maintenance state. By setting a hardware accelerator device to the maintenance state, usable hardware device resources can be reserved for a user's special services at any time, which effectively improves the flexibility, maintainability, and operability of the accelerator devices.
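The division of work among the four modules can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the class and method names are hypothetical, the state strings other than "in-use" are invented, and plain dictionaries stand in for the database and the resource manager.

```python
from dataclasses import dataclass
from enum import Enum


class DeviceState(Enum):
    AVAILABLE = "available"      # usable state (string assumed)
    IN_USE = "in-use"            # in-use state
    MAINTENANCE = "maintenance"  # maintenance state (string assumed)


@dataclass
class AcceleratorDevice:
    device_id: str
    device_type: str  # e.g. "GPU" or "FPGA"
    host: str         # physical host the device is configured on
    pool: str         # resource pool the device was allocated to
    state: DeviceState = DeviceState.AVAILABLE


class DeviceManager:
    def __init__(self):
        self.database = {}          # stands in for the database table
        self.resource_manager = {}  # stands in for the resource manager

    def add_device(self, device):
        """Store a device record and report it."""
        self.database[device.device_id] = device
        self._report(device)

    def set_state(self, device_id, new_state):
        """Update the stored state and report the change."""
        device = self.database[device_id]
        device.state = new_state
        self._report(device)

    def _report(self, device):
        self.resource_manager[device.device_id] = {
            "type": device.device_type,
            "host": device.host,
            "pool": device.pool,
            "state": device.state.value,
        }

    def display(self):
        """Return reported records as tuples, sorted by device id."""
        return [
            (dev_id, r["type"], r["host"], r["pool"], r["state"])
            for dev_id, r in sorted(self.resource_manager.items())
        ]
```

Reserving a device then amounts to calling `set_state(device_id, DeviceState.MAINTENANCE)`, after which the scheduler can skip it.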
On the basis of the foregoing embodiment, as a preferred implementation manner, the resource pool has a device virtualization attribute, where the device virtualization attribute is used to indicate whether the hardware accelerator devices in the resource pool support virtualization, and if the device virtualization attribute is enabled, the resource pool includes physical accelerator devices and the corresponding virtualized accelerator devices.
On the basis of the foregoing embodiment, as a preferred implementation manner, the allocation module 601 is specifically configured to create a resource pool in a cloud platform and allocate hardware accelerators of the same type to the same resource pool.
On the basis of the foregoing embodiment, as a preferred implementation manner, the allocation module 601 is specifically configured to create a resource pool in a cloud platform and allocate hardware accelerators configured on the same physical host to the same resource pool.
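The two pooling strategies, by accelerator type and by physical host, differ only in the key used for grouping. The sketch below is a hypothetical illustration: the `build_pools` function, the dictionary layout of a device, and the pool naming scheme are assumptions, not part of the described apparatus.

```python
from collections import defaultdict


def build_pools(devices, strategy):
    """Group device ids into pools keyed by "type" or by "host".

    Each device is a dict with the keys "id", "type", and "host".
    """
    if strategy not in ("type", "host"):
        raise ValueError("strategy must be 'type' or 'host'")
    pools = defaultdict(list)
    for device in devices:
        # Pool name such as "type:GPU" or "host:host-1" (naming assumed).
        pools[f"{strategy}:{device[strategy]}"].append(device["id"])
    return dict(pools)
```

With type-based pooling every pool is homogeneous, whereas host-based pooling can place a GPU and an FPGA of the same machine in one pool.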
On the basis of the above embodiment, as a preferred implementation, the apparatus further includes:
the receiving module is used for receiving a creation request of the cloud host; wherein the creation request comprises a requested target accelerator device type, the target accelerator device type comprising a target physical accelerator device type and/or a target virtualized accelerator device type;
the first determining module is used for determining a target physical host conforming to the type of the target accelerator and acquiring the performance parameters of the target physical host through the information displayed by the resource manager;
the second determining module is used for determining the optimal target physical host by utilizing a preset scheduling algorithm based on the performance parameters of the target physical host;
and the starting module is used for starting the cloud host on the optimal target physical host so as to complete the creation operation of the cloud host.
On the basis of the foregoing embodiment, as a preferred implementation manner, the first determining module is specifically a module that determines a target resource pool that conforms to the type of the target accelerator, determines a target physical host to which a hardware accelerator included in the target resource pool belongs, and acquires a performance parameter of the target physical host through information displayed by the resource manager.
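For the host-based pooling case, resolving a request to target physical hosts through the resource pools might look like the sketch below. It assumes each pool holds all accelerators of one physical host, so a pool matches when it contains every requested type; the `target_hosts` function and the data layout are hypothetical.

```python
def target_hosts(pools, requested_types):
    """Return the hosts whose pool contains every requested accelerator type.

    `pools` maps a pool name to a list of device dicts with the keys
    "type" and "host".
    """
    requested = set(requested_types)
    hosts = set()
    for devices in pools.values():
        types_in_pool = {d["type"] for d in devices}
        if requested <= types_in_pool:
            # The pool matches; collect the hosts its devices belong to.
            hosts.update(d["host"] for d in devices)
    return sorted(hosts)
```

The performance parameters of the returned hosts would then be looked up through the resource manager before the scheduling algorithm is applied.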
On the basis of the foregoing embodiment, as a preferred implementation manner, the reporting module 603 is specifically configured to store the basic information, the state information, and the resource pooling information of the hardware accelerator device into a database, and to report to the resource manager the basic information, the state information, and the resource pooling information of those hardware accelerator devices whose state information is the usable state or the in-use state;
correspondingly, the display module 604 is specifically configured to display, through the resource manager, the basic information, the state information, and the resource pooling information of the hardware accelerator devices whose state information is the usable state or the in-use state.
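The effect of this variant is that a device in the maintenance state remains in the database but is hidden from the resource manager. A minimal sketch, assuming dictionaries for both stores and hypothetical state strings "available", "in-use", and "maintenance":

```python
REPORTABLE_STATES = {"available", "in-use"}


def store_and_report(devices, database, resource_manager):
    """Store every device; report only usable and in-use devices."""
    for device in devices:
        database[device["id"]] = device
        if device["state"] in REPORTABLE_STATES:
            resource_manager[device["id"]] = device
        else:
            # A device placed under maintenance is withdrawn from the
            # resource manager so that it is no longer displayed.
            resource_manager.pop(device["id"], None)
```

Because the scheduler only sees what the resource manager displays, withdrawn devices cannot be selected for new cloud hosts.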
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Based on the hardware implementation of the program modules, and in order to implement the method according to the embodiment of the present application, an embodiment of the present application further provides an electronic device. Fig. 7 is a structural diagram of an electronic device according to an exemplary embodiment. As shown in fig. 7, the electronic device includes:
a communication interface 1 capable of information interaction with other devices such as network devices and the like;
a processor 2, connected to the communication interface 1 to exchange information with other devices, and configured to execute the hardware accelerator device management method provided by one or more of the above technical solutions when running a computer program, the computer program being stored in the memory 3.
In practice, of course, the various components in the electronic device are coupled together by the bus system 4. It will be appreciated that the bus system 4 is used to enable connection communication between these components. The bus system 4 comprises, in addition to a data bus, a power bus, a control bus and a status signal bus. For the sake of clarity, however, the various buses are labeled as bus system 4 in fig. 7.
The memory 3 in the embodiment of the present application is used to store various types of data to support the operation of the electronic device. Examples of such data include: any computer program for operating on an electronic device.
It will be appreciated that the memory 3 may be either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disk, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 3 described in the embodiments of the present application is intended to comprise, without being limited to, these and any other suitable types of memory.
The method disclosed in the above embodiment of the present application may be applied to the processor 2, or implemented by the processor 2. The processor 2 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 2. The processor 2 described above may be a general purpose processor, a DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor 2 may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 3, and the processor 2 reads the program in the memory 3 and in combination with its hardware performs the steps of the aforementioned method.
When the processor 2 executes the program, the corresponding processes in the methods according to the embodiments of the present application are realized, and for brevity, are not described herein again.
In an exemplary embodiment, the present application further provides a storage medium, i.e. a computer storage medium, specifically a computer readable storage medium, for example, including a memory 3 storing a computer program, which can be executed by a processor 2 to implement the steps of the foregoing method. The computer readable storage medium may be Memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface Memory, optical disk, or CD-ROM.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for enabling an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
The above description covers only specific embodiments of the present application, but the scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A hardware accelerator device management method, comprising:
establishing a resource pool in a cloud platform, and allocating hardware accelerator equipment in the cloud platform to a corresponding resource pool; wherein the hardware accelerator device comprises a physical accelerator device and/or a virtualized accelerator device;
acquiring basic information, state information and resource pooling information of the hardware accelerator equipment; wherein the state information comprises a usable state, a using state and a maintenance state, and the resource pooling information is used for representing a resource pool to which the hardware accelerator device belongs;
storing the basic information, the state information and the resource pooling information of the hardware accelerator equipment into a database and reporting to a resource manager;
displaying, by the resource manager, basic information, state information, and resource pooling information of the hardware accelerator device.
2. The hardware accelerator device management method according to claim 1, wherein the resource pool has a device virtualization attribute, the device virtualization attribute is used to indicate whether a hardware accelerator device in the resource pool supports virtualization, and if the device virtualization attribute is turned on, the resource pool includes a physical accelerator device and a corresponding virtualization accelerator device.
3. The hardware accelerator device management method of claim 1, wherein allocating the hardware accelerator device to a corresponding resource pool comprises:
hardware accelerators of the same type are allocated to the same resource pool.
4. The hardware accelerator device management method of claim 1, wherein allocating the hardware accelerator device to a corresponding resource pool comprises:
allocating hardware accelerators configured on the same physical host to the same resource pool.
5. The hardware accelerator device management method of claim 4, further comprising:
receiving a creation request of a cloud host; wherein the creation request comprises a requested target accelerator device type, the target accelerator device type comprising a target physical accelerator device type and/or a target virtualized accelerator device type;
determining a target physical host conforming to the type of the target accelerator, and acquiring performance parameters of the target physical host through information displayed by the resource manager;
determining an optimal target physical host by using a preset scheduling algorithm based on the performance parameters of the target physical host;
and starting the cloud host on the optimal target physical host to complete the creation operation of the cloud host.
6. The hardware accelerator device management method of claim 5, wherein the determining a target physical host that conforms to the target accelerator type comprises:
determining a target resource pool conforming to the target accelerator type, and determining a target physical host to which a hardware accelerator contained in the target resource pool belongs.
7. The method for managing the hardware accelerator device of claim 1, wherein storing the basic information, the status information, and the resource pooling information of the hardware accelerator device in a database and reporting to a resource manager comprises:
storing the basic information, the state information and the resource pooling information of the hardware accelerator device into a database;
reporting the basic information, the state information and the resource pooling information of the hardware accelerator equipment with the state information being in a usable state and a using state to a resource manager;
correspondingly, the displaying, by the resource manager, the basic information, the state information, and the resource pooling information of the hardware accelerator device includes:
displaying, by the resource manager, basic information, state information, and resource pooling information of the hardware accelerator device for which the state information is a usable state and a using state.
8. A hardware accelerator device management apparatus, comprising:
an allocation module, configured to create a resource pool in a cloud platform and allocate hardware accelerator devices in the cloud platform to corresponding resource pools; wherein the hardware accelerator device comprises a physical accelerator device and/or a virtualized accelerator device;
the acquisition module is used for acquiring basic information, state information and resource pooling information of hardware accelerator equipment in the cloud platform; wherein the state information comprises a usable state, a using state and a maintenance state, and the resource pooling information is used for representing a resource pool to which the hardware accelerator device belongs;
the reporting module is used for storing the basic information, the state information and the resource pooling information of the hardware accelerator equipment into a database and reporting the basic information, the state information and the resource pooling information to a resource manager;
and the display module is used for displaying the basic information, the state information and the resource pooling information of the hardware accelerator equipment through the resource manager.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the hardware accelerator device management method of any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the hardware accelerator device management method according to any one of claims 1 to 7.
CN202110825190.0A 2021-07-21 2021-07-21 Hardware accelerator equipment management method and device, electronic equipment and storage medium Pending CN113674131A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110825190.0A CN113674131A (en) 2021-07-21 2021-07-21 Hardware accelerator equipment management method and device, electronic equipment and storage medium
PCT/CN2022/078281 WO2023000673A1 (en) 2021-07-21 2022-02-28 Hardware accelerator device management method and apparatus, and electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110825190.0A CN113674131A (en) 2021-07-21 2021-07-21 Hardware accelerator equipment management method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113674131A true CN113674131A (en) 2021-11-19

Family

ID=78539758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110825190.0A Pending CN113674131A (en) 2021-07-21 2021-07-21 Hardware accelerator equipment management method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN113674131A (en)
WO (1) WO2023000673A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023000673A1 (en) * 2021-07-21 2023-01-26 山东海量信息技术研究院 Hardware accelerator device management method and apparatus, and electronic device and storage medium
CN117389841A (en) * 2023-12-07 2024-01-12 合芯科技(苏州)有限公司 Method and device for monitoring accelerator resources, cluster equipment and storage medium
WO2024027515A1 (en) * 2022-08-05 2024-02-08 中国移动通信有限公司研究院 Information transmission method and apparatus, cloud platform, network element, and storage medium
CN117389841B (en) * 2023-12-07 2024-04-19 合芯科技(苏州)有限公司 Method and device for monitoring accelerator resources, cluster equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117234741B (en) * 2023-11-14 2024-02-20 苏州元脑智能科技有限公司 Resource management and scheduling method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103309748A (en) * 2013-06-19 2013-09-18 上海交通大学 Adaptive scheduling host system and scheduling method of GPU virtual resources in cloud game
CN104010028A (en) * 2014-05-04 2014-08-27 华南理工大学 Dynamic virtual resource management strategy method for performance weighting under cloud platform
US20180210752A1 (en) * 2015-09-25 2018-07-26 Huawei Technologies Co., Ltd. Accelerator virtualization method and apparatus, and centralized resource manager
CN111736915A (en) * 2020-06-05 2020-10-02 浪潮电子信息产业股份有限公司 Management method, device, equipment and medium for cloud host instance hardware acceleration equipment
CN112925634A (en) * 2019-12-06 2021-06-08 中国电信股份有限公司 Heterogeneous resource scheduling method and system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100228695A1 (en) * 2009-03-06 2010-09-09 Boris Kaplan Computer system in which a received signal-reaction of the computer system of artificial intelligence of a cyborg or an android, an association of the computer system of artificial intelligence of a cyborg or an android, a thought of the computer system of artificial intelligence of a cyborg or an android are substantiated and the working method of this computer system of artificial intelligence of a cyborg or an android
US10373284B2 (en) * 2016-12-12 2019-08-06 Amazon Technologies, Inc. Capacity reservation for virtualized graphics processing
CN113674131A (en) * 2021-07-21 2021-11-19 山东海量信息技术研究院 Hardware accelerator equipment management method and device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张政;: "基于OpenStack云平台的弹性资源配置系统", 自动化与仪器仪表, no. 11, 25 November 2016 (2016-11-25) *

Also Published As

Publication number Publication date
WO2023000673A1 (en) 2023-01-26

Similar Documents

Publication Publication Date Title
US11714671B2 (en) Creating virtual machine groups based on request
CN113674131A (en) Hardware accelerator equipment management method and device, electronic equipment and storage medium
JP5510556B2 (en) Method and system for managing virtual machine storage space and physical hosts
CN111966500A (en) Resource scheduling method and device, electronic equipment and storage medium
KR20140049064A (en) Method and apparatus for providing isolated virtual space
CN111309440B (en) Method and equipment for managing and scheduling multiple types of GPUs
CN114416352A (en) Computing resource allocation method and device, electronic equipment and storage medium
CN108073423A (en) A kind of accelerator loading method, system and accelerator loading device
CN112463375A (en) Data processing method and device
CN114138405A (en) Virtual machine creating method and device, electronic equipment and storage medium
CN111798113A (en) Resource allocation method, device, storage medium and electronic equipment
CN107992351B (en) Hardware resource allocation method and device and electronic equipment
CN105677481B (en) A kind of data processing method, system and electronic equipment
CN114490062A (en) Local disk scheduling method and device, electronic equipment and storage medium
CN108667750B (en) Virtual resource management method and device
DE112021003803T5 (en) POOL MANAGEMENT FOR APPLICATION LAUNCH IN AN ONBOARD UNIT
CN114816665B (en) Hybrid arrangement system and virtual machine container resource hybrid arrangement method under super-fusion architecture
CN113535087B (en) Data processing method, server and storage system in data migration process
CN116436968A (en) Service grid communication method, system, device and storage medium
CN111475277A (en) Resource allocation method, system, equipment and machine readable storage medium
CN115080242A (en) Method, device and medium for unified scheduling of PCI equipment resources
CN110879748A (en) Shared resource allocation method, device and equipment
CN115150268A (en) Network configuration method and device of Kubernetes cluster and electronic equipment
CN115063282A (en) GPU resource scheduling method, device, equipment and storage medium
CN112988383A (en) Resource allocation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination