CN116069451B - Virtualization method, device, equipment, medium, accelerator and system - Google Patents


Info

Publication number
CN116069451B
CN116069451B · Application CN202310233967.3A
Authority
CN
China
Prior art keywords
kernel
virtual
fpga accelerator
area
accelerator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310233967.3A
Other languages
Chinese (zh)
Other versions
CN116069451A (en)
Inventor
郭巍
刘伟
徐亚明
张德闪
李仁刚
Current Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310233967.3A priority Critical patent/CN116069451B/en
Publication of CN116069451A publication Critical patent/CN116069451A/en
Application granted granted Critical
Publication of CN116069451B publication Critical patent/CN116069451B/en
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/40Bus structure
    • G06F13/4063Device-to-bus coupling
    • G06F13/4068Electrical coupling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/42Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4282Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45583Memory management, e.g. access or allocation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45595Network integration; Enabling network access in virtual machine instances
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0026PCI express
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a virtualization method, apparatus, device, medium, accelerator, and system in the field of computer technology. A shell area and a kernel area are designed for the FPGA accelerator, and PCIe physical devices, memory, and network interfaces are virtualized in the shell area, so that virtualization and isolation of accelerator resources are realized on the hardware FPGA accelerator itself. Meanwhile, the virtualized resources are allocated to different kernel programs in the kernel area, so that different virtual machines or different containers can simultaneously bind different resources in the same FPGA accelerator, with their accesses to the same FPGA accelerator isolated from each other, thereby improving the utilization of accelerator resources.

Description

Virtualization method, device, equipment, medium, accelerator and system
Technical Field
The present application relates to the field of computer technology, and in particular to a virtualization method, apparatus, device, medium, accelerator, and system.
Background
At present, a server can offload computation to a hardware FPGA accelerator (such as an FPGA accelerator card), but because the hardware FPGA accelerator itself relies on server software to virtualize its various types of resources, different virtual machines or different containers in the server cannot call the same hardware FPGA accelerator at the same time. That is, the hardware FPGA accelerator has no virtualization capability of its own; if the accelerator resources are virtualized only by server software, true isolation of the accelerator resources still cannot be achieved, so different virtual machines or different containers on the server must access the same FPGA accelerator sequentially, which leads to low utilization of accelerator resources and wastes the computing power of the FPGA accelerator.
Therefore, how to implement isolation of accelerator resources on a hardware FPGA accelerator is a problem that those skilled in the art need to solve.
Disclosure of Invention
In view of this, it is an object of the present application to provide a virtualization method, apparatus, device, medium, accelerator, and system to achieve isolation of accelerator resources on a hardware FPGA accelerator. The specific scheme is as follows:
in a first aspect, the present application provides a virtualization method applied to a shell area of an FPGA accelerator, the method including:
determining each physical device supporting a high-speed serial computer expansion bus standard in the FPGA accelerator, and virtualizing each physical device into a plurality of virtual devices;
dividing a hardware storage resource in the FPGA accelerator into a plurality of storage areas;
virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports;
recording the plurality of virtual devices, the plurality of storage areas, and the plurality of virtual network ports, and distributing them to different kernel programs in a kernel area of the FPGA accelerator, so that different virtual machines or different containers call different kernel programs at the same time.
Optionally, the virtualizing each physical device into a plurality of virtual devices includes:
determining the type of each physical device;
creating at least one physical management device based on each type of physical device;
each physical management device is virtualized as a number of virtual devices.
Optionally, the allocating of the plurality of virtual devices, the plurality of storage areas, and the plurality of virtual network ports to different kernel programs in the kernel area includes:
inquiring a resource configuration table corresponding to each kernel program; the resource configuration table corresponding to each kernel program records the minimum amounts of virtual devices, storage areas, and virtual network ports required by the current kernel program;
and allocating virtual devices, storage areas, and virtual network ports to each kernel program according to the resource configuration table corresponding to that kernel program.
Optionally, the FPGA accelerator is plugged into a server;
correspondingly, the method further comprises the steps of:
and if the server accesses any kernel program in the kernel area, controlling the access operation of the server to the kernel program.
Optionally, the method further comprises:
and if the server accesses any hardware resource in the FPGA accelerator, controlling the access operation of the server to the hardware resource.
Optionally, the method further comprises:
and isolating the signal connection between the shell area and the kernel area according to the program updating condition in the kernel area.
Optionally, the isolating the signal connection between the shell area and the kernel area according to the program update condition in the kernel area includes:
and blocking the signal connection between the shell area and the kernel area in the updating process of any program in the kernel area.
Optionally, the isolating the signal connection between the shell area and the kernel area according to the program update condition in the kernel area includes:
if no program in the kernel area is being updated, the signal connection between the shell area and the kernel area is maintained.
Optionally, after the dividing the hardware storage resource in the FPGA accelerator into a plurality of storage areas, the method further includes:
at least one queue is configured for each storage region.
Optionally, after the virtualizing of each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports, the method further includes:
at least two queues are configured for each virtual network port.
Optionally, the virtualizing each physical device into a plurality of virtual devices includes:
and creating each type of physical PCIe device in the FPGA accelerator as at least one physical management device by using the SR-IOV, and virtualizing each physical management device as a plurality of virtual devices.
Optionally, the dividing the hardware storage resource in the FPGA accelerator into a plurality of storage areas includes:
and dividing the hardware storage resources in the FPGA accelerator into a plurality of storage areas by utilizing Virtio.
Optionally, the virtualizing of each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports includes:
and virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports by utilizing Virtio.
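The first-aspect method above can be summarized as four steps: virtualize the PCIe physical devices, partition the hardware storage, virtualize the network ports, and then record and distribute the results to kernel programs. The sketch below models this flow in Python; all class and parameter names (`ShellArea`, `vfs_per_device`, and so on) are illustrative assumptions, since the patent gives no code.

```python
# Hypothetical model of the shell-area virtualization flow: not the patent's
# implementation, just an executable restatement of the four claimed steps.
from dataclasses import dataclass, field

@dataclass
class ShellArea:
    pcie_devices: list            # physical PCIe devices found in the accelerator
    memory_size: int              # total hardware storage resource, in bytes
    hw_ports: list                # hardware network port resources
    registry: dict = field(default_factory=dict)

    def virtualize(self, vfs_per_device=4, regions=4, vports_per_port=4):
        # Step 1: each physical device becomes several virtual devices.
        vdevs = [f"{d}.vf{i}" for d in self.pcie_devices for i in range(vfs_per_device)]
        # Step 2: the storage resource is divided into equal address regions.
        step = self.memory_size // regions
        areas = [(i * step, (i + 1) * step) for i in range(regions)]
        # Step 3: each hardware port becomes several virtual network ports.
        vports = [f"{p}.vport{i}" for p in self.hw_ports for i in range(vports_per_port)]
        # Step 4 (first half): record all virtualized resources.
        self.registry = {"vdevs": vdevs, "areas": areas, "vports": vports}
        return self.registry

    def allocate(self, kernels):
        # Step 4 (second half): give each kernel program a disjoint share of
        # every resource type, so callers of different kernels stay isolated.
        n = len(kernels)
        return {
            k: {res: self.registry[res][i::n] for res in self.registry}
            for i, k in enumerate(kernels)
        }
```

Because each kernel program receives a disjoint slice of every pool, two virtual machines bound to two different kernel programs never share a virtual device, storage area, or virtual network port, which is the isolation property the claim targets.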
In a second aspect, the present application provides a virtualization apparatus for application to a shell region of an FPGA accelerator, the apparatus comprising:
the first virtualization module is used for determining all physical devices supporting a high-speed serial computer expansion bus standard in the FPGA accelerator, and virtualizing each physical device into a plurality of virtual devices;
the second virtualization module is used for dividing hardware storage resources in the FPGA accelerator into a plurality of storage areas;
the third virtualization module is used for virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports;
the virtual management module is used for recording the plurality of virtual devices, the plurality of storage areas and the plurality of virtual network ports, and distributing the plurality of virtual devices, the plurality of storage areas and the plurality of virtual network ports to different kernel programs in a kernel area of the FPGA accelerator so that different virtual machines or different containers call different kernel programs at the same time.
In a third aspect, the present application provides a virtualization component disposed in a shell area of an FPGA accelerator, the virtualization component comprising:
the PCIe device virtualization module is used for virtualizing each physical PCIe device in the FPGA accelerator into a plurality of virtual devices;
the memory virtualization module is used for dividing hardware storage resources in the FPGA accelerator into a plurality of storage areas;
the network interface virtualization module is used for virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports;
a resource management module for recording the plurality of virtual devices, the plurality of storage areas, and the plurality of virtual network ports, and distributing them to different kernel programs in the kernel area so that different virtual machines or different containers call different kernel programs at the same time.
Optionally, the PCIe device virtualization module is specifically configured to: and creating each type of physical PCIe device in the FPGA accelerator as at least one physical management device by using the SR-IOV, and virtualizing each physical management device as a plurality of virtual devices.
Optionally, the memory virtualization module is specifically configured to: and dividing the hardware storage resources in the FPGA accelerator into a plurality of storage areas by utilizing Virtio.
Optionally, the memory virtualization module is further configured to: at least one queue is configured for each storage region.
Optionally, the network interface virtualization module is specifically configured to: and virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports by utilizing Virtio.
Optionally, the network interface virtualization module is further configured to: configure at least 8 queues for each virtual network port.
Optionally, the resource management module is specifically configured to: allocate virtual devices, storage areas, and virtual network ports to each kernel program according to the resource configuration table corresponding to each kernel program; the resource configuration table corresponding to each kernel program records the minimum amounts of virtual devices, storage areas, and virtual network ports required by the current kernel program.
Optionally, the resource management module is further configured to: the access operation of the server to each kernel program is controlled.
Optionally, the resource management module is further configured to: and controlling access operation of a server to all hardware resources in the FPGA accelerator.
Optionally, the method further comprises:
and the isolation module is used for isolating the signal connection between the shell area and the kernel area.
Optionally, the isolation module is specifically configured to: and blocking signal connection between the shell area and the kernel area in the updating process of any program in the kernel area.
Optionally, the isolation module is specifically configured to: if no program in the kernel area is being updated, maintain the signal connection between the shell area and the kernel area.
In a fourth aspect, the present application provides a virtualization method, applied to any one of the above virtualization components disposed in a shell area of an FPGA accelerator, where the virtualization component includes: a PCIe device virtualization module, a memory virtualization module, a network interface virtualization module, and a resource management module;
the method comprises the following steps:
virtualizing each physical PCIe device in the FPGA accelerator into a plurality of virtual devices by utilizing the PCIe device virtualization module;
dividing hardware storage resources in the FPGA accelerator into a plurality of storage areas by using the memory virtualization module;
virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports by utilizing the network interface virtualization module;
and recording the plurality of virtual devices, the plurality of storage areas and the plurality of virtual network ports by using the resource management module, and distributing the plurality of virtual devices, the plurality of storage areas and the plurality of virtual network ports to different kernel programs in the kernel area so as to enable different virtual machines or different containers to call different kernel programs simultaneously.
Optionally, the virtualization component further includes: an isolation module;
accordingly, the isolation module is utilized to isolate the signal connection between the shell area and the kernel area.
Optionally, the isolating of the signal connection between the shell area and the kernel area by using the isolation module includes: during the update of any program in the kernel area, blocking the signal connection between the shell area and the kernel area by using the isolation module.
Optionally, the isolating of the signal connection between the shell area and the kernel area by using the isolation module includes: if no program in the kernel area is being updated, maintaining the signal connection between the shell area and the kernel area by using the isolation module.
Optionally, the virtualizing, with the PCIe device virtualization module, each physical PCIe device in the FPGA accelerator into a plurality of virtual devices includes: and creating each type of physical PCIe device in the FPGA accelerator as at least one physical management device through an SR-IOV by utilizing the PCIe device virtualization module, and virtualizing each physical management device into a plurality of virtual devices.
Optionally, the partitioning, by the memory virtualization module, the hardware storage resource in the FPGA accelerator into a plurality of storage areas includes: and dividing hardware storage resources in the FPGA accelerator into a plurality of storage areas through Virtio by utilizing the memory virtualization module.
Optionally, the method further comprises: and configuring at least one queue for each storage area by utilizing the memory virtualization module.
Optionally, the virtualizing, by using the network interface virtualization module, of each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports includes: virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports through Virtio by using the network interface virtualization module.
Optionally, the method further comprises: and configuring at least 8 queues for each virtual network port by utilizing the network interface virtualization module.
Optionally, the method further comprises: utilizing the resource management module to allocate virtual equipment, a storage area and a virtual network port for each kernel program according to a resource allocation table corresponding to each kernel program; the resource allocation table corresponding to each kernel program is recorded with: the current kernel is the least amount of usage for virtual devices, memory areas, and virtual portals.
Optionally, the method further comprises: and controlling the access operation of the server to each kernel program by utilizing the resource management module.
Optionally, the method further comprises: and controlling access operation of the server to all hardware resources in the FPGA accelerator by using the resource management module.
In a fifth aspect, the present application provides an electronic device, including:
a memory for storing a computer program;
a processor for executing the computer program to implement the previously disclosed virtualization method.
In a sixth aspect, the present application provides a readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the previously disclosed virtualization method.
In a seventh aspect, the present application provides an FPGA accelerator, comprising: a shell area and a kernel area; the shell area is provided with any one of the virtualization components described above, and the kernel area is provided with a plurality of kernel programs. The shell area can implement any one of the virtualization methods described above.
Optionally, the kernel area is further provided with a kernel program interconnection module, and the kernel program interconnection module is used for interconnecting different kernel programs.
Optionally, HBM and/or DDR disposed in the kernel area serves as the hardware storage resource of the FPGA accelerator.
In an eighth aspect, the present application provides an acceleration system comprising: a server and at least one FPGA accelerator as described above connected to the server.
As can be seen from the above solution, the present application provides a virtualization method applied to a shell area of an FPGA accelerator, the method including: determining each physical device supporting the high-speed serial computer expansion bus standard in the FPGA accelerator, and virtualizing each physical device into a plurality of virtual devices; dividing the hardware storage resource in the FPGA accelerator into a plurality of storage areas; virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports; and recording the plurality of virtual devices, the plurality of storage areas, and the plurality of virtual network ports, and distributing them to different kernel programs in a kernel area of the FPGA accelerator so that different virtual machines or different containers call different kernel programs at the same time.
Therefore, the beneficial effects of the present application are as follows: a shell area and a kernel area are designed for the FPGA accelerator, and PCIe physical devices, memory, and network interfaces are virtualized in the shell area, so that the FPGA accelerator itself can support resource virtualization; that is, resources such as network ports and storage in the FPGA accelerator can be virtualized, realizing virtualization and isolation of accelerator resources on the hardware FPGA accelerator. Meanwhile, the virtualized resources can be allocated, so that different virtualized resources are assigned to different kernel programs in the kernel area, and different virtual machines or different containers can then call different kernel programs at the same time. In this way, different virtual machines or different containers on the server simultaneously bind different resources in the same FPGA accelerator, their accesses to the same FPGA accelerator are isolated from each other, the utilization of accelerator resources is improved, and waste of accelerator resources is avoided.
Accordingly, the other subject matter provided by the present application, including the apparatus, component, device, medium, accelerator, and system, also has the technical effects described above.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only embodiments of the present application, and that a person skilled in the art can obtain other drawings from the provided drawings without inventive effort.
FIG. 1 is a flow chart of a virtualization method disclosed herein;
FIG. 2 is a schematic diagram of a virtualization apparatus disclosed herein;
FIG. 3 is a schematic diagram of a virtualized component as disclosed herein;
FIG. 4 is a flow chart of a virtualization method disclosed herein;
FIG. 5 is a schematic diagram of a design architecture of the FPGA accelerator disclosed in the present application;
FIG. 6 is a schematic diagram of an electronic device disclosed herein;
FIG. 7 is a block diagram of a server provided herein;
fig. 8 is a schematic diagram of a terminal provided in the present application;
FIG. 9 is a block diagram of an acceleration system provided herein;
Fig. 10 is a schematic diagram of an acceleration system according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. It is evident that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the present disclosure without creative effort fall within the scope of protection of the present disclosure.
At present, a hardware FPGA accelerator has no virtualization capability of its own. If accelerator resources are virtualized only by server software, the accelerator resources still cannot be truly isolated, so different virtual machines or different containers on a server must access the same FPGA accelerator sequentially; the utilization of accelerator resources is therefore low, and the computing power of the FPGA accelerator is wasted. The virtualization scheme of the present application realizes isolation and virtualization of accelerator resources on the hardware FPGA accelerator itself, improving the utilization of accelerator resources and avoiding their waste.
Referring to fig. 1, an embodiment of the application discloses a virtualization method based on an FPGA accelerator, which is applied to a shell area of the FPGA accelerator, and the method includes:
S101, determining each physical device supporting the high-speed serial computer expansion bus standard in the FPGA accelerator, and virtualizing each physical device into a plurality of virtual devices.
Here, the high-speed serial computer expansion bus standard refers to PCIe (Peripheral Component Interconnect Express).
In one embodiment, virtualizing each physical device as a plurality of virtual devices includes: determining the type of each physical device; creating at least one physical management device based on each type of physical device; each physical management device is virtualized as a number of virtual devices.
Specifically, each type of physical PCIe device in the FPGA accelerator is created as at least one physical management device (PF) using SR-IOV, and each physical management device is virtualized into a number of virtual devices (VFs). That is, each type of physical PCIe device in the FPGA accelerator can be created as a PF (Physical Function) based on the SR-IOV protocol, and several VFs (Virtual Functions) are then created for each PF. Physical PCIe devices here include, for example, memory devices and network devices. In one example, at least one PF for managing memory and at least one PF for managing the network may be created.
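The PF/VF step above can be sketched as follows. This is an illustrative model of the SR-IOV structure (one PF per device type, several VFs per PF), not real SR-IOV configuration code; the function and naming scheme are assumptions.

```python
# Hypothetical model of the SR-IOV step: group physical PCIe devices by type,
# create at least one physical management device (PF) per type, and split each
# PF into several virtual devices (VFs).
from collections import defaultdict

def create_pfs_and_vfs(physical_devices, num_vfs=4):
    """physical_devices: list of (name, type) pairs, e.g. ("hbm0", "memory")."""
    by_type = defaultdict(list)
    for name, dev_type in physical_devices:
        by_type[dev_type].append(name)
    pfs = {}
    for dev_type in by_type:
        pf_name = f"pf-{dev_type}"                      # at least one PF per type
        pfs[pf_name] = [f"{pf_name}.vf{i}" for i in range(num_vfs)]
    return pfs
```

With a memory device and a network device as input, this yields one memory-management PF and one network-management PF, each carrying its own VFs, matching the example in the paragraph above.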
S102, dividing hardware storage resources in the FPGA accelerator into a plurality of storage areas.
In one specific embodiment, after dividing the hardware storage resources in the FPGA accelerator into a plurality of storage areas, the method further includes: at least one queue is configured for each storage region.
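The storage partitioning of S102 with its per-region queue can be sketched as below. The region layout (equal-sized contiguous address ranges) and the function name are assumptions for illustration; the patent only requires a plurality of storage areas with at least one queue each.

```python
# Hypothetical sketch: split the accelerator's hardware storage into fixed
# regions and attach at least one request queue to each region.
from collections import deque

def partition_storage(total_bytes, num_regions, queues_per_region=1):
    assert queues_per_region >= 1, "each storage region needs at least one queue"
    size = total_bytes // num_regions
    return [
        {"base": i * size, "size": size,
         "queues": [deque() for _ in range(queues_per_region)]}
        for i in range(num_regions)
    ]
```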
S103, virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports.
In a specific embodiment, after virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports, the method further includes: configuring at least two queues for each virtual network port.
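The network port step of S103 can be sketched similarly. The two-queue minimum is plausibly a transmit/receive pair in the style of virtio-net queue pairs, though the patent does not say so explicitly; the function and naming below are assumptions.

```python
# Hypothetical sketch: each hardware network port is split into several virtual
# network ports, and each virtual port gets at least two queues (for example,
# one transmit queue and one receive queue).
from collections import deque

def virtualize_ports(hw_ports, vports_per_port=4, queues_per_vport=2):
    assert queues_per_vport >= 2, "each virtual port needs at least two queues"
    return {
        f"{port}.v{i}": [deque() for _ in range(queues_per_vport)]
        for port in hw_ports
        for i in range(vports_per_port)
    }
```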
S104, recording a plurality of virtual devices, a plurality of storage areas and a plurality of virtual network ports, and distributing the plurality of virtual devices, the plurality of storage areas and the plurality of virtual network ports to different kernel programs in the kernel area of the FPGA accelerator so that different virtual machines or different containers call different kernel programs at the same time.
It should be noted that the FPGA accelerator is a programmable device; after it is programmed according to this embodiment, the FPGA accelerator can support the virtualization function. Specifically, the shell area implements the basic management functions and the data channel through which the server manages the FPGA accelerator. The basic management functions include: managing the downloading of each kernel program in the kernel area, programming the Flash chip, storing the shell version used at power-on, and message communication between the management-authority driver and the user-authority driver. The data channel implements PCIe DMA (Direct Memory Access, a high-speed data transfer operation) between the server and the kernel area. Each kernel program in the kernel area has a specific function defined by the user; in general, a plurality of kernel programs can form a specific computing function in a parallel or serial manner, and the user can dynamically switch which kernel programs are used, so the FPGA accelerator has strong generality and flexibility. The kernel area can also manage the on-board DDR (Double Data Rate) interface, the on-chip high-bandwidth memory, and the high-speed serial transfer interface.
In one embodiment, allocating the plurality of virtual devices, the plurality of storage areas, and the plurality of virtual network ports to different kernel programs in the kernel area includes: querying the resource configuration table corresponding to each kernel program, which records the minimum amounts of virtual devices, storage areas, and virtual network ports consumed by that kernel program; and allocating virtual devices, storage areas, and virtual network ports to each kernel program according to its resource configuration table.
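The allocation step above can be sketched in Python. This is an illustrative assumption, not the patent's implementation: a per-kernel table records minimum consumption, and the shell draws from its recorded pools of virtualized resources.

```python
# Hypothetical sketch of table-driven resource allocation: each kernel
# program's configuration table records its minimum consumption, and the
# shell area assigns resources from the pools it has recorded.
from dataclasses import dataclass

@dataclass
class ResourceConfig:
    min_virtual_devices: int
    min_storage_areas: int
    min_virtual_ports: int

def allocate(pools, config_tables):
    """Assign virtual devices, storage areas, and virtual network ports
    to each kernel program according to its resource configuration table.

    pools: dict with keys 'devices', 'storage', 'ports', each a list of
    free virtualized resources recorded by the shell area (names assumed).
    """
    assignments = {}
    for kernel, cfg in config_tables.items():
        if (len(pools['devices']) < cfg.min_virtual_devices or
                len(pools['storage']) < cfg.min_storage_areas or
                len(pools['ports']) < cfg.min_virtual_ports):
            raise RuntimeError(f"insufficient resources for {kernel}")
        assignments[kernel] = {
            'devices': [pools['devices'].pop()
                        for _ in range(cfg.min_virtual_devices)],
            'storage': [pools['storage'].pop()
                        for _ in range(cfg.min_storage_areas)],
            'ports': [pools['ports'].pop()
                      for _ in range(cfg.min_virtual_ports)],
        }
    return assignments
```

Because only the minimum consumption is recorded, a kernel program that cannot be satisfied fails fast instead of partially binding resources.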
In this embodiment, the FPGA accelerator is plugged into the server; correspondingly, the method further comprises the steps of: if the server accesses any kernel program in the kernel area, the access operation of the server to the kernel program is controlled. And if the server accesses any hardware resource in the FPGA accelerator, controlling the access operation of the server to the hardware resource.
In one example, the method further includes: isolating the signal connection between the shell area and the kernel area according to the program update status in the kernel area. Specifically: while any program in the kernel area is being updated, the signal connection between the shell area and the kernel area is blocked; if no program in the kernel area is being updated, the signal connection between the shell area and the kernel area is maintained.
It can be seen that, in this embodiment, a shell area and a kernel area are designed for the FPGA accelerator, and PCIe physical devices, memories, and network interfaces are virtualized in the shell area, so that the FPGA accelerator itself can support resource virtualization, that is: according to the method and the device, resources such as network ports and storage in the FPGA accelerator can be virtualized, so that the virtualization and isolation of accelerator resources are realized on the hardware FPGA accelerator; meanwhile, the allocation of the virtualized resources can be realized, so that different virtualized resources are allocated to different kernel programs in the kernel area, and then different virtual machines or different containers can call different kernel programs at the same time, so that the method is realized: different virtual machines or different containers on the server bind different resources in the same FPGA accelerator at the same time, and accesses of the different virtual machines or different containers to the same FPGA accelerator are isolated from each other, so that the utilization rate of the accelerator resources is improved, and the waste of the accelerator resources is avoided.
A virtualization apparatus provided in the embodiments of the present application is described below; the virtualization apparatus described below and the virtualization method described above may be referred to in correspondence with each other.
Referring to fig. 2, an embodiment of the present application discloses a virtualization apparatus applied to a shell area of an FPGA accelerator, where the apparatus includes:
A first virtualization module 201, configured to determine, in an FPGA accelerator, each physical device supporting a high-speed serial computer expansion bus standard, and virtualize each physical device as a plurality of virtual devices;
a second virtualization module 202, configured to divide a hardware storage resource in the FPGA accelerator into a plurality of storage areas;
a third virtualization module 203, configured to virtualize each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports;
the virtual management module 204 is configured to record a plurality of virtual devices, a plurality of storage areas, and a plurality of virtual ports, and allocate the plurality of virtual devices, the plurality of storage areas, and the plurality of virtual ports to different kernel programs in a kernel area of the FPGA accelerator, so that different virtual machines or different containers call different kernel programs at the same time.
In one embodiment, the first virtualization module is specifically configured to:
determining the type of each physical device;
creating at least one physical management device based on each type of physical device;
each physical management device is virtualized as a number of virtual devices.
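As an illustrative sketch only (the module itself is FPGA logic, not host software), the three steps above — determine each device's type, create at least one physical management device per type, virtualize each into several virtual devices — can be modeled as:

```python
# Toy model of the first virtualization module's three steps; all names
# and data shapes are assumptions for illustration.
from collections import defaultdict

def build_virtual_devices(physical_devices, vfs_per_pf=4):
    by_type = defaultdict(list)
    for dev in physical_devices:          # step 1: determine each device's type
        by_type[dev['type']].append(dev)
    pfs, vfs = [], []
    for dev_type in by_type:              # step 2: one management device (PF) per type
        pf = {'type': dev_type, 'id': f'PF-{dev_type}'}
        pfs.append(pf)
        for i in range(vfs_per_pf):       # step 3: virtualize each PF into several VFs
            vfs.append({'pf': pf['id'], 'id': f'VF-{dev_type}-{i}'})
    return pfs, vfs
```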
In one embodiment, the virtual management module is specifically configured to:
querying the resource configuration table corresponding to each kernel program, which records the minimum amounts of virtual devices, storage areas, and virtual network ports consumed by that kernel program;
and allocating virtual devices, storage areas, and virtual network ports to each kernel program according to its resource configuration table.
In one specific embodiment, the FPGA accelerator is plugged into the server; accordingly, the virtual management module is further configured to:
if the server accesses any kernel program in the kernel area, the access operation of the server to the kernel program is controlled.
In one embodiment, the virtual management module is further configured to:
and if the server accesses any hardware resource in the FPGA accelerator, controlling the access operation of the server to the hardware resource.
In one specific embodiment, the method further comprises:
and the signal isolation module is used for isolating the signal connection between the shell area and the kernel area according to the program update condition in the kernel area.
In one embodiment, the signal isolation module is specifically configured to:
during the update of any program in the kernel area, the signal connection between the shell area and the kernel area is blocked.
In one embodiment, the signal isolation module is specifically configured to:
if any program in the kernel area is not updated, the signal connection between the shell area and the kernel area is maintained.
In a specific embodiment, the second virtualization module is further configured to:
At least one queue is configured for each storage region.
In a specific embodiment, the third virtualization module is further configured to:
configure at least two queues for each virtual network port.
Therefore, the embodiment can realize the virtualization and isolation of the accelerator resources on the hardware FPGA accelerator; meanwhile, the kernel area of the FPGA accelerator can realize the allocation of virtualized resources, so that different virtualized resources are allocated to different kernel programs in the kernel area, and then different virtual machines or different containers can call different kernel programs at the same time, thereby realizing the following steps: different virtual machines or different containers on the server bind different resources in the same FPGA accelerator at the same time, and accesses of the different virtual machines or different containers to the same FPGA accelerator are isolated from each other, so that the utilization rate of the accelerator resources is improved, and the waste of the accelerator resources is avoided.
Referring to fig. 3, the embodiment of the application discloses a virtualization component, which is disposed in a shell area of an FPGA accelerator, and the virtualization component includes: PCIe device virtualization module 301, memory virtualization module 302, network interface virtualization module 303, and resource management module 304.
The PCIe device virtualization module is used for virtualizing each physical PCIe device in the FPGA accelerator into a plurality of virtual devices; the memory virtualization module is used for dividing hardware storage resources in the FPGA accelerator into a plurality of storage areas; the network interface virtualization module is used for virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports; the resource management module is used for recording a plurality of virtual devices, a plurality of storage areas and a plurality of virtual network ports, and distributing the plurality of virtual devices, the plurality of storage areas and the plurality of virtual network ports to different kernel programs in the kernel area so that different virtual machines or different containers call different kernel programs at the same time.
The FPGA accelerator provided in this embodiment is implemented based on the architecture of the shell area and the kernel area, and a virtualization component including a PCIe device virtualization module, a memory virtualization module, a network interface virtualization module, and a resource management module is disposed in the shell area, so that the virtualization component can enable the FPGA accelerator itself to support resource virtualization, that is: the virtualization component can virtualize resources such as network ports, storage and the like in the FPGA accelerator, so that the virtualization of the accelerator resources is realized on the hardware FPGA accelerator, namely the isolation of the accelerator resources is realized on the hardware FPGA accelerator; meanwhile, the allocation of the virtualized resources can be realized, so that different virtualized resources are allocated to different kernel programs in the kernel area, and then different virtual machines or different containers can call different kernel programs at the same time.
In one embodiment, the PCIe device virtualization module is specifically configured to: create each type of physical PCIe device in the FPGA accelerator as at least one physical management device (PF, Physical Function) using SR-IOV, and virtualize each PF into several virtual devices (VFs, Virtual Functions). That is, each type of physical PCIe device in the FPGA accelerator can be created as a PF based on the SR-IOV protocol, and several VFs can then be created under that PF. Physical PCIe devices include, for example, memory and network devices. In one example, at least one PF for managing memory and at least one PF for managing the network may be created.
Here, a PCIe device supporting the SR-IOV technology is called a PF (Physical Function), and a virtual PCIe device generated by a PF is called a VF (Virtual Function). A PF has full PCIe device functionality and is able to generate and manage VFs, while a VF is a lightweight PCIe device. Both PFs and VFs have I/O capability and can be assigned to virtual machines as I/O devices. SR-IOV builds on device pass-through technology, so a virtual machine can use a VF directly for I/O operations, which gives very high performance; SR-IOV does not require modifying the operating system kernel, so it has good generality; and SR-IOV can generate multiple VFs, which makes the device shareable.
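A toy model of the PF/VF relationship described above, assuming the 255-VFs-per-PF cap stated later in this document; the class and its names are hypothetical:

```python
# Illustrative model (not a vendor API): a PF is a full-function PCIe
# device that generates lightweight VFs; this document caps each PF at
# 255 VFs (with PCIe ARI enabled).
class PhysicalFunction:
    MAX_VFS = 255

    def __init__(self, name):
        self.name = name
        self.vfs = []

    def enable_vfs(self, count):
        if count > self.MAX_VFS:
            raise ValueError("a PF may enable at most 255 VFs")
        # Each VF is a lightweight virtual device assignable to a VM.
        self.vfs = [f"{self.name}.vf{i}" for i in range(count)]
        return self.vfs
```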
In one embodiment, the memory virtualization module is specifically configured to: the hardware storage resources in the FPGA accelerator are divided into a plurality of storage areas by Virtio.
In one embodiment, the memory virtualization module is further configured to: at least one queue is configured for each memory region, each queue being available for configuration of the memory region and data transfer.
In one embodiment, the network interface virtualization module is specifically configured to: virtualize each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports using Virtio. The number of virtual network ports is set according to the requirements of the virtual machines or containers; at most 255 VFs may be created per PF for allocation to virtual machines. If the number of simultaneously running virtual machines is small, 1 PF for managing the network is created; if the demand is larger, more network-management PFs may be created. In practical applications of the FPGA accelerator, creating 4 PFs for managing the network is sufficient.
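The sizing rule above (at most 255 VFs per PF, more network PFs as virtual machine counts grow) reduces to simple arithmetic; a hedged helper, with function and parameter names assumed:

```python
# Back-of-envelope sizing: how many network-management PFs are needed
# for a given number of simultaneously running virtual machines, if
# each PF can expose at most 255 VFs.
import math

def network_pfs_needed(num_vms, vfs_per_pf=255):
    # At least one PF is always created for managing the network.
    return max(1, math.ceil(num_vms / vfs_per_pf))
```

With 4 PFs, up to 4 x 255 = 1020 virtual machines can be served, consistent with the observation that 4 network PFs suffice in practice.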
In one embodiment, the network interface virtualization module is further configured to: configure at least 8 queues for each virtual network port.
In one example, a kernel management module may be disposed in the shell area to implement the allocation and recording of virtual resources. In one embodiment, the resource management module is specifically configured to: allocate virtual devices, storage areas, and virtual network ports to each kernel program according to the resource configuration table corresponding to that kernel program, which records the minimum amounts of virtual devices, storage areas, and virtual network ports consumed by the kernel program.
In one embodiment, the resource management module is further configured to: the access operation of the server to each kernel program is controlled.
In one embodiment, the resource management module is further configured to: and controlling access operation of the server to all hardware resources in the FPGA accelerator.
In one embodiment, the method further comprises: and the isolation module is used for isolating signal connection between the shell area and the kernel area.
In one embodiment, the isolation module is specifically configured to: and blocking the signal connection between the shell area and the kernel area in the updating process of any program in the kernel area.
In one embodiment, the isolation module is specifically configured to: if no program in the kernel area is being updated, maintain the signal connection between the shell area and the kernel area.
Therefore, the embodiment can realize the virtualization and isolation of the accelerator resources on the hardware FPGA accelerator; meanwhile, the kernel area of the FPGA accelerator can realize the allocation of virtualized resources, so that different virtualized resources are allocated to different kernel programs in the kernel area, and then different virtual machines or different containers can call different kernel programs at the same time, thereby realizing the following steps: different virtual machines or different containers on the server bind different resources in the same FPGA accelerator at the same time, and accesses of the different virtual machines or different containers to the same FPGA accelerator are isolated from each other, so that the utilization rate of the accelerator resources is improved, and the waste of the accelerator resources is avoided.
The embodiment of the application discloses a virtualization method, which is applied to the virtualization component set in the shell area of the FPGA accelerator in the above embodiment, where the virtualization component includes: PCIe device virtualization module, memory virtualization module, network interface virtualization module, and resource management module.
Referring to fig. 4, the method provided in this embodiment includes:
S401, virtualizing each physical PCIe device in the accelerator into a plurality of virtual devices by using a PCIe device virtualization module.
In one embodiment, virtualizing each physical PCIe device in the FPGA accelerator into a plurality of virtual devices using a PCIe device virtualization module comprises: and creating each type of physical PCIe device in the FPGA accelerator as at least one physical management device through the SR-IOV by utilizing the PCIe device virtualization module, and virtualizing each physical management device as a plurality of virtual devices.
S402, dividing hardware storage resources in the accelerator into a plurality of storage areas by using a memory virtualization module.
In one embodiment, dividing the hardware storage resources in the FPGA accelerator into a plurality of storage areas using the memory virtualization module includes: dividing the hardware storage resources in the FPGA accelerator into a plurality of storage areas through Virtio by using the memory virtualization module. In one embodiment, at least one queue is configured for each storage area using the memory virtualization module. The memory virtualization module manages the memory resources in the FPGA accelerator: the memory is divided into fixed-size memory blocks (i.e., storage areas) that can be inserted into or removed from a virtual machine. Once inserted, a memory block can be used in the virtual machine like ordinary RAM. Inserted memory can only be used by that virtual machine, and a virtual machine cannot access memory that has not been inserted.
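The fixed-size memory-block model described above can be sketched as follows; the block counts, sizes, and ownership tracking are illustrative assumptions:

```python
# Minimal model of on-board memory divided into fixed-size blocks that
# are inserted into a VM before use; a VM cannot access blocks that
# were never inserted for it.
class OnboardMemory:
    def __init__(self, total_blocks, block_size_mb=256):  # size illustrative
        self.block_size_mb = block_size_mb
        self.free = set(range(total_blocks))
        self.inserted = {}                # block id -> owning VM name

    def insert(self, vm, count):
        if count > len(self.free):
            raise MemoryError("not enough free memory blocks")
        blocks = [self.free.pop() for _ in range(count)]
        for b in blocks:
            self.inserted[b] = vm         # after insertion, usable like RAM
        return blocks

    def remove(self, vm):
        released = [b for b, owner in self.inserted.items() if owner == vm]
        for b in released:
            del self.inserted[b]
            self.free.add(b)
        return released

    def can_access(self, vm, block):
        # Only the VM a block was inserted for may use it.
        return self.inserted.get(block) == vm
```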
S403, virtualizing each hardware network port resource in the accelerator into a plurality of virtual network ports by utilizing a network interface virtualization module.
In one embodiment, virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports using the network interface virtualization module includes: virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports through Virtio by using the network interface virtualization module. In one embodiment, at least 8 queues are configured for each virtual network port using the network interface virtualization module. The Virtio-net module manages the network interfaces of the FPGA accelerator. The network interface virtualization module provides network interface virtualization capability and can achieve data transmission performance close to the physical network line rate. Based on the number of network interfaces provided on the FPGA accelerator, PFs are reserved according to the principle that each network interface is managed by an independent PF; 2 PFs are reserved by default. Use of a network interface is applied for by a driver in the virtual machine.
S404, recording the plurality of virtual devices, the plurality of storage areas, and the plurality of virtual network ports by using the resource management module, and allocating them to different kernel programs in the kernel area, so that different virtual machines or different containers call different kernel programs at the same time.
In this embodiment, the resource management module is further used to control the access operation of the server to each kernel program, and the resource management module is used to control the access operation of the server to all hardware resources in the FPGA accelerator.
In one embodiment, the virtualization component further comprises: an isolation module; accordingly, the signal connection of the shell area and the core area is isolated by the isolation module. In one embodiment, isolating signal connections of a shell region and a core region with an isolation module includes: and in the updating process of any program in the kernel area, blocking the signal connection between the shell area and the kernel area by using the isolation module. In one embodiment, isolating signal connections of a shell region and a core region with an isolation module includes: if any program in the kernel area is not updated, the isolation module is utilized to provide signal connection between the shell area and the kernel area.
In this embodiment, the resource management module is further configured to allocate virtual devices, storage areas, and virtual network ports to each kernel program according to the resource configuration table corresponding to that kernel program, which records the minimum amounts of virtual devices, storage areas, and virtual network ports consumed by the kernel program.
It can be seen that, the embodiment can implement virtualization and isolation of accelerator resources on a hardware FPGA accelerator based on a virtualization component including a PCIe device virtualization module, a memory virtualization module, a network interface virtualization module, and a resource management module; meanwhile, the allocation of the virtualized resources can be realized, so that different virtualized resources are allocated to different kernel programs in the kernel area, and then different virtual machines or different containers can call different kernel programs at the same time, so that the method is realized: different virtual machines or different containers on the server bind different resources in the same FPGA accelerator at the same time, and accesses of the different virtual machines or different containers to the same FPGA accelerator are isolated from each other, so that the utilization rate of the accelerator resources is improved, and the waste of the accelerator resources is avoided.
This application is described in further detail below.
Referring to fig. 5, the shell area of the FPGA accelerator is provided with: a PCIe hard core module supporting SR-IOV (i.e., the PCIe device virtualization module), a Virtio-memory module (i.e., the memory virtualization module), a Virtio-net module (i.e., the network interface virtualization module), a MgmtPF resource management module, a kernel management module, and an IO isolation module.
The PCIe hard core module supporting SR-IOV implements a PCIe EP (End Point) device in PCIe bus mode; specifically, it implements configuration and read-write access operations for PFs and VFs, BAR space configuration and read-write access interfaces, a controller for MSI-X interrupts, and the like.
The Virtio-memory module is based on the virtual input-output (Virtio) protocol and implements virtualization of the on-board memory. For example: PF1 for managing memory is created based on Virtio, and VFs are created under PF1 for virtual machines to use on-board memory resources. Each VF is allocated at least 3 virtual queues: 1 queue for memory resource configuration requests and state queries, and the other 2 for DMA read and write operations on the on-board memory. The interface of the Virtio-memory module to downstream modules is an AXI4 bus interface, supporting high-speed read-write access to the memory interface. Thus the Virtio-memory module provides and manages the on-board memory of the FPGA accelerator for virtual machines: the memory is divided into fixed-size memory blocks that can be inserted or removed. Once inserted, a memory block can be used in the virtual machine like ordinary RAM; inserted memory can only be used by that virtual machine, and a virtual machine cannot access memory that has not been inserted. Memory application and release are requested by a driver in the virtual machine, and memory resource allocation is completed with the cooperation of the MgmtPF resource management module.
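The per-VF virtqueue layout described above (1 configuration/state queue plus 2 DMA queues) can be written down as a tiny illustrative mapping; the naming scheme is assumed:

```python
# Sketch of the minimum virtqueue layout of a Virtio-memory VF:
# vq0 carries memory configuration requests and state queries,
# vq1/vq2 carry DMA read and write traffic to the on-board memory.
def make_memory_vf_queues(vf_id):
    return {
        f"{vf_id}.vq0": "config/state",
        f"{vf_id}.vq1": "dma-read",
        f"{vf_id}.vq2": "dma-write",
    }
```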
The Virtio-net module is based on the virtual input-output protocol and implements virtualization of the network interfaces. For example: PF2 and PF3 for managing the network ports are created based on Virtio, and each is bound to one physical Ethernet port. The VFs created under each PF allow virtual machines to use the on-board Ethernet interface resources. Each VF is configured with at least 16 virtual queues for receiving and transmitting Ethernet traffic. Thus this embodiment uses the Virtio-net module to manage the network interfaces of the FPGA accelerator. The Virtio-net module provides virtual machines with network interface access capability and achieves data transmission performance close to the physical network line rate. Based on the number of network interfaces provided on the FPGA accelerator, PFs are reserved according to the principle that each network interface is managed by an independent PF; 2 PFs are reserved by default. Use of a network interface is applied for by a driver in the virtual machine, and port mapping and allocation are completed with the cooperation of the MgmtPF resource management module.
PF0 is created for the MgmtPF resource management module; management privileges are required to access it, and it operates the configuration management functions in the shell. The MgmtPF resource management module terminates the server's read-write operations on the FPGA's PF0 device and generates an AXI-Lite interface for downstream modules. Management of the on-board devices includes: a Flash controller attached to the AXI-Lite bus for reading and writing the Flash chip; a downstream HWICAP (Hardware Internal Configuration Access Port) module for downloading bit files into the kernel area, which allows the processor to access the FPGA's configuration access interface at run time and thereby change the FPGA's functions; and a downstream mailbox module for message interaction with the user driver. Thus this embodiment implements kernel program management and board management in the MgmtPF resource management module. The binding relationship between a Kernel in the kernel area and a virtual machine is established in the MgmtPF resource management module: a virtual machine that is allowed to access a given Kernel sends that Kernel its start and execution commands, and can read and write the Kernel through the AXI-Lite parameter-configuration interface (a simplified AXI bus interface); AXI (Advanced eXtensible Interface) is an advanced extensible bus interface. Correspondingly, the Kernel sends an execution-complete interrupt signal to the relevant virtual machine. The on-board memory used by a Kernel is designated at Kernel design time; correspondingly, that memory area is allocated to the virtual machine using the Kernel for memory insertion, and after insertion the memory is accessed in the same way as ordinary memory.
To use the network interface, a Kernel needs to cooperate with the AXIS interconnection Kernel (a kernel-program interconnection module). This module is customized according to the actual application and the requirements of the other Kernels; after the MgmtPF resource management module configures its interfaces, it implements multi-channel interconnection and interconnection with the user's kernel programs (i.e., kernel programs defined by the user). The MgmtPF resource management module's board management also includes: programming the on-board Flash chip, updating kernel programs in the kernel area, and managing the message interaction channel between the user driver and the MgmtPF driver.
The IO isolation module isolates the signal connection between the shell and the kernel area: it blocks the connection while the kernel area is downloading a bit file, and otherwise provides a signal transmission channel. A bit file is a binary file generated by FPGA compilation software to configure the FPGA for a specific function; its format is generally defined by the FPGA vendor and includes chip information, download instructions, FPGA configuration information, and the like.
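A minimal model of the IO isolation behavior above, assuming a simple software gate stands in for the hardware isolation logic:

```python
# Toy gate: while a bit file is being downloaded into the kernel area,
# signals between shell and kernel are blocked; otherwise a transmission
# channel is provided.
class IoIsolation:
    def __init__(self):
        self.downloading = False

    def start_download(self):
        self.downloading = True

    def finish_download(self):
        self.downloading = False

    def forward(self, signal):
        if self.downloading:
            return None          # connection blocked during the update
        return signal            # signal transmission channel provided
```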
The Kernel management module in the shell area builds the configuration table of resources such as memory and network interfaces used by each user Kernel; after a user driver requests the corresponding resources, they are allocated for use by different virtual machines. Note that the Virtio protocol supports virtualization of network, storage, memory, and other devices, so that virtual machines in a computer system can use and manage virtual devices through standard interfaces.
In the kernel area, setting: various modules such as Kernel programs, on-board memory controllers, AXIS interconnection Kernel, and the like.
The on-board memory controller may be a DDR memory controller and/or an HBM (High Bandwidth Memory) controller. A user Kernel is a program developed by the user for a specific application and supports at least one of the AXI-Lite, AXI4, and AXIS interfaces.
The AXIS interconnection Kernel is required whenever an Ethernet interface is used; it interconnects the AXIS bus, the user Kernels, and the Ethernet interface module. Its concrete implementation follows the requirements of the user Kernels; placing it outside the shell saves resources and allows it to flexibly adapt to the development needs of various Kernels. The module's basic framework is: received frames are first screened and forwarded to a user Kernel or to the shell; for transmitted frames, multi-port data is multiplexed, and frames output from the Kernels or the shell are merged onto the final physical port for output.
In this embodiment, virtualization of PCIe interface devices is implemented using the SR-IOV technique. At least 4 PFs are constructed based on the PCIe bridging capability provided by the FPGA's PCIe hard core; PF0 is used for board management, while the other PFs have VF enabling capability so that virtual machines or containers can directly access FPGA resources. The ARI (Alternative Routing-ID Interpretation) feature of PCIe is enabled, so that up to 255 VFs can be enabled per PF. By default, PF1 manages read-write access to the on-board storage resources, and PF2 and PF3 manage data transmission for the 2 on-board Ethernet ports. The number of PFs can be increased for resource access of specific functions according to the service needs of the FPGA accelerator.
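The default PF layout described above can be tabulated; the role strings paraphrase this document, and the lookup helper is hypothetical:

```python
# Default PF layout of the FPGA accelerator in this embodiment:
# PF0 manages the board, PF1 manages on-board storage, PF2/PF3 are
# each bound to one of the two on-board Ethernet ports.
DEFAULT_PF_LAYOUT = {
    0: "management (MgmtPF)",
    1: "on-board storage (Virtio-memory)",
    2: "Ethernet port 0 (Virtio-net)",
    3: "Ethernet port 1 (Virtio-net)",
}

def pf_role(pf_index):
    # PFs beyond the default four may be added for service-specific needs.
    return DEFAULT_PF_LAYOUT.get(pf_index, "service-specific extension PF")
```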
After the FPGA accelerator is implemented according to this embodiment, the process of applying it to a server includes: after the FPGA accelerator is plugged into a server, the server installs the FPGA accelerator's driver and the runtime package required for operation. When the server is powered on or restarted, all shell programs stored in the Flash chip on the FPGA accelerator board are loaded, and the server driver loads correctly after recognizing the FPGA accelerator. With administrator privileges on the server side, the bit file of the kernel area in the FPGA accelerator is loaded, the resources of the available user Kernels are checked, the resource configuration table is configured, and the available Kernels in the FPGA accelerator are presented to users. If network interface resources are to be used, the AXIS interconnection Kernel must be configured synchronously. The FPGA accelerator then waits for users to apply for resources, allocating the available Kernel resources for their use. When a user releases an applied resource, that resource waits to be allocated to the next applicant. If the FPGA accelerator is shut down, authorization responses to user resource applications stop and the user Kernels stop running; after a new bit file is installed on the FPGA accelerator, the FPGA accelerator service is restarted.
In one example, the main flow for a user to use a user Kernel in the FPGA accelerator includes: the user installs the FPGA accelerator driver in the virtual machine or container, and after the virtual machine starts it can discover the available FPGA accelerators. The needed Kernel is searched for among the available FPGA accelerators; if an available Kernel is found, the flow continues, otherwise the program exits. The virtual machine applies to the server for a certain Kernel and, upon confirmation, obtains the Kernel and the usage rights to the related resource information, such as memory and network ports. If memory resources are needed, the Virtio-memory driver is used to apply for memory loading, and the memory is used to input data to or output data from the FPGA accelerator; if a network interface is needed, the Virtio-net driver is used to apply for it. Once the required resource applications succeed, Kernel execution starts: related instructions and data are issued to the chosen Kernel through the kernel resource management module to run. On completion, an interrupt notification is received and the operation result is read. If a resource application fails, or a resource lease renewal fails, the existing resources are released and the program exits.
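The user flow above can be sketched as a pseudo-client; every API used here (find_accelerator, apply_for_kernel, load_memory, and so on) is a hypothetical stand-in for the drivers described in this document, not a real interface:

```python
# Pseudo-client for the user-Kernel flow: find an accelerator, locate
# the needed Kernel, apply for it and its resources, run, then release
# resources on completion or failure.
def run_user_kernel(server, kernel_name, data, need_memory=False, need_net=False):
    acc = server.find_accelerator()
    if acc is None or kernel_name not in acc.available_kernels():
        return None                       # no usable Kernel: exit
    grant = server.apply_for_kernel(acc, kernel_name)
    try:
        if need_memory:
            grant.load_memory()           # via the Virtio-memory driver
        if need_net:
            grant.request_network()       # via the Virtio-net driver
        result = grant.execute(data)      # an interrupt signals completion
        return result
    finally:
        grant.release()                   # release resources on exit
```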
In this embodiment, the FPGA shell is virtualized so that the FPGA accelerator suits a virtualized environment and a virtual machine or container can directly access the accelerator's computing resources. Based on SR-IOV over the PCIe interface, multiple PFs and VFs are presented, forming scalable virtual devices that virtual machines or containers can bind to and use. Network and memory devices are virtualized based on Virtio, associating the storage resources and computing resources (provided in kernel form) in the FPGA with virtqueues. A resource management module on the management PF (Mgmt PF) manages the use of FPGA board resources and kernels. In the kernel area, users aggregate computing resources into function groups through customized kernels, which are provided to virtual machines, realizing resource sharing.
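The SR-IOV layout sketched above, in which each managed resource type is exposed through a physical function (PF) and each PF presents several virtual functions (VFs) for virtual machines or containers to bind, can be illustrated as a simple enumeration. The PF names and VF counts below are illustrative only.

```python
# A minimal sketch of an SR-IOV-style PF/VF view: each resource type gets
# a PF, and each PF exposes several VFs that a VM or container can bind.
def build_sriov_view(pf_types, vfs_per_pf):
    """Return {pf_name: [vf names]} for each managed resource type."""
    view = {}
    for pf in pf_types:
        view[pf] = [f"{pf}.vf{i}" for i in range(vfs_per_pf)]
    return view
```

Scaling the number of users then amounts to raising `vfs_per_pf`, which mirrors the VF-count expansion described later in this embodiment.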
This method enables the heterogeneous FPGA accelerator to support virtualization, allowing multiple virtual machines or containers to access the FPGA accelerator device simultaneously and share its computing resources. In this embodiment, the virtio-memory module and the virtio-net module are placed in the shell of the FPGA accelerator, and because the data channel is fully hardened, virtualization is achieved with almost no impact on accelerator performance. Different types of resources are managed through PFs, and the number of VFs is scaled to match the number of virtual machine and container users, meeting the accelerator's scaling requirements in both resource types and user counts. Isolation and interconnection between kernels are supported, and kernels and the resources they require can be managed effectively.
For the heterogeneous FPGA accelerator, the design architecture is divided into a shell area and a dynamic kernel area. Support for the SR-IOV and Virtio protocols can thus be added conveniently in the FPGA shell implementation, giving the FPGA accelerator device-level virtualization, improving the performance experience through multi-user sharing, and raising the sharing efficiency of the computing resources in the FPGA.
An electronic device provided in an embodiment of the present application is described below; the electronic device described below and the method described above may be referred to in correspondence with each other.
Referring to fig. 6, an embodiment of the present application discloses an electronic device, including:
a memory 601 for storing a computer program;
a processor 602 for executing the computer program to implement the method disclosed in any of the embodiments above.
Further, an embodiment of the present application also provides an electronic device. The electronic device may be the server 50 shown in fig. 7 or the terminal 60 shown in fig. 8. Fig. 7 and 8 are each a block diagram of an electronic device according to an exemplary embodiment, and their contents should not be construed as limiting the scope of the present application in any way.
Fig. 7 is a schematic structural diagram of a server according to an embodiment of the present application. The server 50 may specifically include: at least one processor 51, at least one memory 52, a power supply 53, a communication interface 54, an input/output interface 55, and a communication bus 56. The memory 52 is configured to store a computer program that is loaded and executed by the processor 51 to implement the relevant steps of the virtualization method disclosed in any of the foregoing embodiments.
In this embodiment, the power supply 53 provides an operating voltage for each hardware device on the server 50; the communication interface 54 creates a data transmission channel between the server 50 and external devices, following any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 55 acquires external input data or outputs data to external devices, and its specific interface type may be selected according to application needs and is not limited herein.
The memory 52 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, and the resources stored thereon include an operating system 521, a computer program 522, and data 523, and the storage may be temporary storage or permanent storage.
The operating system 521 manages and controls the hardware devices on the server 50 and the computer program 522, so that the processor 51 can operate on and process the data 523 in the memory 52; it may be Windows Server, Netware, Unix, Linux, or the like. The computer program 522 may further include, in addition to the computer program capable of performing the virtualization method disclosed in any of the foregoing embodiments, computer programs for other specific tasks. The data 523 may include, in addition to data such as application update information, data such as application developer information.
Fig. 8 is a schematic structural diagram of a terminal provided in an embodiment of the present application, and the terminal 60 may specifically include, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like.
Generally, the terminal 60 in this embodiment includes: a processor 61 and a memory 62.
The processor 61 may include one or more processing cores, for example a 4-core or 8-core processor. The processor 61 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). The processor 61 may also include a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), processes data in the awake state, while the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 61 may integrate a GPU (Graphics Processing Unit) responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 61 may also include an AI (Artificial Intelligence) processor for handling machine-learning computations.
The memory 62 may include one or more computer-readable storage media, which may be non-transitory. The memory 62 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In this embodiment, the memory 62 at least stores a computer program 621 which, when loaded and executed by the processor 61, implements the relevant steps of the virtualization method performed on the terminal side disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 62 may include an operating system 622, data 623, and the like, stored transiently or permanently. The operating system 622 may include Windows, Unix, Linux, and the like. The data 623 may include, but is not limited to, application update information.
In some embodiments, the terminal 60 may further include a display 63, an input-output interface 64, a communication interface 65, a sensor 66, a power supply 67, and a communication bus 68.
Those skilled in the art will appreciate that the structure shown in fig. 8 is not limiting of the terminal 60 and may include more or fewer components than shown.
A readable storage medium provided in an embodiment of the present application is described below; the readable storage medium described below and the method described above may be referred to in correspondence with each other.
The embodiment of the application discloses a readable storage medium for storing a computer program, wherein the computer program realizes the virtualization method disclosed in the previous embodiment when being executed by a processor. The readable storage medium is a computer readable storage medium, and can be used as a carrier for storing resources, such as read-only memory, random access memory, magnetic disk or optical disk, wherein the resources stored on the readable storage medium comprise an operating system, a computer program, data and the like, and the storage mode can be transient storage or permanent storage.
An FPGA accelerator provided in an embodiment of the present application is described below; the FPGA accelerator described below and the method described above may be referred to in correspondence with each other.
The embodiment of the application discloses an FPGA accelerator, comprising: a shell area and a kernel area; the shell area is provided with the virtualized component described in any embodiment, and the kernel area is provided with a plurality of kernel programs.
An acceleration system provided in an embodiment of the present application is described below; the acceleration system described below and the method described above may be referred to in correspondence with each other.
An embodiment of the present application discloses an acceleration system, including: a server and at least one FPGA accelerator as in any of the embodiments above connected to the server.
In this embodiment, the virtualization component in the shell area of the FPGA accelerator includes: a PCIe device virtualization module for virtualizing each physical PCIe device in the FPGA accelerator into a plurality of virtual devices; a memory virtualization module for dividing the hardware storage resources in the FPGA accelerator into a plurality of storage areas; a network interface virtualization module for virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports; and a resource management module for recording the virtual devices, storage areas, and virtual network ports and allocating them to different kernel programs in the kernel area, so that different virtual machines or containers can call different kernel programs at the same time.
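The resource management module described above can be sketched as an allocator that draws from the pools produced by the three virtualization modules and grants each kernel program the amounts recorded in its resource configuration table. The data layout and names below are a hedged illustration, not the patent's implementation.

```python
# Hedged sketch of the resource management module: it records the virtual
# devices, storage areas, and virtual network ports, and assigns each kernel
# program the quantities recorded in its resource configuration table.
def allocate_for_kernels(pool, config_table):
    """pool: {'vdev': [...], 'storage': [...], 'netport': [...]}
    config_table: {kernel: {'vdev': n, 'storage': n, 'netport': n}}
    Returns {kernel: {resource_type: [assigned items]}}; raises if short."""
    free = {rtype: list(items) for rtype, items in pool.items()}
    plan = {}
    for kernel, needs in config_table.items():
        grant = {}
        for rtype, count in needs.items():
            if len(free[rtype]) < count:
                raise RuntimeError(f"not enough {rtype} for {kernel}")
            # hand the kernel its configured share of this resource type
            grant[rtype] = [free[rtype].pop(0) for _ in range(count)]
        plan[kernel] = grant
    return plan
```

Because each kernel receives a disjoint slice of the pools, different virtual machines or containers bound to different kernels can run at the same time without contending for the same virtual device, storage area, or network port.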
Referring to fig. 9, the FPGA accelerator is plugged into a slot of the host (server); the programs participating in the operation of the FPGA accelerator are provided on the host side, and the FPGA accelerator provides hardware resources such as memory and network ports. The corresponding system architecture diagram is shown in fig. 10.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of readable storage medium known in the art.
The principles and embodiments of the present application are described herein with specific examples; the above examples are provided only to assist in understanding the method of the present application and its core ideas. Meanwhile, those skilled in the art may make modifications to the specific embodiments and application scope in accordance with the ideas of the present application; in view of the above, the contents of this description should not be construed as limiting the present application.

Claims (20)

1. A virtualization method, applied to a shell area of an FPGA accelerator, the method comprising:
determining each physical device supporting a high-speed serial computer expansion bus standard in the FPGA accelerator, and virtualizing each physical device into a plurality of virtual devices;
dividing a hardware storage resource in the FPGA accelerator into a plurality of storage areas;
virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports;
recording the plurality of virtual devices, the plurality of storage areas and the plurality of virtual network ports, and distributing the plurality of virtual devices, the plurality of storage areas and the plurality of virtual network ports to different kernel programs in a kernel area of the FPGA accelerator, so that different virtual machines or different containers call different kernel programs at the same time;
wherein the allocating the plurality of virtual devices, the plurality of storage areas, and the plurality of virtual portals to different kernel programs in the kernel area includes:
allocating a virtual device, a storage area and a virtual network port to each kernel program according to a resource configuration table corresponding to each kernel program; wherein the resource configuration table corresponding to each kernel program records: the minimum consumption of virtual devices, storage areas and virtual network ports by the current kernel program;
wherein the virtualizing each physical device into a plurality of virtual devices comprises:
creating each type of physical PCIe device in the FPGA accelerator as at least one physical management device by using an SR-IOV, and virtualizing each physical management device into a plurality of virtual devices;
the method for dividing the hardware storage resources in the FPGA accelerator into a plurality of storage areas comprises the following steps:
dividing a hardware storage resource in the FPGA accelerator into a plurality of storage areas by using Virtio;
the virtualizing each hardware portal resource in the FPGA accelerator into a plurality of virtual portals includes:
and virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports by utilizing Virtio.
2. The method of claim 1, wherein virtualizing each physical device as a plurality of virtual devices comprises:
determining the type of each physical device;
creating at least one physical management device based on each type of physical device;
each physical management device is virtualized as a number of virtual devices.
3. The method of claim 1, wherein the resource configuration table corresponding to each kernel program is queried.
4. The method of claim 1, wherein the FPGA accelerator is plugged into a server;
correspondingly, the method further comprises the steps of:
and if the server accesses any kernel program in the kernel area, controlling the access operation of the server to the kernel program.
5. The method as recited in claim 4, further comprising:
and if the server accesses any hardware resource in the FPGA accelerator, controlling the access operation of the server to the hardware resource.
6. The method according to any one of claims 1 to 5, further comprising:
and isolating the signal connection between the shell area and the kernel area according to the program updating condition in the kernel area.
7. The method of claim 6, wherein said isolating the signal connection of the shell region to the kernel region based on program updates in the kernel region comprises:
and blocking the signal connection between the shell area and the kernel area in the updating process of any program in the kernel area.
8. The method of claim 6, wherein said isolating the signal connection of the shell region to the kernel region based on program updates in the kernel region comprises:
If any program in the kernel area is not updated, the signal connection between the shell area and the kernel area is maintained.
9. The method of claim 1, wherein after dividing the hardware memory resources in the FPGA accelerator into a plurality of memory areas, further comprising:
configuration for each storage area at least one queue.
10. The method of claim 1, wherein after virtualizing each hardware portal resource in the FPGA accelerator into a plurality of virtual portals, further comprising:
at least two queues are configured for each virtual portal.
11. A virtualization apparatus, applied to a shell area of an FPGA accelerator, the apparatus comprising:
the first virtualization module is used for determining all physical devices supporting a high-speed serial computer expansion bus standard in the FPGA accelerator, and virtualizing each physical device into a plurality of virtual devices;
the second virtualization module is used for dividing hardware storage resources in the FPGA accelerator into a plurality of storage areas;
the third virtualization module is used for virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports;
The virtual management module is used for recording the plurality of virtual devices, the plurality of storage areas and the plurality of virtual network ports, and distributing the plurality of virtual devices, the plurality of storage areas and the plurality of virtual network ports to different kernel programs in a kernel area of the FPGA accelerator so that different virtual machines or different containers call different kernel programs at the same time;
the virtual management module is specifically configured to:
allocating a virtual device, a storage area and a virtual network port to each kernel program according to a resource configuration table corresponding to each kernel program; wherein the resource configuration table corresponding to each kernel program records: the minimum consumption of virtual devices, storage areas and virtual network ports by the current kernel program;
the first virtualization module is specifically configured to:
creating each type of physical PCIe device in the FPGA accelerator as at least one physical management device by using an SR-IOV, and virtualizing each physical management device into a plurality of virtual devices;
the second virtualization module is specifically configured to:
dividing a hardware storage resource in the FPGA accelerator into a plurality of storage areas by using Virtio;
the third virtualization module is specifically configured to:
And virtualizing each hardware network port resource in the FPGA accelerator into a plurality of virtual network ports by utilizing Virtio.
12. The apparatus of claim 11, wherein the first virtualization module is specifically configured to:
determining the type of each physical device;
creating at least one physical management device based on each type of physical device;
each physical management device is virtualized as a number of virtual devices.
13. The apparatus of claim 11, wherein the FPGA accelerator is plugged into a server;
correspondingly, the virtual management module is further configured to: and if the server accesses any kernel program in the kernel area, controlling the access operation of the server to the kernel program.
14. The apparatus of claim 13, wherein the virtual management module is further configured to:
and if the server accesses any hardware resource in the FPGA accelerator, controlling the access operation of the server to the hardware resource.
15. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the method of any one of claims 1 to 10.
16. A readable storage medium for storing a computer program, wherein the computer program when executed by a processor implements the method of any one of claims 1 to 10.
17. An FPGA accelerator, comprising: a shell area and a kernel area; wherein the shell area implements the method according to any one of claims 1 to 10, and the kernel area is provided with a plurality of kernel programs.
18. The FPGA accelerator of claim 17, wherein the kernel area is further provided with a kernel interconnection module, and the kernel interconnection module is configured to interconnect different kernel programs.
19. The FPGA accelerator of claim 17, wherein HBM and/or DDR disposed on the core area serve as hardware memory resources of the FPGA accelerator.
20. An acceleration system, comprising: a server and at least one FPGA accelerator as claimed in claim 17 connected to the server.
CN202310233967.3A 2023-03-13 2023-03-13 Virtualization method, device, equipment, medium, accelerator and system Active CN116069451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310233967.3A CN116069451B (en) 2023-03-13 2023-03-13 Virtualization method, device, equipment, medium, accelerator and system

Publications (2)

Publication Number Publication Date
CN116069451A CN116069451A (en) 2023-05-05
CN116069451B (en) 2023-06-16

Family

ID=86169981

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310233967.3A Active CN116069451B (en) 2023-03-13 2023-03-13 Virtualization method, device, equipment, medium, accelerator and system

Country Status (1)

Country Link
CN (1) CN116069451B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107894913B (en) * 2016-09-30 2022-05-13 超聚变数字技术有限公司 Computer system and storage access device
CN107404350B (en) * 2017-07-31 2019-11-08 北京邮电大学 Satellite network simulation method, apparatus, electronic equipment and readable storage medium storing program for executing
CN113296884B (en) * 2021-02-26 2022-04-22 阿里巴巴集团控股有限公司 Virtualization method, virtualization device, electronic equipment, virtualization medium and resource virtualization system

Also Published As

Publication number Publication date
CN116069451A (en) 2023-05-05

Similar Documents

Publication Publication Date Title
CN107690622B9 (en) Method, equipment and system for realizing hardware acceleration processing
US10191759B2 (en) Apparatus and method for scheduling graphics processing unit workloads from virtual machines
US9798565B2 (en) Data processing system and method having an operating system that communicates with an accelerator independently of a hypervisor
US11301140B2 (en) Configuring parameters of non-volatile memory target subsystems for workload request quality of service
CA2933712C (en) Resource processing method, operating system, and device
US9454397B2 (en) Data processing systems
US8930568B1 (en) Method and apparatus for enabling access to storage
CN105786589A (en) Cloud rendering system, server and method
EP2375324A2 (en) Virtualization apparatus for providing a transactional input/output interface
US10719333B2 (en) BIOS startup method and apparatus
CN111078353A (en) Operation method of storage equipment and physical server
KR20200001208A (en) Convergence Semiconductor Apparatus and Operation Method Thereof, Stacked Memory Apparatus Having the Same
CN115629882A (en) Method for managing memory in multiple processes
CN109408226A (en) Data processing method, device and terminal device
CN116069451B (en) Virtualization method, device, equipment, medium, accelerator and system
CN115470163A (en) Control method, control device, control equipment and storage medium for DMA transmission
CN115809158A (en) Double-system multi-channel memory sharing method for vehicle-mounted cabin entertainment system
CN108228496B (en) Direct memory access memory management method and device and master control equipment
CN116909689B (en) Virtual machine thermomigration method and device, storage medium and electronic equipment
CN116400982B (en) Method and apparatus for configuring relay register module, computing device and readable medium
CN117112466B (en) Data processing method, device, equipment, storage medium and distributed cluster
CN114816648A (en) Computing device and computing method
CN112148434A (en) Micro-kernel virtual machine communication method and device based on Loongson host environment and Loongson host

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant