AU2015203452B2 - Virtualization processing method and apparatuses, and computer system - Google Patents

Info

Publication number
AU2015203452B2
AU2015203452B2 AU2015203452A AU2015203452A AU2015203452B2 AU 2015203452 B2 AU2015203452 B2 AU 2015203452B2 AU 2015203452 A AU2015203452 A AU 2015203452A AU 2015203452 A AU2015203452 A AU 2015203452A AU 2015203452 B2 AU2015203452 B2 AU 2015203452B2
Authority
AU
Australia
Prior art keywords
host
cache
dma
virtual
instance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
AU2015203452A
Other versions
AU2015203452A1 (en
Inventor
Feng Wang
Xiaowei Yang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2012250375A external-priority patent/AU2012250375B2/en
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to AU2015203452A priority Critical patent/AU2015203452B2/en
Publication of AU2015203452A1 publication Critical patent/AU2015203452A1/en
Application granted granted Critical
Publication of AU2015203452B2 publication Critical patent/AU2015203452B2/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Stored Programmes (AREA)

Abstract

A virtualization processing method and apparatuses, and a computer system are provided. A computing node includes: a hardware layer, a Host running on the hardware layer, and at least one virtual machine VM running on the Host. The hardware layer includes an I/O device, several corresponding virtual function VF devices are virtualized from the I/O device, the Host has several VF software instances, and the several VF software instances and the several VF devices are in one-to-one correspondence; the Host further has a back-end instance BE of an I/O virtual device having the same type as the I/O device, and the VM has a front-end instance FE of the I/O virtual device; the BE in the Host is bound with an idle VF software instance. The solutions of the embodiments of the present invention are beneficial to optimization of the performance and compatibility of a virtualization system.

[FIG. 2-a: Host, BE of an I/O virtual device, VF software instance, FE of an I/O virtual device, virtual machine VM n, input/output memory management unit IOMMU, input/output I/O device, hardware layer]

Description

VIRTUALIZATION PROCESSING METHOD AND APPARATUSES, AND COMPUTER SYSTEM
2015203452 22 Jun 2015
FIELD OF THE INVENTION
[0001] The present invention relates to the field of computer technologies, and in particular, to a virtualization processing method and apparatuses, and a computer system.
BACKGROUND OF THE INVENTION
[0002] Virtualization technology is a decoupling technology that separates the underlying hardware devices from the upper-layer operating system and application programs. Referring to FIG. 1, a virtual machine monitor (VMM, Virtual Machine Monitor) layer is introduced to directly manage the underlying hardware resources and to create virtual machines (VM, Virtual Machine) that are independent of the underlying hardware, for use by the upper-layer operating system and application programs.
[0003] The virtualization technology, as one of the important supporting technologies of the currently popular cloud computing (Cloud Computing) platform, can greatly improve the resource utilization efficiency of a physical device. Compared with a conventional physical server, the virtual machine has better isolation and encapsulation, and information of the whole virtual machine can be saved in a virtual disk image (VDI, Virtual Disk Image), so as to conveniently perform operations such as snapshot, backup, cloning and delivering for the virtual machine.
[0004] With the evolution of the x86 processor, the virtualization technology of a central processing unit (CPU, Central Processing Unit) and a memory is increasingly perfected, with ever-decreasing overhead. Based on the latest processor, the overhead of virtualization of CPU and memory for most applications has been less than 10%. In the input/output (I/O, Input/Output) virtualization field, an I/O virtualization solution with high performance and low delay is still a key technical challenge. The conventional I/O virtualization solutions are of two types, namely, a software solution and a hardware solution, and each has prominent advantages and disadvantages. For example, the conventional software solution is advantageous in compatibility, but has great performance loss; the hardware solution can bring about desired performance, but has problems such as feature compatibility and guest operating system (Guest OS) compatibility.
SUMMARY OF THE INVENTION
[0005] Embodiments of the present invention provide a virtualization processing method and apparatuses, and a computer system, so as to optimize performance and compatibility of a virtualization system.
[0006] In order to solve the foregoing technical problems, the embodiments of the present invention provide the following technical solutions.
[0007] In one aspect, an embodiment of the present invention provides a virtualization processing method, which is applied to a computing node, and the computing node includes:
a hardware layer, a Host running on the hardware layer, and at least one virtual machine VM running on the Host, where the hardware layer includes an input/output I/O device, several corresponding virtual function VF devices are virtualized from the I/O device, the Host has several VF software instances, the several VF software instances and the several VF devices are in one-to-one correspondence; the Host further has a back-end instance BE of an I/O virtual device having the same type as the I/O device, the VM has a front-end instance FE of the I/O virtual device; the BE in the Host is bound with an idle VF software instance;
the method includes:
pre-allocating, by the FE, a cache for direct memory access DMA;
acquiring, by the VF software instance bound with the BE, an address corresponding to the cache for DMA through an exporting application programming interface of the BE, and writing the acquired address corresponding to the cache for DMA into a first storage unit of a VF device corresponding to the VF software instance;
selecting, by the VF device, an address corresponding to the cache for DMA from the first storage unit when there is data to be received, and initiating a DMA write request by using the selected address corresponding to the cache for DMA as a target address; and
notifying, by the VF device, the VF software instance which is corresponding to the VF device and is in the Host after the DMA write request is executed, so that the VF software instance triggers the FE to receive data written into the cache corresponding to the address.
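The receive path of paragraph [0007] can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the claimed implementation: the class names, the `rx_ring` queue standing in for the "first storage unit", the `export_dma_addresses` method standing in for the exporting application programming interface of the BE, and the buffer addresses are all hypothetical, and address translation between the VM and the Host is deliberately left out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FrontEnd:
    """FE in the VM: owns the pre-allocated DMA buffers (hypothetical model)."""
    buffers: dict = field(default_factory=dict)    # address -> bytearray
    received: list = field(default_factory=list)

    def preallocate(self, base: int, count: int, size: int) -> None:
        # Pre-allocate `count` buffers of `size` bytes at consecutive addresses.
        for i in range(count):
            self.buffers[base + i * size] = bytearray(size)

    def on_data(self, address: int, length: int) -> None:
        # Triggered by the VF software instance once the DMA write has finished.
        self.received.append(bytes(self.buffers[address][:length]))


@dataclass
class BackEnd:
    """BE in the Host: exposes the FE's buffer addresses (assumed API)."""
    fe: FrontEnd

    def export_dma_addresses(self) -> list:
        return list(self.fe.buffers)


@dataclass
class VFDevice:
    """VF device: rx_ring stands in for the 'first storage unit'."""
    rx_ring: deque = field(default_factory=deque)

    def receive(self, payload: bytes, memory: dict, notify) -> None:
        address = self.rx_ring.popleft()           # select a posted address
        buffer = memory[address]
        if len(payload) > len(buffer):
            raise ValueError("payload larger than DMA buffer")
        buffer[:len(payload)] = payload            # simulated DMA write
        notify(address, len(payload))              # completion notification


@dataclass
class VFInstance:
    """VF software instance in the Host, bound with one BE."""
    be: BackEnd
    device: VFDevice

    def post_buffers(self) -> None:
        # Acquire the addresses through the BE and write them into the device.
        self.device.rx_ring.extend(self.be.export_dma_addresses())

    def on_dma_complete(self, address: int, length: int) -> None:
        self.be.fe.on_data(address, length)


def demo_receive(payload: bytes = b"hello"):
    fe = FrontEnd()
    fe.preallocate(base=0x1000, count=2, size=64)
    device = VFDevice()
    vf = VFInstance(BackEnd(fe), device)
    vf.post_buffers()
    device.receive(payload, fe.buffers, vf.on_dma_complete)
    return fe.received, list(device.rx_ring)
```

Calling `demo_receive()` walks through the four steps in order: pre-allocation by the FE, posting of the addresses by the VF software instance, the DMA write by the VF device, and the notification that lets the FE pick up the data.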
[0008] In another aspect, an embodiment of the present invention further provides a virtualization processing method, including:
after an I/O virtual function of an input/output I/O device is enabled, generating several VF software instances in a Host; where several corresponding virtual function VF devices are virtualized from the I/O device with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence;
creating, by the Host, an I/O virtual device having the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host, a front-end instance FE of the I/O virtual device is created in the initiated virtual machine VM; and
binding the BE with an idle VF software instance.
[0009] In another aspect, an embodiment of the present invention further provides a computing node, including: a hardware layer, a Host running on the hardware layer, and at least one virtual machine VM running on the Host, where the hardware layer includes an input/output I/O device, several corresponding virtual function VF devices are virtualized from the I/O device, the Host has several VF software instances, the several VF software instances and the several VF devices are in one-to-one correspondence; the Host further has a back-end instance BE of an I/O virtual device having the same type as the I/O device, the VM has a front-end instance FE of the I/O virtual device; the BE in the Host is bound with an idle VF software instance;
where, the FE is configured to pre-allocate a cache for direct memory access DMA;
the VF software instance bound with the BE is configured to acquire an address corresponding to the cache for DMA through an exporting application programming interface of the BE, and write the acquired address corresponding to the cache for DMA into a first storage unit in a VF device corresponding to the VF software instance;
the VF device is configured to select the address corresponding to the cache for DMA from the first storage unit when there is data to be received, and initiate a DMA write request by using the selected address corresponding to the cache for DMA as a target address; and notify the VF software instance which is corresponding to the VF device and is in the Host after the DMA write request is executed, so that the VF software instance triggers the FE to receive data written into the cache corresponding to the address.
[0010] In another aspect, an embodiment of the present invention further provides a host, including:
a first creating module, configured to, after an I/O virtual function of an input/output I/O device is enabled, generate several VF software instances in the Host; where several corresponding virtual function VF devices are virtualized from the I/O device with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence;
a second creating module, configured to create an I/O virtual device having the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host, a front-end instance FE of the I/O virtual device is created in the initiated virtual machine VM; and
a binding module, configured to bind the BE created by the second creating module with an idle VF software instance created by the first creating module.
[0010a] In another aspect, an embodiment of the present invention further provides a virtualization processing method, applied in a computing node, wherein the computing node includes a hardware layer having an input/output (I/O) device, and a Host layer, the method including:
generating, after an input/output (I/O) virtual function (VF) of the I/O device is enabled, a plurality of VF software instances in the Host layer, wherein a plurality of corresponding VF devices are virtualized from the I/O device with the I/O VF enabled, and the plurality of VF software instances are in one-to-one correspondence with the plurality of VF devices;
creating, in the Host layer, an I/O virtual device having a same type as the I/O device, creating a back-end instance (BE) of the I/O virtual device in the Host layer, and a front-end instance (FE) of the I/O virtual device in an initiated virtual machine (VM) running on the Host layer;
binding the BE on a one-to-one basis with an idle VF software instance;
acquiring, by the VF software instance bound with the BE, an address corresponding to a cache for direct memory access (DMA) through an exporting application programming interface of the BE;
writing the acquired address corresponding to the cache for DMA into a first storage unit of a VF device corresponding to the VF software instance;
selecting, by the VF device, an address corresponding to the cache for DMA from the first storage unit when there is data to be received; and
initiating a DMA write request by using the selected address corresponding to the cache for DMA as a target address.
[0010b] In another aspect, the embodiment of the present invention further provides a host, running on a hardware layer, wherein the hardware layer includes an input/output (I/O) device, the host including:
a first creating module, configured to, after an I/O virtual function of the input/output (I/O) device is enabled, generate a plurality of VF software instances; wherein a plurality of corresponding virtual function VF devices are virtualized from the I/O device with the I/O virtual function enabled; wherein each of the plurality of generated VF software instances corresponds to a different VF device of the plurality of VF devices;
a second creating module, configured to: create an I/O virtual device having a same type as the I/O device, create a back-end instance BE of the I/O virtual device, and create a front-end instance FE of the I/O virtual device in an initiated virtual machine VM running on the host; and
a binding module, configured to bind the BE created by the second creating module on a one-to-one basis with an idle VF software instance created by the first creating module;
wherein the binding module performing the bind of the BE created by the second creating module on a one-to-one basis with an idle VF software instance created by the first creating module is configured to:
acquire an address corresponding to a cache for direct memory access (DMA) through an exporting application programming interface (API) of the BE; and
write the acquired address corresponding to the cache for DMA into a first storage unit in a VF device corresponding to the VF software instance; and
wherein the binding module performing the bind of the BE created by the second creating module on a one-to-one basis with an idle VF software instance created by the first creating module is further configured to:
select the address corresponding to the cache for DMA from the first storage unit when there is data to be received; and
initiate a DMA write request by using the selected address corresponding to the cache for DMA as a target address.
[0010c] In another aspect, an embodiment of the present invention further provides a computing node, including: a hardware layer, a Host layer, and at least one virtual machine (VM) running on the Host layer, wherein the hardware layer includes an input/output (I/O) device, wherein a plurality of corresponding virtual function (VF) devices are virtualized from the I/O device, wherein the Host layer has a plurality of VF software instances, the plurality of VF software instances being in one-to-one correspondence with the plurality of VF devices, wherein the Host layer further has a back-end instance (BE) of an I/O virtual device having a same type as the I/O device, wherein the VM has a front-end instance (FE) of the I/O virtual device, and wherein the BE in the Host layer is bound on a one-to-one basis with an idle VF software instance;
wherein the VF software instance bound with the BE is configured to:
acquire an address corresponding to a cache for direct memory access (DMA) through an exporting application programming interface (API) of the BE; and
write the acquired address corresponding to the cache for DMA into a first storage unit in a VF device corresponding to the VF software instance; and
wherein the VF software instance bound with the BE is further configured to:
select the address corresponding to the cache for DMA from the first storage unit when there is data to be received; and
initiate a DMA write request by using the selected address corresponding to the cache for DMA as a target address.
[0011] It can be seen that the computing node in the embodiments of the present invention may include: a hardware layer, a Host running on the hardware layer, and at least one VM running on the Host, where the hardware layer includes an input/output I/O device, several corresponding virtual function VF devices are virtualized from the I/O device, the Host has several VF software instances, the several VF software instances and the several VF devices are in one-to-one correspondence; the Host further has a back-end instance BE of an I/O virtual device having the same type as the I/O device, the VM has a front-end instance FE of the I/O virtual device; and the BE in the Host is bound with an idle VF software instance. In this way, an application architecture in which each VM can independently use one VF device is established, and a channel is opened between one VF device virtualized from the I/O device and the front-end instance FE in one VM, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a high-efficiency device interface, so the VM can acquire performance similar to that of a physical machine, with low delay and hardly any extra CPU overhead. Moreover, a front-end driver (that is, the FE) of the I/O virtual device is in the VM, so the FE transfers data through a back-end driver (that is, the BE) in the Host, and the VM does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing, thereby optimizing the compatibility of a virtualization system.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] To make the technical solutions of the embodiments of the present invention or the prior art clearer, the accompanying drawings used in the description of the embodiments or the prior art are briefly described hereunder. Evidently, the accompanying drawings illustrate some exemplary embodiments of the present invention and persons of ordinary skill in the art may obtain other drawings based on these drawings without creative efforts.
[0013] FIG. 1 is a schematic architectural diagram of a conventional virtualization technology;
[0014] FIG. 2-a is a schematic architectural diagram of a virtualization software and hardware system provided in an embodiment of the present invention;
[0015] FIG. 2-b is a schematic architectural diagram of another virtualization software and hardware system provided in an embodiment of the present invention;
[0016] FIG. 3 is a schematic flow chart of a virtualization processing method provided in an embodiment of the present invention;
[0017] FIG. 4 is a schematic flow chart of another virtualization processing method provided in an embodiment of the present invention;
[0018] FIG. 5 is a schematic flow chart of another virtualization processing method provided in an embodiment of the present invention;
[0019] FIG. 6-a is a schematic flow chart of another virtualization processing method provided in an embodiment of the present invention;
[0020] FIG. 6-b is a schematic diagram of GPA and HPA address translation provided in an embodiment of the present invention;
[0021] FIG. 7-a is a schematic flow chart of another virtualization processing method provided in an embodiment of the present invention;
[0022] FIG. 7-b is a schematic diagram of another GPA and HPA address translation provided in an embodiment of the present invention;
[0023] FIG. 8-a is a schematic flow chart of another virtualization processing method provided in an embodiment of the present invention;
[0024] FIG. 8-b is a schematic diagram of another GPA and HPA address translation provided in an embodiment of the present invention;
[0025] FIG. 9-a is a schematic flow chart of another virtualization processing method provided in an embodiment of the present invention;
[0026] FIG. 9-b is a schematic diagram of another GPA and HPA address translation provided in an embodiment of the present invention;
[0027] FIG. 10 is a schematic diagram of module architecture of a host provided in an embodiment of the present invention;
[0028] FIG. 11-a is a schematic diagram of a computing node provided in an embodiment of the present invention;
[0029] FIG. 11-b is a schematic diagram of another computing node provided in an embodiment of the present invention;
[0030] FIG. 11-c is a schematic diagram of another computing node provided in an embodiment of the present invention; and
[0031] FIG. 12 is a schematic diagram of a computer system provided in an embodiment of the present invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0032] Embodiments of the present invention provide a virtualization processing method and apparatuses, and a computer system, so as to optimize the performance and compatibility of a virtualization system.
[0033] In order to make the solutions of the present invention more comprehensible for persons skilled in the art, the technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is obvious that the embodiments to be described are only a part rather than all of the embodiments of the present invention. All other embodiments derived by persons of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
[0034] To aid understanding of the embodiments of the present invention, several elements that are used in the description of the embodiments are introduced first.
[0035] Virtual machine VM:
One or more virtual computers can be simulated on a physical computer through virtual machine software, and those virtual machines work as real computers: a virtual machine can have an operating system and application programs installed, and the virtual machine can also access network resources. For an application program running in the virtual machine, the virtual machine works just like a real computer.
[0036] Hardware layer:
A hardware platform on which the virtualization environment runs. The hardware layer may include multiple types of hardware. For example, a hardware layer of a computing node may include a CPU and a memory, and may include high speed/low speed input/output (I/O, Input/Output) devices, such as a network card and a storage device, and other devices having specific processing functions, such as an input/output memory management unit (IOMMU, Input/Output Memory Management Unit), where the IOMMU may be configured to translate between a virtual machine physical address and a Host physical address.
[0037] I/O virtual function:
A corresponding physical function (PF, Physical Function) device and several virtual function (VF, Virtual Function) devices can be virtualized from the I/O device having an I/O virtual function after the I/O virtual function is enabled, where the PF device virtualized from the I/O device is mainly responsible for a management function, and a VF device is mainly responsible for a processing function.
[0038] Host (Host):
The host, as a management layer, is configured to complete management and allocation for hardware resources; present a virtual hardware platform for a virtual machine; and implement scheduling and isolation of the virtual machine. The Host may be a virtual machine monitor (VMM); moreover, a VMM may sometimes combine with one privileged virtual machine to form a Host. The virtual hardware platform provides various hardware resources for virtual machines running on the platform, for example, a virtual CPU, a memory, a virtual disk, a virtual network card, and so on. The virtual disk may correspond to one file or one logical block device of the Host. The virtual machine runs on the virtual hardware platform prepared by the Host, and the Host may have one or more virtual machines running on it.
[0039] Referring to FIG. 2-a and FIG. 2-b, FIG. 2-a and FIG. 2-b are schematic architectural diagrams of software and hardware systems of two virtualization solutions provided in embodiments of the present invention. The system architecture mainly includes three layers: a hardware layer, a Host and a virtual machine (VM). The hardware layer shown in FIG. 2-a or FIG. 2-b includes an I/O device, and the hardware layer shown in FIG. 2-a further includes an IOMMU. The Host is running on the hardware layer, and at least one virtual machine VM is running on the Host, where several corresponding virtual function VF devices are virtualized from the I/O device, the Host has several VF software instances, the several VF software instances and the several VF devices are in one-to-one correspondence; the Host further has a back-end instance (BE, Back-End) of an I/O virtual device having the same type as the I/O device, the VM has a front-end instance (FE, Front-End) of the I/O virtual device; the BE in the Host is bound with an idle VF software instance. In the technical solution of an embodiment of the present invention, the FE in the VM may be considered as a front-end driver of the I/O virtual device, the BE in the Host may be considered as a back-end driver of the I/O virtual device, and the I/O virtual device is composed of the BE and the FE.
[0040] A virtualization processing method according to an embodiment of the present invention can be applied to a computing node, where the computing node comprises: a hardware layer, a Host running on the hardware layer, and at least one VM running on the Host, and the hardware layer includes an input/output I/O device. The virtualization processing method may include: after an I/O virtual function of the input/output I/O device is enabled, generating several virtual function (VF, Virtual Function) software instances in the Host, where several corresponding virtual function (VF) devices are virtualized from the I/O device with the I/O virtual function enabled, and the several VF software instances and the several VF devices are in one-to-one correspondence; creating, by the Host, an I/O virtual device having the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host, and a front-end instance FE of the I/O virtual device is created in the initiated VM; and binding the BE with an idle VF software instance.
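The role of the IOMMU in the hardware layer of FIG. 2-a, translating between a virtual machine physical address and a Host physical address, can be pictured with a page-granular lookup table. The following is a minimal sketch only: the `Iommu` class, the 4096-byte page size, the single-level table and the example addresses are illustrative assumptions and do not describe any particular IOMMU hardware.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class Iommu:
    """Hypothetical single-level table: guest page frame -> host page frame."""

    def __init__(self) -> None:
        self._table = {}

    def map(self, guest_addr: int, host_addr: int) -> None:
        # Both addresses must be page aligned so that offsets carry over.
        if guest_addr % PAGE_SIZE or host_addr % PAGE_SIZE:
            raise ValueError("addresses must be page aligned")
        self._table[guest_addr // PAGE_SIZE] = host_addr // PAGE_SIZE

    def translate(self, guest_addr: int) -> int:
        # Split into page frame and offset, look up the frame, keep the offset.
        frame, offset = divmod(guest_addr, PAGE_SIZE)
        if frame not in self._table:
            raise LookupError(f"no mapping for guest address {guest_addr:#x}")
        return self._table[frame] * PAGE_SIZE + offset


iommu = Iommu()
iommu.map(0x1000, 0x7F000)          # one guest page mapped to one host page
host_addr = iommu.translate(0x1234)  # same offset inside the mapped host page
```

With such a table in the hardware layer, a device can be handed addresses as seen by the virtual machine, and the translation to Host physical addresses happens on the DMA path.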
[0041] Referring to FIG. 3, a virtualization processing method provided in an embodiment of the present invention may include:
[0042] 301. After an I/O virtual function of an input/output I/O device is enabled, generate several virtual function VF software instances (VF Instance) in a Host;
where several corresponding VF devices can be virtualized from the I/O device with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence. For example, the Host may enable the I/O virtual function of the I/O device when the Host is initiated or at a certain moment after the Host is initiated. Alternatively, the I/O device may enable its I/O virtual function automatically after the device is powered on, and in this case, it is unnecessary for the Host to enable the I/O virtual function of the I/O device.
[0043] It should be noted that the I/O device mentioned in the embodiment of the present invention may be, for example, a peripheral component interconnect express (PCIe, Peripheral Component Interconnect Express) device or a device of another type, such as a network card.
[0044] 302. Create, by the Host, an I/O virtual device having the same type as the I/O device, where a back-end instance (BE, Back-End) of the I/O virtual device (vDEV) is created in the Host, and a front-end instance (FE, Front-End) of the I/O virtual device is created in an initiated VM.
[0045] 303. Bind, by the Host, the BE with an idle VF software instance.
[0046] If the Host creates several I/O virtual devices having the same type as the I/O device, a back-end instance BE of each I/O virtual device is bound with an idle VF software instance, and an inter-access interface exists between the BE and the VF software instance that are in a binding relationship; for example, the BE can access a VF device corresponding to the VF software instance through an access interface provided by the VF software instance bound with the BE. In this way, the application architecture in which each VM can independently use one VF device is established, and a channel is opened between one VF device virtualized from the I/O device and a front-end instance FE in one VM, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, so the VM can acquire performance similar to that of a physical machine. Based on the application architecture constructed by the Host, the VM can send data, receive data, or perform data processing in other forms.
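Steps 301 to 303 can be sketched as plain bookkeeping inside the Host. The sketch below is a hedged illustration, not the patented implementation: the `Host`, `VFInstance`, `BackEnd` and `FrontEnd` classes, the `create_virtual_device` method and the first-idle selection policy are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class VFInstance:
    """VF software instance; one per VF device (hypothetical model)."""
    index: int
    bound_be: Optional["BackEnd"] = None


@dataclass(eq=False)
class BackEnd:
    """BE of an I/O virtual device, living in the Host."""
    vm_name: str
    vf: Optional[VFInstance] = None


@dataclass(eq=False)
class FrontEnd:
    """FE of the same I/O virtual device, living in the VM."""
    vm_name: str


class Host:
    def __init__(self, num_vfs: int) -> None:
        # Step 301: one VF software instance per VF device.
        self.vf_instances = [VFInstance(i) for i in range(num_vfs)]
        self.back_ends = []

    def create_virtual_device(self, vm_name: str):
        # Step 302: create the BE in the Host and the FE in the VM.
        be, fe = BackEnd(vm_name), FrontEnd(vm_name)
        # Step 303: bind the BE with an idle VF software instance
        # (taking the first idle one is an assumed policy).
        idle = next((vf for vf in self.vf_instances if vf.bound_be is None), None)
        if idle is None:
            raise RuntimeError("no idle VF software instance")
        idle.bound_be, be.vf = be, idle
        self.back_ends.append(be)
        return be, fe


host = Host(num_vfs=2)
be1, fe1 = host.create_virtual_device("vm1")
be2, fe2 = host.create_virtual_device("vm2")
```

Because every BE holds exactly one VF software instance and every bound instance points back to its BE, each VM ends up with its own VF device, which is the one-to-one arrangement the paragraph above describes.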
[0047] For example, in FIG. 2-a and FIG. 2-b, several corresponding VF devices can be virtualized from the I/O device (such as a PCIe device) with the I/O virtual function enabled, and a corresponding physical function (PF, Physical Function) device can further be virtualized from the I/O device with the I/O virtual function enabled. Back-end instances BEs of several I/O virtual devices that have the same type as the I/O device and are created by the Host are located in the Host, and a front-end instance FE of each I/O virtual device is located in a different VM. A shared memory may further be configured between the Host and the VM, and the back-end instance BE and the front-end instance FE of the I/O virtual device may, for example, transfer data through the shared memory.
[0048] It can be seen that, in this embodiment, after the I/O virtual function of the input/output I/O device is enabled, several VF software instances are generated in the Host; several corresponding VF devices are virtualized from the I/O device with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence; the Host creates an I/O virtual device having the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host, and a front-end instance FE of the I/O virtual device is created in the initiated VM; and the BE is bound with the idle VF software instance. In this way, the application architecture in which each VM can independently use one VF device is established, and a channel is opened between one VF device virtualized from the I/O device and the front-end instance FE in one VM, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a high-efficiency device interface, so the VM can acquire performance similar to that of a physical machine, with low delay and hardly any additional CPU overhead. Moreover, a front-end driver (that is, the FE) of the virtual device is in the VM, so data is transferred through a back-end driver (that is, the BE) in the Host, and the VM does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing, thereby optimizing the compatibility of a virtualization system.
[0049] Referring to FIG. 4, another virtualization processing method provided in an embodiment of the present invention is applied to a computing node, where the computing node includes: a hardware layer, a Host running on the hardware layer, and at least one VM running on the Host, and the hardware layer includes an input/output I/O device. The method may include:
[0050] 401. Enable, by the Host, an I/O virtual function of the input/output I/O device.
For example, the Host may enable the I/O virtual function of the I/O device when the Host is initiated or at a certain moment after the Host is initiated. Several corresponding VF devices are virtualized from the I/O device with the I/O virtual function enabled. Alternatively, the I/O device may enable its I/O virtual function automatically after the device is powered on, and in this case, it is unnecessary for the Host to enable the I/O virtual function of the I/O device.
[0051] 402. Generate several VF software instances in the Host, where several corresponding VF devices can be virtualized from the I/O device with the I/O virtual function enabled, and the several VF software instances and the several VF devices are in one-to-one correspondence.
[0052] 403. Create, by the Host, an I/O virtual device of the same type as the I/O device, where a back-end instance BE of the I/O virtual device (vDEV) is created in the Host, and a front-end instance FE of the I/O virtual device is created in an initiated VM.
[0053] 404. Bind, by the Host, the BE with an idle VF software instance.
[0054] If the Host creates several I/O virtual devices of the same type as the I/O device, a back-end instance BE of each I/O virtual device is bound with an idle VF software instance, and an inter-access interface exists between the BE and the VF software instance that are in a binding relationship; for example, the BE can access the VF device corresponding to the VF software instance through an access interface provided by the VF software instance bound with the BE. In this way, the application architecture in which each VM can independently use one VF device is established, and a channel between one VF device virtualized from the I/O device and the front-end instance FE in the VM is opened, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, so it is beneficial for the VM to acquire performance similar to that of a physical machine. Based on the application architecture constructed by the Host, the VM can send data, receive data, or perform data processing in other forms.
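The binding described in steps 402 to 404 can be pictured with a minimal sketch. It is only an illustration under stated assumptions: the class names `VFInstance` and `Host`, the method `bind_backend`, and the "first idle instance wins" selection policy are hypothetical and are not taken from the embodiment, which only requires that a BE be bound with some idle VF software instance.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VFInstance:
    """Hypothetical VF software instance; one exists per VF device."""
    vf_id: int
    bound_be: Optional[str] = None  # name of the BE bound to it, if any

    @property
    def idle(self) -> bool:
        # An idle instance is one not bound with any back-end instance.
        return self.bound_be is None


class Host:
    def __init__(self, num_vfs: int) -> None:
        # One VF software instance per VF device (one-to-one correspondence).
        self.vf_instances: List[VFInstance] = [VFInstance(i) for i in range(num_vfs)]
        self.bindings: Dict[str, VFInstance] = {}

    def bind_backend(self, be_name: str) -> VFInstance:
        """Bind a back-end instance with the first idle VF software instance."""
        if be_name in self.bindings:
            raise ValueError(f"{be_name} is already bound")
        for vf in self.vf_instances:
            if vf.idle:
                vf.bound_be = be_name
                self.bindings[be_name] = vf
                return vf
        raise RuntimeError("no idle VF software instance available")


host = Host(num_vfs=2)
vf_a = host.bind_backend("BE-1")
vf_b = host.bind_backend("BE-2")
```

With two VF software instances, a third call to `bind_backend` raises an error in this sketch; how a real Host would handle exhaustion of idle instances is not specified here.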
[0055] For example, in FIG. 2-a and FIG. 2-b, several corresponding VF devices can be virtualized from the I/O device (such as a PCIe device) with the I/O virtual function enabled, and a corresponding physical function PF device can further be virtualized from the I/O device with the I/O virtual function enabled (the Host may further generate a PF software instance corresponding to the PF device). Back-end instances BEs of several I/O virtual devices that have the same type as the I/O device and are created by the Host are located in the Host, and a front-end instance FE of each I/O virtual device is located in a different VM. A shared memory may further be configured between the Host and the VM, and the back-end instance BE and the front-end instance FE of the I/O virtual device can, for example, transfer data through the shared memory.
[0056] For ease of understanding, an optional interaction manner of the application architecture constructed based on the foregoing mechanism is illustrated below by taking a procedure of receiving data as an example.
[0057] In an application scenario, after the Host binds the BE and the idle VF software instance, the FE may pre-allocate a cache for direct memory access (DMA, Direct Memory Access); the FE may write a guest physical address (GPA, Guest Physical Address) corresponding to the pre-allocated cache for DMA into the shared memory between the Host and the VM; through an exporting application programming interface of the BE, the VF software instance bound with the BE may acquire the GPA corresponding to the cache for DMA; the VF software instance may write the acquired GPA corresponding to the cache for DMA into a receiving queue of a VF device corresponding to the VF software instance; when there is data to be received, the VF device may select the GPA corresponding to the cache for DMA from the receiving queue of the VF device, and may initiate a DMA write request (the DMA write request is used to write data into the cache) by using the selected GPA as a target address; an input/output memory management unit IOMMU modifies the target address GPA of the DMA write request into a corresponding Host physical address HPA (where, for example, an address translation page table is set in the IOMMU, and the address translation page table records mapping between the HPA and the GPA; when the DMA write request passes, the IOMMU may acquire an HPA corresponding to the target address GPA of the DMA write request by looking up the address translation page table, and modify the target address GPA of the DMA write request to the acquired HPA); after the DMA write request whose target address is modified to the HPA is executed, the VF device may notify the VF software instance which corresponds to the VF device and is in the Host, so that the VF software instance triggers a corresponding FE to receive data written into a cache corresponding to the HPA.
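The IOMMU step in the receiving procedure above can be sketched as a page-table lookup applied to the target address of a DMA write. This is a simplified model, not the embodiment's implementation: the 4 KiB page size, the single-level dictionary used as the address translation page table, and the names `Iommu`, `translate` and `dma_write` are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size; the embodiment does not fix one


class Iommu:
    """Toy IOMMU holding a GPA-to-HPA address translation page table."""

    def __init__(self, page_table):
        # Maps guest page frame number -> host page frame number.
        self.page_table = dict(page_table)

    def translate(self, gpa: int) -> int:
        gfn, offset = divmod(gpa, PAGE_SIZE)
        if gfn not in self.page_table:
            raise KeyError(f"no mapping for GPA {gpa:#x}")
        return self.page_table[gfn] * PAGE_SIZE + offset


def dma_write(iommu: Iommu, host_memory: bytearray, target_gpa: int, data: bytes) -> int:
    """Model a DMA write whose target GPA is rewritten to an HPA in transit."""
    if target_gpa % PAGE_SIZE + len(data) > PAGE_SIZE:
        raise ValueError("sketch handles only writes within one page")
    hpa = iommu.translate(target_gpa)
    host_memory[hpa:hpa + len(data)] = data
    return hpa


host_memory = bytearray(8 * PAGE_SIZE)
iommu = Iommu({0x10: 0x2})          # guest page 0x10 lives in host page 0x2
hpa = dma_write(iommu, host_memory, 0x10020, b"packet")
```

The VF device only ever sees the GPA taken from its receiving queue; the rewrite to the HPA happens on the path to memory, which is why the Host CPU does not need to translate addresses in this scenario.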
[0058] In another application scenario, after the Host binds the BE and the idle VF software instance, the FE may pre-allocate a cache for DMA; the FE may write a GPA corresponding to the pre-allocated cache for DMA into the shared memory between the Host and the VM; the Host (for example, the BE or another module in the Host) may modify the GPA corresponding to the cache for DMA to a corresponding HPA (for example, an address translation page table is set in the Host, and the address translation page table records mapping between the HPA and the GPA; by looking up the address translation page table, the Host (for example, the BE or another module in the Host) may acquire an HPA corresponding to the GPA of the cache for DMA, and modify the GPA corresponding to the cache for DMA to the acquired HPA); the VF software instance bound with the BE in the Host acquires, through the exporting application programming interface of the BE, the HPA corresponding to the cache for DMA; the acquired HPA corresponding to the cache for DMA is written into a receiving queue of a VF device corresponding to the VF software instance; when there is data to be received, the VF device selects the HPA corresponding to the cache for DMA from the receiving queue of the VF device, and initiates a DMA write request (the DMA write request is used to write data into the cache) by using the selected HPA as a target address; after the DMA write request is executed, the VF device may further notify the VF software instance which corresponds to the VF device and is in the Host, so that the VF software instance triggers a corresponding FE to receive data written into a cache corresponding to the HPA.
[0059] For ease of understanding, an optional interaction manner of the application architecture constructed based on the foregoing mechanism is illustrated below by taking a procedure of sending data as an example.
[0060] In an application scenario, after the Host binds the BE and the idle VF software instance, the FE may write a GPA corresponding to the cache where data to be sent is located into the shared memory between the Host and the VM; a corresponding BE may invoke a program sending interface of the VF software instance bound with the BE, and write the GPA corresponding to the cache where data to be sent is located into a sending queue of a VF device corresponding to the VF software instance; after finding that there is data to be sent, the VF device initiates a DMA read request (the DMA read request is used to read data from the cache) by using the GPA recorded in the sending queue of the VF device as a target address; and the IOMMU modifies the target address GPA of the DMA read request into a corresponding HPA (where, for example, an address translation page table is set in the IOMMU, and the address translation page table records mapping between the HPA and the GPA; when the DMA read request passes, the IOMMU may acquire an HPA corresponding to the target address GPA of the DMA read request by looking up the address translation page table, and modify the target address GPA of the DMA read request to the acquired HPA). Further, after the DMA read request is executed, the VF device may notify the VF software instance which corresponds to the VF device and is in the Host, so that the VF software instance triggers a corresponding FE to release the corresponding cache.
[0061] In another application scenario, after the Host binds the BE and the idle VF software instance, the FE may write the GPA corresponding to the cache where data to be sent is located into the shared memory between the Host and the VM; the Host (for example, the BE or another module in the Host) may modify the GPA corresponding to the cache to a corresponding HPA (for example, an address translation page table is set in the Host, and the address translation page table records mapping between the HPA and the GPA; by looking up the address translation page table, the Host (for example, the BE or another module in the Host) may acquire an HPA corresponding to the GPA of the cache, and modify the GPA corresponding to the cache to the acquired HPA); the corresponding BE may invoke a program sending interface of the VF software instance bound with the BE, and write the HPA corresponding to the cache where the data to be sent is located into the sending queue of a VF device corresponding to the VF software instance; when finding that there is data to be sent, the VF device initiates a DMA read request by using the HPA recorded in the sending queue of the VF device as a target address. Further, after the DMA read request is executed, the VF device may notify the VF software instance which corresponds to the VF device and is in the Host, so that the VF software instance triggers a corresponding FE to release the corresponding cache.
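The sending procedure with Host-side address translation can be sketched as follows. The sketch is an assumption-laden simplification: the translation table maps whole buffer base addresses rather than pages, the reverse HPA-to-GPA lookup used to tell the FE which cache to release is a hypothetical convenience, and the names `VFDevice`, `be_send` and `fe_release` do not come from the embodiment.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class VFDevice:
    """Toy VF device with a sending queue of (HPA, length) entries."""

    def __init__(self, host_memory: bytearray) -> None:
        self.host_memory = host_memory
        self.send_queue: Deque[Tuple[int, int]] = deque()
        self.wire: List[bytes] = []  # data that has left the device

    def process_send_queue(self, on_complete: Callable[[int], None]) -> None:
        while self.send_queue:
            hpa, length = self.send_queue.popleft()
            # DMA read: fetch the data to be sent from the cache at the HPA.
            self.wire.append(bytes(self.host_memory[hpa:hpa + length]))
            # Notify the VF software instance that the read has completed.
            on_complete(hpa)


host_memory = bytearray(64)
host_memory[16:21] = b"hello"

gpa_to_hpa = {0x1000: 16}                       # assumed Host translation table
hpa_to_gpa = {h: g for g, h in gpa_to_hpa.items()}
released: List[int] = []                        # GPAs whose caches the FE released


def fe_release(gpa: int) -> None:
    released.append(gpa)


vf = VFDevice(host_memory)


def be_send(gpa: int, length: int) -> None:
    # Host-side translation happens before the address reaches the device.
    vf.send_queue.append((gpa_to_hpa[gpa], length))


be_send(0x1000, 5)
vf.process_send_queue(lambda hpa: fe_release(hpa_to_gpa[hpa]))
```

Because the queue already holds HPAs, the DMA read in this variant does not depend on an IOMMU rewriting the target address, at the cost of a table lookup in the Host for every buffer.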
[0062] The optional interaction manner of the application architecture constructed based on the foregoing mechanism is illustrated above by taking procedures of sending data and receiving data as examples, and interaction manners in other application scenarios can be deduced by analogy.
[0063] It can be seen that, in this embodiment, after the I/O virtual function of the input/output I/O device is enabled, several VF software instances are generated in the Host; several corresponding VF devices are virtualized from the I/O device with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence; the Host creates an I/O virtual device of the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host, and a front-end instance FE of the I/O virtual device is created in an initiated VM; the BE is bound with the idle VF software instance. In this way, the application architecture in which each VM can independently use one VF device is established, and a channel between one VF device virtualized from the I/O device and the front-end instance FE in one VM is opened, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a highly efficient device interface, so it is beneficial for the VM to acquire performance similar to that of a physical machine: the delay is low, and hardly any additional CPU overhead is caused. Moreover, a front-end driver (that is, the FE) of the virtual device is in the VM, so data is transferred through a back-end driver (that is, the BE) in the Host, and the VM does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing, thereby optimizing the compatibility of a virtualization system.
[0064] An embodiment of the present invention further provides a virtualization processing method, which is applied to a computing node. The computing node may include: a hardware layer, a Host running on the hardware layer, and at least one VM running on the Host, where the hardware layer includes an input/output I/O device, several corresponding virtual function VF devices are virtualized from the I/O device, the Host has several VF software instances, the several VF software instances and the several VF devices are in one-to-one correspondence; the Host further has a back-end instance BE of an I/O virtual device of the same type as the I/O device, the VM has a front-end instance FE of the I/O virtual device; the BE in the Host is bound with an idle VF software instance. The method includes: pre-allocating, by the FE, a cache for direct memory access DMA; acquiring, by the VF software instance bound with the BE, an address corresponding to the cache for DMA through an exporting application programming interface of the BE, and writing the acquired address corresponding to the cache for DMA into a first storage unit of a VF device corresponding to the VF software instance; selecting, by the VF device, the address corresponding to the cache for DMA from the first storage unit when there is data to be received, and initiating a DMA write request by using the selected address corresponding to the cache for DMA as a target address; notifying, by the VF device, the VF software instance which corresponds to the VF device and is in the Host after the DMA write request is executed, so that the VF software instance triggers the FE to receive data written into the cache corresponding to the address.
[0065] Referring to FIG. 5, another virtualization processing method in an embodiment of the present invention may include the following steps:
[0066] 501. Pre-allocate, by the FE, a cache for direct memory access DMA.
[0067] 502. Acquire, by the VF software instance bound with the BE, an address (the address is, for example, the HPA or GPA) corresponding to the cache for DMA through an exporting application programming interface of the BE, and write the acquired address corresponding to the cache for DMA into a first storage unit of a VF device corresponding to the VF software instance (where, the first storage unit is, for example, a receiving queue or receiving list of the VF device or another data storage structure capable of recording an address).
[0068] 503. Select, by the VF device, the address corresponding to the cache for DMA from the first storage unit when there is data to be received, and initiate a DMA write request by using the selected address corresponding to the cache for DMA as a target address.
[0069] 504. Notify, by the VF device, the VF software instance which is corresponding to the VF device and is in the Host after the DMA write request is executed, so that the VF software instance triggers the FE to receive data written into the cache corresponding to the address.
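Steps 501 to 504 form a chain of hand-offs between the FE, the BE, the VF software instance and the VF device. The following sketch models only that chain and treats the address as an opaque index into one flat memory, so it leaves out GPA-to-HPA translation entirely. All class and method names (`FrontEnd`, `BackEnd.export_dma_addresses`, `VFSoftwareInstance.post_buffers`, and so on) are hypothetical stand-ins, and a `deque` is assumed as the first storage unit.

```python
from collections import deque


class FrontEnd:
    def __init__(self, memory: bytearray, buffer_addrs) -> None:
        self.memory = memory
        self.buffers = list(buffer_addrs)  # step 501: pre-allocated caches for DMA
        self.received = []

    def on_data(self, addr: int, length: int) -> None:
        # Triggered by the VF software instance once the DMA write is done.
        self.received.append(bytes(self.memory[addr:addr + length]))


class BackEnd:
    def __init__(self, fe: FrontEnd) -> None:
        self.fe = fe

    def export_dma_addresses(self):
        # Stand-in for the exporting application programming interface of the BE.
        return list(self.fe.buffers)


class VFDevice:
    def __init__(self, memory: bytearray) -> None:
        self.memory = memory
        self.rx_queue = deque()  # the "first storage unit"
        self.notify = None       # set by the VF software instance

    def receive(self, data: bytes) -> None:
        addr = self.rx_queue.popleft()              # step 503: select an address
        self.memory[addr:addr + len(data)] = data   # DMA write to that address
        self.notify(addr, len(data))                # step 504: notify the instance


class VFSoftwareInstance:
    def __init__(self, be: BackEnd, vf_dev: VFDevice) -> None:
        self.be = be
        self.vf_dev = vf_dev
        vf_dev.notify = self.on_dma_done

    def post_buffers(self) -> None:
        # Step 502: acquire addresses through the BE and record them in the device.
        self.vf_dev.rx_queue.extend(self.be.export_dma_addresses())

    def on_dma_done(self, addr: int, length: int) -> None:
        self.be.fe.on_data(addr, length)


memory = bytearray(256)
fe = FrontEnd(memory, [0, 64, 128])
be = BackEnd(fe)
dev = VFDevice(memory)
inst = VFSoftwareInstance(be, dev)
inst.post_buffers()
dev.receive(b"abc")
dev.receive(b"defg")
```

After two receptions, one posted address remains in the queue; replenishing the queue as caches are consumed is outside what this sketch covers.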
[0070] It can be seen that, in this embodiment, the computing node may include: a hardware layer, a Host running on the hardware layer, and at least one VM running on the Host, where the hardware layer includes an input/output I/O device, several corresponding virtual function VF devices are virtualized from the I/O device, the Host has several VF software instances, the several VF software instances and the several VF devices are in one-to-one correspondence; the Host further has a back-end instance BE of an I/O virtual device of the same type as the I/O device, the VM has a front-end instance FE of the I/O virtual device; the BE in the Host is bound with an idle VF software instance. In this way, the application architecture in which each VM can independently use one VF device is established, and a channel between one VF device virtualized from the I/O device and the front-end instance FE in one VM is opened, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a highly efficient device interface, so it is beneficial for the VM to acquire performance similar to that of a physical machine: the delay is low and hardly any additional CPU overhead is caused. Moreover, a front-end driver (that is, the FE) of the I/O virtual device is in the VM, so the FE transfers data through a back-end driver (that is, the BE) in the Host, and the VM does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing, thereby optimizing the compatibility of a virtualization system.
[0071] In an embodiment of the present invention, after the FE pre-allocates the cache for DMA, the FE may write the GPA corresponding to the pre-allocated cache for DMA into the shared memory between the Host and the VM; the VF software instance bound with the BE may acquire the GPA corresponding to the cache for DMA from the shared memory through the exporting application programming interface of the BE (of course, the FE may also notify the corresponding BE of the GPA corresponding to the pre-allocated cache for DMA; and the VF software instance bound with the BE may acquire the GPA corresponding to the cache for DMA through the exporting application programming interface of the BE), and write the acquired GPA corresponding to the cache for DMA into the first storage unit of the VF device corresponding to the VF software instance; when there is data to be received, the VF device may select the GPA corresponding to the cache for DMA from the first storage unit, and initiate the DMA write request by using the GPA corresponding to the cache for DMA as the target address; the IOMMU may modify the target address GPA of the DMA write request into the corresponding HPA (for example, an address translation page table is set in the IOMMU, and the address translation page table records mapping between the HPA and the GPA; the IOMMU acquires an HPA corresponding to the target address GPA of the DMA write request by looking up the address translation page table, and modifies the target address GPA of the DMA write request to the acquired HPA); after the DMA write request whose target address GPA is modified to the HPA is executed, the VF device notifies the VF software instance which corresponds to the VF device and is in the Host, so that the VF software instance triggers the FE to receive data written into the cache corresponding to the HPA.
[0072] In another application scenario, after the FE pre-allocates the cache for DMA, the FE may write a GPA corresponding to the pre-allocated cache for DMA into the shared memory between the Host and the VM; the Host may modify the GPA corresponding to the cache for DMA in the shared memory to a corresponding HPA (for example, an address translation page table is set in the Host, and the address translation page table records mapping between the HPA and the GPA; by looking up the address translation page table, the Host may acquire an HPA corresponding to the GPA of the cache for DMA in the shared memory, and modify the GPA corresponding to the cache for DMA in the shared memory to the acquired HPA. Of course, the FE may also notify the corresponding BE of the GPA corresponding to the pre-allocated cache for DMA, and the Host may modify the GPA corresponding to the cache for DMA to the corresponding HPA); the VF software instance bound with the BE acquires, through the exporting application programming interface of the BE, the HPA corresponding to the cache for DMA; the acquired HPA corresponding to the cache for DMA is written into a first storage unit of the VF device corresponding to the VF software instance; when there is data to be received, the VF device selects the HPA corresponding to the cache for DMA from the first storage unit, and initiates a DMA write request by using the selected HPA as a target address.
[0073] In an embodiment of the present invention, when the FE has data to be sent, the FE may write the GPA corresponding to the cache where data to be sent is located into the shared memory between the Host and the VM; the BE may acquire the GPA corresponding to the cache where data to be sent is located from the shared memory (of course, the FE may also notify the corresponding BE of the GPA corresponding to the cache where data to be sent is located, and the BE acquires, according to the notification, the GPA corresponding to the cache where data to be sent is located); the BE invokes a program sending interface of the VF software instance bound with the BE, and writes the GPA corresponding to the cache where data to be sent is located into a second storage unit of the VF device corresponding to the VF software instance (where the second storage unit is, for example, a sending queue or a sending list of the VF device or another data storage structure capable of recording an address); when finding that there is data to be sent, the VF device initiates a DMA read request by using the GPA recorded in the second storage unit as the target address; the IOMMU may modify the target address GPA of the DMA read request to a corresponding HPA (for example, an address translation page table is set in the IOMMU, and the address translation page table records mapping between the HPA and the GPA; the IOMMU acquires an HPA corresponding to the target address GPA of the DMA read request by looking up the address translation page table, and modifies the target address GPA of the DMA read request to the acquired HPA). Further, after the DMA read request is executed, the VF device may notify the VF software instance which corresponds to the VF device and is in the Host, so that the VF software instance triggers the FE to release the corresponding cache.
[0074] In another embodiment of the present invention, when the FE has data to be sent, the FE may write the GPA corresponding to the cache where data to be sent is located into the shared memory between the Host and the VM; the Host modifies the GPA corresponding to the cache where data to be sent is located in the shared memory to the corresponding HPA (for example, an address translation page table is set in the Host, and the address translation page table records mapping between the HPA and the GPA; by looking up the address translation page table, the Host acquires the HPA corresponding to the GPA of the cache where data to be sent is located in the shared memory, and modifies that GPA in the shared memory to the corresponding HPA. Of course, the FE may also notify the Host of the GPA corresponding to the cache where data to be sent is located, and the Host modifies the GPA corresponding to the cache where data to be sent is located to the corresponding HPA); the BE acquires the HPA corresponding to the cache where the data to be sent is located, invokes a program sending interface of the VF software instance bound with the BE, and writes the HPA corresponding to the cache where the data to be sent is located into the second storage unit of the VF device corresponding to the VF software instance (where the second storage unit is, for example, a sending queue or a sending list of the VF device or another data storage structure capable of recording an address); when finding that there is data to be sent, the VF device initiates a DMA read request by using the HPA recorded in the second storage unit as a target address. Further, after the DMA read request is executed, the VF device may notify the VF software instance which corresponds to the VF device and is in the Host, so that the VF software instance triggers the FE to release the corresponding cache.
[0075] To better understand and implement the foregoing solutions in the embodiments of the present invention, further illustration is made by taking several specific application scenarios of data receiving and data sending as examples.
[0076] Referring to FIG. 6-a, another virtualization processing method provided in an embodiment of the present invention may include:
[0077] 601. Enable, by the Host, the IOMMU; where, the Host may enable the IOMMU when being initiated or at a certain moment after being initiated; of course, the IOMMU may also enable the corresponding function automatically when the device is powered on, and in this case, it is unnecessary for the Host to enable the IOMMU; and of course, another module may also be used to enable the IOMMU.
[0078] 602. Install drivers of a PF device and a VF device in the Host, where the PF device and the VF device correspond to the I/O device (for example, referred to as E-1).
[0079] 603. Enable, by the Host, an I/O virtual function of the I/O device E-1; where, for example, the Host may enable the I/O virtual function of the I/O device E-1 when being initiated or at a certain moment after being initiated. A corresponding physical function PF device and several virtual function VF devices can be virtualized from the I/O device E-1 with the I/O virtual function enabled. Of course, another module may be used to enable the I/O device E-1, and the I/O device E-1 may also automatically enable its I/O virtual function when the device is powered on, in which case it is unnecessary for the Host or another module to enable the I/O virtual function of the I/O device E-1. The PF device virtualized from the I/O device E-1 is mainly responsible for a management function, and the VF device is mainly responsible for a processing function.
[0080] 604. Generate a PF software instance and several VF software instances in the Host; where the corresponding PF device and several VF devices can be virtualized from the I/O device E-1 with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence, and the PF software instance generated in the Host corresponds to the PF device virtualized from the I/O device E-1.
[0081] 605. Create, by the Host, an I/O virtual device (for example, referred to as vE-1) of the same type as the I/O device E-1; where, a back-end instance BE (for example, referred to as BE-1) of the I/O virtual device vE-1 is created in the Host, and a front-end instance FE (for example, referred to as FE-1) of the I/O virtual device vE-1 is created in an initiated VM (for example, referred to as VM-1). For example, the Host may trigger the creation of a front-end instance FE-1 corresponding to the I/O virtual device vE-1 in the initiated VM-1. It may be considered that the FE-1 created in the VM-1 and the BE-1 created in the Host jointly construct a driver of the I/O virtual device vE-1.
[0082] 606. Bind, by the Host, the created BE-1 with one idle VF software instance (for example, referred to as Vfe-1); where, the VF software instance Vfe-1, for example, corresponds to the VF device (for example, referred to as VF-1) virtualized from the I/O device E-1. The so-called idle VF software instance is a VF software instance that is not bound with another back-end instance BE.
[0083] At this point, a channel between the VF device VF-1 virtualized from the I/O device E-1 and the front-end instance FE-1 in the VM-1 is opened, so that the FE-1 can access the VF device VF-1 through the BE-1 in the Host. The VF device VF-1 is separately allocated to the VM-1 for use; the VF device VF-1 is virtualized from the I/O device E-1 and can provide a highly efficient device interface, so it is beneficial for the VM-1 to acquire performance similar to that of a physical machine: the delay is low, and hardly any additional CPU overhead is caused. Moreover, a front-end driver (that is, the FE-1) of the virtual device is in the VM-1, so the data is transferred through a back-end driver (that is, the BE-1) in the Host, and the VM-1 does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing.
[0084] 607. Pre-allocate, by the FE-1, a cache for direct memory access (DMA);
[0085] For example, the cache that is used for DMA and is pre-allocated by the FE-1 may be [GPA1, Len1], ... [GPAn, Lenn], that is, multiple sections of cache for DMA may be pre-allocated, where GPA1 represents a start address of the GPA of the cache, Len1 represents a length of the cache, and so on.
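One way to picture how a list of sections [GPA1, Len1], ... [GPAn, Lenn] could be placed in a shared memory region is sketched below. The on-memory layout is purely an assumption for illustration (a 32-bit count followed by little-endian pairs of a 64-bit GPA and a 32-bit length); the embodiment does not define any particular layout, and the function names are hypothetical.

```python
import struct

# Assumed descriptor layout: 64-bit GPA followed by 32-bit length, little-endian.
DESC = struct.Struct("<QI")
COUNT = struct.Struct("<I")


def write_descriptors(shared: bytearray, buffers) -> int:
    """FE side: record (GPA, Len) pairs in shared memory; return bytes used."""
    COUNT.pack_into(shared, 0, len(buffers))
    offset = COUNT.size
    for gpa, length in buffers:
        DESC.pack_into(shared, offset, gpa, length)
        offset += DESC.size
    return offset


def read_descriptors(shared: bytearray):
    """Host side: read the (GPA, Len) pairs back out of shared memory."""
    (count,) = COUNT.unpack_from(shared, 0)
    return [DESC.unpack_from(shared, COUNT.size + i * DESC.size) for i in range(count)]


shared_memory = bytearray(64)
used = write_descriptors(shared_memory, [(0x10000, 2048), (0x20000, 4096)])
descriptors = read_descriptors(shared_memory)
```

A fixed-width layout like this keeps both sides independent of each other's in-memory object representation, which matters because the FE and the BE run in different address spaces.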
[0086] 608. Write, by the FE-1, a GPA corresponding to the pre-allocated cache for DMA into a shared memory (shared Memory) between the Host and the VM-1, and notify the BE-1 (of course, it is also possible that the BE-1 finds, through self-detection, that the GPA corresponding to the cache for DMA has been written into the shared memory).
[0087] 609. Acquire, by the VF software instance Vfe-1, the GPA corresponding to the cache for DMA through an exporting application programming interface of the BE-1, and write the acquired GPA corresponding to the cache for DMA into a receiving queue of the VF device VF-1 corresponding to the VF software instance Vfe-1.
[0088] 610. Select, by the VF device VF-1, the GPA corresponding to the cache for DMA from the receiving queue of the VF device when there is data to be received, and initiate a DMA write request by using the selected GPA as a target address; where, the DMA write request initiated by the VF device VF-1 will pass through the IOMMU.
[0089] 611. Modify, by the IOMMU, the target address GPA of the DMA write request to a corresponding HPA; where, for example, an address translation page table is set in the IOMMU, and the address translation page table records mapping between the GPA and the HPA (for example, as shown in FIG. 6-b). When the DMA write request passes, the IOMMU may acquire an HPA corresponding to the target address GPA of the DMA write request by looking up the address translation page table, and modify the target address GPA of the DMA write request to the acquired HPA.
[0090] 612. Notify, by the VF device VF-1, the corresponding VF software instance Vfe-1 in the Host after the DMA write request whose target address is modified to the HPA is executed, so that the VF software instance Vfe-1 triggers the front-end instance FE-1 in the VM-1 to receive data written into the cache corresponding to the HPA.
[0091] When being triggered by the VF software instance Vfe-1, the front-end instance FE-1 in the VM-1 may read the data written in the cache corresponding to the HPA.
[0092] It can be seen that, in this embodiment, after the Host enables the I/O virtual function of the I/O device, several VF software instances are generated in the Host; several corresponding VF devices are virtualized from the I/O device with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence; the Host creates an I/O virtual device of the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host, and a front-end instance FE of the I/O virtual device is created in the initiated VM; the BE is bound with the idle VF software instance. In this way, the application architecture in which each VM can independently use one VF device is established, and a channel between one VF device virtualized from the I/O device and the front-end instance FE in one VM is opened, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a highly efficient device interface, so it is beneficial for the VM to acquire performance similar to that of a physical machine: the delay is low and hardly any additional CPU overhead is caused. Moreover, a front-end driver (that is, the FE) of the virtual device is in the VM, so data is transferred through a back-end driver (that is, the BE) in the Host, and the VM does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing, thereby optimizing the compatibility of a virtualization system.
[0093] In addition, during the procedure of executing the DMA write request, the hardware module IOMMU implements the translation between the GPA and the HPA, thereby reducing CPU overhead and further improving performance.
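The address rewriting performed by the IOMMU on a DMA write request can be pictured with a small model. The following Python sketch is illustrative only: the 4 KiB page size, the dictionary-based page table, and the names `Iommu`, `translate`, and `dma_write` are assumptions introduced here and are not defined by the embodiment. For simplicity the sketch also assumes that a write does not cross a page boundary.

```python
PAGE_SIZE = 4096  # assumed page granularity; the embodiment does not fix one


class Iommu:
    """Toy model of an IOMMU holding a GPA -> HPA address translation page table."""

    def __init__(self, page_table):
        # page_table maps guest page frame numbers to host page frame numbers
        self.page_table = dict(page_table)

    def translate(self, gpa):
        """Return the HPA for a GPA, keeping the offset within the page."""
        frame, offset = divmod(gpa, PAGE_SIZE)
        if frame not in self.page_table:
            raise KeyError(f"no mapping for GPA {gpa:#x}")
        return self.page_table[frame] * PAGE_SIZE + offset


def dma_write(iommu, host_memory, target_gpa, data):
    """Model a DMA write whose target address is rewritten from GPA to HPA."""
    hpa = iommu.translate(target_gpa)
    host_memory[hpa:hpa + len(data)] = data
    return hpa


iommu = Iommu({0x10: 0x2})  # hypothetical mapping: guest frame 0x10 -> host frame 0x2
host_memory = bytearray(4 * PAGE_SIZE)
hpa = dma_write(iommu, host_memory, 0x10 * PAGE_SIZE + 8, b"pkt")
```

In this model the VF device only ever sees the GPA; the lookup happens on the path to memory, which mirrors why no CPU work is needed for the translation.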
[0094] Referring to FIG. 7-a, another virtualization processing method provided in an embodiment of the present invention may include:
[0095] 701. Install drivers of a PF device and a VF device in the Host, where the PF device and the VF device correspond to an I/O device (referred to as E-2).
[0096] 702. Enable, by the Host, the I/O virtual function of the I/O device E-2; where, for example, the Host may enable the I/O virtual function of the I/O device E-2 when being initiated or at a certain moment after being initiated. The corresponding physical function PF device and several virtual function VF devices can be virtualized from the I/O device E-2 with the I/O virtual function enabled by the Host. Of course, another module may be used to enable the I/O device E-2, and the I/O device E-2 may also automatically enable its I/O virtual function when the device is powered on, in which case it is unnecessary for the Host or another module to enable the I/O virtual function of the I/O device E-2. The PF device virtualized from the I/O device E-2 is mainly responsible for a management function, and the VF device is mainly responsible for a processing function.
[0097] 703. Generate a PF software instance and several VF software instances in the Host; where a corresponding PF device and several VF devices can be virtualized from the I/O device E-2 with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence, and the PF software instance generated in the Host corresponds to the PF device virtualized from the I/O device E-2.
[0098] 704. Create, by the Host, an I/O virtual device (for example, referred to as vE-2) of the same type as the I/O device E-2; where a back-end instance BE (for example, referred to as BE-2) of the I/O virtual device vE-2 is created in the Host, and a front-end instance FE (for example, referred to as FE-2) of the I/O virtual device vE-2 is created in the initiated VM (for example, referred to as VM-2). For example, the Host may trigger the creation of the front-end instance FE-2 corresponding to the I/O virtual device vE-2 in the initiated VM-2. It may be considered that the FE-2 created in the VM-2 and the BE-2 created in the Host jointly constitute a driver of the I/O virtual device vE-2.
[0099] 705. Bind, by the Host, the created BE-2 with one idle VF software instance (for example, referred to as Vfe-2); where the VF software instance Vfe-2, for example, corresponds to the VF device (for example, referred to as VF-2) virtualized from the I/O device E-2. The so-called idle VF software instance is a VF software instance that is not bound with another back-end instance BE.
[0100] So far, a channel between the VF device VF-2 virtualized from the I/O device E-2 and the front-end instance FE-2 in the VM-2 is opened, so that the FE-2 can access the VF device VF-2 through the BE-2 in the Host. The VF device VF-2 is separately allocated to the VM-2 for use; because the VF device VF-2 is virtualized from the I/O device E-2 and can provide a highly efficient device interface, this helps the VM-2 acquire performance similar to that of a physical machine, with low delay and hardly any additional CPU overhead. Moreover, a front-end driver (that is, the FE-2) of the virtual device is in the VM-2, so the data is transferred through a back-end driver (that is, the BE-2) in the Host, and the VM-2 does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing.
[0101] 706. Pre-allocate, by the FE-2, a cache for direct memory access (DMA).
[0102] For example, the cache that is used for DMA and is pre-allocated by the FE-2 may be [GPA1, Len1], ... [GPAn, Lenn], that is, multiple sections of cache for DMA may be pre-allocated, where GPA1 represents a start address of the GPA of the cache, Len1 represents a cache length, and so on.
[0103] 707. Write, by the FE-2, a GPA corresponding to the pre-allocated cache for DMA into a shared memory between the Host and the VM-2, and notify the BE-2 (of course, it is also possible that, after performing self detection, the BE-2 finds that the GPA corresponding to the cache for DMA has been written into the shared memory).
[0104] 708. Modify, by the Host, the GPA corresponding to the cache for DMA in the shared memory to a corresponding HPA; where, for example, an address translation page table is set in the Host, and the address translation page table records mapping between the GPA and the HPA (for example, as shown in FIG. 7-b). By looking up the address translation page table, the Host may acquire an HPA corresponding to the GPA that corresponds to the cache for DMA, and modify the GPA corresponding to the cache for DMA to the acquired HPA.
[0105] 709. Acquire, by the VF software instance Vfe-2, the HPA corresponding to the cache for DMA through an exporting application programming interface of the BE-2, and write the acquired HPA corresponding to the cache for DMA into a receiving queue of the VF device VF-2 corresponding to the VF software instance Vfe-2.
[0106] 710. Select, by the VF device VF-2, the HPA corresponding to the cache for DMA from the receiving queue of the VF device when there is data to be received, and initiate a DMA write request by using the selected HPA as a target address.
[0107] 711. Notify, by the VF device VF-2, the corresponding VF software instance Vfe-2 in the Host after the DMA write request is executed, so that the VF software instance Vfe-2 triggers the front-end instance FE-2 in the VM-2 to receive data written into the cache corresponding to the HPA.
[0108] When being triggered by the VF software instance Vfe-2, the front-end instance FE-2 in the VM-2 may read the data written in the cache corresponding to the HPA.
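Steps 706 to 711 can be sketched as a short simulation. The following Python model is a minimal sketch under stated assumptions: the 4 KiB page size, the concrete addresses, and the names `host_translate`, `VfDevice`, and `receive` are hypothetical and introduced only for illustration; the embodiment does not prescribe any of them.

```python
from collections import deque

PAGE_SIZE = 4096  # assumed page granularity


def host_translate(shared_memory, page_table):
    """Step 708: the Host rewrites each [GPA, Len] entry in shared memory to [HPA, Len]."""
    for i, (gpa, length) in enumerate(shared_memory):
        frame, offset = divmod(gpa, PAGE_SIZE)
        shared_memory[i] = (page_table[frame] * PAGE_SIZE + offset, length)


class VfDevice:
    """Toy VF device with a receiving queue of host physical addresses."""

    def __init__(self, host_memory):
        self.receive_queue = deque()
        self.host_memory = host_memory

    def receive(self, data):
        """Step 710: select an HPA from the receiving queue and DMA-write the data there."""
        hpa, length = self.receive_queue.popleft()
        if len(data) > length:
            raise ValueError("data does not fit the pre-allocated cache")
        self.host_memory[hpa:hpa + len(data)] = data
        return hpa, len(data)  # step 711: reported to the VF software instance


# Steps 706-707: the FE pre-allocates two caches and publishes their GPAs.
shared_memory = [(0x5000, 2048), (0x6000, 2048)]
page_table = {0x5: 0x1, 0x6: 0x3}  # hypothetical guest frame -> host frame mapping
host_translate(shared_memory, page_table)

host_memory = bytearray(8 * PAGE_SIZE)
vf = VfDevice(host_memory)
vf.receive_queue.extend(shared_memory)  # step 709: Vfe-2 fills the receiving queue
hpa, n = vf.receive(b"hello")
```

Because the Host has already replaced the GPAs, the VF device issues its DMA write directly with an HPA and no translation is needed on the DMA path.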
[0109] It can be seen that, in this embodiment, after the Host enables the I/O virtual function of the I/O device, several VF software instances are generated in the Host; several corresponding VF devices are virtualized from the I/O device with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence; the Host creates an I/O virtual device of the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host, and a front-end instance FE of the I/O virtual device is created in the initiated VM; and the BE is bound with the idle VF software instance. In this way, an application architecture in which each VM can independently use one VF device is established, and a channel between one VF device virtualized from the I/O device and the front-end instance FE in one VM is opened, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a highly efficient device interface, which helps the VM acquire performance similar to that of a physical machine, with low delay and hardly any additional CPU overhead. Moreover, a front-end driver (that is, the FE) of the virtual device is in the VM, so data is transferred through a back-end driver (that is, the BE) in the Host, and the VM does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing, thereby optimizing the compatibility of a virtualization system.
[0110] In addition, during the procedure of executing the DMA write request, the Host implements the translation between the GPA and the HPA, thereby reducing hardware resource configuration and simplifying the processing flow.
[0111] Referring to FIG. 8-a, another virtualization processing method provided in an embodiment of the present invention may include:
[0112] 801. Enable, by the Host, the IOMMU; where the Host may enable the IOMMU when being initiated or at a certain moment after being initiated. Of course, the IOMMU may also enable the corresponding function automatically when the device is powered on, in which case it is unnecessary for the Host to enable the IOMMU; another module may also be used to enable the IOMMU.
[0113] 802. Install drivers of a PF device and a VF device in the Host, where the PF device and the VF device correspond to the I/O device (for example, referred to as E-3).
[0114] 803. Enable, by the Host, an I/O virtual function of the I/O device E-3; where, for example, the Host may enable the I/O virtual function of the I/O device E-3 when being initiated or at a certain moment after being initiated. The corresponding physical function PF device and several virtual function VF devices can be virtualized from the I/O device E-3 with the I/O virtual function enabled. Of course, another module may be used to enable the I/O device E-3, and the I/O device E-3 may also automatically enable its I/O virtual function when the device is powered on, in which case it is unnecessary for the Host or another module to enable the I/O virtual function of the I/O device E-3. The PF device virtualized from the I/O device E-3 is mainly responsible for a management function, and the VF device is mainly responsible for a processing function.
[0115] 804. Generate a PF software instance and several VF software instances in the Host; where the corresponding PF device and several VF devices can be virtualized from the I/O device E-3 with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence, and the PF software instance generated in the Host corresponds to the PF device virtualized from the I/O device E-3.
[0116] 805. Create, by the Host, an I/O virtual device (for example, referred to as vE-3) of the same type as the I/O device E-3; where a back-end instance BE (for example, referred to as BE-3) of the I/O virtual device vE-3 is created in the Host, and a front-end instance FE (for example, referred to as FE-3) of the I/O virtual device vE-3 is created in the initiated VM (for example, referred to as VM-3). For example, the Host may trigger the creation of a front-end instance FE-3 corresponding to the I/O virtual device vE-3 in the initiated VM-3. It may be considered that the FE-3 created in the VM-3 and the BE-3 created in the Host jointly constitute a driver of the I/O virtual device vE-3.
[0117] 806. Bind, by the Host, the created BE-3 with one idle VF software instance (for example, referred to as Vfe-3); where the VF software instance Vfe-3, for example, corresponds to the VF device (for example, referred to as VF-3) virtualized from the I/O device E-3. The so-called idle VF software instance is a VF software instance that is not bound with another back-end instance BE.
[0118] So far, a channel between the VF device VF-3 virtualized from the I/O device E-3 and the front-end instance FE-3 in the VM-3 is opened, so that the FE-3 can access the VF device VF-3 through the BE-3 in the Host. The VF device VF-3 is separately allocated to the VM-3 for use; because the VF device VF-3 is virtualized from the I/O device E-3 and can provide a highly efficient device interface, this helps the VM-3 acquire performance similar to that of a physical machine, with low delay and hardly any additional CPU overhead. Moreover, a front-end driver (that is, the FE-3) of the virtual device is in the VM-3, so the data is transferred through a back-end driver (that is, the BE-3) in the Host, and the VM-3 does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing.
[0119] 807. Write, by the front-end instance FE-3, a GPA corresponding to the cache where the data to be sent is located into a shared memory between the Host and the VM-3, and notify the BE-3 (of course, it is also possible that, after performing self detection, the BE-3 finds that the GPA corresponding to the cache where the data to be sent is located has been written into the shared memory).
[0120] For example, the GPA corresponding to the cache where the data to be sent is located is [GPA1, Len1], ... [GPAn, Lenn], that is, multiple sections of cache for DMA may be pre-allocated, where GPA1 represents a start address of the GPA of the cache, Len1 represents a cache length, and so on.
[0121] 808. Invoke, by the BE-3, a program sending interface of the VF software instance Vfe-3 bound with the BE-3, and write the GPA corresponding to the cache where the data to be sent is located into a sending queue of the VF device VF-3 corresponding to the VF software instance Vfe-3.
[0122] 809. When finding that there is data to be sent, initiate, by the VF device VF-3, a DMA read request by using the GPA recorded in the sending queue of the VF device as a target address; where the VF device VF-3, for example, may detect the sending queue of the VF device periodically or non-periodically and, when finding that a GPA is newly written into the sending queue, consider that there is data to be sent; or the VF software instance Vfe-3 may notify the VF device VF-3 after a GPA is newly written into the sending queue; where the DMA read request initiated by the VF device VF-3 will pass through the IOMMU.
[0123] 810. Modify, by the IOMMU, the target address GPA of the DMA read request to a corresponding HPA; where, for example, an address translation page table is set in the IOMMU, and the address translation page table records mapping between the GPA and the HPA (for example, as shown in FIG. 8-b). When the DMA read request passes through, the IOMMU may acquire an HPA corresponding to the target address GPA of the DMA read request by looking up the address translation page table, and modify the target address GPA of the DMA read request to the acquired HPA.
[0124] 811. Notify, by the VF device VF-3, the corresponding VF software instance Vfe-3 in the Host after the DMA read request whose target address is modified to the HPA is executed, so that the VF software instance Vfe-3 triggers the front-end instance FE-3 in the VM-3 to release the cache corresponding to the HPA.
[0125] When being triggered by the VF software instance Vfe-3, the front-end instance FE-3 in the VM-3 may release the cache corresponding to the HPA, so as to cache new data.
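The sending flow of steps 807 to 811 can be sketched in the same style. The following Python model is illustrative only: the names `FrontEnd`, `submit`, `release`, `iommu_translate`, and `vf_send`, the 4 KiB page size, and the concrete addresses are assumptions made here. As a simplification, the FE in this sketch tracks and releases its caches by GPA, since the GPA is the address the guest itself uses for the cache.

```python
from collections import deque

PAGE_SIZE = 4096  # assumed page granularity


class FrontEnd:
    """Toy FE that tracks which guest caches are still owned by in-flight sends."""

    def __init__(self):
        self.in_flight = set()

    def submit(self, shared_memory, gpa, length):
        # Step 807: publish the GPA of the cache holding the data to be sent.
        self.in_flight.add(gpa)
        shared_memory.append((gpa, length))

    def release(self, gpa):
        # Triggered by the VF software instance after the DMA read is executed.
        self.in_flight.discard(gpa)


def iommu_translate(page_table, gpa):
    """Step 810: look up the HPA for a GPA in the address translation page table."""
    frame, offset = divmod(gpa, PAGE_SIZE)
    return page_table[frame] * PAGE_SIZE + offset


def vf_send(sending_queue, page_table, host_memory):
    """Steps 809-810: DMA-read each queued GPA; the IOMMU rewrites it to an HPA."""
    sent = []
    while sending_queue:
        gpa, length = sending_queue.popleft()
        hpa = iommu_translate(page_table, gpa)
        sent.append((gpa, bytes(host_memory[hpa:hpa + length])))
    return sent


page_table = {0x7: 0x2}  # hypothetical guest frame -> host frame mapping
host_memory = bytearray(4 * PAGE_SIZE)
host_memory[0x2000:0x2004] = b"ping"  # guest data, physically located at HPA 0x2000

fe = FrontEnd()
shared_memory = []
fe.submit(shared_memory, 0x7000, 4)
sending_queue = deque(shared_memory)  # step 808: the BE passes the GPA to the sending queue
sent = vf_send(sending_queue, page_table, host_memory)
for gpa, _ in sent:
    fe.release(gpa)  # step 811: the FE releases the cache so that it can hold new data
```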
[0126] It can be seen that, in this embodiment, after the Host enables the I/O virtual function of the I/O device, several VF software instances are generated in the Host; several corresponding VF devices are virtualized from the I/O device with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence; the Host creates an I/O virtual device of the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host, and a front-end instance FE of the I/O virtual device is created in the initiated VM; and the BE is bound with the idle VF software instance. In this way, an application architecture in which each VM can independently use one VF device is established, and a channel between one VF device virtualized from the I/O device and the front-end instance FE in one VM is opened, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a highly efficient device interface, which helps the VM acquire performance similar to that of a physical machine, with low delay and hardly any additional CPU overhead. Moreover, a front-end driver (that is, the FE) of the virtual device is in the VM, so the FE transfers data through a back-end driver (that is, the BE) in the Host, and the VM does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing, thereby optimizing the compatibility of a virtualization system.
[0127] In addition, during the procedure of executing the DMA read request, the hardware module IOMMU implements the translation between the GPA and the HPA, thereby reducing CPU overhead and further improving performance.
[0128] Referring to FIG. 9-a, another virtualization processing method provided in an embodiment of the present invention may include:
[0129] 901. Install drivers of a PF device and a VF device in the Host, where the PF device and the VF device correspond to an I/O device (referred to as E-4).
[0130] 902. Enable, by the Host, the I/O virtual function of the I/O device E-4; where, for example, the Host may enable the I/O virtual function of the I/O device E-4 when being initiated or at a certain moment after being initiated. The corresponding physical function PF device and several virtual function VF devices can be virtualized from the I/O device E-4 with the I/O virtual function enabled. Of course, another module may be used to enable the I/O device E-4, and the I/O device E-4 may also automatically enable its I/O virtual function when the device is powered on, in which case it is unnecessary for the Host or another module to enable the I/O virtual function of the I/O device E-4. The PF device virtualized from the I/O device E-4 is mainly responsible for a management function, and the VF device is mainly responsible for a processing function.
[0131] 903. Generate a PF software instance and several VF software instances in the Host; where the corresponding PF device and several VF devices can be virtualized from the I/O device E-4 with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence, and the PF software instance generated in the Host corresponds to the PF device virtualized from the I/O device E-4.
[0132] 904. Create, by the Host, an I/O virtual device (for example, referred to as vE-4) of the same type as the I/O device E-4; where a back-end instance BE (for example, referred to as BE-4) of the I/O virtual device vE-4 is created in the Host, and a front-end instance FE (for example, referred to as FE-4) of the I/O virtual device vE-4 is created in the initiated VM (for example, referred to as VM-4). For example, the Host may trigger the creation of the front-end instance FE-4 corresponding to the I/O virtual device vE-4 in the initiated VM-4. It may be considered that the FE-4 created in the VM-4 and the BE-4 created in the Host jointly constitute a driver of the I/O virtual device vE-4.
[0133] 905. Bind, by the Host, the created BE-4 with one idle VF software instance (for example, referred to as Vfe-4); where the VF software instance Vfe-4, for example, corresponds to the VF device (for example, referred to as VF-4) virtualized from the I/O device E-4. The so-called idle VF software instance is a VF software instance that is not bound with another back-end instance BE.
[0134] So far, a channel between the VF device VF-4 virtualized from the I/O device E-4 and the front-end instance FE-4 in the VM-4 is opened, so that the FE-4 can access the VF device VF-4 through the BE-4 in the Host. The VF device VF-4 is separately allocated to the VM-4 for use; because the VF device VF-4 is virtualized from the I/O device E-4 and can provide a highly efficient device interface, this helps the VM-4 acquire performance similar to that of a physical machine, with low delay and hardly any additional CPU overhead. Moreover, a front-end driver (that is, the FE-4) of the virtual device is in the VM-4, so the data is transferred through a back-end driver (that is, the BE-4) in the Host, and the VM-4 does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing.
[0135] 906. Write, by the FE-4, a GPA corresponding to the cache where the data to be sent is located into a shared memory between the Host and the VM-4, and notify the BE-4 (of course, it is also possible that, after performing self detection, the BE-4 finds that the GPA corresponding to the cache where the data to be sent is located has been written into the shared memory).
[0136] 907. Modify, by the Host, the GPA corresponding to the cache where the data to be sent is located in the shared memory to a corresponding HPA; where, for example, an address translation page table is set in the Host, and the address translation page table records mapping between the GPA and the HPA (for example, as shown in FIG. 9-b). By looking up the address translation page table, the Host may, for example, acquire an HPA corresponding to the GPA that corresponds to the cache where the data to be sent is located, and modify the GPA corresponding to the cache where the data to be sent is located to the acquired HPA.
[0137] 908. Invoke, by the BE-4, a program sending interface of the VF software instance Vfe-4 bound with the BE-4, and write the HPA corresponding to the cache where the data to be sent is located into a sending queue of the VF device VF-4 corresponding to the VF software instance Vfe-4.
[0138] 909. When finding that there is data to be sent, initiate, by the VF device VF-4, a DMA read request by using the HPA recorded in the sending queue of the VF device as a target address; where the VF device VF-4 may, for example, detect the sending queue of the VF device periodically or non-periodically and, when finding that an HPA is newly written into the sending queue, consider that there is data to be sent; or the VF software instance Vfe-4 may notify the VF device VF-4 after an HPA is newly written into the sending queue.
[0139] 910. Notify, by the VF device VF-4, the corresponding VF software instance Vfe-4 in the Host after the DMA read request is executed, so that the VF software instance Vfe-4 triggers the front-end instance FE-4 in the VM-4 to release the cache corresponding to the HPA.
[0140] When being triggered by the VF software instance Vfe-4, the front-end instance FE-4 in the VM-4 may release the cache corresponding to the HPA, so as to cache new data.
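Step 909 allows two ways for the VF device to learn that there is data to be sent: detecting the sending queue itself, or being notified by the VF software instance. The following Python sketch models both options on one queue. The class `SendingQueue` and its methods `on_write`, `write`, and `poll` are hypothetical names introduced for illustration; a real VF device would implement this in hardware, for example with descriptor rings and doorbell registers, which the embodiment does not specify.

```python
class SendingQueue:
    """Toy sending queue supporting both detection styles described in step 909."""

    def __init__(self):
        self._entries = []
        self._seen = 0  # number of entries the device has already noticed
        self._listeners = []

    def on_write(self, callback):
        """Register a callback, modelling notification by the VF software instance."""
        self._listeners.append(callback)

    def write(self, hpa, length):
        """Record a new [HPA, Len] entry and notify any registered listener."""
        self._entries.append((hpa, length))
        for callback in self._listeners:
            callback(hpa, length)

    def poll(self):
        """Return entries written since the last poll, modelling device-side detection."""
        new = self._entries[self._seen:]
        self._seen = len(self._entries)
        return new


queue = SendingQueue()
notified = []
queue.on_write(lambda hpa, length: notified.append(hpa))

queue.write(0x1000, 64)
queue.write(0x3000, 128)
first = queue.poll()   # both entries are new on the first poll
second = queue.poll()  # nothing new has been written since
```

Polling lets the device pick up several entries at once at the cost of detection latency, while notification reacts to each write immediately; the embodiment leaves the choice open.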
[0141] It can be seen that, in this embodiment, after the Host enables the I/O virtual function of the I/O device, several VF software instances are generated in the Host; several corresponding VF devices are virtualized from the I/O device with the I/O virtual function enabled; the several VF software instances and the several VF devices are in one-to-one correspondence; the Host creates an I/O virtual device of the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host, and a front-end instance FE of the I/O virtual device is created in the initiated VM; and the BE is bound with the idle VF software instance. In this way, an application architecture in which each VM can independently use one VF device is established, and a channel between one VF device virtualized from the I/O device and the front-end instance FE in one VM is opened, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a highly efficient device interface, which helps the VM acquire performance similar to that of a physical machine, with low delay and hardly any additional CPU overhead. Moreover, a front-end driver (that is, the FE) of the virtual device is in the VM, so the data is transferred through a back-end driver (that is, the BE) in the Host, and the VM does not perceive a real physical device of the Host, which is convenient for transition and implementation of device sharing, thereby optimizing the compatibility of a virtualization system.
[0142] In addition, during the procedure of executing the DMA read request, the Host implements the translation between the GPA and the HPA, thereby reducing hardware resource configuration and simplifying the processing flow.
[0143] For better understanding and implementation of the foregoing methods in the embodiments of the present invention, apparatuses and a computer system configured to implement the foregoing methods are further provided.
[0144] Referring to FIG. 10, a host 1000 provided in an embodiment of the present invention may include: a first creating module 1010, a second creating module 1020, and a binding module 1030; the first creating module 1010 is configured to, after an I/O virtual function of an input/output I/O device is enabled, generate several VF software instances in the Host 1000, where several corresponding virtual function VF devices are virtualized from the I/O device with the I/O virtual function enabled, and the several VF software instances generated in the Host 1000 and the several VF devices are in one-to-one correspondence; the second creating module 1020 is configured to create an I/O virtual device of the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host 1000, and a front-end instance FE of the I/O virtual device is created in an initiated virtual machine VM; and the binding module 1030 is configured to bind the BE created by the second creating module 1020 with an idle VF software instance created by the first creating module 1010.
[0145] It can be understood that the host 1000 in this embodiment may be the Host in each of the foregoing method embodiments, and the functions of each function module may be specifically implemented according to the method in each of the foregoing method embodiments. For the specific implementation procedure, reference can be made to the relevant description of the foregoing method embodiments, and details are not repeated herein.
[0146] It can be seen that, in this embodiment, after the I/O virtual function of the I/O device is enabled, several VF software instances are generated in the Host 1000; several corresponding VF devices are virtualized from the I/O device with the I/O virtual function enabled; the several VF software instances generated in the Host 1000 and the several VF devices are in one-to-one correspondence; the Host creates an I/O virtual device of the same type as the I/O device, where a back-end instance BE of the I/O virtual device is created in the Host 1000, and a front-end instance FE of the I/O virtual device is created in the initiated VM; and the BE is bound with the idle VF software instance. In this way, an application architecture in which each VM can independently use one VF device is established, and a channel between one VF device virtualized from the I/O device and the front-end instance FE in one VM is opened, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a highly efficient device interface, which helps the VM acquire performance similar to that of a physical machine, with low delay and hardly any additional CPU overhead. Moreover, a front-end driver (that is, the FE) of the virtual device is in the VM, so data is transferred through a back-end driver (that is, the BE) in the Host 1000, and the VM does not perceive a real physical device of the Host 1000, which is convenient for transition and implementation of device sharing, thereby optimizing the compatibility of a virtualization system.
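The division of work among the first creating module, the second creating module, and the binding module can be pictured with a small model. The following Python sketch is illustrative only: the class names `VfSoftwareInstance` and `Host`, the method names, and the string identifiers for the BE and FE are assumptions introduced here, and the choice of the first idle instance is one possible selection policy that the embodiment does not mandate.

```python
class VfSoftwareInstance:
    """VF software instance that is in one-to-one correspondence with a VF device."""

    def __init__(self, vf_device_id):
        self.vf_device_id = vf_device_id
        self.bound_be = None  # None means the instance is idle


class Host:
    def __init__(self, num_vfs):
        # First creating module: one VF software instance per VF device.
        self.vf_instances = [VfSoftwareInstance(i) for i in range(num_vfs)]
        self.back_ends = {}

    def create_virtual_device(self, vm_name):
        # Second creating module: a BE in the Host and an FE in the initiated VM.
        be, fe = f"BE-{vm_name}", f"FE-{vm_name}"
        self.back_ends[vm_name] = be
        return be, fe

    def bind(self, be):
        # Binding module: bind the BE with an idle VF software instance,
        # that is, one not yet bound with another BE.
        for instance in self.vf_instances:
            if instance.bound_be is None:
                instance.bound_be = be
                return instance
        raise RuntimeError("no idle VF software instance")


host = Host(num_vfs=2)
be1, fe1 = host.create_virtual_device("VM-1")
be2, fe2 = host.create_virtual_device("VM-2")
vfe1 = host.bind(be1)
vfe2 = host.bind(be2)
```

Each BE ends up with its own VF software instance, so each VM reaches a separate VF device through its FE and BE pair; a third bind in this example would fail because no idle instance remains.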
[0147] Referring to FIG. 11-a, a computing node 1100 provided in an embodiment of the present invention may include a hardware layer 1110, a Host 1120 running on the hardware layer 1110, and at least one virtual machine VM 1130 running on the Host 1120.
[0148] The hardware layer 1110 includes an I/O device 1111, several corresponding virtual function VF devices 11111 are virtualized from the I/O device 1111, the Host 1120 has several VF software instances 1121, and the several VF software instances 1121 and the several VF devices 11111 are in one-to-one correspondence; the Host 1120 further has a back-end instance BE 1122 of an I/O virtual device of the same type as the I/O device 1111, and the VM 1130 has a front-end instance FE 1131 of the I/O virtual device; the BE 1122 in the Host 1120 is bound with an idle VF software instance 1121.
[0149] In an application scenario, the FE 1131 is configured to pre-allocate a cache for direct memory access DMA.
[0150] The VF software instance 1121 bound with the BE 1122 is configured to acquire an address corresponding to the cache for DMA through an exporting application programming interface of the BE 1122, and write the acquired address corresponding to the cache for DMA into a first storage unit of the VF device 11111 corresponding to the VF software instance 1121.
[0151] The VF device 11111 is configured to select the address corresponding to the cache for DMA from the first storage unit when there is data to be received, and initiate a DMA write request by using the address corresponding to the cache for DMA as a target address; and notify the corresponding VF software instance 1121 in the Host 1120 after the DMA write request is executed, so that the VF software instance 1121 triggers the FE 1131 to receive data written into the cache corresponding to the address.
[0152] Referring to FIG. 11-b, in an application scenario, the FE 1131 is further configured to write a guest physical address GPA corresponding to the pre-allocated cache for DMA into a shared memory 1140 between the Host 1120 and the VM 1130.
[0153] The VF software instance 1121 bound with the BE 1122 may be specifically configured to acquire the GPA corresponding to the cache for DMA from the shared memory 1140 through the exporting application programming interface of the BE 1122, and write the acquired GPA corresponding to the cache for DMA into the first storage unit of the VF device 11111 corresponding to the VF software instance 1121.
[0154] Moreover, the FE 1131 may also notify the corresponding BE 1122 of the GPA corresponding to the pre-allocated cache for DMA; and the VF software instance 1121 bound with the BE 1122 may acquire the GPA corresponding to the cache for DMA through the exporting application programming interface of the BE 1122.
[0155] The VF device 11111 may be specifically configured to select a GPA corresponding to the cache for DMA from the first storage unit when there is data to be received, and initiate a DMA write request by using the selected GPA corresponding to the cache for DMA as a target address; and notify the corresponding VF software instance 1121 in the Host 1120 after the DMA write request whose target address GPA is modified to a corresponding HPA is executed, so that the VF software instance 1121 triggers the FE 1131 to receive data written into the cache corresponding to the HPA.
[0156] The hardware layer 1110 of the computing node 1100 may further include: an input/output memory management unit IOMMU 1112, configured to modify the target address GPA of the DMA write request initiated by the VF device 11111 to a corresponding Host physical address HPA.
[0157] For example, an address translation page table is set in the IOMMU 1112, where the address translation page table records mapping between the GPA and the HPA; the IOMMU 1112 may acquire an HPA corresponding to the target address GPA of the DMA write request by looking up the address translation page table, and modify the target address GPA of the DMA write request to the acquired HPA.
[0158] Referring to FIG. 11-c, in another application scenario, the FE 1131 may further be configured to write the GPA corresponding to the pre-allocated cache for DMA into the shared memory 1140 between the Host 1120 and the VM 1130.
[0159] The Host 1120 may be configured to modify the GPA corresponding to the cache for DMA in the shared memory 1140 to a corresponding HPA.
[0160] For example, an address translation page table is set in the Host 1120, where the address translation page table records mapping between the GPA and the HPA; by looking up the address translation page table, the Host 1120 may acquire an HPA corresponding to the GPA that corresponds to the cache for DMA in the shared memory, and modify the GPA corresponding to the cache for DMA in the shared memory to the acquired HPA. Certainly, the FE 1131 may also notify the corresponding BE 1122 of the GPA corresponding to the pre-allocated cache for DMA, and the Host 1120 may modify the GPA corresponding to the cache for DMA to the corresponding HPA.
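Where the Host rather than the IOMMU performs the translation, the addresses recorded in the shared memory are rewritten before they reach the VF device. The following sketch assumes a page table keyed by guest page base address, a 4096-byte page size, and a plain list standing in for the shared-memory entries; the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size


def gpa_to_hpa(page_table, gpa):
    """Look up the host page for a GPA and keep the offset within the page."""
    offset = gpa % PAGE_SIZE
    return page_table[gpa - offset] + offset


def translate_shared_entries(page_table, shared_entries):
    """Rewrite every GPA in the shared-memory list to its HPA, in place.

    The new values are computed first, so the list is left untouched
    if any GPA has no mapping (a KeyError is raised in that case).
    """
    shared_entries[:] = [gpa_to_hpa(page_table, gpa) for gpa in shared_entries]
    return shared_entries
```

Computing all translations before updating the list is a design choice of the sketch: it avoids leaving a mixture of GPAs and HPAs in the shared memory when one lookup fails.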
[0161] The VF software instance 1121 bound with the BE 1122 may be specifically configured to acquire the HPA corresponding to the cache for DMA through the exporting application programming interface of the BE 1122; and write the acquired HPA corresponding to the cache for DMA into a first storage unit of the VF device 11111 corresponding to the VF software instance 1121.
[0162] The VF device 11111 may be specifically configured to select the HPA corresponding to the cache for DMA from the first storage unit when there is data to be received, initiate a DMA write request by using the selected HPA as a target address, and notify the corresponding VF software instance 1121 in the Host 1120 after the DMA write request is executed, so that the VF software instance 1121 triggers the FE 1131 to receive data written into the cache corresponding to the HPA.
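The receive path of paragraph [0162] can be sketched end to end. The example is a simplified model, not the embodiment itself: host physical memory is a byte array, the DMA write is a slice assignment, and the notification to the VF software instance is a callback. All names are hypothetical.

```python
from collections import deque


class HostMemory:
    """Flat byte array standing in for host physical memory."""

    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, hpa, payload):
        if hpa < 0 or hpa + len(payload) > len(self.data):
            raise ValueError("DMA write outside host memory")
        self.data[hpa:hpa + len(payload)] = payload

    def read(self, hpa, length):
        return bytes(self.data[hpa:hpa + length])


class VFDevice:
    """VF device whose rx_queue (first storage unit) already holds HPAs."""

    def __init__(self, memory, on_rx_complete):
        self.memory = memory
        self.rx_queue = deque()
        # Callback standing in for the notification to the VF software instance.
        self.on_rx_complete = on_rx_complete

    def receive(self, payload):
        if not self.rx_queue:
            return False                     # no posted buffer available
        hpa = self.rx_queue.popleft()        # select an HPA
        self.memory.write(hpa, payload)      # DMA write to that HPA
        self.on_rx_complete(hpa, len(payload))
        return True
```

In a usage scenario, the callback would be the point at which the VF software instance triggers the FE to pick up the data; here it can simply read the bytes back from `HostMemory`.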
[0163] Further, in an application scenario, the FE 1131 is further configured to write the GPA corresponding to the cache where the data to be sent is located into the shared memory 1140 between the Host 1120 and the VM 1130.
[0164] The BE 1122 may further be configured to acquire the GPA corresponding to the cache where the data to be sent is located from the shared memory 1140; invoke a program sending interface of the VF software instance 1121 bound with the BE 1122, and write the GPA corresponding to the cache where the data to be sent is located into a second storage unit of the VF device 11111 corresponding to the VF software instance 1121.
[0165] Moreover, the FE 1131 may also notify the corresponding BE 1122 of the GPA corresponding to the cache where the data to be sent is located, and the BE 1122 acquires that GPA according to the notification.
[0166] The VF device 11111 is further configured to initiate a DMA read request by using the GPA recorded in the second storage unit as a target address, when finding that there is data to be sent; moreover, the VF device 11111 may further be configured to, after the DMA read request is executed, notify the corresponding VF software instance 1121 in the Host 1120, so that the VF software instance 1121 triggers the FE 1131 to release the corresponding cache.
[0167] The hardware layer 1110 of the computing node 1100 may further include: an input/output memory management unit IOMMU 1112, configured to modify the target address GPA of the DMA read request initiated by the VF device 11111 to a corresponding HPA.
[0168] For example, an address translation page table is set in the IOMMU 1112, where the address translation page table records mapping between the GPA and the HPA; the IOMMU 1112 may acquire an HPA corresponding to the target address GPA of the DMA read request by looking up the address translation page table, and modify the target address GPA of the DMA read request to the acquired HPA.
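The send path of paragraphs [0166] to [0168] can be sketched in the same style. The example assumes 4 KiB pages, a dictionary keyed by page frame number as the IOMMU page table, and a second storage unit that records a (GPA, length) pair per entry; the description only requires that an address be recorded, so the length field and all names are assumptions of the sketch.

```python
from collections import deque

PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def iommu_translate(page_table, gpa):
    """Map a GPA to an HPA via a page-frame table (KeyError if unmapped)."""
    hfn = page_table[gpa >> PAGE_SHIFT]
    return (hfn << PAGE_SHIFT) | (gpa & PAGE_MASK)


class VFDevice:
    """VF device whose tx_queue stands in for the second storage unit."""

    def __init__(self, host_memory, page_table, on_tx_complete):
        self.host_memory = host_memory       # bytearray of host physical memory
        self.page_table = page_table
        self.tx_queue = deque()              # entries: (gpa, length)
        # Callback standing in for the notification that lets the FE
        # release the buffer.
        self.on_tx_complete = on_tx_complete
        self.wire = []                       # payloads "sent" by the device

    def transmit_pending(self):
        sent = 0
        while self.tx_queue:
            gpa, length = self.tx_queue.popleft()
            hpa = iommu_translate(self.page_table, gpa)   # IOMMU rewrites the target
            self.wire.append(bytes(self.host_memory[hpa:hpa + length]))  # DMA read
            self.on_tx_complete(gpa)
            sent += 1
        return sent
```

The completion callback receives the original GPA rather than the HPA, since in this scenario the FE only knows its buffers by guest physical address.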
[0169] In another application scenario, the FE 1131 may further be configured to write the GPA corresponding to the cache where the data to be sent is located into the shared memory 1140 between the Host 1120 and the VM 1130.
[0170] The Host 1120 may be configured to modify the GPA, recorded in the shared memory 1140, that corresponds to the cache where the data to be sent is located to a corresponding HPA.
[0171] For example, an address translation page table is set in the Host 1120, where the address translation page table records mapping between the GPA and the HPA; by looking up the address translation page table, the Host 1120 may acquire an HPA corresponding to the GPA, recorded in the shared memory, of the cache where the data to be sent is located, and modify that GPA in the shared memory to the corresponding HPA. Certainly, the FE 1131 may also notify the Host 1120 of the GPA corresponding to the cache where the data to be sent is located, and the Host 1120 may modify the GPA corresponding to the cache where the data to be sent is located to the corresponding HPA.
[0172] The BE 1122 is further configured to acquire the HPA corresponding to the cache where the data to be sent is located; invoke a program sending interface of the VF software instance 1121 bound with the BE 1122, and write the HPA corresponding to the cache where the data to be sent is located into a second storage unit of the VF device 11111 corresponding to the VF software instance 1121.
[0173] The VF device 11111 may further be configured to initiate a DMA read request by using the HPA recorded in the second storage unit as a target address, when finding that there is data to be sent. In addition, the VF device 11111 may further be configured to, after the DMA read request is executed, notify the corresponding VF software instance 1121 in the Host 1120, so that the VF software instance 1121 triggers the FE 1131 to release the corresponding cache.
[0174] The first storage unit, for example, is a receiving queue or receiving list of the VF device or another data storage structure capable of recording an address. The second storage unit, for example, is a sending queue or sending list of the VF device or another data storage structure capable of recording an address.
[0175] It can be understood that the Host 1120 in this embodiment can be the Host in each of the foregoing method embodiments, the working mechanism of the virtualization system on which the computing node 1100 runs in this embodiment may be as described in the foregoing method embodiments, and the functions of each function module may be specifically implemented according to the method in each of the foregoing method embodiments. For the specific implementation procedure, reference can be made to the relevant description of the foregoing method embodiments, and details are not repeated herein.
[0176] It can be seen that the computing node 1100 in the embodiment of the present invention includes: a hardware layer, a Host running on the hardware layer, and at least one VM running on the Host, where the hardware layer includes an input/output I/O device, and several corresponding virtual function VF devices are virtualized from the I/O device; the Host has several VF software instances, and the several VF software instances and the several VF devices are in one-to-one correspondence; the Host further has a back-end instance BE of an I/O virtual device having the same type as the I/O device, and the VM has a front-end instance FE of the I/O virtual device, where the BE in the Host is bound with an idle VF software instance. In this way, an application architecture in which each VM can independently use one VF device is established, and a channel between one VF device virtualized from the I/O device and the front-end instance FE in one VM is opened, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a high-efficiency device interface, which helps the VM achieve performance similar to that of a physical machine, with low delay and almost no extra CPU overhead. Moreover, a front-end driver (that is, the FE) of the I/O virtual device is in the VM, so the FE transfers data through a back-end driver (that is, the BE) in the Host, and the VM does not perceive a real physical device of the Host, which facilitates migration and device sharing, thereby optimizing the compatibility of the virtualization system.
[0177] Referring to FIG. 12, an embodiment of the present invention further provides a computer system, which may include: at least one computing node 1100.
[0178] It should be noted that, for simplicity of description, each of the foregoing method embodiments is described as a series of combined actions, but persons of ordinary skill in the art should understand that the present invention is not limited by the described action sequence, because according to the present invention, some steps may be performed in other sequences or performed simultaneously. Furthermore, persons skilled in the art should understand that all the embodiments described in the specification are exemplary embodiments, and the operations and modules involved may not be necessary for the present invention.
[0179] In the above embodiments, the description of each embodiment has its emphasis, and for the part that is not detailed in an embodiment, reference may be made to the relevant description of other embodiments.
[0180] In view of the above, the computing node in the embodiment of the present invention may include: a hardware layer, a Host running on the hardware layer, and at least one VM running on the Host, where the hardware layer includes an input/output I/O device, and several corresponding virtual function VF devices are virtualized from the I/O device; the Host has several VF software instances, and the several VF software instances and the several VF devices are in one-to-one correspondence; the Host further has a back-end instance BE of an I/O virtual device having the same type as the I/O device, and the VM has a front-end instance FE of the I/O virtual device, where the BE in the Host is bound with an idle VF software instance. In this way, an application architecture in which each VM can independently use one VF device is established, and a channel between one VF device virtualized from the I/O device and the front-end instance FE in one VM is opened, so that the FE can access the VF device through the BE in the Host. The VF device virtualized from the I/O device is separately allocated to the VM for use, and the VF device can provide a high-efficiency device interface, which helps the VM achieve performance similar to that of a physical machine, with low delay and almost no extra CPU overhead. Moreover, a front-end driver (that is, the FE) of the I/O virtual device is in the VM, so the FE transfers data through a back-end driver (that is, the BE) in the Host, and the VM does not perceive a real physical device of the Host, which facilitates migration and device sharing, thereby optimizing the compatibility of the virtualization system. Moreover, the OS in the VM does not need to support the latest hardware technology, and the computing node is applicable to various mainstream OSs without depending on VM drivers provided by independent hardware vendors (IHVs); the isolation and decoupling between a virtual machine and a physical platform are preserved completely, and the virtual machine is easy to migrate; the Host can still monitor data receiving and sending in the VM, and advanced features, such as data filtering and memory multiplexing, can still be used; the FE part of the PV front-end can be reused, and upgrading is convenient.
[0181] Moreover, during the procedure of executing the DMA read/write request, the hardware module IOMMU implements the translation between the GPA and the HPA, thereby reducing hardware resource configuration, and simplifying the processing flow. Alternatively, during the procedure of executing the DMA read request, the Host implements the translation between the GPA and the HPA, thereby reducing hardware resource configuration, and simplifying the processing flow.
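The binding of a BE to an idle VF software instance, on which the architecture summarized above relies, can be sketched as a small bookkeeping structure. This is an illustrative assumption about how a Host might track idle instances, not a description of the embodiment; the names `Host`, `bind`, and `NoIdleVFError` are hypothetical, and VF software instances are identified by integer index.

```python
class NoIdleVFError(RuntimeError):
    """Raised when every VF software instance is already bound."""


class Host:
    """Tracks which VF software instance each BE is bound with."""

    def __init__(self, num_vfs):
        # One VF software instance per VF device, identified by index.
        self.idle_vfs = list(range(num_vfs))
        self.bindings = {}               # BE name -> VF software instance index

    def bind(self, be_name):
        """Bind a BE with an idle VF software instance on a one-to-one basis."""
        if be_name in self.bindings:
            return self.bindings[be_name]    # already bound: keep the same VF
        if not self.idle_vfs:
            raise NoIdleVFError(f"no idle VF software instance for {be_name}")
        vf = self.idle_vfs.pop(0)
        self.bindings[be_name] = vf
        return vf
```

Because an instance is removed from the idle list as soon as it is bound, no two BEs can share a VF software instance, which is the property that lets each VM use one VF device independently.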
[0182] Persons of ordinary skill in the art should understand that all or a part of the steps of the method according to the embodiments of the present invention may be implemented by a program instructing relevant hardware. The program may be stored in a computer readable storage medium, such as a read only memory, a random access memory, a magnetic disk or an optical disk.
[0183] The virtualization processing method and apparatuses, and a computer system provided in embodiments of the present invention are described in detail. The principle and implementation of the present invention are described herein through specific examples. The description about the embodiments of the present invention is merely provided for easy understanding of the method and core ideas of the present invention. Persons of ordinary skill in the art can make variations and modifications to the present invention in terms of the specific implementation and application scopes according to the ideas of the present invention. Therefore, the specification shall not be construed as a limit to the present invention.

Claims (12)

What is claimed is:
  1. A virtualization processing method, applied in a computing node, wherein the computing node includes: a hardware layer having an input/output (I/O) device, and a Host layer, the method including: generating, after an input/output (I/O) virtual function (VF) of the I/O device is enabled, a plurality of VF software instances in the Host layer, wherein a plurality of corresponding VF devices are virtualized from the I/O device with the I/O VF enabled, and the plurality of VF software instances are in one-to-one correspondence with the plurality of VF devices; creating, in the Host layer, an I/O virtual device having a same type as the I/O device, creating a back-end instance (BE) of the I/O virtual device in the Host layer, and a front-end instance (FE) of the I/O virtual device in an initiated virtual machine (VM) running on the Host layer; binding the BE on a one-to-one basis with an idle VF software instance; acquiring, by the VF software instance bound with the BE, an address corresponding to a cache for direct memory access (DMA) through an exporting application programming interface of the BE; writing the acquired address corresponding to the cache for DMA into a first storage unit of a VF device corresponding to the VF software instance; selecting, by the VF device, an address corresponding to the cache for DMA from the first storage unit when there is data to be received; and initiating a DMA write request by using the selected address corresponding to the cache for DMA as a target address.
  2. The method of claim 1, further including notifying, by the VF device, the VF software instance that corresponds to the VF device and is in the Host layer after the DMA write request is executed, so that the VF software instance triggers the FE to receive data written into the cache corresponding to the address.
  3. The method of claim 1, wherein the Host layer includes a virtual machine monitor (VMM) running on the hardware layer.
  4. The method of claim 1, further including pre-allocating, by the FE, the cache for DMA.
  5. A host, running on a hardware layer, wherein the hardware layer includes an input/output (I/O) device, the host including: a first creating module, configured to, after an I/O virtual function of the input/output (I/O) device is enabled, generate a plurality of VF software instances; wherein a plurality of corresponding virtual function VF devices are virtualized from the I/O device with the I/O virtual function enabled; wherein each of the plurality of generated VF software instances corresponds to a different VF device of the plurality of VF devices; a second creating module, configured to: create an I/O virtual device having a same type as the I/O device, create a back-end instance BE of the I/O virtual device, and create a front-end instance FE of the I/O virtual device in an initiated virtual machine VM running on the host; and a binding module, configured to bind the BE created by the second creating module on a one-to-one basis with an idle VF software instance created by the first creating module; wherein the binding module performing the bind of the BE created by the second creating module on a one-to-one basis with an idle VF software instance created by the first creating module is configured to: acquire an address corresponding to a cache for direct memory access (DMA) through an exporting application programming interface (API) of the BE; and write the acquired address corresponding to the cache for DMA into a first storage unit in a VF device corresponding to the VF software instance; and wherein the binding module performing the bind of the BE created by the second creating module on a one-to-one basis with an idle VF software instance created by the first creating module is further configured to: select the address corresponding to the cache for DMA from the first storage unit when there is data to be received; and initiate a DMA write request by using the selected address corresponding to the cache for DMA as a target address.
  6. The host of claim 5, wherein the binding module performing the bind of the BE created by the second creating module on a one-to-one basis with an idle VF software instance created by the first creating module is further configured to notify the VF software instance that corresponds to the VF device and is in the Host layer after the DMA write request is executed, so that the VF software instance triggers the FE to receive data written into the cache corresponding to the address.
  7. The host of claim 5, wherein the Host is a virtual machine monitor (VMM) running on the hardware layer.
  8. The host of claim 5, wherein the FE is configured to pre-allocate the cache for DMA.
  9. A computing node, including: a hardware layer, a Host layer, and at least one virtual machine (VM) running on the Host layer, wherein the hardware layer includes an input/output (I/O) device, wherein a plurality of corresponding virtual function (VF) devices are virtualized from the I/O device, wherein the Host has a plurality of VF software instances, the plurality of VF software instances being in one-to-one correspondence with the plurality of VF devices, wherein the Host layer further has a back-end instance (BE) of an I/O virtual device having a same type as the I/O device, wherein the VM has a front-end instance (FE) of the virtual device, and wherein the BE in the Host layer is bound on a one-to-one basis with an idle VF software instance; wherein the VF software instance bound with the BE is configured to: acquire an address corresponding to a cache for direct memory access (DMA) through an exporting application programming interface (API) of the BE; and write the acquired address corresponding to the cache for DMA into a first storage unit in a VF device corresponding to the VF software instance; and wherein the VF software instance bound with the BE is further configured to: select the address corresponding to the cache for DMA from the first storage unit when there is data to be received; and initiate a DMA write request by using the selected address corresponding to the cache for DMA as a target address.
  10. The computing node of claim 9, wherein the VF software instance bound with the BE is further configured to notify the VF software instance that corresponds to the VF device and is in the Host layer after the DMA write request is executed, so that the VF software instance triggers the FE to receive data written into the cache corresponding to the address.
  11. The computing node of claim 9, wherein the Host layer includes a virtual machine monitor (VMM) running on the hardware layer.
  12. The computing node of claim 9, wherein the FE is configured to pre-allocate the cache for DMA.
AU2015203452A 2011-12-31 2015-06-22 Virtualization processing method and apparatuses, and computer system Active AU2015203452B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2015203452A AU2015203452B2 (en) 2011-12-31 2015-06-22 Virtualization processing method and apparatuses, and computer system

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201110458345.8 2011-12-31
AU2012250375A AU2012250375B2 (en) 2011-12-31 2012-05-22 Virtualization processing method and apparatuses, and computer system
AU2015203452A AU2015203452B2 (en) 2011-12-31 2015-06-22 Virtualization processing method and apparatuses, and computer system

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU2012250375A Division AU2012250375B2 (en) 2011-12-31 2012-05-22 Virtualization processing method and apparatuses, and computer system

Publications (2)

Publication Number Publication Date
AU2015203452A1 AU2015203452A1 (en) 2015-07-16
AU2015203452B2 AU2015203452B2 (en) 2016-12-22

Family

ID=53673365

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2015203452A Active AU2015203452B2 (en) 2011-12-31 2015-06-22 Virtualization processing method and apparatuses, and computer system

Country Status (1)

Country Link
AU (1) AU2015203452B2 (en)

Also Published As

Publication number Publication date
AU2015203452A1 (en) 2015-07-16

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)