CN117573041B - Method for improving virtualized storage performance by improving vhost-scsi - Google Patents


Info

Publication number
CN117573041B
CN117573041B CN202410056443.6A
Authority
CN
China
Prior art keywords
host
polling
guest
data
vhost
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410056443.6A
Other languages
Chinese (zh)
Other versions
CN117573041A (en)
Inventor
王宇锋
雷翔
谢明
孙立明
张铎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kirin Software Co Ltd
Original Assignee
Kirin Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kirin Software Co Ltd filed Critical Kirin Software Co Ltd
Priority to CN202410056443.6A priority Critical patent/CN117573041B/en
Publication of CN117573041A publication Critical patent/CN117573041A/en
Application granted granted Critical
Publication of CN117573041B publication Critical patent/CN117573041B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • G06F3/0611Improving I/O performance in relation to response time
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • G06F3/0631Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0674Disk device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45583Memory management, e.g. access or allocation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention relates to a method for improving virtualized storage performance by improving vhost-scsi, which replaces the passive event notification mechanism with an active Polling event query and comprises the following steps. S100: initiate an application to create contiguous shared memory in the address space of the guest kernel layer, and allocate the shared memory to the SQ and the CQ. S200: the shared memory created in the guest kernel layer is passed, after memory address conversion by the Qemu module, to the vHost module of the host kernel layer. S300: Polling threads are allocated and started at the Guest end and the Host end, which continuously poll and set the key identifiers of the SQ and the CQ to complete the loop logic. By changing the previous passive event notification mechanism into an active polling query mechanism, the invention makes the storage process more efficient when storing 4K-level data, reduces service request response time, and also reduces server energy consumption.

Description

Method for improving virtualized storage performance by improving vhost-scsi
Technical Field
The invention relates to electronic digital data processing, in particular to a method for improving virtualized storage performance by improving vhost-scsi.
Background
The importance of enhancing IO performance of storage is self-evident for modern data centers and cloud computing environments. With the rapid development of digital transformation and cloud computing, data has become an important asset for enterprises, and the storage and processing capabilities of the data directly affect the business operations and performances of the enterprises.
The stored IO performance refers to the capability of the storage device to perform read-write operation in unit time, and is one of important indexes for measuring the performance of the storage device. In data centers and cloud computing environments, stored IO performance directly affects the following aspects:
Application performance: the IO performance of the storage device limits the operating speed and response time of the application system. If the IO performance of the storage device is insufficient, the application system responds slowly or crashes, affecting the business operations of the enterprise.
Data center efficiency: the IO performance of a storage device directly affects the efficiency and operating costs of a data center. If the IO performance of the storage device is insufficient, more server devices need to be added to meet the requirements by parallel processing, which increases the operation cost and complexity of the data center.
Competitiveness of cloud computing service providers: providing storage devices with high IO performance is one of the important means for cloud computing service providers to attract customers. The storage device with high IO performance can improve the use experience and reliability of cloud service.
Thus, improving storage IO performance is critical to data centers and cloud computing environments. Traditional storage architectures cannot meet the requirements of modern application systems, and more advanced storage technologies, such as NVMe and distributed storage, need to be adopted to improve the IO performance and scalability of the storage device. Meanwhile, factors such as energy consumption and reliability of the storage device also need to be considered, so that the storage device can meet enterprise requirements while reducing long-term operating cost.
The vhost-scsi moves the back-end implementation of virtio from the Qemu application layer into the Host Kernel layer, so data is sent and received directly in the kernel, avoiding user/kernel context switches and intermediate data copies; it is currently the best scheme for quickly flushing virtualized storage data to a local disk. However, while hardware devices grow increasingly powerful, for IO with data volumes of 4K and below, the consumption of the software stack still accounts for a large proportion, so optimizing the storage software layer is all the more important for high-performance storage.
The Chinese invention patent "A method and system for improving virtualized storage performance of the Shenwei platform" (patent number CN112148224A) discloses a method and system comprising the following steps: forming a disk array on a host; constructing a new storage volume on the disk array; constructing a file system based on the new storage volume; creating a virtual machine on the constructed file system; the host realizes read-write operations with the guest through the virtual machine. That invention uses LVM cache combined with RAID0 technology to build the file system, and uses KVM virtualization technology to build the virtual machine, so as to improve the read-write performance of virtualized storage IO. The patent mainly relies on LVM cache and RAID0 for performance improvement and does not optimize the IO path stack; under random storage tests of small data below 4K, optimization at the software stack layer is still needed.
The Chinese invention patent "A method, device and system for accelerating virtual machine I/O by a cloud platform" (patent number CN114020406A) discloses a method, device, system and computer-readable storage medium comprising: creating in advance a virtual machine sharing huge-page memory, and acquiring the protocol and connection information of a target volume; sending a request to create a controller to the SPDK vhost-user service on the host corresponding to the virtual machine, so as to create a vhost-user-scsi controller; according to the protocol and connection information of the target volume, having the SPDK Bdev take over the target volume and adding it to the vhost-user-scsi controller; and associating the virtual machine with the vhost-user-scsi controller, so that Qemu within the virtual machine accelerates the target volume by invoking the vhost-user-scsi controller; that invention helps improve acceleration efficiency and system performance. The patent mainly accelerates virtual machine I/O around the constraint of a pre-created virtual machine sharing huge-page memory, and is mainly used in a vhost-user-scsi module, which differs from the method and scenario of the present application.
The Chinese invention patent "Method and system for optimizing the virtualization performance of Shenwei platform storage input/output devices" (patent number CN111796912A) discloses a method and system in which: the guest's emulation processor QEMU provides a shared memory for the guest and the host; the guest's QEMU communicates with the host to inform it of the address information of the shared memory; after the host receives the address information of the shared memory, it calculates the address of that shared memory within the host user process and then performs read-write operations. The patent mainly uses shared memory for acceleration, mainly adapting the shared-memory technique of the vhost-scsi scheme to the Shenwei platform, and provides no additional general acceleration method for other platforms.
Chinese patent invention "a method for safely storing and quickly calling data and mobile terminal" (patent number: CN 109829324A). The invention discloses a method for safely storing and quickly calling data and a mobile terminal, comprising the following steps: encrypting data which needs to be stored in an open public path by a system; storing the encrypted data under the open public path; decrypting the data in the open public path, storing the decrypted data in a virtual memory, and forming a mapped path according to the storage address; modifying a system call interface of which the access path defaults to the open public path, and modifying the access path of the system call interface to the mapped path, so that the system invokes decrypted data from the virtual memory for use. The invention not only can solve the problem of safe storage of data under the default path of the system, but also can improve the calling speed of the data, avoid the phenomena of system blocking, no response and the like, and well solve the contradiction between the problem of data storage safety and the problem of data calling rapidity. This patent only encrypts data stored under an open public path, helping little to improve I/O efficiency.
The Chinese invention patent "A data access method and device of an NVMe-oF user client" (patent number CN114417373A) provides a data access method and device of an NVMe-oF user client, wherein the method comprises: receiving a data access request message sent by a virtual host vhost device; parsing the data access request message to obtain a first server identifier and an access operation instruction; selecting a first ring queue based on the first server identifier, and writing the access operation instruction into the control instruction area of the first NVMe-oF server through the first ring queue; the first ring queue is a queue between the vhost device and the first NVMe-oF server. The patent is mainly aimed at improving efficiency for NVMe-oF devices by using DMA to shorten the IO path.
Disclosure of Invention
The invention mainly aims to provide a method for improving virtualized storage performance by improving vhost-scsi, which changes the passive event notification mechanism into an active polling query mechanism, so that the storage process is more efficient when storing 4K-level data, service request response time is reduced, and server energy consumption is reduced at the same time.
In order to accomplish the above object, the present invention provides a method for improving vhost-scsi to promote virtualized storage performance, which modifies the IOeventfd event mechanism into a Polling active event query; the modification method comprises the following steps:
S100: initiate an application to create contiguous shared memory in the address space of the Guest Kernel, and allocate the shared memory to the SQ and the CQ;
S200: the shared memory created in the Guest Kernel is passed, after memory address conversion by the Qemu module, to the vHost module of the Host Kernel;
S300: Polling threads are allocated and started at the Guest end and the Host end, which continuously poll and set the key identifiers of the SQ and the CQ to complete the Polling logic.
Preferably, step S100 further includes the following steps:
S110: divide the shared memory in the address space of the Guest Kernel according to the contiguous whole-page memory allocation principle, and allocate regions of the whole-page memory to the SQ and the CQ;
S120: divide the SQ region space and allocate it to SQ->head, SQ->tail and SQ->flag;
S130: divide the CQ region space and allocate it to CQ->head, CQ->tail and CQ->flag.
Further preferably, step S200 further includes the following steps:
S210: pre-configure a PCI configuration space in Qemu, and write the information of the PCI configuration space in Qemu correspondingly into the PCI configuration space in the Guest Kernel, so that Qemu obtains the Guest Physical Address;
S220: Qemu converts the Guest Physical Address into a Host Virtual Address;
S230: the Host Virtual Address is passed in Qemu to the vHost module in the Host Kernel through the Ioctl interface.
Still more preferably, in step S300, the Polling logic is performed by setting the head, tail and flag of the SQ and the CQ respectively.
Still further preferably, the modification method further comprises the following steps:
In step S300, the specific Polling logic is as follows:
S310: start an Sqthread thread in the Host Kernel and enter the Polling state;
S320: set the Sq.head mark in the Guest OS to n, where n is a natural number;
S330: the Sqthread thread in the Host OS actively queries whether the value of the Sq.head mark is larger than the value of Sq.tail; if so, new data has been filled in;
S340: acquire the newly transmitted data through the Sq.head mark in the Host OS;
S350: update the Sq.tail mark in the Host OS, indicating that the newly sent data has been obtained by the Host OS end;
S360: process the newly transmitted data in the Host OS;
S370: after processing the new data in the Host OS, update Cq.head = n, meaning that the data corresponding to index value n has been processed; continue polling to determine whether the next data is sent, incrementing n step by step;
S380: determine in the Guest OS whether Cq.head has reached n, and notify the upper application layer that the request of Sq.head = n has been processed;
S390: after the application layer is notified in the Guest OS, update Cq.tail to n; once the result of request n has also been processed, the resources occupied by index n can be released.
Still further preferably, in step S370, if no new data is received within a predetermined time, the Polling thread is set to the sleep state.
The beneficial effects of the invention are as follows:
the advantages of the Vhost-scsi technique in terms of performance are mainly due to its optimization of the data transmission path, reduced virtualization overhead and improved data transmission mechanisms. The data transfer of the Vhost-scsi technique uses IOeventfd and irqfd for notification between the virtual machine and the host. IOeventfd is used to inform HOST that data is ready, while irqfd is used to inform virtual machines of interrupt injection. After analyzing the whole flow of the vhost-scsi, the event notification mechanism is found to have longer real paths in the KVM module and relatively more performance cost, so that the IOeventfd event mechanism is changed into a Polling active event query, and the mechanism reduces the delay of request response, thereby improving the performance.
In a comparison test using fio 4K random read-write on a virtual disk, the invention adopts the Polling scheme to replace the event notification mechanism, and IOPS efficiency is improved by about 6%.
Drawings
The invention will be described in further detail with reference to the drawings and the detailed description.
FIG. 1 is a diagram of the original IO flow of a read-write request of a Vhost-scsi prior to modification;
FIG. 2 is a flow chart of the present invention after modification of the Vhost-scsi Polling scheme;
FIG. 3 is a flowchart of step S300 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
Abbreviations and key term definitions:
KVM: short for Kernel-based Virtual Machine, a virtualization module added to the Linux operating system, through which the operating system can directly run multiple virtual machines, i.e. virtual guests (VM, Virtual Machine).
Guest OS: in virtualization, a computer may run multiple operating systems simultaneously, each of which is referred to as a Guest OS, i.e., an operating system on a virtual machine.
Host OS: in virtualization, the Host OS corresponds to the Guest OS: the former is the operating system installed on the physical machine, the latter the operating system installed in the virtual machine. For example, with an Ubuntu system installed on a physical computer, on which we then create one virtual machine running OpenEuler, Ubuntu is the Host OS.
Qemu: qemu is called as Quick manager, is a powerful open-source virtual machine manager, and can support hardware virtualization by combining with a KVM module in a Linux kernel. Currently, qemu+kvm is commonly adopted in Linux platforms to provide virtualization services.
vhost-scsi: a virtual disk technology that emulates a SCSI device on the Host and provides it to the virtual machine as a virtual disk for storage. The greatest characteristic of vhost-scsi is high performance: by moving the back-end implementation of Virtio from the Qemu application layer into the Host Kernel layer, data is sent and received directly in the kernel, avoiding user/kernel context switches and intermediate data copies and shortening the IO stack, which noticeably helps improve IOPS under 4K random read-write test conditions.
Event notification mechanism in vhost-scsi: the IO flow of vhost-scsi is initiated in the Guest Kernel; the data to be transmitted is stored in a Descriptor entry of the Vring, then a Virtqueue index value is written into the PCI space, which triggers the KVM module, and the back end is informed through the bound IOeventfd which Vring queue has new data waiting to be stored.
vhost-scsi changed to the Polling mechanism: shared memory is created between the Guest Kernel and the Host Kernel, and through a series of agreed protocols, new data to be processed is discovered by actively querying the flag in the shared memory, which reduces waiting time and further improves IO efficiency.
SQ (Submit Queue): the SQ is the queue through which user space submits IO requests. When a user process needs to submit IO requests to the kernel through a system call, these requests are put into the SQ. Requests in the SQ are arranged in first-in-first-out (FIFO) order for the kernel thread to process. When submitting an IO request, the user process needs to provide the relevant I/O operation parameters, such as start address, operation type (read/write), and data length. These parameters are packaged into a request structure and then submitted to the SQ via a system call. When the kernel thread reads a request from the SQ, it performs the corresponding I/O operation according to the request's type and parameters. After the operation completes, the kernel thread writes the result back to the shared memory and notifies the user process of completion using the CQ (Completion Queue).
CQ (Completion Queue): the CQ is the queue that notifies the user of IO operation completion. When the kernel thread completes an IO operation, it places a completion notification into the CQ. The user process can continuously read completion notifications from the CQ through system calls to handle completed IO operations. The SQ and CQ use the shared memory as the communication medium, so the IO initiating end and the IO processing end can interact efficiently. By actively querying IO requests and reducing the number of system calls, Polling can significantly improve disk I/O performance.
The performance advantages of the vhost-scsi technique are mainly due to its optimization of the data transmission path, reduced virtualization overhead, and improved data transmission mechanism. Data transfer in vhost-scsi uses IOeventfd and Irqfd for notification between the virtual machine and the host: IOeventfd informs the Host that data is ready, while Irqfd informs the virtual machine to perform interrupt injection. After analyzing the whole flow of vhost-scsi, we found that this event notification mechanism has a rather long real path in the KVM module and a relatively high performance cost, so we focused further optimization on changing the IOeventfd event mechanism into a Polling active event query. This mechanism reduces request response latency, thereby improving performance.
FIG. 1 shows the original IO flow of a read-write request in the vhost-scsi before improvement. The following procedure is depicted in FIG. 1:
In the Guest: an upper layer -> generic block device layer -> virtio-scsi layer -> virtio layer.
In the Host: vhost (virtio) layer -> vhost_scsi layer -> the specific physical IO flush to disk.
As can be seen from FIG. 1, in the IO flow of the vhost-scsi scheme the event notification from Guest to Host is forwarded through KVM and then passes through a relatively long software stack.
The flow chart of the modified vhost-scsi Polling scheme is shown in FIG. 2. As can be seen from FIG. 2, replacing event notification with polling of the shared memory identifier removes the time-consuming program stack calls of event notification.
As can be seen from a comparison of FIG. 1 and FIG. 2, the vp_notify_iowrite (vq_index) path in FIG. 2 is the data notification path before modification. The present application still keeps this path, which serves to wake up a dormant polling thread in vhost-scsi. This path existed before the modification and is also present in the original vhost mechanism preparation; its final implementation likewise writes PCI space data. Qemu binds this PCI space to an Irqfd; when the Guest writes the PCI space value, the bound interrupt event is triggered in the Host Kernel, which then processes the IO request through the Virtio protocol.
In FIG. 2, vhost_scsi_handle_kick: the processing function with which the Host, following the original vhost-scsi logic, receives the Guest's IO event notification (through the kvm module).
virtqueue_add_split: the function with which the Guest-side virtio adds data in split mode.
To change the vhost-scsi event notification mechanism into a Polling mechanism, the following problems need to be solved:
1. How to create contiguous shared memory in the address space.
If only the Guest knows the memory address, the Host does not know the marks the Guest writes into that memory, nor where to look for them, so the IO event cannot be transferred; a shared memory spanning the Guest Kernel and the Host Kernel therefore needs to be opened up. Virtual memory addresses are contiguous, but the physical memory addresses behind them are not necessarily contiguous, so we need a piece of memory that appears contiguous to both Host and Guest along the chain GPA (Guest Physical Address) -> HVA (Host Virtual Address) -> HPA (Host Physical Address). Ensuring that the memory is contiguous in both the virtual and the physical address space is the first problem to be solved. To solve the shared memory problem, a section of whole-page memory can be requested in the Guest Kernel, with certain regions designated for the SQ and the CQ for record keeping.
Therefore, the first step of this embodiment in changing the vhost-scsi event notification mechanism into the Polling mechanism is:
S100: initiate an application to create contiguous shared memory in the address space of the Guest Kernel, and allocate the shared memory to the SQ and the CQ.
In this step, the method specifically further includes the following steps:
s110: and dividing the shared memory in the address space of the Guest Kernel according to the continuous whole page memory allocation principle, and allocating the division areas of the whole page memory to the SQ and the CQ. In this step, it is possible to analyze if we allocate in the Guest in whole page so that the section of memory is also contiguous in the Host view. And dividing the region into SQ and CQ in the whole page of memory of the application.
Specifically, the required memory size is estimated, and the shared memory is then allocated in units of whole pages (i.e. the actual page_size, 4096 bytes). For example, to request a 4088-byte memory, which is smaller than 4096, one full page suffices. To request a 4097-byte memory, which is larger than 4096, one page is not enough, so a 2-page allocation, i.e. 4096 x 2 = 8192 bytes of shared memory, is needed.
We find that when memory is requested in whole-page sizes in the Guest, Qemu also requests whole pages from the Host's memory management module. Thus, the memory seen in the Guest is not assembled from scattered pages on the Host but is one contiguous block, so the Guest OS and the Host OS can conveniently operate on the same shared memory.
As described above, only when whole-page-aligned memory is allocated are its GPAs contiguous. The virtual address space appears contiguous, but the physical addresses behind it are scattered, and various translations are needed for vHost to correctly identify the data marked by the Guest; with contiguous memory, the vHost kernel side can poll efficiently.
S120: the space is divided in the SQ region space and allocated to the SQ- > head, the SQ- > tail, the SQ- > flag.
The function of SQ->flag is to signal whether a wakeup is needed. If the SQ polling thread on the Host side finds that the upper layer has issued no data for a period of time, it sets SQ->flag to FLAG_VRING_SQ_NEED_WAKEUP (meaning the vhost sqpoll thread must be woken up before the next submission) and then goes to sleep, saving CPU. Before sending data, the Guest checks whether SQ->flag carries this need-wakeup mark; if so, it first notifies vHost through the original kick flow so that the vhost sqpoll thread resumes polling, and only then sets the SQ->head mark.
S130: Divide the CQ region and allocate it to CQ->head, CQ->tail and CQ->flag.
CQ->flag is a reserved item, intended for the case where the Guest side also polls for completion events: if the Guest polls for a period of time without seeing a completion, the corresponding Guest thread sets CQ->flag to the need-wakeup state before going to sleep, and the next time the Host side posts a completion event it must first wake the Guest and only then change the CQ->head value.
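The head/tail/flag fields of S120/S130 and the wakeup handshake can be sketched as below. The struct layout, flag value and helper names are our own illustration, not the patent's actual definitions:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout of the head/tail/flag fields carved out of the SQ
 * and CQ regions in S120/S130. The flag bit mirrors the
 * FLAG_VRING_SQ_NEED_WAKEUP mark in the text; the encoding is assumed. */
#define FLAG_NEED_WAKEUP (1u << 0)

struct shared_ring {
    _Atomic uint32_t head;   /* producer index */
    _Atomic uint32_t tail;   /* consumer index */
    _Atomic uint32_t flag;   /* wakeup negotiation */
};

/* Host sqpoll thread, before sleeping: advertise that it must be kicked. */
static void ring_advertise_sleep(struct shared_ring *sq)
{
    atomic_fetch_or(&sq->flag, FLAG_NEED_WAKEUP);
}

/* Guest side, before submitting: must it kick the host first? Clears the
 * mark and reports whether it was set; the actual kick (and the later
 * bump of head) is not shown. */
static int ring_consume_wakeup(struct shared_ring *sq)
{
    return (atomic_fetch_and(&sq->flag, ~FLAG_NEED_WAKEUP)
            & FLAG_NEED_WAKEUP) != 0;
}
```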
2. How the address of the shared memory created in the Guest Kernel is translated and passed to the vHost module of the Host Kernel, i.e. step S200 of the present application: the shared memory created in the Guest Kernel is passed, after memory address translation by the Qemu module, to the vHost module of the Host Kernel.
That is, if only the Guest Kernel knows the shared memory address, the Host Kernel cannot take part and the whole flow cannot circulate; both ends must operate on the same block of memory.
In this application, this is achieved by the following specific steps:
s210: pre-configuring a PCI configuration space in Qemu, and correspondingly writing information of the PCI configuration space in Qemu into the PCI configuration space in Guest Kernel to enable Qemu to obtain Guest Physical Address;
s220: qemu converts Guest Physical Address to Host Virtual Address;
s230: host Virtual Address is passed in Qemu to the vhest module in Host Kernel over the Ioctl interface.
3. The Polling logic of vhost-scsi in the Host Kernel, i.e. step S300 of this embodiment: the Polling logic is completed by setting the SQ and CQ.
A Polling logic flow must be designed so that the loop takes as few locks as possible and the system still runs efficiently under a large number of requests. The specific Polling logic flow is shown in fig. 3.
In step S300, the Polling thread is started through coordination of the Guest end and the Host end, and the Polling logic is completed by continuously polling and setting the key identifiers of the SQ and CQ.
Specifically, the modification method further comprises the following steps:
In step S300, the concrete Polling logic is as follows:
s310: the vhost_iovq_thread [ start ] is called in the Host Kernel, namely, the Sqthread thread is started in the Host Kernel, and the Polling state is started;
s320: setting the identifier of the sq.head in the Guest OS to n, n being a natural number, as shown in fig. 3, in this embodiment, n=3;
s330: actively inquiring whether the value of the Sq.head mark is larger than the value of the Sq.tail or not in the Sqthread thread in the Host OS, and if the value of the Sq.head mark is larger than the value of the Sq.tail, indicating that new data is filled;
s340: obtaining new transmitted data (get_request) through the Sq.head identification in the Host OS;
s350: updating the identification of the Sq.tail in the Host OS, wherein the identification of the data which is newly sent by the Host OS terminal is obtained;
s360: processing the newly transmitted data (handle_request) in the Host OS;
s370: after processing new data in the Host OS, updating cq.head=3 to indicate that the data corresponding to index value No. 3 has been processed, polling then to determine whether there is next data to send, and continuing to step 3;
s380: judging whether the Cq.head reaches 3 in the get OS, and informing the upper application layer that the request of the sq.head= 3 is processed;
s390: after the application layer is notified in the get OS, the cq.tail is updated to 3, and the resources occupied by the index No. 3 can be released after the request No. 3 result is also processed.
4. How to stop the loop and rest when there is no data request (so that no CPU is consumed when idle), while still resuming efficient Polling as soon as a new request arrives.
To address this, the present embodiment adds a predetermined time in step S370: if no new data is received within the predetermined time, the polling thread is put into the sleep state.
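The idle shutdown added to S370 can be sketched as an idle-round counter; IDLE_LIMIT stands in for the "predetermined time", and the flag bit and names are our own assumptions rather than the patent's code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SQ_NEED_WAKEUP (1u << 0)
#define IDLE_LIMIT 1000u   /* illustrative stand-in for the predetermined time */

struct poll_ring { _Atomic uint32_t head, tail, flag; };

/* Returns 1 when the sqpoll thread should block until the guest kicks it:
 * after IDLE_LIMIT consecutive empty polls it advertises need-wakeup and
 * tells the caller to park. Sketch, not kernel code. */
static int maybe_sleep(struct poll_ring *sq, uint32_t *idle_rounds)
{
    if (atomic_load(&sq->head) > atomic_load(&sq->tail)) {
        *idle_rounds = 0;              /* work pending: keep polling */
        return 0;
    }
    if (++*idle_rounds < IDLE_LIMIT)
        return 0;                      /* not idle long enough yet */
    atomic_fetch_or(&sq->flag, SQ_NEED_WAKEUP);  /* guest must kick us */
    *idle_rounds = 0;
    return 1;                          /* caller parks the thread */
}
```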
It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Claims (4)

1. A method for improving vhost-scsi to promote virtualized storage performance, wherein the IOeventfd event mechanism is modified into active Polling event queries, the modification method comprising the following steps:
S100: initiating a request to create contiguous shared memory in the address space of the Guest Kernel, and allocating the shared memory to the SQ and the CQ;
S200: passing the shared memory created in the Guest Kernel, after memory address translation by the Qemu module, to the vHost module of the Host Kernel;
S300: starting the Polling thread through coordination of the Guest end and the Host end, continuously polling, and setting the key identifiers of the SQ and CQ to complete the Polling logic; in step S300, the Polling logic is completed by setting the head, tail and flag of the SQ and of the CQ respectively;
the modification method further comprises the following steps:
in step S300, the concrete Polling logic is as follows:
S310: starting the Sqthread in the Host Kernel and entering the Polling state;
S320: setting the sq.head identifier in the Guest OS to n, n being a natural number;
S330: the Sqthread in the Host OS actively querying whether the value of the sq.head identifier is larger than that of sq.tail; if so, new data has been filled in;
S340: acquiring the newly submitted data through the sq.head identifier in the Host OS;
S350: updating the sq.tail identifier in the Host OS, indicating that the Host OS end has fetched the newly submitted data;
S360: processing the newly submitted data in the Host OS;
S370: after the new data is processed in the Host OS, updating cq.head = n to indicate that the data with index value n has been processed, then continuing Polling to determine whether further data has been submitted, returning to step S330;
S380: determining in the Guest OS whether cq.head has reached n, and notifying the upper application layer that the request with sq.head = n has been processed;
S390: after the application layer has been notified in the Guest OS and the result of request No. n has been consumed, updating cq.tail to n, whereupon the resources occupied by index n can be released.
2. The method for improving virtualized storage performance according to claim 1, wherein step S100 further comprises the following steps:
S110: dividing the shared memory in the address space of the Guest Kernel according to the principle of contiguous whole-page allocation, and allocating the whole-page regions to the SQ and CQ;
S120: dividing the SQ region and allocating it to SQ->head, SQ->tail and SQ->flag;
S130: dividing the CQ region and allocating it to CQ->head, CQ->tail and CQ->flag.
3. The method for improving virtualized storage performance according to claim 2, wherein step S200 further comprises the following steps:
S210: pre-configuring a PCI configuration space in Qemu, and writing the corresponding information into the PCI configuration space in the Guest Kernel, so that Qemu obtains the Guest Physical Address;
S220: Qemu converting the Guest Physical Address to a Host Virtual Address;
S230: Qemu passing the Host Virtual Address to the vHost module in the Host Kernel through the Ioctl interface.
4. The method for improving virtualized storage performance according to claim 3, wherein in step S370, if no new data is received within the predetermined time, the polling thread is set to the sleep state.
CN202410056443.6A 2024-01-16 2024-01-16 Method for improving virtualized storage performance by improving vhost-scsi Active CN117573041B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410056443.6A CN117573041B (en) 2024-01-16 2024-01-16 Method for improving virtualized storage performance by improving vhost-scsi

Publications (2)

Publication Number Publication Date
CN117573041A CN117573041A (en) 2024-02-20
CN117573041B true CN117573041B (en) 2024-04-09

Family

ID=89886540

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410056443.6A Active CN117573041B (en) 2024-01-16 2024-01-16 Method for improving virtualized storage performance by improving vhost-scsi

Country Status (1)

Country Link
CN (1) CN117573041B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250166A (en) * 2015-05-21 2016-12-21 阿里巴巴集团控股有限公司 A kind of half virtualization network interface card kernel accelerating module upgrade method and device
CN111796912A (en) * 2020-07-09 2020-10-20 山东省计算中心(国家超级计算济南中心) Virtualization performance optimization method and system for storage input/output device of Shenwei platform
CN114020406A (en) * 2021-10-28 2022-02-08 郑州云海信息技术有限公司 Method, device and system for accelerating I/O of virtual machine by cloud platform
CN115904628A (en) * 2022-12-14 2023-04-04 安超云软件有限公司 IO virtualization data processing method and application based on vhost protocol
CN117389694A (en) * 2023-12-13 2024-01-12 麒麟软件有限公司 Virtual storage IO performance improving method based on virtio-blk technology

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7251648B2 (en) * 2019-10-08 2023-04-04 日本電信電話株式会社 In-server delay control system, in-server delay control device, in-server delay control method and program


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant