CN115774596A - Data processing method and device and electronic equipment - Google Patents

Publication number
CN115774596A
Authority
CN
China
Prior art keywords
read
local disk
spdk
data
nvme
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111043321.6A
Other languages
Chinese (zh)
Inventor
苏华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a data processing method, a data processing apparatus, and an electronic device, applied to a storage device. The method comprises: acquiring a data read-write request of a virtual cloud host for the storage device; mapping a local disk of the storage device to user mode through the user-mode driver of the SPDK and running the read-write operation of the local disk in user mode, with data monitoring, local disk state monitoring and management, fault operation and maintenance, and the like added on top of the SPDK framework, where the read-write operation is the one corresponding to the data read-write request; and transmitting the disk information of the local disk after the read-write operation to a virtual machine system in the storage device. By using the SPDK storage framework, the invention removes the frequent process context switching of kernel mode, improves device performance by relying on polling-based data processing and a lock-free design, and enables monitoring, management, and operation and maintenance of local disk instances in a cloud host scenario.

Description

Data processing method and device and electronic equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, and an electronic device.
Background
At present, as cloud vendors replace the storage devices in their storage servers, the trend is to replace traditional Serial Advanced Technology Attachment (SATA) solid state drives (SSDs) with more cost-effective Non-Volatile Memory Express (NVME) SSDs. The resulting problem is that existing virtualization software implementations limit the performance of NVME SSDs, so a high-performance local disk storage framework is needed for virtual machine instances to make full use of the hardware.
However, in existing solutions the input/output (I/O, or IO) stream of data reads and writes is processed by the kernel at the back end, which causes frequent process context switching and degrades device performance.
Disclosure of Invention
In view of this, an object of the present invention is to provide a data processing method, an apparatus and an electronic device, which can remove a frequent process context switching process in a kernel mode, and improve device performance.
In a first aspect, an embodiment of the present invention provides a data processing method, where the method is applied to a storage device, and the method includes: acquiring a data read-write request aiming at the storage equipment; mapping a local disk of the Storage device to a user mode through a user mode drive of a Storage Performance Development Kit (SPDK), and running read-write operation of the local disk in the user mode; the read-write operation is the read-write operation corresponding to the data read-write request; and transmitting the disk information of the local disk after the read-write operation to a virtual machine system in the storage device.
In one embodiment, the local disk is an NVME local disk with an NVME interface; the step of mapping the local disk of the storage device to user mode through the user-mode driver of the SPDK and running the read-write operation of the local disk in user mode includes: mapping, by the user-mode NVME driver of the SPDK, the NVME local disk of the storage device to user mode through Virtual Function I/O (VFIO); and running the read-write operation of the NVME local disk in user mode, and creating a file-like blob store (BlobStore) on the user-mode NVME driver so as to perform disk partition management of multiple logical volumes on the NVME local disk based on the BlobStore.
In one embodiment, the method further comprises: and for the data interface of the local disk, executing data IO operation by using a polling shared memory mode through the SPDK.
In one embodiment, the method further comprises: and in the user mode, binding the process of polling the shared memory with a specific Central Processing Unit (CPU) occupied by the SPDK in a binding mode, so that all the data read-write requests are executed based on the specific CPU in series.
In one embodiment, further comprising: in the process of disk partition management of the multiple logical volumes, correspondingly limiting the speed of different specifications of different logical volumes through Quality of Service (QoS) rules; wherein the management of all the logical volumes is performed by encapsulated device management commands.
In one embodiment, further comprising: monitoring for any one or more of: the operation state of the virtual machine system, the service operation process, the logical volume, the IO performance of the NVME local disk, and the state of the NVME local disk.
In one embodiment, the method further comprises: in user mode, controlling data IO to perform Direct Memory Access (DMA) operations through the Vring of the virtual machine system and the SPDK poller of the SPDK, so that the data IO does not pass through the emulator (QEMU) or the kernel (Kernel) of the virtual machine system.
In one embodiment, automatic start and/or automatic recovery of the user-mode process is handled by Systemd.
In one embodiment, the framework resources of the SPDK are managed by way of a configuration file.
In a second aspect, an embodiment of the present invention further provides a data processing apparatus, which is applied to a storage device. The apparatus comprises: an acquisition module, configured to acquire a data read-write request for the storage device; a mapping module, configured to map the local disk of the storage device to user mode through the user-mode driver of the Storage Performance Development Kit (SPDK) and to run the read-write operation of the local disk in user mode, where the read-write operation is the one corresponding to the data read-write request; and a transmission module, configured to transmit the disk information of the local disk after the read-write operation to a virtual machine system in the storage device.
In a third aspect, an embodiment of the present invention further provides an electronic device, which includes a processor and a memory, where the memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement any one of the methods provided in the first aspect.
In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium storing computer-executable instructions that, when invoked and executed by a processor, cause the processor to implement any one of the methods provided in the first aspect.
The embodiment of the invention provides a data processing method, a data processing apparatus, and an electronic device, which are applied to a storage device. In the method, the original kernel-mode NVME driver is mapped to user mode through the SPDK user-mode driver, so that the driver runs in user mode and the SPDK component replaces the kernel-mode implementation. This removes the frequent process context switching of kernel mode and bypasses the redundant kernel link, which reduces the impact of kernel-mode driver faults on the machine as a whole and makes monitoring in user mode easier. The embodiment of the invention can thus remove the frequent process context switching of kernel mode and improve device performance.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic illustration of a prior art solution;
FIG. 2 is a schematic diagram illustrating a comparison between a data processing method according to an embodiment of the present invention and the prior art;
FIG. 3 is a schematic flow chart of a data processing method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram corresponding to a data processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a data interaction mechanism according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the embodiments, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, as cloud vendors replace the storage devices in their storage servers, the trend is to replace traditional Serial Advanced Technology Attachment (SATA) solid state drives (SSDs) and Serial Attached SCSI (SAS) SSDs with more cost-effective Non-Volatile Memory Express (NVME) SSDs. The resulting problem is that existing virtualization software implementations limit the performance of NVME SSDs, so a high-performance local disk storage framework is needed for virtual machine instances to make full use of the hardware. As shown in FIG. 1, the high-performance local disk frameworks of some cloud vendors are likewise built on the SPDK user-mode NVME driver at the bottom layer, but pool storage resources at the upper layer: the NVME disks are pooled, logical volumes are abstracted from the pool, and the logical volumes are then provided to virtual machines through the virtual host (VHOST) protocol.
However, in the prior art all user programs run in user mode, while work such as reading data from a hard disk can only be completed by the operating system. A mechanism is therefore needed that lets a user-mode program switch into kernel mode without being able to control which instructions are executed there. That mechanism is the system call; its implementation in the CPU is called a trap instruction (Trap Instruction), and the corresponding switches are called trap-in and trap-out. In the existing solution, on the one hand, the data read/write (Input/Output, I/O or IO) stream must be processed by the kernel at the back end, which results in frequent process context switching and an overlong, repetitive IO path. On the other hand, a large number of virtual machine trap-outs occur, and the use of interrupts can greatly affect performance: in actual tests, the average random read/write performance loss of a SATA SSD reached 90%, and the average sequential read/write performance loss reached 45%. Device performance in the existing scheme is therefore easily degraded by frequent process context switching.
Based on this, embodiments of the present invention provide a data processing method, an apparatus, and an electronic device that map the original kernel-mode NVME driver to user mode through the SPDK user-mode driver, so that the driver runs in user mode and the SPDK component replaces the kernel-mode implementation. This removes the frequent process context switching of kernel mode and bypasses the overlong kernel link, which reduces the impact of kernel-mode driver faults on the machine as a whole and makes monitoring in user mode easier. By using the SPDK storage framework, the method removes frequent process context switching in kernel mode, improves device performance by means of polling-based data processing and a lock-free design, and enables monitoring, management, and operation and maintenance of local disk instances in a cloud host scenario.
To improve the performance of local disk instances, the invention upgrades both the hardware and the software stack; the main content is shown in FIG. 2. On the hardware side, NVME SSDs are used: overall performance can be improved by a factor of 6 compared with SATA SSDs, while the prices of the two kinds of disk at the same capacity are basically equal. On the software side, the SPDK component replaces the kernel-mode implementation and runs in user mode. The kit maps the original kernel-mode NVME driver to user mode through VFIO, which eliminates the frequent process context switching and bypasses the redundant kernel link, so that the impact of kernel-mode driver faults on the machine as a whole is reduced and monitoring in user mode becomes easier. VFIO (Virtual Function I/O) is a framework that safely exposes device I/O, interrupts, Direct Memory Access (DMA), and the like to user space, so that device drivers can be implemented in user space and user space can access the device directly.
Because VFIO has low overhead and gives user space direct device access, higher I/O performance can be achieved for virtual machine device assignment, high-performance applications, and the like. The invention also uses the SPDK component in place of the kernel-mode implementation to realize a polling (poller) mode: SPDK uses polling instead of the traditional interrupt mode, which greatly improves SSD read-write performance so that software is no longer the main factor limiting hardware performance. In addition, replacing the kernel-mode implementation with the SPDK component enables a lock-free design in software, and the running process is bound to a Central Processing Unit (CPU) core by pinning, which reduces the cost of scheduling resources back and forth, improves data hit efficiency, and greatly reduces the performance impact of traditional software locks.
QEMU in FIG. 2 is a processor emulator written by Fabrice Bellard. QEMU has two main operating modes. In user mode (User mode), QEMU can launch Linux programs compiled for a different CPU. In system mode (System mode), QEMU can emulate an entire computer system, including the CPU and other peripherals, which facilitates testing and debugging of cross-platform programs. It can also be used to virtualize several different virtual computers on one host.
Virtio in FIG. 2 is an I/O para-virtualization solution: a set of general-purpose I/O device virtualization routines that can also be understood as an abstraction over a set of general-purpose I/O devices in a para-virtualized hypervisor. It provides a communication framework and a programming interface between upper-layer applications and the virtualized devices of each hypervisor, which reduces cross-platform compatibility problems and greatly improves driver development efficiency.
Note that VM (Virtual Machine) in FIG. 2 is an abbreviation of virtual machine, and dev is an abbreviation of device. The /dev directory is important for all users, because all external devices used in a Linux system appear in this directory. vdb is an abbreviation of virtual disk b, and /dev/vdb is the path of virtual disk b in the system.
To facilitate understanding of the present embodiment, first, a data processing method disclosed in the embodiment of the present invention is described in detail, where the method can be applied to a storage device, and referring to a flowchart of the data processing method shown in fig. 3, the method mainly includes the following steps S310 to S330:
Step S310, acquiring a data read-write request for the storage device.
Illustratively, as shown in fig. 4, the storage device may be an NVME local disk, i.e., an NVME SSD disk. NVME, as a logical device interface specification, is primarily a native interface specification that provides low-latency, internal concurrency for flash-based storage devices. The NVME may have 65535 command queues, each of which may be as deep as 65536 commands. And the NVME can be directly communicated with the CPU without an adapter, so that the interaction delay is reduced.
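By way of illustration only, the queue figures quoted above can be turned into a simple upper bound on the number of commands in flight. The sketch below uses only the two numbers stated in the text; the SATA/AHCI comparison (one queue of 32 commands) is a commonly cited figure added here as an assumption, not a value from this document.

```python
# Figures quoted in the text above; nothing is read from a real device.
NVME_MAX_QUEUES = 65535
NVME_MAX_QUEUE_DEPTH = 65536


def max_outstanding_commands(queues: int = NVME_MAX_QUEUES,
                             depth: int = NVME_MAX_QUEUE_DEPTH) -> int:
    """Upper bound on commands in flight if every queue were completely full."""
    return queues * depth


nvme_bound = max_outstanding_commands()
# Assumption for comparison: a single AHCI/SATA queue with depth 32.
sata_bound = max_outstanding_commands(queues=1, depth=32)
print(nvme_bound, sata_bound)
```

With the quoted figures the NVME bound is 4,294,901,760 commands. Real devices and drivers typically configure far fewer and shallower queues, so this is a ceiling implied by the quoted limits, not a practical operating point.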
The data processing method provided by the embodiment of the present invention may be used as an implementation scheme for virtual machine data, and the data read-write request in this step may be a data read-write request of the virtual cloud host for the storage device, that is, a data read-write request of the virtual cloud host for the storage device is obtained.
Step S320, mapping the local disk of the storage device to user mode through the user-mode driver of the SPDK, and running the read-write operation of the local disk in user mode.
And the read-write operation is the read-write operation corresponding to the data read-write request.
It should be noted that SPDK is a storage performance development kit released by Intel. SPDK is a framework rather than a distributed system. Its foundation is a user-space (User space), polled-mode (Polled-mode), asynchronous (Asynchronous), lockless (Lockless) NVME driver that provides zero-copy, highly concurrent access to SSDs directly from user mode. Its original purpose was to optimize writing block storage data to disk. As SPDK continues to evolve, it can optimize various aspects of the storage software stack. The read-write operation of the local disk originally runs in kernel mode; by mapping the local disk of the storage device to user mode, the read-write operation runs in user mode instead.
The data processing method provided by the embodiment of the invention can also be used as an implementation scheme for virtual machine data based on the open source project SPDK, and data monitoring, local disk state monitoring management, fault operation and maintenance and the like can be added based on the SPDK framework.
Step S330, transmitting the disk information of the local disk after the read-write operation to a virtual machine system in the storage device.
In an optional implementation manner, the disk information of the local disk after the read-write operation may be transmitted to the virtual machine system for use by the virtual machine according to the requirement of the virtual machine.
The data processing method, the data processing apparatus, and the electronic device are applied to a storage device. First, a data read-write request for the storage device is acquired. Then the local disk of the storage device is mapped to user mode through the user-mode driver of the Storage Performance Development Kit (SPDK), and the read-write operation of the local disk, namely the one corresponding to the data read-write request, runs in user mode. Finally, the disk information of the local disk after the read-write operation is transmitted to a virtual machine system in the storage device. In this method, the original kernel-mode NVME driver is mapped to user mode through the SPDK user-mode driver, so that the driver runs in user mode and the SPDK component replaces the kernel-mode implementation. This eliminates the frequent process context switching of kernel mode and bypasses the redundant kernel link, which reduces the impact of kernel-mode driver faults on the machine as a whole and makes monitoring in user mode easier. By using the SPDK storage framework, the embodiment of the invention removes the frequent process context switching of kernel mode and improves device performance by relying on polling-based data processing and a lock-free design.
In some embodiments, the local disk may be of various types, the user-mode driver may be an NVME driver, and a BlobStore may further be established on the driver to perform disk partition management of logical volumes. As an example, the local disk is an NVME local disk with an NVME interface; the step S320 may specifically include the following steps:
Step a), mapping, by the user-mode NVME driver of the SPDK, the NVME local disk of the storage device to user mode through VFIO;
Step b), running the read-write operation of the NVME local disk in user mode, and creating a BlobStore on the user-mode NVME driver so as to perform disk partition management of multiple logical volumes on the NVME local disk based on the BlobStore.
For step b), the BlobStore is SPDK's implementation of highly simplified, file-like semantics. It can provide high-performance support for workloads such as databases, containers, and virtual machines. It may be understood as a container with a unique set of identifiers or a namespace, used to identify the account associated with the object store.
For example, as shown in fig. 4, the method maps an NVME local disk to a user state through VFIO based on a user state NVME driver of the SPDK, and creates a BlobStore on the user state NVME driver, so as to perform disk partition management of multiple logical volumes on the NVME local disk.
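By way of illustration only, the idea of partitioning one local disk into multiple logical volumes can be sketched as a toy allocator that hands out fixed-size clusters. This is not the SPDK BlobStore API; the class `ToyVolumeStore`, its methods, and the 1 MiB cluster size are hypothetical names and values chosen for the sketch.

```python
CLUSTER_SIZE = 1 << 20  # 1 MiB per cluster; an illustrative value only


class ToyVolumeStore:
    """Toy stand-in for a store that carves one disk into logical volumes."""

    def __init__(self, total_clusters: int):
        self.free = list(range(total_clusters))  # cluster numbers not yet in use
        self.volumes = {}                        # volume name -> list of clusters

    def create_volume(self, name: str, size_bytes: int) -> list:
        """Allocate enough whole clusters to hold size_bytes."""
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        needed = -(-size_bytes // CLUSTER_SIZE)  # ceiling division
        if needed > len(self.free):
            raise ValueError("not enough free clusters")
        clusters, self.free = self.free[:needed], self.free[needed:]
        self.volumes[name] = clusters
        return clusters

    def delete_volume(self, name: str) -> None:
        """Return a volume's clusters to the free list."""
        self.free.extend(self.volumes.pop(name))

    def locate(self, name: str, offset: int) -> tuple:
        """Translate a volume-relative byte offset to (cluster, offset in cluster)."""
        index, within = divmod(offset, CLUSTER_SIZE)
        return self.volumes[name][index], within


store = ToyVolumeStore(total_clusters=8)
store.create_volume("vol-a", 3 * CLUSTER_SIZE)
store.create_volume("vol-b", CLUSTER_SIZE + 1)
print(store.locate("vol-b", CLUSTER_SIZE + 5))
```

Each virtual machine instance would then see only its own volume, while the offset translation keeps volumes from overlapping on the shared disk.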
In this scheme, the SPDK component maps the original kernel-mode NVME driver to user mode through VFIO, which eliminates the frequent process context switching. Bypassing the redundant kernel link reduces the impact of kernel-mode driver faults on the machine as a whole, and makes it convenient to monitor and manage local disk instances from user mode in a cloud host scenario.
In some embodiments, the data IO operation may be performed on the data interface of the local disk in various ways, so as to implement diversification and high efficiency of the data IO operation. As an example, the method further comprises:
Step c), executing data IO operations on the data interface of the local disk through the SPDK by polling shared memory.
Regarding the execution of data IO operations by polling shared memory, note the following. A program interrupt (interrupt) is the process in which the CPU, because of a prearranged condition or a random internal or external event, suspends the running program and transfers control to a corresponding service routine; switching from user mode to kernel mode is one such interrupt. In the polling (Polling) I/O mode, also called the program-controlled I/O mode, the CPU instead queries each peripheral in turn at a certain period to determine whether it has data to input or output. If it does, the CPU performs the corresponding input/output service; if it does not, or once the I/O is complete, the CPU moves on to query the next peripheral.
In the embodiment of the invention, the SPDK is used for replacing the traditional interrupt mode by using the polling mode, so that the read-write performance of the SSD is greatly improved, and software does not become a main bottleneck for limiting the performance of hardware.
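By way of illustration only, the polling mode described above can be sketched as a loop that repeatedly checks a shared queue for work instead of sleeping until an interrupt arrives. The sketch uses an in-process `deque` as a stand-in for shared memory; `Poller` and `run_reactor` are hypothetical names and do not correspond to SPDK functions.

```python
from collections import deque


class Poller:
    """Drains pending requests from a ring without ever blocking."""

    def __init__(self, ring: deque, handler):
        self.ring = ring
        self.handler = handler

    def poll_once(self, max_batch: int = 32) -> int:
        """Handle up to max_batch queued requests; return how many were handled."""
        handled = 0
        while self.ring and handled < max_batch:
            self.handler(self.ring.popleft())
            handled += 1
        return handled


def run_reactor(pollers, iterations: int) -> int:
    """Call every poller in turn for a fixed number of rounds.

    A real reactor would loop until shutdown; the bound keeps the sketch finite.
    """
    total = 0
    for _ in range(iterations):
        for poller in pollers:
            total += poller.poll_once()
    return total


ring = deque(range(5))      # five pending requests
completed = []
poller = Poller(ring, completed.append)
handled_total = run_reactor([poller], iterations=3)
print(handled_total, completed)
```

The trade-off is that the polling core stays busy even when the ring is empty, in exchange for avoiding interrupt handling and the associated mode switches on every IO.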
Based on the step c), the process of polling the shared memory can be bound with the specific CPU occupied by the SPDK, so as to reduce unnecessary overhead caused by back and forth call of resources. As an example, the method may further comprise the steps of:
Step d), in user mode, binding the process that polls the shared memory to the specific CPU occupied by the SPDK by pinning, so that all data read-write requests are executed serially on that specific CPU.
In practical applications, a multi-core CPU, such as a quad-core CPU, may be used. At ordinary times, the application program can be managed by an operating system during running, and the operating system schedules the application process so that the application process runs on different cores in turn. For common applications, the default scheduling mechanism of the operating system is not problematic, but when a process needs higher operation efficiency, the overhead is larger due to scheduling on different cores. In the embodiment of the invention, the process of polling the shared memory is bound to a single specific CPU core (namely, a specific CPU occupied by the SPDK) to run, so that the overhead caused by scheduling on different cores is reduced. After a process/thread is bound to a specific CPU core (i.e., a specific CPU occupied by the SPDK), the process can run on the core all the time and is not scheduled to other cores by the operating system.
The running process and the CPU are bound in a binding mode, so that the cost of resource back-and-forth scheduling is reduced, and the data hit efficiency is improved. Through the lock-free design of the IO link, the influence of the expense of the traditional software lock on the performance is greatly reduced.
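By way of illustration only, binding a process to one CPU core can be sketched with the affinity calls in the Python standard library, which are available on Linux. The helper name `pin_current_process` is hypothetical, and the sketch pins the calling process rather than an SPDK process.

```python
import os


def pin_current_process(cpu=None):
    """Pin the calling process to a single CPU core.

    Returns the resulting affinity set, or None on platforms that do not
    expose os.sched_setaffinity (it is Linux-specific).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)          # cores this process may use now
    target = min(allowed) if cpu is None else cpu
    if target not in allowed:
        raise ValueError(f"CPU {target} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {target})          # 0 means the calling process
    return os.sched_getaffinity(0)
```

Calling `pin_current_process()` with no argument picks the lowest-numbered allowed core. After the call, the scheduler keeps the process on that core, so a polling loop running in it does not pay for migration between cores.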
Based on the step a) and the step b), in the process of disk partition management of multiple logical volumes, speed-limiting processing can be performed on different logical volumes, so as to improve performance stability. As an example, the method may further comprise the steps of:
Step e), in the process of disk partition management of the multiple logical volumes, applying rate limits to logical volumes of different specifications according to QoS rules.
For the above step e), the management of all logical volumes is performed by the encapsulated device management command.
As shown in FIG. 4, QoS refers to quality of service. For network services in the conventional sense, it covers transmission bandwidth, transmission delay, packet loss rate, and the like; improving quality of service means guaranteeing transmission bandwidth and reducing transmission delay, packet loss, and delay jitter. In a broad sense, QoS touches many aspects of a network application, and any measure that benefits the application improves quality of service. The same reasoning applies to disk management: because transmission bandwidth is limited, a quality-of-service requirement arises whenever bandwidth resources are contended. For example, suppose many logical volumes are based on the same underlying NVME disk with a total transmission bandwidth of 100 Mbps, and logical volume A occupies 90 Mbps, leaving only 10 Mbps for the other logical volumes. If the maximum bandwidth of logical volume A is limited to 50 Mbps, the quality of service of the other logical volumes improves, since they can then use at least 50 Mbps.
Quality of service is used to rate-limit logical volumes of different specifications, which ultimately improves performance stability, lets each logical volume deliver its specified performance, and reduces performance interference when different virtual machine instances share the same underlying NVME disk.
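By way of illustration only, a per-volume rate limit such as the 50 Mbps cap in the example above can be sketched with a token bucket. The token bucket is a common rate-limiting technique chosen here as an assumption; the document does not state which algorithm the QoS rules use. The class name and the abstract units (megabits and seconds) are likewise hypothetical.

```python
class TokenBucket:
    """Token-bucket limiter: tokens refill at `rate` per second up to `capacity`."""

    def __init__(self, rate: int, capacity: int, now: int = 0):
        self.rate = rate            # tokens added per second (e.g. megabits)
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start with a full bucket
        self.last = now             # time of the last refill

    def allow(self, amount: int, now: int) -> bool:
        """Return True and consume tokens if `amount` fits within the budget."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False


# Volume A is capped at 50 units per second out of a 100-unit disk.
volume_a = TokenBucket(rate=50, capacity=50)
decisions = [
    volume_a.allow(50, now=0),  # fits: uses the whole first-second budget
    volume_a.allow(30, now=0),  # rejected: budget exhausted
    volume_a.allow(30, now=1),  # fits: one second of refill has arrived
    volume_a.allow(30, now=1),  # rejected: only 20 units remain
]
print(decisions)
```

The time is passed in explicitly so the behavior is deterministic; a live limiter would read a monotonic clock instead, and a rejected request would be queued and retried rather than dropped.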
Based on the step a) and the step b), the monitoring in the user mode may include a plurality of monitoring contents, so as to implement comprehensive monitoring in the user mode to a greater extent. As an example, the method may further comprise the steps of:
Step f), monitoring any one or more of the following: the running state of the virtual machine system, the service running process, the logical volumes, the IO performance of the NVME local disk, and the state of the NVME local disk.
The monitoring in the embodiment of the present invention may include: monitoring of the virtual machine running state; monitoring of the service itself; monitoring of abnormal events while the service is running; monitoring of the IO performance of each logical volume and of the underlying SSD (for example, real-time read-write bandwidth and the number of read/write operations per second (IOPS)); and monitoring of the real-time state of each NVME SSD (for example, how much disk life remains, whether there is serious fault information, the operating temperature, the actual amount of data read and written, and how much over-provisioning (OP) space remains).
For example, as shown in fig. 4, by using a monitoring module, a resource management module, an operation and maintenance module, and the like, aspects of data monitoring, local disk state monitoring management, failure operation and maintenance, and the like can be added based on an SPDK framework, and monitoring, management, operation and maintenance, and the like of a local disk instance in a cloud host scene can be realized. The operation and maintenance module may include an actual deployment scheme and a processing scheme for various online faults.
By monitoring in the user mode, the user can check whether the running state of the virtual machine is normal and whether the actual operation of the system devices meets expectations, so that corresponding adjustments can be made. The user can also find out in time whether parameters such as the performance and remaining life of the NVME SSD disk are within normal ranges, thereby avoiding the loss of data that is hard to recover after an NVME SSD disk is damaged.
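The disk-state checks described above can be sketched as a simple threshold test. This is only an illustration: the `DiskHealth` fields, the `check_disk` function, and all threshold values are assumptions, and how the values are actually collected from the NVME SSD is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiskHealth:
    """Hypothetical snapshot of one NVME SSD's state, gathered in user mode."""
    name: str
    life_remaining_pct: float    # remaining disk life, in percent
    temperature_c: float         # operating temperature, in degrees Celsius
    critical_warning: bool       # True if the disk reports serious fault information
    spare_remaining_pct: float   # remaining over-provisioning (OP) space, in percent


def check_disk(h: DiskHealth,
               min_life_pct: float = 10.0,    # assumed threshold
               max_temp_c: float = 70.0,      # assumed threshold
               min_spare_pct: float = 10.0    # assumed threshold
               ) -> List[str]:
    """Return a list of alert strings; an empty list means the disk looks healthy."""
    alerts = []
    if h.critical_warning:
        alerts.append(f"{h.name}: serious fault reported")
    if h.life_remaining_pct < min_life_pct:
        alerts.append(f"{h.name}: remaining life {h.life_remaining_pct}% is low")
    if h.temperature_c > max_temp_c:
        alerts.append(f"{h.name}: temperature {h.temperature_c} C is high")
    if h.spare_remaining_pct < min_spare_pct:
        alerts.append(f"{h.name}: remaining OP space {h.spare_remaining_pct}% is low")
    return alerts


for alert in check_disk(DiskHealth("nvme1", 5.0, 75.0, False, 50.0)):
    print(alert)
```

A monitoring module could run such a check periodically for every disk and raise the alerts to the operation and maintenance module before data is lost.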
In some embodiments, IO processing performance may be improved by controlling the flow of data IO so that the data IO does not pass through the QEMU of the virtual machine system or the operating system kernel. As an example, the method further comprises the following step:
step g), in the user mode, controlling the data IO to perform DMA operations through the Vring of the virtual machine system and the SPDK poller of the SPDK, so that the data IO does not pass through the QEMU and Kernel of the virtual machine system.
For step g), Kernel refers to the operating system kernel. Illustratively, as shown in fig. 5, a Virtqueue is the actual data link for exchanging data between the front-end and back-end ring queues, and the Guest OS (guest operating system) is the system installed on a virtual machine (VM). Vring forwards data streams mainly through two ring buffers, into which the guest inserts buffers, each buffer being a scatter-gather array. A Vring contains three parts: a descriptor table (Desc table), an available ring (Available ring), and a used ring (Used ring). Virtio uses Virtqueues to implement its I/O mechanism. Virtio can be understood as a lubricant for communication between the virtual machine and the host: it provides a general framework and standard interfaces or protocols for the interaction between the two, and largely solves the problem of adapting various drivers to different virtualization solutions. Virtio also abstracts a set of Vring interfaces to carry out data transmission and reception between the virtual machine and the host, with a novel structure and clear interfaces.
In the method provided by the embodiment of the invention, all IO is controlled to perform DMA operations directly through the Vring of the virtual machine and the SPDK poller, without passing through QEMU or the Kernel, so that IO processing performance is greatly improved.
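The interplay of the three Vring parts can be illustrated with a toy model. This is a simplified sketch, not the Virtio memory layout or SPDK code: real rings are fixed-size arrays in shared memory addressed by index, whereas here plain Python containers stand in for them, and the names `Vring`, `guest_add`, `host_poll`, and `guest_reclaim` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional, Tuple


@dataclass
class Descriptor:
    data: bytearray    # stands in for a guest buffer address and length
    write_only: bool   # True if the device writes into this buffer (a read request)


class Vring:
    """Toy model of a descriptor table, an available ring, and a used ring."""

    def __init__(self, size: int):
        self.desc: List[Optional[Descriptor]] = [None] * size
        self.free: Deque[int] = deque(range(size))    # unused descriptor slots
        self.avail: Deque[int] = deque()              # guest -> host: slots to process
        self.used: Deque[Tuple[int, int]] = deque()   # host -> guest: (slot, bytes done)

    def guest_add(self, buf: bytearray, write_only: bool) -> int:
        """Guest side: place a buffer in the table and expose it to the host."""
        if not self.free:
            raise BufferError("descriptor table full")
        idx = self.free.popleft()
        self.desc[idx] = Descriptor(buf, write_only)
        self.avail.append(idx)
        return idx

    def host_poll(self, backend: bytearray) -> int:
        """Host side: drain the available ring, copying data to or from backend."""
        done = 0
        while self.avail:
            idx = self.avail.popleft()
            d = self.desc[idx]
            n = min(len(d.data), len(backend))
            if d.write_only:
                d.data[:n] = backend[:n]    # read request: fill the guest buffer
            else:
                backend[:n] = d.data[:n]    # write request: consume the guest buffer
            self.used.append((idx, n))
            done += 1
        return done

    def guest_reclaim(self) -> Tuple[bytearray, int]:
        """Guest side: take one completion from the used ring and free its slot."""
        idx, n = self.used.popleft()
        d = self.desc[idx]
        self.desc[idx] = None
        self.free.append(idx)
        return d.data, n


ring, disk = Vring(4), bytearray(8)
ring.guest_add(bytearray(b"abcdefgh"), write_only=False)   # guest writes 8 bytes
read_buf = bytearray(8)
ring.guest_add(read_buf, write_only=True)                  # guest then reads them back
ring.host_poll(disk)
print(bytes(read_buf))
```

In this model the host side touches only the shared ring and the buffers, which mirrors the point of the text: a polling back end can service guest requests without an intermediate hop.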
In some embodiments, self-starting and/or self-recovery of the user-mode process is performed in a specific manner to make it faster. As an example, self-starting and/or self-recovery of the user-mode process is performed by way of Systemd.
The main purpose of the system management daemon (Systemd) is to reduce system boot time and computational overhead. Systemd introduces parallel startup: it establishes a socket for each daemon to be started, and these sockets are abstracted from the processes that use them, which allows different daemons to interact. Systemd can create new processes and assign a control group (cgroup) to each process. Processes in different control groups may communicate with each other through the kernel.
Compared with conventional Init-based systems, Systemd handles the boot process in a much simpler way. The service management unit implements self-starting and self-recovery of the user-mode process by way of Systemd, which is faster and clearer.
In some embodiments, the framework resources of the SPDK are managed in a specific way to achieve unified management of the resources of the entire SPDK framework. As one example, the framework resources of the SPDK are managed by way of a configuration file.
It should be noted that most computer programs in use today, whether office suites, web browsers, or even video games, are configured through a menu interface, which is almost the default way for users to operate a machine. Some programs, however, require the user to edit a text file, called a "configuration file", so that the program runs as the user intends. A configuration file is essentially a file, structured in a specific way, that contains the information a program needs to operate successfully. This information is not hard-coded in the program but is user-configurable and is typically stored in a plain text file. For Linux, each Linux program is an executable file that contains a list of opcodes that the CPU will execute to perform a particular operation.
Managing the resources of the whole SPDK framework uniformly through a configuration file makes it easy to check information such as which CPU resources the current service uses, which devices are managed, the configuration attributes of those devices, and how many logical volumes have been created. The creation, destruction, load balancing, and the like of all hosts of this type in a cluster are managed through a resource management unit.
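How such a configuration file might be inspected can be sketched as follows. The JSON layout here is a made-up illustration and not SPDK's actual configuration format; every key (`cpu_mask`, `devices`, `volumes`, and so on) and the function `summarize` are assumptions.

```python
import json

# Hypothetical configuration; the schema is an assumption for illustration only.
EXAMPLE_CONFIG = """
{
  "cpu_mask": "0x3",
  "devices": [
    {
      "name": "nvme0",
      "pci_addr": "0000:5e:00.0",
      "volumes": [
        {"name": "lv_a", "size_gib": 100, "max_mbps": 50},
        {"name": "lv_b", "size_gib": 200, "max_mbps": 50}
      ]
    }
  ]
}
"""


def summarize(text: str) -> dict:
    """Report which CPUs, devices, and logical volumes a configuration declares."""
    cfg = json.loads(text)
    mask = int(cfg["cpu_mask"], 16)
    # Each set bit in the mask selects one CPU core.
    cpus = [i for i in range(mask.bit_length()) if (mask >> i) & 1]
    devices = [d["name"] for d in cfg["devices"]]
    volumes = [v["name"] for d in cfg["devices"] for v in d["volumes"]]
    return {"cpus": cpus, "devices": devices, "volume_count": len(volumes)}


print(summarize(EXAMPLE_CONFIG))
```

Keeping all of this in one file means a single read answers the questions listed above: which CPUs the service uses, which devices it manages, and how many logical volumes exist.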
To sum up, the data processing method provided by the embodiment of the present invention at least includes the following features:
(1) On the hardware side, the data processing method provided by the embodiment of the invention uses SSDs with an NVME interface, which can improve overall SSD performance by a factor of 6 compared with the SATA interface protocol, while two disks of the same capacity cost roughly the same.
(2) On the software side, the data processing method provided by the embodiment of the invention uses the SPDK component in place of the kernel mode so that operation runs in the user mode. The kit maps the original kernel-mode NVME driver to the user mode through VFIO, which removes frequent process context switches. Bypassing the kernel eliminates a redundant link, which reduces the impact of kernel-mode driver faults on the whole machine and makes monitoring in the user mode easier.
(3) On the software side, the data processing method provided by the embodiment of the invention uses the SPDK component in place of the kernel mode to implement a poller mode. The SPDK uses polling instead of the traditional interrupt mode, which greatly improves the read-write performance of the SSD disk, so that software does not become the main bottleneck limiting hardware performance.
(4) On the software side, the data processing method provided by the embodiment of the invention uses the SPDK component in place of the kernel mode to implement a lock-free design, and binds the running process to a CPU, which reduces the cost of scheduling resources back and forth and improves data hit efficiency.
Through the lock-free design of the IO link, the influence of the overhead of the traditional software lock on the performance is greatly reduced.
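Features (3) and (4) can be illustrated together with a toy single-threaded poller. This is a sketch under stated assumptions rather than SPDK code: `pin_to_one_cpu`, `Poller`, and `run` are hypothetical names, the queue is an ordinary in-process `deque`, and CPU pinning uses Python's `os.sched_setaffinity`, which is available only on some platforms (such as Linux) and is skipped elsewhere.

```python
import os
from collections import deque


def pin_to_one_cpu():
    """Bind the calling process to one CPU where the OS supports it."""
    if not hasattr(os, "sched_getaffinity"):
        return None                          # platform without affinity calls
    try:
        cpu = min(os.sched_getaffinity(0))   # pick a CPU we are allowed to use
        os.sched_setaffinity(0, {cpu})
        return cpu
    except OSError:
        return None                          # pinning not permitted here


class Poller:
    """Single-consumer request queue; no lock is needed because only one
    thread ever submits to and drains the queue in this sketch."""

    def __init__(self):
        self.submissions = deque()
        self.completed = []

    def submit(self, op: str, lba: int) -> None:
        self.submissions.append((op, lba))

    def poll_once(self, budget: int = 32) -> int:
        """Process up to budget requests in arrival order; return the count."""
        n = 0
        while self.submissions and n < budget:
            self.completed.append(self.submissions.popleft())
            n += 1
        return n


def run(poller: Poller, max_idle_polls: int = 3) -> int:
    """Poll repeatedly instead of waiting for interrupts; stop after a few
    consecutive empty polls so that the example terminates."""
    idle, total = 0, 0
    while idle < max_idle_polls:
        n = poller.poll_once()
        total += n
        idle = 0 if n else idle + 1
    return total


pin_to_one_cpu()
p = Poller()
for lba in range(40):
    p.submit("read", lba)
print(run(p))
```

Because one pinned thread handles every request in order, requests execute serially and no software lock is taken on the IO path, which is the effect the lock-free design aims for.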
As to the data processing method provided in the foregoing embodiment, an embodiment of the present invention provides a data processing apparatus, which is applied to a storage device. Referring to fig. 6, a schematic structural diagram of a data processing apparatus is shown, which mainly includes the following parts:
an obtaining module 601, configured to obtain a data read-write request for a storage device;
a mapping module 602, configured to map a local disk of the storage device to a user mode through the user-mode driver of the SPDK, and run a read-write operation of the local disk in the user mode; the read-write operation is the read-write operation corresponding to the data read-write request;
a transmission module 603, configured to transmit the disk information of the local disk after the read-write operation to a virtual machine system in the storage device.
The data processing device provided by the embodiment of the invention is applied to a storage device. It first obtains a data read-write request for the storage device, then maps a local disk of the storage device to a user state through the user-state driver of the Storage Performance Development Kit (SPDK) and runs the read-write operation of the local disk in the user state, where the read-write operation is the one corresponding to the data read-write request, and finally transmits the disk information of the local disk after the read-write operation to a virtual machine system in the storage device. In this way, the original kernel-state NVME driver is mapped to the user state through the SPDK user-state driver, so that operation runs in the user state. The SPDK component replaces the kernel state, frequent process context switches in the kernel state are eliminated, and bypassing the kernel removes a redundant link, which reduces the impact of kernel-state driver faults on the whole machine and facilitates monitoring in the user state. The embodiment of the invention can thus remove frequent process context switches in the kernel state and improve device performance.
In an embodiment, the local disk is an NVME local disk with an NVME interface, and the mapping module 602 is specifically configured to: map the NVME local disk of the storage device to the user state through VFIO, based on the user-state NVME driver of the SPDK; and run the read-write operation of the NVME local disk in the user state, and create a BlobStore on the user-state NVME driver so as to perform disk partition management of multiple logical volumes on the NVME local disk based on the BlobStore.
In an embodiment, the apparatus further includes an execution module configured to: and for the data interface of the local disk, performing data IO operation by using a polling shared memory mode through SPDK.
In one embodiment, the apparatus further comprises a binding module configured to: and in the user state, the process of polling the shared memory is bound with the specific CPU occupied by the SPDK in a binding mode, so that all data read-write requests are executed in series based on the specific CPU.
In one embodiment, the apparatus further includes a speed limit module configured to: in the process of disk partition management of multiple logical volumes, correspondingly limiting the speed of different specifications of different logical volumes through QoS rules; wherein the management of all logical volumes is performed by encapsulated device management commands.
In one embodiment, the apparatus further includes a monitoring module configured to: monitoring for any one or more of: the operation state of the virtual machine system, the service operation process, the logical volume, the IO performance of the NVME local disk and the state of the NVME local disk.
In one embodiment, the apparatus further comprises a control module configured to: in the user mode, control the data IO to perform DMA operations through the Vring of the virtual machine system and the SPDK poller of the SPDK, so that the data IO does not pass through the QEMU and Kernel of the virtual machine system.
In one embodiment, the self-starting and/or self-resuming in the process of the user state is performed by way of Systemd.
In one embodiment, the framework resources of the SPDK are managed by way of a configuration file.
The device provided by the embodiment of the present invention has the same implementation principle and technical effect as the method embodiments. For brevity, where the device embodiments do not mention something, reference may be made to the corresponding content in the method embodiments.
The embodiment of the invention provides electronic equipment, which particularly comprises a processor and a storage device, wherein the processor is used for processing a plurality of data files; the storage means has stored thereon a computer program which, when executed by the processor, performs the method of any of the above described embodiments.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, where the electronic device includes: a processor 70, a memory 71, a bus 72 and a communication interface 73, wherein the processor 70, the communication interface 73 and the memory 71 are connected through the bus 72; the processor 70 is adapted to execute executable modules, such as computer programs, stored in the memory 71.
The Memory 71 may include a Random Access Memory (RAM) and a Non-volatile Memory (Non-volatile Memory), such as at least one disk Memory. The communication connection between the network element of the system and at least one other network element is realized through at least one communication interface 73 (which may be wired or wireless), and the internet, a wide area network, a local network, a metropolitan area network, and the like can be used.
The bus 72 may be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 7, but that does not indicate only one bus or one type of bus.
The memory 71 is configured to store a program, and the processor 70 executes the program after receiving an execution instruction, and the method executed by the apparatus defined by the flow process disclosed in any of the foregoing embodiments of the present invention may be applied to the processor 70, or implemented by the processor 70.
The processor 70 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 70 or by instructions in the form of software. The processor 70 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 71, and the processor 70 reads the information in the memory 71 and completes the steps of the method in combination with its hardware.
The computer program product of the readable storage medium provided in the embodiment of the present invention includes a computer readable storage medium storing a program code, where instructions included in the program code may be used to execute the method described in the foregoing method embodiment, and specific implementation may refer to the foregoing method embodiment, which is not described herein again.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention or a part thereof which substantially contributes to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a portable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone skilled in the art may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (12)

1. A data processing method is applied to a storage device; the method comprises the following steps:
acquiring a data read-write request aiming at the storage equipment;
mapping a local disk of the storage device to a user state through a user-state driver of the SPDK, and running read-write operation of the local disk in the user state; the read-write operation is the read-write operation corresponding to the data read-write request;
and transmitting the disk information of the local disk after the read-write operation to a virtual machine system in the storage device.
2. The method of claim 1, wherein the local disk is an NVME local disk of an NVME interface;
the step of mapping the local disk of the storage device to a user state through the user-state driver of the SPDK and running the read-write operation of the local disk in the user state includes:
mapping the NVME local disk of the storage device to a user state through VFIO based on a user-state NVME driver of the SPDK;
and running the read-write operation of the NVME local disk in the user state, and creating a BlobStore on the user-state NVME driver so as to perform disk partition management of multiple logical volumes on the NVME local disk based on the BlobStore.
3. The method of claim 1 or 2, further comprising:
and for the data interface of the local disk, executing data IO operation by using a polling shared memory mode through the SPDK.
4. The method of claim 3, further comprising:
and in the user state, binding the process of polling the shared memory with a specific CPU occupied by the SPDK in a binding mode so as to enable all the data read-write requests to be executed based on the specific CPU in series.
5. The method of claim 2, further comprising:
in the process of disk partition management of the multiple logical volumes, correspondingly limiting the speed of different specifications of different logical volumes through QoS rules;
wherein the management of all the logical volumes is performed by encapsulated device management commands.
6. The method of claim 2, further comprising:
monitoring for any one or more of: the operation state of the virtual machine system, the service operation process, the logical volume, the IO performance of the NVME local disk, and the state of the NVME local disk.
7. The method of claim 1, further comprising:
and in the user mode, controlling data IO to perform DMA operations through the Vring of the virtual machine system and the SPDK poller of the SPDK, so that the data IO is prevented from passing through QEMU and Kernel of the virtual machine system.
8. The method according to claim 1, characterized in that the self-starting and/or self-recovery of the process in the user state is performed by means of Systemd.
9. The method of claim 1, wherein the framework resources of the SPDK are managed by means of a configuration file.
10. A data processing apparatus, characterized by being applied to a storage device; the device comprises:
the acquisition module is used for acquiring a data read-write request aiming at the storage equipment;
the mapping module is used for mapping the local disk of the storage device to a user state through a user-state driver of the SPDK and running the read-write operation of the local disk in the user state; the read-write operation is the read-write operation corresponding to the data read-write request;
and the transmission module is used for transmitting the disk information of the local disk after the read-write operation to a virtual machine system in the storage device.
11. An electronic device comprising a memory and a processor, wherein the memory stores a computer program operable on the processor, and wherein the processor implements the steps of the method of any of claims 1 to 9 when executing the computer program.
12. A computer readable storage medium having stored thereon computer executable instructions which, when invoked and executed by a processor, cause the processor to execute the method of any of claims 1 to 9.
CN202111043321.6A 2021-09-07 2021-09-07 Data processing method and device and electronic equipment Pending CN115774596A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111043321.6A CN115774596A (en) 2021-09-07 2021-09-07 Data processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111043321.6A CN115774596A (en) 2021-09-07 2021-09-07 Data processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN115774596A true CN115774596A (en) 2023-03-10

Family

ID=85387593

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111043321.6A Pending CN115774596A (en) 2021-09-07 2021-09-07 Data processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN115774596A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116775353A (en) * 2023-05-19 2023-09-19 北京百度网讯科技有限公司 Method and device for repairing failed disk, electronic equipment and readable storage medium

Similar Documents

Publication Publication Date Title
EP3754498A1 (en) Architecture for offload of linked work assignments
US10691363B2 (en) Virtual machine trigger
US9858095B2 (en) Dynamic virtual machine resizing in a cloud computing infrastructure
JP5689526B2 (en) Resource affinity through dynamic reconfiguration of multiqueue network adapters
US20180157519A1 (en) Consolidation of idle virtual machines
US20180217859A1 (en) Technologies for duplicating virtual machine states
US11831410B2 (en) Intelligent serverless function scaling
US11567884B2 (en) Efficient management of bus bandwidth for multiple drivers
CN115774596A (en) Data processing method and device and electronic equipment
US11507292B2 (en) System and method to utilize a composite block of data during compression of data blocks of fixed size
Gugnani et al. Performance characterization of hadoop workloads on SR-IOV-enabled virtualized InfiniBand clusters
Zhang et al. NVMe-over-RPMsg: A virtual storage device model applied to heterogeneous multi-core SoCs
US10606681B2 (en) Incremental dump with fast reboot
US20230333921A1 (en) Input/output (i/o) virtualization acceleration
US20230114636A1 (en) Systems, methods, and devices for accessing a device program on a storage device
US20230403236A1 (en) Techniques to shape network traffic for server-based computational storage
US20240086339A1 (en) Systems, methods, and devices for accessing a device operating system over an interconnect
US20200174814A1 (en) Systems and methods for upgrading hypervisor locally
de Lacerda Ruivo et al. Efficient High-Performance Computing with Infiniband Hardware Virtualization
CN118034917A (en) PCIe resource allocation method and device, electronic equipment and storage medium
Muda A Strategy for Virtual Machine Migration Based on Resource Utilization
Menon Optimizing network performance in virtual machines
Rizzo et al. Optimizations in Virtual Machine Networking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination