CN117834561A - Network I/O processing method and device based on SPDK - Google Patents


Info

Publication number
CN117834561A
CN117834561A (application number CN202311685964.XA)
Authority
CN
China
Prior art keywords
data
network
virtio
queue
bdev
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311685964.XA
Other languages
Chinese (zh)
Inventor
刘轩 (Liu Xuan)
李晨晨 (Li Chenchen)
秦文超 (Qin Wenchao)
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Suzhou Software Technology Co Ltd
Priority to CN202311685964.XA
Publication of CN117834561A
Legal status: Pending


Classifications

    • Y: General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02: Technologies or applications for mitigation or adaptation against climate change
    • Y02D: Climate change mitigation technologies in information and communication technologies [ICT], i.e. information and communication technologies aiming at the reduction of their own energy use
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Computer And Data Communications (AREA)

Abstract

The application provides an SPDK-based network I/O processing method and device, wherein the method comprises: establishing a mapping relationship between a virtual device and a port interfacing with an external storage device to obtain an end-to-end data transmission link, wherein the data structure of the driver of the virtual device supports a multi-queue feature and can carry network data packets, and the virtual device includes a plurality of transmission queues; and in response to an I/O request, performing network I/O transceiving processing based on the data transmission link and the plurality of transmission queues. By extending the driver of the virtual device to support the multi-queue feature and to carry network data packets, and by performing network I/O transceiving processing based on the data transmission link and the plurality of transmission queues, efficient user-mode storage-driver transceiving of network I/O is realized.

Description

Network I/O processing method and device based on SPDK
Technical Field
The present invention relates to the field of network I/O processing technologies, and in particular, to an SPDK-based network I/O processing method, device, electronic device, and storage medium.
Background
In the related art, in an actual DPU (Data Processing Unit) design scenario, when the acceleration chip (an FPGA) carries data over the TCP/IP protocol, the SoC (System on Chip) PF receives network data packets in which the data is stored as payload. The native SPDK virtio-blk front-end driver supports only a single queue processing a single request and has no receive/send queue (rxq/txq) design, so network data packets cannot be received and sent in parallel and can only be processed through the single-queue data structure.
Disclosure of Invention
The present application aims to solve, at least to some extent, one of the technical problems in the related art.
Therefore, a first object of the present application is to propose a network I/O processing method based on SPDK to implement multi-queue processing of network packets.
A second object of the present application is to propose a network I/O processing device based on SPDK.
A third object of the present application is to propose an electronic device.
A fourth object of the present application is to propose a computer readable storage medium.
A fifth object of the present application is to propose a computer program product.
To achieve the above objective, an embodiment of a first aspect of the present application provides a network I/O processing method based on SPDK, including:
establishing a mapping relationship between a virtual device and a port interfacing with an external storage device, and obtaining an end-to-end data transmission link; the data structure of the driver of the virtual device supports a multi-queue feature and can carry network data packets; the virtual device includes a plurality of transmission queues;
and responding to the I/O request, and performing network I/O transceiving processing based on the data transmission link and the plurality of transmission queues.
In some implementations, before the mapping relationship between the virtual device and the port interfacing with the external storage device is established, the method further includes:
responding to a first remote procedure call request, creating, in the SPDK framework, a virtual device bound with the identified hot-plug device, and loading a driver of the virtual device; the driver of the virtual device comprises an upper-layer virtio driver and a lower-layer bdev driver, and a data structure of the upper-layer virtio driver supports a multi-queue feature and can carry network data packets.
In some implementations, the creating, in response to the first remote procedure call request, a virtual device bound with the identified hot-plug device in the SPDK framework and loading a driver for the virtual device comprises the following steps:
responding to a first remote procedure call request, matching the identified hot-plug device, and initializing the hot-plug device;
creating a virtio device matched with the hot-plug device through a virtio bus, reading a description structure of the virtio device, and initializing the virtio device; the description structure of the virtio device comprises a port for interfacing with an external storage device and a poller;
updating the features of the front-end driver through feature negotiation between the front-end driver and the back-end device; defining the features of the upper-layer virtio driver based on the updated front-end driver features; and loading the upper-layer virtio driver to the virtio device, starting the virtio device and allocating transmission queues for the virtio device.
In some implementations, after the starting the virtio device and allocating a transmission queue for the virtio device; further comprises:
registering the virtio device as an I/O device, registering the I/O device into a bdev subsystem to form the bdev device, and configuring a data channel of the I/O device.
In some implementations, the establishing a mapping relationship between the virtual device and a port interfacing with the external storage device; comprising the following steps:
adding a port for interfacing with an external storage device based on a host physical function;
and responding to a second remote procedure call request, and establishing, through the data structure of the lower-layer bdev driver, a mapping relationship between the virtual device and the port interfacing with the external storage device.
In some implementations, the performing, in response to the I/O request, network I/O transceiving processing based on the data transmission link and the plurality of transmission queues comprises the following steps:
receiving an I/O request from a shared memory pool; based on the I/O request, acquiring a plurality of network data packets from the shared memory pool and storing them into a receive queue among the transmission queues, wherein each network data packet includes a metadata header;
parsing the network data packets in the receive queue to obtain received data;
and transmitting the received data to a bdev layer, so that the bdev layer transmits the received data to the external storage device.
In some implementations, the performing, in response to the I/O request, network I/O transceiving processing based on the data transmission link and the plurality of transmission queues further comprises:
the bdev layer reading completion data from the port interfacing with the external storage device, encapsulating the completion data into network data packets, and storing them into a send queue among the transmission queues;
and reading the network data packets of the send queue, and storing them into the shared memory pool through a data channel.
In some implementations, the bdev layer reading the completion data from the port interfacing with the external storage device, encapsulating the completion data into network data packets, and storing them into the send queue comprises the following steps:
storing the data content, the receive length and the write length of the completion data into a send-queue buffer area when the send queue is full or memory is insufficient;
and changing the state of the encapsulated network data packet to a waiting state.
In some implementations, before the performing, in response to the I/O request, network I/O transceiving processing based on the data transmission link and the plurality of transmission queues, the method further comprises:
registering a poller and triggering an event; and processing polling scheduling of the transmission queues based on the poller by dynamically setting the transmission queues.
To achieve the above object, an embodiment of a second aspect of the present application provides a network I/O processing device based on SPDK, including:
the link establishment module is configured to establish a mapping relationship between the virtual device and a port interfacing with the external storage device to obtain an end-to-end data transmission link; the data structure of the driver of the virtual device supports a multi-queue feature and can carry network data packets; the virtual device includes a plurality of transmission queues;
and the data processing module is configured to, in response to the I/O request, perform network I/O transceiving processing based on the data transmission link and the plurality of transmission queues.
In some implementations, the apparatus further includes a driver setting module configured to:
respond to a first remote procedure call request, create, in the SPDK framework, a virtual device bound with the identified hot-plug device, and load a driver of the virtual device; the driver of the virtual device comprises an upper-layer virtio driver and a lower-layer bdev driver, and a data structure of the upper-layer virtio driver supports a multi-queue feature and can carry network data packets.
In some implementations, the driver setting module is specifically configured to:
respond to a first remote procedure call request, match the identified hot-plug device, and initialize the hot-plug device;
create a virtio device matched with the hot-plug device through a virtio bus, read a description structure of the virtio device, and initialize the virtio device; the description structure of the virtio device comprises a port for interfacing with an external storage device and a poller;
update the features of the front-end driver through feature negotiation between the front-end driver and the back-end device; define the features of the upper-layer virtio driver based on the updated front-end driver features; and load the upper-layer virtio driver to the virtio device, start the virtio device and allocate transmission queues for the virtio device.
In some implementations, after starting the virtio device and allocating transmission queues for the virtio device, the driver setting module is further configured to:
registering the virtio device as an I/O device, registering the I/O device into a bdev subsystem to form the bdev device, and configuring a data channel of the I/O device.
In some implementations, the link establishment module is specifically configured to:
adding a port for interfacing with an external storage device based on a host physical function;
and respond to a second remote procedure call request, and establish, through the data structure of the lower-layer bdev driver, a mapping relationship between the virtual device and the port interfacing with the external storage device.
In some implementations, the data processing module is specifically configured to:
receive an I/O request from a shared memory pool; based on the I/O request, acquire a plurality of network data packets from the shared memory pool and store them into a receive queue among the transmission queues, wherein each network data packet includes a metadata header;
parse the network data packets in the receive queue to obtain received data;
and transmit the received data to a bdev layer, so that the bdev layer transmits the received data to the external storage device.
In some implementations, the data processing module is further configured to:
cause the bdev layer to read completion data from the port interfacing with the external storage device, encapsulate the completion data into network data packets, and store them into a send queue among the transmission queues;
and read the network data packets of the send queue and store them into the shared memory pool through a data channel.
In some implementations, when the bdev layer reads the completion data from the port interfacing with the external storage device, encapsulates the completion data into network data packets, and stores them into the send queue, the data processing module is configured to:
store the data content, the receive length and the write length of the completion data into a send-queue buffer area when the send queue is full or memory is insufficient;
and change the state of the encapsulated network data packet to a waiting state.
In some implementations, the data processing module is further configured to:
register a poller and trigger an event; and process polling scheduling of the transmission queues based on the poller by dynamically setting the transmission queues.
To achieve the above object, an embodiment of a third aspect of the present application provides an electronic device, including: a processor, and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes computer-executable instructions stored in the memory to implement the method of the first aspect.
To achieve the above object, an embodiment of a fourth aspect of the present application proposes a computer-readable storage medium, in which computer-executable instructions are stored, which when executed by a processor are adapted to carry out the method according to the first aspect.
To achieve the above object, an embodiment of a fifth aspect of the present application proposes a computer program product comprising a computer program which, when executed by a processor, implements the method of the first aspect.
According to the SPDK-based network I/O processing method, device, electronic device and storage medium of the present application, the driver of the virtual device is extended to support the multi-queue feature and to carry network data packets; and network I/O transceiving processing is performed based on the data transmission link and the plurality of transmission queues, so that efficient user-mode storage-driver transceiving of network I/O is realized.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow chart of an SPDK-based network I/O processing method according to an embodiment of the present application;
Fig. 2 is an exemplary diagram of an SPDK-based network I/O processing method according to an embodiment of the present application;
Fig. 3 is an exemplary diagram of yet another SPDK-based network I/O processing method according to an embodiment of the present application;
Fig. 4 is an exemplary diagram of yet another SPDK-based network I/O processing method according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a data flow in step 40 of an SPDK-based network I/O processing method according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a data structure of a network data packet according to an embodiment of the present application;
Fig. 7 is a state change diagram of a network data packet according to an embodiment of the present application;
Fig. 8 is a block diagram of an SPDK-based network I/O processing device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the drawings are exemplary and intended for the purpose of explaining the present application and are not to be construed as limiting the present application.
SPDK (Storage Performance Development Kit) is an open-source storage performance development kit from Intel. It adopts the kernel-bypass idea and provides user-mode, polled-mode, asynchronous, lock-free storage drivers. Its overall architecture has three layers: the lowest layer is the driver layer, which supports NVMe, AIO, RBD, virtio, iSCSI and other back-end devices and provides drivers for them, including user-mode NVMe drivers (PCIe/RDMA/TCP) and virtio drivers (scsi/blk); the middle layer is the storage service layer, which provides a universal API interface for the back-end devices; the uppermost layer is the protocol layer, which, according to the transport, provides two classes of targets, namely the network type NVMe-oF (RDMA/TCP) and iSCSI, and the virtualization type vhost-blk, vhost-scsi and vhost-nvme.
virtio is an I/O (Input/Output) paravirtualization framework. Under this framework, the virtual machine (Guest) is aware that it runs in a virtualized environment and communicates with the host machine (Host) through the virtual machine monitor (hypervisor); a device conforming to the framework's interface specification is called a virtio device. virtio currently has three published standard versions, v0.95, v1.0 and v1.1, and the framework mainly comprises three parts: the front-end driver, the transmission queue (vq) and the back-end device. The front-end driver runs in the Guest system or in a bare-metal application process, and the back-end device exists in a hypervisor process (e.g., Qemu) or a separate process (or module) of the Host system. The transmission queue is mainly used to transfer data between the front-end driver and the back-end device through shared memory mapping. Taking the v1.0 standard as an example, the data structure of the transmission queue is a vring, which mainly comprises a descriptor table, an available ring and a used ring.
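The vring layout described above can be sketched in C as follows. Field names follow the virtio v1.0 split-ring specification; this is an illustrative sketch, not SPDK's actual definitions:

```c
#include <stdint.h>

/* Descriptor table entry: one buffer in a (possibly chained) request. */
struct vring_desc {
    uint64_t addr;   /* guest-physical address of the buffer */
    uint32_t len;    /* buffer length in bytes */
    uint16_t flags;  /* e.g. NEXT (chained), WRITE (device writes) */
    uint16_t next;   /* index of the next descriptor in the chain */
};

/* Available ring: the driver publishes descriptor-chain heads here. */
struct vring_avail {
    uint16_t flags;
    uint16_t idx;    /* where the driver will write the next entry */
    uint16_t ring[]; /* queue_size entries of descriptor indices */
};

/* One completed request, as reported by the device. */
struct vring_used_elem {
    uint32_t id;     /* head index of the completed descriptor chain */
    uint32_t len;    /* total bytes the device wrote */
};

/* Used ring: the device reports completions here. */
struct vring_used {
    uint16_t flags;
    uint16_t idx;    /* where the device will write the next entry */
    struct vring_used_elem ring[];
};
```

The three regions live in shared memory, so driver and device exchange data without copies: the driver advances `avail.idx`, the device advances `used.idx`, and each side polls the other's index.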
virtio-blk, based on the virtio paravirtualization framework, is an implementation of virtualized block storage in the KVM-Qemu virtualization ecosystem, and provides an efficient block-storage mounting method using the virtio shared-memory mechanism. The virtio-blk user-mode driver implemented by SPDK has been adopted by mainstream DPUs (Data Processing Units) for interfacing with conventional block storage devices. Each cloud vendor's self-developed DPU design differs, but basically adopts an ASIC/FPGA + SoC architecture: SPDK is deployed on the SoC (System on Chip) plane, and the data flow of the acceleration chip is connected through a PF (Physical Function) interface, thereby realizing efficient storage data processing.
In the related art, in the actual design scenario of the DPU, there are several problems with the open source SPDK:
1. When the acceleration chip (an FPGA) carries data over the TCP/IP protocol, the SoC PF receives network data packets in which the data is stored as payload. The native SPDK virtio-blk front-end driver supports only a single queue processing a single request and has no receive/send queue (rxq/txq) design, so network data packets cannot be received and sent in parallel.
Therefore, the native SPDK needs to be improved by customizing the virtio driver (i.e., the virtio-blk-net driver, abbreviated as the blk-net driver): the control plane is responsible for forwarding and routing network packets, and the data plane is responsible for parsing the payload and forwarding it downward.
The front end and the back end of the blk-net driver are connected by means of a transport layer (namely, the virtio layer). The IOPS of the DPU is strongly influenced by the number of transmission queues, and the two are basically in a linear relationship, so the custom virtio driver needs to support the multi-queue feature; that is, both the front-end driver and the back-end device need to support the VIRTIO_NET_F_MQ feature bit, i.e., the multi-queue feature, so as to maximize data throughput.
The following describes a network I/O processing method, device and equipment based on SPDK according to the embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a flow chart of a network I/O processing method based on SPDK according to the embodiment of the present application. It should be noted that, the execution body of the network I/O processing method based on the SPDK in the embodiment of the present application is the network I/O processing device based on the SPDK in the embodiment of the present application, and the network I/O processing device based on the SPDK may be configured in an electronic device, so that the electronic device may execute the network I/O processing function based on the SPDK.
As shown in fig. 1, the network I/O processing method based on SPDK includes the following steps:
step 10, responding to a first remote procedure call request, creating, in the SPDK framework, a virtual device bound with the identified hot-plug device, and loading a driver of the virtual device; the driver of the virtual device comprises an upper-layer virtio driver, and a data structure of the upper-layer virtio driver supports a multi-queue feature and can carry network data packets.
It can be understood that the DPU includes two layers of drivers: the first layer is the upper-layer virtio driver, corresponding to the host; the second layer is the lower-layer bdev driver, corresponding to PCI devices and other block devices installed on the host. The DPU card is plugged into a server, and the block devices are the disks installed in that server; through the DPU, these disks are exposed to an upper-layer user, which may be a virtual machine or bare metal. That is, the host is only one tenant of the server, and the tenant may run in virtual-machine mode or bare-metal mode. It can be understood that the host leases the server, and the server provides data services such as data storage for the host. The host's data is transmitted by the FPGA, and the host's SoC realizes the data round trip with the real block devices on the server through the lower-layer bdev driver.
The main purpose of this step is to identify the PCI hot-plug device plugged into the DPU card and load the upper-layer virtio driver (i.e., the virtio-blk-net driver) to bind to it. This driver is not the open-source driver; its data structure supports network data packets and the multi-queue (MQ) feature. The original open-source data structure supports a single queue; in a specific implementation, the open-source data structure is extended, that is, some pointers are added to it so that it supports the multi-queue feature.
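The extension described above can be sketched as follows. All structure and field names here are hypothetical stand-ins for illustration (the patent does not publish the actual definitions): the open-source shape holds a single queue pointer, while the extended shape adds pointer arrays so one device can own several receive/send queues.

```c
#include <stdint.h>
#include <stdlib.h>

struct virtqueue { uint16_t size; };   /* stand-in for a real virtqueue */

/* Open-source single-queue shape (illustrative). */
struct virtio_blk_dev_sketch {
    struct virtqueue *vq;
};

/* Extended multi-queue shape (illustrative): added pointers give the
   device a control queue plus arrays of receive/send queues. */
struct virtio_blk_net_dev_sketch {
    struct virtqueue  *ctrl_vq;        /* control queue */
    struct virtqueue **rx_vqs;         /* receive queues */
    struct virtqueue **tx_vqs;         /* send queues */
    uint16_t           num_queue_pairs;/* negotiated via VIRTIO_NET_F_MQ */
};

/* Allocate n receive/send queue-pair slots on the extended device. */
static int alloc_queue_pairs(struct virtio_blk_net_dev_sketch *dev, uint16_t n)
{
    dev->rx_vqs = calloc(n, sizeof(*dev->rx_vqs));
    dev->tx_vqs = calloc(n, sizeof(*dev->tx_vqs));
    if (dev->rx_vqs == NULL || dev->tx_vqs == NULL)
        return -1;
    dev->num_queue_pairs = n;
    return 0;
}
```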
As an implementation manner, as shown in fig. 2, a method for creating a first virtio device matched with a PCI hot plug device in an SPDK framework and completing initialization and first drive setting of the first virtio device includes:
step 101, initializing an SPDK framework, distributing memory resources for the SPDK framework, and starting a service process.
As one implementation, the configuration file and command-line parameters are first read, and the SPDK event framework initialization parameters (opts) are set. DPDK is called to parse the opts; the parsing process must be performed in the main process to ensure uniform allocation and synchronization of resources in the shared memory. Then, main and auxiliary polling threads (reactors) are created according to the opts, a log threshold is set, initialization of the SPDK framework is completed, and the designated callback function starts the custom service.
Step 102, responding to a first remote procedure call request, matching PCI hot plug equipment, and initializing the PCI equipment.
The first remote procedure call request is an rpc bdev_virtio_attach_controller command. In response to this command, a virtio-pci device is created. The creation process requires an attach operation: PCI device matching is performed, identification of the PCI hot-plug device is completed by its BDF identifier, and the ID of the identified PCI hot-plug device is obtained. When scanning and adding the PCI device at the given PCI address succeeds, the probe callback function performs common initialization steps such as device-validity checking, BAR space mapping, and registration of the config operation functions and transmission-queue operation functions, completing detection of the PCI device.
In this step, for each identified PCI hot plug device, a corresponding virtio device is created according to the PCI context on the bus.
Step 103, creating a first virtio device matched with the PCI device through the virtio bus, and reading a description structure of the first virtio device; the description structure of the first virtio device comprises a port interfacing with the host's external storage and a poller.
As one implementation, the virtio-pci driver is loaded and bound to the PCI device. The bus of the virtio_device needs to be set to the virtio bus, and the device is registered on the virtio bus and matched with the successfully probed PCI device, completing detection of the virtio device. The configuration content about virtio-blk in the BAR space is read, and some parameters are allocated for the virtio device, including memory allocation and the maximum BS (i.e., maximum payload length) setting.
It should be noted that the description structure of the virtio device is not virtio_blk_dev but is partially modified based on that structure: a port interfacing with the host's external storage, a poller, and the like are added. To distinguish it from the original structure, the modified description structure is called virtio_blk_net_dev.
A basic virtio-device operation function set modern_ops is added, the maximum number of transmission queues supported by the driver is created, the back-end device is synchronized, and so on, completing basic initialization at the virtio-device level; at this point the virtio device is not yet started.
Step 104, updating the front-end driving characteristics through the characteristic negotiation between the front-end driving and the back-end equipment; and performing driving setting of the first virtio device based on the characteristic descriptor, starting the first virtio device and distributing a transmission queue for the first virtio device.
The front-end driver features are updated by negotiating with the features of the back-end device; in other words, the features supported by the virtio-blk-net front-end driver are customized.
Illustratively, the feature bits supported by the custom virtio-blk-net front-end driver are shown in Table 1:
as can be seen from table 1, since the network data packet is accepted, the feature bit supported by the front end driver is basically of the net type, where the custom function is to support the data to flow into the port exposed by the host.
It should be noted that the virtio-blk-net device implemented by the front-end driver is still a virtio-blk device in nature; it is a block device of a special form.
Illustratively, the corresponding back-end device features are obtained from the virtio-blk device parameters and negotiated against the features supported by the blk-net front-end driver; the largest common subset is taken as the features supported by the front-end driver, completing the feature negotiation between the front-end driver and the back-end device.
After feature negotiation is completed, the state of the virtio device is VIRTIO_CONFIG_S_FEATURES_OK and the virtio device is reset. When the system receives the VIRTIO_CONFIG_S_DRIVER_OK signal, indicating that the blk-net driver has been set up successfully, the virtio device is started and queues are allocated.
It should be noted that, since both the front-end driver and the back-end device support the MQ feature, one virtio device has multiple queues, including one ctrl_vq and several io_vqs; the upper limit on the number of queues depends on what the virtio device supports. Since one virtio device now has multiple queues, the application realizes support for the multi-queue feature through the custom first driver.
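The negotiation rule above amounts to taking the intersection of the bit set offered by the back-end device and the bit set supported by the front-end driver; the MQ feature survives only if both sides set VIRTIO_NET_F_MQ. A minimal sketch (bit positions follow the published virtio specification; the helper name is our own):

```c
#include <stdint.h>

/* Feature-bit positions from the virtio specification. */
#define VIRTIO_NET_F_MQ     (1ULL << 22)  /* device supports multiqueue */
#define VIRTIO_F_VERSION_1  (1ULL << 32)  /* virtio 1.0 compliance */

/* Negotiated features = bits both sides support (bitwise AND). */
static uint64_t negotiate_features(uint64_t device_offered,
                                   uint64_t driver_supported)
{
    return device_offered & driver_supported;
}
```

If the AND result contains VIRTIO_NET_F_MQ, the driver may then allocate several queue pairs; otherwise it must fall back to a single queue.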
Illustratively, this step function is implemented by an initialization module.
Step 20, registering the virtual device as an I/O device, registering the I/O device into the bdev subsystem to form the bdev device, and configuring a data channel of the I/O device.
That is, registering the virtio device as an I/O device, and registering the I/O device into the bdev subsystem constitutes the bdev device, and configures a data channel of the I/O device. I.e., load the underlying bdev drivers of the virtual devices.
This step describes how I/O and control-class operations are performed on the block device through the access interface of the SPDK abstract block device layer (bdev for short).
The SPDK application accesses the virtio-blk-net device mainly through the presented bdev; the I/O read/write and control interfaces are unchanged from other ordinary bdevs, and the assigned bdev operation function table virtio_fn_table responds to read/write and configuration commands from the SPDK application layer and interacts with the back-end device.
As one implementation, registering the first virtio device as an I/O device, and registering the I/O device into a bdev subsystem and configuring a data channel of the I/O device; comprising the following steps:
in step 201, the first virtio device is registered as an I/O device, and the I/O device is registered into the bdev subsystem.
As one implementation, the virtio-blk-net device obtains a context pointer to the io_device, registers it as an I/O device, and controls the I/O device through the virtio_blk_net_ctrl structure.
When the SPDK application program executes an I/O operation, the configured callback functions create_cb and delete_cb are invoked; create_cb is responsible for allocating resources, configuring the I/O channel, and registering the I/O device into the bdev subsystem.
Step 202, configuring a data channel of the I/O device.
As one implementation, the transmission queue of the first virtio device is allocated to a plurality of processor cores, and on the processor cores, the input-output channels of the bdev subsystem are associated with a vring data structure.
The data path of the blk-net driver takes the form of a double queue, i.e., a receive queue and a send queue, both of type virtio_net. Considering load balancing, the transmission queues set in step 101 need to be evenly allocated to each processor core (CPU core) in use. A loop is executed on the processor core corresponding to the current spdk_thread: all allocated transmission queues io_vq are initialized and a virtio_blk_net_qpair management structure is added, where a queue pair (qpair) comprises one receive queue and one send queue and the two are scheduled as a pair; the bdev io_channel and the vring data structure are associated on that processor core.
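The even distribution of transmission queues over processor cores can be sketched as a round-robin assignment. Function and array names are illustrative, not SPDK APIs:

```c
#include <stddef.h>

/* Sketch of the load-balancing step: evenly distribute the transmission
 * queues set in step 101 across the processor cores in use, round-robin.
 * Names are illustrative, not SPDK APIs. */
void assign_queues_to_cores(size_t num_queues, size_t num_cores,
                            size_t core_of_queue[])
{
    for (size_t q = 0; q < num_queues; q++)
        core_of_queue[q] = q % num_cores;  /* queue q is polled on this core */
}
```

With, say, 6 queues and 4 cores, cores 0 and 1 each poll two queues and cores 2 and 3 poll one each, so no core is more than one queue ahead of any other.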
The functions of this step are illustratively implemented by a control module.
The above steps set up the driver of the virtual device, and this setup is performed once. The following steps describe the network I/O processing in more detail.
And step 30, establishing a mapping relation between the virtual equipment and the port of the external storage equipment, and acquiring an end-to-end data transmission link.
The data structure of the driver of the virtual device supports a multi-queue characteristic and can bear network data packets; the virtual device includes a plurality of transmission queues.
This step is used to map bdev to a specific port_id to complete the end-to-end data connection.
As an implementation manner, the bdev subsystem is mapped to a device port of the host machine, the data channel of the I/O device is matched with the device port of the host machine, and an end-to-end data transmission link is obtained; this includes:
in step 301, a port for interfacing with an external storage device is added based on the host physical function.
As one implementation, an ifc device handle is obtained, abstracting the PF of the Host into a port.
When the front-end driver supports the VIRTIO_NET_F_IMT feature, port_id is the PVF of IMT, and the number of PVFs is consistent with the number of vports obtained by the interface function ifc_add_port; when the front-end driver disables the IMT function, port_id becomes the queue-pair index of the virtio_blk_net device, i.e., similar to a vf_id (virtual function). This flexible design supports multiplexing between bare-metal and virtual-machine modes.
In step 302, in response to the second remote procedure call request, a mapping relationship between the virtual device and the port of the external storage device is established through the underlying bdev driver data structure.
As one implementation, in response to an rpc virtio_blk_map_bdev request, when a virtio-blk type request comes in, the front-end driver determines whether it is a valid request source; if the remote procedure call response corresponding to the request is correct, the front-end driver maps the bdev to a specific port number (port_id) through the descriptive structure pointer of the virtio_blk_net device.
The mapping process needs to ensure that there is no unprocessed input/output on the bdev. Since one bdev has input-output channel access from multiple threads, when the bdev triggers an asynchronous call, data isolation of concurrent operations must be ensured by always calling on the same thread as spdk_bdev_open_ext.
The bdev of this embodiment triggers asynchronous calls in two cases: 1. the scheduling mechanism of the SPDK dispatches multiple pollers to access the same bdev, i.e., multiple threads acquire the descriptor of the bdev at the same time; 2. bdevs have a layering mechanism: when one bdev routes to other bdevs through its input-output channel, it is called a virtual bdev (vdev), and a vdev has a lower response priority than a bdev. When input/output is forwarded to a lower-level vdev, the bdev-layer scheduling mechanism cannot coordinate read and write operations between the multiple threads. What is actually transmitted to the port is the payload structure of the bdev, i.e., the specific data of each network packet (pkt).
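The bdev-to-port binding set up by the custom RPC can be sketched as a small lookup table. The struct below is an illustrative stand-in for the device's descriptive structure, not an SPDK type; real code would also verify that the bdev has no unprocessed I/O before remapping:

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PORTS 16

/* Illustrative stand-in for the device's descriptive structure: one bdev
 * name bound per host port. */
struct port_map {
    const char *bdev_name[MAX_PORTS];
};

/* Bind a bdev to a specific port_id; reject an invalid port or one that
 * is already mapped, guaranteeing a one-to-one end-to-end link. */
bool map_bdev_to_port(struct port_map *m, uint16_t port_id,
                      const char *bdev_name)
{
    if (port_id >= MAX_PORTS || m->bdev_name[port_id] != NULL)
        return false;  /* invalid port, or port already mapped */
    m->bdev_name[port_id] = bdev_name;
    return true;
}
```

Rejecting a second mapping on the same port keeps service separation intact: each port carries exactly one bdev's payload.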
The embodiment of the present application supports end-to-end data transmission and satisfies multiplexing of the bare-metal instance mode and the virtual-machine mode. A group of custom RPC commands supports mapping the bdev layer to a specific external host port, so end-to-end data transmission can be achieved and service separation is guaranteed. The custom blk-net driver adds an IMT function, supports PF/VF conversion of the host port, and satisfies multiplexing of the bare-metal instance mode and the virtual-machine mode.
The functions of this step are illustratively implemented by a mapping module.
And step 40, responding to the I/O request, and performing network I/O transceiving processing based on the data transmission link and the plurality of transmission queues.
After the first driver (the blk-net driver) finishes pairing input-output channels with port numbers, multi-queue transceiving processing of input-output requests can be realized.
As an implementation manner, as shown in fig. 3 and 4, the method for performing network I/O transceiving processing based on the data transmission link and the plurality of transmission queues includes:
in step 401, a poller is registered, triggering an event.
It should be noted that when the I/O device is registered into the bdev subsystem in step 102, g_poll_groups (group for short) is also registered into the bdev subsystem at the same time. In this technical solution, the global group is virtio_blk_net_poll_group and belongs to an io_device of the virtio_blk_net_ctrlr type.
The method initializes groups according to the CPU mask passed in the remote procedure call request. To achieve efficient data processing, each processor core builds only one SPDK thread (spdk_thread); concurrency is thus avoided and processor cores are exclusive, making the data link lock-free. Finally, an active poller (active_poll) is registered on the thread; when an event (event_fn) is triggered, the group serves as the context for passing messages between the poller and the thread.
Step 402, by dynamically setting a transmission queue, processing a polling schedule of the transmission queue based on the poller.
In the process of registering and activating the poller, a poller is created to poll the group. Step 102 has already established the connection of the io_device; each SPDK thread has only one spdk_io_channel connected to the poller, and the spdk_io_channels inside the threads are multiplexed while polling the group. The io_channel acquires a data-channel queue-pair pointer to connect a data channel.
Polling of the queues is scheduled according to the index of each device's queue pair. The scheduling mechanism here does not mechanically assign specific queues to corresponding devices, e.g., queues 0, 1, 2 to vdev0, queues 3, 4, 5 to vdev1, and so on. The invention provides a flexible queue configuration method: traverse from an initial index up to the maximum index supported by the vdev, holding a thread lock on the device during traversal so that multiple threads cannot operate on one virtual device at the same time and scramble the use of its queues, and ensure that the queue bound in the corresponding thread is in an unused state. This effectively saves underlying resources and maximizes queue utilization.
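The flexible queue-claiming scheme above can be sketched as a scan from an initial index up to the device-supported maximum, taking the first unused queue. Real code would hold the device's thread lock around the scan; the lock is omitted here to keep the sketch self-contained, and all names are illustrative:

```c
#include <stdbool.h>

#define MAX_QUEUES 8

/* Illustrative device state: which queue pairs are in use, and the
 * maximum queue index this vdev supports. */
struct vdev {
    bool queue_in_use[MAX_QUEUES];
    int  max_index;  /* upper limit supported by this vdev */
};

/* Scan from start_index to the supported maximum and claim the first
 * unused queue; returns its index, or -1 if none is free.
 * (Real code holds the device's thread lock across this scan.) */
int claim_queue(struct vdev *dev, int start_index)
{
    for (int i = start_index; i < dev->max_index; i++) {
        if (!dev->queue_in_use[i]) {
            dev->queue_in_use[i] = true;  /* bind this queue to the thread */
            return i;
        }
    }
    return -1;  /* no free queue below max_index */
}
```

Because the scan always lands on an unused slot rather than a fixed per-device range, queues freed by one thread become immediately reusable, which is the resource saving the text describes.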
It should be noted that each queue pair creates an independent shared memory pool (mempool) and allocates an internal buffer. Each SPDK thread has an address-isolated, independent memory pool for storing network packets in the requested or completed state.
After dynamic allocation of the queues is completed, since each group of queue pairs is managed by the poller, three special cases exist in actual transmission: (1) the send queue is full and the network packet cannot be placed, i.e., queued_cpl; (2) the request is in an as-yet-incomplete state, i.e., compl_pending; (3) for unknown reasons the request crashes and requires bdev reprocessing, i.e., needbdevproc. All three abnormal states need their corresponding linked lists initialized in advance to facilitate subsequent calls.
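The three exception lists named above, pre-initialized before transmission begins, can be sketched as empty singly linked lists. The structure layout is illustrative:

```c
#include <stddef.h>

/* Minimal list node; the payload fields of a real packet are omitted. */
struct pkt_node {
    struct pkt_node *next;
};

/* The three exception lists named in the text, one per abnormal state. */
struct exception_lists {
    struct pkt_node *queued_cpl;     /* (1) send queue full, packet parked */
    struct pkt_node *compl_pending;  /* (2) request not yet complete */
    struct pkt_node *needbdevproc;   /* (3) crashed request awaiting bdev redo */
};

/* Initialize all three lists empty before any transmission starts. */
void init_exception_lists(struct exception_lists *l)
{
    l->queued_cpl = NULL;
    l->compl_pending = NULL;
    l->needbdevproc = NULL;
}
```

Initializing the lists up front means the hot transmit path only ever appends or drains; it never has to branch on "list not yet created".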
Step 403, receiving an I/O request from a shared memory pool, and based on the I/O request, obtaining a plurality of network data packets from the shared memory pool and storing the network data packets in a receiving queue in the transmission queue; wherein the network data packet includes a metadata header.
The data type processed by the embodiment of the present application is a network data packet with an added metadata header passing through the TCP/IP network protocol; the real data type is still a storage data packet.
The method comprises the following steps of receiving data packet enqueuing processing: the data packets are retrieved from the shared memory pool and stored in a descriptor list of the receive queue.
Taking a single processor core as an example, the processor core polls all receiving and sending, and processing of input/output data begins; the data flow diagram is shown in fig. 5. A data transfer triggers activation of the poller, informing the driver to receive input/output requests from the shared memory pool.
A virtio_blk_net_pkt packet is a unit of object data obtained from the shared memory pool; its data structure is shown in fig. 6 and mainly consists of buffer fields pointing into the data, plus the data itself, i.e., the actual request/return packet body. As can be seen from the data structure, a blk-net type request uses the custom metadata header, the actually initiated request is still of the virtio-blk type, the SPDK framework is called at the bottom layer, and the iSCSI command is translated into a scsi_cmd for processing.
The blk-net driver uses a shared network packet structure for requests and returns: the request body and return body use the same memory (pkt->data), so the return body data can overwrite the pkt->data content, and the data field mainly carries the relevant data types and information after the request and operation are completed. Network packets are managed by a state machine; the state changes are shown in fig. 7, and the initial state is the normal (AVAIL) state.
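The per-packet state machine can be sketched with the states named throughout this section: AVAIL (initial), USED (queued for processing), DONE (bdev completion callback fired), and READY (parked while the send queue is full). The transition set below is our reading of the text, not a reproduction of fig. 7:

```c
#include <stdbool.h>

/* Packet states named in the text. */
enum pkt_state { PKT_AVAIL, PKT_USED, PKT_DONE, PKT_READY };

/* Whether a state transition is legal under our reading of the flow. */
bool pkt_transition_ok(enum pkt_state from, enum pkt_state to)
{
    switch (from) {
    case PKT_AVAIL: return to == PKT_USED;                    /* fetched into recv queue */
    case PKT_USED:  return to == PKT_DONE || to == PKT_READY; /* completed, or parked */
    case PKT_DONE:  return to == PKT_USED || to == PKT_AVAIL; /* re-enqueued or recycled */
    case PKT_READY: return to == PKT_USED;                    /* drained when space frees */
    }
    return false;
}
```

Centralizing the legality check like this is what lets the abnormal-state protection mechanism reject impossible transitions instead of silently corrupting a packet's lifecycle.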
First, the buffer of the received packet left by the last transmission in the receiving queue needs to be emptied. Then, cnt network data packets are obtained from the shared memory pool at one time and stored into a receiving queue.
The cnt value is determined in two steps: 1. obtain the index in use in the used ring table of the receive queue and the index recorded last time, and compute the remaining available space of the used ring table this round to obtain a nominal cnt value; 2. the nominal cnt value cannot be used as the actual transmission parameter, because the cache_line_size of the SPDK is fixed: when the transmission structure is vring_desc, the memory cannot guarantee that exactly cnt network packet buffers fit each time, so reasonable redundancy correction of cnt is needed.
cnt loop iterations are performed, and each network packet's data is stored into the space of the descriptor table (desc). The essence of the driver filling the receive queue is cnt groups of data; the data layer reads the used ring table to obtain the descriptor, and after reading completes, the space pointed to by the descriptor index (desc_idx) needs to be released and added to the free list.
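The two-step cnt computation can be sketched as follows: a nominal count from the used-ring indices (16-bit unsigned arithmetic wraps the same way virtio ring indices do), then a redundancy correction. The margin value used in the test is illustrative:

```c
#include <stdint.h>

/* Step 1: remaining room in the used ring, from the current index and the
 * last-recorded index. 16-bit unsigned subtraction wraps correctly across
 * index rollover. */
uint16_t nominal_cnt(uint16_t used_idx, uint16_t last_idx, uint16_t queue_size)
{
    uint16_t outstanding = (uint16_t)(used_idx - last_idx); /* wraps safely */
    return (uint16_t)(queue_size - outstanding);            /* remaining room */
}

/* Step 2: redundancy correction -- the fixed cache_line_size means exactly
 * cnt buffers may not fit, so a margin is subtracted. */
uint16_t corrected_cnt(uint16_t nominal, uint16_t margin)
{
    return nominal > margin ? (uint16_t)(nominal - margin) : 0;
}
```

The wraparound case matters in practice: once the 16-bit ring index rolls over, a naive signed subtraction would produce a huge bogus count, while the unsigned difference stays correct.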
Step 404, parsing the network data packet in the receiving queue to obtain receiving data; and transmitting the received data to the bdev layer so that the bdev layer transmits the received data to the external storage device.
The method comprises the following steps of dequeuing a received data packet: and analyzing the data packet in the receiving queue and transmitting the data packet to the bdev layer.
As one implementation, the data packet is truncated, the actual write length is calculated, instruction conversion is completed, and a write data request is issued to the bdev layer so that the data is stored, through the data channel, into a storage device outside the host machine.
The above process only stores a certain number of network packets into the descriptor table, i.e., the enqueue process; no actual data processing has been performed. The cnt network packets must be inserted in order at the tail of the completion-pending linked list (complete_pending); at this point the state of the network packets is refreshed to the USED state, and they await processing in order by the specific process function. The I/O process includes a packet truncation mechanism that takes only a specific length for packets exceeding a set threshold size; here the threshold is set to 4K (i.e., 4K packets are processed) to ensure that the data is 32-byte aligned when the packet is captured and stored in memory.
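The truncation rule can be sketched as clamping to the 4K threshold and then rounding the kept length down to a 32-byte boundary; the round-down step is our reading of the alignment remark in the text:

```c
#include <stddef.h>

#define PKT_THRESHOLD 4096u  /* 4K threshold from the text */
#define PKT_ALIGN     32u    /* 32-byte alignment from the text */

/* Clamp oversized packets to the threshold, then round down to a
 * 32-byte multiple so the stored data stays aligned. */
size_t truncated_len(size_t pkt_len)
{
    size_t len = pkt_len > PKT_THRESHOLD ? PKT_THRESHOLD : pkt_len;
    return len & ~(size_t)(PKT_ALIGN - 1);  /* round down to 32-byte multiple */
}
```

The mask trick works because PKT_ALIGN is a power of two; clearing the low five bits is equivalent to `len - (len % 32)`.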
Parsing of the data packet begins by first acquiring the timestamp at packet capture, the packet length, and the metadata header information; then checking whether the bdev is connected to the upper port number. If the mapping is correct, the offset is set according to the type of the cached request; the host is then notified that the data is available for dequeuing.
Corresponding operations are performed according to the request state of the network packet; there are generally three types of requests: write bdev, read bdev, and flush bdev (with I/O resubmission). The three cases differ in their operations, and all complete via asynchronous callbacks. The write request operation of this embodiment is briefly described below; detailed descriptions of the latter two cases are omitted.
The actual request data address, packet length, and the offset pointed to by the descriptor are obtained; the metadata header length and the offset are subtracted from the truncated packet length to obtain the actual write length. Partial data offsets can occur in this process, so the data length actually written to the underlying virtio_hw can differ from the original length. The request is passed into the bdev layer through the io_channel and translated into the scsi_cmd type; the SPDK encapsulates the back-end device interface, and the SPDK framework automatically selects the corresponding function to process the incoming I/O according to the bdev type.
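The write-length arithmetic just described can be sketched directly; the underflow guard is defensive and ours, not stated in the text:

```c
#include <stddef.h>

/* Actual length written to the underlying virtio_hw: the truncated packet
 * length minus the metadata header length and the descriptor offset, so it
 * can differ from the original length. The guard against underflow is a
 * defensive addition. */
size_t actual_write_len(size_t truncated_pkt_len, size_t md_hdr_len,
                        size_t offset)
{
    if (truncated_pkt_len < md_hdr_len + offset)
        return 0;  /* header + offset consume the whole packet */
    return truncated_pkt_len - md_hdr_len - offset;
}
```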
In step 405, the bdev layer reads the completion data from the port of the external storage device, encapsulates the completion data into a network data packet, and stores the network data packet into a transmit queue in the transmit queue.
The method comprises the following steps of transmitting data packets for enqueuing: the bdev layer processes the request of the receiving queue, informs the sending queue of receiving data, reads the data of the corresponding port number, and stores the data in a descriptor list of the sending queue.
As one implementation, after bdev-layer I/O completes, a callback sets the return data pointer into the data structure and dequeues the I/O for the port number.
Description of the enqueue process for sent packets: after bdev I/O completes, the incoming completion callback function is called, the return data address is set to pkt->data, and the state of the network packet is refreshed to the DONE state. The I/Os waiting on that port number are then removed in order, and the packets are passed into the send queue to complete enqueuing; this enqueue process is an iSCSI-type input/output request.
Special cases: storing the data content, the receiving length and the writing length of the finished data into a transmission queue buffer area under the condition that the transmission queue is full or the memory is insufficient; changing the state of the encapsulated network data packet to a waiting state.
That is, if the send queue is full or memory is insufficient, the input/output request is automatically queued: the data content, receive length, and write length are stored in the send queue buffer, the state of the network packet is updated to the READY state, and the packets wait to be enqueued one by one.
The specific enqueuing operation is as follows:
1) The head and tail nodes of the descriptor free list must be acquired, i.e., it is judged whether the vq has enough free space; if the depth of the vq is insufficient, enqueuing of the request must be stopped. During the data-writing process of the send packet, the vq must be kept read-only, reading only the data from the corresponding io_channel, so no event interrupt can be triggered mid-process. After writing completes, the back-end device is notified, the back-end callback is triggered, the state of the network packet is refreshed to AVAIL, the queue state is refreshed, and it is confirmed that the send queue has completed enqueuing the packet.
2) Special cases: at some moments, because the send queue is full, part of the input/output data of the previous cycle cannot be enqueued and can only pile up in queued_cpl. If such I/O exists, the head node of the single-linked tail queue of the bidirectional queue must be changed, i.e., the piled-up I/O is cleaned up preferentially: single network packets are passed, one after another in a loop, to the free linked list of the send queue's vq, the state is updated, and the request is resubmitted to bdev. When an exception occurs, transmission is suddenly interrupted, etc., and the request crashes, the corresponding network packet is stored in the needbdevproc linked list and is likewise scheduled and processed via the single-linked tail queue.
And step 406, reading the network data packet of the sending queue, and storing the network data packet into the shared memory pool through a data channel.
The method comprises the following steps of dequeuing a transmission data packet: and reading the descriptor list of the sending queue, and storing the descriptor list into a shared memory pool through a data channel.
The dequeue process for sent packets: first, an internal buffer is acquired from the used ring table of the send queue, and cnt data packets are fetched from the buffer. Then, the cnt network packets in the not-yet-completed state are processed, their state is refreshed to USED, and the data is placed into shared memory (hugepages) to complete packet acknowledgment and queue update. Finally, the memory is released to finish the dequeue operation.
In step 407, after the bdev layer data transmission is completed, the related resources are destroyed.
When all outstanding requests are processed and the poller no longer has message events, the corresponding context buffer->io_channel is acquired from the poller; the creating thread of the channel then releases its reference to the channel, and once the channel has no references, spdk_io_device_unregister is triggered to destroy all I/O resources.
The destruction process calls back the delete_cb function set in step 102, which includes: destroying the poller and releasing the channel resources; releasing all receive and send queues corresponding to the vdev and unbinding the bound threads; and releasing the shared memory pool corresponding to the queues.
Special case: during destruction, there may be initiated requests in the io_channel whose network packets are unacknowledged, i.e., enqueuing has completed but the send queue has not yet dequeued them. In this case, all such network packets need to be transferred to other io_channels of the vdev; the SPDK framework is responsible for the scheduling mechanism, and no channel needs to be specified. The queue corresponding to the input/output channel is released, the corresponding vq is destroyed, and reinitialization is awaited. The message of transferring the network packets is captured by the poller and fed back to the corresponding thread through an event, so that the remaining network packets can be dequeued normally.
This process avoids packet loss and packet errors and maximizes buffer utilization. It comprises a packet state verification mechanism, an abnormal-state protection mechanism, and a packet truncation mechanism. The state machine model of the packet manages the overall transceiving flow. For emergencies such as insufficient memory and network interruption during transceiving, the abnormal-state protection mechanism can effectively manage abnormal-state packets, achieving zero packet loss and zero packet errors. The packet truncation mechanism can flexibly handle packets of indefinite length and maximize buffer use.
The functionality of this step is illustratively implemented by a queue poll scheduler module.
According to the SPDK-based network I/O processing method of the embodiment of the present application, the driver of the virtual device is extended to support the multi-queue feature and to carry network packets; network I/O transceiving is performed based on the data transmission link and the plurality of transmission queues, realizing efficient user-mode storage-driver transceiving of network I/O. Being compatible with network packet types, the multi-queue mechanism increases the data path width. The virtual device driver improves the native SPDK framework and adds support for network data formats; its specific data structure can carry encapsulated network packets and parse and store the data. Referencing the design of the kernel block-device layer blk-mq, the custom blk-net driver changes the request layer in the storage stack to multiple queues, providing a single CPU with multiple queues for processing data streams, which increases the width of the transmission data path to a certain extent and improves the transmission frequency.
In order to implement the above embodiment, the present application further proposes a network I/O processing device based on SPDK. Fig. 8 is a schematic structural diagram of a network I/O processing device based on SPDK according to the embodiment of the present application. As shown in fig. 8, the SPDK-based network I/O processing device may include: a link establishment module 801, a data processing module 802, and a drive setting module 803.
The link establishment module 801 is configured to establish a mapping relationship between a virtual device and a port of a docking external storage device, and obtain an end-to-end data transmission link; the data structure of the driver of the virtual device supports a multi-queue characteristic and can bear network data packets; the virtual device includes a plurality of transmission queues;
a data processing module 802, configured to perform network I/O transceiving processing based on the data transmission link and the plurality of transmission queues in response to an I/O request.
In some implementations, the apparatus further includes a drive setting module 803 to:
responding to a first remote procedure call request, creating, in the SPDK framework, a virtual device bound to the identified hot-plug device, and loading the driver of the virtual device; the driver of the virtual device comprises an upper-layer virtio driver and an underlying bdev driver, and the data structure of the upper-layer virtio driver supports the multi-queue feature and can carry network packets.
In some implementations, the driver setting module 803 is specifically configured to:
responding to a first remote procedure call request, matching the identified hot plug equipment, and initializing the hot plug equipment;
creating a virtio device matched with the hot-plug device through the virtio bus, reading the description structure of the virtio device, and initializing the virtio device; the description structure of the virtio device comprises a port for docking an external storage device and a poller;
the characteristics of the front-end drive are obtained after updating through the characteristic negotiation of the front-end drive and the back-end equipment; defining the characteristics of the upper-layer virtio driver based on the updated front-end driving characteristics; and loading the upper-layer virtio driver to the virtio device, starting the virtio device and distributing a transmission queue for the virtio device.
In some implementations, the driver settings module 803 after starting the virtio device and allocating a transmission queue for the virtio device; also used for:
registering the virtio device as an I/O device, registering the I/O device into a bdev subsystem to form the bdev device, and configuring a data channel of the I/O device.
In some implementations, the link establishment module 801 is specifically configured to:
adding a port for interfacing with an external storage device based on a host physical function;
and responding to a second remote procedure call request, and establishing a mapping relation between the virtual equipment and a port of the external storage equipment in a butt joint way through the data structure of the underlying bdev drive.
In some implementations, the data processing module 802 is specifically configured to:
receiving an I/O request from a shared memory pool; based on the I/O request, acquiring a plurality of network data packets from the shared memory pool and storing the network data packets into a receiving queue in the transmission queue; wherein the network data packet includes a metadata header;
analyzing the network data packet in the receiving queue to obtain receiving data;
and transmitting the received data to a bdev layer so that the bdev layer transmits the received data to the external storage device.
In some implementations, the data processing module 802 is further configured to:
the bdev layer reads the completion data from the port of the docking external storage device, encapsulates the completion data into a network data packet and stores the network data packet into a transmission queue in the transmission queue;
and reading the network data packet of the sending queue, and storing the network data packet into the shared memory pool through a data channel.
In some implementations, the data processing module 802 reads the completion data from the port of the docking external storage device at the bdev layer, encapsulates the completion data into a network data packet, and stores the network data packet in a transmit queue; for the purpose of:
storing the data content, the receiving length and the writing length of the finished data into a transmission queue buffer area under the condition that the transmission queue is full or the memory is insufficient;
changing the state of the encapsulated network data packet to a waiting state.
In some implementations, the data processing module 802 is further configured to:
registering a poller and triggering an event; and processing polling scheduling of the transmission queue based on the poller by dynamically setting the transmission queue.
It should be noted that the foregoing explanation of the network I/O processing method embodiment based on the SPDK is also applicable to the network I/O processing device based on the SPDK of this embodiment, and will not be repeated herein.
According to the network I/O processing method based on the SPDK, the drive of the virtual device is expanded, the multi-queue characteristic is supported, and network data packets can be borne; and based on the data transmission link and the plurality of transmission queues, network I/O transceiving processing is performed, so that the transceiving processing of the network I/O by the efficient user mode storage drive is realized.
In order to achieve the above embodiments, the present application further proposes an electronic device including: a processor, and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes the computer-executable instructions stored in the memory to implement the methods provided by the previous embodiments.
In order to implement the above-mentioned embodiments, the present application also proposes a computer-readable storage medium in which computer-executable instructions are stored, which when executed by a processor are adapted to implement the methods provided by the foregoing embodiments.
In order to implement the above embodiments, the present application also proposes a computer program product comprising a computer program which, when executed by a processor, implements the method provided by the above embodiments.
The collection, storage, use, processing, transmission, provision, disclosure, and other handling of user personal information involved in this application all comply with the provisions of relevant laws and regulations and do not violate public order and good custom.
It should be noted that personal information from users should be collected for legitimate and reasonable uses and not shared or sold outside of these legitimate uses. In addition, such collection/sharing should be performed after receiving user informed consent, including but not limited to informing the user to read user agreements/user notifications and signing agreements/authorizations including authorization-related user information before the user uses the functionality. In addition, any necessary steps are taken to safeguard and ensure access to such personal information data and to ensure that other persons having access to the personal information data adhere to their privacy policies and procedures.
The present application also contemplates embodiments in which a user may selectively prevent the use of, or access to, personal information data; that is, hardware and/or software may be provided to prevent or block access to such personal information data. Once the personal information data is no longer needed, risk is minimized by limiting data collection and deleting the data. In addition, where applicable, personally identifying information is removed from such data to protect the privacy of the user.
In the foregoing descriptions of embodiments, references to the terms "one embodiment," "some embodiments," "example," "particular example," "some examples," and the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, the different embodiments or examples described in this specification, and the features thereof, may be combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" is at least two, such as two, three, etc., unless explicitly defined otherwise.
Any process or method descriptions in flowcharts or otherwise described herein may be understood as representing modules, segments, or portions of code that include one or more executable instructions for implementing specific logical functions or steps of the process. Moreover, the scope of the preferred embodiments of the present application includes additional implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art of the embodiments of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, for example, an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by, or in connection with, the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random-access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be electronically captured, for instance via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the application.

Claims (12)

1. An SPDK-based network I/O processing method, characterized by comprising:
establishing a mapping relationship between a virtual device and a port interfacing with an external storage device, to obtain an end-to-end data transmission link; wherein a data structure of a driver of the virtual device supports a multi-queue feature and can carry network data packets, and the virtual device comprises a plurality of transmission queues; and
in response to an I/O request, performing network I/O transmit and receive processing based on the data transmission link and the plurality of transmission queues.
2. The method of claim 1, wherein before the establishing of the mapping relationship between the virtual device and the port interfacing with the external storage device, the method further comprises:
in response to a first remote procedure call request, creating, in the SPDK framework, a virtual device bound to an identified hot-plug device, and loading a driver of the virtual device; wherein the driver of the virtual device comprises an upper-layer virtio driver and a lower-layer bdev driver, and a data structure of the upper-layer virtio driver supports the multi-queue feature and can carry network data packets.
3. The method of claim 2, wherein the creating, in the SPDK framework, of the virtual device bound to the identified hot-plug device and the loading of the driver of the virtual device in response to the first remote procedure call request comprise:
in response to the first remote procedure call request, matching the identified hot-plug device and initializing the hot-plug device;
creating, via a virtio bus, a virtio device matching the hot-plug device, reading a description structure of the virtio device, and initializing the virtio device; wherein the description structure of the virtio device comprises a port for interfacing with the external storage device and a poller; and
obtaining updated front-end driver features through feature negotiation between the front-end driver and the back-end device; defining features of the upper-layer virtio driver based on the updated front-end driver features; and loading the upper-layer virtio driver onto the virtio device, starting the virtio device, and allocating transmission queues to the virtio device.
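The feature negotiation step above follows the usual virtio pattern: the front-end driver reads the feature bits offered by the back-end device and intersects them with the bits it supports, and the surviving bits define the loaded driver's features. A minimal model, with illustrative bit names:

```c
/* Minimal sketch of virtio-style feature negotiation. The feature
 * bit names are illustrative, not the virtio specification's. */
#include <stdint.h>

#define F_MULTI_QUEUE (1ULL << 0) /* device supports multiple queues */
#define F_NET_HDR     (1ULL << 1) /* packets carry a metadata header */
#define F_INDIRECT    (1ULL << 2) /* indirect descriptors */

/* Only bits offered by the back-end device AND supported by the
 * front-end driver survive negotiation. */
static uint64_t negotiate_features(uint64_t device_features,
                                   uint64_t driver_features)
{
    return device_features & driver_features;
}
```

The negotiated mask is then what the claim calls the "updated front-end driver features" used to define the upper-layer virtio driver.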
4. The method of claim 3, wherein after the starting of the virtio device and the allocating of the transmission queues to the virtio device, the method further comprises:
registering the virtio device as an I/O device, registering the I/O device into a bdev subsystem to form a bdev device, and configuring a data channel of the I/O device.
5. The method of claim 4, wherein the establishing of the mapping relationship between the virtual device and the port interfacing with the external storage device comprises:
adding the port interfacing with the external storage device based on a host physical function; and
in response to a second remote procedure call request, establishing the mapping relationship between the virtual device and the port interfacing with the external storage device through the data structure of the lower-layer bdev driver.
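The device-to-port mapping of claim 5 can be modeled as a name-keyed table held in the lower-layer bdev driver's data structure, filled in by the handler of a second RPC. The structure and the RPC name below are assumptions for illustration, not SPDK's actual RPC machinery:

```c
/* Illustrative model of the device-to-port mapping established via a
 * second RPC. Names are hypothetical, not real SPDK internals. */
#include <string.h>

#define MAX_MAPS 16

struct bdev_port_map {
    struct { char dev[32]; int port; } entries[MAX_MAPS];
    unsigned count;
};

/* Handler body for a hypothetical "map_device" RPC: records that the
 * named virtual device is backed by the given storage port. */
static int rpc_map_device(struct bdev_port_map *m,
                          const char *dev_name, int port)
{
    if (m->count >= MAX_MAPS)
        return -1;
    strncpy(m->entries[m->count].dev, dev_name, sizeof(m->entries[0].dev) - 1);
    m->entries[m->count].dev[sizeof(m->entries[0].dev) - 1] = '\0';
    m->entries[m->count].port = port;
    m->count++;
    return 0;
}

/* Resolves a device name to its mapped port, or -1 if unmapped. */
static int lookup_port(const struct bdev_port_map *m, const char *dev_name)
{
    for (unsigned i = 0; i < m->count; i++)
        if (strcmp(m->entries[i].dev, dev_name) == 0)
            return m->entries[i].port;
    return -1;
}
```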
6. The method of claim 1, wherein the performing of network I/O transmit and receive processing based on the data transmission link and the plurality of transmission queues in response to the I/O request comprises:
receiving an I/O request from a shared memory pool; and, based on the I/O request, acquiring a plurality of network data packets from the shared memory pool and storing them in a receive queue among the plurality of transmission queues; wherein each network data packet includes a metadata header;
parsing the network data packets in the receive queue to obtain received data; and
transmitting the received data to a bdev layer so that the bdev layer transmits the received data to the external storage device.
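The parsing step of claim 6 amounts to stripping the metadata header from each packet in the receive queue and handing the remaining payload to the bdev layer. A sketch, with a header layout assumed purely for illustration:

```c
/* Sketch of the receive path: a packet from the receive queue starts
 * with a metadata header; parsing validates it, strips it, and yields
 * the received data. The header layout is an illustrative assumption. */
#include <stdint.h>
#include <string.h>

struct meta_hdr {
    uint32_t payload_len; /* length of the data that follows */
    uint16_t flags;
};

/* Parses one packet buffer; copies the payload into 'out' and returns
 * its length, or -1 if the buffer is malformed or too large. */
static int parse_net_pkt(const uint8_t *pkt, size_t pkt_len,
                         uint8_t *out, size_t out_cap)
{
    struct meta_hdr h;
    if (pkt_len < sizeof(h))
        return -1;
    memcpy(&h, pkt, sizeof(h));
    if (h.payload_len > pkt_len - sizeof(h) || h.payload_len > out_cap)
        return -1;
    memcpy(out, pkt + sizeof(h), h.payload_len);
    return (int)h.payload_len;
}
```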
7. The method of claim 6, wherein the performing of network I/O transmit and receive processing based on the data transmission link and the plurality of transmission queues in response to the I/O request further comprises:
reading, by the bdev layer, completion data from the port interfacing with the external storage device, encapsulating the completion data into network data packets, and storing them in a send queue among the plurality of transmission queues; and
reading the network data packets from the send queue and storing them in the shared memory pool through a data channel.
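The encapsulation step of claim 7 is the mirror image of the receive path: completion data read from the storage port is given a metadata header and placed in the send queue. A ring-buffer sketch with illustrative sizes and field names:

```c
/* Sketch of the send path: completion data read from the storage port
 * is encapsulated with a metadata header and stored in a send queue
 * modeled as a ring buffer. Sizes and layout are illustrative. */
#include <stdint.h>
#include <string.h>

#define RING_DEPTH 8
#define SLOT_SIZE  512

struct meta_hdr { uint32_t payload_len; uint16_t flags; };

struct send_queue {
    struct { struct meta_hdr hdr; uint8_t data[SLOT_SIZE]; } ring[RING_DEPTH];
    unsigned head, tail;
};

/* Encapsulates one completion into a network packet and stores it in
 * the send queue; returns -1 when the ring is full or data too big. */
static int enqueue_completion(struct send_queue *q,
                              const uint8_t *data, uint32_t len)
{
    unsigned next = (q->head + 1) % RING_DEPTH;
    if (next == q->tail || len > SLOT_SIZE)
        return -1;
    q->ring[q->head].hdr.payload_len = len;
    q->ring[q->head].hdr.flags = 0;
    memcpy(q->ring[q->head].data, data, len);
    q->head = next;
    return 0;
}
```

A consumer on the other end would pop packets from `tail` and copy them into the shared memory pool through the data channel.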
8. The method of claim 7, wherein the reading, by the bdev layer, of the completion data from the port interfacing with the external storage device, the encapsulating of the completion data into network data packets, and the storing of them in the send queue comprise:
when the send queue is full or memory is insufficient, storing the data content, the receive length, and the write length of the completion data in a send queue buffer area; and
changing the state of the encapsulated network data packets to a waiting state.
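Claim 8's back-pressure handling can be sketched as follows: when the send queue is full, the completion's data content, receive length, and write length are parked in a buffer area and the packet is marked as waiting until a slot frees up. All names and the retry contract are illustrative assumptions:

```c
/* Sketch of back-pressure handling when the send queue is full or
 * memory is low. Names and layout are illustrative. */
#include <stdint.h>

#define PENDING_CAP 32

enum pkt_state { PKT_READY, PKT_WAITING };

struct pending_entry {
    const void *data;    /* completion data content */
    uint32_t recv_len;   /* receive length */
    uint32_t write_len;  /* write length */
    enum pkt_state state;
};

struct pending_buf {
    struct pending_entry entries[PENDING_CAP];
    unsigned count;
};

/* Returns 1 if the caller may enqueue directly, 0 if the entry was
 * parked in the waiting state, -1 if even the buffer area is full. */
static int park_if_full(struct pending_buf *b, int queue_full,
                        const void *data, uint32_t recv_len,
                        uint32_t write_len)
{
    if (!queue_full)
        return 1;
    if (b->count >= PENDING_CAP)
        return -1;
    struct pending_entry *e = &b->entries[b->count++];
    e->data = data;
    e->recv_len = recv_len;
    e->write_len = write_len;
    e->state = PKT_WAITING;
    return 0;
}
```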
9. The method of claim 6, wherein before the performing of network I/O transmit and receive processing based on the data transmission link and the plurality of transmission queues in response to the I/O request, the method further comprises:
registering a poller device and triggering an event; and processing polling scheduling of the transmission queues based on the poller by dynamically setting the transmission queues.
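The poller of claim 9 repeatedly services the dynamically registered transmission queues, much as an SPDK poller callback would drain virtqueues on each reactor iteration. The scheduler below is a simplified round-robin stand-in, not SPDK's implementation:

```c
/* Simplified round-robin poller over dynamically registered queues.
 * A stand-in for SPDK's poller mechanism; names are illustrative. */
#include <stddef.h>

#define MAX_POLLED 8

typedef int (*drain_fn)(void *q); /* returns packets processed */

struct poller {
    struct { drain_fn fn; void *q; } slots[MAX_POLLED];
    unsigned n, next;
};

/* Dynamically registers a queue (with its drain callback). */
static int poller_add(struct poller *p, drain_fn fn, void *q)
{
    if (p->n >= MAX_POLLED)
        return -1;
    p->slots[p->n].fn = fn;
    p->slots[p->n].q = q;
    p->n++;
    return 0;
}

/* One poll iteration: visit every registered queue once, rotating the
 * start index so no queue is starved. Returns packets processed. */
static int poller_run_once(struct poller *p)
{
    int total = 0;
    for (unsigned i = 0; i < p->n; i++) {
        unsigned idx = (p->next + i) % p->n;
        total += p->slots[idx].fn(p->slots[idx].q);
    }
    p->next = (p->next + 1) % (p->n ? p->n : 1);
    return total;
}

/* Example drain callback: pretend one packet was processed. */
static int drain_one(void *q) { (void)q; return 1; }
```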
10. An SPDK-based network I/O processing apparatus, characterized by comprising:
a link establishment module, configured to establish a mapping relationship between a virtual device and a port interfacing with an external storage device, to obtain an end-to-end data transmission link; wherein a data structure of a driver of the virtual device supports a multi-queue feature and can carry network data packets, and the virtual device comprises a plurality of transmission queues; and
a data processing module, configured to perform, in response to an I/O request, network I/O transmit and receive processing based on the data transmission link and the plurality of transmission queues.
11. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions stored in the memory to implement the method of any one of claims 1-9.
12. A computer-readable storage medium having computer-executable instructions stored therein, wherein the computer-executable instructions, when executed by a processor, implement the method of any one of claims 1-9.
CN202311685964.XA 2023-12-08 2023-12-08 Network I/O processing method and device based on SPDK Pending CN117834561A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311685964.XA CN117834561A (en) 2023-12-08 2023-12-08 Network I/O processing method and device based on SPDK


Publications (1)

Publication Number Publication Date
CN117834561A true CN117834561A (en) 2024-04-05

Family

ID=90514439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311685964.XA Pending CN117834561A (en) 2023-12-08 2023-12-08 Network I/O processing method and device based on SPDK

Country Status (1)

Country Link
CN (1) CN117834561A (en)

Similar Documents

Publication Publication Date Title
US11704059B2 (en) Remote direct attached multiple storage function storage device
US10055264B2 (en) Reception according to a data transfer protocol of data directed to any of a plurality of destination entities
US7669000B2 (en) Host bus adapter with multiple hosts
EP1884085B1 (en) Packet validation in a virtual network interface architecture
WO2018120986A1 (en) Method for forwarding packet and physical host
CN103827842B (en) Message is write to controller storage space
US20070277179A1 (en) Information Processing Apparatus, Communication Processing Method, And Computer Program
US7234004B2 (en) Method, apparatus and program product for low latency I/O adapter queuing in a computer system
CN107967225B (en) Data transmission method and device, computer readable storage medium and terminal equipment
JP2002222110A (en) Storage system and virtual private volume controlling method
JP2002544620A (en) Event-driven communication interface for logically partitioned computers
US20040252709A1 (en) System having a plurality of threads being allocatable to a send or receive queue
US8930568B1 (en) Method and apparatus for enabling access to storage
US8996774B2 (en) Performing emulated message signaled interrupt handling
JP6492083B2 (en) System and method for managing and supporting a virtual host bus adapter (vHBA) over InfiniBand (IB), and system and method for supporting efficient use of buffers with a single external memory interface
CN106571978B (en) Data packet capturing method and device
US9069592B2 (en) Generic transport layer mechanism for firmware communication
US10795608B2 (en) Computer, communication driver, and communication control method
KR102387922B1 (en) Methods and systems for handling asynchronous event request command in a solid state drive
US20130060963A1 (en) Facilitating routing by selectively aggregating contiguous data units
CN115080479B (en) Transmission method, server, device, bare metal instance and baseboard management controller
CN113821309B (en) Communication method, device, equipment and storage medium between microkernel virtual machines
CN117834561A (en) Network I/O processing method and device based on SPDK
US20050141434A1 (en) Method, system, and program for managing buffers
CN114610678A (en) File access method, storage node and network card

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination