WO2024046188A1 - I/O offloading method, device, system and storage medium in a cloud environment - Google Patents


Info

Publication number
WO2024046188A1
Authority
WO
WIPO (PCT)
Prior art keywords
queue
virtual
instance
component
request
Prior art date
Application number
PCT/CN2023/114511
Other languages
English (en)
French (fr)
Inventor
Gong Xiaodong (巩小东)
Original Assignee
Alibaba (China) Co., Ltd. (阿里巴巴(中国)有限公司)
Priority date
Filing date
Publication date
Application filed by Alibaba (China) Co., Ltd.
Publication of WO2024046188A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45579I/O management, e.g. providing access to device drivers or storage

Definitions

  • This application relates to the field of cloud computing technology, and in particular to an I/O offloading method, device, system and storage medium in a cloud environment.
  • I/O offload cards, also known as DPUs (data processing units) or IPUs (infrastructure processing units), can provide I/O computing engines for high-bandwidth, low-latency, data-intensive computing scenarios.
  • In existing solutions, the I/O processing work is completely offloaded to the I/O offload card.
  • As a result, the I/O processing on the I/O offload card is separated from the host CPU's data link. This separated structure means that problems such as data overflow often occur during I/O processing, and processing efficiency is poor.
  • Various aspects of this application provide an I/O offloading method, device, system and storage medium in a cloud environment to improve I/O processing efficiency in a cloud environment.
  • Embodiments of the present application provide an I/O offloading system in a cloud environment, including: a CPU of a host machine and an I/O offloading card plugged into the host machine; the CPU is equipped with a queue component;
  • the queue component is used to provide virtual I/O devices, and device queues corresponding to the virtual I/O devices, for instances on the host, and to use the device queues in the CPU to schedule I/O requests occurring between the instances and the virtual I/O devices;
  • the I/O offload card is used to monitor I/O requests in the device queues, and to transfer data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
  • Embodiments of the present application also provide an I/O offloading method in a cloud environment, which is suitable for the CPU in the host machine.
  • the CPU is equipped with a queue component.
  • the method includes:
  • the device queue is used to schedule I/O requests occurring between the instance and the virtual I/O device, so that the I/O offload card plugged into the host can monitor the I/O requests in the device queue and transfer data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
  • Embodiments of the present application also provide an I/O offloading method in a cloud environment, which is suitable for an I/O offloading card plugged into a host machine.
  • the CPU of the host machine is equipped with a queue component.
  • the method includes:
  • Data related to the monitored I/O request is transferred between the instance and the physical I/O device corresponding to the virtual I/O device.
  • Embodiments of the present application also provide a processor CPU installed in a host machine.
  • the CPU is equipped with a queue component.
  • the CPU is used to execute the one or more computer instructions for:
  • the device queue is used to schedule I/O requests occurring between the instance and the virtual I/O device, so that the I/O offload card plugged into the host can monitor the I/O requests in the device queue and transfer data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
  • Embodiments of the present application also provide an I/O offload card, which is plugged into a host machine.
  • the CPU of the host machine is equipped with a queue component, and the I/O offload card includes a memory and a processor;
  • the memory is used to store one or more computer instructions
  • the processor is coupled to the memory for executing the one or more computer instructions for:
  • Data related to the monitored I/O request is transferred between the instance and the physical I/O device corresponding to the virtual I/O device.
  • Embodiments of the present application also provide a computer-readable storage medium that stores computer instructions.
  • when the computer instructions are executed by one or more processors, the one or more processors are caused to execute the aforementioned I/O offloading method in a cloud environment.
  • a queue component is added to the CPU of the host, and it is innovatively proposed to use the queue component in the I/O processing process.
  • the queue component can be used to provide virtual I/O devices, and device queues corresponding to the virtual I/O devices, for instances on the host; in the CPU, the device queues can be used to schedule the I/O requests occurring between the instance and the virtual I/O device; the original I/O offload card then only needs to be responsible for monitoring I/O requests in the device queue and transferring the data related to the I/O requests.
  • In this way, the device queue used during I/O processing is moved up into the CPU, so that during I/O scheduling it can be linked with the components inside the CPU to obtain the real-time load of each CPU core, allowing I/O scheduling to be performed more reasonably rather than blindly, which can effectively improve I/O processing efficiency.
  • Figure 1 is a schematic structural diagram of an I/O offloading system in a cloud environment provided by an exemplary embodiment of the present application
  • Figure 2 is a schematic structural diagram corresponding to an optional implementation solution of an I/O offloading system provided by an exemplary embodiment of the present application;
  • Figure 3 is a schematic structural diagram of an optional implementation solution of an I/O offloading system provided by an exemplary embodiment of the present application
  • Figure 4 is a logical schematic diagram of an instance creation process provided by an exemplary embodiment of the present application.
  • Figure 5 is a logical schematic diagram of an instance destruction process provided by an exemplary embodiment of the present application.
  • Figure 6 is a schematic flowchart of an I/O offloading method in a cloud environment provided by another exemplary embodiment of the present application.
  • Figure 7 is a schematic flowchart of another I/O offloading method in a cloud environment provided by another exemplary embodiment of the present application.
  • Figure 8 is a schematic structural diagram of a processor CPU provided by another exemplary embodiment of the present application.
  • Figure 9 is a schematic structural diagram of an I/O offload card provided by an exemplary embodiment of the present application.
  • In the embodiments of this application, a queue component is added to the CPU of the host, and it is innovatively proposed to use the queue component in the I/O offloading process. Based on this, the queue component can be used to provide virtual I/O devices, and device queues corresponding to the virtual I/O devices, for instances on the host; in the CPU, the device queues can be used to schedule the I/O requests occurring between the instance and the virtual I/O device; the original I/O offload card then only needs to be responsible for monitoring I/O requests in the device queue and transferring the data related to the I/O requests.
  • In this way, the device queue used during the I/O offloading process is moved up into the CPU, so that during I/O scheduling it can be linked with the components inside the CPU to obtain the real-time load of each CPU core, allowing I/O scheduling to be performed more reasonably rather than blindly, which can effectively improve I/O processing efficiency.
  • Figure 1 is a schematic structural diagram of an I/O offloading system in a cloud environment provided by an exemplary embodiment of the present application. As shown in Figure 1, the system includes: the CPU of the host computer and the I/O offload card plugged into the host computer.
  • the I/O offloading card can be implemented using chips such as DPU and IPU.
  • the host can be a physical machine in the cloud environment. Refer to Figure 1. In the cloud environment, multiple instances can run on a single host.
  • The core technology underlying cloud native is virtualization. Based on virtualization technology, several VCPUs can be virtualized on the host's CPU, and instances can run on the VCPUs.
  • This embodiment proposes adding a queue component to the CPU of the host machine.
  • the queue component can be implemented in hardware or software.
  • For example, hardware such as an application-specific integrated circuit (ASIC) can be used to construct the queue component.
  • this embodiment is not limited to this.
  • the queue component added to the CPU in this embodiment can not only be used to provide the queue function for I/O processing work in this embodiment, but can also be reused in other scenarios to provide the queue function for other scenarios.
  • the queue component can be connected to the internal bus of the CPU.
  • This embodiment does not limit the form in which the queue component is connected to the internal bus of the CPU, and the access method can be selected according to the actual situation.
  • the queue component may be connected to a PCIe controller within the CPU to access the CPU's internal bus.
  • the queue component can be connected to other components of the CPU through the UCIe bus.
  • the queue component can also be connected to the memory controller or optional acceleration module in the CPU to interact with the CPU's memory or acceleration module.
  • the queue component can also be interconnected with the I/O offload card plugged into the host.
  • For example, the queue component can be interconnected with the I/O offload card through a cache-coherent bus protocol such as CXL (Compute Express Link).
  • the queue component added to the CPU can support the linkage between the I/O offload card and the host's CPU, changing the current situation in which the I/O offload card and the CPU data link are separated in the traditional I/O processing solution.
  • the queue component in this embodiment can be used to provide virtual I/O devices and device queues corresponding to the virtual I/O devices for instances on the host.
  • the queue component can perform I/O virtualization on the physical I/O devices that need to perform I/O with the instance, to generate virtual I/O devices corresponding to the physical I/O devices (such as the disk devices and network devices in the figure).
  • the virtualized disk devices may include, but are not limited to, blk devices, etc.
  • the virtualized network devices may include, but are not limited to, net devices, etc.
  • the queue component can use a variety of I/O virtualization solutions to provide virtual I/O devices for instances on the host. The specific solutions will be detailed later.
  • The virtual I/O device here is defined at the operating system (OS) level; that is, what is visible to the host OS is the virtual I/O device, and the virtual I/O device represents the various physical devices involved in the I/O process.
  • the implementation form of the device queue is not limited in this embodiment, and an implementation form such as a first-in-first-out FIFO queue may be used.
  • the device queue can be used to manage I/O requests.
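As an illustrative sketch (not the patented implementation; all names are illustrative), a FIFO device queue managing I/O request descriptors might look like:

```python
from collections import deque

class DeviceQueue:
    """Minimal FIFO device queue sketch: requests are enqueued by the
    virtual I/O device side and drained in arrival order by whichever
    party monitors the queue (a stand-in for the I/O offload card)."""

    def __init__(self, depth=256):
        self.depth = depth
        self._entries = deque()

    def enqueue(self, request):
        if len(self._entries) >= self.depth:
            return False  # queue full: caller must back off and retry
        self._entries.append(request)
        return True

    def dequeue(self):
        # returns the oldest pending request, or None when empty
        return self._entries.popleft() if self._entries else None

q = DeviceQueue(depth=2)
assert q.enqueue({"op": "read", "lba": 0})
assert q.enqueue({"op": "write", "lba": 8})
assert not q.enqueue({"op": "read", "lba": 16})  # full
assert q.dequeue()["op"] == "read"               # FIFO order preserved
```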
  • the I/O offloading process will be explained from the perspective of a single instance, but it should be understood that the host can host multiple instances, and the I/O offloading scheme for each instance is consistent.
  • the queue component can provide several virtual I/O devices for a single instance.
  • In the following, the I/O offloading solution will be described from the perspective of a single virtual I/O device, but it should be understood that the same optimization can be applied to the I/O processing involving the other virtual I/O devices provided for a single instance.
  • I/O requests can occur between instances on the host and the virtual I/O devices provided by the queue component.
  • I/O requests between the instance and the virtual I/O device use the io-uring protocol.
  • By using a common protocol such as io_uring, the need to adapt the I/O offload card when it is linked to different CPU platforms can be avoided.
  • That is, there is no need to consider adapting to the different device queue implementations on different CPU platforms. Of course, this is only a preferred option.
  • The I/O requests occurring between the instance and the virtual I/O device can also adopt other protocols, as long as both parties agree on the protocol in advance.
  • the queue component can use device queues in the CPU to schedule I/O requests that occur between instances and virtual I/O devices.
  • the scheduling algorithm may be a five-tuple hash, a key hash, etc., which is not limited in this embodiment.
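As an illustrative sketch of such a five-tuple hash, CRC32 stands in below for whatever keyed hash a real queue component would use (for example a Toeplitz hash); the function and its signature are assumptions, not taken from the patent:

```python
import socket
import struct
import zlib

def five_tuple_hash(src_ip, dst_ip, src_port, dst_port, proto, n_queues):
    """Map a connection five-tuple onto one of n_queues device queues.

    Packs the tuple into bytes and hashes it, so every packet of the
    same flow is steered to the same queue.
    """
    key = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
           + struct.pack("!HHB", src_port, dst_port, proto))
    return zlib.crc32(key) % n_queues

qa = five_tuple_hash("10.0.0.1", "10.0.0.2", 40000, 443, 6, 8)
qb = five_tuple_hash("10.0.0.1", "10.0.0.2", 40000, 443, 6, 8)
assert qa == qb      # the same flow always lands on the same queue
assert 0 <= qa < 8
```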
  • In this embodiment, the device queue involved in the I/O processing process is moved up into the CPU, and the queue component can be linked with the various components inside the CPU, so that the I/O request scheduling performed by the queue component based on the device queue is no longer blind.
  • Specifically, the real-time load of each CPU core can be used as the basis for scheduling, so that I/O requests can be scheduled more reasonably, especially the read requests issued by the instance to the virtual I/O device, which avoids the data overflow problems caused by unreasonable scheduling of read requests.
  • the queue component is also interconnected with the I/O offload card plugged into the host.
  • the I/O offload card will no longer need to undertake the work of I/O request scheduling. This part of the work will be moved up to the CPU and borne by the queue component. Other tasks of the I/O offload card can be retained.
  • the I/O offload card can be connected to the device queue provided by the queue component for the virtual I/O device and monitor I/O requests in the device queue. Accordingly, the I/O offload card can monitor I/O requests occurring between the instance and the virtual I/O device.
  • the device queue is connected to the virtual I/O device and the I/O offload card respectively.
  • Accordingly, the data link in this embodiment is: the virtual I/O device passes the I/O requests occurring between it and the instance to the device queue, and the device queue passes the I/O requests to the I/O offload card so that the I/O offload card senses the I/O requests.
  • the I/O offload card can also be used to transfer data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
  • the I/O offload card can be used as a middleware between the physical I/O device and the instance on the host, providing data exchange support for both parties.
  • an acceleration module may also be provided in the queue component, and the acceleration module may be used to perform accelerated processing on data related to I/O requests occurring between the instance and the virtual I/O device.
  • the acceleration processing in this embodiment may include, but is not limited to, encryption/decryption, compression, or statistics offloading.
  • an acceleration module can also be provided in the I/O offload card.
  • the acceleration module in it can also be used to accelerate data related to I/O requests that occur between the instance and the virtual I/O device. Based on the acceleration module provided in the queue component:
  • the VF bound to the device queue corresponding to the virtual I/O device can be associated with the specified acceleration module.
  • the memory address allocated by the customer for the virtual I/O device can also be associated with the specified acceleration module. In this way, when reading or writing the memory address, the data will be accelerated by the specified acceleration module.
  • the acceleration module can preprocess the data before the data enters the memory space allocated by the customer.
  • For example, the I/O computing engine of the I/O offload card receives data from the communication component and DMAs the data into the customer's pre-allocated memory space.
  • At this time, the data link in this embodiment becomes: the virtual I/O device passes the I/O requests occurring between it and the instance to the device queue; the device queue triggers the acceleration module associated with the memory address to accelerate the processing of the data related to the I/O request; and the device queue passes the I/O request to the I/O offload card, so that the I/O offload card senses the I/O request and reads the accelerated data from the memory address.
  • This allows the acceleration scope of the acceleration module to cover the entire data path.
  • For example, the encryption/decryption module provided in the queue component can keep data encrypted along the entire path, changing the status quo of traditional solutions, in which data acceleration can only take place at the I/O offload card end of the data transmission path.
  • a queue component is added to the CPU of the host, and it is innovatively proposed to use the queue component in the I/O processing process.
  • the queue component can be used to provide virtual I/O devices, and device queues corresponding to the virtual I/O devices, for instances on the host; in the CPU, the device queues are used to schedule the I/O requests occurring between the instances and the virtual I/O devices; the original I/O offload card only needs to be responsible for monitoring I/O requests in the device queue and transferring the data related to the I/O requests.
  • In this way, the device queue in the I/O processing process is moved up into the CPU, so that during I/O scheduling it can be linked with the components inside the CPU to obtain the real-time load of each CPU core, allowing I/O scheduling to be performed more reasonably rather than blindly, which can effectively improve I/O processing efficiency.
  • FIG. 2 is a schematic structural diagram corresponding to an optional implementation solution of an I/O offloading system provided by an exemplary embodiment of the present application.
  • the device queue provided by the queue component may include a first-layer queue and a second-layer queue.
  • the first-level queue can be connected to the VCPU under the instance
  • the second-level queue can be connected to the physical I/O device through the I/O offload card.
  • For any virtual I/O device, the queue component can associate a first specified number of first-layer queues and a second specified number of second-layer queues with the virtual I/O device; establish the mapping relationship between the first-layer queues and the second-layer queues associated with the virtual I/O device; connect the first-layer queues associated with the virtual I/O device to each VCPU under the instance; and connect the second-layer queues associated with the virtual I/O device, through the I/O offload card, to the physical I/O device corresponding to the virtual I/O device.
  • Figure 3 is a schematic structural diagram of an optional implementation solution of an I/O offloading system provided by an exemplary embodiment of the present application, which shows the connection of the first-layer queue associated with the virtual I/O device to the VCPU under the instance.
  • the queue component can use a variety of I/O virtualization schemes to provide virtual I/O devices to instances on the host.
  • For example, Single Root I/O Virtualization (SR-IOV) can be used to create VF devices in the queue component;
  • the VF devices are used to exchange data with each VCPU of the instance;
  • and the operating system of the instance is used to register each VF device as a virtual I/O device of the specified type.
  • For example, the instance's operating system can register a VF as a disk, network device, etc. based on the VF's PCIe <vendor ID, device ID> pair.
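The type registration described above can be sketched as a lookup keyed by the PCIe <vendor ID, device ID> pair. The ID values and the table below are hypothetical, purely for illustration, and not taken from the patent:

```python
# Hypothetical <vendor ID, device ID> table; real IDs depend on the platform.
DEVICE_TYPE_TABLE = {
    (0x1AB8, 0x0001): "blk",   # assumed: virtual disk VF
    (0x1AB8, 0x0002): "net",   # assumed: virtual network VF
}

def register_vf(vendor_id, device_id):
    """Pick a driver class for a VF from its PCIe <vendor ID, device ID>,
    as a guest OS might during device enumeration."""
    dev_type = DEVICE_TYPE_TABLE.get((vendor_id, device_id))
    if dev_type is None:
        raise ValueError("no driver bound for this VF")
    return dev_type

assert register_vf(0x1AB8, 0x0001) == "blk"
```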
  • the physical I/O devices in this embodiment may include, but are not limited to, high-performance I/O devices such as cloud network cards and cloud disks (such as solid state drives SSD).
  • each SR-IOV device can have a Physical Function (PF), and each PF can have up to 64,000 Virtual Functions (VFs) associated with it.
  • PF can create VF from registers, which are designed with properties dedicated to this purpose.
  • the PCI configuration space of each VF can be accessed through the PF's bus, device and function numbers.
  • Each VF has a PCI memory space that maps its set of registers.
  • the VF device driver operates on the register set to enable its functionality and appears as an actual PCI device. After creating a VF, you can directly assign it to each application in the instance. In this way, the VF device in the queue component can exchange data with each VCPU under the instance.
  • the first-layer queue associated with the virtual I/O device can be bound to the VF device corresponding to the virtual I/O device to exchange data with each VCPU under the instance.
  • In the I/O offload card, the physical I/O device can be simulated to generate a simulated device corresponding to the physical I/O device;
  • based on this, the second-layer queue associated with the virtual I/O device can be bound to the simulated device, so as to connect the second-layer queue associated with the virtual I/O device, through the I/O offload card, to the physical I/O device corresponding to the virtual I/O device.
  • the I/O offload card can use software simulation or hardware simulation to simulate the physical I/O device, which is not limited in this embodiment.
  • In practical applications, physical I/O devices may be deployed in distributed or clustered form, and the I/O offload card simulates the simulated devices on the corresponding distributed system or cluster so that they participate in the I/O processing; eventually, the I/O offload card can accurately pass I/O requests to the physical I/O devices through the simulated devices. That is, the I/O computing engine in the I/O offload card can provide a specified number of simulated devices, each of which is bound to a specified number of second-layer queues in the device queue within the CPU for reading or sending data.
  • Figure 4 is a logical schematic diagram of an instance creation process provided by an exemplary embodiment of the present application.
  • Figure 5 is a logical schematic diagram of an instance destruction process provided by an exemplary embodiment of the present application. The above queue docking scheme will be explained below through the instance creation/destruction process.
  • an exemplary instance creation process may be:
  • 1. The customer initiates an instance creation request through the console or OpenAPI.
  • 2. The console schedules the creation request to the most appropriate host.
  • 3. The instance management agent running on the host creates, on the I/O computing engine of the I/O offload card, the simulated devices corresponding to physical I/O devices such as cloud network cards or cloud disks, and binds them to a specified number of second-layer queues in the device queue.
  • 4. The queue component in the host's CPU creates the required VFs on the device queue, binds them to the specified first-layer queues, and establishes the mapping relationship with the second-layer queues from step 3.
  • 5. The instance management agent on the host creates the instance by calling the hypervisor.
  • an exemplary instance destruction process may be:
  • 1. The customer initiates an instance destruction request through the console or OpenAPI.
  • 2. The console finds the host where the instance is located and issues a deletion command through the instance management agent on the host.
  • 3. The instance management agent deletes, on the I/O computing engine of the I/O offload card, the simulated devices corresponding to the relevant physical I/O devices such as cloud network cards or cloud disks.
  • 4. The queue component in the host's CPU deletes the relevant VFs on the device queue.
  • 5. The instance management agent calls the hypervisor to delete the instance.
  • That is, for each additional physical I/O device that needs to perform I/O with the instance, one more virtual I/O device can be added in Figure 3, together with the chain of entity objects managed by that virtual I/O device: VF device - first-layer queue - second-layer queue - simulated device.
  • the virtual I/O device is generated by virtualizing the physical I/O device, and is mainly used to support instances on the host to discover physical I/O devices.
  • the simulated device in the I/O offload card is generated by simulating the physical I/O device, and is mainly used to simulate the hardware behavior of the physical I/O device.
  • the entity objects participating in the I/O processing process may include: VF device - first layer queue - second layer queue - simulation device - network card - Physical I/O devices.
  • the aforementioned virtual I/O devices represent these entities that participate in the I/O processing process.
  • the simulated device in the I/O offload card represents the last mentioned physical I/O device among these entities.
  • the virtual I/O device and simulated device functions in this embodiment are used to support I/O virtualization under cloud native.
  • The first-layer queue associated with the virtual I/O device can be bound to the VF device to exchange data with each VCPU under the instance; the second-layer queue associated with the virtual I/O device can be bound to the simulated device in the I/O offload card to exchange data with it.
  • The first-layer queues associated with a virtual I/O device can be mapped N:1 to its associated second-layer queues; that is, the number of first-layer queues associated with a virtual I/O device is usually greater than the number of its associated second-layer queues.
  • In this way, a write request initiated by the instance to the virtual I/O device can be scheduled to the appropriate physical I/O device by mapping from the first-layer queue to the second-layer queue, and a read request initiated by the instance to the virtual I/O device can be scheduled to the appropriate VCPU by mapping from the second-layer queue to the first-layer queue.
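The N:1 relationship between the two queue layers might be sketched as follows; a simple modulo fold is assumed here purely for illustration, since the patent does not specify the mapping function:

```python
def build_layer_mapping(n_first, n_second):
    """Fold n_first first-layer queues (one per VCPU) onto n_second
    second-layer queues (toward physical I/O devices): first-layer
    queue i maps to second-layer queue i % n_second."""
    assert n_first >= n_second  # N:1, per the described topology
    return {i: i % n_second for i in range(n_first)}

mapping = build_layer_mapping(n_first=8, n_second=2)
assert mapping[0] == 0 and mapping[1] == 1 and mapping[2] == 0
# every second-layer queue aggregates several first-layer queues
assert len(set(mapping.values())) == 2
```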
  • In an optional implementation, when the instance initiates a read request to the virtual I/O device, the queue component can read the load information of each VCPU under the instance, and schedule the read request, based on the load information, to a first queue among the first-layer queues associated with the virtual I/O device, so that the VCPU connected to the first queue processes the read request.
  • the first queue may be a queue connected to the VCPU with the lowest load. In this way, read requests can be scheduled to the VCPU with the optimal load for processing, thereby improving I/O processing efficiency.
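Scheduling a read to the queue of the least-loaded VCPU, as described above, can be sketched like this. The load metric is reduced to a plain number for illustration; the patent describes deriving real-time load from per-core telemetry:

```python
def schedule_read(vcpu_loads, first_layer_queues, request):
    """Route a read request to the first-layer queue whose VCPU
    currently reports the lowest load; returns the chosen index."""
    target = min(range(len(vcpu_loads)), key=lambda i: vcpu_loads[i])
    first_layer_queues[target].append(request)
    return target

queues = [[], [], [], []]
loads = [0.9, 0.2, 0.7, 0.5]   # e.g. time-slice utilisation per VCPU
chosen = schedule_read(loads, queues, {"op": "read", "lba": 64})
assert chosen == 1             # VCPU 1 has the lowest load
assert queues[1][0]["op"] == "read"
```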
  • the I/O offload card can obtain the response message corresponding to the read request and add the metadata information in the response message to the designated queue in the second layer queue associated with the virtual I/O device.
  • the queue component can schedule the metadata information in the specified queue to the first queue according to the load information.
  • the host can allocate memory space to the instance, and the I/O offload card can transfer data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device based on the memory space.
  • The I/O offload card can write the data part of the response message into the memory space corresponding to the instance and add the memory address where the data part is located to the metadata information; based on this, the queue component can trigger the VCPU connected to the target queue to read the data part of the response message according to the memory address in the metadata information. This completes the read request initiated by the instance to the virtual I/O device.
  • For example, a system call can allocate memory space such as the network's sk_buff.
  • the device queue refers to the memory space through address access, and the I/O offload card can access the memory space through DMA or similar Intel SVM mode.
  • the application in the instance can open virtual I/O devices (disk devices and network devices in Figure 2) based on the io-uring protocol, use io_uring_smp_store_release to submit read requests, and use io_uring_enter to trigger the receiver to fetch data from the device queue.
  • the I/O computing engine of the I/O offload card receives the response message from the communication component (the network card in Figure 2), determines the simulated device to which the message belongs based on the metadata part of the message, and writes the data part of the message into the DMA memory space of the simulated device. During the data writing process, the acceleration module of the I/O offload card can be used for data preprocessing.
  • the I/O computing engine generates the metadata information of the read request in Io-uring format.
  • the metadata information contains the DMA memory address where the data part is written.
  • the I/O computing engine can write the metadata information of the read request into the second-level queue of the device queue in the CPU.
  • the queue scheduler running in the queue component can read the power consumption, time-slice utilization, and PMU counters of each VCPU of the instance, calculate the real-time load of each VCPU in real time, and, combined with configured scheduling policies such as five-tuple hash and secret-key hash, schedule the read request in the second-layer queue to the target queue in the first-layer queue.
  • the operating system of the instance can wake up the application that submitted the read request; the application reads the metadata information of the read request from the target queue of the device queue, obtains from it the memory address where the data part is located, and reads the data from that memory address.
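The load-aware read scheduling described above can be sketched as follows. This is a toy model under stated assumptions: the function names, the overload threshold, and the fall-back-to-least-loaded policy are illustrative, not taken from the patent; only the combination of a five-tuple hash with per-VCPU real-time load comes from the text.

```python
import zlib

def five_tuple_hash(src_ip, dst_ip, src_port, dst_port, proto, n):
    """Stable hash of a connection five-tuple onto n first-layer queues."""
    key = f"{src_ip}:{dst_ip}:{src_port}:{dst_port}:{proto}".encode()
    return zlib.crc32(key) % n

def pick_target_queue(flow, vcpu_loads, overload=0.9):
    """Prefer the hash-designated VCPU queue (keeps a flow on one VCPU for
    cache locality); fall back to the least-loaded VCPU if it is saturated."""
    n = len(vcpu_loads)
    q = five_tuple_hash(*flow, n)
    if vcpu_loads[q] >= overload:
        q = min(range(n), key=lambda i: vcpu_loads[i])
    return q

loads = [0.95, 0.20, 0.55, 0.40]        # real-time load per VCPU (0..1)
flow = ("10.0.0.1", "10.0.0.2", 5000, 80, "tcp")
print(pick_target_queue(flow, loads))   # hash bucket, or the least-loaded VCPU
```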
  • the queue component can add the metadata information of the write request initiated by the instance to the virtual I/O device to the first-layer queue associated with the virtual I/O device, and schedule the metadata information of the write request to the second queue in the second-layer queue associated with the virtual I/O device; the I/O offload card can then use the physical I/O device connected to the second queue to process the write request.
  • the queue component can write the data part corresponding to the write request into the memory space corresponding to the instance and add the memory address where the data part is located to the metadata information; the I/O offload card can read the metadata information from the second queue, obtain the data part of the write request according to the memory address in the metadata information, and send the data part to the physical I/O device connected to the second queue.
  • the application in the instance can open virtual I/O devices (disk devices and network devices in Figure 2) based on the io-uring protocol, use io_uring_smp_store_release to submit write requests, and use io_uring_enter to trigger the receiver to fetch data from the device queue.
  • the data part of the write request is in the memory space allocated for the application, and the metadata information of the write request enters the first layer queue of the device queue.
  • the queue scheduler running in the queue component can, according to the mapping relationship between the first-layer queues and the second-layer queues, the weight of the first-layer queue, and the number of idle entries in the second-layer queue, schedule write requests from the first-layer queue into the designated second-layer queue.
  • the acceleration module in the CPU can be used for data preprocessing, such as data encryption and data compression.
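The write-path scheduling just described can be sketched as a small model. It is a hedged illustration: the function and queue names are hypothetical, and "weight" is interpreted as drain priority, which the patent does not spell out; what is taken from the text is that moves follow the first-to-second-layer mapping and stop when a second-layer queue has no idle entries.

```python
def schedule_writes(first_layer, mapping, weights, second_free):
    """Move write-request metadata from first-layer queues to their mapped
    second-layer queues: higher-weight queues drain first, and a second-layer
    queue accepts entries only while it has idle slots."""
    moved = {q2: [] for q2 in second_free}
    # visit first-layer queues in descending weight order
    for q1 in sorted(first_layer, key=lambda q: -weights[q]):
        q2 = mapping[q1]                      # first -> second layer mapping
        while first_layer[q1] and second_free[q2] > 0:
            moved[q2].append(first_layer[q1].pop(0))
            second_free[q2] -= 1
    return moved

first = {"q1a": ["w1", "w2"], "q1b": ["w3", "w4", "w5"]}
mapping = {"q1a": "q2x", "q1b": "q2x"}        # both map to one backend queue
weights = {"q1a": 10, "q1b": 1}
free = {"q2x": 3}                             # only 3 idle entries downstream
print(schedule_writes(first, mapping, weights, free))
# {'q2x': ['w1', 'w2', 'w3']}  -- the heavier queue drains first
```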
  • the I/O computing engine of the I/O offload card reads the metadata information of the write request from the second-layer queue of the queue component in the CPU to obtain the memory access address and other information of the data part.
  • the I/O computing engine adds metadata headers such as those for virtual network cards and cloud disks, and sends the data submitted by the application to the target host or the back-end storage cluster through the communication component of the I/O offload card.
  • the acceleration module of the I/O offload card can be used for data preprocessing.
  • the queue component can be used to achieve reasonable scheduling of I/O requests occurring between instances and virtual I/O devices.
  • the device queue is moved up into the CPU, which allows the scheduling policy of the device queue in the CPU to be configured with the tools on the instance; it no longer needs to be configured through the console as in the traditional solution.
  • the scheduling-policy solution can increase configuration QPS by several orders of magnitude, solving the problem that traditional solutions offer poor real-time performance and cannot meet the needs of rapid container creation and destruction in serverless scenarios.
  • while the I/O offload card transfers the data related to an I/O request, the encryption/decryption module in the CPU can encrypt the data before it enters the I/O offload card. In this way, the I/O offload card can no longer see the plaintext of the data, which greatly improves the security of user data.
  • FIG. 6 is a schematic flowchart of an I/O offloading method in a cloud environment provided by another exemplary embodiment of the present application. This method can be implemented by the CPU in the host machine in the aforementioned system embodiment, where the CPU is equipped with a queue component. Referring to Figure 6, the method may include:
  • Step 600 Use the queue component to provide virtual I/O devices for instances on the host;
  • Step 601 Use the queue component to configure the corresponding device queue for the virtual I/O device;
  • Step 602 In the CPU, use the device queue to schedule I/O requests that occur between the instance and the virtual I/O device, so that the I/O offload card plugged into the host can monitor the I/O requests in the device queue and transfer data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
  • an application-specific integrated circuit is used to construct the queue component, and the queue component is connected to the internal bus of the CPU.
  • the device queue is connected to the virtual I/O device and the I/O offload card respectively.
  • the virtual I/O device transfers the I/O requests that occur between it and the instance to the device queue, and the device queue passes the I/O requests to the I/O offload card so that the I/O offload card becomes aware of them.
  • the device queue includes a first-layer queue and a second-layer queue
  • the method further includes:
  • the step of scheduling I/O requests occurring between the instance and the virtual I/O device includes:
  • the read request is scheduled to the first queue in the first layer queue associated with the virtual I/O device according to the load information, so that the VCPU connected to the first queue is used to process the read request.
  • the step of scheduling the read request to the first queue in the first layer queue associated with the virtual I/O device according to the load information includes:
  • the metadata information in the designated queue is scheduled to the first queue according to the load information; wherein the I/O offload card obtains the response message corresponding to the read request and adds the metadata information in the response message to the specified queue in the second-layer queue associated with the virtual I/O device.
  • the method further includes:
  • triggering the VCPU connected to the target queue to read the data part of the response message according to the memory address in the metadata information; wherein the I/O offload card writes the data part of the response message into the memory space corresponding to the instance and adds the memory address where the data part is located to the metadata information.
  • the step of using the device queue to schedule I/O requests occurring between the instance and the virtual I/O device includes:
  • the I/O offload card uses the physical I/O device connected to the second queue to process the write request.
  • the method further includes:
  • the I/O offload card reads the metadata information from the second queue, obtains the data part of the write request according to the memory address in the metadata information, and sends the data part to the physical I/O device connected to the second queue.
  • the steps of providing virtual I/O devices for instances on the host include:
  • the steps of performing I/O virtualization on physical I/O devices that need to perform I/O with the instance include:
  • using SR-IOV technology to create VF devices for the physical I/O devices that need to perform I/O with the instance;
  • the VF devices are used to exchange data with each VCPU under the instance;
  • the I/O request occurring between the instance and the virtual I/O device adopts the io-uring protocol.
  • the method further includes:
  • the acceleration module is used to perform acceleration processing on data related to I/O requests that occur between the instance and the virtual I/O device.
  • the acceleration processing includes: one or more of encryption and decryption, compression, and statistical offloading.
  • the acceleration module is bound to the memory address allocated by the host for the virtual I/O device. The virtual I/O device passes the I/O requests that occur between it and the instance to the device queue; the device queue triggers the acceleration module to access the memory address and perform acceleration processing on the data related to the I/O request; the device queue then passes the I/O request to the I/O offload card, so that the I/O offload card senses the I/O request and reads the accelerated data from the memory address.
  • FIG. 7 is a schematic flowchart of another I/O offloading method in a cloud environment provided by another exemplary embodiment of the present application. This method can be implemented by the I/O offload card plugged into the host in the aforementioned system embodiment, where the host's CPU is equipped with a queue component. Referring to Figure 7, the method may include:
  • Step 700 Monitor I/O requests that occur between the instance on the host and the virtual I/O device provided by the queue component for the instance from the device queue in the queue component;
  • Step 701 Obtain data related to the monitored I/O request
  • Step 702 Transfer data related to the monitored I/O request between the instance and the physical I/O device corresponding to the virtual I/O device.
  • an application-specific integrated circuit is used to construct the queue component, and the queue component is connected to the internal bus of the CPU.
  • the device queue includes a first-layer queue and a second-layer queue; the queue component associates a first specified number of first-layer queues and a second specified number of second-layer queues with the virtual I/O device, establishes the mapping relationship between the first-layer queues and the second-layer queues associated with the virtual I/O device, connects the first-layer queues associated with the virtual I/O device to each VCPU under the instance, and connects the second-layer queues associated with the virtual I/O device to the physical I/O device corresponding to the virtual I/O device through the I/O offload card.
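The queue association just described can be sketched as a small model. All names (`DeviceQueue`, the modulo mapping, the VCPU/device labels) are illustrative assumptions; the patent only specifies that a configurable number of first-layer queues face the VCPUs, a configurable number of second-layer queues face the physical device via the offload card, and a mapping links the two layers.

```python
from collections import deque

class DeviceQueue:
    """Two-layer device queue for one virtual I/O device: n1 first-layer
    queues face the instance's VCPUs, n2 second-layer queues face the
    physical device through the I/O offload card."""
    def __init__(self, n1, n2, vcpus, phys_dev):
        self.first = [deque() for _ in range(n1)]
        self.second = [deque() for _ in range(n2)]
        # many-to-one mapping between first-layer and second-layer queues
        self.mapping = {i: i % n2 for i in range(n1)}
        # each first-layer queue is connected to one of the instance's VCPUs
        self.vcpu_of = {i: vcpus[i % len(vcpus)] for i in range(n1)}
        self.phys_dev = phys_dev  # all second-layer queues drain to this device

dq = DeviceQueue(n1=4, n2=2, vcpus=["vcpu0", "vcpu1"], phys_dev="ssd0")
print(dq.mapping)        # {0: 0, 1: 1, 2: 0, 3: 1}
print(dq.vcpu_of)        # {0: 'vcpu0', 1: 'vcpu1', 2: 'vcpu0', 3: 'vcpu1'}
```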
  • the method may further include:
  • the step of transferring data related to the monitored I/O request between the instance and the physical I/O device corresponding to the virtual I/O device may include: writing the data part of the response message into the memory space corresponding to the instance; and adding the memory address where the data part is located to the metadata information, so that the queue component can trigger the VCPU connected to the target queue to read the data part of the response message according to the memory address in the metadata information.
  • the queue component can add the metadata information of the write request initiated by the instance to the virtual I/O device to the first-layer queue associated with the virtual I/O device, and schedule the metadata information of the write request to the second queue in the second-layer queue associated with the virtual I/O device; the method also includes:
  • the step of transferring data related to the monitored I/O request between the instance and the physical I/O device corresponding to the virtual I/O device may include: reading the metadata information from the second queue; obtaining the data part of the write request according to the memory address in the metadata information; and sending the data part to the physical I/O device connected to the second queue;
  • the queue component writes the data part corresponding to the write request into the memory space corresponding to the instance; the memory address where the data part is located is added to the metadata information.
  • the I/O request occurring between the instance and the virtual I/O device adopts the io-uring protocol.
  • FIG. 8 is a schematic structural diagram of a processor CPU provided by another exemplary embodiment of the present application.
  • the CPU is installed in the host machine, and the CPU is equipped with a queue component 80 .
  • the CPU may be used to execute one or more computer instructions to:
  • the device queue is used to schedule I/O requests occurring between the instance and the virtual I/O device, so that the I/O offload card plugged into the host can monitor the I/O requests in the device queue and transfer data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
  • an application specific integrated circuit is used to construct the queue component 80, and the queue component 80 is connected to the internal bus of the CPU.
  • the device queue is connected to the virtual I/O device and the I/O offload card respectively.
  • the virtual I/O device transfers the I/O requests that occur between it and the instance to the device queue, and the device queue passes the I/O requests to the I/O offload card so that the I/O offload card becomes aware of them.
  • the device queue includes a first-level queue and a second-level queue.
  • the queue component 80 can also be used to:
  • the queue component 80 in the process of scheduling I/O requests occurring between the instance and the virtual I/O device, may be used to:
  • the read request is scheduled to the first queue in the first layer queue associated with the virtual I/O device according to the load information, so that the VCPU connected to the first queue is used to process the read request.
  • the queue component 80 in the process of scheduling the read request to the first queue in the first layer queue associated with the virtual I/O device according to the load information, can be used to:
  • the metadata information in the designated queue is scheduled to the first queue according to the load information; wherein the I/O offload card obtains the response message corresponding to the read request and adds the metadata information in the response message to the specified queue in the second-layer queue associated with the virtual I/O device.
  • the queue component 80 can also be used to:
  • triggering the VCPU connected to the target queue to read the data part of the response message according to the memory address in the metadata information; wherein the I/O offload card writes the data part of the response message into the memory space corresponding to the instance and adds the memory address where the data part is located to the metadata information.
  • the queue component 80 in the process of using the device queue to schedule I/O requests occurring between the instance and the virtual I/O device, can be used to:
  • the I/O offload card uses the physical I/O device connected to the second queue to process the write request.
  • the queue component 80 can also be used to:
  • the I/O offload card reads the metadata information from the second queue, obtains the data part of the write request according to the memory address in the metadata information, and sends the data part to the physical I/O device connected to the second queue.
  • the queue component 80 in the process of providing virtual I/O devices for instances on the host, can be used to:
  • the queue component 80 can be used in the process of I/O virtualization of physical I/O devices that need to perform I/O with instances:
  • using SR-IOV technology to create VF devices for the physical I/O devices that need to perform I/O with the instance;
  • the VF devices are used to exchange data with each VCPU under the instance;
  • the I/O request occurring between the instance and the virtual I/O device adopts the io-uring protocol.
  • the queue component 80 may also include an acceleration module, and the queue component 80 may also be used for:
  • the acceleration module is used to perform acceleration processing on data related to I/O requests that occur between the instance and the virtual I/O device.
  • the acceleration processing includes: one or more of encryption and decryption, compression, and statistical offloading.
  • the acceleration module is bound to the memory address allocated by the host for the virtual I/O device. The virtual I/O device passes the I/O requests that occur between it and the instance to the device queue; the device queue triggers the acceleration module to access the memory address and perform acceleration processing on the data related to the I/O request; the device queue then passes the I/O request to the I/O offload card, so that the I/O offload card senses the I/O request and reads the accelerated data from the memory address.
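The full-path encryption property described above (the offload card never sees plaintext) can be illustrated with a toy model. This is a sketch under loud assumptions: the class names are invented, and the XOR cipher is a placeholder for the real encryption/decryption module (which would be hardware, e.g. AES) — it only demonstrates the data-path ordering, not a secure cipher.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder cipher standing in for the queue component's
    encryption module; a real deployment would use hardware crypto."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

class AccelModule:
    """Acceleration module bound to the device's memory address:
    data is preprocessed before it ever reaches the offload card."""
    def __init__(self, key):
        self.key = key
    def preprocess(self, data):
        return xor_cipher(data, self.key)

class OffloadCard:
    """Sees only ciphertext: the plaintext never leaves the CPU side."""
    def __init__(self):
        self.seen = None
    def dma_read(self, memory):
        self.seen = memory
        return memory

accel = AccelModule(key=b"secret")
plaintext = b"user payload"
memory = accel.preprocess(plaintext)   # write path: encrypt at the queue side
card = OffloadCard()
wire = card.dma_read(memory)           # offload card reads accelerated data
assert card.seen != plaintext          # the card never observed the plaintext
assert xor_cipher(wire, b"secret") == plaintext
```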
  • FIG. 9 is a schematic structural diagram of an I/O offload card provided by an exemplary embodiment of the present application.
  • the I/O offload card is plugged into a host, and the CPU of the host is equipped with a queue component.
  • the I/O offload card may include a memory 90 and a processor 91.
  • the memory 90 is used to store one or more computer instructions; the processor 91 is coupled with the memory 90 and is used to execute the one or more computer instructions to:
  • Data related to the monitored I/O requests is transferred between the instance and the physical I/O device corresponding to the virtual I/O device.
  • an application-specific integrated circuit is used to construct the queue component, and the queue component is connected to the internal bus of the CPU.
  • the device queue includes a first-layer queue and a second-layer queue; the queue component associates a first specified number of first-layer queues and a second specified number of second-layer queues with the virtual I/O device, establishes the mapping relationship between the first-layer queues and the second-layer queues associated with the virtual I/O device, connects the first-layer queues associated with the virtual I/O device to each VCPU under the instance, and connects the second-layer queues associated with the virtual I/O device to the physical I/O device corresponding to the virtual I/O device through the I/O offload card.
  • processor 91 can also be used to:
  • the processor 91 may be used to: write the data part of the response message into the memory space corresponding to the instance; and add the memory address where the data part is located to the metadata information, so that the queue component can trigger the VCPU connected to the target queue to read the data part of the response message according to the memory address in the metadata information.
  • the queue component can add the metadata information of the write request initiated by the instance to the virtual I/O device to the first-layer queue associated with the virtual I/O device, and schedule the metadata information of the write request to the second queue in the second-layer queue associated with the virtual I/O device; the processor 91 may also be used to:
  • the processor 91 may be used to: read the metadata information from the second queue; obtain the data part of the write request according to the memory address in the metadata information; and send the data part to the physical I/O device connected to the second queue;
  • the queue component writes the data part corresponding to the write request into the memory space corresponding to the instance; the memory address where the data part is located is added to the metadata information.
  • the I/O request occurring between the instance and the virtual I/O device adopts the io-uring protocol.
  • the I/O offload card also includes other components, such as a communication component 92 and a power supply component 93. Only some components are schematically shown in Figure 9; this does not mean that the I/O offload card includes only the components shown in Figure 9.
  • embodiments of the present application also provide a computer-readable storage medium storing a computer program; when the computer program is executed, the steps in the above method embodiments that can be executed by the CPU or the I/O offload card can be implemented.
  • the memory in Figure 9 above is used to store computer programs, and can be configured to store various other data to support operations on the computing platform. Examples of such data include instructions for any application or method operating on the computing platform, contact data, phonebook data, messages, pictures, videos, etc.
  • Memory can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or magnetic or optical disks.
  • the above-mentioned communication component in Figure 9 is configured to facilitate wired or wireless communication between the device where the communication component is located and other devices.
  • the device where the communication component is located can access wireless networks based on communication standards, such as WiFi, 2G, 3G, 4G/LTE, 5G and other mobile communication networks, or a combination thereof.
  • the communication component receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component further includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • a power component in Figure 9 above provides power to various components of the device where the power supply component is located.
  • a power component may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to the device in which the power component resides.
  • embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-persistent storage in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer-readable media include persistent and non-persistent, removable and non-removable media, which can store information by any method or technology.
  • Information may be computer-readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic tape cassettes, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • computer-readable media does not include transitory media, such as modulated data signals and carrier waves.


Abstract

Embodiments of this application provide an I/O offloading method, device, system, and storage medium in a cloud environment. In the embodiments of this application, a queue component is added to the host CPU, and it is innovatively proposed to use the queue component in the I/O processing workflow. On this basis, the queue component can provide, for an instance on the host, a virtual I/O device and a device queue corresponding to the virtual I/O device; within the CPU, the device queue is used to schedule the I/O requests occurring between the instance and the virtual I/O device; the original I/O offload card is then only responsible for monitoring the I/O requests in the device queue and transferring the data related to those I/O requests. In this way, the device queue in the I/O processing pipeline is moved up into the CPU, so that during I/O scheduling it can interwork with the components inside the CPU and obtain the real-time load of each CPU core, achieving more reasonable I/O scheduling instead of blind scheduling, which can effectively improve I/O processing efficiency.

Description

An I/O offloading method, device, system, and storage medium in a cloud environment
This application claims priority to the Chinese patent application No. 202211060455.3, filed with the China Patent Office on August 30, 2022 and entitled "An I/O offloading method, device, system, and storage medium in a cloud environment", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of cloud computing technology, and in particular to an I/O offloading method, device, system, and storage medium in a cloud environment.
Background
As the marginal cost of CPU performance growth rises sharply, I/O offload cards have emerged. An I/O offload card, also known as a DPU or IPU, can provide an I/O computing engine for high-bandwidth, low-latency, data-intensive computing scenarios.
At present, I/O processing work is completely offloaded to the I/O offload card, and the I/O processing on the offload card is separated from the host CPU's data link. This separated structure causes frequent problems such as data overflow during I/O processing, and the processing efficiency is poor.
Summary
Various aspects of this application provide an I/O offloading method, device, system, and storage medium in a cloud environment, so as to improve the I/O processing efficiency in cloud environments.
An embodiment of this application provides an I/O offloading system in a cloud environment, including a host CPU and an I/O offload card plugged into the host, where a queue component is fitted in the CPU;
the queue component is configured to provide, for an instance on the host, a virtual I/O device and a device queue corresponding to the virtual I/O device, and to use the device queue within the CPU to schedule I/O requests occurring between the instance and the virtual I/O device;
the I/O offload card is configured to monitor the I/O requests in the device queue, and to transfer data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
An embodiment of this application further provides an I/O offloading method in a cloud environment, applicable to a CPU in a host machine, where a queue component is fitted in the CPU. The method includes:
using the queue component to provide a virtual I/O device for an instance on the host;
using the queue component to configure a corresponding device queue for the virtual I/O device;
within the CPU, using the device queue to schedule I/O requests occurring between the instance and the virtual I/O device, so that the I/O offload card plugged into the host can monitor the I/O requests in the device queue and transfer data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
An embodiment of this application further provides an I/O offloading method in a cloud environment, applicable to an I/O offload card plugged into a host machine, where a queue component is fitted in the host's CPU. The method includes:
monitoring, from a device queue in the queue component, I/O requests occurring between an instance on the host and a virtual I/O device provided by the queue component for the instance;
obtaining data related to the monitored I/O requests;
transferring the data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
An embodiment of this application further provides a processor (CPU) installed in a host machine, where a queue component is fitted in the CPU, and the CPU is configured to execute one or more computer instructions to:
use the queue component to provide a virtual I/O device for an instance on the host;
use the queue component to configure a corresponding device queue for the virtual I/O device;
within the CPU, use the device queue to schedule I/O requests occurring between the instance and the virtual I/O device, so that the I/O offload card plugged into the host can monitor the I/O requests in the device queue and transfer data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
An embodiment of this application further provides an I/O offload card plugged into a host machine, where a queue component is fitted in the host's CPU, and the I/O offload card includes a memory and a processor;
the memory is used to store one or more computer instructions;
the processor, coupled with the memory, is used to execute the one or more computer instructions to:
monitor, from a device queue in the queue component, I/O requests occurring between an instance on the host and a virtual I/O device provided by the queue component for the instance;
obtain data related to the monitored I/O requests;
transfer the data related to the monitored I/O requests between the instance and the physical I/O device corresponding to the virtual I/O device.
An embodiment of this application further provides a computer-readable storage medium storing computer instructions which, when executed by one or more processors, cause the one or more processors to perform the aforementioned I/O offloading method in a cloud environment.
In the embodiments of this application, a queue component is added to the host CPU, and it is innovatively proposed to use the queue component in the I/O processing workflow. On this basis, the queue component can provide, for an instance on the host, a virtual I/O device and a device queue corresponding to the virtual I/O device; within the CPU, the device queue is used to schedule the I/O requests occurring between the instance and the virtual I/O device; the original I/O offload card is then only responsible for monitoring the I/O requests in the device queue and transferring the data related to those I/O requests. In this way, the device queue in the I/O processing pipeline is moved up into the CPU, so that during I/O scheduling it can interwork with the components inside the CPU and obtain the real-time load of each CPU core, achieving more reasonable I/O scheduling instead of blind scheduling, which can effectively improve I/O processing efficiency.
Brief Description of the Drawings
The drawings described herein are used to provide a further understanding of this application and constitute a part of this application. The illustrative embodiments of this application and their descriptions are used to explain this application and do not constitute an undue limitation on this application. In the drawings:
FIG. 1 is a schematic structural diagram of an I/O offloading system in a cloud environment provided by an exemplary embodiment of this application;
FIG. 2 is a schematic structural diagram corresponding to an optional implementation of an I/O offloading system provided by an exemplary embodiment of this application;
FIG. 3 is a schematic structural diagram of an optional implementation of an I/O offloading system provided by an exemplary embodiment of this application;
FIG. 4 is a logical schematic diagram of an instance creation process provided by an exemplary embodiment of this application;
FIG. 5 is a logical schematic diagram of an instance destruction process provided by an exemplary embodiment of this application;
FIG. 6 is a schematic flowchart of an I/O offloading method in a cloud environment provided by another exemplary embodiment of this application;
FIG. 7 is a schematic flowchart of another I/O offloading method in a cloud environment provided by another exemplary embodiment of this application;
FIG. 8 is a schematic structural diagram of a processor (CPU) provided by yet another exemplary embodiment of this application;
FIG. 9 is a schematic structural diagram of an I/O offload card provided by yet another exemplary embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be described clearly and completely below in conjunction with specific embodiments of this application and the corresponding drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of this application. Based on the embodiments in this application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of this application.
At present, I/O processing work is completely offloaded to the I/O offload card, and I/O processing efficiency is poor. To this end, in some embodiments of this application, a queue component is added to the host CPU, and it is innovatively proposed to use the queue component in the I/O offloading workflow. On this basis, the queue component can provide, for an instance on the host, a virtual I/O device and a device queue corresponding to the virtual I/O device; within the CPU, the device queue is used to schedule the I/O requests occurring between the instance and the virtual I/O device; the original I/O offload card is then only responsible for monitoring the I/O requests in the device queue and transferring the data related to those I/O requests. In this way, the device queue in the I/O offloading pipeline is moved up into the CPU, so that during I/O scheduling it can interwork with the components inside the CPU and obtain the real-time load of each CPU core, achieving more reasonable I/O scheduling instead of blind scheduling, which can effectively improve I/O processing efficiency.
The technical solutions provided by the embodiments of this application are described in detail below with reference to the drawings.
FIG. 1 is a schematic structural diagram of an I/O offloading system in a cloud environment provided by an exemplary embodiment of this application. As shown in FIG. 1, the system includes a host CPU and an I/O offload card plugged into the host.
The I/O offload card can be implemented with chips such as a DPU or an IPU; of course, this embodiment is not limited thereto. The host can be a physical machine in a cloud environment. Referring to FIG. 1, in a cloud environment, multiple instances can run on a single host. The core technology of cloud native is virtualization; based on virtualization technology, several VCPUs can be virtualized on top of the host CPU, and instances can run on the VCPUs.
This embodiment proposes adding a queue component to the host CPU. The queue component can be implemented in hardware or in software. Optionally, in this embodiment, hardware such as an application-specific integrated circuit (ASIC) can be used to build the queue component; of course, this embodiment is not limited thereto. In addition, besides providing queue functionality for I/O processing as in this embodiment, the queue component added to the CPU can also be reused in other scenarios to provide queue functionality for them.
In this embodiment, the queue component can be connected to the CPU's internal bus. This embodiment does not limit the form in which the queue component accesses the CPU's internal bus; the access method can be chosen according to the actual situation. For example, the queue component can connect to the PCIe controller inside the CPU to access the internal bus. As another example, the queue component can connect to other CPU components over a UCIe bus. In addition, the queue component can also connect to the memory controller or an optional acceleration module inside the CPU, so as to interact with the CPU's memory or acceleration module. On the other side, the queue component can also interconnect with the I/O offload card plugged into the host; optionally, the queue component can interconnect with the I/O offload card over a cache-coherency bus protocol such as CXL. In this way, referring to FIG. 1, the queue component added to the CPU can support interworking between the I/O offload card and the host CPU, changing the status quo of traditional I/O processing solutions in which the I/O offload card is separated from the CPU's data link.
Referring to FIG. 1, the queue component in this embodiment can be used to provide, for an instance on the host, a virtual I/O device and a device queue corresponding to the virtual I/O device. In this embodiment, the queue component can perform I/O virtualization on the physical I/O devices that need to perform I/O with the instance, so as to produce virtual I/O devices corresponding to the physical I/O devices (such as the disk device and network device in the figure). The virtualized disk devices can include, but are not limited to, blk devices, and the virtualized network devices can include, but are not limited to, net devices. The queue component can adopt a variety of I/O virtualization schemes to provide virtual I/O devices for the instances on the host; the specific schemes are detailed later. It should be understood that the virtual I/O device here is defined at the operating-system (OS) level; that is, what is visible to the host's OS is the virtual I/O device, and the virtual I/O device represents the physical entities involved in the I/O process. In addition, this embodiment does not limit the implementation form of the device queue; implementations such as first-in-first-out (FIFO) queues can be used. In this embodiment, the device queue can be used to manage I/O requests. For ease of description, the I/O offloading process is described in this embodiment from the perspective of a single instance, but it should be understood that a host can carry multiple instances, and the I/O offloading scheme is the same for each instance.
It is worth noting that, referring to FIG. 1, in this embodiment the queue component can provide several virtual I/O devices for a single instance. For ease of description, the I/O offloading scheme is described below from the perspective of a single virtual I/O device, but it should be understood that the same optimization can be applied to the I/O processes involving the other virtual I/O devices provided for the instance.
On this basis, I/O requests can occur between an instance on the host and the virtual I/O device that the queue component provides for it. Preferably, the I/O requests occurring between the instance and the virtual I/O device adopt the io-uring protocol. By adopting a general protocol such as io-uring, the problem that the I/O offload card has to adapt to the different device queues implemented by different CPU platforms when interworking with them can be avoided; that is, adaptability no longer needs to be considered. Of course, this is only a preference; the I/O requests occurring between the instance and the virtual I/O device can also adopt other protocols, as long as both sides agree on the protocol in advance. The queue component can use the device queue within the CPU to schedule the I/O requests occurring between the instance and the virtual I/O device. The scheduling algorithm can be five-tuple hash, secret-key hash, and so on; this embodiment does not limit this. In this way, the device queue involved in I/O processing is moved up into the CPU, and the queue component can interwork with the components inside the CPU, so that the I/O request scheduling performed by the queue component on the basis of the device queue is no longer blind; instead, the real-time load of each core in the CPU can be used as the scheduling basis, so that the scheduling of I/O requests, especially the read requests issued by the instance to the virtual I/O device, can be completed more reasonably, which avoids data overflow problems caused by unreasonable scheduling during read requests.
正如前文提及的,队列组件还与插接在宿主机上的I/O卸载卡互联。这样,I/O卸载卡将不再需要承担I/O请求调度的工作,这部分工作上移至CPU内,由队列组件承担。I/O卸载卡的其它工作可保留,I/O卸载卡可对接至队列组件为虚拟I/O设备提供的设备队列中,并监听设备队列中的I/O请求。据此,I/O卸载卡可监听到实例与虚拟I/O设备之间发生的I/O请求。
这样,基于为CPU配置的队列组件,设备队列分别与虚拟I/O设备和I/O卸载卡对接,本实施例中的数据链路为:虚拟I/O设备将其与实例之间发生的I/O请求传递至设备队列,设备队列将I/O请求传递至I/O卸载卡,以使I/O卸载卡感知到I/O请求。
参考图1,I/O卸载卡还可用于在实例与虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。I/O卸载卡可作为物理I/O设备与宿主机上的实例之间的中间件,为双方提供数据交互支持。
另外,本实施例中,队列组件中还可提供加速模块,加速模块可用于对实例与虚拟I/O设备之间发生的I/O请求所相关的数据执行加速处理。本实施例中的加速处理可包括但不限于加解密、压缩或统计卸载等。同样,I/O卸载卡中也可提供加速模块,同样,其内的加速模块也可用于对实例与虚拟I/O设备之间发生的I/O请求所相关的数据执行加速处理。基于队列组件中提供的加速模块:
1、客户可配置指定虚拟I/O设备开启加解密、压缩、及统计卸载等功能,该虚拟I/O设备对应的设备队列绑定的VF可关联至指定加速模块。客户为该虚拟I/O设备分配的内存地址也可关联至该指定的加速模块,这样,读写该内存地址,数据都会经过指定的加速模块进行加速处理。
2、实例中的应用向虚拟I/O设备发起写请求时,数据进入客户分配的内存空间前,加速模块可对数据进行预处理。
3、I/O卸载卡的I/O计算引擎从通信组件收到数据,I/O计算引擎将数据DMA到客户预分配的内存空间。数据进入客户分配的内存空间前,加速模块可对数据进行预处理。
这样,基于队列组件中提供的加速模块,本实施例中的数据链路变为:虚拟I/O设备将其与实例之间发生的I/O请求传递至设备队列;设备队列触发加速模块访问内存地址并对I/O请求所相关的数据进行加速处理;设备队列将I/O请求传递至I/O卸载卡,以使I/O卸载卡感知到I/O请求并从内存地址读取加速处理后的数据。这可使加速模块的加速范围覆盖至数据的全部路径,比如,队列组件提供的加解密模块可保证数据在全路径的加密状态,改变传统方案中只能由I/O卸载卡在数据传输路径的末端进行数据加速的现状。
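"数据在进入设备队列时即经过加速模块处理、I/O卸载卡只能读到处理后数据"这一思路,可用如下Python草图示意(以zlib压缩代替加解密等加速处理,AccelModule、DeviceQueue等类名均为本文假设的抽象):

```python
import zlib

class AccelModule:
    """示意性的加速模块:对经过绑定内存区域的数据做压缩(也可换成加解密)。"""
    def preprocess(self, data: bytes) -> bytes:
        return zlib.compress(data)

class DeviceQueue:
    """示意性的设备队列:I/O请求入队时触发加速模块处理其数据部分。"""
    def __init__(self, accel):
        self.accel = accel
        self.entries = []

    def submit(self, payload: bytes):
        # 数据进入队列(内存空间)前,先经过加速模块预处理
        self.entries.append(self.accel.preprocess(payload))

    def pop_for_offload_card(self) -> bytes:
        # I/O卸载卡读到的已经是加速处理后的数据
        return self.entries.pop(0)

queue = DeviceQueue(AccelModule())
queue.submit(b"hello " * 100)
processed = queue.pop_for_offload_card()
assert processed != b"hello " * 100                # 卸载卡无法看到数据明文
assert zlib.decompress(processed) == b"hello " * 100
```

这正对应"加速范围覆盖数据全路径"的效果:离开CPU之前数据已完成加速处理。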
综上,本实施例中,宿主机的CPU中增配了队列组件,且创新性地提出将队列组件用于I/O的处理过程中。基于此,可通过队列组件为宿主机上的实例提供虚拟I/O设备以及与虚拟I/O设备对应的设备队列;在CPU内,利用设备队列对实例与虚拟I/O设备之间发生的I/O请求进行调度;而原本的I/O卸载卡则只需负责监听设备队列中的I/O请求并传递I/O请求所相关的数据即可。这样,可将I/O处理过程中的设备队列上移到CPU内,从而在I/O调度过程中可与CPU内各部件联动,获取到CPU内各核心的实时负载,从而可更加合理地实现I/O调度,而不再盲目调度,这可有效提高I/O的处理效率。
图2为本申请一示例性实施例提供的一种I/O卸载系统的可选实现方案对应的结构示意图。参考图2,在上述或下述实施例中,队列组件提供的设备队列可包括第一层队列和第二层队列。其中,第一层队列可对接实例下的VCPU,第二层队列可通过I/O卸载卡而对接至物理I/O设备。参考图2,对于为实例提供的其中一个虚拟I/O设备来说,队列组件可为虚拟I/O设备关联第一指定数量的第一层队列和第二指定数量的第二层队列;建立虚拟I/O设备关联的第一层队列和第二层队列之间的映射关系;将虚拟I/O设备关联的第一层队列对接至实例下的各个VCPU;将虚拟I/O设备关联的第二层队列通过I/O卸载卡对接至虚拟I/O设备对应的物理I/O设备。
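上述"为虚拟I/O设备关联两层队列并建立映射关系"的过程,可用如下Python草图示意(队列以字符串表示,队列数量与N:1的映射比例均为假设):

```python
class VirtualIODevice:
    """示意:为一个虚拟I/O设备关联第一层/第二层队列并建立N:1映射。"""
    def __init__(self, name, n_l1, n_l2):
        assert n_l1 >= n_l2
        self.name = name
        self.l1_queues = [f"{name}-L1-{i}" for i in range(n_l1)]
        self.l2_queues = [f"{name}-L2-{j}" for j in range(n_l2)]
        # N:1映射:多个第一层队列映射到同一个第二层队列
        self.l1_to_l2 = {q1: self.l2_queues[i % n_l2]
                         for i, q1 in enumerate(self.l1_queues)}

dev = VirtualIODevice("blk0", n_l1=8, n_l2=2)
assert len(dev.l1_to_l2) == 8
assert set(dev.l1_to_l2.values()) == set(dev.l2_queues)
```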
图3为本申请一示例性实施例提供的一种I/O卸载系统的可选实现方案的结构示意图,其中示出了将虚拟I/O设备关联的第一层队列对接至实例下的VCPU的实现方案以及将虚拟I/O设备关联的第二层队列对接至虚拟I/O设备对应的物理I/O设备的实现方案。
参考图3,正如前文提及的,队列组件可采用多种I/O虚拟化方案为宿主机上的实例提供虚拟I/O设备。在一种示例性的实现方案中:可采用SRIOV(Single Root I/O Virtualization)技术,为需要与实例进行I/O的物理I/O设备创建VF设备,VF设备用于与实例下的各个VCPU进行数据交换;利用实例的操作系统将VF设备注册为指定类型的虚拟I/O设备。例如,实例的操作系统可根据VF的PCIe<vendor ID、device ID>,将VF注册为磁盘、网络设备等。其中,本实施例中的物理I/O设备可包括但不限于云网卡、云盘(例如固态硬盘SSD)等高性能的I/O设备。实际应用中,每个SR-IOV设备都可有一个物理功能(Physical Function,PF),并且每个PF最多可有64000个与其关联的虚拟功能(Virtual Function,VF)。PF可以通过寄存器创建VF,这些寄存器设计有专用于此目的的属性。一旦在PF中启用了SR-IOV,就可以通过PF的总线、设备和功能编号等访问各个VF的PCI配置空间。每个VF都具有一个PCI内存空间,用于映射其寄存器集。VF设备驱动程序对寄存器集进行操作以启用其功能,并且显示为实际存在的PCI设备。创建VF后,可以直接将其指定给实例中的各个应用程序,这样,队列组件中的VF设备可与实例下各个VCPU进行数据交换。
基于此,参考图3,可将虚拟I/O设备关联的第一层队列绑定至虚拟I/O设备对应的VF设备,以与实例下各个VCPU进行数据交换。
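上述"PF创建VF、每个VF拥有用于映射寄存器集的PCI内存空间"的机制,可用如下Python草图来建模(纯数据结构模拟,不涉及真实的PCIe寄存器操作,类名与字段均为本文假设):

```python
class PhysicalFunction:
    """示意性的PF模型:启用SR-IOV后创建若干VF,
    每个VF拥有独立的"PCI内存空间"(此处用bytearray代替)。"""
    MAX_VF = 64000  # 对应上文"每个PF最多可有64000个VF"

    def __init__(self, bus, dev, fn):
        self.bdf = (bus, dev, fn)
        self.vfs = []

    def enable_sriov(self, num_vf):
        assert 0 < num_vf <= self.MAX_VF
        self.vfs = [{"vf_index": i,
                     "pci_memory_space": bytearray(4096)}  # 映射VF的寄存器集
                    for i in range(num_vf)]
        return self.vfs

pf = PhysicalFunction(bus=0x3b, dev=0, fn=0)
vfs = pf.enable_sriov(4)
assert len(vfs) == 4
assert all(len(v["pci_memory_space"]) == 4096 for v in vfs)
```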
继续参考图3,对于I/O卸载卡来说,可对物理I/O设备进行模拟,以产生物理I/O设备对应的模拟设备;基于此,可将虚拟I/O设备关联的第二层队列绑定至模拟设备,以将虚拟I/O设备关联的第二层队列对接至虚拟I/O设备对应的物理I/O设备。其中,I/O卸载卡可采用软件模拟或硬件模拟的方式来实现对物理I/O设备的模拟,本实施例在此不做限定。应当理解的是,在云环境中,物理I/O设备的部署形式可能是分布式或集群式的,I/O卸载卡是在相应的分布式系统或集群之上模拟出模拟设备以参与到I/O处理过程中,最终,I/O卸载卡可通过模拟设备来将I/O请求准确传递至物理I/O设备。也即是,I/O卸载卡中的I/O计算引擎可提供指定数量个模拟设备,每个模拟设备与CPU内的设备队列中的指定数量个第二层队列绑定以读取或发送数据。当然,本实施例中,还可采用其它实现方案来将虚拟I/O设备关联的第二层队列对接至虚拟I/O设备对应的物理I/O设备,而并不限于模拟这一种方案,在此不再穷举。

图4为本申请一示例性实施例提供的一种实例创建过程的逻辑示意图。图5为本申请一示例性实施例提供的一种实例销毁过程的逻辑示意图。以下将通过实例的创建/销毁过程来说明上述的队列对接方案。
参考图4,一种示例性的实例创建过程可以是:
1、客户通过控制台、或openAPI发起实例创建请求。
2、控制台可将创建请求调度到最合适的宿主机。
3、宿主机上可运行实例管控agent,以在I/O卸载卡的I/O计算引擎上创建云网卡或云盘等物理I/O设备对应的模拟设备,并绑定至设备队列中指定数量个第二层队列。
4、宿主机的CPU中的队列组件可在设备队列上创建所需的多个VF,并绑定指定的第一层队列,并建立与步骤3中的第二层队列之间的映射关系。
5、宿主机上的实例管控agent可通过调用Hypervisor,创建实例。
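上述创建流程中步骤3~5的队列对接过程,可用如下Python草图串起来(纯数据结构模拟,模拟设备、VF、队列均以字符串表示,字段名均为本文假设):

```python
def create_instance(spec):
    """示意性地复现上文实例创建流程的步骤3~5。"""
    state = {"emulated": [], "vf": [], "l1": [], "l2": [], "map": {}}
    for dev in spec["devices"]:
        emu = f"emu-{dev}"            # 步骤3:在I/O计算引擎上创建模拟设备
        l2 = f"L2-{dev}"              # 并绑定第二层队列
        vf = f"vf-{dev}"              # 步骤4:在设备队列上创建VF
        l1 = [f"L1-{dev}-{i}" for i in range(spec["l1_per_dev"])]  # 绑定第一层队列
        for q in l1:
            state["map"][q] = l2      # 建立第一层与第二层队列的映射关系
        state["emulated"].append(emu)
        state["vf"].append(vf)
        state["l1"] += l1
        state["l2"].append(l2)
    state["instance"] = f"vm-{spec['name']}"  # 步骤5:调用Hypervisor创建实例
    return state

s = create_instance({"name": "demo", "devices": ["net0", "blk0"], "l1_per_dev": 4})
assert len(s["map"]) == 8 and s["instance"] == "vm-demo"
```

销毁流程可理解为该过程的逆操作:删除模拟设备与VF并解除映射,再通过Hypervisor删除实例。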
参考图5,一种示例性的实例销毁过程可以是:
1、客户通过控制台、或openAPI发起实例销毁请求。
2、控制台可查找到该实例所在宿主机,通过该宿主机上实例管控agent下发删除命令。
3、实例管控agent,可在I/O卸载卡的I/O计算引擎上删除相关的云网卡或云盘等物理I/O设备对应的模拟设备,宿主机的CPU中的队列组件可在设备队列上删除相关的VF。
4、实例管控agent调用Hypervisor,删除实例。
从图4和图5可知,本实施例中,在实例的创建/销毁过程中,将从单实例的粒度来创建所需的多个虚拟I/O设备及多个模拟设备并进行两层队列的对接过程。值得说明的是,在实例正常运行的过程中,如果发生物理I/O设备的新增/减少,则可从单设备的粒度来配置图3中单个虚拟I/O设备所管理的一串实体对象(VF设备-第一层队列-第二层队列-模拟设备),也即是,每增加一个需要与实例进行I/O的物理I/O设备,则可增加图3中的1个虚拟I/O设备及该虚拟I/O设备所管理的一串实体对象。
另外,值得说明的是,本实施例中,虚拟I/O设备是通过对物理I/O设备进行虚拟化而产生的,其主要用于支持宿主机上的实例发现物理I/O设备。而I/O卸载卡中的模拟设备则是通过对物理I/O设备进行模拟而产生的,其主要用于模拟物理I/O设备的硬件行为。而基于全文中提供的各个可选实现方案可知,本实施例中,参与到I/O处理过程中的实体对象可能包括:VF设备-第一层队列-第二层队列-模拟设备-网卡-物理I/O设备。而前述的虚拟I/O设备所代表的则是参与到I/O处理过程中的这些实体。而I/O卸载卡中的模拟设备则代表的是这些实体中最后提到的物理I/O设备。本实施例中的虚拟I/O设备和模拟设备均用于支持云原生下的I/O虚拟化。这样,本实施例中,虚拟I/O设备关联的第一层队列可绑定至VF设备,与实例下的各个VCPU进行数据交换;虚拟I/O设备关联的第二层队列可绑定至I/O卸载卡中的模拟设备并进行数据交换。虚拟I/O设备关联的第一层队列可N:1映射至其关联的第二层队列,也即是,虚拟I/O设备关联的第一层队列的数量通常大于其关联的第二层队列的数量。基于这种双层的队列结构,可通过从第一层队列向第二层队列映射而将实例向虚拟I/O设备发起的写请求调度至合适的物理I/O设备;可通过从第二层队列向第一层队列映射而将实例向虚拟I/O设备发起的读请求调度至合适的VCPU。
以下将对基于双层的队列结构进行调度的过程进行详述。
参考图2,队列组件可在实例向虚拟I/O设备发起读请求的过程中,读取实例下的各个VCPU的负载信息;根据负载信息将读请求调度至虚拟I/O设备关联的第一层队列中的第一队列,以使用第一队列所对接的VCPU处理读请求。可选地,第一队列可以是负载最低的VCPU对接的队列。这样,可将读请求调度至负载最优的VCPU上进行处理,从而可提高I/O处理效率。在此过程中,I/O卸载卡可获取读请求对应的响应报文;将响应报文中的元数据信息加入虚拟I/O设备关联的第二层队列中的指定队列。而队列组件则可根据负载信息将指定队列中的元数据信息调度至第一队列中。另外,宿主机可为实例分配内存空间,I/O卸载卡可基于内存空间来在实例与虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。在一种示例性方案中:I/O卸载卡可将响应报文中的数据部分写入实例对应的内存空间;将数据部分所在的内存地址添加至元数据信息中;基于此,队列组件可触发目标队列对接的VCPU按照元数据信息中的内存地址读取响应报文中的数据部分,从而完成实例向虚拟I/O设备发起的读请求。
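其中,"根据负载信息将读请求调度至第一队列"的核心逻辑,可用如下Python草图示意(负载值与数据结构的表示形式均为本文假设):

```python
def schedule_read(vcpu_load, l1_queues):
    """示意:读取各VCPU的负载信息,把读请求调度到负载最低的
    VCPU所对接的第一层队列。
    vcpu_load形如 {vcpu_id: 负载值},l1_queues形如 {vcpu_id: 队列名}。"""
    target_vcpu = min(vcpu_load, key=vcpu_load.get)  # 选出负载最低的VCPU
    return l1_queues[target_vcpu]

loads = {0: 0.7, 1: 0.2, 2: 0.9}
queues = {0: "L1-0", 1: "L1-1", 2: "L1-2"}
assert schedule_read(loads, queues) == "L1-1"
```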
其中,本实施例中,实例读写虚拟I/O设备时,系统调用可分配类似网络的sk_buf等内存空间。设备队列通过地址访问引用该内存空间,I/O卸载卡可通过DMA、或者类似Intel SVM模式访问该内存空间。
举例来说,一种实际的应用方案可以是:
1、实例中的应用可基于io-uring协议打开虚拟I/O设备(如图2中磁盘设备、网络设备),使用io_uring_smp_store_release提交读请求,使用io_uring_enter触发接收方到设备队列中取数据。
2、I/O卸载卡的I/O计算引擎从通信组件(如图2中的网卡)收到响应报文,基于报文元数据部分,确定报文所属的模拟设备。将报文的数据部分写入该模拟设备的DMA内存空间。数据写入过程中,可以使用I/O卸载卡的加速模块进行数据预处理。
3、I/O计算引擎生成Io-uring格式的读请求的元数据信息,元数据信息中包含数据部分写入的DMA内存地址。I/O计算引擎可将读请求的元数据信息写入到CPU内设备队列的第二层队列中。
4、队列组件中运行的队列调度器,可读取实例各VCPU的功耗、时间片利用率及PMU等,实时统计各VCPU的实时负载,结合设定的五元组hash、secret key hash等调度策略,将第二层队列中的读请求调度到第一层队列中的目标队列。
5、实例的操作系统可唤醒提交读请求的应用,从设备队列的目标队列中读取读请求的元数据信息,从元数据信息中读取数据部分所在的内存地址,从该内存地址读取数据。
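上述读请求处理流程(步骤2~5)可用如下Python草图串起来(内存空间用dict模拟,元数据字段为本文假设,并非真实的io-uring条目格式):

```python
def offload_card_receive(memory, l2_queue, payload):
    """步骤2~3:卸载卡把数据部分DMA到内存空间,并把含内存地址的
    元数据写入第二层队列。"""
    addr = len(memory)                 # 以"下一个空闲槽位"模拟DMA内存地址
    memory[addr] = payload
    l2_queue.append({"dma_addr": addr, "len": len(payload)})

def queue_scheduler(vcpu_load, l2_queue, l1_queues):
    """步骤4:结合各VCPU实时负载,把第二层队列中的读请求调度到目标第一层队列。"""
    target = min(vcpu_load, key=vcpu_load.get)
    l1_queues[target].append(l2_queue.pop(0))
    return target

def vcpu_complete_read(memory, l1_queue):
    """步骤5:VCPU从目标队列取元数据,按其中的内存地址读取数据部分。"""
    meta = l1_queue.pop(0)
    return memory[meta["dma_addr"]]

memory, l2 = {}, []
l1 = {0: [], 1: []}
offload_card_receive(memory, l2, b"response-data")
vcpu = queue_scheduler({0: 0.8, 1: 0.3}, l2, l1)
assert vcpu == 1                       # 读请求被调度到负载更低的VCPU 1
assert vcpu_complete_read(memory, l1[vcpu]) == b"response-data"
```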
参考图2,队列组件可将实例向虚拟I/O设备发起的写请求的元数据信息加入虚拟I/O设备关联的第一层队列中;将写请求的元数据信息调度至虚拟I/O设备关联的第二层队列中的第二队列;而I/O卸载卡,则可利用第二队列对接的物理I/O设备处理写请求。在此过程中,队列组件可将写请求对应的数据部分写入实例对应的内存空间;将数据部分所在的内存地址添加至元数据信息中;I/O卸载卡可从第二队列中读取元数据信息;按照元数据信息中的内存地址获取写请求的数据部分;将数据部分发送至第二队列对接的物理I/O设备中。
举例来说,一种实际的应用方案可以是:
1、实例中的应用可基于io-uring协议打开虚拟I/O设备(如图2中磁盘设备、网络设备),使用io_uring_smp_store_release提交写请求,使用io_uring_enter触发接收方到设备队列中取数据。
2、写请求的数据部分在为应用分配的内存空间中,写请求的元数据信息进入设备队列的第一层队列中。
3、队列组件中运行的队列调度器,可根据第一层队列与第二层队列之间的映射关系、第一层队列的权重及第二层队列的空闲队列条目数量等信息,将第一层队列中的写请求送到指定的第二层队列中。
4、使用CPU内的加速模块进行数据预处理,比如数据加密、数据压缩等。
5、I/O卸载卡的I/O计算引擎从CPU内的队列组件的第二层队列中读取写请求的元数据信息,以获得数据部分的访存地址等信息。
6、I/O计算引擎增加虚拟网卡、云盘等元数据报文头,将应用提交的数据,通过I/O卸载卡的通信组件发送到目标宿主机、或者后端存储集群。数据发送过程中,可以使用I/O卸载卡的加速模块进行数据预处理。
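其中步骤3的调度逻辑(结合映射关系与第二层队列的空闲条目数),可用如下Python草图示意(此处假设一个第一层队列可映射到多个候选第二层队列,数据结构均为本文假设):

```python
def schedule_write(l1_queue, l1_to_l2, l2_free_entries):
    """示意:按第一层到第二层队列的映射关系和第二层队列的空闲条目数,
    把写请求从第一层队列送入候选第二层队列中最空闲的一个。"""
    candidates = l1_to_l2[l1_queue]          # 该L1队列可映射到的L2队列集合
    target = max(candidates, key=lambda q: l2_free_entries[q])
    if l2_free_entries[target] == 0:
        raise RuntimeError("no free entry in second-layer queues")
    l2_free_entries[target] -= 1             # 占用一个空闲条目
    return target

mapping = {"L1-0": ["L2-0", "L2-1"]}
free = {"L2-0": 1, "L2-1": 5}
assert schedule_write("L1-0", mapping, free) == "L2-1"
assert free["L2-1"] == 4
```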
据此,可通过队列组件实现对实例与虚拟I/O设备之间发生的I/O请求的合理调度。设备队列上移到CPU内,可支持在实例上使用工具带内配置CPU内设备队列的调度策略,而不再像传统方案那样需要由控制台进行配置,本实施例提供的配置设备队列中调度策略的方案可使得配置QPS提升若干个数量级,解决了传统方案实时性不高、不能满足容器、serverless等场景下快速创建销毁需求的问题。并且,I/O卸载卡传递I/O请求所相关的数据的过程中,可使用CPU内的加解密模块对数据进行加密后再进入I/O卸载卡,这样,I/O卸载卡将无法再看到数据明文,这使得用户数据的安全性大幅提升。
图6为本申请另一示例性实施例提供的一种云环境下的I/O卸载方法的流程示意图,该方法可由前述系统实施例中的宿主机中的CPU实施,该CPU中装配有队列组件。参考图6,该方法可包括:
步骤600、利用队列组件为宿主机上的实例提供虚拟I/O设备;
步骤601、利用队列组件为虚拟I/O设备配置对应的设备队列;
步骤602、在CPU内,利用设备队列对实例与虚拟I/O设备之间发生的I/O请求进行调度,以供宿主机上插接的I/O卸载卡监听设备队列中的I/O请求并在实例与虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。
在一可选实施例中,采用专用集成电路ASIC构建队列组件,队列组件接入CPU的内部总线。
在一可选实施例中,设备队列分别与虚拟I/O设备和I/O卸载卡对接,虚拟I/O设备将其与实例之间发生的I/O请求传递至设备队列,设备队列将I/O请求传递至I/O卸载卡,以使I/O卸载卡感知到I/O请求。
在一可选实施例中,设备队列包括第一层队列和第二层队列,所述方法还包括:
为虚拟I/O设备关联第一指定数量的第一层队列和第二指定数量的第二层队列;
建立虚拟I/O设备关联的第一层队列和第二层队列之间的映射关系;
将虚拟I/O设备关联的第一层队列对接至实例下的各个VCPU;
将虚拟I/O设备关联的第二层队列通过I/O卸载卡对接至虚拟I/O设备对应的物理I/O设备。
在一可选实施例中,步骤利用设备队列对实例与虚拟I/O设备之间发生的I/O请求进行调度,包括:
在实例向虚拟I/O设备发起读请求的过程中,读取实例下的各个VCPU的负载信息;
根据负载信息将读请求调度至虚拟I/O设备关联的第一层队列中的第一队列,以使用第一队列所对接的VCPU处理读请求。
在一可选实施例中,步骤根据负载信息将读请求调度至虚拟I/O设备关联的第一层队列中的第一队列,包括:
根据负载信息将指定队列中的元数据信息调度至第一队列中;其中,由I/O卸载卡获取读请求对应的响应报文并将响应报文中的元数据信息加入虚拟I/O设备关联的第二层队列中的指定队列。
在一可选实施例中,该方法还包括:
触发目标队列对接的VCPU按照元数据信息中的内存地址读取响应报文中的数据部分;其中,由I/O卸载卡将响应报文中的数据部分写入实例对应的内存空间;将数据部分所在的内存地址添加至元数据信息中。
在一可选实施例中,步骤利用设备队列对实例与虚拟I/O设备之间发生的I/O请求进行调度,包括:
将实例向虚拟I/O设备发起的写请求的元数据信息加入虚拟I/O设备关联的第一层队列中;
将写请求的元数据信息调度至虚拟I/O设备关联的第二层队列中的第二队列;
其中,由I/O卸载卡利用第二队列对接的物理I/O设备处理写请求。
在一可选实施例中,该方法还包括:
将写请求对应的数据部分写入实例对应的内存空间;将数据部分所在的内存地址添加至元数据信息中;
其中,由I/O卸载卡从第二队列中读取元数据信息;按照元数据信息中的内存地址获取写请求的数据部分,并将数据部分发送至第二队列对接的物理I/O设备中。
在一可选实施例中,步骤为宿主机上的实例提供虚拟I/O设备,包括:
对需要与实例进行I/O的物理I/O设备进行I/O虚拟化,以产生物理I/O设备对应的虚拟I/O设备。
在一可选实施例中,步骤对需要与实例进行I/O的物理I/O设备进行I/O虚拟化,包括:
采用SRIOV技术,为需要与实例进行I/O的物理I/O设备创建VF设备,VF设备用于与实例下的各个VCPU进行数据交换;
利用实例的操作系统将VF设备注册为指定类型的虚拟I/O设备。
在一可选实施例中,实例与虚拟I/O设备之间发生的I/O请求采用io-uring协议。
在一可选实施例中,该方法还包括:
利用加速模块对实例与虚拟I/O设备之间发生的I/O请求所相关的数据执行加速处理,加速处理包括:加解密、压缩和统计卸载中的一种或多种处理。
在一可选实施例中,加速模块绑定至宿主机为虚拟I/O设备分配的内存地址,虚拟I/O设备将其与实例之间发生的I/O请求传递至设备队列;设备队列触发加速模块访问内存地址并对I/O请求所相关的数据进行加速处理;设备队列将I/O请求传递至I/O卸载卡,以使I/O卸载卡感知到I/O请求并从内存地址读取加速处理后的数据。
值得说明的是,上述关于I/O卸载方法各实施例中的技术细节,可参考前述的系统实施例中关于CPU的相关描述,为节省篇幅,在此不再赘述,但这不应造成本申请保护范围的损失。
图7为本申请另一示例性实施例提供的另一种云环境下的I/O卸载方法的流程示意图,该方法可由前述系统实施例中插接在宿主机上的I/O卸载卡实施,宿主机的CPU中装配有队列组件。参考图7,该方法可包括:
步骤700、从队列组件中的设备队列中监听宿主机上的实例与队列组件为实例提供的虚拟I/O设备之间发生的I/O请求;
步骤701、获取监听到的I/O请求所相关的数据;
步骤702、在实例与虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。
在一可选实施例中,采用专用集成电路ASIC构建队列组件,队列组件接入CPU的内部总线。
在一可选实施例中,设备队列包括第一层队列和第二层队列,且由队列组件为虚拟I/O设备关联第一指定数量的第一层队列和第二指定数量的第二层队列;建立虚拟I/O设备关联的第一层队列和第二层队列之间的映射关系;将虚拟I/O设备关联的第一层队列对接至实例下的各个VCPU;将虚拟I/O设备关联的第二层队列通过I/O卸载卡对接至虚拟I/O设备对应的物理I/O设备。
在一可选实施例中,该方法还可包括:
对物理I/O设备进行模拟,以产生物理I/O设备对应的模拟设备;
将虚拟I/O设备关联的第二层队列绑定至模拟设备,以将虚拟I/O设备关联的第二层队列对接至物理I/O设备。
在一可选实施例中,该方法还可包括:
获取读请求对应的响应报文;将响应报文中的元数据信息加入虚拟I/O设备关联的第二层队列中的指定队列,以供队列组件根据负载信息将指定队列中的元数据信息调度至第一队列中。
在一可选实施例中,步骤在实例与虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据,可包括:将响应报文中的数据部分写入实例对应的内存空间;将数据部分所在的内存地址添加至元数据信息中,以供队列组件触发目标队列对接的VCPU按照元数据信息中的内存地址读取响应报文中的数据部分。
在一可选实施例中,可由队列组件将实例向虚拟I/O设备发起的写请求的元数据信息加入虚拟I/O设备关联的第一层队列中;将写请求的元数据信息调度至虚拟I/O设备关联的第二层队列中的第二队列;该方法还包括:
利用第二队列对接的物理I/O设备处理写请求。
在一可选实施例中,步骤在实例与虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据,可包括:从第二队列中读取元数据信息;按照元数据信息中的内存地址获取写请求的数据部分;将数据部分发送至第二队列对接的物理I/O设备中;
其中,由队列组件将写请求对应的数据部分写入实例对应的内存空间;将数据部分所在的内存地址添加至元数据信息中。
在一可选实施例中,实例与虚拟I/O设备之间发生的I/O请求采用io-uring协议。
值得说明的是,上述关于I/O卸载方法各实施例中的技术细节,可参考前述的系统实施例中关于I/O卸载卡的相关描述,为节省篇幅,在此不再赘述,但这不应造成本申请保护范围的损失。
另外,在上述实施例及附图中描述的一些流程中,包含了按照特定顺序出现的多个操作,但是应该清楚了解,这些操作可以不按照其在本文中出现的顺序来执行或并行执行,操作的序号如700、701等,仅仅是用于区分开各个不同的操作,序号本身不代表任何的执行顺序。另外,这些流程可以包括更多或更少的操作,并且这些操作可以按顺序执行或并行执行。需要说明的是,本文中的“第一”、“第二”等描述,是用于区分不同的队列层、队列等,不代表先后顺序,也不限定“第一”和“第二”是不同的类型。
图8为本申请又一示例性实施例提供的一种处理器CPU的结构示意图,该CPU安装在宿主机中,该CPU内装配有队列组件80。如图8所示,该CPU可用于执行所述一条或多条计算机指令,以用于:
利用所述队列组件80为所述宿主机上的实例提供虚拟I/O设备;
利用所述队列组件80为所述虚拟I/O设备配置对应的设备队列;
在所述CPU内,利用所述设备队列对所述实例与所述虚拟I/O设备之间发生的I/O请求进行调度,以供所述宿主机上插接的I/O卸载卡监听所述设备队列中的I/O请求并在所述实例与所述虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。
在一可选实施例中,采用专用集成电路ASIC构建队列组件80,队列组件80接入CPU的内部总线。
在一可选实施例中,设备队列分别与虚拟I/O设备和I/O卸载卡对接,虚拟I/O设备将其与实例之间发生的I/O请求传递至设备队列,设备队列将I/O请求传递至I/O卸载卡,以使I/O卸载卡感知到I/O请求。
在一可选实施例中,设备队列包括第一层队列和第二层队列,该队列组件80还可用于:
为虚拟I/O设备关联第一指定数量的第一层队列和第二指定数量的第二层队列;
建立虚拟I/O设备关联的第一层队列和第二层队列之间的映射关系;
将虚拟I/O设备关联的第一层队列对接至实例下的各个VCPU;
将虚拟I/O设备关联的第二层队列通过I/O卸载卡对接至虚拟I/O设备对应的物理I/O设备。
在一可选实施例中,队列组件80在对实例与虚拟I/O设备之间发生的I/O请求进行调度的过程中,可用于:
在实例向虚拟I/O设备发起读请求的过程中,读取实例下的各个VCPU的负载信息;
根据负载信息将读请求调度至虚拟I/O设备关联的第一层队列中的第一队列,以使用第一队列所对接的VCPU处理读请求。
在一可选实施例中,队列组件80在根据负载信息将读请求调度至虚拟I/O设备关联的第一层队列中的第一队列的过程中,可用于:
根据负载信息将指定队列中的元数据信息调度至第一队列中;其中,由I/O卸载卡获取读请求对应的响应报文并将响应报文中的元数据信息加入虚拟I/O设备关联的第二层队列中的指定队列。
在一可选实施例中,队列组件80还可用于:
触发目标队列对接的VCPU按照元数据信息中的内存地址读取响应报文中的数据部分;其中,由I/O卸载卡将响应报文中的数据部分写入实例对应的内存空间;将数据部分所在的内存地址添加至元数据信息中。
在一可选实施例中,队列组件80在利用设备队列对实例与虚拟I/O设备之间发生的I/O请求进行调度的过程中,可用于:
将实例向虚拟I/O设备发起的写请求的元数据信息加入虚拟I/O设备关联的第一层队列中;
将写请求的元数据信息调度至虚拟I/O设备关联的第二层队列中的第二队列;
其中,由I/O卸载卡利用第二队列对接的物理I/O设备处理写请求。
在一可选实施例中,队列组件80还可用于:
将写请求对应的数据部分写入实例对应的内存空间;将数据部分所在的内存地址添加至元数据信息中;
其中,由I/O卸载卡从第二队列中读取元数据信息;按照元数据信息中的内存地址获取写请求的数据部分,并将数据部分发送至第二队列对接的物理I/O设备中。
在一可选实施例中,队列组件80在为宿主机上的实例提供虚拟I/O设备的过程中,可用于:
对需要与实例进行I/O的物理I/O设备进行I/O虚拟化,以产生物理I/O设备对应的虚拟I/O设备。
在一可选实施例中,队列组件80在对需要与实例进行I/O的物理I/O设备进行I/O虚拟化的过程中,可用于:
采用SRIOV技术,为需要与实例进行I/O的物理I/O设备创建VF设备,VF设备用于与实例下的各个VCPU进行数据交换;
利用实例的操作系统将VF设备注册为指定类型的虚拟I/O设备。
在一可选实施例中,实例与虚拟I/O设备之间发生的I/O请求采用io-uring协议。
在一可选实施例中,队列组件80还可包括加速模块,队列组件80还可用于:
利用加速模块对实例与虚拟I/O设备之间发生的I/O请求所相关的数据执行加速处理,加速处理包括:加解密、压缩和统计卸载中的一种或多种处理。
在一可选实施例中,加速模块绑定至宿主机为虚拟I/O设备分配的内存地址,虚拟I/O设备将其与实例之间发生的I/O请求传递至设备队列;设备队列触发加速模块访问内存地址并对I/O请求所相关的数据进行加速处理;设备队列将I/O请求传递至I/O卸载卡,以使I/O卸载卡感知到I/O请求并从内存地址读取加速处理后的数据。
值得说明的是,上述关于CPU各实施例中的技术细节,可参考前述的系统实施例中关于CPU的相关描述,为节省篇幅,在此不再赘述,但这不应造成本申请保护范围的损失。
图9为本申请又一示例性实施例提供的一种I/O卸载卡的结构示意图,该I/O卸载卡插接于宿主机上,宿主机的CPU内装配有队列组件。参考图9,该I/O卸载卡可包括存储器90和处理器91,存储器90用于存储一条或多条计算机指令;处理器91与存储器90耦合,用于执行一条或多条计算机指令,以用于:
从队列组件中的设备队列中监听宿主机上的实例与队列组件为实例提供的虚拟I/O设备之间发生的I/O请求;
获取监听到的I/O请求所相关的数据;
在实例与虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。
在一可选实施例中,采用专用集成电路ASIC构建队列组件,队列组件接入CPU的内部总线。
在一可选实施例中,设备队列包括第一层队列和第二层队列,且由队列组件为虚拟I/O设备关联第一指定数量的第一层队列和第二指定数量的第二层队列;建立虚拟I/O设备关联的第一层队列和第二层队列之间的映射关系;将虚拟I/O设备关联的第一层队列对接至实例下的各个VCPU;将虚拟I/O设备关联的第二层队列通过I/O卸载卡对接至虚拟I/O设备对应的物理I/O设备。
在一可选实施例中,处理器91还可用于:
对物理I/O设备进行模拟,以产生物理I/O设备对应的模拟设备;
将虚拟I/O设备关联的第二层队列绑定至模拟设备,以将虚拟I/O设备关联的第二层队列对接至物理I/O设备。
在一可选实施例中,处理器91还可用于:
获取读请求对应的响应报文;将响应报文中的元数据信息加入虚拟I/O设备关联的第二层队列中的指定队列,以供队列组件根据负载信息将指定队列中的元数据信息调度至第一队列中。
在一可选实施例中,处理器91在实例与虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据的过程中,可用于:将响应报文中的数据部分写入实例对应的内存空间;将数据部分所在的内存地址添加至元数据信息中,以供队列组件触发目标队列对接的VCPU按照元数据信息中的内存地址读取响应报文中的数据部分。
在一可选实施例中,可由队列组件将实例向虚拟I/O设备发起的写请求的元数据信息加入虚拟I/O设备关联的第一层队列中;将写请求的元数据信息调度至虚拟I/O设备关联的第二层队列中的第二队列;处理器91还可用于:
利用第二队列对接的物理I/O设备处理写请求。
在一可选实施例中,处理器91在实例与虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据的过程中,可用于:从第二队列中读取元数据信息;按照元数据信息中的内存地址获取写请求的数据部分;将数据部分发送至第二队列对接的物理I/O设备中;
其中,由队列组件将写请求对应的数据部分写入实例对应的内存空间;将数据部分所在的内存地址添加至元数据信息中。
在一可选实施例中,实例与虚拟I/O设备之间发生的I/O请求采用io-uring协议。
进一步,如图9所示,该I/O卸载卡还包括:通信组件92、电源组件93等其它组件。图9中仅示意性给出部分组件,并不意味着I/O卸载卡只包括图9所示组件。
值得说明的是,上述关于I/O卸载卡各实施例中的技术细节,可参考前述的系统实施例中关于I/O卸载卡的相关描述,为节省篇幅,在此不再赘述,但这不应造成本申请保护范围的损失。
相应地,本申请实施例还提供一种存储有计算机程序的计算机可读存储介质,计算机程序被执行时能够实现上述方法实施例中可由CPU或I/O卸载卡执行的各步骤。
上述图9中的存储器,用于存储计算机程序,并可被配置为存储其它各种数据以支持在计算平台上的操作。这些数据的示例包括用于在计算平台上操作的任何应用程序或方法的指令,联系人数据,电话簿数据,消息,图片,视频等。存储器可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM),电可擦除可编程只读存储器(EEPROM),可擦除可编程只读存储器(EPROM),可编程只读存储器(PROM),只读存储器(ROM),磁存储器,快闪存储器,磁盘或光盘。
上述图9中的通信组件,被配置为便于通信组件所在设备和其他设备之间有线或无线方式的通信。通信组件所在设备可以接入基于通信标准的无线网络,如WIFI、2G、3G、4G/LTE、5G等移动通信网络,或它们的组合。在一个示例性实施例中,通信组件经由广播信道接收来自外部广播管理系统的广播信号或广播相关信息。在一个示例性实施例中,所述通信组件还包括近场通信(NFC)模块,以促进短程通信。例如,NFC模块可基于射频识别(RFID)技术、红外数据协会(IrDA)技术、超宽带(UWB)技术、蓝牙(BT)技术和其他技术来实现。
上述图9中的电源组件,为电源组件所在设备的各种组件提供电力。电源组件可以包括电源管理系统,一个或多个电源,及其他与为电源组件所在设备生成、管理和分配电力相关联的组件。
本领域内的技术人员应明白,本申请的实施例可提供为方法、系统、或计算机程序产品。因此,本申请可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。
本申请是参照根据本申请实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
在一个典型的配置中,计算设备包括一个或多个处理器(CPU)、输入/输出接口、网络接口和内存。
内存可能包括计算机可读介质中的非永久性存储器,随机存取存储器(RAM)和/或非易失性内存等形式,如只读存储器(ROM)或闪存(flash RAM)。内存是计算机可读介质的示例。
计算机可读介质包括永久性和非永久性、可移动和非可移动媒体,可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带、磁带式磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。按照本文中的界定,计算机可读介质不包括暂存电脑可读媒体(transitory media),如调制的数据信号和载波。
还需要说明的是,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、商品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、商品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、商品或者设备中还存在另外的相同要素。
以上所述仅为本申请的实施例而已,并不用于限制本申请。对于本领域技术人员来说,本申请可以有各种更改和变化。凡在本申请的精神和原理之内所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (20)

  1. 一种云环境下的I/O卸载系统,包括:宿主机的CPU以及插接在所述宿主机上的I/O卸载卡,所述CPU内装配有队列组件;
    所述队列组件,用于为所述宿主机上的实例提供虚拟I/O设备以及与所述虚拟I/O设备对应的设备队列;在所述CPU内,利用所述设备队列对所述实例与所述虚拟I/O设备之间发生的I/O请求进行调度;
    所述I/O卸载卡,用于监听所述设备队列中的I/O请求;在所述实例与所述虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。
  2. 根据权利要求1所述的系统,采用专用集成电路ASIC构建所述队列组件,所述队列组件接入所述CPU的内部总线。
  3. 根据权利要求1所述的系统,所述设备队列分别与所述虚拟I/O设备和所述I/O卸载卡对接,所述虚拟I/O设备将其与所述实例之间发生的I/O请求传递至所述设备队列,所述设备队列将所述I/O请求传递至所述I/O卸载卡,以使所述I/O卸载卡感知到所述I/O请求。
  4. 根据权利要求1所述的系统,所述设备队列包括第一层队列和第二层队列,所述队列组件还用于:
    为所述虚拟I/O设备关联第一指定数量的第一层队列和第二指定数量的第二层队列;
    建立所述虚拟I/O设备关联的第一层队列和第二层队列之间的映射关系;
    将所述虚拟I/O设备关联的第一层队列对接至所述实例下的各个VCPU;
    将所述虚拟I/O设备关联的第二层队列通过I/O卸载卡对接至所述虚拟I/O设备对应的物理I/O设备。
  5. 根据权利要求4所述的系统,所述I/O卸载卡,还用于:
    对所述物理I/O设备进行模拟,以产生所述物理I/O设备对应的模拟设备;
    将所述虚拟I/O设备关联的第二层队列绑定至所述模拟设备,以将所述虚拟I/O设备关联的第二层队列对接至所述物理I/O设备。
  6. 根据权利要求4所述的系统,所述队列组件在利用所述设备队列对所述实例与所述虚拟I/O设备之间发生的I/O请求进行调度的过程中,用于:
    在所述实例向所述虚拟I/O设备发起读请求的过程中,读取所述实例下的各个VCPU的负载信息;
    根据所述负载信息将所述读请求调度至所述虚拟I/O设备关联的第一层队列中的第一队列,以使用所述第一队列所对接的VCPU处理所述读请求。
  7. 根据权利要求6所述的系统,所述I/O卸载卡,还用于:获取所述读请求对应的响应报文;将所述响应报文中的元数据信息加入所述虚拟I/O设备关联的第二层队列中的指定队列;
    所述队列组件在根据所述负载信息将所述读请求调度至所述虚拟I/O设备关联的第一层队列中的第一队列的过程中,用于:根据所述负载信息将所述指定队列中的所述元数据信息调度至所述第一队列中。
  8. 根据权利要求7所述的系统,所述I/O卸载卡在所述实例与所述虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据的过程中,用于:将所述响应报文中的数据部分写入所述实例对应的内存空间;将所述数据部分所在的内存地址添加至所述元数据信息中;
    所述队列组件,还用于:触发目标队列对接的VCPU按照所述元数据信息中的内存地址读取所述响应报文中的数据部分。
  9. 根据权利要求4所述的系统,所述队列组件在利用所述设备队列对所述实例与所述虚拟I/O设备之间发生的I/O请求进行调度过程中,用于:将所述实例向所述虚拟I/O设备发起的写请求的元数据信息加入所述虚拟I/O设备关联的第一层队列中;将所述写请求的元数据信息调度至所述虚拟I/O设备关联的第二层队列中的第二队列;
    所述I/O卸载卡,还用于:利用所述第二队列对接的物理I/O设备处理所述写请求。
  10. 根据权利要求9所述的系统,所述队列组件,还用于:将所述写请求对应的数据部分写入所述实例对应的内存空间;将所述数据部分所在的内存地址添加至所述元数据信息中;
    所述I/O卸载卡在所述实例与所述虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据的过程中,用于:从所述第二队列中读取所述元数据信息;按照所述元数据信息中的内存地址获取所述写请求的数据部分;将所述数据部分发送至所述第二队列对接的所述物理I/O设备中。
  11. 根据权利要求1所述的系统,所述队列组件在为所述宿主机上的实例提供虚拟I/O设备的过程中,用于:
    对需要与所述实例进行I/O的物理I/O设备进行I/O虚拟化,以产生所述物理I/O设备对应的虚拟I/O设备。
  12. 根据权利要求11所述的系统,所述队列组件在对需要与所述实例进行I/O的物理I/O设备进行I/O虚拟化的过程中,用于:
    采用SRIOV技术,为需要与所述实例进行I/O的物理I/O设备创建VF设备,所述VF设备用于与所述实例下的各个VCPU进行数据交换;
    利用所述实例的操作系统将所述VF设备注册为指定类型的虚拟I/O设备。
  13. 根据权利要求1所述的系统,所述实例与所述虚拟I/O设备之间发生的I/O请求采用io-uring协议。
  14. 根据权利要求1所述的系统,所述队列组件中还包括加速模块,所述加速模块用于:
    对所述实例与所述虚拟I/O设备之间发生的I/O请求所相关的数据执行加速处理,所述加速处理包括:加解密、压缩和统计卸载中的一种或多种处理。
  15. 根据权利要求14所述的系统,所述加速模块绑定至所述宿主机为所述虚拟I/O设备分配的内存地址,所述虚拟I/O设备将其与所述实例之间发生的I/O请求传递至所述设备队列;所述设备队列触发所述加速模块访问所述内存地址并对所述I/O请求所相关的数据进行加速处理;所述设备队列将所述I/O请求传递至所述I/O卸载卡,以使所述I/O卸载卡感知到所述I/O请求并从所述内存地址读取所述加速处理后的数据。
  16. 一种云环境下的I/O卸载方法,适用于宿主机中的CPU,所述CPU中装配有队列组件,所述方法包括:
    利用所述队列组件为所述宿主机上的实例提供虚拟I/O设备;
    利用所述队列组件为所述虚拟I/O设备配置对应的设备队列;
    在所述CPU内,利用所述设备队列对所述实例与所述虚拟I/O设备之间发生的I/O请求进行调度,以供所述宿主机上插接的I/O卸载卡监听所述设备队列中的I/O请求并在所述实例与所述虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。
  17. 一种云环境下的I/O卸载方法,适用于宿主机上插接的I/O卸载卡,所述宿主机的CPU中装配有队列组件,所述方法,包括:
    从所述队列组件中的设备队列中监听所述宿主机上的实例与所述队列组件为所述实例提供的虚拟I/O设备之间发生的I/O请求;
    获取监听到的I/O请求所相关的数据;
    在所述实例与所述虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。
  18. 一种处理器CPU,安装在宿主机中,所述CPU内装配有队列组件,所述CPU用于执行一条或多条计算机指令,以用于:
    利用所述队列组件为所述宿主机上的实例提供虚拟I/O设备;
    利用所述队列组件为所述虚拟I/O设备配置对应的设备队列;
    在所述CPU内,利用所述设备队列对所述实例与所述虚拟I/O设备之间发生的I/O请求进行调度,以供所述宿主机上插接的I/O卸载卡监听所述设备队列中的I/O请求并在所述实例与所述虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。
  19. 一种I/O卸载卡,插接于宿主机上,所述宿主机的CPU内装配有队列组件,所述I/O卸载卡包括存储器和处理器;
    所述存储器用于存储一条或多条计算机指令;
    所述处理器与所述存储器耦合,用于执行所述一条或多条计算机指令,以用于:
    从所述队列组件中的设备队列中监听所述宿主机上的实例与所述队列组件为所述实例提供的虚拟I/O设备之间发生的I/O请求;
    获取监听到的I/O请求所相关的数据;
    在所述实例与所述虚拟I/O设备对应的物理I/O设备之间传递监听到的I/O请求所相关的数据。
  20. 一种存储计算机指令的计算机可读存储介质,当所述计算机指令被一个或多个处理器执行时,致使所述一个或多个处理器执行权利要求16或17任一项所述的云环境下的I/O卸载方法。
PCT/CN2023/114511 2022-08-30 2023-08-23 一种云环境下的i/o卸载方法、设备、系统及存储介质 WO2024046188A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211060455.3 2022-08-30
CN202211060455.3A CN115408108A (zh) 2022-08-30 2022-08-30 一种云环境下的i/o卸载方法、设备、系统及存储介质

Publications (1)

Publication Number Publication Date
WO2024046188A1 true WO2024046188A1 (zh) 2024-03-07

Family

ID=84163870

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/114511 WO2024046188A1 (zh) 2022-08-30 2023-08-23 一种云环境下的i/o卸载方法、设备、系统及存储介质

Country Status (2)

Country Link
CN (1) CN115408108A (zh)
WO (1) WO2024046188A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115408108A (zh) * 2022-08-30 2022-11-29 阿里巴巴(中国)有限公司 一种云环境下的i/o卸载方法、设备、系统及存储介质
CN116301663A (zh) * 2023-05-12 2023-06-23 新华三技术有限公司 一种数据存储方法、装置及主机
CN117874400A (zh) * 2024-03-13 2024-04-12 中国空气动力研究与发展中心设备设计与测试技术研究所 飞行器模型动导数试验数据处理系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109564514A (zh) * 2016-06-30 2019-04-02 亚马逊科技公司 部分卸载的虚拟化管理器中的存储器分配技术
CN112148422A (zh) * 2019-06-29 2020-12-29 华为技术有限公司 一种io处理的方法和装置
WO2022072096A1 (en) * 2020-10-03 2022-04-07 Intel Corporation Infrastructure processing unit
CN115408108A (zh) * 2022-08-30 2022-11-29 阿里巴巴(中国)有限公司 一种云环境下的i/o卸载方法、设备、系统及存储介质

Also Published As

Publication number Publication date
CN115408108A (zh) 2022-11-29

Similar Documents

Publication Publication Date Title
WO2024046188A1 (zh) 一种云环境下的i/o卸载方法、设备、系统及存储介质
US10275851B1 (en) Checkpointing for GPU-as-a-service in cloud computing environment
TWI637613B (zh) 用於支援經由nvme將網路上的可擴展存放裝置作為本機存放區進行訪問的系統和方法
CN104965757B (zh) 虚拟机热迁移的方法、虚拟机迁移管理装置及系统
Huang et al. High-performance design of hbase with rdma over infiniband
WO2018035856A1 (zh) 实现硬件加速处理的方法、设备和系统
US10572309B2 (en) Computer system, and method for processing multiple application programs
US20150127691A1 (en) Efficient implementations for mapreduce systems
CN111722786A (zh) 基于NVMe设备的存储系统
CN104636077A (zh) 用于虚拟机的网络块设备存储系统与方法
US10802753B2 (en) Distributed compute array in a storage system
US11379405B2 (en) Internet small computer interface systems extension for remote direct memory access (RDMA) for distributed hyper-converged storage systems
CN111309649B (zh) 一种数据传输和任务处理方法、装置及设备
US10761859B2 (en) Information processing system, management device, and method for controlling information processing system
WO2024041412A1 (zh) 存储系统、方法以及硬件卸载卡
WO2020163327A1 (en) System-based ai processing interface framework
WO2022143714A1 (zh) 服务器系统、虚拟机创建方法及装置
KR102326280B1 (ko) 데이터 처리 방법, 장치, 기기 및 매체
CN108475201A (zh) 一种虚拟机启动过程中的数据获取方法和云计算系统
US11507292B2 (en) System and method to utilize a composite block of data during compression of data blocks of fixed size
US11281602B1 (en) System and method to pipeline, compound, and chain multiple data transfer and offload operations in a smart data accelerator interface device
WO2023246843A1 (zh) 数据处理方法、装置及系统
TW202331523A (zh) 適用於分散式深度學習計算的隨需即組共用資料快取方法、電腦程式、電腦可讀取媒體
CN117573041B (zh) 一种改进vhost-scsi提升虚拟化存储性能的方法
CN103902354A (zh) 一种虚拟化应用中快速初始化磁盘的方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23859221

Country of ref document: EP

Kind code of ref document: A1