WO2023155698A1 - Data processing method, apparatus and system based on a paravirtualized device - Google Patents

Data processing method, apparatus and system based on a paravirtualized device

Info

Publication number
WO2023155698A1
WO2023155698A1 PCT/CN2023/074420 CN2023074420W WO2023155698A1 WO 2023155698 A1 WO2023155698 A1 WO 2023155698A1 CN 2023074420 W CN2023074420 W CN 2023074420W WO 2023155698 A1 WO2023155698 A1 WO 2023155698A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
queue
initial
aggregation
host
Prior art date
Application number
PCT/CN2023/074420
Inventor
梁晨
Original Assignee
阿里巴巴(中国)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴(中国)有限公司
Publication of WO2023155698A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28 Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4004 Coupling between buses
    • G06F13/4022 Coupling between buses using switching circuits, e.g. switching matrix, connection or expansion network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45583 Memory management, e.g. access or allocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00 Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0026 PCI express

Definitions

  • the present application relates to the technical field of virtualization, and in particular, to a data processing method, device and system based on a paravirtualization device.
  • PCIe (Peripheral Component Interconnect Express) is a high-speed serial computer expansion bus standard.
  • When a device receives data, it must submit the data to the CPU (Central Processing Unit) in multiple steps.
  • Each step writes to the host memory through DMA (Direct Memory Access).
  • Embodiments of the present application provide a data processing method, device, and system based on a paravirtualized device, so as to at least solve the technical problem in the related art that frequent interaction between the paravirtualized device and the host degrades DMA performance.
  • A data processing method based on a paravirtualized device, including: acquiring a plurality of initial data stored in the completion queue of the paravirtualized device, wherein the plurality of initial data are used to represent the descriptive information of the original data that has been processed by the paravirtualized device but not submitted to the host; determining a plurality of first data that meet a preset condition among the plurality of initial data; performing an aggregation operation on the plurality of first data to generate a first aggregation result; and sending a direct memory access request carrying the first aggregation result to the memory of the host.
  • A data processing apparatus based on a paravirtualized device, including: a data acquisition module, configured to acquire a plurality of initial data stored in the completion queue of the paravirtualized device, wherein the plurality of initial data are used to represent the descriptive information of the original data that has been processed by the paravirtualized device but not submitted to the host; a data determination module, configured to determine a plurality of first data that meet a preset condition among the plurality of initial data; an aggregation module, configured to perform an aggregation operation on the plurality of first data to generate a first aggregation result; and a sending module, configured to send a direct memory access request carrying the first aggregation result to the memory of the host.
  • A data processing system based on a paravirtualized device, including: a host, including a memory and a completion queue; and a paravirtualized device, connected to the host and configured to: acquire a plurality of initial data stored in the completion queue, wherein the plurality of initial data are used to represent the descriptive information of the original data that has been processed by the paravirtualized device but not submitted to the host; determine a plurality of first data that meet a preset condition among the plurality of initial data; perform an aggregation operation on the plurality of first data to generate a first aggregation result; and send a direct memory access request carrying the first aggregation result to the memory of the host.
  • A computer-readable storage medium including a stored program, wherein, when the program runs, the device where the computer-readable storage medium is located is controlled to execute the above data processing method based on a paravirtualized device.
  • A computer terminal, including a memory and a processor, wherein the processor is used to run a program stored in the memory, and the program, when running, executes the above data processing method based on a paravirtualized device.
  • A plurality of initial data stored in the used ring can be obtained, a plurality of first data that meet the preset condition can then be selected from the plurality of initial data, and the plurality of first data are aggregated to generate the first aggregation result. A DMA request carrying the first aggregation result is sent to the memory, so that multiple queue items of the used ring are updated at one time.
  • Because a single DMA request is initiated through the aggregation operation, there is no need to initiate a DMA request for each queue item, which reduces the number of operations generated by updating the used ring and avoids backpressure on the PCIe interface on the device side. This achieves the technical effect of improving DMA performance and thereby solves the technical problem in the related art that frequent interaction between the paravirtualized device and the host degrades DMA performance.
  • Fig. 1 is a schematic diagram of updating a used ring according to the prior art
  • FIG. 2 is a block diagram of a hardware structure of a computer terminal (or mobile device) for implementing a data processing method based on a paravirtualized device according to an embodiment of the present application;
  • FIG. 3 is a flowchart of a data processing method based on a paravirtualized device according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of an optional virtio device virtualization implementation architecture combining hardware and software according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of an optional way of updating the used ring according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of optional scheduling by queue according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a data processing apparatus based on a paravirtualized device according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a data processing system based on a paravirtualized device according to an embodiment of the present application.
  • Fig. 9 is a structural block diagram of a computer terminal according to an embodiment of the present application.
  • Virtio is an I/O paravirtualization solution, a set of general I/O device virtualization programs, and an abstraction of a set of general I/O devices in a paravirtualized Hypervisor.
  • a device that uses the virtio protocol is called a virtio device.
  • Data buffer: stores the data received by the device.
  • Used ring: the completion queue of the virtio device.
  • The hardware notifies the driver that a command has been completed by submitting an entry to the used ring.
  • Each used ring entry is a structure that points to a data buffer and contains only the description information of the data (address, length, etc.).
  • Used ring index: the used ring pointer.
  • Currently, after a device receives data, it must go through the following three steps to submit the data to the CPU: write the data buffer; write the used ring; write the used ring index.
  • the used ring contains 8 queue items.
  • the submitted queue items are indicated by solid boxes, and the unsubmitted queue items are indicated by hollow boxes.
  • The existing steps for writing the used ring are as follows: submit queue item 0 to the CPU and update the used ring index to 1, indicating that the items from queue item 1 onward have not been submitted; submit queue item 1 to the CPU and update the used ring index to 2, indicating that the items from queue item 2 onward have not been submitted; submit queue item 2 to the CPU and update the used ring index to 3, indicating that the items from queue item 3 onward have not been submitted; submit queue item 3 to the CPU and update the used ring index to 4, indicating that the items from queue item 4 onward have not been submitted.
  • In this flow, DMA requests must be initiated many times, which results in more operations and lower DMA performance.
  • The present application provides an aggregated submission scheme that reduces the number of updates and improves DMA performance by aggregating the multiple DMA requests that would otherwise be initiated into one.
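  • The difference between the two flows can be sketched as a count of write operations. This is only an illustrative model: the function names and the operation labels are hypothetical, and it assumes that each used ring write and each used ring index write costs one DMA request (data buffer writes are left out).

```python
from typing import List, Tuple


def submit_one_by_one(n_items: int) -> List[Tuple[str, int]]:
    """Existing flow: every queue item costs a used ring write and an index write."""
    ops: List[Tuple[str, int]] = []
    for item in range(n_items):
        ops.append(("write_used_entry", item))      # submit queue item `item`
        ops.append(("write_used_index", item + 1))  # index now points past it
    return ops


def submit_aggregated(n_items: int) -> List[Tuple[str, int]]:
    """Aggregated flow: one write covers all items, one write updates the index."""
    if n_items == 0:
        return []
    return [("write_used_entries", n_items), ("write_used_index", n_items)]
```

  • Under these assumptions, four queue items cost eight operations in the existing flow and two in the aggregated flow.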
  • A data processing method based on a paravirtualized device is also provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
  • FIG. 2 shows a block diagram of a hardware structure of a computer terminal (or mobile device) for implementing a data processing method based on a paravirtualization device.
  • The computer terminal 20 may include one or more processors (shown as 202a, 202b, ..., 202n in the figure; a processor may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 204 for storing data, and a transmission device 206 for communication functions.
  • FIG. 2 is only a schematic diagram, which does not limit the structure of the above-mentioned electronic device.
  • The computer terminal 20 may also include more or fewer components than shown in FIG. 2, or have a different configuration from that shown in FIG. 2.
  • the one or more processors 202 and/or other data processing circuits described above may generally be referred to herein as "data processing circuits".
  • the data processing circuit may be implemented in whole or in part as software, hardware, firmware or other arbitrary combinations.
  • the data processing circuit can be a single independent processing module, or be fully or partially integrated into any of the other elements in the computer terminal 20 (or mobile device).
  • The data processing circuit acts as a kind of processor control (e.g., selection of the variable resistor terminal path connected to the interface).
  • The memory 204 can be used to store software programs and modules of application software, such as the program instructions/data storage device corresponding to the data processing method based on a paravirtualized device in the embodiments of the present application. The processor runs the software programs and modules stored in the memory 204 to execute various functional applications and data processing, that is, to implement the above data processing method based on a paravirtualized device.
  • the memory 204 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • the memory 204 may further include a memory that is remotely located relative to the processor 202 , and these remote memories may be connected to the computer terminal 20 through a network.
  • networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the transmission device 206 is used to receive or transmit data via a network.
  • the specific example of the above-mentioned network may include a wireless network provided by the communication provider of the computer terminal 20 .
  • the transmission device 206 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices through a base station so as to communicate with the Internet.
  • the transmission device 206 may be a radio frequency (Radio Frequency, RF) module, which is used to communicate with the Internet in a wireless manner.
  • the display may be, for example, a touchscreen liquid crystal display (LCD), which may enable a user to interact with the user interface of the computer terminal 20 (or mobile device).
  • the computer device (or mobile device) shown in FIG. 2 may include hardware elements (including circuits), software elements (including computer code), or a combination of both hardware and software elements. It should be noted that FIG. 2 is only one example of a particular embodiment, and is intended to illustrate the types of components that may be present in a computer device (or mobile device) as described above.
  • Fig. 3 is a flowchart of a data processing method based on a paravirtualized device according to an embodiment of the present application. As shown in Figure 3, the method includes the following steps:
  • Step S302: acquire a plurality of initial data stored in the completion queue of the paravirtualized device, wherein the plurality of initial data are used to represent the descriptive information of the original data that has been processed by the paravirtualized device but not submitted to the host.
  • The paravirtualized device in the above step can be installed on the host. For example, it can be a paravirtualized network card or a paravirtualized memory, but it is not limited thereto.
  • The initial data in the above step may be data items that are stored in the used ring and have not been submitted to the CPU. Each data item stores the description information of the corresponding original data.
  • The description information may include the address of the data buffer storing the original data, the length of the original data, and so on, but is not limited thereto.
  • Different types of paravirtualized devices have different types of original data. For example, for a paravirtualized network card, the original data may be original packets.
  • After receiving the original data, the device first executes the step of writing the buffer: it writes the original data into different buffers, where the data waits to be processed by the host. Once the original data has been written, the corresponding description information can be stored in the used ring.
  • For example, the original data for buffer 0 to buffer 2 has been written into the corresponding buffers, and the corresponding description information has been stored in the used ring as queue item 0 to queue item 2. When the step of writing the used ring is then executed, the description information in queue item 0 to queue item 2 can be used as the initial data.
  • Step S304: determine a plurality of first data satisfying a preset condition among the plurality of initial data.
  • The preset condition in the above step may be an aggregation condition set in advance according to actual needs. For example, the condition may simply select all unsubmitted queue items in the used ring, or it may select the unsubmitted queue items corresponding to the same network card transmission queue, but it is not limited thereto.
  • For example, queue item 0 to queue item 2 can be used as the first data.
  • Step S306: perform an aggregation operation on the plurality of first data to generate a first aggregation result.
  • The above aggregation operation may splice the description information corresponding to the plurality of first data, for example by splicing the addresses and summing the lengths, to obtain the first aggregation result. It is not limited to this; other aggregation operations can also be used.
  • Step S308: send a direct memory access request carrying the first aggregation result to the memory of the host.
  • To update the memory corresponding to the used ring in a single operation, the first aggregation result can be encapsulated according to the DMA protocol to obtain a DMA request, and the DMA request is sent to the host's memory to complete the update of the used ring.
  • For example, queue item 1 to queue item 3 can be selected as the first data through the preset condition, and the three queue items can be aggregated to generate one DMA request that updates items 1 to 3 of the used ring at one time. The boxes corresponding to queue item 1 to queue item 3 are then changed to solid boxes, indicating that queue item 1 to queue item 3 have been submitted.
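  • Steps S302 to S308 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: `UsedEntry`, `DmaRequest` and `aggregate_used_entries` are hypothetical names, the "splice addresses, sum lengths" aggregation is the example given in the text, and the sketch assumes the selected items occupy consecutive used ring slots and ignores ring wrap-around.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class UsedEntry:
    slot: int      # position of the queue item in the used ring
    buf_addr: int  # address of the data buffer holding the original data
    length: int    # length of the original data


@dataclass(frozen=True)
class DmaRequest:
    first_slot: int              # first used ring slot updated by this request
    count: int                   # number of slots updated in one go
    addresses: Tuple[int, ...]   # spliced buffer addresses
    total_length: int            # summed data lengths


def aggregate_used_entries(entries: List[UsedEntry]) -> DmaRequest:
    """Build one DMA request that updates several adjacent used ring slots."""
    if not entries:
        raise ValueError("nothing to aggregate")
    for prev, cur in zip(entries, entries[1:]):
        if cur.slot != prev.slot + 1:
            raise ValueError("entries must occupy consecutive slots")
    return DmaRequest(
        first_slot=entries[0].slot,
        count=len(entries),
        addresses=tuple(e.buf_addr for e in entries),
        total_length=sum(e.length for e in entries),
    )
```

  • With queue items 1 to 3 as input, the sketch yields a single request starting at slot 1 and covering three slots, in place of three separate requests.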
  • Determining the plurality of first data satisfying the preset condition among the plurality of initial data includes at least one of the following: obtaining the data stored in a target cache to obtain the plurality of first data, wherein the plurality of initial data are sequentially cached in the target cache; when a preset timing time arrives, determining that the plurality of initial data are the plurality of first data; when the number of the plurality of initial data is greater than or equal to a preset number, determining that the plurality of initial data are the plurality of first data.
  • The target cache may be a preset adaptive cache; when a bottleneck occurs in PCIe performance, data accumulates in the adaptive cache.
  • Before the queue items in the used ring are submitted to the CPU, they are stored sequentially in the adaptive cache, so all data items stored in the adaptive cache can be used as the first data.
  • The preset timing time may be a preset timer value, which can be set according to actual needs.
  • Before the preset timing time arrives, no processing needs to be performed on the queue items stored in the used ring; after the preset timing time arrives, all unsubmitted queue items stored in the used ring can be used as the first data.
  • The preset number may be a preset maximum aggregation number, which can be set according to actual needs.
  • Before the number of unsubmitted queue items stored in the used ring reaches the preset number, no processing needs to be performed on the queue items stored in the used ring; after the number reaches the preset number, all unsubmitted queue items stored in the used ring can be used as the first data.
  • the above three conditions can be used alone or in any combination.
  • the specific combination manner can be determined according to actual needs, which is not specifically limited in this application.
  • For example, the preset conditions can include the adaptive cache, the timer and the maximum aggregation number.
  • The above three methods can be used together to determine the first data, and an aggregation operation is then performed on the first data.
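  • One possible way to combine the three conditions is sketched below. The class `AggregationWindow` and its parameters are hypothetical; the text leaves the exact combination open, so this sketch simply treats the cache as the place where unsubmitted items accumulate and releases the whole batch when either the maximum aggregation number is reached or the timer expires. Time is passed in explicitly to keep the example deterministic.

```python
from typing import List, Optional


class AggregationWindow:
    """Accumulates unsubmitted queue items until a flush condition is met."""

    def __init__(self, max_count: int, timeout: float) -> None:
        self.max_count = max_count        # preset maximum aggregation number
        self.timeout = timeout            # preset timing time
        self._cache: List[int] = []       # stands in for the adaptive cache
        self._oldest: Optional[float] = None

    def push(self, item: int, now: float) -> List[int]:
        """Cache one item; return a batch if the count threshold is reached."""
        if not self._cache:
            self._oldest = now            # timer starts with the first cached item
        self._cache.append(item)
        if len(self._cache) >= self.max_count:
            return self._flush()
        return []

    def poll(self, now: float) -> List[int]:
        """Return the cached batch if the timer has expired, else nothing."""
        if self._cache and self._oldest is not None:
            if now - self._oldest >= self.timeout:
                return self._flush()
        return []

    def _flush(self) -> List[int]:
        batch, self._cache = self._cache, []
        self._oldest = None
        return batch
```

  • A returned non-empty batch corresponds to the "first data" that would then be aggregated into one DMA request.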
  • Performing the aggregation operation on the plurality of first data to generate the first aggregation result includes: determining a first transmission queue corresponding to each first data, wherein the first transmission queue is used to transmit the original data; and obtaining the first data corresponding to the same first transmission queue to obtain the first aggregation result.
  • The first transmission queue can be a queue supported by the virtio device. The virtio device can use multiple queues to transmit original data to the host, and each queue corresponds to a used ring.
  • For each first data, the queue to which the corresponding original data is sent may be determined. Because the queue items corresponding to different queues are often interleaved before the aggregation operation, the aggregation operation cannot be performed directly. Therefore, the queue items corresponding to the same queue may first be arranged together by scheduling by queue, and the queue items corresponding to the same queue are then aggregated to obtain the first aggregation result.
  • the virtio device virtualization implementation architecture combined with software and hardware as shown in Figure 4 is still used as an example for illustration.
  • Boxes filled with different patterns represent queue items corresponding to different queues, and the different numbers are the serial numbers of the queue items within their queue.
  • Before scheduling by queue, the queue items corresponding to different queues are interleaved.
  • After scheduling by queue, the queue items corresponding to the same queue are adjacent. Therefore, the queue items corresponding to the same queue can be aggregated and submitted to the memory at one time.
  • Obtaining the first data corresponding to the same first transmission queue to obtain the first aggregation result includes: determining the first transmission queue corresponding to the first of the plurality of first data to obtain a target transmission queue; and acquiring, among the plurality of first data, the first data corresponding to the target transmission queue to obtain the first aggregation result.
  • For example, the queue corresponding to the queue item in the first storage position among the first data can be used as the target queue, and all the first data corresponding to the target queue are aggregated to obtain the first aggregation result. The first data corresponding to other queues are processed only after the first data corresponding to the target queue has been submitted.
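  • The target-queue selection described above can be sketched as follows. The function name and the `(queue_id, item_no)` tuple layout are assumptions made for illustration only.

```python
from typing import List, Tuple

QueueItem = Tuple[int, int]  # (queue_id, item_no): hypothetical layout


def split_by_target_queue(items: List[QueueItem]) -> Tuple[List[QueueItem], List[QueueItem]]:
    """Pick the queue of the first item as the target and split the items.

    Returns (batch, rest): `batch` holds every item of the target queue and is
    aggregated now; `rest` keeps its order and waits for a later round.
    """
    if not items:
        return [], []
    target_queue = items[0][0]
    batch = [it for it in items if it[0] == target_queue]
    rest = [it for it in items if it[0] != target_queue]
    return batch, rest
```

  • Calling the function again on `rest` handles the next queue, which matches the statement that items of other queues are processed only after the target queue's items have been submitted.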
  • The method further includes: determining a plurality of second data among the plurality of initial data based on an initial queue identifier corresponding to the completion queue, wherein the initial queue identifier is used to represent the identification information of the submitted data, and the plurality of second data are used to represent the data currently submitted to the host; performing an aggregation operation on the plurality of second data to obtain a second aggregation result; and updating the initial queue identifier based on the second aggregation result.
  • The initial queue identifier can be the used ring index, which points to the first unsubmitted queue item in the used ring.
  • The update may be an update of the value of the used ring index.
  • After the used ring is updated, the used ring index needs to be updated. Because multiple queue items in the used ring are updated, the used ring index would otherwise need to be updated multiple times. To avoid the extra operations caused by updating the used ring index multiple times, the most recently submitted queue items can be determined as the second data, and the multiple DMA requests are combined into one; that is, the second data are aggregated to obtain the second aggregation result, and the used ring index is updated once.
  • For example, the step of writing the used ring index can be executed by performing an aggregation operation on queue item 0 to queue item 2 and updating the used ring index based on the second aggregation result, setting its value to 3.
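  • The single index update can be sketched as below. The function name is hypothetical, and the 16-bit wrap-around is an assumption borrowed from the virtio split-ring layout, where the used ring `idx` field is a 16-bit counter; the text here does not specify the width.

```python
USED_IDX_MODULUS = 1 << 16  # assumed 16-bit used ring index


def aggregated_index_update(current_index: int, submitted_count: int) -> int:
    """Return the used ring index after one combined update.

    Instead of writing current_index + 1, + 2, ... one at a time, the index is
    advanced past all newly submitted queue items in a single write.
    """
    if submitted_count < 0:
        raise ValueError("submitted_count must be non-negative")
    return (current_index + submitted_count) % USED_IDX_MODULUS
```

  • Starting from index 0 with queue items 0 to 2 submitted, the function returns 3, matching the example above.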
  • Performing the aggregation operation on the plurality of second data to obtain the second aggregation result includes: determining a second transmission queue corresponding to each second data; and obtaining the second data corresponding to the same second transmission queue to obtain the second aggregation result.
  • The second transmission queue can also be a queue supported by the virtio device.
  • The virtio device can use multiple queues to transmit original data to the host, and each queue corresponds to a used ring.
  • The queue items corresponding to the same queue can be arranged together by scheduling by queue, and the queue items corresponding to the same queue are then aggregated to obtain the second aggregation result.
  • the virtio device virtualization implementation architecture combined with software and hardware as shown in Figure 4 is still used as an example for illustration.
  • Boxes filled with different patterns represent queue items corresponding to different queues.
  • Before scheduling by queue, the queue items corresponding to different queues are interleaved.
  • After scheduling by queue, the queue items corresponding to the same queue are adjacent. Therefore, the queue items corresponding to the same queue can be aggregated to update the used ring index at one time.
  • It should be noted that scheduling by queue needs to be performed only once. If the aggregation operation was already scheduled by queue while updating the used ring, scheduling by queue is not needed when updating the used ring index. If the aggregation operation was not scheduled by queue while updating the used ring, it is performed with scheduling by queue when updating the used ring index.
  • The method further includes: acquiring a plurality of original data; determining a third transmission queue corresponding to each original data; sorting the plurality of original data according to the third transmission queue to obtain sorted data, wherein the original data corresponding to the same third transmission queue are adjacent; and writing the sorted data into the data buffer of the host in turn.
  • The third transmission queue may also be a queue supported by the virtio device, and the virtio device may use multiple queues to transmit original data to the host.
  • Because the original data corresponding to different queues are often interleaved before the aggregation operation, the aggregation operation cannot be performed directly. Therefore, the original data corresponding to the same queue can be arranged together by scheduling by queue and then written into the data buffer in the sorted order.
  • the virtio device virtualization implementation architecture combined with software and hardware as shown in Figure 4 is still used as an example for illustration.
  • Boxes filled with different patterns represent data corresponding to different queues.
  • Before scheduling by queue, the data corresponding to different queues are interleaved.
  • After scheduling by queue, the data corresponding to the same queue are adjacent. Therefore, the data can be stored in the data buffer in turn, which ensures that among the queue items stored in the used ring, the queue items corresponding to the same queue are adjacent.
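  • Sorting the original data so that data of the same queue become adjacent can be sketched with a stable sort. The function name, the `(queue_id, payload)` tuple layout, and the choice to order queues by first arrival are assumptions for illustration; the text only requires that data of the same queue end up adjacent.

```python
from typing import Dict, Hashable, List, Tuple

Packet = Tuple[Hashable, object]  # (queue_id, payload): hypothetical layout


def schedule_by_queue(packets: List[Packet]) -> List[Packet]:
    """Reorder packets so that packets of the same queue are adjacent.

    Queues are ordered by first arrival. Python's sort is stable, so the
    arrival order inside each queue is preserved.
    """
    first_seen: Dict[Hashable, int] = {}
    for queue_id, _ in packets:
        first_seen.setdefault(queue_id, len(first_seen))
    return sorted(packets, key=lambda p: first_seen[p[0]])
```

  • Writing the returned list to the data buffer in order keeps the corresponding used ring items of each queue adjacent, so they can later be aggregated without another scheduling pass.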
  • It should be noted that scheduling by queue needs to be performed only once. If the original data was cached and scheduled by queue before the data buffer was written, scheduling by queue is not needed for the aggregation operations when the used ring and the used ring index are subsequently updated. If the original data was not cached and scheduled by queue before the data buffer was written, then: if the aggregation operation is scheduled by queue while updating the used ring, scheduling by queue is not needed when updating the used ring index; if the aggregation operation is not scheduled by queue while updating the used ring, it is performed with scheduling by queue when updating the used ring index.
  • The method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • The technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and contains several instructions that enable a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the embodiments of the present application.
  • A data processing apparatus based on a paravirtualized device, for implementing the above data processing method based on a paravirtualized device, is also provided.
  • The apparatus 700 includes: a data acquisition module 702, a data determination module 704, an aggregation module 706 and a sending module 708.
  • The data acquisition module 702 is used to acquire a plurality of initial data stored in the completion queue of the paravirtualized device, wherein the plurality of initial data are used to represent the descriptive information of the original data that has been processed by the paravirtualized device but not submitted to the host; the data determination module 704 is used to determine a plurality of first data satisfying a preset condition among the plurality of initial data; the aggregation module 706 is used to perform an aggregation operation on the plurality of first data to generate a first aggregation result; and the sending module 708 is used to send a direct memory access request carrying the first aggregation result to the memory of the host.
  • the above-mentioned data acquisition module 702, data determination module 704, aggregation module 706, and sending module 708 correspond to steps S302 to S308 in Embodiment 1, and the examples and implementations of the four modules and corresponding steps
  • the application scenarios are the same, but are not limited to the content disclosed in Embodiment 1 above.
  • the above modules can run in the computer terminal 10 provided in Embodiment 1.
  • The data determination module includes at least one of the following: a data acquisition unit, a first determining unit, and a second determining unit.
  • The data acquisition unit is used to obtain the data stored in the target cache to obtain the pieces of first data, where the pieces of initial data are cached in the target cache in sequence;
  • the first determining unit is used to determine the pieces of initial data to be the pieces of first data when the preset timing time is reached;
  • the second determining unit is used to determine the pieces of initial data to be the pieces of first data when the number of pieces of initial data is greater than or equal to a preset number.
  • The aggregation module includes: a queue determination unit and a result acquisition unit.
  • The queue determination unit is used to determine the first transmission queue corresponding to each piece of first data, where the first transmission queue is used to transmit the original data; the result acquisition unit is used to obtain the pieces of first data corresponding to the same first transmission queue to obtain the first aggregation result.
  • The result acquisition unit is further used to determine the first transmission queue corresponding to the first of the pieces of first data to obtain a target transmission queue, and to obtain the pieces of first data corresponding to the target transmission queue to obtain the first aggregation result.
  • The apparatus further includes: an update module.
  • The data determination module is further used to determine multiple pieces of second data among the pieces of initial data based on the initial queue identifier corresponding to the completion queue, where the initial queue identifier represents identification information of the data whose submission has been completed, and the pieces of second data represent the data currently being submitted to the host;
  • the aggregation module is further used to perform an aggregation operation on the pieces of second data to obtain a second aggregation result;
  • the update module is used to update the initial queue identifier based on the second aggregation result.
  • The aggregation module includes: a queue determination unit and a result acquisition unit.
  • The queue determination unit is used to determine the second transmission queue corresponding to each piece of second data; the result acquisition unit is used to obtain the pieces of second data corresponding to the same second transmission queue to obtain the second aggregation result.
  • The apparatus further includes: a data acquisition module, a queue determination module, a sorting module, and a writing module.
  • The data acquisition module is used to obtain multiple pieces of original data; the queue determination module is used to determine the third transmission queue corresponding to each piece of original data; the sorting module is used to sort the pieces of original data according to the third transmission queue to obtain sorted data, in which pieces of original data corresponding to the same third transmission queue are adjacent; the writing module is used to write the sorted data into the data buffer of the host in sequence.
  • A data processing system based on a paravirtualized device, for implementing the above data processing method based on a paravirtualized device, is also provided.
  • The system includes:
  • A host 82, including: a memory and a completion queue.
  • A paravirtualized device 84, connected to the host and used to: obtain multiple pieces of initial data stored in the completion queue, where the pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; determine, among the pieces of initial data, multiple pieces of first data that satisfy a preset condition; perform an aggregation operation on the pieces of first data to generate a first aggregation result; and send a direct memory access request carrying the first aggregation result to the memory of the host.
  • The paravirtualized device is further used to perform at least one of the following steps: obtaining the data stored in the target cache to obtain the pieces of first data, where the pieces of initial data are cached in the target cache in sequence; when the preset timing time is reached, determining the pieces of initial data to be the pieces of first data; when the number of pieces of initial data is greater than or equal to a preset number, determining the pieces of initial data to be the pieces of first data.
  • The paravirtualized device is further used to determine the first transmission queue corresponding to each piece of first data, and to obtain the pieces of first data corresponding to the same first transmission queue to obtain the first aggregation result, where the first transmission queue is used to transmit the original data.
  • The paravirtualized device is further used to determine the first transmission queue corresponding to the first of the pieces of first data to obtain a target transmission queue, and to obtain the pieces of first data corresponding to the target transmission queue to obtain the first aggregation result.
  • The paravirtualized device is further used to: determine multiple pieces of second data among the pieces of initial data based on the initial queue identifier corresponding to the completion queue, where the initial queue identifier represents identification information of the data whose submission has been completed and the pieces of second data represent the data currently being submitted to the host; perform an aggregation operation on the pieces of second data to obtain a second aggregation result; and update the initial queue identifier based on the second aggregation result.
  • The paravirtualized device is further used to determine the second transmission queue corresponding to each piece of second data; the result acquisition unit is used to obtain the pieces of second data corresponding to the same second transmission queue to obtain the second aggregation result.
  • The host further includes: a data buffer. The paravirtualized device is further used to: obtain multiple pieces of original data; determine the third transmission queue corresponding to each piece of original data; sort the pieces of original data according to the third transmission queue to obtain sorted data, in which pieces of original data corresponding to the same third transmission queue are adjacent; and write the sorted data into the data buffer in sequence.
  • Embodiments of the present application may provide a computer terminal, and the computer terminal may be any computer terminal device in a group of computer terminals.
  • the foregoing computer terminal may also be replaced with a terminal device such as a mobile terminal.
  • the foregoing computer terminal may be located in at least one network device among multiple network devices of the computer network.
  • The above computer terminal may execute program code for the following steps of the data processing method based on a paravirtualized device: obtaining multiple pieces of initial data stored in the completion queue of the paravirtualized device, where the pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; determining, among the pieces of initial data, multiple pieces of first data that satisfy a preset condition; performing an aggregation operation on the pieces of first data to generate a first aggregation result; and sending a direct memory access request carrying the first aggregation result to the memory of the host.
  • FIG. 9 is a structural block diagram of a computer terminal according to an embodiment of the present application.
  • the computer terminal A may include: one or more (only one is shown in the figure) processors 902 , and a memory 904 .
  • The memory can be used to store software programs and modules, such as the program instructions/modules corresponding to the data processing method and apparatus based on a paravirtualized device in the embodiments of the present application. By running the software programs and modules stored in the memory, the processor executes various functional applications and data processing, that is, implements the above data processing method based on a paravirtualized device.
  • the memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • the memory may further include a memory remotely located relative to the processor, and these remote memories may be connected to the terminal A through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The processor can, through the transmission device, call the information and the application program stored in the memory to perform the following steps: obtaining multiple pieces of initial data stored in the completion queue of the paravirtualized device, where the pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; determining, among the pieces of initial data, multiple pieces of first data that satisfy a preset condition; performing an aggregation operation on the pieces of first data to generate a first aggregation result; and sending a direct memory access request carrying the first aggregation result to the memory of the host.
  • The above processor may also execute program code for the following steps: obtaining the data stored in the target cache to obtain the pieces of first data, where the pieces of initial data are cached in the target cache in sequence; and/or, when the preset timing time is reached, determining the pieces of initial data to be the pieces of first data; and/or, when the number of pieces of initial data is greater than or equal to a preset number, determining the pieces of initial data to be the pieces of first data.
  • The above processor may also execute program code for the following steps: determining the first transmission queue corresponding to each piece of first data, where the first transmission queue is used to transmit the original data; and obtaining the pieces of first data corresponding to the same first transmission queue to obtain the first aggregation result.
  • The above processor may also execute program code for the following steps: determining the first transmission queue corresponding to the first of the pieces of first data to obtain a target transmission queue; and obtaining the pieces of first data corresponding to the target transmission queue to obtain the first aggregation result.
  • The above processor may also execute program code for the following steps: determining multiple pieces of second data among the pieces of initial data based on the initial queue identifier corresponding to the completion queue, where the initial queue identifier represents identification information of the data whose submission has been completed and the pieces of second data represent the data currently being submitted to the host; performing an aggregation operation on the pieces of second data to obtain a second aggregation result; and updating the initial queue identifier based on the second aggregation result.
  • The above processor may also execute program code for the following steps: determining the second transmission queue corresponding to each piece of second data; and obtaining the pieces of second data corresponding to the same second transmission queue to obtain the second aggregation result.
  • The above processor may also execute program code for the following steps: obtaining multiple pieces of original data; determining the third transmission queue corresponding to each piece of original data; sorting the pieces of original data according to the third transmission queue to obtain sorted data, in which pieces of original data corresponding to the same third transmission queue are adjacent; and writing the sorted data into the data buffer of the host in sequence.
  • A data processing solution based on a paravirtualized device is provided. A single DMA request is initiated through the aggregation operation, so no DMA request needs to be initiated for each individual queue entry. This reduces the number of operations generated by updating the used ring, avoids back pressure on the PCIe interface as seen from the device side, and improves DMA performance, thereby solving the technical problem in the related art that frequent interaction between the paravirtualized device and the host degrades DMA performance.
  • the structure shown in Figure 9 is only schematic, and the computer terminal can also be a smart phone (such as an Android phone, an iOS phone, etc.), a tablet computer, a handheld computer, and a mobile Internet device (Mobile Internet Devices, MID ), PAD and other terminal equipment.
  • FIG. 9 does not limit the structure of the above-mentioned electronic device.
  • the computer terminal A may also include more or less components than those shown in FIG. 9 (such as a network interface, a display device, etc.), or have a configuration different from that shown in FIG. 9 .
  • All or some of the steps of the methods in the above embodiments can be completed by a program that instructs the relevant hardware of a terminal device.
  • The program can be stored in a computer-readable storage medium, and the storage medium can include: a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, etc.
  • the embodiment of the present application also provides a storage medium.
  • the foregoing storage medium may be used to store program codes executed by the paravirtualized device-based data processing method provided in the foregoing embodiments.
  • the above-mentioned storage medium may be located in any computer terminal in the group of computer terminals in the computer network, or in any mobile terminal in the group of mobile terminals.
  • The storage medium is configured to store program code for performing the following steps: obtaining multiple pieces of initial data stored in the completion queue of the paravirtualized device, where the pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; determining, among the pieces of initial data, multiple pieces of first data that satisfy a preset condition; performing an aggregation operation on the pieces of first data to generate a first aggregation result; and sending a direct memory access request carrying the first aggregation result to the memory of the host.
  • The above storage medium is further configured to store program code for performing the following steps: obtaining the data stored in the target cache to obtain the pieces of first data, where the pieces of initial data are cached in the target cache in sequence; and/or, when the preset timing time is reached, determining the pieces of initial data to be the pieces of first data; and/or, when the number of pieces of initial data is greater than or equal to a preset number, determining the pieces of initial data to be the pieces of first data.
  • The above storage medium is further configured to store program code for performing the following steps: determining the first transmission queue corresponding to each piece of first data, where the first transmission queue is used to transmit the original data; and obtaining the pieces of first data corresponding to the same first transmission queue to obtain the first aggregation result.
  • The above storage medium is further configured to store program code for performing the following steps: determining the first transmission queue corresponding to the first of the pieces of first data to obtain a target transmission queue; and obtaining the pieces of first data corresponding to the target transmission queue to obtain the first aggregation result.
  • The above storage medium is further configured to store program code for performing the following steps: determining multiple pieces of second data among the pieces of initial data based on the initial queue identifier corresponding to the completion queue, where the initial queue identifier represents identification information of the data whose submission has been completed and the pieces of second data represent the data currently being submitted to the host; performing an aggregation operation on the pieces of second data to obtain a second aggregation result; and updating the initial queue identifier based on the second aggregation result.
  • The above storage medium is further configured to store program code for performing the following steps: determining the second transmission queue corresponding to each piece of second data; and obtaining the pieces of second data corresponding to the same second transmission queue to obtain the second aggregation result.
  • The above storage medium is further configured to store program code for performing the following steps: obtaining multiple pieces of original data; determining the third transmission queue corresponding to each piece of original data; sorting the pieces of original data according to the third transmission queue to obtain sorted data, in which pieces of original data corresponding to the same third transmission queue are adjacent; and writing the sorted data into the data buffer of the host in sequence.
  • the disclosed technical content can be realized in other ways.
  • The apparatus embodiments described above are only illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of units or modules may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • The technical solution of the present application, in essence, or the part that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or some of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

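The present application discloses a data processing method, apparatus, and system based on a paravirtualized device. The method includes: obtaining multiple pieces of initial data stored in a completion queue of the paravirtualized device, where the pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to a host; determining, among the pieces of initial data, multiple pieces of first data that satisfy a preset condition; performing an aggregation operation on the pieces of first data to generate a first aggregation result; and sending a direct memory access request carrying the first aggregation result to the memory of the host. The present application solves the technical problem in the related art that frequent interaction between the paravirtualized device and the host degrades DMA performance.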

Description

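Data processing method, apparatus, and system based on a paravirtualized device
This application claims priority to Chinese patent application No. 202210153414.2, filed with the China Patent Office on February 18, 2022 and entitled "Data processing method, apparatus, and system based on a paravirtualized device", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of virtualization technology, and in particular to a data processing method, apparatus, and system based on a paravirtualized device.
Background
At present, in virtio device virtualization implemented by combining software and hardware, the host and the device are connected through PCIe (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard). According to the virtio device specification, after the device receives data, the data must go through multiple steps to be submitted to the CPU (Central Processing Unit), and each step writes host memory by means of DMA (Direct Memory Access). However, when these steps occur too frequently, the CPU's PCIe subsystem becomes a bottleneck, the device side sees back pressure on the PCIe interface, and DMA performance degrades.
No effective solution to the above problem has yet been proposed.
Summary
The embodiments of the present application provide a data processing method, apparatus, and system based on a paravirtualized device, to at least solve the technical problem in the related art that frequent interaction between the paravirtualized device and the host degrades DMA performance.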
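According to one aspect of the embodiments of the present application, a data processing method based on a paravirtualized device is provided, including: obtaining multiple pieces of initial data stored in a completion queue of the paravirtualized device, where the pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to a host; determining, among the pieces of initial data, multiple pieces of first data that satisfy a preset condition; performing an aggregation operation on the pieces of first data to generate a first aggregation result; and sending a direct memory access request carrying the first aggregation result to the memory of the host.
According to another aspect of the embodiments of the present application, a data processing apparatus based on a paravirtualized device is also provided, including: a data acquisition module, used to obtain multiple pieces of initial data stored in a completion queue of the paravirtualized device, where the pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to a host; a data determination module, used to determine, among the pieces of initial data, multiple pieces of first data that satisfy a preset condition; an aggregation module, used to perform an aggregation operation on the pieces of first data to generate a first aggregation result; and a sending module, used to send a direct memory access request carrying the first aggregation result to the memory of the host.
According to another aspect of the embodiments of the present application, a data processing system based on a paravirtualized device is also provided, including: a host, which includes a memory and a completion queue; and a paravirtualized device, connected to the host and used to obtain multiple pieces of initial data stored in the completion queue, where the pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; determine, among the pieces of initial data, multiple pieces of first data that satisfy a preset condition; perform an aggregation operation on the pieces of first data to generate a first aggregation result; and send a direct memory access request carrying the first aggregation result to the memory of the host.
According to another aspect of the embodiments of the present application, a computer-readable storage medium is also provided. The computer-readable storage medium includes a stored program, and when the program runs, the device on which the computer-readable storage medium resides is controlled to execute the above data processing method based on a paravirtualized device.
According to another aspect of the embodiments of the present application, a computer terminal is also provided, including a memory and a processor. The processor is used to run a program stored in the memory, and when the program runs, it executes the above data processing method based on a paravirtualized device.
In the embodiments of the present application, when the used ring needs to be updated, multiple pieces of initial data stored in the used ring can be obtained, multiple pieces of first data that satisfy the preset condition can be selected from them, an aggregation operation can be performed on the pieces of first data to generate a first aggregation result, and a DMA request carrying the first aggregation result can be sent to memory, so that multiple queue entries of the used ring are updated at once. Notably, a single DMA request is initiated through the aggregation operation, so no DMA request needs to be initiated for each individual queue entry. This reduces the number of operations generated by updating the used ring, avoids back pressure on the PCIe interface as seen from the device side, and improves DMA performance, thereby solving the technical problem in the related art that frequent interaction between the paravirtualized device and the host degrades DMA performance.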
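Brief Description of the Drawings
The drawings described here are provided for further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not unduly limit it. In the drawings:
FIG. 1 is a schematic diagram of updating the used ring according to the prior art;
FIG. 2 is a block diagram of the hardware structure of a computer terminal (or mobile device) for implementing the data processing method based on a paravirtualized device according to an embodiment of the present application;
FIG. 3 is a flowchart of a data processing method based on a paravirtualized device according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an optional virtio device virtualization architecture combining software and hardware according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an optional way of updating the used ring according to an embodiment of the present application;
FIG. 6 is a schematic diagram of optional scheduling by queue according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a data processing apparatus based on a paravirtualized device according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a data processing system based on a paravirtualized device according to an embodiment of the present application;
FIG. 9 is a structural block diagram of a computer terminal according to an embodiment of the present application.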
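Detailed Description
To help those skilled in the art better understand the solution of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments in the present application without creative effort shall fall within the scope of protection of the present application.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the present application described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units that are clearly listed, but may include other steps or units that are not clearly listed or that are inherent to the process, method, product, or device.
First, some of the nouns and terms that appear in the description of the embodiments of the present application are explained as follows: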
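virtio: virtio is an I/O paravirtualization solution, a set of programs for virtualizing general-purpose I/O devices, and an abstraction of a group of general-purpose I/O devices in a paravirtualized hypervisor. A device that uses the virtio protocol is called a virtio device.
Data buffer: stores the data received by the device.
used ring: the completion queue of a virtio device. After the device completes a request issued by the driver, the hardware notifies the driver that the command has completed by submitting to the used ring. The used ring is a structure that points to the data buffers and contains only description information about the data (address, length, and so on).
used ring index: the pointer of the used ring.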
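At present, after the device receives data, submitting it to the CPU takes the following three steps: writing the data buffer; writing the used ring; writing the used ring index. Take submission to the used ring as an example. As shown in FIG. 1, the used ring contains 8 queue entries; submitted queue entries are shown as solid boxes and unsubmitted queue entries as hollow boxes. The existing steps for writing the used ring are as follows: queue entry 0 is submitted to the CPU and the used ring index is updated to 1, indicating that entries from queue entry 1 onward are unsubmitted; queue entry 1 is submitted to the CPU and the used ring index is updated to 2, indicating that entries from queue entry 2 onward are unsubmitted; queue entry 2 is submitted to the CPU and the used ring index is updated to 3, indicating that entries from queue entry 3 onward are unsubmitted; queue entry 3 is submitted to the CPU and the used ring index is updated to 4, indicating that entries from queue entry 4 onward are unsubmitted.
Therefore, when multiple queue entries in the used ring need to be updated, or the used ring index needs to be updated multiple times, multiple DMA requests have to be initiated, which results in a large number of operations and degraded DMA performance.
To solve the above problem, the present application provides an aggregated submission scheme that merges the multiple DMA requests that would otherwise be initiated into one, reducing the number of updates and improving DMA performance.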
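The saving can be illustrated with simple arithmetic. The sketch below is an illustration only. It assumes, as in the FIG. 1 walk-through, that submitting entries one by one costs one DMA write for the used ring entry and one for the used ring index per entry, and that an aggregated submission costs one of each per batch. The function names are hypothetical, and data buffer writes are left out.

```python
def dma_requests_per_entry(num_entries: int) -> int:
    """Baseline of FIG. 1: every entry triggers one used ring write and one index write."""
    return 2 * num_entries


def dma_requests_aggregated(num_entries: int, batch_size: int) -> int:
    """Aggregated submission: one used ring write and one index write per batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    num_batches = -(-num_entries // batch_size)  # ceiling division
    return 2 * num_batches


if __name__ == "__main__":
    # The four entries of FIG. 1, submitted one by one and as a single batch.
    print(dma_requests_per_entry(4))
    print(dma_requests_aggregated(4, batch_size=4))
```

Under these assumptions, the request count grows with the number of batches rather than with the number of queue entries.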
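Embodiment 1
According to an embodiment of the present application, a data processing method based on a paravirtualized device is also provided. It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, such as one running a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in an order different from the one given here.
The method embodiments provided by the embodiments of the present application can be executed in a mobile terminal, a computer terminal, or a similar computing apparatus. FIG. 2 shows a block diagram of the hardware structure of a computer terminal (or mobile device) for implementing the data processing method based on a paravirtualized device. As shown in FIG. 2, the computer terminal 20 (or mobile device) may include one or more processors (shown in the figure as 202a, 202b, ..., 202n; a processor may include, but is not limited to, a processing apparatus such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 204 for storing data, and a transmission device 206 for communication functions. In addition, it may include: a display, an input/output interface (I/O interface), a universal serial bus (USB) port (which may be included as one of the ports of the BUS bus), a network interface, a power supply, and/or a camera. Those of ordinary skill in the art will understand that the structure shown in FIG. 2 is only schematic and does not limit the structure of the above electronic apparatus. For example, the computer terminal 20 may include more or fewer components than shown in FIG. 2, or have a configuration different from that shown in FIG. 1.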
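It should be noted that the one or more processors 202 and/or other data processing circuits described above may generally be referred to herein as "data processing circuits". A data processing circuit may be embodied wholly or partly as software, hardware, firmware, or any combination of them. In addition, the data processing circuit may be a single independent processing module, or may be wholly or partly integrated into any of the other elements in the computer terminal 20 (or mobile device). The data processing circuit acts as a kind of processor control (for example, selection of a variable-resistance termination path connected to an interface).
The memory 204 can be used to store the software programs and modules of application software, such as the program instructions/data storage apparatus corresponding to the data processing method based on a paravirtualized device in the embodiments of the present application. By running the software programs and modules stored in the memory 204, the processor executes various functional applications and data processing, that is, implements the above data processing method based on a paravirtualized device. The memory 204 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 204 may further include memory located remotely from the processor 202, and such remote memory may be connected to the computer terminal 20 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations of them.
The transmission device 206 is used to receive or send data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the computer terminal 20. In one example, the transmission device 206 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station and thereby communicate with the Internet. In one example, the transmission device 206 may be a radio frequency (RF) module, which is used to communicate with the Internet wirelessly.
The display may be, for example, a touch-screen liquid crystal display (LCD), which enables a user to interact with the user interface of the computer terminal 20 (or mobile device).
It should be noted here that, in some optional embodiments, the computer device (or mobile device) shown in FIG. 2 may include hardware elements (including circuits), software elements (including computer code stored on a computer-readable medium), or a combination of both hardware and software elements. It should be pointed out that FIG. 2 is only one specific example and is intended to show the types of components that may be present in the above computer device (or mobile device).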
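In the above operating environment, the present application provides the data processing method based on a paravirtualized device shown in FIG. 3. FIG. 3 is a flowchart of a data processing method based on a paravirtualized device according to an embodiment of the present application. As shown in FIG. 3, the method includes the following steps:
Step S302: obtain multiple pieces of initial data stored in the completion queue of the paravirtualized device, where the pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host.
The paravirtualized device in the above step can be installed on the host. For example, it can be a paravirtualized network card or paravirtualized storage, but is not limited to these. The initial data in the above step can be the entries stored in the used ring that have not been submitted to the CPU. Each entry stores the description information of the corresponding original data, which can include the address of the data buffer that stores the original data, the length of the original data, and so on, but is not limited to these. The type of original data differs between types of paravirtualized device. For example, for a paravirtualized network card, the original data can be original packets.
For example, take the virtio device virtualization architecture combining software and hardware shown in FIG. 4. After the device receives the original data, it first performs the write-buffer step, writing the original data into different buffers to wait for the host to process it. After the virtio device has finished processing, that is, after the original data has been written into the different buffers, the corresponding description information can be stored in the used ring. For example, once the original data for buffer 0 to buffer 2 has been written into the corresponding buffers, the corresponding description information can be stored in the used ring as queue entry 0 to queue entry 2. The write-used-ring step is then performed, and the description information in queue entry 0 to queue entry 2 can be taken as the initial data.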
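Step S304: determine, among the pieces of initial data, multiple pieces of first data that satisfy a preset condition.
The preset condition in the above step can be an aggregation condition set in advance according to actual needs. For example, the condition can be to take all unsubmitted queue entries in the used ring directly, or to take the unsubmitted queue entries that correspond to the same network card transmission queue, but it is not limited to these.
For example, still taking the virtio device virtualization architecture combining software and hardware shown in FIG. 4, assume that the preset condition is to take all unsubmitted queue entries in the used ring. Queue entry 0 to queue entry 2 can then be taken as the first data.
Step S306: perform an aggregation operation on the pieces of first data to generate a first aggregation result.
In an optional embodiment, the aggregation operation can be to splice together the description information corresponding to the pieces of first data, for example by concatenating the addresses and summing the lengths, to obtain the first aggregation result. It is not limited to this, and other aggregation operations can also be used.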
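As a rough sketch of the splicing just described, the snippet below concatenates the buffer addresses and sums the lengths of several entries. The `UsedEntry` and `AggregationResult` types and their fields are hypothetical names introduced for illustration; the text does not prescribe a concrete layout for the description information or for the aggregation result.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class UsedEntry:
    """Description information of one piece of original data (hypothetical layout)."""
    buffer_addr: int  # address of the data buffer holding the original data
    length: int       # length of the original data in bytes


@dataclass(frozen=True)
class AggregationResult:
    buffer_addrs: Tuple[int, ...]  # addresses concatenated in ring order
    total_length: int              # sum of the individual lengths


def aggregate(entries: Sequence[UsedEntry]) -> AggregationResult:
    """Concatenate the addresses and sum the lengths of the given entries."""
    if not entries:
        raise ValueError("nothing to aggregate")
    return AggregationResult(
        buffer_addrs=tuple(e.buffer_addr for e in entries),
        total_length=sum(e.length for e in entries),
    )
```

Keeping the addresses in ring order means the result can later be written back to consecutive used ring slots without reordering.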
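Step S308: send a direct memory access request carrying the first aggregation result to the memory of the host.
In an optional embodiment, so that the memory corresponding to the used ring can be updated in one go, the first aggregation result can be encapsulated according to the DMA protocol to obtain a DMA request, and the DMA request can be sent to the memory of the host, which completes the update of the used ring.
For example, take the used ring shown in FIG. 5. None of the 8 queue entries contained in the used ring had been submitted to memory; at this point queue entry 0 has been submitted, so queue entry 1 to queue entry 7 can be taken as the initial data. Queue entry 1 to queue entry 3 can be selected as the first data by the preset condition, and the three queue entries are aggregated to generate one DMA request, which updates entries 1 to 3 of the used ring in one go. The boxes for queue entry 1 to queue entry 3 are changed to solid boxes, indicating that queue entry 1 to queue entry 3 have been submitted.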
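The FIG. 5 scenario can be mimicked with a toy model. The sketch below is not a DMA implementation: `HostUsedRing` is a hypothetical class that stands in for the used ring in host memory and simply counts how many write requests reach it. Wrap-around at the end of the ring is deliberately not handled.

```python
class HostUsedRing:
    """Toy model of the used ring in host memory; counts DMA write requests."""

    def __init__(self, size: int = 8) -> None:
        self.submitted = [False] * size
        self.dma_requests = 0

    def dma_write(self, first: int, count: int) -> None:
        """One DMA request that marks `count` consecutive entries as submitted."""
        if count <= 0 or first < 0 or first + count > len(self.submitted):
            raise ValueError("range outside the ring")
        for slot in range(first, first + count):
            self.submitted[slot] = True
        self.dma_requests += 1


ring = HostUsedRing(size=8)
ring.dma_write(0, 1)  # queue entry 0 was submitted earlier
ring.dma_write(1, 3)  # queue entries 1 to 3 go out in a single request
```

In this model entries 0 to 3 end up submitted after two requests, where submitting each entry separately would have taken four.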
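According to the solution provided by the above embodiments of the present application, when the used ring needs to be updated, multiple pieces of initial data stored in the used ring can be obtained, multiple pieces of first data that satisfy the preset condition can be selected from them, an aggregation operation can be performed on the pieces of first data to generate a first aggregation result, and a DMA request carrying the first aggregation result can be sent to memory, so that multiple queue entries of the used ring are updated at once. Notably, a single DMA request is initiated through the aggregation operation, so no DMA request needs to be initiated for each individual queue entry. This reduces the number of operations generated by updating the used ring, avoids back pressure on the PCIe interface as seen from the device side, and improves DMA performance, thereby solving the technical problem in the related art that frequent interaction between the paravirtualized device and the host degrades DMA performance.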
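In the above embodiments of the present application, determining, among the pieces of initial data, multiple pieces of first data that satisfy the preset condition includes at least one of the following: obtaining the data stored in a target cache to obtain the pieces of first data, where the pieces of initial data are cached in the target cache in sequence; when a preset timing time is reached, determining the pieces of initial data to be the pieces of first data; when the number of pieces of initial data is greater than or equal to a preset number, determining the pieces of initial data to be the pieces of first data.
The target cache can be a preconfigured adaptive cache. When PCIe performance hits a bottleneck, data accumulates in the adaptive cache. In an optional embodiment, the queue entries of the used ring are all stored in the adaptive cache in sequence before they are submitted to the CPU, so all entries stored in the adaptive cache can be taken as the first data.
The preset timing time can be the time of a preset timer and can be set according to actual needs. In an optional embodiment, before the preset timing time is reached, nothing needs to be done with the queue entries stored in the used ring; after it is reached, all unsubmitted queue entries stored in the used ring can be taken as the first data.
The preset number can be a preset maximum aggregation count and can be set according to actual needs. In an optional embodiment, before the number of unsubmitted queue entries stored in the used ring reaches the preset number, nothing needs to be done with the queue entries stored in the used ring; after it reaches the preset number, all unsubmitted queue entries stored in the used ring can be taken as the first data.
It should be noted that the above three conditions can be used individually or in any combination: the target cache with the preset timing time, the target cache with the preset number, the preset timing time with the preset number, or all three together. The specific combination can be determined according to actual needs, and the present application does not specifically limit it.
For example, still taking the virtio device virtualization architecture combining software and hardware shown in FIG. 4, the preset condition can include the adaptive cache, the timer, and the maximum aggregation count. The three methods can be used together to determine the first data, on which the aggregation operation is then performed.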
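One way the three conditions could be combined is sketched below, as an illustration rather than the implementation described here. Pending entries accumulate in a list that plays the role of the adaptive cache, and a batch is released when either the timer has run out or the maximum aggregation count is reached. The `Batcher` class, its parameters, and the choice to pass the current time in explicitly (which keeps the example deterministic) are all assumptions.

```python
from typing import Any, List, Optional


class Batcher:
    """Releases cached entries when a timeout or a maximum count is reached."""

    def __init__(self, max_count: int, timeout: float) -> None:
        self.max_count = max_count
        self.timeout = timeout
        self._pending: List[Any] = []            # stands in for the adaptive cache
        self._first_ts: Optional[float] = None   # arrival time of the oldest entry

    def push(self, entry: Any, now: float) -> List[Any]:
        """Cache one entry; return a batch if a flush condition is met, else []."""
        if not self._pending:
            self._first_ts = now
        self._pending.append(entry)
        return self._flush_if_due(now)

    def poll(self, now: float) -> List[Any]:
        """Check the timer without adding an entry."""
        return self._flush_if_due(now)

    def _flush_if_due(self, now: float) -> List[Any]:
        if not self._pending:
            return []
        timed_out = now - self._first_ts >= self.timeout
        full = len(self._pending) >= self.max_count
        if timed_out or full:
            batch, self._pending = self._pending, []
            self._first_ts = None
            return batch
        return []
```

Each non-empty batch returned by `push` or `poll` would then be handed to the aggregation step. The timer bounds the latency added to a lone entry, while the count bounds the size of a single request.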
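In the above embodiments of the present application, performing an aggregation operation on the pieces of first data to generate a first aggregation result includes: determining the first transmission queue corresponding to each piece of first data, where the first transmission queue is used to transmit the original data; and obtaining the pieces of first data corresponding to the same first transmission queue to obtain the first aggregation result.
Because a virtio device supports multiple queues, the first transmission queue can be a queue supported by the virtio device. A virtio device can use multiple queues to transmit original data to the host, and each queue corresponds to one used ring.
In an optional embodiment, the queue to which the corresponding original data is sent can be determined for each queue entry. Before aggregation, queue entries belonging to different queues are usually interleaved and cannot be aggregated directly. Scheduling by queue can therefore be used to arrange queue entries of the same queue together, after which the queue entries belonging to the same queue can be aggregated to obtain the first aggregation result.
For example, still taking the virtio device virtualization architecture combining software and hardware shown in FIG. 4: as shown in FIG. 6, boxes filled with different patterns represent queue entries of different queues, and the different numbers represent sequence numbers within a queue. Before scheduling, queue entries of different queues are interleaved; after scheduling, queue entries of the same queue are adjacent, so the queue entries of the same queue can be aggregated and submitted to memory in one go.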
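Scheduling by queue, as pictured in FIG. 6, can be sketched as a grouping step. In the snippet below each queue entry is represented as a hypothetical `(queue_id, sequence_number)` pair; the function gathers interleaved entries into one group per queue while keeping the arrival order inside each queue. This is a minimal illustration, not the scheduler described here.

```python
from typing import Dict, List, Sequence, Tuple

QueueEntry = Tuple[int, int]  # (queue_id, sequence number within that queue)


def schedule_by_queue(entries: Sequence[QueueEntry]) -> List[List[QueueEntry]]:
    """Group interleaved entries so that entries of the same queue are adjacent."""
    groups: Dict[int, List[QueueEntry]] = {}
    for entry in entries:
        queue_id = entry[0]
        # dict preserves insertion order, so queues appear in first-seen order
        groups.setdefault(queue_id, []).append(entry)
    return list(groups.values())
```

Each returned group corresponds to one used ring, so it can be aggregated and submitted with a single request.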
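In the above embodiments of the present application, obtaining the pieces of first data corresponding to the same first transmission queue to obtain the first aggregation result includes: determining the first transmission queue corresponding to the first of the pieces of first data to obtain a target transmission queue; and obtaining the pieces of first data corresponding to the target transmission queue to obtain the first aggregation result.
In an optional embodiment, the earlier a queue entry's storage position in the used ring, the earlier the paravirtualized device processed it. To reduce the waiting time of the device that sent the original data, the queue corresponding to the queue entry stored earliest among the first data can be taken as the target queue, and the aggregation operation can be performed on all first data corresponding to the target queue to obtain the first aggregation result. The first data corresponding to other queues are processed only after submission of the first data corresponding to the target queue has completed.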
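The selection of the target queue can be sketched as follows, again using hypothetical `(queue_id, sequence_number)` pairs for queue entries. The queue of the earliest entry is served first, and entries of every other queue are deferred to a later round. The function name and return shape are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

QueueEntry = Tuple[int, int]  # (queue_id, sequence number within that queue)


def split_by_target_queue(
    entries: Sequence[QueueEntry],
) -> Tuple[Optional[int], List[QueueEntry], List[QueueEntry]]:
    """Return (target queue, entries to aggregate now, entries that must wait)."""
    if not entries:
        return None, [], []
    target = entries[0][0]  # queue of the earliest stored entry
    selected = [e for e in entries if e[0] == target]
    deferred = [e for e in entries if e[0] != target]
    return target, selected, deferred
```

Calling the function again on the deferred list picks the next target queue, so the oldest outstanding entry is always served first.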
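In the above embodiments of the present application, the method further includes: determining multiple pieces of second data among the pieces of initial data based on the initial queue identifier corresponding to the completion queue, where the initial queue identifier represents identification information of the data whose submission has been completed and the pieces of second data represent the data currently being submitted to the host; performing an aggregation operation on the pieces of second data to obtain a second aggregation result; and updating the initial queue identifier based on the second aggregation result.
The initial queue identifier can be the used ring index, which points to the first unsubmitted queue entry in the used ring. The update can be an update of the value of the used ring index.
In an optional embodiment, the used ring index needs to be updated after the used ring is updated. Because multiple queue entries in the used ring were updated, the used ring index would have to be updated multiple times. To avoid the large number of operations caused by updating the used ring index multiple times, the most recently submitted queue entries can be determined to be the second data, and the multiple DMA requests can be merged into one. That is, the aggregation operation is performed on the second data to obtain the second aggregation result, and the used ring index is updated once.
For example, still taking the virtio device virtualization architecture combining software and hardware shown in FIG. 4: after the write-used-ring step, the write-used-ring-index step can be performed. The aggregation operation is performed on queue entry 0 to queue entry 2, and the used ring index is updated based on the second aggregation result, setting its value to 3.
For example, take the used ring shown in FIG. 5. None of the 8 queue entries contained in the used ring had been submitted to memory; at this point queue entry 0 has been submitted, so the value of the used ring index is 1, that is, the initial queue identifier is 1. Entries 1 to 3 of the used ring are then updated in one go, so the value of the used ring index can be updated directly to 4.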
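The single index update in the two examples above amounts to adding the number of newly submitted entries to the current index. The sketch below shows that arithmetic. The function name is hypothetical, and the wrap-around at 2^16 is an assumption that treats the index as a free-running 16-bit counter, as in the split virtqueue layout; the text itself does not discuss wrap-around.

```python
def advance_used_index(current_index: int, submitted_count: int,
                       modulus: int = 1 << 16) -> int:
    """Return the used ring index after one aggregated submission."""
    if submitted_count < 0:
        raise ValueError("submitted_count must not be negative")
    return (current_index + submitted_count) % modulus


if __name__ == "__main__":
    print(advance_used_index(0, 3))  # FIG. 4 example: entries 0 to 2 submitted
    print(advance_used_index(1, 3))  # FIG. 5 example: entries 1 to 3 submitted
```

Only the final value is written to the host, so three per-entry index writes collapse into one.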
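It should be noted that the completion queue has a different queue structure in different versions of virtio devices. For newer versions of virtio devices, if the used ring index does not need to be updated, the above steps need not be performed.
In the above embodiments of the present application, performing an aggregation operation on the pieces of second data to obtain a second aggregation result includes: determining the second transmission queue corresponding to each piece of second data; and obtaining the pieces of second data corresponding to the same second transmission queue to obtain the second aggregation result.
The second transmission queue can also be a queue supported by the virtio device. A virtio device can use multiple queues to transmit original data to the host, and each queue corresponds to one used ring.
In an optional embodiment, similarly to the aggregation of queue entries in the used ring, scheduling by queue can be used to arrange queue entries of the same queue together, after which the queue entries belonging to the same queue can be aggregated to obtain the second aggregation result.
For example, still taking the virtio device virtualization architecture combining software and hardware shown in FIG. 4: as shown in FIG. 6, boxes filled with different patterns represent queue entries of different queues. Before scheduling, queue entries of different queues are interleaved; after scheduling, queue entries of the same queue are adjacent, so the queue entries of the same queue can be aggregated and the used ring index updated in one go.
It should be noted that scheduling by queue need only be performed once. That is, if scheduling by queue has already been used for the aggregation operation while the used ring was updated, it need not be used for the aggregation operation when the used ring index is updated; if scheduling by queue was not used for the aggregation operation while the used ring was updated, it is used for the aggregation operation when the used ring index is updated.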
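In the above embodiments of the present application, the method further includes: obtaining multiple pieces of original data; determining the third transmission queue corresponding to each piece of original data; sorting the pieces of original data according to the third transmission queue to obtain sorted data, in which pieces of original data corresponding to the same third transmission queue are adjacent; and writing the sorted data into the data buffer of the host in sequence.
The third transmission queue can also be a queue supported by the virtio device. A virtio device can use multiple queues to transmit original data to the host.
In an optional embodiment, before aggregation, original data belonging to different queues are usually interleaved and cannot be aggregated directly. Scheduling by queue can therefore be used to arrange original data of the same queue together, after which the sorted data are written into the data buffer in sequence.
For example, still taking the virtio device virtualization architecture combining software and hardware shown in FIG. 4: as shown in FIG. 6, boxes filled with different patterns represent queue entries of different queues. Before scheduling, data belonging to different queues are interleaved; after scheduling, data of the same queue are adjacent. The data can therefore be stored in the data buffer in sequence, which ensures that, among the queue entries stored in the used ring, those belonging to the same queue are adjacent.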
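Sorting the original data by queue before the buffer write can be sketched with a stable sort, which keeps the arrival order within each queue. The `OriginalData` type, its fields, and the list standing in for the host data buffer are hypothetical; ordering the queues by ascending `queue_id` is also just one possible choice, since the text only requires that data of the same queue end up adjacent.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class OriginalData:
    queue_id: int   # third transmission queue this piece of data belongs to
    payload: bytes  # the data itself, e.g. a packet


def sort_by_queue(items: Sequence[OriginalData]) -> List[OriginalData]:
    """Stable sort: same-queue items become adjacent and keep their arrival order."""
    return sorted(items, key=lambda item: item.queue_id)


def write_to_buffers(items: Sequence[OriginalData]) -> List[bytes]:
    """Write the sorted payloads, in order, into a list standing in for the data buffer."""
    return [item.payload for item in sort_by_queue(items)]
```

Because the buffers are filled in sorted order, the used ring entries created for them are already grouped by queue.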
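It should be noted that scheduling by queue need only be performed once. If the original data is cached and scheduled by queue before the data buffer is written, the subsequent updates of the used ring and of the used ring index need not use scheduling by queue for their aggregation operations. If the original data is not scheduled by queue before the data buffer is written, but scheduling by queue is used for the aggregation operation while the used ring is updated, the update of the used ring index need not use it. If scheduling by queue is not used while the used ring is updated either, it is used for the aggregation operation when the used ring index is updated.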
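The schedule-once rule reduces to a small decision: the used ring index update has to schedule by queue only if neither earlier stage did. The sketch below encodes that rule; the function and parameter names are hypothetical.

```python
def index_update_needs_queue_scheduling(scheduled_before_buffer_write: bool,
                                        scheduled_at_used_ring_update: bool) -> bool:
    """True only when no earlier stage has already scheduled the data by queue."""
    return not (scheduled_before_buffer_write or scheduled_at_used_ring_update)
```

The same reasoning applies one stage earlier: the used ring update can skip scheduling by queue whenever the data was already scheduled before the buffer write.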
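It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps can be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
From the description of the above implementations, those skilled in the art will clearly understand that the method according to the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the various embodiments of the present application.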
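Embodiment 2
According to an embodiment of the present application, a data processing apparatus based on a paravirtualized device, for implementing the above data processing method based on a paravirtualized device, is also provided. As shown in FIG. 7, the apparatus 700 includes: a data acquisition module 702, a data determination module 704, an aggregation module 706, and a sending module 708.
The data acquisition module 702 is used to obtain multiple pieces of initial data stored in the completion queue of the paravirtualized device, where the pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; the data determination module 704 is used to determine, among the pieces of initial data, multiple pieces of first data that satisfy a preset condition; the aggregation module 706 is used to perform an aggregation operation on the pieces of first data to generate a first aggregation result; the sending module 708 is used to send a direct memory access request carrying the first aggregation result to the memory of the host.
It should be noted here that the above data acquisition module 702, data determination module 704, aggregation module 706, and sending module 708 correspond to steps S302 to S308 in Embodiment 1. The examples and application scenarios implemented by the four modules are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1 above. It should be noted that, as part of the apparatus, the above modules can run in the computer terminal 10 provided in Embodiment 1.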
在本申请上述实施例中,数据确定模块包括如下至少之一:数据获取单元、第一数据确定单元和第二数据确定单元。
其中,数据获取单元用于获取目标缓存中存储的数据,得到多个第一数据,其中,多个初始数据依次缓存至目标缓存;第一确定单元用于在预设定时时间到达的情况下,确定多个初始数据为多个第一数据;第二确定单元用于在多个初始数据的数量大于或等于预设数量的情况下,确定多个初始数据为多个第一数据。
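The timer and count conditions above can be sketched as a small batcher. This is a minimal illustration rather than the apparatus itself: the class name `CompletionBatcher`, the parameters `max_count` and `max_wait_s`, and the injectable `clock` are assumptions introduced for the example, and the list `cache` stands in for the target cache.

```python
import time


class CompletionBatcher:
    """Caches initial data and releases it when a preset condition is met."""

    def __init__(self, max_count, max_wait_s, clock=time.monotonic):
        self.max_count = max_count    # preset number of cached items
        self.max_wait_s = max_wait_s  # preset timer period in seconds
        self.clock = clock            # injectable for testing
        self.cache = []               # stand-in for the target cache
        self.first_arrival = None     # time the oldest cached item arrived

    def add(self, item):
        """Cache one piece of initial data, in arrival order."""
        if not self.cache:
            self.first_arrival = self.clock()
        self.cache.append(item)

    def poll(self):
        """Return the cached items as first data if a condition holds, else []."""
        if not self.cache:
            return []
        timed_out = self.clock() - self.first_arrival >= self.max_wait_s
        full = len(self.cache) >= self.max_count
        if timed_out or full:
            batch, self.cache = self.cache, []
            self.first_arrival = None
            return batch
        return []
```

Combining both conditions bounds latency (the timer) as well as batch size (the count); either condition alone would match one of the options listed in the text.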
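In the above embodiments of the present application, the aggregation module includes: a queue determination unit and a result acquisition unit.
The queue determination unit is configured to determine the first transmission queue corresponding to each piece of first data, where the first transmission queue is used to transmit the original data. The result acquisition unit is configured to acquire the first data corresponding to the same first transmission queue to obtain the first aggregation result.
In the above embodiments of the present application, the result acquisition unit is further configured to determine the first transmission queue corresponding to the first piece of first data among the multiple pieces of first data to obtain a target transmission queue, and to acquire the first data corresponding to the target transmission queue among the multiple pieces of first data to obtain the first aggregation result.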
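The target-queue selection above can be sketched in a few lines. The function name `aggregate_by_target_queue` and the `(queue_id, payload)` tuple layout are hypothetical. Returning the leftover items is also an assumption of this sketch; the text does not say how first data of other queues are handled afterwards.

```python
def aggregate_by_target_queue(first_data):
    """Aggregate the first data that share the queue of the first item.

    first_data: list of (queue_id, payload) tuples (hypothetical layout).
    Returns (target_queue, aggregated, remaining).
    """
    if not first_data:
        return None, [], []
    target_queue = first_data[0][0]  # queue of the first piece of first data
    aggregated = [d for d in first_data if d[0] == target_queue]
    remaining = [d for d in first_data if d[0] != target_queue]
    return target_queue, aggregated, remaining
```

In this sketch, `aggregated` plays the role of the first aggregation result that would be carried by a single DMA request.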
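In the above embodiments of the present application, the apparatus further includes: an update module.
The data determination module is further configured to determine multiple pieces of second data among the initial data based on the initial queue identifier corresponding to the completion queue, where the initial queue identifier represents identification information of the data that have already been submitted, and the second data represent the data currently being submitted to the host. The aggregation module is further configured to aggregate the second data to obtain a second aggregation result. The update module is configured to update the initial queue identifier based on the second aggregation result.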
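One possible reading of the identifier update above is sketched below, treating the initial queue identifier as an index that counts already-submitted entries. This interpretation, the class name `UsedIndexTracker`, the `(slot, payload)` layout with contiguous running slot numbers, and the omission of index wrap-around are all assumptions made for brevity.

```python
class UsedIndexTracker:
    """Tracks the already-submitted index (the 'initial queue identifier')."""

    def __init__(self, committed_idx=0):
        self.committed_idx = committed_idx
        self.index_writes = 0  # number of index updates sent to the host

    def second_data(self, entries):
        """Select entries not yet covered by the committed index.

        entries: list of (slot, payload); slot is a running position counter.
        """
        return [e for e in entries if e[0] >= self.committed_idx]

    def publish(self, entries):
        """Aggregate all pending entries into a single index update."""
        pending = self.second_data(entries)
        if pending:
            self.committed_idx += len(pending)  # one update for the batch
            self.index_writes += 1
        return pending
```

However many entries are pending, `publish` performs at most one index write, which is the point of aggregating the second data.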
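In the above embodiments of the present application, the aggregation module includes: a queue determination unit and a result acquisition unit.
The queue determination unit is configured to determine the second transmission queue corresponding to each piece of second data. The result acquisition unit is configured to acquire the second data corresponding to the same second transmission queue to obtain the second aggregation result.
In the above embodiments of the present application, the apparatus further includes: a data acquisition module, a queue determination module, a sorting module, and a writing module.
The data acquisition module is configured to acquire multiple pieces of original data. The queue determination module is configured to determine the third transmission queue corresponding to each piece of original data. The sorting module is configured to sort the original data by third transmission queue to obtain sorted data, in which original data corresponding to the same third transmission queue are adjacent. The writing module is configured to write the sorted data into the data buffer of the host in order.
It should be noted that the preferred implementations involved in the above embodiments of the present application are the same as the solutions, application scenarios, and implementation processes provided in Embodiment 1, but are not limited to the solutions provided in Embodiment 1.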
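Embodiment 3
According to an embodiment of the present application, a data processing system based on a paravirtualized device is also provided for implementing the above data processing method based on a paravirtualized device. As shown in FIG. 8, the system includes:
a host 82, including: a memory and a completion queue.
a paravirtualized device 84, connected to the host and configured to: acquire multiple pieces of initial data stored in the completion queue, where the initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; determine, among the initial data, multiple pieces of first data that satisfy a preset condition; aggregate the first data to generate a first aggregation result; and send a direct memory access request carrying the first aggregation result to the memory of the host.
In the above embodiments of the present application, the paravirtualized device is further configured to perform at least one of the following steps: acquiring the data stored in a target cache to obtain the multiple pieces of first data, where the initial data are cached into the target cache in sequence; determining the initial data to be the first data when a preset timer period has elapsed; and determining the initial data to be the first data when the number of pieces of initial data is greater than or equal to a preset number.
In the above embodiments of the present application, the paravirtualized device is further configured to determine the first transmission queue corresponding to each piece of first data, and to acquire the first data corresponding to the same first transmission queue to obtain the first aggregation result, where the first transmission queue is used to transmit the original data.
In the above embodiments of the present application, the paravirtualized device is further configured to determine the first transmission queue corresponding to the first piece of first data among the multiple pieces of first data to obtain a target transmission queue, and to acquire the first data corresponding to the target transmission queue among the multiple pieces of first data to obtain the first aggregation result.
In the above embodiments of the present application, the paravirtualized device is further configured to: determine multiple pieces of second data among the initial data based on the initial queue identifier corresponding to the completion queue, where the initial queue identifier represents identification information of the data that have already been submitted, and the second data represent the data currently being submitted to the host; aggregate the second data to obtain a second aggregation result; and update the initial queue identifier based on the second aggregation result.
In the above embodiments of the present application, the paravirtualized device is further configured to determine the second transmission queue corresponding to each piece of second data, and to acquire the second data corresponding to the same second transmission queue to obtain the second aggregation result.
In the above embodiments of the present application, the host further includes: a data buffer. The paravirtualized device is further configured to: acquire multiple pieces of original data; determine the third transmission queue corresponding to each piece of original data; sort the original data by third transmission queue to obtain sorted data, in which original data corresponding to the same third transmission queue are adjacent; and write the sorted data into the data buffer in order.
It should be noted that the preferred implementations involved in the above embodiments of the present application are the same as the solutions, application scenarios, and implementation processes provided in Embodiment 1, but are not limited to the solutions provided in Embodiment 1.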
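Embodiment 4
An embodiment of the present application may provide a computer terminal, which may be any computer terminal device in a group of computer terminals. Optionally, in this embodiment, the computer terminal may also be replaced by a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one of multiple network devices of a computer network.
In this embodiment, the computer terminal may execute program code for the following steps of the data processing method based on a paravirtualized device: acquiring multiple pieces of initial data stored in the completion queue of the paravirtualized device, where the initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; determining, among the initial data, multiple pieces of first data that satisfy a preset condition; aggregating the first data to generate a first aggregation result; and sending a direct memory access request carrying the first aggregation result to the memory of the host.
Optionally, FIG. 9 is a structural block diagram of a computer terminal according to an embodiment of the present application. As shown in FIG. 9, the computer terminal A may include: one or more processors 902 (only one is shown in the figure) and a memory 904.
The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the data processing method and apparatus based on a paravirtualized device in the embodiments of the present application. By running the software programs and modules stored in the memory, the processor executes various functional applications and data processing, that is, implements the above data processing method based on a paravirtualized device. The memory may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and such remote memory may be connected to terminal A through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may, through a transmission device, invoke the information and application programs stored in the memory to perform the following steps: acquiring multiple pieces of initial data stored in the completion queue of the paravirtualized device, where the initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; determining, among the initial data, multiple pieces of first data that satisfy a preset condition; aggregating the first data to generate a first aggregation result; and sending a direct memory access request carrying the first aggregation result to the memory of the host.
Optionally, the processor may also execute program code for the following steps: acquiring the data stored in a target cache to obtain the multiple pieces of first data, where the initial data are cached into the target cache in sequence; and/or determining the initial data to be the first data when a preset timer period has elapsed; and/or determining the initial data to be the first data when the number of pieces of initial data is greater than or equal to a preset number.
Optionally, the processor may also execute program code for the following steps: determining the first transmission queue corresponding to each piece of first data, where the first transmission queue is used to transmit the original data; and acquiring the first data corresponding to the same first transmission queue to obtain the first aggregation result.
Optionally, the processor may also execute program code for the following steps: determining the first transmission queue corresponding to the first piece of first data among the multiple pieces of first data to obtain a target transmission queue; and acquiring the first data corresponding to the target transmission queue among the multiple pieces of first data to obtain the first aggregation result.
Optionally, the processor may also execute program code for the following steps: determining multiple pieces of second data among the initial data based on the initial queue identifier corresponding to the completion queue, where the initial queue identifier represents identification information of the data that have already been submitted, and the second data represent the data currently being submitted to the host; aggregating the second data to obtain a second aggregation result; and updating the initial queue identifier based on the second aggregation result.
Optionally, the processor may also execute program code for the following steps: determining the second transmission queue corresponding to each piece of second data; and acquiring the second data corresponding to the same second transmission queue to obtain the second aggregation result.
Optionally, the processor may also execute program code for the following steps: acquiring multiple pieces of original data; determining the third transmission queue corresponding to each piece of original data; sorting the original data by third transmission queue to obtain sorted data, in which original data corresponding to the same third transmission queue are adjacent; and writing the sorted data into the data buffer of the host in order.
The embodiments of the present application provide a data processing solution based on a paravirtualized device. A single DMA request is issued for an aggregated batch, so no separate DMA request is needed for each queue entry. This reduces the number of operations generated by updating the used ring, avoids back-pressure on the PCIe interface as seen from the device side, and improves DMA performance, thereby solving the technical problem in the related art that frequent interaction between the paravirtualized device and the host degrades DMA performance.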
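The reduction in DMA requests described above can be illustrated with a toy count. The sketch assumes, hypothetically, that without aggregation one DMA request is issued per queue entry, and that with aggregation one request is issued per run of adjacent entries belonging to the same queue. The function names and the `(queue_id, payload)` layout are made up for the example, and no real DMA is performed.

```python
from itertools import groupby


def dma_requests_unaggregated(entries):
    """One DMA request per queue entry."""
    return len(entries)


def dma_requests_aggregated(entries):
    """One DMA request per run of adjacent entries of the same queue."""
    # groupby only merges consecutive items with equal keys, which mirrors
    # the requirement that same-queue entries be adjacent before aggregation.
    return sum(1 for _ in groupby(entries, key=lambda e: e[0]))
```

For six entries alternating between two queues, the aggregated count is still six, because no two neighbours share a queue; after the entries are reordered by queue it drops to two. This is why per-queue scheduling precedes aggregation.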
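Those of ordinary skill in the art will understand that the structure shown in FIG. 9 is only illustrative. The computer terminal may also be a terminal device such as a smartphone (for example, an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. FIG. 9 does not limit the structure of the above electronic device. For example, computer terminal A may include more or fewer components than shown in FIG. 9 (such as a network interface or a display device), or may have a configuration different from that shown in FIG. 9.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be completed by a program instructing hardware associated with a terminal device. The program may be stored in a computer-readable storage medium, which may include: a flash drive, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, or the like.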
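Embodiment 5
An embodiment of the present application also provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store the program code executed by the data processing method based on a paravirtualized device provided in the above embodiments.
Optionally, in this embodiment, the storage medium may be located in any computer terminal in a group of computer terminals in a computer network, or in any mobile terminal in a group of mobile terminals.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring multiple pieces of initial data stored in the completion queue of the paravirtualized device, where the initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; determining, among the initial data, multiple pieces of first data that satisfy a preset condition; aggregating the first data to generate a first aggregation result; and sending a direct memory access request carrying the first aggregation result to the memory of the host.
Optionally, the storage medium is further configured to store program code for performing the following steps: acquiring the data stored in a target cache to obtain the multiple pieces of first data, where the initial data are cached into the target cache in sequence; and/or determining the initial data to be the first data when a preset timer period has elapsed; and/or determining the initial data to be the first data when the number of pieces of initial data is greater than or equal to a preset number.
Optionally, the storage medium is further configured to store program code for performing the following steps: determining the first transmission queue corresponding to each piece of first data, where the first transmission queue is used to transmit the original data; and acquiring the first data corresponding to the same first transmission queue to obtain the first aggregation result.
Optionally, the storage medium is further configured to store program code for performing the following steps: determining the first transmission queue corresponding to the first piece of first data among the multiple pieces of first data to obtain a target transmission queue; and acquiring the first data corresponding to the target transmission queue among the multiple pieces of first data to obtain the first aggregation result.
Optionally, the storage medium is further configured to store program code for performing the following steps: determining multiple pieces of second data among the initial data based on the initial queue identifier corresponding to the completion queue, where the initial queue identifier represents identification information of the data that have already been submitted, and the second data represent the data currently being submitted to the host; aggregating the second data to obtain a second aggregation result; and updating the initial queue identifier based on the second aggregation result.
Optionally, the storage medium is further configured to store program code for performing the following steps: determining the second transmission queue corresponding to each piece of second data; and acquiring the second data corresponding to the same second transmission queue to obtain the second aggregation result.
Optionally, the storage medium is further configured to store program code for performing the following steps: acquiring multiple pieces of original data; determining the third transmission queue corresponding to each piece of original data; sorting the original data by third transmission queue to obtain sorted data, in which original data corresponding to the same third transmission queue are adjacent; and writing the sorted data into the data buffer of the host in order.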
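The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present application, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other ways. The apparatus embodiments described above are only illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred implementations of the present application. It should be pointed out that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present application, and these improvements and refinements should also be regarded as falling within the scope of protection of the present application.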

Claims (11)

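  1. A data processing method based on a paravirtualized device, characterized by comprising:
    acquiring multiple pieces of initial data stored in a completion queue of the paravirtualized device, wherein the multiple pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to a host;
    determining, among the multiple pieces of initial data, multiple pieces of first data that satisfy a preset condition;
    performing an aggregation operation on the multiple pieces of first data to generate a first aggregation result;
    sending a direct memory access request carrying the first aggregation result to a memory of the host.
  2. The method according to claim 1, characterized in that determining, among the multiple pieces of initial data, multiple pieces of first data that satisfy a preset condition comprises at least one of the following:
    acquiring data stored in a target cache to obtain the multiple pieces of first data, wherein the multiple pieces of initial data are cached into the target cache in sequence;
    determining the multiple pieces of initial data to be the multiple pieces of first data when a preset timer period has elapsed;
    determining the multiple pieces of initial data to be the multiple pieces of first data when the number of the multiple pieces of initial data is greater than or equal to a preset number.
  3. The method according to claim 1, characterized in that performing an aggregation operation on the multiple pieces of first data to generate a first aggregation result comprises:
    determining a first transmission queue corresponding to each piece of first data, wherein the first transmission queue is used to transmit the original data;
    acquiring the first data corresponding to the same first transmission queue to obtain the first aggregation result.
  4. The method according to claim 3, characterized in that acquiring the first data corresponding to the same first transmission queue to obtain the first aggregation result comprises:
    determining the first transmission queue corresponding to the first piece of first data among the multiple pieces of first data to obtain a target transmission queue;
    acquiring the first data corresponding to the target transmission queue among the multiple pieces of first data to obtain the first aggregation result.
  5. The method according to claim 1, characterized in that the method further comprises:
    determining multiple pieces of second data among the multiple pieces of initial data based on an initial queue identifier corresponding to the completion queue, wherein the initial queue identifier represents identification information of data that have already been submitted, and the multiple pieces of second data represent data currently being submitted to the host;
    performing an aggregation operation on the multiple pieces of second data to obtain a second aggregation result;
    updating the initial queue identifier based on the second aggregation result.
  6. The method according to claim 5, characterized in that performing an aggregation operation on the multiple pieces of second data to obtain a second aggregation result comprises:
    determining a second transmission queue corresponding to each piece of second data;
    acquiring the second data corresponding to the same second transmission queue to obtain the second aggregation result.
  7. The method according to claim 1, characterized in that the method further comprises:
    acquiring multiple pieces of original data;
    determining a third transmission queue corresponding to each piece of original data;
    sorting the multiple pieces of original data by the third transmission queue to obtain sorted data, wherein original data corresponding to the same third transmission queue are adjacent;
    writing the sorted data into a data buffer of the host in order.
  8. A data processing apparatus based on a paravirtualized device, characterized by comprising:
    a data acquisition module, configured to acquire multiple pieces of initial data stored in a completion queue of the paravirtualized device, wherein the multiple pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to a host;
    a data determination module, configured to determine, among the multiple pieces of initial data, multiple pieces of first data that satisfy a preset condition;
    an aggregation module, configured to perform an aggregation operation on the multiple pieces of first data to generate a first aggregation result;
    a sending module, configured to send a direct memory access request carrying the first aggregation result to a memory of the host.
  9. A data processing system based on a paravirtualized device, characterized by comprising:
    a host, comprising: a memory and a completion queue;
    the paravirtualized device, connected to the host and configured to: acquire multiple pieces of initial data stored in the completion queue, wherein the multiple pieces of initial data represent description information of original data that the paravirtualized device has finished processing but has not yet submitted to the host; determine, among the multiple pieces of initial data, multiple pieces of first data that satisfy a preset condition; perform an aggregation operation on the multiple pieces of first data to generate a first aggregation result; and send a direct memory access request carrying the first aggregation result to the memory of the host.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored program, wherein, when the program runs, a device on which the computer-readable storage medium is located is controlled to execute the data processing method based on a paravirtualized device according to any one of claims 1 to 7.
  11. A computer terminal, characterized by comprising: a memory and a processor, the processor being configured to run a program stored in the memory, wherein the program, when run, executes the data processing method based on a paravirtualized device according to any one of claims 1 to 7.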
PCT/CN2023/074420 2022-02-18 2023-02-03 Data processing method, apparatus and system based on a paravirtualized device WO2023155698A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210153414.2 2022-02-18
CN202210153414.2A CN114637574A (zh) 2022-02-18 2022-02-18 Data processing method, apparatus and system based on a paravirtualized device

Publications (1)

Publication Number Publication Date
WO2023155698A1 true WO2023155698A1 (zh) 2023-08-24

Family

ID=81945784

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074420 WO2023155698A1 (zh) 2022-02-18 2023-02-03 Data processing method, apparatus and system based on a paravirtualized device

Country Status (2)

Country Link
CN (1) CN114637574A (zh)
WO (1) WO2023155698A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114637574A (zh) * 2022-02-18 2022-06-17 阿里巴巴(中国)有限公司 Data processing method, apparatus and system based on a paravirtualized device

Also Published As

Publication number Publication date
CN114637574A (zh) 2022-06-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23755701

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023755701

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2023755701

Country of ref document: EP

Effective date: 20240219