Detailed Description
In order to enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application will be described below in detail with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, partial terms or terminology appearing in describing embodiments of the present application are applicable to the following explanation:
mdev: a device-node tool provided by BusyBox and used in embedded systems, equivalent to a simplified version of udev; it is used to create device nodes automatically when the system starts and when a driver is hot-plugged or dynamically loaded. The device nodes under the /dev directory of the file system are all created by mdev; during driver loading, device nodes are automatically created under the /dev directory according to the driver.
NVMe (Non-Volatile Memory Express): a standard protocol running over the PCIe interface, used for communication between a host and an SSD.
Virtio: an abstraction of a set of generic emulated devices in a paravirtualized hypervisor. This design allows the hypervisor to export a common set of emulated devices and make them available through a generic application programming interface (API). With a paravirtualized hypervisor, the guest operating system implements a common set of front-end interfaces, with device-specific emulation placed behind a set of back-end drivers. The back-end drivers need not be generic, since they only implement the behaviors required by the front end. In addition to the front-end drivers (implemented in the guest operating system) and the back-end drivers (implemented in the hypervisor), Virtio defines two further layers to support guest-to-hypervisor communication. At the top level (called virtio) is a virtual queue interface that conceptually attaches a front-end driver to a back-end driver; a driver may use zero or more queues, the exact number depending on its requirements.
PCIe (Peripheral Component Interconnect Express): a high-speed serial computer expansion bus standard. PCIe uses high-speed serial point-to-point dual-channel high-bandwidth transmission; connected devices are allocated dedicated channel bandwidth rather than sharing bus bandwidth, and the standard mainly supports functions such as active power management, error reporting, end-to-end reliable transmission, hot plugging, and QoS. PCIe is a layered protocol consisting of a transaction layer, a data link layer (which includes a media access control (MAC) sublayer), and a physical layer subdivided into logical and electrical sublayers. A PCIe link is built around dedicated unidirectional serial (1-bit) point-to-point connections called lanes. The physical logical sublayer includes a physical coding sublayer (PCS).
SR-IOV (Single Root I/O Virtualization): an extension of the PCIe specification. SR-IOV allows a device (e.g., a network adapter) to separate access to its resources among various PCIe hardware functions, including a PCIe physical function (PF) and one or more PCIe virtual functions (VFs), each VF being associated with a PF of the device. SR-IOV assigns each PF and VF a unique PCIe Requester ID (RID), which allows the IO memory management unit (IOMMU) to distinguish different traffic flows and apply memory and interrupt translation between the PF and VFs, so that a traffic flow is delivered directly to the appropriate Hyper-V parent or child partition; non-privileged data traffic thus flows from the PF to a VF without affecting the other VFs. SR-IOV enables network traffic to bypass the software switch layer of the Hyper-V virtualization stack: because a VF is assigned to a child partition, network traffic flows directly between the VF and the child partition, the IO overhead in the software emulation layer is reduced, and nearly the same performance as in a non-virtualized environment is achieved.
PF/VF (physical function/virtual function): the PF is a PCIe function supporting the SR-IOV interface; it includes the SR-IOV extended capability in the PCIe configuration space, which is used for configuring and managing the SR-IOV functions of the network adapter, such as enabling virtualization and exposing PCIe virtual functions (VFs). The PF miniport driver uses conventional NDIS miniport driver functionality to provide the management operating system with access to network IO resources, and to manage the resources allocated on the adapter for VFs. Accordingly, the PF miniport driver is loaded in the management operating system before any resources are allocated to a VF, and is stopped only after all resources allocated to VFs have been released.
The VFs are lightweight PCIe functions on the network adapter that support SR-IOV. Each VF is associated with a PF, represents a virtualized instance of the network adapter, has its own PCI configuration space, and shares one or more physical resources (e.g., external network ports) on the network adapter with the PF and the other VFs. A VF is not a self-contained PCIe device; rather, it provides a basic mechanism for transferring data directly between a Hyper-V child partition and the underlying SR-IOV network adapter. The software resources associated with the data transfer are directly available to the VF and used independently of the other VFs or the PF; most of the configuration of these resources is typically performed by the PF miniport driver running in the management operating system of the Hyper-V parent partition. The VF serves as a virtual network adapter for the guest operating system running in a Hyper-V child partition: after the VF is associated with a virtual port (VPort) on the NIC switch of the SR-IOV network adapter, the virtual PCI (VPCI) driver running in the VM can expose the VF network adapter, and once the VF network adapter is exposed, the PnP manager in the guest operating system can load the VF miniport driver.
IOMMU (IO Memory Management Unit): used to remap physical memory addresses to the addresses used by the child partitions. The IOMMU operates independently of the memory management hardware used by the processor.
DMA (Direct Memory Access): allows hardware devices of different speeds to communicate without imposing a massive interrupt load on the CPU. A DMA transfer copies data from one address space to another; the CPU initiates the transfer, but the transfer itself is performed and completed by the DMA controller, which not only avoids stalling the processor but also allows it to be rescheduled to other work, which is important for high-performance embedded system algorithms and networks. When a DMA transfer is performed, the DMA controller takes direct charge of the bus, so there is the issue of transferring bus control: before the DMA transfer, the CPU grants bus control to the DMA controller, and after the DMA transfer finishes, the DMA controller should immediately return bus control to the CPU. A complete DMA transfer process typically includes four steps: DMA request, DMA response, DMA transfer, and DMA end.
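For clarity, the four-step handover of bus control described above can be sketched as a toy state machine (illustrative only; a real DMA controller is hardware, and all names here are invented):

```python
# Illustrative sketch of the four-step DMA handshake described above.
class DmaController:
    def __init__(self):
        self.owns_bus = False     # whether the controller currently holds the bus
        self.log = []

    def request(self):            # 1. DMA request: device asks for the bus
        self.log.append("request")

    def respond(self):            # 2. DMA response: CPU grants bus control
        self.owns_bus = True
        self.log.append("respond")

    def transfer(self, src, dst, mem):   # 3. DMA transfer: controller moves data
        assert self.owns_bus, "must own the bus before transferring"
        mem[dst] = mem[src]
        self.log.append("transfer")

    def end(self):                # 4. DMA end: bus control returns to the CPU
        self.owns_bus = False
        self.log.append("end")

mem = {0x100: b"payload", 0x200: None}
dma = DmaController()
dma.request()
dma.respond()
dma.transfer(0x100, 0x200, mem)
dma.end()
```

Note that the CPU is involved only at the request/response and end steps; the transfer itself bypasses it, which is the point of DMA.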
Example 1
In the related art, for NVMe acceleration devices without the SR-IOV function, virtualization of the device is mainly implemented in software by QEMU (an open-source emulator), using approaches such as an mdev driver (which automatically creates device nodes) or Virtio (paravirtualization).
As shown in fig. 1, a method of virtualization through a QEMU mdev driver is provided. The device manufacturer emulates, in the mdev driver, the VM's access to the virtual PCIe configuration space and BAR space, and converts accesses to the BAR space into management and operation of the virtual Admin queue and IO queues; the mdev driver also needs to emulate IO interrupts so that the VM can operate the mdev-emulated virtual device with a standard driver. Specifically, the mdev driver emulates the Admin queue as follows: first, the relevant attributes of the physical NVMe disk are obtained through the driver of the physical NVMe device; for commands such as Identify, a corresponding response is constructed according to the attributes of the physical NVMe disk and the current mdev configuration, and a virtual interrupt request is sent; for operations such as creating or deleting an IO queue, a corresponding virtual IO queue is created, the driver of the physical NVMe disk is called to construct a corresponding physical IO queue, and the virtual IO queue and virtual interrupt are associated with the physical IO queue and physical interrupt respectively; for requests such as obtaining log information, the corresponding log information is returned. Meanwhile, the mdev driver emulates the IO queues as follows: the IOMMU driver is called to modify the address of the DMA request in the IO command to the corresponding memory-space address in the VM, the corresponding space is pinned, and the IOMMU translation table is set; the LBA (Logical Block Address) is translated; and the physical NVMe device accesses the VM memory space directly by DMA.
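As a hedged illustration of the Admin-queue emulation steps just described, the following sketch dispatches a few NVMe-style admin commands; the command names loosely follow NVMe admin operations, while all data structures and field names are assumptions made for this example:

```python
# Hypothetical sketch of the mdev driver's Admin-queue emulation described above.
def emulate_admin(cmd, phys_attrs, vqueues, pqueues, logs):
    if cmd["opcode"] == "identify":
        # Build a response from the physical disk attributes (plus mdev config)
        return {"status": 0, "data": phys_attrs, "irq": "virtual"}
    if cmd["opcode"] == "create_io_queue":
        qid = cmd["qid"]
        vqueues[qid] = []                 # virtual IO queue seen by the VM
        pqueues[qid] = []                 # backing physical IO queue
        return {"status": 0, "assoc": (qid, qid)}   # virtual<->physical binding
    if cmd["opcode"] == "get_log_page":
        return {"status": 0, "data": logs}
    return {"status": 1}                  # unsupported command

vqueues, pqueues, logs = {}, {}, ["power-on reset complete"]
ident = emulate_admin({"opcode": "identify"}, {"model": "toy-nvme", "nn": 1},
                      vqueues, pqueues, logs)
created = emulate_admin({"opcode": "create_io_queue", "qid": 1},
                        {}, vqueues, pqueues, logs)
```

The point of the sketch is only the dispatch structure: Identify is answered from cached physical attributes, queue creation allocates a paired virtual/physical queue, and log requests are answered from stored logs.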
As shown in fig. 2, another method of virtualization through QEMU Virtio is provided. The VM and QEMU interact through the standard Virtio interface, and QEMU calls the NVMe driver at the Virtio driver back end to implement the final function.
It can be seen that the above-described virtualization approaches have the following drawbacks: CPU core resources must participate in virtualization, bringing additional overhead; the DMA address in an IO command must be translated twice in the mdev driver, the IO virtualization capability provided by the IOMMU cannot be utilized, and delay overhead increases; QoS (Quality of Service) and resource partitioning require host-driver participation, bringing additional software overhead; interrupt virtualization requires frequent VM-exit/VM-entry switches, with corresponding performance loss; physical isolation is lacking; and the management and control operations are not transparent to the user.
In order to solve the above problems, the embodiment of the present application provides a new scheme for virtualization offloading and pooling, as shown in fig. 3. The local side adopts an emulator to emulate a device for each PF/VF, and the VM accesses the emulated device in a pass-through manner with the support of PCIe SR-IOV. The emulator responds locally to commands in the Admin queue and performs Admin operations on the back-end physical device according to the content of the command; for commands in the IO queue, it maps them to IO commands of the back-end physical device and constructs responses for the emulated device in the VM according to the results of the physical device executing the IO commands; if the physical device needs to access the VM memory during execution of an IO command, it initiates access to the VM memory by DMA. The TLP in the TLP proxy refers to a PCIe protocol transaction layer packet; the proxy implements capabilities such as routing or distributing TLPs by processing the original TLP, similar to a network proxy in networking. The control function configures the parameters of all modules and monitors the working state of each module in the system. Compared with the mdev-based virtualization mode, the virtualization load is offloaded onto the local card; owing to the SR-IOV virtualization mechanism, the CPU and memory overhead required by the repeated translation and mapping of DMA addresses in the mdev virtualization process is avoided, the advantages for management and control operations are retained, and QoS (quality of service) and fine-grained resource partitioning are implemented more efficiently.
On the basis of the above scheme, the embodiment of the application provides a command processing method, and fig. 4 shows a hardware structure block diagram of a computer terminal (or mobile device) for implementing the method. As shown in fig. 4, the computer terminal 40 (or mobile device 40) may include one or more processors 402 (shown as 402a, 402b, …, 402n; the processors 402 may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 404 for storing data, and a transmission module 406 for communication functions. In addition, the computer terminal may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power supply, and/or a camera. It will be appreciated by those of ordinary skill in the art that the configuration shown in fig. 4 is merely illustrative and is not intended to limit the configuration of the electronic device described above. For example, the computer terminal 40 may also include more or fewer components than shown in fig. 4, or have a different configuration than shown in fig. 4.
It should be noted that the one or more processors 402 and/or other data processing circuits described above may be referred to herein generally as a "data processing circuit." The data processing circuit may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Furthermore, the data processing circuit may be a single stand-alone processing module, or incorporated, in whole or in part, into any of the other elements in the computer terminal 40 (or mobile device). As referred to in the embodiments of the present application, the data processing circuit acts as a kind of processor control (e.g., selection of a variable-resistance termination path connected to an interface).
The memory 404 may be used to store software programs and modules of application software, such as the program commands/data storage device corresponding to the methods in the embodiments of the present application; the processor 402 executes the software programs and modules stored in the memory 404, thereby performing various functional applications and data processing, that is, implementing the above-mentioned command processing method. The memory 404 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 404 may further include memory located remotely from the processor 402, which may be connected to the computer terminal 40 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission module 406 is used to receive or transmit data via a network. Specific examples of the above network may include a wireless network provided by a communication provider of the computer terminal 40. In one example, the transmission module 406 includes a network adapter (Network Interface Controller, NIC) that can be connected to other network devices via a base station so as to communicate with the internet. In another example, the transmission module 406 may be a radio frequency (RF) module for communicating with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 40 (or mobile device).
In the above operating environment, as shown in fig. 5, a command processing method provided in the embodiment of the present application includes at least steps S502 to S508, where:
in step S502, the emulator receives a notification message sent by the host device, where the notification message carries at least one command to be executed, the command to be executed being a command that needs to be executed for the host device.
In some optional embodiments of the present application, after generating a command to be executed, the host device loads the command into a notification message and then sends the notification message to the emulator. Specifically, the sending method includes: the host device sends a notification each time a command to be executed is generated; or it loads at least one command to be executed generated within a predetermined time period into a notification message and sends it once, that is, all commands to be executed generated within a period are loaded into one notification message and sent to the emulator.
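The two sending strategies above can be sketched as follows (a minimal illustration; the message format is an assumption):

```python
# Sketch of the two notification strategies described above:
# one notification per generated command, versus one batched notification
# covering all commands generated within a time window.
def notify_per_command(commands):
    # One notification message per command to be executed
    return [{"commands": [c]} for c in commands]

def notify_batched(commands):
    # All commands from the window loaded into a single notification message
    return [{"commands": list(commands)}] if commands else []
```

Per-command notification minimizes latency for each command, while batching amortizes the notification cost over many commands; the text leaves the choice to the implementation.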
In step S504, the emulator reads at least one command to be executed from the memory of the host device if the local buffer is free.
In some optional embodiments of the present application, after receiving a notification message from the host device, the emulator needs to determine whether the local buffer space is full. If the local buffer is free, it reads at least one command to be executed from the memory of the host device; if the local buffer is full, it processes the existing virtual commands first and, once the local buffer is free, reads at least one command to be executed from the memory of the host device.
After the emulator reads at least one command to be executed from the memory of the host device, the emulator buffers the command to be executed locally, and updates a locally buffered pointer, where the pointer is used to indicate a buffer address of each virtual command.
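A minimal sketch of this fetch-and-buffer behavior, with the local buffer pointer updated as commands are pulled from host memory (all names are hypothetical):

```python
# Sketch of the emulator's fetch path described above: commands are pulled
# from host memory only while the local buffer has room, and a pointer tracks
# the buffer address assigned to each virtual command.
class Emulator:
    def __init__(self, capacity):
        self.buf = {}            # buffer address -> buffered command
        self.capacity = capacity
        self.head = 0            # locally buffered pointer (next free address)

    def fetch(self, host_queue):
        fetched = 0
        while host_queue and len(self.buf) < self.capacity:
            self.buf[self.head] = host_queue.pop(0)
            self.head += 1       # pointer now indicates the next buffer address
            fetched += 1
        return fetched           # commands left in host memory stay for later

emu = Emulator(capacity=2)
host_queue = ["cmd-a", "cmd-b", "cmd-c"]
n = emu.fetch(host_queue)
```

With a capacity of 2, only the first two commands are buffered; the third remains in host memory until buffer space is freed, matching the full-buffer behavior described above.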
In step S506, in response to an extraction request sent by the target-side device for extracting a command to be executed, the emulator sends the command to be executed to the target-side device, where the target-side device sends the command to be executed to a physical device in the network resource pool.
In some optional embodiments of the present application, after the emulator reads at least one command to be executed from the memory of the main device, notifying the target side device that there is a command to be executed, where the target side device sends an extraction request for extracting the command to be executed to the emulator, and after receiving the extraction request, the emulator returns the command to be executed to the target side device in response to the extraction request; and then, the target side device sends the command to be executed to the physical device in the network resource pool, and the physical device executes the command to be executed.
After returning the command to be executed to the target-side device, the emulator also updates the local buffer pointer, which indicates the buffer address of each virtual command.
In some optional embodiments of the present application, the emulator emulates at least one emulated device, and one physical device may be allocated to multiple emulated devices; that is, any physical device may distribute its own capability among multiple emulated devices, allocating a portion of its capability to each, where the allocation is mainly performed by a QoS mechanism.
Specifically, the QoS mechanism is embodied in the embodiments of the present application as follows: when the emulator emulates multiple emulated devices, after receiving a command to be executed sent by any emulated device in the emulator, the target-side device compares the resources consumed by the commands already executed for each emulated device with its allocated quota limit; if the consumed resource amount is less than or equal to the quota limit and the emulated device has pending commands to be executed, a command to be executed of that emulated device is selected and forwarded through the arbiter.
The above process is shown in fig. 6, where Quota represents the quota of each emulated device and Target is the target-side device; after the target-side device receives the commands to be executed sent by multiple emulated devices, it selects one of them to execute its IO command according to the QoS mechanism.
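The quota-based arbitration described above might be sketched as follows; the data layout and the first-eligible selection order are assumptions made for illustration:

```python
# Hedged sketch of the quota arbitration described above: the target side
# forwards a command from an emulated device only when that device is within
# its allocated quota and has work pending.
def arbitrate(devices):
    """devices: list of dicts with 'consumed', 'quota', and 'pending' (a list)."""
    for dev in devices:
        if dev["pending"] and dev["consumed"] <= dev["quota"]:
            return dev["pending"].pop(0)   # forward this device's next command
    return None                            # nothing eligible this round

devices = [
    {"consumed": 5, "quota": 3, "pending": ["io-x"]},   # over quota: skipped
    {"consumed": 1, "quota": 3, "pending": ["io-y"]},   # within quota: chosen
]
chosen = arbitrate(devices)
```

A device that has exceeded its quota is passed over even though it has pending work, so one emulated device cannot starve the others sharing the same physical device.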
In step S508, the emulator receives the execution result fed back by the physical device, where the physical device accesses the memory of the host device in a direct memory access manner, and executes the command to be executed.
In some optional embodiments of the present application, after receiving a command to be executed sent by a target side device, a physical device accesses a memory of a main device by a direct memory access manner, executes the command to be executed, and feeds back an obtained execution result to the target side device; after the target side device obtains the execution result, the execution result is put into a response buffer area of the simulator, and meanwhile, local buffering is carried out; the emulator extracts the execution result from the response buffer and feeds back the execution result to the host device.
In some alternative embodiments of the present application, the simulator simulates at least one simulated device and maps the simulated device to a plurality of physical devices. The simulation device maintains a first communication queue for each physical device, wherein a one-to-one or one-to-many relationship exists between the first communication queue and a second communication queue of the main device, the first communication queue stores at least one command to be executed, an execution result generated by executing the command to be executed, and a corresponding relationship between the command to be executed and the corresponding execution result, and the communication queue mainly comprises an IO request/response queue pair buffer.
In particular, for fine-grained partitioning, one emulated device may be mapped onto multiple physical devices, with the capability of the emulated device aggregated from partial capabilities of the multiple back-end physical devices. Since the back-end physical devices are independent of each other, it must be ensured that there is no causal association between the commands sent to different physical devices, which can be achieved by application-layer scheduling; for example, in an NVMe acceleration device, the nsid (namespace ID) can be used to distinguish unrelated task flows.
As shown in fig. 7, an alternative implementation manner of mapping between an emulation device and a plurality of physical devices is shown, for simplicity of implementation, the emulation device maintains, for each back-end physical device, one of the IO request/response queue pair buffers, where the IO request/response queue pair buffers stores at least one command to be executed, an execution result generated by executing the command to be executed, and a correspondence between the command to be executed and a corresponding execution result. It should be noted that, the relationship between the IO request/response queue pair buffer and the IO request/response queue pair of the master device is not one-to-one, for example, in the NVMe acceleration device, the requests with different nsids in the main device memory may be dispatched to the buffer associated with the physical device.
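A hedged sketch of dispatching host requests into per-physical-device request/response queue-pair buffers by nsid follows; the modulo mapping is an arbitrary choice for illustration, not a scheme mandated by the text:

```python
# Sketch of nsid-based dispatch onto per-physical-device IO request/response
# queue-pair buffers, as described above. The nsid -> device mapping here is
# a simple modulo, chosen only so each nsid is consistently bound to one device.
def dispatch(commands, num_phys):
    buffers = {i: [] for i in range(num_phys)}
    for cmd in commands:
        # Unrelated task flows are distinguished by namespace id (nsid);
        # keeping each nsid on one physical device avoids causal ordering
        # issues across independent back-end devices.
        buffers[cmd["nsid"] % num_phys].append(cmd)
    return buffers

cmds = [{"nsid": 1, "op": "read"}, {"nsid": 2, "op": "write"}, {"nsid": 1, "op": "read"}]
buffers = dispatch(cmds, num_phys=2)
```

Note that the mapping is many-to-one: several nsids from the host's queue pairs may land in the buffer associated with the same physical device, matching the non-one-to-one relationship described above.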
In some optional embodiments of the present application, when receiving an execution result from a physical device, the target-side device needs to determine the emulator and the host device corresponding to the execution result, specifically, the initiator source and the PF/VF. Therefore, before forwarding an IO command, the target-side device needs to save and construct the following mapping table:
{uuid,pf/vf,command_id_old}:{command_id_new}
wherein: uuid is the unique identifier of the initiator in the whole system; pf/vf is the requester ID number of the PF/VF corresponding to the IO command; command_id_old is the identification number of the IO command in the initiator-side IO command/response buffer queue pair; and command_id_new uniquely identifies the IO command in the IO command/response queue pair of the target-side device and is used to associate the execution result fed back by the physical device with the IO command.
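The mapping table above can be sketched as a pair of dictionaries (a minimal illustration; allocating command_id_new from a simple counter is an assumption of this sketch):

```python
# Sketch of the target-side command-id mapping table described above:
# {uuid, pf/vf, command_id_old} -> {command_id_new}, plus the reverse
# direction used to route a completion back to its initiator.
class CommandMap:
    def __init__(self):
        self.fwd = {}      # (uuid, pf_vf, command_id_old) -> command_id_new
        self.rev = {}      # command_id_new -> (uuid, pf_vf, command_id_old)
        self.next_id = 0

    def map_out(self, uuid, pf_vf, cid_old):
        # Called before forwarding an IO command toward a physical device
        cid_new = self.next_id
        self.next_id += 1
        self.fwd[(uuid, pf_vf, cid_old)] = cid_new
        self.rev[cid_new] = (uuid, pf_vf, cid_old)
        return cid_new

    def map_back(self, cid_new):
        # Called when the physical device's execution result arrives
        return self.rev[cid_new]

cmap = CommandMap()
cid = cmap.map_out("uuid-A", 7, 42)
```

The reverse lookup is what lets the target-side device send each execution result to the IO response buffer of the correct emulator.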
In some optional embodiments of the present application, in order to enable the physical device to perform a DMA operation directly on the memory space of the host device when executing an IO command, the information of the corresponding initiator side, including the uuid and pf/vf information, needs to be carried in the corresponding IO command and transferred to the physical device; an optional implementation is shown in fig. 8. The target-side device maintains a mapping table from { uuid, pf/vf } to { uuid_local }, where uuid_local is a locally unique identifier. When an IO command from the initiator side is forwarded to the IO command queue of a physical device, uuid_local replaces the upper bits of the IO command address, the lower bits of the address remain unchanged, and the original upper bits are stored locally. When the physical device issues a memory DMA request using this address, the PCIe proxy module intercepts it, finds the { uuid, pf/vf } information from the uuid_local stored in the upper bits of the address, replaces the upper bits of the address with the originally stored upper bits, and executes the DMA request to the initiator-side memory using the restored address and the { uuid, pf/vf } information.
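The address-rewriting trick described above can be sketched as follows; the 48-bit split between upper and lower address bits is an arbitrary assumption made for illustration:

```python
# Sketch of the uuid_local address substitution described above: the upper
# address bits are replaced by uuid_local before the command reaches the
# physical device, and restored by the PCIe proxy when the device issues the
# DMA request. The 48-bit boundary is an assumption of this sketch.
HIGH_SHIFT = 48
LOW_MASK = (1 << HIGH_SHIFT) - 1

def rewrite(addr, uuid_local, saved):
    saved[uuid_local] = addr >> HIGH_SHIFT          # stash original upper bits
    return (uuid_local << HIGH_SHIFT) | (addr & LOW_MASK)

def restore(addr, saved, uuid_table):
    uuid_local = addr >> HIGH_SHIFT                 # proxy extracts uuid_local
    uuid, pf_vf = uuid_table[uuid_local]            # recover {uuid, pf/vf}
    orig = (saved[uuid_local] << HIGH_SHIFT) | (addr & LOW_MASK)
    return orig, uuid, pf_vf                        # restored address + routing

saved = {}
uuid_table = {5: ("uuid-A", 7)}                     # uuid_local 5 -> initiator A, pf/vf 7
orig_addr = (0xBEEF << HIGH_SHIFT) | 0x1234
tagged = rewrite(orig_addr, 5, saved)
back, uuid, pf_vf = restore(tagged, saved, uuid_table)
```

The low bits pass through unchanged, so the physical device's DMA offsets within a buffer still work; only the routing information rides in the upper bits.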
Fig. 9 is a flow chart of forwarding an entire IO command when the command processing method according to the present embodiment performs virtualization unloading and pooling on an NVMe-like device, where the process includes steps a) to m), where:
a) The master device informs the simulator of a new command, and the notification can be sent once when one command is generated, or can be sent all at once for a plurality of commands generated in a period of time;
b) The emulator determines whether the back-end IO SQ/CQ buffer is full; if not, it obtains commands from the host device, either one command at a time or multiple commands at a time;
where the SQ holds the written commands and the CQ holds the command execution results. There are two types of SQ and CQ: one is Admin, used for placing Admin commands with which the host manages and controls the SSD; the other is IO, used for placing IO commands that transfer data between the host and the SSD. SQ and CQ come in pairs and may be in a one-to-one or many-to-one relationship.
c) The simulator puts the IO command obtained from the main equipment into a local IO command buffer;
d) The simulator informs the target side equipment at the far end that a new IO command is locally available;
e) The target side device sends an IO command extraction request to the simulator according to the quota and the QoS mechanism;
f) The simulator returns the corresponding IO command to the target side device according to the request of the target side device, and simultaneously updates the pointer buffered by the local IO command;
g) The target side device maps the IO command and then distributes the IO command to the back-end physical device;
h) The physical device accesses the memory of the main device in a DMA mode and executes the IO command;
i) The physical device returns an execution result of the IO command;
j) The target-side device performs remapping according to the IO command result returned by the physical device, and writes the IO command execution result into the local buffer of the IO response buffer of the correct emulator;
k) The simulator extracts IO response from the local buffer to obtain an execution result;
l) the simulator returns IO execution results to the main equipment;
m) the emulator updates the pointer of the local IO response buffer and notifies the target side device.
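Steps a) through m) can be condensed into a toy end-to-end pipeline (purely illustrative; every structure here is a stand-in for the real hardware and software components):

```python
# Compact end-to-end sketch of steps a)-m): the host notifies, the emulator
# buffers commands, the target side pulls and dispatches them, the physical
# device executes via DMA on host memory, and results flow back.
def run_io(host_queue, host_mem):
    buf = []                                  # emulator's local IO command buffer
    results = []                              # emulator's IO response buffer
    while host_queue:                         # a)-c) notify, fetch, buffer
        buf.append(host_queue.pop(0))
    while buf:                                # d)-f) target extracts a command
        cmd = buf.pop(0)
        host_mem[cmd["lba"]] = cmd["data"]    # g)-h) physical device DMA write
        results.append({"cid": cmd["cid"], "status": 0})   # i)-j) completion
    return results                            # k)-m) responses returned to host

host_queue = [{"cid": 1, "lba": 0x10, "data": "D"}]
host_mem = {}
responses = run_io(host_queue, host_mem)
```

The essential property the sketch preserves is that data never passes through the emulator: the "physical device" writes host memory directly, and only command descriptors and completions traverse the emulator and target side.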
In some alternative embodiments of the present application, another virtualization offloading and pooling solution is also proposed, as shown in fig. 10: the initiator uses a network card supporting the SR-IOV function, a virtual network card is provided for each PF/VF, the App in the VM accesses the virtual device using a standard NVMe-oF (nvmeof) driver, and the emulation function moves from the initiator side to the target side; the other functions are similar to those of the solution of fig. 3.
In the embodiment of the present application, the emulator receives a notification message sent by the host device, where the notification message carries at least one command to be executed, the command to be executed being a command that needs to be executed for the host device; when the local buffer is free, the emulator reads at least one command to be executed from the memory of the host device; in response to an extraction request sent by the target-side device for extracting a command to be executed, the emulator returns the command to the target-side device, which sends it to a physical device in the network resource pool; and the emulator receives the execution result fed back by the physical device, where the physical device accesses the memory of the host device by direct memory access and executes the command. By adopting this scheme, the technical problems in the related art of CPU resource consumption, large delay, and difficult operation and maintenance in the virtualization management and control of NVMe-like devices are effectively solved.
It should be noted that, in the embodiment of the steps shown in fig. 5, the corresponding functions, in particular the data interactions with other devices (such as the host device and the target-side device), may be performed by the emulated devices obtained through emulation.
Specifically, by adopting the SR-IOV mechanism to offload virtualization on the initiator side, the emulator can access the VM virtual memory through the IOMMU, avoiding the CPU overhead required by multiple address translations and by virtualization itself; with SR-IOV support, interrupt virtualization can exploit the interrupt virtualization capability of the hardware as much as possible, reducing the overhead caused by VM-Exit under mdev virtualization; the scheme is transparent to VM users, simple in operation and maintenance management, and can support bare metal servers; physical isolation can be achieved for different VM users, ensuring security; and by pooling, binding one emulated device to multiple physical devices, and aggregating the quota capabilities of the multiple physical devices, fine-grained partitioning of resources is achieved.
The scheme is applicable to the NVMe protocol and to NVMe-like protocols (such as the SATA protocol), and can be applied to scenarios such as storage, GPUs, encryption and decryption cards, dedicated AI accelerator cards, and FPGA cards.
Taking an encryption and decryption card as an example, in some embodiments of the present application the host device may be any server or computer device. When a virtual machine on the host device performs an encryption operation, the emulated device corresponding to the encryption and decryption card may read an encryption command from the memory of the host device; in response to an extraction request, sent by the target-side device, for extracting the encryption command, the emulator sends the encryption command to the target-side device, and the target-side device sends the encryption command to a physical device in the network resource pool; the emulator then receives an execution result fed back by the physical device, where the physical device accesses the memory of the host device by direct memory access and executes the encryption command.
Example 2
There is further provided, according to an embodiment of the present application, a cloud computing system. As shown in fig. 11, the system includes at least a host device 110, a source device 112, a target-side device 114, and at least one physical device 116, where:
The host device 110 runs at least one virtual machine, and the virtual machine is connected to the emulated device in the source device through a physical function (PF) channel or a virtual function (VF) channel. In some alternative embodiments of the present application, the host device 110 transmits a notification message each time a command to be executed is generated, or loads at least one command to be executed generated within a predetermined period of time onto a single notification message. The host device 110 may be a host.
The source device 112 runs an emulator 1120, and the emulator 1120 is configured to emulate a device for each PF channel or VF channel to obtain emulated devices; the emulated devices interact with the target-side device 114 via a remote direct memory access (RDMA) module. The source device and the host may be two separate hardware devices.
The target-side device 114 is connected to at least one physical device 116 in the resource pool through a PCIe interface, and determines the mapping relationship between the emulated devices and the at least one physical device 116; the physical device 116 includes, but is not limited to, an NVMe device or an NVMe-like device.
The at least one physical device 116 is connected to the target-side device 114, executes the target task request forwarded by the target-side device 114 via the source device 112, and transmits the execution result of the target task request to the host device 110 through the emulated device.
In some optional embodiments of the present application, the emulated device is further configured to receive a notification message sent by the host device 110, where the notification message carries at least one command to be executed corresponding to the target task request. When its local buffer is free, the emulated device reads the at least one command to be executed from the memory of the host device 110; in response to an extraction request, sent by the target-side device 114, for extracting a command to be executed, it sends the command to be executed to the target-side device; and it receives the execution result fed back by the physical device 116. The notification message indicates that there is a command to be executed; for example, when the host device (e.g. a host) detects a new command, it notifies the emulator, i.e. at least one emulated device emulated by the emulator.
The target-side device 114 is configured to receive a notification, from the emulated device, indicating that there is a command to be executed, and to send an extraction request for extracting the command to be executed to the emulated device; to receive the command to be executed returned by the emulated device and send it to the physical device in the network resource pool; and to put the execution result fed back by the physical device 116 into a response buffer while also buffering it locally.
The physical device 116 is configured to access the memory of the host device by direct memory access when executing the command to be executed, and to execute the command to be executed.
In some optional embodiments of the present application, when the target-side device 114 receives a command to be executed sent by any emulated device, it compares the resources consumed by the commands each emulated device has already executed against that device's allocated quota limit; if the consumed resources are less than or equal to the quota limit and the emulated device has a pending command to be executed, the arbiter selects that emulated device's command to be executed for forwarding.
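The quota-gated arbitration described above can be sketched as follows; the device names, quota units, and the simple first-fit selection order are illustrative assumptions:

```python
def select_command(pending, quotas, consumed):
    """Quota-gated arbitration: pick the first emulated device whose
    consumed resources are within its quota limit and which has a
    pending command. All names and units are illustrative."""
    for name, commands in pending.items():
        if commands and consumed[name] <= quotas[name]:
            return name, commands.pop(0)
    return None, None  # nothing eligible to forward

pending = {"emu-a": ["cmd-a1"], "emu-b": ["cmd-b1"]}
quotas = {"emu-a": 100, "emu-b": 100}
consumed = {"emu-a": 150, "emu-b": 40}  # emu-a has exceeded its quota
chosen, cmd = select_command(pending, quotas, consumed)
```

Here emu-a is skipped because its consumption exceeds the quota limit, so emu-b's command is selected; a real arbiter would also rotate among eligible devices for fairness.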
The target-side device 114 is further configured to maintain a mapping relationship between the source device and the target-side device, and, when forwarding a command to be executed forwarded by the source device 112 to a command queue of the physical device, to replace, in the command to be executed, an address of the target-side device with an address of the source device.
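The address substitution performed during forwarding amounts to a lookup in the maintained mapping; the sketch below shows the mechanism only, and the field names, address formats, and mapping direction are assumptions for illustration:

```python
def rewrite_address(command, addr_map):
    """Rewrite the address field of a command via the maintained
    source/target mapping before queueing it on the physical device.
    Field names and address values are illustrative."""
    rewritten = dict(command)  # leave the original command untouched
    rewritten["addr"] = addr_map[command["addr"]]
    return rewritten

addr_map = {"A:0x1000": "B:0x8000"}  # maintained mapping between the two sides
cmd = {"opcode": "read", "addr": "A:0x1000"}
queued = rewrite_address(cmd, addr_map)
```

Copying the command before rewriting keeps the original intact in the forwarding path, so retries or auditing can still see the pre-translation address.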
In some optional embodiments of the present application, after receiving a notification message from the host device, the emulated device needs to determine whether the local buffer space is full. If the local buffer is free, it reads at least one command to be executed from the memory of the host device; if the local buffer is full, it first processes the existing virtual commands and reads at least one command to be executed from the memory of the host device once the local buffer becomes free. After reading at least one command to be executed from the memory of the host device, the emulated device buffers the command locally, updates the pointer of the local buffer, and notifies the target-side device that there is a command to be executed. The target-side device sends an extraction request for extracting the command to be executed to the emulated device, and the emulated device, upon receiving the extraction request, responds by returning the command to be executed to the target-side device. The target-side device sends the command to be executed to a physical device in the network resource pool; the physical device accesses the memory of the host device by direct memory access, executes the command, and feeds the execution result back to the target-side device, which places the result into a response buffer. The emulated device then extracts the execution result from the response buffer and feeds it back to the host device.
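The full/free branch at the start of this flow can be modeled as below; the function and variable names are illustrative, and the "process" step stands in for handling an existing virtual command:

```python
def handle_notification(buffer, capacity, host_memory, process):
    """If the local buffer is full, first process an existing command
    to free a slot; then read the next pending command from host
    memory. A simplified sketch with illustrative names."""
    if len(buffer) >= capacity:
        process(buffer.pop(0))             # drain an existing command first
    if host_memory:
        buffer.append(host_memory.pop(0))  # read once a slot is free

processed = []
buffer = ["old-cmd"]          # buffer of capacity 1, currently full
host_memory = ["new-cmd"]
handle_notification(buffer, 1, host_memory, processed.append)
```

After the call, the existing command has been processed to make room and the new command occupies the freed slot.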
Example 3
According to an embodiment of the present application, there is further provided a virtualization emulation apparatus. As shown in fig. 12, the apparatus includes at least a first receiving module 120, a reading module 122, a sending module 124, and a second receiving module 126, where:
The first receiving module 120 is configured to receive a notification message sent by the host device, where the notification message carries at least one command to be executed and indicates that a command awaiting execution exists in the host device.
The reading module 122 is configured to read at least one command to be executed from the memory of the host device when the local buffer is free.
Optionally, the virtualization emulation apparatus further includes a buffering module 126 and a determining module 128, where:
The buffering module 126 is configured to buffer the command to be executed locally and update the pointer of the local buffer.
The determining module 128 is configured to determine whether the local buffer is free.
In some optional embodiments of the present application, after receiving a notification message from the host device, the emulation apparatus needs to determine whether the local buffer space is full. If the local buffer is free, it reads at least one command to be executed from the memory of the host device; if the local buffer is full, it processes the existing virtual commands and, once the local buffer becomes free, reads at least one command to be executed from the memory of the host device. After the emulation apparatus reads at least one command to be executed from the memory of the host device, the command is buffered locally and the pointer of the local buffer is updated, where the pointer indicates the buffer address of each virtual command.
The sending module 124 is configured to send the command to be executed to the target-side device in response to an extraction request, sent by the target-side device, for extracting the command to be executed, where the target-side device sends the command to be executed to a physical device in the network resource pool.
The second receiving module 126 is configured to receive an execution result fed back by the physical device, where the physical device accesses the memory of the host device by direct memory access and executes the command to be executed.
In some optional embodiments of the present application, after generating a command to be executed, the host device loads the command into a notification message and sends the notification message to the emulation apparatus. Specifically, the transmission method includes: the host device sends a notification each time a command to be executed is generated; or all commands to be executed generated within a period of time are loaded into one notification message and sent to the emulation apparatus.
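The two transmission modes can be contrasted in a short sketch; timestamps, the window length, and the grouping-by-window policy are illustrative assumptions about how "generated within a period of time" might be batched:

```python
def per_command_notifications(commands):
    """Mode 1: one notification message per generated command."""
    return [[cmd] for cmd in commands]

def windowed_notifications(timestamped, window):
    """Mode 2: all commands generated within the same time window are
    loaded into a single notification message. Timestamps and window
    length are in illustrative units."""
    batches = {}
    for ts, cmd in timestamped:
        batches.setdefault(ts // window, []).append(cmd)
    return [batches[key] for key in sorted(batches)]

single = per_command_notifications(["c1", "c2"])
grouped = windowed_notifications([(0, "c1"), (4, "c2"), (12, "c3")], window=10)
```

Per-command notification minimizes latency for each command, while windowed batching amortizes the notification cost over several commands.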
After the emulation apparatus reads at least one command to be executed from the memory of the host device, it notifies the target-side device that there is a command to be executed; the target-side device sends an extraction request for extracting the command to be executed to the emulation apparatus, and upon receiving the extraction request, the emulation apparatus responds by returning the command to be executed to the target-side device. The target-side device then sends the command to be executed to a physical device in the network resource pool, the physical device executes the command, and the emulation apparatus receives the execution result fed back by the physical device.
It should be noted that, in the virtualization emulation apparatus according to the embodiments of the present application, the function of each module corresponds to the command processing method in embodiment 1; for the specific implementation process, reference may be made to the content of embodiment 1.
Example 4
According to an embodiment of the present application, there is also provided an electronic device. The specific structure of the electronic device may be, but is not limited to, the structure of the computer terminal shown in fig. 4; for example, the electronic device may include more structures than the computer terminal shown in fig. 4. In an embodiment of the present application, the electronic device includes a processor and a memory, wherein:
the memory is coupled to the processor and is configured to provide the processor with commands for processing the following steps:
the emulator receives a notification message sent by the host device, where the notification message carries at least one command to be executed and indicates that a command awaiting execution exists in the host device; when the local buffer is free, the emulator reads the at least one command to be executed from the memory of the host device; in response to an extraction request, sent by the target-side device, for extracting a command to be executed, the emulator returns the command to be executed to the target-side device, and the target-side device sends the command to be executed to a physical device in the network resource pool; and the emulator receives an execution result fed back by the physical device, where the physical device accesses the memory of the host device by direct memory access and executes the command to be executed.
It should be noted that, for preferred implementations in the embodiments of the present application, reference may be made to the related descriptions in embodiments 1 to 3, which are not repeated here.
Example 5
The embodiment of the application also provides another command processing method, as shown in fig. 13, which includes:
step S132: the emulator receives a notification message sent by the host device, where the notification message carries at least one command to be executed and indicates that a command awaiting execution exists in the host device;
step S134: the emulator reads at least one command to be executed from the memory of the host device and returns the command to be executed to a physical device in the network resource pool;
It should be noted that there may be a plurality of physical devices; that is, one emulator may correspond to a plurality of physical devices in the resource pool. In some optional embodiments of the present application, the emulator emulates at least one emulated device, and one physical device may be allocated to multiple emulated devices; that is, any physical device may allocate its own capability among multiple emulated devices, assigning a portion of its capability to each, where the allocation is mainly implemented by a QoS mechanism.
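One simple way a QoS mechanism could split a physical device's capability is proportional shares; the share weights, the capability unit (e.g. IOPS), and the integer division are illustrative assumptions:

```python
def allocate_capability(total, shares):
    """Split one physical device's capability among several emulated
    devices in proportion to configured QoS shares. Units and names
    are illustrative (e.g. total could be an IOPS budget)."""
    total_share = sum(shares.values())
    return {name: total * share // total_share
            for name, share in shares.items()}

# One physical device's budget split 1:3 between two emulated devices.
alloc = allocate_capability(100_000, {"emu-a": 1, "emu-b": 3})
```

Each emulated device then sees only its slice of the pooled capability, which is what enables the fine-grained resource partitioning described earlier.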
step S136: the emulator receives the execution result fed back by the physical device, where the physical device accesses the memory of the host device by direct memory access and executes the command to be executed.
In some optional embodiments of the present application, after receiving a command to be executed sent by the target-side device, the physical device accesses the memory of the host device by direct memory access, executes the command, and feeds the obtained execution result back to the target-side device. After obtaining the execution result, the target-side device puts it into a response buffer of the emulator while also buffering it locally; the emulator then extracts the execution result from the response buffer and feeds it back to the host device.
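The result-return path can be sketched as two small steps; the buffer structures and function names are illustrative, not from the original design:

```python
from collections import deque

def target_receives_result(response_buffer, local_cache, result):
    # The target-side device puts the execution result into the
    # emulator's response buffer and also buffers a local copy.
    response_buffer.append(result)
    local_cache.append(result)

def emulator_feeds_back(response_buffer):
    # The emulator extracts the oldest result from the response buffer
    # to feed it back to the host device.
    return response_buffer.popleft()

response_buffer, local_cache = deque(), []
target_receives_result(response_buffer, local_cache, "result-1")
returned = emulator_feeds_back(response_buffer)
```

The local copy on the target side survives after the emulator drains the response buffer, which would allow retransmission if the feedback path fails.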
It should be noted that, for preferred implementations in the embodiments of the present application, reference may be made to the related descriptions in embodiments 1 to 3, which are not repeated here.
Example 6
According to an embodiment of the present application, there is further provided a nonvolatile storage medium including a stored program, where, when the program runs, the device in which the nonvolatile storage medium is located is controlled to execute the above command processing method.
Optionally, when the program runs, it controls the device in which the nonvolatile storage medium is located to execute the following steps: the emulator receives a notification message sent by the host device, where the notification message carries at least one command to be executed and indicates that a command awaiting execution exists in the host device; when the local buffer is free, the emulator reads the at least one command to be executed from the memory of the host device; in response to an extraction request, sent by the target-side device, for extracting a command to be executed, the emulator returns the command to be executed to the target-side device, and the target-side device sends the command to be executed to a physical device in the network resource pool; and the emulator receives an execution result fed back by the physical device, where the physical device accesses the memory of the host device by direct memory access and executes the command to be executed.
Optionally, when the program runs, it controls the device in which the nonvolatile storage medium is located to execute the following steps: the emulator receives a notification message sent by the host device, where the notification message carries at least one command to be executed and indicates that a command awaiting execution exists in the host device; the emulator reads at least one command to be executed from the memory of the host device and returns the command to be executed to the physical device; and the emulator receives an execution result fed back by the physical device, where the physical device accesses the memory of the host device by direct memory access and executes the command to be executed.
The foregoing embodiment numbers of the present application are merely for description and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present application, the description of each embodiment has its own emphasis; for any portion not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other manners. The above-described apparatus embodiments are merely exemplary; for example, the division of units is merely a logical functional division, and there may be other manners of division in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed may be implemented through some interfaces, units, or modules, and may be in electrical or other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including several commands to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
The foregoing is merely a preferred embodiment of the present application. It should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the principles of the present application, and such improvements and modifications shall also fall within the scope of the present application.