CN115904259B - Processing method and related device of nonvolatile memory standard NVMe instruction - Google Patents


Info

Publication number
CN115904259B
CN115904259B CN202310174005.5A
Authority
CN
China
Prior art keywords
dma
instruction
queue
nvme
low
Prior art date
Legal status
Active
Application number
CN202310174005.5A
Other languages
Chinese (zh)
Other versions
CN115904259A (en)
Inventor
国海涛
Current Assignee
Zhuhai Xingyun Zhilian Technology Co Ltd
Original Assignee
Zhuhai Xingyun Zhilian Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhuhai Xingyun Zhilian Technology Co Ltd filed Critical Zhuhai Xingyun Zhilian Technology Co Ltd
Priority to CN202310174005.5A
Publication of CN115904259A
Application granted
Publication of CN115904259B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Multi Processors (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The embodiment of the application discloses a processing method and related device for nonvolatile memory standard NVMe instructions. The method comprises: determining whether NVMe instruction information includes a low-latency tag; if the NVMe instruction information does not include the low-latency tag, sending a first DMA instruction to a second DMA queue; if it does include the low-latency tag, generating a second DMA instruction according to the NVMe instruction information, and sending the second DMA instruction to either a low-latency DMA queue pool or the second DMA queue depending on whether the number of cached instructions currently in the first DMA queue exceeds a preset threshold. By adopting this embodiment, the second DMA instruction can be sent to the low-latency DMA queue pool, which uses a low-latency data processing channel to move the application program's data, helping to improve the running performance of latency-sensitive applications.

Description

Processing method and related device of nonvolatile memory standard NVMe instruction
Technical Field
The application relates to the technical field of storage, in particular to a processing method and a related device of nonvolatile memory standard NVMe instructions.
Background
Virtualization is one of the key technologies of the cloud computing era. Through virtualization, a server host can provide the storage services needed by different tenants, improving device utilization. Latency-sensitive applications typically transfer small amounts of data but require the device to provide low latency. However, because different tenants share the same server storage bandwidth, tenants compete with one another, and traffic generated by other tenants degrades the performance of latency-sensitive applications.
Disclosure of Invention
The embodiment of the application provides a processing method and related device for nonvolatile memory standard NVMe instructions, which determine, according to a low-latency tag in the NVMe request instruction and the number of cached instructions currently in a first DMA queue, whether to send a second DMA instruction to a low-latency DMA queue pool or to a second DMA queue. The DMA instructions in the low-latency DMA queue pool and in the second DMA queue use different data processing channels to move application data, so that the low-latency data of latency-sensitive applications can be moved to a simulated disk in time, improving the running performance of latency-sensitive applications.
In a first aspect, an embodiment of the present application provides a method for processing nonvolatile memory standard NVMe instructions, applied to a driving module in an embedded virtualization system, where the embedded virtualization system includes an FPGA controller and the driving module, the FPGA controller is connected to a target server and the driving module, the embedded virtualization system is configured to provide a simulated NVMe disk for the target server, the FPGA controller is configured to receive an NVMe request instruction sent by the target server and send the NVMe request instruction to the driving module, the driving module includes a first sending queue, a first DMA queue, a first recycling queue, and the simulated NVMe disk, the FPGA controller includes a second sending queue, a second DMA queue, a second recycling queue, and a low-latency DMA queue pool, where the low-latency DMA queue pool includes a plurality of low-latency DMA queues, the second DMA queue corresponds to a first data processing channel, the low-latency DMA queue pool corresponds to a low-latency data processing channel, the second sending queue is configured to buffer the NVMe request instruction received by the FPGA controller, and the first sending queue is configured to buffer the NVMe request instruction received by the driving module, and the method includes:
parsing each NVMe request instruction according to the queue order of a plurality of NVMe request instructions in the first sending queue to obtain NVMe instruction information;
determining whether the NVMe instruction information includes a low latency tag;
if the NVMe instruction information does not include the low-latency tag, generating a first DMA instruction according to the NVMe instruction information, adding the first DMA instruction to the first DMA queue, and sending the first DMA instruction to the second DMA queue of the FPGA controller, where the second DMA queue is used for the FPGA controller to move first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel, and the first data is data of a non-latency-sensitive application program of the target server;
if the NVMe instruction information includes the low-latency tag, generating a second DMA instruction according to the NVMe instruction information, determining the number of cached instructions currently in the first DMA queue, and judging whether the number of cached instructions is greater than a preset threshold;
if the number of cached instructions is greater than the preset threshold, sending the second DMA instruction to the low-latency DMA queue pool in the FPGA controller, where the low-latency DMA queue pool is used for the FPGA controller to move second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-latency data processing channel, and the second data is low-latency data of a latency-sensitive application program of the target server;
if the number of cached instructions is less than or equal to the preset threshold, adding the second DMA instruction to the first DMA queue and sending the second DMA instruction to the second DMA queue of the FPGA controller, where the second DMA queue is used for the FPGA controller to move the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the first data processing channel.
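The branching described above can be sketched roughly in Python. This is a minimal illustration only: the threshold value, the dictionary fields, and the deque-based queue representation are all assumptions of this sketch, not the patent's actual implementation.

```python
from collections import deque

# Assumed value; the patent only says "a preset threshold".
PRESET_THRESHOLD = 32

def dispatch(nvme_info: dict, first_dma_queue: deque,
             second_dma_queue: deque, low_latency_pool: deque) -> str:
    """Route a DMA instruction derived from parsed NVMe instruction info."""
    dma_instruction = {"src": nvme_info["src"], "dst": nvme_info["dst"]}
    if not nvme_info.get("low_latency_tag"):
        # No tag: buffer locally, then forward on the normal (first) channel.
        first_dma_queue.append(dma_instruction)
        second_dma_queue.append(dma_instruction)
        return "first_channel"
    if len(first_dma_queue) > PRESET_THRESHOLD:
        # Normal path is congested: use the low-latency DMA queue pool.
        low_latency_pool.append(dma_instruction)
        return "low_latency_channel"
    # Normal path has headroom: tagged data can still go through it.
    first_dma_queue.append(dma_instruction)
    second_dma_queue.append(dma_instruction)
    return "first_channel"
```

Note that under this scheme a tagged instruction only takes the low-latency channel when the first DMA queue is already backed up, which matches the threshold check in the method above.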
In a second aspect, an embodiment of the present application provides a method for processing nonvolatile memory standard NVMe instructions, applied to a target server, where the target server is connected to an FPGA controller in an embedded virtualization system, the embedded virtualization system includes the FPGA controller and a driving module, the FPGA controller is respectively connected to the target server and the driving module, the embedded virtualization system is configured to provide a simulated NVMe disk for the target server, the FPGA controller is configured to receive an NVMe request instruction sent by the target server and send the NVMe request instruction to the driving module, the driving module includes a first sending queue, a first DMA queue, a first recycling queue, and the simulated NVMe disk, the FPGA controller includes a second sending queue, a second DMA queue, a second recycling queue, and a low-latency DMA queue pool, where the low-latency DMA queue pool includes a plurality of low-latency DMA queues, the second DMA queue corresponds to a first data processing channel, the low-latency DMA queue pool corresponds to a low-latency data processing channel, the second sending queue is configured to buffer the NVMe request instruction received by the FPGA controller, and the first sending queue is configured to buffer the NVMe request instruction received by the driving module, the method comprising:
Determining whether the target application is a delay-sensitive application;
if the target application program is the latency-sensitive application program, adding a low-latency tag to the NVMe request instruction when the NVMe request instruction corresponding to the target application program is sent to the FPGA controller;
receiving a first completion instruction or a second completion instruction sent by the FPGA controller;
and generating a first recovery success signal according to the first completion instruction or generating a second recovery success signal according to the second completion instruction, and sending the first recovery success signal or the second recovery success signal to the driving module through the FPGA controller.
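The server-side tagging step above can be illustrated with a minimal sketch. The request layout and the set used to identify latency-sensitive applications are assumptions for illustration, not structures from the patent.

```python
def build_nvme_request(app_name: str, payload: bytes,
                       latency_sensitive_apps: set) -> dict:
    """Build an NVMe request, tagging it when the issuing app is latency-sensitive."""
    request = {"app": app_name, "payload": payload}
    if app_name in latency_sensitive_apps:
        # The tag is what later drives routing to the low-latency DMA queue pool.
        request["low_latency_tag"] = True
    return request
```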
In a third aspect, an embodiment of the present application provides a processing apparatus for nonvolatile memory standard NVMe instructions, applied to a driving module in an embedded virtualization system, where the embedded virtualization system includes an FPGA controller and the driving module, the FPGA controller is connected to a target server and the driving module, the embedded virtualization system is configured to provide a simulated NVMe disk for the target server, the FPGA controller is configured to receive an NVMe request instruction sent by the target server and send the NVMe request instruction to the driving module, the driving module includes a first sending queue, a first DMA queue, a first recycling queue, and the simulated NVMe disk, the FPGA controller includes a second sending queue, a second DMA queue, a second recycling queue, and a low-latency DMA queue pool, where the low-latency DMA queue pool includes a plurality of low-latency DMA queues, the second DMA queue corresponds to a first data processing channel, the low-latency DMA queue pool corresponds to a low-latency data processing channel, the second sending queue is configured to buffer the NVMe request instruction received by the FPGA controller, and the first sending queue is configured to buffer the NVMe request instruction received by the driving module. The apparatus includes a parsing unit, a determining unit, a sending unit, and a judging unit, wherein
The analyzing unit is configured to analyze each NVMe request instruction according to a queue sequence of a plurality of NVMe request instructions in the first sending queue, so as to obtain NVMe instruction information;
the determining unit is configured to determine whether the NVMe instruction information includes a low latency tag;
the sending unit is configured to, if the NVMe instruction information does not include the low-latency tag, generate a first DMA instruction according to the NVMe instruction information, add the first DMA instruction to the first DMA queue, and send the first DMA instruction to the second DMA queue of the FPGA controller, where the second DMA queue is used for the FPGA controller to move first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel, and the first data is data of a non-latency-sensitive application program of the target server;
the judging unit is configured to generate a second DMA instruction according to the NVMe instruction information if the NVMe request instruction includes the low latency tag, determine the number of cache instructions in the first DMA queue currently, and judge whether the number of cache instructions is greater than a preset threshold;
The sending unit is further configured to send the second DMA instruction to the low-latency DMA queue pool in the FPGA controller if the number of cache instructions is greater than the preset threshold, where the low-latency DMA queue pool is used for the FPGA controller to move second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-latency data processing channel, and the second data is low-latency data of a latency-sensitive application program of the target server;
the sending unit is further configured to add the second DMA instruction to the first DMA queue if the number of cache instructions is less than or equal to the preset threshold; and sending the second DMA instruction to the second DMA queue of the FPGA controller, wherein the second DMA queue is used for the FPGA controller to move the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the first data processing channel.
In a fourth aspect, an embodiment of the present application provides a processing apparatus for nonvolatile memory standard NVMe instructions, applied to a target server, where the target server is connected to an FPGA controller in an embedded virtualization system, the embedded virtualization system includes the FPGA controller and a driving module, the FPGA controller is respectively connected to the target server and the driving module, the embedded virtualization system is configured to provide a simulated NVMe disk for the target server, the FPGA controller is configured to receive an NVMe request instruction sent by the target server and send the NVMe request instruction to the driving module, the driving module includes a first sending queue, a first DMA queue, a first recycling queue, and the simulated NVMe disk, the FPGA controller includes a second sending queue, a second DMA queue, a second recycling queue, and a low-latency DMA queue pool, where the low-latency DMA queue pool includes a plurality of low-latency DMA queues, the second DMA queue corresponds to a first data processing channel, the low-latency DMA queue pool corresponds to a low-latency data processing channel, the second sending queue is configured to buffer the NVMe request instruction received by the FPGA controller, and the first sending queue is configured to buffer the NVMe request instruction received by the driving module. The apparatus includes a determining unit, an adding unit, a receiving unit, and a sending unit, wherein
The determining unit is used for determining whether the target application program is a delay sensitive application program or not;
the adding unit is configured to add a low-delay tag to the NVMe request instruction when the NVMe request instruction corresponding to the target application program is sent to the FPGA controller if the target application program is the delay-sensitive application program;
the receiving unit is used for receiving a first completion instruction or a second completion instruction sent by the FPGA controller;
the sending unit is used for generating a first recovery success signal according to the first completion instruction or generating a second recovery success signal according to the second completion instruction, and sending the first recovery success signal or the second recovery success signal to the driving module through the FPGA controller.
In a fifth aspect, embodiments of the present application provide an electronic device comprising a processor, a memory, a communication interface, and one or more programs stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing part or all of the steps as described in any of the methods of the first or second aspects of embodiments of the present application.
In a sixth aspect, embodiments of the present application provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program for electronic data exchange, where the computer program causes a computer to perform some or all of the steps described in any of the methods of the first aspect or the second aspect of embodiments of the present application.
In a seventh aspect, embodiments of the present application provide a computer program product, wherein the computer program product comprises a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps described in any of the methods of the first aspect of the embodiments of the present application. The computer program product may be a software installation package.
It can be seen that, in the embodiment of the present application, the driving module in the embedded virtualization system may parse each NVMe request instruction according to the queue order of the plurality of NVMe request instructions in the first sending queue to obtain NVMe instruction information, and then determine whether the NVMe instruction information includes a low-latency tag. If the NVMe instruction information does not include the low-latency tag, the driving module generates a first DMA instruction according to the NVMe instruction information, adds the first DMA instruction to the first DMA queue, and sends the first DMA instruction to the second DMA queue of the FPGA controller, where the second DMA queue is used for the FPGA controller to move the first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel. If the NVMe instruction information includes the low-latency tag, the driving module generates a second DMA instruction according to the NVMe instruction information, determines the number of cached instructions currently in the first DMA queue, and judges whether the number of cached instructions is greater than a preset threshold. If the number of cached instructions is greater than the preset threshold, the second DMA instruction is sent to the low-latency DMA queue pool of the FPGA controller, which moves the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-latency data processing channel. If the number of cached instructions is less than or equal to the preset threshold, the second DMA instruction is added to the first DMA queue and sent to the second DMA queue of the FPGA controller, where the second DMA queue is used for the FPGA controller to move the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the first data processing channel. Therefore, whether the second DMA instruction is sent to the low-latency DMA queue pool or the second DMA queue is determined according to the low-latency tag in the NVMe request instruction and the number of cached instructions currently in the first DMA queue; the DMA instructions in the low-latency DMA queue pool and in the second DMA queue use different data processing channels to move application data, so that the low-latency data of latency-sensitive applications can be processed in time and moved to the corresponding simulated disk, which helps improve the running performance of latency-sensitive applications.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1a is a schematic architecture diagram of a processing system for nonvolatile memory standard NVMe instructions according to an embodiment of the present application;
FIG. 1b is a schematic architecture diagram of another processing system for nonvolatile memory standard NVMe instructions according to an embodiment of the present application;
FIG. 2 is a flow chart of a processing method for nonvolatile memory standard NVMe instructions according to an embodiment of the present application;
FIG. 3 is a flow chart of another processing method for nonvolatile memory standard NVMe instructions according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device A according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device B according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a processing apparatus A for nonvolatile memory standard NVMe instructions according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a processing apparatus B for nonvolatile memory standard NVMe instructions according to an embodiment of the present application.
Detailed Description
In order to enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of the present application.
The terms first, second and the like in the description and in the claims of the present application and in the above-described figures, are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The electronic device according to the embodiments of the present application may include various handheld devices, vehicle-mounted devices, wearable devices, computing devices, or other processing devices connected to a wireless modem, as well as various forms of User Equipment (UE), Mobile Stations (MS), terminal devices, and so on. For convenience of description, the above-mentioned devices are collectively referred to as electronic devices. In this application, the electronic device may be a server that interacts with the simulated NVMe disk through an NVMe (Non-Volatile Memory Host Controller Interface Specification, the nonvolatile memory standard) driver, for example, reading data from or writing data to the simulated NVMe disk. In the present application, the electronic device may further include a chip, for example a DPU (Data Processing Unit, smart network card).
1) NVMe (Non-Volatile Memory Host Controller Interface Specification, the nonvolatile memory standard) is a bus transfer protocol specification based on a device logical interface, used to access nonvolatile storage media over a PCIe (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard) bus. NVMe specifies the registers, instruction sets, queue pair management, and so on required for software and hardware interaction. A queue pair may include a send queue and a reclaim queue.
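As a rough illustration of the queue-pair notion just described, the sketch below pairs a send queue with a reclaim queue. The class name, field names, and the zero-status completion convention are assumptions of this sketch, not part of the NVMe specification's actual register-level interface.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class QueuePair:
    """A send queue paired with a reclaim (completion) queue."""
    send_queue: deque = field(default_factory=deque)
    reclaim_queue: deque = field(default_factory=deque)

    def submit(self, cmd: dict) -> None:
        # Host posts a command to the send queue.
        self.send_queue.append(cmd)

    def complete(self) -> None:
        # Device consumes the oldest command and posts a completion entry.
        cmd = self.send_queue.popleft()
        self.reclaim_queue.append({"cmd": cmd, "status": 0})
```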
2) FPGA (Field Programmable Gate Array). The FPGA controller can exchange TLP (Transaction Layer Packet) packets with the host and provide DMA (Direct Memory Access) data communication services through a PCIe interface. DMA is a capability provided by the computer bus architecture that enables data to be sent directly from an attached device (e.g., a disk drive) to the memory of the computer motherboard.
3) The DPU (Data Processing Unit, smart network card) is a data-centric special-purpose processor that adopts a software-defined technical route to support resource virtualization at the infrastructure layer and supports infrastructure-layer functions such as storage, security, and quality-of-service management. Its most direct function is to serve as an offload engine for the central processing unit (CPU), taking over infrastructure-layer services such as network virtualization and hardware resource pooling, and releasing CPU computing power to upper-layer applications.
At present, software virtualization technology guarantees the latency requirements of latency-sensitive applications through a relatively complex software scheduling module, but its own overhead and that of the instruction scheduling module reduce the overall performance of the system. To reduce the overhead on the server host, embedded virtualization technology has emerged, which offloads server host virtualization services to hardware, saving host-side resources and reducing cost. However, because of limited hardware resources and development difficulty, an embedded system is not suitable for running complex instruction scheduling software. Moreover, different tenants share the same server storage bandwidth and therefore compete with one another; although latency-sensitive applications usually have a small data volume, they require the device to provide low latency, and traffic generated by other tenants degrades their performance, the so-called "noisy neighbor" problem.
In view of the above, the present application provides a processing method and related apparatus for nonvolatile memory standard NVMe instructions, which are described in detail below.
Referring to fig. 1a, fig. 1a is a schematic architecture diagram of a processing system of a nonvolatile memory standard NVMe instruction according to an embodiment of the present application, where the system architecture may include an embedded virtualization system 100a and a target server 100b, where the embedded virtualization system 100a includes an FPGA controller 101 and a driving module 102.
The embedded virtualization system 100a may be a system running in the DPU smart network card, realizing part of the DPU's data offload function: it provides the target server 100b with a simulated NVMe disk, moves the data of application programs running on the server to the simulated NVMe disk for storage by means of the FPGA controller 101, and, when an application program needs that data, lets the application interact with the simulated NVMe disk through the NVMe standard driver and read the data stored there.
Wherein, the FPGA controller 101 in the embedded virtualization system 100a is communicatively connected to the target server 100b and the driving module 102, respectively; the embedded virtualization system 100a is used to provide a simulated NVMe disk for the target server 100b; and the FPGA controller 101 is used to receive the NVMe request instruction sent by the target server 100b and send the NVMe request instruction to the driving module 102.
the driving module 102 includes a plurality of first queue pairs and a plurality of analog NVMe disks, each first queue pair includes a first sending queue, a first DMA queue, and a first recycling queue, the FPGA controller 101 includes a plurality of second queue pairs, each second queue pair includes a second sending queue, a second DMA queue, a second recycling queue, and a plurality of low-latency DMA queues, the plurality of low-latency DMA queues in one second queue pair form a low-latency DMA queue pool, the second sending queue is used for buffering NVMe request instructions received by the FPGA controller 101, the first sending queue is used for buffering NVMe request instructions received by the driving module 102, the target server 100b includes a plurality of third queue pairs, each third queue pair includes a third sending queue and a third recycling queue, the analog NVMe disk corresponds to the first queue pair, the first queue pair corresponds to the second queue pair, the second queue pair corresponds to the third queue pair, a plurality of first data processing channels and a plurality of low-latency data processing channels exist between the driving module 102 and the FPGA controller 101, and the second DMA processing channels correspond to the low-latency data processing channels.
Referring to fig. 1b, fig. 1b is a schematic architecture diagram of another processing system for the non-volatile memory standard NVMe instruction provided in this embodiment of the present application, taking one set of queue pairs serving a simulated NVMe disk as an example. The third queue pair in the target server 100b includes a third sending queue and a third recovery queue; the second queue pair in the FPGA controller 101 corresponding to the third queue pair includes a second sending queue, a second DMA queue, a second recovery queue and a low-delay DMA queue pool; the first queue pair in the driving module 102 corresponding to the second queue pair includes a first sending queue, a first DMA queue and a first recovery queue. The target server 100b holds both low-delay data and common data. The FPGA controller 101 can acquire an NVMe request instruction from the third sending queue of the target server 100b and add it to the second sending queue; the driving module 102 then fetches the NVMe request instruction from the second sending queue into the first sending queue, parses it, and generates a corresponding DMA instruction. A first DMA instruction is added to the first DMA queue and sent to the second DMA queue, which the FPGA controller 101 uses to move the corresponding first data to the simulated NVMe disk through the first data processing channel; a second DMA instruction is either sent to the low-delay DMA queue pool, so that the FPGA controller 101 moves the corresponding second data to the simulated NVMe disk through the low-delay data processing channel, or, when the first DMA queue is not congested, added to the first DMA queue and processed through the first data processing channel. Afterwards, the driving module 102 receives a first movement completion signal generated when the movement of the first data completes, or a second movement completion signal generated when the movement of the second data completes, sent by the FPGA controller 101; the driving module 102 generates a corresponding first completion instruction according to the first movement completion signal, or a corresponding second completion instruction according to the second movement completion signal, adds the first or second completion instruction to the first recovery queue, and sends it to the target server 100b through the FPGA controller 101. Then the driving module 102 receives, forwarded by the FPGA controller 101, a first recovery success signal generated by the target server 100b after processing the first completion instruction, or a second recovery success signal generated after processing the second completion instruction, and updates the head pointer of the first recovery queue according to the first or second recovery success signal.
The simulated NVMe disk may be used to store the first data and the second data moved by the FPGA controller 101 from the target server 100b, and the target server 100b may also forward read instructions to the driving module 102 through the FPGA controller 101 to read back the first data, the second data and other data stored in the simulated NVMe disk, where the first data is common data of non-delay-sensitive application programs of the target server 100b, and the second data is low-delay data of delay-sensitive application programs of the target server 100b.
The target server 100b may be a host server. The target server 100b may create the aforementioned third queue pair serving the simulated NVMe disk by using the standard NVMe driver, and the delay-sensitive application program and the non-delay-sensitive application program send corresponding NVMe request instructions to the third sending queue through the standard NVMe driver of the target server 100b, where the third sending queue caches the received NVMe request instructions.
In one possible example, the driving module 102 in the embedded virtualization system 100a parses each NVMe request instruction according to the queue order of the plurality of NVMe request instructions in the first sending queue to obtain NVMe instruction information, and then determines whether the NVMe instruction information includes a low-delay tag. If the NVMe instruction information does not include the low-delay tag, the driving module 102 generates a first DMA instruction according to the NVMe instruction information, adds the first DMA instruction to the first DMA queue, and sends it to the second DMA queue of the FPGA controller 101, which the FPGA controller 101 uses to move the first data corresponding to the first DMA instruction in the target server 100b to the simulated NVMe disk through the first data processing channel, the first data being data of a non-delay-sensitive application program of the target server 100b. If the NVMe instruction information includes the low-delay tag, the driving module 102 generates a second DMA instruction according to the NVMe instruction information, determines the current number of cache instructions in the first DMA queue, and judges whether that number is greater than a preset threshold. If it is greater than the preset threshold, the driving module 102 sends the second DMA instruction to the low-delay DMA queue pool of the FPGA controller 101, which the FPGA controller 101 uses to move the second data corresponding to the second DMA instruction in the target server 100b to the simulated NVMe disk through the low-delay data processing channel; if it is less than or equal to the preset threshold, the driving module 102 adds the second DMA instruction to the first DMA queue and sends it to the second DMA queue of the FPGA controller 101, which the FPGA controller 101 uses to move the second data to the simulated NVMe disk through the first data processing channel, the second data being low-delay data of a delay-sensitive application program of the target server 100b. Therefore, whether the second DMA instruction is sent to the low-delay DMA queue pool or to the second DMA queue is determined according to the low-delay tag in the NVMe request instruction and the current number of cache instructions in the first DMA queue, and the DMA instructions in the low-delay DMA queue pool and in the second DMA queue use different data processing channels to carry application data, so that the low-delay data of delay-sensitive applications can be processed in time and carried to the corresponding simulated NVMe disk, which helps improve the running performance of delay-sensitive applications.
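The dispatch rule in this example can be expressed as a short sketch. This is our illustration, not code from the patent: the class and queue names are invented, and the queues stand in for the hardware queues described above.

```python
from collections import deque

class DmaDispatcher:
    """Minimal sketch of the dispatch rule: a DMA instruction built from a
    tagged NVMe request goes to the low-delay DMA queue pool only when the
    first DMA queue is congested (cached instructions > preset threshold)."""

    def __init__(self, threshold):
        self.threshold = threshold          # "preset threshold" of cache instructions
        self.first_dma_queue = deque()      # driving module's first DMA queue
        self.second_dma_queue = deque()     # FPGA controller's second DMA queue
        self.low_delay_pool = deque()       # FPGA controller's low-delay DMA queue pool

    def dispatch(self, dma_instr, has_low_delay_tag):
        if has_low_delay_tag and len(self.first_dma_queue) > self.threshold:
            # congested: bypass the ordinary path so low-delay data is moved
            # through the low-delay data processing channel
            self.low_delay_pool.append(dma_instr)
            return "low_delay_channel"
        # ordinary path: cache in the first DMA queue, forward to the second DMA queue
        self.first_dma_queue.append(dma_instr)
        self.second_dma_queue.append(dma_instr)
        return "first_channel"
```

With a threshold of 2, a tagged instruction arriving while three instructions are already cached is routed to the low-delay pool, while untagged instructions always take the first data processing channel.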
It should be noted that, in this application, a plurality may refer to two or more, and will not be described in detail later.
Referring to fig. 2, fig. 2 is a schematic flow chart of a processing method of a non-volatile memory standard NVMe instruction provided in an embodiment of the present application, applied to a driving module in an embedded virtualization system. The embedded virtualization system includes an FPGA controller and the driving module; the FPGA controller is connected to a target server and to the driving module; the embedded virtualization system is configured to provide a simulated NVMe disk for the target server; and the FPGA controller is configured to receive NVMe request instructions sent by the target server and send them to the driving module. The driving module includes a first sending queue, a first DMA queue, a first recovery queue and the simulated NVMe disk. The FPGA controller includes a second sending queue, a second DMA queue, a second recovery queue and a low-delay DMA queue pool, where the low-delay DMA queue pool includes a plurality of low-delay DMA queues, the second DMA queue corresponds to a first data processing channel, the low-delay DMA queue pool corresponds to a low-delay data processing channel, the second sending queue is used for caching the NVMe request instructions received by the FPGA controller, and the first sending queue is used for caching the NVMe request instructions received by the driving module. As shown in the figure, the processing method of the non-volatile memory standard NVMe instruction includes the following operations.
S201, analyzing each NVMe request instruction according to the queue sequence of a plurality of NVMe request instructions in the first sending queue to obtain NVMe instruction information.
The embedded virtualization system is used for offloading the virtualization services of the server (host machine) into hardware by means of embedded virtualization technology, so as to save resources on the server host side. The embedded virtualization system can present a simulated NVMe disk to the server, and an application program of the server can interact with the simulated NVMe disk through a standard NVMe driver to read or write data. To carry application data from the server to the simulated NVMe disk, the embedded virtualization system provides a dedicated data transmission channel for the simulated NVMe disk; this channel comprises a low-delay data processing channel and a first data processing channel, the two types of channels can transmit the data of their corresponding application programs in parallel, and the simulated NVMe disk can store the application data of the target server.
The low-delay data processing channel and the first data processing channel are independent DMA channels between the FPGA controller and the driving module, where a DMA channel is a data processing channel for processing DMA instructions generated from NVMe instruction information. The low-delay data processing channel is used for carrying low-delay data, namely the second data, and the first data processing channel is used for carrying both low-delay data and common data of application programs.
The queue order refers to the queuing order of the NVMe request instruction in the first sending queue.
The driving module can acquire an NVMe request instruction from the second sending queue of the FPGA controller and add it to the first sending queue, which caches it. The driving module also has a parsing function for NVMe request instructions: it can parse an NVMe request instruction in the first sending queue to obtain the NVMe instruction information of that instruction.
Optionally, before step S201, the following steps may be further included: receiving an instruction fetching signal sent by the FPGA controller; and, according to the instruction fetching signal, acquiring the NVMe request instruction corresponding to the instruction fetching signal from the second sending queue of the FPGA controller.
the FPGA controller may generate a corresponding instruction fetch signal for each of the plurality of NVMe request instructions in the second transmit queue, and transmit the instruction fetch signal to the driving module. The driving module can orderly acquire the corresponding NVMe instruction in the second transmission queue according to the instruction fetching signal sent by the FPGA controller, thereby being beneficial to improving the data order of the subsequent DMA instruction generation and application program moving, and further improving the data moving efficiency.
The NVMe instruction information comprises at least one of the following: a namespace identifier (Namespace Identifier), a data pointer, a metadata pointer, a low-delay tag, and a command double word value.
Wherein the NVMe instruction includes a plurality of command double word (Command Dword) values, for example: Command Dword0, Command Dword1, Command Dword2, and so forth.
Metadata is data that describes other data, i.e. information describing data attributes, and is used to support functions such as indicating storage locations, historical data, resource searching and file recording.
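As an illustration of how such instruction information might be extracted, the sketch below pulls the fields listed above out of a 64-byte NVMe submission queue entry. The offsets follow the common NVMe command layout (opcode and command identifier in Command Dword0, namespace identifier in Command Dword1, metadata pointer at bytes 16-23, data pointer PRP1 at bytes 24-31); treat them as an assumption about the on-wire format, not as something stated in this document.

```python
import struct

def parse_sqe_fields(sqe: bytes) -> dict:
    """Parse a few named fields from a 64-byte submission queue entry."""
    cdw0, nsid = struct.unpack_from("<II", sqe, 0)   # Command Dword0, Dword1
    (mptr,) = struct.unpack_from("<Q", sqe, 16)      # metadata pointer
    (prp1,) = struct.unpack_from("<Q", sqe, 24)      # data pointer (PRP1)
    return {
        "opcode": cdw0 & 0xFF,        # low byte of Command Dword0
        "command_id": cdw0 >> 16,     # command identifier in the high half of CDW0
        "namespace_id": nsid,         # Command Dword1
        "metadata_pointer": mptr,
        "data_pointer": prp1,
    }
```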
S202, determining whether the NVMe instruction information comprises a low latency tag.
The low latency tag may be attached to the NVMe request command, and if the low latency tag is included in the NVMe request command, the low latency tag is included in the NVMe command information to which the driving module parses the NVMe request command.
Whether an NVMe request instruction was sent by a delay-sensitive application program or a non-delay-sensitive application program through the standard NVMe driver of the server can be determined from whether the NVMe request instruction includes a low-delay tag: if it includes the low-delay tag, it is an NVMe request instruction sent by a delay-sensitive application program through the standard NVMe driver of the target server; if it does not include the low-delay tag, it is an NVMe request instruction sent by a non-delay-sensitive application program through the standard NVMe driver of the target server.
A delay-sensitive application is an important application such as a database: its data volume is relatively small, but it usually requires relatively low latency and its data must be processed promptly. However, when the data transmission path corresponding to the simulated NVMe disk cannot distinguish whether an application is delay-sensitive, and the low-delay data of delay-sensitive applications and the common data of non-delay-sensitive applications are transmitted through the same data transmission path, the low-delay data easily fails to be processed in time, causing a "noisy neighbor" problem. For example, application a is sending a large I/O data flow to queue 1 while delay-sensitive application b is waiting to send metadata instructions through queue 1 and expects them to return quickly; because of the limited bandwidth of queue 1, the bandwidth occupied by application a forces the metadata instructions of delay-sensitive application b to queue for processing, which degrades the performance of delay-sensitive application b and may even prevent it from running normally due to metadata instruction processing timeouts.
S203, if the NVMe instruction information does not comprise the low-latency tag, generating a first DMA instruction according to the NVMe instruction information, and adding the first DMA instruction to the first DMA queue; and sending the first DMA instruction to the second DMA queue of the FPGA controller, wherein the second DMA queue is used for the FPGA controller to move first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel, and the first data is data of a non-delay-sensitive application program of the target server.
The first DMA queue can cache a plurality of DMA instructions, the plurality of DMA instructions comprise a first DMA instruction and a second DMA instruction, and the driving module can send the DMA instructions in the first DMA queue to the second DMA queue of the FPGA controller according to the queue sequence of the DMA instructions in the first DMA queue.
The FPGA controller moves first data of the non-delay sensitive application program from the target server to the FPGA controller, and then moves to a corresponding simulated NVMe disk through the first data processing channel.
Optionally, in step 203, the generating the first DMA command according to the NVMe command information may include the following steps: determining DMA instruction data parameters according to the NVMe instruction information; determining a DMA instruction transmission mode; and carrying out initialization configuration of the first DMA instruction according to the DMA instruction data parameters and the DMA instruction transmission mode to obtain the first DMA instruction.
The DMA command data parameters include a source address, a destination address, a transmission direction, a data transmission amount, a word width, a length, and the like.
The DMA instruction transmission modes include a cyclic transmission mode and a normal mode. In the normal mode, the number of DMA transfers is limited in advance, and transmission stops after the specified number of transfers completes; in the cyclic transmission mode, when the remaining amount of transfer data reaches 0 and the DMA transfer stops, adding a new amount of transfer data restarts the DMA transfer.
Optionally, the driving module may determine DMA instruction data parameters according to the NVMe instruction information; determining a DMA instruction transmission mode; and carrying out initialization configuration of the second DMA instruction according to the DMA instruction data parameters and the DMA instruction transmission mode to obtain the second DMA instruction.
Therefore, the driving module can determine the DMA command data parameters and determine the DMA command transmission mode according to the command information, and perform initialization configuration of the DMA command according to the DMA command data parameters and the DMA command transmission mode to generate a first DMA command or a second DMA command, so that the second DMA command is favorably transmitted to the low-delay queue pool in the follow-up process, and the low-delay data is timely processed.
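The initialization configuration described in these steps might be sketched as follows. The field and function names mirror the listed DMA instruction data parameters but are our own; `disk_offset` and the default direction are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum

class TransferMode(Enum):
    NORMAL = "normal"    # stop after the specified number of transfers
    CYCLIC = "cyclic"    # restart whenever a new amount of transfer data is added

@dataclass
class DmaInstruction:
    """Sketch of an initialized DMA instruction (illustrative field names)."""
    source_addr: int
    dest_addr: int
    direction: str        # transmission direction, e.g. "host_to_disk"
    length: int           # data transmission amount in bytes
    word_width: int
    mode: TransferMode

def build_dma_instruction(nvme_info: dict, mode: TransferMode) -> DmaInstruction:
    # derive the DMA instruction data parameters from the parsed
    # NVMe instruction information, then perform initialization configuration
    return DmaInstruction(
        source_addr=nvme_info["data_pointer"],
        dest_addr=nvme_info["disk_offset"],
        direction=nvme_info.get("direction", "host_to_disk"),
        length=nvme_info["length"],
        word_width=nvme_info.get("word_width", 4),
        mode=mode,
    )
```

The same builder serves for both the first and the second DMA instruction; only the routing decision afterwards differs.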
S204, if the NVMe request instruction comprises the low-delay tag, generating a second DMA instruction according to the NVMe instruction information, determining the number of cache instructions in the first DMA queue currently, and judging whether the number of cache instructions is larger than a preset threshold.
The number of cache instructions determined here refers to the number of DMA instructions (including the first DMA instruction and the second DMA instruction) in the first DMA queue currently.
The preset threshold may be set manually or by default, which is not limited herein.
And S205, if the number of the cache instructions is greater than the preset threshold, sending the second DMA instruction to the low-delay DMA queue pool in the FPGA controller, wherein the low-delay DMA queue pool is used for the FPGA controller to move second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-delay data processing channel, and the second data is low-delay data of a delay-sensitive application program of the target server.
If the number of cache instructions is smaller than or equal to the preset threshold, the first DMA queue can still accept the second DMA instruction; the second DMA instruction is then sent to the second DMA queue and can be processed in time by means of the first data processing channel.
Wherein the low latency DMA queue pool includes a plurality of low latency DMA queues, the driver module may add the second DMA instruction to the low latency DMA queues in the low latency DMA queue pool.
The FPGA controller moves the second data of the delay sensitive application program from the target server to the FPGA controller, and then moves the second data to the corresponding simulated NVMe disk through the low-delay data processing channel.
S206, if the number of the cache instructions is smaller than or equal to the preset threshold, adding the second DMA instruction to the first DMA queue; and sending the second DMA instruction to the second DMA queue of the FPGA controller, where the second DMA queue is configured to move, by the FPGA controller through the first data processing channel, the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk.
The FPGA controller moves second data of the delay-sensitive application program from the target server to the FPGA controller, and then moves the second data to a corresponding simulated NVMe disk through the first data processing channel.
Optionally, considering that the amount of data to be processed by the low-delay application program corresponding to the second DMA instruction is relatively small and differs from the amount of data to be processed by the application program corresponding to the first DMA instruction, an acceptable waiting processing time for the second DMA instruction may be determined. Within a preset time period, the number of DMA instructions processed in the first DMA queue during each such waiting processing time may be counted, yielding a plurality of first instruction numbers, and a second preset threshold may be determined from them; the second preset threshold may be obtained by performing data analysis on the plurality of first instruction numbers, or may be the average of the plurality of first instruction numbers. Meanwhile, the moment at which the preset threshold is updated to the second preset threshold may be determined as the update time, and a first preset time threshold corresponding to the update time may be set; the register is updated when the update time is greater than the first preset time threshold. The determination of the first preset time threshold is related to the system load state, and a correspondence between the first preset time threshold and the system load state can be set: a smaller system load (for example, a system load of 3%) can correspond to a smaller first preset time threshold, and a larger system load (for example, a system load of 70%) to a larger first preset time threshold. The current system load state can be detected, and the first preset time threshold corresponding to the current system load state determined according to this correspondence.
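The threshold update policy in the preceding paragraph can be sketched as below. Everything here is illustrative: the load-to-time-threshold table, the choice of the average as the "data analysis", and the bucket lookup are assumptions, not the patent's exact scheme.

```python
def updated_threshold(first_instruction_counts, update_interval, system_load,
                      load_to_time_threshold):
    """Return the new (second) preset threshold, or None if the register
    should not be updated yet.

    first_instruction_counts: DMA instruction counts observed in the first
        DMA queue over successive waiting-processing-time windows.
    update_interval: time elapsed since the last threshold update.
    load_to_time_threshold: map from load bucket upper bound (0..1) to the
        first preset time threshold for that load level.
    """
    # second preset threshold: average of the observed first instruction numbers
    second_threshold = sum(first_instruction_counts) / len(first_instruction_counts)
    # pick the load bucket: smallest configured upper bound >= current load
    bucket = min(load for load in load_to_time_threshold if load >= system_load)
    time_threshold = load_to_time_threshold[bucket]
    # only write the register once enough time has passed for this load level
    return second_threshold if update_interval > time_threshold else None
```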
In terms of setting the minimum value of the first preset time threshold, the maximum instruction delay can be taken as the minimum value of the first preset time threshold in consideration of the fact that the magnitude of the first preset time threshold affects the instruction delay of updating the register. Optionally, after the step 206, the following steps may be further included: receiving a first moving completion signal of the first data or a second moving completion signal of the second data sent by the FPGA controller, and generating a corresponding first completion instruction according to the first moving completion signal or a corresponding second completion instruction according to the second moving completion signal; adding the first completion instruction or the second completion instruction to the first reclamation queue, wherein the first reclamation queue is used for caching the received first completion instruction or the second completion instruction; the first completion instruction is sent to the target server or the second completion instruction is sent to the target server through the FPGA controller, wherein the first completion instruction is used for generating a corresponding first recovery success signal by the target server, and the second completion instruction is used for generating a corresponding second recovery success signal by the target server; and receiving the first recovery success signal or the second recovery success signal forwarded by the target server through the FPGA controller, and updating a head pointer of the first recovery queue according to the first recovery success signal or the second recovery success signal.
After the first data are conveyed to the corresponding simulated NVMe disk, the FPGA controller sends a first conveying completion signal to the driving module, and after the second data are conveyed to the corresponding simulated NVMe disk, the FPGA controller sends a second conveying completion signal to the driving module.
The target server can add the received first completion instruction and the second completion instruction to the third recovery queue for processing, generate a corresponding first recovery success signal when the first completion instruction is processed, transmit the corresponding first recovery success signal to the driving module through the FPGA controller, and generate a corresponding second recovery success signal when the second completion instruction is processed, and transmit the corresponding second recovery success signal to the driving module through the FPGA controller.
In a specific implementation, the driving module receives from the FPGA controller a first moving completion signal generated when the moving of the first data completes, or a second moving completion signal generated when the moving of the second data completes. It generates a corresponding first completion instruction according to the first moving completion signal, or a corresponding second completion instruction according to the second moving completion signal, and adds the first or second completion instruction to the first recovery queue. It then sends the first or second completion instruction to the target server through the FPGA controller, receives the first recovery success signal generated when the first completion instruction is processed, or the second recovery success signal generated when the second completion instruction is processed, forwarded by the target server through the FPGA controller, and updates the head pointer of the first recovery queue according to the first or second recovery success signal.
Therefore, the driving module can generate a completion instruction and forward the completion instruction to the target server for processing through the FPGA controller, and further update the head pointer of the first recovery queue according to the recovery success signal forwarded by the target server through the FPGA controller so as to enter the processing process of the next NVMe instruction of the application program, thereby being beneficial to the stable operation of the application program.
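The completion path above can be condensed into a small sketch. The class, slot layout and signal dictionaries are our own illustration of the described head-pointer bookkeeping, not the patent's data structures.

```python
class DrivingModuleModel:
    """Sketch of the completion path: a moving completion signal from the
    FPGA controller becomes a completion instruction cached in the first
    recovery queue, and a recovery success signal forwarded from the target
    server advances the queue's head pointer."""

    def __init__(self, depth):
        self.first_recovery_queue = [None] * depth
        self.head = 0   # head pointer, updated on recovery success
        self.tail = 0

    def on_move_complete(self, move_signal):
        completion = {"for": move_signal["dma_id"], "status": "ok"}
        self.first_recovery_queue[self.tail % len(self.first_recovery_queue)] = completion
        self.tail += 1
        return completion   # forwarded to the target server via the FPGA controller

    def on_recovery_success(self, _success_signal):
        self.head += 1      # the completion slot can now be reused
```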
It can be seen that in the embodiment of the present application, the driving module in the embedded virtualization system may parse each NVMe request instruction according to the queue order of the plurality of NVMe request instructions in the first sending queue to obtain NVMe instruction information, and then determine whether the NVMe instruction information includes a low-delay tag. If it does not include the low-delay tag, the driving module generates a first DMA instruction according to the NVMe instruction information, adds it to the first DMA queue, and sends it to the second DMA queue of the FPGA controller, which the FPGA controller uses to move the first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel. If the NVMe instruction information includes the low-delay tag, the driving module generates a second DMA instruction according to the NVMe instruction information, determines the current number of cache instructions in the first DMA queue, and judges whether it is greater than the preset threshold; if it is greater, the second DMA instruction is sent to the low-delay DMA queue pool of the FPGA controller, which moves the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-delay data processing channel; if it is less than or equal to the preset threshold, the driving module adds the second DMA instruction to the first DMA queue and sends it to the second DMA queue of the FPGA controller, which moves the second data to the simulated NVMe disk through the first data processing channel. Therefore, whether the second DMA instruction is sent to the low-delay DMA queue pool or to the second DMA queue is determined according to the low-delay tag in the NVMe request instruction and the current number of cache instructions in the first DMA queue, and the DMA instructions in the low-delay DMA queue pool and in the second DMA queue use different data processing channels to carry application data, so that the low-delay data of delay-sensitive applications can be processed in time and carried to the corresponding simulated NVMe disk, which is favorable for improving the running performance of delay-sensitive applications.
Referring to fig. 3, fig. 3 is a schematic flow chart of another processing method of a non-volatile memory standard NVMe instruction provided in an embodiment of the present application, applied to a target server connected to an FPGA controller in an embedded virtualization system. The embedded virtualization system includes the FPGA controller and a driving module; the FPGA controller is respectively connected to the target server and the driving module; the embedded virtualization system is configured to provide a simulated NVMe disk for the target server; and the FPGA controller is configured to receive NVMe request instructions sent by the target server and send them to the driving module. The driving module includes a first sending queue, a first DMA queue, a first recovery queue and the simulated NVMe disk. The FPGA controller includes a second sending queue, a second DMA queue, a second recovery queue and a low-delay DMA queue pool, where the second DMA queue corresponds to a first data processing channel and the low-delay DMA queue pool corresponds to a low-delay data processing channel; the second DMA queue is used for the FPGA controller to move the first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel, and the low-delay DMA queue pool is used for the FPGA controller to move the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-delay data processing channel. As shown in the figure, the processing method of the non-volatile memory standard NVMe instruction includes the following operations.
S301, determining whether the target application program is a delay-sensitive application program.
Both the delay-sensitive application and the non-delay-sensitive application can send NVMe request instructions to the third sending queue through the NVMe standard driver of the target server.
The target application may be either a delay-sensitive application or a non-delay-sensitive application.
S302, if the target application program is the delay sensitive application program, adding a low-delay tag to the NVMe request instruction when the NVMe request instruction corresponding to the target application program is sent to the FPGA controller.
Optionally, in step S302, the adding a low latency tag to the NVMe request instruction may include the following steps: determining a reserved bit in the NVMe request instruction; acquiring byte adjustment information of the low-delay tag; and adjusting at least one byte in the reserved bit according to the byte adjustment information so that the NVMe request instruction comprises a low-delay tag.
The NVMe request instruction contains reserved bits. For example, if the NVMe request instruction includes 60 bytes of data content, the 9th byte through the 16th byte may be preset as reserved bits.
The byte adjustment information specifies how at least one of the bytes within the reserved bits is adjusted. For example, the 9th byte may be set to 1 and the 11th byte to 0 so that the reserved bits carry a low-latency tag; when the driving module parses the NVMe request instruction and finds that the 9th byte is 1 and the 11th byte is 0, it considers the NVMe request instruction to include a low-latency tag.
In this way, the target server attaches a low-latency tag to the reserved bits of the NVMe request instruction, distinguishing NVMe request instructions sent by delay-sensitive applications from those sent by non-delay-sensitive applications. Instructions from delay-sensitive applications can then be carried to the simulated NVMe disk through a low-latency channel, which improves the running performance of delay-sensitive applications on the target server.
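The tagging scheme described above can be sketched as follows. This is a minimal Python illustration, assuming a 60-byte command buffer and the example byte positions from the text (9th byte set to 1, 11th byte set to 0); the constants and function names are hypothetical and are not part of the NVMe specification.

```python
# Sketch of the low-latency tagging described above. The 60-byte command
# layout and the adjusted byte positions follow the example in the text;
# offsets below are 0-based, so the "9th byte" is index 8.

CMD_SIZE = 60          # bytes of data content in one NVMe request instruction
RESERVED_START = 8     # 0-based offset of the 9th byte
RESERVED_END = 16      # reserved bits span the 9th..16th bytes (example)

def add_low_latency_tag(cmd: bytearray) -> bytearray:
    """Adjust bytes inside the reserved bits so the instruction carries a tag."""
    assert len(cmd) == CMD_SIZE
    cmd[8] = 1    # 9th byte -> 1
    cmd[10] = 0   # 11th byte -> 0
    return cmd

def has_low_latency_tag(cmd: bytes) -> bool:
    """How the driving module recognizes the tag when parsing the instruction."""
    return cmd[8] == 1 and cmd[10] == 0
```

A freshly zeroed command buffer carries no tag under this encoding, so untagged instructions from non-delay-sensitive applications need no extra handling.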
S303, receiving a first completion instruction or a second completion instruction sent by the FPGA controller.
The first completion instruction or the second completion instruction is generated by the driving module and forwarded to the target server through the FPGA controller.
S304, generating a first recovery success signal according to the first completion instruction or generating a second recovery success signal according to the second completion instruction, and sending the first recovery success signal or the second recovery success signal to the driving module through the FPGA controller.
It can be seen that in this embodiment of the present application, the target server determines whether the target application is a delay-sensitive application; if so, it adds a low-latency tag to the NVMe request instruction when sending the instruction to the FPGA controller. The target server then receives a first completion instruction or a second completion instruction sent by the FPGA controller, generates the corresponding first or second recovery success signal, and sends that signal to the driving module through the FPGA controller. The FPGA controller moves first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel, and moves second data corresponding to the second DMA instruction to the simulated NVMe disk through either the low-latency data processing channel or the first data processing channel. In this way, DMA instructions in the low-latency DMA queue pool and DMA instructions in the second DMA queue use different data processing channels to carry application data, so low-latency data of delay-sensitive applications can be processed in time and carried to the corresponding simulated disk, which helps improve the running performance of delay-sensitive applications.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device A provided in an embodiment of the present application. The electronic device A is applied to a driving module in an embedded virtualization system, where the embedded virtualization system includes an FPGA controller and the driving module, the FPGA controller is respectively connected to a target server and the driving module, and the embedded virtualization system is configured to provide a simulated NVMe disk for the target server. The FPGA controller is configured to receive an NVMe request instruction sent by the target server and send the NVMe request instruction to the driving module. The driving module includes a first sending queue, a first DMA queue, a first reclamation queue, and the simulated NVMe disk; the FPGA controller includes a second sending queue, a second DMA queue, a second reclamation queue, and a low-latency DMA queue pool, where the low-latency DMA queue pool includes a plurality of low-latency DMA queues, the second DMA queue corresponds to a first data processing channel, and the low-latency DMA queue pool corresponds to a low-latency data processing channel. The electronic device A includes a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and are configured with instructions executed by the processor to perform the following steps:
Analyzing each NVMe request instruction according to the queue sequence of a plurality of NVMe request instructions in the first sending queue to obtain NVMe instruction information;
determining whether the NVMe instruction information includes a low latency tag;
if the NVMe instruction information does not include the low-latency tag, generating a first DMA instruction according to the NVMe instruction information, adding the first DMA instruction to the first DMA queue, and sending the first DMA instruction to the second DMA queue of the FPGA controller, where the second DMA queue is used for the FPGA controller to move first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel, and the first data is data of a non-delay-sensitive application of the target server;
if the NVMe request instruction includes the low-latency tag, generating a second DMA instruction according to the NVMe instruction information, determining the current number of cached instructions in the first DMA queue, and judging whether the number of cached instructions is greater than a preset threshold;
if the number of the cache instructions is greater than the preset threshold, sending the second DMA instruction to the low-delay DMA queue pool in the FPGA controller, wherein the low-delay DMA queue pool is used for the FPGA controller to move second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-delay data processing channel, and the second data is low-delay data of a delay-sensitive application program of the target server;
If the number of the cache instructions is smaller than or equal to the preset threshold value, adding the second DMA instruction to the first DMA queue; and sending the second DMA instruction to the second DMA queue of the FPGA controller, where the second DMA queue is configured to move, by the FPGA controller through the first data processing channel, the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk.
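The routing decision in the steps above can be sketched as follows. This is a hypothetical Python illustration of the threshold-based dispatch: an untagged instruction always goes to the first DMA queue, while a tagged instruction goes to the low-latency DMA queue pool only when the first DMA queue is already backed up. The queue objects, the dictionary-based DMA instruction, and the threshold value are all assumptions for illustration.

```python
# Sketch of the dispatch logic: route a DMA instruction built from a tagged
# NVMe request to the low-latency DMA queue pool only when the number of
# instructions cached in the first DMA queue exceeds a preset threshold.
from collections import deque

PRESET_THRESHOLD = 4   # assumed value for illustration

first_dma_queue = deque()   # driving module side, feeds the second DMA queue
low_latency_pool = deque()  # FPGA controller side, low-latency data channel

def dispatch(dma_instr: dict, has_tag: bool) -> str:
    """Place the DMA instruction and return the name of the chosen queue."""
    if not has_tag:
        first_dma_queue.append(dma_instr)   # first data processing channel
        return "first_dma_queue"
    # tagged: check how many instructions are cached in the first DMA queue
    if len(first_dma_queue) > PRESET_THRESHOLD:
        low_latency_pool.append(dma_instr)  # low-latency data processing channel
        return "low_latency_pool"
    first_dma_queue.append(dma_instr)       # queue short enough; reuse it
    return "first_dma_queue"
```

The design point is that a tagged instruction only pays for the separate low-latency channel when the ordinary path is congested; otherwise it rides the first channel with no reordering cost.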
It can be seen that the electronic device A described in the embodiment of the present application parses each NVMe request instruction according to the queue sequence of the plurality of NVMe request instructions in the first sending queue to obtain NVMe instruction information, and then determines whether the NVMe instruction information includes a low-latency tag. If the NVMe instruction information does not include the low-latency tag, a first DMA instruction is generated according to the NVMe instruction information, added to the first DMA queue, and sent to the second DMA queue of the FPGA controller, which moves the first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel; the first data is data of a non-delay-sensitive application of the target server. If the NVMe instruction information includes the low-latency tag, a second DMA instruction is generated according to the NVMe instruction information, the current number of cached instructions in the first DMA queue is determined, and that number is compared with a preset threshold. If the number of cached instructions is greater than the preset threshold, the second DMA instruction is sent to the low-latency DMA queue pool in the FPGA controller, which moves the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-latency data processing channel; the second data is low-latency data of a delay-sensitive application of the target server. If the number of cached instructions is less than or equal to the preset threshold, the second DMA instruction is added to the first DMA queue and sent to the second DMA queue of the FPGA controller, which moves the second data to the simulated NVMe disk through the first data processing channel. In this way, whether the second DMA instruction is sent to the low-latency DMA queue pool or to the second DMA queue is determined according to the low-latency tag in the NVMe instruction and the current number of cached instructions in the first DMA queue, and the DMA instructions in the low-latency DMA queue pool and in the second DMA queue use different data processing channels to carry application data, so low-latency data of delay-sensitive applications can be processed in time and carried to the corresponding simulated disk, which helps improve the running performance of delay-sensitive applications.
In one possible example, in generating the first DMA instruction according to the NVMe instruction information, the program includes instructions for:
determining DMA instruction data parameters according to the NVMe instruction information;
determining a DMA instruction transmission mode;
and carrying out initialization configuration of the first DMA instruction according to the DMA instruction data parameters and the DMA instruction transmission mode to obtain the first DMA instruction.
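A minimal sketch of these three steps, assuming the NVMe instruction information is available as a dictionary; every field name and the default transmission mode are illustrative assumptions, since the text does not fix a concrete layout for the DMA instruction.

```python
# Hypothetical sketch: derive DMA instruction data parameters from the NVMe
# instruction information, pick a transmission mode, and produce an
# initialized first DMA instruction.

def build_dma_instruction(nvme_info: dict) -> dict:
    # DMA instruction data parameters determined from the NVMe instruction info
    params = {
        "src_addr": nvme_info["host_addr"],   # where the data lives on the host
        "dst_addr": nvme_info["disk_addr"],   # target in the simulated NVMe disk
        "length": nvme_info["length"],        # bytes to move
    }
    # DMA instruction transmission mode (default is an assumption)
    mode = nvme_info.get("mode", "scatter_gather")
    # initialization configuration combining parameters and mode
    return {**params, "mode": mode, "state": "initialized"}
```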
In one possible example, the above-described program further includes instructions for performing the steps of:
receiving a first moving completion signal of the first data or a second moving completion signal of the second data sent by the FPGA controller, and generating a corresponding first completion instruction according to the first moving completion signal or a corresponding second completion instruction according to the second moving completion signal;
adding the first completion instruction or the second completion instruction to the first reclamation queue, wherein the first reclamation queue is used for caching the received first completion instruction or the second completion instruction;
sending the first completion instruction or the second completion instruction to the target server through the FPGA controller, where the first completion instruction is used by the target server to generate a corresponding first recovery success signal, and the second completion instruction is used by the target server to generate a corresponding second recovery success signal;
And receiving the first recovery success signal or the second recovery success signal forwarded by the target server through the FPGA controller, and updating a head pointer of the first recovery queue according to the first recovery success signal or the second recovery success signal.
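The completion and reclamation flow above can be sketched as a ring buffer whose head pointer advances only when a recovery success signal comes back from the target server, mirroring the first reclamation queue. The fixed capacity and the method names are assumptions for illustration.

```python
# Sketch of the first reclamation queue: completion instructions are cached
# at the tail; the head pointer is updated only after the target server's
# recovery success signal is received via the FPGA controller.

class ReclaimQueue:
    def __init__(self, capacity: int = 8):
        self.slots = [None] * capacity
        self.head = 0   # oldest completion instruction not yet reclaimed
        self.tail = 0   # next free slot

    def push_completion(self, instr):
        """Cache a first or second completion instruction."""
        assert (self.tail + 1) % len(self.slots) != self.head, "queue full"
        self.slots[self.tail] = instr
        self.tail = (self.tail + 1) % len(self.slots)

    def on_recovery_success(self):
        """Advance the head pointer when a recovery success signal arrives."""
        assert self.head != self.tail, "nothing to reclaim"
        self.slots[self.head] = None
        self.head = (self.head + 1) % len(self.slots)
```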
In one possible example, the above-described program further includes instructions for performing the steps of:
receiving an instruction fetching signal sent by the FPGA controller;
according to the instruction fetching signal, acquiring the NVMe request instruction corresponding to the instruction fetching signal from the second sending queue in the FPGA controller, and adding the NVMe request instruction to the first sending queue.
referring to fig. 5, fig. 5 is a schematic structural diagram of an electronic device B provided in the embodiment of the present application, where as shown in the drawing, the electronic device B is applied to a target server, the target server is connected to an FPGA controller in an embedded virtualization system, the embedded virtualization system includes the FPGA controller and a driving module, the FPGA controller is respectively connected to the target server and the driving module, the embedded virtualization system is configured to provide an emulated NVMe disk for the target server, the FPGA controller is configured to receive an NVMe request instruction sent by the target server and send the NVMe request instruction to the driving module, the driving module includes a first sending queue, a first DMA queue, a first recycling queue, and the emulated NVMe disk, the FPGA controller includes a second sending queue, a second DMA queue, a second recycling queue, and a low-latency DMA queue pool, the second queue corresponds to a first data processing channel, the low-latency DMA queue is configured to receive the NVMe request instruction from the target server, the low-latency DMA queue is further configured to send the emulated data to the first DMA queue through the first DMA queue, the low-latency DMA queue is configured to send the emulated data to the first DMA queue, the low-latency DMA queue is configured to the first DMA queue, and the low-latency DMA queue is configured to send the emulated data to the first DMA queue, and the low-latency DMA queue is configured to the low latency DMA queue is configured to be processed to the low latency buffer, and the low latency DMA queue is configured to be processed to be in the buffer, and the low latency buffer is in the buffer, and the low latency is in the buffer. 
And moving the second data corresponding to the second DMA instruction in the target server to the emulated NVMe disk via the first data processing channel, the electronic device B including a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory, the one or more programs configured with instructions for execution by the processor to:
Determining whether the target application is a delay-sensitive application;
if the target application program is the delay sensitive application program, adding a low-delay tag on the NVMe request instruction when the NVMe request instruction corresponding to the target application program is sent to the FPGA controller;
receiving a first completion instruction or a second completion instruction sent by the FPGA controller;
and generating a first recovery success signal according to the first completion instruction or generating a second recovery success signal according to the second completion instruction, and sending the first recovery success signal or the second recovery success signal to the driving module through the FPGA controller.
It can be seen that the electronic device B described in the embodiment of the present application determines whether the target application is a delay-sensitive application; if so, it adds a low-latency tag to the NVMe request instruction when the NVMe request instruction corresponding to the target application is sent to the FPGA controller. The target server receives a first completion instruction or a second completion instruction sent by the FPGA controller, generates the corresponding first or second recovery success signal, and sends that signal to the driving module through the FPGA controller. The FPGA controller moves first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel, and moves second data corresponding to the second DMA instruction to the simulated NVMe disk through either the low-latency data processing channel or the first data processing channel. In this way, DMA instructions in the low-latency DMA queue pool and DMA instructions in the second DMA queue use different data processing channels to carry application data, so low-latency data of delay-sensitive applications can be processed in time and carried to the corresponding simulated disk, which helps improve the running performance of delay-sensitive applications.
In one possible example, in adding a low latency tag to the NVMe request instruction, the above program includes instructions for:
determining a reserved bit in the NVMe request instruction;
acquiring byte adjustment information of the low-delay tag;
and adjusting at least one byte in the reserved bit according to the byte adjustment information so that the NVMe request instruction comprises a low-delay tag.
The foregoing description of the embodiments of the present application has been presented primarily in terms of a method-side implementation. It will be appreciated that the electronic device, in order to achieve the above-described functions, includes corresponding hardware structures and/or software modules that perform the respective functions. Those of skill in the art will readily appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied as hardware or a combination of hardware and computer software. Whether a function is implemented as hardware or computer software driven hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The embodiment of the application may divide the functional units of the electronic device according to the above method example, for example, each functional unit may be divided corresponding to each function, or two or more functions may be integrated in one processing unit. The integrated units may be implemented in hardware or in software functional units. It should be noted that, in the embodiment of the present application, the division of the units is schematic, which is merely a logic function division, and other division manners may be implemented in actual practice.
In the case of dividing each functional module by its corresponding function, fig. 6 shows a schematic diagram of a processing device A of a nonvolatile memory standard NVMe instruction. As shown in fig. 6, the device is applied to a driving module in an embedded virtualization system, where the embedded virtualization system includes an FPGA controller and the driving module, the FPGA controller is respectively connected to a target server and the driving module, and the embedded virtualization system is configured to provide a simulated NVMe disk for the target server. The FPGA controller is configured to receive an NVMe request instruction sent by the target server and send the NVMe request instruction to the driving module. The driving module includes a first sending queue, a first DMA queue, a first reclamation queue, and the simulated NVMe disk; the FPGA controller includes a second sending queue, a second DMA queue, a second reclamation queue, and a low-latency DMA queue pool, where the low-latency DMA queue pool includes a plurality of low-latency DMA queues, the second DMA queue corresponds to a first data processing channel, and the low-latency DMA queue pool corresponds to a low-latency data processing channel. The processing device A 600 of the nonvolatile memory standard NVMe instruction may include: a parsing unit 601, a determining unit A 602, a sending unit A 603, a judging unit 604, a generating unit 605, and an obtaining unit 606, wherein,
The parsing unit 601 is configured to parse each NVMe request instruction according to a queue order of a plurality of NVMe request instructions in the first sending queue, so as to obtain NVMe instruction information;
the determining unit a 602 is configured to determine whether the NVMe instruction information includes a low latency tag;
the sending unit a 603 is configured to generate a first DMA instruction according to the NVMe instruction information if the NVMe instruction information does not include the low latency tag, and add the first DMA instruction to the first DMA queue; the first DMA command is sent to the second DMA queue of the FPGA controller, wherein the second DMA queue is used for the FPGA controller to move first data corresponding to the first DMA command in the target server to the simulated NVMe disk through the first data processing channel, and the first data is data of a non-delay sensitive application program of the target server;
the judging unit 604 is configured to generate a second DMA instruction according to the NVMe instruction information if the NVMe request instruction includes the low-latency tag, determine the current number of cached instructions in the first DMA queue, and judge whether the number of cached instructions is greater than a preset threshold;
The sending unit a 603 is further configured to send the second DMA instruction to the low latency DMA queue pool in the FPGA controller if the number of cache instructions is greater than the preset threshold, where the low latency DMA queue pool is used for the FPGA controller to move, through the low latency data processing channel, second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk, where the second data is low latency data of a latency sensitive application of the target server;
the sending unit a 603 is further configured to add the second DMA instruction to the first DMA queue if the number of buffered instructions is less than or equal to the preset threshold; and sending the second DMA instruction to the second DMA queue of the FPGA controller, wherein the second DMA queue is used for the FPGA controller to move the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the first data processing channel.
It can be seen that the processing device A for the nonvolatile memory standard NVMe instruction described in this embodiment of the present application parses each NVMe request instruction according to the queue sequence of the plurality of NVMe request instructions in the first sending queue to obtain NVMe instruction information, and then determines whether the NVMe instruction information includes a low-latency tag. If the NVMe instruction information does not include the low-latency tag, a first DMA instruction is generated according to the NVMe instruction information, added to the first DMA queue, and sent to the second DMA queue of the FPGA controller, which moves the first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel; the first data is data of a non-delay-sensitive application of the target server. If the NVMe request instruction includes the low-latency tag, a second DMA instruction is generated according to the NVMe instruction information, the current number of cached instructions in the first DMA queue is determined, and that number is compared with a preset threshold. If the number of cached instructions is greater than the preset threshold, the second DMA instruction is sent to the low-latency DMA queue pool in the FPGA controller, which moves the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-latency data processing channel; the second data is low-latency data of a delay-sensitive application of the target server. If the number of cached instructions is less than or equal to the preset threshold, the second DMA instruction is added to the first DMA queue and sent to the second DMA queue of the FPGA controller, which moves the second data to the simulated NVMe disk through the first data processing channel. In this way, whether the second DMA instruction is sent to the low-latency DMA queue pool or to the second DMA queue is determined according to the low-latency tag in the NVMe instruction and the current number of cached instructions in the first DMA queue, and the DMA instructions in the low-latency DMA queue pool and in the second DMA queue use different data processing channels to carry application data, so low-latency data of delay-sensitive applications can be processed in time and carried to the corresponding simulated disk, which helps improve the running performance of delay-sensitive applications.
In one possible example, in generating the first DMA instruction according to the NVMe instruction information, the generating unit 605 is specifically configured to:
determining DMA instruction data parameters according to the NVMe instruction information;
determining a DMA instruction transmission mode;
and carrying out initialization configuration of the first DMA instruction according to the DMA instruction data parameters and the DMA instruction transmission mode to obtain the first DMA instruction.
In one possible example, the sending unit a 603 is specifically configured to:
receiving a first moving completion signal of the first data or a second moving completion signal of the second data sent by the FPGA controller, and generating a corresponding first completion instruction according to the first moving completion signal or a corresponding second completion instruction according to the second moving completion signal;
adding the first completion instruction or the second completion instruction to the first reclamation queue, wherein the first reclamation queue is used for caching the received first completion instruction or the second completion instruction;
sending the first completion instruction or the second completion instruction to the target server through the FPGA controller, where the first completion instruction is used by the target server to generate a corresponding first recovery success signal, and the second completion instruction is used by the target server to generate a corresponding second recovery success signal;
And receiving the first recovery success signal or the second recovery success signal forwarded by the target server through the FPGA controller, and updating a head pointer of the first recovery queue according to the first recovery success signal or the second recovery success signal.
In one possible example, the obtaining unit 606 is specifically configured to:
receiving an instruction fetching signal sent by the FPGA controller;
according to the instruction fetching signal, acquiring the NVMe request instruction corresponding to the instruction fetching signal from the second sending queue in the FPGA controller, and adding the NVMe request instruction to the first sending queue.
Fig. 7 is a schematic diagram of a processing device B of a nonvolatile memory standard NVMe instruction. As shown in fig. 7, the device is applied to a target server, where the target server is connected to an FPGA controller in an embedded virtualization system, the embedded virtualization system includes the FPGA controller and a driving module, the FPGA controller is respectively connected to the target server and the driving module, and the embedded virtualization system is configured to provide a simulated NVMe disk for the target server. The FPGA controller is configured to receive an NVMe request instruction sent by the target server and send the NVMe request instruction to the driving module. The driving module includes a first sending queue, a first DMA queue, a first reclamation queue, and the simulated NVMe disk; the FPGA controller includes a second sending queue, a second DMA queue, a second reclamation queue, and a low-latency DMA queue pool, where the low-latency DMA queue pool includes a plurality of low-latency DMA queues, the second DMA queue corresponds to a first data processing channel, and the low-latency DMA queue pool corresponds to a low-latency data processing channel. The FPGA controller is configured to move first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel, to move second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-latency data processing channel, or to move the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the first data processing channel. The processing device B 700 of the nonvolatile memory standard NVMe instruction may include: a determining unit B 701, an adding unit 702, a receiving unit 703, and a sending unit B 704, wherein,
The determining unit B701 is configured to determine whether the target application is a delay-sensitive application;
the adding unit 702 is configured to add a low-latency tag to the NVMe request instruction when the NVMe request instruction corresponding to the target application program is sent to the FPGA controller if the target application program is the latency-sensitive application program;
the receiving unit 703 is configured to receive a first completion instruction or a second completion instruction sent by the FPGA controller;
the sending unit B704 is configured to generate a first recovery success signal according to the first completion instruction or generate a second recovery success signal according to the second completion instruction, and send the first recovery success signal or the second recovery success signal to the driving module through the FPGA controller.
It can be seen that the processing device B for the nonvolatile memory standard NVMe instruction described in the embodiment of the present application determines whether the target application is a delay-sensitive application; if so, a low-latency tag is added to the NVMe request instruction when the NVMe request instruction corresponding to the target application is sent to the FPGA controller. The target server receives a first completion instruction or a second completion instruction sent by the FPGA controller, generates the corresponding first or second recovery success signal, and sends that signal to the driving module through the FPGA controller. The FPGA controller moves first data corresponding to the first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel, and moves second data corresponding to the second DMA instruction to the simulated NVMe disk through either the low-latency data processing channel or the first data processing channel. In this way, DMA instructions in the low-latency DMA queue pool and DMA instructions in the second DMA queue use different data processing channels to carry application data, so low-latency data of delay-sensitive applications can be processed in time and carried to the corresponding simulated disk, which helps improve the running performance of delay-sensitive applications.
In one possible example, in terms of adding a low-latency tag to the NVMe request instruction, the adding unit 702 is specifically configured to:
determining a reserved bit in the NVMe request instruction;
acquiring byte adjustment information of the low-delay tag;
and adjusting at least one byte in the reserved bits according to the byte adjustment information, so that the NVMe request instruction includes the low-latency tag.
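A minimal sketch of how such a reserved-bit tag might be written and read back in a 64-byte NVMe submission queue entry. The patent does not fix which reserved position carries the tag, so the dword index and bit chosen below are assumptions purely for illustration:

```c
#include <stdint.h>

/* A 64-byte NVMe submission queue entry viewed as 16 command dwords. */
struct nvme_sqe {
    uint32_t dw[16];
};

/* Assumed placement of the low-delay tag: bit 0 of command dword 2.
 * This choice is an illustration, not something the patent specifies;
 * a real design would pick a field reserved for the command in use. */
#define LL_TAG_DWORD 2
#define LL_TAG_BIT   (1u << 0)

/* Adjust the reserved field so the entry carries the low-delay tag. */
static void set_low_delay_tag(struct nvme_sqe *sqe)
{
    sqe->dw[LL_TAG_DWORD] |= LL_TAG_BIT;
}

/* Check for the tag when the driving module parses the instruction. */
static int has_low_delay_tag(const struct nvme_sqe *sqe)
{
    return (sqe->dw[LL_TAG_DWORD] & LL_TAG_BIT) != 0;
}
```

Because only a reserved bit is touched, an entry built this way still parses as an ordinary NVMe command by components that ignore the tag, which is what lets the driving module treat tagged and untagged instructions uniformly until dispatch.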
It should be noted that, for all relevant contents of each step involved in the above method embodiments, reference may be made to the functional description of the corresponding functional module, which is not repeated here.
The electronic device provided in this embodiment is configured to execute the above processing method of the nonvolatile memory standard NVMe instruction, and therefore can achieve the same effects as the method implementations above.
The embodiment of the present application further provides a computer storage medium, where the computer storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute some or all of the steps of any one of the methods described in the method embodiments above; the computer includes an electronic device.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any one of the methods described in the method embodiments above. The computer program product may be a software installation package, and the computer includes an electronic device.
The embodiment of the present application further provides a chip, which includes a processor and is operable to execute instructions; when the processor executes the instructions, the chip implements some or all of the steps of any one of the methods described in the method embodiments above. Optionally, the chip may further include a communication interface, which may be used to receive or transmit signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are expressed as a series of action combinations. However, those skilled in the art should understand that the present application is not limited by the described order of actions, as some steps may be performed in another order or simultaneously in accordance with the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present application.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections via some interfaces, devices, or units, and may be electrical or take other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units described above, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those of ordinary skill in the art will appreciate that all or a portion of the steps in the various methods of the above embodiments may be implemented by a program that instructs associated hardware, and the program may be stored in a computer readable memory, which may include: flash disk, read-only memory, random access memory, magnetic or optical disk, etc.
The embodiments of the present application have been described in detail above, and specific examples are used herein to illustrate the principles and implementations of the present application. The above description of the embodiments is provided only to help understand the method of the present application and its core ideas. Meanwhile, those skilled in the art may make modifications to the specific implementations and the application scope in accordance with the ideas of the present application. In view of the above, the contents of this specification should not be construed as limiting the present application.

Claims (10)

1. A processing method of a nonvolatile memory standard NVMe instruction, applied to a driving module in an embedded virtualization system, characterized in that the embedded virtualization system comprises an FPGA controller and the driving module, the FPGA controller is respectively connected with a target server and the driving module, the embedded virtualization system is used for providing a simulated NVMe disk for the target server, the FPGA controller is used for receiving an NVMe request instruction sent by the target server and sending the NVMe request instruction to the driving module, the driving module comprises a first sending queue, a first DMA queue, a first recycling queue and the simulated NVMe disk, the FPGA controller comprises a second sending queue, a second DMA queue, a second recycling queue and a low-delay DMA queue pool, the low-delay DMA queue pool comprises a plurality of low-delay DMA queues, the second DMA queue corresponds to a first data processing channel, the low-delay DMA queue pool corresponds to a low-delay data processing channel, and the second sending queue is used for caching the NVMe request instruction received by the FPGA controller for the driving module to acquire; the method comprises the following steps:
Analyzing each NVMe request instruction according to the queue sequence of a plurality of NVMe request instructions in the first sending queue to obtain NVMe instruction information;
determining whether the NVMe instruction information includes a low latency tag;
if the NVMe instruction information does not comprise the low-latency tag, generating a first DMA instruction according to the NVMe instruction information, and adding the first DMA instruction to the first DMA queue; the first DMA command is sent to the second DMA queue of the FPGA controller, wherein the second DMA queue is used for the FPGA controller to move first data corresponding to the first DMA command in the target server to the simulated NVMe disk through the first data processing channel, and the first data is data of a non-delay sensitive application program of the target server;
if the NVMe instruction information includes the low-latency tag, generating a second DMA instruction according to the NVMe instruction information, determining the current number of cache instructions in the first DMA queue, and judging whether the number of cache instructions is greater than a preset threshold;
if the number of the cache instructions is greater than the preset threshold, sending the second DMA instruction to the low-delay DMA queue pool in the FPGA controller, wherein the low-delay DMA queue pool is used for the FPGA controller to move second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-delay data processing channel, and the second data is low-delay data of a delay-sensitive application program of the target server;
If the number of the cache instructions is smaller than or equal to the preset threshold value, adding the second DMA instruction to the first DMA queue; and sending the second DMA instruction to the second DMA queue of the FPGA controller, where the second DMA queue is configured to move, by the FPGA controller through the first data processing channel, the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk.
2. The method of claim 1, wherein the generating a first DMA instruction from the NVMe instruction information comprises:
determining DMA instruction data parameters according to the NVMe instruction information;
determining a DMA instruction transmission mode;
and carrying out initialization configuration of the first DMA instruction according to the DMA instruction data parameters and the DMA instruction transmission mode to obtain the first DMA instruction.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
receiving a first moving completion signal of the first data or a second moving completion signal of the second data sent by the FPGA controller, and generating a corresponding first completion instruction according to the first moving completion signal or a corresponding second completion instruction according to the second moving completion signal;
Adding the first completion instruction or the second completion instruction to the first reclamation queue, wherein the first reclamation queue is used for caching the received first completion instruction or the second completion instruction;
the first completion instruction is sent to the target server or the second completion instruction is sent to the target server through the FPGA controller, wherein the first completion instruction is used for generating a corresponding first recovery success signal by the target server, and the second completion instruction is used for generating a corresponding second recovery success signal by the target server;
and receiving the first recovery success signal or the second recovery success signal forwarded by the target server through the FPGA controller, and updating a head pointer of the first recovery queue according to the first recovery success signal or the second recovery success signal.
4. The method according to claim 1, wherein the method further comprises:
receiving an instruction fetching signal sent by the FPGA controller;
and acquiring the NVMe request instruction corresponding to the instruction fetching signal from the second sending queue in the FPGA controller according to the instruction fetching signal.
5. The method of claim 2, wherein the NVMe instruction information includes at least one of: a namespace identifier, a data pointer, a metadata pointer, a low latency tag, and a command double byte value.
6. A processing method of a nonvolatile memory standard NVMe instruction, applied to a target server, characterized in that the target server is connected with an FPGA controller in an embedded virtualization system, the embedded virtualization system comprises the FPGA controller and a driving module, the FPGA controller is respectively connected with the target server and the driving module, the embedded virtualization system is used for providing a simulated NVMe disk for the target server, the FPGA controller is used for receiving an NVMe request instruction sent by the target server and sending the NVMe request instruction to the driving module, the driving module comprises a first sending queue, a first DMA queue, a first recycling queue and the simulated NVMe disk, the FPGA controller comprises a second sending queue, a second DMA queue, a second recycling queue and a low-delay DMA queue pool, the low-delay DMA queue pool comprises a plurality of low-delay DMA queues, the second DMA queue corresponds to a first data processing channel, the low-delay DMA queue pool corresponds to a low-delay data processing channel, and the FPGA controller is used for moving first data corresponding to a first DMA instruction in the target server to the simulated NVMe disk through the first data processing channel, moving second data corresponding to a second DMA instruction in the target server to the simulated NVMe disk through the low-delay data processing channel, or moving the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the first data processing channel; the method comprises the following steps:
Determining whether the target application is a delay-sensitive application;
if the target application program is the delay sensitive application program, adding a low-delay tag on the NVMe request instruction when the NVMe request instruction corresponding to the target application program is sent to the FPGA controller;
receiving a first completion instruction or a second completion instruction sent by the FPGA controller;
and generating a first recovery success signal according to the first completion instruction or generating a second recovery success signal according to the second completion instruction, and sending the first recovery success signal or the second recovery success signal to the driving module through the FPGA controller.
7. The method of claim 6, wherein the adding a low latency tag on the NVMe request instruction comprises:
determining a reserved bit in the NVMe request instruction;
acquiring byte adjustment information of the low-delay tag;
and adjusting at least one byte in the reserved bit according to the byte adjustment information so that the NVMe request instruction comprises a low-delay tag.
8. A processing device of a nonvolatile memory standard NVMe instruction, applied to a driving module in an embedded virtualization system, characterized in that the embedded virtualization system comprises an FPGA controller and the driving module, the FPGA controller is respectively connected with a target server and the driving module, the embedded virtualization system is used for providing a simulated NVMe disk for the target server, the FPGA controller is used for receiving the NVMe request instruction sent by the target server and sending the NVMe request instruction to the driving module, the driving module comprises a first sending queue, a first DMA queue, a first recycling queue and the simulated NVMe disk, the FPGA controller comprises a second sending queue, a second DMA queue, a second recycling queue and a low-delay DMA queue pool, the low-delay DMA queue pool comprises a plurality of low-delay DMA queues, the second DMA queue corresponds to a first data processing channel, the low-delay DMA queue pool corresponds to a low-delay data processing channel, and the second sending queue is used for caching the NVMe request instruction received by the FPGA controller for the driving module to acquire; the device comprises an analysis unit, a determination unit, a sending unit and a judging unit, wherein,
The analyzing unit is configured to analyze each NVMe request instruction according to a queue sequence of a plurality of NVMe request instructions in the first sending queue, so as to obtain NVMe instruction information;
the determining unit is configured to determine whether the NVMe instruction information includes a low latency tag;
the sending unit is configured to generate a first DMA instruction according to the NVMe instruction information if the NVMe instruction information does not include the low latency tag, and add the first DMA instruction to the first DMA queue; the first DMA command is sent to the second DMA queue of the FPGA controller, wherein the second DMA queue is used for the FPGA controller to move first data corresponding to the first DMA command in the target server to the simulated NVMe disk through the first data processing channel, and the first data is data of a non-delay sensitive application program of the target server;
the judging unit is configured to generate a second DMA instruction according to the NVMe instruction information if the NVMe request instruction includes the low latency tag, determine the number of cache instructions in the first DMA queue currently, and judge whether the number of cache instructions is greater than a preset threshold;
The sending unit is further configured to send the second DMA instruction to the low-latency DMA queue pool in the FPGA controller if the number of cache instructions is greater than the preset threshold, where the low-latency DMA queue pool is used for the FPGA controller to move second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the low-latency data processing channel, and the second data is low-latency data of a latency-sensitive application program of the target server;
the sending unit is further configured to add the second DMA instruction to the first DMA queue if the number of cache instructions is less than or equal to the preset threshold; and sending the second DMA instruction to the second DMA queue of the FPGA controller, wherein the second DMA queue is used for the FPGA controller to move the second data corresponding to the second DMA instruction in the target server to the simulated NVMe disk through the first data processing channel.
9. An electronic device comprising a processor, a memory, a communication interface, and one or more programs stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps in the method of any of claims 1-5 or 6-7.
10. A computer-readable storage medium, characterized in that it stores a computer program for electronic data exchange, wherein the computer program causes a computer to perform the method according to any one of claims 1-5 or 6-7.
CN202310174005.5A 2023-02-28 2023-02-28 Processing method and related device of nonvolatile memory standard NVMe instruction Active CN115904259B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310174005.5A CN115904259B (en) 2023-02-28 2023-02-28 Processing method and related device of nonvolatile memory standard NVMe instruction


Publications (2)

Publication Number Publication Date
CN115904259A CN115904259A (en) 2023-04-04
CN115904259B true CN115904259B (en) 2023-05-16

Family

ID=85771856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310174005.5A Active CN115904259B (en) 2023-02-28 2023-02-28 Processing method and related device of nonvolatile memory standard NVMe instruction

Country Status (1)

Country Link
CN (1) CN115904259B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117040963B (en) * 2023-10-09 2023-12-26 成都亿佰特电子科技有限公司 Method and system for quick communication of distributed IO master and slave

Citations (2)

Publication number Priority date Publication date Assignee Title
WO2017114283A1 (en) * 2015-12-31 2017-07-06 华为技术有限公司 Method and apparatus for processing read/write request in physical host
CN111381926A (en) * 2018-12-27 2020-07-07 中兴通讯股份有限公司 Virtualization method and device

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US10423568B2 (en) * 2015-12-21 2019-09-24 Microsemi Solutions (U.S.), Inc. Apparatus and method for transferring data and commands in a memory management environment
US11720283B2 (en) * 2017-12-19 2023-08-08 Western Digital Technologies, Inc. Coherent access to persistent memory region range
US20200319812A1 (en) * 2020-06-03 2020-10-08 Intel Corporation Intermediary for storage command transfers

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
WO2017114283A1 (en) * 2015-12-31 2017-07-06 华为技术有限公司 Method and apparatus for processing read/write request in physical host
CN111381926A (en) * 2018-12-27 2020-07-07 中兴通讯股份有限公司 Virtualization method and device

Also Published As

Publication number Publication date
CN115904259A (en) 2023-04-04

Similar Documents

Publication Publication Date Title
US9602443B2 (en) Header replication in accelerated TCP (transport control protocol) stack processing
CN110647480B (en) Data processing method, remote direct access network card and equipment
US20160117277A1 (en) Collaborative hardware interaction by multiple entities using a shared queue
KR102427550B1 (en) QoS-AWARE IO MANAGEMENT FOR PCIe STORAGE SYSTEM WITH RECONFIGURABLE MULTI-PORTS
US10909655B2 (en) Direct memory access for graphics processing unit packet processing
CN113225307B (en) Optimization method, system and terminal for pre-reading descriptors in uninstalling engine network card
CN112650558B (en) Data processing method and device, readable medium and electronic equipment
CN115904259B (en) Processing method and related device of nonvolatile memory standard NVMe instruction
CN112769905B (en) NUMA (non uniform memory access) architecture based high-performance network card performance optimization method under Feiteng platform
US11726930B2 (en) Systems and methods for message tunneling
CN115499505B (en) USB network card and communication method
CN101877666A (en) Method and device for receiving multi-application program message based on zero copy mode
CN116225992A (en) NVMe verification platform and method supporting virtualized simulation equipment
US8423680B2 (en) System, method, and computer program product for inserting a gap in information sent from a drive to a host device
CN114706531A (en) Data processing method, device, chip, equipment and medium
CN112306693A (en) Data packet processing method and device
CN108874685B (en) Data processing method of solid state disk and solid state disk
CN112328519A (en) PCIE equipment, and SR-IOV-based data packet ordered transmission method and system
CN115176428A (en) Command and response descriptor handling in a software and hardware interworking system
CN116991593B (en) Operation instruction processing method, device, equipment and storage medium
CN116684506B (en) Data processing method, system, electronic device and computer readable storage medium
US20230393996A1 (en) Systems and methods for message tunneling
US20040111537A1 (en) Method, system, and program for processing operations
WO2021193542A1 (en) Communication system
CN104158834A (en) Method and device for processing voice data

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant