CN111459417B - Lock-free transmission method and system for NVMeoF storage network - Google Patents


Info

Publication number
CN111459417B
Authority
CN
China
Prior art keywords
nvmeof
queue
linked list
head node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010338868.8A
Other languages
Chinese (zh)
Other versions
CN111459417A (en
Inventor
李琼
宋振龙
赵曦
谢徐超
谢旻
袁远
黎铁军
肖立权
魏登萍
任静
李世杰
陈浩稳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202010338868.8A priority Critical patent/CN111459417B/en
Publication of CN111459417A publication Critical patent/CN111459417A/en
Application granted granted Critical
Publication of CN111459417B publication Critical patent/CN111459417B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659Command handling arrangements, e.g. command buffers, queues, command scheduling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application discloses a lock-free transmission method and system for an NVMeoF storage network. The host side creates NVMeoF queues equal in number to the CPU cores and allocates a blank memory region in array format for each NVMeoF queue. When a command packet arrives, it is placed in the array-format blank memory corresponding to its NVMeoF queue and buffered through an independent linked list; each NVMeoF queue is polled, and the buffered command packets are sent to the target end over the network. By adding a linked-list buffer to each NVMeoF queue and polling the multiple linked lists, the application relieves the strong contention of multiple NVMeoF queues for a single send-list lock and resolves the I/O bottleneck under high I/O pressure.

Description

Lock-free transmission method and system for NVMeoF storage network
Technical Field
The application relates to remote storage and storage network technology, and in particular to a lock-free transmission method and system for an NVMeoF storage network.
Background
The NVMe protocol is designed for new high-speed non-volatile memories (such as flash and 3D XPoint). The efficient combination of the PCIe interface and the NVMe protocol reduces I/O protocol-stack overhead and storage access latency while improving I/O throughput and bandwidth, and has been widely adopted in data centers.
However, limited by the scalability of the PCIe bus, the NVMe protocol is not suitable for large-scale remote storage access across a network. The NVMeoF storage network protocol, which extends NVMe over RDMA networks, was therefore created; it lets a host communicate with remote NVMe devices in a storage system through various network fabrics, provides an effective technical approach to building high-performance, easily scalable network storage for data centers, and represents the future development trend.
NVMeoF (NVMe over Fabrics) can be implemented over different link-layer and physical-layer protocols; carrier networks include InfiniBand, RoCE, iWARP and Fibre Channel, as well as RDMA networks based on custom protocols, such as the custom high-speed interconnect used in the Tianhe supercomputer.
The I/O command transfer flow of RDMA-based NVMeoF network storage is shown in FIG. 1. The command packet (CMD Capsule) is the general name for the I/O request command packet in FIG. 1; in addition to the basic command ID, opcode, buffer address and command parameters, it may carry an optional additional SGL or command data. The response packet (RSP Capsule) is the general name for the I/O response packet in FIG. 1; in addition to basic command parameters, the SQ (submission queue) head pointer, command status and command ID, it may carry optional command data. An NVMeoF queue (nvme_fabrics_queue) encapsulates information such as the queue ID, queue size and NVMe command messages; the send list (send_list) is a linked list used to store pointers to command packets or response packets.
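For orientation, the capsule fields listed above can be modeled as plain records. This is a hypothetical Python sketch: field names follow the description in this document, not the exact NVMe-oF wire format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CmdCapsule:
    """I/O request command packet (CMD Capsule), per the fields named above."""
    command_id: int               # basic command ID
    opcode: int                   # operation code
    buffer_addr: int              # cache/buffer address
    params: dict = field(default_factory=dict)  # command parameters
    sgl: Optional[bytes] = None   # optional additional SGL descriptor
    data: Optional[bytes] = None  # optional in-capsule command data


@dataclass
class RspCapsule:
    """I/O response packet (RSP Capsule), per the fields named above."""
    command_id: int               # command ID being answered
    sq_head: int                  # submission-queue head pointer
    status: int                   # command status
    params: dict = field(default_factory=dict)  # command parameters
    data: Optional[bytes] = None  # optional command data
```

The optional fields default to `None`, mirroring the "optional additional SGL or command data" wording in the description.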
The number of NVMeoF queues created is based on the number of cores of the current server CPU; with a multi-core CPU, the traditional connection model is a "multiple producers, single consumer" model, generally implemented with a single send linked list, as shown in FIG. 2. I/O command packets sent by the multiple NVMeoF queues on the host (Host) side must all go through the interconnect network interface card (network card for short), so the send list must be locked; since there is only one send list, every request must take the list lock to guarantee mutual exclusion when entering or leaving the list. Each command packet can be appended to the tail of the send list only after the lock is acquired, where it waits for the network card to transmit it. Similarly, I/O response packets generated by target-side queues must acquire the lock before being inserted into the send list. With many cores and processes, contention for the list lock is very frequent and seriously degrades I/O request processing efficiency, so the rate at which the host sends requests to the target and the target sends responses back to the host is too slow, and the performance of the underlying high-speed NVMe storage devices cannot be fully exploited. Lock contention is especially acute under high I/O pressure, creating a serious I/O bottleneck.
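The "multiple producers, single consumer" bottleneck can be reproduced in miniature: every producer thread below must take the same lock to append to the one shared send list, so all enqueues serialize on that lock. This is an illustrative Python sketch of the contended design, not the kernel implementation.

```python
import threading
from collections import deque

send_list = deque()                # the single shared send list
send_list_lock = threading.Lock()  # every NVMeoF queue contends for this one lock


def producer(queue_id, n_packets):
    """One simulated NVMeoF queue appending command packets to the shared list."""
    for i in range(n_packets):
        pkt = (queue_id, i)
        with send_list_lock:       # mutual exclusion on entering the list
            send_list.append(pkt)


# One producer thread per simulated NVMeoF queue (8 "cores")
threads = [threading.Thread(target=producer, args=(q, 100)) for q in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The single consumer (the "network card") drains the list under the same lock
sent = []
while True:
    with send_list_lock:           # mutual exclusion on leaving the list
        if not send_list:
            break
        sent.append(send_list.popleft())
```

Every `append` and `popleft` holds `send_list_lock`; with more producer threads the fraction of time spent waiting on the lock grows, which is exactly the contention the patent removes.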
Disclosure of Invention
The technical problem to be solved by the application is as follows: in view of the prior-art problem that strong lock contention causes an I/O bottleneck under high I/O pressure, the application provides a lock-free transmission method and system for an NVMeoF storage network.
In order to solve the above technical problem, the application adopts the following technical scheme:
A lock-free transmission method for an NVMeoF storage network, comprising the following implementation steps:
1) The host side creates NVMeoF queues equal in number to the CPU cores, and allocates a blank memory region in array format for each NVMeoF queue;
2) When a command packet arrives, it is placed in the array-format blank memory corresponding to its NVMeoF queue and buffered through an independent linked list; each NVMeoF queue is polled, and the buffered command packets are sent to the target end over the network.
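Steps 1) and 2) amount to: one queue per CPU core, each backed by its own pre-allocated array of packet slots and its own linked-list buffer. A hypothetical Python sketch follows; the slot count and slot size are illustrative values, not taken from the patent.

```python
import os

SLOTS_PER_QUEUE = 128   # illustrative queue depth (assumption)
SLOT_SIZE = 64          # illustrative capsule slot size in bytes (assumption)


def create_nvmeof_queues(num_cores=None):
    """Create one NVMeoF queue per CPU core, each with a blank array-format
    memory region divided into fixed-size slots and an independent send list."""
    num_cores = num_cores or os.cpu_count()
    queues = []
    for qid in range(num_cores):
        queues.append({
            "qid": qid,
            # step 1): a section of blank memory in array format
            "slots": [bytearray(SLOT_SIZE) for _ in range(SLOTS_PER_QUEUE)],
            # step 2): the queue's own linked-list buffer of pending packets
            "send_list": [],
        })
    return queues


queues = create_nvmeof_queues(4)
```

Because every queue owns its slots and its send list outright, no structure is shared between cores, which is what makes the later per-queue polling lock-free.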
Optionally, buffering through an independent linked list in step 2) specifically means buffering the command packet held in the array-format blank memory of the NVMeoF queue at the tail of the send linked list corresponding to that NVMeoF queue; sending the buffered command packets to the target end over the network specifically means sending the command packet at the head of the send linked list to the target end over the network.
Optionally, the host side further performs the step of managing the array-format blank memory of each NVMeoF queue with a linked list: 1. after the array-format blank memory of each NVMeoF queue is allocated, it is added to a management linked list; 2. when a command packet arrives, the head node of the management linked list is taken out and deleted from the management linked list; if the management linked list is empty because the I/O pressure is too high, a temporary memory region is allocated to store the command packet; 3. the head node taken from the management linked list is assigned, and the message content of the command packet is stored at the address the head node points to; 4. the assigned head node, which is now on no management linked list, is added to the tail of the send linked list corresponding to the NVMeoF queue; 5. when the network card polls an NVMeoF queue, it takes the head node of the send linked list in that queue and deletes that node from the send linked list; 6. the network card transmits the message content stored at the address the head node points to; 7. after transmission completes, the memory the head node points to is cleared, and the head node is re-added to the tail of the management linked list so that the allocated memory can be reused; if the address in the head node is a temporarily allocated memory address, it is released immediately after the network card finishes transmitting.
Optionally, step 1) further includes a step in which the target end initializes its NVMeoF queues: it creates the same number of NVMeoF queues as the host side and allocates a blank memory region in array format for each NVMeoF queue.
Optionally, step 2) further includes a step in which the target end sends a response packet after receiving a command packet: when the response packet arrives, it is placed in the array-format blank memory corresponding to its NVMeoF queue and buffered through an independent linked list; each NVMeoF queue is polled, and the buffered response packets are sent to the host end over the network.
Optionally, when the target end sends a response packet after receiving a command packet, buffering through an independent linked list specifically means buffering the response packet held in the array-format blank memory of the NVMeoF queue at the tail of the send linked list corresponding to that NVMeoF queue; sending the buffered response packets to the host end over the network specifically means sending the response packet at the head of the send linked list to the host end over the network.
Optionally, the target end further performs the step of managing the array-format blank memory of each NVMeoF queue with a linked list: 1. after the array-format blank memory of each NVMeoF queue is allocated, it is added to a management linked list; 2. when a response packet arrives, the head node of the management linked list is taken out and deleted from the management linked list; if the management linked list is empty because the I/O pressure is too high, a temporary memory region is allocated to store the response packet; 3. the head node taken from the management linked list is assigned, and the message content of the response packet is stored at the address the head node points to; 4. the assigned head node, which is now on no management linked list, is added to the tail of the send linked list corresponding to the NVMeoF queue; 5. when the network card polls an NVMeoF queue, it takes the head node of the send linked list in that queue and deletes that node from the send linked list; 6. the network card transmits the message content stored at the address the head node points to; 7. after transmission completes, the memory the head node points to is cleared, and the head node is re-added to the tail of the management linked list so that the allocated memory can be reused; if the address in the head node is a temporarily allocated memory address, it is released immediately after the network card finishes transmitting.
In addition, the application further provides a lock-free transmission system for an NVMeoF storage network, comprising a host end and a target end, wherein the host end is programmed or configured to execute the steps of the aforementioned lock-free transmission method for an NVMeoF storage network, or the target end is programmed or configured to execute the steps of the aforementioned lock-free transmission method for an NVMeoF storage network.
The application further provides a lock-free transmission system for an NVMeoF storage network, comprising a computing device, wherein the computing device is programmed or configured to execute the steps of the aforementioned lock-free transmission method for an NVMeoF storage network, or a memory of the computing device stores a computer program programmed or configured to execute the aforementioned lock-free transmission method for an NVMeoF storage network.
Furthermore, the application provides a computer-readable storage medium storing a computer program programmed or configured to execute the aforementioned lock-free transmission method for an NVMeoF storage network.
Compared with the prior art, the application has the following advantages: the host side creates NVMeoF queues equal in number to the CPU cores and allocates a blank memory region in array format for each NVMeoF queue; when a command packet arrives, it is placed in the array-format blank memory corresponding to its NVMeoF queue and buffered through an independent linked list; the network card polls each NVMeoF queue and sends the buffered command packets to the target end over the network. By adding a linked-list buffer to each NVMeoF queue and polling the multiple linked lists, the application relieves the strong contention of multiple NVMeoF queues for a single send-list lock and resolves the I/O bottleneck under high I/O pressure.
Drawings
FIG. 1 is a schematic diagram of the conventional NVMeoF I/O command processing flow.
FIG. 2 shows the conventional I/O message transmission scheme of NVMeoF.
FIG. 3 is a schematic diagram of the basic principle of the method according to an embodiment of the application.
FIG. 4 shows the "single producer, single consumer" model of NVMeoF I/O messaging according to an embodiment of the application.
FIG. 5 is a flow chart of managing the array with a linked list according to an embodiment of the application.
FIG. 6 shows the improved NVMeoF I/O messaging mechanism according to an embodiment of the application.
Detailed Description
As shown in FIG. 3 and FIG. 4, the implementation steps of the lock-free transmission method for an NVMeoF storage network in this embodiment include:
1) The host side creates NVMeoF queues equal in number to the CPU cores, and allocates a blank memory region in array format for each NVMeoF queue;
2) When a command packet arrives, it is placed in the array-format blank memory corresponding to its NVMeoF queue and buffered through an independent linked list; the network card polls each NVMeoF queue and sends the buffered command packets to the target end over the network.
As shown in FIG. 3 and FIG. 4, in the lock-free transmission method of this embodiment, when the host (Host) and the target (Target) establish a connection, the host creates NVMeoF queues according to the number of CPU cores, and the target creates a corresponding number of queues according to the number of NVMeoF queues on the host side. Each queue allocates a blank memory region in array format when it is created; when a command packet or response packet arrives at a queue, the message is stored in that queue's own memory. The network card polls each queue and sends the buffered messages one by one, so each connection follows a "single producer, single consumer" model. I/O commands never contend for a send-list lock, avoiding the inefficiency caused by intense lock contention.
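With one send list per queue, each (queue, network card) pairing is a single-producer/single-consumer channel, so no shared lock is needed. The round-robin poll can be sketched as follows (illustrative Python; a deque per queue stands in for the per-queue linked list):

```python
from collections import deque

NUM_QUEUES = 4
# One independent send list per NVMeoF queue: single producer (the queue's
# CPU core) and single consumer (the polling network card) -- no shared lock.
per_queue_lists = [deque() for _ in range(NUM_QUEUES)]


def enqueue(qid, packet):
    """Producer side: each core appends only to its own queue's send list."""
    per_queue_lists[qid].append(packet)


def poll_and_send():
    """Consumer side: the network card round-robins over all queues and
    transmits at most one buffered packet from each queue per pass."""
    sent = []
    for q in per_queue_lists:
        if q:
            sent.append(q.popleft())
    return sent


for qid in range(NUM_QUEUES):
    enqueue(qid, f"cmd-{qid}")
first_pass = poll_and_send()
```

Contrast with the single shared send list of FIG. 2: here `enqueue` and `poll_and_send` never touch the same list from two writers, which is the essence of the "single producer, single consumer" model.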
As an optional implementation, this embodiment organizes the command packets to be sent by each NVMeoF queue with a send linked list, improving the polling efficiency of the network card. In this embodiment, buffering through an independent linked list in step 2) specifically means buffering the command packet held in the array-format blank memory of the NVMeoF queue at the tail of the send linked list corresponding to that NVMeoF queue; sending the buffered command packets to the target end over the network specifically means sending the command packet at the head of the send linked list to the target end over the network.
When the I/O pressure from upper layers is too high, the host side may receive messages faster than the network card can send them, causing memory overflow. To solve this problem, as shown in FIG. 5, the host side further performs the step of managing the array-format blank memory of each NVMeoF queue with a linked list: 1. after the array-format blank memory of each NVMeoF queue is allocated, it is added to a management linked list; 2. when a command packet arrives, the head node of the management linked list is taken out and deleted from the management linked list; if the management linked list is empty because the I/O pressure is too high, a temporary memory region is allocated to store the command packet; 3. the head node taken from the management linked list is assigned, and the message content of the command packet is stored at the address the head node points to; 4. the assigned head node, which is now on no management linked list, is added to the tail of the send linked list corresponding to the NVMeoF queue; 5. when the network card polls an NVMeoF queue, it takes the head node of the send linked list in that queue and deletes that node from the send linked list; 6. the network card transmits the message content stored at the address the head node points to; 7. after transmission completes, the memory the head node points to is cleared, and the head node is re-added to the tail of the management linked list so that the allocated memory can be reused; if the address in the head node is a temporarily allocated memory address, it is released immediately after the network card finishes transmitting.
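The seven management steps above amount to recycling fixed slots through a free list that feeds a send list, with temporary allocation as overflow relief. Below is a simplified Python model of that lifecycle; the slot count, slot size and the `temporary` flag are illustrative, not part of the patent's text.

```python
from collections import deque


class QueueMemory:
    """Per-queue slot manager: pre-allocated slots cycle between a management
    (free) linked list and a send linked list; temporary buffers relieve overflow."""

    def __init__(self, num_slots=4, slot_size=64):
        # Step 1: all pre-allocated array slots start on the management list
        self.free_list = deque(bytearray(slot_size) for _ in range(num_slots))
        self.send_list = deque()
        self.slot_size = slot_size

    def on_packet(self, payload: bytes):
        # Step 2: take the head node off the management list, or allocate a
        # temporary buffer if high I/O pressure has emptied it
        if self.free_list:
            slot, temporary = self.free_list.popleft(), False
        else:
            slot, temporary = bytearray(self.slot_size), True
        # Step 3: store the message content at the node's memory
        slot[:len(payload)] = payload
        # Step 4: append the filled node to the tail of the send list
        self.send_list.append((slot, temporary))

    def poll_send(self):
        # Steps 5-6: the network card takes the head node and transmits it
        if not self.send_list:
            return None
        slot, temporary = self.send_list.popleft()
        sent = bytes(slot).rstrip(b"\x00")
        # Step 7: clear and recycle pre-allocated slots; temporary buffers
        # are simply released (dropped) after transmission
        if not temporary:
            slot[:] = bytes(len(slot))
            self.free_list.append(slot)
        return sent
```

With `num_slots=1`, a second packet arriving before the first is sent exercises the temporary-allocation path, and after both transmissions the free list is back to its original size, mirroring the "repeatedly available" guarantee of step 7.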
As an alternative implementation, the criterion for judging that the I/O pressure is too high may be chosen as required, and any other feasible I/O pressure metric may likewise be selected as needed.
Referring to FIG. 4, step 1) in this embodiment further includes the step in which the target end initializes its NVMeoF queues: it creates the same number of NVMeoF queues as the host side and allocates a blank memory region in array format for each NVMeoF queue.
In this embodiment, step 2) further includes the step in which the target end sends a response packet after receiving a command packet: when the response packet arrives, it is placed in the array-format blank memory corresponding to its NVMeoF queue and buffered through an independent linked list; each NVMeoF queue is polled, and the buffered response packets are sent to the host end over the network.
In this embodiment, when the target end sends a response packet after receiving a command packet, buffering through an independent linked list specifically means buffering the response packet held in the array-format blank memory of the NVMeoF queue at the tail of the send linked list corresponding to that NVMeoF queue; sending the buffered response packets to the host end over the network specifically means sending the response packet at the head of the send linked list to the host end over the network.
When the I/O pressure from upper layers is too high, the target side may receive messages faster than the network card can send them, causing memory overflow. To solve this problem, referring to FIG. 5, the target end in this embodiment further performs the step of managing the array-format blank memory of each NVMeoF queue with a linked list: 1. after the array-format blank memory of each NVMeoF queue is allocated, it is added to a management linked list; 2. when a response packet arrives, the head node of the management linked list is taken out and deleted from the management linked list; if the management linked list is empty because the I/O pressure is too high, a temporary memory region is allocated to store the response packet; 3. the head node taken from the management linked list is assigned, and the message content of the response packet is stored at the address the head node points to; 4. the assigned head node, which is now on no management linked list, is added to the tail of the send linked list corresponding to the NVMeoF queue; 5. when the network card polls an NVMeoF queue, it takes the head node of the send linked list in that queue and deletes that node from the send linked list; 6. the network card transmits the message content stored at the address the head node points to; 7. after transmission completes, the memory the head node points to is cleared, and the head node is re-added to the tail of the management linked list so that the allocated memory can be reused; if the address in the head node is a temporarily allocated memory address, it is released immediately after the network card finishes transmitting.
In summary, the network card "single producer, single consumer" model finally obtained by the lock-free transmission method of this embodiment is shown in FIG. 6. As FIG. 6 shows, each NVMeoF queue corresponds to its own send linked list, so multiple NVMeoF queues never compete for one send linked list, realizing the "single producer, single consumer" model for the network card. Moreover, when memory runs short, both the host end and the target end can temporarily allocate a region of memory and release it after use, effectively solving the memory overflow problem. By adding a linked-list buffer to each NVMeoF queue and polling the multiple linked lists, the method relieves the strong contention of multiple NVMeoF queues for a single send-list lock and resolves the I/O bottleneck under high I/O pressure.
In addition, this embodiment provides a lock-free transmission system for an NVMeoF storage network, comprising a host end and a target end, wherein the host end is programmed or configured to execute the steps of the aforementioned lock-free transmission method for an NVMeoF storage network, or the target end is programmed or configured to execute the steps of the aforementioned lock-free transmission method for an NVMeoF storage network.
In addition, this embodiment provides a lock-free transmission system for an NVMeoF storage network, comprising a computing device programmed or configured to execute the steps of the aforementioned lock-free transmission method for an NVMeoF storage network.
In addition, this embodiment provides a lock-free transmission system for an NVMeoF storage network, comprising a computing device whose memory stores a computer program programmed or configured to execute the aforementioned lock-free transmission method for an NVMeoF storage network.
Furthermore, this embodiment provides a computer-readable storage medium storing a computer program programmed or configured to perform the aforementioned lock-free transmission method for an NVMeoF storage network.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM and optical storage) containing computer-usable program code. The application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the application; it should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams. These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above description is only a preferred embodiment of the application, and the protection scope of the application is not limited to the above examples; all technical solutions falling under the concept of the application belong to its protection scope. It should be noted that modifications and adaptations that do not depart from the principles of the application will occur to those skilled in the art, and such modifications and adaptations are also within the protection scope of the application.

Claims (9)

1. A lock-free transmission method for an NVMeoF storage network, characterized by comprising the following implementation steps:
1) the host side creates NVMeoF queues equal in number to the CPU cores, and applies for a section of blank memory in array format for each NVMeoF queue;
2) when a command packet arrives, the command packet is added into the array-format blank memory corresponding to the NVMeoF queue and is cached through an independent linked list; each NVMeoF queue is polled, and the cached command packets are sent to the target end through the network;
the host side further comprises a step of managing the array-format blank memory of each NVMeoF queue with a linked list: 1. after the array-format blank memory of each NVMeoF queue has been applied for, it is added to a management linked list; 2. when a command packet arrives, the head node of the management linked list is taken out and then deleted from the management linked list; when I/O pressure is excessive and the management linked list is empty, a temporary memory space is applied for to store the command packet; 3. the head node taken from the management linked list is assigned a value, and the message content of the command packet is stored at the address pointed to by the head node; 4. the assigned head node, now in no management linked list, is added to the tail of the transmission linked list corresponding to the NVMeoF queue; 5. when the network card polls a given NVMeoF queue, the head node is taken out of the transmission linked list of that NVMeoF queue and, once taken out, is deleted from the transmission linked list; 6. the network card sends the message content stored at the address pointed to by the head node over the network; 7. after transmission is completed, the address pointed to by the head node is cleared and re-added to the tail of the management linked list, ensuring that the applied-for memory is reusable; if the address in the head node is a temporarily applied-for memory address, that address is released immediately after the network card completes transmission.
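The per-queue bookkeeping described in claim 1 (sub-steps 1-7) can be sketched as follows. This is an illustrative model only, not the patented implementation: all identifiers (`queue_t`, `queue_submit`, `queue_poll`, the pool size of 4, and the stand-in for the network card send) are hypothetical. It pre-allocates an array of nodes onto a management (free) linked list, moves a node to the transmission linked list while a packet is in flight, falls back to a temporary allocation when the pool is exhausted, and recycles or frees the node after the send completes:

```c
/* Hypothetical sketch of claim 1's linked-list memory management;
   the patent publishes no source code, so every name here is invented. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POOL_SIZE 4
#define MSG_LEN   64

typedef struct node {
    char         msg[MSG_LEN]; /* "address pointed to by the head node" */
    int          is_temp;      /* 1 if allocated outside the array pool */
    struct node *next;
} node_t;

typedef struct { node_t *head, *tail; } list_t; /* singly linked list */

static void list_push_tail(list_t *l, node_t *n) {
    n->next = NULL;
    if (l->tail) l->tail->next = n; else l->head = n;
    l->tail = n;
}

static node_t *list_pop_head(list_t *l) {
    node_t *n = l->head;
    if (!n) return NULL;
    l->head = n->next;
    if (!l->head) l->tail = NULL;
    return n;
}

typedef struct {
    node_t pool[POOL_SIZE];  /* blank memory in array format (step 1)   */
    list_t mgmt;             /* management linked list of free nodes    */
    list_t send;             /* transmission linked list                */
} queue_t;

static void queue_init(queue_t *q) {
    memset(q, 0, sizeof *q);
    for (int i = 0; i < POOL_SIZE; i++)        /* sub-step 1: add every */
        list_push_tail(&q->mgmt, &q->pool[i]); /* array slot to mgmt    */
}

/* sub-steps 2-4: pop a free node (or a temporary one under I/O
   pressure), copy the packet into it, append it to the send list */
static void queue_submit(queue_t *q, const char *pkt) {
    node_t *n = list_pop_head(&q->mgmt);
    if (!n) {                                  /* management list empty */
        n = calloc(1, sizeof *n);
        n->is_temp = 1;
    }
    strncpy(n->msg, pkt, MSG_LEN - 1);
    list_push_tail(&q->send, n);
}

/* sub-steps 5-7: take the send-list head, "transmit" it, then clear
   and recycle the node (or free a temporary one) */
static int queue_poll(queue_t *q, char *out) {
    node_t *n = list_pop_head(&q->send);
    if (!n) return 0;
    strncpy(out, n->msg, MSG_LEN);             /* stand-in for NIC send */
    if (n->is_temp) { free(n); }
    else { memset(n->msg, 0, MSG_LEN); list_push_tail(&q->mgmt, n); }
    return 1;
}
```

Because each NVMeoF queue is owned by a single CPU core (one submitter, one poller), the two linked lists of a given queue are never touched concurrently, which is what lets the scheme dispense with locks.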
2. The lock-free transmission method for an NVMeoF storage network according to claim 1, wherein in step 2), caching through an independent linked list specifically means that a command packet in the array-format blank memory corresponding to an NVMeoF queue is cached at the tail of the transmission linked list corresponding to that NVMeoF queue; sending the cached command packet to the target end through the network specifically means that the command packet at the head of the transmission linked list is sent to the target end through the network.
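The per-core queue creation of step 1) and the round-robin polling of step 2) can likewise be sketched under the same caveat (illustrative only; the names and the flat ring buffer standing in for the array-format memory are hypothetical, chosen to make the per-core ownership visible):

```c
/* Hypothetical sketch: one queue per CPU core, drained by a poller. */
#include <assert.h>
#include <stdlib.h>

#define NCORES 4
#define QDEPTH 8

typedef struct {
    int slots[QDEPTH];   /* stand-in for the array-format blank memory */
    int head, tail;      /* single producer (the core), single consumer */
} core_queue_t;

/* step 1: create as many queues as CPU cores, each with its own memory */
static core_queue_t *create_queues(int ncores) {
    return calloc((size_t)ncores, sizeof(core_queue_t));
}

/* the owning core appends to its own queue only: no lock needed */
static int enqueue(core_queue_t *q, int cmd) {
    if (q->tail - q->head == QDEPTH) return 0;  /* queue full */
    q->slots[q->tail % QDEPTH] = cmd;
    q->tail++;
    return 1;
}

/* step 2: poll each queue in turn, draining one entry per pass */
static int poll_once(core_queue_t *qs, int ncores, int *out) {
    int n = 0;
    for (int c = 0; c < ncores; c++) {
        if (qs[c].head < qs[c].tail)
            out[n++] = qs[c].slots[qs[c].head++ % QDEPTH];
    }
    return n;
}
```

Since core `c` only ever writes `qs[c]` and the poller only ever advances `head`, no lock protects the shared array — the same rationale the claims give for creating one NVMeoF queue per CPU core.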
3. The lock-free transmission method for an NVMeoF storage network according to claim 1, wherein step 1) further comprises a step of initializing NVMeoF queues at the target end: creating the same NVMeoF queues as those at the host end, and applying for a section of blank memory in array format for each NVMeoF queue.
4. The lock-free transmission method for an NVMeoF storage network according to claim 3, wherein step 2) further comprises a step of the target end sending a response packet after receiving the command packet: when the response packet arrives, the response packet is added into the array-format blank memory corresponding to the NVMeoF queue and is cached through an independent linked list; each NVMeoF queue is polled, and the cached response packets are sent to the host end through the network.
5. The lock-free transmission method for an NVMeoF storage network according to claim 4, wherein, when the target end sends a response packet after receiving the command packet, caching through an independent linked list specifically means that the response packet in the array-format blank memory corresponding to the NVMeoF queue is cached at the tail of the transmission linked list corresponding to that NVMeoF queue; sending the cached response packet to the host end through the network specifically means that the response packet at the head of the transmission linked list is sent to the host end through the network.
6. The lock-free transmission method for an NVMeoF storage network according to claim 5, wherein the target end further comprises a step of managing the array-format blank memory of each NVMeoF queue with a linked list: 1. after the array-format blank memory of each NVMeoF queue has been applied for, it is added to a management linked list; 2. when a response packet arrives, the head node of the management linked list is taken out and then deleted from the management linked list; when I/O pressure is excessive and the management linked list is empty, a temporary memory space is applied for to store the response packet; 3. the head node taken from the management linked list is assigned a value, and the message content of the response packet is stored at the address pointed to by the head node; 4. the assigned head node, now in no management linked list, is added to the tail of the transmission linked list corresponding to the NVMeoF queue; 5. when the network card polls a given NVMeoF queue, the head node is taken out of the transmission linked list of that NVMeoF queue and, once taken out, is deleted from the transmission linked list; 6. the network card sends the message content stored at the address pointed to by the head node over the network; 7. after transmission is completed, the address pointed to by the head node is cleared and re-added to the tail of the management linked list, ensuring that the applied-for memory is reusable; if the address in the head node is a temporarily applied-for memory address, that address is released immediately after the network card completes transmission.
7. A lock-free transmission system for an NVMeoF storage network, comprising a host side and a target side, wherein the host side is programmed or configured to perform the steps of the lock-free transmission method for an NVMeoF storage network according to claim 1 or 2, or the target side is programmed or configured to perform the steps of the lock-free transmission method for an NVMeoF storage network according to any one of claims 3-6.
8. A lock-free transmission system for an NVMeoF storage network, comprising a computing device, wherein the computing device is programmed or configured to perform the steps of the lock-free transmission method for an NVMeoF storage network according to any one of claims 1-6, or a memory of the computing device stores a computer program programmed or configured to perform the lock-free transmission method for an NVMeoF storage network according to any one of claims 1-6.
9. A computer-readable storage medium having stored thereon a computer program programmed or configured to perform the lock-free transmission method for an NVMeoF storage network according to any one of claims 1-6.
CN202010338868.8A 2020-04-26 2020-04-26 Non-lock transmission method and system for NVMeoF storage network Active CN111459417B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010338868.8A CN111459417B (en) 2020-04-26 2020-04-26 Non-lock transmission method and system for NVMeoF storage network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010338868.8A CN111459417B (en) 2020-04-26 2020-04-26 Non-lock transmission method and system for NVMeoF storage network

Publications (2)

Publication Number Publication Date
CN111459417A CN111459417A (en) 2020-07-28
CN111459417B true CN111459417B (en) 2023-08-18

Family

ID=71683815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010338868.8A Active CN111459417B (en) 2020-04-26 2020-04-26 Non-lock transmission method and system for NVMeoF storage network

Country Status (1)

Country Link
CN (1) CN111459417B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112328178B (en) * 2020-11-05 2022-08-09 苏州浪潮智能科技有限公司 Method and device for processing IO queue full state of solid state disk
CN113176896B (en) * 2021-03-19 2022-12-13 中盈优创资讯科技有限公司 Method for randomly taking out object based on single-in single-out lock-free queue
CN114328317B (en) * 2021-11-30 2023-07-14 苏州浪潮智能科技有限公司 Method, device and medium for improving communication performance of storage system
CN115550377B (en) * 2022-11-25 2023-03-07 苏州浪潮智能科技有限公司 NVMF (network video and frequency) storage cluster node interconnection method, device, equipment and medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1051472A (en) * 1996-05-31 1998-02-20 Internatl Business Mach Corp <Ibm> Method and device for transmitting and receiving packet
CN1859325A (en) * 2006-02-14 2006-11-08 华为技术有限公司 News transfer method based on chained list process
WO2007109920A1 (en) * 2006-03-27 2007-10-04 Zte Corporation A method for constructing and using a memory pool
CN101248618A (en) * 2005-12-07 2008-08-20 中兴通讯股份有限公司 Flow control transport protocol output stream queue management and data transmission processing method
CN106302238A (en) * 2015-05-13 2017-01-04 深圳市中兴微电子技术有限公司 A kind of queue management method and device
DE102017104817A1 (en) * 2016-04-13 2017-10-19 Samsung Electronics Co., Ltd. System and method for a high performance lockable scalable target
CN107924289A (en) * 2015-10-26 2018-04-17 株式会社日立制作所 Computer system and access control method
CN108694021A (en) * 2017-04-03 2018-10-23 三星电子株式会社 The system and method for configuring storage device using baseboard management controller
US10440145B1 (en) * 2016-09-13 2019-10-08 Amazon Technologies, Inc. SDK for reducing unnecessary polling of a network service

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10958729B2 (en) * 2017-05-18 2021-03-23 Intel Corporation Non-volatile memory express over fabric (NVMeOF) using volume management device
US11016911B2 (en) * 2018-08-24 2021-05-25 Samsung Electronics Co., Ltd. Non-volatile memory express over fabric messages between a host and a target using a burst mode

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhu Jiaping, Li Qiong, Song Zhenlong, Dong Dezun, Ou Yang, Xu Weixia. Research on NVMeoF network storage protocols and hardware offloading technology. The 22nd Annual Conference on Computer Engineering and Technology and the 8th Microprocessor Technology Forum. 2018, pp. 15-24. *

Also Published As

Publication number Publication date
CN111459417A (en) 2020-07-28

Similar Documents

Publication Publication Date Title
CN111459417B (en) Non-lock transmission method and system for NVMeoF storage network
US20200314181A1 (en) Communication with accelerator via RDMA-based network adapter
EP1581875B1 (en) Using direct memory access for performing database operations between two or more machines
US8249072B2 (en) Scalable interface for connecting multiple computer systems which performs parallel MPI header matching
EP1868093B1 (en) Method and system for a user space TCP offload engine (TOE)
CN107948094A (en) A kind of high speed data frame Lothrus apterus is joined the team the device and method of processing
CN102404212A (en) Cross-platform RDMA (Remote Direct Memory Access) communication method based on InfiniBand
CN102831018B (en) Low latency FIFO messaging system
CN109547519B (en) Reverse proxy method, apparatus and computer readable storage medium
CN113452591B (en) Loop control method and device based on CAN bus continuous data frame
CN110535811B (en) Remote memory management method and system, server, client and storage medium
CN105141603A (en) Communication data transmission method and system
CN113572582B (en) Data transmission and retransmission control method and system, storage medium and electronic device
CN111865813B (en) Data center network transmission control method and system based on anti-ECN mark and readable storage medium
EP2383647A1 (en) Networking system call data division for zero copy operations
CN116471242A (en) RDMA-based transmitting end, RDMA-based receiving end, data transmission system and data transmission method
CN115509644A (en) Calculation force unloading method and device, electronic equipment and storage medium
CN114244785B (en) 5G data flow out-of-order processing method and device
CN113204517B (en) Inter-core sharing method of Ethernet controller special for electric power
WO2022151475A1 (en) Message buffering method, memory allocator, and message forwarding system
EP3977705B1 (en) Streaming communication between devices
CN114186163A (en) Application layer network data caching method
CN111586040A (en) High-performance network data receiving method and system
US8190765B2 (en) Data reception management apparatus, systems, and methods
CN1231841C (en) Primary channel adapter and its packet receiving method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant