WO2021259483A1 - Device and method for remote direct memory access - Google Patents

Device and method for remote direct memory access

Info

Publication number
WO2021259483A1
Authority
WO
WIPO (PCT)
Prior art keywords
indication
packets
rdma
operations
fencing
Prior art date
Application number
PCT/EP2020/067801
Other languages
English (en)
Inventor
Ben-Shahar BELKAR
David GANOR
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Priority to CN202080102405.XA priority Critical patent/CN115867894A/zh
Priority to PCT/EP2020/067801 priority patent/WO2021259483A1/fr
Publication of WO2021259483A1 publication Critical patent/WO2021259483A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/10Program control for peripheral devices
    • G06F13/102Program control for peripheral devices where the programme performs an interfacing function, e.g. device driver
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17306Intercommunication techniques
    • G06F15/17331Distributed shared memory [DSM], e.g. remote direct memory access [RDMA]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/20Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal

Definitions

  • the present disclosure relates to high performance computing technologies, in particular, to Remote Direct Memory Access (RDMA) technologies.
  • RDMA Remote Direct Memory Access
  • the disclosure allows transporting RDMA transactions over a packet-based network.
  • the present disclosure provides a device, a method and a data packet format for RDMA.
  • RDMA is the primary method of transport for remote memory operations.
  • the viability of RDMA relies heavily on its low latency properties.
  • RDMA operations require ordering of execution, hence it provides mechanisms to achieve the required order.
  • the RDMA receiver needs to be aware of the correct order of operations.
  • the RDMA receiver may be able to process RDMA operations out-of-order.
  • networks may deliver packets out-of-order, for instance due to different network paths used by different packets.
  • packets of RDMA data may arrive out-of-order at the RDMA receiver, e.g., due to network delivery, or packets that are lost and get retransmitted.
  • RDMA InfiniBand
  • IBT InfiniBand Transport Alliance
  • IETF Internet Engineering Task Force
  • RDMA provides only simplistic mechanisms for ordering of operations.
  • IB specification defines a “strict order” of execution
  • iWARP defines a “relaxed order” of execution with some restrictions.
  • One of RDMA’s mechanisms to enforce an order of execution of operations is fencing.
  • a fence may be used only on the RDMA sender side to enforce in-order delivery of operations.
  • a fence causes the RDMA sender to delay the sending of an operation, until the RDMA receiver sends a completion reply for a previous dependent operation.
  • this delay diminishes the low latency properties of RDMA, and hence diminishes a viability of RDMA.
  • embodiments of the disclosure aim to provide a device and a method for RDMA.
  • An objective is, in particular, to reduce a high latency of RDMA operations that include a fencing indication.
  • One aim is to allow a sender to send RDMA operations with the fencing indication without waiting for a completion signal of a previous operation.
  • a first aspect of the present disclosure provides a transmitting device for RDMA.
  • the transmitting device is configured to: transmit a first set of packets to a receiving device, wherein the first set of packets comprises one or more first RDMA operations; after transmitting the first set of packets, transmit a second set of packets to the receiving device, wherein the second set of packets comprises one or more second RDMA operations, wherein the one or more second RDMA operations are associated with a fencing indication; and provide a notification to the receiving device indicating that the one or more second RDMA operations are associated with the fencing indication.
  • Embodiments of the present disclosure provide a solution for the transmitting device to communicate the existence of the fencing indication associated with an operation to a receiving device.
  • This solution allows the receiving device to process the fence, instead of the transmitting device, and thus to reduce the latency of the RDMA operations that include a fencing indication.
  • the second set of packets is transmitted without receiving any indication of the completion status of the one or more first RDMA operations at the receiving device.
  • this disclosure allows the transmitting device to send a fenced operation, i.e., the one or more second RDMA operations associated with a fencing indication, immediately after the transmission of the first RDMA operations. That is, the transmitting device does not have to wait until receiving a completion signal of the first RDMA operations.
  • one or more packets in the second set of packets comprise one or more headers, and the fencing indication is piggybacked onto the one or more headers.
  • the fencing indication is piggybacked onto the one or more headers, particularly by overwriting one or more reserved or unused fields in the one or more headers.
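As a minimal sketch of the piggybacking idea, the following Python fragment sets a fence flag in a reserved byte of a packet header. The 6-byte header layout (opcode, reserved byte, 32-bit sequence number), the `FENCE_FLAG` bit position, and the function names are hypothetical and chosen only for illustration; they are not taken from the IB or iWARP header definitions.

```python
import struct

# Hypothetical header layout (an assumption for illustration only):
# 1-byte opcode, 1-byte reserved field, 4-byte sequence number, big-endian.
HEADER_FMT = "!BBI"
FENCE_FLAG = 0x01  # assumed bit inside the reserved byte


def build_header(opcode: int, psn: int, fenced: bool = False) -> bytes:
    """Pack a header; the fence flag reuses the otherwise unused reserved byte."""
    reserved = FENCE_FLAG if fenced else 0
    return struct.pack(HEADER_FMT, opcode, reserved, psn)


def is_fenced(header: bytes) -> bool:
    """Return True if the reserved byte carries the fence flag."""
    _opcode, reserved, _psn = struct.unpack(HEADER_FMT, header)
    return bool(reserved & FENCE_FLAG)
```

Because the flag occupies a field that was previously unused, the header length stays the same whether or not the operation is fenced.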
  • the fencing indication is carried in a packet with an RDMA opcode.
  • an operation with a fencing indication may be referred to as a fenced operation.
  • This embodiment proposes to expand the protocol to have a specific new opcode designed for fenced operations. For example, if RDMA “WRITE ONLY” has an opcode 0xA (1010b), a new opcode for the fenced operation, Fenced RDMA “WRITE ONLY”, may be set as 0x1A (11010b).
  • one or more packets in the second set of packets comprise one or more payloads, and the fencing indication is carried in the one or more payloads.
  • the fencing indication may be carried in payloads of packets, instead of in headers.
  • the fencing indication is used to indicate to the receiving device to complete the one or more first RDMA operations, before starting to process the second RDMA operation.
  • fencing is one of the RDMA mechanisms to enforce an order of execution of RDMA operations. Being notified that an operation is associated with a fencing indication, the receiving device will complete the previous operations before starting to process the fenced operation.
  • the transmitting device is further configured to: transmit a third set of packets to the receiving device, wherein the third set of packets comprises one or more third RDMA operations, wherein when no indication that the one or more first RDMA operations are complete at the receiving device is received by the transmitting device, the third set of packets further comprises the fencing indication associated with the one or more second RDMA operations.
  • the fencing indication may be added to the next operation of the fenced operation as well, i.e., the third RDMA operation, to prevent an ambiguity in a case of a packet loss.
  • the fencing indication associated with the one or more second RDMA operations is further used to indicate to the receiving device to complete the one or more first RDMA operations, before starting to process the third RDMA operations.
  • the fencing indication is a read-fence indication or a local-fence indication, wherein the read-fence indication is used to indicate the receiving device to complete read operations in the one or more first RDMA operations, before starting to process the second RDMA operations; and wherein the local-fence indication is used to indicate the receiving device to complete all operations in the one or more first RDMA operations, before starting to process the second RDMA operations.
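The difference between the two fence types can be sketched as a small selection function: a read-fence waits only for earlier READ operations, while a local-fence waits for all earlier operations. The `Fence`, `Op`, and `blocking_ops` names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Fence(Enum):
    NONE = "none"
    READ = "read-fence"
    LOCAL = "local-fence"


@dataclass
class Op:
    name: str
    kind: str  # "READ" or "WRITE"
    done: bool = False


def blocking_ops(fence: Fence, previous: List[Op]) -> List[Op]:
    """Return the earlier operations that must complete before a fenced one starts."""
    pending = [op for op in previous if not op.done]
    if fence is Fence.READ:
        # Read-fence: only incomplete READ operations block.
        return [op for op in pending if op.kind == "READ"]
    if fence is Fence.LOCAL:
        # Local-fence: every incomplete earlier operation blocks.
        return pending
    return []
```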
  • a second aspect of the present disclosure provides a receiving device for RDMA.
  • the receiving device is configured to: maintain a list of received RDMA operations; receive a first set of packets from a transmitting device, wherein the first set of packets comprises one or more first RDMA operations; receive a second set of packets from the transmitting device, wherein the second set of packets comprises one or more second RDMA operations, wherein the one or more second RDMA operations are associated with a fencing indication; receive a notification from the transmitting device indicating that the one or more second RDMA operations are associated with the fencing indication; and add the one or more first RDMA operations, the one or more second RDMA operations and the associated fencing indication into the list.
  • Embodiments of the present disclosure provide a solution for a transmitting device to communicate the existence of the fencing indication of an operation to the receiving device. This solution allows the entrusted receiving device to process the fence, instead of the transmitting device, and thus to reduce the latency of the RDMA operations that include a fencing indication.
  • the second set of packets is received following the first set of packets.
  • the transmitting device may send the fenced operation, i.e., the one or more second RDMA operations which are associated with a fencing indication, immediately after the transmission of the first RDMA operations.
  • the receiving device is further configured to process the one or more second RDMA operations, after the one or more first RDMA operations are complete at the receiving device.
  • the transmitting device does not have to wait until receiving a completion signal of the first RDMA operations, to send the one or more second RDMA operations
  • the receiving device will preferably not process the one or more second RDMA operations until the processing of the one or more first RDMA operations is complete. In this way, an execution order of RDMA operations can be kept without a high latency.
  • one or more packets in the second set of packets comprise one or more headers, and the fencing indication is piggybacked onto the one or more headers.
  • the fencing indication may be piggybacked onto the headers by overwriting one or more reserved or unused fields in the one or more headers.
  • the fencing indication is carried in a packet with an RDMA opcode.
  • one or more packets in the second set of packets comprise one or more payloads, and the fencing indication is carried in the one or more payloads.
  • the fencing indication is used to indicate the receiving device to complete the one or more first RDMA operations, before starting to process the second RDMA operations.
  • the receiving device is further configured to receive a third set of packets from the transmitting device, wherein the third set of packets comprises one or more third RDMA operations, wherein when no indication that the one or more first RDMA operations are complete at the receiving device is received by the transmitting device, the third set of packets further comprises the fencing indication associated with the one or more second RDMA operations.
  • the fencing indication associated with the one or more second RDMA operations is further used to indicate the receiving device to complete the one or more first RDMA operations, before starting to process the one or more third RDMA operations.
  • the fencing indication is a read-fence indication or a local-fence indication, wherein the read-fence indication is used to indicate the receiving device to complete read operations in the one or more first RDMA operations, before starting to process the second RDMA operations; and wherein the local-fence indication is used to indicate the receiving device to complete all operations in the one or more first RDMA operations, before starting to process the second RDMA operations.
  • a third aspect of the present disclosure provides a method for RDMA, wherein the method comprises: transmitting a first set of packets to a receiving device, wherein the first set of packets comprises one or more first RDMA operations; after transmitting the first set of packets, transmitting a second set of packets to the receiving device, wherein the second set of packets comprises one or more second RDMA operations, wherein the one or more second RDMA operations are associated with a fencing indication; and providing a notification to the receiving device indicating that the one or more second RDMA operations are associated with the fencing indication.
  • the method of the third aspect and its implementation forms provide the same advantages and effects as described above for the transmitting device of the first aspect and its respective implementation forms.
  • a fourth aspect of the present disclosure provides a method for RDMA, wherein the method comprises: maintaining a list of received operations; receiving a first set of packets from a transmitting device, wherein the first set of packets comprises one or more first RDMA operations; receiving a second set of packets from the transmitting device, wherein the second set of packets comprises one or more second RDMA operations, wherein the one or more second RDMA operations are associated with a fencing indication; receiving a notification from the transmitting device indicating that the one or more second RDMA operations are associated with the fencing indication; and adding the one or more first RDMA operations, the one or more second RDMA operations and the associated fencing indication into the list.
  • the method of the fourth aspect and its implementation forms provide the same advantages and effects as described above for the receiving device of the second aspect and its respective implementation forms.
  • a fifth aspect of the present disclosure provides a computer program comprising a program code for carrying out, when implemented on a processor, the method according to the third or fourth aspect or any of its implementation forms.
  • a sixth aspect of the present disclosure provides a computer readable storage medium comprising computer program code instructions, being executable by a computer, for performing a method according to the third or fourth aspect or any of its implementation forms when the computer program code instructions run on a computer.
  • FIG. 1 shows an example of packets exchanged between a sender and a receiver.
  • FIG. 2 shows an example of packets exchanged between a sender and a receiver.
  • FIG. 3 shows a transmitting device according to an embodiment of the disclosure.
  • FIG. 4 shows a receiving device according to an embodiment of the disclosure.
  • FIG. 5 shows an example of packets exchanged between a sender and a receiver according to an embodiment of the disclosure.
  • FIG. 6 shows an example of packets exchanged between a sender and a receiver according to an embodiment of the disclosure.
  • FIG. 7 shows a method according to an embodiment of the disclosure.
  • FIG. 8 shows a method according to an embodiment of the disclosure.
  • an embodiment/example may refer to other embodiments/examples.
  • any description, including but not limited to terminology, element, process, explanation and/or technical advantage mentioned in one embodiment/example, is applicable to the other embodiments/examples.
  • FIG. 1 shows an example of packets exchanged between a sender and a receiver.
  • FIG. 1 shows the packets exchanged with a WRITE operation including a Read-Fence with IB “strict order” execution.
  • “strict order” implies that operations carried in all packets must be executed in order.
  • WQE Work Queue Entry
  • QP Queue Pair
  • a WQE is an RDMA operation or transaction that is pushed into a QP, either to be transmitted to the peer, or having been received from the peer.
  • QP represents a communication endpoint, which consists of a SEND Queue and a RECEIVE Queue.
  • READ1 requests three responses
  • READ2 requests two responses
  • WRITE1, which includes a Read-Fence indication
  • the two READ operations will be sent as soon as the sender can process the WQEs, regardless of whether previous READ operations have been completed.
  • READ1 and READ2 are processed in order as dictated by “strict order” execution.
  • the WRITE operation will be sent only after the sender receives two completion signals indicating the READ operations have been completed.
  • a time delay is caused by the Read-Fence indication. Particularly, the delay is between sending READ2 and the starting of WRITE1.
  • READ2 starts after READ1 has been completed; it is also possible that READ2 is sent out before the last response of READ1 has been accepted.
  • the Read-Fence indication on the WRITE operation just indicates not to send the WRITE before all previous READ operations have been completed. However, if there is no Read-Fence indication on WRITE1, the READ2 and WRITE1 requests will not wait for all the responses of READ1 to arrive before they go out. That is, both READ2 and WRITE1 will go out to the network in order, but it might happen that WRITE1 will arrive at the receiver even before the previous READ operations have been completed.
  • the responses of READ2 will arrive after those of READ1. That is, the receiver will not process READ2 before READ1 is complete.
  • the WRITE1 may be processed before any of the two READ operations has been completed.
  • the ordering rules on strict ordering scenarios are defined in the IB specification.
  • FIG. 2 shows another example of packets exchanged between a sender and a receiver.
  • FIG. 2 shows the packets exchanged with a WRITE operation including a Local-Fence with iWARP “relaxed order” execution.
  • “relaxed order” implies that packets could be placed out-of-order.
  • WRITE1 is a multi-packet operation with three packets
  • WRITE2, which includes a Local-Fence indication, is a multi-packet operation with three packets
  • WRITE3 is a multi-packet operation with two packets.
  • WRITE2 and WRITE3 will be sent, only after the sender receives a completion signal of WRITE1.
  • WRITE3 is sent out even before WRITE2 is signaled as completed.
  • a time delay between sending the last packet of WRITE1 and starting WRITE2 is caused by the Local-Fence indication.
  • FIG. 3 shows a transmitting device 300 adapted for RDMA according to an embodiment of the disclosure.
  • the transmitting device 300 may comprise processing circuitry (not shown) configured to perform, conduct or initiate the various operations of the transmitting device 300 described herein.
  • the processing circuitry may comprise hardware and software.
  • the hardware may comprise analog circuitry or digital circuitry, or both analog and digital circuitry.
  • the digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or multi-purpose processors.
  • the transmitting device 300 may further comprise memory circuitry, which stores one or more instruction(s) that can be executed by the processor or by the processing circuitry, in particular under control of the software.
  • the memory circuitry may comprise a non-transitory storage medium storing executable software code which, when executed by the processor or the processing circuitry, causes the various operations of the transmitting device 300 to be performed.
  • the processing circuitry comprises one or more processors and a non-transitory memory connected to the one or more processors.
  • the non-transitory memory may carry executable program code which, when executed by the one or more processors, causes the transmitting device 300 to perform, conduct or initiate the operations or methods described herein.
  • the transmitting device 300 is configured to transmit a first set of packets 301 to a receiving device 310.
  • the first set of packets 301 comprises one or more first RDMA operations.
  • the transmitting device 300 is further configured to transmit a second set of packets 302 to the receiving device 310.
  • the second set of packets 302 comprises one or more second RDMA operations.
  • the one or more second RDMA operations are associated with a fencing indication.
  • the transmitting device 300 is configured to provide a notification 303 to the receiving device 310 indicating that the one or more second RDMA operations are associated with the fencing indication.
  • embodiments of this disclosure propose an RDMA operation transmitting and receiving method that allows the fencing indication to be communicated to the receiver, i.e., the receiving device 310.
  • the fencing is signaled to the receiver by the sender, i.e., the transmitting device 300.
  • the latency can be reduced by entrusting the receiver to process the fencing, instead of the sender processing it.
  • the second set of packets 302 may be transmitted without receiving any indication of the completion status of the one or more first RDMA operations at the receiving device 310. This is in contrast to the conventional solution, in which an operation that includes a fencing indication is sent only after the sender receives the completion signal of the previous operation (as in the examples shown in FIG. 1 and FIG. 2).
  • FIG. 4 shows a receiving device 310 adapted for RDMA according to an embodiment of the disclosure.
  • the receiving device 310 may comprise processing circuitry (not shown) configured to perform, conduct or initiate the various operations of the receiving device 310 described herein.
  • the processing circuitry may comprise hardware and software.
  • the hardware may comprise analog circuitry or digital circuitry, or both analog and digital circuitry.
  • the digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or multi-purpose processors.
  • the receiving device 310 may further comprise memory circuitry, which stores one or more instruction(s) that can be executed by the processor or by the processing circuitry, in particular under control of the software.
  • the memory circuitry may comprise a non-transitory storage medium storing executable software code which, when executed by the processor or the processing circuitry, causes the various operations of the receiving device 310 to be performed.
  • the processing circuitry comprises one or more processors and a non-transitory memory connected to the one or more processors.
  • the non-transitory memory may carry executable program code which, when executed by the one or more processors, causes the receiving device 310 to perform, conduct or initiate the operations or methods described herein.
  • the receiving device 310 is configured to maintain a list 311 of received RDMA operations.
  • the receiving device 310 is further configured to receive a first set of packets 301 from a transmitting device 300.
  • the transmitting device 300 may be the transmitting device shown in FIG. 3.
  • the first set of packets 301 comprises one or more first RDMA operations.
  • the receiving device 310 is further configured to receive a second set of packets 302 from the transmitting device 300.
  • the second set of packets 302 comprises one or more second RDMA operations.
  • the one or more second RDMA operations are associated with a fencing indication.
  • the receiving device 310 is configured to receive a notification 303 from the transmitting device 300 indicating that the one or more second RDMA operations are associated with the fencing indication. Then, the receiving device 310 is further configured to add the one or more first RDMA operations, the one or more second RDMA operations and the associated fencing indication into the list 311. It should be noted that, the second set of packets 302 may be received following the first set of packets 301.
  • Embodiments of the disclosure propose a receiving device 310 that maintains a list 311 of received RDMA operations.
  • the list 311 may further include metadata of an RDMA operation, such as a fencing indication associated with the RDMA operation.
  • when the receiving device 310 receives an RDMA operation, it will keep the received operation in the list 311, along with the fencing indication associated with that operation (if there is a fencing indication).
  • the list 311 of the received RDMA operation may be stored in a storage space of the receiving device 310.
  • the list 311 may be stored in a buffer, or a queue.
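One possible shape for such a list is sketched below, assuming a simple in-memory queue in which each entry keeps the operation identifier together with its optional fencing metadata. The `Entry` and `ReceivedList` names are hypothetical stand-ins for the list 311, not structures defined by the disclosure.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class Entry:
    op_id: str
    fence: Optional[str] = None  # e.g. "read-fence", "local-fence", or None


class ReceivedList:
    """Hypothetical stand-in for the list 311 of received RDMA operations."""

    def __init__(self) -> None:
        self._entries: Deque[Entry] = deque()

    def add(self, op_id: str, fence: Optional[str] = None) -> None:
        # The fencing indication is stored as metadata next to the operation.
        self._entries.append(Entry(op_id, fence))

    def fenced_ids(self) -> List[str]:
        """Identifiers of stored operations that carry a fencing indication."""
        return [e.op_id for e in self._entries if e.fence is not None]

    def __len__(self) -> int:
        return len(self._entries)
```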
  • FIG. 5 shows an example of packets exchanged between a sender and an entrusted receiver according to an embodiment of the disclosure.
  • the sender may be the transmitting device 300 as shown in FIG. 3, and the receiver may be the receiving device 310 as shown in FIG. 4.
  • FIG. 5 shows the packets exchanged with a WRITE operation including a Read-Fence with IB “strict order” execution.
  • strict order implies that operations are executed in order.
  • the ordering rules on strict ordering scenarios are defined in the IB specification.
  • WRITE is a multi-packet operation with two packets, and includes a Read-Fence indication.
  • WRITE operation will be sent by the sender, i.e., the transmitting device 300, after the READ request, without any delay, and will include the fencing indication.
  • the receiver, i.e., the receiving device 310, will store these WRITE requests but will not process them, due to the fencing indication.
  • the fencing indication informs the receiver that all previous READ operations need to be completed before the WRITE operation can be executed. Accordingly, the receiver processes the READ operation and sends the three responses. As soon as the receiver completes executing the READ request, it starts executing the stored WRITE requests.
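The receiver behaviour in this example can be sketched as a small state machine: a fenced WRITE is stored on arrival and started only once the earlier READ completes. This is an illustrative model under simplifying assumptions (one flag per operation, completion signalled by an explicit call); the `Op` and `Receiver` classes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    name: str
    kind: str                # "READ" or "WRITE"
    read_fence: bool = False
    state: str = "stored"    # "stored" -> "running" -> "done"


@dataclass
class Receiver:
    ops: List[Op] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def receive(self, op: Op) -> None:
        self.ops.append(op)
        self._start_ready()

    def complete(self, name: str) -> None:
        for op in self.ops:
            if op.name == name:
                op.state = "done"
        self._start_ready()

    def _start_ready(self) -> None:
        for i, op in enumerate(self.ops):
            if op.state != "stored":
                continue
            earlier_reads_pending = any(
                p.kind == "READ" and p.state != "done" for p in self.ops[:i]
            )
            if op.read_fence and earlier_reads_pending:
                break  # hold this operation and, under strict order, all later ones
            op.state = "running"
            self.log.append(f"start {op.name}")
```

Calling `receive` for the READ and then for the fenced WRITE logs only the READ start; the WRITE start is logged after `complete("READ")`, without any further packet from the sender.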
  • FIG. 6 shows another example of packets exchanged between a sender and an entrusted receiver according to an embodiment of the disclosure.
  • the sender may be the transmitting device 300 as shown in FIG. 3, and the receiver may be the receiving device 310 as shown in FIG. 4.
  • FIG. 6 shows the packets exchanged with a WRITE operation including a Local-Fence with iWARP “relaxed order” execution.
  • “relaxed order” implies that operations could be executed out-of-order.
  • the ordering rules on relaxed ordering scenarios are defined in the iWARP specification.
  • WRITE1 is a multi-packet operation with three packets
  • WRITE2, which includes a Local-Fence indication, is a multi-packet operation with three packets
  • WRITE3 is a multi-packet operation with two packets.
  • all WRITE operations are sent as soon as the sender, i.e., the transmitting device 300, processes their WQEs.
  • WRITE2 is sent before WRITE1 is complete, even though WRITE2 includes a Local-Fence indication.
  • WRITE3 is sent before WRITE2 is complete as this example shows a “relaxed order” of execution.
  • WRITE3 can be executed either before, or after WRITE2 is complete, as iWARP “relaxed order” execution permits such implementation, and since WRITE3 is not fenced by itself.
  • Embodiments of this disclosure define a new flow of RDMA operations’ execution.
  • RDMA operations are sent by a sender to a receiver, over the network, even if a fencing indication is present.
  • the sender may be the transmitting device 300 as shown in FIG. 3, and the receiver may be the receiving device 310 as shown in FIG. 4.
  • the transmitting device 300 communicates the fencing indication, if present, to the receiving device 310.
  • the transmitting device 300 may keep communicating fencing indications for subsequent operations, until it receives the completion indication of the first fenced operation. This is to ensure that the receiving device 310 gets the fencing indication even if it received out-of-order operations or packets.
  • the receiving device 310 may maintain a queue of received operations. When a fencing indication is present for a specific operation and for the following operations, the receiving device 310 will not process that operation or the following operations until it completes all the previous operations that they depend upon.
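The sender side of this flow can be sketched as follows: every operation is sent immediately, and the fence flag stays set on subsequent operations until the completion of the fenced operation is reported. The `Sender` class and its attributes are hypothetical, and the sketch tracks only one outstanding fence at a time for brevity.

```python
from typing import List, Optional, Tuple


class Sender:
    """Hypothetical sender that tags operations instead of delaying them."""

    def __init__(self) -> None:
        # Name of the fenced operation whose completion is still awaited.
        self.active_fence: Optional[str] = None
        # (operation name, fence flag) in the order sent on the wire.
        self.wire: List[Tuple[str, bool]] = []

    def send(self, name: str, fenced: bool = False) -> None:
        if fenced and self.active_fence is None:
            self.active_fence = name
        # Sent immediately; the flag stays set while a fence is outstanding.
        self.wire.append((name, self.active_fence is not None))

    def on_completion(self, name: str) -> None:
        # Stop repeating the fence once the fenced operation has completed.
        if name == self.active_fence:
            self.active_fence = None
```

Repeating the flag means a receiver that sees a later operation first, for example after a packet loss, still learns that a fence is in effect.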
  • this disclosure reduces a latency of RDMA operations that include a fencing indication, by communicating this fencing indication to a receiver, particularly an entrusted receiver.
  • the entrusted receiver i.e., the receiving device 310 as shown in FIG. 3 or FIG. 4, becomes the entity that enforces the fence, not the sender, i.e., the transmitting device 300.
  • the transmitting device 300 does not wait for the completion of the previous dependent operation to be received, in order to send the fenced operation. In this way, the transmitting device 300 is not delaying an operation due to a fencing indication, instead, the receiving device
  • the fencing indication may be metadata of an operation and may be incorporated into all packets in the operation.
  • the fencing indication may be added to the next operations too, to prevent ambiguity in a case of a packet loss.
  • Metadata of an operation may be piggybacked onto unused fields in header, or sent as a new packet with a different opcode.
  • opcode defined for a fenced operation.
  • Embodiments of the disclosure suggest expanding the protocol to have a specific new opcode for the fenced operation. For instance, when an operation of RDMA Write ONLY has an opcode 0xA (1010b), a new opcode for an operation of Fenced RDMA Write ONLY may be created as 0x1A (11010b).
  • the receiving device 310 may maintain an operations queue, where metadata of operations may be stored in case a fencing indication exists, until all previous dependent operations have been processed.
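In the example values above, the fenced opcode differs from the base opcode only in bit 4. A minimal sketch of that encoding follows; treating bit 4 as a general "fence bit" for other opcodes is an assumption made here for illustration, since the text gives only this one pair of values, and the helper names are hypothetical.

```python
WRITE_ONLY = 0x0A         # 01010b, base opcode from the example
FENCED_WRITE_ONLY = 0x1A  # 11010b, fenced variant from the example
FENCE_BIT = 0x10          # assumption: the two values differ only in this bit


def to_fenced(opcode: int) -> int:
    """Derive the fenced variant of an opcode under the assumed encoding."""
    return opcode | FENCE_BIT


def split_opcode(opcode: int) -> tuple:
    """Return (base opcode, fenced?) under the assumed one-bit encoding."""
    return opcode & ~FENCE_BIT, bool(opcode & FENCE_BIT)
```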
  • the latency of an operation with a fencing indication can be reduced by at least one round trip time (RTT) of the channel between the transmitting device 300 and the receiving device 310.
  • RTT is a sum of the time it takes for the completion signal of the previous operation to reach the transmitting device 300 from the receiving device 310, and the time it takes for the fenced operation to reach the receiving device 310, from the transmitting device 300.
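The saving can be illustrated with a small timing model. It assumes that, with receiver-side fencing, the fenced operation has already arrived by the time the previous operation completes; the function name, parameters, and time values are hypothetical and carry no particular unit.

```python
def fenced_start_time(prev_done: float, t_rx_to_tx: float, t_tx_to_rx: float,
                      receiver_enforced: bool) -> float:
    """Earliest time the fenced operation can start at the receiver."""
    if receiver_enforced:
        # The fenced operation is assumed to be waiting at the receiver already.
        return prev_done
    # Sender-side fence: the completion travels back, then the operation travels forward.
    return prev_done + t_rx_to_tx + t_tx_to_rx


conventional = fenced_start_time(10.0, 2.0, 3.0, receiver_enforced=False)
proposed = fenced_start_time(10.0, 2.0, 3.0, receiver_enforced=True)
saving = conventional - proposed  # equals t_rx_to_tx + t_tx_to_rx, i.e. one RTT
```

With these example values the fenced operation starts at 15.0 in the sender-enforced case and at 10.0 in the receiver-enforced case, a difference of one RTT (2.0 + 3.0).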
  • FIG. 7 shows a method 700 for RDMA according to an embodiment of the disclosure.
  • the method 700 is performed by a transmitting device 300 shown in FIG. 3.
  • the method 700 comprises a step 701 of transmitting a first set of packets to a receiving device 310.
  • the first set of packets 301 comprises one or more first RDMA operations.
  • the receiving device 310 may be the receiving device shown in FIG. 4.
  • the method 700 further comprises a step 702 of transmitting a second set of packets 302 to the receiving device 310.
  • the second set of packets 302 comprises one or more second RDMA operations, wherein the one or more second RDMA operations are associated with a fencing indication.
  • Step 702 is, in particular, performed after step 701, i.e., after the step of transmitting the first set of packets 301. Further, the method 700 comprises a step 703 of providing a notification 303 to the receiving device 310 indicating that the one or more second RDMA operations are associated with the fencing indication.
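
The sender-side steps 701 to 703 can be sketched as follows. This is a minimal illustration, assuming that the notification 303 is modelled as a per-packet flag (one of the options described earlier, where the fencing indication is carried in all packets of an operation). The `Packet` type and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Packet:
    op_id: int    # operation this packet belongs to
    seq: int      # packet index within the operation
    fenced: bool  # fencing indication carried in every packet of the operation


def packetize(op_id: int, num_packets: int, fenced: bool) -> List[Packet]:
    """Split one operation into packets that all carry the same fence flag."""
    return [Packet(op_id, seq, fenced) for seq in range(num_packets)]


def build_transmit_order(
    first_ops: List[Tuple[int, int]],
    second_ops: List[Tuple[int, int]],
) -> List[Packet]:
    """Each op is (op_id, num_packets). The fenced operations are queued
    directly behind the first ones, without waiting for any completion."""
    packets: List[Packet] = []
    for op_id, count in first_ops:    # step 701: first set of packets
        packets.extend(packetize(op_id, count, fenced=False))
    for op_id, count in second_ops:   # steps 702 and 703: fenced set plus indication
        packets.extend(packetize(op_id, count, fenced=True))
    return packets
```

For example, `build_transmit_order([(1, 2)], [(2, 3)])` yields five packets, of which the last three carry the fence flag.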
  • FIG. 8 shows a method 800 for RDMA according to an embodiment of the disclosure.
  • the method 800 is performed by a receiving device 310 shown in FIG. 4.
  • the method 800 comprises a step 801 of maintaining a list 311 of received operations, and a step 802 of receiving a first set of packets 301 from a transmitting device 300, wherein the first set of packets 301 comprises one or more first RDMA operations.
  • the transmitting device 300 may be the transmitting device shown in FIG. 3.
  • the method 800 further comprises a step 803 of receiving a second set of packets 302 from the transmitting device 300.
  • the second set of packets 302 comprises one or more second RDMA operations, wherein the one or more second RDMA operations are associated with a fencing indication.
  • the method 800 further comprises a step 804 of receiving a notification 303 from the transmitting device 300 indicating that the one or more second RDMA operations are associated with the fencing indication; and a step 805 of adding the one or more first RDMA operations, the one or more second RDMA operations and the associated fencing indication to the list 311.
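
The receiver-side behaviour of method 800 can be sketched as follows. This is a simplified model under stated assumptions: every earlier operation in the list is treated as a dependency of a fenced operation, and operations queued behind a waiting fenced operation also wait. The class `FenceEnforcingReceiver` and its methods are hypothetical names, not part of the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    op_id: int
    fenced: bool
    started: bool = False
    done: bool = False


class FenceEnforcingReceiver:
    def __init__(self) -> None:
        self.ops: List[Entry] = []        # plays the role of the list 311
        self.started_log: List[int] = []  # order in which operations were started

    def receive(self, op_id: int, fenced: bool = False) -> None:
        """Steps 802 to 805: record the operation and its fencing indication."""
        self.ops.append(Entry(op_id, fenced))
        self._advance()

    def complete(self, op_id: int) -> None:
        """Mark an operation as processed, then retry any waiting operations."""
        for entry in self.ops:
            if entry.op_id == op_id:
                entry.done = True
        self._advance()

    def _advance(self) -> None:
        for index, entry in enumerate(self.ops):
            if entry.started:
                continue
            if entry.fenced and not all(prev.done for prev in self.ops[:index]):
                break  # assumption: the fenced op and everything behind it wait
            entry.started = True
            self.started_log.append(entry.op_id)
```

In this sketch a fenced operation that arrives early is simply held in the list and is started as soon as the last earlier operation completes, with no round trip to the sender.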
  • any method according to embodiments of the disclosure may be implemented in a computer program, having code means, which when run by processing means causes the processing means to execute the steps of the method.
  • the computer program is included in a computer readable medium of a computer program product.
  • the computer readable medium may comprise essentially any memory, such as a ROM (Read-Only Memory), a PROM (Programmable Read-Only Memory), an EPROM (Erasable PROM), a Flash memory, an EEPROM (Electrically Erasable PROM), or a hard disk drive.
  • embodiments of the transmitting device 300 or the receiving device 310 comprise the necessary communication capabilities in the form of, e.g., functions, means, units, elements, etc., for performing the solution.
  • Examples of such means, units, elements and functions are: processors, memory, buffers, control logic, encoders, decoders, rate matchers, de-rate matchers, mapping units, multipliers, decision units, selecting units, switches, interleavers, de-interleavers, modulators, demodulators, inputs, outputs, antennas, amplifiers, receiver units, transmitter units, DSPs, trellis-coded modulation (TCM) encoders, TCM decoders, power supply units, power feeders, communication interfaces, communication protocols, etc.
  • the processor(s) of the transmitting device 300 or the receiving device 310 may comprise, e.g., one or more instances of a Central Processing Unit (CPU), a processing unit, a processing circuit, a processor, an Application Specific Integrated Circuit (ASIC), a microprocessor, or other processing logic that may interpret and execute instructions.
  • a microprocessor may thus represent processing circuitry comprising a plurality of processing circuits, such as, e.g., any, some or all of the ones mentioned above.
  • the processing circuitry may further perform data processing functions for inputting, outputting, and processing of data comprising data buffering and device control functions, such as call processing control, user interface control, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The present disclosure relates to a device and a method for RDMA. Specifically, the disclosure proposes a transmitting device for RDMA. The transmitting device is configured to: transmit a first set of packets to a receiving device, wherein the first set of packets comprises one or more first RDMA operations; after transmitting the first set of packets, transmit a second set of packets to the receiving device, wherein the second set of packets comprises one or more second RDMA operations, the one or more second RDMA operations being associated with a fencing indication; and provide a notification to the receiving device indicating that the one or more second RDMA operations are associated with the fencing indication. Further, the disclosure proposes a receiving device for RDMA.
PCT/EP2020/067801 2020-06-25 2020-06-25 Device and method for remote direct memory access WO2021259483A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080102405.XA CN115867894A (zh) 2020-06-25 2020-06-25 Device and method for remote direct memory access
PCT/EP2020/067801 WO2021259483A1 (fr) 2020-06-25 2020-06-25 Device and method for remote direct memory access

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2020/067801 WO2021259483A1 (fr) 2020-06-25 2020-06-25 Device and method for remote direct memory access

Publications (1)

Publication Number Publication Date
WO2021259483A1 2021-12-30

Family

ID=71143748

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2020/067801 WO2021259483A1 (fr) 2020-06-25 2020-06-25 Device and method for remote direct memory access

Country Status (2)

Country Link
CN (1) CN115867894A (fr)
WO (1) WO2021259483A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11622004B1 (en) * 2022-05-02 2023-04-04 Mellanox Technologies, Ltd. Transaction-based reliable transport

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190034381A1 (en) * 2017-07-26 2019-01-31 Mellanox Technologies, Ltd. Network data transactions using posted and non-posted operations

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190034381A1 (en) * 2017-07-26 2019-01-31 Mellanox Technologies, Ltd. Network data transactions using posted and non-posted operations

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SANTHANARAMAN G ET AL: "Design alternatives for implementing fence synchronization in MPI-2 one-sided communication for InfiniBand clusters", CLUSTER COMPUTING AND WORKSHOPS, 2009. CLUSTER '09. IEEE INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 31 August 2009 (2009-08-31), pages 1 - 9, XP031547992, ISBN: 978-1-4244-5011-4 *

Also Published As

Publication number Publication date
CN115867894A (zh) 2023-03-28

Similar Documents

Publication Publication Date Title
US20220311544A1 (en) System and method for facilitating efficient packet forwarding in a network interface controller (nic)
US11934340B2 (en) Multi-path RDMA transmission
US11184439B2 (en) Communication with accelerator via RDMA-based network adapter
US10110518B2 (en) Handling transport layer operations received out of order
EP2574000B1 (fr) Message acceleration
US8023520B2 (en) Signaling packet
US7889762B2 (en) Apparatus and method for in-line insertion and removal of markers
CA2548966C (fr) Improved TCP retransmission processing speed
US8655974B2 (en) Zero copy data transmission in a software based RDMA network stack
US7243284B2 (en) Limiting number of retransmission attempts for data transfer via network interface controller
US7782905B2 (en) Apparatus and method for stateless CRC calculation
US7733875B2 (en) Transmit flow for network acceleration architecture
US7760741B2 (en) Network acceleration architecture
US7912979B2 (en) In-order delivery of plurality of RDMA messages
EP2722768B1 (fr) TCP processing for devices
US9692560B1 (en) Methods and systems for reliable network communication
WO2021259483A1 (fr) Device and method for remote direct memory access
US7929536B2 (en) Buffer management for communication protocols
US10320694B2 (en) Methods, apparatuses and computer-readable storage mediums for communication via user services platform
US10498867B2 (en) Network interface device and host processing device field
EP4363988A1 (fr) Device and method for remote direct memory access

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20734732

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20734732

Country of ref document: EP

Kind code of ref document: A1