WO2023011712A1 - A device and method for remote direct memory access - Google Patents

A device and method for remote direct memory access

Info

Publication number
WO2023011712A1
Authority
WO
WIPO (PCT)
Prior art keywords
packet
packets
receiving device
sequence
message
Prior art date
Application number
PCT/EP2021/071749
Other languages
French (fr)
Inventor
Reuven Cohen
David GANOR
Amit GERON
Ben-Shahar BELKAR
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Priority to CN202180096824.1A priority Critical patent/CN117203627A/en
Priority to PCT/EP2021/071749 priority patent/WO2023011712A1/en
Publication of WO2023011712A1 publication Critical patent/WO2023011712A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17306Intercommunication techniques
    • G06F15/17331Distributed shared memory [DSM], e.g. remote direct memory access [RDMA]

Definitions

  • the present disclosure relates to high-performance computing technologies, in particular, to Remote Direct Memory Access (RDMA) technologies.
  • the disclosure is concerned with transporting RDMA transactions over a packet-based network.
  • the present disclosure provides a device, a method, and a data packet format for RDMA.
  • TCP Transmission Control Protocol
  • UDP User Datagram Protocol
  • QUIC Quick UDP Internet Connections
  • RDMA-RC RDMA-Reliable Connection
  • the receiver sends acknowledgments (ACKs) for the received packets.
  • ACKs acknowledgments
  • NAKs negative acknowledgments
  • the various retransmission schemes are sometimes classified according to their operations into “Stop and Wait”, “Go-Back-N” and “Selective-Repeat”, for example.
  • RDMA is a technology that allows applications to perform memory access operations on remote memory installed in a remote network node.
  • RDMA-RC provides reliable transport of data, and it is implemented in the RDMA Network Interface Card (RNIC) device, thus allowing a network node to perform such memory access operations without involving either the operating system or the node’s main Central Processing Unit (CPU).
  • RNIC RDMA Network Interface Card
  • RDMA allows a computer to perform such memory access operations without involving the operating system that runs on the computer.
  • RDMA is becoming widely used in modern data centers and by computer clusters, as it provides low-latency remote memory access operations together with high network bandwidth.
  • InfiniBand RDMA has two variants that allow it to run over IP/Ethernet networks, namely RoCE and RoCEv2.
  • RDMA-RC in the InfiniBand uses the relatively simple Go-Back-N (GBN) retransmission algorithm, in which many packets are retransmitted following an identified event of a lost packet.
  • GBN Go-Back-N
  • messages are signaled as completed in the same order they have been posted by the software layer.
  • Current RDMA-RC does not support any out-of-order faster completion signaling.
  • embodiments of the disclosure aim to provide a retransmission scheme for reliable transport.
  • An objective is to propose a retransmission scheme, in particular, a retransmission scheme that guarantees that all transmitted packets are processed exactly once by the receiver.
  • One aim is to enable signaling the completion of messages in an order (in time) different from the order in which they were issued.
  • Another aim is to enable the easy application of Equal Cost Multiple Paths (ECMP) flowlets for accelerating message execution and completion.
  • ECMP Equal Cost Multiple Paths
  • a first aspect of the present disclosure provides a transmitting device for RDMA.
  • the transmitting device is configured to: transmit a sequence of packets to a receiving device for RDMA, wherein each transmitted packet of the sequence of packets is associated with a packet sequence number (PSN), and carries a message; determine whether a particular packet of the sequence of packets is received at the receiving device, wherein the particular packet comprises a first message; and retransmit the particular packet to the receiving device as the next step after determining that the particular packet is missing at the receiving device, wherein the retransmitted packet carries the same first message as the particular packet, and is associated with a new PSN.
  • PSN packet sequence number
  • Embodiments of this disclosure introduce a new RDMA retransmission scheme for reliable transport, referred to as Order-Agnostic retransmission.
  • This disclosure enables out-of-order or faster completion signals.
  • each packet carries exactly one message.
  • the message mentioned here may be a work queue element (WQE), which is an RDMA operation or transaction that is posted or issued by a source ULP (e.g., an application or software) and is pushed into a queue pair (QP).
  • WQE work queue element
  • QP queue pair
  • each transmitted packet comprises a transaction identifier (XID), identifying the message carried in the packet; the particular packet comprises a first XID identifying the first message carried in the particular packet; and the retransmitted packet comprises the same first XID as the particular packet.
  • XID transaction identifier
  • each packet (transmitted or retransmitted) carries two different numbers: an XID and a PSN.
  • the XID identifies the message, thus each message is assigned with a unique XID by the sender.
  • the retransmitted copy carries the same XID, but the PSN is different.
  • the PSN of the retransmitted packet is logically larger than the previous missing packet sent towards the same destination.
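The relation between the two identifiers can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `Packet` and `Sender` names, the field layout, and the example numbers are assumptions, and PSN wraparound is ignored for brevity.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Packet:
    xid: int        # identifies the message; unchanged across retransmissions
    psn: int        # identifies this particular transmission; fresh on every send
    payload: bytes  # the single message carried by the packet


class Sender:
    """Hypothetical sender that draws every PSN from one increasing counter."""

    def __init__(self, first_psn: int = 0):
        self._next_psn = count(first_psn)

    def send(self, xid: int, payload: bytes) -> Packet:
        return Packet(xid=xid, psn=next(self._next_psn), payload=payload)

    def retransmit(self, lost: Packet) -> Packet:
        # Same message and same XID, but a new (larger) PSN.
        return self.send(lost.xid, lost.payload)


sender = Sender(first_psn=20)
original = sender.send(xid=300, payload=b"write A")
sender.send(xid=301, payload=b"write B")
copy = sender.retransmit(original)
```

Here `copy` carries the XID and payload of `original`, while its PSN is the next unused value of the counter.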
  • the transmitting device is further configured to assign each transmitted packet and/or each retransmitted packet with a flowlet ID identifying a flowlet, wherein the flowlet comprises a plurality of packets, which are routed through the same network route, and each transmitted packet and/or retransmitted packet further comprises the flowlet ID; and transmit each transmitted packet and/or retransmitted packet to the receiving device over the flowlet.
  • the transmitting device may maintain a separate (continuous) PSN space for each sub-stream, i.e., flowlet. That is, each flowlet is associated with a separate PSN space.
  • each flowlet is associated with a separate PSN space.
  • the flowlet ID that is assigned to the retransmitted packet is different from the flowlet ID that is assigned to the transmitted packet, wherein the transmitted packet and the retransmitted packet are routed through different flowlets identified by the flowlet IDs of the transmitted packet and the retransmitted packet, respectively.
  • the transmitting device may retransmit the same message in a new packet over any flowlet. However, it may be preferred to assign a retransmitted packet a different flowlet ID, and route the retransmitted packet through a different flowlet.
  • the transmitting device is configured to receive a notification message for one or more transmitted packets of the sequence of packets from the receiving device, wherein the notification message indicates whether the transmitted packets are received at the receiving device; and determine whether the particular packet of the sequence of packets is received at the receiving device according to the notification message.
  • the notification message may be an ACK, which carries the PSN and the flowlet ID of the received packet.
  • a single notification message can report multiple packets belonging to the same flowlet.
  • the transmitting device is further configured to generate a completion signal for a message, when receiving a notification message indicating that a transmitted packet comprising an XID identifying that message is received for the first time.
  • the message may specifically be a RDMA Write operation or a RDMA Send operation.
  • this message can be signaled as completed. Possibly, later ACKs that potentially report the same message as received, do not trigger a completion signal (of the already completed message).
  • completion of a message happens regardless of the state of any previously issued messages, i.e., regardless of whether all previously issued messages have been completed or not.
  • This completion generation mechanism may be referred to as “out-of-order completion” or “Order-Agnostic completion” in this disclosure.
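The completion rule described above can be sketched as follows. This is a minimal sketch under stated assumptions: the `CompletionTracker` class and its method names are hypothetical, and each ACK is assumed to report a single PSN.

```python
class CompletionTracker:
    """Hypothetical initiator-side bookkeeping for Order-Agnostic completion."""

    def __init__(self):
        self.psn_to_xid = {}        # PSN of each packet sent -> XID it carried
        self.completed = set()      # XIDs already signaled as completed
        self.completion_order = []  # order in which completions were signaled

    def on_send(self, psn, xid):
        self.psn_to_xid[psn] = xid

    def on_ack(self, psn):
        """Return True if this ACK triggers a completion signal."""
        xid = self.psn_to_xid.pop(psn, None)
        if xid is None or xid in self.completed:
            return False            # unknown PSN, or message already completed
        self.completed.add(xid)
        self.completion_order.append(xid)
        return True


tracker = CompletionTracker()
for psn, xid in [(20, 300), (21, 301), (22, 302)]:
    tracker.on_send(psn, xid)

first = tracker.on_ack(22)    # message 302 completes before 300 and 301
second = tracker.on_ack(20)
tracker.on_send(23, 301)      # packet 21 presumed lost, resent with PSN 23
third = tracker.on_ack(23)
late = tracker.on_ack(21)     # a late ACK for the original copy is ignored
```

A completion is signaled the first time any copy of a message is acknowledged, independently of messages issued earlier.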
  • the transmitting device is further configured to: determine that the particular packet with a PSN X of the sequence of packets is missing at the receiving device when receiving a notification message indicating that a packet with a PSN logically larger than X is received at the receiving device before receiving a notification message indicating that the particular packet with the PSN X is received at the receiving device, from the receiving device, wherein X is a non-negative integer.
  • a retransmission scheme may have two complementary mechanisms: PSN-based and timer-based.
  • PSN-based retransmission may rely on a unique, strictly monotonically increasing PSN assigned to every transmitted packet, so that lost packets are quickly identified when a “gap” is detected in the PSNs reported back by a notification message.
  • each notification message indicates the transmitted packets with PSNs falling in a particular range.
  • each notification message indicates the transmitted packets that comprise the same flowlet ID.
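The PSN-based gap rule can be sketched as follows. This is an illustration only: the function names are hypothetical, and the 24-bit PSN width is an assumption borrowed from InfiniBand, since the disclosure does not state a width. "Logically larger" is interpreted here as a wraparound-aware comparison.

```python
PSN_BITS = 24                 # assumption: width borrowed from InfiniBand
PSN_MOD = 1 << PSN_BITS


def psn_gt(a, b):
    """True if PSN a is logically larger than PSN b, allowing for wraparound."""
    return a != b and ((a - b) % PSN_MOD) < PSN_MOD // 2


def detect_missing(outstanding, acked_psns):
    """Return outstanding PSNs that are unacknowledged although a logically
    larger PSN of the same flowlet has already been acknowledged."""
    acked = set(acked_psns)
    unacked = [p for p in outstanding if p not in acked]
    return [p for p in unacked if any(psn_gt(a, p) for a in acked)]


# Packets 20..23 are outstanding; ACKs arrive for 20 and 22 only.
gap = detect_missing([20, 21, 22, 23], [20, 22])

# The same rule across the wraparound point.
wrapped = detect_missing([PSN_MOD - 1, 0, 1], [1])
```

In the first call only PSN 21 is reported, because PSN 23 is not yet smaller than any acknowledged PSN and may simply still be in flight.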
  • the transmitting device is configured to start to track a time period after sending a packet, if the tracking of the time period has not been started; and reset the tracking of the time period after receiving a notification message.
  • Timer-based retransmission is similar to the traditional way of triggering an event of retransmission: if no notification message(s) are received during a configurable period of time, all outstanding packets are retransmitted.
  • the transmitting device is further configured to determine that a particular packet with a PSN X is missing at the receiving device when no notification message indicating that the packet with the PSN X is received at the receiving device is received before the time period expires.
  • the transmitting device is further configured to set the time period according to the receiving device.
  • a timer i.e., the same time period
  • the transmitting device 100 may maintain a single timer, i.e., set the same time period, for all the connections with different receivers.
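The timer behaviour can be sketched as follows. The `RetransmitTimer` class is hypothetical, time is passed in explicitly so that the example is deterministic, and stopping the timer when nothing remains outstanding is an assumption that the text does not spell out.

```python
class RetransmitTimer:
    """Hypothetical timer; one instance may serve a flowlet, a node pair,
    or all connections, depending on the resources available."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.deadline = None          # None means tracking has not started

    def on_send(self, now):
        # Start tracking only if tracking has not been started already.
        if self.deadline is None:
            self.deadline = now + self.timeout

    def on_notification(self, now, still_outstanding):
        # Reset tracking; assumed to stop when nothing is outstanding.
        self.deadline = now + self.timeout if still_outstanding else None

    def expired(self, now):
        # On expiry, all outstanding packets are treated as lost.
        return self.deadline is not None and now >= self.deadline


timer = RetransmitTimer(timeout=10)
timer.on_send(now=0)
timer.on_send(now=5)                  # does not restart the running timer
deadline_after_sends = timer.deadline
timer.on_notification(now=8, still_outstanding=True)
deadline_after_ack = timer.deadline
```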
  • the message carried in each packet is limited to fit in a single network maximum transmission unit.
  • a second aspect of the present disclosure provides a receiving device for RDMA, wherein the receiving device is configured to: receive a sequence of packets from a transmitting device for RDMA, wherein each received packet of the sequence of packets is associated with a packet sequence number, PSN, and carries a message; and transmit a notification message for the sequence of packets to the transmitting device, wherein the notification message indicates which packets of the sequence of packets are received at the receiving device.
  • Embodiments of the present disclosure further provide a receiving device that operates correspondingly to the transmitting device of the first aspect.
  • each received packet comprises an XID identifying the message carried in the packet.
  • XID identifies the message.
  • Each message is assigned with a unique XID.
  • each packet of the sequence of packets further comprises a flowlet ID identifying a flowlet, wherein the flowlet comprises a plurality of packets, which are routed through the same network route.
  • each notification message indicates packets that comprise the same flowlet ID.
  • a single notification message can report multiple packets belonging to the same flowlet.
  • the receiving device is further configured to: maintain a first data structure storing PSNs of received packets; and update the first data structure after receiving a packet from the transmitting device.
  • the receiving device may further record the XIDs carried in the received packets.
  • the receiving device is further configured to discard a received packet, if a PSN of that received packet cannot be recorded in the first data structure.
  • the receiving device is further configured to transmit the notification message based on the first data structure.
  • each notification message indicates packets having PSNs falling in a particular range.
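The first data structure of the receiving device can be sketched as a fixed-size bitmap over a PSN range. This is a minimal sketch: the `ReceiveWindow` class, the window size, and the dictionary layout of the notification are assumptions, and window sliding and PSN wraparound are left out.

```python
class ReceiveWindow:
    """Hypothetical per-flowlet bitmap of received PSNs."""

    def __init__(self, base_psn, size):
        self.base = base_psn
        self.bits = [False] * size

    def record(self, psn):
        """Record a received PSN; return False if it cannot be recorded,
        in which case the packet is discarded."""
        index = psn - self.base
        if not 0 <= index < len(self.bits):
            return False
        self.bits[index] = True
        return True

    def notification(self):
        """Build a notification covering the range up to the highest PSN seen."""
        highest = max((i for i, bit in enumerate(self.bits) if bit), default=-1)
        return {"first_psn": self.base, "received": self.bits[: highest + 1]}


window = ReceiveWindow(base_psn=20, size=8)
accepted = [window.record(psn) for psn in (20, 22, 23, 24, 26)]
rejected = window.record(40)          # outside the window: discarded
ack = window.notification()
```

The resulting `ack` reports a PSN range in which the entries for PSNs 21 and 25 are unset, which is the information the transmitting device needs to detect the gaps.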
  • a third aspect of the present disclosure provides a method for RDMA.
  • the method comprises: transmitting a sequence of packets to a receiving device for RDMA, wherein each transmitted packet of the sequence of packets is associated with a PSN, and carries a message; determining whether a particular packet of the sequence of packets is received at the receiving device, wherein the particular packet comprises a first message; and retransmitting that particular packet to the receiving device as the next step after determining that the particular packet is missing at the receiving device, wherein the retransmitted packet carries the same first message as the particular packet, and is associated with a new PSN.
  • the method of the third aspect and its implementation forms provide the same advantages and effects as described above for the transmitting device of the first aspect and its respective implementation forms.
  • a fourth aspect of the present disclosure provides a method for RDMA.
  • the method comprises: receiving a sequence of packets from a transmitting device for RDMA, wherein each received packet of the sequence of packets is associated with a packet sequence number, PSN, and comprises a transaction identifier, XID, identifying a message carried in the packet; and transmitting a notification message for the sequence of packets to the transmitting device, wherein the notification message indicates which packets of the sequence of packets are received at the receiving device.
  • the method of the fourth aspect and its implementation forms provide the same advantages and effects as described above for the receiving device of the second aspect and its respective implementation forms.
  • a fifth aspect of the present disclosure provides a computer program comprising a program code for carrying out, when implemented on a processor, the method according to any of the third aspect and its implementation forms, or any of the fourth aspect and its implementation forms.
  • FIG. 1 shows a transmitting device according to an embodiment of the disclosure.
  • FIG. 2 shows an exchange of packets between a transmitting device and a receiving device according to an embodiment of the disclosure.
  • FIG. 3 shows an exchange of packets between a transmitting device and a receiving device according to an embodiment of the disclosure.
  • FIG. 4 shows an exchange of packets between a transmitting device and a receiving device according to an embodiment of the disclosure.
  • FIG. 5 shows a receiving device according to an embodiment of the disclosure.
  • FIG. 6 shows a method according to an embodiment of the disclosure.
  • FIG. 7 shows a method according to an embodiment of the disclosure.
  • an embodiment/example may refer to other embodiments/examples.
  • any description including but not limited to terminology, element, process, explanation and/or technical advantage mentioned in one embodiment/example is applicable to the other embodiments/examples.
  • FIG. 1 shows a transmitting device 100 adapted for RDMA according to an embodiment of the disclosure.
  • the transmitting device 100 may comprise processing circuitry (not shown) configured to perform, conduct or initiate the various operations of the transmitting device 100 described herein.
  • the processing circuitry may comprise hardware and software.
  • the hardware may comprise analog circuitry or digital circuitry, or both analog and digital circuitry.
  • the digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or multipurpose processors.
  • the transmitting device 100 may further comprise memory circuitry, which stores one or more instruction(s) that can be executed by the processor or by the processing circuitry, in particular under control of the software.
  • the memory circuitry may comprise a non-transitory storage medium storing executable software code which, when executed by the processor or the processing circuitry, causes the various operations of the transmitting device 100 to be performed.
  • the processing circuitry comprises one or more processors and a non-transitory memory connected to the one or more processors.
  • the non-transitory memory may carry executable program code which, when executed by the one or more processors, causes the transmitting device 100 to perform, conduct or initiate the operations or methods described herein.
  • the transmitting device 100 is configured to transmit a sequence of packets 101 to a receiving device 200 adapted for RDMA.
  • each transmitted packet of the sequence of packets 101 is associated with a PSN and carries a message.
  • the transmitting device 100 is configured to determine whether a particular packet 1011 of the sequence of packets 101 is received at the receiving device 200.
  • the particular packet 1011 comprises a first message 1012.
  • the transmitting device 100 is configured to retransmit the particular packet to the receiving device 200 as the next step after determining that the particular packet is missing at the receiving device 200.
  • the retransmitted packet 102 carries the same first message 1012 as the particular packet 1011 and is associated with a new PSN.
  • Embodiments of this disclosure introduce a new RDMA retransmission scheme for reliable transport. It may be understood that such a retransmission scheme is not sufficient for applications that require messages to be signaled as completed while maintaining their relative order. However, many other applications which do not require messages to be signaled as completed in order could benefit from out-of-order faster completion signals.
  • each packet carries exactly one message. That is, the message carried in each packet is limited to fit in a single network maximum transmission unit.
  • As in most known protocols that guarantee a reliable transport, two different mechanisms are used together to identify lost packets: PSN-based and timer-based. Details are provided in the latter part of this disclosure.
  • an RDMA transaction involves an initiator node and a destination target node (or a target node).
  • the initiator node initiates or sends an RDMA operation request, and the target node receives the RDMA operation request and responds accordingly.
  • the transmitting device 100 shown in FIG. 1 may be considered as the initiator node, and the receiving device 200 shown in FIG. 1 may be considered as the target node.
  • each transmitted packet comprises a transaction identifier (XID) identifying the message carried in the packet.
  • the particular packet 1011 comprises a first XID identifying the first message 1012 carried in the particular packet 1011.
  • When the transmitting device 100 determines that the particular packet 1011 is missing, it retransmits a packet carrying the same message as the missing packet. That is, the retransmitted packet 102 comprises the same first XID as the particular packet 1011.
  • each packet (transmitted or retransmitted) carries two different numbers: an XID and a PSN.
  • the XID identifies the message.
  • Each message is assigned with a unique XID, by the sender, i.e., the transmitting device 100.
  • the retransmitted copy carries the same XID, but the PSN is different.
  • the PSN of the retransmitted packet is logically larger than the previous missing packet sent towards the same destination.
  • the XID number space for messages could be a continuous number space to simplify and reduce the required memory to store the “received XID” state at the target.
  • PSN number space could be a continuous number space to simplify and reduce the required memory to store the “outstanding PSNs” state at the initiator as well as to reduce the data structure for ACKs.
  • the transmitting device 100 may be configured to assign each transmitted packet and/or each retransmitted packet with a flowlet ID identifying a flowlet.
  • the flowlet comprises a plurality of packets, which are routed through the same network route.
  • each transmitted packet and/or retransmitted packet further comprises the flowlet ID.
  • the transmitting device 100 may be configured to transmit each transmitted packet and/or retransmitted packet to the receiving device 200 over the flowlet. All packets belonging to the same flowlet may be routed over the same network path. It is assumed here that different flowlets use different network paths.
  • ECMP flowlets are supported by dividing the stream of packets between an initiator node and a target node into sub-streams. Packets of different sub-streams are assumed to traverse the network using a different route.
  • the transmitting device 100 may maintain a separate (continuous) PSN space for each sub-stream, i.e., flowlet. That is, each flowlet is associated with a separate PSN space.
  • When a packet is transmitted or retransmitted, it is assigned to a flowlet and is assigned a PSN.
  • a transmitted or retransmitted packet from the transmitting device 100 may include the XID identifying a message, the PSN of the packet, and the ID of the flowlet over which this packet is transmitted.
  • the flowlet ID that is assigned to the retransmitted packet may be different from the flowlet ID that is assigned to the transmitted packet.
  • the transmitted packet and the retransmitted packet may be routed through different flowlets identified by the flowlet IDs of the transmitted packet and the retransmitted packet, respectively. That is, every packet can be transmitted over different network paths established between the sender and the receiver. It should be noted that when the transmitting device 100 decides that a packet is lost, it may retransmit the same message in a new packet over any flowlet. That is, the flowlet assigned to the retransmitted packet can be the same as the flowlet assigned to the missing packet.
  • however, it may be preferred to assign a retransmitted packet a different flowlet ID, and route the retransmitted packet through a different flowlet. Since the previous packet is missing, it may be a sign that the previous network route has some issues. Decision flexibility exists that allows the transmitting device 100 to dynamically choose a different flowlet to use for retransmission. This decision could be based on any combination of criteria, e.g., a “better” flowlet that is less loaded, faster, less congested, etc.
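Per-flowlet PSN spaces and the choice of a flowlet for retransmission can be sketched as follows. The `FlowletSender` class, the flowlet names, and the "fewest packets sent" load metric are assumptions chosen for illustration; the text leaves the selection criteria open.

```python
class FlowletSender:
    """Hypothetical sender keeping a separate, continuous PSN space per flowlet."""

    def __init__(self, flowlet_ids):
        self.next_psn = {f: 0 for f in flowlet_ids}
        self.load = {f: 0 for f in flowlet_ids}   # assumed metric: packets sent

    def send(self, xid, flowlet):
        psn = self.next_psn[flowlet]
        self.next_psn[flowlet] += 1
        self.load[flowlet] += 1
        return {"xid": xid, "flowlet": flowlet, "psn": psn}

    def retransmit(self, lost):
        # Prefer a different flowlet, since the old route may have a problem;
        # fall back to the same flowlet if it is the only one available.
        others = [f for f in self.next_psn if f != lost["flowlet"]]
        candidates = others or [lost["flowlet"]]
        target = min(candidates, key=lambda f: self.load[f])
        return self.send(lost["xid"], target)


sender = FlowletSender(["A", "B", "C"])
sender.send(xid=300, flowlet="A")
lost = sender.send(xid=301, flowlet="A")
sender.send(xid=302, flowlet="B")
resent = sender.retransmit(lost)      # goes to the least loaded other flowlet
```

The retransmitted packet keeps XID 301 but takes its PSN from the PSN space of the newly chosen flowlet.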
  • the transmitting device may be configured to receive a notification message 201 for one or more transmitted packets of the sequence of packets 101 from the receiving device 200.
  • the notification message 201 indicates whether the transmitted packets are received at the receiving device 200.
  • the transmitting device may be further configured to determine whether the particular packet of the sequence of packets 101 is received at the receiving device 200 according to the notification message 201.
  • the notification message 201 (also referred to as ACK) is sent from the receiver back to the sender for each received packet, regardless of whether this packet is accepted or not (e.g., it is not accepted when it carries a duplicate message).
  • a notification message 201 carries the PSN and the flowlet ID of the received packet.
  • a single notification message 201 can report multiple packets belonging to the same flowlet.
  • each notification message indicates the transmitted packets with PSNs falling in a particular range.
  • each notification message indicates the transmitted packets that comprise the same flowlet ID.
  • ACKs may not be cumulative. This means that each notification message 201 may report about a specific range of PSNs, all belonging to a specific flowlet.
  • ACKs of packets of the same flowlet may be routed over the same path.
  • all ACKs are transmitted back to the sender over the same network path.
  • the transmitting device 100 may be configured to generate a completion signal for a message, when receiving a notification message indicating that a transmitted packet comprising an XID identifying that message is received for the first time.
  • this message can be signaled as completed. Later ACKs, that potentially report the same message as received, do not trigger a completion signal (of the already completed message).
  • completion of a message happens regardless of the state of any previously issued messages, i.e., whether all previously issued messages have been completed or not.
  • This completion generation mechanism may be referred to as “out-of-order completion” or “Order-Agnostic completion”.
  • the transmitting device 100 may be configured to determine that the particular packet 1011 with a PSN X of the sequence of packets 101 is missing at the receiving device 200 when receiving a notification message indicating that a packet with a PSN logically larger than X is received at the receiving device 200 before receiving a notification message indicating that the particular packet 1011 with the PSN X is received at the receiving device 200, from the receiving device 200, wherein X is a non-negative integer.
  • the transmitting device 100 may be configured to start to track a time period after sending a packet, if the tracking of the time period has not been started. Then, the transmitting device 100 may be configured to reset the tracking of the time period after receiving a notification message.
  • the transmitting device 100 may be configured to determine that a particular packet 1011 with a PSN X is missing at the receiving device 200 when no notification message indicating that the packet with the PSN X is received at the receiving device 200 is received before the time period expires.
  • whether a packet is lost may be decided by the receiver side. For instance, the receiving device 200 may decide that a packet is lost and needs to be reported as such, if another packet is received over the same flowlet and its PSN is logically larger than the previous one.
  • the sender, i.e., the transmitting device 100, may decide that a packet was lost in at least one of the following cases:
  • it is informed by a notification message 201 (e.g., ACK, NAK, or SACK) that this packet was lost; or
  • a timeout event is detected on the flowlet, which indicates that all outstanding packets for this flowlet (i.e., packets that were transmitted but not yet ACKed) should be considered lost.
  • the transmitting device 100 may be configured to set the time period according to the receiving device 200. Best latency performance may be achieved with a timer per flowlet. To reduce required resources, a timer, i.e., the same time period, can be used for a pair of network nodes, i.e., the transmitting device 100 and the receiving device 200.
  • the transmitting device 100 may maintain a single timer, i.e., set the same time period, for all the connections with different receivers.
  • FIG. 2 shows a packet exchange between an initiator node, i.e., the transmitting device 100, and a target node, i.e., the receiving device 200, according to an embodiment of the disclosure.
  • the transmitting device 100 may be the transmitting device as shown in FIG. 1.
  • the receiving device 200 may be the receiving device as shown in FIG. 1.
  • FIG. 2 illustrates the retransmission of lost packets, detected based on a notification message 201.
  • the example uses ACKs that include information about multiple PSNs.
  • the size of this data structure is an implementation parameter detail. In this example, a single flowlet is considered, therefore the flowlet ID is omitted for brevity.
  • the transmitting device 100 sends a sequence of packets P20 to P26, i.e., the sequence of packets 101 as shown in FIG. 1, to the receiving device 200.
  • For each received packet, the receiving device 200 records it in a data structure, for example, a bitmap (the receiving device 200 may mark a received packet as “1” and mark a lost packet as “0”). It can be seen that, in the example shown in FIG. 2, when the receiving device 200 receives P22 but has still not received P21, it decides that P21 is lost and marks it as “0” in the bitmap. Later, when the receiving device 200 receives P26 but has still not received P25, it decides that P25 is lost.
  • the transmitting device 100 receives an ACK, e.g., the notification message 201, which notifies the transmitting device 100 that P21 and P25 are not received at the receiving device 200.
  • the transmitting device 100 transmits retransmission packet P29 which is the retransmission of P21, and retransmission packet P30 which is the retransmission of P25, to the receiving device 200.
  • P21 may be the particular missing packet 1011 that comprises the first message 1012, as shown in FIG. 1; the retransmission packet P29 may then be the retransmitted packet 102, which carries the same first message 1012 but with a new PSN (29).
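The FIG. 2 exchange can be traced with a short sketch that turns the bitmap of an ACK into retransmissions. Several details are assumptions made for illustration: the function name, the XID values (FIG. 2 as described gives none), and the next free PSN being 29, which presumes that PSNs 27 and 28 were used by other packets before the ACK arrived.

```python
def retransmissions_from_ack(first_psn, received, psn_to_xid, next_psn):
    """Map every '0' of the ACK bitmap to a new packet that keeps the XID
    of the lost packet but takes a fresh PSN."""
    resent = []
    for offset, ok in enumerate(received):
        if not ok:
            old_psn = first_psn + offset
            resent.append(
                {"xid": psn_to_xid[old_psn], "psn": next_psn, "replaces": old_psn}
            )
            next_psn += 1
    return resent


# Hypothetical XIDs 300..308 for the packets sent with PSNs 20..28.
psn_to_xid = {psn: 300 + (psn - 20) for psn in range(20, 29)}

# Bitmap reported for P20..P26: P21 and P25 are marked as lost.
ack_bitmap = [1, 0, 1, 1, 1, 0, 1]

resent = retransmissions_from_ack(20, ack_bitmap, psn_to_xid, next_psn=29)
```

Under these assumptions the two lost messages are resent as P29 and P30, in line with the description above.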
  • FIG. 3 shows another signaling flowchart between an initiator node, i.e., the transmitting device 100, and a target node, i.e., the receiving device 200, according to an embodiment of the disclosure.
  • the transmitting device 100 may be the transmitting device as shown in FIG. 1.
  • the receiving device 200 may be the receiving device as shown in FIG. 1.
  • a packet loss is illustrated.
  • the sender side considers a packet as missing: if the packet is lost on its way to the receiver, i.e., it fails to reach the receiver; or it does reach the receiver, but a notification message 201 for this packet is lost on the way to the sender.
  • FIG. 3 emphasizes the case where the XID of a packet does not change during retransmission, but the PSN does.
  • packet P21 does not arrive at the receiving device 200.
  • When the transmitting device 100 receives a notification message A22 from the receiving device 200, which acknowledges the reception of P22, but has still not received a notification message 201 acknowledging the reception of P21, the transmitting device 100 determines that P21 is lost and quickly transmits P24, which carries the same message as P21.
  • the XID of the retransmitted packet P24 is still “301”, the same as the XID of the lost packet P21.
  • the transmitting device 100 then continues with the transmission of the normal packets.
  • FIG. 4 shows another signaling flowchart between an initiator node, i.e., the transmitting device 100, and a target node, i.e., the receiving device 200, according to an embodiment of the disclosure.
  • FIG. 4 illustrates another loss case, where a notification message 201 is lost.
  • the transmitting device 100 transmits packets P20 to P23 to the receiving device 200. Although all packets P20 to P23 arrive at the receiving device 200 and the receiving device 200 sends a notification message 201 (ACK) for each of P20 to P23, A22 is lost and does not arrive at the transmitting device 100.
  • When the transmitting device 100 receives ACK A23 from the receiving device 200 but has still not received ACK A22, the transmitting device 100 considers that P22 is “lost” and quickly transmits P25, which carries the same message as P22. As shown in FIG. 4, the XID of the retransmitted packet P25 is still “302”, the same as the XID of the packet P22.
  • PSN-based retransmission may be based on a unique, strictly monotonically increasing PSN assigned to every transmitted packet, which allows lost packets to be identified quickly when a “gap” is detected in the PSNs reported back by a notification message 201.
  • Timer-based retransmission is similar to the traditional way of triggering an event of retransmission: if no notification message(s) 201 are received during a configurable period of time, all outstanding packets are retransmitted. Outstanding packets are defined as the packets that were transmitted but for which no notification message 201 has been received yet.
  • FIG. 5 shows a receiving device 200 adapted for RDMA according to an embodiment of the disclosure.
  • the receiving device 200 may comprise processing circuitry (not shown) configured to perform, conduct or initiate the various operations of the receiving device 200 described herein.
  • the processing circuitry may comprise hardware and software.
  • the hardware may comprise analog circuitry or digital circuitry, or both analog and digital circuitry.
  • the digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or multi-purpose processors.
  • the receiving device 200 may further comprise memory circuitry, which stores one or more instruction(s) that can be executed by the processor or by the processing circuitry, in particular under control of the software.
  • the memory circuitry may comprise a non-transitory storage medium storing executable software code which, when executed by the processor or the processing circuitry, causes the various operations of the receiving device 200 to be performed.
  • the processing circuitry comprises one or more processors and a non-transitory memory connected to the one or more processors.
  • the non-transitory memory may carry executable program code which, when executed by the one or more processors, causes the receiving device 200 to perform, conduct or initiate the operations or methods described herein.
  • the receiving device 200 is configured to receive a sequence of packets 101 from a transmitting device 100 for RDMA.
  • the transmitting device 100 here may be the transmitting device 100 shown in FIG. 1.
  • each received packet of the sequence of packets 101 is associated with a PSN, and carries a message.
  • the receiving device 200 is configured to transmit a notification message 201 for the sequence of packets 101 to the transmitting device 100, wherein the notification message 201 indicates which packets of the sequence of packets 101 are received at the receiving device 200.
  • Embodiments of the present disclosure further provide a receiving device 200 that operates according to the transmitting device 100 as previously described in this disclosure.
  • a notification message 201 supports aggregating notifications for multiple packets and supports “selective ACK” (SACK) to report both received and lost packets.
  • the number of such reported packets, i.e. the size of the ACK data structure, is an implementation parameter.
  • each received packet comprises an XID identifying the message carried in the packet.
  • each packet of the sequence of packets 101 further comprises a flowlet ID identifying a flowlet, wherein the flowlet comprises a plurality of packets, which are routed through the same network route.
  • each notification message 201 indicates packets that comprise the same flowlet ID.
  • the receiving device 200 may be configured to maintain a first data structure storing PSNs of received packets.
  • the receiving device 200 may be further configured to update the first data structure after receiving a packet from the transmitting device 100.
  • the receiving device 200 may be configured to discard a received packet, if a PSN of that received packet cannot be recorded in the first data structure. For instance, if there are no resources at the receiving device 200 to store or mark the new PSN, then the packet may be discarded.
  • the receiving device 200 may be configured to transmit the notification message based on the first data structure.
  • the receiving device 200 may maintain a state of “received XIDs”. If a message is received for the first time, its XID is stored in this state. If another copy of this message is received later, it is not accepted. That is, the receiving device 200 may be further configured to determine for each received packet whether it is a duplicated packet, by checking whether the XID comprised in the received packet has already been stored in this state, and to ignore any duplicated packet. Possibly, these duplicates are only acknowledged but not executed again. In this way, each message is guaranteed to be accepted by the receiving device 200 exactly once.
  • each notification message 201 indicates packets having PSNs falling in a particular range.
  • embodiments of this disclosure enable out-of-order completion.
  • This disclosure uses both an XID and a PSN for each transmitted packet. Packets that are retransmitted carry the same XID but a different PSN. This accelerates the detection of lost retransmitted packets, so that “fast retransmit” occurs for all lost packets, even those that have already been retransmitted. “Fast retransmit” means that there is no need to wait for a timeout event. Further, this disclosure enables retransmission of a packet over any flowlet, regardless of the flowlet it was last transmitted over.
  • FIG. 6 shows a method 600 for RDMA according to an embodiment of the disclosure.
  • the method 600 is performed by a transmitting device 100 shown in FIG. 1 or FIG. 5.
  • the method 600 comprises a step 601 of transmitting a sequence of packets 101 to a receiving device 200 for RDMA.
  • each transmitted packet of the sequence of packets 101 is associated with a PSN and carries a message.
  • the method 600 further comprises a step 602 of determining whether a particular packet 1011 of the sequence of packets 101 is received at the receiving device 200, wherein the particular packet 1011 comprises a first message 1012.
  • the method 600 further comprises a step 603 of retransmitting the particular packet to the receiving device 200 as the next step after determining that the particular packet is missing at the receiving device 200, wherein the retransmitted packet 102 carries the same first message 1012 as the particular packet 1011, and is associated with a new PSN.
  • FIG. 7 shows a method 700 for RDMA according to an embodiment of the disclosure.
  • the method 700 is performed by a receiving device 200 shown in FIG. 1 or FIG. 5.
  • the method 700 comprises a step 701 of receiving a sequence of packets 101 from a transmitting device 100 for RDMA.
  • each received packet of the sequence of packets 101 is associated with a PSN, and comprises an XID identifying a message carried in the packet.
  • the method 700 further comprises a step 702 of transmitting a notification message 201 for the sequence of packets 101 to the transmitting device 100, wherein the notification message 201 indicates which packets of the sequence of packets 101 are received at the receiving device 200.
  • any method according to embodiments of the disclosure may be implemented in a computer program, having code means, which when run by processing means causes the processing means to execute the steps of the method.
  • the computer program is included in a computer readable medium of a computer program product.
  • the computer readable medium may comprise essentially any memory, such as a ROM (Read-Only Memory), a PROM (Programmable Read-Only Memory), an EPROM (Erasable PROM), a Flash memory, an EEPROM (Electrically Erasable PROM), or a hard disk drive.
  • embodiments of the transmitting device 100 or the receiving device 200 comprise the necessary communication capabilities in the form of, e.g., functions, means, units, elements, etc., for performing the solution.
  • means, units, elements and functions are: processors, memory, buffers, control logic, encoders, decoders, rate matchers, de-rate matchers, mapping units, multipliers, decision units, selecting units, switches, interleavers, de-interleavers, modulators, demodulators, inputs, outputs, antennas, amplifiers, receiver units, transmitter units, DSPs, trellis-coded modulation (TCM) encoder, TCM decoder, power supply units, power feeders, communication interfaces, communication protocols, etc.
  • the processor(s) of the transmitting device 100, or the receiving device 200 may comprise, e.g., one or more instances of a Central Processing Unit (CPU), a processing unit, a processing circuit, a processor, an Application Specific Integrated Circuit (ASIC), a microprocessor, or other processing logic that may interpret and execute instructions.
  • microprocessor may thus represent a processing circuitry comprising a plurality of processing circuits, such as, e.g., any, some or all of the ones mentioned above.
  • the processing circuitry may further perform data processing functions for inputting, outputting, and processing of data comprising data buffering and device control functions, such as call processing control, user interface control, or the like.

Abstract

The present disclosure relates to devices and methods for RDMA. Specifically, the disclosure proposes a transmitting device and a receiving device. The transmitting device is configured to transmit a sequence of packets to a receiving device for RDMA, wherein each transmitted packet of the sequence of packets is associated with a packet sequence number, PSN, and carries a message; determine whether a particular packet of the sequence of packets is received at the receiving device, wherein the particular packet comprises a first message; and retransmit the particular packet to the receiving device as the next step after determining that the particular packet is missing at the receiving device, wherein the retransmitted packet carries the same first message as the particular packet, and is associated with a new PSN. The receiving device is configured to operate accordingly.

Description

A DEVICE AND METHOD FOR REMOTE DIRECT MEMORY ACCESS
TECHNICAL FIELD
The present disclosure relates to high-performance computing technologies, in particular, to Remote Direct Memory Access (RDMA) technologies. The disclosure is concerned with transporting RDMA transactions over a packet-based network. To this end, the present disclosure provides a device, a method, and a data packet format for RDMA.
BACKGROUND
Applications often require communication with other applications or other computational resources over a network fabric. The delivery of the exchanged data packets is not always guaranteed, as some packets might be dropped by the network switches while traversing the network. It is also possible for packets to be dropped by the remote computational resource for various reasons.
To guarantee the delivery of data packets, a reliable transport protocol is typically used, such as Transmission Control Protocol (TCP), Quick UDP Internet Connections (QUIC), or RDMA-Reliable Connection (RDMA-RC). Such reliable protocols identify and retransmit lost packets. Each retransmitted packet consumes extra bandwidth resources, and unnecessary retransmissions should therefore be avoided.
To allow the sender to identify lost packets, the receiver sends acknowledgments (ACKs) for the received packets. Sometimes negative acknowledgments (NAKs) are also used. The various retransmission schemes are sometimes classified according to their operations into “Stop and Wait”, “Go-Back-N” and “Selective-Repeat”, for example.
RDMA is a technology that allows applications to perform memory access operations on remote memory installed in a remote network node. RDMA-RC provides reliable transport of data, and it is implemented in the RDMA Network Interface Card (RNIC) device, thus allowing a network node to perform such memory access operations without involving the operating system or the node’s main Central Processing Unit (CPU). RDMA is becoming widely used in modern data centers and by computer clusters, as it provides low-latency remote memory access operations together with high network bandwidth.
There are two common RDMA technologies: one that was defined in the InfiniBand specification, and another that was defined by the Internet Engineering Task Force (IETF). Specifically, InfiniBand RDMA has two variants that allow it to run over IP/Ethernet networks, namely RoCE and RoCEv2.
RDMA-RC in InfiniBand uses the relatively simple Go-Back-N (GBN) retransmission algorithm, in which many packets are retransmitted following an identified event of a lost packet. Moreover, in RDMA-RC, messages are signaled as completed in the same order in which they were posted by the software layer. Current RDMA-RC does not support any out-of-order, faster completion signaling.
SUMMARY
In view of the above, embodiments of the disclosure aim to provide a retransmission scheme for reliable transport. In particular, an objective is to propose a retransmission scheme that guarantees that all transmitted packets are processed exactly once by the receiver. One aim is to enable signaling the completion of messages in a different order (in time) than that in which they were issued. Another aim is to enable the easy application of Equal-Cost Multi-Path (ECMP) flowlets for accelerating message execution and completion.
These and other objectives are achieved by the embodiments of the disclosure as described in the enclosed independent claims. Advantageous implementations of the embodiments of the disclosure are further defined in the dependent claims.
A first aspect of the present disclosure provides a transmitting device for RDMA. The transmitting device is configured to: transmit a sequence of packets to a receiving device for RDMA, wherein each transmitted packet of the sequence of packets is associated with a packet sequence number (PSN), and carries a message; determine whether a particular packet of the sequence of packets is received at the receiving device, wherein the particular packet comprises a first message; and retransmit the particular packet to the receiving device as the next step after determining that the particular packet is missing at the receiving device, wherein the retransmitted packet carries the same first message as the particular packet, and is associated with a new PSN.
Embodiments of this disclosure introduce a new RDMA retransmission scheme for reliable transport, referred to as Order-Agnostic retransmission. This disclosure enables out-of-order or faster completion signals. In this disclosure, each packet carries exactly one message. Notably, the message mentioned here may be a work queue element (WQE), which is an RDMA operation or transaction that is posted or issued by a source upper-layer protocol (ULP) (e.g., an application or software) and is pushed into a queue pair (QP).
In an implementation form of the first aspect, each transmitted packet comprises a transaction identifier (XID), identifying the message carried in the packet; the particular packet comprises a first XID identifying the first message carried in the particular packet; and the retransmitted packet comprises the same first XID as the particular packet.
In particular, each packet (transmitted or retransmitted) carries two different numbers: an XID and a PSN. The XID identifies the message; each message is thus assigned a unique XID by the sender. When a packet is identified as lost, it will be immediately retransmitted. The retransmitted copy carries the same XID, but the PSN is different. Notably, the PSN of the retransmitted packet is logically larger than that of the previous missing packet sent towards the same destination.
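By way of a non-limiting illustration, the relationship between XID and PSN described above may be sketched as follows. The names `Packet` and `PacketFactory` and the use of simple unbounded counters are assumptions made for this sketch only; they are not taken from the disclosure.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Packet:
    xid: int        # transaction identifier: names the message, stable across retransmissions
    psn: int        # packet sequence number: fresh for every transmission or retransmission
    message: bytes  # the single message carried by this packet


class PacketFactory:
    """Hypothetical helper that stamps packets sent towards one destination."""

    def __init__(self) -> None:
        self._next_xid = count(0)   # assumed: XIDs are handed out sequentially
        self._next_psn = count(0)   # assumed: PSNs are handed out sequentially

    def new_packet(self, message: bytes) -> Packet:
        # A new message receives a new XID and the next PSN.
        return Packet(next(self._next_xid), next(self._next_psn), message)

    def retransmit(self, lost: Packet) -> Packet:
        # Same XID and same message, but a new (larger) PSN.
        return Packet(lost.xid, next(self._next_psn), lost.message)


factory = PacketFactory()
p0 = factory.new_packet(b"write A")
p1 = factory.new_packet(b"write B")
r0 = factory.retransmit(p0)   # p0 is assumed to have been identified as lost
print(p0, p1, r0, sep="\n")
```

In this sketch the retransmitted copy `r0` shares the XID of `p0` while its PSN is larger than that of every packet sent before it.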
In an implementation form of the first aspect, the transmitting device is further configured to assign each transmitted packet and/or each retransmitted packet with a flowlet ID identifying a flowlet, wherein the flowlet comprises a plurality of packets, which are routed through the same network route, and each transmitted packet and/or retransmitted packet further comprises the flowlet ID; and transmit each transmitted packet and/or retransmitted packet to the receiving device over the flowlet.
This disclosure may thus further support ECMP flowlets. Possibly, the transmitting device may maintain a separate (continuous) PSN space for each sub-stream, i.e., flowlet. That is, each flowlet is associated with a separate PSN space. When a packet is transmitted or retransmitted, it is assigned to a flowlet and is assigned a PSN.
In an implementation form of the first aspect, the flowlet ID that is assigned to the retransmitted packet is different from the flowlet ID that is assigned to the transmitted packet, wherein the transmitted packet and the retransmitted packet are routed through different flowlets identified by the flowlet IDs of the transmitted packet and the retransmitted packet, respectively.
It should be noted that when the transmitting device determines that a packet is lost, it may retransmit the same message in a new packet over any flowlet. However, it may be preferred to assign a retransmitted packet a different flowlet ID, and route the retransmitted packet through a different flowlet.
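As a hedged sketch of the per-flowlet PSN spaces and of retransmission over a different flowlet, the following may be considered. The class `FlowletSender` and the round-robin choice of the new flowlet are assumptions of this sketch; the disclosure only states that any flowlet may be used and that a different one may be preferred.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    xid: int          # identifies the message
    flowlet_id: int   # identifies the flowlet (network route) used
    psn: int          # sequence number within that flowlet's own PSN space


class FlowletSender:
    """Hypothetical sender keeping one PSN counter per flowlet."""

    def __init__(self, num_flowlets: int) -> None:
        self.num_flowlets = num_flowlets
        self.next_psn = defaultdict(int)   # flowlet_id -> next unused PSN

    def _stamp(self, xid: int, flowlet_id: int) -> Packet:
        psn = self.next_psn[flowlet_id]
        self.next_psn[flowlet_id] += 1
        return Packet(xid, flowlet_id, psn)

    def send(self, xid: int, flowlet_id: int) -> Packet:
        return self._stamp(xid, flowlet_id)

    def retransmit(self, lost: Packet) -> Packet:
        # Assumed policy: move to the next flowlet in round-robin order.
        # The XID is kept; the PSN comes from the new flowlet's PSN space.
        new_flowlet = (lost.flowlet_id + 1) % self.num_flowlets
        return self._stamp(lost.xid, new_flowlet)


sender = FlowletSender(num_flowlets=2)
a = sender.send(xid=301, flowlet_id=0)
b = sender.send(xid=302, flowlet_id=0)
c = sender.send(xid=303, flowlet_id=1)
r = sender.retransmit(a)   # a is assumed to have been identified as lost
print(a, b, c, r, sep="\n")
```

Because each flowlet has its own counter, the retransmitted packet `r` takes the next PSN of flowlet 1 and is unaffected by the PSNs already used on flowlet 0.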
In an implementation form of the first aspect, the transmitting device is configured to receive a notification message for one or more transmitted packets of the sequence of packets from the receiving device, wherein the notification message indicates whether the transmitted packets are received at the receiving device; and determine whether the particular packet of the sequence of packets is received at the receiving device according to the notification message.
Notably, the notification message may be an ACK, which carries the PSN and the flowlet ID of the received packet. Optionally, a single notification message can report multiple packets belonging to the same flowlet.
In an implementation form of the first aspect, the transmitting device is further configured to generate a completion signal for a message, when receiving a notification message indicating that a transmitted packet comprising an XID identifying that message is received for the first time.
In this case, the message may specifically be an RDMA Write operation or an RDMA Send operation. When the transmitting device is informed by a notification message that a message (i.e., carried in a packet) was received, this message can be signaled as completed. Possibly, later ACKs that potentially report the same message as received do not trigger a completion signal (of the already completed message). It should be understood that completion of a message happens regardless of the state of any previously issued messages, i.e., regardless of whether all previously issued messages have been completed or not. This completion generation mechanism may be referred to as “out-of-order completion” or “Order-Agnostic completion” in this disclosure.
In an implementation form of the first aspect, the transmitting device is further configured to: determine that the particular packet with a PSN X of the sequence of packets is missing at the receiving device when receiving, from the receiving device, a notification message indicating that a packet with a PSN logically larger than X is received at the receiving device before receiving a notification message indicating that the particular packet with the PSN X is received at the receiving device, wherein X is a non-negative integer.
Notably, a retransmission scheme may have two complementary mechanisms: PSN-based and timer-based. PSN-based retransmission may be based on a unique, strictly monotonically increasing PSN assigned to every transmitted packet, which allows lost packets to be identified quickly when a “gap” is detected in the PSNs reported back by a notification message.
In an implementation form of the first aspect, each notification message indicates the transmitted packets with PSNs falling in a particular range.
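The two sender-side rules described above (a message is completed on the first notification for its XID, and a packet with PSN X is treated as missing once a logically larger PSN is acknowledged first) may be sketched together as follows. This is an illustrative sketch only: `AckTracker` is a hypothetical name, PSNs are compared as plain integers without wraparound, one ACK reports one PSN, and the XIDs 300 and 303 are assumed values chosen to fit around the XIDs 301 and 302 used for P21 and P22.

```python
from __future__ import annotations


class AckTracker:
    """Hypothetical sender-side state for one PSN space (e.g., one flowlet)."""

    def __init__(self) -> None:
        self.outstanding: dict[int, int] = {}   # PSN -> XID, sent but not acknowledged
        self.completed: set[int] = set()        # XIDs already signaled as completed

    def on_send(self, psn: int, xid: int) -> None:
        self.outstanding[psn] = xid

    def on_ack(self, acked_psn: int) -> tuple[list[int], list[int]]:
        """Process an ACK for one PSN.

        Returns (XIDs newly completed, XIDs whose packets are considered lost).
        """
        newly_completed: list[int] = []
        xid = self.outstanding.pop(acked_psn, None)
        if xid is not None and xid not in self.completed:
            # First report for this message: complete it, whatever the state
            # of earlier messages (out-of-order completion).
            self.completed.add(xid)
            newly_completed.append(xid)

        # PSN gap: every still-outstanding PSN below the acknowledged one
        # is treated as missing.
        to_retransmit: list[int] = []
        for psn in sorted(p for p in self.outstanding if p < acked_psn):
            lost_xid = self.outstanding.pop(psn)
            if lost_xid not in self.completed:
                to_retransmit.append(lost_xid)
        return newly_completed, to_retransmit


tracker = AckTracker()
for psn, xid in [(20, 300), (21, 301), (22, 302), (23, 303)]:
    tracker.on_send(psn, xid)

events = [tracker.on_ack(20), tracker.on_ack(22)]   # the ACK for PSN 21 never arrives
tracker.on_send(24, 301)                            # retransmission: same XID, new PSN
events += [tracker.on_ack(23), tracker.on_ack(24)]
print(events)
```

In this run the messages complete in the order 300, 302, 303, 301, i.e., message 301 completes last although it was issued second.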
In an implementation form of the first aspect, each notification message indicates the transmitted packets that comprise the same flowlet ID.
In an implementation form of the first aspect, the transmitting device is configured to start to track a time period after sending a packet, if the tracking of the time period has not been started; and reset the tracking of the time period after receiving a notification message.
Timer-based retransmission is similar to the traditional way of triggering an event of retransmission: if no notification message(s) are received during a configurable period of time, all outstanding packets are retransmitted.
In an implementation form of the first aspect, the transmitting device is further configured to determine that a particular packet with a PSN X is missing at the receiving device when no notification message indicating that the packet with the PSN X is received at the receiving device is received before the time period expires.
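A minimal sketch of the timer-based mechanism, under stated assumptions, is given below. The class `RetransmitTimer` is hypothetical, time is passed in explicitly as a number of seconds so that the behavior is deterministic, and stopping the timer when nothing is outstanding is an assumption of this sketch rather than a statement of the disclosure.

```python
from __future__ import annotations


class RetransmitTimer:
    """Hypothetical single timer covering all outstanding packets."""

    def __init__(self, period: float) -> None:
        self.period = period                 # configurable time period, in seconds
        self.deadline: float | None = None   # None means the timer is not running
        self.outstanding: set[int] = set()   # PSNs sent but not yet acknowledged

    def on_send(self, psn: int, now: float) -> None:
        self.outstanding.add(psn)
        if self.deadline is None:            # start tracking only if not yet started
            self.deadline = now + self.period

    def on_notification(self, acked_psns: set[int], now: float) -> None:
        self.outstanding -= acked_psns
        # Reset the tracking on every notification message.
        # Assumption: the timer stops when no packet is outstanding.
        self.deadline = now + self.period if self.outstanding else None

    def expired(self, now: float) -> list[int]:
        """Return the PSNs to treat as missing if the period has elapsed."""
        if self.deadline is None or now < self.deadline:
            return []
        missing = sorted(self.outstanding)   # all outstanding packets are retransmitted
        self.outstanding.clear()
        self.deadline = None
        return missing


timer = RetransmitTimer(period=1.0)
timer.on_send(20, now=0.0)                 # starts the timer, deadline at 1.0
timer.on_send(21, now=0.4)                 # timer already running, deadline unchanged
timer.on_notification({20}, now=0.5)       # reset, deadline moves to 1.5
early = timer.expired(now=1.2)
late = timer.expired(now=1.5)
print(early, late)
```

The caller would retransmit each PSN returned by `expired` in a new packet with the same XID and a new PSN, and register it again through `on_send`.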
In an implementation form of the first aspect, the transmitting device is further configured to set the time period according to the receiving device.
Notably, to reduce required resources, a timer (i.e., the same time period) can be used for a pair of network nodes. Optionally, to further reduce required resources and overhead, the transmitting device 100 may maintain a single timer, i.e., set the same time period, for all the connections with different receivers.
In an implementation form of the first aspect, the message carried in each packet is limited to fit in a single network maximum transmission unit.
A second aspect of the present disclosure provides a receiving device for RDMA, wherein the receiving device is configured to: receive a sequence of packets from a transmitting device for RDMA, wherein each received packet of the sequence of packets is associated with a packet sequence number, PSN, and carries a message; and transmit a notification message for the sequence of packets to the transmitting device, wherein the notification message indicates which packets of the sequence of packets are received at the receiving device.
Embodiments of the present disclosure further provide a receiving device that operates in accordance with the transmitting device of the first aspect.
In an implementation form of the second aspect, each received packet comprises an XID identifying the message carried in the packet.
Notably, the XID identifies the message. Each message is assigned a unique XID.
In an implementation form of the second aspect, each packet of the sequence of packets further comprises a flowlet ID identifying a flowlet, wherein the flowlet comprises a plurality of packets, which are routed through the same network route.
In an implementation form of the second aspect, each notification message indicates packets that comprise the same flowlet ID.
Optionally, a single notification message can report multiple packets belonging to the same flowlet.
In an implementation form of the second aspect, the receiving device is further configured to: maintain a first data structure storing PSNs of received packets; and update the first data structure after receiving a packet from the transmitting device.
Optionally, the receiving device may further record the XIDs carried in the received packets.
In an implementation form of the second aspect, the receiving device is further configured to discard a received packet, if a PSN of that received packet cannot be recorded in the first data structure.
In an implementation form of the second aspect, the receiving device is further configured to transmit the notification message based on the first data structure.
In an implementation form of the second aspect, each notification message indicates packets having PSNs falling in a particular range.
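The receiver-side implementation forms above (maintaining a first data structure of received PSNs, discarding a packet whose PSN cannot be recorded, and building the notification message from that data structure for a particular PSN range) may be illustrated by the following sketch. The sliding-window layout, the name `ReceiveWindow`, and the report format are assumptions of this sketch and are not prescribed by the disclosure.

```python
from __future__ import annotations


class ReceiveWindow:
    """Hypothetical fixed-size record of received PSNs for one flowlet."""

    def __init__(self, size: int, base_psn: int = 0) -> None:
        self.size = size                  # how many PSNs can be tracked at once
        self.base = base_psn              # lowest PSN not yet received
        self.received = [False] * size    # received[i] is True if PSN base + i arrived

    def on_packet(self, psn: int) -> bool:
        """Record a PSN; return False if the packet has to be discarded."""
        offset = psn - self.base
        if offset >= self.size:
            return False                  # no room to record this PSN: discard
        if offset >= 0:
            self.received[offset] = True
            # Slide the window past the contiguous run of received PSNs.
            while self.received[0]:
                self.received.pop(0)
                self.received.append(False)
                self.base += 1
        return True                       # PSNs below the base are already recorded

    def notification(self) -> tuple[int, list[int]]:
        """Report (base, received PSNs at or above base) for the tracked range.

        All PSNs below the base are implicitly reported as received.
        """
        beyond = [self.base + i for i, got in enumerate(self.received) if got]
        return self.base, beyond


window = ReceiveWindow(size=4, base_psn=20)
accepted = [window.on_packet(20), window.on_packet(22), window.on_packet(25)]
first_report = window.notification()      # PSN 21 is still missing
window.on_packet(21)
second_report = window.notification()
print(accepted, first_report, second_report)
```

Here the packet with PSN 25 is discarded because it falls outside the four PSNs the window can record, and the first report exposes the gap at PSN 21 in a selective-ACK style.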
A third aspect of the present disclosure provides a method for RDMA. The method comprises: transmitting a sequence of packets to a receiving device for RDMA, wherein each transmitted packet of the sequence of packets is associated with a PSN, and carries a message; determining whether a particular packet of the sequence of packets is received at the receiving device, wherein the particular packet comprises a first message; and retransmitting that particular packet to the receiving device as the next step after determining that the particular packet is missing at the receiving device, wherein the retransmitted packet carries the same first message as the particular packet, and is associated with a new PSN.
The method of the third aspect and its implementation forms provide the same advantages and effects as described above for the transmitting device of the first aspect and its respective implementation forms.
A fourth aspect of the present disclosure provides a method for RDMA. The method comprises: receiving a sequence of packets from a transmitting device for RDMA, wherein each received packet of the sequence of packets is associated with a packet sequence number, PSN, and comprises a transaction identifier, XID, identifying a message carried in the packet; and transmitting a notification message for the sequence of packets to the transmitting device, wherein the notification message indicates which packets of the sequence of packets are received at the receiving device.
The method of the fourth aspect and its implementation forms provide the same advantages and effects as described above for the receiving device of the second aspect and its respective implementation forms.
A fifth aspect of the present disclosure provides a computer program comprising a program code for carrying out, when implemented on a processor, the method according to any of the third aspect and its implementation forms, or any of the fourth aspect and its implementation forms.
It has to be noted that all devices, elements, units, and means described in the present application could be implemented in software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application as well as the functionalities described to be performed by the various entities are intended to mean that the respective transmitting device is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that transmitting device that performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof.
BRIEF DESCRIPTION OF DRAWINGS
The above described aspects and implementation forms will be explained in the following description of specific embodiments in relation to the enclosed drawings, in which
FIG. 1 shows a transmitting device according to an embodiment of the disclosure.
FIG. 2 shows an exchange of packets between a transmitting device and a receiving device according to an embodiment of the disclosure.
FIG. 3 shows an exchange of packets between a transmitting device and a receiving device according to an embodiment of the disclosure.
FIG. 4 shows an exchange of packets between a transmitting device and a receiving device according to an embodiment of the disclosure.
FIG. 5 shows a receiving device according to an embodiment of the disclosure.
FIG. 6 shows a method according to an embodiment of the disclosure.
FIG. 7 shows a method according to an embodiment of the disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
Illustrative embodiments of method, device, and program product for an Order-Agnostic retransmission scheme are described with reference to the figures. Although this description provides a detailed example of possible implementations, it should be noted that the details are intended to be exemplary and in no way limit the scope of the application.
Moreover, an embodiment/example may refer to other embodiments/examples. For example, any description including but not limited to terminology, element, process, explanation and/or technical advantage mentioned in one embodiment/example is applicable to the other embodiments/examples.
FIG. 1 shows a transmitting device 100 adapted for RDMA according to an embodiment of the disclosure. The transmitting device 100 may comprise processing circuitry (not shown) configured to perform, conduct or initiate the various operations of the transmitting device 100 described herein. The processing circuitry may comprise hardware and software. The hardware may comprise analog circuitry or digital circuitry, or both analog and digital circuitry. The digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or multi-purpose processors. The transmitting device 100 may further comprise memory circuitry, which stores one or more instruction(s) that can be executed by the processor or by the processing circuitry, in particular under control of the software. For instance, the memory circuitry may comprise a non-transitory storage medium storing executable software code which, when executed by the processor or the processing circuitry, causes the various operations of the transmitting device 100 to be performed. In one embodiment, the processing circuitry comprises one or more processors and a non-transitory memory connected to the one or more processors. The non-transitory memory may carry executable program code which, when executed by the one or more processors, causes the transmitting device 100 to perform, conduct or initiate the operations or methods described herein.
In particular, the transmitting device 100 is configured to transmit a sequence of packets 101 to a receiving device 200 adapted for RDMA. In particular, each transmitted packet of the sequence of packets 101 is associated with a PSN and carries a message. Then, the transmitting device 100 is configured to determine whether a particular packet 1011 of the sequence of packets 101 is received at the receiving device 200. The particular packet 1011 comprises a first message 1012. Further, the transmitting device 100 is configured to retransmit the particular packet to the receiving device 200 as the next step after determining that the particular packet is missing at the receiving device 200. In particular, the retransmitted packet 102 carries the same first message 1012 as the particular packet 1011 and is associated with a new PSN.
Embodiments of this disclosure introduce a new RDMA retransmission scheme for reliable transport. It may be understood that such a retransmission scheme is not sufficient for applications that require messages to be signaled as completed while maintaining their relative order. However, many other applications which do not require messages to be signaled as completed in order could benefit from out-of-order faster completion signals. In this disclosure, each packet carries exactly one message. That is, the message carried in each packet is limited to fit in a single network maximum transmission unit.
Further, as in most known protocols that guarantee a reliable transport, two different mechanisms are used together to identify lost packets: PSN-based and timer-based. Details are provided in the latter part of this disclosure.
Notably, an RDMA transaction involves an initiator node and a destination target node (or a target node). The initiator node initiates or sends an RDMA operation request, and the target node receives the RDMA operation request and responds accordingly. The transmitting device 100 shown in FIG. 1 may be considered as the initiator node, and the receiving device 200 shown in FIG. 1 may be considered as the target node.
According to embodiments of this disclosure, each transmitted packet comprises a transaction identifier (XID) identifying the message carried in the packet. In particular, the particular packet 1011 comprises a first XID identifying the first message 1012 carried in the particular packet 1011. Once the transmitting device 100 determines that the particular packet 1011 is missing, it retransmits a packet carrying the same message as the missing packet. That is, the retransmitted packet 102 comprises the same first XID as the particular packet 1011.
It can be seen that each packet (transmitted or retransmitted) carries two different numbers: an XID and a PSN. The XID identifies the message. Each message is assigned a unique XID by the sender, i.e., the transmitting device 100. When a packet is identified as lost, it is immediately retransmitted. The retransmitted copy carries the same XID, but a different PSN. Notably, the PSN of the retransmitted packet is logically larger than the PSN of the previous missing packet sent towards the same destination.
Optionally, the XID number space for messages could be a continuous number space to simplify and reduce the required memory to store the “received XID” state at the target. PSN number space could be a continuous number space to simplify and reduce the required memory to store the “outstanding PSNs” state at the initiator as well as to reduce the data structure for ACKs.
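The relationship between the XID and the PSN can be illustrated with a minimal sketch. It is not taken from the disclosure: the class names `Packet` and `Sender`, and the starting values 300 and 20 (chosen only to resemble the numbering used in the figures), are assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Packet:
    xid: int        # identifies the message; unchanged across retransmissions
    psn: int        # identifies this particular transmission; always fresh
    payload: bytes


class Sender:
    """Hypothetical sender that draws XIDs and PSNs from continuous spaces."""

    def __init__(self, first_xid=300, first_psn=20):
        # Starting values are illustrative assumptions only.
        self._next_xid = count(first_xid)
        self._next_psn = count(first_psn)

    def send(self, payload):
        # A new message gets a new XID and the next PSN.
        return Packet(next(self._next_xid), next(self._next_psn), payload)

    def retransmit(self, lost):
        # Same XID and payload as the lost packet, but a new (larger) PSN.
        return Packet(lost.xid, next(self._next_psn), lost.payload)
```

In this sketch, a packet sent as PSN 21 with XID 301 and retransmitted after PSNs 22 and 23 have been used would go out again as PSN 24 with XID 301.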
According to an embodiment of this disclosure, the transmitting device 100 may be configured to assign each transmitted packet and/or each retransmitted packet with a flowlet ID identifying a flowlet. Typically, the flowlet comprises a plurality of packets, which are routed through the same network route. Optionally, each transmitted packet and/or retransmitted packet further comprises the flowlet ID. The transmitting device 100 may be configured to transmit each transmitted packet and/or retransmitted packet to the receiving device 200 over the flowlet. All packets belonging to the same flowlet may be routed over the same network path. It is assumed here that different flowlets use different network paths.
It may be worth mentioning that ECMP flowlets are supported by dividing the stream of packets between an initiator node and a target node into sub-streams. Packets of different sub-streams are assumed to traverse the network using different routes. The transmitting device 100 may maintain a separate (continuous) PSN space for each sub-stream, i.e. flowlet. That is, each flowlet is associated with a separate PSN space. When a packet is transmitted or retransmitted, it is assigned to a flowlet and is assigned a PSN.
According to an embodiment of this disclosure, a transmitted or retransmitted packet from the transmitting device 100 may include the XID identifying a message, the PSN of the packet, and the flowlet ID of the flowlet over which this packet is transmitted.
According to an embodiment of this disclosure, the flowlet ID that is assigned to the retransmitted packet may be different from the flowlet ID that is assigned to the transmitted packet. In particular, the transmitted packet and the retransmitted packet may be routed through different flowlets identified by the flowlet IDs of the transmitted packet and the retransmitted packet, respectively. That is, every packet can be transmitted over different network paths established between the sender and the receiver. It should be noted that when the transmitting device 100 decides that a packet is lost, it may retransmit the same message in a new packet over any flowlet. That is, the flowlet assigned to the retransmitted packet can be the same as the flowlet assigned to the missing packet. However, it may be preferred to assign a retransmitted packet a different flowlet ID, and route the retransmitted packet through a different flowlet. Since the previous packet is missing, it may be a sign that the previous network route has some issues. The transmitting device 100 therefore has the flexibility to dynamically choose a different flowlet to use for retransmission. This decision could be based on any combination of criteria, e.g. a “better” flowlet that is less loaded, faster, less congested, etc.
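A minimal sketch of per-flowlet PSN spaces and flowlet selection on retransmission is given below. The class `FlowletSender`, the dictionary packet layout, and the use of the number of outstanding packets as the load criterion are all assumptions made for illustration; the disclosure leaves the selection criteria open.

```python
from itertools import count


class FlowletSender:
    """Hypothetical sender keeping one continuous PSN space per flowlet."""

    def __init__(self, flowlet_ids):
        # One independent PSN counter per flowlet.
        self._psn = {fid: count(0) for fid in flowlet_ids}
        # Assumed load metric: packets sent on the flowlet and not yet ACKed.
        self.outstanding = {fid: 0 for fid in flowlet_ids}

    def _emit(self, xid, flowlet_id):
        self.outstanding[flowlet_id] += 1
        return {"xid": xid, "flowlet": flowlet_id,
                "psn": next(self._psn[flowlet_id])}

    def send(self, xid, flowlet_id):
        return self._emit(xid, flowlet_id)

    def retransmit(self, lost):
        # Prefer any flowlet other than the one on which the loss was seen;
        # fall back to the same flowlet if it is the only one available.
        candidates = [f for f in self._psn if f != lost["flowlet"]]
        if not candidates:
            candidates = list(self._psn)
        best = min(candidates, key=lambda f: self.outstanding[f])
        return self._emit(lost["xid"], best)
```

The retransmitted packet keeps the XID of the lost packet but takes the next PSN from the PSN space of whichever flowlet is chosen.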
According to an embodiment of this disclosure, as shown in FIG. 2, the transmitting device may be configured to receive a notification message 201 for one or more transmitted packets of the sequence of packets 101 from the receiving device 200. In particular, the notification message 201 indicates whether the transmitted packets are received at the receiving device 200. The transmitting device may be further configured to determine whether the particular packet of the sequence of packets 101 is received at the receiving device 200 according to the notification message 201.
In this disclosure, the notification message 201 (also referred to as ACK) is sent from the receiver back to the sender for each received packet, regardless of whether this packet is accepted or not (the latter in case it carries a duplicate message). A notification message 201 carries the PSN and the flowlet ID of the received packet. Optionally, a single notification message 201 can report multiple packets belonging to the same flowlet.
According to an embodiment of this disclosure, each notification message indicates the transmitted packets with PSNs falling in a particular range. Optionally, each notification message indicates the transmitted packets that comprise the same flowlet ID.
It should be noted that ACKs may not be cumulative. This means that each notification message 201 may report on a specific range of PSNs, all belonging to a specific flowlet.
Notably, ACKs of packets of the same flowlet may be routed over the same path. Optionally, it is also possible that all ACKs are transmitted back to the sender over the same network path.
According to an embodiment of this disclosure, the transmitting device 100 may be configured to generate a completion signal for a message, when receiving a notification message indicating that a transmitted packet comprising an XID identifying that message is received for the first time.
When the transmitting device 100 is informed by a notification message 201 that a message (i.e., carried in a packet) was received, this message can be signaled as completed. Later ACKs, that potentially report the same message as received, do not trigger a completion signal (of the already completed message).
It should be understood that completion of a message happens regardless of the state of any previously issued messages, i.e., whether all previously issued messages have been completed or not. This completion generation mechanism may be referred to as “out-of-order completion” or “Order-Agnostic completion”.
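The completion behaviour described above can be sketched as follows. The class `CompletionTracker` and its method names are hypothetical. The sketch assumes that the sender remembers which XID was carried by each (flowlet ID, PSN) pair, since a notification message reports PSNs and flowlet IDs.

```python
class CompletionTracker:
    """Hypothetical sender-side bookkeeping for order-agnostic completion."""

    def __init__(self):
        self._psn_to_xid = {}    # (flowlet, psn) -> xid of packets sent
        self._completed = set()  # XIDs already signaled as completed
        self.completions = []    # XIDs in the order completion was signaled

    def on_send(self, flowlet, psn, xid):
        self._psn_to_xid[(flowlet, psn)] = xid

    def on_ack(self, flowlet, psn):
        """Return True if this ACK triggers a completion signal."""
        xid = self._psn_to_xid.pop((flowlet, psn), None)
        if xid is None or xid in self._completed:
            # Unknown PSN, or the message was already completed by an
            # earlier ACK for another copy: no second completion signal.
            return False
        self._completed.add(xid)
        self.completions.append(xid)
        return True
```

A message is signaled as completed as soon as any copy of it is acknowledged, independently of whether earlier messages have completed.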
According to embodiments of this disclosure, the transmitting device 100 may be configured to determine that the particular packet 1011 with a PSN X of the sequence of packets 101 is missing at the receiving device 200 when receiving a notification message indicating that a packet with a PSN logically larger than X is received at the receiving device 200 before receiving a notification message indicating that the particular packet 1011 with the PSN X is received at the receiving device 200, from the receiving device 200, wherein X is a non-negative integer.
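The disclosure does not define how “logically larger” is evaluated. One common reading, assumed here purely for illustration, is a wraparound-aware comparison over a fixed-width PSN space. The 24-bit width and the function name `logically_larger` are assumptions, not part of the disclosure.

```python
PSN_BITS = 24                 # assumed PSN width, for illustration only
PSN_MOD = 1 << PSN_BITS
HALF_RANGE = PSN_MOD >> 1


def logically_larger(a, b):
    """Return True if PSN a is logically larger than PSN b.

    a is considered larger when it lies less than half the PSN space
    ahead of b, so the comparison keeps working across wraparound.
    """
    return a != b and ((a - b) % PSN_MOD) < HALF_RANGE
```

Under this assumption, PSN 0 sent right after PSN 2^24 - 1 still compares as logically larger, even though it is numerically smaller.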
According to an embodiment of this disclosure, the transmitting device 100 may be configured to start to track a time period after sending a packet, if the tracking of the time period has not been started. Then, the transmitting device 100 may be configured to reset the tracking of the time period after receiving a notification message.
According to an embodiment of this disclosure, the transmitting device 100 may be configured to determine that a particular packet 1011 with a PSN X is missing at the receiving device 200 when no notification message indicating that the packet with the PSN X is received at the receiving device 200 is received before the time period expires.
Possibly, whether a packet is lost may be decided by the receiver side. For instance, the receiving device 200 may decide that a packet is lost and needs to be reported as such, if another packet is received over the same flowlet and its PSN is logically larger than that of the missing packet.
According to an embodiment of this disclosure, the sender, i.e., the transmitting device 100, may decide that a packet was lost in at least one of the following cases:
- It is informed by a notification message 201 (e.g., ACK, NAK, or SACK) that this packet was lost.
- It does not receive a notification message 201 for this packet but receives a notification message 201 that indicates that a packet with a logically larger PSN that was transmitted over the same flowlet was received.
- A timeout event is detected on the flowlet that indicates that all outstanding packets for this flowlet (i.e. that were transmitted but not ACKed yet) should be considered lost.
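The second and third cases above can be sketched for a single flowlet as follows. The class `FlowletLossDetector` is hypothetical, and for brevity it compares PSNs as plain integers, i.e. it assumes no PSN wraparound.

```python
class FlowletLossDetector:
    """Hypothetical per-flowlet tracking of outstanding PSNs at the sender."""

    def __init__(self):
        self.outstanding = []  # PSNs sent but not yet ACKed, ascending

    def on_send(self, psn):
        self.outstanding.append(psn)

    def on_ack(self, psn):
        """Handle an ACK for one PSN; return PSNs now considered lost."""
        # Any outstanding PSN smaller than the ACKed one forms a gap.
        lost = [p for p in self.outstanding if p < psn]
        # The ACKed PSN and the lost PSNs are no longer outstanding.
        self.outstanding = [p for p in self.outstanding if p > psn]
        return lost

    def on_timeout(self):
        """On a timeout, all outstanding packets are considered lost."""
        lost, self.outstanding = self.outstanding, []
        return lost
```

Each PSN returned as lost would then be retransmitted with the same XID and a new PSN, which in turn becomes outstanding.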
According to an embodiment of this disclosure, the transmitting device 100 may be configured to set the time period according to the receiving device 200. Best latency performance may be achieved with a timer per flowlet. To reduce required resources, a timer, i.e., the same time period, can be used for a pair of network nodes, i.e., the transmitting device 100 and the receiving device 200.
Possibly, to further reduce required resources and overhead, the transmitting device 100 may maintain a single timer, i.e., set the same time period, for all the connections with different receivers.
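The timer behaviour described above (start on send if not already running, reset on each notification message, expiry meaning loss) can be sketched as below. The class `RetransmitTimer` is hypothetical, time is passed in explicitly to keep the sketch deterministic, and stopping the timer when nothing is outstanding is an added assumption. One instance could be kept per flowlet, per peer, or for all connections, matching the resource trade-offs discussed above.

```python
class RetransmitTimer:
    """Hypothetical retransmission timer driven by explicit timestamps."""

    def __init__(self, period):
        self.period = period
        self.deadline = None  # None means the timer is not running

    def on_send(self, now):
        # Start tracking only if tracking has not been started yet.
        if self.deadline is None:
            self.deadline = now + self.period

    def on_ack(self, now, still_outstanding):
        # Reset on every notification message. Assumption: the timer is
        # stopped when no packets remain outstanding.
        self.deadline = now + self.period if still_outstanding else None

    def expired(self, now):
        return self.deadline is not None and now >= self.deadline
```

When `expired` returns True, the outstanding packets covered by this timer would be treated as lost and retransmitted.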
FIG. 2 shows a packet exchange between an initiator node, i.e., the transmitting device 100, and a target node, i.e., the receiving device 200, according to an embodiment of the disclosure. The transmitting device 100 may be the transmitting device as shown in FIG. 1, and the receiving device 200 may be the receiving device as shown in FIG. 1. In particular, FIG. 2 illustrates the retransmission of lost packets, detected based on a notification message 201. The example uses ACKs that include information about multiple PSNs. The size of this data structure is an implementation parameter. In this example, a single flowlet is considered, therefore the flowlet ID is omitted for brevity.
In this embodiment, the transmitting device 100 sends a sequence of packets P20 to P26, i.e., the sequence of packets 101 as shown in FIG. 1, to the receiving device 200. For each received packet, the receiving device 200 records it in a data structure, for example, a bitmap (the receiving device 200 may mark a received packet as “1” and mark a lost packet as “0”). It can be seen that, in the example shown in FIG. 2, when the receiving device 200 receives P22 but still has not received P21, it decides that P21 is lost and marks it as “0” in the bitmap. Later, when the receiving device 200 receives P26 but still has not received P25, it decides that P25 is lost.
The transmitting device 100 receives an ACK, e.g., the notification message 201, which notifies the transmitting device 100 that P21 and P25 are not received at the receiving device 200. In response to the ACK, the transmitting device 100 transmits retransmission packet P29, which is the retransmission of P21, and retransmission packet P30, which is the retransmission of P25, to the receiving device 200. Possibly, P21 may be the particular missing packet 1011 that comprises the first message 1012, as shown in FIG. 1; the retransmission packet P29 may then be the retransmitted packet 102 that carries the same first message 1012 but with a new PSN (29).
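The bitmap-based reporting in this example can be sketched as follows. The function names, the list-of-bits representation, and the choice of a base PSN plus a fixed width are assumptions made for illustration; the actual ACK layout is left open by the disclosure.

```python
def build_sack(received_psns, base, width):
    """Receiver side: bitmap for PSNs base..base+width-1 (1 = received)."""
    seen = set(received_psns)
    return [1 if base + i in seen else 0 for i in range(width)]


def lost_from_sack(bitmap, base):
    """Sender side: PSNs marked 0 below the highest PSN marked 1."""
    if 1 not in bitmap:
        return []
    # Index of the last 1 in the bitmap, i.e. the highest received PSN.
    highest = len(bitmap) - 1 - bitmap[::-1].index(1)
    # Zeros above the highest received PSN are not gaps; they may simply
    # not have arrived (or not have been sent) yet.
    return [base + i for i in range(highest) if bitmap[i] == 0]
```

With P20, P22, P23, P24 and P26 received, the bitmap starting at PSN 20 has zeros at the positions of P21 and P25, and the sender derives exactly those two PSNs as lost.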
FIG. 3 shows another signaling flowchart between an initiator node, i.e., the transmitting device 100, and a target node, i.e., the receiving device 200, according to an embodiment of the disclosure. Similarly, the transmitting device 100 may be the transmitting device as shown in FIG. 1, and the receiving device 200 may be the receiving device as shown in FIG. 1. In this example, a packet loss is illustrated. Notably, there are two different loss cases where the sender side considers a packet as missing: if the packet is lost on its way to the receiver, i.e., it fails to reach the receiver; or it does reach the receiver, but a notification message 201 for this packet is lost on the way to the sender.
In particular, FIG. 3 emphasizes the case where the XID of a packet does not change during retransmission, but the PSN does. In this embodiment, packet P21 does not arrive at the receiving device 200. When the transmitting device 100 receives a notification message A22 from the receiving device 200, which acknowledges the reception of P22, but still has not received a notification message 201 acknowledging the reception of P21, the transmitting device 100 determines that P21 is lost and quickly transmits P24, which carries the same message as P21. As shown in FIG. 3, the XID of the retransmitted packet P24 is still “301”, the same as the XID of the lost packet P21. After the fast retransmission of the missing packet, the transmitting device 100 continues with the transmission of the normal packets.
FIG. 4 shows another signaling flowchart between an initiator node, i.e., the transmitting device 100, and a target node, i.e., the receiving device 200, according to an embodiment of the disclosure. In particular, FIG. 4 illustrates another loss case, where a notification message 201 is lost. In this embodiment, the transmitting device 100 transmits packets P20 to P23 to the receiving device 200. Although all packets P20 to P23 arrive at the receiving device 200 and the receiving device 200 sends a notification message 201 (ACK) for each of P20 to P23, A22 is lost and does not arrive at the transmitting device 100. When the transmitting device 100 receives ACK A23 from the receiving device 200 but still has not received ACK A22, the transmitting device 100 considers that P22 is “lost” and quickly transmits P25, which carries the same message as P22. As shown in FIG. 4, the XID of the retransmitted packet P25 is still “302”, the same as the XID of the packet P22.
Notably, embodiments of this disclosure employ a retransmission scheme, which may have two complementary mechanisms: PSN-based and timer-based. PSN-based retransmission may be based on a unique, strictly monotonically increasing PSN assigned to every transmitted packet, which allows lost packets to be identified quickly when a “gap” is detected in the PSNs reported back by a notification message 201. Timer-based retransmission is similar to the traditional way of triggering a retransmission event: if no notification message 201 is received during a configurable period of time, all outstanding packets are retransmitted. Outstanding packets are defined as the packets that were transmitted but for which a notification message 201 has not yet been received.
FIG. 5 shows a receiving device 200 adapted for RDMA according to an embodiment of the disclosure. The receiving device 200 may comprise processing circuitry (not shown) configured to perform, conduct or initiate the various operations of the receiving device 200 described herein. The processing circuitry may comprise hardware and software. The hardware may comprise analog circuitry or digital circuitry, or both analog and digital circuitry. The digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or multi-purpose processors. The receiving device 200 may further comprise memory circuitry, which stores one or more instruction(s) that can be executed by the processor or by the processing circuitry, in particular under control of the software. For instance, the memory circuitry may comprise a non-transitory storage medium storing executable software code which, when executed by the processor or the processing circuitry, causes the various operations of the receiving device 200 to be performed. In one embodiment, the processing circuitry comprises one or more processors and a non-transitory memory connected to the one or more processors. The non-transitory memory may carry executable program code which, when executed by the one or more processors, causes the receiving device 200 to perform, conduct or initiate the operations or methods described herein.
In particular, the receiving device 200 is configured to receive a sequence of packets 101 from a transmitting device 100 for RDMA. Possibly, the transmitting device 100 here may be the transmitting device 100 shown in FIG. 1. In particular, each received packet of the sequence of packets 101 is associated with a PSN, and carries a message.
Further, the receiving device 200 is configured to transmit a notification message 201 for the sequence of packets 101 to the transmitting device 100, wherein the notification message 201 indicates which packets of the sequence of packets 101 are received at the receiving device 200.
Embodiments of the present disclosure further provide a receiving device 200 that operates according to the transmitting device 100 as previously described in this disclosure. It may be worth mentioning that in this disclosure, a notification message 201 (ACK) supports aggregating notifications for multiple packets and supports “selective ACK” (SACK) to report both received and lost packets. The number of such reported packets, i.e. the size of the ACK data structure, is an implementation parameter.
According to an embodiment of the disclosure, each received packet comprises an XID identifying the message carried in the packet.
According to an embodiment of the disclosure, each packet of the sequence of packets 101 further comprises a flowlet ID identifying a flowlet, wherein the flowlet comprises a plurality of packets, which are routed through the same network route.
According to an embodiment of the disclosure, each notification message 201 indicates packets that comprise the same flowlet ID.
According to an embodiment of the disclosure, the receiving device 200 may be configured to maintain a first data structure storing PSNs of received packets. The receiving device 200 may be further configured to update the first data structure after receiving a packet from the transmitting device 100.
According to an embodiment of the disclosure, the receiving device 200 may be configured to discard a received packet, if a PSN of that received packet cannot be recorded in the first data structure. For instance, if there are no resources at the receiving device 200 to store or mark the new PSN, then the packet may be discarded.
According to an embodiment of the disclosure, the receiving device 200 may be configured to transmit the notification message based on the first data structure. Optionally, the receiving device 200 may maintain a state of “received XIDs”. If a message is received for the first time, its XID is stored in this state. If another copy of this message is received later, it is not accepted. That is, the receiving device 200 may be further configured to determine for each received packet whether it is a duplicated packet, by checking whether the XID comprised in the received packet has already been stored in the “received XIDs” state, and to ignore any duplicated packet. Possibly, these duplicates are only acknowledged but not executed again. In this way, each message is guaranteed to be accepted by the receiving device 200 exactly once.
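The receiver-side handling described in the preceding paragraphs can be sketched as follows. The class `Receiver`, the use of Python sets for the first data structure and the “received XIDs” state, and the `psn_capacity` limit standing in for "no resources to record the PSN" are all assumptions made for illustration.

```python
class Receiver:
    """Hypothetical receiver with PSN recording and XID-based deduplication."""

    def __init__(self, psn_capacity=64):
        self.psn_capacity = psn_capacity  # assumed resource limit
        self.received_psns = set()        # stands in for the first data structure
        self.received_xids = set()        # the "received XIDs" state
        self.executed = []                # payloads accepted exactly once

    def on_packet(self, xid, psn, payload):
        """Return (acknowledged, executed) for one incoming packet."""
        if (psn not in self.received_psns
                and len(self.received_psns) >= self.psn_capacity):
            # The PSN cannot be recorded: discard without acknowledging.
            return (False, False)
        self.received_psns.add(psn)
        if xid in self.received_xids:
            # Duplicate message: acknowledge the packet, do not execute again.
            return (True, False)
        self.received_xids.add(xid)
        self.executed.append(payload)
        return (True, True)
```

In the lost-ACK case of FIG. 4, the retransmitted copy of the message with XID 302 would be acknowledged by this sketch but not executed a second time.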
According to an embodiment of the disclosure, each notification message 201 indicates packets having PSNs falling in a particular range.
To summarize, embodiments of this disclosure enable out-of-order completion. This disclosure uses both an XID and a PSN for each transmitted packet. Packets that are retransmitted carry the same XID but a different PSN. This accelerates the detection of lost retransmitted packets, so that “fast retransmit” occurs for all lost packets (even if these have already been retransmitted). “Fast retransmit” means that there is no need to wait for a timeout event. Further, this disclosure enables retransmission of a packet over any flowlet, regardless of the flowlet it was last transmitted over.
FIG. 6 shows a method 600 for RDMA according to an embodiment of the disclosure. In a particular embodiment of the disclosure, the method 600 is performed by a transmitting device 100 shown in FIG. 1 or FIG. 5. The method 600 comprises a step 601 of transmitting a sequence of packets 101 to a receiving device 200 for RDMA. In particular, each transmitted packet of the sequence of packets 101 is associated with a PSN and carries a message. The method 600 further comprises a step 602 of determining whether a particular packet 1011 of the sequence of packets 101 is received at the receiving device 200, wherein the particular packet 1011 comprises a first message 1012. Then, the method 600 further comprises a step 603 of retransmitting the particular packet to the receiving device 200 as the next step after determining that the particular packet is missing at the receiving device 200, wherein the retransmitted packet 102 carries the same first message 1012 as the particular packet 1011, and is associated with a new PSN.
FIG. 7 shows a method 700 for RDMA according to an embodiment of the disclosure. In a particular embodiment of the disclosure, the method 700 is performed by a receiving device 200 shown in FIG. 1 or FIG. 5. The method 700 comprises a step 701 of receiving a sequence of packets 101 from a transmitting device 100 for RDMA. In particular, each received packet of the sequence of packets 101 is associated with a PSN, and comprises an XID identifying a message carried in the packet. The method 700 further comprises a step 702 of transmitting a notification message 201 for the sequence of packets 101 to the transmitting device 100, wherein the notification message 201 indicates which packets of the sequence of packets 101 are received at the receiving device 200.
The present disclosure has been described in conjunction with various embodiments as examples as well as implementations. However, other variations can be understood and effected by those persons skilled in the art and practicing the claimed disclosure, from a study of the drawings, this disclosure and the independent claims. In the claims as well as in the description the word “comprising” does not exclude other elements or steps and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.
Furthermore, any method according to embodiments of the disclosure may be implemented in a computer program, having code means, which when run by processing means causes the processing means to execute the steps of the method. The computer program is included in a computer readable medium of a computer program product. The computer readable medium may comprise essentially any memory, such as a ROM (Read-Only Memory), a PROM (Programmable Read-Only Memory), an EPROM (Erasable PROM), a Flash memory, an EEPROM (Electrically Erasable PROM), or a hard disk drive.
Moreover, it is realized by the skilled person that embodiments of the transmitting device 100, or the receiving device 200, comprise the necessary communication capabilities in the form of e.g., functions, means, units, elements, etc., for performing the solution. Examples of other such means, units, elements and functions are: processors, memory, buffers, control logic, encoders, decoders, rate matchers, de-rate matchers, mapping units, multipliers, decision units, selecting units, switches, interleavers, de-interleavers, modulators, demodulators, inputs, outputs, antennas, amplifiers, receiver units, transmitter units, DSPs, trellis-coded modulation (TCM) encoder, TCM decoder, power supply units, power feeders, communication interfaces, communication protocols, etc. which are suitably arranged together for performing the solution. Especially, the processor(s) of the transmitting device 100, or the receiving device 200, may comprise, e.g., one or more instances of a Central Processing Unit (CPU), a processing unit, a processing circuit, a processor, an Application Specific Integrated Circuit (ASIC), a microprocessor, or other processing logic that may interpret and execute instructions. The expression “processor” may thus represent processing circuitry comprising a plurality of processing circuits, such as, e.g., any, some or all of the ones mentioned above. The processing circuitry may further perform data processing functions for inputting, outputting, and processing of data comprising data buffering and device control functions, such as call processing control, user interface control, or the like.

Claims

1. A transmitting device (100) for Remote Direct Memory Access, RDMA, wherein the transmitting device (100) is configured to: transmit a sequence of packets (101) to a receiving device (200) for RDMA, wherein each transmitted packet of the sequence of packets (101) is associated with a packet sequence number, PSN, and carries a message; determine whether a particular packet (1011) of the sequence of packets (101) is received at the receiving device (200), wherein the particular packet (1011) comprises a first message (1012); and retransmit the particular packet to the receiving device (200) as the next step after determining that the particular packet is missing at the receiving device (200), wherein the retransmitted packet (102) carries the same first message (1012) as the particular packet (1011), and is associated with a new PSN.
2. The transmitting device (100) according to claim 1, wherein: each transmitted packet comprises a transaction identifier, XID, identifying the message carried in the packet; the particular packet (1011) comprises a first XID identifying the first message (1012) carried in the particular packet (1011); and the retransmitted packet (102) comprises the same first XID as the particular packet (1011).
3. The transmitting device (100) according to claim 1 or 2, configured to: assign each transmitted packet and/or each retransmitted packet with a flowlet ID identifying a flowlet, wherein the flowlet comprises a plurality of packets, which are routed through the same network route, and each transmitted packet and/or retransmitted packet further comprises the flowlet ID; and transmit each transmitted packet and/or retransmitted packet to the receiving device (200) over the flowlet.
4. The transmitting device (100) according to claim 3, wherein the flowlet ID that is assigned to the retransmitted packet is different from the flowlet ID that is assigned to the transmitted packet, wherein the transmitted packet and the retransmitted packet are routed through different flowlets identified by the flowlet IDs of the transmitted packet and the retransmitted packet, respectively.
5. The transmitting device (100) according to one of the claims 1 to 4, configured to: receive a notification message (201) for one or more transmitted packets of the sequence of packets (101) from the receiving device (200), wherein the notification message (201) indicates whether the transmitted packets are received at the receiving device (200); and determine whether the particular packet of the sequence of packets (101) is received at the receiving device (200) according to the notification message (201).
6. The transmitting device (100) according to claim 5, configured to: generate a completion signal for a message, when receiving a notification message indicating that a transmitted packet comprising an XID identifying that message is received for the first time.
7. The transmitting device (100) according to claim 5 or 6, configured to: determine that the particular packet (1011) with a PSN X of the sequence of packets (101) is missing at the receiving device (200) when receiving a notification message indicating that a packet with a PSN larger than X is received at the receiving device (200) before receiving a notification message indicating that the particular packet (1011) with the PSN X is received at the receiving device (200), from the receiving device (200), wherein X is a non-negative integer.
8. The transmitting device (100) according to one of the claims 5 to 7, wherein each notification message indicates the transmitted packets with PSNs falling in a particular range.
9. The transmitting device (100) according to one of the claims 5 to 8, wherein each notification message indicates the transmitted packets that comprise the same flowlet ID.
10. The transmitting device (100) according to one of the claims 5 to 9, configured to: start to track a time period after sending a packet, if the tracking of the time period has not been started; and reset the tracking of the time period after receiving a notification message.
11. The transmitting device (100) according to claim 10, configured to: determine that a particular packet (1011) with a PSN X is missing at the receiving device (200) when no notification message indicating that the packet with the PSN X is received at the receiving device (200) is received before the time period expires.
12. The transmitting device (100) according to claim 10 or 11, configured to: set the time period according to the receiving device (200).
13. The transmitting device (100) according to one of the claims 1 to 12, wherein the message carried in each packet is limited to fit in a single network maximum transmission unit.
14. A receiving device (200) for Remote Direct Memory Access, RDMA, wherein the receiving device (200) is configured to: receive a sequence of packets (101) from a transmitting device (100) for RDMA, wherein each received packet of the sequence of packets (101) is associated with a packet sequence number, PSN, and carries a message; and transmit a notification message (201) for the sequence of packets (101) to the transmitting device (100), wherein the notification message (201) indicates which packets of the sequence of packets (101) are received at the receiving device (200).
15. The receiving device (200) according to claim 14, wherein: each received packet comprises a transaction identifier, XID, identifying the message carried in the packet.
16. The receiving device (200) according to claim 14 or 15, wherein each packet of the sequence of packets (101) further comprises a flowlet ID identifying a flowlet, wherein the flowlet comprises a plurality of packets, which are routed through the same network route.
17. The receiving device (200) according to claim 16, wherein each notification message (201) indicates packets that comprise the same flowlet ID.
18. The receiving device (200) according to one of the claims 14 to 17, configured to: maintain a first data structure storing PSNs of received packets; and update the first data structure after receiving a packet from the transmitting device (100).
19. The receiving device (200) according to claim 18, configured to: discard a received packet, if a PSN of that received packet cannot be recorded in the first data structure.
20. The receiving device (200) according to claim 18 or 19, configured to: transmit the notification message based on the first data structure.
21. The receiving device (200) according to one of the claims 14 to 20, wherein each notification message (201) indicates packets having PSNs falling in a particular range.
22. A method (600) for Remote Direct Memory Access, RDMA, wherein the method comprises: transmitting (601) a sequence of packets (101) to a receiving device (200) for RDMA, wherein each transmitted packet of the sequence of packets (101) is associated with a packet sequence number, PSN, and carries a message; determining (602) whether a particular packet (1011) of the sequence of packets (101) is received at the receiving device (200), wherein the particular packet (1011) comprises a first message (1012); and retransmitting (603) the particular packet to the receiving device (200) as the next step after determining that the particular packet is missing at the receiving device (200), wherein the retransmitted packet (102) carries the same first message (1012) as the particular packet (1011), and is associated with a new PSN.
23. A method (700) for Remote Direct Memory Access, RDMA, wherein the method comprises: receiving (701) a sequence of packets (101) from a transmitting device (100) for RDMA, wherein each received packet of the sequence of packets (101) is associated with a packet sequence number, PSN, and comprises a transaction identifier, XID, identifying a message carried in the packet; and transmitting (702) a notification message (201) for the sequence of packets (101) to the transmitting device (100), wherein the notification message (201) indicates which packets of the sequence of packets (101) are received at the receiving device (200).
24. A computer program product comprising a program code for carrying out, when implemented on a processor, the method according to claim 22 or 23.
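
The interplay of claims 14 to 23 can be illustrated with a minimal sketch. It is not the claimed implementation: the class names, the fixed window size, the use of a Python set as the "first data structure", and the dictionary-shaped notification are all assumptions made for illustration. The sketch shows a receiver that records PSNs, discards packets whose PSN cannot be recorded, and reports on a PSN range, and a transmitter that resends a missing message under a new PSN with the same XID.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    psn: int        # packet sequence number, new for every transmission
    xid: int        # transaction identifier of the carried message
    payload: bytes  # the message itself


class Receiver:
    """Hypothetical receiver; a set stands in for the 'first data structure'."""

    def __init__(self, window=64):
        self.window = window   # assumed capacity of the PSN record
        self.base = 0          # lowest PSN the record can hold
        self.seen = set()      # PSNs of received packets
        self.delivered = {}    # xid -> payload, first copy wins

    def receive(self, pkt):
        # Discard the packet if its PSN cannot be recorded (cf. claim 19).
        if not (self.base <= pkt.psn < self.base + self.window):
            return False
        self.seen.add(pkt.psn)
        # The XID identifies the message, so a retransmission arriving
        # under a new PSN is still matched to the same message.
        self.delivered.setdefault(pkt.xid, pkt.payload)
        return True

    def notification(self, first, last):
        # Report which PSNs in [first, last] were received (cf. claim 21).
        return {psn: psn in self.seen for psn in range(first, last + 1)}


class Transmitter:
    """Hypothetical transmitter that retransmits under a new PSN."""

    def __init__(self):
        self.next_psn = 0
        self.in_flight = {}    # psn -> Packet awaiting notification

    def send(self, xid, payload):
        pkt = Packet(self.next_psn, xid, payload)
        self.next_psn += 1
        self.in_flight[pkt.psn] = pkt
        return pkt

    def handle_notification(self, note):
        retransmitted = []
        for psn, received in sorted(note.items()):
            old = self.in_flight.pop(psn, None)
            if old is not None and not received:
                # Same message and XID, fresh PSN (cf. claim 22).
                retransmitted.append(self.send(old.xid, old.payload))
        return retransmitted


# Usage: four messages are sent and the packet with PSN 1 is lost in transit.
tx, rx = Transmitter(), Receiver()
sent = [tx.send(100 + i, f"m{i}".encode()) for i in range(4)]
for pkt in sent:
    if pkt.psn != 1:
        rx.receive(pkt)
note = rx.notification(0, 3)
retx = tx.handle_notification(note)
for pkt in retx:
    rx.receive(pkt)
```

In this sketch the lost message is resent as PSN 4 with XID 101, so the receiver's PSN record never needs to revisit an old sequence number. A real device would likely use a bitmap and advance the window base, which the sketch omits.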
PCT/EP2021/071749 2021-08-04 2021-08-04 A device and method for remote direct memory access WO2023011712A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180096824.1A CN117203627A (en) 2021-08-04 2021-08-04 Apparatus and method for remote direct memory access
PCT/EP2021/071749 WO2023011712A1 (en) 2021-08-04 2021-08-04 A device and method for remote direct memory access

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2021/071749 WO2023011712A1 (en) 2021-08-04 2021-08-04 A device and method for remote direct memory access

Publications (1)

Publication Number Publication Date
WO2023011712A1 (en) 2023-02-09

Family

ID=77358263

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2021/071749 WO2023011712A1 (en) 2021-08-04 2021-08-04 A device and method for remote direct memory access

Country Status (2)

Country Link
CN (1) CN117203627A (en)
WO (1) WO2023011712A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200151137A1 (en) * 2015-06-19 2020-05-14 Amazon Technologies, Inc. Flexible remote direct memory access
US20200334195A1 (en) * 2017-12-15 2020-10-22 Microsoft Technology Licensing, Llc Multi-path rdma transmission
US20210119930A1 (en) * 2019-10-31 2021-04-22 Intel Corporation Reliable transport architecture

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RADHIKA MITTAL ET AL: "Revisiting Network Support for RDMA", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 21 June 2018 (2018-06-21), XP080893075 *

Also Published As

Publication number Publication date
CN117203627A (en) 2023-12-08

Similar Documents

Publication Publication Date Title
US8190960B1 (en) Guaranteed inter-process communication
US10148581B2 (en) End-to-end enhanced reliable datagram transport
US10419329B2 (en) Switch-based reliable multicast service
US10430374B2 (en) Selective acknowledgement of RDMA packets
US8233483B2 (en) Communication apparatus, communication system, absent packet detecting method and absent packet detecting program
US7171484B1 (en) Reliable datagram transport service
US7876751B2 (en) Reliable link layer packet retry
EP2001180B1 (en) One-way message notification with out-of-order packet delivery
US11863370B2 (en) High availability using multiple network elements
EP2001152B1 (en) Reliable message transport network
EP2774322B1 (en) Apparatus and method for transmitting a message to multiple receivers
CN108234089B (en) Method and system for low latency communication
US7535916B2 (en) Method for sharing a transport connection across a multi-processor platform with limited inter-processor communications
CN112383622A (en) Reliable transport protocol and hardware architecture for data center networking
WO2023016646A1 (en) A device and method for remote direct memory access
WO2023011712A1 (en) A device and method for remote direct memory access
WO2021249651A1 (en) Device and method for delivering acknowledgment in network transport protocols
US20230327812A1 (en) Device and method for selective retransmission of lost packets
WO2019015931A1 (en) Point-to-point transmitting method based on the use of an erasure coding scheme and a tcp/ip protocol
WO2021223853A1 (en) Device and method for delivering acknowledgment in network transport protocols
WO2012043142A1 (en) Multicast router and multicast network system
WO2023247005A1 (en) Receiver-agnostic scheme for reliable delivery of data over multipath
WO2023241770A1 (en) Efficient rerouting of a selective-repeat connection

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21755454; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)