WO2022121469A1 - Flow control method, apparatus, device, and readable storage medium - Google Patents

Flow control method, apparatus, device, and readable storage medium

Info

Publication number
WO2022121469A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
rdma
sender
flow control
receiving end
Prior art date
Application number
PCT/CN2021/121873
Inventor
刘钧锴
李仁刚
阚宏伟
张翔宇
韩海跃
赵坤
Original Assignee
苏州浪潮智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Publication of WO2022121469A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/22 Traffic shaping
    • H04L47/225 Determination of shaping rate, e.g. using a moving window
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/26 Flow control; Congestion control using explicit feedback to the source, e.g. choke packets
    • H04L47/263 Rate modification at the source after receiving feedback

Definitions

  • the present application relates to the field of computer technologies, and in particular, to a flow control method, apparatus, device, and readable storage medium.
  • RDMA: Remote Direct Memory Access
  • RDMA is one of the basic protocols for the new generation of data center high-speed network interconnection.
  • RDMA comes from the field of high-performance computing. It addresses many shortcomings of the traditional TCP/IP protocol stack on high-speed networks: network communication no longer passes through the kernel or CPU, and memory is instead read and written directly through the network card, so applications can make full use of network bandwidth of 10 Gbit/s and above.
  • the current RoCE relies on ECN (Explicit Congestion Notification) for flow control.
  • ECN: Explicit Congestion Notification
  • if ECN is enabled, then once congestion of RoCE traffic is detected, the network device on the transmission link marks the ECN field in the header of the data packet.
  • when a data packet marked by ECN arrives at its destination node, the destination node sends feedback to the source node.
  • the source node performs flow control by adjusting the sending rate of the data packet.
  • ECN adjusts the sending rate of data packets only after congestion has occurred, so it cannot prevent congestion, and packets are still lost; moreover, when the network delay is relatively large, the source node cannot adjust the sending rate in time, so congestion is not resolved promptly and RDMA transmission efficiency drops significantly.
  • the purpose of the present application is to provide a flow control method, apparatus, device and readable storage medium, so as to perform RDMA flow control and avoid network congestion. The specific solution is as follows:
  • the present application provides a flow control method, including:
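  • if a sender is to send target data to a receiver based on RDMA, determining the amount of flight data from the sender to the receiver;
  • judging whether the amount of flight data is less than the BDP from the sender to the receiver;
  • if the amount of flight data is less than the BDP, allowing the sender to send the target data to the receiver based on RDMA;
  • if the amount of flight data is not less than the BDP, not allowing the sender to send the target data to the receiver based on RDMA, and after waiting for a preset period of time, executing again the step of determining the amount of flight data from the sender to the receiver.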
  • the method further includes:
  • the product of the maximum bandwidth and the minimum delay is determined as the BDP.
  • the minimum delay is the sum of the standard delay values of all devices on the communication link from the sender to the receiver, or the minimum delay is the smallest of N test delay values measured on the communication link from the sender to the receiver.
  • the allowing the sender to send the target data to the receiver based on RDMA includes:
  • the target data is divided into a plurality of sub-data packets according to the MTU, and each sub-data packet is sent one by one as the target data.
  • the target data is an RDMA write command, an RDMA read command, a send operation command or return data corresponding to the RDMA read command.
  • the receiving end reports the send operation command to the host of the receiving end.
  • the sending end and the receiving end are different FPGA accelerator cards.
  • a flow control apparatus, including:
  • a determining module configured to determine the amount of flight data from the sender to the receiver if the sender is to send target data to the receiver based on RDMA;
  • a judging module configured to judge whether the amount of flight data is less than the BDP from the sender to the receiver;
  • a sending module configured to allow the sender to send the target data to the receiver based on RDMA if the amount of flight data is less than the BDP;
  • a control module configured to not allow the sender to send the target data to the receiver based on RDMA if the amount of flight data is not less than the BDP, and, after waiting for a preset period of time, to execute again the step of determining the amount of flight data from the sender to the receiver.
  • the present application provides a flow control device, including:
  • a processor for executing the computer program to implement the flow control method disclosed above.
  • the present application provides a readable storage medium for storing a computer program, wherein when the computer program is executed by a processor, the flow control method disclosed above is implemented.
  • the present application provides a flow control method, comprising: if a sender is to send target data to a receiver based on RDMA, determining the amount of flight data from the sender to the receiver; judging whether the amount of flight data is less than the BDP from the sender to the receiver; if the amount of flight data is less than the BDP, allowing the sender to send the target data to the receiver based on RDMA; if the amount of flight data is not less than the BDP, not allowing the sender to send the target data to the receiver based on RDMA, and, after waiting for a preset period of time, executing again the step of determining the amount of flight data from the sender to the receiver.
  • before the sender sends the target data to the receiver based on RDMA, it first determines the amount of flight data from the sender to the receiver (that is, the amount of data that has been sent from the sender but has not yet reached the receiver) and judges whether the amount of flight data is less than the BDP from the sender to the receiver.
  • if the amount of flight data is less than the BDP from the sender to the receiver, the communication link from the sender to the receiver is not congested and data can be transmitted normally, so the sender is allowed to send the target data to the receiver based on RDMA; if the amount of flight data is not less than the BDP from the sender to the receiver, the communication link from the sender to the receiver is congested and data transmission is not smooth, so the sender is temporarily not allowed to send the target data to the receiver based on RDMA.
  • the flow control apparatus, device and readable storage medium provided by the present application also have the above technical effects.
  • FIG. 1 is a flowchart of a flow control method disclosed in the application;
  • FIG. 2 is a schematic diagram of a flow control scheme disclosed in the application;
  • FIG. 3 is a schematic structural diagram of an FPGA accelerator card disclosed in the application;
  • FIG. 6 is a schematic diagram of splitting an RDMA read command disclosed in the application;
  • FIG. 7 is a schematic diagram of a flow control apparatus disclosed in the application;
  • FIG. 8 is a schematic diagram of a flow control device disclosed in the application.
  • ECN adjusts the sending rate of data packets only after congestion has occurred, so it cannot prevent congestion, and packets are still lost; moreover, when the network delay is relatively large, the source node cannot adjust the sending rate in time, so congestion is not resolved promptly and RDMA transmission efficiency drops significantly. Therefore, the present application provides a flow control scheme that can perform RDMA flow control and avoid network congestion.
  • an embodiment of the present application discloses a flow control method, including:
  • the amount of flight data is the amount of data that has been sent from the sender but has not yet reached the receiver.
  • before judging whether the amount of flight data is less than the BDP from the sender to the receiver, the method further includes: obtaining the maximum bandwidth of the sender and the minimum delay from the sender to the receiver; and determining the product of the maximum bandwidth and the minimum delay as the BDP (Bandwidth Delay Product).
  • BDP: Bandwidth Delay Product
  • the minimum delay is the sum of the standard delay values of all devices on the communication link from the sender to the receiver (such as the delay values marked on switches, routers, etc.), or the minimum delay is the smallest of N test delay values measured on the communication link from the sender to the receiver (for example, send N test packets on the link and see which test packet has the smallest delay).
  • the sending end is not allowed to send the target data to the receiving end based on RDMA, and after waiting for a preset time period, S101 is performed.
  • the target data is an RDMA write command, an RDMA read command, a send operation command, or return data corresponding to the RDMA read command.
  • when the target data is an RDMA write command, an RDMA read command, or a send operation command, the sender is the initiator of the data transmission, that is, the sender actively sends the RDMA write command, RDMA read command, or send operation command to the receiver; flow control is performed while the commands are being sent, and the receiver simply responds to them one by one.
  • when the target data is the return data corresponding to the RDMA read command, the sender is not the initiator of the data transmission but the passive responder, that is, the sender returns data to the receiver piece by piece at the receiver's request, and flow control is performed while the data is being returned.
  • This embodiment can be applied to both processes; in each process, it is only necessary to identify which end is the sender and which is the receiver.
  • if the target data is an RDMA write command, an RDMA read command or a send operation command, allowing the sender to send the target data to the receiver based on RDMA includes: splitting the target data into multiple sub-packets according to the MTU (Maximum Transmission Unit), and sending each sub-packet one by one as target data.
  • MTU: Maximum Transmission Unit
  • if the target data is an RDMA write command, the target data includes the data to be stored in the receiver, and each sub-packet is one piece of that data.
  • if the target data is an RDMA read command, the target data specifies which data is to be read from which locations of the receiver, so each sub-packet specifies which block of data is to be read from which location, so that the receiver can return the corresponding data piece by piece.
  • the send operation command is similar to the RDMA write command and will not be repeated here.
  • the receiving end reports the send operation command to the host of the receiving end to notify the host.
  • the sender and the receiver are different FPGA (Field Programmable Gate Array) accelerator cards, and the two FPGA accelerator cards can be plugged into different servers.
  • FPGA: Field Programmable Gate Array
  • in this embodiment, before the sender sends the target data to the receiver based on RDMA, it first determines the amount of flight data from the sender to the receiver and judges whether the amount of flight data is less than the BDP from the sender to the receiver.
  • if the amount of flight data is less than the BDP from the sender to the receiver, the communication link from the sender to the receiver is not congested and data can be transmitted normally, so the sender is allowed to send the target data to the receiver based on RDMA; if the amount of flight data is not less than the BDP from the sender to the receiver, the communication link is congested and data transmission is not smooth, so the sender is temporarily not allowed to send the target data to the receiver based on RDMA.
  • this embodiment can avoid the occurrence of congestion in the RDMA network transmission from the source, thereby reducing the occurrence probability of packet loss and improving the RDMA transmission efficiency.
  • this solution only needs to modify the sender accordingly, and neither the receiver nor the network forwarding devices (such as switches, routers, etc.) need to be changed, thus reducing the deployment cost and difficulty.
  • of course, if the two ends of the transmission each act as both sender and receiver, the corresponding modifications need to be made at both ends.
  • an embodiment of the present application discloses a flow control solution.
  • one device is the requester (ie, the sender), responsible for issuing RDMA read and write commands; the other device is the responder (ie, the receiver), which is used to respond according to the received RDMA read and write commands.
  • the two devices transmit data using the RDMA protocol, forming a queue pair.
  • if the requester sends an RDMA write command to the responder, the requester, following the user's instruction, divides the data to be transmitted into several data packets (pkt) according to the MTU and sends them to the responder.
  • the sequence number (PSN) of each data packet is inserted into its header and sent to the responder with it, so that the responder can distinguish the data packets by sequence number.
  • after receiving a data packet, the responder returns an acknowledgment packet (ack) to the requester according to the sequence number of the data packet.
  • RTT: Round-Trip Time (round-trip delay)
  • because of the RTT in network transmission, when the requester sends the nth data packet, it may have received only the acknowledgment of the 0th data packet, so n packets are in the network at this time.
  • these n packets are called flight data packets, and their total data volume is the amount of flight data.
  • when there is no network congestion, the RTT is at its minimum, and multiplying the maximum output bandwidth of the requester by the minimum round-trip delay gives the bandwidth-delay product (BDP) when there is no congestion.
  • BDP: bandwidth-delay product
  • Congestion can be avoided as long as the total data volume of the flight packets is less than the BDP at any time. Specifically, before each data packet is sent, the number of bytes of the data packet that has been sent by the requester is calculated first, and the number of bytes of the data packet that has been confirmed is subtracted to obtain the number of bytes of the in-flight data packet. If the number of bytes of the flight data packet is less than the BDP, the next data packet can be sent; if the number of bytes of the flight data packet is not less than the BDP, the sending of the data packet is suspended.
  • if the requester sends an RDMA read command to the responder, the requester splits the RDMA read command into several sub-read commands according to the MTU. Each sub-read command requires the responder to return only one data packet, which makes it possible to control the number of bytes of flight data packets precisely.
  • the sending of these sub-read commands and the returning of the return data can proceed as follows: only the flow of the return data is controlled, so that after the first sub-read command is sent, the second sub-read command is sent only once the corresponding return data has been received.
  • the sending of the sub-read commands and the returning of the return data can also be treated as two independent processes: flow control is applied to the sub-read commands as they are sent and, at the same time, to the return data as it is returned.
  • the requester is both a data sender and a data receiver.
  • the responder is both a data sender and a data receiver.
  • since the BDP is obtained by multiplying the maximum output bandwidth of the requester (a fixed value) by the RTT, only the RTT needs to be determined to obtain the BDP.
  • there are two ways to obtain the RTT. One is to use software to obtain the network path information when a connection is established and add up the nominal delay values of all devices on the path to obtain the RTT; the other is to send a set of test data when a connection is established, calculate the round-trip delay of each packet, and take the minimum value as the RTT.
  • the FPGA accelerator card can use Intel's Arria 10 chip, configured with two 10G Ethernet optical ports and two 4 GB SDRAMs (Synchronous Dynamic Random Access Memory) as memory.
  • the FPGA accelerator card can be connected to the CPU of the server through PCI-E (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard).
  • the "test minimum RTT" module is used to calculate the minimum RTT.
  • the module for calculating the number of flight data bytes is used to calculate the number of flight data bytes.
  • the datapath merge module is used to send data when appropriate.
  • the "test minimum RTT" module inside the FPGA of the requester generates 1000 round-trip delay test packets (that is, RDMA write command packets) and sends them to the "data path merge" module, which transmits the test packets to the responder through the network.
  • the responder returns an acknowledgment packet to the requester according to the RDMA protocol.
  • the requester records the sending time of each RDMA write command packet it sends and the return time when the corresponding acknowledgment packet is received, and subtracts the sending time from the return time to obtain the round-trip delay RTT. It compares the RTTs of the 1000 test packets, takes the minimum value as the RTT without congestion, and sends it to the "calculate the number of flight data bytes" module.
  • after the "calculate the number of flight data bytes" module receives the RTT without congestion, it multiplies it by the transmission bandwidth of 10 Gbps to obtain the BDP without congestion. It then starts to receive RDMA commands from the user. For an RDMA write command, it records the number of bytes of each write command packet sent and the total number of bytes sent. It first sends one write command packet; after an acknowledgment packet is received, it subtracts the number of bytes of the write command packet corresponding to that acknowledgment from the total number of bytes to obtain the real-time number of flight data bytes.
  • when the "calculate the number of flight data bytes" module in the requester receives an RDMA read command, it splits the read command into several sub-read commands according to the MTU (0x400 bytes).
  • the splitting method is shown in Figure 6.
  • the data start address (the start address of the data to be read) of the remote end (that is, the response end) is 0x100000
  • the end address of the data to be read is 0x1103ff
  • the data length is 0x10400 bytes
  • the starting address of the first read command is 0x100000
  • the starting address of each subsequent read command differs by 0x400 bytes
  • the length of each sub-read command is 0x400 bytes.
  • after receiving each sub-read command, the responder returns the read data packets one by one according to the RDMA protocol.
  • the requester is both a data sender and a data receiver.
  • the responder is both a data sender and a data receiver.
  • the "calculate the number of flight data bytes" module in the requester records the length in bytes of each sub-read command and the total length in bytes of the sub-read commands that have been sent. It sends the first sub-read command and, after receiving the corresponding return data, subtracts the byte length of the first sub-read command from the total length to obtain the real-time number of flight data bytes. It then compares the number of flight data bytes with the BDP: if the number of flight data bytes is greater than or equal to the BDP, sending of sub-read commands is suspended; if it is less than the BDP, the next sub-read command is sent to the "data path merge" module, which sends it over the network to the responder.
  • the "calculate the number of flight data bytes" module in the responder records the length in bytes of the return data corresponding to each sub-read command and the total length in bytes of the return data that the responder has sent. After the first piece of return data is sent, its length is subtracted from the total length to obtain the real-time number of flight data bytes. The module then compares the number of flight data bytes with the BDP: if the number of flight data bytes is greater than or equal to the BDP, sending of return data stops; if it is less than the BDP, the next piece of return data is sent to the "data path merge" module, which sends it over the network to the requester.
  • the real-time number of flight data bytes is compared with the BDP under non-congested conditions to control the transmission flow, which avoids congestion in RDMA network transmission at the source, thereby reducing packet loss and improving RDMA transmission efficiency.
  • the following describes a flow control apparatus provided by an embodiment of the present application.
  • the flow control apparatus described below and the flow control method described above may be read with reference to each other.
  • an embodiment of the present application discloses a flow control apparatus, including:
  • a determining module 701, configured to determine the amount of flight data from the sender to the receiver if the sender is to send target data to the receiver based on RDMA;
  • a judging module 702, configured to judge whether the amount of flight data is less than the BDP from the sender to the receiver;
  • a sending module 703, configured to allow the sender to send the target data to the receiver based on RDMA if the amount of flight data is less than the BDP;
  • a control module 704, configured to not allow the sender to send the target data to the receiver based on RDMA if the amount of flight data is not less than the BDP, and, after waiting for a preset period of time, to execute again the step of determining the amount of flight data from the sender to the receiver.
  • it also includes:
  • the obtaining module is used to obtain the maximum bandwidth of the sender and the minimum delay from the sender to the receiver;
  • the BDP calculation module is used to determine the product of the maximum bandwidth and the minimum delay as the BDP.
  • the minimum delay is the sum of the standard delay values of all devices on the communication link from the sender to the receiver, or the minimum delay is the smallest of N test delay values measured on the communication link from the sender to the receiver.
  • the target data is an RDMA write command, an RDMA read command, a send operation command, or return data corresponding to the RDMA read command.
  • the sending module is specifically used for:
  • if the target data is an RDMA write command, an RDMA read command or a send operation command, splitting the target data into multiple sub-packets according to the MTU, and sending each sub-packet one by one as target data.
  • the receiving end reports the send operation command to the host of the receiving end.
  • the sending end and the receiving end are different FPGA accelerator cards.
  • this embodiment provides a flow control apparatus, which can avoid congestion in RDMA network transmission at the source, thereby reducing the probability of packet loss and improving the efficiency of RDMA transmission.
  • the following describes a flow control device provided by an embodiment of the present application.
  • the flow control device described below and the flow control method and apparatus described above may be read with reference to each other.
  • an embodiment of the present application discloses a flow control device, including:
  • the processor 802 is configured to execute the computer program to implement the method disclosed in any of the foregoing embodiments.
  • a readable storage medium provided by an embodiment of the present application is introduced below.
  • a readable storage medium described below and a flow control method, apparatus, and device described above may be referred to each other.
  • a readable storage medium for storing a computer program, wherein when the computer program is executed by a processor, the flow control method disclosed in the foregoing embodiments is implemented. For the specific steps of the method, reference may be made to the corresponding content disclosed in the foregoing embodiments, which will not be repeated here.
  • references in this application to "first", "second", "third", "fourth", etc. are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It is to be understood that target data so used can be interchanged under appropriate circumstances so that the embodiments described herein can be practiced in sequences other than those illustrated or described herein.
  • the terms "comprising" and "having", and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method or apparatus comprising a series of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to these processes, methods or apparatus.
  • a software module can be placed in random access memory (RAM), internal memory, read only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, removable disk, CD-ROM, or any other form of readable storage medium well known in the technical field.
  • RAM: random access memory
  • ROM: read only memory
  • EEPROM: electrically erasable programmable ROM

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

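The present application discloses a flow control method, apparatus, device, and readable storage medium. The disclosed method includes: if a sender is to send target data to a receiver based on RDMA, determining the amount of flight data from the sender to the receiver; judging whether the amount of flight data is less than the BDP from the sender to the receiver; if the amount of flight data is less than the BDP from the sender to the receiver, allowing the sender to send the target data to the receiver based on RDMA; if the amount of flight data is not less than the BDP from the sender to the receiver, not allowing the sender to send the target data to the receiver based on RDMA and, after waiting for a preset period of time, executing again the step of determining the amount of flight data from the sender to the receiver. This implements flow control on the communication link from the sender to the receiver, avoids congestion in RDMA network transmission at the source, reduces the probability of packet loss, and improves RDMA transmission efficiency. Correspondingly, the flow control apparatus, device, and readable storage medium provided by the present application have the same technical effects.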

Description

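Flow control method, apparatus, device, and readable storage medium
This application claims priority to Chinese patent application No. 202011438863.9, filed with the China Patent Office on December 10, 2020 and entitled "Flow control method, apparatus, device, and readable storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technologies, and in particular to a flow control method, apparatus, device, and readable storage medium.
Background
RDMA (Remote Direct Memory Access) is one of the basic protocols for the new generation of data center high-speed network interconnection. RDMA comes from the field of high-performance computing. It addresses many shortcomings of the traditional TCP/IP protocol stack on high-speed networks: network communication no longer passes through the kernel or CPU, and memory is instead read and written directly through the network card, so applications can make full use of network bandwidth of 10 Gbit/s and above.
As model complexity and data scale grow rapidly, deep learning systems need more and more accelerator cards for parallel training, and high-throughput, low-latency RDMA technology becomes the inevitable choice. Large-scale, highly scalable deep learning systems in turn need to transfer data and commands over RoCE (RDMA over Converged Ethernet).
The current RoCE relies on ECN (Explicit Congestion Notification) for flow control. If ECN is enabled, then once congestion of RoCE traffic is detected, the network devices on the transmission link mark the ECN field in the packet header. When packets marked by ECN reach their destination node, the destination node sends feedback to the source node, and the source node then performs flow control by adjusting the packet sending rate. ECN therefore adjusts the sending rate only after congestion has occurred, so it cannot prevent congestion, and packets are still lost; moreover, when the network delay is relatively large, the source node cannot adjust the sending rate in time, so congestion is not resolved promptly and RDMA transmission efficiency drops significantly.
Therefore, how to perform RDMA flow control and avoid network congestion is a problem to be solved by those skilled in the art.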
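Summary
In view of this, the purpose of the present application is to provide a flow control method, apparatus, device, and readable storage medium, so as to perform RDMA flow control and avoid network congestion. The specific solution is as follows:
In a first aspect, the present application provides a flow control method, including:
if a sender is to send target data to a receiver based on RDMA, determining the amount of flight data from the sender to the receiver;
judging whether the amount of flight data is less than the BDP from the sender to the receiver;
if the amount of flight data is less than the BDP, allowing the sender to send the target data to the receiver based on RDMA;
if the amount of flight data is not less than the BDP, not allowing the sender to send the target data to the receiver based on RDMA, and, after waiting for a preset period of time, executing the step of determining the amount of flight data from the sender to the receiver.
Preferably, before the judging whether the amount of flight data is less than the BDP from the sender to the receiver, the method further includes:
obtaining the maximum bandwidth of the sender and the minimum delay from the sender to the receiver;
determining the product of the maximum bandwidth and the minimum delay as the BDP.
Preferably, the minimum delay is the sum of the standard delay values of all devices on the communication link from the sender to the receiver, or the minimum delay is the smallest of N test delay values measured on the communication link from the sender to the receiver.
Preferably, if the target data is an RDMA write command, an RDMA read command, or a send operation command, the allowing the sender to send the target data to the receiver based on RDMA includes:
splitting the target data into multiple sub-packets according to the MTU, and sending each sub-packet one by one as the target data.
Preferably, the target data is an RDMA write command, an RDMA read command, a send operation command, or return data corresponding to the RDMA read command.
Preferably, if the target data is a send operation command, the receiver, after receiving the send operation command, reports the send operation command to the host of the receiver.
Preferably, the sender and the receiver are different FPGA accelerator cards.
In a second aspect, the present application provides a flow control apparatus, including:
a determining module, configured to determine the amount of flight data from the sender to the receiver if the sender is to send target data to the receiver based on RDMA;
a judging module, configured to judge whether the amount of flight data is less than the BDP from the sender to the receiver;
a sending module, configured to allow the sender to send the target data to the receiver based on RDMA if the amount of flight data is less than the BDP;
a control module, configured to not allow the sender to send the target data to the receiver based on RDMA if the amount of flight data is not less than the BDP, and, after waiting for a preset period of time, to execute the step of determining the amount of flight data from the sender to the receiver.
In a third aspect, the present application provides a flow control device, including:
a memory for storing a computer program;
a processor for executing the computer program to implement the flow control method disclosed above.
In a fourth aspect, the present application provides a readable storage medium for storing a computer program, wherein the flow control method disclosed above is implemented when the computer program is executed by a processor.
As can be seen from the above solution, the present application provides a flow control method, including: if a sender is to send target data to a receiver based on RDMA, determining the amount of flight data from the sender to the receiver; judging whether the amount of flight data is less than the BDP from the sender to the receiver; if the amount of flight data is less than the BDP, allowing the sender to send the target data to the receiver based on RDMA; if the amount of flight data is not less than the BDP, not allowing the sender to send the target data to the receiver based on RDMA, and, after waiting for a preset period of time, executing the step of determining the amount of flight data from the sender to the receiver.
It can be seen that, before the sender sends the target data to the receiver based on RDMA, it first determines the amount of flight data from the sender to the receiver (that is, the amount of data that has been sent from the sender but has not yet reached the receiver) and judges whether the amount of flight data is less than the BDP from the sender to the receiver. If the amount of flight data is less than the BDP, the communication link from the sender to the receiver is not congested and data can be transmitted normally, so the sender is allowed to send the target data to the receiver based on RDMA. If the amount of flight data is not less than the BDP, the communication link is congested and data transmission is not smooth, so the sender is temporarily not allowed to send the target data to the receiver based on RDMA; after waiting for a preset period of time, the amount of flight data is compared with the BDP again, so that the sender resumes sending the target data once the link is no longer congested. This implements flow control on the communication link from the sender to the receiver, avoids congestion in RDMA network transmission at the source, and thereby reduces the probability of packet loss and improves RDMA transmission efficiency.
Correspondingly, the flow control apparatus, device, and readable storage medium provided by the present application have the same technical effects.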
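Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a flow control method disclosed in the present application;
FIG. 2 is a schematic diagram of a flow control scheme disclosed in the present application;
FIG. 3 is a schematic structural diagram of an FPGA accelerator card disclosed in the present application;
FIG. 4 is a schematic diagram of a functional framework for flow control disclosed in the present application;
FIG. 5 is a flowchart of another flow control method disclosed in the present application;
FIG. 6 is a schematic diagram of splitting an RDMA read command disclosed in the present application;
FIG. 7 is a schematic diagram of a flow control apparatus disclosed in the present application;
FIG. 8 is a schematic diagram of a flow control device disclosed in the present application.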
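Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
At present, ECN adjusts the sending rate of data packets only after congestion has occurred, so it cannot prevent congestion, and packets are still lost; moreover, when the network delay is relatively large, the source node cannot adjust the sending rate in time, so congestion is not resolved promptly and RDMA transmission efficiency drops significantly. For this reason, the present application provides a flow control scheme that can perform RDMA flow control and avoid network congestion.
Referring to FIG. 1, an embodiment of the present application discloses a flow control method, including:
S101: If a sender is to send target data to a receiver based on RDMA, determine the amount of flight data from the sender to the receiver.
Here, the amount of flight data is the amount of data that has been sent from the sender but has not yet reached the receiver.
In a specific implementation, before judging whether the amount of flight data is less than the BDP from the sender to the receiver, the method further includes: obtaining the maximum bandwidth of the sender and the minimum delay from the sender to the receiver; and determining the product of the maximum bandwidth and the minimum delay as the BDP (Bandwidth Delay Product).
The minimum delay is the sum of the standard delay values of all devices on the communication link from the sender to the receiver (such as the delay values marked on switches, routers, and similar devices), or the minimum delay is the smallest of N test delay values measured on the communication link from the sender to the receiver (for example, send N test packets on the link and see which test packet has the smallest delay).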
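The BDP computation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the use of integer bits per second and nanoseconds as units, and the sample values in the usage note are hypothetical and do not come from the application.

```python
from typing import Iterable


def min_delay_from_nominal(device_delays_ns: Iterable[int]) -> int:
    """Option 1: add up the nominal delay of every device on the link."""
    return sum(device_delays_ns)


def min_delay_from_probes(probe_delays_ns: Iterable[int]) -> int:
    """Option 2: take the smallest of N measured test delays."""
    delays = list(probe_delays_ns)
    if not delays:
        raise ValueError("at least one test delay is required")
    return min(delays)


def bdp_bytes(max_bandwidth_bps: int, min_delay_ns: int) -> int:
    """BDP = maximum bandwidth x minimum delay, converted from bits to bytes.

    Integer arithmetic is used so the result does not depend on
    floating-point rounding.
    """
    return max_bandwidth_bps * min_delay_ns // (8 * 1_000_000_000)
```

For instance, with an assumed 10 Gbit/s sender and an assumed minimum delay of 20 microseconds, `bdp_bytes(10_000_000_000, 20_000)` evaluates to 25000 bytes.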
S102、判断飞行数据量是否小于发送端至接收端的BDP;若飞行数据量小于发送端至接收端的BDP,则执行S103;若飞行数据量不小于发送端至接收端的BDP,则执行S104。
S103、允许发送端基于RDMA发送目标数据至接收端。
S104、不允许发送端基于RDMA发送目标数据至接收端,在等待预设时长后,执行S101。
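Steps S101 to S104 can be read as a simple gating loop. The following Python sketch is only an illustration under stated assumptions: the function name `send_with_flow_control`, its callback parameters, and the injectable `sleep` are hypothetical and do not come from the application, which describes an FPGA implementation rather than software.

```python
import time
from typing import Callable


def send_with_flow_control(
    in_flight_bytes: Callable[[], int],
    bdp: int,
    send: Callable[[], None],
    wait_s: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Gate one send on the in-flight check; return how many waits were needed.

    in_flight_bytes: hypothetical callback returning the current in-flight bytes.
    send:            hypothetical callback that actually transmits the target data.
    wait_s:          the preset duration to wait before re-checking.
    """
    waits = 0
    while True:
        # S101 + S102: determine the in-flight data volume and compare with the BDP.
        if in_flight_bytes() < bdp:
            send()            # S103: sending is allowed.
            return waits
        sleep(wait_s)         # S104: wait the preset duration, then go back to S101.
        waits += 1
```

Passing `sleep` as a parameter is a design choice for this sketch only; it lets the loop be exercised without real delays.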
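In a specific implementation, the target data is an RDMA write command, an RDMA read command, a send operation command, or the returned data corresponding to the RDMA read command.
It should be noted that when the target data is an RDMA write command, an RDMA read command, or a send operation command, the sender is the initiator of the data transmission. That is, the sender actively sends the RDMA write command, RDMA read command, or send operation command to the receiver, flow control is performed while the command is being sent, and the receiver simply responds to each one in turn. When the target data is the returned data corresponding to an RDMA read command, the sender is not the initiator of the data transmission but the passive responder. That is, the sender returns data to the receiver piece by piece at the receiver's request, and flow control is performed while the data is being returned. This embodiment can be applied to both processes, provided that the two ends of the data transmission are distinguished in each process.
In a specific implementation, if the target data is an RDMA write command, an RDMA read command, or a send operation command, allowing the sender to send the target data to the receiver based on RDMA comprises: splitting the target data into multiple sub-packets according to the MTU (Maximum Transmission Unit), and sending each sub-packet one by one as the target data.
If the target data is an RDMA write command, the target data contains the data to be stored at the receiver, and each sub-packet is a piece of that data. If the target data is an RDMA read command, the target data specifies which data is to be read from which locations at the receiver, so each sub-packet specifies which block of data to read from which location at the receiver, allowing the receiver to return the corresponding data piece by piece. A send operation command is similar to an RDMA write command and is not described again here.
In a specific implementation, if the target data is a send operation command, then after receiving the send operation command, the receiver reports it to the receiver's host in order to notify the host.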
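In a specific implementation, the sender and the receiver are different FPGA (Field Programmable Gate Array) accelerator cards, and the two FPGA accelerator cards may be plugged into different servers.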
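In this embodiment, before the sender sends target data to the receiver based on RDMA, the in-flight data volume from the sender to the receiver is first determined, and it is judged whether the in-flight data volume is less than the BDP from the sender to the receiver. If the in-flight data volume is less than the BDP, the communication link from the sender to the receiver is not congested and data can be transmitted normally, so the sender is allowed to send the target data to the receiver based on RDMA. If the in-flight data volume is not less than the BDP, the communication link is congested and data transmission is not smooth, so the sender is temporarily not allowed to send the target data; after waiting for a preset duration, it is judged again whether the in-flight data volume is less than the BDP, so that the sender continues to send the target data once the link is no longer congested. Flow control on the communication link from the sender to the receiver is thereby achieved.
It can be seen that this embodiment avoids congestion in RDMA network transmission at the source, which reduces the probability of packet loss and improves RDMA transmission efficiency. In addition, the scheme only requires modifying the sender; neither the receiver nor the network forwarding devices (such as switches and routers) need to change, which lowers the cost and difficulty of deployment. Of course, if each of the two ends acts as both sender and receiver, the corresponding modification must be made at both ends.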
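Referring to FIG. 2, an embodiment of the present application discloses a flow control scheme. In FIG. 2, one device is the requester (that is, the sender), which issues RDMA read and write commands; the other device is the responder (that is, the receiver), which responds to the RDMA read and write commands it receives. The two devices transmit data using the RDMA protocol and form a queue pair.
If the requester sends an RDMA write command to the responder, the requester, following the user's instruction, divides the data to be transmitted into several packets (pkt) according to the MTU and sends them to the responder. The packet sequence number (PSN) of each packet is inserted into the packet header and sent along with it, so that the responder can distinguish the packets by sequence number. After receiving a packet, the responder returns an acknowledgement packet (ack) to the requester according to the packet's sequence number. Because network transmission involves an RTT (Round-Trip Time), by the time the requester sends the n-th packet it may have received only the acknowledgement for packet 0. At that moment n packets are in transit in the network; these n packets are called in-flight packets, and their total size is the in-flight data volume.
When the network is not congested, the RTT is at its minimum. Multiplying the requester's maximum output bandwidth by the minimum round-trip delay gives the bandwidth delay product (BDP) for the uncongested case. As long as the total size of the in-flight packets is less than the BDP at every moment, congestion can be avoided. Specifically, before each packet is sent, the number of bytes already sent by the requester minus the number of bytes already acknowledged gives the number of in-flight bytes. If the number of in-flight bytes is less than the BDP, the next packet may be sent; if it is not less than the BDP, sending is paused.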
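The bookkeeping described above (bytes sent minus bytes acknowledged, compared with the BDP before every packet) can be sketched in Python as follows. This is a minimal illustration, not the application's hardware design; the class name `InFlightTracker` and its method names are hypothetical.

```python
class InFlightTracker:
    """Track in-flight bytes as (bytes sent) - (bytes acknowledged)."""

    def __init__(self, bdp_bytes: int) -> None:
        self.bdp_bytes = bdp_bytes
        self.sent_bytes = 0    # total bytes handed to the network so far
        self.acked_bytes = 0   # total bytes confirmed by acknowledgement packets

    @property
    def in_flight(self) -> int:
        return self.sent_bytes - self.acked_bytes

    def may_send(self) -> bool:
        # The next packet may be sent only while in-flight bytes are below the BDP.
        return self.in_flight < self.bdp_bytes

    def on_send(self, packet_bytes: int) -> None:
        self.sent_bytes += packet_bytes

    def on_ack(self, packet_bytes: int) -> None:
        self.acked_bytes += packet_bytes
```

A caller would check `may_send()` before each packet, call `on_send()` after transmitting it, and call `on_ack()` whenever the acknowledgement for a packet arrives.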
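If the requester sends an RDMA read command to the responder, the requester splits the RDMA read command into several sub-read commands according to the MTU. Each sub-read command requires the responder to return only one packet, which allows the number of in-flight bytes to be controlled precisely.
The sending of these sub-read commands and the returning of the data may proceed as follows: only the traffic of the returned data is controlled, so that after the first sub-read command is sent, the second sub-read command is sent only once the corresponding returned data has been received. Alternatively, the sending of sub-read commands and the returning of data may be treated as two independent processes: flow control is applied while the sub-read commands are being sent, and flow control is applied at the same time while the data is being returned. In this process, the requester is both a data sender and a data receiver, and so is the responder.
Because the BDP is obtained by multiplying the requester's maximum output bandwidth (a fixed value) by the RTT, obtaining the BDP only requires obtaining the RTT. There are two ways to obtain the RTT. One is to use software to obtain the network path information when the connection is established and add up the nominal delay values of all devices on the path. The other is for the requester to send a group of test packets when the connection is established, compute the round-trip delay of each packet, and take the minimum as the RTT.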
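The two ways of obtaining the RTT and the resulting BDP can be sketched as below. The function names and the choice of integer units (bits per second, nanoseconds, bytes) are assumptions made for this illustration; the application does not prescribe units or an API.

```python
from typing import Iterable


def rtt_from_nominal_delays(device_delays_ns: Iterable[int]) -> int:
    """Method 1: add up the nominal delay of every device on the path."""
    return sum(device_delays_ns)


def rtt_from_probes(probe_rtts_ns: Iterable[int]) -> int:
    """Method 2: take the smallest round-trip delay among the test packets."""
    return min(probe_rtts_ns)


def bdp_bytes(max_bandwidth_bps: int, min_rtt_ns: int) -> int:
    """BDP = maximum output bandwidth x minimum RTT, converted from bits to bytes.

    Integer arithmetic is used so the result is exact.
    """
    return max_bandwidth_bps * min_rtt_ns // (8 * 1_000_000_000)
```

Either RTT estimate can be passed to `bdp_bytes`; the delay values used with these functions in any example are hypothetical inputs, not measurements from the application.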
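Referring to FIG. 3, FPGA accelerator cards are used as the requester device and the responder device. The FPGA accelerator card may use an Intel Arria 10 chip and be configured with two 10G Ethernet optical ports and two 4 GB SDRAMs (Synchronous Dynamic Random Access Memory) as memory. The FPGA accelerator card may be connected to the server's CPU through PCI-E (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard).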
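Inside each of the requester and the responder, the three modules shown in FIG. 4 may be designed to carry out flow control: a minimum-RTT test module, an in-flight byte count module, and a data path merging module. The minimum-RTT test module computes the minimum RTT. The in-flight byte count module computes the number of in-flight bytes. The data path merging module sends data at the appropriate time.
Referring to FIG. 5, after the two FPGA accelerator cards establish an RDMA connection, the "minimum-RTT test module" inside the requester FPGA generates 1000 round-trip delay test packets (that is, RDMA write command packets) and sends them to the "data path merging module", which transmits the test packets over the network to the responder. The responder returns acknowledgement packets to the requester according to the RDMA protocol. The requester records the sending time of each RDMA write command packet it sends and records the return time when the corresponding acknowledgement packet is received; the return time minus the sending time gives the round-trip delay RTT. The RTTs of the 1000 test packets are compared, and the minimum is taken as the uncongested RTT and sent to the "in-flight byte count module".
After receiving the uncongested RTT, the "in-flight byte count module" multiplies it by the sending bandwidth of 10 Gbps to obtain the uncongested BDP. It then begins accepting RDMA commands from the user. For an RDMA write command, it records the number of bytes of each write command packet sent and the total number of bytes sent. One write command packet is sent first; when an acknowledgement packet is received, the number of bytes of the write command packet corresponding to that acknowledgement is subtracted from the total, giving the real-time number of in-flight bytes. The number of in-flight bytes is compared with the BDP: if it is greater than or equal to the BDP, sending is paused; if it is less than the BDP, the next write command packet is sent to the "data path merging module", which sends it over the network to the responder.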
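The minimum-RTT measurement described above amounts to pairing each test packet's send time with the arrival time of its acknowledgement and keeping the smallest difference. A minimal Python sketch follows; the function name `min_rtt_ns`, the use of dictionaries keyed by PSN, and nanosecond timestamps are assumptions for illustration only.

```python
from typing import Dict


def min_rtt_ns(send_time_ns: Dict[int, int], ack_time_ns: Dict[int, int]) -> int:
    """Return the smallest RTT over all test packets that were acknowledged.

    send_time_ns maps a packet sequence number (PSN) to the time it was sent;
    ack_time_ns maps a PSN to the time its acknowledgement was received.
    """
    rtts = [
        ack_time_ns[psn] - sent        # RTT = return time - sending time
        for psn, sent in send_time_ns.items()
        if psn in ack_time_ns          # ignore probes with no acknowledgement yet
    ]
    if not rtts:
        raise ValueError("no acknowledged test packets")
    return min(rtts)
```

In the embodiment, 1000 such test packets are used; the sketch works for any number of probes.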
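As a worked example of the BDP computation, assume a hypothetical minimum RTT of 8 µs (this value is chosen for illustration and is not given in the application), together with the 10 Gbps sending bandwidth stated above and packets of 0x400 = 1024 bytes:

```latex
\begin{align*}
\mathrm{BDP} &= 10\times10^{9}\ \mathrm{bit/s} \times 8\times10^{-6}\ \mathrm{s}
              = 8\times10^{4}\ \mathrm{bit} = 10\,000\ \mathrm{bytes},\\
9 \times 1024 &= 9216 < 10\,000 \quad\text{(check passes, a tenth packet may be sent)},\\
10 \times 1024 &= 10\,240 \ge 10\,000 \quad\text{(check fails, sending pauses)}.
\end{align*}
```

Under these assumed numbers, at most ten unacknowledged packets are outstanding at any time, and sending resumes as soon as an acknowledgement brings the in-flight byte count back below the BDP.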
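If the "in-flight byte count module" in the requester receives an RDMA read command, it splits the read command into several sub-read commands according to the MTU (0x400 bytes). The splitting method is shown in FIG. 6. In the user's RDMA read command, the start address of the remote (that is, responder-side) data to be read is 0x100000, the end address of the data to be read is 0x1103ff, and the data length is 0x10400 bytes. The command is therefore split into 65 (0x41) sub-read commands: the first sub-read command starts at 0x100000, the start addresses of successive sub-read commands differ by 0x400 bytes, and every sub-read command has a length of 0x400 bytes. After receiving each sub-read command, the responder returns the corresponding read data packets one by one according to the RDMA protocol. In this case, the requester is both a data sender and a data receiver, and so is the responder.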
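The split described for FIG. 6 can be checked with a short Python sketch. The function name `split_read_command` and the representation of a sub-read command as an (address, length) tuple are hypothetical; only the start address, length, and MTU values come from the text. With a length of 0x10400 bytes and an MTU of 0x400 bytes, the split yields 0x10400 / 0x400 = 0x41 = 65 sub-read commands.

```python
from typing import List, Tuple


def split_read_command(start_addr: int, length: int, mtu: int = 0x400) -> List[Tuple[int, int]]:
    """Split one read of `length` bytes into (start address, length) sub-reads of at most `mtu` bytes."""
    if mtu <= 0:
        raise ValueError("mtu must be positive")
    sub_reads = []
    offset = 0
    while offset < length:
        chunk = min(mtu, length - offset)   # the last chunk may be shorter in general
        sub_reads.append((start_addr + offset, chunk))
        offset += chunk
    return sub_reads


# Values from the example in the text: start 0x100000, length 0x10400 bytes.
sub_reads = split_read_command(0x100000, 0x10400)
```

Because each sub-read asks for at most one MTU of data, each one corresponds to a single returned packet, which is what makes per-packet accounting of in-flight bytes possible.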
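Specifically, the "in-flight byte count module" in the requester records the length in bytes of each sub-read command and the total length in bytes of the sub-read commands already sent. The first sub-read command is sent; after its corresponding returned data is received, the byte length of the first sub-read command is subtracted from the total length, giving the real-time number of in-flight bytes. The number of in-flight bytes is compared with the BDP: if it is greater than or equal to the BDP, sending of sub-read commands is paused; if it is less than the BDP, the next sub-read command is sent to the "data path merging module", which sends it over the network to the responder.
Correspondingly, the "in-flight byte count module" in the responder records the length in bytes of the returned data corresponding to each sub-read command and the total length in bytes of the returned data the responder has already sent. After the first piece of returned data is sent, its length is subtracted from the total length, giving the real-time number of in-flight bytes. The number of in-flight bytes is compared with the BDP: if it is greater than or equal to the BDP, sending of returned data is paused; if it is less than the BDP, the next piece of returned data is sent to the "data path merging module", which sends it over the network to the requester.
It can be seen that this embodiment compares the real-time number of in-flight bytes with the BDP under uncongested conditions to control the sending traffic, which avoids congestion in RDMA network transmission at the source, reduces packet loss, and improves RDMA transmission efficiency.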
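A flow control apparatus provided by an embodiment of the present application is described below. The flow control apparatus described below and the flow control method described above may be referred to in conjunction with each other.
Referring to FIG. 7, an embodiment of the present application discloses a flow control apparatus, comprising:
a determination module 701, configured to determine the in-flight data volume from the sender to the receiver if the sender is to send target data to the receiver based on RDMA;
a judgment module 702, configured to judge whether the in-flight data volume is less than the BDP from the sender to the receiver;
a sending module 703, configured to allow the sender to send the target data to the receiver based on RDMA if the in-flight data volume is less than the BDP;
a control module 704, configured to, if the in-flight data volume is not less than the BDP, disallow the sender from sending the target data to the receiver based on RDMA and, after waiting for a preset duration, perform again the step of determining the in-flight data volume from the sender to the receiver.
In a specific implementation, the apparatus further comprises:
an obtaining module, configured to obtain the maximum bandwidth of the sender and the minimum delay from the sender to the receiver;
a BDP calculation module, configured to determine the product of the maximum bandwidth and the minimum delay as the BDP.
In a specific implementation, the minimum delay is the sum of the nominal delay values of all devices on the communication link from the sender to the receiver, or the minimum delay is the smallest of N test delay values on the communication link from the sender to the receiver.
In a specific implementation, the target data is an RDMA write command, an RDMA read command, a send operation command, or the returned data corresponding to the RDMA read command.
In a specific implementation, the sending module is specifically configured to:
if the target data is an RDMA write command, an RDMA read command, or a send operation command, split the target data into multiple sub-packets according to the MTU, and send each sub-packet one by one as the target data.
In a specific implementation, if the target data is a send operation command, then after receiving the send operation command, the receiver reports it to the receiver's host.
In a specific implementation, the sender and the receiver are different FPGA accelerator cards.
For the more specific working process of each module and unit in this embodiment, reference may be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
It can be seen that this embodiment provides a flow control apparatus that avoids congestion in RDMA network transmission at the source, thereby reducing the probability of packet loss and improving RDMA transmission efficiency.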
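A flow control device provided by an embodiment of the present application is described below. The flow control device described below and the flow control method and apparatus described above may be referred to in conjunction with each other.
Referring to FIG. 8, an embodiment of the present application discloses a flow control device, comprising:
a memory 801, configured to store a computer program;
a processor 802, configured to execute the computer program to implement the method disclosed in any of the foregoing embodiments.
A readable storage medium provided by an embodiment of the present application is described below. The readable storage medium described below and the flow control method, apparatus, and device described above may be referred to in conjunction with each other.
A readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the flow control method disclosed in the foregoing embodiments. For the specific steps of the method, reference may be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.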
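The terms "first", "second", "third", "fourth", and the like (if any) in the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that the data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. In addition, the terms "comprise" and "have" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, or device.
It should be noted that descriptions involving "first", "second", and the like in the present application are for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. A feature qualified by "first" or "second" may therefore expressly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with one another, provided that a person of ordinary skill in the art can implement the combination. When a combination of technical solutions is contradictory or cannot be implemented, that combination shall be deemed not to exist and falls outside the protection scope claimed by the present application.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of readable storage medium known in the art.
Specific examples are used herein to explain the principles and implementations of the present application. The description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. A person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification shall not be construed as limiting the present application.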

Claims (10)

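  1. A flow control method, characterized by comprising:
    if a sender is to send target data to a receiver based on RDMA, determining the in-flight data volume from the sender to the receiver;
    judging whether the in-flight data volume is less than the BDP from the sender to the receiver;
    if the in-flight data volume is less than the BDP, allowing the sender to send the target data to the receiver based on RDMA;
    if the in-flight data volume is not less than the BDP, disallowing the sender from sending the target data to the receiver based on RDMA and, after waiting for a preset duration, performing the step of determining the in-flight data volume from the sender to the receiver.
  2. The flow control method according to claim 1, characterized in that, before the judging whether the in-flight data volume is less than the BDP from the sender to the receiver, the method further comprises:
    obtaining the maximum bandwidth of the sender and the minimum delay from the sender to the receiver;
    determining the product of the maximum bandwidth and the minimum delay as the BDP.
  3. The flow control method according to claim 2, characterized in that the minimum delay is the sum of the nominal delay values of all devices on the communication link from the sender to the receiver, or the minimum delay is the smallest of N test delay values on the communication link from the sender to the receiver.
  4. The flow control method according to any one of claims 1 to 3, characterized in that the target data is an RDMA write command, an RDMA read command, a send operation command, or the returned data corresponding to the RDMA read command.
  5. The flow control method according to claim 4, characterized in that, if the target data is an RDMA write command, an RDMA read command, or a send operation command, the allowing the sender to send the target data to the receiver based on RDMA comprises:
    splitting the target data into multiple sub-packets according to the MTU, and sending each sub-packet one by one as the target data.
  6. The flow control method according to claim 4, characterized in that, if the target data is a send operation command, then after receiving the send operation command, the receiver reports the send operation command to the host of the receiver.
  7. The flow control method according to claim 1, characterized in that the sender and the receiver are different FPGA accelerator cards.
  8. A flow control apparatus, characterized by comprising:
    a determination module, configured to determine the in-flight data volume from the sender to the receiver if the sender is to send target data to the receiver based on RDMA;
    a judgment module, configured to judge whether the in-flight data volume is less than the BDP from the sender to the receiver;
    a sending module, configured to allow the sender to send the target data to the receiver based on RDMA if the in-flight data volume is less than the BDP;
    a control module, configured to, if the in-flight data volume is not less than the BDP, disallow the sender from sending the target data to the receiver based on RDMA and, after waiting for a preset duration, perform the step of determining the in-flight data volume from the sender to the receiver.
  9. A flow control device, characterized by comprising:
    a memory, configured to store a computer program;
    a processor, configured to execute the computer program to implement the flow control method according to any one of claims 1 to 7.
  10. A readable storage medium, characterized by being configured to store a computer program, wherein the computer program, when executed by a processor, implements the flow control method according to any one of claims 1 to 7.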
PCT/CN2021/121873 2020-12-10 2021-09-29 Flow control method, apparatus, device, and readable storage medium WO2022121469A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011438863.9 2020-12-10
CN202011438863.9A CN112653634A (zh) 2020-12-10 2020-12-10 Flow control method, apparatus, device, and readable storage medium

Publications (1)

Publication Number Publication Date
WO2022121469A1 true WO2022121469A1 (zh) 2022-06-16

Family

ID=75350738

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/121873 WO2022121469A1 (zh) 2020-12-10 2021-09-29 Flow control method, apparatus, device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN112653634A (zh)
WO (1) WO2022121469A1 (zh)

Also Published As

Publication number Publication date
CN112653634A (zh) 2021-04-13

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21902177

Country of ref document: EP

Kind code of ref document: A1