CN118317366B - A load balancing method and related equipment for RDMA network data transmission - Google Patents

A load balancing method and related equipment for RDMA network data transmission Download PDF

Info

Publication number
CN118317366B
CN118317366B CN202410538262.7A CN202410538262A CN118317366B CN 118317366 B CN118317366 B CN 118317366B CN 202410538262 A CN202410538262 A CN 202410538262A CN 118317366 B CN118317366 B CN 118317366B
Authority
CN
China
Prior art keywords
path
data packet
paths
congestion
delay
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410538262.7A
Other languages
Chinese (zh)
Other versions
CN118317366A (en
Inventor
史庆宇
蒋芳雪
刘利枚
贺泓睿
黄璜
李沁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiangjiang Laboratory
Original Assignee
Xiangjiang Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiangjiang Laboratory filed Critical Xiangjiang Laboratory
Priority to CN202410538262.7A priority Critical patent/CN118317366B/en
Publication of CN118317366A publication Critical patent/CN118317366A/en
Application granted granted Critical
Publication of CN118317366B publication Critical patent/CN118317366B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W28/00Network traffic management; Network resource management
    • H04W28/02Traffic management, e.g. flow control or congestion control
    • H04W28/08Load balancing or load distribution
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W28/00Network traffic management; Network resource management
    • H04W28/02Traffic management, e.g. flow control or congestion control
    • H04W28/0284Traffic management, e.g. flow control or congestion control detecting congestion or overload during communication
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W28/00Network traffic management; Network resource management
    • H04W28/02Traffic management, e.g. flow control or congestion control
    • H04W28/0289Congestion control

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

本申请涉及数据传输技术领域,提供了一种RDMA网络数据传输的负载均衡方法及相关设备。该方法包括:获取数据包转发速率、数据包队列总长度、每条路径的数据包到达速率、每条路径的路径数据包队列长度;基于数据包转发速率和所有路径的数据包到达速率计算得到拥塞暂停时间差;当拥塞暂停时间差小于预设时间差时,基于数据包队列总长度获取路径数据包队列长度阈值,并利用路径数据包队列长度阈值从所有路径中确定出拥塞路径;基于数据包队列总长度、数据包转发速率、所有路径的数据包到达速率获取每条路径的路径延迟;利用所有路径延迟从所有路径中确定出目标路径;向上游交换机发送路径拥塞消息。该方法能够提高数据传输的负载均衡性能。

The present application relates to the technical field of data transmission, and provides a load balancing method and related equipment for RDMA network data transmission. The method includes: obtaining the packet forwarding rate, the total length of the packet queue, the packet arrival rate of each path, and the path packet queue length of each path; calculating the congestion pause time difference based on the packet forwarding rate and the packet arrival rate of all paths; when the congestion pause time difference is less than the preset time difference, obtaining the path packet queue length threshold based on the total length of the packet queue, and determining the congested path from all paths using the path packet queue length threshold; obtaining the path delay of each path based on the total length of the packet queue, the packet forwarding rate, and the packet arrival rate of all paths; determining the target path from all paths using all path delays; and sending a path congestion message to the upstream switch. The method can improve the load balancing performance of data transmission.

Description

Load balancing method for RDMA network data transmission and related equipment
Technical Field
The application relates to the technical field of data transmission, in particular to a load balancing method for RDMA network data transmission and related equipment.
Background
Currently, data center remote direct memory access (RDMA, remote Direct MemoryAccess) networks are widely used in distributed storage, high performance computing (HPC, high Performance Computing), distributed AI training, and other scenarios. To further enhance the transmission performance of data center lossless networks, researchers have proposed a series of transmission control and load balancing schemes to optimize RDMA communication performance between data center servers. Different from the optimization of the end-to-end single-path communication process, the load balancing scheme aims to uniformly distribute network traffic among nodes to a plurality of paths, and is important to improving network transmission performance. The current data transmission load balancing method for RDMA data flows in a data center lossless network can reroute the data flows to paths with lower congestion degrees by monitoring the congestion degrees of different paths. In addition, priority flow control (PFC-based Flow Control) is a necessary flow control mechanism in lossless ethernet, and an existing load balancing mechanism predicts PFC pause/resume frames at a switch, so as to implement early response to network congestion, so as to avoid performance loss caused by PFC pause due to congestion.
However, although the existing load balancing mechanism achieves a certain effect in improving the transmission performance of the lossless network, the existing load balancing mechanism still cannot quickly sense the data flow causing the path congestion, which will cause performance loss, including: 1) Load balancing based on link utilization: the path of the PFC pause is identified as an uncongested path due to low link utilization, so that a large number of data flows are rerouted to the path of the PFC pause, and PFC pause and PFC congestion diffusion are aggravated; 2) Load balancing based on link delay: since at least 1 round trip time is needed to monitor the path delay at the source switch, and the path delay changes quickly, the time period of PFC suspension/restoration is small, resulting in low old path link utilization, new path PFC suspension and PFC congestion diffusion when the congested data stream is rerouted to other paths, and its old path PFC suspension may have ended; 3) Load balancing based on downstream switch PFC pause prediction: and predicting PFC pause time according to the queuing length of the port queue data packet of the downstream switch, and actively informing the upstream switch that the path is a congestion path, but because the data flow which leads to the path congestion is not precisely positioned, other normal data flows are easily rerouted to other paths, the transmission efficiency is affected, and extra RDMA data packets are disordered. It can be seen that the existing load balancing mechanism has the problem of low load balancing performance of RDMA network data transmission.
Disclosure of Invention
The application provides a load balancing method and related equipment for RDMA network data transmission, which can solve the problem of low load balancing performance of RDMA network data transmission.
In a first aspect, an embodiment of the present application provides a load balancing method for RDMA network data transmission, where the load balancing method includes:
Acquiring the data packet forwarding rate and the total length of a data packet queue of a downstream switch at the current moment, and the data packet arrival rate of each path and the length of a path data packet queue of each path at the current moment; the path is a link for the upstream switch to send the data packet to the downstream switch;
Calculating to obtain congestion pause time difference of a downstream switch based on the data packet forwarding rate and the data packet arrival rates of all paths;
When the congestion pause time difference is smaller than the preset time difference, acquiring a path data packet queue length threshold value based on the total length of the data packet queue, and determining a plurality of congestion paths from all paths by utilizing the path data packet queue length threshold value; the length of the path data packet queue of the congested path is greater than the threshold value of the length of the path data packet queue;
acquiring the path delay of each path based on the total length of the data packet queue, the data packet forwarding rate and the data packet arrival rates of all paths; the path delay is used for describing the delay of message transmission and the delay of data packet transmission by the path;
determining a multi-label path from all paths by using all path delays; the path delay of each target path is smaller than the path delays of all other paths;
And sending the path congestion message to the upstream switch, so that the upstream switch transmits the data packets sent by all the congestion paths in a preset time period in the future by utilizing all the target paths according to the identification information of each congestion path and the identification information of each target path carried by the path congestion message.
Optionally, calculating the congestion suspension time difference of the downstream switch based on the packet forwarding rate and the packet arrival rates of all paths includes:
by the formula:
Calculating congestion pause time difference delta t of a downstream switch;
Wherein Q PFC represents the total length of the packet queue suspension triggering congestion suspension, t d represents the current delay of message transmission to the upstream switch, n represents the number of paths, vi (t) represents the packet arrival rate of the ith path at the current time, t represents the current time, and vr (t) represents the packet forwarding rate.
Optionally, the obtaining the path packet queue length threshold based on the total length of the packet queue includes:
by the formula:
Calculating a path data packet queue length threshold F cc;
Where Q represents the total length of the packet queue, Q T represents the list that records the packet queue length for all paths and all paths, qInd represents the queue index, and fNum represents the number of paths in the queue index.
Optionally, obtaining the path delay of each path based on the total length of the packet queue, the packet forwarding rate, and the packet arrival rates of all paths includes:
acquiring congestion pause recovery time based on the total length of a data packet queue, the data packet forwarding rate and the data packet arrival rates of all paths;
Based on the congestion suspension resume time, the path delay of each path is obtained.
Optionally, the obtaining congestion suspension resume time based on the total length of the packet queue, the packet forwarding rate, and the packet arrival rates of all paths includes:
by the formula:
Calculating congestion suspension resume time t re;
where Q represents the total length of the packet queue, Indicating the total length of packet queue recovery at the end of congestion suspension, T re_flight indicating the time when the message of congestion suspension end arrives at the upstream switch, Δl indicating the queue length change gradient:
where n represents the number of paths, vi (t) represents the packet arrival rate of the ith path at the current time, t represents the current time, and vr (t) represents the packet forwarding rate.
Optionally, based on the congestion suspension resume time, acquiring the path delay of each path includes:
acquiring the transmission delay of each path;
the following steps are performed for each path respectively:
Judging whether the path is a congestion path or not;
if yes, taking the sum of the congestion pause resume time and the transmission delay of the path as the path delay of the path;
otherwise, the transmission delay of the path is taken as the path delay of the path.
Optionally, determining the multi-label path from all paths using all path delays includes:
arranging all path delays from small to large, and taking the path delays of the preset number in the arrangement result as target path delays;
And respectively taking the path corresponding to each target path delay as a target path.
In a second aspect, an embodiment of the present application provides a load balancing apparatus for RDMA network data transmission, including:
the first acquisition module acquires the data packet forwarding rate, the total length of a data packet queue of a downstream switch at the current moment, and the data packet arrival rate and the path data packet queue length of each path at the current moment; the path is a link for the upstream switch to send the data packet to the downstream switch;
the calculating module is used for calculating and obtaining congestion pause time difference of the downstream switch based on the data packet forwarding rate and the data packet arrival rates of all paths;
The first determining module is used for acquiring a path data packet queue length threshold based on the total length of the data packet queue when the congestion pause time difference is smaller than a preset time difference, and determining a plurality of congestion paths from all paths by utilizing the path data packet queue length threshold; the length of the path data packet queue of the congested path is greater than the threshold value of the length of the path data packet queue;
the second acquisition module acquires the path delay of each path based on the total length of the data packet queue, the data packet forwarding rate and the data packet arrival rates of all paths; the path delay is used for describing the delay of message transmission and the delay of data packet transmission by the path;
the second determining module is used for determining a multi-item target path from all paths by utilizing all path delays; the path delay of each target path is smaller than the path delays of all other paths;
And the sending module sends the path congestion message to the upstream switch so that the upstream switch can transmit the data packets sent by all the congestion paths in a preset time period in the future by utilizing all the target paths according to the identification information of each congestion path and the identification information of each target path carried by the path congestion message.
In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the load balancing method for RDMA network data transmission described above when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the load balancing method for RDMA network data transfer described above.
The scheme of the application has the following beneficial effects:
In the embodiment of the application, the congestion pause time difference of the downstream switch is calculated by acquiring the data packet forwarding rate and the total length of a data packet queue of the downstream switch at the current moment and the data packet arrival rate and the path data packet queue length of each path at the current moment, then based on the data packet forwarding rate and the data packet arrival rates of all paths, when the congestion pause time difference is smaller than the preset time difference, the path data packet queue length threshold is acquired based on the total length of the data packet queue, the path data packet queue length threshold is utilized to determine a plurality of congestion paths from all paths, then the path delay of each path is acquired based on the total length of the data packet queue, the data packet forwarding rate and the data packet arrival rates of all paths, then all path delays are utilized to determine a plurality of target paths from all paths, and finally the upstream switch transmits path congestion information according to the identification information of each congestion path carried by the path congestion information and the identification information of each target path, and the data packets transmitted by all the congestion paths in the future preset time period are utilized by all target paths. The congestion pause time difference is calculated based on the data packet forwarding rate and the data packet arrival rate at the current moment, congestion conditions can be monitored in real time, instantaneity and accuracy of the congestion pause time difference are improved, a target path is determined based on the total length of a data packet queue, the data packet forwarding rate and the data packet arrival rate, the change condition of the data packet queue of the path is considered, the acquired target path is better than that of other paths, meanwhile, path congestion information carrying identification information of the target path and identification information of the congestion path is sent to an upstream switch, the upstream switch can identify the congestion path and the target path at the same time, accurate load balancing work is facilitated for the upstream switch, and therefore load balancing performance of data transmission is improved.
Other advantageous effects of the present application will be described in detail in the detailed description section which follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a load balancing method for RDMA network data transfer according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a load balancing system for RDMA network data transfer according to an embodiment of the present application;
FIG. 3 is a schematic diagram illustrating the operation of a load balancing system for RDMA network data transmission according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a load balancing device for RDMA network data transmission according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in the present description and the appended claims, the term "if" may be interpreted as "when..once" or "in response to a determination" or "in response to detection" depending on the context. Similarly, the phrase "if a determination" or "if a [ described condition or event ] is detected" may be interpreted in the context of meaning "upon determination" or "in response to determination" or "upon detection of a [ described condition or event ]" or "in response to detection of a [ described condition or event ]".
Furthermore, the terms "first," "second," "third," and the like in the description of the present specification and in the appended claims, are used for distinguishing between descriptions and not necessarily for indicating or implying a relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
Aiming at the problem of low load balancing performance of the existing RDMA network data transmission, the embodiment of the application provides a load balancing method of the RDMA network data transmission, which comprises the steps of obtaining the data packet forwarding rate and the total length of a data packet queue of a downstream switch at the current moment, obtaining the arrival rate of the data packet of each path and the length of the data packet queue of the path at the current moment, calculating to obtain a congestion pause time difference based on the data packet forwarding rate and the arrival rate of the data packet of all paths, obtaining a length threshold of the data packet queue of the path based on the total length of the data packet queue when the congestion pause time difference is smaller than a preset time difference, determining a plurality of congestion paths from all paths by utilizing the length threshold of the data packet queue of the path, obtaining the path delay of each path based on the total length of the data packet queue, the data packet forwarding rate and the arrival rate of the data packet of all paths, determining a plurality of target paths from all paths by utilizing all path delays, and finally sending path congestion information to an upstream switch according to the identification information of each congestion path carried by the path congestion information and the identification information of each target path, and utilizing all target paths to carry out future transmission of data packets of all congestion paths within the preset time. The congestion pause time difference is calculated based on the data packet forwarding rate and the data packet arrival rate at the current moment, congestion conditions can be monitored in real time, instantaneity and accuracy of the congestion pause time difference are improved, a target path is determined based on the total length of a data packet queue, the data packet forwarding rate and the data packet arrival rate, the change condition of the data packet queue of the path is considered, the acquired target path is better than that of other paths, meanwhile, path congestion information carrying identification information of the target path and identification information of the congestion path is sent to an upstream switch, the upstream switch can identify the congestion path and the target path at the same time, accurate load balancing work is facilitated for the upstream switch, and therefore load balancing performance of data transmission is improved.
The load balancing method for RDMA network data transmission provided by the application is exemplified as follows.
As shown in fig. 1, the load balancing method for RDMA network data transmission provided by the present application includes the following steps:
and step 11, acquiring the data packet forwarding rate, the total length of a data packet queue of a downstream switch at the current moment, and the data packet arrival rate and the path data packet queue length of each path at the current moment.
The path is a link where an upstream switch sends a packet to a downstream switch.
In some embodiments of the present application, a transmission performance testing tool such as the iperf may be used to obtain the packet forwarding rate of the downstream switch, the total length of the packet queue, the packet arrival rate of each path, and the path packet queue length of each path.
The packet forwarding rate is a rate at which the downstream switch forwards the packet to other devices, the total length of the packet queues is a queue length of the packets received by the downstream switch and not forwarded, the packet arrival rate is a rate at which the packets in the path reach the downstream switch, and the path packet queue length is a queue length of the packets transmitted to the downstream switch and not forwarded in the path.
And step 12, calculating to obtain the congestion pause time difference of the downstream switch based on the data packet forwarding rate and the data packet arrival rates of all paths.
Specifically, the formula is as follows:
A congestion suspension time difference Δt of the downstream switch is calculated.
Wherein Q PFC represents the total length of the packet queue suspension triggering congestion suspension, t d represents the current delay of message transmission to the upstream switch, n represents the number of paths, vi (t) represents the packet arrival rate of the ith path at the current time, t represents the current time, and vr (t) represents the packet forwarding rate.
It should be noted that, the congestion suspension time difference is a time difference from the current time to the time when the flow control algorithm (such as PFC suspension) is triggered, if the current time is 9 points, the congestion suspension time difference is 10 minutes, and then the PFC suspension is triggered at 9 points and 10 minutes.
Illustratively, the congestion suspension time difference may be calculated using computer software of MATLAB, mathematica or the like.
It is worth mentioning that, calculate the congestion pause time difference based on the data packet forwarding speed and the data packet arrival speed at the present moment, can monitor the congestion condition in real time, improve the real-time and accuracy of the congestion pause time difference.
And step 13, when the congestion pause time difference is smaller than the preset time difference, acquiring a path data packet queue length threshold based on the total length of the data packet queue, and determining a plurality of congestion paths from all paths by utilizing the path data packet queue length threshold.
The length of the path data packet queue of the congestion path is greater than the threshold value of the length of the path data packet queue. The preset time difference may be a time difference set in the PFC mechanism.
In some embodiments of the self-application, when the congestion suspension time difference is greater than or equal to the preset time difference, it is indicated that the PFC mechanism is not yet determined to be triggered at this time, for example, the preset time difference is 10 seconds, that is, the PFC mechanism is triggered after 10 seconds, and the congestion suspension time difference is 30 seconds, greater than the preset time difference, and the PFC mechanism is not required to be triggered.
Specifically, the formula is as follows:
Calculating a path data packet queue length threshold F cc;
Where Q represents the total length of the packet queue, Q T represents the list that records the packet queue length for all paths and all paths, qInd represents the queue index, and fNum represents the number of paths in the queue index.
It should be noted that the step of determining a plurality of congestion paths from all paths by using the path packet queue length threshold specifically includes: and judging whether the length of the path data packet queue of the path is larger than the length threshold of the path data packet queue for each path, and if so, taking the path as a congestion path. The above list is a data flow table maintained by the downstream switch, wherein the flow table is a flow table (flowID, qInd, buff_pkts, sendRate, RECEIVERATE, curr_time, td), the flowID is an identifier for distinguishing different paths, qInd is an index for allocating queues, buff_pkts is a queue capacity occupied in a flowID path, sendRate is an arrival rate of a packet in a path to an ingress port of the downstream switch, RECEIVERATE is a forwarding rate of a packet in an ingress port queue of the downstream switch, curr_time is a current time, td is a transmission delay of the path, and Td is a path delay of the path. The queue index includes identification information of all paths transmitting the data packet at the current moment.
The path packet queue length threshold value may be calculated, for example, using computer software such as MATLAB, mathematica's mathematical calculations.
It is worth mentioning that the congestion path is obtained according to the length of the path data packet queue of each path, so that the path needing to be subjected to load balancing at the current moment can be accurately positioned, and the situation of carrying out load balancing on the normal path is avoided.
And step 14, acquiring the path delay of each path based on the total length of the data packet queue, the data packet forwarding rate and the data packet arrival rates of all paths.
The path delay is used to describe the delay of message transmission and the delay of packet transmission by the path.
In some embodiments of the present application, the step of obtaining the path delay of each path based on the total length of the packet queue, the packet forwarding rate, and the packet arrival rates of all paths specifically includes:
And the first step, acquiring congestion pause recovery time based on the total length of the data packet queue, the data packet forwarding rate and the data packet arrival rates of all paths.
Specifically, the formula is as follows:
the congestion suspension resume time t re is calculated.
Where Q represents the total length of the packet queue,Indicating the total length of packet queue recovery at the end of congestion suspension, T re_flight indicating the time when the message of congestion suspension end arrives at the upstream switch, Δl indicating the queue length change gradient:
where n represents the number of paths, vi (t) represents the packet arrival rate of the ith path at the current time, t represents the current time, and vr (t) represents the packet forwarding rate.
And a second step of acquiring the path delay of each path based on the congestion suspension resume time.
First, the transmission delay of each path is acquired.
For example, the transmission delay of each path may be obtained using a transmission performance testing tool such as the iperf.
Then, for each path, the following steps are performed:
And judging whether the path is a congestion path or not. Specifically, if the path is determined to be a congested path in the above step, the path is a congested path.
If yes, the sum of the congestion pause resume time and the transmission delay of the path is taken as the path delay of the path.
Otherwise, the transmission delay of the path is taken as the path delay of the path.
Illustratively, the 1 st path is a congested path, the path delay of the path is 2 seconds of the transmission delay of the path plus 3 seconds of the congestion pause resume time, which is 5 seconds, and the 2 nd path is not a congested path, the path delay of the path is equal to 3 seconds of the transmission delay of the path.
It should be noted that, the congestion suspension resume time is used to describe a time when the congestion can normally work when the congestion is terminated by the path, that is, a total time when the total length of the packet queues of the ingress port of the downstream switch reaches the total length of the packet queue resume, and a message of the congestion suspension end is transmitted to the upstream switch. The total length of the packet queue recovery can be configured by a switch supporting PFC, and the time when the congestion pause end message reaches the upstream switch is related to the hardware devices such as the forwarding rate, the processing delay and the like of the downstream switch.
It is worth mentioning that the congestion condition of the path and the transmission delay of the path are considered when the path delay of each path is obtained, so that the path delay accords with the actual condition of the path, and the accuracy of the path delay is improved.
And 15, determining a multi-label path from all paths by using all path delays.
The path delay of each target path is smaller than the path delays of all other paths.
In some embodiments of the present application, the step of determining the multi-label path from all paths by using all path delays specifically includes:
the first step is to arrange all path delays from small to large, and the path delays of the preset number in the arrangement result are all used as target path delays.
And step two, respectively taking the path corresponding to each target path delay as a target path.
Illustratively, the 3 path delays are ordered: and 3 seconds, 5 seconds and 6 seconds, taking the first two path delays as target path delays, and taking the paths corresponding to the target path delays as target paths, namely taking the paths corresponding to the 3 seconds and the 5 seconds as target paths.
It should be noted that the number of target paths may be set according to the number of congestion paths.
It is worth mentioning that the target path is determined by using the path delay, and the change condition of the data packet queue of the path is considered, so that the acquired target path has better performance than other paths.
And step 16, sending a path congestion message to the upstream switch, so that the upstream switch transmits the data packets sent by all the congestion paths in a preset time period in the future by utilizing all the target paths according to the identification information of each congestion path and the identification information of each target path carried by the path congestion message.
The path congestion message carries identification information of each congestion path and identification information of each standard path. The preset time period is a time for recovering all congestion paths to be normal, for example: and receiving the path congestion message by the upstream switch at the point 9 and 40, transmitting the data packets which are required to be transmitted by all the congestion paths by utilizing all the target paths, wherein the path data packet queue length of all the congestion paths at the point 10 is less than or equal to the path data packet queue length threshold value, which means that all the congestion paths are recovered to be normal at the moment and data transmission can be continued, and 20 minutes between the point 9 and the point 40 are the preset time period.
In some embodiments of the present application, after a downstream switch sends a path congestion message, an upstream switch receives the path congestion message, identifies all congestion paths and all target paths in all paths according to identification information of each congestion path and identification information of each target path, and then uses all target paths to transmit data packets sent to the downstream switch by all congestion paths within a preset time period in the future.
It should be noted that, the step of transmitting the data packets sent to the downstream switch by all the congestion paths in the preset time period in the future using all the target paths may be implemented by using the PFC mechanism, for example, a target path with small path delay is allocated to the congestion path with high priority, and the data packets to be sent by the corresponding congestion path are transmitted using the target path until the upstream switch receives the message that the congestion path is recovered to be normal and ends.
The load balancing method of RDMA network data transmission of the present application is exemplified below in connection with a specific example.
The load balancing system for RDMA network data transmission is shown in fig. 2, where a data flow is sent to an upstream source switch (i.e., an upstream switch above), an output port of the upstream source switch is connected to an input port of a downstream switch, the upstream source switch includes a rerouting module for accepting congestion notification (i.e., a path congestion message above) and rerouting to a specified path (i.e., a target path above), and the downstream destination switch includes a traffic prediction and monitoring module for predicting PFC suspension and resumption, identifying a congestion flow (i.e., a congestion path above) and an optimal path (i.e., a target path above), and sending a congestion notification, where FCN is the congestion notification.
The data transmission process when the load balancing system works is shown in fig. 3, wherein the Spine is an upstream switch layer, the Leaf is a downstream switch layer, a straight line between the upstream switch and the downstream switch represents a path of data transmission, an arrow represents a direction of data transmission or message sending, f 1 represents a 1 st path, f 2 represents a 2 nd path, f n-1 represents an n-1 st path, f n represents an n-th path, Q PFC represents a total length of a data packet queue pause triggering congestion pause,For the above formula of calculating the path packet queue length threshold, the downstream switch ingress port sends the FCN, which is a congestion notification, to the upstream switch egress port, and the upstream switch switches the congestion flow (i.e., the congestion path above) to the new path (i.e., the target path above) according to the FCN.
It is worth mentioning that, calculate the congestion and suspend the time difference based on the data packet forwarding rate and data packet arrival rate of the present moment, can monitor the congestion situation in real time, improve the real-time and accuracy of the congestion and suspend the time difference, confirm the goal route based on the total length of the data packet queue, data packet forwarding rate, data packet arrival rate, take into account the data packet queue change situation of the route, make the performance of the goal route obtained better than other routes, meanwhile, send the route congestion message carrying identification information of the goal route and identification information of the congestion route to the upstream exchanger, make the upstream exchanger discern congestion route and goal route at the same time, facilitate the upstream exchanger to carry on the accurate load balancing work, and then improve the load balancing performance of data transmission.
The following describes an exemplary load balancing device for RDMA network data transmission.
As shown in fig. 4, an embodiment of the present application provides a load balancing apparatus for RDMA network data transmission, where a load balancing apparatus 400 for RDMA network data transmission includes:
A first obtaining module 401, configured to obtain a packet forwarding rate, a total length of a packet queue of a downstream switch at a current time, and an arrival rate of a packet of each path and a length of a packet queue of a path at the current time; the path is a link for the upstream switch to send the data packet to the downstream switch;
The calculating module 402 calculates to obtain congestion pause time difference of the downstream switch based on the data packet forwarding rate and the data packet arrival rates of all paths;
the first determining module 403 obtains a path packet queue length threshold based on the total length of the packet queue when the congestion pause time difference is less than a preset time difference, and determines a plurality of congestion paths from all paths by using the path packet queue length threshold; the length of the path data packet queue of the congested path is greater than the threshold value of the length of the path data packet queue;
A second obtaining module 404, configured to obtain a path delay of each path based on a total length of the packet queue, a packet forwarding rate, and packet arrival rates of all paths; the path delay is used for describing the delay of message transmission and the delay of data packet transmission by the path;
a second determining module 405 determines a multi-label path from all paths using all path delays; the path delay of each target path is smaller than the path delays of all other paths;
And the sending module 406 sends the path congestion message to the upstream switch, so that the upstream switch uses all the target paths to transmit the data packets sent by all the congestion paths in a preset time period in the future according to the identification information of each congestion path and the identification information of each target path carried by the path congestion message.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
As shown in fig. 5, an embodiment of the present application provides a terminal device, a terminal device D10 of which includes: at least one processor D100 (only one processor is shown in fig. 5), a memory D101 and a computer program D102 stored in the memory D101 and executable on the at least one processor D100, the processor D100 implementing the steps in any of the various method embodiments described above when executing the computer program D102.
Specifically, when the processor D100 executes the computer program D102, the congestion suspension time difference of the downstream switch is calculated by obtaining the packet forwarding rate and the total length of the packet queue of the downstream switch at the current time, and the packet arrival rate and the path packet queue length of each path at the current time, and then based on the packet forwarding rate and the packet arrival rates of all paths, when the congestion suspension time difference is smaller than the preset time difference, the congestion suspension time difference is calculated by obtaining the path packet queue length threshold value based on the total length of the packet queue, determining a plurality of congestion paths from all paths by using the path packet queue length threshold value, obtaining the path delay of each path based on the total length of the packet queue, the packet forwarding rate and the packet arrival rate of all paths, determining a plurality of target paths from all paths by using the path delay, and finally sending the path congestion message to the upstream switch, so that the upstream switch can transmit the packets sent by all the target paths within the preset time period in the future according to the identification information of each congestion path carried by the path congestion message and the identification information of each target path. The congestion pause time difference is calculated based on the data packet forwarding rate and the data packet arrival rate at the current moment, congestion conditions can be monitored in real time, instantaneity and accuracy of the congestion pause time difference are improved, a target path is determined based on the total length of a data packet queue, the data packet forwarding rate and the data packet arrival rate, the change condition of the data packet queue of the path is considered, the acquired target path is better than that of other paths, meanwhile, path congestion information carrying identification information of the target path and identification information of the congestion path is sent to an upstream switch, the upstream switch can identify the congestion path and the target path at the same time, accurate load balancing work is facilitated for the upstream switch, and therefore load balancing performance of data transmission is improved.
The Processor D100 may be a central processing unit (CPU, central Processing Unit), the Processor D100 may also be other general purpose processors, digital signal processors (DSP, digital Signal processors), application SPECIFIC INTEGRATED integrated circuits (ASICs), off-the-shelf Programmable gate arrays (FPGAs) or other Programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory D101 may in some embodiments be an internal storage unit of the terminal device D10, for example a hard disk or a memory of the terminal device D10. The memory D101 may also be an external storage device of the terminal device D10 in other embodiments, for example, a plug-in hard disk, a smart memory card (SMC, smart Media Card), a Secure Digital (SD) card, a flash memory card (FLASH CARD) or the like, which are provided on the terminal device D10. Further, the memory D101 may also include both an internal storage unit and an external storage device of the terminal device D10. The memory D101 is used for storing an operating system, an application program, a boot loader (BootLoader), data, other programs, etc., such as program codes of the computer program. The memory D101 may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application also provide a computer readable storage medium storing a computer program which, when executed by a processor, implements steps for implementing the various method embodiments described above.
Embodiments of the present application provide a computer program product enabling a terminal device to carry out the steps of the method embodiments described above when the computer program product is run on the terminal device.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present application may implement all or part of the flow of the method of the above embodiments, and may be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and when the computer program is executed by a processor, the computer program may implement the steps of each of the method embodiments described above. Wherein the computer program comprises computer program code which may be in source code form, object code form, executable file or some intermediate form etc. The computer readable medium may include at least: any entity or device, recording medium, computer Memory, read-Only Memory (ROM), random-access Memory (RAM, random Access Memory), electrical carrier signals, telecommunications signals, and software distribution media capable of carrying computer program code to a load balancing method device/terminal device for RDMA network data transfer. Such as a U-disk, removable hard disk, magnetic or optical disk, etc. In some jurisdictions, computer readable media may not be electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and in part, not described or illustrated in any particular embodiment, reference is made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
While the foregoing is directed to the preferred embodiments of the present application, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present application, and such modifications and adaptations are intended to be comprehended within the scope of the present application.

Claims (5)

1.一种RDMA网络数据传输的负载均衡方法,其特征在于,应用于下游交换机,包括:1. A load balancing method for RDMA network data transmission, characterized in that it is applied to a downstream switch and comprises: 获取在当前时刻所述下游交换机的数据包转发速率、数据包队列总长度,以及在当前时刻每条路径的数据包到达速率和路径数据包队列长度;所述路径为上游交换机向所述下游交换机发送数据包的链路;Obtaining the packet forwarding rate and total length of the packet queue of the downstream switch at the current moment, as well as the packet arrival rate and path packet queue length of each path at the current moment; the path is a link through which the upstream switch sends packets to the downstream switch; 基于所述数据包转发速率和所有路径的数据包到达速率,计算得到所述下游交换机的拥塞暂停时间差;Based on the data packet forwarding rate and the data packet arrival rate of all paths, the congestion pause time difference of the downstream switch is calculated; 当所述拥塞暂停时间差小于预设时间差时,基于所述数据包队列总长度获取路径数据包队列长度阈值,并利用所述路径数据包队列长度阈值从所有路径中确定出多条拥塞路径;所述拥塞路径的路径数据包队列长度大于所述路径数据包队列长度阈值;When the congestion pause time difference is less than the preset time difference, a path data packet queue length threshold is obtained based on the total length of the data packet queue, and a plurality of congested paths are determined from all paths using the path data packet queue length threshold; the path data packet queue length of the congested path is greater than the path data packet queue length threshold; 基于所述数据包队列总长度、所述数据包转发速率、所有路径的数据包到达速率,获取每条路径的路径延迟;所述路径延迟用于描述消息传输的延迟和路径进行数据包传输的延迟;Based on the total length of the data packet queue, the data packet forwarding rate, and the data packet arrival rate of all paths, the path delay of each path is obtained; the path delay is used to describe the delay of message transmission and the delay of data packet transmission by the path; 利用所有路径延迟从所有路径中确定出多条目标路径;每条所述目标路径的路径延迟均小于所有其他路径的路径延迟;Determine multiple target paths from all paths using all path delays; the path delay of each of the target paths is less than the path delays of all other paths; 向上游交换机发送路径拥塞消息,以使所述上游交换机根据所述路径拥塞消息携带的每条拥塞路径的标识信息和每条目标路径的标识信息,利用所有目标路径对所有拥塞路径在未来预设时间段内发送的数据包进行传输;Sending a path congestion message to an upstream switch, so that the upstream switch uses all target paths to transmit data packets sent by all congested paths within a future preset time period according to identification information of each congested path and identification information of each target path carried in the path congestion message; 其中,所述基于所述数据包转发速率和所有路径的数据包到达速率,计算得到所述下游交换机的拥塞暂停时间差,包括:The step of calculating the congestion pause time difference of the downstream switch based on the data packet forwarding rate and the data packet arrival rate of all paths includes: 通过公式:By formula: 计算所述下游交换机的拥塞暂停时间差Δt;Calculating the congestion pause time difference Δt of the downstream switch; 其中,QPFC表示触发拥塞暂停的数据包队列暂停总长度,td表示向上游交换机传输消息的当前延迟,n表示路径的数量,vi(t)表示当前时刻第i条路径的数据包到达速率,t表示当前时刻,vr(t)表示所述数据包转发速率;Wherein, Q PFC represents the total length of the queue pause of packets that trigger congestion pause, t d represents the current delay of transmitting messages to the upstream switch, n represents the number of paths, vi(t) represents the packet arrival rate of the i-th path at the current moment, t represents the current moment, and vr(t) represents the packet forwarding rate; 所述基于所述数据包队列总长度获取路径数据包队列长度阈值,包括:The acquiring a path data packet queue length threshold based on the total length of the data packet queue comprises: 通过公式:By formula: 计算所述路径数据包队列长度阈值FccCalculating the path data packet queue length threshold F cc ; 其中,Q表示所述数据包队列总长度,QT表示记录所有路径数据包队列长度及所有路径的列表,qInd表示队列索引,fNum表示所述队列索引中的路径的数量;Wherein, Q represents the total length of the data packet queue, QT represents a list recording the lengths of all path data packet queues and all paths, qInd represents a queue index, and fNum represents the number of paths in the queue index; 所述基于所述数据包队列总长度、所述数据包转发速率、所有路径的数据包到达速率,获取每条路径的路径延迟,包括:The acquiring the path delay of each path based on the total length of the data packet queue, the data packet forwarding rate, and the data packet arrival rate of all paths includes: 基于所述数据包队列总长度、所述数据包转发速率、所有路径的数据包到达速率,获取拥塞暂停恢复时间;Obtaining a congestion pause recovery time based on the total length of the data packet queue, the data packet forwarding rate, and the data packet arrival rate of all paths; 基于所述拥塞暂停恢复时间,获取每条路径的路径延迟;Based on the congestion pause recovery time, obtaining a path delay of each path; 所述基于所述数据包队列总长度、所述数据包转发速率、所有路径的数据包到达速率,获取拥塞暂停恢复时间,包括:The acquiring the congestion suspension recovery time based on the total length of the data packet queue, the data packet forwarding rate, and the data packet arrival rate of all paths includes: 通过公式:By formula: 计算所述拥塞暂停恢复时间treCalculating the congestion pause recovery time t re ; 其中,Q表示所述数据包队列总长度,表示拥塞暂停结束时的数据包队列恢复总长度,Tre_flight表示拥塞暂停结束的消息到达上游交换机的时间,ΔL表示队列长度变化梯度:Wherein, Q represents the total length of the packet queue, It indicates the total length of the packet queue restored when the congestion pause ends, Tre_flight indicates the time when the message of the end of the congestion pause arrives at the upstream switch, and ΔL indicates the gradient of the queue length change: 其中,n表示路径的数量,vi(t)表示当前时刻第i条路径的数据包到达速率,t表示当前时刻,vr(t)表示所述数据包转发速率;Wherein, n represents the number of paths, vi(t) represents the packet arrival rate of the i-th path at the current moment, t represents the current moment, and vr(t) represents the packet forwarding rate; 所述基于所述拥塞暂停恢复时间,获取每条路径的路径延迟,包括:The acquiring the path delay of each path based on the congestion suspension recovery time includes: 获取每条路径的传输延迟;Get the transmission delay of each path; 分别针对每条路径,进行以下步骤:For each path, perform the following steps: 判断所述路径是否为拥塞路径;Determining whether the path is a congested path; 若是,则将所述拥塞暂停恢复时间与所述路径的传输延迟之和作为所述路径的路径延迟;If yes, taking the sum of the congestion suspension recovery time and the transmission delay of the path as the path delay of the path; 否则,将所述路径的传输延迟作为所述路径的路径延迟。Otherwise, the transmission delay of the path is taken as the path delay of the path. 2.根据权利要求1所述的负载均衡方法,其特征在于,所述利用所有路径延迟从所有路径中确定出多条目标路径,包括:2. The load balancing method according to claim 1, wherein the step of determining multiple target paths from all paths using all path delays comprises: 将所有路径延迟由小到大进行排列,将排列结果中前预设个数的路径延迟均作为目标路径延迟;Arrange all path delays from small to large, and use the path delays of a preset number in the arrangement result as the target path delay; 分别将每个所述目标路径延迟对应的路径作为一目标路径。A path corresponding to each target path delay is respectively used as a target path. 3.一种RDMA网络数据传输的负载均衡装置,其特征在于,包括:3. A load balancing device for RDMA network data transmission, characterized in that it includes: 第一获取模块,获取在当前时刻下游交换机的数据包转发速率、数据包队列总长度,以及在当前时刻每条路径的数据包到达速率和路径数据包队列长度;所述路径为上游交换机向所述下游交换机发送数据包的链路;The first acquisition module acquires the packet forwarding rate and the total length of the packet queue of the downstream switch at the current moment, as well as the packet arrival rate and the length of the path packet queue of each path at the current moment; the path is a link through which the upstream switch sends a packet to the downstream switch; 计算模块,基于所述数据包转发速率和所有路径的数据包到达速率,计算得到所述下游交换机的拥塞暂停时间差;A calculation module, based on the data packet forwarding rate and the data packet arrival rate of all paths, calculates the congestion pause time difference of the downstream switch; 第一确定模块,当所述拥塞暂停时间差小于预设时间差时,基于所述数据包队列总长度获取路径数据包队列长度阈值,并利用所述路径数据包队列长度阈值从所有路径中确定出多条拥塞路径;所述拥塞路径的路径数据包队列长度大于所述路径数据包队列长度阈值;A first determination module, when the congestion pause time difference is less than a preset time difference, obtains a path data packet queue length threshold based on the total length of the data packet queue, and determines multiple congested paths from all paths using the path data packet queue length threshold; the path data packet queue length of the congested path is greater than the path data packet queue length threshold; 第二获取模块,基于所述数据包队列总长度、所述数据包转发速率、所有路径的数据包到达速率,获取每条路径的路径延迟;所述路径延迟用于描述消息传输的延迟和路径进行数据包传输的延迟;A second acquisition module, based on the total length of the data packet queue, the data packet forwarding rate, and the data packet arrival rate of all paths, acquires the path delay of each path; the path delay is used to describe the delay of message transmission and the delay of path data packet transmission; 第二确定模块,利用所有路径延迟从所有路径中确定出多条目标路径;每条所述目标路径的路径延迟均小于所有其他路径的路径延迟;A second determination module determines a plurality of target paths from all paths using all path delays; the path delay of each of the target paths is smaller than the path delays of all other paths; 发送模块,向上游交换机发送路径拥塞消息,以使所述上游交换机根据所述路径拥塞消息携带的每条拥塞路径的标识信息和每条目标路径的标识信息,利用所有目标路径对所有拥塞路径在未来预设时间段内发送的数据包进行传输;A sending module, sending a path congestion message to an upstream switch, so that the upstream switch uses all target paths to transmit data packets sent by all congested paths within a future preset time period according to the identification information of each congested path and the identification information of each target path carried in the path congestion message; 其中,所述计算模块具体用于:Wherein, the calculation module is specifically used for: 通过公式:By formula: 计算所述下游交换机的拥塞暂停时间差Δt;Calculating the congestion pause time difference Δt of the downstream switch; 其中,QPFC表示触发拥塞暂停的数据包队列暂停总长度,td表示向上游交换机传输消息的当前延迟,n表示路径的数量,vi(t)表示当前时刻第i条路径的数据包到达速率,t表示当前时刻,vr(t)表示所述数据包转发速率;Wherein, Q PFC represents the total length of the queue pause of packets that trigger congestion pause, t d represents the current delay of transmitting messages to the upstream switch, n represents the number of paths, vi(t) represents the packet arrival rate of the i-th path at the current moment, t represents the current moment, and vr(t) represents the packet forwarding rate; 所述第一确定模块具体用于实现:The first determination module is specifically used to implement: 通过公式:By formula: 计算所述路径数据包队列长度阈值FccCalculating the path data packet queue length threshold F cc ; 其中,Q表示所述数据包队列总长度,QT表示记录所有路径数据包队列长度及所有路径的列表,qInd表示队列索引,fNum表示所述队列索引中的路径的数量;Wherein, Q represents the total length of the data packet queue, Q T represents a list recording the lengths of all path data packet queues and all paths, qInd represents a queue index, and fNum represents the number of paths in the queue index; 所述第二获取模块具体用于:The second acquisition module is specifically used for: 基于所述数据包队列总长度、所述数据包转发速率、所有路径的数据包到达速率,获取拥塞暂停恢复时间;Obtaining a congestion pause recovery time based on the total length of the data packet queue, the data packet forwarding rate, and the data packet arrival rate of all paths; 基于所述拥塞暂停恢复时间,获取每条路径的路径延迟;Based on the congestion pause recovery time, obtaining a path delay of each path; 所述基于所述数据包队列总长度、所述数据包转发速率、所有路径的数据包到达速率,获取拥塞暂停恢复时间,包括:The acquiring the congestion suspension recovery time based on the total length of the data packet queue, the data packet forwarding rate, and the data packet arrival rate of all paths includes: 通过公式:By formula: 计算所述拥塞暂停恢复时间treCalculating the congestion pause recovery time t re ; 其中,Q表示所述数据包队列总长度,表示拥塞暂停结束时的数据包队列恢复总长度,Tre_flight表示拥塞暂停结束的消息到达上游交换机的时间,ΔL表示队列长度变化梯度:Wherein, Q represents the total length of the packet queue, It indicates the total length of the packet queue restored when the congestion pause ends, Tre_flight indicates the time when the message of the end of the congestion pause arrives at the upstream switch, and ΔL indicates the gradient of the queue length change: 其中,n表示路径的数量,vi(t)表示当前时刻第i条路径的数据包到达速率,t表示当前时刻,vr(t)表示所述数据包转发速率;Wherein, n represents the number of paths, vi(t) represents the packet arrival rate of the i-th path at the current moment, t represents the current moment, and vr(t) represents the packet forwarding rate; 所述基于所述拥塞暂停恢复时间,获取每条路径的路径延迟,包括:The acquiring the path delay of each path based on the congestion suspension recovery time includes: 获取每条路径的传输延迟;Get the transmission delay of each path; 分别针对每条路径,进行以下步骤:For each path, perform the following steps: 判断所述路径是否为拥塞路径;Determining whether the path is a congested path; 若是,则将所述拥塞暂停恢复时间与所述路径的传输延迟之和作为所述路径的路径延迟;If yes, taking the sum of the congestion suspension recovery time and the transmission delay of the path as the path delay of the path; 否则,将所述路径的传输延迟作为所述路径的路径延迟。Otherwise, the transmission delay of the path is taken as the path delay of the path. 4.一种终端设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,其特征在于,所述处理器执行所述计算机程序时实现如权利要求1至2任一项所述的RDMA网络数据传输的负载均衡方法。4. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, the load balancing method for RDMA network data transmission as described in any one of claims 1 to 2 is implemented. 5.一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现如权利要求1至2任一项所述的RDMA网络数据传输的负载均衡方法。5. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the load balancing method for RDMA network data transmission as described in any one of claims 1 to 2.
CN202410538262.7A 2024-04-30 2024-04-30 A load balancing method and related equipment for RDMA network data transmission Active CN118317366B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410538262.7A CN118317366B (en) 2024-04-30 2024-04-30 A load balancing method and related equipment for RDMA network data transmission

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410538262.7A CN118317366B (en) 2024-04-30 2024-04-30 A load balancing method and related equipment for RDMA network data transmission

Publications (2)

Publication Number Publication Date
CN118317366A CN118317366A (en) 2024-07-09
CN118317366B true CN118317366B (en) 2024-11-22

Family

ID=91727196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410538262.7A Active CN118317366B (en) 2024-04-30 2024-04-30 A load balancing method and related equipment for RDMA network data transmission

Country Status (1)

Country Link
CN (1) CN118317366B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119865461B (en) * 2025-03-24 2025-06-24 湖南工商大学 A load balancing method based on adaptive adjustment and related equipment
CN121173746B (en) * 2025-11-19 2026-03-03 湖南工商大学 A congestion-aware load balancing method and related equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022227A (en) * 2022-06-12 2022-09-06 长沙理工大学 Data transmission method and system based on circulation or rerouting in data center network
CN115134302A (en) * 2022-06-27 2022-09-30 长沙理工大学 Flow isolation method for avoiding head of line congestion and congestion diffusion in lossless network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114513467B (en) * 2022-04-18 2022-07-15 苏州浪潮智能科技有限公司 A method and device for network traffic load balancing in a data center

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022227A (en) * 2022-06-12 2022-09-06 长沙理工大学 Data transmission method and system based on circulation or rerouting in data center network
CN115134302A (en) * 2022-06-27 2022-09-30 长沙理工大学 Flow isolation method for avoiding head of line congestion and congestion diffusion in lossless network

Also Published As

Publication number Publication date
CN118317366A (en) 2024-07-09

Similar Documents

Publication Publication Date Title
CN118317366B (en) A load balancing method and related equipment for RDMA network data transmission
US8014281B1 (en) Systems and methods for limiting the rates of data to/from a buffer
CN113994642A (en) On-line performance monitoring
CN108243116B (en) Flow control method and switching equipment
US9614777B2 (en) Flow control in a network
WO2020022209A1 (en) Network control device and network control method
JPH03174848A (en) Delay base rush evading method in computer network and device
KR101333856B1 (en) Method of managing a traffic load
US8929213B2 (en) Buffer occupancy based random sampling for congestion management
CN113162862A (en) Congestion control method and device
US20040032826A1 (en) System and method for increasing fairness in packet ring networks
US6771601B1 (en) Network switch having source port queuing and methods, systems and computer program products for flow level congestion control suitable for use with a network switch having source port queuing
CN119013954A (en) Notification-based load balancing in a network
GB2497846A (en) Hybrid arrival-occupancy based congestion management
CN108540395A (en) Without the congestion judgment method lost in network
CN112737940B (en) A data transmission method and device
CN113572655B (en) Congestion detection method and system for non-lost network
US12348433B2 (en) Method and system for dynamic quota-based congestion management
CN107689967B (en) DDoS attack detection method and device
CN118524065B (en) Congestion control method and device, storage medium and electronic equipment
US7500012B2 (en) Method for controlling dataflow to a central system from distributed systems
CN121173746B (en) A congestion-aware load balancing method and related equipment
US11025519B2 (en) Systems, methods and computer-readable media for external non-intrusive packet delay measurement
US11924106B2 (en) Method and system for granular dynamic quota-based congestion management
CN116032852B (en) Flow control method, device, system, equipment and storage medium based on session

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant