CN114844837A - Congestion control mechanism and control device based on time delay in multi-service queue environment - Google Patents
- Publication number
- CN114844837A (application CN202210439516.0A)
- Authority
- CN
- China
- Prior art keywords
- congestion
- transmission
- data packet
- sending
- window
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/28—Flow control; Congestion control in relation to timing considerations
- H04L47/283—Flow control; Congestion control in relation to timing considerations in response to processing delays, e.g. caused by jitter or round trip time [RTT]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/50—Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a delay-based congestion control mechanism and control device for a multi-service queue environment. The device comprises three parts: a congestion detector, an idle bandwidth detector and a flow controller. Building on HPCC (High Precision Congestion Control), the invention fuses RTT information into congestion control: it obtains a congestion signal from changes in the RTT of transmitted data packets and, when congestion occurs, uses the congestion information carried by the RTT to adjust the sending rate accurately and quickly. Simulation results show that the invention resolves two problems of the prior art in a multi-queue environment, namely bandwidth allocation that violates the queue scheduling intent and uneven bandwidth allocation when heterogeneous congestion control algorithms coexist. It achieves high throughput, low delay, weighted fair sharing and performance isolation among queues in a multi-service queue environment, while retaining high throughput, low delay and fairness in a single-queue environment.
Description
Technical Field
The invention belongs to the technical field of data center networks, and particularly relates to a congestion control mechanism based on time delay in a multi-service queue environment.
Background
Data centers run a wide variety of services, which demand high-bandwidth, low-delay transmission from the data center network. The key to meeting these demands is a well-designed end-to-end congestion control mechanism, because congestion control is the main mechanism for avoiding buffer build-up and packet loss under high traffic load. If it fails frequently, congestion reduces the data transmission rate, increases transmission delay and increases packet loss. Fallback mechanisms such as Priority-based Flow Control (PFC) or packet retransmission can themselves cause instability and performance degradation, and in severe cases can paralyze the network.
High Precision Congestion Control (HPCC) is a congestion control scheme proposed by Alibaba. HPCC uses the fine-grained link information provided by the In-band Network Telemetry (INT) function of switches to redesign congestion control, achieving high bandwidth, low delay and high stability in a single-queue network environment. Its main idea is to use the accurate link load information provided by INT to detect congestion and to calculate precisely the target rate when congestion is detected or when free bandwidth exists. HPCC is a sender-driven congestion control algorithm in which the receiver acknowledges every packet it receives. The main workflow is as follows. Each switch on the path a data packet traverses uses its INT function to insert link information into the packet header. This information reflects the current load of the packet's egress port, including the queue length, a timestamp, the link bandwidth capacity and the number of bytes transmitted. On receiving a data packet, the receiver extracts the link metadata inserted by the switches, copies it into the acknowledgement packet (ACK) it generates, and returns the ACK to the sender. Each time the sender receives an ACK carrying network load information, it decides how to adjust its traffic according to that information.
The sender uses the queue length, the number of bytes transmitted and the other load information to calculate the utilization of each link on the path. If the utilization exceeds a threshold, the link is judged to be congested, and the sender adjusts its sending rate precisely according to the actual utilization and the target state of the link, without needing multiple iterations to converge to the final target rate. When a link is idle, HPCC uses the same method to adjust the sending rate precisely and quickly take up the available bandwidth.
However, current congestion control mechanisms in data center networks (e.g., HPCC) assume a single service queue, i.e., one queue per switch port. In practice, data center networks often use a multi-service queue structure: each switch port has several queues, different services are assigned to different queues to isolate their performance, and the queues are scheduled by algorithms such as strict priority or round robin. Although existing congestion control mechanisms perform well in a single-service queue environment, applying them directly to a multi-service queue environment leads to unreasonable bandwidth allocation, mainly in the following two respects:
(1) Bandwidth allocation violates the queue scheduling intent. In a multi-service queue environment, when traffic assigned different weights coexists, bandwidth is not distributed according to the queue weights, so weighted fair sharing among queues is not achieved.
(2) Bandwidth allocation is uneven when heterogeneous congestion control algorithms coexist. Different services in the data center network generate different types of traffic, such as RDMA traffic based on HPCC and TCP traffic based on congestion control algorithms such as Reno. When these types of traffic coexist, one type can take bandwidth from another, and bandwidth is not shared fairly among them.
The root cause of unreasonable bandwidth allocation is as follows. To meet low-latency requirements, a congestion control mechanism usually keeps switch queue lengths close to 0, so bandwidth allocation in the network is determined mainly by the congestion control mechanism. But current popular congestion control mechanisms lack fine-grained link information and cannot obtain the congestion state of each individual queue. Take HPCC as an example: it is designed to use the switch's INT function to obtain the queue length, the number of bytes transmitted on the link and similar information, and from these to calculate link utilization. Research shows, however, that in multi-queue switches from mainstream vendors the INT function provides only totals, i.e., the sum of the lengths of all queues at a port and the total number of bytes transmitted on the link. It is therefore impossible to tell which queue or queues are building up and causing congestion, which introduces unfairness.
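The limitation can be made concrete with a small sketch. The numbers and names below are hypothetical and chosen only for illustration: one egress port has two service queues, of which only queue 0 is backed up. A port-level INT report exposes just the summed queue length, so flows in both queues would see the same load, whereas the queuing delay (and hence the RTT) differs per queue.

```python
# Hypothetical illustration: one egress port with two service queues.
# All numbers and names are assumptions chosen for this example.

LINK_BPS = 25e9  # link capacity in bits per second

# Per-queue backlog in bytes: queue 0 is backed up, queue 1 is empty.
queue_bytes = {0: 100_000, 1: 0}


def int_reported_qlen(queues):
    """Port-level INT view: only the sum over all queues is visible."""
    return sum(queues.values())


def queuing_delay_s(queues, qid):
    """Rough delay a packet entering queue `qid` waits behind its own backlog.

    Simplification: the backlog drains at full link rate, which holds here
    because the other queue is empty.
    """
    return queues[qid] * 8 / LINK_BPS


total = int_reported_qlen(queue_bytes)      # same value for every flow
delay_q0 = queuing_delay_s(queue_bytes, 0)  # flows in the congested queue
delay_q1 = queuing_delay_s(queue_bytes, 1)  # flows in the empty queue

print(f"INT total qlen: {total} bytes")
print(f"queue 0 delay: {delay_q0 * 1e6:.1f} us, queue 1 delay: {delay_q1 * 1e6:.1f} us")
```

Under these assumptions the aggregate value cannot separate the two queues, while the per-queue delay can.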
Disclosure of Invention
The invention aims to provide a congestion control mechanism and a control device based on time delay in a multi-service queue environment in a data center network, which can meet the requirements of various services on high-bandwidth and low-time-delay network services on one hand, and can realize the performance isolation of flow in different queues on the other hand, thereby meeting the respective performance requirements of different services.
In order to achieve the above object, the congestion control mechanism based on delay in a multi-service queue environment according to the present invention comprises the following steps:
Step 1: when sending a data packet, the sending end records the current sending timestamp and extracts the five-tuple from the packet. The original data packet is routed to the receiving end; along the way, each switch inserts INT metadata into the packet header and forwards the packet to the next hop. After receiving the data packet, the receiving end extracts the INT metadata, inserts it into the ACK it generates, and returns the ACK carrying the INT metadata to the sending end;
Step 2: after receiving the ACK carrying the INT metadata, the sending end extracts the INT metadata from the ACK and looks up, by five-tuple, the sending time of the data packet that the ACK acknowledges. It then detects whether the transmission path of the data packet is congested from the packet's round-trip delay and, when no congestion is detected, detects whether idle bandwidth exists on the path;
Step 3: according to the detection results of step 2, determine whether and how to adjust the sending window;
Step 4: calculate the updated sending window and update the sending window.
Further, step 2 comprises the following steps:
Step 2.1: calculate the round-trip time (RTT) of the data packet from the current timestamp and the recorded sending timestamp;
Step 2.2: normalize the RTT to obtain the normalized result u_rtt;
Step 2.3: use u_rtt to judge whether congestion has occurred:
if u_rtt ≥ 1, the transmission path of the data packet is congested;
otherwise, the transmission path is not congested, and whether available idle bandwidth exists in the network is then detected.
Further, in step 2.3, detecting whether available idle bandwidth exists in the network comprises the following steps:
Step 2.3.1: calculate the number of bytes in flight on each link of the path from the INT metadata and normalize it, giving the normalized in-flight byte count u_i for the i-th link; then take the maximum over all links of the same path, U_inflight = max_i(u_i);
Step 2.3.2: compare U_inflight with the target link utilization η:
if U_inflight < η, the path has idle bandwidth; otherwise the path has no idle bandwidth.
Further, step 3 comprises the following steps:
Step 3.1: if congestion is detected, reduce the sending window to control the congestion. In the formula for reducing the sending window, W_new is the calculated updated window size, W_c is the reference window, and W_AI is the additive-increase term;
Step 3.2: if no congestion is detected and idle bandwidth is detected in the network, perform up to a maximum number of additive increases, each computed as W_new = W_c + W_AI.
Further, in step 3.2, if idle bandwidth still exists in the network after the maximum number of additive increases, the sending window is increased multiplicatively.
Further, in the multiplicative increase formula, W is the updated window, W_c is the reference window, W_AI is the additive-increase term, U_inflight is the maximum normalized in-flight byte count over all links of the same path, and η is the target link utilization.
Further, step 4 comprises the following steps:
Step 4.1: record in lastUpdateSeq the sequence number of the first packet sent with the new reference window W_c. If the sequence number of the ACK is greater than lastUpdateSeq, update the reference window W_c; otherwise leave it unchanged. The reference window is updated as:
W_c = W
Step 4.2: update the sending window: W = W_new, where W is the current sending window;
Step 4.3: calculate and record the sending rate and the INT metadata fed back by the current ACK, for use in calculating the transmission rate and the normalized in-flight byte count when the next acknowledgement packet of the flow is processed.
A delay-based congestion control device for a multi-service queue environment comprises a congestion detector, an idle bandwidth detector and a flow controller;
the congestion detector calculates the round-trip delay of each data packet transmitted in the network, generates a per-flow congestion signal from it, and sends that signal to the flow controller;
the idle bandwidth detector obtains the number of bytes in flight on each link in the network from the INT metadata attached to the data packet, detects from it whether available bandwidth exists in the network, and sends the in-flight byte count and the detection result to the flow controller;
the flow controller reduces the sending window based on the round-trip delay when the network is congested, and increases the sending window based on the maximum normalized in-flight byte count over all links when a link has idle bandwidth.
Compared with the prior art, the invention has at least the following beneficial technical effects:
The invention performs congestion detection and congestion control by fusing round-trip time (RTT) information on the basis of HPCC. It obtains a congestion signal from changes in the RTT of transmitted data packets and, when congestion occurs, uses the congestion information carried by the RTT to adjust the sending window, and hence the sending rate, accurately and quickly. The resulting delay-based congestion control mechanism for a multi-service queue environment, MECC, has the following advantages:
First, the invention obtains the congestion signal from changes in RTT in a multi-service queue environment: it monitors the RTT of every data packet in real time and generates a congestion signal when the RTT rises beyond a set degree. Detecting congestion from RTT yields a finer-grained, per-queue congestion signal, so the queue or queues whose build-up causes the congestion can be identified, and congestion is detected accurately and promptly.
Second, the invention uses the congestion information carried by the RTT to guide adjustment of the sending window in a multi-service queue environment. When network congestion is detected, the rise in RTT reflects the rise in queuing delay on the congested link, so the size of the RTT increase reflects the degree of congestion. The sending rate of the flows in the congested queue is adjusted accordingly, which lets the sending rate converge quickly without reducing the sending rate of flows in non-congested queues.
Further, when a link has idle bandwidth, the sending window is adjusted according to the link utilization obtained from INT information so that the available bandwidth is taken up promptly. This gives reasonable bandwidth allocation in a multi-service queue environment and resolves the prior-art problems that, in a multi-queue environment, bandwidth allocation violates the queue scheduling intent and is uneven when heterogeneous congestion control algorithms coexist. High throughput, low delay, weighted fair sharing and performance isolation among queues are achieved in a multi-service queue environment, while high throughput, low delay and fairness are retained in a single-queue environment.
Drawings
FIG. 1 is a block diagram of a MECC;
fig. 2 is a schematic diagram of the working flow of MECC after receiving an ACK;
FIG. 3 is MECC algorithm pseudo-code;
FIG. 4 is a diagram of a topology used in a simulation environment;
FIG. 5 is a diagram of MECC performance test results in a multi-queue environment;
fig. 6 is a diagram of the results of the MECC performance test in a single queue environment.
Detailed Description
To make the objects and technical solutions of the present invention clearer and easier to understand, the invention is described in further detail below with reference to the drawings and examples. The specific examples are illustrative only and are not intended to limit the invention.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "up", "down", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used only for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention. Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless otherwise specified. In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
The invention provides a congestion control mechanism MECC based on time delay in a multi-service queue environment, which integrates and uses RTT information to control congestion on the basis of HPCC.
According to the invention, on the basis of HPCC, the RTT information is fused for carrying out congestion detection and congestion adjustment, a congestion signal is obtained by monitoring the variation of RTT of a transmission data packet, and accurate congestion control is carried out by combining the congestion information carried by INT and RTT; when detecting that the link has the idle bandwidth, adjusting a sending window to timely occupy the available bandwidth according to the link use condition obtained from the INT information, thereby realizing reasonable distribution of the bandwidth in a multi-service queue environment and simultaneously keeping the performance advantages of high bandwidth, low time delay and fairness in a single-service queue network environment.
The congestion control device structure of the present invention is shown in fig. 1, and comprises 3 parts: a congestion detector, an idle bandwidth detector and a flow controller. The congestion detector, the idle bandwidth detector and the traffic controller are all located at the sending end. The congestion detector generates a congestion signal based on each flow by calculating RTT of each data packet transmitted in the network, and the RTT can be used for guiding the adjustment of a sending window when congestion occurs in the network; the idle bandwidth detector processes INT information attached to a data packet by utilizing an INT function of the switch to obtain the number of bytes in transmission of each link in the network, detects whether available bandwidth exists in the network according to the number of bytes in transmission, and guides the adjustment of a sending window when the available bandwidth exists in the network; the flow controller reduces a sending window of a flow based on RTT when the network is congested, increases the sending window based on the number of bytes in transmission of the link using the largest bandwidth when the link is idle, and adjusts based on the number of bytes in transmission of the link using the largest bandwidth to prevent congestion of other links.
The invention fuses RTT information with INT information to detect and respond to congestion. It is explained in detail below, with reference to figs. 2 and 3, by following the complete process in which a data packet is sent from the sending end to the receiving end and its ACK is received by the sending end:
The invention is a window-based congestion control mechanism that acts at the sending end: the sending end maintains a sending window for each flow, and the sending of data packets is governed by that window. A packet undergoes the following processing steps in the network:
step 1: the sending end generates an original data packet: when a sending end sends a data packet, the sending end records the current sending time stamp, and simultaneously extracts quintuple information from the data packet as the unique identifier of each stream; the original data packet is sent to the receiving end through the route, and in the process of data packet transmission, every time the data packet passes through one switch, the switch supporting INT collects required information according to the indication in the INT header, and the INT metadata is inserted into the INT header and continuously transmitted to the next hop. And after receiving the data packet, the receiving end extracts all INT metadata, then the INT metadata is inserted into the generated ACK, and the ACK returns to the transmitting end.
The INT metadata includes a queue number, a queue length, time stamps of the ingress and egress ports, the number of bytes transmitted by the ingress and egress ports, link bandwidth capacity, and the like, and can be collected by a forwarding device such as a switch supporting INT according to an indication in the INT header.
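The bookkeeping in step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the field names in `IntHop`, the `SendLog` class and the use of a sequence number alongside the five-tuple as lookup key are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (src_ip, src_port, dst_ip, dst_port, protocol); identifies one flow.
FiveTuple = Tuple[str, int, str, int, str]


@dataclass
class IntHop:
    """One switch's INT record; field names are hypothetical."""
    queue_id: int
    qlen: int           # queue length, bytes
    ts_in: float        # ingress-port timestamp, seconds
    ts_out: float       # egress-port timestamp, seconds
    tx_bytes_in: int    # bytes transmitted by the ingress port
    tx_bytes_out: int   # bytes transmitted by the egress port
    bandwidth: float    # link capacity, bytes per second


class SendLog:
    """Sender-side table of send timestamps.

    Assumption: entries are keyed by (five-tuple, sequence number) so that
    several outstanding packets of one flow can be told apart.
    """

    def __init__(self) -> None:
        self._sent: Dict[Tuple[FiveTuple, int], float] = {}

    def on_send(self, flow: FiveTuple, seq: int, now: float) -> None:
        self._sent[(flow, seq)] = now

    def on_ack(self, flow: FiveTuple, seq: int) -> Optional[float]:
        # Returns the recorded send time, or None if the packet is unknown.
        return self._sent.pop((flow, seq), None)


log = SendLog()
flow = ("10.0.0.1", 4791, "10.0.0.2", 4791, "UDP")
log.on_send(flow, seq=1, now=10.0)
ts_send = log.on_ack(flow, seq=1)
```

The value returned by `on_ack` plays the role of the recorded sending timestamp that the RTT calculation needs.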
Step 2: after receiving the ACK carrying the INT metadata, the sending end extracts the INT metadata from the ACK and looks up, by five-tuple, the sending time of the data packet that the ACK acknowledges. The congestion detector first detects whether the transmission path of the data packet is congested by performing the following operation:
MeasureRTT(ack): here ack is the acknowledgement packet currently being processed by the sender. The operation measures the RTT of the data packet and detects whether congestion has occurred on the path, in the following steps:
Step 2.1: calculate the round-trip time (RTT) of the data packet from the current timestamp and the recorded sending timestamp. With ack.tsSend the sending timestamp recorded when the sending end sent the original data packet and tsNow the current time:
RTT = tsNow - ack.tsSend
Step 2.2: normalize the calculated RTT to obtain the normalized result u_rtt:
u_rtt = RTT / (baseRTT + Threshold_rtt)
Here baseRTT is the minimum round-trip delay between the sender and the receiver of the data packet; in a specific implementation, the minimum round-trip delay measured so far is used instead. Threshold_rtt is a threshold on the change in RTT and can be set as required; its size expresses how much queue build-up congestion detection tolerates. The difference between RTT and baseRTT reflects the queuing delay during transmission. The EWMA (exponentially weighted moving average) at line 26 in fig. 3 filters out noise caused by timer inaccuracy and transient queues.
Step 2.3: use the normalized result u_rtt to judge whether congestion has occurred.
If u_rtt ≥ 1, congestion has occurred on the transmission path of the data packet; the congestion detector issues a congestion signal and passes u_rtt to the traffic controller for adjusting the sending window.
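Steps 2.1 to 2.3 can be sketched as below. The class and method names are hypothetical, times are in microseconds, and the EWMA smoothing mentioned in the text is left out for brevity; baseRTT is tracked as the smallest RTT measured so far, as the description suggests.

```python
class CongestionDetector:
    """Sketch of RTT-based congestion detection (names are assumptions)."""

    def __init__(self, threshold_rtt: float) -> None:
        self.threshold_rtt = threshold_rtt  # tolerated RTT increase
        self.base_rtt = None                # minimum RTT measured so far

    def measure_rtt(self, ts_now: float, ts_send: float):
        # Step 2.1: RTT = tsNow - ack.tsSend
        rtt = ts_now - ts_send
        # Track the smallest RTT seen as a stand-in for baseRTT.
        if self.base_rtt is None or rtt < self.base_rtt:
            self.base_rtt = rtt
        # Step 2.2: u_rtt = RTT / (baseRTT + Threshold_rtt)
        u_rtt = rtt / (self.base_rtt + self.threshold_rtt)
        # Step 2.3: congestion is signalled when u_rtt >= 1.
        return u_rtt, u_rtt >= 1.0


det = CongestionDetector(threshold_rtt=20.0)
first = det.measure_rtt(ts_now=60.0, ts_send=0.0)      # RTT 60, sets baseRTT
second = det.measure_rtt(ts_now=180.0, ts_send=100.0)  # RTT 80
third = det.measure_rtt(ts_now=270.0, ts_send=200.0)   # RTT 70
```

With a base RTT of 60 and a threshold of 20, an RTT of 80 is exactly at the congestion boundary, while 70 is below it.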
If the congestion detector does not detect a congested link in the network, it must further be determined whether available idle bandwidth exists in the network. The idle bandwidth detector does this using the detection mechanism of HPCC, performing the following operation:
MeasureInflight(ack): here ack is the acknowledgement packet currently being processed by the sender. The operation measures the number of bytes in flight on each link of the data packet's path and, from that, detects whether idle bandwidth exists on the path. It is carried over from HPCC and comprises the following steps:
Step 2.3.1: estimate the number of bytes in flight on each link of the path from the INT metadata and normalize it, giving the normalized in-flight byte count u_i for the i-th link; then take the maximum over all links of the same path, U_inflight:
U_inflight = max_i(u_i)
Here txRate_i is the transmission rate of the i-th link, ack_1 is the acknowledgement packet currently being processed, ack_0 is the previous acknowledgement packet of the same flow, and ack_j.L[i] denotes the INT metadata about link i carried in acknowledgement packet ack_j; for example, ack_1.L[i].txBytes is the number of transmitted bytes reported for link i in the INT metadata of ack_1. B is the link bandwidth, qlen the queue length of the link, ts the timestamp and txBytes the number of bytes transmitted; B, qlen, txBytes and ts are all obtained directly from the INT metadata. Line 13 in fig. 3 filters possible noise in qlen by taking the minimum of the current qlen and the previous qlen; the EWMA (exponentially weighted moving average) at line 17 likewise filters out noise caused by timer inaccuracy and transient queues.
Step 2.3.2: compare U_inflight with the configured target link utilization η to judge whether idle bandwidth exists on the path the data packet traverses. So that taking up idle bandwidth does not create new congestion on any link, the most heavily loaded link among all links is selected; U_inflight is the normalized in-flight byte count of that link:
if U_inflight < η, idle bandwidth exists on the path; otherwise, no idle bandwidth exists.
η is a parameter close to 1, meaning that link utilization is held at η, slightly below 1;
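The idle bandwidth check can be sketched as follows. The per-link formulas for txRate_i and u_i are not reproduced in this text, so the sketch assumes the estimate published for HPCC: txRate_i is the change in txBytes divided by the change in ts between two consecutive ACKs, and u_i = min(qlen_1, qlen_0) / (B · T) + txRate_i / B, with T the base RTT. The EWMA smoothing is omitted, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HopInfo:
    """Subset of one link's INT metadata (hypothetical field names)."""
    qlen: int         # queue length, bytes
    tx_bytes: int     # cumulative bytes transmitted
    ts: float         # timestamp, seconds
    bandwidth: float  # link capacity B, bytes per second


def measure_inflight(cur: List[HopInfo], prev: List[HopInfo],
                     base_rtt: float) -> float:
    """Return U_inflight = max_i(u_i) over all links on the path.

    Assumes the HPCC-style estimate described in the lead-in and that the
    two ACKs carry distinct timestamps for every hop.
    """
    u = []
    for c, p in zip(cur, prev):
        tx_rate = (c.tx_bytes - p.tx_bytes) / (c.ts - p.ts)
        # min() of the two queue lengths filters transient qlen noise.
        queued = min(c.qlen, p.qlen) / (c.bandwidth * base_rtt)
        u.append(queued + tx_rate / c.bandwidth)
    return max(u)


def has_idle_bandwidth(u_inflight: float, eta: float = 0.95) -> bool:
    """Step 2.3.2: idle bandwidth exists when U_inflight < eta."""
    return u_inflight < eta


prev = [HopInfo(0, 0, 0.0, 1000.0), HopInfo(100, 0, 0.0, 1000.0)]
cur = [HopInfo(0, 500, 1.0, 1000.0), HopInfo(200, 800, 1.0, 1000.0)]
u_inflight = measure_inflight(cur, prev, base_rtt=1.0)
idle = has_idle_bandwidth(u_inflight)
```

In this toy path the second link is the busier one, so it alone determines U_inflight; the value 0.95 for η is an example choice consistent with "close to 1".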
Step 3: based on the results of the congestion detector and the idle bandwidth detector, the traffic controller decides whether and how to adjust the sending window. The part that increases the sending window when idle bandwidth is detected is carried over from HPCC. The traffic controller performs the following operation:
ComputeWind(U_inflight, u_rtt, updateWc): here u_rtt is the normalized RTT and updateWc is a Boolean variable indicating whether to update the reference window W_c. The operation calculates the updated sending window in the following three steps, of which the idle bandwidth part is carried over from HPCC:
Step 3.1: if the congestion detector detects congestion, the traffic controller uses the u_rtt passed by the congestion detector to reduce the sending window and so control the congestion. In the formula for reducing the sending window, W_new is the calculated updated window size, W_c is the reference window, and W_AI is a small additive increase that ensures fairness, typically within 100 MB and set to 50 MB in this embodiment.
Step 3.2: if the occurrence of congestion is not detected, and the idle bandwidth detector detects that idle bandwidth exists in the network, the number of bytes in transmission information transmitted by the idle bandwidth detector is used for increasing the sending window, firstly, the set maximum number of maxStage additive increases is tried, and the formula for each increase is as follows:
W new =W c +W AI
step 3.3: if there is still free bandwidth in the network after maxStage additive increases, the available bandwidth is quickly utilized by multiplicative increases:
further, a reference window size Wc is introduced in S3.1, and the reference window is updated only once during one RTT. All ACKs received during the same RTT use the same W c Calculating an updated window size W new This allows fast reaction to be detected while at the same time not reacting excessively.
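The three branches of ComputeWind can be sketched as follows. This is an illustration under stated assumptions, not the patented formulas: the reduction formula of step 3.1 and the multiplicative formula of step 3.3 are not reproduced in this text, so the sketch assumes the HPCC-style form W_c / u + W_AI for both. The names `WindowState` and `inc_stage`, the defaults `eta=0.95` and `max_stage=5`, and the choice to hold the window when neither congestion nor idle bandwidth is detected are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WindowState:
    w_c: float          # reference window W_c
    w_ai: float         # additive increase W_AI
    inc_stage: int = 0  # additive increases already applied (hypothetical counter)


def compute_wind(state: WindowState, u_inflight: float, u_rtt: float,
                 update_wc: bool, eta: float = 0.95, max_stage: int = 5) -> float:
    """Return W_new for one ACK; the stage counter changes only when update_wc is True."""
    if u_rtt <= 0 or u_inflight <= 0:
        raise ValueError("normalized inputs must be positive")
    if u_rtt >= 1.0:
        # Step 3.1 (congestion). ASSUMED form: scale W_c down by the normalized RTT.
        w_new = state.w_c / u_rtt + state.w_ai
        if update_wc:
            state.inc_stage = 0
    elif u_inflight < eta:
        if state.inc_stage < max_stage:
            # Step 3.2: additive increase, W_new = W_c + W_AI (as given in the text).
            w_new = state.w_c + state.w_ai
            if update_wc:
                state.inc_stage += 1
        else:
            # Step 3.3 (multiplicative increase). ASSUMED HPCC-style form.
            w_new = state.w_c / (u_inflight / eta) + state.w_ai
            if update_wc:
                state.inc_stage = 0
    else:
        # Neither congestion nor idle bandwidth: holding W_c is an assumption.
        w_new = state.w_c
    return w_new
```

Because every ACK within one RTT is evaluated against the same W_c, repeated calls with `update_wc=False` return the same result for the same inputs, which is the behavior the reference window is meant to provide.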
Step 4: After the updated sending window size is calculated, the following operation adjusts the sending window using a combined per-ACK and per-RTT strategy:
newack(ack): ack is the acknowledgment packet currently being processed by the sender. This operation combines per-ACK and per-RTT strategies to adjust the sending window. The strategy is adopted from HPCC and is completed in the following three steps:
Step 4.1: The variable lastUpdateSeq records the sequence number of the first data packet sent using the new W_c window. If the sequence number of the ACK is greater than lastUpdateSeq, the reference window W_c is updated; otherwise it is not. The formula for updating the reference window W_c is as follows:
W_c = W
Step 4.2: The flow controller updates the sending window:
W = W_new
where W is the current sending window and W_new is the updated window size calculated by ComputeWind.
Step 4.3: The sender calculates and records the sending rate and the INT metadata carried in the current feedback, which are used to calculate the transmission rate txRate and the normalized in-flight byte count u_i when the next acknowledgment packet of the flow is processed.
MECC was verified and its performance evaluated on the ns-3 simulation platform. The simulation uses the dumbbell topology shown in Fig. 4: two interconnected switches sit in the middle, and multiple end hosts on each side connect to the two switches. Every link in the topology has a capacity of 25 Gbps, a propagation delay of 10 µs, and an error rate of 0. For the multi-service queue network environment, each switch port has 8 queues, and the dequeue mechanism of the queues is configured as weighted round robin (WRR). The whole network is deployed using RDMA, and further simulation parameters are detailed in Table 1.
TABLE 1
To test the applicability of MECC in the multi-service queue environment, four typical application scenarios were selected to test the throughput, delay, fairness, and isolation of MECC in a multi-queue environment, respectively. Fig. 5 (a) shows that when traffic is sent many-to-many in a multi-queue environment, MECC fully utilizes the link bandwidth: the total throughput of all flows exceeds 24.5 Gbps, close to the full link bandwidth of 25 Gbps. The figure also shows that at the two moments when a flow finishes sending and the link becomes idle, MECC adjusts rapidly and uses the available bandwidth to bring throughput back close to full bandwidth. Fig. 5 (b) shows that when three flows entering different queues are sent many-to-many for a long time, a certain amount of data accumulates in the three queues because the senders start their flows at line rate. MECC quickly detects the congestion and immediately reduces the sending rate to a suitable level, so the queues quickly drain to a length close to 0 and stay at a low level during the subsequent transmission, with the maximum accumulated data in a queue not exceeding the length of 5 data packets. Fig. 5 (c) shows that when traffic assigned different weights coexists, bandwidth is distributed according to the queue weights (1:2:3), so weighted fair sharing is achieved among the queues; when one flow completes, the remaining traffic quickly uses the freed bandwidth and converges to a proper rate, with bandwidth still distributed according to the respective weights of the remaining traffic. Fig. 5 (d) shows that when heterogeneous congestion control algorithms coexist (MECC-based RDMA traffic and TCP Reno-based TCP traffic), no bandwidth preemption occurs, and the bandwidth obtained by each kind of traffic is again allocated according to the weight of its queue. MECC therefore shows high throughput, low latency, weighted fair sharing among queues, and performance isolation in a multi-service queue environment.
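As a worked illustration of the weighted sharing described for Fig. 5 (c), the ideal per-queue share of a 25 Gbps link under weights 1:2:3 can be computed as follows. This is idealized arithmetic, not a reproduction of the simulation results; the queue names and the function name are hypothetical.

```python
from typing import Dict


def weighted_shares(capacity_gbps: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Split link capacity among active queues in proportion to their weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one queue must have a positive weight")
    return {queue: capacity_gbps * w / total for queue, w in weights.items()}


if __name__ == "__main__":
    # Three active queues with weights 1:2:3 on a 25 Gbps link.
    print(weighted_shares(25.0, {"q1": 1, "q2": 2, "q3": 3}))
    # After the flow in q3 completes, the remaining queues share 25 Gbps as 1:2.
    print(weighted_shares(25.0, {"q1": 1, "q2": 2}))
```

With all three queues active the ideal shares are 25/6, 25/3, and 12.5 Gbps; once the third queue goes idle, the first two ideally converge to 25/3 and 50/3 Gbps.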
The compatibility of MECC with a single-queue environment was also verified. Fig. 6 (a) shows that in a single-queue environment MECC still achieves high throughput and makes full use of the bandwidth in the network. Fig. 6 (b) shows that MECC quickly avoids congestion in a single-queue environment: congestion control starts as soon as the queue begins to build up, keeping the queue length, and therefore the delay, at a very low level. Fig. 6 (c) shows the fairness of MECC in a single-queue environment: when three flows join the same link one by one at intervals of 0.2 s and then end at the same interval, MECC makes the flows converge quickly to a fair state. MECC therefore also maintains high throughput, low latency, and fairness in a single-queue environment.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.
Claims (8)
1. A delay-based congestion control mechanism in a multi-service queue environment, comprising the steps of:
step 1: a sending end records the current sending timestamp when sending a data packet and extracts five-tuple information from the data packet; the original data packet is routed to a receiving end, and during its transmission each switch inserts INT metadata into the header of the original data packet and forwards the packet to the next hop; after receiving the data packet, the receiving end extracts the INT metadata, inserts it into the generated ACK, and returns the ACK carrying the INT metadata to the sending end;
step 2: after receiving the ACK carrying the INT metadata, the sending end extracts the INT metadata from the ACK, looks up the sending time of the data packet acknowledged by the ACK according to the five-tuple, detects whether the transmission path of the data packet is congested according to the round-trip delay of the data packet, and, when no congestion exists, detects whether idle bandwidth exists on the path;
step 3: determining whether and how to adjust the sending window according to the detection result obtained in step 2;
step 4: calculating the updated sending window and updating the sending window.
2. A congestion control mechanism based on delay in a multi-service queue environment according to claim 1, wherein the step 2 comprises the following steps:
step 2.1: calculating the Round Trip Time (RTT) of data packet transmission according to the current timestamp and the recorded sending timestamp;
step 2.2: normalizing the Round Trip Time (RTT) of the data packet transmission to obtain a normalization result u_rtt;
step 2.3: using the normalization result u_rtt to judge whether congestion occurs:
if u_rtt ≥ 1, congestion occurs on the transmission path of the data packet;
otherwise, the transmission path of the data packet is not congested, and whether available idle bandwidth exists in the network is detected.
3. A congestion control mechanism based on delay in multi-service queue environment according to claim 2, characterized in that in step 2.3, detecting whether there is free bandwidth available in the network comprises the following steps:
step 2.3.1: calculating the number of in-flight bytes on each link of the path according to the INT metadata, normalizing the number of in-flight bytes on the i-th link to obtain the normalized result u_i, and then obtaining the maximum value U_inflight of the normalized results over all links of the same path:
U_inflight = max_i(u_i)
step 2.3.2: comparing U_inflight with the link utilization η:
if U_inflight < η, idle bandwidth exists on the path; otherwise the path has no idle bandwidth.
4. A congestion control mechanism based on delay in multi-service queue environment according to claim 1, wherein said step 3 comprises the following steps:
step 3.1: if the occurrence of the congestion is detected, reducing the sending window, and controlling the congestion, wherein the formula for reducing the sending window is as follows:
where W_new is the calculated updated window size, W_c is the reference window, and W_AI is the additive increase term;
step 3.2: if no congestion is detected and idle bandwidth is detected in the network, performing multiple additive increases, the formula for each increase being: W_new = W_c + W_AI.
5. The congestion control mechanism based on delay in the multi-service queue environment as claimed in claim 4, wherein in step 3.2, if there is still idle bandwidth in the network after the maximum number of additive increases, the transmission window is increased by multiplicative increase.
6. The delay-based congestion control mechanism in a multi-service queue environment according to claim 5, wherein the multiplicative increase is performed by the following formula:
where W is the updated window, W_c is the reference window, W_AI is the additive increase term, U_inflight is the maximum of the normalized in-flight byte counts over all links of the same path, and η is the link utilization.
7. A congestion control mechanism based on delay in a multi-service queue environment according to claim 1, wherein the step 4 comprises the following steps:
step 4.1: recording, as lastUpdateSeq, the sequence number of the first data packet sent using the new reference window W_c; if the ACK sequence number is greater than lastUpdateSeq, updating the reference window W_c, and otherwise not updating the reference window; the formula for updating the reference window W_c is as follows:
W_c = W
step 4.2: updating the sending window: W = W_new, where W is the current sending window;
step 4.3: calculating and recording the sending rate and the INT metadata in the current feedback, for use in calculating the transmission rate and the normalized number of in-flight bytes when the next acknowledgment packet of the flow is processed.
8. A congestion control device based on time delay in a multi-service queue environment is characterized by comprising a congestion detector, an idle bandwidth detector and a flow controller;
the congestion detector generates a per-flow congestion signal by calculating the round-trip delay of each data packet transmitted in the network, and sends this per-flow, delay-based congestion signal to the flow controller;
the idle bandwidth detector is used for acquiring the number of bytes in transmission of each link in the network according to INT metadata attached to the data packet, detecting whether available bandwidth exists in the network according to the number of bytes in transmission, and sending the number of bytes in transmission and a detection result to the flow controller;
when the network is congested, the flow controller: reducing the transmit window based on the round trip delay; and when the link has idle bandwidth, increasing the sending window based on the maximum value of the normalized result of the transmission byte number in all the links.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210439516.0A CN114844837B (en) | 2022-04-25 | 2022-04-25 | Congestion control method and device based on time delay in multi-service queue environment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114844837A (en) | 2022-08-02 |
CN114844837B (en) | 2023-09-26 |
Family
ID=82565690
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210439516.0A Active CN114844837B (en) | 2022-04-25 | 2022-04-25 | Congestion control method and device based on time delay in multi-service queue environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114844837B (en) |
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102006230A (en) * | 2010-11-26 | 2011-04-06 | 中南大学 | Method for controlling congestion control method by fusing three kinds of information in wired/wireless hybrid network |
US20170187598A1 (en) * | 2015-12-23 | 2017-06-29 | Emc Corporation | Connection-oriented communication devices with round trip time estimation |
CN105791054A (en) * | 2016-04-22 | 2016-07-20 | 西安交通大学 | Autonomous controllable and reliable multicast transmission method based on flow classification realization |
US20200120036A1 (en) * | 2017-06-20 | 2020-04-16 | Huawei Technologies Co., Ltd. | Method and apparatus for handling network congestion, and system |
CN108965151A (en) * | 2018-08-27 | 2018-12-07 | 华中科技大学 | A kind of Explicit Congestion control method based on queuing delay |
CN110061927A (en) * | 2019-04-26 | 2019-07-26 | 东南大学 | Congestion aware and labeling method towards micro- burst flow in a kind of more queuing data center environments |
CN110620737A (en) * | 2019-09-09 | 2019-12-27 | 中南大学 | Self-adaptive congestion control method based on delay |
CN111526096A (en) * | 2020-03-13 | 2020-08-11 | 北京交通大学 | Intelligent identification network state prediction and congestion control system |
CN112491736A (en) * | 2020-11-13 | 2021-03-12 | 锐捷网络股份有限公司 | Congestion control method and device, electronic equipment and storage medium |
CN113518040A (en) * | 2021-04-30 | 2021-10-19 | 东北大学 | Multipath coupling congestion control method for delay sensitive service |
CN113711572A (en) * | 2021-07-15 | 2021-11-26 | 新华三技术有限公司 | Message transmission method and device |
Non-Patent Citations (3)
Title |
---|
DANFENG SHAN; FENGYUAN REN; PENG CHENG; RAN SHU; CHUANXIONG GUO: "Observing and Mitigating Micro-Burst Traffic in Data Center Networks", 《IEEE/ACM TRANSACTIONS ON NETWORKING ( VOLUME: 28, ISSUE: 1, FEBRUARY 2020)》 * |
PENG WANG; JIAXIN ZHANG; XING ZHANG; ZHI YAN; BARRY G. EVANS; WENBO WANG: "Convergence of Satellite and Terrestrial Networks: A Comprehensive Survey", 《IEEE ACCESS ( VOLUME: 8)》 * |
SHEN GENGBIAO; LI QING; JIANG YONG; WANG YI; XU MINGWEI: "Research on Load Balancing in Data Center Networks", 《Journal of Software (软件学报)》, no. 07 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115442314A (en) * | 2022-09-05 | 2022-12-06 | 天津大学 | Practical data center network active transmission system and method |
CN115442314B (en) * | 2022-09-05 | 2024-05-31 | 天津大学 | Practical active data center network transmission system and method |
Also Published As
Publication number | Publication date |
---|---|
CN114844837B (en) | 2023-09-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||