CN117041166A - Congestion control method and device, switch and computer readable storage medium

Info

Publication number
CN117041166A
Authority
CN
China
Prior art keywords
data
stream
flow
congestion control
data stream
Legal status
Pending
Application number
CN202311099041.6A
Other languages
Chinese (zh)
Inventor
田源
陈映
刘圆
王子潇
车碧瑶
Current Assignee
China Telecom Technology Innovation Center
China Telecom Corp Ltd
Original Assignee
China Telecom Technology Innovation Center
China Telecom Corp Ltd
Priority date
Filing date
Publication date
Application filed by China Telecom Technology Innovation Center and China Telecom Corp Ltd
Priority to CN202311099041.6A
Publication of CN117041166A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 47/00 Traffic control in data switching networks
    • H04L 47/10 Flow control; Congestion control
    • H04L 47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L 47/2441 Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
    • H04L 47/2483 Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
    • H04L 47/30 Flow control; Congestion control in combination with information about buffer occupancy at either end or at transit nodes
    • H04L 69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/22 Parsing or analysis of headers

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present disclosure relates to a congestion control method and apparatus, a switch, and a computer-readable storage medium. The method comprises the following steps: parsing a remote direct memory access data packet and identifying the data stream to which the data packet belongs; determining the type of the data stream; and sending the data stream to a buffer corresponding to the type of the data stream. Data packets can thus be placed into different buffers according to the streams they belong to, realizing stream-based fine-grained control and avoiding the unfairness and victim-stream problems caused by the coarse-grained PFC control mechanism, which does not distinguish between streams.

Description

Congestion control method and device, switch and computer readable storage medium
Technical Field
The present disclosure relates to the field of network technologies, and in particular, to a congestion control method and apparatus, a switch, and a computer readable storage medium.
Background
Demand in related-art data centers for RDMA (Remote Direct Memory Access) and flow control technologies keeps growing. First, the increase in data center size and complexity puts pressure on traditional network technologies; RDMA helps improve performance and reduce latency, which is critical for high-performance computing applications. Second, the rise of cloud computing has driven the need for more efficient ways of managing and sharing data; RDMA helps improve the performance of cloud-based applications and services. Third, the increasing adoption of Artificial Intelligence (AI) creates new demands on data center bandwidth; RDMA can help improve the performance of AI-based applications and services.
Disclosure of Invention
The inventors found through research that mainstream related-art RDMA deployments need to configure PFC (Priority Flow Control) to guarantee zero packet loss, thereby forming a lossless network and achieving high performance. PFC controls traffic per priority: through a per-queue back-pressure mechanism, it sends a Pause signal to notify the upstream device to pause sending data, preventing buffer overflow and packet loss. However, PFC is a coarse-grained mechanism that controls congestion by port and priority and does not distinguish individual flows, so an elephant flow and a mouse flow may share the same queue; PFC may then interrupt the data transmission of the mouse flow in that queue because of the elephant flow, finally causing performance problems such as unfairness and victim flows.
In view of at least one of the above technical problems, the present disclosure provides a congestion control method and apparatus, a switch, and a computer-readable storage medium, in which packets are placed into different buffers based on different flows, enabling flow-based fine-grained control.
According to one aspect of the present disclosure, there is provided a congestion control method including:
parsing a remote direct memory access data packet, and identifying the data stream to which the data packet belongs;
determining the type of the data stream;
and sending the data stream, according to its type, to a buffer corresponding to the type of the data stream.
In some embodiments of the disclosure, the parsing of the remote direct memory access data packet and the identifying of the data stream to which the data packet belongs include:
parsing a remote direct memory access data packet to obtain a source IP address, a destination IP address, a source port number and a destination queue pair;
forming a unique identifier of the data stream from the source IP address, the destination IP address, the source port number and the destination queue pair;
and determining the data stream to which the data packet belongs according to the unique identifier of the data stream.
In some embodiments of the disclosure, the parsing of the remote direct memory access data packet and the identifying of the data stream to which the data packet belongs include:
parsing the remote direct memory access data packet, and obtaining and recording the size of the data packet and the time at which the data packet was received.
In some embodiments of the present disclosure, the types of data streams include elephant streams and mouse streams.
In some embodiments of the disclosure, the determining of the type of the data stream includes:
calculating a stream throughput of the data stream;
and determining the type of the data stream according to the stream throughput of the data stream.
In some embodiments of the disclosure, the determining of the type of the data stream according to the stream throughput of the data stream includes:
determining that the data stream is an elephant stream if the stream throughput of the data stream is greater than a predetermined threshold;
and determining that the data stream is a mouse stream if the stream throughput of the data stream is not greater than the predetermined threshold.
In some embodiments of the disclosure, the calculating of the flow throughput of the data flow comprises:
calculating the stream throughput of the data stream according to the stored stream creation time, the transmitted data size and the current data packet size.
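One natural reading of this computation (an interpretation added here for clarity; the original text does not state an explicit formula) is:

```latex
\text{stream throughput} \approx \frac{\text{transferred data size} + \text{current packet size}}{t_{\text{now}} - t_{\text{create}}}
```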
In some embodiments of the disclosure, the calculating of the flow throughput of the data flow further comprises:
determining whether the current data stream exists in a stream mapping queue;
and inserting the current data stream into the stream mapping queue if the current data stream does not exist in the stream mapping queue, and then executing the step of calculating the stream throughput of the data stream according to the stored stream creation time, the transmitted data size and the current data packet size.
In some embodiments of the disclosure, the calculating of the flow throughput of the data flow further comprises:
determining whether the current data stream exists in the stream mapping queue;
obtaining, in the case that the current data stream exists in the stream mapping queue, the old data corresponding to the current data stream in the stream mapping queue;
and merging the current data stream into the old data and updating the stream mapping queue, and then executing the step of calculating the stream throughput of the data stream according to the stored stream creation time, the transmitted data size and the current data packet size.
In some embodiments of the present disclosure, the flow mapping queue is a least recently used flow mapping queue.
In some embodiments of the present disclosure, the flow mapping queue is configured to store a total transmission size of the data flow.
In some embodiments of the present disclosure, the flow mapping queue is configured to reserve a data flow with a first frequency of transmission data, and eliminate a data flow with a second frequency of transmission data, where the first frequency is higher than the second frequency.
In some embodiments of the present disclosure, the types of data streams include elephant streams and mouse streams;
the sending of the data stream, according to its type, to the buffer corresponding to the type of the data stream includes:
in the case that the type of the data stream is an elephant stream, sending the data stream to an elephant stream buffer;
and in the case that the type of the data stream is a mouse stream, sending the data stream to a mouse stream buffer.
In some embodiments of the present disclosure, the size of the mouse stream buffer is greater than the size of the elephant stream buffer.
In some embodiments of the present disclosure, the size of the mouse stream buffer is N times the size of the elephant stream buffer, N being a natural number greater than 1.
According to another aspect of the present disclosure, there is provided a congestion control apparatus including:
a packet parsing module configured to parse a remote direct memory access data packet and identify the data stream to which the data packet belongs;
a stream calculation module configured to determine the type of the data stream, and to send the data stream, according to its type, to a buffer corresponding to the type of the data stream.
According to another aspect of the present disclosure, there is provided a congestion control apparatus including:
a memory configured to store instructions;
a processor configured to execute the instructions, so that the congestion control apparatus performs operations of implementing the congestion control method according to any of the embodiments described above.
According to another aspect of the present disclosure, there is provided a switch including a congestion control apparatus as described in any of the above embodiments.
According to another aspect of the present disclosure, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement a congestion control method as described in any of the embodiments above.
Data packets can thus be placed into different buffers according to the streams they belong to, realizing stream-based fine-grained control and avoiding the unfairness and victim-stream problems caused by the coarse-grained PFC control mechanism, which does not distinguish between streams.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings may be obtained according to these drawings without inventive effort to a person of ordinary skill in the art.
Fig. 1 is a schematic diagram of some embodiments of a congestion control method of the present disclosure.
Fig. 2 is a schematic diagram of some embodiments of a congestion control apparatus of the present disclosure.
Fig. 3 is a schematic diagram illustrating some embodiments of a packet parsing method according to the present disclosure.
Fig. 4 is a schematic diagram of a packet in a RoCEv2 data format in some embodiments of the present disclosure.
Fig. 5 is a schematic diagram of a method of flow throughput computation for a data flow in some embodiments of the present disclosure.
Fig. 6 is a schematic diagram of a mouse stream buffer and an elephant stream buffer in some embodiments of the disclosure.
Fig. 7 is a schematic diagram of some embodiments of a congestion control apparatus of the present disclosure.
Fig. 8 is a schematic structural diagram of other embodiments of a congestion control device of the present disclosure.
Detailed Description
The following description of the technical solutions in the embodiments of the present disclosure will be made clearly and completely with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only some embodiments of the present disclosure, not all embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses. Based on the embodiments in this disclosure, all other embodiments that a person of ordinary skill in the art would obtain without making any inventive effort are within the scope of protection of this disclosure.
The relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but should be considered part of the specification where appropriate.
In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
Fig. 1 is a schematic diagram of some embodiments of a congestion control method of the present disclosure. Preferably, the present embodiment may be performed by the congestion control apparatus of the present disclosure or the switch of the present disclosure. Fig. 2 is a schematic diagram of some embodiments of a congestion control apparatus of the present disclosure. As shown in fig. 2, the congestion control apparatus of the present disclosure may include a packet parsing module, a flow calculation module, a flow mapping queue, and Buffer buffers of different flows.
The method of the embodiment of fig. 1 may comprise at least one of steps 1-3, wherein:
Step 1, parsing a remote direct memory access data packet, and identifying the data stream to which the data packet belongs.
Fig. 3 is a schematic diagram illustrating some embodiments of a packet parsing method according to the present disclosure. As shown in fig. 3, the packet parsing method of the present disclosure (e.g., the step of parsing a remote direct memory access data packet) may include at least one of steps 11-12, wherein:
Step 11, parsing the information in the remote direct memory access data packet.
In some embodiments of the present disclosure, the data packet may be a data packet in a RoCEv2 data format, where RoCE is RDMA over Converged Ethernet (RDMA over converged ethernet).
Fig. 4 is a schematic diagram of a packet in the RoCEv2 data format in some embodiments of the present disclosure. As shown in fig. 4, the packet is an original InfiniBand (IB) message carried in a UDP encapsulation. The Ethernet header includes the source MAC address and the destination MAC address; the IP header contains the source IP address and the destination IP address; and the UDP header contains the source port number and the destination port number. The InfiniBand Payload is the message body, i.e. the data being transmitted. The InfiniBand Base Transport Header (the header field of the InfiniBand transport layer) includes the key fields for intelligent traffic analysis, while ICRC is used for redundancy checking and FCS for frame checking. The InfiniBand transport layer header includes: an 8-bit Operation Code representing the RoCEv2 message type, i.e. in which operation mode the message is sent; a 2-bit Pad Count indicating how many extra bytes are padded into the InfiniBand Payload; a 24-bit Dest QP (Destination Queue Pair) identifying a RoCEv2 stream; a PSN (Packet Sequence Number) carrying the sequence number of the RoCEv2 message; a 7-bit Reserved field; and a 1-bit ACK Request field used to request an acknowledgement.
Step 12, acquiring the source IP address, destination IP address, source port number and destination queue pair, and obtaining and recording the size of the data packet and the time at which it was received.
In some embodiments of the present disclosure, in step 1, the step of identifying the data stream to which the data packet belongs may include: forming a unique identifier of the data stream from the source IP address, the destination IP address, the source port number and the destination queue pair; and determining the data stream to which the data packet belongs according to this unique identifier.
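To make the parsing and flow identification concrete, the following Python sketch shows one possible implementation; it is an illustration only, not the patented implementation. It assumes a raw Ethernet frame carrying IPv4/UDP, the IANA-assigned RoCEv2 UDP destination port 4791, and the standard 12-byte InfiniBand Base Transport Header; all function and field names are chosen for illustration.

```python
import struct
import time

ROCEV2_UDP_PORT = 4791  # IANA-assigned UDP destination port for RoCEv2

def parse_rocev2_packet(frame: bytes):
    """Extract the flow identifier (src IP, dst IP, src port, dest QP) plus packet
    size and receive time from a raw Ethernet/IPv4/UDP/RoCEv2 frame."""
    # Ethernet header: 6B dst MAC + 6B src MAC + 2B EtherType
    eth_type = struct.unpack_from("!H", frame, 12)[0]
    if eth_type != 0x0800:                      # IPv4 only in this sketch
        return None
    ip_off = 14
    ihl = (frame[ip_off] & 0x0F) * 4            # IPv4 header length in bytes
    if frame[ip_off + 9] != 17:                 # not UDP
        return None
    src_ip = frame[ip_off + 12:ip_off + 16]
    dst_ip = frame[ip_off + 16:ip_off + 20]
    udp_off = ip_off + ihl
    src_port, dst_port = struct.unpack_from("!HH", frame, udp_off)
    if dst_port != ROCEV2_UDP_PORT:             # not a RoCEv2 packet
        return None
    # InfiniBand Base Transport Header (12 bytes) follows the 8-byte UDP header
    bth_off = udp_off + 8
    opcode, flags, _pkey = struct.unpack_from("!BBH", frame, bth_off)
    dqp_word, psn_word = struct.unpack_from("!II", frame, bth_off + 4)
    return {
        "flow_id": (src_ip, dst_ip, src_port, dqp_word & 0x00FFFFFF),  # unique stream ID
        "opcode": opcode,                       # 8-bit Operation Code (message type)
        "pad_count": (flags >> 4) & 0x3,        # 2-bit Pad Count
        "psn": psn_word & 0x00FFFFFF,           # 24-bit Packet Sequence Number
        "ack_request": (psn_word >> 31) & 0x1,  # 1-bit ACK Request
        "size": len(frame),                     # recorded packet size
        "recv_time": time.time(),               # recorded receive time
    }
```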
Step 2, determining the type of the data stream.
In some embodiments of the present disclosure, step 2 may comprise at least one of step 21-step 22, wherein:
step 21, calculating the stream throughput of the data stream.
Fig. 5 is a schematic diagram of a flow throughput calculation method for a data flow in some embodiments of the present disclosure. As shown in fig. 5, the flow throughput calculation method of the present disclosure (i.e., the step of calculating the flow throughput of the data flow) may include at least one of steps 210 to 218, wherein:
step 210, obtain a source IP address, a destination IP address, a source port number, and a destination queue pair.
Step 211, using the source IP address, destination IP address, source port number and destination queue pair to make up a unique identification of the data flow.
Step 212, determining whether the current data stream exists in the stream map queue according to the unique identifier. In case the current data stream does not exist in the stream map queue, step 213 is performed; otherwise, in case the current data flow exists in the flow mapping queue, step 214 is performed.
In step 213, the current data stream is inserted into the stream map queue. Step 217 and step 218 are then performed.
Step 214, obtaining old data corresponding to the current data flow in the flow mapping queue.
Step 215, merging the current data stream into the old data.
In step 216, the stream map queue is updated. Step 217 and step 218 are then performed.
Step 217, calculating the stream throughput of the data stream according to the stored stream creation time, the transmitted data size and the current data packet size.
Step 218, saving the stream creation time (Create) and the transferred data size (Size) (see the sketch following these steps).
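The following Python sketch is one illustrative way to implement steps 212-218; the class name, the capacity value, and the exact throughput formula (accumulated size divided by the time since flow creation) are assumptions for illustration, not taken from the original text.

```python
from collections import OrderedDict
import time

class LRUFlowMap:
    """Minimal LRU stream map: key = flow ID, value = {create time, transferred size}.
    Evicting the least recently used flow keeps memory bounded (no OOM)."""

    def __init__(self, capacity=4096):
        self.capacity = capacity            # illustrative capacity
        self.flows = OrderedDict()

    def record_packet(self, flow_id, packet_size, now=None):
        """Apply steps 212-218 for one packet and return the flow's throughput (bytes/s)."""
        now = time.time() if now is None else now
        if flow_id not in self.flows:       # step 213: new flow, insert it
            self.flows[flow_id] = {"create": now, "size": 0}
        entry = self.flows[flow_id]         # steps 214-216: merge into the old data and update
        entry["size"] += packet_size
        self.flows.move_to_end(flow_id)     # mark the flow as most recently used
        if len(self.flows) > self.capacity:
            self.flows.popitem(last=False)  # evict the stream that sends data least often
        elapsed = max(now - entry["create"], 1e-6)
        return entry["size"] / elapsed      # step 217: throughput from create time and size
```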
In some embodiments of the present disclosure, the stream map queue is an LRU (Least Recently Used) cache Map queue, where a Map is a structure that holds data as key-value pairs, as shown in fig. 2.
In some embodiments of the present disclosure, the key in the LRU cache Map queue is the unique identifier (ID) of the data stream, and the value is the stream creation time (Create) and the transferred data size (Size).
In some embodiments of the present disclosure, for example, the flow creation time (Create) is 182739964 and the transferred data size (Size) is 1827399.
In some embodiments of the present disclosure, the flow mapping queue is used to save the total transmitted size of the data flow while avoiding OOM (Out Of Memory) errors.
In some embodiments of the present disclosure, the flow mapping queue is configured to reserve a data flow with a first frequency of transmission data, and eliminate a data flow with a second frequency of transmission data, where the first frequency is higher than the second frequency.
In the above embodiments of the present disclosure, the LRU ensures that streams which send data infrequently are naturally evicted, keeps the memory from overflowing, and limits the computation to the streams seen in the most recent period of time.
Step 22, determining the type of the data stream according to the stream throughput of the data stream.
In some embodiments of the present disclosure, step 22 may include: determining that the data stream is an elephant stream if its stream throughput is greater than a predetermined threshold; and determining that the data stream is a mouse stream if its stream throughput is not greater than the predetermined threshold.
In some embodiments of the present disclosure, the predetermined threshold is different for different scenarios.
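A minimal sketch of this classification step, assuming hypothetical per-scenario thresholds; the scenario names and numeric values below are illustrative only, not values from the original text.

```python
# Illustrative, scenario-dependent thresholds in bytes per second.
ELEPHANT_THRESHOLDS = {
    "storage_backend": 50_000_000,
    "ai_training": 200_000_000,
    "default": 10_000_000,
}

def classify_stream(throughput_bps, scenario="default"):
    """Step 22: elephant stream if throughput exceeds the threshold, otherwise mouse stream."""
    threshold = ELEPHANT_THRESHOLDS.get(scenario, ELEPHANT_THRESHOLDS["default"])
    return "elephant" if throughput_bps > threshold else "mouse"
```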
Step 3, according to the type of the data stream, sending the data stream to the buffer corresponding to that type.
In some embodiments of the present disclosure, the types of data streams include elephant streams and mouse streams.
In some embodiments of the present disclosure, the elephant stream buffer and the mouse stream buffer are used to buffer and send the data packets of the different streams.
In some embodiments of the present disclosure, as shown in fig. 2, step 3, that is, the step of sending the data stream to the buffer corresponding to its type, may include: in the case that the type of the data stream is an elephant stream, sending the data stream to the elephant stream buffer; and in the case that the type of the data stream is a mouse stream, sending the data stream to the mouse stream buffer.
In some embodiments of the present disclosure, as shown in fig. 2, the data packet passing through the flow calculation module is transferred to a Buffer to be transmitted (switch egress Buffer).
In some embodiments of the present disclosure, the size of the mouse stream buffer is greater than the size of the elephant stream buffer.
In some embodiments of the present disclosure, the size of the mouse stream buffer is N times the size of the elephant stream buffer, N being a natural number greater than 1.
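As a rough software sketch of this buffer arrangement (the capacities, the value of N, and the use of simple Python deques are all assumptions for illustration; real switch egress buffers are hardware queues):

```python
from collections import deque

N = 4                                    # illustrative ratio: mouse buffer is N times larger
ELEPHANT_BUF_PKTS = 256                  # illustrative elephant buffer capacity, in packets
MOUSE_BUF_PKTS = N * ELEPHANT_BUF_PKTS   # mouse buffer capacity

elephant_buffer = deque()
mouse_buffer = deque()

def enqueue(packet, stream_type):
    """Step 3: place the packet into the buffer matching its stream type.
    Returns False when that buffer is full, i.e. the point at which the sender's own
    flow control/congestion control would react (earlier for elephant streams,
    because their buffer is smaller)."""
    if stream_type == "elephant":
        buf, cap = elephant_buffer, ELEPHANT_BUF_PKTS
    else:
        buf, cap = mouse_buffer, MOUSE_BUF_PKTS
    if len(buf) >= cap:
        return False                     # full: packet would be dropped or marked upstream
    buf.append(packet)
    return True
```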
Fig. 6 is a schematic diagram of a mouse stream buffer and an elephant stream buffer in some embodiments of the disclosure. As shown in fig. 6, the elephant stream and the mouse stream are two different types of streams with buffers of different sizes, the mouse stream buffer being set to N times (N > 1) the size of the elephant stream buffer. Compared with the mouse stream, the elephant stream sends more data packets, but its sending rate is limited because its buffer is smaller. This design effectively reuses the transport layer's own flow control/congestion control algorithm to automatically regulate the number of packets sent: the small elephant stream buffer triggers the congestion avoidance mechanism of the elephant stream's flow control/congestion control algorithm earlier and more easily, thereby reducing the bandwidth occupied by the elephant stream in the data center network, guaranteeing the bandwidth available to the mouse stream, and reducing both the RTT (Round-Trip Time) of packets in the mouse stream and the latency of the whole stream.
The above embodiments of the present disclosure provide a flow-based identification method for the network: using the information in the data packet and storing the packet information with an LRU structure, the flow a packet belongs to can be effectively identified and classified by computation, thereby achieving flow-based fine-grained control.
The above embodiments of the present disclosure utilize LRU to automatically eliminate stale data flows and ensure that no OOM occurs.
With the per-stream buffer structure in the switch according to the above embodiments of the disclosure, buffers of different sizes reuse the streams' own congestion control mechanisms, so that elephant streams are automatically triggered to slow down more often, guaranteeing the bandwidth and latency of mouse streams across the whole network.
Fig. 2 is a schematic diagram of some embodiments of a congestion control apparatus of the present disclosure. Fig. 7 is a schematic diagram of some embodiments of a congestion control apparatus of the present disclosure. As shown in fig. 2 and 7, the congestion control apparatus of the present disclosure may include a packet parsing module 71 and a flow calculation module 72, wherein:
the packet parsing module 71 is configured to parse the data packet accessed by the remote direct address and identify the data stream to which the data packet belongs.
In some embodiments of the present disclosure, the packet parsing module 71 may be configured to parse a data packet accessed by a remote direct address, obtaining a source IP address, a destination IP address, a source port number, and a destination queue pair; a unique identifier of the data stream is formed by using a source IP address, a destination IP address, a source port number and a destination queue pair; and determining the data flow to which the data packet belongs according to the unique identification of the data flow.
In some embodiments of the present disclosure, the packet parsing module 71 may be further configured to parse a data packet accessed by a remote direct address, and obtain and record the size of the data packet and the time when the data packet is received.
In some embodiments of the present disclosure, the packet parsing module 71 may be configured to parse a sender IP address and port number, a receiver IP address port number, a packet size in a packet Header.
A stream calculation module 72 configured to determine the type of the data stream, and to send the data stream, according to its type, to the buffer corresponding to the type of the data stream.
In some embodiments of the present disclosure, the flow calculation module 72 may be configured to calculate the size of the flow and the frequency of packet transmission, update the LRU Map queue, and determine to which buffer the flow is sent.
In some embodiments of the present disclosure, the types of data streams include elephant streams and mouse streams.
In some embodiments of the present disclosure, when determining the type of the data flow, the flow calculation module 72 may be configured to calculate the flow throughput of the data flow, and determine the type of the data flow according to the flow throughput of the data flow.
In some embodiments of the present disclosure, when determining the type of the data flow according to its flow throughput, the flow calculation module 72 may be configured to determine that the data flow is an elephant flow in the case that the flow throughput of the data flow is greater than a predetermined threshold, and that the data flow is a mouse flow in the case that the flow throughput is not greater than the predetermined threshold.
In some embodiments of the present disclosure, the flow calculation module 72 may be configured to calculate the flow throughput of the data flow according to the saved flow creation time, the transmitted data size, and the current packet size, in the case of calculating the flow throughput of the data flow.
In some embodiments of the present disclosure, the flow calculation module 72, in calculating the flow throughput of the data flow, may be further configured to determine whether the current data flow is present in the flow mapping queue; inserting the current data stream into the stream map queue if the current data stream does not exist in the stream map queue; and then, performing the operation of calculating the stream throughput of the data stream according to the stored stream creation time, the transmitted data size and the current data packet size.
In some embodiments of the present disclosure, the flow calculation module 72, in calculating the flow throughput of the data flow, may be further configured to determine whether the current data flow is present in the flow mapping queue; under the condition that the current data flow exists in the flow mapping queue, old data corresponding to the current data flow in the flow mapping queue is obtained; merging the current data stream into the old data, and updating a stream mapping queue; and then, performing the operation of calculating the stream throughput of the data stream according to the stored stream creation time, the transmitted data size and the current data packet size.
In some embodiments of the present disclosure, the flow mapping queue is a least recently used flow mapping queue.
In some embodiments of the present disclosure, the flow mapping queue is used to preserve the total transmit size of the data flow and avoid OOM, as shown in fig. 2.
In some embodiments of the present disclosure, the flow mapping queue is configured to reserve a data flow with a first frequency of transmission data, and eliminate a data flow with a second frequency of transmission data, where the first frequency is higher than the second frequency.
In some embodiments of the present disclosure, the types of data streams include elephant streams and mouse streams.
In some embodiments of the present disclosure, when sending the data stream to the buffer corresponding to its type, the stream calculation module 72 may be configured to send the data stream to the elephant stream buffer in the case that the type of the data stream is an elephant stream, and to send the data stream to the mouse stream buffer in the case that the type of the data stream is a mouse stream.
In some embodiments of the present disclosure, the flow calculation module 72 may be configured to determine the flow to which the data packet belongs using the ID received from the packet parsing module, then query the LRU queue for the flow information, calculate the flow throughput from the saved flow creation time, the transmitted data size and the current data packet size, and determine whether the flow is an elephant flow or a mouse flow; the flow data in the LRU is then updated. If no flow information is found in the LRU queue, the flow is new and its data is inserted directly into the LRU queue. The LRU ensures that flows which send data infrequently are naturally evicted, keeps the memory from overflowing, and limits the computation to the flows seen in the most recent period of time.
In some embodiments of the present disclosure, as shown in fig. 2, the elephant flow buffer and the mouse flow buffer are used to buffer and transmit the data packets of the different flows.
In some embodiments of the present disclosure, as shown in fig. 6, the size of the mouse stream buffer is greater than the size of the elephant stream buffer.
In some embodiments of the present disclosure, as shown in fig. 6, the size of the mouse stream buffer is N times the size of the elephant stream buffer, N being a natural number greater than 1.
The embodiments of the disclosure use the information in the data packet to effectively identify the stream a packet belongs to and classify that stream by computation, ensuring low latency for the mouse stream so that it is not affected by the elephant stream.
According to the embodiments of the disclosure, data packets are placed into different buffers according to the streams they belong to, realizing stream-based fine-grained control and avoiding the unfairness and victim-stream problems caused by the coarse-grained PFC control mechanism, which does not distinguish between streams.
The above embodiments of the disclosure assign buffer sizes per stream type and reuse the streams' own congestion control mechanisms instead of implementing congestion control in the switch, thereby avoiding the high cost of upgrading the switch whenever the congestion control algorithm is upgraded, and preserving the flexibility and convenience of transport layer protocol upgrades.
The above embodiments of the present disclosure do not require modification of the related art RDMA protocol congestion control algorithm.
The above-described embodiments of the present disclosure do not require extensive and costly modification projects to the related art switch hardware.
The above embodiments of the disclosure adopt a flow control technique that avoids the unfairness and victim flow problems caused by the PFC coarse-grained control mechanism, which does not differentiate between flows.
Fig. 8 is a schematic structural diagram of other embodiments of a congestion control device of the present disclosure. As shown in fig. 8, the congestion control apparatus of the present disclosure may include a memory 101 and a processor 102.
The memory 101 is configured to store instructions, and the processor 102 is coupled to the memory 101, the processor 102 being configured to implement a congestion control method as described in any of the embodiments above (e.g., any of fig. 1, 3, and 5) based on the instructions stored by the memory.
As shown in fig. 8, the congestion control apparatus further includes a communication interface 103 for information interaction with other devices. Meanwhile, the congestion control apparatus further includes a bus 104, and the processor 102, the communication interface 103, and the memory 101 perform communication with each other through the bus 104.
The memory 101 may include high-speed RAM, and may further include non-volatile memory, such as at least one disk memory. The memory 101 may also be a memory array. The memory 101 may also be partitioned into blocks, and the blocks may be combined into virtual volumes according to certain rules.
Further, the processor 102 may be a central processing unit CPU, or may be an application specific integrated circuit ASIC, or one or more integrated circuits configured to implement embodiments of the present disclosure.
The above embodiments of the present disclosure provide a flow control-based RDMA congestion control method and apparatus to alleviate victim flow problems and optimize network transmission efficiency in a data center.
According to another aspect of the present disclosure, there is provided a switch comprising congestion control apparatus as described in any of the embodiments described above (e.g. any of fig. 2, 6, 7 and 8).
The embodiments of the disclosure can be applied to RDMA networks in data centers and by cloud service providers, and can effectively mitigate the increase in service network latency caused by microservices and by the mixed deployment of business services and data services in the data center, as well as the increase in user request time and request timeout probability caused by data services lacking network isolation.
The above embodiments of the present disclosure rely on the congestion control algorithm of the network transport protocol itself: through a flow-based identification algorithm, data packets are placed into different buffers based on the class of the flow to which each packet belongs. The method can accommodate future upgrades of a network protocol's congestion control algorithm without modification, saving time and money.
According to the embodiments of the disclosure, the network transmission efficiency of RDMA in the data center can be effectively improved, large-scale network congestion can be avoided, user satisfaction can be improved, and the network request pressure in the data center can be relieved, which is of significant value in IDC data centers.
According to another aspect of the disclosure, there is provided a computer readable storage medium storing computer instructions that when executed by a processor implement a congestion control method as described in any of the embodiments above (e.g., any of fig. 1, 3, and 5).
In some embodiments of the present disclosure, the computer-readable storage medium may be a non-transitory computer-readable storage medium.
The above embodiments of the present disclosure belong to the technical fields of networking, RDMA, and software-hardware co-acceleration.
The embodiments of the disclosure disclose a flow-control-based (per-flow) RDMA congestion control method and apparatus. The congestion control apparatus comprises a packet parsing module, a stream calculation module, an LRU cache Map queue, and buffers for the different streams. The packet parsing module identifies which flow a packet belongs to, and the calculation module determines which buffer the flow enters. The LRU cache queue ensures that no OOM occurs.
It will be apparent to those skilled in the art that embodiments of the present disclosure may be provided as a method, apparatus, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The congestion control apparatus, switches, packet parsing modules, and flow computation modules described above may be implemented as general-purpose processors, programmable logic controllers (PLCs), digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or any suitable combination thereof, for performing the functions described herein.
Thus far, the present disclosure has been described in detail. In order to avoid obscuring the concepts of the present disclosure, some details known in the art are not described. How to implement the solutions disclosed herein will be fully apparent to those skilled in the art from the above description.
Those of ordinary skill in the art will appreciate that all or a portion of the steps implementing the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware, where the program may be stored on a non-transitory computer readable storage medium, such as a read-only memory, a magnetic disk or an optical disk.
The description of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.

Claims (17)

1. A congestion control method, comprising:
parsing a remote direct memory access data packet, and identifying a data stream to which the data packet belongs;
determining the type of the data stream;
and sending the data stream, according to its type, to a buffer corresponding to the type of the data stream.
2. The congestion control method according to claim 1, wherein the parsing of the remote direct memory access data packet and the identifying of the data stream to which the data packet belongs comprise:
parsing a remote direct memory access data packet to obtain a source IP address, a destination IP address, a source port number and a destination queue pair;
forming a unique identifier of the data stream from the source IP address, the destination IP address, the source port number and the destination queue pair;
and determining the data stream to which the data packet belongs according to the unique identifier of the data stream.
3. The congestion control method according to claim 1 or 2, wherein the parsing of the remote direct memory access data packet and the identifying of the data stream to which the data packet belongs comprise:
parsing the remote direct memory access data packet, and obtaining and recording the size of the data packet and the time at which the data packet was received.
4. The congestion control method according to claim 1 or 2, wherein the types of data flows include elephant flows and mouse flows;
wherein said determining the type of the data stream comprises:
calculating a stream throughput of the data stream;
and determining the type of the data stream according to the stream throughput of the data stream.
5. The congestion control method of claim 4, wherein the determining the type of the data flow according to the flow throughput of the data flow comprises:
determining that the data stream is an elephant stream if the stream throughput of the data stream is greater than a predetermined threshold;
and determining that the data stream is a mouse stream if the stream throughput of the data stream is not greater than the predetermined threshold.
6. The congestion control method of claim 4, wherein the calculating the flow throughput of the data flow comprises:
and calculating the stream throughput of the data stream according to the stored stream creation time, the transmitted data size and the current data packet size.
7. The congestion control method of claim 6, wherein the calculating the flow throughput of the data flow further comprises:
judging whether the current data stream exists in a stream mapping queue or not;
inserting the current data stream into the stream map queue if the current data stream does not exist in the stream map queue; and then executing the step of calculating the stream throughput of the data stream according to the stored stream creation time, the transmitted data size and the current data packet size.
8. The congestion control method of claim 6, wherein the calculating the flow throughput of the data flow further comprises:
under the condition that the current data flow exists in the flow mapping queue, old data corresponding to the current data flow in the flow mapping queue is obtained;
merging the current data stream into the old data, and updating a stream mapping queue; and then executing the step of calculating the stream throughput of the data stream according to the stored stream creation time, the transmitted data size and the current data packet size.
9. The congestion control method of claim 8, wherein the flow mapping queue is a least recently used flow mapping queue;
the flow mapping queue is configured to store a total transmission size of the data flow.
10. The congestion control method of claim 9, wherein,
the stream mapping queue is configured to reserve a data stream with a first frequency of transmission data, and eliminate a data stream with a second frequency of transmission data, where the first frequency is higher than the second frequency.
11. The congestion control method according to claim 1 or 2, wherein the types of data flows include elephant flows and mouse flows;
the sending of the data stream, according to its type, to the buffer corresponding to the type of the data stream comprises:
in the case that the type of the data stream is an elephant stream, sending the data stream to an elephant stream buffer;
and in the case that the type of the data stream is a mouse stream, sending the data stream to a mouse stream buffer.
12. The congestion control method of claim 11, wherein:
the size of the mouse stream buffer is greater than the size of the elephant stream buffer.
13. The congestion control method of claim 12, wherein:
the size of the mouse stream buffer is N times the size of the elephant stream buffer, and N is a natural number greater than 1.
14. A congestion control apparatus, comprising:
a packet parsing module configured to parse a remote direct memory access data packet and identify the data stream to which the data packet belongs;
a stream calculation module configured to determine the type of the data stream, and to send the data stream, according to its type, to a buffer corresponding to the type of the data stream.
15. A congestion control apparatus, comprising:
a memory configured to store instructions;
a processor configured to execute the instructions to cause the congestion control apparatus to perform operations implementing the congestion control method of any of claims 1-13.
16. A switch comprising the congestion control apparatus of claim 14 or 15.
17. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the congestion control method according to any one of claims 1 to 13.
CN202311099041.6A 2023-08-29 2023-08-29 Congestion control method and device, switch and computer readable storage medium Pending CN117041166A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311099041.6A CN117041166A (en) 2023-08-29 2023-08-29 Congestion control method and device, switch and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311099041.6A CN117041166A (en) 2023-08-29 2023-08-29 Congestion control method and device, switch and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN117041166A true CN117041166A (en) 2023-11-10

Family

ID=88633565

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311099041.6A Pending CN117041166A (en) 2023-08-29 2023-08-29 Congestion control method and device, switch and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN117041166A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination