WO2022246710A1 - Method for controlling data stream transmission and communication device

Publication number
WO2022246710A1
Authority
WO
WIPO (PCT)
Prior art keywords
rate
data
total
data flow
subset
Prior art date
Application number
PCT/CN2021/096160
Other languages
English (en)
Chinese (zh)
Inventor
徐磊
唐德智
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2021/096160 (WO2022246710A1)
Priority to CN202180092829.7A (CN116868554A)
Publication of WO2022246710A1

Definitions

  • the present application relates to the technical field of data transmission, and in particular to a method and communication device for controlling data stream transmission.
  • DCN data center networks
  • the data transmission modes mainly include the transmission control protocol (TCP) transmission mechanism and lossless network transmission.
  • TCP transmission control protocol
  • the hop-by-hop flow control scheme mainly includes the priority-based flow control (PFC) scheme based on the data center bridging (DCB) technology, and the InfiniBand technology-based credit-based data flow control (CBFC) scheme.
  • PFC priority-based flow control
  • DCB data center bridging
  • CBFC credit-based data flow control
  • when the input queue is seriously congested, the upstream port is notified to stop sending data by a message that indicates pause and wait.
  • the CBFC scheme controls the output rate of the upstream port by periodically sending the available credit of the current port buffer to the upstream port.
  • CBD cyclic buffer dependency
  • each switch in the ring network pauses its upstream port while waiting for its downstream port to start transmission. In this case, the switches in the ring network are in a deadlock state.
  • GFC gentle flow control
  • the GFC solution uses the input queue length of the switch as a benchmark to calculate the appropriate output rate of the upstream port, and assigns the calculated output rate to the upstream port to ensure uninterrupted data transmission and avoid deadlock problems.
  • the GFC solution only adjusts the output rate of the upstream port, directly reducing the rate of the upstream output port, which cannot effectively alleviate the congestion of the current output queue of the switch. Therefore, the problems of large data stream transmission delay and low throughput remain.
  • the end-to-end congestion control scheme mainly includes data center quantized congestion notification (DCQCN) and high precision congestion control (HPCC).
  • DCQCN consists of three parts: the sending end, the switch and the receiving end.
  • the explicit congestion notification mechanism (ECN) is used to mark the congestion information and put it in the packet header.
  • ECN explicit congestion notification mechanism
  • when the receiving end receives a packet marked with congestion information, it sends a congestion notification packet (CNP) to the sending end.
  • CNP congestion notification packet
  • when the sender receives the CNP, it adjusts the data sending rate to alleviate the congestion of the switch.
  • HPCC feeds back more detailed congestion information through in-band network telemetry (INT), and the end host estimates the degree of network congestion from that congestion information. Because the congestion information has to be fed back, data stream transmission also incurs a certain delay.
  • INT in-band network telemetry
  • the present application provides a method for controlling data stream transmission, a communication device, a storage medium, and a computer program product, so as to reduce the transmission delay of the data stream as much as possible.
  • the present application provides a method for controlling data flow transmission. The method may include: obtaining the allocation rate of the M data flow subsets in the first data flow set, and measuring the actual rate of the M data flow subsets; determining the idle rate of the first output port according to the actual rate and the allocation rate; and allocating the idle rate to the data stream subset in the second data stream set; wherein M is a positive integer, and the first output port can be used to output the data streams in the first data stream set and the second data stream set.
  • the method can be executed by a communication device, and the communication device can be a switching node; or a module in the switching node, such as a chip.
  • the idle rate is assigned to the second data flow set, which helps to increase the rate at which the second data flow set transmits data streams and in turn helps to reduce the transmission delay of the data streams in the second data stream set. This is especially the case when the actual rate of the data stream subset in the second data stream set is about to be insufficient.
  • each data flow set includes at least one data flow subset, and the data flows in each data flow subset come from the same input port. It can also be understood that a data flow subset corresponds to an input port.
  • the actual rate of the data stream subset in the second data stream set is not less than the product of the allocation rate and the coefficient, and the coefficient is greater than the threshold and less than 1.
  • the transmission rate of the data streams in the second data stream set can be increased, which in turn helps to reduce the transmission delay of the data streams in the second data stream set.
  • the first average input rate and the first average output rate of the i-th data stream subset can be obtained, where i takes an integer in the closed interval [1, M] and the i-th data stream subset is one of the M data stream subsets; the minimum value of the first average input rate and the first average output rate is determined as the allocation rate of the i-th data stream subset.
  • the minimum value of the first average input rate and the first average output rate is the rate at which the data flows in the i-th data flow subset first encounter a transmission bottleneck; determining this minimum value as the allocation rate of the i-th data flow subset helps to reduce the transmission delay of the data streams in the i-th data stream subset as much as possible.
  • the following exemplarily shows a possible manner of determining the first average output rate.
  • the rate at which the first data flow set occupies the first output port can be determined according to the rate of the first output port, the first total number of data streams output by the first output port, and the second total number of data streams included in the first data stream set; the minimum value of the rate at which the first data flow set occupies the first output port and the feedback rate is determined as the maximum average output rate of the first data flow set, where the feedback rate is the rate that the downstream switching node specifies for the first data flow set; the first average output rate is determined according to the maximum average output rate, the second total number, and the third total number of data flows included in the i-th data flow subset flowing from the first input port.
  • the first average input rate is determined according to the rate of the input port corresponding to the i-th data flow subset, the third total number of data flows included in the i-th data flow subset flowing from the first input port, and the fourth total number of data streams included in the input port corresponding to the i-th data flow subset.
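The allocation-rate computation described above can be sketched as follows. This is a minimal illustration, not the applicant's formula: the text only names the inputs of each step, so the proportional-share arithmetic, the `SubsetInputs` type, and all field names are assumptions. Only the two `min` operations are taken directly from the text.

```python
from dataclasses import dataclass


@dataclass
class SubsetInputs:
    """Inputs for one data flow subset (all field names are hypothetical)."""
    output_port_rate: float    # rate of the first output port
    flows_on_output_port: int  # first total number
    flows_in_set: int          # second total number
    feedback_rate: float       # rate specified by the downstream switching node
    input_port_rate: float     # rate of the input port feeding this subset
    flows_in_subset: int       # third total number
    flows_on_input_port: int   # fourth total number


def allocation_rate(s: SubsetInputs) -> float:
    # ASSUMPTION: the flow set takes a share of the output port
    # proportional to its number of flows.
    set_share = s.output_port_rate * s.flows_in_set / s.flows_on_output_port
    # From the text: the maximum average output rate is the smaller of the
    # set's share of the port and the downstream feedback rate.
    max_avg_output = min(set_share, s.feedback_rate)
    # ASSUMPTION: the set's rate is split down to the subset in proportion
    # to the number of flows.
    avg_output = max_avg_output * s.flows_in_subset / s.flows_in_set
    # ASSUMPTION: the subset takes a proportional share of its input port.
    avg_input = s.input_port_rate * s.flows_in_subset / s.flows_on_input_port
    # From the text: the allocation rate is the minimum of the two averages,
    # that is, the first bottleneck the subset encounters.
    return min(avg_input, avg_output)
```

With an output port of 100 shared by 10 flows, a set of 5 flows, a feedback rate of 40, and a subset of 2 flows on an input port of 100 carrying 4 flows, the output side (16) is the bottleneck, so the sketch returns 16.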
  • the following exemplarily shows three methods for determining the idle rate of the first output port.
  • Method 1 Determine the idle rate of the first data flow set.
  • the first total allocation rate of the first data flow set is determined according to the allocation rates of the M data flow subsets; the first total actual rate of the first data flow set is determined according to the actual rates of the M data flow subsets; the idle rate of the first data flow set is determined according to the first total actual rate and the first total allocation rate.
  • among the M data stream subsets, a data stream subset whose actual rate is not less than the product of the allocation rate and the coefficient and whose first average output rate is less than the first average input rate is determined as a data stream subset of the second data stream set, where the coefficient is greater than the threshold and less than 1.
  • the fourth total allocation rate of the second data stream set and the fifth total number of data streams included in the second data stream set are obtained; a second average output rate for the second data stream set is determined according to the first total allocation rate, the first total actual rate, the fourth total allocation rate, the third total number and the fifth total number.
  • the idle rate of the first data stream set can be determined and allocated to the second data stream set, whose rate is about to be insufficient, which helps to reduce the transmission delay of the data streams in the second data stream set.
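Method 1 can be illustrated with a short sketch. The idle rate as the difference between the total allocated and total actual rates, and the two selection conditions, follow the text; the clamp at zero, the equal per-flow split of the idle rate, the default coefficient of 0.9, and every name in the code are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Subset:
    """One data flow subset (hypothetical field names)."""
    name: str
    alloc: float    # allocation rate
    actual: float   # measured actual rate
    avg_in: float   # first average input rate
    avg_out: float  # first average output rate
    flows: int      # number of data flows in the subset


def idle_rate(subsets):
    total_alloc = sum(s.alloc for s in subsets)    # first total allocation rate
    total_actual = sum(s.actual for s in subsets)  # first total actual rate
    # ASSUMPTION: a negative difference is treated as "no idle rate".
    return max(total_alloc - total_actual, 0.0)


def second_set(subsets, coeff):
    # From the text: the actual rate is at least coeff * allocation rate and
    # the average output rate is below the average input rate.
    return [s for s in subsets
            if s.actual >= coeff * s.alloc and s.avg_out < s.avg_in]


def redistribute(subsets, coeff=0.9):
    """Return new allocation rates for the subsets that receive idle rate."""
    idle = idle_rate(subsets)
    targets = second_set(subsets, coeff)
    flows = sum(s.flows for s in targets)
    if idle == 0.0 or flows == 0:
        return {}
    # ASSUMPTION: the idle rate is shared equally per data flow.
    per_flow = idle / flows
    return {s.name: s.alloc + per_flow * s.flows for s in targets}
```

A subset that uses well under its allocation contributes idle rate, while subsets that run close to their allocation and are limited on the output side receive it.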
  • Method 2 determining the idle rate of the first output port.
  • the first total allocation rate of the first data flow set is determined according to the allocation rates of the M data flow subsets; the first total actual rate of the first data flow set is determined according to the actual rates of the M data flow subsets; the second total allocation rate of the first output port is determined according to the first total allocation rate of the first data stream set; the second total actual rate of the first output port is determined according to the first total actual rate of the first data stream set; the idle rate of the first output port is determined according to the second total actual rate and the second total allocation rate.
  • the N data flow sets corresponding to the first output port include the first data flow set; the rate at which each of the N data flow sets occupies the first output port, and the feedback rate of each set, can be obtained; among the N data flow sets corresponding to the first output port, a data flow set whose first total actual rate is not less than the product of the first total allocation rate and the coefficient, and whose rate of occupying the first output port is less than the feedback rate, is determined as a data flow subset of the second data flow set, where the feedback rate is the rate specified by the downstream switching node for the corresponding data flow set, and the coefficient is greater than the threshold and less than 1.
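Method 2 works at the level of the output port rather than a single flow set. The sketch below assumes that the port-level totals are plain sums over the N flow sets, which the text does not state explicitly; the `FlowSet` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FlowSet:
    """One of the N data flow sets on the output port (hypothetical names)."""
    name: str
    total_alloc: float    # first total allocation rate of the set
    total_actual: float   # first total actual rate of the set
    port_share: float     # rate at which the set occupies the output port
    feedback_rate: float  # rate specified by the downstream switching node


def port_idle_rate(sets):
    # ASSUMPTION: port-level totals are sums over the sets on the port.
    port_alloc = sum(s.total_alloc for s in sets)    # second total allocation rate
    port_actual = sum(s.total_actual for s in sets)  # second total actual rate
    return max(port_alloc - port_actual, 0.0)


def sets_needing_rate(sets, coeff):
    # From the text: the total actual rate is at least coeff * total
    # allocation rate, and the set's share of the port is below what the
    # downstream node would accept.
    return [s.name for s in sets
            if s.total_actual >= coeff * s.total_alloc
            and s.port_share < s.feedback_rate]
```

A set whose share of the port already equals its feedback rate is skipped, since the downstream node would not accept a higher rate from it.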
  • Method 3 traversing the data subset corresponding to the first input port.
  • the M data stream subsets include a first data stream subset, and the first data stream subset corresponds to the first input port; the allocation rate and the actual rate of the first data stream subset can be obtained; the third total allocation rate corresponding to the first input port is determined according to the allocation rate of the first data flow subset; the third total actual rate corresponding to the first input port is determined according to the actual rate of the first data flow subset; the idle rate of the data flow subset corresponding to the first input port is determined according to the third total actual rate and the third total allocation rate.
  • among the data stream subsets corresponding to the first input port, a data stream subset whose actual rate is not less than the product of the allocation rate and the coefficient and whose first average output rate is greater than the first average input rate is determined as a data stream subset of the second data stream set, where the coefficient is greater than a threshold and less than 1.
  • within a preset time interval, the third total number of data streams flowing into the i-th data stream subset from the first input port is determined, where the i-th data stream subset is one of the M data stream subsets; the ratio of the third total number to the preset time interval is determined as the actual rate of the i-th data stream subset.
  • the actual rate of flow from the first input port to the i-th subset of data streams can be measured.
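The measurement step amounts to counting over a fixed window and dividing by the window length. A minimal sketch follows; the `RateMeter` class and its method names are hypothetical, and what exactly is counted (flows, packets, or bytes) is left to the caller.

```python
class RateMeter:
    """Counts arrivals for one data flow subset over a preset time interval."""

    def __init__(self, interval_s: float):
        if interval_s <= 0:
            raise ValueError("interval must be positive")
        self.interval_s = interval_s
        self.total = 0.0

    def record(self, amount: float = 1.0) -> None:
        # Called whenever traffic for this subset arrives from the input port.
        self.total += amount

    def close_interval(self) -> float:
        # From the text: actual rate = total observed / preset time interval.
        rate = self.total / self.interval_s
        self.total = 0.0  # start a fresh measurement window
        return rate
```

Resetting the counter when the interval closes keeps each measurement independent of the previous window.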
  • the updated allocation rate of the second data flow subset may be sent to the upstream node.
  • when the rate allocation table of the switching node updates the rate of a data flow f, the updated allocation rate of the data flow subset in which the data flow is located is fed back to the upstream switching node or service network card of data flow f, together with the corresponding allocation rate of the data flow itself, so that the upstream switching node or network card can limit the sending rate of the queue that carries the data flow.
  • the updated allocation rate may be carried in the priority-based flow control (PFC) control message.
  • the updated allocation rate is carried in the PFC control message, so that compatibility with the existing PFC control message is maintained.
  • the new data flow may be stored in the idle queue.
  • the new data flow may be stored in a reserved queue.
  • the present application provides a communication device, which is used to implement the first aspect or any one of the methods in the first aspect, and includes corresponding functional modules, respectively used to implement the steps in the above methods.
  • the functions may be implemented by hardware, or may be implemented by executing corresponding software through hardware.
  • Hardware or software includes one or more modules corresponding to the above-mentioned functions.
  • the processing module is used to obtain the allocation rate of the M data flow subsets in the first data flow set, where M is a positive integer; the measurement module is used to measure the actual rate of the M data flow subsets; the processing module is also used to determine an idle rate of the first output port according to the actual rate and the allocation rate, and to allocate the idle rate to a subset of data streams in the second set of data streams; wherein the first output port is used to output the data streams in the first set of data streams and the second set of data streams.
  • each data flow set includes at least one data flow subset, and the data flows in each data flow subset come from the same input port.
  • the actual rate of the data stream subset in the second data stream set is not less than the product of the allocation rate and the coefficient, and the coefficient is greater than the threshold and less than 1.
  • the processing module is specifically configured to: obtain the first average input rate and the first average output rate of the i-th data flow subset, where i takes an integer in the closed interval [1, M] and the i-th data flow subset is one of the M data flow subsets; and determine the minimum value of the first average input rate and the first average output rate as the allocation rate of the i-th data flow subset.
  • the processing module is specifically configured to: determine, according to the rate of the first output port, the first total number of data streams output by the first output port, and the second total number of data streams included in the first data stream set, the rate at which the first data flow set occupies the first output port; determine the minimum value of the rate at which the first data flow set occupies the first output port and the feedback rate as the maximum average output rate of the first data flow set, where the feedback rate is the rate that the downstream switching node specifies for the first data flow set; and determine the first average output rate according to the maximum average output rate, the second total number, and the third total number of data flows included in the i-th data flow subset flowing from the first input port.
  • the processing module is specifically configured to: determine the first average input rate according to the rate of the input port corresponding to the i-th data flow subset, the third total number of data flows included in the i-th data flow subset flowing from the first input port, and the fourth total number of data streams included in the input port corresponding to the i-th data flow subset.
  • the processing module is specifically configured to: determine the first total allocation rate of the first data stream set according to the allocation rates of the M data stream subsets; determine the first total actual rate of the first data stream set according to the actual rates of the M data stream subsets; and determine the idle rate of the first data flow set according to the first total actual rate and the first total allocated rate.
  • the processing module is further configured to: select the subset of data streams whose actual rate is not less than the product of the allocation rate and the coefficient and whose first average output rate is less than the first average input rate among the M data stream subsets , is determined as a subset of data streams of the second data stream set, and the coefficient is greater than a threshold and less than 1.
  • the processing module is specifically configured to: obtain the fourth total allocation rate of the second data stream set and the fifth total number of data streams included in the second data stream set; according to the first total allocation rate, The first total actual rate, the fourth total allocated rate, the third total and the fifth total determine a second average output rate for the second set of data streams.
  • the processing module is specifically configured to: determine the first total allocation rate of the first data stream set according to the allocation rates of the M data stream subsets; determine the first total actual rate of the first data stream set according to the actual rates of the M data stream subsets; determine a second total allocated rate for the first output port based on the first total allocated rate for the first set of data streams; determine the second total actual rate of the first output port based on the first total actual rate for the first set of data streams; and determine the idle rate of the first output port according to the second total actual rate and the second total allocated rate.
  • the N data flow sets corresponding to the first output port include the first data flow set; the processing module is also used to: obtain the rate at which each of the N data flow sets corresponding to the first output port occupies the first output port, and the feedback rate of each set; and determine, among the N data flow sets corresponding to the first output port, a data flow set whose first total actual rate is not less than the product of the first total allocation rate and the coefficient, and whose rate of occupying the first output port is less than the feedback rate, as a data stream subset of the second data stream set, where the feedback rate is the rate specified by the downstream switching node for the corresponding data stream set, and the coefficient is greater than the threshold and less than 1.
  • the processing module is specifically configured to: determine the fifth total allocation rate of the second data stream set, and the sixth total number of data streams included in the second data stream set; according to the second total allocation rate, The second total actual rate, the fifth total allocated rate, the second total and the sixth total determine the rate at which the second data flow set occupies the first output port.
  • the M data stream subsets include a first data stream subset, and the first data stream subset corresponds to the first input port; the processing module is specifically configured to: obtain the allocation rate and the actual rate of the first data stream subset; determine the third total allocation rate corresponding to the first input port according to the allocation rate of the first data flow subset; determine the third total actual rate corresponding to the first input port according to the actual rate of the first data flow subset; and determine the idle rate of the data flow subset corresponding to the first input port according to the third total actual rate and the third total allocation rate.
  • the processing module is further configured to: determine, among the data flow subsets corresponding to the first input port, a data flow subset whose actual rate is not less than the product of the allocation rate and the coefficient and whose first average output rate is greater than the first average input rate as a data stream subset of the second data stream set, where the coefficient is greater than the threshold and less than 1.
  • the processing module is specifically configured to: determine the sixth total allocation rate of the second data stream set, and the seventh total number of data streams included in the second data stream set; and determine a second average input rate for the second set of data streams according to the third total allocation rate, the third total actual rate, the sixth total allocated rate, the third total number, and the seventh total number.
  • the measurement module is specifically configured to: within a preset time interval, determine the third total number of data streams flowing into the i-th data stream subset from the first input port, and the i-th data stream subset is one of the M data flow subsets; the ratio of the third total number to the preset time interval is determined as the actual rate of the i-th data flow subset.
  • the processing module is further configured to: when it is determined that a new data flow flows into the second data flow subset, update the allocation rate of the second data flow subset, where the second data flow subset is one of the M data flow subsets.
  • the communication device further includes a transceiver module, configured to send the updated allocation rate of the second data flow subset to an upstream node.
  • the updated allocation rate is carried in the priority-based flow control (PFC) control message.
  • the processing module is further configured to: if it is determined that a new data flow arrives at the first output port and there is an idle queue at the first output port, store the new data flow in the idle queue.
  • the processing module is further configured to: if it is determined that the new data flow arrives at the first output port and there is no idle queue at the first output port, store the new data flow in the reserved queue.
  • the processing module is further configured to: determine that there is an idle queue at the first output port, and there is a data flow in the reserved queue, and move the data flow in the reserved queue into the idle queue.
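The three queue-handling rules above (use an idle queue if one exists, otherwise fall back to the reserved queue, and promote from the reserved queue once a queue becomes idle) can be sketched as follows. One data flow per regular queue, first-in-first-out promotion, and all class and method names are assumptions for illustration.

```python
from collections import deque


class OutputPort:
    """Regular queues plus one reserved queue at an output port (hypothetical)."""

    def __init__(self, num_queues: int):
        # ASSUMPTION: each regular queue holds at most one data flow;
        # None marks an idle queue.
        self.queues = [None] * num_queues
        self.reserved = deque()

    def _idle_index(self):
        for i, flow in enumerate(self.queues):
            if flow is None:
                return i
        return None

    def on_new_flow(self, flow_id: str) -> str:
        idx = self._idle_index()
        if idx is not None:
            # An idle queue exists: store the new data flow there.
            self.queues[idx] = flow_id
            return f"queue-{idx}"
        # No idle queue: park the new data flow in the reserved queue.
        self.reserved.append(flow_id)
        return "reserved"

    def on_flow_finished(self, flow_id: str) -> None:
        idx = self.queues.index(flow_id)
        self.queues[idx] = None
        if self.reserved:
            # A queue became idle while flows wait in the reserved queue:
            # move the oldest waiting flow into it (FIFO is an assumption).
            self.queues[idx] = self.reserved.popleft()
```

For a port with two regular queues, a third flow lands in the reserved queue and is promoted as soon as one of the first two finishes.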
  • the present application provides a communication device, which is used to implement the first aspect or any one of the methods in the first aspect, and includes corresponding functional modules, respectively used to implement the steps in the above method.
  • the functions may be implemented by hardware, or may be implemented by executing corresponding software through hardware.
  • Hardware or software includes one or more modules corresponding to the above-mentioned functions.
  • the communication device may be a switching node, or a module that may be used in a switching node, such as a chip or a chip system or a circuit.
  • the communication device may include a processor, and further, optionally, may also include a communication interface.
  • the processor may be configured to support the communication device to perform the corresponding functions of the switching node shown above, and the communication interface is used to support communication between the communication device and other communication devices (such as other switching nodes).
  • the communication device may further include a memory, which may be coupled with the processor, and store necessary program instructions and data of the communication device.
  • the processor is used to obtain the allocation rate of the M data stream subsets in the first data stream set, M is a positive integer, measure the actual rate of the M data stream subsets, and determine the first output according to the actual rate and the allocation rate The idle rate of the port, and assigning the idle rate to the subset of data streams in the second data stream set; wherein, the first output port is used to output the data streams in the first data stream set and the second data stream set.
  • each data flow set includes at least one data flow subset, and the data flows in each data flow subset come from the same input port.
  • the actual rate of the data stream subset in the second data stream set is not less than the product of the allocation rate and the coefficient, and the coefficient is greater than the threshold and less than 1.
  • the processor is specifically configured to: obtain the first average input rate and the first average output rate of the i-th data stream subset, where i takes an integer in the closed interval [1, M] and the i-th data flow subset is one of the M data flow subsets; and determine the minimum value of the first average input rate and the first average output rate as the allocation rate of the i-th data flow subset.
  • the processor is specifically configured to: determine, according to the rate of the first output port, the first total number of data streams output by the first output port, and the second total number of data streams included in the first data stream set, the rate at which the first data flow set occupies the first output port; determine the minimum value of the rate at which the first data flow set occupies the first output port and the feedback rate as the maximum average output rate of the first data flow set, where the feedback rate is the rate that the downstream switching node specifies for the first data flow set; and determine the first average output rate according to the maximum average output rate, the second total number, and the third total number of data flows included in the i-th data flow subset flowing from the first input port.
  • the processor is specifically configured to: determine the first average input rate according to the rate of the input port corresponding to the i-th data flow subset, the third total number of data flows included in the i-th data flow subset flowing from the first input port, and the fourth total number of data streams included in the input port corresponding to the i-th data stream subset.
  • the processor is specifically configured to: determine the first total allocation rate of the first data flow set according to the allocation rates of the M data flow subsets; determine the first total actual rate of the first data flow set according to the actual rates of the M data flow subsets; and determine the idle rate of the first data flow set according to the first total actual rate and the first total allocated rate.
  • the processor is further configured to: select the subset of data streams whose actual rate is not less than the product of the allocation rate and the coefficient and whose first average output rate is less than the first average input rate among the M data stream subsets , is determined as a subset of data streams of the second data stream set, and the coefficient is greater than a threshold and less than 1.
  • the processor is specifically configured to: obtain the fourth total allocation rate of the second data stream set and the fifth total number of data streams included in the second data stream set; according to the first total allocation rate, The first total actual rate, the fourth total allocated rate, the third total and the fifth total determine a second average output rate for the second set of data streams.
  • the processor is specifically configured to: determine the first total allocation rate of the first data flow set according to the allocation rates of the M data flow subsets; determine the first total actual rate of the first data flow set according to the actual rates of the M data flow subsets; determine a second total allocated rate for the first output port based on the first total allocated rate for the first set of data streams; determine the second total actual rate of the first output port based on the first total actual rate for the first set of data streams; and determine the idle rate of the first output port according to the second total actual rate and the second total allocated rate.
  • the N data flow sets corresponding to the first output port include the first data flow set; the processor is also used to: obtain the rate at which each of the N data flow sets corresponding to the first output port occupies the first output port, and the feedback rate of each set; and determine, among the N data flow sets corresponding to the first output port, a data flow set whose first total actual rate is not less than the product of the first total allocation rate and the coefficient, and whose rate of occupying the first output port is less than the feedback rate, as a data stream subset of the second data stream set, where the feedback rate is the rate specified by the downstream switching node for the corresponding data stream set, and the coefficient is greater than the threshold and less than 1.
  • the processor is specifically configured to: determine the fifth total allocation rate of the second data stream set, and the sixth total number of data streams included in the second data stream set; according to the second total allocation rate, The second total actual rate, the fifth total allocated rate, the second total and the sixth total determine the rate at which the second data flow set occupies the first output port.
  • the M data stream subsets include a first data stream subset, and the first data stream subset corresponds to the first input port; the processor is specifically configured to: obtain the allocation rate and the actual rate of the first data stream subset; determine the third total allocation rate corresponding to the first input port according to the allocation rate of the first data flow subset; determine the third total actual rate corresponding to the first input port according to the actual rate of the first data flow subset; and determine the idle rate of the data flow subset corresponding to the first input port according to the third total actual rate and the third total allocation rate.
  • the processor is further configured to: determine, among the data flow subsets corresponding to the first input port, a data flow subset whose actual rate is not less than the product of the allocation rate and the coefficient and whose first average output rate is greater than the first average input rate as a data stream subset of the second data stream set, where the coefficient is greater than the threshold and less than 1.
  • the processor is specifically configured to: determine a sixth total allocation rate of the second data stream set, and a seventh total number of data streams included in the second data stream set; and determine a second average input rate for the second set of data streams according to the third total allocation rate, the third total actual rate, the sixth total allocated rate, the third total number, and the seventh total number.
  • the processor is specifically configured to: within a preset time interval, determine the third total number of data streams flowing into the i-th data stream subset from the first input port, and the i-th data stream subset is one of the M data flow subsets; the ratio of the third total number to the preset time interval is determined as the actual rate of the i-th data flow subset.
  • the processor is further configured to: determine that a new data stream flows into the second data stream subset, and update the allocation rate of the second data stream subset, where the second data stream subset is one of the M data stream subsets.
  • the communication device further includes a transceiver module, configured to send the updated allocation rate of the second data flow subset to an upstream node.
  • the updated allocation rate is carried in the priority-based data flow control PFC control message.
  • the processor is further configured to: if it is determined that the new data flow arrives at the first output port and there is an idle queue at the first output port, store the new data flow in the idle queue.
  • the processor is further configured to: if it is determined that the new data flow arrives at the first output port and there is no idle queue at the first output port, store the new data flow in the reserved queue.
  • the processor is further configured to: determine that there is an idle queue at the first output port, and there is a data flow in the reserved queue, and move the data flow in the reserved queue into the idle queue.
  • the present application provides a computer-readable storage medium storing a computer program or instructions; when the computer program or instructions are executed by a communication device, the communication device performs the method in the first aspect or in any possible implementation of the first aspect.
  • the present application provides a computer program product including a computer program or instructions; when the computer program or instructions are executed by a communication device, the communication device performs the method in the first aspect or in any possible implementation of the first aspect.
  • FIG. 1a is a schematic diagram of a data center network architecture provided by the present application.
  • Figure 1b is a schematic diagram of another data center network architecture provided by the present application.
  • Fig. 2 is a schematic structural diagram of a switching node provided by the present application.
  • FIG. 3 is a schematic structural diagram of another data center network provided by the present application.
  • FIG. 4 is a schematic flowchart of a method for controlling data stream transmission provided by the present application.
  • FIG. 5 is a schematic flowchart of a method for determining a first average output rate of a first data flow subset provided by the present application
  • FIG. 6a is a schematic flowchart of a method for determining whether there is an idle rate in a data stream set provided by the present application
  • FIG. 6b is a schematic flowchart of a method for allocating idle rates for a second data flow set provided by the present application
  • FIG. 7a is a schematic flowchart of a method for determining whether an output port has an idle rate provided by the present application
  • FIG. 7b is a schematic flowchart of a method for allocating idle rates for a second data flow set provided by the present application.
  • FIG. 8a is a schematic flowchart of a method for determining whether there is an idle rate in a data flow set provided by the present application
  • FIG. 8b is a schematic flowchart of a method for allocating idle rates for a second data flow set provided by the present application.
  • FIG. 9a is a schematic diagram of the flow direction of a data flow in a switching node provided by the present application.
  • FIG. 9b is a schematic diagram of the flow direction of another data flow in the switching node provided by the present application.
  • FIG. 9c is a schematic diagram of the flow direction of another data flow in the switching node provided by the present application.
  • FIG. 9d is a schematic diagram of the flow direction of another data flow in the switching node provided by the present application.
  • FIG. 10 is a schematic diagram of the format of a PFC control message provided by the present application.
  • FIG. 11 is a schematic flowchart of a method for processing a new data stream received by a switching node provided in the present application
  • FIG. 12 is a schematic structural diagram of a communication device provided by the present application.
  • FIG. 13 is a schematic structural diagram of a communication device provided by the present application.
  • the data flow identifier can uniquely identify a data flow.
  • the data flow identifier may be a quintuple, a triple, or other identifiers that can uniquely identify the data flow.
  • the five-tuple usually refers to the set of five values consisting of a source internet protocol (IP) address, a source port, a destination IP address, a destination port, and a transport layer protocol. For example, 192.168.1.1 10000 TCP 121.14.88.76 80 constitutes a five-tuple.
  • the meaning is that a terminal with IP address 192.168.1.1 connects with a terminal with IP address 121.14.88.76 and port 80 through port 10000 and TCP protocol.
  • the data flow identifier is used by the second node to determine which data flow a data packet belongs to.
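  • As an illustration only, the following Python sketch shows how a five-tuple could serve as the data flow identifier that maps a data packet to its data flow. The `FiveTuple` type, the `flow_id` helper, and the dictionary layout of a parsed packet are assumptions made for this example and are not defined by the present application.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Hypothetical flow identifier built from the five header fields."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def flow_id(packet: dict) -> FiveTuple:
    """Build the identifier from a parsed packet (assumed dict layout)."""
    return FiveTuple(
        packet["src_ip"],
        packet["src_port"],
        packet["dst_ip"],
        packet["dst_port"],
        packet["protocol"],
    )


# Two packets with the same header fields but different payloads.
pkt_a = {"src_ip": "192.168.1.1", "src_port": 10000,
         "dst_ip": "121.14.88.76", "dst_port": 80,
         "protocol": "TCP", "payload": b"first"}
pkt_b = dict(pkt_a, payload=b"second")

# Equal identifiers mean both packets belong to the same data flow.
same_flow = flow_id(pkt_a) == flow_id(pkt_b)
```

  • Because a named tuple is hashable, such an identifier could also be used directly as a dictionary key when counting or grouping data flows.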
  • Queue refers to using a queue algorithm to classify the received data streams, and then send the data streams in the queue based on the queue scheduling mechanism. Different data streams may correspond to different queues.
  • the following introduces the architecture of the data center network applicable to the present application.
  • There are multiple networking modes for the data center network and the following exemplarily shows two possible architecture diagrams of the data center network applicable to the present application.
  • FIG. 1a is a schematic diagram of a data center network architecture provided by the present application.
  • the data center network includes three layers, namely a core layer (Core), an aggregation layer (Aggregation) and an access layer (Access), wherein the access layer can also be called an edge layer.
  • Each layer may include one or more switching nodes (or called switching devices).
  • Figure 1a takes the core layer including 2 switching nodes, the aggregation layer including 4 switching nodes, and the access layer including 3 switching nodes as an example.
  • the switching nodes included in the core layer may be called Core nodes
  • the switching nodes included in the aggregation layer may be called Aggregation nodes
  • the switching nodes included in the access layer may be called Access nodes or top of rack (TOR) nodes
  • the switching node may be, for example, a switch.
  • the downlink port of the Access node (for example, the rate can be 10Gbps) can be connected to a server (or called a host), and the uplink port of the Access node (for example, the rate can be 40Gbps) can be connected to an Aggregation node; the downlink port of the Aggregation node can be connected to the Access node, and the uplink port of the Aggregation node can be connected to the Core node.
  • the Core node may be called the upstream node of the Aggregation node, and the Aggregation node may be called the upstream node of the Access node.
  • the networking mode of the data center network shown in Figure 1a is only an example. The data center network can be divided into two layers or into more than three layers, and the numbers of switching nodes included in the core layer, the aggregation layer and the access layer may be the same or different, which is not limited in this application.
  • FIG. 1b is a schematic diagram of another data center network architecture provided by the present application.
  • the data center network includes two levels of nodes, which are backbone (spine) switching nodes and leaf (leaf) switching nodes.
  • a backbone switching node is used to connect each leaf switching node, and the leaf switching node is used to connect to a server (or called a host).
  • Fig. 1b takes an example including 2 backbone switching nodes and 4 leaf switching nodes.
  • the downlink ports of the backbone switching nodes can be used to connect to the leaf switching nodes.
  • the networking mode of the data center network shown in Figure 1b is only an example, and the number of backbone switching nodes and the number of leaf switching nodes included in the data center network may be the same or different, which is not limited in this application.
  • FIG. 2 is a schematic structural diagram of a switching node provided in this application.
  • the switching node includes P input ports and Q output ports, both P and Q are integers greater than 1, and the input ports and output ports may also be called network interfaces (network interface).
  • the input port is used to receive data flow from a node other than the switching node (for example, from an upstream switching node).
  • the output port is used to transmit data streams to the outside of the switching node (for example, to a downstream switching node).
  • the switching node has built-in multiple output queues (output queue, OQ), which are used to cache the data flow to the downstream switching node or server. Wherein, each queue may include one or more data flow sets.
  • each set of data streams may include one or more subsets of data streams, and each subset of data streams may include one or more data streams.
  • the data flow set may be that the downstream switching node of the switching node specifies which data flows belong to the same data flow set through a feedback message.
  • the last-hop switching node in the data center network can divide the data flows flowing into the same server from the same output port of the last-hop switching node into the same data flow set according to the server connected to the output port, and send the data flow to the last-hop An upstream switching node of the switching node (for example, the switching node shown in FIG.
  • the switching node may further divide the data flow set into data flow subsets according to the corresponding relationship between data flows and input ports. For example, the switching node may further divide data flows received from the same input port into a data flow subset, and data flows received from different input ports are divided into different data flow subsets.
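  • The two-level grouping described above can be sketched as follows. This is a minimal illustration, assuming that each data flow is already labelled with its data flow set (for example, from downstream feedback) and with the input port it arrived on; the `group_flows` helper and the tuple layout are hypothetical.

```python
from collections import defaultdict


def group_flows(flows):
    """Group flows into data flow sets, then into subsets per input port.

    flows: iterable of (flow_id, input_port, set_label) tuples
           (an assumed layout for this sketch).
    Returns {set_label: {input_port: [flow_id, ...]}}.
    """
    sets = defaultdict(lambda: defaultdict(list))
    for fid, in_port, set_label in flows:
        # Flows of one set that share an input port form one subset.
        sets[set_label][in_port].append(fid)
    return sets


flows = [("f1", 1, "Fx"), ("f2", 1, "Fx"), ("f3", 2, "Fx"), ("f4", 2, "Fy")]
grouped = group_flows(flows)
```

  • In this example the data flow set Fx has two data flow subsets, one for input port 1 and one for input port 2, while Fy has a single subset.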
  • the switching node can also include an output scheduler (egress scheduler, SCE) and a queue manager (queue manager, QM). The queue manager can be responsible for maintaining the multiple output queues, and the output scheduler is used for scheduling the output queues in the queue manager, so that the data flows buffered in the output queues are transmitted to a node outside the switching node through a network interface.
  • the switching node may also be configured with an external cache.
  • FIG. 3 is a schematic structural diagram of another data center network provided by the present application.
  • the data center network includes a switching node S1, a switching node S2, a switching node S3, a switching node S4, a switching node S5, and a switching node S6.
  • Data streams of switching node S1, switching node S2, and switching node S3 can flow to switching node S4 through respective output ports; data streams of switching node S4 can respectively flow to switching node S5 and switching node S6 through output ports.
  • switching node S1, switching node S2, and switching node S3 are all upstream nodes of switching node S4, and switching node S4 is a downstream node of switching node S1, switching node S2, and switching node S3; switching node S4 is an upstream node of switching node S5 and switching node S6, and switching node S5 and switching node S6 are downstream nodes of switching node S4.
  • a data flow set of a queue of an output port of an upstream switching node usually flows to a data flow set of a queue of an output port of a downstream switching node. It should be understood that when the queue in an output port of the downstream switching node includes a data flow set, a data flow set of a queue of an output port of the upstream switching node points to a queue of an output port of the downstream switching node.
  • the transmission method based on the existing data stream will cause a large transmission delay of the data stream.
  • the present application proposes a method for controlling data stream transmission.
  • the method for controlling data flow transmission can efficiently solve the congestion problem in the network, thereby reducing the time delay of data transmission.
  • the allocation rate of each data stream included in a data stream subset is the same; for example, the allocation rate of a data stream is equal to the ratio of the allocation rate of the data stream subset to the total number of data streams included in the data stream subset.
  • the allocation rate of the data flow set is equal to the sum of the allocation rates of the respective data flow subsets included in the data flow set.
  • the actual rate of each data stream included in the data stream subset is the same, for example, the actual rate of the data stream is equal to the ratio of the actual rate of the data stream subset to the total number of data streams included in the data stream subset.
  • the actual rate of the data flow set is equal to the sum of the actual rates of the respective data flow subsets included in the data flow set.
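  • The relationships among per-flow, per-subset and per-set rates stated above can be written as two small helpers. This is only a sketch; the function names and the example numbers (for instance, rates in Gbps) are assumptions for illustration.

```python
def per_flow_rate(subset_rate: float, flow_count: int) -> float:
    """Rate of one data flow: subset rate divided evenly among its flows."""
    if flow_count <= 0:
        raise ValueError("a data flow subset must contain at least one flow")
    return subset_rate / flow_count


def set_rate(subset_rates) -> float:
    """Rate of a data flow set: sum of the rates of its subsets."""
    return sum(subset_rates)


# A subset allocated 12.0 with 4 flows, and a set made of two subsets.
flow_alloc = per_flow_rate(12.0, 4)
set_alloc = set_rate([12.0, 8.0])
```

  • The same helpers apply to both allocation rates and actual rates, since the text defines the two in the same way.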
  • FIG. 4 is a schematic flowchart of a method for controlling data stream transmission provided in this application.
  • This method can be applied to the data center network shown in FIG. 1a, FIG. 1b or FIG. 3 above.
  • the switching node in this method may be any switching node in the above-mentioned Figure 1a or Figure 1b, may also be the switching node shown in the above-mentioned Figure 2, or may be any switching node in the above-mentioned Figure 3. The switching node may include P input ports and Q output ports, where the first input port is one of the P input ports, the first output port is one of the Q output ports, and the first data flow set is a data flow set of the first output port.
  • the method includes the following steps:
  • Step 401, the switching node may obtain the allocation rates of the M data flow subsets in the first data flow set, where M is a positive integer.
  • the switching node may first determine the first average input rate and the first average output rate of the i-th data flow subset, and determine the minimum of the first average input rate and the first average output rate as the allocation rate of the i-th data flow subset.
  • the allocation rate Rate_i_Fx of the i-th data flow subset = Min(first average output rate Rl_i_Fx, first average input rate Rr_i_Fx).
  • the switching node may determine the first average input rate Rr_i_Fx based on the rate Rin of the first input port, the third total number n_i_Fx of data flows flowing from the first input port into the i-th data flow subset, and the fourth total number M_i of data flows included in the first input port. It should be understood that the rate Rin of the first input port is equal to the bandwidth B_i of the first input port.
  • the first average input rate Rr_i_Fx = the rate Rin of the first input port × the third total number n_i_Fx / the fourth total number M_i.
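  • The two formulas above can be sketched in Python as follows. The function names and the example values (a 40 Gbps input port carrying 8 data flows, 2 of which belong to the i-th data flow subset) are assumptions for illustration, not values from the present application.

```python
def first_average_input_rate(rin: float, n_i_fx: int, m_i: int) -> float:
    """Rr_i_Fx = Rin * n_i_Fx / M_i (share of the input port by flow count)."""
    return rin * n_i_fx / m_i


def allocation_rate(rl_i_fx: float, rr_i_fx: float) -> float:
    """Rate_i_Fx = Min(Rl_i_Fx, Rr_i_Fx)."""
    return min(rl_i_fx, rr_i_fx)


# Input side: 40 Gbps port, 8 flows in total, 2 of them in subset i.
rr = first_average_input_rate(40.0, 2, 8)
# Output side assumed to offer 6.0 to this subset, so the output is the limit.
rate = allocation_rate(6.0, rr)
```

  • Taking the minimum means the subset is never allocated more than either the input side or the output side can carry.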
  • the first data flow set of the first output port can be expressed as PjQiFx. Which data flow subsets are included in PjQiFx is specified by the feedback information sent by the downstream switching node.
  • For the feedback information refer to the related description in FIG. 10 below, and details are not repeated here.
  • the following is a schematic flowchart of a method for determining a first average output rate of a subset of data streams exemplarily shown. Referring to Figure 5, the method includes the following steps:
  • Step 501, the switching node may determine the first rate BP_Fx at which the first data flow set occupies the first output port, according to the rate of the first output port, the first total number K_j of data flows output by the first output port, and the second total number N_Fx of data flows included in the first data flow set.
  • Step 502, the switching node determines the minimum of the first rate BP_Fx, at which the first data flow set occupies the first output port, and the feedback rate Rf_Fx as the maximum average output rate Rout_Fx of the first data flow set.
  • the feedback rate is the rate specified by the downstream switching node for transmitting the data flows of the first data flow set.
  • the maximum average output rate Rout_Fx = Min(first rate BP_Fx, feedback rate Rf_Fx).
  • Step 503 The switching node determines a first average output rate according to the maximum average output rate, the second total number, and the third total number of data flows flowing from the first input port to the i-th data flow subset.
  • first average output rate Rl_i_Fx = maximum average output rate Rout_Fx × third total number n_i_Fx / second total number N_Fx.
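  • Steps 501 to 503 can be sketched as three small functions. The text gives the inputs of step 501 but not its formula, so the proportional share used in `first_rate` below is an assumption made by analogy with the first average input rate; the function names and example numbers are likewise hypothetical.

```python
def first_rate(b_out: float, n_fx: int, k_j: int) -> float:
    """BP_Fx: share of the output port taken by data flow set Fx.

    ASSUMPTION: proportional share b_out * N_Fx / K_j. The source lists
    these inputs but does not state the formula explicitly.
    """
    return b_out * n_fx / k_j


def max_average_output_rate(bp_fx: float, rf_fx: float) -> float:
    """Rout_Fx = Min(BP_Fx, Rf_Fx), as in step 502."""
    return min(bp_fx, rf_fx)


def first_average_output_rate(rout_fx: float, n_i_fx: int, n_fx: int) -> float:
    """Rl_i_Fx = Rout_Fx * n_i_Fx / N_Fx, as in step 503."""
    return rout_fx * n_i_fx / n_fx


# 100 Gbps output port with 10 flows, 4 of them in Fx, 2 of those in subset i.
bp = first_rate(100.0, 4, 10)
rout = max_average_output_rate(bp, 30.0)   # downstream feedback rate of 30.0
rl = first_average_output_rate(rout, 2, 4)
```

  • In this example the downstream feedback rate, not the local port share, limits the data flow set, and the subset receives half of that limit because it holds half of the flows in Fx.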
  • Step 402 the switching node measures the actual rates of the M data flow subsets.
  • the switching node may determine, within a preset time interval t, the third total number n_i_Fx of data flows flowing from the first input port into the i-th data flow subset, and determine the ratio of the third total number n_i_Fx to the preset time interval t as the actual rate Rm_i_Fx of the i-th data flow subset.
  • actual rate Rm_i_Fx = third total number n_i_Fx / preset time interval t.
  • the preset time interval t may be a link delay, for example, 2 microseconds to 4 microseconds.
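  • As a minimal sketch of the measurement in step 402, the ratio can be computed as below. The function name, the choice of microseconds as the unit, and the example numbers are assumptions for illustration.

```python
def actual_rate(n_i_fx: int, interval_us: float) -> float:
    """Rm_i_Fx = n_i_Fx / t, with t given in microseconds in this sketch."""
    if interval_us <= 0:
        raise ValueError("the preset time interval must be positive")
    return n_i_fx / interval_us


# Count observed over a 4-microsecond window (about one link delay).
rm = actual_rate(8, 4.0)
```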
  • Step 403 the switching node determines the idle rate of the first output port according to the actual rate and the allocated rate.
  • determining the idle rate of the first output port may include determining the idle rate of the first data flow set of the first output port (see the introduction of FIG. 6a below), determining the idle rate of the first output port as a whole (see the introduction of FIG. 7a below), or determining the idle rate of the data flow subset corresponding to the first input port of the first output port (see the introduction of FIG. 8a below); details are not repeated here.
  • Step 404, the switching node allocates the idle rate to the data flow subsets in the second data flow set.
  • the actual rate of each data flow subset in the second data flow set is not less than the product of the allocation rate and the coefficient, and the coefficient is greater than the threshold and less than 1.
  • for example, the j-th data flow subset satisfies Rm_j_Fx ≥ 0.9 × Rate_j_Fx, where 0.9 is the coefficient.
  • the coefficient can also be less than 0.9 or greater than 0.9, and any coefficient that can indicate that the actual rate of the subset of data streams in the second data stream set is close to the allocated rate is possible, and this application does not limit the specific value of the coefficient .
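  • The membership test for the second data flow set can be sketched as a single predicate. The function name and the default coefficient of 0.9 follow the example in the text; the threshold check and the sample rates are assumptions for illustration.

```python
def is_near_allocation(rm: float, rate: float, coeff: float = 0.9) -> bool:
    """True if the actual rate is at least coeff times the allocation rate."""
    if not 0.0 < coeff < 1.0:
        raise ValueError("the coefficient must lie strictly between 0 and 1")
    return rm >= coeff * rate


# A subset using 9.5 of its 10.0 allocation is close to its limit;
# a subset using only 8.0 is not.
busy = is_near_allocation(9.5, 10.0)
relaxed = is_near_allocation(8.0, 10.0)
```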
  • the actual rate of the subset of data streams in the second set of data streams is not less than the product of the allocation rate and the coefficient, indicating that the rate at which the subset of data streams in the second set of data streams transmits data streams will soon be insufficient.
  • the rate at which the second data flow set transmits data flows being about to be insufficient includes the input rate of the second data flow set being about to be insufficient and/or the output rate of the second data flow set being about to be insufficient. If the output rate of the second data flow set is about to be insufficient, the idle rate can be allocated to the output rate of the second data flow set; for details, refer to the introduction of FIG. 6b or FIG. 7b below. If the input rate of the second data flow set is about to be insufficient, the idle rate can be allocated to the input rate of the second data flow set; for details, refer to the introduction of FIG. 8b below.
  • the idle rate is allocated to the second data flow set, thereby helping to improve the rate at which the second data flow set transmits data streams, and thus helps to reduce the transmission delay of the data streams of the second data stream set. This is especially the case when the actual rate of the subset of streams in the second set of streams is about to be insufficient.
  • through the above steps, the switching node can determine the third total number n_i_Fx of data flows flowing into the i-th data flow subset from the first input port, the allocation rate Rate_i_Fx of the i-th data flow subset, the first average input rate Rr_i_Fx of the i-th data flow subset, the first average output rate Rl_i_Fx of the i-th data flow subset, the maximum average output rate Rout_Fx of the data flow set Fx, and the first rate BP_Fx at which the data flow set Fx occupies the output port Pj. Further, optionally, the actual rate Rm_i_Fx of the i-th data flow subset may also be measured. Based on this, the switching node can maintain the rate allocation table shown in Table 1.
  • the column where the data flow set Fx is located represents a data flow set of a queue of an output port, and each entry includes, for the i-th data flow subset, the allocation rate Rate_i_Fx, the actual rate Rm_i_Fx, the third total number n_i_Fx of data flows flowing into the i-th data flow subset from the first input port, the first average input rate Rr_i_Fx, and the first average output rate Rl_i_Fx.
  • the following exemplarily shows three possible methods for the switching node to determine the idle rate of the first output port.
  • Method 1 traversing in the first data flow set, that is, determining the idle rate of the first data flow set of the first output port.
  • FIG. 6a is a schematic flowchart of a method for determining the idle rate of the first data flow set provided by the present application. The method includes the following steps:
  • Step 601, the switching node may determine a first total allocation rate corresponding to the first data flow set according to the allocation rates of the M data flow subsets.
  • the first total allocation rate represents the sum of the allocation rates Rate_i_Fx of all data flow subsets in the column where the first data flow set is located.
  • Step 602 The switching node determines a first total actual rate corresponding to the first data flow set according to the actual rates of the M data flow subsets.
  • the first total actual rate is obtained by measuring and summing the actual rates Rm_i_Fx of all data stream subsets in the column where the first data stream set is located.
  • Step 603 the switching node determines the idle rate of the first data flow set according to the first total actual rate and the first total allocated rate.
  • if the switching node determines that the first total actual rate is less than the first total allocated rate, it determines that there is an idle rate in the first data flow set. Further, optionally, if the switching node determines that the first total actual rate has been less than the first total allocated rate for more than 2 round-trip time (RTT) durations, the first data flow set is under-throughput, that is, there is an idle rate in the first data flow set.
  • for example, CPr_Fx < CP_Fx for more than 2 RTT durations, or CPr_Fx < 0.9 × CP_Fx for more than 2 RTT durations, indicates that there is an idle rate in the first data flow set.
  • the idle rate of the first data flow set can be determined, and the idle rate of the first data flow set can be allocated to the second data flow set.
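  • Steps 601 to 603 can be sketched as below. Treating the idle rate as the difference CP_Fx - CPr_Fx is an assumption (it matches the term used later when the idle rate is redistributed); the function names, the optional `coeff` parameter, and the timing values are hypothetical.

```python
def idle_rate(cp_fx: float, cpr_fx: float) -> float:
    """Idle rate of data flow set Fx, assumed to be CP_Fx - CPr_Fx (never negative)."""
    return max(cp_fx - cpr_fx, 0.0)


def has_idle_rate(cp_fx: float, cpr_fx: float,
                  under_duration: float, rtt: float,
                  coeff: float = 1.0) -> bool:
    """True if CPr_Fx < coeff * CP_Fx has held for more than 2 RTT durations.

    under_duration: how long the condition has already held (same unit as rtt).
    coeff: 1.0 for the plain comparison, 0.9 for the stricter example.
    """
    return cpr_fx < coeff * cp_fx and under_duration > 2 * rtt


# Per-subset rates of one data flow set (illustrative numbers).
alloc = [10.0, 10.0, 10.0]     # Rate_i_Fx
measured = [10.0, 6.0, 4.0]    # Rm_i_Fx
cp = sum(alloc)                # first total allocation rate CP_Fx
cpr = sum(measured)            # first total actual rate CPr_Fx

idle = idle_rate(cp, cpr)
confirmed = has_idle_rate(cp, cpr, under_duration=25e-6, rtt=10e-6)
too_early = has_idle_rate(cp, cpr, under_duration=15e-6, rtt=10e-6)
```

  • Requiring the condition to persist for more than 2 RTT durations keeps a short dip in traffic from being treated as spare capacity.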
  • FIG. 6b is a schematic flowchart of a method for allocating idle rates for the second data stream set provided by the present application. The method includes the following steps:
  • Step 611, the switching node traverses the allocation rate, the first average output rate and the first average input rate of the M data flow subsets of the first data flow set, and determines each data flow subset whose actual rate is not less than the product of the allocation rate and the coefficient and whose first average output rate is less than the first average input rate as a data flow subset in the second data flow set.
  • for example, the switching node traverses the allocation rate, the first average output rate and the first average input rate of the data flow subsets in the column where the data flow set Fx is located, and determines each data flow subset whose actual rate Rm_i_Fx is not less than the allocation rate Rate_i_Fx × 0.9 (coefficient) and whose first average output rate Rl_i_Fx is less than the first average input rate Rr_i_Fx as a data flow subset in the second data flow set.
  • the switching node may put the data flow subsets that satisfy Rm_i_Fx ≥ 0.9 × Rate_i_Fx and Rl_i_Fx < Rr_i_Fx into the second data flow set ColR_Fx. It should be understood that, for a data flow subset with Rm_i_Fx ≥ 0.9 × Rate_i_Fx and Rl_i_Fx < Rr_i_Fx, Rl_i_Fx is the bottleneck of the transmission rate.
  • the second data flow set may be a set of data flow subsets from one input port, or a set of data flow subsets from multiple input ports, which is not limited in this application.
  • Step 612 the switching node obtains the fourth total allocation rate corresponding to the second data flow set and the fifth total number of data flows included in the second data flow set.
  • the fourth total allocation rate CPrx_Fx = Σ Rate_i_Fx, summed over the data flow subsets in the second data flow set ColR_Fx.
  • the allocation rates Rate_i_Fx of these three data flow subsets can be summed to obtain the fourth total allocation rate CPrx_Fx.
  • the fifth total Kcr_j is the number of all data streams in the three data stream subsets.
  • Step 613, the switching node determines a second average output rate of the second data flow set according to the first total allocated rate, the first total actual rate, the fourth total allocated rate, the third total number and the fifth total number.
  • the second average output rate Rl'_i_Fx = (first total allocated rate CP_Fx - first total actual rate CPr_Fx + fourth total allocated rate CPrx_Fx) × third total number n_i_Fx / fifth total number Kcr_j.
  • Step 614, the switching node may update, according to the second average output rate, the first average output rate of each data flow subset in the second data flow set whose first average output rate is smaller than its first average input rate.
  • for example, the switching node replaces Rl_i_Fx with Rl'_i_Fx for each data flow subset in the second data flow set that satisfies Rl_i_Fx < Rr_i_Fx.
  • the switching node may empty the second data flow set ColR_Fx.
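  • Steps 611 to 614 can be put together in the following sketch. The `Subset` record, the `redistribute` function, and the example numbers are assumptions for illustration; only the selection conditions and the formula for Rl'_i_Fx come from the text.

```python
from dataclasses import dataclass


@dataclass
class Subset:
    """One row of the rate allocation table (hypothetical field names)."""
    n: int        # n_i_Fx, number of data flows in the subset
    rate: float   # Rate_i_Fx, allocation rate
    rm: float     # Rm_i_Fx, measured actual rate
    rl: float     # Rl_i_Fx, first average output rate
    rr: float     # Rr_i_Fx, first average input rate


def redistribute(subsets, coeff=0.9):
    """Give the idle rate of one data flow set to its output-limited subsets."""
    cp = sum(s.rate for s in subsets)    # first total allocation rate CP_Fx
    cpr = sum(s.rm for s in subsets)     # first total actual rate CPr_Fx
    # Step 611: near their allocation and limited by the output side.
    col_r = [s for s in subsets if s.rm >= coeff * s.rate and s.rl < s.rr]
    if cpr >= cp or not col_r:
        return subsets                   # nothing idle, or nobody needs it
    # Step 612: totals over the second data flow set ColR_Fx.
    cprx = sum(s.rate for s in col_r)    # fourth total allocation rate
    kcr = sum(s.n for s in col_r)        # fifth total number
    # Steps 613 and 614: compute Rl'_i_Fx and replace Rl_i_Fx with it.
    for s in col_r:
        s.rl = (cp - cpr + cprx) * s.n / kcr
    return subsets


table = [
    Subset(n=2, rate=10.0, rm=10.0, rl=10.0, rr=20.0),  # saturated, output-limited
    Subset(n=2, rate=10.0, rm=4.0, rl=10.0, rr=20.0),   # leaves 6.0 unused
]
redistribute(table)
```

  • In this example the first subset's output rate rises from 10.0 to 16.0, that is, its own allocation plus the 6.0 left unused by the second subset, while the second subset is untouched.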
  • Method 2 traverse within the first output port.
  • FIG. 7a is a schematic flowchart of a method for determining the idle rate of the first output port provided in the present application. The method includes the following steps:
  • Step 701 the switching node may determine a second total allocation rate corresponding to the first output port according to the allocation rates of the M data flow subsets.
  • the first output port corresponds to N data flow sets. The switching node can determine the allocation rates of the data flow subsets included in each data flow set and, from those allocation rates, the first total allocation rate of each data flow set. The second total allocation rate corresponding to the first output port can then be determined from the first total allocation rates of the N data flow sets.
  • the second total allocation rate BPa_j = Σ CP_Fx, summed over the data flow sets Fx of the first output port, where CP_Fx represents the first total allocation rate corresponding to the data flow set Fx.
  • for example, the data flow set Fx corresponds to a first total allocation rate, the data flow set Fx-1 also corresponds to a first total allocation rate, and the data flow set Fx+1 also corresponds to a first total allocation rate; the first total allocation rates corresponding to all data flow sets of the first output port are summed to obtain the second total allocation rate BPa_j. It should be understood that, for the process of determining the first total allocation rate, reference may be made to the relevant description of the foregoing step 601, and details are not repeated here.
  • Step 702 the switching node may determine a second total actual rate corresponding to the first output port according to the actual rates of the M data flow subsets.
  • the first output port corresponds to N data flow sets. The switching node can determine the actual rates of the data flow subsets included in each data flow set and, from those actual rates, the first total actual rate of each data flow set. The second total actual rate corresponding to the first output port can then be determined from the first total actual rates of the N data flow sets. It should be understood that the first data flow set is one of the N data flow sets.
  • the second total actual rate BPr_j = Σ CPr_Fx, summed over the data flow sets Fx of the first output port, where CPr_Fx represents the first total actual rate corresponding to the data flow set Fx. It should be understood that the total number of data flow sets included in the first output port is represented by KFj.
  • for example, the data flow set Fx corresponds to a first total actual rate, the data flow set Fx-1 also corresponds to a first total actual rate, and the data flow set Fx+1 also corresponds to a first total actual rate; the first total actual rates corresponding to all data flow sets of the first output port are summed to obtain the second total actual rate BPr_j. It should be understood that, for the process of determining the first total actual rate, reference may be made to the relevant description of the foregoing step 602, and details are not repeated here.
  • Step 703, the switching node determines the idle rate of the first output port according to the second total actual rate and the second total allocated rate.
  • the idle rate of the first output port can be determined, and the idle rate of the first output port can be allocated to the second data flow set.
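  • Steps 701 to 703 aggregate the per-set totals over the whole output port, which can be sketched as follows. The dictionary layout, the `port_totals` helper, and the reading of the port idle rate as BPa_j - BPr_j are assumptions for illustration.

```python
def port_totals(sets):
    """Return (BPa_j, BPr_j) for one output port.

    sets: {set_label: [(Rate_i_Fx, Rm_i_Fx), ...]}, an assumed layout in
          which every data flow set lists its subsets' rates.
    """
    # Sum the allocation rates over all subsets of all data flow sets.
    bpa_j = sum(rate for subs in sets.values() for rate, _ in subs)
    # Sum the actual rates over all subsets of all data flow sets.
    bpr_j = sum(rm for subs in sets.values() for _, rm in subs)
    return bpa_j, bpr_j


port = {
    "Fx": [(10.0, 8.0), (5.0, 5.0)],
    "Fy": [(20.0, 10.0)],
}
bpa, bpr = port_totals(port)
port_idle = max(bpa - bpr, 0.0)   # assumed definition of the port idle rate
```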
  • FIG. 7b is a schematic flowchart of another method for allocating idle rates for the second data flow set provided by the present application.
  • the method includes the following steps:
  • Step 711 the switching node traverses the first total allocation rate, the first total actual rate, the first rate occupying the first output port, and the feedback rate corresponding to each of the N data flow sets of the first output port, and determines each data flow set whose first total actual rate is not less than the product of the first total allocation rate and a coefficient, and whose first rate occupying the first output port is smaller than the feedback rate, as belonging to the second data flow set.
  • the feedback rate is the rate at which the downstream switching node specifies the corresponding data flow set to transmit the data flow.
  • the switching node may put each data flow set with first total actual rate CPr_Fx ≥ 0.9 (coefficient) × first total allocation rate CP_Fx and rate occupying the first output port BP_Fx < feedback rate Rf_Fx into the second data flow set OutBr_j. It should be understood that a data flow set with CPr_Fx > 0.9 × CP_Fx and BP_Fx < Rf_Fx indicates that BP_Fx is the bottleneck for that data flow set in transmitting its data flows.
  • Step 712 the switching node may determine a fifth total allocation rate corresponding to the second data flow set and a sixth total number of data flows included in the second data flow set corresponding to the first output port.
  • Step 713 the switching node may determine a second rate at which the second data flow set occupies the first output port according to the second total allocated rate, the second total actual rate, the fifth total allocated rate, the second total number and the sixth total number.
  • the second rate at which the second data flow set occupies the first output port is BP'_Fx = (second total allocation rate BPa_j - second total actual rate BPr_j + fifth total allocation rate BPrx_j) × second total number N_Fx / sixth total number Nbr_j.
  • Step 714 the switching node may update, according to the second rate, the first rate occupying the first output port for each data flow set in the second data flow set whose first rate is less than the feedback rate.
  • the switching node replaces BP_Fx with BP'_Fx for each data flow set in the second data flow set that satisfies BP_Fx < Rf_Fx.
  • FIG. 7a and FIG. 7b above can also be understood as first determining that there is an idle rate at the first output port, and then allocating the idle rate to the data flow sets of that output port whose rate is about to be insufficient (that is, the second data flow set).
  • method 2 may be performed after method 1.
  • the switching node may empty the second data stream set OutBr_j.
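  • The selection and reallocation of steps 711 to 714 can be sketched as below. The class and function names are hypothetical. Two points are assumptions rather than statements of the application: the fifth total allocation rate BPrx_j is taken as the sum of CP_Fx over the selected sets (by analogy with the input-port case), and a feedback rate that has not been received is modelled as infinity.

```python
from dataclasses import dataclass
from typing import List

COEFF = 0.9  # coefficient used in the text


@dataclass
class FlowSet:
    name: str
    cp: float   # CP_Fx: first total allocation rate
    cpr: float  # CPr_Fx: first total actual rate
    bp: float   # BP_Fx: first rate occupying the output port
    rf: float   # Rf_Fx: feedback rate; float("inf") if none received (assumption)
    n: int      # N_Fx: number of data flows in the set


def reallocate_output_port(sets: List[FlowSet], bpa_j: float, bpr_j: float) -> List[str]:
    """Update BP_Fx of the bottlenecked sets and return their names."""
    # Step 711: sets that use their allocation but are capped by the port rate.
    out_br = [s for s in sets if s.cpr >= COEFF * s.cp and s.bp < s.rf]
    if not out_br:
        return []
    # Step 712: fifth total allocation rate and sixth total number.
    bprx_j = sum(s.cp for s in out_br)  # assumption: summed over CP_Fx
    nbr_j = sum(s.n for s in out_br)
    # Step 713: BP'_Fx = (BPa_j - BPr_j + BPrx_j) * N_Fx / Nbr_j.
    pool = bpa_j - bpr_j + bprx_j
    for s in out_br:
        s.bp = pool * s.n / nbr_j  # Step 714: replace BP_Fx with BP'_Fx
    names = [s.name for s in out_br]
    out_br.clear()  # empty OutBr_j once the update is done
    return names


# Hypothetical port with three data flow sets, rates in Gbps.
flow_sets = [
    FlowSet("F1", cp=4.0, cpr=4.0, bp=4.0, rf=6.0, n=2),
    FlowSet("F2", cp=6.0, cpr=3.0, bp=6.0, rf=8.0, n=3),
    FlowSet("F3", cp=2.0, cpr=2.0, bp=2.0, rf=5.0, n=1),
]
updated = reallocate_output_port(flow_sets, bpa_j=12.0, bpr_j=9.0)
```

  • In this example F2 uses only half of its allocation, so the 3 Gbps it leaves idle is pooled with the 6 Gbps already allocated to F1 and F3 and shared in proportion to their flow counts.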
  • FIG. 8a is a schematic flow chart of another method for determining the idle rate of the data flow subset corresponding to the first input port provided in the present application.
  • the method comprises the steps of:
  • Step 801 the switching node may obtain the allocated rate and the actual rate of the first data flow subset.
  • a first data stream subset in the first data stream set corresponds to a first input port, and the first input port may correspond to multiple data stream subsets.
  • the switching node may obtain the allocated rate and the actual rate of the first input port corresponding to each data flow subset.
  • Step 802 the switching node may determine a third total allocation rate corresponding to the first input port according to the allocation rate of the first data flow subset.
  • the switching node may determine the third total allocation rate corresponding to the first input port according to the allocation rate of each data flow subset corresponding to the first input port, and determine the third total actual rate corresponding to the first input port according to the actual rate of each data flow subset.
  • the third total allocation rate represents the sum of allocation rates rate_i_Fx of all data stream subsets in the row where the first input port is located. It should be understood that the data streams of the first input port flow into MFi data stream sets in total.
  • Step 803 the switching node may determine a third total actual rate corresponding to the first input port according to the actual rate of the first data flow subset.
  • the switching node may determine the third total actual rate corresponding to the first input port according to the actual rate of each data flow subset corresponding to the first input port.
  • the third total actual rate is obtained by measuring and summing the actual rates of all data flow subsets in the row where the first input port is located.
  • Step 804 the switching node determines the idle rate of the data stream subset corresponding to the first input port according to the third total actual rate and the third total allocated rate.
  • if the switching node determines that the third total actual rate is smaller than the third total allocated rate, it determines that the data flow subset corresponding to the first input port has an idle rate.
  • if the switching node determines that the third total actual rate has been less than the third total allocated rate for more than 2 RTT durations, the data flow subset corresponding to the first input port is under-throughput, that is, an idle rate exists for the data flow subset corresponding to the first input port.
  • that is, RPr_i < RP_i for more than 2 RTT durations; further, optionally, RPr_i < 0.9 × RP_i for more than 2 RTT durations indicates that there is an idle rate in the data flow subset corresponding to the first input port.
  • the idle rate of the data stream subset corresponding to the first input port can be determined, and the idle rate of the data stream subset corresponding to the first input port can be allocated to the second data stream set.
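  • The "less than the allocated rate for more than 2 RTT durations" test of step 804 can be sketched as a simple persistence check. The function name, the sampling model and the integer microsecond timestamps are hypothetical; the sketch also assumes that the condition must hold continuously, which the text does not state explicitly.

```python
from typing import List, Optional, Tuple


def input_port_has_idle_rate(samples: List[Tuple[int, float]],
                             rp_i: float,
                             rtt_us: int,
                             coeff: float = 0.9) -> bool:
    """Return True once RPr_i < coeff * RP_i has held for more than 2 RTT.

    samples: (timestamp in microseconds, measured RPr_i), in time order.
    """
    below_since: Optional[int] = None
    for t_us, rpr_i in samples:
        if rpr_i < coeff * rp_i:
            if below_since is None:
                below_since = t_us  # start of the under-throughput interval
            if t_us - below_since > 2 * rtt_us:
                return True
        else:
            below_since = None  # assumption: any recovery resets the timer
    return False


# Hypothetical measurements: RP_i = 10 Gbps, RTT = 10 microseconds.
steady_low = [(0, 8.0), (10, 8.0), (20, 8.0), (30, 8.0)]
interrupted = [(0, 8.0), (10, 9.5), (20, 8.0), (30, 8.0)]
has_idle = input_port_has_idle_rate(steady_low, rp_i=10.0, rtt_us=10)
no_idle = input_port_has_idle_rate(interrupted, rp_i=10.0, rtt_us=10)
```

  • Requiring the condition to persist avoids treating a short dip in the measured rate as spare capacity.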
  • FIG. 8b is a schematic flowchart of another method for allocating idle rates for the second data flow set provided by the present application.
  • the method includes the following steps:
  • Step 811 the switching node traverses the allocation rate, the actual rate, the first average output rate, and the first average input rate of the data flow subsets corresponding to the first input port, and determines each data flow subset whose actual rate is not less than the product of the allocation rate and a coefficient, and whose first average output rate is greater than the first average input rate, as a data flow subset in the second data flow set.
  • the allocation rate of the data stream subset corresponding to the first input port refers to the allocation rate of the data stream subset in the row where the first input port is located.
  • the first average output rate of the data stream subset corresponding to the first input port refers to the first average output rate of the data stream subset in the row where the first input port is located.
  • the first average input rate of the data stream subset corresponding to the first input port refers to the first average input rate of the data stream subset in the row where the first input port is located. It should be noted that the data streams of the first input port flow into the MFi data stream subsets in total, that is, the first input port corresponds to the MFi data stream subsets.
  • the switching node traverses the allocation rate, the first average output rate and the first average input rate of all data flow subsets in the row where the first input port is located, and determines each data flow subset whose actual rate Rm_i_Fx is not less than the product of the coefficient and the allocation rate Rate_i_Fx, and whose first average output rate Rl_i_Fx is greater than the first average input rate Rr_i_Fx, as a data flow subset in the second data flow set.
  • the switching node may put the data flow subsets with Rm_i_Fx > 0.9 × Rate_i_Fx and Rl_i_Fx > Rr_i_Fx into the second data flow set RowR_i. It should be understood that a data flow subset with Rm_i_Fx > 0.9 × Rate_i_Fx and Rl_i_Fx > Rr_i_Fx indicates that Rr_i_Fx is the bottleneck of the transmission rate of that data flow subset.
  • Step 812 the switching node determines a sixth total allocation rate of the second data flow set and a seventh total number of data flows included in the second data flow set.
  • the sixth total allocation rate RPrx_i = Σ_Fx Rate_i_Fx.
  • for example, if the first input port corresponds to three data flow subsets satisfying the above step 811, the allocation rates Rate_i_Fx of these three data flow subsets can be summed to obtain the sixth total allocation rate RPrx_i.
  • the seventh total Krr_i is the number of all data streams in the three data stream subsets.
  • Step 813 the switching node determines the second average input rate of the second data flow set flowing in from the first input port according to the third total allocated rate, the third total actual rate, the sixth total allocated rate, the third total number, and the seventh total number.
  • the second average input rate Rr'_i_Fx = (third total allocated rate RP_i - third total actual rate RPr_i + sixth total allocated rate RPrx_i) × third total number n_i_Fx / seventh total number Krr_i.
  • Step 814 the switching node may update the first average input rate of the subset of data flows in the second data flow set whose first average output rate is greater than the first average input rate according to the second average input rate.
  • the switching node replaces Rr_i_Fx with Rr'_i_Fx for each data flow subset in the second data flow set that satisfies Rl_i_Fx > Rr_i_Fx.
  • the switching node may empty the second data stream set RowR_i.
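  • Steps 811 to 814 mirror the output-port case, applied to the row of one input port. The following is a hedged sketch with hypothetical names and sample values; it applies the formula Rr'_i_Fx = (RP_i - RPr_i + RPrx_i) × n_i_Fx / Krr_i exactly as written in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FlowSubset:
    name: str
    rate: float  # Rate_i_Fx: allocation rate
    rm: float    # Rm_i_Fx: measured actual rate
    rl: float    # Rl_i_Fx: first average output rate
    rr: float    # Rr_i_Fx: first average input rate
    n: int       # n_i_Fx: number of data flows in the subset


def reallocate_input_port(row: List[FlowSubset], rp_i: float, rpr_i: float,
                          coeff: float = 0.9) -> List[str]:
    """Update Rr_i_Fx of the bottlenecked subsets and return their names."""
    # Step 811: subsets that use their allocation and are limited on input.
    row_r = [s for s in row if s.rm >= coeff * s.rate and s.rl > s.rr]
    if not row_r:
        return []
    # Step 812: sixth total allocation rate and seventh total number.
    rprx_i = sum(s.rate for s in row_r)
    krr_i = sum(s.n for s in row_r)
    # Steps 813 and 814: compute Rr'_i_Fx and replace Rr_i_Fx with it.
    pool = rp_i - rpr_i + rprx_i
    for s in row_r:
        s.rr = pool * s.n / krr_i
    names = [s.name for s in row_r]
    row_r.clear()  # empty RowR_i once the update is done
    return names


# Hypothetical row of one input port, rates in Gbps.
row = [
    FlowSubset("A", rate=4.0, rm=4.0, rl=3.0, rr=2.0, n=2),
    FlowSubset("B", rate=6.0, rm=3.0, rl=2.0, rr=3.0, n=3),
    FlowSubset("C", rate=2.0, rm=2.0, rl=2.5, rr=2.0, n=1),
]
changed = reallocate_input_port(row, rp_i=12.0, rpr_i=9.0)
```

  • Here subset B is under-using its allocation, so subsets A and C share the pooled 9 Gbps in a 2:1 ratio according to their flow counts.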
  • FIG. 9a is a schematic diagram of a flow direction of a data flow in a switching node provided by the present application.
  • the switching node includes an input port P1, an input port P2, an input port P3, an output port P4 and an output port P5, wherein the output port P4 has a queue Q1, and the output port P5 has a queue Q1; the input port P1, the input port P2,
  • the data flow f1 flows from the input port P1 to P4Q1F1.
  • ⁇ in the above Table 2 represents the unreceived feedback rate, and the unit of the rate is Gbps.
  • the flow direction of the data flow in the switching node can be seen in FIG. 9b, and the corresponding rate allocation table can be seen in Table 3.
  • the flow direction of the data flow in the switching node can be seen in FIG. 9c, and the corresponding rate allocation table can be seen in Table 4.
  • the flow direction of the data flow in the switching node can be seen in FIG. 9d , and the corresponding rate allocation table can be seen in Table 5.
  • feedback information may be sent to the upstream switching node, where the feedback information includes the allocated rate of the subset of data streams.
  • the feedback information may include data stream f1, 0.5G, P4Q1F1, and data stream f2, 0.9G, P4Q1F1.
  • the feedback information may include data stream f1, 0.5G, P4Q1F1, and data stream f2, 0.9G, P4Q1F1.
  • the feedback information may include data streams f1, 0.5G, P4Q1F1, and data streams f2, 0.9G, P5Q1F1.
  • when the switching node determines that a new data flow flows into the first data flow set from the first input port, it needs to update the allocation rate of the data flow subset through which the first input port flows into the first data flow set.
  • that is, the allocation rate of the second data flow subset needs to be updated.
  • the switching node may send the updated allocation rate of the second data flow subset to the upstream switching node.
  • the upstream switching node updates the feedback rate of the data flow set to which the data flow belongs. It should be noted that if the switching node receives a new data flow f, which flows into the first data flow set from the first input port, the switching node can send the updated allocation rate of the second data flow subset to the upstream switching node through the first input port.
  • with reference to FIG. 3 above, if the new data flow f is received by the switching node S4 from the input port connected to the switching node S1, the switching node S4 can send the updated allocation rate of the second data flow subset to the upstream switching node S1 through the input port connected to the switching node S1.
  • when the rate allocation table of the switching node updates the rate of a data flow f, the updated allocation rate of the data flow subset in which the data flow is located, together with the data flow's own allocation rate, is fed back to the upstream switching node or service network card of the data flow f, so that the queue in the upstream switching node or network card can limit the sending rate of the data flow.
  • the updated allocation rate of the second data flow subset may be carried in the PFC-based control message.
  • the PFC control message includes feedback information
  • the feedback information includes the updated allocation rate of the second data flow subset.
  • FIG. 10 is a schematic diagram of the format of a PFC control message provided by this application.
  • the PFC control message includes a station medium access control (MAC) address (Station MAC Address), the EtherType 0x8808 (MAC control), the opcode 0x0101 (PFC), a class enable vector (Class Enable Vector), and the Time 0-7 fields, each of which is 16 bits.
  • The feedback information is carried in the Time 0-7 fields. It includes a flow ID (Flow ID), the allocation rate (Rate) (20 bits), the flow quantity (Flow Num) (12 bits), an output port ID (Output Port ID) (10 bits), an output queue ID (Output Queue ID) (10 bits), a data flow subset identifier (Flow Set ID) (10 bits), and a packet type (Packet Type) (2 bits).
  • Rate occupies 20 bits in units of Mbps, so the maximum value is about 1 Tbps.
  • Packet Type occupies 2 bits, and 00 can represent a feedback message.
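  • One possible way to pack the feedback fields into the 128 bits of the Time 0-7 fields is sketched below. The field widths other than Flow ID come from the text; everything else is an assumption made for illustration: the Flow ID width of 64 bits (chosen only so that the fields fill 8 × 16 bits), the field order, the big-endian byte order, and the function and key names.

```python
from typing import Dict

# (name, width in bits), most significant field first. Order is an assumption.
FIELDS = [
    ("flow_id", 64),    # assumption: width is not given in the text
    ("rate_mbps", 20),  # Rate, in Mbps
    ("flow_num", 12),   # Flow Num
    ("out_port", 10),   # Output Port ID
    ("out_queue", 10),  # Output Queue ID
    ("flow_set", 10),   # Flow Set ID
    ("pkt_type", 2),    # Packet Type, 0b00 = feedback message
]


def pack_feedback(values: Dict[str, int]) -> bytes:
    """Pack the feedback fields into 16 bytes (eight 16-bit Time fields)."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(16, "big")


def unpack_feedback(data: bytes) -> Dict[str, int]:
    """Inverse of pack_feedback."""
    word = int.from_bytes(data, "big")
    out: Dict[str, int] = {}
    for name, width in reversed(FIELDS):
        out[name] = word & ((1 << width) - 1)
        word >>= width
    return out


# Hypothetical feedback: flow 1 limited to 500 Mbps in set 1 of port 4, queue 1.
msg = {"flow_id": 1, "rate_mbps": 500, "flow_num": 2,
       "out_port": 4, "out_queue": 1, "flow_set": 1, "pkt_type": 0}
frame = pack_feedback(msg)
decoded = unpack_feedback(frame)
```

  • The largest value of the 20-bit Rate field is 2^20 - 1 = 1 048 575 Mbps, which is consistent with the stated maximum of about 1 Tbps.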
  • the switching node can determine which output port the new data flow should flow into, but cannot yet determine which data flow set and which data flow subset the new data flow should flow into; for that, it needs to wait for the feedback message from the downstream switching node. While waiting for the feedback message sent by the downstream switching node, the switching node may first store the new data flow in the idle queue or the reserved queue.
  • FIG. 11 is a schematic flowchart of a method for processing a new data stream received by a switching node provided in the present application.
  • The following uses a new data stream flowing to the first output port as an example.
  • Step 1101 the switching node determines that a new data flow arrives at the first output port.
  • Step 1102 the switching node determines whether there is an idle queue at the first output port; if yes, execute step 1103; if not, execute step 1104.
  • Step 1103 the switching node stores the new data flow in the idle queue.
  • Step 1104 the switching node stores the new data flow in the reserved queue.
  • Step 1105 the switching node may re-allocate the allocation rate of the reserved queue or the idle queue, and wait for the feedback information from the downstream switching node.
  • all or part of the data flows in the reserved queue may be moved into the idle queue.
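  • The queue selection in steps 1101 to 1105 can be sketched as follows. The class and method names are hypothetical, and the sketch assumes that an "idle queue" is simply a queue that currently holds no data flow; rate re-allocation and the handling of downstream feedback are omitted.

```python
from collections import deque
from typing import Deque, List, Optional


class OutputPort:
    def __init__(self, n_queues: int) -> None:
        self.queues: List[Deque[str]] = [deque() for _ in range(n_queues)]
        self.reserved: Deque[str] = deque()  # reserved queue of the port

    def idle_queue(self) -> Optional[Deque[str]]:
        """Return the first empty queue, or None if every queue is in use."""
        for q in self.queues:
            if not q:
                return q
        return None

    def on_new_flow(self, flow: str) -> str:
        """Steps 1101 to 1104: place a new data flow until feedback arrives."""
        q = self.idle_queue()
        if q is not None:
            q.append(flow)          # step 1103: idle queue available
            return "idle"
        self.reserved.append(flow)  # step 1104: fall back to the reserved queue
        return "reserved"

    def drain_reserved(self) -> int:
        """Move flows from the reserved queue into queues that became idle."""
        moved = 0
        while self.reserved:
            q = self.idle_queue()
            if q is None:
                break
            q.append(self.reserved.popleft())
            moved += 1
        return moved


port = OutputPort(n_queues=1)
first = port.on_new_flow("f1")   # the only queue is idle
second = port.on_new_flow("f2")  # no idle queue left
port.queues[0].clear()           # f1 has been assigned elsewhere
moved = port.drain_reserved()
```

  • Keeping a reserved queue means that a new data flow never has to be dropped or mixed into an existing data flow set while the downstream feedback is still pending.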
  • the communication device includes hardware structures and/or software modules corresponding to each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software, in combination with the modules and method steps described in the embodiments disclosed in the present application. Whether a certain function is executed by hardware or by computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.
  • FIG. 12 and FIG. 13 are schematic structural diagrams of possible communication devices provided in the present application. These communication devices can be used to implement the function of the switching node in the above method embodiments, and thus can also realize the beneficial effects of the above method embodiments.
  • the communication device may be a switching node as shown in Figure 1a or Figure 1b, or a switching node as shown in Figure 2, or any switching node in Figure 3 above, or a module (such as a chip) used in a switching node.
  • the communication device 1200 includes a processing module 1201 and a measurement module 1202 .
  • the communication device 1200 is configured to realize the function of the switching node in the method embodiment shown in FIG. 4, FIG. 5, FIG. 6a, FIG. 6b, FIG. 7a, FIG. 7b, FIG. 8a or FIG. 8b.
  • the processing module 1201 is used to obtain the allocation rate of the M data flow subsets in the first data flow set, where M is a positive integer;
  • the measurement module 1202 is used to measure the actual rate of the M data stream subsets;
  • the processing module 1201 is also used to determine the idle rate of the first output port according to the actual rate and the allocated rate, and to allocate the idle rate to a data flow subset in the second data flow set; wherein the first output port is used to output the data flows in the first data flow set and the second data flow set.
  • the processing module 1201 and the measurement module 1202 in this embodiment of the present application may be implemented by a processor or processor-related circuit components.
  • the communication device may further include a transceiver module 1203, and the transceiver module 1203 may be implemented by a communication interface or a communication interface-related circuit component.
  • the present application further provides a communication device 1300 .
  • the communication device 1300 may include a processor 1301 . Further, optionally, the communication device may further include a communication interface 1302 .
  • the processor 1301 and the communication interface 1302 are coupled to each other. It can be understood that the communication interface 1302 may be an interface circuit or an input and output interface.
  • the communication device 1300 may further include a memory 1303 for storing instructions executed by the processor 1301 or storing input data required by the processor 1301 to execute the instructions or storing data generated by the processor 1301 after executing the instructions.
  • the processor 1301 is used to execute the functions of the processing module 1201 and the measurement module 1202
  • the communication interface 1302 is used to execute the functions of the transceiver module 1203 .
  • the chip of the switching node implements the function of the switching node in the above-mentioned method embodiment.
  • the switching node chip receives information from other modules in the switching node (such as radio frequency modules or antennas), where the information is sent to the switching node by other switching nodes; or, the switching node chip sends information to other modules in the switching node (such as radio frequency modules or antennas), where the information is sent by the switching node to other switching nodes.
  • processor in the embodiments of the present application may be a central processing unit (central processing unit, CPU), and may also be other general processors, digital signal processors (digital signal processor, DSP), application specific integrated circuits (application specific integrated circuit, ASIC), field programmable gate array (field programmable gate array, FPGA) or other programmable logic devices, transistor logic devices, hardware components or any combination thereof.
  • a general-purpose processor can be a microprocessor, or any conventional processor.
  • the method steps in the embodiments of the present application may be implemented by means of hardware, or may be implemented by means of a processor executing software instructions.
  • Software instructions can be composed of corresponding software modules, and software modules can be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • the storage medium may also be a component of the processor.
  • the processor and storage medium can be located in the ASIC. Additionally, the ASIC may be located in the communication device. Of course, the processor and the storage medium may also exist in the communication
  • all or part of them may be implemented by software, hardware, firmware or any combination thereof.
  • When implemented using software, it may be implemented in whole or in part in the form of a computer program product.
  • a computer program product consists of one or more computer programs or instructions. When the computer programs or instructions are loaded and executed on the computer, the processes or functions of the embodiments of the present application are executed in whole or in part.
  • the computer can be a general purpose computer, special purpose computer, a network of computers, or other programmable devices.
  • the computer program or instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer program or instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired or wireless means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrating one or more available media. The available medium can be a magnetic medium, for example, a floppy disk, a hard disk, or a magnetic tape; it can also be an optical medium, for example, a digital video disc (DVD); it can also be a semiconductor medium, for example, a solid state drive (SSD).

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a method for controlling data stream transmission and a communication device, which are applicable to data center networks and the like and are used to solve the problem of large data stream transmission delay in the existing technology. The method may comprise: obtaining an allocation rate of M data stream subsets in a first data stream set, and measuring the actual rate of the M data stream subsets; determining the idle rate of a first output port according to the obtained actual rate and the allocation rate of the M data stream subsets, and allocating the idle rate to a data stream subset in a second data stream set, the first output port being used to output data streams of the first data stream set and of the second data stream set. By allocating the idle rate to the second data stream set, the transmission rate of the data streams of the second data stream set can be increased, thereby reducing the transmission delay of the data streams of the second data stream set.
PCT/CN2021/096160 2021-05-26 2021-05-26 Method for controlling data stream transmission and communication device WO2022246710A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/096160 WO2022246710A1 (fr) 2021-05-26 2021-05-26 Method for controlling data stream transmission and communication device
CN202180092829.7A CN116868554A (zh) 2021-05-26 2021-05-26 Method for controlling data stream transmission and communication apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/096160 WO2022246710A1 (fr) 2021-05-26 2021-05-26 Method for controlling data stream transmission and communication device

Publications (1)

Publication Number Publication Date
WO2022246710A1 true WO2022246710A1 (fr) 2022-12-01

Family

ID=84229335

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/096160 WO2022246710A1 (fr) 2021-05-26 2021-05-26 Method for controlling data stream transmission and communication device

Country Status (2)

Country Link
CN (1) CN116868554A (fr)
WO (1) WO2022246710A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000074322A1 (fr) * 1999-06-01 2000-12-07 Fastforward Networks, Inc. System and method for bandwidth allocation
CN106453111A (zh) * 2015-08-11 2017-02-22 中兴通讯股份有限公司 Traffic management method and apparatus based on aggregated links
CN107438029A (zh) * 2016-05-27 2017-12-05 华为技术有限公司 Method and device for forwarding data
WO2019154335A1 (fr) * 2018-02-07 2019-08-15 华为技术有限公司 Method for processing data streams and network element device

Also Published As

Publication number Publication date
CN116868554A (zh) 2023-10-10

Similar Documents

Publication Publication Date Title
US20210320820A1 (en) Fabric control protocol for large-scale multi-stage data center networks
US8248930B2 (en) Method and apparatus for a network queuing engine and congestion management gateway
JP6420354B2 (ja) Traffic class arbitration based on priority and bandwidth allocation
CN110944358B (zh) Data transmission method and device
WO2021148020A1 (fr) Service class adjustment method, apparatus, device, and storage medium
CN113676416B (zh) Method for improving network quality of service in a high-speed network card/DPU
US20200252337A1 (en) Data transmission method, device, and computer storage medium
WO2021244450A1 (fr) Communication method and apparatus
WO2021143913A1 (fr) Congestion control method, apparatus and system, and storage medium
CN112005528B (zh) Data exchange method, data exchange node, and data center network
CN109995608B (zh) Network rate calculation method and apparatus
WO2023226716A1 (fr) Packet transmission method, forwarding node, transmitting end, and storage medium
WO2021101640A1 (fr) Packet wash method and apparatus for in-time packet delivery
US20080253288A1 (en) Traffic shaping circuit, terminal device and network node
WO2022246710A1 (fr) Method for controlling data stream transmission and communication device
CN112838992A (zh) Packet scheduling method and network device
US20210281524A1 (en) Congestion Control Processing Method, Packet Forwarding Apparatus, and Packet Receiving Apparatus
CN114531399B (zh) Memory blocking balancing method and apparatus, electronic device, and storage medium
CN112714072B (zh) Method and apparatus for adjusting a sending rate
CN114501544A (zh) Data transmission method, apparatus, and storage medium
CN114095449A (zh) Flow control method, network device, and storage medium
US11870708B2 (en) Congestion control method and apparatus
TWI815606B (zh) Method and device for performing service flow management in a wireless communication system
WO2024036476A1 (fr) Packet forwarding method and apparatus
WO2023123075A1 (fr) Data exchange control method and apparatus

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 202180092829.7

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21942291

Country of ref document: EP

Kind code of ref document: A1