CN113518037A - Congestion information synchronization method and related device - Google Patents

Congestion information synchronization method and related device

Info

Publication number
CN113518037A
Authority
CN
China
Prior art keywords
value
network computing
ecn
bit
intra
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010273713.0A
Other languages
Chinese (zh)
Inventor
林钦亮
王巧灵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202010273713.0A priority Critical patent/CN113518037A/en
Priority to PCT/CN2021/083150 priority patent/WO2021203985A1/en
Publication of CN113518037A publication Critical patent/CN113518037A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/11 Identifying congestion
    • H04L47/115 Identifying congestion using a dedicated packet
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiments of this application disclose a congestion information synchronization method and a related device, applied to in-network computing networks, which fill the gap left by the lack of an applicable congestion information synchronization mechanism in such networks. As a result, the sending rates of the N working nodes tend to even out, and a working node whose communication link is not congested avoids having its transmission interrupted or even timed out because its sending rate is too high. The method includes: obtaining congestion information that indicates that a first communication link is congested, where the first communication link is a link between a first working node and an in-network computing switch, the first working node is any one of N working nodes, and N is an integer greater than 2; and sending N packets with the same sequence number to the N working nodes, where each of the N packets carries the congestion information, so that each of the N working nodes performs congestion control based on the congestion information.

Description

Congestion information synchronization method and related device
Technical Field
The embodiment of the application relates to the technical field of communication, in particular to a method for synchronizing congestion information and a related device.
Background
More and more network applications rely on large-scale computing, for example artificial intelligence, the Internet of Things, and cloud computing. A single training node cannot deliver computation at that scale; high-performance computing for these applications requires multiple training nodes cooperating in a distributed computation. An in-network computing network makes full use of the network's computing resources: it takes on part of the key computation for the distributed computing nodes and provides aggregation computation that merges multiple pieces of data into one, which reduces network bandwidth usage, speeds up network transmission, and accelerates the distributed computation.
Congestion control is an important means of improving network resource utilization and optimizing transmission quality, and how well congestion is handled directly affects system performance. The standard Remote Direct Memory Access (RDMA) protocol currently uses a congestion control algorithm based on the Explicit Congestion Notification (ECN) flag bit, and a growing number of protocols and applications based on the Transmission Control Protocol (TCP) are enabling the ECN flag bit and the corresponding congestion control method of standard TCP. ECN-based congestion control works as follows. Suppose a data stream is sent from node A to node B. When a switch in the network detects that a port is congested, it sets the forward ECN bit to 1 in every packet that passes through that port. When node B receives a packet whose forward ECN bit is 1 (indicating congestion on the route from node A to node B), it either sets the backward ECN bit to 1 in a packet it returns to node A or replies with a congestion notification packet (CNP). Node A performs congestion control when it receives the packet with the backward ECN bit set or the CNP.
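The point-to-point feedback loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Packet` class, the function names, and the rate-halving policy are hypothetical and are not taken from the application or from any protocol specification.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int
    forward_ecn: int = 0   # set by a congested switch port on the A -> B path
    backward_ecn: int = 0  # set by node B in traffic returned to node A


def switch_forward(pkt: Packet, port_congested: bool) -> Packet:
    """Mark the forward ECN bit when the port the packet crosses is congested."""
    if port_congested:
        pkt.forward_ecn = 1
    return pkt


def receiver_reply(data: Packet) -> Packet:
    """Node B echoes a forward mark back to node A as a backward mark."""
    return Packet(seq=data.seq, backward_ecn=data.forward_ecn)


def sender_next_rate(rate: float, reply: Packet) -> float:
    """Node A backs off on a backward mark (halving is an assumed policy)."""
    return rate / 2 if reply.backward_ecn else rate


marked = switch_forward(Packet(seq=7), port_congested=True)
reply = receiver_reply(marked)
new_rate = sender_next_rate(100.0, reply)
```

Only node A reacts in this loop, which is why the scheme suits unicast traffic between two endpoints.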
However, the current ECN-based congestion control method is suitable only for point-to-point unicast communication. In the many-to-many synchronous communication mode of in-network computing, congestion information is easily lost under this method, because the in-network computing switch aggregates the data packets sent by the working nodes. In addition, by providing aggregation computation, the in-network computing network changes the communication mode to many-to-many; with a conventional point-to-point congestion control method, the sending rates of the working nodes cannot be synchronized, so the working nodes cannot perform congestion control synchronously.
Disclosure of Invention
The embodiments of this application provide a congestion information synchronization method and a related device that send congestion information to N working nodes synchronously. This fills the gap left by the lack of an applicable congestion information synchronization mechanism in in-network computing networks, so that the sending rates of the N working nodes tend to even out.
In a first aspect, an embodiment of the present application provides a method for synchronizing congestion information, where the method may include:
an in-network computing switch obtains congestion information, where the congestion information indicates that a first communication link is congested, the first communication link is a link between a first working node and the in-network computing switch, the first working node is any one of N working nodes, and N is an integer greater than 2; and
the in-network computing switch sends N packets with the same sequence number to the N working nodes, where each of the N packets carries the congestion information, so that each of the N working nodes performs congestion control based on the congestion information.
In this way, the in-network computing switch sends to the N working nodes N packets that share one sequence number and all carry the congestion information. The congestion information is thereby synchronized to the N working nodes, which fills the gap left by the lack of an applicable congestion information synchronization mechanism in in-network computing networks. The N working nodes can then perform congestion control synchronously according to the congestion information carried in the packets, so that their sending rates tend to even out.
Optionally, with reference to the first aspect, in a first possible implementation manner, the acquiring, by the intra-network computing switch, congestion information may include:
the in-network computing switch obtains a first packet sent by the first working node, where the first packet carries the congestion information, the congestion information includes a first value in the explicit congestion notification (ECN) flag bit of the first packet, the first value indicates that the first communication link is congested, and the first working node is any one of the N working nodes;
correspondingly, the sending, by the in-network computing switch, of N packets with the same sequence number to the N working nodes, where each of the N packets carries the congestion information, may include:
within the congestion indication validity period of the first value, the in-network computing switch sends the N packets with the same sequence number to the N working nodes, where each of the N packets carries the congestion information.
In this way, while the congestion indication validity period of the first value has not expired, the in-network computing switch places the congestion information in N packets with the same sequence number and sends them to the N working nodes respectively. The congestion information is thereby synchronized to the N working nodes, which fills the gap left by the lack of an applicable congestion information synchronization mechanism in in-network computing networks, and the N working nodes can perform congestion control synchronously according to the congestion information carried in the packets. Bounding the first value by a validity period also addresses the problem of congestion information becoming invalid in the in-network computing network.
Optionally, with reference to the first possible implementation manner of the first aspect, in a second possible implementation manner, before the sending, by the intra-network computing switch, N packets with the same sequence number to the N working nodes, the method may further include:
the in-network computing switch modifies the value of the ECN flag bit of a second packet to the first value to obtain a third packet, where the second packet is the packet that is aggregated first within the validity period, and the sequence number of the second packet is the same as that of the third packet;
correspondingly, the sending, by the in-network computing switch, of N packets with the same sequence number to the N working nodes includes:
the in-network computing switch sends N third packets to the N working nodes, where the first value in each third packet instructs the corresponding working node to perform congestion control, and the N third packets have the same sequence number.
In this way, the in-network computing switch synchronizes the congestion information by sending N third packets with the same sequence number to the N working nodes, and the first value in each third packet instructs the corresponding working node to perform congestion control, so that the sending rates of the working nodes tend to even out.
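As a rough illustration of this implementation manner, the following Python sketch derives the third packet from the aggregated second packet and produces one copy per working node. The `Packet` layout, the constant `CONGESTED`, and the function name are assumptions made for the example; the application does not prescribe a data structure.

```python
from dataclasses import dataclass, replace
from typing import List

CONGESTED = 1  # hypothetical encoding of the "first value"


@dataclass(frozen=True)
class Packet:
    seq: int
    ecn: int
    payload: float


def build_third_packets(second: Packet, first_value: int, n: int) -> List[Packet]:
    """Copy the first value into the aggregated (second) packet and fan it out."""
    third = replace(second, ecn=first_value)  # the sequence number is unchanged
    return [third for _ in range(n)]          # one copy per working node


second = Packet(seq=42, ecn=0, payload=3.5)
thirds = build_third_packets(second, CONGESTED, n=3)
```

Because only the ECN value changes, every working node receives the same sequence number and the same congestion indication.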
Optionally, with reference to the first to second possible implementation manners of the first aspect, in a third possible implementation manner, the method may further include:
within the congestion indication validity period of the first value, the in-network computing switch receives a fourth packet sent by the first working node, where the value of the ECN flag bit in the fourth packet is the first value; and
the in-network computing switch ignores the first value in the fourth packet.
In this way, only the congestion information carried in the first packet needs to be synchronized, and the congestion information in the fourth packet is ignored, which avoids sending the same congestion information repeatedly and saves network resources.
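The validity period and the rule for ignoring later marks can be sketched as a small state holder. The class name, the use of a numeric timestamp, the 1.0 duration, and the choice to let a mark that arrives after expiry open a new period are all assumptions of this sketch rather than details stated in the application.

```python
class CongestionWindow:
    """Tracks the validity period of the first value (duration is an assumption)."""

    def __init__(self, validity: float) -> None:
        self.validity = validity
        self.started_at = None  # None means no mark is currently valid

    def active(self, now: float) -> bool:
        return self.started_at is not None and now - self.started_at < self.validity

    def on_marked_packet(self, now: float) -> bool:
        """Return True if this mark opens a new period, False if it is ignored."""
        if self.active(now):
            return False       # a later mark inside the period is ignored
        self.started_at = now  # the first mark starts the validity period
        return True


window = CongestionWindow(validity=1.0)
first = window.on_marked_packet(now=0.0)   # first packet: accepted
fourth = window.on_marked_packet(now=0.4)  # fourth packet: ignored
later = window.on_marked_packet(now=1.5)   # period expired: accepted again
```

Passing the time in as an argument keeps the sketch deterministic; a real switch would read a hardware clock or timer instead.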
Optionally, with reference to the second possible implementation manner of the first aspect, in a fourth possible implementation manner, the modifying, by the intra-network computing switch, the value on the ECN flag of the second packet to the first value may include:
when the ECN flag bit of the first packet includes a first ECN field and the ECN flag bit of the second packet includes a second ECN field, the in-network computing switch modifies the value in the second ECN field to the first value in the first ECN field; or
when the ECN flag bit of the first packet includes a first forward explicit congestion notification (FECN) bit and the ECN flag bit of the second packet includes a second FECN bit, the in-network computing switch modifies the value in the second FECN bit to the first value in the first FECN bit; or
when the ECN flag bit of the first packet includes a first backward explicit congestion notification (BECN) bit and the ECN flag bit of the second packet includes a second BECN bit, the in-network computing switch modifies the value in the second BECN bit to the first value in the first BECN bit.
In this embodiment, because the ECN flag bit takes different forms in different protocols, the in-network computing switch may modify the value of the ECN flag bit of the second packet to the first value in several ways.
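The three alternatives above differ only in which header element holds the mark. A minimal sketch, assuming a hypothetical dictionary model of the header in which the keys `ecn`, `fecn`, and `becn` stand for the ECN field, the FECN bit, and the BECN bit, could look as follows. The value 0b11 used for the ECN field is the Congestion Experienced codepoint of the two-bit IP ECN field; everything else is illustrative.

```python
from typing import Dict

# Hypothetical header model: a dict holding whichever congestion bits the
# protocol defines (an "ecn" field, or "fecn" / "becn" bits).
Header = Dict[str, int]


def copy_congestion_mark(first: Header, second: Header) -> Header:
    """Return a copy of `second` carrying the first value taken from `first`."""
    third = dict(second)
    for name in ("ecn", "fecn", "becn"):
        if name in first and name in second:
            third[name] = first[name]
    return third


ip_like = copy_congestion_mark({"ecn": 0b11}, {"ecn": 0b10, "seq": 5})
ib_like = copy_congestion_mark({"fecn": 1, "becn": 0}, {"fecn": 0, "becn": 0, "seq": 5})
```

The function leaves every other header element, including the sequence number, untouched.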
Optionally, with reference to the first aspect, in a fifth possible implementation manner, the acquiring, by the intra-network computing switch, congestion information may include:
when the state of the port between the first working node and the in-network computing switch indicates congestion, the in-network computing switch modifies the values of the explicit congestion notification (ECN) flag bits in N data packets to be broadcast to obtain the congestion information, where the N data packets to be broadcast are packets with the same sequence number for the N working nodes;
correspondingly, the sending, by the in-network computing switch, of N packets with the same sequence number to the N working nodes, where each of the N packets carries the congestion information, may include:
the in-network computing switch sends the N modified data packets to be broadcast to the N working nodes, where the values of the ECN flag bits in the N modified data packets respectively instruct the N working nodes to perform congestion control.
In this embodiment, the in-network computing switch sends the N modified data packets to be broadcast to the N working nodes, and each modified data packet carries the congestion information. The congestion information is thereby synchronized to the N working nodes, which fills the gap left by the lack of an applicable congestion information synchronization mechanism in in-network computing networks; the N working nodes can perform congestion control synchronously according to the congestion information carried in the modified data packets, so that their sending rates tend to even out.
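For the port-state variant, a sketch might treat a queue depth above a threshold as the congestion signal. The threshold value, the dictionary packet model, and the function name are hypothetical; the application only states that the port state indicates congestion, not how it is measured.

```python
from typing import Dict, List

QUEUE_THRESHOLD = 100  # hypothetical queue depth that counts as congestion


def mark_broadcast(packets: List[Dict[str, int]], queue_depths: List[int]) -> List[Dict[str, int]]:
    """Set the ECN mark in all N outgoing packets if any worker port is congested."""
    congested = any(depth > QUEUE_THRESHOLD for depth in queue_depths)
    return [{**pkt, "ecn": 1 if congested else pkt["ecn"]} for pkt in packets]


outgoing = [{"seq": 9, "ecn": 0, "dst": node} for node in range(3)]
marked = mark_broadcast(outgoing, queue_depths=[150, 10, 20])
unmarked = mark_broadcast(outgoing, queue_depths=[5, 10, 20])
```

Congestion on a single port marks the packets for all N working nodes, which is the synchronization the embodiment aims for.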
Optionally, with reference to the fifth possible implementation manner of the first aspect, in a sixth possible implementation manner, the modifying, by the intra-network computing switch, the value of the ECN flag bit in the N data packets to be broadcasted includes:
when the ECN flag bit in a data packet to be broadcast includes a third ECN field, the in-network computing switch sets the value in the third ECN field; or
when the ECN flag bit in a data packet to be broadcast includes a third FECN bit, the in-network computing switch sets the value in the third FECN bit.
In this embodiment, because the ECN flag bit takes different forms in different protocols, the in-network computing switch may modify the value of the ECN flag bit of the second packet to the first value in several ways.
In a second aspect, an embodiment of the present application provides an in-network computing switch, where the in-network computing switch may include:
an obtaining unit, configured to obtain congestion information, where the congestion information is used to indicate that a first communication link is congested, the first communication link is a link between a first working node and the intra-network computing switch, the first working node is any one of N working nodes, and N is an integer greater than 2;
and the sending unit is used for sending N messages with the same sequence number to the N working nodes, wherein the N messages with the same sequence number all carry the congestion information, so that the N working nodes respectively carry out congestion control based on the congestion information.
Optionally, with reference to the second aspect, in a first possible implementation manner, the obtaining unit may include:
a first obtaining module, configured to obtain a first packet sent by the first working node, where the first packet carries the congestion information, the congestion information includes a first value in the explicit congestion notification (ECN) flag bit of the first packet, and the first value indicates that the first communication link is congested;
correspondingly, the sending unit includes:
and the first sending module is configured to send N messages with the same sequence number to the N working nodes within the congestion indication validity period of the first value obtained by the first obtaining module, where the N messages with the same sequence number all carry the congestion information.
Optionally, with reference to the first possible implementation manner of the second aspect, in a second possible implementation manner, the intra-network computing switch may further include:
a modifying unit, configured to modify the value of the ECN flag bit of a second packet to the first value before the first sending module sends the N packets with the same sequence number to the N working nodes, so as to obtain a third packet, where the second packet is the packet that is aggregated first within the validity period, and the sequence number of the second packet is the same as that of the third packet;
correspondingly, the first sending module comprises:
and the first sending submodule is used for sending the N third messages obtained by the modification unit to the N working nodes, the first value in each third message indicates the corresponding working node to perform congestion control, and the sequence numbers in the N third messages are the same.
Optionally, with reference to the first to second possible implementation manners of the second aspect, in a third possible implementation manner, the intra-network computing switch further includes:
the obtaining unit is configured to receive a fourth packet sent by the first working node within the congestion indication validity period of the first value, where a value on the ECN flag bit in the fourth packet is the first value;
and the ignoring unit is used for ignoring the first value in the fourth message acquired by the acquiring unit.
Optionally, with reference to the second possible implementation manner of the second aspect, in a fourth possible implementation manner,
the modifying unit is configured to modify the value in the second ECN field to the first value in the first ECN field when the ECN flag bit of the first packet includes a first ECN field and the ECN flag bit of the second packet includes a second ECN field; or
the modifying unit is configured to modify the value in the second FECN bit to the first value in the first FECN bit when the ECN flag bit of the first packet includes a first forward explicit congestion notification (FECN) bit and the ECN flag bit of the second packet includes a second FECN bit; or
the modifying unit is configured to modify the value in the second BECN bit to the first value in the first BECN bit when the ECN flag bit of the first packet includes a first backward explicit congestion notification (BECN) bit and the ECN flag bit of the second packet includes a second BECN bit.
Optionally, with reference to the second aspect, in a fifth possible implementation manner, the obtaining unit may include:
a second obtaining module, configured to modify values of ECN flag bits in N data packets to be broadcast when a port state between the first working node and the intra-network computing switch shows congestion, so as to obtain congestion information, where the N data packets to be broadcast are packets with the same sequence number in the N working nodes;
correspondingly, the sending unit includes:
and a second sending module, configured to send the modified N data packets to be broadcast to the N working nodes, where values on ECN flag bits in the modified N data packets to be broadcast are respectively used to indicate the N working nodes to perform congestion control.
Optionally, with reference to the fifth possible implementation manner of the second aspect, in a sixth possible implementation manner, the second obtaining module is configured to set the value in the third ECN field when the ECN flag bit in a data packet to be broadcast includes a third ECN field; or
the second obtaining module is configured to set the value in the third FECN bit when the ECN flag bit in a data packet to be broadcast includes a third FECN bit.
In a third aspect, an embodiment of this application provides a computer device, including a processor and a memory. The memory is configured to store program instructions; when the computer device runs, the processor executes the program instructions so that the computer device performs the congestion information synchronization method according to the first aspect or any one of the possible implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, which includes instructions that, when executed on a computer, cause the computer to perform a method according to the first aspect or any one of the possible implementations of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product containing instructions that, when executed on a computer, cause the computer to perform the method according to the first aspect or any one of the possible implementations of the first aspect.
In a sixth aspect, an embodiment of this application provides a chip system that includes a processor configured to support an in-network computing switch in implementing the functions in the first aspect or any one of the possible implementation manners of the first aspect. In one possible design, the chip system further includes a memory for storing the program instructions and data necessary for the in-network computing switch. The chip system may consist of a chip, or may include a chip and other discrete devices.
As can be seen from the foregoing technical solutions, the embodiments of this application have the following advantages:
In the embodiments of this application, the congestion information can indicate that the first communication link between any one working node and the in-network computing switch is congested. When the in-network computing switch obtains the congestion information, it carries the congestion information in N packets with the same sequence number and sends those packets to the N working nodes. The congestion information is thereby synchronized to the N working nodes, and each working node, after receiving its packet, can perform congestion control synchronously according to the congestion information carried in it. This fills the gap left by the lack of an applicable congestion information synchronization mechanism in in-network computing networks, and the sending rates of the N working nodes tend to even out.
Drawings
To describe the technical solutions of the embodiments of this application more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of this application.
FIG. 1 is a system architecture diagram according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an embodiment of the congestion information synchronization method provided in the present embodiment;
FIG. 3 is a schematic diagram of aggregation computation performed by an in-network computing switch provided in the present embodiment;
FIG. 4 is a schematic diagram of another embodiment of the congestion information synchronization method provided in the present embodiment;
FIG. 5 is a schematic state diagram of an ECN flag bit of the RoCE v2 protocol or the TCP protocol proposed in the embodiment of the present application;
FIG. 6 is a schematic diagram of another embodiment of the congestion information synchronization method provided in the present embodiment;
FIG. 7 is a schematic diagram of an embodiment of an in-network computing switch provided in an embodiment of the present application;
FIG. 8 is a schematic diagram of another embodiment of an in-network computing switch provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of another embodiment of an in-network computing switch provided in an embodiment of the present application;
FIG. 10 is a schematic diagram of another embodiment of an in-network computing switch provided in an embodiment of the present application;
FIG. 11 is a schematic diagram of a hardware configuration of a communication apparatus in the embodiment of the present application.
Detailed Description
The embodiments of this application provide a congestion information synchronization method and a related device that send congestion information to N working nodes synchronously. This fills the gap left by the lack of an applicable congestion information synchronization mechanism in in-network computing networks, so that the sending rates of the N working nodes tend to even out.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Congestion control is an important means of improving network resource utilization and optimizing transmission quality, and how well congestion is handled directly affects system performance. An in-network computing network makes full use of the network's computing resources, takes on part of the key computation for the distributed computing nodes, and provides aggregation computation that merges multiple pieces of data into one, which reduces network bandwidth usage and speeds up network transmission. In-network computing networks that provide aggregation computation are therefore increasingly favored. Such a network has its own flow control characteristics: 1. The in-network computing switch aggregates the packets with the same sequence number (index value) sent by N working nodes (N is an integer greater than 2); it sends out the result only after computing over the parameters in the same-sequence-number data packets from all N working nodes, and otherwise the packets are dropped. 2. To prevent the buffer in the in-network computing switch from overflowing, the sending rates of the N working nodes must be synchronized, and those rates depend on the slowest link in the topology. The conventional ECN-based congestion control method is suitable only for point-to-point unicast communication and not for the many-to-many synchronous communication mode of an in-network computing network. That is, a point-to-point unicast congestion control method cannot ensure that when congestion reduces the sending rate of any one working node, the sending rates of the other N-1 working nodes are reduced correspondingly.
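The two flow control characteristics can be made concrete with a short sketch. Summation is used as the aggregation operator purely as an assumption, and the class and function names are hypothetical; the sketch also simply waits for missing packets instead of modeling buffer overflow or drops.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class Aggregator:
    """Sums same-sequence values from N workers (sum is an assumed operator)."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.pending: Dict[int, Dict[int, float]] = defaultdict(dict)

    def receive(self, worker: int, seq: int, value: float) -> Optional[float]:
        """Return the aggregate once all N workers have sent `seq`, else None."""
        self.pending[seq][worker] = value
        if len(self.pending[seq]) < self.n:
            return None
        return sum(self.pending.pop(seq).values())


def common_rate(link_rates: List[float]) -> float:
    """All workers are paced by the slowest link in the topology."""
    return min(link_rates)


agg = Aggregator(n=3)
partial = [agg.receive(w, seq=1, value=v) for w, v in enumerate([1.0, 2.0, 4.0])]
rate = common_rate([10.0, 2.5, 8.0])
```

A worker that sends faster than the common rate only fills the `pending` buffer, which is the overflow risk that synchronized sending rates are meant to avoid.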
Therefore, in order to solve the above problem, the method provided in the embodiment of the present application is mainly applied to an application scenario of an intra-network computing network that performs congestion control based on an ECN flag. The foregoing application scenarios of the in-network computing network include, but are not limited to, Artificial Intelligence (AI) distributed training, MapReduce model (MapReduce), or distributed database, etc. Referring to fig. 1, a system architecture diagram is provided according to an application scenario of an in-network computing network. As can be seen from fig. 1, the system may include an in-network computing switch, and N working nodes; the intra-network computing switch is mainly used for acquiring congestion information, and the congestion information can indicate that a first communication link between any one of the N working nodes and the intra-network computing switch is congested (a black dot in fig. 1), so that when the first communication link is congested, the intra-network computing switch can carry the congestion information in N messages with the same sequence number, and thus the intra-network computing switch can send the N messages with the same sequence number to the N working nodes. Therefore, the same congestion information is carried based on the N messages with the same sequence number, so that the congestion information can be synchronized to the N working nodes, and the N working nodes can synchronously perform congestion control according to the congestion information carried in the messages after receiving the N messages with the same sequence number, thereby avoiding the condition that the working nodes corresponding to the communication links without congestion cause transmission interruption and even overtime due to over-high transmission rate.
It should be understood that the first communication link that is shown in fig. 1 and is congested is a link between an intra-network computing switch and a working node 0, and is merely an illustrative description, and in a practical application, the first communication link that is congested may also be a link between a working node, such as the working node 1 or the working node 2, and the intra-network computing switch, and is not specifically limited in this embodiment of the application.
It should be understood that the foregoing intra-network computing switch has certain programmability and computing capability in addition to the conventional forwarding capability, and can compute on and modify fields in a message, for example, modify the ECN field or write a computation result into the payload of the message. In addition, the intra-network computing switch includes, but is not limited to, a Barefoot Wedge 100B switch, a Cisco N3400 switch, and the like, and is not limited in this embodiment of the present application. Each of the aforementioned N working nodes may be a server with a Graphics Processing Unit (GPU), a training node, or the like, and is not specifically limited in this embodiment of the application.
The method for synchronizing congestion information in this embodiment may be applied to the system architecture shown in fig. 1, and may also be applied to other system architectures, which are not limited herein.
To better understand the proposed solution in the embodiment of the present application, a detailed flow in the embodiment will be described below, please refer to fig. 2, which is a schematic diagram of an embodiment of a method for synchronizing congestion information provided in the embodiment, where the method may include:
201. The intra-network computing switch obtains congestion information, where the congestion information indicates that a first communication link is congested, the first communication link is a link between a first working node and the intra-network computing switch, the first working node is any one of N working nodes, and N is an integer greater than 2.
In an embodiment, there is a corresponding communication link between each working node and the intra-network computing switch. When the first communication link between any one of the N working nodes and the intra-network computing switch is congested, the intra-network computing switch obtains the congestion information and determines from it that congestion has occurred.
It can be understood that the congestion information may be obtained by the intra-network computing switch detecting that any one of the communication ports leading to the N working nodes is in a congestion state, so that the congestion information is obtained based on the communication port in the congestion state; the congestion information may also be notified to the in-network computing switch by the first operating node corresponding to the first communication link in which the congestion occurred. It should be understood that, in the embodiment of the present application, a manner of obtaining the congestion information is not particularly limited.
202. The intra-network computing switch sends N messages with the same sequence number to the N working nodes, and the N messages with the same sequence number all carry congestion information, so that the N working nodes respectively carry out congestion control based on the congestion information.
In the embodiment, the serial number can indicate the number of the packet, and the packet with the same serial number indicates that the data packets sent by the N working nodes to the intra-network computing switch respectively belong to the same batch, so that the intra-network computing switch can distinguish the data packets sent by the N working nodes based on the serial number, and correspondingly aggregate the data packets of the same batch according to the parameters in the data packets from the N different working nodes with the same serial number.
The aggregation calculation can be understood that N working nodes synchronously send data messages carrying data to be calculated to the in-network computing switch, different working nodes send different data messages and are numbered by using serial numbers, the in-network computing switch performs corresponding aggregation calculation on parameters in the data messages with the same serial numbers after receiving the N data messages, and after all data messages with a certain serial number sent by the N working nodes are calculated, the in-network computing switch sends an aggregation result to the N working nodes in a message form.
For example, please refer to fig. 3 for a schematic diagram of an intra-network computing switch performing aggregation computation. As can be seen from fig. 3, the parameters in the data packet sent by the working node 0 to the intra-network computing switch in sequence are "1", "2", and "3", respectively; the parameters in the data message sent by the working node 1 to the intra-network computing switch in sequence are respectively '4', '5' and '6'; the parameters in the data packet sent by the working node 2 to the intra-network computing switch in sequence are respectively "7", "8" and "9". The sequence numbers (assumed as index0) of the data packets corresponding to the parameters "1", "4" and "7" are the same, the sequence numbers (assumed as index1) of the data packets corresponding to the parameters "2", "5" and "8" are the same, and the sequence numbers (assumed as index2) of the data packets corresponding to the parameters "3", "6" and "9" are the same.
Then, the intra-network computing switch may sum and average the parameters in the data packets with the same sequence number sent in the working node 0, the working node 1, and the working node 2 in sequence, for example, average the parameters "1", "4", and "7" in the data packet corresponding to index0, and obtain an aggregation result of 4. Therefore, the intra-network computing switch sends the aggregation result corresponding to the sequence number to the corresponding working node after computing the parameters in the data message of the same sequence number sent by the three working nodes. It is understood that the working node 0, the working node 1, and the working node 2 in fig. 3 are only one schematic description made for the aggregation calculation, and the number of the working nodes involved in the aggregation calculation is not limited in practical application, as long as N is an integer greater than 2.
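The aggregation in fig. 3 can be illustrated with a minimal Python sketch. The class name `AggregatingSwitch` and its method `receive` are hypothetical names introduced only for illustration; the sketch assumes that aggregation means averaging, as in the example above, and that a result is released only once all N working nodes have contributed a parameter for a given sequence number.

```python
from collections import defaultdict


class AggregatingSwitch:
    """Toy model (hypothetical) of per-sequence-number averaging across N working nodes."""

    def __init__(self, num_workers):
        self.num_workers = num_workers
        # sequence number -> {worker id: parameter}
        self.pending = defaultdict(dict)

    def receive(self, worker_id, seq, value):
        """Buffer one parameter; return the average once all N nodes have reported."""
        slot = self.pending[seq]
        slot[worker_id] = value
        if len(slot) < self.num_workers:
            return None  # still waiting: nothing is sent out yet
        del self.pending[seq]  # free the buffer for this sequence number
        return sum(slot.values()) / self.num_workers


switch = AggregatingSwitch(num_workers=3)
# Parameters sent in order by working nodes 0, 1 and 2, as in fig. 3.
streams = {0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9]}
results = {}
for seq in range(3):
    for worker_id, params in streams.items():
        out = switch.receive(worker_id, seq, params[seq])
        if out is not None:
            results[seq] = out
print(results)  # {0: 4.0, 1: 5.0, 2: 6.0}
```

The `None` return models the flow control characteristic that nothing is sent until the last of the N contributions for a sequence number has arrived.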
Therefore, in order to synchronize the congestion information to the N working nodes, the intra-network computing switch needs to carry the congestion information in N packets with the same sequence number, so that the N packets with the same sequence number can be sent to the N working nodes. That is to say, the intra-network computing switch needs to carry congestion information in a packet obtained after aggregation is completed, and sends the packet which is obtained after aggregation and carries the congestion information to the N working nodes, because only the packets with the same sequence number are sent to the N working nodes, the N working nodes can obtain the congestion information in the packets of the same batch, that is, it is stated that the congestion information sent to the N working nodes is synchronous, so that the N working nodes can perform synchronous congestion control based on the received congestion information in the packets, and the situation that sending is interrupted or even overtime due to too fast sending rate of the working node corresponding to the communication link without congestion is avoided.
In addition, it should be understood that, when the N working nodes perform congestion control based on the congestion information, the N working nodes may be understood that the N working nodes synchronously reduce the rates of sending the packets corresponding to the N working nodes, and the like, and a specific description thereof will not be specifically limited in this embodiment of the present application.
Based on the above embodiment corresponding to fig. 2, it can be seen that the intra-network computing switch may obtain the congestion information in multiple ways, and implement synchronization of the congestion information in different ways, which will be described in detail below by embodiments respectively:
Case 1: the working node corresponding to the congested communication link notifies the intra-network computing switch.
Case 2: the intra-network computing switch actively detects the congestion.
Case 1: refer to fig. 4, which shows another embodiment of the method for synchronizing congestion information provided in the embodiment of the present application. As shown in fig. 4, this embodiment may include:
401. The intra-network computing switch obtains a first message sent by a first working node, where the first message carries congestion information, the congestion information includes a first value on the explicit congestion notification (ECN) flag bit of the first message, and the first value indicates that a first communication link is congested.
In an embodiment, the communication port of the intra-network computing switch itself is not congested, but a communication port of another switch connected to the intra-network computing switch is congested. Here the first working node can be understood as being indirectly connected to the intra-network computing switch through the first communication link. The first working node corresponding to the congested first communication link therefore carries the congestion information in the first message and sends it to the intra-network computing switch.
It is understood that the congestion information may include a first value on the ECN flag bit of the first message, where the ECN flag bit in the first message is a 2-bit field in the header of the message. For example: in the remote direct memory access over converged Ethernet (RDMA over Converged Ethernet, RoCE) v2 protocol or the TCP protocol, the ECN flag bit is located in a 2-bit field of the header of an Internet Protocol version 4 (IPv4) or Internet Protocol version 6 (IPv6) message; in the InfiniBand (IB) protocol, the ECN flag bit is located in a 2-bit field of the Base Transport Header (BTH), and may specifically consist of a Forward Explicit Congestion Notification (FECN) bit and a Backward Explicit Congestion Notification (BECN) bit, that is, the first bit is the FECN bit and the subsequent bit is the BECN bit, which is not limited in this embodiment.
In addition, the first value is a value on the ECN flag of the first packet, which may be used to indicate that the first communication link is congested. For example: please refer to fig. 5, which is a schematic state diagram of an ECN flag of the RoCE v2 protocol or the TCP protocol in the embodiment of the present application. As can be seen from fig. 5, when the value of the ECN field in the header of the IPv4 or IPv6 message is 11, the corresponding state is a forward congestion flag, which indicates that congestion occurs, and therefore, the first value in the ECN flag bit in the RoCE v2 protocol or the TCP protocol may be 11, which is used to indicate that a first communication link between the first working node and the intra-network computing switch is congested.
In the ECN flag bit in the InfiniBand protocol, the value of the FECN bit is 1, which indicates that congestion occurs, and the value of the BECN bit is 1, which also indicates that congestion occurs, but it is worth noting that, assuming that a data flow flows from a working node a to a working node B, and at this time, the value of the FECN bit is 1, which indicates that congestion occurs in the process of sending the data flow from a to B; if the data flow flows from the working node B to the working node A, and the value of the BECN bit is 1 at the moment, the congestion is encountered in the process that the B sends the data flow to the A. The first value on the ECN flag bit in the InfiniBand protocol may therefore comprise: the value of the first bit is 1 or the value of the second bit is 1.
Thus, in the RoCE v2 protocol or the TCP protocol, the congestion information may be expressed as ip.ecn = 11, while in the InfiniBand protocol, the congestion information may be expressed as infiniband.fecn = 1 or infiniband.becn = 1. However, it should be understood that, in addition to the foregoing definitions, other values may be used in practical applications to indicate that congestion occurs, which is not limited in this embodiment.
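As a minimal sketch of how the first value could be recognized, the Python functions below test the 2-bit ECN field. The function names are hypothetical. The sketch assumes that, for IPv4 or IPv6, the ECN field occupies the two low-order bits of the type-of-service or traffic-class byte, and that the InfiniBand 2-bit field is passed in with the FECN bit first (high-order) and the BECN bit second, following the bit order described above.

```python
ECN_CONGESTED = 0b11  # value "11" of the 2-bit IP ECN field


def ip_ecn_from_tos(tos_byte):
    """Return the 2-bit ECN field: the two low-order bits of the ToS / traffic-class byte."""
    return tos_byte & 0b11


def ip_is_congested(tos_byte):
    """True when the IP ECN field carries the first value "11"."""
    return ip_ecn_from_tos(tos_byte) == ECN_CONGESTED


def ib_is_congested(ecn_bits):
    """True when the FECN bit or the BECN bit of the 2-bit field is 1.

    Assumed layout (illustrative): high bit = FECN, low bit = BECN.
    """
    fecn = (ecn_bits >> 1) & 1
    becn = ecn_bits & 1
    return fecn == 1 or becn == 1


print(ip_is_congested(0b10111011))  # low two bits are 11
print(ib_is_congested(0b10))        # FECN set, BECN clear
```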
402. Within the congestion indication validity period of the first value, the intra-network computing switch sends N messages with the same sequence number to the N working nodes, and the N messages with the same sequence number all carry the congestion information.
Based on the aggregation computation described in fig. 3, assume that the parameters "4" and "7" in the data messages corresponding to index0 sent by working node 1 and working node 2 have reached the intra-network computing switch. The intra-network computing switch then waits for working node 0 to send the parameter "1" in its data message corresponding to index0, and performs the aggregation computation on all data messages corresponding to index0 only after receiving "1". However, if the data message from working node 0 cannot be received for a long time because the first communication link is congested, the aggregation computation cannot be performed. Because the buffer space of the intra-network computing switch is limited, and working node 1 and working node 2, which are not congested, keep sending data messages to the intra-network computing switch, the buffer of the intra-network computing switch easily overflows and is exhausted. Moreover, because the sending rate in an in-network computing network depends on the slowest link in the topology, data messages sent by the working nodes are buffered or discarded until the aggregation of the data messages with a given sequence number is completed, so the congestion information may become outdated and invalid in the meantime.
Therefore, because the congestion information is highly time-sensitive, when the intra-network computing switch detects that the first message obtained from the first working node carries congestion information, it monitors the validity period of that congestion information by means of a timer, a message timer, or the like.
When the congestion indication validity period of the first value has not expired, the intra-network computing switch carries the congestion information in N messages with the same sequence number and sends them to the N working nodes respectively, so that the congestion information is synchronized to the N working nodes, and the N working nodes can synchronously perform congestion control according to the congestion information carried in the messages after receiving them. This fills the gap that in-network computing networks lack an applicable method for synchronizing congestion information and, because the first value is still within its validity period, also solves the problem of congestion information becoming invalid in the in-network computing network.
Optionally, in some embodiments, before the intra-network computing switch sends N packets with the same sequence number to the N working nodes, the method for synchronizing congestion information may further include:
the intra-network computing switch modifies the value of the ECN flag bit of a second message to the first value to obtain a third message, where the second message is the first aggregated message obtained within the validity period, and the sequence number of the second message is the same as that of the third message;
correspondingly, the sending, by the intra-network computing switch, of N messages with the same sequence number to the N working nodes includes:
the intra-network computing switch sends N third messages to the N working nodes, where the first value in each third message instructs the corresponding working node to perform congestion control, and the sequence numbers of the N third messages are the same.
That is, after obtaining the first message, the intra-network computing switch first parses it. Regardless of whether parsing shows that the first message belongs to the first type of message or the second type of message, and because the intra-network computing switch sends out a message only after the aggregation computation is completed, the intra-network computing switch needs to wait, within the congestion indication validity period of the first value, for the first aggregated message, that is, the second message. It then modifies the value on the ECN flag bit of the second message to the first value, so that the modified second message can be used as the third message carrying the congestion information.
Therefore, the intra-network computing switch can send N third messages to the N working nodes to realize the synchronization of the congestion information, so that the first value in each third message indicates the corresponding working node to carry out congestion control, and further the sending rate of each working node tends to be smooth.
It should be noted that whether the first message belongs to the first type or the second type may be determined according to the message length of the first message. For example: when the message length of the first message falls within a first preset message length, the first message is determined to belong to the first type; when it falls within a second preset message length, the first message is determined to belong to the second type. The first preset message length is greater than the second preset message length. The first type of message can be understood as a data message that can undergo aggregation computation or be broadcast directly, while the second type of message can be understood as a message that is neither aggregated nor broadcast as a data message.
Optionally, in other embodiments, because values of the ECN flag bit in different protocols take various forms, there may be various ways for the intra-network computing switch to modify the value of the ECN flag bit of the second packet into the first value, which can be understood with reference to the following ways:
Mode 1: when the ECN flag bit of the first message includes a first ECN field and the ECN flag bit of the second message includes a second ECN field, the intra-network computing switch modifies the value in the second ECN field to the first value in the first ECN field.
That is, in the RoCE v2 protocol or the TCP protocol, if the ECN flag bit in the received first message is the first ECN field, the value in the second ECN field of the second message is modified to be the same as the first value, for example to "11". In this way the congestion information carried in the first message is copied into the second message, which provides multiple application possibilities for the subsequent synchronization of the congestion information.
It should be understood that using the value "11" in the first ECN field to indicate that congestion occurs is merely an illustrative description; in practical applications, the value of the first ECN field may also be defined as another value to indicate that congestion occurs, which is not specifically limited in this embodiment of the application.
Mode 2: when the ECN flag bit of the first message includes a first forward explicit congestion notification (FECN) bit and the ECN flag bit of the second message includes a second FECN bit, the intra-network computing switch modifies the value in the second FECN bit to the first value in the first FECN bit; or
Mode 3: when the ECN flag bit of the first message includes a first backward explicit congestion notification (BECN) bit and the ECN flag bit of the second message includes a second BECN bit, the intra-network computing switch modifies the value in the second BECN bit to the first value in the first BECN bit.
In the embodiment, for the second and third modes, because the ECN flag bit in the InfiniBand protocol is located in the 2-bit field in the BTH header, the 2-bit field is formed by FECN and BECN, and the value of the FECN bit is "1", or the value of the BECN bit is "1", which can indicate that congestion occurs.
Therefore, in the InfiniBand protocol, if the ECN flag bit in the received first message is the first FECN bit, the value in the second FECN bit of the second message is modified to be the same as the first value, for example to "1". Alternatively, if the ECN flag bit in the received first message is the first BECN bit, the value in the second BECN bit of the second message is modified to be the same as the first value, for example to "1". In this way the congestion information carried in the first message is copied into the second message, which provides multiple application possibilities for the subsequent synchronization of the congestion information.
It should be understood that, taking the value of "1" in the second FECN bit or "1" in the second BECN bit to indicate that congestion occurs is merely an illustrative description, and in practical applications, it is also possible to define the value of the second FECN bit or the value of the second BECN bit as another numerical value to indicate that congestion occurs, and the specific embodiment of the present application is not limited thereto.
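The three modes of copying the first value from the first message into the second message can be sketched as follows. The `Packet` record and the function `copy_congestion_mark` are hypothetical; each packet is assumed to expose either a 2-bit IP ECN field (RoCE v2 or TCP) or separate FECN and BECN bits (InfiniBand), with `None` meaning that the field is absent.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    seq: int
    ip_ecn: Optional[int] = None  # 2-bit ECN field (RoCE v2 / TCP), None if absent
    fecn: Optional[int] = None    # InfiniBand FECN bit, None if absent
    becn: Optional[int] = None    # InfiniBand BECN bit, None if absent


def copy_congestion_mark(first, second):
    """Return the third message: `second` carrying the first value found in `first`."""
    if first.ip_ecn == 0b11 and second.ip_ecn is not None:  # mode 1
        return replace(second, ip_ecn=first.ip_ecn)
    if first.fecn == 1 and second.fecn is not None:         # mode 2
        return replace(second, fecn=first.fecn)
    if first.becn == 1 and second.becn is not None:         # mode 3
        return replace(second, becn=first.becn)
    return second  # no congestion mark to copy


first = Packet(seq=10, ip_ecn=0b11)
second = Packet(seq=12, ip_ecn=0b10)
third = copy_congestion_mark(first, second)
print(third)  # same sequence number as the second message, ECN field now 0b11
```

The third message keeps the sequence number of the second message, as required above; only the ECN-related field changes.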
In addition, in order to save network resources, avoid repeated sending of congestion information, and the like, in other embodiments, the method for synchronizing congestion information may further include:
within the congestion indication validity period of the first value, the intra-network computing switch receives a fourth message sent by the first working node, where the value of the ECN flag bit in the fourth message is the first value;
the intra-network computing switch ignores the first value in the fourth message.
In the embodiment, if the intra-network computing switch receives a fourth message sent by the first working node within the congestion indication validity period of the first value, and the value of the ECN flag bit in the fourth message is the same as the first value in the congestion information carried in the first message, the fourth message is determined to carry congestion information as well. However, the intra-network computing switch already started a timer to monitor the validity of the first value when it received the first message. To avoid copying and sending the congestion information repeatedly, when a fourth message carrying the same congestion information is received within the validity period, the intra-network computing switch does not start another timer but ignores the first value in the fourth message. That is, the intra-network computing switch may disregard the congestion information in the fourth message and forward the fourth message to its destination according to the original forwarding rule; only the congestion information carried in the first message needs to be synchronized, which saves network resources.
In addition, if the congestion indication validity period of the first value has expired, the intra-network computing switch ignores the congestion information. That is, once the validity period has expired, the congestion information corresponding to the first value is invalid; if the intra-network computing switch still synchronized this invalid congestion information, the N working nodes could not perform synchronous congestion control correctly. The intra-network computing switch may therefore ignore the congestion information in the first message, send the first message to its destination according to the original forwarding rule, and obtain other, still valid congestion information from the working node corresponding to the congested communication link.
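The handling of the validity period in case 1 can be sketched as a small state machine. The class `CongestionSync`, its method names, and the dictionary layout of the outgoing messages are hypothetical; the length of the validity period is an assumed parameter, since no value is given in this embodiment. Time is passed in explicitly so that the sketch is deterministic.

```python
class CongestionSync:
    """Hypothetical model of the validity timer kept by the intra-network computing switch."""

    def __init__(self, validity_period):
        self.validity_period = validity_period
        self.deadline = None   # end of the current validity period, if any
        self.pending = False   # True until the first aggregated message is marked

    def on_marked_packet(self, now):
        """A message carrying the first value arrives from a working node.

        Returns True if a timer was started, False if the mark was ignored
        (a fourth message received within the validity period).
        """
        if self.deadline is not None and now < self.deadline:
            return False
        self.deadline = now + self.validity_period
        self.pending = True
        return True

    def on_aggregate_ready(self, now, num_workers, seq, payload):
        """Build the N messages with the same sequence number for the N working nodes."""
        valid = self.deadline is not None and now < self.deadline
        mark = valid and self.pending
        self.pending = False  # consumed by this aggregate, or dropped because it expired
        return [
            {"dst": w, "seq": seq, "payload": payload, "congested": mark}
            for w in range(num_workers)
        ]


sync = CongestionSync(validity_period=5.0)
sync.on_marked_packet(now=0.0)            # first message: timer started
sync.on_marked_packet(now=1.0)            # fourth message: ignored
out = sync.on_aggregate_ready(now=2.0, num_workers=3, seq=0, payload=4.0)
print([p["congested"] for p in out])      # all three copies carry the mark
```

In this sketch only the first aggregated message within the validity period is marked; an aggregate that completes after the deadline is sent without the mark, which corresponds to ignoring expired congestion information.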
Case 2: refer to fig. 6, which shows another embodiment of the method for synchronizing congestion information provided in the embodiment of the present application. As shown in fig. 6, this embodiment may include:
601. When the port state between the first working node and the intra-network computing switch indicates congestion, the intra-network computing switch modifies the value of the ECN flag bit in N data messages to be broadcast to obtain congestion information, where the N data messages to be broadcast are messages with the same sequence number destined for the N working nodes.
In the embodiment, the intra-network computing switch may monitor the cache queue information of the packet sent by the N working nodes, and when the cache queue information exceeds the cache threshold, the intra-network computing switch may determine that the port state between any one of the N working nodes and the intra-network computing switch shows congestion, that is, the port state between the first working node and the intra-network computing switch shows congestion.
It should be noted that the first working node here may be understood as a working node directly connected to the intra-network computing switch through the first communication link. The first working node is any one of the N working nodes; which one it is, is not limited in this embodiment of the application. In addition to determining whether a port is congested from the buffer queue information, the intra-network computing switch may in practical applications determine that the port is congested in other ways, such as from the port utilization, which is not specifically limited in this embodiment.
Therefore, when a port is congested, the intra-network computing switch may modify the value of the ECN flag bit in the N data messages to be broadcast, so as to obtain congestion information, for example by modifying it to "ECN = 1" or "ECN = 11"; the corresponding congestion information can then be expressed as "infiniband.ecn = 1" or "ip.ecn = 11".
In addition, it should be noted that the N data packets to be broadcasted are packets with the same sequence number obtained by the intra-network computing switch after completing aggregation computation on the data packets with the same sequence number sent by the N working nodes. The description of the sequence number can be understood with reference to step 202 in fig. 2, and will not be described herein in detail.
602. The intra-network computing switch sends the modified N data messages to be broadcast to the N working nodes, where the values on the ECN flag bits in the modified N data messages to be broadcast respectively instruct the N working nodes to perform congestion control.
In the embodiment, after the value of the ECN zone bit in the N data messages to be broadcasted is modified, the obtained N modified data messages to be broadcasted all carry congestion information, so that the intra-network computing switch can send the N modified data messages to be broadcasted to N working nodes, the congestion information can be synchronized to the N working nodes, and the blank that the intra-network computing network lacks applicable congestion information synchronization is filled; and after the N working nodes respectively receive the modified data messages to be broadcasted, the N working nodes can synchronously carry out congestion control according to congestion information carried in the modified data messages to be broadcasted, so that the sending rate of each working node in the N working nodes tends to be smooth, and the condition that the sending of the working node corresponding to the communication link without congestion is interrupted or even overtime due to the fact that the sending rate is too high is avoided.
Optionally, in other embodiments, because the value of the ECN flag bit takes different forms in different protocols, the intra-network computing switch may modify the value of the ECN flag bit in the N data messages to be broadcast in various ways, which can be understood with reference to the following modes:
Mode 1: when the ECN flag bit in the data message to be broadcast includes a third ECN field, the intra-network computing switch sets the value in the third ECN field.
That is, in the RoCE v2 protocol or the TCP protocol, if the ECN flag bit in each data message to be broadcast is the third ECN field, the value in the third ECN field of each data message to be broadcast is set, for example to "11", so that the congestion information "ip.ecn = 11" can be copied into the N data messages to be broadcast.
It should be understood that the setting of the value in the third ECN field to "11" to indicate that congestion occurs is merely an illustrative description, and in practical applications, it is also possible to set the value in the third ECN field to other values to indicate that congestion occurs, and the embodiment of the present application is not limited in particular.
Alternatively, Mode 2: when the ECN flag bit in the data message to be broadcast includes a third FECN bit, the intra-network computing switch sets the value in the third FECN bit.
In an embodiment, because the ECN flag bit in the InfiniBand protocol is located in the 2-bit field in the BTH header, and the 2-bit field consists of an FECN bit and a BECN bit, a value of "1" on the FECN bit or a value of "1" on the BECN bit can indicate that congestion occurs. However, when the intra-network computing switch actively detects congestion, modifying the value on the FECN bit alone is sufficient to indicate congestion. Therefore, in the InfiniBand protocol, if the ECN flag bit in each data message to be broadcast is the third FECN bit, the value in the third FECN bit of each data message to be broadcast is set, for example to "1", so that the congestion information "infiniband.fecn = 1" can be copied into the N data messages to be broadcast, thereby providing multiple application possibilities for the subsequent synchronization of the congestion information.
It should be understood that the value of "1" in the third FECN bit is used to indicate that congestion occurs, and is merely an illustrative description, and in practical applications, it is also possible to set the value of the third FECN bit to other values to indicate that congestion occurs, and the embodiment of the present application is not limited in particular.
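Steps 601 and 602 together with the two modes above can be sketched in Python as follows. The function `mark_broadcast_packets`, the dictionary representation of a message, and the use of a per-node queue depth compared against a single threshold are all assumptions made for illustration; this embodiment only requires that the buffer queue information exceed a buffer threshold.

```python
def mark_broadcast_packets(packets, queue_depths, threshold):
    """Set the congestion mark in all N messages to be broadcast when any port is congested.

    packets      -- the N aggregated messages with the same sequence number (dicts)
    queue_depths -- hypothetical per-working-node buffer queue depth at the switch
    threshold    -- hypothetical buffer threshold
    """
    congested = any(depth > threshold for depth in queue_depths.values())
    if not congested:
        return packets
    marked = []
    for pkt in packets:
        pkt = dict(pkt)          # leave the caller's messages untouched
        if "ip_ecn" in pkt:      # mode 1: RoCE v2 / TCP, set the ECN field to "11"
            pkt["ip_ecn"] = 0b11
        elif "fecn" in pkt:      # mode 2: InfiniBand, set only the FECN bit
            pkt["fecn"] = 1
        marked.append(pkt)
    return marked


to_broadcast = [{"dst": w, "seq": 0, "payload": 4.0, "ip_ecn": 0b10} for w in range(3)]
depths = {0: 120, 1: 8, 2: 5}    # only the port toward working node 0 is backed up
sent = mark_broadcast_packets(to_broadcast, depths, threshold=100)
print([p["ip_ecn"] for p in sent])  # [3, 3, 3]
```

Although only one port exceeds the threshold, all N copies are marked, which is what allows the N working nodes to adjust their sending rates synchronously.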
In the embodiment of the present application, the intra-network computing switch sends N messages that have the same sequence number and all carry the congestion information to the N working nodes, so that the congestion information is synchronized to the N working nodes. After receiving the messages, the N working nodes can synchronously perform congestion control according to the congestion information carried in the messages. This fills the gap that an intra-network computing network lacks an applicable congestion information synchronization mechanism, and the sending rate of each of the N working nodes tends to be smooth.
The foregoing mainly introduces the congestion information synchronization method provided in the embodiment of the present application from a method perspective. It is to be understood that, to realize the above functions, the apparatus includes hardware structures and/or software modules for performing the respective functions. Those of skill in the art will readily appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or as a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiment of the present application, functional modules of the apparatus may be divided according to the above method example, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. It should be noted that, in the embodiment of the present application, the division of the module is schematic, and is only one logic function division, and there may be another division manner in actual implementation.
Referring to fig. 7, an embodiment of an intra-network computing switch in an embodiment of the present application is described in detail below, where:
an obtaining unit 701, configured to obtain congestion information, where the congestion information is used to indicate that a first communication link is congested, where the first communication link is a link between a first working node and the intra-network computing switch, the first working node is any one of N working nodes, and N is an integer greater than 2;
a sending unit 702, configured to send N messages with the same sequence number to the N working nodes, where the N messages with the same sequence number all carry the congestion information obtained by the obtaining unit 701, so that the N working nodes perform congestion control based on the congestion information respectively.
In the above manner, the sending unit 702 sends the N messages that have the same sequence number and all carry the congestion information obtained by the obtaining unit 701 to the N working nodes, so that the congestion information is synchronized to the N working nodes. This fills the gap that an intra-network computing network lacks an applicable congestion information synchronization mechanism, and the N working nodes can synchronously perform congestion control according to the congestion information carried in the messages, so that the sending rate of each working node tends to be smooth.
For convenience of understanding, referring to fig. 8 on the basis of the embodiment described in fig. 7, in another embodiment of the intra-network computing switch in the embodiment of the present application, the obtaining unit 701 may include:
a first obtaining module 7011, configured to obtain a first message sent by a first working node, where the first message carries congestion information, the congestion information includes a first value on an explicit congestion notification (ECN) flag bit of the first message, and the first value is used to indicate that a first communication link is congested, where the first working node is any one of N working nodes;
correspondingly, the sending unit 702 may include:
a first sending module 7021, configured to send N messages with the same sequence number to the N working nodes within the congestion indication time period of the first value obtained by the first obtaining module 7011, where the N messages with the same sequence number all carry congestion information.
Through the above manner, when the congestion indication period of the first value is not overtime, the first sending module 7021 brings the congestion information obtained by the first obtaining module 7011 to the N messages with the same sequence number, and sends the N messages with the same sequence number to the N working nodes, so that the congestion information can be synchronized to the N working nodes, the gap that the applicable congestion information synchronization is lacked in the intra-network computing network is filled, the N working nodes can perform synchronous congestion control according to the congestion information carried in the messages, and further, the failure problem of the congestion information occurring in the intra-network computing network is solved based on the first value within the period of time.
Optionally, on the basis of the embodiment described in fig. 8, referring to fig. 9, in another embodiment of the intra-network computing switch in the embodiment of the present application, the intra-network computing switch may further include:
a modifying unit 703, configured to modify a value on an ECN flag of a second packet to a first value before the first sending module 7021 sends N packets with the same serial number to the N working nodes, so as to obtain a third packet, where the second packet is a packet whose aggregation is completed first in an aging period, and the serial number of the second packet is the same as the serial number of the third packet;
correspondingly, the first sending module 7021 includes:
the first sending sub-module 70211 is configured to send N third messages obtained by the modifying unit 703 to the N working nodes, where a first value in each third message indicates a corresponding working node to perform congestion control, and sequence numbers in the N third messages are the same.
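As an illustration only, the cooperation of the modifying unit 703 and the first sending sub-module 70211 can be sketched as follows. `Packet`, `build_third_messages` and the node identifiers are hypothetical names introduced for the example; the actual transmission to each working node is left out.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Packet:
    """Hypothetical model of an aggregated message."""
    seq: int        # sequence number of the aggregation result
    ecn: int        # value on the ECN flag bit(s)
    payload: bytes  # aggregated data


def build_third_messages(second: Packet, first_value: int,
                         nodes: Sequence[str]) -> List[Tuple[str, Packet]]:
    """Write first_value into the second message, then replicate it for every node.

    The third message keeps the sequence number of the second message, so all
    N copies carry the same sequence number and the same congestion information.
    """
    third = replace(second, ecn=first_value)
    return [(node, third) for node in nodes]
```

For example, `build_third_messages(Packet(42, 0, b"sum"), 0b11, ["w0", "w1", "w2"])` yields three (node, message) pairs whose messages all have sequence number 42 and ECN value "11".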
Optionally, on the basis of the embodiments described in fig. 8 and fig. 9, in another embodiment of the intra-network computing switch in the embodiment of the present application, the intra-network computing switch further includes:
an obtaining unit 701, configured to receive a fourth message sent by the first working node within a congestion indication validity period of the first value, where a value on an ECN flag bit in the fourth message is the first value;
and an ignoring unit, configured to ignore the first value in the fourth message acquired by the acquiring unit 701.
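As an illustration only, the congestion indication validity period and the ignoring unit can be sketched with a simple timer. The class name `CongestionIndication`, the parameter `validity_s` and the injectable `clock` are assumptions made for the example; the embodiment does not prescribe how the period is timed. For simplicity the sketch ignores any marked message that arrives inside the period, which covers the case of a fourth message carrying the first value.

```python
import time
from typing import Callable, Optional


class CongestionIndication:
    """Tracks the congestion indication validity period of the first value."""

    def __init__(self, validity_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.validity_s = validity_s  # hypothetical length of the validity period, in seconds
        self.clock = clock            # injectable so the logic can be tested without sleeping
        self.first_value: Optional[int] = None
        self.expires_at = 0.0

    def active(self) -> bool:
        """True while a first value is recorded and its period has not timed out."""
        return self.first_value is not None and self.clock() < self.expires_at

    def on_marked_packet(self, ecn_value: int) -> bool:
        """Return True if the mark starts a new period, False if it is ignored."""
        if self.active():
            return False  # fourth message inside the period: its value is ignored
        self.first_value = ecn_value
        self.expires_at = self.clock() + self.validity_s
        return True
```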
Optionally, on the basis of the embodiment described in fig. 9, in another embodiment of the intra-network computing switch in the embodiment of the present application, the modifying unit 703 is configured to modify a value in the second ECN field to a first value in the first ECN field when the ECN flag bit of the first message includes the first ECN field and the ECN flag bit of the second message includes the second ECN field; or,
a modifying unit 703, configured to modify a value in the second FECN bit to a first value in the first FECN bit when the ECN flag bit of the first message includes the first forward explicit congestion notification (FECN) bit and the ECN flag bit of the second message includes the second FECN bit; or,
a modifying unit 703, configured to modify a value in the second BECN bit to a first value in the first BECN bit when the ECN flag bit of the first message includes the first backward explicit congestion notification (BECN) bit and the ECN flag bit of the second message includes the second BECN bit.
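As an illustration only, the three alternatives of the modifying unit 703 can be sketched as a dispatch on the kind of congestion field that both messages carry. Messages are modelled as plain dictionaries, and the keys `"ecn"`, `"fecn"` and `"becn"` as well as the names `CnKind` and `copy_first_value` are hypothetical.

```python
from enum import Enum


class CnKind(Enum):
    ECN_FIELD = "ecn"  # 2-bit ECN field
    FECN_BIT = "fecn"  # forward explicit congestion notification bit
    BECN_BIT = "becn"  # backward explicit congestion notification bit


def copy_first_value(first: dict, second: dict) -> dict:
    """Return a copy of `second` whose congestion field equals that of `first`.

    The field is copied only when both messages carry the same kind of field,
    mirroring the three "or" branches of the modifying unit.
    """
    for kind in CnKind:
        if kind.value in first and kind.value in second:
            third = dict(second)  # the second message itself is left unmodified
            third[kind.value] = first[kind.value]
            return third
    raise ValueError("first and second message share no congestion field")
```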
Optionally, on the basis of the embodiment described in fig. 7, referring to fig. 10, in another embodiment of the intra-network computing switch in the embodiment of the present application, the obtaining unit 701 may include:
a second obtaining module 7012, configured to modify values of ECN flag bits in N data packets to be broadcast when a port state between a first working node and the intra-network computing switch shows congestion, to obtain congestion information, where the N data packets to be broadcast are packets with the same sequence number in N working nodes, and a second working node is any one of the N working nodes;
correspondingly, the sending unit 702 includes:
a second sending module 7022, configured to send the modified N data packets to be broadcasted to the N working nodes, where values on ECN flag bits in the modified N data packets to be broadcasted are respectively used to instruct the N working nodes to perform congestion control.
In an embodiment, the second sending module 7022 may send the N modified data packets to be broadcasted to the N working nodes, so that each modified data packet to be broadcasted carries the congestion information obtained by the second obtaining module 7012, so that the congestion information can be synchronized to the N working nodes, and a gap that an intra-network computing network lacks applicable congestion information synchronization is filled; and the N working nodes can synchronously carry out congestion control according to the congestion information carried in the modified data to be broadcasted, and further the sending rate of each working node in the N working nodes tends to be smooth.
Optionally, on the basis of the embodiment described in fig. 10, in another embodiment of the intra-network computing switch in the embodiment of the present application, the second obtaining module 7012 is configured to set the value in the third ECN field when the ECN flag bit in the data packet to be broadcast includes the third ECN field; or,
a second obtaining module 7012, configured to set the value in the third FECN bit when the ECN flag bit in the data packet to be broadcast includes the third FECN bit.
The intra-network computing switch in the embodiment of the present application is described above from the perspective of the modular functional entity, and the intra-network computing switch in the embodiment of the present application is described below from the perspective of hardware processing. Fig. 11 is a schematic diagram of a hardware configuration of a communication apparatus in the embodiment of the present application. As shown in fig. 11, the communication apparatus may include:
at least one processor 1101, a communication line 1107, a memory 1103, and at least one communication interface 1104.
The processor 1101 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of programs of the solutions of the present application.
The communication line 1107 may include a path that conveys information between the aforementioned components.
Communication interface 1104, which may be any device such as a transceiver, may be used to communicate with other devices or communication networks, such as an ethernet, a Radio Access Network (RAN), a Wireless Local Area Network (WLAN), etc.
The memory 1103 may be a read-only memory (ROM) or other type of static storage device that may store static information and instructions, a Random Access Memory (RAM) or other type of dynamic storage device that may store information and instructions, and may be separate and coupled to the processor via a communication line 1107. The memory may also be integral to the processor.
The memory 1103 is used for storing computer-executable instructions for executing the solutions of the present application, and their execution is controlled by the processor 1101. The processor 1101 is configured to execute the computer-executable instructions stored in the memory 1103, so as to implement the congestion information synchronization method provided by the above embodiments of the present application.
Optionally, the computer-executable instructions in the embodiments of the present application may also be referred to as application program codes, which are not specifically limited in the embodiments of the present application.
In particular implementations, for one embodiment, a communication device may include multiple processors, such as processor 1101 and processor 1102 in fig. 11. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
In a specific implementation, the communications apparatus may further include an output device 1105 and an input device 1106, as an example. The output device 1105 is in communication with the processor 1101 and may display information in a variety of ways. The input device 1106 is in communication with the processor 1101 and may receive user input in a variety of ways. For example, the input device 1106 may be a mouse, a touch screen device, or a sensing device, among others.
The communication device may be a general-purpose device or a dedicated device. In particular implementations, the communication device may be a router, an in-network computing switch, or a device having a similar structure as in fig. 11. The embodiment of the present application does not limit the type of the communication device.
The obtaining unit 701, the first obtaining module 7011, and the second obtaining module 7012 may all be implemented by an input device 1106, the sending unit 702, the first sending module 7021, the first sending submodule 70211, and the second sending module 7022 may all be implemented by an output device 1105, and the modifying unit 703 and the ignoring unit may all be implemented by the processor 1101 or the processor 1102.
The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the unit is only one logical functional division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (17)

1. A method of congestion information synchronization, comprising:
the method comprises the steps that an intra-network computing switch acquires congestion information, wherein the congestion information is used for indicating that a first communication link is congested, the first communication link is a link between a first working node and the intra-network computing switch, the first working node is any one of N working nodes, and N is an integer larger than 2;
and the intra-network computing switch sends N messages with the same sequence number to the N working nodes, wherein the N messages with the same sequence number all carry the congestion information, so that the N working nodes respectively carry out congestion control based on the congestion information.
2. The method of claim 1, wherein the intra-network computing switch obtaining congestion information comprises:
the intra-network computing switch acquires a first message sent by the first working node, wherein the first message carries the congestion information, the congestion information comprises a first value on an explicit congestion notification (ECN) flag bit of the first message, and the first value is used for indicating that the first communication link is congested;
correspondingly, the intra-network computing switch sends N packets with the same sequence number to the N working nodes, where the N packets with the same sequence number all carry the congestion information, and includes:
and in the congestion indication time period of the first value, the intra-network computing switch sends N messages with the same sequence number to the N working nodes, wherein the N messages with the same sequence number carry the congestion information.
3. The method according to claim 2, wherein before the intra-network computing switch sends N packets with the same sequence number to the N working nodes, the method further comprises:
the intra-network computing switch modifies the value on the ECN flag bit of a second message to the first value to obtain a third message, wherein the second message is the message whose aggregation is completed first within the time-validity period, and the sequence number of the second message is the same as that of the third message;
correspondingly, the sending, by the intra-network computing switch, N packets with the same sequence number to the N working nodes includes:
and the intra-network computing switch sends N third messages to the N working nodes, the first value in each third message indicates the corresponding working node to perform congestion control, and the sequence numbers in the N third messages are the same.
4. The method according to any one of claims 2-3, further comprising:
within the congestion indication time period of the first value, the intra-network computing switch receives a fourth message sent by the first working node, wherein the value on the ECN flag bit in the fourth message is the first value;
the intra-network computing switch ignores the first value in the fourth message.
5. The method of claim 3, wherein modifying the value on the ECN flag of the second packet to the first value by the in-network computing switch comprises:
when the ECN flag bit of the first packet includes a first ECN field and the ECN flag bit of the second packet includes a second ECN field, modifying, by the intra-network computing switch, the value in the second ECN field to the first value in the first ECN field; or,
when the ECN flag bit of the first packet includes a first forward explicit congestion notification (FECN) bit and the ECN flag bit of the second packet includes a second FECN bit, modifying, by the intra-network computing switch, the value in the second FECN bit to the first value in the first FECN bit; or,
when the ECN flag bit of the first packet includes a first backward explicit congestion notification (BECN) bit and the ECN flag bit of the second packet includes a second BECN bit, modifying, by the intra-network computing switch, the value in the second BECN bit to the first value in the first BECN bit.
6. The method of claim 1, wherein the intra-network computing switch obtaining congestion information comprises:
when the port state between the first working node and the intra-network computing switch shows congestion, the intra-network computing switch modifies the values on the ECN flag bits in N data messages to be broadcast to obtain the congestion information, wherein the N data messages to be broadcast are messages with the same sequence number in the N working nodes;
correspondingly, the intra-network computing switch sends N packets with the same sequence number to the N working nodes, where the N packets with the same sequence number all carry the congestion information, and includes:
and the intra-network computing switch sends the modified N data messages to be broadcasted to the N working nodes, and the values on ECN flag bits in the modified N data messages to be broadcasted are respectively used for indicating the N working nodes to carry out congestion control.
7. The method of claim 6, wherein the modifying the value of the ECN flag bit in the N data packets to be broadcast by the in-network computing switch comprises:
when the ECN flag bit in the data message to be broadcast comprises a third ECN field, the intra-network computing switch sets the value in the third ECN field; or,
when the ECN flag bit in the data message to be broadcast comprises a third FECN bit, the intra-network computing switch sets the value in the third FECN bit.
8. An intra-network computing switch, comprising:
an obtaining unit, configured to obtain congestion information, where the congestion information is used to indicate that a first communication link is congested, the first communication link is a link between a first working node and the intra-network computing switch, the first working node is any one of N working nodes, and N is an integer greater than 2;
and the sending unit is used for sending N messages with the same sequence number to the N working nodes, wherein the N messages with the same sequence number all carry the congestion information, so that the N working nodes respectively carry out congestion control based on the congestion information.
9. The intra-network computing switch of claim 8, wherein the obtaining unit comprises:
a first obtaining module, configured to obtain a first packet sent by the first working node, where the first packet carries the congestion information, the congestion information includes a first value on an explicit congestion notification (ECN) flag bit of the first packet, and the first value is used to indicate that the first communication link is congested;
correspondingly, the sending unit includes:
and the first sending module is configured to send N messages with the same sequence number to the N working nodes within the congestion indication validity period of the first value obtained by the first obtaining module, where the N messages with the same sequence number all carry the congestion information.
10. The in-network computing switch of claim 9, further comprising:
a modifying unit, configured to modify a value on an ECN flag of a second packet to the first value before the first sending module sends N packets with the same serial number to the N working nodes, so as to obtain a third packet, where the second packet is a packet that is first aggregated within the time-validity period, and the serial number of the second packet is the same as the serial number of the third packet;
correspondingly, the first sending module comprises:
and the first sending submodule is used for sending the N third messages obtained by the modification unit to the N working nodes, the first value in each third message indicates the corresponding working node to perform congestion control, and the sequence numbers in the N third messages are the same.
11. The in-network computing switch of any of claims 9-10, wherein the in-network computing switch further comprises:
the obtaining unit is configured to receive a fourth packet sent by the first working node within the congestion indication validity period of the first value, where a value on the ECN flag bit in the fourth packet is the first value;
and the ignoring unit is used for ignoring the first value in the fourth message acquired by the acquiring unit.
12. The in-network computing switch of claim 10,
the modifying unit is configured to modify a value in the second ECN field to the first value in the first ECN field when the ECN flag bit of the first packet includes a first ECN field and the ECN flag bit of the second packet includes a second ECN field; or,
the modifying unit is configured to modify a value in the second FECN bit to the first value in the first FECN bit when the ECN flag bit of the first packet includes a first forward explicit congestion notification (FECN) bit and the ECN flag bit of the second packet includes a second FECN bit; or,
the modifying unit is configured to modify a value in the second BECN bit to the first value in the first BECN bit when the ECN flag bit of the first packet includes a first backward explicit congestion notification (BECN) bit and the ECN flag bit of the second packet includes a second BECN bit.
13. The intra-network computing switch of claim 8, wherein the obtaining unit comprises:
a second obtaining module, configured to modify values of ECN flag bits in N data packets to be broadcast when a port state between the first working node and the intra-network computing switch shows congestion, so as to obtain congestion information, where the N data packets to be broadcast are packets with the same sequence number in the N working nodes, and the second working node is any one of the N working nodes;
correspondingly, the sending unit includes:
and a second sending module, configured to send the modified N data packets to be broadcast to the N working nodes, where values on ECN flag bits in the modified N data packets to be broadcast are respectively used to indicate the N working nodes to perform congestion control.
14. The intra-network computing switch of claim 13,
the second obtaining module is configured to set the value in the third ECN field when the ECN flag bit in the data packet to be broadcast includes a third ECN field; or,
the second obtaining module is configured to set the value in the third FECN bit when the ECN flag bit in the data packet to be broadcast includes a third FECN bit.
15. A computer device, comprising: a processor coupled with a memory for storing a program or instructions that, when executed by the processor, cause the computer device to perform the method of any of claims 1 to 7.
16. A computer-readable storage medium having stored thereon a computer program or instructions, which when executed cause a computer to perform the method of any one of claims 1 to 7.
17. A chip, comprising: a processor coupled with a memory, the memory to store a program or instructions that, when executed by the processor, cause an in-network computing switch to perform the method of any of claims 1 to 7.
CN202010273713.0A 2020-04-09 2020-04-09 Congestion information synchronization method and related device Pending CN113518037A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010273713.0A CN113518037A (en) 2020-04-09 2020-04-09 Congestion information synchronization method and related device
PCT/CN2021/083150 WO2021203985A1 (en) 2020-04-09 2021-03-26 Congestion information synchronizing method and related apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010273713.0A CN113518037A (en) 2020-04-09 2020-04-09 Congestion information synchronization method and related device

Publications (1)

Publication Number Publication Date
CN113518037A true CN113518037A (en) 2021-10-19

Family

ID=78022429

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010273713.0A Pending CN113518037A (en) 2020-04-09 2020-04-09 Congestion information synchronization method and related device

Country Status (2)

Country Link
CN (1) CN113518037A (en)
WO (1) WO2021203985A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115396372A (en) * 2022-10-26 2022-11-25 阿里云计算有限公司 Data stream rate control method, intelligent network card, cloud device and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117544567B (en) * 2024-01-09 2024-03-19 南京邮电大学 Memory transfer integrated RDMA data center congestion control method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101754266A (en) * 2008-12-15 2010-06-23 中国移动通信集团公司 Method, system and device for adjusting transmission speed and redirecting routing
US8605584B2 (en) * 2009-07-02 2013-12-10 Qualcomm Incorporated Transmission of control information across multiple packets
CN102196502B (en) * 2011-04-06 2013-10-16 东南大学 Congestion control method for wireless sensor network
US9419900B2 (en) * 2013-12-31 2016-08-16 International Business Machines Corporation Multi-bit indicator set according to feedback based on an equilibrium length of a queue
CN104581821B (en) * 2015-01-28 2018-03-20 湘潭大学 Jamming control method based on nodal cache length fair allocat speed

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115396372A (en) * 2022-10-26 2022-11-25 Alibaba Cloud Computing Co., Ltd. Data stream rate control method, intelligent network card, cloud device and storage medium
CN115396372B (en) * 2022-10-26 2023-02-28 Alibaba Cloud Computing Co., Ltd. Data stream rate control method, intelligent network card, cloud device and storage medium

Also Published As

Publication number Publication date
WO2021203985A1 (en) 2021-10-14

Similar Documents

Publication Publication Date Title
EP3758412B1 (en) Multichannel data transmission method, apparatus, system and computer-readable medium
US11228534B2 (en) Congestion control method, network device, and network interface controller
CN107005485B (en) Method for determining route, corresponding device and system
US9729459B2 (en) System and method for credit-based link level flow control
Zhang et al. Multipath routing and MPTCP-based data delivery over manets
EP3694160A1 (en) Date transmission method, apparatus and device
US8416684B2 (en) Time and data rate policing
US20150215224A1 (en) Positive feedback ethernet link flow control for promoting lossless ethernet
CN110521178B (en) Method, device and system for distributing data
US20200099624A1 (en) Flow control method and system, and device
WO2020233313A1 (en) Delay adjustment method and apparatus for end-to-end service, storage medium, and electronic apparatus
CN111526089B (en) Data fusion transmission and scheduling device based on variable-length granularity
CN113518037A (en) Congestion information synchronization method and related device
CN113055301A (en) Congestion control method and related equipment
US20230198897A1 (en) Method, network device, and system for controlling packet sending
US11245635B2 (en) Feedback loop for frame maximization
CN113438182A (en) Flow control system and flow control method based on credit
US10972442B1 (en) Distributed predictive packet quantity threshold reporting
CN107231316B (en) Message transmission method and device
CN115695523A (en) Data transmission control method and device, electronic equipment and storage medium
US11240164B2 (en) Method for obtaining path information of data packet and device
CN113365252A (en) Data transmission method, data transmission device, storage medium and electronic device
WO2020108020A1 (en) Congestion control processing method, message forwarding apparatus, and message receiving apparatus
Karrakchou et al. EP4: An application-aware network architecture with a customizable data plane
TWI821882B (en) Packet loss rate measuring method, communication apparatus, and communication system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination