WO2021008562A1 - Flow rate control method and device - Google Patents

Publication number
WO2021008562A1
Authority
WO
WIPO (PCT)
Prior art keywords
cnp
message
messages
data stream
sending
Prior art date
Application number
PCT/CN2020/102158
Other languages
English (en)
French (fr)
Inventor
林彦竹
刘和洋
郑合文
韩磊
严金丰
陶佩莹
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP20840799.9A (EP3993330A4)
Publication of WO2021008562A1
Priority to US17/573,909 (US20220141137A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/74 Address processing for routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/11 Identifying congestion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/11 Identifying congestion
    • H04L47/115 Identifying congestion using a dedicated packet
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/12 Avoiding congestion; Recovering from congestion
    • H04L47/125 Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/25 Flow control; Congestion control with rate being modified by the source upon detecting a change of network conditions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/26 Flow control; Congestion control using explicit feedback to the source, e.g. choke packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/38 Flow based routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/50 Queue scheduling
    • H04L47/62 Queue scheduling characterised by scheduling criteria
    • H04L47/621 Individual queue per connection or flow, e.g. per VC

Definitions

  • This application relates to communication technology, and in particular to a flow rate control method and device.
  • RDMA Remote Direct Memory Access
  • data is sent and received directly through the registered cache on the network interface cards (NICs) of the end nodes.
  • the network protocols are all deployed on the NICs and do not need to go through the network protocol stack of the host. This method significantly reduces the amount of data in the host.
  • CPU Central Processing Unit
  • RDMA over Converged Ethernet (RoCE), the RDMA protocol applied to converged Ethernet, has two versions, RoCEv1 and RoCEv2.
  • RoCEv1 is an RDMA protocol based on the Ethernet link layer; RoCEv2 is an RDMA protocol implemented over the UDP layer of the Transmission Control Protocol/Internet Protocol (TCP/IP) suite on Ethernet.
  • the communication system that implements the DCQCN-based congestion control algorithm includes a reaction point (RP), a congestion point (CP), and a notification point (NP).
  • RP reaction point
  • CP congestion point
  • NP notification point
  • ECN Explicit Congestion Notification
  • a message with an ECN mark, that is, a Congestion Encountered (CE) message
  • CE Congestion Encountered
  • the RoCEv2 protocol defines an explicit Congestion Notification Packet (CNP) message for this purpose. If a CE message arrives for a certain flow and the NP has not sent a CNP message for that flow in the past n microseconds, the NP immediately sends a CNP message. That is, even if multiple CE packets arrive for a flow within the time window (n microseconds), the NP generates at most one CNP packet for the flow every n microseconds.
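The per-flow pacing rule described above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation; the class name `NotificationPoint`, the method `on_ce_packet`, and the use of integer microsecond timestamps are assumptions made for the example.

```python
class NotificationPoint:
    """Hypothetical NP model: at most one CNP per flow per window."""

    def __init__(self, window_us):
        self.window_us = window_us      # the "n microseconds" window
        self.last_cnp_us = {}           # flow id -> time the last CNP was sent

    def on_ce_packet(self, flow_id, now_us):
        """Return True if a CNP should be sent for this CE packet."""
        last = self.last_cnp_us.get(flow_id)
        if last is None or now_us - last >= self.window_us:
            self.last_cnp_us[flow_id] = now_us
            return True
        return False                    # a CNP was already sent in this window
```

Note that the NP only reacts to CE packets: if no CE packet arrives for a flow, no CNP is generated for it, however long the flow stays congested.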
  • CNP Congestion Notification Packet
  • the RP when the RP receives a CNP message, the RP reduces the sending rate and updates the rate reduction factor.
  • the RP will also increase the sending rate according to a certain algorithm when it does not receive a CNP packet for a continuous period of time.
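As a rough sketch of the RP behaviour described above: the text only says that the RP reduces the rate and updates a rate reduction factor on each CNP, and raises the rate after a period without CNPs. The update rules below are simplified from the published DCQCN description, not taken from this application, and the names `ReactionPoint`, `g`, and `on_increase_timer` are assumptions.

```python
class ReactionPoint:
    """Hypothetical RP model with DCQCN-style updates (simplified)."""

    def __init__(self, line_rate, g=1 / 256):
        self.line_rate = line_rate
        self.rate = line_rate       # current sending rate
        self.target = line_rate     # rate to recover towards
        self.alpha = 1.0            # rate reduction factor
        self.g = g                  # assumed gain for updating alpha

    def on_cnp(self):
        """A CNP arrived: remember the old rate, cut the rate, raise alpha."""
        self.target = self.rate
        self.rate *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_increase_timer(self):
        """No CNP arrived for a whole increase interval: decay alpha, recover."""
        self.alpha *= 1 - self.g
        self.rate = min(self.line_rate, (self.rate + self.target) / 2)
```

The point relevant to this application is the second method: whenever an increase interval passes without a CNP, the rate goes up, whether or not the flow is still congested.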
  • the average bandwidth that each flow can be allocated is small.
  • there may be a minimum packet time interval for each flow (that is, the minimum interval at which the flow can obtain CNP packets), and this minimum time interval may be greater than the time interval of rate increase.
  • because the NP generates CNPs only from the CE messages it receives, it cannot make the RP reduce its rate within each rate-increase interval. As a result, a flow that is still congested undergoes rate-increase processing, rate control fails to converge, and the efficiency of packet transmission suffers.
  • the present application provides a flow rate control method and device to solve the problem that the data stream is still processed for rate increase when congestion occurs.
  • in a first aspect, this application provides a flow rate control method, including:
  • receiving N explicit congestion notification packet (CNP) messages from the first device, where the N CNP messages correspond to the first data stream, and N is a natural number; and sending M CNP messages to the second device according to the N CNP messages, where the M CNP messages correspond to the first data stream, and M is an integer greater than N.
  • the network device of this application receives N CNP messages from the first device and then sends M CNP messages to the second device. Because M is greater than N, the second device can receive a CNP message corresponding to the first data stream in every interval period, so that the sending rate of the first data stream is reduced based on the CNP message. This solves the problem that the first data stream is still subject to rate-increase processing when it is congested.
  • the CNP message is used to indicate that the first data stream is congested.
  • the CNP message includes a first destination address, a first source address, and a first destination queue pair identifier
  • the message in the first data stream includes a second destination address, a second source address, and a second destination queue pair identifier, wherein the first destination address and the second source address are the same, and the first source address and the second destination address are the same.
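The address relationship above means a CNP travels in the reverse direction of the data stream it reports on. A minimal sketch of that check follows; the `Header` tuple and the function name are hypothetical, and the queue pair identifiers are carried but not compared because the text above does not state how the two destination queue pair identifiers relate.

```python
from collections import namedtuple

# Hypothetical header view holding only the three fields named in the text.
Header = namedtuple("Header", ["dst_addr", "src_addr", "dst_qp"])


def cnp_belongs_to_flow(cnp, data_msg):
    """True if the CNP's addresses mirror those of the data-stream message."""
    return (cnp.dst_addr == data_msg.src_addr
            and cnp.src_addr == data_msg.dst_addr)
```

For example, a data message from 10.0.0.1 to 10.0.0.2 is matched by a CNP from 10.0.0.2 to 10.0.0.1.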
  • the sending M CNP messages to the second device according to the N CNP messages includes: sending the M CNP messages according to a set period.
  • This application can ensure that the second device receives a CNP message corresponding to the first data stream in every interval period, so that the sending rate of the first data stream is reduced based on the CNP message, which solves the problem that the first data stream is still subject to rate-increase processing when it is congested.
  • the sending M CNP messages to the second device according to the N CNP messages includes: monitoring whether the CNP message from the first device is received in the current period; if the CNP message from the first device is not received, creating an auxiliary CNP message, and sending the auxiliary CNP message to the second device.
  • the network device of this application receives N CNP messages from the first device and directly forwards each CNP message to the second device. If it does not receive a CNP message from the first device during an entire period, it creates a CNP message and sends it to the second device. This ensures that the second device receives a CNP message corresponding to the first data stream in every interval period, so that the sending rate of the first data stream is reduced based on the CNP message, which solves the problem that the first data stream is still subject to rate-increase processing when it is congested.
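The period-based variant can be sketched as below. This is an illustrative model under stated assumptions: `PeriodicCnpRelay`, the `send` callback, and the dictionary used to represent a CNP are all hypothetical, and an external timer is assumed to call `on_period_end` once per set period.

```python
class PeriodicCnpRelay:
    """Hypothetical network-device model for one data stream."""

    def __init__(self, send):
        self.send = send                    # delivers a CNP to the second device
        self.received_in_period = False

    def on_cnp_from_first_device(self, cnp):
        self.received_in_period = True
        self.send(cnp)                      # forward the received CNP unchanged

    def on_period_end(self, flow_id):
        # No CNP arrived during the whole period: create an auxiliary one.
        if not self.received_in_period:
            self.send({"flow": flow_id, "auxiliary": True})
        self.received_in_period = False     # start the next period clean
```

With this shape, every period ends with at least one CNP having been sent towards the second device, which is how M comes to exceed N.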
  • the sending M CNP messages to the second device according to the N CNP messages includes: starting from reception of the first CNP message among the N CNP messages, starting timing after each CNP message is received; when the timed duration exceeds the set threshold, if the next CNP message from the first device has not been received, creating an auxiliary CNP message, and sending the auxiliary CNP message to the second device.
  • starting from the first CNP message among the N CNP messages from the first device, the network device of this application directly forwards each CNP message to the second device. If, after receiving a certain CNP message from the first device, it does not receive the next CNP message from the first device within a period of time (for example, the duration exceeds the set threshold), it creates a CNP message and sends it to the second device. This prevents the second device from waiting a long time for the next CNP message corresponding to the first data stream after receiving a certain CNP message, and ensures that the second device can reduce the sending rate of the first data stream based on the CNP message in time, which solves the problem that the first data stream is still subject to rate-increase processing when it is congested.
  • the method further includes: starting timing after each auxiliary CNP message is sent to the second device, and when the timed duration exceeds the set threshold, if the next CNP message from the first device has not been received, creating another auxiliary CNP message, and sending that auxiliary CNP message to the second device.
  • the network device of this application can restart timing after sending a CNP message that it created itself. When the timed duration exceeds the set threshold and the next CNP message from the first device has still not been received, it creates another CNP message and sends it to the second device.
  • That is, the network device receives a CNP message from the first device; if it does not receive the next CNP message from the first device after a period of time (for example, the duration exceeds the set threshold), it creates a CNP message itself and sends it to the second device. If, after sending that CNP message, it still does not receive the next CNP message from the first device after a further period of time (for example, the duration exceeds the set threshold), it creates another CNP message and sends it to the second device, and so on. This prevents the second device from waiting a long time for the next CNP packet corresponding to the first data stream after receiving a certain CNP packet, and ensures that the second device can promptly reduce the sending rate of the first data stream based on the CNP packet, which solves the problem that the first data stream is still subject to rate-increase processing when congestion occurs.
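The timer-based variant, including the restart after each auxiliary CNP, might look like the sketch below. The names `GapTimerCnpRelay`, `poll`, and the microsecond timestamps are assumptions; a real device would use a hardware or event-loop timer instead of polling.

```python
class GapTimerCnpRelay:
    """Hypothetical model: fill long gaps between CNPs with auxiliary CNPs."""

    def __init__(self, threshold_us, send):
        self.threshold_us = threshold_us    # the "set threshold"
        self.send = send                    # delivers a CNP to the second device
        self.timer_start_us = None          # None until the first CNP arrives

    def on_cnp_from_first_device(self, cnp, now_us):
        self.send(cnp)                      # forward directly
        self.timer_start_us = now_us        # (re)start timing

    def poll(self, flow_id, now_us):
        """Check the timer; emit an auxiliary CNP if the gap is too long."""
        if self.timer_start_us is None:
            return                          # nothing received yet, nothing to do
        if now_us - self.timer_start_us > self.threshold_us:
            self.send({"flow": flow_id, "auxiliary": True})
            self.timer_start_us = now_us    # restart after the auxiliary CNP
```

Unlike the fixed-period variant, the timer here is anchored to the most recent CNP (received or auxiliary), so the gap seen by the second device stays close to the threshold.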
  • the failure to receive the CNP packet from the first device means that the value of the CNP mark in the flow table entry of the first data flow is the first value.
  • the network device of the present application can confirm whether the CNP message from the first device has been received based on a specific mark in the flow table entry, for example, the CNP mark, thereby improving processing efficiency.
  • the method further includes: monitoring whether the CNP message from the first device is received in the current period; if the CNP message from the first device is received, setting the value of the CNP mark in the flow table entry of the first data stream to the second value, and setting the value of the CNP mark to the first value at the end of the current period.
  • before the receiving N CNP messages from the first device, the method further includes: creating a flow table entry of the first data flow according to the link establishment message, where the flow table entry of the first data flow includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP mark.
  • the method further includes: deleting the flow table entry of the first data flow according to the link deletion message.
  • before the sending M CNP messages to the second device according to the N CNP messages, the method further includes: when there is no flow table entry of the first data flow, if the CNP message from the first device is received for the first time, creating the flow table entry of the first data flow and starting the entry timeout timer.
  • the flow table entry of the first data flow includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP mark, and the timeout identifier of the flow table entry of the first data flow, where the timeout identifier is used to indicate whether the entry timeout timer has expired.
  • the method further includes: when the timeout identifier indicates that the entry timeout timer has expired, if the CNP message from the first device was not received within the time period of the entry timeout timer, deleting the flow table entry of the first data flow.
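The CNP-triggered entry lifecycle can be sketched as follows. This is an assumption-laden illustration: `FlowTable`, the dictionary layout, and the flow key are hypothetical, and restarting the timer when a CNP did arrive within the period is a design choice of this sketch that the text above does not spell out.

```python
class FlowTable:
    """Hypothetical flow table whose entries are created by the first CNP."""

    def __init__(self, entry_timeout_us):
        self.entry_timeout_us = entry_timeout_us
        self.entries = {}   # key: (dst_addr, src_addr, src_qp) of the data flow

    def on_cnp(self, key, now_us):
        entry = self.entries.get(key)
        if entry is None:
            # First CNP for this flow: create the entry and start its timer.
            self.entries[key] = {"timer_start_us": now_us, "cnp_seen": False}
        else:
            entry["cnp_seen"] = True        # a CNP arrived within the timer period

    def expire(self, now_us):
        """Delete entries whose timer ran out without any further CNP."""
        for key in list(self.entries):
            entry = self.entries[key]
            if now_us - entry["timer_start_us"] < self.entry_timeout_us:
                continue                    # timer has not expired yet
            if entry["cnp_seen"]:
                entry["timer_start_us"] = now_us    # assumed: restart the timer
                entry["cnp_seen"] = False
            else:
                del self.entries[key]
```

This variant needs no link establishment or deletion messages, at the cost of keeping idle entries around for up to one timeout period.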
  • the sending M CNP messages to the second device according to the N CNP messages includes: when the first queue enters a congested state, sending the M CNP messages to the second device according to the N CNP messages, where the first queue is the queue that includes the first data flow among the multiple sending queues of an egress port.
  • the network device of this application actively creates an auxiliary CNP message and sends it to the second device only after the first queue enters the congested state, which avoids rate reduction caused by unnecessary CNP packets when the first queue has not entered the congested state.
  • before sending M CNP messages to the second device according to the N CNP messages, the method further includes: judging the current status of the first queue; when the first queue is not in the congested state and the depth of the first queue is greater than the first threshold, it is determined that the first queue enters the congested state; or, when the first queue is in the congested state and the depth of the first queue is less than the second threshold, it is determined that the first queue exits the congested state; wherein the first threshold is greater than the second threshold.
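Because the first threshold is greater than the second, the queue state has hysteresis: a depth between the two thresholds leaves the state unchanged. A minimal sketch, with the class name and depth units as assumptions:

```python
class QueueCongestionState:
    """Hypothetical two-threshold (hysteresis) congestion state for one queue."""

    def __init__(self, enter_depth, exit_depth):
        if enter_depth <= exit_depth:
            raise ValueError("first threshold must be greater than the second")
        self.enter_depth = enter_depth      # first threshold
        self.exit_depth = exit_depth        # second threshold
        self.congested = False

    def update(self, depth):
        """Apply the current queue depth and return the resulting state."""
        if not self.congested and depth > self.enter_depth:
            self.congested = True
        elif self.congested and depth < self.exit_depth:
            self.congested = False
        return self.congested
```

The gap between the thresholds keeps the state from flapping when the queue depth hovers around a single value.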
  • in a second aspect, this application provides a flow rate control method, including:
  • receiving the first data stream from the second device, where the first data stream includes N CE packets indicating that congestion has occurred, and N is a natural number; and sending M explicit congestion notification packet (CNP) messages to the second device according to the N CE packets, where the M CNP messages correspond to the first data stream, and M is an integer greater than N.
  • the server receives the first data stream from the second device, the first data stream includes N CE packets, and the server then sends M CNP packets to the second device. Because M is greater than N, the second device can receive a CNP message corresponding to the first data stream in every interval period, so that the sending rate of the first data stream is reduced based on the CNP message, which solves the problem that the first data stream is still subject to rate-increase processing when it is congested.
  • the CNP message is used to indicate that the first data stream is congested.
  • the CNP message includes a first destination address, a first source address, and a first destination queue pair identifier
  • the message in the first data stream includes a second destination address, a second source address, and a second destination queue pair identifier, wherein the first destination address and the second source address are the same, and the first source address and the second destination address are the same.
  • the sending M CNP messages to the second device according to the N CE messages includes: sending the M CNP messages according to a set period.
  • This application can ensure that the second device receives a CNP message corresponding to the first data stream in every interval period, so that the sending rate of the first data stream is reduced based on the CNP message, which solves the problem that the first data stream is still subject to rate-increase processing when it is congested.
  • the sending M CNP messages to the second device according to the N CE messages includes: monitoring whether the CNP message is sent in the current period; if the CNP message is not sent, creating an auxiliary CNP message, and sending the auxiliary CNP message to the second device.
  • After the server of this application receives a CE message in the first data stream, it can send a CNP message to the second device according to the CE message. If no CNP message is sent to the second device during an entire period, the server creates a CNP message and sends it to the second device. This ensures that the second device receives a CNP message corresponding to the first data stream in every interval period, so that the sending rate of the first data stream is reduced based on the CNP message, which solves the problem that the first data stream is still subject to rate-increase processing when it is congested.
  • the failure to send the CNP packet means that the value of the CNP sending flag in the flow table entry of the first data flow is the first value.
  • the network device of the present application can confirm whether a CNP message has been sent to the second device based on a specific mark in the flow table entry, for example, a CNP sending mark, thereby improving processing efficiency.
  • the method further includes: monitoring whether the CNP message is sent in the current period; if the CNP message is sent, setting the value of the CNP sending flag in the flow table entry of the first data flow to the second value, and setting the value of the CNP sending flag to the first value at the end of the current period.
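On the server side, the CNP sending flag plays the role that the per-period check plays on the network device. The sketch below is illustrative only: the encodings `FIRST_VALUE = 0` and `SECOND_VALUE = 1`, the class `ServerFlowEntry`, and the `send` callback are assumptions, and the usual pacing of CNPs per CE packet is omitted for brevity.

```python
FIRST_VALUE = 0     # assumed encoding: no CNP sent in the current period
SECOND_VALUE = 1    # assumed encoding: a CNP was sent in the current period


class ServerFlowEntry:
    """Hypothetical flow table entry holding only the CNP sending flag."""

    def __init__(self):
        self.cnp_sending_flag = FIRST_VALUE


def on_ce_packet(entry, send):
    """A CE packet arrived: send a CNP and record that in the flag."""
    send("cnp")
    entry.cnp_sending_flag = SECOND_VALUE


def on_period_end(entry, send):
    """End of period: send an auxiliary CNP if none was sent, then reset."""
    if entry.cnp_sending_flag == FIRST_VALUE:
        send("auxiliary-cnp")
    entry.cnp_sending_flag = FIRST_VALUE
```

Checking a single flag at the end of the period avoids scanning any send history, which is the processing-efficiency point made above.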
  • before the receiving the first data flow from the second device, the method further includes: creating a flow table entry of the first data flow according to a link establishment message, where the flow table entry of the first data flow includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP sending flag.
  • the method further includes: deleting the flow table entry of the first data flow according to the link deletion message.
  • before the sending M CNP messages to the second device according to the N CE messages, the method further includes: when there is no flow table entry for the first data flow, if the CNP message is sent for the first time, creating the flow table entry of the first data flow and starting the entry timeout timer.
  • the flow table entry of the first data flow includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP sending flag, and the timeout identifier of the flow table entry of the first data flow, where the timeout identifier is used to indicate whether the entry timeout timer has expired.
  • the method further includes: when the timeout identifier indicates that the entry timeout timer has expired, if the CNP message was not sent within the time period of the entry timeout timer, deleting the flow table entry of the first data flow.
  • the sending M CNP messages to the second device according to the N CE messages includes: when the first data stream is congested, sending the M CNP messages to the second device according to the N CE messages.
  • the server of this application actively creates an auxiliary CNP message and sends it to the second device only after the first data stream is congested and no CNP message has been sent to the second device, which avoids rate reduction caused by unnecessary CNP packets when the first data stream has not entered the congested state.
  • before the sending M CNP messages to the second device according to the N CE messages, the method further includes: judging the current state of the first data stream; when the first data stream is not in the congested state and the number of CE packets received in the first data stream is greater than a third threshold, it is determined that the first data stream enters the congested state; or, when the first data stream is not in the congested state and the number of CNP packets sent is greater than a fourth threshold, it is determined that the first data stream enters the congested state; or, when the first data stream is in the congested state and a non-CE packet in the first data stream is received, it is determined that the first data stream exits the congested state; or, when the first data stream is in the congested state and no packet in the corresponding data flow is received within a set time, it is determined that the first data stream exits the congested state; or, when the first data stream is in the congested state and no CNP packet is
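A few of the alternative criteria listed above can be combined into one small state machine, sketched here under stated assumptions. It covers only the CE-count entry rule, the non-CE exit rule, and the no-packet timeout exit rule; the sent-CNP-count rule and the final, truncated rule are left out. The class name, the cumulative CE counter, and resetting that counter on exit are choices of this sketch, not statements from the source.

```python
class FlowCongestionState:
    """Hypothetical server-side congestion state for one data stream."""

    def __init__(self, ce_threshold, idle_timeout_us):
        self.ce_threshold = ce_threshold        # the "third threshold"
        self.idle_timeout_us = idle_timeout_us  # the "set time" with no packets
        self.congested = False
        self.ce_count = 0
        self.last_packet_us = None

    def _exit(self):
        self.congested = False
        self.ce_count = 0                       # assumed: start counting afresh

    def on_packet(self, is_ce, now_us):
        self.last_packet_us = now_us
        if is_ce:
            self.ce_count += 1
            if not self.congested and self.ce_count > self.ce_threshold:
                self.congested = True           # enough CE packets: enter
        elif self.congested:
            self._exit()                        # a non-CE packet: exit
        return self.congested

    def on_tick(self, now_us):
        """Periodic check for the no-packet-within-a-set-time exit rule."""
        idle = self.last_packet_us is not None and \
            now_us - self.last_packet_us >= self.idle_timeout_us
        if self.congested and idle:
            self._exit()
        return self.congested
```

Auxiliary CNPs would then be generated only while `congested` is true, matching the intent of avoiding unnecessary rate reductions.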
  • the present application provides a flow rate control device, including:
  • the receiving module is configured to receive N explicit congestion notification packet (CNP) messages from the first device, where the N CNP messages correspond to the first data stream, and N is a natural number; the sending module is configured to send M CNP messages to the second device according to the N CNP messages, where the M CNP messages correspond to the first data stream, and M is an integer greater than N.
  • the CNP message is used to indicate that the first data stream is congested.
  • the CNP message includes a first destination address, a first source address, and a first destination queue pair identifier
  • the message in the first data stream includes a second destination address, a second source address, and a second destination queue pair identifier, wherein the first destination address and the second source address are the same, and the first source address and the second destination address are the same.
  • the sending module is specifically configured to send the M CNP messages according to a set period.
  • it further includes: a processing module, configured to monitor whether the CNP message from the first device is received in the current period; and the sending module is specifically configured to: if the CNP message from the first device is not received, create an auxiliary CNP message and send the auxiliary CNP message to the second device.
  • the processing module is further configured to start timing after receiving the first CNP message among the N CNP messages;
  • the sending module is further configured to, when the timed duration exceeds a set threshold, if the next CNP message from the first device has not been received, create an auxiliary CNP message and send the auxiliary CNP message to the second device.
  • the processing module is further configured to start timing after each auxiliary CNP message is sent to the second device; the sending module is further configured to, when the timed duration exceeds the set threshold, if the next CNP message from the first device has not been received, create another auxiliary CNP message and send that auxiliary CNP message to the second device.
  • the failure to receive the CNP message from the first device means that the value of the CNP mark in the flow table entry of the first data flow is the first value.
  • the processing module is further configured to monitor whether the CNP message from the first device is received in the current period; if the CNP message from the first device is received, set the value of the CNP mark in the flow table entry of the first data flow to the second value, and set the value of the CNP mark to the first value at the end of the current period.
  • the processing module is further configured to create a flow table entry of the first data flow according to the link establishment message, where the flow table entry of the first data flow includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP mark.
  • the processing module is further configured to delete the flow table entry of the first data flow according to the link deletion message.
  • the processing module is further configured to, when the flow table entry of the first data stream does not exist, if the CNP message from the first device is received for the first time, create the flow table entry of the first data flow and start the entry timeout timer.
  • the flow table entry of the first data flow includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP mark, and the timeout identifier of the flow table entry of the first data flow, where the timeout identifier is used to indicate whether the entry timeout timer has expired.
  • the processing module is further configured to, when the timeout identifier indicates that the entry timeout timer has expired, if the CNP message from the first device was not received within the time period of the entry timeout timer, delete the flow table entry of the first data flow.
  • the sending module is specifically configured to send the M CNP messages to the second device according to the N CNP messages after the first queue enters a congested state, where the first queue is the queue that includes the first data flow among the multiple sending queues of the egress port.
  • it further includes: a processing module, configured to determine the current state of the first queue; when the first queue is not in the congested state and the depth of the first queue is greater than a first threshold, determine that the first queue enters the congested state; or, when the first queue is in the congested state and the depth of the first queue is less than a second threshold, determine that the first queue exits the congested state; wherein the first threshold is greater than the second threshold.
  • a flow rate control device including:
  • the receiving module is used to receive the first data stream from the second device, where the first data stream includes N CE packets indicating that congestion has occurred, and N is a natural number; the sending module is used to send M explicit congestion notification packet (CNP) messages to the second device according to the N CE packets, where the M CNP messages correspond to the first data stream, and M is an integer greater than N.
  • the CNP message is used to indicate that the first data stream is congested.
  • the CNP message includes a first destination address, a first source address, and a first destination queue pair identifier
  • the message in the first data stream includes a second destination address, a second source address, and a second destination queue pair identifier, wherein the first destination address and the second source address are the same, and the first source address and the second destination address are the same.
  • the sending module is specifically configured to send the M CNP messages according to a set period.
  • the sending module is specifically configured to monitor whether the CNP message is sent in the current period; if the CNP message is not sent, create an auxiliary CNP message and send the auxiliary CNP message to the second device.
  • the failure to send the CNP packet means that the value of the CNP sending flag in the flow table entry of the first data flow is the first value.
  • it further includes: a processing module, configured to monitor whether the CNP message is sent in the current period; if the CNP message is sent, set the value of the CNP sending flag in the flow table entry of the first data stream to the second value, and set the value of the CNP sending flag to the first value at the end of the current period.
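  • the processing module is further configured to create a flow table entry of the first data flow according to the link establishment message, where the flow table entry of the first data flow includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP sending flag.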
  • the processing module is further configured to delete the flow table entry of the first data flow according to the link deletion message.
  • the processing module is further configured to, when the flow table entry of the first data flow does not exist, if the CNP packet is sent for the first time, create the flow table entry of the first data flow and start the entry timeout timer.
  • the flow table entry of the first data flow includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP sending flag, and the timeout identifier of the flow table entry of the first data flow, where the timeout identifier is used to indicate whether the entry timeout timer has expired.
  • the processing module is further configured to, when the timeout identifier indicates that the entry timeout timer has expired, if the CNP message was not sent within the time period of the entry timeout timer, delete the flow table entry of the first data flow.
  • the sending module is specifically configured to send the M CNP messages to the second device according to the N CE messages after the first data stream is congested.
  • it further includes: a processing module configured to determine the current state of the first data stream; when the first data stream is not in the congested state and the number of CE packets received in the first data stream is greater than a third threshold, it is determined that the first data stream enters the congested state; or, when the first data stream is not in the congested state and the number of CNP packets sent is greater than a fourth threshold, it is determined that the first data stream enters the congested state; or, when the first data stream is in the congested state and a non-CE message in the first data stream is received, it is determined that the first data stream exits the congested state; or, when the first data stream is in the congested state and no message in the corresponding data flow is received within a set time, it is determined that the first data stream exits the congested state; or, when the first data stream is in the congested state and no CNP message is sent within a set time, it is determined that the
  • this application provides a network device, including:
  • one or more processors;
  • a memory configured to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any implementation of the first aspect above.
  • this application provides a server, including:
  • one or more processors;
  • a memory configured to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any implementation of the second aspect above.
  • the present application provides a computer-readable storage medium including a computer program, which when executed on a computer, causes the computer to execute the method described in any one of the first to second aspects.
  • this application provides a computer program which, when executed by a computer, is used to execute the method described in any one of the first to second aspects.
  • Figure 1 is an example of a typical scenario where the flow rate control method of this application is applicable
  • Figure 2 is an example of a communication system implementing the existing congestion control method
  • FIG. 5 is a schematic structural diagram of an embodiment of a flow rate control device according to the present application.
  • FIG. 6 is a schematic structural diagram of a device 600 provided by this application.
  • “At least one (item)” refers to one or more, and “multiple” refers to two or more.
  • “And/or” is used to describe the association relationship of associated objects, indicating that there can be three types of relationships. For example, “A and/or B” can mean: only A, only B, or both A and B, where A and B can be singular or plural.
  • the character “/” generally indicates that the associated objects are in an “or” relationship.
  • “At least one of the following items” or similar expressions refers to any combination of these items, including a single item or any combination of plural items.
  • For example, at least one of a, b, or c can mean: a, b, c, “a and b”, “a and c”, “b and c”, or “a and b and c”, where a, b, and c can each be single or multiple.
  • FIG 1 is an example of a typical scenario where the flow rate control method of this application is applicable.
  • this scenario is a data center network, which can be applied to high-performance computing, high-performance distributed storage, big data, artificial intelligence, etc.
  • the data center network may be a CLOS network.
  • the CLOS network includes a core switch and an access switch. The server accesses the network through the access switch.
  • FIG. 1 shows only one example of a scenario to which the flow rate control method of the present application is applicable. The present application may also be applicable to RDMA scenarios, and the application scenario is not specifically limited.
  • Figure 2 is an example of a communication system that implements the existing congestion control method.
  • the communication system can include an RP, a CP, and an NP, where the RP and the NP can be servers in the scenario shown in Figure 1, and the CP can be a core switch or an access switch in the scenario shown in Figure 1.
  • the congestion control process may include: on the CP, if the depth of an egress port queue exceeds a threshold, the CP determines that the egress port queue has entered a congested state and sets the ECN mark in the messages to be sent from that egress port queue. A message carrying the ECN mark is called a Congestion Encountered (CE) message.
  • on the NP, when a CE message arrives, the NP immediately sends a CNP message to notify the RP that the data stream is congested. Alternatively, the NP sends only one CNP message after receiving multiple CE messages to notify the RP that the data stream is congested.
  • on the RP, when a CNP message is received, the RP reduces the sending rate and updates the rate reduction factor. In addition, if the RP does not receive a CNP message for a continuous period of time, it increases the sending rate according to a certain algorithm.
  • This application provides a flow rate control method to solve the above technical problems.
  • FIG. 3 is a flowchart of Embodiment 1 of the flow rate control method of this application. As shown in FIG. 3, the method in this embodiment may be executed by a network device, which may be a core switch or an access switch in the scenario shown in FIG. 1.
  • Flow rate control methods can include:
  • Step 301 Receive N CNP messages from the first device, where the N CNP messages correspond to the first data stream.
  • the first device refers to the receiving end of the first data stream, such as the NP in FIG. 2, and the second device refers to the sending end of the first data stream, such as the RP in FIG. 2.
  • the first device in this application may also refer to the sending end of the first data stream, such as the RP in FIG. 2, and the second device may also refer to the receiving end of the first data stream, such as the NP in FIG. 2.
  • the first device may also refer to other network devices downstream of the network device, and the second device may also refer to other network devices upstream of the network device. This application does not specifically limit the first device and the second device.
  • the CNP message in this application is used to indicate that the first data stream is congested.
  • the network device may receive multiple CNP messages from the first device, and the network device needs to forward these CNP messages to the second device.
  • the CNP message includes the first destination address, the first source address, and the first destination queue pair identifier, and the message in the first data stream includes the second destination address, the second source address, and the second destination queue pair identifier.
  • the CNP message corresponding to the first data stream means that the first destination address and the second source address are the same (this address is usually the address of the sending end device of the first data stream, such as the address of the RP), and the first source address and the second destination address are the same (this address is usually the address of the receiving end device of the first data stream, such as the address of the NP).
  • the correspondence also means that the first destination queue pair identifier corresponds to the second destination queue pair identifier, that is, the first destination queue pair and the second destination queue pair form a queue pair.
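  • As a minimal sketch of the correspondence rule above (an illustration, not the application's implementation), the check can be written as a predicate over the two headers. The names `Header`, `cnp_matches_flow`, and `qp_pairs` are hypothetical, and representing the established queue pairs as a set of (sender-side queue, receiver-side queue) tuples is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    """Address fields of a CNP message or a data message (hypothetical layout)."""
    dst_addr: str
    src_addr: str
    dst_qp: int  # destination queue pair identifier


def cnp_matches_flow(cnp: Header, data: Header, qp_pairs: set) -> bool:
    """Return True if the CNP message corresponds to the data stream."""
    # The CNP travels in the reverse direction of the data stream,
    # so its addresses are swapped relative to the data messages.
    addresses_match = (cnp.dst_addr == data.src_addr
                       and cnp.src_addr == data.dst_addr)
    # Assumption: qp_pairs holds (sender-side queue, receiver-side queue) tuples.
    queues_paired = (cnp.dst_qp, data.dst_qp) in qp_pairs
    return addresses_match and queues_paired
```

  • In this sketch the destination queue of the CNP message is a queue on the sending end device and the destination queue of the data message is a queue on the receiving end device, which is why the tuple is ordered that way.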
  • when the first queue enters a congested state, the network device sends M CNP messages to the second device according to the N CNP messages; the first queue is the queue, among the multiple sending queues of the egress port, that contains the first data stream.
  • the network device can actively create an auxiliary CNP message and send it to the second device when the CNP message from the first device is not received after the first queue enters the congested state. In this way, it is possible to avoid slowdown processing due to unnecessary CNP packets when the first queue does not enter the congested state.
  • the network device can select the data stream from each queue in turn and send it out according to the priority of each queue.
  • the first queue in this application may be any queue on the outgoing port of the network device, and the queue includes the first data flow.
  • the network device may determine that the first queue enters the congested state when the first queue is not in the congested state and the depth of the first queue is greater than a first threshold; or, when the first queue is in the congested state and the depth of the first queue is less than a second threshold, determine that the first queue exits the congested state; the first threshold is greater than the second threshold.
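  • The two-threshold rule above behaves like a hysteresis switch: the queue enters the congested state above the first threshold and leaves it only below the lower second threshold. A minimal sketch follows; the class name `QueueCongestionState` and the sample threshold values are hypothetical.

```python
class QueueCongestionState:
    """Tracks whether an egress queue is congested, using two thresholds."""

    def __init__(self, first_threshold: int, second_threshold: int):
        # The text requires the first threshold to exceed the second.
        if first_threshold <= second_threshold:
            raise ValueError("first threshold must be greater than second threshold")
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold
        self.congested = False

    def update(self, depth: int) -> bool:
        """Feed the current queue depth and return the resulting state."""
        if not self.congested and depth > self.first_threshold:
            self.congested = True   # enter the congested state
        elif self.congested and depth < self.second_threshold:
            self.congested = False  # exit the congested state
        return self.congested
```

  • A depth between the two thresholds leaves the state unchanged, which keeps the queue from flapping between states when its depth hovers near a single limit.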
  • Step 302 Send M CNP messages to the second device according to N CNP messages, and the M CNP messages correspond to the first data stream.
  • the above M is an integer greater than N.
  • the network device may send M CNP messages to the second device according to the set period.
  • the period set in this application can be chosen with reference to the speed-up determination period in which the sending end device (such as the RP) of the first data stream increases the sending rate of the data stream.
  • the speed-up determination period defaults to 300 us, and the set period can be within 80-95% of the speed-up determination period, that is, 240-285 us.
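  • The arithmetic behind the 240-285 us range can be checked with a short sketch. The function name `period_bounds_us` is hypothetical; integer percentages are used so that the result is exact.

```python
def period_bounds_us(speed_up_period_us: int = 300,
                     low_pct: int = 80,
                     high_pct: int = 95) -> tuple:
    """Return the (lower, upper) bounds of the set period in microseconds."""
    lower = speed_up_period_us * low_pct // 100
    upper = speed_up_period_us * high_pct // 100
    return lower, upper


print(period_bounds_us())  # (240, 285)
```

  • Keeping the set period shorter than the speed-up determination period means a CNP message can reach the sending end device before it would otherwise decide to increase its rate.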
  • the network device monitors whether it receives a CNP message from the first device in the current cycle. If it does not receive a CNP message from the first device, it creates an auxiliary CNP message and sends the auxiliary CNP message to the second device.
  • that is, if the network device receives a CNP message from the first device, it immediately forwards the CNP message to the second device; if the network device has not received a CNP message from the first device in the current period, it spontaneously creates an auxiliary CNP message at the end of the current period and sends the auxiliary CNP message to the second device.
  • this ensures that the sending end device (for example, the RP) of the first data stream receives a CNP message corresponding to the first data stream at intervals of the set period, so that the sending rate of the first data stream is reduced based on the CNP message, which avoids the problem of the first data stream being sped up while it is still congested.
  • the foregoing failure to receive a CNP message from the first device means that the value of the CNP mark in the flow table entry of the first data flow maintained by the network device is the first value. That is, the network device in this application may send a CNP message to the second device according to a specific identifier (for example, the CNP mark) in the pre-created flow table entry of the first data flow. The CNP mark in the flow table entry of the first data flow indicates whether the network device has received a CNP message from the first device in the current period.
  • a CNP mark value equal to the first value (for example, 0) means that the network device has not received a CNP message from the first device in the current period, and a CNP mark value equal to the second value (for example, 1) means that the network device has received a CNP message from the first device in the current period.
  • the network device reads the flow table entry of the first data flow. If the value of the CNP mark is the second value, the network device has received a CNP message from the first device and forwarded it to the second device; if the value of the CNP mark is the first value, the network device has not received a CNP message from the first device, and the network device spontaneously creates an auxiliary CNP message and sends it to the second device.
  • this ensures that the sending end device (for example, the RP) of the first data stream receives a CNP message corresponding to the first data stream at intervals of the set period, so that the sending rate of the first data stream is reduced based on the CNP message, which avoids the problem of the first data stream being sped up while it is still congested.
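  • The per-period behaviour described above can be sketched as a small state holder for one data flow. This is an illustration under assumptions: the class name `CnpRelay`, the `send` callback, and the string used to stand in for an auxiliary CNP message are all hypothetical, and the period timer itself is assumed to be driven by the caller.

```python
class CnpRelay:
    """Per-flow CNP handling on the network device (illustrative sketch)."""

    FIRST_VALUE = 0   # no CNP received from the first device in this period
    SECOND_VALUE = 1  # a CNP was received from the first device in this period

    def __init__(self, send):
        self.send = send  # hypothetical callback that delivers a CNP to the second device
        self.cnp_mark = self.FIRST_VALUE

    def on_cnp_from_first_device(self, cnp):
        # A real CNP is forwarded immediately and recorded in the CNP mark.
        self.cnp_mark = self.SECOND_VALUE
        self.send(cnp)

    def on_period_end(self):
        # If nothing arrived during the period, create an auxiliary CNP.
        if self.cnp_mark == self.FIRST_VALUE:
            self.send("auxiliary CNP")
        # Reset the mark so the next period starts a new monitoring round.
        self.cnp_mark = self.FIRST_VALUE
```

  • With one CNP message received in the first period and none in the second, the relay sends two CNP messages in total, which is a simple case of M being greater than N.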
  • the flow rate control process of the network device can also be as follows: regardless of whether a CNP message from the first device is received in the current period, the network device sends a CNP message to the second device at the end of the current period.
  • that is, if the network device has received a CNP message from the first device, it forwards the CNP message to the second device at the end of the current period; if the network device has not received a CNP message from the first device, it creates an auxiliary CNP message at the end of the current period and sends the auxiliary CNP message to the second device.
  • this ensures that the sending end device (for example, the RP) of the first data stream receives a CNP message corresponding to the first data stream at intervals of the set period, so that the sending rate of the first data stream is reduced based on the CNP message, which avoids the problem of the first data stream being sped up while it is still congested.
  • the network device receives N CNP messages from the first device and then sends M CNP messages to the second device. Because M is greater than N, the sending end device of the first data stream can receive a CNP message corresponding to the first data stream at intervals of the set period, so that the sending rate of the first data stream is reduced based on the CNP message, which avoids the problem of the first data stream being sped up while it is still congested.
  • the network device may use the following two methods to create the flow table entry of the first data flow:
  • the first method is that the sending end device and the receiving end device of the first data stream send link establishment messages to each other, where the link establishment message sent by the sending end device includes the address and queue identifier of the sending end device, and the link establishment message sent by the receiving end device includes the address and queue identifier of the receiving end device.
  • the queue on the sending end device and the queue on the receiving end device form a queue pair, and the data stream and the corresponding CNP messages are sent and received on the same queue pair. For example, after the data stream in queue A of the sending end device is sent to the receiving end device, the CNP message for the data stream belongs to queue B of the receiving end device, and queue A and queue B form a queue pair.
  • the network device creates a flow table entry of the first data flow according to the above link establishment messages. The flow table entry of the first data flow includes the second destination address in the first data flow (for example, the IP address of the receiving end device of the first data flow), the second source address (for example, the IP address of the sending end device of the first data flow), the second destination queue pair identifier (for example, the queue identifier in the receiving end device of the first data flow), and the source queue pair identifier of the first data flow (for example, the queue identifier in the sending end device of the first data flow).
  • the network device may also delete the flow table entry of the first data stream according to the delete link message (sent by the sender device or the receiver device of the first data stream).
  • the network device can monitor whether it receives a CNP message from the first device in the current period. If it receives a CNP message from the first device, it sets the value of the CNP mark in the flow table entry of the first data flow to the second value, and it sets the value of the CNP mark to the first value at the end of the current period.
  • the network device divides the time axis into periods. In the current period, if the network device receives a CNP message from the first device, it sets the value of the CNP mark in the flow table entry of the first data flow to the second value, and when the current period ends, the network device sets the value of the CNP mark back to the first value so as to start a new monitoring operation in the next period.
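  • The first method can be sketched as building one table entry from the two link establishment messages and removing it on a delete link message. All names here (`LinkSetupMessage`, `FlowEntry`, `create_flow_entry`, `delete_flow_entry`) and the choice of table key are hypothetical; the field list follows the entry contents described above.

```python
from dataclasses import dataclass

FIRST_VALUE = 0  # CNP mark value meaning "no CNP received in the current period"


@dataclass
class LinkSetupMessage:
    addr: str      # address of the device that sent the message
    queue_id: int  # queue identifier on that device


@dataclass
class FlowEntry:
    dst_addr: str  # second destination address (receiving end device)
    src_addr: str  # second source address (sending end device)
    dst_qp: int    # second destination queue pair identifier
    src_qp: int    # source queue pair identifier of the flow
    cnp_mark: int = FIRST_VALUE


def create_flow_entry(table: dict, from_sender: LinkSetupMessage,
                      from_receiver: LinkSetupMessage) -> tuple:
    """Create the flow table entry from both link establishment messages."""
    key = (from_sender.addr, from_receiver.addr,
           from_sender.queue_id, from_receiver.queue_id)
    table[key] = FlowEntry(dst_addr=from_receiver.addr,
                           src_addr=from_sender.addr,
                           dst_qp=from_receiver.queue_id,
                           src_qp=from_sender.queue_id)
    return key


def delete_flow_entry(table: dict, key: tuple) -> None:
    """Remove the entry when a delete link message is seen."""
    table.pop(key, None)
```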
  • the second method is that, when no flow table entry exists for the first data flow, if the network device receives a CNP message from the first device for the first time, it creates the flow table entry of the first data flow based on the CNP message and starts the entry timeout timer. In this method, the network device creates the flow table entry of the first data flow based on the CNP message received from the first device for the first time, and determines when to delete the flow table entry of the first data flow by means of the entry timeout timer.
  • the flow table entry of the first data flow includes the second destination address in the first data flow (that is, the first source address in the above CNP message, such as the IP address of the receiving end device of the first data flow), the second source address (that is, the first destination address in the above CNP message, such as the IP address of the sending end device of the first data flow), the source queue pair identifier of the first data flow (that is, the first destination queue pair identifier in the above CNP message, such as the queue identifier in the sending end device of the first data flow), the CNP mark, and the timeout identifier of the flow table entry of the first data flow; the timeout identifier is used to indicate whether the entry timeout timer has expired.
  • the network device monitors whether it receives a CNP message from the first device in the current period. If it receives a CNP message from the first device, it sets the value of the CNP mark in the flow table entry of the first data flow to the second value, and it sets the value of the CNP mark to the first value at the end of the current period.
  • the network device divides the time axis into periods. In the current period, if the network device receives a CNP message from the first device, it sets the value of the CNP mark in the flow table entry of the first data flow to the second value, and when the current period ends, the network device sets the value of the CNP mark back to the first value so as to start a new monitoring operation in the next period.
  • when the timeout identifier indicates that the entry timeout timer has expired, if no CNP message from the first device was received within the timing period of the entry timeout timer, the network device deletes the flow table entry of the first data flow.
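  • A minimal sketch of the second method follows: the entry is learned from the first CNP message and aged out by the entry timeout timer. The names `TimedFlowEntry`, `CnpLearnedFlowTable`, and `check_timers` are hypothetical. Two points are assumptions not stated in the text: the timer is restarted when a CNP message was seen during the window, and the CNP message that creates the entry does not itself count as activity within the window. The per-period reset of the CNP mark is omitted here.

```python
from dataclasses import dataclass


@dataclass
class TimedFlowEntry:
    timer_start_us: int
    cnp_mark: int = 1            # second value: a CNP was received in this period
    cnp_in_window: bool = False  # any CNP since the timer was (re)started?


class CnpLearnedFlowTable:
    def __init__(self, timeout_us: int):
        self.timeout_us = timeout_us
        self.entries = {}

    def on_cnp(self, cnp_dst: str, cnp_src: str, cnp_dst_qp: int, now_us: int):
        # Flow key: (second destination address, second source address,
        # source queue pair identifier), derived from the CNP header.
        key = (cnp_src, cnp_dst, cnp_dst_qp)
        entry = self.entries.get(key)
        if entry is None:
            # First CNP for this flow: create the entry and start the timer.
            self.entries[key] = TimedFlowEntry(timer_start_us=now_us)
        else:
            entry.cnp_mark = 1
            entry.cnp_in_window = True
        return key

    def check_timers(self, now_us: int) -> None:
        for key, entry in list(self.entries.items()):
            if now_us - entry.timer_start_us < self.timeout_us:
                continue  # timer still running
            if entry.cnp_in_window:
                # Assumption: activity during the window restarts the timer.
                entry.timer_start_us = now_us
                entry.cnp_in_window = False
            else:
                # Timer expired with no CNP in the window: delete the entry.
                del self.entries[key]
```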
  • FIG. 4 is a flowchart of Embodiment 2 of the flow rate control method of this application. As shown in FIG. 4, the method in this embodiment can be executed by the server in the scenario shown in FIG. 1, and the server is mainly responsible for the function of the NP in FIG. 2.
  • Flow rate control methods can include:
  • Step 401 Receive a first data stream from a second device, where the first data stream includes N CE packets.
  • the first device refers to the receiving end of the first data stream, such as the NP in FIG. 2, and the second device refers to the sending end of the first data stream, such as the RP in FIG. 2.
  • the second device may also refer to other network devices upstream of the server. This application does not specifically limit the first device and the second device.
  • the CE message in this application is a message generated by a network device (such as the core switch or access switch in the scenario shown in Figure 1) by setting the ECN mark in a message of the first data flow, and is used to indicate that the first data flow is congested.
  • Step 402 Send M CNP messages to the second device according to N CE messages, and the M CNP messages correspond to the first data stream.
  • the above M is an integer greater than N.
  • the server may send M CNP messages to the second device according to the set period.
  • the period set in this application can be chosen with reference to the speed-up determination period in which the sending end device (such as the RP) of the first data stream increases the sending rate of the data stream.
  • the speed-up determination period defaults to 300 us, and the set period can be within 80-95% of the speed-up determination period, that is, 240-285 us.
  • the CNP message includes the first destination address, the first source address, and the first destination queue pair identifier, and the message in the first data stream includes the second destination address, the second source address, and the second destination queue pair identifier.
  • the CNP message corresponding to the first data stream means that the first destination address and the second source address are the same (this address is usually the address of the sending end device of the first data stream, such as the address of the RP), and the first source address and the second destination address are the same (this address is usually the address of the receiving end device of the first data stream, such as the address of the NP).
  • the correspondence also means that the first destination queue pair identifier corresponds to the second destination queue pair identifier, that is, the first destination queue pair and the second destination queue pair form a queue pair.
  • the server monitors whether the CNP message is sent in the current cycle, and if the CNP message is not sent, it creates an auxiliary CNP message, and sends the auxiliary CNP message to the second device.
  • the server may receive multiple CE packets in the first data stream from the network device. At this time, the server may send a CNP packet to the second device for each CE packet. The server may also send only one CNP message to the second device for multiple CE messages.
  • if the server has not sent a CNP message in the current period, it spontaneously creates an auxiliary CNP message at the end of the current period and sends the auxiliary CNP message to the second device.
  • this ensures that the sending end device (for example, the RP) of the first data stream receives a CNP message corresponding to the first data stream at intervals of the set period, so that the sending rate of the first data stream is reduced based on the CNP message, which avoids the problem of the first data stream being sped up while it is still congested.
  • the aforementioned failure to send a CNP message means that the value of the CNP sending flag in the flow table entry of the first data flow maintained by the server is the first value. That is, in this application, the server may send a CNP message to the second device according to a specific identifier (for example, the CNP sending flag) in the pre-created flow table entry of the first data flow.
  • the CNP sending flag in the flow table entry of the first data flow indicates whether the server has sent a CNP message in the current period. A CNP sending flag value equal to the first value (for example, 0) means that the server has not sent a CNP message to the second device in the current period, and a CNP sending flag value equal to the second value (for example, 1) means that the server has sent a CNP message to the second device in the current period.
  • the server reads the flow table entry of the first data flow. If the value of the CNP sending flag is the second value, the server has sent a CNP message to the second device; if the value of the CNP sending flag is the first value, the server has not sent a CNP message to the second device, and the server spontaneously creates an auxiliary CNP message and then sends the auxiliary CNP message to the second device.
  • this ensures that the sending end device (for example, the RP) of the first data stream receives a CNP message corresponding to the first data stream at intervals of the set period, so that the sending rate of the first data stream is reduced based on the CNP message, which avoids the problem of the first data stream being sped up while it is still congested.
  • the process in which the server performs flow rate control may also be as follows: regardless of whether a CNP message has been sent in the current period, the server sends a CNP message to the second device at the end of the current period.
  • that is, if the server needs to send a CNP message, it sends the CNP message to the second device at the end of the current period; if the server has not sent a CNP message, it creates an auxiliary CNP message at the end of the current period and sends the auxiliary CNP message to the second device.
  • this ensures that the sending end device (for example, the RP) of the first data stream receives a CNP message corresponding to the first data stream at intervals of the set period, so that the sending rate of the first data stream is reduced based on the CNP message, which avoids the problem of the first data stream being sped up while it is still congested.
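  • The end-of-period variant on the server can be sketched as follows: CE messages only record that a CNP message is needed, and exactly one message goes out when the period ends. The class name `PeriodicCnpSender`, the `send` callback, and the strings standing in for messages are hypothetical; the period timer is assumed to be driven by the caller.

```python
class PeriodicCnpSender:
    """Sends exactly one CNP per period at the period boundary (sketch)."""

    def __init__(self, send):
        self.send = send  # hypothetical callback that delivers a CNP to the second device
        self.cnp_needed = False

    def on_ce_message(self):
        # Defer the CNP: remember that congestion was signalled this period.
        self.cnp_needed = True

    def on_period_end(self) -> str:
        # A regular CNP if one is pending, otherwise an auxiliary CNP.
        kind = "CNP" if self.cnp_needed else "auxiliary CNP"
        self.send(kind)
        self.cnp_needed = False
        return kind
```

  • Compared with sending a CNP message for every CE message, this variant trades a delay of up to one period for a fixed message rate toward the sending end device.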
  • when the first data stream is congested, the server sends M CNP messages to the second device according to the N CE messages.
  • the server can take the initiative to create an auxiliary CNP message and send it to the second device when the CNP message is not sent after the first data stream is congested. In this way, it is possible to avoid speed reduction processing due to unnecessary CNP packets when the first data stream is not congested.
  • the server may determine that the first data stream enters the congested state when the first data stream is not in the congested state and the number of CE messages received in the first data stream is greater than the third threshold; or, when the first data stream is not in the congested state and the number of CNP messages sent is greater than the fourth threshold, determine that the first data stream enters the congested state; or, when the first data stream is in the congested state and a non-CE message in the first data stream is received, determine that the first data stream exits the congested state; or, when the first data stream is in the congested state and no message of the corresponding data stream is received within a set time, determine that the first data stream exits the congested state; or, when the first data stream is in the congested state and no CNP message is sent within a set time, determine that the first data stream exits the congested state.
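  • The entry and exit conditions above can be sketched as a small state machine. The text lists the conditions as alternatives; this sketch applies all of them together, which is an assumption, as are the counter resets on exit, the use of one shared set time for both idle checks, and the names `FlowCongestionState`, `on_packet`, `on_cnp_sent`, and `on_tick`.

```python
class FlowCongestionState:
    """Congestion state of one data stream as seen by the server (sketch)."""

    def __init__(self, third_threshold: int, fourth_threshold: int, set_time_us: int):
        self.third_threshold = third_threshold    # limit on received CE messages
        self.fourth_threshold = fourth_threshold  # limit on sent CNP messages
        self.set_time_us = set_time_us
        self.congested = False
        self.ce_count = 0
        self.cnp_count = 0
        self.last_rx_us = 0
        self.last_cnp_us = 0

    def _enter(self, now_us: int) -> None:
        self.congested = True
        self.last_rx_us = self.last_cnp_us = now_us  # start both idle timers

    def _exit(self) -> None:
        self.congested = False
        self.ce_count = self.cnp_count = 0  # assumption: counters restart

    def on_packet(self, is_ce: bool, now_us: int) -> None:
        self.last_rx_us = now_us
        if self.congested:
            if not is_ce:
                self._exit()  # a non-CE message ends the congested state
        elif is_ce:
            self.ce_count += 1
            if self.ce_count > self.third_threshold:
                self._enter(now_us)

    def on_cnp_sent(self, now_us: int) -> None:
        self.last_cnp_us = now_us
        if not self.congested:
            self.cnp_count += 1
            if self.cnp_count > self.fourth_threshold:
                self._enter(now_us)

    def on_tick(self, now_us: int) -> None:
        if not self.congested:
            return
        no_rx = now_us - self.last_rx_us > self.set_time_us
        no_cnp = now_us - self.last_cnp_us > self.set_time_us
        if no_rx or no_cnp:
            self._exit()
```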
  • the server receives the first data stream from the second device, the first data stream including N CE messages, and then sends M CNP messages to the second device. Because M is greater than N, the second device can receive a CNP message corresponding to the first data stream at intervals of the set period, so that the sending rate of the first data stream is reduced based on the CNP message, which avoids the problem of the first data stream being sped up while it is still congested.
  • the server may use the following two methods to create the flow table entry of the first data flow:
  • the first method is that the sending end device and the receiving end device of the first data stream send link establishment messages to each other, where the link establishment message sent by the sending end device includes the address and queue identifier of the sending end device, and the link establishment message sent by the receiving end device includes the address and queue identifier of the receiving end device.
  • the queue on the sending end device and the queue on the receiving end device form a queue pair, and the data stream and the corresponding CNP messages are sent and received on the same queue pair. For example, after the data stream in queue A of the sending end device is sent to the receiving end device, the CNP message for the data stream belongs to queue B of the receiving end device, and queue A and queue B form a queue pair.
  • the server creates the flow table entry of the first data flow according to the above link establishment messages. The flow table entry of the first data flow includes the second destination address in the first data flow (for example, the IP address of the receiving end device of the first data flow), the second source address (for example, the IP address of the sending end device of the first data flow), the second destination queue pair identifier (for example, the queue identifier in the receiving end device of the first data flow), and the source queue pair identifier (for example, the queue identifier in the sending end device of the first data flow).
  • the server may also delete the flow table entry of the first data stream according to the delete link message (sent by the sender device or the receiver device of the first data stream).
  • the server can monitor whether it sends a CNP message in the current period. If it sends a CNP message, it sets the value of the CNP sending flag in the flow table entry of the first data flow to the second value, and it sets the value of the CNP sending flag to the first value at the end of the current period.
  • the server divides the time axis into periods. In the current period, if the server sends a CNP message, it sets the value of the CNP sending flag in the flow table entry of the first data flow to the second value, and when the current period ends, the server sets the value of the CNP sending flag back to the first value so as to start a new monitoring operation in the next period.
  • the second method is that, when no flow table entry exists for the first data flow, if the server sends a CNP message for the first time, it creates the flow table entry of the first data flow based on the CNP message and starts the entry timeout timer. In this method, the server creates the flow table entry of the first data flow based on the CNP message sent for the first time, and determines when to delete the flow table entry of the first data flow by means of the entry timeout timer.
  • the flow table entry of the first data flow includes the second destination address in the first data flow (that is, the first source address in the above CNP message, such as the IP address of the receiving end device of the first data flow), the second source address (that is, the first destination address in the above CNP message, such as the IP address of the sending end device of the first data flow), the source queue pair identifier of the first data flow (that is, the first destination queue pair identifier in the above CNP message, such as the queue identifier in the sending end device of the first data flow), the CNP sending flag, and the timeout identifier of the flow table entry of the first data flow; the timeout identifier is used to indicate whether the entry timeout timer has expired.
  • the server monitors whether it sends a CNP message in the current period. If it sends a CNP message, it sets the value of the CNP sending flag in the flow table entry of the first data flow to the second value, and it sets the value of the CNP sending flag to the first value at the end of the current period.
  • the server divides the time axis into periods. In the current period, if the server sends a CNP message, it sets the value of the CNP sending flag in the flow table entry of the first data flow to the second value, and when the current period ends, the server sets the value of the CNP sending flag back to the first value so as to start a new monitoring operation in the next period. When the timeout identifier indicates that the entry timeout timer has expired, if no CNP message was sent within the timing period of the entry timeout timer, the server deletes the flow table entry of the first data flow.
  • FIG. 5 is a schematic structural diagram of an embodiment of a flow rate control device of this application.
  • the device of this embodiment is applied to the network equipment in the first embodiment of the above method, and can also be applied to the server in the second embodiment of the above method.
  • the device includes: a receiving module 501, a sending module 502, and a processing module 503.
  • the server in the second embodiment of the above method includes: a receiving module 501, a sending module 502, and a processing module 503.
  • The receiving module 501 is configured to receive N explicit congestion notification packet (CNP) messages from a first device, where the N CNP messages correspond to a first data flow and N is a natural number. The sending module 502 is configured to send M CNP messages to a second device according to the N CNP messages, where the M CNP messages correspond to the first data flow and M is an integer greater than N.
  • the CNP message is used to indicate that the first data stream is congested.
  • The CNP message includes a first destination address, a first source address, and a first destination queue pair identifier. A message in the first data flow includes a second destination address, a second source address, and a second destination queue pair identifier. The first destination address is the same as the second source address, and the first source address is the same as the second destination address.
  • the sending module 502 is specifically configured to send the M CNP messages according to a set period.
  • The processing module 503 is configured to monitor whether a CNP message from the first device is received in the current period. The sending module 502 is specifically configured to create an auxiliary CNP message and send it to the second device if no CNP message from the first device is received.
  • The processing module 503 is further configured to start timing each time a CNP message is received, beginning with the first of the N CNP messages. The sending module 502 is further configured to create an auxiliary CNP message and send it to the second device if, when the timed duration exceeds a set threshold, the next CNP message from the first device has not yet been received.
  • The processing module 503 is further configured to start timing each time an auxiliary CNP message is sent to the second device. The sending module 502 is further configured to create another auxiliary CNP message and send it to the second device if, when the timed duration exceeds the set threshold, the next CNP message from the first device has still not been received.
  • Not receiving the CNP message from the first device means that the CNP-passed flag in the flow table entry of the first data flow has the first value.
  • The processing module 503 is further configured to monitor whether a CNP message from the first device is received in the current period and, if one is received, to set the CNP-passed flag in the flow table entry of the first data flow to the second value and reset it to the first value at the end of the current period.
  • The processing module 503 is further configured to create the flow table entry of the first data flow according to a link establishment message. The flow table entry includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP-passed flag.
  • the processing module 503 is further configured to delete the flow table entry of the first data flow according to the delete link message.
  • The processing module 503 is further configured to, when no flow table entry of the first data flow exists and a CNP message from the first device is received for the first time, create the flow table entry of the first data flow and start an entry timeout timer. The flow table entry includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP-passed flag, and a timeout identifier of the flow table entry. The timeout identifier indicates whether the entry timeout timer has expired.
  • The processing module 503 is further configured to, when the timeout identifier indicates that the entry timeout timer has expired, delete the flow table entry of the first data flow if no CNP message from the first device was received within the timing period of the entry timeout timer.
  • The sending module 502 is specifically configured to send the M CNP messages to the second device according to the N CNP messages after a first queue enters a congested state. The first queue is the queue, among the multiple sending queues of the egress port, that contains the first data flow.
  • The processing module 503 is configured to determine the current state of the first queue: when the first queue is not in the congested state and its depth is greater than a first threshold, the first queue is determined to enter the congested state; or, when the first queue is in the congested state and its depth is less than a second threshold, the first queue is determined to exit the congested state. The first threshold is greater than the second threshold.
  • This device can be used to implement the technical solution of the method embodiment shown in FIG. 3, and its implementation principles and technical effects are similar, and will not be repeated here.
  • The receiving module 501 is configured to receive a first data flow from a second device, where the first data flow includes N congestion encountered (CE) messages and N is a natural number. The sending module 502 is configured to send M explicit congestion notification packet (CNP) messages to the second device according to the N CE messages, where the M CNP messages correspond to the first data flow and M is an integer greater than N.
  • the CNP message is used to indicate that the first data stream is congested.
  • The CNP message includes a first destination address, a first source address, and a first destination queue pair identifier. A message in the first data flow includes a second destination address, a second source address, and a second destination queue pair identifier. The first destination address is the same as the second source address, and the first source address is the same as the second destination address.
  • the sending module 502 is specifically configured to send the M CNP messages according to a set period.
  • The sending module 502 is specifically configured to monitor whether a CNP message is sent in the current period and, if none is sent, to create an auxiliary CNP message and send it to the second device.
  • Not sending the CNP message means that the CNP sending flag in the flow table entry of the first data flow has the first value.
  • The processing module 503 is configured to monitor whether a CNP message is sent in the current period and, if one is sent, to set the CNP sending flag in the flow table entry of the first data flow to the second value and reset it to the first value at the end of the current period.
  • the processing module 503 is further configured to create a flow table entry of the first data flow according to the link establishment message, and the flow table entry of the first data flow includes the The second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP sending flag.
  • the processing module 503 is further configured to delete the flow table entry of the first data flow according to the delete link message.
  • The processing module 503 is further configured to, when no flow table entry of the first data flow exists and a CNP message is sent for the first time, create the flow table entry of the first data flow and start an entry timeout timer. The flow table entry includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP sending flag, and a timeout identifier of the flow table entry. The timeout identifier indicates whether the entry timeout timer has expired.
  • The processing module 503 is further configured to, when the timeout identifier indicates that the entry timeout timer has expired, delete the flow table entry of the first data flow if no CNP message was sent within the timing period of the entry timeout timer.
  • The sending module 502 is specifically configured to send the M CNP messages to the second device according to the N CE messages after the first data flow becomes congested.
  • The processing module 503 is configured to determine the current state of the first data flow: when the first data flow is not in the congested state and the number of CE messages received in the first data flow is greater than a third threshold, the first data flow is determined to enter the congested state; or, when the first data flow is not in the congested state and the number of CNP messages sent is greater than a fourth threshold, the first data flow is determined to enter the congested state; or, when the first data flow is in the congested state and a non-CE message of the first data flow is received, the first data flow is determined to exit the congested state; or, when the first data flow is in the congested state and no message of the corresponding data flow is received within a set time, the first data flow is determined to exit the congested state; or, when the first data flow is in the congested state and no CNP message is sent within a set time, the first data flow is determined to exit the congested state.
  • This device can be used to implement the technical solution of the method embodiment shown in FIG. 4, and its implementation principles and technical effects are similar, and will not be repeated here.
  • FIG. 6 is a schematic structural diagram of a device 600 provided by this application.
  • the device 600 may be the network device in the first embodiment above, or the server in the second embodiment above, and the device 600 includes a processor 601 and a transceiver 602.
  • the device 600 further includes a memory 603.
  • the processor 601, the transceiver 602, and the memory 603 can communicate with each other through an internal connection path to transfer control signals and/or data signals.
  • the memory 603 is used to store a computer program.
  • the processor 601 is configured to execute a computer program stored in the memory 603, so as to realize each function of the flow rate control device in the foregoing device embodiment.
  • the memory 603 may also be integrated in the processor 601 or independent of the processor 601.
  • the device 600 may further include an antenna 604 for transmitting the signal output by the transceiver 602.
  • the transceiver 602 receives signals through an antenna.
  • the device 600 may further include a power supply 605, which is used to provide power to various devices or circuits in the network device.
  • the device 600 may further include an input unit 606 or a display unit 607 (which can also be regarded as an output unit).
  • The present application also provides a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a computer, the computer performs the steps and/or processing executed by the network device or the server in any of the above method embodiments.
  • The present application also provides a computer program product that includes computer program code.
  • When the computer program code is run on a computer, the computer performs the steps and/or processing executed by the network device or the server in any of the foregoing method embodiments.
  • the steps of the foregoing method embodiments can be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
  • The processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware encoding processor, or executed and completed by a combination of hardware and software modules in the encoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the memory mentioned in the above embodiments may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • The non-volatile memory can be read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory.
  • The volatile memory may be random access memory (RAM), which is used as an external cache. Many forms of RAM are available, for example static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct rambus RAM (DR RAM).
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • The division into units is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • If the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • The technical solution of this application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (a personal computer, a server, a network device, or the like) to execute all or part of the steps of the method described in each embodiment of the present application.
  • The aforementioned storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

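This application provides a flow rate control method and apparatus. The flow rate control method includes: receiving N explicit congestion notification packet (CNP) messages from a first device, where the N CNP messages correspond to a first data flow and N is a natural number; and sending M CNP messages to a second device according to the N CNP messages, where the M CNP messages correspond to the first data flow and M is an integer greater than N. This application solves the problem that a data flow still undergoes a rate increase while it is congested.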

Description

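Flow rate control method and apparatus
This application claims priority to Chinese patent application No. 201910649933.6, entitled "Flow rate control method and apparatus" and filed with the China Patent Office on July 18, 2019, which is incorporated herein by reference in its entirety.
Technical Field
This application relates to communication technologies, and in particular to a flow rate control method and apparatus.
Background
To reduce server-side data-processing latency in network transmission, remote direct memory access (RDMA) technology allows a client application to read from and write to server memory directly and remotely. In RDMA, data is sent and received directly through registered buffers on the network interface cards (NICs) of the end nodes. The network protocols are deployed entirely on the NICs and do not pass through the host's network protocol stack, which significantly reduces the host's central processing unit (CPU) utilization and the overall latency. The RDMA over Converged Ethernet (RoCE) protocol has two versions, RoCEv1 and RoCEv2. The main difference is that RoCEv1 is an RDMA protocol implemented on the Ethernet link layer, whereas RoCEv2 is an RDMA protocol implemented on the UDP layer of the Transmission Control Protocol/Internet Protocol (TCP/IP) suite over Ethernet.
Once a network protocol that meets the requirements of high throughput, ultra-low latency, and low CPU overhead has been deployed, a congestion control algorithm is needed to make transmission reliable and lossless. Data Center Quantized Congestion Notification (DCQCN) was proposed for this purpose. A communication system that implements DCQCN-based congestion control includes a reaction point (RP), a congestion point (CP), and a notification point (NP). At the CP, if the length of an egress port queue exceeds a threshold, the CP adds an explicit congestion notification (ECN) mark to messages newly enqueued on that egress port queue. At the NP, the arrival of an ECN-marked message (that is, a congestion encountered (CE) message) indicates network congestion, so the NP passes this congestion information to the RP. For this purpose the RoCEv2 protocol defines the explicit congestion notification packet (CNP) message: if a CE message arrives for a flow and the NP has not sent a CNP message for that flow in the past n microseconds, the NP sends a CNP message immediately. In other words, if several CE messages arrive for a flow within the time window (n microseconds), the NP generates at most one CNP message for that flow every n microseconds. At the RP, when a CNP message is received, the RP reduces its sending rate and updates the rate reduction factor. When the RP has received no CNP message for a continuous period of time, it increases the sending rate according to a certain algorithm.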
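The NP-side rule just described (at most one CNP per flow per n-microsecond window) can be sketched as follows. This is only an illustration under stated assumptions, not the RoCEv2 or DCQCN reference behaviour: the class name `NotificationPoint`, the method `on_ce_message`, and the 50-microsecond default window are hypothetical names and values introduced here.

```python
class NotificationPoint:
    """Illustrative NP: at most one CNP per flow per time window."""

    def __init__(self, window_us: int = 50):
        # window_us plays the role of "n microseconds"; 50 is an assumed value.
        self.window_us = window_us
        self.last_cnp_us = {}  # flow id -> timestamp of the last CNP sent

    def on_ce_message(self, flow_id: str, now_us: int) -> bool:
        """Return True if a CNP should be sent for this CE message."""
        last = self.last_cnp_us.get(flow_id)
        if last is None or now_us - last >= self.window_us:
            self.last_cnp_us[flow_id] = now_us
            return True
        return False  # a CNP was already sent for this flow within the window
```

Each flow is tracked separately, so a burst of CE messages on one flow does not suppress the CNP for another flow.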
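However, when the number of flows is large, the average bandwidth available to each flow is small. For a congested flow, the interval between the flow's messages (that is, the minimum interval at which the flow can obtain a CNP message) may be longer than the rate-increase interval. Even if the NP generates a CNP for every CE message, the RP cannot perform rate reduction at each rate-increase interval. A flow in the congested state therefore has its rate increased, rate control fails to converge, and message transmission efficiency suffers.
Summary
This application provides a flow rate control method and apparatus to solve the problem that a data flow still undergoes a rate increase while it is congested.
In a first aspect, this application provides a flow rate control method, including:
receiving N explicit congestion notification packet (CNP) messages from a first device, where the N CNP messages correspond to a first data flow and N is a natural number; and sending M CNP messages to a second device according to the N CNP messages, where the M CNP messages correspond to the first data flow and M is an integer greater than N.
In this application, the network device receives N CNP messages from the first device and then sends M CNP messages to the second device, where M is greater than N. This ensures that the second device receives a CNP message corresponding to the first data flow once per period and reduces the sending rate of the first data flow based on that CNP message, which solves the problem that the first data flow still undergoes a rate increase while it is congested.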
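In a possible implementation, the CNP message indicates that the first data flow is congested.
In a possible implementation, the CNP message includes a first destination address, a first source address, and a first destination queue pair identifier, and a message in the first data flow includes a second destination address, a second source address, and a second destination queue pair identifier, where the first destination address is the same as the second source address and the first source address is the same as the second destination address.
In a possible implementation, sending M CNP messages to the second device according to the N CNP messages includes: sending the M CNP messages according to a set period.
This ensures that the second device receives a CNP message corresponding to the first data flow once per period and reduces the sending rate of the first data flow accordingly, which solves the problem that the first data flow still undergoes a rate increase while it is congested.
In a possible implementation, sending M CNP messages to the second device according to the N CNP messages includes: monitoring whether a CNP message from the first device is received in the current period; and, if no CNP message from the first device is received, creating an auxiliary CNP message and sending the auxiliary CNP message to the second device.
On receiving the N CNP messages from the first device, the network device forwards them directly to the second device. If no CNP message from the first device is received during an entire period, the network device creates a CNP message and sends it to the second device. This ensures that the second device receives a CNP message corresponding to the first data flow once per period and reduces the sending rate of the first data flow accordingly, which solves the problem that the first data flow still undergoes a rate increase while it is congested.
In a possible implementation, sending M CNP messages to the second device according to the N CNP messages includes: starting from the first of the N CNP messages, starting timing each time a CNP message is received; and, when the timed duration exceeds a set threshold, if the next CNP message from the first device has not yet been received, creating an auxiliary CNP message and sending the auxiliary CNP message to the second device.
Starting from the first of the N CNP messages received from the first device, the network device forwards each CNP message directly to the second device. If, after a CNP message from the first device is received, the next one has not arrived after some time (for example, the duration exceeds the set threshold), the network device creates a CNP message and sends it to the second device. This prevents the second device from waiting a long time for the next CNP message corresponding to the first data flow after it has received one, so that the second device can reduce the sending rate of the first data flow in time, which solves the problem that the first data flow still undergoes a rate increase while it is congested.
In a possible implementation, the method further includes: starting timing each time an auxiliary CNP message is sent to the second device; and, when the timed duration exceeds the set threshold, if the next CNP message from the first device has still not been received, creating another auxiliary CNP message and sending it to the second device.
After sending a CNP message that it created itself, the network device restarts timing. If the timed duration exceeds the set threshold and the next CNP message from the first device has still not been received, it creates another CNP message and sends it to the second device. That is, the network device receives a CNP message from the first device; if, after some time (for example, the duration exceeds the set threshold), it has not received the next CNP message from the first device, it creates a CNP message itself and sends it to the second device; if, a further such interval after sending that message, it has still not received the next CNP message from the first device, it creates and sends another one, and so on. This prevents the second device from waiting a long time for the next CNP message corresponding to the first data flow after it has received one, so that the second device can reduce the sending rate of the first data flow in time, which solves the problem that the first data flow still undergoes a rate increase while it is congested.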
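The timer-driven variant can be simulated offline to see how N received CNP messages become M sent ones. The sketch below is a simplified model, not the claimed implementation: the function name `relay_with_gap_fill`, the `end_us` simulation bound, and the choice to emit the auxiliary CNP exactly when the timer reaches the threshold are assumptions made for illustration.

```python
def relay_with_gap_fill(arrivals_us, threshold_us, end_us):
    """Return (time_us, kind) for every CNP sent to the second device.

    arrivals_us  -- times at which CNPs arrive from the first device
    threshold_us -- the set threshold; the timer restarts after every send
    end_us       -- assumed end of the simulation window
    """
    if threshold_us <= 0:
        raise ValueError("threshold_us must be positive")
    sent = []
    deadline = None  # no timer runs before the first CNP arrives
    for t in sorted(arrivals_us):
        # Fill the gap before this arrival with auxiliary CNPs.
        while deadline is not None and deadline < t:
            sent.append((deadline, "auxiliary"))
            deadline += threshold_us  # timing restarts after each auxiliary CNP
        sent.append((t, "forwarded"))
        deadline = t + threshold_us   # timing restarts after each received CNP
    # Keep filling until the simulation window ends.
    while deadline is not None and deadline <= end_us:
        sent.append((deadline, "auxiliary"))
        deadline += threshold_us
    return sent
```

With arrivals at 0 and 700 microseconds, a 300-microsecond threshold, and a 1000-microsecond window, the model forwards N = 2 messages and sends M = 5 in total.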
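In a possible implementation, not receiving the CNP message from the first device means that the CNP-passed flag in the flow table entry of the first data flow has the first value.
The network device can use a specific flag in the flow table entry, for example the CNP-passed flag, to confirm whether a CNP message from the first device has been received, which improves processing efficiency.
In a possible implementation, the method further includes: monitoring whether a CNP message from the first device is received in the current period; and, if one is received, setting the CNP-passed flag in the flow table entry of the first data flow to the second value and resetting it to the first value at the end of the current period.
In a possible implementation, before receiving the N CNP messages from the first device, the method further includes: creating the flow table entry of the first data flow according to a link establishment message, where the flow table entry includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP-passed flag.
In a possible implementation, after creating the flow table entry of the first data flow according to the link establishment message, the method further includes: deleting the flow table entry of the first data flow according to a link deletion message.
In a possible implementation, before sending M CNP messages to the second device according to the N CNP messages, the method further includes: when no flow table entry of the first data flow exists and a CNP message from the first device is received for the first time, creating the flow table entry of the first data flow and starting an entry timeout timer, where the flow table entry includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP-passed flag, and a timeout identifier of the flow table entry, and the timeout identifier indicates whether the entry timeout timer has expired.
In a possible implementation, after creating the flow table entry of the first data flow and starting the entry timeout timer, the method further includes: when the timeout identifier indicates that the entry timeout timer has expired, deleting the flow table entry of the first data flow if no CNP message from the first device was received within the timing period of the entry timeout timer.
In a possible implementation, sending M CNP messages to the second device according to the N CNP messages includes: after a first queue enters a congested state, sending the M CNP messages to the second device according to the N CNP messages, where the first queue is the queue, among the multiple sending queues of the egress port, that contains the first data flow.
The network device creates and sends auxiliary CNP messages when no CNP message from the first device is received only after the first queue has entered the congested state. This avoids rate reduction caused by unnecessary CNP messages while the first queue is not congested.
In a possible implementation, before sending M CNP messages to the second device according to the N CNP messages, the method further includes: determining the current state of the first queue; when the first queue is not in the congested state and its depth is greater than a first threshold, determining that the first queue enters the congested state; or, when the first queue is in the congested state and its depth is less than a second threshold, determining that the first queue exits the congested state; where the first threshold is greater than the second threshold.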
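In a second aspect, this application provides a flow rate control method, including:
receiving a first data flow from a second device, where the first data flow includes N congestion encountered (CE) messages and N is a natural number; and sending M explicit congestion notification packet (CNP) messages to the second device according to the N CE messages, where the M CNP messages correspond to the first data flow and M is an integer greater than N.
In this application, the server receives the first data flow, which includes N CE messages, from the second device and then sends M CNP messages to the second device, where M is greater than N. This ensures that the second device receives a CNP message corresponding to the first data flow once per period and reduces the sending rate of the first data flow accordingly, which solves the problem that the first data flow still undergoes a rate increase while it is congested.
In a possible implementation, the CNP message indicates that the first data flow is congested.
In a possible implementation, the CNP message includes a first destination address, a first source address, and a first destination queue pair identifier, and a message in the first data flow includes a second destination address, a second source address, and a second destination queue pair identifier, where the first destination address is the same as the second source address and the first source address is the same as the second destination address.
In a possible implementation, sending M CNP messages to the second device according to the N CE messages includes: sending the M CNP messages according to a set period.
This ensures that the second device receives a CNP message corresponding to the first data flow once per period and reduces the sending rate of the first data flow accordingly, which solves the problem that the first data flow still undergoes a rate increase while it is congested.
In a possible implementation, sending M CNP messages to the second device according to the N CE messages includes: monitoring whether a CNP message is sent in the current period; and, if no CNP message is sent, creating an auxiliary CNP message and sending the auxiliary CNP message to the second device.
After receiving a CE message in the first data flow, the server can send a CNP message to the second device according to that CE message. If no CNP message is sent to the second device during an entire period, the server creates a CNP message and sends it to the second device. This ensures that the second device receives a CNP message corresponding to the first data flow once per period and reduces the sending rate of the first data flow accordingly, which solves the problem that the first data flow still undergoes a rate increase while it is congested.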
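In a possible implementation, not sending the CNP message means that the CNP sending flag in the flow table entry of the first data flow has the first value.
The server can use a specific flag in the flow table entry, for example the CNP sending flag, to confirm whether a CNP message has been sent to the second device, which improves processing efficiency.
In a possible implementation, the method further includes: monitoring whether a CNP message is sent in the current period; and, if one is sent, setting the CNP sending flag in the flow table entry of the first data flow to the second value and resetting it to the first value at the end of the current period.
In a possible implementation, before receiving the first data flow from the second device, the method further includes: creating the flow table entry of the first data flow according to a link establishment message, where the flow table entry includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP sending flag.
In a possible implementation, after creating the flow table entry of the first data flow according to the link establishment message, the method further includes: deleting the flow table entry of the first data flow according to a link deletion message.
In a possible implementation, before sending M CNP messages to the second device according to the N CE messages, the method further includes: when no flow table entry of the first data flow exists and a CNP message is sent for the first time, creating the flow table entry of the first data flow and starting an entry timeout timer, where the flow table entry includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP sending flag, and a timeout identifier of the flow table entry, and the timeout identifier indicates whether the entry timeout timer has expired.
In a possible implementation, after creating the flow table entry of the first data flow and starting the entry timeout timer, the method further includes: when the timeout identifier indicates that the entry timeout timer has expired, deleting the flow table entry of the first data flow if no CNP message was sent within the timing period of the entry timeout timer.
In a possible implementation, sending M CNP messages to the second device according to the N CE messages includes: after the first data flow becomes congested, sending the M CNP messages to the second device according to the N CE messages.
The server creates and sends auxiliary CNP messages when it has not sent a CNP message to the second device only after the first data flow has become congested. This avoids rate reduction caused by unnecessary CNP messages while the first data flow is not congested.
In a possible implementation, before sending M CNP messages to the second device according to the N CE messages, the method further includes: determining the current state of the first data flow; when the first data flow is not in the congested state and the number of CE messages received in the first data flow is greater than a third threshold, determining that the first data flow enters the congested state; or, when the first data flow is not in the congested state and the number of CNP messages sent is greater than a fourth threshold, determining that the first data flow enters the congested state; or, when the first data flow is in the congested state and a non-CE message of the first data flow is received, determining that the first data flow exits the congested state; or, when the first data flow is in the congested state and no message of the corresponding data flow is received within a set time, determining that the first data flow exits the congested state; or, when the first data flow is in the congested state and no CNP message is sent within a set time, determining that the first data flow exits the congested state.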
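The per-flow state decision can be pictured as a small state machine. The sketch below covers only three of the listed conditions (entry on the CE count, exit on a non-CE message, exit on an idle timeout); the CNP-count entry and the CNP-idle exit are omitted for brevity. The class name `FlowCongestionState`, its method names, and the reset of the CE counter on exit are assumptions introduced here.

```python
class FlowCongestionState:
    """Illustrative congestion state for one data flow on the server."""

    def __init__(self, ce_threshold: int, idle_timeout_us: int):
        self.ce_threshold = ce_threshold        # stands in for the "third threshold"
        self.idle_timeout_us = idle_timeout_us  # the "set time" without any message
        self.congested = False
        self.ce_count = 0
        self.last_message_us = 0

    def on_message(self, is_ce: bool, now_us: int) -> bool:
        """Process one message of the flow; return the resulting state."""
        self.last_message_us = now_us
        if is_ce:
            self.ce_count += 1
            if not self.congested and self.ce_count > self.ce_threshold:
                self.congested = True  # enter: CE count exceeds the threshold
        elif self.congested:
            self._exit()               # exit: a non-CE message arrived
        return self.congested

    def on_tick(self, now_us: int) -> bool:
        """Periodic check for the idle-timeout exit condition."""
        if self.congested and now_us - self.last_message_us >= self.idle_timeout_us:
            self._exit()               # exit: no message within the set time
        return self.congested

    def _exit(self) -> None:
        self.congested = False
        self.ce_count = 0  # assumption: counting restarts after leaving the state
```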
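In a third aspect, this application provides a flow rate control apparatus, including:
a receiving module, configured to receive N explicit congestion notification packet (CNP) messages from a first device, where the N CNP messages correspond to a first data flow and N is a natural number; and a sending module, configured to send M CNP messages to a second device according to the N CNP messages, where the M CNP messages correspond to the first data flow and M is an integer greater than N.
In a possible implementation, the CNP message indicates that the first data flow is congested.
In a possible implementation, the CNP message includes a first destination address, a first source address, and a first destination queue pair identifier, and a message in the first data flow includes a second destination address, a second source address, and a second destination queue pair identifier, where the first destination address is the same as the second source address and the first source address is the same as the second destination address.
In a possible implementation, the sending module is specifically configured to send the M CNP messages according to a set period.
In a possible implementation, the apparatus further includes a processing module, configured to monitor whether a CNP message from the first device is received in the current period; the sending module is specifically configured to create an auxiliary CNP message and send it to the second device if no CNP message from the first device is received.
In a possible implementation, the processing module is further configured to start timing each time a CNP message is received, beginning with the first of the N CNP messages; the sending module is further configured to create an auxiliary CNP message and send it to the second device if, when the timed duration exceeds a set threshold, the next CNP message from the first device has not yet been received.
In a possible implementation, the processing module is further configured to start timing each time an auxiliary CNP message is sent to the second device; the sending module is further configured to create another auxiliary CNP message and send it to the second device if, when the timed duration exceeds the set threshold, the next CNP message from the first device has still not been received.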
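In a possible implementation, not receiving the CNP message from the first device means that the CNP-passed flag in the flow table entry of the first data flow has the first value.
In a possible implementation, the processing module is further configured to monitor whether a CNP message from the first device is received in the current period and, if one is received, to set the CNP-passed flag in the flow table entry of the first data flow to the second value and reset it to the first value at the end of the current period.
In a possible implementation, the processing module is further configured to create the flow table entry of the first data flow according to a link establishment message, where the flow table entry includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP-passed flag.
In a possible implementation, the processing module is further configured to delete the flow table entry of the first data flow according to a link deletion message.
In a possible implementation, the processing module is further configured to, when no flow table entry of the first data flow exists and a CNP message from the first device is received for the first time, create the flow table entry of the first data flow and start an entry timeout timer, where the flow table entry includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP-passed flag, and a timeout identifier of the flow table entry, and the timeout identifier indicates whether the entry timeout timer has expired.
In a possible implementation, the processing module is further configured to, when the timeout identifier indicates that the entry timeout timer has expired, delete the flow table entry of the first data flow if no CNP message from the first device was received within the timing period of the entry timeout timer.
In a possible implementation, the sending module is specifically configured to send the M CNP messages to the second device according to the N CNP messages after a first queue enters a congested state, where the first queue is the queue, among the multiple sending queues of the egress port, that contains the first data flow.
In a possible implementation, the apparatus further includes a processing module, configured to determine the current state of the first queue; when the first queue is not in the congested state and its depth is greater than a first threshold, determine that the first queue enters the congested state; or, when the first queue is in the congested state and its depth is less than a second threshold, determine that the first queue exits the congested state; where the first threshold is greater than the second threshold.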
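In a fourth aspect, this application provides a flow rate control apparatus, including:
a receiving module, configured to receive a first data flow from a second device, where the first data flow includes N congestion encountered (CE) messages and N is a natural number; and a sending module, configured to send M explicit congestion notification packet (CNP) messages to the second device according to the N CE messages, where the M CNP messages correspond to the first data flow and M is an integer greater than N.
In a possible implementation, the CNP message indicates that the first data flow is congested.
In a possible implementation, the CNP message includes a first destination address, a first source address, and a first destination queue pair identifier, and a message in the first data flow includes a second destination address, a second source address, and a second destination queue pair identifier, where the first destination address is the same as the second source address and the first source address is the same as the second destination address.
In a possible implementation, the sending module is specifically configured to send the M CNP messages according to a set period.
In a possible implementation, the sending module is specifically configured to monitor whether a CNP message is sent in the current period and, if none is sent, to create an auxiliary CNP message and send it to the second device.
In a possible implementation, not sending the CNP message means that the CNP sending flag in the flow table entry of the first data flow has the first value.
In a possible implementation, the apparatus further includes a processing module, configured to monitor whether a CNP message is sent in the current period and, if one is sent, to set the CNP sending flag in the flow table entry of the first data flow to the second value and reset it to the first value at the end of the current period.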
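In a possible implementation, the processing module is further configured to create the flow table entry of the first data flow according to a link establishment message, where the flow table entry includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP sending flag.
In a possible implementation, the processing module is further configured to delete the flow table entry of the first data flow according to a link deletion message.
In a possible implementation, the processing module is further configured to, when no flow table entry of the first data flow exists and a CNP message is sent for the first time, create the flow table entry of the first data flow and start an entry timeout timer, where the flow table entry includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP sending flag, and a timeout identifier of the flow table entry, and the timeout identifier indicates whether the entry timeout timer has expired.
In a possible implementation, the processing module is further configured to, when the timeout identifier indicates that the entry timeout timer has expired, delete the flow table entry of the first data flow if no CNP message was sent within the timing period of the entry timeout timer.
In a possible implementation, the sending module is specifically configured to send the M CNP messages to the second device according to the N CE messages after the first data flow becomes congested.
In a possible implementation, the apparatus further includes a processing module, configured to determine the current state of the first data flow; when the first data flow is not in the congested state and the number of CE messages received in the first data flow is greater than a third threshold, determine that the first data flow enters the congested state; or, when the first data flow is not in the congested state and the number of CNP messages sent is greater than a fourth threshold, determine that the first data flow enters the congested state; or, when the first data flow is in the congested state and a non-CE message of the first data flow is received, determine that the first data flow exits the congested state; or, when the first data flow is in the congested state and no message of the corresponding data flow is received within a set time, determine that the first data flow exits the congested state; or, when the first data flow is in the congested state and no CNP message is sent within a set time, determine that the first data flow exits the congested state.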
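In a fifth aspect, this application provides a network device, including:
one or more processors; and
a memory, configured to store one or more programs;
where, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any implementation of the first aspect.
In a sixth aspect, this application provides a server, including:
one or more processors; and
a memory, configured to store one or more programs;
where, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any implementation of the second aspect.
In a seventh aspect, this application provides a computer-readable storage medium including a computer program which, when executed on a computer, causes the computer to perform the method according to any implementation of the first and second aspects.
In an eighth aspect, this application provides a computer program which, when executed by a computer, performs the method according to any implementation of the first and second aspects.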
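Brief Description of the Drawings
FIG. 1 is an example of a typical scenario to which the flow rate control method of this application applies;
FIG. 2 is an example of a communication system implementing an existing congestion control method;
FIG. 3 is a flowchart of Embodiment 1 of the flow rate control method of this application;
FIG. 4 is a flowchart of Embodiment 2 of the flow rate control method of this application;
FIG. 5 is a schematic structural diagram of an embodiment of the flow rate control apparatus of this application;
FIG. 6 is a schematic structural diagram of a device 600 provided by this application.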
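Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions in this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
The terms "first", "second", and the like in the description, embodiments, claims, and drawings of this application are used only to distinguish between descriptions and are not to be understood as indicating or implying relative importance or order. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be understood that in this application "at least one (item)" means one or more and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following (items)" or a similar expression means any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where each of a, b, and c may be single or multiple.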
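FIG. 1 is an example of a typical scenario to which the flow rate control method of this application applies. As shown in FIG. 1, the scenario is a data center network, which can be used for high-performance computing, high-performance distributed storage, big data, artificial intelligence, and the like. The data center network can be a CLOS network, which includes core switches and access switches; servers connect to the network through the access switches.
Note that FIG. 1 shows only one example scenario to which the flow rate control method applies. This application is also applicable to RDMA scenarios and does not specifically limit the application scenario.
FIG. 2 is an example of a communication system implementing an existing congestion control method. As shown in FIG. 2, the communication system can include an RP, a CP, and an NP, where the RP and the NP can be servers in the scenario of FIG. 1 and the CP can be a core switch or an access switch in that scenario. The congestion control process can include the following. At the CP, if the depth of an egress port queue exceeds a threshold, the CP determines that the egress port queue has entered the congested state and adds an ECN mark to the messages waiting to be sent in that queue; an ECN-marked message is called a congestion encountered (CE) message. At the NP, when a CE message arrives, the NP immediately sends a CNP message to notify the RP that the data flow is congested. Alternatively, the NP sends only one CNP message after receiving several CE messages to notify the RP that the data flow is congested. At the RP, when a CNP message is received, the RP reduces its sending rate and updates the rate reduction factor. In addition, when the RP has received no CNP message for a continuous period of time, it increases the sending rate according to a certain algorithm. However, when the number of data flows is large, the interval at which a congested data flow obtains its CNP messages may be longer than the interval at which the RP increases the sending rate. A data flow in the congested state then has its rate increased, rate control fails to converge, and message transmission efficiency suffers.
This application provides a flow rate control method to solve the above technical problem.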
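FIG. 3 is a flowchart of Embodiment 1 of the flow rate control method of this application. As shown in FIG. 3, the method of this embodiment can be executed by a network device, which can be a core switch or an access switch in the scenario of FIG. 1. The flow rate control method can include:
Step 301. Receive N CNP messages from a first device, where the N CNP messages correspond to a first data flow.
N is a natural number. Usually the first device is the receiving end of the first data flow, for example the NP in FIG. 2, and the second device is the sending end of the first data flow, for example the RP in FIG. 2. Note that in this application the first device may also be the sending end of the first data flow, for example the RP in FIG. 2, and the second device may also be the receiving end of the first data flow, for example the NP in FIG. 2. In addition, the first device may be another network device downstream of the network device, and the second device may be another network device upstream of the network device. This application does not specifically limit the first device or the second device.
In this application, a CNP message indicates that the first data flow is congested. According to the congestion control process described for FIG. 2, the network device may receive multiple CNP messages from the first device and needs to forward them to the second device.
A CNP message includes a first destination address, a first source address, and a first destination queue pair identifier; a message in the first data flow includes a second destination address, a second source address, and a second destination queue pair identifier. A CNP message corresponds to the first data flow when the first destination address is the same as the second source address (usually the address of the sending-end device of the first data flow, for example the address of the RP) and the first source address is the same as the second destination address (usually the address of the receiving-end device of the first data flow, for example the address of the NP). The correspondence further requires that the first destination queue pair identifier and the second destination queue pair identifier correspond to each other, that is, the first destination queue pair and the second destination queue pair together form one queue pair.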
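The correspondence test between a CNP message and a data flow can be written as a single predicate. This is a sketch under assumptions: the `Header` tuple, the function `cnp_matches_flow`, and the `qp_pairs` mapping (sender-side queue identifier to its paired receiver-side queue identifier) are hypothetical names chosen for illustration and do not reflect a real packet layout.

```python
from typing import Dict, NamedTuple


class Header(NamedTuple):
    dst_addr: str  # destination address
    src_addr: str  # source address
    dst_qp: int    # destination queue pair identifier


def cnp_matches_flow(cnp: Header, flow_msg: Header, qp_pairs: Dict[int, int]) -> bool:
    """Return True if the CNP corresponds to the flow the message belongs to.

    qp_pairs maps a queue id on the flow's sending end to the paired queue id
    on the flow's receiving end (an assumed representation of a queue pair).
    """
    return (
        cnp.dst_addr == flow_msg.src_addr      # first destination == second source
        and cnp.src_addr == flow_msg.dst_addr  # first source == second destination
        and qp_pairs.get(cnp.dst_qp) == flow_msg.dst_qp  # the two ids form one pair
    )
```

For example, a flow message from 10.0.0.1 to 10.0.0.2 with destination queue 22 matches a CNP from 10.0.0.2 to 10.0.0.1 with destination queue 11 when queues 11 and 22 are paired.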
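In a possible implementation, after a first queue enters the congested state, the network device sends M CNP messages to the second device according to the N CNP messages, where the first queue is the queue, among the multiple sending queues of the egress port, that contains the first data flow.
That is, the network device creates and sends auxiliary CNP messages when no CNP message from the first device is received only after the first queue has entered the congested state. This avoids rate reduction caused by unnecessary CNP messages while the first queue is not congested.
Multiple queues can be created on an egress port of the network device. Each data flow waiting to be sent is assigned to one of the queues, and the network device selects data flows from the queues in order of queue priority and sends them. In this application the first queue can be any queue on an egress port of the network device that contains the first data flow. When the first queue is not in the congested state and its depth is greater than a first threshold, the network device determines that the first queue enters the congested state; or, when the first queue is in the congested state and its depth is less than a second threshold, the network device determines that the first queue exits the congested state. The first threshold is greater than the second threshold.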
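Because the first threshold is greater than the second, the queue state behaves as a hysteresis: depths between the two thresholds leave the state unchanged. A minimal sketch follows, with the class name `QueueCongestionState` and the example depths chosen here purely for illustration.

```python
class QueueCongestionState:
    """Illustrative two-threshold (hysteresis) congestion state for one queue."""

    def __init__(self, enter_depth: int, exit_depth: int):
        if enter_depth <= exit_depth:
            raise ValueError("the first threshold must be greater than the second")
        self.enter_depth = enter_depth  # first threshold
        self.exit_depth = exit_depth    # second threshold
        self.congested = False

    def update(self, depth: int) -> bool:
        """Feed the current queue depth; return whether the queue is congested."""
        if not self.congested and depth > self.enter_depth:
            self.congested = True
        elif self.congested and depth < self.exit_depth:
            self.congested = False
        return self.congested
```

With thresholds of 100 and 40, a depth of 60 leaves the queue congested if it already was and uncongested otherwise, which keeps the state from flapping when the depth hovers near a single cut-off.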
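Step 302. Send M CNP messages to the second device according to the N CNP messages, where the M CNP messages correspond to the first data flow.
M is an integer greater than N. The network device can send the M CNP messages to the second device according to a set period. The period can be set with reference to the rate-increase period with which the sending-end device of the first data flow (for example, the RP) increases the sending rate of a data flow. For example, with a default rate-increase period of 300 µs, the period can be set to 80% to 95% of that rate-increase period, that is, 240 µs to 285 µs.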
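The period range in the example is plain percentage arithmetic on the rate-increase period. The helper below reproduces it with integer arithmetic; the function name `cnp_period_range_us` and its parameters are illustrative assumptions, and only the 300 µs and 80% to 95% figures come from the text.

```python
def cnp_period_range_us(rate_increase_period_us: int,
                        low_pct: int = 80,
                        high_pct: int = 95) -> tuple:
    """Return the (shortest, longest) CNP period in microseconds.

    Integer arithmetic is used so that the result is exact for whole-microsecond
    inputs; the percentages default to the 80% to 95% range from the example.
    """
    return (rate_increase_period_us * low_pct // 100,
            rate_increase_period_us * high_pct // 100)


# 300 µs rate-increase period -> 240 µs to 285 µs
print(cnp_period_range_us(300))
```

Keeping the period below the rate-increase period means a CNP always reaches the sender before its next rate-increase decision.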
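The following takes the current period as an example to describe how the network device performs flow rate control:
The network device monitors whether a CNP message from the first device is received in the current period. If none is received, it creates an auxiliary CNP message and sends it to the second device.
In this application, if the network device receives a CNP message from the first device, it forwards that CNP message to the second device immediately. If the network device receives no CNP message from the first device throughout the current period, it creates an auxiliary CNP message on its own initiative at the end of the current period and sends it to the second device. This ensures that the sending-end device of the first data flow (for example, the RP) receives a CNP message corresponding to the first data flow once per set period and reduces the sending rate of the first data flow accordingly, which prevents the first data flow from undergoing a rate increase while it is congested.
Not receiving a CNP message from the first device means that the CNP-passed flag in the flow table entry of the first data flow maintained by the network device has the first value. That is, the network device can send CNP messages to the second device according to a specific flag (for example, the CNP-passed flag) in a previously created flow table entry of the first data flow. The CNP-passed flag indicates whether the network device has received a CNP message from the first device in the current period. For example, the first value (for example, 0) indicates that no CNP message from the first device has been received in the current period, and the second value (for example, 1) indicates that one has been received. The network device reads the flow table entry of the first data flow. If the CNP-passed flag has the second value, the network device has received a CNP message from the first device and has already forwarded it to the second device. If the CNP-passed flag has the first value, the network device has received no CNP message from the first device, so it creates an auxiliary CNP message on its own initiative and sends it to the second device. This ensures that the sending-end device of the first data flow (for example, the RP) receives a CNP message corresponding to the first data flow once per set period and reduces the sending rate of the first data flow accordingly, which prevents the first data flow from undergoing a rate increase while it is congested.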
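The per-period flag logic can be sketched as a small event-driven relay. The code below is a simplified model under assumptions: `CnpRelay`, `FlowEntry`, and the `sent` log (which stands in for messages actually transmitted to the second device) are hypothetical, and the values 0 and 1 follow the example values given for the first and second value.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    cnp_passed: int = 0  # 0 = first value (none seen this period), 1 = second value


class CnpRelay:
    """Illustrative network device that forwards CNPs and fills empty periods."""

    def __init__(self):
        self.flows = {}  # flow key -> FlowEntry
        self.sent = []   # (flow key, kind) log standing in for transmitted CNPs

    def on_cnp_from_first_device(self, flow_key) -> None:
        entry = self.flows.setdefault(flow_key, FlowEntry())
        entry.cnp_passed = 1                       # mark: a CNP passed this period
        self.sent.append((flow_key, "forwarded"))  # forward immediately

    def on_period_end(self) -> None:
        for flow_key, entry in self.flows.items():
            if entry.cnp_passed == 0:
                # Nothing arrived this period: create and send an auxiliary CNP.
                self.sent.append((flow_key, "auxiliary"))
            entry.cnp_passed = 0                   # reset for the next period
```

One received CNP followed by two period boundaries yields one forwarded and one auxiliary message, so N = 1 and M = 2 in this run.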
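In a possible implementation, the network device can instead send one CNP message to the second device at the end of the current period regardless of whether a CNP message from the first device was received in that period. If the network device received a CNP message from the first device, it forwards that CNP message to the second device at the end of the current period; if it did not, it creates an auxiliary CNP message at the end of the current period and sends it to the second device. This likewise ensures that the sending-end device of the first data flow (for example, the RP) receives a CNP message corresponding to the first data flow once per set period and reduces the sending rate of the first data flow accordingly, which prevents the first data flow from undergoing a rate increase while it is congested.
In this application, the network device receives N CNP messages from the first device and then sends M CNP messages to the second device, where M is greater than N. This ensures that the sending-end device of the first data flow receives a CNP message corresponding to the first data flow once per set period and reduces the sending rate of the first data flow accordingly, which prevents the first data flow from undergoing a rate increase while it is congested.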
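In a possible implementation, the network device can create the flow table entry of the first data flow using either of the following two methods:
In the first method, the sending-end device and the receiving-end device of the first data flow send link establishment messages to each other. The link establishment message sent by the sending-end device includes the address and queue identifier of the sending-end device, and the one sent by the receiving-end device includes the address and queue identifier of the receiving-end device. A queue on the sending-end device and a queue on the receiving-end device form a queue pair, and a data flow and its corresponding CNP messages are sent and received on the same queue pair. For example, after a data flow in queue A of the sending-end device is sent to the receiving-end device, the CNP messages for that data flow belong to queue B of the receiving-end device, and queue A and queue B form a queue pair.
The network device creates the flow table entry of the first data flow according to the link establishment messages. The flow table entry includes the second destination address of the first data flow (for example, the IP address of the receiving-end device of the first data flow), the second source address (for example, the IP address of the sending-end device of the first data flow), the second destination queue pair identifier (for example, the queue identifier in the receiving-end device of the first data flow), the source queue pair identifier of the first data flow (for example, the queue identifier in the sending-end device of the first data flow), and the CNP-passed flag. The network device can also delete the flow table entry of the first data flow according to a link deletion message sent by the sending-end device or the receiving-end device of the first data flow.
The network device can monitor whether a CNP message from the first device is received in the current period. If one is received, it sets the CNP-passed flag in the flow table entry of the first data flow to the second value and resets it to the first value at the end of the current period. The network device divides the time axis into periods. Within the current period, if the network device receives a CNP message from the first device, it sets the CNP-passed flag in the flow table entry of the first data flow to the second value; when the current period ends, it resets the flag to the first value so that monitoring starts afresh in the next period.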
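In the second method, when no flow table entry of the first data flow exists and the network device receives a CNP packet from the first device for the first time, the network device creates the flow table entry of the first data flow according to the CNP packet and starts an entry timeout timer. In this method the network device creates the flow table entry of the first data flow based on the first CNP packet received from the first device, and uses the entry timeout timer to determine when to delete the flow table entry. The flow table entry of the first data flow includes the second destination address in the first data flow (that is, the first source address in the CNP packet, for example the IP address of the receiving device of the first data flow), the second source address (that is, the first destination address in the CNP packet, for example the IP address of the sending device of the first data flow), the source queue pair identifier of the first data flow (that is, the first destination queue pair identifier in the CNP packet, for example the queue identifier in the sending device of the first data flow), the CNP passing marker, and a timeout identifier of the flow table entry of the first data flow. The timeout identifier indicates whether the entry timeout timer has timed out.
Within the current period, the network device monitors whether a CNP packet from the first device is received. If a CNP packet from the first device is received, the network device sets the CNP passing marker in the flow table entry of the first data flow to the second value, and sets the CNP passing marker to the first value when the current period ends. The network device divides the time axis into periods. Within the current period, if the network device receives a CNP packet from the first device, it sets the CNP passing marker in the flow table entry of the first data flow to the second value; when the current period ends, it sets the CNP passing marker back to the first value so that a new monitoring operation can start in the next period. When the timeout identifier indicates that the entry timeout timer has timed out, the network device deletes the flow table entry of the first data flow if no CNP packet from the first device was received within the timing period of the entry timeout timer.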
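The second method (create the entry on the first CNP, delete it when the entry timer expires without a further CNP) can be sketched as below. This is a simplified illustration under stated assumptions: the class and field names are hypothetical, the timeout identifier is modelled as a comparison against the time of the last received CNP rather than as a stored flag, and the lookup key (CNP source address, CNP destination address, CNP destination queue pair) is chosen for the example only.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    dst_addr: str               # second destination address (= CNP source address)
    src_addr: str               # second source address (= CNP destination address)
    src_qp: int                 # source queue pair id of the flow (= CNP destination QP)
    cnp_passed: int = 0         # CNP passing marker
    last_cnp_time: float = 0.0  # when the entry timer was last restarted


class FlowTable:
    """Hypothetical flow table keyed by the fields carried in a CNP."""

    def __init__(self, entry_timeout):
        self.entry_timeout = entry_timeout
        self.entries = {}

    def on_cnp(self, cnp_src, cnp_dst, cnp_dst_qp, now):
        key = (cnp_src, cnp_dst, cnp_dst_qp)
        entry = self.entries.get(key)
        if entry is None:
            # First CNP for this flow: create the entry from the CNP fields.
            entry = Entry(dst_addr=cnp_src, src_addr=cnp_dst, src_qp=cnp_dst_qp)
            self.entries[key] = entry
        entry.cnp_passed = 1
        entry.last_cnp_time = now   # (re)start the entry timeout timer
        return entry

    def expire(self, now):
        # Delete entries whose timer ran out without a CNP being received.
        stale = [k for k, e in self.entries.items()
                 if now - e.last_cnp_time >= self.entry_timeout]
        for k in stale:
            del self.entries[k]
        return len(stale)


table = FlowTable(entry_timeout=1.0)
table.on_cnp("10.0.0.2", "10.0.0.1", 7, now=0.0)
```

A caller would invoke `expire` periodically; an entry survives as long as CNPs for its flow keep arriving within `entry_timeout` of each other.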
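FIG. 4 is a flowchart of Embodiment 2 of the flow rate control method of this application. As shown in FIG. 4, the method of this embodiment may be performed by a server in the scenario shown in FIG. 1; the server mainly assumes the function of the NP in FIG. 2. The flow rate control method may include:
Step 401: Receive a first data flow from a second device, where the first data flow includes N CE packets.
N is a natural number. Generally, the first device is the receiving end of the first data flow, for example the NP in FIG. 2, and the second device is the sending end of the first data flow, for example the RP in FIG. 2. The second device may also be another network device upstream of the server. This application does not specifically limit the first device and the second device.
In this application, a CE packet is a packet generated when a network device (for example, a core switch or an access switch in the scenario shown in FIG. 1) applies an ECN mark to a packet of the first data flow; it indicates that the first data flow is congested.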
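Step 402: Send M CNP packets to the second device according to the N CE packets, where the M CNP packets correspond to the first data flow.
M is an integer greater than N. The server may send the M CNP packets to the second device according to a set period. In this application the set period may be chosen with reference to the rate increase period with which the sending device of the first data flow (for example, the RP) increases the sending rate of a data flow. For example, if the rate increase determination period defaults to 300 µs, the period may be set to 80% to 95% of the rate increase determination period, that is, 240 µs to 285 µs.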
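The period range quoted above follows directly from the percentages. A minimal sketch, with a hypothetical helper name and integer percentages so that the arithmetic is exact:

```python
def cnp_period_range_us(rate_increase_period_us, low_pct=80, high_pct=95):
    """Return the (min, max) CNP period in microseconds.

    The bounds are the stated percentages of the sender's rate increase
    determination period; the helper name and defaults are illustrative.
    """
    low = rate_increase_period_us * low_pct / 100
    high = rate_increase_period_us * high_pct / 100
    return low, high


period_range = cnp_period_range_us(300)
print(period_range)  # (240.0, 285.0)
```

Keeping the period shorter than the rate increase determination period means a CNP reaches the sender before its next rate increase decision.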
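A CNP packet includes a first destination address, a first source address, and a first destination queue pair identifier. A packet in the first data flow includes a second destination address, a second source address, and a second destination queue pair identifier. That a CNP packet corresponds to the first data flow means that the first destination address is the same as the second source address (this address is usually the address of the sending device of the first data flow, for example the address of the RP) and that the first source address is the same as the second destination address (this address is usually the address of the receiving device of the first data flow, for example the address of the NP). In addition, the correspondence between a CNP packet and the first data flow also requires that the first destination queue pair identifier and the second destination queue pair identifier correspond to each other, that is, the first destination queue pair and the second destination queue pair form a queue pair.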
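The correspondence rule can be written as a small predicate. This is a sketch only: the `Header` tuple, the function name, and the representation of known queue pairs as a set of two-element frozensets are assumptions for illustration, not structures defined by the application.

```python
from typing import NamedTuple


class Header(NamedTuple):
    """Hypothetical view of the three fields used for matching."""
    dst_addr: str
    src_addr: str
    dst_qp: int


def cnp_matches_flow(cnp, flow_pkt, queue_pairs):
    """Return True if the CNP corresponds to the flow that flow_pkt belongs to.

    queue_pairs is a set of frozenset({qp_a, qp_b}) describing which queue
    identifiers form a pair.
    """
    return (
        cnp.dst_addr == flow_pkt.src_addr        # both name the flow's sender
        and cnp.src_addr == flow_pkt.dst_addr    # both name the flow's receiver
        and frozenset((cnp.dst_qp, flow_pkt.dst_qp)) in queue_pairs
    )


pairs = {frozenset((11, 22))}
flow = Header(dst_addr="10.0.0.2", src_addr="10.0.0.1", dst_qp=22)
cnp = Header(dst_addr="10.0.0.1", src_addr="10.0.0.2", dst_qp=11)
unrelated = Header(dst_addr="10.0.0.9", src_addr="10.0.0.2", dst_qp=11)
```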
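The following takes the current period as an example to describe how the server performs flow rate control:
Within the current period, the server monitors whether a CNP packet is sent. If no CNP packet is sent, the server creates an auxiliary CNP packet and sends the auxiliary CNP packet to the second device.
According to the congestion control process described in FIG. 2, the server may receive multiple CE packets in the first data flow from the network device. In that case the server may send one CNP packet to the second device for each CE packet, or may send only one CNP packet to the second device for multiple CE packets.
If the server sends no CNP packet throughout the current period, it creates an auxiliary CNP packet on its own initiative when the current period ends and sends that auxiliary CNP packet to the second device. This ensures that the sending device of the first data flow (for example, the RP) receives a CNP packet corresponding to the first data flow at intervals of the set period and reduces the sending rate of the first data flow based on that CNP packet, which avoids the problem that the rate of the first data flow is still increased while the flow is congested.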
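That no CNP packet is sent means that the value of the CNP sending marker in the flow table entry of the first data flow maintained by the server is the first value. In other words, in this application the server may send CNP packets to the second device according to a specific identifier (for example, the CNP sending marker) in a pre-created flow table entry of the first data flow. The CNP sending marker in the flow table entry of the first data flow indicates whether the server has sent a CNP packet within the current period. For example, a CNP sending marker equal to the first value (for example, 0) indicates that the server has not sent a CNP packet to the second device in the current period, and a CNP sending marker equal to the second value (for example, 1) indicates that it has. The server reads the flow table entry of the first data flow. If the CNP sending marker is the second value, the server has already sent a CNP packet to the second device. If the CNP sending marker is the first value, the server has not sent a CNP packet to the second device, so it creates an auxiliary CNP packet on its own initiative and sends it to the second device. This ensures that the sending device of the first data flow (for example, the RP) receives a CNP packet corresponding to the first data flow at intervals of the set period and reduces the sending rate of the first data flow based on that CNP packet, which avoids the problem that the rate of the first data flow is still increased while the flow is congested.
In a possible implementation, the flow rate control process may also be as follows: regardless of whether a CNP packet is sent within the current period, the server sends one CNP packet to the second device when the current period ends. If the server needs to send a CNP packet, it sends one CNP packet to the second device when the current period ends. If the server has not sent a CNP packet, it creates an auxiliary CNP packet when the current period ends and sends the auxiliary CNP packet to the second device. This likewise ensures that the sending device of the first data flow (for example, the RP) receives a CNP packet corresponding to the first data flow at intervals of the set period and reduces the sending rate of the first data flow based on that CNP packet, which avoids the problem that the rate of the first data flow is still increased while the flow is congested.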
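In a possible implementation, after the first data flow becomes congested, the server sends the M CNP packets to the second device according to the N CE packets.
That is, only after the first data flow becomes congested does the server actively create an auxiliary CNP packet and send it to the second device when no CNP packet has been sent. This avoids rate reduction caused by unnecessary CNP packets when the first data flow is not congested.
The server may determine that the first data flow enters the congested state when the first data flow is not in the congested state and the number of CE packets received in the first data flow is greater than a third threshold; or determine that the first data flow enters the congested state when the first data flow is not in the congested state and the number of CNP packets sent is greater than a fourth threshold; or determine that the first data flow exits the congested state when the first data flow is in the congested state and a non-CE packet in the first data flow is received; or determine that the first data flow exits the congested state when the first data flow is in the congested state and no packet of the corresponding data flow is received within a set time; or determine that the first data flow exits the congested state when the first data flow is in the congested state and no CNP packet is sent within a set time.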
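The entry and exit conditions above can be combined into one small state machine, sketched below. The text lists the conditions as alternatives, so applying all of them together is an assumption of this example, as are the class and method names, the single shared `idle_timeout`, the cumulative (never windowed) counters, and restarting both idle clocks on entering the congested state. Which CNP packets count as "sent" (for example, whether auxiliary CNPs are included) is not specified in the text and is left to the caller here.

```python
class FlowCongestionState:
    """Hypothetical per-flow congestion tracker on the server (NP) side."""

    def __init__(self, third_threshold, fourth_threshold, idle_timeout):
        self.third_threshold = third_threshold    # CE packets needed to enter
        self.fourth_threshold = fourth_threshold  # sent CNPs needed to enter
        self.idle_timeout = idle_timeout          # the "set time" for exiting
        self.congested = False
        self.ce_count = 0
        self.cnp_count = 0
        self.last_packet = 0.0
        self.last_cnp = 0.0

    def _enter(self, now):
        self.congested = True
        self.last_packet = self.last_cnp = now    # restart both idle clocks

    def _exit(self):
        self.congested = False
        self.ce_count = self.cnp_count = 0

    def on_packet(self, is_ce, now):
        if self.congested:
            self.last_packet = now
            if not is_ce:                 # a non-CE packet ends congestion
                self._exit()
        elif is_ce:
            self.ce_count += 1
            if self.ce_count > self.third_threshold:
                self._enter(now)

    def on_cnp_sent(self, now):
        if self.congested:
            self.last_cnp = now
        else:
            self.cnp_count += 1
            if self.cnp_count > self.fourth_threshold:
                self._enter(now)

    def on_tick(self, now):
        # Exit if the flow or the CNP stream has been silent for too long.
        if self.congested and (
            now - self.last_packet > self.idle_timeout
            or now - self.last_cnp > self.idle_timeout
        ):
            self._exit()
```

A server following the implementation above would create auxiliary CNP packets only while `congested` is true.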
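In this application, the server receives the first data flow from the second device, where the first data flow includes N CE packets, and then sends M CNP packets to the second device, where M is greater than N. This ensures that the second device receives a CNP packet corresponding to the first data flow at intervals of the set period and reduces the sending rate of the first data flow based on that CNP packet, which avoids the problem that the rate of the first data flow is still increased while the flow is congested.
In a possible implementation, the server may create the flow table entry of the first data flow by using either of the following two methods:
In the first method, the sending device and the receiving device of the first data flow send link establishment packets to each other. The link establishment packet sent by the sending device includes the address and queue identifier of the sending device, and the link establishment packet sent by the receiving device includes the address and queue identifier of the receiving device. The queue on the sending device and the queue on the receiving device form a queue pair, and a data flow and its corresponding CNP packets are sent and received on the same queue pair. For example, after a data flow in queue A of the sending device is sent to the receiving device, the CNP packet for that data flow belongs to queue B of the receiving device; queue A and queue B form a queue pair.
The server creates the flow table entry of the first data flow according to the link establishment packets. The flow table entry of the first data flow includes the second destination address in the first data flow (for example, the IP address of the receiving device of the first data flow), the second source address (for example, the IP address of the sending device of the first data flow), the second destination queue pair identifier (for example, the queue identifier in the receiving device of the first data flow), the source queue pair identifier of the first data flow (for example, the queue identifier in the sending device of the first data flow), and the CNP sending marker. The server may also delete the flow table entry of the first data flow according to a link deletion packet (sent by the sending device or the receiving device of the first data flow).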
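The server may monitor, within the current period, whether a CNP packet is sent. If a CNP packet is sent, the server sets the CNP sending marker in the flow table entry of the first data flow to the second value, and sets the CNP sending marker to the first value when the current period ends. The server divides the time axis into periods. Within the current period, if the server sends a CNP packet, it sets the CNP sending marker in the flow table entry of the first data flow to the second value; when the current period ends, it sets the CNP sending marker back to the first value so that a new monitoring operation can start in the next period.
In the second method, when no flow table entry of the first data flow exists and the server sends a CNP packet for the first time, the server creates the flow table entry of the first data flow according to the CNP packet and starts an entry timeout timer. In this method the server creates the flow table entry of the first data flow based on the first CNP packet it sends, and uses the entry timeout timer to determine when to delete the flow table entry. The flow table entry of the first data flow includes the second destination address in the first data flow (that is, the first source address in the CNP packet, for example the IP address of the receiving device of the first data flow), the second source address (that is, the first destination address in the CNP packet, for example the IP address of the sending device of the first data flow), the source queue pair identifier of the first data flow (that is, the first destination queue pair identifier in the CNP packet, for example the queue identifier in the sending device of the first data flow), the CNP sending marker, and a timeout identifier of the flow table entry of the first data flow. The timeout identifier indicates whether the entry timeout timer has timed out.
Within the current period, the server monitors whether a CNP packet is sent. If a CNP packet is sent, the server sets the CNP sending marker in the flow table entry of the first data flow to the second value, and sets the CNP sending marker to the first value when the current period ends. The server divides the time axis into periods. Within the current period, if the server sends a CNP packet, it sets the CNP sending marker in the flow table entry of the first data flow to the second value; when the current period ends, it sets the CNP sending marker back to the first value so that a new monitoring operation can start in the next period. When the timeout identifier indicates that the entry timeout timer has timed out, the server deletes the flow table entry of the first data flow if no CNP packet was sent within the timing period of the entry timeout timer.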
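FIG. 5 is a schematic structural diagram of an embodiment of the flow rate control apparatus of this application. As shown in FIG. 5, the apparatus of this embodiment is applied to the network device in method Embodiment 1, and may also be applied to the server in method Embodiment 2. The apparatus includes a receiving module 501, a sending module 502, and a processing module 503.
When the flow rate control apparatus is applied to the network device in method Embodiment 1, the receiving module 501 is configured to receive N explicit congestion notification packet (CNP) packets from a first device, where the N CNP packets correspond to a first data flow and N is a natural number; the sending module 502 is configured to send M CNP packets to a second device according to the N CNP packets, where the M CNP packets correspond to the first data flow and M is an integer greater than N.
In a possible implementation, the CNP packet is used to indicate that the first data flow is congested.
In a possible implementation, the CNP packet includes a first destination address, a first source address, and a first destination queue pair identifier, and a packet in the first data flow includes a second destination address, a second source address, and a second destination queue pair identifier, where the first destination address is the same as the second source address and the first source address is the same as the second destination address.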
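In a possible implementation, the sending module 502 is specifically configured to send the M CNP packets according to a set period.
In a possible implementation, the processing module 503 is configured to monitor, within the current period, whether the CNP packet from the first device is received; the sending module 502 is specifically configured to, if the CNP packet from the first device is not received, create an auxiliary CNP packet and send the auxiliary CNP packet to the second device.
In a possible implementation, the processing module 503 is further configured to start timing each time a CNP packet is received, beginning with the first of the N CNP packets; the sending module 502 is further configured to, when the timed duration exceeds a set threshold and the next CNP packet from the first device has not yet been received, create an auxiliary CNP packet and send the auxiliary CNP packet to the second device.
In a possible implementation, the processing module 503 is further configured to start timing each time an auxiliary CNP packet is sent to the second device; the sending module 502 is further configured to, when the timed duration exceeds the set threshold and the next CNP packet from the first device has not yet been received, create another auxiliary CNP packet and send the auxiliary CNP packet to the second device.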
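The two timer-based implementations above (restart the timer on every received CNP and on every auxiliary CNP sent) can be illustrated with a small offline simulation. This is a sketch under assumptions: the function name, the `(time, kind)` output format and the `end_time` cut-off are hypothetical, and a CNP that arrives exactly when the timer reaches the threshold is treated as having arrived in time.

```python
def relay_with_timer(arrivals, threshold, end_time):
    """Return (time, kind) for each CNP sent to the second device.

    arrivals  -- times at which CNPs from the first device are received
    threshold -- the set threshold for the timer
    end_time  -- when the simulation stops (illustrative only)
    """
    sent = []
    pending = sorted(arrivals)
    if not pending:
        return sent              # timing only starts with the first received CNP
    idx = 0
    deadline = None
    while True:
        nxt = pending[idx] if idx < len(pending) else None
        if nxt is not None and (deadline is None or nxt <= deadline):
            if nxt > end_time:
                break
            sent.append((nxt, "forwarded"))
            deadline = nxt + threshold   # restart the timer on each received CNP
            idx += 1
        else:
            if deadline > end_time:
                break
            sent.append((deadline, "auxiliary"))
            deadline += threshold        # restart the timer on each auxiliary CNP
    return sent


trace = relay_with_timer(arrivals=[0, 10], threshold=4, end_time=20)
```

With two received CNPs (N = 2) this trace contains six sent CNPs (M = 6): the two forwarded ones at times 0 and 10, and auxiliary ones at times 4, 8, 14 and 18.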
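In a possible implementation, that the CNP packet from the first device is not received means that the value of the CNP passing marker in the flow table entry of the first data flow is a first value.
In a possible implementation, the processing module 503 is further configured to monitor, within the current period, whether the CNP packet from the first device is received; and if the CNP packet from the first device is received, set the CNP passing marker in the flow table entry of the first data flow to a second value, and set the CNP passing marker to the first value when the current period ends.
In a possible implementation, the processing module 503 is further configured to create the flow table entry of the first data flow according to a link establishment packet, where the flow table entry of the first data flow includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP passing marker.
In a possible implementation, the processing module 503 is further configured to delete the flow table entry of the first data flow according to a link deletion packet.
In a possible implementation, the processing module 503 is further configured to, when no flow table entry of the first data flow exists and the CNP packet from the first device is received for the first time, create the flow table entry of the first data flow and start an entry timeout timer, where the flow table entry of the first data flow includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP passing marker, and a timeout identifier of the flow table entry of the first data flow, and the timeout identifier indicates whether the entry timeout timer has timed out.
In a possible implementation, the processing module 503 is further configured to, when the timeout identifier indicates that the entry timeout timer has timed out, delete the flow table entry of the first data flow if the CNP packet from the first device was not received within the timing period of the entry timeout timer.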
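In a possible implementation, the sending module 502 is specifically configured to send the M CNP packets to the second device according to the N CNP packets after a first queue enters a congested state, where the first queue is the queue, among the multiple sending queues of an egress port, that includes the first data flow.
In a possible implementation, the processing module 503 is configured to determine the current state of the first queue; when the first queue is not in the congested state and the depth of the first queue is greater than a first threshold, determine that the first queue enters the congested state; or, when the first queue is in the congested state and the depth of the first queue is less than a second threshold, determine that the first queue exits the congested state; where the first threshold is greater than the second threshold.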
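Because the first threshold is greater than the second, the queue state behaves as a hysteresis: depths between the two thresholds leave the state unchanged. A minimal sketch, with a hypothetical class name and made-up threshold values:

```python
class QueueCongestion:
    """Hypothetical tracker for the congested state of the first queue."""

    def __init__(self, first_threshold, second_threshold):
        if first_threshold <= second_threshold:
            raise ValueError("first threshold must exceed second threshold")
        self.first_threshold = first_threshold    # depth above this: enter
        self.second_threshold = second_threshold  # depth below this: exit
        self.congested = False

    def update(self, depth):
        if not self.congested and depth > self.first_threshold:
            self.congested = True
        elif self.congested and depth < self.second_threshold:
            self.congested = False
        return self.congested


queue = QueueCongestion(first_threshold=100, second_threshold=40)
states = [queue.update(depth) for depth in (50, 120, 80, 40, 39, 90)]
```

The gap between the thresholds keeps the state from flapping when the queue depth hovers around a single value.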
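The apparatus may be used to execute the technical solution of the method embodiment shown in FIG. 3. Its implementation principles and technical effects are similar and are not repeated here.
When the flow rate control apparatus is applied to the server in method Embodiment 2, the receiving module 501 is configured to receive a first data flow from a second device, where the first data flow includes N congestion encountered (CE) packets and N is a natural number; the sending module 502 is configured to send M explicit congestion notification packet (CNP) packets to the second device according to the N CE packets, where the M CNP packets correspond to the first data flow and M is an integer greater than N.
In a possible implementation, the CNP packet is used to indicate that the first data flow is congested.
In a possible implementation, the CNP packet includes a first destination address, a first source address, and a first destination queue pair identifier, and a packet in the first data flow includes a second destination address, a second source address, and a second destination queue pair identifier, where the first destination address is the same as the second source address and the first source address is the same as the second destination address.
In a possible implementation, the sending module 502 is specifically configured to send the M CNP packets according to a set period.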
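In a possible implementation, the sending module 502 is specifically configured to monitor, within the current period, whether the CNP packet is sent; and if the CNP packet is not sent, create an auxiliary CNP packet and send the auxiliary CNP packet to the second device.
In a possible implementation, that the CNP packet is not sent means that the value of the CNP sending marker in the flow table entry of the first data flow is a first value.
In a possible implementation, the processing module 503 is configured to monitor, within the current period, whether the CNP packet is sent; and if the CNP packet is sent, set the CNP sending marker in the flow table entry of the first data flow to a second value, and set the CNP sending marker to the first value when the current period ends.
In a possible implementation, the processing module 503 is further configured to create the flow table entry of the first data flow according to a link establishment packet, where the flow table entry of the first data flow includes the second destination address, the second source address, the second destination queue pair identifier, the source queue pair identifier of the first data flow, and the CNP sending marker.
In a possible implementation, the processing module 503 is further configured to delete the flow table entry of the first data flow according to a link deletion packet.
In a possible implementation, the processing module 503 is further configured to, when no flow table entry of the first data flow exists and the CNP packet is sent for the first time, create the flow table entry of the first data flow and start an entry timeout timer, where the flow table entry of the first data flow includes the second destination address, the second source address, the source queue pair identifier of the first data flow, the CNP sending marker, and a timeout identifier of the flow table entry of the first data flow, and the timeout identifier indicates whether the entry timeout timer has timed out.
In a possible implementation, the processing module 503 is further configured to, when the timeout identifier indicates that the entry timeout timer has timed out, delete the flow table entry of the first data flow if the CNP packet was not sent within the timing period of the entry timeout timer.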
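In a possible implementation, the sending module 502 is specifically configured to send the M CNP packets to the second device according to the N CE packets after the first data flow becomes congested.
In a possible implementation, the processing module 503 is configured to determine the current state of the first data flow; when the first data flow is not in the congested state and the number of CE packets received in the first data flow is greater than a third threshold, determine that the first data flow enters the congested state; or, when the first data flow is not in the congested state and the number of CNP packets sent is greater than a fourth threshold, determine that the first data flow enters the congested state; or, when the first data flow is in the congested state and a non-CE packet in the first data flow is received, determine that the first data flow exits the congested state; or, when the first data flow is in the congested state and no packet of the corresponding data flow is received within a set time, determine that the first data flow exits the congested state; or, when the first data flow is in the congested state and no CNP packet is sent within a set time, determine that the first data flow exits the congested state.
The apparatus may be used to execute the technical solution of the method embodiment shown in FIG. 4. Its implementation principles and technical effects are similar and are not repeated here.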
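FIG. 6 is a schematic structural diagram of a device 600 provided in this application. As shown in FIG. 6, the device 600 may be the network device in Embodiment 1 or the server in Embodiment 2. The device 600 includes a processor 601 and a transceiver 602.
Optionally, the device 600 further includes a memory 603. The processor 601, the transceiver 602, and the memory 603 may communicate with one another through an internal connection path to transfer control signals and/or data signals.
The memory 603 is configured to store a computer program. The processor 601 is configured to execute the computer program stored in the memory 603, so as to implement the functions of the flow rate control apparatus in the foregoing apparatus embodiment.
Optionally, the memory 603 may be integrated in the processor 601 or be independent of the processor 601.
Optionally, the device 600 may further include an antenna 604, configured to transmit the signal output by the transceiver 602. Alternatively, the transceiver 602 receives signals through the antenna.
Optionally, the device 600 may further include a power supply 605, configured to supply power to the various components or circuits in the network device.
In addition, to make the functions of the network device more complete, the device 600 may further include an input unit 606 or a display unit 607 (which may also be regarded as an output unit).
This application further provides a computer-readable storage medium storing a computer program. When the computer program is executed by a computer, the computer is caused to perform the steps and/or processing performed by the network device or the server in any of the foregoing method embodiments.
This application further provides a computer program product including computer program code. When the computer program code runs on a computer, the computer is caused to perform the steps and/or processing performed by the network device or the server in any of the foregoing method embodiments.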
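In an implementation process, the steps of the foregoing method embodiments may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed in the embodiments of this application may be directly performed by a hardware encoding processor, or performed by a combination of hardware and software modules in the encoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the foregoing methods in combination with its hardware.
The memory mentioned in the foregoing embodiments may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically EPROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.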
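A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.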

Claims (34)

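  1. A flow rate control method, comprising:
    receiving N explicit congestion notification packet (CNP) packets from a first device, wherein the N CNP packets correspond to a first data flow and N is a natural number; and
    sending M CNP packets to a second device according to the N CNP packets, wherein the M CNP packets correspond to the first data flow and M is an integer greater than N.
  2. The method according to claim 1, wherein the CNP packet is used to indicate that the first data flow is congested.
  3. The method according to claim 1 or 2, wherein the CNP packet comprises a first destination address, a first source address, and a first destination queue pair identifier, and a packet in the first data flow comprises a second destination address, a second source address, and a second destination queue pair identifier, wherein the first destination address is the same as the second source address and the first source address is the same as the second destination address.
  4. The method according to claim 3, wherein the sending M CNP packets to a second device according to the N CNP packets comprises:
    sending the M CNP packets according to a set period.
  5. The method according to claim 4, wherein the sending M CNP packets to a second device according to the N CNP packets comprises:
    monitoring, within a current period, whether the CNP packet from the first device is received; and
    if the CNP packet from the first device is not received, creating an auxiliary CNP packet and sending the auxiliary CNP packet to the second device.
  6. The method according to claim 3, wherein the sending M CNP packets to a second device according to the N CNP packets comprises:
    starting from receipt of the first CNP packet of the N CNP packets, starting timing each time a CNP packet is received; and when the timed duration exceeds a set threshold, if the next CNP packet from the first device has not yet been received, creating an auxiliary CNP packet and sending the auxiliary CNP packet to the second device.
  7. The method according to claim 6, further comprising:
    starting timing each time an auxiliary CNP packet is sent to the second device; and when the timed duration exceeds the set threshold, if the next CNP packet from the first device has not yet been received, creating another auxiliary CNP packet and sending the auxiliary CNP packet to the second device.
  8. The method according to any one of claims 5 to 7, wherein that the CNP packet from the first device is not received means that the value of a CNP passing marker in a flow table entry of the first data flow is a first value.
  9. The method according to claim 8, further comprising:
    monitoring, within the current period, whether the CNP packet from the first device is received; and
    if the CNP packet from the first device is received, setting the CNP passing marker in the flow table entry of the first data flow to a second value, and setting the CNP passing marker to the first value when the current period ends.
  10. The method according to any one of claims 1 to 9, wherein the sending M CNP packets to a second device according to the N CNP packets comprises:
    after a first queue enters a congested state, sending the M CNP packets to the second device according to the N CNP packets, wherein the first queue is the queue, among multiple sending queues of an egress port, that includes the first data flow.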
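  11. A flow rate control method, comprising:
    receiving a first data flow from a second device, wherein the first data flow comprises N congestion encountered (CE) packets and N is a natural number; and
    sending M explicit congestion notification packet (CNP) packets to the second device according to the N CE packets, wherein the M CNP packets correspond to the first data flow and M is an integer greater than N.
  12. The method according to claim 11, wherein the CNP packet is used to indicate that the first data flow is congested.
  13. The method according to claim 11 or 12, wherein the CNP packet comprises a first destination address, a first source address, and a first destination queue pair identifier, and a packet in the first data flow comprises a second destination address, a second source address, and a second destination queue pair identifier, wherein the first destination address is the same as the second source address and the first source address is the same as the second destination address.
  14. The method according to claim 13, wherein the sending M CNP packets to the second device according to the N CE packets comprises:
    sending the M CNP packets according to a set period.
  15. The method according to claim 14, wherein the sending M CNP packets to the second device according to the N CE packets comprises:
    monitoring, within a current period, whether the CNP packet is sent; and
    if the CNP packet is not sent, creating an auxiliary CNP packet and sending the auxiliary CNP packet to the second device.
  16. The method according to claim 15, wherein that the CNP packet is not sent means that the value of a CNP sending marker in a flow table entry of the first data flow is a first value.
  17. The method according to claim 16, further comprising:
    monitoring, within the current period, whether the CNP packet is sent; and
    if the CNP packet is sent, setting the CNP sending marker in the flow table entry of the first data flow to a second value, and setting the CNP sending marker to the first value when the current period ends.
  18. The method according to any one of claims 11 to 17, wherein the sending M CNP packets to the second device according to the N CE packets comprises:
    after the first data flow becomes congested, sending the M CNP packets to the second device according to the N CE packets.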
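  19. A flow rate control apparatus, comprising:
    a receiving module, configured to receive N explicit congestion notification packet (CNP) packets from a first device, wherein the N CNP packets correspond to a first data flow and N is a natural number; and
    a sending module, configured to send M CNP packets to a second device according to the N CNP packets, wherein the M CNP packets correspond to the first data flow and M is an integer greater than N.
  20. The apparatus according to claim 19, further comprising:
    a processing module, configured to monitor, within a current period, whether the CNP packet from the first device is received;
    wherein the sending module is specifically configured to, if the CNP packet from the first device is not received, create an auxiliary CNP packet and send the auxiliary CNP packet to the second device.
  21. The apparatus according to claim 20, wherein the processing module is further configured to, starting from receipt of the first CNP packet of the N CNP packets, start timing each time a CNP packet is received; and
    the sending module is further configured to, when the timed duration exceeds a set threshold, if the next CNP packet from the first device has not yet been received, create an auxiliary CNP packet and send the auxiliary CNP packet to the second device.
  22. The apparatus according to claim 21, wherein the processing module is further configured to start timing each time an auxiliary CNP packet is sent to the second device; and
    the sending module is further configured to, when the timed duration exceeds the set threshold, if the next CNP packet from the first device has not yet been received, create another auxiliary CNP packet and send the auxiliary CNP packet to the second device.
  23. The apparatus according to any one of claims 20 to 22, wherein that the CNP packet from the first device is not received means that the value of a CNP passing marker in a flow table entry of the first data flow is a first value.
  24. The apparatus according to claim 20, wherein the processing module is further configured to monitor, within the current period, whether the CNP packet from the first device is received; and if the CNP packet from the first device is received, set the CNP passing marker in the flow table entry of the first data flow to a second value, and set the CNP passing marker to the first value when the current period ends.
  25. The apparatus according to claim 19, wherein the sending module is specifically configured to, after a first queue enters a congested state, send the M CNP packets to the second device according to the N CNP packets, wherein the first queue is the queue, among multiple sending queues of an egress port, that includes the first data flow.
  26. A flow rate control apparatus, comprising:
    a receiving module, configured to receive a first data flow from a second device, wherein the first data flow comprises N congestion encountered (CE) packets and N is a natural number; and
    a sending module, configured to send M explicit congestion notification packet (CNP) packets to the second device according to the N CE packets, wherein the M CNP packets correspond to the first data flow and M is an integer greater than N.
  27. The apparatus according to claim 26, wherein the sending module is specifically configured to monitor, within a current period, whether the CNP packet is sent; and if the CNP packet is not sent, create an auxiliary CNP packet and send the auxiliary CNP packet to the second device.
  28. The apparatus according to claim 27, wherein that the CNP packet is not sent means that the value of a CNP sending marker in a flow table entry of the first data flow is a first value.
  29. The apparatus according to claim 28, further comprising:
    a processing module, configured to monitor, within the current period, whether the CNP packet is sent; and if the CNP packet is sent, set the CNP sending marker in the flow table entry of the first data flow to a second value, and set the CNP sending marker to the first value when the current period ends.
  30. The apparatus according to claim 26, wherein the sending module is specifically configured to, after the first data flow becomes congested, send the M CNP packets to the second device according to the N CE packets.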
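  31. A network device, comprising:
    one or more processors; and
    a memory, configured to store one or more programs;
    wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1 to 10.
  32. A server, comprising:
    one or more processors; and
    a memory, configured to store one or more programs;
    wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 11 to 18.
  33. A computer-readable storage medium, comprising a computer program, wherein when the computer program is executed on a computer, the computer is caused to perform the method according to any one of claims 1 to 18.
  34. A computer program, wherein when the computer program is executed by a computer, it is used to perform the method according to any one of claims 1 to 18.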
PCT/CN2020/102158 2019-07-18 2020-07-15 Flow rate control method and apparatus WO2021008562A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20840799.9A EP3993330A4 (en) 2019-07-18 2020-07-15 METHOD AND DEVICE FOR FLOW RATE CONTROL
US17/573,909 US20220141137A1 (en) 2019-07-18 2022-01-12 Flow rate control method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910649933.6 2019-07-18
CN201910649933.6A CN112242956B (zh) 2019-07-18 2019-07-18 Flow rate control method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/573,909 Continuation US20220141137A1 (en) 2019-07-18 2022-01-12 Flow rate control method and apparatus

Publications (1)

Publication Number Publication Date
WO2021008562A1 true WO2021008562A1 (zh) 2021-01-21

Family

ID=74168006

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/102158 WO2021008562A1 (zh) 2019-07-18 2020-07-15 Flow rate control method and apparatus

Country Status (4)

Country Link
US (1) US20220141137A1 (zh)
EP (1) EP3993330A4 (zh)
CN (2) CN118301078A (zh)
WO (1) WO2021008562A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200236052A1 (en) * 2020-03-04 2020-07-23 Arvind Srinivasan Improving end-to-end congestion reaction using adaptive routing and congestion-hint based throttling for ip-routed datacenter networks
CN113411263B (zh) * 2021-06-18 2023-03-14 中国工商银行股份有限公司 一种数据传输方法、装置、设备及存储介质
CN116055416B (zh) * 2023-03-28 2023-05-30 新华三工业互联网有限公司 应用于长距通信网络场景下传输速率的调整方法及装置
CN116545933B (zh) * 2023-07-06 2023-10-20 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) 网络拥塞控制方法、装置、设备及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104753810A (zh) * 2013-12-30 2015-07-01 腾讯数码(天津)有限公司 一种网络入流量限速方法及装置
US20170093699A1 (en) * 2015-09-29 2017-03-30 Mellanox Technologies Ltd. Hardware-based congestion control for TCP traffic
CN107493238A (zh) * 2016-06-13 2017-12-19 华为技术有限公司 一种网络拥塞控制方法、设备及系统
CN108512774A (zh) * 2018-04-18 2018-09-07 清华大学 无丢失网络中的拥塞控制方法
WO2019031739A1 (ko) * 2017-08-11 2019-02-14 삼성전자 주식회사 이동 통신 시스템 망에서 혼잡 제어를 효율적으로 수행하는 방법 및 장치
CN109981471A (zh) * 2017-12-27 2019-07-05 华为技术有限公司 一种缓解拥塞的方法、设备和系统

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080298248A1 (en) * 2007-05-28 2008-12-04 Guenter Roeck Method and Apparatus For Computer Network Bandwidth Control and Congestion Management
US9577957B2 (en) * 2015-02-03 2017-02-21 Oracle International Corporation Facilitating congestion control in a network switch fabric based on group traffic rates
CN109391560B (zh) * 2017-08-11 2021-10-22 华为技术有限公司 网络拥塞的通告方法、代理节点及计算机设备
KR101992750B1 (ko) * 2017-12-18 2019-06-25 울산과학기술원 라우터 장치 및 그의 혼잡 제어 방법
CN108418767B (zh) * 2018-02-09 2021-12-21 华为技术有限公司 数据传输方法、设备及计算机存储介质
US10944660B2 (en) * 2019-02-08 2021-03-09 Intel Corporation Managing congestion in a network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GAO, YIXIAO ET AL.: "DCQCN+: Taming Large-scale Incast Congestion in RDMA over Ethernet Networks", IEEE 26TH INTERNATIONAL CONFERENCE ON NETWORK PROTOCOLS, 31 December 2018 (2018-12-31), XP033442014, DOI: 20200918171424A *
See also references of EP3993330A4

Also Published As

Publication number Publication date
US20220141137A1 (en) 2022-05-05
CN112242956A (zh) 2021-01-19
CN118301078A (zh) 2024-07-05
CN112242956B (zh) 2024-04-26
EP3993330A1 (en) 2022-05-04
EP3993330A4 (en) 2022-08-10


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20840799

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020840799

Country of ref document: EP

Effective date: 20220126