WO2021026740A1 - Traffic balancing method, network device and electronic device - Google Patents

Traffic balancing method, network device and electronic device

Info

Publication number
WO2021026740A1
WO2021026740A1 (application PCT/CN2019/100270)
Authority
WO
WIPO (PCT)
Prior art keywords
packet
flow
data
stream
data packets
Prior art date
Application number
PCT/CN2019/100270
Other languages
English (en)
French (fr)
Inventor
林云
塔尔亚利克斯
唐德智
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to CN201980099031.8A (published as CN114208131A)
Priority to PCT/CN2019/100270 (published as WO2021026740A1)
Priority to EP19941408.7A (published as EP4009604A4)
Publication of WO2021026740A1
Priority to US17/670,216 (published as US20220166721A1)

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 47/125: Avoiding congestion; recovering from congestion by balancing the load, e.g. traffic engineering
    • H04L 47/2441: Traffic characterised by specific attributes, e.g. priority or QoS, relying on flow classification, e.g. using integrated services [IntServ]
    • H04L 45/74: Address processing for routing
    • H04L 47/115: Identifying congestion using a dedicated packet
    • H04L 47/41: Flow control; congestion control by acting on aggregated flows or links

Definitions

  • This application relates to the field of communication technology, and in particular to a flow balancing method, network equipment and electronic equipment.
  • a data center network (DCN)
  • different networking modes can be adopted to provide a network for many servers in the DCN.
  • FIG. 1 shows a common three-layer networking mode among multiple networking modes.
  • the DCN is divided into three layers, and the downlink port of the top of rack (TOR) node of the access layer is connected to the server.
  • the uplink port of the TOR node is connected to the downlink port of the aggregation node of the aggregation layer.
  • the uplink port of the aggregation node is connected to the downlink port of a spine node (also referred to as a backbone node) of the core layer.
  • the flow balancing method, network equipment, and electronic equipment provided by the embodiments of the present application can dynamically control path switching and achieve more flexible flow balancing.
  • the network device may refer to a stand-alone device or a component integrated in the device, such as a chip system in the device.
  • the network device may be a device with the function of a sending node, for example, it may be a sending node or a chip system in the sending node.
  • the network device includes the following functional modules: a first module for acquiring data packets to be sent; a second module for creating or maintaining flow packets and dividing the data packets to be sent into corresponding flow packets according to the destination node; and a third module for sending the data packets to be sent based on the flow packets to which they belong.
  • one flow packet includes at least one data packet, the destination nodes of the data packets belonging to the same flow packet are the same, and the sending paths of the data packets belonging to the same flow packet are the same.
  • a flow packet is a data combination form proposed in the embodiments of this application.
  • a flow packet refers to a collection of a series of data packets. Different flow packets can be distinguished by means such as flow packet labels.
  • the flow balancing method provided in the embodiments of this application flexibly creates or maintains flow packets, and divides the data packets to be sent into corresponding flow packets according to the destination node.
  • the data packets in the same flow packet can be sent through the same path, and the data packets in different flow packets are sent through different paths. That is, according to the technical solution of the embodiments of the present application, dynamically controlled path switching can be implemented according to the flow packet to which a data packet belongs.
  • the granularity of path switching is related to the way of dividing stream packets.
  • if each flow packet contains fewer data packets, the path switching granularity is finer; that is, a path switch can be triggered after fewer data packets are sent.
  • if each flow packet contains more data packets, the path switching granularity is coarser; that is, more data packets are sent before a path switch is triggered.
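The destination-based division into flow packets, and its granularity knob, can be sketched as follows. This is a hypothetical Python illustration; the names `FlowPacketSender`, `classify` and `packets_per_flowpac` are not from the application.

```python
from collections import defaultdict

class FlowPacketSender:
    def __init__(self, packets_per_flowpac=4):
        # Granularity knob: fewer packets per flow packet means finer path switching.
        self.packets_per_flowpac = packets_per_flowpac
        self.counts = defaultdict(int)   # destination -> packets in current flow packet
        self.labels = defaultdict(int)   # destination -> current flow-packet label

    def classify(self, destination):
        """Return the flow-packet label for the next packet to `destination`."""
        if self.counts[destination] >= self.packets_per_flowpac:
            self.labels[destination] += 1   # new flow packet: the path may switch here
            self.counts[destination] = 0
        self.counts[destination] += 1
        return self.labels[destination]

sender = FlowPacketSender(packets_per_flowpac=2)
print([sender.classify("nodeA") for _ in range(5)])  # [0, 0, 1, 1, 2]
```

With two packets per flow packet, every third packet to the same destination starts a new flow packet and therefore a possible path switch.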
  • the second module is also used to: create or maintain the first flow packet; classify the data packets destined for the first node into the first flow packet; create or maintain the second flow packet; and classify the data packets destined for the first node that are sent later into the second flow packet.
  • the transmission path of the data packet in the second flow packet is different from the transmission path of the data packet in the first flow packet.
  • the second module is also used to: determine whether the network balance parameters meet the preset conditions, and if so, create or maintain a second flow packet; the network balance parameters are used to divide data packets into corresponding flow packets based on the network balance principle.
  • the network balance parameters meeting the preset conditions includes meeting any one or more of the following conditions:
  • the data amount of the first flow packet reaches the preset data amount; that is, if the data amount of the data packets in a created flow packet reaches the preset data amount, the sender creates a new flow packet.
  • each flow packet includes data packets of the same amount of data, that is, each flow packet includes data packets of a preset data amount. In this way, the data amount of the data packets in each flow packet is made the same, so as to balance the bandwidth resources occupied by the data packets in different flow packets.
  • the duration of the first flow packet reaches the preset duration; the duration of a created flow packet refers to the duration of the data packets in it, specifically, the period of time starting from the moment the first data packet in the created flow packet is sent. Once the preset duration has elapsed after the transmission time of the first data packet in a created flow packet, the sender creates a new flow packet and includes the data packets sent after the preset duration in the newly created flow packet.
  • in this way, the data packets destined for the same destination node in each flow packet occupy roughly the same time resources; this reduces the probability that the data packets in one or a few flow packets occupy much more time resources than those in other flow packets, balancing the time resources occupied by data packets in different flow packets.
  • the sending frequency of the first flow packet reaches the preset frequency.
  • the sending frequency of data packets can reflect the speed of sending data packets.
  • the sending frequency of data packets is affected by the time interval between data packets.
  • the transmission frequency of data packets is related to data transmission requirements and/or other factors. For example, when sending data in burst mode, the sending frequency of data packets is not fixed, and the sending frequency can be faster in a certain period of time, and the sending frequency can be slower in other periods.
  • in other cases, the sending frequency of the data packets is relatively fixed, and the data packets can be sent at a uniform speed.
  • if the sending frequency of a created flow packet reaches the preset frequency, a new flow packet can be created and the subsequently sent flowlets are included in the newly created flow packet. This balances the load between the paths corresponding to the flow packets and reduces the probability that the path corresponding to one or more flow packets is overloaded because the flowlets in those flow packets are sent relatively frequently.
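The three trigger conditions above can be collapsed into a single check. The sketch below is illustrative Python: the field names (`bytes_sent`, `first_send_time`, `packets_sent`) and the threshold values are assumptions, not values taken from the application.

```python
def should_start_new_flowpac(flowpac, now,
                             max_bytes=64_000, max_duration=0.001, max_rate=1_000_000):
    """Return True when any of the three preset conditions is met.

    `flowpac` is assumed to track bytes_sent, first_send_time and packets_sent;
    all threshold values here are illustrative defaults.
    """
    duration = now - flowpac["first_send_time"]
    rate = flowpac["packets_sent"] / duration if duration > 0 else 0.0
    return (flowpac["bytes_sent"] >= max_bytes       # data amount reached
            or duration >= max_duration              # duration reached
            or rate >= max_rate)                     # sending frequency reached
```

When the check returns True, the sender would create a new flow packet and classify subsequently sent packets into it.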
  • at least one data packet in the first flow packet carries a first delimitation identifier, and at least one data packet in the second flow packet carries a second delimitation identifier; the first delimitation identifier and the second delimitation identifier are used to distinguish the first flow packet from the second flow packet.
  • the third module is also used to send delimited packets through the first path.
  • the delimited packet is a control packet used to distinguish the first flow packet from the second flow packet.
  • the first path is a path for sending the data packet in the first flow packet.
  • the data packets in a flow packet come from the same or different data flows.
  • the second module is also used to set a transmission path for the data packet in the flow packet based on the principle of network balance.
  • the embodiments of the present application provide a network device.
  • the network device may refer to an independent device or a component integrated in the device, for example, it may refer to a chip system in the device.
  • the network device may be a device with the function of a receiving node, for example, it may be a receiving node, or a chip system in the receiving node.
  • the network device includes the following functional modules: a fourth module for determining, when the second flow packet is received, whether the tail packet of the first flow packet has been received; a fifth module for reordering the data packets in the first flow packet and the second flow packet through the re-ordering channel (RC) if the tail packet of the first flow packet has not been received; and a sixth module for releasing the RC if it is determined, after the reordering, that the tail packet of the first flow packet has been received.
  • the receiver performs reordering through the RC.
  • the receiver releases the RC after receiving the tail packet of the first flow packet.
  • receiving the tail packet of the first flow packet indicates that the first flow packet has been completely received; there is then no longer an out-of-order problem between the first flow packet and the second flow packet.
  • the RC can then be released and used in other reordering processes, thereby improving RC utilization. Since an idle RC can also be used for other reordering processes, this is equivalent to expanding the number of available RCs.
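The receive-side RC lifecycle described above can be sketched as follows. This is a simplified, hypothetical Python model: RCs are reduced to integer ids, and all class and method names are illustrative rather than taken from the application.

```python
class Receiver:
    def __init__(self, num_rcs=2):
        self.free_rcs = set(range(num_rcs))  # pool of idle reordering channels
        self.active = {}                     # flow-packet label -> RC id in use

    def on_second_flowpac(self, first_label, tail_received):
        """Called when packets of a second flow packet start arriving."""
        if tail_received:
            return None                      # first flow packet complete: no reordering needed
        rc = self.free_rcs.pop()             # occupy an RC to reorder the two flow packets
        self.active[first_label] = rc
        return rc

    def on_tail_packet(self, first_label):
        """Tail packet of the first flow packet arrives: release its RC."""
        rc = self.active.pop(first_label, None)
        if rc is not None:
            self.free_rcs.add(rc)            # released RC can serve other reordering processes
        return rc
```

Because the RC returns to the pool as soon as the tail packet confirms the first flow packet is complete, a small pool can serve many flow-packet pairs over time.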
  • an embodiment of the present application provides an electronic device, including a processor and a storage device; the storage device is used to store instructions, and the processor is used to perform the following actions based on the instructions: create or maintain a first flow packet; classify the data packets destined for the first node into the first flow packet; create or maintain a second flow packet when the network balance parameters meet the preset conditions; and classify the subsequently sent data packets destined for the first node into the second flow packet. The destination node of the data packets belonging to the same flow packet is the same, and the transmission path of the data packets belonging to the same flow packet is the same; the transmission path of the data packets in the second flow packet is different from that of the data packets in the first flow packet.
  • an embodiment of the present application provides an electronic device, including a processor and a storage device; the storage device is used to store instructions, and the processor is used to perform the following actions based on the instructions: determine, when the second flow packet is received, whether the tail packet of the first flow packet has been received; if the tail packet of the first flow packet has not been received, reorder the data packets in the first flow packet and the second flow packet through the re-ordering channel (RC); and, after the reordering, release the RC if it is determined that the tail packet of the first flow packet has been received.
  • an embodiment of the present application provides a traffic balancing method, which may be executed by the device of the first aspect or the third aspect.
  • the method includes the following steps: obtaining the data packets to be sent; creating or maintaining flow packets and dividing the data packets to be sent into corresponding flow packets according to the destination node; and sending the data packets to be sent based on the flow packets to which they belong.
  • one flow packet includes at least one data packet, the destination nodes of the data packets belonging to the same flow packet are the same, and the sending paths of the data packets belonging to the same flow packet are the same.
  • the method of creating or maintaining flow packets and dividing data packets among different flow packets includes the following steps: creating or maintaining the first flow packet; classifying the data packets destined for the first node into the first flow packet; creating or maintaining the second flow packet; and classifying the data packets destined for the first node that are sent later into the second flow packet.
  • the transmission path of the data packet in the second flow packet is different from the transmission path of the data packet in the first flow packet.
  • the creation or maintenance of the second flow packet can be specifically implemented as the following steps: determine whether the network balance parameters meet the preset conditions; if so, create or maintain the second flow packet. The network balance parameters are used to divide the data packets into corresponding flow packets based on the network balance principle.
  • the network balance parameters meeting the preset conditions includes meeting any one or more of the following conditions:
  • the data volume of the first flow packet reaches the preset data volume; the duration of the first flow packet reaches the preset duration; or the sending frequency of the first flow packet reaches the preset frequency.
  • the flow packet to which a data packet belongs can be distinguished by the label of the data packet. At least one data packet in the first flow packet carries a first delimitation identifier, and at least one data packet in the second flow packet carries a second delimitation identifier; the first and second delimitation identifiers are used to distinguish the first flow packet from the second flow packet.
  • the data packets belonging to different flow packets can be distinguished by inserting control packets between the data packets, and the delimited packet is sent through the first path.
  • the delimited packet is used to distinguish the first flow packet from the second flow packet.
  • the first path is the path for sending the data packet in the first flow packet.
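Inserting a delimited (control) packet between consecutive flow packets in the send stream might look like the sketch below. The `("DELIM", i)` tuple is a hypothetical stand-in for a real control packet; the function name is illustrative.

```python
def interleave_delimiters(flowpacs):
    """Build the send stream with a control ("delimited") packet inserted
    between consecutive flow packets. Because the control packet is sent on the
    same path as the preceding flow packet, the receiver sees it only after all
    packets of that flow packet have arrived."""
    stream = []
    for i, packets in enumerate(flowpacs):
        stream.extend(packets)
        if i < len(flowpacs) - 1:
            stream.append(("DELIM", i))   # illustrative control packet
    return stream
```

On the receive side, observing the control packet then signals that the first flow packet is complete, which is exactly the boundary information the reordering logic needs.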
  • the data packets in a flow packet come from the same or different data flows.
  • the above method further includes: setting a transmission path for the data packet in the flow packet based on the principle of network balance.
  • the embodiments of the present application provide a traffic balancing method, which can be executed by the device of the second or fourth aspect (that is, a device with a receiving-node function), and the method includes the following steps: determining, when the second flow packet is received, whether the tail packet of the first flow packet has been received; if not, reordering the data packets in the first flow packet and the second flow packet through the re-ordering channel (RC); and, after the reordering, releasing the RC if it is determined that the tail packet of the first flow packet has been received.
  • the present application provides a flow balancing device, which has the function of realizing the flow balancing method of any one of the foregoing aspects.
  • This function can be realized by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above-mentioned functions.
  • a traffic balancing device, including a processor and a memory; the memory is used to store computer-executable instructions, and when the traffic balancing device is running, the processor executes the computer-executable instructions stored in the memory so that the traffic balancing device executes the traffic balancing method of any one of the above aspects.
  • a traffic balancing device, including a processor; the processor is configured to be coupled with a memory and, after reading the instructions in the memory, execute the traffic balancing method of any one of the foregoing aspects according to the instructions.
  • a computer-readable storage medium stores instructions that, when run on a computer, enable the computer to execute the traffic balancing method of any one of the foregoing aspects.
  • a computer program product containing instructions which, when run on a computer, enable the computer to execute the traffic balancing method of any one of the above-mentioned aspects.
  • in a twelfth aspect, a circuit system is provided, including a processing circuit configured to execute the traffic balancing method according to any one of the fifth or sixth aspect.
  • in a thirteenth aspect, a chip is provided, including a processor coupled with a memory; the memory stores program instructions, and when the program instructions stored in the memory are executed by the processor, the traffic balancing method of any one of the fifth or sixth aspect is implemented.
  • Figure 1 is a schematic diagram of a system architecture provided by an embodiment of the application.
  • Figure 2 is a schematic diagram of flowlets provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of a principle of load balancing according to an embodiment of this application.
  • FIG. 4 is a schematic diagram of another principle of load balancing according to an embodiment of the application.
  • Figure 5 is a flow chart of a method for traffic balancing provided by an embodiment of the application.
  • FIG. 6 is a flowchart of a method for traffic balancing provided by an embodiment of the application.
  • FIG. 11 is a schematic diagram of the principle of data packet sorting provided by an embodiment of the application.
  • FIG. 12 to FIG. 15 are schematic diagrams of the structure of a traffic balancing device provided by embodiments of the application.
  • the terms "first" and "second" in the description of the application and the drawings are used to distinguish different objects, or to distinguish different processing of the same object, rather than to describe a specific order of objects.
  • "at least one" means one or more, and "multiple" means two or more.
  • the flow balancing method provided in the embodiments of the present application is applied to a system that needs to perform flow balancing. Exemplarily, it is used in DCN where traffic balance is required.
  • the DCN involved in the embodiments of the present application may have different networking modes or may have different levels.
  • the nodes of the access layer may be referred to as access nodes.
  • the specific implementation and name of the access node may be different.
  • in a DCN adopting a leaf-spine networking mode, the access node may be called a leaf node.
  • the leaf node at the top of the rack can also be called a TOR node.
  • the following description mainly takes the access node as a TOR node as an example, and the access node may also be a node of other forms, which will not be listed in the embodiment of the present application.
  • a server is connected to the downstream port of the TOR node, and the TOR node can send and receive data through the server.
  • the nodes of the aggregation layer can be referred to as aggregation nodes.
  • the downlink port of the aggregation node is connected to the uplink port of the TOR node.
  • the nodes of the core layer may be called core nodes.
  • the core nodes may be called spine nodes.
  • the following description mainly takes the core layer node as the spine node as an example, and the description is unified here.
  • the core node may also be a node other than the spine node, which is not listed one by one in the embodiment of this application.
  • the port of the spine node can be connected to the uplink port of the aggregation node.
  • the aforementioned access node, sink node, and core node may all be referred to as switching nodes.
  • Details are not repeated here.
  • one or more TOR nodes and one or more aggregation nodes can form a group (point of delivery, Pod).
  • the spine node can also be divided into different spine planes (plane). Different aggregation nodes of the same Pod can be connected to different spine planes.
  • the spine node in Figure 1 is divided into three planes. The spine node in each plane is directly connected to different aggregation nodes in each Pod.
  • the aggregation node and the TOR node in the same Pod may be in a fully connected relationship. That is, each TOR node in a certain PoD is connected to all aggregation nodes in the PoD, and each aggregation node in the PoD is connected to all TOR nodes in the PoD.
  • the connection relationship between the TOR node and the aggregation node in the same PoD may also be in other forms, which is not limited in the embodiment of the present application.
  • the aggregation node is used to complete the traffic exchange between cross-TOR nodes in the same Pod. For example, the traffic exchange from source d0 to destination d1 in Figure 1.
  • the spine node is used to complete the exchange of traffic between pods. For example, the traffic exchange from source S0 to destination s4 in Figure 1.
  • a data flow refers to an ordered sequence of bytes with a start and an end.
  • data packets belonging to the same data flow usually have the same attributes, such as the same source internet protocol (IP) address, destination IP address, destination port, etc.
  • the source IP address, destination IP address, source port, destination port, and protocol usually form a 5-tuple.
  • byte data from the same source and sent to the same destination can be referred to as the same data stream.
  • the data flow can also have other definitions.
  • transmission control protocol (TCP)
  • This method of sending packets not at a fixed rate can be called a burst sending method.
  • one or more data packets sent each time can be called a flowlet.
  • a data flow can include multiple flowlets. For example, take Figure 2 as an example.
  • the sending node (such as the TOR node) sends flowlet1 with a small amount of data.
  • the sending node sends flowlet2 with a larger amount of data.
  • the sending node sends flowlet3.
  • a is the sending node
  • b is the receiving node
  • flowlet1 is the multiple data packets sent by the sending node in the above-mentioned burst sending method during t1-t2, and flowlet2 is the multiple data packets sent by the sending node in the same method during t3-t4.
  • the path delays of the two links are d1 and d2 respectively
  • the time interval between flowlet1 and flowlet2 is Gap. If flowlet1 and flowlet2 belong to the same data flow, then as long as Gap is greater than the path delay difference between d1 and d2, the two flowlets can be sent over different links without arriving out of order.
  • the time interval between flowlet1 and flowlet2 refers to the time interval between the last packet in flowlet1 and the first packet in flowlet2, that is, the interval t2-t3.
  • the flowlet may also be referred to as a flow segment, or other names.
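The flowlet-splitting rule implied above (start a new flowlet when the inter-packet gap exceeds the path delay difference) can be sketched as follows. This is illustrative Python; packet send times are modeled as floats and the function name is an assumption.

```python
def split_into_flowlets(send_times, d1, d2):
    """Split a sequence of packet send times into flowlets.

    A new flowlet starts whenever the gap between consecutive packets exceeds
    the path delay difference |d1 - d2|, so consecutive flowlets can safely be
    sent over different paths without reordering."""
    threshold = abs(d1 - d2)
    flowlets, current = [], [send_times[0]]
    for prev, t in zip(send_times, send_times[1:]):
        if t - prev > threshold:
            flowlets.append(current)   # gap large enough: close the current flowlet
            current = []
        current.append(t)
    flowlets.append(current)
    return flowlets
```

With path delays 0.5 and 0.2 (threshold 0.3), packets at times 0.0, 0.1, 1.0, 1.1 split into two flowlets at the 0.9-second gap.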
  • distribution by flow is also known as static load balancing.
  • the switching nodes in the DCN perform a hash calculation on the data flow using a hash algorithm and, based on the hash result, select a path for sending the data flow among multiple available paths.
  • the hash algorithm can use the 5-tuple mentioned above as input. Taking Figure 1 as an example, considering only the traffic exchange within a PoD, there are 4 available paths from d0 to d2, namely d0-s0-d2, d0-s1-d2, d0-s2-d2, and d0-s3-d2.
  • flow-based load balance (FLB); static load balance (SLB)
  • the flow load balancing method may cause congestion of the data receiving and sending ports.
  • the SLB mechanism that uses the hash algorithm to select a route will generate a hash collision.
  • a hash collision may occur on the uplink port from the TOR node to the aggregation node, or from the aggregation node to the spine node.
  • the hash values of multiple data streams may be the same; therefore, when the TOR node uses the hash algorithm to select a route, multiple data streams with the same hash value may be sent to the same aggregation node. If there are multiple simultaneously active data streams, a large number of data packets may be sent to the same uplink port at the same time, which may cause congestion on that uplink port.
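Hash-based per-flow path selection of the kind described above can be sketched as follows. This is illustrative Python; SHA-256 over the 5-tuple is an assumed hash choice, not one specified by the application.

```python
import hashlib

def select_path(five_tuple, paths):
    """Static (per-flow) hash-based path selection.

    All packets of a flow share the same 5-tuple, so they hash to the same
    path; distinct flows may collide onto one path (a hash collision), which
    is the congestion risk described above."""
    digest = hashlib.sha256("|".join(map(str, five_tuple)).encode()).digest()
    return paths[int.from_bytes(digest[:4], "big") % len(paths)]

paths = ["d0-s0-d2", "d0-s1-d2", "d0-s2-d2", "d0-s3-d2"]
flow = ("10.0.0.1", "10.0.1.1", 6, 40000, 80)
# The same flow always maps to the same path:
assert select_path(flow, paths) == select_path(flow, paths)
```

Determinism is what makes this static: the path never changes for a flow, so two heavy flows that hash alike will keep sharing one uplink no matter how congested it becomes.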
  • packet spray is a dynamic load balancing (DLB) technology.
  • packet-based distribution also has corresponding shortcomings. Since packets belonging to the same data flow are sent via different paths, and the congestion levels of different paths may differ, the path delays may also differ. Different data packets of the same data flow may therefore reach the destination at different times; that is, packets belonging to the same data stream may arrive out of order (a packet sent earlier may arrive later). This requires the receiving end to reorder out-of-order data packets through special design, for example by hardware or software. In a possible implementation, each data stream that needs to be sorted can exclusively occupy a re-ordering channel (RC), also called re-ordering logic, or share one with other data streams.
  • RC collectively refers to the logic, cache, resources, related data structures, and other related content used for reordering.
  • the time occupied by the RC is related to the life cycle of the data stream (for example, 10ms or 2s).
  • the receiving side needs enough RCs to reorder the data packets. That is, if each data stream occupies one RC, then 10K data streams require 10K RCs, which increases the complexity of the receiver or limits its scalability.
  • the burst sending method can also implement DLB: through the above burst sending method, the sending path can be dynamically switched for different flowlets belonging to the same data flow. For example, flowlet1 and flowlet2 in Figure 3 belong to the same data flow but are sent through different paths, making the traffic distribution more even.
  • when the TCP source uses pacing to send data, the time interval between data packets is small, so it is difficult to meet the preset gap and trigger path switching.
  • the switching node can trigger backpressure, which sends a message to the data sender to instruct it to suspend sending. As a result, data waits at the sender, and the time interval between sent data packets may not be guaranteed to be greater than the delay difference between paths; in other words, out-of-order delivery may occur.
  • an embodiment of the present application provides a traffic balancing method. Referring to FIG. 5, the method includes the following steps:
  • the sender obtains a data packet to be sent.
  • when the sender has data to send, it encapsulates the data to be sent into data packets according to the format defined by the protocol. Depending on the amount of data to be sent, the sender can encapsulate it into one or more data packets; that is, the data packet to be sent in the embodiments of the present application may refer to one or more data packets.
  • the sender creates or maintains a flow packet (flowpac), and divides the data packet to be sent into a corresponding flow packet according to the destination node.
  • the flow packet is a data combination form proposed in the embodiments of this application.
  • a flow packet refers to a series of data packets; that is, a flow packet includes one or more data packets.
  • data packets belonging to the same flow packet have the same destination node; that is, the data packets belonging to the same flow packet all go to the same destination node.
  • Different flow packets can be distinguished by means of flow packet labels, etc. The specific scheme for distinguishing flow packets can be found below. It should be noted that the stream packet is not a data encapsulation body specified by a protocol.
  • the flow packet may include one or more complete flowlets.
  • a flow packet may include all data packets in flowlet1 and flowlet2.
  • the flow packet may also include part of the data packets in the flowlet.
  • a flow packet may include all the data packets of flowlet1 and the first two data packets of flowlet2. The specific introduction of flowlets can be found above and is not repeated here.
  • the data packets in a flow packet can come from the same data flow or from different data flows.
  • the data packets in a flow packet can be part or all of the data packets in a certain data flow, or data packets from multiple data flows.
  • a flow packet includes all the data packets in flowlet1 and flowlet2, where flowlet1 and flowlet2 belong to different data flows.
  • a flow packet includes all data packets in flowlet1 and flowlet2, and flowlet1 and flowlet2 belong to the same data flow.
  • the sender can create or maintain different flow packets by using the acquired data packets to be sent.
  • The multiple data packets to be sent acquired by the sender may all be classified into the same flow packet; alternatively, some of them may be classified into one flow packet and the rest into other flow packets. That is, the data packets to be sent may be classified into different flow packets.
  • After the sender obtains the data packets to be sent, it first needs to create a flow packet; subsequently, the sender can continue to obtain data packets to be sent and maintain the created flow packet.
  • For example, suppose the current data packets to be sent include 3 data packets.
  • The sender can create a flow packet containing these 3 data packets according to their destination node, that is, classify the 3 data packets into that flow packet.
  • The sender then obtains data packets to be sent at subsequent moments and maintains the previously created flow packet; that is, the data packets to be sent at subsequent moments can be assigned to the previously created flow packet, updating the number of data packets included in it.
  • In this way, the sender obtains one or more data packets to be sent and creates or maintains multiple flow packets, so as to classify each data packet into its corresponding flow packet, which facilitates subsequently deciding the sending path for each data packet according to the flow packet it belongs to.
  • The basis on which the sender classifies a data packet into a flow packet may be the destination node of the data packet; that is, data packets sent to the same destination node are classified into the same flow packet.
  • The transmission paths of data packets belonging to the same flow packet are the same.
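The destination-based classification described above can be sketched as follows (a minimal illustration; the class name and packet field names are assumptions for this sketch, not part of this application):

```python
from collections import defaultdict

class FlowPacketTable:
    """Groups outgoing data packets into flow packets keyed by destination node.

    All packets placed in the same flow packet share one destination node and
    will later be sent over the same path.
    """

    def __init__(self):
        self._flow_packets = defaultdict(list)  # destination node -> list of packets

    def classify(self, packet):
        """Assign a packet (a dict with a 'dst' field) to its flow packet."""
        self._flow_packets[packet["dst"]].append(packet)

    def flow_packet(self, destination):
        return self._flow_packets[destination]

# Example: three packets for node b share a flow packet; node c's packet does not.
table = FlowPacketTable()
for i in range(3):
    table.classify({"seq": i, "dst": "node_b"})
table.classify({"seq": 3, "dst": "node_c"})
```

In practice the table would also consult the network equalization parameters introduced below before appending to an existing flow packet.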
  • The basis on which the sender classifies data packets into flow packets may also be the network equalization parameters of the data packets.
  • The network equalization parameter is used by the sender to divide multiple data packets into different flow packets based on the network equalization principle; data packets of different flow packets are sent over different paths.
  • the transmission path can be set for the data packet in the flow packet based on the network balance principle.
  • Network equalization mainly refers to the balance of traffic among links.
  • Traffic balance means that, within a period of time, the difference between the traffic on each of multiple links is within a preset range.
  • In other words, traffic balance means that the traffic is shared across multiple links, with little difference in the amount of traffic carried by each link.
  • The sender determining, based on the network equalization principle, which flow packet a data packet belongs to means that, when there are many data packets, some data packets are classified into one flow packet according to the network equalization parameters and other data packets into another flow packet, so that the former can later be sent over the path corresponding to the one flow packet and the latter over the path corresponding to the other flow packet. That is, multiple data packets can be sent over different paths, achieving traffic balance among multiple paths.
  • the network equalization parameter of the data packet can be used to characterize the network resources occupied by the data packet, and the network resources can be, for example, bandwidth resources, time resources, and so on.
  • The main network equalization parameters used for traffic equalization in the embodiments of this application may be, for example, one or a combination of the following: the time interval (Gap) between data packets, the data volume (Size) of the data packets, the duration (Time) of the data packets, and the sending frequency of the data packets; various other measurable quantities may also be used.
  • the one or more network equalization parameters are used to control the dynamic capabilities of the DLB.
  • The dynamic capability of DLB refers to the ability of data packets to flexibly select different paths. Generally, the stronger the dynamic capability, the greater the possibility that different data packets are transmitted over different paths, or in other words, the easier it is to trigger path switching. The weaker the dynamic capability, the more often different data packets are carried by the same one or few paths; in other words, the harder it is to trigger path switching.
  • The network equalization parameters used for traffic balancing in the embodiments of this application may also be other parameters, such as the congestion degree of different paths obtained by a switching node from other switching nodes; for example, the congestion degree of the path between the TOR and the aggregation obtained by the TOR from an upstream aggregation node, or the congestion degree of the path between spine 0 and the TOR obtained by the TOR from spine 0; or the depth of the sending queue of a certain switching node, or characteristics of the sliding window of a switching node, such as the size of the sliding window.
  • The network equalization parameters listed in this paragraph can be used to reflect the congestion state of the network.
  • the network equalization parameters used for traffic balancing in the embodiment of this application may also include other parameters, which are not listed here in the embodiment of this application.
  • When the sender continuously obtains data packets to be sent and creates or maintains different flow packets, if the network equalization parameters meet preset conditions, the sender can create a new flow packet and classify subsequent data packets to be sent into the newly created flow packet. The network equalization parameters meeting the preset conditions includes one or more combinations of the following:
  • the data volume of the created stream packet reaches the preset data volume.
  • the data volume of the created flow package refers to the data volume of the data package in the created flow package.
  • the preset condition is that if the data amount of the data packet in a certain created flow packet reaches the preset data amount, the sender creates a new flow packet.
  • In this way, each flow packet includes the same amount of data, namely the preset data amount, so that the data amount of the data packets in each flow packet is the same and the bandwidth resources occupied by the data packets of different flow packets are balanced.
  • the duration of the created stream packet reaches the preset duration.
  • The duration of a created flow packet refers to the duration of the data packets in it, specifically, a period of time counted from the sending moment of the first data packet in the created flow packet. When the preset duration counted from the sending moment of the first data packet in a created flow packet expires, the sender creates a new flow packet and classifies the data packets sent after the preset duration into the newly created flow packet.
  • In this way, the data packets destined for the same destination node in each flow packet occupy roughly the same time resources; that is, this reduces the probability that the data packets in one or a few flow packets occupy more time resources while those in other flow packets occupy fewer, thereby balancing the time resources occupied by the data packets of different flow packets.
  • The sending frequency of a created flow packet reaches the preset frequency, where the sending frequency refers to the sending frequency of the data packets in the created flow packet.
  • The sending frequency of data packets reflects how fast data packets are sent and is generally affected by the time interval between data packets. The sending frequency is also related to data transmission requirements and/or other factors. For example, when burst mode is used to send data, the sending frequency is not fixed: it may be faster in certain periods and slower in others. When data is sent smoothly, the sending frequency is relatively fixed, and the data packets can be sent at a uniform speed.
  • When the sending frequency of a created flow packet reaches the preset frequency, a new flow packet can be created, and the flowlets sent subsequently are classified into the newly created flow packet. This balances the load among the paths corresponding to the flow packets and reduces the probability that the path corresponding to one or a few flow packets is overloaded because the flowlets in those flow packets are sent at a relatively high frequency.
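The three preset conditions above (data volume, duration, sending frequency) can be combined into one decision function, sketched below. The threshold values and packet field names are illustrative assumptions, not values specified by this application:

```python
def should_start_new_flow_packet(flow_packet, now,
                                 max_bytes=256 * 1024,   # preset data amount (assumed 256KB)
                                 max_duration=100e-6,    # preset duration (assumed 100 us)
                                 max_frequency=None):    # preset frequency in packets/s, optional
    """Return True when any preset condition from the text is met:
    accumulated data volume, duration since the first packet's send time,
    or sending frequency of the packets in the flow packet."""
    total = sum(p["size"] for p in flow_packet)
    if total >= max_bytes:
        return True  # data volume of the created flow packet reaches the preset amount
    if flow_packet and (now - flow_packet[0]["t"]) >= max_duration:
        return True  # duration of the created flow packet reaches the preset duration
    if max_frequency is not None and len(flow_packet) >= 2:
        elapsed = flow_packet[-1]["t"] - flow_packet[0]["t"]
        if elapsed > 0 and (len(flow_packet) - 1) / elapsed >= max_frequency:
            return True  # sending frequency reaches the preset frequency
    return False
```

A sender loop would call this before appending each newly obtained packet, starting a fresh flow packet whenever it returns True.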
  • the sender sends the data packet to be sent based on the stream packet to which the data packet to be sent belongs.
  • the flow balancing method provided in the embodiments of this application flexibly creates or maintains flow packets, and divides the data packets to be sent into corresponding flow packets according to the destination node.
  • the data packets in the same flow packet can be sent through the same path, and the data packets in different flow packets are sent through different paths. That is, according to the technical solution of the embodiment of the present application, it is possible to implement dynamic control path switching according to the flow packet to which the data packet belongs.
  • The granularity of path switching is related to how flow packets are divided.
  • With a finer path-switching granularity, path switching can be triggered after fewer data packets are sent.
  • With a coarser path-switching granularity, path switching is triggered only after more data packets are sent.
  • the traffic balancing method includes the following steps:
  • the sender obtains a data packet to be sent.
  • the data packet to be sent is shown in FIG. 7 and includes 8 data packets.
  • the sender creates or maintains the first flow packet, and classifies the data packet with the first node as the destination node into the first flow packet.
  • the embodiment of the present application mainly takes the data packet from the same source node to the same destination node as an example to illustrate the technical solution of using the same path or different paths to balance the traffic to the same destination node.
  • the source node may be node a as shown in FIGS. 7-10
  • the destination node may be node b as shown in FIGS. 7-10, that is, the first node may be node b.
  • the sender can also create or maintain flow packets in combination with the network equalization parameters of the data packets, and classify the data packets to be sent into the corresponding flow packets.
  • the first stream packet created by the sender may only include one data packet among the data packets to be sent.
  • the data packet to be sent obtained by the sender includes 8 data packets.
  • Initially, the sender can classify the first of the 8 data packets to be sent (that is, the data packet numbered 1) into the first flow packet, and create the first flow packet containing only this first data packet.
  • Subsequently, the sender can maintain the created first flow packet and update the number of data packets it includes. For example, as shown in FIG. 7, data packet 2, whose sending time interval from data packet 1 is less than or equal to the preset interval, is classified into the first flow packet.
  • Likewise, data packet 3, whose sending time interval from data packet 2 is less than or equal to the preset interval, is also classified into the first flow packet.
  • In other words, the sender checks the sending time interval between the last data packet in the first flow packet and the next data packet to be sent; if that interval is less than or equal to the preset interval, the sender classifies the next data packet into the first flow packet. Exemplarily, as shown in FIG. 7, the sending time interval between every two consecutive packets of data packet 1 to data packet 5 is less than or equal to the preset interval, so the sender classifies data packets 1 to 5 into the first flow packet.
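The gap-based classification above can be sketched as a small function that splits a sequence of send times into flow packets whenever the inter-packet gap exceeds the preset interval (the time values below are illustrative, not from the figures):

```python
def split_by_gap(send_times, preset_interval):
    """Split packet send times into flow packets: a new flow packet starts
    whenever the gap to the previous packet exceeds the preset interval,
    mirroring FIG. 7 where packets 1-5 stay in the first flow packet.
    Returns lists of packet indices."""
    if not send_times:
        return []
    flow_packets = [[0]]
    for i in range(1, len(send_times)):
        if send_times[i] - send_times[i - 1] <= preset_interval:
            flow_packets[-1].append(i)   # gap small enough: same flow packet
        else:
            flow_packets.append([i])     # gap exceeded: create a new flow packet
    return flow_packets
```

With send times `[0, 1, 2, 3, 4, 10, 11, 12]` and a preset interval of 2, the first five packets form one flow packet and the last three another, matching the FIG. 7 pattern.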
  • The first flow packet created by the sender may include data packets amounting to the preset data amount, or data packets amounting to less than the preset data amount.
  • the preset data amount is set to 64KB (that is, the size of 4 data packets) as an example.
  • the sender obtains a data packet to be sent with a large amount of data, assuming that it is 128KB (8 data packets in size), and the data amount of the data packet to be sent is greater than the preset data amount.
  • In this case, the sender can classify 4 of the 8 data packets to be sent (that is, data packet 1 to data packet 4 shown in FIG. 8(a)) into the first flow packet and create the first flow packet containing these 4 data packets.
  • the data amount of the first stream packet is exactly the preset data amount.
  • the sender obtains a data packet to be sent with a small amount of data at a certain moment, assuming that it is 32KB (the size of 2 data packets).
  • the data volume is less than the preset data volume.
  • the sender can first classify the two data packets to be sent into the first flow packet, and create the first flow packet including the two data packets.
  • The sender continues to obtain data packets to be sent at subsequent moments. If the data volume of those packets is greater than 32KB, the sender classifies only 32KB of them into the first flow packet, and the remaining packets can be classified into other flow packets created subsequently. Conversely, if the data volume of the subsequent data packets is less than or equal to 32KB, the sender can classify all of them into the first flow packet, update the number of data packets in the first flow packet, and continue to obtain data packets to be sent until the data amount in the first flow packet reaches the preset data amount, as shown in FIG. 8(b).
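The size-based division above can be sketched as follows, using the example figures from the text (64KB preset data amount, 16KB per data packet; the function and variable names are assumptions for this sketch):

```python
PACKET_SIZE = 16 * 1024     # per the example: 64KB corresponds to 4 data packets
PRESET_AMOUNT = 64 * 1024   # preset data amount from the example above

def split_by_size(packet_sizes, preset_amount=PRESET_AMOUNT):
    """Close the current flow packet once its accumulated data volume reaches
    the preset amount, as in the 128KB / 64KB example (FIG. 8(a)).
    Returns lists of packet indices."""
    flow_packets, current, acc = [], [], 0
    for idx, size in enumerate(packet_sizes):
        current.append(idx)
        acc += size
        if acc >= preset_amount:
            flow_packets.append(current)  # flow packet reached the preset amount
            current, acc = [], 0
    if current:
        flow_packets.append(current)      # partially filled flow packet, still open
    return flow_packets
```

For 8 packets of 16KB (128KB total), this yields two flow packets of 4 packets each, matching the example.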
  • the first stream packet created by the sender may only include one data packet among the data packets to be sent.
  • the data packet to be sent obtained by the sender includes 8 data packets.
  • Initially, the sender can classify the first of the 8 data packets to be sent (that is, the data packet numbered 1) into the first flow packet, and create the first flow packet containing only this first data packet. Subsequently, the sender may maintain the created first flow packet and update the number of data packets it includes.
  • the network equalization parameter can also be a sending period.
  • a sending period is defined as a preset period of time.
  • the sender periodically creates a stream packet, that is, creates a stream packet every other sending period.
  • the first stream packet created by the sender may only include one data packet among the data packets to be sent.
  • For example, the data packets to be sent obtained by the sender include 8 data packets. Initially, the sender can classify the first of them (that is, the data packet numbered 1) into the first flow packet, and create the first flow packet containing only this first data packet.
  • the sender may maintain the created first flow packet and update the number of data packets included in the first flow packet.
  • t1 to t2 are the sending period of the first stream packet
  • t2 to t3 are the sending period of the second stream packet
  • the sending time of data packet 2 is within the sending period of the first stream packet.
  • the sender divides data packet 2 into the first stream packet.
  • the sending time of data packet 3 is also within the sending period of the first flow packet, then data packet 3 is also classified as the first flow packet.
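The sending-period scheme above amounts to mapping each packet's send time to a period index; a minimal sketch (the function name and arguments are assumptions):

```python
def flow_packet_index(send_time, period_start, period_length):
    """Map a packet's send time to a flow packet by sending period:
    a new flow packet is created every `period_length`, so packets sent
    in t1-t2 fall into the first flow packet and packets sent in t2-t3
    into the second, as described above."""
    return int((send_time - period_start) // period_length)
```

For example, with periods of length 1 starting at t=0, packets sent at 0.2 and 0.8 share flow packet 0, while a packet sent at 1.5 falls into flow packet 1.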
  • The first flow packet created by the sender may include only one data packet among the data packets to be sent.
  • the data packet to be sent obtained by the sender includes 8 data packets.
  • Initially, the sender can classify the first of the 8 data packets to be sent (that is, the data packet numbered 1) into the first flow packet, and create the first flow packet containing only this first data packet. Subsequently, the sender may maintain the created first flow packet and update the number of data packets it includes.
  • Data packet 2 and data packet 3 belong to the same flowlet 1 as data packet 1. The sending time interval between data packet 2 and data packet 1 is small (it does not reach the preset sending interval), and the interval between data packet 2 and data packet 3 is also small, indicating that the sending frequency of flowlet 1 is low, so the sender classifies data packets 2 and 3 belonging to flowlet 1 into the first flow packet. Similarly, if the sending frequency of the subsequently sent flowlet 2 does not reach the preset frequency, data packets 4 to 6 belonging to flowlet 2 are also classified into the first flow packet.
  • the network equalization parameter can also be a combination of the data amount of the data packet and the time interval between the data packets.
  • For example, the sending time interval between every two data packets in the first flow packet is less than or equal to the preset interval, and the data amount of the first flow packet is the preset data amount.
  • the network equalization parameter may also be a combination of two or more of the above multiple network equalization parameters.
  • For the creation and maintenance process of the first flow packet, refer to the introduction above; details are not repeated here.
  • The network equalization parameters meeting the preset conditions refers to one or a combination of the following situations; other situations are also possible and are not listed here.
  • The sender creates or maintains the first flow packet. If the sender detects that, within a preset interval after a certain data packet in the first flow packet is sent, there is no further data packet destined for the same destination node, it creates a second flow packet, and the data packets sent after the preset interval can be classified into the second flow packet. As shown in FIG. 7, the sender has created and maintained the first flow packet, which includes data packets 1 to 5. The sender detects that, within the preset interval since the sending moment of data packet 5, there is no data packet destined for the same destination node.
  • It therefore creates a second flow packet and classifies data packet 6, sent after the preset interval, into the second flow packet; data packet 6 is the first data packet in the second flow packet.
  • For the specific maintenance process of the second flow packet, refer to the maintenance process of the first flow packet described above for the case where the network equalization parameter is the time interval between data packets.
  • The value of the preset interval G can be smaller than the G required by flowlets in the prior art (the value can be determined based on experience or by other methods), to avoid the phenomenon in practical applications that an overly large G rarely triggers path switching.
  • The sender creates or maintains the first flow packet; when the sender detects that the data volume of the first flow packet reaches the preset data amount, it creates the second flow packet and classifies the data packets sent afterwards into the second flow packet.
  • For example, a flow packet can be constructed every time data of a preset amount (such as 256KB) is sent, to improve the dynamic capability of DLB.
  • Whether the preset interval counts as "small" can be determined according to practical experience and statistical information; for example, when the preset interval is less than a first threshold, it is considered small.
  • the first threshold can be flexibly set, which is not limited in the embodiment of the present application.
  • The sender creates or maintains the first flow packet; when the sender detects that the preset time period counted from the sending of the first data packet in the first flow packet has elapsed, it creates a second flow packet and classifies the data packets sent after the preset time period into the second flow packet.
  • the network equalization parameter can also be the sending period. See Figure 9(b).
  • the sender creates or maintains the first stream packet. When the sender detects that the sending period of the first stream packet ends, the sender creates the second stream packet, and The data packets sent after the sending period of the first flow packet are classified as the second flow packet.
  • The sender can also be configured with other network equalization parameters and detect whether those parameters meet preset conditions, in order to determine whether to create a second flow packet. For example, the sender is configured with a parameter for the congestion level of the current path (such as the first path); when the sender detects that the congestion level experienced by the data packets in the currently created flow packet is greater than or equal to a congestion threshold, it creates the second flow packet.
  • The sender can also determine whether to perform path switching by combining two or more of the above conditions. For example, the sender detects that the data volume of the first flow packet is large (that is, reaches the preset data amount), for example, 5 data packets, and that, within the preset interval since the fifth data packet of the first flow packet was sent, there is no data packet destined for the same destination node; it then creates a new second flow packet.
  • In this case, the preset data amount S may be configured with a smaller value and the preset interval G with a larger value (for example, 50 us); the preset interval G can then be used to trigger the creation of a new second flow packet.
  • The idle time between the first flow packet and the second flow packet means that the first flow packet has no data packets during this time, but data packets of other flow packets are allowed to be sent during it.
  • The specific definitions of a "smaller" preset data amount S and a "larger" preset interval G can be determined according to practical applications and algorithms; for example, a threshold can be set, and when the preset data amount S is less than or equal to the threshold, S is deemed small.
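The combined trigger from the example above (data volume must reach S, and the gap since the last packet must exceed G) can be sketched as follows. The default values (S of 5 packets, G of 50 us) follow the example; the function and field names are assumptions:

```python
def needs_new_flow_packet(flow_packet, next_packet_time,
                          preset_amount_S=5,       # in packets, per the example above
                          preset_interval_G=50e-6):  # 50 us, per the example above
    """Combined condition from the example: the first flow packet must have
    reached the preset data amount S AND the gap since its last packet must
    exceed the preset interval G before a second flow packet is created."""
    if len(flow_packet) < preset_amount_S:
        return False  # data volume has not reached S yet
    gap = next_packet_time - flow_packet[-1]["t"]
    return gap > preset_interval_G  # G elapsed with no packet to the same destination
```

Because both conditions must hold, a large G alone no longer delays switching once S is small, which is the trade-off the text describes.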
  • the sender sends the data packet to be sent based on the stream packet to which the data packet to be sent belongs.
  • the transmission path of the data packet in the second flow packet is different from the transmission path of the data packet in the first flow packet.
  • One or more data packets in the first flow packet can come from the same data stream or from different data streams. That is, the data packets in the first flow packet can be part of one data stream or data packets from multiple data streams (of course, all data packets in the same flow packet must go to the same destination device). Similarly, one or more data packets in the second flow packet can come from the same data stream or from different data streams.
  • the first stream packet and the second stream packet may be data packets in the same data stream.
  • the first stream packet and the second stream packet can also come from different data streams.
  • For example, the first flow packet contains data packets of data flow 1 and the second flow packet contains data packets of data flow 2; or the first flow packet contains data packets of data flow 2 and the second flow packet contains data packets of data flow 1; or the first flow packet includes data packets of both data flow 1 and data flow 2; or the second flow packet includes data packets of both data flow 1 and data flow 2.
  • DLB with different performance can be achieved by setting different parameters (such as the time interval Gap or the duration Time), or by setting different values of the same parameter. Moreover, these parameters can be set separately or jointly. Exemplarily, when only the time interval parameter is set: with the sending period set to 5us, a path switch may be triggered every 5us, giving higher DLB performance but possibly requiring more sequencing resources at the receiving end; with the sending period set to 100us, a path switch may be triggered every 100us, giving lower DLB performance but possibly requiring fewer sequencing resources at the receiving end.
  • multiple sets of parameters can be set.
  • a set of parameters refers to one or more parameters.
  • the first set of parameters includes time interval and duration.
  • the second set of parameters includes time interval and data volume.
  • the same data stream uses a set of parameters.
  • Different data streams can use different parameters.
  • data stream 1 uses the first set of parameters for traffic balancing, and data stream 2 uses the second set of parameters.
  • different data streams can also use the same set of parameters, which is not limited in the embodiment of the present application.
  • different data streams can achieve traffic balance based on different parameters to meet the requirements of service differentiation. For example, for data flow 1, the sender detects whether the time interval between data packets meets the aforementioned preset interval, so as to create one or more flow packets for the data flow 1 to perform flow balance. For data stream 2, the sender detects the parameter of the data volume of the data packet, and switches the path every time 256KB of data is sent, thereby dynamically selecting a different path for the data packet in data stream 2.
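The per-stream parameter sets described above can be sketched as a simple configuration table. The stream identifiers and the concrete values (5 us gap, 100 us duration) are assumptions for illustration; the 256KB data volume follows the example in the text:

```python
# Hypothetical per-stream configuration: each data stream is balanced with
# its own parameter set, enabling service differentiation as described above.
PARAMETER_SETS = {
    "stream_1": {"gap": 5e-6, "duration": 100e-6},   # first set: time interval + duration
    "stream_2": {"gap": 5e-6, "size": 256 * 1024},   # second set: time interval + data volume
}

def params_for(stream_id, default=None):
    """Look up the parameter set used to balance a given data stream."""
    return PARAMETER_SETS.get(stream_id, default)
```

A sender would pass the looked-up set into whatever trigger logic it uses, so stream 1 switches paths on gap/duration conditions while stream 2 switches every 256KB.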
  • The sender needs to distinguish between different flow packets, that is, to delimit them.
  • Delimitation refers to clearly determining which flow packet a data packet belongs to.
  • The receiver can re-order the data packets based on the delimitation.
  • the sender may use any of the following methods to distinguish the first flow packet from the second flow packet.
  • The sender distinguishes the first flow packet from the second flow packet by re-encapsulating the data packets of the first flow packet or the second flow packet. For example, some label fields can be added to the data packet, with a certain value set in the field to identify which flow packet the data packet belongs to. Alternatively, an existing field can be reused directly, with a new value set in it to identify the data packet.
  • Correspondingly, the receiver decapsulates the data packet and learns, from the corresponding field, which flow packet the data packet belongs to.
  • the receiver may also determine whether the tail packet of the first stream packet is received according to the delimitation identifier.
  • A specific implementation of the sender encapsulating the data packet may be: at least one data packet in the first flow packet carries a delimitation identifier, at least one data packet in the second flow packet carries a delimitation identifier, and the delimitation identifier is used to distinguish the first flow packet from the second flow packet.
  • the delimitation identifier may be a flow packet sequence number, which is used to explicitly indicate the flow packet to which the data packet belongs.
  • Data packets in the same flow packet all carry the same flow packet sequence number (flowpac sequence, FSN).
  • the value of the FSN field of all packets in flowpac 0 is 0, the value of the FSN field of all packets in flowpac 1 is 1, and so on.
  • The FSN bit width (that is, the required number of bits) needs to enable the receiving side to distinguish different flowpacs received via different paths; for example, it can be 4 bits.
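The FSN stamping described above can be sketched as follows, using the 4-bit width mentioned in the text (the function and field names are assumptions; a real implementation would write the value into a header field rather than a dict):

```python
FSN_BITS = 4
FSN_MASK = (1 << FSN_BITS) - 1   # 4-bit field: sequence numbers 0..15, then wrap

def tag_with_fsn(packets, flow_packet_number):
    """Stamp every data packet of one flow packet with the same flow packet
    sequence number (FSN): all packets of flowpac 0 carry FSN 0, flowpac 1
    carries FSN 1, and so on, wrapping modulo 2**4."""
    fsn = flow_packet_number & FSN_MASK
    for p in packets:
        p["fsn"] = fsn
    return packets
```

The wrap-around is safe as long as, per the text, the receiver never holds more than 16 in-flight flow packets from one sender at once.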
  • the receiver determines that the current data packet is the first packet of flow packet 3.
  • the tail packet information is stored in a re-ordering table (ROT).
  • The receiver can query the re-ordering table; if the tail packet information of the previous flow packet can be found, it means the receiver has already received the tail packet of the previous flow packet and does not need to assign an RC (re-ordering channel) to the new data packet.
  • After the receiver receives a new flow packet (such as the second flow packet), if it determines by querying the re-ordering table that it has not yet received the tail packet of the previous flow packet, that is, of the first flow packet, this shows that the first and second flow packets did not reach the receiver in the order they were sent, resulting in disorder. The receiver therefore performs re-ordering through an RC. As a possible implementation, the receiver releases the RC after receiving the tail packet of the first flow packet. It is easy to understand that receiving the tail packet of the first flow packet indicates that the first flow packet has been fully received; the receiver then orders the second flow packet after the first flow packet.
  • The released RC can be used in other re-ordering procedures, thereby improving RC utilization. Since an idle RC can also serve other re-ordering processes, this is equivalent to expanding the number of available RCs.
  • The receiver only needs to occupy an RC for re-ordering when path switching occurs. That is to say, if the delay difference between two paths is at most 20us, a flow packet that undergoes path switching occupies an RC for only about 20us; in other words, the RC can be released once the earlier-sent data arrives from the original path (which takes at most about 20us). Compared with the packet-spray mechanism, in which an RC must be occupied for the entire life cycle of a data stream (which can reach the ms or even s level), the re-ordering process in the embodiments of this application can reduce RC consumption.
  • the sorting period may refer to the time from the moment the receiver determines that disorder has occurred to the moment the RC is released.
  • the embodiment of the present application does not limit the specific time period of the sorting period.
  • the sending side can control the dynamic capability of DLB through different parameters, such as Gap, Size, and Time; for example, it can control the number of flow packets that undergo path switching. Furthermore, based on the flow packets in which path switching occurs, the number of required RCs and/or the RC sorting period can also be controlled.
  • when the parameter values differ, and/or the parameter types differ (for example, the parameter type is Time or Gap), the load balancing performance may differ. For example, when the data-volume value is large, a path switch is initiated only after a large amount of data has been sent; the dynamic performance of DLB is poorer, but the receiver needs fewer RCs and RC consumption is lower.
  • different sorting strategies can be selected based on the number of RCs available at the receiver:
  • if the number of available RCs is not less than an RC-number threshold, each data stream can correspond to one RC.
  • if the number of available RCs is less than the RC-number threshold, data packets of different data streams are reordered through the same RC. That is, one data stream (for example, the data stream including the above-mentioned first flow packet and second flow packet) or multiple data streams share one RC.
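A minimal sketch of the threshold-based RC allocation strategy described above: when the receiver has RCs to spare, each data stream gets a dedicated RC; below the threshold, streams fall back to a shared RC. All class and field names are illustrative assumptions, not part of the patent.

```python
class RcAllocator:
    """Illustrative re-ordering channel (RC) allocator for a receiver."""

    def __init__(self, total_rcs, rc_threshold):
        self.rc_threshold = rc_threshold
        self.free_rcs = list(range(total_rcs))
        self.stream_to_rc = {}
        self.shared_rc = None  # lazily chosen RC shared by many streams

    def assign(self, stream_id):
        """Return the RC index that should reorder packets of stream_id."""
        if stream_id in self.stream_to_rc:
            return self.stream_to_rc[stream_id]
        if len(self.free_rcs) >= self.rc_threshold:
            # Enough RCs available: dedicate one RC per data stream.
            rc = self.free_rcs.pop()
        else:
            # Scarce RCs: let multiple data streams share the same RC.
            if self.shared_rc is None:
                self.shared_rc = self.free_rcs.pop() if self.free_rcs else 0
            rc = self.shared_rc
        self.stream_to_rc[stream_id] = rc
        return rc

    def release(self, stream_id):
        """Free the RC once the tail packet of the stream's flow packet arrived."""
        rc = self.stream_to_rc.pop(stream_id, None)
        if rc is not None and rc != self.shared_rc:
            self.free_rcs.append(rc)
```

Releasing an RC as soon as the previous flow packet's tail packet arrives is what keeps occupancy near the path-delay difference rather than the whole stream lifetime.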
  • the following dynamic re-ordering channel (DRC) algorithm may be used to further control the number of RCs.
  • in the DRC algorithm, one data stream (for example, the data stream including the above-mentioned first flow packet and second flow packet) or multiple data streams share the same RC.
  • limited RC resources can be used to ensure that most or all streams do not appear out of order.
  • the receiver can also feed back its available RC resources to the sender, to ensure that each sent flowpac has a corresponding RC available.
  • the delimitation identifier used to distinguish the first flow packet from the second flow packet may be a preset identifier in the head packet. That is, the sender carries the preset identifier only in the head packet of each flowpac. For example, setting the value of the SwitchOver (SO) field of the head packet to 1 indicates that this is the head packet of a new flow packet, and also indicates that this flow packet is sent over a different path from the previous flow packet; that is, a path switch occurs for this flow packet relative to the previous flow packet.
  • the head packet may carry other forms of head-packet identifiers, for example, a first packet (FP) field whose value is 1.
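An illustrative sketch of head-packet delimitation as described above: the sender sets an SO (SwitchOver) flag only in the first packet of each new flow packet, so the receiver can detect where one flow packet ends and the next begins. The dict-based packet layout is an assumption for illustration only.

```python
def mark_flow_packet(packets):
    """Sender: given the data packets of one flow packet, set SO=1 on the
    head packet and SO=0 on all following packets."""
    return [{"SO": 1 if i == 0 else 0, "payload": p}
            for i, p in enumerate(packets)]

def split_on_so(stream):
    """Receiver: split an arriving packet sequence back into flow packets,
    starting a new group whenever SO=1 is seen."""
    flow_packets, current = [], []
    for pkt in stream:
        if pkt["SO"] == 1 and current:
            flow_packets.append(current)
            current = []
        current.append(pkt)
    if current:
        flow_packets.append(current)
    return flow_packets
```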
  • the receiver determines that the current data packet is the first packet of the second stream packet.
  • other values of the packet type field can represent the packets between the head and tail packets of the flowpac.
  • the delimitation identifier may be a packet sequence number (PSN). For example, the PSN carried in the tail packet of the first flow packet is M, where M is the difference between the number of data packets in the first flow packet and 1 (M is a positive integer), and the PSN carried in the i-th data packet of the second flow packet is i-1, where i is an integer greater than 1.
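A hedged sketch of PSN-based delimitation as read from the description above: packets within a flow packet carry consecutive sequence numbers starting from 0, so the receiver recognizes a new flow packet when the PSN resets to 0, and knows the previous flow packet is complete once PSN M (packet count minus 1) has arrived. Field names are illustrative assumptions.

```python
def assign_psn(flow_packet):
    """Sender: the i-th data packet of a flow packet carries PSN i-1."""
    return [{"psn": i, "payload": p} for i, p in enumerate(flow_packet)]

def tail_psn(flow_packet):
    """PSN of the tail packet: the number of data packets minus 1 (M)."""
    return len(flow_packet) - 1

def is_new_flow_packet(pkt):
    """Receiver: a PSN of 0 marks the head packet of a new flow packet."""
    return pkt["psn"] == 0
```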
  • the delimitation identifier may also be a characteristic value of the tail packet of the first flow packet, carried in the head packet of the second flow packet; the characteristic value includes a cyclic redundancy check (CRC) value. That is, the head packet of the current flow packet carries some information about the tail packet of the previous flow packet.
  • the sender can combine the above multiple delimitation identifiers; for example, the sender carries both the head-packet identifier and the tail-packet identifier in the data packets.
  • a dedicated delimiter packet can be inserted between two flowpacs, so that the receiver can determine whether the tail packet of the first flow packet has been received.
  • when it is determined that a new flow packet is to be sent and the path needs to be switched, the sender first sends a flowpac delimiter (FPD) through the original path, that is, the first path, and then sends the new flow packet, that is, the second flow packet, through the new path, that is, the second path. The delimiter packet is used to distinguish the first flow packet from the second flow packet; it is received through the first path and is the data packet following the tail packet of the first flow packet.
  • the delimiter packet may be a control packet with preset characteristics; for example, it may be a control packet containing a preset field, or a control packet of a preset size. The embodiments of this application do not limit the specific implementation of the delimiter packet.
  • the size of the delimiter packet is small (for example, less than a certain threshold), so that it can reach the receiver with a lower delay; for example, so that the delimiter packet arrives at the receiver earlier than the second flow packet.
  • the delimiter packet may also arrive at the receiver later than the second flow packet; the embodiment of the present application does not limit when the delimiter packet arrives at the receiver.
  • if the receiver is currently receiving the first flow packet through the first path and subsequently also receives the delimiter packet via the first path, the receiver can determine that the first flow packet has been fully received.
  • if the receiver receives the second flow packet after receiving the delimiter packet (i.e., after the first flow packet has been fully received), there is usually no disorder problem between the first flow packet and the second flow packet.
  • in this case the receiver usually does not need to reorder, which can reduce RC consumption.
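A minimal sketch of the flow-packet-delimiter (FPD) mechanism above: the sender emits an FPD control packet on the old path after the tail packet of the first flow packet; once the receiver sees the FPD on that path, it knows the first flow packet is complete, so the second flow packet arriving on the new path needs no RC. The event tuples are an illustrative assumption.

```python
def receiver_handle(events):
    """Process (path, kind) events in arrival order; return True if the
    second flow packet can be delivered without occupying an RC."""
    first_flow_done = False
    for path, kind in events:
        if kind == "FPD" and path == "path1":
            # Delimiter arrived on the original path: first flow packet ended.
            first_flow_done = True
        elif kind == "flow2_head" and path == "path2":
            # Second flow packet arrives; an RC is needed only if the first
            # flow packet is still unfinished.
            return first_flow_done
    return first_flow_done
```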
  • the above mainly describes traffic balancing within the same data flow, and the delimitation, RC, and DRC methods for the data packets in the data flow.
  • multiple data streams can also be aggregated (also called merge) into a coarse-grained stream (for example, through a certain hash algorithm).
  • a preset algorithm (such as a hash algorithm, or other similar algorithms) is used to aggregate the data streams to be sorted.
  • the coarse-grained flow formed by the convergence may be referred to herein as a convergent flow.
  • the above-mentioned flow balancing method can be performed for the convergent flow, and the specific implementation of the data packet delimitation, RC, and DRC methods in the convergent flow can be referred to the above.
  • the "data flow” in this article can refer to a TCP flow in the usual sense as mentioned above, or can refer to an aggregate flow.
  • the aggregation stream can share the same RC.
  • head-of-line blocking may occur between streams sharing the same RC, because the receiving end can only process the next flowpac after confirming that it has received the tail packet corresponding to the previous flowpac.
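An illustrative sketch of aggregating fine-grained data streams into coarse-grained convergent flows via a hash, as described above, so that all streams mapped to the same bucket can share one RC. The particular hash function is an assumption; a switch would use its own hardware hash.

```python
import hashlib

def convergent_flow_id(five_tuple, num_buckets):
    """Map a stream's 5-tuple to one of num_buckets convergent flows.

    Streams with the same bucket share a re-ordering channel; the mapping
    is deterministic, so every packet of a stream lands in the same bucket.
    """
    key = "|".join(str(f) for f in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets
```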
  • the sending node and the receiving node include hardware structures and/or software modules corresponding to each function.
  • the embodiments of the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software-driven hardware depends on the specific application and design constraint conditions of the technical solution. Those skilled in the art can use different methods for each specific application to implement the described functions, but such implementation should not be considered as going beyond the scope of the technical solutions of the embodiments of the present application.
  • the embodiment of the application can divide the sending node and the receiving node into functional units according to the above method examples.
  • each functional unit can be divided corresponding to each function, or two or more functions can be integrated into one processing unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit. It should be noted that the division of units in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 12 shows a possible exemplary block diagram of a traffic balancing device involved in an embodiment of the present application.
  • the device 1200 may exist in the form of software, may be a network device, or may be used in a network device.
  • the device 1200 includes: a processing unit 1202 and a communication unit 1203.
  • the processing unit 1202 is used to control and manage the actions of the device 1200. For example, if the device is used to implement the sending-node function, the processing unit 1202 is used to support the device 1200 in executing S502 in FIG. 5, S602, S603, and S604 in FIG. 6, and/or other processes used in the techniques described herein.
  • the processing unit 1202 is used to support the device 1200 in determining, when the second flow packet is received, whether the tail packet of the first flow packet has been received; if the tail packet of the first flow packet has not been received, reordering the data packets in the first flow packet and the second flow packet through the re-ordering channel RC; after the reordering, releasing the RC if it is determined that the tail packet of the first flow packet has been received; and/or other processes used in the techniques described herein.
  • the communication unit 1203 is used to support communication between the device 1200 and other network entities.
  • the device 1200 may further include a storage unit 1201 for storing program codes and data of the device 1200.
  • the processing unit 1202 may be a processor or a controller, for example, a CPU, a general-purpose processor, DSP, ASIC, FPGA, or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It can implement or execute various exemplary logical blocks, modules and circuits described in conjunction with the disclosure of this application.
  • the processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, a combination of a DSP and a microprocessor, and so on.
  • the communication unit 1203 may be a communication interface, a transceiver or a transceiver circuit, etc., where the communication interface is a general term. In a specific implementation, the communication interface may include multiple interfaces, for example, may include: an interface between a sending node and a receiving node And/or other interfaces.
  • the storage unit 1201 may be a memory or another form of storage device.
  • the processing unit 1202 is a processor
  • the communication unit 1203 is a communication interface
  • the storage unit 1201 is a memory
  • the device 1200 involved in the embodiment of the present application may be a device having the structure shown in FIG. 13.
  • the device 1300 includes: a processor 1302, a communication interface 1303, and a memory 1301.
  • the apparatus 1300 may further include a bus 1304.
  • the communication interface 1303, the processor 1302, and the memory 1301 can be connected to each other through a bus 1304;
  • the bus 1304 can be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the bus 1304 can be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one thick line is used in FIG. 13, but it does not mean that there is only one bus or one type of bus.
  • FIG. 14 shows a possible structural schematic diagram of a device implementing the sending-node function involved in the above embodiment.
  • the flow balancing device 1400 may include: a first module 1401, a second module 1402, and a third module 1403.
  • the first module 1401 is used to support the flow balancing device 1400 to execute S501 in FIG. 5, S601 in FIG. 6, and/or other processes used in the solution described herein.
  • the second module 1402 is used to support the flow balancing device 1400 in performing S502 in FIG. 5, S602 to S604 in FIG. 6, and/or other processes used in the solution described herein.
  • the third module 1403 is used to support the flow balancing device 1400 in performing S503 in FIG. 5 and S605 in FIG. 6, and is also used to send a delimiter packet through the first path, where the delimiter packet is a control packet used to distinguish the first flow packet from the second flow packet, the first path is the path for sending the data packets in the first flow packet, and/or other processes used in the scheme described herein.
  • the flow balancing device may also include other modules, which will not be repeated here.
  • FIG. 15 shows another possible structural schematic diagram of the device for implementing the receiving node function involved in the foregoing embodiment.
  • the flow balancing device 1500 may include: a fourth module 1501, a fifth module 1502, and a sixth module 1503.
  • the fourth module 1501 is used to support the flow balancing device 1500 to determine whether the tail packet of the first flow packet has been received when the second flow packet is received, and/or other processes used in the solution described herein.
  • the fifth module 1502 is used to support the flow balancing device 1500 to perform reordering of the data packets in the first flow packet and the second flow packet through the ordering channel RC when the tail packet of the first flow packet is not received, and/or use Other processes in the scheme described in this article.
  • the sixth module 1503 is used to support the flow equalization device 1500 after reordering, if it is determined that the tail packet of the first flow packet has been received, the RC is released, and/or other processes used in the solution described herein.
  • a person of ordinary skill in the art can understand that: in the above-mentioned embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • software it can be implemented in the form of a computer program product in whole or in part.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center via wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)), etc.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation there may be other divisions, for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network devices (for example, terminals). Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each functional unit may exist independently, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional units.

Abstract

This application provides a traffic balancing method, a network device, and an electronic device, relating to the field of communication technology. The network device includes: a first module, configured to obtain data packets to be sent; a second module, configured to create or maintain flow packets and assign the data packets to be sent to the corresponding flow packets according to their destination nodes; and a third module, configured to send the data packets to be sent based on the flow packets to which they belong, where one flow packet includes at least one data packet, the data packets belonging to the same flow packet have the same destination node, and the data packets belonging to the same flow packet are sent over the same path.

Description

Traffic Balancing Method, Network Device, and Electronic Device

Technical Field

This application relates to the field of communication technology, and in particular to a traffic balancing method, a network device, and an electronic device.
Background

In a data center network (DCN), different networking modes can be used to provide a network for the many servers in the DCN. FIG. 1 shows a common three-tier networking mode among the various networking modes. The DCN is divided into three tiers: the downlink ports of the top-of-rack (TOR) nodes in the access layer connect to servers; the uplink ports of the TOR nodes connect to the downlink ports of the aggregation nodes in the aggregation layer; and the uplink ports of the aggregation nodes connect to the downlink ports of the spine nodes (also called backbone nodes) in the core layer.

As can be seen from the architecture shown in FIG. 1, multiple paths may be available between a data source and a data destination. For example, multiple paths are available from source S0 to destination S4 in FIG. 1. Therefore, a traffic balancing method is urgently needed so that data can be evenly spread over the multiple available paths, to maximize the bandwidth utilization of the DCN.
Summary

The traffic balancing method, network device, and electronic device provided in the embodiments of this application can dynamically control path switching and achieve more flexible traffic balancing.

To achieve the above objective, the embodiments of this application provide the following technical solutions:

In a first aspect, an embodiment of this application provides a network device. The network device may be a stand-alone device or a component integrated in a device, such as a chip system in a device. The network device may be an apparatus with a sending-node function, for example a sending node or a chip system in a sending node. The network device includes the following functional modules: a first module, configured to obtain data packets to be sent; a second module, configured to create or maintain flow packets and assign the data packets to be sent to the corresponding flow packets according to their destination nodes; and a third module, configured to send the data packets to be sent based on the flow packets to which they belong.

One flow packet includes at least one data packet; data packets belonging to the same flow packet have the same destination node and are sent over the same path.

It should be noted that a flow packet (flowpac) is a form of data grouping proposed in the embodiments of this application. A flow packet refers to a set of data packets. Different flow packets can be distinguished by means such as flow packet labels.

Compared with the prior art, in which it is difficult to trigger dynamic load balancing of sufficient granularity, the traffic balancing method provided in the embodiments of this application flexibly creates or maintains flow packets and assigns the data packets to be sent to the corresponding flow packets according to their destination nodes. In this way, data packets in the same flow packet can be sent over the same path, and data packets in different flow packets can be sent over different paths; that is, according to the technical solutions of the embodiments of this application, path switching can be controlled dynamically according to the flow packet to which a data packet belongs. Moreover, the granularity of path switching is related to how flow packets are divided. When a flow packet includes fewer data packets, the path-switching granularity is finer, i.e., a path switch can be triggered after sending only a few data packets; when a flow packet includes more data packets, the granularity is coarser, i.e., a path switch is triggered only after sending more data packets. Thus, the traffic balancing method provided in the embodiments of this application can also flexibly control the path-switching granularity by controlling how flow packets are divided, so as to meet the traffic balancing requirements of different application scenarios.
In a possible implementation, the second module is further configured to: create or maintain a first flow packet; assign data packets destined for a first node to the first flow packet; create or maintain a second flow packet; and assign subsequently sent data packets destined for the first node to the second flow packet.

The sending path of the data packets in the second flow packet is different from the sending path of the data packets in the first flow packet.

In a possible implementation, the second module is further configured to: determine whether a network balance parameter meets a preset condition; and if the network balance parameter meets the preset condition, create or maintain the second flow packet. The network balance parameter is used to assign data packets to the corresponding flow packets based on the principle of network balance.

The network balance parameter meeting the preset condition includes meeting any one or more of the following conditions:

1. Within a preset interval starting from the moment the first data packet in the first flow packet is sent, there is no data packet destined for the same destination node. That is, the sender obtains a data packet to be sent and assigns it to a created flow packet; if, starting from the moment this data packet is sent, no other data packet with the same destination node appears within the preset interval, the sender creates a new flow packet and can assign the subsequent data packets outside the preset interval to the newly created flow packet. In this way, two data packets separated by a large time interval can be assigned to different flow packets.

2. The data volume of the first flow packet reaches a preset data volume. This condition means that if the data volume of the data packets in a created flow packet reaches the preset data volume, the sender creates a new flow packet. In other words, every flow packet includes data packets of the same, preset data volume. This makes the data volume of the data packets in each flow packet the same, balancing the bandwidth resources occupied by the data packets in different flow packets.

3. The duration of the first flow packet reaches a preset duration. The duration of a created flow packet refers to the duration of the data packets in that flow packet; specifically, it is the period of time starting from the sending moment of the first data packet in that flow packet. If the preset duration elapses from the sending moment of the first data packet in a created flow packet, the sender creates a new flow packet and assigns the data packets sent after the preset duration to the newly created flow packet. In this way, the time resources occupied by data packets destined for the same destination node in each flow packet are roughly the same, reducing the probability that the data packets in one or a few flow packets occupy many time resources while those in other flow packets occupy few, and balancing the time resources occupied by data packets in different flow packets.

4. The sending frequency of the first flow packet reaches a preset frequency. The sending frequency of data packets can reflect how fast the data packets are sent; generally, it is affected by the sending time intervals between data packets and is related to the data sending demand and/or other factors. For example, when data is sent in a burst manner, the sending frequency is not fixed: it may be faster in one period and slower in another. When data is sent in a paced (smooth) manner, the sending frequency is relatively fixed, and the data packets may be sent at a constant rate. Generally, if data is sent in a burst manner and one or more flowlets in the currently created flow packet have a high sending frequency, a new flow packet can be created to reduce the load on the path corresponding to the currently created flow packet, and the flowlets sent thereafter are assigned to the newly created flow packet. This helps balance the load among the paths corresponding to the flow packets and reduces the probability that the path corresponding to one or more flow packets becomes overloaded because of fast-sending flowlets.

In a possible implementation, different labels can be attached to the data packets of different flow packets to distinguish which flow packet a data packet belongs to. Specifically, at least one data packet in the first flow packet carries a first delimitation identifier, at least one data packet in the second flow packet carries a second delimitation identifier, and the first and second delimitation identifiers are used to distinguish the first flow packet from the second flow packet.

In a possible implementation, a control packet can be inserted between different flow packets to distinguish which flow packet a data packet belongs to. In this implementation, the third module is further configured to send a delimiter packet over a first path, where the delimiter packet is a control packet used to distinguish the first flow packet from the second flow packet, and the first path is the path over which the data packets in the first flow packet are sent.

In a possible implementation, the data packets in one flow packet come from the same data stream or from different data streams.

In a possible implementation, the second module is further configured to set sending paths for the data packets in the flow packets based on the principle of network balance.
In a second aspect, an embodiment of this application provides a network device. The network device may be a stand-alone device or a component integrated in a device, such as a chip system in a device. The network device may be an apparatus with a receiving-node function, for example a receiving node or a chip system in a receiving node. The network device includes the following functional modules: a fourth module, configured to determine, when the second flow packet is received, whether the tail packet of the first flow packet has been received; a fifth module, configured to reorder the data packets in the first flow packet and the second flow packet through a re-ordering channel (RC) if the tail packet of the first flow packet has not been received; and a sixth module, configured to release the RC after the reordering if it is determined that the tail packet of the first flow packet has been received.

It is easy to understand that after the receiver receives a new flow packet (for example, the second flow packet), if it determines by some means that it has not received the previous flow packet, i.e., the tail packet of the first flow packet, this indicates that the first flow packet and the second flow packet did not arrive at the receiver in first-sent-first-arrived order, resulting in disorder. Therefore, the receiver performs reordering through an RC. In the embodiments of this application, the receiver releases the RC after receiving the tail packet of the first flow packet. Receiving the tail packet of the first flow packet indicates that the first flow packet has been fully received; at that point there is no longer a disorder problem between the first flow packet and the second flow packet, so the RC can be released and used for other reordering procedures, improving RC utilization. Since an idle RC can also be used for other reordering procedures, this is equivalent to expanding the number of available RCs.

In a third aspect, an embodiment of this application provides an electronic device, including a processor and a storage device. The storage device is configured to store instructions, and the processor is configured to perform the following actions based on the instructions: create or maintain a first flow packet; assign data packets destined for a first node to the first flow packet; when a network balance parameter meets a preset condition, create or maintain a second flow packet; and assign subsequent data packets destined for the first node to the second flow packet, where data packets belonging to the same flow packet have the same destination node and the same sending path, and the sending path of the data packets in the second flow packet is different from that of the data packets in the first flow packet.

In a fourth aspect, an embodiment of this application provides an electronic device, including a processor and a storage device. The storage device is configured to store instructions, and the processor is configured to perform the following actions based on the instructions: determine, when the second flow packet is received, whether the tail packet of the first flow packet has been received; if not, reorder the data packets in the first flow packet and the second flow packet through a re-ordering channel RC; and after the reordering, release the RC if it is determined that the tail packet of the first flow packet has been received.
In a fifth aspect, an embodiment of this application provides a traffic balancing method, which may be performed by the apparatus of the first or third aspect. The method includes the following steps: obtaining data packets to be sent; creating or maintaining flow packets and assigning the data packets to be sent to the corresponding flow packets according to their destination nodes; and sending the data packets to be sent based on the flow packets to which they belong.

One flow packet includes at least one data packet; data packets belonging to the same flow packet have the same destination node and are sent over the same path.

In a possible implementation, taking the creation or maintenance of a first flow packet and a second flow packet as an example, the method of creating or maintaining flow packets and dividing data packets among them includes the following steps: creating or maintaining the first flow packet; assigning data packets destined for a first node to the first flow packet; creating or maintaining the second flow packet; and assigning subsequently sent data packets destined for the first node to the second flow packet.

The sending path of the data packets in the second flow packet is different from that of the data packets in the first flow packet.

In a possible implementation, creating or maintaining the second flow packet may be implemented as the following steps: determining whether a network balance parameter meets a preset condition; and if so, creating or maintaining the second flow packet, where the network balance parameter is used to assign data packets to the corresponding flow packets based on the principle of network balance.

The network balance parameter meeting the preset condition includes meeting any one or more of the following conditions:

Within a preset interval starting from the moment the first data packet in the first flow packet is sent, there is no data packet destined for the same destination node; the data volume of the first flow packet reaches a preset data volume; the duration of the first flow packet reaches a preset duration; or the sending frequency of the first flow packet reaches a preset frequency.

In a possible implementation, labels on data packets can be used to distinguish which flow packet a data packet belongs to: at least one data packet in the first flow packet carries a first delimitation identifier, at least one data packet in the second flow packet carries a second delimitation identifier, and the two identifiers are used to distinguish the first flow packet from the second flow packet.

In a possible implementation, a control packet inserted between data packets can be used to distinguish data packets belonging to different flow packets: a delimiter packet is sent over a first path, the delimiter packet being a control packet used to distinguish the first flow packet from the second flow packet, and the first path being the path over which the data packets in the first flow packet are sent.

In a possible implementation, the data packets in one flow packet come from the same data stream or from different data streams.

In a possible implementation, the above method further includes: setting sending paths for the data packets in the flow packets based on the principle of network balance.
In a sixth aspect, an embodiment of this application provides a traffic balancing method, which may be performed by the apparatus of the second or fourth aspect (i.e., an apparatus with a receiving-node function). The method includes the following steps:

determining, when the second flow packet is received, whether the tail packet of the first flow packet has been received; if not, reordering the data packets in the first flow packet and the second flow packet through a re-ordering channel RC; and after the reordering, releasing the RC if it is determined that the tail packet of the first flow packet has been received.

In a seventh aspect, this application provides a traffic balancing apparatus having the function of implementing the traffic balancing method of any one of the above aspects. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function.

In an eighth aspect, a traffic balancing apparatus is provided, including a processor and a memory. The memory is configured to store computer-executable instructions; when the apparatus runs, the processor executes the computer-executable instructions stored in the memory, so that the apparatus performs the traffic balancing method of any one of the above aspects.

In a ninth aspect, a traffic balancing apparatus is provided, including a processor. The processor is configured to couple with a memory and, after reading instructions in the memory, perform the traffic balancing method of any one of the above aspects according to the instructions.

In a tenth aspect, a computer-readable storage medium is provided, storing instructions that, when run on a computer, enable the computer to perform the traffic balancing method of any one of the above aspects.

In an eleventh aspect, a computer program product containing instructions is provided that, when run on a computer, enables the computer to perform the traffic balancing method of any one of the above aspects.

In a twelfth aspect, a circuit system is provided, including a processing circuit configured to perform the traffic balancing method of any one of the fifth or sixth aspects.

In a thirteenth aspect, a chip is provided, including a processor coupled with a memory. The memory stores program instructions that, when executed by the processor, implement the traffic balancing method of any one of the fifth or sixth aspects.
Brief Description of the Drawings

FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of this application;

FIG. 2 is a schematic diagram of flowlets provided by an embodiment of this application;

FIG. 3 is a schematic diagram of a load balancing principle provided by an embodiment of this application;

FIG. 4 is a schematic diagram of another load balancing principle provided by an embodiment of this application;

FIG. 5 is a flowchart of a traffic balancing method provided by an embodiment of this application;

FIG. 6 is a flowchart of a traffic balancing method provided by an embodiment of this application;

FIG. 7 to FIG. 10 are schematic diagrams of the principles of traffic balancing methods provided by embodiments of this application;

FIG. 11 is a schematic diagram of the principle of data packet sorting provided by an embodiment of this application;

FIG. 12 to FIG. 15 are schematic structural diagrams of traffic balancing apparatuses provided by embodiments of this application.
Detailed Description

The terms "first" and "second" in the specification and drawings of this application are used to distinguish different objects, or to distinguish different processing of the same object, rather than to describe a specific order of objects.

"At least one" means one or more.

"Multiple" means two or more.

"And/or" describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects; for example, A/B may indicate A or B.

In addition, the terms "including" and "having" mentioned in the description of this application, and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes other unlisted steps or units, or optionally also includes other steps or units inherent to the process, method, product, or device.

It should be noted that in the embodiments of this application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of this application should not be construed as more preferred or advantageous than other embodiments or designs. Rather, the use of such words is intended to present the relevant concepts in a concrete manner.

In the specification and drawings of this application, "of", "corresponding (relevant)", and "corresponding" may sometimes be used interchangeably; it should be noted that when their differences are not emphasized, the meanings they express are consistent.
The traffic balancing method provided in the embodiments of this application is applied in a system that requires traffic balancing, for example, in a DCN that requires traffic balancing. The DCN involved in the embodiments of this application may have different networking modes and different tiers. The following mainly takes application in a three-tier DCN as an example to describe the system architecture to which the embodiments of this application apply. Referring to FIG. 1, the three-tier DCN includes an access layer, an aggregation layer, and a core layer.

The nodes of the access layer may be called access nodes. In different networking modes, the specific implementation and name of an access node may differ. For example, in a DCN using the leaf-spine networking mode, an access node may be called a leaf node; if a leaf node is located at the top of a rack, it may also be called a TOR node. The following description mainly takes a TOR node as an example of an access node; access nodes may also take other forms, which are not listed one by one in the embodiments of this application. The downlink ports of a TOR node connect to servers, and the TOR node can send and receive data through the servers.

The nodes of the aggregation layer may be called aggregation nodes. The downlink ports of an aggregation node connect to the uplink ports of TOR nodes.

The nodes of the core layer may be called core nodes; in a DCN using the leaf-spine networking mode, a core node may be called a spine node. The following description mainly takes a spine node as an example of a core-layer node, which is stated here once and applies throughout. A core node may also be a node other than a spine node; the embodiments of this application do not list such nodes one by one. The ports of a spine node may connect to the uplink ports of aggregation nodes.

In the embodiments of this application, the above access nodes, aggregation nodes, and core nodes may all be called switching nodes. This is stated here once and will not be repeated below.

One or more TOR nodes and one or more aggregation nodes may form a point of delivery (Pod). Spine nodes may also be divided into different spine planes, and different aggregation nodes of the same Pod may be connected to different spine planes. For example, the spine nodes in FIG. 1 are divided into 3 planes, and the spine nodes in each plane are directly connected to different aggregation nodes in each Pod.

In a possible design, the aggregation nodes and TOR nodes within the same Pod may be fully connected, i.e., every TOR node in a Pod is connected to all aggregation nodes in that Pod, and every aggregation node in the Pod is connected to all TOR nodes in that Pod. Of course, the connection relationship between TOR nodes and aggregation nodes in the same Pod may also take other forms, which the embodiments of this application do not limit.

The aggregation nodes are used to complete traffic exchange across TOR nodes within the same Pod, for example the traffic exchange from source d0 to destination d1 in FIG. 1. The spine nodes are used to complete traffic exchange across Pods, for example the traffic exchange from source S0 to destination s4 in FIG. 1.
To facilitate understanding of the content of the embodiments of this application, some technical terms are introduced below:

1. Data flow: an ordered data sequence of bytes with a start point and an end point. Data packets belonging to the same data flow usually have the same attributes, such as the same source internet protocol (IP) address, destination IP address, and destination port. The source IP address, destination IP address, destination port, and so on can usually form a 5-tuple. In a commonly used definition of a data flow, byte data coming from the same source and sent to the same destination may be called the same data flow. Of course, a data flow may also have other definitions.

2. Flowlet: the transmission control protocol (TCP) can send packets at a fixed rate, for example at 5 Gbps, or not at a fixed rate, instead sending a large number of data packets at some moments (or within a short time interval) and a small number of data packets at other moments; this non-fixed-rate sending is called the burst sending mode. Each batch of one or more data packets sent in this way can be called a flowlet, and one data flow can usually include multiple flowlets. For example, taking FIG. 2 as an example, in the period t1–t2 the sending node (for example, a TOR node) sends flowlet1 with a small data volume, in t3–t4 it sends flowlet2 with a larger data volume, and in t5–t6 it sends flowlet3.

Referring to FIG. 3, a is the sending node and b is the receiving node; flowlet1 is the set of data packets sent by the sending node in the above burst mode during t3–t4, and flowlet2 is the set of data packets sent in burst mode during t1–t2. The path delays of the two links are d1 and d2, respectively, and the time interval between flowlet1 and flowlet2 is Gap. If flowlet1 and flowlet2 belong to the same data flow, then usually when the interval satisfies Gap > |d1 - d2|, the two flowlets can be sent over different paths without worrying about disorder. Here, the time interval between flowlet1 and flowlet2 refers to the interval between the last data packet of the earlier flowlet and the first data packet of the later flowlet, i.e., the interval t2–t3.
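The flowlet rule above can be sketched in a few lines: two flowlets of the same data flow can safely take different paths when the inter-flowlet gap exceeds the path-delay difference |d1 - d2|, because the later flowlet then cannot overtake the earlier one. This is an illustrative check, not the patent's implementation.

```python
def can_switch_path(gap, d1, d2):
    """Return True if a path switch between two flowlets causes no disorder.

    gap: time between the last packet of the earlier flowlet and the first
         packet of the later flowlet; d1, d2: delays of the old and new
         path (all in the same time unit, e.g. seconds).
    """
    return gap > abs(d1 - d2)
```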
In the embodiments of this application, a flowlet may also be called a flow segment, or other names.
3. Flow-based distribution (also called static load balancing): in a traditional traffic balancing (also called load balancing, LB) approach, the switching nodes in the DCN apply a hash algorithm to a data flow and, based on the hash result, select one of the multiple available paths to send the data flow. The hash algorithm can take the above-mentioned 5-tuple as input. Taking FIG. 1 as an example, considering only traffic exchange within the same Pod, there are 4 available paths from d0 to d2: d0-s0-d2, d0-s1-d2, d0-s2-d2, and d0-s3-d2. Suppose it is defined that a hash result of 0 means selecting s0 in Pod#1 to forward the traffic, 1 means selecting s1, 2 means selecting s2, and 3 means selecting s3. Then if the hash result of a data flow is 2, the traffic is forwarded over path d0-s2-d2.
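A hedged sketch of the static, flow-based selection just described: hash the 5-tuple of a data flow and pick one of the 4 available paths from d0 to d2. The CRC32 hash here is an illustrative stand-in for whatever hash a switch ASIC actually uses.

```python
import zlib

PATHS = ["d0-s0-d2", "d0-s1-d2", "d0-s2-d2", "d0-s3-d2"]

def select_path(five_tuple):
    """All packets of one flow hash to the same path, so no reordering is
    needed, but hash collisions can congest a single uplink port."""
    key = "|".join(str(f) for f in five_tuple).encode()
    return PATHS[zlib.crc32(key) % len(PATHS)]
```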
In this load balancing approach, data packets belonging to the same data flow are usually sent over the same path. Such flow-based load balancing is called flow-based load balance (FLB), or static load balance (SLB).

Since the above flow-based load balancing usually carries the data packets of the same data flow on the same path, it can ensure that no disorder occurs among the data packets of the same data flow. Therefore, the receiving end usually does not need to reorder the received data packets.

However, flow-based load balancing may cause congestion at data sending/receiving ports. Specifically, the SLB mechanism that selects paths via a hash algorithm can produce hash collisions, which may occur, for example, on the uplink ports from TOR nodes to aggregation nodes, or from aggregation nodes to spine nodes. Taking a hash collision between TOR and aggregation nodes as an example (see FIG. 4), the hash values of multiple data flows may be identical, so that when the TOR node selects paths via the hash algorithm, it may send multiple data flows with the same hash value to the same aggregation node. If multiple data flows are active at the same time, a large number of data packets may be sent to the same uplink port simultaneously, which may congest that uplink port.

4. Packet-based distribution: each switching node distributes traffic evenly over the multiple available paths on a per-packet basis. In this way, multiple data packets belonging to the same data flow may be sent over different paths. This mechanism is collectively called packet spray. Packet spray is a dynamic load balancing (DLB) technique. Compared with flow-based distribution, its advantages are that the above hash-collision problem does not exist and the bandwidth of multiple paths can be better utilized.

However, packet-based distribution also has drawbacks. Since packets belonging to the same data flow are sent over different paths, and the congestion levels, and hence the delays, of different paths may differ, different data packets of the same data flow may arrive at the destination at different times. That is, disorder may occur among packets belonging to the same data flow (i.e., later-sent packets arrive first and earlier-sent packets arrive later), which requires the receiving end to reorder the disordered data packets through a dedicated design, for example by hardware or software. In a possible implementation, each data flow that needs sorting may exclusively occupy, or share with other data flows, a re-ordering channel (RC), also called reordering logic. An RC is the collective term for the logic, buffers, resources, and related data structures used for reordering, or other related content. The time for which an RC is occupied is related to the life cycle of the data flow (for example, 10 ms or 2 s). Thus, when there are many data flows, the receiving side needs enough RCs to reorder the data packets. In other words, if each data flow exclusively occupies one RC, then 10K data flows require 10K RCs, causing complexity or scalability problems at the receiving end.

In one example, the burst sending mode can also achieve DLB: through the burst sending mode, the sending path can be switched dynamically for different flowlets belonging to the same data flow, for example sending flowlet1 and flowlet2 of the same data flow in FIG. 3 over different paths, making the traffic distribution more even.

However, the following two aspects limit the application of flowlets:

On the one hand, DCN bandwidth is large and the delay difference between paths is usually small (for example, 20 us), so an excessively large Gap setting (for example, Gap = 500 us in some algorithms) makes it difficult to trigger dynamic load balancing of sufficiently fine granularity. That is, when the Gap value is set too large, a path can be switched only when the time interval between two flowlets exceeds the Gap; relative to high-speed data flows, the interval between two flowlets is usually small, so a path switch may be triggered only at long intervals, failing to deliver the performance of dynamic load balancing.

On the other hand, dynamic behaviors in the network may affect path switching. For example, a TCP source sends data in a paced (smooth) manner; since the time intervals between data packets are then small, the preset Gap is hard to satisfy and path switching is hard to trigger. In addition, when the resources of a switching node are exhausted, for example when its buffer is exhausted, the node may trigger backpressure, sending a message to the data sender to instruct it to postpone sending. This makes data wait at the sender; accordingly, the time intervals between sent data may not be guaranteed to exceed the inter-path delay difference. In other words, disorder may occur.
As pointed out above, when traffic balancing is performed in the above packet-based distribution manner, it is difficult to trigger dynamic load balancing of sufficient granularity; for example, a path switch may be triggered only at long intervals. When traffic balancing is performed in the flow-based distribution manner, dynamic load balancing is not triggered at all. As a result, current load balancing performance is low. To solve this technical problem, an embodiment of this application provides a traffic balancing method. Referring to FIG. 5, the method includes the following steps:

S501. The sender obtains data packets to be sent.

It is easy to understand that when the sender has data to send, it encapsulates the data into data packets according to the format defined by the protocol. Depending on the amount of data to be sent, the sender may encapsulate it into one or more data packets; that is, in the embodiments of this application, the data packets to be sent may refer to one or more data packets.

S502. The sender creates or maintains flow packets (flowpacs) and assigns the data packets to be sent to the corresponding flow packets according to their destination nodes.

A flow packet is a form of data grouping proposed in the embodiments of this application. A flow packet refers to a set of data packets, i.e., one flow packet includes one or more data packets. Data packets belonging to the same flow packet have the same destination node, i.e., they all go to the same destination node. Different flow packets can be distinguished by means such as flow packet labels; the specific schemes for distinguishing flow packets are described below. It should be noted that a flow packet is not a data encapsulation body defined by any protocol.

In a possible implementation, if the above burst sending mode is used, a flow packet may include one or more complete flowlets. For example, taking FIG. 2 as an example, one flow packet may include all the data packets of flowlet1 and flowlet2. Of course, a flow packet may also include only some of the data packets of a flowlet; still taking FIG. 2 as an example, one flow packet may include all the data packets of flowlet1 and the first 2 data packets of flowlet2. For the specific introduction of flowlets, see above; details are not repeated here. In addition, the data packets in a flow packet may come from the same data stream or from different data streams. That is, the data packets in a flow packet may be some or all of the data packets of one data stream, or data packets of multiple data streams. For example, one flow packet includes all the data packets of flowlet1 and flowlet2, where flowlet1 and flowlet2 belong to different data streams; or one flow packet includes all the data packets of flowlet1 and flowlet2, where flowlet1 and flowlet2 belong to the same data stream.

In the embodiments of this application, the sender can use the obtained data packets to be sent to create or maintain different flow packets. Here, multiple data packets to be sent obtained by the sender may be assigned to the same flow packet, or some of them may be assigned to one flow packet and the others to other flow packets; that is, the data packets to be sent may be assigned to different flow packets. Taking the creation and maintenance of one flow packet as an example: after obtaining data packets to be sent, the sender first creates a flow packet; subsequently, the sender may continue to obtain data packets to be sent to maintain the created flow packet. For example, if the current data packets to be sent include 3 data packets, the sender may create a flow packet including these 3 data packets according to their destination node, i.e., assign the 3 data packets to that flow packet. Then, as time goes on, the sender obtains data packets to be sent at subsequent moments and maintains the previously created flow packet, i.e., it may assign the later data packets to the previously created flow packet and update the number of data packets that the flow packet includes. In this way, the sender obtains one or more data packets to be sent and creates or maintains multiple flow packets, thereby assigning the data packets to the corresponding flow packets, so that a sending path can later be decided for each data packet according to the flow packet to which it belongs.

In some embodiments, the basis on which the sender assigns a data packet to a particular flow packet may be the destination node of the data packet, i.e., data packets sent to the same destination node are assigned to the same flow packet. Moreover, the data packets belonging to the same flow packet have the same sending path.

In other embodiments, the basis on which the sender divides data packets among flow packets may also be the network balance parameters of the data packets: when a network balance parameter meets a preset condition, a new flow packet can be created, and the data packets to be sent thereafter are assigned to the newly created flow packet. The network balance parameter is used by the sender to assign multiple data packets to different flow packets based on the principle of network balance, with the data packets of different flow packets sent over different paths. That is, sending paths can be set for the data packets in the flow packets based on the principle of network balance. In the embodiments of this application, network balance mainly refers to traffic balance among links: within a period of time, the difference among the traffic volumes of the links is within a preset range. In other words, traffic balance means that the traffic is shared by multiple links and the traffic carried by each link does not differ much. When only a few of the links carry most of the traffic while the other links carry only a small part, the traffic among the links can be regarded as unbalanced. In the embodiments of this application, the sender determining, based on the principle of network balance, which flow packet a data packet should be assigned to means that when there are many data packets, some of them need to be assigned to one flow packet and others to another flow packet according to the network balance parameter. In this way, some data packets can later be sent over the path corresponding to one flow packet and others over the path corresponding to the other flow packet; that is, multiple data packets can be sent over different paths, achieving traffic balance among multiple paths.

The network balance parameter of a data packet can be used to characterize the network resources occupied by the data packet, such as bandwidth resources and time resources. In the embodiments of this application, the main network balance parameters used for traffic balancing may be, for example, one or a combination of the following: the time interval Gap between data packets, the data volume Size of the data packets, the duration Time of the data packets, and the sending frequency of the data packets; of course, other measurable quantities are also possible.

The one or more network balance parameters are used to control the dynamic capability of DLB, i.e., the ability of data packets to flexibly select different paths. Generally, the stronger the dynamic capability, the more likely different data packets are transmitted over different paths; in other words, path switching is easier to trigger. The weaker the dynamic capability, the more often different data packets are carried over the same one or few paths; in other words, path switching is hard to trigger.

Of course, in addition to the network balance parameters listed above, the network balance parameters used for traffic balancing in the embodiments of this application may also be other parameters, for example: the congestion levels of different paths obtained by a switching node from other switching nodes, such as the congestion level of the path between a TOR and an upstream aggregation node obtained by the TOR from that aggregation node, or the congestion level of the path between spine 0 and a TOR obtained by the TOR from spine 0; the depth of a switching node's sending queue; or the characteristics of a switching node's sliding window, such as the sliding window size. The network balance parameters listed in this paragraph can be used to reflect the congestion state of the network. The network balance parameters used for traffic balancing in the embodiments of this application may also include other parameters, which are not listed one by one here.
As pointed out above, in the process where the sender continuously obtains data packets to be sent and creates or maintains different flow packets, if the network balance parameter meets a preset condition, the sender can create a new flow packet and assign the subsequent data packets to be sent to it. The network balance parameter meeting the preset condition includes one or a combination of the following:

1. Within a preset interval starting from the moment the first data packet in the created flow packet is sent, there is no data packet destined for the same destination node. That is, the sender obtains a data packet to be sent and assigns it to a created flow packet; if, starting from the moment this data packet is sent, no other data packet with the same destination node appears within the preset interval, the sender creates a new flow packet and can assign the subsequent data packets outside the preset interval to the newly created flow packet. In this way, two data packets separated by a large time interval can be assigned to different flow packets.

2. The data volume of the created flow packet reaches a preset data volume. The data volume of a created flow packet refers to the data volume of the data packets in it. This condition means that if the data volume of the data packets in a created flow packet reaches the preset data volume, the sender creates a new flow packet. In other words, every flow packet includes data packets of the same, preset data volume. This makes the data volume of the data packets in each flow packet the same, balancing the bandwidth resources occupied by the data packets in different flow packets.

3. The duration of the created flow packet reaches a preset duration. The duration of a created flow packet refers to the duration of the data packets in it; specifically, it is the period of time starting from the sending moment of the first data packet in that flow packet. If the preset duration elapses from the sending moment of the first data packet in a created flow packet, the sender creates a new flow packet and assigns the data packets sent after the preset duration to the newly created flow packet. In this way, the time resources occupied by data packets destined for the same destination node in each flow packet are roughly the same, reducing the probability that the data packets in one or a few flow packets occupy many time resources while those in other flow packets occupy few, and balancing the time resources occupied by data packets in different flow packets.

4. The sending frequency of the created flow packet reaches a preset frequency. The sending frequency of data packets can reflect how fast the data packets are sent; generally, it is affected by the sending time intervals between data packets and is related to the data sending demand and/or other factors. For example, when the above burst mode is used, the sending frequency is not fixed: it may be faster in one period and slower in another. When data is sent in a paced manner, the sending frequency is relatively fixed, and the data packets may be sent at a constant rate. Generally, if data is sent in burst mode and one or more flowlets in the currently created flow packet have a high sending frequency, a new flow packet can be created to reduce the load on the path corresponding to the currently created flow packet, and the flowlets sent thereafter are assigned to the newly created flow packet. This helps balance the load among the paths corresponding to the flow packets and reduces the probability that the path corresponding to one or more flow packets becomes overloaded because of fast-sending flowlets.
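The four trigger conditions above can be sketched as a single check the sender runs before each packet: a new flow packet is started when any configured network-balance threshold is met. All parameter names, units, and threshold values below are illustrative assumptions.

```python
def should_start_new_flow_packet(state, now, cfg):
    """Return True if a new flow packet should be created.

    state: per-flow-packet counters (last_send_time, bytes_sent,
    first_send_time, send_rate in packets/s); cfg: thresholds.
    """
    if now - state["last_send_time"] > cfg["gap"]:          # condition 1: Gap
        return True
    if state["bytes_sent"] >= cfg["max_bytes"]:             # condition 2: Size
        return True
    if now - state["first_send_time"] >= cfg["max_time"]:   # condition 3: Time
        return True
    if state["send_rate"] >= cfg["max_rate"]:               # condition 4: frequency
        return True
    return False
```

Tightening any threshold makes path switching finer-grained (better DLB dynamics) at the cost of more RC work at the receiver, which matches the Gap/Size/Time trade-off described earlier.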
S503. The sender sends the data packets to be sent based on the flow packets to which they belong.

Data packets belonging to different flow packets are sent over different paths.

Compared with the prior art, in which it is difficult to trigger dynamic load balancing of sufficient granularity, the traffic balancing method provided in the embodiments of this application flexibly creates or maintains flow packets and assigns the data packets to be sent to the corresponding flow packets according to their destination nodes. In this way, data packets in the same flow packet can be sent over the same path, and data packets in different flow packets can be sent over different paths; that is, path switching can be controlled dynamically according to the flow packet to which a data packet belongs. Moreover, the granularity of path switching is related to how flow packets are divided: when a flow packet includes fewer data packets, the path-switching granularity is finer, i.e., a path switch can be triggered after sending only a few data packets; when a flow packet includes more data packets, the granularity is coarser, i.e., a path switch is triggered only after sending more data packets. Thus, the traffic balancing method provided in the embodiments of this application can also flexibly control the path-switching granularity by controlling how flow packets are divided, so as to meet the traffic balancing requirements of different application scenarios.
The technical solutions of the embodiments of this application are described below with reference to specific examples, taking the creation or maintenance of a first flow packet and a second flow packet as an example. Referring to FIG. 6, the traffic balancing method provided by an embodiment of this application includes the following steps:

S601. The sender obtains data packets to be sent.

For example, as shown in FIG. 7, the data packets to be sent include 8 data packets.

S602. The sender creates or maintains a first flow packet and assigns the data packets whose destination node is a first node to the first flow packet.

The embodiments of this application mainly take data packets going from the same source node to the same destination node as an example to describe the technical solutions of using the same path or different paths for traffic balancing toward the same destination node. The source node may be node a in FIG. 7 to FIG. 10, and the destination node may be node b in FIG. 7 to FIG. 10; that is, the first node may be node b.

As explained above, in addition to dividing data packets among flow packets according to their destination nodes, the sender may also create or maintain flow packets in combination with the network balance parameters of the data packets, and assign the data packets to be sent to the corresponding flow packets.
若网络均衡参数设置为数据包之间的时间间隔，初始时，发送方创建的第一流包可以仅包括待发送数据包中的一个数据包。比如，如图7所示，发送方获取的待发送数据包包括8个数据包，则初始时，发送方可以将待发送的8个数据包中的第一个数据包（即编号为1的数据包）划入第一流包，并创建仅包括该第一个数据包的第一流包。后续，发送方可以维护所创建的第一流包，并更新第一流包包括的数据包数目，比如，如图7所示，将与数据包1之间的发送时间间隔小于或等于预设间隔的数据包2划入第一流包。类似的，将与数据包2之间的发送时间间隔小于或等于预设间隔的数据包3也划入第一流包。也就是说，发送方检测第一流包中最后一个数据包与下一个待发送数据包之间的发送时间间隔，若下一个待发送数据包与第一流包中最后一个数据包之间的发送时间间隔小于或等于预设间隔，则发送方将该下一个待发送数据包划入第一流包。示例性的，如图7所示，待发送数据包中的数据包1~数据包5中每两个数据包之间的发送时间间隔均小于或等于预设间隔，则发送方将数据包1~数据包5均划入第一流包。
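上述按发送时间间隔维护流包的过程可示意如下。其中split_by_gap为本示例假设的函数名，数据包以0起始的序号表示（与图7中从1开始的编号相差1）：

```python
def split_by_gap(send_times, preset_gap):
    """按相邻数据包的发送时间间隔将数据包划入流包。"""
    flowpacs = [[0]]  # 第一个数据包划入第一流包
    for i in range(1, len(send_times)):
        if send_times[i] - send_times[i - 1] <= preset_gap:
            flowpacs[-1].append(i)   # 间隔不超过预设间隔，划入当前流包
        else:
            flowpacs.append([i])     # 否则创建新的流包
    return flowpacs

# 8个数据包：前5个之间间隔较小，第5个与第6个之间间隔较大
times = [0, 1, 2, 3, 4, 20, 21, 22]
print(split_by_gap(times, preset_gap=5))  # [[0, 1, 2, 3, 4], [5, 6, 7]]
```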
若网络均衡参数为数据包的数据量，初始时，发送方创建的第一流包可以包括预设数据量的数据包，也可以包括少于预设数据量的数据包。以每个数据包大小相同，且单个数据包大小为16KB，设置的预设数据量为64KB（即4个数据包大小）为例，在一个示例中，如图8中(a)所示，发送方在某一时刻，获取数据量较大的待发送数据包，假设为128KB（8个数据包大小），待发送数据包的数据量大于预设数据量。这种情况下，初始时，发送方可以将8个待发送数据包中的4个数据包（即图8中(a)所示数据包1~数据包4）划入第一流包，并创建包括该4个数据包的第一流包。该第一流包的数据量恰好为预设数据量。在另外的示例中，如图8中(b)所示，发送方在某一时刻，获取数据量较小的待发送数据包，假设为32KB（2个数据包大小），待发送数据包的数据量小于预设数据量。这种情况下，初始时，发送方可以将2个待发送数据包先划入第一流包，并创建包括2个数据包的第一流包。之后，发送方继续获取后续时刻的待发送数据包，可以理解的是，若该后续时刻的待发送数据包的数据量大于32KB，则发送方可以仅将该后续时刻的待发送数据包中的32KB数据包划入第一流包，该后续时刻的待发送数据包中的其他数据包可以划入后续创建的其他流包中。反之，若该后续时刻的待发送数据包的数据量小于或等于32KB，则发送方可以将该后续时刻的待发送数据包全部划入第一流包，更新第一流包中数据包的数目，并继续获取待发送数据包，直至第一流包中的数据包的数据量达到如图8中(b)所示的预设数据量。
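上述按预设数据量划分流包的过程可示意如下。其中split_by_size为本示例假设的函数名，对应图8的例子：单包16KB，预设数据量64KB即4个数据包：

```python
def split_by_size(packet_sizes, preset_size):
    """按累计数据量把数据包划入流包，累计达到预设数据量即封闭当前流包。"""
    flowpacs, cur, cur_bytes = [], [], 0
    for i, size in enumerate(packet_sizes):
        cur.append(i)
        cur_bytes += size
        if cur_bytes >= preset_size:   # 数据量达到预设数据量
            flowpacs.append(cur)
            cur, cur_bytes = [], 0
    if cur:
        flowpacs.append(cur)           # 末尾不足预设数据量的数据包仍构成一个流包
    return flowpacs

# 8个16KB的数据包，预设数据量64KB，被划分为两个各含4个数据包的流包
print(split_by_size([16] * 8, 64))
```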
若网络均衡参数设置为数据包的持续时长，初始时，发送方创建的第一流包可以仅包括待发送数据包中的一个数据包。比如，如图9中(a)所示，发送方获取的待发送数据包包括8个数据包，则初始时，发送方可以将待发送的8个数据包中的第一个数据包（即编号为1的数据包）划入第一流包，并创建仅包括该第一个数据包的第一流包。后续，发送方可以维护所创建的第一流包，并更新第一流包包括的数据包数目。比如，如图9中(a)所标注，数据包2的发送时刻在预设时长内，则发送方将数据包2划入第一流包。类似的，数据包3的发送时刻也在预设时长内，则数据包3也被划入第一流包。也就是说，自第一流包中第一个数据包的发送时刻开始，在预设时长内发送的数据包被划入第一流包。
网络均衡参数还可以为发送周期，作为一种可能的实现方式，一个发送周期定义为一段预设时间段，发送方周期性创建流包，即每隔一个发送周期，创建一个流包。当网络均衡参数为发送周期时，初始时，发送方创建的第一流包可以仅包括待发送数据包中的一个数据包。比如，如图9中(b)所示，发送方获取的待发送数据包包括8个数据包，则初始时，发送方可以将待发送的8个数据包中的第一个数据包（即编号为1的数据包）划入第一流包，并创建仅包括该第一个数据包的第一流包。后续，发送方可以维护所创建的第一流包，并更新第一流包包括的数据包数目。比如，如图9中(b)所标注，t1~t2为第一流包的发送周期，t2~t3为第二流包的发送周期，数据包2的发送时刻在第一流包的发送周期内，则发送方将数据包2划入第一流包。类似的，数据包3的发送时刻也在第一流包的发送周期内，则数据包3也被划入第一流包。
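上述按固定发送周期划分流包的方式可示意如下。其中flowpac_index_by_period为本示例假设的函数名，返回某一发送时刻的数据包所属流包的序号：

```python
def flowpac_index_by_period(send_time, start_time, period):
    """按发送周期划分：同一周期内发送的数据包划入同一流包。"""
    return int((send_time - start_time) // period)

# 周期为5个时间单位：t=3落在第0个周期（第一流包），t=7落在第1个周期（第二流包）
print(flowpac_index_by_period(3, 0, 5))
print(flowpac_index_by_period(7, 0, 5))
```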
若网络均衡参数为数据包的发送频率，初始时，发送方创建的第一流包可以包括待发送数据包中的一个数据包。比如，如图10所示，发送方获取的待发送数据包包括8个数据包，则初始时，发送方可以将待发送的8个数据包中的第一个数据包（即编号为1的数据包）划入第一流包，并创建仅包括该第一个数据包的第一流包。后续，发送方可以维护所创建的第一流包，并更新第一流包包括的数据包数目。比如，如图10所标注，数据包2和数据包3与数据包1属于同一flowlet 1，且数据包2与数据包1之间的发送时间间隔较小（并未达到预设发送间隔），数据包2与数据包3之间的发送时间间隔也较小，说明flowlet 1的发送频率较小，则发送方将属于flowlet 1的数据包2和数据包3划入第一流包。类似的，之后发送的flowlet 2的发送频率并未达到预设频率，则属于flowlet 2的数据包4~6也被划入第一流包。
网络均衡参数还可以为数据包的数据量和数据包之间的时间间隔两者的组合。这种情况下,第一流包中的数据包中每两个数据包之间的发送时间间隔均小于或等于预设间隔,且第一流包的数据量为预设数据量。第一流包的具体创建和维护过程可参见上述网络均衡参数分别为数据包的数据量和时间间隔的创建和维护过程,这里不再赘述。
当然,网络均衡参数还可以是上述多个网络均衡参数中其他两个或两个以上的组合。这些情况下,第一流包的创建和维护过程也可参见上文介绍,这里不再赘述。
S603、发送方判断网络均衡参数是否满足预设条件。
其中,网络均衡参数满足预设条件,指的是如下一项或多项的组合:
从发送第一流包中第一数据包的时刻开始的预设间隔内，不存在去往同一目的节点的数据包，其中，第一数据包为第一流包中的某一数据包；所述第一流包的数据量达到预设数据量；所述第一流包的持续时长达到预设时长；所述第一流包的发送频率达到预设频率。当然，如上文所阐述，网络均衡参数满足预设条件，还可以是其他情况，这里不再一一列举。
S604、若网络均衡参数满足预设条件，则发送方创建或维护第二流包，并将之后发送的以第一节点为目标的数据包，划入所述第二流包。
以网络均衡参数设置为数据包之间的时间间隔为例，发送方创建或维护第一流包，若发送方检测到从发送第一流包中某一数据包的时刻开始的预设间隔内，不存在去往同一目的节点的数据包，则创建第二流包，并可以将预设间隔之后发送的数据包划入第二流包。如图7所示，发送方已创建和维护有第一流包，第一流包包括数据包1~数据包5，发送方检测到自从数据包5的发送时刻开始，预设间隔内不存在去往节点b的数据包，则创建第二流包，并将预设间隔之后发送的数据包6划入第二流包，数据包6为第二流包中的首个数据包。其中，第二流包的具体维护过程可参见网络均衡参数为数据包之间时间间隔时上述第一流包的维护过程。
本申请实施例中,预设间隔G的数值可以较现有技术中flowlet所要求的G小(可以依据经验或采用其他方式确定该数值),以避免实际应用中,过大G往往难以触发路径切换的现象。
若网络均衡参数为数据包的数据量,参见图8中(a)或图8中(b),发送方创建或维护第一流包,当发送方检测到第一流包的数据量达到预设数据量时,则创建第二流包,并将之后发送的数据包划入第二流包。
如此,当第一流包与第二流包的时间间隔较短,即第一流包中最后一个数据包与第二流包中第一个数据包之间的发送时间间隔较短,无法满足时间间隔大于或等于预设间隔(比如5us)的条件时,可通过每发送预设数据量(比如256KB)的数据包构建一个流包,提升DLB的动态能力。其中,可以根据实际经验和统计信息确定预设间隔在满足什么条件时“较小”,比如,预设间隔小于第一阈值时,则视为预设间隔较小。第一阈值可灵活设置,本申请实施例对此不进行限制。
若网络均衡参数设置为数据包的持续时长,参见图9中(a),发送方创建或维护第一流包,当发送方检测到自从发送第一流包中的第一个数据包的时刻开始,预设时长到达,则发送方创建第二流包,并将该预设时长之后发送的数据包划入第二流包。
网络均衡参数还可以为发送周期,参见图9中(b),发送方创建或维护第一流包,当发送方检测到发送第一流包的发送周期结束,则发送方创建第二流包,并将第一流包的发送周期之后发送的数据包划入第二流包。
在另一些实施例中，发送方还可以配置其他网络均衡参数，并检测参数是否满足预设条件，以判断是否创建第二流包。比如，发送方配置当前路径（比如第一路径）的拥塞程度这一参数，当发送方检测到当前已创建流包中数据包通过链路发送时的拥塞程度大于或等于拥塞阈值，则创建第二流包。
需要说明的是,本申请实施例对发送方配置何种参数、以及配置参数的个数不进行限制。
当然，发送方还可以结合上述两个或多个条件来判断是否进行路径切换。比如，发送方检测到第一流包的数据量较大（即达到预设数据量），比如达到5个数据包，且自从发送第一流包的第五个数据包的时刻开始，预设间隔内不存在去往同一目的节点的数据包，才创建新的第二流包。在另一些实施例中，可以配置预设数据量S的数值较小，预设间隔G（比如50us）的数值较大。这样一来，当某一数据流的数据量很少时，可通过预设间隔G触发创建新的第二流包。即，当两个流包之间被不小于50us的空闲时间隔开时，触发创建新的第二流包。其中，第一流包与第二流包之间的空闲时间，指的是第一流包在这段空闲时间内无数据包，但这段空闲时间内允许发送其它流包的数据包。参见上文描述，关于预设数据量S的数值较小，以及预设间隔G的数值较大的具体定义，可以根据实际应用和算法确定，比如，设置阈值，当预设数据量S小于或等于该阈值时，就视为预设数据量S较小。
在一种示例性的参数配置方式中，还可以配置预设间隔G=infinity，预设数据量S=infinity，预设时长T=infinity。这意味着，发送方始终保持SLB，即，不创建新的第二流包，将属于同一数据流的数据包均划入第一流包。
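G、S、T三个参数联合作用以及"均为infinity时退化为SLB"的效果，可用如下示意性草图表示。其中should_switch为本示例假设的函数名，并非本申请限定的实现：

```python
import math

def should_switch(idle_gap, flowpac_bytes, duration,
                  G=math.inf, S=math.inf, T=math.inf):
    """任一参数达到阈值即触发创建新流包（切换路径）。
    G/S/T均为无穷大时，任何条件都不满足，即始终保持SLB。"""
    return idle_gap >= G or flowpac_bytes >= S or duration >= T

print(should_switch(1000, 10**9, 10**9))        # G=S=T=infinity：始终不切换，保持SLB
print(should_switch(60, 0, 0, G=50))            # 空闲时间不小于50us，触发创建新流包
```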
S605、发送方基于待发送数据包所属流包,发送待发送数据包。
其中,所述第二流包中的数据包的发送路径,与所述第一流包中的数据包的发送路径不同。
需要说明的是,第一流包中的一个或多个数据包可以来自同一数据流,也可以来自不同数据流。即第一流包中的数据包可以为某一数据流中的部分数据包,也可以为多个数据流中的数据包(当然,要求在同一流包中的这些数据包必须要去往相同的目标设备)。类似的,第二流包中的一个或多个数据包可以来自同一数据流,也可以来自不同数据流。
第一流包和第二流包可以为同一数据流中的数据包。第一流包和第二流包也可以来自不同数据流。比如,所述第一流包为数据流1中的数据包,且所述第二流包为数据流2中的数据包;或者,所述第一流包为数据流2中的数据包,且所述第二流包为数据流1中的数据包;或者,所述第一流包包括所述数据流1中的数据包和所述数据流2中的数据包;或者,所述第二流包包括所述数据流1中的数据包和所述数据流2中的数据包。
可见，通过设置不同参数（比如设置时间间隔Gap或者持续时长Time）可以实现不同性能的DLB。或者，通过相同参数设置不同的数值，也可以实现不同性能的DLB。并且，设置的这些参数可以分别作用，也可以联合作用。示例性的，仅设置时间间隔这一参数时，若时间间隔设置为5us，每隔5us可能就会触发一次路径切换，DLB的性能较高。相应的，接收端可能需较多排序资源。当时间间隔设置为100us时，每隔100us可能才触发一次路径切换，DLB性能较低。相应的，接收端所需的排序资源可能较少。
在另一些实施例中,可以设置多套参数。一套参数指的是一个或多个参数,比如,第一套参数包括时间间隔、持续时长。第二套参数包括时间间隔、数据量。同一数据流使用一套参数。不同的数据流可以采用不同的参数。比如,数据流1使用第一套参数进行流量均衡,数据流2使用第二套参数。当然,不同数据流也可以使用同一套参数,本申请实施例对此不进行限制。
另外,实际应用中,不同的数据流可以基于不同参数来实现流量均衡,以满足业务差异化的要求。比如,针对数据流1,发送方检测数据包之间的时间间隔是否满足上述预设间隔,从而为该数据流1创建一个或多个流包,以进行流量均衡。针对数据流2,发送方检测数据包的数据量这一参数,每发送256KB数据,就切换一次路径,从而动态为数据流2中的数据包选择不同路径。
在一些实施例中,发送方需区分不同的流包。即,对不同流包进行定界。定界指的是,明确区分出数据包属于哪一流包。后续,接收方可以基于该定界对数据包执行重排序(re-ordering)。示例性的,以第一流包和第二流包为例,发送方可以采用如下任一种方式区分第一流包和第二流包。
1、发送方通过重新封装第一流包或第二流包的数据包，来区分第一流包和第二流包。比如，可以在数据包中新增一些标签字段，并为字段设置一定数值来标识数据包属于哪一流包。或者，直接复用现有的字段，并为字段设置新的数值来标识数据包。
相应的,接收方解封装数据包,从数据包中的相应字段获知该数据包属于哪一流包。接收方还可以根据所述定界标识判断是否接收到所述第一流包的尾包。
具体的,发送方封装数据包的具体实现方式可以为:所述第一流包中至少一个数据包携带定界标识,所述第二流包中至少一个数据包携带定界标识,所述定界标识用于区分所述第一流包和所述第二流包。
可选的，定界标识可以为流包序列号，用以显式指示数据包所归属的流包。同一流包中的数据包均携带相同的流包序列号（flowpac sequence，FSN）。比如flowpac 0内的所有数据包的FSN字段的值均为0，flowpac 1内的所有包的FSN字段的值均为1，依此类推。其中，FSN位宽（即所需的比特位数）需使得接收侧能区分经不同路径接收到的不同flowpac，比如可以为4bit。
示例性的，如果流包只携带同一flow的数据，接收方在接收到当前数据包并解封装该当前数据包后，若首次发现FSN字段的数值为3，则接收方判断该当前数据包为流包3的首包（first packet）。
在一些实施例中,接收方一旦接收到某一流包中携带尾包标识的尾包,则将尾包信息存储在排序流表(re-ordering table,ROT)中。
后续,接收方一旦接收新的数据包,其可以查询该排序流表,若能够查询到上一流包的尾包信息,说明接收方已接收到上一流包的尾包,无需对新数据包所属flowpac进行重排序。比如,接收方接收到FSN=4的数据包,其查询排序流表发现已接收到FSN=3的尾包,由于流包3、4是按照先发先至、后发后至顺序接收的,则接收方无需执行重排序。
反之，参见图11，接收方在接收到新流包（比如第二流包）之后，若接收方通过查询排序流表确定未接收到所述上一流包，即第一流包的尾包，则说明第一流包、第二流包并未按照先发先至、后发后至的顺序到达接收方，导致乱序。因此，接收方通过RC执行重排序。作为一种可能的实现方式，接收方在接收到所述第一流包的尾包后，释放所述RC。容易理解的是，接收方接收到第一流包的尾包，说明第一流包已接收完毕，这样一来，接收方将第二流包排序在第一流包之后。如此，第一流包和第二流包之间不再存在乱序问题，可以释放该RC，并将释放的RC用于其他重排序流程，从而提升RC的利用率。由于闲置RC还可以用于其他重排序流程，相当于拓展了可用RC的数目。
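接收方基于FSN与排序流表判断是否需要重排序的逻辑可示意如下。其中以Python集合模拟排序流表，on_tail_packet、need_reorder均为本示例假设的名称：

```python
rot = set()          # 排序流表（ROT）：记录已接收到尾包的流包FSN

def on_tail_packet(fsn):
    """接收到某流包携带尾包标识的尾包时，将尾包信息存入排序流表。"""
    rot.add(fsn)

def need_reorder(fsn):
    """收到FSN为fsn的新流包时，若上一流包(fsn-1)的尾包已在表中，
    说明两个流包按先发先至、后发后至的顺序到达，无需重排序。"""
    return (fsn - 1) not in rot

on_tail_packet(3)
print(need_reorder(4))   # False：已收到流包3的尾包，无需重排序
print(need_reorder(6))   # True：未收到流包5的尾包，需通过RC重排序
```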
在一些实施例中，接收方在发生路径切换的情况下才需占用RC执行重排序。也就是说，如果两条路径的时延差最多为20us，则发生路径切换的流包只需占用RC大约20us的时间，即保证从原路径上收到先发后至的数据后（最多只需要20us），即可释放RC。相比于packet spray机制在整个数据流的生命周期内（可以达到ms甚至s级）均需占用RC，本申请实施例中的重排序流程能够降低RC消耗。
更进一步的，接收方的上述重排序流程中，仅在产生乱序的情况下，即在接收新流包时，还未完成上一流包的接收，才启用RC资源，并在确定无乱序问题后，立即释放RC，RC的排序周期缩短，能够降低RC的消耗，提升RC的利用率。其中，排序周期可以指接收方确定产生乱序的时刻至释放RC的时刻。当然，也可以指接收方确定存在通过切换路径的流包的时刻至释放RC的时刻。本申请实施例对排序周期具体为哪一时间段不进行限制。
采用本申请实施例提供的流量均衡方法，发送侧可通过不同的参数，如Gap、Size、Time，来控制DLB的动态能力，比如，可以控制发生路径切换的流包的数目。进一步，基于发生路径切换的流包，还能控制所需RC数量，和/或RC的排序周期。并且，参数数值不同，和/或，参数类型（比如参数类型为Time，或者Gap）不同时，负载均衡的性能可能不同。举例来说，当数据量的数值较大时，每发送较大的数据量，才能触发一次路径切换，DLB的动态性能较差，但是，接收方所需的RC较少，RC消耗少。
在一些实施例中,可以基于接收方可用RC的数目来选择不同的排序策略:
1、若可用RC的数目大于或等于RC数目阈值,则通过不同RC重排序不同数据流的数据包。即每一数据流可以对应一个RC。
2、若可用RC的数目小于RC数目阈值,则通过相同RC重排序不同数据流的数据包。即一条数据流(比如包括上述第一流包和第二流包的数据流)或多条数据流共用一路RC。
在一些实施例中，可使用如下动态分配RC（dynamic re-ordering channel，DRC）算法，用于进一步控制RC的数量。
可预设门限N1,比如N1=512。
当可用的RC数量大于或等于N1,则待排序的数据流都可独占一个RC。
否则,当可用的RC数量小于N1,则一条数据流(比如包括上述第一流包和第二流包的数据流)或多条数据流共用同一RC。
采用上述DRC算法,可以用有限的RC资源来保证大部分流或全部流都不出现乱序。另外,接收方也可以将可用RC资源反馈给发送方,以保证发送的flowpac都有对应的RC可用。
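上述DRC算法的一种示意性实现如下。其中assign_rc为假设的函数名，流以整数ID表示，共用RC时采用的取模方式也仅为本示例的一种假设：

```python
def assign_rc(flow_ids, available_rc, N1=512):
    """可用RC数量不小于门限N1时，每条待排序的数据流独占一个RC；
    否则多条数据流通过取模共用有限的RC。"""
    if available_rc >= N1:
        return {f: i for i, f in enumerate(flow_ids)}   # 每条流独占一个RC
    return {f: f % max(available_rc, 1) for f in flow_ids}

print(assign_rc([10, 11], 512))      # RC充足：每条流独占一个RC
print(assign_rc([10, 11, 12], 2))    # RC不足：流10与流12共用RC 0
```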
可选的，用于区分第一流包和第二流包的定界标识可以为首包中的某些预设标识。即发送方只在每个flowpac的首包携带预设标识。比如，将首包的切换完成（SwitchOver，SO）字段的值设置为1，用以表示该数据包为新流包的首包，还用于表示该流包与上一流包采用不同路径发送，即该流包相对于上一流包发生了一次路径切换。或者，首包携带其他形式的首包标识，比如首包（first packet，FP）字段值为1。或者，首包携带包类型（packettype）字段，例如以packettype=0表示首包。
相应的,接收方在接收到当前数据包之后,若解封装该当前数据包后,该当前数据包的SO字段的数值为1,且接收方已经接收到第一流包的一个或多个数据包,则接收方判断该当前数据包为第二流包的首包。
与上述类似,接收方可以通过查询排序流表判断是否执行重排序。比如,接收方接收到SO=1的当前数据包,其查询排序流表发现已接收到上一流包的尾包,由于上一流包和当前流包是按照先发先至、后发后至顺序接收的,则接收方无需执行重排序。
如此,仅通过首包就可以区分出当前数据包属于哪一flowpac,能够降低控制信息的开销,从而提升有效的数据信息的传输效率。
可选的，定界标识可以为尾包中的某些预设标识。即发送方只在每个flowpac的尾包携带尾包标识，或者，携带包类型标识，包类型标识用于指示数据包为尾包。比如，以packettype=3表示尾包。
可选的,packettype的其它值可以表示flowpac内首、尾包之间的包。
可选的，定界标识可以为包序号（packet sequence，PSN）。其中，所述第二流包的首包携带的PSN为M，M为正整数，所述第二流包的第i个数据包携带的PSN为i-1，i为大于1的整数，M为所述第二流包中数据包数目与1的差值。
比如,flowpac 3包括100个数据包,则首包中PSN=99,第二个数据包中PSN=1,第三个数据包中PSN=2,第四个数据包中PSN=3,依次类推,第100个数据包中PSN=99。如此,当接收方接收到具有相同PSN的数据包时,比如,接收方先接收到flowpac 3中的首包(PSN=99),之后,再次接收到PSN同样为99的数据包时,接收方可判断该数据包为flowpac 3的尾包。之后,接收方可准备接收下一流包。
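上述PSN编号规则及尾包检测方式可示意如下。其中psn_sequence为本示例假设的函数名：

```python
def psn_sequence(num_packets):
    """首包PSN为包数与1的差值，第i个包(i>1)的PSN为i-1；
    接收方再次收到与首包相同的PSN时，即可判断该包为尾包。"""
    return [num_packets - 1] + [i - 1 for i in range(2, num_packets + 1)]

seq = psn_sequence(100)
print(seq[0])     # 99：首包PSN
print(seq[1])     # 1：第二个数据包PSN
print(seq[-1])    # 99：尾包PSN与首包相同
```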
可选的,定界标识还可以为第二流包首包携带的所述第一流包中尾包的特征值,所述特征值包括循环冗余校验(cyclic redundancy check,CRC)校验值。即在当前流包的首包中携带上一流包的尾包的一些信息。
当然,发送方可以将上述多个定界标识结合,比如,发送方在数据包中携带首包标识和尾包标识。
2、可以在两个flowpac之间插入专用的定界包，来对第一流包和第二流包进行定界，使接收方能够据此确定第一流包是否已接收完毕。
具体的，当确定开始发送新的流包，并且需切换路径，发送方先通过原路径，即第一路径发送定界包（flowpac delimiter，FPD），再通过新路径，即第二路径发送新流包，即第二流包。
所述定界包用于区分所述第一流包和所述第二流包。定界包为通过第一路径接收,且为第一流包的尾包的下一数据包。定界包可以为具有预设特征的控制包。比如,可以是包含预设字段的控制包,或者预设大小的控制包。本申请实施例对定界包的具体实现不进行限制。
可选的,定界包的大小较小(比如可以小于某一阈值),以使得定界包能够以较低时延到达接收方。比如,使得定界包比第二流包早些到达接收方。当然,定界包也可以比第二流包晚些到达接收方,本申请实施例对定界包到达接收方的时机并不进行限制。
相应的,接收方当前正在通过第一路径接收第一流包,后续,接收方还通过第一路径接收定界包,则接收方可确定第一流包已接收完毕。在一种示例中,接收方在接收定界包之后(第一流包已接收完毕),才接收到第二流包,第一流包和第二流包通常不存在乱序问题。进而,接收方通常无需重排序,能够降低RC消耗。
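"先经原路径发送定界包、再经新路径发送新流包"的发送顺序可示意如下。其中send、switch_with_delimiter等名称以及用列表记录发送动作的方式均为本示例的假设：

```python
log = []

def send(path, item):
    """假设的发送函数：仅把（路径, 内容）记录下来以便观察发送顺序。"""
    log.append((path, item))

def switch_with_delimiter(old_path, new_path, new_flowpac):
    send(old_path, "FPD")       # 先通过原路径发送定界包
    for pkt in new_flowpac:
        send(new_path, pkt)     # 再通过新路径发送第二流包

switch_with_delimiter("path1", "path2", ["pkt6", "pkt7"])
print(log)  # [('path1', 'FPD'), ('path2', 'pkt6'), ('path2', 'pkt7')]
```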
需要说明的是,本申请实施例中所提及的字段名称均为示例性的名称,在实际实现时还可以为其他名称,本申请实施例对此不进行限制。
上述主要描述对同一数据流进行流量均衡,以及该数据流中数据包的定界、RC、DRC方式。
在另一些实施例中，还可以将多条数据流汇聚（也称聚合（merge））为一条粗粒度的流（比如通过某种hash算法）。示例性的，用预设算法（比如hash算法，或其他类似算法）对待排序的数据流做聚合。该汇聚形成的粗粒度的流在本文中可称为汇聚流。在本申请实施例中，可以针对汇聚流执行上述流量均衡方法，该汇聚流中数据包的定界、RC、DRC方式的具体实现方式可参考上文。
基于此,本文内的“数据流”,即可以指如上文所提及的通常意义的TCP流,也可以指汇聚流。
作为一种可能的实现方式,汇聚流可以共用同一RC。其中,共用同一RC的流之间可能出现头阻塞(head of line blocking,HOL)。因此,接收端只有确定已接收到上一flowpac对应的尾包,才处理下一flowpac。
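将多条数据流汇聚为少量汇聚流的做法可粗略示意如下。这里用对流标识（如五元组）求和再取模来代替真实的hash算法，merge_flows为假设的函数名，仅为演示：

```python
def merge_flows(flow_keys, num_merged):
    """把每条流的标识映射到 num_merged 条汇聚流之一（假设的简化hash）。"""
    merged = {}
    for key in flow_keys:
        bucket = sum(key) % num_merged
        merged.setdefault(bucket, []).append(key)
    return merged

flows = [(1, 2, 3), (4, 5, 6), (2, 2, 2)]
print(merge_flows(flows, 2))   # 和为6、15、6：第1、3条流汇入同一条汇聚流
```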
上述主要从不同网元之间交互的角度对本申请实施例提供的方案进行了介绍。可以理解的是,发送节点和接收节点为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。结合本申请中所公开的实施例描述的各示例的单元及算法步骤,本申请实施例能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。本领域技术人员可以对每个特定的应用来使用不同的方法来实现所描述的功能,但是这种实现不应认为超出本申请实施例的技术方案的范围。
本申请实施例可以根据上述方法示例对发送节点和接收节点等进行功能单元的划分,例如,可以对应各个功能划分各个功能单元,也可以将两个或两个以上的功能集成在一个处理单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。需要说明的是,本申请实施例中对单元的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
图12示出了本申请实施例中所涉及的一种流量均衡装置的一种可能的示例性框图,该装置1200可以以软件的形式存在,也可以为网络设备,还可以为可以用于网络设备的芯片。装置1200包括:处理单元1202和通信单元1203。处理单元1202用于对装置1200的动作进行控制管理,例如,若该装置用于实现发送节点功能,处理单元1202用于支持装置1200执行图5中的S502,图6中的S602、S603、S604,和/或用于本文所描述的技术的其它过程。若该装置用于实现接收节点功能,处理单元1202用于支持装置1200判断在接收到第二流包的情况下,是否已接收到第一流包的尾包,若未接收到第一流包的尾包,则通过排序通道RC对第一流包和第二流包中的数据包执行重排序,并在重排序之后,若确定已接收到第一流包的尾包,则释放所述RC,和/或用于本文所描述的技术的其它过程。通信单元1203用于支持装置1200与其他网络实体的通信。装置1200还可以包括存储单元1201,用于存储装置1200的程序代码和数据。
其中,处理单元1202可以是处理器或控制器,例如可以是CPU,通用处理器,DSP,ASIC,FPGA或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。所述处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,DSP和微处理器的组合等等。通信单元1203可以是通信接口、收发器或收发电路等,其中,该通信接口是统称,在具体实现中,该通信接口可以包括多个接口,例如可以包括:发送节点和接收节点之间的接口和/或其他接口。存储单元1201可以是存储器,或者其他形式的存储设备。
当处理单元1202为处理器，通信单元1203为通信接口，存储单元1201为存储器时，本申请实施例所涉及的装置1200可以为具有图13所示结构的装置。
参阅图13所示,该装置1300包括:处理器1302、通信接口1303、存储器1301。可选的,装置1300还可以包括总线1304。其中,通信接口1303、处理器1302以及存储器1301可以通过总线1304相互连接;总线1304可以是外设部件互连标准(Peripheral Component Interconnect,简称PCI)总线或扩展工业标准结构(Extended Industry Standard Architecture,简称EISA)总线等。所述总线1304可以分为地址总线、数据总线、控制总线等。为便于表示,图13中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
作为另一种可能的实现方式,在采用对应各个功能划分各个功能模块的情况下,若流量均衡装置用于实现发送节点功能,图14示出了上述实施例中所涉及的实现发送节点功能的装置的另一种可能的结构示意图。流量均衡装置1400可以包括:第一模块1401、第二模块1402和第三模块1403。第一模块1401用于支持流量均衡装置1400执行图5中的S501、图6中的S601,和/或用于本文所描述的方案的其它过程。第二模块1402用于支持流量均衡装置1400执行图5中的过程S502,图6中的S602~S604,还用于基于网络均衡的原则为流包中的数据包设定发送路径,和/或用于本文所描述的方案的其它过程。第三模块1403用于支持流量均衡装置1400执行图5中的过程S503,图6中的S605,还用于通过所述第一路径发送定界包,所述定界包为用于区分所述第一流包和所述第二流包的控制包,所述第一路径为发送第一流包中数据包的路径,和/或用于本文所描述的方案的其它过程。其中,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。当然,为了实现本申请实施例的技术方案,流量均衡装置还可能包括其他模块,这里就不再赘述。
若流量均衡装置用于实现接收节点功能,图15示出了上述实施例中所涉及的实现接收节点功能的装置的另一种可能的结构示意图。流量均衡装置1500可以包括:第四模块1501、第五模块1502和第六模块1503。第四模块1501用于支持流量均衡装置1500判断在接收到第二流包的情况下,是否已接收到第一流包的尾包,和/或用于本文所描述的方案的其它过程。第五模块1502用于支持流量均衡装置1500在未接收到第一流包的尾包的情况下,通过排序通道RC对第一流包和第二流包中的数据包执行重排序,和/或用于本文所描述的方案的其它过程。第六模块1503用于支持流量均衡装置1500在重排序之后,若确定已接收到第一流包的尾包,则释放所述RC,和/或用于本文所描述的方案的其它过程。
本领域普通技术人员可以理解：在上述实施例中，可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时，可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时，全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中，或者从一个计算机可读存储介质向另一个计算机可读存储介质传输，例如，所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线（例如同轴电缆、光纤、数字用户线（Digital Subscriber Line，DSL））或无线（例如红外、无线、微波等）方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包括一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质（例如，软盘、硬盘、磁带）、光介质（例如，数字视频光盘（Digital Video Disc，DVD））、或者半导体介质（例如固态硬盘（Solid State Disk，SSD））等。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络设备(例如终端)上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个功能单元独立存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到本申请可借助软件加必需的通用硬件的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在可读取的存储介质中,如计算机的软盘,硬盘或光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述的方法。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,在本申请揭露的技术范围内的变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (12)

  1. 一种网络设备,其特征在于,包括:
    第一模块,用于获取待发送的数据包;
    第二模块,用于创建或维护流包,并将待发送的数据包按照目的节点划入对应的流包;
    第三模块,用于基于待发送的数据包所属的流包,发送所述待发送的数据包,
    其中,一个流包包括至少一个数据包,属于同一流包中的数据包的目的节点相同,且属于同一流包中的数据包的发送路径相同。
  2. 根据权利要求1所述的网络设备,其特征在于,所述第二模块还用于:
    创建或维护第一流包;
    将以第一节点为目标的数据包,划入所述第一流包;
    创建或维护第二流包;
    将之后发送的以第一节点为目标的数据包,划入所述第二流包,
    其中,所述第二流包中的数据包的发送路径,与所述第一流包中的数据包的发送路径不同。
  3. 根据权利要求2所述的网络设备,其特征在于,所述第二模块,还用于:
    判断网络均衡参数是否满足预设条件;
    若网络均衡参数满足预设条件,则创建或维护所述第二流包;
    所述网络均衡参数用于基于网络均衡原则将数据包划入对应的流包;
    所述网络均衡参数满足预设条件包括:
    从发送第一流包中第一数据包的时刻开始的预设间隔内,不存在去往同一目的节点的数据包;
    所述第一流包的数据量达到预设数据量;
    所述第一流包的持续时长达到预设时长;
    所述第一流包的发送频率达到预设频率。
  4. 根据权利要求2或3所述的网络设备,其特征在于,所述第一流包中至少一个数据包携带第一定界标识,所述第二流包中至少一个数据包携带第二定界标识,所述第一定界标识和所述第二定界标识用于区分所述第一流包和所述第二流包。
  5. 根据权利要求2至4中任一项所述的网络设备,其特征在于,所述第三模块,还用于通过第一路径发送定界包,所述定界包为用于区分所述第一流包和所述第二流包的控制包,所述第一路径为发送第一流包中数据包的路径。
  6. 根据权利要求1至5中任一项所述的网络设备,其特征在于,一个流包内的数据包来自相同或不同数据流。
  7. 根据权利要求1至6中任一项所述的网络设备,其特征在于,所述第二模块,还用于基于网络均衡的原则为流包中的数据包设定发送路径。
  8. 一种网络设备,其特征在于,包括:
    第四模块,用于判断在接收到第二流包的情况下,是否已接收到第一流包的尾包;
    第五模块,用于若未接收到第一流包的尾包,则通过排序通道RC对第一流包和第二流包中的数据包执行重排序;
    第六模块,用于在重排序之后,若确定已接收到第一流包的尾包,则释放所述RC。
  9. 一种电子设备,其特征在于,包括处理器和存储设备,所述存储设备用于存储指令,所述处理器用于基于所述指令执行下列动作:
    创建或维护第一流包;
    将以第一节点为目标的数据包,划入所述第一流包;
    在网络均衡参数满足预设条件时,创建或维护第二流包;
    将之后来的以第一节点为目标的数据包,划入所述第二流包,
    其中,属于同一流包中的数据包的目的节点相同,属于同一流包中的数据包的发送路径相同;所述第二流包中的数据包的发送路径,与所述第一流包中的数据包的发送路径不同。
  10. 一种电子设备,其特征在于,包括处理器和存储设备,所述存储设备用于存储指令,所述处理器用于基于所述指令执行下列动作:
    判断在接收到第二流包的情况下,是否已接收到第一流包的尾包;
    若未接收到第一流包的尾包,则通过排序通道RC对第一流包和第二流包中的数据包执行重排序;
    在重排序之后,若确定已接收到第一流包的尾包,则释放所述RC。
  11. 一种流量均衡方法,其特征在于,包括:
    创建或维护第一流包;
    将以第一节点为目标的数据包,划入所述第一流包;
    在网络均衡参数满足预设条件时,创建或维护第二流包;
    将之后来的以第一节点为目标的数据包,划入所述第二流包,
    其中,属于同一流包中的数据包的目的节点相同,属于同一流包中的数据包的发送路径相同;所述第二流包中的数据包的发送路径,与所述第一流包中的数据包的发送路径不同。
  12. 一种流量均衡方法,其特征在于,包括:
    判断在接收到第二流包的情况下,是否已接收到第一流包的尾包;
    若未接收到第一流包的尾包,则通过排序通道RC对第一流包和第二流包中的数据包执行重排序;
    在重排序之后,若确定已接收到第一流包的尾包,则释放所述RC。
PCT/CN2019/100270 2019-08-12 2019-08-12 流量均衡方法、网络设备及电子设备 WO2021026740A1 (zh)

Publications (1)

Publication Number Publication Date
WO2021026740A1 true WO2021026740A1 (zh) 2021-02-18
