WO2021232190A1 - Forward path planning method in massive data center networks - Google Patents

Forward path planning method in massive data center networks

Info

Publication number
WO2021232190A1
Authority
WO
WIPO (PCT)
Prior art keywords
forward path
computing node
data
data packet
destination device
Prior art date
Application number
PCT/CN2020/090827
Other languages
French (fr)
Inventor
Jianguo Liang
ChenChen QI
Haiyang ZHENG
Xuemei SHI
Original Assignee
Alibaba Group Holding Limited
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Limited filed Critical Alibaba Group Holding Limited
Priority to CN202080100357.0A priority Critical patent/CN115462049B/en
Priority to PCT/CN2020/090827 priority patent/WO2021232190A1/en
Publication of WO2021232190A1 publication Critical patent/WO2021232190A1/en

Classifications

    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
          • H04L45/00 Routing or path finding of packets in data switching networks
            • H04L45/12 Shortest path evaluation
              • H04L45/121 Shortest path evaluation by minimising delays
              • H04L45/123 Evaluation of link metrics
            • H04L45/28 Routing or path finding of packets in data switching networks using route fault recovery
            • H04L45/74 Address processing for routing
              • H04L45/745 Address table lookup; Address filtering
                • H04L45/7453 Address table lookup; Address filtering using hashing
          • H04L47/00 Traffic control in data switching networks
            • H04L47/10 Flow control; Congestion control
              • H04L47/11 Identifying congestion
              • H04L47/12 Avoiding congestion; Recovering from congestion
                • H04L47/122 Avoiding congestion; Recovering from congestion by diverting traffic away from congested entities
          • H04L49/00 Packet switching elements
            • H04L49/15 Interconnection of switching modules
              • H04L49/1553 Interconnection of ATM switching modules, e.g. ATM switching fabrics
                • H04L49/1569 Clos switching fabrics
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
          • Y02D30/00 Reducing energy consumption in communication networks

Definitions

  • Equal Cost Multipath (ECMP) path planning algorithm enables the usage of multiple equal cost paths from the source node to the destination node in the network. The advantage of using this algorithm is that the data flows can be split more uniformly across the whole network, thus avoiding congestion and increasing bandwidth utilization.
  • multiple servers connect to the network via a first tier switch, i.e., a leaf switch or an access switch.
  • the server forwards the data packets to various paths in the network to their respective destinations.
  • the data packet forwarding may be determined based on the “server-version” topology information and pre-calculated values associated with the various paths.
  • each of the leaf switches performs dynamic path planning to distribute the data packets based on the “switch-version” topology information and the path planning algorithm.
  • the “server-version” topology information may not represent the real-time network topology information.
  • the server cannot determine the source of the network congestion and respond in a timely manner to re-route the data flow.
  • the present disclosure implements the dynamic path planning capability of a switch (i.e., a leaf switch or an access switch of a two-tier or three-tier Clos network) on a computing node (i.e., a server device) that accesses the network via the switch.
  • the computing node communicates with one or more switches to obtain information related to the path planning algorithm or the routing algorithm used by the switch and the network topology associated with the switch through various protocols, such as Link Layer Discovery Protocol (LLDP) .
  • the computing node further configures the path planning algorithm or the routing algorithm used therein based on the information related to the path planning algorithm or the routing algorithm used by the switch.
  • the path planning algorithm or the routing algorithm used by the switch may include an equal cost multipath (ECMP) planning algorithm.
  • the computing node further synchronizes the network topology associated with the computing node with the network topology associated with the switch.
  • the present disclosure enables the computing node to perform the dynamic path planning for received data packets ahead of the switch in massive data center networks, hence effectively avoiding data flow collisions in the network. Further, the computing node according to the present disclosure can efficiently detect network congestion and respond in a timely manner by re-routing the data flow so as to bypass the network congestion.
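  • As a minimal illustrative sketch of this host-side flow (in Python; the helper objects switch_agent and local_topology and their methods are hypothetical placeholders for the LLDP/topology machinery, not part of the disclosure):

```python
import hashlib

def plan_and_send(packet, local_topology, switch_agent):
    # Mirror the leaf switch's hash configuration and topology view
    # (switch_agent is a hypothetical wrapper around LLDP/GRPC exchanges).
    hash_cfg = switch_agent.get_hash_config()          # e.g. {"algo": "md5", "seed": 0}
    local_topology.update(switch_agent.get_topology())

    # Identify the flow by its five-tuple.
    five_tuple = (packet.src_ip, packet.dst_ip,
                  packet.src_port, packet.dst_port, packet.protocol)

    # Enumerate equal-cost paths and pick one the same way the switch would.
    paths = local_topology.equal_cost_paths(packet.src_ip, packet.dst_ip)
    digest = hashlib.new(hash_cfg["algo"],
                         repr((hash_cfg["seed"],) + five_tuple).encode()).digest()
    forward_path = paths[int.from_bytes(digest[:4], "big") % len(paths)]

    # Transmit the packet along the selected forward path.
    switch_agent.send(packet, forward_path)
    return forward_path
```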
  • FIG. 1 illustrates an example environment in which a forward path planning system may be used in accordance with an embodiment of the present disclosure.
  • FIG. 2 illustrates an example network architecture of massive data center networks in accordance with an embodiment of the present disclosure.
  • FIG. 3 illustrates example failures occurring in the massive data center networks in accordance with an embodiment of the present disclosure.
  • FIG. 4 illustrates an example configuration of a computing node for implementing the forward path planning method in accordance with an embodiment of the present disclosure.
  • FIG. 5 illustrates an example forward path planning in accordance with an embodiment of the present disclosure.
  • FIG. 6 illustrates another example forward path planning in accordance with an embodiment of the present disclosure.
  • FIG. 7 illustrates an example equal cost multipath (ECMP) planning in accordance with an embodiment of the present disclosure.
  • FIG. 8 illustrates an example forward path planning algorithm in accordance with an embodiment of the present disclosure.
  • FIG. 9 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure.
  • FIG. 10 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure.
  • FIG. 11 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure.
  • the application describes multiple and varied embodiments and implementations.
  • the following section describes an example framework that is suitable for practicing various implementations.
  • the application describes example systems, devices, and processes for implementing a distributed training system.
  • FIG. 1 illustrates an example environment in which a forward path planning system may be used in accordance with an embodiment of the present disclosure.
  • the environment 100 may include a data center network 102.
  • the data center network 102 may include a plurality of computing nodes or servers 104-1, 104-2, ..., 104-K (which are collectively called hereinafter as computing nodes 104), where K is a positive integer greater than one.
  • the plurality of computing nodes 104 may communicate data with each other via a communication network 106.
  • the computing node 104 may be implemented as any of a variety of computing devices having computing/processing and communication capabilities, which may include, but not limited to, a server, a desktop computer, a notebook or portable computer, a handheld device, a netbook, an Internet appliance, a tablet computer, a mobile device (e.g., a mobile phone, a personal digital assistant, a smart phone, etc. ) , etc., or a combination thereof.
  • the communication network 106 may be a wireless or a wired network, or a combination thereof.
  • the network 106 may be a collection of individual networks interconnected with each other and functioning as a single large network (e.g., the Internet or an intranet). Examples of such individual networks include, but are not limited to, telephone networks, cable networks, Local Area Networks (LANs), Wide Area Networks (WANs), and Metropolitan Area Networks (MANs). Further, the individual networks may be wireless or wired networks, or a combination thereof. Wired networks may include an electrical carrier connection (such as a communication cable, etc.) and/or an optical carrier or connection (such as an optical fiber connection, etc.).
  • Wireless networks may include, for example, a WiFi network, other radio frequency networks (e.g., Zigbee, etc. ) , etc.
  • the communication network 106 may include a plurality of inter-node interconnects or switches 108-1, 108-2, ..., 108-L (which are collectively called hereinafter as inter-node switches 108) for providing connections between the computing nodes 104, where L is a positive integer greater than one.
  • the environment 100 may further include a plurality of client devices 110-1, 110-2, ..., 110-N (which are collectively called hereinafter as client devices 110), where N is a positive integer greater than one.
  • client devices 110 may communicate with each other via the communication network 106, or access online resources and services. These online resources and services may be implemented at the computing nodes 104.
  • Data flows generated by users of the client devices 110 may be distributed to a plurality of routing paths and routed to a destination device through one or more of the plurality of paths.
  • the destination device may include another client device 110 or a computing node 104.
  • each of the plurality of routing paths may include one or more computing nodes 104 and switches 108 inter ⁇ connected by physical links.
  • FIG. 2 illustrates an example network architecture of massive data center networks in accordance with an embodiment of the present disclosure.
  • the network architecture of massive data center networks 200 may provide a detailed view of the environment in which a forward path planning system may be used.
  • the network architecture of massive data center networks is a three-tier Clos network architecture in a full-mesh topology.
  • a first tier corresponds to a tier of leaf switches 206, also called access switches or top of rack (ToR) switches.
  • the computing nodes 208 are directly connected to the leaf switches 206, with each computing node 208 being connected to at least two leaf switches 206.
  • a computing node 208 may include one or more network interface controllers (e.g., four network interface controllers) which are connected to one or more ports (e.g., four ports) of a leaf switch 206. In implementations, the number of network interface controllers in each computing node 208 may or may not be the same.
  • a second tier corresponds to a tier of aggregation switches 204, also called spine switches 204, that are connected to one or more leaf switches 206.
  • a number of computing nodes 208, the interconnected leaf switches 206, and the interconnected aggregation switches 204 may form a point of delivery (PoD) unit, for example, PoD-1 and PoD-2, as illustrated.
  • a third tier corresponds to a tier of core switches 202 that are connected to one or more aggregation switches 204.
  • the core switches 202 are at the top of the cloud data center network pyramid and may include a wide area network (WAN) connection to the outside carrier network.
  • a data packet that is transmitted between the two processing units or processes can be made to flow through a specified aggregation switch by setting an appropriate combination of source and destination ports in the data packet.
  • the routing management for congestion avoidance may aim at enabling data flows from a same leaf switch to different destination leaf switches to pass through different aggregation switches, and/or data flows from different source leaf switches to a same destination leaf switch to pass through different aggregation switches, thus avoiding collisions between the data flows and preventing network congestion at the aggregation switches.
  • a processing unit or process in a computing node 208 may send/receive data to/from a processing unit or process in another computing node through a network interface controller (NIC) .
  • a processing unit or process in a computing node 208 may be associated with a single network interface controller or multiple network interface controllers for transmitting data to processing units or processes in other computing nodes. Additionally, or alternatively, multiple processing units or processes may be associated with a single network interface controller and employ that network interface controller for transmitting data to processing units or processes in other computing nodes.
  • a plurality of rules for data packet forwarding/routing may be implemented on the computing nodes 208.
  • the plurality of rules may include, but are not limited to, priorities for a processing unit or process in a first computing node to select a neighboring processing unit or process, conditions for a network interface controller in a first computing node to send or receive data, conditions for a network interface controller in a first computing node to route data to/from a network interface controller in a second computing node, etc.
  • the routing management may assign network interface controller (NIC) identifiers to each network interface controller that is connected or linked to a same leaf switch.
  • when a network interface controller of the processing unit or process and a network interface controller of a next processing unit or process are located in a same computing node or are directly connected or linked to a same leaf switch, a routing identifier may be determined as a default value or identifier. This default routing identifier indicates that data is either routed within a computing node or through a leaf switch, without passing through any aggregation switch in the communication network. Otherwise, the routing identifier may be determined to be equal to a NIC identifier of that processing unit or process, or other predefined value.
  • an aggregation identifier may be determined based on the determined routing identifier.
  • the mapping relationship between routing identifiers and aggregation identifiers may be predetermined using a probing-based routing mechanism (e.g., sending probing data packets between computing nodes as described in the foregoing description), for example.
  • data flows between processing units (or processes) that are included in different computing nodes and whose network interface controllers are connected to different leaf switches will pass through a designated aggregation switch based on a predetermined mapping relationship, thus enabling routing control and management of data flows and distributing the data flows to different aggregation switches to avoid network congestion.
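  • As an illustration of the identifier rules above, a minimal Python sketch follows; the NIC descriptor objects and the mapping table are hypothetical examples rather than values defined by the disclosure:

```python
DEFAULT_ROUTING_ID = 0   # data stays within one computing node or one leaf switch

def routing_identifier(src_nic, dst_nic):
    """Apply the rule described above to two NIC descriptors (hypothetical objects
    with node_id, leaf_switch, and nic_id attributes)."""
    if src_nic.node_id == dst_nic.node_id or src_nic.leaf_switch == dst_nic.leaf_switch:
        # Same computing node, or both NICs attach to the same leaf switch:
        # no aggregation switch needs to be traversed.
        return DEFAULT_ROUTING_ID
    return src_nic.nic_id   # otherwise, use the NIC identifier of the sender

def aggregation_identifier(routing_id, mapping):
    """Look up the aggregation switch for a routing identifier in a predetermined
    mapping, e.g. one learned by sending probing data packets."""
    return mapping.get(routing_id)   # e.g. {1: "agg-1", 2: "agg-2"}
```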
  • the three ⁇ tier Clos network as illustrated in FIG. 2 is merely an example network architecture of the massive data center network.
  • Other network architectures including, but not limited to, two-tier Clos networks may also be adopted to construct the massive data center networks.
  • FIG. 3 illustrates example failures occurring in the massive data center networks in accordance with an embodiment of the present disclosure.
  • network anomalies including, but not limited to, link failures (e.g., failures 312, 316, and 320), computing node failure (e.g., failure 310), leaf switch failure (e.g., failure 318), aggregation switch failure (e.g., failure 314), or core switch failure (e.g., failure 322) may occur, causing data packet loss and congestion in certain routing paths.
  • detection technology such as network quality analyzer (NQA) Track may be introduced.
  • one or more computing nodes in each point of delivery (PoD) unit may implement the NQA track scheme to detect whether another computing node in the same or different PoD unit, a leaf switch in the same or different PoD unit, an aggregation switch in the same or different PoD unit, or a core switch is unreachable, etc.
  • An example detection approach is to utilize one computing node in a point of delivery (PoD) unit as a detection source.
  • computing node 308-1 in PoD-1 may be assigned as a detection source.
  • the computing node 308-1 may periodically ping other computing nodes, leaf switches, aggregation switches, and core switches, by sending an Internet Control Message Protocol (ICMP) echo request packet.
  • the computing node 308-1 waits for an ICMP echo reply from each of those nodes and switches. After a pre-set period, also called a Time to Live (TTL) period, if no echo reply is received from either the computing node or the switch, the computing node 308-1 determines that the computing node or the switch is unreachable.
  • when the packet loss occurs in a large number of computing nodes connected to a same leaf switch, the detection source (i.e., the designated computing node for anomaly detection) may further determine that the leaf switch may be in failure. In another example, when the packet loss only occurs in sporadic computing nodes connected to a same leaf switch, the detection source may further determine that these sporadic computing nodes may experience overload or the corresponding ports in the leaf switch may be full. In yet another example, when packet loss occurs in a large number of computing nodes located in a different point of delivery (PoD) unit, the detection source may further determine that failures may occur in one or more corresponding aggregation switches located therein and/or one or more corresponding core switches connected thereto.
  • the detection source may further determine that these sporadic computing nodes may experience overload or the corresponding ports in the aggregation switch and/or the core switch may be full.
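  • A simplified sketch of such a ping-based detection source is shown below (Python, using the system ping utility with Linux-style flags; the threshold and classification labels are illustrative assumptions):

```python
import subprocess

def is_reachable(host, timeout_s=1):
    """Send a single ICMP echo request via the system ping utility."""
    result = subprocess.run(["ping", "-c", "1", "-W", str(timeout_s), host],
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0

def classify_leaf(hosts_behind_leaf, loss_threshold=0.8):
    """Rough inference: widespread loss behind one leaf switch suggests a leaf
    failure; sporadic loss suggests overloaded hosts or full switch ports."""
    unreachable = [h for h in hosts_behind_leaf if not is_reachable(h)]
    ratio = len(unreachable) / len(hosts_behind_leaf)
    if ratio >= loss_threshold:
        return "suspected leaf switch failure"
    if unreachable:
        return "suspected host overload or congested leaf ports"
    return "healthy"
```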
  • Another example detection approach is to coordinate various computing nodes in different locations as detection sources and deploy different detection strategies among those detection sources.
  • Each of the detection sources may implement an agent program capable of additionally detecting the anomaly associated with OSI layer 4 through layer 7.
  • the detection sources may accept input control commands to dynamically configure the detection strategies.
  • This approach may establish TCP connections to further detect one or more parameters associated with the transport layer (i.e., OSI layer 4) , for example, transmission delay or transmission rate. With the network topology information, the transmission delay or transmission rate, and the data packet loss rate, the detection approach may further learn the exact location of the anomaly.
  • the detection source may determine that the computing nodes are operating normally but the first leaf switch may experience a certain anomaly.
  • the detection source may determine that the first leaf switch is operating normally but the ports in the first leaf switch that correspond to the first number of computing nodes may be congested.
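  • A minimal transport-layer probe of the kind described above can be sketched as follows (the port and timeout values are illustrative assumptions):

```python
import socket
import time

def tcp_connect_delay(host, port=80, timeout_s=2.0):
    """Measure OSI layer-4 reachability and connection delay to a target.
    Returns the delay in seconds, or None if the connection fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return time.monotonic() - start
    except OSError:
        return None
```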
  • congestion detection and notification mechanisms referenced herein include random early detection (RED), weighted random early detection (WRED), robust random early detection (RRED), explicit congestion notification (ECN), and backward ECN (BECN).
  • FIG. 4 illustrates an example configuration of a computing node for implementing the forward path planning method in accordance with an embodiment of the present disclosure.
  • the example configuration 400 of the computing node 402 may include, but is not limited to, one or more processing units 404, one or more network interfaces 406, an input/output (I/O) interface 408, and memory 412.
  • the computing node 402 may further include one or more intra-node interconnects or switches 410.
  • the processing units 404 may be configured to execute instructions that are stored in the memory 412, and/or received from the input/output interface 408, and/or the network interface 406.
  • the processing units 404 may be implemented as one or more hardware processors including, for example, a microprocessor, an application-specific instruction-set processor, a physics processing unit (PPU), a central processing unit (CPU), a graphics processing unit, a digital signal processor, a tensor processing unit, etc. Additionally, or alternatively, the functionality described herein can be performed, at least in part, by one or more hardware logic components.
  • examples of hardware logic components include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), system-on-a-chip systems (SOCs), and complex programmable logic devices (CPLDs).
  • the memory 412 may include machine readable media in a form of volatile memory, such as Random Access Memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash RAM.
  • the machine readable media may include a volatile or non-volatile type, a removable or non-removable media, which may achieve storage of information using any method or technology.
  • the information may include a machine readable instruction, a data structure, a program module or other data.
  • Examples of machine readable media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electronically erasable programmable read-only memory (EEPROM), quick flash memory or other internal storage technology, compact disk read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassette tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission media, which may be used to store information that may be accessed by a computing node.
  • the machine readable media does not include any transitory media, such as modulated data signals and carrier waves.
  • the network interfaces 406 may be configured to connect the computing node 402 to other computing nodes via the communication network 106.
  • the network interfaces 406 may be established through a network interface controller (NIC) , which may employ both hardware and software in connecting the computing node 402 to the communication network 106.
  • each type of NIC may use a different type of fabric or connector to connect to a physical medium associated with the communication network 106. Examples of types of fabrics or connectors may be found in the IEEE 802 specifications, and may include, for example, Ethernet (which is defined in 802.3), Token Ring (which is defined in 802.5), wireless networking (which is defined in 802.11), InfiniBand, etc.
  • the intra-node switches 410 may include various types of interconnects or switches, which may include, but are not limited to, a high-speed serial computer expansion bus (such as PCIe, etc.), a serial multi-lane near-range communication link (such as NVLink, which is a wire-based serial multi-lane near-range communication protocol, for example), a switch chip with a plurality of ports (e.g., an NVSwitch, etc.), a point-to-point processor interconnect (such as an Intel QPI/UPI, etc.), etc.
  • the computing node 402 may further include other hardware components and/or other software components, such as program modules 414 to execute instructions stored in the memory 412 for performing various operations, and program data 416 for storing data received for path planning, anomaly detection, etc.
  • the program modules 414 may include a topology awareness module 418, a path planning module 420, and an anomaly detection module 422.
  • the topology awareness module 418 may be configured to maintain topology data associated with the network 106.
  • the topology data may be generated and implemented on each element of the network 106 when the network architecture is delivered.
  • the topology data includes arrangements of the elements of a network, computing nodes, leaf switches, aggregation switches, core switches, etc., and indicates the connections/links among these elements.
  • the topology data may be represented as a non-directional graph and stored as an adjacency list. All paths that route the data packets from a source node to a destination node may be configured to have equal cost. Hence, data flow from the source node to the destination node may be evenly distributed to all paths.
  • one or more paths from the source node to the destination node may be configured as reserved bandwidth, and the data flow may be forwarded only to the available paths excluding the reserved paths.
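  • A small hypothetical fragment of such topology data, stored as an adjacency list of a non-directional graph, might look like the following (node names are illustrative):

```python
# Computing nodes <-> leaf switches <-> an aggregation switch; each link appears
# under both endpoints because the graph is non-directional.
topology = {
    "node-1": ["leaf-1", "leaf-2"],
    "node-2": ["leaf-3", "leaf-4"],
    "leaf-1": ["node-1", "agg-1"],
    "leaf-2": ["node-1", "agg-1"],
    "leaf-3": ["node-2", "agg-1"],
    "leaf-4": ["node-2", "agg-1"],
    "agg-1":  ["leaf-1", "leaf-2", "leaf-3", "leaf-4"],
}
# All links are treated as equal cost; paths configured as reserved bandwidth can
# simply be excluded from the candidate set before forwarding.
reserved_paths = set()
```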
  • the topology awareness module 418 may communicate with one or more switches in the massive data center network (e.g., leaf switch 306) in real-time, periodically, or at a pre-set time period to obtain topology data associated with the network 106.
  • the topology data associated with the network 106 may be stored in one or more separate storage devices.
  • the topology awareness module 418 may communicate with the one or more separate storage devices to obtain the real-time topology data.
  • the topology data associated with the network 106 may be dynamically updated to the topology awareness module 418 in response to a notification of network congestion.
  • the communication and topology data exchange between the computing node and the leaf switches may be achieved by using protocols including, but not limited to, Link Layer Discovery Protocol (LLDP), Link Aggregation Control Protocol (LACP), general-purpose Remote Procedure Calls (GRPC), etc.
  • the communication and topology data exchange between the computing node and the leaf switches may be achieved by remote software control.
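  • However the exchange is carried out (LLDP, LACP, GRPC, or remote software control), the host-side effect can be sketched as a simple synchronization step over the cached adjacency list; the function name and return convention below are assumptions:

```python
def sync_topology(local_topology, switch_topology):
    """Replace the locally cached adjacency list with the view reported by the
    leaf switch; returns True when anything changed so that cached routing
    paths can be invalidated and recomputed."""
    changed = local_topology != switch_topology
    local_topology.clear()
    local_topology.update({node: list(neighbors)
                           for node, neighbors in switch_topology.items()})
    return changed
```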
  • the path planning module 420 may be configured to determine the routing paths to forward data packets and distribute the data packets to the routing paths to balance the data flows in the network 106.
  • the path planning module 420 may obtain a source address, a destination address, and a protocol from an IP header of a TCP/IP data packet, and a source port and a destination port from the TCP packet.
  • the source address, the destination address, the source port, the destination port, and the protocol may form a so-called five-tuple (or 5-tuple).
  • the five-tuple may uniquely indicate a data flow, in which all data packets have exactly the same five-tuple.
  • the path planning module 420 may determine all possible routing paths from the source node to the destination node. A data flow, in which all data packets have exactly the same five-tuple, may use one of all the possible routing paths at one time.
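  • For illustration, extracting the five-tuple from a raw IPv4 packet carrying TCP or UDP can be sketched as follows (a minimal parser that assumes the transport header directly follows the IPv4 header):

```python
import struct

def extract_five_tuple(ip_packet: bytes):
    """Return (source address, destination address, source port, destination port,
    protocol) from a raw IPv4 packet carrying TCP or UDP."""
    ihl = (ip_packet[0] & 0x0F) * 4                       # IPv4 header length in bytes
    protocol = ip_packet[9]                               # 6 = TCP, 17 = UDP
    src_ip = ".".join(str(b) for b in ip_packet[12:16])
    dst_ip = ".".join(str(b) for b in ip_packet[16:20])
    src_port, dst_port = struct.unpack("!HH", ip_packet[ihl:ihl + 4])
    return (src_ip, dst_ip, src_port, dst_port, protocol)
```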
  • the network topology data changes due to the anomaly.
  • the topology awareness module 418 may update the network topology data stored in the program data 416 to reflect the changes.
  • the network topology data associated with the computing node 402 may be stored separately from the program data 416 and may be updated in response to the topology data changes due to the anomaly.
  • the path planning module 420 may recompute using the Hash algorithm based on the updated network topology data and select a different path by using another source port. It should be appreciated that the five-tuple (or 5-tuple) that uniquely indicates a data flow described above is merely for the purpose of illustration. The present disclosure is not intended to be limiting.
  • the path planning module 420 may construct a three-tuple (or 3-tuple) including the source IP address, destination IP address, and ICMP Identifier that uniquely identifies an ICMP Query session to indicate a data flow.
  • the path planning module 420 may determine all possible routing paths from the source node to the destination node.
  • the routing paths may be determined based on various path finding algorithms including, but not limited to, shortest path algorithms. Examples of shortest path algorithms may include, but are not limited to, Dijkstra's algorithm, the Viterbi algorithm, the Floyd–Warshall algorithm, the Bellman–Ford algorithm, etc.
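  • Because all hops are treated as equal cost, a plain breadth-first search over the adjacency list is enough to enumerate every minimum-hop path; a minimal sketch:

```python
from collections import deque

def all_shortest_paths(topology, src, dst):
    """Return every minimum-hop path from src to dst in an unweighted,
    non-directional adjacency list (all links assumed equal cost)."""
    paths, best_len = [], None
    queue = deque([[src]])
    while queue:
        path = queue.popleft()
        if best_len is not None and len(path) > best_len:
            break                                  # only longer paths remain
        node = path[-1]
        if node == dst:
            best_len = len(path)
            paths.append(path)
            continue
        for neighbor in topology.get(node, []):
            if neighbor not in path:               # avoid revisiting nodes
                queue.append(path + [neighbor])
    return paths
```

  With the hypothetical adjacency-list fragment sketched earlier, for example, this yields the four equal-cost paths from node-1 to node-2 through agg-1.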
  • the path planning module 420 may further perform a Hash operation on the five-tuple to obtain corresponding five-tuple hash values and determine a routing path from all possible shortest routing paths according to the five-tuple hash values.
  • the Hash operation maps a five-tuple to a unique path
  • data packets that all have the same five-tuple may be routed through the same path.
  • Various Hash algorithms may be implemented by the path planning module 420 including, but not limited to, Message Digest (MD, MD2, MD4, MD5 and MD6), RIPEMD (RIPEMD, RIPEMD-128, and RIPEMD-160), Whirlpool (Whirlpool-0, Whirlpool-T, and Whirlpool), or Secure Hash Function (SHA-0, SHA-1, SHA-2, and SHA-3).
  • the computing node forwards data packets based on pre-computed Hash values corresponding to all possible routing paths, respectively, and the switches (i.e., the leaf switches) perform path planning based on the dynamic network topology and a Hash algorithm.
  • the topology data associated with the computing node and the leaf switches are not synchronized, and the Hash algorithms implemented on the computing node and the leaf switches have different configurations.
  • the five-tuple Hash values calculated by the computing node and the switches may direct different data flows to a same routing path, causing possible congestion in the network.
  • the path planning module 420 of the computing node 402 may obtain information associated with the Hash algorithm implemented on one or more leaf switches and configure the Hash algorithm implemented on the computing node 402 using the obtained information.
  • the information may include one or more parameters configured for the Hash algorithm implemented on the one or more leaf switches.
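  • A minimal sketch of adopting such parameters on the computing node follows; the parameter names "algorithm" and "seed" are assumptions, and hardware switches do not literally run hashlib, so this only models the idea of matching configurations:

```python
import hashlib

def flow_hash_from_switch_params(switch_params):
    """Build a five-tuple hash function whose algorithm and seed mirror the
    parameters reported by the leaf switch."""
    algo = switch_params.get("algorithm", "md5")   # e.g. "md5", "sha1", "sha256"
    seed = switch_params.get("seed", 0)

    def flow_hash(five_tuple):
        data = repr((seed,) + tuple(five_tuple)).encode()
        return int.from_bytes(hashlib.new(algo, data).digest()[:8], "big")

    return flow_hash
```

  In this simplified model, identical parameters on the host and the switch map a given five-tuple to the same value, which is what keeps host-side planning consistent with switch-side ECMP hashing.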
  • the path planning module 420 of the computing node 402 may further obtain network topology data stored in association with the one or more leaf switches and update its network topology data according to the obtained network topology data.
  • the communication and data exchange between the computing node and the leaf switches may be achieved by using protocols including, but not limited to, Link Layer Discovery Protocol (LLDP), Link Aggregation Control Protocol (LACP), general-purpose Remote Procedure Calls (GRPC), etc.
  • the collisions of mapping different five-tuple hash values to a same path may be reduced and possible flow congestion may be avoided.
  • the computing node may effectively determine the element in the network that is involved in the anomaly and re-plan the forward paths for the data flow.
  • the anomaly detection module 422 may be configured to detect anomalies occurring in the network 106.
  • the anomaly detection module 422 may implement the detection approaches described above with respect to FIG. 3, and hence these approaches are not described in detail herein.
  • the program data 416 may be configured to store topology information 424, configuration information 426, and routing information 428.
  • the topology information may include the network elements and the connection status of the network elements.
  • the topology information may be dynamically updated according to the path planning and data exchange between the computing node 402 and the leaf switches.
  • the configuration information 426 may include versions and parameters of the algorithms implemented on the computing node 402, for example, routing algorithms and Hash algorithms.
  • the routing information 428 may include all possible routing paths between a source node and a destination node.
  • the routing information 428 may further include mappings between the five-tuple hash values and the corresponding forward paths.
  • FIG. 5 illustrates an example forward path planning in accordance with an embodiment of the present disclosure.
  • the example forward path planning 500 is illustrated among various computing nodes and leaf switches, ultimately connected to a single aggregation switch. Data packets from computing node 506-1 to computing node 506-2 are distributed to two data flows in two routing paths.
  • Path A includes four hops: Path A-1, Path A-2, Path A-3, and Path A-4 and goes through the computing node 506-1, the leaf switch 504-1, the aggregation switch 502, the leaf switch 504-2, and the computing node 506-2.
  • Path B includes four hops: Path B-1, Path B-2, Path B-3, and Path B-4 and goes through the computing node 506-1, the leaf switch 504-1, the aggregation switch 502, the leaf switch 504-3, and the computing node 506-2.
  • the computing node 506-1 detects an anomaly in Path A and further determines that Path A-3 and Path A-4 are involved in the anomaly.
  • the anomaly may be associated with the leaf switch 504-3 and/or the ports of the leaf switch 504-3.
  • the network topology data associated with the computing node 506-1 may be updated to reflect the dynamic changes of the network caused by the anomaly.
  • the computing node 506-1 may recompute using the Hash algorithm based on the updated network topology data and select a different path by using another source port.
  • the computing node 506-1 may transmit the data flow using the different path that goes through the leaf switch 504-4, including Path A-1, Path A-2, Path A-3’, and Path A-4’.
  • FIG. 6 illustrates another example forward path planning in accordance with an embodiment of the present disclosure.
  • the example forward path planning 600 is illustrated among various computing nodes and leaf switches, ultimately connected to two aggregation switches. Data packets from computing node 606-1 to computing node 606-2 are distributed to two data flows in two routing paths.
  • Path A includes four hops: Path A-1, Path A-2, Path A-3, and Path A-4 and goes through the computing node 606-1, the leaf switch 604-3, the aggregation switch 602-2, the leaf switch 604-4, and the computing node 606-2.
  • Path B includes four hops: Path B-1, Path B-2, Path B-3, and Path B-4 and goes through the computing node 606-1, the leaf switch 604-1, the aggregation switch 602-1, the leaf switch 604-2, and the computing node 606-2.
  • the computing node 606-1 detects an anomaly in Path A and further determines that the leaf switch 604-4 is involved in the anomaly.
  • the network topology data associated with the computing node 606-1 may be updated to reflect the dynamic changes of the network caused by the anomaly. Based on the updated network topology data, the computing node 606-1 may recompute using the Hash algorithm and select a different path by using another source port.
  • the computing node 606-1 may transmit the data flow using the different path that goes through the leaf switch 604-2, including Path A-1, Path A-2, Path A-3’, and Path A-4’.
  • FIG. 7 illustrates an example equal cost multipath (ECMP) planning in accordance with an embodiment of the present disclosure.
  • ECMP load balancing refers to distributing data flows evenly by using a load balancing algorithm to identify flows and distribute the data flows to different routing paths.
  • in the ECMP planning 700, four equal cost paths, Path A, Path B, Path C, and Path D, are available to route the data packets from computing node 706-1 to computing node 706-2.
  • by applying the Hash algorithm on the five-tuple data for routing, all four paths are utilized to route the data packets from computing node 706-1 to computing node 706-2.
  • this facilitates distributing data flows evenly across the network and reducing possible congestion.
  • the data flows can be distributed among the other three paths.
  • the five-tuple and the ECMP load balancing shown in FIG. 7 are merely for the purpose of illustration. The present disclosure is not intended to be limiting.
  • a three-tuple including the source IP address, destination IP address, and ICMP Identifier that uniquely identifies an ICMP Query session may be adopted to indicate a data flow.
  • one or more paths may be set as reserved bandwidth, and hence, the data flow is distributed to only the available paths excluding the reserved paths.
  • FIG. 8 illustrates an example forward path planning algorithm in accordance with an embodiment of the present disclosure.
  • FIG. 9 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure.
  • FIG. 10 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure.
  • FIG. 11 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure.
  • the methods described in FIGs. 8-11 may be implemented in the environment of FIG. 1 and/or the network architecture of FIG. 2. However, the present disclosure is not intended to be limiting. The methods described in FIGs. 8-11 may alternatively be implemented in other environments and/or network architectures.
  • machine-executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, and the like that perform particular functions or implement particular abstract data types.
  • each of the example methods is illustrated as a collection of blocks in a logical flow graph representing a sequence of operations that can be implemented in hardware, software, firmware, or a combination thereof.
  • the order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method, or alternate methods. Additionally, individual blocks may be omitted from the method without departing from the spirit and scope of the subject matter described herein.
  • the blocks represent computer instructions that, when executed by one or more processors, perform the recited operations.
  • some or all of the blocks may represent application specific integrated circuits (ASICs) or other physical components that perform the recited operations.
  • a first computing node (e.g., the computing node 104) may obtain information associated with an algorithm implemented by at least a second computing node.
  • the first computing node may implement a same algorithm as the at least one second computing node.
  • the algorithm implemented on the first computing node may be configured differently from the algorithm implemented on the second computing node.
  • the algorithm may include various Hash algorithms used for routing path planning based on the five-tuple associated with the data packets in the data flow.
  • the at least one second computing node may be a leaf switch in a three-tier Clos network, through which the first computing node connects to the data center network.
  • the first computing node may obtain network topology data stored in association with the at least one second computing node.
  • the first computing node may obtain the information associated with the algorithm implemented by the at least one second computing node described at block 802 and the network topology data associated with the at least one second computing node described at block 804 via various network protocols, for example, Link Layer Discovery Protocol (LLDP).
  • the network topology data may be represented in a non-directional graph illustrating the elements of the network and the connection status associated therewith.
  • the first computing node may receive a first data packet from a source device to be forwarded to a destination device.
  • the source device and the destination device may refer to the client devices 110 of FIG. 1.
  • the data packet is generated when a user operates the client devices 110 to communicate with another user operating a different client device.
  • the data packet is generated when a user visits an online source or uses an online service.
  • the computing nodes 104 may upload or download data from cloud storage spaces, thus generating a flow of data packets.
  • the first computing node (e.g., the computing node 104) may determine a set of values associated with the first data packet according to the information associated with the algorithm.
  • the set of values associated with the first data packet may include a five-tuple extracted from the first data packet.
  • the set of values may include a source IP address, a destination IP address, a source port number, a destination port number, and a protocol used for communication.
  • the set of values, i.e., the five-tuple, may be hashed using a Hash algorithm implemented by the first computing node.
  • a three-tuple including the source IP address, destination IP address, and ICMP Identifier that uniquely identifies an ICMP Query session may be adopted to indicate a data flow.
  • the first computing node (e.g., the computing node 104) may determine a forward path from the source device to the destination device according to the set of values associated with the first data packet and the network topology data.
  • the first computing node may update the network topology data associated therewith using the network topology data obtained from a storage associated with the at least one second computing node.
  • the first computing node may determine all best shortest routing paths according to the updated network topology data and select the forward path from the best shortest routing paths according to the set of values, for example, the five-tuple hash values.
  • the first computing node may transmit the first data packet to the destination device through the forward path.
  • a first computing node may determine five-tuple data associated with the first data packet, the five-tuple data including a source IP address, a source port number, a destination IP address, a destination port number, and a protocol.
  • the first computing node may extract the source IP address, the destination IP address, and the protocol from the IP header of the data packet and further extract the source port number and the destination port number from the TCP portion.
  • the protocol may include any type of IP protocol including, but not limited to, IPv4 and IPv6.
  • the first computing node may extract the source IP address, the destination IP address, and the ICMP identifier to generate a three-tuple (or 3-tuple) to indicate the data flow.
  • the first computing node may update the configuration of a Hash algorithm implemented by the first computing node using the information associated with the algorithm implemented by the at least one second computing node.
  • the information associated with the algorithm implemented by the at least one second computing node may include versions of the algorithms, one or more parameter configurations of the algorithms, etc.
  • the first computing node may compute five-tuple hash values corresponding to the five-tuple data as the set of values associated with the first data packet using the updated Hash algorithm.
  • when the first computing node generates a three-tuple to identify a data flow, the first computing node computes three-tuple hash values corresponding to the three-tuple data as the set of values associated with the first data packet using the updated Hash algorithm.
  • a first computing node may determine one or more paths from the source device to the destination device according to the network topology data.
  • the one or more paths from the source device to the destination device may include one or more shortest paths.
  • the first computing node may implement various shortest path algorithms, for example, Dijkstra's algorithm, the Viterbi algorithm, the Floyd–Warshall algorithm, and the Bellman–Ford algorithm.
  • the first computing node may perform a modulus operation on the five-tuple hash values with respect to the one or more paths.
  • the modulus operation may generate one or more distinct modulus values corresponding to the one or more paths, respectively.
  • Data packets having all the same five-tuple may form a data flow.
  • Data packets directed from a source IP address to a destination IP address may be distributed to different data flows depending on the source ports the data packets are transmitted through.
  • Data flows may be distributed to the one or more paths according to the one or more distinct modulus values that distinctly correspond to the one or more paths.
  • the first computing node may determine a forward path from the one or more paths according to the results of the modulus operation.
  • the first computing node may select one path that maps to a data flow represented by the five-tuple as the forward path.
  • the arriving data packets that have the same five-tuple may use the same forward path.
  • the first computing node may designate one of the one or more paths as the forward path based on the traffic on these paths.
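  • The modulus-based spreading can be illustrated with a small experiment (Python's built-in tuple hash is used here purely for brevity in place of the configured Hash algorithm, and the flow values are randomly generated):

```python
import random

def demo_flow_distribution(n_flows=10_000, n_paths=4, seed=7):
    """Count how many randomly generated flows land on each of n_paths when
    selecting paths[hash(five_tuple) % n_paths]."""
    rng = random.Random(seed)
    counts = [0] * n_paths
    for _ in range(n_flows):
        five_tuple = (rng.getrandbits(32),            # source address
                      rng.getrandbits(32),            # destination address
                      rng.randrange(1024, 65536),     # source port
                      443,                            # destination port
                      6)                              # protocol (TCP)
        counts[hash(five_tuple) % n_paths] += 1
    return counts

print(demo_flow_distribution())   # four roughly equal counts: flows spread evenly
```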
  • a first computing node may determine at least a first forward path and a second forward path from the one or more paths.
  • the first computing node may implement the equal cost multipath (ECMP) algorithm to determine all possible paths between a source device and a destination device.
  • the one or more paths may be sorted based on the associated one or more distinct modulus values.
  • the first computing node may select one path that maps to a data flow represented by the five-tuple to forward the data flow. Alternatively, the first computing node may designate more than one path to forward the data flow.
  • the first computing node may distribute the data packets from the source device to the destination device to different data flows to be transmitted to all possible paths between the source device and the destination device. In other implementations, the first computing node may distribute the data flows to a set of all possible paths.
  • the first computing node may receive a plurality of second data packets from the source device to be forwarded to the destination device.
  • the plurality of second data packets may have the same or different five-tuples.
  • the data packets that have the same five-tuple may be transmitted through one path of the first forward path and the second forward path as a data flow at one time.
  • the data packets that have different five-tuples may form different data flows that go through different forward paths.
  • the second data packet is transmitted through the same forward path generated according to the embodiment illustrated in FIG. 8.
  • the first computing node may distribute the plurality of second data packets to the first forward path and the second forward path, wherein each of the first forward path and the second forward path carries a portion of the plurality of second data packets.
  • the first computing node may evenly distribute the data flows to all possible paths (i.e., the first forward path and the second forward path ) between the source device and the destination device based on the Hash computation.
  • the data flow carried by the first forward path and the second forward path may be uneven.
  • the first computing node may determine an anomaly in one of the first forward path and the second forward path.
  • the anomaly may be associated with a computing node, a switch, port of a computing node, port of a switch node, etc., causing network congestion.
  • the first computing node may detect the anomaly using the detection approaches described above with respect to FIG. 3.
  • the first computing node may generate one or more sessions corresponding to one or more forward paths, respectively.
  • the first computing node may detect the anomaly when a session timeout occurs in one of the one or more sessions.
  • the first computing node may determine a third forward path from the source device to the destination device to reroute data flow, i.e., the portion of the plurality of second data packets that is involved in the abnormality.
  • the first computing node may recompute using the Hash algorithm based on the updated network topology data and select a different path by using another source port, as sketched below.
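  • A minimal re-routing sketch under these assumptions; find_paths is any equal-cost path finder such as the BFS sketched earlier, and flow_hash is the switch-aligned hash function, both passed in as parameters:

```python
def reroute(topology, failed_element, src, dst, five_tuple, flow_hash, find_paths):
    """Drop the failed element from the cached adjacency list and recompute the
    forward path; changing the flow's source port would likewise move it to a
    different equal-cost path because the five-tuple hash changes."""
    topology.pop(failed_element, None)
    for neighbors in topology.values():
        if failed_element in neighbors:
            neighbors.remove(failed_element)
    paths = find_paths(topology, src, dst)
    if not paths:
        return None                    # destination unreachable after the failure
    return paths[flow_hash(five_tuple) % len(paths)]
```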
  • a method implemented by a first computing node comprising: obtaining, via a network, information associated with an algorithm implemented by at least one second computing node; obtaining, via the network, network topology data stored in association with the at least one second computing node; receiving, at the first computing node, a first data packet from a source device to be forwarded to a destination device; determining a forward path from the source device to the destination device according to information associated with the first data packet and the network topology data; and transmitting the first data packet to the destination device through the forward path.
  • determining a forward path from the source device to the destination device according to the information associated with the first data packet and the network topology data further comprises: determining one or more paths from the source device to the destination device according to the network topology data; and determining the forward path from the one or more paths according to the five-tuple hash values.
  • determining a forward path from the source device to the destination device according to the information associated with the first data packet and the network topology data further comprises: performing a modulus operation on the five-tuple Hash values with respect to the one or more paths; and determining the forward path from the one or more paths according to results of the modulus operation.
  • the forward path from the source device to the destination device includes at least a first forward path and a second forward path
  • the method further comprises: receiving a plurality of second data packets from the source device to be forwarded to the destination device; distributing the plurality of second data packets to the first forward path and the second forward path, wherein each of the first forward path and the second forward path carries at least a portion of the plurality of second data packets; detecting an abnormality that occurred in one of the first forward path and the second forward path; and determining a third forward path from the source device to the destination device to reroute the portion of the plurality of second data packets carried by one of the first forward path and the second forward path that is involved with the abnormality.
  • One or more machine readable media storing machine readable instructions that, when executed by a first computing node, cause the first computing node to perform acts comprising: obtaining, via a network, information associated with an algorithm implemented by at least one second computing node; obtaining, via the network, network topology data stored in association with the at least one second computing node; receiving, at the first computing node, a first data packet from a source device to be forwarded to a destination device; determining a forward path from the source device to the destination device according to information associated with the first data packet and the network topology data; and transmitting the first data packet to the destination device through the forward path.
  • the one or more machine readable media as recited in paragraph K, the acts further comprising: performing a modulus operation on the five-tuple Hash values with respect to the one or more paths; and determining the forward path from the one or more paths according to results of the modulus operation.
  • The one or more machine readable media as recited in paragraph H, wherein the determining of a forward path from the source device to the destination device according to the set of values associated with the first data packet and the network topology data is based on an equal cost multipath (ECMP) planning algorithm.
  • a first computing node comprising: one or more processing units; and memory storing machine executable instructions that, when executed by one or more processing units, cause the one or more processing units to perform acts comprising: obtaining, via a network, information associated with an algorithm implemented by at least one second computing node; obtaining, via the network, network topology data stored associated with the at least one second computing node; receiving, at the first computing node, a first data packet from a source device to be forwarded to a destination device; determining a forward path from the source device to the destination device according to information associated with the first data packet and the network topology data; and transmitting the first data packet to the destination device through the forward path.
  • the first computing node as recited in paragraph O, wherein the information associated with the first data packet includes a set of values, and the acts further comprise: determining five-tuple data associated with the first data packet, the five-tuple data including a source IP address associated with the source device, a source port number associated with the source device, a destination IP address associated with the destination device, a destination port number associated with the destination device, and a protocol for communication in the network; updating a Hash algorithm implemented by the first computing node using the information associated with the algorithm implemented by the at least one second computing node; and computing five-tuple Hash values corresponding to the five-tuple data as the set of values associated with the first data packet using the updated Hash algorithm.
  • the first computing node as recited in paragraph Q, wherein the acts further comprise: performing a modulus operation on the five-tuple Hash values with respect to the one or more paths; and determining the forward path from the one or more paths according to results of the modulus operation.
  • the first computing node as recited in paragraph O, wherein the acts further comprise: receiving a plurality of second data packets from the source device to be forwarded to the destination device; distributing the plurality of second data packets to the first forward path and the second forward path, wherein each of the first forward path and the second forward path carries at least a portion of the plurality of second data packets; detecting an abnormality that occurred in one of the first forward path and the second forward path; and determining a third forward path from the source device to the destination device to reroute the portion of the plurality of second data packets carried by the one of the first forward path and the second forward path that is involved with the abnormality.

Abstract

A method for planning a forward path in massive data center networks is provided. The method is implemented by a first computing node in a network. The method comprises obtaining information associated with an algorithm implemented by at least one second computing node; obtaining network topology data stored associated with the at least one second computing node; receiving, at the first computing node, a first data packet from a source device to be forwarded to a destination device; determining a forward path from the source device to the destination device according to information associated with the first data packet and the network topology data; and transmitting the first data packet to the destination device through the forward path.

Description

[Corrected under Rule 26, 03.06.2020]
FORWARD PATH PLANNING METHOD IN MASSIVE DATA CENTER NETWORKS
BACKGROUND
Data center networks often use compactly interconnected network topologies to deliver high bandwidth for internal data exchange. In such networks, it is imperative to employ effective load balancing schemes so that all the available bandwidth resources can be utilized. In order to utilize all the available bandwidth, data flows need to be routed across the network instead of overloading a single path. The Equal Cost Multipath (ECMP) path planning algorithm enables the use of multiple equal-cost paths from the source node to the destination node in the network. The advantage of using this algorithm is that the data flows can be split more uniformly across the whole network, thus avoiding congestion and improving bandwidth utilization.
In an existing data center network, for example, a two-tier or a three-tier Clos network, multiple servers connect to the network via a first-tier switch, i.e., a leaf switch or an access switch. When a data flow arrives, the server forwards the data packets along various paths in the network to their respective destinations. The forwarding of the data packets may be determined based on the “server-version” topology information and pre-calculated values associated with the various paths. As the data packets arrive at the next-hop leaf switches, each of the leaf switches performs dynamic path planning to distribute the data packets based on the “switch-version” topology information and the path planning algorithm. As the server is not configured to perform the dynamic path planning, the “server-version” topology information may not represent the real-time network topology information. When network congestion occurs, the server cannot determine the source of the congestion and respond in a timely manner to re-route the data flow.
Methods and systems for dynamic path planning in massive data center networks are provided. The present disclosure implements the dynamic path planning capability of a switch (i.e., a leaf switch or an access switch of a two-tier or three-tier Clos network) on a computing node (i.e., a server device) that accesses the network via the switch. The computing node communicates with one or more switches to obtain information related to the path planning algorithm or the routing algorithm used by the switch and the network topology associated with the switch through various protocols, such as the Link Layer Discovery Protocol (LLDP). The computing node further configures the path planning algorithm or the routing algorithm used therein based on the information related to the path planning algorithm or the routing algorithm used by the switch. The path planning algorithm or the routing algorithm used by the switch may include an equal cost multipath (ECMP) planning algorithm. The computing node further synchronizes the network topology associated with the computing node with the network topology associated with the switch. The present disclosure enables the computing node to perform the dynamic path planning for received data packets ahead of the switch in massive data center networks, hence effectively avoiding data flow collisions in the network. Further, the computing node according to the present disclosure can efficiently detect network congestion and respond in a timely manner by re-routing the data flow so as to bypass the congestion.
BRIEF DESCRIPTION OF THE DRAWINGS
The detailed description is set forth with reference to the accompanying figures. In the figures, the left‐most digit (s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different figures indicates similar or identical items.
FIG. 1 illustrates an example environment in which a forward path planning system may be used in accordance with an embodiment of the present disclosure.
FIG. 2 illustrates an example network architecture of massive data center networks in accordance with an embodiment of the present disclosure.
FIG. 3 illustrates example failures occurring in the massive data center networks in accordance with an embodiment of the present disclosure.
FIG. 4 illustrates an example configuration of a computing node for implementing the forward path planning method in accordance with an embodiment of the present disclosure.
FIG. 5 illustrates an example forward path planning in accordance with an embodiment of the present disclosure.
FIG. 6 illustrates another example forward path planning in accordance  with an embodiment of the present disclosure.
FIG. 7 illustrates an example equal cost multipath (ECMP) planning in accordance with an embodiment of the present disclosure.
FIG. 8 illustrates an example forward path planning algorithm in accordance with an embodiment of the present disclosure.
FIG. 9 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure.
FIG. 10 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure.
FIG. 11 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure.
DETAILED DESCRIPTION
The application describes multiple and varied embodiments and implementations. The following section describes an example framework that is suitable for practicing various implementations. Next, the application describes example systems, devices, and processes for implementing a forward path planning system.
FIG. 1 illustrates an example environment in which a forward path planning system may be used in accordance with an embodiment of the present disclosure. The environment 100 may include a data center network 102. In this example, the data center network 102 may include a plurality of computing nodes or  servers 104‐1, 104‐2, …, 104‐K (which are collectively called hereinafter as computing nodes 104) , where K is a positive integer greater than one. In implementations, the plurality of computing nodes 104 may communicate data with each other via a communication network 106.
The computing node 104 may be implemented as any of a variety of computing devices having computing/processing and communication capabilities, which may include, but not limited to, a server, a desktop computer, a notebook or portable computer, a handheld device, a netbook, an Internet appliance, a tablet computer, a mobile device (e.g., a mobile phone, a personal digital assistant, a smart phone, etc. ) , etc., or a combination thereof.
The communication network 106 may be a wireless or a wired network, or a combination thereof. The network 106 may be a collection of individual networks interconnected with each other and functioning as a single large network (e.g., the Internet or an intranet). Examples of such individual networks include, but are not limited to, telephone networks, cable networks, Local Area Networks (LANs), Wide Area Networks (WANs), and Metropolitan Area Networks (MANs). Further, the individual networks may be wireless or wired networks, or a combination thereof. Wired networks may include an electrical carrier connection (such as a communication cable, etc.) and/or an optical carrier or connection (such as an optical fiber connection, etc.). Wireless networks may include, for example, a WiFi network, other radio frequency networks (e.g., Zigbee, etc.), etc. In implementations, the communication network 106 may include a plurality of inter-node interconnects or switches 108-1, 108-2, …, 108-L (which are collectively called hereinafter as inter-node switches 108) for providing connections between the computing nodes 104, where L is a positive integer greater than one.
In implementations, the environment 100 may further include a plurality of client devices 110‐1, 110‐2, …, 110‐N (which are collectively called hereinafter as client devices 110) , where N is a positive integer greater than one. In implementations, users of the client devices 110 may communicate with each other via the communication network 106, or access online resources and services. These online resources and services may be implemented at the computing nodes 104. Data flows generated by users of the client devices 110 may be distributed to a plurality of routing paths and routed to a destination device through one or more of the plurality of paths. In implementations, the destination device may include another client device 110 or a computing node 104. In implementations, each of the plurality of routing paths may include one or more computing nodes 104 and switches 108 inter‐connected by physical links.
FIG. 2 illustrates an example network architecture of massive data center networks in accordance with an embodiment of the present disclosure. The network architecture of massive data center networks 200 may provide a detailed view of the environment in which a forward path planning system may be used. In implementations, the network architecture of massive data center networks is a three-tier Clos network architecture in a full-mesh topology. A first tier corresponds to a tier of leaf switches 206, also called access switches or top of rack (ToR) switches. The computing nodes 208 are directly connected to the leaf switches 206, with each computing node 208 being connected to at least two leaf switches 206. In implementations, a computing node 208 may include one or more network interface controllers (e.g., four network interface controllers) which are connected to one or more ports (e.g., four ports) of a leaf switch 206. In implementations, the number of network interface controllers in each computing node 208 may or may not be the same. A second tier corresponds to a tier of aggregation switches 204, also called spine switches 204, that are connected to one or more leaf switches 206. In implementations, a number of computing nodes 208, the interconnected leaf switches 206, and the interconnected aggregation switches 204 may form a point of delivery (PoD) unit, for example, PoD-1 and PoD-2, as illustrated. A third tier corresponds to a tier of core switches 202 that are connected to one or more aggregation switches 204. The core switches 202 are at the top of the cloud data center network pyramid and may include a wide area network (WAN) connection to the outside carrier network.
In implementations, if two processing units or processes included in different computing nodes 208 are connected under a same leaf switch 206, data packets that are transmitted between the two processing units or processes will pass through that same leaf switch 206 without passing through any of the aggregation switches 204 or core switches 202. Alternatively, if two processing units or processes included in different computing nodes are connected under different leaf switches, data packets that are transmitted between the two processing units or processes will pass through one of the aggregation switches. In implementations, a data packet that is transmitted between the two processing units or processes can be made to flow through a specified aggregation switch by setting an appropriate combination of source and destination ports in the data packet. The routing management for congestion avoidance may aim at enabling data flows from a same leaf switch to different destination leaf switches to pass through different aggregation switches, and/or data flows from different source leaf switches to a same destination leaf switch to pass through different aggregation switches, thus avoiding collisions between the data flows and leading to no network congestion at the aggregation switches.
In implementations, a processing unit or process in a computing node 208 may send/receive data to/from a processing unit or process in another computing node through a network interface controller (NIC). In implementations, a processing unit or process in a computing node 208 may be associated with a single network interface controller or multiple network interface controllers for transmitting data to processing units or processes in other computing nodes. Additionally, or alternatively, multiple processing units or processes may be associated with a single network interface controller and employ that network interface controller for transmitting data to processing units or processes in other computing nodes. In implementations, a plurality of rules for data packet forwarding/routing may be implemented on the computing nodes 208. The plurality of rules may include, but are not limited to, priorities for a processing unit or process in a first computing node to select a neighboring processing unit or process, conditions for a network interface controller in a first computing node to send or receive data, conditions for a network interface controller in a first computing node to route data to/from a network interface controller in a second computing node, etc.
In implementations, the routing management may assign network interface controller (NIC) identifiers to each network interface controller that is connected or linked to a same leaf switch. In some examples, if a network interface controller of the processing unit or process and a network interface controller of a next processing unit or process are located in a same computing node or are directly connected or linked to a same leaf switch, a routing identifier may be determined as a default value or identifier. This default routing identifier indicates that data is either routed within a computing node or through a leaf switch, without passing through any aggregation switch in the communication network. Otherwise, the routing identifier may be determined to be equal to a NIC identifier of that processing unit or process, or another predefined value. Based on a mapping relationship between routing identifiers and aggregation identifiers, an aggregation identifier may be determined based on the determined routing identifier. In implementations, the mapping relationship between routing identifiers and aggregation identifiers may be determined in advance using a probing-based routing mechanism (e.g., sending probing data packets between computing nodes as described in the foregoing description), for example. In other words, data flows between processing units (or processes) which are included in a same computing node or whose network interface controllers are connected to a same leaf switch will not go through any aggregation switch in the communication network. On the other hand, data flows between processing units (or processes) which are included in different computing nodes and whose network interface controllers are connected to different leaf switches will pass through a designated aggregation switch based on a predetermined mapping relationship, thus enabling routing control and management of data flows and distributing the data flows to different aggregation switches to avoid network congestion.
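By way of illustration only, the following minimal sketch shows the identifier-based decision described above. The mappings (nic_to_leaf, routing_to_aggregation) and all names are assumptions introduced for this example and are not structures defined by the present disclosure.

```python
# Sketch of the routing-identifier / aggregation-identifier lookup described above.
from typing import Optional

DEFAULT_ROUTING_ID = 0  # data stays within a node or behind a single leaf switch

nic_to_leaf = {"nic-a1": "leaf-1", "nic-a2": "leaf-1", "nic-b1": "leaf-2"}
routing_to_aggregation = {1: "agg-1", 2: "agg-2"}  # mapping assumed to be learned by probing


def routing_identifier(src_nic: str, dst_nic: str, src_nic_id: int,
                       same_node: bool) -> int:
    """Use the default identifier when no aggregation switch is needed;
    otherwise fall back to the sender's NIC identifier."""
    if same_node or nic_to_leaf[src_nic] == nic_to_leaf[dst_nic]:
        return DEFAULT_ROUTING_ID
    return src_nic_id


def aggregation_switch(routing_id: int) -> Optional[str]:
    """Map a routing identifier to a designated aggregation switch, or None
    when the flow never leaves the leaf tier."""
    if routing_id == DEFAULT_ROUTING_ID:
        return None
    return routing_to_aggregation[routing_id]


# Two NICs behind different leaf switches: the flow is pinned to "agg-1".
print(aggregation_switch(routing_identifier("nic-a1", "nic-b1", 1, same_node=False)))
```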
It should be appreciated that the three‐tier Clos network as illustrated in FIG. 2 is merely an example network architecture of the massive data center network. Other network architectures including, but not limited to, two‐tier Clos networks may also be adopted to construct the massive data center networks.
FIG. 3 illustrates example failures occurring in the massive data center networks in accordance with an embodiment of the present disclosure. After the configuration of a data center network 300 is delivered, network anomalies, including, but not limited to, a link failure (e.g., failures 312, 316, and 320), a computing node failure (e.g., failure 310), a leaf switch failure (e.g., failure 318), an aggregation switch failure (e.g., failure 314), or a core switch failure (e.g., failure 322), may occur, causing data packet loss and congestion in certain routing paths. To detect such anomalies in the data center network, detection technology such as network quality analyzer (NQA) Track may be introduced. In implementations, one or more computing nodes in each point of delivery (PoD) unit may implement the NQA Track scheme to detect whether another computing node in the same or a different PoD unit, a leaf switch in the same or a different PoD unit, an aggregation switch in the same or a different PoD unit, or a core switch is unreachable, etc.
An example detection approach is to utilize one computing node in a point of delivery (PoD) unit as a detection source. By way of example and not limitation, computing node 308-1 in PoD-1 may be assigned as a detection source. The computing node 308-1 may periodically ping other computing nodes, leaf switches, aggregation switches, and core switches by sending an Internet Control Message Protocol (ICMP) echo request packet. The computing node 308-1 waits for an ICMP echo reply from each of those nodes and switches. After a pre-set period, also called a Time to Live (TTL) period, if no echo reply is received from either the computing node or the switch, the computing node 308-1 determines that the computing node or the switch is unreachable. In one example, when the packet loss occurs in a large number of computing nodes connected to a same leaf switch, the detection source (i.e., the designated computing node for anomaly detection) may further determine that the leaf switch may have failed. In another example, when the packet loss only occurs in sporadic computing nodes connected to a same leaf switch, the detection source may further determine that these sporadic computing nodes may experience overload or the corresponding ports in the leaf switch may be full. In yet another example, when packet loss occurs in a large number of computing nodes located in a different point of delivery (PoD) unit, the detection source may further determine that failures may occur in one or more corresponding aggregation switches located therein and/or one or more corresponding core switches connected thereto. In yet another example, when the packet loss only occurs in sporadic computing nodes located in a different point of delivery (PoD) unit, the detection source may further determine that these sporadic computing nodes may experience overload or the corresponding ports in the aggregation switch and/or the core switch may be full.
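As a rough illustration of the echo-request probing described above, the sketch below shells out to the system ping utility (Linux-style -c/-W flags) rather than implementing NQA Track or raw ICMP; the target names and timeout value are assumptions made for the example.

```python
# Reachability sweep sketch: one echo request per target with a short timeout.
import subprocess

TARGETS = ["leaf-1", "leaf-2", "agg-1", "core-1", "node-2"]  # illustrative names
TIMEOUT_SECONDS = 1


def is_reachable(host: str) -> bool:
    """Return True if a single ICMP echo reply arrives before the timeout."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(TIMEOUT_SECONDS), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


unreachable = [target for target in TARGETS if not is_reachable(target)]
if unreachable:
    print("possible failures around:", unreachable)
```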
Another example detection approach is to coordinate various computing nodes in different locations as detection sources and deploy different detection strategies among those detection sources. Each of the detection sources may implement an agent program capable of additionally detecting anomalies associated with OSI layer 4 through layer 7. The detection sources may accept input control commands to dynamically configure the detection strategies. This approach may establish TCP connections to further detect one or more parameters associated with the transport layer (i.e., OSI layer 4), for example, transmission delay or transmission rate. With the network topology information, the transmission delay or transmission rate, and the data packet loss rate, the detection approach may further learn the exact location of the anomaly. In one example, when the data packets routed by a first leaf switch to a number of computing nodes are experiencing a high packet loss rate but the data packets to the number of computing nodes routed by a different leaf switch are received with no significant delays, the detection source may determine that the number of computing nodes are operating normally but the first leaf switch may experience a certain anomaly. In another example, when the data packets to a first number of computing nodes routed by the first leaf switch experience a long delay but the data packets to a second number of computing nodes routed by the same leaf switch have no delay, the detection source may determine that the first leaf switch is operating but the ports in the first leaf switch that correspond to the first number of computing nodes may be congested.
It should be appreciated that the network failure detection approaches described above are merely for illustration purposes. Other approaches including, but not limited to, random early detection (RED), weighted random early detection (WRED), robust random early detection (RRED), explicit congestion notification (ECN), and backward ECN (BECN) may also be implemented to detect network congestion.
FIG. 4 illustrates an example configuration of a computing node for implementing the forward path planning method in accordance with an embodiment of the present disclosure. In implementations, the example configuration 400 of the computing node 402 may include, but is not limited to, one or more processing units 404, one or more network interfaces 406, an input/output (I/O) interface 408, and memory 412. In implementations, the computing node 402 may further include one or more intra‐node interconnects or switches 410.
In implementations, the processing units 404 may be configured to execute instructions that are stored in the memory 412, and/or received from the input/output interface 408, and/or the network interface 406. In implementations, the processing units 404 may be implemented as one or more hardware processors including, for example, a microprocessor, an application‐specific instruction‐set processor, a physics processing unit (PPU) , a central processing unit (CPU) , a graphics processing unit, a digital signal processor, a tensor processing unit, etc. Additionally, or alternatively, the functionality described herein can be performed, at least in part,  by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include field‐programmable gate arrays (FPGAs) , application‐specific integrated circuits (ASICs) , application‐specific standard products (ASSPs) , system‐on‐a‐chip systems (SOCs) , complex programmable logic devices (CPLDs) , etc.
The memory 412 may include machine readable media in a form of volatile memory, such as Random Access Memory (RAM) and/or non‐volatile memory, such as read only memory (ROM) or flash RAM. The memory 412 is an example of machine readable media.
The machine readable media may include a volatile or non‐volatile type, a removable or non‐removable media, which may achieve storage of information using any method or technology. The information may include a machine readable instruction, a data structure, a program module or other data. Examples of machine readable media include, but not limited to, phase‐change memory (PRAM) , static random access memory (SRAM) , dynamic random access memory (DRAM) , other types of random‐access memory (RAM) , read‐only memory (ROM) , electronically erasable programmable read‐only memory (EEPROM) , quick flash memory or other internal storage technology, compact disk read‐only memory (CD‐ROM) , digital versatile disc (DVD) or other optical storage, magnetic cassette tape, magnetic disk storage or other magnetic storage devices, or any other non‐transmission media, which may be used to store information that may be accessed by a computing node. As defined herein, the machine readable media does not include any transitory media,  such as modulated data signals and carrier waves.
In implementations, the network interfaces 406 may be configured to connect the computing node 402 to other computing nodes via the communication network 106. In implementations, the network interfaces 406 may be established through a network interface controller (NIC) , which may employ both hardware and software in connecting the computing node 402 to the communication network 106. In implementations, each type of NIC may use a different type of fabric or connector to connect to a physical medium associated with the communication network 106. Examples of types of fabrics or connectors may be found in the IEEE 802 specifications, and may include, for example, Ethernet (which is defined in 802.3) , Token Ring (which is defined in 802.5) , and wireless networking (which is defined in 802.11) , an InfiniBand, etc.
In implementations, the intra-node switches 410 may include various types of interconnects or switches, which may include, but are not limited to, a high-speed serial computer expansion bus (such as PCIe, etc.), a serial multi-lane near-range communication link (such as NVLink, which is a wire-based serial multi-lane near-range communication link, for example), a switch chip with a plurality of ports (e.g., an NVSwitch, etc.), a point-to-point processor interconnect (such as an Intel QPI/UPI, etc.), etc.
In implementations, the computing node 402 may further include other hardware components and/or other software components, such as program modules 414 to execute instructions stored in the memory 412 for performing various  operations, and program data 416 for storing data received for path planning, anomaly detection, etc. In implementations, the program modules 414 may include a topology awareness module 418, a path planning module 420, and an anomaly detection module 422.
The topology awareness module 418 may be configured to maintain topology data associated with the network 106. The topology data may be generated and implemented on each element of the network 106 when the network architecture is delivered. The topology data includes arrangements of the elements of a network, such as computing nodes, leaf switches, aggregation switches, core switches, etc., and indicates the connections/links among these elements. In the equal cost multipath (ECMP) algorithm, the topology data may be represented as a non-directional (undirected) graph and stored as an adjacency list. All paths that route the data packets from a source node to a destination node may be configured to have equal cost. Hence, data flow from the source node to the destination node may be evenly distributed to all paths. In implementations, one or more available paths from the source node to the destination node may be configured as reserved bandwidth and the data flow may be forwarded only to available paths. The topology awareness module 418 may communicate with one or more switches in the massive data center network (e.g., leaf switch 306) in real time, periodically, or at a pre-set time period to obtain topology data associated with the network 106. In implementations, the topology data associated with the network 106 may be stored in one or more separate storage devices. The topology awareness module 418 may communicate with the one or more separate storage devices to obtain the real-time topology data. In implementations, the topology data associated with the network 106 may be dynamically updated to the topology awareness module 418 in response to a notification of network congestion. In implementations, the communication and topology data exchange between the computing node and the leaf switches may be achieved by using protocols including, but not limited to, the Link Layer Discovery Protocol (LLDP), the Link Aggregation Control Protocol (LACP), general-purpose Remote Procedure Calls (GRPC), etc. In implementations, the communication and topology data exchange between the computing node and the leaf switches may be achieved by remote software control.
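By way of illustration only, the sketch below shows one way such a topology store could look: an undirected graph kept as an adjacency list, with a breadth-first search that enumerates every equal-cost (minimum-hop) path between two endpoints. The node names and the small topology are assumptions made for the example.

```python
# Topology kept as an undirected adjacency list; equal-cost paths found by BFS.
from collections import deque

topology = {
    "node-1": ["leaf-1", "leaf-2"],
    "node-2": ["leaf-3", "leaf-4"],
    "leaf-1": ["node-1", "agg-1"],
    "leaf-2": ["node-1", "agg-2"],
    "leaf-3": ["node-2", "agg-1"],
    "leaf-4": ["node-2", "agg-2"],
    "agg-1": ["leaf-1", "leaf-3"],
    "agg-2": ["leaf-2", "leaf-4"],
}


def equal_cost_paths(graph, src, dst):
    """Breadth-first search that keeps every path of minimum hop count."""
    best, paths = None, []
    queue = deque([[src]])
    while queue:
        path = queue.popleft()
        if best is not None and len(path) > best:
            break
        if path[-1] == dst:
            best = len(path)
            paths.append(path)
            continue
        for neighbor in graph[path[-1]]:
            if neighbor not in path:
                queue.append(path + [neighbor])
    return paths


print(equal_cost_paths(topology, "node-1", "node-2"))
# -> [['node-1', 'leaf-1', 'agg-1', 'leaf-3', 'node-2'],
#     ['node-1', 'leaf-2', 'agg-2', 'leaf-4', 'node-2']]
```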
The path planning module 420 may be configured to determine the routing paths to forward data packets and distribute the data packets to the routing paths to balance the data flows in the network 106. In implementations, when a data packet arrives, the path planning module 420 may obtain a source address, a destination address, and a protocol from an IP header of a TCP/IP data packet, and a source port and a destination port from the TCP packet. The source address, the destination address, the source port, the destination port, and the protocol may form a so-called five-tuple (or 5-tuple). The five-tuple may uniquely indicate a data flow, in which all data packets have exactly the same five-tuple. The path planning module 420 may determine all possible routing paths from the source node to the destination node. A data flow, in which all data packets have exactly the same five-tuple, may use one of all the possible routing paths at one time. In implementations, when it is expected to select a different path, for example, when a network anomaly occurs, the network topology data changes due to the anomaly. The topology awareness module 418 may update the network topology data stored in the program data 416 to reflect the changes. In implementations, the network topology data associated with the computing node 402 may be stored separately from the program data 416 and may be updated in response to the topology data changes due to the anomaly. The path planning module 420 may recompute using the Hash algorithm based on the updated network topology data and select a different path by using another source port. It should be appreciated that the five-tuple (or 5-tuple) that uniquely indicates a data flow described above is merely for the purpose of illustration. The present disclosure is not intended to be limiting. The path planning module 420 may construct a three-tuple (or 3-tuple) including the source IP address, the destination IP address, and an ICMP Identifier that uniquely identifies an ICMP Query session to indicate a data flow.
In implementations, the path planning module 420 may determine all possible routing paths from the source node to the destination node. The routing paths may be determined based on various path finding algorithms including, but not limited to, shortest path algorithms. Examples of shortest path algorithms may include, but are not limited to, Dijkstra's algorithm, the Viterbi algorithm, the Floyd–Warshall algorithm, the Bellman–Ford algorithm, etc. The path planning module 420 may further perform a Hash operation on the five-tuple to obtain corresponding five-tuple hash values and determine a routing path from all possible shortest routing paths according to the five-tuple hash values. As the Hash operation maps a five-tuple to a unique path, data packets that all have the same five-tuple may be routed through the same path. Various Hash algorithms may be implemented by the path planning module 420 including, but not limited to, Message Digest (MD, MD2, MD4, MD5 and MD6), RIPEMD (RIPEMD, RIPEMD-128, and RIPEMD-160), Whirlpool (Whirlpool-0, Whirlpool-T, and Whirlpool), or Secure Hash Function (SHA-0, SHA-1, SHA-2, and SHA-3). By implementing the Hash operation on the five-tuple data, different data flows can be distributed evenly to all possible routing paths between a source node and a destination node to avoid network congestion.
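A minimal sketch of mapping a five-tuple onto one of the candidate paths is shown below, using MD5 from the hash families named above. The seed parameter, path labels, and flow values are assumptions made for the example; a real switch would apply its own hash configuration.

```python
# Five-tuple hashing sketch: same five-tuple -> same digest -> same path.
import hashlib


def five_tuple_hash(src_ip, src_port, dst_ip, dst_port, protocol, seed=0):
    key = f"{seed}|{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    return int.from_bytes(hashlib.md5(key).digest()[:8], "big")


def select_path(paths, flow, seed=0):
    """Map every packet of a flow (identical five-tuple) to the same path."""
    return paths[five_tuple_hash(*flow, seed=seed) % len(paths)]


paths = ["Path A", "Path B", "Path C", "Path D"]
flow = ("10.0.0.1", 40001, "10.0.1.9", 443, "TCP")
print(select_path(paths, flow))   # always the same path for this flow
```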
In current path planning methods, the computing node forwards data packets based on pre-computed Hash values corresponding to all possible routing paths, respectively, and the switches (i.e., the leaf switches) perform path planning based on the dynamic network topology and a Hash algorithm. Quite often, the topology data associated with the computing node and the leaf switches are not synchronized, and the Hash algorithms implemented on the computing node and the leaf switches have different configurations. Hence, the five-tuple Hash values calculated by the computing node and the switches (i.e., leaf switches or aggregation switches) may direct different data flows to a same routing path, causing possible congestion in the network. In the present implementation, the path planning module 420 of the computing node 402 may obtain information associated with the Hash algorithm implemented on one or more leaf switches and configure the Hash algorithm implemented on the computing node 402 using the obtained information. The information may include one or more parameters configured for the Hash algorithm implemented on the one or more leaf switches. The path planning module 420 of the computing node 402 may further obtain network topology data stored associated with the one or more leaf switches and update the network topology data according to the obtained network topology data. In implementations, the communication and data exchange between the computing node and the leaf switches may be achieved by using protocols including, but not limited to, the Link Layer Discovery Protocol (LLDP), the Link Aggregation Control Protocol (LACP), general-purpose Remote Procedure Calls (GRPC), etc. As the topology data and the Hash algorithm configuration are synchronized between the computing node and the leaf switch, the collisions of mapping different five-tuple hash values to a same path may be reduced and possible flow congestion may be avoided. Further, when a network anomaly occurs, as the computing node maintains an updated topology from the view of the leaf switch and active sessions of the data flow, the computing node may effectively determine the element in the network that is involved in the anomaly and re-plan the forward paths for the data flow.
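The sketch below illustrates the synchronization idea at a high level. The LLDP/LACP/GRPC exchange itself is not implemented; the switch_advertisement dictionary merely stands in for whatever the leaf switch reports, and the field names are assumptions of this example.

```python
# Keeping the host's hash parameters and topology view in step with the switch.
from dataclasses import dataclass, field


@dataclass
class PathPlannerConfig:
    hash_algorithm: str = "md5"
    hash_seed: int = 0
    topology: dict = field(default_factory=dict)

    def sync_from_switch(self, switch_advertisement: dict) -> None:
        """Adopt the switch's hash parameters and topology so that both ends
        map the same five-tuple to the same ECMP member."""
        self.hash_algorithm = switch_advertisement["hash_algorithm"]
        self.hash_seed = switch_advertisement["hash_seed"]
        self.topology = switch_advertisement["topology"]


config = PathPlannerConfig()
config.sync_from_switch({
    "hash_algorithm": "crc32",
    "hash_seed": 0x5EED,
    "topology": {"leaf-1": ["agg-1", "agg-2"]},
})
print(config.hash_algorithm, hex(config.hash_seed))
```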
The anomaly detection module 422 may be configured to detect anomalies occurring in the network 106. The anomaly detection module 422 may implement the detection approaches described above with respect to FIG. 3, and hence, they are not described in detail herein.
The program data 416 may be configured to store topology information 424, configuration information 426, and routing information 428. The topology information may include the network elements and the connection status of the  network elements. The topology information may be dynamically updated according to the path planning and data exchange between the computing node 402 and the leaf switches. The configuration information 426 may include versions and parameters of the algorithms implemented on the computing node 402, routing algorithms, Hash algorithms, for example. The routing information 428 may include all possible routing paths between a source node and a destination node. The routing information 428 may further include mappings between the five‐tuple hash values and the corresponding forward path.
FIG. 5 illustrates an example forward path planning in accordance with an embodiment of the present disclosure. The example forward path planning 500 is illustrated among various computing nodes and leaf switches, ultimately connected to a single aggregation switch. Data packets from computing node 506-1 to computing node 506-2 are distributed to two data flows in two routing paths. Path A includes four hops: Path A-1, Path A-2, Path A-3, and Path A-4, and goes through the computing node 506-1, the leaf switch 504-1, the aggregation switch 502, the leaf switch 504-2, and the computing node 506-2. Path B includes four hops: Path B-1, Path B-2, Path B-3, and Path B-4, and goes through the computing node 506-1, the leaf switch 504-1, the aggregation switch 502, the leaf switch 504-3, and the computing node 506-2. In the transmission of the data flow, the computing node 506-1 detects an anomaly in Path A and further determines that Path A-3 and Path A-4 are involved in the anomaly. The anomaly may be associated with the leaf switch 504-3 and/or the ports of the leaf switch 504-3. The network topology data associated with the computing node 506-1 may be updated to reflect the dynamic changes of the network caused by the anomaly. Based on the updated network topology data, the computing node 506-1 may recompute using the Hash algorithm and select a different path by using another source port. The computing node 506-1 may transmit the data flow using the different path that goes through the leaf switch 504-4, including Path A-1, Path A-2, Path A-3’, and Path A-4’.
FIG. 6 illustrates another example forward path planning in accordance with an embodiment of the present disclosure. The example forward path planning 600 is illustrated among various computing nodes and leaf switches, ultimately connected to two aggregation switches. Data packets from computing node 606-1 to computing node 606-2 are distributed to two data flows in two routing paths. Path A includes four hops: Path A-1, Path A-2, Path A-3, and Path A-4, and goes through the computing node 606-1, the leaf switch 604-3, the aggregation switch 602-2, the leaf switch 604-4, and the computing node 606-2. Path B includes four hops: Path B-1, Path B-2, Path B-3, and Path B-4, and goes through the computing node 606-1, the leaf switch 604-1, the aggregation switch 602-1, the leaf switch 604-2, and the computing node 606-2. In the transmission of the data flow, the computing node 606-1 detects an anomaly in Path A and further determines that the leaf switch 604-4 is involved in the anomaly. The network topology data associated with the computing node 606-1 may be updated to reflect the dynamic changes of the network caused by the anomaly. Based on the updated network topology data, the computing node 606-1 may recompute using the Hash algorithm and select a different path by using another source port. The computing node 606-1 may transmit the data flow using the different path that goes through the leaf switch 604-2, including Path A-1, Path A-2, Path A-3’, and Path A-4’.
FIG. 7 illustrates an example equal cost multipath (ECMP) planning in accordance with an embodiment of the present disclosure. ECMP load balancing refers to distributing data flows evenly by using a load balancing algorithm to identify flows and distribute the data flows to different routing paths. As illustrated in the ECMP planning 700, four equal cost paths, Path A, Path B, Path C, and Path D, are available to route the data packets from computing node 706-1 to computing node 706-2. With the use of the Hash algorithm on the five-tuple data for routing, all four paths are utilized to route the data packets from computing node 706-1 to computing node 706-2. At the initial path planning state, this facilitates distributing data flows evenly across the network and reducing possible congestion. Further, when one of the four paths fails, the data flows can be distributed among the other three paths. It should be appreciated that the five-tuple and the ECMP load balancing shown in FIG. 7 are merely for the purpose of illustration. The present disclosure is not intended to be limiting. In implementations, a three-tuple including the source IP address, the destination IP address, and an ICMP Identifier that uniquely identifies an ICMP Query session may be adopted to indicate a data flow. Further, one or more paths may be set as reserved bandwidth, and hence, the data flow is distributed to only the available paths excluding the reserved paths.
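As an illustration of the ECMP behavior described for FIG. 7, the sketch below spreads several flows over the four equal-cost paths by hash-modulo and re-maps all flows over the surviving paths when one path fails. All flow and path values are made up for the example.

```python
# ECMP distribution sketch with a simple failover over the remaining paths.
import hashlib
from collections import Counter


def pick(paths, flow):
    digest = hashlib.md5("|".join(map(str, flow)).encode()).digest()
    return paths[int.from_bytes(digest[:8], "big") % len(paths)]


paths = ["Path A", "Path B", "Path C", "Path D"]
flows = [("10.0.0.1", port, "10.0.1.9", 443, "TCP") for port in range(40000, 40016)]

print(Counter(pick(paths, flow) for flow in flows))       # initial spread over 4 paths

healthy = [path for path in paths if path != "Path C"]     # Path C fails
print(Counter(pick(healthy, flow) for flow in flows))      # redistributed over 3 paths
```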
FIG. 8 illustrates an example forward path planning algorithm in  accordance with an embodiment of the present disclosure. FIG. 9 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure. FIG. 10 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure. FIG. 11 illustrates another example forward path planning algorithm in accordance with an embodiment of the present disclosure. The methods described in FIGs. 8‐11 may be implemented in the environment of FIG. 1 and/or the network architecture of FIG. 2. However, the present disclosure is not intended to be limiting. The methods described in FIGs. 8‐11 may alternatively be implemented in other environments and/or network architectures.
The methods described in FIGs. 8‐11 are described in the general context of machine‐executable instructions. Generally, machine‐executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, and the like that perform particular functions or implement particular abstract data types. Furthermore, each of the example methods are illustrated as a collection of blocks in a logical flow graph representing a sequence of operations that can be implemented in hardware, software, firmware, or a combination thereof. The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method, or alternate methods. Additionally, individual blocks may be omitted from the method without departing from the spirit and scope of the subject matter described herein. In the context of software, the  blocks represent computer instructions that, when executed by one or more processors, perform the recited operations. In the context of hardware, some or all of the blocks may represent application specific integrated circuits (ASICs) or other physical components that perform the recited operations.
Referring back to the method 800 described in FIG. 8, at block 802, a first computing node (e.g., the computing node 104) may obtain information associated with an algorithm implemented by at least a second computing node.
In implementations, the first computing node may implement a same algorithm as the at least one second computing node. The algorithm implemented on the first computing node may be configured differently from the algorithm implemented on the second computing node. The algorithm may include various Hash algorithms used for routing path planning based on the five-tuples associated with the data packets in the data flow. In implementations, the at least one second computing node may be a leaf switch in a three-tier Clos network, through which the first computing node connects to the data center network.
At block 804, the first computing node (e.g., the computing node 104) may obtain network topology data stored associated with the at least one second computing node.
In implementations, the first computing node may obtain the information associated with an algorithm implemented by at least a second computing node described at block 802 and the network topology data associated with the at least one second computing node described at block 804 via various  network protocols, Link Layer Discovery Protocol (LLDP) , for example. The network topology data may be represented in a non‐directional graph illustrating the elements of the network and the connection status associated therewith.
At block 806, the first computing node (e.g., the computing node 104) may receive a first data packet from a source device to be forwarded to a destination device.
In implementations, the source device and the destination device may refer to the client devices 110 of FIG. 1. In an example, the data packet is generated when a user operates the client device 110 to communicate with another user operating a different client device. In another example, the data packet is generated when a user visits an online resource or uses an online service. In yet another example, the computing nodes 104 may upload or download data from cloud storage spaces, thus generating a flow of data packets.
At block 808, the first computing node (e.g., the computing node 104) may determine a set of values associated with the first data packet according to the information associated with the algorithm.
In implementations, the set of values associated with the first data packet may include a five‐tuple extracted from the first data packet. The set of values may include a source IP address, a destination IP address, a source port number, a destination port number, and a protocol used for communication. The set of values, i.e., the five‐tuple, may be hashed using a Hash algorithm implemented by the first computing node. In implementations, a three‐tuple including the source IP address,  destination IP address, and ICMP Identifier that uniquely identifies an ICMP Query session may be adopted to indicate a data flow.
At block 810, the first computing node (e.g., the computing node 104) may determine a forward path from the source device to the destination device according to the set of values associated with the first data packet and the network topology data.
In implementations, the first computing node may update the network topology data associated therewith using the network topology data obtained from a storage associated with the at least one second computing node. The first computing node may determine all shortest routing paths according to the updated network topology data and select the forward path from the shortest routing paths according to the set of values, for example, the five-tuple hash values.
At block 812, the first computing node (e.g., the computing node 104) may transmit the first data packet to the destination device through the forward path.
Referring back to the method 900 described in FIG. 9, at block 902, a first computing node (e.g., the computing node 104) may determine five‐tuple data associated with the first data packet, the five‐tuple data including a source IP address, a source port number, a destination IP address, a destination port number, and a protocol.
In implementations, the first computing node may extract the source IP address, the destination IP address, and the protocol from the IP header of the data packet and further extract the source port number and the destination port number from the TCP portion. The protocol may include any type of IP protocol including, but not limited to, IPv4 and IPv6. In other implementations, the first computing node may extract the source IP address, the destination IP address, and the ICMP identifier to generate a three-tuple (or 3-tuple) to indicate the data flow.
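By way of illustration only, the sketch below parses the five-tuple out of a plain IPv4 + TCP packet. The sample packet bytes are fabricated for the demonstration, and IPv6 or options-heavy packets are out of scope for this example.

```python
# Five-tuple extraction sketch for a plain IPv4 + TCP packet.
import socket
import struct


def extract_five_tuple(packet: bytes):
    ihl = (packet[0] & 0x0F) * 4                 # IPv4 header length in bytes
    protocol = packet[9]                         # 6 = TCP
    src_ip = socket.inet_ntoa(packet[12:16])
    dst_ip = socket.inet_ntoa(packet[16:20])
    src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
    return src_ip, src_port, dst_ip, dst_port, protocol


# Minimal fabricated IPv4 header (20 bytes) followed by the TCP port fields.
ip_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 0, 0, 64, 6, 0,                 # version/IHL ... TTL, protocol=6, checksum
    socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.1.9"),
)
tcp_ports = struct.pack("!HH", 40001, 443)
print(extract_five_tuple(ip_header + tcp_ports))
# -> ('10.0.0.1', 40001, '10.0.1.9', 443, 6)
```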
At block 904, the first computing node (e.g., the computing node 104) may update the configuration of a Hash algorithm implemented by a first computing node using information associated with the algorithm implemented by at least one second computing node. In implementations, the information associated with the algorithm implemented by at least one second computing node may include versions of the algorithms, one or more parameters configuration of the algorithms, etc.
At block 906, the first computing node (e.g., the computing node 104) may compute five-tuple hash values corresponding to the five-tuple data as the set of values associated with the first data packet using the updated Hash algorithm. In implementations, when the first computing node generates a three-tuple to identify a data flow, the first computing node computes three-tuple hash values corresponding to the three-tuple data as the set of values associated with the first data packet using the updated Hash algorithm.
Referring back to the method 1000 described in FIG. 10, at block 1002, a first computing node (e.g., the computing node 104) may determine one or more paths from the source device to the destination device according to the network topology data. In implementations, the one or more paths from the source device to the destination device may include one or more shortest paths. The first computing node may implement various shortest path algorithms, such as Dijkstra's algorithm, the Viterbi algorithm, the Floyd–Warshall algorithm, or the Bellman–Ford algorithm.
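As a brief illustration of one of the shortest-path algorithms mentioned above, the sketch below runs a compact Dijkstra search over a weighted adjacency list. The link weights and node names are assumptions; in an equal-cost deployment every link would simply carry the same weight.

```python
# Compact Dijkstra sketch over a weighted adjacency list.
import heapq

weighted = {
    "node-1": {"leaf-1": 1, "leaf-2": 1},
    "leaf-1": {"node-1": 1, "agg-1": 1},
    "leaf-2": {"node-1": 1, "agg-2": 2},
    "agg-1": {"leaf-1": 1, "leaf-3": 1},
    "agg-2": {"leaf-2": 2, "leaf-3": 1},
    "leaf-3": {"agg-1": 1, "agg-2": 1, "node-2": 1},
    "node-2": {"leaf-3": 1},
}


def dijkstra(graph, src, dst):
    """Return (cost, path) of the cheapest route from src to dst."""
    heap = [(0, src, [src])]
    seen = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor not in seen:
                heapq.heappush(heap, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []


print(dijkstra(weighted, "node-1", "node-2"))
# -> (4, ['node-1', 'leaf-1', 'agg-1', 'leaf-3', 'node-2'])
```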
At block 1004, the first computing node (e.g., the computing node 104) may perform a modulus operation on the five-tuple hash values with respect to the one or more paths. In implementations, the modulus operation may generate one or more distinct modulus values corresponding to the one or more paths, respectively. Data packets having the same five-tuple may form a data flow. Data packets directed from a source IP address to a destination IP address may be distributed to different data flows depending on the source ports through which the data packets are transmitted. Data flows may be distributed to the one or more paths according to the one or more distinct modulus values that distinctly correspond to the one or more paths.
At block 1006, the first computing node (e.g., the computing node 104) may determine a forward path from the one or more paths according to the results of the modulus operation. In implementations, the first computing node may select one path that maps to a data flow represented by the five‐tuple as the forward path. The arriving data packets that have the same five‐tuple may use the same forward path. In other implementations, the first computing node may designate one of the one or more paths as the forward path based on the traffic on these paths.
Referring back to the method 1100 described in FIG. 11, at block 1102, a first computing node (e.g., the computing node 104) may determine at least a first forward path and a second forward path from the one or more paths. The first computing node may implement the equal cost multipaths (ECMP) algorithms to  determine all possible paths between a source device and a destination device. In implementations, the one or more paths may be sorted based on the associated one or more distinct modulus values. The first computing node may select one path that maps to a data flow represented by the five‐tuple to forward the data flow. Alternatively, the first computing node may designate more than one path to forward the data flow. In implementations, the first computing node may distribute the data packets from the source device to the destination device to different data flows to be transmitted to all possible paths between the source device and the destination device. In other implementations, the first computing node may distribute the data flows to a set of all possible paths.
At block 1104, the first computing node (e.g., the computing node 104) may receive a plurality of second data packets from the source device to be forwarded to the destination device. The plurality of second data packets may have the same or different five-tuples. The data packets that have the same five-tuple may be transmitted through one path of the first forward path and the second forward path as a data flow at one time. The data packets that have different five-tuples may form different data flows that go through different forward paths. In implementations, when one of the plurality of second data packets has the same five-tuple as the first data packet described in FIG. 8, the second data packet is transmitted through the same forward path generated according to the embodiment illustrated in FIG. 8.
At block 1106, the first computing node (e.g., the computing node 104) may distribute the plurality of second data packets to the first forward path and the  second forward path, each of the first forward path and the second forward path carries a portion of the plurality of second data packets. In implementations, the first computing node may evenly distribute the data flows to all possible paths (i.e., the first forward path and the second forward path ) between the source device and the destination device based on the Hash computation. In other implementations, the data flow carried by the first forward path and the second forward path may be uneven.
At block 1108, the first computing node (e.g., the computing node 104) may determine an anomaly in one of the first forward path and the second forward path. The anomaly may be associated with a computing node, a switch, a port of a computing node, a port of a switch, etc., causing network congestion. The first computing node may detect the anomaly using the detection approaches described above with respect to FIG. 3. In implementations, the first computing node may generate one or more sessions corresponding to the one or more forward paths, respectively. The first computing node may detect the anomaly when a session timeout occurs in one of the one or more sessions.
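The sketch below illustrates the session-timeout check described above: one session is tracked per forward path, and a path whose session has been silent longer than the timeout is flagged. The timeout value, timestamps, and path names are assumptions made for the example.

```python
# Session-timeout sketch: flag forward paths whose sessions have gone silent.
import time

SESSION_TIMEOUT = 3.0  # seconds without traffic before a path is suspect

sessions = {
    "first forward path": {"last_seen": time.monotonic()},
    "second forward path": {"last_seen": time.monotonic() - 10.0},
}


def paths_with_anomaly(sessions, now=None):
    now = time.monotonic() if now is None else now
    return [path for path, session in sessions.items()
            if now - session["last_seen"] > SESSION_TIMEOUT]


print(paths_with_anomaly(sessions))   # -> ['second forward path']
```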
At block 1110, the first computing node (e.g., the computing node 104) may determine a third forward path from the source device to the destination device to reroute the data flow, i.e., the portion of the plurality of second data packets that is involved in the abnormality. In implementations, the first computing node may recompute using the Hash algorithm based on the updated network topology data and select a different path by using another source port.
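By way of illustration only, the sketch below shows the reroute step of block 1110: the five-tuple hash is recomputed with alternative source ports until it lands on a path that is not involved in the anomaly. The hash helper and port range are assumptions introduced for the example.

```python
# Reroute sketch: probe alternative source ports until the hash avoids the failed path.
import hashlib


def path_index(flow, path_count):
    digest = hashlib.md5("|".join(map(str, flow)).encode()).digest()
    return int.from_bytes(digest[:8], "big") % path_count


def reroute(paths, failed_path, src_ip, dst_ip, dst_port, protocol,
            port_range=range(40000, 40100)):
    """Pick a new source port whose five-tuple hashes to a healthy path."""
    for src_port in port_range:
        flow = (src_ip, src_port, dst_ip, dst_port, protocol)
        candidate = paths[path_index(flow, len(paths))]
        if candidate != failed_path:
            return src_port, candidate
    raise RuntimeError("no healthy path found in the probed port range")


paths = ["Path A", "Path B", "Path C", "Path D"]
print(reroute(paths, "Path A", "10.0.0.1", "10.0.1.9", 443, "TCP"))
```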
Although the above method blocks are described to be executed in a particular order, in some implementations, some or all of the method blocks can be executed in other orders, or in parallel.
Although implementations have been described in language specific to structural features and/or methodological acts, it is to be understood that the claims are not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claimed subject matter. Additionally, or alternatively, some or all of the operations may be implemented by one or more ASICs, FPGAs, or other hardware.
EXAMPLE CLAUSES
A. A method implemented by a first computing node, the method comprising: obtaining, via a network, information associated with an algorithm implemented by at least one second computing node; obtaining, via the network, network topology data stored associated with the at least one second computing node; receiving, at the first computing node, a first data packet from a source device to be forwarded to a destination device; determining a forward path from the source device to the destination device according to information associated with the first data packet and the network topology data; and transmitting the first data packet to the destination device through the forward path.
B. The method as recited in paragraph A, wherein the information associated with the first data packet includes a set of values, and determining, at the first computing node, a set of values associated with the first data packet according to the information associated with the algorithm further comprises: executing a Hash algorithm implemented by the first computing node; applying at least the information associated with the algorithm implemented by the at least one second computing node to the Hash algorithm; and computing the set of values associated with the first data packet using the Hash algorithm.
C. The method as recited in paragraph A, wherein the information associated with the first data packet includes a set of values, and determining, at the first computing node, a set of values associated with the first data packet according to the information associated with the algorithm further comprises: determining five-tuple data associated with the first data packet, the five-tuple data including a source IP address associated with the source device, a source port number associated with the source device, a destination IP address associated with the destination device, a destination port number associated with the destination device, and a protocol for communication in the network; updating a Hash algorithm implemented by the first computing node using the information associated with the algorithm implemented by the at least one second computing node; and computing five-tuple Hash values corresponding to the five-tuple data as the set of values associated with the first data packet using the updated Hash algorithm.
D. The method as recited in paragraph C, wherein determining a forward path from the source device to the destination device according to the information associated with the first data packet and the network topology data further comprises: determining one or more paths from the source device to the destination device according to the network topology data; and determining the forward path from the one or more paths according to the five‐tuple hash values.
E. The method as recited in paragraph D, wherein determining a forward path from the source device to the destination device according to the information associated with the first data packet and the network topology data further comprises: performing a modulus operation on the five‐tuple Hash values with respect to the one or more paths; and determining the forward path from the one or more paths according to results of the modulus operation.
F. The method as recited in paragraph A, wherein the forward path from the source device to the destination device includes at least a first forward path and a second forward path, and the method further comprises: receiving a plurality of second data packets from the source device to be forwarded to the destination device; distributing the plurality of second data packets to the first forward path and the second forward path, wherein each of the first forward path and the second forward path carries at least a portion of the plurality of second data packets; detecting an abnormality occurring in one of the first forward path and the second forward path; and determining a third forward path from the source device to the destination device to reroute the portion of the plurality of second data packets carried by one of the first forward path and the second forward path that is involved with the abnormality.
G. The method as recited in paragraph A, wherein the determining of a forward path from the source device to the destination device according to the set of values associated with the first data packet and the network topology data is based on an equal-cost multipath (ECMP) planning algorithm.
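To make the path selection summarized in clauses C through G concrete, a minimal sketch of five-tuple hashing followed by a modulus over the candidate paths is given below. The CRC32 hash, the path labels, and the flow values are illustrative assumptions; the only point carried over from the disclosure is that using the same Hash algorithm as the second computing nodes lets the first computing node predict which equal-cost path a given five-tuple will take.

```python
import zlib
from collections import namedtuple

FiveTuple = namedtuple("FiveTuple", "src_ip src_port dst_ip dst_port proto")


def five_tuple_hash(ft):
    # Placeholder for the (updated) Hash algorithm shared with the switches.
    key = f"{ft.src_ip}|{ft.src_port}|{ft.dst_ip}|{ft.dst_port}|{ft.proto}"
    return zlib.crc32(key.encode())


def select_forward_path(ft, candidate_paths):
    """Pick one equal-cost path via hash modulo the number of paths (ECMP)."""
    return candidate_paths[five_tuple_hash(ft) % len(candidate_paths)]


# Illustrative usage: the flow below maps deterministically to one of three paths.
paths = ["via-spine-1", "via-spine-2", "via-spine-3"]
flow = FiveTuple("10.0.0.1", 40000, "10.0.1.9", 443, "TCP")
print(select_forward_path(flow, paths))
```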
H. One or more machine readable media storing machine readable instructions that, when executed by a first computing node, cause the first computing node to perform acts comprising: obtaining, via a network, information associated with an algorithm implemented by at least one second computing node; obtaining, via the network, network topology data stored in association with the at least one second computing node; receiving, at the first computing node, a first data packet from a source device to be forwarded to a destination device; determining a forward path from the source device to the destination device according to information associated with the first data packet and the network topology data; and transmitting the first data packet to the destination device through the forward path.
I. The one or more machine readable media as recited in paragraph H, wherein the information associated with the first data packet includes a set of values, and the acts further comprise: executing a Hash algorithm implemented by the first computing node; applying at least the information associated with the algorithm implemented by the at least one second computing node to the Hash algorithm; and computing the set of values associated with the first data packet using the Hash algorithm.
J. The one or more machine readable media as recited in paragraph H, wherein the information associated with the first data packet includes a set of values, and the acts further comprise: determining five-tuple data associated with the first data packet, the five-tuple data including a source IP address associated with the source device, a source port number associated with the source device, a destination IP address associated with the destination device, a destination port number associated with the destination device, and a protocol for communication in the network; updating a Hash algorithm implemented by the first computing node using the information associated with the algorithm implemented by the at least one second computing node; and computing five-tuple Hash values corresponding to the five-tuple data as the set of values associated with the first data packet using the updated Hash algorithm.
K. The one or more machine readable media as recited in paragraph J, the acts further comprising: determining one or more paths from the source device to the destination device according to the network topology data; and determining the forward path from the one or more paths according to the five‐tuple hash values.
L. The one or more machine readable media as recited in paragraph K, the acts further comprising: performing a modulus operation on the five‐tuple Hash values with respect to the one or more paths; and determining the forward path from the one or more paths according to results of the modulus operation.
M. The one or more machine readable media as recited in paragraph H, the acts further comprising: receiving a plurality of second data packets from the source device to be forwarded to the destination device; distributing the plurality of second data packets to the first forward path and the second forward path, wherein each of the first forward path and the second forward path carries at least a portion of the plurality of second data packets; detecting an abnormality occurring in one of the first forward path and the second forward path; and determining a third forward path from the source device to the destination device to reroute the portion of the plurality of second data packets carried by one of the first forward path and the second forward path that is involved with the abnormality.
N. The one or more machine readable media as recited in paragraph H, wherein the determining of a forward path from the source device to the destination device according to the set of values associated with the first data packet and the network topology data is based on an equal-cost multipath (ECMP) planning algorithm.
O. A first computing node comprising: one or more processing units; and memory storing machine executable instructions that, when executed by one or more processing units, cause the one or more processing units to perform acts comprising: obtaining, via a network, information associated with an algorithm implemented by at least one second computing node; obtaining, via the network, network topology data stored in association with the at least one second computing node; receiving, at the first computing node, a first data packet from a source device to be forwarded to a destination device; determining a forward path from the source device to the destination device according to information associated with the first data packet and the network topology data; and transmitting the first data packet to the destination device through the forward path.
P. The first computing node as recited in paragraph O, wherein the information associated with the first data packet includes a set of values, and the acts further comprise: determining five-tuple data associated with the first data packet, the five-tuple data including a source IP address associated with the source device, a source port number associated with the source device, a destination IP address associated with the destination device, a destination port number associated with the destination device, and a protocol for communication in the network; updating a Hash algorithm implemented by the first computing node using the information associated with the algorithm implemented by the at least one second computing node; and computing five-tuple Hash values corresponding to the five-tuple data as the set of values associated with the first data packet using the updated Hash algorithm.
Q. The first computing node as recited in paragraph P, wherein the information associated with the first data packet includes a set of values, and the acts further comprise: determining one or more paths from the source device to the destination device according to the network topology data; and determining the forward path from the one or more paths according to the five-tuple hash values.
R. The first computing node as recited in paragraph Q, the acts further comprising: performing a modulus operation on the five‐tuple Hash values with respect to the one or more paths; and determining the forward path from the one or more paths according to results of the modulus operation.
S. The first computing node as recited in paragraph O, the acts further comprising: receiving a plurality of second data packets from the source device to be forwarded to the destination device; distributing the plurality of second data packets to the first forward path and the second forward path, wherein each of the first forward path and the second forward path carries at least a portion of the plurality of second data packets; detecting an abnormality occurring in one of the first forward path and the second forward path; and determining a third forward path from the source device to the destination device to reroute the portion of the plurality of second data packets carried by one of the first forward path and the second forward path that is involved with the abnormality.
T. The first computing node as recited in paragraph O, wherein the determining of a forward path from the source device to the destination device according to the set of values associated with the first data packet and the network topology data is based on an equal-cost multipath (ECMP) planning algorithm.

Claims (20)

  1. A method implemented by a first computing node, the method comprising:
    obtaining, via a network, information associated with an algorithm implemented by at least one second computing node;
    obtaining, via the network, network topology data stored in association with the at least one second computing node;
    receiving, at the first computing node, a first data packet from a source device to be forwarded to a destination device;
    determining a forward path from the source device to the destination device according to information associated with the first data packet and the network topology data; and
    transmitting the first data packet to the destination device through the forward path.
  2. The method of claim 1, wherein the information associated with the first data packet includes a set of values and determining, at the first computing node, a set of values associated with the first data packet according to the information associated with the algorithm further comprises:
    executing a Hash algorithm implemented by the first computing node;
    applying at least the information associated with the algorithm implemented by the at least one second computing node to the Hash algorithm; and
    computing the set of values associated with the first data packet using the Hash algorithm.
  3. The method of claim 1, wherein the information associated with the first data packet includes a set of values, and determining, at the first computing node, a set of values associated with the first data packet according to the information associated with the algorithm further comprises:
    determining five‐tuple data associated with the first data packet, the five‐tuple data including a source IP address associated with the source device, a source port number associated with the source device, a destination IP address associated with the destination device, a destination port number associated with the destination device, and a protocol for communication in the network;
    updating a Hash algorithm implemented by the first computing node using the information associated with the algorithm implemented by the at least one second computing node; and
    computing five-tuple Hash values corresponding to the five-tuple data as the set of values associated with the first data packet using the updated Hash algorithm.
  4. The method of claim 3, wherein determining a forward path from the source device to the destination device according to the information associated with the first data packet and the network topology data further comprises:
    determining one or more paths from the source device to the destination device according to the network topology data; and
    determining the forward path from the one or more paths according to the five‐tuple hash values.
  5. The method of claim 4, wherein determining a forward path from the source device to the destination device according to the information associated with the first data packet and the network topology data further comprises:
    performing a modulus operation on the five‐tuple Hash values with respect to the one or more paths; and
    determining the forward path from the one or more paths according to results of the modulus operation.
  6. The method of claim 1, wherein the forward path from the source device to the destination device includes at least a first forward path and a second forward path, and the method further comprises:
    receiving a plurality of second data packets from the source device to be forwarded to the destination device;
    distributing the plurality of second data packets to the first forward path and the second forward path, wherein each of the first forward path and the second forward path carries at least a portion of the plurality of second data packets;
    detecting an abnormality occurring in one of the first forward path and the second forward path; and
    determining a third forward path from the source device to the destination device to reroute the portion of the plurality of second data packets carried by one of the first forward path and the second forward path that is involved with the abnormality.
  7. The method of claim 1, wherein the determining of a forward path from the source device to the destination device according to the set of values associated with the first data packet and the network topology data is based on an equal-cost multipath (ECMP) planning algorithm.
  8. One or more machine readable media storing machine readable instructions that, when executed by a first computing node, cause the first computing node to perform acts comprising:
    obtaining, via a network, information associated with an algorithm implemented by at least one second computing node;
    obtaining, via the network, network topology data stored in association with the at least one second computing node;
    receiving, at the first computing node, a first data packet from a source device to be forwarded to a destination device;
    determining a forward path from the source device to the destination device according to information associated with the first data packet and the network topology data; and
    transmitting the first data packet to the destination device through the forward path.
  9. The one or more machine readable media of claim 8, wherein the information associated with the first data packet includes a set of values, and the acts further comprise:
    executing a Hash algorithm implemented by the first computing node;
    applying at least the information associated with the algorithm implemented by the at least one second computing node to the Hash algorithm; and
    computing the set of values associated with the first data packet using the Hash algorithm.
  10. The one or more machine readable media of claim 8, wherein the information associated with the first data packet includes a set of values, and the acts further comprise:
    determining five‐tuple data associated with the first data packet, the five‐tuple data including a source IP address associated with the source device, a source port number associated with the source device, a destination IP address associated with the destination device, a destination port number associated with the destination device, and a protocol for communication in the network;
    updating a Hash algorithm implemented by the first computing node using the information associated with the algorithm implemented by the at least one second computing node; and
    computing five-tuple Hash values corresponding to the five-tuple data as the set of values associated with the first data packet using the updated Hash algorithm.
  11. The one or more machine readable media of claim 10, the acts further comprising:
    determining one or more paths from the source device to the destination device according to the network topology data; and
    determining the forward path from the one or more paths according to the five‐tuple hash values.
  12. The one or more machine readable media of claim 11, the acts further comprising:
    performing a modulus operation on the five‐tuple Hash values with respect to the one or more paths; and
    determining the forward path from the one or more paths according to results of the modulus operation.
  13. The one or more machine readable media of claim 8, the acts further comprising:
    receiving a plurality of second data packets from the source device to be forwarded to the destination device;
    distributing the plurality of second data packets to the first forward path and the second forward path, wherein each of the first forward path and the second forward path carries at least a portion of the plurality of second data packets;
    detecting an abnormality occurring in one of the first forward path and the second forward path; and
    determining a third forward path from the source device to the destination device to reroute the portion of the plurality of second data packets carried by one of the first forward path and the second forward path that is involved with the abnormality.
  14. The one or more machine readable media of claim 8, wherein the determining of a forward path from the source device to the destination device according to the set of values associated with the first data packet and the network topology data is based on an equal-cost multipath (ECMP) planning algorithm.
  15. A first computing node comprising:
    one or more processing units; and
    memory storing machine executable instructions that, when executed by one or more processing units, cause the one or more processing units to perform acts comprising:
    obtaining, via a network, information associated with an algorithm implemented by at least one second computing node;
    obtaining, via the network, network topology data stored in association with the at least one second computing node;
    receiving, at the first computing node, a first data packet from a source device to be forwarded to a destination device;
    determining a forward path from the source device to the destination device according to information associated with the first data packet and the network topology data; and
    transmitting the first data packet to the destination device through the forward path.
  16. The first computing node of claim 15, wherein the information associated with the first data packet includes a set of values, and the acts further comprise:
    determining five‐tuple data associated with the first data packet, the five‐tuple data including a source IP address associated with the source device, a source port number associated with the source device, a destination IP address associated with the destination device, a destination port number associated with the destination device, and a protocol for communication in the network;
    updating a Hash algorithm implemented by the first computing node using the information associated with the algorithm implemented by the at least one second computing node; and
    computing five-tuple Hash values corresponding to the five-tuple data as the set of values associated with the first data packet using the updated Hash algorithm.
  17. The first computing node of claim 16, wherein the information associated with the first data packet includes a set of values, and the acts further comprise:
    determining one or more paths from the source device to the destination device according to the network topology data; and
    determining the forward path from the one or more paths according to the five‐tuple hash values.
  18. The first computing node of claim 17, the acts further comprising:
    performing a modulus operation on the five‐tuple Hash values with respect to the one or more paths; and
    determining the forward path from the one or more paths according to results of the modulus operation.
  19. The first computing node of claim 15, the acts further comprising:
    receiving a plurality of second data packets from the source device to be forwarded to the destination device;
    distributing the plurality of second data packets to the first forward path and the second forward path, wherein each of the first forward path and the second forward path carries at least a portion of the plurality of second data packets;
    detecting an abnormality occurring in one of the first forward path and the second forward path; and
    determining a third forward path from the source device to the destination device to reroute the portion of the plurality of second data packets carried by one of the first forward path and the second forward path that is involved with the abnormality.
  20. The first computing node of claim 15, wherein the determining of a forward path from the source device to the destination device according to the set of values associated with the first data packet and the network topology data is based on an equal-cost multipath (ECMP) planning algorithm.
PCT/CN2020/090827 2020-05-18 2020-05-18 Forward path planning method in massive data center networks WO2021232190A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080100357.0A CN115462049B (en) 2020-05-18 2020-05-18 Forwarding path planning method for large-scale data network center
PCT/CN2020/090827 WO2021232190A1 (en) 2020-05-18 2020-05-18 Forward path planning method in massive data center networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/090827 WO2021232190A1 (en) 2020-05-18 2020-05-18 Forward path planning method in massive data center networks

Publications (1)

Publication Number Publication Date
WO2021232190A1 true WO2021232190A1 (en) 2021-11-25

Family

ID=78708969

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/090827 WO2021232190A1 (en) 2020-05-18 2020-05-18 Forward path planning method in massive data center networks

Country Status (2)

Country Link
CN (1) CN115462049B (en)
WO (1) WO2021232190A1 (en)

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6898183B1 (en) * 2000-03-14 2005-05-24 Cisco Technology, Inc. Method of determining a data link path in a managed network
GB2398699A (en) * 2003-02-18 2004-08-25 Motorola Inc Determining a maximum transmission unit which may be transmitted over a particular route through a network
US7965642B2 (en) * 2007-09-06 2011-06-21 Cisco Technology, Inc. Computing path information to a destination node in a data communication network
CN101645850B (en) * 2009-09-25 2013-01-30 杭州华三通信技术有限公司 Forwarding route determining method and equipment
CN102801614B (en) * 2012-07-17 2016-04-27 杭州华三通信技术有限公司 A kind of convergence method of equal-cost route and the network equipment
US10218629B1 (en) * 2014-12-23 2019-02-26 Juniper Networks, Inc. Moving packet flows between network paths
CN106559324A (en) * 2015-09-24 2017-04-05 华为技术有限公司 A kind of method E-Packeted based on equal cost multipath and the network equipment
CN107786440B (en) * 2016-08-26 2021-05-11 华为技术有限公司 Method and device for forwarding data message
US10924352B2 (en) * 2018-01-17 2021-02-16 Nicira, Inc. Data center network topology discovery
CN110391982B (en) * 2018-04-20 2022-03-11 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for transmitting data
CN109039919B (en) * 2018-10-11 2021-09-21 平安科技(深圳)有限公司 Forwarding path determining method, device, system, computer equipment and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140347975A1 (en) * 2013-05-22 2014-11-27 Fujitsu Limited Data transmitting device, data transmitting method and non-transitory computer-readable storage medium
US20160248657A1 (en) * 2015-02-19 2016-08-25 Arista Networks, Inc. System and method of processing in-place adjacency updates
US20170324664A1 (en) * 2016-05-05 2017-11-09 City University Of Hong Kong System and method for load balancing in a data network
CN106357547A (en) * 2016-09-08 2017-01-25 重庆邮电大学 Software-defined network congestion control algorithm based on stream segmentation
US20190386921A1 (en) * 2016-12-21 2019-12-19 Cisco Technology, Inc. MACHINE LEARNING-DERIVED ENTROPY PATH GRAPH FROM IN-SITU OAM (iOAM) DATA
CN108390820A (en) * 2018-04-13 2018-08-10 华为技术有限公司 Method, equipment and the system of load balancing

Also Published As

Publication number Publication date
CN115462049B (en) 2024-03-08
CN115462049A (en) 2022-12-09

Similar Documents

Publication Publication Date Title
US11695699B2 (en) Fault tolerant and load balanced routing
JP7417825B2 (en) slice-based routing
US20240022515A1 (en) Congestion-aware load balancing in data center networks
JP6608545B2 (en) Service traffic distribution method and apparatus
JP6369698B2 (en) Traffic switching method, device, and system
EP2928137B1 (en) System and method for software defined routing of traffic within and between autonomous systems with enhanced flow routing, scalability and security
US9680751B2 (en) Methods and devices for providing service insertion in a TRILL network
US8879397B2 (en) Balancing load in a network, such as a data center network, using flow based routing
KR101546734B1 (en) Data center interconnect and traffic engineering
EP3399703B1 (en) Method for implementing load balancing, apparatus, and network system
US9014201B2 (en) System and method for providing deadlock free routing between switches in a fat-tree topology
US9596094B2 (en) Managing multicast distribution using multicast trees
US8630297B2 (en) Method and apparatus for the distribution of network traffic
US20130003549A1 (en) Resilient Hashing for Load Balancing of Traffic Flows
US10931530B1 (en) Managing routing resources of a network
US20120201241A1 (en) Method & apparatus for the distribution of network traffic
US10110421B2 (en) Methods, systems, and computer readable media for using link aggregation group (LAG) status information
US10205661B1 (en) Control messages for scalable satellite device clustering control in a campus network
CN108259205B (en) Route publishing method and network equipment
US11070472B1 (en) Dynamically mapping hash indices to member interfaces
WO2021232190A1 (en) Forward path planning method in massive data center networks
KR20200059299A (en) Direct Interconnect Gateway
CN116192721A (en) Path perception method, device and system
US10284468B1 (en) E-channel identifiers (ECIDS) for scalable satellite device clustering control in a campus network
US10397061B1 (en) Link bandwidth adjustment for border gateway protocol

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20936116

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20936116

Country of ref document: EP

Kind code of ref document: A1
