CN115004655B - System and method for Border Gateway Protocol (BGP) controlled network reliability - Google Patents

Info

Publication number
CN115004655B
Authority
CN
China
Prior art keywords
controller
bgp
controllers
cluster
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202080094697.7A
Other languages
Chinese (zh)
Other versions
CN115004655A (en)
Inventor
陈怀谟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202311237020.6A (CN117376234A)
Publication of CN115004655A
Application granted
Publication of CN115004655B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/64 Routing or path finding of packets in data switching networks using an overlay routing layer
    • H04L45/02 Topology update or discovery
    • H04L45/28 Routing or path finding of packets in data switching networks using route fault recovery
    • H04L45/46 Cluster building
    • H04L45/58 Association of routers
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/141 Setup of application sessions

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application provides a method implemented by a first controller in a network comprising a controller cluster, the controller cluster comprising the first controller and a second controller. The method comprises: sending, to a network element (NE), a first BGP message comprising first controller network layer reachability information (NLRI) that carries a location of the first controller relative to other controllers in the controller cluster; receiving, from the NE, a second BGP message comprising a second controller NLRI that carries a location of the second controller relative to the other controllers in the controller cluster; and determining a master controller from the controller cluster based on the location of the first controller and the location of the second controller, the master controller being responsible for controlling the network.

Description

System and method for Border Gateway Protocol (BGP) controlled network reliability
Technical Field
The present application relates generally to network communications, and more particularly to systems and methods for improving the reliability of one or more controllers in a network implementing the Border Gateway Protocol (BGP).
Background
BGP is a protocol that manages the transmission of data packets over the internet by exchanging routing and reachability information between edge network elements (NEs), such as routers, located within a communication system. BGP directs packets between autonomous systems (ASs), that is, networks managed by a single enterprise or service provider. BGP provides network stability by ensuring that, when a particular path fails, the NEs can quickly adapt and send packets over another connection. An NE implementing BGP makes routing decisions based on paths, rules, or network policies configured by the network administrator.
In a network implemented as a software defined network (SDN), a controller cluster controls all NEs in the network by communicating with one or more NEs in the network. The controller cluster comprises two or more controllers, one of which is selected as the master controller that controls the NEs in the network. The master controller receives information from NEs in the network via BGP sessions and sends the information to the other controllers in the controller cluster. In these networks, the reliability and availability of the network depend largely on the proper functioning of the controllers and connections in the controller cluster. Any problem or failure in the controller cluster can seriously affect the operation and reliability of the NEs in the network.
Disclosure of Invention
According to a first aspect of the present application, there is provided a method implemented by a first controller in a network comprising a controller cluster, the controller cluster comprising the first controller and a second controller. The method comprises the following steps: establishing a Border Gateway Protocol (BGP) session with a network element (NE) in the network; sending, to the NE, a first BGP message comprising first controller network layer reachability information (NLRI) carrying an identifier (ID) of each controller in the controller cluster, the first controller NLRI also carrying a location of the first controller relative to other controllers in the controller cluster based on a priority order; receiving, from the NE, a second BGP message comprising a second controller NLRI carrying the ID of each controller in the controller cluster, the second controller NLRI also carrying a location of the second controller relative to the other controllers in the controller cluster based on the priority order; and determining a master controller from the controller cluster based on the location of the first controller carried in the first controller NLRI and the location of the second controller carried in the second controller NLRI, wherein the master controller is responsible for controlling the network.
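The determining step of the first aspect can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the encoding or algorithm of the application: the `ControllerNlri` field names are hypothetical, and the rule that the smallest location value marks the highest-priority controller is assumed for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ControllerNlri:
    """Hypothetical in-memory view of a controller NLRI (field names assumed)."""
    controller_id: str            # ID of the controller that originated the NLRI
    location: int                 # location in the cluster's priority order
    cluster_ids: tuple[str, ...]  # IDs of every controller in the cluster


def determine_master(nlris: Iterable[ControllerNlri]) -> str:
    """Return the ID of the controller with the best (assumed: smallest) location."""
    candidates = list(nlris)
    if not candidates:
        raise ValueError("no controller NLRI available")
    # Tie-break on controller ID so that every controller reaches the same answer.
    best = min(candidates, key=lambda n: (n.location, n.controller_id))
    return best.controller_id


cluster = ("C1", "C2")
first = ControllerNlri("C1", location=1, cluster_ids=cluster)    # sent by this controller
second = ControllerNlri("C2", location=2, cluster_ids=cluster)   # received via the NE
master_id = determine_master([first, second])
```

Because every controller evaluates the same set of NLRIs with the same rule, each one can reach the same result without a separate election exchange; the function also accepts more than two NLRIs, which covers clusters with a third controller.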
Optionally, in a first implementation manner of the first aspect, the first BGP message includes at least one of: a flag indicating whether the first controller is the master controller of the network, the location of the first controller, an old location of the first controller, a number of controllers in the controller cluster, and a priority of the first controller relative to other controllers in the controller cluster.
Optionally, in a second implementation manner of the first aspect or any other implementation manner of the first aspect, the second BGP message includes at least one of the following: a second flag indicating whether the second controller is the master controller of the network, the location of the second controller, an old location of the second controller, a number of controllers in the controller cluster, and a priority of the second controller relative to other controllers in the controller cluster.
Optionally, in a third implementation manner of the first aspect or any other implementation manner of the first aspect, establishing the BGP session with the NE includes: establishing a plurality of BGP sessions with a plurality of NEs in the network to create a plurality of control channels, the plurality of NEs comprising the NE; and separately establishing, with the NE, a BGP session with the extension to create an information channel.
Optionally, in a fourth implementation manner of the first aspect or any other implementation manner of the first aspect, establishing the BGP session with the NE includes: establishing a plurality of BGP sessions with a plurality of NEs in the network to create a plurality of control channels, the plurality of NEs excluding the NE; and establishing, with the NE, a BGP session with the extension to create an information channel.
Optionally, in a fifth implementation manner of the first aspect or any other implementation manner of the first aspect, establishing the BGP session with the NE includes: transmitting, to the NE, a first OPEN message having a high availability support capability triplet, the high availability support capability triplet including a flag indicating that the first controller is a controller; and receiving, from the NE, a second OPEN message having a high availability support capability triplet including a flag indicating that the NE is a node in the network.
Optionally, in a sixth implementation manner of the first aspect or any other implementation manner of the first aspect, the high availability support capability triplet in the first OPEN message is carried in an optional parameter of the first OPEN message, and the high availability support capability triplet in the second OPEN message is carried in an optional parameter of the second OPEN message.
Optionally, in a seventh implementation manner of the first aspect or any other implementation manner of the first aspect, the first BGP message includes a first controller address family identifier (AFI), a first controller subsequent address family identifier (SAFI), and the first controller NLRI, and the second BGP message includes a second controller AFI, a second controller SAFI, and the second controller NLRI.
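As a rough illustration of carrying the high availability support capability in the optional parameters of an OPEN message, the sketch below packs a (code, length, value) triplet inside a Capabilities optional parameter, whose parameter type 2 comes from the BGP capabilities advertisement mechanism (RFC 5492). The capability code `250` and the flag bit values are hypothetical placeholders; the text above does not assign them.

```python
import struct

CAPABILITIES_PARAM_TYPE = 2  # OPEN optional parameter type for Capabilities (RFC 5492)
HA_CAPABILITY_CODE = 250     # hypothetical placeholder; no value is given in the text
FLAG_CONTROLLER = 0x80       # hypothetical bit: the sender is a controller
FLAG_NODE = 0x00             # hypothetical: the sender is a node (NE) in the network


def encode_ha_capability(is_controller: bool) -> bytes:
    """Build one Capabilities optional parameter holding the HA support triplet."""
    value = struct.pack("!B", FLAG_CONTROLLER if is_controller else FLAG_NODE)
    capability = struct.pack("!BB", HA_CAPABILITY_CODE, len(value)) + value
    return struct.pack("!BB", CAPABILITIES_PARAM_TYPE, len(capability)) + capability


def decode_ha_capability(param: bytes) -> bool:
    """Return True if the peer declared itself a controller, False for a node."""
    if len(param) < 5:
        raise ValueError("parameter too short")
    param_type, param_len, code, cap_len = struct.unpack("!BBBB", param[:4])
    if param_type != CAPABILITIES_PARAM_TYPE or code != HA_CAPABILITY_CODE:
        raise ValueError("not an HA support capability")
    if param_len != len(param) - 2 or cap_len != param_len - 2:
        raise ValueError("malformed capability lengths")
    return bool(param[4] & FLAG_CONTROLLER)


controller_param = encode_ha_capability(is_controller=True)  # goes in the first OPEN
ne_param = encode_ha_capability(is_controller=False)         # goes in the second OPEN
```

Each side can decode the peer's parameter to learn whether it is talking to a controller or to a node before exchanging any controller NLRI.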
Optionally, in an eighth implementation manner of the first aspect or any other implementation manner of the first aspect, the first BGP message is encoded as BGP UPDATE, the first controller NLRI is carried in a first path attribute field of the first BGP message, and the second BGP message is encoded as BGP UPDATE, and the second controller NLRI is carried in a second path attribute field of the second BGP message.
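One plausible way to carry an AFI, a SAFI, and a controller NLRI in a path attribute of a BGP UPDATE is the MP_REACH_NLRI attribute (type code 14) defined by the multiprotocol extensions in RFC 4760. The sketch below assumes that attribute is used; the AFI and SAFI values and the NLRI body layout (flags, location, controller count, 4-octet IDs) are hypothetical and are not taken from the text.

```python
import struct

ATTR_FLAG_OPTIONAL = 0x80  # path attribute flag: optional
ATTR_FLAG_EXT_LEN = 0x10   # path attribute flag: two-octet length field
MP_REACH_NLRI = 14         # path attribute type code (RFC 4760)
CONTROLLER_AFI = 16400     # hypothetical AFI value
CONTROLLER_SAFI = 240      # hypothetical SAFI value


def encode_controller_nlri(is_master: bool, location: int, controller_ids: list[int]) -> bytes:
    """Hypothetical NLRI body: flags, location, controller count, then 4-octet IDs."""
    flags = 0x80 if is_master else 0x00
    body = struct.pack("!BBB", flags, location, len(controller_ids))
    for cid in controller_ids:
        body += struct.pack("!I", cid)
    return body


def encode_mp_reach(nlri: bytes, next_hop: bytes = b"") -> bytes:
    """Wrap the NLRI in an MP_REACH_NLRI path attribute."""
    value = struct.pack("!HBB", CONTROLLER_AFI, CONTROLLER_SAFI, len(next_hop))
    value += next_hop + b"\x00" + nlri  # one reserved octet precedes the NLRI
    flags = ATTR_FLAG_OPTIONAL
    if len(value) > 255:
        flags |= ATTR_FLAG_EXT_LEN
        header = struct.pack("!BBH", flags, MP_REACH_NLRI, len(value))
    else:
        header = struct.pack("!BBB", flags, MP_REACH_NLRI, len(value))
    return header + value


nlri = encode_controller_nlri(True, location=1, controller_ids=[0x0A000001, 0x0A000002])
attribute = encode_mp_reach(nlri, next_hop=bytes([10, 0, 0, 1]))
```

The resulting attribute would be placed in the path attributes field of the UPDATE message; building the rest of the UPDATE (header, withdrawn routes length, total path attribute length) is omitted here.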
Optionally, in a ninth implementation manner of the first aspect or any other implementation manner of the first aspect, the method further includes: determining that the second controller has failed after receiving an indication from the NE that the second controller has failed, or after determining that no BGP message has been received from the second controller within a predetermined period of time; selecting the first controller as the master controller of the network when the second controller has failed; and sending a third BGP message to the NE, the third BGP message including a third controller NLRI indicating that the first controller is the master controller of the network.
Optionally, in a tenth implementation manner of the first aspect or any other implementation manner of the first aspect, the controller cluster comprises a plurality of controllers including the first controller and the second controller, wherein the method further comprises: determining that at least one fault has occurred within the controller cluster that creates a first set of controllers and a second set of controllers within the controller cluster; and determining that the first controller is coupled to the first set of controllers, which does not include the second controller, and that the second controller is coupled to the second set of controllers, which does not include the first controller.
Optionally, in an eleventh implementation manner of the first aspect or any other implementation manner of the first aspect, the first set of controllers has a first number of controllers and the second set of controllers has a second number of controllers, wherein the method further comprises: determining that the first controller is an intended master controller of the first set of controllers based on the old location of the first controller or a priority of the first controller relative to other controllers in the first set of controllers; sending, to the NE, a third BGP message indicating a status of the first set of controllers, the third BGP message including the number of controllers in the first set of controllers, the old location of the first controller, and the priority of the first controller; and receiving, from the NE, a fourth BGP message indicating a status of the second set of controllers, the fourth BGP message indicating that the second controller is an intended master controller of the second set of controllers and including the number of controllers in the second set of controllers, an old location of the second controller, and a priority of the second controller relative to other controllers in the second set of controllers.
Optionally, in a twelfth implementation manner of the first aspect or any other implementation manner of the first aspect, the method further includes: selecting the first controller as the master controller of the network based on the number of controllers in each of the first and second sets of controllers, the highest old location among the first and second controllers, or the highest priority among the first and second controllers.
Optionally, in a thirteenth implementation manner of the first aspect or any other implementation manner of the first aspect, the controller cluster further comprises a third controller, wherein the method further comprises receiving a third BGP message comprising a third controller NLRI carrying the ID of each controller in the controller cluster, the third controller NLRI also carrying a location of the third controller relative to the other controllers in the controller cluster based on the priority order, wherein the master controller is determined based on the location of the first controller carried in the first controller NLRI, the location of the second controller carried in the second controller NLRI, and the location of the third controller carried in the third controller NLRI.
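The failover logic of the ninth implementation manner can be sketched as a small liveness check. The class and function names, the dictionary standing in for the third BGP message, and the 30-second timeout are all illustrative assumptions; the text only requires "a predetermined period of time".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerWatch:
    """Tracks whether the second controller still appears to be alive."""
    timeout_s: float                      # the predetermined period of time
    last_heard_s: float                   # time the last BGP message arrived
    failure_reported_by_ne: bool = False  # set when the NE signals a failure

    def note_bgp_message(self, now_s: float) -> None:
        self.last_heard_s = now_s

    def has_failed(self, now_s: float) -> bool:
        silent_too_long = (now_s - self.last_heard_s) > self.timeout_s
        return self.failure_reported_by_ne or silent_too_long


def maybe_take_over(local_id: str, watch: PeerWatch, now_s: float) -> Optional[dict]:
    """Return a stand-in for the third BGP message, or None if no takeover is needed."""
    if not watch.has_failed(now_s):
        return None
    # The first controller selects itself as master and announces it to the NE.
    return {"controller_id": local_id, "is_master": True}


watch = PeerWatch(timeout_s=30.0, last_heard_s=0.0)
still_alive = maybe_take_over("C1", watch, now_s=10.0)  # within the timeout
takeover = maybe_take_over("C1", watch, now_s=31.0)     # silence exceeded the timeout
```

Passing the current time in explicitly keeps the check deterministic and easy to test; a real controller would feed it from a monotonic clock.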
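When a fault splits the cluster into two groups, each group's intended master advertises the group's status and one of them is chosen as master of the network. The sketch below is one possible reading of that selection, with assumptions labelled: the criteria are applied in the order listed (group size, then old location, then priority), a smaller old location value is treated as the higher location, and a larger priority value is treated as the higher priority. None of these orderings is fixed by the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupStatus:
    """Status advertised for one group (field names are hypothetical)."""
    intended_master: str  # ID of the group's intended master controller
    group_size: int       # number of controllers in the group
    old_location: int     # assumed: 1 is the highest location
    priority: int         # assumed: a larger value means a higher priority


def select_master(a: GroupStatus, b: GroupStatus) -> str:
    """Pick the network master from two groups' intended masters."""
    def rank(s: GroupStatus) -> tuple:
        # Lower tuples win: larger group, then better old location, then higher
        # priority, with the controller ID as a final deterministic tie-break.
        return (-s.group_size, s.old_location, -s.priority, s.intended_master)
    return min((a, b), key=rank).intended_master


first_group = GroupStatus("C1", group_size=3, old_location=2, priority=10)
second_group = GroupStatus("C4", group_size=2, old_location=1, priority=20)
winner = select_master(first_group, second_group)
```

Since both intended masters see the same two status records and apply the same ranking, they agree on the winner, and the losing group's intended master stands down rather than creating a second master.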
According to a second aspect of the present application, there is provided a method implemented by a network element (NE) in a network comprising a controller cluster. The method comprises the following steps: establishing a first Border Gateway Protocol (BGP) session with a master controller of the network; establishing a second BGP session with a secondary controller of the network, wherein the controller cluster comprises the master controller and the secondary controller, and the master controller is responsible for controlling the network; receiving a BGP message from the master controller, the BGP message including controller network layer reachability information (NLRI) indicating that the BGP message was sent by the master controller, the controller NLRI carrying a location of the master controller relative to other controllers in the controller cluster and an identifier (ID) of each controller in the controller cluster; and forwarding the BGP message to the secondary controller.
Optionally, in a first implementation manner of the second aspect, the BGP message includes at least one of: a flag indicating that the master controller controls the network, a location of the master controller relative to other controllers in the controller cluster, an old location of the master controller, a number of controllers in the controller cluster, and a priority of the master controller relative to other controllers in the controller cluster.
Optionally, in a second implementation manner of the second aspect or any other implementation manner of the second aspect, establishing the first BGP session with the master controller includes: transmitting, to the master controller, a first OPEN message having a high availability support capability triplet, the high availability support capability triplet including a flag indicating that the NE is a node in the network; and receiving, from the master controller, a second OPEN message having a high availability support capability triplet including a flag indicating that the master controller is a controller in the network.
Optionally, in a third implementation manner of the second aspect or any other implementation manner of the second aspect, the high availability support capability triplet in the first OPEN message is carried in an optional parameter of the first OPEN message, and the high availability support capability triplet in the second OPEN message is carried in an optional parameter of the second OPEN message.
Optionally, in a fourth implementation manner of the second aspect or any other implementation manner of the second aspect, the method further includes: detecting a failure of the master controller; and sending a second BGP message to the secondary controller, wherein the second BGP message comprises a third controller NLRI indicating that the master controller has failed, and the second BGP message instructs the secondary controller to withdraw information about the master controller from a state database.
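The NE's role in the second aspect (relaying the master controller's messages and signalling a master failure) can be sketched as follows. The classes, the dictionary message format, and the `withdraw` key are illustrative assumptions; in-process method calls stand in for the BGP sessions.

```python
class SecondaryController:
    """Keeps a state database of what is known about other controllers."""

    def __init__(self) -> None:
        self.state_db: dict[str, dict] = {}

    def handle(self, message: dict) -> None:
        if message.get("withdraw"):
            # Withdraw information about the failed controller.
            self.state_db.pop(message["controller_id"], None)
        else:
            self.state_db[message["controller_id"]] = message


class NetworkElement:
    """Relays controller messages between the master and the secondaries."""

    def __init__(self, master_id: str, secondaries: list[SecondaryController]) -> None:
        self.master_id = master_id
        self.secondaries = secondaries

    def on_message_from_master(self, message: dict) -> None:
        # Forward the master's BGP message to every secondary controller.
        for secondary in self.secondaries:
            secondary.handle(message)

    def on_master_failure(self) -> None:
        # Stand-in for the second BGP message that reports the master's failure.
        withdrawal = {"controller_id": self.master_id, "withdraw": True}
        for secondary in self.secondaries:
            secondary.handle(withdrawal)


secondary = SecondaryController()
ne = NetworkElement("C1", [secondary])
ne.on_message_from_master({"controller_id": "C1", "location": 1, "is_master": True})
known_before = "C1" in secondary.state_db
ne.on_master_failure()
known_after = "C1" in secondary.state_db
```

After the withdrawal, the secondary controller no longer holds state for the failed master and can proceed to determine a new master from the remaining controllers.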
According to a third aspect of the present application, there is provided a first controller implemented in a network comprising a controller cluster, the controller cluster comprising the first controller and a second controller. The first controller includes: a memory for storing instructions; and a processor coupled to the memory and configured to execute the instructions, the instructions causing the first controller to: establish a Border Gateway Protocol (BGP) session with a network element (NE) in the network; send, to the NE, a first BGP message comprising first controller network layer reachability information (NLRI) carrying an identifier (ID) of each controller in the controller cluster, the first controller NLRI also carrying a location of the first controller relative to other controllers in the controller cluster based on a priority order; receive, from the NE, a second BGP message comprising a second controller NLRI carrying the ID of each controller in the controller cluster, the second controller NLRI also carrying a location of the second controller relative to the other controllers in the controller cluster based on the priority order; and determine a master controller from the controller cluster based on the location of the first controller carried in the first controller NLRI and the location of the second controller carried in the second controller NLRI, wherein the master controller is responsible for controlling the network.
Optionally, in a first implementation manner of the third aspect, the first BGP message includes at least one of: a flag indicating whether the first controller is the master controller of the network, the location of the first controller, an old location of the first controller, a number of controllers in the controller cluster, and a priority of the first controller relative to other controllers in the controller cluster.
Optionally, in a second implementation manner of the third aspect or any other implementation manner of the third aspect, the second BGP message includes at least one of the following: a second flag indicating whether the second controller is the master controller of the network, the location of the second controller, an old location of the second controller, a number of controllers in the controller cluster, and a priority of the second controller relative to other controllers in the controller cluster.
Optionally, in a third implementation manner of the third aspect or any other implementation manner of the third aspect, the instructions further cause the first controller to: establish a plurality of BGP sessions with a plurality of NEs in the network to create a plurality of control channels, the plurality of NEs comprising the NE; and separately establish, with the NE, a BGP session with the extension to create an information channel.
Optionally, in a fourth implementation manner of the third aspect or any other implementation manner of the third aspect, the instructions further cause the first controller to: establish a plurality of BGP sessions with a plurality of NEs in the network to create a plurality of control channels, the plurality of NEs excluding the NE; and establish, with the NE, a BGP session with the extension to create an information channel.
Optionally, in a fifth implementation manner of the third aspect or any other implementation manner of the third aspect, the instructions further cause the first controller to: transmit, to the NE, a first OPEN message having a high availability support capability triplet, the high availability support capability triplet including a flag indicating that the first controller is a controller; and receive, from the NE, a second OPEN message having a high availability support capability triplet including a flag indicating that the NE is a node in the network.
Optionally, in a sixth implementation manner of the third aspect or any other implementation manner of the third aspect, the high availability support capability triplet in the first OPEN message is carried in an optional parameter of the first OPEN message, and the high availability support capability triplet in the second OPEN message is carried in an optional parameter of the second OPEN message.
Optionally, in a seventh implementation manner of the third aspect or any other implementation manner of the third aspect, the first BGP message includes a first controller address family identifier (AFI), a first controller subsequent address family identifier (SAFI), and the first controller NLRI, and the second BGP message includes a second controller AFI, a second controller SAFI, and the second controller NLRI.
Optionally, in an eighth implementation manner of the third aspect or any other implementation manner of the third aspect, the first BGP message is encoded as BGP UPDATE, the first controller NLRI is carried in a first path attribute field of the first BGP message, and the second BGP message is encoded as BGP UPDATE, and the second controller NLRI is carried in a second path attribute field of the second BGP message.
Optionally, in a ninth implementation manner of the third aspect or any other implementation manner of the third aspect, the instructions further cause the first controller to: determine that the second controller has failed after receiving an indication from the NE that the second controller has failed, or after determining that no BGP message has been received from the second controller within a predetermined period of time; select the first controller as the master controller of the network when the second controller has failed; and send a third BGP message to the NE, the third BGP message including a third controller NLRI indicating that the first controller is the master controller of the network.
Optionally, in a tenth implementation manner of the third aspect or any other implementation manner of the third aspect, the controller cluster includes a plurality of controllers including the first controller and the second controller, wherein the instructions further cause the first controller to: determine that at least one fault has occurred within the controller cluster that creates a first set of controllers and a second set of controllers within the controller cluster; and determine that the first controller is coupled to the first set of controllers, which does not include the second controller, and that the second controller is coupled to the second set of controllers, which does not include the first controller.
Optionally, in an eleventh implementation manner of the third aspect or any other implementation manner of the third aspect, the first set of controllers has a first number of controllers and the second set of controllers has a second number of controllers, wherein the instructions further cause the first controller to: determine that the first controller is an intended master controller of the first set of controllers based on the old location of the first controller or a priority of the first controller relative to other controllers in the first set of controllers; send, to the NE, a third BGP message indicating a status of the first set of controllers, the third BGP message including the number of controllers in the first set of controllers, the old location of the first controller, and the priority of the first controller; and receive, from the NE, a fourth BGP message indicating a status of the second set of controllers, the fourth BGP message indicating that the second controller is an intended master controller of the second set of controllers and including the number of controllers in the second set of controllers, an old location of the second controller, and a priority of the second controller relative to other controllers in the second set of controllers.
Optionally, in a twelfth implementation manner of the third aspect or any other implementation manner of the third aspect, the instructions further cause the first controller to: select the first controller as the master controller of the network based on the number of controllers in each of the first and second sets of controllers, the highest old location among the first and second controllers, or the highest priority among the first and second controllers.
Optionally, in a thirteenth implementation manner of the third aspect or any other implementation manner of the third aspect, the controller cluster further includes a third controller, wherein the instructions further cause the first controller to receive a third BGP message including a third controller NLRI carrying the ID of each controller in the controller cluster, the third controller NLRI also carrying a location of the third controller relative to the other controllers in the controller cluster based on the priority order, wherein the master controller is determined based on the location of the first controller carried in the first controller NLRI, the location of the second controller carried in the second controller NLRI, and the location of the third controller carried in the third controller NLRI.
According to a fourth aspect of the present application, there is provided an NE implemented in a network comprising a controller cluster. The NE includes: a memory for storing instructions; and a processor coupled to the memory and configured to execute the instructions, the instructions causing the NE to: establish a first Border Gateway Protocol (BGP) session with a master controller of the network; establish a second BGP session with a secondary controller of the network, wherein the controller cluster comprises the master controller and the secondary controller, and the master controller is responsible for controlling the network; receive a BGP message from the master controller, the BGP message including controller network layer reachability information (NLRI) indicating that the BGP message was sent by the master controller, the controller NLRI carrying a location of the master controller relative to other controllers in the controller cluster and an identifier (ID) of each controller in the controller cluster; and forward the BGP message to the secondary controller.
Optionally, in a first implementation manner of the fourth aspect, the BGP message includes at least one of the following: a flag indicating that the master controller controls the network, a location of the master controller relative to other controllers in the controller cluster, an old location of the master controller, a number of controllers in the controller cluster, and a priority of the master controller relative to other controllers in the controller cluster.
Optionally, in a second implementation manner of the fourth aspect or any other implementation manner of the fourth aspect, the instructions further cause the NE to: transmitting a first OPEN message to the master controller with a high availability support capability triplet, the high availability support capability triplet including a flag indicating that the NE is a node in the network; a second OPEN message is received from the master controller having a high availability support capability triplet including a flag indicating that the master controller is a controller in the network.
Optionally, in a third implementation manner of the fourth aspect or any other implementation manner of the fourth aspect, the high availability support capability triplet in the first OPEN message is carried in a selectable parameter of the first OPEN message, and the high availability support capability triplet in the second OPEN message is carried in a selectable parameter of the second OPEN message.
Optionally, in a second implementation manner of the fourth aspect or any other implementation manner of the fourth aspect, the instructions further cause the NE to: detecting a failure of the master controller; and sending a second BGP message to the auxiliary controller, wherein the second BGP message comprises a third controller NLRI indicating that the main controller fails, and the second BGP message indicates that the auxiliary controller withdraws information about the main controller from a state database.
According to a fifth aspect of the present application, there is provided a first controller implemented in a network comprising a controller cluster, the controller cluster comprising the first controller and a second controller, the first controller comprising: means for establishing a Border Gateway Protocol (BGP) session with a Network Element (NE) in the network; means for sending a first BGP message to the NE comprising first controller network layer reachability information (NLRI) carrying an Identifier (ID) of each controller in the controller cluster, the first controller NLRI also carrying a location of the first controller relative to other controllers in the controller cluster based on a priority order; means for receiving a second BGP message from the NE comprising a second controller NLRI carrying the ID of each controller in the controller cluster, the second controller NLRI carrying a location of the second controller relative to the other controllers in the controller cluster based on the priority order; and means for determining a master controller from the controller cluster based on the location of the first controller carried in the first controller NLRI and the location of the second controller carried in the second controller NLRI, wherein the master controller is responsible for controlling the network.
According to a sixth aspect of the present application, there is provided a NE implemented in a network comprising a cluster of controllers, the NE comprising: means for establishing a first Border Gateway Protocol (BGP) session with a master controller of the network; means for establishing a second BGP session with an auxiliary controller of the network, the cluster of controllers including the master controller and the auxiliary controller, the master controller being responsible for controlling the network; means for receiving a BGP message from the master controller, the BGP message including controller network layer reachability information (NLRI) indicating that the BGP message was sent by the master controller, the controller NLRI carrying a location of the master controller relative to other controllers in the controller cluster and an Identifier (ID) of each controller in the controller cluster; and means for forwarding the BGP message to the auxiliary controller.
For clarity, any of the above embodiments may be combined with any one or more of the other embodiments described above to create new embodiments within the scope of the present application.
These and other features will become more fully apparent from the following detailed description and appended claims, taken in conjunction with the accompanying drawings.
Drawings
Fig. 1 is a diagram of a controller cluster network for implementing BGP for high availability (HA) of the network, in accordance with various embodiments of the application.
Fig. 2 is a diagram of another controller cluster network for implementing BGP for a network HA in accordance with various embodiments of the application.
Fig. 3 is a diagram of a NE for implementing BGP for a network HA in accordance with various embodiments of the present application.
Fig. 4A-4D are illustrations of TLVs for encoding capabilities of controllers and NEs, according to various embodiments of the application.
Fig. 5A-5B are illustrations of BGP messages communicated through a controller cluster network according to various embodiments of the application.
Fig. 6A-6C are illustrations of TLVs for encoding BGP messages into existing BGP NLRIs, according to various embodiments of the present application.
Fig. 7A-7C are illustrations of BGP common headers included in a new BGP message or an existing BGP message, according to various embodiments of the application.
Fig. 8A-8C are illustrations of BGP messages communicated over a controller cluster network prior to any failure of a cluster in the controller cluster network, in accordance with various embodiments of the application.
Fig. 9A-9C are illustrations of BGP messages communicated over a controller cluster network after a cluster failure in the controller cluster network, in accordance with various embodiments of the application.
Fig. 10A-10C are diagrams of BGP messages communicated over a controller cluster network after a failure of a master controller in a cluster of the controller cluster network, in accordance with various embodiments of the application.
Fig. 11A-11C are illustrations of BGP messages communicated through a controller cluster network prior to any failure of a cluster in the controller cluster network, in accordance with various embodiments of the application.
Fig. 12A-12E are illustrations of BGP messages communicated through a controller cluster network after a failure of a link in a cluster of the controller cluster network, in accordance with various embodiments of the application.
Fig. 13A-13E are diagrams of BGP messages communicated over a controller cluster network after a failure of a controller in a cluster of the controller cluster network, in accordance with various embodiments of the application.
Fig. 14 is a flow chart of a method performed by a controller for implementing BGP for a network HA in accordance with various embodiments of the application.
Fig. 15 is a flow chart of a method performed by a NE for implementing BGP for a network HA in accordance with various embodiments of the application.
Fig. 16 is a diagram of an apparatus implemented as a controller for executing BGP for a network HA in accordance with various embodiments of the application.
Fig. 17 is a diagram of an apparatus implemented as a NE for executing BGP for a network HA according to various embodiments of the present application.
Detailed Description
It should be understood at the outset that although an illustrative implementation of one or more embodiments are provided below, the disclosed systems and/or methods may be implemented using any number of techniques. The application should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.
Fig. 1 is a diagram of a controller cluster network 100 for implementing BGP for high availability (HA) of the network, in accordance with various embodiments of the application. The controller cluster network 100 includes a network 103 and a cluster 106 of controllers 109A-B.
The controller cluster network 100 includes NEs 110-116 interconnected by links 119. NEs 110-116 may be physical devices, such as routers, bridges, or network switches. NEs 110-116 may also be logical devices, such as virtual machines, for performing switching and routing according to various routing protocols. As described herein, NEs 110-116 are used to implement BGP. BGP is further defined in the document entitled "A Border Gateway Protocol 4 (BGP-4)," published by Y. Rekhter et al. of the Inter-Domain Routing Working Group (IDR WG) as Request for Comments (RFC) 4271 (hereinafter "RFC 4271").
Link 119 interconnecting NEs 110-116 may be a wired or wireless link, or interface, interconnecting each of NEs 110-116. Each of links 119 is used to forward traffic according to various routing protocols (e.g., BGP).
Cluster 106 includes at least two controllers 109A and 109B interconnected by a link 121. Although FIG. 1 shows cluster 106 as including two controllers 109A-B, it should be understood that cluster 106 may include any number of controllers 109A-B. Similar to link 119, link 121 may be a wired or wireless link or interface interconnecting controllers 109A and 109B. Link 121 is used to forward traffic between controllers 109A and 109B.
Each controller 109A-B may be implemented as a central entity for controlling NEs 110-116 in the controller cluster network 100. In one embodiment, each controller 109A-B may be implemented as an SDN controller, as further described in the document entitled "Segment Routing Architecture," by C. Filsfils, IETF RFC 8402, July 2018. In another embodiment, each controller 109A-B may be implemented as a path computation element (PCE), as further described in the document entitled "Path Computation Element Communication Protocol (PCEP) Extensions for PCE-Initiated LSP Setup in a Stateful PCE Model," by E. Crabbe, Internet Engineering Task Force (IETF) Request for Comments (RFC) 8281, December 2017. In another embodiment, each controller 109A-B may be implemented as an application-layer traffic optimization (ALTO) server, as further described in the document entitled "Application-Layer Traffic Optimization (ALTO) Protocol," by R. Alimi, IETF RFC 7285, September 2014.
In the controller cluster network 100, a single one of the controllers 109A-B acts as the master controller of the controller cluster network 100 and is responsible for controlling and managing NEs 110-116 in the network 103. The other controllers 109A-B, which do not act as the master controller, act as backup controllers in the controller cluster network 100. To this end, each controller 109A-B in cluster 106 maintains the same up-to-date information about each NE 110-116 in a status database 124 stored locally at the controller 109A-B. As shown in fig. 1, controller 109A stores a status database 124 with up-to-date information about NEs 110-116 in network 103. Similarly, controller 109B stores the same status database 124 with up-to-date information about NEs 110-116 in network 103.
While cluster 106 includes multiple controllers 109A-B, NEs 110-116 treat cluster 106 as a single controller because one or more of NEs 110-116 communicate with only a single controller (i.e., the master controller). In this way, NEs 110-116 do not maintain information about the different controllers 109A-B within cluster 106. Instead, the NEs 110-116 may only maintain information about links established between one or more of the NEs 110-116 and the master controller of the controller cluster network 100.
In the example shown in fig. 1, the controller 109A is a master controller (sometimes referred to herein as "master controller 109A"). Controller 109A establishes links 127, 128, and 129 with one or more NEs 110-116 in network 103. Links 127, 128, and 129 may be similar to links 119 and 121 in that links 127, 128, and 129 may be wired or wireless links or interfaces that interconnect controller 109A with NEs 114, 110, and 111, respectively.
In FIG. 1, link 127 interconnects master controller 109A with NE 114, link 128 interconnects master controller 109A with NE 110, and link 129 interconnects master controller 109A with NE 111. In controller cluster network 100, links 127-129 represent BGP sessions established between master controller 109A and NEs 114, 110, and 111. Links 127-129 are therefore also referred to herein as BGP sessions 127-129, respectively.
To establish BGP sessions 127-129, master controller 109A and NEs 114, 110, and 111 exchange OPEN messages according to RFC 4271. Master controller 109A sends OPEN messages to NEs 114, 110, and 111. Each OPEN message includes information for negotiating and establishing BGP sessions 127-129 between master controller 109A and NEs 114, 110, and 111. For example, the OPEN message includes the BGP version, the BGP identifier, the hold time, and one or more optional parameters that master controller 109A is capable of implementing. Similarly, each of NEs 114, 110, and 111 sends an OPEN message to master controller 109A. The OPEN messages sent by NEs 114, 110, and 111 include information for negotiating and establishing BGP sessions 127-129 between master controller 109A and NEs 114, 110, and 111. For example, each OPEN message includes the BGP version, the BGP identifier, the hold time, and one or more optional parameters that the respective NE 114, 110, or 111 is capable of implementing. BGP sessions 127-129 are established between master controller 109A and each of NEs 114, 110, and 111 when the characteristics and capabilities of master controller 109A are compatible with those of NEs 114, 110, and 111. As such, controller 109A and NEs 114, 110, and 111 are BGP speakers in controller cluster network 100. Although only three BGP sessions 127-129 are shown in fig. 1, it should be appreciated that master controller 109A may establish BGP sessions with any number of NEs 110-116.
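The OPEN exchange described above can be illustrated with a minimal sketch that packs and parses the fixed OPEN fields in the RFC 4271 layout (16-byte marker, 2-byte length, 1-byte type, then version, AS number, hold time, BGP identifier, and optional parameters). The function names `build_open` and `parse_open`, the AS number 65000, the 90-second hold time, and the identifier 192.0.2.1 are illustrative assumptions and do not come from the application.

```python
import ipaddress
import struct

BGP_MARKER = b"\xff" * 16  # RFC 4271: the 16-byte marker is all ones
BGP_TYPE_OPEN = 1          # RFC 4271 message type code for OPEN
BGP_HEADER_LEN = 19        # marker (16) + length (2) + type (1)


def build_open(my_as: int, hold_time: int, bgp_id: str, opt_params: bytes = b"") -> bytes:
    """Pack an OPEN message: version, AS, hold time, BGP identifier, optional parameters."""
    if len(opt_params) > 255:
        raise ValueError("optional parameters must fit in a 1-byte length field")
    body = struct.pack(
        "!BHH4sB",
        4,                                     # BGP version 4
        my_as,                                 # 2-byte autonomous system number
        hold_time,                             # proposed hold time in seconds
        ipaddress.IPv4Address(bgp_id).packed,  # 4-byte BGP identifier
        len(opt_params),                       # optional parameters length
    ) + opt_params
    header = BGP_MARKER + struct.pack("!HB", BGP_HEADER_LEN + len(body), BGP_TYPE_OPEN)
    return header + body


def parse_open(msg: bytes) -> dict:
    """Unpack the fixed OPEN fields; raise ValueError on a malformed message."""
    if len(msg) < BGP_HEADER_LEN + 10 or msg[:16] != BGP_MARKER:
        raise ValueError("truncated message or bad marker")
    length, msg_type = struct.unpack("!HB", msg[16:19])
    if msg_type != BGP_TYPE_OPEN or length != len(msg):
        raise ValueError("not a well-formed OPEN message")
    version, my_as, hold_time, raw_id, opt_len = struct.unpack("!BHH4sB", msg[19:29])
    return {
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,
        "bgp_id": str(ipaddress.IPv4Address(raw_id)),
        "opt_params": msg[29:29 + opt_len],
    }


# Illustrative values only: an OPEN as master controller 109A might send it.
controller_open = build_open(my_as=65000, hold_time=90, bgp_id="192.0.2.1")
fields = parse_open(controller_open)
```

An OPEN message with no optional parameters is 29 bytes long, which is the minimum OPEN length under RFC 4271.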
In controller cluster network 100, only controller 109A in cluster 106 establishes BGP sessions 127-129 with NEs 114, 110, and 111. Other controllers 109B in cluster 106, not master controller 109A, will not establish BGP sessions 127-129 with any of NEs 110-116 in network 103. The master controller 109A is responsible for communicating with NEs 110-116 to control the network 103, and the master controller 109A sends relevant information to the other controllers 109B to maintain a status database 124 in all controllers 109A-B in the cluster 106.
For example, master controller 109A may provision path 122 in network 103. As shown in fig. 1, path 122 traverses NEs 110, 113, and 112. To provision path 122, master controller 109A may send a message with information about path 122 to NE 110 over BGP session 128. For example, according to the IETF document entitled "Advertising Segment Routing Policies in BGP," by S. Previdi et al., May 2020, the message may be encoded as a BGP UPDATE message, which may include Segment Routing (SR) path attribute fields carrying the segment identifiers (SIDs) of NE 113 and NE 112 on path 122 (which may be SR path 122). NE 110, upon receiving the message from master controller 109A, updates the local forwarding table to indicate information about SR path 122. Once the local forwarding table has been updated to indicate information about SR path 122, path 122 has been successfully provisioned on NEs 110, 113, and 112 in network 103.
After path 122 is successfully provisioned, one or more of NEs 110, 113, and 112 generate a feedback message indicating that path 122 has been successfully provisioned by the corresponding NE 110, 113, or 112 on path 122. In one case, the head-end or ingress NE 110 of the SR path 122 generates a feedback message indicating that the path 122 has been successfully provisioned. In another case, any BGP message from NE 110 to the controller that acts as a keep-alive message may be used as a feedback message indicating that SR path 122 has been successfully provisioned. In another case, a link may be established between master controller 109A and each NE 110, 113, and 112 on path 122, through which each NE 110, 113, and 112 sends a feedback message back to master controller 109A. In another case, NEs 112 and 113 send feedback messages back to NE 110 (as a BGP speaker). NE 110 then forwards a feedback message to master controller 109A over BGP session 128 indicating that path 122 has been successfully provisioned on NEs 110, 113, and 112.
After the master controller 109A receives the feedback message, the controller 109A updates the local status database 124 to indicate that the path 122 has been successfully provisioned on the NEs 110, 113, and 112 in the network 103. To ensure that all controllers 109A-B in cluster 106 maintain a common status database 124, master controller 109A forwards information in feedback messages to controller 109B over link 121. Similarly, controller 109B updates local state database 124 to indicate that path 122 has been successfully provisioned on NEs 110, 113, and 112 in network 103.
At this stage, all controllers 109A-B maintain a common status database 124. Subsequently, when the master controller 109A fails, the controller 109B is promoted to master controller 109B of the controller cluster network 100. For example, when the controller 109B detects that the master controller 109A has failed, the controller 109B determines that it is promoted to be the master controller 109B responsible for controlling the network 103. In this case, the controller 109B takes over the BGP sessions 127-129 with one or more NEs 110-116 in the network 103 and begins to control the network 103, and the NEs 110-116 are unaware of the master controller change from the controller 109A to the controller 109B.
However, in some cases, the controller 109B may erroneously detect that the main controller 109A has failed. In this case, the controller 109B begins to act as the master controller, while the controller 109A still acts as the master controller of the network 103. For example, controller 109B may detect that link 121 between controllers 109A and 109B has failed and assume that master controller 109A also has failed. In practice, however, master controller 109A may still be operating normally and controlling NEs 110-116 in network 103. The controller 109B may then determine that the controller 109B is the master controller 109B of the network 103 and also begin controlling NEs 110-116 in the network 103. When this occurs, the network 103 is controlled by two different master controllers 109A and 109B, which may cause the network 103 to fail because the master controllers 109A and 109B are not consistent in the manner in which the NEs 110-116 within the network 103 are programmed. Thus, any failure or problem occurring within the cluster 106 of controllers 109A-B may also cause the entire network 103 to crash or fail.
Embodiments are disclosed herein that aim to prevent a failure of a network 103 when a failure occurs within a cluster 106 of controllers 109A-B by configuring each of the controllers 109A-B and one or more NEs 110-116 in the network 103 to implement BGP for the network HA. To implement BGP for a network HA, all controllers 109A-B and one or more NEs 110-116 in cluster 106 establish an extended BGP session (also referred to herein as an "enhanced BGP session"). One or more of NEs 110-116 communicates with all controllers 109A-B in cluster 106, rather than only with master controller 109A in controller cluster network 100.
In one embodiment, one or more NEs 110-116 establish an enhanced BGP session with each controller 109A-B in cluster 106. In the example shown in FIG. 1, controllers 109A-B and/or NEs 110-116 select NEs 110 and 111 as two designated NEs in network 103 that communicate with all controllers 109A-B in cluster 106. NEs 110 and 111 may be selected based on an Identification (ID) that identifies each of NEs 110-116. For example, NEs 110 and 111 in network 103 with the highest ID are designated for communication with all controllers 109A-B in cluster 106. For example, NEs 110 and 111 may have the highest ID and the second highest ID, respectively, in network 103. In this way, each of NEs 110 and 111 establishes an enhanced BGP session with each of controllers 109A-B in cluster 106.
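As a minimal sketch of the selection rule above, the designated NEs can be chosen by ranking NE IDs from highest to lowest and taking the top entries. The function name `select_designated_nes` and the numeric IDs are hypothetical; the application only states that NEs 110 and 111 hold the highest and second-highest IDs.

```python
from typing import Dict, List


def select_designated_nes(ne_ids: Dict[str, int], count: int = 2) -> List[str]:
    """Return the `count` NEs with the highest IDs, highest first."""
    ranked = sorted(ne_ids, key=lambda name: ne_ids[name], reverse=True)
    return ranked[:count]


# Hypothetical IDs chosen so that NE 110 and NE 111 rank first and second.
ne_ids = {
    "NE 110": 70, "NE 111": 60, "NE 112": 50, "NE 113": 40,
    "NE 114": 30, "NE 115": 20, "NE 116": 10,
}
designated = select_designated_nes(ne_ids)
```

Because every controller and NE can apply the same rule to the same set of IDs, each party can arrive at the same designated NEs without extra coordination, provided the IDs are unique.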
As shown in fig. 1, NE 110 is coupled to controller 109A via link 130 and to controller 109B via link 132. NE 111 is coupled to controller 109A via link 131 and to controller 109B via link 133. Links 130-133 are similar to links 127-129. Each of the links 130-133 represents an enhanced BGP session and, thus, may also be referred to herein as enhanced BGP sessions 130-133 through which extended BGP messages 140 may be transmitted. In one embodiment, enhanced BGP sessions 130-133 are established and maintained over Internet Protocol (IP) paths between NEs 110 and 111 and controllers 109A and 109B. Enhanced BGP sessions 130-133 may also be referred to herein as "information channels 130-133".
To establish enhanced BGP sessions 130-133, each of controllers 109A and 109B exchanges OPEN messages with NEs 110 and 111. The OPEN message may include information for negotiating and establishing enhanced BGP sessions 130-133 between controllers 109A and 109B and NEs 110 and 111. For example, the OPEN message may include the BGP version, the BGP identifier, the hold time, and one or more optional parameters that controllers 109A-B and NEs 110-111 are capable of implementing. In one embodiment, the OPEN message includes a controller capability triplet that, when included in the OPEN message, indicates that the controller 109A-B or NE 110-111 that sent the OPEN message is capable of implementing BGP for the network HA. The capability triplet includes three elements: a 1-byte capability code, a 1-byte capability length, and a capability value. The value of the code identifies the capability. The value of the length gives the size of the capability value in bytes. Fig. 4A illustrates an example of a controller capability triplet that may be sent in an embodiment disclosed herein. When the characteristics and capabilities of the controllers 109A-B and NEs 110-111 are compatible, enhanced BGP sessions 130-133 are established between the controllers 109A-B and the NEs 110-111.
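The capability triplet (1-byte code, 1-byte length, variable-length value) can be sketched as a simple encoder and decoder. The code point `HA_CAPABILITY_CODE = 240` and the one-byte value are assumptions made for illustration; the application does not state the code value in this passage, and the contents of the value field are defined with reference to Fig. 4A.

```python
import struct
from typing import List, Tuple

HA_CAPABILITY_CODE = 240  # hypothetical code point, not taken from the application


def encode_capability(code: int, value: bytes = b"") -> bytes:
    """Encode one capability triplet: code (1 byte), length (1 byte), value."""
    if not 0 <= code <= 255:
        raise ValueError("capability code must fit in one byte")
    if len(value) > 255:
        raise ValueError("capability value must fit in a 1-byte length field")
    return struct.pack("!BB", code, len(value)) + value


def decode_capabilities(data: bytes) -> List[Tuple[int, bytes]]:
    """Decode a run of back-to-back capability triplets into (code, value) pairs."""
    triplets = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated capability header")
        code, length = struct.unpack_from("!BB", data, offset)
        offset += 2
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated capability value")
        triplets.append((code, value))
        offset += length
    return triplets


def supports_network_ha(data: bytes) -> bool:
    """True when the peer advertised the (hypothetical) HA capability code."""
    return any(code == HA_CAPABILITY_CODE for code, _ in decode_capabilities(data))


# A peer advertising an unrelated capability (code 1) plus the HA capability.
advertised = encode_capability(1, b"\x00\x01\x00\x01") + encode_capability(HA_CAPABILITY_CODE, b"\x01")
```

A receiver that does not find the HA code among the advertised triplets would treat the session as an ordinary BGP session under this sketch.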
In the embodiment shown in fig. 1, master controller 109A establishes BGP session 128 and enhanced BGP session 130 with NE 110 as separate sessions (e.g., with separate OPEN message sets), respectively. In another embodiment, master controller 109A may establish an enhanced BGP session 130 with only NE 110 that will serve as a control channel and an information channel between master controller 109A and NE 110.
Similarly, the embodiment shown in fig. 1 illustrates master controller 109A establishing BGP session 129 and enhanced BGP session 131 with NE 111 as separate sessions, respectively. In another embodiment, master controller 109A may establish an enhanced BGP session 131 with only NE 111 that will act as a control channel and an information channel between master controller 109A and NE 111.
After the enhanced BGP sessions 130-133 are established in controller cluster network 100, controllers 109A-B and NEs 110-111 exchange BGP messages 140 with each other to transmit information describing cluster 106 of controllers 109A-B. In one embodiment, the controller 109A generates a first BGP message 140 that includes information indicating the states of the controller 109A and the cluster 106. The first BGP message 140 may indicate whether the controller 109A is a master controller 109A. The first BGP message 140 may also include the location of the controller 109A relative to other controllers 109B in the cluster 106. The location refers to the current or expected location of the controller 109A within the priority order of the controllers 109A-B in the cluster 106.
The priority order indicates the order of the controllers 109A-B by which the master controller 109A is selected from the cluster 106. For example, the operator of the controller cluster network 100 may set the priority of each controller 109A-B in the cluster 106, indicating the order of priority of the controllers 109A-B from highest priority to lowest priority. The controller 109A with the highest priority (e.g., priority 200) is the first primary controller 109A, the secondary controller 109B with the next highest priority (e.g., priority 180) is the backup controller of the first primary controller 109A, the third controller with the next highest priority (e.g., priority 178) is the backup controller of the first primary controller 109A and the second controller 109B, and so on.
The location of a controller 109A-B relative to other controllers 109A-B in the cluster 106 refers to a current or expected location within the priority order of the controllers 109A-B of the cluster 106. In the example shown in fig. 1, the priority order may be { controller 109A, controller 109B }. The location of the master controller 109A is 1 because the current location of the controller 109A within the priority order is first. The position of the auxiliary controller 109B is 2 because the current position of the controller 109B within the priority order is the second.
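The relationship between operator-configured priorities and locations can be sketched as a sort from highest to lowest priority, with the location being the 1-based rank. The priorities 200, 180, and 178 are the examples given above; the function name `positions_by_priority`, the label "third" for the unnamed third controller, and the tie-break on controller ID are assumptions, since the text does not say how equal priorities are ordered.

```python
from typing import Dict


def positions_by_priority(priorities: Dict[str, int]) -> Dict[str, int]:
    """Map each controller ID to its 1-based location in the priority order."""
    # Highest priority first; ties broken by controller ID (an assumption).
    ordered = sorted(priorities, key=lambda cid: (-priorities[cid], cid))
    return {cid: index for index, cid in enumerate(ordered, start=1)}


priorities = {"109A": 200, "109B": 180, "third": 178}
positions = positions_by_priority(priorities)
master = min(positions, key=positions.get)  # the controller at location 1
```

With these inputs, controller 109A is at location 1 and is the master controller, and controller 109B is at location 2, matching the priority order { controller 109A, controller 109B } described for fig. 1.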
In one embodiment, the first BGP message 140 includes the old location of the controller 109A relative to the other controllers 109B in the cluster 106. The old location refers to the previous location of controller 109A within the priority order of controllers 109A-B. For example, assume that the priority order is { another controller X, controller 109A, controller 109B }, before another controller X fails. In this case, the controller 109A becomes the main controller 109A, and the old position of the controller 109A is 2 because the previous position of the controller 109A in the priority order is the second.
In one embodiment, the first BGP message 140 includes a priority of the controller 109A. As described above, the operator of the controller cluster network 100 may pre-configure the priority of each controller 109A-B in the cluster 106.
In one embodiment, the first BGP message 140 includes the number of controllers 109A-B in the cluster 106. In the example shown in FIG. 1, the number of controllers 109A-B is two. However, it should be appreciated that the cluster 106 may include any number of controllers 109A-B.
In one embodiment, the first BGP message 140 includes a controller ID for each controller 109A-B in the cluster 106. In the example shown in fig. 1, the first BGP message 140 includes the controller ID of controller 109A and the controller ID of controller 109B. It should be appreciated that the first BGP message 140 may include other information not described herein. Examples of the first BGP message 140 are further described below with reference to fig. 5A-5B through fig. 7A-7C.
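The fields that the first BGP message 140 may carry can be gathered into one record, shown here as a minimal sketch. The class name `ControllerStatus`, the helper `build_statuses`, and the priority 220 assigned to controller X are hypothetical; the wire encoding is described with reference to the later figures and is not modeled here. The example reproduces the old-location case above, in which controller X fails and controller 109A moves from the second location to the first.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ControllerStatus:
    controller_id: str
    is_master: bool
    position: int                 # current location in the priority order
    old_position: Optional[int]   # previous location, if one is known
    priority: int                 # operator-configured priority
    controller_count: int         # number of controllers in the cluster
    controller_ids: Tuple[str, ...]


def build_statuses(
    order: List[str],
    priorities: Dict[str, int],
    previous_order: Optional[List[str]] = None,
) -> Dict[str, ControllerStatus]:
    """Build one status record per controller from the current priority order."""
    statuses = {}
    for position, cid in enumerate(order, start=1):
        old = None
        if previous_order is not None and cid in previous_order:
            old = previous_order.index(cid) + 1
        statuses[cid] = ControllerStatus(
            controller_id=cid,
            is_master=(position == 1),
            position=position,
            old_position=old,
            priority=priorities[cid],
            controller_count=len(order),
            controller_ids=tuple(order),
        )
    return statuses


before = ["X", "109A", "109B"]                    # order before controller X fails
after = [cid for cid in before if cid != "X"]     # order after the failure
priorities = {"X": 220, "109A": 200, "109B": 180}  # 220 is a hypothetical value
statuses = build_statuses(after, priorities, previous_order=before)
```

Keeping both the current and the old location in one record lets a receiver tell a controller that has always been first apart from one that has just been promoted.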
After generating first BGP message 140, master controller 109A sends first BGP message 140 to NE 110 over enhanced BGP session 130 and first BGP message 140 to NE 111 over enhanced BGP session 131. NE 110 forwards first BGP message 140 to controller 109B via enhanced BGP session 132. NE 111 also forwards first BGP message 140 to controller 109B via enhanced BGP session 133. The redundancy of sending the first BGP message 140 from NEs 110 and 111 to controller 109B is used to further ensure that when one of NEs 110 and 111 fails, controller 109B receives the first BGP message 140 from controller 109A. The controller 109B receives the first BGP message 140 and updates the status database 124 to include information indicating the status of the controller 109A and the cluster 106 carried in the first BGP message 140.
Similarly, the controller 109B generates a second BGP message 140. The second BGP message 140 includes similar information as the first BGP message 140, except that the second BGP message 140 includes information indicating the states of the controller 109B and the cluster 106. After generating second BGP message 140, controller 109B sends second BGP message 140 to NE 111 over enhanced BGP session 133 and second BGP message 140 to NE 110 over enhanced BGP session 132. NE 110 forwards second BGP message 140 to controller 109A via enhanced BGP session 130. NE 111 also forwards second BGP message 140 to controller 109A via enhanced BGP session 131. Redundancy in sending the second BGP message 140 from NEs 110 and 111 is used to further ensure that when one of NEs 110 and 111 fails, controller 109A receives the second BGP message 140 from controller 109B. The controller 109A receives the second BGP message 140 and updates the status database 124 to include information indicating the status of the controller 109B and the cluster 106 carried in the second BGP message 140.
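The redundant relay through NEs 110 and 111 can be sketched with two small classes. Each NE forwards a controller's message to every other controller, and a receiving controller ignores a copy it has already processed. The sequence number used for duplicate detection is an assumption introduced for this sketch, as are the class and function names; the application does not describe how duplicates are recognized in this passage.

```python
from typing import Dict, List, Set, Tuple


class Controller:
    def __init__(self, controller_id: str) -> None:
        self.controller_id = controller_id
        self.status_db: Dict[str, dict] = {}      # stands in for status database 124
        self._seen: Set[Tuple[str, int]] = set()  # (origin, sequence) pairs already handled

    def receive(self, origin: str, seq: int, status: dict) -> bool:
        """Store the status unless this copy was already processed."""
        key = (origin, seq)
        if key in self._seen:
            return False
        self._seen.add(key)
        self.status_db[origin] = status
        return True


class NetworkElement:
    def __init__(self, name: str, controllers: List[Controller]) -> None:
        self.name = name
        self.controllers = controllers
        self.up = True

    def relay(self, origin: str, seq: int, status: dict) -> int:
        """Forward to every controller except the sender; count new deliveries."""
        if not self.up:
            return 0
        return sum(
            c.receive(origin, seq, status)
            for c in self.controllers
            if c.controller_id != origin
        )


def advertise(origin: str, seq: int, status: dict, nes: List[NetworkElement]) -> int:
    """Send the same message through every designated NE."""
    return sum(ne.relay(origin, seq, status) for ne in nes)


ctrl_a, ctrl_b = Controller("109A"), Controller("109B")
ne_110 = NetworkElement("NE 110", [ctrl_a, ctrl_b])
ne_111 = NetworkElement("NE 111", [ctrl_a, ctrl_b])

first = advertise("109A", 1, {"is_master": True, "position": 1}, [ne_110, ne_111])
ne_110.up = False  # one designated NE fails
second = advertise("109A", 2, {"is_master": True, "position": 1}, [ne_110, ne_111])
```

In both rounds controller 109B records exactly one new status: the second copy is dropped as a duplicate in the first round, and NE 111 alone delivers the message in the second.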
In some embodiments, when the state of the respective controller 109A-B or cluster 106 is updated, each controller 109A-B sends a subsequent BGP message 140 with updated information about the state of the respective controller 109A-B and cluster 106. Similarly, when a cluster 106 fails or becomes problematic, each controller 109A-B sends a subsequent BGP message 140 with updated information about the status of the respective controller 109A-B and cluster 106.
In the embodiment shown in FIG. 1, the two controllers 109A-B maintain information about the state of all controllers 109A-B in cluster 106 and maintain enhanced BGP sessions 130-133 with two NEs 110-111 in network 103. In this embodiment, the controller 109B will not erroneously act as a master controller when the link 121 interconnecting the controllers 109A-B fails. Controller 109B will wait a predetermined period of time after detecting a failure at link 121 to determine whether a subsequent BGP message 140 is received from controller 109A through NE 110 or NE 111.
For example, when master controller 109A detects a failure of link 121, master controller 109A sends third BGP message 140 to NE 110 over enhanced BGP session 130 and sends third BGP message 140 to NE 111 over enhanced BGP session 131. The third BGP message 140 indicates that master controller 109A detected a failure at link 121 and is therefore no longer coupled to controller 109B. In this example, the third BGP message 140 may indicate that the number of controllers 109A-B in cluster 106 is one, because controller 109A is no longer able to communicate with controller 109B via link 121 and therefore assumes that controller 109B is down. NE 110 forwards third BGP message 140 to controller 109B over enhanced BGP session 132 and NE 111 forwards third BGP message 140 to controller 109B over enhanced BGP session 133.
In one embodiment, controller 109B waits for a predetermined period of time to receive a third BGP message 140 from controller 109A via NEs 110 and/or 111. When the controller 109B receives the third BGP message 140, the controller 109B determines that the master controller 109A is still active and operating properly and, therefore, will not act as a master controller. When the controller 109B does not receive the third BGP message 140 within a predetermined period of time, the controller 109B acts as a master controller and begins to control the network 103.
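The waiting rule above reduces to a single check: controller 109B takes over only if no relayed message from master controller 109A arrives between the moment the failure of link 121 is detected and the end of the predetermined period. The sketch below uses plain timestamps instead of real timers so that the decision can be examined in isolation; the function name `should_take_over` and the 5-second period are illustrative assumptions.

```python
from typing import Iterable


def should_take_over(
    failure_time: float,
    relayed_message_times: Iterable[float],
    wait_period: float,
) -> bool:
    """Return True when the backup controller should act as master controller.

    Only messages that arrive after the failure was detected and before the
    waiting period ends count as evidence that the master is still running.
    """
    deadline = failure_time + wait_period
    heard_from_master = any(
        failure_time <= arrival <= deadline for arrival in relayed_message_times
    )
    return not heard_from_master


# Link 121 fails at t = 100 s; the predetermined period is assumed to be 5 s.
stay_backup = should_take_over(100.0, [102.5], wait_period=5.0)
promote = should_take_over(100.0, [], wait_period=5.0)
```

Here `stay_backup` is False because a message relayed by NE 110 or NE 111 arrived within the window, while `promote` is True because nothing arrived.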
In this way, embodiments of the present application prevent the failure of network 103 that would otherwise occur when multiple controllers 109A-B in cluster 106 act as master controllers in controller cluster network 100. As a result, embodiments of the present application increase the lifetime of NEs 110-116 within controller cluster network 100 and increase the accuracy of controlling controller cluster network 100.
Fig. 2 is a diagram of another controller cluster network 200 for implementing BGP for a network HA in accordance with various embodiments of the application. The controller cluster network 200 is similar to the controller cluster network 100 of fig. 1, except that the cluster 106 includes more than two controllers 109A-D. The controllers 109A-D are similar to the controllers 109A-B described above with reference to FIG. 1.
Controllers 109A-D are interconnected by links 121A-E. Links 121A-E are similar to links 121 described above with reference to fig. 1. Link 121A interconnects controller 109A and controller 109B. Link 121B interconnects controller 109A and controller 109C. Link 121C interconnects controller 109A and controller 109D. Link 121D interconnects controller 109C and controller 109D. Link 121E interconnects controller 109B and controller 109D.
The network 103 shown in FIG. 2 is similar to the network 103 of FIG. 1 in that the network 103 includes NEs 110-116 interconnected by links 119. However, in network 103, only one NE 111 establishes enhanced BGP sessions 131, 133, 203, and 206 with controllers 109A-D in cluster 106. For example, controllers 109A-D and/or NEs 110-116 determine that NE 111 has the highest ID among all NEs 110-116, and thus, NE 111 is designated for establishing enhanced BGP sessions 131, 133, 203, and 206 with controllers 109A-D in cluster 106.
Similar to controller cluster network 100 of fig. 1, controller 109A establishes an enhanced BGP session 131 with NE 111 and controller 109B establishes an enhanced BGP session 133 with NE 111. Unlike controller cluster network 100 of fig. 1, NE 111 also establishes enhanced BGP sessions 203 and 206 with controllers 109C and 109D, respectively. In this manner, each of controllers 109A-D establishes enhanced BGP sessions 131, 133, 203, and 206 with NE 111.
After establishing enhanced BGP sessions 131, 133, 203, and 206, each of controllers 109A-D generates BGP message 140 that includes information describing the states of the respective controllers 109A-D and cluster 106. BGP messages 140 sent by controllers 109A-D may indicate whether the respective controllers 109A-D are master controllers. BGP message 140 may also include the location of the respective controller 109A-D, the old location of the respective controller 109A-D, the priority of the respective controller 109A-D, the number of controllers 109A-D in cluster 106, and the controller ID of each of controllers 109A-D in network 103. It should be appreciated that BGP message 140 may include other information not described herein. Examples of BGP messages 140 are further described below with reference to fig. 5A-5B through fig. 7A-7C.
In some cases, a failure 215 may occur along one or more of the links 121A-E interconnecting controllers 109A-D. In the example shown in FIG. 2, failure 215 occurs along links 121A, 121C, and 121D. After failure 215, controller 109A and controller 109C remain interconnected by link 121B, and controller 109B and controller 109D remain interconnected by link 121E. That is, controllers 109A and 109C are no longer connected to controller 109B or controller 109D, and therefore controllers 109A and 109C assume that controllers 109B and 109D have failed. Similarly, controllers 109B and 109D are no longer connected to controller 109A or controller 109C, and therefore controllers 109B and 109D assume that controllers 109A and 109C have failed. In this way, the remaining interconnected controllers form two separate controller groups 210A and 210B. Controller group 210A includes controller 109A and controller 109C interconnected by link 121B. Controller group 210B includes controller 109B and controller 109D interconnected by link 121E.
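The grouping described above can be illustrated with a minimal Python sketch that computes which controllers remain connected after a set of links fails. The function name `controller_groups` and the use of reference numerals as string keys are assumptions made for illustration only; the description does not prescribe how a controller detects its group.

```python
# Links of FIG. 2, keyed by reference numeral (string keys are illustrative).
LINKS = {
    "121A": ("109A", "109B"),
    "121B": ("109A", "109C"),
    "121C": ("109A", "109D"),
    "121D": ("109C", "109D"),
    "121E": ("109B", "109D"),
}


def controller_groups(links, failed):
    """Return the sets of controllers still interconnected once `failed` links are down."""
    nodes = {end for pair in links.values() for end in pair}
    adjacency = {node: set() for node in nodes}
    for name, (a, b) in links.items():
        if name not in failed:
            adjacency[a].add(b)
            adjacency[b].add(a)

    groups, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first walk over the surviving links.
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups


groups = controller_groups(LINKS, failed={"121A", "121C", "121D"})
```

With the failed links of FIG. 2, the sketch yields one group containing controllers 109A and 109C and another containing controllers 109B and 109D, matching controller groups 210A and 210B.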
Each controller group 210A-B is unaware of the presence of the other controller group 210A-B. In this case, each of the two controller groups 210A-B determines its own master controller, resulting in two master controllers controlling network 103. As described above, when multiple master controllers control network 103, the different master controllers may program NEs 110-116 inconsistently, which may cause network 103 as a whole to fail.
Embodiments disclosed herein prevent such a failure of network 103 by having NE 111 in network 103 relay BGP messages 140 from each of controllers 109A-D to the other controllers 109A-D. In one embodiment, after detecting failure 215 along links 121A, 121C, and 121D, each of controllers 109A-D waits a predetermined period of time to receive, via NE 111, BGP message 140 from controllers 109A-D in the other controller group 210A-B before determining whether to designate another master controller in cluster 106.
For example, controller 109A may be the master controller in controller cluster network 200. After failure 215 within cluster 106, controller 109B initially determines that its connection with master controller 109A has failed. Controller 109B then waits for the predetermined period of time to receive, from NE 111, BGP message 140 originated by controller 109A.
During this time, controller 109A detects failure 215 of links 121A, 121C, and 121D. Controller 109A then generates BGP message 140 indicating, for example, that the number of controllers in cluster 106 reachable from controller 109A is now two (controllers 109A and 109C), because controller 109A is no longer connected to controllers 109B and 109D. BGP message 140 may also include the IDs of controllers 109A and 109C. Controller 109A sends BGP message 140 to NE 111 through enhanced BGP session 131.
NE 111 still maintains enhanced BGP sessions 203, 133, and 206 with the other controllers 109C, 109B, and 109D, respectively. NE 111 then forwards BGP message 140 to controllers 109C, 109B, and 109D via enhanced BGP sessions 203, 133, and 206, respectively. When controller 109B receives, within the predetermined period of time, BGP message 140 indicating that master controller 109A is still active and that there are two controllers in group 210A, controller 109B does not act as a master controller. In contrast, when controller 109B does not receive BGP message 140 within the predetermined period of time, controller 109B acts as the master controller in controller cluster network 200.
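The decision that a controller makes at the end of the waiting period can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary representation of a relayed BGP message 140 (with a hypothetical `is_master` key) and the rule that the highest-priority reachable controller takes over are not specified in this form by the description.

```python
def should_become_master(received, own_priority, reachable_priorities):
    """Decide whether a controller takes over as master after the wait period.

    received: messages relayed by the NE during the wait, each a dict with a
              boolean "is_master" key (hypothetical representation).
    own_priority: priority of the deciding controller.
    reachable_priorities: priorities of the other controllers it can still reach.
    """
    # A relayed message from a live master means the master is still active.
    if any(message["is_master"] for message in received):
        return False
    # Assumption: otherwise the highest-priority reachable controller takes over.
    return all(own_priority > other for other in reachable_priorities)


# Controller 109B (priority 188) can still reach controller 109D (priority 178).
relayed = [{"origin": "109A", "is_master": True}]
stays_backup = should_become_master(relayed, 188, [178])
takes_over = should_become_master([], 188, [178])
```

In this sketch, controller 109B remains a backup when the message from master controller 109A arrives in time and takes over only when nothing arrives before the period expires.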
FIG. 3 is a diagram of an NE 300 implementing BGP for network HA in accordance with various embodiments of the present application. In one embodiment, NE 300 can be implemented as any one of NEs 110-116 or as any one of controllers 109A-D.
NE 300 includes a port 320, a transceiver unit (Tx/Rx) 310, a processor 330, and a memory 333. The processor 330 includes a controller module 334. The port 320 is coupled to Tx/Rx 310, which may be a transmitter, a receiver, or a combination thereof. Tx/Rx 310 may transmit and receive data through port 320. The processor 330 is used for processing data. Memory 333 is used to store data and instructions for implementing the embodiments described herein. NE 300 may also include an electrical-to-optical (EO) component and an optical-to-electrical (OE) component coupled to port 320 and Tx/Rx 310 for receiving and transmitting electrical and optical signals.
The processor 330 may be implemented by hardware and software. The processor 330 may be implemented as one or more central processing unit (CPU) and/or graphics processing unit (GPU) chips, logic units, cores (e.g., as a multi-core processor), field-programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), and digital signal processors (DSPs). Processor 330 communicates with port 320, Tx/Rx 310, and memory 333. The controller module 334 is implemented by the processor 330 to execute instructions for implementing the various embodiments described herein. For example, the controller module 334 is configured to establish the enhanced BGP sessions 130-133, 203, and 206 and to transmit BGP messages 140. The inclusion of the controller module 334 may improve the functionality of NE 300. The controller module 334 may also cause NE 300 to transition to a different state. Alternatively, the controller module 334 is implemented as instructions stored in the memory 333.
Memory 333 includes one or more of a disk, a tape drive, or a solid state drive, and may be used as an overflow data storage device, to store programs when such programs are selected for execution, and to store instructions and data that are read during program execution. Memory 333 may be volatile and/or non-volatile, and may be read-only memory (ROM), random access memory (RAM), ternary content-addressable memory (TCAM), and/or static random access memory (SRAM).
In one embodiment, when NE 300 is a controller 109A-D, memory 333 stores status database 124, capability 303, location 306, old location 309, number 312 of controllers 109A-D, controller ID 315, and priority 318. Status database 124 maintains up-to-date information about NEs 110-116 in network 103 and up-to-date information about controllers 109A-D in cluster 106. Capability 303 may indicate whether NE 300 is capable of implementing BGP for network HA (e.g., establishing enhanced BGP sessions 130-133, 203, and 206, and sending BGP message 140 with the extensions).
Location 306 refers to the current or expected location of NE 300 within the priority order of controllers 109A-D within cluster 106. Old location 309 refers to the previous location of NE 300 within the priority order of controllers 109A-D within cluster 106. Quantity 312 refers to the number of controllers 109A-D within cluster 106.
The controller ID 315 is an ID or value that identifies each controller of the controllers 109A-D in the cluster 106. In one embodiment, the controller ID 315 includes the IDs of the controllers 109A-D reachable in the cluster 106. In another embodiment, the controller ID 315 includes an ID of a controller 109A-D that is not reachable or has failed in the cluster 106. Priority 318 is a value indicating the priority of NE 300 relative to other controllers 109A-D in cluster 106.
It should be appreciated that by programming and/or loading executable instructions onto NE 300, at least one of processor 330 and/or memory 333 is altered, transforming NE 300 in part into a particular machine or device (e.g., a multi-core forwarding architecture) having the new functionality taught by the present application. It is fundamental to the electrical engineering and software engineering arts that functionality that can be implemented by loading executable software into a computer can be converted to a hardware implementation by well-known design rules. The decision whether to implement a concept in software or in hardware generally depends on the stability of the design and the number of units to be produced, rather than on any issues involved in translating from the software domain to the hardware domain. Generally, a design that is still subject to frequent change is better suited to implementation in software, because reworking a hardware implementation is more expensive than reworking a software design. Generally, a stable design that will be mass-produced is better suited to implementation in hardware, such as in an ASIC, because for large production runs a hardware implementation may be less expensive than a software implementation. Often, a design may be developed and tested in software and later converted, by well-known design rules, into an equivalent hardware implementation in an ASIC that hardwires the instructions of the software. In the same manner that a machine controlled by a new ASIC is a particular machine or device, a computer that has been programmed and/or loaded with executable instructions can likewise be considered a particular machine or device.
FIGS. 4A-4D are diagrams of type-length-values (TLVs) encoding capability 303 of controllers 109A-D and NEs 110-116, according to various embodiments of the application. In particular, FIG. 4A illustrates a first embodiment of a controller capability triplet for indicating the capabilities of the controllers 109A-D or NEs 110-116. FIG. 4B illustrates a second embodiment of a controller capability triplet for indicating the capabilities of the controllers 109A-D or NEs 110-116. FIG. 4C illustrates a capability optional parameter including the controller capability triplet of FIG. 4A or the controller capability triplet of FIG. 4B. FIG. 4D shows an OPEN message including the capability optional parameter of FIG. 4C.
Referring now to FIG. 4A, a first embodiment of a controller capability triplet 400 for indicating the capabilities of the controllers 109A-D or NEs 110-116 is shown. Controllers 109A-D and/or NEs 110-111 exchange OPEN messages to establish enhanced BGP sessions 130-133, 203, and 206 with each other as described above with reference to FIGS. 1 and 2. In one embodiment, when a controller 109A-D or NE 110-111 sending an OPEN message is capable of implementing BGP for network HA (e.g., establishing enhanced BGP sessions 130-133, 203, and 206, and sending BGP message 140 with the extensions), the controller 109A-D or NE 110-111 includes a controller capability triplet 400 in the OPEN message.
As shown in FIG. 4A, the controller capability triplet 400 includes a 1-octet capability code 401, a 1-octet capability length 402, and a flag 403. The capability code 401 is a value to be assigned by the Internet Assigned Numbers Authority (IANA). The value in capability code 401 indicates that controller capability triplet 400 carries capability 303 of NEs 110-116 or controllers 109A-D. Capability length 402 represents the length of flag 403. The flag 403 includes 32 bits, of which 1 bit is the C bit 404. The C bit 404 is set to indicate whether a controller 109A-D or an NE 110-111 is sending the controller capability triplet 400. For example, C bit 404 may be set to 1 when a controller 109A-D sends controller capability triplet 400 and set to 0 when an NE 110-111 sends controller capability triplet 400. Alternatively, C bit 404 may be set to 0 when a controller 109A-D sends controller capability triplet 400 and set to 1 when an NE 110-111 sends controller capability triplet 400.
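As a minimal sketch of the layout described above, the following Python snippet packs controller capability triplet 400. The capability code value, the position of the C bit within the 32-bit flag, and the function name are placeholders chosen for illustration; the actual code value is to be assigned by IANA, and the description does not state which bit is the C bit.

```python
import struct

CONTROLLER_CAPABILITY_CODE = 0xEE  # placeholder: real value is to be assigned by IANA
C_BIT = 0x80000000                 # assumption: C bit is the most significant flag bit


def encode_capability_triplet_400(sender_is_controller):
    """Pack capability code (1 octet), capability length (1 octet), and 32-bit flag."""
    flags = C_BIT if sender_is_controller else 0
    value = struct.pack("!I", flags)  # network byte order, 4 octets
    return struct.pack("!BB", CONTROLLER_CAPABILITY_CODE, len(value)) + value


from_controller = encode_capability_triplet_400(True)
from_ne = encode_capability_triplet_400(False)
```

The sketch follows the first convention mentioned above, in which the C bit is 1 when a controller is the sender; the alternative convention would simply invert the condition.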
Referring now to FIG. 4B, a second embodiment of a controller capability triplet 425 for indicating the capabilities of the controllers 109A-D or NEs 110-116 is shown. In one embodiment, when a controller 109A-D or NE 110-111 sending an OPEN message is capable of implementing BGP for network HA (e.g., establishing enhanced BGP sessions 130-133, 203, and 206, and sending BGP message 140 with the extensions), the controller 109A-D or NE 110-111 includes a controller capability triplet 425 in the OPEN message.
As shown in FIG. 4B, the controller capability triplet 425 includes a 1-octet capability code 426, a 1-octet capability length 427, a controller address family identifier (AFI) 428, a controller subsequent address family identifier (SAFI) 429, and a flag 403. Capability code 426 is similar to capability code 401 of FIG. 4A. Capability length 427 is similar to capability length 402 of FIG. 4A.
Controller AFI 428 is a 16-bit value to be assigned by the Internet Assigned Numbers Authority (IANA). Controller SAFI 429 is an 8-bit value to be assigned by IANA. Controller AFI 428 and controller SAFI 429 are values defined to carry information about controllers 109A-D in cluster 106. The flag 403 includes 8 bits, of which 1 bit is the C bit 404. As described above, the C bit 404 is set to indicate whether a controller 109A-D or an NE 110-111 is sending the controller capability triplet 425. In some embodiments, the controller capability triplet 400 of FIG. 4A and the controller capability triplet 425 of FIG. 4B are based on the capability optional parameter defined by RFC 5492, entitled "Capabilities Advertisement with BGP-4," by J. Scudder et al., published February 2009 (hereinafter "RFC 5492").
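The second embodiment can be sketched in the same way. In the following Python snippet, the capability code, the controller AFI and SAFI values, and the position of the C bit in the 8-bit flag are all placeholders for illustration, since the real values are to be assigned by IANA and the bit position is not stated.

```python
import struct

CONTROLLER_CAPABILITY_CODE = 0xEE  # placeholder: to be assigned by IANA
CONTROLLER_AFI = 0x4000            # placeholder: to be assigned by IANA
CONTROLLER_SAFI = 0x50             # placeholder: to be assigned by IANA
C_BIT = 0x80                       # assumption: C bit is the most significant flag bit


def encode_capability_triplet_425(sender_is_controller):
    """Pack code (1), length (1), controller AFI (2), controller SAFI (1), flag (1)."""
    flags = C_BIT if sender_is_controller else 0
    value = struct.pack("!HBB", CONTROLLER_AFI, CONTROLLER_SAFI, flags)
    return struct.pack("!BB", CONTROLLER_CAPABILITY_CODE, len(value)) + value


triplet_425 = encode_capability_triplet_425(True)
```

Under these assumptions, the value portion is four octets long, so both embodiments produce a triplet of the same overall size.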
Referring now to FIG. 4C, a capability optional parameter 450 is shown that includes the controller capability triplet 400 of FIG. 4A or the controller capability triplet 425 of FIG. 4B. Capability optional parameter 450 includes a parameter type 451, a parameter length 452, and one or more capability triplets, including controller capability triplet 400 of FIG. 4A or controller capability triplet 425 of FIG. 4B. Parameter type 451 is an 8-bit field set to 2 to indicate that the TLV is a capability optional parameter 450. Parameter length 452 is an 8-bit field that indicates the total length of the capability triplets.
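The wrapping of capability triplets into capability optional parameter 450 can be sketched as follows. The function name is hypothetical, and the example triplet bytes use a placeholder capability code rather than an assigned value.

```python
import struct

CAPABILITY_PARAMETER_TYPE = 2  # parameter type 451, set to 2 as described above


def encode_capability_parameter(triplets):
    """Concatenate already-encoded capability triplets behind a type/length header."""
    body = b"".join(triplets)
    if len(body) > 255:
        raise ValueError("parameter length 452 is a single octet")
    return struct.pack("!BB", CAPABILITY_PARAMETER_TYPE, len(body)) + body


# Example triplet with a placeholder capability code 0xEE and a 32-bit flag.
example_triplet = b"\xee\x04\x80\x00\x00\x00"
parameter_450 = encode_capability_parameter([example_triplet])
```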
Referring now to FIG. 4D, an OPEN message 475 according to RFC 4271 is shown, including the capability optional parameter 450 of FIG. 4C. In one embodiment, when a controller 109A-D or NE 110-111 sending OPEN message 475 is capable of implementing BGP for network HA (e.g., establishing enhanced BGP sessions 130-133, 203, and 206, and sending BGP message 140 with the extensions), the controller 109A-D or NE 110-111 includes capability optional parameter 450 of FIG. 4C in OPEN message 475. The capability optional parameter 450 of FIG. 4C includes the controller capability triplet 400 of FIG. 4A or the controller capability triplet 425 of FIG. 4B.
As shown in FIG. 4D, OPEN message 475 includes version 476, My Autonomous System field 477, hold time field 478, BGP identifier 479, optional parameter length 480, and optional parameters including capability optional parameter 450. Version 476 indicates the BGP protocol version number; the current BGP version number is 4. My Autonomous System field 477 indicates the autonomous system number of the controller 109A-D or NE 110-111 that sent OPEN message 475. The hold time field 478 indicates the number of seconds that the controller 109A-D or NE 110-111 sending OPEN message 475 proposes for the value of the hold timer. BGP identifier 479 indicates the BGP identifier of the controller 109A-D or NE 110-111 that sent OPEN message 475 and may be the IP address of that controller 109A-D or NE 110-111. Optional parameter length 480 represents the total length of the optional parameters included in OPEN message 475.
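The field order listed above corresponds to the OPEN message body of RFC 4271, which can be sketched in Python as follows. The function name, the example autonomous system number, and the example address are illustrative assumptions; the BGP message header that precedes this body is omitted here.

```python
import socket
import struct

BGP_VERSION = 4


def encode_open_body(my_as, hold_time, bgp_identifier, optional_parameters=b""):
    """Pack version (1), My AS (2), hold time (2), BGP identifier (4), opt. length (1)."""
    fixed = struct.pack(
        "!BHH4sB",
        BGP_VERSION,
        my_as,
        hold_time,
        socket.inet_aton(bgp_identifier),  # identifier given as an IPv4 address
        len(optional_parameters),
    )
    return fixed + optional_parameters


# Hypothetical sender: AS 65001, 180-second hold time, identifier 192.0.2.1.
open_body = encode_open_body(65001, 180, "192.0.2.1", b"\x02\x00")
```

The optional parameters argument would carry capability optional parameter 450; the two-octet value used here is an empty capability parameter included only to show where it goes.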
Fig. 5A-5B are illustrations of the contents of BGP messages 140 communicated through controller cluster networks 100 and 200, according to various embodiments of the application. Specifically, fig. 5A shows the contents of BGP message 140, and fig. 5B shows a TLV for encoding a controller network layer reachability information (network layer reachability information, NLRI) field carrying the contents of BGP message 140.
Referring now to FIG. 5A, the contents of BGP messages 140 communicated through controller cluster networks 100 and 200 are shown, in accordance with various embodiments of the present application. As described above with reference to FIGS. 1 and 2, controllers 109A-D generate BGP messages 140, and NEs 110-111 forward BGP messages 140 to the other controllers 109A-D in cluster 106 over enhanced BGP sessions 130-133, 203, and 206.
In one embodiment, BGP message 140 includes controller network layer reachability information (network layer reachability information, NLRI) 503. The controller NLRI 503 describes the state of the controllers 109A-D (also referred to herein as "originating controllers 109A-D") that generated the BGP message 140. The controller NLRI 503 also describes other controllers 109A-D in the cluster 106. As shown in FIG. 5A, the controller NLRI 503 includes a master controller flag (C) 506, a location 306, an old location 309, a number of controllers 312, a priority 318, and controller IDs 315A-N identifying all controllers 109A-D in the cluster 106. It should be appreciated that BGP message 140 may include additional information not otherwise shown in fig. 5A.
In one embodiment, the master controller flag (C) 506 is a flag or bit set to indicate whether the originating controller 109A-D is the master controller. For example, master controller 109A generates BGP message 140 with master controller flag (C) 506 set to 1.
Location 306 refers to the current or expected location of the originating controller 109A-D within the priority order of controllers 109A-D within cluster 106. The priority order indicates the order in which the master controller is selected from cluster 106. For example, the operator of the controller cluster network 100 may set the priority of each controller 109A-D in cluster 106, indicating the order of the controllers 109A-D from highest priority to lowest priority. The controller 109A having the highest priority (e.g., priority 200) is the first master controller 109A, the second controller 109B having the next highest priority (e.g., priority 188) is a backup controller of the first master controller 109A, the third controller 109C having the next highest priority (e.g., priority 180) is a backup controller of the first master controller 109A and the second controller 109B, and the fourth controller 109D having the next highest priority (e.g., priority 178) is a backup controller of the first master controller 109A, the second controller 109B, and the third controller 109C. In this example, the priority order may be {controller 109A, controller 109B, controller 109C, controller 109D}. Therefore, location 306 of master controller 109A is 1, location 306 of the second controller 109B is 2, location 306 of the third controller 109C is 3, and location 306 of the fourth controller 109D is 4.
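The relationship between priority 318 and location 306 in this example can be expressed as a short Python sketch. The function name and the dictionary keyed by reference numeral are illustrative assumptions; the priority values are the ones given in the example above.

```python
def assign_locations(priorities):
    """Map each controller to its 1-based location in descending priority order."""
    ordered = sorted(priorities, key=priorities.get, reverse=True)
    return {controller: index for index, controller in enumerate(ordered, start=1)}


priorities = {"109A": 200, "109B": 188, "109C": 180, "109D": 178}
locations = assign_locations(priorities)

# If controller 109A is no longer part of the ordering, every remaining
# controller moves up by one location.
without_master = assign_locations(
    {name: value for name, value in priorities.items() if name != "109A"}
)
```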
In one embodiment, the locations 306 of the controllers 109A-D change as a result of a failure within cluster 106. Old location 309 refers to the previous location of the originating controller 109A-D, relative to the other controllers 109A-D in cluster 106, within the priority order of controllers 109A-D. For example, when controller 109A fails, controller 109B becomes the master controller, so location 306 of controller 109B becomes 1 while old location 309 of controller 109B is 2.
The number 312 of controllers 109A-D refers to the number or count of controllers 109A-D in the cluster 106. In the example shown in FIG. 2, cluster 106 includes four controllers 109A-D, and therefore, number 312 is set to 4. Priority 318 refers to the priority of the originating controllers 109A-D, which may be assigned by the operator of the controller cluster network 100 or 200.
The controller IDs 315A-N are IDs that identify each controller of the controllers 109A-D in the cluster 106. The controller IDs 315A-N may be an identification, tag, or address of each of the controllers 109A-D in the cluster 106. For example, BGP message 140 may include the IP addresses of controllers 109A-D as controller IDs 315A-D.
Referring now to fig. 5B, a TLV for encoding a controller NLRI field 510 carrying the NLRI 503 of fig. 5A is shown. The controller NLRI field 510 is similar to the encoding of the NLRI field defined by RFC 4271 carried in the BGP UPDATE message, except that the controller NLRI field 510 carries the controller NLRI 503.
As shown in FIG. 5B, the controller NLRI field 510 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516 (shown as "No. of Controllers" in FIG. 5B), an old location field 517, reserved bits 518, a priority field 519, and connected controller ID fields 520A-N. It should be understood that the controller NLRI field 510 can include other fields not otherwise shown in FIG. 5B.
The type field 511 is a 16-bit field whose value is allocated by IANA and indicates that the controller NLRI 503 is carried in the controller NLRI field 510. The length field 512 is a 16-bit field that indicates the length, in octets, of the controller NLRI field 510 excluding the type field 511 and the length field 512. The flag 513 includes 8 bits, of which 1 bit is defined as the C bit 514. The C bit 514 carries the master controller flag (C) 506. Location field 515 is an 8-bit field that indicates the location 306 of the originating controller 109A-D. The number of controllers field 516 is an 8-bit field that indicates the number 312 of controllers 109A-D in the cluster 106. Old location field 517 is an 8-bit field that indicates old location 309 of the originating controller 109A-D. Reserved bits 518 comprise 24 bits that are set to zero on transmission and ignored upon receipt. The priority field 519 is an 8-bit field that indicates the priority 318 of the originating controller 109A-D. The connected controller ID fields 520A-N are 32-bit fields that indicate the controller IDs 315A-N of the controllers 109A-D in the cluster 106.
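The field widths listed above are sufficient to sketch an encoder and decoder for controller NLRI field 510. In the following Python snippet, the type value, the position of the C bit within flag 513, the use of IPv4 addresses as controller IDs, and all function names are assumptions made for illustration; the type value is to be allocated by IANA.

```python
import socket
import struct

CONTROLLER_NLRI_TYPE = 1  # placeholder: the real value is to be allocated by IANA
C_BIT = 0x80              # assumption: C bit is the most significant bit of flag 513
FIXED = "!BBBB3xB"        # flag, location, number, old location, 24 reserved bits, priority


def encode_controller_nlri(is_master, location, number, old_location, priority, ids):
    """Pack the fields in the order listed for FIG. 5B; `ids` are IPv4 strings."""
    flags = C_BIT if is_master else 0
    value = struct.pack(FIXED, flags, location, number, old_location, priority)
    value += b"".join(socket.inet_aton(controller_id) for controller_id in ids)
    # Length field 512 excludes the type and length fields themselves.
    return struct.pack("!HH", CONTROLLER_NLRI_TYPE, len(value)) + value


def decode_controller_nlri(data):
    """Inverse of encode_controller_nlri; reserved bits are skipped on receipt."""
    _, length = struct.unpack_from("!HH", data, 0)
    flags, location, number, old_location, priority = struct.unpack_from(FIXED, data, 4)
    ids = [socket.inet_ntoa(data[i:i + 4]) for i in range(12, 4 + length, 4)]
    return {
        "is_master": bool(flags & C_BIT),
        "location": location,
        "number": number,
        "old_location": old_location,
        "priority": priority,
        "ids": ids,
    }


# Hypothetical master controller in a two-controller cluster.
encoded = encode_controller_nlri(True, 1, 2, 1, 200, ["10.0.0.1", "10.0.0.2"])
decoded = decode_controller_nlri(encoded)
```

With two controller IDs, the value portion is 16 octets: 8 octets of fixed fields followed by two 4-octet IDs.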
In one embodiment, the controller NLRI 503 is carried in an existing BGP NLRI, such as a multiprotocol reachable NLRI (MP_REACH_NLRI) or a multiprotocol unreachable NLRI (MP_UNREACH_NLRI). Both MP_REACH_NLRI and MP_UNREACH_NLRI are defined in RFC 4760, entitled "Multiprotocol Extensions for BGP-4," by T. Bates et al., published January 2007 (hereinafter "RFC 4760").
FIGS. 6A-6C are illustrations of TLVs for encoding BGP messages 140 into existing BGP NLRIs, according to various embodiments of the present application. FIG. 6A is a diagram of MP_REACH_NLRI; FIG. 6B is a diagram of MP_UNREACH_NLRI; FIG. 6C is a diagram of a TLV for encoding the unreachable controller NLRI field carried in the MP_UNREACH_NLRI of FIG. 6B.
Referring now to FIG. 6A, BGP MP_REACH_NLRI 600 is shown, which is similar to the MP_REACH_NLRI defined by RFC 4760, except that MP_REACH_NLRI 600 carries the controller NLRI field 510 of FIG. 5B. As shown in FIG. 6A, MP_REACH_NLRI 600 includes a controller AFI field 601, a controller SAFI field 602, a next hop network address length field 603, a next hop network address field 604, a reserved field 605, and the controller NLRI field 510.
The controller AFI field 601 is a 2-octet field carrying the controller AFI, which is a value to be assigned by IANA. When the controller AFI is carried in MP_REACH_NLRI 600, it indicates that MP_REACH_NLRI 600 includes a controller NLRI field 510 that carries the controller NLRI 503. The controller SAFI field 602 is a 1-octet field that carries the controller SAFI, which is also a value to be assigned by IANA. When the controller SAFI is carried in MP_REACH_NLRI 600, it likewise indicates that MP_REACH_NLRI 600 includes a controller NLRI field 510 that carries the controller NLRI 503. The next hop network address length field 603, the next hop network address field 604, and the reserved field 605 are left blank because they are not relevant to the controller NLRI 503 carried in the controller NLRI field 510.
In one embodiment, the controller NLRI 503 carried in the controller NLRI field 510 of MP_REACH_NLRI 600 indicates information about the controllers 109A-D in the cluster 106 that are reachable or available at the time MP_REACH_NLRI 600 is sent. In contrast, controllers 109A-D or NEs 110-111 send MP_UNREACH_NLRI to indicate information about the controllers 109A-D in cluster 106 that are unreachable, unavailable, or have failed.
Referring now to FIG. 6B, a BGP MP_UNREACH_NLRI 610 is illustrated that indicates information about the controllers 109A-D in the cluster 106 that are unreachable, unavailable, or have failed. BGP MP_UNREACH_NLRI 610 is similar to the MP_UNREACH_NLRI defined by RFC 4760, except that MP_UNREACH_NLRI 610 carries an unreachable controller NLRI field 615.
As shown in FIG. 6B, MP_UNREACH_NLRI 610 includes a controller AFI field 601, a controller SAFI field 602, and an unreachable controller NLRI field 615. The unreachable controller NLRI field 615 is similar to the controller NLRI field 510, except that the unreachable controller NLRI field 615 only carries information about the unreachable, unavailable, or failed controllers 109A-D in the cluster 106. FIG. 6C shows an example of the unreachable controller NLRI field 615 included in MP_UNREACH_NLRI 610.
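The two attribute layouts described with reference to FIGS. 6A and 6B can be sketched as follows. The controller AFI and SAFI values are placeholders for the values to be assigned by IANA, and the sketch assumes that leaving the next hop fields blank means a zero-length next hop followed by a zero reserved octet; the description does not state the exact encoding.

```python
import struct

CONTROLLER_AFI = 0x4000  # placeholder: to be assigned by IANA
CONTROLLER_SAFI = 0x50   # placeholder: to be assigned by IANA


def encode_mp_reach_nlri(controller_nlri_field):
    """AFI (2), SAFI (1), next hop length (1), next hop, reserved (1), then the NLRI."""
    next_hop = b""  # assumption: "left blank" is encoded as a zero-length next hop
    header = struct.pack("!HBB", CONTROLLER_AFI, CONTROLLER_SAFI, len(next_hop))
    return header + next_hop + b"\x00" + controller_nlri_field


def encode_mp_unreach_nlri(unreachable_controller_nlri_field):
    """AFI (2), SAFI (1), then the unreachable controller NLRI field."""
    header = struct.pack("!HB", CONTROLLER_AFI, CONTROLLER_SAFI)
    return header + unreachable_controller_nlri_field


reach = encode_mp_reach_nlri(b"\xaa")      # b"\xaa" stands in for an encoded NLRI field
unreach = encode_mp_unreach_nlri(b"\xaa")
```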
Referring now to FIG. 6C, a TLV for encoding the unreachable controller NLRI field 615 included in MP_UNREACH_NLRI 610 is shown. The unreachable controller NLRI field 615 is similar to the controller NLRI field 510 of FIG. 5B, except that only the controller ID 315X of the controller 109A-D that has become unreachable, unavailable, or failed is included in the connected controller ID field 520X.
For example, NE 111 may detect that a session with controller 109A has failed, and NE 111 may generate, based on BGP message 140 originating from controller 109A, BGP message 150 including MP_UNREACH_NLRI 610, which includes unreachable controller NLRI field 615. NE 111 may then send BGP message 150 to the other controllers 109B-D. After receiving the message, the other controllers 109B-D delete the information about controller 109A. The unreachable controller NLRI field 615 includes a C bit 514 that indicates that controller 109A is the master controller, a location field 515 that indicates that location 306 is 1, a number of controllers field 516 that indicates that the number 312 of controllers 109A-D is 4, an old location field 517 that indicates the old location 309 of controller 109A, a priority field 519 that indicates that controller 109A has the highest priority 318, and a connected controller ID field 520X that indicates the controller ID 315 of controller 109A.
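How a receiving controller might apply reachable and unreachable information to its stored view of the cluster can be sketched as follows. The dictionary-based status database, the message keys, and the function name are hypothetical simplifications; the description does not define the internal structure of status database 124.

```python
def apply_controller_message(database, message):
    """Update a controller's stored view of the cluster from one relayed message."""
    if message["reachable"]:
        # Reachable NLRI: store or refresh the originating controller's state.
        database[message["controller_id"]] = message["state"]
    else:
        # Unreachable NLRI: delete the information about the named controller.
        database.pop(message["controller_id"], None)
    return database


# View held by controller 109B before and after NE 111 reports 109A as unreachable.
view = {}
apply_controller_message(
    view,
    {"reachable": True, "controller_id": "109A",
     "state": {"is_master": True, "location": 1, "priority": 200}},
)
known_before = sorted(view)
apply_controller_message(view, {"reachable": False, "controller_id": "109A"})
known_after = sorted(view)
```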
Fig. 7A-7C are illustrations of BGP generic headers in a new BGP message 140 or an existing BGP message 140, according to various embodiments of the application. In particular, fig. 7A shows BGP message generic headers included in BGP message 140. Fig. 7B shows BGP message generic headers included in BGP message 140 encoded as new type of BGP message 140. Fig. 7C shows BGP message generic header included in BGP message 140 encoded as a BGP UPDATE message according to RFC 4271.
Referring now to FIG. 7A, a BGP message generic header 700 is shown. BGP message generic header 700 may be used as the header of a new type of BGP message 140 defined to carry the controller NLRI 503, and may also be used as the header of an existing BGP message 140 (e.g., a BGP UPDATE message). As shown in FIG. 7A, BGP message generic header 700 includes a marker field 701, a length field 702, and a type field 703. The marker field 701 is a 16-octet field with all bits set to 1. The length field 702 is a 2-octet field that indicates the total length of the BGP message 140 (including BGP message generic header 700). The type field 703 is a 1-octet field carrying a value defined by IANA. This value represents the type of BGP message 140. For example, a first value indicates that BGP message 140 is a new type of BGP message 140, and a second value indicates that BGP message 140 is an existing type of BGP message 140, such as a BGP UPDATE message.
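The generic header described above follows the BGP message header of RFC 4271 and can be sketched in Python as follows. The function name is hypothetical; the type codes shown are the existing RFC 4271 values, and no value is assumed here for the new message type, which would be defined separately.

```python
import struct

MARKER = b"\xff" * 16   # 16-octet marker with all bits set to 1
HEADER_LENGTH = 19      # marker (16) + length (2) + type (1)
MAX_MESSAGE_LENGTH = 4096  # maximum BGP message size in RFC 4271
TYPE_UPDATE = 2         # existing BGP UPDATE message type in RFC 4271
TYPE_KEEPALIVE = 4      # existing BGP KEEPALIVE message type in RFC 4271


def encode_bgp_message(message_type, body=b""):
    """Prefix `body` with marker, total length (header included), and type."""
    total_length = HEADER_LENGTH + len(body)
    if total_length > MAX_MESSAGE_LENGTH:
        raise ValueError("BGP message exceeds 4096 octets")
    return MARKER + struct.pack("!HB", total_length, message_type) + body


keepalive = encode_bgp_message(TYPE_KEEPALIVE)
```

A KEEPALIVE message consists of the header alone, which makes it a convenient check that the length field counts the header itself.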
Referring now to FIG. 7B, a new BGP message 140A is shown, encoded as a new type of BGP message 140A. The new BGP message 140A includes a BGP message generic header 700A that includes a value in a type field 703 indicating that the message is encoded as a new type of BGP message 140A. The new BGP message 140A also includes MP_REACH_NLRI 600 or MP_UNREACH_NLRI 610, depending on whether the new BGP message 140A indicates information about reachable or unreachable controllers 109A-D. As described above, MP_REACH_NLRI 600 includes controller NLRI field 510, and MP_UNREACH_NLRI 610 includes unreachable controller NLRI field 615.
Referring now to FIG. 7C, an existing BGP UPDATE message 140B encoded in accordance with RFC 4271 is shown. The existing BGP UPDATE message 140B includes a BGP message generic header 700B that includes a value in a type field 703 indicating that the message is encoded as an existing type of BGP message 140B. The existing BGP UPDATE message 140B includes a withdrawn routes length field 753, a withdrawn routes field 756, a total path attribute length field 759, and a path attributes field 762. The withdrawn routes length field 753 indicates the total length of the withdrawn routes field 756 and may be set to 0 in the BGP UPDATE message 140B because no routes are withdrawn. The withdrawn routes field 756 is left blank because there are no routes to be withdrawn by the BGP UPDATE message 140B. The total path attribute length field 759 is a 2-octet field indicating the total length of the path attributes field 762. The path attributes field 762 includes either MP_REACH_NLRI 600 or MP_UNREACH_NLRI 610, depending on whether the existing BGP UPDATE message 140B indicates information about reachable controllers 109A-D or unreachable controllers 109A-D. As described above, MP_REACH_NLRI 600 includes controller NLRI field 510, and MP_UNREACH_NLRI 610 includes unreachable controller NLRI field 615.
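The body of such an UPDATE message can be sketched as follows, using the path attribute encoding of RFC 4271 and the attribute type codes 14 (MP_REACH_NLRI) and 15 (MP_UNREACH_NLRI) from RFC 4760. The function names are hypothetical, and the attribute value passed in the example is a stand-in for an encoded MP_REACH_NLRI 600.

```python
import struct

ATTR_FLAG_OPTIONAL = 0x80         # MP_REACH/MP_UNREACH are optional non-transitive
ATTR_FLAG_EXTENDED_LENGTH = 0x10  # set when the attribute length needs two octets
MP_REACH_NLRI = 14
MP_UNREACH_NLRI = 15


def encode_path_attribute(type_code, value):
    """Pack attribute flags, type code, and a 1- or 2-octet length before the value."""
    flags = ATTR_FLAG_OPTIONAL
    if len(value) > 255:
        flags |= ATTR_FLAG_EXTENDED_LENGTH
        header = struct.pack("!BBH", flags, type_code, len(value))
    else:
        header = struct.pack("!BBB", flags, type_code, len(value))
    return header + value


def encode_update_body(path_attributes):
    """Withdrawn routes length 0, no withdrawn routes, then the path attributes."""
    withdrawn_routes = b""  # nothing is withdrawn by this message
    return (
        struct.pack("!H", len(withdrawn_routes))
        + withdrawn_routes
        + struct.pack("!H", len(path_attributes))
        + path_attributes
    )


attribute = encode_path_attribute(MP_REACH_NLRI, b"\x01\x02\x03")
update_body = encode_update_body(attribute)
```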
FIGS. 8A-8C are illustrations of BGP messages 140 communicated through the controller cluster network 100 of FIG. 1 prior to any failure of cluster 106 in the controller cluster network 100, in accordance with various embodiments of the application. In particular, FIG. 8A illustrates the transmission of BGP messages 140 communicated over the controller cluster network 100 of FIG. 1. FIGS. 8B-8C illustrate TLVs for encoding the BGP messages 140 communicated over controller cluster network 100 in FIG. 8A.
Referring now to fig. 8A, a diagram illustrating the transmission of BGP messages 800 and 803 through the controller cluster network 100 of fig. 1 is shown, in accordance with various embodiments of the application. In fig. 8A, BGP messages 800 and 803 are sent before any failure of cluster 106 in controller cluster network 100 occurs.
As shown in FIG. 8A, controller 109A generates BGP message 800, which may be encoded as a new BGP message 140A or as an existing BGP UPDATE message 140B. BGP message 800 includes the controller NLRI 503 of FIG. 5A. The controller NLRI 503 of BGP message 800 includes a master controller flag (C) 506 (shown as "C" in FIG. 8A), a location 306, an old location 309, a number 312, a priority 318, and controller IDs 315A-B. The master controller flag (C) 506 is set to 1 to indicate that the controller 109A sending BGP message 800 (also referred to herein as "originating controller 109A") is the master controller of the controller cluster network 100. Location 306 is 1, indicating the first location in the priority order of the reachable controllers 109A-D in cluster 106. Old location 309 is also 1, for example, because controller 109A has been the master controller since cluster 106 was initialized with controllers 109A-B. The number 312 of controllers 109A-B indicates that there are two controllers 109A-B in the cluster 106. Priority 318 indicates that controller 109A has the highest priority 318 in cluster 106. The controller IDs 315A-B include a controller ID 315A that identifies the controller 109A and a controller ID 315B that identifies the controller 109B.
The controller 109A sends BGP message 800 to NE 110 over enhanced BGP session 130. NE 110 forwards BGP message 800 to controller 109B via enhanced BGP session 132. The controller 109B determines that the controller 109A is still reachable and available upon receipt of the BGP message 800 and updates the status database 124 to include data from the BGP message 800.
Controller 109A sends BGP message 800 to NE 111 through enhanced BGP session 131. NE 111 forwards BGP message 800 to controller 109B via enhanced BGP session 133. The controller 109B determines that the controller 109A is still reachable and available upon receipt of the BGP message 800 and updates the status database 124 to include data from the BGP message 800.
Similarly, controller 109B generates BGP message 803, which may be encoded as a new BGP message 140A or as an existing BGP UPDATE message 140B. BGP message 803 includes a controller NLRI 503 that includes a master controller flag (C) 506 (shown as "C" in fig. 8A), a location 306, an old location 309, a number 312, a priority 318, and controller IDs 315A-B. The master controller flag (C) 506 is set to 0 to indicate that the controller 109B sending BGP message 803 (also referred to herein as "originating controller 109B") is not the master controller of the controller cluster network 100. Location 306 has a value of 2, indicating the second location in the priority order of the reachable controllers 109A-D in cluster 106. A value of 2 for location 306 also indicates that controller 109B is a backup for master controller 109A. Old location 309 also has a value of 2, for example, because controller 109B has been the secondary controller 109B since cluster 106 was initialized with controllers 109A-B. The number 312 of controllers 109A-B indicates that there are two controllers 109A-B in the cluster 106. Priority 318 indicates that controller 109B has the second highest priority 318 in cluster 106. The controller IDs 315A-B include a controller ID 315A that identifies the controller 109A and a controller ID 315B that identifies the controller 109B.
The controller 109B sends BGP message 803 to the NE 110 over the enhanced BGP session 132. NE 110 forwards BGP message 803 to controller 109A via enhanced BGP session 130. The controller 109A determines that the controller 109B is still reachable and available upon receipt of the BGP message 803 and updates the status database 124 to include data from the BGP message 803.
Controller 109B sends BGP message 803 to NE 111 over enhanced BGP session 133. NE 111 forwards BGP message 803 to controller 109A via enhanced BGP session 131. The controller 109A again determines that the controller 109B is still reachable and available upon receipt of the BGP message 803 and updates the status database 124 to include data from the BGP message 803.
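As a rough illustration of the exchange described above, the following Python sketch models the contents of BGP messages 800 and 803 and the update of a receiving controller's status database. The class name `ControllerNlri`, the dictionary standing in for the status database 124, the string controller IDs, and the numeric priority values are assumptions made for illustration and are not part of the described embodiments.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ControllerNlri:
    """Illustrative model of controller NLRI 503 (field names are hypothetical)."""
    origin: str                      # ID of the originating controller
    c_flag: int                      # master controller flag (C) 506
    location: int                    # location 306
    old_location: int                # old location 309
    number: int                      # number 312 of controllers in the cluster
    priority: int                    # priority 318 (assumed: larger value = higher)
    controller_ids: Tuple[str, ...]  # controller IDs 315


def update_state_db(state_db: Dict[str, ControllerNlri],
                    msg: ControllerNlri) -> None:
    """Store the latest message per originator; receipt implies reachability."""
    state_db[msg.origin] = msg


# Messages 800 and 803 as described for fig. 8A (priority values are assumed).
msg_800 = ControllerNlri("109A", 1, 1, 1, 2, 200, ("109A", "109B"))
msg_803 = ControllerNlri("109B", 0, 2, 2, 2, 100, ("109A", "109B"))

# Controller 109B receives message 800 twice, once via each NE.
state_db_109b: Dict[str, ControllerNlri] = {}
update_state_db(state_db_109b, msg_800)  # forwarded by NE 110
update_state_db(state_db_109b, msg_800)  # forwarded by NE 111, same result
```

Keying the database by originator makes the second copy of the same message harmless, which matches the description of the message arriving over both NE 110 and NE 111.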
Referring now to fig. 8B, a TLV of BGP message 800 generated by controller 109A is shown encoded in a format similar to controller NLRI field 510 of fig. 5B. The TLV of BGP message 800 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and connected controller ID fields 520A-N. In the TLV of BGP message 800, flag 513 includes a C bit 514 set to 1 to indicate that controller 109A is master controller 109A. The location field 515 includes the location 306 of the controller 109A, indicating value 1. The number of controllers field 516 includes the number 312 of controllers 109A-B, indicating value 2. Old location field 517 includes old location 309 of controller 109A, indicating value 1. The priority field 519 includes a value indicating that the controller 109A has the highest priority 318. The connected controller ID fields 520A-N include the controller IDs 315A-B of the controllers 109A-B, respectively.
Referring now to fig. 8C, a TLV of BGP message 803 generated by controller 109B is shown encoded in a format similar to controller NLRI field 510. The TLV of BGP message 803 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and connected controller ID fields 520A-N. In the TLV of BGP message 803, flag 513 includes a C bit 514 set to 0 to indicate that controller 109B is not the master. The location field 515 includes the location 306 of the controller 109B, indicating value 2. The number of controllers field 516 includes the number 312 of controllers 109A-B, indicating value 2. Old location field 517 includes old location 309 of controller 109B, indicating value 2. The priority field 519 includes a value indicating that the controller 109B has the second highest priority 318. The connected controller ID fields 520A-N include the controller IDs 315A-B of the controllers 109A-B, respectively.
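The TLV layouts of figs. 8B and 8C can be sketched with Python's `struct` module. The field widths used here (2-octet type and length, 1-octet flags, location, number of controllers, and old location, 2-octet reserved and priority fields, 4-octet controller IDs), the type code, the position of the C bit, and the numeric controller IDs are all assumptions chosen for illustration; the actual widths are those of the controller NLRI field 510.

```python
import struct
from typing import List, Tuple

TLV_TYPE = 1        # hypothetical type code
C_BIT = 0x80        # assumed position of C bit 514 in the flags octet
_FIXED = "!BBBBHH"  # flags, location, number, old location, reserved, priority


def encode_tlv(c: int, location: int, number: int, old_location: int,
               priority: int, controller_ids: List[int]) -> bytes:
    """Pack the fields into a type-length-value byte string."""
    flags = C_BIT if c else 0
    value = struct.pack(_FIXED, flags, location, number, old_location, 0, priority)
    value += b"".join(struct.pack("!I", cid) for cid in controller_ids)
    return struct.pack("!HH", TLV_TYPE, len(value)) + value


def decode_tlv(data: bytes) -> Tuple[int, int, int, int, int, List[int]]:
    """Unpack a TLV produced by encode_tlv, validating type and length."""
    tlv_type, length = struct.unpack_from("!HH", data, 0)
    fixed_len = struct.calcsize(_FIXED)
    ids_len = length - fixed_len
    if tlv_type != TLV_TYPE or length != len(data) - 4 or ids_len < 0 or ids_len % 4:
        raise ValueError("malformed controller TLV")
    flags, location, number, old_location, _reserved, priority = \
        struct.unpack_from(_FIXED, data, 4)
    ids = [struct.unpack_from("!I", data, 4 + fixed_len + off)[0]
           for off in range(0, ids_len, 4)]
    return (1 if flags & C_BIT else 0, location, number, old_location, priority, ids)


ID_109A, ID_109B = 0x0A000001, 0x0A000002  # hypothetical 32-bit controller IDs

tlv_800 = encode_tlv(1, 1, 2, 1, 200, [ID_109A, ID_109B])  # fig. 8B
tlv_803 = encode_tlv(0, 2, 2, 2, 100, [ID_109A, ID_109B])  # fig. 8C
```

Under these assumed widths the two TLVs differ only in the flags, location, old location, and priority octets, mirroring the difference between figs. 8B and 8C.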
Fig. 9A-9C are illustrations of BGP messages 140 communicated through the controller cluster network 100 after a failure of a cluster 106 in the controller cluster network 100, in accordance with various embodiments of the application. In particular, fig. 9A illustrates the transmission of BGP messages 140 communicated through controller cluster network 100 after a failure of cluster 106. Fig. 9B-9C illustrate TLVs for encoding BGP messages 140 communicated over controller cluster network 100.
Referring now to fig. 9A, a diagram illustrating the transmission of BGP messages 900 and 903 through the controller cluster network 100 of fig. 1 is shown, in accordance with various embodiments of the present application. In fig. 9A, BGP messages 900 and 903 are sent after controllers 109A and 109B detect a failure 910 occurring at link 121 interconnecting controllers 109A and 109B.
In fig. 9A, controller 109A generates BGP message 900 after detecting that a failure 910 occurred at link 121 interconnecting controllers 109A and 109B. When failure 910 occurs at link 121 interconnecting controllers 109A and 109B, controller 109A is no longer connected to controller 109B, and therefore controller 109A assumes that controller 109B has failed and become unreachable. The detection triggers the controller 109A to generate BGP message 900, which includes updated information about the cluster 106. BGP message 900 may be encoded as a new BGP message 140A or an existing BGP UPDATE message 140B.
The contents of BGP message 900 are similar to the contents of BGP message 800 sent prior to failure 910 in controller cluster network 100, and the fields of BGP message 900 are similar to the fields of BGP message 800 sent prior to failure 910 in controller cluster network 100. However, in BGP message 900, the number 312 of controllers 109A in cluster 106 indicates that there is only one controller in cluster 106 because controller 109A is no longer able to detect the presence of controller 109B. Similarly, controller IDs 315A-B only indicate controller ID 315A identifying controller 109A, because controller 109A is no longer able to detect the presence of controller 109B.
Similarly, controller 109B generates BGP message 903 after detecting that failure 910 occurred at link 121 interconnecting controllers 109A and 109B. When failure 910 occurs at link 121 interconnecting controllers 109A and 109B, controller 109B is no longer connected to controller 109A, and therefore controller 109B assumes that master controller 109A has failed and become unreachable. The detection triggers the controller 109B to generate BGP message 903, which includes updated information about the cluster 106. BGP message 903 may be encoded as a new BGP message 140A or an existing BGP UPDATE message 140B. The detection also triggers the controller 109B to wait a predetermined period of time to determine whether a message has been received from the master controller 109A, indicating that the master controller 109A is still reachable and active.
The contents of BGP message 903 are similar to the contents of BGP message 803 sent prior to failure 910 in controller cluster network 100, and the fields of BGP message 903 are similar to the fields of BGP message 803 sent prior to failure 910 in controller cluster network 100. However, in BGP message 903, the location 306 of controller 109B is updated to 1, indicating that the expected location of controller 109B after detection of failure 910 is 1. A location 306 of 1 indicates that the controller 109B will become the master controller of the controller cluster network 100. In addition, the number 312 of controllers 109B in cluster 106 indicates that there is only one controller 109B in cluster 106 because controller 109B is no longer able to detect the presence of controller 109A. Similarly, controller IDs 315A-B only indicate controller ID 315B that identifies controller 109B, because controller 109B is no longer able to detect the presence of controller 109A.
After generating BGP message 900, controller 109A sends BGP message 900 to NE 110 over enhanced BGP session 130. NE 110 forwards BGP message 900 to controller 109B via enhanced BGP session 132. Similarly, controller 109A sends BGP message 900 to NE 111 through enhanced BGP session 131. NE 111 forwards BGP message 900 to controller 109B through enhanced BGP session 133. The controller 109B, upon receiving the BGP message 900, determines that the controller 109A is still reachable and available and determines that the controller 109A is still the master controller 109A of the controller cluster network 100. In this way, the controller 109B does not erroneously promote itself to be the master controller of the controller cluster network 100.
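The decision made by controller 109B during the waiting period can be sketched as follows. This is a minimal illustration, assuming that arrival times are plain floating-point timestamps and that the predetermined period is 3.0 seconds; the function name `should_promote` and both values are hypothetical.

```python
from typing import Iterable, Tuple


def should_promote(wait_start: float, wait_period: float, master_id: str,
                   received: Iterable[Tuple[float, str]]) -> bool:
    """Return True only if no message from the master arrived within the window.

    `received` holds (arrival_time, originating_controller) pairs for messages
    forwarded by the NEs over the enhanced BGP sessions.
    """
    deadline = wait_start + wait_period
    for arrival_time, origin in received:
        if origin == master_id and wait_start <= arrival_time <= deadline:
            return False  # master is still reachable through an NE
    return True


# Link 121 fails at t = 0.0; message 900 then arrives via NE 110 and NE 111.
via_nes = [(0.4, "109A"), (0.5, "109A")]
promote_after_link_failure = should_promote(0.0, 3.0, "109A", via_nes)

# If nothing arrives from 109A during the window, promotion would proceed.
promote_after_silence = should_promote(0.0, 3.0, "109A", [])
```

In the fig. 9A scenario the first call yields no promotion, because BGP message 900 reaches controller 109B through the NEs even though link 121 is down.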
Referring now to fig. 9B, a TLV of BGP message 900 generated by controller 109A is shown encoded in a format similar to controller NLRI field 510 of fig. 5B. Controller 109A generates and transmits BGP message 900 after detecting failure 910 of link 121.
The TLV of BGP message 900 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and a connected controller ID field 520A. In the TLV of BGP message 900, flag 513 includes a C bit 514 set to 1 to indicate that controller 109A is master controller 109A. The location field 515 includes the location 306 of the controller 109A, indicating value 1. The number of controllers field 516 includes the number 312 of controllers 109A in cluster 106, indicating value 1, because controller 109A is no longer connected to controller 109B and no longer able to detect the presence of controller 109B. Old location field 517 includes old location 309 of controller 109A, indicating value 1. The priority field 519 includes a value indicating that the controller 109A has the highest priority 318. The connected controller ID field 520A includes only the controller ID 315A of the controller 109A because the controller 109A is no longer connected to the controller 109B and is no longer able to detect the presence of the controller 109B.
Referring now to fig. 9C, a TLV of BGP message 903 generated by controller 109B is shown encoded in a format similar to controller NLRI field 510. The controller 109B generates and transmits BGP message 903 after detecting failure 910 of link 121.
The TLV of BGP message 903 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and a connected controller ID field 520B. In the TLV of BGP message 903, flag 513 includes a C bit 514 set to 0 to indicate that controller 109B is not the master controller. The location field 515 includes the location 306 of the controller 109B, indicating value 1. The number of controllers field 516 includes the number 312 of controllers 109B in cluster 106, indicating value 1, because controller 109B is no longer connected to controller 109A and no longer able to detect the presence of controller 109A. Old location field 517 includes old location 309 of controller 109B, indicating value 2. The priority field 519 includes a value indicating that the controller 109B has the second highest priority 318. The connected controller ID field 520B includes only the controller ID 315B of the controller 109B because the controller 109B is no longer connected to the controller 109A and is no longer able to detect the presence of the controller 109A.
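The way the fields of BGP messages 900 and 903 follow from the pre-failure messages can be sketched as a small derivation. The class name `Advert`, the helper `advert_after_failure`, the string controller IDs, and the numeric priorities (larger value meaning higher priority) are assumptions for illustration; in particular, taking the old location from the location advertised before the failure is one reading of the description, not a stated rule.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Advert:
    """Hypothetical model of the controller NLRI fields."""
    origin: str
    c_flag: int
    location: int
    old_location: int
    number: int
    priority: int
    controller_ids: Tuple[str, ...]


def advert_after_failure(pre_failure: Advert, priorities: Dict[str, int],
                         reachable: Iterable[str]) -> Advert:
    """Rebuild an advertisement from the controllers still detectable."""
    members = set(reachable) | {pre_failure.origin}
    # Rank by descending priority; ties broken by ID (an assumption).
    ranked = sorted(members, key=lambda cid: (-priorities[cid], cid))
    return Advert(
        origin=pre_failure.origin,
        c_flag=pre_failure.c_flag,               # unchanged until a master is confirmed
        location=ranked.index(pre_failure.origin) + 1,
        old_location=pre_failure.location,       # location held before the failure
        number=len(members),
        priority=pre_failure.priority,
        controller_ids=tuple(sorted(members)),
    )


PRIORITIES = {"109A": 200, "109B": 100}  # assumed values
msg_800 = Advert("109A", 1, 1, 1, 2, 200, ("109A", "109B"))
msg_803 = Advert("109B", 0, 2, 2, 2, 100, ("109A", "109B"))

# After failure 910 neither controller can detect the other over link 121.
msg_900 = advert_after_failure(msg_800, PRIORITIES, reachable=[])
msg_903 = advert_after_failure(msg_803, PRIORITIES, reachable=[])
```

With an empty reachable set, each controller ranks itself first, which is why BGP message 903 carries location 1 while its C bit remains 0.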
Fig. 10A-10C are illustrations of BGP messages 140 communicated through the controller cluster network 100 after a failure of a master controller 109A in a cluster 106 of the controller cluster network 100, in accordance with various embodiments of the application. In particular, fig. 10A illustrates the transmission of BGP messages 140 communicated through controller cluster network 100 after a failure of master controller 109A. Fig. 10B-10C illustrate TLVs for encoding BGP messages 140 communicated over controller cluster network 100.
Referring now to fig. 10A, a diagram illustrating the transmission of BGP message 1003 through the controller cluster network 100 of fig. 1 is shown, in accordance with various embodiments of the application. In fig. 10A, the controller 109A has failed, and therefore the controller 109A does not generate any message. The controller 109B detects that a failure 1010 has occurred at the controller 109A when no heartbeat message or other message has been received from the controller 109A for a predetermined period of time. In this state, the controller 109B determines that the controller 109A has failed and become unreachable. The detection triggers the controller 109B to generate BGP message 1003, which includes updated information about the cluster 106. BGP message 1003 may be encoded as a new BGP message 140A or an existing BGP UPDATE message 140B.
The contents of BGP message 1003 are similar to the contents of BGP message 903 of fig. 9A. In BGP message 1003, the location 306 of controller 109B is updated to 1, indicating that the expected location of controller 109B after detection of failure 1010 is 1. Location 1 indicates that controller 109B should become the master controller of controller cluster network 100. Further, the number 312 of controllers 109B in cluster 106 indicates that there is only one controller in cluster 106. Further, the controller IDs 315A-B only indicate the controller ID 315B that identifies the controller 109B.
After generating BGP message 1003, controller 109B sends BGP message 1003 to NE 110 over enhanced BGP session 132. Similarly, controller 109B sends BGP message 1003 to NE 111 over enhanced BGP session 133.
At this point, the controller 109B waits for a predetermined period of time to determine whether a heartbeat message or any other message is received from the original master controller 109A. When no heartbeat or message is received from the master controller 109A within a predetermined period of time, the controller 109B determines that the controller 109B is now the master controller 109B of the controller cluster network 100. To this end, controller 109B generates and transmits another BGP message 1006, which is substantially identical to BGP message 1003. However, in BGP message 1006, master controller flag (C) 506 is set to indicate that controller 109B is master controller 109B of controller cluster network 100.
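The two-step promotion described above, in which BGP message 1003 is sent first and BGP message 1006 follows once the waiting period expires, can be sketched as a small state holder driven by explicit timestamps. The class names, method names, timestamps, and the 3.0-second period are assumptions for illustration only.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Advert:
    """Hypothetical model of the controller NLRI fields."""
    origin: str
    c_flag: int
    location: int
    old_location: int
    number: int
    priority: int
    controller_ids: Tuple[str, ...]


class BackupController:
    """Sends a candidate advert, then a master advert if the master stays silent."""

    def __init__(self, candidate: Advert, wait_period: float) -> None:
        self.advert = candidate
        self.wait_period = wait_period
        self.deadline: Optional[float] = None
        self.sent: List[Advert] = []

    def on_master_silent(self, now: float) -> None:
        """Failure detected: advertise the expected location and start waiting."""
        self.deadline = now + self.wait_period
        self.sent.append(self.advert)  # corresponds to message 1003 (C = 0)

    def on_master_message(self) -> None:
        """The master is alive after all: cancel the pending promotion."""
        self.deadline = None

    def tick(self, now: float) -> None:
        """Promote once the waiting period has fully elapsed."""
        if self.deadline is not None and now >= self.deadline:
            self.advert = replace(self.advert, c_flag=1)
            self.sent.append(self.advert)  # corresponds to message 1006 (C = 1)
            self.deadline = None


msg_1003 = Advert("109B", 0, 1, 2, 1, 100, ("109B",))
backup = BackupController(msg_1003, wait_period=3.0)
backup.on_master_silent(now=10.0)
backup.tick(now=12.0)  # still inside the waiting period, nothing sent
backup.tick(now=13.0)  # period elapsed, message with C = 1 is sent
```

Using `dataclasses.replace` reflects the statement that BGP message 1006 is substantially identical to BGP message 1003 except for the master controller flag.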
Referring now to fig. 10B, a TLV of BGP message 1003 generated by controller 109B is shown encoded in a format similar to controller NLRI field 510 of fig. 5B. After detecting the failure 1010 of the master controller of the controller cluster network 100, but before the controller 109B becomes the master controller of the controller cluster network 100, the controller 109B generates and transmits a BGP message 1003.
The TLV of BGP message 1003 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and a connected controller ID field 520B. In the TLV of BGP message 1003, flag 513 includes a C bit 514 set to 0 to indicate that controller 109B is not yet the master controller of controller cluster network 100. The location field 515 includes the location 306 of the controller 109B, indicating value 1, because the controller 109B should be the master controller of the controller cluster network 100. The number of controllers field 516 includes the number 312 of controllers 109B in cluster 106, indicating value 1, because controller 109A has failed. Old location field 517 includes old location 309 of controller 109B, indicating value 2. The priority field 519 includes a value indicating that the controller 109B has the second highest priority 318. Since controller 109A has failed, connected controller ID field 520B includes controller ID 315B of controller 109B.
Referring now to fig. 10C, a TLV of BGP message 1006 generated by controller 109B is shown encoded in a format similar to controller NLRI field 510. After detecting the failure 1010 of the master controller of the controller cluster network 100, and after becoming the master controller of the controller cluster network 100, the controller 109B generates and transmits a BGP message 1006.
The TLV of BGP message 1006 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and a connected controller ID field 520B. In the TLV of BGP message 1006, flag 513 includes a C bit 514 set to 1 to indicate that controller 109B is now the master controller of controller cluster network 100. The location field 515 includes the location 306 of the controller 109B, indicating value 1. The number of controllers field 516 includes the number 312 of controllers 109B in the controller cluster network 100, indicating a value of 1. Old location field 517 includes old location 309 of controller 109B, indicating value 2. The priority field 519 includes a value indicating that the controller 109B has the second highest priority 318. The connected controller ID field 520B includes the controller ID 315B of the controller 109B.
Fig. 11A-11C are illustrations of BGP messages 140 communicated through controller cluster network 200 prior to any failure of clusters 106 in controller cluster network 200, in accordance with various embodiments of the application. In particular, fig. 11A illustrates the transmission of BGP messages 140 communicated over the controller cluster network 200 of fig. 2. Fig. 11B-11C illustrate TLVs for encoding BGP messages 140 communicated over controller cluster network 200 of fig. 2.
Referring now to fig. 11A, a diagram illustrating the transmission of BGP messages 1100 and 1103 through the controller cluster network 200 of fig. 2 is shown, in accordance with various embodiments of the present application. In fig. 11A, BGP messages 1100 and 1103 are sent before any failure of cluster 106 in controller cluster network 200 occurs.
As shown in fig. 11A, the controller 109A generates BGP messages 1100 that may be encoded as new BGP messages 140A or existing BGP UPDATE messages 140B. BGP message 1100 includes controller NLRI 503 of fig. 5A. The controller NLRI 503 of BGP message 1100 includes a Master controller flag (C) 506 (shown as "C" in FIG. 11A), a location 306, an old location 309, a quantity 312, a priority 318, and controller IDs 315A-D. The master controller flag (C) 506 is set to 1 to indicate that the controller 109A that sent BGP message 1100 (also referred to herein as "originating controller 109A") is the master controller 109A of the controller cluster network 200. Location 306 is a value of 1 indicating the first location in the priority order of the reachable controllers 109A-D in cluster 106. Old location 309 also indicates value 1, for example, because controller 109A is master controller 109A since cluster 106 was initialized with controllers 109A-D. The number 312 of controllers 109A-D indicates that there are four controllers 109A-D in the cluster 106. Priority 318 indicates that controller 109A has the highest priority 318 in cluster 106. The controller IDs 315A-D include a controller ID 315A identifying the controller 109A, a controller ID 315B identifying the controller 109B, a controller ID 315C identifying the controller 109C, and a controller ID 315D identifying the controller 109D.
Controller 109A sends BGP message 1100 to NE 111 through enhanced BGP session 131. NE 111 forwards BGP message 1100 to all other controllers 109B-D in cluster 106. Upon receipt of BGP message 1100, each of controllers 109B-D determines that controller 109A is still reachable and available and updates state database 124 to include data from BGP message 1100.
Similarly, controller 109B generates BGP message 1103, which may be encoded as new BGP message 140A or as existing BGP UPDATE message 140B. BGP message 1103 includes a controller NLRI 503 that includes a master controller flag (C) 506 (shown as "C" in fig. 11A), a location 306, an old location 309, a quantity 312, a priority 318, and controller IDs 315A-D. The master controller flag (C) 506 is set to 0 to indicate that the controller 109B that sent BGP message 1103 is not the master controller of the controller cluster network 200. Location 306 is a value of 2 indicating the second location in the priority order of the reachable controllers 109A-D in cluster 106. A value of 2 for location 306 also indicates that controller 109B is a backup for primary controller 109A. Old location 309 also indicates value 2, for example, because controller 109B was secondary controller 109B since cluster 106 was initialized with controllers 109A-D. The number 312 of controllers 109A-D indicates that there are four controllers 109A-D in the cluster 106. Priority 318 indicates that controller 109B has the second highest priority 318 in cluster 106. The controller IDs 315A-D include a controller ID 315A identifying the controller 109A, a controller ID 315B identifying the controller 109B, a controller ID 315C identifying the controller 109C, and a controller ID 315D identifying the controller 109D.
Controller 109B sends BGP message 1103 to NE 111 through enhanced BGP session 133. NE 111 forwards BGP message 1103 to all other controllers 109A and 109C-D in cluster 106. Upon receipt of BGP message 1103, each of controllers 109A and 109C-D determines that controller 109B is still reachable and available and updates state database 124 to include data from BGP message 1103.
Controllers 109C and 109D similarly generate BGP messages, similar to BGP messages 1100 and 1103, and send them to NE 111. NE 111 forwards these BGP messages to the other controllers 109A-D in cluster 106. In this manner, each of controllers 109A-D sends BGP messages, such as BGP messages 1100 and 1103, through NE 111 to maintain information about the latest state of each of the other controllers 109A-D in controller cluster network 200.
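The forwarding role of NE 111 can be sketched as a simple fan-out over its enhanced BGP sessions. The session-to-controller mapping below follows the session numbers used in this description (131, 133, 203, and 206); the function name `forward` and the string used as a message placeholder are assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Enhanced BGP sessions terminating at NE 111 and the controller on each.
NE_111_SESSIONS: Dict[int, str] = {131: "109A", 133: "109B", 203: "109C", 206: "109D"}


def forward(sessions: Dict[int, str], ingress_session: int,
            message: str) -> List[Tuple[int, str, str]]:
    """Return (session, controller, message) for every session except the ingress."""
    return [(session, peer, message)
            for session, peer in sessions.items()
            if session != ingress_session]


# BGP message 1100 arrives from controller 109A on session 131.
deliveries_1100 = forward(NE_111_SESSIONS, 131, "BGP message 1100")

# BGP message 1103 arrives from controller 109B on session 133.
deliveries_1103 = forward(NE_111_SESSIONS, 133, "BGP message 1103")
```

Excluding the ingress session keeps the NE from echoing a message back to its originating controller, consistent with forwarding to "all other controllers" in cluster 106.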
Referring now to fig. 11B, a TLV of BGP message 1100 generated by controller 109A is shown encoded in a format similar to controller NLRI field 510 of fig. 5B. The TLV of BGP message 1100 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and connected controller ID fields 520A-D. In the TLV of BGP message 1100, flag 513 includes a C bit 514 set to 1 to indicate that controller 109A is master controller 109A. The location field 515 includes the location 306 of the controller 109A, indicating value 1. The number of controllers field 516 includes the number 312 of controllers 109A-D, indicating value 4. Old location field 517 includes old location 309 of controller 109A, indicating value 1. The priority field 519 includes a value indicating that the controller 109A has the highest priority 318. The connected controller ID fields 520A-D include the controller IDs 315A-D of the controllers 109A-D, respectively.
Referring now to fig. 11C, a TLV of BGP message 1103 generated by controller 109B is shown encoded in a format similar to controller NLRI field 510. The TLV of BGP message 1103 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and connected controller ID fields 520A-D. In the TLV of BGP message 1103, flag 513 includes a C bit 514 set to 0 to indicate that controller 109B is not the master. The location field 515 includes the location 306 of the controller 109B, indicating value 2. The number of controllers field 516 includes the number 312 of controllers 109A-D, indicating value 4. Old location field 517 includes old location 309 of controller 109B, indicating value 2. The priority field 519 includes a value indicating that the controller 109B has the second highest priority 318. The connected controller ID fields 520A-D include the controller IDs 315A-D of the controllers 109A-D, respectively.
Fig. 12A-12E are illustrations of BGP messages 140 communicated through controller cluster network 200 after each of links 121A, 121C, and 121D in cluster 106 of controller cluster network 200 fails, in accordance with various embodiments of the application. In particular, fig. 12A illustrates the transmission of BGP messages 140 communicated over controller cluster network 200 after a failure of cluster 106. Fig. 12B-12C illustrate TLVs for encoding BGP messages 140 communicated over controller cluster network 200 after a failure of cluster 106. Fig. 12D illustrates the transmission of BGP messages 140 communicated through controller cluster network 200 after selection of the master controller of controller cluster network 200. Fig. 12E shows a TLV for encoding another BGP message 140 that is communicated over the controller cluster network 200 after selecting the master controller of the controller cluster network 200.
Referring now to fig. 12A, a diagram illustrating the transmission of BGP messages 1200 and 1203 through the controller cluster network 200 of fig. 2 is shown, in accordance with various embodiments of the present application. In fig. 12A, BGP messages 1200 and 1203 are sent after controllers 109A and 109B detect failure 215 occurring at links 121A, 121C, and 121D. After failure 215, controller 109A and controller 109C are interconnected by link 121B, and controller 109B and controller 109D are interconnected by link 121E. That is, the controllers 109A and 109C are no longer connected to the controller 109B or the controller 109D, and therefore, the controllers 109A and 109C cannot detect the presence of the controller 109B and the controller 109D. Similarly, the controllers 109B and 109D are no longer connected to the controller 109A or the controller 109C, and therefore, the controllers 109B and 109D cannot detect the presence of the controllers 109A and 109C. In this way, the remaining interconnected controllers 109A and 109C and controllers 109B and 109D form two separate controller groups 210A and 210B, respectively. The controller group 210A includes a controller 109A and a controller 109C interconnected by a link 121B. The controller group 210B includes a controller 109B and a controller 109D interconnected by a link 121E.
In this case, two controller groups 210A-B determine the master controller within each controller group 210A-B because each controller group 210A-B is unaware of the presence of the other controller group 210A-B. The master controllers in each controller group 210A-B are determined based on the priority 318 of each controller 109A-D in the controller group 210A-B. The controller 109A-D with the highest priority 318 becomes the master controller for the controller group 210A-B.
For example, in the controller group 210A, the controller 109A has a higher priority than the controller 109C. Thus, the controllers 109A and 109C determine that the controller 109A is the master controller of the group 210A. Similarly, in the controller group 210B, the controller 109B has a higher priority than the controller 109D. Thus, the controllers 109B and 109D determine that the controller 109B is the master controller of the group 210B.
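The formation of controller groups 210A-B and the selection of a master within each group can be sketched as a connected-components search over the surviving links followed by a maximum-priority pick. The function names and the numeric priority values are assumptions for illustration; the description only fixes the relative order of controllers 109A over 109C and 109B over 109D.

```python
from typing import Dict, Iterable, List, Set, Tuple


def controller_groups(controllers: Iterable[str],
                      links: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Partition controllers into groups connected by the surviving links."""
    nodes = list(controllers)
    adjacency: Dict[str, Set[str]] = {c: set() for c in nodes}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    groups: List[Set[str]] = []
    seen: Set[str] = set()
    for start in nodes:
        if start in seen:
            continue
        group: Set[str] = set()
        stack = [start]
        while stack:  # iterative depth-first search
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups


def group_master(group: Set[str], priorities: Dict[str, int]) -> str:
    """Pick the controller with the highest priority (assumed: larger = higher)."""
    return max(group, key=lambda c: priorities[c])


PRIORITIES = {"109A": 400, "109B": 300, "109C": 200, "109D": 100}  # assumed values
SURVIVING_LINKS = [("109A", "109C"),   # link 121B
                   ("109B", "109D")]   # link 121E

groups = controller_groups(PRIORITIES, SURVIVING_LINKS)
masters = [group_master(g, PRIORITIES) for g in groups]
```

With links 121A, 121C, and 121D removed, the search yields the two groups described for fig. 12A, with controllers 109A and 109B as their respective masters.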
In one embodiment, master controller 109A of group 210A is the only controller in group 210A that generates and transmits BGP message 1200, which describes controllers 109A and 109C in group 210A. Similarly, the master controller 109B of group 210B is the only controller in group 210B that generates and transmits BGP message 1203, which describes the controllers 109B and 109D in group 210B. In another embodiment, all of the controllers 109A-D in each group 210A-B send BGP messages describing the respective controllers 109A-D, groups 210A-B, and/or cluster 106. In the example shown in fig. 12A, only the master controllers 109A and 109B of each group 210A and 210B generate and transmit BGP messages 1200 and 1203, respectively.
The controller 109A generates a BGP message 1200 after detecting the failure 215 and determining that the controller 109A is the master of the group 210A. BGP message 1200 may be encoded as a new BGP message 140A or an existing BGP UPDATE message 140B.
The contents of BGP message 1200 are similar to the contents of BGP message 1100 sent prior to failure 215 in controller cluster network 200, and the fields of BGP message 1200 are similar to the fields of BGP message 1100 sent prior to failure 215 in controller cluster network 200. However, in BGP message 1200, master controller flag (C) 506 is reset to 0 because a new master controller for the entire cluster 106 needs to be determined from controllers 109A-D in different groups 210A-B of cluster 106. Further, in BGP message 1200, the number 312 of controllers 109A and 109C in cluster 106 indicates that there are now two controllers 109A and 109C in cluster 106 because controller 109A is no longer able to detect the presence of controllers 109B and 109D. Similarly, controller IDs 315A-D only indicate controller ID 315A identifying controller 109A and controller ID 315C of controller 109C.
The controller 109B generates a BGP message 1203 after detecting the failure 215 and determining that the controller 109B is the master of the group 210B. BGP message 1203 may be encoded as a new BGP message 140A or an existing BGP UPDATE message 140B.
The contents of BGP message 1203 are similar to the contents of BGP message 1103 sent prior to failure 215 in controller cluster network 200, and the fields of BGP message 1203 are similar to the fields of BGP message 1103 sent prior to failure 215 in controller cluster network 200. However, in BGP message 1203, the number 312 of controllers 109B and 109D in cluster 106 indicates that there are now two controllers 109B and 109D in cluster 106 because controller 109B is no longer able to detect the presence of controllers 109A and 109C. Similarly, controller IDs 315A-D indicate only controller ID 315B identifying controller 109B and controller ID 315D of controller 109D.
After generating BGP message 1200, controller 109A sends BGP message 1200 to NE 111 through enhanced BGP session 131. NE 111 forwards BGP message 1200 describing group 210A to group 210B. NE 111 may forward BGP message 1200 only to master controller 109B of group 210B over enhanced BGP session 133. Alternatively, NE 111 may forward BGP message 1200 to all controllers 109B and 109D in group 210B over enhanced BGP sessions 206 and 133.
Similarly, controller 109B sends BGP message 1203 to NE 111 through enhanced BGP session 133. NE 111 forwards BGP message 1203 describing group 210B to group 210A. NE 111 may forward BGP message 1203 only to master controller 109A of group 210A over enhanced BGP session 131. Alternatively, NE 111 may forward BGP message 1203 to all controllers 109A and 109C in group 210A over enhanced BGP sessions 131 and 203.
In one embodiment, the controller 109B waits a predetermined period of time after sending BGP message 1203 to determine whether a message was received from the original master controller 109A. In fig. 12A, the controller 109B, upon receiving BGP message 1200, determines that the controller 109A is still reachable and available and determines that the controller 109A is still the master controller 109A of the entire controller cluster network 200. In this way, the controller 109B does not erroneously promote itself to be the master of all groups 210A-B and the entire controller cluster network 200.
In embodiments in which no message has been received from the original master controller 109A within the predetermined period of time, the controllers 109B-D determine a new master controller for all groups 210A-B and the overall controller cluster network 200 based on information in BGP messages 1200 and 1203. In one embodiment, the controllers 109A-D select or promote one of the controllers 109A-D as the master controller of all of the groups 210A-B and the entire controller cluster network 200 based on the number 312 of controllers 109A-D in each of the groups 210A-B. For example, when group 210A has three controllers and group 210B has only two controllers, controllers 109A-D determine that group 210A is the master group of cluster 106. The controllers 109A-D also determine that the master controller 109A of the master group 210A is the new master controller of all groups 210A-B and the entire controller cluster network 200.
In embodiments where the groups 210A-B have the same number 312 of controllers 109A-D, the master controller of all of the groups 210A-B and the overall controller cluster network 200 may be selected based on the highest old location 309 among the master controllers of the groups 210A-B. In the example shown in fig. 12A, the old location 309 of the controller 109A is 1, and the old location 309 of the controller 109B is 2. In this case, an old location 309 of 1 is higher than an old location 309 of 2, and thus, controllers 109A-D determine that group 210A is the master group of cluster 106. The controllers 109A-D also determine that the master controller 109A of the master group 210A is the new master controller of all groups 210A-B and the entire controller cluster network 200.
In another embodiment where the groups 210A-B have the same number 312 of controllers 109A-D, the master controller of all of the groups 210A-B and the overall controller cluster network 200 may be selected based on the highest priority 318 among the master controllers of the groups 210A-B. In the example shown in fig. 12A, controller 109A has the highest priority 318 of controllers 109A-D in cluster 106, and controller 109B has the second highest priority 318 of controllers 109A-D in cluster 106. In this case, the controllers 109A-D determine that group 210A is the master group of cluster 106. The controllers 109A-D also determine that the master controller 109A of the master group 210A is the new master controller of all groups 210A-B and the entire controller cluster network 200.
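The selection rules of the preceding embodiments can be sketched as a single comparison. This is an illustrative sketch only: the `GroupAdvert` type, the `elect_master` function, and the representation of priority 318 as a rank (1 being highest) are assumptions made for the example, and combining the group size with one tie-break in a single key is one possible reading of the embodiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupAdvert:
    """Fields advertised by a group master (a subset of the controller NLRI)."""
    controller: str        # for example "109A"
    num_controllers: int   # number 312 of controllers in the group
    old_location: int      # old location 309 (1 is the highest)
    priority_rank: int     # rank of priority 318, 1 is highest (assumed encoding)


def elect_master(adverts, tie_break="old_location"):
    """Largest group wins; equal sizes fall back to old location or priority."""
    def key(advert):
        if tie_break == "old_location":
            tie = advert.old_location
        else:
            tie = advert.priority_rank
        # Negate the size so that min() prefers the larger group.
        return (-advert.num_controllers, tie)
    return min(adverts, key=key).controller


# Fig. 12A: both groups have two controllers, so the tie-break decides.
fig_12a = [GroupAdvert("109A", 2, 1, 1), GroupAdvert("109B", 2, 2, 2)]
new_master = elect_master(fig_12a)
```

With the values of fig. 12A, both tie-break variants select controller 109A, which agrees with the outcome stated in the description.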
Referring now to fig. 12B, a TLV of BGP message 1200 generated by controller 109A is shown encoded in a format similar to controller NLRI field 510 of fig. 5B. The controller 109A generates and transmits BGP messages 1200 after detecting a failure 215 occurring in the controller cluster network 200.
The TLV of BGP message 1200 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and connected controller ID fields 520A and 520C. In the TLV of BGP message 1200, flag 513 includes a C bit 514 set to 0 to indicate that controller 109A has not been determined to be the master controller for the entire controller cluster network 200. The location field 515 includes the location 306 of the controller 109A, indicating value 1. The number of controllers field 516 includes the number 312 of controllers 109A and 109C in group 210A, indicating value 2, because controller 109A is no longer connected to controllers 109B and 109D. Old location field 517 includes old location 309 of controller 109A, indicating value 1. The priority field 519 includes a value indicating that the controller 109A has the highest priority 318 of all controllers 109A-D in the cluster 106. The connected controller ID fields 520A and 520C include the controller ID 315A of the controller 109A and the controller ID 315C of the controller 109C.
Referring now to fig. 12C, a TLV of BGP message 1203 generated by controller 109B is shown encoded in a format similar to controller NLRI field 510. The controller 109B generates and transmits BGP message 1203 upon detecting the failure 215 occurring in the controller cluster network 200.
The TLV of BGP message 1203 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and connected controller ID fields 520B and 520D. In the TLV of BGP message 1203, flag 513 includes a C bit 514 set to 0 to indicate that controller 109B is not the master controller of overall controller cluster network 200. The location field 515 includes the location 306 of the controller 109B, indicating value 1. The number of controllers field 516 includes the number 312 of controllers 109B and 109D in group 210B, indicating value 2, because controller 109B is no longer connected to controllers 109A and 109C. Old location field 517 includes old location 309 of controller 109B, indicating value 2. The priority field 519 includes a value that indicates that the controller 109B has the second highest priority 318 of all controllers 109A-D in the cluster 106. The connected controller ID fields 520B and 520D include the controller ID 315B of the controller 109B and the controller ID 315D of the controller 109D.
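The field layout just described can be illustrated with a small encoder and decoder. The field widths, the bit position of the C bit, the scope of the length field, the type value 1, the priority value 200, and the 32-bit controller IDs are all assumptions chosen for this sketch; the widths actually used are those shown in fig. 5B, which this passage does not restate.

```python
import struct

C_BIT = 0x80          # assumed position of C bit 514 within flag 513
# Assumed widths: type, length, flags, location, count, old location, reserved, priority.
HEADER = "!HHBBBBBB"
HEADER_LEN = struct.calcsize(HEADER)  # 10 bytes under the assumed widths


def encode_controller_tlv(tlv_type, is_master, location, old_location,
                          priority, controller_ids):
    """Pack the controller fields and the connected controller IDs into bytes."""
    flags = C_BIT if is_master else 0
    # Assumption: the length covers everything after the 4-byte type/length pair.
    length = (HEADER_LEN - 4) + 4 * len(controller_ids)
    head = struct.pack(HEADER, tlv_type, length, flags, location,
                       len(controller_ids), old_location, 0, priority)
    return head + b"".join(struct.pack("!I", cid) for cid in controller_ids)


def decode_controller_tlv(data):
    """Inverse of encode_controller_tlv; returns the decoded fields as a dict."""
    (tlv_type, length, flags, location, count,
     old_location, _reserved, priority) = struct.unpack(HEADER, data[:HEADER_LEN])
    ids = struct.unpack("!%dI" % count, data[HEADER_LEN:HEADER_LEN + 4 * count])
    return {"type": tlv_type, "length": length, "is_master": bool(flags & C_BIT),
            "location": location, "count": count, "old_location": old_location,
            "priority": priority, "controller_ids": list(ids)}


# TLV of BGP message 1203: C = 0, location 1, two controllers, old location 2.
# The two IDs stand in for controller IDs 315B and 315D (hypothetical values).
tlv_1203 = encode_controller_tlv(1, False, 1, 2, 200, [0x0A000002, 0x0A000004])
```

Decoding `tlv_1203` returns the same field values, which gives a quick round-trip check of the assumed layout.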
Referring now to fig. 12D, a diagram is shown of a BGP message 1206 being sent through the controller cluster network 200 of fig. 2 after the controller 109A is selected as the master controller of the controller cluster network 200. In fig. 12D, controllers 109A-D have exchanged BGP messages 1200 and 1203 and have determined, using BGP messages 1200 and 1203, that controller 109A is the new master controller of all groups 210A-B and the entire controller cluster network 200.
Master controller 109A generates BGP message 1206, which indicates that controller 109A is the master controller of all groups 210A-B and the entire controller cluster network 200. BGP message 1206 is substantially similar to BGP message 1200 of fig. 12A, except that master controller flag (C) 506 is set to 1, indicating that controller 109A is now the master controller of all groups 210A-B and the entire controller cluster network 200.
Master controller 109A sends BGP message 1206 to NE 111 through enhanced BGP session 131. In one embodiment, NE 111 forwards BGP message 1206 to all other controllers 109B-D in cluster 106. In another embodiment, NE 111 forwards BGP message 1206 only to master controller 109B of another group 210B of cluster 106. In either case, when BGP message 1206 is received from NE 111, all controllers 109A-D maintain data indicating that controller 109A is the master controller for all groups 210A-B and the entire controller cluster network 200, and that controller 109A is active and reachable.
Referring now to fig. 12E, a TLV of BGP message 1206 generated by controller 109A is shown encoded in a format similar to controller NLRI field 510 of fig. 5B. The controller 109A generates and transmits BGP messages 1206 after detecting the failure 215 occurring in the controller cluster network 200 and determining the master controller.
The TLV of BGP message 1206 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and connected controller ID fields 520A and 520C. In BGP message 1206, flag 513 includes a C bit 514 set to 1 to indicate that controller 109A has been selected or promoted to be the master controller for the entire controller cluster network 200. The location field 515 includes the location 306 of the controller 109A, indicating value 1. The number of controllers field 516 includes the number 312 of controllers 109A and 109C in group 210A, indicating value 2. Old location field 517 includes old location 309 of controller 109A, indicating value 1. The priority field 519 includes a value indicating that the controller 109A has the highest priority 318 of all controllers 109A-D in the cluster 106. The connected controller ID fields 520A and 520C still include only the controller ID 315A of the controller 109A and the controller ID 315C of the controller 109C.
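Because the TLV of BGP message 1206 differs from the TLV of BGP message 1200 only in the C bit, the promotion step can be sketched as a copy with one field changed. The `ControllerNlri` type and the `promote` function are hypothetical names used only for this illustration, and the controller IDs are represented by their reference labels.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ControllerNlri:
    """Illustrative in-memory form of the advertised controller fields."""
    c_flag: int             # master controller flag (C) 506
    location: int           # location 306
    num_controllers: int    # number 312
    old_location: int       # old location 309
    controller_ids: tuple   # connected controller IDs


def promote(nlri):
    """Return a copy of the advertisement with only the master flag set to 1."""
    return replace(nlri, c_flag=1)


# Controller 109A's advertisement after failure 215 (BGP message 1200).
msg_1200 = ControllerNlri(0, 1, 2, 1, ("315A", "315C"))
# Advertisement sent once 109A has been selected as master (BGP message 1206).
msg_1206 = promote(msg_1200)
```

Using an immutable record here keeps the earlier advertisement unchanged, which makes it easy to compare the message sent before and after the selection.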
Fig. 13A-13E are illustrations of BGP messages 140 communicated over the controller cluster network 200 after a failure of a master controller 109A in the cluster 106 of the controller cluster network 200, in accordance with various embodiments of the application. In particular, fig. 13A illustrates the transmission of BGP messages 140 communicated over controller cluster network 200 after a failure of master controller 109A. Fig. 13B-13C illustrate TLVs for encoding BGP messages 140 communicated over controller cluster network 200 after a failure of master controller 109A. Fig. 13D illustrates the transmission of BGP messages 140 communicated through controller cluster network 200 after selecting the master controller of controller cluster network 200. Fig. 13E shows a TLV for encoding another BGP message 140 that is communicated over the controller cluster network 200 after selecting the master controller of the controller cluster network 200.
Referring now to fig. 13A, a diagram illustrating the transmission of BGP messages 1300 and 1303 through the controller cluster network 200 of fig. 2 is shown, in accordance with various embodiments of the present application. In fig. 13A, BGP messages 1300 and 1303 are sent after controllers 109B, 109C, and 109D detect failure 215 occurring at links 121A, 121C, and 121D and failure 1310 occurring at controller 109A. As described above with reference to fig. 12A, a failure 215 occurring at links 121A, 121C, and 121D results in the generation of two groups 210A-B of controllers 109A-D. The first group 210A includes the controller 109C. The second group 210B includes the controllers 109B and 109D. However, failure 1310 results in controller 109A being no longer reachable or available to the rest of controller cluster network 200.
Upon detection of failures 215 and 1310, controllers 109B and 109D determine master controller 109B of group 210B in a manner similar to that described above with reference to fig. 12A. Since the controller 109A is no longer available, the controller 109C becomes the master controller of the group 210A. In one embodiment, controllers 109C and 109B generate and send BGP messages 1300 and 1303 describing groups 210A-B and cluster 106. In another embodiment, all controllers 109B-D generate and send BGP messages describing groups 210A-B and cluster 106. In the example shown in fig. 13A, only controllers 109C and 109B generate and send BGP messages 1300 and 1303 describing groups 210A-B and cluster 106.
The controller 109C generates BGP message 1300, which may be encoded as either a new BGP message 140A or an existing BGP UPDATE message 140B. BGP message 1300 includes controller NLRI 503 of fig. 5A. The controller NLRI 503 of BGP message 1300 includes a master controller flag (C) 506 (shown as "C" in fig. 13A), a location 306, an old location 309, a number 312, a priority 318, and a controller ID 315C. The master controller flag (C) 506 is set to 0, indicating that the controller 109C sending BGP message 1300 (also referred to herein as "originating controller 109C") is not the master controller of the controller cluster network 200. Location 306 is a value of 1, indicating the first location in the priority order of the reachable controllers 109A-D in group 210A. That is, since only controller 109C is present in group 210A, controller 109C rises to the first location in the priority order of the reachable controllers 109A-D in group 210A. Old location 309 indicates value 3, for example, because controller 109C had the third location in the priority order of controllers 109A-D before failures 215 and 1310 occurred in controller cluster network 200. The number 312 of controllers indicates that there is now only one controller in the group 210A. Priority 318 indicates that controller 109C has the third highest priority 318 in cluster 106. The controller ID 315C identifies the controller 109C.
Controller 109C sends BGP message 1300 to NE 111 through enhanced BGP session 203. NE 111 forwards BGP message 1300 to master controller 109B in the other group 210B or to all other controllers 109B and 109D in cluster 106. Upon receipt of BGP message 1300, controllers 109B and 109D each determine that controller 109C is still reachable and available and update state database 124 to include data from BGP message 1300.
Similarly, controller 109B generates BGP message 1303, which may be encoded as a new BGP message 140A or as an existing BGP UPDATE message 140B. BGP message 1303 includes controller NLRI 503 including a master controller flag (C) 506 (shown as "C" in fig. 13A), location 306, old location 309, number 312, priority 318, and controller IDs 315A-D. The master controller flag (C) 506 is set to 0 to indicate that the controller 109B that sent BGP message 1303 is not the master controller of the controller cluster network 200. Location 306 is a value of 1, indicating the first location in the priority order of the reachable controllers in group 210B of cluster 106. Old location 309 indicates value 2, for example, because controller 109B has been the secondary controller since cluster 106 was initialized with controllers 109A-D. The number 312 of controllers indicates that there are two controllers 109B and 109D in group 210B. Priority 318 indicates that controller 109B has the second highest priority 318 in cluster 106. The controller IDs 315A-D include a controller ID 315B that identifies the controller 109B and a controller ID 315D that identifies the controller 109D.
Controller 109B sends BGP message 1303 to NE 111 through enhanced BGP session 133. NE 111 forwards BGP message 1303 to controller 109C in the other group 210A, or to all other controllers 109C-D in cluster 106. Upon receipt of BGP message 1303, controllers 109C-D each determine that controller 109B is still reachable and available and update state database 124 to include data from BGP message 1303.
Upon receipt of BGP messages 1300 and 1303, controllers 109B-D determine that original master controller 109A is no longer reachable or available. Accordingly, the controllers 109B-D determine that a new master controller needs to be determined from the remaining controllers 109B-D based on the information carried in the BGP messages 1300 and 1303. In one embodiment, the controllers 109B-D determine a new master controller based on the number 312 of controllers 109B-D in each group 210A-B. In the example shown in fig. 13A, group 210B has two active controllers 109B and 109D, whereas group 210A has only one active controller 109C. In this case, the controllers 109B-D determine that group 210B is the master group of cluster 106. The controllers 109B-D also determine, based on the information in BGP messages 1300 and 1303, that the master controller 109B of the master group 210B is the new master controller of all groups 210A-B and the entire controller cluster network 200. As described above, the master controller of all groups 210A-B and the overall controller cluster network 200 may alternatively be determined based on the old locations 309 or priorities 318 carried in BGP messages 1300 and 1303.
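The reasoning applied to fig. 13A can be sketched in two steps: first detect which previously known controller is absent from every received advertisement, then pick the master of the larger group. The function names and the dictionary representation below are hypothetical, and the sketch deliberately leaves out the tie-breaks on old location 309 or priority 318.

```python
def missing_controllers(known, adverts):
    """Controllers known before the failures that no advertisement lists."""
    seen = set()
    for members in adverts.values():
        seen.update(members)
    return sorted(set(known) - seen)


def largest_group_master(adverts):
    """Group master whose advertisement lists the most connected controllers."""
    # Ties are not resolved here; the description uses old location or priority.
    return max(adverts, key=lambda master: len(adverts[master]))


# Connected controllers listed in BGP messages 1300 and 1303, keyed by sender.
fig_13a = {"109C": ["109C"], "109B": ["109B", "109D"]}
unreachable = missing_controllers(["109A", "109B", "109C", "109D"], fig_13a)
new_master = largest_group_master(fig_13a)
```

With the contents of BGP messages 1300 and 1303, controller 109A is reported as unreachable and controller 109B is selected, in line with the example of fig. 13A.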
Referring now to fig. 13B, a TLV of BGP message 1300 generated by controller 109C is shown encoded in a format similar to controller NLRI field 510 of fig. 5B. The controller 109C generates and transmits BGP messages 1300 after detecting faults 215 and 1310 occurring in the controller cluster network 200.
The TLV of BGP message 1300 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and a connected controller ID field 520C. In the TLV of BGP message 1300, flag 513 includes a C bit 514 set to 0 to indicate that controller 109C is not the master controller of the overall controller cluster network 200. The location field 515 includes the location 306 of the controller 109C, indicating value 1. The number of controllers field 516 includes the number 312 of controllers in the group 210A, indicating value 1. Old location field 517 includes old location 309 of controller 109C, indicating value 3. The priority field 519 includes a value indicating that the controller 109C has the third highest priority 318 of all controllers 109A-D in the cluster 106. The connected controller ID field 520C includes only the controller ID 315C of the controller 109C.
Referring now to fig. 13C, a TLV of BGP message 1303 generated by controller 109B is shown encoded in a format similar to controller NLRI field 510. The controller 109B generates and transmits BGP messages 1303 after detecting faults 215 and 1310 occurring in the controller cluster network 200.
The TLV of BGP message 1303 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and connected controller ID fields 520B and 520D. In the TLV of BGP message 1303, flag 513 includes a C bit 514 set to 0 to indicate that controller 109B is not the master controller of overall controller cluster network 200. The location field 515 includes the location 306 of the controller 109B, indicating value 1. The number of controllers field 516 includes the number 312 of controllers 109B and 109D in group 210B, indicating value 2. Old location field 517 includes old location 309 of controller 109B, indicating value 2. The priority field 519 includes a value that indicates that the controller 109B has the second highest priority 318 of all controllers 109A-D in the cluster 106. The connected controller ID fields 520B and 520D include the controller ID 315B of the controller 109B and the controller ID 315D of the controller 109D.
Referring now to fig. 13D, a diagram of a BGP message 1306 transmitted through the controller cluster network 200 of fig. 2 is shown. In FIG. 13D, the controllers 109B-D have determined that the master controller 109B is the new master controller for all groups 210A-B and the entire controller cluster network 200.
Master controller 109B generates BGP message 1306 that indicates that controller 109B is the master controller for all groups 210A-B and entire controller cluster network 200. BGP message 1306 is substantially similar to BGP message 1303 of fig. 13A, except that master controller flag (C) 506 is set to 1, indicating that controller 109B is now the master controller of all groups 210A-B and entire controller cluster network 200.
Master controller 109B sends BGP message 1306 to NE 111 over enhanced BGP session 133. In one embodiment, NE 111 forwards BGP message 1306 to all other controllers 109C-D in cluster 106. In another embodiment, NE 111 forwards BGP message 1306 only to master controller 109C of another group 210A of cluster 106. In either case, when BGP message 1306 is received from NE 111, all controllers 109B-D maintain data indicating that controller 109B is the master controller for all groups 210A-B and the entire controller cluster network 200, and that controller 109B is active and reachable.
Referring now to fig. 13E, a TLV of BGP message 1306 generated by controller 109B is shown encoded in a format similar to controller NLRI field 510 of fig. 5B. The controller 109B generates and transmits BGP messages 1306 after detecting faults 215 and 1310 occurring in the controller cluster network 200.
The TLV of BGP message 1306 includes a type field 511, a length field 512, a flag 513, a location field 515, a number of controllers field 516, an old location field 517, a reserved bit 518, a priority field 519, and connected controller ID fields 520B and 520D. In the TLV of BGP message 1306, flag 513 includes a C bit 514 set to 1 to indicate that controller 109B has been selected or promoted to be the master controller for the entire controller cluster network 200. The location field 515 includes the location 306 of the controller 109B, indicating value 1. The number of controllers field 516 includes the number 312 of controllers 109B and 109D in group 210B, indicating value 2. Old location field 517 includes old location 309 of controller 109B, indicating value 2. The priority field 519 includes a value that indicates that the controller 109B has the second highest priority 318 of all controllers 109A-D in the cluster 106. The connected controller ID fields 520B and 520D still include only the controller ID 315B of the controller 109B and the controller ID 315D of the controller 109D.
Fig. 14 is a flowchart of a method 1400 performed by the first controller 109A-D for implementing BGP for a network HA in accordance with various embodiments of the application. The method 1400 is implemented by a first controller 109A-D (hereinafter "first controller") in the controller cluster network 100 or 200 (referred to herein as a "network"). The first controller implements the method 1400 after connecting to one or more NEs 110-116.
In step 1403, the first controller establishes a BGP session with NEs 110-111 (hereinafter "NEs") in the network. The first controller is included in a cluster 106 that includes at least two controllers. The BGP session may be an enhanced BGP session in which extended BGP messages may be communicated. In one embodiment, messages encoded according to fig. 4A-4D through fig. 7A-7C may be communicated over a BGP session.
In step 1406, the first controller sends a first BGP message 140, 800, 803, 900, 903, 1000, 1003, 1006, 1100, 1103, 1200, 1203, 1206, 1300, 1303, or 1306 (hereinafter referred to as "BGP message") to the NE. The first BGP message includes a first controller NLRI 503 indicating the state of the first controller. The first controller NLRI 503 carries a controller ID 315 for each controller in the cluster 106. The first controller NLRI 503 also carries the first controller's location 306 relative to other controllers in the cluster 106 based on the priority order.
In step 1409, the first controller receives a second BGP message from the NE. The second BGP message includes a second controller NLRI 503 indicating the state of the second controller in cluster 106. The second controller NLRI 503 carries a controller ID 315 for each controller in the cluster 106. The second controller NLRI 503 also carries the location 306 of the second controller relative to other controllers in the cluster 106 based on the priority order.
In step 1412, the controller determines a master controller from the controller cluster 106 using the first controller NLRI 503 and the second controller NLRI 503. For example, the controller determines the master controller based on the location 306 of the controller carried in the first controller NLRI 503 and the location 306 of the second controller carried in the second controller NLRI 503. The master controller is responsible for controlling the network.
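Step 1412 can be sketched as a comparison of the two locations. The sketch assumes that the controller whose location 306 has the lowest value (1 being the first location in the priority order) is chosen as master; the function name and the dictionary keys are hypothetical.

```python
def determine_master(first_nlri, second_nlri):
    """Step 1412 sketch: choose the controller that is first in the priority order."""
    # Assumption: a lower location value means an earlier place in the priority order.
    best = min((first_nlri, second_nlri), key=lambda nlri: nlri["location"])
    return best["controller"]


# NLRI sent by the first controller (step 1406) and received from the NE (step 1409).
first = {"controller": "109A", "location": 1, "controller_ids": ["315A", "315B"]}
second = {"controller": "109B", "location": 2, "controller_ids": ["315A", "315B"]}
master = determine_master(first, second)
```

Here the first controller, at location 1, is selected, so it remains responsible for controlling the network.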
Fig. 15 is a flowchart of a method 1500 performed by NEs 110-111 for implementing BGP for a network HA in accordance with various embodiments of the present application. Method 1500 is performed by one of NEs 110-111 (hereinafter "NEs") after connection to one or more controllers in cluster 106 of the network.
In step 1503, the NE establishes a first BGP session with a primary controller of the network. The BGP session may be an enhanced BGP session in which extended BGP messages may be communicated. In one embodiment, messages encoded according to fig. 4A-4D through fig. 7A-7C may be communicated over a BGP session.
In step 1506, the NE establishes a second BGP session with the secondary controller of the network. For example, the primary controller is the controller 109A of the cluster 106 and the secondary controller is the controller 109B of the cluster 106. Cluster 106 includes at least two controllers. The master controller is responsible for controlling the network.
In step 1509, the NE receives a BGP message from the primary controller. The BGP message includes a controller NLRI 503 indicating that the BGP message is sent by the primary controller. The controller NLRI 503 also carries the location 306 of the primary controller relative to other controllers in the cluster 106, as well as the controller IDs 315A-N of each controller in the cluster 106. In step 1512, the NE forwards the BGP message to the secondary controller in cluster 106.
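The NE side of method 1500 can be sketched with an in-memory stand-in for the BGP sessions. The `NetworkElement` class, its method names, and the use of a list as each session's inbox are hypothetical; no real BGP library is involved.

```python
class NetworkElement:
    """Toy NE that relays controller messages between its sessions."""

    def __init__(self):
        # controller name -> list standing in for that session's outbound queue
        self.sessions = {}

    def establish_session(self, controller):
        """Steps 1503 and 1506: open a session with a controller."""
        self.sessions[controller] = []

    def receive(self, sender, message):
        """Steps 1509 and 1512: accept a message and forward it to the others."""
        for controller, inbox in self.sessions.items():
            if controller != sender:
                inbox.append(message)


ne = NetworkElement()
ne.establish_session("109A")   # first BGP session, with the primary controller
ne.establish_session("109B")   # second BGP session, with the secondary controller
advert = {"c_flag": 1, "location": 1, "controller_ids": ["315A", "315B"]}
ne.receive("109A", advert)
```

After the call, the advertisement sits in the queue for controller 109B only, which mirrors the forwarding of the primary controller's message to the secondary controller.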
Fig. 16 is an illustration of an apparatus 1600 implemented as a controller for implementing BGP for a network HA in accordance with various embodiments of the application. The apparatus 1600 includes a setup module 1603, a transmit module 1606, a receive module 1609, and a determination module 1612. The setup module 1603 includes a module for establishing a BGP session with a NE in a network, where the cluster 106 includes a first controller and a second controller. The sending module 1606 includes a module for sending a first BGP message to the NE, the first BGP message including a first controller NLRI that indicates a first controller state. The receiving module 1609 includes a module for receiving a second BGP message from the NE, the second BGP message including a second controller NLRI indicating a second controller state. The determining module 1612 includes a module for determining a master controller based on the first controller NLRI and the second controller NLRI, wherein the master controller is responsible for controlling the network.
Fig. 17 is an illustration of an apparatus 1700 implemented as a NE for implementing BGP for a network HA in accordance with various embodiments of the application. The apparatus 1700 includes a setup module 1703, a receive module 1706, and a forward module 1709. The setup module 1703 includes a module for establishing a first BGP session with a primary controller of the network and a second BGP session with a secondary controller of the network, where the cluster 106 includes the primary controller and the secondary controller, and the primary controller is responsible for controlling the network. The receive module 1706 includes a module for receiving a BGP message from the primary controller, the BGP message indicating that the BGP message was sent by the primary controller and including a location 306 of the primary controller relative to other controllers in the cluster 106 and a controller ID 315 of each controller in the cluster 106. The forward module 1709 includes a module for forwarding the BGP message to the secondary controller.

Claims (23)

1. A method implemented by a first controller in a network, the network comprising a cluster of controllers, the cluster of controllers comprising the first controller and a second controller, the method comprising:
establishing an extended border gateway protocol, BGP, session with a network element, NE, in the network to create a control channel and an information channel for use between the network element, NE, and the first controller, each controller and one or more NEs exchanging BGP messages with each other after establishing the extended BGP session;
sending a first BGP message to the NE, wherein the first BGP message includes a first controller network layer reachability information (NLRI), the first controller NLRI carries an identification (ID) of each controller in the controller cluster, and the first controller NLRI further carries a location of the first controller relative to other controllers in the controller cluster based on a priority order;
receiving a second BGP message from the NE comprising a second controller NLRI carrying the ID of each controller in the cluster of controllers, the second controller NLRI carrying a location of the second controller relative to the other controllers in the cluster of controllers based on the priority order;
and determining a master controller from the controller cluster based on the location of the first controller carried in the first controller NLRI and the location of the second controller carried in the second controller NLRI, wherein the master controller is responsible for controlling the network.
2. The method of claim 1, wherein the first BGP message comprises at least one of: a flag indicating whether the first controller is the master controller of the network, the location of the first controller, an old location of the first controller, a number of controllers in the controller cluster, and a priority of the first controller relative to other controllers in the controller cluster.
3. The method according to claim 1 or 2, wherein the second BGP message comprises at least one of: a second flag indicating whether the second controller is the master controller of the network, the location of the second controller, an old location of the second controller, a number of controllers in the controller cluster, and a priority of the second controller relative to other controllers in the controller cluster.
4. The method according to claim 1 or 2, wherein establishing the BGP session with the NE comprises:
establishing a plurality of BGP sessions with a plurality of NEs in the network to create a plurality of control channels, the plurality of NEs comprising the NE;
and establishing an extended BGP session with the NE to create an information channel.
5. The method according to claim 1 or 2, wherein establishing the BGP session with the NE comprises:
establishing a plurality of BGP sessions with a plurality of NEs in the network to create a plurality of control channels, the plurality of NEs excluding the NE;
and establishing an extended BGP session with the NE to create an information channel.
6. The method according to claim 1 or 2, wherein establishing the BGP session with the NE comprises:
transmitting, to the NE, a first OPEN message having a high availability support capability triplet, the high availability support capability triplet including a flag indicating that the first controller is a controller;
and receiving, from the NE, a second OPEN message having a high availability support capability triplet including a flag indicating that the NE is a node in the network.
7. The method of claim 6, wherein the high availability support capability triplet in the first OPEN message is carried in an optional parameter of the first OPEN message and the high availability support capability triplet in the second OPEN message is carried in an optional parameter of the second OPEN message.
8. The method of claim 7, wherein the first BGP message comprises a first controller address family identifier (AFI), a first controller subsequent address family identifier (SAFI), and the first controller NLRI, wherein the second BGP message comprises a second controller AFI, a second controller SAFI, and the second controller NLRI.
9. The method of claim 7, wherein the first BGP message is encoded as a BGP UPDATE, the first controller NLRI is carried in a first path attribute field of the first BGP message, and the second BGP message is encoded as a BGP UPDATE, and the second controller NLRI is carried in a second path attribute field of the second BGP message.
10. The method according to claim 9, wherein the method further comprises:
determining that the second controller has failed after receiving an indication from the NE that the second controller has failed, or after determining that no BGP message has been received from the second controller within a predetermined period of time;
selecting the first controller as the master controller of the network when the second controller fails;
and sending a third BGP message to the NE, the third BGP message including a third controller NLRI, the third controller NLRI indicating that the first controller is the master controller of the network.
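The failover steps of claim 10 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `PeerState` record, the dictionary standing in for a BGP UPDATE, and the field names `controller_id` and `is_master` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerState:
    """What the first controller knows about the second controller (hypothetical)."""
    last_message_at: float         # time the last BGP message from the peer was seen
    reported_failed: bool = False  # True once the NE has indicated a peer failure


def second_controller_failed(peer: PeerState, now: float, timeout: float) -> bool:
    """Failure is declared on an NE indication or on silence longer than the timeout."""
    return peer.reported_failed or (now - peer.last_message_at) > timeout


def on_tick(my_id: str, peer: PeerState, now: float, timeout: float) -> Optional[dict]:
    """Return the message to send to the NE when taking over, otherwise None."""
    if not second_controller_failed(peer, now, timeout):
        return None
    # Stand-in for a BGP UPDATE whose controller NLRI marks this controller as master.
    return {"type": "UPDATE",
            "controller_nlri": {"controller_id": my_id, "is_master": True}}
```

Either trigger alone is sufficient in this sketch, which mirrors the "or" in the claim wording.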
11. The method of claim 9, wherein the cluster of controllers comprises a plurality of controllers including the first controller and the second controller, wherein the method further comprises:
determining that at least one fault has occurred within the controller cluster to create a first set of controllers and a second set of controllers within the controller cluster;
determining that the first controller is coupled to the first set of controllers, which does not include the second controller, and that the second controller is coupled to the second set of controllers, which does not include the first controller.
12. The method of claim 11, wherein the first set of controllers has a first number of controllers and the second set of controllers has a second number of controllers, wherein the method further comprises:
determining that the first controller of the first set of controllers is an intended master controller of the first set of controllers based on the old position of the first controller or a priority of the first controller relative to other controllers of the first set of controllers;
sending a third BGP message to the NE indicating a status of the first set of controllers, the third BGP message including a number of controllers in the first set of controllers, the old position of the first controller, and the priority of the first controller;
receiving a fourth BGP message from the NE indicating a status of the second set of controllers, the fourth BGP message indicating that the second controller is an intended master controller of the second set of controllers, the fourth BGP message including a number of controllers in the second set of controllers, an old position of the second controller, and a priority of the second controller relative to other controllers in the second set of controllers.
13. The method of claim 12, further comprising: selecting the first controller as the master controller of the network based on the number of controllers in each of the first and second sets of controllers, the highest old position of the first or second controller, or the highest priority of the first or second controller.
14. The method of claim 13, wherein the cluster of controllers further comprises a third controller, wherein the method further comprises receiving a third BGP message comprising a third controller NLRI carrying the ID of each controller in the cluster of controllers, the third controller NLRI further carrying a position of the third controller relative to the other controllers in the cluster of controllers based on the priority order, wherein the master controller is determined based on the position of the first controller carried in the first controller NLRI, the position of the second controller carried in the second controller NLRI, and the position of the third controller carried in the third controller NLRI.
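The selection between two separated sets of controllers in claims 12 and 13 might be sketched as below. Several points are assumptions of this example rather than statements from the claims: the criteria are applied in the order group size, then old position, then priority; a larger integer means a "higher" old position or priority; and the controller ID is used as a final tie-breaker so that both sides reach the same answer.

```python
from typing import NamedTuple


class GroupStatus(NamedTuple):
    """Status advertised for one set of controllers (field names are hypothetical)."""
    candidate_id: str   # intended master controller of the set
    group_size: int     # number of controllers in the set
    old_position: int   # candidate's position before the split (larger = higher, assumed)
    priority: int       # candidate's priority (larger = higher, assumed)


def elect_master(a: GroupStatus, b: GroupStatus) -> str:
    """Pick the master by comparing the advertised statuses field by field."""
    def key(g: GroupStatus):
        # Tuple comparison applies the criteria in the assumed order; the ID breaks full ties.
        return (g.group_size, g.old_position, g.priority, g.candidate_id)
    return max((a, b), key=key).candidate_id
```

Because the comparison depends only on the two advertised statuses and not on argument order, each intended master can run it locally on its own status and the status relayed by the NE and arrive at the same result.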
15. A method implemented by a network element (NE) in a network comprising a cluster of controllers, the method comprising:
establishing a first extended Border Gateway Protocol (BGP) session with a master controller of the network to create a control channel and an information channel between the NE and the master controller, wherein each controller and one or more NEs exchange BGP messages with each other after the extended BGP session is established;
establishing a second BGP session with a secondary controller of the network, wherein the controller cluster comprises the master controller and the secondary controller, and the master controller is responsible for controlling the network;
receiving a BGP message from the master controller, wherein the BGP message comprises controller network layer reachability information (NLRI) indicating that the BGP message was sent by the master controller, and the controller NLRI carries a position of the master controller relative to other controllers in the controller cluster and an identification (ID) of each controller in the controller cluster;
forwarding the BGP message to the secondary controller.
16. The method of claim 15, wherein the BGP message comprises at least one of: a flag indicating that the master controller controls the network, a position of the master controller relative to other controllers in the controller cluster, an old position of the master controller, a number of controllers in the controller cluster, and a priority of the master controller relative to other controllers in the controller cluster.
17. The method of claim 15 or 16, wherein establishing the first BGP session with the master controller comprises:
transmitting, to the master controller, a first OPEN message having a high availability support capability triplet, the high availability support capability triplet including a flag indicating that the NE is a node in the network;
receiving, from the master controller, a second OPEN message having a high availability support capability triplet, the high availability support capability triplet including a flag indicating that the master controller is a controller in the network.
18. The method of claim 17, wherein the high availability support capability triplet in the first OPEN message is carried in an optional parameter of the first OPEN message and the high availability support capability triplet in the second OPEN message is carried in an optional parameter of the second OPEN message.
19. The method according to claim 15 or 16, further comprising:
detecting a failure of the master controller;
and sending a second BGP message to the secondary controller, wherein the second BGP message comprises a third controller NLRI indicating that the master controller has failed, and the second BGP message instructs the secondary controller to withdraw information about the master controller from a state database.
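The NE-side behaviour of claims 15 and 19 (relaying the master controller's messages to the secondary controller, then reporting a master failure so that the secondary controller withdraws the master's entry) can be sketched as follows. The class names, the dictionary message shape, and the `failed` field are hypothetical and chosen only for illustration.

```python
class SecondaryController:
    """Keeps a state database keyed by controller ID (hypothetical structure)."""

    def __init__(self) -> None:
        self.state_db: dict = {}

    def receive(self, msg: dict) -> None:
        nlri = msg["controller_nlri"]
        if nlri.get("failed"):
            # Failure notice: withdraw what is known about that controller.
            self.state_db.pop(nlri["controller_id"], None)
        else:
            self.state_db[nlri["controller_id"]] = nlri


class NetworkElement:
    """Relays controller information between the master and the secondary controller."""

    def __init__(self, secondary: SecondaryController) -> None:
        self.secondary = secondary

    def on_message_from_master(self, msg: dict) -> None:
        # Forward the master's BGP message unchanged to the secondary controller.
        self.secondary.receive(msg)

    def on_master_failure(self, master_id: str) -> None:
        # Tell the secondary controller that the master controller has failed.
        self.secondary.receive(
            {"controller_nlri": {"controller_id": master_id, "failed": True}})
```

In this sketch the NE holds no election logic of its own; it only forwards and notifies, which matches the relay role the claims give it.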
20. A communication device for use as a first controller, the communication device comprising:
a memory for storing instructions;
a processor which, when executing the instructions, causes the communication device to implement the method of any one of claims 1-14.
21. A communication device for use as a network element NE, the communication device comprising:
a memory for storing instructions;
a processor which, when executing the instructions, causes the communication device to implement the method of any one of claims 15-19.
22. A first controller, wherein a network comprises a cluster of controllers, the cluster of controllers comprising the first controller and a second controller, the first controller comprising:
means for establishing an extended Border Gateway Protocol (BGP) session with a network element (NE) in the network to create a control channel and an information channel between the NE and the first controller, wherein each controller and one or more NEs exchange BGP messages with each other after the extended BGP session is established;
means for sending a first BGP message to the NE including first controller network layer reachability information (NLRI), the first controller NLRI carrying an identification (ID) of each controller in the controller cluster, the first controller NLRI further carrying a position of the first controller relative to other controllers in the controller cluster based on a priority order;
means for receiving a second BGP message from the NE comprising a second controller NLRI carrying the ID of each controller in the controller cluster, the second controller NLRI further carrying a position of the second controller relative to the other controllers in the controller cluster based on the priority order;
and means for determining a master controller from the controller cluster based on the position of the first controller carried in the first controller NLRI and the position of the second controller carried in the second controller NLRI, wherein the master controller is responsible for controlling the network.
23. A network element (NE) in a network comprising a cluster of controllers, the NE comprising:
means for establishing a first extended Border Gateway Protocol (BGP) session with a master controller of the network to create a control channel and an information channel between the NE and the master controller, wherein each controller and one or more NEs exchange BGP messages with each other after the extended BGP session is established;
means for establishing a second BGP session with a secondary controller of the network, the cluster of controllers including the master controller and the secondary controller, the master controller being responsible for controlling the network;
means for receiving a BGP message from the master controller, the BGP message including controller network layer reachability information (NLRI) indicating that the BGP message was sent by the master controller, the controller NLRI carrying a position of the master controller relative to other controllers in the controller cluster and an identification (ID) of each controller in the controller cluster;
and means for forwarding the BGP message to the secondary controller.
CN202080094697.7A 2020-02-18 2020-12-14 System and method for Border Gateway Protocol (BGP) controlled network reliability Active CN115004655B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311237020.6A CN117376234A (en) 2020-02-18 2020-12-14 System and method for Border Gateway Protocol (BGP) controlled network reliability

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202062978099P 2020-02-18 2020-02-18
US62/978,099 2020-02-18
PCT/US2020/064888 WO2021167685A1 (en) 2020-02-18 2020-12-14 System and method for border gateway protocol (bgp) controlled network reliability

Related Child Applications (1)

Application Number Priority Date Filing Date Title
CN202311237020.6A Division CN117376234A (en) 2020-02-18 2020-12-14 System and method for Border Gateway Protocol (BGP) controlled network reliability

Publications (2)

Publication Number Publication Date
CN115004655A (en) 2022-09-02
CN115004655B (en) 2023-10-10

Family

ID=74141973

Family Applications (2)

Application Number Priority Date Filing Date Title
CN202311237020.6A Pending CN117376234A (en) 2020-02-18 2020-12-14 System and method for Border Gateway Protocol (BGP) controlled network reliability
CN202080094697.7A Active CN115004655B (en) 2020-02-18 2020-12-14 System and method for Border Gateway Protocol (BGP) controlled network reliability

Country Status (4)

Country Link
US (1) US20220393936A1 (en)
EP (1) EP4085580A1 (en)
CN (2) CN117376234A (en)
WO (1) WO2021167685A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104468236A (en) * 2014-12-19 2015-03-25 上海斐讯数据通信技术有限公司 SDN controller cluster, SDN switch and SDN switch connecting control method
US9660897B1 (en) * 2013-12-04 2017-05-23 Juniper Networks, Inc. BGP link-state extensions for segment routing
CN108881059A (en) * 2018-05-29 2018-11-23 新华三技术有限公司 Controller role determines method, the network switching equipment, controller and network system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10126719B2 (en) * 2013-06-17 2018-11-13 Kt Corporation Methods for changing an authority of control for a controller in environment having multiple controllers
US10263828B2 (en) * 2015-09-30 2019-04-16 Nicira, Inc. Preventing concurrent distribution of network data to a hardware switch by multiple controllers
US10432427B2 (en) * 2016-03-03 2019-10-01 Futurewei Technologies, Inc. Border gateway protocol for communication among software defined network controllers
US10523499B2 (en) * 2016-09-20 2019-12-31 Hewlett Packard Enterprise Development Lp Master controller selection in a software defined network
US10511524B2 (en) * 2017-10-03 2019-12-17 Futurewei Technologies, Inc. Controller communications in access networks
CN109936505B (en) * 2017-12-15 2021-06-22 上海诺基亚贝尔股份有限公司 Method and apparatus in data-centric software-defined networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
K. Talaulikar, Ed. (Cisco Systems). Distribution of Link-State and Traffic Engineering Information Using BGP, draft-ietf-idr-rfc7752bis-02. IETF, 2019, full text. *

Also Published As

Publication number Publication date
CN117376234A (en) 2024-01-09
CN115004655A (en) 2022-09-02
WO2021167685A1 (en) 2021-08-26
EP4085580A1 (en) 2022-11-09
US20220393936A1 (en) 2022-12-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant