WO2017221050A1 - Efficient handling of multi-destination traffic in multi-homed Ethernet virtual private networks (EVPN)

Publication number
WO2017221050A1
Authority
WO
WIPO (PCT)
Prior art keywords
network device
destination
network
label
links
Application number
PCT/IB2016/053726
Inventor
Prasanna Chalapathy
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Application filed by Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/IB2016/053726
Publication of WO2017221050A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/46 Interconnection of networks
    • H04L12/4633 Interconnection of networks using encapsulation techniques, e.g. tunneling
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/02 Topology update or discovery
    • H04L45/04 Interdomain routing, e.g. hierarchical routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/16 Multipoint routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/50 Routing or path finding of packets in data switching networks using label swapping, e.g. multi-protocol label switch [MPLS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/50 Routing or path finding of packets in data switching networks using label swapping, e.g. multi-protocol label switch [MPLS]
    • H04L45/507 Label distribution
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/46 Interconnection of networks
    • H04L12/4641 Virtual LANs, VLANs, e.g. virtual private networks [VPN]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/46 Interconnection of networks
    • H04L12/4604 LAN interconnection over a backbone network, e.g. Internet, Frame Relay
    • H04L2012/4629 LAN interconnection over a backbone network, e.g. Internet, Frame Relay using multilayer switching, e.g. layer 3 switching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/66 Layer 2 routing, e.g. in Ethernet based MAN's

Abstract

Methods and apparatus for efficiently handling multi-destination network packets are described. Responsive to determining that a first aliasing label (AL12) and a first multi-destination label (ML12) include an identical value, a multi-destination forwarding table is updated to exclude a first network device (112). The first aliasing label (AL12) indicates that the first network device (112) is coupled with a group of links (125), where the group of links couples a second network device (101) with network devices from a broadcast domain. The first multi-destination label (ML12) is to be used for forwarding multi-destination network packets towards the first network device (112). Responsive to updating the multi-destination forwarding table, multi-destination network packets received at a third network device (111) from the second network device (101) are not transmitted to the first network device (112).

Description

EFFICIENT HANDLING OF MULTI-DESTINATION TRAFFIC IN MULTI-HOMED ETHERNET VIRTUAL PRIVATE NETWORKS (EVPN)
TECHNICAL FIELD
[0001] Embodiments of the invention relate to the field of packet networks; and more specifically, to multi-destination traffic in multi-homed Ethernet Virtual Private Networks (EVPN).
BACKGROUND
[0002] An Ethernet Virtual Private Network (EVPN) is a type of VPN technology which introduces routing of Media Access Control (MAC) addresses using Multiprotocol Border Gateway Protocol (MP-BGP) over Multiprotocol Label Switching (MPLS). As with other types of VPNs, an EVPN comprises customer edge (CE) devices connected to provider edge (PE) devices that form the edge of an MPLS infrastructure. A CE may be a host, a router, or a switch. The PEs provide virtual Layer 2 bridged connectivity between the CEs. There may be multiple EVPN instances in the provider's network. The PEs may be connected by an MPLS Label Switched Path (LSP) infrastructure, which provides the benefits of MPLS technology, such as fast reroute, resiliency, etc. The PEs may also be connected by an IP infrastructure, in which case IP/GRE (Generic Routing Encapsulation) tunneling or other IP tunneling can be used between the PEs. The CEs can connect to multiple active points of attachment (i.e., to multiple PEs).
[0003] In EVPN, PEs advertise the MAC addresses learned from the CEs that are connected to them, along with an MPLS label, to other PEs in the control plane using BGP. Control-plane route learning through MP-BGP offers greater control over the MAC route learning process, and enables restrictions on which device learns which information as well as the ability to apply policies. It further enables load balancing of traffic to and from CEs that are multi-homed to multiple PEs and improves convergence times in the event of certain network failures.
[0004] According to Internet Engineering Task Force (IETF) Request for Comments (RFC) 7432, each PE advertises a multi-destination label (e.g., carried in an "Inclusive Multicast Ethernet Tag route") to enable other PEs to send broadcast or multicast traffic received from a CE to that PE. The packets are encapsulated with the multi-destination label associated with a broadcast domain of an EVPN instance (e.g., an Ethernet tag associated with a given VLAN) and sent to all the other PEs that are part of that broadcast domain. In certain scenarios, a given PE may also need to flood unknown unicast traffic to other PEs.
[0005] In standard EVPN procedures, when a PE receives a multi-destination (i.e., BUM (broadcast, unknown unicast, or multicast)) packet, it floods the packet to other network devices according to forwarding entries of a multi-destination forwarding table (e.g., a flood list) associated with the link on which the packet is received. If the packet arrived from a CE coupled with the PE, the PE sends a copy of that packet on every Ethernet segment (belonging to that EVPN instance) for which it is the Designated Forwarder (DF), other than the link on which it received the packet. In addition, the PE floods the packet to all other PEs participating in that EVPN instance. If, on the other hand, the packet arrived from another PE, the PE sends a copy of the packet on each Ethernet segment (belonging to that EVPN instance) for which it is the DF.
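The standard flooding behavior of paragraph [0005] can be sketched in a few lines of Python. This is a minimal illustration only, not part of the described embodiments: the `ProviderEdge` class, the `flood_targets` function, and the segment and PE names are hypothetical, and encapsulation and label handling are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ProviderEdge:
    """Hypothetical model of one PE in a single EVPN instance."""
    name: str
    # Attached Ethernet segments, mapped to True when this PE is the
    # Designated Forwarder (DF) for that segment.
    segments: Dict[str, bool] = field(default_factory=dict)
    # All other PEs participating in the same EVPN instance.
    remote_pes: List[str] = field(default_factory=list)


def flood_targets(pe: ProviderEdge, arrived_from_ce: bool,
                  ingress_segment: Optional[str] = None
                  ) -> Tuple[List[str], List[str]]:
    """Return (local segments, remote PEs) that receive a copy of a BUM packet."""
    # Copies go out on every segment for which this PE is DF, except the
    # segment on which the packet was received.
    local = [seg for seg, is_df in pe.segments.items()
             if is_df and seg != ingress_segment]
    if arrived_from_ce:
        # Packets from a CE are additionally flooded to all other PEs.
        return local, list(pe.remote_pes)
    # Packets from another PE are only delivered locally.
    return local, []


pe1 = ProviderEdge("PE1", {"ES1": False, "ES2": True}, ["PE2", "PE3"])
print(flood_targets(pe1, arrived_from_ce=True, ingress_segment="ES1"))
print(flood_targets(pe1, arrived_from_ce=False))
```

Note that in this baseline every remote PE receives a copy of a CE-originated packet, including PEs that will only drop it.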
[0006] Further according to RFC 7432, procedures are defined to avoid looping of multi-destination traffic. For example, when a CE is multi-homed to two or more PEs on an Ethernet segment (that is operating in All-Active redundancy mode), if the CE sends a BUM packet to one of the non-DF PEs, this PE will forward the packet to all or a subset of the other PEs in that EVPN instance, including the DF PE for that Ethernet segment. In this case, the DF PE to which the CE is multi-homed drops the packet and does not forward the packet back to the CE.
[0007] However, this approach consumes computing resources on PEs as well as network bandwidth for replicating the multi-destination traffic towards other PEs of the EVPN instance, when these PEs will drop the packets to avoid looping of the traffic. Thus, according to these standard approaches, PEs of a broadcast domain within an EVPN instance receive and process multi-destination traffic from other PEs only to drop that traffic. This results in the use of bandwidth and computing resources within the network, which can affect the quality of the network, in particular when PEs need to process other traffic. For example, an application such as video conferencing between CEs would result in unnecessary continuous traffic forwarded between PEs connected to the same Ethernet segment.
SUMMARY
[0008] One general aspect includes a method of efficiently handling multi-destination network packets, the method including: responsive to determining that a first aliasing label and a first multi-destination label include an identical value, where the first aliasing label indicates that a first network device is coupled with a group of links, where the group of links couples a second network device with two or more network devices from a broadcast domain, and where the first multi-destination label is to be used for forwarding multi-destination network packets towards the first network device, updating a multi-destination forwarding table to exclude the first network device, where the multi-destination forwarding table is associated with a link of a third network device, where the link is part of the group of links and couples the second network device with the third network device; and responsive to updating the multi-destination forwarding table, causing multi-destination network packets received at the third network device from the second network device to not be transmitted to the first network device.
[0009] One general aspect includes a first network device from a plurality of network devices forming a broadcast domain, for handling multi-destination network packets, where the first network device is to be coupled with a second network device through a first link from a group of links, where the group of links couples the second network device with two or more network devices from the broadcast domain, the first network device including: a non-transitory computer readable storage medium to store instructions; and one or more processors coupled with the non-transitory computer readable storage medium to process the stored instructions to: responsive to determining that a first aliasing label and a first multi-destination label include an identical value, where the first aliasing label indicates that a first network device is coupled with a group of links, where the group of links couples a second network device with two or more network devices from a broadcast domain, and where the first multi-destination label is to be used for forwarding multi-destination network packets towards the first network device, update a multi-destination forwarding table to exclude the first network device, where the multi-destination forwarding table is associated with a link of a third network device, where the link is part of the group of links and couples the second network device with the third network device; and responsive to updating the multi-destination forwarding table, cause multi-destination network packets received at the third network device from the second network device to not be transmitted to the first network device.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The invention may best be understood by referring to the following description and accompanying drawings that are used to illustrate embodiments of the invention. In the drawings:
[0011] Figure 1 illustrates a block diagram of an exemplary system for efficient handling of multi-destination traffic in a multi-homed Ethernet Virtual Private Network (EVPN), according to some embodiments of the invention.
[0012] Figure 2 illustrates a block diagram illustrating an exemplary configuration of an EVPN instance, according to some embodiments of the invention.
[0013] Figure 3A illustrates an exemplary forwarding table at a network device of an EVPN instance, according to some embodiments of the invention.
[0014] Figure 3B illustrates an exemplary forwarding table at a network device of an EVPN instance, according to some embodiments of the invention.
[0015] Figure 3C illustrates an exemplary forwarding table at a network device of an EVPN instance, according to some embodiments of the invention.
[0016] Figure 3D illustrates an exemplary forwarding table at a network device of an EVPN instance, according to some embodiments of the invention.
[0017] Figure 3E illustrates exemplary multi-destination forwarding table(s) for each network device of the broadcast domain, according to some embodiments.
[0018] Figure 4 is a block diagram illustrating an exemplary scenario for forwarding multi-destination traffic in an EVPN instance, according to some embodiments of the invention.
[0019] Figure 5 is a block diagram illustrating an exemplary forwarding of multi-destination traffic in an EVPN instance, according to some embodiments of the invention.
[0020] Figure 6 illustrates a flow diagram of exemplary operations for efficient handling of multi-destination traffic in a multi-homed EVPN instance, according to some embodiments of the invention.
[0021] Figure 7 illustrates a flow diagram of exemplary detailed operations for determining values for aliasing labels and multi-destination labels, according to some embodiments of the invention.
[0022] Figure 8A illustrates connectivity between network devices (NDs) within an exemplary network, as well as three exemplary implementations of the NDs, according to some embodiments of the invention.
[0023] Figure 8B illustrates an exemplary way to implement a special-purpose network device according to some embodiments of the invention.
[0024] Figure 8C illustrates various exemplary ways in which virtual network elements (VNEs) may be coupled according to some embodiments of the invention.
[0025] Figure 8D illustrates a network with a single network element (NE) on each of the NDs, and within this straightforward approach contrasts a traditional distributed approach (commonly used by traditional routers) with a centralized approach for maintaining reachability and forwarding information (also called network control), according to some embodiments of the invention.
[0026] Figure 8E illustrates the simple case of where each of the NDs implements a single NE, but a centralized control plane has abstracted multiple of the NEs in different NDs into (to represent) a single NE in one of the virtual network(s), according to some embodiments of the invention.
[0027] Figure 8F illustrates a case where multiple VNEs are implemented on different NDs and are coupled to each other, and where a centralized control plane has abstracted these multiple VNEs such that they appear as a single VNE within one of the virtual networks, according to some embodiments of the invention.
[0028] Figure 9 illustrates a general purpose control plane device with centralized control plane (CCP) software 950, according to some embodiments of the invention.
DETAILED DESCRIPTION
[0029] The following description describes methods and apparatus for efficient handling of multi-destination traffic in multi-homed Ethernet virtual private networks (EVPN). In the following description, numerous specific details such as logic implementations, opcodes, means to specify operands, resource partitioning/sharing/duplication implementations, types and interrelationships of system components, and logic partitioning/integration choices are set forth in order to provide a more thorough understanding of the present invention. It will be appreciated, however, by one skilled in the art that the invention may be practiced without such specific details. In other instances, control structures, gate level circuits and full software instruction sequences have not been shown in detail in order not to obscure the invention. Those of ordinary skill in the art, with the included descriptions, will be able to implement appropriate functionality without undue experimentation.
[0030] References in the specification to "one embodiment," "an embodiment," "an example embodiment," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
[0031] Bracketed text and blocks with dashed borders (e.g., large dashes, small dashes, dot-dash, and dots) may be used herein to illustrate optional operations that add additional features to embodiments of the invention. However, such notation should not be taken to mean that these are the only options or optional operations, and/or that blocks with solid borders are not optional in certain embodiments of the invention.
[0032] In the following description and claims, the terms "coupled" and "connected," along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. "Coupled" is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, co-operate or interact with each other. "Connected" is used to indicate the establishment of communication between two or more elements that are coupled with each other.
[0033] An electronic device stores and transmits (internally and/or with other electronic devices over a network) code (which is composed of software instructions and which is sometimes referred to as computer program code or a computer program) and/or data using machine-readable media (also called computer-readable media), such as machine-readable storage media (e.g., magnetic disks, optical disks, read only memory (ROM), flash memory devices, phase change memory) and machine-readable transmission media (also called a carrier) (e.g., electrical, optical, radio, acoustical or other form of propagated signals, such as carrier waves, infrared signals). Thus, an electronic device (e.g., a computer) includes hardware and software, such as a set of one or more processors coupled to one or more machine-readable storage media to store code for execution on the set of processors and/or to store data. For instance, an electronic device may include non-volatile memory containing the code since the non-volatile memory can persist code/data even when the electronic device is turned off (when power is removed), and while the electronic device is turned on that part of the code that is to be executed by the processor(s) of that electronic device is typically copied from the slower non-volatile memory into volatile memory (e.g., dynamic random access memory (DRAM), static random access memory (SRAM)) of that electronic device. Typical electronic devices also include a set of one or more physical network interface(s) to establish network connections (to transmit and/or receive code and/or data using propagating signals) with other electronic devices. One or more parts of an embodiment of the invention may be implemented using different combinations of software, firmware, and/or hardware.
[0034] A network device (ND) is an electronic device that communicatively interconnects other electronic devices on the network (e.g., other network devices, end-user devices). Some network devices are "multiple services network devices" that provide support for multiple networking functions (e.g., routing, bridging, switching, Layer 2 aggregation, session border control, Quality of Service, and/or subscriber management), and/or provide support for multiple application services (e.g., data, voice, and video).
[0035] Some NDs provide support for implementing VPNs (Virtual Private Networks) (e.g., Layer 2 VPNs and/or Layer 3 VPNs). For example, the NDs at which a provider's network and a customer's network are coupled are respectively referred to as PEs (Provider Edges) and CEs (Customer Edges). In a Layer 2 VPN, forwarding typically is performed on the CE(s) on either end of the VPN and traffic is sent across the network (e.g., through one or more PEs coupled by other NDs). Layer 2 circuits are configured between the CEs and PEs (e.g., an Ethernet port, an ATM permanent virtual circuit (PVC), a Frame Relay PVC). In a Layer 3 VPN, routing typically is performed by the PEs. By way of example, an edge ND that supports multiple VNEs may be deployed as a PE; and a VNE may be configured with a VPN protocol, and thus that VNE is referred to as a VPN VNE. An Ethernet Virtual Private Network (EVPN) is a type of VPN technology developed to address the limitations of Virtual Private LAN Service (VPLS) by providing multi-homing and redundancy, multicast optimization, provisioning simplicity, flow-based load balancing, and multipathing. RFC 7432: "BGP MPLS-Based Ethernet VPN" describes procedures for BGP MPLS based EVPN, which introduces routing of MAC addresses using a control plane routing protocol (e.g., Multiprotocol Border Gateway Protocol (MP-BGP)) over Multiprotocol Label Switching (MPLS).
[0036] A method and system for enabling efficient handling of multi-destination traffic in multi-homed Ethernet Virtual Private Networks (EVPN) are described. The techniques described herein reduce compute resources and network bandwidth usage in an EVPN instance when handling multi-destination traffic. An EVPN instance includes a set of network devices acting as provider edges (PEs) and a set of network devices acting as customer edges (CEs) coupled with the PEs. In one embodiment, in response to determining that a first aliasing label and a first multi-destination label associated with a first network device (e.g., a PE of a broadcast domain of an EVPN instance) include an identical value, a multi-destination forwarding table is updated to exclude the first network device. The first aliasing label associated with the first network device is learnt by a third network device (e.g., another PE of the broadcast domain of the EVPN instance) and indicates that the first network device is coupled with a group of links. The group of links (e.g., an Ethernet segment) couples a second network device with two or more network devices from a broadcast domain of an EVPN instance. The first multi-destination label associated with the first network device is further learnt by the third network device and is to be used by the third network device (e.g., a remote ND that is part of the broadcast domain) for forwarding multi-destination network packets towards the first network device. The multi-destination forwarding table is associated with a link of the third network device, where the link is part of the group of links and couples the second network device with the third network device. Responsive to updating the multi-destination forwarding table, multi-destination network packets received at the third network device from the second network device are not transmitted to the first network device. In one embodiment, the multi-destination network packets are not transmitted to the first network device even if the first network device is part of the broadcast domain to which the multi-destination packets are addressed, resulting in a reduction of usage of computing resources and bandwidth resources in the network. Thus, in this embodiment, and in contrast to standard approaches, even if the first network device is part of a broadcast domain and is operative to receive and forward multi-destination traffic, the third network device does not transmit multi-destination traffic to the first network device in response to determining that the two labels include identical values.
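The label comparison at the heart of paragraph [0036] can be sketched as follows. This is a hedged illustration under stated assumptions: the `prune_flood_list` function and the dictionary layout are hypothetical, the numeric label values are placeholders, and `None` is used here to model a device for which no aliasing label was learnt.

```python
from typing import Dict, List, Optional, Tuple

# Learnt labels per remote ND: (aliasing label, multi-destination label).
Labels = Tuple[Optional[int], int]


def prune_flood_list(flood_list: List[str],
                     learnt: Dict[str, Labels]) -> List[str]:
    """Return the flood list without NDs whose two learnt labels are identical."""
    kept = []
    for nd in flood_list:
        aliasing, multi_destination = learnt[nd]
        # Identical values signal that the remote ND would only drop the
        # packet, so it is excluded from the multi-destination table.
        if aliasing is not None and aliasing == multi_destination:
            continue
        kept.append(nd)
    return kept


# View of a third network device (e.g., ND 111) for the link towards ND 101.
# Label values are illustrative placeholders.
learnt_by_nd111 = {
    "ND112": (1000, 1000),  # AL12 == ML12, excluded
    "ND113": (2000, 2001),  # AL13 != ML13, kept
    "ND114": (None, 3000),  # no aliasing label learnt, kept
}
print(prune_flood_list(["ND112", "ND113", "ND114"], learnt_by_nd111))
```

The check is a simple equality test at table-update time, so no per-packet work is added to the forwarding path in this sketch.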
[0037] The embodiments of the present invention enable a network device to avoid transmission of multi-destination traffic to another network device from a broadcast domain if it determines that this network device will drop the packet upon its receipt. The embodiments described herein reduce the use of compute resources and network bandwidth usage in an EVPN instance when handling multi-destination traffic.
[0038] Figure 1 illustrates a block diagram of an exemplary network for efficient handling of multi-destination traffic in a multi-homed Ethernet Virtual Private Network (EVPN), according to some embodiments of the invention. Figure 1 illustrates an exemplary EVPN instance including a set of network devices (NDs). For example, the EVPN instance includes a set of NDs 111, 112, 113 and 114 of a provider's network coupled with a set of NDs (ND 101-103) of a customer's network. In some embodiments, the NDs 101, 102, 103 are customer edge (CE) network devices coupled with provider edge (PE) network devices NDs 111, 112, 113 and 114. These NDs represent connection points in the network in which a customer's site (e.g., a data center, customer's network, computing device, etc.) connects with a provider's network. One of ordinary skill in the art would understand that the number of NDs in network 100 is exemplary only and not intended to be limiting. A network (e.g., EVPN network) may include any number of network devices. Each one of the NDs 111-114 and 101-103 can be implemented as described in further details with reference to Figures 8A-10.
[0039] Each one of the NDs 101-103 may be a host, a router, or a switch coupled with one or more customer sites (not shown in Figure 1). The NDs 111-114 provide virtual Layer 2 bridged connectivity between NDs 101-103. The NDs 111-114 are coupled through a network 105. For example, the NDs can be coupled through an MPLS Label Switched Path (LSP) infrastructure, which provides the benefits of MPLS technology, such as fast reroute, resiliency, etc. In other embodiments, the NDs 111-114 may be connected by an IP infrastructure, in which case IP/GRE (Generic Routing Encapsulation) tunneling or other IP tunneling can be used between the NDs.
[0040] A broadcast domain is a set of network devices associated with a broadcast ID, which is operative to receive packets identified in part based on that broadcast ID. For example, a broadcast domain may correspond to a Virtual LAN (VLAN), where a VLAN is typically represented by a single VLAN ID (VID). In some embodiments, a broadcast domain can be represented by several VIDs where Shared VLAN Learning (SVL) is used. In some embodiments, an EVPN instance may include one or more broadcast domains. In the illustrated exemplary system of Figure 1, the EVPN instance includes a single broadcast domain (e.g., BD1 of Figures 2 and 3A-D) as will be described in further details below. While embodiments will be described with respect to the EVPN instance including a single broadcast domain, alternative embodiments can be implemented in which an EVPN instance may include multiple broadcast domains, each domain being identified with a corresponding broadcast ID, without departing from the scope of the present invention.
[0041] Each one of the NDs 101-103 can connect to multiple active points of attachment (i.e., to multiple PEs). For example, ND 101 is coupled with ND 111, ND 112, and ND 113 through a group of links 125. The group of links includes link 121 coupling ND 101 with ND 111, link 122 coupling ND 101 with ND 112, and link 123 coupling ND 101 with ND 113. The group of links is associated with a unique non-zero identifier. In some embodiments, the group of links is an Ethernet segment and is associated with an Ethernet Segment Identifier (ESI). The group of links can operate in a "Single-Active Redundancy Mode," where only a single ND from the NDs 111-113 is allowed to forward traffic to/from that Ethernet segment for a given broadcast domain. Alternatively, the group of links 125 may operate in an "All-Active Redundancy Mode," where all NDs 111-113 attached to the group of links are allowed to forward known unicast traffic to/from that group of links for a given broadcast domain. In another example, ND 103 is coupled to a single ND 113 through the link 124 that is not part of the group of links 125. ND 102 is coupled to a single ND 114 through the link 126 that is not part of the group of links 125.
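The connectivity of paragraph [0041] can be captured in a small data model, which makes it easy to ask which PEs share a group of links. The following is a sketch only: the `Link` class and `segment_peers` helper are hypothetical names, while the device, link, and group numerals are taken from Figure 1 as described above.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Link:
    """One CE-to-PE link; `group` is the group of links it belongs to, if any."""
    link_id: int
    ce: str
    pe: str
    group: Optional[int]


# Links of the exemplary network of Figure 1.
LINKS = [
    Link(121, "ND101", "ND111", 125),
    Link(122, "ND101", "ND112", 125),
    Link(123, "ND101", "ND113", 125),
    Link(124, "ND103", "ND113", None),  # not part of group 125
    Link(126, "ND102", "ND114", None),  # not part of group 125
]


def segment_peers(pe: str, links: List[Link] = LINKS) -> Set[str]:
    """Return the other PEs attached to any group of links that `pe` is on."""
    groups = {l.group for l in links if l.pe == pe and l.group is not None}
    return {l.pe for l in links if l.group in groups and l.pe != pe}


print(sorted(segment_peers("ND111")))  # PEs sharing group 125 with ND 111
print(sorted(segment_peers("ND114")))  # ND 114 is on no group of links
```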
[0042] Each one of the NDs 111-114 is operative to forward multi-destination traffic received from one of the NDs 101-103. Multi-destination traffic refers to broadcast, unknown unicast, or multicast (i.e., BUM) traffic, which is received at a network device and needs to be forwarded to one or more network devices of a broadcast domain.
[0043] During a route learning mechanism, each one of NDs 111-114 learns the MAC addresses of the CEs coupled with it. In one embodiment, each one of the CEs (e.g., ND 101, 102, and 103) advertises a MAC address to an associated PE (e.g., NDs 111-114) to which it is coupled. For example, ND 101 advertises its MAC address to ND 111, 112 and 113; ND 102 advertises its MAC address to ND 114; and ND 103 advertises its MAC address to ND 113. Further, each one of the PEs (e.g., ND 111, ND 112, ND 113 and ND 114) learns the MAC addresses of the CEs to which other PEs are coupled as well as routing labels enabling a PE to forward traffic to the other PEs. For example, a routing protocol may be used to distribute in the control plane routing topology (e.g., through the distribution of routing labels) within the EVPN instance (e.g., MP-BGP may be used). In one embodiment, each one of the network devices NDs 111-114 of the broadcast domain exchanges routing information with the other network devices of the broadcast domain. As will be described in further detail below, the values assigned to each label vary based on the remote network device (e.g., ND 112-114) learning the labels of another network device (e.g., ND 111).
[0044] In one embodiment, each one of NDs 111-114 is assigned an aliasing label and a multi-destination label that are communicated to a respective one of the other network devices to enable the other network device to forward packets towards the ND. The aliasing label indicates whether the network device is coupled with a group of links (e.g., 125) and may be used for forwarding packets of the group of links. The multi-destination label is to be used by other network devices of the broadcast domain for forwarding multi-destination network packets towards the network device. For example, in the illustrated exemplary broadcast domain of Figure 1 and as will be described in further details with reference to Figures 3A-D, each one of the NDs 111-114 has associated multi-destination labels (i.e., ML11, ML12, ML13, and ML14) indicating that the ND is part of the broadcast domain BD1 of the EVPN instance (EV1). In addition, each one of the NDs 111-113 has associated aliasing labels AL11, AL12, AL13 indicating that the respective device is coupled with a group of links (125). In some embodiments, the value of the aliasing label associated with a network device learnt by a remote ND depends at least in part on whether the remote ND is part of the same group of links as the network device. In one non-limiting example, the multi-destination label is carried as part of the Provider Multicast Service Interface (PMSI) Tunnel attribute in the Inclusive Multicast Ethernet Tag Route; and the aliasing label is carried as part of the Ethernet Auto-discovery Route (EAD) (the aliasing label is the MPLS label part of the EAD).
[0045] A value is assigned to each one of the multi-destination label and aliasing label of a network device according to the connections of the network device with respect to other devices of the broadcast domain of the EVPN instance. For a given broadcast domain (BD1) of an EVPN instance (EV1), when it is determined (operation 109) that a remote network device (e.g., ND 111) to which routing information is advertised is coupled to a same group of links as the advertising ND (e.g., ND 112), and when it is determined (operation 110) that the broadcast domain is associated with a single link of the network device and that link is part of the group of links, an identical value is assigned (operation 120) to the aliasing label and the multi-destination label of that network device. For example, ND 112 and ND 111 are each coupled with ND 101 through a respective link of the group of links 125 (i.e., link 122 and link 121 respectively), and the broadcast domain to which the NDs belong is not associated with any link other than the links 122 or 121 (that are each part of the group of links 125). Thus, when ND 111 learns the aliasing label and the multi-destination label of ND 112, the labels include an identical value (e.g., ML12=AL12=X).
[0046] Alternatively, when it is determined (operation 130) that a remote network device (e.g., ND 111) to which routing information is advertised is coupled to a same group of links as the advertising ND (e.g., ND 113), and when it is determined (operation 140) for a network device that the broadcast domain of a given EVPN instance is associated with more than one link of the network device and that at least one of the links is not part of the group of links, different values are assigned (operation 150) to the aliasing label and the multi-destination label of that network device (ND 113). For example, ND 113 is coupled with ND 101 through link 123 of the group of links 125 and includes at least one link 124 that is not part of the group of links 125. Thus, different values are assigned to the aliasing label and the multi-destination label of ND 113 (e.g., AL13=Y' is different from ML13=Y).
[0047] ND 111 learns the aliasing labels and the multi-destination labels of each of the network devices that are part of the broadcast domain BD1 of the EVPN instance EV1, and a multi-destination forwarding table is populated for each link at the network device based on the routing information of the other network devices. The multi-destination forwarding table is populated for a link of a network device and is used to forward multi-destination packets (e.g., broadcast, unknown unicast, or multicast packets) received at that link. Thus, at operation 170, responsive to determining that ML12 and AL12 include an identical value X, the multi-destination forwarding table associated with link 121 is updated to exclude ND 112. Alternatively, at operation 180, responsive to determining that ML13 and AL13 include different values, the multi-destination forwarding table associated with the link 121 is updated to include ND 113.
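The table update of operations 170 and 180 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `build_multi_destination_list`, the dictionary layout, and the value 40100 standing in for the unspecified label Z of ND 114 are hypothetical, while 20100, 30100 and 33100 are the example values the description gives for X, Y and Y'.

```python
from typing import Dict, List, Optional, Tuple

# Labels learnt by ND 111 for each remote ND:
# remote ND -> (multi-destination label, aliasing label).
# An aliasing label of None means the remote ND advertised no aliasing label.
LearntLabels = Dict[int, Tuple[int, Optional[int]]]


def build_multi_destination_list(learnt: LearntLabels) -> List[int]:
    """Return the remote NDs that should receive multi-destination packets."""
    targets = []
    for nd, (ml, al) in sorted(learnt.items()):
        if al is not None and al == ml:
            # Operation 170: identical values, so the remote ND is excluded.
            continue
        # Operation 180 (or no aliasing label at all): the remote ND is included.
        targets.append(nd)
    return targets


learnt_at_nd_111: LearntLabels = {
    112: (20100, 20100),  # ML12 = AL12 = X
    113: (30100, 33100),  # ML13 = Y, AL13 = Y'
    114: (40100, None),   # ML14 = Z (value assumed), no aliasing label
}

print(build_multi_destination_list(learnt_at_nd_111))  # [113, 114]
```

With these inputs the list for link 121 contains ND 113 and ND 114 only, which matches the forwarding behavior described for ND 111.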
[0048] Figure 2 is a block diagram illustrating an exemplary configuration of an EVPN instance, according to some embodiments of the invention. The table of Figure 2 illustrates a configuration of an EVPN instance EV1 including ND 111-114 of Figure 1. In some embodiments, the NDs 111-114 are provider edge network devices forming a broadcast domain (e.g., VLAN). In this example, the EVPN instance EV1 includes a single broadcast domain identified with a broadcast ID BD1 (e.g., the broadcast domain ID can be assigned a numerical value, e.g., 100). Each ND has a corresponding number of links associated with the broadcast domain BD1. ND 111 has a single link 121 that is part of the group of links 125. The link 121 connects ND 111 to ND 101. ND 112 has a single link 122 that is part of the group of links 125. The link 122 connects ND 112 to ND 101. ND 113 has two links 123 and 124, where link 123 is part of the group of links 125, and link 124 is not part of the group of links 125. The link 123 connects ND 113 to ND 101 and link 124 connects ND 113 to ND 103. ND 114 has a single link 126 that is not part of the group of links 125. The link 126 connects ND 114 to ND 102.
[0049] According to the configuration of EV1, aliasing labels and multi-destination labels are advertised for each ND from the broadcast domain BD1 as described with reference to Figure 1, resulting in forwarding tables at each one of the network devices. Figure 3A illustrates an exemplary forwarding table at ND 111 of an EVPN instance, according to some embodiments of the invention. Figure 3B illustrates an exemplary forwarding table at ND 112 of an EVPN instance, according to some embodiments of the invention. Figure 3C illustrates an exemplary forwarding table at ND 113 of an EVPN instance, according to some embodiments of the invention. Figure 3D illustrates an exemplary forwarding table at ND 114 of an EVPN instance, according to some embodiments of the invention. Each forwarding table shows the labels (aliasing label AL, and multi-destination label ML) of each remote network device of the EVPN instance as advertised by a remote ND through a control plane routing protocol (e.g., MP-BGP). Thus, as described with reference to Figure 1, a value of the aliasing label and the multi-destination label is determined based on whether the labels are destined to a remote network device that includes links that are part of the same group of links as the advertising network device and on whether the broadcast domain is associated with a single link of the network device and that link is part of a group of links or not. Thus, when an ND is advertising to a remote ND that is part of the same group of links, and the broadcast domain is associated with a single link of the network device and that link is part of a group of links, an identical value is assigned to the aliasing label and the multi-destination label of that network device to be advertised to the remote ND.
Alternatively, when it is determined for a network device that the broadcast domain of a given EVPN instance is associated with more than one link of the network device and that at least one of the links is not part of the group of links, different values are assigned to the aliasing label and the multi-destination label of that network device to be advertised to a remote ND. The forwarding tables of Figures 3A-3D illustrate the aliasing labels and multi-destination labels associated with each remote network device for a given ND of the broadcast domain. For example, the forwarding table at ND 111 shows the labels associated with each one of the other NDs from the EVPN instance (EV1)/broadcast domain (BD1) as learnt by the ND 111. For example, an identical value is assigned to the aliasing label (AL12 = X, which can be any numerical value within a given range, e.g., 20100) and the multi-destination label (ML12 = X) of ND 112. Further, two different values are assigned to the aliasing label (AL13 = Y', e.g., 33100) and the multi-destination label (ML13 = Y, e.g., 30100) of ND 113. As another example, a value Z is assigned to the multi-destination label of remote ND 114 whereas no aliasing label is assigned to this ND, as this ND is not connected to a group of links and therefore does not need to advertise an aliasing label.
[0050] Similarly, the forwarding table at ND 112 of Figure 3B shows the labels associated with each one of the other NDs from the EVPN instance (EV1)/broadcast domain (BD1) as learnt by the ND 112. For example, an identical value is assigned to the aliasing label (AL11 = A, which can be any numerical value within a given range, e.g., 10100) and the multi-destination label (ML11 = A) of ND 111. Further, two different values are assigned to the aliasing label (AL13 = Y', e.g., 33100) and the multi-destination label (ML13 = Y, e.g., 30100) of ND 113. As another example, a value Z is assigned to the multi-destination label of remote ND 114 whereas no aliasing label is assigned to this ND, as this ND is not connected to a group of links and therefore does not need to advertise an aliasing label.
[0051] Similarly, the forwarding table at ND 113 of Figure 3C shows the labels associated with each one of the other NDs from the EVPN instance (EV1)/broadcast domain (BD1) as learnt by the ND 113. For example, an identical value is assigned to the aliasing label (AL11 = A, which can be any numerical value within a given range, e.g., 10100) and the multi-destination label (ML11 = A) of ND 111. Further, an identical value is assigned to the aliasing label (AL12 = X, which can be any numerical value within a given range, e.g., 20100) and the multi-destination label (ML12 = X) of ND 112. As another example, a value Z is assigned to the multi-destination label of remote ND 114 whereas no aliasing label is assigned to this ND, as this ND is not connected to a group of links and therefore does not need to advertise an aliasing label.
[0052] Similarly, the forwarding table at ND 114 of Figure 3D shows the labels associated with each one of the other NDs from the EVPN instance (EV1)/broadcast domain (BD1) as learnt by the ND 114. Given that none of the NDs 111-113 include links that are part of a same group of links as links of ND 114 (here ND 114 is not part of any group of links; in alternative embodiments, ND 114 may be part of a group of links that is different from the group of links 125), different values are assigned to the multi-destination label and the aliasing label of each device. For example, different values are assigned to the aliasing label (AL11 = A', which can be any numerical value within a given range, e.g., 11100) and the multi-destination label (ML11 = A, e.g., 10100) of ND 111. Further, different values are assigned to the aliasing label (AL12 = X', which can be any numerical value within a given range, e.g., 22100) and the multi-destination label (ML12 = X, e.g., 20100) of ND 112. As another example, different values are assigned to the aliasing label (AL13 = Y', e.g., 33100) and the multi-destination label (ML13 = Y, e.g., 30100) of ND 113.
[0053] Figure 3E illustrates exemplary multi-destination forwarding table(s) for each network device of the broadcast domain, according to some embodiments. Figure 3E will be described with reference to Figures 1-3D and Figures 4 and 5. As described with reference to Figure 1, the configuration of the broadcast domain BD1 of the EVPN instance EV1, and the assignment of multi-destination labels and aliasing labels for each ND according to this configuration (operations 170 and 180 of Figure 1), result in the multi-destination forwarding information for each one of the network devices illustrated in the forwarding table of Figure 3E. While Figure 3E illustrates a single forwarding table including forwarding information for all network devices of the EVPN instance/broadcast domain, the forwarding table is a logical representation of the forwarding information, which can have various implementations according to the implementation of the network devices (e.g., centralized control plane implementation (SDN) or other types of control plane implementation). Further, the forwarding table may be implemented as separate forwarding tables, one per network device.
[0054] ND 111 is part of the broadcast domain BD1 of EVPN instance EV1 and includes a single link 121 that is part of that EV1/BD1. When multi-destination packets (i.e., broadcast, unknown unicast, or multicast) are received through the link 121 (from ND 101), they are forwarded according to the multi-destination list, i.e., towards ND 113 and ND 114 of the EV1/BD1. Thus, contrary to standard approaches where ND 111 would forward multi-destination traffic to ND 113, ND 112, and ND 114, and the packets forwarded to ND 112 would be dropped, in the embodiments of the present invention ND 111 is operative to send multi-destination packets only to network devices that will forward the packets and not drop them, reducing the use of computing resources at ND 112 and the bandwidth usage in the network. Figure 4 is a block diagram illustrating an exemplary scenario for forwarding multi-destination traffic in an EVPN instance, according to some embodiments of the invention. A multi-destination packet P1 is received at ND 111 through the link 121 from the network device 101. At operation 190, responsive to the update (at operation 170) of the multi-destination forwarding table (update of the multi-destination list to not include ND 112 upon determination that AL12=ML12), the multi-destination network packet P1 received from ND 101 is not transmitted to ND 112.
[0055] Alternatively, in response to the update (at operation 180) of the multi-destination forwarding table (update of the multi-destination list to include ND 113 upon determination that AL13 != ML13), the multi-destination network packet P1 received from ND 101 is transmitted (operation 195) towards ND 113. In one embodiment, the packet P1 is encapsulated in an MPLS label to form the encapsulated packet P13 (e.g., using the multi-destination label ML13 assigned to ND 113) and transmitted through the network 105 towards ND 113. The packet received at ND 113 is decapsulated and forwarded to ND 103. Similarly, in response to the update of the multi-destination forwarding table (update of the multi-destination list to include ND 114 upon determination that ND 114 does not include a link that is part of the group of links 125 (i.e., ND 114 does not have an aliasing label assigned)), the multi-destination network packet P1 received from ND 101 is transmitted towards ND 114. In one embodiment, the packet P1 is encapsulated in an MPLS label to form the encapsulated packet P14 (e.g., using the multi-destination label ML14 assigned to ND 114) and transmitted through the network 105 towards ND 114.
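The encapsulation step can be illustrated with a short sketch that prepends a single MPLS label stack entry to a frame. It uses the standard 32-bit entry layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL). The helper names, the TTL of 64, and the placeholder payload are assumptions made for illustration; a real deployment would typically push further labels (for example a transport label), which the description does not detail.

```python
import struct
from typing import Tuple


def mpls_label_stack_entry(label: int, ttl: int = 64, tc: int = 0,
                           bottom: bool = True) -> bytes:
    """Encode one 32-bit MPLS label stack entry (label:20, TC:3, S:1, TTL:8)."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("MPLS label must fit in 20 bits")
    if not 0 <= tc < 8 or not 0 <= ttl < 256:
        raise ValueError("TC must fit in 3 bits and TTL in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)  # network byte order


def encapsulate(frame: bytes, multi_destination_label: int) -> bytes:
    """Prepend the multi-destination label of the target ND to the frame."""
    return mpls_label_stack_entry(multi_destination_label) + frame


def decapsulate(packet: bytes) -> Tuple[int, bytes]:
    """Strip the outermost label stack entry; return (label, inner frame)."""
    (word,) = struct.unpack("!I", packet[:4])
    return word >> 12, packet[4:]


ML13 = 30100                      # example value given for Y
p1 = b"example Ethernet frame"    # placeholder payload (assumed)
p13 = encapsulate(p1, ML13)       # packet sent towards ND 113
label, inner = decapsulate(p13)   # what ND 113 recovers
print(label, inner == p1)  # 30100 True
```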
[0056] Figure 5 is a block diagram illustrating an exemplary forwarding scenario of multi-destination traffic in an EVPN instance, according to some embodiments of the invention. A multi-destination packet P2 is received at ND 113 through the link 123 from the network device 101. At operation 197, responsive to the update of the multi-destination forwarding table (update of the multi-destination list for link 123 to not include ND 111 and ND 112, upon determination that link 123 is part of the same group of links 125 as the single EV1/BD1 link 121 of ND 111 and the single link 122 of ND 112, and the determination that AL11=ML11 and AL12=ML12), the multi-destination network packet P2 received from ND 101 is not transmitted to ND 111 or ND 112. Instead, the packet P2 is transmitted to ND 103 and towards ND 114, as these two NDs satisfy the conditions and are included in the multi-destination list associated with the link 123 for EVPN instance EV1 and the broadcast domain BD1. Thus in this embodiment, and in contrast to standard approaches, even if the ND 112 is part of the broadcast domain (BD1) to which the multi-destination packet P2 is addressed, ND 113 does not transmit multi-destination traffic to ND 112 at least in part in response to determining that the two labels learnt for ND 112 (AL12 and ML12) include identical values.
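The Figure 5 scenario at ND 113 can be sketched as a lookup that combines local flooding with the per-link multi-destination list. The names `LOCAL_LINKS`, `LEARNT` and `flood_targets`, the output strings, and the value 40100 for the unspecified label Z are assumptions. The sketch only models ingress on a link of the group of links 125, since that is the case described here.

```python
from typing import Dict, List, Optional, Tuple

GROUP_LINKS = {123}  # links of ND 113 that belong to the group of links 125

# Local EV1/BD1 links of ND 113 and the CE each one reaches (Figure 2).
LOCAL_LINKS: Dict[int, int] = {123: 101, 124: 103}

# Labels learnt by ND 113: remote ND -> (multi-destination label, aliasing label).
LEARNT: Dict[int, Tuple[int, Optional[int]]] = {
    111: (10100, 10100),  # ML11 = AL11 = A
    112: (20100, 20100),  # ML12 = AL12 = X
    114: (40100, None),   # ML14 = Z (value assumed), no aliasing label
}


def flood_targets(ingress_link: int) -> List[str]:
    """List where a multi-destination packet received on a group link is sent."""
    if ingress_link not in GROUP_LINKS:
        raise ValueError("only ingress on a link of group 125 is modelled")
    # Local attachment circuits other than the one the packet arrived on.
    targets = [f"link {link} -> ND {ce}"
               for link, ce in sorted(LOCAL_LINKS.items())
               if link != ingress_link]
    # Remote NDs whose two labels differ, or that have no aliasing label.
    for nd, (ml, al) in sorted(LEARNT.items()):
        if al is None or al != ml:
            targets.append(f"MPLS label {ml} -> ND {nd}")
    return targets


print(flood_targets(123))
# ['link 124 -> ND 103', 'MPLS label 40100 -> ND 114']
```

ND 111 and ND 112 never appear in the result because the labels learnt for each of them are identical.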
[0057] The operations in the flow diagrams will be described with reference to the exemplary embodiments of the other figures. However, it should be understood that the operations of the flow diagrams can be performed by embodiments of the invention other than those discussed with reference to the other figures, and the embodiments of the invention discussed with reference to these other figures can perform operations different than those discussed with reference to the flow diagrams.
[0058] Figure 6 illustrates a flow diagram of exemplary operations for efficient handling of multi-destination traffic in a multi-homed EVPN instance, according to some embodiments of the invention. At operation 602, in response to determining that a first aliasing label (e.g., AL12) and a first multi-destination label (e.g., ML12) associated with a first network device (ND 112) include an identical value (X), a multi-destination forwarding table is updated to exclude the first network device (ND 112). The first aliasing label (e.g., AL12) indicates that the network device (e.g., ND 112) is coupled with a group of links (125). The group of links (125) couples a second network device (ND 101) with two or more network devices from a broadcast domain. The first multi-destination label (ML12) is to be used by a remote network device (e.g., ND 111) for forwarding multi-destination network packets towards the first network device (ND 112). The multi-destination forwarding table is associated with a link (121) of the remote network device (ND 111), where the link (121) is part of the group of links (125) and couples the second network device with the remote network device (ND 111).
[0059] At operation 604, responsive to updating the multi-destination forwarding table, multi-destination network packets received at the remote network device (ND 111) from the second network device (101) are not transmitted to the first network device (112). In one embodiment, the multi-destination network packets are not transmitted to the network device (ND 112) even if the network device is part of the broadcast domain to which the multi-destination packets are addressed, resulting in a reduction of usage of computing resources and bandwidth resources in the network.
[0060] Figure 7 illustrates a flow diagram of exemplary detailed operations for determining values for aliasing labels and multi-destination labels, according to some embodiments of the invention. At operation 702, a determination is performed on whether a remote network device (e.g., ND 111-114) includes a link that is part of the group of links 125. If the remote network device (e.g., ND 114) to which the aliasing label and the multi-destination label are to be sent is not part of the same group of links (i.e., does not include a link that is part of the same group of links as the ND associated with the labels), the flow of operations moves to operation 708, at which the aliasing label and the multi-destination label of the network device (e.g., ND 112) are assigned different values (e.g., ML12 = X and AL12 = X' as illustrated in Figure 3D). Alternatively, if the remote network device (e.g., ND 111) to which the aliasing label and the multi-destination label are to be sent is part of the same group of links (i.e., it includes a link that is part of the same group of links as the ND associated with the labels (e.g., ND 112)), the flow of operations moves to operation 704, at which a determination is performed of whether the broadcast domain is associated with a single link at the network device and whether that single link is part of the group of links (125). Upon determining that the broadcast domain is associated with a single link (122) at the network device (e.g., ND 112), and that the single link (122) is part of the group of links (125), the flow of operations moves to operation 706, at which the aliasing label and the multi-destination label of the network device (e.g., ND 112) are assigned identical values (e.g., ML12 = X and AL12 = X as illustrated in Figure 3A).
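The decision flow of Figure 7 can be sketched against the Figure 2 configuration as follows. The `Link` dataclass, the `BD1_LINKS` mapping and the function name `labels_identical` are hypothetical; the sketch only decides whether an advertising ND presents identical or different label values to a given remote ND, and does not model an ND that advertises no aliasing label at all (such as ND 114).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Link:
    link_id: int
    group: Optional[int]  # group-of-links ID, or None if the link is in no group


# Links of each PE in broadcast domain BD1, per the Figure 2 configuration.
BD1_LINKS: Dict[int, List[Link]] = {
    111: [Link(121, 125)],
    112: [Link(122, 125)],
    113: [Link(123, 125), Link(124, None)],
    114: [Link(126, None)],
}


def labels_identical(advertiser: int, remote: int, group: int = 125) -> bool:
    """True if the advertiser's aliasing and multi-destination labels
    should carry an identical value when advertised to the remote ND."""
    # Operation 702: does the remote ND have a link in the same group?
    if not any(link.group == group for link in BD1_LINKS[remote]):
        return False  # operation 708: different values
    # Single-link check: the broadcast domain uses exactly one link at the
    # advertiser, and that link is part of the group.
    links = BD1_LINKS[advertiser]
    return len(links) == 1 and links[0].group == group


for remote in (111, 113, 114):
    print(f"ND 112 -> ND {remote}: identical={labels_identical(112, remote)}")
# ND 112 -> ND 111: identical=True
# ND 112 -> ND 113: identical=True
# ND 112 -> ND 114: identical=False
```

These outcomes agree with the example tables: ND 111 and ND 113 learn ML12 = AL12 = X, while ND 114 learns X and X'.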
[0061] Architecture
[0062] The embodiments of the present invention are performed in a control plane and forwarding plane of network devices. For example, some operations described with reference to Figures 1-7 (e.g., operations 109, 110, 120, 130, 140, 150, 170, 180, 190, 195, and 197, as well as 602-604, and 702, 704, 706, and 708) are performed in a control plane of a network (where the control plane can be implemented in a distributed approach, a centralized approach or with a more standard implementation in special purpose NDs). These operations enable network devices to update forwarding tables to handle multi-destination traffic. Thus, in response to these updates, the forwarding plane is operative to efficiently forward multi-destination traffic in a multi-homed environment.
[0063] Figure 8A illustrates connectivity between network devices (NDs) within an exemplary network, as well as three exemplary implementations of the NDs, according to some embodiments of the invention. Figure 8A shows NDs 800A-H, and their connectivity by way of lines between 800A-800B, 800B-800C, 800C-800D, 800D-800E, 800E-800F, 800F-800G, and 800A-800G, as well as between 800H and each of 800A, 800C, 800D, and 800G. These NDs are physical devices, and the connectivity between these NDs can be wireless or wired (often referred to as a link). An additional line extending from NDs 800A, 800E, and 800F illustrates that these NDs act as ingress and egress points for the network (and thus, these NDs are sometimes referred to as edge NDs; while the other NDs may be called core NDs).
[0064] Two of the exemplary ND implementations in Figure 8A are: 1) a special-purpose network device 802 that uses custom application-specific integrated-circuits (ASICs) and a special-purpose operating system (OS); and 2) a general purpose network device 804 that uses common off-the-shelf (COTS) processors and a standard OS.
[0065] The special-purpose network device 802 includes networking hardware 810 comprising compute resource(s) 812 (which typically include a set of one or more processors), forwarding resource(s) 814 (which typically include one or more ASICs and/or network processors), and physical network interfaces (NIs) 816 (sometimes called physical ports), as well as non-transitory machine readable storage media 818 having stored therein networking software 820. A physical NI is hardware in a ND through which a network connection (e.g., wirelessly through a wireless network interface controller (WNIC) or through plugging in a cable to a physical port connected to a network interface controller (NIC)) is made, such as those shown by the connectivity between NDs 800A-H. During operation, the networking software 820 may be executed by the networking hardware 810 to instantiate a set of one or more networking software instance(s) 822. The networking software 820 includes a multi-destination unit (MDU) 821 (which includes code/software). Each of the networking software instance(s) 822, and that part of the networking hardware 810 that executes that network software instance (be it hardware dedicated to that networking software instance and/or time slices of hardware temporally shared by that networking software instance with others of the networking software instance(s) 822), form a separate virtual network element 830A-R. Each of the virtual network element(s) (VNEs) 830A-R includes a control communication and configuration module 832A-R (sometimes referred to as a local control module or control communication module) that includes the MDU instances 833A-833R, and forwarding table(s) 834A-R, such that a given virtual network element (e.g., 830A) includes the control communication and configuration module (e.g., 832A), a set of one or more forwarding table(s) (e.g., 834A), and that portion of the networking hardware 810 that executes the virtual network element (e.g., 830A).
During operation, the multi-destination unit 821 may be executed by the networking hardware 810 to instantiate a set of one or more networking software instance(s) 833 which cause the network device 802 to perform the operations described with reference to Figures 1-7.
[0066] The special-purpose network device 802 is often physically and/or logically considered to include: 1) a ND control plane 824 (sometimes referred to as a control plane) comprising the compute resource(s) 812 that execute the control communication and configuration module(s) 832A-R; and 2) a ND forwarding plane 826 (sometimes referred to as a forwarding plane, a data plane, or a media plane) comprising the forwarding resource(s) 814 that utilize the forwarding table(s) 834A-R and the physical NIs 816. By way of example, where the ND is a router (or is implementing routing functionality), the ND control plane 824 (the compute resource(s) 812 executing the control communication and configuration module(s) 832A-R) is typically responsible for participating in controlling how data (e.g., packets) is to be routed (e.g., the next hop for the data and the outgoing physical NI for that data) and storing that routing information in the forwarding table(s) 834A-R, and the ND forwarding plane 826 is responsible for receiving that data on the physical NIs 816 and forwarding that data out the appropriate ones of the physical NIs 816 based on the forwarding table(s) 834A-R.
[0067] Figure 8B illustrates an exemplary way to implement the special-purpose network device 802 according to some embodiments of the invention. Figure 8B shows a special-purpose network device including cards 838 (typically hot pluggable). While in some embodiments the cards 838 are of two types (one or more that operate as the ND forwarding plane 826 (sometimes called line cards), and one or more that operate to implement the ND control plane 824 (sometimes called control cards)), alternative embodiments may combine functionality onto a single card and/or include additional card types (e.g., one additional type of card is called a service card, resource card, or multi-application card). A service card can provide specialized processing (e.g., Layer 4 to Layer 7 services (e.g., firewall, Internet Protocol Security (IPsec), Secure Sockets Layer (SSL) / Transport Layer Security (TLS), Intrusion Detection System (IDS), peer-to-peer (P2P), Voice over IP (VoIP) Session Border Controller, Mobile Wireless Gateways (Gateway General Packet Radio Service (GPRS) Support Node (GGSN), Evolved Packet Core (EPC) Gateway))). By way of example, a service card may be used to terminate IPsec tunnels and execute the attendant authentication and encryption algorithms. These cards are coupled together through one or more interconnect mechanisms illustrated as backplane 836 (e.g., a first full mesh coupling the line cards and a second full mesh coupling all of the cards).
[0068] Returning to Figure 8A, the general purpose network device 804 includes hardware 840 comprising a set of one or more processor(s) 842 (which are often COTS processors) and network interface controller(s) 844 (NICs; also known as network interface cards) (which include physical NIs 846), as well as non-transitory machine readable storage media 848 having stored therein software 850. During operation, the processor(s) 842 execute the software 850 to instantiate one or more sets of one or more applications 864A-R. While one embodiment does not implement virtualization, alternative embodiments may use different forms of virtualization. For example, in one such alternative embodiment the virtualization layer 854 represents the kernel of an operating system (or a shim executing on a base operating system) that allows for the creation of multiple instances 862A-R called software containers that may each be used to execute one (or more) of the sets of applications 864A-R; where the multiple software containers (also called virtualization engines, virtual private servers, or jails) are user spaces (typically a virtual memory space ) that are separate from each other and separate from the kernel space in which the operating system is run; and where the set of applications running in a given user space, unless explicitly allowed, cannot access the memory of the other processes. 
In another such alternative embodiment the virtualization layer 854 represents a hypervisor (sometimes referred to as a virtual machine monitor (VMM)) or a hypervisor executing on top of a host operating system, and each of the sets of applications 864A-R is run on top of a guest operating system within an instance 862A-R called a virtual machine (which may in some cases be considered a tightly isolated form of software container) that is run on top of the hypervisor - the guest operating system and application may not know they are running on a virtual machine as opposed to running on a "bare metal" host electronic device, or through para-virtualization the operating system and/or application may be aware of the presence of virtualization for optimization purposes. In yet other alternative embodiments, one, some or all of the applications are implemented as unikernel(s), which can be generated by compiling directly with an application only a limited set of libraries (e.g., from a library operating system (LibOS) including drivers/libraries of OS services) that provide the particular OS services needed by the application. As a unikernel can be implemented to run directly on hardware 840, directly on a hypervisor (in which case the unikernel is sometimes described as running within a LibOS virtual machine), or in a software container, embodiments can be implemented fully with unikernels running directly on a hypervisor represented by virtualization layer 854, unikernels running within software containers represented by instances 862A-R, or as a combination of unikernels and the above-described techniques (e.g. , unikernels and virtual machines both run directly on a hypervisor, unikernels and sets of applications that are run in different software containers). 
During operation, the multi-destination unit 851 (included in the software 850) may be executed by the hardware 840 to instantiate a set of one or more networking software instance(s) 864 which cause the network device 804 to perform the operations described with reference to Figures 1-7.
[0069] The instantiation of the one or more sets of one or more applications 864A-R, as well as virtualization if implemented, are collectively referred to as software instance(s) 852. Each set of applications 864A-R, corresponding virtualization construct (e.g., instance 862A-R) if implemented, and that part of the hardware 840 that executes them (be it hardware dedicated to that execution and/or time slices of hardware temporally shared), forms a separate virtual network element(s) 860A-R.
[0070] The virtual network element(s) 860A-R perform similar functionality to the virtual network element(s) 830A-R - e.g., similar to the control communication and configuration module(s) 832A and forwarding table(s) 834A (this virtualization of the hardware 840 is sometimes referred to as network function virtualization (NFV)). Thus, NFV may be used to consolidate many network equipment types onto industry standard high volume server hardware, physical switches, and physical storage, which could be located in data centers, NDs, and customer premise equipment (CPE). While embodiments of the invention are illustrated with each instance 862A-R corresponding to one VNE 860A-R, alternative embodiments may implement this correspondence at a finer level of granularity (e.g., line card virtual machines virtualize line cards, control card virtual machines virtualize control cards, etc.); it should be understood that the techniques described herein with reference to a correspondence of instances 862A-R to VNEs also apply to embodiments where such a finer level of granularity and/or unikernels are used.
[0071] In certain embodiments, the virtualization layer 854 includes a virtual switch that provides similar forwarding services as a physical Ethernet switch. Specifically, this virtual switch forwards traffic between instances 862A-R and the NIC(s) 844, as well as optionally between the instances 862A-R; in addition, this virtual switch may enforce network isolation between the VNEs 860A-R that by policy are not permitted to communicate with each other (e.g., by honoring virtual local area networks (VLANs)).
[0072] The third exemplary ND implementation in Figure 8A is a hybrid network device 806, which includes both custom ASICs/special-purpose OS and COTS processors/standard OS in a single ND or a single card within an ND. In certain embodiments of such a hybrid network device, a platform VM (i.e., a VM that implements the functionality of the special-purpose network device 802) could provide for para-virtualization to the networking hardware present in the hybrid network device 806.
[0073] Regardless of the above exemplary implementations of an ND, when a single one of multiple VNEs implemented by an ND is being considered (e.g., only one of the VNEs is part of a given virtual network) or where only a single VNE is currently being implemented by an ND, the shortened term network element (NE) is sometimes used to refer to that VNE. Also in all of the above exemplary implementations, each of the VNEs (e.g., VNE(s) 830A-R, VNEs 860A-R, and those in the hybrid network device 806) receives data on the physical NIs (e.g., 816, 846) and forwards that data out the appropriate ones of the physical NIs (e.g., 816, 846). For example, a VNE implementing IP router functionality forwards IP packets on the basis of some of the IP header information in the IP packet; where IP header information includes source IP address, destination IP address, source port, destination port (where "source port" and "destination port" refer herein to protocol ports, as opposed to physical ports of a ND), transport protocol (e.g., user datagram protocol (UDP), Transmission Control Protocol (TCP)), and differentiated services code point (DSCP) values.
[0074] Figure 8C illustrates various exemplary ways in which VNEs may be coupled according to some embodiments of the invention. Figure 8C shows VNEs 870A.1-870A.P (and optionally VNEs 870A.Q-870A.R) implemented in ND 800A and VNE 870H.1 in ND 800H. In Figure 8C, VNEs 870A.1-P are separate from each other in the sense that they can receive packets from outside ND 800A and forward packets outside of ND 800A; VNE 870A.1 is coupled with VNE 870H.1, and thus they communicate packets between their respective NDs; VNE 870A.2-870A.3 may optionally forward packets between themselves without forwarding them outside of the ND 800A; and VNE 870A.P may optionally be the first in a chain of VNEs that includes VNE 870A.Q followed by VNE 870A.R (this is sometimes referred to as dynamic service chaining, where each of the VNEs in the series of VNEs provides a different service - e.g., one or more layer 4-7 network services). While Figure 8C illustrates various exemplary relationships between the VNEs, alternative embodiments may support other relationships (e.g., more/fewer VNEs, more/fewer dynamic service chains, multiple different dynamic service chains with some common VNEs and some different VNEs).
[0075] The NDs of Figure 8A, for example, may form part of the Internet or a private network; and other electronic devices (not shown; such as end user devices including workstations, laptops, netbooks, tablets, palm tops, mobile phones, smartphones, phablets, multimedia phones, Voice Over Internet Protocol (VOIP) phones, terminals, portable media players, GPS units, wearable devices, gaming systems, set-top boxes, Internet enabled household appliances) may be coupled to the network (directly or through other networks such as access networks) to communicate over the network (e.g., the Internet or virtual private networks (VPNs) overlaid on (e.g., tunneled through) the Internet) with each other (directly or through servers) and/or access content and/or services. Such content and/or services are typically provided by one or more servers (not shown) belonging to a service/content provider or one or more end user devices (not shown) participating in a peer-to-peer (P2P) service, and may include, for example, public webpages (e.g., free content, store fronts, search services), private webpages (e.g., username/password accessed webpages providing email services), and/or corporate networks over VPNs. For instance, end user devices may be coupled (e.g., through customer premise equipment coupled to an access network (wired or wirelessly)) to edge NDs, which are coupled (e.g., through one or more core NDs) to other edge NDs, which are coupled to electronic devices acting as servers. However, through compute and storage virtualization, one or more of the electronic devices operating as the NDs in Figure 8A may also host one or more such servers (e.g., in the case of the general purpose network device 804, one or more of the software instances 862A-R may operate as servers; the same would be true for the hybrid network device 806; in the case of the special-purpose network device 802, one or more such servers could also be run on a virtualization layer executed by the compute resource(s) 812); in which case the servers are said to be co-located with the VNEs of that ND.
[0076] A virtual network is a logical abstraction of a physical network (such as that in Figure 8A) that provides network services (e.g., L2 and/or L3 services). A virtual network can be implemented as an overlay network (sometimes referred to as a network virtualization overlay) that provides network services (e.g., layer 2 (L2, data link layer) and/or layer 3 (L3, network layer) services) over an underlay network (e.g., an L3 network, such as an Internet Protocol (IP) network that uses tunnels (e.g., generic routing encapsulation (GRE), layer 2 tunneling protocol (L2TP), IPSec) to create the overlay network).
[0077] A network virtualization edge (NVE) sits at the edge of the underlay network and participates in implementing the network virtualization; the network-facing side of the NVE uses the underlay network to tunnel frames to and from other NVEs; the outward-facing side of the NVE sends and receives data to and from systems outside the network. A virtual network instance (VNI) is a specific instance of a virtual network on a NVE (e.g., a NE/VNE on an ND, a part of a NE/VNE on a ND where that NE/VNE is divided into multiple VNEs through emulation); one or more VNIs can be instantiated on an NVE (e.g., as different VNEs on an ND). A virtual access point (VAP) is a logical connection point on the NVE for connecting external systems to a virtual network; a VAP can be physical or virtual ports identified through logical interface identifiers (e.g., a VLAN ID).
[0078] Examples of network services include: 1) an Ethernet LAN emulation service (an Ethernet-based multipoint service similar to an Internet Engineering Task Force (IETF) Multiprotocol Label Switching (MPLS) or Ethernet VPN (EVPN) service) in which external systems are interconnected across the network by a LAN environment over the underlay network (e.g., an NVE provides separate L2 VNIs (virtual switching instances) for different such virtual networks, and L3 (e.g., IP/MPLS) tunneling encapsulation across the underlay network); and 2) a virtualized IP forwarding service (similar to IETF IP VPN (e.g., Border Gateway Protocol (BGP)/MPLS IP VPN) from a service definition perspective) in which external systems are interconnected across the network by an L3 environment over the underlay network (e.g., an NVE provides separate L3 VNIs (forwarding and routing instances) for different such virtual networks, and L3 (e.g., IP/MPLS) tunneling encapsulation across the underlay network). Network services may also include quality of service capabilities (e.g., traffic classification marking, traffic conditioning and scheduling), security capabilities (e.g., filters to protect customer premises from network-originated attacks, to avoid malformed route announcements), and management capabilities (e.g., full detection and processing).
[0079] Figure 8D illustrates a network with a single network element on each of the NDs of Figure 8A, and within this straightforward approach contrasts a traditional distributed approach (commonly used by traditional routers) with a centralized approach for maintaining reachability and forwarding information (also called network control), according to some embodiments of the invention. Specifically, Figure 8D illustrates network elements (NEs) 870A-H with the same connectivity as the NDs 800A-H of Figure 8A.
[0080] Figure 8D illustrates that the distributed approach 872 distributes responsibility for generating the reachability and forwarding information across the NEs 870A-H; in other words, the process of neighbor discovery and topology discovery is distributed.
[0081] For example, where the special-purpose network device 802 is used, the control communication and configuration module(s) 832A-R of the ND control plane 824 typically include a reachability and forwarding information module to implement one or more routing protocols (e.g., an exterior gateway protocol such as Border Gateway Protocol (BGP), Interior Gateway Protocol(s) (IGP) (e.g., Open Shortest Path First (OSPF), Intermediate System to Intermediate System (IS-IS), Routing Information Protocol (RIP)), Label Distribution Protocol (LDP), Resource Reservation Protocol (RSVP) (including RSVP-Traffic Engineering (TE): Extensions to RSVP for LSP Tunnels and Generalized Multi-Protocol Label Switching (GMPLS) Signaling RSVP-TE)) that communicate with other NEs to exchange routes, and then select those routes based on one or more routing metrics. Thus, the NEs 870A-H (e.g., the compute resource(s) 812 executing the control communication and configuration module(s) 832A-R) perform their responsibility for participating in controlling how data (e.g., packets) is to be routed (e.g., the next hop for the data and the outgoing physical NI for that data) by distributively determining the reachability within the network and calculating their respective forwarding information. Routes and adjacencies are stored in one or more routing structures (e.g., Routing Information Base (RIB), Label Information Base (LIB), one or more adjacency structures) on the ND control plane 824. The ND control plane 824 programs the ND forwarding plane 826 with information (e.g., adjacency and route information) based on the routing structure(s). For example, the ND control plane 824 programs the adjacency and route information into one or more forwarding table(s) 834A-R (e.g., Forwarding Information Base (FIB), Label Forwarding Information Base (LFIB), and one or more adjacency structures) on the ND forwarding plane 826. For layer 2 forwarding, the ND can store one or more bridging tables that are used to forward data based on the layer 2 information in that data. While the above example uses the special-purpose network device 802, the same distributed approach 872 can be implemented on the general purpose network device 804 and the hybrid network device 806.
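As a rough illustration of how a control plane might derive forwarding entries from routes learned through routing protocols, the following sketch selects the lowest-metric route per prefix from a RIB and produces a FIB. The `Route` tuple, the `build_fib` function, the single-metric comparison, and the sample addresses are assumptions made for illustration; real implementations apply protocol-specific preference rules.

```python
from collections import namedtuple

# Hypothetical RIB entry: one candidate route learned from some routing protocol.
Route = namedtuple("Route", "prefix next_hop out_ni metric")


def build_fib(rib):
    """Select the lowest-metric route per prefix (assumed selection rule)
    and return a FIB mapping prefix -> (next hop, outgoing physical NI)."""
    best = {}
    for route in rib:
        current = best.get(route.prefix)
        if current is None or route.metric < current.metric:
            best[route.prefix] = route
    return {prefix: (r.next_hop, r.out_ni) for prefix, r in best.items()}


rib = [
    Route("10.0.0.0/24", "192.0.2.1", "ni-1", metric=20),
    Route("10.0.0.0/24", "192.0.2.2", "ni-2", metric=10),
    Route("10.0.1.0/24", "192.0.2.1", "ni-1", metric=5),
]
fib = build_fib(rib)  # what the control plane would program into the forwarding plane
```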
[0082] Figure 8D illustrates a centralized approach 874 (also known as software defined networking (SDN)) that decouples the system that makes decisions about where traffic is sent from the underlying systems that forward traffic to the selected destination. The illustrated centralized approach 874 has the responsibility for the generation of reachability and forwarding information in a centralized control plane 876 (sometimes referred to as a SDN control module, controller, network controller, OpenFlow controller, SDN controller, control plane node, network virtualization authority, or management control entity), and thus the process of neighbor discovery and topology discovery is centralized. The centralized control plane 876 has a south bound interface 882 with a data plane 880 (sometimes referred to as the infrastructure layer, network forwarding plane, or forwarding plane (which should not be confused with a ND forwarding plane)) that includes the NEs 870A-H (sometimes referred to as switches, forwarding elements, data plane elements, or nodes). The centralized control plane 876 includes a network controller 878, which includes a centralized reachability and forwarding information module 879 that determines the reachability within the network and distributes the forwarding information to the NEs 870A-H of the data plane 880 over the south bound interface 882 (which may use the OpenFlow protocol). Thus, the network intelligence is centralized in the centralized control plane 876 executing on electronic devices that are typically separate from the NDs.
[0083] For example, where the special-purpose network device 802 is used in the data plane 880, each of the control communication and configuration module(s) 832A-R of the ND control plane 824 typically includes a control agent that provides the VNE side of the south bound interface 882. In this case, the ND control plane 824 (the compute resource(s) 812 executing the control communication and configuration module(s) 832A-R) performs its responsibility for participating in controlling how data (e.g., packets) is to be routed (e.g., the next hop for the data and the outgoing physical NI for that data) through the control agent communicating with the centralized control plane 876 to receive the forwarding information (and in some cases, the reachability information) from the centralized reachability and forwarding information module 879 (it should be understood that in some embodiments of the invention, the control communication and configuration module(s) 832A-R, in addition to communicating with the centralized control plane 876, may also play some role in determining reachability and/or calculating forwarding information - albeit less so than in the case of a distributed approach; such embodiments are generally considered to fall under the centralized approach 874, but may also be considered a hybrid approach). During operation, the multi-destination unit 881 when executed causes the centralized control plane and the respective network devices of the data plane to perform the operations described with reference to Figures 1-7.
[0084] While the above example uses the special-purpose network device 802, the same centralized approach 874 can be implemented with the general purpose network device 804 (e.g., each of the VNE 860A-R performs its responsibility for controlling how data (e.g., packets) is to be routed (e.g., the next hop for the data and the outgoing physical NI for that data) by communicating with the centralized control plane 876 to receive the forwarding information (and in some cases, the reachability information) from the centralized reachability and forwarding information module 879; it should be understood that in some embodiments of the invention, the VNEs 860A-R, in addition to communicating with the centralized control plane 876, may also play some role in determining reachability and/or calculating forwarding information - albeit less so than in the case of a distributed approach) and the hybrid network device 806. In fact, the use of SDN techniques can enhance the NFV techniques typically used in the general purpose network device 804 or hybrid network device 806 implementations as NFV is able to support SDN by providing an infrastructure upon which the SDN software can be run, and NFV and SDN both aim to make use of commodity server hardware and physical switches.
[0085] Figure 8D also shows that the centralized control plane 876 has a north bound interface 884 to an application layer 886, in which resides application(s) 888. The centralized control plane 876 has the ability to form virtual networks 892 (sometimes referred to as a logical forwarding plane, network services, or overlay networks (with the NEs 870A-H of the data plane 880 being the underlay network)) for the application(s) 888. Thus, the centralized control plane 876 maintains a global view of all NDs and configured NEs/VNEs, and it maps the virtual networks to the underlying NDs efficiently (including maintaining these mappings as the physical network changes either through hardware (ND, link, or ND component) failure, addition, or removal).
[0086] While Figure 8D shows the distributed approach 872 separate from the centralized approach 874, the effort of network control may be distributed differently or the two combined in certain embodiments of the invention. For example: 1) embodiments may generally use the centralized approach (SDN) 874, but have certain functions delegated to the NEs (e.g., the distributed approach may be used to implement one or more of fault monitoring, performance monitoring, protection switching, and primitives for neighbor and/or topology discovery); or 2) embodiments of the invention may perform neighbor discovery and topology discovery via both the centralized control plane and the distributed protocols, and the results compared to raise exceptions where they do not agree. Such embodiments are generally considered to fall under the centralized approach 874, but may also be considered a hybrid approach.
[0087] While Figure 8D illustrates the simple case where each of the NDs 800A-H implements a single NE 870A-H, it should be understood that the network control approaches described with reference to Figure 8D also work for networks where one or more of the NDs 800A-H implement multiple VNEs (e.g., VNEs 830A-R, VNEs 860A-R, those in the hybrid network device 806). Alternatively or in addition, the network controller 878 may also emulate the implementation of multiple VNEs in a single ND. Specifically, instead of (or in addition to) implementing multiple VNEs in a single ND, the network controller 878 may present the implementation of a VNE/NE in a single ND as multiple VNEs in the virtual networks 892 (all in the same one of the virtual network(s) 892, each in different ones of the virtual network(s) 892, or some combination). For example, the network controller 878 may cause an ND to implement a single VNE (a NE) in the underlay network, and then logically divide up the resources of that NE within the centralized control plane 876 to present different VNEs in the virtual network(s) 892 (where these different VNEs in the overlay networks are sharing the resources of the single VNE/NE implementation on the ND in the underlay network).
[0088] On the other hand, Figures 8E and 8F respectively illustrate exemplary abstractions of NEs and VNEs that the network controller 878 may present as part of different ones of the virtual networks 892. Figure 8E illustrates the simple case of where each of the NDs 800A-H implements a single NE 870A-H (see Figure 8D), but the centralized control plane 876 has abstracted multiple of the NEs in different NDs (the NEs 870A-C and G-H) into (to represent) a single NE 870I in one of the virtual network(s) 892 of Figure 8D, according to some embodiments of the invention. Figure 8E shows that in this virtual network, the NE 870I is coupled to NE 870D and 870F, which are both still coupled to NE 870E.
[0089] Figure 8F illustrates a case where multiple VNEs (VNE 870A.1 and VNE 870H.1) are implemented on different NDs (ND 800A and ND 800H) and are coupled to each other, and where the centralized control plane 876 has abstracted these multiple VNEs such that they appear as a single VNE 870T within one of the virtual networks 892 of Figure 8D, according to some embodiments of the invention. Thus, the abstraction of a NE or VNE can span multiple NDs.
[0090] While some embodiments of the invention implement the centralized control plane 876 as a single entity (e.g., a single instance of software running on a single electronic device), alternative embodiments may spread the functionality across multiple entities for redundancy and/or scalability purposes (e.g., multiple instances of software running on different electronic devices).
[0091] Similar to the network device implementations, the electronic device(s) running the centralized control plane 876, and thus the network controller 878 including the centralized reachability and forwarding information module 879, may be implemented in a variety of ways (e.g., a special purpose device, a general-purpose (e.g., COTS) device, or hybrid device). These electronic device(s) would similarly include compute resource(s), a set of one or more physical NICs, and a non-transitory machine-readable storage medium having stored thereon the centralized control plane software. For instance, Figure 9 illustrates a general purpose control plane device 904 including hardware 940 comprising a set of one or more processor(s) 942 (which are often COTS processors) and network interface controller(s) 944 (NICs; also known as network interface cards) (which include physical NIs 946), as well as non-transitory machine readable storage media 948 having stored therein centralized control plane (CCP) software 950.
[0092] In embodiments that use compute virtualization, the processor(s) 942 typically execute software to instantiate a virtualization layer 954 (e.g., in one embodiment the virtualization layer 954 represents the kernel of an operating system (or a shim executing on a base operating system) that allows for the creation of multiple instances 962A-R called software containers (representing separate user spaces and also called virtualization engines, virtual private servers, or jails) that may each be used to execute a set of one or more applications; in another embodiment the virtualization layer 954 represents a hypervisor (sometimes referred to as a virtual machine monitor (VMM)) or a hypervisor executing on top of a host operating system, and an application is run on top of a guest operating system within an instance 962A-R called a virtual machine (which in some cases may be considered a tightly isolated form of software container) that is run by the hypervisor ; in another embodiment, an application is implemented as a unikernel, which can be generated by compiling directly with an application only a limited set of libraries (e.g., from a library operating system (LibOS) including drivers/libraries of OS services) that provide the particular OS services needed by the application, and the unikernel can run directly on hardware 940, directly on a hypervisor represented by virtualization layer 954 (in which case the unikernel is sometimes described as running within a LibOS virtual machine), or in a software container represented by one of instances 962A-R). Again, in embodiments where compute virtualization is used, during operation an instance of the CCP software 950 (illustrated as CCP instance 976A) is executed (e.g., within the instance 962A) on the virtualization layer 954. 
In embodiments where compute virtualization is not used, the CCP instance 976A is executed, as a unikernel or on top of a host operating system, on the "bare metal" general purpose control plane device 904. The instantiation of the CCP instance 976A, as well as the virtualization layer 954 and instances 962A-R if implemented, are collectively referred to as software instance(s) 952.
[0093] In some embodiments, the CCP instance 976A includes a network controller instance 978. The network controller instance 978 includes a centralized reachability and forwarding information module instance 979 (which is a middleware layer providing the context of the network controller 878 to the operating system and communicating with the various NEs), and a CCP application layer 980 (sometimes referred to as an application layer) over the middleware layer (providing the intelligence required for various network operations such as protocols, network situational awareness, and user interfaces). At a more abstract level, this CCP application layer 980 within the centralized control plane 876 works with virtual network view(s) (logical view(s) of the network) and the middleware layer provides the conversion from the virtual networks to the physical view. During operation, the multi-destination unit 979 when executed causes the network controller instance 978 and the respective network devices of the data plane controlled by this instance to perform the operations described with reference to Figures 1-7.
[0094] The centralized control plane 876 transmits relevant messages to the data plane 880 based on CCP application layer 980 calculations and middleware layer mapping for each flow. A flow may be defined as a set of packets whose headers match a given pattern of bits; in this sense, traditional IP forwarding is also flow-based forwarding where the flows are defined by the destination IP address for example; however, in other implementations, the given pattern of bits used for a flow definition may include more fields (e.g., 10 or more) in the packet headers. Different NDs/NEs/VNEs of the data plane 880 may receive different messages, and thus different forwarding information. The data plane 880 processes these messages and programs the appropriate flow information and corresponding actions in the forwarding tables (sometimes referred to as flow tables) of the appropriate NE/VNEs, and then the NEs/VNEs map incoming packets to flows represented in the forwarding tables and forward packets based on the matches in the forwarding tables.
[0095] Standards such as OpenFlow define the protocols used for the messages, as well as a model for processing the packets. The model for processing packets includes header parsing, packet classification, and making forwarding decisions. Header parsing describes how to interpret a packet based upon a well-known set of protocols. Some protocol fields are used to build a match structure (or key) that will be used in packet classification (e.g., a first key field could be a source media access control (MAC) address, and a second key field could be a destination MAC address).
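The header parsing step can be sketched as follows, assuming an untagged Ethernet II frame (6-byte destination MAC, 6-byte source MAC, 2-byte EtherType) and a two-field key of source and destination MAC address as in the example above. The function name `parse_ethernet_key` and the sample frame are hypothetical.

```python
import struct


def parse_ethernet_key(frame: bytes):
    """Build a (source MAC, destination MAC) match key from an
    Ethernet II header. Assumes an untagged frame (no 802.1Q tag)."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # On the wire the destination MAC comes first, then the source MAC.
    dst, src, _ethertype = struct.unpack("!6s6sH", frame[:14])

    def fmt(mac: bytes) -> str:
        return ":".join(f"{b:02x}" for b in mac)

    return (fmt(src), fmt(dst))


# Broadcast destination, locally administered source, EtherType 0x0800 (IPv4).
frame = bytes.fromhex("ffffffffffff" "020000000001" "0800") + b"payload"
key = parse_ethernet_key(frame)
```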
[0096] Packet classification involves executing a lookup in memory to classify the packet by determining which entry (also referred to as a forwarding table entry or flow entry) in the forwarding tables best matches the packet based upon the match structure, or key, of the forwarding table entries. It is possible that many flows represented in the forwarding table entries can correspond/match to a packet; in this case the system is typically configured to determine one forwarding table entry from the many according to a defined scheme (e.g., selecting a first forwarding table entry that is matched). Forwarding table entries include both a specific set of match criteria (a set of values or wildcards, or an indication of what portions of a packet should be compared to a particular value/values/wildcards, as defined by the matching capabilities - for specific fields in the packet header, or for some other packet content), and a set of one or more actions for the data plane to take on receiving a matching packet. For example, an action may be to push a header onto the packet, forward the packet using a particular port, flood the packet, or simply drop the packet. Thus, a forwarding table entry for IPv4/IPv6 packets with a particular transmission control protocol (TCP) destination port could contain an action specifying that these packets should be dropped.
[0097] Making forwarding decisions and performing actions occurs, based upon the forwarding table entry identified during packet classification, by executing the set of actions identified in the matched forwarding table entry on the packet.
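Packet classification and action execution as described above can be sketched with a small flow table. The sketch assumes a first-match scheme, represents a wildcard as `None`, and models packets as dictionaries of header fields; `FlowEntry`, `classify`, `forward`, and the action tuples are illustrative names only and do not correspond to any particular OpenFlow API.

```python
from dataclasses import dataclass

WILDCARD = None  # a match field set to None matches any value (assumption)


@dataclass
class FlowEntry:
    match: dict    # header field name -> required value, or WILDCARD
    actions: list  # e.g. [("drop",)] or [("output", "port-2")]

    def matches(self, packet: dict) -> bool:
        return all(v is WILDCARD or packet.get(k) == v
                   for k, v in self.match.items())


def classify(table, packet):
    """Return the first matching forwarding table entry, or None on a miss."""
    for entry in table:
        if entry.matches(packet):
            return entry
    return None


def forward(table, packet):
    """Return the actions of the matched entry; a miss is reported as ("miss",)."""
    entry = classify(table, packet)
    return entry.actions if entry is not None else [("miss",)]


table = [
    # Drop TCP packets to one destination port, as in the example above.
    FlowEntry({"ip_proto": "tcp", "dst_port": 23}, [("drop",)]),
    FlowEntry({"dst_ip": "10.0.0.5", "dst_port": WILDCARD}, [("output", "port-2")]),
]
```

Because the table is scanned in order, placing the drop entry first gives it priority over the broader entry for 10.0.0.5.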
[0098] However, when an unknown packet (for example, a "missed packet" or a "match-miss" as used in OpenFlow parlance) arrives at the data plane 880, the packet (or a subset of the packet header and content) is typically forwarded to the centralized control plane 876. The centralized control plane 876 will then program forwarding table entries into the data plane 880 to accommodate packets belonging to the flow of the unknown packet. Once a specific forwarding table entry has been programmed into the data plane 880 by the centralized control plane 876, the next packet with matching credentials will match that forwarding table entry and take the set of actions associated with that matched entry.
[0099] A network interface (NI) may be physical or virtual; and in the context of IP, an interface address is an IP address assigned to a NI, be it a physical NI or virtual NI. A virtual NI may be associated with a physical NI, with another virtual interface, or stand on its own (e.g., a loopback interface, a point-to-point protocol interface). A NI (physical or virtual) may be numbered (a NI with an IP address) or unnumbered (a NI without an IP address). A loopback interface (and its loopback address) is a specific type of virtual NI (and IP address) of a NE/VNE (physical or virtual) often used for management purposes; where such an IP address is referred to as the nodal loopback address. The IP address(es) assigned to the NI(s) of a ND are referred to as IP addresses of that ND; at a more granular level, the IP address(es) assigned to NI(s) assigned to a NE/VNE implemented on a ND can be referred to as IP addresses of that NE/VNE.
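The match-miss handling of paragraph [0098] can be sketched as follows. The sketch assumes flows are keyed by destination address only and that the controller holds a static policy table; the `Controller` and `DataPlane` classes and their methods are hypothetical stand-ins for the centralized control plane 876 and the data plane 880, not an actual controller interface.

```python
class Controller:
    """Stand-in for the centralized control plane (assumed static policy)."""

    def __init__(self, policy):
        self.policy = policy      # dst address -> output port
        self.packets_seen = 0     # how many packets were punted to the controller

    def handle_miss(self, packet):
        # Decide on an action for the unknown flow and return the entry to install.
        self.packets_seen += 1
        port = self.policy.get(packet["dst"], "flood")
        return packet["dst"], ("output", port)


class DataPlane:
    """Stand-in for the data plane with a single forwarding table."""

    def __init__(self, controller):
        self.controller = controller
        self.flow_table = {}      # dst address -> action

    def process(self, packet):
        action = self.flow_table.get(packet["dst"])
        if action is None:
            # Match-miss: punt to the controller, then install the new entry.
            key, action = self.controller.handle_miss(packet)
            self.flow_table[key] = action
        return action


controller = Controller({"aa:aa": "port-1"})
dp = DataPlane(controller)
first = dp.process({"dst": "aa:aa"})   # miss: handled via the controller
second = dp.process({"dst": "aa:aa"})  # hit: handled locally
```

Only the first packet of the flow reaches the controller; the second matches the installed entry in the data plane.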
[00100] Next hop selection by the routing system for a given destination may resolve to one path (that is, a routing protocol may generate one next hop on a shortest path); but if the routing system determines there are multiple viable next hops (that is, the routing protocol generated forwarding solution offers more than one next hop on a shortest path - multiple equal cost next hops), some additional criteria is used - for instance, in a connectionless network, Equal Cost Multi Path (ECMP) (also known as Equal Cost Multi Pathing, multipath forwarding and IP multipath) may be used (e.g., typical implementations use as the criteria particular header fields to ensure that the packets of a particular packet flow are always forwarded on the same next hop to preserve packet flow ordering). For purposes of multipath forwarding, a packet flow is defined as a set of packets that share an ordering constraint. As an example, the set of packets in a particular TCP transfer sequence need to arrive in order, else the TCP logic will interpret the out of order delivery as congestion and slow the TCP transfer rate down.
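One common way to keep all packets of a flow on the same next hop is to hash selected header fields and use the result to index the set of equal cost next hops. The sketch below assumes the flow is identified by the usual 5-tuple and uses CRC32 purely for illustration; the function name `ecmp_next_hop` and the field names are hypothetical, and actual devices may use different fields and hash functions.

```python
import zlib


def ecmp_next_hop(packet: dict, next_hops: list) -> str:
    """Pick one of several equal cost next hops by hashing the 5-tuple,
    so every packet of the same flow takes the same next hop."""
    five_tuple = (packet["src_ip"], packet["dst_ip"], packet["proto"],
                  packet["src_port"], packet["dst_port"])
    digest = zlib.crc32(repr(five_tuple).encode())  # deterministic across runs
    return next_hops[digest % len(next_hops)]


next_hops = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
pkt = {"src_ip": "10.0.0.1", "dst_ip": "10.0.1.1", "proto": "tcp",
       "src_port": 40000, "dst_port": 443}
chosen = ecmp_next_hop(pkt, next_hops)
```

Because the hash depends only on header fields that are constant within a flow, packet ordering within that flow is preserved, at the cost of possibly uneven load when a few flows dominate the traffic.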
[00101] A Layer 3 (L3) Link Aggregation (LAG) link is a link directly connecting two NDs with multiple IP-addressed link paths (each link path is assigned a different IP address), and a load distribution decision across these different link paths is performed at the ND forwarding plane; in which case, a load distribution decision is made between the link paths.
[00102] Some NDs include functionality for authentication, authorization, and accounting (AAA) protocols (e.g., RADIUS (Remote Authentication Dial-In User Service), Diameter, and/or TACACS+ (Terminal Access Controller Access Control System Plus)). AAA can be provided through a client/server model, where the AAA client is implemented on a ND and the AAA server can be implemented either locally on the ND or on a remote electronic device coupled with the ND. Authentication is the process of identifying and verifying a subscriber. For instance, a subscriber might be identified by a combination of a username and a password or through a unique key. Authorization determines what a subscriber can do after being authenticated, such as gaining access to certain electronic device information resources (e.g., through the use of access control policies). Accounting is recording user activity. By way of a summary example, end user devices may be coupled (e.g., through an access network) through an edge ND (supporting AAA processing) coupled to core NDs coupled to electronic devices implementing servers of service/content providers. AAA processing is performed to identify for a subscriber the subscriber record stored in the AAA server for that subscriber. A subscriber record includes a set of attributes (e.g., subscriber name, password, authentication information, access control information, rate-limiting information, policing information) used during processing of that subscriber's traffic.
[00103] Certain NDs (e.g., certain edge NDs) internally represent end user devices (or sometimes customer premise equipment (CPE) such as a residential gateway (e.g., a router, modem)) using subscriber circuits. A subscriber circuit uniquely identifies within the ND a subscriber session and typically exists for the lifetime of the session. Thus, a ND typically allocates a subscriber circuit when the subscriber connects to that ND, and correspondingly deallocates that subscriber circuit when that subscriber disconnects. Each subscriber session represents a distinguishable flow of packets communicated between the ND and an end user device (or sometimes CPE such as a residential gateway or modem) using a protocol, such as the point-to-point protocol over another protocol (PPPoX) (e.g., where X is Ethernet or Asynchronous Transfer Mode (ATM)), Ethernet, 802.1Q Virtual LAN (VLAN), Internet Protocol, or ATM. A subscriber session can be initiated using a variety of mechanisms (e.g., manual provisioning, a dynamic host configuration protocol (DHCP), DHCP/client-less internet protocol service (CLIPS), or Media Access Control (MAC) address tracking). For example, the point-to-point protocol (PPP) is commonly used for digital subscriber line (DSL) services and requires installation of a PPP client that enables the subscriber to enter a username and a password, which in turn may be used to select a subscriber record. When DHCP is used (e.g., for cable modem services), a username typically is not provided; but in such situations other information (e.g., information that includes the MAC address of the hardware in the end user device (or CPE)) is provided. The use of DHCP and CLIPS on the ND captures the MAC addresses and uses these addresses to distinguish subscribers and access their subscriber records.
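As a hedged sketch of the subscriber circuit lifecycle described above, the following example allocates a circuit when a subscriber connects and deallocates it on disconnect, using the MAC address to select the subscriber record as in the DHCP/CLIPS case. The class `EdgeND`, its methods, and the dictionary-based records are assumptions made for illustration only.

```python
import itertools


class EdgeND:
    """Toy edge ND that tracks subscriber circuits (hypothetical model)."""

    def __init__(self, records_by_mac):
        # Subscriber records keyed by lower-case MAC address (assumption).
        self.records_by_mac = records_by_mac
        # circuit id -> (mac, subscriber record)
        self.circuits = {}
        self._ids = itertools.count(1)

    def connect(self, mac):
        # Use the captured MAC address to find the subscriber record.
        key = mac.lower()
        record = self.records_by_mac.get(key)
        if record is None:
            return None  # unknown subscriber: no circuit is allocated
        circuit_id = next(self._ids)
        self.circuits[circuit_id] = (key, record)
        return circuit_id

    def disconnect(self, circuit_id):
        # The circuit exists only for the lifetime of the session.
        self.circuits.pop(circuit_id, None)
```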
[00104] A virtual circuit (VC), synonymous with virtual connection and virtual channel, is a connection oriented communication service that is delivered by means of packet mode communication. Virtual circuit communication resembles circuit switching, since both are connection oriented, meaning that in both cases data is delivered in correct order, and signaling overhead is required during a connection establishment phase. Virtual circuits may exist at different layers. For example, at layer 4, a connection oriented transport layer protocol such as Transmission Control Protocol (TCP) may rely on a connectionless packet switching network layer protocol such as IP, where different packets may be routed over different paths, and thus be delivered out of order. Where a reliable virtual circuit is established with TCP on top of the underlying unreliable and connectionless IP protocol, the virtual circuit is identified by the source and destination network socket address pair, i.e. the sender and receiver IP address and port number. However, a virtual circuit is possible since TCP includes segment numbering and reordering on the receiver side to prevent out-of-order delivery. Virtual circuits are also possible at Layer 3 (network layer) and Layer 2 (datalink layer); such virtual circuit protocols are based on connection oriented packet switching, meaning that data is always delivered along the same network path, i.e. through the same NEs/VNEs. In such protocols, the packets are not routed individually and complete addressing information is not provided in the header of each data packet; only a small virtual channel identifier (VCI) is required in each packet; and routing information is transferred to the NEs/VNEs during the connection establishment phase; switching only involves looking up the virtual channel identifier in a table rather than analyzing a complete address. Examples of network layer and datalink layer virtual circuit protocols, where data always is delivered over the same path, include: X.25, where the VC is identified by a virtual channel identifier (VCI); Frame Relay, where the VC is identified by a VCI; Asynchronous Transfer Mode (ATM), where the circuit is identified by a virtual path identifier (VPI) and virtual channel identifier (VCI) pair; General Packet Radio Service (GPRS); and Multiprotocol Label Switching (MPLS), which can be used for IP over virtual circuits (each circuit is identified by a label).
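The table lookup described above can be illustrated with a minimal sketch. It assumes a hypothetical switch whose table is keyed by incoming port and VCI and which rewrites the VCI on output (identifier swapping, as is common in ATM and MPLS); the class and method names are not taken from any real implementation.

```python
class VCSwitch:
    """Toy virtual circuit switch (illustrative assumption)."""

    def __init__(self):
        # (in_port, in_vci) -> (out_port, out_vci)
        self.table = {}

    def establish(self, in_port, in_vci, out_port, out_vci):
        # Routing information is installed once, at connection setup.
        self.table[(in_port, in_vci)] = (out_port, out_vci)

    def switch(self, in_port, in_vci, payload):
        # Per-packet work is a single table lookup on the small identifier;
        # no complete destination address is examined.
        entry = self.table.get((in_port, in_vci))
        if entry is None:
            return None  # no circuit established: drop
        out_port, out_vci = entry
        return out_port, out_vci, payload
```

Because every packet of a circuit hits the same table entry, all packets follow the same path, which is the property that distinguishes these protocols from per-packet IP routing.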
[00105] Certain NDs (e.g., certain edge NDs) use a hierarchy of circuits. The leaf nodes of the hierarchy of circuits are subscriber circuits. The subscriber circuits have parent circuits in the hierarchy that typically represent aggregations of multiple subscriber circuits, and thus the network segments and elements used to provide access network connectivity of those end user devices to the ND. These parent circuits may represent physical or logical aggregations of subscriber circuits (e.g., a virtual local area network (VLAN), a permanent virtual circuit (PVC) (e.g., for Asynchronous Transfer Mode (ATM)), a circuit-group, a channel, a pseudo-wire, a physical NI of the ND, and a link aggregation group). A circuit-group is a virtual construct that allows various sets of circuits to be grouped together for configuration purposes, for example aggregate rate control. A pseudo-wire is an emulation of a layer 2 point-to-point connection-oriented service. A link aggregation group is a virtual construct that merges multiple physical NIs for purposes of bandwidth aggregation and redundancy. Thus, the parent circuits physically or logically encapsulate the subscriber circuits.
[00106] Each VNE (e.g., a virtual router, a virtual bridge (which may act as a virtual switch instance in a Virtual Private LAN Service (VPLS))) is typically independently administrable. For example, in the case of multiple virtual routers, each of the virtual routers may share system resources but is separate from the other virtual routers regarding its management domain, AAA (authentication, authorization, and accounting) name space, IP address, and routing database(s). Multiple VNEs may be employed in an edge ND to provide direct network access and/or different classes of services for subscribers of service and/or content providers.
[00107] Within certain NDs, "interfaces" that are independent of physical NIs may be configured as part of the VNEs to provide higher-layer protocol and service information (e.g., Layer 3 addressing). The subscriber records in the AAA server identify, in addition to the other subscriber configuration requirements, to which context (e.g., which of the VNEs/NEs) the corresponding subscribers should be bound within the ND. As used herein, a binding forms an association between a physical entity (e.g., physical NI, channel) or a logical entity (e.g., circuit such as a subscriber circuit or logical circuit (a set of one or more subscriber circuits)) and a context's interface over which network protocols (e.g., routing protocols, bridging protocols) are configured for that context. Subscriber data flows on the physical entity when some higher-layer protocol interface is configured and associated with that physical entity.
[00108] Some NDs provide support for VPLS (Virtual Private LAN Service). For example, in a VPLS network, end user devices access content/services provided through the VPLS network by coupling to CEs, which are coupled through PEs coupled by other NDs. VPLS networks can be used for implementing triple play network applications (e.g., data applications (e.g., high-speed Internet access), video applications (e.g., television service such as IPTV (Internet Protocol Television), VoD (Video-on-Demand) service), and voice applications (e.g., VoIP (Voice over Internet Protocol) service)), VPN services, etc. VPLS is a type of layer 2 VPN that can be used for multi-point connectivity. VPLS networks also allow end user devices that are coupled with CEs at separate geographical locations to communicate with each other across a Wide Area Network (WAN) as if they were directly attached to each other in a Local Area Network (LAN) (referred to as an emulated LAN).
[00109] In VPLS networks, each CE typically attaches, possibly through an access network (wired and/or wireless), to a bridge module of a PE via an attachment circuit (e.g., a virtual link or connection between the CE and the PE). The bridge module of the PE attaches to an emulated LAN through an emulated LAN interface. Each bridge module acts as a "Virtual Switch Instance" (VSI) by maintaining a forwarding table that maps MAC addresses to pseudowires and attachment circuits. PEs forward frames (received from CEs) to destinations (e.g., other CEs, other PEs) based on the MAC destination address field included in those frames.
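A minimal sketch of a Virtual Switch Instance forwarding table is given below for illustration. It assumes conventional bridge behavior (source MAC learning, and flooding when the destination is unknown), which the paragraph above does not spell out, and it deliberately omits the split-horizon rule normally applied between pseudowires. All names are hypothetical.

```python
class VirtualSwitchInstance:
    """Toy VSI mapping MAC addresses to pseudowires and attachment circuits."""

    def __init__(self, ports):
        # Ports are attachment circuits and pseudowires, named by string.
        self.ports = set(ports)
        self.fdb = {}  # MAC address -> port

    def receive(self, in_port, src_mac, dst_mac):
        # Learn where the source MAC lives (assumed standard bridge learning).
        self.fdb[src_mac] = in_port
        out_port = self.fdb.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress one.
            return sorted(self.ports - {in_port})
        # Known destination: forward on one port, never back out the ingress.
        return [] if out_port == in_port else [out_port]
```

The flooding branch is the multi-destination case: a frame with an unknown destination is replicated toward several ports at once.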
[00110] In the foregoing specification, embodiments of the invention have been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope of the invention as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.
[00111] Throughout the description, embodiments of the present invention have been presented through flow diagrams. It will be appreciated that the transactions described in these flow diagrams, and the order of those transactions, are only intended for illustrative purposes and not intended as a limitation of the present invention. One having ordinary skill in the art would recognize that variations can be made to the flow diagrams without departing from the broader spirit and scope of the invention as set forth in the following claims.

Claims

What is claimed is:
1. A method of efficiently handling multi-destination network packets, the method comprising:
responsive to determining that a first aliasing label (AL12) and a first multi-destination label (ML12) include an identical value, wherein the first aliasing label indicates that a first network device (112) is coupled with a group of links (125), wherein the group of links (125) couples a second network device (101) with two or more network devices from a broadcast domain, and wherein the first multi-destination label (ML12) is to be used for forwarding multi-destination network packets towards the first network device (112), updating a multi-destination forwarding table to exclude the first network device (112), wherein the multi-destination forwarding table is associated with a link (121) of a third network device (111), wherein the link (121) is part of the group of links (125) and couples the second network device with the third network device (111); and
responsive to updating the multi-destination forwarding table, causing multi-destination network packets received at the third network device (111) from the second network device (101) to not be transmitted to the first network device (112).
2. The method of claim 1, wherein the first aliasing label (AL12) and the first multi-destination label (ML12) that include the identical value are a result of a determination that the third network device (111) includes a link that is part of the group of links (125).
3. The method of claim 2, wherein the first aliasing label (AL12) and the first multi-destination label (ML12) that include the identical value are further a result of a determination that the broadcast domain is associated with a single link (122) at the first network device (112), and that the single link (122) is part of the group of links (125) coupling the second network device (101) with the third network device (111).
4. The method of claim 1, further comprising:
responsive to determining that a second aliasing label (AL13) and a second multi-destination label (ML13) include different values, wherein the second aliasing label (AL13) indicates that a fourth network device (113) is coupled with the group of links (125), and wherein the second multi-destination label (ML13) is to be used for forwarding multi-destination network packets towards the fourth network device (113), updating the multi-destination forwarding table to include the fourth network device (113); and
responsive to updating the multi-destination forwarding table, causing multi-destination network packets received at the third network device (111) from the second network device (101) to be transmitted to the fourth network device (113).
5. The method of claim 4, wherein the second aliasing label (AL13) and the second multi-destination label (ML13) that include different values are a result of a determination that the fourth network device (113) includes a link that is part of the group of links (125).
6. The method of claim 5, wherein the second aliasing label (AL13) and the second multi-destination label (ML13) that include different values are a result of a determination that the broadcast domain is associated with two or more links (123, 124) at the fourth network device (113), wherein at least one of the two or more links (124) is not part of the group of links (125).
7. The method of claim 1, wherein the multi-destination forwarding table includes a list of network devices from the broadcast domain to which multi-destination network packets are to be forwarded.
8. The method of claim 1, wherein each of the first aliasing label (AL12) and the first multi-destination label (ML12) are transmitted in a respective one of network reachability information (NRI) messages part of a routing protocol from the first network device (112) to the third network device (111).
9. The method of claim 8, wherein the first aliasing label (AL12) and the first multi-destination label (ML12) are Multiprotocol Label Switching (MPLS) labels and the NRI messages are Border Gateway Protocol (BGP) Network Layer Reachability Information (NLRI) messages.
10. The method of claim 1, wherein the broadcast domain is a virtual local area network (VLAN) of an Ethernet Virtual Private Network (EVPN) instance, and the group of links (125) is an Ethernet segment.
11. A first network device from a plurality of network devices forming a broadcast domain, for handling multi-destination network packets, wherein the first network device is to be coupled with a second network device through a first link from a group of links, wherein the group of links couples the second network device with two or more network devices from the broadcast domain, the first network device comprising:
a non-transitory computer readable storage medium to store instructions; and
one or more processors coupled with the non-transitory computer readable storage medium to process the stored instructions to:
responsive to determining that a first aliasing label (AL12) and a first multi-destination label (ML12) include an identical value, wherein the first aliasing label indicates that a first network device (112) is coupled with a group of links (125), wherein the group of links (125) couples a second network device (101) with two or more network devices from a broadcast domain, and wherein the first multi-destination label (ML12) is to be used for forwarding multi-destination network packets towards the first network device (112), update a multi-destination forwarding table to exclude the first network device (112), wherein the multi-destination forwarding table is associated with a link (121) of a third network device (111), wherein the link (121) is part of the group of links (125) and couples the second network device with the third network device (111); and
responsive to updating the multi-destination forwarding table, cause multi-destination network packets received at the third network device (111) from the second network device (101) to not be transmitted to the first network device (112).
12. The first network device of claim 11, wherein the first aliasing label (AL12) and the first multi-destination label (ML12) that include the identical value are a result of a determination that the third network device (111) includes a link that is part of the group of links (125).
13. The first network device of claim 12, wherein the first aliasing label (AL12) and the first multi-destination label (ML12) that include the identical value are further a result of a determination that the broadcast domain is associated with a single link (122) at the first network device (112), and that the single link (122) is part of the group of links (125) coupling the second network device (101) with the third network device (111).
14. The first network device of claim 11, wherein the one or more processors are further to:
responsive to determining that a second aliasing label (AL13) and a second multi-destination label (ML13) include different values, wherein the second aliasing label (AL13) indicates that a fourth network device (113) is coupled with the group of links (125), and wherein the second multi-destination label (ML13) is to be used for forwarding multi-destination network packets towards the fourth network device (113), update the multi-destination forwarding table to include the fourth network device (113); and
responsive to updating the multi-destination forwarding table, cause multi-destination network packets received at the third network device (111) from the second network device (101) to be transmitted to the fourth network device (113).
15. The first network device of claim 14, wherein the second aliasing label (AL13) and the second multi-destination label (ML13) that include different values are a result of a determination that the fourth network device (113) includes a link that is part of the group of links (125).
16. The first network device of claim 15, wherein the second aliasing label (AL13) and the second multi-destination label (ML13) that include different values are a result of a determination that the broadcast domain is associated with two or more links (123, 124) at the fourth network device (113), wherein at least one of the two or more links (124) is not part of the group of links (125).
17. The first network device of claim 11, wherein the multi-destination forwarding table includes a list of network devices from the broadcast domain to which multi-destination network packets are to be forwarded.
18. The first network device of claim 11, wherein each of the first aliasing label (AL12) and the first multi-destination label (ML12) are transmitted in a respective one of network reachability information (NRI) messages part of a routing protocol from the first network device (112) to the third network device (111).
19. The first network device of claim 18, wherein the first aliasing label (AL12) and the first multi-destination label (ML12) are Multiprotocol Label Switching (MPLS) labels and the NRI messages are Border Gateway Protocol (BGP) Network Layer Reachability Information (NLRI) messages.
20. The first network device of claim 11, wherein the broadcast domain is a virtual local area network (VLAN) of an Ethernet Virtual Private Network (EVPN) instance, and the group of links (125) is an Ethernet segment.
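As a non-limiting illustration only, and not as part of the claims, the label comparison recited in claims 1 and 4 can be sketched as follows. The function name, the use of a set as the multi-destination forwarding table, and the label values are hypothetical and serve only to show the comparison; they are not taken from the specification.

```python
def update_md_forwarding_table(table, peers):
    """Update a multi-destination forwarding table from peer labels.

    table: set of peer device identifiers to which multi-destination
           packets received on the local link are forwarded.
    peers: mapping of peer device identifier to a pair
           (aliasing_label, multi_destination_label) as advertised
           by that peer.
    """
    for device, (aliasing_label, md_label) in peers.items():
        if aliasing_label == md_label:
            # Identical values: exclude the peer, so packets received from
            # the multi-homed device are not sent to it (claim 1).
            table.discard(device)
        else:
            # Different values: include the peer (claim 4).
            table.add(device)
    return table
```

For example, with device 112 advertising the same value for both labels and device 113 advertising different values, the table for link 121 of device 111 would contain only device 113.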
PCT/IB2016/053726 2016-06-23 2016-06-23 Efficient handling of multi-destination traffic in multi-homed ethernet virtual private networks (evpn) WO2017221050A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IB2016/053726 WO2017221050A1 (en) 2016-06-23 2016-06-23 Efficient handling of multi-destination traffic in multi-homed ethernet virtual private networks (evpn)

Publications (1)

Publication Number Publication Date
WO2017221050A1 true WO2017221050A1 (en) 2017-12-28

Family

ID=56511823

Country Status (1)

Country Link
WO (1) WO2017221050A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100008361A1 (en) * 2008-07-08 2010-01-14 Cisco Technology, Inc. Carrier's carrier without customer-edge-to-customer-edge border gateway protocol
US20140294004A1 (en) * 2010-05-19 2014-10-02 Alcatel Lucent Method and apparatus for mpls label allocation for a bgp mac-vpn

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SAJASSI A ET AL: "BGP MPLS-Based Ethernet VPN; rfc7432.txt", BGP MPLS-BASED ETHERNET VPN; RFC7432.TXT, INTERNET ENGINEERING TASK FORCE, IETF; STANDARD, INTERNET SOCIETY (ISOC) 4, RUE DES FALAISES CH- 1205 GENEVA, SWITZERLAND, 18 February 2015 (2015-02-18), pages 1 - 56, XP015104549 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3893447A4 (en) * 2019-01-16 2022-03-16 Huawei Technologies Co., Ltd. Method for creating connectivity detection session, and network device and system
CN110505152A (en) * 2019-09-11 2019-11-26 迈普通信技术股份有限公司 Route filtering method, device and electronic equipment
CN111935013A (en) * 2020-09-17 2020-11-13 南京中兴软件有限责任公司 Flow forwarding control method and device, flow forwarding method and chip, and switch
CN111935013B (en) * 2020-09-17 2021-01-08 南京中兴软件有限责任公司 Flow forwarding control method and device, flow forwarding method and chip, and switch
WO2022109528A1 (en) * 2020-11-23 2022-05-27 Cisco Technology, Inc. Sd-wan multicast replicator selection centralized policy
US11362849B1 (en) 2020-11-23 2022-06-14 Cisco Technology, Inc. SD-WAN multicast replicator selection centralized policy

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16741990

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16741990

Country of ref document: EP

Kind code of ref document: A1