EP4302469A1 - Containerized router with virtual networking - Google Patents

Containerized router with virtual networking

Info

Publication number
EP4302469A1
Authority
EP
European Patent Office
Prior art keywords
containerized
network
computing device
virtual
router
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP22712227.2A
Other languages
English (en)
French (fr)
Inventor
Mahesh SIVAKUMAR
Shailender Sharma
Sachchidanand Vaidya
Pranavadatta D N
Yuvaraja MARIAPPAN
Narendranath Karjala Subramanyam
Sivakumar Ganapathy
Philip M. Goddard
Pavan Kumar Kurapati
Sangarshan Pillareddy
Arijit PAUL
Ashutosh K. GREWAL
Srinivas Akkipeddi
Vinay K Nallamothu
Kiran K N
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Juniper Networks Inc
Original Assignee
Juniper Networks Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/649,643 (US11818647B2)
Application filed by Juniper Networks Inc
Priority to EP23194733.4A (EP4307632A3)
Priority to EP23194723.5A (EP4307639A1)
Publication of EP4302469A1
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/40 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using virtualisation of network functions or resources, e.g. SDN or NFV entities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/58 Association of routers
    • H04L45/586 Association of routers of virtual routers

Definitions

  • a data center may comprise a facility that hosts applications and services for subscribers, i.e., customers of data center.
  • the data center may, for example, host all of the infrastructure equipment, such as networking and storage systems, redundant power supplies, and environmental controls.
  • clusters of storage systems and application servers are interconnected via high-speed switch fabric provided by one or more tiers of physical network switches and routers. More sophisticated data centers provide infrastructure spread throughout the world with subscriber support equipment located in various physical hosting facilities.
  • Virtualized data centers are becoming a core foundation of the modern information technology (IT) infrastructure. In particular, modern data centers have extensively utilized virtualized environments in which virtual hosts, also referred to herein as virtual execution elements, such as virtual machines or containers, are deployed and executed on an underlying compute platform of physical computing devices.
  • virtualization within a data center or any environment that includes one or more servers can provide several advantages.
  • One advantage is that virtualization can provide significant improvements to efficiency.
  • the underlying physical computing devices i.e., servers
  • a second advantage is that virtualization provides significant control over the computing infrastructure.
  • Containerization is a virtualization scheme based on operating system-level virtualization.
  • Containers are light-weight and portable execution elements for applications that are isolated from one another and from the host. Because containers are not tightly coupled to the host hardware computing environment, an application can be tied to a container image and executed as a single light-weight package on any host or virtual host that supports the underlying container architecture. As such, containers address the problem of how to make software work in different computing environments. Containers offer the promise of running consistently from one computing environment to another, virtual or physical.
  • a virtualized cell-site router having containerized applications for implementing distributed units (DUs) on compute nodes.
  • the compute nodes include a DPDK-based virtual router for the data plane.
  • a containerized routing protocol daemon is a routing protocol process that is packaged as a container to run in Linux-based environments.
  • cRPD may be executed in the user space of the host as a containerized process.
  • cRPD provides control plane functionality.
  • Existing implementations of cRPD (running on the host) use the forwarding provided by the Linux kernel. This control plane is thus containerized.
  • a containerized routing protocol daemon interfaces with two disjoint data planes: the kernel network stack for the compute node and the DPDK-based virtual router.
  • the cRPD may leverage the kernel’s networking stack to set up routing exclusively for the DPDK fast path.
  • the routing information cRPD receives can include underlay routing information and overlay routing information.
  • the cRPD may run routing protocols on the vHost interfaces that are visible in the kernel, and the cRPD may install forwarding information base (FIB) updates corresponding to interior gateway protocol (IGP)-learned routes (underlay) in the kernel FIB (e.g., to enable establishment of multi-hop interior Border Gateway Protocol (iBGP) sessions to those destinations).
  • FIB forwarding information base
  • IGP interior gateway protocol
  • iBGP interior Border Gateway Protocol
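The split between the two data planes described above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the `Route` type, the `source` tag, and the `partition_routes` function are hypothetical names, and sending every non-IGP route to the DPDK-based virtual router is an assumption made only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    source: str  # "igp" marks an underlay route; anything else is treated as overlay here


def partition_routes(routes):
    """Split routes into (kernel FIB, virtual router FIB) lists.

    Assumption: IGP-learned underlay routes go to the kernel FIB so that
    multi-hop iBGP sessions can be established; all other routes are
    programmed into the DPDK-based virtual router.
    """
    kernel_fib, vrouter_fib = [], []
    for route in routes:
        if route.source == "igp":
            kernel_fib.append(route)
        else:
            vrouter_fib.append(route)
    return kernel_fib, vrouter_fib
```

For example, an IS-IS loopback route would land in the kernel FIB, while an L3VPN route learned over BGP would land in the virtual router.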
  • the DPDK-based virtual router may notify the cRPD about the Application Pod interfaces created by the CNI for the compute node. Such Pod interfaces may not be advertised to or otherwise made known to the kernel.
  • the cRPD may advertise reachability to these Pod interfaces to the rest of the network as, e.g., L3VPN network layer reachability information (NLRI).
  • NLRI L3VPN network layer reachability information
  • the CNI may also add a second, backup interface into the application Pod.
  • the backup interface may be configured on a different, backup data plane within the compute node than the active data plane on which the active interface is configured.
  • the active data plane may be a DPDK-based virtual router
  • the backup data plane may be a kernel-based virtual router.
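A minimal sketch of the active/backup arrangement is shown below. The class names, the interface names, and the rule of falling back only while the active interface is down are all assumptions made for illustration; the text above states only that the two interfaces are configured on different data planes.

```python
from dataclasses import dataclass


@dataclass
class PodInterface:
    name: str
    data_plane: str   # data plane on which this interface is configured
    up: bool = True


@dataclass
class PodNetworking:
    active: PodInterface
    backup: PodInterface

    def egress(self) -> PodInterface:
        # Assumption: use the backup interface only while the active one is down.
        return self.active if self.active.up else self.backup
```

With an active interface on a DPDK-based virtual router and a backup on a kernel-based virtual router, `egress()` returns the active interface until it is marked down.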
  • a set of software components provides CNI functionality that addresses networking requirements unique to cloud native 5G network environments.
  • the software components include a containerized routing protocol daemon (cRPD) to support a Network Service Mesh (NSM) architecture.
  • the set of software components support NSM architecture and may provide additional capabilities such as hybrid networking (between physical and virtual infrastructure), direct reachability to a Pod from outside a cluster of compute nodes to, e.g., advertise over protocols such as BGP, set up tunnels dynamically using various technologies such as MPLS, SRv6, IP-IP/VxLAN/GRE, IPsec, etc.
  • a 5G O-RAN network may be deployed using cloud native technologies and follow the 5G split in which the DU (Distributed Unit) and CSR (Cell Site Router) are virtualized and run on a compute node.
  • the set of software components may operate as a cell-site router to provide L3 reachability for the mid-haul for the 5G network.
  • the software components use cRPD to distribute Layer 3 (L3) network reachability information of the Pods not just within the cluster, but also outside the cluster.
  • the cRPD also programs the data plane on each compute node.
  • the DU application may run in the application Pod to bypass the kernel networking stack and abstractions, and thereby use, e.g., zero-copy mechanisms to directly send/receive packets from the physical NIC.
  • Data Plane Development Kit (DPDK) is one such framework, and a DPDK-based virtual router may be used as a user space data plane that leverages DPDK for high forwarding performance for this purpose.
  • the software components may include a DPDK-based virtual router to support DPDK applications.
  • a CNI plugin manages the DPDK configuration for the application and programs the virtual router. This may include setting up a vhost control channel and assigning IP (e.g., both IPv4 and IPv6) and MAC addresses, advertising the Pod IP addresses, and detecting and withdrawing the routes when the Pod is considered down or removed.
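The address assignment, advertisement, and withdrawal steps can be sketched as follows. This is a hedged illustration under stated assumptions: `CniSketch`, the address ranges `10.1.0.0` and `fd00::`, and the locally administered MAC prefix are hypothetical, and the vhost control channel is not modeled.

```python
from dataclasses import dataclass, field
import ipaddress


@dataclass
class PodInterface:
    pod: str
    ipv4: str
    ipv6: str
    mac: str


@dataclass
class CniSketch:
    """Hypothetical stand-in for the CNI plugin; not the disclosed implementation."""
    advertised: set = field(default_factory=set)    # host routes currently advertised
    interfaces: dict = field(default_factory=dict)  # pod name -> PodInterface
    next_host: int = 2                              # illustrative host counter

    def add_pod(self, pod: str) -> PodInterface:
        # Assign IPv4, IPv6, and MAC addresses from illustrative ranges.
        n = self.next_host
        self.next_host += 1
        ipv4 = str(ipaddress.ip_address("10.1.0.0") + n)
        ipv6 = str(ipaddress.ip_address("fd00::") + n)
        mac = "02:00:00:00:00:%02x" % n
        iface = PodInterface(pod, ipv4, ipv6, mac)
        self.interfaces[pod] = iface
        # Advertise the Pod addresses as host routes.
        self.advertised.update({f"{ipv4}/32", f"{ipv6}/128"})
        return iface

    def delete_pod(self, pod: str) -> None:
        # Withdraw the host routes when the Pod is down or removed.
        iface = self.interfaces.pop(pod)
        self.advertised.difference_update({f"{iface.ipv4}/32", f"{iface.ipv6}/128"})
```

Calling `add_pod` followed by `delete_pod` for the same Pod leaves the set of advertised routes empty again, which mirrors the withdraw-on-removal behavior described above.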
  • IP e.g., both IPv4 and IPv6
  • a computing device is configured with a containerized router, the computing device comprising: processing circuitry; a containerized virtual router configured to execute on the processing circuitry and configured to implement a data plane for the containerized router; and a containerized routing protocol process configured to execute on the processing circuitry and configured to implement a control plane for the containerized router.
  • a virtualized cell site router comprises a computing device configured with a containerized router, the computing device comprising: a containerized virtual router configured to execute on the processing circuitry and configured to implement a data plane for the containerized router; a containerized routing protocol process configured to execute on the processing circuitry and configured to implement a control plane for the containerized router; and a pod comprising a containerized distributed unit, wherein the containerized routing protocol process is configured to advertise routing information comprising reachability information for the containerized distributed unit.
  • FIG. 2 is a block diagram illustrating an example implementation of a part of the mobile network system of FIG. 1 in further detail, in accordance with techniques of this disclosure.
  • FIG. 5 is a block diagram illustrating an example vRouter agent, according to techniques of this disclosure.
  • FIG. 6 is a block diagram illustrating an example server with example control and data traffic flows within the server, according to techniques of this disclosure.
  • FIG. 7 is a conceptual diagram depicting a sequence of operations on a port-add leading to route programming in a vRouter, according to example aspects of this disclosure.
  • FIG. 8 is a block diagram of an example computing device (e.g., host), according to techniques described in this disclosure.
  • FIG. 12 illustrates an example system and packet forwarding, according to techniques described in this disclosure.
  • FIG. 13 illustrates an example system and packet forwarding, according to techniques described in this disclosure.
  • FIG. 14 is a conceptual diagram illustrating example operations for programming vRouter forwarding information, according to techniques of this disclosure.
  • FIG. 15 is a conceptual diagram illustrating example operations for configuring and advertising a virtual network interface in a server having a cloud native router, according to techniques of this disclosure.
  • 5G uses a cloud-native approach in which functional blocks are decomposed into microservices.
  • the microservices are deployed as containers on x86 platforms, orchestrated by Kubernetes (abbreviated as “K8s”).
  • This includes 5G core control plane functions such as the Access and Mobility Management Function (AMF) and Session Management Function (SMF), RAN control plane functions such as CU-CP, service management and orchestration (SMO), the Near-Real Time and Non-Real Time Radio Intelligent Controller (RIC), and even some data-plane functions such as CU-DP and DU.
  • AMF Access and Mobility Management Function
  • SMF Session Management Function
  • SMO service management and orchestration
  • RIC Near-Real Time & Non-Real Time Radio Intelligent Controller
  • CNIs Container Networking Interfaces
  • a Cloud Native Router provides a better fit for these situations.
  • a Cloud Native Router is a containerized router that allows an x86 or ARM based host to be a first-class member of the network routing system, participating in protocols such as Intermediate System to Intermediate System (IS-IS) and Border Gateway Protocol (BGP) and providing Multiprotocol Label Switching/Segment Routing (MPLS/SR) based transport and multitenancy.
  • IS-IS Intermediate System to Intermediate System
  • BGP Border Gateway Protocol
  • MPLS/SR Multiprotocol Label Switching/Segment Routing
  • a Cloud Native Router may have one or more advantages over a conventional router.
  • a router has a control plane and a forwarding plane.
  • the control plane participates in dynamic routing protocols and exchanges routing information with other routers in the network. It downloads the results into a forwarding plane, in the form of prefixes, next-hops and associated SR/MPLS labels.
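The relationship between the two planes can be illustrated with a small forwarding table that the control plane populates and the forwarding plane consults. This is a sketch under assumptions: the `ForwardingPlane` class, its `download` and `lookup` methods, and the label values are hypothetical, and longest-prefix match is used here as the conventional lookup rule rather than as a detail taken from the text.

```python
import ipaddress
from typing import NamedTuple, Optional


class FibEntry(NamedTuple):
    prefix: object          # ipaddress network object
    next_hop: str
    label: Optional[int]    # SR/MPLS label, if one is associated


class ForwardingPlane:
    def __init__(self):
        self.fib = {}

    def download(self, prefix: str, next_hop: str, label: Optional[int] = None):
        # Called by the control plane to install a prefix, next-hop, and label.
        net = ipaddress.ip_network(prefix)
        self.fib[net] = FibEntry(net, next_hop, label)

    def lookup(self, dst: str) -> Optional[FibEntry]:
        # Longest-prefix match over the installed entries.
        addr = ipaddress.ip_address(dst)
        matches = [entry for net, entry in self.fib.items() if addr in net]
        return max(matches, key=lambda e: e.prefix.prefixlen, default=None)
```

Because the control plane only calls `download`, the same control logic could drive a different forwarding implementation that exposes the same interface.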
  • Implementations described herein are modular, in the sense that the control plane is agnostic to the exact details of how the forwarding plane is implemented.
  • the forwarding plane may be based on custom ASICs.
  • the Cloud Native Router is a virtualized router.
  • the routing protocol software is functionally similar in both cases. This means the Cloud Native Router benefits from the same highly comprehensive and robust protocol implementation as the hardware-based routers that underpin some of the world’s largest networks.
  • the Cloud Native Router uses a containerized routing protocol daemon (cRPD) control plane and a virtual router (vRouter) forwarding plane to deliver high performance networking in a small-footprint software package that is functionally similar to a non-virtual router, a physical network function (PNF).
  • the forwarding plane may be implemented via a choice of DPDK, Linux Kernel or SmartNIC.
  • the complete integration delivers a K8s CNI-compliant package, deployable within a K8s environment (e.g., Multus-enabled).
  • the Cloud Native Router may be incorporated into the host on which it resides and integrated with K8s.
  • this disclosure describes how a DU and a Cloud Native Router can co-exist on the same 1U size x86 or ARM based host or other computing device. This is especially attractive for those cell-sites that have limited power and space, as it avoids the need for a two-box solution, in the form of a separate DU and router. Multiple O-DUs, or other workloads, can be attached to the same Cloud Native Router.
  • the cell-site server may be a K8s worker node (or “minion”).
  • the O-DU pod is plumbed into the Cloud Native Router.
  • the O-DU may require multiple network interfaces, facilitated in some cases by the Multus meta-CNI.
  • Each of these interfaces can be mapped into a different Layer 3 VPN on the Cloud Native Router to support multiple network slices.
  • a CNI described herein, when triggered by K8s pod events, dynamically adds or deletes interfaces between the pod and the vRouter container. It also dynamically updates the cRPD control plane container with host routes for each pod interface and corresponding Layer 3 VPN mappings, in the form of Route Distinguishers and Route Targets.
  • the Layer 3 VPNs may be implemented using virtual routing and forwarding instances (VRFs).
  • VRFs virtual routing and forwarding instances
  • the cRPD control plane programs the vRouter forwarding plane accordingly via a gRPC interface.
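The mapping of pod interfaces to Layer 3 VPNs can be sketched as a set of VRFs, each carrying a Route Distinguisher, Route Targets, and the host routes of the attached pod interfaces. The class and method names and the `65000:N` values below are hypothetical, and the gRPC programming of the vRouter is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Vrf:
    name: str
    route_distinguisher: str   # illustrative value such as "65000:1"
    route_targets: tuple       # illustrative value such as ("target:65000:1",)
    host_routes: set = field(default_factory=set)


class ControlPlaneSketch:
    """Hypothetical model of the VRF state the CNI updates on pod events."""

    def __init__(self):
        self.vrfs = {}

    def add_vrf(self, name, rd, rts):
        self.vrfs[name] = Vrf(name, rd, tuple(rts))

    def pod_interface_added(self, vrf_name, pod_ip):
        # Install a /32 host route for the pod interface in its VRF.
        self.vrfs[vrf_name].host_routes.add(f"{pod_ip}/32")

    def pod_interface_deleted(self, vrf_name, pod_ip):
        self.vrfs[vrf_name].host_routes.discard(f"{pod_ip}/32")

    def vpn_routes(self):
        # Prefixing each host route with the RD keeps routes from
        # different VRFs distinct even when the pod addresses overlap.
        return sorted(
            (f"{vrf.route_distinguisher}:{prefix}", vrf.route_targets)
            for vrf in self.vrfs.values()
            for prefix in vrf.host_routes
        )
```

Two pod interfaces with the same address in different VRFs therefore produce two distinct VPN routes, one per slice.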
  • the Cloud Native Router is introduced into the data path, supporting the F1 interfaces to the CUs running in edge or regional DC sites.
  • Cloud Native Router techniques are applicable for configuring a host-based virtual router for other containerized applications.
  • As the CNR is itself a cloud-native application, it supports installation using K8s manifests or Helm Charts. These include the initial configuration of the router, including routing protocols and Layer 3 VPNs to support slices.
  • a CNR may be orchestrated and configured, in a matter of seconds, with all of the routing protocol adjacencies with the rest of the network up and running. Ongoing configuration changes during the lifetime of the CNR, for example to add or remove network slices, may be via a choice of CLI, K8s manifests, NetConf or Terraform.
  • the Cloud Native Router may mitigate the traditional operational overhead incurred when using a containerized appliance rather than its physical counterpart.
  • the Cloud Native Router may normalize the operational model of the virtual appliance to the physical appliance, eradicating the barrier to adoption within the operator’s network operations environment.
  • the Cloud Native Router may present a familiar routing appliance look-and-feel to any trained operations team.
  • the Cloud Native Router has similar features and capabilities, and a similar operational model, as a hardware-based platform.
  • a domain-controller can use the protocols that it uses with any other Junos router to communicate with and control the Cloud Native Router, for example Netconf/OpenConfig, gRPC, Path Computation Element Protocol (PCEP), and Programmable Routing Daemon (pRPD) APIs.
  • Netconf/OpenConfig gRPC
  • PCEP Path Computation Element Protocol
  • pRPD Programmable Routing Daemon
  • the node executing Cloud Native Router may participate in IS-IS, Open Shortest Path First (OSPF), BGP, and/or other interior or exterior routing protocols.
  • In addition, MPLS may be used, often based on Segment Routing (SR). The reason for this is two-fold: to allow Traffic Engineering if needed, and to underpin multi-tenancy, by using MPLS-based Layer 3 VPNs. As an alternative, SRv6 could be used instead to fulfill these requirements. Having a comprehensive routing capability is also necessary to implement network slicing.
  • Each slice tenant is placed into its own Layer 3 VPN.
  • the Cloud Native Router acts as a provider edge (PE) router from the Layer 3 VPN point of view. The Cloud Native Router therefore exchanges Layer 3 VPN prefixes via BGP with other PE routers in the network, regardless of whether those other PEs are physical routers or Cloud Native Routers residing on other hosts.
  • PE provider edge
  • Each tenant may be placed in a separate VRF table on each PE, giving the correct degree of isolation and security between tenants, just as with a conventional Layer 3 VPN service. This neatly solves the problem that K8s does not natively provide such isolation.
  • Layer 3 VPN is a tried and tested method for achieving multi-tenancy in networking and is trusted by the many major corporations worldwide who buy this service from their network service providers.
  • the transport network offers a variety of paths, each tuned to a particular cost-function such as minimum latency or high-bandwidth. These are implemented using Segment Routing flex-algo or RSVP or Segment Routing-based traffic engineering.
  • the paths can be computed by a controller and communicated to the Cloud-Native Router via the PCEP protocol.
  • If the controller detects congestion in the network via streaming telemetry, it automatically recomputes the affected paths to ease the congestion.
  • PE routers including the Cloud-Native Routers, apply tags (BGP color communities) to the prefixes in a given VRF according to the type of path that the corresponding slice needs.
  • a first slice may need the lowest latency transport that is possible, and so is mapped to a low-latency path in order to reach the O-CU in an Edge Data Center (EDC).
  • EDC Edge Data Center
  • a second slice needs high-bandwidth with reasonably low latency. Therefore its O-CU is also located in the EDC, and the traffic is mapped to a high-bandwidth path to the EDC.
  • a third slice needs high-bandwidth transport but is not latency-sensitive, so its O-CU may be placed in the Regional Data Center (RDC). Traffic for the third slice is mapped into the high-bandwidth path to the RDC.
  • RDC Regional Data Center
  • the mapping of slices to a transport path will normally be many-to-one. For example, all of the slices that need low-latency transport between a given pair of endpoints share the same low-latency traffic-engineered or flex-algo path.
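The many-to-one mapping from slices to transport paths can be sketched with two lookup tables, one from slice to BGP color community and one from color to path. The slice names, the color values, and the fourth slice are hypothetical and serve only to show several slices sharing one path.

```python
from collections import defaultdict

# Illustrative values only: the color numbers and slice names are assumptions.
PATH_BY_COLOR = {
    100: "low-latency path to EDC",
    200: "high-bandwidth path to EDC",
    300: "high-bandwidth path to RDC",
}

COLOR_BY_SLICE = {
    "slice-1": 100,   # lowest latency
    "slice-2": 200,   # high bandwidth, reasonably low latency
    "slice-3": 300,   # high bandwidth, not latency-sensitive
    "slice-4": 100,   # hypothetical second low-latency slice
}


def path_for_slice(slice_name: str) -> str:
    """Resolve a slice to its transport path through its color tag."""
    return PATH_BY_COLOR[COLOR_BY_SLICE[slice_name]]


def slices_by_path() -> dict:
    """Group slices by the path they share (many slices to one path)."""
    grouped = defaultdict(list)
    for slice_name, color in COLOR_BY_SLICE.items():
        grouped[PATH_BY_COLOR[color]].append(slice_name)
    return dict(grouped)
```

Here `slice-1` and `slice-4` resolve to the same low-latency path, while each path remains defined once per color.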
  • the Cloud Native Router may bring the full spectrum of routing capabilities to compute platforms that host containerized network functions. This may allow the platform to fully participate in the operator’s network routing system and facilitate multi-tenancy and network slicing. It may provide the same familiar look-and-feel, operational experience and control-plane interfaces as a hardware-based router.
  • FIG. 1 is a block diagram illustrating an example mobile network system, in accordance with techniques described in this disclosure.
  • Mobile network system 100 may be a 5G network that implements 5G standards promulgated by, e.g., the 3rd Generation Partnership Project (3GPP), the Open Radio Access Network (“O-RAN” or “ORAN”) Alliance, the European Telecommunications Standards Institute (ETSI), the Internet Engineering Task Force (IETF), and the International Telecommunication Union (ITU).
  • service providers may avoid becoming locked into particular appliance vendors and may combine effective solutions from different vendors at different layers and locations to build and provision the mobile network system.
  • This can improve the radio access networks (RANs), in particular, by making them more open, resilient, and scalable.
  • RANs radio access networks
  • O-RAN-based networks decompose the baseband unit (BBU) found in traditional telco networks into three functional units: a Radio Unit (RU), a Distributed Unit (DU), and a Centralized Unit (CU). Different functions of RUs, DUs, and CUs may be implemented by software executed by x86-based or ARM-based host servers.
  • the CU can be further segregated into distinct control plane (CU-CP) and user plane (CU-UP) functions to further control and user plane separation (CUPS). This decoupling helps bring flexibility to deployment: different combinations of RU, DU, and CU may be deployed at the same location, or at different locations.
  • CU-CP control plane
  • CU-UP user plane
  • Additional data plane elements known as user plane functions (UPFs) operate in mobile core network 7 to forward traffic between the CU and data network 15.
  • Additional control plane elements operate in mobile core network 7. These control plane elements include Network Slice Selection Function (NSSF), Policy Control Function (PCF), Authentication Server Function (AUSF), Access and Mobility Management Function (AMF), Network Exposure Function (NEF), Network Function Repository Function (NRF), Application Function (AF), Unified Data Management (UDM), and Session Management Function (SMF).
  • NSSF Network Slice Selection Function
  • PCF Policy Control Function
  • AUSF Authentication Server Function
  • AMF Access and Mobility Management Function
  • NEF Network Exposure Function
  • NRF Network Function Repository Function
  • AF Application Function
  • UDM Unified Data Management
  • SMF Session Management Function
  • Mobile network system 100 includes radio access networks 9 and mobile core network 7.
  • Radio access networks 9 include RUs 14 located at various cellular network sites (“cell sites”).
  • Each RU 14 consists of an LO PHY and an RF transmitter.
  • the LO PHY component may be implemented using specialized hardware for high-performance packet processing.
  • RUs 14 connect to DUs 22A-22X (collectively, “DUs 22”) via the fronthaul network.
  • the fronthaul network connects LO PHY and HI PHY and is used by RUs 14 and DUs 22 to implement the F2 interface of 5G.
  • DUs 22 manage the packet transmission of radio by the RUs 14. In some cases, such packet transmission conforms to the Common Packet Radio Interface (CPRI) and/or to the enhanced CPRI (eCPRI) standard, or to IEEE 1914.3.
  • DUs 22 may implement the Radio Link Control (RLC), Media Access Control (MAC), and the HI PHY layer.
  • DUs 22 are at least partially controlled by CUs 13A-13B (collectively, “CUs 13”).
  • DUs 22 connect to CUs 13 via the midhaul network, which may be used by DUs 22 and CUs 13 to implement the F1 interface of 5G.
  • CUs 13 may implement the Radio Resource Control (RRC) and Packet Data Convergence Protocol (PDCP) layers.
  • RRC Radio Resource Control
  • PDCP Packet Data Convergence Protocol
  • CUs 13 connect to mobile core network 7 via a backhaul network.
  • the midhaul and backhaul networks may each be wide area networks (WANs).
  • the gNodeB includes one of CUs 13 and one of DUs 22.
  • a CU may support multiple DUs to implement multiple gNodeBs.
  • one or more RUs may be supported by a single DU.
  • CU 13A and DU 22A and one of RUs 14 may form one eNodeB
  • CU 13A and DU 22B (of server 12B) and another one of RUs 14 may form another eNodeB.
  • any DU of DUs 22 may or may not be located at the cell site that includes the RU(s) 14 supported by the DU.
  • Data network 15 may represent, for example, one or more service provider networks and services, the Internet, 3rd party services, an IP-multimedia subsystem, or other network.
  • the combination of DUs 22, the midhaul network, CUs 13, and the backhaul network effectively implement an IP-based transport network between the radio units 14 and mobile core network 7.
  • virtualized cell site routers 20A-20X (“vCSRs 20A-20X” and collectively, “vCSRs 20”) provide layer 3 routing functionality between DUs 22 and CUs 13. These vCSRs 20 may be executed on the same server 12 as one or more DUs 22 to provide provider edge router functionality to such DUs 22.
  • Although each of vCSRs 20 is termed a “cell site” router, any of vCSRs 20 may be deployed to a local data center together with one or more DUs 22 for which the vCSR provides IP services, as shown with respect to vCSRs 20A-20N, i.e., where the local data center includes servers 12 that execute DUs 22 for one or more cell sites.
  • Each of vCSRs 20 is implemented using one of containerized routing protocol daemons 24A-24X (“cRPDs 24A-24X” and collectively, “cRPDs 24”). More specifically, each of vCSRs 20 uses a corresponding cRPD of cRPDs 24 as a control plane for implementing a layer 3 router. The cRPD provides control plane routing functions.
  • the cRPD can execute IP (IPv4/IPv6) underlay routing protocols such as Intermediate System-Intermediate System (IS-IS) and Border Gateway Protocol (BGP); advertise reachability of DUs 22 both inside and outside a cluster, e.g., to CUs 13; implement network namespaces (supported using L3VPN and EVPN Type-5 advertisements); implement Access Control Lists (ACLs) and network policies for security, network isolation, and quality of service (QoS); support tunnels and tunneling protocols (e.g., MPLS, SR-MPLS, SRv6, SR-MPLSoIPv6, SR-MPLSoIPv4, VxLAN, IP-in-IP, GRE); support dynamic tunnels signaled using BGP; support encryption for IPSec tunnels; and program a forwarding plane of the vCSR of the server with learned and/or configured routing information to provide layer 3 packet forwarding, encapsulation, packet filtering, and/or QoS between one or more of DU
  • vCSR 20A executed by server 12A includes cRPD 24A and a forwarding plane of server 12A (e.g., a SmartNIC, kernel-based forwarding plane, or Data Plane Development Kit (DPDK)-based forwarding plane).
  • cRPD 24A provides one or more of the above routing functions to program a forwarding plane of vCSR 20A in order to, among other tasks, advertise a layer 3 route for DU 22A outside of the cluster, including across the midhaul network to CU 13A, and forward layer 3 packets between DU 22A and CU 13A.
  • the techniques realize cloud-native, containerized cell site routers 20 executing on the same servers 12 as containerized DUs 22, thus significantly reducing latency on the midhaul between DUs 22 and CUs 13.
  • vCSRs 20 as containerized routers allow an x86-based or ARM-based host to be a first-class member of the network routing system, participating in protocols such as IS-IS and BGP and providing MPLS/SR-based transport and multi-tenancy.
  • vCSRs 20 may operate as provider edge (PE) routers for networks transporting layer 3 packets among DUs 22, CUs 13, and mobile core network 7.
  • PE provider edge
  • the integration of cRPDs 24 and host-based forwarding planes may also deliver a Kubernetes CNI-compliant package that is deployable within a Kubernetes environment.
  • the execution by a single server of a DU 22 and a vCSR 20 together can avoid a two-box solution with a separate DU and router, potentially reducing costs, power, and space requirements, which is particularly attractive for cell sites.
  • Application workloads can be containerized network functions (CNFs), such as DUs.
  • CNFs containerized network functions
  • Orchestrator 50 represents a container orchestration platform.
  • “Orchestration,” in the context of a virtualized computing infrastructure generally refers to provisioning, scheduling, and managing virtual execution elements and/or applications and services executing on such virtual execution elements to the host servers available to the orchestration platform.
  • Container orchestration specifically, permits container coordination and refers to the deployment, management, scaling, and configuration, e.g., of containers to host servers by a container orchestration platform.
  • Example instances of orchestration platforms include Kubernetes, Docker swarm, Mesos/Marathon, OpenShift, OpenStack, VMware, and Amazon ECS.
  • Orchestrator 50 orchestrates DUs 22 and at least containerized RPDs 24 of vCSRs 20.
  • the data plane of vCSRs 20 is also containerized and orchestrated by orchestrator 50.
  • the data plane may be a DPDK-based virtual router, for instance.
  • Containers including those implementing containerized routing protocol daemons 24, may be deployed to a virtualization environment using a cluster-based framework in which a cluster master node of a cluster manages the deployment and operation of containers to one or more cluster minion nodes of the cluster.
  • master node and “minion node” used herein encompass different orchestration platform terms for analogous devices that distinguish between primarily management elements of a cluster and primarily virtual execution element hosting devices of a cluster.
  • the Kubernetes platform uses the terms “cluster master node” and “minion nodes,” while the Docker Swarm platform refers to cluster managers and cluster nodes. Servers 12 or virtual machines thereon may represent cluster nodes.
  • Orchestrator 50 and software defined network (SDN) controller 70 may execute on separate computing devices or execute on the same computing device. Each of orchestrator 50 and SDN controller 70 may be a distributed application that executes on one or more computing devices. Orchestrator 50 and SDN controller 70 may implement respective master nodes for one or more clusters each having one or more minion nodes implemented by respective servers 12. In general, SDN controller 70 controls the network configuration of radio access network 9 to facilitate packetized communications among DUs 22, CUs 13, and mobile core network 7. SDN controller 70 may distribute routing and configuration information to the control plane elements of radio access networks 9, in particular, to cRPDs 24.
  • SDN controller 70 may, for instance, program segment routing headers, configure L3VPNs, and configure VRFs in routers of radio access network 9 (including virtualized cell site routers 20).
  • SDN controller 70 may implement one or more southbound protocols for configuring routers, switches, and other network devices of the midhaul and backhaul networks, as well as for configuring vCSRs 20.
  • Example southbound protocols may include Path Computation Element Protocol (PCEP), BGP, Netconf, OpenConfig, another protocol for configuring cRPDs 24, and so forth. Additional information regarding L3VPNs is found in “BGP/MPLS IP Virtual Private Networks (VPNs),” Request for Comments 4364, Network Working Group of Internet Engineering Task Force, February 2006, which is incorporated by reference in its entirety.
  • SDN controller 70 may provide a logically and in some cases physically centralized controller. In some examples, SDN controller 70 may operate in response to configuration input received from orchestrator 50 and/or an administrator/operator. SDN controller 70 may program NFV infrastructure (NFVI) such as servers 12, network switches/routers, and/or other network infrastructure. In the case of NFVI programming, SDN controller 70 may configure aspects of the operating system kernel to configure L3 IP routing, Linux bridges, iptables, network namespaces, and/or virtual switches.
  • Orchestrator 50 controls the deployment, scaling, and operations of containers across clusters of servers 12 and the providing of computing infrastructure, which may include container-centric computing infrastructure.
  • Orchestrator 50 and, in some cases, network controller 70 may implement respective cluster masters for one or more Kubernetes clusters.
  • Kubernetes is a container management platform that provides portability across public and private clouds, each of which may provide virtualization infrastructure to the container management platform.
  • Virtualized cell site routers 20 may provide one or more technical advantages that realize at least one practical application.
  • Existing mobile networks use a physical cell site router that is located on or close to each BBU.
  • Physical routers often have specialized form factors, are relatively difficult to update and configure, and are relatively difficult to replace due to vendor lock-in effects. While these effects are tolerable where there are relatively few cell sites, as with 3G and 4G/LTE mobile networks, the comparatively large number of cell sites required by RANs for 5G mobile networks exacerbates the capital and operational costs related to these effects.
  • Although 5G network providers are moving to a disaggregated RAN architecture (e.g., O-RAN), such networks still rely on a physical cell site router or a virtual machine-based router to manage routes and data traffic between the DU and the CU over the midhaul network.
  • Virtualized cell site routers 20 having containerized routing protocol daemons 24 alleviate many of the negative effects of deploying physical or VM-based routers at the cell site.
  • Containerized RPDs 24 are more light-weight in terms of compute resources (CPU, memory) compared to VM-based routers and may be more efficient in terms of space and power utilization than VM-based and physical routers.
  • Virtualized CSRs 20 may achieve these advantages while achieving comparable performance where DPDK-based virtual routers are used as the data plane to provide efficient and high packet I/O rate for vCSRs 20 to communicate with DUs 22.
  • The techniques may facilitate a cloud native experience for vCSR 20 deployment and configuration. Integrating in Kubernetes permits leveraging its existing mechanisms for monitoring the health of containerized RPDs 24 and restarting them if necessary, along with managing the life-cycle of the vCSRs 20 and, in particular, containerized RPDs 24.
  • FIG. 2 is a block diagram illustrating an example implementation of a part of the mobile network system of FIG. 1 in further detail, in accordance with techniques of this disclosure.
  • System 200 includes CUs 213A-213K, each of which may represent any of CUs 13.
  • multiple network slices e.g., 5G network slices
  • tunnels 231A-231K to connect DU 22A to different CUs 213A-213K for respective network slices.
  • A network slice provides a way to completely segment the mobile network to support a particular type of service or business or even to host service providers (multi-tenancy) who do not own a physical network. Furthermore, each slice can be optimized according to capacity, coverage, connectivity, security and performance characteristics. Since the slices can be isolated from each other, as if they are physically separated both in the control and user planes, the user experience of the network slice will be the same as if it was a separate network.
  • A network slice can span all domains of the network including software applications (both memory and processing) running on network nodes, specific configurations of the core transport network, access network configurations as well as the end devices. Network slicing enables multiple operators to share a mobile network securely while separating their own users from others, and enables different applications of a user to use different network slices that provide widely different performance characteristics.
  • Virtualized cell site router 20A includes a virtual router forwarding plane (vRouter) 206A configured with VRFs 212A-212K (collectively, “VRFs 212”) for respective network slices implemented with respective L3VPNs, which vCSR 20A and routers 204A-204B implement using tunnels 231A-231K connecting VRFs 212 to VRFs 210A-210K on routers 204A-204B.
  • Each of tunnels 231A-231K may represent an SR-MPLSoIPv6 or other type of tunnel mentioned above.
  • Each of routers 204A-204K may be a gateway router for a data center having one or more servers to execute any one or more of CUs 213A-213K.
  • The data center may include a data center fabric to switch mobile data traffic between the router and the CU.
  • The one or more servers of the data center may also execute a UPF for the mobile network, in which case the data center fabric may also switch mobile data traffic between the CU and the UPF.
  • Each of the VRFs 212A-212K has a corresponding virtual network interface to DU 22A.
  • Each of the virtual network interfaces of DU 22A may thus be mapped into a different L3VPN in vCSR 20A in order to, e.g., support a different one of multiple network slices.
  • A CNI of server 12A, when triggered by pod events from orchestrator 50, dynamically adds or deletes virtual network interfaces between the pod (here deployed with DU 22A) and the vRouter 206A, which may also be deployed as a container in some examples.
  • The CNI also dynamically updates cRPD 24A (the control plane of vCSR 20A) with host routes for each DU 22A / pod virtual network interface and corresponding Layer 3 VPN mappings, in the form of Route Distinguishers and Route Targets.
  • cRPD 24A programs vRouter 206A (the data plane of vCSR 20A) accordingly, optionally using a gRPC interface.
  • vCSR 20A is introduced as a cloud-native router into the data path to, e.g., support the F1 interfaces to CUs 213A-213K that may be executing in edge or regional data center sites, for instance.
  • Virtual router 206A may represent a SmartNIC-based virtual router, kernel-based virtual router, or DPDK-based virtual router in various examples.
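The CNI-to-cRPD update described above can be illustrated with a minimal Python sketch. It assumes a simplified in-memory model: the class names `Vrf` and `ControlPlane`, the method names, and the Route Distinguisher and Route Target values are all hypothetical and are not taken from this disclosure or from any cRPD interface.

```python
from dataclasses import dataclass, field


@dataclass
class Vrf:
    """One L3VPN routing instance, e.g., for one network slice."""
    name: str
    route_distinguisher: str  # illustrative value, e.g. "64512:1"
    route_target: str         # illustrative value, e.g. "target:64512:1"
    host_routes: dict = field(default_factory=dict)  # prefix -> pod interface


class ControlPlane:
    """Hypothetical stand-in for the cRPD side of the CNI workflow."""

    def __init__(self):
        self.vrfs = {}

    def add_vrf(self, vrf):
        self.vrfs[vrf.name] = vrf

    def pod_interface_added(self, vrf_name, pod_ip, ifname):
        # The CNI reports a host (/32) route for the new pod interface.
        vrf = self.vrfs[vrf_name]
        vrf.host_routes[f"{pod_ip}/32"] = ifname
        # Return the L3VPN mapping that would accompany the advertisement.
        return vrf.route_distinguisher, vrf.route_target

    def pod_interface_deleted(self, vrf_name, pod_ip):
        # Withdraw the host route when the pod interface is removed.
        self.vrfs[vrf_name].host_routes.pop(f"{pod_ip}/32", None)


cp = ControlPlane()
cp.add_vrf(Vrf("slice-a", "64512:1", "target:64512:1"))
mapping = cp.pod_interface_added("slice-a", "10.1.1.2", "vif0")
```

In this sketch each pod interface contributes exactly one host route to the VRF of its network slice, which mirrors the one-interface-per-L3VPN mapping described for DU 22A.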
  • FIGS. 3A-3B are block diagrams illustrating example instances of a server that implements a virtualized cell site router, in accordance with techniques of this disclosure.
  • Servers 300, 350 may each represent any of servers 12 of FIG. 1. In some cases, servers 300, 350 are configured to implement both a virtualized cell site router and distributed unit for same-box forwarding of mobile data traffic between DU 22A and the data plane of virtualized cell site router 20A.
  • Servers 300, 350 may each be a bare-metal server or a virtual machine.
  • An example hardware architecture for servers 300, 350 is described in FIG. 8.
  • Servers 300, 350 include one or more network interface cards (NICs) 321A-321B (collectively, “NICs 321”) each having one or more hardware interfaces.
  • Interfaces 320 of NIC 321A may be coupled via physical cabling to RUs.
  • Interfaces 320 may implement the F2 interface.
  • Interfaces 322 of NIC 321B may be coupled via physical cabling to the midhaul network, for sending/receiving mobile data traffic to/from CUs.
  • Interfaces 322 may implement the F1 interface.
  • A DPDK-based virtual router data or forwarding plane (“vRouter”) 206A is programmed by vRouter agent 314 with forwarding information for implementing a packet fast path.
  • vRouter agent 314 may be a user space process.
  • vRouter agent 314 may have a northbound interface 340 for receiving configuration and routing information from control plane processes, such as cRPD 324.
  • cRPD 324 may be an example of cRPD 24A of FIG. 1.
  • vRouter agent 314 has a southbound interface 341 for programming vRouter 206A.
  • An example implementation for interface 340 is described in further detail with respect to FIG. 5.
  • References herein to a “virtual router” may refer to the virtual router forwarding plane specifically, or to a combination of the virtual router forwarding plane (e.g., vRouter 206A) and the corresponding virtual router agent (e.g., vRouter agent 314).
  • cRPD 324 may have a northbound interface for exchanging configuration and routing information with SDN controller 70.
  • Containerized networking interface 312 may be a CNI plugin that configures the interfaces of the container workloads (DUs 22A-1 to 22A-N in this example) with the DPDK-based vRouter 206A.
  • Orchestrator 50 may orchestrate DPDK-based vRouter 206A, cRPD 324, and/or DU 22 workloads.
  • Workloads may have multiple interfaces and multiple types of interfaces (e.g., some with vRouter 206A and some with NIC 321A).
  • CNI 312 may represent a combination of CNIs or a unified CNI that is capable of configuring a workload with multiple types of interfaces.
  • The multiple CNIs may be controlled by a master CNI such as Multus.
  • Orchestrator 50 is a Kubernetes master.
  • CustomResourceDefinitions (CRDs) may be implemented for orchestrator 50 for supporting multi-tenancy and network isolation.
  • Orchestrator 50 orchestrates pods comprising container workloads.
  • CNI 312 configures virtual interfaces between pods and the data plane, which may be DPDK-based vRouter 206A, a kernel-based vRouter, or a SmartNIC-based vRouter.
  • CNI 312 configures a virtio interface for each pod as a vhost-user interface of the DPDK-based vRouter 206A.
  • CNI 312 configures veth pairs for each pod to vRouter 206A.
  • vRouter agent 314 may collect and output telemetry data to a telemetry collector, e.g., in the form of syslogs.
  • vRouter 206A has a bonded interface to NIC 321B, which may be an Intel-based NIC that supports DPDK. Bonded interfaces facilitate packet load balancing among fabric interfaces. Additional description of configuring virtual interfaces may be found in U.S. Patent 10,728,145, issued July 28, 2020, which is incorporated by reference herein in its entirety.
  • CNI 312 provides networking for application workloads. This includes, for example, setting up interfaces, IP address management, and access control lists; advertising reachability of workloads within the Kubernetes cluster comprising any of servers 300, 350 (examples of Kubernetes minion nodes); and setting up network namespaces.
  • CNI 312 may leverage cRPD 324 to expand its control plane capabilities and facilitate virtualized cell site router 20A that is on-box with the application workloads DUs 22A-1 to 22A-N.
  • cRPD 324 may incorporate elements of network service mesh architecture (NSM), service discovery, external endpoints, and tunneling.
  • cRPD 324 may use exterior routing protocols such as Border Gateway Protocol (BGP) to advertise pod reachability both within and outside the Kubernetes cluster.
  • cRPD 324 may use interior gateway and other routing protocols such as IS-IS, OSPF, Label Distribution Protocol (LDP), etc., to participate in underlay networking.
  • cRPD 324 may also provide support for advanced L3VPN overlays using protocols/technologies such as MPLS, MPLSoUDP, or MPLSoGRE tunneling; VxLANs; SR-MPLS, SRv6, SRv4, and/or IPSec.
  • cRPD 324 operates as the control plane for vCSR 20A, while vRouter 206A operates as the data or forwarding plane for vCSR 20A.
  • CNI 312 leveraging cRPD 324 is thus able to facilitate multi-tenancy using L3VPNs, e.g., to implement network slices for different tenants; ACLs and network policies for applications; and IPSec for high security.
  • FIG. 10 is a block diagram illustrating an example implementation of cRPD 324 or any other cRPD of this disclosure, which an orchestrator may deploy using a pod.
  • cRPD 1440 may be deployed as a microservice in Docker, coreOS (rkt), or other container platform.
  • cRPD 1440 includes management interface 1400, which may represent a command line interface (CLI), Netconf, secure shell (SSH), PCEP, or Simple Network Management Protocol (SNMP) interface.
  • Management interface 1400 may support YANG, OpenConfig, or other configuration data formats.
  • Management interface 1400 may receive configuration data from automation systems 1420 and may output telemetry data to telemetry systems 1422.
  • cRPD 1440 implements routing protocols 1402, which may include BGP, OSPF, IS-IS, LDP, and segment routing, and may receive static routes for programming from a controller or automation system (represented by programmability 1424).
  • cRPD 1440 includes routing infrastructure 1404 to support routing protocols 1402.
  • Routing infrastructure 1404 may include a Routing Information Base (RIB), RIB manager, Label Information Base (LIB), and LIB manager. Routing infrastructure 1404 may implement Bidirectional Forwarding Detection (BFD).
  • cRPD 1440 includes a forwarding information base (FIB) adaptation layer (1406) to integrate cRPD 1440 into the data plane by enabling configuring forwarding information in the data plane.
  • FIB adaptation layer 1406 may implement a gRPC, Netlink, or rtsock interface to program a vRouter (e.g., a DPDK-based vRouter).
  • FIB adaptation layer 1406 may implement another type of interface to program a vRouter, kernel-based vSwitch, SmartNIC, network processor, ASIC-based forwarding chips, or other data plane.
  • FIG. 3B illustrates an example implementation of a server having a disjoint data plane.
  • Kernel 380 may represent a Linux kernel, other Unix-variant kernel, or other operating system kernel that includes a network stack and is capable of packet forwarding.
  • Server 350 has two data planes for packet forwarding, a first data plane implemented by kernel 380 and a second data plane implemented by vRouter 206A.
  • DPDK-based vRouter 206A is configured with “ownership” of physical interfaces 322.
  • Physical interfaces 322 may be VPN attachment circuits for VRFs 212.
  • Physical interfaces 322 may be associated with respective interfaces of vRouter 206A by which vRouter 206A sends and receives traffic via physical interfaces 322.
  • vRouter 206A exposes respective interfaces 382 to kernel 380 for physical interfaces 322. That is, for each of physical interfaces 322, vRouter 206A exposes an interface to kernel 380.
  • Each of interfaces 382 may be a vhost interface. Kernel 380 may therefore send and receive network packets with vRouter 206A via interfaces 382.
  • cRPD 324 runs routing protocols and needs to exchange routing protocol messages with routers external to server 350. Moreover, cRPD 324 relies on the kernel 380 network stack to obtain network topology information for the underlay network, which is needed for cRPD 324 to establish routing protocol adjacencies with the external routers. Interfaces 382 provide access for cRPD 324, via kernel 380 and vRouter 206A, to physical interfaces 322 and thus to the underlay networks accessible via physical interfaces 322.
  • Such underlay networks may include the midhaul network, a switch fabric for a local data center in which server 350 is located, and so forth.
  • vRouter 206A is configured with a route that causes vRouter 206A to forward network packets, received at one of physical interfaces 322 and destined for an IP address of the corresponding one of interfaces 382, via that corresponding one of interfaces 382 to kernel 380.
  • Kernel 380 outputs the network packets to cRPD 324 via interface 384.
  • Interface 384 may represent system call interfaces/APIs exposed by kernel 380, a file system, pthread, socket, or other mechanism by which processes such as cRPD 324 can receive packets from and inject packets into kernel 380.
  • cRPD 324 may operate as the control plane for executing routing protocols for virtualized cell site router 20A in a way that incorporates the network stack, routing protocol infrastructure, and other networking features of kernel 380; while vRouter 206A may operate as the data plane for forwarding data traffic between DUs 22A-1-22A-N and physical interfaces 322 in a way that excludes the kernel 380.
  • DPDK-based vRouter 206A runs in user space and in general provides better performance capabilities as compared to kernel-based forwarding.
  • These disjoint data planes may provide fast path data packet handling by vRouter 206A as well as full control plane routing functionality for virtualized cell site router 20A.
  • FIG. 6 is a block diagram illustrating an example server with example control and data traffic flows within the server, according to techniques of this disclosure.
  • Server 600 may be similar to server 350 of FIG. 3B or other server described herein.
  • Server 600 differs from server 350 in that PODs 422A-422L are not necessarily DUs (e.g., DU microservices), though PODs 422A-422L may be DUs in some cases.
  • cRPD 324 operates as the control plane for a router implemented by server 600 and DPDK-based vRouter 206A operates as the fast path forwarding plane for the router.
  • PODs 422A-422L are endpoints from the perspective of vRouter 206A, and in particular may represent overlay endpoints for one or more virtual networks that have been programmed into vRouter 206A.
  • A single vhost interface, vhost0 interface 382A, may be an example of any of interfaces 382 of FIG. 3B, and is exposed by vRouter 206A to kernel 380 and in some cases by kernel 380 to vRouter 206A.
  • vhost interface 382A has an associated underlay host IP address for receiving traffic “at the host”.
  • Kernel 380 may be a network endpoint of the underlay network that includes server 600 as a network device, the network endpoint having the IP address of vhost interface 382A.
  • The application layer endpoint may be cRPD 324 or other process managed by kernel 380.
  • Underlay networking refers to the physical infrastructure that provides connectivity between nodes (typically servers) in the network.
  • The underlay network is responsible for delivering packets across the infrastructure.
  • Network devices of the underlay use routing protocols to determine IP connectivity. Typical routing protocols used on the underlay network devices for routing purposes are OSPF, IS-IS, and BGP.
  • Overlay networking refers to the virtual infrastructure that provides connectivity between virtual workloads (typically VMs / pods). This connectivity is built on top of the underlay network and permits the construction of virtual networks.
  • the overlay traffic i.e., virtual networking
  • Overlay networks can run across all or a subset of the underlay network devices and achieve multi-tenancy via virtualization.
  • Control traffic 700 may represent routing protocol traffic for one or more routing protocols executed by cRPD 324.
  • control traffic 700 may be received over a physical interface 322 owned by vRouter 206A.
  • vRouter 206A is programmed with a route for the vhost0 interface 382A host IP address along with a receive next hop, which causes vRouter 206A to send traffic, received at the physical interface 322 and destined to the vhost0 interface 382A host IP address, to kernel 380 via vhost0 interface 382A. From the perspective of cRPD 324 and kernel 380, all such control traffic 700 would appear to come from vhost0 interface 382A.
  • cRPD 324 routes will specify vhost0 interface 382A as the forwarding next hop for the routes.
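The effect of a host route with a receive next hop can be sketched as a longest-prefix-match lookup. The following Python example is an illustrative toy table only; the class `VRouterFib`, the next-hop tuples, the interface labels, and the addresses (taken from documentation ranges) are assumptions and do not represent the actual vRouter data structures.

```python
import ipaddress


class VRouterFib:
    """Toy longest-prefix-match table; not the actual vRouter data structure."""

    def __init__(self):
        self._routes = []  # list of (network, next_hop) pairs

    def add_route(self, prefix, next_hop):
        self._routes.append((ipaddress.ip_network(prefix), next_hop))

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        matches = [(net, nh) for net, nh in self._routes if addr in net]
        if not matches:
            return None
        # Longest prefix wins, so the /32 host route beats the default route.
        return max(matches, key=lambda m: m[0].prefixlen)[1]


fib = VRouterFib()
fib.add_route("0.0.0.0/0", ("forward", "physical-322"))
fib.add_route("192.0.2.10/32", ("receive", "vhost0"))  # example vhost0 host IP
```

A packet destined to the example host address resolves to the receive next hop and would be handed to the kernel over vhost0, while any other destination follows the default route.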
  • cRPD 324 selectively installs some routes to vRouter agent 314 and the same (or other) routes to kernel 380, as described in further detail below.
  • vRouter agent 314 will receive a forwarding information base (FIB) update corresponding to some routes received by cRPD 324.
  • Routing information programmed by cRPD 324 can be classified into underlay and overlay.
  • cRPD 324 will install the underlay routes to kernel 380, because cRPD 324 might need that reachability to establish additional protocol adjacencies/sessions with external routers, e.g., BGP multi-hop sessions over reachability provided by IGPs.
  • cRPD 324 supports selective filtering of FIB updates to specific data planes, e.g., to kernel 380 or vRouter 206A, using routing policy constructs that allow for matching against RIB, routing instance, prefix, or other property.
  • Control traffic 700 sent by cRPD 324 to vRouter 206A over vhost0 interface 382A may be sent by vRouter 206A out the corresponding physical interface 322 for vhost0 interface 382A.
  • cRPD-based CNI 312 will create the virtual network (here, “pod”) interfaces for each of the application pods 422A, 422L on being notified by the orchestrator 50 via orchestration agent 310.
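Selective filtering of FIB updates by policy can be sketched as a first-match evaluation over policy terms. The Python example below is a simplified illustration under stated assumptions: `Route`, `PolicyTerm`, `select_targets`, the RIB labels "underlay" and "overlay", and the target names are hypothetical and do not reflect the routing policy syntax of cRPD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    rib: str               # hypothetical labels: "underlay" or "overlay"
    routing_instance: str


@dataclass(frozen=True)
class PolicyTerm:
    match: dict      # Route attribute name -> required value
    targets: tuple   # data planes that should receive the FIB update


def select_targets(route, policy, default=()):
    """Return the data planes for the first policy term the route matches."""
    for term in policy:
        if all(getattr(route, key) == value for key, value in term.match.items()):
            return term.targets
    return default


policy = [
    PolicyTerm({"rib": "underlay"}, ("kernel",)),
    PolicyTerm({"rib": "overlay"}, ("vrouter",)),
]
igp_route = Route("203.0.113.0/24", "underlay", "default")
vpn_route = Route("10.1.1.2/32", "overlay", "slice-a")
```

A term may match on several attributes at once (for example both RIB and routing instance), and a term's target tuple may list more than one data plane when the same route should be installed in both.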
  • One end of a pod interface terminates in a container included in the pod.
  • CNI 312 may request vRouter 206A to start monitoring the other end of the pod interface, and cRPD 324 facilitates traffic from the physical interfaces 322 destined for application containers in DPDK-based pods 422A, 422L to be forwarded using DPDK, exclusively, and without involving kernel 380.
  • The reverse process applies for traffic sourced by pods 422A, 422L.
  • Server 600 may use tunnels exclusive to the DPDK forwarding path to send and receive overlay data traffic 800 internally among DPDK-based pods 422A, 422L; vRouter 206A; and
  • cRPD 324 interfaces with two disjoint data planes: kernel 380 and the DPDK-based vRouter 206A.
  • cRPD 324 leverages the kernel 380 networking stack to set up routing exclusively for the DPDK fast path.
  • The routing information cRPD 324 receives includes underlay routing information and overlay routing information.
  • cRPD 324 runs routing protocols on vHost interface 382A that is visible in kernel 380, and cRPD 324 may install FIB updates corresponding to IGP-learnt routes (underlay routing information) in the kernel 380 FIB. This may enable establishment of multi-hop iBGP sessions to those destinations indicated in such IGP-learnt routes.
  • cRPD 324 routing protocol adjacencies involve kernel 380 (and vHost interface 382A) because kernel 380 executes the networking stack.
  • vRouter agent 314 for vRouter 206A notifies cRPD 324 about the application pod interfaces for pods 422A, 422L. These pod interfaces are created by CNI 312 and managed exclusively (i.e., without involvement of kernel 380) by the vRouter agent 314. These pod interfaces are not known to the kernel 380.
  • cRPD 324 may advertise reachability to these pod interfaces to the rest of the network as L3VPN routes including Network Layer Reachability Information (NLRI).
  • L3VPN routes may be stored in VRFs of vRouter 206A for different network slices.
  • The corresponding MPLS routes may be programmed by cRPD 324 only to vRouter 206A, via interface 340 with vRouter agent 314, and not to kernel 380. That is so because the next-hop of these MPLS labels is a pop-and-forward to a pod interface for one of pods 422A, 422L; these interfaces are only visible in vRouter 206A and not kernel 380.
  • Reachability information received over BGP L3VPN may be selectively programmed by cRPD 324 to vRouter 206A, for such routes are only needed for forwarding traffic generated by pods 422A, 422L. Kernel 380 has no application that needs such reachability.
  • The above routes programmed to vRouter 206A constitute overlay routes for the overlay network.
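The pop-and-forward behavior of the MPLS routes discussed above can be sketched as a label table whose entries point at pod interfaces. This Python example is a minimal illustration; `LabelTable`, its methods, the label value, and the interface name are hypothetical and are not part of the vRouter agent API.

```python
class LabelTable:
    """Toy MPLS label table that exists only in the vRouter data plane."""

    def __init__(self):
        self._entries = {}  # label -> pod interface name

    def program(self, label, pod_interface):
        # In the text, cRPD programs such entries via the vRouter agent only.
        self._entries[label] = pod_interface

    def forward(self, label_stack, payload):
        if not label_stack:
            raise ValueError("packet carries no MPLS label")
        top, rest = label_stack[0], label_stack[1:]
        out_if = self._entries.get(top)
        if out_if is None:
            return None  # unknown label: no forwarding state
        # Pop the top label and forward the remainder to the pod interface.
        return out_if, rest, payload


labels = LabelTable()
labels.program(299776, "pod-422A-vif0")  # illustrative label and interface
result = labels.forward([299776], b"user-data")
```

Because the table resolves labels to pod interfaces that only the vRouter knows about, installing the same entries in the kernel would provide no usable next hop, which is consistent with programming these routes to vRouter 206A alone.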
  • FIG. 4 is a block diagram illustrating an example server according to techniques of this disclosure.
  • Server 400 may be similar to server 600 of FIG. 6.
  • A first data plane 394 includes kernel 380.
  • A second data plane 392 includes vRouter 206A and vRouter agent 314 for vRouter 206A.
  • First data plane 394 and second data plane 392 are disjoint.
  • First data plane 394 and second data plane 392 may store different routes for the underlay network and overlay network, respectively.
  • First data plane 394 and second data plane 392 may independently perform forwarding lookups for and forward traffic using the respective, different stored routes.
  • cRPD 324 is the routing protocol process for processing both underlay routes and overlay routes. Having learned the routes, whether by routing protocols or from SDN controller 70, cRPD 324 can selectively program underlay routes to kernel 380 and overlay routes to vRouter 206A (via vRouter agent 314).
  • FIG. 5 is a block diagram illustrating an example vRouter agent, according to techniques of this disclosure.
  • vRouter agent 314 includes gRPC server 520 for exchanging data with cRPD 324 (a gRPC client) via a generic interface 340.
  • APIs of gRPC server 520 include virtual machine interface (VMI) APIs 530 for exchanging virtual network interface data and requests, configuration APIs 532 for exchanging configuration data and requests, and route APIs 534 for exchanging routes and requests — including for enabling cRPD 324 to program routes to vRouter 206A via vRouter agent 314.
  • Synchronization module 544 programs vRouter 206A with virtual network interfaces (e.g., part of a veth pair or a virtio-vhost interface between a DPDK pod and DPDK-based vRouter 206A) and programs vRouter 206A with routing information.
  • vRouter agent 314 provides a generic interface 340 to the data plane for overlay traffic sourced by or destined to application pods on the server.
  • This generic interface 340 may be implemented by any controller, routing protocol process, or other agent because it relies on gRPC rather than a proprietary interface.
  • A generic data plane model is decoupled from a network controller for virtualized computing infrastructure.
  • A data plane can expose application programming interfaces (APIs) 530, 532, 534 that can be implemented by any control-plane service. In some examples, the data plane will also have the capability to work with multiple types of CNI 312.
  • The data plane may be implemented using a DPDK-based virtual router and expose a gRPC interface 340 for exchanging control data.
  • A virtual router agent 314 for the virtual router data plane may operate as a gRPC server 520 that exposes gRPC APIs for programming the virtual router data plane 206A.
  • The techniques include workflows for configuring virtual network interfaces for pods, where the virtual router agent 314 obtains the information from a containerized routing protocol daemon (cRPD) 324 in response to a request for a port from CNI 312.
  • This disclosure describes a generic data plane model that is decoupled from the SDN controller and can expose APIs that can be implemented by any control-plane service, such as vRouter agent 314.
  • The proposed data plane, e.g., vRouter 206A and vRouter agent 314, will also have the capability to work with any CNI.
  • The data plane will work with Platter CNI.
  • NSM Network Service Mesh
  • The set of software components support NSM architecture.
  • Solutions that possibly optimize the data plane to be modular and have a smaller footprint will also be considered.
  • The generic data plane proposed here may be an extension to the current Contrail-based data plane.
  • The design presented will include scope for both a vRouter and a DPDK forwarding plane and will also accommodate the need for supporting upcoming technologies such as eBPF and XDP.
  • This design will also pave the way for having the same generic data plane work with more than one control plane at the same time.
  • The compute node data plane may therefore become more loosely coupled with the SDN control plane, versus existing control plane and data plane integration schemes.
  • The vRouter agent may be a user space process running on Linux. It acts as the local, lightweight control plane and is responsible for the following functions:
  • The vRouter forwarding plane in existing systems may operate as a loadable kernel module in Linux and is responsible for the following functions:
  • Packets received from the overlay network are assigned to a routing instance based on the MPLS label or Virtual Network Identifier (VNI). Virtual interfaces to local virtual machines are bound to routing instances.
  • The routes can be Layer 3 IP prefixes or Layer 2 MAC addresses.
  • A forwarding policy can be applied using a flow table:
    o It matches packets against the flow table and applies the flow actions.
    o It punts the packets for which no flow rule is found (that is, the first packet of every flow) to the vRouter agent, which then installs a rule in the flow table.
    o It punts certain packets such as DHCP, ARP, MDNS to the vRouter agent for proxying to an SDN controller.
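The flow-table behavior described above, in which the first packet of a flow is punted to the agent and later packets hit the installed rule, can be sketched as follows. This Python example is illustrative only; the `Agent` and `FlowTable` classes, the 5-tuple flow key, and the allow/drop policy are assumptions and are not the vRouter implementation.

```python
class Agent:
    """Hypothetical slow-path agent that decides the action for a new flow."""

    def __init__(self, policy):
        self.policy = policy  # callable: flow key -> action

    def handle_punt(self, flow_table, flow_key):
        action = self.policy(flow_key)
        flow_table.install(flow_key, action)  # later packets stay on fast path
        return action


class FlowTable:
    """Toy fast-path flow table keyed by a 5-tuple."""

    def __init__(self, agent):
        self.rules = {}
        self.agent = agent
        self.punts = 0

    def install(self, flow_key, action):
        self.rules[flow_key] = action

    def process(self, flow_key):
        action = self.rules.get(flow_key)
        if action is None:
            # No flow rule found: punt the first packet of the flow.
            self.punts += 1
            action = self.agent.handle_punt(self, flow_key)
        return action


# Illustrative policy: allow TCP port 443, drop everything else.
agent = Agent(lambda key: "forward" if key[4] == 443 else "drop")
table = FlowTable(agent)
flow = ("10.1.1.2", "10.2.2.3", "tcp", 40000, 443)  # src, dst, proto, sport, dport
first = table.process(flow)
second = table.process(flow)
```

Only the first call to `process` for a given flow key reaches the agent; the second is served from the installed rule, which is why `table.punts` stays at one for the flow above.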
  • The vRouter forwarding plane may be implemented using a DPDK-based router, which may present the following properties:
  • The virtual router application runs as multiple logical cores:
    o Logical cores (Lcores) are pthreads with core affinity.
    o Lcores run in poll mode and handle bursts of packets for maximum performance.
  • a generic data plane interface for either type of virtual router to a control plane can be done by enhancing the current model in one of the following ways:
  • a vRouter Agent + vRouter/DPDK forwarding plane with XMPP northbound interface Keep the current model as is by using a vRouter Agent as the data plane and have the control plane implement XMPP (as is done by the contrail control-node). However, not all control planes support XMPP and it may not be the preferred approach.
  • the vRouter agent carries a lot of support for legacy openstack features which may not be necessary, leading to a larger footprint.
  • vRouter Agent + vRouter/DPDK forwarding plane and GRPC northbound interface Keep the current data plane and forwarding plane but implement a commonly used open-source protocol such as GRPC as the interface to the control plane. Using a more widely adapted protocol such as GRPC as the north-bound interface opens up more opportunities. Control planes are likely to be increase adoption. Still, the vRouter agent carries a lot of support for legacy openstack features which may not be necessary, leading to a larger footprint.
  • vRouter/DPDK forwarding plane + lean vRouter Agent and GRPC northbound interface Keep the vRouter/DPDK forwarding plane as is but reduce the footprint of the vRouter agent by only adopting functionality that is strictly required by the forwarding plane.
  • the northbound interface can be either XMPP or preferably GRPC.
  • vRouter agent footprint can be reduced in two ways:
  • vRouter/DPDK forwarding plane + northbound interface Expose the vRouter/DPDK forwarding plane directly to the control plane. This scheme eliminates the need for a separate vRouter agent, but it loses the current hardware abstraction where the vRouter is shielded from the control plane. Also, the intelligence provided by the vRouter agent has to be absorbed either by the control plane or by the vRouter, making it more complex.
  • a combination of schemes (2) and (3) may facilitate a generic data plane.
  • a vRouter-based data plane may be used in conjunction with a cRPD- based control plane
  • the proposed architecture will look as shown in FIGS. 3A-3B and 4-6.
  • vRouter Agent 314 + vRouter/DPDK forwarding plane 206A
  • gRPC northbound interface 340 as the generic data plane interface.
  • the data plane of vRouter 206A and vRouter agent 314 then becomes “generic” by decoupling the northbound interface from vRouter agent 314 to cRPD 324.
  • vRouter 206A and vRouter agent 314 may run in a single container and/or as an independent piece of software. Use of gRPC reduces dependency on any particular control plane. There may be provided support for a gRPC interface as an abstract layer above the vRouter agent 314, an interface to handle config + routing, a gRPC interface to provide abstraction for config objects, a standard data model for the config northbound interface to the control plane, and translation to an Agent-understandable format on the southbound interface.
  • a Port Add and Port Down sequence may be primarily done via cRPD or via a vRouter Agent. Example such sequences for Port Add and Port Down are below:
  • the system supports VRF functionality where overlay and underlay routes remain separated in different VRFs.
  • whereas vRouter agent 314 is the client in an XMPP model, it is the server according to techniques of this disclosure and invokes functions in the client when it has to push an object.
  • the server may be implemented with two Completion Queues to work in asynchronous mode - one for the VMI subscribe service 530 and the other for route/config service 532, 534.
  • An abstract class ServiceData may be defined from which individual service data types inherit.
  • the service data objects’ addresses may be used as tags added to completion queue for various calls.
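The completion-queue arrangement above can be sketched as follows. This is a simplified model under stated assumptions, not the gRPC asynchronous API: `CompletionQueue`, `VmiSubscribeData`, `RouteConfigData`, and the `proceed` method are hypothetical names, and Python's `id()` stands in for an object address used as a tag.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, Dict, List


class ServiceData(ABC):
    """Abstract per-call state; individual service data types inherit from it."""

    @abstractmethod
    def proceed(self) -> str:
        ...


class VmiSubscribeData(ServiceData):
    def __init__(self, vmi_name: str) -> None:
        self.vmi_name = vmi_name

    def proceed(self) -> str:
        return f"subscribe:{self.vmi_name}"


class RouteConfigData(ServiceData):
    def __init__(self, prefix: str) -> None:
        self.prefix = prefix

    def proceed(self) -> str:
        return f"route:{self.prefix}"


class CompletionQueue:
    """Toy stand-in for an asynchronous completion queue keyed by opaque tags."""

    def __init__(self) -> None:
        self._events: Deque[int] = deque()
        self._calls: Dict[int, ServiceData] = {}

    def enqueue(self, call: ServiceData) -> int:
        tag = id(call)  # assumption: the object's identity plays the role of its address
        self._calls[tag] = call
        self._events.append(tag)
        return tag

    def drain(self) -> List[str]:
        # Pop each completed tag, recover the service data object, and dispatch.
        handled = []
        while self._events:
            tag = self._events.popleft()
            handled.append(self._calls.pop(tag).proceed())
        return handled


# Two queues, mirroring one queue per service group.
vmi_queue = CompletionQueue()
route_queue = CompletionQueue()
vmi_queue.enqueue(VmiSubscribeData("vmi-1"))
route_queue.enqueue(RouteConfigData("10.1.1.0/24"))
```

Because the tag alone identifies the call, the event loop needs no per-service branching: it recovers the object from the tag and calls its `proceed` method.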
  • FIG. 7 is a conceptual diagram depicting a sequence of operations on a port-add leading to route programming in a vRouter, according to example aspects of this disclosure.
  • the sequence of operations is described with respect to components of server 300, but may be performed by components of any server described in this disclosure, e.g., servers 12, 350, 400, 600.
  • the sequence of operations in FIG. 7 may be similar to operations of CM - Agent (Option 2) described above.
  • CNI 312 has the IP address block reserved for Pods.
  • vRouter agent 314 listens for Port-Add and Port-Delete messages, e.g., on a thrift service, where a “port” corresponds to a virtual network interface.
  • CNI 312 sends a Port-Add message to vRouter agent 314 (702).
  • the Port-Add message includes an identifier for the virtual network for the port and an IP address allocated by CNI 312 for the Pod. (CNI 312 may separately configure the Pod with the other end of the virtual network interface.)
  • vRouter agent 314 creates a virtual network interface (referred to here as a virtual machine interface or VMI, which is an example of a virtual network interface) in interfaces 540 (704).
  • vRouter agent 314 configures the virtual network interface in vRouter 206A with a default VRF identifier, with a VMI Add message (706).
  • vRouter agent 314 subscribes to cRPD 324 instead of an SDN controller with a VMI Subscribe message that includes the virtual network name and IP address received in the Port Add message (708).
  • cRPD 324 sends a VMI Config message to vRouter agent 314 with the correct VRF identifier for the virtual network for the virtual network interface (712), optionally adding a VRF to vRouter agent 314 if needed with a VRF Add message (710).
  • vRouter agent 314 sends a VMI Update message with the correct VRF identifier to vRouter 206A to cause vRouter 206A to attach the virtual network interface to the correct VRF (714).
  • cRPD 324 allocates a service label and adds a route and next-hop (e.g., an MPLS route for BGP IP-VPNs) using a Route Add message to vRouter agent 314 (716).
  • cRPD 324 also advertises a route for reaching the Pod to its peer routers (718), which may include other cRPDs, routers in the underlay network, or other routers.
  • vRouter agent 314 configures vRouter 206A with forwarding information for the route received in the Route Add message from cRPD 324 (720). Example message structures and data structures for messages described with respect to FIG. 7 are defined above.
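The port-add sequence of FIG. 7 can be summarized as an ordered message trace. The sketch below is illustrative only: the function name `port_add_sequence`, the tuple layout, and the message strings are assumptions, while the step numbers (702-720) and the sender/receiver roles follow the description above.

```python
from typing import List, Tuple

# (step number, sender, receiver, message)
Step = Tuple[int, str, str, str]


def port_add_sequence(virtual_network: str, pod_ip: str, vrf_exists: bool) -> List[Step]:
    """Return the ordered messages exchanged when a port is added for a Pod."""
    steps: List[Step] = [
        (702, "CNI", "vRouter agent", f"Port-Add({virtual_network}, {pod_ip})"),
        (704, "vRouter agent", "vRouter agent", "create VMI"),
        (706, "vRouter agent", "vRouter", "VMI Add(default VRF)"),
        (708, "vRouter agent", "cRPD", f"VMI Subscribe({virtual_network}, {pod_ip})"),
    ]
    if not vrf_exists:
        # The VRF Add message is optional and is sent only if the VRF is missing.
        steps.append((710, "cRPD", "vRouter agent", "VRF Add"))
    steps += [
        (712, "cRPD", "vRouter agent", "VMI Config(correct VRF)"),
        (714, "vRouter agent", "vRouter", "VMI Update(correct VRF)"),
        (716, "cRPD", "vRouter agent", "Route Add(service label, next hop)"),
        (718, "cRPD", "peer routers", "advertise route to Pod"),
        (720, "vRouter agent", "vRouter", "program forwarding information"),
    ]
    return steps


trace = port_add_sequence("blue-net", "10.1.1.2", vrf_exists=False)
```

Note that the interface is first attached to a default VRF (706) and only moved to the correct VRF (714) after cRPD responds to the subscription.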
  • FIG. 11 is a block diagram illustrating example components of a virtual router agent and an example sequence of operations and messages to create and advertise a new port for a Pod, in accordance with techniques of this disclosure.
  • the example sequence may be similar in some respects to that described with respect to FIG. 7.
  • FIG. 8 is a block diagram of an example computing device (e.g., host), according to techniques described in this disclosure.
  • Computing device 800 of FIG. 2 may represent a real or virtual server and may represent an example instance of any of servers 12 of FIG. 1, or servers 350 or 400.
  • Computing device 800 includes in this example, a bus 242 coupling hardware components of a computing device 800 hardware environment.
  • Bus 242 couples network interface card (NIC) 230, storage disk 246, and one or more microprocessors 210 (hereinafter, "microprocessor 810").
  • NIC 230 may be SR-IOV-capable.
  • a front-side bus may in some cases couple microprocessor 810 and memory device 244.
  • bus 242 may couple memory device 244, microprocessor 810, and NIC 230.
  • Bus 242 may represent a Peripheral Component Interface (PCI) express (PCIe) bus.
  • a direct memory access (DMA) controller may control DMA transfers among components coupled to bus 242. In some examples, components coupled to bus 242 control DMA transfers among components coupled to bus 242.
  • Microprocessor 810 may include one or more processors each including an independent execution unit to perform instructions that conform to an instruction set architecture, the instructions stored to storage media.
  • Execution units may be implemented as separate integrated circuits (ICs) or may be combined within one or more multi-core processors (or “many-core” processors) that are each implemented using a single IC (i.e., a chip multiprocessor).
  • Disk 246 represents computer readable storage media that includes volatile and/or non-volatile, removable and/or non-removable media implemented in any method or technology for storage of information such as processor-readable instructions, data structures, program modules, or other data.
  • Computer readable storage media includes, but is not limited to, random access memory (RAM), read-only memory (ROM), EEPROM, Flash memory, CD-ROM, digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by microprocessor 810.
  • Main memory 244 includes one or more computer-readable storage media, which may include random-access memory (RAM) such as various forms of dynamic RAM (DRAM), e.g., DDR2/DDR3 SDRAM, or static RAM (SRAM), flash memory, or any other form of fixed or removable storage medium that can be used to carry or store desired program code and program data in the form of instructions or data structures and that can be accessed by a computer.
  • Main memory 244 provides a physical address space composed of addressable memory locations.
  • Network interface card (NIC) 230 includes one or more interfaces 232 configured to exchange packets using links of an underlying physical network.
  • Interfaces 232 may include a port interface card having one or more network ports. NIC 230 may also include an on-card memory to, e.g., store packet data. Direct memory access transfers between the NIC 230 and other devices coupled to bus 242 may read/write from/to the NIC memory.
  • Memory 244, NIC 230, storage disk 246, and microprocessor 810 may provide an operating environment for a software stack that includes an operating system kernel 380 executing in kernel space.
  • Kernel 380 may represent, for example, a Linux, Berkeley Software Distribution (BSD), another Unix-variant kernel, or a Windows server operating system kernel, available from Microsoft Corp.
  • the operating system may execute a hypervisor and one or more virtual machines managed by hypervisor.
  • Example hypervisors include Kernel-based Virtual Machine (KVM) for the Linux kernel, Xen, ESXi available from VMware, Windows Hyper-V available from Microsoft, and other open-source and proprietary hypervisors.
  • the term hypervisor can encompass a virtual machine manager (VMM).
  • An operating system that includes kernel 380 provides an execution environment for one or more processes in user space 245.
  • Kernel 380 includes a physical driver 225 to use the network interface card 230.
  • Network interface card 230 may also implement SR-IOV to enable sharing the physical network function (I/O) among one or more virtual execution elements, such as containers 229A-229B or one or more virtual machines (not shown in FIG. 2).
  • Shared virtual devices such as virtual functions may provide dedicated resources such that each of the virtual execution elements may access dedicated resources of NIC 230, which therefore appears to each of the virtual execution elements as a dedicated NIC.
  • Virtual functions may represent lightweight PCIe functions that share physical resources with a physical function used by physical driver 225 and with other virtual functions.
  • NIC 230 may have thousands of available virtual functions according to the SR-IOV standard, but for I/O-intensive applications the number of configured virtual functions is typically much smaller.
  • Computing device 800 may be coupled to a physical network switch fabric that includes an overlay network that extends switch fabric from physical switches to software or “virtual” routers of physical servers coupled to the switch fabric, including virtual router 206A.
  • Virtual routers may be processes or threads, or a component thereof, executed by the physical servers, e.g., servers 12 of FIG. 1, that dynamically create and manage one or more virtual networks usable for communication between virtual network endpoints.
  • virtual routers implement each virtual network using an overlay network, which provides the capability to decouple an endpoint’s virtual address from a physical address (e.g., IP address) of the server on which the endpoint is executing.
  • Each virtual network may use its own addressing and security scheme and may be viewed as orthogonal from the physical network and its addressing scheme.
  • Various techniques may be used to transport packets within and across virtual networks over the physical network.
  • the term “virtual router” as used herein may encompass an Open vSwitch (OVS), an OVS bridge, a Linux bridge, Docker bridge, or other device and/or software that is located on a host device and performs switching, bridging, or routing packets among virtual network endpoints of one or more virtual networks, where the virtual network endpoints are hosted by one or more of servers 12.
  • Virtual router 206A can be executing as a kernel module or as a user space DPDK process (virtual router 206A is shown here in user space 245). Virtual router agent 314 may also be executing in user space. In the example computing device 800 of FIG. 2, virtual router 206A executes within user space as a DPDK-based virtual router, but virtual router 206A may execute within a hypervisor, a host operating system, a host application, or a virtual machine in various implementations. Virtual router agent 314 has a connection to network controller 24 using a channel, which is used to download configurations and forwarding information. Virtual router agent 314 programs this forwarding state to the virtual router data (or “forwarding”) plane represented by virtual router 206A. Virtual router 206A and virtual router agent 314 may be processes.
  • Virtual router 206A may replace and subsume the virtual routing/bridging functionality of the Linux bridge/OVS module that is commonly used for Kubernetes deployments of pods 202. Virtual router 206A may perform bridging (e.g., E-VPN) and routing (e.g., L3VPN, IP-VPNs) for virtual networks. Virtual router 206A may perform networking services such as applying security policies, NAT, multicast, mirroring, and load balancing.
  • virtual router 206A uses one or more physical interfaces 232.
  • virtual router 206A exchanges overlay packets with workloads, such as VMs or pods 202 (in FIG. 2).
  • Virtual router 206A has multiple virtual network interfaces (e.g., vifs). These interfaces may include the kernel interface, vhost0, for exchanging packets with the host operating system, and an interface with virtual router agent 314, pkt0, to obtain forwarding state from the network controller and to send up exception packets.
  • Virtual network interfaces of virtual router 206A are for exchanging packets with the workloads.
  • Virtual network interfaces 212, 213 of virtual router 206A are illustrated in FIG. 2.
  • Virtual network interfaces 212, 213 may be any of the aforementioned types of virtual interfaces. In some cases, virtual network interfaces 212, 213 are tap interfaces.
  • In a kernel-based deployment of virtual router 206A (not shown), virtual router 206A is installed as a kernel module inside the operating system. Virtual router 206A registers itself with the TCP/IP stack to receive packets from any of the operating system interfaces that it wants. The interfaces can be bond, physical, tap (for VMs), veth (for containers), etc. Virtual router 206A in this mode relies on the operating system to send and receive packets from different interfaces. For example, the operating system may expose a tap interface backed by a vhost-net driver to communicate with VMs. Once virtual router 206A registers for packets from this tap interface, the TCP/IP stack sends all the packets to it. Virtual router 206A sends packets via an operating system interface. In addition, NIC queues (physical or virtual) are handled by the operating system. Packet processing may operate in interrupt mode, which generates interrupts and may lead to frequent context switching.
  • In a DPDK-based deployment, virtual router 206A is installed as a user space 245 application that is linked to the DPDK library. This may lead to faster performance than a kernel-based deployment, particularly in the presence of high packet rates.
  • the physical interfaces 232 are used by the poll mode drivers (PMDs) of DPDK rather than the kernel’s interrupt-based drivers.
  • the registers of physical interfaces 232 may be exposed into user space 245 in order to be accessible to the PMDs; a physical interface 232 bound in this way is no longer managed by or visible to the host operating system, and the DPDK-based virtual router 206A manages the physical interface 232.
  • the nature of this “polling mode” makes the virtual router 206A DPDK data plane packet processing/forwarding much more efficient as compared to the interrupt mode when the packet rate is high. There are comparatively few interrupts and context switching during packet I/O, compared to kernel-mode virtual router 206A, and interrupt and context switching during packet I/O may in some cases be avoided altogether.
  • each of pods 202A-202B may be assigned one or more virtual network addresses for use within respective virtual networks, where each of the virtual networks may be associated with a different virtual subnet provided by virtual router 206A.
  • Pod 202B may be assigned its own virtual layer three (L3) IP address, for example, for sending and receiving communications but may be unaware of an IP address of the computing device 800 on which the pod 202B executes.
  • the virtual network address may thus differ from the logical address for the underlying, physical computer system, e.g., computing device 800.
  • Computing device 800 includes a virtual router agent 314 that controls the overlay of virtual networks for computing device 800 and that coordinates the routing of data packets within computing device 800.
  • virtual router agent 314 communicates with network controller 24 for the virtualization infrastructure, which generates commands to create virtual networks and configure network virtualization endpoints, such as computing device 800 and, more specifically, virtual router 206A, as well as virtual network interface 212.
  • By configuring virtual router 206A based on information received from network controller 24, virtual router agent 314 may support configuring network isolation, policy-based security, a gateway, source network address translation (SNAT), a load-balancer, and service chaining capability for orchestration.
  • network packets, e.g., layer three (L3) IP packets or layer two (L2) Ethernet packets, generated or consumed by the containers 229A-229B within the virtual network domain may be encapsulated in another packet (e.g., another IP or Ethernet packet) that is transported by the physical network.
  • the packet transported in a virtual network may be referred to herein as an “inner packet” while the physical network packet may be referred to herein as an “outer packet” or a “tunnel packet.”
  • Encapsulation and/or de-capsulation of virtual network packets within physical network packets may be performed by virtual router 206A. This functionality is referred to herein as tunneling and may be used to create one or more overlay networks.
  • Virtual router 206A performs tunnel encapsulation/decapsulation for packets sourced by/destined to any containers of pods 202, and virtual router 206A exchanges packets with pods 202 via bus 242 and/or a bridge of NIC 230.
  • a network controller 24 may provide a logically centralized controller for facilitating operation of one or more virtual networks.
  • the network controller 24 may, for example, maintain a routing information base, e.g., one or more routing tables that store routing information for the physical network as well as one or more overlay networks.
  • Virtual router 206A implements one or more virtual routing and forwarding instances (VRFs) 222A-222B for respective virtual networks for which virtual router 206A operates as respective tunnel endpoints.
  • each VRF 222 stores forwarding information for the corresponding virtual network and identifies where data packets are to be forwarded and whether the packets are to be encapsulated in a tunneling protocol, such as with a tunnel header that may include one or more headers for different layers of the virtual network protocol stack.
  • Each of VRFs 222 may include a network forwarding table storing routing and forwarding information for the virtual network.
  • NIC 230 may receive tunnel packets.
  • Virtual router 206A processes the tunnel packet to determine, from the tunnel encapsulation header, the virtual network of the source and destination endpoints for the inner packet.
  • Virtual router 206A may strip the layer 2 header and the tunnel encapsulation header to internally forward only the inner packet.
  • the tunnel encapsulation header may include a virtual network identifier, such as a VxLAN tag or MPLS label, that indicates a virtual network, e.g., a virtual network corresponding to VRF 222A.
  • VRF 222A may include forwarding information for the inner packet. For instance, VRF 222A may map a destination layer 3 address for the inner packet to virtual network interface 212. VRF 222A forwards the inner packet via virtual network interface 212 to POD 202A in response.
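The receive path above (tunnel header identifies the virtual network, the matching VRF maps the inner destination to a virtual network interface) can be sketched as follows. The `Vrf` and `TunnelPacket` classes, the interface names, and the use of longest-prefix match are illustrative assumptions; the disclosure states only that the VRF maps a destination layer 3 address to an interface.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Vrf:
    name: str
    # Destination prefix -> virtual network interface name (hypothetical names).
    routes: Dict[ipaddress.IPv4Network, str] = field(default_factory=dict)

    def lookup(self, dst: str) -> Optional[str]:
        addr = ipaddress.ip_address(dst)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        # Assumption: the most specific (longest) matching prefix wins.
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


@dataclass(frozen=True)
class TunnelPacket:
    vni: int          # virtual network identifier, e.g. a VXLAN tag or MPLS label
    inner_dst: str    # destination layer 3 address of the inner packet
    payload: bytes


def receive(vrfs_by_vni: Dict[int, Vrf], pkt: TunnelPacket) -> Optional[str]:
    """Return the interface on which the inner packet is forwarded, if any."""
    vrf = vrfs_by_vni.get(pkt.vni)
    if vrf is None:
        return None  # unknown virtual network: no VRF to forward in
    return vrf.lookup(pkt.inner_dst)


vrf_a = Vrf("222A", {
    ipaddress.ip_network("10.1.1.0/24"): "vif-212",
    ipaddress.ip_network("10.1.1.7/32"): "vif-213",
})
vrfs = {100: vrf_a}
```

Keying the VRF table by the identifier carried in the tunnel header is what lets overlapping inner address spaces coexist across virtual networks.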
  • Containers 229A may also source inner packets as source virtual network endpoints.
  • Container 229A may generate a layer 3 inner packet destined for a destination virtual network endpoint that is executed by another computing device (i.e., not computing device 800) or for another one of containers.
  • Container 229A may send the layer 3 inner packet to virtual router 206A via virtual network interface 212 attached to VRF 222A.
  • Virtual router 206A receives the inner packet and layer 2 header and determines a virtual network for the inner packet.
  • Virtual router 206A may determine the virtual network using any of the above-described virtual network interface implementation techniques (e.g., macvlan, veth, etc.).
  • Virtual router 206A uses the VRF 222A corresponding to the virtual network for the inner packet to generate an outer header for the inner packet, the outer header including an outer IP header for the overlay tunnel and a tunnel encapsulation header identifying the virtual network.
  • Virtual router 206A encapsulates the inner packet with the outer header.
  • Virtual router 206A may encapsulate the tunnel packet with a new layer 2 header having a destination layer 2 address associated with a device external to the computing device 800, e.g., a TOR switch 16 or one of servers 12. If external to computing device 800, virtual router 206A outputs the tunnel packet with the new layer 2 header to NIC 230 using physical function 221. NIC 230 outputs the packet on an outbound interface. If the destination is another virtual network endpoint executing on computing device 800, virtual router 206A routes the packet to the appropriate one of virtual network interfaces 212, 213.
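The transmit path described above can be sketched in a similar way. The `VrfState` structure, its `local` and `remote` maps, and the `"nic"` output label are hypothetical; the sketch shows only the decision between local delivery and encapsulation with an outer header that carries the virtual network identifier.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class InnerPacket:
    dst_ip: str
    payload: bytes


@dataclass(frozen=True)
class TunnelPacket:
    outer_src_ip: str   # address of the sending host
    outer_dst_ip: str   # address of the server hosting the destination endpoint
    vni: int            # tunnel encapsulation header field identifying the virtual network
    inner: InnerPacket


@dataclass
class VrfState:
    vni: int
    local: Dict[str, str]    # inner destination -> local virtual network interface
    remote: Dict[str, str]   # inner destination -> IP address of the hosting server


def forward(inner: InnerPacket, vrf: VrfState,
            host_ip: str) -> Tuple[str, Union[InnerPacket, TunnelPacket]]:
    """Return (output, packet): a local interface name or "nic" for the physical NIC."""
    if inner.dst_ip in vrf.local:
        # Destination endpoint runs on this host: no encapsulation is needed.
        return vrf.local[inner.dst_ip], inner
    # Raises KeyError when no route exists; a real router would drop or use a default route.
    outer_dst = vrf.remote[inner.dst_ip]
    return "nic", TunnelPacket(host_ip, outer_dst, vrf.vni, inner)


vrf = VrfState(vni=100,
               local={"10.1.1.3": "vif-213"},
               remote={"10.1.1.9": "192.0.2.20"})
out_remote, pkt_remote = forward(InnerPacket("10.1.1.9", b"x"), vrf, "192.0.2.10")
out_local, pkt_local = forward(InnerPacket("10.1.1.3", b"y"), vrf, "192.0.2.10")
```

The new layer 2 header mentioned above is omitted here; it would be added after the outer header is built and before the packet is handed to the NIC.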
  • a controller for computing device 800 (e.g., network controller 24 of FIG. 1) configures a default route in each of pods 202 to cause the virtual machines 224 to use virtual router 206A as an initial next hop for outbound packets.
  • NIC 230 is configured with one or more forwarding rules to cause all packets received from virtual machines 224 to be switched to virtual router 206A.
  • Pod 202A includes one or more application containers 229A.
  • Pod 202B includes an instance of cRPD 324.
  • Container platform 804 includes container runtime 208, orchestration agent 310, service proxy 211, and CNI 312.
  • Container engine 208 includes code executable by microprocessor 810.
  • Container runtime 208 may be one or more computer processes.
  • Container engine 208 runs containerized applications in the form of containers 229A-229B.
  • Container engine 208 may represent a Docker, rkt, or other container engine for managing containers.
  • container engine 208 receives requests and manages objects such as images, containers, networks, and volumes.
  • An image is a template with instructions for creating a container.
  • a container is an executable instance of an image. Based on directives from controller agent 310, container engine 208 may obtain images and instantiate them as executable containers in pods 202A-202B.
  • Service proxy 211 includes code executable by microprocessor 810.
  • Service proxy 211 may be one or more computer processes.
  • Service proxy 211 monitors for the addition and removal of service and endpoints objects, and it maintains the network configuration of the computing device 800 to ensure communication among pods and containers, e.g., using services.
  • Service proxy 211 may also manage iptables to capture traffic to a service’s virtual IP address and port and redirect the traffic to the proxy port that proxies a backend pod.
  • Service proxy 211 may represent a kube-proxy for a minion node of a Kubernetes cluster.
  • container platform 804 does not include a service proxy 211 or the service proxy 211 is disabled in favor of configuration of virtual router 206A and pods 202 by CNI 312.
  • Orchestration agent 310 includes code executable by microprocessor 810.
  • Orchestration agent 310 may be one or more computer processes.
  • Orchestration agent 310 may represent a kubelet for a minion node of a Kubernetes cluster.
  • Orchestration agent 310 is an agent of an orchestrator, e.g., orchestrator 23 of FIG. 1, that receives container specification data for containers and ensures the containers execute by computing device 800.
  • Container specification data may be in the form of a manifest file sent to orchestration agent 310 from orchestrator 23 or indirectly received via a command line interface, HTTP endpoint, or HTTP server.
  • Container specification data may be a pod specification (e.g., a
  • Orchestration agent 310 instantiates or otherwise invokes CNI 312 to configure one or more virtual network interfaces for each of pods 202.
  • orchestration agent 310 receives container specification data for pod 202A and directs container engine 208 to create the pod 202A with containers 229A based on the container specification data for pod 202A.
  • Orchestration agent 310 also invokes the CNI 312 to configure, for pod 202A, a virtual network interface for a virtual network corresponding to VRF 222A.
  • pod 202A is a virtual network endpoint for a virtual network corresponding to VRF 222A.
  • CNI 312 may obtain interface configuration data for configuring virtual network interfaces for pods 202.
  • Virtual router agent 314 operates as a virtual network control plane module for enabling network controller 24 to configure virtual router 206A.
  • a virtual network control plane (including network controller 24 and virtual router agent 314 for minion nodes) manages the configuration of virtual networks implemented in the data plane in part by virtual routers 206A of the minion nodes.
  • Virtual router agent 314 communicates, to CNI 312, interface configuration data for virtual network interfaces to enable an orchestration control plane element (i.e., CNI 312) to configure the virtual network interfaces according to the configuration state determined by the network controller 24, thus bridging the gap between the orchestration control plane and virtual network control plane.
  • this may enable a CNI 312 to obtain interface configuration data for multiple virtual network interfaces for a pod and configure the multiple virtual network interfaces, which may reduce communication and resource overhead inherent with invoking a separate CNI 312 for configuring each virtual network interface.
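The overhead reduction described above can be illustrated with a small sketch in which a single CNI invocation retrieves and applies the configuration for every interface of a pod. `AgentStub`, `InterfaceConfig`, `cni_add`, and the pod and network names are hypothetical and serve only to show the single-request pattern.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InterfaceConfig:
    name: str              # interface name inside the pod, e.g. "eth0"
    virtual_network: str
    ip_address: str


class AgentStub:
    """Stands in for the virtual router agent's interface-configuration endpoint."""

    def __init__(self, configs: Dict[str, List[InterfaceConfig]]) -> None:
        self._configs = configs
        self.requests = 0  # counts round trips made by the CNI

    def get_interfaces(self, pod: str) -> List[InterfaceConfig]:
        self.requests += 1
        return list(self._configs.get(pod, []))


def cni_add(pod: str, agent: AgentStub) -> List[str]:
    """One CNI invocation that configures every virtual network interface of the pod."""
    configured = []
    for cfg in agent.get_interfaces(pod):
        configured.append(f"{pod}/{cfg.name} -> {cfg.virtual_network} ({cfg.ip_address})")
    return configured


agent = AgentStub({
    "pod-202A": [
        InterfaceConfig("eth0", "blue-net", "10.1.1.2"),
        InterfaceConfig("eth1", "red-net", "10.2.1.2"),
    ]
})
configured = cni_add("pod-202A", agent)
```

Both interfaces are configured after one request to the agent, rather than one CNI invocation and one request per interface.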
  • FIG. 9 is a block diagram of an example computing device operating as an instance of an orchestrator master node for a cluster for a virtualized computing infrastructure.
  • Computing device 300 of FIG. 9 may represent one or more real or virtual servers. As such, computing device 300 may in some instances implement one or more master nodes for respective clusters.
  • Scheduler 1322, API server 1320, network controller manager 1326, network controller 1324, network controller manager 1325, and configuration store 1328 may be distributed among multiple computing devices 300 that make up a computing system or hardware/server cluster. Each of the multiple computing devices 300, in other words, may provide a hardware operating environment for one or more instances of any one or more of scheduler 1322, API server 1320, network controller manager 1326, network controller 1324, network controller manager 1325, or configuration store 1328.
  • Network controller 1324 may represent an example instance of network controller 24 of FIG. 1.
  • Scheduler 1322, API server 1320, controller manager 1326, and network controller manager 1325 may implement an example instance of orchestrator 23.
  • Network controller manager 1325 may represent an example implementation of a Kubernetes cloud controller manager or kube-manager.
  • Network controller 1324 may represent an example instance of network controller 24.
  • Computing device 300 includes in this example, a bus 1342 coupling hardware components of a computing device 300 hardware environment.
  • Bus 1342 couples network interface card (NIC) 1330, storage disk 1346, and one or more microprocessors 1310 (hereinafter, “microprocessor 1310”).
  • a front-side bus may in some cases couple microprocessor 1310 and memory device 1344.
  • bus 1342 may couple memory device 1344, microprocessor 1310, and NIC 1330.
  • Bus 1342 may represent a Peripheral Component Interface (PCI) express (PCIe) bus.
  • a direct memory access (DMA) controller may control DMA transfers among components coupled to bus 242.
  • components coupled to bus 1342 control DMA transfers among components coupled to bus 1342.
  • Microprocessor 1310 may include one or more processors each including an independent execution unit to perform instructions that conform to an instruction set architecture, the instructions stored to storage media.
  • Execution units may be implemented as separate integrated circuits (ICs) or may be combined within one or more multi-core processors (or “many-core” processors) that are each implemented using a single IC (i.e., a chip multiprocessor).
  • Disk 1346 represents computer readable storage media that includes volatile and/or non-volatile, removable and/or non-removable media implemented in any method or technology for storage of information such as processor-readable instructions, data structures, program modules, or other data.
  • Computer readable storage media includes, but is not limited to, random access memory (RAM), read-only memory (ROM), EEPROM, Flash memory, CD-ROM, digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by microprocessor 1310.
  • Main memory 1344 includes one or more computer-readable storage media, which may include random-access memory (RAM) such as various forms of dynamic RAM (DRAM), e.g., DDR2/DDR3 SDRAM, or static RAM (SRAM), flash memory, or any other form of fixed or removable storage medium that can be used to carry or store desired program code and program data in the form of instructions or data structures and that can be accessed by a computer.
  • Main memory 1344 provides a physical address space composed of addressable memory locations.
  • Network interface card (NIC) 1330 includes one or more interfaces 3132 configured to exchange packets using links of an underlying physical network. Interfaces 3132 may include a port interface card having one or more network ports. NIC 1330 may also include an on-card memory to, e.g., store packet data. Direct memory access transfers between the NIC 1330 and other devices coupled to bus 1342 may read/write from/to the NIC memory.
  • Memory 1344, NIC 1330, storage disk 1346, and microprocessor 1310 may provide an operating environment for a software stack that includes an operating system kernel 1314 executing in kernel space.
  • Kernel 1314 may represent, for example, a Linux, Berkeley Software Distribution (BSD), another Unix-variant kernel, or a Windows server operating system kernel, available from Microsoft Corp.
  • the operating system may execute a hypervisor and one or more virtual machines managed by hypervisor.
  • Example hypervisors include Kernel-based Virtual Machine (KVM) for the Linux kernel, Xen, ESXi available from VMware, Windows Hyper-V available from Microsoft, and other open-source and proprietary hypervisors.
  • the term hypervisor can encompass a virtual machine manager (VMM).
  • An operating system that includes kernel 1314 provides an execution environment for one or more processes in user space 1345. Kernel 1314 includes a physical driver 1325 to use the network interface card 230.
  • Computing device 300 may be coupled to a physical network switch fabric that includes an overlay network that extends switch fabric from physical switches to software or “virtual” routers of physical servers coupled to the switch fabric, such as virtual router 220 of FIG. 2.
  • Computing device 300 may use one or more dedicated virtual networks to configure minion nodes of a cluster.
  • API server 1320, scheduler 1322, controller manager 1326, and configuration store may implement a master node for a cluster and be alternatively referred to as “master components.”
  • the cluster may be a Kubernetes cluster and the master node a Kubernetes master node, in which case the master components are Kubernetes master components.
  • API server 1320 includes code executable by microprocessor 1310.
  • API server 1320 may be one or more computer processes.
  • API server 1320 validates and configures data for objects, such as virtual execution elements (e.g., pods of containers), services, and replication controllers, for instance.
  • a service may be an abstraction that defines a logical set of pods and the policy used to access the pods. The set of pods implementing a service are selected based on the service definition.
  • a service may be implemented in part as, or otherwise include, a load balancer.
  • API server 1320 may implement a Representational State Transfer (REST) interface to process REST operations and provide the frontend to a corresponding cluster’s shared state stored to configuration store 1328.
  • API server 1320 may authenticate and authorize requests.
  • API server 1320 communicates with other components to instantiate virtual execution elements in the computing infrastructure 8.
  • API server 1320 may represent a Kubernetes API server.
  • Configuration store 1328 is a backing store for all cluster data.
  • Cluster data may include cluster state and configuration data.
  • Configuration data may also provide a backend for service discovery and/or provide a locking service.
  • Configuration store 1328 may be implemented as a key value store.
  • Configuration store 1328 may be a central database or distributed database.
  • Configuration store 1328 may represent an etcd store.
  • Configuration store 1328 may represent a Kubernetes configuration store.
  • Scheduler 1322 includes code executable by microprocessor 1310.
  • Scheduler 1322 may be one or more computer processes.
  • Scheduler 1322 monitors for newly created or requested virtual execution elements (e.g., pods of containers) and selects a minion node on which the virtual execution elements are to run.
  • Scheduler 1322 may select a minion node based on resource requirements, hardware constraints, software constraints, policy constraints, locality, etc.
  • Scheduler 1322 may represent a Kubernetes scheduler.
  • API server 1320 may invoke the scheduler 1322 to schedule a virtual execution element. Scheduler 1322 may select a minion node and return an identifier for the selected minion node to API server 1320, which may write the identifier to the configuration store 1328 in association with the virtual execution element.
  • API server 1320 may invoke the orchestration agent 310 for the selected minion node, which may cause the container engine 208 for the selected minion node to obtain the virtual execution element from a storage server and create the virtual execution element on the minion node.
  • the orchestration agent 310 for the selected minion node may update the status for the virtual execution element to the API server 1320, which persists this new state to the configuration store 1328. In this way, computing device 300 instantiates new virtual execution elements in the computing infrastructure 8.
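  • The scheduling flow described above can be sketched roughly as follows. This is a minimal illustration only, not the orchestration platform's implementation: the `Node`, `PodSpec`, and `ConfigStore` types, the "most free CPU" scoring rule, and the `/pods/<name>/node` key layout are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_mem_mb: int


@dataclass
class PodSpec:
    name: str
    cpu_millicores: int
    mem_mb: int


@dataclass
class ConfigStore:
    """Stand-in for the key-value configuration store (hypothetical key layout)."""
    data: dict = field(default_factory=dict)

    def put(self, key: str, value: str) -> None:
        self.data[key] = value


def select_node(pod: PodSpec, nodes: list) -> Optional[str]:
    """Scheduler role: filter nodes by resource requirements, then pick one.

    The scoring rule (most free CPU wins) is an assumption for illustration.
    """
    feasible = [
        n for n in nodes
        if n.free_cpu_millicores >= pod.cpu_millicores and n.free_mem_mb >= pod.mem_mb
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda n: n.free_cpu_millicores).name


def schedule(pod: PodSpec, nodes: list, store: ConfigStore) -> Optional[str]:
    """API-server role: invoke the scheduler and persist the returned identifier."""
    node_name = select_node(pod, nodes)
    if node_name is not None:
        store.put(f"/pods/{pod.name}/node", node_name)
    return node_name


nodes = [Node("minion-a", 500, 1024), Node("minion-b", 2000, 4096)]
store = ConfigStore()
chosen = schedule(PodSpec("du-0", 1000, 2048), nodes, store)
print(chosen, store.data)
```

  • In this sketch the binding is written to the store only after a node has been selected, mirroring the order of operations in the text: select, return the identifier, then persist.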
  • Controller manager 1326 includes code executable by microprocessor 1310. Controller manager 1326 may be one or more computer processes. Controller manager 1326 may embed the core control loops, monitoring a shared state of a cluster by obtaining notifications from API Server 1320. Controller manager 1326 may attempt to move the state of the cluster toward the desired state.
  • Example controllers (not shown) managed by the controller manager 1326 may include a replication controller, endpoints controller, namespace controller, and service accounts controller. Controller manager 1326 may perform lifecycle functions such as namespace creation and lifecycle, event garbage collection, terminated pod garbage collection, cascading-deletion garbage collection, node garbage collection, etc. Controller manager 1326 may represent a Kubernetes Controller Manager for a Kubernetes cluster.
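  • A control loop that moves cluster state toward a desired state can be illustrated with a small reconciliation function. This is a sketch under simplifying assumptions: state is reduced to replica counts per workload name, and the `("create" | "delete", name, count)` action tuples are a hypothetical representation, not an orchestrator API.

```python
from __future__ import annotations


def reconcile(desired: dict, observed: dict) -> list:
    """Return the actions needed to move observed replica counts toward desired.

    Both arguments map a workload name to a replica count. A workload missing
    from a mapping is treated as having zero replicas there.
    """
    actions = []
    for name in sorted(set(desired) | set(observed)):
        want = desired.get(name, 0)
        have = observed.get(name, 0)
        if want > have:
            actions.append(("create", name, want - have))
        elif want < have:
            actions.append(("delete", name, have - want))
    return actions


plan = reconcile({"web": 3, "db": 1}, {"web": 1, "cache": 2})
print(plan)
```

  • A real controller would run such a step repeatedly as notifications arrive from the API server; the function above shows a single pass.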
  • Network controller 1324 includes code executable by microprocessor 1310.
  • Network controller 1324 may include one or more computer processes.
  • Network controller 1324 may represent an example instance of network controller 24 of FIG. 1.
  • the network controller 1324 may be a logically centralized but physically distributed Software Defined Networking (SDN) controller that is responsible for providing the management, control, and analytics functions of a virtualized network.
  • network controller 1324 may be a logically centralized control plane and management plane of the computing infrastructure 8 and orchestrates vRouters for one or more minion nodes.
  • Network controller 1324 may provide network function virtualization (NFV) to networks, such as business edge networks, broadband subscriber management edge networks, and mobile edge networks.
  • NFV involves orchestration and management of networking functions such as firewalls, Intrusion Detection or Prevention Systems (IDS / IPS), Deep Packet Inspection (DPI), caching, Wide Area Network (WAN) optimization, etc. in virtual machines, containers, or other virtual execution elements instead of on physical hardware appliances.
  • the main drivers for virtualization of the networking services in this market are time to market and cost optimization.
  • Network controller 1324 programs network infrastructure elements to create virtual networks and may create interface configurations for virtual network interfaces for the virtual networks.
  • Network controller manager 1325 includes code executable by microprocessor 1310.
  • Network controller manager 1325 may be one or more computer processes.
  • Network controller manager 1325 operates as an interface between the orchestration-oriented elements (e.g., scheduler 1322, API server 1320, controller manager 1326, and configuration store 1328) and network controller 1324.
  • network controller manager 1325 monitors the cluster for new objects (e.g., pods and services).
  • Network controller manager 1325 may isolate pods in virtual networks and connect pods with services.
  • Network controller manager 1325 may be executed as a container of the master node for a cluster. In some cases, using network controller manager 1325 enables disabling the service proxies of minion nodes (e.g., the Kubernetes kube-proxy) such that all pod connectivity is implemented using virtual routers, as described herein.
  • Network controller manager 1325 may use the controller framework for the orchestration platform to listen for (or otherwise monitor for) changes in objects that are defined in the API and to add annotations to some of these objects. The annotations may be labels or other identifiers specifying properties of the objects (e.g., “Virtual Network Green”).
  • Network controller manager 1325 may create a network solution for the application using an interface to network controller 1324 to define network objects such as virtual networks, virtual network interfaces, and access control policies.
  • Network controller 1324 may implement the network solution in the computing infrastructure by, e.g., configuring the one or more virtual networks and virtual network interfaces in the virtual routers.
  • vCSRs 20 may support IPv6 in underlay along with SR-MPLS over IPv6 Tunnels on the vRouters 206.
  • the cRPD 324 control plane traffic (e.g., OSPF, IS-IS, etc.) may be routed using the IPv6 underlay support provided by vRouters 206.
  • the overlay traffic coming from the user Pods may be routed by vRouters 206 over SR-MPLSoIPv6 or other tunnels.
  • the overlay traffic may be identified using the L3VPN Service Label (programmed by cRPDs 24).
  • the SR-MPLS tunnel may be represented using a “label stack” programmed by cRPDs 24.
  • virtual network interfaces for Pods 422 to vRouter 206A may be virtio-host interfaces for separate VRFs configured in vRouter 206A.
  • vRouter agent 314 and vRouter 206A may be configured with multiple interfaces for communicating with each other: a pkt0 interface and a Unix domain socket (e.g., Sandesh).
  • vhost0 382A is described elsewhere in this disclosure.
  • the cRPD 324 control plane traffic path via IPv6 underlay may in some cases be as follows:
  • vhost0 interface 382A of vRouter 206A will host the IPv6 address used by cRPD 324 to send and receive control plane traffic for, e.g., BGP, IS-IS, OSPF, or other routing and control protocols.
  • cRPD 324 may attempt to resolve the next-hop MAC address via an IPv6 neighbor solicitation request.
  • vRouter 206A will transparently send these IPv6 ND requests through the physical/fabric interface attached to it (e.g., to one of IPs 322). Similarly, on receiving a response to the solicitation request, vRouter 206A may send the response packet to the cRPD 324 as well as vRouter Agent 314.
  • the actual unicast control plane packets may be routed by vRouter 206A to the physical/fabric interface and vice versa.
  • the routing would happen based on the routes programmed by cRPD 324 and vRouter Agent 314.
  • Control plane multicast packets sent by cRPD 324 over vhost0 interface 382A may be forwarded by vRouter 206A over the physical interface.
  • any multicast packets coming over the physical interface may be sent to cRPD 324 using the vhost0 interface 382A.
  • the overlay data path traffic is the traffic sent and received by the user pods 422 created in server 600 (the compute node). This can be either IPv4 or IPv6 traffic.
  • FIG. 12 illustrates an example configuration of servers 1200, 1220 and forwarding of a packet from Pod 422A on server 1200 to Pod 422M on server 1220.
  • Cloud native router 1202 includes instances of cRPD 324 for control plane and vRouter 206A for data plane.
  • interface 1204A has IP address 10.1.1.1 and Label L1.
  • interface 1204B has IP address 20.1.1.1 and Label L2.
  • Pods 422M, 422N have similar such interfaces, and the CNR is not shown on server 1220.
  • cRPD 324 and router 1210 along with in-between SR-capable routers 1206, 1208 are configured with IS-IS/OSPF with SR capabilities to exchange the SR segment identifiers (SIDs) in terms of labels. This results in cRPD 324 and router 1210 knowing the SIDs in the network and what such SIDs correspond to.
  • cRPD 324 and router 1210 are also configured with BGP with inet and inet6 VPN, which is used to exchange the overlay L3VPN routes for the pod (“virtual”) networks. As a result, the service labels for the overlay routes are exchanged between cRPD 324 and router 1210.
  • cRPD 324 now programs the overlay routes, service label, and the underlay SR-MPLS nexthop information to the vRouter 206A via the vRouter Agent (not shown in FIG.
  • the mechanics of choosing an SR path are taken care of by the cRPD 324 and optionally an SDN controller / path computation engine.
  • the overlay route is programmed in the pod VRF and is associated with a service label and an SR-MPLS nexthop.
  • the vRouter 206A SR-MPLS nexthop consists of a list of SR-labels to push along with L3 (IPv6), L2 (Ethernet) header information and the outgoing interface all of which may be used to encapsulate the packet and send the packet out as packet 1230.
  • vRouter uses the SR-MPLS nexthop to encapsulate the packet and send it out.
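  • The label portion of the SR-MPLS encapsulation described above can be sketched as follows. The sketch encodes a list of SR labels followed by the service label as 32-bit MPLS label stack entries (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL, per RFC 3032). The function name, the label values, and the default TTL of 64 are assumptions for illustration; the IPv6 and Ethernet headers that the nexthop also supplies are omitted.

```python
import struct


def encode_label_stack(sr_labels, service_label, ttl=64):
    """Encode SR labels plus the service label as MPLS label stack entries.

    The service label is placed last and carries the bottom-of-stack bit.
    Traffic class bits are left at zero in this sketch.
    """
    labels = list(sr_labels) + [service_label]
    encoded = b""
    for index, label in enumerate(labels):
        if not 0 <= label < 2 ** 20:
            raise ValueError(f"label {label} does not fit in 20 bits")
        bottom_of_stack = 1 if index == len(labels) - 1 else 0
        entry = (label << 12) | (bottom_of_stack << 8) | (ttl & 0xFF)
        encoded += struct.pack("!I", entry)
    return encoded


# Hypothetical values: one SR label (16001) and a service label (100).
stack = encode_label_stack([16001], 100)
print(stack.hex())
```

  • Placing the service label at the bottom of the stack lets transit routers act only on the outer SR labels while the egress node uses the inner label to select the pod VRF or interface.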
  • PHP penultimate hop popping
  • FIG. 13 illustrates example configuration of servers 1200, 1220 and forwarding of a packet from Pod 422M to Pod 422A.
  • vRouter Agent 314 (not shown in FIG. 13) would install an L3 Receive NH for the vhost0 IPv6 address. This would be done at the time of vRouter Agent 314 initialization upon reading the agent.conf file, which will contain the vhost0 IP address.
  • Routing process would happen in both cRPD 324 and router 1210 as given in the first two steps of Ingress processing.
  • this results in vRouter 206A being able to receive the incoming traffic destined to the vhost0 IP address and do further processing on it.
  • for the incoming traffic destined to the vhost0 IP address, vRouter 206A will check if the packet is an SR-MPLS packet and, if so, it will pop the outer NULL/vCSR SID label (in the case without PHP).
  • the vRouter 206A will pop the service label in the packet.
  • the service label will point to the Pod VMI next hop.
  • vRouter 206A will then forward the packet to the Pod using the Pod VMI’s nexthop after doing the necessary L2 rewrite.
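  • The receive-side label handling described above can be sketched as follows. The sketch parses MPLS label stack entries (RFC 3032 layout), pops an outer explicit-NULL or local-SID label when one is present (the case without PHP), and maps the remaining service label to a pod interface. The `deliver` function, the `local_sid` value, and the `service_label_to_vmi` table are hypothetical names for illustration; the L2 rewrite is not modeled.

```python
import struct

# Reserved MPLS labels: 0 is IPv4 explicit NULL, 2 is IPv6 explicit NULL.
EXPLICIT_NULL_LABELS = (0, 2)


def decode_label_stack(data):
    """Return (labels, payload), reading entries until the bottom-of-stack bit."""
    labels = []
    offset = 0
    while True:
        (entry,) = struct.unpack_from("!I", data, offset)
        offset += 4
        labels.append(entry >> 12)
        if (entry >> 8) & 1:
            break
    return labels, data[offset:]


def deliver(packet, local_sid, service_label_to_vmi):
    """Pop the outer label if present, then map the service label to a pod VMI."""
    labels, payload = decode_label_stack(packet)
    if len(labels) > 1 and (labels[0] in EXPLICIT_NULL_LABELS or labels[0] == local_sid):
        labels = labels[1:]  # without PHP, the outer label arrives and is popped here
    if len(labels) != 1:
        raise ValueError(f"unexpected label stack: {labels}")
    return service_label_to_vmi[labels[0]], payload


vmi_table = {100: "vmi-pod-422A"}  # hypothetical service label to interface mapping
without_php = bytes.fromhex("03e8104000064140") + b"payload"
with_php = bytes.fromhex("00064140") + b"payload"
print(deliver(without_php, 16001, vmi_table))
print(deliver(with_php, 16001, vmi_table))
```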
  • FIG. 14 is a conceptual diagram illustrating example operations 1400 for programming vRouter forwarding information, according to techniques of this disclosure.
  • FIG. 15 is a conceptual diagram illustrating example operations 1500 for configuring and advertising a virtual network interface in a server having a cloud native router, according to techniques of this disclosure. Operations 1500 may be similar in some respects to those described with respect to FIG. 7.
  • DUs 22 containers may receive 5G radio traffic from Port0, which is using single root I/O virtualization (SR-IOV) to create multiple virtual functions (VFs) or instances for the physical function (port), with each VF terminating in its own Pod (one of DUs 22). These VFs are visible to the Linux kernel 380; however, no routing protocols run over them. Their sole purpose is to haul the radio traffic into DUs 22. DUs 22 process the radio traffic and would like to send this processed traffic over a tunnel (SR-MPLS) to the CU 5G functionality running in a data center, as described with respect to FIG. 1.
  • cRPD 324 may be configured with the requisite protocols (IGPs, BGP etc.).
  • DPDK vRouter 206A would manage the physical Port1, over which routing traffic would be sent and received.
  • cRPD 324 may be configured with the requisite protocols through Netconf, via a domain controller. cRPD 324 will establish adjacencies for various protocols; learn and advertise the routing information (including reachability to application containers) using its routing protocols. cRPD 324 needs to program this learnt routing information to the vRouter agent 314. vRouter 206A will provide a bidirectional gRPC channel 340 for to-and-fro communication with cRPD 324. The data objects (routes, VRFs, interfaces etc.) may be modelled in protocol buffers.
  • control traffic would come over a different physical port than port 0, e.g., port1.
  • vRouter 206A will detect that this is control traffic (non-tunneled traffic) and forward it over vhost0 interface 382A.
  • all traffic would appear to come from vhost0 interface 382A.
  • all cRPD routes will refer to vhost0 interface 382A.
  • cRPD 324 will install these routes both to the vRouter agent 314 and to the kernel 380 (in some cases, this may involve selectively installing, using RIB/instance policy, the underlay routes in inet.0 to the kernel).
  • vRouter agent 314 may translate routes pointing to vhost0 to port1 automatically, as illustrated in FIG. 14.
  • cRPD 324 installs the routes to the kernel 380 because cRPD 324 might need the reachability to establish additional protocol adjacencies/sessions, e.g., BGP multihop sessions over reachability provided by IGPs.
  • Control traffic sent by cRPD 324 to vRouter 206A over vhost0 interface 382A must be sent out of Port1 without any other further operations.
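  • The translation of routes that point at the vhost0 interface into routes that point at the physical port can be sketched as follows. This is only an illustration of the idea: the `Route` type, its field names, and the interface name strings are assumptions, and no actual agent API is implied.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    interface: str


def translate_for_vrouter(routes, control_if="vhost0", physical_if="port1"):
    """Rewrite routes that reference the control interface to the physical port.

    Routes on any other interface are passed through unchanged. The input list
    is not modified, so the same routes can still be installed in the kernel
    with their original vhost0 interface.
    """
    return [
        replace(route, interface=physical_if) if route.interface == control_if else route
        for route in routes
    ]


learned = [
    Route("2001:db8:1::/64", "fe80::1", "vhost0"),
    Route("10.1.1.0/30", "10.1.1.2", "vif-pod-a"),
]
programmed = translate_for_vrouter(learned)
print(programmed)
```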
  • cRPD 324 may communicate with vRouter agent 314 in one of the following ways:
  • cRPD 324 will continue to emit netlink messages.
  • An external (to cRPD 324) translator will convert these into respective gRPC messages. There may be some additional latency introduced by the introduction of this translator. This translator may be an in-place stateless entity.
  • cRPD 324 directly starts using these gRPC APIs through Kernel Routing Table multichannel or some version of FDM. In some cases, cRPD 324 may directly start using these gRPC APIs through another kernel routing table on top of the existing channels to program the Linux kernel and the one used to program SONIC.
  • a cRPD-based CNI 312 will create the veth pairs for each of the application containers on being notified by Kubernetes/orchestration agent 310. It is the responsibility of CNI 312 to assign IP addresses to these interfaces. One end of the veth pair would terminate in the Application Container’s interface. As for the other end, CNI 312 would request the vRouter 206A to start monitoring this end of the veth interfaces. This facilitates all tunneled traffic from the physical ports headed for application containers to be forwarded by DPDK without having to involve kernel 380. Finally, CNI 312 would notify the Pod to start using the DPDK/memory-mapped interface.
  • because vRouter 206A now manages one end of these veth interfaces, these are not visible from kernel 280. Hence, these interfaces are not visible to cRPD 324 and thus cRPD 324 can’t announce reachability information to the outside world.
  • a veth equivalent interface is made visible to cRPD 324. This will not be an interface over which cRPD 324 could run routing protocols (as that requires using kernel facilities as sockets, TCP/IP stack etc.). This interface is there to notify cRPD 324 of reachability it needs to advertise.
  • vRouter 206A may directly inform cRPD 324 about this interface in some cases. This may be preferable because it is in some ways similar to how current VRFs are handled in cRPD 324. In addition, if this interface goes down, vRouter 206A can inform cRPD 324. If cRPD starts, vRouter 206A can let cRPD know of all the interfaces it is monitoring again.
  • With these interfaces, cRPD 324 can advertise MPLS reachability to reach the application containers. cRPD 324 can either advertise vrf-table-label or a per-nexthop label (where next-hop represents the veth equivalent) or per-prefix label. When this MPLS route is installed to vRouter 206A, vRouter agent 314 will have the ability to translate veth-equivalent to the actual veth interface.
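  • The difference between the three label advertisement options (a label per VRF table, per next hop, or per prefix) comes down to which key a label is bound to. A minimal sketch of that idea follows; the `LabelAllocator` class, the mode names, and the starting label value of 1000 are assumptions for illustration, not the routing daemon's actual behavior.

```python
import itertools


class LabelAllocator:
    """Allocates one MPLS service label per VRF, per next hop, or per prefix."""

    MODES = ("per-vrf", "per-nexthop", "per-prefix")

    def __init__(self, mode, first_label=1000):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self._counter = itertools.count(first_label)
        self._labels = {}

    def label_for(self, vrf, next_hop, prefix):
        """Return the label for a route, allocating a new one on first use of its key."""
        if self.mode == "per-vrf":
            key = (vrf,)
        elif self.mode == "per-nexthop":
            key = (vrf, next_hop)
        else:
            key = (vrf, prefix)
        if key not in self._labels:
            self._labels[key] = next(self._counter)
        return self._labels[key]


per_vrf = LabelAllocator("per-vrf")
per_prefix = LabelAllocator("per-prefix")
print(per_vrf.label_for("blue", "veth-a", "10.1.1.1/32"),
      per_vrf.label_for("blue", "veth-b", "10.1.1.5/32"))
print(per_prefix.label_for("blue", "veth-a", "10.1.1.1/32"),
      per_prefix.label_for("blue", "veth-b", "10.1.1.5/32"))
```

  • A per-VRF label keeps label usage low but requires a route lookup in the VRF on receipt, whereas per-nexthop and per-prefix labels can identify the outgoing veth-equivalent interface directly at the cost of more labels.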
  • Domain controller configures (IGP and BGP) protocol configuration on cRPD via Netconf.
  • an operator can use CLI on cRPD to do this manually.
  • cRPD establishes IGP adjacencies and learns network reachability and Segment Routing information.
  • cRPD establishes BGP session over IGP learnt connectivity.
  • cRPD learns about workload interfaces from the vRouter. cRPD creates the subnet (say /30) and interface routes (/32) corresponding to this interface.
  • CNI configures the workload interface under specific vrfs on cRPD.
  • cRPD sends vrf-interface mapping to the vRouter.
  • cRPD advertises L3VPN routes for the vrf routes from step 7.
  • cRPD sends deletes for these routes (in vrf.inet(6).0 table), with tunnel next-hops to vRouter.
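  • The subnet and interface routes mentioned in the steps above (for example a /30 subnet route and a /32 interface route) can be derived from a workload interface address as in the following sketch, which uses Python's standard `ipaddress` module. The function name and the sample addresses are assumptions for illustration.

```python
import ipaddress


def workload_routes(interface_cidr):
    """Return (subnet_route, interface_route) for a workload interface address.

    For "10.1.1.1/30" the subnet route is the enclosing /30 network and the
    interface route is the host route (/32 for IPv4, /128 for IPv6).
    """
    iface = ipaddress.ip_interface(interface_cidr)
    subnet_route = iface.network
    interface_route = ipaddress.ip_network(f"{iface.ip}/{iface.max_prefixlen}")
    return str(subnet_route), str(interface_route)


print(workload_routes("10.1.1.1/30"))
print(workload_routes("2001:db8::1/126"))
```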
  • CNI 312 may also add a second, backup interface into the application Pod.
  • the backup interface may be configured on a different, backup data plane within the compute node than the active data plane on which the active interface is configured.
  • the active data plane may be a DPDK-based virtual router.
  • the backup data plane may be a kernel-based virtual router, similar to server 350 but with a kernel-based virtual router in addition to DPDK vRouter 206A.
  • DPDK enables building applications that can bypass the kernel for packet I/O.
  • an application can directly send/receive packets from the NIC and can achieve high performance by using polling. Bypassing the kernel for packet I/O results in better performance.
  • DPDK vRouter 206A will own/take over one or more of the (physical) network ports on the system. Kernel 280 will not be able to make use of these ports for normal network I/O as long as the vRouter 206A owns them.
  • in a Kubernetes (K8s) cluster, DPDK applications are run inside Pods. K8s takes care of orchestrating (lifecycle management) of these Pods. Since these applications in the Pod need network connectivity, K8s uses a component called CNI to set up network interfaces, IP address assignment, and routing.
  • CNI 312 will also add an additional (backup) interface into each application Pod, one that goes via a data plane different from the one that is currently not functional.
  • the application (or an enhanced DPDK library running as a part of the application process) will detect that the primary (DPDK) interface is down and switch to using the kernel (backup) interface.
  • when the DPDK vRouter 206A is not functional, either due to software issues or because it is undergoing maintenance, DPDK vRouter 206A physical ports may be released back to the kernel 380. This would allow the kernel 380 to start using these ports for forwarding the traffic until the DPDK vRouter 206A comes back and claims the ports.
  • CNI 312 and/or routing stack programs the same routes into DPDK vRouter 206A and kernel 380 forwarding table, although with different next-hop interfaces.
  • Routing stack could detect DPDK vRouter 206A being out of service (a TCP connection is used between routing stack and DPDK vRouter 206A) and update the next-hop information and bring up the core facing (physical) interface state accordingly.
  • routing stack could detect the availability and restore the routes and interface state such that application POD traffic starts going via the DPDK vRouter 206A.
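  • The failover behavior described above, in which the same routes are programmed with different next-hop interfaces per data plane and the active set follows the health of the DPDK virtual router, can be sketched as follows. The `RoutingStack` class, its method names, and the interface names are assumptions for illustration; connection monitoring is reduced to a boolean callback.

```python
class RoutingStack:
    """Keeps one route set with a next-hop interface for each data plane."""

    def __init__(self):
        # prefix -> (dpdk_interface, kernel_interface)
        self.routes = {}
        self.dpdk_alive = True

    def program(self, prefix, dpdk_interface, kernel_interface):
        """Record the same prefix for both data planes with different interfaces."""
        self.routes[prefix] = (dpdk_interface, kernel_interface)

    def on_vrouter_connection(self, alive):
        """Called when the connection to the DPDK virtual router goes down or returns."""
        self.dpdk_alive = alive

    def active_forwarding_table(self):
        """Return prefix -> interface for whichever data plane is currently usable."""
        index = 0 if self.dpdk_alive else 1
        return {prefix: pair[index] for prefix, pair in self.routes.items()}


stack_state = RoutingStack()
stack_state.program("10.1.1.1/32", "vif-dpdk-0", "veth-backup-0")
before = stack_state.active_forwarding_table()
stack_state.on_vrouter_connection(False)   # DPDK vRouter out of service
during = stack_state.active_forwarding_table()
stack_state.on_vrouter_connection(True)    # DPDK vRouter back in service
after = stack_state.active_forwarding_table()
print(before, during, after)
```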
  • a set of software components provides CNI functionality that addresses networking requirements unique to cloud native 5G network environments.
  • the software components include a containerized routing protocol daemon (cRPD) to support a Network Service Mesh (NSM) architecture.
  • the set of software components support NSM architecture and may provide additional capabilities such as hybrid networking (between physical and virtual infrastructure), direct reachability to a Pod from outside a cluster of compute nodes to, e.g., advertise over protocols such as BGP, set up tunnels dynamically using various technologies such as MPLS, SRv6, IP-IP/VxLAN/GRE, IPsec, etc.
  • a 5G O-RAN network may be deployed using cloud native technologies and follow the 5G split in which the DU (Distributed Unit) and CSR (Cell Site Router) are virtualized and run on a compute node.
  • the set of software components may operate as a cell-site router to provide L3 reachability for the mid-haul for the 5G network.
  • the software components use cRPD to distribute Layer 3 (L3) network reachability information of the Pods not just within the cluster, but also outside the cluster.
  • the cRPD also programs the data plane on each compute node.
  • the DU application may run in the application Pod to bypass the kernel networking stack and abstractions, and thereby use, e.g., zero-copy mechanisms to directly send/receive packets from the physical NIC.
  • Data Plane Development Kit (DPDK) is one such framework, and a DPDK-based virtual router may be used as a userspace data plane that leverages DPDK for high forwarding performance for this purpose.
  • the software components may include a DPDK-based virtual router to support DPDK applications.
  • a CNI plugin manages the DPDK configuration for application and programs the virtual router. This may include setting up a vhost control channel and assigning IP (e.g., both IPv4 and IPv6) and MAC addresses, advertising the Pod IP addresses, and detecting and withdrawing the routes when the Pod is considered down or removed.
  • Kubernetes is an orchestration platform for running containerized applications in a clustered computing environment. It provides automatic deployment, scaling, networking, and management of containerized applications.
  • a K8s pod consists of one or more containers representing an instance of an application and is the smallest unit that K8s can handle. All containers in the pod share the same network namespace.
  • Container Network Interface provides networking for application pods in Kubernetes. It takes care of setting up pod interfaces, address assignment and networking between pods in a k8s cluster and network isolation between different workloads.
  • CNI 312 may provide CNI functionality along with capabilities useful for supporting Network Service Mesh (NSM) architecture.
  • a CNI that supports NSM architecture provides additional capabilities such as hybrid networking (between physical and virtual infrastructure), direct reachability to a Pod from outside the cluster (e.g., advertised over protocols such as BGP), and dynamic setup of tunnels using various technologies such as MPLS, SRv6, IP-IP/VxLAN/GRE, IPsec, etc.
  • a 5G O-RAN network may be deployed using cloud native technologies and follows the 5G 7.2 split where the DU (Distributed Unit) and CSR (Cell Site Router) are virtualized and run on a server.
  • CNI 312 acts as a cell-site router to provide L3 reachability for the mid-haul.
  • cRPD 324 distributes Layer-3 network reachability information of the Pods not just within a Kubernetes cluster (in Kubernetes deployments), but also outside the cluster.
  • cRPD 324 also takes care of programming the forwarding-plane on each compute node/server.
  • a DU application which runs in the application Pod bypasses the kernel networking stack and abstractions, and uses (zero-copy) mechanisms to directly send/receive packets from the physical NIC.
  • Data Plane Development Kit (DPDK) is one such framework.
  • DPDK vRouter 206A is a user space data-plane that leverages DPDK for high forwarding performance.
  • vRouter 206A supports DPDK applications.
  • CNI 312 will take care of setting up DPDK configuration for applications and programming vRouter 206A. This includes setting up the vhost control channel, assigning IP (both IPv4 and IPv6) and MAC addresses, advertising the Pod IP addresses, and detecting and withdrawing the routes when the Pod is considered down or removed.
  • the set of components that make up a CNI 312 and the cloud-native router may be considered a Kubernetes CNI, referred to herein as the Platter CNI.
  • the CNI 312 and the cloud-native router provide the following features:
  • Network namespaces: Application pods should be reachable via non-default network namespace or routing instance implemented using L3VPNs.
  • IPv6 Underlay: Support IPv6 underlay as required by the use-case. IGP protocols should be capable of exchanging IPv6 routes. BGP protocol sessions should be set up using IPv6 addresses.
  • IPv6 Overlay: Support IPv6 overlays by assigning IPv6 addresses to the pod and advertising them over BGP.
  • BGP: Platter runs on each node in the K8s cluster and uses BGP to advertise pod reachability to the network. Routes advertised over BGP may carry SRv6 label stack or other tunnel encapsulation attributes.
  • IGP: Each node will participate in IGP underlay to learn reachability to other BGP peers and route reflectors. IS-IS may be used to advertise host/node addresses to the network.
  • SRv6: Pod traffic may be carried over SRv6 tunnels. IS-IS is used to learn segment routing SID information.
  • vrouter-dpdk: For better packet I/O performance, support vrouter-dpdk as the data-plane. This includes allocation of IP and MAC addresses, generating suitable DPDK configuration for the application, programming of vrouter, and advertising the routes.
  • a YAML file contains various details about all the containers that are part of the CNI: repositories the images are hosted on, order of initialization, environment variables, configuration, and license key information.
  • the YAML file has to be customized to suit the K8s deployment.
  • a sample YAML configuration (platter.yml) for platter CNI is provided below:
  • CNI 312, cRPD 324, and vRouter 206A set up the network interface, assign the IP address, and set up routing; there is no direct interaction with the applications that are part of the application Pod.
  • a Unix Domain Socket (UDS) (called vhost-user adaptor) may be used between the application running in the Pod and the vrouter 206A as the control channel which is used to negotiate the data channel (virtio interface in the Pod and vhost interface on vRouter 206A) to transfer packets.
  • a config file is generated, which should be volume mounted into the application pod at a suitable location accessible/known to the applications. For example:
  • the application pod volume will be mounted and the configuration file created as specified by the following parameters in the configmap section of the YAML file:
  • dpdkConfigBaseDir: "/var/run/cni/platter" # Path on the host mapped into the pod
  • dpdkConfigFileName: "dpdk-config.json"
  • the DPDK application may know the location and name of the config file. The application in the pod should be able to access the pod-id as an environment variable. The system will set the permissions on the path such that the contents of the directory are accessible only when the pod-id is known.
  • application pod YAML configuration should include additional details such as environment variables and annotations.
  • DPDK application configuration may be stored in a mounted volume and the path.
  • path will have pod UID inserted and the DPDK application should be aware of the UID.
  • Pod YAML should export the Pod UID as KUBERNETES_POD_UID which may be needed by DPDK application.
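  • How a DPDK application might locate its configuration file from the exported Pod UID can be sketched as follows. The directory layout `<base dir>/<pod UID>/<file name>` is an assumption made for this example (the text states only that the pod UID is inserted into the path), and the function name is hypothetical. The base directory and file name are the sample configmap values.

```python
import os
from pathlib import PurePosixPath


def dpdk_config_path(base_dir, file_name, environ=os.environ):
    """Build the config file path from the pod UID exported into the environment.

    Assumed layout: <base_dir>/<pod UID>/<file_name>. Raises RuntimeError when
    KUBERNETES_POD_UID is not set, since the path cannot be derived without it.
    """
    pod_uid = environ.get("KUBERNETES_POD_UID")
    if not pod_uid:
        raise RuntimeError("KUBERNETES_POD_UID is not set in the environment")
    return PurePosixPath(base_dir) / pod_uid / file_name


example_env = {"KUBERNETES_POD_UID": "hypothetical-pod-uid-1234"}
path = dpdk_config_path("/var/run/cni/platter", "dpdk-config.json", example_env)
print(path)
```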
  • Annotations may be used to set the following optional configuration details needed by Platter:
  • initial versions of Platter will use a statically defined pod network configuration loaded using config map files. This config map is read during Platter CNI installation and stored on each node as a file. This config file holds details on a per-application, per-interface basis and includes details such as IP addresses and routing-instance details.
  • when Platter CNI is invoked to set up a pod interface, it uses the pod name and interface name as the key to find the interface configuration details required to bring up the interface.
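The keyed lookup described above may be sketched as follows. This is an illustrative Python sketch only: the entry fields (`pod`, `interface`, `ip_address`, `routing_instance`), the class `InterfaceConfig`, and both function names are hypothetical, since the text does not give the schema of the stored config file. The sample addresses come from the 192.0.2.0/24 documentation range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceConfig:
    # Hypothetical fields; the text names IP addresses and
    # routing-instance details as examples of stored data.
    ip_address: str
    routing_instance: str


def build_index(entries):
    """Index static entries by the (pod name, interface name) key."""
    index = {}
    for entry in entries:
        key = (entry["pod"], entry["interface"])
        if key in index:
            raise ValueError(f"duplicate entry for pod/interface {key}")
        index[key] = InterfaceConfig(entry["ip_address"],
                                     entry["routing_instance"])
    return index


def find_interface_config(index, pod_name, interface_name):
    """Return the stored configuration, or raise LookupError if absent."""
    try:
        return index[(pod_name, interface_name)]
    except KeyError:
        raise LookupError(
            f"no static configuration for {pod_name}/{interface_name}"
        ) from None


# Hypothetical entries as they might be read from the per-node file.
SAMPLE_ENTRIES = [
    {"pod": "du-pod", "interface": "net1",
     "ip_address": "192.0.2.10/24", "routing_instance": "vrf-red"},
    {"pod": "du-pod", "interface": "net2",
     "ip_address": "192.0.2.11/24", "routing_instance": "vrf-blue"},
]
```

Using the pair as a single dictionary key keeps the per-application, per-interface granularity explicit and lets a missing entry surface as an error at interface setup time.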
  • a computing device may execute one or more of such modules with multiple processors or multiple devices.
  • a computing device may execute one or more of such modules as a virtual machine executing on underlying hardware.
  • One or more of such modules may execute as one or more services of an operating system or computing platform.
  • One or more of such modules may execute as one or more executable programs at an application layer of a computing platform.
  • functionality provided by a module could be implemented by a dedicated hardware device.
  • certain modules, data stores, components, programs, executables, data items, functional units, and/or other items included within one or more storage devices may be illustrated separately, one or more of such items could be combined and operate as a single module, component, program, executable, data item, or functional unit.
  • one or more modules or data stores may be combined or partially combined so that they operate or provide functionality as a single module.
  • one or more modules may operate in conjunction with one another so that, for example, one module acts as a service or an extension of another module.
  • each module, data store, component, program, executable, data item, functional unit, or other item illustrated within a storage device may include multiple components, sub-components, modules, sub-modules, data stores, and/or other components or modules or data stores not illustrated.
  • each module, data store, component, program, executable, data item, functional unit, or other item illustrated within a storage device may be implemented in various ways.
  • each module, data store, component, program, executable, data item, functional unit, or other item illustrated within a storage device may be implemented as part of an operating system executed on a computing device.
  • this disclosure may be directed to an apparatus such as a processor or an integrated circuit device, such as an integrated circuit chip or chipset.
  • the techniques may be realized at least in part by a computer-readable data storage medium comprising instructions that, when executed, cause a processor to perform one or more of the methods described above.
  • the computer-readable data storage medium may store such instructions for execution by a processor.
  • a computer-readable medium may form part of a computer program product, which may include packaging materials.
  • a computer-readable medium may comprise a computer data storage medium such as random access memory (RAM), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), Flash memory, magnetic or optical data storage media, and the like. In some examples, an article of manufacture may comprise one or more computer-readable storage media.
  • the computer-readable storage media may comprise non-transitory media.
  • the term “non-transitory” may indicate that the storage medium is not embodied in a carrier wave or a propagated signal.
  • a non-transitory storage medium may store data that can, over time, change (e.g., in RAM or cache).
  • the code or instructions may be software and/or firmware executed by processing circuitry including one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
  • processors may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein.
  • functionality described in this disclosure may be provided within software modules or hardware modules.
  • Example 1 A system comprising: a container workload; a containerized routing protocol daemon comprising processing circuitry and configured to receive routing information from an external network controller; a kernel network stack comprising processing circuitry and configured to route packets for the container workload based on first routing information; a DPDK-based virtual router comprising processing circuitry and configured to route packets for the container workload based on second routing information; and a container networking interface plugin comprising processing circuitry and configured to configure a first virtual network interface for the workload to interface with the DPDK-based virtual router and a second virtual network interface for the workload to interface with the kernel network stack.
  • Example 2 The system of Example 1, further comprising: a virtual router agent for the virtual router, the virtual router agent comprising processing circuitry and configured to receive the second routing information from the containerized routing protocol daemon.
  • Example 3 The system of Example 1, wherein the second routing information comprises routing information for an overlay network of the computing infrastructure.
  • Example 4 The system of Example 1, wherein the system operates as a virtualized cell site router for a mobile network.
  • Example 5 The system of Example 1, wherein the workloads are distributed units (DUs) for a 5G mobile network.
  • Example 6 The system of Example 1, wherein the system is a single compute node.
  • Example 7 The system of Example 1, wherein the container networking interface plugin is configured to receive virtual network interface information for the virtual router from a Kubernetes infrastructure.
  • Example 8 The system of Example 1, wherein the system interfaces with a Kubernetes infrastructure as a container networking interface.
  • Example 9 The system of Example 1, wherein the routing information comprises segment routing information.
  • Example 10 The system of Example 1, wherein the virtual router agent is configured to interface with multiple different types of control planes.
  • Example 11 A computing device comprising: a container networking interface plugin comprising processing circuitry; an orchestration agent comprising processing circuitry, wherein the orchestration agent is an agent of an orchestrator for a computing infrastructure that includes the computing device; a kernel network stack comprising processing circuitry; a virtual router comprising a virtual router data plane and a virtual router agent, the virtual router comprising processing circuitry; and a logically-related group of one or more containers, the computing device configured to operate to implement a backup network interface for the one or more containers.
EP22712227.2A 2021-03-01 2022-02-28 Containerized router with virtual networking Pending EP4302469A1 (de)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP23194733.4A EP4307632A3 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking
EP23194723.5A EP4307639A1 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
IN202141008548 2021-03-01
US202163242434P 2021-09-09 2021-09-09
US17/649,643 US11818647B2 (en) 2021-03-01 2022-02-01 Containerized router with a generic data plane interface
US17/649,632 US20220279420A1 (en) 2021-03-01 2022-02-01 Containerized router with virtual networking
US17/649,640 US11812362B2 (en) 2021-03-01 2022-02-01 Containerized router with a disjoint data plane
PCT/US2022/070865 WO2022187796A1 (en) 2021-03-01 2022-02-28 Containerized router with virtual networking

Related Child Applications (4)

Application Number Priority Date Filing Date Title
EP23194733.4A Division EP4307632A3 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking
EP23194733.4A Division-Into EP4307632A3 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking
EP23194723.5A Division-Into EP4307639A1 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking
EP23194723.5A Division EP4307639A1 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking

Publications (1)

Publication Number Publication Date
EP4302469A1 (de) 2024-01-10

Family

ID=80930204

Family Applications (3)

Application Number Priority Date Filing Date Title
EP23194733.4A Pending EP4307632A3 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking
EP22712227.2A Pending EP4302469A1 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking
EP23194723.5A Pending EP4307639A1 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking

Family Applications Before (1)

Application Number Priority Date Filing Date Title
EP23194733.4A Pending EP4307632A3 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking

Family Applications After (1)

Application Number Priority Date Filing Date Title
EP23194723.5A Pending EP4307639A1 (de) 2021-03-01 2022-02-28 Containerized router with virtual networking

Country Status (2)

Country Link
EP (3) EP4307632A3 (de)
WO (1) WO2022187796A1 (de)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI826194B (zh) * 2022-12-20 2023-12-11 明泰科技股份有限公司 User plane function (UPF) packet processing method and computing device compatible with a cloud-native virtual network layer
US11729290B1 (en) * 2022-12-27 2023-08-15 Lenovo Global Technology (United States) Inc. Intelligent multicast proxy between container and outside network
CN116132435B (zh) * 2023-02-17 2023-09-01 成都道客数字科技有限公司 Dual-stack cross-node communication method and system for a container cloud platform

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10587434B2 (en) * 2017-07-14 2020-03-10 Nicira, Inc. In-band management interface with user space datapath
US10855531B2 (en) * 2018-08-30 2020-12-01 Juniper Networks, Inc. Multiple networks for virtual execution elements
US10728145B2 (en) * 2018-08-30 2020-07-28 Juniper Networks, Inc. Multiple virtual network interface support for virtual execution elements
US10708082B1 (en) * 2018-08-31 2020-07-07 Juniper Networks, Inc. Unified control plane for nested clusters in a virtualized computing infrastructure
US10785122B2 (en) * 2018-10-05 2020-09-22 Cisco Technology, Inc. Canary release validation mechanisms for a containerized application or service mesh

Also Published As

Publication number Publication date
EP4307632A2 (de) 2024-01-17
EP4307632A3 (de) 2024-01-24
WO2022187796A1 (en) 2022-09-09
EP4307639A1 (de) 2024-01-17

Similar Documents

Publication Publication Date Title
US11818647B2 (en) Containerized router with a generic data plane interface
US10708082B1 (en) Unified control plane for nested clusters in a virtualized computing infrastructure
US11171830B2 (en) Multiple networks for virtual execution elements
US10728145B2 (en) Multiple virtual network interface support for virtual execution elements
US20230123775A1 (en) Cloud native software-defined network architecture
US11743182B2 (en) Container networking interface for multiple types of interfaces
US20220334864A1 (en) Plurality of smart network interface cards on a single compute node
US20230079209A1 (en) Containerized routing protocol process for virtual private networks
EP4307632A2 (de) Containerized router with virtual networking
EP4160409A1 (de) Cloud native software-defined network architecture for multiple clusters
EP4161003A1 (de) EVPN host routed bridging (HRB) and EVPN cloud native data center
CN116888940A (zh) Containerized router with virtual networking
US20240031908A1 (en) Containerized router with a disjoint data plane
EP4293978A1 (de) Hybrid data plane for a containerized router
US11895020B1 (en) Virtualized cell site routers with layer 2 forwarding
EP4329254A1 (de) Intent-driven configuration of a cloud-native router
EP4160410A1 (de) Cloud native software-defined network architecture
EP4336790A1 (de) Network segmentation for container orchestration platforms
EP4075757A1 (de) Plurality of smart network interface cards on a single compute node
CN117255019A (zh) System, method, and storage medium for virtualized computing infrastructure
US20230106531A1 (en) Virtual network routers for cloud native software-defined network architectures
US20240095158A1 (en) Deployment checks for a containerized sdn architecture system
CN117640389A (zh) Intent-driven configuration of a cloud-native router
CN117687773A (zh) Network segmentation for container orchestration platforms

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20230825

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR