US20230342168A1 - Declarative vm management for a container orchestrator in a virtualized computing system - Google Patents


Info

Publication number
US20230342168A1
Authority
US
United States
Prior art keywords
vms
host
control plane
virtualization layer
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US18/334,592
Inventor
Derek William Beard
Jared Sean ROSOFF
Mark Russell JOHNSON
Brian Charles FORNEY
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
VMware LLC
Original Assignee
VMware LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by VMware LLC
Priority to US18/334,592
Publication of US20230342168A1
Assigned to VMware LLC (change of name; see document for details). Assignors: VMWARE, INC.

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45541 Bare-metal, i.e. hypervisor runs directly on hardware
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45562 Creating, deleting, cloning virtual machine instances

Definitions

  • VMs virtual machines
  • a container orchestrator known as Kubernetes®
  • Kubernetes provides a platform for automating deployment, scaling, and operations of application containers across clusters of hosts. It offers flexibility in application development and offers several useful tools for scaling.
  • containers are grouped into logical units called “pods” that execute on nodes in a cluster (also referred to as a “node cluster”).
  • Containers in the same pod share the same resources and network and maintain a degree of isolation from containers in other pods.
  • the pods are distributed across nodes of the cluster.
  • a node includes an operating system (OS), such as Linux®, and a container engine executing on top of the OS that supports the containers of the pod.
  • OS operating system
  • a node can be a physical server or a VM.
  • Computer virtualization is a technique that involves encapsulating a physical computing machine platform into virtual machine(s) executing under control of virtualization software on a hardware computing platform or “host.”
  • a virtual machine (VM) provides virtual hardware abstractions for processor, memory, storage, and the like to a guest operating system.
  • the virtualization software, also referred to as a “hypervisor,” includes one or more virtual machine monitors (VMMs) to provide execution environment(s) for the virtual machine(s).
  • VMMs virtual machine monitors
  • VMs allow for greater operating system diversity, isolation, and customization than do containers. Users have made considerable investments in making their applications run well on VMs and leveraging differentiating technologies of the underlying virtualized computing system. It is thus desirable to bring VMs into container orchestration systems like Kubernetes to allow a single management and deployment paradigm.
  • a virtualized computing system includes: a host cluster having a virtualization layer executing on hardware platforms of hosts, the virtualization layer supporting execution of virtual machines (VMs), the VMs including pod VMs and native VMs, the pod VMs including container engines supporting execution of containers in the pod VMs, the native VMs including applications executing on guest operating systems; and an orchestration control plane integrated with the virtualization layer, the orchestration control plane including a master server having a pod VM controller to manage lifecycles of the pod VMs and a native VM controller to manage lifecycles of the native VMs.
  • FIG. 1 is a block diagram of a virtualized computing system in which embodiments described herein may be implemented.
  • FIG. 2 is a block diagram depicting a software platform according to an embodiment.
  • FIG. 3 is a block diagram of a supervisor Kubernetes master according to an embodiment.
  • FIG. 4 is a block diagram depicting a logical view of a guest cluster executing in a virtualized computing system according to an embodiment.
  • FIG. 5 is a flow diagram depicting a method of application orchestration in a virtualized computing system according to an embodiment.
  • FIG. 6 is a flow diagram depicting a method of application orchestration in a virtualized computing system according to another embodiment.
  • FIG. 7 is a flow diagram depicting a method of application orchestration in a virtualized computing system according to an embodiment.
  • a virtualized computing system includes a software-defined datacenter (SDDC) comprising a server virtualization platform integrated with a logical network platform.
  • SDDC software-defined datacenter
  • the server virtualization platform includes clusters of physical servers (“hosts”) referred to as “host clusters.”
  • Each host cluster includes a virtualization layer, executing on host hardware platforms of the hosts, which supports execution of virtual machines (VMs).
  • a virtualization management server manages host clusters, the virtualization layers, and the VMs executing thereon.
  • the virtualization layer of a host cluster is integrated with an orchestration control plane, such as a Kubernetes control plane.
  • This integration enables the host cluster as a “supervisor cluster” that uses VMs to implement both control plane nodes having a Kubernetes control plane, and compute nodes managed by the control plane nodes.
  • Kubernetes pods are implemented as “pod VMs,” each of which includes a kernel and container engine that supports execution of containers.
  • the Kubernetes control plane of the supervisor cluster is extended to support custom objects in addition to pods, such as VM objects that are implemented using native VMs (as opposed to pod VMs).
  • VI admin virtualization infrastructure administrator
  • A VI admin can enable a host cluster as a supervisor cluster and provide its functionality to development teams.
  • the orchestration control plane includes master server(s) with both pod VM controllers and native VM controllers.
  • the pod VM controllers manage the lifecycles of pod VMs.
  • the native VM controllers manage the lifecycles of native VMs executing in parallel to the pod VMs.
  • FIG. 1 is a block diagram of a virtualized computing system 100 in which embodiments described herein may be implemented.
  • System 100 includes a cluster of hosts 120 (“host cluster 118”) that may be constructed on server-grade hardware platforms such as x86 architecture platforms. For purposes of clarity, only one host cluster 118 is shown. However, virtualized computing system 100 can include many such host clusters 118.
  • a hardware platform 122 of each host 120 includes conventional components of a computing device, such as one or more central processing units (CPUs) 160 , system memory (e.g., random access memory (RAM) 162 ), one or more network interface controllers (NICs) 164 , and optionally local storage 163 .
  • CPUs central processing units
  • RAM random access memory
  • NICs network interface controllers
  • CPUs 160 are configured to execute instructions, for example, executable instructions that perform one or more operations described herein, which may be stored in RAM 162 .
  • NICs 164 enable host 120 to communicate with other devices through a physical network 180 .
  • Physical network 180 enables communication between hosts 120 and between other components and hosts 120 (other components discussed further herein).
  • Physical network 180 can include a plurality of VLANs to provide external network virtualization as described further herein.
  • hosts 120 access shared storage 170 by using NICs 164 to connect to network 180 .
  • each host 120 contains a host bus adapter (HBA) through which input/output operations (IOs) are sent to shared storage 170 over a separate network (e.g., a fibre channel (FC) network).
  • shared storage 170 includes one or more storage arrays, such as a storage area network (SAN), network attached storage (NAS), or the like. Shared storage 170 may comprise magnetic disks, solid-state disks, flash memory, and the like as well as combinations thereof.
  • hosts 120 include local storage 163 (e.g., hard disk drives, solid-state drives, etc.). Local storage 163 in each host 120 can be aggregated and provisioned as part of a virtual SAN, which is another form of shared storage 170 .
  • a software platform 124 of each host 120 provides a virtualization layer, referred to herein as a hypervisor 150 , which directly executes on hardware platform 122 .
  • hypervisor 150 is a Type-1 hypervisor (also known as a “bare-metal” hypervisor).
  • the virtualization layer in host cluster 118 (collectively hypervisors 150 ) is a bare-metal virtualization layer executing directly on host hardware platforms.
  • Hypervisor 150 abstracts processor, memory, storage, and network resources of hardware platform 122 to provide a virtual machine execution space within which multiple virtual machines (VMs) may be concurrently instantiated and executed.
  • One example of hypervisor 150 that may be configured and used in embodiments described herein is a VMware ESXi™ hypervisor provided as part of the VMware vSphere® solution made commercially available by VMware, Inc. of Palo Alto, CA.
  • host cluster 118 is enabled as a “supervisor cluster,” described further herein, and thus VMs executing on each host 120 include pod VMs 130 and native VMs 140 .
  • a pod VM 130 is a virtual machine that includes a kernel and container engine that supports execution of containers, as well as an agent (referred to as a pod VM agent) that cooperates with a controller of an orchestration control plane 115 executing in hypervisor 150 (referred to as a pod VM controller).
  • An example of pod VM 130 is described further below with respect to FIG. 2 .
  • VMs 130 / 140 support applications 141 deployed onto host cluster 118 , which can include containerized applications (e.g., executing in either pod VMs 130 or native VMs 140 ) and applications executing directly on guest operating systems (non-containerized) (e.g., executing in native VMs 140 ).
  • One specific application discussed further herein is a guest cluster executing as a virtual extension of a supervisor cluster.
  • Some VMs 130/140, shown as support VMs 145, have specific functions within host cluster 118.
  • support VMs 145 can provide control plane functions, edge transport functions, and the like.
  • An embodiment of software platform 124 is discussed further below with respect to FIG. 2 .
  • SD network layer 175 includes logical network services executing on virtualized infrastructure in host cluster 118 .
  • the virtualized infrastructure that supports the logical network services includes hypervisor-based components, such as resource pools, distributed switches, distributed switch port groups and uplinks, etc., as well as VM-based components, such as router control VMs, load balancer VMs, edge service VMs, etc.
  • Logical network services include logical switches, logical routers, logical firewalls, logical virtual private networks (VPNs), logical load balancers, and the like, implemented on top of the virtualized infrastructure.
  • virtualized computing system 100 includes edge transport nodes 178 that provide an interface of host cluster 118 to an external network (e.g., a corporate network, the public Internet, etc.).
  • Edge transport nodes 178 can include a gateway between the internal logical networking of host cluster 118 and the external network.
  • Edge transport nodes 178 can be physical servers or VMs.
  • edge transport nodes 178 can be implemented in support VMs 145 and include a gateway of SD network layer 175 .
  • Various clients 119 can access service(s) in virtualized computing system 100 through edge transport nodes 178 (including VM management client 106 and Kubernetes client 102, which are logically shown as being separate by way of example).
  • Virtualization management server 116 is a physical or virtual server that manages host cluster 118 and the virtualization layer therein.
  • Virtualization management server 116 installs agent(s) 152 in hypervisor 150 to add a host 120 as a managed entity.
  • Virtualization management server 116 logically groups hosts 120 into host cluster 118 to provide cluster-level functions to hosts 120 , such as VM migration between hosts 120 (e.g., for load balancing), distributed power management, dynamic VM placement according to affinity and anti-affinity rules, and high-availability.
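  • As an illustration of placement according to affinity and anti-affinity rules, the following is a minimal sketch and not the placement logic of any particular product. The `PlacementRequest` type, the `eligible_hosts` function, and the host and VM names are all hypothetical; the sketch assumes each rule is simply a set of VM names that must, or must not, already be running on the chosen host.

```python
from dataclasses import dataclass, field


@dataclass
class PlacementRequest:
    """Hypothetical description of a VM that needs a host."""
    name: str
    # VMs that must already share the host (affinity).
    affinity: set = field(default_factory=set)
    # VMs that must NOT share the host (anti-affinity).
    anti_affinity: set = field(default_factory=set)


def eligible_hosts(request, hosts):
    """Return the sorted host names that satisfy both rule sets.

    `hosts` maps a host name to the set of VM names running on it.
    """
    result = []
    for host, running in hosts.items():
        # Anti-affinity: reject hosts that run any excluded VM.
        if running & request.anti_affinity:
            continue
        # Affinity: every required companion must be on this host.
        if not request.affinity <= running:
            continue
        result.append(host)
    return sorted(result)


hosts = {
    "host-a": {"db-1"},
    "host-b": {"web-1"},
    "host-c": {"db-1", "web-1"},
}
replica = PlacementRequest("db-2", anti_affinity={"db-1"})
sidecar = PlacementRequest("cache-1", affinity={"web-1"}, anti_affinity={"db-1"})
print(eligible_hosts(replica, hosts))
print(eligible_hosts(sidecar, hosts))
```

  • In this sketch a second database replica is kept off every host that already runs `db-1`, while the cache VM is only allowed where `web-1` is present. A real scheduler would also weigh load and capacity, which are left out here.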
  • the number of hosts 120 in host cluster 118 may be one or many.
  • Virtualization management server 116 can manage more than one host cluster 118 .
  • virtualization management server 116 further enables host cluster 118 as a supervisor cluster 101 .
  • Virtualization management server 116 installs additional agents 152 in hypervisor 150 to add host 120 to supervisor cluster 101 .
  • Supervisor cluster 101 integrates an orchestration control plane 115 with host cluster 118.
  • orchestration control plane 115 includes software components that support a container orchestrator, such as Kubernetes, to deploy and manage applications on host cluster 118 .
  • a Kubernetes container orchestrator is described herein.
  • hosts 120 become nodes of a Kubernetes cluster and pod VMs 130 executing on hosts 120 implement Kubernetes pods.
  • Orchestration control plane 115 includes supervisor Kubernetes master 104 and agents 152 executing in virtualization layer (e.g., hypervisors 150 ).
  • Supervisor Kubernetes master 104 includes control plane components of Kubernetes, as well as custom controllers, custom plugins, scheduler extender, and the like that extend Kubernetes to interface with virtualization management server 116 and the virtualization layer.
  • supervisor Kubernetes master 104 is shown as a separate logical entity.
  • supervisor Kubernetes master 104 is implemented as one or more VM(s) 130 / 140 in host cluster 118 .
  • supervisor cluster 101 can include more than one supervisor Kubernetes master 104 in a logical cluster for redundancy and load balancing.
  • virtualized computing system 100 further includes a storage service 110 that implements a storage provider in virtualized computing system 100 for container orchestrators.
  • storage service 110 manages lifecycles of storage volumes (e.g., virtual disks) that back persistent volumes used by containerized applications executing in host cluster 118 .
  • a container orchestrator such as Kubernetes cooperates with storage service 110 to provide persistent storage for the deployed applications.
  • supervisor Kubernetes master 104 cooperates with storage service 110 to deploy and manage persistent storage in the supervisor cluster environment.
  • Other embodiments described below include a vanilla container orchestrator environment and a guest cluster environment.
  • Storage service 110 can execute in virtualization management server 116 as shown or operate independently from virtualization management server 116 (e.g., as an independent physical or virtual server).
  • virtualized computing system 100 further includes a network manager 112 .
  • Network manager 112 is a physical or virtual server that orchestrates SD network layer 175 .
  • network manager 112 comprises one or more virtual servers deployed as VMs.
  • Network manager 112 installs additional agents 152 in hypervisor 150 to add a host 120 as a managed entity, referred to as a transport node.
  • host cluster 118 can be a cluster 103 of transport nodes.
  • One example of an SD networking platform that can be configured and used in embodiments described herein as network manager 112 and SD network layer 175 is a VMware NSX® platform made commercially available by VMware, Inc. of Palo Alto, CA.
  • Network manager 112 can deploy one or more transport zones in virtualized computing system 100 , including VLAN transport zone(s) and an overlay transport zone.
  • a VLAN transport zone spans a set of hosts 120 (e.g., host cluster 118 ) and is backed by external network virtualization of physical network 180 (e.g., a VLAN).
  • One example VLAN transport zone uses a management VLAN 182 on physical network 180 that enables a management network connecting hosts 120 and the VI control plane (e.g., virtualization management server 116 and network manager 112 ).
  • An overlay transport zone using overlay VLAN 184 on physical network 180 enables an overlay network that spans a set of hosts 120 (e.g., host cluster 118 ) and provides internal network virtualization using software components (e.g., the virtualization layer and services executing in VMs). Host-to-host traffic for the overlay transport zone is carried by physical network 180 on the overlay VLAN 184 using layer-2-over-layer-3 tunnels.
  • Network manager 112 can configure SD network layer 175 to provide a cluster network 186 using the overlay network.
  • the overlay transport zone can be extended into at least one of edge transport nodes 178 to provide ingress/egress between cluster network 186 and an external network.
  • system 100 further includes an image registry 190 .
  • containers of supervisor cluster 101 execute in pod VMs 130 .
  • the containers in pod VMs 130 are spun up from container images managed by image registry 190 .
  • Image registry 190 manages images and image repositories for use in supplying images for containerized applications.
  • Virtualization management server 116 and network manager 112 comprise a virtual infrastructure (VI) control plane 113 of virtualized computing system 100 .
  • Virtualization management server 116 can include a supervisor cluster service 109 , storage service 110 , and VI services 108 .
  • Supervisor cluster service 109 enables host cluster 118 as supervisor cluster 101 and deploys the components of orchestration control plane 115 .
  • VI services 108 include various virtualization management services, such as a distributed resource scheduler (DRS), high-availability (HA) service, single sign-on (SSO) service, virtualization management daemon, and the like.
  • DRS is configured to aggregate the resources of host cluster 118 to provide resource pools and enforce resource allocation policies. DRS also provides resource management in the form of load balancing, power management, VM placement, and the like.
  • HA service is configured to pool VMs and hosts into a monitored cluster and, in the event of a failure, restart VMs on alternate hosts in the cluster.
  • a single host is elected as a master, which communicates with the HA service and monitors the state of protected VMs on subordinate hosts.
  • the HA service uses admission control to ensure enough resources are reserved in the cluster for VM recovery when a host fails.
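  • The admission-control idea can be sketched as a simple capacity check. This is an illustrative simplification under stated assumptions, not the HA service's actual policy: it considers memory only, assumes the worst case is the loss of the largest host(s), and uses a hypothetical function name `admits` and parameter `host_failures_to_tolerate`.

```python
def admits(host_capacity_mb, reserved_mb, new_vm_mb, host_failures_to_tolerate=1):
    """Return True if a new VM still leaves enough failover headroom.

    host_capacity_mb: mapping of host name -> memory capacity in MB.
    reserved_mb: memory already reserved by protected VMs.
    new_vm_mb: memory reservation of the VM being powered on.
    """
    # Sort ascending so the largest hosts sit at the end of the list.
    surviving = sorted(host_capacity_mb.values())
    if host_failures_to_tolerate:
        # Assume the N largest hosts are the ones that fail.
        surviving = surviving[:-host_failures_to_tolerate]
    usable = sum(surviving)
    # Admit only if all reservations fit on the surviving hosts.
    return reserved_mb + new_vm_mb <= usable


capacity = {"host-a": 64, "host-b": 64, "host-c": 128}
print(admits(capacity, reserved_mb=100, new_vm_mb=20))
print(admits(capacity, reserved_mb=100, new_vm_mb=40))
```

  • With one tolerated host failure, only 128 MB of the 256 MB total is treated as usable, so the second request is rejected even though the cluster currently has free memory.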
  • SSO service comprises security token service, administration server, directory service, identity management service, and the like configured to implement an SSO platform for authenticating users.
  • the virtualization management daemon is configured to manage objects, such as data centers, clusters, hosts, VMs, resource pools, datastores, and the like.
  • a VI admin can interact with virtualization management server 116 through a VM management client 106 .
  • a VI admin commands virtualization management server 116 to form host cluster 118 , configure resource pools, resource allocation policies, and other cluster-level functions, configure storage and networking, enable supervisor cluster 101 , deploy and manage image registry 190 , and the like.
  • Kubernetes client 102 represents an input interface for a user to supervisor Kubernetes master 104 .
  • Kubernetes client 102 is commonly referred to as kubectl.
  • a user submits desired states of the Kubernetes system, e.g., as YAML documents, to supervisor Kubernetes master 104 .
  • the user submits the desired states within the scope of a supervisor namespace.
  • a “supervisor namespace” is a shared abstraction between VI control plane 113 and orchestration control plane 115 .
  • Each supervisor namespace provides resource-constrained and authorization-constrained units of multi-tenancy.
  • a supervisor namespace provides resource constraints, user-access constraints, and policies (e.g., storage policies, network policies, etc.).
  • Resource constraints can be expressed as quotas, limits, and the like with respect to compute (CPU and memory), storage, and networking of the virtualized infrastructure (host cluster 118 , shared storage 170 , SD network layer 175 ).
  • User-access constraints include definitions of users, roles, permissions, bindings of roles to users, and the like.
  • Each supervisor namespace is expressed within orchestration control plane 115 using a namespace native to orchestration control plane 115 (e.g., a Kubernetes namespace or generally a “native namespace”), which allows users to deploy applications in supervisor cluster 101 within the scope of supervisor namespaces. In this manner, the user interacts with supervisor Kubernetes master 104 to deploy applications in supervisor cluster 101 within defined supervisor namespaces.
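  • A supervisor namespace combines resource constraints with user-access constraints. The sketch below models that combination in a few lines; the `Resources` and `SupervisorNamespace` classes, the `admit` method, the units, and the single "edit" role are assumptions made for illustration and do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Compute and storage amounts; the units are illustrative."""
    cpu_millicores: int = 0
    memory_mib: int = 0
    storage_gib: int = 0

    def __add__(self, other):
        return Resources(
            self.cpu_millicores + other.cpu_millicores,
            self.memory_mib + other.memory_mib,
            self.storage_gib + other.storage_gib,
        )

    def fits_within(self, limit):
        return (
            self.cpu_millicores <= limit.cpu_millicores
            and self.memory_mib <= limit.memory_mib
            and self.storage_gib <= limit.storage_gib
        )


@dataclass
class SupervisorNamespace:
    """Hypothetical model: a quota plus role bindings (user -> role)."""
    name: str
    quota: Resources
    role_bindings: dict
    used: Resources = Resources()

    def admit(self, user, request):
        """Return (allowed, reason) for a deployment request."""
        # Authorization constraint: only users bound to "edit" may deploy.
        if self.role_bindings.get(user) != "edit":
            return False, "user lacks edit permission"
        # Resource constraint: usage after the request must fit the quota.
        if not (self.used + request).fits_within(self.quota):
            return False, "quota exceeded"
        self.used = self.used + request
        return True, "admitted"


ns = SupervisorNamespace(
    "team-a",
    quota=Resources(4000, 8192, 100),
    role_bindings={"alice": "edit", "bob": "view"},
)
print(ns.admit("alice", Resources(2000, 4096, 50)))
```

  • Checking authorization before quota means an unauthorized request never consumes any of the namespace's budget.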
  • While FIG. 1 shows an example of a supervisor cluster 101, the techniques described herein do not require a supervisor cluster 101.
  • In some embodiments, host cluster 118 is not enabled as a supervisor cluster 101. In such a case, supervisor Kubernetes master 104, Kubernetes client 102, pod VMs 130, supervisor cluster service 109, and image registry 190 can be omitted.
  • While host cluster 118 is shown as being enabled as a transport node cluster 103, in other embodiments network manager 112 can be omitted. In such a case, virtualization management server 116 functions to configure SD network layer 175.
  • FIG. 2 is a block diagram depicting software platform 124 according to an embodiment.
  • software platform 124 of host 120 includes hypervisor 150 that supports execution of VMs, such as pod VMs 130 , native VMs 140 , and support VMs 145 .
  • hypervisor 150 includes a VM management daemon 211 , a host daemon 214 , a pod VM controller 216 , a native VM controller 217 , an image service 218 , and network agents 222 .
  • VM management daemon 211 is an agent 152 installed by virtualization management server 116 .
  • VM management daemon 211 provides an interface to host daemon 214 for virtualization management server 116 .
  • Host daemon 214 is configured to create, configure, and remove VMs (e.g., pod VMs 130 and native VMs 140 ).
  • Pod VM controller 216 is an agent 152 of orchestration control plane 115 for supervisor cluster 101 and allows supervisor Kubernetes master 104 to interact with hypervisor 150 .
  • Pod VM controller 216 configures the respective host as a node in supervisor cluster 101 .
  • Pod VM controller 216 manages the lifecycle of pod VMs 130 , such as determining when to spin-up or delete a pod VM.
  • Pod VM controller 216 also ensures that any pod dependencies, such as container images, networks, and volumes are available and correctly configured.
  • Pod VM controller 216 is omitted if host cluster 118 is not enabled as a supervisor cluster 101 .
  • Native VM controller 217 is an agent 152 of orchestration control plane 115 for supervisor cluster 101 and allows supervisor Kubernetes master 104 to interact with hypervisor 150 to manage lifecycles of native VMs 140 and applications executing therein. While shown separately from pod VM controller 216, in some embodiments both pod VM controller 216 and native VM controller 217 can be functions of a single controller.
  • Image service 218 is configured to pull container images from image registry 190 and store them in shared storage 170 such that the container images can be mounted by pod VMs 130 .
  • Image service 218 is also responsible for managing the storage available for container images within shared storage 170. This includes managing authentication with image registry 190, assuring provenance of container images by verifying signatures, updating container images when necessary, and garbage collecting unused container images.
  • Image service 218 communicates with pod VM controller 216 during spin-up and configuration of pod VMs 130 .
  • image service 218 is part of pod VM controller 216 .
  • image service 218 utilizes system VMs 130 / 140 in support VMs 145 to fetch images, convert images to container image virtual disks, and cache container image virtual disks in shared storage 170 .
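  • Two of the image service duties described above, checking image integrity and garbage collecting unused images, can be sketched as follows. This is a simplified illustration: a SHA-256 content digest stands in for real signature verification, and the function names `digest_matches` and `unused_images`, along with the image and pod VM names, are hypothetical.

```python
import hashlib


def digest_matches(image_bytes, expected_sha256):
    """Stand-in for signature verification: compare a SHA-256 digest."""
    return hashlib.sha256(image_bytes).hexdigest() == expected_sha256


def unused_images(cached_images, mounts_by_pod_vm):
    """Return the cached image names that no pod VM currently mounts."""
    in_use = set()
    for mounted in mounts_by_pod_vm.values():
        in_use.update(mounted)
    # Anything cached but not mounted is a garbage collection candidate.
    return sorted(set(cached_images) - in_use)


cached = ["nginx:1.25", "redis:7", "busybox:1.36"]
mounts = {
    "pod-vm-1": ["nginx:1.25"],
    "pod-vm-2": ["nginx:1.25", "redis:7"],
}
print(unused_images(cached, mounts))
```

  • A production garbage collector would likely add a grace period before deleting, so that an image about to be mounted by a pod VM being spun up is not removed.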
  • Network agents 222 comprise agents 152 installed by network manager 112.
  • Network agents 222 are configured to cooperate with network manager 112 to implement logical network services.
  • Network agents 222 configure the respective host as a transport node in a cluster 103 of transport nodes.
  • Each pod VM 130 has one or more containers 206 running therein in an execution space managed by container engine 208 .
  • the lifecycle of containers 206 is managed by pod VM agent 212 .
  • Both container engine 208 and pod VM agent 212 execute on top of a kernel 210 (e.g., a Linux® kernel).
  • Each native VM 140 has applications 202 running therein on top of an OS 204 .
  • Native VMs 140 do not include pod VM agents and are isolated from pod VM controller 216 . Rather, native VMs 140 include management agents 213 that communicate with native VM controller 217 .
  • Container engine 208 can be an industry-standard container engine, such as libcontainer, runc, or containerd.
  • Pod VMs 130 , pod VM controller 216 , native VM controller 217 , and image service 218 are omitted if host cluster 118 is not enabled as a supervisor cluster 101 .
  • FIG. 3 is a block diagram of supervisor Kubernetes master 104 according to an embodiment.
  • Supervisor Kubernetes master 104 includes application programming interface (API) server 302 , a state database 303 , a scheduler 304 , a scheduler extender 306 , controllers 308 , and plugins 319 .
  • API server 302 includes the Kubernetes API server, kube-api-server (“Kubernetes API 326 ”) and custom APIs 305 .
  • Custom APIs 305 are API extensions of Kubernetes API 326 using either the custom resource/operator extension pattern or the API extension server pattern. Custom APIs 305 are used to create and manage custom resources, such as VM objects.
  • API server 302 provides a declarative schema for creating, updating, deleting, and viewing objects.
  • State database 303 stores the state of supervisor cluster 101 (e.g., etcd) as objects created by API server 302 .
  • a user can provide application specification data to API server 302 that defines various objects supported by the API (e.g., as a YAML document). The objects have specifications that represent the desired state.
  • State database 303 stores the objects defined by application specification data as part of the supervisor cluster state.
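  • The declarative flow described above, in which a user submits a specification and the state database records it as desired state, can be sketched with an in-memory store. Everything here is illustrative: the `StateDatabase` class, the `example.local/v1` API group, the `VirtualMachine` kind, and the `spec` fields are assumptions and not the schema of the custom APIs in the source. A Python dictionary stands in for the YAML document.

```python
import copy


class StateDatabase:
    """In-memory stand-in for the state database (e.g., etcd)."""

    def __init__(self):
        self._objects = {}

    @staticmethod
    def _key(obj):
        meta = obj["metadata"]
        return (meta["namespace"], obj["kind"], meta["name"])

    def apply(self, obj):
        """Record the desired state and report what changed."""
        key = self._key(obj)
        previous = self._objects.get(key)
        # Store a copy so later edits by the caller do not leak in.
        self._objects[key] = copy.deepcopy(obj)
        if previous is None:
            return "created"
        return "unchanged" if previous == obj else "updated"

    def get(self, namespace, kind, name):
        return copy.deepcopy(self._objects.get((namespace, kind, name)))


# Hypothetical VM object; group, kind, and spec fields are illustrative.
vm_object = {
    "apiVersion": "example.local/v1",
    "kind": "VirtualMachine",
    "metadata": {"name": "db-vm", "namespace": "team-a"},
    "spec": {"cpus": 2, "memoryMiB": 4096, "powerState": "poweredOn"},
}

db = StateDatabase()
print(db.apply(vm_object))
print(db.apply(vm_object))
vm_object["spec"]["cpus"] = 4
print(db.apply(vm_object))
```

  • Submitting the same specification twice is a no-op, which is the property that lets users re-apply a document safely. Only a changed specification registers as an update.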
  • Standard Kubernetes objects (“Kubernetes objects 310 ”) include namespaces, nodes, pods, config maps, secrets, among others.
  • Custom objects are resources defined through custom APIs 305 (e.g., VM objects 307 ).
  • Namespaces provide scope for objects. Namespaces are objects themselves maintained in state database 303 . A namespace can include resource quotas, limit ranges, role bindings, and the like that are applied to objects declared within its scope.
  • VI control plane 113 creates and manages supervisor namespaces for supervisor cluster 101 .
  • a supervisor namespace is a resource-constrained and authorization-constrained unit of multi-tenancy managed by virtualization management server 116 . Namespaces inherit constraints from corresponding supervisor cluster namespaces.
  • Config maps include configuration information for applications managed by supervisor Kubernetes master 104 . Secrets include sensitive information for use by applications managed by supervisor Kubernetes master 104 (e.g., passwords, keys, tokens, etc.).
  • the configuration information and the secret information stored by config maps and secrets is generally referred to herein as decoupled information. Decoupled information is information needed by the managed applications, but which is decoupled from the application code.
  • Controllers 308 can include, for example, standard Kubernetes controllers (“Kubernetes controllers 316 ”) (e.g., kube-controller-manager controllers, cloud-controller-manager controllers, etc.) and custom controllers 318 .
  • Custom controllers 318 include controllers for managing lifecycle of Kubernetes objects 310 and custom objects.
  • Custom controllers 318 can include VM controllers 328 configured to manage VM objects 307 and a pod VM lifecycle controller (PLC) 330 configured to manage pods 324.
  • a controller 308 tracks objects in state database 303 of at least one resource type. Controller(s) 318 are responsible for making the current state of supervisor cluster 101 come closer to the desired state as stored in state database 303 .
  • a controller 318 can carry out action(s) by itself, send messages to API server 302 to have side effects, and/or interact with external systems.
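  • The controller behavior described above (track objects, then move the current state toward the desired state) can be illustrated with a minimal sketch. The `Cluster` class, the `reconcile` function, and the dictionary-based object specs are hypothetical names introduced for illustration only; they are not part of the described system or of any Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in: 'desired' plays the state database, 'current' the observed state."""
    desired: dict = field(default_factory=dict)  # object name -> desired spec
    current: dict = field(default_factory=dict)  # object name -> observed spec


def reconcile(cluster):
    """Run one reconciliation pass and return the list of actions taken."""
    actions = []
    # Create or update every object whose observed spec differs from the desired spec.
    for name, spec in cluster.desired.items():
        if name not in cluster.current:
            cluster.current[name] = dict(spec)
            actions.append(("create", name))
        elif cluster.current[name] != spec:
            cluster.current[name] = dict(spec)
            actions.append(("update", name))
    # Delete every object that exists but is no longer desired.
    for name in list(cluster.current):
        if name not in cluster.desired:
            del cluster.current[name]
            actions.append(("delete", name))
    return actions


cluster = Cluster(
    desired={"vm-a": {"cpus": 2}, "vm-b": {"cpus": 4}},
    current={"vm-b": {"cpus": 1}, "vm-c": {"cpus": 8}},
)
print(reconcile(cluster))  # first pass performs create, update, and delete actions
print(reconcile(cluster))  # second pass finds nothing left to do
```

  • A real controller would react to change notifications and call out to external systems rather than mutate a dictionary, but the compare-and-converge loop is the same idea.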
  • Plugins 319 can include, for example, network plugin 312 and storage plugin 314 .
  • Plugins 319 provide a well-defined interface to replace a set of functionality of the Kubernetes control plane.
  • Network plugin 312 is responsible for configuration of SD network layer 175 to deploy and configure the cluster network.
  • Network plugin 312 cooperates with virtualization management server 116 and/or network manager 112 to deploy logical network services of the cluster network.
  • Network plugin 312 also monitors state database for custom objects 307 , such as NIP objects.
  • Storage plugin 314 is responsible for providing a standardized interface for persistent storage lifecycle and management to satisfy the needs of resources requiring persistent storage.
  • Storage plugin 314 cooperates with virtualization management server 116 and/or persistent storage manager 110 to implement the appropriate persistent storage volumes in shared storage 170 .
  • Scheduler 304 watches state database 303 for newly created pods with no assigned node.
  • a pod is an object supported by API server 302 that is a group of one or more containers, with network and storage, and a specification on how to execute.
  • Scheduler 304 selects candidate nodes in supervisor cluster 101 for pods.
  • Scheduler 304 cooperates with scheduler extender 306 , which interfaces with virtualization management server 116 .
  • Scheduler extender 306 cooperates with virtualization management server 116 (e.g., such as with DRS) to select nodes from candidate sets of nodes and provide identities of hosts 120 corresponding to the selected nodes.
  • For each pod, scheduler 304 also converts the pod specification to a pod VM specification, and scheduler extender 306 asks virtualization management server 116 to reserve a pod VM on the selected host 120.
  • Scheduler 304 updates pods in state database 303 with host identifiers.
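  • The scheduling flow described above can be sketched as three steps: filter candidate nodes, let an extender pick at most one of them, and record the host identifier on the pod. All function names and dictionary fields below are hypothetical. The "most free memory" policy is an assumption standing in for whatever placement logic the virtualization management server applies (e.g., DRS).

```python
def filter_candidates(pod, nodes):
    """Keep nodes whose labels satisfy every key/value in the pod's node selector."""
    selector = pod.get("node_selector", {})
    return [
        node for node in nodes
        if all(node["labels"].get(key) == value for key, value in selector.items())
    ]


def extender_select(candidates, requested_mb):
    """Stand-in for the scheduler extender: return zero or one node.

    Assumed policy: among nodes that can fit the request, pick the one with
    the most free memory. Returns None when no candidate fits.
    """
    fitting = [node for node in candidates if node["free_mb"] >= requested_mb]
    return max(fitting, key=lambda node: node["free_mb"], default=None)


def schedule(pod, nodes):
    """Assign a host identifier to the pod if a suitable node exists."""
    chosen = extender_select(filter_candidates(pod, nodes), pod["memory_mb"])
    if chosen is not None:
        pod["host_id"] = chosen["host_id"]
    return pod


nodes = [
    {"host_id": "host-1", "labels": {"zone": "a"}, "free_mb": 2048},
    {"host_id": "host-2", "labels": {"zone": "a"}, "free_mb": 8192},
    {"host_id": "host-3", "labels": {"zone": "b"}, "free_mb": 16384},
]
pod = {"name": "web", "node_selector": {"zone": "a"}, "memory_mb": 4096}
print(schedule(pod, nodes).get("host_id"))
```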
  • Kubernetes API 326 , state database 303 , scheduler 304 , and Kubernetes controllers 316 comprise standard components of a Kubernetes system executing on supervisor cluster 101 .
  • Custom controllers 318 , plugins 319 , and scheduler extender 306 comprise custom components of orchestration control plane 115 that integrate the Kubernetes system with host cluster 118 and VI control plane 113 .
  • custom APIs 305 enable developers to discover available content and to import existing VMs as new images within their Kubernetes Namespace.
  • VM objects 307 that can be specified through custom APIs 305 include VM resources 320 , VM image resources 322 , VM profile resources 324 , network policy resources 325 , network resources 327 , and service resources 329 .
  • VM image resource 322 enables discovery of available images for consumption via custom APIs 305 .
  • VM image resource 322 exposes verbs such as image listing, filtering, and import so that the developer can manage the lifecycle and consumption of images.
  • a single VM image resource 322 describes a reference to an existing VM template image in a repository.
  • VM profile resource 324 is a resource that describes a curated set of VM attributes that can be used to instantiate native VMs.
  • VM profile resource 324 gives the VI Admin control over the configuration and policy of the native VMs that are available to the developer.
  • the VI Admin can define a set of available VM profile resources 324 available in each namespace.
  • The VI Admin can create new profiles to balance the requirements of the VI Admin, the developer, and those imposed by the underlying hardware. VM profile resource 324 enables definition of classes of information such as virtual CPU and memory capacity exposed to the native VM; resource, availability, and compute policy for the native VM; and special hardware resources (e.g., FPGA, pmem, vGPU, etc.) available to the VM profile.
  • Unlike pod VMs, native VMs have their own network requirements. Multipath NICs, public/private NICs, and legacy application requirements can drive the need for support of multiple vNICs, each of which may have custom network configuration. As an example, a clustered SQL Server expects to have at least two vNICs on separate networks: one public and one private. The private vNIC is used for IPC and heartbeat traffic with its peers. As a consequence of this need for flexibility, network policy resource 325 allows the VI admin to define the set of available networks for native VMs.
  • Network resource 327 represents a single network to be consumed by a native VM.
  • network resource 327 is a simple resource, abstracting the details of an underlying virtual port group that the network represents.
  • Network resource 327 may be one of the following types: standard port group, distributed port group, or tier 1 logical router in SD network layer 175, and the like.
  • the available networks are configured by the VI Admin for each namespace via a network policy resource 325 .
  • Network resources 327 are used to attach additional network interfaces to a specific virtual network.
  • Service resource 329 binds native VM instances to Kubernetes services in order to expose a network service from the native VM to pods and other native VMs.
  • Service resource 329 includes a label selector that is used to match any labels applied to any VM resources 320.
  • a delegate service and endpoints resource is installed in order to enable network access to the native VM via the service DNS name or IP address.
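  • The label-selector matching described above can be illustrated as follows. This is a minimal sketch assuming equality-based selectors, where every key/value pair in the selector must appear in the VM's labels; the function names and record fields are hypothetical.

```python
def selector_matches(selector, labels):
    """Return True if every key/value pair in the selector is present in labels.

    An empty selector matches every VM in this sketch.
    """
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints_for(service, vms):
    """Collect the IP addresses of all VMs whose labels match the service selector."""
    return [vm["ip"] for vm in vms if selector_matches(service["selector"], vm["labels"])]


vms = [
    {"name": "sql-0", "labels": {"app": "sql", "tier": "db"}, "ip": "10.0.0.10"},
    {"name": "sql-1", "labels": {"app": "sql", "tier": "db"}, "ip": "10.0.0.11"},
    {"name": "legacy", "labels": {"app": "erp"}, "ip": "10.0.0.20"},
]
service = {"name": "sql-service", "selector": {"app": "sql"}}
print(endpoints_for(service, vms))
```

  • The resulting address list corresponds to what an endpoints resource would carry so that pods and other native VMs can reach the service by its DNS name or IP address.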
  • A VM resource 320 combines all of the above resources to generate a desired native VM.
  • a VM resource 320 specifies a VM image resource 322 to use as the master image.
  • a developer can override additional attributes of the cloned image.
  • a developer can override image attributes by specifying a VM profile resource 324 .
  • a developer can override image attributes by explicit specification of a desired attribute to override.
  • VM resources 320 specify a configuration that is mapped to underlying infrastructure features by VM controllers 328, including but not limited to: VM Name, Virtual Resource Capacity, Network to Virtual NIC binding, DNS Configuration, Volume Customization, VM Customization scripts, and VM Placement and Affinity policy.
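  • How a VM resource layers attributes on top of a cloned image can be sketched as a simple merge. The text states that image attributes can be overridden by a VM profile or by explicit specification; the precedence between those two (explicit overrides win over the profile) is an assumption of this sketch, as are all names and attribute keys.

```python
def resolve_vm_config(image, profile=None, overrides=None):
    """Merge attribute sources into one effective VM configuration.

    Assumed precedence, lowest to highest: image, profile, explicit overrides.
    """
    config = dict(image)            # start from the attributes of the cloned image
    config.update(profile or {})    # attributes curated in the VM profile
    config.update(overrides or {})  # attributes the developer specifies explicitly
    return config


image = {"guest_os": "linux", "cpus": 1, "memory_mb": 1024}
profile = {"cpus": 4, "memory_mb": 8192}
overrides = {"memory_mb": 16384}
print(resolve_vm_config(image, profile, overrides))
```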
  • FIG. 4 is a block diagram depicting a logical view of a virtualized computing system according to an embodiment.
  • Supervisor cluster 101 is implemented by a software-defined data center (SDDC) 402 .
  • SDDC 402 includes virtualized computing system 100 shown in FIG. 1 , including host cluster 118 , virtualization management server 116 , network manager 112 , shared storage 170 , and SD network layer 175 .
  • SDDC 402 includes VI control plane 113 for managing a virtualization layer of host cluster 118 , along with shared storage 170 and SD network layer 175 .
  • a VI admin interacts with VM management server 116 (and optionally network manager 112 ) of VI control plane 113 to configure SDDC 402 to implement supervisor cluster 101 .
  • Supervisor cluster 101 includes orchestration control plane 115 , which includes supervisor Kubernetes master(s) 104 .
  • the VI admin interacts with VM management server 116 to create supervisor namespaces including supervisor namespace 412 .
  • Each supervisor namespace includes a resource pool and authorization constraints.
  • the resource pool includes various resource constraints on the supervisor namespace (e.g., reservation, limits, and share (RLS) constraints).
  • Authorization constraints provide for which roles are permitted to perform which operations in the supervisor namespace (e.g., allowing VI admin to create, manage access, allocate resources, view, and create objects; allowing DevOps to view and create objects; etc.).
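  • The two kinds of supervisor namespace constraints can be sketched as a pair of checks applied to each request. The role table below is modeled on the examples in the text, but its identifiers are hypothetical, and only the "limit" part of the RLS constraints is modeled (reservations and shares are omitted for brevity).

```python
# Hypothetical role-to-operation table modeled on the examples in the text.
ROLE_OPERATIONS = {
    "vi-admin": {"create", "manage-access", "allocate-resources", "view", "create-objects"},
    "devops": {"view", "create-objects"},
}


def is_authorized(role, operation):
    """Return True if the role may perform the operation in the namespace."""
    return operation in ROLE_OPERATIONS.get(role, set())


def fits_limit(used_mb, requested_mb, limit_mb):
    """Return True if the request keeps namespace memory usage within its limit."""
    return used_mb + requested_mb <= limit_mb


def admit(role, operation, used_mb, requested_mb, limit_mb):
    """Admit a request only if it passes both the authorization and the resource check."""
    return is_authorized(role, operation) and fits_limit(used_mb, requested_mb, limit_mb)


print(admit("devops", "create-objects", used_mb=1024, requested_mb=512, limit_mb=2048))
print(admit("devops", "allocate-resources", used_mb=0, requested_mb=0, limit_mb=2048))
```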
  • A user interacts with supervisor Kubernetes master 104 to deploy applications on supervisor cluster 101 within scopes of supervisor namespaces.
  • the user deploys containerized applications 428 on pod VMs 130 and non-containerized applications 429 on native VMs 140 .
  • Non-containerized applications 429 execute on a guest operating system in a native VM 140 exclusive of any container engine.
  • Kubernetes allows passing of configuration and secret information to containerized applications 428 .
  • standard Kubernetes does not extend this functionality beyond pod-based workloads (i.e., containerized applications executing in pods).
  • Embodiments described herein extend this functionality for applications executing in native VMs (e.g., non-containerized applications 429 ).
  • supervisor Kubernetes master 104 manages lifecycle of decoupled information 403 (e.g., config maps and secrets) for non-containerized applications 429 . That is, supervisor Kubernetes master 104 performs create, read, update, and delete operations on objects that include decoupled information 403 .
  • Supervisor Kubernetes master 104 provides decoupled information 403 to native VM controller 217 upon deployment of non-containerized applications 429 to native VMs 140 .
  • Native VM controller 217 cooperates with management agent 213 executing in each native VM 140 to provide decoupled information 403 for use by non-containerized applications 429 .
  • Management agent 213 in each native VM 140 exposes decoupled information 403 for access by non-containerized applications 429 .
  • management agent 213 creates environment variables accessible by non-containerized applications 429 .
  • Management agent 213 creates files in a filesystem accessible by native VMs 140, which in turn can be read by non-containerized applications 429.
  • the files can be resident in system memory (e.g., RAM).
  • Supervisor Kubernetes master 104 can provide updates to decoupled information 403 to native VM controller 217 , which in turn provides the updates to management agent 213 for use by non-containerized applications 429 .
  • When specifying a non-containerized application at supervisor Kubernetes master 104, the user can specify the decoupled information 403 upon which the application relies and how to consume the decoupled information (e.g., as environment variables, as files, etc.).
  • Supervisor Kubernetes master 104 schedules the non-containerized application to run in a VM object implemented by a native VM 140 .
  • Upon deployment of native VM 140, management agent 213 establishes a connection with native VM controller 217 using a hypervisor-guest channel (e.g., a virtual socket connection). In embodiments, management agent 213 communicates with native VM controller 217 over the hypervisor-guest channel using a remote procedure call (RPC) protocol.
  • Management agent 213 sets up decoupled information 403 as specified for each non-containerized application 429 (e.g., environment variables, files, etc.). Management agent 213 updates decoupled information 403 exposed to non-containerized applications 429 as updates are received from supervisor Kubernetes master 104 through native VM controller 217 .
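  • The two consumption modes described above (environment variables and files) can be sketched as follows. This is an illustration only: the function names, the `"env"`/`"files"` mode strings, and the one-file-per-key layout are assumptions, and the transport from the native VM controller to the agent is not modeled.

```python
import os
import tempfile
from pathlib import Path


def expose_as_env(info, environ):
    """Copy each key/value pair into the given environment mapping."""
    for key, value in info.items():
        environ[key] = value


def expose_as_files(info, directory):
    """Write one file per key, named after the key and containing the value."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for key, value in info.items():
        (directory / key).write_text(value)


def apply_decoupled_info(info, how, environ=None, directory=None):
    """Expose config-map or secret data in the mode requested by the user."""
    if how == "env":
        expose_as_env(info, environ if environ is not None else os.environ)
    elif how == "files":
        expose_as_files(info, directory)
    else:
        raise ValueError(f"unknown consumption mode: {how}")


# Usage: expose a config value as an environment variable and a secret as a file.
env = {}
apply_decoupled_info({"DB_HOST": "db.internal"}, "env", environ=env)
with tempfile.TemporaryDirectory() as tmp:
    apply_decoupled_info({"DB_PASSWORD": "s3cret"}, "files", directory=tmp)
    print(env["DB_HOST"], (Path(tmp) / "DB_PASSWORD").read_text())
```

  • Applying an update amounts to calling the same function again with the new values, which overwrites the previous environment entries or file contents.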
  • FIG. 5 is a flow diagram depicting a method 500 of application orchestration in a virtualized computing system according to an embodiment.
  • Method 500 can be performed by software in supervisor cluster 101 executing on CPU, memory, storage, and network resources managed by virtualization layer(s) (e.g., hypervisor(s)) or a host operating system(s).
  • Method 500 can be understood with reference to FIGS. 3 - 4 .
  • Method 500 begins at step 502 , where supervisor Kubernetes master 104 receives a specification for an application to be deployed using a native VM.
  • the specification is defined using custom APIs 305 .
  • a user can specify a VM image resource 322 .
  • a user can specify a VM profile 324 .
  • a user can specify a network policy 325 and/or network resources 327 .
  • a user can optionally specify a VM service 329 . A user can tie all of these objects together by specifying a VM resource 320 .
  • VM controller 328 which is part of orchestration control plane 115 , cooperates with virtualization management server 116 , which is part of VI control plane 113 , to select a host 120 for deploying a native VM.
  • the user specifies the native VM to orchestration control plane 115 , which in turn cooperates with VI control plane 113 to deploy the native VM.
  • VM controller 328 cooperates with virtualization management server 116 to deploy a native VM 140 as specified to the selected host.
  • Native VM 140 is deployed alongside any pod VMs 130 executing in the selected host and managed by orchestration control plane 115 .
  • orchestration control plane controls deployment of both native VM 140 and pod VM(s) 130 .
  • VM controller 328 and virtualization management server 116 clone a VM from a selected VM image after resource creation on the selected host.
  • VM controller 328 and virtualization management server 116 apply policies (e.g., VM profile(s), network policy, etc.) to the native VM.
  • VM controller 328 and virtualization management server 116 start native VM 140 on the selected host as configured.
  • management agent 213 receives config map/secrets from supervisor Kubernetes master 104 through native VM controller 217 .
  • Management agent 213 exposes the configuration/secret information in the config maps/secrets to the application as specified by the user.
  • VM controller 328 and virtualization management server 116 cooperate to power down and delete the native VM upon deletion of VM resource 320 .
  • FIG. 6 is a flow diagram depicting a method 600 of application orchestration in a virtualized computing system according to another embodiment.
  • method 600 can be performed by VI control plane 113 and orchestration control plane 115 , which comprise software executing on CPU, memory, storage, and network resources managed by a virtualization layer (e.g., a hypervisor) and/or host operating system.
  • Method 600 begins at step 602 , where a user provides a pod specification to API server 302 to create a new pod.
  • scheduler 304 selects candidate nodes for deployment of the pod. Scheduler 304 selects the candidate nodes by filtering on affinity, node selector constraints, etc.
  • scheduler extender 306 cooperates with VI services 108 in VM management server 116 to select a node from the set of candidate nodes.
  • VI services 108 selects zero or one node from the list of a plurality of candidate nodes provided by scheduler extender 306 .
  • Scheduler 304 converts the pod specification to a VM specification for a pod VM 130. For example, scheduler 304 converts CPU and memory requests and limits from the pod specification to the VM specification, with fallback to reasonable defaults.
  • the VM specification includes a vNIC device attached to the logical network used by pod VMs 130 .
  • the guest OS in VM specification is specified to be kernel 210 with container engine 208 .
  • Storage is an ephemeral virtual disk.
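  • The pod-to-VM specification conversion described above can be sketched as follows. The field names, the default values, the network name, and the choice to sum requests across containers are all assumptions made for illustration; the text only states that CPU and memory settings are converted with fallback to defaults, and that the VM specification includes a vNIC on the pod network, a kernel with container engine as guest OS, and an ephemeral virtual disk.

```python
DEFAULT_CPUS = 1          # hypothetical fallback when no CPU request is given
DEFAULT_MEMORY_MB = 512   # hypothetical fallback when no memory request is given


def pod_to_vm_spec(pod_spec, pod_network="pod-network"):
    """Build a pod VM specification from a pod specification."""
    containers = pod_spec.get("containers", [])
    # Sum the requests of all containers; fall back to defaults when the sum is zero.
    cpus = sum(c.get("cpu_request", 0) for c in containers) or DEFAULT_CPUS
    memory_mb = sum(c.get("memory_request_mb", 0) for c in containers) or DEFAULT_MEMORY_MB
    return {
        "name": pod_spec["name"],
        "cpus": cpus,
        "memory_mb": memory_mb,
        "guest_os": "kernel+container-engine",  # kernel with container engine
        "nics": [{"network": pod_network}],     # vNIC on the pod VM logical network
        "disks": [{"ephemeral": True}],         # ephemeral virtual disk
    }


pod = {
    "name": "web",
    "containers": [
        {"cpu_request": 1, "memory_request_mb": 256},
        {"cpu_request": 2},
    ],
}
print(pod_to_vm_spec(pod))
```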
  • PLC 330 invokes VM management server 116 to deploy pod VM 130 to a host 120 corresponding to the selected node.
  • VM management server 116 cooperates with host daemon 214 in host 120 corresponding to the selected node to create and power-on pod VM 130 .
  • FIG. 7 is a flow diagram depicting a method 700 of application orchestration in a virtualized computing system according to another embodiment.
  • Method 700 can be performed by VI control plane 113 and orchestration control plane 115 , which comprise software executing on CPU, memory, storage, and network resources managed by a virtualization layer (e.g., a hypervisor) and/or host operating system.
  • Method 700 begins at step 702 , where a user provides specification(s) to API server 302 in orchestration control plane 115 to create new pod VM(s) 130 and new native VM(s) 140 .
  • orchestration control plane 115 executes deployment of each native VM 140 as described above with respect to FIG. 5 .
  • orchestration control plane 115 executes deployment of each pod VM 130 as described above with respect to FIG. 6 .
  • one or more hosts 120 in host cluster 118 execute native VMs 140 alongside pod VMs 130 , all of which are deployed and managed by orchestration control plane 115 in cooperation with VI control plane 113 .
  • One or more embodiments of the invention also relate to a device or an apparatus for performing these operations.
  • the apparatus may be specially constructed for required purposes, or the apparatus may be a general-purpose computer selectively activated or configured by a computer program stored in the computer.
  • Various general-purpose machines may be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.
  • One or more embodiments of the present invention may be implemented as one or more computer programs or as one or more computer program modules embodied in computer readable media.
  • the term computer readable medium refers to any data storage device that can store data which can thereafter be input to a computer system.
  • Computer readable media may be based on any existing or subsequently developed technology that embodies computer programs in a manner that enables a computer to read the programs. Examples of computer readable media are hard drives, NAS systems, read-only memory (ROM), RAM, compact disks (CDs), digital versatile disks (DVDs), magnetic tapes, and other optical and non-optical data storage devices.
  • a computer readable medium can also be distributed over a network-coupled computer system so that the computer readable code is stored and executed in a distributed fashion.
  • Virtualization systems in accordance with the various embodiments may be implemented as hosted embodiments, non-hosted embodiments, or as embodiments that blur distinctions between the two.
  • various virtualization operations may be wholly or partially implemented in hardware.
  • a hardware implementation may employ a look-up table for modification of storage access requests to secure non-disk data.
  • the virtualization software can therefore include components of a host, console, or guest OS that perform virtualization functions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Stored Programmes (AREA)

Abstract

An example virtualized computing system includes a host cluster having a virtualization layer executing on hardware platforms of hosts, the virtualization layer supporting execution of virtual machines (VMs), the VMs including pod VMs and native VMs, the pod VMs including container engines supporting execution of containers in the pod VMs, the native VMs including applications executing on guest operating systems; and an orchestration control plane integrated with the virtualization layer, the orchestration control plane including a master server having a pod VM controller to manage lifecycles of the pod VMs and a native VM controller to manage lifecycles of the native VMs.

Description

    CROSS-REFERENCE
  • This application is a continuation of U.S. patent application Ser. No. 17/153,296, filed Jan. 20, 2021, which is incorporated by reference herein.
  • BACKGROUND
  • Applications today are deployed onto a combination of virtual machines (VMs), containers, application services, and more. For deploying such applications, a container orchestrator (CO) known as Kubernetes® has gained in popularity among application developers. Kubernetes provides a platform for automating deployment, scaling, and operations of application containers across clusters of hosts. It offers flexibility in application development and offers several useful tools for scaling.
  • In a Kubernetes system, containers are grouped into logical units called “pods” that execute on nodes in a cluster (also referred to as “node cluster”). Containers in the same pod share the same resources and network and maintain a degree of isolation from containers in other pods. The pods are distributed across nodes of the cluster. In a typical deployment, a node includes an operating system (OS), such as Linux®, and a container engine executing on top of the OS that supports the containers of the pod. A node can be a physical server or a VM.
  • Computer virtualization is a technique that involves encapsulating a physical computing machine platform into virtual machine(s) executing under control of virtualization software on a hardware computing platform or “host.” A virtual machine (VM) provides virtual hardware abstractions for processor, memory, storage, and the like to a guest operating system. The virtualization software, also referred to as a “hypervisor,” includes one or more virtual machine monitors (VMMs) to provide execution environment(s) for the virtual machine(s). VMs allow for greater operating system diversity, isolation, and customization than do containers. Users have made considerable investments in making their applications run well on VMs and leveraging differentiating technologies of the underlying virtualized computing system. It is thus desirable to bring VMs into container orchestration systems like Kubernetes to allow a single management and deployment paradigm.
  • SUMMARY
  • In an embodiment, a virtualized computing system includes: a host cluster having a virtualization layer executing on hardware platforms of hosts, the virtualization layer supporting execution of virtual machines (VMs), the VMs including pod VMs and native VMs, the pod VMs including container engines supporting execution of containers in the pod VMs, the native VMs including applications executing on guest operating systems; and an orchestration control plane integrated with the virtualization layer, the orchestration control plane including a master server having a pod VM controller to manage lifecycles of the pod VMs and a native VM controller to manage lifecycles of the native VMs.
  • Further embodiments include a non-transitory computer-readable storage medium comprising instructions that cause a computer system to carry out the above methods, as well as a computer system configured to carry out the above methods.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram of a virtualized computing system in which embodiments described herein may be implemented.
  • FIG. 2 is a block diagram depicting a software platform according to an embodiment.
  • FIG. 3 is a block diagram of a supervisor Kubernetes master according to an embodiment.
  • FIG. 4 is a block diagram depicting a logical view of a guest cluster executing in a virtualized computing system according to an embodiment.
  • FIG. 5 is a flow diagram depicting a method of application orchestration in a virtualized computing system according to an embodiment.
  • FIG. 6 is a flow diagram depicting a method of application orchestration in a virtualized computing system according to another embodiment.
  • FIG. 7 is a flow diagram depicting a method of application orchestration in a virtualized computing system according to an embodiment.
  • DETAILED DESCRIPTION
  • Declarative VM management for a container orchestrator in a virtualized computing system is described. In embodiments described herein, a virtualized computing system includes a software-defined datacenter (SDDC) comprising a server virtualization platform integrated with a logical network platform. The server virtualization platform includes clusters of physical servers (“hosts”) referred to as “host clusters.” Each host cluster includes a virtualization layer, executing on host hardware platforms of the hosts, which supports execution of virtual machines (VMs). A virtualization management server manages host clusters, the virtualization layers, and the VMs executing thereon.
  • In embodiments, the virtualization layer of a host cluster is integrated with an orchestration control plane, such as a Kubernetes control plane. This integration enables the host cluster as a “supervisor cluster” that uses VMs to implement both control plane nodes having a Kubernetes control plane, and compute nodes managed by the control plane nodes. For example, Kubernetes pods are implemented as “pod VMs,” each of which includes a kernel and container engine that supports execution of containers. In embodiments, the Kubernetes control plane of the supervisor cluster is extended to support custom objects in addition to pods, such as VM objects that are implemented using native VMs (as opposed to pod VMs). A virtualization infrastructure administrator (VI admin) can enable a host cluster as a supervisor cluster and provide its functionality to development teams.
  • In embodiments, the orchestration control plane includes master server(s) with both pod VM controllers and native VM controllers. The pod VM controllers manage the lifecycles of pod VMs. The native VM controllers manage the lifecycles of native VMs executing in parallel to the pod VMs. These and further advantages and aspects of the disclosed techniques are described below with respect to the drawings.
  • FIG. 1 is a block diagram of a virtualized computing system 100 in which embodiments described herein may be implemented. System 100 includes a cluster of hosts 120 (“host cluster 118”) that may be constructed on server-grade hardware platforms such as x86 architecture platforms. For purposes of clarity, only one host cluster 118 is shown. However, virtualized computing system 100 can include many of such host clusters 118. As shown, a hardware platform 122 of each host 120 includes conventional components of a computing device, such as one or more central processing units (CPUs) 160, system memory (e.g., random access memory (RAM) 162), one or more network interface controllers (NICs) 164, and optionally local storage 163. CPUs 160 are configured to execute instructions, for example, executable instructions that perform one or more operations described herein, which may be stored in RAM 162. NICs 164 enable host 120 to communicate with other devices through a physical network 180. Physical network 180 enables communication between hosts 120 and between other components and hosts 120 (other components discussed further herein). Physical network 180 can include a plurality of VLANs to provide external network virtualization as described further herein.
  • In the embodiment illustrated in FIG. 1, hosts 120 access shared storage 170 by using NICs 164 to connect to network 180. In another embodiment, each host 120 contains a host bus adapter (HBA) through which input/output operations (IOs) are sent to shared storage 170 over a separate network (e.g., a fibre channel (FC) network). Shared storage 170 includes one or more storage arrays, such as a storage area network (SAN), network attached storage (NAS), or the like. Shared storage 170 may comprise magnetic disks, solid-state disks, flash memory, and the like as well as combinations thereof. In some embodiments, hosts 120 include local storage 163 (e.g., hard disk drives, solid-state drives, etc.). Local storage 163 in each host 120 can be aggregated and provisioned as part of a virtual SAN, which is another form of shared storage 170.
  • A software platform 124 of each host 120 provides a virtualization layer, referred to herein as a hypervisor 150, which directly executes on hardware platform 122. In an embodiment, there is no intervening software, such as a host operating system (OS), between hypervisor 150 and hardware platform 122. Thus, hypervisor 150 is a Type-1 hypervisor (also known as a “bare-metal” hypervisor). As a result, the virtualization layer in host cluster 118 (collectively hypervisors 150) is a bare-metal virtualization layer executing directly on host hardware platforms. Hypervisor 150 abstracts processor, memory, storage, and network resources of hardware platform 122 to provide a virtual machine execution space within which multiple virtual machines (VM) may be concurrently instantiated and executed. One example of hypervisor 150 that may be configured and used in embodiments described herein is a VMware ESXi™ hypervisor provided as part of the VMware vSphere® solution made commercially available by VMware, Inc. of Palo Alto, CA.
  • In the example of FIG. 1, host cluster 118 is enabled as a “supervisor cluster,” described further herein, and thus VMs executing on each host 120 include pod VMs 130 and native VMs 140. A pod VM 130 is a virtual machine that includes a kernel and container engine that supports execution of containers, as well as an agent (referred to as a pod VM agent) that cooperates with a controller of an orchestration control plane 115 executing in hypervisor 150 (referred to as a pod VM controller). An example of pod VM 130 is described further below with respect to FIG. 2. VMs 130/140 support applications 141 deployed onto host cluster 118, which can include containerized applications (e.g., executing in either pod VMs 130 or native VMs 140) and applications executing directly on guest operating systems (non-containerized) (e.g., executing in native VMs 140). One specific application discussed further herein is a guest cluster executing as a virtual extension of a supervisor cluster. Some VMs 130/140, shown as support VMs 145, have specific functions within host cluster 118. For example, support VMs 145 can provide control plane functions, edge transport functions, and the like. An embodiment of software platform 124 is discussed further below with respect to FIG. 2.
  • Host cluster 118 is configured with a software-defined (SD) network layer 175. SD network layer 175 includes logical network services executing on virtualized infrastructure in host cluster 118. The virtualized infrastructure that supports the logical network services includes hypervisor-based components, such as resource pools, distributed switches, distributed switch port groups and uplinks, etc., as well as VM-based components, such as router control VMs, load balancer VMs, edge service VMs, etc. Logical network services include logical switches, logical routers, logical firewalls, logical virtual private networks (VPNs), logical load balancers, and the like, implemented on top of the virtualized infrastructure. In embodiments, virtualized computing system 100 includes edge transport nodes 178 that provide an interface of host cluster 118 to an external network (e.g., a corporate network, the public Internet, etc.). Edge transport nodes 178 can include a gateway between the internal logical networking of host cluster 118 and the external network. Edge transport nodes 178 can be physical servers or VMs. For example, edge transport nodes 178 can be implemented in support VMs 145 and include a gateway of SD network layer 175. Various clients 119 can access service(s) in virtualized computing system through edge transport nodes 178 (including VM management client 106 and Kubernetes client 102, which are logically shown as being separate by way of example).
  • Virtualization management server 116 is a physical or virtual server that manages host cluster 118 and the virtualization layer therein. Virtualization management server 116 installs agent(s) 152 in hypervisor 150 to add a host 120 as a managed entity. Virtualization management server 116 logically groups hosts 120 into host cluster 118 to provide cluster-level functions to hosts 120, such as VM migration between hosts 120 (e.g., for load balancing), distributed power management, dynamic VM placement according to affinity and anti-affinity rules, and high-availability. The number of hosts 120 in host cluster 118 may be one or many. Virtualization management server 116 can manage more than one host cluster 118.
  • In an embodiment, virtualization management server 116 further enables host cluster 118 as a supervisor cluster 101. Virtualization management server 116 installs additional agents 152 in hypervisor 150 to add host 120 to supervisor cluster 101. Supervisor cluster 101 integrates an orchestration control plane 115 with host cluster 118. In embodiments, orchestration control plane 115 includes software components that support a container orchestrator, such as Kubernetes, to deploy and manage applications on host cluster 118. By way of example, a Kubernetes container orchestrator is described herein. In supervisor cluster 101, hosts 120 become nodes of a Kubernetes cluster and pod VMs 130 executing on hosts 120 implement Kubernetes pods. Orchestration control plane 115 includes supervisor Kubernetes master 104 and agents 152 executing in virtualization layer (e.g., hypervisors 150). Supervisor Kubernetes master 104 includes control plane components of Kubernetes, as well as custom controllers, custom plugins, scheduler extender, and the like that extend Kubernetes to interface with virtualization management server 116 and the virtualization layer. For purposes of clarity, supervisor Kubernetes master 104 is shown as a separate logical entity. For practical implementations, supervisor Kubernetes master 104 is implemented as one or more VM(s) 130/140 in host cluster 118. Further, although only one supervisor Kubernetes master 104 is shown, supervisor cluster 101 can include more than one supervisor Kubernetes master 104 in a logical cluster for redundancy and load balancing.
  • In an embodiment, virtualized computing system 100 further includes a storage service 110 that implements a storage provider in virtualized computing system 100 for container orchestrators. In embodiments, storage service 110 manages lifecycles of storage volumes (e.g., virtual disks) that back persistent volumes used by containerized applications executing in host cluster 118. A container orchestrator such as Kubernetes cooperates with storage service 110 to provide persistent storage for the deployed applications. In the embodiment of FIG. 1 , supervisor Kubernetes master 104 cooperates with storage service 110 to deploy and manage persistent storage in the supervisor cluster environment. Other embodiments described below include a vanilla container orchestrator environment and a guest cluster environment. Storage service 110 can execute in virtualization management server 116 as shown or operate independently from virtualization management server 116 (e.g., as an independent physical or virtual server).
  • In an embodiment, virtualized computing system 100 further includes a network manager 112. Network manager 112 is a physical or virtual server that orchestrates SD network layer 175. In an embodiment, network manager 112 comprises one or more virtual servers deployed as VMs. Network manager 112 installs additional agents 152 in hypervisor 150 to add a host 120 as a managed entity, referred to as a transport node. In this manner, host cluster 118 can be a cluster 103 of transport nodes. One example of an SD networking platform that can be configured and used in embodiments described herein as network manager 112 and SD network layer 175 is a VMware NSX® platform made commercially available by VMware, Inc. of Palo Alto, CA.
  • Network manager 112 can deploy one or more transport zones in virtualized computing system 100, including VLAN transport zone(s) and an overlay transport zone. A VLAN transport zone spans a set of hosts 120 (e.g., host cluster 118) and is backed by external network virtualization of physical network 180 (e.g., a VLAN). One example VLAN transport zone uses a management VLAN 182 on physical network 180 that enables a management network connecting hosts 120 and the VI control plane (e.g., virtualization management server 116 and network manager 112). An overlay transport zone using overlay VLAN 184 on physical network 180 enables an overlay network that spans a set of hosts 120 (e.g., host cluster 118) and provides internal network virtualization using software components (e.g., the virtualization layer and services executing in VMs). Host-to-host traffic for the overlay transport zone is carried by physical network 180 on the overlay VLAN 184 using layer-2-over-layer-3 tunnels. Network manager 112 can configure SD network layer 175 to provide a cluster network 186 using the overlay network. The overlay transport zone can be extended into at least one of edge transport nodes 178 to provide ingress/egress between cluster network 186 and an external network.
  • In an embodiment, system 100 further includes an image registry 190. As described herein, containers of supervisor cluster 101 execute in pod VMs 130. The containers in pod VMs 130 are spun up from container images managed by image registry 190. Image registry 190 manages images and image repositories for use in supplying images for containerized applications.
  • Virtualization management server 116 and network manager 112 comprise a virtual infrastructure (VI) control plane 113 of virtualized computing system 100. Virtualization management server 116 can include a supervisor cluster service 109, storage service 110, and VI services 108. Supervisor cluster service 109 enables host cluster 118 as supervisor cluster 101 and deploys the components of orchestration control plane 115. VI services 108 include various virtualization management services, such as a distributed resource scheduler (DRS), high-availability (HA) service, single sign-on (SSO) service, virtualization management daemon, and the like. DRS is configured to aggregate the resources of host cluster 118 to provide resource pools and enforce resource allocation policies. DRS also provides resource management in the form of load balancing, power management, VM placement, and the like. HA service is configured to pool VMs and hosts into a monitored cluster and, in the event of a failure, restart VMs on alternate hosts in the cluster. A single host is elected as a master, which communicates with the HA service and monitors the state of protected VMs on subordinate hosts. The HA service uses admission control to ensure enough resources are reserved in the cluster for VM recovery when a host fails. SSO service comprises security token service, administration server, directory service, identity management service, and the like configured to implement an SSO platform for authenticating users. The virtualization management daemon is configured to manage objects, such as data centers, clusters, hosts, VMs, resource pools, datastores, and the like.
  • A VI admin can interact with virtualization management server 116 through a VM management client 106. Through VM management client 106, a VI admin commands virtualization management server 116 to form host cluster 118, configure resource pools, resource allocation policies, and other cluster-level functions, configure storage and networking, enable supervisor cluster 101, deploy and manage image registry 190, and the like.
  • Kubernetes client 102 represents an input interface for a user to supervisor Kubernetes master 104. Kubernetes client 102 is commonly referred to as kubectl. Through Kubernetes client 102, a user submits desired states of the Kubernetes system, e.g., as YAML documents, to supervisor Kubernetes master 104. In embodiments, the user submits the desired states within the scope of a supervisor namespace. A “supervisor namespace” is a shared abstraction between VI control plane 113 and orchestration control plane 115. Each supervisor namespace provides resource-constrained and authorization-constrained units of multi-tenancy. A supervisor namespace provides resource constraints, user-access constraints, and policies (e.g., storage policies, network policies, etc.). Resource constraints can be expressed as quotas, limits, and the like with respect to compute (CPU and memory), storage, and networking of the virtualized infrastructure (host cluster 118, shared storage 170, SD network layer 175). User-access constraints include definitions of users, roles, permissions, bindings of roles to users, and the like. Each supervisor namespace is expressed within orchestration control plane 115 using a namespace native to orchestration control plane 115 (e.g., a Kubernetes namespace or generally a “native namespace”), which allows users to deploy applications in supervisor cluster 101 within the scope of supervisor namespaces. In this manner, the user interacts with supervisor Kubernetes master 104 to deploy applications in supervisor cluster 101 within defined supervisor namespaces.
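  • As a rough illustration of how a supervisor namespace could combine resource constraints with user-access constraints, the following Python sketch models a namespace that admits a workload only when the requesting user is permitted to create objects and the compute quota is not exceeded. The class name, field names, units, and the "create"/"view" operation names are all hypothetical and are not part of the described system.

```python
from dataclasses import dataclass, field


@dataclass
class SupervisorNamespace:
    """Hypothetical model of a supervisor namespace (all names are illustrative)."""
    name: str
    cpu_limit_mhz: int
    memory_limit_mb: int
    # Maps a user to the set of operations that the user's role permits.
    role_bindings: dict = field(default_factory=dict)
    cpu_used_mhz: int = 0
    memory_used_mb: int = 0

    def is_authorized(self, user: str, operation: str) -> bool:
        """User-access constraint: is this operation bound to the user?"""
        return operation in self.role_bindings.get(user, set())

    def admit(self, user: str, cpu_mhz: int, memory_mb: int) -> bool:
        """Admit a workload only if authorization and quota checks both pass."""
        if not self.is_authorized(user, "create"):
            return False
        if self.cpu_used_mhz + cpu_mhz > self.cpu_limit_mhz:
            return False
        if self.memory_used_mb + memory_mb > self.memory_limit_mb:
            return False
        # Record the consumption so later requests see the reduced headroom.
        self.cpu_used_mhz += cpu_mhz
        self.memory_used_mb += memory_mb
        return True
```

  • In this sketch a rejected request leaves the recorded usage unchanged, so a later, smaller request from an authorized user can still be admitted.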
  • While FIG. 1 shows an example of a supervisor cluster 101, the techniques described herein do not require a supervisor cluster 101. In some embodiments, host cluster 118 is not enabled as a supervisor cluster 101. In such case, supervisor Kubernetes master 104, Kubernetes client 102, pod VMs 130, supervisor cluster service 109, and image registry 190 can be omitted. While host cluster 118 is shown as being enabled as a transport node cluster 103, in other embodiments network manager 112 can be omitted. In such case, virtualization management server 116 functions to configure SD network layer 175.
  • FIG. 2 is a block diagram depicting software platform 124 according to an embodiment. As described above, software platform 124 of host 120 includes hypervisor 150 that supports execution of VMs, such as pod VMs 130, native VMs 140, and support VMs 145. In an embodiment, hypervisor 150 includes a VM management daemon 211, a host daemon 214, a pod VM controller 216, a native VM controller 217, an image service 218, and network agents 222. VM management daemon 211 is an agent 152 installed by virtualization management server 116. VM management daemon 211 provides an interface to host daemon 214 for virtualization management server 116. Host daemon 214 is configured to create, configure, and remove VMs (e.g., pod VMs 130 and native VMs 140).
  • Pod VM controller 216 is an agent 152 of orchestration control plane 115 for supervisor cluster 101 and allows supervisor Kubernetes master 104 to interact with hypervisor 150. Pod VM controller 216 configures the respective host as a node in supervisor cluster 101. Pod VM controller 216 manages the lifecycle of pod VMs 130, such as determining when to spin-up or delete a pod VM. Pod VM controller 216 also ensures that any pod dependencies, such as container images, networks, and volumes are available and correctly configured. Pod VM controller 216 is omitted if host cluster 118 is not enabled as a supervisor cluster 101. Native VM controller 217 is an agent 152 of orchestration control plane 115 for supervisor cluster 101 and allows supervisor Kubernetes master 104 to interact with hypervisor 150 to manage lifecycles of native VMs 140 and applications executing therein. While shown separately from pod VM controller 216, in some embodiments both pod VM controller 216 and native VM controller 217 can be functions of a single controller.
  • Image service 218 is configured to pull container images from image registry 190 and store them in shared storage 170 such that the container images can be mounted by pod VMs 130. Image service 218 is also responsible for managing the storage available for container images within shared storage 170. This includes managing authentication with image registry 190, assuring provenance of container images by verifying signatures, updating container images when necessary, and garbage collecting unused container images. Image service 218 communicates with pod VM controller 216 during spin-up and configuration of pod VMs 130. In some embodiments, image service 218 is part of pod VM controller 216. In embodiments, image service 218 utilizes system VMs 130/140 in support VMs 145 to fetch images, convert images to container image virtual disks, and cache container image virtual disks in shared storage 170.
  • Network agents 222 comprise agents 152 installed by network manager 112. Network agents 222 are configured to cooperate with network manager 112 to implement logical network services. Network agents 222 configure the respective host as a transport node in a cluster 103 of transport nodes.
  • Each pod VM 130 has one or more containers 206 running therein in an execution space managed by container engine 208. The lifecycle of containers 206 is managed by pod VM agent 212. Both container engine 208 and pod VM agent 212 execute on top of a kernel 210 (e.g., a Linux® kernel). Each native VM 140 has applications 202 running therein on top of an OS 204. Native VMs 140 do not include pod VM agents and are isolated from pod VM controller 216. Rather, native VMs 140 include management agents 213 that communicate with native VM controller 217. Container engine 208 can be an industry-standard container engine, such as libcontainer, runc, or containerd. Pod VMs 130, pod VM controller 216, native VM controller 217, and image service 218 are omitted if host cluster 118 is not enabled as a supervisor cluster 101.
  • FIG. 3 is a block diagram of supervisor Kubernetes master 104 according to an embodiment. Supervisor Kubernetes master 104 includes application programming interface (API) server 302, a state database 303, a scheduler 304, a scheduler extender 306, controllers 308, and plugins 319. API server 302 includes the Kubernetes API server, kube-api-server (“Kubernetes API 326”) and custom APIs 305. Custom APIs 305 are API extensions of Kubernetes API 326 using either the custom resource/operator extension pattern or the API extension server pattern. Custom APIs 305 are used to create and manage custom resources, such as VM objects. API server 302 provides a declarative schema for creating, updating, deleting, and viewing objects.
  • State database 303 stores the state of supervisor cluster 101 (e.g., etcd) as objects created by API server 302. A user can provide application specification data to API server 302 that defines various objects supported by the API (e.g., as a YAML document). The objects have specifications that represent the desired state. State database 303 stores the objects defined by application specification data as part of the supervisor cluster state. Standard Kubernetes objects (“Kubernetes objects 310”) include namespaces, nodes, pods, config maps, secrets, among others. Custom objects are resources defined through custom APIs 305 (e.g., VM objects 307).
  • Namespaces provide scope for objects. Namespaces are objects themselves maintained in state database 303. A namespace can include resource quotas, limit ranges, role bindings, and the like that are applied to objects declared within its scope. VI control plane 113 creates and manages supervisor namespaces for supervisor cluster 101. A supervisor namespace is a resource-constrained and authorization-constrained unit of multi-tenancy managed by virtualization management server 116. Namespaces inherit constraints from corresponding supervisor cluster namespaces. Config maps include configuration information for applications managed by supervisor Kubernetes master 104. Secrets include sensitive information for use by applications managed by supervisor Kubernetes master 104 (e.g., passwords, keys, tokens, etc.). The configuration information and the secret information stored by config maps and secrets is generally referred to herein as decoupled information. Decoupled information is information needed by the managed applications, but which is decoupled from the application code.
  • Controllers 308 can include, for example, standard Kubernetes controllers (“Kubernetes controllers 316”) (e.g., kube-controller-manager controllers, cloud-controller-manager controllers, etc.) and custom controllers 318. Custom controllers 318 include controllers for managing lifecycle of Kubernetes objects 310 and custom objects. For example, custom controllers 318 can include a VM controller 328 configured to manage VM objects 307 and a pod VM lifecycle controller (PLC) 330 configured to manage pods 324. A controller 308 tracks objects in state database 303 of at least one resource type. Controller(s) 318 are responsible for making the current state of supervisor cluster 101 come closer to the desired state as stored in state database 303. A controller 318 can carry out action(s) by itself, send messages to API server 302 to have side effects, and/or interact with external systems.
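  • The idea of a controller moving the current state closer to the desired state can be sketched as a single reconciliation pass. The following Python function is a minimal illustration under stated assumptions: the function name, the dictionary layout (object name mapped to specification), and the action tuples are hypothetical, and a real controller would carry out the actions rather than merely list them.

```python
def reconcile(desired: dict, current: dict) -> list:
    """Return the actions that would move `current` toward `desired`.

    Both arguments map an object name to its specification.
    """
    actions = []
    for name, spec in desired.items():
        if name not in current:
            # Declared but not yet realized.
            actions.append(("create", name))
        elif current[name] != spec:
            # Realized, but drifted from the declared specification.
            actions.append(("update", name))
    for name in current:
        if name not in desired:
            # Realized, but no longer declared.
            actions.append(("delete", name))
    return actions
```

  • Running such a pass repeatedly, each time against the latest observed state, is what makes the approach declarative: the user states only the desired objects, and the loop derives the create, update, and delete operations.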
  • Plugins 319 can include, for example, network plugin 312 and storage plugin 314. Plugins 319 provide a well-defined interface to replace a set of functionality of the Kubernetes control plane. Network plugin 312 is responsible for configuration of SD network layer 175 to deploy and configure the cluster network. Network plugin 312 cooperates with virtualization management server 116 and/or network manager 112 to deploy logical network services of the cluster network. Network plugin 312 also monitors state database for custom objects 307, such as NIP objects. Storage plugin 314 is responsible for providing a standardized interface for persistent storage lifecycle and management to satisfy the needs of resources requiring persistent storage. Storage plugin 314 cooperates with virtualization management server 116 and/or persistent storage manager 110 to implement the appropriate persistent storage volumes in shared storage 170.
  • Scheduler 304 watches state database 303 for newly created pods with no assigned node. A pod is an object supported by API server 302 that is a group of one or more containers, with network and storage, and a specification on how to execute. Scheduler 304 selects candidate nodes in supervisor cluster 101 for pods. Scheduler 304 cooperates with scheduler extender 306, which interfaces with virtualization management server 116. Scheduler extender 306 cooperates with virtualization management server 116 (e.g., with DRS) to select nodes from candidate sets of nodes and provide identities of hosts 120 corresponding to the selected nodes. For each pod, scheduler 304 also converts the pod specification to a pod VM specification, and scheduler extender 306 asks virtualization management server 116 to reserve a pod VM on the selected host 120. Scheduler 304 updates pods in state database 303 with host identifiers.
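  • The two-stage selection described above (the scheduler filters candidate nodes, then the scheduler extender and the management server choose among them) can be sketched as follows. This is a simplified illustration only: the function names, the dictionary fields, and the "most free memory" placement policy are assumptions made for the example and do not describe how DRS actually places VMs.

```python
def filter_candidates(nodes: list, pod: dict) -> list:
    """Scheduler stage: keep nodes whose labels satisfy the pod's node selector."""
    selector = pod.get("node_selector", {})
    return [
        node for node in nodes
        if all(node["labels"].get(key) == value for key, value in selector.items())
    ]


def pick_node(candidates: list, pod: dict):
    """Extender stage: choose zero or one node from the candidates.

    Assumed policy: among nodes with enough free memory, take the one
    with the most free memory.
    """
    fitting = [n for n in candidates if n["free_memory_mb"] >= pod["memory_mb"]]
    if not fitting:
        return None
    return max(fitting, key=lambda n: n["free_memory_mb"])


def schedule(nodes: list, pod: dict):
    """Run both stages and record the chosen host identifier on the pod."""
    chosen = pick_node(filter_candidates(nodes, pod), pod)
    if chosen is None:
        return None
    pod["host"] = chosen["host"]
    return chosen["host"]
```

  • Returning None when no candidate fits mirrors the "zero or one node" outcome: the pod simply stays unassigned until a later scheduling attempt.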
  • Kubernetes API 326, state database 303, scheduler 304, and Kubernetes controllers 316 comprise standard components of a Kubernetes system executing on supervisor cluster 101. Custom controllers 318, plugins 319, and scheduler extender 306 comprise custom components of orchestration control plane 115 that integrate the Kubernetes system with host cluster 118 and VI control plane 113.
  • In embodiments, custom APIs 305 enable developers to discover available content and to import existing VMs as new images within their Kubernetes Namespace. In embodiments, VM objects 307 that can be specified through custom APIs 305 include VM resources 320, VM image resources 322, VM profile resources 324, network policy resources 325, network resources 327, and service resources 329.
  • VM image resource 322 enables discovery of available images for consumption via custom APIs 305. VM image resource 322 exposes verbs such as image listing, filtering, and import so that the developer can manage the lifecycle and consumption of images. A single VM image resource 322 describes a reference to an existing VM template image in a repository.
  • VM profile resource 324 is a resource that describes a curated set of VM attributes that can be used to instantiate native VMs. VM profile resource 324 gives the VI Admin control over the configuration and policy of the native VMs that are available to the developer. The VI Admin can define a set of VM profile resources 324 available in each namespace. The VI Admin can create new profiles to balance the requirements of the VI Admin, the developer and those imposed by the underlying hardware. VM profile resource 324 enables definition of classes of information such as virtual CPU and memory capacity exposed to the native VM, resource, availability and compute policy for the native VM, and special hardware resources (e.g. FPGA, pmem, vGPU, etc.) available to the VM profile.
  • Unlike pod VMs, native VMs have their own network requirements. Multipath NICs, public/private NICs and legacy application requirements can drive the need for support of multiple vNICs, each of which may have custom network configuration. As an example, a clustered SQL Server expects to have at least two vNICs on separate networks: one public and one private. The private vNIC is used for IPC and heartbeat traffic with its peers. As a consequence of this need for flexibility, network policy resource 325 allows the VI admin to define the set of available networks for native VMs.
  • Network resource 327 represents a single network to be consumed by a native VM. In embodiments, network resource 327 is a simple resource, abstracting the details of an underlying virtual port group that the network represents. For example, network resource 327 may be one of the following types: standard port group, distributed port group, or tier-1 logical router in SD network layer 175, and the like. The available networks are configured by the VI Admin for each namespace via a network policy resource 325. Network resources 327 are used to attach additional network interfaces to a specific virtual network.
  • Service resource 329 binds native VM instances to Kubernetes services in order to expose a network service from the native VM to pods and other native VMs. In embodiments, service resource 329 includes a label selector that is used to match any labels applied to any VM resources 320. Once a service resource 329 and a VM resource 320 have been coupled, a delegate service and endpoints resource is installed in order to enable network access to the native VM via the service DNS name or IP address.
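  • Matching a service's selector against the labels on VM resources can be illustrated with a short Python sketch. It assumes equality-based matching only (every selector key must be present on the VM with the same value); the function names and dictionary layout are hypothetical. Note that under this sketch an empty selector matches every VM, which may differ from how a given orchestrator treats empty selectors.

```python
def matches(selector: dict, labels: dict) -> bool:
    """True when every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_vms(selector: dict, vms: list) -> list:
    """Return the names of VM resources whose labels satisfy the selector."""
    return [vm["name"] for vm in vms if matches(selector, vm.get("labels", {}))]
```

  • The names returned by `select_vms` correspond to the VM resources that would be coupled with the service, and for which endpoints would then be created.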
  • A VM resource 320 combines all of the above resources to generate a desired native VM. In embodiments, a VM resource 320 specifies a VM image resource 322 to use as the master image. Optionally, a developer can override additional attributes of the cloned image. In embodiments, a developer can override image attributes by specifying a VM profile resource 324. In other embodiments, a developer can override image attributes by explicit specification of a desired attribute to override. VM resources 320 specify a configuration that is mapped to underlying infrastructure features by VM controllers 328, including but not limited to: VM Name, Virtual Resource Capacity, Network to Virtual NIC binding, DNS Configuration, Volume Customization, VM Customization scripts and VM Placement and Affinity policy.
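  • One way to picture how image attributes, a VM profile, and explicit overrides could combine into a final VM configuration is a layered merge. The Python sketch below is an assumption made for illustration: the text describes profile-based and explicit overrides as alternative embodiments, and the precedence used here (explicit overrides over profile over image) as well as the function and attribute names are hypothetical.

```python
def resolve_vm_config(image_attrs: dict, profile_attrs=None, overrides=None) -> dict:
    """Merge attribute layers; later layers win (assumed precedence)."""
    config = dict(image_attrs)          # start from the cloned image's attributes
    config.update(profile_attrs or {})  # a VM profile may override image attributes
    config.update(overrides or {})      # explicit attributes override both
    return config
```

  • Because the function copies the image attributes before merging, the image definition itself is left unmodified and can be reused for other VM resources.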
  • FIG. 4 is a block diagram depicting a logical view of a virtualized computing system according to an embodiment. Supervisor cluster 101 is implemented by a software-defined data center (SDDC) 402. SDDC 402 includes virtualized computing system 100 shown in FIG. 1 , including host cluster 118, virtualization management server 116, network manager 112, shared storage 170, and SD network layer 175. SDDC 402 includes VI control plane 113 for managing a virtualization layer of host cluster 118, along with shared storage 170 and SD network layer 175. A VI admin interacts with VM management server 116 (and optionally network manager 112) of VI control plane 113 to configure SDDC 402 to implement supervisor cluster 101.
  • Supervisor cluster 101 includes orchestration control plane 115, which includes supervisor Kubernetes master(s) 104. The VI admin interacts with VM management server 116 to create supervisor namespaces including supervisor namespace 412. Each supervisor namespace includes a resource pool and authorization constraints. The resource pool includes various resource constraints on the supervisor namespace (e.g., reservation, limits, and share (RLS) constraints). Authorization constraints provide for which roles are permitted to perform which operations in the supervisor namespace (e.g., allowing VI admin to create, manage access, allocate resources, view, and create objects; allowing DevOps to view and create objects; etc.). A user interacts with supervisor Kubernetes master 104 to deploy applications on supervisor cluster 101 within scopes of supervisor namespaces. In the example, the user deploys containerized applications 428 on pod VMs 130 and non-containerized applications 429 on native VMs 140. Non-containerized applications 429 execute on a guest operating system in a native VM 140 exclusive of any container engine.
  • Kubernetes allows passing of configuration and secret information to containerized applications 428. However, standard Kubernetes does not extend this functionality beyond pod-based workloads (i.e., containerized applications executing in pods). Embodiments described herein extend this functionality for applications executing in native VMs (e.g., non-containerized applications 429). In embodiments, supervisor Kubernetes master 104 manages lifecycle of decoupled information 403 (e.g., config maps and secrets) for non-containerized applications 429. That is, supervisor Kubernetes master 104 performs create, read, update, and delete operations on objects that include decoupled information 403. Supervisor Kubernetes master 104 provides decoupled information 403 to native VM controller 217 upon deployment of non-containerized applications 429 to native VMs 140. Native VM controller 217 cooperates with management agent 213 executing in each native VM 140 to provide decoupled information 403 for use by non-containerized applications 429. Management agent 213 in each native VM 140 exposes decoupled information 403 for access by non-containerized applications 429. In embodiments, management agent 213 creates environment variables accessible by non-containerized applications 429. In embodiments, management agent 213 creates files in a filesystem accessible by native VMs 140, which in turn can be read by non-containerized applications 429. In some embodiments, the files can be resident in system memory (e.g., RAM). Supervisor Kubernetes master 104 can provide updates to decoupled information 403 to native VM controller 217, which in turn provides the updates to management agent 213 for use by non-containerized applications 429.
  • When specifying a non-containerized application at supervisor Kubernetes master 104, the user can specify the decoupled information 403 upon which the application relies and how to consume the decoupled information (e.g., as environment variables, as files, etc.). Supervisor Kubernetes master 104 schedules the non-containerized application to run in a VM object implemented by a native VM 140. Upon deployment of native VM 140, management agent 213 establishes a connection with native VM controller 217 using a hypervisor-guest channel (e.g., a virtual socket connection). In embodiments, management agent 213 communicates with native VM controller 217 over the hypervisor-guest channel using a remote procedure call (RPC) protocol. Management agent 213 sets up decoupled information 403 as specified for each non-containerized application 429 (e.g., environment variables, files, etc.). Management agent 213 updates decoupled information 403 exposed to non-containerized applications 429 as updates are received from supervisor Kubernetes master 104 through native VM controller 217.
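  • The two consumption modes described for decoupled information (environment variables and files) can be sketched in Python as follows. This is only an illustration of what an in-guest agent might do once it has received the key/value data; the function names are hypothetical, the hypervisor-guest channel and RPC protocol are not modeled, and restricting file permissions to the owner is an added assumption rather than something stated in the text.

```python
import os
import tempfile


def expose_as_env(info: dict, environ: dict) -> None:
    """Publish each key/value pair into the given environment mapping."""
    for key, value in info.items():
        environ[key] = value


def expose_as_files(info: dict, directory: str) -> list:
    """Write one file per key into `directory` and return the file paths."""
    paths = []
    for key, value in info.items():
        path = os.path.join(directory, key)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(value)
        # Assumption: secrets should be readable by the owner only.
        os.chmod(path, 0o600)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Build the environment for a child application without touching our own.
    child_env = dict(os.environ)
    expose_as_env({"DB_HOST": "db.internal"}, child_env)
    with tempfile.TemporaryDirectory() as directory:
        written = expose_as_files({"db-password": "example"}, directory)
        print([os.path.basename(p) for p in written])
```

  • Applying an update received later amounts to calling the same functions again with the new values, which overwrites the previous variables or file contents.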
  • FIG. 5 is a flow diagram depicting a method 500 of application orchestration in a virtualized computing system according to an embodiment. Method 500 can be performed by software in supervisor cluster 101 executing on CPU, memory, storage, and network resources managed by virtualization layer(s) (e.g., hypervisor(s)) or a host operating system(s). Method 500 can be understood with reference to FIGS. 3-4 .
  • Method 500 begins at step 502, where supervisor Kubernetes master 104 receives a specification for an application to be deployed using a native VM. The specification is defined using custom APIs 305. For example, at step 504, a user can specify a VM image resource 322. At step 506, a user can specify a VM profile 324. At step 508, a user can specify a network policy 325 and/or network resources 327. At step 510, a user can optionally specify a VM service 329. A user can tie all of these objects together by specifying a VM resource 320.
  • At step 512, VM controller 328, which is part of orchestration control plane 115, cooperates with virtualization management server 116, which is part of VI control plane 113, to select a host 120 for deploying a native VM. Thus, the user specifies the native VM to orchestration control plane 115, which in turn cooperates with VI control plane 113 to deploy the native VM. At step 514, VM controller 328 cooperates with virtualization management server 116 to deploy a native VM 140 as specified to the selected host. Native VM 140 is deployed alongside any pod VMs 130 executing in the selected host and managed by orchestration control plane 115. Thus, orchestration control plane 115 controls deployment of both native VM 140 and pod VM(s) 130. For example, at step 516, VM controller 328 and virtualization management server 116 clone a VM from a selected VM image after resource creation on the selected host. At step 518, VM controller 328 and virtualization management server 116 apply policies (e.g., VM profile(s), network policy, etc.) to the native VM. At step 520, VM controller 328 and virtualization management server 116 start native VM 140 on the selected host as configured.
  • At step 522, management agent 213 receives config map/secrets from supervisor Kubernetes master 104 through native VM controller 217. Management agent 213 exposes the configuration/secret information in the config maps/secrets to the application as specified by the user. At step 524, VM controller 328 and virtualization management server 116 cooperate to power down and delete the native VM upon deletion of VM resource 320.
  • FIG. 6 is a flow diagram depicting a method 600 of application orchestration in a virtualized computing system according to another embodiment. As shown in FIG. 6 , method 600 can be performed by VI control plane 113 and orchestration control plane 115, which comprise software executing on CPU, memory, storage, and network resources managed by a virtualization layer (e.g., a hypervisor) and/or host operating system. Method 600 begins at step 602, where a user provides a pod specification to API server 302 to create a new pod. At step 604, scheduler 304 selects candidate nodes for deployment of the pod. Scheduler 304 selects the candidate nodes by filtering on affinity, node selector constraints, etc. At step 606, scheduler extender 306 cooperates with VI services 108 in VM management server 116 to select a node from the set of candidate nodes. VI services 108 selects zero or one node from the list of a plurality of candidate nodes provided by scheduler extender 306.
  • At step 608, scheduler 304 converts the pod specification to a VM specification for a pod VM 130. For example, scheduler 304 converts CPU and memory requests and limits from pod specification to VM specification with fallback to reasonable defaults. The VM specification includes a vNIC device attached to the logical network used by pod VMs 130. The guest OS in VM specification is specified to be kernel 210 with container engine 208. Storage is an ephemeral virtual disk.
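  • The conversion of a pod specification into a pod VM specification, including the fallback to defaults, can be sketched as below. The function name, field names, default values, and the choice to sum per-container requests are all assumptions made for illustration; the text states only that CPU and memory values are converted with fallback to reasonable defaults, that a vNIC is attached to the pod network, that the guest is the kernel with a container engine, and that storage is an ephemeral virtual disk.

```python
# Assumed defaults, used when the pod specification requests nothing.
DEFAULT_CPU_MILLICORES = 1000
DEFAULT_MEMORY_MB = 512


def pod_to_vm_spec(pod_spec: dict) -> dict:
    """Derive a pod VM specification from a (simplified) pod specification."""
    containers = pod_spec.get("containers", [])
    # Assumption: the VM is sized to the sum of its containers' requests.
    cpu = sum(c.get("cpu_millicores", 0) for c in containers)
    memory = sum(c.get("memory_mb", 0) for c in containers)
    return {
        "name": pod_spec["name"],
        "cpu_millicores": cpu or DEFAULT_CPU_MILLICORES,
        "memory_mb": memory or DEFAULT_MEMORY_MB,
        "guest": "kernel+container-engine",
        "nics": [{"network": "pod-network"}],
        "disks": [{"type": "ephemeral"}],
    }
```

  • The CPU and memory fallbacks are applied independently, so a pod that requests CPU but no memory still receives the default memory size.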
  • At step 610, PLC 324 invokes VM management server 116 to deploy pod VM 130 to a host 120 corresponding to the selected node. At step 612, VM management server 116 cooperates with host daemon 214 in host 120 corresponding to the selected node to create and power-on pod VM 130.
  • FIG. 7 is a flow diagram depicting a method 700 of application orchestration in a virtualized computing system according to another embodiment. Method 700 can be performed by VI control plane 113 and orchestration control plane 115, which comprise software executing on CPU, memory, storage, and network resources managed by a virtualization layer (e.g., a hypervisor) and/or host operating system. Method 700 begins at step 702, where a user provides specification(s) to API server 302 in orchestration control plane 115 to create new pod VM(s) 130 and new native VM(s) 140. At step 704, orchestration control plane 115 executes deployment of each native VM 140 as described above with respect to FIG. 5. At step 706, orchestration control plane 115 executes deployment of each pod VM 130 as described above with respect to FIG. 6. In this manner, one or more hosts 120 in host cluster 118 execute native VMs 140 alongside pod VMs 130, all of which are deployed and managed by orchestration control plane 115 in cooperation with VI control plane 113.
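By way of illustration only, method 700 can be viewed as a dispatch over submitted specifications, in which each specification is routed to the native VM path (FIG. 5) or the pod VM path (FIG. 6). The `kind` values, the handler functions, and the returned strings are hypothetical placeholders for this sketch and are not names used by the described embodiments.

```python
from typing import Callable, Dict, List


def deploy_native_vm(spec: dict) -> str:
    """Placeholder for the FIG. 5 path (VM controller 328)."""
    return f"native-vm/{spec['name']}"


def deploy_pod_vm(spec: dict) -> str:
    """Placeholder for the FIG. 6 path (scheduler 304 and PLC 324)."""
    return f"pod-vm/{spec['name']}"


# Hypothetical mapping from a specification's kind to its deployment path.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "VirtualMachine": deploy_native_vm,
    "Pod": deploy_pod_vm,
}


def apply_specs(specs: List[dict]) -> List[str]:
    """Steps 702-706: route every submitted specification to its handler."""
    deployed = []
    for spec in specs:
        handler = HANDLERS.get(spec["kind"])
        if handler is None:
            raise ValueError(f"unsupported kind: {spec['kind']}")
        deployed.append(handler(spec))
    return deployed
```

A single entry point that accepts both kinds reflects the point of the method: one orchestration control plane receives both specifications and drives both deployment paths.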
  • One or more embodiments of the invention also relate to a device or an apparatus for performing these operations. The apparatus may be specially constructed for required purposes, or the apparatus may be a general-purpose computer selectively activated or configured by a computer program stored in the computer. Various general-purpose machines may be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.
  • The embodiments described herein may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, etc.
  • One or more embodiments of the present invention may be implemented as one or more computer programs or as one or more computer program modules embodied in computer readable media. The term computer readable medium refers to any data storage device that can store data which can thereafter be input to a computer system. Computer readable media may be based on any existing or subsequently developed technology that embodies computer programs in a manner that enables a computer to read the programs. Examples of computer readable media are hard drives, NAS systems, read-only memory (ROM), RAM, compact disks (CDs), digital versatile disks (DVDs), magnetic tapes, and other optical and non-optical data storage devices. A computer readable medium can also be distributed over a network-coupled computer system so that the computer readable code is stored and executed in a distributed fashion.
  • Although one or more embodiments of the present invention have been described in some detail for clarity of understanding, certain changes may be made within the scope of the claims. Accordingly, the described embodiments are to be considered as illustrative and not restrictive, and the scope of the claims is not to be limited to details given herein but may be modified within the scope and equivalents of the claims. In the claims, elements and/or steps do not imply any particular order of operation unless explicitly stated in the claims.
  • Virtualization systems in accordance with the various embodiments may be implemented as hosted embodiments, non-hosted embodiments, or as embodiments that blur distinctions between the two. Furthermore, various virtualization operations may be wholly or partially implemented in hardware. For example, a hardware implementation may employ a look-up table for modification of storage access requests to secure non-disk data.
  • Many variations, additions, and improvements are possible, regardless of the degree of virtualization. The virtualization software can therefore include components of a host, console, or guest OS that perform virtualization functions.
  • Plural instances may be provided for components, operations, or structures described herein as a single instance. Boundaries between components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the invention. In general, structures and functionalities presented as separate components in exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionalities presented as a single component may be implemented as separate components. These and other variations, additions, and improvements may fall within the scope of the appended claims.

Claims (20)

What is claimed is:
1. A virtualized computing system, comprising:
a host cluster having a virtualization layer executing on hardware platforms of hosts, the virtualization layer supporting execution of virtual machines (VMs), the VMs including first VMs and second VMs, the first VMs including container engines supporting execution of containers in the first VMs, the second VMs including non-virtualized guest operating systems;
a virtualization management server configured to manage the virtualization layer and the host cluster; and
an orchestration control plane integrated with the virtualization layer, the orchestration control plane including: a lifecycle controller configured to cooperate with the virtualization layer to manage lifecycles of the first VMs, and a VM controller configured to cooperate with the virtualization management server to manage lifecycles of the second VMs.
2. The virtualized computing system of claim 1, wherein the second VMs execute non-containerized applications.
3. The virtualized computing system of claim 1, wherein the orchestration control plane includes custom APIs to manage objects monitored by the VM controller.
4. The virtualized computing system of claim 3, wherein the objects include VM objects for the second VMs and VM image objects for VM images of guest software executing in the second VMs, the guest software including the non-virtualized guest operating system.
5. The virtualized computing system of claim 3, wherein the objects include VM service objects for exposing network services of the second VMs.
6. The virtualized computing system of claim 3, wherein the objects include virtual network resource objects for representing networks consumed by the second VMs.
7. The virtualized computing system of claim 1, wherein the VM controller is configured to communicate with a controller in the virtualization layer to provide decoupled information to the second VMs.
8. A method of application orchestration in a virtualized computing system including a host cluster having a virtualization layer directly executing on hardware platforms of hosts and a virtualization management server configured to manage the virtualization layer and the hosts, the virtualization layer supporting execution of virtual machines (VMs), the virtualization layer integrated with an orchestration control plane, the method comprising:
receiving, at the orchestration control plane, specification data for a first application and a second application;
deploying, by a lifecycle controller executing in the orchestration control plane, the first application to a first VM in a host of the host cluster based on the specification data, the first VM including a container engine supporting execution of containers in the pod VM; and
deploying, by a VM controller executing in the orchestration control plane and in cooperation with the virtualization management server, the second application to a second VM in the host, the second VM executing on the virtualization layer in parallel with the first VM.
9. The method of claim 8, wherein the specification data specifies a VM resource referencing a VM image resource for a VM image of guest software executing in the second VM.
10. The method of claim 8, wherein the specification data specifies a VM resource referencing a VM profile resource having attributes of the second VM.
11. The method of claim 8, wherein the specification data specifies a VM resource referencing a network resource for a virtual network connected to the second VM.
12. The method of claim 8, wherein the step of deploying comprises:
cloning the second VM from a VM image referenced in the specification data;
applying policies to the second VM based on the specification data; and
starting the second VM on a selected host of the host cluster.
13. The method of claim 8, further comprising:
receiving decoupled information at a management agent in the virtualization layer from the orchestration control plane through the VM controller; and
providing the decoupled information for consumption by the second application executing in the second VM, the decoupled information including at least one of configuration information and secret information.
14. The method of claim 8, wherein the second application in the second VM is non-containerized.
15. A non-transitory computer readable medium comprising instructions to be executed in a computing device to cause the computing device to carry out a method of application orchestration in a virtualized computing system including a host cluster having a virtualization layer directly executing on hardware platforms of hosts and a virtualization management server configured to manage the virtualization layer and the hosts, the virtualization layer supporting execution of virtual machines (VMs), the virtualization layer integrated with an orchestration control plane, the method comprising:
receiving, at the orchestration control plane, specification data for a first application and a second application;
deploying, by a lifecycle controller (PLC), the first application to a first VM in a host of the host cluster based on the specification data, the first VM including a container engine supporting execution of containers in the first VM; and
deploying, by a VM controller executing in the orchestration control plane and in cooperation with the virtualization management server, the second application to a second VM in the host, the second VM executing on the virtualization layer in parallel with the first VM.
16. The non-transitory computer readable medium of claim 15, wherein the specification data specifies a VM resource referencing a VM image resource for a VM image of guest software executing in the second VM.
17. The non-transitory computer readable medium of claim 15, wherein the specification data specifies a VM resource referencing a VM profile resource having attributes of the second VM.
18. The non-transitory computer readable medium of claim 15, wherein the specification data specifies a VM resource referencing a network resource for a virtual network connected to the second VM.
19. The non-transitory computer readable medium of claim 15, wherein the step of deploying comprises:
cloning the second VM from a VM image referenced in the specification data;
applying policies to the second VM based on the specification data; and
starting the second VM on a selected host of the host cluster.
20. The non-transitory computer readable medium of claim 15, wherein the second application in the second VM is non-containerized.
US18/334,592 2021-01-20 2023-06-14 Declarative vm management for a container orchestrator in a virtualized computing system Pending US20230342168A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/334,592 US20230342168A1 (en) 2021-01-20 2023-06-14 Declarative vm management for a container orchestrator in a virtualized computing system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17/153,296 US11720382B2 (en) 2021-01-20 2021-01-20 Declarative VM management for a container orchestrator in a virtualized computing system
US18/334,592 US20230342168A1 (en) 2021-01-20 2023-06-14 Declarative vm management for a container orchestrator in a virtualized computing system

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US17/153,296 Continuation US11720382B2 (en) 2021-01-20 2021-01-20 Declarative VM management for a container orchestrator in a virtualized computing system

Publications (1)

Publication Number Publication Date
US20230342168A1 (en) 2023-10-26

Family

ID=82405129

Family Applications (2)

Application Number Title Priority Date Filing Date
US17/153,296 Active 2041-05-30 US11720382B2 (en) 2021-01-20 2021-01-20 Declarative VM management for a container orchestrator in a virtualized computing system
US18/334,592 Pending US20230342168A1 (en) 2021-01-20 2023-06-14 Declarative vm management for a container orchestrator in a virtualized computing system

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US17/153,296 Active 2041-05-30 US11720382B2 (en) 2021-01-20 2021-01-20 Declarative VM management for a container orchestrator in a virtualized computing system

Country Status (1)

Country Link
US (2) US11720382B2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11762681B2 (en) 2021-03-02 2023-09-19 Vmware, Inc. Dynamic configuration of virtual objects
US11870642B2 (en) * 2021-10-04 2024-01-09 Juniper Networks, Inc. Network policy generation for continuous deployment

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
KR20160138448A (en) * 2014-03-08 2016-12-05 다이아만티 인코포레이티드 Methods and systems for converged networking and storage
US11637918B2 (en) * 2017-11-16 2023-04-25 Intel Corporation Self-descriptive orchestratable modules in software-defined industrial systems
US20190373052A1 (en) * 2018-05-30 2019-12-05 Tigera, Inc. Aggregation of scalable network flow events
CN114880078A (en) * 2018-06-05 2022-08-09 华为技术有限公司 Method and device for managing container service
US10754704B2 (en) * 2018-07-11 2020-08-25 International Business Machines Corporation Cluster load balancing based on assessment of future loading
US20200327615A1 (en) * 2019-04-10 2020-10-15 Ss&C Technologies, Inc. Portfolio risk measures aggregation
US11449354B2 (en) * 2020-01-17 2022-09-20 Spectro Cloud, Inc. Apparatus, systems, and methods for composable distributed computing

Also Published As

Publication number Publication date
US20220229678A1 (en) 2022-07-21
US11720382B2 (en) 2023-08-08

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

AS Assignment

Owner name: VMWARE LLC, CALIFORNIA

Free format text: CHANGE OF NAME;ASSIGNOR:VMWARE, INC.;REEL/FRAME:067102/0242

Effective date: 20231121