US12314767B2 - Containerized workload management in container computing environment - Google Patents
- Publication number
- US12314767B2 (application US17/503,469)
- Authority
- US
- United States
- Prior art keywords
- microservice
- parameter
- instance
- containerized workload
- computing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
- G06F—ELECTRIC DIGITAL DATA PROCESSING (G—PHYSICS; G06—COMPUTING OR CALCULATING; COUNTING)
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/505—Allocation of resources to service a request, the resource being a machine (e.g. CPUs, servers, terminals), considering the load
- G06F9/5044—Allocation of resources to service a request, the resource being a machine, considering hardware capabilities
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
- G06F9/5083—Techniques for rebalancing the load in a distributed system
- G06F2209/5022—Workload threshold (indexing scheme relating to G06F9/50)
- G06F2209/503—Resource availability (indexing scheme relating to G06F9/50)
- G06F2209/508—Monitor (indexing scheme relating to G06F9/50)
Definitions
- The field relates generally to information processing systems, and more particularly to containerized workload management in such information processing systems.
- Information processing systems increasingly utilize reconfigurable virtual resources to meet changing user needs in an efficient, flexible and cost-effective manner.
- For example, cloud-based computing and storage systems implemented using virtual resources in the form of containers have been widely adopted.
- Such containers may be used to provide at least a portion of the virtualization infrastructure of a given information processing system.
- However, significant challenges arise in managing container environments.
- Illustrative embodiments provide techniques for managing containerized workloads in a container computing environment.
- In one illustrative embodiment, a method comprises the following steps for a microservice executed by a containerized workload in a container computing environment.
- The method computes a parameter based on a first set of execution conditions for the microservice, wherein the parameter represents a resource utilization value at which at least one additional instance of the containerized workload is created for executing the microservice.
- The method then re-computes the parameter based on a second set of execution conditions for the microservice.
- The method may also compute and/or re-compute the parameter for another microservice, wherein the resource utilization value for the microservice is different than the resource utilization value for the other microservice.
- Advantageously, illustrative embodiments enable, inter alia, dynamic setting of an auto-scaling parameter for individual microservices running in a container computing environment.
- For example, illustrative embodiments provide for calibration of microservices and dynamic determination (computation) of a target resource setting based on a statistical analysis of the actual rate of increase (variation) of load and resource consumption of pods in a production or production-like environment.
- Further, illustrative embodiments provide for re-calibration (re-computation) and resetting of the target resource setting during production with live requests and loads.
- FIG. 1 illustrates a pod-based container environment within which one or more illustrative embodiments can be implemented.
- FIG. 2 illustrates host devices and a storage system within which one or more illustrative embodiments can be implemented.
- FIG. 3 illustrates an exemplary microservice application deployment with which one or more illustrative embodiments can be implemented.
- FIG. 4 illustrates a set of metrics associated with execution of a microservice application with which one or more illustrative embodiments can be implemented.
- FIGS. 5 and 6 respectively illustrate a change in load associated with execution of a microservice application with which one or more illustrative embodiments can be implemented.
- FIGS. 7 - 11 collectively illustrate dynamic management of a target resource setting in a containerized workload-based environment according to one or more illustrative embodiments.
- FIG. 12 illustrates a pod-based microservice environment with functionality for dynamic calibration and re-calibration of an auto-scaling parameter according to an illustrative embodiment.
- FIG. 13 illustrates a methodology for dynamic management of a target resource setting in a containerized workload-based environment according to an illustrative embodiment.
- FIGS. 14 and 15 respectively illustrate examples of processing platforms that may be utilized to implement at least a portion of an information processing system with a pod-based container environment according to one or more illustrative embodiments.
- Illustrative embodiments will be described herein with reference to exemplary information processing systems and associated computers, servers, storage devices and other processing devices. It is to be appreciated, however, that embodiments are not restricted to use with the particular illustrative system and device configurations shown. Accordingly, the term “information processing system” as used herein is intended to be broadly construed, so as to encompass, for example, processing platforms comprising cloud and/or non-cloud computing and storage systems, as well as other types of processing systems comprising various combinations of physical and/or virtual processing resources.
- An information processing system may therefore comprise, by way of example only, at least one data center or other type of cloud-based system that includes one or more clouds hosting tenants that access cloud resources.
- A container may be considered lightweight, stand-alone, executable software code that includes elements needed to run the software code.
- The container structure has many advantages including, but not limited to, isolating the software code from its surroundings, and helping reduce conflicts between different tenants or users running different software code on the same underlying infrastructure.
- The term “user” herein is intended to be broadly construed so as to encompass numerous arrangements of human, hardware, software or firmware entities, as well as combinations of such entities.
- In illustrative embodiments, containers may be implemented using a Kubernetes container orchestration system.
- Kubernetes is an open-source system for automating application deployment, scaling, and management within a container-based information processing system comprised of components referred to as pods, nodes and clusters, as will be further explained below in the context of FIG. 1 .
- Types of containers that may be implemented or otherwise adapted within the Kubernetes system include, but are not limited to, Docker containers or other types of Linux containers (LXCs) or Windows containers.
- Kubernetes has become the prevalent container orchestration system for managing containerized workloads. It is rapidly being adopted by many enterprise-based information technology (IT) organizations to deploy their application programs (applications).
- Such applications may include stateless (or inherently redundant) applications and/or stateful applications.
- Stateful applications may include legacy databases such as Oracle, MySQL, and PostgreSQL, as well as other stateful applications that are not inherently redundant. While the Kubernetes container orchestration system is used to illustrate various embodiments, it is to be understood that alternative container orchestration systems can be utilized.
- The environment may be referred to, more generally, as a pod-based system, a pod-based container system, a pod-based container orchestration system, a pod-based container management system, or the like.
- The containers can be any type of container, e.g., Docker container, etc.
- A pod is typically considered the smallest execution unit in the Kubernetes container orchestration environment.
- A pod encapsulates one or more containers.
- One or more pods are executed on a worker node. Multiple worker nodes form a cluster.
- A Kubernetes cluster is managed by at least one master node.
- A Kubernetes environment may include multiple clusters respectively managed by multiple master nodes.
- Pods typically represent the respective processes running on a cluster.
- A pod may be configured as a single process wherein one or more containers execute one or more functions that operate together to implement the process.
- Pods may each have a unique Internet Protocol (IP) address enabling pods to communicate with one another, and for other system components to communicate with each pod. Still further, pods may each have persistent storage volumes associated therewith.
- Configuration information (configuration objects) indicating how a container executes can be specified for each pod.
- FIG. 1 depicts an example of a pod-based container orchestration environment 100 .
- A plurality of master nodes 110-1, . . . 110-L (herein each individually referred to as master node 110 or collectively as master nodes 110) are respectively operatively coupled to a plurality of clusters 115-1, . . . 115-L (herein each individually referred to as cluster 115 or collectively as clusters 115).
- As mentioned above, each cluster is managed by at least one master node.
- Illustrative embodiments provide for application copy management across multiple clusters (e.g., from one cluster of clusters 115 to another cluster of clusters 115 ), as will be further explained in detail herein.
- Each cluster 115 comprises a plurality of worker nodes 120 - 1 , . . . 120 -M (herein each individually referred to as worker node 120 or collectively as worker nodes 120 ).
- Each worker node 120 comprises a respective pod, i.e., one of a plurality of pods 122 - 1 , . . . 122 -M (herein each individually referred to as pod 122 or collectively as pods 122 ).
- Each pod 122 comprises a set of containers 1, . . . N (each pod may also have a different number of containers).
- Each master node 110 comprises a controller manager 112, a scheduler 114, an application programming interface (API) service 116, and a key-value database 118, as will be further explained.
- Multiple master nodes 110 may share one or more of the same controller manager 112, scheduler 114, API service 116, and key-value database 118.
- Worker nodes 120 of each cluster 115 execute one or more applications associated with pods 122 (containerized workloads).
- Each master node 110 manages the worker nodes 120 , and therefore pods 122 and containers, in its corresponding cluster 115 . More particularly, each master node 110 controls operations in its corresponding cluster 115 utilizing the above-mentioned components, i.e., controller manager 112 , scheduler 114 , API service 116 , and a key-value database 118 .
- Controller manager 112 executes control processes (controllers) that are used to manage operations in cluster 115.
- Scheduler 114 typically schedules pods to run on particular nodes taking into account node resources and application execution requirements such as, but not limited to, deadlines.
- API service 116 exposes the Kubernetes API, which is the front end of the Kubernetes container orchestration system.
- Key-value database 118 typically provides key-value storage for all cluster data including, but not limited to, configuration data objects generated, modified, deleted, and otherwise managed, during the course of system operations.
- Turning now to FIG. 2, an information processing system 200 is depicted within which pod-based container orchestration environment 100 of FIG. 1 can be implemented. More particularly, as shown in FIG. 2, a plurality of host devices 202-1, . . . 202-P (herein each individually referred to as host device 202 or collectively as host devices 202) are operatively coupled to a storage system 204. Each host device 202 hosts a set of nodes 1, . . . Q. Note that while multiple nodes are illustrated on each host device 202, a host device 202 can host a single node, and one or more host devices 202 can host a different number of nodes as compared with one or more other host devices 202.
- Storage system 204 comprises a plurality of storage arrays 205-1, . . . 205-R (herein each individually referred to as storage array 205 or collectively as storage arrays 205), each of which is comprised of a set of storage devices 1, . . . T upon which one or more storage volumes are persisted.
- The storage volumes depicted in the storage devices of each storage array 205 can include any data generated in the information processing system 200 but, more typically, include data generated, manipulated, or otherwise accessed, during the execution of one or more applications in the nodes of host devices 202.
- Any one of nodes 1, . . . Q on a given host device 202 can be a master node 110 or a worker node 120 (FIG. 1).
- In some embodiments, a node can be configured as a master node for one execution environment and as a worker node for another execution environment.
- Further, the components of pod-based container orchestration environment 100 in FIG. 1 can be implemented on one or more of host devices 202, such that data associated with pods 122 (FIG. 1) running on the nodes 1, . . . Q is stored as persistent storage volumes in one or more of the storage devices 1, . . . T of one or more of storage arrays 205.
- Host devices 202 and storage system 204 of information processing system 200 are assumed to be implemented using at least one processing platform comprising one or more processing devices each having a processor coupled to a memory. Such processing devices can illustratively include particular arrangements of compute, storage and network resources. In some alternative embodiments, one or more host devices 202 and storage system 204 can be implemented on respective distinct processing platforms.
- The term “processing platform” as used herein is intended to be broadly construed so as to encompass, by way of illustration and without limitation, multiple sets of processing devices and associated storage systems that are configured to communicate over one or more networks.
- Distributed implementations of information processing system 200 are possible, in which certain components of the system reside in one data center in a first geographic location while other components of the system reside in one or more other data centers in one or more other geographic locations that are potentially remote from the first geographic location.
- Thus, it is possible in some implementations of information processing system 200 for portions or components thereof to reside in different data centers. Numerous other distributed implementations of information processing system 200 are possible. Accordingly, the constituent parts of information processing system 200 can also be implemented in a distributed manner across multiple computing platforms.
- Additional examples of processing platforms utilized to implement containers, container environments and container management systems in illustrative embodiments, such as those depicted in FIGS. 1 and 2, will be described in more detail below in conjunction with additional figures.
- Although FIG. 2 shows an arrangement wherein host devices 202 are coupled to just one storage system 204 and its storage arrays 205, in other embodiments, host devices 202 may be coupled to and configured for operation with storage arrays across multiple storage systems similar to storage system 204.
- Information processing system 200 may be part of a public cloud infrastructure such as, but not limited to, Amazon Web Services (AWS), Google Cloud Platform (GCP), Microsoft Azure, etc.
- The cloud infrastructure may also include one or more private clouds and/or one or more hybrid clouds (e.g., a hybrid cloud is a combination of one or more private clouds and one or more public clouds).
- As mentioned above, a Kubernetes pod may be referred to more generally herein as a containerized workload.
- A containerized workload is an application program configured to provide a microservice.
- A microservice architecture is a software approach wherein a single application is composed of a plurality of loosely-coupled and independently-deployable smaller components or services.
- Container-based microservice architectures have profoundly changed the way development and operations teams test and deploy modern software. Containers help companies modernize by making it easier to scale and deploy applications.
- Kubernetes helps developers and microservice operations teams because it manages the container orchestration well.
- Kubernetes is more than a container orchestrator, as it can be considered an operating system for cloud-native applications in the sense that it is the platform that applications run on (e.g., just as desktop applications run on macOS, Windows, or Linux).
- Tanzu from VMware is a suite of products that helps users run and manage multiple Kubernetes (K8S) clusters across public and private cloud platforms.
- Microservices provide an ideal architecture for continuous delivery.
- Each application may reside in a separate container along with the environment it needs to run. Because of this, each application can be edited in its container without the risk of interfering with any other application.
- However, the microservice architecture introduces new challenges to developers. One of the main challenges microservices introduce is managing a significant number of microservices for an application.
- Several frameworks and platforms have emerged to help manage microservices, including Software-as-a-Service (SaaS) offerings, Pivotal Cloud Foundry (PCF), Azure Kubernetes Service (AKS), and Pivotal Container Service (PKS).
- These frameworks and platforms attempt to address the scalability of microservices. For a given microservice-based application, as the request load increases or decreases, the container environment needs to increase or decrease the number of microservice instances.
- Automatic scaling, or “auto-scaling,” is used to attempt to ensure that an application has a sufficient amount of targeted resource capacity allocated to handle the traffic demand.
- However, existing auto-scaling solutions do not address important scaling issues.
- Auto-scaling is an important concept in cloud automation. Without auto-scaling, resources (e.g., compute, storage, network, etc.) have to be manually provisioned (and later scaled down) every time conditions change. As such, it will be less likely that the container computing environment will operate with optimal resource utilization and cloud spending.
- Kubernetes supports three main auto-scaling mechanisms: the horizontal pod auto-scaler (HPA), the vertical pod auto-scaler (VPA), and the cluster auto-scaler (CA).
- HPA is based on a scale-out concept, allowing administrators to automatically increase or decrease the number of running pods in a cluster as application usage (e.g., requests) changes.
- VPA is based on a scale-up concept, adding more central processing unit (CPU) or memory capacity to a cluster.
- CA is based on a concept of adding or removing clusters in case a cluster itself is overloaded.
- HPA is typically considered a best practice, i.e., to ensure enough resources are allocated for sufficient operation of a microservice within a cluster. Further, in Kubernetes, an administrator can manually specify a fixed targeted utilization parameter with respect to resources to start replication of a microservice instance.
- While illustrative embodiments are described herein in the context of HPA based on CPU utilization, the container workload management techniques described herein are equally applicable to any metric that can be auto-scaled (e.g., memory capacity, network capacity, etc.).
- By way of example, assume a Kubernetes deployment is created for a microservice called “shibi-app” with targetCPUUtilization set to 80%, depicted as 310 in FIG. 3.
- The Kubernetes platform (e.g., master node 110 in FIG. 1 or some other component) will start spinning up (instantiating, creating, etc.) one or more new pods when the first pod reaches 80% CPU utilization (targetCPUUtilization), i.e., when 80% of the CPU capacity is being utilized.
- The 80% threshold applies to all pods, meaning that when all pods exceed 80%, the Kubernetes platform will start spinning up new pods.
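A deployment of this kind is typically paired with a HorizontalPodAutoscaler object. The following is a minimal illustrative sketch, not taken from the patent: the object name and replica bounds are assumptions, and the patent's “targetCPUUtilization” corresponds to the `targetCPUUtilizationPercentage` field of the Kubernetes `autoscaling/v1` API.

```yaml
# Hypothetical HPA for the "shibi-app" deployment; names and replica
# bounds are illustrative.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: shibi-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: shibi-app
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
```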
- FIG. 4 illustrates a set of metrics for this scenario: load versus time 400, average CPU usage percentage 410, and number of pods versus time 420.
- The Kubernetes framework starts with a single pod.
- When the CPU load reaches 80%, the framework spins up a new pod.
- Assume that by the time the new pod is ready, the CPU utilization has reached 85%. Soon after that, the CPU usage per pod goes down, since the load is now shared.
- When the threshold is again exceeded, the framework starts spinning up another pod. Again, assume CPU usage per pod falls. The same cycle repeats until the framework reaches the maximum available CPU capacity. Once the total load decreases, the reverse happens.
- That is, the framework starts releasing (terminating, etc.) pods, e.g., at a total CPU usage of 160% and 80%.
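The spin-up/release cycle described above can be approximated with the well-known HPA scaling rule, desired replicas = ceil(current replicas × current utilization / target utilization). The sketch below is illustrative of that rule, not of the patented method:

```python
import math

def desired_replicas(current_replicas: int, avg_utilization: float,
                     target: float = 80.0) -> int:
    """Approximate the Kubernetes HPA scaling rule:
    desired = ceil(current * avg_utilization / target)."""
    return math.ceil(current_replicas * avg_utilization / target)

# One pod at 85% average CPU with an 80% target scales out to 2 pods.
print(desired_replicas(1, 85.0))
# Two pods back down to 40% average CPU scale in to 1 pod.
print(desired_replicas(2, 40.0))
```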
- FIGS. 5 and 6 further illustrate the above issues in the context of load and with respect to a time for initialization (TI) and a maximum (max) time allowed for initialization (MTAI) of a pod.
- Graph 500 in FIG. 5 shows a microservice scenario where the load increases and the framework auto-scales more instances of the microservice according to a pre-defined static setting of a targeted resource (e.g., 80% of CPU). If the load exceeds 80% CPU, then a new instance spins off.
- Here, the new instance serves successfully because TI is less than or equal to MTAI.
- FIG. 6 shows the contrasting scenario: when the rate of increasing load (dl/dt) is higher, MTAI shrinks while the time for initialization remains comparatively long.
- In that case, the microservice will hit 100% resource utilization before the new pod (instance) initializes and thus fail to serve.
- The time taken to initialize new instances depends on the size of the microservice image and the current resources. Due to each microservice's behavior, the number of requests is different for different microservices, and the time taken to go from 80% to 100% utilization is different for different microservices.
- Some microservices will reach the maximum time allowed faster and some slower (i.e., there is variation in dl/dt). Also, for the same microservice with more than one instance (more pods), the dl/dt will be different. Accordingly, the static rule for the pre-defined auto-scaling setting may result in reaching 100% usage before a new microservice instance spins off, resulting in an out of memory error.
- One remedy is to set the rule to a low value (e.g., 40%). However, in this case, if a new instance spins off at 40% CPU utilization, this may result in underutilization of instances and inefficient resource consumption.
- Thus, the problem can be defined as follows: in the current microservices auto-scaling approach, the scale-out rule is pre-defined and statically set based on a guesstimate.
- As a result, the time for initializing a new instance may be more than the time to reach 100% of resource utilization by a particular microservice. This will lead to errors for microservice clients, which is not acceptable.
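The TI ≤ MTAI condition can be made concrete with a small sketch. Assuming the load grows roughly linearly at a rate dl/dt (percentage points per second), MTAI is simply the headroom above the cut off divided by that rate. The function names and linear-load assumption here are illustrative:

```python
def max_time_allowed_for_init(cutoff_pct: float, rate_pct_per_sec: float) -> float:
    """Time from hitting the cut off percentage to 100% utilization,
    assuming a roughly linear rate of load increase (dl/dt)."""
    return (100.0 - cutoff_pct) / rate_pct_per_sec

def serves_without_error(ti_sec: float, cutoff_pct: float,
                         rate_pct_per_sec: float) -> bool:
    """The new instance serves successfully only if TI <= MTAI."""
    return ti_sec <= max_time_allowed_for_init(cutoff_pct, rate_pct_per_sec)

# Slow-growing load: 20 points of headroom at 0.5 %/s gives 40 s of MTAI,
# so a 30 s initialization succeeds.
print(serves_without_error(30.0, 80.0, 0.5))
# Fast-growing load: the same 30 s initialization misses a 10 s window.
print(serves_without_error(30.0, 80.0, 2.0))
```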
- Illustrative embodiments address this and other issues by enabling dynamic setting of an auto-scaling parameter (e.g., targetCPUUtilization parameter in case of Kubernetes CPU utilization cut off percentage) differently for different microservices based on, for example, the increasing load to peak and production-like resource distribution.
- An illustrative embodiment implements a side car module of a microservice (pod or containerized workload) to monitor and register the rate of increase in the load (dl/dt) and the time taken for initialization of a new pod (new instance) for the microservice (calibration of rate of load and initialization) in a production-like environment.
- From these registered measurements, illustrative embodiments derive the optimal cut off percentage (optimal targeted resource setting or auto-scaling parameter) for different resources.
- Microservices can also be re-calibrated in production, and values can be set in a timely manner for a pre-defined interval.
- Accordingly, the side car module for each microservice is configured to register two types of execution conditions: (i) increase/decrease in load with time; and (ii) time for initializing new pod(s) after the cut off percentage parameter is reached.
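As a rough sketch of the data the side car module might register, the following is illustrative only; the class and field names are hypothetical, not from the patent:

```python
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CalibrationSample:
    timestamp: float          # seconds since epoch
    parallel_requests: int    # current load
    cpu_pct: float            # pod CPU utilization
    memory_pct: float         # pod memory utilization

@dataclass
class SidecarRegister:
    """Records the two execution conditions: (i) load/resource samples
    over time, and (ii) initialization times of newly spun-up pods."""
    samples: list = field(default_factory=list)
    init_times: list = field(default_factory=list)

    def record_sample(self, parallel_requests: int, cpu_pct: float,
                      memory_pct: float,
                      timestamp: Optional[float] = None) -> None:
        ts = time.time() if timestamp is None else timestamp
        self.samples.append(
            CalibrationSample(ts, parallel_requests, cpu_pct, memory_pct))

    def record_init_time(self, seconds: float) -> None:
        self.init_times.append(seconds)

    def load_rate(self) -> float:
        """Approximate dl/dt (CPU percentage points per second) between
        the first and last registered samples."""
        first, last = self.samples[0], self.samples[-1]
        return (last.cpu_pct - first.cpu_pct) / (last.timestamp - first.timestamp)
```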
- Consider again the application “shibi-app.” Assume that the first calibration run starts at request 1 and increases the load by adding parallel requests, capturing the cluster CPU and memory consumption without setting any cut off percentage value. For example, see table 700 in FIG. 7.
- Calibration is re-run up to 120% of the maximum parallel requests expected in production. The result is shown in table 800 of FIG. 8 and graph 900 of FIG. 9, considering that 120% of the maximum parallel requests is 29.
- Based on the previous record in calibration (i.e., 95%), targetCPUUtilization is reset to 95%. Calibration is run again and, if all is satisfactory up to the maximum expected load plus 20%, then the setting is maintained. This is depicted in graph 1000 of FIG. 10.
- Otherwise, the framework reduces the cut off percentage setting, picking the previous value from the first calibration, and then re-runs the calibration.
- The optimal setting is thus obtained for that microservice, which can be kept for production.
- In this example, the optimal setting for the given microservice is determined to be 82%. It is to be appreciated that, in the production environment, calibration can also be performed at run-time, while serving the production load and re-calibrating the setting.
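The reduce-and-re-run loop described above might look like the following sketch. The candidate values and the `run_calibration` callback are illustrative stand-ins for replaying the calibration load against the cluster; none of these names come from the patent:

```python
def calibrate_cutoff(run_calibration, candidates=(95, 90, 85, 82, 80, 75)):
    """Sketch of a calibrate loop: try cut off percentage values from
    highest (most resource-efficient) to lowest, re-running the
    calibration load (up to 120% of the maximum expected parallel
    requests) until a run completes without any pod hitting 100%
    utilization before its replacement instance initializes.

    run_calibration(cutoff_pct) should return True for a clean run."""
    for cutoff in candidates:
        if run_calibration(cutoff):
            return cutoff  # keep the first passing value for production
    raise RuntimeError("no candidate cut off percentage passed calibration")

# Toy stand-in: pretend only settings at or below 82% leave enough headroom.
print(calibrate_cutoff(lambda cutoff: cutoff <= 82))
```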
- FIG. 12 illustrates a pod-based microservice environment 1200 with functionality for dynamic calibration and re-calibration of an auto-scaling parameter (e.g., cut off percentage, targeted resource setting, etc.) according to an illustrative embodiment.
- Pod-based microservice environment 1200 can be implemented as part of pod-based container orchestration environment 100 and/or information processing system 200 respectively of FIGS. 1 and 2.
- A set of one or more pods 1202 is part of an execution environment for a microservice.
- At least one side car module 1204 is associated with the set of one or more pods 1202.
- The side car module 1204 is operatively coupled to a targeted cut off calibrator 1210 which includes a register module 1212, a calibrate module 1214, and a set module 1216.
- A storage unit 1218 is operatively coupled to register module 1212 and calibrate module 1214.
- A yaml file 1220 is configured by targeted cut off calibrator 1210 and provided to the set of one or more pods 1202, as further explained below.
- Side car module 1204 is configured to call register module 1212 to register the number of incoming requests (load) and the resource consumption of a given one of the set of one or more pods 1202.
- Calibrate module 1214 calculates the best targeted resource setting for the specific microservice under the production resource status, as explained above.
- Set module 1216 sets the final and optimal target resource setting (auto-scaling parameter) in yaml file 1220 .
- Storage unit 1218 stores data for use by register module 1212 and calibrate module 1214 .
- FIG. 13 illustrates a methodology 1300 according to an illustrative embodiment.
- Step 1302 computes a parameter based on a first set of execution conditions for a microservice, wherein the parameter represents a resource utilization value at which at least one additional instance of the containerized workload is created for executing the microservice.
- Step 1304 re-computes the parameter based on a second set of execution conditions for the microservice.
- Step 1306 computes and/or re-computes the parameter for another microservice, wherein the resource utilization value for the microservice is different than the resource utilization value for the other microservice.
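As a purely illustrative sketch of the compute/re-compute steps, the headroom heuristic and field names below are assumptions, not the patented computation; they simply combine the measured initialization time and load rate discussed earlier:

```python
def compute_parameter(conditions: dict) -> float:
    """Compute a resource utilization value at which an additional
    containerized workload instance should be created, given measured
    execution conditions for a microservice (hypothetical fields)."""
    # Headroom that must remain when scaling starts: the load that will
    # accrue while the new instance initializes, plus a safety margin.
    headroom = conditions["init_time_sec"] * conditions["load_rate_pct_per_sec"]
    return max(40.0, 100.0 - headroom - conditions.get("margin_pct", 5.0))

# First set of execution conditions (step 1302).
first_conditions = {"init_time_sec": 6.0, "load_rate_pct_per_sec": 2.0}
print(compute_parameter(first_conditions))

# Second set: the load now grows faster, so the parameter is re-computed
# to a lower cut off (step 1304).
second_conditions = {"init_time_sec": 6.0, "load_rate_pct_per_sec": 4.0}
print(compute_parameter(second_conditions))
```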
- Illustrative embodiments of processing platforms utilized to implement functionality for containerized workload auto-scaling management in container environments will now be described in greater detail with reference to FIGS. 14 and 15. It is to be appreciated that the systems and processes described in the context of FIGS. 1-13 can be performed via the platforms in FIGS. 14 and/or 15 but may also be implemented, in whole or in part, in other information processing systems in other embodiments.
- FIG. 14 shows an example processing platform comprising cloud infrastructure 1400 .
- The cloud infrastructure 1400 comprises a combination of physical and virtual processing resources that may be utilized to implement at least a portion of the pod-based container orchestration environment 100 and/or information processing system 200.
- The cloud infrastructure 1400 comprises multiple container sets 1402-1, 1402-2, . . . 1402-L implemented using virtualization infrastructure 1404.
- The virtualization infrastructure 1404 runs on physical infrastructure 1405, and illustratively comprises one or more hypervisors and/or operating system level virtualization infrastructure.
- The cloud infrastructure 1400 further comprises sets of applications 1410-1, 1410-2, . . . 1410-L running on respective ones of the container sets 1402-1, 1402-2, . . . 1402-L under the control of the virtualization infrastructure 1404.
- The container sets 1402 may comprise respective sets of one or more containers.
- In some implementations, the container sets 1402 comprise respective containers implemented using virtualization infrastructure 1404 that provides operating system level virtualization functionality, such as support for Kubernetes-managed containers.
- One or more of the processing modules or other components of pod-based container orchestration environment 100 and/or information processing system 200 may each run on a computer, server, storage device or other processing platform element.
- A given such element may be viewed as an example of what is more generally referred to herein as a “processing device.”
- The cloud infrastructure 1400 shown in FIG. 14 may represent at least a portion of one processing platform.
- Processing platform 1500 shown in FIG. 15 is another example of such a processing platform.
- The processing platform 1500 in this embodiment comprises a portion of pod-based container orchestration environment 100 and/or information processing system 200 and includes a plurality of processing devices, denoted 1502-1, 1502-2, 1502-3, . . . 1502-K, which communicate with one another over a network 1504.
- The network 1504 may comprise any type of network, including by way of example a global computer network such as the Internet, a WAN, a LAN, a satellite network, a telephone or cable network, a cellular network, a wireless network such as a WiFi or WiMAX network, or various portions or combinations of these and other types of networks.
- The processing device 1502-1 in the processing platform 1500 comprises a processor 1510 coupled to a memory 1512.
- The processor 1510 may comprise a microprocessor, a microcontroller, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other type of processing circuitry, as well as portions or combinations of such circuitry elements.
- The memory 1512 may comprise random access memory (RAM), read-only memory (ROM), flash memory or other types of memory, in any combination.
- The memory 1512 and other memories disclosed herein should be viewed as illustrative examples of what are more generally referred to as “processor-readable storage media” storing executable program code of one or more software programs.
- Articles of manufacture comprising such processor-readable storage media are considered illustrative embodiments.
- A given such article of manufacture may comprise, for example, a storage array, a storage disk or an integrated circuit containing RAM, ROM, flash memory or other electronic memory, or any of a wide variety of other types of computer program products.
- The term “article of manufacture” as used herein should be understood to exclude transitory, propagating signals. Numerous other types of computer program products comprising processor-readable storage media can be used.
- Network interface circuitry 1514 is included in the processing device 1502-1, which is used to interface the processing device with the network 1504 and other system components, and may comprise conventional transceivers.
- The other processing devices 1502 of the processing platform 1500 are assumed to be configured in a manner similar to that shown for processing device 1502-1 in the figure.
- pod-based container orchestration environment 100 and/or information processing system 200 may include additional or alternative processing platforms, as well as numerous distinct processing platforms in any combination, with each such platform comprising one or more computers, servers, storage devices or other processing devices.
- Components of an information processing system as disclosed herein can be implemented at least in part in the form of one or more software programs stored in memory and executed by a processor of a processing device.
- For example, at least portions of the functionality as disclosed herein are illustratively implemented in the form of software running on one or more processing devices.
- Storage systems may comprise at least one storage array implemented as a Unity™, PowerMax™, PowerFlex™ (previously ScaleIO™) or PowerStore™ storage array, commercially available from Dell Technologies.
- Storage arrays may comprise respective clustered storage systems, each including a plurality of storage nodes interconnected by one or more networks.
- An example of a clustered storage system of this type is an XtremIO™ storage array from Dell Technologies, illustratively implemented in the form of a scale-out all-flash content addressable storage array.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Debugging And Monitoring (AREA)
Abstract
Description
desiredReplicas=ceil[currentReplicas*(currentMetricValue/desiredMetricValue)]
- (i) Service behavior: the nature of each microservice differs; some microservices take more time to reach 100% utilization, while others take less.
- (ii) Number of requests (load) on each microservice: more parallel requests cause the microservice to reach 100% faster.
- (iii) Size of the image and the resources available: the larger the image, the longer the pod initialization time; fewer available resources also increase the initialization time.
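The desiredReplicas formula quoted above is the standard Kubernetes Horizontal Pod Autoscaler scaling rule, and can be expressed directly as runnable code (the function name here is illustrative):

```python
import math

# The standard Kubernetes HPA scaling rule:
# desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
def desired_replicas(current_replicas, current_metric, desired_metric):
    return math.ceil(current_replicas * (current_metric / desired_metric))

# e.g. 3 pods averaging 90% CPU against a 60% target scale out to 5 pods
assert desired_replicas(3, 90, 60) == 5
```

The three factors listed above matter precisely because this rule fires only once the current metric exceeds the desired value: a service that saturates quickly, or whose pods initialize slowly, may hit 100% before the new replicas are ready.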
Claims (20)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/503,469 US12314767B2 (en) | 2021-10-18 | 2021-10-18 | Containerized workload management in container computing environment |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20230123350A1 US20230123350A1 (en) | 2023-04-20 |
| US12314767B2 (en) | 2025-05-27 |
Family
ID=85981521
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/503,469 Active 2043-04-15 US12314767B2 (en) | 2021-10-18 | 2021-10-18 | Containerized workload management in container computing environment |
Country Status (1)
| Country | Link |
|---|---|
| US (1) | US12314767B2 (en) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12095885B2 (en) * | 2022-10-05 | 2024-09-17 | Hong Kong Applied Science and Technology Research Institute Company Limited | Method and apparatus for removing stale context in service instances in providing microservices |
| US12524705B2 (en) * | 2022-12-13 | 2026-01-13 | International Business Machines Corporation | Intelligent upgrade workflow for a container orchestration system |
| US20240362071A1 (en) * | 2023-04-30 | 2024-10-31 | Intergraph Corporation | Cloud-based systems and methods for execution of python scripts |
| CN118093204B (en) * | 2024-04-25 | 2024-07-09 | 数据空间研究院 | Cluster expansion and contraction method for arranging containers |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10467036B2 (en) * | 2014-09-30 | 2019-11-05 | International Business Machines Corporation | Dynamic metering adjustment for service management of computing platform |
| US10761889B1 (en) * | 2019-09-18 | 2020-09-01 | Palantir Technologies Inc. | Systems and methods for autoscaling instance groups of computing platforms |
| US20220029899A1 (en) * | 2020-07-22 | 2022-01-27 | Citrix Systems, Inc. | Determining changes in a performance of a server |
| US11436054B1 (en) * | 2021-04-05 | 2022-09-06 | Hewlett Packard Enterprise Development Lp | Directing queries to nodes of a cluster of a container orchestration platform distributed across a host system and a hardware accelerator of the host system |
| US20220329651A1 (en) * | 2021-04-12 | 2022-10-13 | Electronics And Telecommunications Research Institute | Apparatus for container orchestration in geographically distributed multi-cloud environment and method using the same |
| US20220385542A1 (en) * | 2019-10-02 | 2022-12-01 | Telefonaktiebolaget Lm Ericsson (Publ) | Performance Modeling for Cloud Applications |
| US20230109368A1 (en) * | 2021-09-29 | 2023-04-06 | Sap Se | Autoscaling gpu applications in kubernetes based on gpu utilization |
| US20230114504A1 (en) * | 2021-10-11 | 2023-04-13 | International Business Machines Corporation | Dynamic scaling for workload execution |
- 2021-10-18: US application US17/503,469 filed; granted as US12314767B2 (Active)
Non-Patent Citations (4)
| Title |
|---|
| Github, "kubernetes/autoscaler," https://github.com/kubernetes/autoscaler, Accessed Aug. 19, 2021, 3 pages. |
| Github, "Vertical Pod Autoscaler," https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler, Accessed Aug. 19, 2021, 9 pages. |
| K. Casey, "5 Approaches to Cloud Automation," https://enterprisersproject.com/article/2021/2/cloud, Feb. 5, 2021, 6 pages. |
| Kubernetes, "Horizontal Pod Autoscaler," https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/, Jul. 26, 2021, 9 pages. |
Also Published As
| Publication number | Publication date |
|---|---|
| US20230123350A1 (en) | 2023-04-20 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12314767B2 (en) | Containerized workload management in container computing environment | |
| US10866840B2 (en) | Dependent system optimization for serverless frameworks | |
| US10089150B2 (en) | Apparatus, device and method for allocating CPU resources | |
| US10037237B2 (en) | Method and arrangement for fault management in infrastructure as a service clouds | |
| US11843548B1 (en) | Resource scaling of microservice containers | |
| CN106068626B (en) | Load Balancing in Distributed Network Management Architecture | |
| US20230168940A1 (en) | Time-bound task management in parallel processing environment | |
| US9772792B1 (en) | Coordinated resource allocation between container groups and storage groups | |
| US9262494B2 (en) | Importing data into dynamic distributed databases | |
| EP3021521A1 (en) | A method and system for scaling, telecommunications network and computer program product | |
| US12061932B2 (en) | Multi-leader election in a distributed computing system | |
| US12386670B2 (en) | On-demand clusters in container computing environment | |
| US20250110775A1 (en) | Configuring microservices in containerized systems | |
| US9934268B2 (en) | Providing consistent tenant experiences for multi-tenant databases | |
| US12086643B2 (en) | Critical workload management in container-based computing environment | |
| US20240272947A1 (en) | Request processing techniques for container-based architectures | |
| US12541401B2 (en) | Intelligent auto-scaling of containerized workloads in container computing environment | |
| US12124722B2 (en) | Dynamic over-provisioning of storage devices | |
| CN117435302A (en) | Container capacity adjustment method, device, electronic equipment and storage medium | |
| CN114217917A (en) | Host scheduling method, device, equipment and storage medium | |
| US20200104173A1 (en) | Communication process load balancing in an automation engine | |
| US12430168B2 (en) | Schedule management for machine learning model-based processing in computing environment | |
| US12217086B2 (en) | Chain schedule management for machine learning model-based processing in computing environment | |
| US20230333870A1 (en) | Orchestrated shutdown of virtual machines using a shutdown interface and a network card | |
| US20240296075A1 (en) | Job control system and control method thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AS | Assignment |
Owner name: DELL PRODUCTS L.P., TEXAS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:PANIKKAR, SHIBI;REEL/FRAME:057816/0155 Effective date: 20211017 |
|
| FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
| STCF | Information on status: patent grant |
Free format text: PATENTED CASE |