US20140304404A1 - Scaling a virtual machine instance - Google Patents

Publication number
US20140304404A1
Authority
US
United States
Prior art keywords
virtual machine
service
computing device
resources
request
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/312,374
Inventor
Michael David Marr
Marcin P. Kowalski
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Amazon Technologies Inc
Original Assignee
Amazon Technologies Inc
Application filed by Amazon Technologies Inc
Priority to US14/312,374
Publication of US20140304404A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/04 Billing or invoicing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5022 Mechanisms to release resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5022 Workload threshold
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request

Definitions

  • Cloud computing, in general, is an approach to providing access to electronic resources through services, such as Web services, where the hardware and/or software used to support those services is dynamically scalable to meet the needs of the services at any given time.
  • a user or customer typically will rent, lease, or otherwise pay for access to resources through the cloud, and thus does not have to purchase and maintain the hardware and/or software needed.
  • Virtualization can allow computing servers, storage devices or other resources to be partitioned into multiple isolated instances that are associated with (e.g., owned by) a particular user.
  • a cloud computing provider usually assigns one or more virtual machines to each of its customers, and the virtual machines are used to execute the applications and/or other workload for those customers. A number of issues and inconveniences may occur, however, when the processing load of the customer begins to exceed the capacity of the virtual machines due to an increase in demand or other reasons.
  • FIG. 1 illustrates an example of scaling up a virtual machine instance by allocating additional CPU, in accordance with various embodiments.
  • FIG. 2 illustrates an example of an automatic scaling service deployed by a service provider, in accordance with various embodiments.
  • FIG. 3A illustrates an example process for automatically scaling a virtual machine instance on a host machine, in accordance with various embodiments.
  • FIG. 3B illustrates an example process of scaling a virtual machine instance in response to receiving a request from the user, in accordance with various embodiments.
  • FIG. 4 illustrates an example process for automatically scaling a virtual machine instance and allocating additional virtual machine instances, in accordance with various embodiments.
  • FIG. 5 illustrates a logical arrangement of a set of general components of an example computing device that can be utilized in accordance with various embodiments.
  • FIG. 6 illustrates an example of an environment for implementing aspects in accordance with various embodiments.
  • Systems and methods in accordance with various embodiments of the present disclosure may overcome one or more of the foregoing or other deficiencies experienced in conventional approaches for scaling computing resources.
  • various embodiments provide approaches for automatically allocating additional computing resources (e.g., processors, memory, networking devices etc.) to a virtual machine instance and/or de-allocating computing resources from a virtual machine instance according to various user-specified thresholds or user requests. Effectively, this enables a virtual machine instance to “grow” or “shrink” in size and capacity on-demand or according to the actual demand for the resources that the virtual machine provides.
  • one such approach can be implemented by a service provider of a shared computing resource environment (e.g., a “cloud” computing provider) that hosts applications and virtual machine instances on behalf of its customers.
  • the applications and virtual machine instances are hosted on the physical resources (e.g., host servers and other network resources) owned and operated by the service provider.
  • the service provider receives a virtual machine image from the customer and provisions one or more virtual machine instances for the customer based at least in part on the virtual machine image. These virtual machine instances can then execute the various applications and/or other services of the customer using the physical computing resources of the service provider.
  • each virtual machine instance is provisioned on a host machine (e.g., computing device).
  • Each host machine can host one or more virtual machine instances.
  • the host machine further includes a hypervisor and service hosting layer that provides access to the hardware device drivers of the host machine and enables the one or more virtual machine instances to access the devices directly or through a virtualization abstraction.
  • the service provider may receive from the customer (e.g., via an application programming interface (API)) a request to allocate additional resources to the virtual machine instance or to de-allocate resources from the instance.
  • the API can allow the customer to specify one or more customer-defined thresholds for the virtual machine instance pertaining to various operating metrics of the underlying resources, such as CPU utilization.
  • the customer is enabled to specify the various runtime operating metrics associated with their service or application that may be relevant to the decision to scale the virtual machine instance. These operating metrics and thresholds can allow the customer to indicate the conditions under which resources allocated to the virtual machine instance should be scaled up or down.
  • a service on the service provider's system monitors operating metrics during execution of the virtual machine instance.
  • the service may receive operating metrics from a guest agent executing within the virtual machine instance. Consequently, the operating metrics may be generated from the server and/or from within the virtual machine instance.
  • If the service detects that one or more of the metrics have exceeded a threshold, such as a customer-defined threshold, for a predetermined period of time, it may initiate the scaling up or down of the virtual machine instance by adding or removing resources (e.g., central processing units (CPUs), memory, other hardware devices).
  • the service may allocate additional CPU capacity to the virtual machine instance (e.g., assign additional CPUs or CPU cores, switch to a more powerful CPU, etc.).
  • the service may de-allocate (e.g., reduce) some amount of CPU capacity from the virtual machine instance.
  • the scaling of the virtual machine instance can be performed automatically, without requiring any manual involvement on the part of the customer. In other embodiments, the scaling of the virtual machine instance can be performed in response to receiving a request to scale the instance from the user (e.g., owner of the virtual machine).
  • the virtual machine instance can be automatically scaled up until a single virtual machine instance is no longer capable of adequately supporting the workload of the customer. Once this limit has been reached, the service may begin to automatically assign additional virtual machine instances to handle the workload. In addition, the service may continue to automatically scale each of the additional VM instances up or down to meet the fluctuating demand in the manner previously described.
  • operating metrics and user-defined thresholds may, in part, include requirements for redundancy, availability, durability or the like, so a “scale out” to multiple VMs hosted on different physical servers may occur even if there is sufficient resource capability on a single server to satisfy other resource requirements, such as a certain amount of CPU or RAM.
  • the managing of scaling up (or down) VM instances within a host can be performed by using a web-based graphical user interface (GUI), customer-defined thresholds, or fully automatic inference in a closed measurement/action loop.
  • This automatic scaling can enable several billing and/or payment models. For example, a customer may be charged for a generic scalable VM instance per GHz-hour of CPU (or other predetermined time period) and/or per GB-hour of RAM, with individual machine resources (CPU, RAM, network) separated out and charged differentially, and the like.
  • Web Services can be used to allow a user (e.g., customer) to request the scaling of virtual machine instances or to specify various thresholds that control when these VM instances will grow or shrink in resource capacity.
  • Web Services can include both Query and simple object access protocol (SOAP) APIs. It should be noted, however, that Web Services are not limited to SOAP-based API calls and can include any remote procedure/function/method execution carried out using a network, such as the Internet.
  • a web service can be deployed by the service provider, which provides resizable computing capacity (e.g., additional server instances in a resource center). This computing capacity can be used to build and host the customer's software systems.
  • the service provider can provide access to these resources using APIs or web tools and utilities. Users can thus access the API functionality exposed by the service provider, in order to add or remove resources, scale based on metrics, redundancy, availability, and the like.
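The per-resource billing model mentioned above (charging per GHz-hour of CPU and per GB-hour of RAM) can be sketched as follows. This is only an illustrative sketch: the names `Rates`, `UsageInterval`, and `compute_charge`, and the rate values used in the usage note, are assumptions and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rates:
    """Hypothetical prices; the disclosure does not give concrete values."""
    per_ghz_hour: float  # price for one GHz of CPU held for one hour
    per_gb_hour: float   # price for one GB of RAM held for one hour


@dataclass(frozen=True)
class UsageInterval:
    """A period during which the VM held a fixed amount of CPU and RAM."""
    hours: float
    cpu_ghz: float
    ram_gb: float


def compute_charge(intervals: Iterable[UsageInterval], rates: Rates) -> float:
    """Sum the charge over all intervals, pricing CPU and RAM separately."""
    total = 0.0
    for iv in intervals:
        cpu_cost = iv.cpu_ghz * rates.per_ghz_hour
        ram_cost = iv.ram_gb * rates.per_gb_hour
        total += iv.hours * (cpu_cost + ram_cost)
    return total
```

For instance, with assumed rates of 0.5 per GHz-hour and 0.25 per GB-hour, a VM that runs two hours at 2 GHz and 4 GB and then one hour at 4 GHz and 8 GB would be charged 8.0 units, because each scaled interval is priced at the capacity actually held during it.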
  • FIG. 1 illustrates an example 100 of scaling up a virtual machine instance by allocating additional CPU capacity, such as virtual CPUs (VCPUs), physical CPUs (PCPUs), cores of a physical CPU, or fractions thereof (herein generally referred to as “CPU”), in accordance with various embodiments.
  • a host computing device 101 includes a hypervisor 102 that manages virtual machine instances 103 and 104.
  • a hypervisor 102 manages the execution of the one or more guest operating systems and allows multiple instances of different operating systems to share the underlying hardware resources.
  • hypervisors are installed on server hardware, with the function of running guest operating systems, where the guest operating systems themselves act as servers.
  • there can be at least two types of hypervisor 102: a type 1 (bare metal) hypervisor and a type 2 (hosted) hypervisor.
  • a type 1 hypervisor runs directly on the hardware resources and manages and controls one or more guest operating systems, which run on top of the hypervisor.
  • a type 2 hypervisor is executed within the operating system and hosts the one or more guest operating systems conceptually at a third level above the hardware resources. Either type of hypervisor can be implemented in accordance with the embodiments described herein.
  • the hypervisor 102 can host a number of domains (e.g., virtual machines), such as the host domain (or service layer or virtualization layer or the like) and one or more guest domains.
  • the host domain (e.g., Dom-0) is the first domain created and helps manage all of the hardware devices and other domains running on the hypervisor 102.
  • the host domain can manage the creating, destroying, migrating, saving, or restoring of the one or more guest domains (e.g., Dom-U).
  • the hypervisor 102 controls access to the hardware resources such as the CPU, input/output (I/O) memory and hypervisor memory.
  • the hypervisor 102 includes an automatic scaling service 114 that performs the scaling of the virtual machine instance by allocating or de-allocating resources to the virtual machine instance.
  • the scaling service can reside in Dom-0 or externally with respect to the host computing device, and the host computing device may include a thin agent to execute commands received from the external scaling service.
  • the hardware resources of the host computing device 101 include physical memory 116, one or more central processing units (CPUs) (107, 108, 109, 110) and any other hardware resources or devices 111.
  • the physical memory 116 can include any data storage device, including but not limited to solid state drive (SSD), magnetic disk storage (HDD), random access memory (RAM) and the like.
  • other hardware resources 111 can include but are not limited to a network interface controller (NIC), a graphics processing unit (GPU), peripheral input/output (I/O) devices and the like.
  • each virtual machine instance (103, 104) can be associated with at least one user 112, 113 (e.g., a customer of the service provider). Each virtual machine instance can execute at least one application (105, 106) or other service on behalf of the user.
  • the virtual machine instance 103 is assigned a set of one or more CPUs (e.g., 107 and 108) and virtual machine instance 104 is assigned another set of CPUs (e.g., 110).
  • the CPUs can be actual physical CPUs or, alternatively, can be virtual CPU capacity that is assigned to the virtual machine.
  • the users are allowed to specify one or more threshold values for the various operating metrics associated with their virtual machines. As illustrated in the figure, when the processing load on the application 105 executing on the virtual machine instance 103 exceeds such a predetermined threshold, the system can allocate additional CPU 109 to the virtual machine instance 103 to meet the increased demand. Similarly, when the processing load decreases, the system may reduce the CPU capacity assigned to the virtual machine instance 103 .
  • the scaling of the virtual machine instance can be performed upon receiving a request from the customer to increase or decrease the amount of resources allocated to the virtual machine instance.
  • the customer may invoke an API provided by the service provider and request the service provider to allocate additional CPU capacity to the virtual machine instance.
  • the scaling service 114 can allocate additional CPU capacity to the virtual machine.
  • the system can also scale the virtual machine instances (103, 104) by allocating or de-allocating memory 116, and/or other hardware resources (e.g., NICs, GPU capacity, etc.). For example, if the virtual machine instance 103 is approaching 90% of memory capacity, the system may allocate additional memory (e.g., physical memory, virtual memory) to the virtual machine instance 103.
  • the scaling of the virtual machine instance can include changing it from one virtual machine instance type to another instance type, where each instance type is associated with a predefined set of resources. For example, upon exceeding a predefined threshold, the service may change the virtual machine assigned to the customer from a “small” instance type (e.g., 1.7 GB of RAM and 160 GB of storage) to a “medium” instance type (e.g., 3.75 GB of RAM and 410 GB of storage).
  • the scaling of the virtual machine instance can be performed on a smooth continuum, e.g. by adding any arbitrary amount of CPU, memory or other resource capacity in any arbitrary increments, for example as required by the user application or service executing in the virtual machine and in accordance with defined metrics and thresholds.
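The FIG. 1 scenario described above, where a free CPU on the host is assigned to a virtual machine instance under load and later returned, can be sketched as simple bookkeeping over a pool of CPU identifiers. The `Host` class and its method names are hypothetical and are not part of the disclosure; a real hypervisor would also have to hot-plug the CPU into the guest.

```python
class Host:
    """Tracks which CPUs on a host are free and which are assigned to VMs."""

    def __init__(self, free_cpus, assigned):
        self.free_cpus = set(free_cpus)
        # Copy the sets so that callers' data is not mutated.
        self.assigned = {vm: set(cpus) for vm, cpus in assigned.items()}

    def allocate_cpu(self, vm_id, count=1):
        """Move `count` free CPUs to the VM; fail if the host has too few."""
        if count > len(self.free_cpus):
            raise RuntimeError("not enough free CPUs on this host")
        granted = sorted(self.free_cpus)[:count]
        self.free_cpus.difference_update(granted)
        self.assigned.setdefault(vm_id, set()).update(granted)
        return granted

    def release_cpu(self, vm_id, count=1):
        """Return `count` CPUs from the VM to the host's free pool."""
        owned = self.assigned.get(vm_id, set())
        if count > len(owned):
            raise ValueError("VM does not hold that many CPUs")
        released = sorted(owned)[-count:]
        owned.difference_update(released)
        self.free_cpus.update(released)
        return released
```

Using the reference numerals of FIG. 1 as identifiers, `Host(free_cpus=[109], assigned={"vm103": {107, 108}, "vm104": {110}})` followed by `allocate_cpu("vm103")` grants CPU 109 to instance 103, after which no free CPU remains on the host.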
  • FIG. 2 illustrates an example 200 of an automatic scaling service deployed by a service provider, in accordance with various embodiments.
  • a service provider 201 owns and operates a set of computing resources, such as host servers (219, 220), which the service provider offers for lease to its customers.
  • the service provider 201 creates a shared resource execution environment in which each user (e.g., customer) is associated with one or more virtual machine instances (209, 210, 211, 212).
  • the virtual machine instances operate on the computing resources 214 and are accessible by the users on various devices over a network (e.g., Internet).
  • a network can be any wired or wireless network of devices that are capable of communicating with each other, including but not limited to the Internet or other Wide Area Networks (WANs), cellular networks, Local Area Networks (LANs), Storage Area Networks (SANs), Intranets, Extranets, and the like.
  • the computing resources such as host servers (219, 220) of the service provider can be located in any physical or logical grouping of resources, such as a data center, a server farm, content delivery network (CDN) point-of-presence (POP) and the like.
  • the service provider exposes one or more application programming interfaces (APIs) 208 for enabling users (e.g., customers) to access and manage the virtual machine instances (209, 210, 211, 212).
  • the APIs 208 can be employed by the users to submit a virtual machine image that will be used to provision the one or more virtual machine instances for the user.
  • the APIs 208 can be employed to specify one or more user-defined thresholds ( 215 , 216 , 217 , 218 ) and metrics that the thresholds relate to.
  • one threshold may be associated with the operating capacity of the CPUs assigned to the virtual machine instance, as previously described.
  • Another threshold may be associated with the amount of available memory assigned to the virtual machine. Another threshold may be an average number of requests being processed by the application executing on the virtual machine instance over a particular period of time.
  • the API can be used by the customer to submit a request to allocate additional resource capacity to the virtual machine instance(s) or to de-allocate resource capacity from the virtual machine instance(s).
  • the automatic monitoring and scaling service 213 can monitor the runtime execution metrics to detect when the metrics have exceeded the defined threshold.
  • the automatic scaling service 213 is a centralized service that collects runtime information from each of the virtual machine instances ( 209 , 210 , 211 , 212 ) and makes decisions to allocate or de-allocate resources from each VM instance.
  • the automatic scaling service 213 can be implemented as a service running on each host machine and be responsible for scaling the virtual machine instances on the host machine.
  • the host machines include a scaling agent ( 221 , 222 ).
  • the scaling agent may report various metrics to a centralized external scaling service 213 , as well as receive commands from the central scaling service 213 and execute them.
  • some of the virtual machine instances may include a guest agent 224 that reports various metrics to the scaling agent (e.g., metrics indicating memory pressure, CPU pressure, etc. as perceived from within the virtual machine, as well as user-specified metrics), which may in turn report the metrics to the automatic scaling service 213.
  • the scaling service 221 may then determine whether to scale the virtual machine instance up or down in resource capacity.
  • the automatic scaling service 213 can begin provisioning new virtual machine instances for the user.
  • the automatic scaling service 213 can continue to manage the scaling up and down of each individual virtual machine instance by adding and/or removing computing resources from each instance, as previously described.
  • thresholds may be defined which require multiple VM instances to support the workload even if all of the work could be handled by a single instance, for example to satisfy redundancy requirements.
  • the service 213 may simultaneously adjust the resources allocated to more than one VM instance to satisfy user specified sizing policy.
  • the automatic scaling of the virtual machine instances can enable a number of different billing models that can be used to charge the customer for utilizing the virtual machines.
  • the customer may be charged a premium for utilizing an automatically scalable virtual machine instance. For example, some customers may only need increased capacity during certain times of the day or on certain occasions. For those customers, it may be advantageous in terms of cost to utilize the automatic scaling service, which can automatically add the needed capacity on demand and reduce the capacity after the demand has subsided. Other customers may not readily know the demand for their service ahead of time, and leveraging the automatic scaling service can provide an approach that ensures that their application will meet the demand without dedicating excess resource capacity to the application before requirements are well understood.
  • the customer may be billed per resource utilized for a given time period (e.g., per CPU hour utilized, per GB of memory per hour, etc.).
  • the service provider 201 can further employ a placement service 223 that is responsible for provisioning the various virtual machine instances (209, 210, 211, 212) onto the host servers (219, 220).
  • the placement service can determine whether the virtual machine instance will be a scalable virtual machine. If the placement service 223 determines that the virtual machine instance will be scalable, the service can provision the virtual machine onto a host server having excess capacity in order to be able to handle an increase in resource capacity that may be required at runtime or on-demand. For example, if the customer purchases an automatically scalable virtual machine for a premium, the placement service may place the VM onto the host machine that has enough capacity to accommodate increased workload of the VM. If the virtual machine will not be scalable, the placement service may provision the virtual machine onto host machines with little or no excess or reserved capacity.
  • the service provider 201 can further provide an electronic marketplace that enables the customer to purchase (e.g., allocate) additional resources of the host computing device to their virtual machine.
  • the price of the additional resources can be based at least in part on demand and supply of the one or more resources on the host computing device. For example, if there is a large amount of CPU capacity available on the host machine and demand is expected to remain low, the price for assigning additional CPUs to a virtual machine on that host machine may be low. Similarly, if there is a small amount of CPU capacity available, the price for additional CPUs may be higher.
  • the service provider is able to optimize resource utilization and provide a more efficient distribution of workload across its network.
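The placement decision described above, where scalable virtual machines go to hosts with excess capacity and non-scalable ones to hosts with little spare capacity, can be sketched as follows. The names `HostServer` and `place_vm`, the `headroom_cpus` parameter, and the choice to measure capacity only in CPUs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HostServer:
    name: str
    total_cpus: int
    used_cpus: int = 0

    @property
    def free_cpus(self) -> int:
        return self.total_cpus - self.used_cpus


def place_vm(hosts: List[HostServer], requested_cpus: int,
             scalable: bool, headroom_cpus: int = 2) -> Optional[HostServer]:
    """Pick a host for a new VM, or return None if no host fits.

    Scalable VMs need `headroom_cpus` spare CPUs beyond their initial size
    and go to the host with the most free capacity; non-scalable VMs are
    packed onto the fitting host with the least free capacity.
    """
    needed = requested_cpus + (headroom_cpus if scalable else 0)
    candidates = [h for h in hosts if h.free_cpus >= needed]
    if not candidates:
        return None
    if scalable:
        chosen = max(candidates, key=lambda h: h.free_cpus)
    else:
        chosen = min(candidates, key=lambda h: h.free_cpus)
    # Only the initial size is consumed now; headroom stays free for growth.
    chosen.used_cpus += requested_cpus
    return chosen
```

Packing non-scalable instances tightly leaves the emptier hosts available for instances that may grow at runtime, which is one possible way to realize the utilization benefit the text describes.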
  • FIG. 3A illustrates an example process 300 for automatically scaling a virtual machine instance on a host machine, in accordance with various embodiments.
  • Although this figure may depict functional operations in a particular sequence, the processes are not necessarily limited to the particular order or operations illustrated.
  • One skilled in the art will appreciate that the various operations portrayed in this or other figures can be changed, rearranged, performed in parallel or adapted in various ways.
  • certain operations or sequences of operations can be added to or omitted from the process, without departing from the scope of the various embodiments.
  • the process illustrations contained herein are intended to demonstrate an idea of the process flow to one of ordinary skill in the art, rather than specifying the actual sequences of code execution, which may be implemented as different flows or sequences, optimized for performance, or otherwise modified in various ways.
  • a virtual machine instance is provisioned for a customer.
  • the virtual machine instance can be provisioned by a service provider of a shared resource computing environment on behalf of the customer.
  • the virtual machine instance provisioned for the customer executes an application that provides a particular service.
  • a customer may deploy a service using several virtual machines, using one virtual machine instance as a database server, a separate virtual machine instance that functions as the front end (e.g. presentation logic) server and a third virtual machine instance that functions as a middleware computation server.
  • the user may specify a customer-defined threshold for scaling the one or more virtual machines.
  • the customer can use an API provided by the service provider to specify the various values and thresholds for specific operating metrics. For example, the user may specify that the size of the virtual machine instance should be increased if the instance is running at 60% CPU capacity for longer than 1 minute. In another embodiment, the user may be able to provide sets of thresholds independent of a virtual machine instance and later associate these with the instance when it is started or otherwise, such as at a later time when it is already operating.
  • the automatic scaling service monitors one or more operating metrics of the virtual machine instance during the execution of the workload.
  • an agent process residing on the host machine may continuously gather various runtime information, such as CPU utilization, number of open connections, IP packet counts, number of requests and the like. The collected information can be reported to a central service that can make the decision to scale each virtual machine instance up or down according to the customer-specified instructions.
  • the service can be hosted within the host machine and the gathered metrics do not need to be reported out.
  • the virtual machine instance may include an agent that reports user-specified metrics relevant to scaling the virtual machine.
  • the service detects that the one or more metrics have exceeded a customer-defined threshold. For example, the service may detect that the CPU usage of the virtual machine instance has exceeded a usage threshold for a minimum time frame specified by the customer.
  • the service can scale the virtual machine instance to increase or decrease capacity of various resources.
  • the scaling service allocates additional computing resources to the virtual machine instance. For example, the scaling service may add more CPUs (or virtual units of CPU capacity) to the virtual machine instance.
  • the scaling service may de-allocate a portion of the resources from the virtual machine instance and/or move the portion of the resources to other virtual machine instances. In some embodiments, the portion or subset selected for de-allocation is determined in order to bring the metrics back to within the customer-defined thresholds.
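The detection step in this process, where a metric must stay beyond a customer-defined threshold for a minimum time before scaling is triggered (for example, 60% CPU for longer than 1 minute), can be sketched as below. The class name `SustainedThreshold` is hypothetical, and treating a reading equal to the limit as a breach is an assumption; the disclosure does not specify that detail.

```python
class SustainedThreshold:
    """Fires only after a metric has stayed at or above a limit long enough."""

    def __init__(self, limit: float, min_duration_s: float):
        self.limit = limit
        self.min_duration_s = min_duration_s
        self._breach_start = None  # timestamp of the first breaching sample

    def observe(self, timestamp_s: float, value: float) -> bool:
        """Record one sample; return True if scaling should be triggered."""
        if value < self.limit:
            # The metric recovered, so any breach in progress is cancelled.
            self._breach_start = None
            return False
        if self._breach_start is None:
            self._breach_start = timestamp_s
        # "Longer than" the minimum duration, hence a strict comparison.
        return timestamp_s - self._breach_start > self.min_duration_s
```

A monitoring loop would call `observe` with each CPU utilization sample from the host agent or guest agent and invoke the scaling action when it returns `True`. A brief dip below the limit resets the timer, which avoids scaling on short spikes.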
  • FIG. 3B illustrates an example process 310 of scaling a virtual machine instance in response to receiving a request from the user, in accordance with various embodiments.
  • the virtual machine instance is provisioned on a host machine for a user, as previously described. Once provisioned, the virtual machine instance can execute a workload on behalf of the user.
  • the service provider receives a request to increase or decrease the computing resource capacity allocated to the virtual machine instance. For example, the user may determine that the virtual machine instance needs more CPU capacity due to an increase in workload. The user may then invoke an API to allocate additional CPUs to the virtual machine instance.
  • the scaling service allocates the additional computing resources to the virtual machine instance or de-allocates computing resources from the virtual machine in response to the request.
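  • The request-driven flow of process 310 can be sketched as a function that applies a requested change to an instance's allocation. This is an illustrative sketch only; the `Allocation` type, the `apply_scale_request` function, and the host free-capacity parameters are hypothetical and do not represent an API described in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    cpus: int
    memory_gb: float


def apply_scale_request(current: Allocation, cpu_delta: int = 0,
                        memory_delta_gb: float = 0.0,
                        host_free_cpus: int = 0,
                        host_free_memory_gb: float = 0.0) -> Allocation:
    """Return the allocation after a user request; positive deltas allocate
    resources and negative deltas de-allocate them."""
    # An increase can only be served from capacity the host still has free.
    if cpu_delta > host_free_cpus or memory_delta_gb > host_free_memory_gb:
        raise ValueError("host lacks free capacity for the request")
    new_cpus = current.cpus + cpu_delta
    new_memory = current.memory_gb + memory_delta_gb
    # A decrease must leave the instance with some CPU and memory to run on.
    if new_cpus < 1 or new_memory <= 0:
        raise ValueError("request would leave the instance without CPU or memory")
    return Allocation(new_cpus, new_memory)
```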
  • FIG. 4 illustrates an example process 400 for automatically scaling a virtual machine instance and allocating additional virtual machine instances, in accordance with various embodiments.
  • a virtual machine instance is provisioned for a user, as previously described.
  • the virtual machine instance is then monitored for one or more pre-specified operating metrics.
  • the service may detect that the one or more operating metrics for the virtual machine instance have crossed (exceeded or fallen below) a customer-defined threshold.
  • the scaling service automatically scales the virtual machine instance by allocating additional computing resources to the virtual machine instance. For example, the scaling service may add additional memory capacity or CPU capacity to the virtual machine instance.
  • the scaling service may determine that the virtual machine instance cannot be scaled to adequately satisfy the workload required of the service. For example, it may determine that the virtual machine instance has grown to a maximum size allowed by the service provider.
  • the scaling service may begin to automatically provision one or more additional virtual machine instances to handle the workload. Each of the additional virtual machine instances can also be scaled in the manner previously described (operation 405 ).
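  • The combination of scaling a single instance up to its maximum size and then provisioning additional instances can be sketched as follows. This is a simplified illustration that considers CPU count only; the function name and the `max_cpus_per_instance` parameter are hypothetical stand-ins for whatever size limit the service provider imposes.

```python
from typing import List


def plan_capacity(required_cpus: int, max_cpus_per_instance: int) -> List[int]:
    """Return the CPU count for each instance needed to cover required_cpus.

    The first instance is grown until it reaches the per-instance maximum;
    only then are further instances added to carry the remaining workload.
    """
    if required_cpus < 1 or max_cpus_per_instance < 1:
        raise ValueError("both arguments must be at least 1")
    instances: List[int] = []
    remaining = required_cpus
    while remaining > 0:
        size = min(remaining, max_cpus_per_instance)
        instances.append(size)
        remaining -= size
    return instances
```

  • For example, a workload needing 20 CPUs with a limit of 8 CPUs per instance yields two full-size instances and a third with 4 CPUs, each of which could later be scaled individually.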
  • FIG. 5 illustrates a logical arrangement of a set of general components of an example computing device 500 .
  • the device includes a processor 502 for executing instructions that can be stored in a memory device or element 504 .
  • the device can include many types of memory, data storage, or non-transitory computer-readable storage media, such as a first data storage for program instructions for execution by the processor 502 , a separate storage for images or data, a removable memory for sharing information with other devices, etc.
  • the device typically will include some type of display element 506 , such as a touch screen or liquid crystal display (LCD), although devices such as portable media players might convey information via other means, such as through audio speakers.
  • the device in many embodiments will include at least one input element 508 able to receive conventional input from a user.
  • This conventional input can include, for example, a push button, touch pad, touch screen, wheel, joystick, keyboard, mouse, keypad, or any other such device or element whereby a user can input a command to the device.
  • a device might not include any buttons at all, and might be controlled only through a combination of visual and audio commands, such that a user can control the device without having to be in contact with the device.
  • the computing device 500 of FIG. 5 can include one or more network interface elements 508 for communicating over various networks, such as Wi-Fi, Bluetooth, RF, wired, or wireless communication systems.
  • the device in many embodiments can communicate with a network, such as the Internet, and may be able to communicate with other such devices.
  • FIG. 6 illustrates an example of an environment 600 for implementing aspects in accordance with various embodiments.
  • the system includes an electronic client device 602 , which can include any appropriate device operable to send and receive requests, messages or information over an appropriate network 604 and convey information back to a user of the device.
  • client devices include personal computers, cell phones, handheld messaging devices, laptop computers, set-top boxes, personal data assistants, electronic book readers and the like.
  • the network can include any appropriate network, including an intranet, the Internet, a cellular network, a local area network or any other such network or combination thereof. Components used for such a system can depend at least in part upon the type of network and/or environment selected. Protocols and components for communicating via such a network are well known and will not be discussed herein in detail. Communication over the network can be enabled via wired or wireless connections and combinations thereof.
  • the network includes the Internet, as the environment includes a Web server 606 for receiving requests and serving content in response thereto, although for other networks an alternative device serving a similar purpose could be used, as would be apparent to one of ordinary skill in the art.
  • the illustrative environment includes at least one application server 608 and a data store 610 .
  • application server can include any appropriate hardware and software for integrating with the data store as needed to execute aspects of one or more applications for the client device and handling a majority of the data access and business logic for an application.
  • the application server provides access control services in cooperation with the data store and is able to generate content such as text, graphics, audio and/or video to be transferred to the user, which may be served to the user by the Web server in the form of HTML, XML or another appropriate structured language in this example.
  • the handling of all requests and responses, as well as the delivery of content between the client device 602 and the application server 608, can be handled by the Web server 606. It should be understood that the Web and application servers are not required and are merely example components, as structured code discussed herein can be executed on any appropriate device or host machine as discussed elsewhere herein.
  • the data store 610 can include several separate data tables, databases or other data storage mechanisms and media for storing data relating to a particular aspect.
  • the data store illustrated includes mechanisms for storing production data 612 and user information 616 , which can be used to serve content for the production side.
  • the data store also is shown to include a mechanism for storing log or session data 614 . It should be understood that there can be many other aspects that may need to be stored in the data store, such as page image information and access rights information, which can be stored in any of the above listed mechanisms as appropriate or in additional mechanisms in the data store 610 .
  • the data store 610 is operable, through logic associated therewith, to receive instructions from the application server 608 and obtain, update or otherwise process data in response thereto.
  • a user might submit a search request for a certain type of item.
  • the data store might access the user information to verify the identity of the user and can access the catalog detail information to obtain information about items of that type.
  • the information can then be returned to the user, such as in a results listing on a Web page that the user is able to view via a browser on the user device 602 .
  • Information for a particular item of interest can be viewed in a dedicated page or window of the browser.
  • Each server typically will include an operating system that provides executable program instructions for the general administration and operation of that server and typically will include computer-readable medium storing instructions that, when executed by a processor of the server, allow the server to perform its intended functions.
  • Suitable implementations for the operating system and general functionality of the servers are known or commercially available and are readily implemented by persons having ordinary skill in the art, particularly in light of the disclosure herein.
  • the environment in one embodiment is a distributed computing environment utilizing several computer systems and components that are interconnected via communication links, using one or more computer networks or direct connections.
  • it will be appreciated by those of ordinary skill in the art that such a system could operate equally well in a system having fewer or a greater number of components than are illustrated in FIG. 6.
  • the depiction of the system 600 in FIG. 6 should be taken as being illustrative in nature and not limiting to the scope of the disclosure.
  • Various embodiments discussed or suggested herein can be implemented in a wide variety of operating environments, which in some cases can include one or more user computers, computing devices, or processing devices which can be used to operate any of a number of applications.
  • User or client devices can include any of a number of general purpose personal computers, such as desktop or laptop computers running a standard operating system, as well as cellular, wireless, and handheld devices running mobile software and capable of supporting a number of networking and messaging protocols.
  • Such a system also can include a number of workstations running any of a variety of commercially-available operating systems and other known applications for purposes such as development and database management.
  • These devices also can include other electronic devices, such as dummy terminals, thin-clients, gaming systems, and other devices capable of communicating via a network.
  • Most embodiments utilize at least one network that would be familiar to those skilled in the art for supporting communications using any of a variety of commercially-available protocols, such as TCP/IP, OSI, FTP, UPnP, NFS, CIFS, and AppleTalk.
  • the network can be, for example, a local area network, a wide-area network, a virtual private network, the Internet, an intranet, an extranet, a public switched telephone network, an infrared network, a wireless network, and any combination thereof.
  • the Web server can run any of a variety of server or mid-tier applications, including HTTP servers, FTP servers, CGI servers, data servers, Java servers, and business application servers.
  • the server(s) also may be capable of executing programs or scripts in response to requests from user devices, such as by executing one or more Web applications that may be implemented as one or more scripts or programs written in any programming language, such as Java®, C, C# or C++, or any scripting language, such as Perl, Python, or TCL, as well as combinations thereof.
  • the server(s) may also include database servers, including without limitation those commercially available from Oracle®, Microsoft®, Sybase®, and IBM®.
  • the environment can include a variety of data stores and other memory and storage media as discussed above. These can reside in a variety of locations, such as on a storage medium local to (and/or resident in) one or more of the computers or remote from any or all of the computers across the network. In a particular set of embodiments, the information may reside in a storage-area network (“SAN”) familiar to those skilled in the art. Similarly, any necessary files for performing the functions attributed to the computers, servers, or other network devices may be stored locally and/or remotely, as appropriate.
  • each such device can include hardware elements that may be electrically coupled via a bus, the elements including, for example, at least one central processing unit (CPU), at least one input device (e.g., a mouse, keyboard, controller, touch screen, or keypad), and at least one output device (e.g., a display device, printer, or speaker).
  • Such a system may also include one or more storage devices, such as disk drives, optical storage devices, and solid-state storage devices such as random access memory (“RAM”) or read-only memory (“ROM”), as well as removable media devices, memory cards, flash cards, etc.
  • Such devices can include a computer-readable storage media reader, a communications device (e.g., a modem, a network card (wireless or wired), an infrared communication device, etc.), and working memory as described above.
  • the computer-readable storage media reader can be connected with, or configured to receive, a computer-readable storage medium, representing remote, local, fixed, and/or removable storage devices as well as storage media for temporarily and/or more permanently containing, storing, transmitting, and retrieving computer-readable information.
  • the system and various devices also typically will include a number of software applications, modules, services, or other elements located within at least one working memory device, including an operating system and application programs, such as a client application or Web browser. It should be appreciated that alternate embodiments may have numerous variations from that described above. For example, customized hardware might also be used and/or particular elements might be implemented in hardware, software (including portable software, such as applets), or both. Further, connection to other computing devices such as network input/output devices may be employed.
  • Storage media and computer readable media for containing code, or portions of code can include any appropriate media known or used in the art, including storage media and communication media, such as but not limited to volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage and/or transmission of information such as computer readable instructions, data structures, program modules, or other data, including RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a system device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Stored Programmes (AREA)
  • Multi Processors (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Techniques are described for scaling of computing resources. A scaling service is utilized that allocates additional computing resources (e.g., processors, memory, etc.) to a virtual machine instance (or other compute instance) and/or de-allocates computing resources from a virtual machine instance according to requests and/or thresholds. In addition to the foregoing, other aspects are described in the description, figures, and claims.

Description

    CROSS REFERENCE TO RELATED APPLICATION
  • This application is a divisional of allowed U.S. patent application Ser. No. 13/593,226, filed Aug. 23, 2012, which is hereby incorporated herein by reference for all purposes.
  • BACKGROUND
  • As an increasing number of applications and services are being made available over networks such as the Internet, an increasing number of content, application, and/or service providers are turning to technologies such as cloud computing. Cloud computing, in general, is an approach to providing access to electronic resources through services, such as Web services, where the hardware and/or software used to support those services is dynamically scalable to meet the needs of the services at any given time. A user or customer typically will rent, lease, or otherwise pay for access to resources through the cloud, and thus does not have to purchase and maintain the hardware and/or software needed.
  • In this context, many cloud computing providers utilize virtualization to allow multiple users to share the underlying hardware and/or software resources. Virtualization can allow computing servers, storage devices or other resources to be partitioned into multiple isolated instances that are associated with (e.g., owned by) a particular user. A cloud computing provider usually assigns one or more virtual machines to each of its customers, and the virtual machines are used to execute the applications and/or other workload for those customers. A number of issues and inconveniences may occur, however, when the processing load of the customer begins to exceed the capacity of the virtual machines due to an increase in demand or other reasons.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Various embodiments in accordance with the present disclosure will be described with reference to the drawings, in which:
  • FIG. 1 illustrates an example of scaling up a virtual machine instance by allocating additional CPU, in accordance with various embodiments;
  • FIG. 2 illustrates an example of an automatic scaling service deployed by a service provider, in accordance with various embodiments;
  • FIG. 3A illustrates an example process for automatically scaling a virtual machine instance on a host machine, in accordance with various embodiments;
  • FIG. 3B illustrates an example process of scaling a virtual machine instance in response to receiving a request from the user, in accordance with various embodiments;
  • FIG. 4 illustrates an example process for automatically scaling a virtual machine instance and allocating additional virtual machine instances, in accordance with various embodiments;
  • FIG. 5 illustrates a logical arrangement of a set of general components of an example computing device that can be utilized in accordance with various embodiments; and
  • FIG. 6 illustrates an example of an environment for implementing aspects in accordance with various embodiments.
  • DETAILED DESCRIPTION
  • In the following description, various embodiments will be illustrated by way of example and not by way of limitation in the figures of the accompanying drawings. References to various embodiments in this disclosure are not necessarily to the same embodiment, and such references mean at least one. While specific implementations and other details are discussed, it is to be understood that this is done for illustrative purposes only. A person skilled in the relevant art will recognize that other components and configurations may be used without departing from the scope and spirit of the claimed subject matter.
  • Systems and methods in accordance with various embodiments of the present disclosure may overcome one or more of the foregoing or other deficiencies experienced in conventional approaches for scaling computing resources. In particular, various embodiments provide approaches for automatically allocating additional computing resources (e.g., processors, memory, networking devices etc.) to a virtual machine instance and/or de-allocating computing resources from a virtual machine instance according to various user-specified thresholds or user requests. Effectively, this enables a virtual machine instance to “grow” or “shrink” in size and capacity on-demand or according to the actual demand for the resources that the virtual machine provides.
  • In accordance with various embodiments, one such approach can be implemented by a service provider of a shared computing resource environment (e.g., a “cloud” computing provider) that hosts applications and virtual machine instances on behalf of its customers. The applications and virtual machine instances are hosted on the physical resources (e.g., host servers and other network resources) owned and operated by the service provider. In accordance with an embodiment, the service provider receives a virtual machine image from the customer and provisions one or more virtual machine instances for the customer based at least in part on the virtual machine image. These virtual machine instances can then execute the various applications and/or other services of the customer using the physical computing resources of the service provider.
  • In accordance with an embodiment, each virtual machine instance is provisioned on a host machine (e.g., computing device). Each host machine can host one or more virtual machine instances. In at least one embodiment, the host machine further includes a hypervisor and service hosting layer that provides access to the hardware device drivers of the host machine and enables the one or more virtual machine instances to access the devices directly or through a virtualization abstraction.
  • In accordance with an embodiment, once the virtual machine instance has been provisioned on the host machine, the service provider may receive from the customer (e.g., via an application programming interface (API)) a request to allocate additional resources to the virtual machine instance or to de-allocate resources from the instance. Furthermore, the API can allow the customer to specify one or more customer-defined thresholds for the virtual machine instance pertaining to various operating metrics of the underlying resources, such as CPU utilization. In addition, the customer is enabled to specify the various runtime operating metrics associated with their service or application that may be relevant to the decision to scale the virtual machine instance. These operating metrics and thresholds can allow the customer to indicate the conditions under which resources allocated to the virtual machine instance should be scaled up or down.
  • In accordance with an embodiment, a service on the service provider's system monitors operating metrics during execution of the virtual machine instance. In the same, or an alternative embodiment, the service may receive operating metrics from a guest agent executing within the virtual machine instance. Consequently, the operating metrics may be generated from the server and/or from within the virtual machine instance. In the instance that the service detects that one or more of the metrics have exceeded a threshold, such as a customer-defined threshold, for a predetermined period of time, it may initiate the scaling up or down of the virtual machine instance by adding or removing resources (e.g., central processing units (CPUs), memory, other hardware devices). For example, if the service detects that the virtual machine instance has been operating at more than 90% CPU capacity for at least 10 seconds over the last hour, it may allocate additional CPU capacity to the virtual machine instance (e.g., assign additional CPUs or CPU cores, switch to a more powerful CPU, etc.). As another example, if the service detects that the virtual machine instance has been operating at less than 10% CPU capacity for a specified period of time, it may de-allocate (e.g., reduce) some amount of CPU capacity from the virtual machine instance. In some embodiments, the scaling of the virtual machine instance can be performed automatically, without requiring any manual involvement on the part of the customer. In other embodiments, the scaling of the virtual machine instance can be performed in response to receiving a request to scale the instance from the user (e.g., owner of the virtual machine).
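  • The threshold check described above can be sketched as a small decision function. This is a simplified illustration, not the disclosed implementation: it treats "for a predetermined period of time" as a run of consecutive utilization samples, and the `CpuPolicy` type, its field names, and the sample count are hypothetical. The 90% and 10% defaults mirror the examples in the preceding paragraph.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CpuPolicy:
    scale_up_above: float = 0.90    # fraction of CPU capacity
    scale_down_below: float = 0.10  # fraction of CPU capacity
    sustained_samples: int = 3      # hypothetical: consecutive samples required


def decide(samples: Sequence[float], policy: CpuPolicy) -> str:
    """Return 'up', 'down', or 'hold' based on the most recent samples."""
    if policy.sustained_samples < 1:
        raise ValueError("sustained_samples must be at least 1")
    if len(samples) < policy.sustained_samples:
        return "hold"  # not enough history to call the condition sustained
    recent = samples[-policy.sustained_samples:]
    if all(s > policy.scale_up_above for s in recent):
        return "up"
    if all(s < policy.scale_down_below for s in recent):
        return "down"
    return "hold"
```

  • Requiring every recent sample to cross the threshold keeps a single short spike from triggering a resize; a cumulative measure (for example, total time above the threshold within a window) would be an alternative design.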
  • In accordance with an embodiment, the virtual machine instance can be automatically scaled up until a single virtual machine instance is no longer capable of adequately supporting the workload of the customer. Once this limit has been reached, the service may begin to automatically assign additional virtual machine instances to handle the workload. In addition, the service may continue to automatically scale each of the additional VM instances up or down to meet the fluctuating demand in the manner previously described. In some embodiments, operating metrics and user-defined thresholds may, in part, include requirements for redundancy, availability, durability or the like, so “scale out” to multiple VMs hosted on different physical servers may occur even if there is sufficient resource capability on a single server to satisfy other resource requirements, such as a certain amount of CPU or RAM.
  • In accordance with various embodiments, the managing of scaling up (or down) VM instances within a host can be performed by using a web-based graphical user interface (GUI), customer-defined thresholds, or fully automatic inference in a closed measurement/action loop. This automatic scaling can enable several billing and/or payment models. For example, a customer may be charged for a generic scalable VM instance which charges per GHz hour (or other predetermined time period) and/or GB per hour of RAM, separating out individual machine resources (CPU, RAM, network) and charging differentially and the like.
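  • The per-resource billing model mentioned above can be illustrated with a short calculation. The sketch below is an assumption-laden example: the segment layout, the function name, and the rates are hypothetical, and no actual pricing is implied by the disclosure. Each segment describes a period during which the instance's allocation was constant.

```python
from typing import Iterable, Tuple

# (hours, cpu_ghz, ram_gb): the allocation held constant over one period
Segment = Tuple[float, float, float]


def metered_charge(segments: Iterable[Segment],
                   rate_per_ghz_hour: float,
                   rate_per_gb_hour: float) -> float:
    """Sum CPU and RAM charges separately over each period of constant allocation."""
    total = 0.0
    for hours, cpu_ghz, ram_gb in segments:
        if hours < 0 or cpu_ghz < 0 or ram_gb < 0:
            raise ValueError("segment values must be non-negative")
        total += hours * (cpu_ghz * rate_per_ghz_hour + ram_gb * rate_per_gb_hour)
    return total
```

  • For instance, two hours at 4 GHz and 8 GB followed by one hour at 8 GHz and 16 GB, at hypothetical rates of 0.5 per GHz hour and 0.25 per GB hour, totals 16.0 in whatever currency unit the rates use.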
  • In various embodiments, Web Services can be used to allow a user (e.g., customer) to request the scaling of virtual machine instances or to specify various thresholds that control when these VM instances will grow or shrink in resource capacity. Web Services can include both Query and simple object access protocol (SOAP) APIs. It should be noted, however, that Web Services are not limited to SOAP based API calls and can include any remote procedure/function/method execution carried out using a network, such as the Internet.
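  • A Query-style request of the kind mentioned above is commonly expressed as a URL with parameters in the query string. The sketch below only illustrates that general shape; the endpoint, the `ScaleInstance` action, and the parameter names are hypothetical and are not an API defined by the disclosure or by any real provider.

```python
from urllib.parse import urlencode


def build_scale_query(endpoint: str, instance_id: str, cpu_delta: int) -> str:
    """Build a Query-style request URL asking to change an instance's CPU count.

    All parameter names here are illustrative placeholders.
    """
    params = {
        "Action": "ScaleInstance",   # hypothetical action name
        "InstanceId": instance_id,
        "CpuDelta": str(cpu_delta),  # positive to allocate, negative to de-allocate
    }
    return f"{endpoint}?{urlencode(params)}"
```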
  • In various embodiments, a web service can be deployed by the service provider, which provides resizable computing capacity (e.g., additional server instances in a resource center). This computing capacity can be used to build and host the customer's software systems. The service provider can provide access to these resources using APIs or web tools and utilities. Users can thus access the API functionality exposed by the service provider, in order to add or remove resources, scale based on metrics, redundancy, availability, and the like.
  • FIG. 1 illustrates an example 100 of scaling up a virtual machine instance by allocating additional CPU capacity (virtual CPUs (VCPUs), physical CPUs (PCPUs), cores of a physical CPU or fractions thereof, herein generally referred to as “CPU”), in accordance with various embodiments. In the illustrated embodiment, a host computing device 101 includes a hypervisor 102 that manages virtual machine instances 103 and 104. The hypervisor 102 manages the execution of the one or more guest operating systems and allows multiple instances of different operating systems to share the underlying hardware resources. Conventionally, hypervisors are installed on server hardware, with the function of running guest operating systems, where the guest operating systems themselves act as servers. In various embodiments, there can be at least two types of hypervisor 102: a type 1 (bare metal) hypervisor; and a type 2 (hosted) hypervisor. A type 1 hypervisor runs directly on the hardware resources and manages and controls one or more guest operating systems, which run on top of the hypervisor. A type 2 hypervisor is executed within the operating system and hosts the one or more guest operating systems conceptually at a third level above the hardware resources. Either type of hypervisor can be implemented in accordance with the embodiments described herein. The hypervisor 102 can host a number of domains (e.g., virtual machines), such as the host domain (or service layer or virtualization layer or the like) and one or more guest domains. In one embodiment, the host domain (e.g., Dom-0) is the first domain created and helps manage all of the hardware devices and other domains running on the hypervisor 102. For example, the host domain can manage the creating, destroying, migrating, saving, or restoring of the one or more guest domains (e.g., Dom-U).
  • In accordance with various embodiments, the hypervisor 102 controls access to the hardware resources such as the CPU, input/output (I/O) memory and hypervisor memory. In the illustrated embodiment, the hypervisor 102 includes an automatic scaling service 114 that performs the scaling of the virtual machine instance by allocating or de-allocating resources to the virtual machine instance. Alternatively, the scaling service can reside in Dom-0 or externally with respect to the host computing device, and the host computing device may include a thin agent to execute commands received from the external scaling service.
  • In accordance with an embodiment, the hardware resources of the host computing device 101 include physical memory 116, one or more central processing units (CPUs) (107, 108, 109, 110) and any other hardware resources or devices 111. The physical memory 116 can include any data storage device, including but not limited to solid state drive (SSD), magnetic disk storage (HDD), random access memory (RAM) and the like. In various embodiments, other hardware resources 111 can include but are not limited to a network interface controller (NIC), a graphics processing unit (GPU), peripheral input/output (I/O) devices and the like.
  • In accordance with an embodiment, each virtual machine instance (103, 104) can be associated with at least one user 112, 113 (e.g., a customer of the service provider). Each virtual machine instance can execute at least one application (105, 106) or other service on behalf of the user. In accordance with the illustrated embodiment, the virtual machine instance 103 is assigned a set of one or more CPUs (e.g., 107 and 108) and virtual machine instance 104 is assigned another set of CPUs (e.g., 110). In various embodiments, the CPUs can be actual physical CPUs or, alternatively, can be virtual CPU capacity that is assigned to the virtual machine.
  • In various embodiments, the users (112, 113) are allowed to specify one or more threshold values for the various operating metrics associated with their virtual machines. As illustrated in the figure, when the processing load on the application 105 executing on the virtual machine instance 103 exceeds such a predetermined threshold, the system can allocate additional CPU 109 to the virtual machine instance 103 to meet the increased demand. Similarly, when the processing load decreases, the system may reduce the CPU capacity assigned to the virtual machine instance 103.
  • In an alternative embodiment, the scaling of the virtual machine instance can be performed upon receiving a request from the customer to increase or decrease the amount of resources allocated to the virtual machine instance. For example, the customer may invoke an API provided by the service provider and request the service provider to allocate additional CPU capacity to the virtual machine instance. In response to receiving the request, the scaling service 114 can allocate additional CPU capacity to the virtual machine.
  • In accordance with various embodiments, the system can also scale the virtual machine instances (103, 104) by allocating or de-allocating memory 116, and/or other hardware resources (e.g., NICs, GPU capacity, etc.). For example, if the virtual machine instance 103 is approaching 90% of memory capacity, the system may allocate additional memory (e.g., physical memory, virtual memory) to the virtual machine instance 103.
  • In accordance with one embodiment, the scaling of the virtual machine instance can include changing it from one virtual machine instance type to another instance type, where each instance type is associated with a predefined set of resources. For example, upon exceeding a predefined threshold, the service may change the virtual machine assigned to the customer from a “small” instance type (e.g., 1.7 GB RAM and 160 GB of storage) to a “medium” instance type (e.g., 3.75 GB RAM and 410 GB storage). In an alternative embodiment, the scaling of the virtual machine instance can be performed on a smooth continuum, e.g. by adding any arbitrary amount of CPU, memory or other resource capacity in any arbitrary increments, for example as required by the user application or service executing in the virtual machine and in accordance with defined metrics and thresholds.
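The two scaling styles described above, stepping between predefined instance types and scaling on a continuum, can be contrasted in a short sketch. The "small" and "medium" figures are taken from the example in the text. The "large" tier, the 10% headroom factor and both function names are assumptions added only for illustration.

```python
# (name, RAM in GB, storage in GB). "small" and "medium" follow the example in
# the text; "large" is a hypothetical third tier added for illustration.
INSTANCE_TYPES = [
    ("small", 1.7, 160),
    ("medium", 3.75, 410),
    ("large", 7.5, 850),
]


def next_instance_type(current: str) -> str:
    """Stepped scaling: move to the next larger predefined type, if any."""
    names = [name for name, _, _ in INSTANCE_TYPES]
    index = names.index(current)          # raises ValueError for unknown types
    return names[min(index + 1, len(names) - 1)]


def continuum_memory_target(needed_gb: float, headroom: float = 0.10) -> float:
    """Continuum scaling: size memory to the measured need plus some headroom.

    The headroom fraction is an assumed policy knob, not part of the source.
    """
    return round(needed_gb * (1.0 + headroom), 2)


print(next_instance_type("small"))   # medium
print(next_instance_type("large"))   # large (already the largest type)
print(continuum_memory_target(2.3))
```

Stepped scaling can only land on one of the listed sizes, whereas the continuum variant can return any amount, which matches the "arbitrary increments" wording above.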
  • FIG. 2 illustrates an example 200 of an automatic scaling service deployed by a service provider, in accordance with various embodiments. In the illustrated embodiment, a service provider 201 owns and operates a set of computing resources, such as host servers (219, 220) which the service provider offers for lease to its customers. In accordance with at least one embodiment, the service provider 201 creates a shared resource execution environment in which each user (e.g., customer) is associated with one or more virtual machine instances (209, 210, 211, 212). The virtual machine instances operate on the computing resources 214 and are accessible by the users on various devices over a network (e.g., Internet). As used throughout this disclosure, a network can be any wired or wireless network of devices that are capable of communicating with each other, including but not limited to the Internet or other Wide Area Networks (WANs), cellular networks, Local Area Networks (LANs), Storage Area Networks (SANs), Intranets, Extranets, and the like. The computing resources such as host servers (219, 220) of the service provider can be located in any physical or logical grouping of resources, such as a data center, a server farm, content delivery network (CDN) point-of-presence (POP) and the like.
  • In accordance with an embodiment, the service provider exposes one or more application programming interfaces (APIs) 208 for enabling users (e.g., customers) to access and manage the virtual machine instances (209, 210, 211, 212). For example, the APIs 208 can be employed by the users to submit a virtual machine image that will be used to provision the one or more virtual machine instances for the user. Similarly, in accordance with various embodiments described herein, the APIs 208 can be employed to specify one or more user-defined thresholds (215, 216, 217, 218) and metrics that the thresholds relate to. For example, one threshold may be associated with the operating capacity of the CPUs assigned to the virtual machine instance, as previously described. Another threshold may be associated with the amount of available memory assigned to the virtual machine. Another threshold may be an average number of requests being processed by the application executing on the virtual machine instance over a particular period of time. In addition, the API can be used by the customer to submit a request to allocate additional resource capacity to the virtual machine instance(s) or to de-allocate resource capacity from the virtual machine instance(s).
  • In accordance with an embodiment, once the user specifies the thresholds (215, 216, 217, 218), the automatic monitoring and scaling service 213 can monitor the runtime execution metrics to detect when the metrics have exceeded the defined threshold. In one embodiment, the automatic scaling service 213 is a centralized service that collects runtime information from each of the virtual machine instances (209, 210, 211, 212) and makes decisions to allocate or de-allocate resources from each VM instance. In an alternative embodiment, the automatic scaling service 213 can be implemented as a service running on each host machine and be responsible for scaling the virtual machine instances on the host machine.
  • In some embodiments, the host machines include a scaling agent (221, 222). The scaling agent may report various metrics to a centralized external scaling service 213, as well as receive commands from the central scaling service 213 and execute them. In accordance with an embodiment, some of the virtual machine instances may include a guest agent 224 that reports various metrics to the scaling agent (e.g., metrics indicating memory pressure, CPU pressure, etc. as perceived from within the virtual machine as well as user specified metrics), which may in turn report the metrics to the automatic scaling service 213. The scaling service 213 may then make determinations of scaling the virtual machine instance up or down in resource capacity.
  • In accordance with an embodiment, if the workload or demand for the user's service reaches a certain limit where a single virtual machine instance is no longer sufficient to adequately handle the work, the automatic scaling service 213 can begin provisioning new virtual machine instances for the user. In addition, the automatic scaling service 213 can continue to manage the scaling up and down of each individual virtual machine instance by adding and/or removing computing resources from each instance, as previously described. In some embodiments, thresholds may be defined which require multiple VM instances to support the workload even if all of the work could be handled by a single instance, for example to satisfy redundancy requirements. In this case, the service 213 may simultaneously adjust the resources allocated to more than one VM instance to satisfy user specified sizing policy.
  • In accordance with an embodiment, the automatic scaling of the virtual machine instances can enable a number of different billing models that can be used to charge the customer for utilizing the virtual machines. In one embodiment, the customer may be charged a premium for utilizing an automatically scalable virtual machine instance. For example, some customers may only need increased capacity during certain times of the day or on certain occasions. For those customers, it may be advantageous in terms of cost to utilize the automatic scaling service that can automatically add the needed capacity on-demand and reduce the capacity after the demand has subsided. Other customers may not readily know the demand for their service ahead of time and leveraging the automatic scaling service can provide an approach that ensures that their application will meet the demand without dedicating excess resource capacity to the application before requirements are well understood. In another embodiment, the customer may be billed per resource utilized for a given time period (e.g., per CPU hour utilized, per GB of memory per hour, etc.).
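The per-resource billing model mentioned above (per CPU hour, per GB of memory per hour) amounts to summing usage over the intervals during which the allocation stayed constant. The sketch below assumes that structure. The `UsageInterval` type, the `bill` function and the rates are hypothetical illustration values, not prices from the disclosure.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class UsageInterval:
    """A period during which the VM's allocation did not change (hypothetical)."""
    hours: float
    cpus: int
    memory_gb: float


def bill(intervals: Iterable[UsageInterval],
         cpu_hour_rate: float, gb_hour_rate: float) -> float:
    """Charge for CPU-hours plus GB-hours of memory across all intervals."""
    total = 0.0
    for interval in intervals:
        hourly = interval.cpus * cpu_hour_rate + interval.memory_gb * gb_hour_rate
        total += interval.hours * hourly
    return total


# One day: 20 hours at the baseline size, 4 hours scaled up for peak demand.
day = [UsageInterval(hours=20, cpus=2, memory_gb=4.0),
       UsageInterval(hours=4, cpus=4, memory_gb=8.0)]
print(f"{bill(day, cpu_hour_rate=0.05, gb_hour_rate=0.01):.2f}")  # 3.92
```

Under these assumed rates, the same customer holding the peak size for all 24 hours would pay 6.72, which shows why scaling down after demand subsides can reduce cost.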
  • In accordance with an embodiment, the service provider 201 can further employ a placement service 223 that is responsible for provisioning the various virtual machine instances (209, 210, 211, 212) onto the host servers (219, 220). The placement service can determine whether the virtual machine instance will be a scalable virtual machine. If the placement service 223 determines that the virtual machine instance will be scalable, the service can provision the virtual machine onto a host server having excess capacity in order to be able to handle an increase in resource capacity that may be required at runtime or on-demand. For example, if the customer purchases an automatically scalable virtual machine for a premium, the placement service may place the VM onto the host machine that has enough capacity to accommodate increased workload of the VM. If the virtual machine will not be scalable, the placement service may provision the virtual machine onto host machines with little or no excess or reserved capacity.
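One way to picture the placement decision described above is a sketch that considers CPU capacity only. It assumes that a scalable VM needs a fixed amount of spare CPUs on its host and is placed on the host with the most free capacity, while a non-scalable VM is packed onto the tightest host that fits. The `Host` type, the `place` function and the `headroom_cpus` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    """Hypothetical view of one host server's CPU capacity."""
    name: str
    total_cpus: int
    used_cpus: int

    @property
    def free_cpus(self) -> int:
        return self.total_cpus - self.used_cpus


def place(hosts: List[Host], requested_cpus: int, scalable: bool,
          headroom_cpus: int = 2) -> Optional[str]:
    """Pick a host for a new VM and record the allocation; None if nothing fits."""
    # A scalable VM must land on a host that can still grow it later.
    required = requested_cpus + (headroom_cpus if scalable else 0)
    candidates = [h for h in hosts if h.free_cpus >= required]
    if not candidates:
        return None
    if scalable:
        chosen = max(candidates, key=lambda h: h.free_cpus)  # most excess capacity
    else:
        chosen = min(candidates, key=lambda h: h.free_cpus)  # tightest fit
    chosen.used_cpus += requested_cpus
    return chosen.name


hosts = [Host("host-219", total_cpus=8, used_cpus=6),
         Host("host-220", total_cpus=16, used_cpus=4)]
print(place(hosts, requested_cpus=2, scalable=True))   # host-220
print(place(hosts, requested_cpus=2, scalable=False))  # host-219
```

The host names reuse the reference numerals 219 and 220 from FIG. 2 only as labels. The capacities are invented.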
  • In accordance with an embodiment, the service provider 201 can further provide an electronic marketplace that enables the customer to purchase (e.g., allocate) additional resources of the host computing device to their virtual machine. The price of the additional resources can be based at least in part on demand and supply of the one or more resources on the host computing device. For example, if there is a large amount of CPU capacity available on the host machine and demand is expected to remain low, the price for assigning additional CPUs to a virtual machine on that host machine may be low. Similarly, if there is a small amount of CPU capacity available, the price for additional CPUs may be higher. By allowing price fluctuation based on demand and supply in this manner, the service provider is able to optimize resource utilization and provide a more efficient distribution of workload across its network.
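The disclosure does not give a pricing formula, so the following is only one possible sketch of supply-sensitive pricing. It assumes a linear rule in which the price of an additional CPU rises as the share of free CPUs on the host shrinks. The function name `cpu_price`, the `surge` multiplier and the base price are hypothetical.

```python
def cpu_price(base_price: float, free_cpus: int, total_cpus: int,
              surge: float = 2.0) -> float:
    """Price of one additional CPU on a host, given how scarce CPUs are there.

    Assumed rule: price = base_price * (1 + surge * scarcity), where scarcity
    is the fraction of the host's CPUs that are already allocated.
    """
    if total_cpus <= 0 or not 0 <= free_cpus <= total_cpus:
        raise ValueError("free_cpus must be between 0 and total_cpus")
    scarcity = 1.0 - free_cpus / total_cpus
    return base_price * (1.0 + surge * scarcity)


plentiful = cpu_price(0.10, free_cpus=12, total_cpus=16)
scarce = cpu_price(0.10, free_cpus=2, total_cpus=16)
print(plentiful < scarce)  # True
```

Expected future demand, which the text also mentions, is not modeled here. It could be folded in as an additional multiplier.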
  • FIG. 3A illustrates an example process 300 for automatically scaling a virtual machine instance on a host machine, in accordance with various embodiments. Although this figure may depict functional operations in a particular sequence, the processes are not necessarily limited to the particular order or operations illustrated. One skilled in the art will appreciate that the various operations portrayed in this or other figures can be changed, rearranged, performed in parallel or adapted in various ways. Furthermore, it is to be understood that certain operations or sequences of operations can be added to or omitted from the process, without departing from the scope of the various embodiments. In addition, the process illustrations contained herein are intended to demonstrate an idea of the process flow to one of ordinary skill in the art, rather than specifying the actual sequences of code execution, which may be implemented as different flows or sequences, optimized for performance, or otherwise modified in various ways.
  • In operation 302, a virtual machine instance is provisioned for a customer. The virtual machine instance can be provisioned by a service provider of a shared resource computing environment on behalf of the customer. In accordance with an embodiment, the virtual machine instance provisioned for the customer executes an application that provides a particular service. For example, a customer may deploy a service using several virtual machines, using one virtual machine instance as a database server, a separate virtual machine instance that functions as the front end (e.g. presentation logic) server and a third virtual machine instance that functions as a middleware computation server. When provisioning the virtual machine instance, the user may specify a customer-defined threshold for scaling the one or more virtual machines. In one embodiment, the customer can use an API provided by the service provider to specify the various values and thresholds for specific operating metrics. For example, the user may specify that the size of the virtual machine instance should be increased if the instance is running at 60% CPU capacity for longer than 1 minute. In another embodiment, the user may be able to provide sets of thresholds independent of a virtual machine instance and later associate these with the instance when it is started or otherwise, such as at a later time when it is already operating.
  • In operation 303, the automatic scaling service monitors one or more operating metrics of the virtual machine instance during the execution of the workload. For example, an agent process residing on the host machine may continuously gather various runtime information, such as CPU utilization, number of open connections, IP packet counts, number of requests and the like. The collected information can be reported to a central service that can make the decision to scale each virtual machine instance up or down according to the customer-specified instructions. In an alternative embodiment, the service can be hosted within the host machine and the gathered metrics do not need to be reported out. In other embodiments, the virtual machine instance may include an agent that reports user-specified metrics relevant to scaling the virtual machine.
  • In operation 304, the service detects that the one or more metrics have exceeded a customer-defined threshold. For example, the service may detect that the CPU usage of the virtual machine instance has exceeded a usage threshold for a minimum time frame specified by the customer.
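Detecting that a metric has stayed above a threshold for a minimum time frame, such as the earlier example of 60% CPU for one minute, can be sketched as a scan over timestamped samples. The sample format and the function name `exceeded_for` are assumptions. The sketch treats "for a minimum time frame" as "at least that long without dropping back".

```python
from typing import Iterable, Tuple


def exceeded_for(samples: Iterable[Tuple[float, float]],
                 threshold: float, min_duration_s: float) -> bool:
    """Return True if the metric stayed above threshold for min_duration_s.

    samples is an iterable of (timestamp_seconds, value) in ascending time order.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp       # first sample of a new breach
            if timestamp - breach_start >= min_duration_s:
                return True                    # breach lasted long enough
        else:
            breach_start = None                # metric recovered: reset the timer
    return False


# CPU utilization sampled every 15 seconds.
sustained = [(0, 0.50), (15, 0.70), (30, 0.72), (45, 0.68), (60, 0.75), (75, 0.71)]
spiky = [(0, 0.70), (15, 0.70), (30, 0.50), (45, 0.70), (60, 0.70)]
print(exceeded_for(sustained, threshold=0.60, min_duration_s=60))  # True
print(exceeded_for(spiky, threshold=0.60, min_duration_s=60))      # False
```

Requiring a sustained breach keeps short spikes, like the one in `spiky`, from triggering a scaling action.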
  • In operation 305, the service can scale the virtual machine instance to increase or decrease capacity of various resources. In one embodiment, if the processing load has increased, the scaling service allocates additional computing resources to the virtual machine instance. For example, the scaling service may add more CPUs (or virtual units of CPU capacity) to the virtual machine instance. In another embodiment, the scaling service may de-allocate a portion of the resources from the virtual machine instance and/or move the portion of the resources to other virtual machine instances. In some embodiments, the portion or subset selected for de-allocation is determined in order to bring the metrics back to within the customer-defined thresholds.
  • FIG. 3B illustrates an example process 310 of scaling a virtual machine instance in response to receiving a request from the user, in accordance with various embodiments. In operation 311, the virtual machine instance is provisioned on a host machine for a user, as previously described. Once provisioned, the virtual machine instance can execute a workload on behalf of the user. In operation 312, the service provider receives a request to increase or decrease the computing resource capacity allocated to the virtual machine instance. For example, the user may determine that the virtual machine instance needs more CPU capacity due to an increase in workload. The user may then invoke an API to allocate additional CPUs to the virtual machine instance. In operation 313, the scaling service allocates the additional computing resources to the virtual machine instance or de-allocates computing resources from the virtual machine in response to the request.
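Operations 312 and 313 can be sketched as a handler that applies a requested change in CPU count to a VM on a host. The sketch assumes that a positive delta draws from the host's free CPUs and a negative delta returns CPUs to it. `HostState`, `handle_scale_request` and the VM identifier are hypothetical and do not represent the service provider's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class HostState:
    """Hypothetical bookkeeping for one host: free CPUs and per-VM allocations."""
    free_cpus: int
    vm_cpus: Dict[str, int] = field(default_factory=dict)


def handle_scale_request(host: HostState, vm_id: str, delta_cpus: int) -> int:
    """Apply a user request to add (delta > 0) or remove (delta < 0) CPUs.

    Returns the VM's new CPU count, or raises if the request cannot be met.
    """
    if vm_id not in host.vm_cpus:
        raise KeyError(vm_id)
    new_count = host.vm_cpus[vm_id] + delta_cpus
    if new_count < 1:
        raise ValueError("a virtual machine must keep at least one CPU")
    if delta_cpus > host.free_cpus:
        raise ValueError("host does not have enough free CPU capacity")
    host.vm_cpus[vm_id] = new_count
    host.free_cpus -= delta_cpus   # a negative delta returns CPUs to the host
    return new_count


host = HostState(free_cpus=2, vm_cpus={"vm-103": 2})
print(handle_scale_request(host, "vm-103", +1))  # 3
print(handle_scale_request(host, "vm-103", -2))  # 1
print(host.free_cpus)                            # 3
```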
  • FIG. 4 illustrates an example process 400 for automatically scaling a virtual machine instance and allocating additional virtual machine instances, in accordance with various embodiments.
  • In operation 401, a virtual machine instance is provisioned for a user, as previously described. The virtual machine instance is then monitored for one or more pre-specified operating metrics. In operation 402, the service may detect that the one or more operating metrics for the virtual machine instance have crossed (exceeded or fallen below) a customer-defined threshold. In operation 403, the scaling service automatically scales the virtual machine instance by allocating additional computing resources to the virtual machine instance. For example, the scaling service may add additional memory capacity or CPU capacity to the virtual machine instance.
  • In operation 404, the scaling service may determine that the virtual machine instance cannot be scaled to adequately satisfy the workload required of the service. For example, it may determine that the virtual machine instance has grown to a maximum size allowed by the service provider. In operation 405, the scaling service may begin to automatically provision one or more additional virtual machine instances to handle the workload. Each of the additional virtual machine instances can also be scaled in the manner previously described (operation 405).
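The combined flow of FIG. 4, growing an instance first and provisioning further instances only once it reaches the provider's maximum size, can be sketched as a capacity-planning step. The sketch assumes that demand and instance sizes are expressed in whole CPUs. The function name `scale_step` and the maximum size used in the example are hypothetical.

```python
from typing import List


def scale_step(instances: List[int], demand_cpus: int,
               max_per_instance: int) -> List[int]:
    """Return per-instance CPU counts that cover demand_cpus.

    Existing instances are grown first (operation 403). Only when they have
    reached max_per_instance are new instances provisioned (operations 404-405).
    This sketch never scales down.
    """
    if max_per_instance < 1:
        raise ValueError("max_per_instance must be at least 1")
    result = list(instances)
    shortfall = demand_cpus - sum(result)
    for i in range(len(result)):
        if shortfall <= 0:
            break
        grow = min(shortfall, max_per_instance - result[i])
        result[i] += grow
        shortfall -= grow
    while shortfall > 0:
        size = min(shortfall, max_per_instance)   # provision an additional instance
        result.append(size)
        shortfall -= size
    return result


print(scale_step([2], demand_cpus=6, max_per_instance=8))   # [6]
print(scale_step([2], demand_cpus=19, max_per_instance=8))  # [8, 8, 3]
```

In the second call the single instance reaches the assumed maximum of 8 CPUs, so the remaining 11 CPUs of demand are spread over two newly provisioned instances.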
  • FIG. 5 illustrates a logical arrangement of a set of general components of an example computing device 500. In this example, the device includes a processor 502 for executing instructions that can be stored in a memory device or element 504. As would be apparent to one of ordinary skill in the art, the device can include many types of memory, data storage, or non-transitory computer-readable storage media, such as a first data storage for program instructions for execution by the processor 502, a separate storage for images or data, a removable memory for sharing information with other devices, etc. The device typically will include some type of display element 506, such as a touch screen or liquid crystal display (LCD), although devices such as portable media players might convey information via other means, such as through audio speakers. As discussed, the device in many embodiments will include at least one input element 508 able to receive conventional input from a user. This conventional input can include, for example, a push button, touch pad, touch screen, wheel, joystick, keyboard, mouse, keypad, or any other such device or element whereby a user can input a command to the device. In some embodiments, however, such a device might not include any buttons at all, and might be controlled only through a combination of visual and audio commands, such that a user can control the device without having to be in contact with the device. In some embodiments, the computing device 500 of FIG. 5 can include one or more network interface elements 508 for communicating over various networks, such as a Wi-Fi, Bluetooth, RF, wired, or wireless communication systems. The device in many embodiments can communicate with a network, such as the Internet, and may be able to communicate with other such devices.
  • As discussed, different approaches can be implemented in various environments in accordance with the described embodiments. For example, FIG. 6 illustrates an example of an environment 600 for implementing aspects in accordance with various embodiments. As will be appreciated, although a Web-based environment is used for purposes of explanation, different environments may be used, as appropriate, to implement various embodiments. The system includes an electronic client device 602, which can include any appropriate device operable to send and receive requests, messages or information over an appropriate network 604 and convey information back to a user of the device. Examples of such client devices include personal computers, cell phones, handheld messaging devices, laptop computers, set-top boxes, personal data assistants, electronic book readers and the like. The network can include any appropriate network, including an intranet, the Internet, a cellular network, a local area network or any other such network or combination thereof. Components used for such a system can depend at least in part upon the type of network and/or environment selected. Protocols and components for communicating via such a network are well known and will not be discussed herein in detail. Communication over the network can be enabled via wired or wireless connections and combinations thereof. In this example, the network includes the Internet, as the environment includes a Web server 606 for receiving requests and serving content in response thereto, although for other networks an alternative device serving a similar purpose could be used, as would be apparent to one of ordinary skill in the art.
  • The illustrative environment includes at least one application server 608 and a data store 610. It should be understood that there can be several application servers, layers or other elements, processes or components, which may be chained or otherwise configured, which can interact to perform tasks such as obtaining data from an appropriate data store. As used herein the term “data store” refers to any device or combination of devices capable of storing, accessing and retrieving data, which may include any combination and number of data servers, databases, data storage devices and data storage media, in any standard, distributed or clustered environment. The application server can include any appropriate hardware and software for integrating with the data store as needed to execute aspects of one or more applications for the client device and handling a majority of the data access and business logic for an application. The application server provides access control services in cooperation with the data store and is able to generate content such as text, graphics, audio and/or video to be transferred to the user, which may be served to the user by the Web server in the form of HTML, XML or another appropriate structured language in this example. The handling of all requests and responses, as well as the delivery of content between the client device 602 and the application server 608, can be handled by the Web server 606. It should be understood that the Web and application servers are not required and are merely example components, as structured code discussed herein can be executed on any appropriate device or host machine as discussed elsewhere herein.
  • The data store 610 can include several separate data tables, databases or other data storage mechanisms and media for storing data relating to a particular aspect. For example, the data store illustrated includes mechanisms for storing production data 612 and user information 616, which can be used to serve content for the production side. The data store also is shown to include a mechanism for storing log or session data 614. It should be understood that there can be many other aspects that may need to be stored in the data store, such as page image information and access rights information, which can be stored in any of the above listed mechanisms as appropriate or in additional mechanisms in the data store 610. The data store 610 is operable, through logic associated therewith, to receive instructions from the application server 608 and obtain, update or otherwise process data in response thereto. In one example, a user might submit a search request for a certain type of item. In this case, the data store might access the user information to verify the identity of the user and can access the catalog detail information to obtain information about items of that type. The information can then be returned to the user, such as in a results listing on a Web page that the user is able to view via a browser on the user device 602. Information for a particular item of interest can be viewed in a dedicated page or window of the browser.
  • Each server typically will include an operating system that provides executable program instructions for the general administration and operation of that server and typically will include computer-readable medium storing instructions that, when executed by a processor of the server, allow the server to perform its intended functions. Suitable implementations for the operating system and general functionality of the servers are known or commercially available and are readily implemented by persons having ordinary skill in the art, particularly in light of the disclosure herein.
  • The environment in one embodiment is a distributed computing environment utilizing several computer systems and components that are interconnected via communication links, using one or more computer networks or direct connections. However, it will be appreciated by those of ordinary skill in the art that such a system could operate equally well in a system having fewer or a greater number of components than are illustrated in FIG. 6. Thus, the depiction of the system 600 in FIG. 6 should be taken as being illustrative in nature and not limiting to the scope of the disclosure.
  • Various embodiments discussed or suggested herein can be implemented in a wide variety of operating environments, which in some cases can include one or more user computers, computing devices, or processing devices which can be used to operate any of a number of applications. User or client devices can include any of a number of general purpose personal computers, such as desktop or laptop computers running a standard operating system, as well as cellular, wireless, and handheld devices running mobile software and capable of supporting a number of networking and messaging protocols. Such a system also can include a number of workstations running any of a variety of commercially-available operating systems and other known applications for purposes such as development and database management. These devices also can include other electronic devices, such as dummy terminals, thin-clients, gaming systems, and other devices capable of communicating via a network.
  • Most embodiments utilize at least one network that would be familiar to those skilled in the art for supporting communications using any of a variety of commercially-available protocols, such as TCP/IP, OSI, FTP, UPnP, NFS, CIFS, and AppleTalk. The network can be, for example, a local area network, a wide-area network, a virtual private network, the Internet, an intranet, an extranet, a public switched telephone network, an infrared network, a wireless network, and any combination thereof.
  • In embodiments utilizing a Web server, the Web server can run any of a variety of server or mid-tier applications, including HTTP servers, FTP servers, CGI servers, data servers, Java servers, and business application servers. The server(s) also may be capable of executing programs or scripts in response to requests from user devices, such as by executing one or more Web applications that may be implemented as one or more scripts or programs written in any programming language, such as Java®, C, C# or C++, or any scripting language, such as Perl, Python, or TCL, as well as combinations thereof. The server(s) may also include database servers, including without limitation those commercially available from Oracle®, Microsoft®, Sybase®, and IBM®.
  • The environment can include a variety of data stores and other memory and storage media as discussed above. These can reside in a variety of locations, such as on a storage medium local to (and/or resident in) one or more of the computers or remote from any or all of the computers across the network. In a particular set of embodiments, the information may reside in a storage-area network (“SAN”) familiar to those skilled in the art. Similarly, any necessary files for performing the functions attributed to the computers, servers, or other network devices may be stored locally and/or remotely, as appropriate. Where a system includes computerized devices, each such device can include hardware elements that may be electrically coupled via a bus, the elements including, for example, at least one central processing unit (CPU), at least one input device (e.g., a mouse, keyboard, controller, touch screen, or keypad), and at least one output device (e.g., a display device, printer, or speaker). Such a system may also include one or more storage devices, such as disk drives, optical storage devices, and solid-state storage devices such as random access memory (“RAM”) or read-only memory (“ROM”), as well as removable media devices, memory cards, flash cards, etc.
  • Such devices also can include a computer-readable storage media reader, a communications device (e.g., a modem, a network card (wireless or wired), an infrared communication device, etc.), and working memory as described above. The computer-readable storage media reader can be connected with, or configured to receive, a computer-readable storage medium, representing remote, local, fixed, and/or removable storage devices as well as storage media for temporarily and/or more permanently containing, storing, transmitting, and retrieving computer-readable information. The system and various devices also typically will include a number of software applications, modules, services, or other elements located within at least one working memory device, including an operating system and application programs, such as a client application or Web browser. It should be appreciated that alternate embodiments may have numerous variations from that described above. For example, customized hardware might also be used and/or particular elements might be implemented in hardware, software (including portable software, such as applets), or both. Further, connection to other computing devices such as network input/output devices may be employed.
  • Storage media and computer readable media for containing code, or portions of code, can include any appropriate media known or used in the art, including storage media and communication media, such as but not limited to volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage and/or transmission of information such as computer readable instructions, data structures, program modules, or other data, including RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a system device. Based on the disclosure and teachings provided herein, a person of ordinary skill in the art will appreciate other ways and/or methods to implement the various embodiments.
  • The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereunto without departing from the broader spirit and scope of the invention as set forth in the claims.

Claims (23)

What is claimed is:
1. A computer implemented method comprising:
under the control of one or more computer systems configured with executable instructions,
causing, by a host computing device running in a service provider environment, a virtual machine to be provisioned on the host computing device;
receiving, from a scaling service running in the service provider environment, a request to adjust resources allocated to the virtual machine; and
in response to receiving the request, adjusting allocation of one or more computing resources to the virtual machine.
2. The computer implemented method of claim 1, wherein the scaling service further performs:
monitoring one or more metrics associated with running the virtual machine;
detecting that the one or more metrics have passed at least one specified threshold; and
transmitting the request to adjust resources allocated to the virtual machine to the host computer device in response to detecting that the one or more metrics have passed the specified threshold.
3. The computer implemented method of claim 1, wherein a monitoring service executing on the host computing device is configured to
monitor one or more metrics associated with the virtual machine;
detect that the one or more metrics have passed a threshold; and
adjust the allocation of the one or more computing resources in response to detecting that the one or more metrics have passed the threshold.
4. The computer implemented method of claim 1, wherein the scaling service further performs:
receiving a request to adjust resources allocated to the virtual machine from a customer via an application programming interface (API); and
transmitting the request to adjust resources allocated to the virtual machine in response to receiving the request.
5. The computer implemented method of claim 1, further comprising:
continuing to allocate the one or more additional resources to the virtual machine until a predetermined limit is reached; and
provisioning one or more additional virtual machines to distribute a workload of the virtual machine across one or more additional virtual machines.
6. The computer implemented method of claim 1, wherein adjusting the allocation further includes:
determining that one or more additional virtual machines are needed to satisfy at least one of: redundancy, availability or durability associated with at least one service provided by the virtual machine; and
provisioning the one or more additional virtual machines.
7. The computer implemented method of claim 1, wherein the virtual machine further includes a guest agent that reports one or more metrics associated with executing the workload.
8. The computer implemented method of claim 1, further comprising:
billing the user based at least in part on the one or more resources of the host computing device allocated to the virtual machine in response to the request from the scaling service.
9. The computer implemented method of claim 1, further comprising:
providing an electronic marketplace that enables a customer to obtain resources of the host computing device to be allocated to the virtual machine for a fee, the fee being based at least in part on demand and supply of the one or more resources.
10. The computer implemented method of claim 1, further comprising:
receiving, by a placement service, an API request requesting that the virtual machine be scalable;
determining, by the placement service, that the host computing device includes capacity to add additional resources to the virtual machine; and
provisioning, by the placement service, the virtual machine onto the host computing device.
11. A computing system, comprising:
at least one processor; and
memory including instructions that, when executed by the processor, cause the computing system to:
receive a web service request related to adjusting resources allocated to a virtual machine running in a service provider environment, the virtual machine being provisioned for a user; and
in response to receiving the request, cause a server hosting the virtual machine to allocate one or more computing resources to the virtual machine.
12. The computing system of claim 11, wherein the web service request contains one or more input parameters that specify the computing resources to allocate to the virtual machine.
13. The computing system of claim 11, wherein the web service request contains one or more input parameters that specify a set of conditions in response to which the server is instructed to adjust the computing resources allocated to the virtual machine.
14. The computing system of claim 11, wherein the memory further comprises instructions that upon execution cause the computing system to:
monitor one or more metrics associated with running the virtual machine;
detect that the one or more metrics have passed at least one specified threshold; and
transmit the request to adjust resources allocated to the virtual machine to the host computing device in response to detecting that the one or more metrics have passed the specified threshold.
15. The computing device of claim 11, wherein the scaling service further performs:
receive a request to scale the virtual machine from a customer via an application programming interface (API); and
transmit the instruction to scale the virtual machine in response to receiving the request.
16. The computing device of claim 11, wherein the memory further comprises instructions that, when executed by the processor, cause the computing device to:
continue to allocate the one or more additional resources to the virtual machine until a predetermined limit is reached; and
provision one or more additional virtual machines to distribute a workload of the virtual machine across one or more additional virtual machines.
17. The computing device of claim 11, wherein the memory further comprises instructions that, when executed by the processor, cause the computing device to:
determine that the user has selected a virtual machine instance of a type that is capable of being scaled by allocating the computing resource in response to the web service request; and
bill the user based at least in part on the type of the virtual machine selected by the user.
18. The computing device of claim 11, wherein the memory further comprises instructions that, when executed by the processor, cause the computing device to:
provide an electronic marketplace that enables a customer to purchase one or more additional resources of the computing device based at least in part on demand and supply of the one or more resources on the computing device.
19. The computing device of claim 11, wherein provisioning the virtual machine for the user further includes:
receive, by a placement service, an Application Programming Interface (API) request requesting that the virtual machine be scalable; and
provision, by the placement service, the virtual machine onto the host computing device.
20. A non-transitory computer readable storage medium storing one or more sequences of instructions executable by one or more processors to perform a set of operations comprising:
causing a virtual machine to be provisioned for a user on a host computing device, the virtual machine capable of executing a workload;
receiving an instruction to scale the virtual machine, the instruction received from a scaling service to the host computing device, the scaling service residing externally with respect to the host computing device; and
in response to receiving the instruction, adjusting allocation of one or more computing resources to the virtual machine, the one or more computing resources being allocatable by a hypervisor of the host computing device.
21. The non-transitory computer readable storage medium of claim 20, wherein the virtual machine is provisioned by a shared resource computing environment service provider on behalf of at least one customer, and wherein the scaling service is deployed by the service provider.
22. The non-transitory computer readable storage medium of claim 20, wherein the scaling service further performs:
monitoring one or more metrics associated with the workload executed by the virtual machine;
detecting that the one or more metrics have passed at least one specified threshold; and
transmitting the instruction to scale the virtual machine instance in response to detecting that the one or more metrics have passed the specified threshold.
23. The non-transitory computer readable storage medium of claim 20, wherein the scaling service further performs:
receiving a request to scale the virtual machine from a customer via an application programming interface (API); and
transmitting the instruction to scale the virtual machine in response to receiving the request.
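
The flow recited across claims 1, 2 and 5 can be illustrated with a minimal sketch. It is not the applicant's implementation: the class names (`Host`, `VirtualMachine`, `ScalingService`), the single CPU-utilization metric, the one-CPU step size and the returned status strings are all assumptions made for illustration, and the additional virtual machine is placed on the same host only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VirtualMachine:
    """Hypothetical stand-in for a virtual machine on a host computing device."""
    name: str
    cpus: int


@dataclass
class Host:
    """Hypothetical host whose hypervisor can change a VM's CPU allocation."""
    free_cpus: int
    vms: List[VirtualMachine] = field(default_factory=list)

    def provision(self, name: str, cpus: int) -> VirtualMachine:
        # Claim 1: cause a virtual machine to be provisioned on the host.
        if cpus > self.free_cpus:
            raise ValueError("insufficient host capacity")
        vm = VirtualMachine(name, cpus)
        self.free_cpus -= cpus
        self.vms.append(vm)
        return vm

    def adjust(self, vm: VirtualMachine, delta: int) -> bool:
        # Claim 1: adjust the allocation in response to a request.
        if delta > self.free_cpus or vm.cpus + delta < 1:
            return False
        vm.cpus += delta
        self.free_cpus -= delta
        return True


class ScalingService:
    """Hypothetical scaling service that sends adjustment requests to a host."""

    def __init__(self, host: Host, threshold: float = 0.8, cpu_limit: int = 8):
        self.host = host
        self.threshold = threshold  # assumed metric threshold (claim 2)
        self.cpu_limit = cpu_limit  # assumed predetermined limit (claim 5)

    def observe(self, vm: VirtualMachine, cpu_utilization: float) -> str:
        # Claim 2: act only once the monitored metric has passed the threshold.
        if cpu_utilization <= self.threshold:
            return "no-op"
        # Claim 5: keep adding resources until the predetermined limit is reached.
        if vm.cpus < self.cpu_limit and self.host.adjust(vm, 1):
            return "scaled-up"
        # Claim 5: then provision an additional VM to share the workload.
        if self.host.free_cpus >= 1:
            self.host.provision(f"{vm.name}-{len(self.host.vms)}", 1)
            return "scaled-out"
        return "at-capacity"
```

In this sketch, repeated calls to `observe` with a utilization above the threshold first grow the virtual machine one CPU at a time and, once the limit is reached, provision a second virtual machine instead.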

Priority Applications (1)

Application Number | Publication Number | Priority Date | Filing Date | Title
US14/312,374 | US20140304404A1 (en) | 2012-08-23 | 2014-06-23 | Scaling a virtual machine instance

Applications Claiming Priority (2)

Application Number | Publication Number | Priority Date | Filing Date | Title
US13/593,226 | US8825550B2 (en) | 2012-08-23 | 2012-08-23 | Scaling a virtual machine instance
US14/312,374 | US20140304404A1 (en) | 2012-08-23 | 2014-06-23 | Scaling a virtual machine instance

Related Parent Applications (1)

Application Number | Relation | Publication Number | Priority Date | Filing Date | Title
US13/593,226 | Division | US8825550B2 (en) | 2012-08-23 | 2012-08-23 | Scaling a virtual machine instance

Publications (1)

Publication Number | Publication Date
US20140304404A1 (en) | 2014-10-09

Family

ID=50148872

Family Applications (2)

Application Number | Status | Publication Number | Priority Date | Filing Date | Title
US13/593,226 | Expired - Fee Related | US8825550B2 (en) | 2012-08-23 | 2012-08-23 | Scaling a virtual machine instance
US14/312,374 | Abandoned | US20140304404A1 (en) | 2012-08-23 | 2014-06-23 | Scaling a virtual machine instance

Family Applications Before (1)

Application Number | Status | Publication Number | Priority Date | Filing Date | Title
US13/593,226 | Expired - Fee Related | US8825550B2 (en) | 2012-08-23 | 2012-08-23 | Scaling a virtual machine instance

Country Status (11)

Country | Link
US (2) | US8825550B2 (en)
EP (1) | EP2888663A4 (en)
JP (1) | JP6144346B2 (en)
CN (2) | CN104620222A (en)
AU (2) | AU2013305544B2 (en)
BR (1) | BR112015003786A2 (en)
CA (1) | CA2882531C (en)
IN (1) | IN2015DN01497A (en)
RU (1) | RU2616167C2 (en)
SG (2) | SG10201606964VA (en)
WO (1) | WO2014032031A2 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150381711A1 (en) * 2014-06-26 2015-12-31 Vmware, Inc. Methods and apparatus to scale application deployments in cloud computing environments
US9547534B2 (en) * 2014-10-10 2017-01-17 International Business Machines Corporation Autoscaling applications in shared cloud resources
WO2017200878A1 (en) * 2016-05-17 2017-11-23 Amazon Technologies, Inc. Versatile autoscaling
US10069869B2 (en) 2016-05-17 2018-09-04 Amazon Technologies, Inc. Versatile autoscaling
WO2019046642A1 (en) * 2017-08-31 2019-03-07 Genesys Telecommunications Laboratories, Inc. Systems and methods for load balancing across media server instances
US10409642B1 (en) 2016-11-22 2019-09-10 Amazon Technologies, Inc. Customer resource monitoring for versatile scaling service scaling policy recommendations
US10412022B1 (en) 2016-10-19 2019-09-10 Amazon Technologies, Inc. On-premises scaling using a versatile scaling service and an application programming interface management service
US20190386895A1 (en) * 2018-06-13 2019-12-19 At&T Intellectual Property I, L.P. East-west traffic monitoring solutions for the microservice virtualized data center lan
US11228643B2 (en) * 2019-06-04 2022-01-18 Capital One Services, Llc System and method for fast application auto-scaling

Families Citing this family (180)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8826271B2 (en) 2010-04-28 2014-09-02 Cavium, Inc. Method and apparatus for a virtual system on chip
US9323579B2 (en) * 2012-08-25 2016-04-26 Vmware, Inc. Resource allocation diagnosis on distributed computer systems
GB2506195A (en) 2012-09-25 2014-03-26 Ibm Managing a virtual computer resource
US8959513B1 (en) 2012-09-27 2015-02-17 Juniper Networks, Inc. Controlling virtualization resource utilization based on network state
GB2507779A (en) 2012-11-09 2014-05-14 Ibm Terminating a virtual machine in response to user inactivity in a cloud computing environment
US9038068B2 (en) * 2012-11-15 2015-05-19 Bank Of America Corporation Capacity reclamation and resource adjustment
US9081623B1 (en) 2012-12-05 2015-07-14 Amazon Technologies, Inc. Service resource allocation
US9851989B2 (en) * 2012-12-12 2017-12-26 Vmware, Inc. Methods and apparatus to manage virtual machines
KR101540631B1 (en) * 2012-12-28 2015-07-30 삼성에스디에스 주식회사 System, method and recording medium recording the program thereof for dynamic expansion of the virtual cluster
US9275408B1 (en) * 2013-01-25 2016-03-01 Amazon Technologies, Inc. Transferring ownership of computing resources
US9535727B1 (en) * 2013-02-07 2017-01-03 Ca, Inc. Identifying virtual machines that perform inconsistent with a profile
JP6089783B2 (en) * 2013-02-27 2017-03-08 富士通株式会社 Control device, resource control program, and resource control method
US20140258235A1 (en) * 2013-03-05 2014-09-11 VCE Company LLC Method to provide user domain management of snapshots for virtual desktops using centralized portal
US9313188B2 (en) * 2013-06-14 2016-04-12 Microsoft Technology Licensing, Llc Providing domain-joined remote applications in a cloud environment
US9503387B2 (en) * 2013-08-21 2016-11-22 Cisco Technology, Inc. Instantiating incompatible virtual compute requests in a heterogeneous cloud environment
US9727355B2 (en) * 2013-08-23 2017-08-08 Vmware, Inc. Virtual Hadoop manager
US9851988B1 (en) * 2013-09-04 2017-12-26 Amazon Technologies, Inc. Recommending computer sizes for automatically scalable computer groups
US9870568B2 (en) * 2013-11-19 2018-01-16 Xerox Corporation Methods and systems to price customized virtual machines
US10769054B1 (en) * 2014-02-13 2020-09-08 Amazon Technologies, Inc. Integrated program code marketplace and service provider network
US9594584B2 (en) * 2014-03-31 2017-03-14 Electronics And Telecommunications Research Institute Apparatus and method for mapping of tenant based dynamic processor
US9965334B1 (en) * 2014-06-09 2018-05-08 VCE IP Holding Company LLC Systems and methods for virtual machine storage provisioning
EP2955631B1 (en) * 2014-06-09 2019-05-01 Nokia Solutions and Networks Oy Controlling of virtualized network functions for usage in communication network
US9424065B2 (en) 2014-06-26 2016-08-23 Vmware, Inc. Methods and apparatus to scale application deployments in cloud computing environments using virtual machine pools
US10423456B2 (en) 2014-07-31 2019-09-24 Hewlett Packard Enterprise Development Lp Dynamic adjustment of resource utilization thresholds
US10129112B2 (en) 2014-08-14 2018-11-13 At&T Intellectual Property I, L.P. Workflow-based resource management
EP2988214A1 (en) * 2014-08-20 2016-02-24 Alcatel Lucent Method for balancing a load, a system, an elasticity manager and a computer program product
US9606826B2 (en) * 2014-08-21 2017-03-28 International Business Machines Corporation Selecting virtual machines to be migrated to public cloud during cloud bursting based on resource usage and scaling policies
US9471362B2 (en) * 2014-09-23 2016-10-18 Splunk Inc. Correlating hypervisor data for a virtual machine with associated operating system data
US9146764B1 (en) 2014-09-30 2015-09-29 Amazon Technologies, Inc. Processing event messages for user requests to execute program code
US9830193B1 (en) 2014-09-30 2017-11-28 Amazon Technologies, Inc. Automatic management of low latency computational capacity
US9600312B2 (en) 2014-09-30 2017-03-21 Amazon Technologies, Inc. Threading as a service
US9678773B1 (en) 2014-09-30 2017-06-13 Amazon Technologies, Inc. Low latency computational capacity provisioning
US9647889B1 (en) 2014-11-12 2017-05-09 Amazon Technologies, Inc. Standby instances for auto-scaling groups
US10411960B1 (en) * 2014-11-12 2019-09-10 Amazon Technologies, Inc. Detaching instances from auto-scaling group
US9762457B2 (en) * 2014-11-25 2017-09-12 At&T Intellectual Property I, L.P. Deep packet inspection virtual function
US10355934B2 (en) * 2014-12-03 2019-07-16 Amazon Technologies, Inc. Vertical scaling of computing instances
US9413626B2 (en) * 2014-12-05 2016-08-09 Amazon Technologies, Inc. Automatic management of resource sizing
CN105873114B (en) * 2015-01-21 2020-12-11 中兴通讯股份有限公司 Method for monitoring virtual network function performance and corresponding system
WO2016122462A1 (en) * 2015-01-27 2016-08-04 Hewlett Packard Enterprise Development Lp Virtual machine placement
US9733967B2 (en) 2015-02-04 2017-08-15 Amazon Technologies, Inc. Security protocols for low latency execution of program code
US9588790B1 (en) 2015-02-04 2017-03-07 Amazon Technologies, Inc. Stateful virtual compute system
US9851933B2 (en) * 2015-03-25 2017-12-26 International Business Machines Corporation Capability-based abstraction of software-defined infrastructure
US9946573B2 (en) * 2015-05-20 2018-04-17 Oracle International Corporation Optimizing virtual machine memory sizing for cloud-scale application deployments
US20160359695A1 (en) * 2015-06-04 2016-12-08 Cisco Technology, Inc. Network behavior data collection and analytics for anomaly detection
US10848574B2 (en) 2015-06-11 2020-11-24 Microsoft Technology Licensing, Llc Computing resource management system
GB2539429B (en) 2015-06-16 2017-09-06 Advanced Risc Mach Ltd Address translation
GB2539436B (en) * 2015-06-16 2019-02-06 Advanced Risc Mach Ltd Secure initialisation
GB2539433B8 (en) 2015-06-16 2018-02-21 Advanced Risc Mach Ltd Protected exception handling
GB2539435B8 (en) 2015-06-16 2018-02-21 Advanced Risc Mach Ltd Data processing memory access control, in which an owning process for a region of memory is specified independently of privilege level
GB2539428B (en) 2015-06-16 2020-09-09 Advanced Risc Mach Ltd Data processing apparatus and method with ownership table
US10476766B1 (en) * 2015-06-19 2019-11-12 Amazon Technologies, Inc. Selecting and configuring metrics for monitoring
US10367705B1 (en) * 2015-06-19 2019-07-30 Amazon Technologies, Inc. Selecting and configuring metrics for monitoring
US10475111B1 (en) * 2015-06-19 2019-11-12 Amazon Technologies, Inc. Selecting and configuring metrics for monitoring
US9785474B2 (en) 2015-07-23 2017-10-10 International Business Machines Corporation Managing a shared pool of configurable computing resources using a set of scaling factors and a set of workload resource data
US9864640B2 (en) 2015-08-14 2018-01-09 International Business Machines Corporation Controlling virtual machine density and placement distribution in a converged infrastructure resource pool
US20170052866A1 (en) * 2015-08-21 2017-02-23 International Business Machines Corporation Managing a shared pool of configurable computing resources which uses a set of dynamically-assigned resources
CN106528287B (en) 2015-09-09 2019-10-29 阿里巴巴集团控股有限公司 Resource for computer system distribution method and device
US10169086B2 (en) 2015-09-13 2019-01-01 International Business Machines Corporation Configuration management for a shared pool of configurable computing resources
US10146592B2 (en) * 2015-09-18 2018-12-04 Salesforce.Com, Inc. Managing resource allocation in a stream processing framework
CN106548262B (en) * 2015-09-21 2020-11-06 阿里巴巴集团控股有限公司 Scheduling method, device and system for resources for processing tasks
JP6424797B2 (en) * 2015-11-02 2018-11-21 株式会社デンソー In-vehicle device
US10361919B2 (en) * 2015-11-09 2019-07-23 At&T Intellectual Property I, L.P. Self-healing and dynamic optimization of VM server cluster management in multi-cloud platform
JP6743368B2 (en) * 2015-11-09 2020-08-19 日本電気株式会社 Virtual infrastructure host, virtual infrastructure host control method, virtual infrastructure host program, and communication system
WO2017083781A1 (en) * 2015-11-11 2017-05-18 Amazon Technologies, Inc. Scaling for virtualized graphics processing
US9678785B1 (en) * 2015-11-30 2017-06-13 International Business Machines Corporation Virtual machine resource allocation based on user feedback
US10395219B1 (en) * 2015-12-18 2019-08-27 Amazon Technologies, Inc. Location policies for reserved virtual machine instances
US9910713B2 (en) 2015-12-21 2018-03-06 Amazon Technologies, Inc. Code execution request routing
US11181875B2 (en) 2016-01-22 2021-11-23 Johnson Controls Tyco IP Holdings LLP Systems and methods for monitoring and controlling a central plant
CN106998560A (en) * 2016-01-25 2017-08-01 中兴通讯股份有限公司 A kind of management method, the network equipment and system for virtualizing network function
US9940156B2 (en) * 2016-01-29 2018-04-10 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Decreasing hardware resource amount assigned to virtual machine as utilization of hardware resource decreases below a threshold
US11132213B1 (en) 2016-03-30 2021-09-28 Amazon Technologies, Inc. Dependency-based process of pre-existing data sets at an on demand code execution environment
US10666516B2 (en) * 2016-04-04 2020-05-26 Avago Technologies International Sales Pte. Limited Constraint-based virtual network function placement
US10135712B2 (en) 2016-04-07 2018-11-20 At&T Intellectual Property I, L.P. Auto-scaling software-defined monitoring platform for software-defined networking service assurance
US10235211B2 (en) * 2016-04-22 2019-03-19 Cavium, Llc Method and apparatus for dynamic virtual system on chip
WO2017213634A1 (en) * 2016-06-07 2017-12-14 Hitachi, Ltd. Method and apparatus to deploy applications on proper it resources based on frequency and amount of change
US10102040B2 (en) 2016-06-29 2018-10-16 Amazon Technologies, Inc Adjusting variable limit on concurrent code executions
US10127068B2 (en) 2016-06-30 2018-11-13 Amazon Technologies, Inc. Performance variability reduction using an opportunistic hypervisor
US10318311B2 (en) * 2016-06-30 2019-06-11 Amazon Technologies, Inc. Memory allocation techniques at partially-offloaded virtualization managers
US10678603B2 (en) 2016-09-01 2020-06-09 Microsoft Technology Licensing, Llc Resource oversubscription based on utilization patterns in computing systems
US10871995B2 (en) * 2016-09-29 2020-12-22 Amazon Technologies, Inc. Managed container instances
US10346191B2 (en) * 2016-12-02 2019-07-09 Vmware, Inc. System and method for managing size of clusters in a computing environment
US20180173526A1 (en) 2016-12-20 2018-06-21 Invensys Systems, Inc. Application lifecycle management system
US20180183858A1 (en) * 2016-12-28 2018-06-28 BeBop Technology LLC Method and System for Managing Cloud Based Operations
US10593009B1 (en) 2017-02-22 2020-03-17 Amazon Technologies, Inc. Session coordination for auto-scaled virtualized graphics processing
US20180260248A1 (en) 2017-03-09 2018-09-13 Johnson Controls Technology Company Building automation system with hybrid cluster optimization
US10706375B2 (en) 2017-03-29 2020-07-07 Johnson Controls Technology Company Central plant with asset allocator
EP3616007A1 (en) 2017-04-25 2020-03-04 Johnson Controls Technology Company Predictive building control system with neural network based constraint generation
US11271769B2 (en) 2019-11-14 2022-03-08 Johnson Controls Tyco IP Holdings LLP Central plant control system with asset allocation override
US11005733B2 (en) 2017-06-08 2021-05-11 Vmware, Inc Methods, systems, and apparatus to scale in and/or scale out resources managed by a cloud automation system
US10318333B2 (en) 2017-06-28 2019-06-11 Sap Se Optimizing allocation of virtual machines in cloud computing environment
US10721294B2 (en) * 2017-07-12 2020-07-21 Telefonaktiebolaget Lm Ericsson (Publ) Apparatus and method for subscription-based resource throttling in a cloud environment
US11947489B2 (en) 2017-09-05 2024-04-02 Robin Systems, Inc. Creating snapshots of a storage volume in a distributed storage system
US10452267B2 (en) 2017-09-13 2019-10-22 Robin Systems, Inc. Storage scheme for a distributed storage system
US10579276B2 (en) 2017-09-13 2020-03-03 Robin Systems, Inc. Storage scheme for a distributed storage system
US10430105B2 (en) 2017-09-13 2019-10-01 Robin Systems, Inc. Storage scheme for a distributed storage system
RU181857U1 (en) * 2017-09-13 2018-07-26 Общество с ограниченной ответственностью "Интегратор" A hardware-software device based on the cloud hyperconvergence operating system
US10534549B2 (en) 2017-09-19 2020-01-14 Robin Systems, Inc. Maintaining consistency among copies of a logical storage volume in a distributed storage system
US10838788B2 (en) * 2017-09-30 2020-11-17 Oracle International Corporation Real-time debugging instances in a deployed container platform
US10846001B2 (en) 2017-11-08 2020-11-24 Robin Systems, Inc. Allocating storage requirements in a distributed storage system
US10782887B2 (en) 2017-11-08 2020-09-22 Robin Systems, Inc. Window-based prority tagging of IOPs in a distributed storage system
US11126927B2 (en) * 2017-11-24 2021-09-21 Amazon Technologies, Inc. Auto-scaling hosted machine learning models for production inference
US10430292B2 (en) 2017-12-19 2019-10-01 Robin Systems, Inc. Snapshot deletion in a distributed storage system
US10430110B2 (en) 2017-12-19 2019-10-01 Robin Systems, Inc. Implementing a hybrid storage node in a distributed storage system
US10452308B2 (en) 2017-12-19 2019-10-22 Robin Systems, Inc. Encoding tags for metadata entries in a storage system
US10719344B2 (en) 2018-01-03 2020-07-21 Accenture Global Solutions Limited Prescriptive analytics based compute sizing correction stack for cloud computing resource scheduling
US10642697B2 (en) 2018-01-11 2020-05-05 Robin Systems, Inc. Implementing containers for a stateful application in a distributed computing system
US11748203B2 (en) 2018-01-11 2023-09-05 Robin Systems, Inc. Multi-role application orchestration in a distributed storage system
US11099937B2 (en) 2018-01-11 2021-08-24 Robin Systems, Inc. Implementing clone snapshots in a distributed storage system
US10628235B2 (en) 2018-01-11 2020-04-21 Robin Systems, Inc. Accessing log files of a distributed computing system using a simulated file system
US11392363B2 (en) 2018-01-11 2022-07-19 Robin Systems, Inc. Implementing application entrypoints with containers of a bundled application
US10896102B2 (en) 2018-01-11 2021-01-19 Robin Systems, Inc. Implementing secure communication in a distributed computing system
US11582168B2 (en) 2018-01-11 2023-02-14 Robin Systems, Inc. Fenced clone applications
US10846137B2 (en) * 2018-01-12 2020-11-24 Robin Systems, Inc. Dynamic adjustment of application resources in a distributed computing system
US10579364B2 (en) 2018-01-12 2020-03-03 Robin Systems, Inc. Upgrading bundled applications in a distributed computing system
US10845997B2 (en) 2018-01-12 2020-11-24 Robin Systems, Inc. Job manager for deploying a bundled application
US10642694B2 (en) 2018-01-12 2020-05-05 Robin Systems, Inc. Monitoring containers in a distributed computing system
US11112449B2 (en) 2018-04-06 2021-09-07 Bently Nevada, LLC Flexible and scalable monitoring systems for industrial machines
US11016141B2 (en) 2018-04-06 2021-05-25 Bently Nevada, Llc Monitoring systems for industrial machines having dynamically adjustable computational units
US10853115B2 (en) 2018-06-25 2020-12-01 Amazon Technologies, Inc. Execution of auxiliary functions in an on-demand network code execution system
US11146569B1 (en) 2018-06-28 2021-10-12 Amazon Technologies, Inc. Escalation-resistant secure network services using request-scoped authentication information
US10949237B2 (en) 2018-06-29 2021-03-16 Amazon Technologies, Inc. Operating system customization in an on-demand network code execution system
US11099870B1 (en) 2018-07-25 2021-08-24 Amazon Technologies, Inc. Reducing execution times in an on-demand network code execution system using saved machine states
US10976938B2 (en) 2018-07-30 2021-04-13 Robin Systems, Inc. Block map cache
US11023328B2 (en) 2018-07-30 2021-06-01 Robin Systems, Inc. Redo log for append only storage scheme
US10817380B2 (en) 2018-07-31 2020-10-27 Robin Systems, Inc. Implementing affinity and anti-affinity constraints in a bundled application
US10599622B2 (en) 2018-07-31 2020-03-24 Robin Systems, Inc. Implementing storage volumes over multiple tiers
US11243953B2 (en) 2018-09-27 2022-02-08 Amazon Technologies, Inc. Mapreduce implementation in an on-demand network code execution system and stream data processing system
US11099917B2 (en) 2018-09-27 2021-08-24 Amazon Technologies, Inc. Efficient state maintenance for execution environments in an on-demand code execution system
US10884778B1 (en) 2018-09-28 2021-01-05 Amazon Technologies, Inc. Adjusting dynamically scalable instance hosting based on compute resource usage
US10877786B1 (en) * 2018-09-28 2020-12-29 Amazon Technologies, Inc. Managing compute resource usage based on prior usage
JP2020061032A (en) * 2018-10-11 2020-04-16 富士通株式会社 Database server management program, database server management method, and database system
US10908848B2 (en) 2018-10-22 2021-02-02 Robin Systems, Inc. Automated management of bundled applications
US11036439B2 (en) 2018-10-22 2021-06-15 Robin Systems, Inc. Automated management of bundled applications
US10620871B1 (en) 2018-11-15 2020-04-14 Robin Systems, Inc. Storage scheme for a distributed storage system
US11853800B2 (en) * 2018-11-19 2023-12-26 Alibaba Group Holding Limited Power management method
US11943093B1 (en) 2018-11-20 2024-03-26 Amazon Technologies, Inc. Network connection recovery after virtual machine transition in an on-demand network code execution system
US11010188B1 (en) 2019-02-05 2021-05-18 Amazon Technologies, Inc. Simulated data object storage using on-demand computation of data objects
US11861386B1 (en) 2019-03-22 2024-01-02 Amazon Technologies, Inc. Application gateways in an on-demand network code execution system
US11086725B2 (en) 2019-03-25 2021-08-10 Robin Systems, Inc. Orchestration of heterogeneous multi-role applications
US11256434B2 (en) 2019-04-17 2022-02-22 Robin Systems, Inc. Data de-duplication
US10831387B1 (en) 2019-05-02 2020-11-10 Robin Systems, Inc. Snapshot reservations in a distributed storage system
CN113767367B (en) * 2019-05-10 2024-05-31 日立安斯泰莫株式会社 Virtual machine monitor and control device
US11061732B2 (en) * 2019-05-14 2021-07-13 EMC IP Holding Company LLC System and method for scalable backup services
US10877684B2 (en) 2019-05-15 2020-12-29 Robin Systems, Inc. Changing a distributed storage volume from non-replicated to replicated
US11119809B1 (en) 2019-06-20 2021-09-14 Amazon Technologies, Inc. Virtualization-based transaction handling in an on-demand network code execution system
US11159528B2 (en) 2019-06-28 2021-10-26 Amazon Technologies, Inc. Authentication to network-services using hosted authentication information
US11115404B2 (en) 2019-06-28 2021-09-07 Amazon Technologies, Inc. Facilitating service connections in serverless code executions
US11003504B2 (en) 2019-06-28 2021-05-11 Cohesity, Inc. Scaling virtualization resource units of applications
US11190609B2 (en) 2019-06-28 2021-11-30 Amazon Technologies, Inc. Connection pooling for scalable network services
US11226847B2 (en) 2019-08-29 2022-01-18 Robin Systems, Inc. Implementing an application manifest in a node-specific manner using an intent-based orchestrator
US11249851B2 (en) 2019-09-05 2022-02-15 Robin Systems, Inc. Creating snapshots of a storage volume in a distributed storage system
US11520650B2 (en) 2019-09-05 2022-12-06 Robin Systems, Inc. Performing root cause analysis in a multi-role application
US11347684B2 (en) 2019-10-04 2022-05-31 Robin Systems, Inc. Rolling back KUBERNETES applications including custom resources
US11113158B2 (en) 2019-10-04 2021-09-07 Robin Systems, Inc. Rolling back kubernetes applications
US11119826B2 (en) 2019-11-27 2021-09-14 Amazon Technologies, Inc. Serverless call distribution to implement spillover while avoiding cold starts
US10924429B1 (en) * 2019-11-29 2021-02-16 Amazon Technologies, Inc. Using edge-optimized compute instances to execute user workloads at provider substrate extensions
US11403188B2 (en) 2019-12-04 2022-08-02 Robin Systems, Inc. Operation-level consistency points and rollback
CN111210286A (en) * 2019-12-26 2020-05-29 大象慧云信息技术有限公司 Tax control server-based efficient invoice issuing method and system
US11714682B1 (en) 2020-03-03 2023-08-01 Amazon Technologies, Inc. Reclaiming computing resources in an on-demand code execution system
US11249790B1 (en) * 2020-03-11 2022-02-15 Amazon Technologies, Inc. Scheduling usage of oversubscribed computing resources
US11188391B1 (en) 2020-03-11 2021-11-30 Amazon Technologies, Inc. Allocating resources to on-demand code executions under scarcity conditions
CN111651170B (en) * 2020-05-29 2022-11-08 深圳平安医疗健康科技服务有限公司 Instance dynamic adjustment method and device and related equipment
US11108638B1 (en) 2020-06-08 2021-08-31 Robin Systems, Inc. Health monitoring of automatically deployed and managed network pipelines
US11528186B2 (en) 2020-06-16 2022-12-13 Robin Systems, Inc. Automated initialization of bare metal servers
CN111752712B (en) * 2020-06-28 2023-08-18 中国银行股份有限公司 Method and device for improving resource utilization rate of virtual machine
US11740980B2 (en) 2020-09-22 2023-08-29 Robin Systems, Inc. Managing snapshot metadata following backup
US11743188B2 (en) 2020-10-01 2023-08-29 Robin Systems, Inc. Check-in monitoring for workflows
US11456914B2 (en) 2020-10-07 2022-09-27 Robin Systems, Inc. Implementing affinity and anti-affinity with KUBERNETES
US11271895B1 (en) 2020-10-07 2022-03-08 Robin Systems, Inc. Implementing advanced networking capabilities using helm charts
CN112162864B (en) * 2020-10-26 2023-06-09 新华三大数据技术有限公司 Cloud resource allocation method, device and storage medium
US11750451B2 (en) 2020-11-04 2023-09-05 Robin Systems, Inc. Batch manager for complex workflows
US11593270B1 (en) 2020-11-25 2023-02-28 Amazon Technologies, Inc. Fast distributed caching using erasure coded object parts
US11550713B1 (en) 2020-11-25 2023-01-10 Amazon Technologies, Inc. Garbage collection in distributed systems using life cycled storage roots
US11556361B2 (en) 2020-12-09 2023-01-17 Robin Systems, Inc. Monitoring and managing of complex multi-role applications
US11960913B2 (en) * 2021-03-16 2024-04-16 Nerdio, Inc. Systems and methods of auto-scaling a virtual desktop environment
US11971705B2 (en) 2021-04-13 2024-04-30 UiPath, Inc. Autoscaling strategies for robotic process automation
US11388210B1 (en) 2021-06-30 2022-07-12 Amazon Technologies, Inc. Streaming analytics using a serverless compute system
WO2023022855A1 (en) * 2021-08-20 2023-02-23 Microsoft Technology Licensing, Llc. Upgrading a virtual device deployment based on spike utilization
US11968280B1 (en) 2021-11-24 2024-04-23 Amazon Technologies, Inc. Controlling ingestion of streaming data to serverless function executions
US20230213998A1 (en) * 2022-01-04 2023-07-06 Quanta Cloud Technology Inc. Prediction-based system and method for optimizing energy consumption in computing systems

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100211944A1 (en) * 2007-09-12 2010-08-19 Mitsubishi Electric Corporation Information processing apparatus
US20100306767A1 (en) * 2009-05-29 2010-12-02 Dehaan Michael Paul Methods and systems for automated scaling of cloud computing systems
US20110213686A1 (en) * 2010-02-26 2011-09-01 James Michael Ferris Systems and methods for managing a software subscription in a cloud network
US20120011378A1 (en) * 2010-07-09 2012-01-12 Stratergia Ltd Power profiling and auditing consumption systems and methods
US20120124211A1 (en) * 2010-10-05 2012-05-17 Kampas Sean Robert System and method for cloud enterprise services
US20120174097A1 (en) * 2011-01-04 2012-07-05 Host Dynamics Ltd. Methods and systems of managing resources allocated to guest virtual machines
US20120324239A1 (en) * 2009-12-29 2012-12-20 Siemens Aktiengesellschaft Method and device for operating a virtual machine in accordance with an associated information on assignment of rights
US20120324443A1 (en) * 2011-06-14 2012-12-20 International Business Machines Corporation Reducing data transfer overhead during live migration of a virtual machine
US20120331461A1 (en) * 2011-06-27 2012-12-27 Robert Fries Host enabled management channel
US9378044B1 (en) * 2015-03-28 2016-06-28 Vmware, Inc. Method and system that anticipates deleterious virtual-machine state changes within a virtualization layer

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7412492B1 (en) * 2001-09-12 2008-08-12 Vmware, Inc. Proportional share resource allocation with reduction of unproductive resource consumption
US8166473B2 (en) * 2005-04-21 2012-04-24 Microsoft Corporation Method and system for a resource negotiation between virtual machines
GB2426151B (en) * 2005-05-12 2007-09-05 Motorola Inc Optimizing network performance for communication services
US8286174B1 (en) * 2006-04-17 2012-10-09 Vmware, Inc. Executing a multicomponent software application on a virtualized computer platform
US20070271560A1 (en) * 2006-05-18 2007-11-22 Microsoft Corporation Deploying virtual machine to host based on workload characterizations
JP4875525B2 (en) * 2007-03-26 2012-02-15 株式会社日立製作所 Virtual computer system and program
US20090037879A1 (en) * 2007-07-31 2009-02-05 Arun Kwangil Iyengar Method and system for integrating model-based and search-based automatic software configuration
US8732706B2 (en) * 2007-11-27 2014-05-20 Hewlett-Packard Development Company, L.P. Generating governing metrics for resource provisioning
JP5229232B2 (en) * 2007-12-04 2013-07-03 富士通株式会社 Resource lending control device, resource lending method, and resource lending program
US8509415B2 (en) * 2009-03-02 2013-08-13 Twilio, Inc. Method and system for a multitenancy telephony network
US20110126197A1 (en) * 2009-11-25 2011-05-26 Novell, Inc. System and method for controlling cloud and virtualized data centers in an intelligent workload management system
US8705513B2 (en) * 2009-12-15 2014-04-22 At&T Intellectual Property I, L.P. Methods and apparatus to communicatively couple virtual private networks to virtual machines within distributive computing networks
US8433802B2 (en) * 2010-01-26 2013-04-30 International Business Machines Corporation System and method for fair and economical resource partitioning using virtual hypervisor
JP2011170679A (en) * 2010-02-19 2011-09-01 Runcom Systems Inc Virtual computer system and resource distribution control method of the same
JP5559582B2 (en) * 2010-03-25 2014-07-23 株式会社日立システムズ Virtual computer resource configuration changing system, method and program
US8572612B2 (en) * 2010-04-14 2013-10-29 International Business Machines Corporation Autonomic scaling of virtual machines in a cloud computing environment
US20120089736A1 (en) * 2010-10-08 2012-04-12 Electronics And Telecommunications Research Institute Apparatus and method for controlling computing capacity for multiple computers sharing resources with each other
JP2012099062A (en) * 2010-11-05 2012-05-24 Hitachi Ltd Service cooperation system and information processing system
US20120221454A1 (en) * 2011-02-28 2012-08-30 Morgan Christopher Edwin Systems and methods for generating marketplace brokerage exchange of excess subscribed resources using dynamic subscription periods

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11343140B2 (en) 2014-06-26 2022-05-24 Vmware, Inc. Methods and apparatus to scale application deployments in cloud computing environments
US10855534B2 (en) 2014-06-26 2020-12-01 Vmware, Inc. Methods and apparatus to scale application deployments in cloud computing environments
US20150381711A1 (en) * 2014-06-26 2015-12-31 Vmware, Inc. Methods and apparatus to scale application deployments in cloud computing environments
US10097410B2 (en) * 2014-06-26 2018-10-09 Vmware, Inc. Methods and apparatus to scale application deployments in cloud computing environments
US11743116B2 (en) 2014-06-26 2023-08-29 Vmware, Inc. Methods and apparatus to scale application deployments in cloud computing environments
US9547534B2 (en) * 2014-10-10 2017-01-17 International Business Machines Corporation Autoscaling applications in shared cloud resources
CN109313572A (en) * 2016-05-17 2019-02-05 Amazon Technologies, Inc. Versatile autoscaling
US10135837B2 (en) 2016-05-17 2018-11-20 Amazon Technologies, Inc. Versatile autoscaling for containers
US10397240B2 (en) 2016-05-17 2019-08-27 Amazon Technologies, Inc. Versatile autoscaling for containers
US10069869B2 (en) 2016-05-17 2018-09-04 Amazon Technologies, Inc. Versatile autoscaling
WO2017200878A1 (en) * 2016-05-17 2017-11-23 Amazon Technologies, Inc. Versatile autoscaling
US10979436B2 (en) 2016-05-17 2021-04-13 Amazon Technologies, Inc. Versatile autoscaling for containers
US10412022B1 (en) 2016-10-19 2019-09-10 Amazon Technologies, Inc. On-premises scaling using a versatile scaling service and an application programming interface management service
US10409642B1 (en) 2016-11-22 2019-09-10 Amazon Technologies, Inc. Customer resource monitoring for versatile scaling service scaling policy recommendations
US11347549B2 (en) 2016-11-22 2022-05-31 Amazon Technologies, Inc. Customer resource monitoring for versatile scaling service scaling policy recommendations
US10701142B2 (en) 2017-08-31 2020-06-30 Genesys Telecommunications Laboratories, Inc. Systems and methods for load balancing across media server instances
AU2018326701B9 (en) * 2017-08-31 2021-11-25 Genesys Cloud Services Holdings II, LLC Systems and methods for load balancing across media server instances
US11303703B2 (en) 2017-08-31 2022-04-12 Genesys Telecommunications Laboratories, Inc. Systems and methods for load balancing across media server instances
AU2018326701B2 (en) * 2017-08-31 2021-07-08 Genesys Cloud Services Holdings II, LLC Systems and methods for load balancing across media server instances
WO2019046642A1 (en) * 2017-08-31 2019-03-07 Genesys Telecommunications Laboratories, Inc. Systems and methods for load balancing across media server instances
US20190386895A1 (en) * 2018-06-13 2019-12-19 At&T Intellectual Property I, L.P. East-west traffic monitoring solutions for the microservice virtualized data center lan
US11228643B2 (en) * 2019-06-04 2022-01-18 Capital One Services, Llc System and method for fast application auto-scaling
US20220124145A1 (en) * 2019-06-04 2022-04-21 Capital One Services, Llc System and method for fast application auto-scaling
US11888927B2 (en) * 2019-06-04 2024-01-30 Capital One Services, Llc System and method for fast application auto-scaling

Also Published As

Publication number Publication date
US8825550B2 (en) 2014-09-02
BR112015003786A2 (en) 2017-07-04
IN2015DN01497A (en) 2015-07-03
AU2016277719B2 (en) 2017-12-07
CA2882531C (en) 2017-05-09
AU2013305544B2 (en) 2016-09-29
AU2013305544A1 (en) 2015-04-09
WO2014032031A2 (en) 2014-02-27
CN110308990A (en) 2019-10-08
EP2888663A4 (en) 2016-04-20
RU2616167C2 (en) 2017-04-12
RU2015110044A (en) 2016-10-20
WO2014032031A3 (en) 2014-05-08
US20140058871A1 (en) 2014-02-27
EP2888663A2 (en) 2015-07-01
JP2015529918A (en) 2015-10-08
AU2016277719A1 (en) 2017-02-02
SG10201606964VA (en) 2016-10-28
SG11201501288SA (en) 2015-04-29
CA2882531A1 (en) 2014-02-27
JP6144346B2 (en) 2017-06-07
CN104620222A (en) 2015-05-13

Similar Documents

Publication Publication Date Title
AU2016277719B2 (en) Scaling a virtual machine instance
US10248461B2 (en) Termination policies for scaling compute resources
US11777867B2 (en) Managing committed request rates for shared resources
US11099877B2 (en) Predictively provisioning cloud computing resources for virtual machines
US11146498B2 (en) Distributed resource scheduling based on network utilization
US9176764B1 (en) Managing memory in virtualized environments
US10365955B2 (en) Resource allocation in cloud environment
US9830192B1 (en) Managing application performance in virtualization systems
US10120714B1 (en) Customizing computing resources for application workload
US9692663B2 (en) Methods, systems, and computer program products for user side optimization of acquisition of virtualized resources
US20180063026A1 (en) Capacity optimization in an automated resource-exchange system
US10719235B1 (en) Managing volume placement on disparate hardware
EP3889775A1 (en) Cloud resource utilization management
US9817756B1 (en) Managing memory in virtualized environments
US20230007092A1 (en) Prediction-based resource provisioning in a cloud environment
Al-E'mari et al. Cloud Datacenter Selection Using Service Broker Policies: A Survey.

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION