US9842039B2 - Predictive load scaling for services - Google Patents
Predictive load scaling for services
- Publication number
- US9842039B2 (application US 14/307,759)
- Authority
- US
- United States
- Prior art keywords
- time
- period
- operational metric
- metric measurements
- auto
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3034—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a storage system, e.g. DASD based or network based
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
- G06F11/3419—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/147—Network analysis or design for predicting network behaviour
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/50—Network service management, e.g. ensuring proper service fulfilment according to agreements
- H04L41/5041—Network service management, e.g. ensuring proper service fulfilment according to agreements characterised by the time relationship between creation and deployment of a service
- H04L41/5054—Automatic deployment of services triggered by the service manager, e.g. service implementation by automatic configuration of network components
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/70—Admission control; Resource allocation
- H04L47/76—Admission control; Resource allocation using dynamic resource allocation, e.g. in-call renegotiation requested by the user or requested by the network in response to changing network conditions
- H04L47/762—Admission control; Resource allocation using dynamic resource allocation, e.g. in-call renegotiation requested by the user or requested by the network in response to changing network conditions triggered by the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/70—Admission control; Resource allocation
- H04L47/83—Admission control; Resource allocation based on usage prediction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/4557—Distribution of virtual machine instances; Migration and load balancing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5019—Workload prediction
Definitions
- Cloud services are widely used to provide many types of functionality including hosting applications, providing access to data storage, providing web sites, email or other functionality.
- Cloud services typically run on a network of computer systems that may be located remotely to each other.
- the computer network may be configured to provide the various services using virtual machines.
- the services may be scaled by adding or removing virtual machines as needed. For instance, at times of peak load, additional virtual machines may be instantiated, while at times of reduced load, virtual machines may be shut down. These virtual machines are typically either brought up or taken down in a reactionary manner (i.e. reacting to current load), or are managed based on historical load data.
- Embodiments described herein are directed to determining an optimal number of concurrently running cloud resource instances and to providing an interactive interface that shows projected operational metric measurements.
- a computer system accesses metric information which identifies operational metric measurements for cloud resource instances over a first period of time prior to a present time.
- the computer system then accesses a second portion of metric information that identifies operational metric measurements for the cloud resource instances over a second period of time, where the second period of time is a period of time that occurred in the past but which corresponds to a specified future period of time.
- the computer system calculates projected operational metric measurements based on the identified operational metric measurements over the first period of time and further based on the identified operational metric measurements over the second period of time.
- the computer system determines, based on the projected operational metric measurements, a number of cloud resource instances that are to be concurrently running at a specified future point in time.
- a computer system provides an interactive interface that shows projected operational metric measurements.
- the computer system accesses operational metric measurement data over a specified time period.
- the computer system calculates projected operational metric measurements based on the accessed operational metric measurements and determines, based on the projected operational metric measurements, a number of cloud resource instances that are to be concurrently running at specified future points in time.
- the computer system then provides an interactive interface that displays the determined number of cloud resource instances that are to be concurrently running at the specified points in time.
- the interactive interface further allows input that changes operational metric settings and dynamically updates the determined number of concurrently running cloud resource instances.
- FIG. 1 illustrates a computer architecture in which embodiments described herein may operate including determining an optimal number of concurrently running cloud resource instances.
- FIG. 2 illustrates a flowchart of an example method for determining an optimal number of concurrently running cloud resource instances.
- FIG. 3 illustrates a flowchart of an example method for providing an interactive interface that shows projected operational metric measurements.
- FIG. 4 illustrates an embodiment of a time window for scale-up evaluation.
- FIG. 5 illustrates an embodiment of an interactive interface that displays a projected instance count.
- FIG. 6 illustrates an embodiment of an interactive interface that displays an impact preview.
- Embodiments described herein may implement various types of computing systems. These computing systems are now increasingly taking a wide variety of forms. Computing systems may, for example, be handheld devices, appliances, laptop computers, desktop computers, mainframes, distributed computing systems, or even devices that have not conventionally been considered a computing system.
- the term “computing system” is defined broadly as including any device or system (or combination thereof) that includes at least one physical and tangible processor, and a physical and tangible memory capable of having thereon computer-executable instructions that may be executed by the processor.
- a computing system may be distributed over a network environment and may include multiple constituent computing systems.
- a computing system 101 typically includes at least one processing unit 102 and memory 103 .
- the memory 103 may be physical system memory, which may be volatile, non-volatile, or some combination of the two.
- the term “memory” may also be used herein to refer to non-volatile mass storage such as physical storage media. If the computing system is distributed, the processing, memory and/or storage capability may be distributed as well.
- executable module can refer to software objects, routines, or methods that may be executed on the computing system.
- the different components, modules, engines, and services described herein may be implemented as objects or processes that execute on the computing system (e.g., as separate threads).
- embodiments are described with reference to acts that are performed by one or more computing systems. If such acts are implemented in software, one or more processors of the associated computing system that performs the act direct the operation of the computing system in response to having executed computer-executable instructions.
- such computer-executable instructions may be embodied on one or more computer-readable media that form a computer program product.
- An example of such an operation involves the manipulation of data.
- the computer-executable instructions (and the manipulated data) may be stored in the memory 103 of the computing system 101 .
- Computing system 101 may also contain communication channels that allow the computing system 101 to communicate with other message processors over a wired or wireless network.
- Embodiments described herein may comprise or utilize a special-purpose or general-purpose computer system that includes computer hardware, such as, for example, one or more processors and system memory, as discussed in greater detail below.
- the system memory may be included within the overall memory 103 .
- the system memory may also be referred to as “main memory”, and includes memory locations that are addressable by the at least one processing unit 102 over a memory bus in which case the address location is asserted on the memory bus itself.
- System memory has been traditionally volatile, but the principles described herein also apply in circumstances in which the system memory is partially, or even fully, non-volatile.
- Embodiments within the scope of the present invention also include physical and other computer-readable media for carrying or storing computer-executable instructions and/or data structures.
- Such computer-readable media can be any available media that can be accessed by a general-purpose or special-purpose computer system.
- Computer-readable media that store computer-executable instructions and/or data structures are computer storage media.
- Computer-readable media that carry computer-executable instructions and/or data structures are transmission media.
- embodiments of the invention can comprise at least two distinctly different kinds of computer-readable media: computer storage media and transmission media.
- Computer storage media are physical hardware storage media that store computer-executable instructions and/or data structures.
- Physical hardware storage media include computer hardware, such as RAM, ROM, EEPROM, solid state drives (“SSDs”), flash memory, phase-change memory (“PCM”), optical disk storage, magnetic disk storage or other magnetic storage devices, or any other hardware storage device(s) which can be used to store program code in the form of computer-executable instructions or data structures, which can be accessed and executed by a general-purpose or special-purpose computer system to implement the disclosed functionality of the invention.
- Transmission media can include a network and/or data links which can be used to carry program code in the form of computer-executable instructions or data structures, and which can be accessed by a general-purpose or special-purpose computer system.
- a “network” is defined as one or more data links that enable the transport of electronic data between computer systems and/or modules and/or other electronic devices.
- program code in the form of computer-executable instructions or data structures can be transferred automatically from transmission media to computer storage media (or vice versa).
- program code in the form of computer-executable instructions or data structures received over a network or data link can be buffered in RAM within a network interface module (e.g., a “NIC”), and then eventually transferred to computer system RAM and/or to less volatile computer storage media at a computer system.
- computer storage media can be included in computer system components that also (or even primarily) utilize transmission media.
- Computer-executable instructions comprise, for example, instructions and data which, when executed at one or more processors, cause a general-purpose computer system, special-purpose computer system, or special-purpose processing device to perform a certain function or group of functions.
- Computer-executable instructions may be, for example, binaries, intermediate format instructions such as assembly language, or even source code.
- a computer system may include a plurality of constituent computer systems.
- program modules may be located in both local and remote memory storage devices.
- Cloud computing environments may be distributed, although this is not required. When distributed, cloud computing environments may be distributed internationally within an organization and/or have components possessed across multiple organizations.
- cloud computing is defined as a model for enabling on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services). The definition of “cloud computing” is not limited to any of the other numerous advantages that can be obtained from such a model when properly deployed.
- system architectures described herein can include a plurality of independent components that each contribute to the functionality of the system as a whole.
- This modularity allows for increased flexibility when approaching issues of platform scalability and, to this end, provides a variety of advantages.
- System complexity and growth can be managed more easily through the use of smaller-scale parts with limited functional scope.
- Platform fault tolerance is enhanced through the use of these loosely coupled modules.
- Individual components can be grown incrementally as business needs dictate. Modular development also translates to decreased time to market for new functionality. New functionality can be added or subtracted without impacting the core system.
- FIG. 1 illustrates a computer architecture 100 in which at least one embodiment may be employed.
- Computer architecture 100 includes computer system 101 .
- Computer system 101 may be any type of local or distributed computer system, including a cloud computing system.
- the computer system 101 includes modules for performing a variety of different functions.
- the communications module 104 may be configured to communicate with other computing systems.
- the communications module 104 may include any wired or wireless communication means that can receive and/or transmit data to or from other computing systems.
- the communications module 104 may be configured to interact with databases, mobile computing devices (such as mobile phones or tablets), embedded or other types of computing systems.
- the communications module 104 of computer system 101 may be configured to receive metric information from database 111 .
- the database 111 may be any type of local or distributed database, and the data stored within the database may be stored according to substantially any open or proprietary data storage standard.
- the metric information 112 may include various operational metric measurements 113 for different cloud resources.
- the metric information may include central processing unit (CPU) load over a period of time. This may be general CPU load or CPU load that is specific to the hosting of a given service or virtual machine. Other metric information may be related to memory, networking bandwidth, number of concurrently running virtual machines, number of CPU cores or any other cloud resource.
- These cloud resources 114 may be monitored by the computer system 101 over time, and the information identified from the monitoring may be stored as metric information 112 .
- the accessing module 105 of computer system 101 may be configured to access the metric information using a wired or wireless connection to the database 111 (in some cases, it should be noted, the database 111 may be local to computer system 101 ).
- the calculating module of computer system 101 may calculate a projected operational measurement 107 .
- This projected value may be an approximation or projection of what the CPU load (or other resource usage) will be at some point in the future. Such a projection may be used to determine how much hardware to have available for scaling on a given day, week or month. For example, many websites encounter a large number of guests on or near the holidays in November and December.
- this projected operational measurement 107 may be based on past load over certain periods of time. This will be explained in greater detail below.
- the determining module 108 of computer system 101 may use the projected operational measurement 107 to determine an optimal number 109 of cloud resource instances 114 (which includes virtual resources (e.g. VM instances) and/or physical resources (e.g. CPUs or network ports)). This determined optimal number of instances 109 may be provided to the interface instantiating module 110 which instantiates interactive interface 115 .
- the interactive interface 115 displays the optimal number of instances 109 for a given period of time.
- a user 116 may be able to view the number of instances 109 in context with other settings, such as settings that govern how a service is to be hosted.
- the user may interact with the interface 115 to change certain hosting settings and view an updated projection 107 of cloud resources that should be available if those hosting settings are used. In this manner, a user 116 may be able to make virtual changes to the hosting settings of an application and view the impact to existing cloud resources 114 if those changes were actually to be applied.
- computing system 101 is designed to determine the optimal or ideal number of cloud instances (e.g. concurrently running virtual machines) that the user 116 should have at any given point in time. This determination may be made by looking at previous time windows of the load on the cloud resources (perhaps in relation to a given service), based on known recurring patterns such as daily patterns (e.g. previous days in the week may have similar load at similar times of the day), weekly patterns (e.g. the same day and time in a previous week likely has similar load characteristics) or annual patterns (e.g. there may be some broader patterns, such as the school year or the holiday season).
- the interactive interface 115 may be configured to show to the user 116 time-series data of the performance of a service, an application, a specified cloud resource or any combination thereof.
- This data may include the number of instances hosting the service and may further include the aggregate load on the service or cloud resource.
- This is not the average across all of the cloud resource instances; instead, the load metrics are summed or aggregated across all of the instances in the system.
- the data further includes a projected instance count (i.e. 109 ), based on the aggregate load, the user-defined hosting settings and any prediction logic.
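As a minimal sketch of that aggregation (the function name and sample values are illustrative, not from the patent):

```python
def aggregate_load(per_instance_loads):
    """Aggregate load is the sum of the metric across all running
    instances, not the average."""
    return sum(per_instance_loads)

# CPU load (%) reported by three instances hosting the same service
aggregate_load([55, 40, 70])  # 165
```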
- the projected instance count 109 may be used in multiple ways.
- when the user 116 views the interactive interface 115, it shows them what scaling should have done at a given point in time.
- when the user 116 changes the hosting settings for their service, they can see a live preview of the effect the changes would have on the service, without having to commit and wait for those changes to take effect.
- This live preview is based on the historical data (i.e. the predictive side of the logic), as future data is obviously not yet available. Additionally, this information, along with the total effects of the new user-defined settings, may be presented to the user 116 as aggregate statistics.
- an optimal number of cloud resource instances 114 may be determined or predicted by looking at usage patterns, and specifically at weekly patterns, as opposed to monthly or daily patterns. These weekly patterns may be the most common patterns, and may be universal across all (or most) services. That said, the same logic could be applied to different periods of time (e.g. monthly or yearly). The time period may even be user-customizable such that the user can select certain hours, days, weeks, etc. over which to view cloud resource load.
- auto-scaling refers to automatically scaling a cloud resource (e.g. the number of VMs currently running to host a service, or the size of a particular VM hosting a service) up or down based on current need.
- scale up decisions made by looking at the past hour may be prioritized over any other decisions. This may be done to err on the side of better performance as opposed to cost savings, as users are typically more impacted by bad performance than by a small difference in cost savings.
- the first time window considered when determining or predicting future use may be the previous hour 401 .
- the system may look at what the projected usage is for the next hour. This is calculated by looking at what happened over the next 60 minutes in previous weeks, with an increasing discount the further back we go. For example, if the current time is 1 pm ( 402 ), then the previous hour 401 would be noon-1 pm and the next hour 403 would be the usage between 1 pm and 2 pm a week ago, two weeks ago, three weeks ago, and so on.
- Each previous week may be weighted at a different level. For example, the past week may be weighted at 0.5, two weeks ago at 0.25, three weeks ago at 0.125, and so on. By combining these weighted values, the system can determine a single projected CPU value for the upcoming hour. If the projected cloud resource load is above the threshold that the user has defined, then a scale up action may take place.
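The weighted combination above can be sketched as follows; the 0.5/0.25/0.125 weights come from the example, while the function names and sample loads are illustrative. Note the weights form a geometric series summing to less than 1, so a real implementation might normalize them:

```python
def project_next_hour(weekly_loads):
    """Combine the same hour from previous weeks, most recent first,
    halving the weight for each week further back (0.5, 0.25, 0.125, ...)."""
    projected = 0.0
    weight = 0.5
    for load in weekly_loads:  # weekly_loads[0] is one week ago
        projected += weight * load
        weight /= 2
    return projected

def should_scale_up(weekly_loads, threshold):
    """Scale up if the projected load for the upcoming hour exceeds
    the user-defined threshold."""
    return project_next_hour(weekly_loads) > threshold

# CPU load (%) between 1 pm and 2 pm one, two and three weeks ago:
# 0.5*80 + 0.25*70 + 0.125*60 = 65, which exceeds a 60% threshold
should_scale_up([80.0, 70.0, 60.0], threshold=60.0)  # True
```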
- scale-down conditions may be evaluated.
- computer system 101 may be configured to only look at the previous hour (or other time increment). This ensures that we don't erroneously scale down just because last week there wasn't a load at that time.
- the system may scale down only if current usage and historical usage are both sufficiently low. This is a more aggressive approach to keeping performance high and optimizing performance over cost savings.
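The two scale-down policies just described might be sketched as follows (function names and thresholds are illustrative assumptions):

```python
def should_scale_down(last_hour_load, threshold):
    """Scale-down looks only at the previous hour: scale down when the
    recent load is below the user-defined threshold, regardless of what
    the load was at this time in previous weeks."""
    return last_hour_load < threshold

def should_scale_down_conservative(last_hour_load, historical_load, threshold):
    """Stricter variant: scale down only if both current usage and
    historical usage are sufficiently low, favoring performance over
    cost savings."""
    return last_hour_load < threshold and historical_load < threshold
```

With a 60% threshold, 30% recent load triggers a scale-down under the first policy, but not under the conservative one if the historical load at that hour was 70%.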
- a timeline may be shown that indicates what would have happened had an auto-scaling feature been enabled.
- two different time series of data may be stored: instance count and auto-scale status. Both may be accomplished by having a regular job that emits the state of the system at certain time increments (e.g. every five minutes). This information may then be used to calculate two separate lines as shown in the projected instance count 501 of FIG. 5 : auto-scaled instance count and non-auto-scaled instance count.
- the dotted line is zero for all data points where auto-scale is off, and is equal to the instance count where auto-scale is on.
- the other (solid) line is the opposite: it is zero when auto-scale is on and the instance count when auto-scale is off.
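The two plotted lines can be derived from the stored (instance count, auto-scale status) samples roughly as follows (names are illustrative):

```python
def split_instance_series(samples):
    """samples: (instance_count, autoscale_on) pairs emitted by the
    regular job (e.g. every five minutes). The dotted line equals the
    instance count where auto-scale is on and zero elsewhere; the solid
    line is the opposite."""
    dotted = [count if on else 0 for count, on in samples]
    solid = [count if not on else 0 for count, on in samples]
    return dotted, solid

dotted, solid = split_instance_series([(4, False), (5, True), (6, True)])
# dotted == [0, 5, 6]; solid == [4, 0, 0]
```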
- the aggregate load is also based on the time-series data of metrics reported from the users' system.
- the default metric is CPU usage, but it can be any metric that the user selects or defines.
- for example, if the user has defined a scale-up threshold of 60% CPU, the projected instance count will be the aggregate CPU load divided by 0.6. As such, this line changes as the user adjusts the auto-scale settings. This allows the user to preview the effect that a new set of options would have on performance, as well as the cost of applying those settings.
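Assuming a 60% CPU threshold as in the example, the projected-count calculation might look like this sketch; rounding up to whole instances is an assumption added here, not stated in the patent:

```python
import math

def projected_instance_count(aggregate_cpu_load, threshold=0.6):
    """Projected instance count: aggregate CPU load (in units of one
    fully loaded instance) divided by the threshold, rounded up to a
    whole number of instances (an assumption of this sketch)."""
    return math.ceil(aggregate_cpu_load / threshold)

projected_instance_count(2.5)  # 2.5 instances' worth of load / 0.6 -> 5
```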
- the interactive interface 115 may be configured to show to the user rolled up statistics on the overall success of proactive auto-scale.
- the aggregate statistics may include two values: cost without auto-scale and cost with auto-scale. Cost without is calculated by multiplying the non-auto-scaled instance count by the per-unit cost of the virtual machines. If there is no data point for the non-auto-scale instance cost, the maximum of the instance count line is used to fill in the data. The cost with auto-scale is calculated by multiplying the projected instance count by the per-unit cost of the virtual machines.
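A sketch of these two aggregate cost statistics; representing missing samples as `None` and filling them with the maximum of the available non-auto-scaled counts are assumptions of this sketch:

```python
def aggregate_cost_stats(non_autoscaled_counts, projected_counts, unit_cost):
    """Cost without auto-scale: non-auto-scaled instance count times the
    per-unit VM cost, with missing samples filled by the maximum of the
    instance-count line. Cost with auto-scale: projected instance count
    times the same per-unit cost."""
    fill = max(c for c in non_autoscaled_counts if c is not None)
    cost_without = sum(
        (c if c is not None else fill) * unit_cost for c in non_autoscaled_counts)
    cost_with = sum(c * unit_cost for c in projected_counts)
    return cost_without, cost_with

aggregate_cost_stats([4, None, 3], [2, 2, 2], unit_cost=10)
# missing sample filled with 4 -> ((4 + 4 + 3) * 10, (2 + 2 + 2) * 10) = (110, 60)
```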
- FIG. 2 illustrates a flowchart of a method 200 for determining an optimal number of concurrently running cloud resource instances. The method 200 will now be described with frequent reference to the components and data of environment 100 .
- Method 200 includes an act of accessing a first portion of metric information which identifies operational metric measurements for one or more cloud resource instances over a first period of time prior to a present time (act 210 ).
- accessing module 105 of computer system 101 may access metric information 112 which includes operational metric measurements 113 for various cloud resource instances 114 .
- the metric information 112 may include metric information for the prior hour, for example.
- the metric information may correspond to the CPU load or other measurement over the previous hour 401 .
- Method 200 includes an act of accessing a second portion of metric information that identifies operational metric measurements for the one or more cloud resource instances over at least a second period of time, the second period of time comprising a period of time that occurred in the past but which corresponds to a specified future period of time (act 220 ).
- the accessing module 105 of computer system 101 may access another portion of metric information 112 that shows various operational characteristics of one or more cloud resource instances over a period of time.
- the period of time may correspond to a period that occurred in the past but which corresponds to a future period of time. Thus, if the current time is 10 am on Tuesday, April 8th (e.g. 402 in FIG. 4), the period of time is said to have occurred in the past (e.g. on Tuesday, April 1st, from 10 am-11 am), but corresponds to the future period of time Tuesday, April 8th, from 10 am-11 am.
- This second period of time may be selected or customized by a user.
- the second period of time ( 403 ) is used for predicting future load (or other measurement), while the first period of time ( 401 ) is used for reacting to past load (in most cases, in the very recent past).
- the second period of time may be specified by user 116 and may be a day, a week, a month, a year, or some other specified timeframe (e.g. a weekend, six months, an hour and twelve minutes, etc.). Older operational metric measurements may be weighted progressively less than newer operational metric measurements.
- second time periods 403 that correspond to a time that occurred in the past but also correspond to a future time may each be weighted progressively less for each older operational measurement.
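The progressive weighting of older measurements can be sketched as a decay-weighted average. The exponential decay factor below is an assumed choice; the description only requires that older operational metric measurements count progressively less than newer ones.

```python
def weighted_history(samples, decay=0.5):
    """samples is newest-first: last week's measurement for this time
    slot, then two weeks ago, and so on. Each older sample is weighted
    progressively less (assumed exponential decay)."""
    weights = [decay ** i for i in range(len(samples))]
    total = sum(w * s for w, s in zip(weights, samples))
    return total / sum(weights)
```

With `decay=0.5`, last week's sample counts twice as much as the sample from two weeks ago, and so on down the history.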
- those operational metric measurements 113 that are identified over the first period of time 401 are prioritized over the identified operational metric measurements of the second period of time 403 when calculating the projected operational metric measurements 107 .
- This prioritization may lead to performance of a hosted service being prioritized over potential cost savings that could be realized if a certain number of cloud resources (e.g. VMs) were scaled down in an auto-scale operation.
- Method 200 next includes an act of calculating one or more projected operational metric measurements based on the identified operational metric measurements over the first period of time and further based on the identified operational metric measurements over the second period of time (predictive tuning) (act 230 ).
- calculating module 106 of computer system 101 may calculate projected operational metric measurements 107 based on the identified operational metric measurements 113 over the first period of time (e.g. 401 ) (reactive tuning) and further based on the identified operational metric measurements over the second period of time (predictive tuning) (e.g. 403 ).
- current operational metric measurements may also be taken into account (e.g. at 402 ).
- the measurements over the first period of time (i.e. the reactive measurements) may not, on their own, anticipate transitions that occur quickly. For example, if a website is offering a midnight deal, demand may change greatly from the load seen over the previous 11 pm-12 am hour. Similarly, traffic may increase for a service that provides applications used by workers: demand may go up substantially in the morning and may drop substantially in the evening when workers go home. Accordingly, the calculating module 106 may look not only at the recent past, but also at what happened in the next hour (or other timeframe) in the past (e.g. what happened in the next hour, one day ago, or one week ago). In this manner, the calculating module 106 can provide a projected operational measurement 107 that includes a reactive measurement and a predictive measurement.
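One way to read this combination is a weighted blend of the reactive signal (the prior hour) and the predictive signal (the same slot in past days or weeks). The 70/30 split below is an assumed, configurable weighting illustrating the deference given to recent measurements; it is not a value specified in the patent.

```python
def project_load(reactive_load, predictive_load, reactive_weight=0.7):
    """Blend the load measured over the prior hour (reactive) with the
    load seen at the corresponding hour in the past (predictive).
    reactive_weight > 0.5 encodes the deference given to the recent
    past; the 0.7 default is an assumed, user-configurable value."""
    return (reactive_weight * reactive_load
            + (1.0 - reactive_weight) * predictive_load)
```

A user or administrator policy could raise or lower `reactive_weight` to decide which measurements are given deference, matching the configurable prioritization described below.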
- the determining module 108 may then determine, based on the projected operational metric measurements 107 , a number of cloud resource instances 109 that are to be concurrently running at one or more specified future points in time (act 240 ).
- the determined number of instances 109 may specify how many virtual machines, CPU cores, network ports, or other cloud resource instances 114 are to be running to handle the load predicted in the projected operational measurement 107.
- the projected operational measurement 107 may be more accurate than other methods, as it includes the identified operational metric measurements over the first period of time (i.e. the reactive measurements) and the identified operational metric measurements over the second period of time (i.e. the predictive measurements).
- cloud resource scaling actions may give deference to the load measured over the prior hour ( 401 ) over the load measured over periods of time that are farther back. At least in some cases, however, this may be a configurable setting, and a user or administrator may establish a policy to determine which measurements are given deference.
- the determining module 108 may trigger an auto-scaling action.
- an auto-scaling action may occur which reduces the number of concurrently running VM instances to five.
- the auto-scaling action may include adding or removing VM instances, and may be performed repeatedly as determined by the determining module 108 and as the projected operational measurement 107 changes.
- virtual machine instances may only be removed upon determining that the removal would not trigger other auto-scaling actions, so as to prevent flapping (where one auto-scaling rule indicates that instances are to be added and, once added, a second auto-scaling rule indicates that the newly added instances are to be removed).
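A minimal sketch of this flap-prevention check, under assumed names and an assumed scale-up threshold: a scale-down is applied only if the load, redistributed over the smaller pool, would not immediately trip the scale-up rule.

```python
def safe_scale_down(current_count, target_count, aggregate_load,
                    scale_up_threshold=0.8):
    """Return the instance count to actually apply. A scale-down is
    rejected when the per-instance load after removal would cross the
    scale-up threshold, since the scale-up rule would then immediately
    add the instances back (flapping)."""
    if target_count >= current_count or target_count <= 0:
        return current_count  # not a scale-down; nothing to check
    load_after_removal = aggregate_load / target_count
    if load_after_removal > scale_up_threshold:
        return current_count  # removal would trigger a scale-up: skip it
    return target_count
```

For example, dropping from eight to five instances is accepted when the redistributed per-instance load stays at 0.6, but rejected when it would reach 0.9 and re-trigger the scale-up rule.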
- Policies may be implemented which specify that auto-scaling actions are prevented from removing VM instances. This prioritizes health over resource savings, as VM instances are not scaled down even in times of reduced load.
- Auto-scaling actions may further be configured to increase or decrease the size of one or more currently running virtual machine instances, instead of powering existing instances down or powering up new ones. Increasing the size may include increasing the number of available CPUs or CPU cores, the memory, storage, or networking capacity, or the quantity of other resources.
- the projections made by the calculating module 106 may be displayed in an interactive interface 115 , as will be explained further below with regard to method 300 of FIG. 3 .
- FIG. 3 illustrates a flowchart of a method 300 for providing an interactive interface that shows projected operational metric measurements. The method 300 will now be described with frequent reference to the components and data of environment 100.
- Method 300 includes an act of accessing one or more portions of operational metric measurement data over at least one time period (act 310 ).
- accessing module 105 of computer system 101 may access operational metric measurement data 113 that includes metric data for one or more cloud resources over a period of time.
- the calculating module 106 of computer system 101 may then calculate one or more projected operational metric measurements 107 based on the accessed operational metric measurements (act 320 ). In this calculation, reactive and/or predictive calculations may be used.
- the determining module 108 may then determine, based on the projected operational metric measurements 107 , a number of cloud resource instances that are to be concurrently running at one or more specified future points in time (act 330 ).
- the interface instantiating module 110 may then instantiate an interactive interface 115 that displays the determined number of cloud resource instances 109 that are to be concurrently running at the one or more specified points in time.
- the interactive interface further allows input (e.g. from user 116 ) that changes operational metric hosting or auto-scale settings and dynamically updates the determined number of concurrently running cloud resource instances (act 340 ).
- a user may specify different actions 604 that are to occur or different metric settings that are to be maintained when hosting a service or otherwise using cloud resources.
- the user may specify a target CPU load range 601 that is to be maintained for each CPU, or may specify a target queue 602 , or time to wait after scaling up or down.
- the user may turn auto-scaling on or off using switch 605 , and may view the projected statistics or measurements at 603 .
- the statistics may include an indication of the cost to run the cloud resources with auto-scaling on and with auto-scaling turned off.
- the interactive interface may also show historical operational metric measurement data for a specified time period (i.e. what actually happened during that timeframe) and further show an indication of the number of virtual machine instances that would have been concurrently running had auto-scaling been applied during the time period (as shown in FIG. 5, where the dotted line shows the number had auto-scaling been on, while the solid line shows the actual measurements). It can be seen in FIG. 5 that had auto-scaling been on between 6 pm and 9 pm, the number of concurrently running instances would have dropped, leading to a cost savings. Accordingly, users may use the interactive interface to see what actually happened, what would have happened had auto-scaling been turned on, and what would happen if certain settings were applied to the cloud resource(s).
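The "what would have happened" dotted line can be sketched as a replay of the auto-scale rule over recorded load samples; the target-CPU rule and the 0.6 default are assumptions carried over from the earlier cost-preview example.

```python
import math

def replay_autoscale(historical_loads, target_cpu=0.6):
    """For each recorded aggregate-load sample, compute the instance
    count auto-scaling would have chosen (the dotted line in FIG. 5),
    ready to be plotted against the recorded actual counts."""
    return [max(1, math.ceil(load / target_cpu))
            for load in historical_loads]
```

Comparing this series against the recorded instance counts sample by sample yields the cost-savings view described above.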
- the interactive interface 115 provides an indication that an auto-scaling action has been triggered based on the determined number of virtual machine instances that are to be concurrently running. As mentioned above, these auto-scaling actions may take place when the determining module 108 determines that a certain number of virtual machine instances 109 are to be concurrently running. If more or fewer than that determined number are currently running, the computer system 101 triggers an auto-scaling action. Each time one of these auto-scaling actions occurs, the user 116 may be apprised in the interactive interface 115 . The user may use this information to change settings if, for example, auto-scaling actions are taking place too often.
- the interactive interface may further provide an option to choose which virtual machine instances are removed during an auto-scaling action. There may be situations where a user would like certain VM instances removed in a scale-down or certain VM instances added in a scale-up. Accordingly, the user may make such specifications using the interactive interface. Options may also be provided which allow the user to select a new size for those virtual machine instances that are to be changed during an auto-scaling action.
- methods, systems and computer program products are provided which determine an optimal number of cloud resource instances that should be concurrently running at any given point in time. Moreover, methods, systems and computer program products are provided which provide an interactive interface that shows current and projected operational metric measurements.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/307,759 US9842039B2 (en) | 2014-03-31 | 2014-06-18 | Predictive load scaling for services |
PCT/US2015/022605 WO2015153243A1 (fr) | 2014-03-31 | 2015-03-26 | Mise a l'echelle predictive de charges pour des services |
CN201580017765.9A CN106164864B (zh) | 2014-03-31 | 2015-03-26 | 服务的预测负载伸缩 |
BR112016020592A BR112016020592A8 (pt) | 2014-03-31 | 2015-03-26 | método, produto de programa de computador e sistema de computador para determinar um número ótimo de instâncias de recursos de nuvem simultaneamente executadas |
EP15716274.4A EP3126976A1 (fr) | 2014-03-31 | 2015-03-26 | Mise a l'echelle predictive de charges pour des services |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201461972703P | 2014-03-31 | 2014-03-31 | |
US14/307,759 US9842039B2 (en) | 2014-03-31 | 2014-06-18 | Predictive load scaling for services |
Publications (2)
Publication Number | Publication Date |
---|---|
US20150278061A1 US20150278061A1 (en) | 2015-10-01 |
US9842039B2 true US9842039B2 (en) | 2017-12-12 |
Family
ID=54190548
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/307,759 Active 2036-08-05 US9842039B2 (en) | 2014-03-31 | 2014-06-18 | Predictive load scaling for services |
Country Status (5)
Country | Link |
---|---|
US (1) | US9842039B2 (fr) |
EP (1) | EP3126976A1 (fr) |
CN (1) | CN106164864B (fr) |
BR (1) | BR112016020592A8 (fr) |
WO (1) | WO2015153243A1 (fr) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170337275A1 (en) * | 2016-05-17 | 2017-11-23 | International Business Machines Corporation | Allocating computing resources |
US20190342181A1 (en) * | 2018-05-03 | 2019-11-07 | Servicenow, Inc. | Prediction based on time-series data |
US10896069B2 (en) | 2018-03-16 | 2021-01-19 | Citrix Systems, Inc. | Dynamically provisioning virtual machines from remote, multi-tier pool |
US20220300305A1 (en) * | 2021-03-16 | 2022-09-22 | Nerdio, Inc. | Systems and methods of auto-scaling a virtual desktop environment |
Families Citing this family (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9722945B2 (en) | 2014-03-31 | 2017-08-01 | Microsoft Technology Licensing, Llc | Dynamically identifying target capacity when scaling cloud resources |
US10432699B2 (en) | 2014-06-26 | 2019-10-01 | Vmware, Inc. | Crowd-sourced operational metric analysis of virtual appliances |
JP6394455B2 (ja) * | 2015-03-24 | 2018-09-26 | 富士通株式会社 | 情報処理システム、管理装置およびプログラム |
US11283697B1 (en) | 2015-03-24 | 2022-03-22 | Vmware, Inc. | Scalable real time metrics management |
US10594562B1 (en) * | 2015-08-25 | 2020-03-17 | Vmware, Inc. | Intelligent autoscale of services |
US10097437B2 (en) * | 2015-10-30 | 2018-10-09 | Citrix Systems, Inc. | Efficient management of virtualized session resources |
US10411969B2 (en) * | 2016-10-03 | 2019-09-10 | Microsoft Technology Licensing, Llc | Backend resource costs for online service offerings |
US11277483B2 (en) * | 2017-03-31 | 2022-03-15 | Microsoft Technology Licensing, Llc | Assessing user activity using dynamic windowed forecasting on historical usage |
US10445208B2 (en) * | 2017-06-23 | 2019-10-15 | Microsoft Technology Licensing, Llc | Tunable, efficient monitoring of capacity usage in distributed storage systems |
US10445209B2 (en) * | 2017-09-08 | 2019-10-15 | Accenture Global Solutions Limited | Prescriptive analytics based activation timetable stack for cloud computing resource scheduling |
JP6962138B2 (ja) * | 2017-11-06 | 2021-11-05 | 富士通株式会社 | 情報処理装置、情報処理システム及びプログラム |
US10671621B2 (en) | 2017-12-08 | 2020-06-02 | Microsoft Technology Licensing, Llc | Predictive scaling for cloud applications |
US10922141B2 (en) | 2017-12-11 | 2021-02-16 | Accenture Global Solutions Limited | Prescriptive analytics based committed compute reservation stack for cloud computing resource scheduling |
US10719344B2 (en) | 2018-01-03 | 2020-07-21 | Accenture Global Solutions Limited | Prescriptive analytics based compute sizing correction stack for cloud computing resource scheduling |
US10621004B2 (en) * | 2018-03-19 | 2020-04-14 | Accenture Global Solutions Limited | Resource control stack based system for multiple domain presentation of cloud computing resource control |
US10862774B2 (en) | 2018-05-29 | 2020-12-08 | Capital One Services, Llc | Utilizing machine learning to proactively scale cloud instances in a cloud computing environment |
EP3591526A1 (fr) * | 2018-07-05 | 2020-01-08 | Siemens Aktiengesellschaft | Procédé de mise à l'échelle d'une application dans une plateforme service (plateform-as-a-service, paas) |
CN109446041B (zh) * | 2018-09-25 | 2022-10-28 | 平安普惠企业管理有限公司 | 一种服务器压力预警方法、系统及终端设备 |
US11044180B2 (en) | 2018-10-26 | 2021-06-22 | Vmware, Inc. | Collecting samples hierarchically in a datacenter |
FR3093842B1 (fr) * | 2019-03-14 | 2021-09-10 | Amadeus | Procédé et système pour optimiser des groupements de machines virtuelles d’une plateforme informatique en nuage |
US10459757B1 (en) | 2019-05-13 | 2019-10-29 | Accenture Global Solutions Limited | Prescriptive cloud computing resource sizing based on multi-stream data sources |
US11582120B2 (en) | 2019-05-30 | 2023-02-14 | Vmware, Inc. | Partitioning health monitoring in a global server load balancing system |
US11036608B2 (en) | 2019-09-27 | 2021-06-15 | Appnomic Systems Private Limited | Identifying differences in resource usage across different versions of a software application |
US11811861B2 (en) | 2021-05-17 | 2023-11-07 | Vmware, Inc. | Dynamically updating load balancing criteria |
CN113190329A (zh) * | 2021-05-24 | 2021-07-30 | 青岛聚看云科技有限公司 | 服务器及容器云集群资源自动伸缩的方法 |
US11792155B2 (en) | 2021-06-14 | 2023-10-17 | Vmware, Inc. | Method and apparatus for enhanced client persistence in multi-site GSLB deployments |
US20230232195A1 (en) | 2022-01-19 | 2023-07-20 | Vmware, Inc. | Collective scaling of applications |
US12107821B2 (en) | 2022-07-14 | 2024-10-01 | VMware LLC | Two tier DNS |
Citations (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050044228A1 (en) | 2003-08-21 | 2005-02-24 | International Business Machines Corporation | Methods, systems, and media to expand resources available to a logical partition |
US20050086029A1 (en) | 2003-10-17 | 2005-04-21 | International Business Machines Corporation | Mechanism for on-line prediction of future performance measurements in a computer system |
US20080172671A1 (en) * | 2007-01-11 | 2008-07-17 | International Business Machines Corporation | Method and system for efficient management of resource utilization data in on-demand computing |
US7743001B1 (en) * | 2005-06-21 | 2010-06-22 | Amazon Technologies, Inc. | Method and system for dynamic pricing of web services utilization |
US20100332643A1 (en) | 2009-06-24 | 2010-12-30 | Red Hat Israel, Ltd. | Pre-Scheduling the Timelines of Virtual Machines |
US20110078303A1 (en) | 2009-09-30 | 2011-03-31 | Alcatel-Lucent Usa Inc. | Dynamic load balancing and scaling of allocated cloud resources in an enterprise network |
US8028051B2 (en) * | 2003-04-16 | 2011-09-27 | Fujitsu Limited | Apparatus for adjusting use resources of system and method thereof |
EP2381363A2 (fr) | 2010-04-26 | 2011-10-26 | VMware, Inc. | Architecture de plateforme en nuages |
US20120173709A1 (en) | 2011-01-05 | 2012-07-05 | Li Li | Seamless scaling of enterprise applications |
US20120204186A1 (en) * | 2011-02-09 | 2012-08-09 | International Business Machines Corporation | Processor resource capacity management in an information handling system |
US20120254443A1 (en) | 2011-03-30 | 2012-10-04 | International Business Machines Corporation | Information processing system, information processing apparatus, method of scaling, program, and recording medium |
US8286165B2 (en) | 2009-10-26 | 2012-10-09 | Hitachi, Ltd. | Server management apparatus and server management method |
US20120260019A1 (en) | 2011-04-07 | 2012-10-11 | Infosys Technologies Ltd. | Elastic provisioning of resources via distributed virtualization |
US8296434B1 (en) | 2009-05-28 | 2012-10-23 | Amazon Technologies, Inc. | Providing dynamically scaling computing load balancing |
US20120324092A1 (en) | 2011-06-14 | 2012-12-20 | International Business Machines Corporation | Forecasting capacity available for processing workloads in a networked computing environment |
US20130007753A1 (en) | 2011-06-28 | 2013-01-03 | Microsoft Corporation | Elastic scaling for cloud-hosted batch applications |
US8380880B2 (en) * | 2007-02-02 | 2013-02-19 | The Mathworks, Inc. | Scalable architecture |
US20130086273A1 (en) | 2011-10-04 | 2013-04-04 | Tier3, Inc. | Predictive two-dimensional autoscaling |
US8499066B1 (en) | 2010-11-19 | 2013-07-30 | Amazon Technologies, Inc. | Predicting long-term computing resource usage |
EP2624138A2 (fr) | 2012-01-31 | 2013-08-07 | VMware, Inc. | Attribution souple de ressources de calcul à des applications logicielles |
US8572612B2 (en) | 2010-04-14 | 2013-10-29 | International Business Machines Corporation | Autonomic scaling of virtual machines in a cloud computing environment |
US8589549B1 (en) * | 2005-06-21 | 2013-11-19 | Amazon Technologies, Inc. | Method and system for customer incentive-based management of computing resource utilization |
US20130318527A1 (en) * | 2011-08-15 | 2013-11-28 | Hitachi Systems, Ltd. | Virtual server control system and program |
US8606897B2 (en) * | 2010-05-28 | 2013-12-10 | Red Hat, Inc. | Systems and methods for exporting usage history data as input to a management platform of a target cloud-based network |
US20130346614A1 (en) | 2012-06-26 | 2013-12-26 | International Business Machines Corporation | Workload adaptive cloud computing resource allocation |
US20140040885A1 (en) | 2012-05-08 | 2014-02-06 | Adobe Systems Incorporated | Autonomous application-level auto-scaling in a cloud |
US8661182B2 (en) | 2011-05-26 | 2014-02-25 | Vmware, Inc. | Capacity and load analysis using storage attributes |
US20140082165A1 (en) | 2012-09-20 | 2014-03-20 | Michael David Marr | Automated profiling of resource usage |
WO2014047073A1 (fr) | 2012-09-20 | 2014-03-27 | Amazon Technologies, Inc. | Profilage automatisé d'utilisation de ressources |
US8756610B2 (en) * | 2011-12-30 | 2014-06-17 | International Business Machines Corporation | Dynamically scaling multi-tier applications vertically and horizontally in a cloud environment |
US8806501B2 (en) * | 2010-03-31 | 2014-08-12 | International Business Machines Corporation | Predictive dynamic system scheduling |
US20140310401A1 (en) | 2011-07-01 | 2014-10-16 | Jeffery Darrel Thomas | Method of and system for managing computing resources |
US20150113120A1 (en) * | 2013-10-18 | 2015-04-23 | Netflix, Inc. | Predictive auto scaling engine |
US20150281113A1 (en) | 2014-03-31 | 2015-10-01 | Microsoft Corporation | Dynamically identifying target capacity when scaling cloud resources |
US9207984B2 (en) * | 2009-03-31 | 2015-12-08 | Amazon Technologies, Inc. | Monitoring and automatic scaling of data volumes |
US9306870B1 (en) * | 2012-06-28 | 2016-04-05 | Amazon Technologies, Inc. | Emulating circuit switching in cloud networking environments |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101695050A (zh) * | 2009-10-19 | 2010-04-14 | 浪潮电子信息产业股份有限公司 | 一种基于网络流量自适应预测的动态负载均衡方法 |
CN102111284B (zh) * | 2009-12-28 | 2013-09-04 | 北京亿阳信通科技有限公司 | 电信业务量预测方法和装置 |
US8572623B2 (en) * | 2011-01-11 | 2013-10-29 | International Business Machines Corporation | Determining an optimal computing environment for running an image based on performance of similar images |
CN102711177A (zh) * | 2012-04-26 | 2012-10-03 | 北京邮电大学 | 基于业务预测的负载均衡方法 |
CN103425535B (zh) * | 2013-06-05 | 2016-08-10 | 浙江大学 | 云环境下的敏捷弹性伸缩方法 |
CN103559089B (zh) * | 2013-10-30 | 2016-06-08 | 南京邮电大学 | 一种基于服务等级协议约束的虚拟机需求预测实现方法 |
-
2014
- 2014-06-18 US US14/307,759 patent/US9842039B2/en active Active
-
2015
- 2015-03-26 WO PCT/US2015/022605 patent/WO2015153243A1/fr active Application Filing
- 2015-03-26 EP EP15716274.4A patent/EP3126976A1/fr not_active Ceased
- 2015-03-26 CN CN201580017765.9A patent/CN106164864B/zh active Active
- 2015-03-26 BR BR112016020592A patent/BR112016020592A8/pt not_active Application Discontinuation
Patent Citations (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8028051B2 (en) * | 2003-04-16 | 2011-09-27 | Fujitsu Limited | Apparatus for adjusting use resources of system and method thereof |
US20050044228A1 (en) | 2003-08-21 | 2005-02-24 | International Business Machines Corporation | Methods, systems, and media to expand resources available to a logical partition |
US20050086029A1 (en) | 2003-10-17 | 2005-04-21 | International Business Machines Corporation | Mechanism for on-line prediction of future performance measurements in a computer system |
US7743001B1 (en) * | 2005-06-21 | 2010-06-22 | Amazon Technologies, Inc. | Method and system for dynamic pricing of web services utilization |
US8589549B1 (en) * | 2005-06-21 | 2013-11-19 | Amazon Technologies, Inc. | Method and system for customer incentive-based management of computing resource utilization |
US20080172671A1 (en) * | 2007-01-11 | 2008-07-17 | International Business Machines Corporation | Method and system for efficient management of resource utilization data in on-demand computing |
US8380880B2 (en) * | 2007-02-02 | 2013-02-19 | The Mathworks, Inc. | Scalable architecture |
US9207984B2 (en) * | 2009-03-31 | 2015-12-08 | Amazon Technologies, Inc. | Monitoring and automatic scaling of data volumes |
US8296434B1 (en) | 2009-05-28 | 2012-10-23 | Amazon Technologies, Inc. | Providing dynamically scaling computing load balancing |
US20100332643A1 (en) | 2009-06-24 | 2010-12-30 | Red Hat Israel, Ltd. | Pre-Scheduling the Timelines of Virtual Machines |
US20110078303A1 (en) | 2009-09-30 | 2011-03-31 | Alcatel-Lucent Usa Inc. | Dynamic load balancing and scaling of allocated cloud resources in an enterprise network |
US8286165B2 (en) | 2009-10-26 | 2012-10-09 | Hitachi, Ltd. | Server management apparatus and server management method |
US8806501B2 (en) * | 2010-03-31 | 2014-08-12 | International Business Machines Corporation | Predictive dynamic system scheduling |
US8572612B2 (en) | 2010-04-14 | 2013-10-29 | International Business Machines Corporation | Autonomic scaling of virtual machines in a cloud computing environment |
EP2381363A2 (fr) | 2010-04-26 | 2011-10-26 | VMware, Inc. | Architecture de plateforme en nuages |
US8606897B2 (en) * | 2010-05-28 | 2013-12-10 | Red Hat, Inc. | Systems and methods for exporting usage history data as input to a management platform of a target cloud-based network |
US8499066B1 (en) | 2010-11-19 | 2013-07-30 | Amazon Technologies, Inc. | Predicting long-term computing resource usage |
US20120173709A1 (en) | 2011-01-05 | 2012-07-05 | Li Li | Seamless scaling of enterprise applications |
US20120204186A1 (en) * | 2011-02-09 | 2012-08-09 | International Business Machines Corporation | Processor resource capacity management in an information handling system |
US20120254443A1 (en) | 2011-03-30 | 2012-10-04 | International Business Machines Corporation | Information processing system, information processing apparatus, method of scaling, program, and recording medium |
US20120260019A1 (en) | 2011-04-07 | 2012-10-11 | Infosys Technologies Ltd. | Elastic provisioning of resources via distributed virtualization |
US8661182B2 (en) | 2011-05-26 | 2014-02-25 | Vmware, Inc. | Capacity and load analysis using storage attributes |
US20120324092A1 (en) | 2011-06-14 | 2012-12-20 | International Business Machines Corporation | Forecasting capacity available for processing workloads in a networked computing environment |
US20130007753A1 (en) | 2011-06-28 | 2013-01-03 | Microsoft Corporation | Elastic scaling for cloud-hosted batch applications |
US20140310401A1 (en) | 2011-07-01 | 2014-10-16 | Jeffery Darrel Thomas | Method of and system for managing computing resources |
US20130318527A1 (en) * | 2011-08-15 | 2013-11-28 | Hitachi Systems, Ltd. | Virtual server control system and program |
US20130086273A1 (en) | 2011-10-04 | 2013-04-04 | Tier3, Inc. | Predictive two-dimensional autoscaling |
US8756610B2 (en) * | 2011-12-30 | 2014-06-17 | International Business Machines Corporation | Dynamically scaling multi-tier applications vertically and horizontally in a cloud environment |
EP2624138A2 (fr) | 2012-01-31 | 2013-08-07 | VMware, Inc. | Attribution souple de ressources de calcul à des applications logicielles |
US9110728B2 (en) | 2012-01-31 | 2015-08-18 | Vmware, Inc. | Elastic allocation of computing resources to software applications |
US20140040885A1 (en) | 2012-05-08 | 2014-02-06 | Adobe Systems Incorporated | Autonomous application-level auto-scaling in a cloud |
US20130346614A1 (en) | 2012-06-26 | 2013-12-26 | International Business Machines Corporation | Workload adaptive cloud computing resource allocation |
US9306870B1 (en) * | 2012-06-28 | 2016-04-05 | Amazon Technologies, Inc. | Emulating circuit switching in cloud networking environments |
US20140082165A1 (en) | 2012-09-20 | 2014-03-20 | Michael David Marr | Automated profiling of resource usage |
WO2014047073A1 (fr) | 2012-09-20 | 2014-03-27 | Amazon Technologies, Inc. | Profilage automatisé d'utilisation de ressources |
US20150113120A1 (en) * | 2013-10-18 | 2015-04-23 | Netflix, Inc. | Predictive auto scaling engine |
US20150281113A1 (en) | 2014-03-31 | 2015-10-01 | Microsoft Corporation | Dynamically identifying target capacity when scaling cloud resources |
Non-Patent Citations (15)
Title |
---|
"How to Scale an Application", Published on: Jan. 22, 2014, Available at: http://www.windowsazure.com/en-us/documentation/articles/cloud-services-how-to-scale/. |
"International Preliminary Report on Patentability Issued in PCT Application No. PCT/US2015/022604", dated Feb. 11, 2016, 7 pages. |
"International Preliminary Report on Patentability Issued in PCT Application No. PCT/US2015/022605", dated Jul. 19, 2016, 7 Pages. |
"International Search Report and Written Opinion Issued in PCT Application No. PCT/US2015/022604", dated Jun. 18, 2015, 12 pages. |
"International Search Report and Written Opinion Issued in PCT Application No. PCT/US2015/022605", dated Jun. 22, 2015, 11 Pages. |
"Second Written Opinion Issued in PCT Application No. PCT/US2015/022605", dated Apr. 28, 2016, 6 Pages. |
Gandhi, et al., "AutoScale: Dynamic, Robust Capacity Management for Multi-Tier Data Centers", In Journal of ACM Transactions on Computer Systems, vol. 30, Issue 4, Apr. 2012, 31 pages. |
Keagle, Charles, "AWS Auto-Scaling", Available at least as early as Mar. 21, 2014. Available at <<http://2ndwatch.com/aws-auto-scaling/>>. |
Mao, et al., "Cloud Auto-scaling with Deadline and Budget Constraints", In 11th IEEE/ACM International Conference on Grid Computing (GRID), Oct. 25, 2010, 8 pages. |
Notice of Allowance dated Mar. 28, 2017 cited in U.S. Appl. No. 14/307,745. |
Office Action dated Sep. 13, 2016 cited in U.S. Appl. No. 14/307,745. |
Ramesh, et al., "Project Hoover: Auto-Scaling Streaming Map-Reduce Applications", In Proceedings of the Workshop on Management of Big Data Systems, Sep. 21, 2012, 7 pages. |
U.S. Appl. No. 61/972,706, filed Mar. 31, 2014, Siciliano et al. |
Yuan, et al., "Scryer: Netflix's Predictive Auto Scaling Engine-Part 2", Published on: Dec. 4, 2013, Available at: http://techblog.netflix.com/2013/12/scryer-netflixs-predictive-auto-scaling.html. |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170337275A1 (en) * | 2016-05-17 | 2017-11-23 | International Business Machines Corporation | Allocating computing resources |
US10896069B2 (en) | 2018-03-16 | 2021-01-19 | Citrix Systems, Inc. | Dynamically provisioning virtual machines from remote, multi-tier pool |
US11726833B2 (en) | 2018-03-16 | 2023-08-15 | Citrix Systems, Inc. | Dynamically provisioning virtual machines from remote, multi-tier pool |
US20190342181A1 (en) * | 2018-05-03 | 2019-11-07 | Servicenow, Inc. | Prediction based on time-series data |
US10819584B2 (en) * | 2018-05-03 | 2020-10-27 | Servicenow, Inc. | System and method for performing actions based on future predicted metric values generated from time-series data |
US11388064B2 (en) | 2018-05-03 | 2022-07-12 | Servicenow, Inc. | Prediction based on time-series data |
US20220300305A1 (en) * | 2021-03-16 | 2022-09-22 | Nerdio, Inc. | Systems and methods of auto-scaling a virtual desktop environment |
US11960913B2 (en) * | 2021-03-16 | 2024-04-16 | Nerdio, Inc. | Systems and methods of auto-scaling a virtual desktop environment |
Also Published As
Publication number | Publication date |
---|---|
EP3126976A1 (fr) | 2017-02-08 |
BR112016020592A2 (pt) | 2017-08-15 |
CN106164864B (zh) | 2020-03-10 |
WO2015153243A1 (fr) | 2015-10-08 |
US20150278061A1 (en) | 2015-10-01 |
CN106164864A (zh) | 2016-11-23 |
BR112016020592A8 (pt) | 2021-06-15 |
Similar Documents
Publication | Title |
---|---|
US9842039B2 (en) | Predictive load scaling for services |
EP3126975B1 (fr) | Dynamic identification of target capacity when scaling cloud resources |
EP3547126A1 (fr) | Information for cloud migration and optimization |
US9292354B2 (en) | Self-adjusting framework for managing device capacity |
US10908953B2 (en) | Automated generation of scheduling algorithms based on task relevance assessment |
US10958515B2 (en) | Assessment and dynamic provisioning of computing resources for multi-tiered application |
US9916135B2 (en) | Scaling a cloud infrastructure |
US11182216B2 (en) | Auto-scaling cloud-based computing clusters dynamically using multiple scaling decision makers |
US11016808B2 (en) | Multi-tenant license enforcement across job requests |
US9424171B1 (en) | Resource-constrained test automation |
US11836644B2 (en) | Abnormal air pollution emission prediction |
US11138552B2 (en) | Estimation of node processing capacity for order fulfillment |
CN110717075B (zh) | Prioritization of information technology infrastructure incidents |
US10782949B2 (en) | Risk aware application placement modeling and optimization in high turnover DevOps environments |
US10628766B2 (en) | Method and system for enabling dynamic capacity planning |
US10985987B2 (en) | Control of activities executed by endpoints based on conditions involving aggregated parameters |
CN114064262A (zh) | Method, device and program product for managing computing resources in a storage system |
US10057327B2 (en) | Controlled transfer of data over an elastic network |
CN112988455B (zh) | Method, device and computer program product for data backup |
McCreadie et al. | Leveraging data-driven infrastructure management to facilitate AIOps for big data applications and operations |
US20240086203A1 (en) | Sizing service for cloud migration to physical machine |
US11175825B1 (en) | Configuration-based alert correlation in storage networks |
US20240305523A1 (en) | Method to auto correct the default resource allocation of services in a migration environment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: MICROSOFT CORPORATION, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SICILIANO, STEPHEN;LAMANNA, CHARLES;GREBNOV, ILYA;REEL/FRAME:033127/0227. Effective date: 20140331 |
 | AS | Assignment | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034747/0417. Effective date: 20141014. Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:039025/0454. Effective date: 20141014 |
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
 | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |