US20140258546A1 - Method and apparatus for dynamically assigning resources of a distributed server infrastructure - Google Patents
- Publication number
- US20140258546A1 (U.S. application Ser. No. 14/350,609)
- Authority
- US
- United States
- Prior art keywords
- resources
- relative load
- server infrastructure
- load
- assigned portion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/70—Admission control; Resource allocation
- H04L47/76—Admission control; Resource allocation using dynamic resource allocation, e.g. in-call renegotiation requested by the user or requested by the network in response to changing network conditions
- H04L47/762—Admission control; Resource allocation using dynamic resource allocation, e.g. in-call renegotiation requested by the user or requested by the network in response to changing network conditions triggered by the network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
- G06F9/5088—Techniques for rebalancing the load in a distributed system involving task migration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
- H04L47/125—Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3495—Performance evaluation by tracing or monitoring for systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1008—Server selection for load balancing based on parameters of servers, e.g. available memory or workload
Definitions
- FIG. 5 depicts the average number of processed calls per 15 minutes, collected by the inventors from a local trunk group in June 2011. From this data, it can be deduced that static peak load dimensioning results in an average capacity usage of only 50% (averaged out over 24 hours).
- telecommunication services are also exposed to long-term load variations.
- Small and medium size carriers typically want to gradually increase the number of end users they support—starting for example with a pilot project that involves around ten thousand users, and gradually providing more infrastructure if the service becomes more successful.
- embodiments of the present invention pertain to a Cloud Scaling Feedback Control System (CSFCS) 150 that implements dynamic scaling behavior for “cloudified” stateful telecommunication clusters—to maximize their utility (thus reducing the operational cost) while at the same time maintaining one or more key operating parameters (such as maximum response time).
- An embodiment of the CSFCS 150 is illustrated in more detail in FIG. 6.
- FIG. 1 illustrates an exemplary SIP-based telecommunication network comprising two exemplary user agents 101 , 201 interconnected by a single SIP domain 100 .
- the SIP domain 100 comprises a first Client Elasticity Gateway (CEG) 111 and a second CEG 211 , shielding a server cluster. Without loss of generality, the cluster is illustrated as containing three elastic SIP servers 121 - 123 .
- the number of allocated SIP servers may increase or decrease as a result of the application of the method according to the present invention.
- SIP CEG 111 plays the role of User Agent Server (UAS) in all its communication with the UA 101 , and the role of User Agent Client (UAC) in its relation with the SIP servers 121 - 123 of the elastic SIP cluster.
- the SIP CEG 111 thus conceals the elastic SIP servers 121 - 123 from the client 101 by acting as a single SIP server. It may include load balancing support and/or failover support by interacting with an Elasticity Control System (ECS) as described in European patent application no.
- the SIP CEG 111 terminates elasticity control messages originating from the elastic SIP cluster 121 - 123 , so it conceals the dynamics of the elastic SIP cluster from the UA 101 —including instructions to redirect messages to another SIP server.
- the CSFCS 150 according to the present invention may act in addition to or in replacement of the ECS of the cited application; the CSFCS 150 according to the present invention may in fact coincide with the ECS.
- Cloud scaling support (such as offered by Amazon Web Services™, Google App Engine™ and Heroku™) provides the required ingredients to build application-specific feedback control systems that automatically increase and decrease the amount of allocated cloud instances—being virtual machines, containers or service instances.
- Cloud load balancers distribute incoming traffic across these cloud instances, concealing their existence from client applications.
- Cloud monitoring components observe these cloud instances and report on metrics such as CPU utilization, response-time, drop-rate, queue lengths and request counts.
- APIs are offered to create and release service instances, and to automatically initiate these operations when collected metrics exceed specified thresholds.
- a feedback process is depicted in FIG. 2 .
- the “elasticity control” 230 calculates how many (cloud) instances are currently needed (denoted as Δx(i) in FIG. 2). If a global measurement exceeds 240 (upper branch) a specified high threshold (high-water mark), the feedback system instructs the (cloud) infrastructure to acquire new resources and to launch new service instances 250. Similarly, if the global measurement drops below 240 (lower branch) the low threshold (low-water mark), the feedback control system instructs the (cloud) infrastructure to release spare resources 269.
- the elasticity control determines how many resources will be needed in the near future, and pro-actively provisions the required resources to handle these load forecasts.
- embodiments of the process according to the invention provide extra steps compared to the process depicted in FIG. 2 . These extra steps handle session state.
- a cloud instance can only be released safely once it is not accommodating any session (or other execution) state anymore. Since waiting for all ongoing sessions to terminate may significantly delay the removal of the affected cloud instance (hence compromising the ability to maximize resource utility and reducing operational cost), our cloud scaling system coordinates the migration of these sessions towards other instances.
- An exemplary feedback flowchart, as might result from the application of the above improvements, is depicted in FIG. 3.
- the “elasticity control” 330 calculates how many (cloud) instances are currently needed (denoted as Δx(i) in FIG. 3). If a global measurement exceeds 340 (upper branch) a specified high threshold (high-water mark), the feedback system instructs the (cloud) infrastructure to acquire new resources and to launch new service instances 350. Subsequently, tasks or sessions are started on these newly launched instances.
- the load on the cloud infrastructure may be balanced by migrating 355 existing sessions from already running instances to the newly launched instances. Upon migrating these sessions, care must be taken to maintain session integrity and to correctly transfer state information. Similarly, if the global measurement drops below 340 (lower branch) the low threshold (low-water mark), the feedback control system instructs the (cloud) infrastructure to release spare resources 369. Prior to this release, any sessions that are still running on the resources marked for release are preferably migrated 355, along with the associated state information, to remaining instances.
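The migration-aware loop of FIG. 3 can be sketched in Python (an illustrative sketch only; `acquire_instance`, `release_instance` and `migrate_sessions` are hypothetical stand-ins for the cloud API and the session-migration machinery, not names from the patent):

```python
def scaling_step(load, low_water, high_water, instances,
                 acquire_instance, release_instance, migrate_sessions):
    """One iteration of the migration-aware feedback loop of FIG. 3.

    `load` is the observed relative load of the cluster; `low_water` and
    `high_water` bound the desired relative load. The three callables are
    placeholders for the cloud API and the session-migration machinery.
    """
    if load > high_water:
        # Scale out: acquire a fresh instance, then rebalance by moving
        # some running sessions (with their state) onto it.
        new = acquire_instance()
        instances.append(new)
        migrate_sessions(sources=instances[:-1], target=new)
    elif load < low_water and len(instances) > 1:
        # Scale in: drain the instance marked for release before
        # returning it to the infrastructure, so no session state is lost.
        victim = instances.pop()
        migrate_sessions(sources=[victim], target=instances[0])
        release_instance(victim)
    return instances
```

The essential ordering constraint is that `migrate_sessions` completes before `release_instance`, which is what distinguishes this loop from the stateless one of FIG. 2.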
- Embodiments of the present invention comprise session migration steps according to the methods described in those documents, which shall expressly be considered to be incorporated by this reference.
- A more elaborate embodiment of the process according to the invention is illustrated in FIG. 4.
- the illustrated method formally starts at the starting point labeled 300 , and returns to this point periodically with a frequency determined by the variable delay 399 .
- the delay element 399 is only a logical delay, representing any technical means suitable to implement the desired periodicity.
- the instantaneous load of the network 100 is determined 310 and compared 330 to a desired load or set point 320 .
- the desired load may be a value or a range stored in a memory, retrieved via a management interface, etc.
- the result of the comparison 330 is used to assess 340 whether it is necessary to increase or decrease the amount of assigned resources; the details of the two branches of the selection 340 have already been described above in connection with FIG. 3 .
- the above mentioned steps are periodically repeated with a symbolic delay 399 ; as illustrated by the dashed line, this delay 399 may be reconfigured in function of the measured load, and more in particular in function of the rate at which the measured load changes.
- the most current load observations 310 and/or any other available load data may be stored in an appropriate storage means 370 , such as an internal memory, a disk drive, etc.
- Another periodical process 380 - 390 provides an ongoing assessment of whether the allocated resources are in line with the usage that may be expected given the known time-recurring patterns (in particular, the expected usage in function of the time of day and the day of the week). Again, the delay element 389 is only a logical delay, representing any technical means suitable to implement the desired periodicity.
- the CSFCS may be configured to apply the most suitable look-ahead technique, depending on the actual cost of over- and under-estimation.
- the CSFCS can also be configured to dynamically adapt the sampling rate if needed.
- the CSFCS halves the sampling interval when the prediction error exceeds a specified threshold. When the error drops below a lower threshold, in contrast, the sampling interval is gradually increased. Simulations have indicated that this technique results in more accurate load predictions, but at a higher monitoring cost (more frequent sampling).
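The sampling-rate adaptation described above might be sketched as follows (illustrative only; the clamping bounds and the growth factor are assumptions, not values from the text):

```python
def adapt_interval(interval, error, err_high, err_low,
                   min_interval=1.0, max_interval=600.0, growth=1.25):
    """Adapt the monitoring interval to the observed prediction error.

    Halve the interval when the error exceeds the upper threshold;
    grow it gradually once the error falls below the lower threshold.
    `min_interval`, `max_interval` and `growth` are illustrative
    tuning parameters, not taken from the patent.
    """
    if error > err_high:
        interval /= 2.0          # sample faster: predictions are off
    elif error < err_low:
        interval *= growth       # sample slower: predictions are good
    return max(min_interval, min(interval, max_interval))
```

The asymmetry (fast halving, slow growth) reflects the trade-off the text describes: reacting quickly to degrading predictions is worth the extra monitoring cost.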
- the CSFCS may also exploit recurring load variation patterns.
- every monitoring result is added to a time-series representing a specific timestamp k for a specific class of days (weekdays, holidays, weekends, etc.).
- Kalman filters, linear extrapolation, and spline extrapolation are then used to predict the future load on timestamp k (e.g. tomorrow), taking into account the history of previous measurements that occurred at the same timestamp k.
- the CSFCS may fall back on limited look-ahead predictions taking into account only a few prior observations.
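The per-timestamp prediction and its limited look-ahead fallback can be illustrated with a simple least-squares extrapolation standing in for the Kalman, linear and spline predictors named above (the function name and `min_samples` are hypothetical):

```python
def predict_load(history, recent, min_samples=3):
    """Predict the next load value at one timestamp k.

    `history` holds previous measurements taken at the same timestamp k
    on the same class of day (weekday, weekend, ...). A least-squares
    linear extrapolation stands in for the Kalman/spline predictors
    named in the text. With too little history, fall back to a limited
    look-ahead using only the few most recent observations.
    """
    if len(history) >= min_samples:
        # Fit a line through (0, h[0]), (1, h[1]), ... and evaluate it
        # at the next index.
        n = len(history)
        x_mean = (n - 1) / 2.0
        y_mean = sum(history) / n
        num = sum((x - x_mean) * (y - y_mean)
                  for x, y in enumerate(history))
        den = sum((x - x_mean) ** 2 for x in range(n))
        slope = num / den
        return y_mean + slope * (n - x_mean)
    # Limited look-ahead: average of the last few observations.
    tail = recent[-min_samples:]
    return sum(tail) / len(tail)
```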
- embodiments of the cloud scaling feedback control system optimize the resource utilization ratio of a telco cloud, by dynamically acquiring and releasing resources in function of the observed and predicted load, while preserving session state.
- program storage devices e.g., digital data storage media, which are machine or computer readable and encode machine-executable or computer-executable programs of instructions, wherein said instructions perform some or all of the steps of said above-described methods.
- the program storage devices may be, e.g., digital memories, magnetic storage media such as a magnetic disks and magnetic tapes, hard drives, or optically readable digital data storage media.
- the embodiments are also intended to cover computers programmed to perform said steps of the above-described methods.
- FIG. 6 schematically illustrates a system, i.e. a CSFCS 150 , according to an embodiment of the present invention.
- the CSFCS 150 interacts with a network 100 comprising distributed server resources, such as the SIP network 100 illustrated in FIG. 1 .
- the CSFCS 150 is understood to have the necessary interfaces (hardware and software), as are known to the person skilled in the art of communication networking.
- a monitoring agent 151 retrieves information about the current load state of the infrastructure from the network 100
- a management agent 154 is configured to send instructions to the server infrastructure. Input from the monitoring agent 151 is compared to a set point 153 by a processor 152 , to determine whether the presently allocated infrastructure is under- or overloaded.
- the processor 152 will cause the management agent 154 to instruct the infrastructure to allocate more or less resources, as required, while ensuring state preservation by carrying out the necessary session migrations in a state-respecting manner.
- a scheduler 155 uses stored knowledge about recurring usage patterns to cause the management agent 154 to proactively instruct the infrastructure to allocate more or less resources, as required according to the usage expected in the (near) future.
- the skilled person will appreciate that one or more of the monitoring agent 151 , processor 152 , set point 153 , management agent 154 , and scheduler 155 may be implemented in a common hardware component.
- the CSFCS 150 and most particularly the processor 152 and the management agent 154 , may further be configured to carry out other functions related to the various embodiments of the method according to the invention as described above.
- processors may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software.
- the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared.
- processor or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and non volatile storage.
- Other hardware, conventional and/or custom, may also be included.
- any switches shown in the FIGS. are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
Description
- The present invention pertains to distributed computing systems, including distributed telecommunication systems, and more in particular to dynamic scaling of distributed computing infrastructures or “clouds”.
- Cloud computing has gained substantial momentum over the past few years, fueling technological innovation and creating considerable business impact. Public, private or hybrid cloud infrastructure shortens users' time to market (new hosting infrastructure is only a few mouse-clicks away), and claims to reduce their total cost of ownership by shifting the cost structure from higher capital expenditure to lower operating expenditure.
- One of the key advantages of cloud computing is the ability to build dynamically scaling systems. Virtualization technologies (including XEN, KVM, VMware, Solaris and Linux Containers) facilitate clustered computing services to acquire and release resources automatically. This enables dynamically right-sizing the amount of resources that are actually needed, instead of statically over-dimensioning the capacity of such clustered services. Some key advantages emerging from dynamic right-sizing include (1) the ability to reduce the services' operational cost and (2) the ability to gracefully handle unanticipated load surges without introducing opportunity loss by compromising the service's SLA.
- Although most existing dynamic scaling solutions have been targeting web and enterprise applications, clustered and “cloudified” telecommunication services (such as SIP farms hosted by public or private clouds) can also significantly benefit from the advantages of dynamic scaling. To guarantee carrier grade service execution, for instance, telecommunication operators typically over-dimension the employed resources—at the expense of reducing their resource utilization ratio, and thus raising their operational cost. This cost increases even more when the operator needs to provision sufficient resources to handle sporadic unanticipated load surges (caused by events with a significant social impact) or anticipated load spikes (e.g. caused by New Year's Eve texting).
- Accordingly, it is an object of embodiments of the present invention to provide methods and apparatus for proactive scaling of elastic (telecommunication) systems that more efficiently balance the tradeoff between reducing infrastructure cost, and providing enough overcapacity to deal with sudden increases in load.
- According to an aspect of the present invention, there is provided a method for dynamically assigning resources of a distributed server infrastructure, the method comprising the steps of: comparing an observed relative load of an assigned portion of the distributed server infrastructure with a desired relative load; if the observed relative load exceeds the desired relative load: assigning additional resources, and redistributing tasks from the assigned portion to the additional resources; and if the desired relative load exceeds the observed relative load: selecting removable resources, redistributing tasks from the removable resources to other resources in the assigned portion, and removing the removable resources from the assigned portion; wherein the redistributing of tasks is performed in such a way that state information related to the tasks is preserved.
- It is an advantage of the method according to the present invention that dynamic resource allocation can take place in computing environments in which preservation of session state information is of importance.
- “Load” or “relative load”, as used herein, can represent various metrics related to the degree to which the assigned resources are occupied, and more specifically the total work volume of the tasks being carried out by the resources. These metrics may include CPU usage, memory usage, response time, etc.
- The target (i.e., the desired relative load) is not necessarily a single value, but may instead be specified as a range, having a lower threshold (low water mark) and a higher threshold (high water mark). In this way, hysteresis can be implemented, which reduces the risk of instability in dynamic systems.
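As a minimal sketch, the water marks define a dead band around the target (names and threshold values are illustrative, not taken from the claims):

```python
def control_action(observed, low_water, high_water):
    """Map an observed relative load onto a scaling decision.

    The dead band between the low and high water marks implements the
    hysteresis described above: no action is taken while the load stays
    inside the target range, which damps oscillation.
    """
    if observed > high_water:
        return "scale_out"
    if observed < low_water:
        return "scale_in"
    return "hold"
```

With a band of, say, 0.4–0.8, a load oscillating between 0.5 and 0.7 triggers no reconfiguration at all, whereas a single target value would cause constant churn.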
- In an embodiment of the method according to the present invention, the steps are applied iteratively.
- It is an advantage of this embodiment, that the amount of allocated resources can be optimized on an ongoing basis.
- In an embodiment of the method according to the present invention, the frequency of the iterative application of the steps is varied in function of a difference between the observed relative load and the desired relative load.
- It is an advantage of this embodiment, that the allocation or removal of resources can happen faster in periods of rapidly growing or declining demand for resources.
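One way to vary the iteration frequency with the load error, as this embodiment suggests, is to shrink the inter-iteration delay as the error grows (a sketch under assumed tuning parameters; `base_delay` and `gain` are hypothetical):

```python
def iteration_delay(observed, desired, base_delay=60.0, gain=4.0,
                    min_delay=5.0):
    """Shorten the delay between control-loop iterations as the observed
    load drifts further from the desired load, so the loop reacts faster
    in periods of rapidly growing or declining demand.

    `base_delay`, `gain` and `min_delay` are illustrative tuning
    parameters, not values from the patent.
    """
    error = abs(observed - desired)
    return max(min_delay, base_delay / (1.0 + gain * error))
```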
- In an embodiment of the method according to the present invention, the distributed server infrastructure is used to deploy an elastic telecommunication system.
- In an embodiment of the method according to the present invention, the selecting removable resources comprises determining an individual load of resources among the assigned portion and selecting resources for which the individual load is lowest.
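This selection step can be sketched as follows (an illustrative sketch; the identifiers are hypothetical):

```python
def select_removable(loads, count):
    """Choose `count` resources to remove from the assigned portion.

    `loads` maps a resource identifier to its individual load; the
    least-loaded resources are preferred, since they carry the fewest
    sessions to migrate before release.
    """
    ranked = sorted(loads, key=loads.get)  # ascending by load
    return ranked[:count]
```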
- In an embodiment, the method according to the present invention further comprises assigning further additional resources in accordance with a time schedule, the time schedule representing recurring usage patterns for the distributed server infrastructure.
- It is an advantage of this embodiment, that the resource allocation is performed in a proactive manner, to avoid opportunity loss due to SLA violations in times of rapid increase in demand for resources.
- In a particular embodiment, the observed relative load is used to update the schedule.
- According to an aspect of the present invention, there is provided a computer program configured to cause a processor to carry out the method as described above.
- According to an aspect of the present invention, there is provided a system for dynamically assigning resources of a distributed server infrastructure, the system comprising: a monitoring agent configured to observe a relative load of an assigned portion of the distributed server infrastructure; a processor, operatively connected to the monitoring agent, the processor being configured to compare the observed relative load with a desired relative load; and a management agent, configured to transmit instructions to the distributed server infrastructure, and to act according to the following rules in response to the comparing: if the observed relative load exceeds the desired relative load: instruct the server infrastructure to assign additional resources, and redistribute tasks from the assigned portion to the additional resources; and if the desired relative load exceeds the observed relative load: select removable resources, redistribute tasks from the removable resources to other resources in the assigned portion, and instruct the server infrastructure to remove the removable resources from the assigned portion.
- The advantages of the system according to the present invention are analogous to those described above with respect to the method according to the present invention. Features of specific embodiments of the method according to the present invention may be applied to the system according to the present invention with similar benefits and advantages.
- In an embodiment, the system according to the present invention further comprises a scheduler operatively connected to said management agent, and said management agent is further configured to act according to the following rules in response to a signal from said scheduler: if said signal is indicative of an expected increase in demand for resources: instruct said server infrastructure to assign additional resources, and redistribute tasks from said assigned portion to said additional resources; and if said signal is indicative of an expected decrease in demand for resources: select removable resources, redistribute tasks from said removable resources to other resources in said assigned portion, and instruct said server infrastructure to remove said removable resources from said assigned portion.
- In an embodiment of the system according to the present invention, the distributed server infrastructure comprises a plurality of SIP servers.
- Some embodiments of apparatus and/or methods in accordance with embodiments of the present invention are now described, by way of example only, and with reference to the accompanying drawings, in which:
-
FIG. 1 schematically illustrates a network in which embodiments of the present invention may be deployed; -
FIG. 2 illustrates a control loop process; -
FIG. 3 illustrates a control loop process according to an embodiment of the present invention; -
FIG. 4 illustrates a control loop process according to another embodiment of the present invention; -
FIG. 5 illustrates the distribution of the number of processed calls per 15 minutes as measured during a month, with separate graphs for weekdays and week-ends; and -
FIG. 6 schematically illustrates an embodiment of the system according to the present invention. - Throughout the figures, like reference signs have been used to designate like elements.
-
FIG. 1 schematically illustrates an exemplary network in which embodiments of the present invention may be deployed. - Although the invention is hereinafter primarily described in terms of embodiments relating to telecommunication systems, in particular virtual SIP servers implemented in a “cloud” infrastructure, the skilled person will appreciate that the invention is not limited thereto. The invention may be applied to various kinds of distributed computing infrastructures, in particular where the concerned computing tasks are stateful.
- Based on daily observations, it can be deduced that short-term load variations of a communication system largely adhere to recurring patterns (based on end users' daily routines). To illustrate this,
FIG. 5 depicts the average number of processed calls per 15 minutes, collected by the inventors from a local trunk group in June 2011. From this data, it can be deduced that static peak load dimensioning results in an average capacity usage of only 50% (averaged out over 24 hours). - In addition to short-term load variations, telecommunication services are also exposed to long-term load variations. Small and medium size carriers, for instance, typically want to gradually increase the number of end users they support—starting for example with a pilot project that involves around ten thousand users, and gradually providing more infrastructure if the service becomes more successful. These examples illustrate that dynamically scaling clustered telecommunication services (depending on their current load) is a promising technique to (1) reduce their operational cost and (2) gracefully handle anticipated as well as unanticipated load surges.
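The effect of static peak dimensioning can be illustrated with a synthetic diurnal load curve (illustrative only, not the measured data of FIG. 5): when capacity is dimensioned for the peak, the average utilization is simply the mean load divided by the peak load.

```python
import math

# Synthetic 24-hour load curve sampled in 15-minute slots (96 samples);
# a half-sine is used purely for illustration.
samples = [math.sin(math.pi * t / 96) for t in range(96)]
peak = max(samples)                                  # static dimensioning target
utilization = sum(samples) / (len(samples) * peak)   # mean load / peak capacity
# utilization comes out well below 1 (about 0.64 for this particular curve)
```

The exact figure depends on the shape of the curve; the measured data of FIG. 5 yields roughly 50%.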
- According to the insight of the inventors, the value (and successful adoption) of dynamic scaling support for telecommunication services depends on
-
- (1) its ability to maximize resource utility (thus reducing the operational cost),
- (2) its ability to preserve the services' stringent carrier grade requirements (thus minimizing SLA violation penalties and the cost of losing customers), and
- (3) the ability to minimize the operating cost (overhead) of the scaling support.
- Furthermore, according to the insight of the inventors, successful adoption of dynamic scaling in telecommunication requires an ability to cope with the predominantly stateful nature of these telecommunication services. While stateless web applications or RESTful webservices can scale in or out by nature without breaking ongoing interactions, this is typically not the case for (call) stateful telecommunication services such as B2BUAs or SIP proxies controlling middle boxes that implement firewall and NAT functions. Before removing a (cloud) instance that belongs to a stateful telecommunication service, one needs to ensure this instance is driven to a safe execution state. Such a state is reached once all sessions currently being processed by the affected instance have been terminated (which may significantly delay the removal of the affected cloud instance), or by transparently migrating these sessions towards other service instances.
- To meet these requirements, embodiments of the present invention pertain to a Cloud Scaling Feedback Control System (CSFCS) 150 that implements dynamic scaling behavior for “cloudified” stateful telecommunication clusters—to maximize their utility (thus reducing the operational cost) while at the same time maintaining one or more key operating parameters (such as maximum response time). An embodiment of the CSFCS 150 is illustrated in more detail in
FIG. 6 . -
FIG. 1 illustrates an exemplary SIP-based telecommunication network comprising two exemplary user agents accessing a single SIP domain 100. The SIP domain 100 comprises a first Client Elasticity Gateway (CEG) 111 and a second CEG 211, shielding a server cluster. Without loss of generality, the cluster is illustrated as containing three elastic SIP servers 121-123. The number of allocated SIP servers may increase or decrease as a result of the application of the method according to the present invention. - Without loss of generality, we consider the interaction between the
first SIP CEG 111 and the topologically adjacent UA 101. SIP CEG 111 plays the role of User Agent Server (UAS) in all its communication with the UA 101, and the role of User Agent Client (UAC) in its relation with the SIP servers 121-123 of the elastic SIP cluster. The SIP CEG 111 thus conceals the elastic SIP servers 121-123 from the client 101 by acting as a single SIP server. It may include load balancing support and/or failover support by interacting with an Elasticity Control System (ECS) as described in European patent application no. 11 290 326.5 entitled “Method for transferring state information pertaining to a plurality of SIP conversations” in the name of the present Applicant. Further in accordance with the cited application, the SIP CEG 111 terminates elasticity control messages originating from the elastic SIP cluster 121-123, so it conceals the dynamics of the elastic SIP cluster from the UA 101—including instructions to redirect messages to another SIP server. - The
CSFCS 150 according to the present invention may act in addition to or in replacement of the ECS of the cited application; the CSFCS 150 according to the present invention may in fact coincide with the ECS. - Today's cloud scaling support (such as offered by Amazon Web Service™, Google App Engine™ and Heroku™) provides the required ingredients to build application specific feedback control systems that automatically increase and decrease the amount of allocated cloud instances—being virtual machines, containers or service instances. Cloud load balancers distribute incoming traffic across these cloud instances, concealing their existence from client applications. Cloud monitoring components observe these cloud instances and report on metrics such as CPU utilization, response-time, drop-rate, queue lengths and request counts. Additionally, APIs are offered to create and release service instances, and to automatically initiate these operations when collected metrics exceed specified thresholds. Although these building blocks enable the development of web and enterprise applications that automatically scale out and back, they do not offer a unified solution tailored towards stateful telecommunication services (such as SIP clusters) that need to meet the stringent carrier grade requirements listed above.
- For the sake of clarifying the invention, a feedback process is depicted in
FIG. 2 . Based on a specified set point 220 (defining the key operating parameters, such as average instance load or maximum response time) on the one hand, and monitoring data 210 reporting on operational metrics of the load balancer and/or the affected (cloud) instances on the other hand, the “elasticity control” 230 calculates how many (cloud) instances are currently needed (denoted as Δx(i) in FIG. 2). If a global measurement exceeds 240 (upper branch) a specified high threshold (high-water mark), the feedback system instructs the (cloud) infrastructure to acquire new resources and to launch new service instances 250. Similarly, if the global measurement drops below 240 (lower branch) the low threshold (low-water mark), the feedback control system instructs the (cloud) infrastructure to release spare resources 269.
extra resources 250 can be provisioned immediately. However, booting new cloud instances (such as virtual machines) and initiating the services these cloud instances are hosting takes introduces an extra delay (up to a few minutes). Without anticipating this delay, SLA requirements might be violated during the actual provisioning of new resources, which in turn breaks the stringent carrier grade requirements of telecommunication services. - It is thus advantageous to be able to predict short-term load surges. Based on these predictions, the elasticity control determines how much resources will be needed in the near future, and pro-actively provisions the required resources to handle these load forecasts.
- According to the insight of the inventors, it is advantageous to also take into account the potentially stateful nature of the distributed infrastructure, in particular in the case of telecommunication systems. Two contributions of the feedback system according to the present invention are now described, and may be deployed jointly or independently.
- Firstly, successful adoption of dynamic scaling support for telecommunication services highly depends on its ability to preserve the services' stringent carrier grade requirements and to minimize opportunity loss due to SLA violations. Instead of responding to load changes in a reactive manner (for instance when a high-water mark is exceeded), the present solution exploits the potential value of pro-active resource provisioning based on short-term load forecasting. Hence, embodiments of the present invention are based on the observation that daily call load variations usually adhere to recurring patterns (illustrated in
FIG. 5 ). This allows deducing load predictions (and the associated decisions to increase or decrease the amount of virtual servers) from a history of load observations. In case of unanticipated load surges that significantly diverge from these recurring patterns, we fall back to limited look-ahead predictions taking into account only a few prior observations. Simulations and experiments indicate that this pro-active resource provisioning significantly reduces the amount of SLA violations.
FIG. 2 . These extra steps handle session state. As explained above, a cloud instance can only be released safely once it is not accommodating any session (or other execution) state anymore. Since waiting for all ongoing sessions to terminate may significantly delay the removal of the affected cloud instance (hence compromising the ability to maximize resource utility and reducing operational cost), our cloud scaling system coordinates the migration of these sessions towards other instances. - An exemplary feedback flowchart, as might result from the application of the above improvements, is depicted in
FIG. 3 . Based on a specified set point 320 (defining the key operating parameters, such as average instance load or maximum response time) on the one hand, and monitoring data 310 reporting on operational metrics of the load balancer and/or the affected (cloud) instances on the other hand, the “elasticity control” 330 calculates how many (cloud) instances are currently needed (denoted as Δx(i) in FIG. 3). If a global measurement exceeds 340 (upper branch) a specified high threshold (high-water mark), the feedback system instructs the (cloud) infrastructure to acquire new resources and to launch new service instances 350. Subsequently, tasks or sessions are started on these newly launched instances. In addition to the assignment of fresh sessions to the newly launched instances, the load on the cloud infrastructure may be balanced by migrating 355 existing sessions from already running instances to the newly launched instances. Upon migrating these sessions, care must be taken to maintain session integrity and to correctly transfer state information. Similarly, if the global measurement drops below 340 (lower branch) the low threshold (low-water mark), the feedback control system instructs the (cloud) infrastructure to release spare resources 369. Prior to this release, any sessions that are still running on the resources marked for release are preferably migrated 355, along with the associated state information, to remaining instances.
-
- If an increase in the number of cloud instances is required, the CSFCS first invokes the cloud infrastructure to activate these new cloud instances (containing the telco service instances—such as SIP servers). Next, the CSFCS activates these new telco service instances and registers them to the load balancer(s)—from this point on they can accept new requests. Finally, the CSFCS rebalances ongoing sessions (if needed) to let the new telco service instances take on part of the load of their peers.
- If a decrease in the number of cloud instances is required, the CSFCS first prepares the safe removal of the affected telco service instances. This involves first waiting until all ongoing transactions are finished, and subsequently migrating ongoing sessions to the remaining servers. The CSFCS deactivates and deregisters the affected telco service instances—hence preventing them from accepting and processing new sessions. Finally, the CSFCS can safely instruct the cloud infrastructure to deactivate the cloud instances accommodating these quiescent service instances.
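The decrease procedure above can be sketched as an ordered sequence of operations. This is a hypothetical sketch; the method names are illustrative assumptions, not a real cloud API, and a call recorder is used so the ordering can be checked:

```python
class CallRecorder:
    """Test double that records which infrastructure calls happen, in order."""
    def __init__(self):
        self.steps = []

    def __getattr__(self, name):
        # Any attribute lookup returns a callable that records its own name.
        def record(*args, **kwargs):
            self.steps.append(name)
        return record


def scale_in(cloud, instance):
    """Safe removal of a stateful telco service instance, per the steps above."""
    cloud.finish_ongoing_transactions(instance)    # wait for in-flight transactions
    cloud.migrate_sessions(instance)               # move session state to peer servers
    cloud.deactivate_service(instance)             # stop accepting new sessions
    cloud.deregister_from_load_balancer(instance)  # remove from the balancer's pool
    cloud.release_cloud_instance(instance)         # instance is now quiescent
```

The essential design point is the ordering: session state leaves the instance before it is deactivated, and the cloud resources are released only once the service instance is quiescent.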
- Further details about the methods by which sessions may be migrated without loss of session information can be found in European patent applications EP 11 290 327.3 and EP 11 290 326.5, in the name of the Applicant. Embodiments of the present invention comprise session migration steps according to the methods described in those documents, which shall expressly be considered to be incorporated by this reference.
- Although the comparison between the observed load and the desired load is schematically represented in the Figures as a single comparison, this is done for clarity purposes only. It is possible to use a single threshold value to trigger both addition and removal of resources. However, it is advantageous to choose a low threshold and a high threshold which are not the same. The use of non-identical low and high thresholds implies that the “desired load” is in fact a range, and the method as described will act to keep the system in, or return it to, the desired operating range.
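The benefit of distinct low and high thresholds can be shown with a toy simulation (illustrative assumptions: each instance has unit capacity, so the relative load is the demand divided by the instance count):

```python
def simulate(demand, n, low, high, steps=6):
    """Track the instance count n under a threshold controller."""
    history = []
    for _ in range(steps):
        load = demand / n          # relative load of the assigned portion
        if load > high:
            n += 1                 # scale out
        elif load < low:
            n -= 1                 # scale in
        history.append(n)
    return history
```

With a single threshold (low == high == 0.95) a demand of 10 on 10 instances makes the controller oscillate between 10 and 11 instances, whereas a band (0.85/1.05) leaves the instance count stable.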
- A more elaborate embodiment of the process according to the invention is illustrated in
FIG. 4 . - The illustrated method formally starts at the starting point labeled 300, and returns to this point periodically with a frequency determined by the
variable delay 399. Thedelay element 399 is only a logical delay, representing any technical means suitable to implement the desired periodicity). - The instantaneous load of the
network 100 is determined 310 and compared 330 to a desired load or setpoint 320. The desired load may be a value or a range stored in a memory, retrieved via a management interface, etc. The result of thecomparison 330 is used to assess 340 whether it is necessary to increase or decrease the amount of assigned resources; the details of the two branches of theselection 340 have already been described above in connection withFIG. 3 . The above mentioned steps are periodically repeated with asymbolic delay 399; as illustrated by the dashed line, thisdelay 399 may be reconfigured in function of the measured load, and more in particular in function of the rate at which the measured load changes. - The most
current load observations 310 and/or any other available load data may be stored in an appropriate storage means 370, such as an internal memory, a disk drive, etc. Another periodical process 380-390 provides an ongoing assessment of whether the allocated resources are in line with the usage that may be expected given the known time-recurring patterns (in particular, the expected usage in function of the time of day and the day of the week). Again, thedelay element 389 is only a logical delay, representing any technical means suitable to implement the desired periodicity. - Various limited look-ahead load predictions have been evaluated, including linear extrapolation, spline extrapolation and adaptive Kalman predictions. Simulations have indicated that linear extrapolation presents very good results in terms of minimizing the occurrence of over-estimation (i.e., situations in which a higher load had been predicted than actually measured), while Kalman predictions present very good results in terms of minimizing the occurrence of under-estimation (i.e., avoiding situations in which a lower load had been predicted than actually measured).
- The CSFCS may be configured to apply the most suitable look-ahead technique, depending on the actual cost of over- and under-estimation.
- To further reduce the occurrence of over- and under-estimation, the CSFCS can also be configured to dynamically adapt the sampling rate if needed. In an embodiment, the CSFCS halves the sampling interval when the prediction error exceeds a specified threshold. When the error drops below a lower threshold, in contrast, the sampling interval is gradually increased. Simulations have indicated that this technique results in more accurate load predictions, but at a higher monitoring cost (more frequent sampling).
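The adaptive sampling rule above can be sketched as follows (the growth factor of 1.25 and the interval bounds are illustrative assumptions; only the halving on large error is taken from the text):

```python
def adapt_interval(interval, error, high_err, low_err,
                   min_s=1.0, max_s=60.0):
    """Adapt the sampling interval (in seconds) to the prediction error."""
    if error > high_err:
        return max(min_s, interval / 2)     # sample more often
    if error < low_err:
        return min(max_s, interval * 1.25)  # gradually relax monitoring cost
    return interval                         # error acceptable: no change
```

The asymmetry (halving down, but growing only gradually) reacts quickly to deteriorating predictions while backing off conservatively.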
- Besides supporting the above mentioned limited look-ahead predictions, the CSFCS may also exploit recurring load variation patterns. In an embodiment, every monitoring result is added to a time-series representing a specific timestamp k for a specific class of days (weekdays, holidays, weekends, etc.). Kalman filters, linear extrapolation, and spline extrapolation are then used to predict the future load on timestamp k (e.g. tomorrow), taking into account the history of previous measurements that occurred at the same timestamp k. In case of unanticipated load surges that significantly diverge from these recurring patterns, the CSFCS may fall back on limited look-ahead predictions taking into account only a few prior observations.
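The per-timestamp bookkeeping described above can be sketched as follows. Here the mean of the per-slot history is used as a stand-in for the Kalman/extrapolation predictors; the class and method names are illustrative assumptions:

```python
from collections import defaultdict
from statistics import mean


class RecurringLoadModel:
    """Keeps one time-series per (day class, timestamp k) slot and
    predicts the next value for that slot from its own history."""

    def __init__(self):
        self.series = defaultdict(list)   # (day_class, k) -> [loads]

    def record(self, day_class, k, load):
        """Add one monitoring result to the matching time-series."""
        self.series[(day_class, k)].append(load)

    def predict(self, day_class, k, default=0.0):
        """Predict tomorrow's load at timestamp k for this day class."""
        history = self.series[(day_class, k)]
        return mean(history) if history else default
```

A caller detecting a large divergence between this prediction and the live measurement would then switch to the limited look-ahead predictor.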
- Hence, embodiments of the cloud scaling feedback control system according to the present invention optimize the resource utilization ratio of a telco cloud, by
-
- (1) exploiting recurring load variation patterns (inherent to telecommunication services) to pro-actively scale out and back (cloudified) telecommunication clusters,
- (2) falling back to limited look-ahead predictions (taking into account only a few prior measurements) in case of unanticipated load surges that significantly diverge from these recurring load variation patterns, and
- (3) minimizing the impact of session state on resource utility by coordinating the migration of session state (instead of waiting until all ongoing sessions have been terminated).
- All this enables the maximization of the resource utility in a telecommunication cloud (thus reducing the operational cost) while at the same time maintaining one or more key operating parameters (such as maximum response time).
- A person of skill in the art would readily recognize that steps of various above-described methods can be performed by programmed computers. Herein, some embodiments are also intended to cover program storage devices, e.g., digital data storage media, which are machine or computer readable and encode machine-executable or computer-executable programs of instructions, wherein said instructions perform some or all of the steps of said above-described methods. The program storage devices may be, e.g., digital memories, magnetic storage media such as magnetic disks and magnetic tapes, hard drives, or optically readable digital data storage media. The embodiments are also intended to cover computers programmed to perform said steps of the above-described methods.
-
FIG. 6 schematically illustrates a system, i.e. a CSFCS 150, according to an embodiment of the present invention. The CSFCS 150 interacts with a network 100 comprising distributed server resources, such as the SIP network 100 illustrated in FIG. 1. For this purpose, the CSFCS 150 is understood to have the necessary interfaces (hardware and software), as are known to the person skilled in the art of communication networking. On the one hand, a monitoring agent 151 retrieves information about the current load state of the infrastructure from the network 100, and on the other hand a management agent 154 is configured to send instructions to the server infrastructure. Input from the monitoring agent 151 is compared to a set point 153 by a processor 152, to determine whether the presently allocated infrastructure is under- or overloaded. Depending on this comparison, and acting in a fully analogous way as described for the methods according to the present invention, the processor 152 will cause the management agent 154 to instruct the infrastructure to allocate more or less resources, as required, while ensuring state preservation by carrying out the necessary session migrations in a state-respecting manner. Optionally, a scheduler 155 uses stored knowledge about recurring usage patterns to cause the management agent 154 to proactively instruct the infrastructure to allocate more or less resources, as required according to the usage expected in the (near) future. The skilled person will appreciate that one or more of the monitoring agent 151, processor 152, set point 153, management agent 154, and scheduler 155 may be implemented in a common hardware component. The CSFCS 150, and most particularly the processor 152 and the management agent 154, may further be configured to carry out other functions related to the various embodiments of the method according to the invention as described above.
- The functions of the various elements shown in the figures, including any functional blocks labeled as “processors”, may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared.
- Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and non volatile storage. Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the FIGS. are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
Claims (11)
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP11306337.4A EP2581831A1 (en) | 2011-10-14 | 2011-10-14 | Method and apparatus for dynamically assigning resources of a distributed server infrastructure |
EP11306337 | 2011-10-14 | ||
EP11306337.4 | 2011-10-14 | ||
PCT/EP2012/069453 WO2013053619A1 (en) | 2011-10-14 | 2012-10-02 | Method and apparatus for dynamically assigning resources of a distributed server infrastructure |
Publications (2)
Publication Number | Publication Date |
---|---|
US20140258546A1 true US20140258546A1 (en) | 2014-09-11 |
US9871744B2 US9871744B2 (en) | 2018-01-16 |
Family
ID=47080452
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/350,609 Active 2033-10-30 US9871744B2 (en) | 2011-10-14 | 2012-10-02 | Method and apparatus for dynamically assigning resources of a distributed server infrastructure |
Country Status (3)
Country | Link |
---|---|
US (1) | US9871744B2 (en) |
EP (1) | EP2581831A1 (en) |
WO (1) | WO2013053619A1 (en) |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150007177A1 (en) * | 2013-06-26 | 2015-01-01 | Fujitsu Limited | Virtual machine management method and information processing apparatus |
US20150169291A1 (en) * | 2013-12-16 | 2015-06-18 | International Business Machines Corporation | Systems and methods for scaling a cloud infrastructure |
US20160099850A1 (en) * | 2014-10-01 | 2016-04-07 | Avaya Inc. | Abstract activity counter |
US20160197848A1 (en) * | 2015-01-07 | 2016-07-07 | Yahoo!, Inc. | Content distribution resource allocation |
CN105897861A (en) * | 2016-03-28 | 2016-08-24 | 乐视控股(北京)有限公司 | Server deployment method and system for server cluster |
US20160344651A1 (en) * | 2012-09-26 | 2016-11-24 | Amazon Technologies, Inc. | Multi-tenant throttling approaches |
US9524200B2 (en) | 2015-03-31 | 2016-12-20 | At&T Intellectual Property I, L.P. | Consultation among feedback instances |
US9621643B1 (en) * | 2015-07-31 | 2017-04-11 | Parallels IP Holdings GmbH | System and method for joining containers running on multiple nodes of a cluster |
WO2017113868A1 (en) * | 2015-12-29 | 2017-07-06 | 网宿科技股份有限公司 | Method and system for self-adaptive bandwidth control for cdn platform |
US9760400B1 (en) * | 2015-07-31 | 2017-09-12 | Parallels International Gmbh | System and method for joining containers running on multiple nodes of a cluster |
US9769206B2 (en) | 2015-03-31 | 2017-09-19 | At&T Intellectual Property I, L.P. | Modes of policy participation for feedback instances |
US9992277B2 (en) | 2015-03-31 | 2018-06-05 | At&T Intellectual Property I, L.P. | Ephemeral feedback instances |
US9990506B1 (en) | 2015-03-30 | 2018-06-05 | Quest Software Inc. | Systems and methods of securing network-accessible peripheral devices |
US10129156B2 (en) | 2015-03-31 | 2018-11-13 | At&T Intellectual Property I, L.P. | Dynamic creation and management of ephemeral coordinated feedback instances |
US10129157B2 (en) | 2015-03-31 | 2018-11-13 | At&T Intellectual Property I, L.P. | Multiple feedback instance inter-coordination to determine optimal actions |
US10142391B1 (en) | 2016-03-25 | 2018-11-27 | Quest Software Inc. | Systems and methods of diagnosing down-layer performance problems via multi-stream performance patternization |
US10140466B1 (en) | 2015-04-10 | 2018-11-27 | Quest Software Inc. | Systems and methods of secure self-service access to content |
US10146954B1 (en) | 2012-06-11 | 2018-12-04 | Quest Software Inc. | System and method for data aggregation and analysis |
US10157358B1 (en) * | 2015-10-05 | 2018-12-18 | Quest Software Inc. | Systems and methods for multi-stream performance patternization and interval-based prediction |
US10218588B1 (en) | 2015-10-05 | 2019-02-26 | Quest Software Inc. | Systems and methods for multi-stream performance patternization and optimization of virtual meetings |
US10277666B2 (en) | 2015-03-31 | 2019-04-30 | At&T Intellectual Property I, L.P. | Escalation of feedback instances |
US10300386B1 (en) * | 2016-06-23 | 2019-05-28 | Amazon Technologies, Inc. | Content item instance scaling based on wait time |
US10326748B1 (en) | 2015-02-25 | 2019-06-18 | Quest Software Inc. | Systems and methods for event-based authentication |
US10348630B2 (en) * | 2017-04-24 | 2019-07-09 | Facebook, Inc. | Load balancing based on load projections for events |
CN110196768A (en) * | 2018-03-22 | 2019-09-03 | 腾讯科技(深圳)有限公司 | The method and apparatus for automatically determining the loading level of cloud platform resource |
US10411960B1 (en) * | 2014-11-12 | 2019-09-10 | Amazon Technologies, Inc. | Detaching instances from auto-scaling group |
US10417613B1 (en) | 2015-03-17 | 2019-09-17 | Quest Software Inc. | Systems and methods of patternizing logged user-initiated events for scheduling functions |
US10536352B1 (en) | 2015-08-05 | 2020-01-14 | Quest Software Inc. | Systems and methods for tuning cross-platform data collection |
CN110990159A (en) * | 2019-12-25 | 2020-04-10 | 浙江大学 | Historical data analysis-based container cloud platform resource quota prediction method |
US11296941B2 (en) | 2014-11-12 | 2022-04-05 | Amazon Technologies, Inc. | Standby instances for auto-scaling groups |
US20220206844A1 (en) * | 2020-12-29 | 2022-06-30 | Motorola Solutions, Inc. | Scheduling resource reservations in a cloud-based communication system |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105491150A (en) * | 2015-12-28 | 2016-04-13 | 中国民航信息网络股份有限公司 | Load balance processing method based on time sequence and system |
US11151524B2 (en) * | 2020-02-03 | 2021-10-19 | Shopify Inc. | Methods and systems for gateway load balancing |
US11893614B2 (en) | 2021-08-23 | 2024-02-06 | Shopify Inc. | Systems and methods for balancing online stores across servers |
US11880874B2 (en) * | 2021-08-23 | 2024-01-23 | Shopify Inc. | Systems and methods for server load balancing based on correlated events |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050102676A1 (en) * | 2003-11-06 | 2005-05-12 | International Business Machines Corporation | Load balancing of servers in a cluster |
US20050120354A1 (en) * | 2003-11-19 | 2005-06-02 | Hitachi, Ltd. | Information processing system and information processing device |
US20070237162A1 (en) * | 2004-10-12 | 2007-10-11 | Fujitsu Limited | Method, apparatus, and computer product for processing resource change |
US20080320121A1 (en) * | 2007-06-19 | 2008-12-25 | Faheem Altaf | System, computer program product and method of dynamically adding best suited servers into clusters of application servers |
US20100169477A1 (en) * | 2008-12-31 | 2010-07-01 | Sap Ag | Systems and methods for dynamically provisioning cloud computing resources |
US20120066371A1 (en) * | 2010-09-10 | 2012-03-15 | Cisco Technology, Inc. | Server Load Balancer Scaling for Virtual Servers |
US20120096461A1 (en) * | 2010-10-05 | 2012-04-19 | Citrix Systems, Inc. | Load balancing in multi-server virtual workplace environments |
US20120226797A1 (en) * | 2011-03-01 | 2012-09-06 | Cisco Technology, Inc. | Active Load Distribution for Control Plane Traffic Using a Messaging and Presence Protocol |
US20130080517A1 (en) * | 2010-06-08 | 2013-03-28 | Alcatel Lucent | Device and method for data load balancing |
US20130111467A1 (en) * | 2011-10-27 | 2013-05-02 | Cisco Technology, Inc. | Dynamic Server Farms |
US20130174177A1 (en) * | 2011-12-31 | 2013-07-04 | Level 3 Communications, Llc | Load-aware load-balancing cluster |
US20130198743A1 (en) * | 2012-01-26 | 2013-08-01 | Empire Technology Development Llc | Data center with continuous world switch security |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6103092A (en) | 1998-10-23 | 2000-08-15 | General Electric Company | Method for reducing metal ion concentration in brine solution |
JP2008257578A (en) * | 2007-04-06 | 2008-10-23 | Toshiba Corp | Information processor, scheduler, and schedule control method of information processor |
US20110078303A1 (en) * | 2009-09-30 | 2011-03-31 | Alcatel-Lucent Usa Inc. | Dynamic load balancing and scaling of allocated cloud resources in an enterprise network |
US8631403B2 (en) * | 2010-01-04 | 2014-01-14 | Vmware, Inc. | Method and system for managing tasks by dynamically scaling centralized virtual center in virtual infrastructure |
- 2011-10-14 EP EP11306337.4A patent/EP2581831A1/en not_active Withdrawn
- 2012-10-02 US US14/350,609 patent/US9871744B2/en active Active
- 2012-10-02 WO PCT/EP2012/069453 patent/WO2013053619A1/en active Application Filing
Cited By (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10146954B1 (en) | 2012-06-11 | 2018-12-04 | Quest Software Inc. | System and method for data aggregation and analysis |
US10700994B2 (en) * | 2012-09-26 | 2020-06-30 | Amazon Technologies, Inc. | Multi-tenant throttling approaches |
US20160344651A1 (en) * | 2012-09-26 | 2016-11-24 | Amazon Technologies, Inc. | Multi-tenant throttling approaches |
US9921861B2 (en) * | 2013-06-26 | 2018-03-20 | Fujitsu Limited | Virtual machine management method and information processing apparatus |
US20150007177A1 (en) * | 2013-06-26 | 2015-01-01 | Fujitsu Limited | Virtual machine management method and information processing apparatus |
US9300552B2 (en) * | 2013-12-16 | 2016-03-29 | International Business Machines Corporation | Scaling a cloud infrastructure |
US9921809B2 (en) | 2013-12-16 | 2018-03-20 | International Business Machines Corporation | Scaling a cloud infrastructure |
US9916135B2 (en) | 2013-12-16 | 2018-03-13 | International Business Machines Corporation | Scaling a cloud infrastructure |
US20150169291A1 (en) * | 2013-12-16 | 2015-06-18 | International Business Machines Corporation | Systems and methods for scaling a cloud infrastructure |
US9300553B2 (en) * | 2013-12-16 | 2016-03-29 | International Business Machines Corporation | Scaling a cloud infrastructure |
US20160099850A1 (en) * | 2014-10-01 | 2016-04-07 | Avaya Inc. | Abstract activity counter |
US10051067B2 (en) * | 2014-10-01 | 2018-08-14 | Avaya Inc. | Abstract activity counter |
US11689422B1 (en) | 2014-11-12 | 2023-06-27 | Amazon Technologies, Inc. | Standby instances for auto-scaling groups |
US10411960B1 (en) * | 2014-11-12 | 2019-09-10 | Amazon Technologies, Inc. | Detaching instances from auto-scaling group |
US11296941B2 (en) | 2014-11-12 | 2022-04-05 | Amazon Technologies, Inc. | Standby instances for auto-scaling groups |
US11140095B2 (en) * | 2015-01-07 | 2021-10-05 | Verizon Media Inc. | Content distribution resource allocation |
US20160197848A1 (en) * | 2015-01-07 | 2016-07-07 | Yahoo!, Inc. | Content distribution resource allocation |
US10326748B1 (en) | 2015-02-25 | 2019-06-18 | Quest Software Inc. | Systems and methods for event-based authentication |
US10417613B1 (en) | 2015-03-17 | 2019-09-17 | Quest Software Inc. | Systems and methods of patternizing logged user-initiated events for scheduling functions |
US9990506B1 (en) | 2015-03-30 | 2018-06-05 | Quest Software Inc. | Systems and methods of securing network-accessible peripheral devices |
US10277666B2 (en) | 2015-03-31 | 2019-04-30 | At&T Intellectual Property I, L.P. | Escalation of feedback instances |
US10341388B2 (en) | 2015-03-31 | 2019-07-02 | At&T Intellectual Property I, L.P. | Modes of policy participation for feedback instances |
US10129157B2 (en) | 2015-03-31 | 2018-11-13 | At&T Intellectual Property I, L.P. | Multiple feedback instance inter-coordination to determine optimal actions |
US9992277B2 (en) | 2015-03-31 | 2018-06-05 | At&T Intellectual Property I, L.P. | Ephemeral feedback instances |
US10523569B2 (en) | 2015-03-31 | 2019-12-31 | At&T Intellectual Property I, L.P. | Dynamic creation and management of ephemeral coordinated feedback instances |
US10129156B2 (en) | 2015-03-31 | 2018-11-13 | At&T Intellectual Property I, L.P. | Dynamic creation and management of ephemeral coordinated feedback instances |
US9769206B2 (en) | 2015-03-31 | 2017-09-19 | At&T Intellectual Property I, L.P. | Modes of policy participation for feedback instances |
US10848550B2 (en) | 2015-03-31 | 2020-11-24 | At&T Intellectual Property I, L.P. | Escalation of feedback instances |
US9524200B2 (en) | 2015-03-31 | 2016-12-20 | At&T Intellectual Property I, L.P. | Consultation among feedback instances |
US10140466B1 (en) | 2015-04-10 | 2018-11-27 | Quest Software Inc. | Systems and methods of secure self-service access to content |
US9621643B1 (en) * | 2015-07-31 | 2017-04-11 | Parallels IP Holdings GmbH | System and method for joining containers running on multiple nodes of a cluster |
US9760400B1 (en) * | 2015-07-31 | 2017-09-12 | Parallels International Gmbh | System and method for joining containers running on multiple nodes of a cluster |
US10536352B1 (en) | 2015-08-05 | 2020-01-14 | Quest Software Inc. | Systems and methods for tuning cross-platform data collection |
US10157358B1 (en) * | 2015-10-05 | 2018-12-18 | Quest Software Inc. | Systems and methods for multi-stream performance patternization and interval-based prediction |
US10218588B1 (en) | 2015-10-05 | 2019-02-26 | Quest Software Inc. | Systems and methods for multi-stream performance patternization and optimization of virtual meetings |
US10574586B2 (en) | 2015-12-29 | 2020-02-25 | Wangsu Science & Technology Co., Ltd | Method and system for self-adaptive bandwidth control of CDN platform |
WO2017113868A1 (en) * | 2015-12-29 | 2017-07-06 | 网宿科技股份有限公司 | Method and system for self-adaptive bandwidth control for cdn platform |
US10142391B1 (en) | 2016-03-25 | 2018-11-27 | Quest Software Inc. | Systems and methods of diagnosing down-layer performance problems via multi-stream performance patternization |
CN105897861A (en) * | 2016-03-28 | 2016-08-24 | 乐视控股(北京)有限公司 | Server deployment method and system for server cluster |
US10300386B1 (en) * | 2016-06-23 | 2019-05-28 | Amazon Technologies, Inc. | Content item instance scaling based on wait time |
US10348630B2 (en) * | 2017-04-24 | 2019-07-09 | Facebook, Inc. | Load balancing based on load projections for events |
US11088953B2 (en) | 2017-04-24 | 2021-08-10 | Facebook, Inc. | Systems and methods for load balancing |
CN110196768A (en) * | 2018-03-22 | 2019-09-03 | 腾讯科技(深圳)有限公司 | The method and apparatus for automatically determining the loading level of cloud platform resource |
CN110990159A (en) * | 2019-12-25 | 2020-04-10 | 浙江大学 | Historical data analysis-based container cloud platform resource quota prediction method |
US20220206844A1 (en) * | 2020-12-29 | 2022-06-30 | Motorola Solutions, Inc. | Scheduling resource reservations in a cloud-based communication system |
US11977914B2 (en) * | 2020-12-29 | 2024-05-07 | Motorola Solutions, Inc. | Scheduling resource reservations in a cloud-based communication system |
Also Published As
Publication number | Publication date |
---|---|
WO2013053619A1 (en) | 2013-04-18 |
EP2581831A1 (en) | 2013-04-17 |
US9871744B2 (en) | 2018-01-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9871744B2 (en) | Method and apparatus for dynamically assigning resources of a distributed server infrastructure | |
CN111684419B (en) | Method and system for migrating containers in a container orchestration platform between computing nodes | |
EP2915312B1 (en) | Cdn traffic management in the cloud | |
Krishnamurthy et al. | Pratyaastha: an efficient elastic distributed sdn control plane | |
EP2915283B1 (en) | Cdn load balancing in the cloud | |
US9923785B1 (en) | Resource scaling in computing infrastructure | |
US9632815B2 (en) | Managing virtual machines according to network bandwidth | |
KR20130016237A (en) | Managing power provisioning in distributed computing | |
US20210243770A1 (en) | Method, computer program and circuitry for managing resources within a radio access network | |
Huang et al. | Prediction-based dynamic resource scheduling for virtualized cloud systems | |
WO2012094138A2 (en) | Seamless scaling of enterprise applications | |
CN103635882A (en) | Controlling network utilization | |
US11301301B2 (en) | Workload offloading between computing environments | |
US11553047B2 (en) | Dynamic connection capacity management | |
EP3021521A1 (en) | A method and system for scaling, telecommunications network and computer program product | |
Sharkh et al. | An evergreen cloud: Optimizing energy efficiency in heterogeneous cloud computing architectures | |
Huang et al. | Migration-based elastic consolidation scheduling in cloud data center | |
CN109960579B (en) | Method and device for adjusting service container | |
Walraven et al. | Adaptive performance isolation middleware for multi-tenant saas | |
Srivastava et al. | Queueing model based dynamic scalability for containerized cloud | |
Ashalatha et al. | Evaluation of auto scaling and load balancing features in cloud | |
Chen et al. | Corner: Cost-efficient and reliability-aware virtual network redesign and embedding | |
Mosoti Derdus et al. | Virtual machine sizing in virtualized public cloud data centres | |
Zharikov et al. | An integrated approach to cloud data center resource management | |
Khac et al. | An open Jackson network model for heterogeneous infrastructure as a service on cloud computing |
Legal Events
Date | Code | Title | Description
---|---|---|---
| AS | Assignment | Owner name: ALCATEL LUCENT, FRANCE. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: JANSSENS, NICO; AN, XUELI; REEL/FRAME: 032633/0793. Effective date: 2014-03-20
| AS | Assignment | Owner name: CREDIT SUISSE AG, NEW YORK. Free format text: SECURITY INTEREST; ASSIGNOR: ALCATEL LUCENT; REEL/FRAME: 033500/0302. Effective date: 2014-08-06
| AS | Assignment | Owner name: ALCATEL LUCENT, FRANCE. Free format text: RELEASE OF SECURITY INTEREST; ASSIGNOR: CREDIT SUISSE AG; REEL/FRAME: 033655/0304. Effective date: 2014-08-19
| STCF | Information on status: patent grant | Free format text: PATENTED CASE
| MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4