US20110173626A1 - Efficient maintenance of job prioritization for profit maximization in cloud service delivery infrastructures - Google Patents
- Publication number
- US20110173626A1 (application Ser. No. 12/818,528)
- Authority
- US
- United States
- Legal status: Abandoned (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/103—Workflow collaboration or project management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/506—Constraint
Definitions
- This application relates to Constraint-Conscious Optimal Scheduling for Cloud Infrastructures.
- Cloud computing has emerged as a promising computing platform with its on-demand scaling capabilities.
- a cloud service delivery infrastructure is used to deliver services to a diverse set of clients sharing the computing resources.
- the database community has also shown great interest in exploiting this new platform for scalable and cost-efficient data management.
- the success of cloud-based services depends on two main factors: quality of service, identified through Service Level Agreements (SLAs), and operating cost management.
- Users of cloud computing services are not only able to significantly reduce their IT costs and turn their capital expenditures into operational expenditures, but are also able to speed up their innovation capabilities thanks to on-demand access to vast IT resources in the cloud. While cloud computing offers clients all these advantages, it creates a number of challenges for cloud service providers trying to build successful businesses: they have to handle diverse and dynamic workloads in a highly price-competitive way to convince potential clients to use the service delivery model instead of in-house hosting of IT functions. In addition, the quality of service should be comparable in all aspects to what can be delivered from an IT infrastructure under the clients' full control. Thus, the success of cloud-based services arguably depends on two major factors: quality of service, captured as Service Level Agreements (SLAs), and operational cost management.
- the consistent delivery of services within SLAs is crucial for sustained revenue for the service provider. Delivering those services incurs operational costs, and the difference between the revenue and the operational costs is the service provider's profit, which is required for any commercially viable business.
- the revenue, R, is defined for each job class in the system.
- Each client may have multiple job classes based on the contract.
- a stepwise function is used to characterize the revenue as shown in FIG. 1 .
- the clients agree to pay varying fee levels for corresponding service levels delivered for a particular class of requests, i.e., the job classes in their contracts. For example, a client may be willing to pay a higher rate for lower response times. As shown in FIG. 1, the client pays R0 as long as the response time is between 0 and X1, pays R1 for the interval of X1 to X2, and so on.
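The stepwise revenue schedule just described can be sketched as a small lookup. The function name and the boundary/revenue values below are illustrative, not from the patent:

```python
# Hypothetical sketch of a stepwise SLA revenue function: the client pays
# R0 for response times up to X1, R1 up to X2, and so on.
from bisect import bisect_left

def stepwise_revenue(response_time, boundaries, revenues):
    """boundaries = [X1, X2, ...]; revenues = [R0, R1, ..., Rk],
    one more revenue level than there are boundaries."""
    # the index of the first boundary >= response_time selects the step
    return revenues[bisect_left(boundaries, response_time)]

# Example: pay 10 up to time 1, 5 up to time 3, nothing afterwards.
print(stepwise_revenue(0.5, [1, 3], [10, 5, 0]))  # -> 10
print(stepwise_revenue(2.0, [1, 3], [10, 5, 0]))  # -> 5
```

Using `bisect_left` keeps each lookup O(log k) in the number of steps and makes the right-open/left-closed step boundaries explicit.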
- as the level of service changes, the amount that the provider can charge the client also changes according to the contract. Due to limitations on the availability of infrastructure resources, the cloud service provider may not be able, or may choose not, to attend to all client requests at the highest possible service levels. Dropping or raising service levels causes a corresponding loss or increase in revenue. The loss of potential revenue corresponds to the SLA cost.
- SLAs in general may be defined in terms of various criteria, such as service latency, throughput, consistency, security, etc.
- One embodiment focuses on service latency, or response time. Even with latency alone, there can be multiple specification methods:
- the SLA can be classified either as a hard SLA or a soft SLA as follows.
- the unit of operational cost is a server cost per hour. Consequently, the total operational cost, C, is the sum of individual server costs for a given period of time.
- the individual server cost is the aggregation of all specific costs items that are involved in operating a server, such as energy, administration, software, among others.
- Conventional scheduling systems typically rely on techniques that do not primarily consider profit maximization. These techniques mainly focus on optimizing metrics such as average response time.
- systems and methods are disclosed to schedule jobs in a cloud computing infrastructure by receiving in a first queue jobs with deadlines or constraints specified in a hard service level agreement (SLA); receiving in a second queue jobs with a penalty cost metric specified in a soft SLA; and minimizing both constraint violation count and total penalty cost in the cloud computing infrastructure by identifying jobs with deadlines in the first queue and delaying jobs in the first queue within a predetermined slack range in favor of jobs in the second queue to improve the penalty cost metric.
- systems and methods are disclosed for efficient maintenance of job prioritization for profit maximization in cloud-based service delivery infrastructures with multi-step cost structure support by breaking multiple steps in the SLA of a job into corresponding cost steps; generating a segmented cost function for each cost step; creating a cost-based-scheduling (CBS)-priority value associated with a validity period for each segment based on the segmented cost function; and choosing the job with the highest CBS priority value.
- the system provides a very efficient job prioritization for diverse pricing agreements across diverse clients and heterogeneous infrastructure resources.
- the system enables profit optimization, which is a vital economic indicator for sustainability.
- the system considers discrete levels of costs corresponding to varying levels of service, which is more realistic in many real-life systems.
- the system is also efficient, with computational complexity low enough to be feasible for high-volume and large infrastructures.
- FIG. 1 shows an exemplary system diagram of an Intelligent Cloud Database Coordinator (ICDC).
- FIG. 2 shows an exemplary cost function segmentation in iCBS.
- FIG. 3 shows a Constraint-Conscious Optimization Scheduling (CCOS) system.
- FIG. 4 shows an exemplary slack tree used with the CCOS system.
- FIG. 5 shows an example where a new job is inserted into the slack tree.
- FIG. 6 shows an exemplary process for prioritization and scheduling of incoming jobs based on cost functions.
- FIG. 7 shows an exemplary process to schedule jobs prioritized in FIG. 6 .
- FIG. 1 shows an exemplary system diagram of the ICDC.
- the ICDC manages very large cloud service delivery infrastructures.
- the system architecture focuses on components that are relevant and subject to optimization to achieve the goal of SLA-based profit optimization of resource and workload management in the cloud databases.
- distinctly optimizing individual system components with a global objective in mind provides a greater degree of freedom to customize operations. This approach yielded higher degrees of performance, customizability based on variable business requirements, and end-to-end profit optimization.
- clients 10 communicate with ICDC using a standard JDBC API and make plain JDBC method calls to talk to various databases without changing code.
- the clients 10 communicate with a query router 20 .
- An autoscaler 30 monitors the queue length log and query response time log and determines if additional nodes should be added by an add/drop controller 40 .
- the controller issues commands to add/drop nodes to a database replication cluster 50 such as a MySQL replication cluster.
- the ICDC has a Client Data Module that is responsible for maintaining client specific data such as cost functions and SLAs, which are derived from client contracts. Once captured, this information is made available to other system modules for resource and workload management purposes.
- An ICDC Manager monitors the status of the system, e.g., system load, queue lengths, query response time, and CPU and I/O utilization. All this information is maintained by the System Data module. Based on the system monitoring data, the ICDC Manager directs the Cluster Manager to add or remove servers to/from the Resource Pool to optimize the operational cost while keeping the SLA costs in check.
- the ICDC Manager also provides the dispatcher and scheduler modules with the dynamic system data.
- An Online Simulator is responsible for dynamic capacity planning. It processes the client data and dynamic system data to assess optimum capacity levels through simulation.
- a Dispatcher takes incoming client calls and immediately forwards the queries (or jobs) to servers based on the optimized dispatching policy.
- the dispatching policy is constantly tuned according to dynamic changes in the system, such as user traffic, addition/removal of processing nodes.
- a Scheduler decides the order of execution of jobs at each server. After the client requests are dispatched to individual servers based on the dispatching policy, individual scheduler modules are responsible for prioritization of dispatched jobs locally by forming a queue of queries, from which a query is chosen and executed in the database. The choice of which query to execute first makes a difference in the SLA penalty costs observed.
- the system uses an SLA-based profit optimization approach for building and managing a data management platform in the cloud.
- resource and workload management is performed for a data management platform that is hosted on an Infrastructure-as-a-Service (IaaS) offering, e.g., Amazon EC2.
- the data management platform can be thought of as a Platform-as-a-Service (PaaS) offering that is used by Software-as-a-Service (SaaS) applications in the cloud.
- each server node represents a replica of a database.
- a dispatcher When a query (job) arrives, a dispatcher immediately assigns the query to a server among multiple servers, according to certain dispatching policy; for each server, a resource scheduling policy decides which query to execute first, among those waiting in the associated queue; and a capacity planning component is in charge of determining how many resources (i.e., database servers) to be allocated in the system.
- the Scheduler has two distinct features: cost sensitive and constraint conscious.
- the system evaluates the priorities of the n jobs in the queue individually, each in constant time, and picks the job with the highest priority.
- CBS considers two possible cases: i) the job is served immediately at the current time t, which incurs a cost of c_i(t), where c_i(t) is the cost function of job i with queue wait time t; and ii) the job is delayed by a wait time Δ and then served, which incurs a cost of c_i(t+Δ). Since the value of Δ is not known, CBS uses a probability density function and computes the expected cost based on it. Thus, the CBS priority for a job i is
- p_i(t) = ∫_0^∞ a(Δ)·c_i(t+Δ) dΔ  (Equation 1)
- a(Δ) is a probability distribution modeling the waiting time of a job in the queue if it is not served immediately. After computing the p_i(t) value, CBS divides it by the job's service time, since a longer job occupies the server for a longer time, delaying other jobs for a longer period.
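As a hedged numerical sketch of the CBS priority just described (the exponential waiting-time distribution and all names are illustrative assumptions):

```python
# Numerical sketch of the CBS priority: the expected cost of job i under a
# random further delay, assuming an exponential waiting-time distribution
# a(delta) = lam * exp(-lam * delta). Names here are illustrative.
import math

def cbs_priority(cost_fn, t, service_time, lam=1.0, horizon=50.0, n=5000):
    # left-Riemann integration of a(delta) * cost_fn(t + delta) over delta
    h = horizon / n
    expected = sum(
        lam * math.exp(-lam * k * h) * cost_fn(t + k * h) * h
        for k in range(n)
    )
    # a longer job occupies the server longer, so divide by service time
    return expected / service_time

# Linear cost c(t) = 2t, t = 1, lam = 1: expected cost is 2*(1 + 1) = 4,
# divided by service time 0.5 gives a priority near 8.
p = cbs_priority(lambda t: 2.0 * t, t=1.0, service_time=0.5)
```

A closed form exists for stepwise cost functions under the exponential assumption; the numerical integral above is only meant to make the expectation concrete.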
- Because CBS examines all the jobs in the queue in order to pick the next job to serve, it can be slow in systems where queues grow very long and job service times are very short.
- Another embodiment uses an efficient version of CBS called iCBS (incremental CBS).
- iCBS uses a priority queue to maintain the list of jobs according to their priority and to dynamically update the priority queue when a new job arrives or an existing one is removed from the queue. Because the priority queue is maintained incrementally, iCBS has logarithmic time complexity.
- the iCBS system breaks multiple steps of a cost function into multiple cost functions, as shown in FIG. 2 .
- Each segmented cost function has its own validity period, (x1, x2], and is used to compute the priority of the corresponding job for times x1 < x ≤ x2.
- Segmentation is done by removing and pushing down steps as follows.
- the first segment is same as the original cost function, and its validity period is the duration of the first step, i.e., (0, x 1 ].
- the second segment is obtained by removing the first step and pushing down the rest of steps by the second step's cost (or y-value).
- Its validity period is the duration of the second step, i.e., (x1, x2]. This is repeated until the last step is reached, where the cost is defined as zero for its validity period, the duration of the last step, i.e., (x2, ∞] in the example.
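The segmentation steps above can be sketched as follows; the (boundary, cost) pair encoding of the stepwise function is an assumption for illustration:

```python
# Sketch of iCBS cost-function segmentation (FIG. 2): each segment drops
# the earlier steps and pushes the remaining ones down by its own step's
# cost, so the last segment is flat zero. The representation is illustrative.
INF = float("inf")

def segment_cost_function(steps):
    """steps: [(x1, c0), (x2, c1), ..., (INF, c_last)] in increasing order.
    Returns a list of (validity_period, pushed_down_steps) per segment."""
    segments = []
    left = 0.0
    for i, (right, cost_i) in enumerate(steps):
        # drop steps before i and push the rest down by this step's cost
        pushed = [(x, c - cost_i) for (x, c) in steps[i:]]
        segments.append(((left, right), pushed))
        left = right
    return segments

segs = segment_cost_function([(10, 0), (20, 5), (INF, 12)])
# segs[0]: original shape, valid on (0, 10]
# segs[2]: cost zero, valid on (20, INF]
```

Note that the first segment reproduces the original cost function and the last is identically zero, matching the description above.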
- When a(Δ) in Equation 1 is an exponential distribution, the relative priority order between the valid segments of two jobs remains unchanged over time, as long as the segments are still valid.
- the iCBS process decreases newly arrived jobs' priorities, instead of increasing the existing jobs' priorities, to avoid modification of existing jobs' priorities in the queue, while keeping the relative order the same.
- For new job arrivals, one segmented cost function is generated for each cost step, and a segmented-CBS-priority associated with the validity period of each segment is generated. Each priority value is then divided by e^((t−t0)/a), where t is the current time and t0 is a fixed time instance, such as system time zero.
- the segmented-CBS-priority objects are inserted into a priority queue, where the objects are ordered by CBS priority. Among all segments corresponding to the same job, segment i will always have higher CBS priority than segment j, where i ⁇ j.
- the system also adds a nextSegment pointer from segmented-CBS-priority object i to object i+1, to chain the segments of the same job.
- For job scheduling, at the time of picking the next job to serve, the head of the segmented-CBS-priority queue, which has the highest priority value, is pulled to see whether its validity period has expired or the corresponding job has already been scheduled by an earlier segment. In either case, the segment is thrown away and the next head is pulled off the priority queue, until the system finds a segmented-CBS-priority whose validity period has not expired and whose job has not yet been scheduled. When found, the system marks the other segments of the same job in the priority queue as scheduled, using the nextSegment pointer.
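A minimal sketch of this pull-and-discard loop, using Python's heapq as the priority queue; here a shared `scheduled` set stands in for the nextSegment-pointer marking, and all names and values are illustrative:

```python
# Lazy-deletion pull loop for segmented-CBS-priority objects: stale heads
# (expired validity, or a job already scheduled via another segment) are
# simply discarded until a live segment is found.
import heapq

def pick_next_job(pq, scheduled, now):
    """pq holds (-priority, valid_until, job_id); heapq is a min-heap,
    so priorities are negated to pop the highest priority first."""
    while pq:
        neg_p, valid_until, job_id = heapq.heappop(pq)
        if valid_until <= now or job_id in scheduled:
            continue                  # throw the stale segment away
        scheduled.add(job_id)         # marks all of this job's segments
        return job_id
    return None

pq = [(-5.0, 100, "a"), (-3.0, 2, "b"), (-1.0, 100, "c")]
heapq.heapify(pq)
done = set()
print(pick_next_job(pq, done, now=10))  # -> a
print(pick_next_job(pq, done, now=10))  # -> c  (b's segment expired at 2)
```

Each segment is pushed once and popped at most once, which is what keeps the amortized cost per job logarithmic.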
- iCBS achieves near-optimal cost, but it is prone to starvation, as its sole objective is cost minimization. In the real world, however, this may not be desirable. For instance, service providers may want to provide a certain bottom-line performance guarantee for all jobs, such as a guarantee that all jobs finish within ten times the job service time. It may also be desirable to provide a worst-case performance guarantee for selected VIP customers. These types of hard SLAs need to be enforced on top of the soft SLAs that affect SLA costs.
- a scheduling embodiment of FIG. 3 manages hard SLAs, i.e. deadlines or constraints, and soft SLAs, i.e. optimization metric.
- This embodiment optimizes the metric while achieving (near-) minimal possible constraint violation.
- While violations of constraints or deadlines may be unavoidable in general (jobs may arrive in a bursty fashion), the system tries to make the minimum possible number of violations. Only a subset of jobs may have deadlines, as in the VIP example above.
- the optimization metric can be the average response time or the above discussed average cost.
- FIG. 3 shows a Constraint-Conscious Optimization Scheduling (CCOS) system.
- CCOS employs a dual-queue approach: 1) an opti-queue 110 is an optimization queue where all jobs are queued; SJF (Shortest Job First) is used without modification if response time minimization is the optimization goal, and iCBS is used if cost minimization is the goal; and 2) a constraint-queue 120 employs the EDF (Earliest Deadline First) process; only the jobs with deadlines are queued here.
- FIG. 3's CCOS balances between the following two extremes: 1) ignore deadlines (always schedule jobs from the opti-queue, achieving the best cost-based results with uncontrolled deadline violation); and 2) blindly pursue violation control (schedule jobs from the constraint-queue whenever it has a job, and attend to the opti-queue only when the constraint-queue is empty). A job is deleted from both queues when it is scheduled from either one. The balance is achieved by observing that deadlines are not always urgent. A job with a deadline in the constraint-queue may be able to wait some time, called slack, without violating its deadline. Once the slack is known, the system can delay the job and attend to the opti-queue, to improve the optimization metric.
- the scheduling system of FIG. 3 manages both hard SLAs, i.e., deadlines or constraints, and a cost optimization metric, also called soft SLAs.
- the operating cost metrics are optimized while possible constraint violations are minimized. This is done by a dual-queue component, where one queue handles the hard SLAs and the other handles the soft SLAs, and a system-monitoring component that efficiently monitors those queues.
- the main challenge of CCOS is to efficiently monitor the slack of jobs in the constraint-queue, which is defined as follows based on the EDF scheduling policy. Given n jobs J_i, 1 ≤ i ≤ n, in the constraint-queue, where the job length of J_i is l_i, the deadline of J_i is d_i, and d_i ≤ d_j if i < j, the slack of J_i at time t is
- s_i = d_i − t − (l_1 + l_2 + . . . + l_i)
- s_i can be determined for 1 ≤ i ≤ n by iterating over the jobs in non-decreasing order of deadlines, and the minimum slack, min_i s_i, is tested against a slack threshold s_th. If the minimum slack is less than or equal to s_th, jobs need to be removed from the constraint-queue 120.
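The slack computation just described can be sketched directly (the job data below is hypothetical):

```python
# Minimum-slack monitoring sketch: with jobs sorted by non-decreasing
# deadline, the slack of job i at time t is d_i - t minus the total
# length of jobs 1..i (everything that must run before job i finishes).
def min_slack(jobs, t):
    """jobs: list of (deadline, length) pairs, sorted by deadline."""
    cum = 0.0
    slacks = []
    for deadline, length in jobs:
        cum += length
        slacks.append(deadline - t - cum)
    return min(slacks)

jobs = [(10, 2), (12, 3), (20, 4)]
print(min_slack(jobs, t=0))  # -> 7.0 (the second job: 12 - 0 - 5)
```

This linear scan is what the slack tree below replaces with logarithmic-time maintenance.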
- the only parameter of CCOS is s th , and within a large range, i.e. [3*mean-job-length, 10*mean-job-length], the performance is not very sensitive to the parameter value.
- a data structure named slack tree can be used that supports fast minimum slack monitoring.
- An example of a slack tree is shown in FIG. 4.
- Each leaf node has a job in the order of non-decreasing deadline from left to right.
- FIG. 4 shows a binary tree for illustration, but slack tree can have arbitrary fan-outs at each node.
- Each node maintains two values: left sibling execution time total (LSETT) and minimum slack in subtree (MSS).
- LSETT of a node is the total execution time, or total job length, of left siblings, which are the nodes to the left, sharing the same parent node.
- MSS of a node is the minimum slack in the subtree rooted at the node.
- MSS of a non-leaf node node i is recursively computed as:
- MSS_i = min over node_j ∈ children(node_i) of (MSS_j − LSETT_j)
- Root node's MSS represents the minimum slack of the whole tree.
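The MSS recursion can be illustrated with a flat helper for a single node; the incremental tree maintenance is omitted, and the numbers are illustrative:

```python
# One level of the slack-tree invariant: a node's MSS is the minimum over
# its children of (child's MSS - child's LSETT), since each child's jobs
# must wait for the total execution time of the siblings to its left.
def node_mss(children):
    """children: list of (mss, lsett) pairs under one node."""
    return min(mss - lsett for mss, lsett in children)

# Two leaves under one parent: slacks 9 and 30, and the second leaf must
# wait for the first job's length of 8, so its effective slack is 22.
print(node_mss([(9, 0), (30, 8)]))  # -> 9
```

Applying this bottom-up gives the root's MSS, the minimum slack of the whole tree.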
- FIG. 5 shows an example where a new job J9 is inserted into the slack tree. Underlined numbers indicate information updated from FIG. 4. Based on its deadline, J9 is inserted between J3 and J4. This triggers updates of LSETT and MSS at other nodes as follows. The parent node of J9 updates its MSS value from 30 to 29, as the slack of J4 is reduced by 1. Its updated MSS affects its parent node's MSS as well, updating it from 15 to 14. The right sibling node is affected too: its LSETT increases by 1, from 30 to 31. These two nodes report their updated contributions to the root node's MSS, 14 and 9, respectively, and the root node updates its MSS from 10 to 9. Given a node fan-out of k, insertion takes k time at each level, and therefore O(k·log_k n) overall, or simply O(log n). Deletion is done in a similar fashion, with the same time complexity.
- Some simple traditional dispatching policies include random and round robin. While being simple, these policies do not perform well, especially given highly variable job length, such as that in long tail distributions.
- Other, more sophisticated policies include join-shortest-queue (JSQ) and least-work-left (LWL): the former sends jobs to the server with the fewest jobs in its queue, and the latter sends jobs to the server for which the sum of job lengths in the queue is least among all servers.
- LWL is a locally optimal policy in that each job chooses the server that minimizes its own waiting time, though this does not necessarily minimize the total response time of all jobs.
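The two policies can be sketched in a few lines (the queue contents below are hypothetical):

```python
# JSQ sends a job to the server with the fewest queued jobs; LWL sends it
# to the server with the least total queued work (sum of job lengths).
def jsq(queues):
    """queues: one list of job lengths per server; returns a server index."""
    return min(range(len(queues)), key=lambda i: len(queues[i]))

def lwl(queues):
    return min(range(len(queues)), key=lambda i: sum(queues[i]))

queues = [[5.0], [1.0, 1.0, 1.0]]
print(jsq(queues))  # -> 0 (one queued job versus three)
print(lwl(queues))  # -> 1 (3.0 units of work versus 5.0)
```

The example shows how the two policies can disagree: JSQ counts jobs, while LWL weighs them by length.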
- SITA (Size Interval Task Assignment) policies assign jobs to servers based on job-size intervals; SITA-E is SITA with Equal load, and SITA-U is a variant that splits the load unequally.
- SITA-E or SITA-U may not be the best candidate for SLA-based dispatching, since they are not aware of SLA penalty cost function and do not necessarily minimize the total SLA penalty cost.
- SITA-E would send equal load to all servers, but it may be the case that short jobs are more expensive in terms of SLA penalty cost, and the system may want to send less load to short job servers than long-job servers.
- SITA-U may find its own optimal boundaries of splitting jobs according to length for response time minimization, but it may not be the best set of boundaries for cost minimization.
- Tuning of a single boundary is done in two phases.
- In the first phase, the lower bound and upper bound of the global minimum are located. Starting from the boundary x previously found, the process makes exponential jumps to the left, i.e., dividing by 2 each time, to find the lower bound: when f(0.5x) > f(x), then 0.5x is the lower bound.
- the system then performs an upper-bound search to the right using exponential jumps, i.e., multiplying by 2 each time: when f(x) < f(2x), then 2x is the upper bound. With these two bounds, the system performs a narrowing-down search in the second phase.
- the system divides the interval bounded by lowerbound x LB and upperbound x UB into three equal-length sections using two division points named x 1 and x 2 .
- the system evaluates f(x1) and f(x2) using two simulation runs. If f(x1) < f(x2), the global minimum is within [xLB, x2], and in the next round this interval is divided into three sections. If f(x1) > f(x2), the global minimum is in [x1, xUB], and the search is limited to this smaller interval in the next round. This process is repeated until xLB/xUB is greater than a parameter StopPrecision, such as 0.9.
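The two-phase search above can be sketched as follows, with a plain function `f` standing in for a simulation run; the stand-in cost function and stopping behavior are illustrative:

```python
# Two-phase cutoff tuning: phase 1 brackets the minimum with exponential
# jumps (halving leftward, doubling rightward); phase 2 narrows the
# bracket with two interior division points per round, as described above.
def tune_boundary(f, x, stop_precision=0.9):
    lb = x
    while f(0.5 * lb) <= f(lb):       # jump left until f(0.5x) > f(x)
        lb *= 0.5
    lb *= 0.5
    ub = x
    while f(ub) >= f(2 * ub):         # jump right until f(x) < f(2x)
        ub *= 2
    ub *= 2
    while lb / ub <= stop_precision:  # narrow down in thirds
        x1 = lb + (ub - lb) / 3
        x2 = lb + 2 * (ub - lb) / 3
        if f(x1) < f(x2):
            ub = x2
        else:
            lb = x1
    return (lb + ub) / 2

# A unimodal stand-in cost function with its minimum at x = 7.
best = tune_boundary(lambda x: (x - 7.0) ** 2, x=1.0)
```

Each narrowing round discards a third of the interval, so the number of simulation runs grows only logarithmically with the required precision.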
- a single cutoff can be used to divide short jobs and long jobs, which is decided by the above SITA-UC cutoff search process.
- the servers are divided into two groups, one for short jobs and another for long jobs, and within the group LWL is used.
- the system can also generalize a dispatching policy in the context where each job may be served by only a subset of servers.
- capability groups are set up where jobs and servers belong to one of them, and SITA is run for each capability group.
- Capacity planning is discussed next.
- the capacity planning process allocates resources in an intelligent way, considering factors such as job traffic and profit model, so that the total profit is maximized.
- the system uses observed job traffic patterns and unit server costs as the basis for immediate future planning. Simply adding more servers will increase the operational cost; therefore, the task of capacity planning is to identify the best allocation that maximizes the total profit.
- Simulation-based capacity planning is used in one embodiment.
- the system has a discrete event simulator module that is responsible for finding optimum resource allocations through planned simulations.
- Capacity planning simulation can be online or offline.
- For offline simulation-based capacity planning, the simulation receives the given job characteristics, the numbers of servers to handle the jobs, and the different operational costs.
- Simulation-based capacity planning relies on simulations to estimate, in an offline manner, the profits under different server numbers in order to decide the best setting.
- the inputs to the simulation are the profit model and the job characteristics. Those inputs are derived from the real query logs of already running systems.
- certain data statistics, such as the distributions of job inter-arrival time and job service time, are assumed unless they are provided by the client.
- One embodiment of the capacity planner uses most frequently used distributions to initially characterize the data statistics. After that, it effectively refines those initial assumptions by constantly monitoring the system. This feature allows the system not to heavily rely on the client input on the data statistics to start with.
- FIG. 6 shows an exemplary process for prioritization and scheduling of incoming jobs based on cost functions.
- a new job is received ( 210 ).
- the process removes one step from the cost function ( 220 ).
- the process determines priority values and creates a segment ( 230 ) as illustrated in FIG. 2.
- the process checks if there are additional steps in the cost functions ( 240 ) and if so, loops back to 220 to handle the step and otherwise exits ( 250 ).
- FIG. 7 shows an exemplary process to schedule jobs prioritized in FIG. 6 .
- the process pulls a segment from the head of the priority queue ( 300 ).
- the process checks if the validity of the segment has expired ( 310 ), and if not, the process checks if the job has been scheduled ( 320 ). From 310 or 320 , if the segment is invalid or the job has been scheduled, the process ignores the segment ( 330 ) and loops back to 300 to handle the next segment. From 320 , if the job has not been scheduled, the process marks the other segments of the job as scheduled ( 340 ), and schedules the job ( 350 ).
- the result of the foregoing is a data management platform which is hosted on an Infrastructure-as-a-Service in the cloud, such as Amazon EC2.
- the system optimizes a database service provider's profit while delivering the services according to customer SLAs.
- the system identifies the major relevant components of cloud service delivery architecture that need to be optimized to reach this goal.
- the system explicitly considers SLA penalty cost function of each job at the core of scheduling, dispatching, and capacity planning problems to achieve an overall cost optimal solution.
- the system provides a cost-based and constraint-conscious resource scheduling method, called incremental Cost-Based Scheduling (iCBS), for profit optimization.
- the iCBS makes cost-based scheduling a feasible option through a substantial efficiency improvement.
- the system can be applied to other SLA-based resource and workload management in cloud databases, such as job dropping, preempt-and-restart scheduling, and MPL tuning for the purpose of SLA profit optimization.
- SLA design will be more complicated, but interesting, in the presence of such SLA profit-optimizing techniques from cloud service providers; e.g., how should a client design SLAs so that the provider will accept them, while the client still reliably receives a certain level of service given the competition with other users?
- the cloud provider or the client may want to define additional constraints on certain jobs in addition to SLA penalty costs.
- cloud-service providers may want to provide differentiated quality of service to certain customers. The reasons vary: e.g., i) the service provider's desire to provide some guarantee against starvation for all customers, such that no job experiences a delay greater than a pre-set threshold; ii) explicit customer requests in addition to the SLA-based price agreement; and iii) the service provider's internal planning among multiple service components.
- Such constraint enforcement along with SLA cost optimization is a valuable feature.
- the method runs on a framework called constraint-conscious optimization scheduling (CCOS) that can schedule jobs such that it enforces desired constraints with a marginal sacrifice on the optimization metric (e.g. SLA penalty cost).
- the system provides an effective and robust capacity planning framework for cloud resource management.
- the key elements of the framework are twofold: i) the capacity planner does not need to assume any distribution for user traffic and job lengths, and ii) it works with the cost-based scheduler and cost-based dispatcher modules in a tightly integrated manner to enable end-to-end profit optimization in the system.
- the system has been validated through extensive testing, using real user access data from the Yahoo video site and TPC-H benchmarks.
- the invention may be implemented in hardware, firmware or software, or a combination of the three.
- the invention is implemented in a computer program executed on a programmable computer having a processor, a data storage system, volatile and non-volatile memory and/or storage elements, at least one input device and at least one output device.
- the computer preferably includes a processor, random access memory (RAM), a program memory (preferably a writable read-only memory (ROM) such as a flash ROM) and an input/output (I/O) controller coupled by a CPU bus.
- the computer may optionally include a hard drive controller which is coupled to a hard disk and CPU bus. Hard disk may be used for storing application programs, such as the present invention, and data. Alternatively, application programs may be stored in RAM or ROM.
- I/O controller is coupled by means of an I/O bus to an I/O interface.
- I/O interface receives and transmits data in analog or digital form over communication links such as a serial link, local area network, wireless link, and parallel link.
- a display, a keyboard and a pointing device may also be connected to I/O bus.
- separate connections may be used for I/O interface, display, keyboard and pointing device.
- Programmable processing system may be preprogrammed or it may be programmed (and reprogrammed) by downloading a program from another source (e.g., a floppy disk, CD-ROM, or another computer).
- Each computer program is tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein.
- the inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.
Abstract
Systems and methods are disclosed for efficient maintenance of job prioritization for profit maximization in cloud-based service delivery infrastructures with multi-step cost structure support by breaking multiple steps in the SLA of a job into corresponding cost steps; generating a segmented cost function for each cost step; creating a cost-based-scheduling (CBS)-priority value associated with a validity period for each segment based on the segmented cost function; and choosing the job with the highest CBS priority value.
Description
- This application claims priority to U.S. Provisional Application Ser. Nos. 61/294,246 and 61/294,254, both filed on Jan. 12, 2010, the contents of which are incorporated by reference.
- This application relates to Constraint-Conscious Optimal Scheduling for Cloud Infrastructures.
- Cloud computing has emerged as a promising computing platform with its on-demand scaling capabilities. Typically, a cloud service delivery infrastructure is used to deliver services to a diverse set of clients sharing the computing resources. By providing on-demand scaling capabilities without any large upfront investment or long-term commitment, it is attracting a wide range of users, from web applications to Business Intelligence applications. The database community has also shown great interest in exploiting this new platform for scalable and cost-efficient data management. Arguably, the success of cloud-based services depends on two main factors: quality of service, which is identified through Service Level Agreements (SLAs), and operating cost management.
- Users of cloud computing services are not only able to significantly reduce their IT costs and turn their capital expenditures to operational expenditures, but also able to speed up their innovation capabilities thanks to the on-demand access to vast IT resources in the cloud. While the cloud computing offers the clients all these advantages, it creates a number of challenges for the cloud service providers who try to create successful businesses: they have to handle diverse and dynamic workloads in a highly price-competitive way, to convince the potential clients to use the service delivery model instead of in-house hosting of IT functions. In addition, the quality of service should be comparable in all aspects to the capabilities that can be delivered off of an IT infrastructure under full control of clients. Thus, the success of cloud-based services arguably depends on the two major factors: quality of service, which is captured as Service Level Agreements (SLAs) and operational cost management.
- The consistent delivery of services within SLAs is crucial for sustained revenue for the service provider. Delivering those services incurs operational costs and the difference between the revenue and the operational costs is the service provider's profit, which is required for any commercially viable businesses.
- The total profit, P, of the cloud service provider is defined as P=Σiri−C, where ri is the revenue that can be generated by delivering the service for a particular job i and C is the operational cost of running the service delivery infrastructure. The revenue, R, is defined for each job class in the system. Each client may have multiple job classes based on the contract. A stepwise function is used to characterize the revenue as shown in
FIG. 1. Intuitively, the clients agree to pay varying fee levels for corresponding service levels delivered for a particular class of requests, i.e., job classes in their contracts. For example, the client may be willing to pay a higher rate for lower response times. As shown in FIG. 1, the client pays R0 as long as the response time is between 0 and X1, pays R1 for the interval between X1 and X2, and so on. This characterization allows a more intuitive interpretation of SLAs with respect to revenue generation. Once the revenue function is defined, it induces a cost function, called the SLA cost function. If the level of service changes, the amount that the provider can charge the client also changes according to the contract. Due to limitations on the availability of infrastructure resources, the cloud service provider may not be able, or may choose not, to attend to all client requests at the highest possible service levels. Dropping or raising service levels causes a corresponding loss or increase in revenue; the loss of potential revenue corresponds to the SLA cost. For example, there is no revenue loss, hence no SLA penalty cost, as long as the response time is between 0 and X1 in FIG. 1. Likewise, increasing the amount of infrastructure resources to raise service levels results in increased operational cost. As a result, the key problem for the provider is to come up with optimal service levels that maximize its profit based on the agreed-upon SLAs. - SLAs in general may be defined in terms of various criteria, such as service latency, throughput, consistency, security, etc. One embodiment focuses on service latency, or response time. Even with latency alone, there can be multiple specification methods:
-
- Mean-value-based SLA (MV-SLA): For each job class, quality of service is measured based on mean response time. This is the least robust type of SLAs from the customers' perspective.
- Tail-distribution-based SLA (TD-SLA): For each job class, quality of service is measured in terms of the portion of jobs finished by a given deadline. For instance, a user may want 99% of jobs to be finished within 100 ms.
- Individual-job-based SLA (IJ-SLA): Quality of service is measured using the response time of individual jobs. Unlike MV-SLA or TD-SLA above, in IJ-SLA any single job with a poor service quality immediately affects the measured quality of service and incurs some SLA penalty cost.
- For each specification method, the SLA can be classified either as a hard SLA or a soft SLA as follows.
-
- Hard SLA: A hard SLA has a single hard deadline to meet, and if the deadline is missed, it is counted as a violation. The definition of this type of SLA, or constraint, may come from the client or the cloud service provider. There are cases where a cloud provider needs to use hard SLAs as a tool to control various business objectives, e.g., controlling the worst-case user experience. Therefore the violation of a hard SLA may not correspond to financial terms in the client contracts.
- Soft SLA: A soft SLA corresponds to agreed levels of service in the contract. It differs from the hard SLA in that even after a violation, the SLA penalty cost may continue to increase as the response time further increases. Although the SLA penalty cost may have various shapes, a stepwise function is a natural choice used in real-world contracts.
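As an illustration (not from the patent), a stepwise soft-SLA penalty cost of this kind can be derived from a stepwise revenue function such as the one in FIG. 1; the boundaries and revenue values below are hypothetical:

```python
def make_sla_cost(revenue_steps):
    """Build a stepwise SLA penalty cost function from a stepwise revenue
    function. revenue_steps is [(boundary, revenue), ...]: the client pays
    `revenue` while the response time falls in (previous boundary, boundary].
    The penalty is the revenue lost relative to the best service level."""
    best = revenue_steps[0][1]  # revenue at the fastest service level
    def cost(response_time):
        for boundary, revenue in revenue_steps:
            if response_time <= boundary:
                return best - revenue
        return best - revenue_steps[-1][1]  # beyond the last boundary
    return cost

# Hypothetical contract: pay 10 if served within 1 s, 6 within 2 s, 0 after.
cost = make_sla_cost([(1.0, 10.0), (2.0, 6.0), (float("inf"), 0.0)])
```

The penalty is zero inside the first step and grows step by step as the response time worsens, matching the soft-SLA shape described above.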
- The unit of operational cost is a server cost per hour. Consequently, the total operational cost, C, is the sum of the individual server costs for a given period of time. The individual server cost is the aggregation of all the specific cost items involved in operating a server, such as energy, administration, and software, among others. Conventional scheduling systems typically rely on techniques that do not primarily consider profit maximization; they mainly focus on optimizing metrics such as average response time.
- In a first aspect, systems and methods are disclosed to schedule jobs in a cloud computing infrastructure by receiving in a first queue jobs with deadlines or constraints specified in a hard service level agreement (SLA); receiving in a second queue jobs with a penalty cost metric specified in a soft SLA; and minimizing both constraint violation count and total penalty cost in the cloud computing infrastructure by identifying jobs with deadlines in the first queue and delaying jobs in the first queue within a predetermined slack range in favor of jobs in the second queue to improve the penalty cost metric.
- In a second aspect, systems and methods are disclosed for efficient maintenance of job prioritization for profit maximization in cloud-based service delivery infrastructures with multi-step cost structure support by breaking multiple steps in the SLA of a job into corresponding cost steps; generating a segmented cost function for each cost step; creating a cost-based-scheduling (CBS)-priority value associated with a validity period for each segment based on the segmented cost function; and choosing the job with the highest CBS priority value.
- Advantages of the preferred embodiments may include one or more of the following. The system provides very efficient job prioritization for diverse pricing agreements across diverse clients and heterogeneous infrastructure resources. In cloud computing infrastructures, the system enables profit optimization, which is a vital economic indicator for sustainability. The system considers discrete levels of costs corresponding to varying levels of service, which is more realistic in many real-life systems. The system is also efficient and low enough in computational complexity to be feasible for high-volume and large infrastructures.
-
FIG. 1 shows an exemplary system diagram of an Intelligent Cloud Database Coordinator (ICDC). -
FIG. 2 shows an exemplary cost function segmentation in iCBS. -
FIG. 3 shows a Constraint-Conscious Optimization Scheduling (CCOS) system. -
FIG. 4 shows an exemplary slack tree used with the CCOS system. -
FIG. 5 shows an example where a new job is inserted into the slack tree. -
FIG. 6 shows an exemplary process for prioritization and scheduling of incoming jobs based on cost functions. -
FIG. 7 shows an exemplary process to schedule jobs prioritized inFIG. 6 . -
FIG. 1 shows an exemplary system diagram of the ICDC. The ICDC manages very large cloud service delivery infrastructures. The system architecture focuses on components that are relevant and subject to optimization to achieve the goal of SLA-based profit optimization of resource and workload management in cloud databases. Distinctively optimizing individual system components with a global objective in mind provides a greater degree of freedom to customize operations. This approach yields higher degrees of performance, customizability based on variable business requirements, and end-to-end profit optimization. - In one embodiment,
clients 10 communicate with ICDC using a standard JDBC API and make plain JDBC method calls to talk to various databases without changing code. The clients 10 communicate with a query router 20. An autoscaler 30 monitors the queue-length log and query-response-time log and determines if additional nodes should be added by an add/drop controller 40. The controller issues commands to add/drop nodes to a database replication cluster 50, such as a MySQL replication cluster. Although the system of FIG. 1 shows specific product names, such as MySQL and Active MQ, for example, the system is not limited to those products. For example, MySQL can be replaced with other database products such as Oracle, among others. - The ICDC has a Client Data Module that is responsible for maintaining client-specific data such as cost functions and SLAs, which are derived from client contracts. Once captured, this information is made available to other system modules for resource and workload management purposes. An ICDC Manager monitors the status of the system, e.g., system load, queue lengths, query response time, and CPU and I/O utilization. All this information is maintained by the System Data module. Based on the system monitoring data, the ICDC Manager directs the Cluster Manager to add or remove servers to/from the Resource Pool to optimize the operational cost while keeping the SLA costs in check. The ICDC Manager also provides the dispatcher and scheduler modules with the dynamic system data. An Online Simulator is responsible for dynamic capacity planning. It processes the client data and dynamic system data to assess optimum capacity levels through simulation. It has capabilities to run simulations in both offline and online modes. A Dispatcher takes incoming client calls and immediately forwards the queries (or jobs) to servers based on the optimized dispatching policy.
The dispatching policy is constantly tuned according to dynamic changes in the system, such as user traffic, addition/removal of processing nodes. A Scheduler decides the order of execution of jobs at each server. After the client requests are dispatched to individual servers based on the dispatching policy, individual scheduler modules are responsible for prioritization of dispatched jobs locally by forming a queue of queries, from which a query is chosen and executed in the database. The choice of which query to execute first makes a difference in the SLA penalty costs observed.
- The system uses an SLA-based profit optimization approach for building and managing a data management platform in the cloud. The problem of resource and workload management is addressed for a data management platform hosted on an Infrastructure-as-a-Service (IaaS) offering, e.g., Amazon EC2. The data management platform can be thought of as a Platform-as-a-Service (PaaS) offering that is used by Software-as-a-Service (SaaS) applications in the cloud.
- In the system model, each server node represents a replica of a database. When a query (job) arrives, a dispatcher immediately assigns it to one among multiple servers according to a certain dispatching policy; for each server, a resource scheduling policy decides which query to execute first among those waiting in the associated queue; and a capacity planning component is in charge of determining how many resources (i.e., database servers) are to be allocated in the system. With this abstraction, the system optimizes three tasks: query dispatching, resource scheduling, and capacity planning.
- Next, the scheduling component of the ICDC system is discussed. The Scheduler has two distinct features: it is cost sensitive and constraint conscious. In one embodiment using a conventional heuristic cost-based scheduling called CBS, the system evaluates the priorities of the n jobs in the queue individually, each in constant time, and picks the job with the highest priority. To efficiently evaluate the priority of job i, CBS considers two possible cases: i) the job is served immediately at current time t, which will incur a cost of c_i(t), where c_i(t) is the cost function of job i with queue wait time t, and ii) the job gets delayed by a wait time, τ, and then served, which will incur a cost of c_i(t+τ). Since the value of τ is not known, CBS uses a probability density function and computes the expected cost based on it. Thus, the CBS priority for job i is,
-

p_i(t) = ∫_0^∞ a(τ) · c_i(t + τ) dτ − c_i(t)   (1)

- where a(τ) is a probability distribution used to model the waiting time of a job in a queue if it is not served immediately. After the p_i(t) value is computed, it is divided by the job's service time, since a longer job occupies the server for a longer time, delaying other jobs for a longer period. The exponential distribution, a(τ) = (1/β)·e^(−τ/β), works well, with β = 1.4.
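A minimal numeric sketch of Equation (1), assuming a stepwise cost function and the exponential a(τ) above; the job parameters are hypothetical and the integral is approximated by summation over a truncated horizon:

```python
import math

BETA = 1.4  # scale of the exponential waiting-time distribution, per the text

def cbs_priority(cost_fn, t, service_time, beta=BETA, horizon=50.0, dt=0.01):
    """Approximate p_i(t) = integral of a(tau)*c_i(t+tau) dtau - c_i(t),
    with a(tau) = (1/beta)*exp(-tau/beta), then divide by the service time
    (a longer job delays the others for longer)."""
    expected = 0.0
    tau = 0.0
    while tau < horizon:
        a = math.exp(-tau / beta) / beta        # exponential pdf
        expected += a * cost_fn(t + tau) * dt   # Riemann-sum integration
        tau += dt
    return (expected - cost_fn(t)) / service_time

# Hypothetical one-step SLA cost: penalty 10 once waiting exceeds 1 second.
step = lambda x: 0.0 if x <= 1.0 else 10.0
```

With this cost shape, a job that has already waited 0.9 s gets a higher priority than a freshly arrived one, since it is about to cross the cost step.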
- Because CBS examines all the jobs in the queue in order to pick the next job to serve, in systems where queues can grow very long and job service times can be very short, CBS can be slow. Another embodiment uses an efficient version of CBS called iCBS (incremental CBS). iCBS uses a priority queue to maintain the list of jobs according to their priority and dynamically updates the priority queue when a new job arrives or an existing one is removed from the queue. Because the priority queue is maintained incrementally, iCBS has logarithmic time complexity.
- The iCBS system breaks multiple steps of a cost function into multiple cost functions, as shown in
FIG. 2. Each of the segmented cost functions has its own validity period, (x1, x2], and it is used to compute the priority of the corresponding job between time x1 < x ≤ x2. Segmentation is done by removing and pushing down steps as follows. The first segment is the same as the original cost function, and its validity period is the duration of the first step, i.e., (0, x1]. The second segment is obtained by removing the first step and pushing down the rest of the steps by the second step's cost (or y-value). Its validity period is the duration of the second step, i.e., (x1, x2]. This is repeated until the last step is reached, where the cost is defined as zero for its validity period, which is the duration of the last step, i.e., (x2, ∞) in the example. - As a(τ) in Equation 1 follows an exponential distribution, the relative priority order between the valid segments of two jobs remains unchanged over time, as long as the segments are still valid. The iCBS process decreases newly arrived jobs' priorities, instead of increasing the existing jobs' priorities, to avoid modifying the priorities of jobs already in the queue, while keeping the relative order the same. - In the iCBS process, for new job arrivals, one segmented cost function is generated for each cost step, and a segmented-CBS-priority associated with the validity period of each segment is generated. Then, each priority value is divided by e^((t−t0)/β), where t is the current time and t0 is a fixed time instance, such as system time zero. The segmented-CBS-priority objects are inserted into a priority queue, where the objects are ordered by CBS priority. Among all segments corresponding to the same job, segment i will always have a higher CBS priority than segment j, where i < j. The system also adds a nextSegment pointer from the segmented-CBS-priority object of segment i to that of segment i+1, to chain the segments of the same job. - For job scheduling, at the time of picking the next job to serve, the head of the segmented-CBS-priority queue, which has the highest priority value, is pulled to see whether its validity period has expired or the corresponding job has already been scheduled via an earlier segment. In either case, the segment is thrown away and the next head is pulled off the priority queue, until the system finds a segmented-CBS-priority with an unexpired validity period whose job has not been scheduled yet. When found, the system marks the other segments of the same job in the priority queue as scheduled, using the nextSegment pointer.
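The segmentation and lazy dequeue described above can be sketched as follows. This is my simplified reading, not the patent's code: instead of nextSegment pointers, a scheduled-job set marks sibling segments as stale, and the time-decay scaling is omitted.

```python
import heapq
import itertools

def segment_steps(steps):
    """steps: [(x1, c0), (x2, c1), ..., (inf, ck)] -- cost c0 on (0, x1],
    c1 on (x1, x2], and so on. For each step k, return its validity period
    plus the cost function with earlier steps removed and later costs pushed
    down by step k's cost, so the last segment costs zero."""
    segments, start = [], 0.0
    for k, (end, cost) in enumerate(steps):
        remaining = [(x, c - cost) for x, c in steps[k:]]
        segments.append((start, end, remaining))
        start = end
    return segments

class SegmentedQueue:
    """Priority queue of per-segment entries with lazy invalidation."""
    def __init__(self):
        self._heap, self._tie = [], itertools.count()
        self._done = set()  # jobs already scheduled via some segment

    def insert(self, job_id, segments_with_priority):
        for priority, lo, hi in segments_with_priority:
            heapq.heappush(self._heap, (-priority, next(self._tie), job_id, lo, hi))

    def pop(self, now):
        while self._heap:
            _, _, job_id, lo, hi = heapq.heappop(self._heap)
            if job_id in self._done or not (lo < now <= hi):
                continue  # stale or expired segment: discard and keep pulling
            self._done.add(job_id)  # marks this job's sibling segments stale
            return job_id
        return None
```

Because stale entries are simply skipped at pop time, each insertion and removal stays logarithmic in the queue size, matching the iCBS complexity claim.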
- In Constraint-Conscious Optimization Scheduling, iCBS achieves near-optimal cost, but it is prone to starvation, as its sole objective is cost minimization. In the real world, however, this may not be desirable. For instance, service providers may want to provide a certain bottom-line performance guarantee for all jobs, such as a guarantee that all jobs finish within ten times the job service time. Also, it may be desirable to provide a worst-case performance guarantee for selected VIP customers. These types of hard SLAs need to be enforced, on top of the soft SLAs that affect SLA costs.
- To meet such needs, a scheduling embodiment of
FIG. 3 manages hard SLAs, i.e., deadlines or constraints, and soft SLAs, i.e., the optimization metric. This embodiment optimizes the metric while achieving the (near-)minimal possible constraint violation. As violation of constraints or deadlines may be unavoidable in general (jobs may arrive in a bursty fashion), the system tries to make the minimum possible number of violations. Only a subset of jobs may have deadlines, as in the VIP example above. The optimization metric can be the average response time or the average cost discussed above. -
FIG. 3 shows a Constraint-Conscious Optimization Scheduling (CCOS) system. CCOS employs a dual-queue approach: 1) an opti-queue 110 is an optimization queue where all jobs are queued; SJF is used without modification if response-time minimization is the optimization goal, and iCBS is used if cost minimization is the goal; and 2) a constraint-queue 120 employs the EDF (Earliest Deadline First) process; only the jobs with deadlines are queued here. FIG. 3's CCOS balances between the following two extremes: 1) ignore deadlines (always schedule jobs from the opti-queue, achieving the best cost-based results, with uncontrolled deadline violation); and 2) blindly pursue violation control (schedule jobs from the constraint-queue whenever it has a job, and attend to the opti-queue only when the constraint-queue is empty). A job is deleted from both queues when it is scheduled from either one. The balance is achieved by observing that deadlines are not always urgent. There may be some job with a deadline in the constraint-queue, but it may be able to wait some time, called slack, without violating the deadline. Once the slack is known, the system can delay that job and attend to the opti-queue, to improve the optimization metric. - The scheduling system of
FIG. 3 manages both hard SLAs, i.e., deadlines or constraints, and a cost optimization metric, also called soft SLAs. The operating cost metrics are optimized while possible constraint violations are minimized. This is done by a dual-queue-based component, where one queue handles the hard SLAs and the other handles the soft SLAs, and a system-monitoring component that efficiently monitors those queues.
-
-

s_i = d_i − t − (l_1 + l_2 + … + l_i)
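Under this definition (reconstructed from the surrounding text: a job's slack is its deadline minus the current time minus the total length of the jobs at or before it in EDF order), the linear-scan computation reads:

```python
def min_slack(jobs, t):
    """jobs: [(length, deadline), ...] in non-decreasing deadline (EDF)
    order. Returns min_i s_i where s_i = d_i - t - (l_1 + ... + l_i)."""
    work, slack = 0.0, float("inf")
    for length, deadline in jobs:
        work += length                        # cumulative execution time
        slack = min(slack, deadline - t - work)
    return slack
```

The returned value is what CCOS compares against the threshold s_th to decide when the constraint-queue must be served.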
queue 120. The only parameter of CCOS is sth, and within a large range, i.e. [3*mean-job-length, 10*mean-job-length], the performance is not very sensitive to the parameter value. A data structure named slack tree can be used that supports fast minimum slack monitoring. - An example of slack tree is shown in
FIG. 4 . Each leaf node has a job in the order of non-decreasing deadline from left to right.FIG. 4 shows a binary tree for illustration, but slack tree can have arbitrary fan-outs at each node. Each node maintains two values: left sibling execution time total (LSETT) and minimum slack in subtree (MSS). LSETT of a node is the total execution time, or total job length, of left siblings, which are the nodes to the left, sharing the same parent node. MSS of a node is the minimum slack in the subtree rooted at the node. - MSS of a leaf node nodei can be determined as MSSi=di−li, where di is the deadline of the node i′s job and lt is the node i′s job length. MSS of a non-leaf node nodei is recursively computed as:
-
- MSS_i = min_{node_j ∈ children(node_i)} (MSS_j − LSETT_j)
- Since the slack tree has all jobs in constraint-queue as its leaf nodes, each insertion and deletion from the queue translates to an insertion and a deletion to the tree. Slack tree efficiently supports these frequent changes.
-
FIG. 5 shows an example where a new job J9 is inserted into the slack tree. Underlined numbers indicate the information updated from FIG. 4. Based on its deadline, J9 is inserted between J3 and J4. This triggers updates of the LSETT and MSS of other nodes as follows. The parent node of J9 updates its MSS value from 30 to 29, as the slack of J4 is reduced by 1. Its updated MSS affects its parent node's MSS as well, updating it from 15 to 14. Its right sibling node is affected as well, in that its LSETT is increased by 1, from 30 to 31. These two nodes report their updated contributions to the root node's MSS, 14 and 9, respectively, and the root node updates its MSS from 10 to 9. Given a node fan-out of k, insertion takes k time at each level, and therefore takes O(k·log_k n), or simply O(log n). Deletion is done in a similar fashion, giving the same time complexity.
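The LSETT/MSS recursion can be sketched for a static tree as follows; this is my own minimal version, and the incremental insert/delete updates of FIG. 5 are omitted:

```python
class Leaf:
    """A job at a leaf: MSS = deadline - job length."""
    def __init__(self, length, deadline):
        self.length, self.deadline = length, deadline
    def total_length(self):
        return self.length
    def mss(self):
        return self.deadline - self.length

class Inner:
    """MSS_i = min over children j of (MSS_j - LSETT_j)."""
    def __init__(self, children):
        self.children = children
    def total_length(self):
        return sum(c.total_length() for c in self.children)
    def mss(self):
        best, lsett = float("inf"), 0.0
        for child in self.children:
            best = min(best, child.mss() - lsett)
            lsett += child.total_length()   # work of left siblings so far
        return best
```

The root's mss() equals the minimum slack of all queued jobs at time zero, consistent with the per-job slack definition given earlier.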
- Next, cost-based dispatching is discussed. Some simple traditional dispatching policies include random and round robin. While being simple, these policies do not perform well, especially given highly variable job length, such as that in long tail distributions. Other more sophisticated policies include Join-shortest-queue (JSQ) or least-work-left (LWL), where the former sends jobs to the server with the fewest jobs in the queue and the latter sends jobs to the server whose the sum of job lengths in the queue is the least among all servers. In particular, LWL is locally optimal policy in that each job will choose the server that will minimize its waiting time, though it does not necessarily minimize the total response time of all jobs. When job lengths are highly variable, as in heavy tail distributions, it has been shown that SITA (Size Interval Task Assignment) often outperforms LWL. In SITA, jobs are dispatched according to its job length, such that server-0 will get the smallest jobs, server-1 will get the next longer jobs, and the last server will get the longest jobs. In choosing the boundaries, SITA-E (SITA with Equal load), the most popular type of SITA, ensures that total work are equal across all servers. However, it has been observed that SITA-E does not necessarily minimize the average response time, and therefore SITA-U has been proposed, which unbalances the load to achieve optimal average response time.
- SITA-E or SITA-U, however, may not be the best candidate for SLA-based dispatching, since they are not aware of SLA penalty cost function and do not necessarily minimize the total SLA penalty cost. For instance, SITA-E would send equal load to all servers, but it may be the case that short jobs are more expensive in terms of SLA penalty cost, and the system may want to send less load to short job servers than long-job servers. Likewise, SITA-U may find its own optimal boundaries of splitting jobs according to length for response time minimization, but it may not be the best set of boundaries for cost minimization.
- Finding the optimal boundaries for SITA-UC, unfortunately, is not an easy problem. To solve the problem, a simulation-based technique can be used for SITA-UC boundary tuning In an exemplary case of two server dispatching, the system needs to decide a single boundary that divides job size intervals into two. To do this, multiple boundaries can be tested between the shortest and the longest job lengths, and the boundary that gives the lowest cost can be used. An approximate assumption that the boundary-value-to-SLA-cost function is near-unimodal can be used. A function is unimodal if it has only one local minima, which is its global minima; and it is near-unimodal, if it has multiple local minimas, but they are all very close to the global minima.
- Tuning of a single boundary is done in two phases. In the first phase, a lower bound and an upper bound on the global minimum are located. Starting from the boundary previously found, the process makes exponential jumps to the left, i.e., dividing by 2 each time, to find the lower bound: when f(0.5x) > f(x), then 0.5x is the lower bound. Likewise, the system performs an upper-bound search to the right using exponential jumps, i.e., multiplying by 2 each time, and when f(x) < f(2x), then 2x is the upper bound. With these two bounds, the system performs a narrowing-down search in the second phase. The system divides the interval bounded by the lower bound xLB and the upper bound xUB into three equal-length sections using two division points named x1 and x2. The system then evaluates f(x1) and f(x2) using two simulation runs. If f(x1) < f(x2), the global minimum is within [xLB, x2], and the next round of search divides this interval into three sections. If f(x1) > f(x2), then the global minimum is in [x1, xUB], and the search is limited to this smaller interval in the next round. This process is repeated until xLB/xUB is greater than a parameter StopPrecision, such as 0.9.
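Assuming a near-unimodal boundary-to-cost function f (each evaluation standing in for a simulation run), the two-phase search can be sketched as:

```python
def tune_boundary(f, x0, stop_precision=0.9):
    """Two-phase search for the cost-minimizing SITA-UC boundary.
    Phase 1 brackets the minimum with exponential jumps from x0;
    phase 2 narrows the bracket with two division points per round."""
    lb = x0
    while f(lb / 2) <= f(lb):   # jump left until the cost rises again
        lb /= 2
    lb /= 2                     # f(lb) > f(2*lb): lower bound found
    ub = x0
    while f(ub * 2) <= f(ub):   # jump right until the cost rises again
        ub *= 2
    ub *= 2                     # f(ub) > f(ub/2): upper bound found
    while lb / ub < stop_precision:
        x1 = lb + (ub - lb) / 3
        x2 = lb + 2 * (ub - lb) / 3
        if f(x1) < f(x2):
            ub = x2             # minimum lies in [lb, x2]
        else:
            lb = x1             # minimum lies in [x1, ub]
    return (lb + ub) / 2
```

Each narrowing round discards a third of the bracket, so the number of simulation runs grows only logarithmically with the initial bracket width.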
- For more than two servers, a single cutoff can be used to divide short jobs and long jobs, decided by the above SITA-UC cutoff search process. The servers are divided into two groups, one for short jobs and another for long jobs, and within each group LWL is used. The system can also generalize the dispatching policy to a context where each job may be served by only a subset of the servers. In this case, capability groups are set up, where jobs and servers each belong to one of them, and SITA is run for each capability group.
- Capacity planning is discussed next. The capacity planning process allocates resources in an intelligent way, considering factors such as job traffic and the profit model, so that the total profit is maximized. The system uses observed job traffic patterns and unit server costs as the basis for immediate-future planning. Simply adding more servers will increase the operational cost. Therefore, the task of capacity planning is to identify the best allocation that maximizes the total profit.
- Simulation-based capacity planning is used in one embodiment. The system has a discrete event simulator module that is responsible for finding optimum resource allocations through planned simulations.
- Capacity planning simulation can be online or offline. In Offline Simulation for Capacity Planning, the simulation receives the given job characteristics, the numbers of servers to handle the jobs, and the different operational costs. Simulation-based capacity planning relies on simulations to estimate, in an offline manner, the profits under different server counts in order to decide the best setting. The inputs to the simulation are the profit model and the job characteristics. For already-running systems, those inputs are derived from real query logs. At the initialization stage of a system, certain data statistics, such as the distribution of job inter-arrival time and that of job service time, are assumed unless they are provided by the client. One embodiment of the capacity planner uses the most common distributions to initially characterize the data statistics. After that, it refines those initial assumptions by constantly monitoring the system. This feature allows the system to start without relying heavily on client input for the data statistics.
- In offline simulations, data characteristics are assumed to be time invariant. Because such an assumption does not always hold in cloud computing, the simulation results should be updated in real time. If time allows, the offline simulation can be repeated in real time. However, in many cases offline simulations are not acceptable, either because they consume too many resources or because they take too long to produce final answers. In other words, simulations conducted in real time should be quick and consume fewer resources. Within a time budget, an online simulation can estimate an approximately optimal solution by using ICDC's online simulation capabilities. The main ideas are (1) instead of multiple simulation runs at each server setting, a single run is done, and (2) instead of checking all possible server numbers, the system checks a subset of server numbers. The cost estimate is then computed from a polynomial regression, which handles both the variance due to the single run and the interpolation for the unchecked server settings.
- FIG. 6 shows an exemplary process for prioritization and scheduling of incoming jobs based on cost functions. First, a new job is received (210). Next, the process removes one step from the cost function (220). The process then determines priority values and creates a segment (230), as illustrated in FIG. 2. The process checks whether there are additional steps in the cost function (240); if so, it loops back to 220 to handle the next step, and otherwise it exits (250). -
FIG. 7 shows an exemplary process to schedule the jobs prioritized in FIG. 6. The process pulls a segment from the head of the priority queue (300). The process checks whether the validity of the segment has expired (310) and, if not, whether the job has already been scheduled (320). From 310 or 320, if the segment is invalid or the job has been scheduled, the process ignores the segment (330) and loops back to 300 to handle the next segment. From 320, if the job has not been scheduled, the process marks the other segments of the job as scheduled (340) and schedules the job (350). - The result of the foregoing is a data management platform hosted on an Infrastructure-as-a-Service in the cloud, such as Amazon EC2. The system optimizes a database service provider's profit while delivering the services according to customer SLAs. The system identifies the major components of the cloud service delivery architecture that must be optimized to reach this goal.
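The segment-pulling loop of FIG. 7 can be sketched as below. This is a minimal sketch under stated assumptions: the tuple layout, the names, and the lazy marking of sibling segments via a shared set are illustrative choices, not the patent's exact mechanism (Python's `heapq` is a min-heap, so priorities are negated).

```python
import heapq

def schedule_next(pq, now, scheduled):
    """Pull segments off the priority queue until a live one is found.
    pq holds (-priority, job_id, valid_until) tuples ordered by priority;
    'scheduled' is the set of job ids already scheduled by an earlier
    segment."""
    while pq:
        _neg_priority, job_id, valid_until = heapq.heappop(pq)
        if valid_until < now or job_id in scheduled:
            continue            # 310/320 -> 330: ignore the segment, back to 300
        scheduled.add(job_id)   # 340: the job's later segments now test as scheduled
        return job_id           # 350: schedule the job
    return None
```

Marking the job's other segments as scheduled (step 340) is done lazily here through the shared `scheduled` set, rather than by walking a next-segment chain as in claims 5-6; either realization discards stale segments in logarithmic time per pull.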
- The system explicitly considers the SLA penalty cost function of each job at the core of the scheduling, dispatching, and capacity planning problems to achieve an overall cost-optimal solution. The system provides a cost-based and constraint-conscious resource scheduling method, called incremental Cost-Based Scheduling (iCBS), for profit optimization. iCBS makes cost-based scheduling a feasible option through a substantial efficiency improvement.
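One ingredient of the incremental maintenance, suggested by claim 2's division of each priority value by e^((t−t0)/a), can be sketched as follows; `T0`, `ALPHA`, and the function name are hypothetical, and this shows only the normalization idea, not the full iCBS method.

```python
import math

T0 = 0.0        # fixed reference time instant (hypothetical choice)
ALPHA = 60.0    # hypothetical decay constant, in the same units as time

def cbs_priority_at_t0(instant_priority, now):
    """Normalize a priority computed at time 'now' back to the fixed
    instant T0 by dividing by e^((now - T0)/ALPHA). Because every queued
    priority is normalized the same way, relative order is preserved as
    the clock advances, so stored priorities never need recomputation."""
    return instant_priority / math.exp((now - T0) / ALPHA)
```

The payoff is that a priority queue keyed on these normalized values stays correctly ordered over time without periodic re-sorting: a job whose instantaneous priority grows by the factor e^(Δ/ALPHA) over an interval Δ maps to the same normalized value.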
- The system can be applied to other SLA-based resource and workload management in cloud databases, such as job dropping, preempt-and-restart scheduling, and MPL tuning for the purpose of SLA profit optimization. From the cloud user perspective, SLA design becomes more complicated, but more interesting, in the presence of such SLA profit-optimizing techniques from cloud service providers: e.g., how should a client design his or her SLAs so that the provider will accept them, while still getting a certain level of service reliably delivered given the competition with other users?
- Also, the cloud provider or the client may want to define additional constraints on certain jobs beyond the SLA penalty costs. For instance, cloud service providers may want to provide differentiated quality of service to certain customers. The reasons can vary: e.g., i) the service provider's desire to guarantee all customers against starvation, such that no job experiences a delay greater than a preset threshold, ii) an explicit customer request in addition to the SLA-based price agreement, and iii) the service provider's internal planning among its multiple service components. Such constraint enforcement along with SLA cost optimization is a valuable feature. The method runs on a framework called constraint-conscious optimization scheduling (CCOS) that schedules jobs so as to enforce the desired constraints with only a marginal sacrifice in the optimization metric (e.g., SLA penalty cost).
- Cost-based dispatching is optimally handled. The system dispatches jobs among multiple servers with a Size Interval Task Assignment (SITA)-based dispatching policy, called SITA-UC, for the purpose of cost minimization, and a SITA boundary tuning process is used.
- The system provides an effective and robust capacity planning framework for cloud resource management. The key elements of the framework are twofold: i) the capacity planner does not need to assume any distribution for user traffic or job lengths, and ii) it works with the cost-based scheduler and cost-based dispatcher modules in a tightly integrated manner to enable end-to-end profit optimization in the system. The system has been validated through extensive testing using real user access data from the Yahoo video site and the TPC-H benchmarks.
- The invention may be implemented in hardware, firmware or software, or a combination of the three. Preferably the invention is implemented in a computer program executed on a programmable computer having a processor, a data storage system, volatile and non-volatile memory and/or storage elements, at least one input device and at least one output device.
- By way of example, a computer with digital signal processing capability to support the system is discussed next. The computer preferably includes a processor, random access memory (RAM), a program memory (preferably a writable read-only memory (ROM) such as a flash ROM) and an input/output (I/O) controller coupled by a CPU bus. The computer may optionally include a hard drive controller which is coupled to a hard disk and CPU bus. Hard disk may be used for storing application programs, such as the present invention, and data. Alternatively, application programs may be stored in RAM or ROM. I/O controller is coupled by means of an I/O bus to an I/O interface. I/O interface receives and transmits data in analog or digital form over communication links such as a serial link, local area network, wireless link, and parallel link. Optionally, a display, a keyboard and a pointing device (mouse) may also be connected to I/O bus. Alternatively, separate connections (separate buses) may be used for I/O interface, display, keyboard and pointing device. Programmable processing system may be preprogrammed or it may be programmed (and reprogrammed) by downloading a program from another source (e.g., a floppy disk, CD-ROM, or another computer).
- Each computer program is tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein. The inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.
- The invention has been described herein in considerable detail in order to comply with the patent Statutes and to provide those skilled in the art with the information needed to apply the novel principles and to construct and use such specialized components as are required. However, it is to be understood that the invention can be carried out by specifically different equipment and devices, and that various modifications, both as to the equipment details and operating procedures, can be accomplished without departing from the scope of the invention itself.
Claims (20)
1. A method for efficient maintenance of job prioritization for profit maximization in cloud-based service delivery infrastructures with multi-step cost structure support, comprising:
breaking multiple steps in the SLA of a job into corresponding cost steps;
generating a segmented cost function for each cost step;
creating a cost-based-scheduling (CBS)-priority value associated with a validity period for each segment based on the segmented cost function; and
choosing the job with the highest CBS priority value.
2. The method of claim 1 , comprising dividing each priority value by e^((t−t0)/a), where t is the current time and t0 is a fixed time instant.
3. The method of claim 1 , comprising inserting segmented-CBS-priority objects into a priority queue, where the objects are ordered by CBS priority, wherein the priority queue maintains a sorted list of jobs and performs job scheduling as a logarithmic-time operation by pulling the next job from a head of the priority queue.
4. The method of claim 1 , wherein among all segments corresponding to the same job, segment i has a higher CBS priority than segment j, where i<j.
5. The method of claim 1 , comprising chaining all segments of the same job together.
6. The method of claim 5 , wherein the chaining comprises adding a next segment pointer from segmented-CBS-priority object i to i+1.
7. The method of claim 1 , comprising pulling a head of a segmented-CBS-priority queue, wherein the head has the highest priority value.
8. The method of claim 1 , comprising checking if a validity period for the job has expired or the job has been scheduled by an earlier segment.
9. The method of claim 1 , comprising discarding the segment and pulling the next head off the priority queue, until a segmented-CBS-priority object is found whose validity period has not expired and whose job has not yet been scheduled.
10. The method of claim 1 , comprising marking other segments for the same job in the priority queue as scheduled.
11. The method of claim 1 , comprising monitoring a system status including system load, queue lengths, query response time, processor and input/output utilization.
12. The method of claim 1 , comprising dynamically adding or removing servers based on the system status to optimize operational cost while keeping the SLA costs in check.
13. The method of claim 1 , comprising performing capacity planning based on the system status.
14. The method of claim 13 , wherein the capacity planning is dynamic.
15. The method of claim 1 , comprising applying an Online Simulator for dynamic capacity planning through simulation.
16. The method of claim 15 , wherein the Online Simulator comprises offline and online modes.
17. The method of claim 1 , comprising deciding an order of execution of jobs at each server.
18. The method of claim 1 , comprising dispatching jobs to one or more servers based on a dispatching policy.
19. The method of claim 18 , wherein the dispatching policy is tuned according to dynamic changes including user traffic and the addition or removal of processing nodes.
20. The method of claim 19 , wherein after client requests are dispatched to individual servers based on the dispatching policy, prioritizing dispatched jobs locally by forming a queue of queries from which a query is chosen and executed in a database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/818,528 US20110173626A1 (en) | 2010-01-12 | 2010-06-18 | Efficient maintenance of job prioritization for profit maximization in cloud service delivery infrastructures |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US29425410P | 2010-01-12 | 2010-01-12 | |
US29424610P | 2010-01-12 | 2010-01-12 | |
US12/818,528 US20110173626A1 (en) | 2010-01-12 | 2010-06-18 | Efficient maintenance of job prioritization for profit maximization in cloud service delivery infrastructures |
Publications (1)
Publication Number | Publication Date |
---|---|
US20110173626A1 true US20110173626A1 (en) | 2011-07-14 |
Family
ID=44259232
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/818,514 Expired - Fee Related US8762997B2 (en) | 2010-01-12 | 2010-06-18 | Constraint-conscious optimal scheduling for cloud infrastructures |
US12/818,528 Abandoned US20110173626A1 (en) | 2010-01-12 | 2010-06-18 | Efficient maintenance of job prioritization for profit maximization in cloud service delivery infrastructures |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/818,514 Expired - Fee Related US8762997B2 (en) | 2010-01-12 | 2010-06-18 | Constraint-conscious optimal scheduling for cloud infrastructures |
Country Status (1)
Country | Link |
---|---|
US (2) | US8762997B2 (en) |
Cited By (52)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110295635A1 (en) * | 2010-06-01 | 2011-12-01 | International Business Machine Corporation | Systems and methods for scheduling power sources and jobs in an integrated power system |
US20120042319A1 (en) * | 2010-08-10 | 2012-02-16 | International Business Machines Corporation | Scheduling Parallel Data Tasks |
WO2012023050A2 (en) | 2010-08-20 | 2012-02-23 | Overtis Group Limited | Secure cloud computing system and method |
US20120096470A1 (en) * | 2010-10-19 | 2012-04-19 | International Business Machines Corporation | Prioritizing jobs within a cloud computing environment |
US20120144040A1 (en) * | 2010-12-07 | 2012-06-07 | Nec Laboratories America, Inc. | Negotiation tool and method for cloud infrastructure data sharing |
US20120167108A1 (en) * | 2010-12-22 | 2012-06-28 | Microsoft Corporation | Model for Hosting and Invoking Applications on Virtual Machines in a Distributed Computing Environment |
US20120167102A1 (en) * | 2010-12-22 | 2012-06-28 | Institute For Information Industry | Tag-based data processing apparatus and data processing method thereof |
US20120180055A1 (en) * | 2011-01-10 | 2012-07-12 | International Business Machines Corporation | Optimizing energy use in a data center by workload scheduling and management |
US20130174168A1 (en) * | 2012-01-04 | 2013-07-04 | International Business Machines Corporation | Policy-based scaling of computing resources in a networked computing environment |
WO2014031115A1 (en) * | 2012-08-22 | 2014-02-27 | Empire Technology Development Llc | Cloud process management |
US8676622B1 (en) | 2012-05-01 | 2014-03-18 | Amazon Technologies, Inc. | Job resource planner for cloud computing environments |
US20140108630A1 (en) * | 2012-10-11 | 2014-04-17 | American Express Travel Related Services Company, Inc. | Method and system for managing processing resources |
US8707254B2 (en) | 2012-04-06 | 2014-04-22 | Microsoft Corporation | Service level objective for cloud hosted applications |
US20140173591A1 (en) * | 2012-12-13 | 2014-06-19 | Cisco Technology, Inc. | Differentiated service levels in virtualized computing |
US20140188532A1 (en) * | 2012-11-13 | 2014-07-03 | Nec Laboratories America, Inc. | Multitenant Database Placement with a Cost Based Query Scheduler |
US8775282B1 (en) | 2012-05-18 | 2014-07-08 | Amazon Technologies, Inc. | Capacity management of draining-state platforms providing network-accessible resources |
US8793381B2 (en) | 2012-06-26 | 2014-07-29 | International Business Machines Corporation | Workload adaptive cloud computing resource allocation |
US20140214496A1 (en) * | 2013-01-31 | 2014-07-31 | Hewlett-Packard Development Company, L.P. | Dynamic profitability management for cloud service providers |
US9032077B1 (en) | 2012-06-28 | 2015-05-12 | Amazon Technologies, Inc. | Client-allocatable bandwidth pools |
US20150150016A1 (en) * | 2013-11-25 | 2015-05-28 | Xerox Corporation | Method and apparatus for a user-driven priority based job scheduling in a data processing platform |
US9154589B1 (en) | 2012-06-28 | 2015-10-06 | Amazon Technologies, Inc. | Bandwidth-optimized cloud resource placement service |
CN104965755A (en) * | 2015-05-04 | 2015-10-07 | 东南大学 | Cloud service workflow scheduling method based on budget constraint |
CN105068863A (en) * | 2015-07-16 | 2015-11-18 | 福州大学 | Cost-driven scheduling method for workflow with deadline constraints in cloudy environment |
US9240025B1 (en) | 2012-03-27 | 2016-01-19 | Amazon Technologies, Inc. | Dynamic pricing of network-accessible resources for stateful applications |
US9246986B1 (en) | 2012-05-21 | 2016-01-26 | Amazon Technologies, Inc. | Instance selection ordering policies for network-accessible resources |
US9294236B1 (en) | 2012-03-27 | 2016-03-22 | Amazon Technologies, Inc. | Automated cloud resource trading system |
US9300547B2 (en) | 2013-11-19 | 2016-03-29 | International Business Machines Corporation | Modification of cloud application service levels based upon document consumption |
US9306870B1 (en) | 2012-06-28 | 2016-04-05 | Amazon Technologies, Inc. | Emulating circuit switching in cloud networking environments |
US9311159B2 (en) | 2011-10-31 | 2016-04-12 | At&T Intellectual Property I, L.P. | Systems, methods, and articles of manufacture to provide cloud resource orchestration |
US9417923B2 (en) | 2013-12-17 | 2016-08-16 | International Business Machines Corporation | Optimization of workload placement |
US20160277255A1 (en) * | 2015-03-20 | 2016-09-22 | International Business Machines Corporation | Optimizing allocation of multi-tasking servers |
US9479382B1 (en) | 2012-03-27 | 2016-10-25 | Amazon Technologies, Inc. | Execution plan generation and scheduling for network-accessible resources |
US9779374B2 (en) * | 2013-09-25 | 2017-10-03 | Sap Se | System and method for task assignment in workflows |
US9898337B2 (en) | 2015-03-27 | 2018-02-20 | International Business Machines Corporation | Dynamic workload deployment for data integration services |
US9985848B1 (en) | 2012-03-27 | 2018-05-29 | Amazon Technologies, Inc. | Notification based pricing of excess cloud capacity |
US10067798B2 (en) | 2015-10-27 | 2018-09-04 | International Business Machines Corporation | User interface and system supporting user decision making and readjustments in computer-executable job allocations in the cloud |
US10152449B1 (en) | 2012-05-18 | 2018-12-11 | Amazon Technologies, Inc. | User-defined capacity reservation pools for network-accessible resources |
US10223647B1 (en) | 2012-03-27 | 2019-03-05 | Amazon Technologies, Inc. | Dynamic modification of interruptibility settings for network-accessible resources |
US20190220309A1 (en) * | 2012-06-20 | 2019-07-18 | International Business Machines Corporation | Job distribution within a grid environment |
US10474502B2 (en) | 2013-01-14 | 2019-11-12 | Microsoft Technology Licensing, Llc | Multi-tenant license enforcement across job requests |
CN110795224A (en) * | 2019-10-30 | 2020-02-14 | 北京思特奇信息技术股份有限公司 | Automatic operation and maintenance system and method based on infrastructure |
US10620989B2 (en) | 2018-06-08 | 2020-04-14 | Capital One Services, Llc | Managing execution of data processing jobs in a virtual computing environment |
US10686677B1 (en) | 2012-05-18 | 2020-06-16 | Amazon Technologies, Inc. | Flexible capacity reservations for network-accessible resources |
US10832185B1 (en) * | 2018-01-10 | 2020-11-10 | Wells Fargo Bank, N.A. | Goal optimized process scheduler |
US10846788B1 (en) | 2012-06-28 | 2020-11-24 | Amazon Technologies, Inc. | Resource group traffic rate service |
US10929792B2 (en) | 2016-03-17 | 2021-02-23 | International Business Machines Corporation | Hybrid cloud operation planning and optimization |
US11093292B2 (en) | 2019-09-27 | 2021-08-17 | International Business Machines Corporation | Identifying recurring actions in a hybrid integration platform to control resource usage |
US11206579B1 (en) | 2012-03-26 | 2021-12-21 | Amazon Technologies, Inc. | Dynamic scheduling for network data transfers |
US11294726B2 (en) * | 2017-05-04 | 2022-04-05 | Salesforce.Com, Inc. | Systems, methods, and apparatuses for implementing a scalable scheduler with heterogeneous resource allocation of large competing workloads types using QoS |
US11442669B1 (en) | 2018-03-15 | 2022-09-13 | Pure Storage, Inc. | Orchestrating a virtual storage system |
US20240232751A9 (en) * | 2022-10-19 | 2024-07-11 | Dell Products L.P. | Information technology automation based on job return on investment |
US12066900B2 (en) | 2018-03-15 | 2024-08-20 | Pure Storage, Inc. | Managing disaster recovery to cloud computing environment |
Families Citing this family (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110313902A1 (en) * | 2010-06-18 | 2011-12-22 | International Business Machines Corporation | Budget Management in a Compute Cloud |
US8977677B2 (en) | 2010-12-01 | 2015-03-10 | Microsoft Technology Licensing, Llc | Throttling usage of resources |
US8447894B2 (en) | 2011-01-05 | 2013-05-21 | Alibaba Group Holding Limited | Upgrading an elastic computing cloud system |
US9329901B2 (en) | 2011-12-09 | 2016-05-03 | Microsoft Technology Licensing, Llc | Resource health based scheduling of workload tasks |
US9372735B2 (en) | 2012-01-09 | 2016-06-21 | Microsoft Technology Licensing, Llc | Auto-scaling of pool of virtual machines based on auto-scaling rules of user associated with the pool |
US8904008B2 (en) | 2012-01-09 | 2014-12-02 | Microsoft Corporation | Assignment of resources in virtual machine pools |
US9170849B2 (en) | 2012-01-09 | 2015-10-27 | Microsoft Technology Licensing, Llc | Migration of task to different pool of resources based on task retry count during task lease |
US9305274B2 (en) | 2012-01-16 | 2016-04-05 | Microsoft Technology Licensing, Llc | Traffic shaping based on request resource usage |
US20130198112A1 (en) * | 2012-01-30 | 2013-08-01 | Verizon Patent And Licensing Inc. | Capacity Management Methods and Systems |
CN102685230A (en) * | 2012-05-10 | 2012-09-19 | 苏州阔地网络科技有限公司 | Message parsing processing method and system for cloud conference |
CN102685229A (en) * | 2012-05-10 | 2012-09-19 | 苏州阔地网络科技有限公司 | Message parsing method and system of cloud conference |
CN102685227A (en) * | 2012-05-10 | 2012-09-19 | 苏州阔地网络科技有限公司 | Message scheduling method and message scheduling system for cloud conference |
US9535749B2 (en) | 2012-05-11 | 2017-01-03 | Infosys Limited | Methods for managing work load bursts and devices thereof |
US9122524B2 (en) | 2013-01-08 | 2015-09-01 | Microsoft Technology Licensing, Llc | Identifying and throttling tasks based on task interactivity |
CN103455375B (en) * | 2013-01-31 | 2017-02-08 | 南京理工大学连云港研究院 | Load-monitoring-based hybrid scheduling method under Hadoop cloud platform |
RU2014125148A (en) | 2014-06-20 | 2015-12-27 | Евгений Анатольевич Обжиров | ELECTROSTATIC ELECTRODES AND METHODS OF THEIR PRODUCTION |
US9424077B2 (en) | 2014-11-14 | 2016-08-23 | Successfactors, Inc. | Throttle control on cloud-based computing tasks utilizing enqueue and dequeue counters |
CN104793993B (en) * | 2015-04-24 | 2017-11-17 | 江南大学 | The cloud computing method for scheduling task of artificial bee colony particle cluster algorithm based on Levy flights |
CN106453457B (en) | 2015-08-10 | 2019-12-10 | 微软技术许可有限责任公司 | Multi-priority service instance allocation within a cloud computing platform |
US9733978B2 (en) * | 2015-08-27 | 2017-08-15 | Qualcomm Incorporated | Data management for multiple processing units using data transfer costs |
US9514037B1 (en) | 2015-12-16 | 2016-12-06 | International Business Machines Corporation | Test program scheduling based on analysis of test data sets |
US10742565B2 (en) * | 2016-01-18 | 2020-08-11 | Dell Products, L.P. | Enterprise messaging system using an optimized queueing model |
US10601725B2 (en) * | 2016-05-16 | 2020-03-24 | International Business Machines Corporation | SLA-based agile resource provisioning in disaggregated computing systems |
CN106789118B (en) * | 2016-11-28 | 2020-11-17 | 上海交通大学 | Cloud computing charging method based on service level agreement |
US10447806B1 (en) * | 2017-06-09 | 2019-10-15 | Nutanix, Inc. | Workload scheduling across heterogeneous resource environments |
US20190196969A1 (en) * | 2017-12-22 | 2019-06-27 | Samsung Electronics Co., Ltd. | Method and apparatus for adaptive cache load balancing for ssd-based cloud computing storage system |
US11310335B2 (en) * | 2018-05-11 | 2022-04-19 | Jpmorgan Chase Bank, N.A. | Function as a service gateway |
WO2020078539A1 (en) | 2018-10-16 | 2020-04-23 | Huawei Technologies Co., Ltd. | Time based priority queue |
US11513842B2 (en) | 2019-10-03 | 2022-11-29 | International Business Machines Corporation | Performance biased resource scheduling based on runtime performance |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070061180A1 (en) * | 2005-09-13 | 2007-03-15 | Joseph Offenberg | Centralized job scheduling maturity model |
US20080312990A1 (en) * | 2005-03-08 | 2008-12-18 | Roger Alan Byrne | Knowledge Management System For Asset Managers |
US20090021775A1 (en) * | 2007-07-18 | 2009-01-22 | Xerox Corporation | Workflow scheduling method and system |
US20100318609A1 (en) * | 2009-06-15 | 2010-12-16 | Microsoft Corporation | Bridging enterprise networks into cloud |
US20100333116A1 (en) * | 2009-06-30 | 2010-12-30 | Anand Prahlad | Cloud gateway system for managing data storage to cloud storage sites |
US20110016214A1 (en) * | 2009-07-15 | 2011-01-20 | Cluster Resources, Inc. | System and method of brokering cloud computing resources |
US20110131589A1 (en) * | 2009-12-02 | 2011-06-02 | International Business Machines Corporation | System and method for transforming legacy desktop environments to a virtualized desktop model |
US20110145392A1 (en) * | 2009-12-11 | 2011-06-16 | International Business Machines Corporation | Dynamic provisioning of resources within a cloud computing environment |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7480913B2 (en) * | 2003-09-09 | 2009-01-20 | International Business Machines Corporation | Method, apparatus, and program for scheduling resources in a penalty-based environment |
-
2010
- 2010-06-18 US US12/818,514 patent/US8762997B2/en not_active Expired - Fee Related
- 2010-06-18 US US12/818,528 patent/US20110173626A1/en not_active Abandoned
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080312990A1 (en) * | 2005-03-08 | 2008-12-18 | Roger Alan Byrne | Knowledge Management System For Asset Managers |
US20070061180A1 (en) * | 2005-09-13 | 2007-03-15 | Joseph Offenberg | Centralized job scheduling maturity model |
US20090021775A1 (en) * | 2007-07-18 | 2009-01-22 | Xerox Corporation | Workflow scheduling method and system |
US20100318609A1 (en) * | 2009-06-15 | 2010-12-16 | Microsoft Corporation | Bridging enterprise networks into cloud |
US20100333116A1 (en) * | 2009-06-30 | 2010-12-30 | Anand Prahlad | Cloud gateway system for managing data storage to cloud storage sites |
US20110016214A1 (en) * | 2009-07-15 | 2011-01-20 | Cluster Resources, Inc. | System and method of brokering cloud computing resources |
US20110131589A1 (en) * | 2009-12-02 | 2011-06-02 | International Business Machines Corporation | System and method for transforming legacy desktop environments to a virtualized desktop model |
US20110145392A1 (en) * | 2009-12-11 | 2011-06-16 | International Business Machines Corporation | Dynamic provisioning of resources within a cloud computing environment |
Non-Patent Citations (2)
Title |
---|
Jon M. Peha and Fouad A. Tobagi, A Cost-Based Scheduling Algorithm to Support Integrated Services, INFOCOM '91, Proceedings. Tenth Annual Joint Conference of the IEEE Computer and Communications Societies. Networking in the 90s, April 1991, http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=147579 * |
Peha, Jon M. and Fouad A. Tobagi, Cost-Based Scheduling and Dropping Algorithms to Support Integrated Services, Carnegie Mellon University, IEEE Transactions on Communications, Feb. 1996. * |
Cited By (83)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110295635A1 (en) * | 2010-06-01 | 2011-12-01 | International Business Machine Corporation | Systems and methods for scheduling power sources and jobs in an integrated power system |
US10984345B2 (en) * | 2010-06-01 | 2021-04-20 | International Business Machines Corporation | Management of power sources and jobs in an integrated power system |
US9274836B2 (en) | 2010-08-10 | 2016-03-01 | International Business Machines Corporation | Scheduling parallel data tasks |
US20120042319A1 (en) * | 2010-08-10 | 2012-02-16 | International Business Machines Corporation | Scheduling Parallel Data Tasks |
US8930954B2 (en) * | 2010-08-10 | 2015-01-06 | International Business Machines Corporation | Scheduling parallel data tasks |
WO2012023050A2 (en) | 2010-08-20 | 2012-02-23 | Overtis Group Limited | Secure cloud computing system and method |
US20120096470A1 (en) * | 2010-10-19 | 2012-04-19 | International Business Machines Corporation | Prioritizing jobs within a cloud computing environment |
US9218202B2 (en) | 2010-10-19 | 2015-12-22 | International Business Machines Corporation | Prioritizing jobs within a cloud computing environment |
US8429659B2 (en) * | 2010-10-19 | 2013-04-23 | International Business Machines Corporation | Scheduling jobs within a cloud computing environment |
US20120144040A1 (en) * | 2010-12-07 | 2012-06-07 | Nec Laboratories America, Inc. | Negotiation tool and method for cloud infrastructure data sharing |
US8612600B2 (en) * | 2010-12-07 | 2013-12-17 | Nec Laboratories America, Inc. | Negotiation tool and method for cloud infrastructure data sharing |
US20120167102A1 (en) * | 2010-12-22 | 2012-06-28 | Institute For Information Industry | Tag-based data processing apparatus and data processing method thereof |
US8695005B2 (en) * | 2010-12-22 | 2014-04-08 | Microsoft Corporation | Model for hosting and invoking applications on virtual machines in a distributed computing environment |
US20120167108A1 (en) * | 2010-12-22 | 2012-06-28 | Microsoft Corporation | Model for Hosting and Invoking Applications on Virtual Machines in a Distributed Computing Environment |
US9250962B2 (en) * | 2011-01-10 | 2016-02-02 | International Business Machines Corporation | Optimizing energy use in a data center by workload scheduling and management |
US9235441B2 (en) | 2011-01-10 | 2016-01-12 | International Business Machines Corporation | Optimizing energy use in a data center by workload scheduling and management |
US20120180055A1 (en) * | 2011-01-10 | 2012-07-12 | International Business Machines Corporation | Optimizing energy use in a data center by workload scheduling and management |
US9311159B2 (en) | 2011-10-31 | 2016-04-12 | At&T Intellectual Property I, L.P. | Systems, methods, and articles of manufacture to provide cloud resource orchestration |
US9940595B2 (en) | 2012-01-04 | 2018-04-10 | International Business Machines Corporation | Policy-based scaling of computing resources in a networked computing environment |
US20130174168A1 (en) * | 2012-01-04 | 2013-07-04 | International Business Machines Corporation | Policy-based scaling of computing resources in a networked computing environment |
US10304019B2 (en) | 2012-01-04 | 2019-05-28 | International Business Machines Corporation | Policy-based scaling of computing resources in a networked computing environment |
US10776730B2 (en) | 2012-01-04 | 2020-09-15 | International Business Machines Corporation | Policy-based scaling of computing resources in a networked computing environment |
US8966085B2 (en) * | 2012-01-04 | 2015-02-24 | International Business Machines Corporation | Policy-based scaling of computing resources in a networked computing environment |
US11206579B1 (en) | 2012-03-26 | 2021-12-21 | Amazon Technologies, Inc. | Dynamic scheduling for network data transfers |
US10223647B1 (en) | 2012-03-27 | 2019-03-05 | Amazon Technologies, Inc. | Dynamic modification of interruptibility settings for network-accessible resources |
US9240025B1 (en) | 2012-03-27 | 2016-01-19 | Amazon Technologies, Inc. | Dynamic pricing of network-accessible resources for stateful applications |
US9479382B1 (en) | 2012-03-27 | 2016-10-25 | Amazon Technologies, Inc. | Execution plan generation and scheduling for network-accessible resources |
US9294236B1 (en) | 2012-03-27 | 2016-03-22 | Amazon Technologies, Inc. | Automated cloud resource trading system |
US9985848B1 (en) | 2012-03-27 | 2018-05-29 | Amazon Technologies, Inc. | Notification based pricing of excess cloud capacity |
US9015662B2 (en) | 2012-04-06 | 2015-04-21 | Microsoft Technology Licensing, Llc | Service level objective for cloud hosted applications |
US8707254B2 (en) | 2012-04-06 | 2014-04-22 | Microsoft Corporation | Service level objective for cloud hosted applications |
US8676622B1 (en) | 2012-05-01 | 2014-03-18 | Amazon Technologies, Inc. | Job resource planner for cloud computing environments |
US10152449B1 (en) | 2012-05-18 | 2018-12-11 | Amazon Technologies, Inc. | User-defined capacity reservation pools for network-accessible resources |
US8775282B1 (en) | 2012-05-18 | 2014-07-08 | Amazon Technologies, Inc. | Capacity management of draining-state platforms providing network-accessible resources |
US10686677B1 (en) | 2012-05-18 | 2020-06-16 | Amazon Technologies, Inc. | Flexible capacity reservations for network-accessible resources |
US9246986B1 (en) | 2012-05-21 | 2016-01-26 | Amazon Technologies, Inc. | Instance selection ordering policies for network-accessible resources |
US11243805B2 (en) * | 2012-06-20 | 2022-02-08 | International Business Machines Corporation | Job distribution within a grid environment using clusters of execution hosts |
US11275609B2 (en) | 2012-06-20 | 2022-03-15 | International Business Machines Corporation | Job distribution within a grid environment |
US20190220309A1 (en) * | 2012-06-20 | 2019-07-18 | International Business Machines Corporation | Job distribution within a grid environment |
US8793381B2 (en) | 2012-06-26 | 2014-07-29 | International Business Machines Corporation | Workload adaptive cloud computing resource allocation |
US9497139B2 (en) | 2012-06-28 | 2016-11-15 | Amazon Technologies, Inc. | Client-allocatable bandwidth pools |
US9306870B1 (en) | 2012-06-28 | 2016-04-05 | Amazon Technologies, Inc. | Emulating circuit switching in cloud networking environments |
US10846788B1 (en) | 2012-06-28 | 2020-11-24 | Amazon Technologies, Inc. | Resource group traffic rate service |
US9154589B1 (en) | 2012-06-28 | 2015-10-06 | Amazon Technologies, Inc. | Bandwidth-optimized cloud resource placement service |
US9032077B1 (en) | 2012-06-28 | 2015-05-12 | Amazon Technologies, Inc. | Client-allocatable bandwidth pools |
US9979609B2 (en) | 2012-08-22 | 2018-05-22 | Empire Technology Development Llc | Cloud process management |
WO2014031115A1 (en) * | 2012-08-22 | 2014-02-27 | Empire Technology Development Llc | Cloud process management |
US20140108630A1 (en) * | 2012-10-11 | 2014-04-17 | American Express Travel Related Services Company, Inc. | Method and system for managing processing resources |
US20160055351A1 (en) * | 2012-10-11 | 2016-02-25 | American Express Travel Related Services Company, Inc. | Method and system for managing processing resources |
US9477847B2 (en) * | 2012-10-11 | 2016-10-25 | American Express Travel Related Services Company, Inc. | Method and system for managing processing resources |
US9207982B2 (en) * | 2012-10-11 | 2015-12-08 | American Express Travel Related Services Company, Inc. | Method and system for managing processing resources |
US20170011321A1 (en) * | 2012-10-11 | 2017-01-12 | American Express Travel Related Services Company, Inc. | Uplifting of computer resources |
US9898708B2 (en) * | 2012-10-11 | 2018-02-20 | American Express Travel Related Services Company, Inc. | Uplifting of computer resources |
US20140188532A1 (en) * | 2012-11-13 | 2014-07-03 | Nec Laboratories America, Inc. | Multitenant Database Placement with a Cost Based Query Scheduler |
US20140173591A1 (en) * | 2012-12-13 | 2014-06-19 | Cisco Technology, Inc. | Differentiated service levels in virtualized computing |
US10474502B2 (en) | 2013-01-14 | 2019-11-12 | Microsoft Technology Licensing, Llc | Multi-tenant license enforcement across job requests |
US20140214496A1 (en) * | 2013-01-31 | 2014-07-31 | Hewlett-Packard Development Company, L.P. | Dynamic profitability management for cloud service providers |
US9779374B2 (en) * | 2013-09-25 | 2017-10-03 | Sap Se | System and method for task assignment in workflows |
US9300547B2 (en) | 2013-11-19 | 2016-03-29 | International Business Machines Corporation | Modification of cloud application service levels based upon document consumption |
US9304817B2 (en) * | 2013-11-25 | 2016-04-05 | Xerox Corporation | Method and apparatus for a user-driven priority based job scheduling in a data processing platform |
US20150150016A1 (en) * | 2013-11-25 | 2015-05-28 | Xerox Corporation | Method and apparatus for a user-driven priority based job scheduling in a data processing platform |
US10102490B2 (en) | 2013-12-17 | 2018-10-16 | International Business Machines Corporation | Optimization of workload placement |
US9417923B2 (en) | 2013-12-17 | 2016-08-16 | International Business Machines Corporation | Optimization of workload placement |
US20160277255A1 (en) * | 2015-03-20 | 2016-09-22 | International Business Machines Corporation | Optimizing allocation of multi-tasking servers |
US10452450B2 (en) * | 2015-03-20 | 2019-10-22 | International Business Machines Corporation | Optimizing allocation of multi-tasking servers |
US10970122B2 (en) | 2015-03-20 | 2021-04-06 | International Business Machines Corporation | Optimizing allocation of multi-tasking servers |
US9898337B2 (en) | 2015-03-27 | 2018-02-20 | International Business Machines Corporation | Dynamic workload deployment for data integration services |
US10296384B2 (en) | 2015-03-27 | 2019-05-21 | International Business Machines Corporation | Dynamic workload deployment for data integration services |
CN104965755A (en) * | 2015-05-04 | 2015-10-07 | 东南大学 | Cloud service workflow scheduling method based on budget constraint |
CN105068863A (en) * | 2015-07-16 | 2015-11-18 | 福州大学 | Cost-driven scheduling method for workflow with deadline constraints in cloudy environment |
US10552223B2 (en) | 2015-10-27 | 2020-02-04 | International Business Machines Corporation | User interface and system supporting user decision making and readjustments in computer-executable job allocations in the cloud |
US11030011B2 (en) | 2015-10-27 | 2021-06-08 | International Business Machines Corporation | User interface and system supporting user decision making and readjustments in computer-executable job allocations in the cloud |
US10067798B2 (en) | 2015-10-27 | 2018-09-04 | International Business Machines Corporation | User interface and system supporting user decision making and readjustments in computer-executable job allocations in the cloud |
US10929792B2 (en) | 2016-03-17 | 2021-02-23 | International Business Machines Corporation | Hybrid cloud operation planning and optimization |
US11294726B2 (en) * | 2017-05-04 | 2022-04-05 | Salesforce.Com, Inc. | Systems, methods, and apparatuses for implementing a scalable scheduler with heterogeneous resource allocation of large competing workloads types using QoS |
US10832185B1 (en) * | 2018-01-10 | 2020-11-10 | Wells Fargo Bank, N.A. | Goal optimized process scheduler |
US11442669B1 (en) | 2018-03-15 | 2022-09-13 | Pure Storage, Inc. | Orchestrating a virtual storage system |
US12066900B2 (en) | 2018-03-15 | 2024-08-20 | Pure Storage, Inc. | Managing disaster recovery to cloud computing environment |
US10620989B2 (en) | 2018-06-08 | 2020-04-14 | Capital One Services, Llc | Managing execution of data processing jobs in a virtual computing environment |
US11620155B2 (en) | 2018-06-08 | 2023-04-04 | Capital One Services, Llc | Managing execution of data processing jobs in a virtual computing environment |
US11093292B2 (en) | 2019-09-27 | 2021-08-17 | International Business Machines Corporation | Identifying recurring actions in a hybrid integration platform to control resource usage |
CN110795224A (en) * | 2019-10-30 | 2020-02-14 | 北京思特奇信息技术股份有限公司 | Automatic operation and maintenance system and method based on infrastructure |
US20240232751A9 (en) * | 2022-10-19 | 2024-07-11 | Dell Products L.P. | Information technology automation based on job return on investment |
Also Published As
Publication number | Publication date |
---|---|
US20110173038A1 (en) | 2011-07-14 |
US8762997B2 (en) | 2014-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8762997B2 (en) | Constraint-conscious optimal scheduling for cloud infrastructures | |
US10678602B2 (en) | Apparatus, systems and methods for dynamic adaptive metrics based application deployment on distributed infrastructures | |
EP3770774B1 (en) | Control method for household appliance, and household appliance | |
US7870256B2 (en) | Remote desktop performance model for assigning resources | |
US8380557B2 (en) | Multi-tenant database management for service level agreement (SLA) profit maximization | |
Moon et al. | SLA-aware profit optimization in cloud services via resource scheduling | |
US9504061B2 (en) | Networked resource provisioning system | |
US8640132B2 (en) | Jobstream planner considering network contention and resource availability | |
US9218213B2 (en) | Dynamic placement of heterogeneous workloads | |
US8776076B2 (en) | Highly scalable cost based SLA-aware scheduling for cloud services | |
US8843929B1 (en) | Scheduling in computer clusters | |
US8949429B1 (en) | Client-managed hierarchical resource allocation | |
US9075832B2 (en) | Tenant placement in multitenant databases for profit maximization | |
US8898307B2 (en) | Scheduling methods using soft and hard service level considerations | |
Zhang et al. | Network service scheduling with resource sharing and preemption | |
Kurowski et al. | Multicriteria, multi-user scheduling in grids with advance reservation | |
Wang et al. | Performance analysis and optimization on scheduling stochastic cloud service requests: a survey | |
Fahad et al. | A multi-queue priority-based task scheduling algorithm in fog computing environment | |
Du et al. | Scheduling for cloud-based computing systems to support soft real-time applications | |
Zhao et al. | SLA-aware and deadline constrained profit optimization for cloud resource management in big data analytics-as-a-service platforms | |
Lin et al. | Two-tier project and job scheduling for SaaS cloud service providers | |
Gohad et al. | Model driven provisioning in multi-tenant clouds | |
Ogawa et al. | Cloud bursting approach based on predicting requests for business-critical web systems | |
Le Hai et al. | A working time deadline-based backfilling scheduling solution | |
Carvalho et al. | Multi-dimensional admission control and capacity planning for IaaS clouds with multiple service classes | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |