WO2011159430A1 - System and method for quality-of-service aware rejuvenation - Google Patents

System and method for quality-of-service aware rejuvenation

Info

Publication number
WO2011159430A1
WO2011159430A1 PCT/US2011/037510 US2011037510W WO2011159430A1 WO 2011159430 A1 WO2011159430 A1 WO 2011159430A1 US 2011037510 W US2011037510 W US 2011037510W WO 2011159430 A1 WO2011159430 A1 WO 2011159430A1
Authority
WO
WIPO (PCT)
Prior art keywords
qos metric
bucket
specific qos
metric
specific
Prior art date
Application number
PCT/US2011/037510
Other languages
French (fr)
Inventor
Alberto Avritzer
Original Assignee
Siemens Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US12/949,913 external-priority patent/US8423833B2/en
Application filed by Siemens Corporation filed Critical Siemens Corporation
Publication of WO2011159430A1 publication Critical patent/WO2011159430A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1438 Restarting or rejuvenating
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003 Managing SLA; Interaction between SLA and QoS
    • H04L41/5019 Ensuring fulfilment of SLA
    • H04L41/5025 Ensuring fulfilment of SLA by proactively reacting to service quality change, e.g. by reconfiguration after service quality degradation or upgrade
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823 Errors, e.g. transmission errors
    • H04L43/0829 Packet loss
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852 Delays
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/81 Threshold
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/88 Monitoring involving counting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0813 Configuration setting characterised by the conditions triggering a change of settings
    • H04L41/082 Configuration setting characterised by the conditions triggering a change of settings the condition being updates or upgrades of network functionality

Definitions

  • This disclosure is directed to methods for maintaining QoS requirements in software systems by monitoring the quality of high priority transactions.
  • Soft failures can be caused by the evolution of the state of one or more software data structures during (possibly) prolonged execution. This evolution is called software aging. Software aging has been observed in widely used software.
  • One approach for system capacity restoration for telecommunications systems takes advantage of the cyclical nature of telecommunications traffic. Telecommunications operating companies understand the traffic patterns in their networks well, and therefore can plan to restore their smoothly degrading systems to full capacity in the same way they plan their other maintenance activities.
  • Soft bugs typically occur as a result of problems with synchronization mechanisms, such as semaphores, kernel structures, such as file table allocations, database management systems, such as database lock deadlocks, and other resource allocation mechanisms that are essential to the proper operation of large multilayer distributed systems. Since some of these resources are designed with self-healing mechanisms, such as timeouts, some systems may recover from soft bugs after a period of time. For example, when the soft bug for a specific Java based e-commerce system was revealed, users were complaining of very slow response time for periods exceeding one hour, after which the problem would clear by itself. However, in other cases, host based worm disruption systems can throttle the rate of connections out of a host.
  • Another methodology for the quantitative analysis of software rejuvenation policies is based on the assumption that system degradation can be quantified by monitoring a metric that is correlated with system degradation.
  • a maximum degradation threshold level is defined and two rejuvenation policies based on the defined threshold are presented.
  • the first policy is risk based. It defines a confidence level on the metric, and performs rejuvenation, with a probability that is proportional to the confidence level.
  • the second policy is deterministic and performs rejuvenation as soon as the threshold level is reached.
  • the theory of renewal processes with rewards is used to estimate the expected system down time and to help estimate the proper rejuvenation intervals.
  • Another methodology for proactive software rejuvenation is based on the statistical estimation of resource exhaustion.
  • Wi-Fi systems based on the IEEE 802.11 standard
  • VoIP Voice-over Internet Protocol
  • CCTV closed-circuit television
  • Exemplary embodiments of the invention as described herein generally include methods and systems for software rejuvenation that track the quality-of-service (QoS) of individual transactions and improve the software's ability to meet a set of QoS requirements of high priority transactions by reducing the number of low priority transactions allowed in the system.
  • An algorithm according to an embodiment of the invention is applicable to environments that do not support transaction priorities but are required by applications to meet specific QoS requirements.
  • An algorithm according to an embodiment of the invention can accurately measure the QoS of high priority transactions, where the QoS can be represented by a set of multivariate functions, and uses the measurement results and approximate results for analytical modeling to derive the underlying environmental conditions. If it is found that the high priority transactions are deviating from the required QoS, a fast analytical modeling approximation can quickly establish a new threshold on the maximum number of low-priority transactions that are allowed to be executed in the infrastructure.
  • An algorithm according to an embodiment of the invention can ensure QoS of high-priority transactions by dynamically estimating the infrastructure environmental conditions and by restricting the workload allowed to be carried by low-priority transactions.
  • An analytical performance model and software rejuvenation can quickly detect QoS degradation of high-priority transactions and enforce QoS requirements of these high-priority transactions.
  • the use of multiple buckets to count the variability in the measured customer-affecting metric can distinguish between degradation caused by a transient in the arrival process and degradation caused by a significant deterioration of the infrastructure environment.
  • An algorithm according to an embodiment of the invention can provide superior performance by tracking a system's ability to meet its QoS requirements and using measured QoS data to determine when to trigger a software rejuvenation routine.
  • An algorithm according to an embodiment of the invention can be generalized by performing a detailed analysis of performance usage to derive more precise performance signatures for different modes of operation, e.g. busy hour vs. weekend, different load conditions, e.g. high and low loads, and different user profiles.
  • An algorithm according to an embodiment of the invention could be applied to a network of hosts that support mission critical systems and depend on the successful completion of transactions with hard real-time requirements.
  • a computer-implemented method for monitoring the quality-of-service (QoS) of high priority transactions in a software system including receiving a QoS metric of a high priority transaction that is sampled by a software system monitoring infrastructure, where the QoS metric is a specific metric of a set of QoS metrics and is associated with a plurality of buckets, comparing the sampled specific QoS metric to an expected value for the specific QoS metric, where a bucket for the specific QoS metric is incremented if the sampled specific QoS metric exceeds the corresponding expected value, and the bucket for the specific QoS metric is decremented if the sampled specific QoS metric is less than the corresponding expected value, reinitializing the current bucket to zero, computing a depth of a next bucket for the specific QoS metric, and, increasing a number of standard deviations from a mean value for the specific QoS metric, if the bucket for the
  • the method includes initializing the current bucket and the bucket index to zero, and initializing a maximum value of the current bucket to a predetermined maximum.
  • the depth of the next bucket for the sampled specific QoS metric is computed as D[N[i] + 1, i], a function of D_MAX and of the amount by which T[i] exceeds its expected value
  • i is an index for the specific QoS metric
  • N[i] is an index for the current bucket
  • D_MAX is an overall maximum value for all buckets
  • T[i] is the sampled QoS metric
  • x̄[i] is a mean of the specific QoS metric
  • σ[i] is a standard deviation of the mean QoS.
  • the method includes reinitializing the current bucket to zero, if the bucket for the specific QoS metric is emptied.
  • the method includes reinitializing the current bucket to a predetermined maximum and decreasing a number of standard deviations from a mean value for the sampled specific QoS metric, when a value of the current bucket index is greater than zero.
  • the expected value for the specific QoS metric is x̄[i] + N[i] × σ[i], where N[i] is an index for the current bucket, x̄[i] is a mean for the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
  • the software rejuvenation routine measures the set of QoS metrics T, computes a channel utilization p[i] for each metric as a function of each respective QoS metric i, and determines a value p'[i] ≤ p[i] for which T[i] is an inverse function of p[i] that is less than a required value of the specific QoS metric, where each function for each specific QoS metric is determined through a performance analysis of the software infrastructure.
  • a computer-implemented method for monitoring the quality-of-service (QoS) of high priority transactions in a software system including receiving a QoS metric of a high priority transaction that is sampled by a software system monitoring infrastructure, where the QoS metric is a specific metric of a set of QoS metrics and is associated with a plurality of buckets, comparing the sampled specific QoS metric to an expected value for the specific QoS metric, where a bucket for the specific QoS metric is incremented if the sampled specific QoS metric exceeds the corresponding expected value, and the bucket for the specific QoS metric is decremented if the sampled specific QoS metric is less than the corresponding expected value, reinitializing the current bucket to zero, if the bucket for the specific QoS metric is emptied, and executing a software rejuvenation routine when the bucket for the specific QoS metric exceeds a threshold.
  • QoS quality-of-service
  • the method includes reinitializing the current bucket to zero, computing a depth of a next bucket for the specific QoS metric based on the amount by which the sampled specific QoS metric exceeds the corresponding expected value, and increasing a number of standard deviations from a mean value for the specific QoS metric.
  • the method includes reinitializing the current bucket for the specific QoS metric to a predetermined maximum, and decreasing a number of standard deviations from a mean value for the specific QoS metric.
  • the method includes initializing the current bucket and the bucket index to zero, and initializing a maximum value of the current bucket to a predetermined maximum.
  • the depth of the next bucket for the specific QoS metric is computed as D[N[i] + 1, i], a function of D_MAX and of the amount by which T[i] exceeds its expected value, where i is an index for the specific QoS metric
  • N[i] is an index for the current bucket
  • D_MAX is an overall maximum value for all buckets
  • T[i] is the sampled QoS metric
  • x̄[i] is a mean of the specific QoS metric
  • σ[i] is a standard deviation of the mean QoS.
  • the expected value for the specific QoS metric is x̄[i] + N[i] × σ[i], where N[i] is an index for the current bucket for the specific QoS metric, x̄[i] is a mean for the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
  • the QoS metrics are multivariate functions, and the metrics include response time, packet loss, and jitter.
  • a program storage device readable by a computer, tangibly embodying a program of instructions executable by the computer to perform the method steps for monitoring the quality-of-service (QoS) of high priority transactions in a software system.
  • FIG. 1 is a flowchart of a method for maintaining quality-of-service (QoS) requirements in software systems by monitoring the quality of high priority transactions, according to an embodiment of the invention.
  • FIG. 2 depicts an exemplary set of buckets, according to an embodiment of the invention.
  • FIG. 3 is a block diagram of an exemplary computer system for implementing a method for maintaining QoS requirements in software systems by monitoring the quality of high priority transactions, according to an embodiment of the invention.
  • Exemplary embodiments of the invention as described herein generally include systems and methods for maintaining quality-of-service (QoS) requirements in software systems by monitoring the quality of high priority transactions. Accordingly, while the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that there is no intent to limit the invention to the particular forms disclosed, but on the contrary, the invention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.
  • An algorithm according to an embodiment of the invention can maximize the probability of a system satisfying QoS requirements of high-priority transactions by monitoring the QoS of high-priority transactions.
  • An algorithm according to an embodiment of the invention uses multiple buckets with varying bucket depths to ensure software rejuvenation is performed at correct times.
  • software rejuvenation is activated when the QoS metrics of the high-priority transactions have been so degraded that the best course of action is activation of a software rejuvenation routine.
  • An algorithm according to an embodiment of the invention can distinguish between QoS degradation due to a burst of arrivals and performance degradation due to increased service time as a result of system capacity degradation.
  • System capacity degradation may occur due to hardware failures, software bugs or degradation of environmental conditions, such as storm interference in a Wi-Fi environment. If a system is operating at full capacity and a short burst of arrivals is presented, there would be no benefit in executing a preventive maintenance routine. However, if system capacity has been degraded to such an extent that users are effectively locked out of the system, preventive maintenance may be warranted.
  • An algorithm according to an embodiment of the invention is based on the premise that the customer affecting metric of performance can be sampled frequently and that the first and second moments of the metric can be estimated when the system is operating at full capacity before the monitoring tool is deployed in production.
  • a multivariate QoS aware software rejuvenation algorithm tracks the end-to-end customer affecting performance metric of the high-priority transactions. If it is discovered that the system infrastructure cannot satisfy the set of QoS requirements of the high-priority transactions, an algorithm according to an embodiment of the invention can solve an optimization task to calculate the maximum allowed number of low-priority transactions that should be allowed to run in the system to maximize the likelihood that the high-priority transactions will meet their QoS requirements.
  • An algorithm tracks an estimate of the quality-of-service set for high-priority transactions, T, in terms of, e.g., response time, packet loss, and jitter (where a particular metric may be indexed by i), by maintaining a history of up to K × D_MAX[i] recent quality of service measurements. K is defined so that when it is reached the system response time has degraded to a level at which high-priority transactions can no longer satisfy their QoS objectives. Therefore, the system must be immediately rejuvenated.
  • the notation T[i] is used herein to denote the point estimate of the specific quality of service metric i.
  • An algorithm divides the history of recent measurements of quality of service metric i into K buckets of depth D_MAX[i].
  • N[i] is a pointer or index to the current bucket for quality of service metric i.
  • the system QoS requirements are used to derive d[i], the number of recent response times ("balls") stored in the current bucket for the quality of service metric i, x̄[i], the average response time objective for the quality of service metric i, and σ[i], the objective standard deviation for the quality of service metric i.
  • K represents the number of standard deviations from the mean that would be tolerated before software rejuvenation is activated.
  • the level d[N[i],i] of the current (N-th) bucket is considered.
  • an algorithm according to an embodiment of the invention dynamically computes the depth of the next bucket, and changes the estimation of the expected quality of service measure by adding one standard deviation to the expected value of the metric. This is equivalent to moving to the next bucket. If a bucket underflows, an algorithm according to an embodiment of the invention subtracts one standard deviation from its estimation of the expected delay. This is equivalent to moving down to the previous bucket.
  • FIG. 1 is a flowchart of a method for estimating a current value of a monitored performance signature, according to an embodiment of the invention. The method illustrated by the flowchart is performed for each sampled transaction and for each quality of service metric i.
  • N[i] is the current bucket for the quality of service metric i
  • K is the number of buckets
  • d[N[i],i] is the current depth of the N[i]-th bucket
  • D[N[i],i] is the maximum depth of the N[i]-th bucket
  • DMAX is the maximum bucket depth.
  • FIG. 2 depicts an exemplary set of buckets, according to an embodiment of the invention.
  • N represents a bucket index 201 and d represents the number of balls stored in a current bucket 202.
  • N represents 4, and there are 8 balls in bucket 4.
  • the K buckets 203 are modeled, tracking the number of balls in each bucket.
  • a method begins at step 101 by comparing N[i], the current bucket index for the current quality of service metric i, to K, the total number of buckets. If N[i] is equal to K, then a rejuvenation is triggered, and the method exits.
  • the measured QoS metric for the specific metric i, T[i], is compared at step 105 to the expected value x̄[i] + N[i] × σ[i]. If T[i] is greater than the expected value for the current quality of service metric i, the current bucket d[N[i],i] is incremented at step 106, otherwise it is decremented at step 109. After step 106, the bucket d[N[i],i] is compared at step 107 with D[N[i],i], the maximum depth of the N[i]-th bucket.
  • If bucket d[N[i],i] is less than 0, i.e., if bucket d[N[i],i] underflows, then, at step 111, the bucket d[N[i],i] is reset to 0. If, at step 112, N[i] is greater than 0, then at step 113, the current bucket d[N[i],i] is set to D_MAX, the maximum depth of the bucket, and the bucket index N[i] is decremented.
  • An algorithm according to an embodiment of the invention can track a set of QoS metrics of interest and determine the ability of the system to meet its QoS requirements.
  • an algorithm according to an embodiment of the invention can react quickly to significant performance degradation. Resilience to degradation in the customer affecting metric is adjusted by tuning the value of K.
  • the software rejuvenation routine includes computing the actual channel utilization.
  • utilization is the fraction of time the channel being sampled is busy.
  • the functions f[i] and f⁻¹[i] can be obtained through a detailed performance analysis of the infrastructure, and represent a mathematical model of the system.
  • “bucket” and “ball”. These terms are analogous to any method for counting the occurrence of an event.
  • an element of an array as a bucket, wherein the array has K elements (e.g., buckets) and each element stores a number representing a number of times an event, such as a transaction, has occurred (e.g., balls).
  • QoS metrics are possible.
  • embodiments of the present invention can be implemented in various forms of hardware, software, firmware, special purpose processes, or a combination thereof.
  • the present invention can be implemented in software as an application program tangibly embodied on a computer readable program storage device.
  • the application program can be uploaded to, and executed by, a machine comprising any suitable architecture.
  • FIG. 3 is a block diagram of an exemplary computer system for implementing a method for detecting security intrusions and soft faults in software systems using performance signatures, according to an embodiment of the invention.
  • a computer system 301 for implementing an embodiment of the present invention can comprise, inter alia, a central processing unit (CPU) 302, a memory 303 and an input/output (I/O) interface 304.
  • the computer system 301 is generally coupled through the I/O interface 304 to a display 305 and various input devices 306 such as a mouse and a keyboard.
  • the support circuits can include circuits such as cache, power supplies, clock circuits, and a communication bus.
  • the memory 303 can include random access memory (RAM), read only memory (ROM), disk drive, tape drive, etc., or a combination thereof.
  • RAM random access memory
  • ROM read only memory
  • the present invention can be implemented as a routine 307 that is stored in memory 303 and executed by the CPU 302 to process the signal from the signal source 308.
  • the computer system 301 is a general purpose computer system that becomes a specific purpose computer system when executing the routine 307 of the present invention.
  • the computer system 301 also includes an operating system and micro instruction code.
  • the various processes and functions described herein can either be part of the micro instruction code or part of the application program (or combination thereof) which is executed via the operating system.
  • various other peripheral devices can be connected to the computer platform such as an additional data storage device and a printing device.
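As a small worked illustration of the expected value x̄[i] + N[i] × σ[i] defined above, the following Python sketch lists the expected value that applies at each bucket index. The function name and the numbers (a 100 ms mean, a 20 ms standard deviation, K = 5, and a maximum depth of 5) are hypothetical and chosen only for illustration; they do not come from the disclosure.

```python
def expected_values(mean, std, k):
    """Return the expected value mean + N * std for each bucket index N = 0..k-1."""
    return [mean + n * std for n in range(k)]


# Hypothetical response-time objective: mean 100 ms, standard deviation 20 ms, K = 5.
ladder = expected_values(100.0, 20.0, 5)
print(ladder)  # [100.0, 120.0, 140.0, 160.0, 180.0]

# Upper bound on the measurement history, K x D_MAX, for an assumed D_MAX of 5.
K, D_MAX = 5, 5
print(K * D_MAX)  # 25
```

Each step up the ladder corresponds to tolerating one more standard deviation above the mean before a sample counts as an exceedance.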

Abstract

A method for monitoring the quality-of-service (QoS) of high priority transactions in a software system includes receiving a specific QoS metric of a high priority transaction, where the QoS metric is associated with a plurality of buckets, and comparing (105) the sampled specific QoS metric to an expected value for the specific QoS metric. If the sampled specific QoS metric exceeds the corresponding expected value, a bucket for the specific QoS metric is incremented (106), otherwise the bucket is decremented (109). If the bucket for the specific QoS metric overflows (107), the current bucket is reinitialized (108) to zero, a depth of a next bucket for the specific QoS metric is computed (108), and a number of standard deviations from a mean value for the specific QoS metric is incremented (108). When the bucket for the specific QoS metric exceeds (101) a threshold, a software rejuvenation routine is executed (102).
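The steps summarized in the abstract can be sketched in Python for a single QoS metric as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class and method names are hypothetical, the rule for computing the depth of the next bucket is a placeholder that always returns D_MAX (the disclosure computes it dynamically), overflow is taken to mean the ball count exceeding the bucket depth, and the sketch reports the trigger as soon as the bucket index reaches K instead of waiting for the next sample.

```python
class BucketMonitor:
    """Multi-bucket monitor for one QoS metric (illustrative names only)."""

    def __init__(self, mean, std, k, d_max):
        self.mean = mean      # mean of the metric at full capacity
        self.std = std        # standard deviation of the metric
        self.k = k            # K, number of buckets
        self.d_max = d_max    # D_MAX, maximum bucket depth
        self.n = 0            # N, index of the current bucket
        self.d = 0            # d, balls in the current bucket
        self.depth = d_max    # D, depth of the current bucket

    def next_depth(self, sample):
        # ASSUMPTION: placeholder for the dynamically computed depth;
        # every bucket simply gets the maximum depth here.
        return self.d_max

    def observe(self, sample):
        """Process one sample; return True when rejuvenation should run."""
        if self.n >= self.k:                      # step 101
            return True                           # step 102
        expected = self.mean + self.n * self.std
        if sample > expected:                     # step 105
            self.d += 1                           # step 106
            if self.d > self.depth:               # step 107: overflow
                self.d = 0                        # step 108
                self.depth = self.next_depth(sample)
                self.n += 1                       # one more standard deviation
        else:
            self.d -= 1                           # step 109
            if self.d < 0:                        # underflow
                self.d = 0                        # step 111
                if self.n > 0:                    # step 112
                    self.d = self.d_max           # step 113
                    self.depth = self.d_max
                    self.n -= 1
        return self.n >= self.k


# Sustained degradation: every sample far above the mean.
monitor = BucketMonitor(mean=100.0, std=20.0, k=2, d_max=2)
print([monitor.observe(500.0) for _ in range(6)])
# [False, False, False, False, False, True]

# Alternating high and low samples never fill a bucket.
bursty = BucketMonitor(mean=100.0, std=20.0, k=2, d_max=2)
print(any(bursty.observe(s) for s in [500.0, 50.0] * 20))  # False
```

With these hypothetical settings, K × (D_MAX + 1) consecutive exceedances are needed to trigger rejuvenation, while samples that alternate above and below the expected value leave the bucket index at zero.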

Description

SYSTEM AND METHOD FOR QUALITY-OF-SERVICE AWARE REJUVENATION
Cross Reference to Related United States Applications
This application claims priority from "QoS Aware Dynamic Software Rejuvenation Algorithms", U.S. Provisional Application No. 61/356,162 of Alberto Avritzer, filed June 18, 2010, the contents of which are herein incorporated by reference in their entirety, and is a continuation-in-part (CIP) of U.S. Application Serial No. 11/225,989 of Avritzer, et al., filed on September 14, 2005, which in turn claims priority from U.S. Provisional Application Serial No. 60/628,285 of Avritzer, et al., filed on November 16, 2004, the contents of both of which are herein incorporated by reference in their entireties.
Technical Field
This disclosure is directed to methods for maintaining QoS requirements in software systems by monitoring the quality of high priority transactions.
Discussion of the Related Art
Large industrial software systems require extensive monitoring and management to deliver expected performance and reliability. Some specific types of software failures, called soft failures, have been shown to leave the system in a degraded mode, where the system is still operational, but the available system capacity has been greatly reduced. Examples of soft bugs have been documented in several software studies. Soft failures can be caused by the evolution of the state of one or more software data structures during (possibly) prolonged execution. This evolution is called software aging. Software aging has been observed in widely used software. One approach for system capacity restoration for telecommunications systems takes advantage of the cyclical nature of telecommunications traffic. Telecommunications operating companies understand the traffic patterns in their networks well, and therefore can plan to restore their smoothly degrading systems to full capacity in the same way they plan their other maintenance activities. Soft bugs typically occur as a result of problems with synchronization mechanisms, such as semaphores, kernel structures, such as file table allocations, database management systems, such as database lock deadlocks, and other resource allocation mechanisms that are essential to the proper operation of large multilayer distributed systems. Since some of these resources are designed with self-healing mechanisms, such as timeouts, some systems may recover from soft bugs after a period of time. For example, when the soft bug for a specific Java based e-commerce system was revealed, users were complaining of very slow response time for periods exceeding one hour, after which the problem would clear by itself. However, in other cases, host based worm disruption systems can throttle the rate of connections out of a host.
One theoretical study, which used Markov decision models to determine the optimal time to perform software rejuvenation for aging software with soft failures, found optimal software rejuvenation times that would minimize the required cost function. The authors developed a Markov decision process model that allows for two queuing policies. In the first policy, software rejuvenation is invoked whenever a buffer overflow is detected. In the second policy, packet loss is allowed without triggering software rejuvenation. A related study uses Markov regenerative stochastic Petri nets to derive a quantitative analysis of software rejuvenation. The model solution supports the selection of the optimal rejuvenation interval to minimize the expected system downtime. Another study evaluated the use of both checkpointing and rejuvenation to minimize software completion times. Another methodology for the quantitative analysis of software rejuvenation policies is based on the assumption that system degradation can be quantified by monitoring a metric that is correlated with system degradation. A maximum degradation threshold level is defined and two rejuvenation policies based on the defined threshold are presented. The first policy is risk based. It defines a confidence level on the metric, and performs rejuvenation with a probability that is proportional to the confidence level. The second policy is deterministic and performs rejuvenation as soon as the threshold level is reached. The theory of renewal processes with rewards is used to estimate the expected system downtime and to help estimate the proper rejuvenation intervals. Another methodology for proactive software rejuvenation is based on the statistical estimation of resource exhaustion.
When a large infrastructure that supports a high-value business is overwhelmed due to excessive use or to degradation of the number of available resources, software rejuvenation must be quickly triggered to restore the capacity of the large infrastructure. If the degradation is a consequence of degraded environmental conditions, the allowed workload from low-priority transactions should be adjusted to allow the higher-priority transactions to satisfy their quality-of-service (QoS) requirements.
Large infrastructure based systems that do not support QoS requirements, such as Wi- Fi systems based on the IEEE 802.11 standard, are currently being deployed to support large mission critical systems, such as large transportation systems that support VoIP (Voice-over Internet Protocol) and CCTV (closed-circuit television). These systems cannot satisfy the QoS requirements of high-priority transactions when faced with degraded environmental conditions resulting from, for example, media interference, hidden terminals, shadows, etc.
Summary of the Invention
Exemplary embodiments of the invention as described herein generally include methods and systems for software rejuvenation that track the quality-of-service (QoS) of individual transactions and improve the ability of the software to meet a set of QoS requirements of high-priority transactions by reducing the number of low-priority transactions allowed in the system. An algorithm according to an embodiment of the invention is applicable to environments that do not support transaction priorities but in which applications are required to meet specific QoS requirements. An algorithm according to an embodiment of the invention can accurately measure the QoS of high-priority transactions, where the QoS can be represented by a set of multivariate functions, and uses the measurement results and approximate results from analytical modeling to derive the underlying environmental conditions. If it is found that the high-priority transactions are deviating from the required QoS, a fast analytical modeling approximation can quickly establish a new threshold on the maximum number of low-priority transactions that are allowed to be executed in the infrastructure.
An algorithm according to an embodiment of the invention can ensure QoS of high-priority transactions by dynamically estimating the infrastructure environmental conditions and by restricting the workload allowed to be carried by low-priority transactions. An analytical performance model and software rejuvenation can quickly detect QoS degradation of high-priority transactions and enforce QoS requirements of these high-priority transactions. In addition, the use of multiple buckets to count the variability in the measured customer affecting metric can distinguish between degradation that is a function of a transient in the arrival process and degradation that is a function of a significant degradation in the infrastructure environment.
An algorithm according to an embodiment of the invention can provide superior performance by tracking a system's ability to meet its QoS requirements and using measured QoS data to determine when to trigger a software rejuvenation routine. An algorithm according to an embodiment of the invention can be generalized by performing a detailed analysis of performance usage to derive more precise performance signatures for different modes of operation, e.g. busy hour vs. weekend, different load conditions, e.g. high and low loads, and different user profiles.
An algorithm according to an embodiment of the invention could be applied to a network of hosts that support mission critical systems and depend on the successful completion of transactions with hard real-time requirements.
According to an aspect of the invention, there is provided a computer-implemented method for monitoring the quality-of-service (QoS) of high priority transactions in a software system, including receiving a QoS metric of a high priority transaction that is sampled by a software system monitoring infrastructure, where the QoS metric is a specific metric of a set of QoS metrics and is associated with a plurality of buckets, comparing the sampled specific QoS metric to an expected value for the specific QoS metric, where a bucket for the specific QoS metric is incremented if the sampled specific QoS metric exceeds the corresponding expected value, and the bucket for the specific QoS metric is decremented if the sampled specific QoS metric is less than the corresponding expected value, reinitializing the current bucket to zero, computing a depth of a next bucket for the specific QoS metric, and, increasing a number of standard deviations from a mean value for the specific QoS metric, if the bucket for the specific QoS metric overflows, and executing a software rejuvenation routine when the bucket for the specific QoS metric exceeds a threshold.
According to a further aspect of the invention, the method includes initializing the current bucket and the bucket index to zero, and initializing a maximum value of the current bucket to a predetermined maximum.
According to a further aspect of the invention, the depth of the next bucket for the sampled specific QoS metric is computed as $D[N[i]+1, i] = \dfrac{D_{MAX}}{T[i] - (\bar{x}[i] + N[i] \times \sigma[i])}$, where i is an index for the specific QoS metric, N[i] is an index for the current bucket, D[N[i], i] is a maximum value of the current bucket, DMAX is an overall maximum value for all buckets, T[i] is the sampled QoS metric, x̄[i] is a mean of the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
According to a further aspect of the invention, the method includes reinitializing the current bucket to zero, if the bucket for the specific QoS metric is emptied.
According to a further aspect of the invention, the method includes reinitializing the current bucket to a predetermined maximum and decreasing a number of standard deviations from a mean value for the sampled specific QoS metric, when a value of the current bucket index is greater than zero.
According to a further aspect of the invention, the expected value for the specific QoS metric is x̄[i] + N[i] × σ[i], where N[i] is an index for the current bucket, x̄[i] is a mean for the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
According to a further aspect of the invention, the software rejuvenation routine measures the set of QoS metrics T, computes a channel utilization p[i] for each metric as a function of each respective QoS metric i, determines a value p'[i] < p[i] for which T[i] is an inverse function of p[i] that is less than a required value of the specific QoS metric, where each function for each specific QoS metric is determined through a performance analysis of the software infrastructure.
According to another aspect of the invention, there is provided a computer-implemented method for monitoring the quality-of-service (QoS) of high priority transactions in a software system, including receiving a QoS metric of a high priority transaction that is sampled by a software system monitoring infrastructure, where the QoS metric is a specific metric of a set of QoS metrics and is associated with a plurality of buckets, comparing the sampled specific QoS metric to an expected value for the specific QoS metric, where a bucket for the specific QoS metric is incremented if the sampled specific QoS metric exceeds the corresponding expected value, and the bucket for the specific QoS metric is decremented if the sampled specific QoS metric is less than the corresponding expected value, reinitializing the current bucket to zero, if the bucket for the specific QoS metric is emptied, and executing a software rejuvenation routine when the bucket for the specific QoS metric exceeds a threshold.
According to a further aspect of the invention, if the current bucket for the specific QoS metric overflows, the method includes reinitializing the current bucket to zero, computing a depth of a next bucket for the specific QoS metric based on the amount by which the sampled specific QoS metric exceeds the corresponding expected value, and increasing a number of standard deviations from a mean value for the specific QoS metric.
According to a further aspect of the invention, if the current bucket for the specific QoS metric empties, and the bucket index is greater than zero, the method includes reinitializing the current bucket for the specific QoS metric to a predetermined maximum, and decreasing a number of standard deviations from a mean value for the specific QoS metric. According to a further aspect of the invention, the method includes initializing the current bucket and the bucket index to zero, and initializing a maximum value of the current bucket to a predetermined maximum.
According to a further aspect of the invention, the depth of the next bucket for the specific QoS metric is computed as $D[N[i]+1, i] = \dfrac{D_{MAX}}{T[i] - (\bar{x}[i] + N[i] \times \sigma[i])}$, where i is an index for the specific QoS metric, N[i] is an index for the current bucket, D[N[i], i] is a maximum value of the current bucket, DMAX is an overall maximum value for all buckets, T[i] is the sampled QoS metric, x̄[i] is a mean of the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
According to a further aspect of the invention, the expected value for the specific QoS metric is x̄[i] + N[i] × σ[i], where N[i] is an index for the current bucket for the specific QoS metric, x̄[i] is a mean for the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
According to a further aspect of the invention, the QoS metrics are multivariate functions, and the metrics include response time, packet loss, and jitter.
According to another aspect of the invention, there is provided a program storage device readable by a computer, tangibly embodying a program of instructions executable by the computer to perform the method steps for monitoring the quality-of-service (QoS) of high priority transactions in a software system.
Brief Description of the Drawings
FIG. 1 is a flowchart of a method for maintaining quality-of-service (QoS) requirements in software systems by monitoring the quality of high priority transactions, according to an embodiment of the invention.
FIG. 2 depicts an exemplary set of buckets, according to an embodiment of the invention.
FIG. 3 is a block diagram of an exemplary computer system for implementing a method for maintaining QoS requirements in software systems by monitoring the quality of high priority transactions, according to an embodiment of the invention.
Detailed Description of Exemplary Embodiments
Exemplary embodiments of the invention as described herein generally include systems and methods for maintaining quality-of-service (QoS) requirements in software systems by monitoring the quality of high priority transactions. Accordingly, while the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that there is no intent to limit the invention to the particular forms disclosed, but on the contrary, the invention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.
An algorithm according to an embodiment of the invention can maximize the probability of a system satisfying QoS requirements of high-priority transactions by monitoring the QoS of high-priority transactions. An algorithm according to an embodiment of the invention uses multiple buckets with varying bucket depths to ensure software rejuvenation is performed at correct times. In a QoS aware software rejuvenation algorithm, software rejuvenation is activated when the QoS metrics of the high-priority transactions have been so degraded that the best course of action is activation of a software rejuvenation routine.
An algorithm according to an embodiment of the invention can distinguish between QoS degradation due to a burst of arrivals and performance degradation due to increased service time as a result of system capacity degradation. System capacity degradation may occur due to hardware failures, software bugs or degradation of environmental conditions, such as storm interference in a Wi-Fi environment. If a system is operating at full capacity and a short burst of arrivals is presented, there would be no benefit in executing a preventive maintenance routine. However, if system capacity has been degraded to such an extent that users are effectively locked out of the system, preventive maintenance may be warranted.
An algorithm according to an embodiment of the invention is based on the premise that the customer affecting metric of performance can be sampled frequently and that the first and second moments of the metric can be estimated when the system is operating at full capacity before the monitoring tool is deployed in production.
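By way of illustration only, the estimation of the first and second moments described above can be sketched as follows. The sketch assumes that a list of samples of the metric, collected while the system operates at full capacity, is available. The function name estimate_baseline and the sample values are hypothetical and do not appear in the specification.

```python
import statistics


def estimate_baseline(samples):
    """Return (mean, standard deviation) of baseline QoS samples."""
    if len(samples) < 2:
        raise ValueError("need at least two baseline samples")
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)  # sample standard deviation
    return mean, std


# Hypothetical response times (ms) measured at full capacity.
baseline_ms = [100.0, 110.0, 90.0, 105.0, 95.0]
x_bar, sigma = estimate_baseline(baseline_ms)
```

The resulting pair would play the role of the mean and standard deviation that the monitoring algorithm compares later samples against.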
A multivariate QoS aware software rejuvenation algorithm according to an embodiment of the invention tracks the end-to-end customer affecting performance metric of the high-priority transactions. If it is discovered that the system infrastructure cannot satisfy the set of QoS requirements of the high-priority transactions, an algorithm according to an embodiment of the invention can solve an optimization task to calculate the maximum number of low-priority transactions that should be allowed to run in the system to maximize the likelihood that the high-priority transactions will meet their QoS requirements. An algorithm according to an embodiment of the invention tracks an estimate of the quality-of-service set for high-priority transactions, T, in terms of, e.g., response time, packet loss, and jitter (where these particular metrics may be indexed by i), by maintaining a history of up to K × DMAX[i] recent quality of service measurements. K is defined so that when it is reached the system response time has degraded to a level at which high-priority transactions can no longer satisfy their QoS objectives. Therefore, the system must be immediately rejuvenated. The notation T[i] is used herein to denote the point estimate of the specific quality of service metric i. An algorithm according to an embodiment of the invention divides the history of recent quality of service i measurements into K buckets of depth DMAX[i]. N[i] is a pointer or index to the current bucket for quality of service metric i. The system QoS requirements are used to derive d[i], the number of recent response times ("balls") stored in the current bucket for the quality of service metric i, x̄[i], the average response time objective for the quality of service metric i, and σ[i], the objective standard deviation for the quality of service metric i. K represents the number of standard deviations from the mean that would be tolerated before software rejuvenation is activated.
At any given time, the level d[N[i], i] of the current (N-th) bucket is considered. For each quality of service metric i, when the current bucket overflows, an algorithm according to an embodiment of the invention dynamically computes the depth of the next bucket, and changes the estimation of the expected quality of service measure by adding one standard deviation to the expected value of the metric. This is equivalent to moving to the next bucket. If a bucket underflows, an algorithm according to an embodiment of the invention subtracts one standard deviation from its estimation of the expected delay. This is equivalent to moving down to the previous bucket.
FIG. 1 is a flowchart of a method for estimating a current value of a monitored performance signature, according to an embodiment of the invention. The method illustrated by the flowchart is performed for each sampled transaction and for each quality of service metric i. In the flowchart, N[i] is the current bucket for the quality of service metric i, K is the number of buckets, d[N[i], i] is the current depth of the N[i]th bucket, D[N[i], i] is the maximum depth of the N[i]th bucket, and DMAX is the maximum bucket depth. Initialization is performed at system startup and at rejuvenation with d[0,0]=0, N[0]=0, and D[0,0]=DMAX. FIG. 2 depicts an exemplary set of buckets, according to an embodiment of the invention. Referring to FIG. 2, N represents a bucket index 201 and d represents the number of balls stored in a current bucket 202. In the example shown in FIG. 2, N=4, and there are 8 balls in bucket 4. The K buckets 203 are modeled, tracking the number of balls in each bucket.
Referring back to FIG. 1, a method according to an embodiment of the invention begins at step 101 by comparing the current bucket index N[i] for the current quality of service metric i to K, the total number of buckets. If N[i] is equal to K, then a rejuvenation is triggered, and the method exits.
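As a non-limiting illustration of how the expected value shifts with the bucket index, the following sketch lists the comparison threshold, the mean plus N[i] standard deviations, for each bucket. The function name and the numerical values (a mean of 100, a standard deviation of 10, and K = 4) are assumptions chosen for the example.

```python
def expected_value(x_bar, sigma, n):
    """Expected value of the metric while bucket n is the current bucket."""
    return x_bar + n * sigma


K = 4  # hypothetical number of buckets
thresholds = [expected_value(100.0, 10.0, n) for n in range(K)]
print(thresholds)  # [100.0, 110.0, 120.0, 130.0]
```

Moving up one bucket raises the comparison threshold by one standard deviation, and moving down one bucket lowers it by the same amount.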
If the value of the bucket index N[i] is less than K, the measured QoS metric for the specific metric i, T[i], is compared at step 105 to the expected value x̄[i] + N[i] × σ[i]. If T[i] is greater than the expected value for the current quality of service metric i, the current bucket d[N[i], i] is incremented at step 106; otherwise it is decremented at step 109. After step 106, the bucket d[N[i], i] is compared at step 107 with D[N[i], i], the maximum depth of the N[i]th bucket. If bucket d[N[i], i] exceeds D[N[i], i], i.e., if bucket d[N[i], i] overflows, then at step 108, d[N[i], i] is reset to 0, the depth of the next bucket (N+1) is dynamically computed as $D[N[i]+1, i] = \dfrac{D_{MAX}}{T[i] - (\bar{x}[i] + N[i] \times \sigma[i])}$, and the estimation of the expected value is incremented by adding one standard deviation to the expected value of the metric by incrementing the component bucket index N[i], which is equivalent to moving to the next bucket. On the other hand, after step 109, the bucket d[N[i], i] is compared to 0 at step 110. If bucket d[N[i], i] is less than 0, i.e., if bucket d[N[i], i] underflows, then, at step 111, the bucket d[N[i], i] is reset to 0. If, at step 112, N[i] is greater than 0, then at step 113, the current bucket d[N[i], i] is set to DMAX, the maximum depth of the bucket, and the bucket index N[i] is decremented.
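The depth computation of step 108 may be illustrated with a short numerical sketch. The function name next_bucket_depth and all numerical values below are hypothetical. The specification does not state whether the computed depth is rounded, so the sketch leaves it as a real number.

```python
def next_bucket_depth(d_max, t, x_bar, sigma, n):
    """Depth of bucket n+1, computed when bucket n overflows on sample t."""
    excess = t - (x_bar + n * sigma)  # amount by which t exceeds the expected value
    if excess <= 0:
        raise ValueError("an overflow implies the sample exceeds the expected value")
    return d_max / excess


# Expected value for bucket 1 is 100 + 1 * 2 = 102 in both calls.
small = next_bucket_depth(d_max=30.0, t=106.0, x_bar=100.0, sigma=2.0, n=1)  # 30 / 4 = 7.5
large = next_bucket_depth(d_max=30.0, t=132.0, x_bar=100.0, sigma=2.0, n=1)  # 30 / 30 = 1.0
```

In this example, a sample that exceeds the expected value by a larger amount yields a shallower next bucket, so fewer further degraded samples are needed before the next overflow.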
Equivalent pseudo-code for a QoS aware software rejuvenation algorithm shown in FIG. 1 is as follows.
if (N[i] == K) then
  execute the software rejuvenation routine;
end
if (T[i] > x̄[i] + N[i] × σ[i]) then
  d[N[i],i]++;
else
  d[N[i],i]--;
end
if (d[N[i],i] > D[N[i],i]) then
  d[N[i],i] = 0;
  D[N[i]+1,i] = DMAX / (T[i] - (x̄[i] + N[i] × σ[i]));
  N[i]++;
end
if ((d[N[i],i] < 0) AND (N[i] > 0)) then
  d[N[i],i] = DMAX;
  N[i]--;
end
if ((d[N[i],i] < 0) AND (N[i] == 0)) then
  d[N[i],i] = 0;
end
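For illustration only, the flow of FIG. 1 for a single QoS metric i can be sketched in Python as follows. The class name QosMonitor, its method names, and the parameter values are hypothetical. The sketch follows the flowchart ordering described above, in which the overflow test follows only an increment and the underflow test follows only a decrement, and it leaves the rejuvenation routine itself to the caller.

```python
class QosMonitor:
    """Tracks one QoS metric with K buckets (an illustrative sketch only)."""

    def __init__(self, x_bar, sigma, k, d_max):
        self.x_bar = x_bar    # baseline mean of the metric
        self.sigma = sigma    # baseline standard deviation of the metric
        self.k = k            # number of buckets, K
        self.d_max = d_max    # maximum bucket depth, DMAX
        self.reset()

    def reset(self):
        """Initialization at system startup and after rejuvenation."""
        self.n = 0                                 # current bucket index N[i]
        self.d = [0] * (self.k + 1)                # balls per bucket, d[N, i]
        self.depth = [self.d_max] * (self.k + 1)   # bucket depths, D[N, i]

    def sample(self, t):
        """Process one measurement T[i]; return True when rejuvenation is due."""
        if self.n == self.k:
            return True
        expected = self.x_bar + self.n * self.sigma
        if t > expected:
            self.d[self.n] += 1
            if self.d[self.n] > self.depth[self.n]:   # overflow: move up
                self.d[self.n] = 0
                self.depth[self.n + 1] = self.d_max / (t - expected)
                self.n += 1
        else:
            self.d[self.n] -= 1
            if self.d[self.n] < 0:                    # underflow
                if self.n > 0:                        # move down one bucket
                    self.d[self.n] = self.d_max
                    self.n -= 1
                else:                                 # already at bucket 0
                    self.d[self.n] = 0
        return False


# Hypothetical scenario: sustained degradation far above the baseline.
monitor = QosMonitor(x_bar=100.0, sigma=10.0, k=2, d_max=3)
fired = [monitor.sample(200.0) for _ in range(6)]

# Hypothetical scenario: a healthy sample leaves the monitor at bucket 0.
healthy = QosMonitor(x_bar=100.0, sigma=10.0, k=2, d_max=3)
healthy.sample(90.0)
```

With these assumed values, the first bucket overflows on the fourth degraded sample, the dynamically computed second bucket overflows on the fifth, and the sixth call reports that rejuvenation is due. A caller would then run the rejuvenation routine and call reset().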
An algorithm according to an embodiment of the invention can track a set of QoS metrics of interest and determine the ability of the system to meet its QoS requirements. By dynamically computing the value of D[N[i], i], an algorithm according to an embodiment of the invention can react quickly to significant performance degradation. Resilience to degradation in the customer affecting metric is adjusted by tuning the value of K. The software rejuvenation routine operates by measuring the set of QoS metrics T, computing the actual channel utilization $p[i] = f[i](T[i])$, and finding the value of $p'[i] < p[i]$ for which $T[i] = f^{-1}[i](p'[i])$ is less than the required objective for the QoS metric i. The channel utilization is the fraction of time the channel being sampled is busy. The functions $f[i]$ and $f^{-1}[i]$ can be obtained through a detailed performance analysis of the infrastructure, and represent a mathematical model of the system.
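The search for a reduced utilization can be sketched as follows. Because the specification leaves the model functions to a detailed performance analysis, the sketch substitutes a simple M/M/1 queueing relation, T = S / (1 - p), purely as an assumed stand-in; the function names, the service time S, and the step size are likewise hypothetical.

```python
def utilization_from_metric(t, service_time):
    """Hypothetical model function: M/M/1 utilization implied by response time t."""
    return 1.0 - service_time / t


def metric_from_utilization(p, service_time):
    """Hypothetical inverse model: M/M/1 response time at utilization p."""
    return service_time / (1.0 - p)


def target_utilization(t_measured, t_objective, service_time, step=0.01):
    """Return (p, p_target) where p_target < p meets the objective under the model."""
    if step <= 0:
        raise ValueError("step must be positive")
    p = utilization_from_metric(t_measured, service_time)
    p_target = p - step
    # Lower the utilization until the modeled metric is below the objective.
    while p_target > 0.0 and metric_from_utilization(p_target, service_time) >= t_objective:
        p_target -= step
    return p, max(p_target, 0.0)


# Hypothetical numbers: 10 ms service time, 100 ms measured, 50 ms objective.
p, p_target = target_utilization(t_measured=100.0, t_objective=50.0, service_time=10.0)
```

How a reduced utilization is translated into a limit on low-priority transactions depends on the infrastructure model and is outside the scope of this sketch.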
It should be noted that throughout the specification, embodiments have been described using the terms "bucket" and "ball". These terms are analogous to any method for counting the occurrence of an event. For example, in computer science, one can consider an element of an array as a bucket, wherein the array has K elements (e.g., buckets) and each element stores a number representing a number of times an event, such as a transaction, has occurred (e.g., balls). One of ordinary skill in the art would appreciate that other methods of tracking a set of QoS metrics are possible.
System Implementations
It is to be understood that embodiments of the present invention can be implemented in various forms of hardware, software, firmware, special purpose processes, or a combination thereof. In one embodiment, the present invention can be implemented in software as an application program tangibly embodied on a computer readable program storage device. The application program can be uploaded to, and executed by, a machine comprising any suitable architecture.
FIG. 3 is a block diagram of an exemplary computer system for implementing a method for detecting security intrusions and soft faults in software systems using performance signatures, according to an embodiment of the invention. Referring now to FIG. 3, a computer system 301 for implementing an embodiment of the present invention can comprise, inter alia, a central processing unit (CPU) 302, a memory 303 and an input/output (I/O) interface 304. The computer system 301 is generally coupled through the I/O interface 304 to a display 305 and various input devices 306 such as a mouse and a keyboard. The support circuits can include circuits such as cache, power supplies, clock circuits, and a communication bus. The memory 303 can include random access memory (RAM), read only memory (ROM), disk drive, tape drive, etc., or a combination thereof. The present invention can be implemented as a routine 307 that is stored in memory 303 and executed by the CPU 302 to process the signal from the signal source 308. As such, the computer system 301 is a general purpose computer system that becomes a specific purpose computer system when executing the routine 307 of the present invention.
The computer system 301 also includes an operating system and micro instruction code. The various processes and functions described herein can either be part of the micro instruction code or part of the application program (or combination thereof) which is executed via the operating system. In addition, various other peripheral devices can be connected to the computer platform such as an additional data storage device and a printing device.
It is to be further understood that, because some of the constituent system components and method steps depicted in the accompanying figures can be implemented in software, the actual connections between the systems components (or the process steps) may differ depending upon the manner in which the present invention is programmed. Given the teachings of the present invention provided herein, one of ordinary skill in the related art will be able to contemplate these and similar implementations or configurations of the present invention. While the present invention has been described in detail with reference to exemplary embodiments, those skilled in the art will appreciate that various modifications and substitutions can be made thereto without departing from the spirit and scope of the invention as set forth in the appended claims.

Claims

What is claimed is:
1. A computer-implemented method for monitoring the quality-of-service (QoS) of high priority transactions in a software system, the method implemented by the computer comprising the steps of:
receiving a QoS metric of a high priority transaction that is sampled by a software system monitoring infrastructure, wherein said QoS metric is a specific metric of a set of QoS metrics and is associated with a plurality of buckets;
comparing said sampled specific QoS metric to an expected value for the specific QoS metric, wherein a bucket for the specific QoS metric is incremented if the sampled specific QoS metric exceeds the corresponding expected value, and the bucket for the specific QoS metric is decremented if the sampled specific QoS metric is less than the corresponding expected value;
reinitializing the current bucket to zero, computing a depth of a next bucket for the specific QoS metric, and, increasing a number of standard deviations from a mean value for the specific QoS metric, if the bucket for the specific QoS metric overflows; and
executing a software rejuvenation routine when said bucket for said specific QoS metric exceeds a threshold.
2. The method of claim 1, further comprising initializing the current bucket and the bucket index to zero, and initializing a maximum value of the current bucket to a predetermined maximum.
3. The method of claim 1, wherein the depth of the next bucket for the sampled specific QoS metric is computed as $D[N[i]+1, i] = \dfrac{D_{MAX}}{T[i] - (\bar{x}[i] + N[i] \times \sigma[i])}$, wherein i is an index for the specific QoS metric, N[i] is an index for the current bucket, D[N[i], i] is a maximum value of the current bucket, DMAX is an overall maximum value for all buckets, T[i] is the sampled QoS metric, x̄[i] is a mean of the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
4. The method of claim 1, further comprising reinitializing the current bucket to zero, if the bucket for the specific QoS metric is emptied.
5. The method of claim 4, further comprising reinitializing the current bucket to a predetermined maximum and decreasing a number of standard deviations from a mean value for the sampled specific QoS metric, when a value of the current bucket index is greater than zero.
6. The method of claim 1, wherein said expected value for the specific QoS metric is x̄[i] + N[i] × σ[i], wherein N[i] is an index for the current bucket, x̄[i] is a mean for the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
7. The method of claim 1, wherein said software rejuvenation routine measures said set of QoS metrics T, computes a channel utilization p[i] for each metric as a function of each respective QoS metric i, determines a value p'[i] < p[i] for which T[i] is an inverse function of p[i] that is less than a required value of said specific QoS metric, wherein each function for each specific QoS metric is determined through a performance analysis of the software infrastructure.
8. A computer-implemented method for monitoring the quality-of-service (QoS) of high priority transactions in a software system, the method implemented by the computer comprising the steps of:
receiving a QoS metric of a high priority transaction that is sampled by a software system monitoring infrastructure, wherein said QoS metric is a specific metric of a set of QoS metrics and is associated with a plurality of buckets;
comparing said sampled specific QoS metric to an expected value for the specific QoS metric, wherein a bucket for the specific QoS metric is incremented if the sampled specific QoS metric exceeds the corresponding expected value, and the bucket for the specific QoS metric is decremented if the sampled specific QoS metric is less than the corresponding expected value;
reinitializing the current bucket to zero, if the bucket for the specific QoS metric is emptied; and
executing a software rejuvenation routine when said bucket for said specific QoS metric exceeds a threshold.
9. The method of claim 8, wherein if the current bucket for the specific QoS metric overflows, the method further comprises:
reinitializing the current bucket to zero;
computing a depth of a next bucket for the specific QoS metric based on the amount by which the sampled specific QoS metric exceeds the corresponding expected value; and
increasing a number of standard deviations from a mean value for the specific QoS metric.
10. The method of claim 8, wherein if the current bucket for the specific QoS metric empties, and the bucket index is greater than zero, the method further comprises: reinitializing the current bucket for the specific QoS metric to a predetermined maximum; and decreasing a number of standard deviations from a mean value for the specific QoS metric.
11. The method of claim 7, further comprising initializing the current bucket and the bucket index to zero, and initializing a maximum value of the current bucket to a predetermined maximum.
12. The method of claim 8, wherein the depth of the next bucket for the specific QoS metric is computed as $D[N[i]+1, i] = \dfrac{D_{MAX}}{T[i] - (\bar{x}[i] + N[i] \times \sigma[i])}$, wherein i is an index for the specific QoS metric, N[i] is an index for the current bucket, D[N[i], i] is a maximum value of the current bucket, DMAX is an overall maximum value for all buckets, T[i] is the sampled QoS metric, x̄[i] is a mean of the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
13. The method of claim 8, wherein said expected value for the specific QoS metric is x̄[i] + N[i] × σ[i], wherein N[i] is an index for the current bucket for the specific QoS metric, x̄[i] is a mean for the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
14. The method of claim 8, wherein the QoS metrics are multivariate functions, and the metrics include response time, packet loss, and jitter.
15. A program storage device readable by a computer, tangibly embodying a program of instructions executable by the computer to perform the method steps for monitoring the quality-of-service (QoS) of high priority transactions in a software system, the method implemented by the computer comprising the steps of:
receiving a QoS metric of a high priority transaction that is sampled by a software system monitoring infrastructure, wherein said QoS metric is a specific metric of a set of QoS metrics and is associated with a plurality of buckets;
comparing said sampled specific QoS metric to an expected value for the specific QoS metric, wherein a bucket for the specific QoS metric is incremented if the sampled specific QoS metric exceeds the corresponding expected value, and the bucket for the specific QoS metric is decremented if the sampled specific QoS metric is less than the corresponding expected value;
reinitializing the current bucket to zero, computing a depth of a next bucket for the specific QoS metric, and, increasing a number of standard deviations from a mean value for the specific QoS metric, if the bucket for the specific QoS metric overflows; and
executing a software rejuvenation routine when said bucket for said specific QoS metric exceeds a threshold.
16. The computer readable program storage device of claim 15, the method further comprising initializing the current bucket and the bucket index to zero, and initializing a maximum value of the current bucket to a predetermined maximum.
17. The computer readable program storage device of claim 15, wherein the depth of the next bucket for the sampled specific QoS metric is computed as $D[N[i]+1, i] = \dfrac{D_{MAX}}{T[i] - (\bar{x}[i] + N[i] \times \sigma[i])}$, wherein i is an index for the specific QoS metric, N[i] is an index for the current bucket, D[N[i], i] is a maximum value of the current bucket, DMAX is an overall maximum value for all buckets, T[i] is the sampled QoS metric, x̄[i] is a mean of the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
18. The computer readable program storage device of claim 15, the method further comprising reinitializing the current bucket to zero, if the bucket for the specific QoS metric is emptied.
19. The computer readable program storage device of claim 18, the method further comprising reinitializing the current bucket to a predetermined maximum and decreasing a number of standard deviations from a mean value for the sampled specific QoS metric, when a value of the current bucket index is greater than zero.
20. The computer readable program storage device of claim 15, wherein said expected value for the specific QoS metric is x̄[i] + N[i] × σ[i], wherein N[i] is an index for the current bucket, x̄[i] is a mean for the specific QoS metric, and σ[i] is a standard deviation of the mean QoS.
21. The computer readable program storage device of claim 15, wherein said software rejuvenation routine measures said set of QoS metrics T, computes a channel utilization p [i] for each metric as a function of each respective QoS metric i, determines a value p'[i] < p[i] for which T[i] is an inverse function of p [i] that is less than a required value of said specific QoS metric, wherein each function for each specific QoS metric is determined through a performance analysis of the software infrastructure.
22. The computer readable program storage device of claim 15, wherein the QoS metrics are multivariate functions, and the metrics include response time, packet loss, and jitter.
PCT/US2011/037510 2010-06-18 2011-05-23 System and method for quality - of - service aware rejuvenation WO2011159430A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US35616210P 2010-06-18 2010-06-18
US61/356,162 2010-06-18
US40575010P 2010-10-22 2010-10-22
US61/405,750 2010-10-22
US12/949,913 US8423833B2 (en) 2004-11-16 2010-11-19 System and method for multivariate quality-of-service aware dynamic software rejuvenation
US12/949,913 2010-11-19

Publications (1)

Publication Number Publication Date
WO2011159430A1 (en) 2011-12-22

Family

ID=44343868

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2011/037510 WO2011159430A1 (en) 2010-06-18 2011-05-23 System and method for quality - of - service aware rejuvenation

Country Status (1)

Country Link
WO (1) WO2011159430A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040078657A1 (en) * 2002-10-22 2004-04-22 Gross Kenny C. Method and apparatus for using pattern-recognition to trigger software rejuvenation
US20100241905A1 (en) * 2004-11-16 2010-09-23 Siemens Corporation System and Method for Detecting Security Intrusions and Soft Faults Using Performance Signatures

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"FLOW CONTROL OF PRIORITIZED DATA IN A MULTIMEDIA COMMUNICATIONS SYSTEM", IBM TECHNICAL DISCLOSURE BULLETIN, INTERNATIONAL BUSINESS MACHINES CORP. (THORNWOOD), US, vol. 37, no. 1, 1 January 1994 (1994-01-01), pages 531/532, XP000428874, ISSN: 0018-8689 *
AVRITZER A ET AL: "Ensuring stable performance for systems that degrade", PROCEEDINGS OF THE FIFTH INTERNATIONAL WORKSHOP ON SOFTWARE AND PERFORMANCE, WOSP'05 - PROCEEDINGS OF THE FIFTH INTERNATIONAL WORKSHOP ON SOFTWARE AND PERFORMANCE, WOSP'05 2005 ASSOCIATION FOR COMPUTING MACHINERY US, 2005, pages 43 - 51, XP002656599 *
AVRITZER ET AL: "Ensuring system performance for cluster and single server systems", JOURNAL OF SYSTEMS & SOFTWARE, ELSEVIER NORTH HOLLAND, NEW YORK, NY, US, vol. 80, no. 4, 16 February 2007 (2007-02-16), pages 441 - 454, XP005892112, ISSN: 0164-1212, DOI: 10.1016/J.JSS.2006.07.020 *
DATABASE COMPENDEX [online] ENGINEERING INFORMATION, INC., NEW YORK, NY, US; 2005, AVRITZER A ET AL: "Ensuring stable performance for systems that degrade", XP002656598, Database accession no. E20064010141651 *

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 11725239

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 2004-01-01)
NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 11725239

Country of ref document: EP

Kind code of ref document: A1