US20210073680A1 - Data driven methods and systems for what if analysis - Google Patents

Data driven methods and systems for what if analysis

Info

Publication number
US20210073680A1
Authority
US
United States
Prior art keywords
demand
change
resource
simulation
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US17/028,166
Inventor
Dustin Garvey
Sampanna Shahaji Salunke
Uri Shaft
Amit Ganesh
Sumathi Gopalakrishnan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Oracle International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oracle International Corp
Priority to US17/028,166
Publication of US20210073680A1
Assigned to ORACLE INTERNATIONAL CORPORATION (assignment of assignors' interest; see document for details). Assignors: GANESH, AMIT; GOPALAKRISHNAN, SUMATHI; GARVEY, DUSTIN; SALUNKE, SAMPANNA SHAHAJI; SHAFT, URI

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources

Definitions

  • the present disclosure relates to analytical models that process time-series data.
  • the present disclosure relates to training and evaluating what-if models to analyze performance of computing resources.
  • many different inputs may affect the performance of hardware and software resources.
  • the performance of a database server may be impacted by the frequency of database transactions, user calls, and/or query executions, among other factors. As the number and frequency of inputs into a resource increase, so does the likelihood that the resource's performance will degrade.
  • System administrators are generally responsible for ensuring that resources within the computing environment are meeting performance expectations. As the demand on resources increases, system administrators may deploy additional resources and/or balance the load across existing resources to maintain the quality of service. As the demand decreases, resources may be brought offline to mitigate waste and improve efficiency. If administrators fail to adequately understand or anticipate demands, system resources may become overloaded, leading to performance degradation.
  • FIG. 1A illustrates an example system comprising a set of what-if analytic services that are driven by time-series data captured from one or more host devices, in accordance with one or more embodiments;
  • FIG. 1B illustrates an example dataflow for simulating scenarios and generating scenario outputs, in accordance with one or more embodiments
  • FIG. 2 illustrates an example set of operations for training a set of correlation prediction models, in accordance with one or more embodiments
  • FIG. 3 illustrates an example set of demand propagation models, in accordance with one or more embodiments
  • FIG. 4 illustrates an example set of correlation prediction models, in accordance with one or more embodiments
  • FIG. 5 illustrates an example set of operations for performing seasonal aware training of simulation models, in accordance with one or more embodiments
  • FIG. 6 illustrates an example set of seasonal-aware correlation prediction models, in accordance with one or more embodiments
  • FIG. 7 illustrates an example set of operations for simulating a historical scenario, in accordance with one or more embodiments
  • FIG. 8A illustrates an example presentation of propagating adjustments in one demand to another demand, in accordance with one or more embodiments
  • FIG. 8B illustrates an example presentation of adjustments generated for resource time-series during a historical simulation, in accordance with one or more embodiments
  • FIG. 9 illustrates an example set of operations for simulating a forecast scenario, in accordance with one or more embodiments
  • FIG. 10A illustrates an example set of demand time-series that were adjusted during a forecast scenario simulation, in accordance with one or more embodiments
  • FIG. 10B illustrates an example set of resource time-series that were adjusted during a forecast scenario simulation, in accordance with one or more embodiments.
  • FIG. 11 illustrates an example computing system upon which one or more embodiments may be implemented.
  • System administrators may rely on various tools to monitor system performance in computing environments. For example, system administrators may deploy performance profilers to track various performance metrics associated with hardware and/or software resources. Performance profilers may generate periodic reports that track overall system performance as well as the performance of individual resources. Performance profilers may further generate alerts to notify the system administrator if the performance metrics cross a threshold. The alerts allow system administrators to address problems that may arise in the event of performance degradation or too many idle resources.
  • Performance profilers are generally directed to analyzing historical metrics. For example, a performance profiler may track various metrics, such as central processing unit (CPU) utilization, physical memory accesses, function calls, database transactions, etc. While historical profiling presents an overall picture of how the system has performed over time, the results are generally constrained to historical observations. System administrators are often interested in analyzing other possible scenarios, even if these scenarios have not occurred in the past.
  • What-if analytical models allow system administrators to simulate and observe the impact of changes within a system environment.
  • One approach to the what-if analysis is for the system administrator to define performance capacity of a system and the performance requirements for software resources deployed within a system.
  • a cloud administrator may define an actual or hypothetical host capacity and the requirements of running an instance of the cloud service on the host. The cloud administrator may then simulate adding and removing instances from the host to determine how system performance is affected.
  • This approach allows administrators to more accurately predict how adding resources affects system performance.
  • However, the approach relies on the administrator's domain knowledge. In complex systems with variable inputs and demands, the administrator may not be aware of, or be able to keep up with, how changes will affect a particular resource.
  • machine-learning processes and systems may be used to train what-if analytical models (also referred to herein as simulation models) based on historical metrics captured from a computing environment.
  • the machine-learning processes may train/build the simulation models based on a variety of inputs and machine-learning algorithms.
  • the what-if analytic may be implemented to learn complex relationships between system demands and resources from historical time-series data. Additionally or alternatively, the what-if analytic 138 may account for trends and seasonal patterns when training the simulation models.
  • These techniques may provide greater accuracy and more robust analytic capabilities when simulating scenarios in computing environments. As a result, capacity planning operations may be more optimally tuned for a wider variety of scenarios to increase the efficiency with which computing resources are allocated and deallocated within the computing environment.
  • a simulation model comprises a set of one or more underlying correlation prediction models.
  • Each correlation prediction model may be trained using a set of time-series datasets associated with at least one demand on a resource and at least one performance metric for the resource.
  • a demand time-series may comprise a sequence of data points captured over time for any input or metric representing the use of a resource.
  • Example demands may include, but are not limited to, user logons to access a resource, transactions per second (e.g., on a database or other transactional system), executions per second, and resource calls per second.
  • Example performance metrics that may be tracked may include, but are not limited to CPU performance metrics (e.g., CPU utilization rates, thread counts, etc.), memory bandwidth metrics (e.g., memory usage rates, cache hit rates, etc.), I/O metrics (e.g., physical reads and writes to disk), and network metrics (e.g., packet counts, packet flow rates, etc.).
  • a correlation prediction model may be used to project values for one or more performance metrics of a resource as a function of learned correlation patterns. For example, a correlation prediction model may output a projected CPU utilization rate based on an input demand or combination of demands. Other correlation prediction models may output other performance metrics depending on the particular implementation.
  • a simulation model accounts for seasonal correlation patterns. For example, within a computing environment, high and/or low demands may recur on a seasonal basis (e.g., monthly, weekly, daily, etc.) In some cases, the seasonal pattern may correlate strongly with a set of resource performance values, while in other cases, the seasonal pattern may be weakly correlated (or may not correlate at all) with resource performance. For example, weekly high transaction loads may correlate with a decrease in performance for a database resource. On the other hand, weekly low transaction loads may not be correlated with database performance. This may occur due to maintenance and batch process scheduling during low transaction periods of time. The maintenance and batch processing may also significantly impact the database performance, reducing the correlation of low transaction seasonal periods.
  • a simulation model comprises different correlation prediction models to account for different seasonal patterns.
  • one correlation prediction model may be trained as a function of the correlation between seasonal high data points in transactions and CPU utilization.
  • Another correlation model may be trained as a function of seasonal low values or some other seasonal pattern. The projected performance values may thus vary based on learned seasonal behavior.
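  • As a rough illustration of training separate models per seasonal pattern, the sketch below fits one least-squares line per seasonal class. The class labels ("high", "low"), the sample values, and the choice of a simple linear fit are assumptions made for illustration; the disclosure does not prescribe them.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def train_seasonal_models(samples):
    """samples: iterable of (season_label, demand, resource) tuples.
    Returns one (slope, intercept) pair per seasonal label."""
    grouped = {}
    for label, demand, resource in samples:
        xs, ys = grouped.setdefault(label, ([], []))
        xs.append(demand)
        ys.append(resource)
    return {label: fit_line(xs, ys) for label, (xs, ys) in grouped.items()}

def predict(models, label, demand):
    """Project a resource value using the model for the given seasonal class."""
    slope, intercept = models[label]
    return slope * demand + intercept

# Hypothetical samples: (seasonal class, transactions/sec, CPU utilization %).
samples = [
    ("high", 100, 50.0), ("high", 200, 70.0), ("high", 300, 90.0),
    ("low", 10, 30.0), ("low", 20, 31.0), ("low", 30, 32.0),
]
models = train_seasonal_models(samples)
high_projection = predict(models, "high", 250)
low_projection = predict(models, "low", 40)
```

  • With these made-up samples, the same demand value would be projected differently depending on which seasonal model is selected, which is the behavior the passage describes.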
  • the simulation model may be trained to account for patterns and interdependencies between different demands on a resource. For example, an increase in one demand may correlate to an increase or decrease in another demand. In order to capture this relationship, a demand propagation model may be trained based on the correlation patterns extracted from historical time-series data. The demand propagation models may be used to predict how changes in one demand would affect values in other demands.
  • a simulation model may be evaluated against one or more what-if scenarios.
  • Each what-if scenario defines a set of one or more scenario parameters including an adjustment to one or more resource and/or demand time-series values. For example, a user may wish to determine how an increase in the number of database transactions and/or one or more other demands affects CPU utilization on a target host.
  • the user may input, into the simulation model, the magnitude of the adjustment of the set of resource values.
  • the simulation model generates and presents a projected adjustment to a set of one or more demand values.
  • the simulation model may run historical simulations and/or forecast simulations. With a historical simulation, adjustments are made to one or more historical values to determine how the change would have affected past performance. For example, the user may query the simulation model to determine how one hundred more instances of a cloud server would have affected performance of a particular target host in the past week.
  • a forecast simulation may involve changes to one or more historical values and/or one or more forecasted values to determine how a change would affect future performance.
  • the simulation model may comprise a forecasting model to project future values.
  • the simulation model may adjust a set of forecasted values.
  • the what-if analytic may be used to optimize system resources and configurations. For example, one or more simulations may be run to determine how various adjustments would affect resource performance. Based on the output of the simulations, one or more responsive actions may be taken. Example responsive actions may include deploying additional resources, bringing resources offline, and adjusting system configurations. If a system administrator is expecting additional demands on a resource, for instance, various scenarios may be evaluated to determine which scenario leads to the optimal resource configuration, which may be the configuration that maximizes performance (or satisfies a threshold level) using the fewest resources.
  • a time series signal in this context, comprises a sequence of values that are captured over time.
  • the source of the time series signal and the type of information that is captured may vary from implementation to implementation.
  • a time series may be collected from one or more software and/or hardware resources and capture various performance attributes of the resources from which the data was collected.
  • a time series may be collected using one or more sensors that measure physical properties, such as temperature, pressure, motion, traffic flow, or other attributes of an object or environment.
  • FIG. 1A illustrates an example system comprising a set of what-if analytic services that are driven by time-series data captured from one or more host devices.
  • System 100 generally comprises hosts 110 a - n , data collector 120 , analytic services 130 , data repository 140 , and clients 150 a - k .
  • Components of system 100 may be implemented in one or more host machines operating within one or more clouds or other networked environments, depending on the particular implementation. Each component may be distributed over multiple applications and/or machines. Multiple components may be combined into one application and/or machine. Operations described with respect to one component may instead be performed by another component.
  • Hosts 110 a - n represent a set of one or more network hosts and generally comprise targets 112 a - i and agents 114 a - j .
  • a “target” in this context refers to a resource that serves as a source of time series data.
  • a target may be a software deployment such as a database server instance, middleware instance, or some other software resource executing on a network host.
  • a target may be a hardware resource, an environmental characteristic, or some other physical resource for which metrics may be measured and tracked.
  • Agents 114 a - j comprise hardware and/or software logic for capturing time-series measurements from a corresponding target (or set of targets) and sending these metrics to data collector 120 .
  • an agent includes a process, such as a service or daemon, that executes on a corresponding host machine and monitors one or more software and/or hardware resources that have been deployed.
  • an agent may include one or more hardware sensors, such as microelectromechanical (MEMs) accelerometers, thermometers, pressure sensors, etc., that capture time-series measurements of a physical environment and/or resource.
  • The number of agents and/or targets per host shown in FIG. 1A may vary from implementation to implementation. Multiple agents may be installed on a given host to monitor different target sources of time series data.
  • an agent that resides remotely on a different host than a target may be responsible for collecting sample time-series data from the target.
  • Data collector 120 includes logic for aggregating data captured by agents 114 a - j into a set of one or more time-series. Data collector 120 may store the time series data in data repository 140 . Additionally or alternatively, data collector 120 may provide the time-series data to analytic services 130 . In one or more embodiments, data collector 120 receives data from agents 114 a - j over one or more data communication networks, such as the Internet.
  • Example communication protocols that may be used to transport data between the components illustrated within system 100 may include, without limitation, the hypertext transfer protocol (HTTP), simple network management protocol (SNMP), and other communication protocols of the internet protocol (IP) suite.
  • data collector 120 collects demand and resource metrics from agents 114 a - j .
  • example demand metrics may include, but are not limited to, user logons to access a resource, transactions per second (e.g., on a database or other transactional system), executions per second, and calls per second associated with the resource.
  • Example performance metrics that may be tracked may include, but are not limited to CPU performance metrics (e.g., CPU utilization rates, thread counts, etc.), memory bandwidth metrics (e.g., memory usage rates, cache hit rates, etc.), I/O metrics (e.g., physical reads and writes to disk), and network metrics (e.g., packet counts, packet flow rates, etc.).
  • Analytic services 130 includes correlation modelling logic 132 , seasonality modelling logic 134 , forecast modelling logic 136 , and what-if analytic 138 .
  • Each service may be invoked independently or in combination to train and/or evaluate time-series models, as described in further detail below.
  • correlation modelling logic 132 may train/evaluate correlation prediction models (e.g., demand propagation models and resource prediction models)
  • seasonality modelling logic 134 may train/evaluate seasonal behavioral models
  • forecast modelling logic 136 may train/evaluate forecast models
  • what-if analytic 138 may evaluate/simulate what-if scenarios using the underlying models.
  • Models may be trained/updated periodically or on-demand based on time-series data collected from targets 112 a - i.
  • Data repository 140 includes volatile and/or non-volatile storage for storing data for analytic services 130 , such as trained simulation models and the results of running scenario simulations.
  • Data repository 140 may be implemented by any type of storage unit and/or device (e.g., a file system, database, collection of tables, disk, tape cartridge, random access memory, or any other storage mechanism) for storing data.
  • data repository 140 may include multiple different storage units and/or devices. The multiple different storage units and/or devices may or may not be of the same type or located at the same physical site.
  • data repository 140 may be implemented or may execute on the same computing system as one or more other components of FIG. 1A and/or may reside remotely from one or more other components.
  • Clients 150 a - k represent one or more clients that may access analytic services 130 to evaluate what-if scenarios.
  • a “client” in this context may be a human user, such as an administrator, a client program, or some other application instance.
  • a client may execute locally on the same host as analytic services 130 or may execute on a different machine. If executing on a different machine, the client may communicate with analytic services 130 via one or more data communication protocols according to a client-server model, such as by submitting HTTP requests invoking one or more of the services and receiving HTTP responses comprising results generated by one or more of the services.
  • Analytic services 130 may provide clients 150 a - k with an interface through which one or more of the provided services may be invoked.
  • Example interfaces may comprise, without limitation, a graphical user interface (GUI), an application programming interface (API), a command-line interface (CLI) or some other interface that allows a user to interact with and invoke one or more of the provided services.
  • FIG. 1B illustrates an example dataflow for simulating scenarios and generating scenario outputs, in accordance with one or more embodiments.
  • analytic services 130 receives, as input, demand and resource time-series data 142 .
  • Correlation modelling logic 132 , seasonality modelling logic 134 , and forecast modelling logic 136 each process demand and resource time-series data 142 to build respective time-series models.
  • correlation modelling logic 132 may train correlation prediction models
  • seasonality modelling logic 134 may train seasonal pattern models
  • forecast modelling logic 136 may train forecast models. Example operations for building and training these time-series models are described in further detail below.
  • seasonality modelling logic 134 provides seasonal pattern representations to correlation modelling logic 132 and/or forecast modelling logic 136 . Based on the seasonal pattern representations, correlation modelling logic 132 may generate seasonal-aware correlation prediction models that account for seasonal behavior in the correlation patterns. Additionally or alternatively, forecast modelling logic 136 may generate seasonal-aware forecast models that account for seasonal behavior in the forecasts.
  • scenario parameters 144 comprise a set of values that define a particular scenario to simulate.
  • scenario parameters 144 define at least one adjustment to a demand or resource time-series value.
  • a scenario may be defined as follows: “What if my system saw X % more user-demand”. Based on this scenario definition, historical and/or forecast simulations may be evaluated by what-if analytic 138 .
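  • A scenario definition of the form "X % more user-demand" could be represented and applied along the following lines. This is a minimal sketch: the `ScenarioParameters` class, its field names, and the metric name are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScenarioParameters:
    """Hypothetical container for one adjustment in a what-if scenario."""
    metric: str            # name of the demand or resource time-series to adjust
    percent_change: float  # e.g. 10.0 means "10 % more"

def apply_adjustment(series, params):
    """Return a copy of the series scaled by the scenario's percentage change."""
    factor = 1.0 + params.percent_change / 100.0
    return [value * factor for value in series]

# "What if my system saw 10 % more user calls?" (illustrative values)
scenario = ScenarioParameters(metric="user_calls_per_sec", percent_change=10.0)
historical_user_calls = [100.0, 200.0, 150.0]
adjusted_user_calls = apply_adjustment(historical_user_calls, scenario)
```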
  • what-if analytic 138 may use the demand propagation models to project how the change in user demand would affect other demands. For example, for a scenario “What if my system saw X % more transactions”, what-if analytic 138 may determine what change, if any, would occur in the number of redo records generated per second and/or any other demand on a database resource. What-if analytic 138 may then generate adjustments to the other demands according to the trained demand propagation models.
  • what-if analytic 138 may use the resource prediction models to project how the change in user demand would affect resource performance.
  • the resource prediction model may account for adjustments to other demands.
  • physical writes to disk may be a function of both transactions per second and redo records per second.
  • an adjustment to the number of transactions per second may be propagated to the redo records per second before adjusting the physical writes to disk.
  • the projected change to the physical writes is not computed solely as a function of the adjustment to a single demand (e.g., based on a one-to-one mapping between the demand and resource). Rather, complex relationships (e.g., many-to-one and many-to-many mappings) between different demands and resources are accounted for in the simulation models, which may increase the accuracy of the projected changes.
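  • The two-stage evaluation described above (propagate the adjusted demand to a correlated demand, then predict the resource from both) might be sketched as follows. The linear model forms and all coefficients are assumptions chosen for illustration, not values from the disclosure.

```python
def propagate(adjusted_tps, propagation_model):
    """Project redo records/sec from transactions/sec with a linear propagation model."""
    slope, intercept = propagation_model
    return slope * adjusted_tps + intercept

def predict_writes(tps, redo, resource_model):
    """Project physical writes as a function of both demands (many-to-one mapping)."""
    w_tps, w_redo, bias = resource_model
    return w_tps * tps + w_redo * redo + bias

# Hypothetical, already-trained coefficients.
tps_to_redo = (2.0, 5.0)          # redo/sec = 2.0 * tps + 5.0
writes_model = (0.5, 0.25, 10.0)  # writes = 0.5 * tps + 0.25 * redo + 10.0

baseline_tps = 100.0
scenario_tps = baseline_tps * 1.20                     # "what if 20 % more transactions"
scenario_redo = propagate(scenario_tps, tps_to_redo)   # step 1: propagate to the other demand
scenario_writes = predict_writes(scenario_tps, scenario_redo, writes_model)  # step 2
```

  • Skipping step 1 and feeding the unadjusted redo rate into the resource model would understate the projected writes in this example, which is the one-to-one shortcut the passage warns against.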
  • scenario outputs 146 capture adjustments made to demand and/or resource time-series data for a scenario definition.
  • scenario outputs may reflect historical and/or forecasted changes to one or more resource performance metrics and/or other demand metrics.
  • Scenario outputs may be stored in data repository 140 , provided to clients 150 a - k (e.g., by sending over a network to another host device or notifying a separate application executing on a host) and/or presented via an interactive display.
  • An application or other user may process the scenario outputs to make adjustments to targets 112 a - i .
  • If a scenario simulation indicates that predicted performance metrics for a scenario will not satisfy a threshold, additional resources may be deployed and/or existing resources may be reconfigured (e.g., through load balancing, pooling, etc.) to boost performance.
  • the number of resources that are deployed may also be selected based on scenario simulations that satisfy the target performance threshold. In other cases, resources may be brought offline or other responsive actions may be taken to optimize system 100 for a particular scenario.
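  • One way to select the number of resources from scenario simulations is to evaluate increasing host counts until the projected metric satisfies the threshold. The sketch below assumes the total demand is balanced evenly across hosts and uses a hypothetical linear utilization model; both are simplifications for illustration.

```python
def fewest_hosts(total_demand, predict_utilization, threshold, max_hosts):
    """Return the smallest host count whose projected utilization is at or
    below the threshold, or None if no count up to max_hosts qualifies."""
    for hosts in range(1, max_hosts + 1):
        per_host_demand = total_demand / hosts  # assumes even load balancing
        if predict_utilization(per_host_demand) <= threshold:
            return hosts
    return None

def predict_utilization(tps):
    """Hypothetical trained resource prediction model: CPU % from transactions/sec."""
    return 0.1 * tps + 10.0

# Scenario: 2000 transactions/sec in total, CPU utilization must stay at or below 80 %.
required_hosts = fewest_hosts(2000.0, predict_utilization, threshold=80.0, max_hosts=10)
```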
  • Performance of a software or hardware resource may be a function of many different inputs and interactions. For example, CPU utilization on a database host may be impacted by the frequency of transactions; a greater number of transactions per second may typically correlate with greater CPU utilization. However, CPU utilization may also be affected by other demands, such as the types of queries executed by the database host.
  • Online transaction processing (OLTP) queries/workloads, such as table inserts and updates, generally do not impact CPU utilization as significantly as online analytical processing (OLAP) queries/workloads. The reason for the disparity is that an OLAP query generally involves scans and analysis of a much greater number of records within the database than an OLTP query.
  • Other demands may also have varying impact on the database host's performance. Other examples include, but are not limited to, the number of redo records (i.e., a log that stores a history of changes made to the database) generated per second, and the number of user calls to the database.
  • correlation modelling logic 132 builds a set of correlation prediction models to represent relationships between demands and resources.
  • a correlation prediction model, in this context, is a data object or data structure that is generated, in data repository 140 , as a representation of a learned pattern of correlation between different time-series.
  • one type of correlation prediction model, also referred to herein as a demand propagation model, may map values for a demand metric to correlated values for another demand metric.
  • For example, a demand propagation model may map values associated with one demand (e.g., transactions per second) to another demand (e.g., redo records generated per second).
  • Another type of correlation prediction model may map values for a demand metric to correlated values for a resource performance metric.
  • a model may map one or more demand value ranges (e.g., transactions per second, user calls per second, etc.) to correlated CPU utilization rates.
  • correlation modelling logic 132 comprises a set of training processes implementing machine-learning algorithms to train a set of models.
  • the machine-learning algorithms may perform regression, clustering, and/or classification to train models, as described further below.
  • the set of training processes may initially train the set of models using a training set of time-series data. As additional time-series data is received, the correlation prediction models may be updated to reflect changes and/or additional learned patterns between resources and demands.
  • FIG. 2 illustrates an example set of operations for training a set of demand propagation and correlation prediction models, in accordance with one or more embodiments.
  • the set of operations includes receiving time-series data for a set of demands (Operation 210 ).
  • correlation modelling logic 132 may receive, from data collector 120 , a sequence of data points capturing the number of instruction executions, transactions per second, redo records generated, function calls, and/or user calls per second for one or more of targets 112 a - i . Additionally or alternatively, other demands may also be captured for analysis, depending on the particular implementation.
  • time-series data may be received on-demand, periodically, or on a streaming/continuous basis from data collector 120 .
  • the training set of time-series data is conditioned before analysis.
  • correlation modelling logic 132 may perform an hourly rollup and/or a time-alignment of different time-series datasets to compare values from the same hourly time-range.
  • values may be rolled up over a different time period (e.g., on a per minute basis, per half-hour, daily, etc.)
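  • The conditioning step can be sketched as an hourly rollup followed by alignment on shared hours. The disclosure does not say how values within an hour are aggregated; averaging is an assumption here, as are the sample data points.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def hourly_rollup(points):
    """points: iterable of (datetime, value). Returns {hour_start: mean value}."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return {hour: mean(values) for hour, values in buckets.items()}

def align(series_a, series_b):
    """Keep only the hours present in both rolled-up series, in time order."""
    hours = sorted(series_a.keys() & series_b.keys())
    return [(hour, series_a[hour], series_b[hour]) for hour in hours]

# Illustrative raw samples for one demand and one resource metric.
tps = [(datetime(2020, 1, 1, 0, 5), 10), (datetime(2020, 1, 1, 0, 35), 20),
       (datetime(2020, 1, 1, 1, 10), 30)]
cpu = [(datetime(2020, 1, 1, 0, 15), 40.0), (datetime(2020, 1, 1, 2, 0), 50.0)]

aligned = align(hourly_rollup(tps), hourly_rollup(cpu))
```

  • Only the 00:00 hour appears in both series in this example, so the aligned output contains a single row; hours observed in just one series are dropped before comparison.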
  • correlation modelling logic 132 analyzes different demands and filters out/removes demands that contain redundant information.
  • Demand filtering reduces processing and storage overhead by eliminating the training and storage of redundant models. Demand filtering is described further below with reference to operations 220 to 260 .
  • correlation modelling logic 132 selects a demand pair to analyze (Operation 220 ). For example, if the time series includes three inputs/demands on a system, D1, D2, and D3, there are three possible demand pairs: {D1, D2}, {D1, D3}, and {D2, D3}.
  • the demand pairs may be selected and processed in any order.
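  • Enumerating the unordered demand pairs is straightforward with the standard library; the demand names below simply mirror the three-demand example.

```python
from itertools import combinations

demands = ["D1", "D2", "D3"]

# Every unordered pair of distinct demands; n demands yield n * (n - 1) / 2 pairs.
demand_pairs = list(combinations(demands, 2))
```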
  • correlation modelling logic 132 next determines a correlation coefficient using the training set of time-series data for the selected demand pair (Operation 230 ). For example, a Pearson's correlation coefficient may be computed as follows:
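  • The formula itself does not appear in this text. For reference, the standard textbook definition of Pearson's correlation coefficient for two time-aligned series x and y of length n, with sample means \bar{x} and \bar{y}, is given below; the notation is an assumption and may differ from the expression used in the application.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

  • The coefficient ranges from -1 to 1, with values near 1 indicating that the two series move together almost linearly.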
  • Correlation modelling logic 132 next compares the correlation coefficient to a filter threshold to determine whether to filter one of the demands from the demand pair (Operation 240 ).
  • the filter threshold may vary from implementation to implementation.
  • the filter threshold for the Pearson's correlation coefficient may be set at 0.99. A correlation coefficient this high indicates that the demands track each other very closely and may be treated as representing the same behavioral patterns. In other cases, a slightly lower filter threshold may be used to allow for a greater amount of deviation between the demands, thereby increasing the likelihood of filtering.
  • If the correlation coefficient satisfies the filter threshold, then correlation modelling logic 132 removes one of the demands from the demand pair (Operation 250 ). Once removed, a demand is not used to train demand propagation models or resource prediction models. Instead, the demand propagation model and resource prediction model for the remaining demand in the demand pair may be applied to the removed demand since the demands are highly correlated.
  • Correlation modelling logic 132 next determines whether there are any demand pairs remaining (Operation 260 ). The determination may be made based on whether any demands were filtered. For example, in the above example with three demand pairs, if {D 1 , D 2 } was analyzed and no demands were filtered, then there are two remaining demand pairs to analyze: {D 1 , D 3 } and {D 2 , D 3 }. On the other hand, if D 2 was filtered out, then only one demand pair is left to analyze: {D 1 , D 3 }. The process returns to operation 220 for each remaining demand pair.
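  • The filtering loop of operations 220 to 260 can be sketched as below. The helper names (`pearson`, `filter_redundant`), the choice to always drop the second demand of a pair, and the handling of constant series are assumptions made for illustration; the coefficient itself is the standard Pearson formula.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / (sx * sy)


def filter_redundant(demands, filter_threshold=0.99):
    """Drop one demand from every pair correlated at or above the threshold.

    Returns (kept, alias). alias maps each removed demand to the retained
    demand whose models may be reused for it.
    """
    kept = dict(demands)
    alias = {}
    for a, b in combinations(list(demands), 2):
        if a not in kept or b not in kept:
            continue  # the pair disappears once one member is filtered
        if pearson(kept[a], kept[b]) >= filter_threshold:
            del kept[b]
            alias[b] = a
    return kept, alias


# Hypothetical training data: D2 is an exact linear function of D1.
demands = {
    "D1": [10.0, 20.0, 30.0, 40.0, 50.0],
    "D2": [21.0, 41.0, 61.0, 81.0, 101.0],
    "D3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept, alias = filter_redundant(demands)
```

  • With this data D 2 is filtered after the first pair, so {D 2 , D 3 } is never evaluated and only {D 1 , D 3 } remains, matching the three-demand example above.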
  • correlation modelling logic 132 trains a set of one or more demand propagation models for the remaining demand pairs (Operation 270 ).
  • correlation modelling logic 132 trains a demand propagation model by fitting a correlation prediction model to the observed data sets of each demand pair. Given demand pair {D 1 , D 2 }, for instance, a predictive model may be fit using linear regression as follows:
  • demand propagation models are only trained for demand pairs if the correlation coefficient is above a threshold.
  • the training threshold in this case may be set much lower than the filter threshold as no demand pairs remain with a correlation coefficient above the filter threshold.
  • a training threshold may be set at 0.25 or any other value, depending on the particular implementation.
  • the training threshold may eliminate having to train/store demand propagation models for demands that have little to no correlation.
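  • A minimal sketch of training a demand propagation model subject to a training threshold follows. It assumes a single-variable ordinary least squares fit with an intercept, compares the absolute value of the coefficient to the threshold, and assumes neither series is constant; these choices and the function names are illustrative.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Also returns Pearson's r so the caller can apply a training threshold.
    Assumes neither x nor y is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept, sxy / sqrt(sxx * syy)


def train_propagation_model(d1, d2, training_threshold=0.25):
    """Return (slope, intercept) for D2 ~ D1, or None if weakly correlated."""
    slope, intercept, r = fit_line(d1, d2)
    if abs(r) < training_threshold:
        return None  # no model is trained or stored for this pair
    return slope, intercept


# Hypothetical demand series.
executions = [100.0, 200.0, 300.0, 400.0]
transactions = [12.0, 22.0, 32.0, 42.0]
unrelated = [1.0, -1.0, -1.0, 1.0]

model = train_propagation_model(executions, transactions)
skipped = train_propagation_model(executions, unrelated)
```

  • Here `model` maps executions to transactions, while `skipped` is `None` because the second pair has a correlation of zero.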
  • FIG. 3 illustrates an example set of demand propagation models, in accordance with one or more embodiments.
  • demand propagation models are trained for demand pairs that have a correlation above a threshold (e.g., 0.25). Referring to the demand pairs:
  • the process includes receiving time-series datasets for resource performance metrics (Operation 280 ).
  • correlation modelling logic 132 may receive a sequence of samples that track CPU utilization, memory bandwidth, I/O operations, and/or other performance metric over time.
  • the resource performance metrics may be received at any point in the process on demand, periodically, or on a streaming basis. Upon receipt, the resource performance metrics may be conditioned as previously described.
  • correlation modelling logic 132 trains a set of resource prediction models (Operation 290 ).
  • correlation modelling logic 132 trains a resource prediction model by fitting a correlation prediction model to the observed data sets of a demand/resource metric pairing.
  • a predictive model may be fit using linear regression as follows:
  • $R_1 = D_1 \beta + \epsilon_i$ (3)
  • resource prediction models are only trained for demand-to-resource pairs if the correlation coefficient is above a threshold.
  • a training threshold may be set at 0.25, as above for the demand propagation models, or any other value, depending on the particular implementation.
  • the training threshold may eliminate having to train/store resource prediction models for demands that have little to no correlation.
  • FIG. 4 illustrates an example set of resource prediction models, in accordance with one or more embodiments.
  • resource prediction models are trained for resource-to-demand pairings that have a correlation above a threshold (e.g., 0.4). Referring to the pairings:
  • host CPU utilization is a function of redo size, transactions, and user calls per second. In other cases, CPU utilization may be a function of other demands in addition or as an alternative to those listed.
  • Correlation modelling logic 132 may select demands during runtime to predict each resource. The selected demands may be adjusted over time as additional correlation patterns are detected. If no demand is correlated to a given resource metric, then correlation modelling logic 132 may proceed without training a resource prediction model for the given resource performance metric.
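  • Selecting the demands used to predict a given resource could look like the sketch below. The function `select_demands`, the use of the absolute correlation, and the sample data are assumptions for illustration. An empty result corresponds to proceeding without a resource prediction model.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient; 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx and syy else 0.0


def select_demands(resource, demands, training_threshold=0.25):
    """Names of the demands whose correlation with the resource clears the threshold."""
    return [
        name
        for name, values in demands.items()
        if abs(pearson(values, resource)) >= training_threshold
    ]


# Hypothetical samples of one resource metric and three demands.
cpu = [20.0, 35.0, 50.0, 65.0]
demands = {
    "transactions": [100.0, 200.0, 300.0, 400.0],
    "redo_size": [5.0, 9.0, 14.0, 18.0],
    "noise": [1.0, -1.0, -1.0, 1.0],
}
selected = select_demands(cpu, demands)
```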
  • Correlation patterns may vary between different seasons.
  • a season in this context refers to a pattern that recurs periodically.
  • daily high seasons for transactions may be observed between the hours of 9 to 5 from Monday to Friday.
  • Daily low seasons may occur outside of these hours.
  • seasonal highs, lows, and/or other patterns may recur on an hourly, weekly, monthly, yearly, or holiday basis.
  • the daily highs in transactions may correlate with a high CPU utilization rate.
  • daily lows may not be closely correlated with the CPU utilization rate as batch processes may be run in the evenings on the database host.
  • seasonal lows may follow a different correlation pattern than the daily highs. Therefore, training different correlation models (e.g., demand propagation and resource prediction models) for different seasons may lead to more accurate representations.
  • seasonality modelling logic 134 builds a seasonal classification model that represents different seasonal patterns.
  • the seasonal classification model may cluster time-series data points that are associated with different seasonal pattern classifications.
  • the seasonal pattern classifications and clusters may then be provided to correlation modelling logic 132 .
  • correlation modelling logic 132 may build demand propagation models and/or resource prediction models, as previously described, for the data points belonging to each cluster. In other words, separate models are trained and built for each cluster.
  • Correlation prediction models for a set of demands and resources may vary between different clusters/seasonal pattern classifications. For example, CPU utilization may be a function of transactions, user calls, and redo logs generated per second in the high season. In the low season, CPU utilization may only be a function of executions per second. In other words, the correlation coefficients between a particular demand and resource pair may vary from one season to the next. As a result, correlation modelling logic 132 may select different demands to train resource prediction models for different seasonal patterns.
  • FIG. 5 illustrates an example set of operations for performing seasonal aware training of simulation models, in accordance with one or more embodiments.
  • the process includes identifying seasonal patterns in a time-series (Operation 510 ).
  • seasonality modelling logic 134 may classify different sub-periods/instances of a resource and/or demand time-series as sparse high for those instances that exhibit sharp, relatively short bursts on a recurring basis.
  • Other classifications may include, but are not limited to, dense highs, sparse lows, and dense lows.
  • Example techniques for classifying seasonal patterns are described in U.S. application Ser. No. 15/266,971, entitled “SEASONAL AWARE METHOD FOR FORECASTING AND CAPACITY PLANNING”; U.S.
  • seasonality modelling logic 134 extracts data points in the time series that belong to the seasonal pattern (Operation 520 ). For example, if a high season for resource utilization is identified to recur weekly on Mondays from 9 to noon, then data points captured during this time frame may be clustered or otherwise grouped together within data repository 140 .
  • correlation modelling logic 132 uses the data points to train a set of one or more correlation prediction models for the seasonal pattern (Operation 530 ).
  • correlation modelling logic 132 may generate demand propagation models and/or resource prediction models as previously described.
  • the process next determines whether there are any remaining seasonal patterns to analyze (Operation 540 ). If so, then the process repeats for the next selected seasonal pattern.
  • correlation prediction models may be generated for each respective seasonal pattern. For example, if sparse high, dense high, and low seasonal patterns were identified, then the process may repeat for each of these seasonal pattern classifications. After correlation prediction models have been trained for each seasonal pattern, the process ends.
  • different sets of data points may be used to train correlation prediction models associated with different respective seasonal patterns. For example, for a high season, data points from a first set of one or more sub-periods (e.g., Monday to Thursday, 9 a.m. to 4 p.m.) may be extracted to train the correlation prediction model. Data points from a different set of one or more sub-periods may be used to train correlation prediction models for other seasonal patterns, such as a low season.
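  • Operations 510 to 540 can be sketched as grouping samples by seasonal classification and fitting one model per group. The classifier below (weekday business hours are the high season) is a hypothetical stand-in for the seasonal classification model, and the data is synthetic.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def classify(ts):
    """Hypothetical seasonal classifier: weekday business hours are 'high'."""
    return "high" if ts.weekday() < 5 and 9 <= ts.hour < 17 else "low"


def fit_line(x, y):
    """Least squares (slope, intercept) for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def train_per_season(samples):
    """samples: (timestamp, demand, resource) tuples. Returns {season: model}."""
    clusters = defaultdict(lambda: ([], []))
    for ts, demand, resource in samples:
        xs, ys = clusters[classify(ts)]  # extract points for this season
        xs.append(demand)
        ys.append(resource)
    return {season: fit_line(xs, ys) for season, (xs, ys) in clusters.items()}


# Synthetic day of hourly samples with a different relationship per season.
start = datetime(2020, 5, 4)  # a Monday
samples = []
for hour in range(24):
    ts = start + timedelta(hours=hour)
    demand = 100.0 + 10.0 * hour
    resource = 0.5 * demand + 5.0 if classify(ts) == "high" else 0.1 * demand + 20.0
    samples.append((ts, demand, resource))

models = train_per_season(samples)
```

  • The two fitted slopes differ (about 0.5 in the high season and 0.1 in the low season), which is the effect that per-season training is meant to capture.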
  • FIG. 6 illustrates an example set of seasonal-aware correlation prediction models, in accordance with one or more embodiments.
  • Chart 602 depicts a resource prediction model that maps values of executions per second to CPU utilization rates in the sparse high season. A best fit line is generated using time-series samples from the sparse high season.
  • Chart 604 depicts a resource prediction model that maps values of executions per second to CPU utilization rates in the dense high season. As can be seen, the resource prediction model that is fit to the data has a different slope than in the sparse high season.
  • data repository 140 stores a mapping between seasonal patterns and the corresponding set of trained correlation prediction models.
  • the mapping data may be accessed to determine which correlation prediction models to use for a given scenario.
  • a scenario may be defined as follows: “What if my system saw X times more transactions in the high season?”
  • what-if analytic 138 may determine which correlation prediction models are associated with the high season by reading the mapping data. What-if analytic 138 may use the associated correlation prediction models to generate the scenario output.
  • a scenario may apply to multiple seasons. For example, a general scenario of “What if my system saw X times more transactions” may apply across all seasons. In this case, what-if analytic 138 may analyze each season independently and stitch together the results.
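  • The season-to-model mapping and the stitching of per-season results might be sketched as follows. The dictionary `MODELS_BY_SEASON`, its coefficients, and the function `run_scenario` are hypothetical; a real system would read the mapping from the data repository.

```python
# Hypothetical mapping from seasonal pattern to a (slope, intercept) model.
MODELS_BY_SEASON = {"high": (0.5, 5.0), "low": (0.1, 20.0)}


def run_scenario(points, factor, season=None):
    """Scale demand by `factor` and predict the resource with the season's model.

    points: (label, season, demand) tuples in time order.
    season: restrict the adjustment to one season, or None for all seasons.
    """
    results = []
    for label, point_season, demand in points:
        # Apply the adjustment only where the scenario says it applies.
        scaled = demand * factor if season in (None, point_season) else demand
        slope, intercept = MODELS_BY_SEASON[point_season]
        results.append((label, intercept + slope * scaled))
    return results  # per-season outputs stitched back into one series


points = [("Mon 10:00", "high", 200.0), ("Mon 22:00", "low", 100.0)]
high_only = run_scenario(points, 2.0, season="high")
all_seasons = run_scenario(points, 2.0)
```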
  • what-if analytic 138 is configured to perform historical what-if scenario simulations. With historical simulations, modifications are made to historical time-series data to see how the system would have behaved. One benefit of adjusting historical data is that what-if analytic 138 is able to provide the user with scenario outputs that span the distribution of known operating modes. Historical simulations may be performed with no inferences about future conditions. In other words, a historical simulation may be performed without having to generate a forecast.
  • FIG. 7 illustrates an example set of operations for simulating a historical scenario, in accordance with one or more embodiments.
  • the process includes receiving a set of historical simulation parameters (Operation 710 ).
  • the scenario parameters define a historical adjustment.
  • an example historical simulation scenario may be defined as “What if there were three times the amount of users in the last three months?”
  • a historical simulation scenario may be defined as “What if there were ten percent more transactions last week?”
  • the scenario defines a historical adjustment and a period in the past to analyze.
  • a default timeframe to analyze may be chosen rather than explicitly defined.
  • a user may submit a scenario definition as follows: “What if there were ten percent more transactions on database host DBHost?”
  • the user identifies an adjustment/modification and a particular target resource, but does not define the timeframe.
  • Any default timeframe (e.g., the past week, month, six months, etc.) may be selected for analysis in this case.
  • what-if analytic 138 adjusts one or more demands by a prescribed amount (Operation 720 ). For example, if the scenario requests a simulation of a historical scenario with three times more transactions, then what-if analytic 138 may adjust the transaction time-series data collected from one or more of targets 112 a - i to treble the values. Adjustments may be made to values over a prescribed or default timeframe, depending on the particular implementation.
  • the adjustments that are made by what-if analytic 138 vary depending on whether the scenario is attempting to model a user change or not. If the scenario defines a user change, then what-if analytic 138 models the user change by adjusting all of the demands by a prescribed amount. For example, if a scenario requests a simulation of a fourfold change in users, then what-if analytic 138 may adjust all demands by four times. If the scenario is more particular and adjusts a single demand, then the process may start by adjusting the prescribed demand by the prescribed amount. What-if analytic 138 may access the relevant demand propagation models to predict how the adjusted demand would propagate to other related demands, as described in the operations below.
  • what-if analytic 138 determines whether to propagate changes to a prescribed demand to other demands (Operation 730 ). As previously indicated, adjustments may be propagated if the scenario is making adjustments to individual demands rather than user changes. If the scenario is defining a user change, then the adjustment may be applied across all demands, and the process may skip to Operation 750 . If the adjustment is to a prescribed demand or set of demands, then what-if analytic 138 may determine whether there are any associated demand propagation models involving the adjusted demand. For instance, a trained demand propagation model may be available if the adjusted demand was correlated more than the training threshold with another demand. Conversely, a trained demand propagation model may not be available for the adjusted demand if the demand is not correlated with other demands.
  • an adjusted execution per second value may be mapped to a corresponding value for transactions per second using the demand propagation model depicted in chart 304 .
  • the demand propagation model is also reversible, such that a change in transactions per second may be mapped to a corresponding execution value.
  • an adjustment in the number of executions per second may be mapped to a corresponding adjustment in user calls per second, and vice versa, using chart 306 .
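  • The two adjustment paths (a user change applied to every demand versus a single-demand change propagated through a model) can be sketched as below. The model coefficients are hypothetical, and inverting the fitted line algebraically is only one assumed way to realize the reversibility described above.

```python
def apply_user_change(demands, factor):
    """A user change scales every demand series by the same factor."""
    return {name: [v * factor for v in values] for name, values in demands.items()}


def propagate(value, model):
    """Map a D1 value to D2 using D2 = intercept + slope * D1."""
    slope, intercept = model
    return intercept + slope * value


def propagate_reverse(value, model):
    """Invert the same model to map a D2 value back to D1."""
    slope, intercept = model
    return (value - intercept) / slope


# Hypothetical model mapping executions/sec to transactions/sec.
exec_to_txn = (0.1, 2.0)

historical_exec = [100.0, 200.0, 300.0]
adjusted_exec = [3 * v for v in historical_exec]            # single-demand change
adjusted_txn = [propagate(v, exec_to_txn) for v in adjusted_exec]

everything_x4 = apply_user_change({"exec": historical_exec, "txn": [12.0, 22.0, 32.0]}, 4)
```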
  • what-if analytic 138 determines whether there are any resource prediction models associated with the adjusted demands (Operation 750 ). For example, what-if analytic 138 may search data store 140 for a trained resource prediction model that maps an adjusted demand to a resource performance metric. In some cases, a resource prediction model may not be available. This scenario may occur if none of the adjusted demands are correlated with a relevant resource performance metric. If none of the adjusted demands are correlated to resource performance, then the process may proceed without making any adjustments to a resource performance metric.
  • If a resource prediction model is available, then what-if analytic 138 may use the resource prediction model to generate an adjustment to at least one resource performance time-series (Operation 760 ). For example, a resource prediction model may map an adjusted value for executions per second to a corresponding CPU utilization value. What-if analytic 138 may adjust the historical CPU utilization rate to the value mapped to the adjusted executions per second value.
  • an adjustment to a resource performance metric may be determined based on multiple resource prediction models. For example, one resource prediction model may map an adjusted value for transactions per second to an adjusted CPU utilization rate. A second model may map an adjusted redo size per second to an adjusted CPU utilization rate. A final adjustment may be computed by combining (e.g., through averaging or otherwise aggregating) the adjustments from the different resource prediction models.
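  • Combining the outputs of several resource prediction models by averaging, one of the aggregation options mentioned above, could be sketched as follows. The model coefficients and demand names are hypothetical.

```python
from statistics import mean


def combined_prediction(adjusted_demands, models):
    """Average the resource values predicted by each available model.

    adjusted_demands: {demand name: adjusted value}
    models: {demand name: (slope, intercept)} for a single resource metric
    Returns None when no model applies, i.e. no adjustment is made.
    """
    predictions = [
        intercept + slope * adjusted_demands[name]
        for name, (slope, intercept) in models.items()
        if name in adjusted_demands
    ]
    return mean(predictions) if predictions else None


# Hypothetical CPU utilization models keyed by demand.
cpu_models = {"transactions": (0.2, 10.0), "redo_size": (0.05, 12.0)}
adjusted = {"transactions": 250.0, "redo_size": 1000.0}
cpu = combined_prediction(adjusted, cpu_models)
```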
  • adjustments may be performed based on seasonal pattern classifications. For example, if a historical simulation is being run for a prescribed seasonal pattern (e.g., a high season, sparse high, low season, etc.), then what-if analytic 138 may search data store 140 for a mapping between the prescribed seasonal pattern and a corresponding set of resource prediction models. What-if analytic 138 may then select/use one or more resource prediction models to map adjusted demand values to adjusted resource performance values for the prescribed seasonal pattern. In the event that the simulation spans multiple seasons, the models may be isolated to regions of time based on collective seasons.
  • the resource prediction model depicted in chart 602 may be used to map adjustments to executions per second in the sparse high season to corresponding adjustments to CPU utilization rate.
  • the resource prediction model depicted in chart 604 may be used.
  • what-if analytic 138 presents the time-series datasets for one or more demands and one or more resources, including any generated adjustments (Operation 770 ).
  • the presentation of the adjusted time-series datasets may vary from implementation to implementation.
  • the time-series datasets may be displayed via an interactive display that is coupled to a client computing system.
  • the interactive display may allow a user to drill down to view adjustments to individual demands, resources, or custom groups of demands and/or resources.
  • an interactive display presenting the results of a historical simulation may allow a user to select responsive actions, including one or more of the responsive actions previously described, for the system to perform.
  • system 100 may adjust the configurations of targets 112 a - i , deploy additional resources, and/or perform load balancing to replicate a simulated scenario.
  • what-if analytic 138 may present an uncertainty interval for a historical simulation.
  • An uncertainty interval may be a visualization or other indication of a degree of uncertainty in the predicted adjustments.
  • correlation modelling logic 132 may calculate and cache the uncertainties for possible combinations of evaluation. For a model that maps demand X to resource (or demand) Y, these possibilities include the uncertainty of predicting Y from X and predicting X from Y.
  • the uncertainty interval is a range of values that includes an upper limit and a lower limit within which at least a certain threshold of values falls within a given level of confidence. For example, a historical CPU utilization rate may be observed to be 50%. In response to an adjustment of three times the amount of users, the projected/adjusted CPU utilization rate may jump to 65% (these values are given by way of example only and may vary from system to system). Based on historical patterns, the lower limit of the uncertainty interval may be 55%, and the upper limit may be 75% with 95% confidence. The uncertainty interval thus gives a range of expected values within a prescribed level of confidence.
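  • The text does not specify how the interval is computed. One common approach, shown here purely as an assumption, is to estimate the spread of the model's residuals on the training data and place a symmetric band around the prediction using a normal approximation (z = 1.96 for roughly 95% confidence).

```python
from math import sqrt


def residual_std(x, y, slope, intercept):
    """Residual standard error of a fitted line (requires more than 2 points)."""
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    return sqrt(sum(r * r for r in residuals) / (len(x) - 2))


def uncertainty_interval(prediction, sigma, z=1.96):
    """Symmetric (lower, upper) band; z = 1.96 assumes normal errors."""
    return prediction - z * sigma, prediction + z * sigma


# Hypothetical training data and fitted model y = 0 + 2 * x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
sigma = residual_std(x, y, slope=2.0, intercept=0.0)

# A residual spread of about 5.1 points reproduces the 55% to 75% band
# around a projected 65% used in the example above.
lower, upper = uncertainty_interval(65.0, 5.1)
```

  • This simple band ignores the extra uncertainty that arises when the adjusted demand lies far outside the range of the training data.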
  • FIG. 8A illustrates an example presentation of propagating adjustments in one demand to another demand, in accordance with one or more embodiments.
  • the scenario parameters for this simulation may be defined as follows: “What if my system saw three times more executions per second in the last five months?”
  • what-if analytic 138 may present chart 800 , which illustrates an adjustment to a historical time-series for executions per second.
  • Time-series plot 804 depicts historical executions per second that were performed between May and September.
  • Time-series plot 802 depicts an example adjustment that increases the number of executions by three times.
  • What-if analytic 138 may further present chart 810 as part of the historical simulation.
  • Time-series plot 818 depicts historical transactions per second that were performed between May and September. Using a demand propagation model, the transactions per second are adjusted based on the three times increase in executions per second.
  • Time-series plot 812 depicts the adjustment.
  • Plot 814 depicts an upper bound for the uncertainty interval
  • plot 816 depicts a lower bound for the uncertainty interval.
  • FIG. 8B illustrates an example presentation of adjustments generated for resource time-series during the historical simulation defined with reference to FIG. 8A .
  • Chart 820 depicts the historical CPU utilization rate (time-series plot 830 ).
  • Chart 820 further depicts a projected/adjusted CPU utilization rate (time-series plot 824 ) computed based on the three times increase to executions per second using a resource prediction model.
  • Plot 826 depicts the upper bound in the uncertainty interval
  • plot 828 depicts the lower bound.
  • Time-series plot 822 depicts three times the CPU utilization rate, which may be omitted from the scenario output but is shown for purposes of illustration. As can be seen, the projected CPU utilization is much lower than the value obtained by multiplying the CPU utilization rate by the same amount as the increase in executions per second.
  • the training of time-series models as previously described allows for a much more complex and meaningful analysis between the various interactions between demands and resource performance.
  • Chart 840 depicts the historical number of physical writes per second (time-series plot 850 ). Chart 840 further depicts a projected/adjusted number of physical writes per second (time-series plot 844 ) based on the adjusted demands. Plot 846 depicts the upper bound of the uncertainty interval, and plot 848 depicts the lower bound of the interval. Time-series plot 842 depicts three times the historical physical writes per second rate, which, as can be seen, is inaccurate in comparison to the adjusted value, which uses the trained models to account for complex interactions within the system.
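  • One way to see why naive scaling can overshoot, assuming a single linear resource prediction model with intercept $\alpha$ and slope $\beta$ (an illustrative assumption, not a statement about the models behind the figures), is the following short derivation, where $k$ is the demand multiplier:

```latex
\begin{aligned}
R  &= \alpha + \beta D            && \text{(historical resource value)} \\
R' &= \alpha + \beta (kD)         && \text{(model-based projection)} \\
kR - R' &= (k\alpha + k\beta D) - (\alpha + k\beta D) = (k - 1)\,\alpha
\end{aligned}
```

  • With hypothetical values $\alpha = 20$, $\beta = 0.1$, $D = 300$, and $k = 3$, the historical value is $R = 50$, the projection is $R' = 110$, and naive scaling gives $kR = 150$, an overshoot of $(k - 1)\alpha = 40$. Whenever $\alpha > 0$ and $k > 1$, multiplying the resource metric by $k$ exceeds the model-based projection.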
  • what-if analytic 138 is configured to perform forecast what-if scenario simulations.
  • With forecast scenario simulations, modifications may be made to historical and/or forecast time-series data to see how the system is predicted to behave.
  • forecast simulations include modelling changes in future demands and/or resources. The adjustments may account for changes, trends, and seasonal patterns as determined from historical data collected from targets 112 a - i.
  • FIG. 9 illustrates an example set of operations for simulating a forecast scenario, in accordance with one or more embodiments.
  • the set of operations includes generating a set of forecasts for each demand and resource time-series (Operation 910 ).
  • forecast modelling logic 136 is configured to generate a forecast that accounts for seasonal patterns and trends in the time-series data.
  • forecast modelling logic 136 may generate an Additive or Multiplicative Holt-Winters forecasting model.
  • the Holt-Winters models and other example forecasting models that may be trained are described in U.S. application Ser. No. 15/266,971, entitled “SEASONAL AWARE METHOD FOR FORECASTING AND CAPACITY PLANNING”, previously incorporated by reference.
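  • For orientation, a textbook additive Holt-Winters forecaster can be sketched as below. This is not the implementation of the referenced application; the initialization scheme and the default smoothing parameters are simplifying assumptions.

```python
def holt_winters_additive(series, period, horizon, alpha=0.5, beta=0.1, gamma=0.1):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing.

    Requires at least two full seasons of data for the simple initialization.
    """
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    first = sum(series[:period]) / period
    second = sum(series[period:2 * period]) / period
    level = first                                   # initial level
    trend = (second - first) / period               # initial per-step trend
    seasonal = [series[i] - first for i in range(period)]

    for t, value in enumerate(series):
        s = seasonal[t % period]
        prev_level = level
        level = alpha * (value - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (value - level) + (1 - gamma) * s

    n = len(series)
    return [
        level + (h + 1) * trend + seasonal[(n + h) % period]
        for h in range(horizon)
    ]


# A purely periodic series with no trend: the forecast repeats the pattern.
history = [10.0, 20.0, 30.0, 20.0] * 3
forecast = holt_winters_additive(history, period=4, horizon=4)
```

  • A multiplicative variant divides by and multiplies with the seasonal factors instead of subtracting and adding them.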
  • the process further includes receiving a set of forecast simulation parameters (Operation 920 ).
  • the scenario parameters define a future adjustment.
  • an example forecast simulation scenario may be defined as “What if there are three times the amount of forecasted users in the next three months?”
  • a forecast simulation scenario may be defined as “What if there are ten percent more transactions than forecasted for next week?”
  • the scenario defines an adjustment to a forecasted value and a period in the future to analyze.
  • a default timeframe to analyze may be chosen rather than explicitly defined.
  • a user may submit a scenario definition as follows: “What if there are ten percent more transactions on database host DBHost than forecasted?”
  • the user identifies an adjustment/modification and a particular target resource, but does not define the timeframe.
  • Any default timeframe (e.g., the next week, month, six months, etc.) may be selected for analysis in this case.
  • what-if analytic 138 adjusts a historical and/or forecasted value by a prescribed amount (Operation 930 ). For example, if the scenario requests a simulation of a scenario with three times more transactions than forecasted, then what-if analytic 138 may adjust the forecast for future transactions per second projected for targets 112 a - i to treble the values. Adjustments may be made to values over a prescribed or default timeframe, depending on the particular implementation.
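  • Operation 930 can be sketched as scaling only the forecast values that fall within the scenario's timeframe. The function name `adjust_forecast`, the daily granularity, and the sample values are assumptions for illustration.

```python
from datetime import date, timedelta


def adjust_forecast(forecast, factor, start, end):
    """Scale forecast values whose date falls within [start, end]."""
    return {
        day: value * factor if start <= day <= end else value
        for day, value in forecast.items()
    }


# Hypothetical four-day forecast of transactions per second.
forecast = {date(2020, 10, 1) + timedelta(days=i): 100.0 for i in range(4)}
adjusted = adjust_forecast(forecast, 3.0, date(2020, 10, 2), date(2020, 10, 3))
```

  • Values outside the prescribed or default timeframe are left at their forecasted level.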
  • what-if analytic 138 determines whether to propagate changes to the prescribed one or more demands to other demands (Operation 940 ). Similar to the historical simulation, adjustments may be propagated if the scenario is making adjustments to individual demands rather than user changes. If the scenario is defining a user change in the future, then the adjustment may be applied across all demands, and the process may skip to Operation 960 . If the adjustment is to a prescribed demand or set of demands, then what-if analytic 138 may determine whether there are any associated demand propagation models involving the adjusted demand(s) as described above for the historical scenario simulation.
  • If a trained demand propagation model is available, then what-if analytic 138 uses the demand propagation model to propagate the adjustment to other, related demands.
  • an adjusted forecast for executions per second may be mapped to an adjusted forecast for transactions per second using the demand propagation model depicted in chart 304 .
  • the mapped values in this case incorporate the trend and seasonal factors from the forecast.
  • the demand propagation models, as well as the resource prediction models, are also reversible as previously described.
  • what-if analytic 138 determines whether there are any resource prediction models associated with the adjusted demands (Operation 960 ). For example, what-if analytic 138 may search data store 140 for a trained resource prediction model that maps an adjusted demand to a resource performance metric. In some cases, a resource prediction model may not be available. This scenario may occur if none of the adjusted demands are correlated with a relevant resource performance metric. If none of the adjusted demands are correlated to resource performance, then the process may proceed without making any adjustments to a resource performance metric.
  • If a resource prediction model is available, then what-if analytic 138 may use the resource prediction model to generate an adjustment to at least one resource performance time-series (Operation 970 ). For example, a resource prediction model may map an adjusted value for executions per second to a corresponding CPU utilization value. What-if analytic 138 may adjust a forecast CPU utilization rate to the value mapped to the adjusted executions per second value. Similar to the demand adjustments, the resource performance adjustments may incorporate the trend and/or seasonal factors of the forecast.
  • an adjustment to a forecast resource performance metric may be determined based on multiple resource prediction models. Additionally or alternatively, adjustments may be performed based on seasonal pattern classifications. For example, if a forecast simulation is being run for a prescribed seasonal pattern (e.g., a high season, sparse high, low season, etc.), then what-if analytic 138 may search data store 140 for a mapping between the prescribed seasonal pattern and a corresponding set of resource prediction models. What-if analytic 138 may then select/use one or more resource prediction models to map adjusted demand forecasts to adjusted resource performance forecasts for the prescribed seasonal pattern. In the event that the simulation spans multiple seasons, the models may be isolated to regions of time based on collective seasons.
  • what-if analytic 138 presents the time-series datasets for one or more demand forecasts and one or more resource performance forecasts, including any generated adjustments (Operation 980 ).
  • the presentation of the adjusted time-series datasets may vary from implementation to implementation.
  • the time-series datasets may be displayed via an interactive display that is coupled to a client computing system.
  • the interactive display may allow a user to drill down to view adjustments to individual demands, resources, or custom groups of demands and/or resources.
  • an interactive display presenting the results of a forecast scenario simulation may allow a user to select responsive actions, including one or more of the responsive actions previously described, for the system to perform.
  • system 100 may adjust the configurations of targets 112 a - i , deploy additional resources, and/or perform load balancing to replicate a simulated scenario.
  • forecast modelling logic 136 and/or correlation modelling logic 132 may calculate and cache the uncertainties for possible combinations of evaluation. For a model that maps demand X to resource (or demand) Y, these possibilities include the uncertainty of predicting a forecasted Y from a forecasted X, predicting a forecasted X from a forecasted Y, predicting a forecasted Y from X, and predicting a forecasted X from Y.
  • what-if analytic 138 may modify the uncertainty interval when adjusting the forecasted values.
  • an original forecast may have an interval that is computed based on the forecast model.
  • the resource prediction model may introduce additional uncertainty.
  • the uncertainty interval may be shifted around the changed forecast values and/or expanded to reflect the uncertainty of predicting a change in the forecasted performance metric based on a change to the historical and/or forecasted demand metrics.
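  • Shifting and expanding the interval might be sketched as follows. Combining the two error sources in quadrature assumes they are independent and roughly normal; that assumption, the function name, and the numbers are illustrative only.

```python
from math import sqrt


def adjusted_interval(adjusted_forecast, forecast_sigma, model_sigma, z=1.96):
    """Center the band on the adjusted forecast and widen it.

    forecast_sigma: spread attributable to the forecast model
    model_sigma: additional spread from the resource prediction model
    """
    sigma = sqrt(forecast_sigma ** 2 + model_sigma ** 2)  # independence assumed
    return adjusted_forecast - z * sigma, adjusted_forecast + z * sigma


original = adjusted_interval(40.0, 3.0, 0.0)   # forecast uncertainty only
shifted = adjusted_interval(60.0, 3.0, 4.0)    # shifted to 60 and widened
```

  • In this example the half-width grows from 1.96 × 3 to 1.96 × 5, reflecting the added uncertainty of mapping a changed demand to the resource metric.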
  • FIG. 10A illustrates an example set of demand time-series that were adjusted during a forecast scenario simulation, in accordance with one or more embodiments.
  • the scenario parameters for this simulation may be defined as follows: “What if my system saw five times more executions per second in the next two months?” Based on the simulation, what-if analytic 138 may present chart 1000 , which illustrates an adjusted forecast (time-series plot 1004 ) for executions per second. Time-series plot 1002 depicts historical executions per second leading up to the forecast.
  • Time-series plot 1012 depicts historical size of redo data generated per second between March and July leading up to the forecast.
  • the forecasted redo generated per second is adjusted, as depicted in time-series plot 1014 , based on the five times increase in forecasted executions per second.
  • FIG. 10B illustrates an example set of resource time-series that were adjusted during a forecast scenario simulation, in accordance with one or more embodiments.
  • Chart 1020 depicts the historical number of physical reads per second (time-series plot 1022 ) and an adjusted forecast (time-series plot 1024 ).
  • Chart 1030 depicts the historical physical writes per second (time-series plot 1032 ) and an adjusted forecast (time-series plot 1034 ).
  • a computer network provides connectivity among a set of nodes.
  • the nodes may be local to and/or remote from each other.
  • the nodes are connected by a set of links. Examples of links include a coaxial cable, an unshielded twisted cable, a copper cable, an optical fiber, and a virtual link.
  • a subset of nodes implements the computer network. Examples of such nodes include a switch, a router, a firewall, and a network address translator (NAT). Another subset of nodes uses the computer network.
  • Such nodes may execute a client process and/or a server process.
  • a client process makes a request for a computing service (such as, execution of a particular application, and/or storage of a particular amount of data).
  • a server process responds by executing the requested service and/or returning corresponding data.
  • a computer network may be a physical network, including physical nodes connected by physical links.
  • a physical node is any digital device.
  • a physical node may be a function-specific hardware device, such as a hardware switch, a hardware router, a hardware firewall, and a hardware NAT. Additionally or alternatively, a physical node may be a generic machine that is configured to execute various virtual machines and/or applications performing respective functions.
  • a physical link is a physical medium connecting two or more physical nodes. Examples of links include a coaxial cable, an unshielded twisted cable, a copper cable, and an optical fiber.
  • a computer network may be an overlay network.
  • An overlay network is a logical network implemented on top of another network (such as, a physical network).
  • Each node in an overlay network corresponds to a respective node in the underlying network.
  • each node in an overlay network is associated with both an overlay address (to address the overlay node) and an underlay address (to address the underlay node that implements the overlay node).
  • An overlay node may be a digital device and/or a software process (such as a virtual machine, an application instance, or a thread).
  • a link that connects overlay nodes is implemented as a tunnel through the underlying network.
  • the overlay nodes at either end of the tunnel treat the underlying multi-hop path between them as a single logical link. Tunneling is performed through encapsulation and decapsulation.
  • a client may be local to and/or remote from a computer network.
  • the client may access the computer network over other computer networks, such as a private network or the Internet.
  • the client may communicate requests to the computer network using a communications protocol, such as Hypertext Transfer Protocol (HTTP).
  • the requests are communicated through an interface, such as a client interface (such as a web browser), a program interface, or an application programming interface (API).
  • a computer network provides connectivity between clients and network resources.
  • Network resources include hardware and/or software configured to execute server processes. Examples of network resources include a processor, a data storage, a virtual machine, a container, and/or a software application.
  • Network resources are shared amongst multiple clients. Clients request computing services from a computer network independently of each other.
  • Network resources are dynamically assigned to the requests and/or clients on an on-demand basis.
  • Network resources assigned to each request and/or client may be scaled up or down based on, for example, (a) the computing services requested by a particular client, (b) the aggregated computing services requested by a particular tenant, and/or (c) the aggregated computing services requested of the computer network.
  • Such a computer network may be referred to as a “cloud network.”
  • a service provider provides a cloud network to one or more end users.
  • Various service models may be implemented by the cloud network, including but not limited to Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), and Infrastructure-as-a-Service (IaaS).
  • In SaaS, a service provider provides end users the capability to use the service provider's applications, which are executing on the network resources.
  • In PaaS, the service provider provides end users the capability to deploy custom applications onto the network resources.
  • the custom applications may be created using programming languages, libraries, services, and tools supported by the service provider.
  • In IaaS, the service provider provides end users the capability to provision processing, storage, networks, and other fundamental computing resources provided by the network resources. Any arbitrary applications, including an operating system, may be deployed on the network resources.
  • various deployment models may be implemented by a computer network, including but not limited to a private cloud, a public cloud, and a hybrid cloud.
  • In a private cloud, network resources are provisioned for exclusive use by a particular group of one or more entities (the term “entity” as used herein refers to a corporation, organization, person, or other entity).
  • the network resources may be local to and/or remote from the premises of the particular group of entities.
  • In a public cloud, cloud resources are provisioned for multiple entities that are independent from each other (also referred to as “tenants” or “customers”).
  • the computer network and the network resources thereof are accessed by clients corresponding to different tenants.
  • Such a computer network may be referred to as a “multi-tenant computer network.”
  • Several tenants may use a same particular network resource at different times and/or at the same time.
  • the network resources may be local to and/or remote from the premises of the tenants.
  • In a hybrid cloud, a computer network comprises a private cloud and a public cloud.
  • An interface between the private cloud and the public cloud allows for data and application portability. Data stored at the private cloud and data stored at the public cloud may be exchanged through the interface.
  • Applications implemented at the private cloud and applications implemented at the public cloud may have dependencies on each other. A call from an application at the private cloud to an application at the public cloud (and vice versa) may be executed through the interface.
  • tenants of a multi-tenant computer network are independent of each other.
  • a business or operation of one tenant may be separate from a business or operation of another tenant.
  • Different tenants may demand different network requirements for the computer network. Examples of network requirements include processing speed, amount of data storage, security requirements, performance requirements, throughput requirements, latency requirements, resiliency requirements, Quality of Service (QoS) requirements, tenant isolation, and/or consistency.
  • the same computer network may need to implement different network requirements demanded by different tenants.
  • tenant isolation is implemented to ensure that the applications and/or data of different tenants are not shared with each other.
  • Various tenant isolation approaches may be used.
  • each tenant is associated with a tenant ID.
  • Each network resource of the multi-tenant computer network is tagged with a tenant ID.
  • a tenant is permitted access to a particular network resource only if the tenant and the particular network resources are associated with a same tenant ID.
  • each tenant is associated with a tenant ID.
  • Each application implemented by the computer network is tagged with a tenant ID.
  • each data structure and/or dataset stored by the computer network is tagged with a tenant ID.
  • a tenant is permitted access to a particular application, data structure, and/or dataset only if the tenant and the particular application, data structure, and/or dataset are associated with a same tenant ID.
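  • The tenant ID tagging approach described above can be sketched as a simple equality check. The class name Tagged, the function may_access, and the sample identifiers are hypothetical and serve only to illustrate the rule that access requires a matching tenant ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    """A network resource, application, data structure, or dataset tagged with a tenant ID."""
    name: str
    tenant_id: str


def may_access(requesting_tenant_id: str, item: Tagged) -> bool:
    """Permit access only when the tenant and the item are associated with the same tenant ID."""
    return requesting_tenant_id == item.tenant_id


orders_db = Tagged(name="orders-db", tenant_id="tenant-a")
print(may_access("tenant-a", orders_db))  # True
print(may_access("tenant-b", orders_db))  # False
```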
  • each database implemented by a multi-tenant computer network may be tagged with a tenant ID. Only a tenant associated with the corresponding tenant ID may access data of a particular database.
  • each entry in a database implemented by a multi-tenant computer network may be tagged with a tenant ID. Only a tenant associated with the corresponding tenant ID may access data of a particular entry.
  • the database may be shared by multiple tenants.
  • a subscription list indicates which tenants have authorization to access which applications. For each application, a list of tenant IDs of tenants authorized to access the application is stored. A tenant is permitted access to a particular application only if the tenant ID of the tenant is included in the subscription list corresponding to the particular application.
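  • A subscription list of this kind might be represented as a mapping from each application to the set of authorized tenant IDs. The sketch below assumes an in-memory dictionary; the application names, tenant IDs, and the function is_subscribed are hypothetical.

```python
# For each application, the tenant IDs of tenants authorized to access it.
subscriptions = {
    "billing-app": {"tenant-a", "tenant-b"},
    "analytics-app": {"tenant-a"},
}


def is_subscribed(tenant_id, application, subscription_lists=subscriptions):
    """Return True only if the tenant ID appears in the application's subscription list."""
    # An application with no subscription list authorizes nobody.
    return tenant_id in subscription_lists.get(application, set())


print(is_subscribed("tenant-b", "billing-app"))    # True
print(is_subscribed("tenant-b", "analytics-app"))  # False
```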
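  • network resources (such as digital devices, virtual machines, application instances, and threads) corresponding to different tenants are isolated to tenant-specific overlay networks maintained by the multi-tenant computer network.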
  • packets from any source device in a tenant overlay network may only be transmitted to other devices within the same tenant overlay network.
  • Encapsulation tunnels are used to prohibit any transmissions from a source device on a tenant overlay network to devices in other tenant overlay networks.
  • the packets received from the source device are encapsulated within an outer packet.
  • the outer packet is transmitted from a first encapsulation tunnel endpoint (in communication with the source device in the tenant overlay network) to a second encapsulation tunnel endpoint (in communication with the destination device in the tenant overlay network).
  • the second encapsulation tunnel endpoint decapsulates the outer packet to obtain the original packet transmitted by the source device.
  • the original packet is transmitted from the second encapsulation tunnel endpoint to the destination device in the same particular overlay network.
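  • The encapsulation and decapsulation steps above can be sketched with plain data objects. Carrying a tenant identifier in the outer packet and rejecting mismatches at the receiving endpoint is an assumption made here for illustration; the class and function names are likewise hypothetical, and no real tunneling protocol is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes


@dataclass(frozen=True)
class OuterPacket:
    tunnel_src: str   # first encapsulation tunnel endpoint
    tunnel_dst: str   # second encapsulation tunnel endpoint
    tenant_id: str    # assumed tag identifying the tenant overlay network
    inner: Packet


def encapsulate(packet, tenant_id, tunnel_src, tunnel_dst):
    """Wrap the original packet in an outer packet addressed between tunnel endpoints."""
    return OuterPacket(tunnel_src, tunnel_dst, tenant_id, packet)


def decapsulate(outer, endpoint_tenant_id):
    """Recover the original packet, refusing packets from another tenant overlay network."""
    if outer.tenant_id != endpoint_tenant_id:
        raise PermissionError("packet belongs to a different tenant overlay network")
    return outer.inner


original = Packet(src="10.0.0.1", dst="10.0.0.2", payload=b"hello")
outer = encapsulate(original, "tenant-a", "192.0.2.1", "192.0.2.2")
print(decapsulate(outer, "tenant-a") == original)  # True
```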
  • A microservice in this context refers to software logic designed to be independently deployable, having endpoints that may be logically coupled to other microservices to build a variety of applications.
  • Applications built using microservices are distinct from monolithic applications, which are designed as a single fixed unit and generally comprise a single logical executable. With microservice applications, different microservices are independently deployable as separate executables.
  • Microservices may communicate using HyperText Transfer Protocol (HTTP) messages and/or according to other communication protocols via API endpoints. Microservices may be managed and updated separately, written in different languages, and be executed independently from other microservices.
  • Microservices provide flexibility in managing and building applications. Different applications may be built by connecting different sets of microservices without changing the source code of the microservices. Thus, the microservices act as logical building blocks that may be arranged in a variety of ways to build different applications. Microservices may provide monitoring services that notify a microservices manager (such as If-This-Then-That (IFTTT), Zapier, or Oracle Self-Service Automation (OSSA)) when trigger events from a set of trigger events exposed to the microservices manager occur.
  • Microservices exposed for an application may alternatively or additionally provide action services that perform an action in the application (controllable and configurable via the microservices manager by passing in values, connecting the actions to other triggers and/or data passed along from other actions in the microservices manager) based on data received from the microservices manager.
  • the microservice triggers and/or actions may be chained together to form recipes of actions that occur in optionally different applications that are otherwise unaware of or have no control or dependency on each other.
  • These managed applications may be authenticated or plugged in to the microservices manager, for example, with user-supplied application credentials to the manager, without requiring reauthentication each time the managed application is used alone or in combination with other applications.
  • microservices may be connected via a GUI.
  • microservices may be displayed as logical blocks within a window, frame, or other element of a GUI.
  • a user may drag and drop microservices into an area of the GUI used to build an application.
  • the user may connect the output of one microservice into the input of another microservice using directed arrows or any other GUI element.
  • the application builder may run verification tests to confirm that the output and inputs are compatible (e.g., by checking the datatypes, size restrictions, etc.).
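  • One way such a verification test might look is sketched below. The schema layout (a dictionary of field names mapped to a type and an optional max_size) and the function name compatible are assumptions made for illustration; the disclosure does not specify how datatypes or size restrictions are represented.

```python
def compatible(output_schema, input_schema):
    """Return a list of problems; an empty list means the connection is valid."""
    problems = []
    for field, spec in input_schema.items():
        if field not in output_schema:
            problems.append(f"missing field: {field}")
            continue
        out = output_schema[field]
        if out["type"] != spec["type"]:
            problems.append(f"type mismatch for {field}")
        # An output with no declared size is treated as size 0; an input
        # with no declared limit accepts any size.
        elif out.get("max_size", 0) > spec.get("max_size", float("inf")):
            problems.append(f"size limit exceeded for {field}")
    return problems


producer_output = {"latency_ms": {"type": "float", "max_size": 8}}
consumer_ok = {"latency_ms": {"type": "float", "max_size": 8}}
consumer_bad = {"latency_ms": {"type": "int"}, "host": {"type": "str"}}

print(compatible(producer_output, consumer_ok))   # []
print(compatible(producer_output, consumer_bad))
```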
  • a microservice may trigger a notification (into the microservices manager for optional use by other plugged in applications, herein referred to as the “target” microservice) based on the above techniques and/or may be represented as a GUI block and connected to one or more other microservices.
  • the trigger condition may include absolute or relative thresholds for values, and/or absolute or relative thresholds for the amount or duration of data to analyze, such that the trigger to the microservices manager occurs whenever a plugged-in microservice application detects that a threshold is crossed. For example, a user may request a trigger into the microservices manager when the microservice application detects a value has crossed a triggering threshold.
  • the trigger when satisfied, might output data for consumption by the target microservice.
  • the trigger when satisfied, outputs a binary value indicating the trigger has been satisfied, or outputs the name of the field or other context information for which the trigger condition was satisfied.
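  • The absolute and relative threshold triggers described above can be sketched as a single check. Here a relative threshold is interpreted as a fractional increase over a baseline value; that interpretation, the function name check_trigger, and the shape of the returned context are assumptions for illustration.

```python
def check_trigger(field, value, threshold, baseline=None):
    """Return context information if the trigger condition is satisfied, else None.

    With no baseline, `threshold` is absolute. With a baseline, `threshold`
    is relative: the fractional increase over the baseline that fires the trigger.
    """
    limit = threshold if baseline is None else baseline * (1 + threshold)
    if value > limit:
        # Output a binary indicator plus the field name and other context.
        return {"triggered": True, "field": field, "value": value, "limit": limit}
    return None


print(check_trigger("cpu_utilization", 95, 90))
print(check_trigger("cpu_utilization", 120, 0.5, baseline=100))  # None
```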
  • the target microservice may be connected to one or more other microservices such that an alert is input to the other microservices.
  • Other microservices may perform responsive actions based on the above techniques, including, but not limited to, deploying additional resources, adjusting system configurations, and/or generating GUIs.
  • a plugged-in microservice application may expose actions to the microservices manager.
  • the exposed actions may receive, as input, data or an identification of a data object or location of data, that causes data to be moved into a data cloud.
  • the exposed actions may receive, as input, a request to increase or decrease existing alert thresholds.
  • the input might identify existing in-application alert thresholds and whether to increase or decrease, or delete the threshold. Additionally or alternatively, the input might request the microservice application to create new in-application alert thresholds.
  • the in-application alerts may trigger alerts to the user while logged into the application, or may trigger alerts to the user using default or user-selected alert mechanisms available within the microservice application itself, rather than through other applications plugged into the microservices manager.
  • the microservice application may generate and provide an output based on input that identifies, locates, or provides historical data, and defines the extent or scope of the requested output.
  • the action when triggered, causes the microservice application to provide, store, or display the output, for example, as a data model or as aggregate data that describes a data model.
  • the techniques described herein are implemented by one or more special-purpose computing devices.
  • the special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination.
  • Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques.
  • the special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
  • FIG. 11 is a block diagram that illustrates computer system 1100 upon which one or more embodiments may be implemented.
  • Computer system 1100 includes bus 1102 or other communication mechanism for communicating information, and hardware processor 1104 coupled with bus 1102 for processing information.
  • Hardware processor 1104 may be, for example, a general purpose microprocessor.
  • Computer system 1100 also includes main memory 1106, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 1102 for storing information and instructions to be executed by processor 1104.
  • Main memory 1106 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 1104.
  • Such instructions, when stored in non-transitory storage media accessible to processor 1104, render computer system 1100 into a special-purpose machine that is customized to perform the operations specified in the instructions.
  • Computer system 1100 further includes read only memory (ROM) 1108 or other static storage device coupled to bus 1102 for storing static information and instructions for processor 1104.
  • Storage device 1110, such as a magnetic disk or optical disk, is provided and coupled to bus 1102 for storing information and instructions.
  • Computer system 1100 may be coupled via bus 1102 to display 1112, such as a cathode ray tube (CRT), liquid crystal display (LCD), or light-emitting diode (LED), for displaying information to a computer user.
  • Input device 1114, which may include physical and/or touchscreen based alphanumeric keys, is coupled to bus 1102 for communicating information and command selections to processor 1104.
  • Another type of user input device is cursor control 1116, such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to processor 1104 and for controlling cursor movement on display 1112.
  • This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (
  • Computer system 1100 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 1100 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 1100 in response to processor 1104 executing one or more sequences of one or more instructions contained in main memory 1106. Such instructions may be read into main memory 1106 from another storage medium, such as storage device 1110. Execution of the sequences of instructions contained in main memory 1106 causes processor 1104 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
  • Non-volatile media includes, for example, optical or magnetic disks, such as storage device 1110.
  • Volatile media includes dynamic memory, such as main memory 1106.
  • Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, or any other memory chip or cartridge.
  • Storage media is distinct from but may be used in conjunction with transmission media.
  • Transmission media participates in transferring information between storage media.
  • transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 1102.
  • transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
  • Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 1104 for execution.
  • the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer.
  • the remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem.
  • a modem local to computer system 1100 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal.
  • An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 1102.
  • Bus 1102 carries the data to main memory 1106, from which processor 1104 retrieves and executes the instructions.
  • the instructions received by main memory 1106 may optionally be stored on storage device 1110 either before or after execution by processor 1104.
  • Computer system 1100 also includes a communication interface 1118 coupled to bus 1102.
  • Communication interface 1118 provides a two-way data communication coupling to a network link 1120 that is connected to local network 1122.
  • communication interface 1118 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line.
  • communication interface 1118 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN.
  • Wireless links may also be implemented.
  • communication interface 1118 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
  • Network link 1120 typically provides data communication through one or more networks to other data devices.
  • network link 1120 may provide a connection through local network 1122 to host computer 1124 or to data equipment operated by Internet Service Provider (ISP) 1126.
  • ISP 1126 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 1128.
  • Internet 1128 uses electrical, electromagnetic or optical signals that carry digital data streams.
  • the signals through the various networks and the signals on network link 1120 and through communication interface 1118, which carry the digital data to and from computer system 1100, are example forms of transmission media.
  • Computer system 1100 can send messages and receive data, including program code, through the network(s), network link 1120 and communication interface 1118.
  • server 1130 might transmit a requested code for an application program through Internet 1128, ISP 1126, local network 1122 and communication interface 1118.
  • the received code may be executed by processor 1104 as it is received, and/or stored in storage device 1110, or other non-volatile storage for later execution.
  • Embodiments are directed to a system with one or more devices that include a hardware processor and that are configured to perform any of the operations described herein and/or recited in any of the claims below.
  • a non-transitory computer readable storage medium comprises instructions which, when executed by one or more hardware processors, cause performance of any of the operations described herein and/or recited in any of the claims.

Abstract

Techniques are described for applying what-if analytics to simulate performance of computing resources in cloud and other computing environments. In one or more embodiments, a plurality of time-series datasets are received including time-series datasets representing a plurality of demands on a resource and datasets representing performance metrics for a resource. Based on the datasets, at least one demand propagation model and at least one resource prediction model are trained. Responsive to receiving an adjustment to a first set of one or more values associated with a first demand: (a) a second adjustment is generated for a second set of one or more values associated with a second demand; and (b) a third adjustment is generated for a third set of one or more values that is associated with the resource performance metric.

Description

    INCORPORATION BY REFERENCE; DISCLAIMER
  • The following application is hereby incorporated by reference: U.S. application Ser. No. 15/612,999 filed on Jun. 2, 2017. The Applicant hereby rescinds any disclaimer of claim scope in the parent application(s) or the prosecution history thereof and advises the USPTO that the claims in this application may be broader than any claim in the parent application(s).
  • RELATED APPLICATIONS
  • This application is related to U.S. application Ser. No. 15/266,971, entitled “SEASONAL AWARE METHOD FOR FORECASTING AND CAPACITY PLANNING”; U.S. application Ser. No. 15/445,763, entitled “METHOD FOR CREATING PERIOD PROFILE FOR TIME-SERIES DATA WITH RECURRENT PATTERNS”; U.S. application Ser. No. 15/266,979, entitled “SYSTEMS AND METHODS FOR DETECTING AND ACCOMMODATING STATE CHANGES IN MODELLING”; U.S. application Ser. No. 15/140,358, entitled “SCALABLE TRI-POINT ARBITRATION AND CLUSTERING”; U.S. application Ser. No. 15/057,065, entitled “SYSTEM FOR DETECTING AND CHARACTERIZING SEASONS”; U.S. application Ser. No. 15/057,060, entitled “SUPERVISED METHOD FOR CLASSIFYING SEASONAL PATTERNS”; U.S. application Ser. No. 15/057,062, entitled “UNSUPERVISED METHOD FOR CLASSIFYING SEASONAL PATTERNS”; and U.S. Provisional Patent Appl. No. 62/370,880, entitled “UNSUPERVISED METHOD FOR BASELINING AND ANOMALY DETECTION IN TIME-SERIES DATA FOR ENTERPRISE SYSTEMS”, the entire contents for each of which are incorporated by reference herein as if set forth in their entirety.
  • TECHNICAL FIELD
  • The present disclosure relates to analytical models that process time-series data. In particular, the present disclosure relates to training and evaluating what-if models to analyze performance of computing resources.
  • BACKGROUND
  • In large-scale computing environments, such as datacenters and cloud computing platforms, many different inputs may affect the performance of hardware and software resources. For example, the performance of a database server may be impacted by the frequency of database transactions, user calls, and/or query executions, among other factors. As the number and frequency of inputs into a resource increase, so does the likelihood that the resource's performance will degrade.
  • System administrators are generally responsible for ensuring that resources within the computing environment are meeting performance expectations. As the demands on resources increase, system administrators may deploy additional resources and/or balance the load across existing resources to maintain the quality of service. As the demands decrease, resources may be brought offline to mitigate waste and improve efficiency. If administrators fail to adequately understand or anticipate demands, system resources may become overloaded, leading to performance degradation.
  • The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The embodiments are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings. It should be noted that references to “an” or “one” embodiment in this disclosure are not necessarily to the same embodiment, and they mean at least one. In the drawings:
  • FIG. 1A illustrates an example system comprising a set of what-if analytic services that are driven by time-series data captured from one or more host devices, in accordance with one or more embodiments;
  • FIG. 1B illustrates an example dataflow for simulating scenarios and generating scenario outputs, in accordance with one or more embodiments;
  • FIG. 2 illustrates an example set of operations for training a set of correlation prediction models, in accordance with one or more embodiments;
  • FIG. 3 illustrates an example set of demand propagation models, in accordance with one or more embodiments;
  • FIG. 4 illustrates an example set of correlation prediction models, in accordance with one or more embodiments;
  • FIG. 5 illustrates an example set of operations for performing seasonal aware training of simulation models, in accordance with one or more embodiments;
  • FIG. 6 illustrates an example set of seasonal-aware correlation prediction models, in accordance with one or more embodiments;
  • FIG. 7 illustrates an example set of operations for simulating a historical scenario, in accordance with one or more embodiments;
  • FIG. 8A illustrates an example presentation of propagating adjustments in one demand to another demand, in accordance with one or more embodiments;
  • FIG. 8B illustrates an example presentation of adjustments generated for resource time-series during a historical simulation, in accordance with one or more embodiments;
  • FIG. 9 illustrates an example set of operations for simulating a forecast scenario, in accordance with one or more embodiments;
  • FIG. 10A illustrates an example set of demand time-series that were adjusted during a forecast scenario simulation, in accordance with one or more embodiments;
  • FIG. 10B illustrates an example set of resource time-series that were adjusted during a forecast scenario simulation, in accordance with one or more embodiments; and
  • FIG. 11 illustrates an example computing system upon which one or more embodiments may be implemented.
  • DETAILED DESCRIPTION
  • In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding. One or more embodiments may be practiced without these specific details. Features described in one embodiment may be combined with features described in a different embodiment. In some examples, well-known structures and devices are described with reference to a block diagram form in order to avoid unnecessarily obscuring the present invention.
      • 1. GENERAL OVERVIEW
      • 2. ARCHITECTURAL OVERVIEW
      • 3. SIMULATION MODELS BASED ON DEMAND AND RESOURCE CORRELATION
      • 4. SEASONAL-AWARE SIMULATION MODELS
      • 5. HISTORICAL SCENARIO SIMULATIONS
      • 6. FORECAST SCENARIO SIMULATIONS
      • 7. COMPUTER NETWORKS AND CLOUD NETWORKS
      • 8. MICROSERVICE APPLICATIONS
      • 9. HARDWARE OVERVIEW
      • 10. MISCELLANEOUS; EXTENSIONS
  • 1. General Overview
  • System administrators may rely on various tools to monitor system performance in computing environments. For example, system administrators may deploy performance profilers to track various performance metrics associated with hardware and/or software resources. Performance profilers may generate periodic reports that track overall system performance as well as the performance of individual resources. Performance profilers may further generate alerts to notify the system administrator if the performance metrics cross a threshold. The alerts allow system administrators to address problems that may arise in the event of performance degradation or too many idle resources.
  • Performance profilers are generally directed to analyzing historical metrics. For example, a performance profiler may track various metrics, such as central processing unit (CPU) utilization, physical memory accesses, function calls, database transactions, etc. While historical profiling presents an overall picture of how the system has performed over time, the results are generally constrained to historical observations. System administrators are often interested in analyzing other possible scenarios, even if these scenarios have not occurred in the past.
  • What-if analytical models allow system administrators to simulate and observe the impact of changes within a system environment. One approach to the what-if analysis is for the system administrator to define the performance capacity of a system and the performance requirements for software resources deployed within the system. In the context of a cloud service, for instance, a cloud administrator may define an actual or hypothetical host capacity and the requirements of running an instance of the cloud service on the host. The cloud administrator may then simulate adding and removing instances from the host to determine how system performance is affected. This approach allows administrators to more accurately predict how adding resources affects system performance. However, the approach relies on the administrator's domain knowledge. In complex systems with variable inputs and demands, the administrator may not be aware of or be able to keep up with how changes will affect a particular resource.
  • According to techniques described herein, machine-learning processes and systems may be used to train what-if analytical models (also referred to herein as simulation models) based on historical metrics captured from a computing environment. The machine-learning processes may train/build the simulation models based on a variety of inputs and machine-learning algorithms. The what-if analytic may be implemented to learn complex relationships between system demands and resources from historical time-series data. Additionally or alternatively, the what-if analytic 138 may account for trends and seasonal patterns when training the simulation models. These techniques may provide greater accuracy and more robust analytic capabilities when simulating scenarios in computing environments. As a result, capacity planning operations may be more optimally tuned for a wider variety of scenarios to increase the efficiency with which computing resources are allocated and deallocated within the computing environment.
  • In one or more embodiments, a simulation model comprises a set of one or more underlying correlation prediction models. Each correlation prediction model may be trained using a set of time-series datasets associated with at least one demand on a resource and at least one performance metric for the resource. A demand time-series may comprise a sequence of data points captured over time for any input or metric representing the use of a resource. Example demands may include, but are not limited to, user logons to access a resource, transactions per second (e.g., on a database or other transactional system), executions per second, and resource calls per second. Example performance metrics that may be tracked may include, but are not limited to, CPU performance metrics (e.g., CPU utilization rates, thread counts, etc.), memory bandwidth metrics (e.g., memory usage rates, cache hit rates, etc.), I/O metrics (e.g., physical reads and writes to disk), and network metrics (e.g., packet counts, packet flow rates, etc.).
  • Once trained, a correlation prediction model may be used to project values for one or more performance metrics of a resource as a function of learned correlation patterns. For example, a correlation prediction model may output a projected CPU utilization rate based on an input demand or combination of demands. Other correlation prediction models may output other performance metrics depending on the particular implementation.
  • In one or more embodiments, a simulation model accounts for seasonal correlation patterns. For example, within a computing environment, high and/or low demands may recur on a seasonal basis (e.g., monthly, weekly, daily, etc.) In some cases, the seasonal pattern may correlate strongly with a set of resource performance values, while in other cases, the seasonal pattern may be weakly correlated (or may not correlate at all) with resource performance. For example, weekly high transaction loads may correlate with a decrease in performance for a database resource. On the other hand, weekly low transaction loads may not be correlated with database performance. This may occur due to maintenance and batch process scheduling during low transaction periods of time. The maintenance and batch processing may also significantly impact the database performance, reducing the correlation of low transaction seasonal periods.
  • In one or more embodiments, a simulation model comprises different correlation prediction models to account for different seasonal patterns. For example, one correlation prediction model may be trained as a function of the correlation between seasonal high data points in transactions and CPU utilization. Another correlation prediction model may be trained as a function of seasonal low values or some other seasonal pattern. The projected performance values may thus vary based on learned seasonal behavior.
  • In one or more embodiments, the simulation model may be trained to account for patterns and interdependencies between different demands on a resource. For example, an increase in one demand may correlate to an increase or decrease in another demand. In order to capture this relationship, a demand propagation model may be trained based on the correlation patterns extracted from historical time-series data. The demand propagation models may be used to predict how changes in one demand would affect values in other demands.
  • Once trained, a simulation model may be evaluated against one or more what-if scenarios. Each what-if scenario defines a set of one or more scenario parameters including an adjustment to one or more resource and/or demand time-series values. For example, a user may wish to determine how an increase in the number of database transactions and/or one or more other demands affects CPU utilization on a target host. To run the simulation, the user may input, into the simulation model, the magnitude of the adjustment of the set of resource values. In response, the simulation model generates and presents a projected adjustment to a set of one or more demand values.
  • In one or more embodiments, the simulation model may run historical simulations and/or forecast simulations. With a historical simulation, adjustments are made to one or more historical values to determine how the change would have affected past performance. For example, the user may query the simulation model to determine how one hundred more instances of a cloud server would have affected performance of a particular target host in the past week. A forecast simulation, on the other hand, may involve changes to one or more historical values and/or one or more forecasted values to determine how a change would affect future performance. In this case, the simulation model may comprise a forecasting model to project future values. In response to receiving an adjustment to one or more values (e.g., to a resource time-series), the simulation model may adjust a set of forecasted values.
  • In one or more embodiments, the what-if analytic may be used to optimize system resources and configurations. For example, one or more simulations may be run to determine how various adjustments would affect resource performance. Based on the output of the simulations, one or more responsive actions may be taken. Example responsive actions may include deploying additional resources, bringing resources offline, and adjusting system configurations. If a system administrator is expecting additional demands on a resource, for instance, various scenarios may be evaluated to determine which scenario leads to the optimal resource configuration, which may be the configuration that maximizes performance (or satisfies a threshold level) using the fewest resources.
  • 2. Architectural Overview
  • In one or more embodiments, training of a what-if analytic is driven by a plurality of time-series signals. A time series signal, in this context, comprises a sequence of values that are captured over time. The source of the time series signal and the type of information that is captured may vary from implementation to implementation. For example, a time series may be collected from one or more software and/or hardware resources and capture various performance attributes of the resources from which the data was collected. As another example, a time series may be collected using one or more sensors that measure physical properties, such as temperature, pressure, motion, traffic flow, or other attributes of an object or environment.
  • FIG. 1A illustrates an example system comprising a set of what-if analytic services that are driven by time-series data captured from one or more host devices. System 100 generally comprises hosts 110 a-n, data collector 120, analytic services 130, data repository 140, and clients 150 a-k. Components of system 100 may be implemented in one or more host machines operating within one or more clouds or other networked environments, depending on the particular implementation. Each component may be distributed over multiple applications and/or machines. Multiple components may be combined into one application and/or machine. Operations described with respect to one component may instead be performed by another component.
  • Hosts 110 a-n represent a set of one or more network hosts and generally comprise targets 112 a-i and agents 114 a-j. A “target” in this context refers to a resource that serves as a source of time series data. For example, a target may be a software deployment such as a database server instance, middleware instance, or some other software resource executing on a network host. Additionally or alternatively, a target may be a hardware resource, an environmental characteristic, or some other physical resource for which metrics may be measured and tracked.
  • Agents 114 a-j comprise hardware and/or software logic for capturing time-series measurements from a corresponding target (or set of targets) and sending these metrics to data collector 120. In one or more embodiments, an agent includes a process, such as a service or daemon, that executes on a corresponding host machine and monitors one or more software and/or hardware resources that have been deployed. Additionally or alternatively, an agent may include one or more hardware sensors, such as microelectromechanical (MEMs) accelerometers, thermometers, pressure sensors, etc., that capture time-series measurements of a physical environment and/or resource. Although only one agent and target is illustrated per host in FIG. 1A, the number of agents and/or targets per host may vary from implementation to implementation. Multiple agents may be installed on a given host to monitor different target sources of time series data. In other embodiments, an agent that resides remotely on a different host than a target may be responsible for collecting sample time-series data from the target.
  • Data collector 120 includes logic for aggregating data captured by agents 114 a-j into a set of one or more time-series. Data collector 120 may store the time-series data in data repository 140. Additionally or alternatively, data collector 120 may provide the time-series data to analytic services 130. In one or more embodiments, data collector 120 receives data from agents 114 a-j over one or more data communication networks, such as the Internet. Example communication protocols that may be used to transport data between the components illustrated within system 100 may include, without limitation, the hypertext transfer protocol (HTTP), simple network management protocol (SNMP), and other communication protocols of the internet protocol (IP) suite.
  • In one or more embodiments, data collector 120 collects demand and resource metrics from agents 114 a-j. As previously indicated, example demand metrics may include, but are not limited to, user logons to access a resource, transactions per second (e.g., on a database or other transactional system), executions per second, and calls per second associated with the resource. Example performance metrics that may be tracked may include, but are not limited to CPU performance metrics (e.g., CPU utilization rates, thread counts, etc.), memory bandwidth metrics (e.g., memory usage rates, cache hit rates, etc.), I/O metrics (e.g., physical reads and writes to disk), and network metrics (e.g., packet counts, packet flow rates, etc.).
  • Analytic services 130 includes correlation modelling logic 132, seasonality modelling logic 134, forecast modelling logic 136, and what-if analytic 138. Each service may be invoked independently or in combination to train and/or evaluate time-series models, as described in further detail below. For example, correlation modelling logic 132 may train/evaluate correlation prediction models (e.g., demand propagation models and resource prediction models), seasonality modelling logic 134 may train/evaluate seasonal behavioral models, forecast modelling logic 136 may train/evaluate forecast models, and what-if analytic 138 may evaluate/simulate what-if scenarios using the underlying models. Models may be trained/updated periodically or on-demand based on time-series data collected from targets 112 a-i.
  • Data repository 140 includes volatile and/or non-volatile storage for storing data for analytic services 130, such as trained simulation models and the results of running scenario simulations. Data repository 140 may be implemented by any type of storage unit and/or device (e.g., a file system, database, collection of tables, disk, tape cartridge, random access memory, or any other storage mechanism) for storing data. Further, data repository 140 may include multiple different storage units and/or devices. The multiple different storage units and/or devices may or may not be of the same type or located at the same physical site. Further, data repository 140 may be implemented or may execute on the same computing system as one or more other components of FIG. 1A and/or may reside remotely from one or more other components.
  • Clients 150 a-k represent one or more clients that may access analytic services 130 to evaluate what-if scenarios. A “client” in this context may be a human user, such as an administrator, a client program, or some other application instance. A client may execute locally on the same host as analytic services 130 or may execute on a different machine. If executing on a different machine, the client may communicate with analytic services 130 via one or more data communication protocols according to a client-server model, such as by submitting HTTP requests invoking one or more of the services and receiving HTTP responses comprising results generated by one or more of the services. Analytic services 130 may provide clients 150 a-k with an interface through which one or more of the provided services may be invoked. Example interfaces may comprise, without limitation, a graphical user interface (GUI), an application programming interface (API), a command-line interface (CLI), or some other interface that allows a user to interact with and invoke one or more of the provided services.
  • FIG. 1B illustrates an example dataflow for simulating scenarios and generating scenario outputs, in accordance with one or more embodiments. During a training phase, analytic services 130 receives, as input, demand and resource time-series data 142. Correlation modelling logic 132, seasonality modelling logic 134, and forecast modelling logic 136 each process demand and resource time-series data 142 to build respective time-series models. For example, correlation modelling logic 132 may train correlation prediction models, seasonality modelling logic 134 may train seasonal pattern models, and forecast modelling logic 136 may train forecast models. Example operations for building and training these time-series models are described in further detail below.
  • In one or more embodiments, seasonality modelling logic 134 provides seasonal pattern representations to correlation modelling logic 132 and/or forecast modelling logic 136. Based on the seasonal pattern representations, correlation modelling logic 132 may generate seasonal-aware correlation prediction models that account for seasonal behavior in the correlation patterns. Additionally or alternatively, forecast modelling logic 136 may generate seasonal-aware forecast models that account for seasonal behavior in the forecasts.
  • During an evaluation phase, what-if analytic 138 receives, as input, (a) scenario parameters 144, (b) trained correlation prediction models (including demand propagation and resource prediction models generated by correlation modelling logic 132), and (c) trained forecast models (from forecast modelling logic 136). Scenario parameters 144 comprise a set of values that define a particular scenario to simulate. In one or more embodiments, scenario parameters 144 define at least one adjustment to a demand or resource time-series value. For example, a scenario may be defined as follows: “What if my system saw X % more user-demand”. Based on this scenario definition, historical and/or forecast simulations may be evaluated by what-if analytic 138.
  • During the evaluation phase, what-if analytic 138 may use the demand propagation models to project how the change in user demand would affect other demands. For example, for a scenario “What if my system saw X % more transactions”, what-if analytic 138 may determine what change, if any, would occur in the number of redo records generated per second and/or any other demand on a database resource. What-if analytic 138 may then generate adjustments to the other demands according to the trained demand propagation models.
  • Additionally or alternatively, what-if analytic 138 may use the resource prediction models to project how the change in user demand would affect resource performance. The resource prediction model may account for adjustments to other demands. For example, physical writes to disk may be a function of both transactions per second and redo records per second. Thus, an adjustment to the number of transactions per second may be propagated to the redo records per second before adjusting the physical writes to disk. In other words, the projected change to the physical writes is not computed solely as a function of the adjustment to a single demand (e.g., based on a one-to-one mapping between the demand and resource). Rather, complex relationships (e.g., many-to-one and many-to-many mappings) between different demands and resources are accounted for in the simulation models, which may increase the accuracy of the projected changes.
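  • The propagation order described above (adjust transactions, propagate to redo records, then project physical writes) can be illustrated with a minimal sketch. The function names and all numeric coefficients below are hypothetical and chosen only for illustration; they are not taken from any embodiment, and a trained system would learn them from historical time-series data.

```python
# Hypothetical, hand-picked coefficients standing in for trained models.
def propagate_redo(tx_per_sec: float) -> float:
    """Demand propagation model (assumed linear): transactions/s -> redo records/s."""
    return 4.0 * tx_per_sec + 10.0


def predict_physical_writes(tx_per_sec: float, redo_per_sec: float) -> float:
    """Resource prediction model (assumed linear) with two demand inputs."""
    return 0.5 * tx_per_sec + 0.2 * redo_per_sec + 5.0


def simulate_more_transactions(tx_per_sec: float, pct_increase: float) -> float:
    """Project physical writes/s for a scenario with pct_increase % more transactions."""
    adjusted_tx = tx_per_sec * (1.0 + pct_increase / 100.0)
    # Propagate the adjustment to the dependent demand first...
    adjusted_redo = propagate_redo(adjusted_tx)
    # ...then project the resource metric from both adjusted demands.
    return predict_physical_writes(adjusted_tx, adjusted_redo)


baseline = simulate_more_transactions(100.0, 0.0)
scenario = simulate_more_transactions(100.0, 10.0)
print(f"baseline writes/s: {baseline:.1f}, with 10% more transactions: {scenario:.1f}")
```

  • In this sketch the projected change in physical writes reflects both the direct effect of the additional transactions and the indirect effect through the propagated redo records, which is the many-to-one relationship the paragraph describes.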
  • Based on the evaluation, what-if analytic 138 generates scenario outputs 146. In one or more embodiments, scenario outputs 146 capture adjustments made to demand and/or resource time-series data for a scenario definition. In the example scenario “What if my system saw X % more transactions”, for instance, the scenario output may reflect historical and/or forecasted changes to one or more resource performance metrics and/or other demand metrics. Scenario outputs may be stored in data repository 140, provided to clients 150 a-k (e.g., by sending over a network to another host device or notifying a separate application executing on a host) and/or presented via an interactive display. An application or other user may process the scenario outputs to make adjustments to targets 112 a-i. For example, if a scenario simulation indicates that predicted performance metrics for a scenario will not satisfy a threshold, additional resources may be deployed and/or existing resources may be reconfigured (e.g., through load balancing, pooling etc.) to boost performance. The number of resources that are deployed may also be selected based on scenario simulations that satisfy the target performance threshold. In other cases, resources may be brought offline or other responsive actions may be taken to optimize system 100 for a particular scenario.
  • 3. Demand Propagation and Resource Prediction Models
  • Performance of a software or hardware resource may be a function of many different inputs and interactions. For example, CPU utilization on a database host may be impacted by the frequency of transactions: a greater number of transactions per second may typically correlate with greater CPU utilization. However, CPU utilization may also be affected by other demands, such as the types of queries executed by the database host. On-line transaction processing (OLTP) type queries/workloads, such as table inserts and updates, generally do not impact CPU utilization as significantly as online analytical processing (OLAP) queries/workloads. The reason for the disparity is that OLAP queries generally involve scans and analysis of a much greater number of records within the database than an OLTP query. Other demands may also have varying impact on the database host's performance. Examples include, but are not limited to, the number of redo records (i.e., a log that stores a history of changes made to the database) generated per second, and the number of user calls to the database.
  • In one or more embodiments, correlation modelling logic 132 builds a set of correlation prediction models to represent relationships between demands and resources. A correlation prediction model, in this context, is a data object or data structure that is generated, in data repository 140, as a representation of a learned pattern of correlation between different time-series. For example, one type of correlation prediction model, also referred to herein as a demand propagation model, may map values for a demand metric to correlated values for another demand metric. For instance, a demand propagation model may map values associated with one demand (e.g., transactions per second) to another demand (e.g., redo records generated per second). Another type of correlation prediction model, also referred to herein as a resource prediction model, may map values for a demand metric to correlated values for a resource performance metric. In the context of CPU utilization, for instance, a model may map one or more demand value ranges (e.g., transactions per second, user calls per second, etc.) to correlated CPU utilization rates.
  • In one or more embodiments, correlation modelling logic 132 comprises a set of training processes implementing machine-learning algorithms to train a set of models. The machine-learning algorithms may perform regression, clustering, and/or classification to train models, as described further below. The set of training processes may initially train the set of models using a training set of time-series data. As additional time-series data is received, the correlation prediction models may be updated to reflect changes and/or additional learned patterns between resources and demands.
  • FIG. 2 illustrates an example set of operations for training a set of demand propagation and resource prediction models, in accordance with one or more embodiments. The set of operations includes receiving time-series data for a set of demands (Operation 210). For example, correlation modelling logic 132 may receive, from data collector 120, a sequence of data points capturing the number of instruction executions, transactions per second, redo records generated, function calls, and/or user calls per second for one or more of targets 112 a-i. Additionally or alternatively, other demands may also be captured for analysis, depending on the particular implementation. As previously indicated, time-series data may be received on-demand, periodically, or on a streaming/continuous basis from data collector 120.
  • In one or more embodiments, the training set of time-series data is conditioned before analysis. For example, correlation modelling logic 132 may perform an hourly rollup and/or a time-alignment of different time-series datasets to compare values from the same hourly time-range. In other embodiments, values may be rolled up over a different time period (e.g., on a per minute basis, per half-hour, daily, etc.).
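  • As a minimal sketch of this conditioning step, the following rolls raw samples up into hourly buckets and then keeps only the hours present in both series so that values are compared over the same time range. The function names are hypothetical, and averaging within each hour is an assumption; the description does not specify the aggregation function.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_rollup(samples):
    """Group (timestamp, value) samples by hour and average each bucket.

    Averaging is an assumption; a sum or maximum could be used instead.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {hour: mean(values) for hour, values in buckets.items()}


def time_align(series_a, series_b):
    """Keep only the hours that appear in both rolled-up series."""
    hours = sorted(series_a.keys() & series_b.keys())
    return hours, [series_a[h] for h in hours], [series_b[h] for h in hours]


# Hypothetical raw samples for two demands.
transactions = hourly_rollup([
    (datetime(2020, 1, 1, 0, 5), 10.0),
    (datetime(2020, 1, 1, 0, 35), 20.0),
    (datetime(2020, 1, 1, 1, 10), 30.0),
])
user_calls = hourly_rollup([
    (datetime(2020, 1, 1, 1, 20), 5.0),
    (datetime(2020, 1, 1, 2, 0), 7.0),
])
hours, tx_values, call_values = time_align(transactions, user_calls)
print(hours, tx_values, call_values)
```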
  • Before the training set of time-series data is used to train correlation prediction models, correlation modelling logic 132 analyzes different demands and filters out/removes demands that contain redundant information. Demand filtering reduces processing and storage overhead by eliminating the training and storage of redundant models. Demand filtering is described further below with reference to operations 220 to 260.
  • During the filtering phase, correlation modelling logic 132 selects a demand pair to analyze (Operation 220). For example, if the time series includes three inputs/demands on a system, D1, D2, and D3, there are three possible demand pairs: {D1, D2}, {D1, D3} and {D2, D3}. The demand pairs may be selected and processed in any order.
  • As part of the filtering phase, correlation modelling logic 132 next determines a correlation coefficient using the training set of time-series data for the selected demand pair (Operation 230). For example, a Pearson's correlation coefficient may be computed as follows:
  • $$\rho = \frac{\sum_{i=1}^{n}(d_i-\bar{d})(x_i-\bar{x})}{\sqrt{\sum_{i=1}^{n}(d_i-\bar{d})^{2}}\,\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^{2}}} \qquad (1)$$
  • where:
      • $d$ is a first demand time-series dataset including n values {d1, . . . , dn};
      • $x$ is a second demand time-series dataset including n values {x1, . . . , xn};
      • $\bar{d}$ is the sample mean for the first demand time-series dataset; and
      • $\bar{x}$ is the sample mean for the second demand time-series dataset.
        Other methods of computing a correlation coefficient, such as Spearman's rank correlation coefficient, may also be used, depending on the particular implementation.
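  • A direct, dependency-free rendering of Equation (1) is sketched below. The function name is hypothetical, and returning 0.0 when either series is constant (where the coefficient is undefined) is an assumption made only so the sketch never divides by zero.

```python
import math


def pearson(d, x):
    """Pearson correlation coefficient of two equal-length series, per Equation (1)."""
    n = len(d)
    if n != len(x) or n < 2:
        raise ValueError("series must have the same length and at least two values")
    d_bar = sum(d) / n  # sample mean of the first demand series
    x_bar = sum(x) / n  # sample mean of the second demand series
    numerator = sum((di - d_bar) * (xi - x_bar) for di, xi in zip(d, x))
    denominator = math.sqrt(sum((di - d_bar) ** 2 for di in d)) * math.sqrt(
        sum((xi - x_bar) ** 2 for xi in x)
    )
    if denominator == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return numerator / denominator


print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly linear, so close to 1.0
```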
  • Correlation modelling logic 132 next compares the correlation coefficient to a filter threshold to determine whether to filter one of the demands from the demand pair (Operation 240). The filter threshold may vary from implementation to implementation. In one or more embodiments, the filter threshold for the Pearson's correlation coefficient may be set at 0.99. A correlation coefficient this high indicates that the demands track each other very closely and may be treated as representing the same behavioral patterns. In other cases, a slightly lower correlation coefficient may be used to allow for a greater amount of deviation between the demands, thereby increasing the likelihood of filtering.
  • If the filter threshold is satisfied, then correlation modelling logic 132 removes one of the demands from the demand pair (Operation 250). Once removed, a demand is not used to train demand propagation models or resource prediction models. Instead, the demand propagation model and resource prediction model for the remaining demand in the demand pair may be applied to the removed demand since the demands are highly correlated.
  • Correlation modelling logic 132 next determines whether there are any demand pairs remaining (Operation 260). The determination may be made based on whether any demands were filtered. For example, in the above example with three demand pairs, if {D1, D2} was analyzed and no demands were filtered, then there are two remaining demand pairs to analyze: {D1, D3} and {D2, D3}. On the other hand, if D2 was filtered out, then only one demand pair is left to analyze: {D1, D3}. The process returns to operation 220 for each remaining demand pair.
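  • The filtering loop of operations 220 to 260 can be sketched as follows. The function names and sample data are hypothetical. Which member of a highly correlated pair is dropped is not specified in the description; this sketch arbitrarily drops the second one.

```python
import math
from itertools import combinations


def pearson(d, x):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(d)
    d_bar, x_bar = sum(d) / n, sum(x) / n
    num = sum((di - d_bar) * (xi - x_bar) for di, xi in zip(d, x))
    den = math.sqrt(sum((di - d_bar) ** 2 for di in d)) * math.sqrt(
        sum((xi - x_bar) ** 2 for xi in x)
    )
    return num / den if den else 0.0


def filter_demands(series, filter_threshold=0.99):
    """Return the demand names that survive redundancy filtering."""
    removed = set()
    for first, second in combinations(series, 2):        # Operation 220
        if first in removed or second in removed:
            continue                                      # pair no longer needs analysis
        coefficient = pearson(series[first], series[second])  # Operation 230
        if coefficient >= filter_threshold:               # Operation 240
            removed.add(second)                           # Operation 250 (arbitrary choice)
    return [name for name in series if name not in removed]


# Hypothetical demands: D2 tracks D1 almost exactly, D3 does not.
demands = {
    "D1": [1.0, 2.0, 3.0, 4.0],
    "D2": [2.0, 4.0, 6.0, 8.1],
    "D3": [4.0, 1.0, 3.0, 2.0],
}
print(filter_demands(demands))
```

  • With this data, D2 is filtered after {D1, D2} is analyzed, so {D2, D3} is skipped and only {D1, D3} remains to be evaluated, mirroring the example in the text.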
  • Once all demand pairs have been processed, correlation modelling logic 132 trains a set of one or more demand propagation models for the remaining demand pairs (Operation 270). In one or more embodiments, correlation modelling logic 132 trains a demand propagation model by fitting a correlation prediction model to the observed data sets of each demand pair. Given demand pair {D1, D2}, for instance, a predictive model may be fit using linear regression as follows:

  • $$D_1 = D_2\beta + \epsilon_i \qquad (2)$$
  • where
      • $\beta$ is a parameter vector with elements that are partial derivatives; and
      • $\epsilon_i$ represents an error term.
        Linear regression may be used to compute a “best fit” line through the demand time-series data points. The best fit in this case is the line that most accurately represents the relationship between the demand pairs. The best fit may be determined by minimizing the sum of squared residuals, percentage of squared residuals, or according to any other linear regression estimation method. In other cases, nonlinear models may be trained to fit polynomials or other functions to the data points.
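  • For a single demand pair, minimizing the sum of squared residuals has a closed-form solution, sketched below. The function name and data are hypothetical, and the sketch assumes the parameter vector in Equation (2) includes an intercept term alongside the slope.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x is constant; the slope is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical demand pair: D2 (transactions/s) and D1 (executions/s).
d2 = [1.0, 2.0, 3.0, 4.0]
d1 = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(d2, d1)
print(f"D1 is approximately {slope:.2f} * D2 + {intercept:.2f}")
```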
  • In one or more embodiments, demand propagation models are only trained for demand pairs if the correlation coefficient is above a threshold. The training threshold in this case may be set much lower than the filter threshold as no demand pairs remain with a correlation coefficient above the filter threshold. For example, a training threshold may be set at 0.25 or any other value, depending on the particular implementation. The training threshold may eliminate having to train/store demand propagation models for demands that have little to no correlation.
  • FIG. 3 illustrates an example set of demand propagation models, in accordance with one or more embodiments. In the example, demand propagation models are trained for demand pairs that have a correlation above a threshold (e.g., 0.25). Referring to the demand pairs:
      • Chart 302 depicts the correlation between the size of redo data generated per second and executions per second on a database host. The correlation coefficient is 0.13, which is below the threshold. Thus, no demand propagation model is trained.
      • Chart 304 depicts the correlation between transactions per second and executions per second on a database host. The correlation is 0.73, which is above the threshold. Thus, a demand propagation model is fit to the data points.
      • Chart 306 depicts the correlation between user calls per second and executions per second on the database host. The correlation is 0.44, which is above the threshold. Thus, a demand propagation model is fit to the data points.
      • Chart 308 depicts the correlation between transactions per second and redo data generated per second on a database host. The correlation coefficient is 0.1, which is below the threshold. Thus, no demand propagation model is trained.
      • Chart 310 depicts the correlation between the number of user calls per second and the size of redo records generated per second on a database host. The correlation coefficient is 0.08, which is below the threshold. Thus, no demand propagation model is trained.
      • Chart 312 depicts the correlation between user calls per second and transactions per second on a database host. The correlation coefficient is 0.05, which is below the threshold. Thus, no demand propagation model is trained.
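  • By way of illustration only, the threshold check applied to the demand pairs above may be sketched as follows. The function names and the sample series are hypothetical and are not taken from FIG. 3; the sketch assumes a Pearson correlation coefficient and compares the signed coefficient against the training threshold.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / math.sqrt(var_x * var_y)

def pairs_to_train(demands, training_threshold=0.25):
    """Return (demand_a, demand_b, r) for pairs whose r exceeds the threshold."""
    names = sorted(demands)
    selected = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(demands[a], demands[b])
            if r > training_threshold:
                selected.append((a, b, r))
    return selected

# Hypothetical samples; a real system would use the conditioned time-series.
demands = {
    "executions": [10.0, 20.0, 30.0, 40.0, 50.0],
    "transactions": [1.0, 2.0, 3.0, 4.0, 6.0],
    "redo": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = pairs_to_train(demands)
```

  • In this sketch only the executions/transactions pair clears the threshold, so a demand propagation model would be fit for that pair alone.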
  • Returning again to FIG. 2, the process includes receiving time-series datasets for resource performance metrics (Operation 280). For example, correlation modelling logic 132 may receive a sequence of samples that track CPU utilization, memory bandwidth, I/O operations, and/or other performance metrics over time. The resource performance metrics may be received at any point in the process on demand, periodically, or on a streaming basis. Upon receipt, the resource performance metrics may be conditioned as previously described.
  • Using the time-series data for the remaining demands and the resource performance metrics, correlation modelling logic 132 trains a set of resource prediction models (Operation 290). In one or more embodiments, correlation modelling logic 132 trains a resource prediction model by fitting a correlation prediction model to the observed data sets of a demand-resource metric pairing. Given demand-resource pair {D1, R1}, for instance, a predictive model may be fit using linear regression as follows:

  • $R_1 = D_1 \beta + \epsilon_i$  (3)
  • where
      • $\beta$ is a parameter vector with elements that are partial derivatives; and
      • $\epsilon_i$ represents an error term.
        Linear regression may be used to compute a “best fit” line through the demand and resource time-series data points. The best fit in this case is the line that most accurately represents the relationship between the demand and resource performance. The best fit may be determined by minimizing the sum of squared residuals, percentage of squared residuals, or according to any other linear regression estimation method. In other cases, nonlinear models may be trained to fit polynomials or other functions to the data points.
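  • As a minimal sketch of the least-squares fit described above, the following assumes a single demand, a single resource metric, and a model with a slope and an intercept. The intercept term, the function names, and the sample values are assumptions made for illustration and do not appear in equation (3).

```python
def fit_line(demand, resource):
    """Ordinary least squares fit of resource = intercept + slope * demand."""
    n = len(demand)
    mean_d = sum(demand) / n
    mean_r = sum(resource) / n
    sxx = sum((d - mean_d) ** 2 for d in demand)
    sxy = sum((d - mean_d) * (r - mean_r) for d, r in zip(demand, resource))
    slope = sxy / sxx  # minimizes the sum of squared residuals
    intercept = mean_r - slope * mean_d
    return slope, intercept

def predict(model, demand_value):
    """Map a demand value to a predicted resource value."""
    slope, intercept = model
    return intercept + slope * demand_value

# Hypothetical samples lying exactly on cpu = 2 + 0.1 * executions
executions_per_sec = [100.0, 200.0, 300.0, 400.0]
cpu_utilization = [12.0, 22.0, 32.0, 42.0]
model = fit_line(executions_per_sec, cpu_utilization)
```

  • With real data the points would scatter around the line, and the residuals left over from the fit can later be used to characterize uncertainty.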
  • In one or more embodiments, resource prediction models are only trained for demand-to-resource pairs if the correlation coefficient is above a threshold. For example, a training threshold may be set at 0.25, as above for the demand propagation models, or any other value, depending on the particular implementation. The training threshold may eliminate having to train/store resource prediction models for demands that have little to no correlation.
  • FIG. 4 illustrates an example set of resource prediction models, in accordance with one or more embodiments. In the example, resource prediction models are trained for resource-to-demand pairings that have a correlation above a threshold (e.g., 0.4). Referring to the pairings:
      • Chart 402 depicts the correlation between CPU utilization and executions per second on a database host. The correlation coefficient is 0.28, which is below the threshold. Thus, no resource prediction model is trained.
      • Chart 404 depicts the correlation between CPU utilization and redo size generated per second on a database host. The correlation is 0.51, which is above the threshold. Thus, a resource prediction model is fit to the data points.
      • Chart 406 depicts the correlation between CPU utilization and transactions per second on a database host. The correlation is 0.43, which is above the threshold. Thus, a resource prediction model is fit to the data points.
      • Chart 408 depicts the correlation between CPU utilization and user calls per second on a database host. The correlation coefficient is 0.43, which is above the threshold. Thus, a resource prediction model is fit to the data points.
  • As depicted in FIG. 4, host CPU utilization is a function of redo size, transactions, and user calls per second. In other cases, CPU utilization may be a function of other demands in addition or as an alternative to those listed. Correlation modelling logic 132 may select demands during runtime to predict each resource. The selected demands may be adjusted over time as additional correlation patterns are detected. If no demand is correlated to a given resource metric, then correlation modelling logic 132 may proceed without training a resource prediction model for the given resource performance metric.
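  • The selection of demands for a given resource may be sketched as follows, using the correlation coefficients and the 0.4 threshold from the FIG. 4 example. The dictionary keys and the function name are hypothetical.

```python
def select_demands(correlations, training_threshold):
    """Keep the demands whose correlation with the resource exceeds the threshold."""
    return sorted(
        name for name, r in correlations.items() if r > training_threshold
    )

# Correlation of each demand with CPU utilization, as listed for charts 402-408
cpu_correlations = {
    "executions_per_sec": 0.28,
    "redo_size_per_sec": 0.51,
    "transactions_per_sec": 0.43,
    "user_calls_per_sec": 0.43,
}
selected = select_demands(cpu_correlations, training_threshold=0.4)
# An empty selection would mean no resource prediction model is trained
# for this resource performance metric.
```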
  • 4. Seasonal-Aware Simulation Models
  • Correlation patterns may vary between different seasons. A season in this context refers to a pattern that recurs periodically. In a database host, for instance, daily high seasons for transactions may be observed between the hours of 9 a.m. and 5 p.m. from Monday to Friday. Daily low seasons may occur outside of these hours. In other cases, seasonal highs, lows, and/or other patterns may recur on an hourly, weekly, monthly, yearly, or holiday basis. As previously described, the daily highs in transactions may correlate with a high CPU utilization rate. However, daily lows may not be closely correlated with the CPU utilization rate as batch processes may be run in the evenings on the database host. In other cases, seasonal lows may follow a different correlation pattern than the daily highs. Therefore, training different correlation models (e.g., demand propagation and resource prediction models) for different seasons may lead to more accurate representations.
  • In one or more embodiments, seasonality modelling logic 134 builds a seasonal classification model that represents different seasonal patterns. For example, the seasonal classification model may cluster time-series data points that are associated with different seasonal pattern classifications. The seasonal pattern classifications and clusters may then be provided to correlation modelling logic 132. In response, correlation modelling logic 132 may build demand propagation models and/or resource prediction models, as previously described, for the data points belonging to each cluster. In other words, separate models are trained and built for each cluster.
  • Correlation prediction models for a set of demands and resources may vary between different clusters/seasonal pattern classifications. For example, CPU utilization may be a function of transactions, user calls, and redo logs generated per second in the high season. In the low season, CPU utilization may only be a function of executions per second. In other words, the correlation coefficients between a particular demand and resource pair may vary from one season to the next. As a result, correlation modelling logic 132 may select different demands to train resource prediction models for different seasonal patterns.
  • FIG. 5 illustrates an example set of operations for performing seasonal aware training of simulation models, in accordance with one or more embodiments. The process includes identifying seasonal patterns in a time-series (Operation 510). For example, seasonality modelling logic 134 may classify different sub-periods/instances of a resource and/or demand time-series as sparse high for those instances that exhibit sharp, relatively short bursts on a recurring basis. Other classifications may include, but are not limited to, dense highs, sparse lows, and dense lows. Example techniques for classifying seasonal patterns are described in U.S. application Ser. No. 15/266,971, entitled “SEASONAL AWARE METHOD FOR FORECASTING AND CAPACITY PLANNING”; U.S. Provisional Patent Appl. No. 62/301,585, entitled “METHOD FOR CREATING PERIOD PROFILE FOR TIME-SERIES DATA WITH RECURRENT PATTERNS”; U.S. application Ser. No. 15/057,065, entitled “SYSTEM FOR DETECTING AND CHARACTERIZING SEASONS”; U.S. application Ser. No. 15/057,060, entitled “SUPERVISED METHOD FOR CLASSIFYING SEASONAL PATTERNS”; U.S. application Ser. No. 15/057,062, entitled “UNSUPERVISED METHOD FOR CLASSIFYING SEASONAL PATTERNS”; and U.S. Provisional Patent Appl. No. 62/370,880, entitled “UNSUPERVISED METHOD FOR BASELINING AND ANOMALY DETECTION IN TIME-SERIES DATA FOR ENTERPRISE SYSTEMS”, previously incorporated by reference.
  • Responsive to identifying a seasonal pattern, seasonality modelling logic 134 extracts data points in the time series that belong to the seasonal pattern (Operation 520). For example, if a high season for resource utilization is identified to recur weekly on Mondays from 9 a.m. to noon, then data points captured during this time frame may be clustered or otherwise grouped together within data repository 140.
  • Once extracted, correlation modelling logic 132 uses the data points to train a set of one or more correlation prediction models for the seasonal pattern (Operation 530). For example, correlation modelling logic 132 may generate demand propagation models and/or resource prediction models as previously described.
  • The process next determines whether there are any remaining seasonal patterns to analyze (Operation 540). If so, then the process repeats for the next selected seasonal pattern. Thus, correlation prediction models may be generated for each respective seasonal pattern. For example, if sparse high, dense high, and low seasonal patterns were identified, then the process may repeat for each of these seasonal pattern classifications. After correlation prediction models have been trained for each seasonal pattern, the process ends.
  • As can be seen from the process in FIG. 5, different sets of data points may be used to train correlation prediction models associated with different respective seasonal patterns. For example, for a high season, data points from a first set of one or more sub-periods (e.g., Monday to Thursday 9 a.m. to 4 p.m.) may be extracted to train the correlation prediction model. Data points from a different set of one or more sub-periods may be used to train correlation prediction models for other seasonal patterns, such as a low season.
  • FIG. 6 illustrates an example set of seasonal-aware correlation prediction models, in accordance with one or more embodiments. Chart 602 depicts a resource prediction model that maps values of executions per second to CPU utilization rates in the sparse high season. A best fit line is generated using time-series samples from the sparse high season. Chart 604 depicts a resource prediction model that maps values of executions per second to CPU utilization rates in the dense high season. As can be seen, the resource prediction model that is fit to the data has a different slope than in the sparse high season.
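  • The loop of FIG. 5 may be sketched as follows, assuming the seasonal classification has already labeled each sample and that each model is a simple least-squares line. The season labels, sample values, and function names are hypothetical.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def train_per_season(samples):
    """samples: iterable of (season_label, demand_value, resource_value)."""
    grouped = defaultdict(list)
    for season, demand, resource in samples:
        grouped[season].append((demand, resource))  # extract points per season
    models = {}
    for season, points in grouped.items():  # repeat for each seasonal pattern
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        models[season] = fit_line(xs, ys)
    return models

# Hypothetical labeled samples: (season, executions per second, CPU %)
samples = [
    ("sparse_high", 100.0, 30.0), ("sparse_high", 200.0, 50.0),
    ("sparse_high", 300.0, 70.0),
    ("dense_high", 100.0, 15.0), ("dense_high", 200.0, 20.0),
    ("dense_high", 300.0, 25.0),
]
models = train_per_season(samples)
```

  • The two fitted lines have different slopes, which is the kind of season-dependent relationship that separate models are meant to capture.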
  • In one or more embodiments, data repository 140 stores a mapping between seasonal patterns and the corresponding set of trained correlation prediction models. During the evaluation phase, described in further detail below, the mapping data may be accessed to determine which correlation prediction models to use for a given scenario. For example, a scenario may be defined as follows: “What if my system saw X times more transactions in the high season”. In response, what-if analytic 138 may determine which correlation prediction models are associated with the high season by reading the mapping data. What-if analytic 138 may use the associated correlation prediction models to generate the scenario output. In other cases, a scenario may apply to multiple seasons. For example, a general scenario of “What if my system saw X times more transactions” may apply across all seasons. In this case, what-if analytic 138 may analyze each season independently and stitch together the results.
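  • One possible way to evaluate a scenario that spans multiple seasons is sketched below. The hour-based season classifier, the model coefficients, and the function names are all hypothetical; the sketch only shows a lookup in a season-to-model mapping followed by stitching the per-sample results back into one series.

```python
def season_of(hour):
    """Hypothetical classifier: 9:00 through 16:59 is high season, else low."""
    return "high" if 9 <= hour < 17 else "low"

# Hypothetical mapping from seasonal pattern to a (slope, intercept) model
# that predicts CPU utilization from transactions per second.
season_models = {"high": (0.2, 10.0), "low": (0.05, 5.0)}

def simulate(hours, transactions, factor, models):
    """Scale each sample, then predict it with the model for its own season."""
    results = []
    for hour, tx in zip(hours, transactions):
        slope, intercept = models[season_of(hour)]
        adjusted_tx = tx * factor
        results.append(intercept + slope * adjusted_tx)
    return results  # stitched output, in the original time order

cpu = simulate([8, 10, 18], [100.0, 200.0, 100.0], 2.0, season_models)
```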
  • 5. Historical Scenario Simulations
  • In one or more embodiments, what-if analytic 138 is configured to perform historical what-if scenario simulations. With historical simulations, modifications are made to historical time-series data to see how the system would have behaved. One benefit of adjusting historical data is that what-if analytic 138 is able to provide the user with scenario outputs that span the distribution of known operating modes. Historical simulations may be performed with no inferences about future conditions. In other words, a historical simulation may be performed without having to generate a forecast.
  • FIG. 7 illustrates an example set of operations for simulating a historical scenario, in accordance with one or more embodiments. The process includes receiving a set of historical simulation parameters (Operation 710). In one or more embodiments, the scenario parameters define a historical adjustment. For example, a historical simulation scenario may be defined as “What if there were three times the amount of users in the last three months?” As another example, a historical simulation scenario may be defined as “What if there were ten percent more transactions last week?” In both examples, the scenario defines a historical adjustment and a period in the past to analyze. In other cases, a default timeframe to analyze may be chosen rather than explicitly defined. For example, a user may submit a scenario definition as follows: “What if there were ten percent more transactions on database host DBHost?” In this scenario definition, the user identifies an adjustment/modification and a particular target resource, but does not define the timeframe. Any default timeframe (e.g., the past week, month, six months, etc.) may be selected for analysis in this case.
  • Once the scenario parameters have been submitted, what-if analytic 138 adjusts one or more demands by a prescribed amount (Operation 720). For example, if the scenario requests a simulation of a historical scenario with three times more transactions, then what-if analytic 138 may adjust the transaction time-series data collected from one or more of targets 112 a-i to treble the values. Adjustments may be made to values over a prescribed or default timeframe, depending on the particular implementation.
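  • A minimal sketch of Operation 720 follows, assuming the time-series is held as (date, value) pairs and the timeframe is an optional closed date window. The function name, the data layout, and the sample values are assumptions for illustration.

```python
from datetime import date

def adjust_demand(series, factor, start=None, end=None):
    """Scale values dated within [start, end]; None leaves that side open."""
    adjusted = []
    for day, value in series:
        in_window = (start is None or day >= start) and (end is None or day <= end)
        adjusted.append((day, value * factor if in_window else value))
    return adjusted

# Hypothetical historical transactions per second
transactions = [
    (date(2020, 5, 1), 100.0),
    (date(2020, 5, 2), 120.0),
    (date(2020, 5, 3), 90.0),
]
# Treble the values from May 2 onward; earlier samples are left as observed.
trebled = adjust_demand(transactions, 3.0, start=date(2020, 5, 2))
```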
  • In one or more embodiments, the adjustments that are made by what-if analytic 138 vary depending on whether the scenario is attempting to model a user change or not. If the scenario defines a user change, then what-if analytic 138 models the user change by adjusting all of the demands by a prescribed amount. For example, if a scenario requests a simulation of a fourfold change in users, then what-if analytic 138 may adjust all demands by four times. If the scenario is more particular and adjusts a single demand, then the process may start by adjusting the prescribed demand by the prescribed amount. What-if analytic 138 may access the relevant demand propagation models to predict how the adjusted demand would propagate to other related demands, as described in the operations below.
  • During the historical simulation, what-if analytic 138 determines whether to propagate changes to a prescribed demand to other demands (Operation 730). As previously indicated, adjustments may be propagated if the scenario is making adjustments to individual demands rather than user changes. If the scenario is defining a user change, then the adjustment may be applied across all demands, and the process may skip to Operation 750. If the adjustment is to a prescribed demand or set of demands, then what-if analytic 138 may determine whether there are any associated demand propagation models involving the adjusted demand. For instance, a trained demand propagation model may be available if the adjusted demand was correlated more than the training threshold with another demand. Conversely, a trained demand propagation model may not be available for the adjusted demand if the demand is not correlated with other demands.
  • If a trained demand propagation model is available, then what-if analytic 138 uses the demand propagation model to propagate the adjustment to other, related demands (Operation 740). For example, an adjusted execution per second value may be mapped to a corresponding value for transactions per second using the demand propagation model depicted in chart 304. The demand propagation model is also reversible, such that a change in transactions per second may be mapped to a corresponding execution value. Similarly, an adjustment in the number of executions per second may be mapped to a corresponding adjustment in user calls per second, and vice versa, using chart 306.
  • Once the adjustments to the demands are complete, what-if analytic 138 determines whether there are any resource prediction models associated with the adjusted demands (Operation 750). For example, what-if analytic 138 may search data store 140 for a trained resource prediction model that maps an adjusted demand to a resource performance metric. In some cases, a resource prediction model may not be available. This scenario may occur if none of the adjusted demands are correlated with a relevant resource performance metric. If none of the adjusted demands are correlated to resource performance, then the process may proceed without making any adjustments to a resource performance metric.
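  • Operations 720 through 740 may be sketched together as follows. The sketch assumes linear demand propagation models stored as (slope, intercept) and treats reversibility as algebraic inversion of the fitted line, which is one possible reading; the model coefficients, demand names, and function names are hypothetical.

```python
def forward(model, x):
    """Map the source demand to the related demand."""
    slope, intercept = model
    return intercept + slope * x

def reverse(model, y):
    """Map the related demand back to the source demand."""
    slope, intercept = model
    if slope == 0:
        raise ValueError("a flat model cannot be inverted")
    return (y - intercept) / slope

def apply_scenario(demands, factor, target=None, propagation=None):
    """demands: name -> value. A target of None models a user change."""
    if target is None:
        # User change: every demand is adjusted by the prescribed amount.
        return {name: value * factor for name, value in demands.items()}
    adjusted = dict(demands)
    adjusted[target] = demands[target] * factor
    for (src, dst), model in (propagation or {}).items():
        if src == target:
            adjusted[dst] = forward(model, adjusted[target])
        elif dst == target:
            adjusted[src] = reverse(model, adjusted[target])
    return adjusted  # demands with no trained model keep their values

# Hypothetical model: transactions = 5 + 0.25 * executions
propagation = {("executions", "transactions"): (0.25, 5.0)}
demands = {"executions": 400.0, "transactions": 105.0, "redo": 50.0}

more_executions = apply_scenario(demands, 3.0, "executions", propagation)
more_transactions = apply_scenario(demands, 2.0, "transactions", propagation)
more_users = apply_scenario(demands, 4.0)
```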
  • If a resource prediction model is available, then what-if analytic 138 may use the resource prediction model to generate an adjustment to at least one resource performance time-series (Operation 760). For example, a resource prediction model may map an adjusted value for executions per second to a corresponding CPU utilization value. What-if analytic 138 may adjust the historical CPU utilization rate to the value mapped to the adjusted executions per second value.
  • In one or more embodiments, an adjustment to a resource performance metric may be determined based on multiple resource prediction models. For example, one resource prediction model may map an adjusted value for transactions per second to an adjusted CPU utilization rate. A second model may map an adjusted redo size per second to an adjusted CPU utilization rate. A final adjustment may be computed by combining (e.g., through averaging or otherwise aggregating) the adjustments from the different resource prediction models.
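  • Combining the outputs of several resource prediction models by simple averaging, one of the aggregation options mentioned above, may be sketched as follows. The model coefficients, demand names, and function name are hypothetical.

```python
def combined_prediction(models, adjusted_demands):
    """Average the resource estimates from every applicable model."""
    estimates = []
    for demand_name, (slope, intercept) in models.items():
        if demand_name in adjusted_demands:
            estimates.append(intercept + slope * adjusted_demands[demand_name])
    if not estimates:
        return None  # no applicable model: leave the resource metric unchanged
    return sum(estimates) / len(estimates)

# Hypothetical models that each predict CPU utilization from one demand
cpu_models = {
    "transactions_per_sec": (0.5, 10.0),
    "redo_size_per_sec": (0.25, 20.0),
}
adjusted = {"transactions_per_sec": 100.0, "redo_size_per_sec": 200.0}
cpu_estimate = combined_prediction(cpu_models, adjusted)
```

  • Other aggregations, such as weighting each estimate by the strength of its correlation, would fit the same structure.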
  • In one or more embodiments, adjustments may be performed based on seasonal pattern classifications. For example, if a historical simulation is being run for a prescribed seasonal pattern (e.g., a high season, sparse high, low season, etc.), then what-if analytic 138 may search data store 140 for a mapping between the prescribed seasonal pattern and a corresponding set of resource prediction models. What-if analytic 138 may then select/use one or more resource prediction models to map adjusted demand values to adjusted resource performance values for the prescribed seasonal pattern. In the event that the simulation spans multiple seasons, the models may be isolated to regions of time based on collective seasons. For example, the resource prediction model depicted in chart 602 may be used to map adjustments to executions per second in the sparse high season to corresponding adjustments to CPU utilization rate. For adjustments in the dense high season, the resource prediction model depicted in chart 604 may be used.
  • Once the evaluation phase is complete, what-if analytic 138 presents the time-series datasets for one or more demands and one or more resources, including any generated adjustments (Operation 770). The presentation of the adjusted time-series datasets may vary from implementation to implementation. For example, the time-series datasets may be displayed via an interactive display that is coupled to a client computing system. The interactive display may allow a user to drill down to view adjustments to individual demands, resources, or custom groups of demands and/or resources.
  • In one or more embodiments, an interactive display presenting the results of a historical simulation may allow a user to select responsive actions, including one or more of the responsive actions previously described, for the system to perform. For example, system 100 may adjust the configurations of targets 112 a-i, deploy additional resources, and/or perform load balancing to replicate a simulated scenario.
  • In one or more embodiments, what-if analytic 138 may present an uncertainty interval for a historical simulation. An uncertainty interval may be a visualization or other indication of a degree of uncertainty in the predicted adjustments. For example, during training, correlation modelling logic 132 may calculate and cache the uncertainties for possible combinations of evaluation. For a model that maps demand X to resource (or demand) Y, these possibilities include the uncertainty of predicting Y from X and predicting X from Y.
  • In one or more embodiments, the uncertainty interval is a range of values that includes an upper limit and a lower limit within which at least a certain threshold of values falls with a given level of confidence. For example, a historical CPU utilization rate may be observed to be 50%. In response to an adjustment of three times the amount of users, the projected/adjusted CPU utilization rate may jump to 65% (these values are given by way of example only and may vary from system to system). Based on historical patterns, the lower limit of the uncertainty interval may be 55%, and the upper limit may be 75% with 95% confidence. The uncertainty interval thus gives a range of expected values within a prescribed level of confidence.
  • FIG. 8A illustrates an example presentation of propagating adjustments in one demand to another demand, in accordance with one or more embodiments. The scenario parameters for this simulation may be defined as follows: “What if my system saw three times more executions per second in the last five months?” Based on the simulation, what-if analytic 138 may present chart 800, which illustrates an adjustment to a historical time-series for executions per second. Time-series plot 804 depicts historical executions per second that were performed between May and September. Time-series plot 802 depicts an example adjustment that increases the number of executions by three times.
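  • The text above does not prescribe how the interval is computed. One simple possibility, sketched here, assumes the residuals of a fitted linear model are normally distributed with constant variance and ignores uncertainty in the fitted coefficients. The function names and sample values are hypothetical.

```python
import math
from statistics import NormalDist

def residual_std(model, xs, ys):
    """Standard deviation of the residuals around a (slope, intercept) line."""
    slope, intercept = model
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    dof = len(xs) - 2  # two fitted parameters: slope and intercept
    return math.sqrt(sum(r * r for r in residuals) / dof)

def uncertainty_interval(prediction, sigma, confidence=0.95):
    """Symmetric interval around the prediction under a normal assumption."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return prediction - z * sigma, prediction + z * sigma

# Hypothetical training points scattered around cpu = 2 + 0.1 * executions
sigma_example = residual_std(
    (0.1, 2.0), [100.0, 200.0, 300.0, 400.0], [13.0, 21.0, 33.0, 41.0]
)

# With a projected CPU utilization of 65% and an assumed sigma of 5 points,
# the 95% interval is roughly 55.2% to 74.8%.
lower, upper = uncertainty_interval(65.0, 5.0)
```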
  • What-if analytic 138 may further present chart 810 as part of the historical simulation. Time-series plot 818 depicts historical transactions per second that were performed between May and September. Using a demand propagation model, the transactions per second are adjusted based on the three times increase in executions per second. Time-series plot 812 depicts the adjustment. Plot 814 depicts an upper bound for the uncertainty interval, and plot 816 depicts a lower bound for the uncertainty interval.
  • FIG. 8B illustrates an example presentation of adjustments generated for resource time-series during the historical simulation defined with reference to FIG. 8A. Chart 820 depicts the historical CPU utilization rate (time-series plot 830). Chart 820 further depicts a projected/adjusted CPU utilization rate (time-series plot 824) computed based on the three times increase to executions per second using a resource prediction model. Plot 826 depicts the upper bound in the uncertainty interval, and plot 828 depicts the lower bound. Time-series plot 822 depicts three times the CPU utilization rate, which may be omitted from the scenario output but is shown for purposes of illustration. As can be seen, the projected CPU utilization is much lower than multiplying the CPU utilization rate by the same amount as the increase in executions per second. The training of time-series models as previously described allows for a much more complex and meaningful analysis of the various interactions between demands and resource performance.
  • Chart 840 depicts the historical number of physical writes per second (time-series plot 850). Chart 840 further depicts a projected/adjusted number of physical writes per second (time-series plot 844) based on the adjusted demands. Plot 846 depicts the upper bound of the uncertainty interval, and plot 848 depicts the lower bound of the interval. Time-series plot 842 depicts three times the historical physical writes per second rate, which, as can be seen, is inaccurate in comparison to the adjusted value, which uses the trained models to account for complex interactions within the system.
  • 6. Forecast Scenario Simulations
  • In one or more embodiments, what-if analytic 138 is configured to perform forecast what-if scenario simulations. With forecast scenario simulations, modifications may be made to historical and/or forecast time-series data to see how the system is predicted to behave. In contrast to historical simulations, forecast simulations include modelling changes in future demands and/or resources. The adjustments may account for changes, trends, and seasonal patterns as determined from historical data collected from targets 112 a-i.
  • FIG. 9 illustrates an example set of operations for simulating a forecast scenario, in accordance with one or more embodiments. The set of operations includes generating a set of forecasts for each demand and resource time-series (Operation 910). In one or more embodiments, forecast modelling logic 136 is configured to generate a forecast that accounts for seasonal patterns and trends in the time-series data. For example, forecast modelling logic 136 may generate an Additive or Multiplicative Holt-Winters forecasting model. The Holt-Winters models and other example forecasting models that may be trained are described in U.S. application Ser. No. 15/266,971, entitled “SEASONAL AWARE METHOD FOR FORECASTING AND CAPACITY PLANNING”, previously incorporated by reference.
  • The process further includes receiving a set of forecast simulation parameters (Operation 920). In one or more embodiments, the scenario parameters define a future adjustment. For example, a forecast simulation scenario may be defined as “What if there are three times the amount of forecasted users in the next three months?” As another example, a forecast simulation scenario may be defined as “What if there are ten percent more transactions than forecasted for next week?” In both examples, the scenario defines an adjustment to a forecasted value and a period in the future to analyze. In other cases, a default timeframe to analyze may be chosen rather than explicitly defined. For example, a user may submit a scenario definition as follows: “What if there are ten percent more transactions on database host DBHost than forecasted?” In this scenario definition, the user identifies an adjustment/modification and a particular target resource, but does not define the timeframe. Any default timeframe (e.g., the next week, month, six months, etc.) may be selected for analysis in this case.
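  • For orientation, a textbook form of additive Holt-Winters smoothing is sketched below. It is not taken from the referenced application; the initialization scheme, the smoothing parameters, and the function name are assumptions chosen for brevity.

```python
def holt_winters_additive(series, season_length, horizon,
                          alpha=0.5, beta=0.1, gamma=0.1):
    """Forecast `horizon` steps ahead with level, trend, and seasonal terms."""
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    first, second = series[:m], series[m:2 * m]
    level = sum(first) / m                        # initial level
    trend = (sum(second) - sum(first)) / (m * m)  # average per-step change
    seasonal = [y - level for y in first]         # initial seasonal offsets
    for t in range(m, len(series)):
        y = series[t]
        s = seasonal[t % m]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y - level) + (1 - gamma) * s
    n = len(series)
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]

# A hypothetical, perfectly periodic demand with a season length of four
history = [10.0, 20.0, 30.0, 20.0] * 3
forecast = holt_winters_additive(history, season_length=4, horizon=4)
```

  • On a perfectly periodic input such as this one, the forecast continues the same pattern; on real data the level, trend, and seasonal terms each adapt at the rate set by their smoothing parameter.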
  • Responsive to receiving the forecast scenario parameters, what-if analytic 138 adjusts a historical and/or forecasted value by a prescribed amount (Operation 930). For example, if the scenario requests a simulation of a scenario with three times more transactions than forecasted, then what-if analytic 138 may adjust the forecast for future transactions per second projected for targets 112 a-i to treble the values. Adjustments may be made to values over a prescribed or default timeframe, depending on the particular implementation.
  • During the forecast scenario simulation, what-if analytic 138 determines whether to propagate changes to the prescribed one or more demands to other demands (Operation 940). Similar to the historical simulation, adjustments may be propagated if the scenario is making adjustments to individual demands rather than user changes. If the scenario is defining a user change in the future, then the adjustment may be applied across all demands, and the process may skip to Operation 960. If the adjustment is to a prescribed demand or set of demands, then what-if analytic 138 may determine whether there are any associated demand propagation models involving the adjusted demand(s) as described above for the historical scenario simulation.
  • If a trained demand propagation model is available, then what-if analytic 138 uses the demand propagation model to propagate the adjustment to other, related demands (Operation 950). For example, an adjusted forecast for executions per second may be mapped to an adjusted forecast for transactions per second using the demand propagation model depicted in chart 304. The mapped values in this case incorporate the trend and seasonal factors from the forecast. The demand propagation models, as well as the resource prediction models, are also reversible as previously described.
  • Once the adjustments to the demand forecasts are complete, what-if analytic 138 determines whether there are any resource prediction models associated with the adjusted demands (Operation 960). For example, what-if analytic 138 may search data store 140 for a trained resource prediction model that maps an adjusted demand to a resource performance metric. In some cases, a resource prediction model may not be available. This scenario may occur if none of the adjusted demands are correlated with a relevant resource performance metric. If none of the adjusted demands are correlated to resource performance, then the process may proceed without making any adjustments to a resource performance metric.
  • If a resource prediction model is available, then what-if analytic 138 may use the resource prediction model to generate an adjustment to at least one resource performance time-series (Operation 970). For example, a resource prediction model may map an adjusted value for executions per second to a corresponding CPU utilization value. What-if analytic 138 may adjust a forecast CPU utilization rate to the value mapped to the adjusted executions per second value. Similar to the demand adjustments, the resource performance adjustments may incorporate the trend and/or seasonal factors of the forecast.
  • As with the historical simulation, an adjustment to a forecast resource performance metric may be determined based on multiple resource prediction models. Additionally or alternatively, adjustments may be performed based on seasonal pattern classifications. For example, if a forecast scenario simulation is being run for a prescribed seasonal pattern (e.g., a high season, sparse high, low season, etc.), then what-if analytic 138 may search data store 140 for a mapping between the prescribed seasonal pattern and a corresponding set of resource prediction models. What-if analytic 138 may then select/use one or more resource prediction models to map adjusted demand forecasts to adjusted resource performance forecasts for the prescribed seasonal pattern. In the event that the simulation spans multiple seasons, the models may be isolated to regions of time based on collective seasons.
  • Once the evaluation phase is complete, what-if analytic 138 presents the time-series datasets for one or more demand forecasts and one or more resource performance forecasts, including any generated adjustments (Operation 980). The presentation of the adjusted time-series datasets may vary from implementation to implementation. For example, the time-series datasets may be displayed via an interactive display that is coupled to a client computing system. The interactive display may allow a user to drill down to view adjustments to individual demands, resources, or custom groups of demands and/or resources.
  • In one or more embodiments, an interactive display presenting the results of a forecast scenario simulation may allow a user to select responsive actions, including one or more of the responsive actions previously described, for the system to perform. For example, system 100 may adjust the configurations of targets 112 a-i, deploy additional resources, and/or perform load balancing to replicate a simulated scenario.
  • As with the historical simulation, what-if analytic 138 may present an uncertainty interval for a forecast scenario simulation. However, in the case of a forecast scenario simulation, the original forecast may already be associated with an uncertainty interval, as opposed to historical observations. During training, forecast modelling logic 136 and/or correlation modelling logic 132 may calculate and cache the uncertainties for possible combinations of evaluation. For a model that maps demand X to resource (or demand) Y, these possibilities include the uncertainty of predicting a forecasted Y from a forecasted X, predicting a forecasted X from a forecasted Y, predicting a forecasted Y from X, and predicting a forecasted X from Y.
  • In one or more embodiments, what-if analytic 138 may modify the uncertainty interval when adjusting the forecasted values. For example, an original forecast may have an interval that is computed based on the forecast model. During the forecast scenario simulation, the resource prediction model may introduce additional uncertainty. The uncertainty interval may be shifted around the changed forecast values and/or expanded to reflect the uncertainty of predicting a change in the forecasted performance metric based on a change to the historical and/or forecasted demand metrics.
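  • One possible way to shift and expand an uncertainty interval is sketched below. The specification does not state how the two uncertainties are combined; the sketch assumes the forecast error and the resource prediction model error are independent and normally distributed, so their standard deviations add in quadrature. The function name and parameters are hypothetical.

```python
import math


def adjust_interval(new_forecast, forecast_sigma, model_sigma, z=1.96):
    """Return (lower, upper) centered on the changed forecast value.

    Assumption: the two error sources are independent, so their standard
    deviations combine in quadrature. z=1.96 gives roughly a 95% interval
    under a normal error model.
    """
    combined_sigma = math.sqrt(forecast_sigma ** 2 + model_sigma ** 2)
    half_width = z * combined_sigma
    return new_forecast - half_width, new_forecast + half_width


# The original interval would have half-width z * 3.0 = 6.0; the added model
# uncertainty widens it to 10.0 and re-centers it on the adjusted forecast.
lower, upper = adjust_interval(
    new_forecast=100.0, forecast_sigma=3.0, model_sigma=4.0, z=2.0
)
print(lower, upper)
```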
  • FIG. 10A illustrates an example set of demand time-series that were adjusted during a forecast scenario simulation, in accordance with one or more embodiments. The scenario parameters for this simulation may be defined as follows: “What if my system saw five times more executions per second in the next two months?” Based on the simulation, what-if analytic 138 may present chart 1000, which illustrates an adjusted forecast (time-series plot 1004) for executions per second. Time-series plot 1002 depicts historical executions per second leading up to the forecast.
  • What-if analytic 138 may further present chart 1010 as part of the forecast scenario simulation. Time-series plot 1012 depicts historical size of redo data generated per second between March and July leading up to the forecast. Using a demand propagation model, the forecasted redo generated per second is adjusted, as depicted in time-series plot 1014, based on the five times increase in forecasted executions per second.
  • FIG. 10B illustrates an example set of resource time-series that were adjusted during a forecast scenario simulation, in accordance with one or more embodiments. Chart 1020 depicts the historical number of physical reads per second (time-series plot 1022) and an adjusted forecast (time-series plot 1024). Chart 1030 depicts the historical physical writes per second (time-series plot 1032) and an adjusted forecast (time-series plot 1034).
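  • The chain of adjustments in this example (a change to one demand, propagated to a second demand, then mapped to a resource metric) can be sketched as follows. The linear models, their coefficients, and the forecast values are illustrative assumptions and are not taken from FIG. 10A or FIG. 10B.

```python
def scale_forecast(values, factor):
    """Apply the scenario change (e.g., five times more executions) to a forecast."""
    return [v * factor for v in values]


def propagate(values, slope, intercept):
    """Hypothetical linear model mapping one time-series onto another."""
    return [slope * v + intercept for v in values]


# Illustrative forecast of executions per second for two future time steps.
executions_forecast = [100.0, 120.0]

# First change: the user-specified scenario.
adjusted_executions = scale_forecast(executions_forecast, 5)

# Demand propagation model: executions per second -> redo generated per second.
adjusted_redo = propagate(adjusted_executions, slope=2.0, intercept=50.0)

# Resource prediction model: redo generated per second -> physical writes per second.
adjusted_writes = propagate(adjusted_redo, slope=0.5, intercept=10.0)

print(adjusted_executions, adjusted_redo, adjusted_writes)
```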
  • 7. Computer Networks and Cloud Networks
  • In one or more embodiments, a computer network provides connectivity among a set of nodes. The nodes may be local to and/or remote from each other. The nodes are connected by a set of links. Examples of links include a coaxial cable, an unshielded twisted cable, a copper cable, an optical fiber, and a virtual link.
  • A subset of nodes implements the computer network. Examples of such nodes include a switch, a router, a firewall, and a network address translator (NAT). Another subset of nodes uses the computer network. Such nodes (also referred to as “hosts”) may execute a client process and/or a server process. A client process makes a request for a computing service (such as, execution of a particular application, and/or storage of a particular amount of data). A server process responds by executing the requested service and/or returning corresponding data.
  • A computer network may be a physical network, including physical nodes connected by physical links. A physical node is any digital device. A physical node may be a function-specific hardware device, such as a hardware switch, a hardware router, a hardware firewall, and a hardware NAT. Additionally or alternatively, a physical node may be a generic machine that is configured to execute various virtual machines and/or applications performing respective functions. A physical link is a physical medium connecting two or more physical nodes. Examples of links include a coaxial cable, an unshielded twisted cable, a copper cable, and an optical fiber.
  • A computer network may be an overlay network. An overlay network is a logical network implemented on top of another network (such as, a physical network). Each node in an overlay network corresponds to a respective node in the underlying network. Hence, each node in an overlay network is associated with both an overlay address (to address the overlay node) and an underlay address (to address the underlay node that implements the overlay node). An overlay node may be a digital device and/or a software process (such as, a virtual machine, an application instance, or a thread). A link that connects overlay nodes is implemented as a tunnel through the underlying network. The overlay nodes at either end of the tunnel treat the underlying multi-hop path between them as a single logical link. Tunneling is performed through encapsulation and decapsulation.
  • In an embodiment, a client may be local to and/or remote from a computer network. The client may access the computer network over other computer networks, such as a private network or the Internet. The client may communicate requests to the computer network using a communications protocol, such as Hypertext Transfer Protocol (HTTP). The requests are communicated through an interface, such as a client interface (such as a web browser), a program interface, or an application programming interface (API).
  • In an embodiment, a computer network provides connectivity between clients and network resources. Network resources include hardware and/or software configured to execute server processes. Examples of network resources include a processor, a data storage, a virtual machine, a container, and/or a software application. Network resources are shared amongst multiple clients. Clients request computing services from a computer network independently of each other. Network resources are dynamically assigned to the requests and/or clients on an on-demand basis. Network resources assigned to each request and/or client may be scaled up or down based on, for example, (a) the computing services requested by a particular client, (b) the aggregated computing services requested by a particular tenant, and/or (c) the aggregated computing services requested of the computer network. Such a computer network may be referred to as a “cloud network.”
  • In an embodiment, a service provider provides a cloud network to one or more end users. Various service models may be implemented by the cloud network, including but not limited to Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), and Infrastructure-as-a-Service (IaaS). In SaaS, a service provider provides end users the capability to use the service provider's applications, which are executing on the network resources. In PaaS, the service provider provides end users the capability to deploy custom applications onto the network resources. The custom applications may be created using programming languages, libraries, services, and tools supported by the service provider. In IaaS, the service provider provides end users the capability to provision processing, storage, networks, and other fundamental computing resources provided by the network resources. Any arbitrary applications, including an operating system, may be deployed on the network resources.
  • In an embodiment, various deployment models may be implemented by a computer network, including but not limited to a private cloud, a public cloud, and a hybrid cloud. In a private cloud, network resources are provisioned for exclusive use by a particular group of one or more entities (the term “entity” as used herein refers to a corporation, organization, person, or other entity). The network resources may be local to and/or remote from the premises of the particular group of entities. In a public cloud, cloud resources are provisioned for multiple entities that are independent from each other (also referred to as “tenants” or “customers”). The computer network and the network resources thereof are accessed by clients corresponding to different tenants. Such a computer network may be referred to as a “multi-tenant computer network.” Several tenants may use a same particular network resource at different times and/or at the same time. The network resources may be local to and/or remote from the premises of the tenants. In a hybrid cloud, a computer network comprises a private cloud and a public cloud. An interface between the private cloud and the public cloud allows for data and application portability. Data stored at the private cloud and data stored at the public cloud may be exchanged through the interface. Applications implemented at the private cloud and applications implemented at the public cloud may have dependencies on each other. A call from an application at the private cloud to an application at the public cloud (and vice versa) may be executed through the interface.
  • In an embodiment, tenants of a multi-tenant computer network are independent of each other. For example, a business or operation of one tenant may be separate from a business or operation of another tenant. Different tenants may demand different network requirements for the computer network. Examples of network requirements include processing speed, amount of data storage, security requirements, performance requirements, throughput requirements, latency requirements, resiliency requirements, Quality of Service (QoS) requirements, tenant isolation, and/or consistency. The same computer network may need to implement different network requirements demanded by different tenants.
  • In one or more embodiments, in a multi-tenant computer network, tenant isolation is implemented to ensure that the applications and/or data of different tenants are not shared with each other. Various tenant isolation approaches may be used.
  • In an embodiment, each tenant is associated with a tenant ID. Each network resource of the multi-tenant computer network is tagged with a tenant ID. A tenant is permitted access to a particular network resource only if the tenant and the particular network resources are associated with a same tenant ID.
  • In an embodiment, each tenant is associated with a tenant ID. Each application, implemented by the computer network, is tagged with a tenant ID. Additionally or alternatively, each data structure and/or dataset, stored by the computer network, is tagged with a tenant ID. A tenant is permitted access to a particular application, data structure, and/or dataset only if the tenant and the particular application, data structure, and/or dataset are associated with a same tenant ID.
  • As an example, each database implemented by a multi-tenant computer network may be tagged with a tenant ID. Only a tenant associated with the corresponding tenant ID may access data of a particular database. As another example, each entry in a database implemented by a multi-tenant computer network may be tagged with a tenant ID. Only a tenant associated with the corresponding tenant ID may access data of a particular entry. However, the database may be shared by multiple tenants.
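  • Entry-level tagging in a shared database can be sketched as follows. The in-memory list and the field names are hypothetical stand-ins for a real database and schema.

```python
# Hypothetical shared table in which every entry carries a tenant ID tag.
ENTRIES = [
    {"tenant_id": "t1", "value": "alpha"},
    {"tenant_id": "t2", "value": "beta"},
    {"tenant_id": "t1", "value": "gamma"},
]


def entries_for(tenant_id, entries):
    """Return only the entries tagged with the requesting tenant's ID."""
    return [e["value"] for e in entries if e["tenant_id"] == tenant_id]


visible = entries_for("t1", ENTRIES)
print(visible)
```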
  • In an embodiment, a subscription list indicates which tenants have authorization to access which applications. For each application, a list of tenant IDs of tenants authorized to access the application is stored. A tenant is permitted access to a particular application only if the tenant ID of the tenant is included in the subscription list corresponding to the particular application.
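  • A subscription list check of this kind might look like the following sketch. The application names and tenant IDs are hypothetical.

```python
# Hypothetical subscription lists: application -> IDs of authorized tenants.
SUBSCRIPTIONS = {
    "billing-app": {"t1", "t2"},
    "reports-app": {"t2"},
}


def may_access(tenant_id, application):
    """Permit access only if the tenant ID is in the application's subscription list."""
    return tenant_id in SUBSCRIPTIONS.get(application, set())


print(may_access("t1", "billing-app"), may_access("t1", "reports-app"))
```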
  • In an embodiment, network resources (such as digital devices, virtual machines, application instances, and threads) corresponding to different tenants are isolated to tenant-specific overlay networks maintained by the multi-tenant computer network. As an example, packets from any source device in a tenant overlay network may only be transmitted to other devices within the same tenant overlay network. Encapsulation tunnels are used to prohibit any transmissions from a source device on a tenant overlay network to devices in other tenant overlay networks. Specifically, the packets, received from the source device, are encapsulated within an outer packet. The outer packet is transmitted from a first encapsulation tunnel endpoint (in communication with the source device in the tenant overlay network) to a second encapsulation tunnel endpoint (in communication with the destination device in the tenant overlay network). The second encapsulation tunnel endpoint decapsulates the outer packet to obtain the original packet transmitted by the source device. The original packet is transmitted from the second encapsulation tunnel endpoint to the destination device in the same particular overlay network.
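  • The encapsulation and decapsulation steps above can be sketched as follows. The device names, addresses, and lookup tables are hypothetical, and a real tunnel endpoint would serialize packets onto the underlying network rather than pass Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes


@dataclass(frozen=True)
class OuterPacket:
    src_endpoint: str  # underlay address of the first tunnel endpoint
    dst_endpoint: str  # underlay address of the second tunnel endpoint
    overlay_id: str    # identifies the tenant overlay network
    inner: Packet


# Hypothetical tables: device -> tenant overlay, device -> tunnel endpoint.
OVERLAY_OF = {"vm-a": "tenant-1", "vm-b": "tenant-1", "vm-c": "tenant-2"}
ENDPOINT_OF = {"vm-a": "10.0.0.1", "vm-b": "10.0.0.2", "vm-c": "10.0.0.3"}


def encapsulate(packet):
    """Wrap a packet for the tunnel, refusing cross-overlay destinations."""
    if OVERLAY_OF[packet.src] != OVERLAY_OF[packet.dst]:
        raise PermissionError("destination is in a different tenant overlay network")
    return OuterPacket(
        ENDPOINT_OF[packet.src],
        ENDPOINT_OF[packet.dst],
        OVERLAY_OF[packet.src],
        packet,
    )


def decapsulate(outer):
    """Recover the original packet at the second tunnel endpoint."""
    return outer.inner


original = Packet("vm-a", "vm-b", b"hello")
delivered = decapsulate(encapsulate(original))
print(delivered == original)
```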
  • 8. Microservice Applications
  • According to one or more embodiments, the techniques described herein are implemented in a microservice architecture. A microservice in this context refers to software logic designed to be independently deployable, having endpoints that may be logically coupled to other microservices to build a variety of applications. Applications built using microservices are distinct from monolithic applications, which are designed as a single fixed unit and generally comprise a single logical executable. With microservice applications, different microservices are independently deployable as separate executables. Microservices may communicate using HyperText Transfer Protocol (HTTP) messages and/or according to other communication protocols via API endpoints. Microservices may be managed and updated separately, written in different languages, and be executed independently from other microservices.
  • Microservices provide flexibility in managing and building applications. Different applications may be built by connecting different sets of microservices without changing the source code of the microservices. Thus, the microservices act as logical building blocks that may be arranged in a variety of ways to build different applications. Microservices may provide monitoring services that notify a microservices manager (such as If-This-Then-That (IFTTT), Zapier, or Oracle Self-Service Automation (OSSA)) when trigger events from a set of trigger events exposed to the microservices manager occur. Microservices exposed for an application may alternatively or additionally provide action services that perform an action in the application (controllable and configurable via the microservices manager by passing in values, connecting the actions to other triggers and/or data passed along from other actions in the microservices manager) based on data received from the microservices manager. The microservice triggers and/or actions may be chained together to form recipes of actions that occur in optionally different applications that are otherwise unaware of or have no control or dependency on each other. These managed applications may be authenticated or plugged in to the microservices manager, for example, with user-supplied application credentials to the manager, without requiring reauthentication each time the managed application is used alone or in combination with other applications.
  • In one or more embodiments, microservices may be connected via a GUI. For example, microservices may be displayed as logical blocks within a window, frame, or other element of a GUI. A user may drag and drop microservices into an area of the GUI used to build an application. The user may connect the output of one microservice into the input of another microservice using directed arrows or any other GUI element. The application builder may run verification tests to confirm that the output and inputs are compatible (e.g., by checking the datatypes, size restrictions, etc.).
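  • A verification test of the kind mentioned above might be sketched as follows. The `Port` type and its two fields are assumptions chosen to mirror the datatype and size checks named in the example; an actual application builder may check other properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical description of a microservice output or input."""
    datatype: str
    max_bytes: int


def compatible(output, input_):
    """An output may feed an input if the datatypes match and the largest
    value the output can emit fits within the input's size restriction."""
    return output.datatype == input_.datatype and output.max_bytes <= input_.max_bytes


ok = compatible(Port("json", 1024), Port("json", 4096))
too_big = compatible(Port("json", 8192), Port("json", 4096))
wrong_type = compatible(Port("csv", 1024), Port("json", 4096))
print(ok, too_big, wrong_type)
```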
  • Triggers
  • The techniques described above may be encapsulated into a microservice, according to one or more embodiments. In other words, a microservice may trigger a notification (into the microservices manager for optional use by other plugged in applications, herein referred to as the “target” microservice) based on the above techniques and/or may be represented as a GUI block and connected to one or more other microservices. The trigger condition may include absolute or relative thresholds for values, and/or absolute or relative thresholds for the amount or duration of data to analyze, such that the trigger to the microservices manager occurs whenever a plugged-in microservice application detects that a threshold is crossed. For example, a user may request a trigger into the microservices manager when the microservice application detects a value has crossed a triggering threshold.
  • In one embodiment, the trigger, when satisfied, might output data for consumption by the target microservice. In another embodiment, the trigger, when satisfied, outputs a binary value indicating the trigger has been satisfied, or outputs the name of the field or other context information for which the trigger condition was satisfied. Additionally or alternatively, the target microservice may be connected to one or more other microservices such that an alert is input to the other microservices. Other microservices may perform responsive actions based on the above techniques, including, but not limited to, deploying additional resources, adjusting system configurations, and/or generating GUIs.
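  • A trigger with an absolute threshold can be sketched as follows. The function name, the field name, and the shape of the output record are hypothetical; the sketch returns both a satisfied flag and context information, which the description above presents as alternative outputs.

```python
def evaluate_trigger(field, values, threshold):
    """Return trigger output when any value exceeds an absolute threshold, else None."""
    crossed = [v for v in values if v > threshold]
    if not crossed:
        return None
    # Output consumed by the target microservice: a satisfied flag plus context.
    return {"satisfied": True, "field": field, "peak": max(crossed)}


alert = evaluate_trigger("cpu_utilization", [0.42, 0.91, 0.88], threshold=0.9)
quiet = evaluate_trigger("cpu_utilization", [0.42, 0.55], threshold=0.9)
print(alert, quiet)
```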
  • Actions
  • In one or more embodiments, a plugged-in microservice application may expose actions to the microservices manager. The exposed actions may receive, as input, data or an identification of a data object or location of data, that causes data to be moved into a data cloud.
  • In one or more embodiments, the exposed actions may receive, as input, a request to increase or decrease existing alert thresholds. The input might identify existing in-application alert thresholds and whether to increase or decrease, or delete the threshold. Additionally or alternatively, the input might request the microservice application to create new in-application alert thresholds. The in-application alerts may trigger alerts to the user while logged into the application, or may trigger alerts to the user using default or user-selected alert mechanisms available within the microservice application itself, rather than through other applications plugged into the microservices manager.
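  • An exposed action that creates, increases, decreases, or deletes an in-application alert threshold might be sketched as follows. The action names and the dictionary representation of thresholds are assumptions made for illustration.

```python
def apply_threshold_action(thresholds, action, name, amount=0.0):
    """Return a new threshold mapping after one action; the input is not mutated."""
    updated = dict(thresholds)
    if action == "create":
        updated[name] = amount
    elif action == "delete":
        updated.pop(name, None)
    elif action == "increase":
        updated[name] = updated[name] + amount
    elif action == "decrease":
        updated[name] = updated[name] - amount
    else:
        raise ValueError(f"unknown action: {action}")
    return updated


thresholds = {"latency_ms": 200.0}
thresholds = apply_threshold_action(thresholds, "increase", "latency_ms", 50.0)
thresholds = apply_threshold_action(thresholds, "create", "error_rate", 0.05)
print(thresholds)
```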
  • In one or more embodiments, the microservice application may generate and provide an output based on input that identifies, locates, or provides historical data, and defines the extent or scope of the requested output. The action, when triggered, causes the microservice application to provide, store, or display the output, for example, as a data model or as aggregate data that describes a data model.
  • 9. Hardware Overview
  • According to one or more embodiments, the techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
  • For example, FIG. 11 is a block diagram that illustrates computer system 1100 upon which one or more embodiments may be implemented. Computer system 1100 includes bus 1102 or other communication mechanism for communicating information, and hardware processor 1104 coupled with bus 1102 for processing information. Hardware processor 1104 may be, for example, a general purpose microprocessor.
  • Computer system 1100 also includes main memory 1106, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 1102 for storing information and instructions to be executed by processor 1104. Main memory 1106 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 1104. Such instructions, when stored in non-transitory storage media accessible to processor 1104, render computer system 1100 into a special-purpose machine that is customized to perform the operations specified in the instructions.
  • Computer system 1100 further includes read only memory (ROM) 1108 or other static storage device coupled to bus 1102 for storing static information and instructions for processor 1104. Storage device 1110, such as a magnetic disk or optical disk, is provided and coupled to bus 1102 for storing information and instructions.
  • Computer system 1100 may be coupled via bus 1102 to display 1112, such as a cathode ray tube (CRT), liquid crystal display (LCD), or light-emitting diode (LED), for displaying information to a computer user. Input device 1114, which may include physical and/or touchscreen based alphanumeric keys, is coupled to bus 1102 for communicating information and command selections to processor 1104. Another type of user input device is cursor control 1116, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 1104 and for controlling cursor movement on display 1112. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
  • Computer system 1100 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 1100 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 1100 in response to processor 1104 executing one or more sequences of one or more instructions contained in main memory 1106. Such instructions may be read into main memory 1106 from another storage medium, such as storage device 1110. Execution of the sequences of instructions contained in main memory 1106 causes processor 1104 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
  • The term “storage media” as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 1110. Volatile media includes dynamic memory, such as main memory 1106. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, and any other memory chip or cartridge.
  • Storage media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 1102. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
  • Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 1104 for execution. For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 1100 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 1102. Bus 1102 carries the data to main memory 1106, from which processor 1104 retrieves and executes the instructions. The instructions received by main memory 1106 may optionally be stored on storage device 1110 either before or after execution by processor 1104.
  • Computer system 1100 also includes a communication interface 1118 coupled to bus 1102. Communication interface 1118 provides a two-way data communication coupling to a network link 1120 that is connected to local network 1122. For example, communication interface 1118 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 1118 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 1118 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
  • Network link 1120 typically provides data communication through one or more networks to other data devices. For example, network link 1120 may provide a connection through local network 1122 to host computer 1124 or to data equipment operated by Internet Service Provider (ISP) 1126. ISP 1126 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 1128. Local network 1122 and Internet 1128 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 1120 and through communication interface 1118, which carry the digital data to and from computer system 1100, are example forms of transmission media.
  • Computer system 1100 can send messages and receive data, including program code, through the network(s), network link 1120 and communication interface 1118. In the Internet example, server 1130 might transmit a requested code for an application program through Internet 1128, ISP 1126, local network 1122 and communication interface 1118.
  • The received code may be executed by processor 1104 as it is received, and/or stored in storage device 1110, or other non-volatile storage for later execution.
  • 10. Miscellaneous; Extensions
  • Embodiments are directed to a system with one or more devices that include a hardware processor and that are configured to perform any of the operations described herein and/or recited in any of the claims below.
  • In an embodiment, a non-transitory computer readable storage medium comprises instructions which, when executed by one or more hardware processors, cause performance of any of the operations described herein and/or recited in any of the claims.
  • Any combination of the features and functionalities described herein may be used in accordance with one or more embodiments. In the foregoing specification, embodiments have been described with reference to numerous specific details that may vary from implementation to implementation. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The sole and exclusive indicator of the scope of the invention, and what is intended by the applicants to be the scope of the invention, is the literal and equivalent scope of the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction.

Claims (20)

What is claimed is:
1. A method comprising:
receiving a request to perform a what-if simulation for one or more performance metrics of a system based on a first change to a first demand on one or more system resources;
responsive to receiving the request to perform the what-if simulation, generating a first prediction of at least a second change to a second demand on the one or more system resources; and
based at least on the first change to the first demand and the second change to the second demand, generating a second prediction of a third change to the one or more performance metrics of the system.
2. The method of claim 1, wherein the first change is a change to at least one historical value that corresponds to a measurement of the first demand on the one or more system resources.
3. The method of claim 1, wherein generating the second prediction of the third change to the one or more performance metrics of the system comprises predicting a change to at least one historical value that corresponds to a measurement of the one or more performance metrics of the system.
4. The method of claim 1, wherein the second prediction of the third change to the one or more performance metrics of the system comprises a change to at least one future value that corresponds to a forecast of the one or more performance metrics of the system.
5. The method of claim 1, wherein the second prediction of the third change to the one or more performance metrics of the system comprises a change to an uncertainty interval in a forecast of the one or more performance metrics of the system.
6. The method of claim 1, further comprising presenting a historical or forecast simulation based on at least the third change to the one or more performance metrics of the system.
7. The method of claim 1, wherein at least one of the first prediction or the second prediction is generated based on one or more seasonal patterns.
8. The method of claim 1, wherein the request is received from a tenant of a cloud service; and wherein responsive to the request the cloud service presents, to the tenant, or stores, at a location accessible to the tenant, a result of the what-if simulation.
9. The method of claim 1, further comprising configuring one or more system resources based on a result of the what-if simulation.
10. One or more non-transitory computer-readable media storing instructions that, when executed by one or more hardware processors, cause:
receiving a request to perform a what-if simulation for one or more performance metrics of a system based on a first change to a first demand on one or more system resources;
responsive to receiving the request to perform the what-if simulation, generating a first prediction of at least a second change to a second demand on the one or more system resources; and
based at least on the first change to the first demand and the second change to the second demand, generating a second prediction of a third change to the one or more performance metrics of the system.
11. The media of claim 10, wherein the first change is a change to at least one historical value that corresponds to a measurement of the first demand on the one or more system resources.
12. The media of claim 10, wherein generating the second prediction of the third change to the one or more performance metrics of the system comprises predicting a change to at least one historical value that corresponds to a measurement of the one or more performance metrics of the system.
13. The media of claim 10, wherein the second prediction of the third change to the one or more performance metrics of the system comprises a change to at least one future value that corresponds to a forecast of the one or more performance metrics of the system.
14. The media of claim 10, wherein the second prediction of the third change to the one or more performance metrics of the system comprises a change to an uncertainty interval in a forecast of the one or more performance metrics of the system.
15. The media of claim 10, wherein the instructions further cause presenting a historical or forecast simulation based on at least the third change to the one or more performance metrics of the system.
16. The media of claim 10, wherein at least one of the first prediction or the second prediction is generated based on one or more seasonal patterns.
17. The media of claim 10, wherein the request is received from a tenant of a cloud service; and wherein responsive to the request the cloud service presents, to the tenant, or stores, at a location accessible to the tenant, a result of the what-if simulation.
18. The media of claim 10, further comprising configuring one or more system resources based on a result of the what-if simulation.
19. A computing system comprising:
one or more hardware processors;
one or more non-transitory computer-readable media storing instructions that, when executed by the one or more hardware processors, cause:
receiving a request to perform a what-if simulation for one or more performance metrics of a system based on a first change to a first demand on one or more system resources;
responsive to receiving the request to perform the what-if simulation, generating a first prediction of at least a second change to a second demand on the one or more system resources; and
based at least on the first change to the first demand and the second change to the second demand, generating a second prediction of a third change to the one or more performance metrics of the system.
20. The computing system of claim 19, wherein the first change is a change to at least one historical value that corresponds to a measurement of the first demand on the one or more system resources.
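
The two-stage flow recited in claims 10 and 19 (a first prediction of a knock-on change to a second demand, then a second prediction of the resulting change to a performance metric) can be illustrated with a minimal sketch. The claims do not specify a model; the linear least-squares fits below, along with every function and variable name, are assumptions chosen purely for illustration and are not the applicant's implementation.

```python
from statistics import mean


def _centered(values):
    """Return the values with their mean subtracted."""
    m = mean(values)
    return [v - m for v in values]


def _dot(a, b):
    """Sum of element-wise products of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def predict_second_change(first_demand, second_demand, first_change):
    """First prediction: change in the second demand implied by first_change.

    Assumption: the second demand moves linearly with the first demand.
    """
    d1, d2 = _centered(first_demand), _centered(second_demand)
    slope = _dot(d1, d2) / _dot(d1, d1)
    return slope * first_change


def predict_metric_change(first_demand, second_demand, metric,
                          first_change, second_change):
    """Second prediction: change in the metric implied by both demand changes.

    Assumption: metric ~ b1 * first_demand + b2 * second_demand + c, fitted by
    ordinary least squares (closed-form normal equations for two inputs).
    """
    d1, d2 = _centered(first_demand), _centered(second_demand)
    dm = _centered(metric)
    s11, s22, s12 = _dot(d1, d1), _dot(d2, d2), _dot(d1, d2)
    s1m, s2m = _dot(d1, dm), _dot(d2, dm)
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("demands are collinear; effects cannot be separated")
    b1 = (s22 * s1m - s12 * s2m) / det
    b2 = (s11 * s2m - s12 * s1m) / det
    return b1 * first_change + b2 * second_change


def what_if(first_demand, second_demand, metric, first_change):
    """Run both predictions for a requested change to the first demand."""
    second_change = predict_second_change(
        first_demand, second_demand, first_change)
    metric_change = predict_metric_change(
        first_demand, second_demand, metric, first_change, second_change)
    return {"second_change": second_change, "metric_change": metric_change}


if __name__ == "__main__":
    # Made-up historical measurements, for illustration only.
    first = [1, 2, 3, 4]        # e.g. a first demand on a resource
    second = [2, 1, 4, 3]       # e.g. a second demand on the same resource
    metric = [9, 8, 19, 18]     # e.g. a performance metric
    print(what_if(first, second, metric, first_change=10))
```

A production system would more plausibly use the seasonal and uncertainty-aware forecasting mentioned in claims 14 and 16; the linear fit here only makes the dependency between the two predictions explicit.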
US17/028,166 2017-06-02 2020-09-22 Data driven methods and systems for what if analysis Pending US20210073680A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/028,166 US20210073680A1 (en) 2017-06-02 2020-09-22 Data driven methods and systems for what if analysis

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15/612,999 US10817803B2 (en) 2017-06-02 2017-06-02 Data driven methods and systems for what if analysis
US17/028,166 US20210073680A1 (en) 2017-06-02 2020-09-22 Data driven methods and systems for what if analysis

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US15/612,999 Continuation US10817803B2 (en) 2017-06-02 2017-06-02 Data driven methods and systems for what if analysis

Publications (1)

Publication Number Publication Date
US20210073680A1 (en) 2021-03-11

Family

ID=64460261

Family Applications (2)

Application Number Title Priority Date Filing Date
US15/612,999 Active 2039-02-08 US10817803B2 (en) 2017-06-02 2017-06-02 Data driven methods and systems for what if analysis
US17/028,166 Pending US20210073680A1 (en) 2017-06-02 2020-09-22 Data driven methods and systems for what if analysis

Country Status (1)

Country Link
US (2) US10817803B2 (en)

Families Citing this family (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10331802B2 (en) 2016-02-29 2019-06-25 Oracle International Corporation System for detecting and characterizing seasons
US10867421B2 (en) 2016-02-29 2020-12-15 Oracle International Corporation Seasonal aware method for forecasting and capacity planning
US10885461B2 (en) 2016-02-29 2021-01-05 Oracle International Corporation Unsupervised method for classifying seasonal patterns
US11082439B2 (en) 2016-08-04 2021-08-03 Oracle International Corporation Unsupervised method for baselining and anomaly detection in time-series data for enterprise systems
US11282021B2 (en) * 2017-09-22 2022-03-22 Jpmorgan Chase Bank, N.A. System and method for implementing a federated forecasting framework
US20190197413A1 (en) * 2017-12-27 2019-06-27 Elasticsearch B.V. Forecasting for Resource Allocation
US11138090B2 (en) * 2018-10-23 2021-10-05 Oracle International Corporation Systems and methods for forecasting time series with variable seasonality
CN111327655A (en) * 2018-12-14 2020-06-23 中移(杭州)信息技术有限公司 Multi-tenant container resource quota prediction method and device and electronic equipment
US10977112B2 (en) * 2019-01-22 2021-04-13 International Business Machines Corporation Performance anomaly detection
CN111724293B (en) * 2019-03-22 2023-07-28 华为技术有限公司 Image rendering method and device and electronic equipment
US11783006B1 (en) * 2019-03-29 2023-10-10 Cigna Intellectual Property, Inc. Computerized methods and systems for machine-learned multi-output multi-step forecasting of time-series data
US11516091B2 (en) * 2019-04-22 2022-11-29 At&T Intellectual Property I, L.P. Cloud infrastructure planning assistant via multi-agent AI
US11537940B2 (en) 2019-05-13 2022-12-27 Oracle International Corporation Systems and methods for unsupervised anomaly detection using non-parametric tolerance intervals over a sliding window of t-digests
US20200380351A1 (en) * 2019-05-28 2020-12-03 Sap Se Automated Scaling Of Resources Based On Long Short-Term Memory Recurrent Neural Networks And Attention Mechanisms
US11178065B2 (en) * 2019-08-07 2021-11-16 Oracle International Corporation System and methods for optimal allocation of multi-tenant platform infrastructure resources
US11341588B2 (en) 2019-09-04 2022-05-24 Oracle International Corporation Using an irrelevance filter to facilitate efficient RUL analyses for utility system assets
US11887015B2 (en) 2019-09-13 2024-01-30 Oracle International Corporation Automatically-generated labels for time series data and numerical lists to use in analytic and machine learning systems
CN111191113B (en) * 2019-09-29 2024-01-23 西北大学 Data resource demand prediction and adjustment method based on edge computing environment
US11455656B2 (en) 2019-11-18 2022-09-27 Walmart Apollo, Llc Methods and apparatus for electronically providing item advertisement recommendations
US11392984B2 (en) 2019-11-20 2022-07-19 Walmart Apollo, Llc Methods and apparatus for automatically providing item advertisement recommendations
US11050677B2 (en) 2019-11-22 2021-06-29 Accenture Global Solutions Limited Enhanced selection of cloud architecture profiles
US11367018B2 (en) 2019-12-04 2022-06-21 Oracle International Corporation Autonomous cloud-node scoping framework for big-data machine learning use cases
US11301305B2 (en) * 2020-01-07 2022-04-12 Bank Of America Corporation Dynamic resource clustering architecture
US11460500B2 (en) 2020-02-07 2022-10-04 Oracle International Corporation Counterfeit device detection using EMI fingerprints
US11255894B2 (en) 2020-02-28 2022-02-22 Oracle International Corporation High sensitivity detection and identification of counterfeit components in utility power systems via EMI frequency kiviat tubes
US11275144B2 (en) 2020-03-17 2022-03-15 Oracle International Corporation Automated calibration of EMI fingerprint scanning instrumentation for utility power system counterfeit detection
US11948051B2 (en) 2020-03-23 2024-04-02 Oracle International Corporation System and method for ensuring that the results of machine learning models can be audited
US11461709B2 (en) 2020-04-28 2022-10-04 Optrilo, Inc. Resource capacity planning system
US11481257B2 (en) 2020-07-30 2022-10-25 Accenture Global Solutions Limited Green cloud computing recommendation system
WO2022235651A1 (en) * 2021-05-03 2022-11-10 Avesha, Inc. Distributed computing system with multi tenancy based on application slices
US11822036B2 (en) 2021-10-07 2023-11-21 Oracle International Corporation Passive spychip detection through time series monitoring of induced magnetic field and electromagnetic interference
US11740122B2 (en) 2021-10-20 2023-08-29 Oracle International Corporation Autonomous discrimination of operation vibration signals
US11729940B2 (en) 2021-11-02 2023-08-15 Oracle International Corporation Unified control of cooling in computers

Citations (4)

Publication number Priority date Publication date Assignee Title
US20100257133A1 (en) * 2005-05-09 2010-10-07 Crowe Keith E Computer-Implemented System And Method For Storing Data Analysis Models
US20140156557A1 (en) * 2011-08-19 2014-06-05 Jun Zeng Providing a Simulation Service by a Cloud-Based Infrastructure
US20150309840A1 (en) * 2006-08-31 2015-10-29 Bmc Software, Inc. Automated capacity provisioning method using historical performance data
US11146463B2 (en) * 2019-06-05 2021-10-12 Cisco Technology, Inc. Predicting network states for answering what-if scenario outcomes

Family Cites Families (155)

Publication number Priority date Publication date Assignee Title
US6104717A (en) 1995-11-03 2000-08-15 Cisco Technology, Inc. System and method for providing backup machines for implementing multiple IP addresses on multiple ports
US6438592B1 (en) 1998-02-25 2002-08-20 Michael G. Killian Systems for monitoring and improving performance on the world wide web
US6597777B1 (en) 1999-06-29 2003-07-22 Lucent Technologies Inc. Method and apparatus for detecting service anomalies in transaction-oriented networks
WO2001073428A1 (en) 2000-03-27 2001-10-04 Ramot University Authority For Applied Research & Industrial Development Ltd. Method and system for clustering data
US6996599B1 (en) 2000-06-21 2006-02-07 Microsoft Corporation System and method providing multi-tier applications architecture
US20020092004A1 (en) 2000-07-26 2002-07-11 Lee John Michael Methods and systems for automatically generating software applications
US20020019860A1 (en) 2000-08-02 2002-02-14 Lee Gene F. Method and apparatus for distributed administration of thin client architecture
US6609083B2 (en) 2001-06-01 2003-08-19 Hewlett-Packard Development Company, L.P. Adaptive performance data measurement and collections
US7454751B2 (en) 2001-06-07 2008-11-18 Intel Corporation Fault-tolerant system and methods with trusted message acknowledgement
AU2002316479A1 (en) 2001-07-03 2003-01-21 Altaworks Corporation System and methods for monitoring performance metrics
US20030149603A1 (en) 2002-01-18 2003-08-07 Bruce Ferguson System and method for operating a non-linear model with missing data for use in electronic commerce
US20050119982A1 (en) 2002-05-10 2005-06-02 Masato Ito Information processing apparatus and method
US8635328B2 (en) 2002-10-31 2014-01-21 International Business Machines Corporation Determining time varying thresholds for monitored metrics
US8572249B2 (en) 2003-12-10 2013-10-29 Aventail Llc Network appliance for balancing load and platform services
US6996502B2 (en) 2004-01-20 2006-02-07 International Business Machines Corporation Remote enterprise management of high availability systems
JP3922375B2 (en) 2004-01-30 2007-05-30 インターナショナル・ビジネス・マシーンズ・コーポレーション Anomaly detection system and method
US7343375B1 (en) 2004-03-01 2008-03-11 The Directv Group, Inc. Collection and correlation over time of private viewing usage data
US7672814B1 (en) 2004-03-03 2010-03-02 Emc Corporation System and method for baseline threshold monitoring
US7774485B2 (en) 2004-05-21 2010-08-10 Bea Systems, Inc. Dynamic service composition and orchestration
US7450498B2 (en) 2004-10-27 2008-11-11 Morgan Stanley Fault tolerant network architecture
US7519564B2 (en) 2004-11-16 2009-04-14 Microsoft Corporation Building and using predictive models of current and future surprises
US7739143B1 (en) 2005-03-24 2010-06-15 Amazon Technologies, Inc. Robust forecasting techniques with reduced sensitivity to anomalous data
US7739284B2 (en) 2005-04-20 2010-06-15 International Business Machines Corporation Method and apparatus for processing data streams
US7676539B2 (en) 2005-06-09 2010-03-09 International Business Machines Corporation Methods, apparatus and computer programs for automated problem solving in a distributed, collaborative environment
US20060287848A1 (en) 2005-06-20 2006-12-21 Microsoft Corporation Language classification with random feature clustering
US8463899B2 (en) 2005-07-29 2013-06-11 Bmc Software, Inc. System, method and computer program product for optimized root cause analysis
US8335720B2 (en) 2005-08-10 2012-12-18 American Express Travel Related Services Company, Inc. System, method, and computer program product for increasing inventory turnover using targeted consumer offers
US8185423B2 (en) 2005-12-22 2012-05-22 Canon Kabushiki Kaisha Just-in time workflow
US9037698B1 (en) 2006-03-14 2015-05-19 Amazon Technologies, Inc. Method and system for collecting and analyzing time-series data
US7987106B1 (en) 2006-06-05 2011-07-26 Turgut Aykin System and methods for forecasting time series with multiple seasonal patterns
US7783510B1 (en) 2006-06-23 2010-08-24 Quest Software, Inc. Computer storage capacity forecasting system using cluster-based seasonality analysis
US7529991B2 (en) 2007-01-30 2009-05-05 International Business Machines Corporation Scoring method for correlation anomalies
US20080221974A1 (en) 2007-02-22 2008-09-11 Alexander Gilgur Lazy Evaluation of Bulk Forecasts
US8046086B2 (en) 2007-05-15 2011-10-25 Fisher-Rosemount Systems, Inc. Methods and systems for batch processing and execution in a process system
US8200454B2 (en) 2007-07-09 2012-06-12 International Business Machines Corporation Method, data processing program and computer program product for time series analysis
US20090030752A1 (en) 2007-07-27 2009-01-29 General Electric Company Fleet anomaly detection method
US9323837B2 (en) 2008-03-05 2016-04-26 Ying Zhao Multiple domain anomaly detection system and method using fusion rule and visualization
US20080215576A1 (en) 2008-03-05 2008-09-04 Quantum Intelligence, Inc. Fusion and visualization for multiple anomaly detection systems
US8514868B2 (en) 2008-06-19 2013-08-20 Servicemesh, Inc. Cloud computing gateway, cloud computing hypervisor, and methods for implementing same
US8271949B2 (en) 2008-07-31 2012-09-18 International Business Machines Corporation Self-healing factory processes in a software factory
US8676964B2 (en) 2008-07-31 2014-03-18 Riverbed Technology, Inc. Detecting outliers in network traffic time series
US9245000B2 (en) 2008-08-05 2016-01-26 Vmware, Inc. Methods for the cyclical pattern determination of time-series data using a clustering approach
US8606379B2 (en) 2008-09-29 2013-12-10 Fisher-Rosemount Systems, Inc. Method of generating a product recipe for execution in batch processing
US20100082697A1 (en) 2008-10-01 2010-04-01 Narain Gupta Data model enrichment and classification using multi-model approach
US8363961B1 (en) 2008-10-14 2013-01-29 Adobe Systems Incorporated Clustering techniques for large, high-dimensionality data sets
EP2364473A4 (en) 2008-11-10 2015-10-14 Google Inc Method and system for clustering data points
US8234236B2 (en) 2009-06-01 2012-07-31 International Business Machines Corporation System and method for efficient allocation of resources in virtualized desktop environments
US9280436B2 (en) 2009-06-17 2016-03-08 Hewlett Packard Enterprise Development Lp Modeling a computing entity
US7992031B2 (en) 2009-07-24 2011-08-02 International Business Machines Corporation Automated disaster recovery planning
US9070096B2 (en) 2009-08-11 2015-06-30 Mckesson Financial Holdings Appliance and pair device for providing a reliable and redundant enterprise management solution
US8229876B2 (en) 2009-09-01 2012-07-24 Oracle International Corporation Expediting K-means cluster analysis data mining using subsample elimination preprocessing
US20110126197A1 (en) 2009-11-25 2011-05-26 Novell, Inc. System and method for controlling cloud and virtualized data centers in an intelligent workload management system
US8776066B2 (en) 2009-11-30 2014-07-08 International Business Machines Corporation Managing task execution on accelerators
US8819701B2 (en) 2009-12-12 2014-08-26 Microsoft Corporation Cloud computing monitoring and management system
CN102713861B (en) 2010-01-08 2015-09-23 日本电气株式会社 Operation management device, operation management method and program recorded medium
US8650299B1 (en) 2010-02-03 2014-02-11 Citrix Systems, Inc. Scalable cloud computing
WO2011128922A1 (en) 2010-04-15 2011-10-20 Neptuny S.R.L. Automated upgrading method for capacity of it system resources
US8627426B2 (en) 2010-04-26 2014-01-07 Vmware, Inc. Cloud platform architecture
US8725891B2 (en) 2010-07-01 2014-05-13 Red Hat, Inc. Aggregation across cloud providers
US8442067B2 (en) 2010-08-30 2013-05-14 International Business Machines Corporation Using gathered system activity statistics to determine when to schedule a procedure
US8677004B2 (en) 2010-09-10 2014-03-18 International Business Machines Corporation Migration of logical partitions between two devices
US9135586B2 (en) 2010-10-28 2015-09-15 Sap Se System for dynamic parallel looping of repetitive tasks during execution of process-flows in process runtime
US8621058B2 (en) 2010-10-28 2013-12-31 Hewlett-Packard Development Company, L.P. Providing cloud-based computing services
CN103339613B (en) 2011-01-24 2016-01-06 日本电气株式会社 Operation management device, operation management method and program
US8862933B2 (en) 2011-02-09 2014-10-14 Cliqr Technologies, Inc. Apparatus, systems and methods for deployment and management of distributed computing systems and applications
US20120240072A1 (en) 2011-03-18 2012-09-20 Serious Materials, Inc. Intensity transform systems and methods
US9195563B2 (en) 2011-03-30 2015-11-24 Bmc Software, Inc. Use of metrics selected based on lag correlation to provide leading indicators of service performance degradation
US9218232B2 (en) 2011-04-13 2015-12-22 Bar-Ilan University Anomaly detection methods, devices and systems
US9075659B2 (en) 2011-06-16 2015-07-07 Kodak Alaris Inc. Task allocation in a computer network
US9047559B2 (en) 2011-07-22 2015-06-02 Sas Institute Inc. Computer-implemented systems and methods for testing large scale automatic forecast combinations
EP2737411A4 (en) 2011-07-26 2015-10-14 Nebula Inc Systems and methods for implementing cloud computing
JP5710417B2 (en) 2011-08-05 2015-04-30 株式会社東芝 Wireless receiver
EP2759938B1 (en) 2011-09-19 2019-09-11 Nec Corporation Operations management device, operations management method, and program
WO2013043170A1 (en) 2011-09-21 2013-03-28 Hewlett-Packard Development Company L.P. Automated detection of a system anomaly
US9002774B2 (en) 2011-09-23 2015-04-07 Aol Advertising Inc. Systems and methods for generating a forecasting model and forecasting future values
US9355357B2 (en) 2011-10-21 2016-05-31 Hewlett Packard Enterprise Development Lp Computing predicted data according to weighted peak preservation and time distance biasing
US9141914B2 (en) 2011-10-31 2015-09-22 Hewlett-Packard Development Company, L.P. System and method for ranking anomalies
US10554077B2 (en) 2011-12-13 2020-02-04 Schneider Electric USA, Inc. Automated monitoring for changes in energy consumption patterns
CN104137078B (en) 2012-01-23 2017-03-22 日本电气株式会社 Operation management device, operation management method, and program
CN104205063B (en) 2012-03-14 2017-05-24 日本电气株式会社 Operation administration device, operation administration method, and program
US8880525B2 (en) 2012-04-02 2014-11-04 Xerox Corporation Full and semi-batch clustering
US8949677B1 (en) 2012-05-23 2015-02-03 Amazon Technologies, Inc. Detecting anomalies in time series data
US20130326202A1 (en) 2012-05-30 2013-12-05 Roi Rosenthal Load test capacity planning
JP6049314B2 (en) 2012-06-11 2016-12-21 キヤノン株式会社 Radiation imaging apparatus and image processing method
EP2870580A4 (en) 2012-07-03 2016-05-18 Hewlett Packard Development Co Managing a hybrid cloud service
US9825823B2 (en) 2012-07-03 2017-11-21 Hewlett Packard Enterprise Development Lp Managing a cloud service
WO2014007813A1 (en) 2012-07-03 2014-01-09 Hewlett-Packard Development Company, L.P. Managing a multitenant cloud service
US9569804B2 (en) 2012-08-27 2017-02-14 Gridium, Inc. Systems and methods for energy consumption and energy demand management
US11185241B2 (en) 2014-03-05 2021-11-30 Whoop, Inc. Continuous heart rate monitoring and interpretation
US9495220B2 (en) 2012-09-28 2016-11-15 Sap Se Self-management of request-centric systems
US9245248B2 (en) 2012-09-28 2016-01-26 Dell Software Inc. Data metric resolution prediction system and method
US9712402B2 (en) 2012-10-10 2017-07-18 Alcatel Lucent Method and apparatus for automated deployment of geographically distributed applications within a cloud
WO2014075108A2 (en) 2012-11-09 2014-05-15 The Trustees Of Columbia University In The City Of New York Forecasting system using machine learning and ensemble methods
US9147167B2 (en) 2012-11-19 2015-09-29 Oracle International Corporation Similarity analysis with tri-point data arbitration
US9146777B2 (en) 2013-01-25 2015-09-29 Swarm Technology Llc Parallel processing with solidarity cells by proactively retrieving from a task pool a matching task for the solidarity cell to process
US9710493B2 (en) 2013-03-08 2017-07-18 Microsoft Technology Licensing, Llc Approximate K-means via cluster closures
US9514213B2 (en) 2013-03-15 2016-12-06 Oracle International Corporation Per-attribute data clustering using tri-point data arbitration
US9330119B2 (en) 2013-04-11 2016-05-03 Oracle International Corporation Knowledge intensive data management system for business process and case management
US9507718B2 (en) 2013-04-16 2016-11-29 Facebook, Inc. Intelligent caching
US9692775B2 (en) 2013-04-29 2017-06-27 Telefonaktiebolaget Lm Ericsson (Publ) Method and system to dynamically detect traffic anomalies in a network
US10163034B2 (en) 2013-06-19 2018-12-25 Oracle International Corporation Tripoint arbitration for entity classification
WO2014208002A1 (en) 2013-06-25 2014-12-31 日本電気株式会社 System analysis device, system analysis method and system analysis program
EP3022612A1 (en) 2013-07-19 2016-05-25 GE Intelligent Platforms, Inc. Model change boundary on time series data
US10324942B2 (en) 2013-07-26 2019-06-18 Snap Inc. Segment data visibility and management in a distributed database of time stamped records
US9632858B2 (en) 2013-07-28 2017-04-25 OpsClarity Inc. Organizing network performance metrics into historical anomaly dependency data
US10229104B2 (en) 2013-08-01 2019-03-12 Sonicwall Inc. Efficient DFA generation for non-matching characters and character classes in regular expressions
US9280372B2 (en) 2013-08-12 2016-03-08 Amazon Technologies, Inc. Request processing techniques
US9319911B2 (en) 2013-08-30 2016-04-19 International Business Machines Corporation Adaptive monitoring for cellular networks
WO2015065435A1 (en) 2013-10-31 2015-05-07 Hewlett-Packard Development Company, L.P. Storing time series data for a search query
US20160299961A1 (en) 2014-02-04 2016-10-13 David Allen Olsen System and method for grouping segments of data sequences into clusters
US9323574B2 (en) 2014-02-21 2016-04-26 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Processor power optimization with response time assurance
US9740402B2 (en) 2014-03-28 2017-08-22 Vmware, Inc. Migrating workloads across host computing systems based on remote cache content usage characteristics
CN105095614A (en) 2014-04-18 2015-11-25 国际商业机器公司 Method and device for updating prediction model
US9374389B2 (en) 2014-04-25 2016-06-21 Intuit Inc. Method and system for ensuring an application conforms with security and regulatory controls prior to deployment
US9727533B2 (en) 2014-05-20 2017-08-08 Facebook, Inc. Detecting anomalies in a time series
US9779361B2 (en) 2014-06-05 2017-10-03 Mitsubishi Electric Research Laboratories, Inc. Method for learning exemplars for anomaly detection
US9658910B2 (en) 2014-07-29 2017-05-23 Oracle International Corporation Systems and methods for spatially displaced correlation for detecting value ranges of transient correlation in machine data of enterprise systems
US10069900B2 (en) 2014-08-05 2018-09-04 Oracle International Corporation Systems and methods for adaptive thresholding using maximum concentration intervals
US20160092516A1 (en) 2014-09-26 2016-03-31 Oracle International Corporation Metric time series correlation by outlier removal based on maximum concentration interval
US11087263B2 (en) 2014-10-09 2021-08-10 Splunk Inc. System monitoring with key performance indicators from shared base search of machine data
US9811394B1 (en) 2014-10-12 2017-11-07 Workato, Inc. Application programming interface recipe cloning
US9977699B2 (en) 2014-11-17 2018-05-22 Mediatek, Inc. Energy efficient multi-cluster system and its operations
EP3239839A4 (en) 2014-12-22 2018-08-22 Nec Corporation Operation management device, operation management method, and recording medium in which operation management program is recorded
US9529630B1 (en) 2015-01-21 2016-12-27 Pivotal Software, Inc. Cloud computing platform architecture
JP6777069B2 (en) 2015-03-16 2020-10-28 日本電気株式会社 Information processing equipment, information processing methods, and programs
JP6384590B2 (en) 2015-03-26 2018-09-05 日本電気株式会社 Learning model generation system, method and program
US20160283533A1 (en) 2015-03-26 2016-09-29 Oracle International Corporation Multi-distance clustering
US9794229B2 (en) 2015-04-03 2017-10-17 Infoblox Inc. Behavior analysis based DNS tunneling detection and classification framework for network security
JP6313730B2 (en) 2015-04-10 2018-04-18 TATA Consultancy Services Limited Anomaly detection system and method
US10410155B2 (en) 2015-05-01 2019-09-10 Microsoft Technology Licensing, Llc Automatic demand-driven resource scaling for relational database-as-a-service
US10453007B2 (en) 2015-05-18 2019-10-22 International Business Machines Corporation Automatic time series exploration for business intelligence analytics
US20160357674A1 (en) 2015-06-07 2016-12-08 Cloud Physics, Inc. Unified Online Cache Monitoring and Optimization
US10997176B2 (en) 2015-06-25 2021-05-04 International Business Machines Corporation Massive time series correlation similarity computation
US9323599B1 (en) 2015-07-31 2016-04-26 AppDynamics, Inc. Time series metric data modeling and prediction
US11010196B2 (en) * 2015-08-31 2021-05-18 Vmware, Inc. Capacity analysis using closed-system modules
US9961571B2 (en) 2015-09-24 2018-05-01 Futurewei Technologies, Inc. System and method for a multi view learning approach to anomaly detection and root cause analysis
US20200034745A1 (en) 2015-10-19 2020-01-30 Nutanix, Inc. Time series analysis and forecasting using a distributed tournament selection process
CN105426411B (en) 2015-10-31 2019-05-28 南京南瑞继保电气有限公司 Time series databases buffer memory management method based on access trend prediction
US9471778B1 (en) 2015-11-30 2016-10-18 International Business Machines Corporation Automatic baselining of anomalous event activity in time series data
US10867421B2 (en) 2016-02-29 2020-12-15 Oracle International Corporation Seasonal aware method for forecasting and capacity planning
US10754573B2 (en) 2016-03-11 2020-08-25 EMC IP Holding Company LLC Optimized auto-tiering, wherein subset of data movements are selected, utilizing workload skew point, from a list that ranks data movements based on criteria other than I/O workload
CN107203891A (en) * 2016-03-17 2017-09-26 阿里巴巴集团控股有限公司 Automatic multi-threshold feature filtering method and device
US10073906B2 (en) 2016-04-27 2018-09-11 Oracle International Corporation Scalable tri-point arbitration and clustering
US10198339B2 (en) 2016-05-16 2019-02-05 Oracle International Corporation Correlation-based analytic for time-series data
US11263566B2 (en) 2016-06-20 2022-03-01 Oracle International Corporation Seasonality validation and determination of patterns
US10390114B2 (en) 2016-07-22 2019-08-20 Intel Corporation Memory sharing for physical accelerator resources in a data center
US10635563B2 (en) 2016-08-04 2020-04-28 Oracle International Corporation Unsupervised method for baselining and anomaly detection in time-series data for enterprise systems
US20180053207A1 (en) 2016-08-16 2018-02-22 Adobe Systems Incorporated Providing personalized alerts and anomaly summarization
US20180081629A1 (en) 2016-09-16 2018-03-22 Sevone, Inc. Method and apparatus for providing ordered sets of arbitrary percentile estimates for varying timespans
US10375098B2 (en) 2017-01-31 2019-08-06 Splunk Inc. Anomaly detection based on relationships between multiple time series
US10872000B2 (en) 2017-05-05 2020-12-22 Workato, Inc. Late connection binding for bots
US10917419B2 (en) 2017-05-05 2021-02-09 Servicenow, Inc. Systems and methods for anomaly detection
US20180330433A1 (en) 2017-05-12 2018-11-15 Connect Financial LLC Attributing meanings to data concepts used in producing outputs
US10621005B2 (en) 2017-08-31 2020-04-14 Oracle International Corporation Systems and methods for providing zero down time and scalability in orchestration cloud services
CN109359763A (en) * 2018-09-03 2019-02-19 华南理工大学 Dairy products cold chain transportation paths planning method based on calendar variation

Also Published As

Publication number Publication date
US10817803B2 (en) 2020-10-27
US20180349797A1 (en) 2018-12-06

Similar Documents

Publication Publication Date Title
US20210073680A1 (en) Data driven methods and systems for what if analysis
US20210320939A1 (en) Unsupervised method for baselining and anomaly detection in time-series data for enterprise systems
US10635563B2 (en) Unsupervised method for baselining and anomaly detection in time-series data for enterprise systems
US10855548B2 (en) Systems and methods for automatically detecting, summarizing, and responding to anomalies
US10915830B2 (en) Multiscale method for predictive alerting
US11537940B2 (en) Systems and methods for unsupervised anomaly detection using non-parametric tolerance intervals over a sliding window of t-digests
JP2022003566A (en) Correlation between thread strength and heap use amount for specifying stack trace in which heaps are stored up
US11138090B2 (en) Systems and methods for forecasting time series with variable seasonality
US11586706B2 (en) Time-series analysis for forecasting computational workloads
US20170104658A1 (en) Large-scale distributed correlation
US8229953B2 (en) Metric correlation and analysis
US20200125988A1 (en) Systems and Methods For Detecting Long Term Seasons
US9798644B2 (en) Monitoring system performance with pattern event detection
US11200139B2 (en) Automatic configuration of software systems for optimal management and performance using machine learning
US9093841B2 (en) Power distribution network event correlation and analysis
US10116534B2 (en) Systems and methods for WebSphere MQ performance metrics analysis
US10949436B2 (en) Optimization for scalable analytics using time series models
US11362893B2 (en) Method and apparatus for configuring a cloud storage software appliance
US20200099570A1 (en) Cross-domain topological alarm suppression
US11669374B2 (en) Using machine-learning methods to facilitate experimental evaluation of modifications to a computational environment within a distributed system
Gupta et al. Long range dependence in cloud servers: a statistical analysis based on google workload trace
CN109478296A (en) System for fully-integrated capture and analysis business information to generate forecast and decision and simulation
US10630561B1 (en) System monitoring with metrics correlation for data center
US20230205664A1 (en) Anomaly detection using forecasting computational workloads
US20230195591A1 (en) Time series analysis for forecasting computational workloads

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED
AS Assignment Owner name: ORACLE INTERNATIONAL CORPORATION, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GARVEY, DUSTIN;SALUNKE, SAMPANNA SHAHAJI;SHAFT, URI;AND OTHERS;SIGNING DATES FROM 20170526 TO 20170602;REEL/FRAME:055773/0656
STPP Information on status: patent application and granting procedure in general Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
STPP Information on status: patent application and granting procedure in general Free format text: NON FINAL ACTION MAILED
STPP Information on status: patent application and granting procedure in general Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP Information on status: patent application and granting procedure in general Free format text: FINAL REJECTION MAILED