US20170085448A1 - Selecting time-series data for information technology (IT) operations analytics anomaly detection - Google Patents

Selecting time-series data for information technology (IT) operations analytics anomaly detection


Publication number
US20170085448A1
Authority
US
United States
Prior art keywords
series data
time
processing circuit
program instructions
sets
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US14/862,395
Other versions
US10587487B2 (en)
Inventor
Ryan A. Garrett
Robert J. McKeown
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US14/862,395 (patent US10587487B2)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Assignment of assignors interest (see document for details). Assignors: GARRETT, RYAN A.; MCKEOWN, ROBERT J.
Publication of US20170085448A1
Application granted
Publication of US10587487B2
Legal status: Expired - Fee Related (current)
Adjusted expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H04L43/067 Generation of reports using time frame reporting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852 Delays

Definitions

  • the present invention relates to time-series data selection and, more specifically, to time-series data selection for use in detecting anomalies in information technology (IT) operations analytics.
  • an early step in the deployment and configuration of such systems is the selection of a subset of data that will work well or be suitable for a given type of analysis from the available data. For example, when an analytic tool to retrieve and analyze data from an IT performance management system is deployed, the administrator or user responsible for the deployment may be required to explicitly select which tables of data to export (if they are contained within a database) or to specify which data to export.
  • a computer program product for selecting time-series data includes a computer readable storage medium having program instructions embodied therewith.
  • the program instructions are readable and executable by a processing circuit to cause the processing circuit to assemble, by the processing circuit, a set of analytic assessment tools for time-series data, engage, by the processing circuit, the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data, generate, by the processing circuit, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance and rank, by the processing circuit, the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
  • a computer program product for selecting time-series data includes a computer readable storage medium having stored thereon first program instructions executable by a processing circuit to cause the processing circuit to assemble a set of analytic assessment tools for time-series data, second program instructions executable by the processing circuit to cause the processing circuit to engage the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data, third program instructions executable by the processing circuit to cause the processing circuit to generate, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance and fourth program instructions executable by the processing circuit to cause the processing circuit to rank the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
  • a computer-implemented method for selecting time-series data includes assembling a set of analytic assessment tools for time-series data, engaging the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data, generating, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance and ranking the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
  • FIG. 1 is a schematic diagram of a computing system in accordance with embodiments;
  • FIG. 2 is a schematic diagram of a computer program product of the computing system of FIG. 1 in accordance with embodiments;
  • FIG. 3 is a flow diagram illustrating a deployment process for the computer program product of FIG. 2 in accordance with embodiments;
  • FIG. 4 is a flow diagram illustrating a computer-implemented method of selecting time-series data in accordance with embodiments.
  • Deploying and configuring analytics systems can be problematic due to the fact that best practices for the same are limited to being attuned to conditions that have already been seen before multiple times and the fact that this limitation may be most acute at the early stages of the analytic product lifecycle.
  • Other problems include the fact that the applicability of best practices depends on the dynamics of a given environment, since what worked well in one environment will sometimes not work well in another, and the fact that the applicability of best practices changes as algorithms change, so that, depending on the dynamics of the environment, the best practices may not apply or may become tedious to update in any event.
  • the description provided below relates to an approach to quickly analyze actual data in source systems and to determine a set of acceptable metrics or metric types based upon what algorithms have been deployed.
  • This set of acceptable metrics should then be presented to an administrator or user so that the administrator or user is given an opportunity to select which ones of the set of acceptable metrics should be processed and which ones should be ignored.
  • the motivation for the user selection would be that there are additional selection criteria beyond the notion of “what works well with the algorithm,” which should be or must be considered. For example, the administrator or the user may want to give particular consideration towards concerns of computing resources, scalability and customer interest.
  • the approach effectively combines a method for assessing/scoring time-series against a variety of criteria (e.g. data completeness, presence of particular frequency components, etc.), computing a weighted score for each time-series in the data source and then presenting to the administrator or user the time-series ordered by this ranking for selection.
  • the assessment/scoring schemes themselves may be independently derivable or otherwise produced by algorithm developers.
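One of the criteria named above, the presence of a particular frequency component, could be scored along the following lines. This is only a minimal sketch: the function name `spectral_power_fraction`, the use of a naive discrete Fourier transform, and the choice to report the share of non-constant spectral power at one period are all assumptions made for illustration and are not taken from the patent.

```python
import cmath
import math


def spectral_power_fraction(values, period):
    """Hypothetical scorer: share of non-DC spectral power at the DFT bin
    closest to `period` (expressed in samples). Returns a value in [0, 1]."""
    n = len(values)
    if n < 2 or period <= 0:
        return 0.0
    mean = sum(values) / n
    centered = [v - mean for v in values]  # remove the constant (DC) component
    # Naive O(n^2) discrete Fourier transform over the positive-frequency bins.
    powers = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        powers.append(abs(coeff) ** 2)
    total = sum(powers)
    if total == 0:
        return 0.0  # a constant series has no frequency content to score
    target_bin = round(n / period)  # bin k corresponds to a period of n / k samples
    if not 1 <= target_bin <= n // 2:
        return 0.0
    return powers[target_bin - 1] / total
```

For hourly samples, calling the function with `period=24` would give a value near 1 for a series dominated by a daily cycle and a value near 0 for a series without one. Such a value could then serve as one input to the weighted score.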
  • time-series/metric types can be classified as “acceptable” or “unacceptable.” This would be a two stage process that would first do a cursory search on a small time window to determine if the data should be considered for analytics, with the second stage expanding the time window and focusing on the data that was deemed valuable.
  • a computing system 10 is provided and may be configured for example as an enterprise computing system or as a personal computing system.
  • the computing system 10 includes multiple computing devices 11 , 12 , 13 , etc., which are configured to be networked together for communication purposes.
  • Each of the multiple computing devices 11 , 12 , 13 , etc. includes among other features a processing circuit 20 , a display 30 , user input devices 40 and a networking unit 50 as well as a computer program product 100 for selecting time-series data.
  • the processing circuit 20 may be provided as a micro-processor, a central processing unit (CPU) or any other suitable processing device.
  • the display 30 may be provided as a monitor and is configured to display data and information as well as a graphical user interface to an administrator or user.
  • the user input devices 40 may be provided as a mouse and a keyboard combination and are configured to allow the administrator or user to input commands to the processing circuit 20 .
  • the networking unit 50 may be provided as an Ethernet or other suitable networking device by which the multiple computing devices 11 , 12 , 13 , etc. are communicative.
  • the computer program product 100 includes a computer readable storage medium 110 having first, second, third and fourth program instructions 111 , 112 , 113 and 114 stored thereon.
  • the first program instructions 111 are executable by the processing circuit 20 of each of the multiple computing devices 11 , 12 , 13 , etc., to cause the processing circuit 20 to assemble a set of analytic assessment tools, such as analytic assessment algorithms and functions, or “time-series scorers” for analyzing time-series data.
  • the second program instructions 112 are executable by the processing circuit 20 to cause the processing circuit 20 to engage the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data.
  • the relevant analytic domain may refer, for example, to information technology (IT) operations analytics, data completeness analytics and frequency/power/spectrum analytics.
  • the third program instructions 113 are executable by the processing circuit 20 to cause the processing circuit 20 to generate, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance and, in some cases, external data.
  • This external data may be, for example, meta-data relating to the sets of the time-series data such as the source of the time-series data and a customer name.
  • the fourth program instructions 114 are executable by the processing circuit 20 to cause the processing circuit 20 to rank the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
  • the characteristics-of-importance may include, but are not limited to, data completeness. That is, the second program instructions 112 may be executable by the processing circuit 20 to cause the processing circuit 20 to engage the analytic assessment tools to measure whether a predefined percentage of expected data is/was present for a given time-series over a given window of time.
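The data completeness check described above might be sketched as follows. The function name, the assumption that samples are expected at a fixed interval, and the default required fraction of 0.9 are illustrative choices and do not come from the patent.

```python
import math


def completeness_score(timestamps, window_start, window_end, interval,
                       required_fraction=0.9):
    """Hypothetical completeness scorer.

    Assumes one sample is expected every `interval` time units in the
    half-open window [window_start, window_end). Returns a tuple of
    (fraction of expected samples present, whether the required
    fraction was met).
    """
    if interval <= 0 or window_end <= window_start:
        raise ValueError("interval must be positive and the window non-empty")
    expected = math.ceil((window_end - window_start) / interval)
    # Count distinct expected slots that hold at least one sample, so that
    # duplicate samples in one slot do not inflate the fraction.
    filled_slots = {
        int((t - window_start) // interval)
        for t in timestamps
        if window_start <= t < window_end
    }
    fraction = len(filled_slots) / expected
    return fraction, fraction >= required_fraction
```

Counting filled slots rather than raw samples is a design choice made here so that bursts of duplicate readings cannot mask gaps elsewhere in the window.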
  • the third program instructions 113 may be configured to cause the processing circuit 20 to combine multiple scores for each set of the time series data to thereby generate an overall score. Such combining may be executed by the processing circuit 20 by way of a linear combination of each of the multiple scores with numerical weighting or by way of a binary combination of each of the multiple scores (for a binary combination, one approach is that, for a time-series, if any of the characteristics-of-importance exceed a specified level for that characteristic, then that time-series would be given a score of ‘1’, meaning ‘could be included’ such that the binary scoring is typical of the first phase where we are looking for candidates to include—whether it is actually included, depends on subsequent scoring based upon analyzing a fuller set of data and for other key characteristics-of-interest). In the latter case, a score of “X” as a user configurable threshold would be required for a given set of the time-series data to be included in eventual analytics.
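The two ways of combining scores described above can be illustrated with a short sketch. The function names, the dictionary-based representation of scores, and the characteristic names used in the usage note are hypothetical.

```python
def linear_combination(scores, weights):
    """Weighted sum of per-characteristic scores.

    Characteristics without a configured weight contribute nothing
    (an assumption made for this sketch).
    """
    return sum(weights.get(name, 0.0) * value for name, value in scores.items())


def binary_combination(scores, levels):
    """Return 1 ('could be included') if any characteristic exceeds the
    level specified for it, otherwise 0."""
    return 1 if any(
        value > levels[name] for name, value in scores.items() if name in levels
    ) else 0
```

Given, for example, `scores = {"completeness": 0.95, "daily_periodicity": 0.4}`, the linear form with equal weights of 0.5 yields an overall score of about 0.675, while the binary form with levels of 0.9 and 0.5 respectively flags the time-series as a candidate because completeness alone exceeds its level.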
  • the fourth program instructions 114 cause the processing circuit 20 to include higher ranked sets of the time-series data in eventual analytics. In doing so, the fourth program instructions 114 may cause the processing circuit 20 to include sets of the time-series data having scores that are above a predefined threshold in the eventual analytics and reject sets of the time-series data having scores that are below the predefined threshold.
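Ranking followed by threshold-based inclusion and rejection could look like the sketch below. The function name is hypothetical, and because the text does not say what happens to a score exactly equal to the threshold, this sketch assumes such a set is included.

```python
def select_by_threshold(scored_series, threshold):
    """Rank time-series by score (highest first) and split them at `threshold`.

    `scored_series` maps a time-series name to its overall score.
    Assumption: a score exactly equal to the threshold counts as included.
    Returns (included_names, rejected_names), each in ranked order.
    """
    ranked = sorted(scored_series.items(), key=lambda item: item[1], reverse=True)
    included = [name for name, score in ranked if score >= threshold]
    rejected = [name for name, score in ranked if score < threshold]
    return included, rejected
```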
  • the ranking of the sets of the time-series data may be provided as a two-phase or two-step process.
  • the initial (optional) phase may be executed as a scoring phase in which a relatively small time sample of data is examined and characteristics-of-interest are measured so that the results of the examination and measurement can be analyzed using binary or weighted-combinations to arrive at an initial score.
  • a candidate subset is identified based upon essential criteria and is determined to be present in sufficient strength for inclusion in the subsequent analytics.
  • the second (characterization) phase follows where there may be fewer numbers of sets of time-series data to be examined but those sets of time-series data that remain would be across a wider range of times such that a single stage process would otherwise be computationally expensive.
  • the combination of scores for the characteristics-of-interest of the second phase leads to the scores used for selection when exceeding configured thresholds or ranking.
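The two-phase process can be sketched as a cheap screen on a small sample followed by a fuller scoring of the surviving candidates. In this sketch the small sample is taken to be the most recent `sample_size` points, and the scorers are passed in as plain functions. Both choices, along with all names, are assumptions made for illustration.

```python
def fraction_present(values):
    """Example scorer: share of samples that are not missing (None)."""
    return sum(v is not None for v in values) / len(values) if values else 0.0


def two_phase_rank(series, quick_scorer, full_scorer, sample_size, candidate_level):
    """Hypothetical two-phase ranking.

    Phase 1 scores only the last `sample_size` samples of each series and
    keeps those whose quick score exceeds `candidate_level`.
    Phase 2 scores the full history of the remaining candidates and
    returns (name, score) pairs ordered from highest to lowest score.
    """
    # Phase 1: inexpensive screen over a short window.
    candidates = {
        name: values
        for name, values in series.items()
        if quick_scorer(values[-sample_size:]) > candidate_level
    }
    # Phase 2: more expensive characterization, restricted to the candidates.
    full_scores = {name: full_scorer(values) for name, values in candidates.items()}
    return sorted(full_scores.items(), key=lambda item: item[1], reverse=True)
```

The expensive scorer runs only on series that passed the screen, which is what keeps the wider time range of the second phase affordable.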
  • first, second, third and fourth program instructions 111 , 112 , 113 and 114 may be deployed by manual loading thereof directly into a client, server and/or proxy computer by way of a loadable storage medium, such as a CD, DVD, etc., being manually inserted into each of the multiple computing devices 11 , 12 , 13 , etc.
  • the first, second, third and fourth program instructions 111 , 112 , 113 and 114 may also be automatically or semi-automatically deployed into the computing system 10 by way of a central server 15 or a group of central servers 15 (see FIG. 1 ).
  • the first, second, third and fourth program instructions 111 , 112 , 113 and 114 may be downloadable into client computers that will then execute the first, second, third and fourth program instructions 111 , 112 , 113 and 114 .
  • the first, second, third and fourth program instructions 111 , 112 , 113 and 114 may be sent directly to a client system via e-mail with the first, second, third and fourth program instructions 111 , 112 , 113 and 114 then being detached to or loaded into a directory.
  • Another alternative would be that the first, second, third and fourth program instructions 111 , 112 , 113 and 114 be sent directly to a directory on a client computer hard drive.
  • loading processes will select proxy server codes, determine on which computers to place the proxy servers' codes, transmit the proxy server codes and then install the proxy server codes on proxy computers.
  • the first, second, third and fourth program instructions 111 , 112 , 113 and 114 will then be transmitted to the proxy server and subsequently stored thereon.
  • a deployment process of the computer program product described above begins at block 300 and at block 101 with a determination of whether the first, second, third and fourth program instructions 111 , 112 , 113 and 114 will reside on a server or servers when executed. If so, then the servers that will contain the executables are identified at block 209 .
  • the first, second, third and fourth program instructions 111 , 112 , 113 and 114 for the server or servers are then transferred directly to the servers' storage via FTP or some other protocol or by copying through the use of a shared file system at block 210 such that the first, second, third and fourth program instructions 111 , 112 , 113 and 114 are installed on the servers at block 211 .
  • a proxy server is a server that sits between a client application, such as a Web browser, and a real server and operates by intercepting all requests to the real server to see if it can fulfill the requests itself. If not, the proxy server forwards the request to the real server.
  • the two primary benefits of a proxy server are to improve performance and to filter requests.
  • the proxy server is installed at block 201 and the first, second, third and fourth program instructions 111 , 112 , 113 and 114 are sent to the (one or more) servers via a protocol, such as FTP, or by being copied directly from the source files to the server files via file sharing at block 202 .
  • Another embodiment involves sending a transaction to the (one or more) servers that contain the process software and having the server process the transaction and then receive and copy the process software to the server's file system. Once the process software is stored at the servers, the users may then access the first, second, third and fourth program instructions 111 , 112 , 113 and 114 on the servers and copy the same to their respective client computer file systems at block 203 .
  • the servers may automatically copy the first, second, third and fourth program instructions 111 , 112 , 113 and 114 to each client and then run an installation program for the first, second, third and fourth program instructions 111 , 112 , 113 and 114 at each client computer whereby the user executes the program that installs the first, second, third and fourth program instructions 111 , 112 , 113 and 114 on his client computer at block 212 and then exits the process at block 108 .
  • the users then receive the e-mail at block 205 and then detach the first, second, third and fourth program instructions 111 , 112 , 113 and 114 from the e-mail to a directory on their client computers at block 206 .
  • the user executes the program that installs the first, second, third and fourth program instructions 111 , 112 , 113 and 114 on his client computer at block 212 and then exits the process at block 108 .
  • first, second, third and fourth program instructions 111 , 112 , 113 and 114 will be sent directly to user directories on their client computers at block 106 . If so, the user directories are identified at block 107 and the process software is transferred directly to the user's client computer directories at block 207 . This can be done in several ways such as, but not limited to, sharing the file system directories and then copying from the sender's file system to the recipient user's file system or, alternatively, using a transfer protocol such as File Transfer Protocol (FTP).
  • the users access the directories on their client file systems in preparation for installing the first, second, third and fourth program instructions 111 , 112 , 113 and 114 at block 208 , execute the program that installs the first, second, third and fourth program instructions 111 , 112 , 113 and 114 at block 212 and then exit the process at block 108 .
  • the method includes assembling a set of analytic assessment tools for time-series data, as shown at block 401 , engaging the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data, as shown at block 402 , generating, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance, as shown at block 403 and as described above, and ranking the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection, as shown at block 404 and as described above.
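The four blocks of the method (401 to 404) might be wired together as in the following sketch. The two characteristics used here (`completeness` and `non_constant`), their weights, and all function names are hypothetical stand-ins for whatever assessment tools a deployment would actually assemble.

```python
def completeness(values):
    """Share of samples that are present (not None)."""
    return sum(v is not None for v in values) / len(values) if values else 0.0


def non_constant(values):
    """1.0 if the present samples take more than one distinct value, else 0.0."""
    present = [v for v in values if v is not None]
    return 1.0 if len(set(present)) > 1 else 0.0


def assemble_tools():
    """Block 401: assemble the assessment tools as name -> (scorer, weight)."""
    return {"completeness": (completeness, 0.7), "non_constant": (non_constant, 0.3)}


def measure(tools, series):
    """Block 402: measure every characteristic for every time-series."""
    return {
        name: {char: scorer(values) for char, (scorer, _weight) in tools.items()}
        for name, values in series.items()
    }


def score(tools, measurements):
    """Block 403: combine the measurements into one weighted score per series."""
    return {
        name: sum(tools[char][1] * value for char, value in chars.items())
        for name, chars in measurements.items()
    }


def rank(scores):
    """Block 404: order series names from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)
```

A caller would chain the steps as `rank(score(tools, measure(tools, series)))` and present the resulting order to the administrator or user for selection.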
  • a potential use of the computer program product and the method described above would, in some cases, be to generate point tooling or “data mediation tooling” at potential data sources and data bases and, for time-series data stored therein, to analyze the time-series data and to present ranked lists to users for inclusion or exclusion with respect to downstream analysis.
  • aspects of the disclosure can be automated whereby, for example, only time-series data scoring above a certain level might be flagged for subsequent inclusion in downstream analysis.
  • users will have an opportunity to adjust weighting values assigned to various factors and to build upon those supplied by a pre-existing development team. For example, the user may decide that particular time-series data very directly measure key customer experience aspects and are more important to include in downstream analytics than the corresponding score might otherwise indicate.
  • an example of a binary score would be if the time-series is directly associated with customer experience measurements (e.g., it is a well-known metric type), it may automatically be scored/flagged for inclusion based upon the best practice.
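User adjustments of this kind could be layered on top of the computed scores, as in the sketch below. The idea of a per-metric-type multiplier (`boosts`) and a set of always-included metric types is an assumption made for illustration, as are all names.

```python
def apply_user_adjustments(scores, metric_types, boosts=None,
                           always_include=frozenset()):
    """Hypothetical post-processing of computed scores.

    `scores` maps series name -> computed score.
    `metric_types` maps series name -> metric type label.
    `boosts` maps metric type -> multiplier chosen by the user.
    `always_include` holds metric types that are flagged for inclusion
    regardless of score (the binary case described above).
    Returns series name -> (adjusted score, flagged for inclusion).
    """
    boosts = boosts or {}
    adjusted = {}
    for name, value in scores.items():
        metric_type = metric_types.get(name)
        adjusted[name] = (
            value * boosts.get(metric_type, 1.0),  # unlisted types keep their score
            metric_type in always_include,
        )
    return adjusted
```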
  • Advantages associated with the computer program product and method described above include, but are not limited to, the fact that the features described herein can work without explicit notions of ‘best-practices’ or pre-existing lists of metrics since important factors that are generally included in the creation of such best-practices or lists are codified within the time series scoring mechanisms.
  • data selection can be explicitly facilitated and can deal with vagaries of a particular environment (i.e., where a metric type X, based upon observation and analysis, is determined to be acceptable in one environment but would not be in another).
  • administrators or users of the computer program product and method will be able to select the top-N sets of the time-series data and have confidence that the set is a “best” fit for a given case.
  • the present invention may be a system, a method, and/or a computer program product at any possible technical detail level of integration
  • the computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention
  • the computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.
  • the computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • a non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
  • a computer readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
  • Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.
  • the network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
  • Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuitry, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and procedural programming languages, such as the “C” programming language or similar programming languages.
  • the computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
  • These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the blocks may occur out of the order noted in the Figures.
  • two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Environmental & Geological Engineering (AREA)

Abstract

A computer program product for selecting time-series data is provided. The computer program product includes a computer readable storage medium having program instructions embodied therewith. The program instructions are readable and executable by a processing circuit to cause the processing circuit to assemble, by the processing circuit, a set of analytic assessment tools for time-series data, engage, by the processing circuit, the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data, generate, by the processing circuit, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance and rank, by the processing circuit, the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.

Description

    BACKGROUND
  • The present invention relates to time-series data selection and, more specifically, to time-series data selection for use in detecting anomalies in information technology (IT) operations analytics.
  • When analytics systems focusing on analyses of time-series data are deployed in an IT environment, an early step in the deployment and configuration of such systems is the selection of a subset of data that will work well or be suitable for a given type of analysis from the available data. For example, when an analytic tool to retrieve and analyze data from an IT performance management system is deployed, the administrator or user responsible for the deployment may be required to explicitly select which tables of data to export (if they are contained within a database) or to specify which data to export.
  • In general though, what works well and provides the best possible results in one situation versus another situation is highly dependent on the actual analytic algorithms that are deployed but the administrators or users that are charged with the deployment and the configurations may have no real sense of what data should be selected. Thus a common approach to system deployment and configuration is based simply upon the notion of “best practices” in which previous experience and heuristics are codified in documentation which provide recommendations on which metrics to process. A variation on this theme is one where code and configurations to extract data are organized in deployable packages/packs that can be deployed together.
  • SUMMARY
  • According to an embodiment of the present invention, a computer program product for selecting time-series data is provided. The computer program product includes a computer readable storage medium having program instructions embodied therewith. The program instructions are readable and executable by a processing circuit to cause the processing circuit to assemble, by the processing circuit, a set of analytic assessment tools for time-series data, engage, by the processing circuit, the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data, generate, by the processing circuit, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance and rank, by the processing circuit, the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
  • According to another embodiment of the present invention, a computer program product for selecting time-series data is provided and includes a computer readable storage medium having stored thereon first program instructions executable by a processing circuit to cause the processing circuit to assemble a set of analytic assessment tools for time-series data, second program instructions executable by the processing circuit to cause the processing circuit to engage the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data, third program instructions executable by the processing circuit to cause the processing circuit to generate, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance and fourth program instructions executable by the processing circuit to cause the processing circuit to rank the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
  • According to yet another embodiment of the present invention, a computer-implemented method for selecting time-series data is provided and includes assembling a set of analytic assessment tools for time-series data, engaging the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data, generating, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance and ranking the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other features and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings, in which:
  • FIG. 1 is a schematic diagram of a computing system in accordance with embodiments;
  • FIG. 2 is a schematic diagram of a computer program product of the computing system of FIG. 1 in accordance with embodiments;
  • FIG. 3 is a flow diagram illustrating a deployment process for the computer program product of FIG. 2 in accordance with embodiments; and
  • FIG. 4 is a flow diagram illustrating a computer-implemented method of selecting time-series data in accordance with embodiments.
  • DETAILED DESCRIPTION
  • Deploying and configuring analytics systems can be problematic because best practices are attuned only to conditions that have already been seen multiple times, a limitation that may be most acute at the early stages of the analytic product lifecycle. In addition, the applicability of best practices depends on the dynamics of a given environment, since what worked well in one environment sometimes will not work well in another. The applicability of best practices also changes as algorithms change and, depending on the dynamics of the environment, the best practices may not apply or may become tedious to update in any event.
  • Therefore, the description provided below relates to an approach to quickly analyze actual data in source systems and to determine a set of acceptable metrics or metric types based upon what algorithms have been deployed. This set of acceptable metrics should then be presented to an administrator or user so that the administrator or user is given an opportunity to select which ones of the set of acceptable metrics should be processed and which ones should be ignored. The motivation for the user selection would be that there are additional selection criteria beyond the notion of “what works well with the algorithm,” which should be or must be considered. For example, the administrator or the user may want to give particular consideration towards concerns of computing resources, scalability and customer interest.
  • The approach effectively combines assessing/scoring each time-series against a variety of criteria (e.g., data completeness, presence of particular frequency components, etc.), computing a weighted score for each time-series in the data source and then presenting to the administrator or user the time-series ordered by this ranking for selection. The assessment/scoring schemes themselves may be independently derivable or otherwise produced by algorithm developers. In any case, through the application of the appropriate time-series assessment/scoring schemes, individual time-series/metric types can be classified as “acceptable” or “unacceptable.” This would be a two-stage process in which the first stage performs a cursory search on a small time window to determine whether the data should be considered for analytics and the second stage expands the time window and focuses on the data that was deemed valuable.
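  • As an illustration only, one of the criteria named above, the presence of a particular frequency component, could be measured along the following lines. This is a minimal sketch and not the scorer of any embodiment; the function name `spectral_power_fraction`, its parameters and the choice of a plain discrete Fourier transform are assumptions made for this example.

```python
import cmath
import math


def spectral_power_fraction(values, period):
    """Hypothetical scorer: share of non-DC spectral power (bins 1..n//2)
    that falls in the DFT bin closest to `period` samples per cycle."""
    n = len(values)
    if n < 2:
        return 0.0
    target_bin = round(n / period)
    if not 1 <= target_bin <= n // 2:
        return 0.0  # requested period cannot be resolved in this window

    # Remove the mean so that the constant (DC) level does not dominate.
    mean = sum(values) / n
    centered = [v - mean for v in values]

    def power(k):
        # Squared magnitude of the k-th DFT coefficient.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        return abs(coeff) ** 2

    total = sum(power(k) for k in range(1, n // 2 + 1))
    if total == 0:
        return 0.0  # flat series: no frequency content at all
    return power(target_bin) / total


# Usage: a metric with a clean 8-sample cycle scores close to 1 for period=8.
daily_like = [math.sin(2 * math.pi * t / 8) for t in range(64)]
score = spectral_power_fraction(daily_like, period=8)
```

  • A value near 1 would suggest that the time-series is dominated by the cycle of interest, whereas a value near 0 would suggest that it is not. An optimized FFT routine would normally replace the direct transform for long windows.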
  • With reference to FIG. 1, a computing system 10 is provided and may be configured for example as an enterprise computing system or as a personal computing system. In either case, the computing system 10 includes multiple computing devices 11, 12, 13, etc., which are configured to be networked together for communication purposes. Each of the multiple computing devices 11, 12, 13, etc., includes among other features a processing circuit 20, a display 30, user input devices 40 and a networking unit 50 as well as a computer program product 100 for selecting time-series data. The processing circuit 20 may be provided as a micro-processor, a central processing unit (CPU) or any other suitable processing device. The display 30 may be provided as a monitor and is configured to display data and information as well as a graphical user interface to an administrator or user. The user input devices 40 may be provided as a mouse and a keyboard combination and are configured to allow the administrator or user to input commands to the processing circuit 20. The networking unit 50 may be provided as an Ethernet or other suitable networking device by which the multiple computing devices 11, 12, 13, etc. are communicative.
  • With reference to FIG. 2, the computer program product 100 includes a computer readable storage medium 110 having first, second, third and fourth program instructions 111, 112, 113 and 114 stored thereon. The first program instructions 111 are executable by the processing circuit 20 of each of the multiple computing devices 11, 12, 13, etc., to cause the processing circuit 20 to assemble a set of analytic assessment tools, such as analytic assessment algorithms and functions, or “time-series scorers” for analyzing time-series data. The second program instructions 112 are executable by the processing circuit 20 to cause the processing circuit 20 to engage the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data. The relevant analytic domain may refer, for example, to information technology (IT) operations analytics, data completeness analytics and frequency/power/spectrum analytics. The third program instructions 113 are executable by the processing circuit 20 to cause the processing circuit 20 to generate, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance and, in some cases, external data. This external data may be, for example, meta-data relating to the sets of the time-series data such as the source of the time-series data and a customer name. The fourth program instructions 114 are executable by the processing circuit 20 to cause the processing circuit 20 to rank the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
  • As defined herein, the characteristics-of-importance may include, but are not limited to, data completeness. That is, the second program instructions 112 may be executable by the processing circuit 20 to cause the processing circuit 20 to engage the analytic assessment tools to measure whether a predefined percentage of expected data is/was present for a given time-series over a given window of time.
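  • The data completeness measurement described above can be sketched as follows. The sketch assumes numeric timestamps and a fixed, known reporting interval; the function name `completeness_score` and the default `required_fraction` of 0.9 are hypothetical choices for illustration and are not taken from the embodiments.

```python
def completeness_score(timestamps, window_start, window_end, interval,
                       required_fraction=0.9):
    """Hypothetical scorer: fraction of expected samples that are present
    in [window_start, window_end), and whether it meets the required level."""
    expected = int((window_end - window_start) // interval)
    if expected <= 0:
        return 0.0, False

    # Count each expected reporting slot at most once, so duplicate
    # samples in one slot do not inflate the measurement.
    filled_slots = {
        int((t - window_start) // interval)
        for t in timestamps
        if window_start <= t < window_end
    }
    fraction = len(filled_slots) / expected
    return fraction, fraction >= required_fraction


# Usage: a metric expected every 60 seconds over a 10-minute window,
# with one sample missing.
fraction, passed = completeness_score(
    [0, 60, 120, 180, 240, 300, 360, 420, 480], 0, 600, 60)
```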
  • In accordance with embodiments, the third program instructions 113 may be configured to cause the processing circuit 20 to combine multiple scores for each set of the time-series data to thereby generate an overall score. Such combining may be executed by the processing circuit 20 by way of a linear combination of each of the multiple scores with numerical weighting or by way of a binary combination of each of the multiple scores (for a binary combination, one approach is that, if any of the characteristics-of-importance of a time-series exceeds a specified level for that characteristic, the time-series is given a score of ‘1’, meaning ‘could be included’; such binary scoring is typical of the first phase, in which candidates for inclusion are sought, and whether a time-series is actually included depends on subsequent scoring based upon analyzing a fuller set of data and other key characteristics-of-interest). In the latter case, a score of “X” as a user-configurable threshold would be required for a given set of the time-series data to be included in eventual analytics.
  • In accordance with further embodiments, the fourth program instructions 114 cause the processing circuit 20 to include higher ranked sets of the time-series data in eventual analytics. In doing so, the fourth program instructions 114 may cause the processing circuit 20 to include sets of the time-series data having scores that are above a predefined threshold in the eventual analytics and reject sets of the time-series data having scores that are below the predefined threshold.
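  • The two ways of combining scores described above can be sketched as follows. The function names, the dictionary layout and the example characteristic names (`completeness`, `periodicity`) are assumptions made for illustration only.

```python
def linear_combination(scores, weights):
    """Weighted sum of the per-characteristic scores (hypothetical helper)."""
    return sum(weights[name] * value for name, value in scores.items())


def binary_combination(scores, levels):
    """Return 1 ('could be included') if any characteristic exceeds the
    level specified for it, otherwise 0 (hypothetical helper)."""
    exceeded = any(
        scores[name] > level
        for name, level in levels.items()
        if name in scores
    )
    return 1 if exceeded else 0


# Usage with illustrative values.
scores = {"completeness": 0.5, "periodicity": 0.8}
overall = linear_combination(scores, {"completeness": 2.0, "periodicity": 1.0})
candidate = binary_combination(scores, {"completeness": 0.9, "periodicity": 0.7})
```

  • The binary form is cheap and suits a first pass that only identifies candidates, while the weighted form preserves enough detail to order the candidates afterwards.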
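  • The threshold-based inclusion and rejection described above can be sketched as follows. The text does not say how a score exactly equal to the threshold is handled; the sketch assumes, purely for illustration, that such a score is included. The function name `select_by_threshold` is likewise hypothetical.

```python
def select_by_threshold(scores, threshold):
    """Rank sets of time-series data by score, highest first, and split
    them into included and rejected lists (hypothetical helper)."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Assumption: a score equal to the threshold counts as included.
    included = [name for name, value in ranked if value >= threshold]
    rejected = [name for name, value in ranked if value < threshold]
    return included, rejected


# Usage with illustrative metric names and scores.
included, rejected = select_by_threshold(
    {"cpu_util": 0.9, "fan_speed": 0.4, "latency_ms": 0.7}, threshold=0.5)
```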
  • The ranking of the sets of the time-series data may be provided as a two-phase or two-step process. In such cases, the initial (optional) phase may be executed as a scoring phase in which a relatively small time sample of data is examined and characteristics-of-interest are measured so that the results of the examination and measurement can be analyzed using binary or weighted combinations to arrive at an initial score. In this first phase, a candidate subset is identified based upon essential criteria that are determined to be present in sufficient strength for inclusion in the subsequent analytics. The second (characterization) phase follows, in which fewer sets of time-series data may need to be examined but those sets that remain are examined across a wider range of times; performing this examination on every set in a single-stage process would otherwise be computationally expensive. The combination of scores for the characteristics-of-interest of the second phase yields the scores used for selection, either by exceeding configured thresholds or by ranking.
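  • The two-phase process described above can be sketched as follows, with a cheap scorer applied to a short recent sample in the first phase and a fuller scorer applied to the whole window only for the surviving candidates. All names (`two_phase_select`, `present_fraction`, `sample_size`, `candidate_level`, `top_n`) are hypothetical, and the use of the most recent samples as the small time sample is an assumption.

```python
def present_fraction(values):
    """Illustrative scorer: fraction of samples that are not missing."""
    return sum(v is not None for v in values) / len(values)


def two_phase_select(series, quick_scorer, full_scorer,
                     sample_size, candidate_level, top_n):
    """Phase 1 screens on a short sample; phase 2 scores and ranks the
    remaining candidates over the full window (hypothetical helper)."""
    # Phase 1: cursory check on the last `sample_size` points only.
    candidates = {
        name: values
        for name, values in series.items()
        if quick_scorer(values[-sample_size:]) >= candidate_level
    }
    # Phase 2: more expensive scoring, restricted to the candidates.
    scored = {name: full_scorer(values) for name, values in candidates.items()}
    ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Usage: 'disk' is dropped in phase 1 because its recent samples are missing.
series = {
    "cpu": [1.0] * 20,
    "mem": [None] * 4 + [1.0] * 16,
    "disk": [1.0] * 15 + [None] * 5,
}
top = two_phase_select(series, present_fraction, present_fraction,
                       sample_size=5, candidate_level=0.8, top_n=2)
```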
  • While it is understood that the first, second, third and fourth program instructions 111, 112, 113 and 114 may be deployed by manual loading thereof directly into a client, server and/or proxy computer by way of a loadable storage medium, such as a CD, DVD, etc., being manually inserted into each of the multiple computing devices 11, 12, 13, etc., the first, second, third and fourth program instructions 111, 112, 113 and 114 may also be automatically or semi-automatically deployed into the computing system 10 by way of a central server 15 or a group of central servers 15 (see FIG. 1). In such cases, the first, second, third and fourth program instructions 111, 112, 113 and 114 may be downloadable into client computers that will then execute the first, second, third and fourth program instructions 111, 112, 113 and 114.
  • In accordance with alternative embodiments, the first, second, third and fourth program instructions 111, 112, 113 and 114 may be sent directly to a client system via e-mail with the first, second, third and fourth program instructions 111, 112, 113 and 114 then being detached to or loaded into a directory. Another alternative would be that the first, second, third and fourth program instructions 111, 112, 113 and 114 be sent directly to a directory on a client computer hard drive. When there are proxy servers, however, loading processes will select proxy server codes, determine on which computers to place the proxy servers' codes, transmit the proxy server codes and then install the proxy server codes on proxy computers. The first, second, third and fourth program instructions 111, 112, 113 and 114 will then be transmitted to the proxy server and subsequently stored thereon.
  • In accordance with embodiments and with reference to FIG. 3, a deployment process of the computer program product described above is provided. The process begins at block 300 and at block 101 with a determination of whether the first, second, third and fourth program instructions 111, 112, 113 and 114 will reside on a server or servers when executed. If so, then the servers that will contain the executables are identified at block 209. The first, second, third and fourth program instructions 111, 112, 113 and 114 for the server or servers are then transferred directly to the servers' storage via FTP or some other protocol or by copying through the use of a shared file system at block 210 such that the first, second, third and fourth program instructions 111, 112, 113 and 114 are installed on the servers at block 211.
  • Next, a determination is made on whether the first, second, third and fourth program instructions 111, 112, 113 and 114 are to be deployed by having users access the first, second, third and fourth program instructions 111, 112, 113 and 114 on a server or servers at block 102. If so, the server addresses that will store the first, second, third and fourth program instructions 111, 112, 113 and 114 are identified at block 103 and a determination is made if a proxy server is to be built at block 200 to store the first, second, third and fourth program instructions 111, 112, 113 and 114. A proxy server is a server that sits between a client application, such as a Web browser, and a real server and operates by intercepting all requests to the real server to see if it can fulfill the requests itself. If not, the proxy server forwards the request to the real server. The two primary benefits of a proxy server are to improve performance and to filter requests.
  • If a proxy server is required, then the proxy server is installed at block 201 and the first, second, third and fourth program instructions 111, 112, 113 and 114 are sent to the (one or more) servers via a protocol, such as FTP, or by being copied directly from the source files to the server files via file sharing at block 202. Another embodiment involves sending a transaction to the (one or more) servers that contain the process software and having the server process the transaction and then receive and copy the process software to the server's file system. Once the process software is stored at the servers, the users may then access the first, second, third and fourth program instructions 111, 112, 113 and 114 on the servers and copy the same to their respective client computer file systems at block 203. Alternatively, the servers may automatically copy the first, second, third and fourth program instructions 111, 112, 113 and 114 to each client and then run an installation program for the first, second, third and fourth program instructions 111, 112, 113 and 114 at each client computer, whereby the user executes the program that installs the first, second, third and fourth program instructions 111, 112, 113 and 114 on his client computer at block 212 and then exits the process at block 108.
  • At block 104, a determination is made as to whether the first, second, third and fourth program instructions 111, 112, 113 and 114 are to be deployed by sending the first, second, third and fourth program instructions 111, 112, 113 and 114 to users via e-mail. If a result of the determination is affirmative, the set of users to whom the first, second, third and fourth program instructions 111, 112, 113 and 114 will be deployed is identified together with the addresses of the user client computers at block 105 and the first, second, third and fourth program instructions 111, 112, 113 and 114 are sent via e-mail to each of the users' client computers. The users then receive the e-mail at block 205 and then detach the first, second, third and fourth program instructions 111, 112, 113 and 114 from the e-mail to a directory on their client computers at block 206. The user executes the program that installs the first, second, third and fourth program instructions 111, 112, 113 and 114 on his client computer at block 212 and then exits the process at block 108.
  • Lastly, a determination is made on whether the first, second, third and fourth program instructions 111, 112, 113 and 114 will be sent directly to user directories on their client computers at block 106. If so, the user directories are identified at block 107 and the process software is transferred directly to the user's client computer directories at block 207. This can be done in several ways such as, but not limited to, sharing the file system directories and then copying from the sender's file system to the recipient user's file system or, alternatively, using a transfer protocol such as File Transfer Protocol (FTP). The users access the directories on their client file systems in preparation for installing the first, second, third and fourth program instructions 111, 112, 113 and 114 at block 208, execute the program that installs the first, second, third and fourth program instructions 111, 112, 113 and 114 at block 212 and then exit the process at block 108.
  • With reference to FIG. 4, a method for selecting time-series data is provided. The method includes assembling a set of analytic assessment tools for time-series data, as shown at block 401, engaging the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data, as shown at block 402, generating, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance, as shown at block 403 and as described above, and ranking the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection, as shown at block 404 and as described above.
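  • The four blocks of FIG. 4 can be sketched end to end as follows. The two scorers, the equal weights and the data set names are invented for illustration; only the assemble, engage, generate and rank sequence is taken from the description.

```python
def assemble_tools():
    """Block 401: assemble a set of scorers (illustrative examples only)."""
    return {
        "completeness": lambda xs: sum(x is not None for x in xs) / len(xs),
        "non_constant": lambda xs: 1.0 if len({x for x in xs if x is not None}) > 1 else 0.0,
    }


def engage_tools(tools, datasets):
    """Block 402: measure every characteristic for every set of data."""
    return {
        name: {tool: scorer(values) for tool, scorer in tools.items()}
        for name, values in datasets.items()
    }


def generate_scores(measurements, weights):
    """Block 403: turn the measurements into one score per set."""
    return {
        name: sum(weights[tool] * value for tool, value in measured.items())
        for name, measured in measurements.items()
    }


def rank_sets(scores):
    """Block 404: order the sets by score, highest first."""
    return sorted(scores, key=scores.get, reverse=True)


# Usage with three small, made-up sets of time-series data.
datasets = {
    "latency": [1, 2, 3, 4],
    "flat": [5, 5, 5, 5],
    "gappy": [None, None, 1, None],
}
weights = {"completeness": 0.5, "non_constant": 0.5}
ranking = rank_sets(generate_scores(engage_tools(assemble_tools(), datasets), weights))
```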
  • A potential use of the computer program product and the method described above would, in some cases, be to generate point tooling or “data mediation tooling” at potential data sources and databases and, for time-series data stored therein, to analyze the time-series data and to present ranked lists to users for inclusion or exclusion with respect to downstream analysis. As noted above, aspects of the disclosure can be automated whereby, for example, only time-series data scoring above a certain level might be flagged for subsequent inclusion in downstream analysis. Moreover, users will have an opportunity to adjust weighting values assigned to various factors and to build upon those supplied by a pre-existing development team. For example, the user may decide that particular time-series data very directly measures key customer experience aspects and is more important to include in downstream analytics than the corresponding score might otherwise indicate. Here, an example of a binary score would be that, if the time-series is directly associated with customer experience measurements (e.g., it is a well-known metric type), it may automatically be scored/flagged for inclusion based upon the best practice.
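  • The user adjustments described above can be sketched as follows: user-supplied weights take precedence over those supplied by the development team, and a time-series of a well-known metric type is flagged for inclusion regardless of its score. The function names, the `metric_type` metadata key and the example values are hypothetical.

```python
def effective_weights(default_weights, user_overrides):
    """User-adjusted weights replace the supplied defaults where given."""
    return {**default_weights, **user_overrides}


def flag_for_inclusion(metadata, score, threshold, well_known_types):
    """Binary flag: well-known metric types are always included; any
    other time-series must reach the threshold (hypothetical rule)."""
    if metadata.get("metric_type") in well_known_types:
        return True
    return score >= threshold


# Usage with illustrative values.
weights = effective_weights({"completeness": 1.0, "periodicity": 1.0},
                            {"periodicity": 2.5})
keep = flag_for_inclusion({"metric_type": "page_load_time"}, score=0.2,
                          threshold=0.5, well_known_types={"page_load_time"})
```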
  • Advantages associated with the computer program product and method described above include, but are not limited to, the fact that the features described herein can work without explicit notions of ‘best-practices’ or pre-existing lists of metrics since important factors that are generally included in the creation of such best-practices or lists are codified within the time series scoring mechanisms. Thus, from day 1 of a product deployment, data selection can be explicitly facilitated and can deal with vagaries of a particular environment (i.e., where a metric type X, based upon observation and analysis, is determined to be acceptable in one environment but would not be in another). In addition, administrators or users of the computer program product and method will be able to select the top-N sets of the time-series data and have confidence that the set is a “best” fit for a given case.
  • The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
  • The present invention may be a system, a method, and/or a computer program product at any possible technical detail level of integration. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
  • The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
  • Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
  • Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuitry, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
  • Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
  • These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
  • The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
  • The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

Claims (20)

What is claimed is:
1. A computer program product for selecting time-series data, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions being readable and executable by a processing circuit to cause the processing circuit to:
assemble, by the processing circuit, a set of analytic assessment tools for time-series data;
engage, by the processing circuit, the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data;
generate, by the processing circuit, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance; and
rank, by the processing circuit, the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
2. The computer program product according to claim 1, wherein the program instructions cause the processing circuit to combine multiple scores for each set of the time series data to generate an overall score.
3. The computer program product according to claim 2, wherein the combining by the processing circuit comprises linearly combining each of the multiple scores.
4. The computer program product according to claim 2, wherein the combining by the processing circuit comprises binary combining of each of the multiple scores.
5. The computer program product according to claim 1, wherein the program instructions cause the processing circuit to generate the score based on the associated characteristics-of-importance and external data.
6. The computer program product according to claim 1, wherein the program instructions cause the processing circuit to include higher ranked sets of the time-series data in an eventual analytics.
7. The computer program product according to claim 6, wherein the program instructions cause the processing circuit to:
include sets of the time-series data having scores that are above a predefined threshold in the eventual analytics; and
reject sets of the time-series data having scores that are below the predefined threshold.
8. A computer program product for selecting time-series data, the computer program product comprising:
a computer readable storage medium having stored thereon:
first program instructions executable by a processing circuit to cause the processing circuit to assemble a set of analytic assessment tools for time-series data;
second program instructions executable by the processing circuit to cause the processing circuit to engage the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data;
third program instructions executable by the processing circuit to cause the processing circuit to generate, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance; and
fourth program instructions executable by the processing circuit to cause the processing circuit to rank the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
9. The computer program product according to claim 8, wherein the third program instructions cause the processing circuit to combine multiple scores for each set of the time series data to generate an overall score.
10. The computer program product according to claim 9, wherein the combining by the processing circuit comprises linearly combining each of the multiple scores.
11. The computer program product according to claim 9, wherein the combining by the processing circuit comprises binary combining of each of the multiple scores.
12. The computer program product according to claim 8, wherein the third program instructions cause the processing circuit to generate the score based on the associated characteristics-of-importance and external data.
13. The computer program product according to claim 8, wherein the fourth program instructions cause the processing circuit to include higher ranked sets of the time-series data in an eventual analytics.
14. The computer program product according to claim 13, wherein the fourth program instructions cause the processing circuit to:
include sets of the time-series data having scores that are above a predefined threshold in the eventual analytics; and
reject sets of the time-series data having scores that are below the predefined threshold.
15. A computer-implemented method for selecting time-series data, comprising:
assembling, by a processor, a set of analytic assessment tools for time-series data;
engaging the analytic assessment tools to measure characteristics-of-importance in a relevant analytic domain for sets of the time-series data;
generating, as a measurement result, a score for each set of the time-series data based on the associated characteristics-of-importance; and
ranking the sets of the time-series data in accordance with the score for each set of the time-series data for subsequent time-series data selection.
16. The computer-implemented method according to claim 15, further comprising combining multiple scores for each set of the time series data to generate an overall score.
17. The computer-implemented method according to claim 16, wherein the combining comprises linearly combining each of the multiple scores.
18. The computer-implemented method according to claim 16, wherein the combining comprises binary combining of each of the multiple scores.
19. The computer-implemented method according to claim 15, wherein the generating comprises generating the score based on the associated characteristics-of-importance and external data.
20. The computer-implemented method according to claim 15, further comprising:
including higher ranked sets of the time-series data in an eventual analytics;
including sets of the time-series data having scores that are above a predefined threshold in the eventual analytics; and
rejecting sets of the time-series data having scores that are below the predefined threshold.
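
The steps recited in claims 15 to 20 (assess, score, combine, rank, threshold) can be illustrated with a minimal Python sketch. This is not the claimed implementation; it is one possible reading under stated assumptions. The two assessment tools (`completeness`, `variability`), the weights, the cutoff, and the choice to treat a score exactly equal to the threshold as rejected are all hypothetical and do not come from the specification. "Binary combining" is interpreted here as a logical AND over per-tool pass/fail results, which is likewise an assumption.

```python
from statistics import pstdev
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Series = Sequence[Optional[float]]

# Hypothetical assessment tools: each maps one time series to a score in [0, 1].
def completeness(series: Series) -> float:
    """Fraction of samples that are present (not None)."""
    if not series:
        return 0.0
    return sum(v is not None for v in series) / len(series)

def variability(series: Series) -> float:
    """1.0 if the present samples vary at all, else 0.0."""
    values = [v for v in series if v is not None]
    if len(values) < 2:
        return 0.0
    return 1.0 if pstdev(values) > 0 else 0.0

def linear_combine(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the per-tool scores (one reading of 'linearly combining')."""
    return sum(w * s for w, s in zip(weights, scores))

def binary_combine(scores: Sequence[float], cutoff: float = 0.5) -> float:
    """Assumed reading of 'binary combining': 1.0 only if every score meets the cutoff."""
    return 1.0 if all(s >= cutoff for s in scores) else 0.0

def rank_series(
    datasets: Dict[str, Series],
    tools: Sequence[Callable[[Series], float]],
    combine: Callable[[Sequence[float]], float],
) -> List[Tuple[str, float]]:
    """Score each named series with every tool, combine, and sort best-first."""
    scored = [
        (name, combine([tool(series) for tool in tools]))
        for name, series in datasets.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

def select(ranked: Sequence[Tuple[str, float]], threshold: float) -> List[str]:
    """Keep series scoring strictly above the threshold; reject the rest."""
    return [name for name, score in ranked if score > threshold]
```

A caller would pass a mapping of series names to sample lists, for example `rank_series(datasets, [completeness, variability], lambda s: linear_combine(s, [0.5, 0.5]))`, and then apply `select` to the ranked result to obtain the sets forwarded to the eventual analytics.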
US14/862,395 2015-09-23 2015-09-23 Selecting time-series data for information technology (IT) operations analytics anomaly detection Expired - Fee Related US10587487B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/862,395 US10587487B2 (en) 2015-09-23 2015-09-23 Selecting time-series data for information technology (IT) operations analytics anomaly detection

Publications (2)

Publication Number Publication Date
US20170085448A1 (en) 2017-03-23
US10587487B2 (en) 2020-03-10

Family

ID=58283336

Family Applications (1)

Application Number Title Priority Date Filing Date
US14/862,395 Expired - Fee Related US10587487B2 (en) 2015-09-23 2015-09-23 Selecting time-series data for information technology (IT) operations analytics anomaly detection

Country Status (1)

Country Link
US (1) US10587487B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11301352B2 (en) 2020-08-26 2022-04-12 International Business Machines Corporation Selecting metrics for system monitoring

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060276995A1 (en) * 2005-06-07 2006-12-07 International Business Machines Corporation Automated and adaptive threshold setting
US20130110761A1 (en) * 2011-10-31 2013-05-02 Krishnamurthy Viswanathan System and method for ranking anomalies
US20130166572A1 (en) * 2010-06-28 2013-06-27 Nec Corporation Device, method, and program for extracting abnormal event from medical information
US20150088606A1 (en) * 2013-09-20 2015-03-26 Tata Consultancy Services Ltd. Computer Implemented Tool and Method for Automating the Forecasting Process

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7831464B1 (en) 2006-04-06 2010-11-09 ClearPoint Metrics, Inc. Method and system for dynamically representing distributed information
US8090592B1 (en) * 2007-10-31 2012-01-03 At&T Intellectual Property I, L.P. Method and apparatus for multi-domain anomaly pattern definition and detection
US20090248722A1 (en) 2008-03-27 2009-10-01 International Business Machines Corporation Clustering analytic functions
US8855420B2 (en) 2009-04-09 2014-10-07 France Telecom Descriptor determination in a multimedia content
JP5532150B2 (en) 2011-01-24 2014-06-25 日本電気株式会社 Operation management apparatus, operation management method, and program
US8458090B1 (en) 2012-04-18 2013-06-04 International Business Machines Corporation Detecting fraudulent mobile money transactions
US9563670B2 (en) 2013-03-14 2017-02-07 Leidos, Inc. Data analytics system
US20160239264A1 (en) 2013-06-10 2016-08-18 Ge Intelligent Platforms, Inc. Re-streaming time series data for historical data analysis
US20160062950A1 (en) * 2014-09-03 2016-03-03 Google Inc. Systems and methods for anomaly detection and guided analysis using structural time-series models
US10592093B2 (en) 2014-10-09 2020-03-17 Splunk Inc. Anomaly detection
US9904584B2 (en) * 2014-11-26 2018-02-27 Microsoft Technology Licensing, Llc Performance anomaly diagnosis

Also Published As

Publication number Publication date
US10587487B2 (en) 2020-03-10

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GARRETT, RYAN A.;MCKEOWN, ROBERT J.;REEL/FRAME:036633/0501

Effective date: 20150922

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

ZAAA Notice of allowance and fees due

Free format text: ORIGINAL CODE: NOA

ZAAB Notice of allowance mailed

Free format text: ORIGINAL CODE: MN/=.

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20240310