WO2016138616A1 - Data query job submission management - Google Patents

Data query job submission management

Info

Publication number
WO2016138616A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
query
queries
running
test
Prior art date
Application number
PCT/CN2015/073491
Inventor
Haitao Liu
Qianqian NIE
Panxin ZHU
Original Assignee
Microsoft Technology Licensing, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing, Llc
Priority to PCT/CN2015/073491 (WO2016138616A1)
Priority to CN201580056607.4A (CN107077490B)
Publication of WO2016138616A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G06F11/0754 Error or fault detection not based on redundancy by exceeding limits
    • G06F11/0757 Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0727 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a storage system, e.g. in a DASD or network based storage system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3692 Test management for test results analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G06F11/3419 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/81 Threshold

Definitions

  • Enterprises, for example, companies, educational entities, government entities, and the like, often operate hundreds or thousands of computers and computing systems for their employees, students and affiliates. Such computers and computing systems are often operated at various enterprise locations or at large data centers. Many enterprises store and process data via a data storage and processing services provider operating remotely from the enterprise, where data storage, data processing and online services are provided at the remote services provider over a distributed computing network such as the Internet.
  • an enterprise sends data queries to the services provider for running various processing jobs against the enterprise’s data and systems stored and operated at the services provider or at an associated services provider data center.
  • the query submitted by the enterprise includes query logic created by the enterprise so that the enterprise may perform self-service queries on the enterprise’s data and subscribed-to systems at the services provider or data center.
  • the queries passed to the services provider or data center by the enterprise are often problematic for some reason, for example, code errors, version errors, and the like in or associated with the query.
  • the query may run for an extended period of time, for example, 20 hours, without actually completing as expected by the querying enterprise or subscriber. In such a case, the limited resources of the services provider or data center may be consumed or diminished by the erroneous query which may prevent or hamper other subscribers from running needed queries.
  • Often, an enterprise runs test queries against a limited amount of data to test the operation of the queries so that they can be modified as needed for eventual use as production queries against large production datasets. If the test queries have problems, as described above for production queries, the test queries may similarly run too long and thus consume limited test query resources and prevent or hamper others from running their test queries.
  • a production query is received and run against a given dataset or system. If the runtime for the query exceeds a threshold period, for example, ten hours, the query is stopped and marked as a poison query, which puts the query into a semi-quarantined state. The query subscriber is notified and is allowed to send a subsequent query of the same type as part of the same query job, for example, a job composed of daily queries over the course of two weeks. If a threshold number of queries are designated as poison queries during a prescribed time period, for example, three poison queries in a seven-day period, the entire query job may be quarantined, meaning it is shut down and prevented from running against the subscriber’s data and systems.
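The production-query handling described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `QueryJob` and `check_running_query` are hypothetical, and the thresholds are simply the example values given in the text (ten hours, three poison queries, seven days).

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Example thresholds taken from the text; real values would be configurable.
RUNTIME_LIMIT = timedelta(hours=10)
POISON_LIMIT = 3
POISON_WINDOW = timedelta(days=7)


@dataclass
class QueryJob:
    """Hypothetical record for a recurring query job (e.g. daily queries)."""
    job_id: str
    poison_times: list = field(default_factory=list)
    quarantined: bool = False


def check_running_query(job, started_at, now):
    """Classify one running query of a job as 'ok', 'poison' or 'quarantined'."""
    if job.quarantined:
        return "quarantined"
    if now - started_at <= RUNTIME_LIMIT:
        return "ok"
    # Runtime exceeded: a caller would stop the query and notify the subscriber.
    job.poison_times.append(now)
    # Keep only poison events that fall inside the prescribed window.
    job.poison_times = [t for t in job.poison_times if now - t <= POISON_WINDOW]
    if len(job.poison_times) >= POISON_LIMIT:
        job.quarantined = True
        return "quarantined"
    return "poison"
```

In this sketch a single over-long query is only marked as poison, so later queries of the same job may still run; the whole job is quarantined only once enough poison events accumulate inside the window.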
  • a test query is received and is run against a given dataset or system designated for use with test queries. If the runtime of the test query exceeds a threshold period, for example, one hour, operation of the test query is paused, and the test query is moved from a run queue to a wait queue to allow other test queries in the run queue to run against their test data or systems without delay.
  • the paused test query may be placed in a high priority position in the wait queue so that it may next be run after other test queries in the run queue are allowed to process.
  • a subscriber that submits a test query that takes an extended period of time to process may have a proper expectation as to the runtime without preventing other test query subscribers from running their test queries in a reasonable amount of time.
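The test-query handling described above (pause on timeout, move from the run queue to the wait queue, resume later) can be sketched as below. The class and method names are hypothetical, the one-hour limit is the example value from the text, and treating the front of the wait queue as the "high priority position" is an assumption.

```python
from collections import deque

TEST_RUNTIME_LIMIT_HOURS = 1.0  # example threshold from the text


class QueryQueues:
    """Hypothetical run queue / wait queue pair for test queries."""

    def __init__(self):
        self.run_queue = deque()
        self.wait_queue = deque()

    def submit(self, query_id):
        """Place a newly received test query at the back of the run queue."""
        self.run_queue.append(query_id)

    def pause_if_over_limit(self, query_id, runtime_hours):
        """Move a test query that exceeded the limit to the wait queue."""
        over_limit = runtime_hours > TEST_RUNTIME_LIMIT_HOURS
        if not over_limit or query_id not in self.run_queue:
            return False
        self.run_queue.remove(query_id)
        # Assumption: the front of the wait queue is the high priority position.
        self.wait_queue.appendleft(query_id)
        return True

    def resume_next(self):
        """Return the highest-priority paused query to the run queue, if any."""
        if not self.wait_queue:
            return None
        query_id = self.wait_queue.popleft()
        self.run_queue.append(query_id)
        return query_id
```

A scheduler built on this sketch would call `resume_next` once the other test queries in the run queue have had a chance to process.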
  • Fig. 1 is a simplified block diagram of one example of a system architecture for uploading and/or downloading data to and/or from an external data center or enterprise and a services provider or data center at which a production or test query may be run.
  • Fig. 2A is a simplified block diagram of one example of a data uploader module for uploading and/or downloading data to and/or from an external data center or enterprise and a services provider or data center at which a production or test query may be run.
  • Fig. 2B is a simplified block diagram of one example of a proxy service for ensuring that data uploaded from a source computing system to a secure computing system is processed from a trustworthy source/requester.
  • Fig. 2C is a simplified block diagram of one example of a system architecture for uploading queries to a production query domain or a test query domain for running production or test queries against data or systems owned or subscribed to by a querying enterprise or subscriber.
  • Fig. 3A is a flowchart of an example method for managing a production query directed to data or systems owned and/or subscribed to by a querying subscriber.
  • Fig. 3B is a flowchart of an example method for managing a test query directed to data or systems owned and/or subscribed to by a querying subscriber.
  • Fig. 4 is a block diagram illustrating example physical components of a computing device with which aspects of the present invention may be practiced.
  • Figs. 5A and 5B are simplified block diagrams of a mobile computing device with which aspects of the present invention may be practiced.
  • Fig. 6 is a simplified block diagram of a distributed computing system in which aspects of the present invention may be practiced.
  • enterprises of various types often send both production and test data queries to one or more data centers and to services providers through which they store and process data and online software systems for running various processing jobs against the enterprise’s data and systems.
  • an enterprise may run a query at the services provider that computes usage of the enterprise’s subscribed-to online software services over each 24 hour period.
  • reports may be generated at the services provider and may be passed back to the enterprise to allow it to make decisions about its online software subscriptions.
  • an enterprise may send a query to the services provider to parse a massive amount of data covering the enterprise’s sales figures on a weekly basis for each operating quarter.
  • test queries may be run against a limited data set or systems data for purposes of testing the operation of the test queries so that they may be modified as needed for eventual use as production queries.
  • queries submitted by an enterprise include query logic created by the enterprise so that the enterprise may perform self-service queries on the enterprise’s data and subscribed-to systems at the services provider or data center.
  • queries passed to the services provider or data center by the enterprise are often problematic for some reason.
  • the query logic may have a code error, or a version number for a data center system called by the query may have changed or may have become corrupted, or a data read request authentication may have failed, or the like. That is, any number of problems may be present in a given query passed to the services provider or data center for running against data or systems.
  • the query may run for an extended period of time, for example, 20 hours, without actually completing as expected by the querying enterprise or subscriber.
  • the limited resources of the services provider or data center may be consumed, and the queries of other parties may be unreasonably delayed.
  • Such processing problems are particularly problematic when queries are run in self-service operation where the querying party may run the queries without assistance from the services provider. That is, if a self-service query operation is utilized and a given query runs for an excessive time period, the resources of the services provider may be tied up by the poorly operating query without the knowledge of the services provider.
  • FIG. 1 is a simplified block diagram of one example of a system architecture for uploading data, including production and test queries, from a source location to a destination location.
  • the system architecture 100 is comprised of various example computing components for uploading production and test query data from a variety of source computing systems (or individual computers) to a variety of destination locations such as data centers and services providers.
  • a data center 105 is illustrative of a data center operated by an enterprise or subscriber of services (hereafter “subscribers”) that may need to upload data of various types, including production and test queries, to a data center or services provider (hereafter “services provider”) at which uploaded data and queries may be stored and/or processed.
  • the data center 105 may house hundreds, thousands or more individual computers/computing systems 110 on which may be stored data of a variety of data types that may be processed using a variety of different computing processes, for example, a variety of software applications.
  • each of the computing devices 110 may include computers of various types, for example, server computers, for storing user data in databases, electronic mail systems, document management systems, and the like, and the computer/computing systems 110 may be used for running a variety of computing system software applications, for example, database applications, electronic mail systems applications, web services applications, online software provision applications, productivity applications, data management system applications, telecommunications applications, and the like.
  • the data center 105 is also illustrative of one of many data centers that may be co-located, or that may be located at different locations and that may be associated with each other via various transmission systems for passing data between disparate data centers.
  • Although the data center 105 is illustrated as a data center in which numerous computer systems 110 may be located for provision of data and services, as described above, the data center 105 is equally illustrative of an entity such as a company, educational facility, government facility or a single computing device, for example, a desktop, laptop, tablet, handheld, or other computing device operated by an individual user from which user data and/or computer system production and test queries may be uploaded to a services provider.
  • each computing device 110 is associated with an uploader module 115 that is operative for uploading user and/or system data and production or test queries from each associated computer/computing system 110.
  • the uploader module 115 is described in further detail below with respect to Fig. 2A.
  • an uploader module 115 may be installed on each associated computer/computing system 110 or may be accessed by each computer/computing system 110.
  • an edge router 120 is illustrative of a typical router device for passing queries from a given uploader module to systems external to the data center 105.
  • the edge router 120 may be responsible for ensuring that data passed from a given data center 105 is properly passed to a desired destination system component, for example, that packetized data passing from the uploader module is properly routed to a correct destination component of the system 100.
  • the distributed computing network 125 (illustrated in Fig. 1 as a dotted line) is illustrative of any network such as the Internet or an intranet through which data may be passed from the data center to components external to the data center such as destination storage repositories 145a-c of the secure data management center/repository, described below.
  • the edge router 135 is illustrative of a receiving edge router through which queries may be passed to a proxy service 140 responsible for ensuring received queries are properly authenticated prior to allowing received data to be passed to one or more destination storage repositories 145a-c at the services provider 107. Operation of the proxy service 140 is described in further detail below with reference to Fig. 2B.
  • the storage repositories 145a-c are illustrative of any data storage repository that may be authorized to receive data or queries uploaded via the uploader modules 115.
  • the destination storage repositories 145a-c may be associated with a secure data management center/repository of a services provider for receiving, storing and analyzing data (in response to one or more production or test queries) associated with computing systems and software services provided for subscribers of the services provider.
  • the data repository 145a may serve as a primary secure data receiving repository for a services provider.
  • Access points 152, 154 and 156 represent access points at the data repository 145a through which data and queries may be passed from the proxy service 140 for uploading data to one or more specific data locations 160, or for passing data or queries through one or more specific data access points 158, 162 for passing the data to other data repositories 145b, 145c.
  • the data repository 145b may be designated for receiving and analyzing user data and systems data, as well as various queries associated with one or more services or data types.
  • the data repository 145b is illustrative of a cloud services system operated at the secure data management center/repository 144 of a given services provider.
  • a scheduler module 166 is illustrative of a software module or device operative for scheduling data uploads and downloads to and from the data repository 145b.
  • a pumper module 168 is illustrative of a software module or device operative for distributing data to and from components of the data repository 145b.
  • An analytics module 170 is illustrative of a software module or device operative for outputting and/or displaying or otherwise presenting data from the storage repository 145b.
  • the destination storage repository 145c is illustrative of another component of the services provider 107.
  • the destination storage repository 145c may be in the form of a database system operated at the services provider 107.
  • a scheduler module 166 is illustrative of a software module or device operative for scheduling data uploads and downloads to and from the data repository 145c.
  • a pumper module 168 is illustrative of a software module or device operative for distributing data to and from components of the data repository 145c.
  • An analytics module 170 is illustrative of a software module or device operative for outputting and/or displaying or otherwise presenting data from the storage repository 145c.
  • components of the services provider and the individual components 145a, 145b, 145c are for purposes of example and illustration only and are not limiting of various other components or systems that may be operated as part of the secure data management center/repository to which data may be uploaded or from which data may be downloaded from/to an external (and potentially unsecure) data generator/user.
  • components of the secure data management center/repository 107 may provide for online software and data management provision, for example, provision of word processing services, slide presentation application services, database application services, spreadsheet application services, telecommunications application services, and the like provided to various users via one or more online software application services and data management systems.
  • a description of a query receiving and processing system that may operate at the services provider in one of its components is provided below with reference to Fig. 2C.
  • the components of the system 100 are equally operative for passing data, including responses to and/or notifications associated with queries, from the services provider 107 back to the data center 105.
  • data or queries (whether production or test) uploaded to a data center or services provider may be uploaded via an uploader module for ensuring that uploaded data and/or queries are properly passed from an originating computing system to an appropriate storage or processing repository at a data center or services provider for processing, as described herein.
  • Referring to Fig. 2A, operation of the data uploader 115 and data downloader 115 is illustrated and described.
  • the data uploader and data downloader are software applications or software modules containing sufficient computer executable instructions for reading, transforming (if required) and exporting data of a variety of data types from the external data generator/user on the unsecure side to the secure data management center/repository on the secure side.
  • the data uploaders and downloaders are also operative to pass data from the secure side back to the unsecure side.
  • the data uploader and downloader may be identical modules and are only designated as uploader versus downloader based on the direction of the data movement.
  • the data uploader or downloader (hereafter referred to as data loader) 115 includes an operation module 205 for receiving data upload instructions and for directing the processing of components of the data loader module 115.
  • a configuration file reader 210 is a module with which the data loader 115 reads a configuration file 215 for data uploading instructions, as described below.
  • a data reader module 225 is operative to read data of a variety of data types via a data reader plug-in module 227.
  • a data transformation module 230 is a module operative for transforming data or queries in response to data transformation information read from the configuration file 215 via a data transformation plug-in 232.
  • a data export module 235 is operative to export data or queries from memory to a designated destination storage repository 145a-c as designated by instructions received from the configuration file 215 via the data export plug-in 237.
  • a specific data export plug-in 237 may be used for directing a production query to a production queries domain or for directing a test query to a test queries domain, as described below with reference to Fig. 2C.
  • Various data reader, data transformation and data export plug-in modules 227, 232, 237 may be provided to the data loaders 115 or may be accessed by the data loader modules 115 as required for different types of data reading, transformation and export.
  • a services provider which needs to receive transformed data from various computing devices operated at a data center 105 may provide data reader plug-ins, data transformation plug-ins, and data export plug-ins for use by data loader modules 115 for reading, transforming and exporting data according to their individual needs.
  • the configuration file 215 is illustrative of a file that may be accessed by the data loader module 115 for receiving data and query uploading instructions.
  • Data uploading instructions contained in the configuration file 215 may provide information including the data types associated with a query to be uploaded, data reading instructions, and security information for allowing the loader module to access desired data.
  • the configuration file may provide instructions on how desired data is to be transformed, if required, and instructions on where uploaded data is to be stored and in what file type exported data is to be stored.
  • the configuration file may also provide the data loader with a specified export plug-in for causing the data loader to pass production and test queries to appropriate components of the services provider 107.
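The configuration-driven read, transform and export flow of the data loader described above can be sketched as follows. Everything here is illustrative: the registry names, the configuration keys (`reader`, `transform`, `export`) and the in-memory destinations are assumptions standing in for the plug-ins 227, 232 and 237 and the configuration file 215, whose actual formats the text does not specify.

```python
# Hypothetical plug-in registries; names and signatures are illustrative only.
READERS = {
    "lines": lambda source: source.splitlines(),
}
TRANSFORMS = {
    "none": lambda records: records,
    "strip": lambda records: [r.strip() for r in records],
}

# In-memory stand-ins for the production and test queries domains.
EXPORTED = {"production_queries": [], "test_queries": []}
EXPORTERS = {
    "production": lambda records: EXPORTED["production_queries"].extend(records),
    "test": lambda records: EXPORTED["test_queries"].extend(records),
}


def run_loader(config, source):
    """Read, optionally transform, then export, as directed by the config."""
    records = READERS[config["reader"]](source)
    records = TRANSFORMS[config.get("transform", "none")](records)
    EXPORTERS[config["export"]](records)
    return len(records)
```

For example, a configuration of `{"reader": "lines", "transform": "strip", "export": "test"}` would route the uploaded queries to the test destination, mirroring how a specified export plug-in directs queries to the appropriate domain.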
  • the proxy service 140 is a system or software module operative to authenticate requests for uploading data and/or queries to a services provider and/or for authenticating data download/read requests (including responses to or notifications associated with queries) from a services provider.
  • the proxy service 140 includes a data transmission module 250 which is a software module and/or system component operative to receive data transmissions from a loader module 115 for passing uploaded data and queries from a computing device 110.
  • the authentication module 255 is a device or software module operative to authenticate the source of a data upload/download/read request to ensure that the source is trustworthy for either uploading data to a secure repository or for downloading or reading data from a secure repository.
  • the memory 260 is illustrative of a memory location housed either in the proxy service 140 or accessible by the proxy service 140 in which may be stored information required for authenticating upload/download/read requests.
  • the Internet protocol (IP) address list 265 is illustrative of a list of IP addresses that may be used for comparing against an IP address associated with a data upload/download/read requester.
  • the certificate list 270 is illustrative of a list of authentication certificates that may be used to compare with an authentication certificate associated with a data upload/download/read requester.
  • a transmission approved list 275 is illustrative of a list of approved sources from which upload/download/read requests previously have been authenticated and approved.
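A minimal sketch of how the proxy service's three lists might be combined is shown below. The text does not say how the IP address list 265, certificate list 270 and transmission approved list 275 interact, so the rule used here (both the IP address and the certificate must be listed, and successful requests are remembered) is an assumption, as are the class and method names.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyAuthenticator:
    """Hypothetical authenticator holding the three lists kept in memory 260."""
    ip_address_list: set = field(default_factory=set)
    certificate_list: set = field(default_factory=set)
    transmission_approved_list: set = field(default_factory=set)

    def authenticate(self, source_id, ip_address, certificate):
        """Return True if the requester may upload, download or read data."""
        request = (source_id, ip_address, certificate)
        # Fast path: an identical request was authenticated and approved before.
        if request in self.transmission_approved_list:
            return True
        # Assumption: both the IP address and the certificate must be known.
        if (ip_address in self.ip_address_list
                and certificate in self.certificate_list):
            self.transmission_approved_list.add(request)
            return True
        return False
```

Keying the approved list on the full request tuple means a previously approved source is not trusted automatically if it later appears with a different address or certificate.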
  • a given enterprise or subscriber of production or test query services often desires to run production queries and test queries against data and systems owned or subscribed to by the enterprise or subscriber.
  • an enterprise or subscriber will be referred to as “subscriber” to mean any party sending a production or test query for running against data or systems, as described herein.
  • a number of data centers 105a-n are provided as described above with reference to Fig. 1.
  • each of the data centers 105a-n may upload data and data queries through the proxy service 140 to storage repositories or processing components/systems of a services provider or data center as described above with reference to Fig. 2B.
  • the proxy service 140 may be operative to pass data or data queries directly to specified storage repositories or components of a receiving data center or services provider based on data export plug-ins utilized by an uploader module responsible for passing the data or data queries through the proxy service 140.
  • a production queries domain 280 and a test queries domain 290 may be operated at a services provider to which production and test queries may be passed for running production or test queries against data or systems owned and/or subscribed-to by a querying subscriber.
  • an enterprise operating at the data center 105a may pass a production data query through the proxy service 140 to the production queries domain 280 for running a production query against data or data systems owned and/or subscribed-to by the enterprise operating at the data center 105a.
  • a test query may be passed from a data center 105a-n through the proxy service 140 to the test queries domain 290 for running the test query against a limited set of data or systems for testing the operation of the test query so that the test query may be modified, revised or edited as needed for eventual use as a production query against large datasets and complex systems.
  • a production query passed by a given enterprise to the production queries domain 280 may cause the computation of employee login frequency to enterprise computing systems for 50,000 employees operating an equal number of computing systems for the enterprise.
  • the query may require the running of such a computation daily for all employees over the period of a month so that a report may be generated in response to the query that may be passed back to the enterprise for allowing enterprise personnel to make decisions regarding the proper utilization of their employees and associated computing systems.
  • before putting such a production query into use, an enterprise may wish to generate a test query for testing the operability of the query against a limited amount of data and/or systems so that the test query may be modified and/or de-bugged for eventual use as a production query.
  • the production queries domain 280 is illustrative of a collection of software modules and computing systems, as well as databases and/or data access points, for allowing production queries to be received and processed against data and/or systems owned and/or subscribed-to by a querying subscriber.
  • the production queries domain is housed in the storage repository 145a of the services provider 107, illustrated and described above with reference to Fig. 1.
  • the production queries domain 280 may be located at and operated at any other component of the services provider, for example, the components 145b and 145c, as illustrated and described above with reference to Fig. 1.
  • the production queries domain 280 includes a scheduler module 281 operative to receive queries from a querying subscriber and for scheduling performance of the received query against desired data or systems.
  • a run queue may be established for running data queries against datasets and/or systems of various subscribers.
  • the scheduler module 281 is operative for scheduling the running of a received data query in a queue of other queries to be run against various datasets and/or systems in accordance with the limited query resources of the production queries domain 280.
  • the query processor 282 is illustrative of a software module or device operative to receive and execute a production query as requested by a querying subscriber.
  • the jobs repository 284 is illustrative of a database or other storage repository for storing received data queries for eventual execution against prescribed user or systems data accessed at the jobs data repository 285.
  • Information and data responsive to the running of a received data query may be stored at the query data repository 286 for processing and reporting by the query processor 282.
  • a quarantine information module 283 is illustrative of a software module or device operative to generate and store information about a quarantined production query run or production query job, as described herein.
  • the test queries domain 290 is illustrative of a collection of software modules, devices and data operative for processing and reporting on test queries run against limited sets of data and systems for allowing a querying subscriber to test a query for eventual use as a production query.
  • the test queries domain 290 includes a scheduler module 291 operative for scheduling the performance of a received test query against test data or systems.
  • the query processor 292 is operative to process a scheduled test query by placing the test query in a run queue 293 comprised of test queries scheduled for running against prescribed data or systems, or by placing the test query in a wait queue 294 comprised of a list of test queries that are paused while waiting for an opening on the run queue 293.
  • the jobs repository 295 is illustrative of a database or other storage repository for storing received test queries for eventual execution against prescribed user or systems data accessed at the jobs data repository 296.
  • Information and data responsive to the running of a received test query may be stored at the test query data repository 297 for processing and reporting by the test query processor 292.
  • Fig. 3A is a flowchart of an example method for managing a production query directed to data or systems accessed by a querying subscriber.
  • the routine 300 begins at start operation 302 and proceeds to operation 304 where a production query is received at the production queries domain 280 of a data and systems services provider from a subscriber from a data center 105a-n through the proxy service 140, as illustrated and described above with reference to Figs. 1-2C.
  • the received production query is scheduled for processing by the scheduler module 281 where the received production query is placed in a processing queue for being run by the query processor 282 against desired user or systems data, as required by the received query.
  • information identifying the received query may be placed in the jobs repository 284 and data against which the query is to be run may be accessed via the jobs data repository 285.
  • Information about the query, for example, the query’s position in a run queue, and any other information about the query, for example, identification information about the query, identification of the enterprise or subscriber from which the query is received, and the like, may be stored in the query data repository 286.
  • the query processor 282 runs the received query against the requested data.
  • At decision operation 310, a determination is made as to the processing time associated with running the query.
  • the query operation may be stopped to allow other queries in the run queue to be processed.
  • the routine may proceed to operation 314 where the query processing may be stopped. If the query processing is stopped, the query may be marked as a poison query such that the query is placed in a semi-quarantined state.
  • some data queries may require processing times greater than the threshold processing time allowed before the processing of a query is terminated, as described herein.
  • a given query requires a greater amount of time, for example, 20 hours, to fully process
  • such a query may be placed on a list of queries that may be fully processed regardless of processing times that exceed the threshold time.
  • the threshold processing time may be increased so that, for such queries requiring longer processing times, a threshold time beyond which they will not be allowed to run may be established.
  • the subscriber launching the query may be contacted.
  • the subscriber may decide to terminate the query job to effect changes or repairs to the query. If the subscriber decides to allow the query job to continue, then the subscriber may send a subsequent query of the query job or allow the query job to continue as scheduled including the processing of subsequent queries comprising the query job.
  • the results of the query may be reported to the subscriber responsible for launching the query.
  • the results may be aggregated with other results of associated queries comprising the query job, the results may be tabulated in a spreadsheet or database of query results, or the results may be placed in various formats for ultimately reporting to the subscriber responsible for launching the query.
  • if a prescribed threshold number of poison queries are experienced during a threshold period of time, for example, more than three poison queries in seven days, then the entire query job may be terminated because it may be determined that a coding error or other error in the received queries is rendering the entire query job suspect. If the threshold number of poison queries in a given period of time is exceeded, then the entire query job may be marked as poison, and the entire query job may be quarantined from the production queries domain. That is, no other data queries included in a quarantined query job may be processed at the production queries domain until the query job is modified or de-bugged in a satisfactory manner, as described below.
  • the limited processing resources of the production queries domain 280 may be utilized for other queries, and the subscriber responsible for launching the query job may have the opportunity to modify, revise or de-bug the queries comprising the query job before resubmitting the queries.
  • the subscriber responsible for launching the quarantined job is contacted.
  • a modification of the poison and quarantined query job may be received from the subscriber.
  • the query processor 282 at the production queries domain 280 analyzes the received modified query job, comprised of one or more data queries, against the data queries comprising the quarantined query job.
  • the routine may proceed back to operation 304, and a first data query comprising the modified query job may be received for processing at the production queries domain, as described above.
  • analysis of the modified query job against the quarantined query job may include a parsing of code contained in the modified data queries and comparing the code against code contained in the quarantined queries.
  • a given series of data queries comprising a query job may be failing because a simple version identification, contained in the data queries for applying the data queries against a given set of data, may be erroneous, causing an excessive amount of runtime in the processing of the data queries.
  • a modification of the data queries to correct the erroneous version number may be a simple correction that may then render the modified data queries and query job acceptable for running against specified data and/or systems, as desired by the requesting enterprise or subscriber.
  • Fig. 3B is a flowchart of an example method for managing a test query directed to data or systems accessed by a querying subscriber.
  • the routine 330 begins at start operation 332 and proceeds to operation 334 where a test query is received at the test queries domain 290 from a requesting subscriber from a data center 105a-n via the proxy service 140, as described above with reference to Figs. 1-2C.
  • a test query may be uploaded by a given enterprise for testing the operation of a given query before the test query is released as a production query for running against data and/or systems owned or subscribed to by the requesting subscriber.
  • the test query processing is scheduled by the scheduler module 291, an identification of the test query job may be stored in the jobs repository 295, and any data against which the received test query is to be run may be stored in or accessed via the jobs data repository 296.
  • Information about the test query including identification information about the querying enterprise or subscriber as well as identification information about the test data against which the test query will be run, and the like, may be stored at or via the query data repository 297.
  • the scheduler module 291 in association with the query processor 292 places the received test query in the run queue so that the test query may be run against the requested data and/or systems in an order prescribed for the test query relative to other test queries waiting in the run queue for processing.
  • the received test query is run against the data or systems prescribed for running the test query.
  • a threshold runtime for example, one hour.
  • the routine proceeds to operation 346, and the test query processing is paused such that the test query yields the processing resources of the query processor 292 to other test queries waiting in the run queue for processing.
  • the test query is moved to the wait queue 294 where it will wait in a paused mode until space becomes available in the run queue.
  • the enterprise or subscriber launching the test query may be contacted to provide notification of the paused test query.
  • the routine proceeds to operation 352, and the test query is moved from the wait queue to the run queue.
  • the test query is once again run in a designated position in the run queue. For example, when the test query is moved from the wait queue to the run queue, it may enter at the bottom of the run queue and must now wait until higher ordered test queries are processed before it can be processed.
  • moving a test query requiring an excessive amount of processing time from a run queue to a wait queue allows for other test queries to be processed more rapidly.
  • a typical test query may run in a matter of seconds or minutes.
  • moving such a test query to the wait queue may allow for a number of other test queries to be processed on the run queue while the paused test query is on the wait queue.
  • test query run may be paused on the wait queue indefinitely. If not, the routine proceeds back to operation 344, and results of a successful run of the test query may be reported to the enterprise or subscriber launching the test query. Alternatively, if the test query run is a failure, then at operation 358, the test query may be removed from the run queue and may be placed back into the wait queue at a priority level below other paused test queries.
  • test query run results in a failure
  • test query may be removed from both the run queue and the wait queue, and at operation 360, the enterprise or subscriber launching the test query may be contacted for allowing a modification and/or de-bugging of the test query, as desired.
  • the routine 330 ends at operation 365.
  • program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types.
  • the embodiments and functionalities described herein may operate via a multitude of computing systems including, without limitation, desktop computer systems, wired and wireless computing systems, mobile computing systems (e.g., mobile telephones, netbooks, tablet or slate type computers, notebook computers, and laptop computers), hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, and mainframe computers.
  • embodiments and functionalities described herein may operate over distributed systems (e.g., cloud-based computing systems), where application functionality, memory, data storage and retrieval and various processing functions may be operated remotely from each other over a distributed computing network, such as the Internet or an intranet.
  • User interfaces and information of various types may be displayed via on-board computing device displays or via remote display units associated with one or more computing devices. For example, user interfaces and information of various types may be displayed and interacted with on a wall surface onto which they are projected.
  • Interaction with the multitude of computing systems with which embodiments of the invention may be practiced includes keystroke entry, touch screen entry, voice or other audio entry, gesture entry where an associated computing device is equipped with detection (e.g., camera) functionality for capturing and interpreting user gestures for controlling the functionality of the computing device, and the like.
  • Figures 4-6 and the associated descriptions provide a discussion of a variety of operating environments in which embodiments of the invention may be practiced.
  • the devices and systems illustrated and discussed with respect to Figures 4-6 are for purposes of example and illustration and are not limiting of a vast number of computing device configurations that may be utilized for practicing embodiments of the invention, described herein.
  • FIG. 4 is a block diagram illustrating physical components (i.e., hardware) of a computing device 400 with which embodiments of the invention may be practiced.
  • the computing device components described below may be suitable for the computing devices 110, 115, 145, described above.
  • the computing device 400 may include at least one processing unit 402 and a system memory 404.
  • the system memory 404 may comprise, but is not limited to, volatile storage (e.g., random access memory), non-volatile storage (e.g., read-only memory), flash memory, or any combination of such memories.
  • the system memory 404 may include an operating system 405 and one or more program modules 406 suitable for running software applications 450.
  • the operating system 405 may be suitable for controlling the operation of the computing device 400.
  • embodiments of the invention may be practiced in conjunction with a graphics library, other operating systems, or any other application program and are not limited to any particular application or system.
  • This basic configuration is illustrated in Figure 4 by those components within a dashed line 408.
  • the computing device 400 may have additional features or functionality.
  • the computing device 400 may also include additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape.
  • additional storage is illustrated in Figure 4 by a removable storage device 409 and a non-removable storage device 410.
  • program modules 406 may perform processes including, but not limited to, one or more of the stages of the routine 300 illustrated in Figure 3A.
  • Other program modules that may be used in accordance with embodiments of the present invention may include applications such as electronic mail and contacts applications, word processing applications, spreadsheet applications, database applications, slide presentation applications, drawing or computer-aided application programs, etc.
  • embodiments of the invention may be practiced in an electrical circuit comprising discrete electronic elements, packaged or integrated electronic chips containing logic gates, a circuit utilizing a microprocessor, or on a single chip containing electronic elements or microprocessors.
  • embodiments of the invention may be practiced via a system-on-a-chip (SOC) where each or many of the components illustrated in Figure 4 may be integrated onto a single integrated circuit.
  • Such an SOC device may include one or more processing units, graphics units, communications units, system virtualization units and various application functionality, all of which are integrated (or “burned”) onto the chip substrate as a single integrated circuit.
  • the functionality, described herein, with respect to providing an activity stream across multiple workloads may be operated via application-specific logic integrated with other components of the computing device 400 on the single integrated circuit (chip).
  • Embodiments of the invention may also be practiced using other technologies capable of performing logical operations such as, for example, AND, OR, and NOT, including but not limited to mechanical, optical, fluidic, and quantum technologies.
  • embodiments of the invention may be practiced within a general purpose computer or in any other circuits or systems.
  • the computing device 400 may also have one or more input device(s) 412 such as a keyboard, a mouse, a pen, a sound input device, a touch input device, etc.
  • the output device(s) 414, such as a display, speakers, a printer, etc., may also be included.
  • the aforementioned devices are examples and others may be used.
  • the computing device 400 may include one or more communication connections 416 allowing communications with other computing devices 418. Examples of suitable communication connections 416 include, but are not limited to, RF transmitter, receiver, and/or transceiver circuitry; universal serial bus (USB), parallel, and/or serial ports.
  • Computer readable media may include computer storage media.
  • Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, or program modules.
  • the system memory 404, the removable storage device 409, and the non-removable storage device 410 are all computer storage media examples (i.e., memory storage).
  • Computer storage media may include RAM, ROM, electrically erasable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other article of manufacture which can be used to store information and which can be accessed by the computing device 400. Any such computer storage media may be part of the computing device 400. Computer storage media does not include a carrier wave or other propagated or modulated data signal.
  • Communication media may be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media.
  • modulated data signal may describe a signal that has one or more characteristics set or changed in such a manner as to encode information in the signal.
  • communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), infrared, and other wireless media.
  • FIGS. 5A and 5B illustrate a mobile computing device 500, for example, a mobile telephone, a smart phone, a tablet personal computer, a laptop computer, and the like, with which embodiments of the invention may be practiced.
  • a mobile computing device 500 for implementing the embodiments is illustrated.
  • the mobile computing device 500 is a handheld computer having both input elements and output elements.
  • the mobile computing device 500 typically includes a display 505 and one or more input buttons 510 that allow the user to enter information into the mobile computing device 500.
  • the display 505 of the mobile computing device 500 may also function as an input device (e.g., a touch screen display). If included, an optional side input element 515 allows further user input.
  • the side input element 515 may be a rotary switch, a button, or any other type of manual input element.
  • mobile computing device 500 may incorporate more or fewer input elements.
  • the display 505 may not be a touch screen in some embodiments.
  • the mobile computing device 500 is a portable phone system, such as a cellular phone.
  • the mobile computing device 500 may also include an optional keypad 535.
  • Optional keypad 535 may be a physical keypad or a “soft” keypad generated on the touch screen display.
  • the output elements include the display 505 for showing a graphical user interface (GUI), a visual indicator 520 (e.g., a light emitting diode), and/or an audio transducer 525 (e.g., a speaker).
  • the mobile computing device 500 incorporates a vibration transducer for providing the user with tactile feedback.
  • the mobile computing device 500 incorporates a peripheral device port 540, such as an audio input (e.g., a microphone jack), an audio output (e.g., a headphone jack), and a video output (e.g., an HDMI port) for sending signals to or receiving signals from an external device.
  • FIG. 5B is a block diagram illustrating the architecture of one embodiment of a mobile computing device. That is, the mobile computing device 500 can incorporate a system (i.e., an architecture) 502 to implement some embodiments.
  • the system 502 is implemented as a “smart phone” capable of running one or more applications (e.g., browser, e-mail, calendaring, contact managers, messaging clients, games, and media clients/players).
  • the system 502 is integrated as a computing device, such as an integrated personal digital assistant (PDA) and wireless phone.
  • One or more application programs 550 may be loaded into the memory 562 and run on or in association with the operating system 564. Examples of the application programs include phone dialer programs, electronic communication applications, personal information management (PIM) programs, word processing programs, spreadsheet programs, Internet browser programs, messaging programs, and so forth.
  • the system 502 also includes a non-volatile storage area 568 within the memory 562. The non-volatile storage area 568 may be used to store persistent information that should not be lost if the system 502 is powered down.
  • the application programs 550 may use and store information in the non-volatile storage area 568, such as e-mail or other messages used by an e-mail application, and the like.
  • a synchronization application (not shown) also resides on the system 502 and is programmed to interact with a corresponding synchronization application resident on a host computer to keep the information stored in the non-volatile storage area 568 synchronized with corresponding information stored at the host computer.
  • other applications may be loaded into the memory 562 and run on the mobile computing device 500.
  • the system 502 has a power supply 570, which may be implemented as one or more batteries.
  • the power supply 570 might further include an external power source, such as an AC adapter or a powered docking cradle that supplements or recharges the batteries.
  • the system 502 may also include a radio 572 that performs the function of transmitting and receiving radio frequency communications.
  • the radio 572 facilitates wireless connectivity between the system 502 and the “outside world,” via a communications carrier or service provider. Transmissions to and from the radio 572 are conducted under control of the operating system 564. In other words, communications received by the radio 572 may be disseminated to the application programs 550 via the operating system 564, and vice versa.
  • the visual indicator 520 may be used to provide visual notifications and/or an audio interface 574 may be used for producing audible notifications via the audio transducer 525.
  • the visual indicator 520 is a light emitting diode (LED) and the audio transducer 525 is a speaker.
  • the LED may be programmed to remain on indefinitely until the user takes action to indicate the powered-on status of the device.
  • the audio interface 574 is used to provide audible signals to and receive audible signals from the user.
  • the audio interface 574 may also be coupled to a microphone to receive audible input, such as to facilitate a telephone conversation.
  • the microphone may also serve as an audio sensor to facilitate control of notifications, as will be described below.
  • the system 502 may further include a video interface 576 that enables an operation of an on-board camera 530 to record still images, video stream, and the like.
  • a mobile computing device 500 implementing the system 502 may have additional features or functionality.
  • the mobile computing device 500 may also include additional data storage devices (removable and/or non-removable) such as magnetic disks, optical disks, or tape.
  • additional storage is illustrated in Figure 5B by the non-volatile storage area 568.
  • Data/information generated or captured by the mobile computing device 500 and stored via the system 502 may be stored locally on the mobile computing device 500, as described above, or the data may be stored on any number of storage media that may be accessed by the device via the radio 572 or via a wired connection between the mobile computing device 500 and a separate computing device associated with the mobile computing device 500, for example, a server computer in a distributed computing network, such as the Internet.
  • data/information may be accessed via the mobile computing device 500 via the radio 572 or via a distributed computing network.
  • data/information may be readily transferred between computing devices for storage and use according to well-known data/information transfer and storage means, including electronic mail and collaborative data/information sharing systems.
  • Figure 6 illustrates one embodiment of the architecture of a system for providing the functionality described herein across components of a distributed computing environment.
  • Content developed, interacted with, or edited in association with the applications described above may be stored in different communication channels or other storage types.
  • various documents may be stored using a directory service 622, a web portal 624, a mailbox service 626, an instant messaging store 628, or a social networking site 630.
  • the application 450 e.g., an electronic communication application
  • a server 615 may provide the functionality to clients 605A-C and 110.
  • the server 615 may be a web server providing the application functionality described herein over the web.
  • the server 615 may provide the application functionality over the web to clients 605A-C and 110 through a network 125, 610.
  • a computing device 110 may be implemented and embodied in a personal computer 605A, a tablet computing device 605B and/or a mobile computing device 605C (e.g., a smart phone), or other computing device. Any of these embodiments of the client computing device may obtain content from the store 616.
  • Embodiments of the present invention are described above with reference to block diagrams and/or operational illustrations of methods, systems, and computer program products according to embodiments of the invention.
  • the functions/acts noted in the blocks may occur out of the order as shown in any flowchart.
  • two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
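The comparison of a modified query job against its quarantined predecessor, described above as a parsing of the code in the modified data queries and a comparison against the code in the quarantined queries, might be sketched as follows. This is only an illustrative sketch: the function names `normalize` and `changed_queries`, the whitespace normalization, and the sample query text are assumptions and are not taken from the disclosure.

```python
import re


def normalize(query: str) -> str:
    """Collapse whitespace so that formatting-only edits are ignored (an assumption)."""
    return re.sub(r"\s+", " ", query).strip()


def changed_queries(quarantined, modified):
    """Return the positions of queries whose normalized text differs between versions."""
    changed = [
        i
        for i, (old, new) in enumerate(zip(quarantined, modified))
        if normalize(old) != normalize(new)
    ]
    # Queries added to or removed from the end of the job also count as changes.
    shorter, longer = sorted((len(quarantined), len(modified)))
    changed.extend(range(shorter, longer))
    return changed


# Hypothetical job in which only an erroneous version identifier was corrected.
quarantined_job = [
    "SELECT * FROM logs_v1 WHERE day = 1",
    "SELECT count(*) FROM users",
]
modified_job = [
    "SELECT * FROM logs_v2 WHERE day = 1",
    "SELECT  count(*)  FROM users",
]
changed = changed_queries(quarantined_job, modified_job)
print(changed)  # [0]
```

In this sketch an empty result would indicate that the resubmitted job is textually identical to the quarantined job; whether such a job should be refused is a policy choice that the text leaves open.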

Abstract

Monitoring and managing the running of production and test queries to prevent either type of query from excessive processing runtime is provided. If the runtime for a production query exceeds a threshold period, the query may be stopped. If a threshold number of queries comprising a query job are stopped for excessive runtime, the entire query job may be quarantined which means it will be shut down and prevented from running against the subscriber's data and systems. If the runtime of a test query exceeds a threshold period, the test query may be paused, and the test query may be moved from a run queue to a wait queue to allow other test queries in the run queue to run against their test data or systems without delay. The paused test query may be moved back to the run queue when space on the run queue becomes available.

Description

DATA QUERY JOB SUBMISSION MANAGEMENT
BACKGROUND
Enterprises, for example, companies, educational entities, government entities, and the like, often operate hundreds or thousands of computers and computing systems for their employees, students and affiliates. Often such computers and computer systems are operated at various enterprise locations or at large data centers. Many enterprises store and process data via a data storage and processing services provider operating remotely from the enterprise, where data storage, data processing and online services are provided at the remote services provider over a distributed computing network such as the Internet.
Often, an enterprise sends data queries to the services provider for running various processing jobs against the enterprise’s data and systems stored and operated at the services provider or at an associated services provider data center. The query submitted by the enterprise includes query logic created by the enterprise so that the enterprise may perform self-service queries on the enterprise’s data and subscribed-to systems at the services provider or data center. Unfortunately, the queries passed to the services provider or data center by the enterprise are often problematic for some reason, for example, code errors, version errors, and the like in or associated with the query. When such problems exist with a presented query, the query may run for an extended period of time, for example, 20 hours, without actually completing as expected by the querying enterprise or subscriber. In such a case, the limited resources of the services provider or data center may be consumed or diminished by the erroneous query which may prevent or hamper other subscribers from running needed queries.
In addition, an enterprise often runs test queries against a limited amount of data to test the operation of the queries so that they can be modified as needed for eventual use as production queries against large production datasets. If the test queries have problems, as described above for production queries, the test queries may similarly run too long and thus consume limited test query resources and prevent or hamper others from running their test queries.
There is a need for methods and systems for managing query (production and test) submission and operation. It is with respect to these and other considerations that the present invention has been made.
SUMMARY
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended as an aid in determining the scope of the claimed subject matter.
The above and other problems are solved by methods and systems for monitoring and managing the running of production and test queries to prevent either type of query from running for excessive time periods that consume limited production or test query resources. According to one aspect of the invention, a production query is received and run against a given dataset or system. If the runtime for the query exceeds a threshold period, for example, ten hours, the query is stopped and the query is marked as a poison query, which puts the query into a semi-quarantined state. The query subscriber is notified and is allowed to send a subsequent query of the same type that comprises a query job, for example, a job comprised of daily queries over the course of two weeks. If a threshold number of queries are designated as poison queries during a prescribed time period, for example, three poison queries in a seven-day period, the entire query job may be quarantined, which means it will be shut down and prevented from running against the subscriber’s data and systems.
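By way of illustration only, the production query handling summarized above might be sketched as follows. The class name `QueryJob`, the method `record_run`, and the returned status strings are hypothetical; the ten hour threshold and the limit of three poison queries in seven days are the example values given above, and treating the limit as "three or more" is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Example values from the text; a real deployment would configure these.
RUNTIME_THRESHOLD = timedelta(hours=10)
POISON_LIMIT = 3
POISON_WINDOW = timedelta(days=7)


@dataclass
class QueryJob:
    """Hypothetical record of one subscriber's recurring query job."""

    name: str
    poison_times: list = field(default_factory=list)
    quarantined: bool = False

    def record_run(self, finished_at: datetime, runtime: timedelta) -> str:
        """Classify one query run and update the quarantine state of the job."""
        if self.quarantined:
            return "rejected"  # a quarantined job may not run until it is modified
        if runtime <= RUNTIME_THRESHOLD:
            return "completed"
        # The runtime exceeded the threshold: the query is stopped and marked poison.
        self.poison_times.append(finished_at)
        window_start = finished_at - POISON_WINDOW
        recent = [t for t in self.poison_times if t > window_start]
        if len(recent) >= POISON_LIMIT:
            self.quarantined = True
            return "quarantined"
        return "poison"


job = QueryJob("daily-report")
start = datetime(2015, 3, 2)
runs = [(0, 2), (1, 11), (2, 12), (3, 1), (4, 15), (5, 1)]  # (day offset, hours)
outcomes = [
    job.record_run(start + timedelta(days=d), timedelta(hours=h)) for d, h in runs
]
print(outcomes)
# ['completed', 'poison', 'poison', 'completed', 'quarantined', 'rejected']
```

The sliding window keeps only poison events from the most recent seven days, so an occasional slow query does not by itself cause a long-running job to be quarantined.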
According to another aspect of the invention, a test query is received and is run against a given dataset or system designated for use with test queries. If the runtime of the test query exceeds a threshold period, for example, one hour, operation of the test query is paused, and the test query is moved from a run queue to a wait queue to allow other test queries in the run queue to run against their test data or systems without delay. The paused test query may be placed in a high priority position in the wait queue so that it may next be run after other test queries in the run queue are allowed to process. Thus, a subscriber that submits a test query that takes an extended period of time to process may have a proper expectation as to the runtime without preventing other test query subscribers from running their test queries in a reasonable amount of time.
The details of one or more embodiments are set forth in the accompanying drawings and description below. Other features and advantages will be apparent from a reading of the following detailed description and a review of the associated drawings. It is to be understood that the following detailed description is explanatory only and is not restrictive of the invention as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate various aspects of the present invention.
Fig. 1 is a simplified block diagram of one example of a system architecture for uploading and/or downloading data to and/or from an external data center or enterprise and a services provider or data center at which a production or test query may be run.
Fig. 2A is a simplified block diagram of one example of a data uploader module for uploading and/or downloading data to and/or from an external data center or enterprise and a services provider or data center at which a production or test query may be run.
Fig. 2B is a simplified block diagram of one example of a proxy service for ensuring that data uploaded from a source computing system to a secure computing system is processed from a trustworthy source/requester.
Fig. 2C is a simplified block diagram of one example of a system architecture for uploading queries to a production query domain or a test query domain for running production or test queries against data or systems owned or subscribed to by a querying enterprise or subscriber.
Fig. 3A is a flowchart of an example method for managing a production query directed to data or systems owned and/or subscribed to by a querying subscriber.
Fig. 3B is a flowchart of an example method for managing a test query directed to data or systems owned and/or subscribed to by a querying subscriber.
Fig. 4 is a block diagram illustrating example physical components of a computing device with which aspects of the present invention may be practiced.
Figs. 5A and 5B are simplified block diagrams of a mobile computing device with which aspects of the present invention may be practiced.
Fig. 6 is a simplified block diagram of a distributed computing system in which aspects of the present invention may be practiced.
DETAILED DESCRIPTION
The following detailed description refers to the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the following description to refer to the same or similar elements. While embodiments of the invention may be described, modifications, adaptations, and other implementations are possible. For example, substitutions, additions, or modifications may be made to the elements illustrated in the drawings, and the methods described herein may be modified by substituting, reordering, or adding stages to the disclosed methods. Accordingly, the following detailed description does not limit the invention, but instead, the proper scope of the invention is defined by the appended claims.
As briefly described above, enterprises of various types often send both production and test data queries to one or more data centers and to services providers through which they store and process data and online software systems for running various processing jobs against the enterprise’s data and systems. For example, an enterprise may run a query at the services provider that computes usage of the enterprise’s subscribed-to online software services over each 24 hour period. In response, reports may be generated at the services provider and may be passed back to the enterprise to allow it to make decisions about its online software subscriptions. For another example, an enterprise may send a query to the services provider to parse a massive amount of data covering the enterprise’s sales figures on a weekly basis for each operating quarter. Test queries, in turn, may be run against a limited data set or systems data for purposes of testing the operation of the test queries so that they may be modified as needed for eventual use as production queries.
For either production or test queries, queries submitted by an enterprise include query logic created by the enterprise so that the enterprise may perform self-service queries on the enterprise’s data and subscribed-to systems at the services provider or data center. Unfortunately, the queries passed to the services provider or data center by the enterprise are often problematic for some reason. For example, the query logic may have a code error, or a version number for a data center system called by the query may have changed or may have become corrupted, or a data read request authentication may have failed, or the like. That is, any number of problems may be present in a given query passed to the services provider or data center for running against data or systems. When such problems exist with a presented query, the query may run for an extended period of time, for example, 20 hours, without actually completing as expected by the querying enterprise or subscriber. In such a case, the limited resources of the services provider or data center may be consumed, and the queries of other parties may be unreasonably delayed.
Such processing problems are particularly problematic when queries are run in self-service operation where the querying party may run the queries without assistance from the services provider. That is, if a self-service query operation is utilized and a given query runs for an excessive time period, the resources of the services provider may be tied up by the poorly operating query without the knowledge of the services provider.
As briefly described above, aspects of the present invention are directed to managing production and test queries uploaded to a data center and/or services provider for running production or test queries against data and/or systems owned or subscribed to by an enterprise, individual computer system user, or other subscriber to data and/or systems services of the services provider. Fig. 1 is a simplified block diagram of one example of a system architecture for uploading data, including production and test queries, from a source location to a destination location. According to aspects of the present invention, the system architecture 100 is comprised of various example computing components for uploading production and test queries data from a variety of source computing systems (or individual computers) to a variety of destination locations such as data centers and services providers.
At the bottom of Fig. 1, a data center 105 is illustrative of a data center operated by an enterprise or subscriber of services (hereafter “subscribers”) that may need to upload data of various types, including production and test queries, to a data center or services provider (hereafter “services provider”) at which uploaded data and queries may be stored and/or processed. The data center 105 may house hundreds, thousands or more individual computers/computing systems 110 on which may be stored data of a variety of data types that may be processed using a variety of different computing processes, for example, a variety of software applications. For example, each of the computing devices 110 may include computers of various types, for example, server computers, for storing user data in databases, electronic mail systems, document management systems, and the like, and the computer/computing systems 110 may be used for running a variety of computing system software applications, for example, database applications, electronic mail systems applications, web services applications, online software provision applications, productivity applications, data management system applications, telecommunications applications, and the like.
As should be appreciated, the data center 105 is also illustrative of one of many data centers that may be co-located, or that may be located at different locations and that may be associated with each other via various transmission systems for passing data between disparate data centers. In addition, while the data center 105 is illustrated as a data center in which numerous computer systems 110 may be located for provision of data and services, as described above, the data center 105 is equally illustrative of an entity such as a company, educational facility, government facility or a single computing device, for example, a desktop, laptop, tablet, handheld, or other computing device operated by an individual user from which user data and/or computer system production and test queries may be uploaded to a services provider.
Referring still to the data center 105, each computing device 110 is associated with an uploader module 115 that is operative for uploading user and/or system data and production or test queries from each associated computer/computing system 110. The uploader module 115 is described in further detail below with respect to Fig. 2A. According to one aspect of the invention, an uploader module 115 may be installed on each associated computer/computing system 110 or may be accessed by each computer/computing system 110.
Referring still to Fig. 1, an edge router 120 is illustrative of a typical router device for passing queries from a given uploader module to systems external to the data center 105. As should be appreciated, the edge router 120 may be responsible for ensuring that data passed from a given data center 105 is properly passed to a desired destination system component, for example, that packetized data passing from the uploader module is properly routed to a correct destination component of the system 100.
The distributed computing network 125 (illustrated in Fig. 1 as a dotted line) is illustrative of any network such as the Internet or an intranet through which data may be passed from the data center to components external to the data center such as destination storage repositories 145a-c of the secure data management center/repository, described below.
The edge router 135 is illustrative of a receiving edge router through which queries may be passed to a proxy service 140 responsible for ensuring received queries are properly authenticated prior to allowing received data to be passed to one or more destination storage repositories 145a-c at the services provider 107. Operation of the proxy service 140 is described in further detail below with reference to Fig. 2B.
The storage repositories 145a-c are illustrative of any data storage repository that may be authorized to receive data or queries uploaded via the uploader modules 115. For example, the destination storage repositories 145a-c may be associated with a secure data management center/repository of a services provider for receiving, storing and analyzing data (in response to one or more production or test queries) associated with computing systems and software services provided for subscribers of the services provider.
For example, the data repository 145a may serve as a primary secure data receiving repository for a services provider. Access points 152, 154 and 156 represent access points at the data repository 145a through which data and queries may be passed from the proxy service 140 for uploading data to one or more specific data locations 160, or for passing data or queries through one or more specific data access points 158, 162 for passing the data to other data repositories 145b, 145c.
The data repository 145b may be designated for receiving and analyzing user data and systems data, as well as various queries associated with one or more services or data types. For example, the data repository 145b is illustrative of a cloud services system operated at the secure data management center/repository 144 of a given services provider. A scheduler module 166 is illustrative of a software module or device operative for scheduling data uploads and downloads to and from the data repository 145b. A pumper module 168 is illustrative of a software module or device operative for distributing data to and from components of the data repository 145b. An analytics module 170 is illustrative of a software module or device operative for outputting and/or displaying or otherwise presenting data from the storage repository 145b.
The destination storage repository 145c is illustrative of another component of the services provider 107. For example, the destination storage repository 145c may be in the form of a database system operated at the services provider 107. A scheduler module 166 is illustrative of a software module or device operative for scheduling data uploads and downloads to and from the data repository 145c. A pumper module 168 is illustrative of a software module or device operative for distributing data to and from components of the data repository 145c. An analytics module 170 is illustrative of a software module or device operative for outputting and/or displaying or otherwise presenting data from the storage repository 145c.
As should be appreciated, the descriptions of the components of the services provider and the individual components 145a, 145b, 145c are for purposes of example and illustration only and are not limiting of various other components or systems that may be operated as part of the secure data management center/repository to which data may be uploaded or from which data may be downloaded from/to an external (and potentially unsecure) data generator/user. For example, components of the secure data management center/repository 107 may provide for online software and data management provision, for example, provision of word processing services, slide presentation application services, database application services, spreadsheet application services, telecommunications application services, and the like provided to various users via one or more online software application services and data management systems. A description of a query receiving and processing system that may operate at the services provider in one of its components is provided below with reference to Fig. 2C. In addition, as should be understood, the components of the system 100 are equally operative for passing data, including responses to and/or notifications associated with queries, from the services provider 107 back to the data center 105.
As described above with reference to Fig. 1, data or queries (whether production or test) uploaded to a data center or services provider may be uploaded via an uploader module for ensuring that uploaded data and/or queries are properly passed from an originating computing system to an appropriate storage or processing repository at a data center or services provider for processing, as described herein. Referring now to Fig. 2A, operation of the data uploader 115 and data downloader 115 is illustrated and described. As briefly described above, the data uploader and data downloader are software applications or software modules containing sufficient computer executable instructions for reading, transforming (if required) and exporting data of a variety of data types from the external data generator/user on the unsecure side to the secure data management center/repository on the secure side. The data uploaders and downloaders are also operative to pass data from the secure side back to the unsecure side. As should be appreciated, the data uploader and downloader may be identical modules and are only designated as uploader versus downloader based on the direction of the data movement.
The data uploader or downloader (hereafter referred to as data loader) 115 includes an operation module 205 for receiving data upload instructions and for directing the processing of components of the data loader module 115. A configuration file reader 210 is a module with which the data loader 115 reads a configuration file 215 for data uploading instructions, as described below. A data reader module 225 is operative to read data of a variety of data types via a data reader plug-in module 227. A data transformation module 230 is a module operative for transforming data or queries in response to data transformation information read from the configuration file 215 via a data transformation plug-in 232.
A data export module 235 is operative to export data or queries from memory to a designated destination storage repository 145a-c as designated by instructions received from the configuration file 215 via the data export plug-in 237. According to aspects of the present invention, a specific data export plug-in 237 may be used for directing a production query to a production queries domain or for directing a test query to a test queries domain, as described below with reference to Fig. 2C.
Various data reader, data transformation and data export plug-in modules 227, 232, 237 may be provided to the data loaders 115 or may be accessed by the data loader modules 115 as required for different types of data reading, transformation and export. For example, a services provider which needs to receive transformed data from various computing devices operated at a data center 105 may provide data reader plug-ins, data transformation plug-ins, and data export plug-ins for use by data loader modules 115 for reading, transforming and exporting data according to their individual needs.
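By way of a non-limiting illustration, the read, transform and export stages of the data loader 115 may be sketched as three interchangeable plug-in functions. The sketch below is an assumption made for explanation only: the disclosure names the roles of the plug-ins 227, 232 and 237 but not a programming interface, so every class, function and domain label shown here is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical plug-in signatures; the disclosure does not define an API.
Reader = Callable[[str], List[dict]]
Transformer = Callable[[List[dict]], List[dict]]
Exporter = Callable[[List[dict]], str]


@dataclass
class DataLoader:
    """Chains one reader, one transformer and one exporter plug-in."""
    reader: Reader
    transformer: Transformer
    exporter: Exporter

    def run(self, source: str) -> str:
        records = self.reader(source)        # read stage
        records = self.transformer(records)  # transform stage (may be a no-op)
        return self.exporter(records)        # export stage, returns a receipt


def read_lines(source: str) -> List[dict]:
    """Reader plug-in: one record per non-empty line of query text."""
    return [{"query": line.strip()} for line in source.splitlines() if line.strip()]


def tag_kind(kind: str) -> Transformer:
    """Transformer plug-in factory: labels each record as production or test."""
    def transform(records: List[dict]) -> List[dict]:
        return [dict(record, kind=kind) for record in records]
    return transform


def export_to(domain: str) -> Exporter:
    """Exporter plug-in factory: collects records for a named destination."""
    sink: List[dict] = []

    def export(records: List[dict]) -> str:
        sink.extend(records)
        return f"{domain}:{len(records)}"
    return export


# Selecting the export plug-in decides which domain receives the query.
EXPORT_PLUGINS: Dict[str, Exporter] = {
    "production": export_to("production_queries_domain"),
    "test": export_to("test_queries_domain"),
}

loader = DataLoader(read_lines, tag_kind("test"), EXPORT_PLUGINS["test"])
result = loader.run("SELECT 1\nSELECT 2\n")
```

In this sketch, swapping `EXPORT_PLUGINS["test"]` for `EXPORT_PLUGINS["production"]` is the only change needed to route the same query text to the other domain, which mirrors the role the disclosure assigns to the export plug-in.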
The configuration file 215 is illustrative of a file that may be accessed by the data loader module 115 for receiving data and query uploading instructions. Data uploading instructions contained in the configuration file 215 may provide information including the data types associated with a query to be uploaded, data reading instructions, as well as, security information for allowing the loader module to access desired data. In addition, the configuration file may provide instructions on how desired data is to be transformed, if required, and instructions on where uploaded data is to be stored and in what file type exported data is to be stored. As described below, the configuration file may also provide the data loader with a specified export plug-in for causing the data loader to pass production and test queries to appropriate components of the services provider 107.
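For purposes of illustration only, a configuration file of the kind described above may be sketched as a small JSON document that is read and checked before an upload begins. The disclosure does not specify a file format or field names, so the JSON encoding, every key, and every value below are assumptions chosen for the example.

```python
import json

# Hypothetical keys; the disclosure lists the kinds of information the
# configuration file carries but not how they are named or encoded.
REQUIRED_KEYS = {"data_types", "reader_plugin", "export_plugin", "destination"}

EXAMPLE_CONFIG = """
{
  "data_types": ["query"],
  "reader_plugin": "text_reader",
  "transform_plugin": null,
  "export_plugin": "test_queries_export",
  "destination": "repository-145a",
  "output_file_type": "json"
}
"""


def load_config(text):
    """Parse the configuration text and reject it if required keys are absent."""
    config = json.loads(text)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"configuration is missing keys: {sorted(missing)}")
    return config


config = load_config(EXAMPLE_CONFIG)
```

A null `transform_plugin` in this sketch stands for the case in which no transformation is required before export.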
As briefly described above with reference to Fig. 1, data or data queries (whether production or test) uploaded from a data center, enterprise or individual computing system may be required to pass through a proxy service for ensuring that the uploaded data and/or queries are originating from a trustworthy source. Referring now to Fig. 2B, the proxy service 140 is a system or software module operative to authenticate requests for uploading data and/or queries to a services provider and/or for authenticating data download/read requests (including responses to or notifications associated with queries) from a services provider.
The proxy service 140 includes a data transmission module 250 which is a software module and/or system component operative to receive data transmissions from a loader module 115 for passing uploaded data and queries from a computing device 110. The authentication module 255 is a device or software module operative to authenticate the source of a data upload/download/read request to ensure that the source is trustworthy for either uploading data to a secure repository or for downloading or reading data from a secure repository. The memory 260 is illustrative of a memory location housed either in the proxy service 140 or accessible by the proxy service 140 in which may be stored information required for authenticating upload/download/read requests. According to aspects of the invention, the Internet protocol (IP) address list 265 is illustrative of a list of IP addresses that may be used for comparing against an IP address associated with a data upload/download/read requester. The certificate list 270 is illustrative of a list of authentication certificates that may be used to compare with an authentication certificate associated with a data upload/download/read requester. A transmission approved list 275 is illustrative of a list of approved sources from which upload/download/read requests previously have been authenticated and approved.
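As one hedged illustration, the comparison of a requester against the IP address list 265, the certificate list 270 and the transmission approved list 275 may be sketched as follows. The disclosure does not state how the three lists are combined; the sketch assumes that a requester must match both the IP address list and the certificate list, and that a previously approved source is accepted without a second comparison. The class names, the string stand-in for a certificate, and the sample addresses are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class Request:
    source_id: str
    ip_address: str
    certificate: str  # stand-in for a certificate fingerprint


@dataclass
class ProxyAuthenticator:
    ip_list: Set[str]                       # plays the role of list 265
    certificate_list: Set[str]              # plays the role of list 270
    approved_sources: Set[str] = field(default_factory=set)  # list 275

    def authenticate(self, request: Request) -> bool:
        # Assumption: a source approved earlier is accepted directly.
        if request.source_id in self.approved_sources:
            return True
        # Assumption: both the IP address and the certificate must match.
        trusted = (
            request.ip_address in self.ip_list
            and request.certificate in self.certificate_list
        )
        if trusted:
            self.approved_sources.add(request.source_id)
        return trusted


authenticator = ProxyAuthenticator(
    ip_list={"192.0.2.10"},
    certificate_list={"cert-a"},
)
accepted = authenticator.authenticate(Request("dc-105a", "192.0.2.10", "cert-a"))
rejected = authenticator.authenticate(Request("dc-unknown", "203.0.113.5", "cert-a"))
```

A deployment that should re-verify every request would omit the early return on `approved_sources` and use that list only as a record of past approvals.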
As described above, according to aspects of the present invention, a given enterprise or subscriber of production or test query services often desires to run production queries and test queries against data and systems owned or subscribed to by the enterprise or subscriber. Hereafter, an enterprise or subscriber will be referred to as “subscriber” to mean any party sending a production or test query for running against data or systems, as described herein. As illustrated in Fig. 2C, a number of data centers 105a-n are provided as described above with reference to Fig. 1. As illustrated in Fig. 2C, each of the data centers 105a-n may upload data and data queries through the proxy service 140 to storage repositories or processing components/systems of a services provider or data center as described above with reference to Fig. 2B. As described above, according to aspects of the present invention, the proxy service 140 may be operative to pass data or data queries directly to specified storage repositories or components of a receiving data center or services provider based on data export plug-ins utilized by an uploader module responsible for passing the data or data queries through the proxy service 140.
Referring still to Fig. 2C, according to aspects of the present invention a production queries domain 280 and a test queries domain 290 may be operated at a services provider to which production and test queries may be passed for running production or test queries against data or systems owned and/or subscribed-to by a querying subscriber. For example, an enterprise operating at the data center 105a may pass a production data query through the proxy service 140 to the production queries domain 280 for running a production query against data or data systems owned and/or subscribed-to by the enterprise operating at the data center 105a. Similarly, another enterprise or data/system subscriber may pass a test query from a data center 105a-n through the proxy service 140 to the test queries domain 290 for running a test query against a limited set of data or systems for testing the operation of the test query so that the test query may be modified, revised or edited as needed for eventual use as a production query against large datasets and complex systems.
For example, a production query passed by a given enterprise to the production queries domain 280 may cause the computation of employee login frequency to enterprise computing systems for 50,000 employees operating an equal number of computing systems for the enterprise. The query may require the running of such a computation daily for all employees over the period of a month so that a report may be generated in response to the query that may be passed back to the enterprise for allowing enterprise personnel to make decisions regarding the proper utilization of their employees and associated computing systems. In the case of a test query, before putting such a production query into use, an enterprise may wish to generate a test query for testing the operability of the query against a limited amount of data and/or systems so that the test query may be modified and/or de-bugged for eventual use as a production query.
According to aspects of the present invention, the production queries domain 280 is illustrative of a collection of software modules and computing systems, as well as databases and/or data access points for allowing production queries to be received and processed against data and/or systems owned and/or subscribed-to by a querying subscriber. As illustrated in Fig. 2C, the production queries domain is housed in the storage repository 145a of the services provider 107, illustrated and described above with reference to Fig. 1. As should be appreciated, however, the production queries domain 280 may be located at and operated at any other component of the services provider, for example, the components 145b and 145c, as illustrated and described above with reference to Fig. 1.
According to aspects of the invention, the production queries domain 280 includes a scheduler module 281 operative to receive queries from a querying subscriber and for scheduling performance of the received query against desired data or systems. For example, a run queue may be established for running data queries against datasets and/or systems of various subscribers. The scheduler module 281 is operative for scheduling the running of a received data query in a queue of other queries to be run against various datasets and/or systems in accordance with the limited query resources of the production queries domain 280.
The query processor 282 is illustrative of a software module or device operative to receive and execute a production query as requested by a querying subscriber. The jobs repository 284 is illustrative of a database or other storage repository for storing received data queries for eventual execution against prescribed user or systems data accessed at the jobs data repository 285. Information and data responsive to the running of a received data query may be stored at the query data repository 286 for processing and reporting by the query processor 282. A quarantine information module 283 is illustrative of a software module or device operative to generate and store information about a quarantined production query run or production query job, as described herein.
The test queries domain 290 is illustrative of a collection of software modules, devices and data operative for processing and reporting on test queries run against limited sets of data and systems for allowing a querying subscriber to test a query for eventual use as a production query. The test queries domain 290 includes a scheduler module 291 operative for scheduling the performance of a received test query against test data or systems. The query processor 292 is operative to process a scheduled test query by placing the test query in a run queue 293 comprised of test queries scheduled for running against prescribed data or systems or by placing the test query in a wait queue 294 comprised of a list of test queries that are paused waiting for an opening on the run queue 293. The jobs repository 295 is illustrative of a database or other storage repository for storing received test queries for eventual execution against prescribed user or systems data accessed at the jobs data repository 296. Information and data responsive to the running of a received test query may be stored at the test query data repository 297 for processing and reporting by the test query processor 292.
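By way of example and not limitation, the relationship between the run queue 293 and the wait queue 294 may be sketched with two double-ended queues. The one-hour threshold and the high priority position in the wait queue follow the example given in the summary; the class name, method names and the choice of the front of the wait queue as the "high priority position" are assumptions made for this sketch.

```python
from collections import deque

TEST_THRESHOLD_HOURS = 1.0  # example threshold taken from the summary


class QueryQueues:
    """Hypothetical holder for a run queue and a wait queue of test queries."""

    def __init__(self):
        self.run_queue = deque()
        self.wait_queue = deque()

    def submit(self, query_id):
        """Place a newly scheduled test query at the back of the run queue."""
        self.run_queue.append(query_id)

    def pause_if_over_threshold(self, query_id, elapsed_hours):
        """Move a long-running test query to the front of the wait queue."""
        if elapsed_hours <= TEST_THRESHOLD_HOURS or query_id not in self.run_queue:
            return False
        self.run_queue.remove(query_id)
        # Assumption: the front of the wait queue is the high priority position.
        self.wait_queue.appendleft(query_id)
        return True

    def resume_next(self):
        """When the run queue has an opening, resume the highest priority waiter."""
        if not self.wait_queue:
            return None
        query_id = self.wait_queue.popleft()
        self.run_queue.append(query_id)
        return query_id


queues = QueryQueues()
queues.submit("test-a")
queues.submit("test-b")
paused = queues.pause_if_over_threshold("test-a", elapsed_hours=1.5)
resumed = queues.resume_next()
```

Because the paused query rejoins at the back of the run queue, queries that were already waiting to run are processed first, which is the ordering the summary describes.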
Fig. 3A is a flowchart of an example method for managing a production query directed to data or systems accessed by a querying subscriber. The routine 300 begins at start operation 302 and proceeds to operation 304 where a production query is received at the production queries domain 280 of a data and systems services provider from a subscriber from a data center 105a-n through the proxy service 140, as illustrated and described above with reference to Figs. 1-2C. At operation 306, the received production query is scheduled for processing by the scheduler module 281 where the received production query is placed in a processing queue for being run by the query processor 282 against desired user or systems data, as required by the received query.
As described above with reference to Fig. 2C, information identifying the received query may be placed in the jobs repository 284 and data against which the query is to be run may be accessed via the jobs data repository 285. Information about the query, for example, the query’s position in a run queue, and any other information about the query, for example, identification information about the query, identification of the enterprise or subscriber from which the query is received, and the like, may be stored in the query data repository 286.
At operation 308, the query processor 282 runs the received query against the requested data. At decision operation 310, a determination is made as to the processing time associated with running the query. According to an aspect of the present invention, if the query runtime exceeds a threshold time period, the query operation may be stopped to allow other queries in the run queue to be processed. According to one aspect of the invention, if the runtime of a given production query exceeds the threshold, for example ten hours, the routine may proceed to operation 314 where the query processing may be stopped. If the query processing is stopped, the query may be marked as a poison query such that the query is placed in a semi-quarantined state.
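As a minimal sketch of decision operation 310 and operation 314, the runtime check may be expressed as a function that is given the elapsed runtime of a query and whether it has finished. The ten-hour value is the example threshold from the description; the state names, the dataclass and the function are hypothetical and stand in for whatever monitoring the query processor 282 actually performs.

```python
from dataclasses import dataclass
from enum import Enum


class QueryState(Enum):
    RUNNING = "running"
    COMPLETED = "completed"
    POISON = "poison"  # stopped and placed in a semi-quarantined state


PRODUCTION_THRESHOLD_HOURS = 10.0  # example threshold from the description


@dataclass
class ProductionQuery:
    query_id: str
    job_id: str
    state: QueryState = QueryState.RUNNING


def check_runtime(query, elapsed_hours, finished,
                  threshold_hours=PRODUCTION_THRESHOLD_HOURS):
    """Apply the runtime decision to a query that is still marked as running."""
    if query.state is not QueryState.RUNNING:
        return query.state
    if finished:
        query.state = QueryState.COMPLETED   # results may be reported
    elif elapsed_hours > threshold_hours:
        query.state = QueryState.POISON      # stop and mark as poison
    return query.state


slow = ProductionQuery("q-1", "job-logins")
fast = ProductionQuery("q-2", "job-logins")
slow_state = check_runtime(slow, elapsed_hours=10.5, finished=False)
fast_state = check_runtime(fast, elapsed_hours=4.0, finished=True)
```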
As should be appreciated, some data queries may require processing times greater than the threshold processing time allowed before the processing of a query is terminated, as described herein. In such a case, where a given query requires a greater amount of time, for example, 20 hours, to fully process, such a query may be placed on a list of queries that may be fully processed regardless of processing times that exceed the threshold time. Alternatively, for such queries, the threshold processing time may be increased so that, for such queries requiring longer processing times, a threshold time beyond which they will not be allowed to run may be established.
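The two alternatives just described, a list of queries that may run regardless of the threshold and an increased threshold for particular queries, may be sketched as a lookup that yields the effective limit for a given query. The query identifiers and the 24-hour extended value are invented for the example and do not appear in the disclosure.

```python
DEFAULT_THRESHOLD_HOURS = 10.0

# Hypothetical identifiers, used only to illustrate the two alternatives.
EXEMPT_QUERIES = {"monthly-rollup"}              # may run past any threshold
EXTENDED_THRESHOLDS = {"quarterly-sales": 24.0}  # longer, but still bounded


def effective_threshold(query_id):
    """Return the runtime limit in hours, or None when no limit applies."""
    if query_id in EXEMPT_QUERIES:
        return None
    return EXTENDED_THRESHOLDS.get(query_id, DEFAULT_THRESHOLD_HOURS)


def should_stop(query_id, elapsed_hours):
    """True when the query has run longer than its effective limit."""
    limit = effective_threshold(query_id)
    return limit is not None and elapsed_hours > limit


exempt_stopped = should_stop("monthly-rollup", 20.0)
extended_stopped = should_stop("quarterly-sales", 20.0)
default_stopped = should_stop("daily-logins", 20.0)
```

Under these assumed values, a 20-hour run is allowed for the exempt query and for the query with the extended threshold, but not for a query subject to the default threshold.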
At operation 316, for a query that has been stopped for exceeding the threshold runtime and that has been marked as a poison and semi-quarantined query, the subscriber launching the query may be contacted. As should be appreciated, when the subscriber is contacted, the subscriber may decide to terminate the query job to effect changes or repairs to the query. If the subscriber decides to allow the query job to continue, then the subscriber may send a subsequent query of the query job or allow the query job to continue as scheduled including the processing of subsequent queries comprising the query job.
Referring back to decision operation 310, if the query processing is completed in less than the threshold time, for example, ten hours, then at operation 312, the results of the query may be reported to the subscriber responsible for launching the query. As should be appreciated, prior to reporting the results of the run query, the results may be aggregated with other results of associated queries comprising the query job, the results may be tabulated in a spreadsheet or database of query results, or the results may be placed in various formats for ultimately reporting to the subscriber responsible for launching the query.
Referring still to Fig. 3A, at decision operation 318, a determination is made as to whether the number of semi-quarantined or poison queries during a threshold period of time has been exceeded. For example, according to one example operation, if more than three queries out of seven sequential queries contained in a query job are marked as poison queries, the entire query job may be terminated, as described below. For example, a given query job may require that a query be passed to the production queries domain 280 for running against a set of data, for example, employee login data, each day for a period of one month. Thus, at decision operation 318, a determination may be made as to whether three poison and quarantined queries are experienced for the example query job in seven days of daily query operation. If the threshold number of queries required for terminating and quarantining a query job has not been reached, then the routine proceeds back to operation 304 and the next production query in a series of production queries may be received.
At operation 320, if a prescribed threshold number of poison queries is experienced during a threshold period of time, for example, more than three poison queries in seven days, then the entire query job may be terminated because it may be determined that a coding error or other error in the received queries is rendering the entire query job suspect. If the threshold number of queries in a given period of time is exceeded, then the entire query job may be marked as poison, and the entire query job may be quarantined from the production queries domain. That is, no further data queries included in a quarantined query job may be processed at the production queries domain until the query job is modified or de-bugged in a satisfactory manner, as described below. By terminating the entire query job, the limited processing resources of the production queries domain 280 may be utilized for other queries, and the subscriber responsible for launching the query job may have the opportunity to modify, revise or de-bug the queries comprising the query job before resubmitting the queries.
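The threshold test of decision operation 318 and the job-level quarantine of operation 320 can be pictured as a sliding window over the most recent query outcomes of one query job. The following is a minimal sketch only, not the described implementation: the class name `PoisonTracker`, its parameters, and the choice to count the last seven outcomes (rather than seven calendar days) are assumptions made for illustration, using the "more than three out of seven" example from the text.

```python
from collections import deque


class PoisonTracker:
    """Hypothetical tracker of poison outcomes for a single query job."""

    def __init__(self, window=7, max_poison=3):
        self.max_poison = max_poison
        # Only the last `window` outcomes are kept; older ones fall off.
        self.outcomes = deque(maxlen=window)
        self.quarantined = False

    def record(self, poisoned):
        """Record one query outcome; return True once the job is quarantined."""
        if self.quarantined:
            return True  # a quarantined job stays quarantined until modified
        self.outcomes.append(bool(poisoned))
        # "More than three" poison queries in the window quarantines the job.
        if sum(self.outcomes) > self.max_poison:
            self.quarantined = True
        return self.quarantined


# Six daily runs of one job: the fourth poison query trips the threshold.
tracker = PoisonTracker()
results = [tracker.record(p) for p in [True, False, True, True, False, True]]
```

In this sketch the job is quarantined only on the sixth run, when a fourth poison query falls inside the seven-run window; three poison queries alone do not trigger termination.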
At operation 322, in response to marking an entire query job as poison and quarantined, the subscriber responsible for launching the quarantined job is contacted. At operation 324, a modification of the poison and quarantined query job may be received from the subscriber. At operation 326, the query processor 282 at the production queries domain 280 analyzes the received modified query job, comprised of one or more data queries, against the data queries comprising the quarantined query job. At operation 328, if sufficient modification to the quarantined query job and associated data queries is received, then the routine may proceed back to operation 304, and a first data query comprising the modified query job may be received for processing at the production queries domain, as described above.
As should be appreciated, analysis of the modified query job against the quarantined query job may include parsing code contained in the modified data queries and comparing that code against code contained in the quarantined queries. For example, a given series of data queries comprising a query job may be failing because a simple version identification contained in the data queries for applying the data queries against a given set of data may be erroneous, causing an excessive amount of runtime in the processing of the data queries. In such a case, a modification of the data queries to correct the erroneous version number may be a simple correction that may then render the modified data queries and query job acceptable for running against specified data and/or systems, as desired by the requesting enterprise or subscriber.
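As a rough illustration of comparing a modified data query against its quarantined counterpart, the sketch below lists the lines that differ between the two query texts using Python's standard `difflib` module. The text does not specify how the comparison is performed or what counts as a sufficient modification; the function name `changed_lines`, the whitespace normalization, and the sample query text (including the `logins_v1` and `logins_v2` names) are all hypothetical.

```python
import difflib


def changed_lines(quarantined, modified):
    """Return added/removed lines between two query texts.

    Whitespace is collapsed first so that pure re-indentation is not
    reported as a change. (This would also alter whitespace inside
    string literals, which a real parser would need to preserve.)
    """
    old = [" ".join(line.split()) for line in quarantined.splitlines() if line.strip()]
    new = [" ".join(line.split()) for line in modified.splitlines() if line.strip()]
    # ndiff prefixes removed lines with "- " and added lines with "+ ".
    return [d for d in difflib.ndiff(old, new) if d.startswith(("- ", "+ "))]


# Hypothetical example: only the version identifier in the data set name changed.
old_query = "SELECT user, ts\nFROM logins_v1\nWHERE ts > '2015-01-01'"
new_query = "SELECT user, ts\nFROM   logins_v2\nWHERE ts > '2015-01-01'"
changes = changed_lines(old_query, new_query)
```

An empty result would indicate that the resubmitted query is textually unchanged, which a query processor might treat as an insufficient modification.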
Fig. 3B is a flowchart of an example method for managing a test query directed to data or systems accessed by a querying subscriber. The routine 330 begins at start operation 332 and proceeds to operation 334 where a test query is received at the test queries domain 290 from a requesting subscriber from a data center 105a-n via the proxy service 140, as described above with reference to Figs. 1-2C. As should be appreciated, a test query may be uploaded by a given enterprise for testing the operation of a given query before the test query is released as a production query for running against data and/or systems owned or subscribed to by the requesting subscriber.
At operation 336, the test query processing is scheduled by the scheduler module 291, an identification of the test query job may be stored in the jobs repository 295, and any data against which the received test query is to be run may be stored in or accessed via the jobs data repository 296. Information about the test query, including identification information about the querying enterprise or subscriber as well as identification information about the test data against which the test query will be run, and the like, may be stored at or via the query data repository 297.
At operation 338, the scheduler module 291 in association with the query processor 292 places the received test query in the run queue so that the test query may be run against the requested data and/or systems in an order prescribed for the test query relative to other test queries waiting in the run queue for processing. At operation 340, the received test query is run against the data or systems prescribed for running the test query.
At decision operation 342, a determination is made as to whether the processing of the test query exceeds a threshold runtime, for example, one hour. If the running of the test query does not exceed the threshold runtime, the routine proceeds to operation 344, and the results of the running of the test query may be reported to the enterprise or subscriber launching the test query in a similar manner as described above for the production query. At operation 348, the enterprise or subscriber is contacted for reporting the results of the processed test query.
Referring back to decision operation 342, if the running of the test query exceeds the threshold time period, for example, one hour, the routine proceeds to operation 346, and the test query processing is paused such that the test query yields the processing resources of the query processor 292 to other test queries waiting in the run queue for processing. When the test query is paused, the test query is moved to the wait queue 294 where it will wait in a paused mode until space becomes available in the run queue. At operation 348, the enterprise or subscriber launching the test query may be contacted to provide notification of the paused test query.
At decision operation 350, a determination is made as to whether processing space is now available in the run queue. That is, according to one aspect of the invention, when a paused test query is moved from the run queue to the wait queue, the paused test query may be given a priority position at the top of the wait queue so that it is immediately moved back to the run queue when space in the run queue becomes available owing to the processing of one or more test queries from the run queue. If space is not available in the run queue, then the routine proceeds back to operation 346, and the paused test query is maintained on the wait queue until space does become available on the run queue.
Alternatively, if space becomes available in the run queue, the routine proceeds to operation 352, and the test query is moved from the wait queue to the run queue. At operation 354, the test query is once again run in a designated position in the run queue. For example, when the test query is moved from the wait queue to the run queue, it may enter at the bottom of the run queue and must now wait until higher ordered test queries are processed before it can be processed.
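The movement of a paused test query between the run queue and the wait queue (operations 346 through 354) can be sketched with two double-ended queues. This is an illustrative sketch under stated assumptions: the class `QueryQueues` and its method names are hypothetical, and space in the run queue is assumed to become available only when another test query finishes processing.

```python
from collections import deque


class QueryQueues:
    """Hypothetical model of the run queue and wait queue for test queries."""

    def __init__(self):
        self.run_queue = deque()   # test queries awaiting processing, in order
        self.wait_queue = deque()  # paused test queries; index 0 is the top

    def submit(self, query_id):
        """Place a newly received test query at the bottom of the run queue."""
        self.run_queue.append(query_id)

    def pause(self, query_id):
        """Move a query that exceeded the threshold to the top of the wait queue."""
        self.run_queue.remove(query_id)
        self.wait_queue.appendleft(query_id)

    def complete(self, query_id):
        """Finish a query; the freed space goes to the top paused query, if any."""
        self.run_queue.remove(query_id)
        if self.wait_queue:
            # The resumed query enters at the bottom of the run queue.
            self.run_queue.append(self.wait_queue.popleft())


queues = QueryQueues()
for name in ["slow", "fast1", "fast2"]:
    queues.submit(name)
queues.pause("slow")      # run: fast1, fast2   wait: slow
queues.complete("fast1")  # run: fast2, slow    wait: (empty)
```

After `fast1` completes, the paused query re-enters behind `fast2`, so shorter test queries already in the run queue are processed first.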
As should be appreciated, moving a test query requiring an excessive amount of processing time from a run queue to a wait queue allows for other test queries to be processed more rapidly. For example, a typical test query may run in a matter of seconds or minutes. Thus, if a test query is processed and is requiring processing time exceeding a threshold amount of time, for example, one hour, then moving such a test query to the wait queue may allow for a number of other test queries to be processed on the run queue while the paused test query is on the wait queue. Thus, enterprises or subscribers sending test queries requiring short processing times may have expectations met where results of such test queries may be returned quickly, whereas an enterprise or subscriber submitting a test query requiring an extended runtime may also have expectations met because the submitting subscriber should know that the processing time for the test query may be lengthy.
Referring still to Figure 3B, at operation 356, after one or more attempts to rerun the test query, if the test query still will not complete in a reasonable amount of time, a determination may be made that the test query run has failed, and the test query may be paused on the wait queue indefinitely. If the test query run has not failed, the routine proceeds back to operation 344, and results of a successful run of the test query may be reported to the enterprise or subscriber launching the test query. Alternatively, if the test query run is a failure, then at operation 358, the test query may be removed from the run queue and may be placed back into the wait queue at a priority level below other paused test queries. Alternatively, if the test query run results in a failure, then the test query may be removed from both the run queue and the wait queue, and at operation 360, the enterprise or subscriber launching the test query may be contacted for allowing a modification and/or de-bugging of the test query, as desired. The routine 330 ends at operation 365.
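The rerun-then-fail determination of operation 356 can be sketched as a bounded retry loop. The text does not state how many rerun attempts are made, so the default of three attempts below is an assumption, as are the function name `run_test_query` and the convention that an attempt signals an exceeded threshold runtime by raising `TimeoutError`.

```python
def run_test_query(attempt, max_attempts=3):
    """Call attempt() up to max_attempts times.

    attempt() is a hypothetical callable that returns the query result, or
    raises TimeoutError when the run exceeds the threshold runtime.
    """
    for n in range(1, max_attempts + 1):
        try:
            return {"status": "completed", "attempts": n, "result": attempt()}
        except TimeoutError:
            continue  # the query was paused; it is rerun on the next pass
    # Every attempt exceeded the threshold: treat the test query run as failed.
    return {"status": "failed", "attempts": max_attempts, "result": None}


# Hypothetical test query that exceeds the threshold twice, then completes.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("exceeded threshold runtime")
    return [("alice", 4)]


outcome = run_test_query(flaky)
```

A "failed" status would correspond to the point at which the subscriber is contacted so that the test query can be modified or de-bugged.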
While the invention has been described in the general context of program modules that execute in conjunction with an application program that runs on an operating system on a computer, those skilled in the art will recognize that the invention may also be implemented in combination with other program modules. Generally, program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types.
The embodiments and functionalities described herein may operate via a multitude of computing systems including, without limitation, desktop computer systems, wired and wireless computing systems, mobile computing systems (e.g., mobile telephones, netbooks, tablet or slate type computers, notebook computers, and laptop computers), hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, and mainframe computers.
In addition, the embodiments and functionalities described herein may operate over distributed systems (e.g., cloud-based computing systems), where application functionality, memory, data storage and retrieval and various processing functions may be operated remotely from each other over a distributed computing network, such as the Internet or an intranet. User interfaces and information of various types may be displayed via on-board computing device displays or via remote display units associated with one or more computing devices. For example, user interfaces and information of various types may be displayed and interacted with on a wall surface onto which user interfaces and information of various types are projected. Interaction with the multitude of computing systems with which embodiments of the invention may be practiced includes keystroke entry, touch screen entry, voice or other audio entry, gesture entry where an associated computing device is equipped with detection (e.g., camera) functionality for capturing and interpreting user gestures for controlling the functionality of the computing device, and the like.
Figures 4-6 and the associated descriptions provide a discussion of a variety of operating environments in which embodiments of the invention may be practiced. However, the devices and systems illustrated and discussed with respect to Figures 4-6 are for purposes of example and illustration and are not limiting of a vast number of computing device configurations that may be utilized for practicing embodiments of the invention, described herein.
Figure 4 is a block diagram illustrating physical components (i.e., hardware) of a computing device 400 with which embodiments of the invention may be practiced. The computing device components described below may be suitable for the computing devices 110, 115, 145, described above. In a basic configuration, the computing device 400 may include at least one processing unit 402 and a system memory 404. Depending on the configuration and type of computing device, the system memory 404 may comprise, but is not limited to, volatile storage (e.g., random access memory), non-volatile storage (e.g., read-only memory), flash memory, or any combination of such memories. The system memory 404 may include an operating system 405 and one or more program modules 406 suitable for running software applications 450. The operating system 405, for example, may be suitable for controlling the operation of the computing device 400. Furthermore, embodiments of the invention may be practiced in conjunction with a graphics library, other operating systems, or any other application program and are not limited to any particular application or system. This basic configuration is illustrated in Figure 4 by those components within a dashed line 408. The computing device 400 may have additional features or functionality. For example, the computing device 400 may also include additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Such additional storage is illustrated in Figure 4 by a removable storage device 409 and a non-removable storage device 410.
As stated above, a number of program modules and data files may be stored in the system memory 404. While executing on the processing unit 402, the program modules 406 may perform processes including, but not limited to, one or more of the stages of the routine 300 illustrated in Figure 3. Other program modules that may be used in accordance with embodiments of the present invention may include applications such as electronic mail and contacts applications, word processing applications, spreadsheet applications, database applications, slide presentation applications, drawing or computer-aided application programs, etc.
Furthermore, embodiments of the invention may be practiced in an electrical circuit comprising discrete electronic elements, packaged or integrated electronic chips containing logic gates, a circuit utilizing a microprocessor, or on a single chip containing electronic elements or microprocessors. For example, embodiments of the invention may be practiced via a system-on-a-chip (SOC) where each or many of the components illustrated in Figure 4 may be integrated onto a single integrated circuit. Such an SOC device may include one or more processing units, graphics units, communications units, system virtualization units and various application functionality all of which are integrated (or “burned”) onto the chip substrate as a single integrated circuit. When operating via an SOC, the functionality, described herein, with respect to providing an activity stream across multiple workloads may be operated via application-specific logic integrated with other components of the computing device 400 on the single integrated circuit (chip). Embodiments of the invention may also be practiced using other technologies capable of performing logical operations such as, for example, AND, OR, and NOT, including but not limited to mechanical, optical, fluidic, and quantum technologies. In addition, embodiments of the invention may be practiced within a general purpose computer or in any other circuits or systems.
The computing device 400 may also have one or more input device(s) 412 such as a keyboard, a mouse, a pen, a sound input device, a touch input device, etc. Output device(s) 414 such as a display, speakers, a printer, etc. may also be included. The aforementioned devices are examples and others may be used. The computing device 400 may include one or more communication connections 416 allowing communications with other computing devices 418. Examples of suitable communication connections 416 include, but are not limited to, RF transmitter, receiver, and/or transceiver circuitry; universal serial bus (USB), parallel, and/or serial ports.
The term computer readable media as used herein may include computer storage media. Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, or program modules. The system memory 404, the removable storage device 409, and the non-removable storage device 410 are all computer storage media examples (i.e., memory storage). Computer storage media may include RAM, ROM, electrically erasable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other article of manufacture which can be used to store information and which can be accessed by the computing device 400. Any such computer storage media may be part of the computing device 400. Computer storage media does not include a carrier wave or other propagated or modulated data signal.
Communication media may be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. The term “modulated data signal” may describe a signal that has one or more characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), infrared, and other wireless media.
Figures 5A and 5B illustrate a mobile computing device 500, for example, a mobile telephone, a smart phone, a tablet personal computer, a laptop computer, and the like, with which embodiments of the invention may be practiced. With reference to Figure 5A, one embodiment of a mobile computing device 500 for implementing the embodiments is illustrated. In a basic configuration, the mobile computing device 500 is a handheld computer having both input elements and output elements. The mobile computing device 500 typically includes a display 505 and one or more input buttons 510 that allow the user to enter information into the mobile computing device 500. The display 505 of the mobile computing device 500 may also function as an input device (e.g., a touch screen display). If included, an optional side input element 515 allows further user input. The side input element 515 may be a rotary switch, a button, or any other type of manual input element. In alternative embodiments, mobile computing device 500 may incorporate more or fewer input elements. For example, the display 505 may not be a touch screen in some embodiments. In yet another alternative embodiment, the mobile computing device 500 is a portable phone system, such as a cellular phone. The mobile computing device 500 may also include an optional keypad 535. Optional keypad 535 may be a physical keypad or a “soft” keypad generated on the touch screen display. In various embodiments, the output elements include the display 505 for showing a graphical user interface (GUI), a visual indicator 520 (e.g., a light emitting diode), and/or an audio transducer 525 (e.g., a speaker). In some embodiments, the mobile computing device 500 incorporates a vibration transducer for providing the user with tactile feedback.
In yet another embodiment, the mobile computing device 500 incorporates a peripheral device port 540, such as an audio input (e.g., a microphone jack), an audio output (e.g., a headphone jack), and a video output (e.g., an HDMI port) for sending signals to or receiving signals from an external device.
Figure 5B is a block diagram illustrating the architecture of one embodiment of a mobile computing device. That is, the mobile computing device 500 can incorporate a system (i.e., an architecture) 502 to implement some embodiments. In one embodiment, the system 502 is implemented as a “smart phone” capable of running one or more applications (e.g., browser, e-mail, calendaring, contact managers, messaging clients, games, and media clients/players). In some embodiments, the system 502 is integrated as a computing device, such as an integrated personal digital assistant (PDA) and wireless phone.
One or more application programs 550 may be loaded into the memory 562 and run on or in association with the operating system 564. Examples of the application programs include phone dialer programs, electronic communication applications, personal information management (PIM) programs, word processing programs, spreadsheet programs, Internet browser programs, messaging programs, and so forth. The system 502 also includes a non-volatile storage area 568 within the memory 562. The non-volatile storage area 568 may be used to store persistent information that should not be lost if the system 502 is powered down. The application programs 550 may use and store information in the non-volatile storage area 568, such as e-mail or other messages used by an e-mail application, and the like. A synchronization application (not shown) also resides on the system 502 and is programmed to interact with a corresponding synchronization application resident on a host computer to keep the information stored in the non-volatile storage area 568 synchronized with corresponding information stored at the host computer. As should be appreciated, other applications may be loaded into the memory 562 and run on the mobile computing device 500.
The system 502 has a power supply 570, which may be implemented as one or more batteries. The power supply 570 might further include an external power source, such as an AC adapter or a powered docking cradle that supplements or recharges the batteries.
The system 502 may also include a radio 572 that performs the function of transmitting and receiving radio frequency communications. The radio 572 facilitates wireless connectivity between the system 502 and the “outside world,” via a communications carrier or service provider. Transmissions to and from the radio 572 are conducted under control of the operating system 564. In other words, communications received by the radio 572 may be disseminated to the application programs 550 via the operating system 564, and vice versa.
The visual indicator 520 may be used to provide visual notifications and/or an audio interface 574 may be used for producing audible notifications via the audio transducer 525. In the illustrated embodiment, the visual indicator 520 is a light emitting diode (LED) and the audio transducer 525 is a speaker. These devices may be directly coupled to the power supply 570 so that when activated, they remain on for a duration dictated by the notification mechanism even though the processor 560 and other components might shut down for conserving battery power. The LED may be programmed to remain on indefinitely until the user takes action to indicate the powered-on status of the device. The audio interface 574 is used to provide audible signals to and receive audible signals from the user. For example, in addition to being coupled to the audio transducer 525, the audio interface 574 may also be coupled to a microphone to receive audible input, such as to facilitate a telephone conversation. In accordance with embodiments of the present invention, the microphone may also serve as an audio sensor to facilitate control of notifications, as will be described below. The system 502 may further include a video interface 576 that enables an operation of an on-board camera 530 to record still images, video stream, and the like.
A mobile computing device 500 implementing the system 502 may have additional features or functionality. For example, the mobile computing device 500 may also include additional data storage devices (removable and/or non-removable) such as magnetic disks, optical disks, or tape. Such additional storage is illustrated in Figure 5B by the non-volatile storage area 568.
Data/information generated or captured by the mobile computing device 500 and stored via the system 502 may be stored locally on the mobile computing device 500, as described above, or the data may be stored on any number of storage media that may be accessed by the device via the radio 572 or via a wired connection between the mobile computing device 500 and a separate computing device associated with the mobile computing device 500, for example, a server computer in a distributed computing network, such as the Internet. As should be appreciated, such data/information may be accessed via the mobile computing device 500 via the radio 572 or via a distributed computing network. Similarly, such data/information may be readily transferred between computing devices for storage and use according to well-known data/information transfer and storage means, including electronic mail and collaborative data/information sharing systems.
Figure 6 illustrates one embodiment of the architecture of a system for providing the functionality described herein across components of a distributed computing environment. Content developed, interacted with, or edited in association with the applications described above may be stored in different communication channels or other storage types. For example, various documents may be stored using a directory service 622, a web portal 624, a mailbox service 626, an instant messaging store 628, or a social networking site 630. The application 450 (e.g., an electronic communication application) may use any of these types of systems or the like for providing the functionalities described herein across multiple workloads, as described herein. A server 615 may provide the functionality to clients 605A-C and 110. As one example, the server 615 may be a web server providing the application functionality described herein over the web. The server 615 may provide the application functionality over the web to clients 605A-C and 110 through a network 125, 610. By way of example, a computing device 110 may be implemented and embodied in a personal computer 605A, a tablet computing device 605B and/or a mobile computing device 605C (e.g., a smart phone), or other computing device. Any of these embodiments of the client computing device may obtain content from the store 616.
Embodiments of the present invention, for example, are described above with reference to block diagrams and/or operational illustrations of methods, systems, and computer program products according to embodiments of the invention. The functions/acts noted in the blocks may occur out of the order as shown in any flowchart. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
The description and illustration of one or more embodiments provided in this application are not intended to limit or restrict the scope of the invention as claimed in any way. The embodiments, examples, and details provided in this application are considered sufficient to convey possession and enable others to make and use the best mode of the claimed invention. The claimed invention should not be construed as being limited to any embodiment, example, or detail provided in this application. Regardless of whether shown and described in combination or separately, the various features (both structural and methodological) are intended to be selectively included or omitted to produce an embodiment with a particular set of features. Having been provided with the description and illustration of the present application, one skilled in the art may envision variations, modifications, and alternate embodiments falling within the spirit of the broader aspects of the general inventive concept embodied in this application that do not depart from the broader scope of the claimed invention.

Claims (20)

  1. A method of managing performance of a data query, comprising:
    receiving a data query at a data repository for running against one or more data items;
    running the data query at the data repository against the one or more data items;
    tracking a runtime for the running of the data query; and
    if the runtime exceeds a threshold runtime, terminating the running of the data query against the one or more data items.
  2. The method of Claim 1, wherein the running of the data query is terminated if the runtime exceeds a threshold runtime of ten hours.
  3. The method of Claim 1, prior to running the data query against the one or more data items, at the data repository, automatically scheduling a running of the data query relative to the running of one or more other received data queries.
  4. The method of Claim 1, wherein terminating the running of the data query includes quarantining the terminated data query from subsequently running against the one or more data items.
  5. The method of Claim 4, further comprising storing the quarantined data query at a quarantine repository from which it may be analyzed for errors.
  6. The method of Claim 1, further comprising reporting the termination of the running of the data query to a querying party from which the data query was received.
  7. The method of Claim 1, further comprising:
    receiving one or more additional data queries at the data repository wherein the one or more additional data queries and the terminated data query comprise a query job; and
    running each of the one or more additional data queries in sequence as prescribed by a querying party from which the query job is received.
  8. The method of Claim 7, wherein if a threshold number of data queries included in the query job are terminated due to excessive runtime during a threshold time period, terminating the query job from further processing.
  9. The method of Claim 8, further comprising:
    quarantining the terminated query job from running against the one or more data items; and
    preventing any additional data queries comprising the query job from running against the one or more data items.
  10. The method of Claim 8, wherein terminating the query job from further processing includes terminating the query job due to excessive runtime if more than three data queries are terminated for excessive runtime out of a seven data query sequence.
  11. The method of Claim 7, further comprising:
    reporting the terminated query job to a querying party from which the query job was received;
    receiving a modification to one or more data queries comprising the query job; and
    if the modification to the one or more data queries comprising the query job allows the one or more data queries comprising the query job to run to completion without exceeding the threshold runtime, allowing a running of the data queries comprising the query job against the one or more data items.
  12. A method of managing performance of a test data query, comprising:
    receiving a test data query at a data repository for running against one or more test data items;
    running the test data query at the data repository against the one or more test data items;
    tracking a runtime for the running of the test data query; and
    if the runtime of the test data query exceeds a threshold runtime, pausing the running of the test data query to allow other test data queries to run during a time in which the test data query is paused.
  13. The method of Claim 12, further comprising, at the data repository, automatically scheduling a running of the test data query relative to the running of one or more other received test data queries by placing the received test data query in a run queue from which test data queries are pulled for running against one or more test data items.
  14. The method of Claim 12, wherein pausing the running of the test data query includes moving the test data query from a run queue to a wait queue for yielding a running of the test data query to a running of other test data queries in the run queue.
  15. The method of Claim 14, wherein if a space becomes available in the run queue, moving the paused test data query from the wait queue to the run queue.
  16. The method of Claim 15, wherein after the paused test data query is moved from the wait queue to the run queue, running the previously paused test data query against the one or more test data items.
  17. The method of Claim 16, wherein if the running of the previously paused test data query exceeds the threshold runtime, pausing the running of the previously paused test data query indefinitely, and reporting the pausing of the running of the previously paused test data query to a querying party from which the test data query was received.
  18. A system for managing performance of a data query, the system comprising:
    one or more processors;
    memory storing one or more modules that are executable by the one or more processors, the one or more modules comprising:
    a queries domain operative to:
    receive a data query at a data repository for running against one or more data items;
    run the data query at the data repository against the one or more data items;
    track a runtime for the running of the data query; and
    terminate the running of the data query against the one or more data items if the runtime exceeds a threshold runtime.
  19. The system of Claim 18, the queries domain being further operative to pause the running of the data query against the one or more data items if the data query is a test data query and exceeds a threshold runtime for test data queries.
  20. The system of Claim 18, the queries domain being further operative to:
    receive one or more additional data queries at the data repository wherein the one or more additional data queries and the terminated data query comprise a query job;
    run each of the one or more additional data queries in sequence as prescribed by a querying party from which the query job is received; and
    terminate further processing of the query job if a threshold number of data queries included in the query job are terminated due to excessive runtime during a prescribed sequence of run data queries from the query job.
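
The job-termination rule of claims 10 and 20 and the run queue/wait queue handling of claims 14 to 17 can be illustrated with a minimal Python sketch. It is not the claimed implementation: queries are simulated by their runtimes instead of being executed, the function names `run_query_job` and `run_test_queries` are hypothetical, a paused test query is assumed to resume where it left off, and "space becomes available in the run queue" is simplified to "the run queue has drained".

```python
from collections import deque


def run_query_job(runtimes, threshold_runtime, max_terminated=3, window=7):
    """Run a job's queries in order; stop the job after too many timeouts.

    `runtimes` holds the simulated runtime of each query in the job.
    A query is terminated when its runtime exceeds `threshold_runtime`.
    The job is terminated once more than `max_terminated` of the last
    `window` queries were terminated (three of seven, as in claim 10).
    """
    recent = deque(maxlen=window)  # True for each terminated query in the window
    outcomes = []
    for runtime in runtimes:
        terminated = runtime > threshold_runtime
        outcomes.append("terminated" if terminated else "completed")
        recent.append(terminated)
        if sum(recent) > max_terminated:
            return "job terminated", outcomes  # remaining queries never run
    return "job completed", outcomes


def run_test_queries(work, threshold_runtime):
    """Pause long test queries so that shorter ones can run first.

    `work` maps a query name to the simulated time it needs.
    Returns (names in completion order, names paused indefinitely).
    """
    run_queue = deque(work.items())
    wait_queue = deque()
    paused_once = set()
    completed, reported = [], []
    while run_queue or wait_queue:
        if not run_queue:
            # Assumption: space is available once the run queue is empty,
            # so the oldest paused query moves back (claim 15).
            run_queue.append(wait_queue.popleft())
        name, remaining = run_queue.popleft()
        if remaining <= threshold_runtime:
            completed.append(name)
        elif name in paused_once:
            # Second overrun: pause indefinitely and report (claim 17).
            reported.append(name)
        else:
            # First overrun: yield to the other queries (claim 14).
            paused_once.add(name)
            wait_queue.append((name, remaining - threshold_runtime))
    return completed, reported
```

With a threshold of 5, `run_query_job([1, 9, 9, 1, 9, 9, 1, 1], 5)` stops at the sixth query, because four of the queries run so far were terminated. `run_test_queries({"a": 2, "b": 8, "c": 3, "d": 20}, 5)` completes `a`, `c` and then `b`, and reports `d` as paused indefinitely.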
PCT/CN2015/073491 2015-03-02 2015-03-02 Data query job submission management WO2016138616A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2015/073491 WO2016138616A1 (en) 2015-03-02 2015-03-02 Data query job submission management
CN201580056607.4A CN107077490B (en) 2015-03-02 2015-03-02 Data query job submission management

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/073491 WO2016138616A1 (en) 2015-03-02 2015-03-02 Data query job submission management

Publications (1)

Publication Number Publication Date
WO2016138616A1 (en) 2016-09-09

Family

ID=56849199

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/073491 WO2016138616A1 (en) 2015-03-02 2015-03-02 Data query job submission management

Country Status (2)

Country Link
CN (1) CN107077490B (en)
WO (1) WO2016138616A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664366B (en) * 2017-03-28 2021-08-24 百度在线网络技术(北京)有限公司 Data transmission method and device and server
WO2021077341A1 (en) * 2019-10-23 2021-04-29 北京欧珀通信有限公司 Data request method and device, system, server, and storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
US8200657B2 (en) * 2005-01-28 2012-06-12 International Business Machines Corporation Processing cross-table non-boolean term conditions in database queries
CN104216894A (en) * 2013-05-31 2014-12-17 国际商业机器公司 Method and system for data query

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US20050192937A1 (en) * 2004-02-26 2005-09-01 International Business Machines Corporation Dynamic query optimization
CN103294533B (en) * 2012-10-30 2016-09-07 北京安天电子设备有限公司 task flow control method and system
CN103414771B (en) * 2013-08-05 2017-02-15 国云科技股份有限公司 Monitoring method for long task operation between nodes in cloud computing environment
CN103761185B (en) * 2014-01-14 2016-06-22 烽火通信科技股份有限公司 A kind of automatization test system and method

Also Published As

Publication number Publication date
CN107077490A (en) 2017-08-18
CN107077490B (en) 2021-03-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15883673

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15883673

Country of ref document: EP

Kind code of ref document: A1