US8620945B2 - Query rewind mechanism for processing a continuous stream of data - Google Patents
- Publication number
- US8620945B2 (application US12/888,427)
- Authority
- US
- United States
- Prior art keywords
- query
- data
- stream
- chunk
- continuous stream
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
Definitions
- DB database
- Database management systems manage large volumes of data that need to be efficiently accessed and manipulated. Queries to the database are becoming increasingly complex to execute in view of such massive data structures. If queries to the database are not completed in a sufficient amount of time, then acceptable performance is difficult to achieve.
- FIG. 1 shows a database system with a query engine in accordance with an example implementation.
- FIG. 2 shows a data stream management system in accordance with an example implementation.
- FIG. 3 shows a flow diagram for processing a continuous stream of data in accordance with an example implementation.
- FIG. 4 shows a method in accordance with an example implementation.
- FIG. 5 shows a computer system in accordance with an example implementation.
- Example implementations are systems, methods, and apparatuses for stream processing that apply queries to data chunks of a continuous data stream.
- The data stream is divided into chunks for analysis based on Structured Query Language (SQL) queries and User Defined Functions (UDFs), while the data chunks of multiple streams are synchronized.
- a rewind mechanism processes the data chunk-by-chunk and enables a single long-standing query instance to be sustained, which allows the state of the query and UDFs invoked in the query to be maintained across cycles.
- Stream processing deals with unbounded data sequences, but often on a per-chunk basis with chunks divided by time windows, such as calculating the aggregates or statistics of every ten minutes of data, or the moving averages over five-minute sliding windows. Integrating data-intensive stream processing with SQL-based query processing under the notion of a continuous query brings benefits such as fast data access, reduced data transfer, SQL expressive power, and the leveraging of existing database technology, such as transaction management.
- Some techniques apply an SQL query to the entire input data set rather than to the data chunk-by-chunk. Such techniques repeatedly launch the same query on each window of data, with each launch scheduled by the workflow system.
- These techniques have several shortcomings: the stream processing system is built on top of the query engine, with overhead in workflow scheduling, memory management, and inter-process communication; and the stream processing is performed not by a true continuous query but by multiple query executions with set-up/tear-down overhead. Since these executions are isolated in memory context, the result and state of a window query execution, as well as of the UDFs invoked by the query, cannot be sustained and carried over across windows.
- Stream processing supports continuous and incremental analytics over data before the data are loaded in the database (as opposed to a traditional store-first query-later approach).
- Stream processing is characterized by Continued Query or Continuous Query (CQ) and by dealing with a sequence of data chunks falling in time windows.
- a CQ is actually implemented by multiple query instances scheduled to run based on time-delta or other events.
- the SQL query is applied to the entire input relation, rather than the incoming relational data chunk-by-chunk corresponding to the (time) windows.
- Example embodiments use a cut-and-rewind approach to apply an SQL query on a chunk-by-chunk basis of data of the input stream.
- a stream source function that accepts events to generate stream elements.
- an end-of-data message is signaled by the stream source function to the query engine to terminate the query execution.
- the query engine executes a query rewind (as opposed to shutting down and restarting processing of the data falling in a subsequent window).
- rewinding a query applied to static data means re-scanning the same data
- rewinding a query applied to continuous stream data means reactivating the stream source function for processing the newly incoming data.
- a stream is naturally processed chunk-by-chunk by the designated query with the full SQL expressive power. Two or more streams may be joined under the cut semantics where the query rewinding point serves as a synchronization point of processing these streams.
- Conventional query evaluation is a tuple-by-tuple iterative process.
- Stream processing with the cut-and-rewind mechanism is considered a Super Iterative Continuous Query (SICQ) process over chunk-by-chunk evaluation cycles.
- The SICQ approach allows tight integration of stream processing and query processing, as SICQ is directly supported by the extended query engine rather than by a workflow system built outside of the query engine.
- the SICQ is a continuous query (i.e., it is not stopped or shutdown), which eliminates the query set up/tear-down overhead.
- the SICQ further allows the results and states of every window processing cycle to be sustained and carried onto the next cycle.
- Continuation of the query instance enables various incremental computations with flexible granularities, and allows static data (e.g. data used in a UDF) to be loaded only once, rather than the data being repeatedly fetched cycle-by-cycle.
- the CQ approach with example embodiments differs from regular querying in several aspects.
- Stream data are captured by stream source functions, which are a special kind of User Defined Function (UDF) that is extended with support from the query engine.
- the CQ does not stop and continuously processes the stream with a single long-standing query, rather than a large number of periodically setup/tear-down short queries.
- FIG. 1 shows a database system 100 with a database or query engine 110 that includes a SICQ engine and/or a PostgreSQL engine in accordance with an example implementation.
- The query engine is in communication with a query rewind mechanism (or rewind mechanism) 115.
- Multiple input streams 120 (shown as chunk-by-chunk input) are input to a cycle-based continuous query for stream processing 130, which is in communication with the query engine 110 and a database 140.
- the processed input streams are output 150 (shown as chunk-by-chunk output).
- Example embodiments utilize the query engine 110 and query rewind mechanism 115 for in-DB stream processing.
- the query engine processes endless data streams 120 in a record-by-record or tuple-by-tuple fashion and implements the query rewind mechanism to sustain the query as a single long-standing query, rather than a large number of periodically setup/tear-down short queries (i.e., rather than numerous, single queries issued on the incoming continuous data stream).
- Automatic information derivation is a continuous querying and computation process where an operation is driven by input data streams 120 and outputs to other data streams 150 . In this way, the process acts as both a stream consumer and a stream producer. Since input is continuous (i.e., does not end), the process does not cease; although it may have different paces at different operations.
- FIG. 2 shows a data stream management system (DSMS) 200 in accordance with an example embodiment.
- A continuous input stream 210 is provided to the DSMS 200 (which includes the query engine of FIG. 1), which is in communication with a continuous query generator 220, computer system 230, and archive 240.
- Streamed results 250 are provided to the computer system 230 .
- the input stream is an unbounded bag of tuple and timestamp pairs. Windowing operators convert the input streams into relations that are then transformed back into an output or answer stream.
- Example embodiments build window functions or operators for query engine enabled stream processing.
- The window operator is history-sensitive in the sense that it can keep the data state in a window of time (e.g. 1 minute) or granule (e.g. 100 tuples) and execute the required operation, such as a delta-aggregate, on those data.
- Window operations are handled by a long-standing forever query, rather than by separate individual short queries. Window operations are directly scheduled by the query engine rather than by an external scheduler or database program (e.g. a PL/SQL script or a stored procedure).
- window operations with different granularities and for different applications can be specified in a single query and these windows are allowed to be overlapping.
- Example embodiments support at least two kinds of window operations: delta window operations on data separated by given time or cardinality ranges (e.g. every 1 minute or every 100 tuples), and sliding windows.
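- The difference between the two kinds of window operations can be illustrated with a minimal Python sketch. It is not taken from the patent; the function names `delta_windows` and `sliding_windows` and the cardinality-based window size are assumptions made for illustration only.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List


def delta_windows(values: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield disjoint (delta) windows of `size` tuples each."""
    window: List[float] = []
    for value in values:
        window.append(value)
        if len(window) == size:
            yield window
            window = []
    if window:  # trailing partial window
        yield window


def sliding_windows(values: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield overlapping windows holding the most recent `size` tuples."""
    window: Deque[float] = deque(maxlen=size)
    for value in values:
        window.append(value)  # the oldest tuple drops out automatically
        if len(window) == size:
            yield list(window)


data = [1, 2, 3, 4, 5, 6]
delta_result = list(delta_windows(data, 3))
sliding_result = list(sliding_windows(data, 3))
```

A delta window hands each tuple to exactly one window, whereas a sliding window reuses the most recent tuples, which is why a sliding window needs state that outlives a single chunk.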
- static data retrieved from the database is cached in a window operation. In one example embodiment, this static data is loaded only once in the entire long-standing query, which removes much of the data access cost of the multi-query based stream processing.
- the SICQ is extended by a PostgreSQL engine.
- This engine directly leverages the SQL language and the query engine for stream processing.
- the SICQ expresses stream processing directly using extended SQL with UDFs.
- the SICQ engine is an extension of the query engine, rather than a system built “on top” of the query engine.
- FIG. 3 shows a flow diagram for processing a continuous stream of data in accordance with an example implementation.
- a continuous stream of data is received at a query engine.
- the continuous stream of data is divided into data chunks.
- an SQL query with UDFs is applied to the data chunks of the continuous stream of data.
- the data chunks are processed at the query engine with a query rewind mechanism to sustain the SQL query as a single long-standing query that allows the UDFs and a state of the query to be maintained across multiple cycles.
- Example embodiments apply the SQL query to data on a chunk-by-chunk basis rather than to an entire data set.
- the query instance is iteratively rewound with a query rewind mechanism for processing the data in time windows.
- the query runs in cycles for processing data chunk-by-chunk. In each cycle, the query applies to the data falling in a window boundary (as opposed to being applied to the entire data set).
- rewinding the query applies to the new data flowing in, which is very different from re-scanning a static data set.
- An example embodiment sustains a query state across running cycles.
- An SICQ is rewound but not torn down.
- the query result is maintained and the data is stored in UDFs over cycles.
- This stored data is used, for example, to aggregate the results obtained in multiple cycles and to maintain the UDF state continuously regardless of cycle boundaries, which may be used for handling continuous sliding windows without the discontinuation around cycle/chunk boundaries. This feature enables aggregating stream data in multiple levels with various degrees of granularity.
- the stream processing is executed by a long-standing SQL query instance, rather than by multiple event driven query instances.
- Query evaluation is an iterative process.
- the operators, except the scan operators, apply to data one tuple at a time. These operators form a query tree where a parent operator requests its child operator to deliver the “next” result as its input. In turn, the child operator requests its own child operator to deliver the “next” result, and this process continues.
- the states maintained in aggregate operators and in UDFs are continued across tuple-by-tuple operator invocations.
- SICQ execution is a super iterative process in the sense that the same query is applied to the data on a chunk-by-chunk basis for supporting stream window semantics. Further, since the SICQ does not stop (i.e., is continuous), its states (i.e., the cycle execution results and the data kept in the UDFs) are continued across cycles. Cut-and-rewind is consistently handled at two levels of the stream query processing: the function scan level and the overall query execution level.
- The operators at the leaf level of a query tree are data access operators, known as scan operators, which are further classified into operators for table-scan, index-scan, function scan, etc.
- a scan operator typically materializes a block or even the entire resulting tuple set first and then has the tuple fetched one-by-one by its parent operator.
- a query is executed from the root of the query tree in a demand driven, iterative fashion, where a parent operator requests its child operators to deliver the next tuple. In turn, the child operator requests its own child operators to deliver the next tuple, until reaching the scan operators.
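- The demand-driven evaluation described above can be sketched in Python as follows. This is an illustrative model, not the query engine's code; the `Scan` and `Filter` classes and the `(segment, speed)` row layout are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[int, int]  # hypothetical (segment, speed) tuple


class Scan:
    """Leaf operator: hands out the rows of an in-memory list one at a time."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.pos = 0

    def next(self) -> Optional[Row]:
        if self.pos >= len(self.rows):
            return None  # no more tuples
        row = self.rows[self.pos]
        self.pos += 1
        return row


class Filter:
    """Parent operator: pulls from its child until a row passes the predicate."""

    def __init__(self, child: Scan, predicate: Callable[[Row], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def next(self) -> Optional[Row]:
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row


def run(root: Filter) -> List[Row]:
    """Drive the tree from the root, one demand at a time."""
    out: List[Row] = []
    while (row := root.next()) is not None:
        out.append(row)
    return out


plan = Filter(Scan([(1, 50), (2, 80), (3, 65)]), lambda r: r[1] > 60)
result = run(plan)
```

Each call to `next()` on the root triggers just enough work in the child operators to produce one tuple, and a `None` return plays the role of the NULL that ends the execution.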
- example embodiments use SQL query based stream processing to scan and generate stream data at two levels.
- At the lower level is a Stream Source Function (SSF) that gets input events by listening to an event source, reading data from files, etc., and outputs tuples with the predefined relation schema.
- At the higher level is the SSF scan operator, which invokes the SSF once for each tuple delivered to the query as a stream element.
- The SSF scan operator invokes the SSF by passing in a system handle, the function call handle (fcH), a data structure that contains the function call information. Both the operator and the SSF reference it and use it for their communication.
- the SSF scan operator belongs to the query engine, and the SSF is a User Defined Table Function for the designated stream data.
- the SSF scan operator is the fixed part and the SSF is the changeable part.
- SSFs are a new kind of data source that alters the query engine's block-based function scan method to retrieve stream elements (tuples) from the SSF on a per-tuple basis, which constitutes the starting point for SQL query based stream processing.
- An SSF can stop delivering tuples by signaling “end-of-data” to the SSF scan operator.
- Upon receipt of the “end-of-data” message, the SSF scan operator returns NULL to its own parent operator and in this way terminates the current query execution.
- The SSF reports the “end-of-data” status to the SSF scan operator by updating a designated field of the fcH mentioned above.
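- The interplay between the SSF, the SSF scan operator, and the fcH might look like the following Python sketch. All class and field names (`FunctionCallHandle`, `end_of_data`, `StreamSourceFunction`, `SSFScan`) are assumptions for illustration; the patent does not give the layout of the fcH.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple

Event = Tuple[int, float]  # hypothetical (timestamp_seconds, speed)


@dataclass
class FunctionCallHandle:
    """Stand-in for the fcH: state shared by the scan operator and the SSF."""
    end_of_data: bool = False


class StreamSourceFunction:
    """Delivers one tuple per call and flags end-of-data at a window boundary."""

    def __init__(self, events: Iterator[Event], window_seconds: int) -> None:
        self.events = events
        self.window_seconds = window_seconds
        self.boundary = window_seconds
        self.pending: Optional[Event] = None  # first tuple of the next chunk

    def call(self, fch: FunctionCallHandle) -> Optional[Event]:
        event = self.pending if self.pending is not None else next(self.events, None)
        self.pending = None
        if event is None:  # the source itself is exhausted
            fch.end_of_data = True
            return None
        if event[0] >= self.boundary:
            # The event belongs to a later window: hold it back and cut here.
            self.pending = event
            self.boundary += self.window_seconds
            fch.end_of_data = True
            return None
        return event


class SSFScan:
    """Scan operator: invokes the SSF once per tuple and checks the handle."""

    def __init__(self, ssf: StreamSourceFunction) -> None:
        self.ssf = ssf
        self.fch = FunctionCallHandle()

    def next(self) -> Optional[Event]:
        self.fch.end_of_data = False
        row = self.ssf.call(self.fch)
        return None if self.fch.end_of_data else row


def read_chunk(scan: SSFScan) -> List[Event]:
    rows: List[Event] = []
    while (row := scan.next()) is not None:
        rows.append(row)
    return rows


events = iter([(5, 40.0), (30, 50.0), (61, 55.0), (90, 45.0)])
scan = SSFScan(StreamSourceFunction(events, 60))
first_chunk = read_chunk(scan)
second_chunk = read_chunk(scan)
```

In this sketch the tuple that crosses the window boundary is buffered rather than dropped, so the next cycle starts with it once the scan is reactivated.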
- Cutting data into chunks is based on application-oriented conditions.
- One example condition is based on the chunk size (i.e., the number of tuples contained in each chunk).
- the most frequently used condition is the time window (e.g., a five-minute time window) for the stream data with timestamps.
- the cut mechanism is one part of the SICQ approach.
- the next issue is to maintain the query instance across the query executions on multiple data chunks in order to continue the execution states and results.
- The cut-and-rewind approach applies an SQL query to the data chunk-by-chunk (rather than to the data in its entirety) while keeping the query instance alive across multiple chunk-oriented executions. This sustains and continues the execution states and provides incremental time-window based computation, aggregation, etc.
- the stream source function that accepts events to generate stream elements.
- the “end-of-data” is signaled by the stream source function to the query engine to terminate the query execution.
- the query rewinds (rather than undergoing a shutdown/restart operation) for processing the data falling in the subsequent window.
- Two or more streams may be joined under the cut semantics where the query rewinding point serves as the synchronization point of processing these streams.
- stream processing with rewind query is a Super Iterative Continuous Query (SICQ) process over chunk-by-chunk evaluation cycles.
- the executor processes a tree of “plan nodes.”
- the plan tree is essentially a demand-pull pipeline of tuple processing operations.
- Each node when called, produces the next tuple in its output sequence, or produces NULL if no more tuples are available. If the node is not a primitive relation-scanning node, the child node(s) are called to obtain input tuples.
- the plan tree delivered by the planner includes a tree of plan nodes.
- Each plan node has expression trees that represent its target list, qualification conditions, etc.
- one example embodiment builds a parallel tree of identical structures containing executor state nodes (e.g., every plan and expression node type has a corresponding executor state node type).
- Each node in the state tree has a pointer to its corresponding node in the plan tree, plus executor state data as needed to implement that node type. This arrangement allows the plan tree to be read-only as far as the executor is concerned: the data that is modified during execution is in the state tree.
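- A minimal Python sketch of the parallel plan tree and state tree follows. The node classes and the `tuples_produced` counter are hypothetical and stand in for whatever executor state a real node type needs.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class PlanNode:
    """Read-only plan node; `children` is a tuple so the plan stays immutable."""
    kind: str
    children: Tuple["PlanNode", ...] = ()


@dataclass
class StateNode:
    """Executor state node: points at its plan node and holds mutable data."""
    plan: PlanNode
    children: List["StateNode"] = field(default_factory=list)
    tuples_produced: int = 0  # example of per-node run-time state


def build_state_tree(plan: PlanNode) -> StateNode:
    """Mirror the plan tree with a parallel tree of state nodes."""
    return StateNode(plan, [build_state_tree(child) for child in plan.children])


plan = PlanNode("Agg", (PlanNode("FunctionScan"),))
state = build_state_tree(plan)
state.children[0].tuples_produced += 1  # execution mutates the state tree only
```

Because all mutable data lives in the state tree, resetting the state nodes is enough to bring a query instance back to its starting point while the plan itself is left untouched.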
- control flow for query processing becomes the following:
- Rewind is based on the rescan utilities of the DBMS.
- An example use of rewind is to rewind a fetch cursor to re-fetch the query result.
- the resulting data is sent to the client through a server-client connection tuple-by-tuple. If the results are not materialized, rescan includes a re-calculation; otherwise the materialized data is retrieved directly.
- The rescan utility has wider application beyond supporting rewind. For example, in a nested loop join, the left plan state tree is rescanned.
- the rescan is made top-down along the PlanState tree, at any level. If the results are materialized, the results are reused without further rescan.
- the hash groups in aggregate/group belong to this case, where the re-scan function returns the materialized data without further query processing action. This is seen in some other rescan functions such as those for dealing with sorting, limiting, etc.
- For the cut-and-rewind based cycle query for stream processing, the situation is different. Here, rewinding resets the query instance to the original state of the plan nodes, after which the query starts to process the new streaming data rather than the previous data. In this case, the use of materialized results is disabled and a recalculating rescan is enforced.
- one example embodiment identifies when the cycle-rescan is to be used and when the regular rescan is to be used in order not to alter the normal behavior and performance of the DBMS.
- streaming queries are identified.
- a stream query is identified as follows:
- The input stream tuples contain the position (segment) and speed of a car (identified by the vehicle ID, vid) at a time (in seconds, starting from 0). These tuples are delivered by the stream source function STREAM_CYCLE_LR(time, cycles, xway), where the parameter time is the number of seconds in a query cycle, serving as the data chunk boundary, and cycles is the number of cycles the query is supposed to run. For example, STREAM_CYCLE_LR(60, 60, 1) delivers 1-minute (60-second) data chunks for 1 hour (60 chunks) for expressway 1.
- The segment statistics computation, performed every minute for the cars in each segment of each expressway, is identified as the bottleneck of the benchmark.
- This query cuts data in one minute time-windows and rewinds 60 times. It is a single query instance that is applied to data on a chunk-by-chunk basis, which yields chunk-wise results.
- the query for calculating the average speed in each segment, each direction in each minute is expressed as follows:
- This query calculates the average speed of cars in each section in the past five minutes by a sliding window function lr_moving_avg().
- With lr_moving_avg(), the per-minute average speed is calculated by the query for each minute, while the moving average is calculated by the function, which buffers the per-minute averages continuously regardless of cycle boundaries. This is why the query instance is rewound rather than reinitiated: rewinding keeps the query instance state across cycles.
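- How a function such as lr_moving_avg() could buffer per-minute averages across cycles is sketched below in Python. The implementation of lr_moving_avg() is not given in the source, so the `MovingAverage` class, its five-entry buffer, and the `(xway, dir, seg)` key are assumptions.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple

Key = Tuple[int, int, int]  # hypothetical (xway, dir, seg)


class MovingAverage:
    """Keeps the last `span` per-minute averages for every segment key."""

    def __init__(self, span: int = 5) -> None:
        self.buffers: Dict[Key, Deque[float]] = defaultdict(
            lambda: deque(maxlen=span)
        )

    def __call__(self, key: Key, minute_avg: float) -> float:
        buf = self.buffers[key]
        buf.append(minute_avg)  # the oldest minute falls out once the buffer is full
        return sum(buf) / len(buf)


moving_avg = MovingAverage(span=5)  # created once, reused in every cycle
key = (1, 0, 7)
minute_averages = [60.0, 50.0, 40.0, 30.0, 20.0, 10.0]  # one value per cycle
results = [moving_avg(key, avg) for avg in minute_averages]
```

If the object were recreated for every cycle, each result would simply equal the current minute's average; keeping it alive across cycles is what yields a true five-minute moving average.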
- Two stream sources can be joined. For example, assume that the stream source function S1 delivers stream data more frequently than S2.
- The stream data delivered by S1 are cut every 100 tuples, and the stream data delivered by S2 are cut every 10 tuples.
- The following query joins every 100 tuples of S1 data with every 10 tuples of S2 data in each of the 100 cycles. This query in effect synchronizes the two data streams as follows:
- SICQ leverages the full power of SQL and the DBMS, and integrates stream processing and data management. Additionally, SICQ extends the query engine capability for applying a query, with full SQL expressive power, to data chunks rather than to the entire data set. Also, with SICQ, a data stream is divided or cut into chunks for SQL/UDF based analysis, while the data chunks of multiple streams with the same or different window conditions are synchronized by the query applied to them (e.g., joining two streams every five minutes, or joining every 100 tuples of stream A with every 10 tuples of stream B).
- The single long-standing query instance is sustained, which allows the state of the query and the states of the UDFs invoked in the query to be maintained or stored and continued across cycles.
- The static data for per-tuple processing is loaded once (as opposed to being reloaded in each cycle), and operations such as sliding window operations continue without being affected by the window boundaries.
- FIG. 4 is a flow diagram for traversing a multidimensional database while searching a query in accordance with an exemplary embodiment.
- The flow diagram is implemented in a data center that receives and stores data in a database, receives queries from a user, executes the queries, and provides search or query results back to the user.
- a query is received to search a multi-dimensional database.
- the database is searched for the terms or keywords in the query.
- results of the query are provided to the user.
- the results of the query are displayed to the user on a display, stored in a computer, or provided to another software application.
- FIG. 5 is a block diagram of a computer system 500 in accordance with an exemplary embodiment of the present invention.
- the computer system is implemented in a data center.
- The computer system includes a database or warehouse 560 (such as a multidimensional database) and a computer or electronic device 505 that includes memory 510, algorithms and/or computer instructions 520, display 530, processing unit 540, and one or more buses 550.
- the processor unit includes a processor (such as a central processing unit, CPU, microprocessor, application-specific integrated circuit (ASIC), etc.) for controlling the overall operation of memory 510 (such as random access memory (RAM) for temporary data storage, read only memory (ROM) for permanent data storage, and firmware).
- the processing unit 540 communicates with memory 510 and algorithms 520 via one or more buses 550 and performs operations and tasks necessary for constructing models and searching the database per a query.
- the memory 510 for example, stores applications, data, programs, algorithms (including software to implement or assist in implementing embodiments in accordance with the present invention) and other data.
- continuous query is a registered query that is continuously and/or repeatedly triggered.
- database means records or data stored in a computer system such that a computer program or person using a query language can send and/or retrieve records and data from the database.
- database management system or “DBMS” is computer software designed to manage databases.
- the term “data stream management system” or “DSMS” is computer software that controls the maintenance and querying of data streams.
- the DSMS issues continuous queries against the data stream, as opposed to a conventional database query that executes once and returns a set of results for the query.
- the continuous query continues to execute over time, even as new data enters the data stream.
- Processing the query on a “chunk-by-chunk” basis means applying the query to one chunk at a time, where a chunk is a set of data that falls within a time window (e.g., a one minute time window).
- multidimensional database is a database wherein data is accessed or stored with more than one attribute (a composite key).
- Data instances are represented with a vector of values and a collection of vectors (for example, data tuples) that are a set of points in a multidimensional vector space.
- query rewind mechanism or “rewind mechanism” is an apparatus or method that rewinds the query execution state to the beginning state for processing the next chunk of data without shutting-down and/or restarting the query instance.
- single long-standing query is a single query instance, rather than multiple instances (i.e., executions) of the same query.
- SQL provides a language for an administrator or computer to query and modify data stored in a database.
- a stream is a time varying data sequence.
- a stream can be a continuous sequence of (tuple, timestamp) pairs, where the timestamp defines an order over the tuples in the stream.
- query engine is a component of a database management system that is used to evaluate queries (e.g., SQL queries) to generate responses or answers to the queries.
- A window function is a function that is applied to the data falling in a window of value range (e.g., the data between the values of 100-200) and/or a window of time (e.g., a time window such as every one minute).
- one or more blocks or steps discussed herein are automated. In other words, apparatus, systems, and methods occur automatically.
- the methods illustrated herein and data and instructions associated therewith are stored in respective storage devices, which are implemented as one or more computer-readable or computer-usable storage media or mediums.
- the storage media include different forms of memory including semiconductor memory devices such as DRAM, or SRAM, Erasable and Programmable Read-Only Memories (EPROMs), Electrically Erasable and Programmable Read-Only Memories (EEPROMs) and flash memories; magnetic disks such as fixed, floppy and removable disks; other magnetic media including tape; and optical media such as Compact Disks (CDs) or Digital Versatile Disks (DVDs).
- instructions of the software discussed above can be provided on one computer-readable or computer-usable storage medium, or alternatively, can be provided on multiple computer-readable or computer-usable storage media distributed in a large system having possibly plural nodes.
- Such computer-readable or computer-usable storage medium or media is (are) considered to be part of an article (or article of manufacture).
- An article or article of manufacture can refer to any manufactured single component or multiple components.
- Example embodiments are implemented as a method, system, and/or apparatus. As one example, example embodiments and steps associated therewith are implemented as one or more computer software programs to implement the methods described herein.
- the software is implemented as one or more modules (also referred to as code subroutines, or “objects” in object-oriented programming).
- the software programming code for example, is accessed by a processor or processors of the computer or server from long-term storage media of some type, such as a CD-ROM drive or hard drive.
- the software programming code is embodied or stored on any of a variety of known physical and tangible media for use with a data processing system or in any memory device such as semiconductor, magnetic and optical devices, including a disk, hard drive, CD-ROM, ROM, etc.
- the code is distributed on such media, or is distributed to users from the memory or storage of one computer system over a network of some type to other computer systems for use by users of such other systems.
- the programming code is embodied in the memory and accessed by the processor using the bus.
Description
- Conventional query execution follows this control flow:
- (1) Create Query Descriptor.
- (2) Start Execution—Create Plan Instance, Create execution context and allocate memory space.
- (3) Execute.
- (4) End Execution—recursively release resources and free memory.
- (5) Free Query Descriptor.
- With cut-and-rewind, the control flow for query processing becomes:
- (1) Create Query Descriptor.
- (2) Start Execution—Create Plan Instance, Create execution context and allocate memory space.
- (3) EXEC: Execute on a chunk of data ended by the end-of-data signal.
- (4) Rewind the query plan instance, go to EXEC (process data chunk by chunk in a loop).
- (5) End Execution—recursively release resources and free memory.
- (6) Free Query Descriptor.
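- The six steps above can be sketched as a small Python loop. The `QueryInstance` class and its summing behavior are hypothetical; the sketch only shows that start and end run once while execute and rewind repeat for every chunk.

```python
from typing import Iterable, List, Tuple


class QueryInstance:
    """Hypothetical query instance: sums each chunk and keeps a running total."""

    def __init__(self) -> None:          # step (1): create the descriptor
        self.started = False
        self.running_total = 0           # state that survives rewinds
        self.chunk_sum = 0               # per-cycle state, reset by rewind

    def start(self) -> None:             # step (2): set up once
        self.started = True

    def execute(self, chunk: Iterable[int]) -> int:   # step (3): one chunk
        for value in chunk:
            self.chunk_sum += value
        self.running_total += self.chunk_sum
        return self.chunk_sum

    def rewind(self) -> None:            # step (4): reset per-cycle state only
        self.chunk_sum = 0

    def end(self) -> None:               # step (5): tear down once
        self.started = False


def run_cycles(chunks: Iterable[Iterable[int]]) -> Tuple[List[int], int]:
    query = QueryInstance()
    query.start()
    per_chunk: List[int] = []
    for chunk in chunks:                 # loop between steps (3) and (4)
        per_chunk.append(query.execute(chunk))
        query.rewind()
    query.end()
    return per_chunk, query.running_total  # step (6) is left to garbage collection


per_chunk, total = run_cycles([[1, 2], [3, 4], [5]])
```

The running total is available only because the same instance handles every chunk; with a separate instance per window it would have to be stored and reloaded between executions.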
- A stream query is identified and handled as follows:
- (1) Get a Boolean value as the flag of Stream Query (or cycle query) from query parser.
- (2) Place this information into Query Descriptor.
- (3) Pass the information to the Plan State tree level-by-level (e.g., for each node pass this value to its sub-node, left node, right node, and other component nodes).
- (4) In the rescan functions, disable the use of materialized results and enforce reprocessing (e.g., perform this function for stream cycle queries by checking the flag).
- (5) After rewinding, clear the stream cycle query flag from the query plan state tree level-by-level in order to have regular options of the rescan facility that is used in the query processing.
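- The effect of the stream cycle query flag on rescan can be sketched in Python as follows. The `PlanState` class is a simplified stand-in and not the DBMS's actual rescan code; its `fetch` callback and `materialized` field are assumptions.

```python
from typing import Callable, List, Optional


class PlanState:
    """Hypothetical plan-state node that materializes an aggregate result."""

    def __init__(
        self,
        fetch: Callable[[], List[int]],
        children: Optional[List["PlanState"]] = None,
    ) -> None:
        self.fetch = fetch
        self.children = children or []
        self.stream_cycle = False              # flag taken from the query descriptor
        self.materialized: Optional[int] = None

    def set_stream_flag(self, flag: bool) -> None:
        """Steps (3) and (5): pass the flag down the tree level by level."""
        self.stream_cycle = flag
        for child in self.children:
            child.set_stream_flag(flag)

    def rescan(self) -> None:
        """Step (4): a cycle rescan discards materialized results."""
        if self.stream_cycle:
            self.materialized = None
        for child in self.children:
            child.rescan()

    def result(self) -> int:
        if self.materialized is None:
            self.materialized = sum(self.fetch())
        return self.materialized


source = [1, 2, 3]
node = PlanState(lambda: list(source))
first = node.result()          # computed and materialized
source[:] = [10, 20]           # new stream data arrives
node.rescan()
regular = node.result()        # regular rescan reuses the materialized result
node.set_stream_flag(True)
node.rescan()
cycle = node.result()          # cycle rescan recomputes on the new data
node.set_stream_flag(False)    # restore regular rescan behavior
```

Guarding the recalculation with a flag keeps the regular rescan path unchanged for non-stream queries, which matches the stated goal of not altering the normal behavior of the DBMS.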
-- Active cars per minute, expressway, direction, and segment
select
  floor(time/60)::integer as minute,
  xway, dir, seg, count(distinct vid) as active_cars
from STREAM_CYCLE_lr_producer(60, 60, 1)
group by minute, xway, dir, seg;
-- Average speed per minute, expressway, direction, and segment
select floor(time/60)::integer as minute, xway, dir, seg, avg(speed)
from STREAM_CYCLE_lr_producer(60, 60, 1)
group by minute, xway, dir, seg;
-- Five-minute moving average of the per-minute average speed, via lr_moving_avg()
select p.minute, p.xway, p.dir, p.seg,
  lr_moving_avg(0, xway, dir, seg, minute, minute_avg_speed) as past_5m_avg_speed
from (
  select floor(time/60)::integer as minute, xway, dir, seg,
    avg(speed) as minute_avg_speed
  from STREAM_CYCLE_lr_producer(60, 10, 1)
  group by minute, xway, dir, seg) p;
select p.minute, p.xway, p.dir, p.seg, p.active_cars,lr_moving_avg(0, |
xway, dir, seg, minute, minute_avg_speed) as past_5m_avg_speed |
from ( |
select floor(time/60)::integer as minute, xway, dir, seg, |
avg(speed) as minute_avg_speed, count(distinct Vid) as |
active_cars |
from STREAM_CYCLE_lr_producer(60, 10, 1) |
group by minute, xway, dir, seg) p. |
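The two queries above call `lr_moving_avg` once per per-minute row, so the function has to remember values from earlier chunks. The sketch below shows one way such state could be kept. It is an assumption-laden illustration: the source does not define `lr_moving_avg`, the 5-minute window is inferred only from the alias `past_5m_avg_speed`, the meaning of the leading `0` argument is unknown and is omitted here, and the class name `MovingAvg` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple

Key = Tuple[int, int, int]   # (xway, dir, seg)


class MovingAvg:
    """Keeps per-segment, per-minute averages across chunks (assumed behavior)."""

    def __init__(self, window: int = 5) -> None:
        self.window = window   # assumed window length in minutes
        self._history: Dict[Key, Dict[int, float]] = defaultdict(dict)

    def update(self, xway: int, dir_: int, seg: int,
               minute: int, minute_avg_speed: float) -> float:
        per_minute = self._history[(xway, dir_, seg)]
        per_minute[minute] = minute_avg_speed
        # Drop minutes that have fallen out of the window ending at `minute`.
        for old in [m for m in per_minute if m <= minute - self.window]:
            del per_minute[old]
        return sum(per_minute.values()) / len(per_minute)
```

Because the history lives outside any single chunk, it would have to survive the rewind of the query plan instance for the moving average to span several cycles.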
select r1.pid, r2.v from
  (select p.pid AS pid, p.x AS x, p.y AS y
   from STREAM_S1(100, 100) p) r1,
  (select pp.pid AS pid, points_catch(pp.pid, pp.x, pp.y) AS v
   from STREAM_S2(10, 100) pp) r2
where r1.pid = r2.pid order by r1.pid DESC;
Claims (15)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/888,427 US8620945B2 (en) | 2010-09-23 | 2010-09-23 | Query rewind mechanism for processing a continuous stream of data |
US12/907,948 US8260803B2 (en) | 2010-09-23 | 2010-10-19 | System and method for data stream processing |
US12/907,940 US8260826B2 (en) | 2010-09-23 | 2010-10-19 | Data processing system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/888,427 US8620945B2 (en) | 2010-09-23 | 2010-09-23 | Query rewind mechanism for processing a continuous stream of data |
Related Child Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/907,948 Continuation-In-Part US8260803B2 (en) | 2010-09-23 | 2010-10-19 | System and method for data stream processing |
US12/907,940 Continuation-In-Part US8260826B2 (en) | 2010-09-23 | 2010-10-19 | Data processing system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
US20120078939A1 (en) | 2012-03-29 |
US8620945B2 (en) | 2013-12-31 |
Family
ID=45871712
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/888,427 Active 2030-11-19 US8620945B2 (en) | 2010-09-23 | 2010-09-23 | Query rewind mechanism for processing a continuous stream of data |
Country Status (1)
Country | Link |
---|---|
US (1) | US8620945B2 (en) |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130191370A1 (en) * | 2010-10-11 | 2013-07-25 | Qiming Chen | System and Method for Querying a Data Stream |
US9483110B2 (en) * | 2011-11-07 | 2016-11-01 | International Business Machines Corporation | Adaptive media file rewind |
US9436740B2 (en) | 2012-04-04 | 2016-09-06 | Microsoft Technology Licensing, Llc | Visualization of changing confidence intervals |
US9607045B2 (en) | 2012-07-12 | 2017-03-28 | Microsoft Technology Licensing, Llc | Progressive query computation using streaming architectures |
US9317553B2 (en) | 2012-10-31 | 2016-04-19 | Microsoft Technology Licensing, Llc | Declarative partitioning for data collection queries |
US9087082B2 (en) * | 2013-03-07 | 2015-07-21 | International Business Machines Corporation | Processing control in a streaming application |
US9436732B2 (en) | 2013-03-13 | 2016-09-06 | Futurewei Technologies, Inc. | System and method for adaptive vector size selection for vectorized query execution |
US9514214B2 (en) | 2013-06-12 | 2016-12-06 | Microsoft Technology Licensing, Llc | Deterministic progressive big data analytics |
WO2014204489A2 (en) * | 2013-06-21 | 2014-12-24 | Hitachi, Ltd. | Stream data processing method with time adjustment |
US10191943B2 (en) * | 2014-01-31 | 2019-01-29 | Indian Institute Of Technology Bombay | Decorrelation of user-defined function invocations in queries |
US9953074B2 (en) * | 2014-01-31 | 2018-04-24 | Sap Se | Safe synchronization of parallel data operator trees |
WO2015160362A1 (en) * | 2014-04-18 | 2015-10-22 | Hewlett-Packard Development Company, L.P. | Providing combined data from a cache and a storage device |
US10740328B2 (en) | 2016-06-24 | 2020-08-11 | Microsoft Technology Licensing, Llc | Aggregate-query database system and processing |
US10552435B2 (en) | 2017-03-08 | 2020-02-04 | Microsoft Technology Licensing, Llc | Fast approximate results and slow precise results |
CN113297246B (en) * | 2020-06-16 | 2022-10-21 | 阿里巴巴集团控股有限公司 | Data processing method, computing device and storage medium |
Patent Citations (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6910032B2 (en) * | 2002-06-07 | 2005-06-21 | International Business Machines Corporation | Parallel database query processing for non-uniform data sources via buffered access |
US7412481B2 (en) * | 2002-09-16 | 2008-08-12 | Oracle International Corporation | Method and apparatus for distributed rule evaluation in a near real-time business intelligence system |
US6947933B2 (en) * | 2003-01-23 | 2005-09-20 | Verdasys, Inc. | Identifying similarities within large collections of unstructured data |
US7861050B2 (en) * | 2003-11-13 | 2010-12-28 | Comm Vault Systems, Inc. | Systems and methods for combining data streams in a storage operation |
US7315923B2 (en) * | 2003-11-13 | 2008-01-01 | Commvault Systems, Inc. | System and method for combining data streams in pipelined storage operations in a storage network |
US7318075B2 (en) * | 2004-02-06 | 2008-01-08 | Microsoft Corporation | Enhanced tabular data stream protocol |
US7310638B1 (en) * | 2004-10-06 | 2007-12-18 | Metra Tech | Method and apparatus for efficiently processing queries in a streaming transaction processing system |
US7403959B2 (en) * | 2005-06-03 | 2008-07-22 | Hitachi, Ltd. | Query processing method for stream data processing systems |
US20070005579A1 (en) * | 2005-07-01 | 2007-01-04 | Microsoft Corporation | Query based synchronization |
US7860884B2 (en) * | 2006-08-21 | 2010-12-28 | Electronics And Telecommunications Research Institute | System and method for processing continuous integrated queries on both data stream and stored data using user-defined shared trigger |
US20080120283A1 (en) * | 2006-11-17 | 2008-05-22 | Oracle International Corporation | Processing XML data stream(s) using continuous queries in a data stream management system |
US20090106214A1 (en) * | 2007-10-17 | 2009-04-23 | Oracle International Corporation | Adding new continuous queries to a data stream management system operating on existing queries |
US7739265B2 (en) * | 2007-10-18 | 2010-06-15 | Oracle International Corporation | Deleting a continuous query from a data stream management system continuing to operate on other queries |
US7991766B2 (en) * | 2007-10-20 | 2011-08-02 | Oracle International Corporation | Support for user defined aggregations in a data stream management system |
US20090228434A1 (en) * | 2008-03-06 | 2009-09-10 | Saileshwar Krishnamurthy | Addition and processing of continuous sql queries in a streaming relational database management system |
US20090271529A1 (en) | 2008-04-25 | 2009-10-29 | Hitachi, Ltd. | Stream data processing method and computer systems |
US20090287628A1 (en) | 2008-05-15 | 2009-11-19 | Exegy Incorporated | Method and System for Accelerated Stream Processing |
US8010512B2 (en) * | 2008-06-16 | 2011-08-30 | International Business Machines Corporation | System and method for model-driven object store |
US20090327257A1 (en) * | 2008-06-27 | 2009-12-31 | Business Objects, S.A. | Apparatus and method for facilitating continuous querying of multi-dimensional data streams |
US20100088274A1 (en) * | 2008-10-03 | 2010-04-08 | Microsoft Corporation | System and method for synchronizing a repository with a declarative defintion |
US20100318495A1 (en) * | 2009-06-12 | 2010-12-16 | Sap Ag | Correlation aware synchronization for near real-time decision support |
US20110060890A1 (en) * | 2009-09-10 | 2011-03-10 | Hitachi, Ltd | Stream data generating method, stream data generating device and a recording medium storing stream data generating program |
US20110119262A1 (en) * | 2009-11-13 | 2011-05-19 | Dexter Jeffrey M | Method and System for Grouping Chunks Extracted from A Document, Highlighting the Location of A Document Chunk Within A Document, and Ranking Hyperlinks Within A Document |
US20110125778A1 (en) * | 2009-11-26 | 2011-05-26 | Hitachi, Ltd. | Stream data processing method, recording medium, and stream data processing apparatus |
US20110196856A1 (en) * | 2010-02-10 | 2011-08-11 | Qiming Chen | Processing a data stream |
Non-Patent Citations (2)
Title |
---|
Damian Black, Relational Asynchronous Messaging: A New Paradigm for Data Management & Integration, Jan. 2007. |
SQL Stream, Query the Future: SQLstream's RAMMS Solutions and Applications, 2009 SQL Stream Incorporated. |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10997124B2 (en) * | 2013-04-02 | 2021-05-04 | Micro Focus Llc | Query integration across databases and file systems |
US10067703B2 (en) | 2015-12-15 | 2018-09-04 | International Business Machines Corporation | Monitoring states of processing elements |
US10552421B2 (en) | 2015-12-15 | 2020-02-04 | International Business Machines Corporation | Monitoring states of processing elements |
US10311052B2 (en) | 2017-05-19 | 2019-06-04 | International Business Machines Corporation | Query governor enhancements for databases integrated with distributed programming environments |
Also Published As
Publication number | Publication date |
---|---|
US20120078939A1 (en) | 2012-03-29 |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P., TEXAS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:CHEN, QIMING;HSU, MEICHUN;REEL/FRAME:025031/0741 Effective date: 20100921 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
AS | Assignment | Owner name: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP, TEXAS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.;REEL/FRAME:037079/0001 Effective date: 20151027 |
FPAY | Fee payment | Year of fee payment: 4 |
AS | Assignment | Owner name: ENTIT SOFTWARE LLC, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP;REEL/FRAME:042746/0130 Effective date: 20170405 |
AS | Assignment | Owner name: JPMORGAN CHASE BANK, N.A., DELAWARE Free format text: SECURITY INTEREST;ASSIGNORS:ENTIT SOFTWARE LLC;ARCSIGHT, LLC;REEL/FRAME:044183/0577 Effective date: 20170901 Owner name: JPMORGAN CHASE BANK, N.A., DELAWARE Free format text: SECURITY INTEREST;ASSIGNORS:ATTACHMATE CORPORATION;BORLAND SOFTWARE CORPORATION;NETIQ CORPORATION;AND OTHERS;REEL/FRAME:044183/0718 Effective date: 20170901 |
AS | Assignment | Owner name: MICRO FOCUS LLC, CALIFORNIA Free format text: CHANGE OF NAME;ASSIGNOR:ENTIT SOFTWARE LLC;REEL/FRAME:050004/0001 Effective date: 20190523 |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |
AS | Assignment | Owner name: MICRO FOCUS LLC (F/K/A ENTIT SOFTWARE LLC), CALIFORNIA Free format text: RELEASE OF SECURITY INTEREST REEL/FRAME 044183/0577;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:063560/0001 Effective date: 20230131 Owner name: NETIQ CORPORATION, WASHINGTON Free format text: RELEASE OF SECURITY INTEREST REEL/FRAME 044183/0718;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:062746/0399 Effective date: 20230131 Owner name: MICRO FOCUS SOFTWARE INC. (F/K/A NOVELL, INC.), WASHINGTON Free format text: RELEASE OF SECURITY INTEREST REEL/FRAME 044183/0718;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:062746/0399 Effective date: 20230131 Owner name: ATTACHMATE CORPORATION, WASHINGTON Free format text: RELEASE OF SECURITY INTEREST REEL/FRAME 044183/0718;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:062746/0399 Effective date: 20230131 Owner name: SERENA SOFTWARE, INC, CALIFORNIA Free format text: RELEASE OF SECURITY INTEREST REEL/FRAME 044183/0718;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:062746/0399 Effective date: 20230131 Owner name: MICRO FOCUS (US), INC., MARYLAND Free format text: RELEASE OF SECURITY INTEREST REEL/FRAME 044183/0718;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:062746/0399 Effective date: 20230131 Owner name: BORLAND SOFTWARE CORPORATION, MARYLAND Free format text: RELEASE OF SECURITY INTEREST REEL/FRAME 044183/0718;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:062746/0399 Effective date: 20230131 Owner name: MICRO FOCUS LLC (F/K/A ENTIT SOFTWARE LLC), CALIFORNIA Free format text: RELEASE OF SECURITY INTEREST REEL/FRAME 044183/0718;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:062746/0399 Effective date: 20230131 |