WO2018215062A1 - System and method for stream processing - Google Patents

Publication number
WO2018215062A1
Authority
WO
WIPO (PCT)
Prior art keywords
events
stream
operator
operators
evictor
Application number
PCT/EP2017/062548
Other languages
French (fr)
Inventor
Radu TUDORAN
Goetz BRASCHE
Xing ZHU
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Priority to CN201780017236.8A (patent CN109643307B)
Priority to PCT/EP2017/062548 (patent WO2018215062A1)
Publication of WO2018215062A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2455: Query execution
    • G06F16/24568: Data stream processing; Continuous queries

Abstract

A system for processing a stream of events, the system comprising a plurality of operators configured to receive a plurality of streams, and being configured to: receive by a first operator a first stream; generate by the first operator a first window of events by selecting from the received first stream a first set of events that satisfy a filter test; generate by the first operator a first evictor stream by selecting from the received first stream a second set of events that do not satisfy the filter test; receive at a second operator the first evictor stream; apply by the first operator a first computation function on the first set of events to obtain a first output stream; and apply by the second operator on the first evictor stream at least one of a second filter and a second computation function, to obtain a second output stream.

Description

SYSTEM AND METHOD FOR STREAM PROCESSING
BACKGROUND
The present invention, in some embodiments thereof, relates to a system for processing a stream of data and, more specifically, but not exclusively, to distributed processing of data in big data systems.
The term big data is used to refer to a collection of data so large and/or so complex that traditional data processing application software cannot deal with the collection adequately. Among the challenges in dealing with big data is analysis of the large amount of data in the collection. In some systems the data is an ordered sequence of data instances or events, referred to as a stream of data or a stream of events.
In typical batch processing systems, data may be accessed as many times as needed to perform the required processing. In stream processing systems, data arrives continuously and cannot be stored for future reference. There may be a need to continuously calculate, on the fly, mathematical or statistical analytics within the stream of events. In some systems there is a need to handle high volumes of data in real time. In addition, there may be a need for the system to be scalable and have a fault tolerant architecture.
Some stream processing systems use window stream operators. A window stream operator is a software object for processing a set of data instances (also referred to as events), selected by applying a filter to some of the events of the stream of events. The set of selected events is called a window of events. After applying the filter, a typical window stream operator discards the remaining events, i.e. events out of the scope of the filter. In some systems, an instance of data can be read only once. In such systems, when the system comprises more than one window stream operator, there may be a need to duplicate the entire stream of events in order to select more than one window of events from the same stream of events.
SUMMARY
It is an object of the present invention to provide a system and a method for processing a stream of data.
The foregoing and other objects are achieved by the features of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures. According to a first aspect of the invention, a system for processing a stream of events comprises a plurality of operators configured to receive a plurality of streams of events, the system being configured to: receive by a first operator of said plurality of operators a first stream of events; generate by the first operator a first window of events by selecting from the received first stream of events a first set of events that satisfy a filter test; generate by the first operator a first evictor stream of events by selecting from the received first stream of events a second set of events that do not satisfy said filter test; receive at a second operator of the plurality of operators the first evictor stream of events from the first operator; apply by the first operator a first computation function on the first set of events to obtain a first output stream of events; and apply by the second operator on the first evictor stream at least one of a second filter and a second computation function, to obtain a second output stream of events. Directing the remaining events to an evictor stream of events allows other software objects to process the remaining events, facilitating processing based on dynamic partitioning of the stream of events.
According to a second aspect of the invention, a method for processing a stream of events by a plurality of operators comprises: receiving by a first operator of the plurality of operators a first stream of events; generating by the first operator a first window of events by selecting from the received first stream of events a first set of events that satisfy a filter test; generating by the first operator a first evictor stream of events by selecting from the received stream of events a second set of events that do not satisfy the filter test; receiving at a second operator of the plurality of operators the first evictor stream of events from the first operator; applying by the first operator a computation function on the first set of events to obtain a first output stream of events; and applying by the second operator on the first evictor stream at least one of a second filter and a second computation function, to obtain a second output stream of events.
With reference to the first and second aspects, in a first possible implementation of the first and second aspects of the present invention, the system is further configured to: receive by the first operator a second stream of events; generate by the first operator a second window of events by selecting from the first window of events and the second stream of events a third set of events that satisfy the filter test; generate by the first operator a second evictor stream of events by selecting from the first window of events and the second stream of events a fourth set of events that do not satisfy the filter test; receive at the second operator the second evictor stream of events from the first operator; apply by the first operator the first computation function on the third set of events to obtain a third output stream of events; and apply by the second operator on the second evictor stream at least one of the second filter and the second computation function, to obtain a fourth output stream of events. Producing a new window of events after receiving some more events allows continuous processing, which is needed to process a stream of events that is continuous by nature.
With reference to the first and second aspects, or the first possible implementation of the first and second aspects, in a second possible implementation of the first and second aspects of the present invention, at least one of the plurality of operators is a software object executed by at least one hardware processor. Using software objects allows complex and dynamic changes in processing.
With reference to the first and second aspects, or the first or second possible implementations of the first and second aspects, in a third possible implementation of the first and second aspects of the present invention, the first operator is configured to produce a plurality of output streams; one of the plurality of output streams is received by at least one third operator of the plurality of operators; and a second of the plurality of output streams is received by at least one fourth operator of the plurality of operators. Optionally, directing more than one output stream of results to more than one operator facilitates system topologies more complex than a single path. Executing more than one software object on one hardware processor reduces the cost of creating the system, as well as reducing the system's power consumption.
With reference to the first and second aspects, or the first, second, or third possible implementations of the first and second aspects of the present invention, in a fourth possible implementation of the first and second aspects of the present invention, the plurality of other hardware processors are connected in a directed-acyclic-graph (DAG) topology or a directed graph topology with cycles. Graph topologies, cyclic and acyclic, are common topologies for processing algorithms.
With reference to the first and second aspects, or the first, second, third, or fourth possible implementations of the first and second aspects of the present invention, in a fifth possible implementation of the first and second aspects of the present invention, the plurality of other hardware processors are connected in a pipeline topology. Pipelines facilitate improved performance by enabling simultaneous processing.
With reference to the first and second aspects, or the first, second, third, fourth, or fifth possible implementations of the first and second aspects of the present invention, in a sixth possible implementation of the first and second aspects of the present invention, the plurality of other hardware processors are connected in a topology member of a group comprising a grid topology and a mesh topology. Grid topologies and mesh topologies support complex processing algorithms.
With reference to the first and second aspects, or the first, second, third, fourth, fifth, or sixth possible implementations of the first and second aspects of the present invention, in a seventh possible implementation of the first and second aspects of the present invention, each event of each stream of the plurality of events has a sequence number in a sequence of events, and the filter test comprises comparing a difference between the sequence number and a second sequence number of a last received event to a certain number threshold. Using a sequence number in a sequence of events allows dynamic partitioning of the events in the stream of events by a number of events.
With reference to the first and second aspects, or the first, second, third, fourth, fifth, sixth, or seventh possible implementations of the first and second aspects of the present invention, in an eighth possible implementation of the first and second aspects of the present invention, each event of the stream of events has a time, the time being a time of event occurrence or a time of event reception, and the filter test comprises comparing a difference between the time and a current time to a certain time difference threshold. Using a time difference allows dynamic partitioning of the events in the stream of events by time.
With reference to the first and second aspects, or the first, second, third, fourth, fifth, sixth, seventh, or eighth possible implementations of the first and second aspects of the present invention, in a ninth possible implementation of the first and second aspects of the present invention, the system further comprises at least one sensor. Events in the stream of events include information collected by the at least one sensor. Using a sensor allows using the present invention in systems for analyzing physical properties such as temperature and velocity.
With reference to the first and second aspects, or the first, second, third, fourth, fifth, sixth, seventh, eighth, or ninth possible implementations of the first and second aspects of the present invention, in a tenth possible implementation of the first and second aspects of the present invention, events in the stream of events include information that is a member of a group including: a temperature, a water level, an amount of accesses to a web site, a price, an amount of people, an age, a length, a height, a weight, a circumference, an amount of light, an amount of sound, an amount of money, a geographical location, an amount of purchases, an amount of objects, a timestamp, an internet protocol address, a media access controller address, an identification number, an identification name, a telephone number, telephone call metadata, a merchant name, and a merchant identification number.
Other systems, methods, features, and advantages of the present disclosure will be or become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the present disclosure, and be protected by the accompanying claims.
Unless otherwise defined, all technical and/or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of embodiments of the invention, exemplary methods and/or materials are described below. In case of conflict, the patent specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be necessarily limiting.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
Some embodiments of the invention are herein described, by way of example only, with reference to the accompanying drawings. With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of embodiments of the invention. In this regard, the description taken with the drawings makes apparent to those skilled in the art how embodiments of the invention may be practiced.
In the drawings:
FIGs. 1A, 1B and 1C are schematic illustrations of an exemplary mapping of a stream of events to operators, according to some embodiments of the present invention;
FIG. 2 is a schematic illustration of an exemplary system according to some embodiments of the present invention;
FIG. 3 is a schematic illustration of a second exemplary system according to some embodiments of the present invention;
FIG. 4 is a schematic illustration of a third exemplary system according to some embodiments of the present invention;
FIG. 5 is a flowchart schematically representing an optional flow of operations for processing a stream of events, according to some embodiments of the present invention;
FIG. 6 is a flowchart schematically representing a second optional flow of operations for processing a stream of events as it relates to a continuous stream of events, according to some embodiments of the present invention;
FIG. 7 is a flowchart schematically representing a third optional flow of operations for processing a stream of events, according to some embodiments of the present invention; and
FIG. 8 is a flowchart schematically representing a fourth optional flow of operations for processing a stream of events as it relates to a continuous stream of events, according to some embodiments of the present invention.
DETAILED DESCRIPTION
The present invention, in some embodiments thereof, relates to a system for processing a stream of data and, more specifically, but not exclusively, to distributed processing of data in big data systems.
As used herein, the term "event" means a data instance and the term "stream of events" means a continuous ordered sequence of data instances or events.
A window stream operator is a software object for processing a window of data instances (also referred to as events), selected by applying a filter to some events of the stream of events. As used herein, the term "operator" means a window stream operator.
In a typical system using window stream operators, one operator sends its output to a second operator to be processed by the second operator. In typical solutions for window operator based stream processing, each operator applies at most one function to events it receives, and produces at most one output stream of events. A typical operator has a working set of events. The at most one function may be a computation function, applied to the working set of events and resulting in a result event sent to another operator on an output stream of results of the operator. As the operator receives events in a continuous stream of events, the operator adds the received events to its working set of events. At a trigger, the operator selects a window of events by applying a filter to its working set of events and selecting only events that match the filter. Other events remaining after applying the filter, i.e. events out of the scope of the filter, are discarded. Typically the window of events is a group of events, each having a certain property with a value within certain finite boundaries. The trigger may be reception of an event or a time interval since last selecting a window of events. In some systems, the operator applies its computation function after selecting a window of events.
Henceforth, the term "window" means "window of events".
In such systems, events out of the scope of a filter of one operator are discarded, and cannot be processed by a second operator. As a result, such systems cannot support processing based on dynamic partitioning of the events of the stream of events, where at one time an event complies with the filter of one operator, and at a later time the event complies with a second filter for a second operator. For example, such systems cannot support processing based on relative time partitioning of the events of the stream of events along the timeline of the stream of events. An example of such a partitioning is: "events from the last 1 hour", "events from 2 hours ago until 1 hour ago", "events from 3 hours ago until 2 hours ago", etc. Another example of dynamic partitioning of the events of the stream of events is partitioning by an amount of events, for example "last 50 events", "previous 50 events immediately before the last 50 events", etc. A typical solution does not allow cascading an event from one operator to another. Once an event is discarded by an operator, the event is lost to the system and can no longer be processed. In such systems, applying a dynamic filter requires storing all events having the potential to comply with the dynamic filter at a time after being received. In systems with large quantities of continuous data this may require more storage than is typically available on one computer, and requires additional computation for managing the large quantity of data. As a result, typical stream processing systems do not store events for future reference, prohibiting implementation of dynamic partitioning of events in a stream of events.
To address this problem, the current invention in some embodiments thereof adds an evictor stream of events to an operator. The evictor stream may be received by a second operator, enabling the second operator to apply its filter to events discarded by the operator. This solution allows a plurality of operators to receive events in the stream of events without duplicating the entire stream of events and without requiring large storage. This solution allows partitioning the events in the stream of events between a plurality of operators, each receiving only the events it is likely to process upon reception, reducing cost of storage and complexity of management.
Another limitation of typical operators is that typical operators generate at most one output stream, thus limiting the number of interconnection possibilities. When an output of one operator is connected to an input of a second operator these operators are considered neighbors. In a typical stream processing system comprising operators, an operator has at most one neighbor operator connected to its output. In such systems, in order to send an operator's output stream of events to multiple other operators, the output stream of events must be duplicated. As a result, sophisticated grid-like interconnections (where an operator has two neighbor operators connected to its output) and neural-network-like interconnections (where an operator has more than two neighbor operators connected to its output) are practically impossible to implement due to the prohibitive cost in terms of storage and processing complexity. The present invention in some embodiments thereof solves this problem by adding one or more functions to an operator and adding one or more additional output streams to the operator. Thus, an operator having a certain filter for generating a window of events may apply a plurality of functions to its window to produce a plurality of types of result events. The operator may direct all result events of one type of result events to one output stream and all result events of a second type of result events to a second output stream. This reduces the need to duplicate the generation of the window and the storage of the events in the window. In addition, having multiple output streams allows an operator to duplicate its output stream without the complexity of adding a software object to receive the output stream and duplicate the output stream. Reducing complexity of management and computation such as generation of windows reduces power consumption and facilitates shorter latencies in computation.
Before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not necessarily limited in its application to the details of construction and the arrangement of the components and/or methods set forth in the following description and/or illustrated in the drawings and/or the Examples. The invention is capable of other embodiments or of being practiced or carried out in various ways.
The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention. The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.
The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
In order to understand one of the problems in a typical window operator based stream processing system and how some embodiments of the current invention solve this problem, reference is now made to FIGs. 1A, 1B and 1C, showing schematic illustrations of an exemplary mapping of a stream of events to operators, according to some embodiments of the present invention.
Referring now to FIG. 1A, line 330 shows an optional timeline, where time 310 is earlier than time 311. 340, 341 and 342 are optional operators, each having a different filter for producing a window of events. In this example, time 320 is 0:15 hours after time 311 and 0:35 hours after time 310. In this example, operator 340 has a filter to select only events from the past hour and at time 320, events 301 and 302 comply with operator 340's filter and may be included in operator 340's window. In this example, operator 341 has a filter to select only events from the hour before last. If operator 341 receives events 301 and 302 at the time of their occurrence, operator 341 may discard events 301 and 302.
Referring now also to FIG. 1B, difference 350 indicates that time 321 is 1:15 hours after time 320. Thus, in this example time 310 is 1:50 hours before time 321 and time 311 is 1:30 hours before time 321. In this example, events 303 and 304 occurred at time 312 and 313 respectively, both before time 321. In this example, at time 321 only event 304 complies with operator 340's filter. Operator 340 may discard events 301 and 302, which were previously in its window, and event 303, which was received later. If operator 341 could receive events 301, 302 and 303, at time 321 events 301, 302 and 303 would be in operator 341's window. However, in a typical system, events discarded by one operator are lost to the system and cannot be processed by another operator. Storing events 301, 302 and 303 by operator 341 until such time as they may comply with operator 341's filter requires additional storage and processing resources to identify when the stored events comply with the operator's filter. The present invention, in some embodiments thereof, provides a solution to the problem of processing by one operator events discarded by another operator.
Referring now also to FIG. 1C, in some embodiments of the present invention operator 340 generates an evictor stream 351, sent to an input of operator 341. In such embodiments, events 301, 302 and 303 may be included in operator 341's window, without operator 341 storing events 301, 302 and 303 until they comply with operator 341's filter. Similarly, operator 341 may generate an evictor stream 352 and operator 342 may receive evictor stream 352, to process events discarded by operator 341.
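The cascade of FIGs. 1A to 1C can be traced with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: times are expressed in minutes after time 310, events 301 and 302 are assumed to occur at times 310 and 311, and the occurrence times chosen for events 303 and 304 (40 and 90 minutes) are hypothetical values that merely satisfy the constraints in the text. The class RelativeTimeOperator and the function step are names introduced here for illustration.

```python
from typing import List, Tuple

# (event id, occurrence time in minutes after time 310); times are assumptions
TimedEvent = Tuple[int, int]


class RelativeTimeOperator:
    """Keeps events whose age lies in [min_age, max_age) minutes; evicts the rest."""

    def __init__(self, min_age: int, max_age: int) -> None:
        self.min_age = min_age
        self.max_age = max_age
        self.window: List[TimedEvent] = []

    def advance(self, now: int, incoming: List[TimedEvent]) -> List[TimedEvent]:
        """Re-apply the filter at time `now` and return the evictor stream."""
        candidates = self.window + incoming
        in_scope = [e for e in candidates
                    if self.min_age <= now - e[1] < self.max_age]
        evicted = [e for e in candidates if e not in in_scope]
        self.window = in_scope
        return evicted


def step(operators: List[RelativeTimeOperator], now: int,
         incoming: List[TimedEvent]) -> List[TimedEvent]:
    """Feed each operator's evictor stream to the next operator in the cascade."""
    stream = incoming
    for op in operators:
        stream = op.advance(now, stream)
    return stream  # events evicted by the last operator


op340 = RelativeTimeOperator(0, 60)     # events from the past hour
op341 = RelativeTimeOperator(60, 120)   # events from the hour before last
op342 = RelativeTimeOperator(120, 180)  # events from the hour before that
cascade = [op340, op341, op342]

# Time 320 is 0:35 hours (35 minutes) after time 310.
step(cascade, now=35, incoming=[(301, 0), (302, 20)])
window_340_at_320 = [event_id for event_id, _ in op340.window]

# Time 321 is 1:15 hours after time 320, i.e. 110 minutes after time 310.
step(cascade, now=110, incoming=[(303, 40), (304, 90)])
window_340_at_321 = [event_id for event_id, _ in op340.window]
window_341_at_321 = [event_id for event_id, _ in op341.window]
```

Under these assumptions, operator 340 holds events 301 and 302 at time 320 and only event 304 at time 321, while events 301, 302 and 303 arrive at operator 341 over the evictor stream and fall within its window, without operator 341 having stored them beforehand.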
The present invention, in some embodiments thereof, provides a system having an operator processing events discarded by another operator.
Reference is now made to FIG. 2, showing a schematic illustration of an exemplary system 100 according to some embodiments of the present invention. In such embodiments, an operator is at least one software object for processing a stream of events, executed by at least one hardware processor. In such embodiments, operator 101 receives a stream of events 110. Optionally, the operator receives warehouse data 115. Warehouse data may comprise data read from a database and non-stream data received by the operator. In these embodiments, operator 101 generates a window of events by selecting from the stream of events those events that comply with a filter test. The operator discards the remaining events (that is, events that do not comply with the filter test), and applies at least one computation function to the window of events to produce one or more result events. In these embodiments operator 101 generates at least one output stream 111 and outputs the one or more result events on the at least one output stream. Optionally, the at least one output stream is received by at least one other operator 103. In these embodiments, operator 101 generates an evictor stream of events 113, received by at least one third operator 102. In such embodiments operator 101 outputs events it discards on the evictor stream, allowing at least one third operator 102 to process some of the events discarded by operator 101. Operator 102 receiving the evictor stream may be connected to a fourth operator 105. Operator 102 may generate at least one other output stream of results 114 and send it to operator 105. Optionally, one hardware processor executes two or more of operators 101, 102, 103 and 105.
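The behavior of operators 101 and 102 described above may be sketched in Python as follows. This is a minimal sketch under assumptions: the Event fields, the class name EvictingOperator, the threshold filters and the sum and count computation functions are all hypothetical choices made for illustration, and variable names merely echo the reference numerals of FIG. 2.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Event:
    seq: int      # sequence number in the stream (hypothetical field)
    value: float  # payload, e.g. a sensor reading (hypothetical field)


@dataclass
class EvictingOperator:
    """Window operator that forwards rejected events instead of dropping them."""
    filter_test: Callable[[Event], bool]
    compute: Callable[[List[Event]], float]
    working_set: List[Event] = field(default_factory=list)

    def process(self, incoming: List[Event]) -> Tuple[float, List[Event]]:
        """Return (result for the output stream, events for the evictor stream)."""
        self.working_set.extend(incoming)
        window = [e for e in self.working_set if self.filter_test(e)]
        evicted = [e for e in self.working_set if not self.filter_test(e)]
        self.working_set = window  # only the window is retained
        return self.compute(window), evicted


# Operator 101: keeps readings of at least 20.0 and sums them (assumed filter and function).
op101 = EvictingOperator(lambda e: e.value >= 20.0,
                         lambda w: sum(e.value for e in w))
# Operator 102: applies its own filter and function to the evictor stream of operator 101.
op102 = EvictingOperator(lambda e: e.value >= 10.0,
                         lambda w: float(len(w)))

stream_110 = [Event(1, 25.0), Event(2, 5.0), Event(3, 12.0), Event(4, 30.0)]
result_111, evictor_113 = op101.process(stream_110)
result_114, leftover = op102.process(evictor_113)
```

In this sketch operator 102 never sees the events retained by operator 101, so the stream is partitioned between the two operators rather than duplicated.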
In some embodiments, events of the stream of events have a plurality of values of a plurality of event properties. Examples of event properties are a time of event generation (occurrence), a time of event reception, a sequence number in a sequence of events, and a source identification name or number. Optionally, the window of events is produced by applying the filter test to each event's value of at least one certain event property. In some embodiments the filter test compares the difference between two values to a certain threshold value. For example, the certain event property may be a time of event occurrence. A possible optional filter test is time values that are more than one hour before the current time but less than two hours before the current time (this produces a window consisting of events occurring in the hour before the last hour). Another example of a certain event property is a sequence number in a sequence of events. Another possible optional filter test is sequence values that are less than the highest sequence value by no more than 100 (this produces a window consisting of the last 100 events in the sequence).
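The two example filter tests can be written as simple predicates. The following Python sketch uses hypothetical function names and assumes consecutive integer sequence numbers. Read literally, a difference of "no more than 100" would admit 101 events, so the sketch uses a strict bound in order to obtain exactly the last 100 events named in the parenthetical.

```python
from datetime import datetime, timedelta


def in_hour_before_last(event_time: datetime, now: datetime) -> bool:
    """True when the event occurred more than one but less than two hours ago."""
    age = now - event_time
    return timedelta(hours=1) < age < timedelta(hours=2)


def in_last_n(seq: int, highest_seq: int, n: int = 100) -> bool:
    """True when the event is among the n most recent sequence numbers."""
    return 0 <= highest_seq - seq < n


now = datetime(2017, 5, 24, 12, 0)  # arbitrary reference time for the example
ninety_minutes_old = in_hour_before_last(now - timedelta(minutes=90), now)
thirty_minutes_old = in_hour_before_last(now - timedelta(minutes=30), now)
window_size = sum(in_last_n(seq, highest_seq=250) for seq in range(1, 251))
```

Both predicates depend on a moving reference (the current time or the highest sequence number), which is why an event that fails a test now may pass another operator's test later.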
The present invention, in some embodiments thereof, offers also a solution for the problem of generating more than one output stream of events by one operator.
Reference is now also made to FIG. 3, showing a schematic illustration of a second exemplary system 200 according to some embodiments of the present invention. In such embodiments, operator 101 produces at least one more output stream 112, received by at least one more operator 104. Optionally, operator 101 generates evictor stream 113. In embodiments where the system has a plurality of operators, the operators may be connected in a directed graph topology, where an output of one operator is connected to an input of another operator. In graph topologies an operator has at most one neighbor in one dimension (input or output). In some embodiments the graph may be cyclic. In other embodiments the graph may be acyclic. In some embodiments, the plurality of operators may be connected in a grid topology, where an operator has two output neighbors in one or more dimensions. In other embodiments, the plurality of operators may be connected in a mesh topology, where an operator has more than two neighbors in one or more dimensions. In some embodiments having a plurality of operators, the plurality of operators are connected in a pipeline, where one operator processes new events while at the same time another operator processes the one operator's output.
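A directed graph of operators can be represented as an adjacency mapping, and a standard topological sort can tell whether the graph is acyclic. The sketch below is only illustrative: the edge list is an assumed reading of FIG. 3 (operator 101 feeding operators 102, 103 and 104, and operator 102 feeding operator 105), and the function name `execution_order` is hypothetical. It relies on the `graphlib` module of the Python standard library (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Optional

# Each operator maps to the operators that receive its streams (assumed edges).
edges: Dict[int, List[int]] = {
    101: [102, 103, 104],
    102: [105],
    103: [],
    104: [],
    105: [],
}


def execution_order(graph: Dict[int, List[int]]) -> Optional[List[int]]:
    """Return an upstream-first order, or None if the graph contains a cycle."""
    # TopologicalSorter expects node -> predecessors, so the edges are inverted.
    predecessors = {node: set() for node in graph}
    for source, targets in graph.items():
        for target in targets:
            predecessors.setdefault(target, set()).add(source)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


order = execution_order(edges)

# Feeding operator 105 back into operator 101 makes the graph cyclic.
cyclic_order = execution_order({**edges, 105: [101]})
```

For the acyclic graph the returned order places every operator after the operators whose streams it receives; for the cyclic variant no such order exists and the function returns `None`.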
The present invention, in some embodiments thereof, provides a solution where one operator generates more than one output stream of results and no evictor stream.
Reference is now also made to FIG. 4, showing a schematic illustration of a third exemplary system 300 according to some embodiments of the present invention. In such embodiments, operator 101 applies at least one computation function to its window to produce one or more of one type of result events and applies at least one additional computation function to its window to produce one or more of at least one additional type of result events. In such embodiments, operator 101 outputs the one or more results of the one type of results to output stream 111 and outputs the one or more results of the at least one additional type of results to output stream 112. Output stream 111 is received by operator 103, and output stream 112 is received by operator 104. In such embodiments, the system may have a plurality of operators. In such embodiments the operators may be connected in a directed graph topology, where an output of one operator is connected to an input of another operator. In graph topologies an operator has at most one neighbor in one dimension (input or output). In some embodiments the graph may be cyclic. In other embodiments the graph may be acyclic. In some embodiments, the plurality of operators may be connected in a grid topology, where an operator has two output neighbors in one or more dimensions. In other embodiments, the plurality of operators may be connected in a mesh topology, where an operator has more than two neighbors in one or more dimensions.
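Applying several computation functions to one window, with each type of result going to its own output stream, may be sketched as follows. The function name `run_operator`, the stream labels and the price data are assumptions for illustration only.

```python
from typing import Callable, Dict, List, Sequence


def run_operator(
    events: Sequence[float],
    filter_test: Callable[[float], bool],
    computations: Dict[str, Callable[[List[float]], List[float]]],
) -> Dict[str, List[float]]:
    """Build one window, then produce one output stream per computation function."""
    window = [e for e in events if filter_test(e)]
    return {stream: compute(window) for stream, compute in computations.items()}


prices = [12.0, 7.5, 30.0, 18.0, 3.0]

streams = run_operator(
    prices,
    filter_test=lambda p: p >= 5.0,
    computations={
        "stream_111": lambda w: [max(w)],           # one type of result
        "stream_112": lambda w: [sum(w) / len(w)],  # an additional type of result
    },
)
```

The window is computed once and shared by both computation functions, so the two output streams are always derived from the same set of events.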
Optionally, events in the stream of events include information that is a member of a group including: a temperature, a water level, an amount of accesses to a web site, a price, an amount of people, an age, a length, a height, a weight, a circumference, an amount of light, an amount of sound, an amount of money, a geographical location, an amount of purchases, an amount of objects, a timestamp, an internet protocol address, a media access controller address, an identification number, an identification name, a telephone number, telephone call metadata, a merchant name, and a merchant identification number. Examples of telephone call metadata include a time of call start, a call duration, a sender identification name or number, a receiver identification name or number and an antenna identification name or number.
In some embodiments of the present invention, the system further comprises at least one sensor, converting a physical parameter into an electrical signal which can be measured and the measured value can be collected. Examples of physical parameters are temperature, velocity, and blood pressure. In such embodiments, events in the stream of events include information collected by the at least one sensor.
To provide a solution for processing discarded events, in some embodiments of the present invention the systems implement the following method.
Reference is now made to FIG. 5, showing a flowchart schematically representing an optional flow of operations 500 for processing a stream of events, according to some embodiments of the present invention. In such embodiments, an operator receives 401 some events of a stream of events, each event having a plurality of values of a plurality of event properties. In 402 the operator produces a window of events by selecting events from the some events of the stream of events. Events are selected such that each selected event has a value of at least one certain event property compliant with a filter test. In 403, in such embodiments the operator applies a computation function to the window of events, to produce one or more result events. In some embodiments the operator generates 404 one or more output streams of results. In such embodiments, each of the one or more output streams of results comprises at least one of the one or more result events. In some embodiments the operator generates 405 an evictor stream of events, comprising all events of the some of the events remaining after producing the window of events, that is having a value of the at least one certain property not compliant with the filter test. In some embodiments, the evictor stream of events is not a duplication of the stream of events.
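The steps 401 to 405 may be sketched as a single function. This is a hedged illustration rather than the claimed method; the function name `process_events`, the `water_level` property and the threshold are assumptions. The comments map each statement to the step it stands for.

```python
from typing import Any, Callable, Dict, List, Tuple

Event = Dict[str, Any]


def process_events(
    events: List[Event],
    prop: str,
    filter_test: Callable[[Any], bool],
    compute: Callable[[List[Event]], List[Event]],
) -> Tuple[List[Event], List[Event]]:
    window: List[Event] = []
    evicted: List[Event] = []
    for event in events:                # 401: receive some events of the stream
        if filter_test(event[prop]):    # 402: select events compliant with the test
            window.append(event)
        else:
            evicted.append(event)       # 405: remaining events go to the evictor stream
    results = compute(window)           # 403: apply the computation function
    return results, evicted             # 404: results stream, 405: evictor stream


def max_level(window: List[Event]) -> List[Event]:
    if not window:
        return []
    return [{"max_level": max(e["water_level"] for e in window)}]


readings = [
    {"id": 1, "water_level": 0.4},
    {"id": 2, "water_level": 1.7},
    {"id": 3, "water_level": 2.1},
    {"id": 4, "water_level": 0.9},
]

results, evicted = process_events(
    readings, "water_level", lambda level: level > 1.0, max_level
)
```

Every received event ends up either in the window or on the evictor stream, never in both, which is why the evictor stream is not a duplication of the input stream.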
Processing a stream of events is a continuous process. After processing a first window of events, in some embodiments of the present invention the operator continues to receive events of the stream of events and continues processing events of the stream of events.
Reference is now also made to FIG. 6, showing a flowchart schematically representing a second optional flow of operations 600 for processing a stream of events as it relates to a continuous stream of events, according to some embodiments of the present invention. In some embodiments, the operator receives a stream of events comprising a continuous sequence of events. In such embodiments, after producing a window of events, the operator receives 701 some other events of the stream of events. In such embodiments, in 702 the operator produces a new window of events by selecting events from the previously produced window of events and the some other events of the stream of events. Events are selected such that each selected event has a value of the at least one certain event property compliant with the filter test. In 703, in such embodiments the operator applies the computation function to the window of events, to produce one or more other result events. In such embodiments, the operator outputs 704 on each of the one or more output streams of results at least one of the one or more other result events. In some embodiments the operator outputs 705 on the evictor stream of events all events of the previously produced window of events and the some other events remaining after producing the new window of events, that is having a value of the at least one certain property not compliant with the filter test. Optionally, flow of operations 600 is repeated as additional events are received by the operator.
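The continuous case may be sketched with an operator that keeps its previous window between calls. The class name `SlidingOperator`, the `(sequence number, value)` event tuples and the sequence-based filter test are assumptions chosen for illustration.

```python
from typing import Callable, Iterable, List, Tuple

Event = Tuple[int, float]  # (sequence number, value)


class SlidingOperator:
    """Hypothetical operator keeping the last `size` events by sequence number."""

    def __init__(self, size: int, compute: Callable[[List[Event]], List[float]]):
        self.size = size
        self.compute = compute
        self.window: List[Event] = []

    def step(self, new_events: Iterable[Event]) -> Tuple[List[float], List[Event]]:
        # 701/702: candidates are the previous window plus the newly received events.
        candidates = self.window + list(new_events)
        if not candidates:
            return [], []
        highest = max(seq for seq, _ in candidates)
        keep = [e for e in candidates if highest - e[0] < self.size]
        # 705: events no longer compliant are output on the evictor stream.
        evicted = [e for e in candidates if highest - e[0] >= self.size]
        self.window = keep
        # 703/704: compute on the new window and output the result events.
        return self.compute(keep), evicted


op = SlidingOperator(size=3, compute=lambda w: [sum(v for _, v in w)])

first = op.step([(1, 10.0), (2, 20.0)])
second = op.step([(3, 30.0), (4, 40.0)])
third = op.step([(5, 50.0)])
```

Events that were compliant in an earlier window, such as the event with sequence number 1, are evicted in a later step once newer events push them past the threshold.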
To provide a solution for producing more than one output stream of results, in some embodiments of the present invention the systems implement the following method.
Reference is now also made to FIG. 7, showing a flowchart schematically representing a third optional flow of operations 700 for processing a stream of events, according to some embodiments of the present invention. In such embodiments, an operator receives 501 some events of a stream of events, each event having a plurality of values of a plurality of event properties. In 502 the operator produces a window of events by selecting events from the some events of the stream of events. Events are selected such that each selected event has a value of at least one certain event property compliant with a filter test. In 503, in such embodiments the operator applies a computation function to the window of events, to produce one or more result events. In some embodiments the operator generates 504 at least two output streams of results. In such embodiments, each of the at least two output streams of results comprises at least one of the one or more result events.
In some embodiments of the present invention, an operator applies more than one computation function to a window of events by the system implementing the following method.
Reference is now made to FIG. 8, showing a flowchart schematically representing a fourth optional flow of operations 800 for processing a stream of events, according to some embodiments of the present invention. In such embodiments, an operator receives 501 some events of a stream of events, each event having a plurality of values of a plurality of event properties. In 502 the operator produces a window of events by selecting events from the some events of the stream of events. Events are selected such that each selected event has a value of at least one certain event property compliant with a filter test. In 503, in such embodiments the operator applies a computation function to the window of events, to produce one or more result events. In addition, in some embodiments the operator applies 505 at least one additional computation function to the window of events to produce the one or more result events. In some embodiments the operator generates 504 at least two output streams of results. In such embodiments, each of the at least two output streams of results comprises at least one of the one or more result events.
The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
It is expected that during the life of a patent maturing from this application many relevant operators and events will be developed and the scope of the terms "operator" and "event" are intended to include all such new technologies a priori.
As used herein the term "about" refers to ± 10 %.
The terms "comprises", "comprising", "includes", "including", "having" and their conjugates mean "including but not limited to". This term encompasses the terms "consisting of" and "consisting essentially of".
The phrase "consisting essentially of" means that the composition or method may include additional ingredients and/or steps, but only if the additional ingredients and/or steps do not materially alter the basic and novel characteristics of the claimed composition or method.
As used herein, the singular form "a", "an" and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a compound" or "at least one compound" may include a plurality of compounds, including mixtures thereof.
The word "exemplary" is used herein to mean "serving as an example, instance or illustration". Any embodiment described as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments and/or to exclude the incorporation of features from other embodiments.
The word "optionally" is used herein to mean "is provided in some embodiments and not provided in other embodiments". Any particular embodiment of the invention may include a plurality of "optional" features unless such features conflict.
Throughout this application, various embodiments of this invention may be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
Whenever a numerical range is indicated herein, it is meant to include any cited numeral (fractional or integral) within the indicated range. The phrases "ranging/ranges between" a first indicated number and a second indicated number and "ranging/ranges from" a first indicated number "to" a second indicated number are used herein interchangeably and are meant to include the first and second indicated numbers and all the fractional and integral numerals therebetween.
It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination or as suitable in any other described embodiment of the invention. Certain features described in the context of various embodiments are not to be considered essential features of those embodiments, unless the embodiment is inoperative without those elements.
All publications, patents and patent applications mentioned in this specification are herein incorporated in their entirety by reference into the specification, to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated herein by reference. In addition, citation or identification of any reference in this application shall not be construed as an admission that such reference is available as prior art to the present invention. To the extent that section headings are used, they should not be construed as necessarily limiting.

Claims

1. A system for processing a stream of events, said system comprising a plurality of operators, configured to receive a plurality of streams of events, said system being configured to:
receive by a first operator of said plurality of operators a first stream of events;
generate by said first operator a first window of events by selecting from said received first stream of events a first set of events that satisfy a filter test;
generate by said first operator a first evictor stream of events by selecting from said received first stream of events a second set of events that do not satisfy said filter test;
receive at a second operator of said plurality of operators said first evictor stream of events from said first operator;
apply by said first operator a first computation function on said first set of events to obtain a first output stream of events; and
apply by said second operator on said first evictor stream at least one of a second filter and a second computation function, to obtain a second output stream of events.
2. The system of claim 1, wherein said system is further configured to:
receive by said first operator a second stream of events;
generate by said first operator a second window of events by selecting from said first window of events and said second stream of events a third set of events that satisfy said filter test;
generate by said first operator a second evictor stream of events by selecting from said first window of events and said second stream of events a fourth set of events that do not satisfy the filter test;
receive at said second operator said second evictor stream of events from said first operator;
apply by said first operator said first computation function on said third set of events to obtain a third output stream of events; and
apply by said second operator on said second evictor stream at least one of said second filter and said second computation function, to obtain a fourth output stream of events.
3. The system of claim 1 or claim 2, wherein at least one of said plurality of operators is a software object executed by at least one hardware processor.
4. The system of one of the preceding claims, wherein said first operator is configured to produce a plurality of output streams;
wherein one of said plurality of output streams is received by at least one third operator of said plurality of operators; and
wherein a second of said plurality of output streams is received by at least one fourth operator of said plurality of operators.
5. The system of claim 4, wherein said first operator and said third operator are software objects executed by the same hardware processor.
6. The system of one of claims 1 to 5, wherein said plurality of operators are connected in a directed-acyclic-graph (DAG) topology or a directed graph topology with cycles.
7. The system of one of claims 1 to 5, wherein said plurality of operators are connected in a pipeline topology.
8. The system of one of claims 1 to 5, wherein said plurality of operators are connected in a topology member of a group comprising a grid topology and a mesh topology.
9. The system of one of the preceding claims, wherein each event of each stream of said plurality of streams of events has a sequence number in a sequence of events; and wherein said filter test comprises comparing a difference between said sequence number and a second sequence number of a last received event to a certain number threshold.
10. The system of one of the preceding claims, wherein each event of said stream of events has a time, said time being a time of event occurrence or a time of event reception; and wherein said filter test comprises comparing a difference between said time and a current time to a certain time difference threshold.
11. The system of one of the preceding claims, further comprising at least one sensor;
wherein events in said stream of events include information collected by said at least one sensor.
12. The system of claim 1 wherein events in said stream of events include information member of a group including: a temperature, a water level, an amount of accesses to a web site, a price, an amount of people, an age, a length, a height, a weight, a circumference, an amount of light, an amount of sound, an amount of money, a geographical location, an amount of purchases, an amount of objects, a timestamp, an internet protocol address, a media access controller address, an identification number, an identification name, a telephone number, telephone call metadata, a merchant name, and a merchant identification number.
13. A method for processing a stream of events by a plurality of operators, comprising:
receiving by a first operator of said plurality of operators a first stream of events;
generating by said first operator a first window of events by selecting from said received first stream of events a first set of events that satisfy a filter test;
generating by said first operator a first evictor stream of events by selecting from said received stream of events a second set of events that do not satisfy said filter test;
receiving at a second operator of said plurality of operators said first evictor stream of events from said first operator;
applying by said first operator a computation function on said first set of events to obtain a first output stream of events; and
applying by said second operator on said first evictor stream at least one of a second filter and a second computation function, to obtain a second output stream of events.
14. The method of claim 13, further comprising:
receiving by said first operator a second stream of events;
generating by said first operator a second window of events by selecting from said first window of events and said second stream of events a third set of events that satisfy said filter test;
generating by said first operator a second evictor stream of events by selecting from said first window of events and said second stream of events a fourth set of events that do not satisfy the filter test;
receiving at said second operator said second evictor stream of events from said first operator;
applying by said first operator said first computation function on said third set of events to obtain a third output stream of events; and
applying by said second operator on said second evictor stream at least one of said second filter and said second computation function, to obtain a fourth output stream of events.
PCT/EP2017/062548 2017-05-24 2017-05-24 System and method for stream processing WO2018215062A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201780017236.8A CN109643307B (en) 2017-05-24 2017-05-24 Stream processing system and method
PCT/EP2017/062548 WO2018215062A1 (en) 2017-05-24 2017-05-24 System and method for stream processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2017/062548 WO2018215062A1 (en) 2017-05-24 2017-05-24 System and method for stream processing

Publications (1)

Publication Number Publication Date
WO2018215062A1 true WO2018215062A1 (en) 2018-11-29

Family

ID=58772889

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2017/062548 WO2018215062A1 (en) 2017-05-24 2017-05-24 System and method for stream processing

Country Status (2)

Country Link
CN (1) CN109643307B (en)
WO (1) WO2018215062A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230055003A1 (en) * 2021-08-20 2023-02-23 Kenny To Research LLC Method for Organizing Data by Events, Software and System for Same
WO2024038245A1 (en) * 2022-08-15 2024-02-22 Arm Limited Behavioral sensor for creating consumable events

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8069190B2 (en) * 2007-12-27 2011-11-29 Cloudscale, Inc. System and methodology for parallel stream processing
US20150293974A1 (en) * 2014-04-10 2015-10-15 David Loo Dynamic Partitioning of Streaming Data
WO2016032986A1 (en) * 2014-08-29 2016-03-03 Microsoft Technology Licensing, Llc Event stream transformations

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9275353B2 (en) * 2007-11-09 2016-03-01 Oracle America, Inc. Event-processing operators
CN102012918B (en) * 2010-11-26 2012-11-21 中金金融认证中心有限公司 System and method for excavating and executing rule
US9390135B2 (en) * 2013-02-19 2016-07-12 Oracle International Corporation Executing continuous event processing (CEP) queries in parallel

Also Published As

Publication number Publication date
CN109643307B (en) 2021-08-20
CN109643307A (en) 2019-04-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17725947

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17725947

Country of ref document: EP

Kind code of ref document: A1