WO2013097073A1 - Stream processing method and apparatus - Google Patents

Stream processing method and apparatus

Info

Publication number
WO2013097073A1
WO2013097073A1 PCT/CN2011/084643 CN2011084643W WO2013097073A1 WO 2013097073 A1 WO2013097073 A1 WO 2013097073A1 CN 2011084643 W CN2011084643 W CN 2011084643W WO 2013097073 A1 WO2013097073 A1 WO 2013097073A1
Authority
WO
WIPO (PCT)
Prior art keywords
stream processing
event
current stream
processing window
expression
Prior art date
Application number
PCT/CN2011/084643
Other languages
French (fr)
Chinese (zh)
Inventor
刘晓
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2011/084643 (WO2013097073A1)
Priority to CN201180003717.6A (CN103282880B)
Publication of WO2013097073A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574 Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching

Definitions

  • The present invention relates to the field of complex event processing, and in particular to a stream processing method and apparatus.
  • CEP: Complex Event Processing
  • CEP technology is an emerging data processing technology based on event streams. It treats system data as event streams of different types. By analyzing the relationships between events, it detects and establishes associations between different events and, using techniques such as filtering, association, and aggregation, ultimately generates or triggers high-level events or business processes from simple events. Unlike traditional database technology, CEP does not process large volumes of statically stored data; it processes dynamically generated, time-ordered data streams in real time in order to discover relationships among large numbers of different events, such as patterns, exceptions, omissions, and hierarchies.
  • The query statement is applied to (executed on) the events continuously, that is, it performs operations such as filtering and association. Because events pass through the CEP system continuously, CEP processing usually requires a window to be defined, and a query is generally computed over that specific window. For example, for a stock event stream: compute the average of all stocks over the last minute, or the stock average over the last 100 trades.
  • CQL: Continuous Query Language
  • SQL: Structured Query Language
  • The FROM clause is followed by the stream name (assume the stream is named S), and the definition after the "." that follows the stream name expresses the window. For example, Select * From S.win:time(60 sec) opens a 60-second stream processing window on stream S.
  • time sliding window, written as S.win:time(60 sec)
  • count (length) sliding window, written as S.win:length(60)
  • grouping window, written as S.std:groupwin(*)
  • The existing keyword having in stream processing can output only the records that arrive after the condition is satisfied. Earlier data is treated as not satisfying the condition and is handled together with expired data, so it cannot be output together with the records that arrive after the condition is satisfied. For example, if the condition is having > 5 and 10 records actually satisfy the condition, only records 5 to 10 are actually output.
  • The technical problem to be solved by the embodiments of the present invention is to provide a stream processing method and apparatus.
  • All events that satisfy the condition can be output without multi-rule nesting.
  • An embodiment of the present invention provides a stream processing method, where the method includes:
  • The cached events belonging to the current stream processing window are output according to the first keyword, where the first keyword is used to indicate outputting the cached events that belong to the current stream processing window and satisfy the first expression.
  • An embodiment of the present invention further provides a stream processing apparatus, where the apparatus includes: a receiving unit, configured to receive an event belonging to a current stream processing window;
  • a cache unit, configured to cache the event belonging to the current stream processing window;
  • a first expression determining unit, configured to determine whether the cached events belonging to the current stream processing window satisfy a first expression related to a first keyword;
  • an output unit, configured to, when the first expression determining unit determines that the first expression is satisfied, output the cached events belonging to the current stream processing window according to the first keyword, where the first keyword is used to indicate outputting the cached events that belong to the current stream processing window and satisfy the first expression; and a loop unit, configured to, when the first expression determining unit determines that the first expression is not satisfied, not delete the cached events belonging to the current stream processing window, and trigger the receiving unit to receive the next event belonging to the current stream processing window.
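  • When the first expression related to the first keyword is processed, the cached events belonging to the current stream processing window are not deleted even if the first expression is not satisfied. Events of the current stream processing window are not discarded merely because they do not yet satisfy the condition, so that at the final output all events in the stream processing window that led to the condition being satisfied can be output, rather than only the events that arrive after the condition is satisfied.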
  • FIG. 1 is a schematic flowchart of a stream processing method in an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of a stream processing apparatus in an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a cache unit in an embodiment of the present invention.
  • FIG. 4 is another schematic structural diagram of a stream processing apparatus in an embodiment of the present invention.
  • FIG. 5 is another schematic flowchart of a stream processing method in an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of processing multiple events by using the stream processing method in FIG. 5.
  • FIG. 7 is a specific schematic diagram of how the events contained in a sliding window change as the window slides multiple times in an embodiment of the present invention.
  • FIG. 8 is a schematic flowchart of processing multiple events by using the stream processing method in an embodiment of the present invention.
  • FIG. 9 is another schematic diagram of how the events contained in a sliding window change as the window slides multiple times in an embodiment of the present invention.
  • The having keyword can retain only the data that arrives after the condition is met within the period (that is, the stream processing window).
  • By adding a first keyword (for example, the had keyword), all data in the period that led to the condition being reached, both before and after that point, can be retained.
  • The had keyword can be followed by a complex conditional expression, which may be a single-stream conditional expression or a multi-stream conditional expression.
  • FIG. 1 is a schematic flowchart of an event stream-based data processing method in an embodiment of the present invention. The process includes the following steps:
  • 101. Receive an event belonging to the current stream processing window.
  • In stream processing, events are input in the form of an event stream.
  • The event may include a call record, a short message sending record, or a website click record.
  • The event belonging to the current stream processing window may be cached in the memory of the stream processing engine.
  • When the event is cached, caching may further be performed according to a second expression: determine whether the event belonging to the current stream processing window satisfies a second expression related to a second keyword; if the second expression is not satisfied, delete the cached events belonging to the current stream processing window and cache the received event belonging to the current stream processing window; otherwise, cache the received event belonging to the current stream processing window.
  • The expression may also be a third expression: determine according to the third expression whether to cache the event of the current stream processing window, and if so, cache the event belonging to the current stream processing window.
  • The first keyword may be the had keyword.
  • The stream processing window may be a time window that changes over time, such as a sliding time window or a jumping time window; in that case, when the current stream processing window moves, events in the cache that do not belong to the moved stream processing window are deleted.
  • An intermediate calculation result of processing the events of the current stream processing window is cached; when the cached events belonging to the current stream processing window are output according to the first keyword, the final calculation result obtained from the intermediate calculation result is output.
  • The above method can be used in a complex event processing (CEP) system that uses the CQL language, the EPL language, or an SQL-like CEP rule description language.
  • An embodiment of the present invention further provides a stream processing apparatus.
  • The apparatus includes: a receiving unit 10, configured to receive an event belonging to a current stream processing window; a cache unit 20, configured to cache the event belonging to the current stream processing window; a first expression determining unit 30, configured to determine whether the cached events belonging to the current stream processing window satisfy a first expression related to a first keyword; and an output unit 40, configured to, when the first expression determining unit determines that the first expression is satisfied, output the cached events belonging to the current stream processing window according to the first keyword, where the first keyword is used to indicate outputting the cached events that belong to the current stream processing window and satisfy the first expression.
  • The loop unit 50 is configured to, when the first expression determining unit determines that the first expression is not satisfied, not delete the cached events belonging to the current stream processing window, and trigger the receiving unit to receive the next event belonging to the current stream processing window.
  • the cache unit 20 may be specifically configured to cache the events belonging to the current stream processing window into the memory of the stream processing engine.
  • The cache unit 20 may include: a second expression determining subunit 200, configured to determine whether the event belonging to the current stream processing window satisfies a second expression related to a second keyword; and a subunit 202, configured to, when the second expression determining subunit determines that the second expression is not satisfied, delete the cached events belonging to the current stream processing window and cache the received event belonging to the current stream processing window, and otherwise cache the received event belonging to the current stream processing window.
  • The cache unit includes: a third expression determining subunit (not shown), configured to determine, according to the third expression, whether to cache the event of the current stream processing window, and, when the result is yes, cache the event belonging to the current stream processing window. Further, as shown in FIG.
  • The foregoing apparatus may further include: a processing unit 60, configured to, when it is determined whether the cached events belonging to the current stream processing window satisfy the first expression related to the first keyword, cache an intermediate calculation result of processing the events of the current stream processing window. The output unit 40 is further configured to, when the cached events belonging to the current stream processing window are output according to the first keyword, output a final calculation result obtained from the intermediate calculation result.
  • The event described above may include a call record, a short message sending record, or a website click record; the first keyword is the had keyword; the apparatus is used in a complex event processing (CEP) system; and the CEP system uses the CQL language, the EPL language, or an SQL-like CEP rule description language.
  • All call records satisfying the condition may be output according to the first expression related to the first keyword. For example, once malicious calls made by a user are found, the specific bill can be looked up in the database from these call records. The bill finally obtained is not left incomplete because the call records preceding the point at which the condition was satisfied were not cached.
  • FIG. 5 is another specific flowchart of the stream processing method in an embodiment of the present invention.
  • The event is processed according to the first expression and the second expression.
  • 204. Cache the received event belonging to the current stream processing window in the memory of the stream processing engine. Then go to step 205. 205. Determine whether the cached events belonging to the current stream processing window satisfy the first expression related to the first keyword. If the first expression is not satisfied, perform step 206; if the first expression is satisfied, perform step 207.
  • The cached events belonging to the current stream processing window are not deleted, and the process returns to step 201 to receive the next event belonging to the current stream processing window.
  • The cached events belonging to the current stream processing window are output according to the first keyword.
  • After this step is performed, the process may end, or it may return to step 201 to continue receiving the next event.
  • In a specific embodiment, the stream processing window may be a one-minute sliding window; the first expression is that the number of consecutive calls by the user within the one-minute sliding window is greater than or equal to five; the second expression is that the interval between adjacent calls is less than 5 seconds (Δt < 5 s); and the second keyword indicates that only call records satisfying the second expression are recorded.
  • The rule described in EPL is as follows:
  • time(timestamp - previous(timestamp)) < 5 s means that the interval between two adjacent events is less than 5 seconds.
  • The above sliding time window may also be a jumping time window, and the processing flow is similar.
  • The stream is processed according to the first expression and the third expression.
  • The first keyword is the had keyword.
  • The expression for the first keyword is count ( ) > 5.
  • the keyword had can also be used in combination with various other window definitions, query conditions, schema definitions, etc., such as Count ( a ), count( [all
  • a single rule existing keyword eg, having
  • a single rule keyword expansion eg, increase
  • Records satisfying the condition of the expression after had are cached. When the output condition defined by the rule is reached, or the expression condition after had is not satisfied (for a simple event count, the expression condition does not conform to the had condition; for grouped events, the records are those of events in different groups: for example, if group A satisfies the condition 5 times, the records of group A are output, and if group B does not satisfy it 5 times, the records associated with group B are not output), all records that satisfy the condition of the expression after had are output, the cache is cleared, and counting continues.
  • The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), or a random access memory (RAM).

Abstract

Disclosed are a stream processing method and apparatus. The method comprises receiving an event belonging to a current stream processing window; caching the event belonging to the current stream processing window; judging whether the cached event belonging to the current stream processing window meets a first expression associated with a first key word; in the case of a judgment result of not meeting the first expression, not deleting the cached event belonging to the current stream processing window, and receiving a next event belonging to the current stream processing window; and in the case of a judgment result of meeting the first expression, outputting the cached event belonging to the current stream processing window according to the first key word, wherein the first key word is used for indicating outputting of the cached event belonging to the current stream processing window and meeting the first expression. By adopting the present invention, the output of all the events meeting the conditions can be realized without nesting of multiple rules.

Description

Technical Field
The present invention relates to the field of complex event processing, and in particular to a stream processing method and apparatus.
Background
The amount of data that information processing systems need to handle keeps growing, and real-time requirements keep rising. Complex Event Processing (CEP) is an emerging data processing technology based on event streams. It treats system data as event streams of different types. By analyzing the relationships between events, it detects and establishes associations between different events and, using techniques such as filtering, association, and aggregation, ultimately generates or triggers high-level events or business processes from simple events. Unlike traditional database technology, CEP does not process large volumes of statically stored data; it processes dynamically generated, time-ordered data streams in real time in order to discover relationships among large numbers of different events, such as patterns, exceptions, omissions, and hierarchies.
In a CEP system, when an event stream arrives, the query statement is applied to (executed on) the events continuously, that is, it performs operations such as filtering and association. Because events pass through the CEP system continuously, CEP processing usually requires a window to be defined, and a query is generally computed over that specific window. For example, for a stock event stream: compute the average of all stocks over the last minute, or the stock average over the last 100 trades.
Query statements in a CEP system can be described in several ways; Continuous Query Language (CQL) is one of the better description languages. CQL is an SQL-based description language that supports continuous queries over stream data. EPL (Event Process Language) is a stream processing description language similar to Structured Query Language (SQL); it extends the SQL description language with stream-specific description capabilities such as windows and patterns.
In EPL, the FROM clause is followed by the stream name (assume the stream is named S), and the definition after the "." that follows the stream name expresses the window. For example, Select * From S.win:time(60 sec) opens a 60-second stream processing window on stream S. EPL mainly provides the following kinds of windows: the time sliding window (written as S.win:time(60 sec)), the count (length) sliding window (written as S.win:length(60)), the grouping window (written as S.std:groupwin(*)), and so on.
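The difference between a time sliding window and a count sliding window can be illustrated outside EPL. The following is a minimal Python sketch, not the implementation of any CEP engine; the class names `TimeWindow` and `LengthWindow`, the `(timestamp, value)` event shape, and the exact eviction boundary are assumptions made for illustration.

```python
from collections import deque


class TimeWindow:
    """Time sliding window: keeps events newer than `span` seconds before the latest event."""

    def __init__(self, span):
        self.span = span
        self.events = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        self.events.append((timestamp, value))
        # Evict events that have slid out of the window (boundary choice is an assumption).
        while self.events and self.events[0][0] <= timestamp - self.span:
            self.events.popleft()

    def average(self):
        return sum(value for _, value in self.events) / len(self.events)


class LengthWindow:
    """Count sliding window: keeps only the most recent `size` events."""

    def __init__(self, size):
        self.events = deque(maxlen=size)  # deque drops the oldest entry automatically

    def add(self, timestamp, value):
        self.events.append((timestamp, value))

    def average(self):
        return sum(value for _, value in self.events) / len(self.events)


# Hypothetical trade events: (seconds since start, price).
trades = [(0, 10.0), (30, 20.0), (70, 30.0)]
last_minute = TimeWindow(60)
last_two = LengthWindow(2)
for ts, price in trades:
    last_minute.add(ts, price)
    last_two.add(ts, price)
```

In this example both windows end up holding the last two trades, but for different reasons: the time window dropped the first trade because it is more than 60 seconds old, while the count window dropped it because only two events fit.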
In EPL, having **>=10 means that the query returns only the results that satisfy the having condition, similar to its use in SQL; on myevent(CallCounter>= 10) as a update mywindow as b set flag=true where a.strlmsi=b.strlmsi and b.flag=false means that when the where condition is met and the CallCounter field of the myevent stream satisfies CallCounter>=10, the content of the flag field in mywindow is updated.
In the prior art, the existing keyword having in stream processing can output only the records that arrive after the condition is satisfied. Earlier data is treated as not satisfying the condition and is handled together with expired data, so it cannot be output together with the records that arrive after the condition is satisfied. For example, if the condition is having > 5 and 10 records actually satisfy the condition, only records 5 to 10 are actually output.
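The limitation described above can be made concrete with a small Python sketch. It is only an approximation of having-style filtering under the assumption that the condition is a strict running count; the function name `having_style` is hypothetical, and the exact boundary record (5th versus 6th) depends on how an engine counts, which this sketch does not settle.

```python
def having_style(events, threshold):
    """Emit an event only if the running count already exceeds `threshold` when it arrives."""
    emitted = []
    count = 0
    for event in events:
        count += 1
        if count > threshold:
            # Earlier events were never kept, so they cannot be emitted now.
            emitted.append(event)
    return emitted


records = list(range(1, 11))  # ten hypothetical records, numbered 1..10
late_records = having_style(records, 5)
```

With ten records and a threshold of 5, the sketch emits only the tail of the sequence; the records that built up the count are lost, which is the gap the had keyword is meant to close.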
To output all records that satisfy the condition, the prior art requires multi-rule nesting: two copies, A and B, of the data in the same period are generated; the users satisfying the condition are found from A, and the call records of those users are then looked up in B.
Summary of the Invention
The technical problem to be solved by the embodiments of the present invention is to provide a stream processing method and apparatus that can output all events satisfying the condition without multi-rule nesting.
To solve the above technical problem, in one aspect, an embodiment of the present invention provides a stream processing method, the method including:
receiving an event belonging to a current stream processing window;
caching the event belonging to the current stream processing window;
determining whether the cached events belonging to the current stream processing window satisfy a first expression related to a first keyword;
when the first expression is not satisfied, not deleting the cached events belonging to the current stream processing window, and receiving the next event belonging to the current stream processing window; and
when the first expression is satisfied, outputting the cached events belonging to the current stream processing window according to the first keyword, where the first keyword is used to indicate outputting the cached events that belong to the current stream processing window and satisfy the first expression.
In another aspect, an embodiment of the present invention further provides a stream processing apparatus, the apparatus including:
a receiving unit, configured to receive an event belonging to a current stream processing window;
a cache unit, configured to cache the event belonging to the current stream processing window;
a first expression determining unit, configured to determine whether the cached events belonging to the current stream processing window satisfy a first expression related to a first keyword;
an output unit, configured to, when the first expression determining unit determines that the first expression is satisfied, output the cached events belonging to the current stream processing window according to the first keyword, where the first keyword is used to indicate outputting the cached events that belong to the current stream processing window and satisfy the first expression; and
a loop unit, configured to, when the first expression determining unit determines that the first expression is not satisfied, not delete the cached events belonging to the current stream processing window, and trigger the receiving unit to receive the next event belonging to the current stream processing window.
In the embodiments of the present invention, when the first expression related to the first keyword is processed, the cached events belonging to the current stream processing window are not deleted even if the first expression is not satisfied. Events of the current stream processing window are not discarded merely because they do not yet satisfy the condition, so that at the final output all events in the stream processing window that led to the condition being satisfied can be output, rather than only the events that arrive after the condition is satisfied.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a stream processing method in an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a stream processing apparatus in an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a cache unit in an embodiment of the present invention;
FIG. 4 is another schematic structural diagram of a stream processing apparatus in an embodiment of the present invention;
FIG. 5 is another schematic flowchart of a stream processing method in an embodiment of the present invention;
FIG. 6 is a schematic flowchart of processing multiple events by using the stream processing method in FIG. 5;
FIG. 7 is a specific schematic diagram of how the events contained in a sliding window change as the window slides multiple times in an embodiment of the present invention;
FIG. 8 is a schematic flowchart of processing multiple events by using the stream processing method in an embodiment of the present invention;
FIG. 9 is another schematic diagram of how the events contained in a sliding window change as the window slides multiple times in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
In the SQL-like rule descriptions of current stream processing software, the having keyword can retain only the data that arrives after the condition is met within the period (that is, the stream processing window). In the embodiments of the present invention, a first keyword (for example, the had keyword) is added, so that all data in the period that led to the condition being reached, both before and after that point, can be retained. In addition, had may be followed by a complex conditional expression, which may be a single-stream conditional expression or a multi-stream conditional expression.
FIG. 1 is a schematic flowchart of an event stream-based data processing method in an embodiment of the present invention. The process includes the following steps:
101. Receive an event belonging to the current stream processing window. In stream processing, events are input in the form of an event stream. The event may include a call record, a short message sending record, or a website click record.
102. Cache the event belonging to the current stream processing window. Specifically, the event belonging to the current stream processing window may be cached in the memory of the stream processing engine.
When the event is cached, caching may further be performed according to a second expression: determine whether the event belonging to the current stream processing window satisfies a second expression related to a second keyword; if the second expression is not satisfied, delete the cached events belonging to the current stream processing window and cache the received event belonging to the current stream processing window; otherwise, cache the received event belonging to the current stream processing window.
Of course, the expression may also be a third expression: determine according to the third expression whether to cache the event of the current stream processing window, and if so, cache the event belonging to the current stream processing window.
103、 判断已緩存的属于当前流处理窗口的事件是否满足与第一关键字有关 的第一表达式。  103. Determine whether the cached event belonging to the current stream processing window satisfies the first expression related to the first keyword.
其中, 所述第一关键字可为 had关键字。  The first keyword may be a had keyword.
104、 当判断结果为不满足所述第一表达式时, 不删除已緩存的属于当前流 处理窗口的事件, 并接收属于当前流处理窗口的下一事件。  104. When the result of the determination is that the first expression is not satisfied, the cached event belonging to the current stream processing window is not deleted, and the next event belonging to the current stream processing window is received.
105、 当判断结果为满足所述第一表达式时, 按照所述第一关键字输出已緩 存的属于当前流处理窗口的事件, 其中所述第一关键字用于指示输出已緩存的 属于当前流处理窗口并满足所述第一表达式的事件。 105. When the result of the determination is that the first expression is satisfied, outputting a cached event belonging to the current stream processing window according to the first keyword, where the first keyword is used to indicate that the output is cached An event belonging to the current stream processing window and satisfying the first expression.
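Steps 101 to 105 can be sketched in Python as follows. This is a minimal illustration and not the patented implementation: the function name `had_process`, the dictionary event format, and the decision to keep the cache after an output are all assumptions made for the example.

```python
from typing import Callable, Iterable, Iterator, List

Event = dict  # hypothetical event representation


def had_process(events: Iterable[Event],
                first_expression: Callable[[List[Event]], bool]
                ) -> Iterator[List[Event]]:
    """Cache every event of one window and emit the whole cache whenever
    the first expression holds (steps 101 to 105)."""
    cache: List[Event] = []
    for event in events:              # 101: receive an event of the window
        cache.append(event)           # 102: cache it
        if first_expression(cache):   # 103: evaluate the first expression
            yield list(cache)         # 105: output everything cached so far
        # 104: otherwise the cache is left untouched and the loop continues


# Hypothetical call records; the rule is "count >= 5".
calls = [{"caller": "a", "seq": i} for i in range(1, 7)]
outputs = list(had_process(calls, lambda cached: len(cached) >= 5))
first_output_seqs = [e["seq"] for e in outputs[0]]
```

The point of the sketch is that events 1 to 4, which arrive before the condition is met, are still part of the first output.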
Further, in the stream processing of the embodiments of the present invention, the stream processing window may be a time window that changes over time, such as a sliding time window or a jumping time window. In that case, when the current stream processing window moves, the events in the cache that do not belong to the moved stream processing window are deleted.
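The eviction that accompanies a moving window can be sketched as below. The helper name `evict_expired`, the `(timestamp, payload)` tuple layout, and the choice to treat an event exactly `window_seconds` old as expired are assumptions for illustration; the cache is assumed to be ordered by timestamp.

```python
from collections import deque


def evict_expired(cache: deque, now: float, window_seconds: float = 60.0) -> None:
    """Remove cached (timestamp, payload) pairs that no longer fall inside
    the window ending at `now`."""
    cutoff = now - window_seconds
    while cache and cache[0][0] <= cutoff:
        cache.popleft()


cache = deque([(0.0, "e1"), (30.0, "e2"), (59.0, "e3")])
evict_expired(cache, now=70.0)          # window is now (10.0, 70.0]
remaining = [payload for _, payload in cache]
```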
Meanwhile, when it is determined whether the cached events belonging to the current stream processing window satisfy the first expression related to the first keyword, the intermediate calculation results of processing the events of the current stream processing window are cached; when the cached events belonging to the current stream processing window are output according to the first keyword, the final calculation result obtained from the intermediate calculation results is output.
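One way to read the caching of intermediate calculation results is as running aggregates that are updated per event and turned into a final value only at output time. The sketch below assumes a hypothetical `duration` field and an average as the final result; neither is specified in the text.

```python
class RunningStats:
    """Keeps the cached events together with running aggregates."""

    def __init__(self) -> None:
        self.events = []
        self.count = 0              # intermediate result: number of events
        self.total_duration = 0.0   # intermediate result: summed duration

    def add(self, event: dict) -> None:
        self.events.append(event)
        self.count += 1
        self.total_duration += event["duration"]

    def had_satisfied(self, threshold: int = 3) -> bool:
        # The first expression is evaluated on the intermediate count.
        return self.count >= threshold

    def output(self):
        # Final result derived from the intermediate results.
        return list(self.events), self.total_duration / self.count


stats = RunningStats()
for d in (10.0, 20.0, 30.0):
    stats.add({"duration": d})
events_out, average = stats.output() if stats.had_satisfied() else ([], None)
```

Keeping the count and sum up to date avoids rescanning the cache each time the first expression is evaluated.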
The above method may be used in a complex event processing (CEP) system that uses the CQL language, the EPL language, or an SQL-like CEP rule description language.
Correspondingly, an embodiment of the present invention further provides a stream processing apparatus. As shown in FIG. 2, the apparatus includes: a receiving unit 10, configured to receive an event belonging to the current stream processing window; a cache unit 20, configured to cache the event belonging to the current stream processing window; a first expression determining unit 30, configured to determine whether the cached events belonging to the current stream processing window satisfy the first expression related to the first keyword; an output unit 40, configured to, when the first expression determining unit determines that the first expression is satisfied, output the cached events belonging to the current stream processing window according to the first keyword, where the first keyword indicates that the cached events that belong to the current stream processing window and satisfy the first expression are to be output; and a loop unit 50, configured to, when the first expression determining unit determines that the first expression is not satisfied, leave the cached events belonging to the current stream processing window undeleted and trigger the receiving unit to receive the next event belonging to the current stream processing window.
The cache unit 20 may be specifically configured to cache the event belonging to the current stream processing window in the memory of the stream processing engine.
As shown in FIG. 3, the cache unit 20 may include: a second expression determining subunit 200, configured to determine whether the event belonging to the current stream processing window satisfies the second expression related to the second keyword; and a processing subunit 202, configured to, when the second expression determining subunit determines that the second expression is not satisfied, delete the cached events belonging to the current stream processing window and cache the received event belonging to the current stream processing window, and otherwise cache the received event belonging to the current stream processing window. Alternatively, the cache unit includes a third expression determining subunit (not shown in the figures), configured to determine, according to the third expression, whether to cache the event of the current stream processing window, and to cache the event belonging to the current stream processing window when the determination result is yes.
Further, as shown in FIG. 4, the apparatus may also include a processing unit 60, configured to cache the intermediate calculation results of processing the events of the current stream processing window when it is determined whether the cached events belonging to the current stream processing window satisfy the first expression related to the first keyword. The output unit 40 is further configured to output the final calculation result obtained from the intermediate calculation results when the cached events belonging to the current stream processing window are output according to the first keyword.
The events described above may include call records, short message sending records, or website click records; the first keyword is the had keyword; and the apparatus is used in a complex event processing (CEP) system that uses the CQL language, the EPL language, or an SQL-like CEP rule description language. For other terms and details of this apparatus embodiment, refer to the method embodiments described above or below; they are not repeated here.
In the embodiments of the present invention, when the first expression related to the first keyword is processed, the cached events belonging to the current stream processing window are not deleted even if the first expression is not satisfied. Events are not discarded merely because the events of the current stream processing window do not yet satisfy the condition. As a result, at final output, all events in the stream processing window that led to the condition being satisfied can be output, rather than only the events that arrive after the condition is satisfied.
In a specific embodiment of the present invention, when the events are call records, all call records satisfying the condition can be output according to the first expression related to the first keyword, for example to identify malicious calls made by a user, and the detailed call bills can then be fetched from a database based on these call records. The bills finally obtained will not be incomplete because some call records that preceded the condition being met were not cached.
FIG. 5 is another specific schematic flowchart of the stream processing method in an embodiment of the present invention. In this example, events are processed according to the first expression and the second expression.
201. Receive an event belonging to the current stream processing window.
202. Determine whether the event belonging to the current stream processing window satisfies the second expression related to the second keyword. If the second expression is not satisfied, go to step 203; if it is satisfied, go to step 204.
203. Delete the events belonging to the current stream processing window that are already cached in the memory of the stream processing engine, and cache the received event belonging to the current stream processing window in the memory of the stream processing engine. Then go to step 205.
204. Cache the received event belonging to the current stream processing window in the memory of the stream processing engine. Then go to step 205.
205. Determine whether the cached events belonging to the current stream processing window satisfy the first expression related to the first keyword. If the first expression is not satisfied, go to step 206; if it is satisfied, go to step 207.
206. When the first expression is not satisfied, do not delete the cached events belonging to the current stream processing window, and return to step 201 to receive the next event belonging to the current stream processing window.
207. When the first expression is satisfied, output the cached events belonging to the current stream processing window according to the first keyword.
After this step is completed, the process may end, or it may return to step 201 to continue receiving the next event.
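The flow of steps 201 to 207 can be sketched with the two expressions passed in as callables. This is an illustrative reading only: the function name, the choice to compare each event with the most recently cached one, and the treatment of the very first event as satisfying the second expression are assumptions.

```python
from typing import Any, Callable, Iterable, Iterator, List


def process_with_second_expression(
        events: Iterable[Any],
        first_expression: Callable[[List[Any]], bool],
        second_expression: Callable[[Any, Any], bool],
) -> Iterator[List[Any]]:
    cache: List[Any] = []
    for event in events:                                        # 201
        if cache and not second_expression(cache[-1], event):   # 202
            cache.clear()                                       # 203: drop earlier events
        cache.append(event)                                     # 203 / 204: cache the new event
        if first_expression(cache):                             # 205
            yield list(cache)                                   # 207
        # 206: otherwise keep the cache and receive the next event


# Hypothetical rule: at least three consecutive events from the same caller.
callers = ["a", "a", "b", "b", "b"]
results = list(process_with_second_expression(
    callers,
    first_expression=lambda cached: len(cached) >= 3,
    second_expression=lambda previous, current: previous == current,
))
```

The two "a" events are dropped when the first "b" breaks the run, so only the run of "b" events is output.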
Building on the above embodiment, a specific embodiment may define the stream processing window as a one-minute sliding window. The first expression is that the number of consecutive calls made by the user within the one-minute sliding window is greater than or equal to five. The second expression is that the interval between two adjacent call records is less than 5 seconds (Δt < 5 s), and the second keyword indicates that only call records satisfying the second expression are recorded. The rule described in EPL is as follows:
select * from stream.win:time(60 sec) group by a had count(*) >= 5 where (timestamp - previous(timestamp)) < 5 s
Here, where (timestamp - previous(timestamp)) < 5 s means that the time interval between two adjacent events must be less than 5 seconds. As shown in FIG. 6, the processing flow for the event stream of FIG. 7 is:
301. Receive event 1. Because event 1 is the first event, its interval from a previous event does not need to be checked, and event 1 is cached. The event count in window 1 is now 1 (1 < 5), so the cached event 1 is not deleted, but it is not output either.
302. Receive event 2. The interval between event 1 and event 2 is determined to be less than 5 seconds, so event 2 is cached (that is, two events belonging to window 1 are now cached: event 1 and event 2). The event count in window 1 is now 2 (2 < 5), so the cached events 1 and 2 are not deleted, but they are not output either.
303. Receive event 3. The interval between event 2 and event 3 is determined to be greater than 5 seconds, so the cached events 1 and 2 are deleted and only event 3 is cached. The event count in window 1 is now 1 (1 < 5), so the cached event 3 is not deleted, but it is not output either.
304. When the sliding time of the window arrives, slide the window. When window 1 slides to window 2 (that is, the current stream processing window changes from window 1 to window 2), the event that has slid out of the window (here, event 1) is deleted. Subsequent events 4 to 6 are then received and processed, the count changes accordingly, and event processing continues as described above.
305. When the sliding time of the window arrives, slide the window. When window 2 slides to window 3 (that is, the current stream processing window changes from window 2 to window 3), the event that has slid out of the window (here, event 2) is deleted. Event 7 is received, and the interval between event 6 and event 7 is determined to be less than 5 seconds, so the cached events are now events 3 to 7. The event count in the window is now 5 (5 = 5), so the cached events 3 to 7 are not deleted, and the cached events 3 to 7 are output.
306. Continue processing subsequent events in a manner similar to the above.
Of course, the sliding time window above may instead be a jumping time window, and the processing flow is similar.
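The one-minute rule walked through in steps 301 to 306 can be approximated with the sketch below. The timestamps are invented for illustration (the actual values in FIG. 7 are not given in the text), events are reduced to bare timestamps for a single caller, and the constant and function names are hypothetical.

```python
from collections import deque

WINDOW = 60.0    # win:time(60 sec)
MAX_GAP = 5.0    # (timestamp - previous(timestamp)) < 5 s
MIN_COUNT = 5    # had count(*) >= 5


def run_rule(timestamps):
    """Apply the sliding window, the gap rule, and the count rule in turn."""
    cache = deque()
    outputs = []
    for ts in timestamps:
        # Window slides: drop events that are no longer inside it.
        while cache and cache[0] <= ts - WINDOW:
            cache.popleft()
        # Second expression: a gap of 5 s or more resets the cached run.
        if cache and ts - cache[-1] >= MAX_GAP:
            cache.clear()
        cache.append(ts)
        # First expression: output the whole cached run once it is long enough.
        if len(cache) >= MIN_COUNT:
            outputs.append(list(cache))
    return outputs


# Events 1..7 at hypothetical times; the gap between events 2 and 3 is 17 s.
hypothetical_times = [0.0, 3.0, 20.0, 22.0, 24.0, 26.0, 28.0]
rule_outputs = run_rule(hypothetical_times)
```

With these assumed timestamps the single output consists of events 3 to 7, which matches the outcome described in step 305.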
FIG. 8 shows the flow of processing events according to the first expression and the third expression. In this example, the first keyword is the had keyword, the expression of the first keyword is count(*) >= 5, and the third expression is ** != null, that is, the field ** is not empty. For the event stream shown in FIG. 9, the processing flow is:
401. Receive event 1. Event 1 is the first event; its field ** is determined to be non-empty, so the event is cached. The event count in window 1 is now 1 (1 < 5), so the cached event 1 is not deleted, but it is not output either.
402. Receive event 2. Its field ** is determined to be non-empty, so the event is cached. The event count in window 1 is now 2 (2 < 5), so the cached events 1 and 2 are not deleted, but they are not output either. Events 3 to 5 are processed in the same way. When the sliding time of the window arrives, slide the window. When window 1 slides to window 2, the events that have slid out of the window are deleted, the count changes accordingly, and event processing continues as described above.
403. Receive event 6. Its field ** is determined to be non-empty, so the cached events are now events 2 to 6. The event count in the window is now 5 (5 = 5), so the cached events are not deleted, and the cached events 2 to 6 are output.
404. Continue processing subsequent events in a manner similar to the above.
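The role of the third expression as a per-event admission filter can be sketched as follows. The field name `callee` stands in for the unnamed field ** and is purely hypothetical, as are the function name and the threshold of three used to keep the example short; window sliding is left out.

```python
def process_with_third_expression(events, third_expression, first_expression):
    cache = []
    for event in events:
        if not third_expression(event):
            continue              # event is not cached; the existing cache is kept
        cache.append(event)
        if first_expression(cache):
            yield list(cache)


records = [
    {"id": 1, "callee": "x"},
    {"id": 2, "callee": None},   # rejected by the third expression
    {"id": 3, "callee": "y"},
    {"id": 4, "callee": "z"},
]
filtered_outputs = list(process_with_third_expression(
    records,
    third_expression=lambda e: e["callee"] is not None,
    first_expression=lambda cached: len(cached) >= 3,
))
output_ids = [e["id"] for e in filtered_outputs[0]]
```

Unlike the second expression, a failed third expression only skips the incoming event and does not clear what has already been cached.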
In the above embodiments, the keyword had may also be used in combination with various other window definitions, query conditions, pattern definitions, and so on, for example count(a), count([all | distinct] expression [, filter_expr]), count(* [, filter_expr]), and so on.
It can be seen from the description of the above embodiments that, in the prior art, within a statistical period, a single rule using an existing keyword (for example, having) can only output the records that follow the condition being met, and already processed events are deleted when a new event is received; listing all the records that led to the condition being met requires combining multiple rules. In the embodiments of the present invention, within a statistical period, a single rule with the keyword extension (for example, the added keyword had) can list all the records that led to the condition being met, without combining multiple rules.
That is, in the embodiments of the present invention, the records that satisfy the expression condition after had are cached. When the output condition defined by the rule is reached, or the expression condition after had is not met, all records satisfying the expression condition after had are output, and the cache is cleared before counting continues. (For a simple event count, nothing is output when the expression condition after had is not met. In more complex cases, such as grouped statistics, the events are records belonging to different groups: for example, group A may have reached five occurrences, so the records of group A are output, while group B has not reached five occurrences, so the records related to group B are not output.)
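The grouped case described above, where group A is output and group B is not, can be sketched with one cache per group. The function name `had_by_group`, the `caller` field, and the default threshold are assumptions for the example; window handling is omitted.

```python
from collections import defaultdict


def had_by_group(events, key, threshold=5):
    """Keep a separate cache per group; output and clear a group's cache
    once that group alone reaches the threshold."""
    caches = defaultdict(list)
    for event in events:
        group = event[key]
        caches[group].append(event)
        if len(caches[group]) >= threshold:
            # Output every record of this group, then clear its cache.
            yield group, caches.pop(group)


grouped_events = (
    [{"caller": "A", "n": i} for i in range(5)]
    + [{"caller": "B", "n": i} for i in range(3)]
)
group_results = list(had_by_group(grouped_events, "caller"))
```

Group B has only three records here, so nothing is emitted for it, and its records stay cached until the threshold is reached or the window moves on.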
A person of ordinary skill in the art will understand that all or part of the processes of the above method embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is merely a preferred embodiment of the present invention and certainly cannot be used to limit the scope of rights of the present invention; equivalent variations made in accordance with the claims of the present invention therefore still fall within the scope covered by the present invention.

Claims

1. A stream processing method, characterized in that the method comprises:
receiving an event belonging to a current stream processing window;
caching the event belonging to the current stream processing window;
determining whether the cached events belonging to the current stream processing window satisfy a first expression related to a first keyword;
when the determination result is that the first expression is not satisfied, not deleting the cached events belonging to the current stream processing window, and receiving a next event belonging to the current stream processing window; and
when the determination result is that the first expression is satisfied, outputting the cached events belonging to the current stream processing window according to the first keyword, wherein the first keyword is used to indicate outputting the cached events that belong to the current stream processing window and satisfy the first expression.
2. The stream processing method according to claim 1, characterized in that the caching of the event belonging to the current stream processing window specifically comprises:
caching the event belonging to the current stream processing window in a memory of a stream processing engine.
3. The stream processing method according to claim 1 or 2, characterized in that the caching of the event belonging to the current stream processing window specifically comprises:
determining whether the event belonging to the current stream processing window satisfies a second expression related to a second keyword; and
if the determination result is that the second expression is not satisfied, deleting the cached events belonging to the current stream processing window and caching the received event belonging to the current stream processing window; otherwise, caching the received event belonging to the current stream processing window.
4. The stream processing method according to any one of claims 1 to 3, characterized in that the method further comprises:
when the current stream processing window moves, deleting the events in the cache that do not belong to the moved stream processing window.
5. The stream processing method according to any one of claims 1 to 4, characterized in that, after the step of determining whether the cached events belonging to the current stream processing window satisfy the first expression related to the first keyword, the method further comprises: caching intermediate calculation results of processing the events of the current stream processing window; and
when the cached events belonging to the current stream processing window are output according to the first keyword, outputting a final calculation result obtained from the intermediate calculation results.
6. The stream processing method according to claim 1 or 2, characterized in that the caching of the event belonging to the current stream processing window comprises:
determining, according to a third expression, whether to cache the event of the current stream processing window, and caching the event belonging to the current stream processing window when the determination result is yes.
7. The stream processing method according to any one of claims 1 to 6, characterized in that the event comprises a call record, a short message sending record, or a website click record.
8. The stream processing method according to any one of claims 1 to 7, characterized in that the first keyword is a had keyword.
9. The stream processing method according to any one of claims 1 to 8, characterized in that the method is used in a complex event processing (CEP) system, and the CEP system uses the CQL language, the EPL language, or an SQL-like CEP rule description language.
10. A stream processing apparatus, characterized in that the apparatus comprises:
a receiving unit, configured to receive an event belonging to a current stream processing window;
a cache unit, configured to cache the event belonging to the current stream processing window;
a first expression determining unit, configured to determine whether the cached events belonging to the current stream processing window satisfy a first expression related to a first keyword;
an output unit, configured to, when the determination result of the first expression determining unit is that the first expression is satisfied, output the cached events belonging to the current stream processing window according to the first keyword, wherein the first keyword is used to indicate outputting the cached events that belong to the current stream processing window and satisfy the first expression; and
a loop unit, configured to, when the determination result of the first expression determining unit is that the first expression is not satisfied, not delete the cached events belonging to the current stream processing window, and trigger the receiving unit to receive a next event belonging to the current stream processing window.
11. The apparatus according to claim 10, characterized in that the cache unit is specifically configured to cache the event belonging to the current stream processing window in a memory of a stream processing engine.
12. The apparatus according to claim 10 or 11, characterized in that the cache unit specifically comprises:
a second expression determining subunit, configured to determine whether the event belonging to the current stream processing window satisfies a second expression related to a second keyword; and
a processing subunit, configured to, when the determination result of the second expression determining subunit is that the second expression is not satisfied, delete the cached events belonging to the current stream processing window and cache the received event belonging to the current stream processing window, and otherwise cache the received event belonging to the current stream processing window.
13. The apparatus according to any one of claims 10 to 12, characterized in that the apparatus further comprises:
a processing unit, configured to cache intermediate calculation results of processing the events of the current stream processing window when it is determined whether the cached events belonging to the current stream processing window satisfy the first expression related to the first keyword;
wherein the output unit is further configured to output a final calculation result obtained from the intermediate calculation results when the cached events belonging to the current stream processing window are output according to the first keyword.
14. The apparatus according to claim 10 or 11, characterized in that the cache unit specifically comprises:
a third expression determining subunit, configured to determine, according to a third expression, whether to cache the event of the current stream processing window, and to cache the event belonging to the current stream processing window when the determination result is yes.
15. The apparatus according to any one of claims 10 to 14, characterized in that the event comprises a call record, a short message sending record, or a website click record, the first keyword is a had keyword, the apparatus is used in a complex event processing (CEP) system, and the CEP system uses the CQL language, the EPL language, or an SQL-like CEP rule description language.

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2011/084643 WO2013097073A1 (en) 2011-12-26 2011-12-26 Stream processing method and apparatus
CN201180003717.6A CN103282880B (en) 2011-12-26 2011-12-26 A kind of method for stream processing and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2011/084643 WO2013097073A1 (en) 2011-12-26 2011-12-26 Stream processing method and apparatus

Publications (1)

Publication Number Publication Date
WO2013097073A1 true WO2013097073A1 (en) 2013-07-04

Family

ID=48696169

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/084643 WO2013097073A1 (en) 2011-12-26 2011-12-26 Stream processing method and apparatus

Country Status (2)

Country Link
CN (1) CN103282880B (en)
WO (1) WO2013097073A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106021326A (en) * 2016-05-03 2016-10-12 无锡雅座在线科技发展有限公司 A stream-computing event processing method and device
CN106484595A (en) * 2016-10-09 2017-03-08 华青融天(北京)技术股份有限公司 A kind of event-handling method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101067843A (en) * 2006-05-04 2007-11-07 Sap股份公司 Systems and methods for processing auto-id data
US20080301125A1 (en) * 2007-05-29 2008-12-04 Bea Systems, Inc. Event processing query language including an output clause
CN101436337A (en) * 2008-12-23 2009-05-20 北京中星微电子有限公司 Method and apparatus for monitoring event
CN101685466A (en) * 2009-07-22 2010-03-31 中兴通讯股份有限公司 Event handling method and event handling equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110239229A1 (en) * 2010-03-26 2011-09-29 Microsoft Corporation Predicative and persistent event streams

Also Published As

Publication number Publication date
CN103282880A (en) 2013-09-04
CN103282880B (en) 2016-10-05

Similar Documents

Publication Publication Date Title
US11288231B2 (en) Reproducing datasets generated by alert-triggering search queries
US8589432B2 (en) Real time searching and reporting
US10255238B2 (en) CEP engine and method for processing CEP queries
Yang et al. In-network execution of monitoring queries in sensor networks
US8880493B2 (en) Multi-streams analytics
US8560511B1 (en) Fine-grain locking
WO2021017884A1 (en) Data processing method and apparatus, and gateway server
US9170984B2 (en) Computing time-decayed aggregates under smooth decay functions
US20120197928A1 (en) Real time searching and reporting
US20110131198A1 (en) Method and apparatus for providing a filter join on data streams
US9600526B2 (en) Generating and using temporal data partition revisions
CN107301215B (en) Search result caching method and device and search method and device
WO2014019349A1 (en) File merge method and device
WO2018205845A1 (en) Data processing method, server, and computer storage medium
WO2014101520A1 (en) Method and system for achieving analytic function based on mapreduce
WO2013097073A1 (en) Stream processing method and apparatus
Maier et al. Capturing episodes: may the frame be with you
WO2017124660A1 (en) System and method for associating multi-stage assembly transactions
Venkatesan et al. PoN: Open source solution for real-time data analysis
US20120296861A1 (en) Storing events from a datastream
Sun et al. DSSP: stream split processing model for high correctness of out-of-order data processing
WO2023077451A1 (en) Stream data processing method and system based on column-oriented database
Suzanne et al. Window-slicing techniques extended to spanning-event streams
US9787564B2 (en) Algorithm for latency saving calculation in a piped message protocol on proxy caching engine
CN116756177B (en) Multi-table index maintenance method and system for mysql database

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11878919

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11878919

Country of ref document: EP

Kind code of ref document: A1
