WO2013167344A1 - An adaptive method for processing an event sequence in a complex event processing system and a system thereof - Google Patents
- Publication number
- WO2013167344A1 (PCT/EP2013/057700)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- event
- sequence
- event sequence
- sequences
- events
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/542—Event management; Broadcasting; Multicasting; Notifications
Definitions
- An adaptive method for processing an event sequence in a complex event processing system and a system thereof
- the present invention relates to the field of event processing, and in particular to an adaptive method for processing an event sequence in a complex event processing system and a system thereof.
- a complex event processing (CEP) system is an intelligent system, which handles and processes a multitude of events to elicit meaningful information from the events.
- the events are received by the CEP system and are processed in order to identify a consequence, an opportunity, a threat, et cetera, caused by the events.
- the CEP system is used extensively in a wide variety of systems and applications, for example, surveillance and monitoring systems, predictive systems, business process management systems, stock trading applications, futures and options trading applications, et cetera.
- An event sequence is a sequence of events, and the event sequence may contain one or more anomalies.
- Anomalous event sequences are those sequences that differ from predefined event sequences or those sequences that do not match a predefined model. Determination of such anomalies is a challenging task. Efficient determination of the anomalies improves both the quality and accuracy of the CEP system. This is extremely beneficial, especially if the CEP system is used in real-time environments, for example, surveillance and monitoring systems. Therefore, the events are to be processed in a facile manner to determine such anomalies.
- US 20090210364 relates to an apparatus for and method of generating complex event processing system rules.
- the patent application teaches a mechanism of enabling a standard learning function to generate rules for CEP systems.
- the method creates rules based on the previously defined output events by creating input event feature vectors for each targeted output event.
- the patent application covers a method for automatically generating CEP system rules to infer output events which are anomalies of the input event sequences disclosed.
- the underlying idea of the present invention is to simplify the processing of an event sequence in a CEP system.
- An adaptive method for processing the event sequence is herein proposed.
- the event sequence comprising a plurality of events is received.
- the plurality of events is related to an entity.
- Each event comprises an event label.
- the plurality of events is sequenced responsive to the event labels thereby obtaining an event sequence.
- the event sequence is compared with each of a plurality of reference event sequences for determining at least a closely matching reference event sequence.
- Events of the closely matching reference event sequence comprise the plurality of events to a substantial extent.
- an anomaly is determined if the event sequence is non-identical to the closely matching reference event sequence.
- the event sequence is then added to the plurality of reference event sequences based on the determined anomaly.
- the determined anomaly is an event absence anomaly, if a
- the determined anomaly is a multiple event occurrence anomaly, if a statistical frequency of an event in the event sequence is more than a statistical frequency of the event in the closely matching reference event sequence.
- the determined anomaly is an extraneous event anomaly, if an event is only present in the received event sequence and is absent in the closely matching determined reference event sequence.
- the plurality of reference event sequences comprises a first set of acceptable reference event sequences and a second set of unacceptable reference event sequences.
- An acceptable reference event sequence conforms to a sequence of activities that are permitted for the entity, and an unacceptable reference event sequence conforms to a sequence of activities that are not permitted for the entity.
- the event sequence is determined to be an acceptable event sequence if the first set of acceptable reference event sequences comprises the closely matching reference event sequence, whereas the event sequence is determined to be an unacceptable event sequence if the second set of unacceptable reference event sequences comprises the closely matching reference event sequence.
- the event sequence is added to the first set of acceptable reference event sequences if the event sequence is non-identical to the closely matching reference event sequence.
- the first set of acceptable reference event sequences comprises the closely matching reference event sequence.
- the event sequence is added to the second set of unacceptable reference event sequences if the event sequence is non-identical to the closely matching reference event sequence.
- the second set of unacceptable reference event sequences comprises the closely matching reference event sequence.
- each of the event labels is processed for sequencing the first plurality of events.
- the event sequence is determined.
- each of the event labels is a timestamp.
- a timestamp denotes an instance of time at which the event was generated.
- the plurality of events is chronologically sequenced based on the timestamps.
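The timestamp-based sequencing described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Event` class, its field names, and the sample timestamps are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; the field names are illustrative only."""
    event_type: str      # e.g. "E1" for the entry-gate event
    timestamp: datetime  # the event label: when the event was generated


def sequence_events(events):
    """Return the event types ordered chronologically by their timestamp label."""
    return [e.event_type for e in sorted(events, key=lambda e: e.timestamp)]


# Events arrive in arbitrary order from the event sources (made-up times).
received = [
    Event("E7", datetime(2013, 4, 12, 9, 40)),
    Event("E1", datetime(2013, 4, 12, 9, 0)),
    Event("E9", datetime(2013, 4, 12, 10, 15)),
    Event("E3", datetime(2013, 4, 12, 9, 20)),
]
event_sequence = sequence_events(received)
print(event_sequence)
```

Sorting on the label alone keeps the sequencing step independent of the order in which the event sources deliver their events.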
- the event sequence and each of the plurality of reference event sequences are modelled as strings.
- the anomaly is determined by applying a string matching function to compare the event sequence with each of the plurality of reference event sequences. This is a faster way of comparing and thereby the processing time is reduced.
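A minimal sketch of the string-matching comparison follows. The text does not name a particular string matching function, so Levenshtein edit distance is used here purely as an assumption. Sequences are modelled as lists of event symbols rather than character strings so that "E10" stays a single symbol. The reference sequences are the ones quoted in this document with their reference numerals; the helper names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of event symbols."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # drop x from the received sequence
                curr[j - 1] + 1,         # insert y into the received sequence
                prev[j - 1] + (x != y),  # match (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]


# Reference event sequences quoted in the text, keyed by reference numeral.
REFERENCES = {
    91: ["E1", "E2", "E3", "E4", "E7", "E9"],
    92: ["E1", "E3", "E7", "E9"],
    95: ["E1", "E3", "E10", "E7", "E9"],
    96: ["E1", "E3", "E7", "E10", "E1", "E3", "E9"],
    97: ["E1", "E4", "E5", "E9"],
    98: ["E1", "E3", "E10", "E1", "E9"],
}


def closest_reference(sequence, references):
    """Return (reference id, distance) of the closest matching reference."""
    ref_id = min(references, key=lambda r: edit_distance(sequence, references[r]))
    return ref_id, edit_distance(sequence, references[ref_id])


print(closest_reference(["E1", "E3", "E10", "E4", "E7", "E9"], REFERENCES))
```

A distance of zero means the received sequence is identical to a reference. When two references are equally close, `min` returns the first one encountered, and a real system would need an explicit tie-breaking rule.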
- each of the plurality of events comprises a common identifier tag.
- the identifier tag characterizes the entity. This is advantageous in identifying and segregating the events related to a particular entity in a CEP system that may have to handle events for different entities.
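The segregation of events by identifier tag can be illustrated with a short sketch. The tuple layout and the identifier values "PNR-A" and "PNR-B" are assumptions made for the example and do not come from the source.

```python
from collections import defaultdict


def segregate_by_entity(events):
    """Group (identifier, event_type) pairs by identifier tag, keeping arrival order."""
    by_entity = defaultdict(list)
    for identifier, event_type in events:
        by_entity[identifier].append(event_type)
    return dict(by_entity)


# Interleaved stream of events for two hypothetical entities.
stream = [
    ("PNR-A", "E1"),
    ("PNR-B", "E1"),
    ("PNR-A", "E3"),
    ("PNR-B", "E4"),
    ("PNR-A", "E7"),
]
grouped = segregate_by_entity(stream)
print(grouped)
```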
- a CEP system for processing an event sequence.
- the CEP system comprises a processor and a memory unit.
- the processor is configured to perform the method according to any of the aforementioned embodiments.
- the processor receives each of the plurality of events from a respective event source.
- the memory unit is operably coupled to the processor, and it comprises the plurality of reference event sequences.
- FIG 1 depicts a multi-stream environment comprising a plurality of event sources, wherein each event source generates an event related to an entity, wherein each event comprises a unique event label,
- FIG 2 depicts a CEP system comprising a processor and a database, wherein the CEP system is interfaced to the plurality of event sources referred to in FIG 1 for receiving a plurality of events,
- FIG 3 depicts reference event sequences comprised in the database referred to in FIG 2,
- FIG 4 depicts an exemplary activity profile comprising the plurality of events received from the plurality of event sources for the entity referred to in FIG 1,
- FIG 5 depicts a flow chart of an adaptive method for
- FIG 1 depicts an exemplary multi-stream environment 10 comprising a plurality of event sources (ES1 to ES10) 11-20.
- Each of the event sources (ES1 to ES10) 11-20 generates a respective event (Ei) for an exemplary entity 35. This results in a plurality of events (E1 to E10) 21-30 for the exemplary entity 35.
- the exemplary multi-stream environment 10 depicted herein is an airport and the exemplary entity 35 depicted herein is a passenger at the airport 10.
- the aforementioned plurality of event sources (ES1 to ES10) 11-20 is a plurality of predefined locations in the airport 10.
- Each of the plurality of locations (ES1 to ES10) 11-20 generates an event (Ei) for the passenger 35 if a predefined activity has been performed by the passenger 35 at the respective location.
- Each of the generated events (E1 to E10) 21-30 comprises a unique event label (Li) upon the generation of the event (Ei).
- the result is a plurality of events (E1 to E10) 21-30 for the passenger 35.
- each of the events (E1 to E10) 21-30 comprises an identifier 52, which is common to all the events (E1 to E10) 21-30 for the passenger 35.
- the identifier 52 serves to characterize the passenger 35.
- the identifier 52 is a Passenger Reference Number (PNR), because it specifically identifies the passenger 35 and, for example, the type of travel (international or domestic) that will be undertaken by the passenger 35.
- PNR Passenger Reference Number
- the identifier 52 may also be a passport number of the passenger 35, a ticket reference number of the passenger 35 or any other document number of the passenger 35, which uniquely identifies the passenger 35.
- each of the events (E1 to E10) 21-30 may comprise information related to the passenger 35, event type, and its unique event label (Li).
- the example mentioned below purports to illustrate the essence of the aforementioned in an exemplary context of the airport 10 for the passenger 35 if the passenger 35 performs the activity specific to each of the respective locations (ES1 to ES10) 11-20.
- the first location (ES1) 11 is an 'Entry gate', which generates the first event (E1) 21 comprising the event label (L1) 41 that represents 'Entry of the passenger 35', if the passenger 35 performs the activity of entering the airport 10 through the entry gate 11,
- the third location (ES3) 13 is a 'Check-in counter', which generates the third event (E3) 23 comprising the event label (L3) 43 that represents 'Issue of a boarding pass to the passenger 35', if the passenger 35 performs the activity of checking in at the check-in counter 13,
- the fourth location (ES4) 14 is a 'Baggage counter', which generates the fourth event (E4) 24 comprising the event label (L4) 44 that represents 'Check-in baggage submitted by the passenger 35', if the passenger 35 performs the activity of submitting his check-in baggage at the baggage counter 14,
- the fifth location (ES5) 15 is an 'Immigration counter', which generates the fifth event (E5) 25 comprising the event label (L5) 45 that represents 'Immigration clearance of the passenger 35', if the passenger 35 performs the activity of presenting himself for
- the sixth location (ES6) 16 is a 'Customs counter', which generates the sixth event (E6) 26 comprising the event label (L6) 46 that represents 'Customs clearance of the passenger 35', if the passenger 35 performs the activity of presenting himself for customs clearance at the customs counter 16,
- the seventh location (ES7) 17 is a 'Security check counter', which generates the seventh event (E7) 27 comprising the event label (L7) 47 that represents 'Security check of the passenger 35', if the passenger 35 performs the activity of presenting himself for security clearance at the security check counter 17, the eighth location (ES8) 18 is a 'Baggage
- the ninth location (ES9) 19 is a 'Boarding gate', which generates the ninth event (E9) 29 comprising the label (L9) 49 that represents 'Boarding of the passenger 35', if the passenger 35 performs the activity of passing through the boarding gate 19 for boarding an aircraft 53, and
- the tenth location (ES10) 20 is an 'Exit gate', which generates the tenth event (E10) 30 comprising the event label (L10) 50 that represents 'Exit of the passenger 35', if the passenger 35 performs the activity of exiting the airport 10 through the exit gate 20.
- Each of the event labels (L1 to L10) 41-50 contains data, which may be a number or a code assigned to the event (Ei) at the location (ESi) where the event (Ei) is generated.
- the event labels facilitate the identification of the events (E1 to E10) and the arrangement of the generated events (E1 to E10) 21-30 specific to the passenger 35 in a sequential order.
- the labels may contain the points in time at which the corresponding event occurred.
- a sequential order of the events (E1 to E10) 21-30 for the passenger 35 determines an event sequence of the passenger 35.
- the event sequence defines a sequence of activities performed by the passenger 35 in the airport 10. For example, in the aforementioned
- the event sequences 58, 59 may contain one or more anomalies.
- the term "anomaly" refers to any deviation in a
- such a deviation can be one or more missing events (E1 to E10) 21-30, a change in the order of occurrence of the events (E1 to E10) 21-30, or recurrence of certain events (E1 to E10) 21-30, et cetera.
- FIG 2 depicts a CEP system 60 interfaced to each of the plurality of locations (ES1 to ES10) 11-20.
- the CEP system 60 comprises a central processor 65, and the processor is operably coupled to a memory unit 70.
- the memory unit 70 is a database.
- the processor 65 receives each of the respective plurality of events (E1 to E10) 21-30 for the passenger 35.
- Each of the labels (L1 to L10) 41-50 of each of the plurality of events (E1 to E10) 21-30 is processed for determining the event sequence for the activities performed by the passenger 35.
- each of the event labels (L1 to L10) 41-50 comprises a respective timestamp (TSi), wherein the timestamp (TSi) denotes an instance of time at which the particular event (Ei) was generated, i.e. the instance of time at which the particular activity was performed at the particular location (ESi).
- the event sequence determined by processing the event labels (L1 to L10) 41-50 of the plurality of events (E1 to E10) 21-30 comprises processing the timestamps (TS1 to TS10) 71-80 of the event labels (L1 to L10) 41-50 for
- the event sequence is a chronological arrangement of the plurality of events (E1 to E10) 21-30, which in fact denotes the chronological sequence of activities performed by the passenger 35 at the airport 10.
- the event (E1) 21 comprises the event label (L1) 41 comprising the timestamp (TS1) 71 and
- the event (E5) 25 comprises the event label (L5) 45 comprising the timestamp (TS5) 75.
- the timestamp (TS5) 75 is subsequent to the timestamp (TS1) 71.
- FIG 3 depicts the database 70 of the CEP system 60.
- the database 70 comprises a reference set 90 comprising reference event sequences 91-100.
- One way of generating the reference event sequences 91-100 may be by processing an adaptive rule set (not shown), i.e. a learning rule set.
- the adaptive rule set comprises one or more rules, and the rules are event-specific, entity-specific, and environment-specific. For example, some of the rules may be as follows:
- the reference event sequences 91-100 are to be construed as the possible event sequences that may be determinable for the passenger 35 in the airport 10.
- the reference event sequences 91-100 comprise a first set of acceptable reference event sequences 91-95 and a second set of unacceptable reference event sequences 96-100.
- An event sequence in the reference event sequences 91-100 is termed acceptable if the event sequence conforms to a sequence of activities that are permitted to be performed by the passenger 35 in the airport 10 as defined by the adaptive rule set.
- a reference event sequence such as [E1 E2 E3 E4 E7 E9] 91 is permitted for the passenger 35 in the airport 10, because it conforms to the permitted activities for a passenger in the airport 10, thereby rendering the reference event sequence [E1 E2 E3 E4 E7 E9] 91 acceptable.
- An event sequence in the reference event sequences 91-100 is termed unacceptable if the event sequence conforms to a sequence of activities that are not permitted to be performed by the passenger 35 in the airport 10.
- a reference event sequence such as [E1 E4 E5 E9] 97 is not permitted in the airport 10, because the passenger 35 does not present himself for security clearance at the security check counter 17.
- the reference event sequence [E1 E4 E5 E9] 97 is an unacceptable event sequence.
- the reference set 90 depicted in FIG 3 comprises five exemplary acceptable event sequences 91-95 and five exemplary unacceptable event sequences 96-100.
- the event sequence determined for the passenger 35 is compared with each of the reference event sequences 91-100 in the database 70 for determining an anomaly in the event sequence for the passenger 35. This will be elucidated with reference to FIG 5.
- the event sequence determined for the passenger 35 may be a normal event sequence or an anomalous event sequence.
- a normal event sequence is an acceptable event sequence, because it conforms to the rules as defined in the adaptive rule set.
- an anomalous event sequence differs from an acceptable event sequence as per the existing rules comprised in the adaptive rule set.
- the anomalous sequence is to be processed to determine whether the anomalous event sequence is acceptable or unacceptable.
- FIG 4 depicts an exemplary activity profile 110 of the passenger 35, wherein the passenger 35 has performed the sequence of activities corresponding to the first event sequence 58.
- the activity profile 110 of the passenger 35 comprises the following:
- the activity profile 110 is assigned with the PNR 52 of the passenger 35.
- the activity profile 110 of the passenger 35 is unique in comparison with activity profiles for other passengers. Therefore, if the event sequence of the passenger 35 is determined to be anomalous, it is to be construed that the activity profile 110 of the passenger 35 is determined to be anomalous.
- FIG 5 depicts a flow chart of an adaptive method for
- the central processor 65 of the CEP system 60 is configured to execute the adaptive method.
- each of the plurality of events (E1 to E10) 21-30, which is generated by the respective plurality of locations (ES1 to ES10) 11-20 for the passenger 35 in the airport 10, is received.
- This may involve the central processor 65 communicating with each of the plurality of locations (ES1 to ES10) 11-20 to receive the respective events (E1 to E10) 21-30.
- Each of the plurality of events (E1 to E10) 21-30 comprises the unique event label (Li) comprising a timestamp TSi when the event (Ei) is generated.
- the event label (Li) may be assigned to the respective event (Ei) by the respective location (ESi).
- the processor 65 may assign the event label (Li) if each of the plurality of event locations (ES1 to ES10) 11-20 and the processor 65 are networked to operate in real-time.
- each of the event labels (L1 to L10) 41-50 of each of the plurality of events (E1 to E10) 21-30 is processed to chronologically sequence the plurality of events (E1 to E10) 21-30 of the passenger 35. This is achieved by
- in a step 220, the event sequence for the passenger 35 is received by the central processor 65 for processing.
- the reference set 90 comprising the reference event sequences 91-100 for the passenger 35 is fetched from the database 70.
- the reference set 90 may be readily
- in a step 240, the event sequence of the passenger 35 generated in step 210 and received in step 220 is compared with each of the reference event sequences 91-100 of the reference set 90.
- at least one reference event sequence from the reference set 90 is determined, which closely matches the event sequence for the passenger 35.
- the event sequence and the reference event sequences 91-100 are modelled as strings. Thereafter, any of the well-known string matching functions may be applied for comparing the received event sequence for the passenger 35 with each of the reference event sequences 91-100 in the reference set 90.
- the reference event sequence determined in step 240 comprises substantially the same events as the event sequence for the passenger 35.
- the respective ordinal position of each of the events in the event sequence for the passenger 35 and in the determined reference event sequence is substantially similar, i.e. the sequence of the events is substantially order-preserved.
- substantially means that the two sequences, viz. the event sequence of the passenger 35 and the determined reference event sequence, comprise nearly the same events and the events are also order-preserved.
- the event sequence is anomalous if the event sequence is non-identical to the closest matching reference sequence.
- the received event sequence will be marked "normal" or "anomalous" and "acceptable" or "unacceptable". If the received event sequence for the passenger 35 and the closely matching determined reference event sequence are identical, then the received event sequence for the passenger 35 does not contain an anomaly and is flagged as a normal sequence.
- the received event sequence may be an acceptable event sequence or an unacceptable event sequence depending on the type of determined reference event sequence.
- if the received event sequence for the passenger 35 and the closely matching determined reference event sequence are not identical but only match substantially, then it is determined that the received event sequence comprises one or more anomalies and is flagged as an anomalous sequence.
- the received event sequence can still be an acceptable event sequence or an unacceptable event sequence depending on the type of anomaly present in it.
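The two-part marking described above can be sketched as a small function. This is an illustrative simplification, not the patent's logic in full. It assumes the closest matching reference has already been found, and that acceptability is inherited from the set that holds that reference, as in the worked examples. The function name and arguments are hypothetical.

```python
def mark_sequence(sequence, closest_reference, closest_is_acceptable):
    """Return the two marks for a received sequence.

    sequence / closest_reference: lists of event symbols.
    closest_is_acceptable: True if the closest reference belongs to the
    first (acceptable) set, False if it belongs to the second set.
    """
    anomaly = "normal" if sequence == closest_reference else "anomalous"
    acceptability = "acceptable" if closest_is_acceptable else "unacceptable"
    return anomaly, acceptability


# Received [E1 E3 E10 E4 E7 E9] against acceptable reference [E1 E3 E10 E7 E9] 95.
print(mark_sequence(
    ["E1", "E3", "E10", "E4", "E7", "E9"],
    ["E1", "E3", "E10", "E7", "E9"],
    True,
))
```

Keeping the two marks separate reflects that an anomalous sequence can still be acceptable.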
- a type of anomaly present in the received event sequence for the passenger 35 is determined.
- the one or more anomalies in the received event sequence are determined based on the one or more outputs of the string matching function.
- the following types of anomalies are determined based on the one or more outputs:
- for any event (Ei), firstly if the event (Ei) is present in both the received event sequence and the closely matching determined reference event sequence, and secondly if a statistical frequency of the event (Ei) in the received event sequence is more than a statistical frequency of the event (Ei) in the closely matching determined reference event sequence, then the determined anomaly in the received event sequence is a multiple event occurrence anomaly, and
- the statistical frequency of any event (Ei) present in the event sequence of the passenger 35 is determined by counting the number of times the event (Ei) is present in the event sequence. This is performed by the central processor 65.
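The frequency-based classification can be sketched with `collections.Counter`. This is a hedged illustration: the function name and the returned labels are assumptions. The event absence rule (lower frequency in the received sequence than in the reference) follows the wording used for Example 4 in this document. Changes in event order are not detected by frequency counts and are outside this sketch.

```python
from collections import Counter


def anomaly_types(received, reference):
    """Map each deviating event symbol to an anomaly type by comparing counts."""
    got, ref = Counter(received), Counter(reference)
    found = {}
    for event in sorted(set(got) | set(ref)):
        if got[event] > 0 and ref[event] == 0:
            # Present only in the received sequence.
            found[event] = "extraneous event"
        elif got[event] > ref[event] > 0:
            # Present in both, but more often in the received sequence.
            found[event] = "multiple event occurrence"
        elif got[event] < ref[event]:
            # Occurs less often in the received sequence than in the reference.
            found[event] = "event absence"
    return found


# Received [E1 E3 E10 E4 E7 E9] against reference [E1 E3 E10 E7 E9] 95.
print(anomaly_types(
    ["E1", "E3", "E10", "E4", "E7", "E9"],
    ["E1", "E3", "E10", "E7", "E9"],
))
```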
- Example 1: suppose a received event sequence is the first event sequence [E1 E3 E7 E9] 58.
- the first event sequence [E1 E3 E7 E9] 58 is compared with each of the reference event sequences 91-100.
- the reference event sequence that closely matches the first event sequence [E1 E3 E7 E9] 58 is the reference event sequence [E1 E3 E7 E9] 92.
- the received event sequence of the passenger 35 is identical to an acceptable reference sequence. Therefore, the received event sequence of the passenger 35 is not anomalous, which implies that the activities performed by the passenger 35 in the airport 10 conform to the permitted activities for a passenger in the airport 10.
- the activity profile 110 of the passenger 35 is determined to be non-anomalous.
- Example 2: suppose a received event sequence is the second event sequence [E1 E3 E7 E10 E1 E9] 59. According to step 240, the second event sequence [E1 E3 E7 E10 E1 E9] 59 is compared with each of the reference event sequences 91-100. The reference event sequence that closely matches the second event sequence [E1 E3 E7 E10 E1 E9] 59 is the reference event sequence [E1 E3 E7 E10 E1 E3 E9] 96. In this case, the received event sequence of the passenger 35 is substantially identical to an unacceptable reference sequence. Therefore, the received event sequence of the passenger 35 is both
- the activity profile 110 of the passenger 35 is determined to be anomalous as well as unacceptable.
- Example 3: suppose a received event sequence is [E1 E3 E10 E4 E7 E9].
- the received event sequence [E1 E3 E10 E4 E7 E9] is compared with each of the reference event sequences 91-100.
- the reference event sequence that closely matches the received event sequence is the reference event sequence [E1 E3 E10 E7 E9] 95.
- the received event sequence of the passenger 35 is substantially identical to an acceptable reference sequence. Therefore, the received event sequence of the passenger 35 is acceptable but anomalous.
- the type of anomaly determined in the received event sequence is an extraneous event anomaly because the event (E4) is present in the received sequence but absent in the reference event sequence 95.
- the event (E4) is an acceptable event in the framework of the airport 10.
- the activity profile 110 of the passenger 35 is determined to be anomalous but acceptable.
- Example 4: suppose a received event sequence is [E1 E10 E1 E9].
- the received event sequence [E1 E10 E1 E9] is compared with each of the reference event sequences 91-100.
- the reference event sequence that closely matches the received event sequence is the reference event sequence [E1 E3 E10 E1 E9] 98.
- the received event sequence of the passenger 35 is substantially identical to an unacceptable reference sequence. Therefore, the received event sequence of the passenger 35 is unacceptable as well as anomalous.
- the type of anomaly determined in the received event sequence is an event absence anomaly, because the statistical frequency of the event (E3) in the received event sequence is less than the statistical frequency of the event (E3) in the reference event sequence 98. Furthermore, the absence of an important event (E3) is not acceptable in the framework of an airport.
- the activity profile 110 of the passenger 35 is determined to be anomalous as well as unacceptable.
- the reference set 90 is modified to obtain a modified reference set.
- the reference set is modified according to the following cases:
- Case 1: If the received event sequence comprises any of the aforementioned types of anomalies, and if the determined closest matching reference sequence for the received event sequence is an acceptable reference sequence, then the received event sequence is added to the first set of
- the received sequence [E1 E3 E10 E4 E7 E9] is added to the first set of acceptable reference event sequences 91-95, because although the received sequence [E1 E3 E10 E4 E7 E9] contained an anomaly, it is determined to be acceptable.
- the modified reference set comprising the sequence [E1 E3 E10 E4 E7 E9] will be used for anomaly detection in the future.
- Case 2: If the received event sequence comprises any of the aforementioned types of anomalies, and if the determined closest matching reference sequence for the received event sequence is an unacceptable reference sequence, then the received event sequence is added to the second set of unacceptable reference event sequences 96-100 to obtain a modified reference set.
- the received sequence [E1 E3 E7 E10 E1 E9] is added to the second set of unacceptable reference event sequences 96-100, because the received sequence [E1 E3 E7 E10 E1 E9] contained an anomaly and is also determined to be unacceptable.
- the modified reference set comprising the sequence [E1 E3 E7 E10 E1 E9] will be used for anomaly detection in the future.
- the justification of the aforementioned case 2 is also applicable for the received event sequence [E1 E10 E1 E9] according to example 4, wherein the received event sequence [E1 E10 E1 E9] will be added to the second set of unacceptable reference event sequences 96-100 to obtain a modified reference set.
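The two update cases can be sketched as one function that adds an anomalous sequence to whichever set holds its closest reference. This is a minimal sketch under stated assumptions: the reference sets are plain Python lists, the closest reference is supplied by the caller, and the function name is hypothetical. Only sequences quoted in the text are used as data.

```python
def update_reference_sets(sequence, closest, acceptable, unacceptable):
    """Add an anomalous sequence to the set that contains its closest reference."""
    if sequence == closest:
        return  # normal sequence: the reference set is left unchanged
    target = acceptable if closest in acceptable else unacceptable
    if sequence not in target:
        target.append(sequence)


acceptable = [
    ["E1", "E3", "E7", "E9"],         # reference 92
    ["E1", "E3", "E10", "E7", "E9"],  # reference 95
]
unacceptable = [
    ["E1", "E3", "E10", "E1", "E9"],  # reference 98
]

# Case 1: Example 3, closest reference 95 is acceptable.
update_reference_sets(["E1", "E3", "E10", "E4", "E7", "E9"], acceptable[1],
                      acceptable, unacceptable)
# Case 2: Example 4, closest reference 98 is unacceptable.
update_reference_sets(["E1", "E10", "E1", "E9"], unacceptable[0],
                      acceptable, unacceptable)
print(len(acceptable), len(unacceptable))
```

After the update, a later identical sequence matches a reference exactly, which is how the reference set adapts over time.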
- the method adaptively determines the anomaly of any received event sequence for any passenger in the airport 10.
- the CEP system 60 is a constantly learning system.
- the exemplary entity can be a living enti.
- the multi-stream environment may comprise a greater or smaller number of event sources without loss of generality.
- the exemplary reference set may comprise a larger number of acceptable reference event sequences and unacceptable event sequences without departing from the scope of the present invention.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The present invention relates to an adaptive method for processing an event sequence (59) in a CEP system (60) and a system (60) thereof. An event sequence (59) related to an entity (35) is received. The event sequence (59) is a sequence of a plurality of events (21-30) and each of the plurality of events (21-30) comprises an event label (41-50). The plurality of events (21-30) is sequenced responsive to the event labels (41-50) for obtaining the event sequence (59). The event sequence (59) is compared with each of a plurality of reference event sequences (91-100). A closest matching reference event sequence (96) for the event sequence (59) is determined. Therewith an anomaly in the event sequence (59) is determined if the event sequence (59) is non-identical to the closely matching reference event sequence (96). The event sequence (59) is then added to the plurality of reference event sequences (91-100) based on the determined anomaly.
Description
An adaptive method, for processing an event sequence in a complex event processing system and a system thereof
The present invention relates to the fi eld of event
processing, and in particularly to , an adaptive method processing an event sequence in a complex: event processing system and a system thereof .
A complex event processing f CEP) system is an intelligent system, which handles and processes a multitude of events to e1 ic i t meaningful information from the events . The events are received by the CEP system and the same are processed in order to identi. fy a consequence, an opportunity, a threat , et cetera , caused by the events . The CEP system is immensely used in a wide variety of systems and applications , for example, surveillance and monitoring systems , predictive systems , business process management systems , stock trading applications , futures and options trading applications , et cetera .
Numerous events and/or event sequences are handled and processed by the CEP system. An event sequence is a sequence of events, and the event sequence may contain one or more anomalies. Anomalous event sequences are those sequences that differ from predefined event sequences or that do not match a predefined model. Determination of such anomalies is a challenging task. Efficient determination of the anomalies improves both the quality and the accuracy of the CEP system. This is extremely beneficial, especially if the CEP system is used in real-time environments, for example surveillance and monitoring systems. Therefore, the events are to be processed in a facile manner to determine such anomalies.
US 20090210364 relates to an apparatus for and a method of generating complex event processing system rules. The patent application teaches a mechanism enabling a standard learning function to generate rules for CEP systems. The method creates rules based on previously defined output events by creating input event feature vectors for each targeted output event. The patent application also covers a method for automatically generating CEP system rules to infer output events which are anomalies of the input event sequences disclosed. However, the standard learning function approach disclosed in US 20090210364 is not only non-versatile and inflexible, but also inefficient for the determination of anomalies in event sequences that are handled by the CEP system. Furthermore, it is both resource-intensive and time-consuming.
It is an object of the present invention to propose a simple and efficient solution for processing an event sequence in a CEP system.
The above object is achieved by an adaptive method for processing an event sequence in a complex event processing system according to claim 1 and a system thereof according to claim 11.
The underlying idea of the present invention is to simplify the processing of an event sequence in a CEP system. An adaptive method for processing the event sequence is herein proposed. The event sequence comprising a plurality of events is received. The plurality of events is related to an entity. Each event comprises an event label. The plurality of events is sequenced responsive to the event labels, thereby obtaining an event sequence. The event sequence is compared with each of a plurality of reference event sequences for determining at least a closely matching reference event sequence. Events of the closely matching reference event sequence comprise the plurality of events to a substantial extent. Herein, an anomaly is determined if the event sequence is non-identical to the closely matching reference event sequence. The event sequence is then added to the plurality of reference event sequences based on the determined anomaly. By this, it is possible to determine anomalies present in the event sequence adaptively, thereby rendering accuracy and improved quality to the CEP system.
According to an embodiment of the present invention, the determined anomaly is an event absence anomaly, if a statistical frequency of an event in the event sequence is less than a statistical frequency of the event in the closely matching reference event sequence.
According to another embodiment of the present invention, the determined anomaly is a multiple event occurrence anomaly, if a statistical frequency of an event in the event sequence is more than a statistical frequency of the event in the closely matching reference event sequence.
According to yet another embodiment of the present invention, the determined anomaly is an extraneous event anomaly, if an event is only present in the received event sequence and is absent in the closely matching determined reference event sequence.
By determining the different types of anomalies, it is possible for the CEP system to characterize various activities performed by the entity. This is advantageous especially for CEP systems used for real-time applications.
According to yet another embodiment of the present invention, the plurality of reference event sequences comprises a first set of acceptable reference event sequences and a second set of unacceptable reference event sequences. An acceptable reference event sequence conforms to a sequence of activities that are permitted for the entity, and an unacceptable reference event sequence conforms to a sequence of activities that are not permitted for the entity. Herein, upon comparison, the event sequence is determined to be an acceptable event sequence if the first set of acceptable reference event sequences comprises the closely matching reference event sequence. Conversely, the event sequence is determined to be an unacceptable event sequence if the second set of unacceptable reference event sequences comprises the closely matching reference event sequence.
According to yet another embodiment of the present invention, the event sequence is added to the first set of acceptable reference event sequences if the event sequence is non-identical to the closely matching reference event sequence and the first set of acceptable reference event sequences comprises the closely matching reference event sequence. Otherwise, the event sequence is added to the second set of unacceptable reference event sequences if the event sequence is non-identical to the closely matching reference event sequence and the second set of unacceptable reference event sequences comprises the closely matching reference event sequence.
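The adaptive update described in this embodiment can be sketched as follows. This is a minimal illustration and not the claimed implementation: the function name `update_reference_sets`, the representation of a sequence as a tuple of event names, and the sample sets are all assumptions made for the example.

```python
def update_reference_sets(sequence, closest, acceptable, unacceptable):
    """Add `sequence` to the set that holds `closest`; return that set's name."""
    if sequence == closest:
        return None  # identical to the reference: no anomaly, nothing to add
    if closest in acceptable:
        target, name = acceptable, "acceptable"
    elif closest in unacceptable:
        target, name = unacceptable, "unacceptable"
    else:
        raise ValueError("closest match is not a known reference sequence")
    if sequence not in target:  # avoid storing the same sequence twice
        target.append(sequence)
    return name


# Hypothetical reference sets; each sequence is a tuple of event names.
acceptable = [("A", "B", "C")]
unacceptable = [("A", "C")]

result = update_reference_sets(("A", "B", "B", "C"), ("A", "B", "C"),
                               acceptable, unacceptable)
```

Here the anomalous sequence inherits the classification of its closest match, so a later occurrence of the same sequence would be matched identically.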
According to yet another embodiment of the present invention, each of the event labels is processed for sequencing the plurality of events. Herewith, the event sequence is determined. According to a preferred embodiment, each of the event labels is a timestamp. A timestamp denotes an instance of time at which the event was generated. Herewith, the plurality of events is chronologically sequenced based on the timestamps. By this, the temporal aspects of the events are maintained, which is beneficial for accurate event sequencing in multi-stream environments handling numerous types of events and entities.
According to yet another embodiment of the present invention, the event sequence and each of the plurality of reference event sequences are modelled as strings. The anomaly is determined by applying a string matching function to compare the event sequence with each of the plurality of reference event sequences. This is a faster way of comparing, and the processing time is thereby reduced.
According to yet another embodiment of the present invention, each of the plurality of events comprises a common identifier tag. The identifier tag characterizes the entity. This is advantageous in identifying and segregating the events related to a particular entity in a CEP system that may have to handle events for different entities.
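As an illustration of how a common identifier tag allows events of different entities to be segregated, consider the following minimal sketch. The dictionary layout of an event and the field names `identifier` and `name` are assumptions made for this example; the description does not prescribe a data format.

```python
from collections import defaultdict


def group_by_identifier(events):
    """Collect events per entity, keyed by their common identifier tag."""
    groups = defaultdict(list)
    for event in events:
        groups[event["identifier"]].append(event)
    return dict(groups)


# Hypothetical mixed stream carrying events of two entities.
events = [
    {"identifier": "entity-A", "name": "E1"},
    {"identifier": "entity-B", "name": "E1"},
    {"identifier": "entity-A", "name": "E3"},
]
groups = group_by_identifier(events)
names_for_a = [event["name"] for event in groups["entity-A"]]
```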
A CEP system is disclosed herein for processing an event sequence. The CEP system comprises a processor and a memory unit. The processor is configured to perform the method according to any of the aforementioned embodiments. The processor receives each of the plurality of events from a respective event source. The memory unit is operably coupled to the processor, and it comprises the plurality of reference event sequences.
The aforementioned and other embodiments of the invention related to an adaptive method for processing an event sequence in a complex event processing system and a system thereof will now be addressed with reference to the accompanying drawings of the present invention. The illustrated embodiments are intended to illustrate, but not to limit, the invention. The accompanying drawings contain the following figures, in which like numbers refer to like parts throughout the description and drawings. The figures illustrate in a schematic manner further examples of the embodiments of the invention, in which:
FIG 1 depicts a multi-stream environment comprising a plurality of event sources, wherein each event source generates an event related to an entity, wherein each event comprises a unique event label,
FIG 2 depicts a CEP system comprising a processor and a database, wherein the CEP system is interfaced to the plurality of event sources referred to in FIG 1 for receiving a plurality of events,
FIG 3 depicts reference event sequences comprised in the database referred to in FIG 2,
FIG 4 depicts an exemplary activity profile comprising the plurality of events received from the plurality of event sources for the entity referred to in FIG 1, and
FIG 5 depicts a flow chart of an adaptive method for processing an event sequence in the CEP system referred to in FIG 2.
FIG 1 depicts an exemplary multi-stream environment 10 comprising a plurality of event sources (ES1 to ES10) 11-20. Each of the event sources (ES1 to ES10) 11-20 generates a respective event (Ei) for an exemplary entity 35. This results in a plurality of events (E1 to E10) 21-30 for the exemplary entity 35.
For the purpose of simplifying the elucidation of the present invention, the exemplary multi-stream environment 10 depicted herein is an airport and the exemplary entity 35 depicted herein is a passenger at the airport 10. The aforementioned plurality of event sources (ES1 to ES10) 11-20 is a plurality of predefined locations in the airport 10. Each of the plurality of locations (ES1 to ES10) 11-20 generates an event (Ei) for the passenger 35 if a predefined activity has been performed by the passenger 35 at the respective location.
Each of the generated events (E1 to E10) 21-30 is given a unique event label (Li) upon the generation of the event (Ei). Thus, the result is a plurality of events (E1 to E10) 21-30 for the passenger 35.
Furthermore, each of the events (E1 to E10) 21-30 comprises an identifier 52, which is common to all the events (E1 to E10) 21-30 for the passenger 35. The identifier 52 serves to characterize the passenger 35. Herein, the identifier 52 is a Passenger Reference Number (PNR), because it specifically identifies the passenger 35 and, for example, the type of travel (international or domestic) that will be undertaken by the passenger 35. However, the identifier 52 may also be a passport number of the passenger 35, a ticket reference number of the passenger 35 or any other document number of the passenger 35 that uniquely identifies the passenger 35.
Herein, each of the events (E1 to E10) 21-30 may comprise information related to the passenger 35, the event type, and its unique event label (Li). The example mentioned below purports to illustrate the essence of the aforementioned in an exemplary context of the airport 10 for the passenger 35 if the passenger 35 performs the activity specific to each of the respective locations (ES1 to ES10) 11-20. Herein,
- the first location (ES1) 11 is an 'Entry gate', which generates the first event (E1) 21 comprising the event label (L1) 41 that represents 'Entry of the passenger 35', if the passenger 35 performs the activity of entering the airport 10 through the entry gate 11,
- the second location (ES2) 12 is a 'Baggage screening counter', which generates the second event (E2) 22 comprising the event label (L2) 42 that represents 'Screening of the passenger's 35 baggage', if the passenger 35 performs the activity of screening his baggage at the baggage screening counter 12,
- the third location (ES3) 13 is a 'Check-in counter', which generates the third event (E3) 23 comprising the event label (L3) 43 that represents 'Issue of a boarding pass to the passenger 35', if the passenger 35 performs the activity of checking in at the check-in counter 13,
- the fourth location (ES4) 14 is a 'Baggage counter', which generates the fourth event (E4) 24 comprising the event label (L4) 44 that represents 'Check-in baggage submitted by the passenger 35', if the passenger 35 performs the activity of submitting his check-in baggage at the baggage counter 14,
- the fifth location (ES5) 15 is an 'Immigration counter', which generates the fifth event (E5) 25 comprising the event label (L5) 45 that represents 'Immigration clearance of the passenger 35', if the passenger 35 performs the activity of presenting himself for immigration clearance at the immigration counter 15,
- the sixth location (ES6) 16 is a 'Customs counter', which generates the sixth event (E6) 26 comprising the event label (L6) 46 that represents 'Customs clearance of the passenger 35', if the passenger 35 performs the activity of presenting himself for customs clearance at the customs counter 16,
- the seventh location (ES7) 17 is a 'Security check counter', which generates the seventh event (E7) 27 comprising the event label (L7) 47 that represents 'Security check of the passenger 35', if the passenger 35 performs the activity of presenting himself for security clearance at the security check counter 17,
- the eighth location (ES8) 18 is a 'Baggage identification counter', which generates the eighth event (E8) 28 comprising the event label (L8) 48 that represents 'Baggage identification by the passenger 35', if the passenger 35 performs the activity of identifying his check-in baggage at the baggage identification counter 18,
- the ninth location (ES9) 19 is a 'Boarding gate', which generates the ninth event (E9) 29 comprising the event label (L9) 49 that represents 'Boarding of the passenger 35', if the passenger 35 performs the activity of passing through the boarding gate 19 for boarding an aircraft 53, and
- the tenth location (ES10) 20 is an 'Exit gate', which generates the tenth event (E10) 30 comprising the event label (L10) 50 that represents 'Exit of the passenger 35', if the passenger 35 performs the activity of exiting the airport 10 through the exit gate 20.
Each of the event labels (L1 to L10) 41-50 contains data, which may be a number or a code assigned to the event (Ei) at the location (ESi) where the event (Ei) is generated. The event labels (L1 to L10) facilitate the identification of the events (E1 to E10) and the arrangement of the generated events (E1 to E10) 21-30 specific to the passenger 35 in a sequential order. For example, the labels may contain the points in time at which the corresponding events occurred. Herewith, a sequential order of the events (E1 to E10) 21-30 for the passenger 35 determines an event sequence of the passenger 35. The event sequence defines a sequence of activities performed by the passenger 35 in the airport 10.
For example, in the aforementioned context of the passenger 35 in the airport 10, if the activities performed by the passenger 35 are 'Entry of a passenger 35' followed by 'Issue of a boarding pass to the passenger 35' followed by 'Security check of the passenger 35' and concluded by 'Boarding of the passenger 35', then this is represented by a first event sequence [E1 E3 E7 E9] 58. In another example, if the activities performed by the passenger are 'Entry of a passenger' followed by 'Issue of a boarding pass to the passenger' followed by 'Security check of the passenger' followed by 'Exit of the passenger' followed by 'Entry of a passenger' and concluded by 'Boarding of the passenger', then this is represented by a second event sequence [E1 E3 E7 E10 E1 E9] 59.
The event sequences 58, 59 may contain one or more anomalies. Herein, the term "anomaly" refers to any deviation in a sequence of events in the event sequence for the passenger 35 from a predefined event sequence. For example, such a deviation can be one or more missing events (E1 to E10) 21-30, a change in the order of occurrence of the events (E1 to E10) 21-30, or recurrence of certain events (E1 to E10) 21-30, et cetera.
FIG 2 depicts a CEP system 60 interfaced to each of the plurality of locations (ES1 to ES10) 11-20.
The CEP system 60 comprises a central processor 65, and the processor is operably coupled to a memory unit 70. The memory unit 70 is a database. The processor 65 receives each of the respective plurality of events (E1 to E10) 21-30 for the passenger 35. Each of the labels (L1 to L10) 41-50 of each of the plurality of events (E1 to E10) 21-30 is processed for determining the event sequence for the activities performed by the passenger 35.
According to an embodiment of the present invention, each of the event labels (L1 to L10) 41-50 comprises a respective timestamp (TSi), wherein the timestamp (TSi) denotes an instance of time at which the particular event (Ei) was generated, i.e. the instance of time at which the particular activity was performed at the particular location (ESi). Herewith, determining the event sequence by processing the event labels (L1 to L10) 41-50 of the plurality of events (E1 to E10) 21-30 comprises processing the timestamps (TS1 to TS10) 71-80 of the event labels (L1 to L10) 41-50 for sequencing the plurality of events (E1 to E10) 21-30. Thus, the event sequence is a chronological arrangement of the plurality of events (E1 to E10) 21-30, which in fact denotes the chronological sequence of activities performed by the passenger 35 at the airport 10. For example, the event (E1) 21 comprises the event label (L1) 41 comprising the timestamp (TS1) 71, and the event (E5) 25 comprises the event label (L5) 45 comprising the timestamp (TS5) 75. Herein, it is to be construed that the timestamp (TS5) 75 is subsequent to the timestamp (TS1) 71.
FIG 3 depicts the database 70 of the CEP system 60.
According to an embodiment, the database 70 comprises a reference set 90 comprising reference event sequences 91-100. One way of generating the reference event sequences 91-100 may be by processing an adaptive rule set (not shown), i.e. a learning rule set. The adaptive rule set, as used herein, comprises one or more rules, and the rules are event-specific, entity-specific, and environment-specific. For example, some of the rules may be as follows:
- for every event (Ei), a rule defining one or more events (E1 to E10) 21-30 that may precede and/or succeed the event (Ei),
- for every event (Ei), a rule defining one or more events (E1 to E10) 21-30 that should not precede and/or succeed the event (Ei),
- for every event (Ei), a rule defining valid time limits for the occurrence of one or more events (E1 to E10) 21-30 subsequent to the event (Ei),
- a rule defining event sequences that are acceptable,
- a rule defining event sequences that are unacceptable, et cetera.
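One of the rule kinds listed above, a rule about which events precede a given event, could be checked along the following lines. This is only a sketch under stated assumptions: the rule format (a mapping from an event to the events that must have occurred earlier), the function name `missing_predecessors`, and the specific rule "E7 before E9" are hypothetical and are not taken from the adaptive rule set itself, which is not shown.

```python
def missing_predecessors(sequence, required_before):
    """Return (position, event, missing) triples for every violated rule."""
    seen = set()
    problems = []
    for position, event in enumerate(sequence):
        # Check each event that is required to occur earlier in the sequence.
        for needed in sorted(required_before.get(event, ())):
            if needed not in seen:
                problems.append((position, event, needed))
        seen.add(event)
    return problems


# Hypothetical rule: the security check (E7) must occur before boarding (E9).
rules = {"E9": {"E7"}}

violations = missing_predecessors(["E1", "E4", "E5", "E9"], rules)
no_violations = missing_predecessors(["E1", "E3", "E7", "E9"], rules)
```

A sequence with no violations under all rules would be a candidate for the acceptable set, and one with violations a candidate for the unacceptable set.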
Based on the aforementioned adaptive rule set, the following ten exemplary reference event sequences 91-100 are generated in order to elucidate the present invention:
reference event sequence [E1 E2 E3 E4 E7 E9] 91
reference event sequence [E1 E3 E7 E9] 92
reference event sequence [E1 E3 E4 E5 Eg E7 E9] 93
reference event sequence [E1 E2 E3 E4 E7 E10 E1 E7 E ] 94
reference event sequence [E1 E3 E10 E7 E9] 95
reference event sequence [E1 E3 E7 E10 E1 E3 E9] 96
reference event sequence [E1 E4 E5 E9] 97
reference event sequence [E1 E3 E10 E1 E9] 98
reference event sequence [E1 E3 E4 E6 E9] 99
reference event sequence [E1 E3 E5 E4 E7 Eg E9] 100
The reference event sequences 91-100 are to be construed as the possible event sequences that may be determinable for the passenger 35 in the airport 10. The reference event sequences 91-100 comprise a first set of acceptable reference event sequences 91-95 and a second set of unacceptable reference event sequences 96-100.
An event sequence in the reference event sequences 91-100 is termed acceptable if the event sequence conforms to a sequence of activities that are permitted to be performed by the passenger 35 in the airport 10 as defined by the adaptive rule set. For example, a reference event sequence such as [E1 E2 E3 E4 E7 E9] 91 is permitted for the passenger 35 in the airport 10, because it conforms to the permitted activities for a passenger in the airport 10, thereby rendering the reference event sequence [E1 E2 E3 E4 E7 E9] 91 acceptable.
An event sequence in the reference event sequences 91-100 is termed unacceptable if the event sequence conforms to a sequence of activities that are not permitted to be performed by the passenger 35 in the airport 10. For example, a reference event sequence such as [E1 E4 E5 E9] 97 is not permitted in the airport 10, because the passenger 35 does not present himself for security clearance at the security check counter 17. Thereby the reference event sequence [E1 E4 E5 E9] 97 is an unacceptable event sequence.
The reference set 90 depicted in FIG 3 comprises five exemplary acceptable event sequences 91-95 and five exemplary unacceptable event sequences 96-100.
The event sequence determined for the passenger 35 is compared with each of the reference event sequences 91-100 in the database 70 for determining an anomaly in the event sequence for the passenger 35. This will be elucidated with reference to FIG 5.
The event sequence determined for the passenger 35 may be a normal event sequence or an anomalous event sequence. A normal event sequence is an acceptable event sequence, because it conforms to the rules as defined in the adaptive rule set. However, an anomalous event sequence differs from an acceptable event sequence as per the existing rules comprised in the adaptive rule set. The anomalous sequence is to be processed to determine whether the anomalous event sequence is acceptable or unacceptable.
FIG 4 depicts an exemplary activity profile 110 of the passenger 35, wherein the passenger 35 has performed the sequence of activities corresponding to the first event sequence 58. The activity profile 110 of the passenger 35 comprises the following:
- the plurality of generated events (E1 to E10) 21-30 corresponding to each of the activities performed by the passenger 35,
- the plurality of timestamps (TS1 to TS10) 71-80 assigned respectively to the plurality of generated events (E1 to E10) 21-30, and
- the event sequence corresponding to the plurality of generated events (E1 to E10) 21-30 for the passenger 35.
The activity profile 110 is assigned the PNR 52 of the passenger 35. Herewith, the activity profile 110 of the passenger 35 is unique in comparison with activity profiles for other passengers. Therefore, if the event sequence of the passenger 35 is determined to be anomalous, it is to be construed that the activity profile 110 of the passenger 35 is determined to be anomalous.
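The three constituents of the activity profile listed above, together with the assigned identifier, might be held in a structure such as the one sketched here. The class name `ActivityProfile`, its fields, the placeholder identifier, and the use of plain integers as timestamps are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ActivityProfile:
    identifier: str                              # e.g. the PNR of the passenger
    events: list = field(default_factory=list)   # (event name, timestamp) pairs

    def add_event(self, name, timestamp):
        self.events.append((name, timestamp))

    def event_sequence(self):
        """Event names in chronological order of their timestamps."""
        ordered = sorted(self.events, key=lambda pair: pair[1])
        return [name for name, _ in ordered]


# Hypothetical profile; timestamps are minutes after an arbitrary start.
profile = ActivityProfile(identifier="PNR-EXAMPLE")
profile.add_event("E3", 20)
profile.add_event("E1", 5)
profile.add_event("E9", 90)
profile.add_event("E7", 40)
sequence = profile.event_sequence()
```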
FIG 5 depicts a flow chart of an adaptive method for processing an event sequence in the CEP system 60 for determining the anomaly in the event sequence of the passenger 35. The central processor 65 of the CEP system 60 is configured to execute the adaptive method.
In a step 200 of the method, each of the plurality of events (E1 to E10) 21-30, which is generated by the respective plurality of locations (ES1 to ES10) 11-20 for the passenger 35 in the airport 10, is received. This may involve the central processor 65 communicating with each of the plurality of locations (ES1 to ES10) 11-20 to receive the respective events (E1 to E10) 21-30. Each of the plurality of events (E1 to E10) 21-30 comprises the unique event label (Li) comprising a timestamp (TSi) recorded when the event (Ei) is generated. The event label (Li) may be assigned to the respective event (Ei) by the respective location (ESi). Alternatively, the processor 65 may assign the event label (Li) if each of the plurality of locations (ES1 to ES10) 11-20 and the processor 65 are networked to operate in real time.
In a step 210, each of the event labels (L1 to L10) 41-50 of each of the plurality of events (E1 to E10) 21-30 is processed to chronologically sequence the plurality of events (E1 to E10) 21-30 of the passenger 35. This is achieved by processing each of the plurality of timestamps (TS1 to TS10) 71-80 of the corresponding labels (L1 to L10) 41-50 and sequencing the plurality of events (E1 to E10) 21-30 chronologically based on the timestamps (TS1 to TS10) 71-80. Herewith, an event sequence for the passenger 35 is determined, for example the above-mentioned first and second event sequences 58 and 59.
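Step 210 amounts to a sort on the timestamp carried in each event label. The following minimal sketch shows this for the first event sequence 58; the dictionary layout of an event and the concrete timestamp values are hypothetical and chosen only so that the events arrive out of order.

```python
from datetime import datetime


def sequence_events(events):
    """Order events chronologically by the timestamp in each event label."""
    ordered = sorted(events, key=lambda event: event["label"]["timestamp"])
    return [event["name"] for event in ordered]


# Hypothetical events for one passenger, received in arbitrary order.
events = [
    {"name": "E7", "label": {"timestamp": datetime(2013, 4, 12, 9, 40)}},
    {"name": "E1", "label": {"timestamp": datetime(2013, 4, 12, 9, 5)}},
    {"name": "E9", "label": {"timestamp": datetime(2013, 4, 12, 10, 30)}},
    {"name": "E3", "label": {"timestamp": datetime(2013, 4, 12, 9, 20)}},
]
first_sequence = sequence_events(events)
```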
In a step 220, the event sequence for the passenger 35 is received by the central processor 65 for processing.
In a step 230, the reference set 90 comprising the reference event sequences 91-100 for the passenger 35 is fetched from the database 70. The reference set 90 may be readily available or may be generated by processing the adaptive rule set.
In a step 240, the passenger's 35 event sequence generated in step 210 and received in step 220 is compared with each of the reference event sequences 91-100 of the reference set 90. Herein, at least one reference event sequence from the reference set 90 is determined, which closely matches the event sequence for the passenger 35. For example, the event sequence and the reference event sequences 91-100 are modelled as strings. Thereafter, any of the well-known string matching functions may be applied for comparing the received event sequence for the passenger 35 with each of the reference event sequences 91-100 in the reference set 90.
Herein, by the term "closely matches" in the context of the determined reference event sequence, it is to be construed, firstly, that the reference event sequence determined in step 240 comprises substantially the same events as the event sequence for the passenger 35. Secondly, the respective ordinal position of each of the events in the event sequence for the passenger 35 and in the determined reference event sequence is substantially similar, i.e. the sequence of the events is substantially order-preserved. Herein the term "substantially" means that the two sequences, viz. the event sequence of the passenger 35 and the determined reference event sequence, comprise nearly the same events and that the order of the events is also preserved. In short, the event sequence is anomalous if the event sequence is non-identical to the closest matching reference sequence determined from the reference event sequences.
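A possible reading of step 240 is sketched below. The description leaves the string matching function open; this sketch substitutes the similarity ratio of `difflib.SequenceMatcher` from the Python standard library, which rewards shared events in preserved order. The one-letter encoding, the function names, and the choice of a subset of the reference set 90 are assumptions made for the example.

```python
from difflib import SequenceMatcher


def encode(sequence):
    """Model an event sequence as a string: E1 -> 'A', E2 -> 'B', ..., E10 -> 'J'."""
    return "".join(chr(ord("A") + int(name[1:]) - 1) for name in sequence)


def closest_reference(received, references):
    """Return (numeral, similarity) of the reference most similar to `received`."""
    target = encode(received)
    scores = {
        numeral: SequenceMatcher(None, target, encode(reference)).ratio()
        for numeral, reference in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Subset of the reference set 90, keyed by reference numeral.
references = {
    91: ["E1", "E2", "E3", "E4", "E7", "E9"],
    92: ["E1", "E3", "E7", "E9"],
    95: ["E1", "E3", "E10", "E7", "E9"],
    96: ["E1", "E3", "E7", "E10", "E1", "E3", "E9"],
    97: ["E1", "E4", "E5", "E9"],
    98: ["E1", "E3", "E10", "E1", "E9"],
}

match_for_58 = closest_reference(["E1", "E3", "E7", "E9"], references)
```

A similarity of 1.0 indicates an identical reference sequence; any lower value for the best match indicates an anomalous sequence in the sense used above.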
In the step 240, by processing the received event sequence for the passenger 35 and the closely matching determined reference event sequence using the string matching function, the received event sequence will be marked "normal" or "anomalous" and "acceptable" or "unacceptable". If the received event sequence for the passenger 35 and the closely matching determined reference event sequence are identical, then the received event sequence for the passenger 35 does not contain an anomaly and is flagged as a normal sequence. The received event sequence may be an acceptable event sequence or an unacceptable event sequence depending on the type of determined reference event sequence.
However, if the received event sequence for the passenger 35 and the closely matching determined reference event sequence are not identical but only match substantially, then it is determined that the received event sequence comprises one or more anomalies, and it is flagged as an anomalous sequence. The received event sequence can still be an acceptable event sequence or an unacceptable event sequence depending on the type of anomaly present in it.
In a step 250, a type of anomaly present in the received event sequence for the passenger 35 is determined. Herein, the one or more anomalies in the received event sequence are determined based on the one or more outputs of the string matching function. The following types of anomalies are determined based on the one or more outputs:
- for any event (Ei), if a statistical frequency of the event (Ei) in the received event sequence is less than a statistical frequency of the event (Ei) in the closely matching determined reference event sequence, then the determined anomaly in the received event sequence is an event absence anomaly,
- for any event (Ei), firstly if the event (Ei) is present in both the received event sequence and the closely matching determined reference event sequence, and secondly if a statistical frequency of the event (Ei) in the received event sequence is more than a statistical frequency of the event (Ei) in the closely matching determined reference event sequence, then the determined anomaly in the received event sequence is a multiple event occurrence anomaly, and
- for any event (Ei), if the event (Ei) is only present in the received event sequence and absent in the closely matching determined reference event sequence, then the determined anomaly in the received event sequence is an extraneous event anomaly.
The statistical frequency of any event (Ei) present in the passenger's 35 event sequence is determined by counting the number of times the event (Ei) is present in the event sequence. This is performed by the central processor 65.
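The three anomaly types of step 250 follow directly from comparing per-event counts. The sketch below is one way to express this, assuming sequences are lists of event names; the function name `classify_anomalies` and the returned label strings are illustrative choices, not terms defined by the method.

```python
from collections import Counter


def classify_anomalies(received, reference):
    """Compare per-event frequencies and name the anomaly for each event."""
    got, expected = Counter(received), Counter(reference)
    anomalies = []
    for event in sorted(set(got) | set(expected)):
        if got[event] < expected[event]:
            anomalies.append((event, "event absence"))
        elif event not in expected:
            # Present in the received sequence only.
            anomalies.append((event, "extraneous event"))
        elif got[event] > expected[event]:
            # Present in both, but more often in the received sequence.
            anomalies.append((event, "multiple event occurrence"))
    return anomalies


extraneous = classify_anomalies(
    ["E1", "E3", "E10", "E4", "E7", "E9"], ["E1", "E3", "E10", "E7", "E9"])
absence = classify_anomalies(
    ["E1", "E10", "E1", "E9"], ["E1", "E3", "E10", "E1", "E9"])
repeated = classify_anomalies(["E1", "E3", "E3", "E9"], ["E1", "E3", "E9"])
```

Because only frequencies are compared, a sequence that contains the same events as the reference in a different order yields an empty list here, even though it is non-identical to the reference.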
The aforementioned is elucidated with the help of the following examples:
Example 1: suppose a received event sequence is the first event sequence [E1 E3 E7 E9] 58. According to step 240, the first event sequence [E1 E3 E7 E9] 58 is compared with each of the reference event sequences 91-100. The reference event sequence that closely matches the first event sequence [E1 E3 E7 E9] 58 is the reference event sequence [E1 E3 E7 E9] 92. In this case, the received event sequence of the passenger 35 is identical to an acceptable reference sequence. Therefore, the received event sequence of the passenger 35 is not anomalous, which implies that the activities performed by the passenger 35 in the airport 10 conform to the permitted activities for a passenger in the airport 10. Herewith, the activity profile 110 of the passenger 35 is determined to be non-anomalous.
Example 2: suppose a received event sequence is the second event sequence [E1 E3 E7 E10 E1 E9] 59. According to step 240, the second event sequence [E1 E3 E7 E10 E1 E9] 59 is compared with each of the reference event sequences 91-100. The reference event sequence that closely matches the second event sequence [E1 E3 E7 E10 E1 E9] 59 is the reference event sequence [E1 E3 E7 E10 E1 E3 E9] 96. In this case, the received event sequence of the passenger 35 is substantially identical to an unacceptable reference sequence. Therefore, the received event sequence of the passenger 35 is both unacceptable and anomalous. The type of anomaly determined in the second event sequence [E1 E3 E7 E10 E1 E9] 59 is a multiple event occurrence anomaly, because the statistical frequency of the event (E3) in the second event sequence is higher than the statistical frequency of the event (E3) in the determined reference event sequence. Herewith, the activity profile 110 of the passenger 35 is determined to be anomalous as well as unacceptable.
Example 3: suppose a received event sequence is [E1 E3 E10 E4 E7 E9]. The received event sequence [E1 E3 E10 E4 E7 E9] is compared with each of the reference event sequences 91-100. The reference event sequence that closely matches the received event sequence is the reference event sequence [E1 E3 E10 E7 E9] 95. In this case, the received event sequence of the passenger 35 is substantially identical to an acceptable reference sequence. Therefore, the received event sequence of the passenger 35 is acceptable but anomalous. The type of anomaly determined in the received event sequence is an extraneous event anomaly, because the event (E4) is present in the received sequence but absent in the reference event sequence 95. However, the event (E4) is an acceptable event in the framework of the airport 10. Herewith, the activity profile 110 of the passenger 35 is determined to be anomalous but acceptable.
Example 4: suppose a received event sequence is [E1 E10 E1 E9]. According to step 240, the received event sequence [E1 E10 E1 E9] is compared with each of the reference event sequences 91-100. The reference event sequence that most closely matches the received event sequence is the reference event sequence [E1 E3 E10 E1 E9] 98. In this case, the received event sequence of the passenger 35 is substantially identical to an unacceptable reference sequence. Therefore, the received event sequence of the passenger 35 is unacceptable as well as anomalous. The type of anomaly determined in the received event sequence is an event absence anomaly, because the statistical frequency of the event (E3) in the received event sequence is lower than the statistical frequency of the event (E3) in the reference event sequence 98. Furthermore, the absence of an important event (E3) is not acceptable in the framework of an airport. Herewith, the activity profile 110 of the passenger 35 is determined to be anomalous as well as unacceptable.
In a step 250, based on the type of anomaly determined in the received event sequence of the passenger 35, the reference set 90 is modified to obtain a modified reference set.
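The comparison and anomaly typing walked through in Examples 1 to 4 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method as claimed: the description does not fix a particular string matching function, so `difflib.SequenceMatcher` from the Python standard library stands in for it; the names `closest_reference` and `classify_anomalies` are hypothetical; and only the four reference event sequences quoted in the examples (92, 95, 96, 98) are included, not the full reference set 90.

```python
from collections import Counter
from difflib import SequenceMatcher

# Assumption: only the reference event sequences quoted in Examples 1 to 4.
ACCEPTABLE = {
    92: ["E1", "E3", "E7", "E9"],
    95: ["E1", "E3", "E10", "E7", "E9"],
}
UNACCEPTABLE = {
    96: ["E1", "E3", "E7", "E10", "E1", "E3", "E9"],
    98: ["E1", "E3", "E10", "E1", "E9"],
}


def closest_reference(received, references):
    """Return the (number, sequence) pair most similar to `received`.

    Similarity is SequenceMatcher.ratio(), a stand-in for the unspecified
    string matching function of step 240.
    """
    return max(
        references.items(),
        key=lambda item: SequenceMatcher(None, received, item[1]).ratio(),
    )


def classify_anomalies(received, reference):
    """Compare per-event frequencies and name the anomaly for each event."""
    got, want = Counter(received), Counter(reference)
    findings = []
    for event in sorted(set(got) | set(want)):
        if want[event] == 0:
            findings.append(("extraneous event", event))
        elif got[event] > want[event]:
            findings.append(("multiple event occurrence", event))
        elif got[event] < want[event]:
            findings.append(("event absence", event))
    return findings


references = {**ACCEPTABLE, **UNACCEPTABLE}
for received in (["E1", "E3", "E10", "E4", "E7", "E9"],   # Example 3
                 ["E1", "E10", "E1", "E9"]):              # Example 4
    number, reference = closest_reference(received, references)
    verdict = "acceptable" if number in ACCEPTABLE else "unacceptable"
    print(number, verdict, classify_anomalies(received, reference))
```

Under these assumptions, the received sequence of Example 3 is matched to reference 95 and reported with an extraneous event E4, and the received sequence of Example 4 is matched to reference 98 and reported with an absent event E3. An empty result from `classify_anomalies` corresponds to the non-anomalous case of Example 1.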
Herein, the reference set is modified according to the following cases:
Case 1: If the received event sequence comprises any of the aforementioned types of anomalies, and if the determined closest matching reference sequence for the received event sequence is an acceptable reference sequence, then the received event sequence is added to the first set of acceptable reference event sequences 91-95 to obtain a modified reference set. Referring to the aforementioned example 3, the received sequence [E1 E3 E10 E4 E7 E9] is added to the first set of acceptable reference event sequences 91-95, because although the received sequence [E1 E3 E10 E4 E7 E9] contained an anomaly, it is determined to be acceptable. Herewith, the modified reference set comprising the sequence [E1 E3 E10 E4 E7 E9] will be used for anomaly detection in the future.
Case 2: If the received event sequence comprises any of the aforementioned types of anomalies, and if the determined closest matching reference sequence for the received event sequence is an unacceptable reference sequence, then the received event sequence is added to the second set of unacceptable reference event sequences 96-100 to obtain a modified reference set. Referring to the aforementioned example 2, the received sequence [E1 E3 E7 E10 E1 E9] is added to the second set of unacceptable reference event sequences 96-100, because the received sequence [E1 E3 E7 E10 E1 E9] contained an anomaly and is also determined to be unacceptable. Herewith, the modified reference set comprising the sequence [E1 E3 E7 E10 E1 E9] will be used for anomaly detection in the future.
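Cases 1 and 2 can be sketched as a small update rule. This is an illustrative sketch only: the class `ReferenceSet` and its method `learn` are hypothetical names, the closest matching reference sequence is assumed to have been determined already in step 240, and the two starting references are the sequences 95 and 98 quoted in Examples 3 and 4.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceSet:
    """Hypothetical container for the two sets of reference event sequences."""
    acceptable: list = field(default_factory=list)
    unacceptable: list = field(default_factory=list)

    def learn(self, received, closest):
        """Apply Case 1 or Case 2 to a received sequence and its closest match."""
        if received == closest:
            return "non-anomalous"          # identical: nothing to add
        if closest in self.acceptable:      # Case 1
            self.acceptable.append(list(received))
            return "anomalous but acceptable"
        if closest in self.unacceptable:    # Case 2
            self.unacceptable.append(list(received))
            return "anomalous and unacceptable"
        raise ValueError("closest match is not part of the reference set")


refs = ReferenceSet(
    acceptable=[["E1", "E3", "E10", "E7", "E9"]],      # reference 95
    unacceptable=[["E1", "E3", "E10", "E1", "E9"]],    # reference 98
)
print(refs.learn(["E1", "E3", "E10", "E4", "E7", "E9"], refs.acceptable[0]))
print(refs.learn(["E1", "E10", "E1", "E9"], refs.unacceptable[0]))
```

Because the anomalous sequence is stored alongside its closest match, a later identical sequence finds an exact match in the modified reference set, which is one way to read the statement that the modified set is used for anomaly detection in the future.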
The justification of the aforementioned case 2 is also applicable for the received event sequence [E1 E10 E1 E9] according to example 4, wherein the received event sequence [E1 E10 E1 E9] will be added to the second set of unacceptable reference event sequences 96-100 to obtain a modified reference set. Thus, by adding the anomalous received event sequence either to the first set of acceptable reference event sequences 91-95 or to the second set of unacceptable reference event sequences 96-100 based on the determined anomaly, the method adaptively determines the anomaly of any received event sequence for any passenger in the airport 10. Herewith, the CEP system 60 is a constantly learning system.
The exemplary entity can be a living entity or a non-living entity capable of producing an event in a multi-stream environment. The multi-stream environment may comprise a greater or smaller number of event sources without loss of generality. The exemplary reference set may comprise a greater number of acceptable reference event sequences and unacceptable reference event sequences without departing from the scope of the present invention.
Though the invention has been described with reference to specific embodiments, this description is not meant to be construed in a limiting sense. Various examples of the disclosed embodiments, as well as alternate embodiments of the invention, will become apparent to persons skilled in the art upon reference to the description of the invention. It is therefore contemplated that such modifications can be made without departing from the embodiments of the present invention as defined.
Claims
1. An adaptive method for processing an event sequence (59) in a complex event processing (CEP) system (60), the method comprising:
- a step (220) of receiving the event sequence (59), wherein the event sequence (59) comprises a plurality of events (21-30), wherein each of the plurality of events (21-30) is related to a respective predefined activity of an entity (35), wherein each of the plurality of events (21-30) comprises an event label (41-50), wherein the plurality of events (21-30) is sequenced responsive to the event labels (41-50) for obtaining the event sequence (59),
- a step (240) of comparing the event sequence (59) with each of a plurality of reference event sequences (91-100) for determining a closest matching reference event sequence (96) out of the plurality of reference event sequences (91-100), wherein events of the closest matching reference event sequence (96) substantially comprise the plurality of events (21-30) of the event sequence (59),
- a step (250) of determining an anomaly in the event sequence (59) if the event sequence (59) is non-identical to the closest matching reference event sequence (96), and
- a step (260) of adding the event sequence (59) to the plurality of reference event sequences (91-100) based on the determined anomaly.
2. The method according to claim 1, wherein the determined anomaly is an event absence anomaly, if for an event (Ei) in the event sequence (59), a statistical frequency of the event (Ei) in the event sequence (59) is less than a statistical frequency of the event (Ei) in the closest matching reference event sequence (96).
3. The method according to claim 1, wherein the determined anomaly is a multiple event occurrence anomaly, if for an event (Ei) in the event sequence (59), the event (Ei) is also present in the determined closest matching reference event sequence (96), and a statistical frequency of the event (Ei) in the received event sequence (59) is more than a statistical frequency of the event (Ei) in the closest matching reference event sequence (96).
4. The method according to claim 1, wherein the determined anomaly is an extraneous event anomaly, if for an event (Ei), the event (Ei) is only present in the received event sequence (59) and is absent in the determined closest matching reference event sequence (96).
5. The method according to any of the claims 1 to 4, wherein the plurality of reference event sequences (91-100) comprises a first set of acceptable reference event sequences (91-95) and a second set of unacceptable reference event sequences (96-100), wherein an acceptable reference event sequence conforms to a sequence of activities that are permitted for the entity (35), and wherein an unacceptable reference event sequence conforms to a sequence of activities that are not permitted for the entity (35),
wherein in the step (240) of comparing the event sequence (59),
- the event sequence (59) is determined to be an acceptable event sequence if the first set of acceptable reference event sequences (91-95) comprises the closest matching reference event sequence (96), and
- the event sequence (59) is determined to be an unacceptable event sequence if the second set of unacceptable reference event sequences (96-100) comprises the closest matching reference event sequence (96).
6. The method according to claim 5, wherein in the step (260) of adding the event sequence (59) to the plurality of reference event sequences (91-100),
- the event sequence (59) is added to the first set of acceptable reference event sequences (91-95), if the event sequence (59) is non-identical to the closest matching reference event sequence (96), wherein the first set of acceptable reference event sequences (91-95) comprises the closest matching reference event sequence (96), and
- the event sequence (59) is added to the second set of unacceptable reference event sequences (96-100), if the event sequence (59) is non-identical to the closest matching reference event sequence (96), wherein the second set of unacceptable reference event sequences (96-100) comprises the closest matching reference event sequence (96).
7. The method according to any of the claims 1 to 6, further comprising:
- a step (210) of processing each of the event labels (41-50) of each of the first plurality of events (21-30) for sequencing the first plurality of events (21-30) to determine the event sequence (59), wherein the step (210) of processing the event sequence (59) precedes the step (220) of receiving the event sequence (59).
8. The method according to claim 7, wherein each of the event labels (41-50) of each of the first plurality of events (21-30) is a timestamp (71-80), wherein the timestamp (71-80) denotes an instance of time at which the event (Ei) was generated, wherein in the step (210) of processing, the timestamps (71-80) are processed for sequencing the plurality of events (21-30) chronologically based on the timestamps (71-80).
9. The method according to any of the claims 1 to 8, wherein the step (240) of comparing comprises:
- modelling the event sequence (59) and each of the plurality of reference event sequences (91-100) as strings, and
- applying a string matching function to the event sequence (59) and each of the plurality of reference event sequences (91-100) for comparing the event sequence (59) with each of the plurality of reference event sequences (91-100) to determine the anomaly in the event sequence (59).
10. The method according to any of the claims 1 to 9, wherein each of the plurality of events (21-30) comprises a common identifier tag (52), wherein the identifier tag (52) characterizes the entity (35).
11. A CEP system (60) for processing an event sequence (59), the CEP system (60) comprising:
- a processor (65) configured to perform the method according to any of the claims 1 to 10, wherein the processor (65) is configured to receive each of the plurality of events (21-30) from a respective event source (11-20), and
- a memory unit (70) operably coupled to the processor (65), wherein the memory unit (70) comprises the plurality of reference event sequences (91-100).
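The chronological sequencing of claims 7 and 8, the common identifier tag of claim 10, and the string modelling of claim 9 can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: the `Event` record, the functions `event_sequence` and `as_string`, the tag value "P35", the numeric timestamps, and the mapping of event names to single characters are not taken from the claims.

```python
from typing import NamedTuple


class Event(NamedTuple):
    name: str         # hypothetical event name such as "E3"
    timestamp: float  # event label per claim 8: time the event was generated
    entity_tag: str   # common identifier tag characterizing the entity


def event_sequence(events, entity_tag):
    """Order one entity's events chronologically and return their names."""
    own = [e for e in events if e.entity_tag == entity_tag]
    return [e.name for e in sorted(own, key=lambda e: e.timestamp)]


def as_string(sequence):
    """Model a sequence as a string, one character per event name.

    Assumption: names have the form "E<number>"; E1 maps to "a", E2 to "b",
    and so on, so that multi-digit names such as E10 stay unambiguous.
    """
    return "".join(chr(ord("a") + int(name[1:]) - 1) for name in sequence)


events = [
    Event("E7", 30.0, "P35"),
    Event("E1", 10.0, "P35"),
    Event("E2", 15.0, "P36"),   # belongs to a different entity
    Event("E9", 40.0, "P35"),
    Event("E3", 20.0, "P35"),
]
sequence = event_sequence(events, "P35")
print(sequence, as_string(sequence))
```

Mapping each event to a single character is one possible design choice; it lets any ordinary string matching function operate on the sequences without splitting a name such as E10 into separate symbols.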
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE112013002401.2T DE112013002401T5 (en) | 2012-05-08 | 2013-04-12 | Adaptive method of processing an event sequence in a complex event processing system and a system thereof |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN518/KOL/2012 | 2012-05-08 | ||
IN518KO2012 | 2012-05-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013167344A1 true WO2013167344A1 (en) | 2013-11-14 |
Family
ID=48184155
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2013/057700 WO2013167344A1 (en) | 2012-05-08 | 2013-04-12 | An adaptive method for processing an event sequence in a complex event processing system and a system thereof |
Country Status (2)
Country | Link |
---|---|
DE (1) | DE112013002401T5 (en) |
WO (1) | WO2013167344A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018037411A1 (en) * | 2016-08-24 | 2018-03-01 | B. G. Negev Technologies And Applications Ltd., At Ben-Gurion University | Model for detection of anomalous discrete data sequences |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007131545A2 (en) * | 2005-12-09 | 2007-11-22 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | A method and apparatus for automatic comparison of data sequences |
US20090210364A1 (en) | 2008-02-20 | 2009-08-20 | Asaf Adi | Apparatus for and Method of Generating Complex Event Processing System Rules |
-
2013
- 2013-04-12 WO PCT/EP2013/057700 patent/WO2013167344A1/en active Application Filing
- 2013-04-12 DE DE112013002401.2T patent/DE112013002401T5/en not_active Withdrawn
Non-Patent Citations (3)
Title |
---|
T. LANE, C. E. BRODLEY: "Temporal sequence learning and data reduction for anomaly detection", ACM TRANSACTIONS ON INFORMATION AND SYSTEM SECURITY, vol. 2, no. 3, August 1999 (1999-08-01), pages 296 - 331, XP002907154, DOI: 10.1145/322510.322526 * |
V. CHANDOLA, A. BANERJEE, V. KUMAR: "Anomaly detection for discrete sequences: a survey", IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, vol. 24, no. 5, 18 November 2010 (2010-11-18), pages 823 - 839, XP011440359, DOI: 10.1109/tkde.2010.235 * |
Z. XING, J. PEI, E. KEOGH: "A brief survey on sequence classification", ACM SIGKDD EXPLORATIONS NEWSLETTER, vol. 12, no. 1, June 2010 (2010-06-01), pages 40 - 48, XP055046500, DOI: 10.1145/1882471.1882478 * |
Also Published As
Publication number | Publication date |
---|---|
DE112013002401T5 (en) | 2015-02-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 13718530; Country of ref document: EP; Kind code of ref document: A1 |
 | WWE | Wipo information: entry into national phase | Ref document number: 112013002401; Country of ref document: DE. Ref document number: 1120130024012; Country of ref document: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 13718530; Country of ref document: EP; Kind code of ref document: A1 |