CN107025486A - A kind of event detection system and method - Google Patents


Info

Publication number
CN107025486A
Authority
CN
China
Prior art keywords
attribute
event data
behavior
time
test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710109380.6A
Other languages
Chinese (zh)
Other versions
CN107025486B (en)
Inventor
朱大立
陈均煌
娄乔
荆鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN201710109380.6A priority Critical patent/CN107025486B/en
Publication of CN107025486A publication Critical patent/CN107025486A/en
Application granted granted Critical
Publication of CN107025486B publication Critical patent/CN107025486B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Embodiments of the present invention provide an event detection system and method. The system includes an identification attribute testing module, a position attribute testing module and a behavior attribute testing module, which test the identification attribute information, the position attribute information and the behavior attribute information in event data in sequence and obtain target event data according to the test results. By using a distributed real-time event detection model and method based on a classification approach, the embodiments can rapidly extract target events from multidimensional monitoring data streams generated incrementally, in real time and at large scale, with very low detection latency.

Description

Event detection system and method
Technical Field
The embodiment of the invention relates to the field of data processing, in particular to an event detection system and method.
Background
In order to master the real-time status of a moving target object (including articles, intelligent terminals, automobiles, people, and the like), a corresponding monitoring system is often required to be constructed to cooperate with various information sources such as an environmental sensor and an operation log to realize target event detection for the target object.
In the prior art, monitoring data is mainly processed by Complex Event Processing (CEP), which feeds each piece of continuously arriving data into a detection engine as a single event. The detection engine matches specific event combinations against predefined event patterns; the matching covers the semantics of the events themselves, the event sequence, the order of events, and so on. Event patterns are generally declared in an SQL-like language, and the supported operations include conversion, logic, sequence, aggregation and window operations. With a complex event processing system, the user maps the target event to a corresponding combination of operations in advance and submits it to the detection engine; the continuously arriving monitoring data stream then flows through the engine for event pattern matching, which achieves real-time detection of the target event.
In the prior art, defining an event pattern in a complex event processing scheme requires strong domain knowledge, and the pattern cannot be constructed through autonomous learning, so such schemes are difficult to apply widely. In addition, because of the limits of single-machine processing capacity, a complex event processing engine has difficulty coping with large-scale monitoring data streams.
Disclosure of Invention
Embodiments of the present invention provide an event detection system and method to solve the problems in the prior art that defining an event pattern for event detection requires strong domain knowledge and that the processing engine, limited by single-machine processing capacity, has difficulty coping with large-scale monitoring data streams.
According to an aspect of an embodiment of the present invention, there is provided an event detection system including:
the identification attribute testing module is used for reading corresponding attribute information from the obtained event data, carrying out identification attribute testing on the input event data so as to distribute the event data to corresponding identification sets, and sending the event data to position groups corresponding to the identification sets; the attribute information at least comprises identification attribute information, position attribute information and behavior attribute information;
the position attribute testing module comprises a plurality of position groups, each position group comprises a plurality of position area groups, and the position attribute testing module is used for receiving the event data sent by the identification attribute testing module, carrying out position attribute testing on the event data so as to distribute the event data to the corresponding position area group, and sending the event data to the behavior group corresponding to the position area group; the number of the position groups is the same as that of the identification sets;
and the behavior attribute testing module comprises a plurality of behavior groups and is used for receiving the event data sent by the position attribute testing module, performing behavior attribute testing on the event data and acquiring target event data according to a testing result, wherein the number of the behavior groups is the same as that of the position area groups.
The event detection system also comprises a time attribute testing module; correspondingly, the identification attribute testing module is used for sending the event data to a time group corresponding to the identification set; the number of the time groups is the same as that of the identification sets;
the time attribute testing module comprises a plurality of time groups, each time group comprises a plurality of time interval groups, and the time attribute testing module is used for receiving the event data sent by the identification attribute testing module, carrying out time attribute testing on the event data so as to distribute the event data to the corresponding time interval group, and sending the event data to the position group corresponding to the time interval group; correspondingly, the location attribute testing module is used for receiving the event data sent by the time attribute testing module.
Wherein the position attribute testing module, the time attribute testing module and the behavior attribute testing module use the consistent hash result of the identification attribute information of the event data as the basis for selecting a processing node within a group.
Wherein the behavior attribute information is the behavior set of all behavior monitoring results, $\vec{a} = (a_1, a_2, \dots, a_n)$; correspondingly, performing the behavior attribute test on the event data comprises:
performing the behavior attribute test by applying a threshold judgment to the weighted sum of the behaviors in the event data, $\vec{w} \cdot \vec{a} = \sum_{i=1}^{n} w_i a_i > Th$, wherein $\vec{w} = (w_1, w_2, \dots, w_n)$ is the weight vector of the behavior set, each weight component takes a value in the range $w_i \in [-1, 1]$, and $Th$ is a preset threshold; when the test result in the behavior attribute test is greater than the preset threshold $Th$, the event is judged to have occurred.
Wherein the system is built on a distributed streaming computing engine.
In another aspect, an embodiment of the present invention provides an event detection method, including: an identification attribute test, namely reading corresponding attribute information from a plurality of acquired event data, performing the identification attribute test on the input event data to distribute the event data to corresponding identification sets, and sending the event data to the position groups corresponding to the identification sets; the attribute information at least comprises identification attribute information, position attribute information and behavior attribute information;
a position attribute test, which involves a plurality of position groups, each position group comprising a plurality of position area groups, namely receiving the event data that has passed the identification attribute test, performing the position attribute test on the event data to distribute the event data to the corresponding position area groups, and sending the event data to the behavior groups corresponding to the position area groups; the number of the position groups is the same as that of the identification sets;
and a behavior attribute test, which involves a plurality of behavior groups, namely performing the behavior attribute test on the event data and acquiring target event data according to the test result.
Wherein the method further comprises a time attribute test; correspondingly, after the identification attribute is tested, the event data is sent to the time group corresponding to the identification set; the number of the time groups is the same as that of the identification sets;
the time attribute test involves a plurality of time groups, each time group comprising a plurality of time interval groups, namely receiving the event data that has passed the identification attribute test and performing the time attribute test on it, so that the event data are distributed to the corresponding time interval groups, and sending the event data to the position groups corresponding to the time interval groups; correspondingly, the position attribute test receives the event data that has passed the time attribute test.
Wherein the position attribute test, the time attribute test and the behavior attribute test use the consistent hash result of the identification attribute information of the event data as the basis for selecting a processing node within a group.
Wherein the behavior attribute information is the behavior set of all behavior monitoring results, $\vec{a} = (a_1, a_2, \dots, a_n)$; correspondingly, performing the behavior attribute test on the event data comprises:
performing the behavior attribute test by applying a threshold judgment to the weighted sum of the behaviors in the event data, $\vec{w} \cdot \vec{a} = \sum_{i=1}^{n} w_i a_i > Th$, wherein $\vec{w} = (w_1, w_2, \dots, w_n)$ is the weight vector of the behavior set, each weight component takes a value in the range $w_i \in [-1, 1]$, and $Th$ is a preset threshold; when the test result in the behavior attribute test is greater than the preset threshold $Th$, the event is judged to have occurred.
Wherein the event detection method is built on a distributed stream computing engine.
According to the event detection system and method provided by the embodiments of the invention, the data in a monitoring data stream is separated and abstracted into several kinds of attribute information, such as identification, position and behavior, and an easily described rule-based classification model is introduced. The detection rules in the model can be defined manually according to domain knowledge, or extracted from a training data set by an autonomous learning algorithm, which addresses the problem that event detection requires strong domain knowledge. Target events can also be extracted quickly from multidimensional monitoring data streams generated incrementally, in real time and at large scale.
Drawings
Fig. 1 is a block diagram of an event detection system according to an embodiment of the present invention;
Fig. 2 is a block diagram of an event detection system according to yet another embodiment of the present invention;
Fig. 3 is a topology diagram of a detection model implementation of an event detection system according to another embodiment of the present invention;
Fig. 4 is a flowchart of an event detection method according to another embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
Fig. 1 is a block diagram of an event detection system according to an embodiment of the present invention. As shown in Fig. 1, the system includes an identification attribute testing module 11, a location attribute testing module 12, and a behavior attribute testing module 13, where:
the identification attribute testing module 11 is configured to read corresponding attribute information from the acquired plurality of event data, perform the identification attribute test on the input event data to allocate the event data to corresponding identification sets, and send the event data to the position groups corresponding to the identification sets; the attribute information includes at least identification attribute information, location attribute information, and behavior attribute information.
In a specific implementation, the monitoring data stream for a specific target object contains three types of attribute information: the monitored target's identifier, its position, and its behavior. The embodiment of the invention therefore summarizes an input data instance x of the event detection model as:
$x = (i, p, \vec{a})$
where $i$ represents the unique identification (id for short) of the monitored object; $p$ represents the position; and $\vec{a}$ represents the behaviors (actions) of the monitored target in a given time period, as a vector whose elements take the value 0 or 1, where 0 means the corresponding behavior was not observed and 1 means it was observed.
The attribute tests of the detection model are performed hierarchically, from top to bottom, according to the detection rules. First, the identification attribute of every data item in the monitoring data stream is tested. The identification of a monitored object is a nominal attribute, so this dimension is partitioned into discrete sets, and the attribute test can be expressed as:
$i \in \{i_1, i_2, \dots, i_N\}$
After the test is completed, the module distributes the event data to the corresponding position group according to the test result for the next test. If the message does not belong to any id set in the rule set, a default rule is triggered, which generally discards the message directly.
Through the module, object sets suitable for different monitoring rules can be effectively distinguished, detection objects are classified according to preset rules, and on the other hand, construction of an event detection model is simplified through stipulation of detection data attributes.
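The identification attribute test and its default rule can be illustrated with a minimal Python sketch. The id sets, the group naming scheme and the function names below are hypothetical and are not taken from the patent; the sketch only shows set membership followed by routing to the one position group associated with each identification set.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical rule set: each identification set lists the monitored-object ids it covers.
ID_SETS: Dict[str, FrozenSet[str]] = {
    "fleet_a": frozenset({"car-1", "car-2"}),
    "staff": frozenset({"badge-7"}),
}


def identification_test(object_id: str) -> Optional[str]:
    """Return the name of the identification set containing object_id, or None.

    None models the default rule: the message matches no id set and is discarded.
    """
    for set_name, members in ID_SETS.items():
        if object_id in members:
            return set_name
    return None


def route(event: dict) -> Optional[str]:
    """Map an event to the position group of its identification set (one group per set)."""
    set_name = identification_test(event["id"])
    if set_name is None:
        return None  # default rule: drop the message
    return f"position-group:{set_name}"
```

Because there is exactly one position group per identification set, the set name alone is enough to pick the next group in this sketch.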
The location attribute testing module 12 includes a plurality of location groups, each location group includes a plurality of location area groups, and is configured to receive the event data sent by the identification attribute testing module, and perform a location attribute test on the event data, so as to allocate the event data to a corresponding location area set, and send the event data to a behavior group corresponding to the location area group; wherein the number of the position groups is the same as the number of the identification sets.
In a specific implementation, the location is a region attribute, and the definition of a location region depends on its topological representation. For example, a two-dimensional rectangular region may be defined as $P = (p_{11}, p_{12}, p_{21}, p_{22})$, where $p_{11}$, $p_{12}$, $p_{21}$ and $p_{22}$ are the four vertices of the rectangular region, and a three-dimensional spherical region may be defined as $P = (p_c, r)$, where $p_c$ is the position of the center of the sphere and $r$ is its radius. For convenience, the embodiment of the present invention uniformly denotes the location region definition as $P$, so the attribute test can be expressed as:
$p \in P$
The location attribute test allows critical areas to be monitored with particular focus.
Through the module, the position region information which the event belongs to when the event occurs can be effectively detected, meanwhile, the detection rules of different monitoring strengths can be implemented in a mode of setting key monitoring position regions, and the accuracy of the event detection result is improved.
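The two example region shapes can be sketched as membership tests in Python. This is an illustration under stated assumptions: the rectangle is taken to be axis-aligned and is stored as its minimum and maximum coordinates rather than as four vertices, and the class names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rectangle:
    """Axis-aligned 2-D region (assumption: stored as min/max corners)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, p: Tuple[float, float]) -> bool:
        x, y = p
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


@dataclass(frozen=True)
class Sphere:
    """3-D spherical region P = (p_c, r)."""
    center: Tuple[float, float, float]
    radius: float

    def contains(self, p: Tuple[float, float, float]) -> bool:
        # Euclidean distance from the center must not exceed the radius.
        return math.dist(self.center, p) <= self.radius
```

Both classes expose the same `contains` method, which mirrors the uniform notation p ∈ P regardless of the region's topology.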
The behavior attribute testing module 13 includes a plurality of behavior groups and is configured to receive the event data sent by the location attribute testing module, perform behavior attribute testing on the event data, and obtain target event data according to the test result, where the number of behavior groups is the same as the number of location area groups.
In specific implementation, the behavior information is a vector with an element value of 0 or 1, 0 indicates that a corresponding behavior is not monitored, 1 indicates that a corresponding behavior is monitored, the behavior of the monitored object is subjected to attribute testing, and after the test is passed, the behavior attribute testing module outputs a corresponding event record.
Through the module, the target event can be rapidly acquired in event detection, and a distributed event detection model system based on a classification idea can deal with large-scale multidimensional monitoring data streams, and is suitable for a larger event detection range and has higher event detection performance.
On the basis of the above embodiment, the event detection system further includes a time attribute testing module. Fig. 2 is a block diagram of an event detection system according to another embodiment of the present invention. As shown in Fig. 2, the system includes an identification attribute testing module 21, a time attribute testing module 22, a location attribute testing module 23 and a behavior attribute testing module 24.
The identification attribute testing module 21 is configured to read corresponding attribute information from the acquired plurality of event data, perform the identification attribute test on the input event data to allocate the event data to corresponding identification sets, and send the event data to the time groups corresponding to the identification sets; the attribute information includes at least identification attribute information, time attribute information, location attribute information, and behavior attribute information.
The time attribute testing module 22 comprises a plurality of time groups, each time group comprises a plurality of time interval groups, and is used for receiving the event data sent by the identification attribute testing module, and performing time attribute testing on the event data, so as to allocate the event data to the corresponding time interval group, and send the event data to the position group corresponding to the time interval group; correspondingly, the location attribute testing module is used for receiving the event data sent by the time attribute testing module; wherein the number of the time packets is the same as the number of the identification sets.
The location attribute testing module 23 includes a plurality of location groups, each location group includes a plurality of location area groups, and is configured to receive the event data sent by the time attribute testing module, and perform a location attribute test on the event data, so as to allocate the event data to a corresponding location area group, and send the event data to the behavior group corresponding to the location area group.
The behavior attribute testing module 24 is configured to receive the event data sent by the location attribute testing module, perform behavior attribute testing on the event data, and obtain target event data according to a testing result.
In a specific implementation, the operation of the identification attribute testing module 21 is substantially the same as in the above embodiment and is not described again here. The difference is that the input data instance x of the event detection model is summarized as:
$x = (i, t, p, \vec{a})$
where $i$ represents the unique identification (id) of the monitored object; $t$ represents the current time; $p$ represents the position; and $\vec{a}$ represents the behaviors (actions) of the monitored target in a given time period, as a vector whose elements take the value 0 or 1, where 0 means the corresponding behavior was not observed and 1 means it was observed.
After the event data is allocated to the corresponding identification set, it is sent to the time group corresponding to that identification set.
In the time attribute testing module 22, time is an interval attribute, so this dimension is partitioned into intervals, and the attribute test can be expressed as:
$t \in [t_s, t_e]$
The purpose of this test is to allow monitoring policies to be set at the granularity of time intervals, for example applying different policies for day and night. For event data that passes the time attribute test, the time attribute testing module sends the event data to the position group corresponding to the time interval group.
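The interval test t ∈ [t_s, t_e] with a day policy and a night policy can be sketched in Python as follows. The interval boundaries, the group names and the handling of an interval that wraps past midnight are assumptions made for illustration; the patent does not specify them.

```python
from datetime import time
from typing import List, Optional, Tuple


def in_interval(t: time, start: time, end: time) -> bool:
    """Closed-interval test; an interval with start > end is assumed to wrap past midnight."""
    if start <= end:
        return start <= t <= end
    return t >= start or t <= end


# Hypothetical time interval groups for one identification set.
INTERVAL_GROUPS: List[Tuple[str, time, time]] = [
    ("day", time(6, 0), time(18, 0)),
    ("night", time(18, 0), time(6, 0)),
]


def time_test(t: time) -> Optional[str]:
    """Return the position group for the first matching time interval group, else None."""
    for name, start, end in INTERVAL_GROUPS:
        if in_interval(t, start, end):
            return f"position-group:{name}"
    return None
```

Since both intervals are closed, a boundary instant such as 18:00 matches more than one interval; this sketch resolves the tie by taking the first match in list order.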
The operations of the location attribute testing module 23 and the behavior attribute testing module 24 are consistent with the above embodiment and are not described again here.
With this system, a time attribute test is added to event detection, so that detection can target the time at which an event occurs, which improves both the precision and the real-time performance of event detection.
On the basis of the above embodiment, the location attribute testing module, the time attribute testing module, and the behavior attribute testing module use the consistent hash result of the identification attribute information of the event data as the basis for selecting a processing node within a group.
In a specific implementation, the model uses the consistent hash result of the monitored object's id as the basis for selecting a processing node within a group: in the time attribute testing module, the location attribute testing module and the behavior attribute testing module, the system selects a specific processing node within the time attribute group, the location attribute group and the behavior attribute group, respectively, according to the consistent hash result of the id.
In this way, the load of the nodes within a group is balanced, dynamic changes in node parallelism are accommodated, and monitoring messages of the same monitored object are routed to the same node in a group as far as possible.
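Node selection by consistent hashing of the monitored object's id can be sketched as a hash ring in Python. The class name, the use of MD5 and the number of virtual nodes are assumptions chosen for illustration; the patent names the technique but gives no implementation details.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


class ConsistentHashRing:
    """Minimal hash ring with virtual nodes (a sketch, not the patented implementation)."""

    def __init__(self, nodes: Iterable[str], replicas: int = 64) -> None:
        self._replicas = replicas
        self._ring: List[Tuple[int, str]] = []  # sorted (hash, node) points
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        # Each node owns several points on the ring to even out the load.
        for r in range(self._replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{r}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def select(self, object_id: str) -> str:
        """Return the node owning the first ring point at or after hash(object_id)."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_left(self._ring, (self._hash(object_id), ""))
        return self._ring[idx % len(self._ring)][1]
```

The property that matters here is that changing the parallelism of a group only remaps the ids whose ring successor changed: after `add_node`, every id either stays on its previous node or moves to the new one, so most monitored objects keep their node-local state.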
On the basis of the above embodiment, the behavior attribute information is the behavior set of all behavior monitoring results, $\vec{a} = (a_1, a_2, \dots, a_n)$; correspondingly, performing the behavior attribute test on the event data comprises:
performing the behavior attribute test by applying a threshold judgment to the weighted sum of the behaviors in the event data, $\vec{w} \cdot \vec{a} = \sum_{i=1}^{n} w_i a_i > Th$, wherein $\vec{w} = (w_1, w_2, \dots, w_n)$ is the weight vector of the behavior set, each weight component takes a value in the range $w_i \in [-1, 1]$, and $Th$ is a preset threshold; when the test result in the behavior attribute test is greater than the preset threshold $Th$, the event is judged to have occurred.
In a specific implementation, the behavior attribute describes the overall behavior of the monitored target in a given time period, and the behavior of one monitored target is a set of multiple behaviors. For example, in a monitoring scenario whose monitored behavior set is (action1, action2, action3, action4, action5, action6), if the target object exhibits action1, action3 and action6 in a certain time period, its behavior attribute value can be represented as the 6-dimensional vector:
$\vec{a} = (1, 0, 1, 0, 0, 1)$
For a target object, the event decision at a specific time and a specific position depends on the combined information of its various behaviors. Because different behaviors contribute to the event decision to different degrees, the embodiment of the invention performs the attribute test by applying a threshold judgment to the weighted sum of behaviors:
$\vec{w} \cdot \vec{a} = \sum_{i=1}^{n} w_i a_i > Th$
where $\vec{w} = (w_1, w_2, \dots, w_n)$ is the weight vector of the behavior set, each weight component takes a value in the range $w_i \in [-1, 1]$, and $Th$ is a preset threshold; when the test result in the behavior attribute test is greater than the preset threshold $Th$, the event is judged to have occurred. A negative weight indicates that the corresponding behavior has an explanatory effect on the event, that is, it counts against the judgment, whereas a positive weight is evidence for the event judgment; the user can set a weight component to 0 to exclude irrelevant behaviors.
With this system, the weighted-sum form is sufficient to represent any combination of behaviors, which makes the event judgment more accurate. The behavior attribute test above can also be interpreted as a linear classifier model from statistics, $f(x) = \operatorname{sign}(w \cdot x + b)$, which in practical implementations may be replaced by other classification models, such as a decision tree model or a logistic regression model.
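The thresholded weighted sum can be sketched in a few lines of Python. The weight values and the threshold below are hypothetical examples; only the 0/1 behavior encoding, the weight range [-1, 1] and the strict comparison against Th come from the text.

```python
from typing import Sequence


def behavior_test(actions: Sequence[int], weights: Sequence[float], threshold: float) -> bool:
    """Return True when the weighted sum of observed behaviors exceeds the threshold."""
    if len(actions) != len(weights):
        raise ValueError("actions and weights must have the same length")
    if any(a not in (0, 1) for a in actions):
        raise ValueError("each behavior component must be 0 or 1")
    if any(not -1.0 <= w <= 1.0 for w in weights):
        raise ValueError("each weight must lie in [-1, 1]")
    score = sum(w * a for w, a in zip(weights, actions))
    return score > threshold


# Hypothetical weights: action2 and action4 are irrelevant (weight 0),
# action5 counts against the event (negative weight).
WEIGHTS = (0.5, 0.0, 0.4, 0.0, -0.3, 0.2)
THRESHOLD = 0.8
```

With the example vector (1, 0, 1, 0, 0, 1) the score is 0.5 + 0.4 + 0.2 = 1.1, which exceeds the threshold of 0.8, so the test passes; a target showing only action3 and action5 scores 0.1 and does not.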
On the basis of the above embodiment, the system is built on a distributed stream type computing engine.
In a specific implementation, the detection model designed in the embodiment of the present invention is built on a distributed stream type computation engine, each detection rule in the model is implemented as a parallel processing flow graph, and each conjunction clause in the rule is implemented as a series processing node in the processing flow graph. The processing node performs attribute test on each input attribute message, and selects the next node route or outputs the final detection result according to the test result.
By the method, the target event can be quickly extracted from the multidimensional monitoring data stream generated by large-scale real-time increment, and the problem that a complex event processing engine is difficult to deal with the large-scale detection data stream is solved.
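The mapping from a detection rule to a chain of attribute tests can be sketched in Python as follows. This is a single-process illustration with hypothetical names and a hypothetical rule; in the described system each clause would run on its own processing node in a distributed stream computing engine, and the rules would be evaluated as parallel flow graphs.

```python
from typing import Callable, List, Optional

Clause = Callable[[dict], bool]
Rule = Callable[[dict], Optional[dict]]


def make_rule(name: str, clauses: List[Clause]) -> Rule:
    """Build a rule whose conjunctive clauses are evaluated in series."""
    def run(event: dict) -> Optional[dict]:
        for clause in clauses:
            if not clause(event):
                return None  # a failed clause stops the chain
        return {"rule": name, "id": event["id"]}
    return run


def detect(event: dict, rules: List[Rule]) -> List[dict]:
    """Evaluate every rule on the event and collect the event records produced."""
    records = []
    for rule in rules:
        record = rule(event)
        if record is not None:
            records.append(record)
    return records


# Hypothetical rule: identification clause, then position clause, then behavior clause.
RESTRICTED_AREA = make_rule("restricted-area", [
    lambda e: e["id"] in {"car-1", "car-2"},
    lambda e: 0 <= e["position"][0] <= 10 and 0 <= e["position"][1] <= 10,
    lambda e: sum(e["actions"]) >= 2,
])
```

Ordering the clauses from the cheapest and most selective test to the most expensive one lets most messages leave the chain early, which is consistent with the top-down hierarchy described above.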
Fig. 3 is a topology diagram of a detection model implementation of an event detection system according to another embodiment of the present invention.
Each detection rule in the model is realized as a parallel processing flow graph, and each conjunction clause in the rule is realized as a processing node which is connected in series in the processing flow graph. The processing node performs attribute test on each input attribute message, and selects the next node route or outputs the final detection result according to the test result. The implementation topology of the model mainly comprises four modules of Kafka Spout, TimeBolt, PositionBolt and ActionsBolt:
Kafka Spout is the input module of the model. It reads attribute messages from Apache Kafka in parallel and performs the monitored-object id attribute test on each message read. After the test, the message is distributed to the TimeBolts group corresponding to its id set for the next processing step. If the message does not belong to any id set in the rule set, a default rule is triggered, which typically discards the message directly. The parallelism of the Kafka Spout module is generally set to match the number of partitions of the corresponding topic in Kafka.
TimeBolt is the time attribute testing module of the model. When the topology is defined, a corresponding number of TimeBolts groups is created according to the number of id sets in the model's rule set. To cope with large-scale attribute message streams, each TimeBolts group contains several TimeBolt nodes with the same test logic, each of which processes part of the data, and the user can set the parallelism of the TimeBolts as needed. To balance the load of the nodes within a group and adapt to dynamic changes in node parallelism, while ensuring that monitoring messages of the same monitored object are routed to the same TimeBolt node as far as possible, Kafka Spout uses the consistent hashing result of the monitored object id as the basis for selecting a node within the group. Similarly, TimeBolt routes the attribute message to the corresponding PositionBolts group according to the time attribute test result, and the node within that group is also selected by the consistent hash result of the monitored object id.
PositionBolt is the position attribute testing module of the model. Conceptually, it maintains the latest position information of each monitored object from the position tuples it receives, so that it can perform the position attribute test for a behavior tuple and distribute that tuple to a processing node in the corresponding ActionsBolts group according to the test result and the consistent hash result of the monitored object id. In the actual implementation, however, PositionBolt does not explicitly store the position information of the monitored object; it stores the routing information of the instance data directly in a routing table. Specifically, when PositionBolt receives a position tuple, it performs the position attribute test on it; if the test result differs from the entry in the routing table, the routing information for that monitored object id is updated, otherwise nothing is done. When a behavior tuple is received, the routing information of the next processing node can therefore be obtained directly from the monitored object id.
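The routing-table behavior described for PositionBolt can be sketched in Python. The class and method names are hypothetical and the sketch is independent of any stream-engine API; it only shows that position tuples update a per-object route when the tested region changes, and that behavior tuples are routed by a table lookup.

```python
from typing import Callable, Dict, Optional, Tuple

Point = Tuple[float, float]


class PositionRouter:
    """Keeps, per monitored object, the region (route) found by the last position test."""

    def __init__(self, regions: Dict[str, Callable[[Point], bool]]) -> None:
        self._regions = regions              # region name -> membership test
        self._routes: Dict[str, str] = {}    # object id -> region name

    def on_position(self, object_id: str, p: Point) -> bool:
        """Test a position tuple; return True only if the routing entry changed."""
        region: Optional[str] = next(
            (name for name, contains in self._regions.items() if contains(p)), None
        )
        if self._routes.get(object_id) == region:
            return False  # same result as the routing table: no operation
        if region is None:
            self._routes.pop(object_id, None)
        else:
            self._routes[object_id] = region
        return True

    def on_behavior(self, object_id: str) -> Optional[str]:
        """Look up the route for a behavior tuple without re-testing the position."""
        return self._routes.get(object_id)
```

The boolean returned by `on_position` corresponds to a routing change, which is the moment at which a position area change notification would be passed downstream.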
Wherein ActionsBolt is a behavior attribute testing module of the model. Since multiple behavior components of the same instance arrive discretely, a set of behaviors is maintained for each monitored object. When a new behavior tuple is received, the actionbolt module will add it to the behavior collection of the corresponding monitoring object and trigger the behavior attribute test. If the property test is passed, the illustrative instance hits a rule in the rule set, and thus, ActionBolt outputs a corresponding event record. ActionsBolt determines instance boundaries through a Sliding Window mechanism. Specifically, the trigger conditions for window sliding include two types, namely time interval change and position area change, due to the concept of behavior monitoring for a specific area at a specific time. The time interval change information is determined by ActionsBolt comprehensively according to the newly received behavior tuple time and the time interval given by the rule set. The location area change information is provided by the PositionBolt and is usually triggered when the routing information is changed.
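The per-object behavior set and its window reset can be sketched in Python. Several simplifications here are assumptions rather than details from the patent: time intervals are modeled as fixed-length windows, behaviors are identified by their index in the weight vector, and the class and method names are hypothetical.

```python
from typing import Dict, Optional, Sequence, Set, Tuple


class ActionsAggregator:
    """Accumulates behaviors per monitored object and tests them on every new tuple."""

    def __init__(self, weights: Sequence[float], threshold: float, interval_seconds: float) -> None:
        self._weights = weights
        self._threshold = threshold
        self._interval = interval_seconds
        self._state: Dict[str, Tuple[int, Set[int]]] = {}  # id -> (window index, behaviors)

    def on_region_change(self, object_id: str) -> None:
        """Position area change reported upstream: start a new instance for this object."""
        self._state.pop(object_id, None)

    def on_behavior(self, object_id: str, action_index: int, timestamp: float) -> Optional[dict]:
        """Add one behavior tuple and return an event record if the weighted sum exceeds Th."""
        window = int(timestamp // self._interval)
        prev_window, actions = self._state.get(object_id, (window, set()))
        if prev_window != window:
            actions = set()  # time interval changed: the window slides
        actions.add(action_index)
        self._state[object_id] = (window, actions)
        score = sum(self._weights[i] for i in actions)
        if score > self._threshold:
            return {"id": object_id, "actions": sorted(actions), "window": window}
        return None
```

Resetting the set on either trigger keeps behaviors observed in a previous interval or a previous area from contributing to the current instance.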
The data source in the model implementation, Apache Kafka, may be replaced with another distributed messaging component such as RocketMQ. Accordingly, KafkaSpout, which reads attribute messages from the messaging component and performs the monitored object id test, may be replaced with the corresponding Spout component.
As shown in fig. 4, fig. 4 is a flowchart of an event detection method according to another embodiment of the present invention, where the method includes:
step 401: identification attribute test: reading the corresponding attribute information from a plurality of acquired event data, performing the identification attribute test on the input event data so as to assign the event data to the corresponding identification set, and sending the event data to the position group corresponding to that identification set;
step 402: position attribute test, carried out by a plurality of position groups, each of which comprises a plurality of position area groups: receiving the event data that has passed the identification attribute test, performing the position attribute test on it so as to assign the event data to the corresponding position area group, and sending the event data to the behavior group corresponding to that position area group;
step 403: behavior attribute test, carried out by a plurality of behavior groups: performing the behavior attribute test on the event data and obtaining the target event data according to the test result.
In a specific implementation, the monitoring data stream for a specific target object carries three types of attribute information: the target identifier, the position, and the target behavior. The embodiment of the invention therefore summarizes an input data instance x of the event detection model as:
$x = (i, p, \vec{a})$
where i is the unique identifier (id) of the monitored object; p is its position; and $\vec{a}$ is the behavior (actions) of the monitored object in a given time period, represented as a vector whose elements are 0 or 1, where 0 means the corresponding behavior was not observed and 1 means it was observed.
The attribute tests of the detection model are performed hierarchically, from top to bottom, according to the detection rules. First, the identification attribute of every data item in the monitoring data stream is tested. The identifier of a monitored object is a nominal attribute, so this dimension is partitioned into discrete sets, and the attribute test can be expressed as:
$i \in \{i_1, i_2, \ldots, i_N\}$
After the test is completed, the module dispatches the event data to the corresponding position group according to the test result for the next test. If the message does not belong to any id set in the rule set, a default rule is triggered; the default rule generally discards the message.
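The identification attribute test and its default rule can be illustrated with a short Python sketch. The function name `route_by_id` and the example id sets are hypothetical; returning `None` stands in for the default rule that discards the message.

```python
def route_by_id(obj_id, id_sets):
    """Return the name of the id set containing obj_id.

    id_sets maps a (hypothetical) set name to a collection of object ids.
    None means no set matched, i.e. the default rule applies and the
    message is discarded.
    """
    for name, ids in id_sets.items():
        if obj_id in ids:
            return name
    return None


# Hypothetical id sets taken from an imagined rule set.
ID_SETS = {"staff": {"u1", "u2"}, "visitors": {"v1"}}
```

The returned set name identifies which downstream group should receive the message for the next attribute test.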
After a position group receives the event data, it performs the position attribute test on the data. Position is a region attribute, and how a position region is defined depends on its topological representation. For example, a two-dimensional rectangular region can be defined as $P = (p_{11}, p_{12}, p_{21}, p_{22})$, where $p_{11}$, $p_{12}$, $p_{21}$ and $p_{22}$ are the four vertices of the rectangle, and a three-dimensional spherical region can be defined as $P = (p_c, r)$, where $p_c$ is the center of the sphere and r is its radius. For convenience, the embodiment of the present invention writes every position region definition as P, so the attribute test can be expressed as:
$p \in P$
The position attribute test allows critical areas to be monitored with particular focus. After the position attribute test is completed, the event data is sent to the behavior group corresponding to the position area group.
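The two region definitions mentioned above lead to two simple membership tests, sketched here in Python. The function names are hypothetical, and the rectangle test assumes the four vertices describe an axis-aligned rectangle, which the text does not state.

```python
import math


def in_rectangle(p, vertices):
    """Test p = (x, y) against a rectangle given by its four vertices.

    Assumes the rectangle is axis-aligned, so the bounding box of the
    vertices is the rectangle itself.
    """
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    return min(xs) <= p[0] <= max(xs) and min(ys) <= p[1] <= max(ys)


def in_sphere(p, center, r):
    """Test p = (x, y, z) against a sphere with the given center and radius."""
    return math.dist(p, center) <= r
```

A rotated rectangle or another topology would need its own membership test, but the routing logic around it would stay the same.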
In the behavior attribute test, the behavior information is a vector whose elements are 0 or 1, where 0 means the corresponding behavior was not observed and 1 means it was observed. The attribute test is performed on the behavior of the monitored object, and if the test passes, the behavior attribute testing module outputs the corresponding event record.
With this method, the data in a monitoring data stream is separated and abstracted from the classification perspective of data mining and summarized into attribute values such as object identifier, position, and behavior. On the one hand, a rule-based classification model that is easy to describe is introduced, so the detection rules in the model can be defined manually from domain knowledge or extracted from a training data set by an autonomous learning algorithm. On the other hand, the event detection method provided by the embodiment of the invention can quickly obtain the target event and, through a distributed event detection model system based on the classification idea, can cope with large-scale multidimensional monitoring data streams. It is therefore suitable for a wider range of event detection tasks and offers higher event detection performance.
On the basis of the above embodiment, the method provided by the embodiment of the present invention further includes a time attribute test. Accordingly, after the identification attribute test, the event data is sent to the time group corresponding to the identification set; the number of time groups is the same as the number of identification sets;
the time attribute test is carried out by a plurality of time groups, each of which comprises a plurality of time interval groups: the event data that has passed the identification attribute test is received and subjected to the time attribute test so as to assign it to the corresponding time interval group, and is then sent to the position group corresponding to that time interval group; correspondingly, the position attribute test receives the event data that has passed the time attribute test.
In a specific implementation, the input data instance x of the event detection model is then summarized as:
$x = (i, t, p, \vec{a})$
where i is the unique identifier (id) of the monitored object; t is the current time; p is its position; and $\vec{a}$ is the behavior (actions) of the monitored object in a given time period, represented as a vector whose elements are 0 or 1, where 0 means the corresponding behavior was not observed and 1 means it was observed.
The event data that has passed the identification attribute test is sent to the time group corresponding to the identification set. In the time attribute test, time is an interval attribute, so this dimension is partitioned into intervals, and the test can be expressed as:
$t \in [t_s, t_e]$
The purpose of this test is to allow monitoring strategies to be set at the granularity of time intervals, for example different strategies for day and night. For event data that passes the time attribute test, the time attribute testing module sends the event data to the position group corresponding to the time interval group.
By adding the time attribute test to event detection, the method targets the time at which events occur, which improves detection precision and can also improve the real-time performance of event detection.
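The interval test for the time attribute can be sketched in Python as follows. The function name `in_interval` is hypothetical, and the handling of intervals that wrap past midnight (useful for a night-time strategy) is an assumption added for the example; the text only specifies membership in a closed interval.

```python
from datetime import time


def in_interval(t, start, end):
    """Return True if time-of-day t lies in the closed interval [start, end].

    If start > end, the interval is assumed to wrap past midnight,
    e.g. 22:00 to 06:00 for a night-time monitoring strategy.
    """
    if start <= end:
        return start <= t <= end
    return t >= start or t <= end


# Hypothetical day and night intervals.
DAY = (time(8, 0), time(18, 0))
NIGHT = (time(22, 0), time(6, 0))
```

Each interval would correspond to one time interval group, with its own downstream position group.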
On the basis of the above embodiment, in the method provided in the embodiment of the present invention, the position attribute test, the time attribute test, and the behavior attribute test use the consistent hash result of the identification attribute information of the event data as the basis for selecting the processing node within a group.
In a specific implementation, the data source in the model implementation uses the consistent hash result of the monitored object id as the basis for selecting a node within a group: in the time attribute testing module, the position attribute testing module, and the behavior attribute testing module, the system selects the specific processing node within the time attribute group, position attribute group, and behavior attribute group according to the consistent hash result of the id.
In this way, the load is balanced across the nodes in a group, dynamic changes in node parallelism are accommodated, and the monitoring messages of the same monitored object are routed to the same node in a group as far as possible.
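Node selection by consistent hashing of the monitored object id can be sketched as a hash ring with virtual nodes. This is a generic textbook construction, not necessarily the one used in the embodiment: the class name `HashRing`, the SHA-256 hash, and the number of virtual nodes per processing node are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Hypothetical consistent-hash ring for picking a node within a group."""

    def __init__(self, nodes, replicas=64):
        self.replicas = replicas  # virtual nodes per processing node (assumed)
        self._keys = []           # sorted hash positions on the ring
        self._ring = {}           # hash position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        """Place the node's virtual nodes on the ring."""
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            self._ring[h] = node
            bisect.insort(self._keys, h)

    def remove(self, node):
        """Take the node's virtual nodes off the ring."""
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            del self._ring[h]
            self._keys.remove(h)

    def lookup(self, obj_id):
        """Return the node owning obj_id: the next position clockwise."""
        h = self._hash(obj_id)
        idx = bisect.bisect(self._keys, h) % len(self._keys)
        return self._ring[self._keys[idx]]
```

When a node is added, only the ids that now fall on the new node's positions move; all other ids keep their previous node, which is what lets the messages of one monitored object stay on one node while the group is resized.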
On the basis of the foregoing embodiment, in the method provided in the embodiment of the present invention, the behavior attribute information is the behavior set $\vec{a}$ of all behavior monitoring results. Correspondingly, performing the behavior attribute test on the event data comprises:
performing the behavior attribute test by applying a threshold to the weighted sum of the behaviors in the event data, $\vec{w} \cdot \vec{a} > Th$, where $\vec{w}$ is the weight vector for the behavior set, each weight component in the weight vector takes a value $w_i \in [-1, 1]$, and Th is a preset threshold; when the test result in the behavior attribute test is greater than the preset threshold Th, the action is judged to have occurred.
In a specific implementation, the behavior attribute describes the overall behavior of the monitored object in a given time period, so the behavior of one monitored object is a set of multiple behaviors. For example, in a monitoring scenario whose monitored behavior set is (action1, action2, action3, action4, action5, action6), if the target object performs action1, action3, and action6 in a certain time period, its behavior attribute value can be represented as the 6-dimensional vector:
$\vec{a} = (1, 0, 1, 0, 0, 1)$
For a target object, the event decision at a specific time and a specific position depends on the combined information of its various behaviors. Because different behaviors contribute to the event decision to different degrees, the embodiment of the invention performs the attribute test by applying a threshold to the weighted sum of the behaviors:
$\vec{w} \cdot \vec{a} > Th$
where $\vec{w}$ is the weight vector for the behavior set, each weight component in the weight vector takes a value $w_i \in [-1, 1]$, and Th is a preset threshold; when the test result in the behavior attribute test is greater than the preset threshold Th, the action is judged to have occurred. A negative weight means the corresponding behavior counts as evidence against the event, a positive weight means it counts as evidence for the event, and the user can set a weight component to 0 to exclude an irrelevant behavior.
In this way, any combination of behaviors can be represented as a weighted sum, which makes the event decision more accurate. The behavior attribute test $\vec{w} \cdot \vec{a} > Th$ can also be interpreted as a linear classifier model from statistics, $f(x) = \operatorname{sign}(w \cdot x + b)$. In practical implementations it may be replaced by other classification models, such as a decision tree model or a logistic regression model.
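The weighted-sum threshold test can be worked through in a few lines of Python. The function name `behavior_test` and the example weights and threshold are hypothetical values chosen for illustration; only the 0/1 behavior vector for action1, action3, and action6 and the weight range [-1, 1] come from the text. In the linear-classifier reading, the bias b corresponds to -Th.

```python
def behavior_test(actions, weights, threshold):
    """Return True if the weighted sum of 0/1 behaviors exceeds the threshold."""
    if len(actions) != len(weights):
        raise ValueError("actions and weights must have the same length")
    if any(not -1.0 <= w <= 1.0 for w in weights):
        raise ValueError("each weight must lie in [-1, 1]")
    score = sum(w * a for w, a in zip(weights, actions))
    return score > threshold


# Behavior vector from the text: action1, action3 and action6 observed.
ACTIONS = (1, 0, 1, 0, 0, 1)
# Hypothetical weights: action6 counts against the event, action2/4/5 are ignored.
WEIGHTS = (0.6, 0.0, 0.5, 0.0, 0.0, -0.4)
TH = 0.5  # hypothetical threshold
```

With these illustrative values the weighted sum is about 0.6 + 0.5 - 0.4 = 0.7, which exceeds the threshold, so the test passes; observing action6 alone gives -0.4 and the test fails.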
On the basis of the above embodiments, the event detection method provided by the embodiment of the present invention is implemented on a distributed streaming computing engine.
In a specific implementation, the detection model designed in the embodiment of the present invention is built on a distributed stream computing engine. Each detection rule in the model is implemented as a parallel processing flow graph, and each conjunct (clause) of the rule is implemented as a processing node connected in series within that flow graph. Each processing node performs the attribute test on every input attribute message and, according to the test result, either selects the route to the next node or outputs the final detection result.
In this way, target events can be extracted quickly from multidimensional monitoring data streams that are generated incrementally, in real time, and at large scale, which addresses the difficulty that complex event processing engines have with large-scale detection data streams.
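The idea that each rule is a chain of conjuncts tested one after another can be sketched in Python. This single-process sketch is an illustration only: the function name `evaluate`, the rule names, and the message fields are hypothetical, and a real deployment would spread the rules and nodes across a distributed stream engine rather than loop over them.

```python
def evaluate(rules, message):
    """Return the names of all rules whose conjuncts all pass for message."""
    hits = []
    for name, conjuncts in rules.items():
        # all() stops at the first failing conjunct, much like a message that
        # is dropped at the node where its attribute test fails.
        if all(test(message) for test in conjuncts):
            hits.append(name)
    return hits


# Hypothetical rule set: each rule is an ordered list of attribute tests.
RULES = {
    "vault_alert": [
        lambda m: m["id"] in {"u1", "u2"},
        lambda m: m["region"] == "vault",
        lambda m: m["score"] > 0.5,
    ],
    "lobby_watch": [
        lambda m: m["id"] in {"u2"},
        lambda m: m["region"] == "lobby",
    ],
}
```

Ordering the cheapest and most selective tests first keeps most messages from ever reaching the later, more expensive nodes.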
The technical solution of the embodiment of the invention provides an event detection system and method that can cope with large-scale multidimensional monitoring data streams with very low detection latency. The detection rules of the model can be defined manually from domain knowledge or extracted from a training data set by an autonomous learning algorithm, so the model performs well in terms of application range, usability, and event detection performance, including accuracy and recall.
Secondly, the invention abstracts the various monitoring data into four attribute dimensions, namely monitored object id, time, position, and behavior, and uses a behavior vector over a pre-constructed behavior set to represent which behaviors of the monitored object have occurred. This attribute specification simplifies the construction of the event detection model.
Meanwhile, the invention selects the processing node for arriving data according to the attribute test results and the consistent hash result of the monitored object id, which realizes distributed data processing; the system can therefore cope with large-scale monitoring data streams and scales well.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An event detection system, comprising:
the identification attribute testing module is used for reading corresponding attribute information from the obtained event data, carrying out identification attribute testing on the input event data so as to distribute the event data to corresponding identification sets, and sending the event data to position groups corresponding to the identification sets; the attribute information at least comprises identification attribute information, position attribute information and behavior attribute information;
the position attribute testing module comprises a plurality of position groups, each position group comprises a plurality of position area groups, and the position attribute testing module is used for receiving the event data sent by the identification attribute testing module, carrying out position attribute testing on the event data so as to distribute the event data to the corresponding position area group, and sending the event data to the behavior group corresponding to the position area group; the number of the position groups is the same as that of the identification sets;
and the behavior attribute testing module comprises a plurality of behavior groups and is used for receiving the event data sent by the position attribute testing module, performing behavior attribute testing on the event data and acquiring target event data according to a testing result, wherein the number of the behavior groups is the same as that of the position area groups.
2. The system of claim 1, further comprising a time attribute testing module; correspondingly, the identification attribute testing module is used for sending the event data to a time group corresponding to the identification set; the number of the time groups is the same as that of the identification sets;
the time attribute testing module comprises a plurality of time groups, each time group comprises a plurality of time interval groups, and the time attribute testing module is used for receiving the event data sent by the identification attribute testing module, carrying out time attribute testing on the event data so as to distribute the event data to the corresponding time interval group, and sending the event data to the position group corresponding to the time interval group; correspondingly, the location attribute testing module is used for receiving the event data sent by the time attribute testing module.
3. The system according to claim 2, wherein the position attribute testing module, the time attribute testing module, and the behavior attribute testing module use a consistent hash result of the identification attribute information of the event data as the basis for selecting the processing node within a group.
4. The system of claim 1, 2 or 3, wherein the behavior attribute information is the behavior set $\vec{a}$ of all behavior monitoring results; correspondingly, performing the behavior attribute test on the event data comprises:
performing the behavior attribute test by applying a threshold to the weighted sum of the behaviors in the event data, $\vec{w} \cdot \vec{a} > Th$, where $\vec{w}$ is the weight vector for the behavior set, each weight component in the weight vector takes a value $w_i \in [-1, 1]$, and Th is a preset threshold; when the test result in the behavior attribute test is greater than the preset threshold Th, the action is judged to have occurred.
5. The system of claim 1, 2 or 3, wherein the system is built on a distributed streaming computing engine.
6. An event detection method, comprising:
an identification attribute test: reading corresponding attribute information from a plurality of acquired event data, performing the identification attribute test on the input event data so as to assign the event data to the corresponding identification set, and sending the event data to the position group corresponding to that identification set; the attribute information at least comprises identification attribute information, position attribute information and behavior attribute information;
a position attribute test, carried out by a plurality of position groups, each of which comprises a plurality of position area groups: receiving the event data that has passed the identification attribute test, performing the position attribute test on the event data so as to assign it to the corresponding position area group, and sending the event data to the behavior group corresponding to that position area group; the number of the position groups is the same as that of the identification sets;
and a behavior attribute test, carried out by a plurality of behavior groups: performing the behavior attribute test on the event data and acquiring target event data according to the test result.
7. The method of claim 6, further comprising a time attribute test; correspondingly, after the identification attribute is tested, the event data is sent to the time group corresponding to the identification set; the number of the time groups is the same as that of the identification sets;
the time attribute test is carried out by a plurality of time groups, each of which comprises a plurality of time interval groups: receiving the event data that has passed the identification attribute test, performing the time attribute test on the event data so as to assign it to the corresponding time interval group, and sending the event data to the position group corresponding to that time interval group; correspondingly, the position attribute test receives the event data that has passed the time attribute test.
8. The method of claim 7, wherein the time attribute test and the behavior attribute test use a consistent hash result of the identification attribute information of the event data as the basis for selecting the processing node within a group.
9. The method of claim 6, 7 or 8, wherein the behavior attribute information is the behavior set $\vec{a}$ of all behavior monitoring results; correspondingly, performing the behavior attribute test on the event data comprises:
performing the behavior attribute test by applying a threshold to the weighted sum of the behaviors in the event data, $\vec{w} \cdot \vec{a} > Th$, where $\vec{w}$ is the weight vector for the behavior set, each weight component in the weight vector takes a value $w_i \in [-1, 1]$, and Th is a preset threshold; when the test result in the behavior attribute test is greater than the preset threshold Th, the action is judged to have occurred.
10. The method of claim 6, 7 or 8, wherein the event detection method is built on a distributed streaming computation engine.
CN201710109380.6A 2017-02-27 2017-02-27 Event detection system and method Active CN107025486B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710109380.6A CN107025486B (en) 2017-02-27 2017-02-27 Event detection system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710109380.6A CN107025486B (en) 2017-02-27 2017-02-27 Event detection system and method

Publications (2)

Publication Number Publication Date
CN107025486A true CN107025486A (en) 2017-08-08
CN107025486B CN107025486B (en) 2020-10-16

Family

ID=59525314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710109380.6A Active CN107025486B (en) 2017-02-27 2017-02-27 Event detection system and method

Country Status (1)

Country Link
CN (1) CN107025486B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491817A (en) * 2018-03-30 2018-09-04 国信优易数据有限公司 A kind of event detection model training method, device and event detecting method
CN109460308A (en) * 2018-11-13 2019-03-12 郑州云海信息技术有限公司 A kind of event-handling method, system, device and computer readable storage medium
CN109558450A (en) * 2018-10-30 2019-04-02 中国汽车技术研究中心有限公司 A kind of automobile remote monitoring method and apparatus based on distributed structure/architecture
CN109857524A (en) * 2019-01-25 2019-06-07 深圳前海微众银行股份有限公司 Streaming computing method, apparatus, equipment and computer readable storage medium
CN115080963A (en) * 2022-07-07 2022-09-20 济南开耀网络技术有限公司 Intelligent financial data protection method based on cloud computing and server

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130305357A1 (en) * 2010-11-18 2013-11-14 The Boeing Company Context Aware Network Security Monitoring for Threat Detection
CN103440274A (en) * 2013-08-07 2013-12-11 北京航空航天大学 Video event sketch construction and matching method based on detail description
CN104008149A (en) * 2014-01-16 2014-08-27 西北工业大学 Event model space-time information representing and processing method orientated towards CPS

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SIMEON A. BAMFORD: "Large developing receptive fields using a distributed and locally reprogrammable address-event receiver", 《IEEE TRANSACTIONS ON NEURAL NETWORKS》 *
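Ke Jia (柯佳): "Research on semantics-based video event detection and analysis methods" (基于语义的视频事件检测分析方法研究), China Doctoral Dissertations Full-text Database, Information Science and Technology (《中国博士学位论文全文数据库信息科技辑》) *
Wu Huimin (武慧敏): "An event-based dual-sequence spatio-temporal data model" (一种基于事件的双序列时空数据模型), Geography and Geo-Information Science (《地理与地理信息科学》) *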

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491817A (en) * 2018-03-30 2018-09-04 国信优易数据有限公司 A kind of event detection model training method, device and event detecting method
CN108491817B (en) * 2018-03-30 2021-02-26 国信优易数据股份有限公司 Event detection model training method and device and event detection method
CN109558450A (en) * 2018-10-30 2019-04-02 中国汽车技术研究中心有限公司 A kind of automobile remote monitoring method and apparatus based on distributed structure/architecture
CN109460308A (en) * 2018-11-13 2019-03-12 郑州云海信息技术有限公司 A kind of event-handling method, system, device and computer readable storage medium
CN109857524A (en) * 2019-01-25 2019-06-07 深圳前海微众银行股份有限公司 Streaming computing method, apparatus, equipment and computer readable storage medium
CN109857524B (en) * 2019-01-25 2024-02-27 深圳前海微众银行股份有限公司 Stream computing method, device, equipment and computer readable storage medium
CN115080963A (en) * 2022-07-07 2022-09-20 济南开耀网络技术有限公司 Intelligent financial data protection method based on cloud computing and server

Also Published As

Publication number Publication date
CN107025486B (en) 2020-10-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant