WO2008009990A1 - System - Google Patents

Info

Publication number
WO2008009990A1
Authority
WO
WIPO (PCT)
Prior art keywords
term
rule
value
comparison
stored
Prior art date
Application number
PCT/GB2007/050418
Other languages
French (fr)
Inventor
Stephen Robinson
Original Assignee
Chronicle Solutions (Uk) Limited
Priority date
Filing date
Publication date
Application filed by Chronicle Solutions (Uk) Limited

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks

Definitions

  • the present invention relates to a system for monitoring electronic data traffic, and particularly to a system which watches for and alerts a network administrator to specific data traffic on the computer network. For example an administrator can define rules to instruct the system to watch for particular content and actions can be triggered when matching content is found.
  • An example of a known such alerting system is found in many e-mail clients.
  • An administrator, or a user can create logical rules from a set of predefined rules which can be matched against incoming e-mails. If a match to the rule is found in an e-mail then a pre-selected action is performed automatically by the e-mail client. For example, a rule may be created in which any e-mails with the subject heading containing the word "confidential" are automatically moved to a folder labelled "private" in the e-mail client.
  • Multiple rules may be defined for a system, each rule having a combination of predefined terms and resulting actions. If multiple rules are defined then each rule must be tested against each incoming object.
  • a problem with known such alerting systems occurs when the number of defined rules becomes large. The more defined rules there are, the longer it takes to test the rule set against each incoming object. This is particularly problematic when the testing is performed in systems with a high rate of incoming objects, for example when testing rules against network traffic.
  • the present invention seeks to provide a faster and more efficient system for monitoring data traffic by taking all of the alert rules defined by the administrator and merging them into a single combined rule represented by a data structure that allows those rules to be more efficiently and quickly matched against incoming content in each object.
  • a computer system for assembling a single electronic data structure from a plurality of logical rules for the treatment of objects in electronic data traffic, wherein each rule comprises at least one sub-expression which comprises component parts including a term, a comparison operator and a value; means for user input of a plurality of said logical rules; means for dividing each rule into its sub-expressions and dividing each sub-expression into its component parts; a term database for storing the terms of the rules with links to each of the other component parts of the sub-expression of the rule; means for comparing a term of a newly input rule with the terms stored in the term database, and if said term is not found in the term database then storing said term with a link to each of the other component parts of the related sub-expression of the newly input rule; and if said term is found in the term database, then following the stored links to the first of the previous comparisons associated with the stored term, comparing the associated stored value of the comparison with the value of the term in the newly input rule
  • the rule set state for an object being tested may be stored using an array of tri-state values (unknown, true or false) wherein one value corresponds with each node in the data structure and one value corresponds to each comparison. Two bits per node is preferably used for storage.
  • a computer system for automatically processing logical rules governing the treatment of objects in electronic data traffic, the system comprising: means for combining a plurality of user input logical rules into a single data structure comprising a term database, a comparison operator database and a rule graph; means for intercepting data traffic comprising a plurality of said objects; means for scanning the intercepted traffic and extracting one or more terms from said objects; means for comparing an intercepted term with terms stored in the term database; if the term does not find a match, then stopping; if the term does find a match, then following stored links associated with the term to each comparison that includes the term and checking the state of the comparison; if the state is known then stopping; if the state is not known then evaluating the comparison, and triggering the associated rule if the comparison is true.
  • the object state is updated to reflect whether the term evaluated to true or false, and the parents of each node are evaluated in turn.
  • Advantageously means are provided for triggering a particular rule if a node in the data structure (rule set) is evaluated to true and the node is the head node for that rule.
  • Action may be taken when a rule is triggered such as details being logged, e-mail alerts being sent, running command line programs and/or adjusting the logging level for the object.
  • a data structure for processing a plurality of logical rules for the treatment of objects in electronic data traffic, wherein each rule comprises at least one sub-expression which comprises component parts including a term, a comparison operator and a value; wherein the data structure takes the form of a rule graph in which any one term appears only once, and any sub-expression occurs only once.
  • Figure 1 is a schematic illustration of one embodiment of a rule graph used in the present invention
  • Figure 2 is a flow diagram illustrating a sequence of operations of a system according to one aspect of the invention, used to evaluate a received term
  • Figure 3 is a flow diagram illustrating a sequence of operations of one step of figure 2 (the step "evaluate graph")
  • Figure 4 is a schematic illustration of a computer system according to the invention.
  • an administrator will define rules which are used to trigger actions when matching content is captured from data traffic on the network. These rules will typically monitor client and server details such as username, IP address and MAC address, the type of the object's content, the size of the object, the protocol used to transport it, the direction (upload or download), and data information regarding the transfer of the object.
  • In addition, rules would typically monitor keywords within the object's content, and its checksum (e.g. the MD4 checksum).
  • Other properties dependent on the content type or protocol can also be included and also the ID of a previous alert that was triggered. However this is not an exhaustive list.
  • various actions may be taken by the system, such as logging the access and object details that triggered the alert rule into a database for later review by the administrator.
  • the system might notify the administrator of the details e.g. by e-mail, SMS or MSN. It could automatically take a particular action such as blocking the access immediately to prevent the object being transferred, or redirect it to a predefined resource such as a quarantine area.
  • the system could inject predefined data into the communication channel, for example to allow the access but accompany it with a warning message to the user.
  • the system of the present invention can speed up the process of checking the rules because it provides for effectively merging the rules into a more unified data structure in which common sub-expressions are combined or eliminated.
  • the unified data structure compiled from the administrator's rules contains three main parts:
  • Terms: a collection of all the terms present in the complete merged rule set.
  • Comparisons: a collection of all the comparisons in the complete merged rule set.
  • Rule Graph: a graph representing the logic that joins the comparisons together.
  • the terms of the rules comprise the nodes: DstIP (Destination IP address), SrcIP (Source IP address), Type and Protocol.
  • the rule graph can be constructed using techniques used to create compilers and interpreters and will be familiar to persons skilled in the art. Tools such as lex, yacc, flex and bison can be employed to transform the text of the rules into the graph. These tools are adapted so that with each additional rule which is added by the administrator, the system does not start from the beginning again but instead continues where it finished with the previous rule. In this way the single large graph can be generated containing all the rules instead of lots of different graphs. The assembly of such a rule graph is described in more detail below.
  • the rule graph avoids duplication of terms, saving only one copy of each possible value in the rules, and thus achieves a significant space saving over known systems. It allows for logical operators within rules, which again allows for significant space saving. Each rule can have as many or as few comparisons as required and allows for multiple actions to be performed. Of course the graph structure created depends on the rule set provided, and the number of levels in the structure is dictated by the size of the rule and not the number of fields.
  • the rule graph format allows direct access to the relevant comparisons without passing events from parent nodes to child nodes and the graph can be traversed multidirectionally going from the comparison to its parent and continuing both up and down as necessary for evaluation of a term. This is in contrast to prior art systems in which rules are evaluated one by one. This improves speed and memory use.
  • each term in received content or accesses is looked up in the term collection. If the term is not found, then the rule set does not contain the term and therefore that term cannot cause a rule to be matched. If the term is not found then there is no need to waste processing time evaluating the term and therefore the processing stops, as indicated by the terminator box "done".
  • each term contains a list of links to each comparison that includes that term so it is possible to very efficiently locate the comparisons which need to be evaluated.
  • the state of the first comparison in the list is checked to see if it already has a value other than 'unknown'. If the comparison does have a value then the processing of that comparison stops, and the system proceeds to the next comparison until all comparisons have been processed.
  • If the comparison does not have a value (value is unknown), the term is evaluated to either 'true' or 'false' and the object state is updated accordingly.
  • the rule graph is evaluated according to the sequence of steps illustrated in figure 3. Each of the parent nodes in the graph is evaluated against the new term value. As each node is evaluated, its parent node is then also evaluated, and this process repeats until the head node for the original rule has been evaluated.
  • the head nodes represent the original rules as created by an administrator, therefore if a head node is evaluated as 'true' then the corresponding rule is triggered.
  • the current evaluation state of the rule set for each access being monitored is maintained by the alerting system using an array of tri-state values (unknown, true or false), one for each node in the graph and for each comparison.
  • This array uses two bits per node to reduce the overhead in tracking this state and that combined with the common sub-expression elimination allows a very large rule set to be used before memory consumption becomes a problem.
  • a further complication while processing the rule set is that many of the terms can have more than one value.
  • Take the keyword term, for example: most content includes hundreds, if not thousands, of words, so when a rule wants to know when the keyword is 'abc' it is important to ignore cases where the comparison evaluates to false, because it could evaluate to true later on.
  • the keyword term is a special case in another way. It is not desirable to have the alerter informed about the value of the keyword term for every word in every item of content monitored. Therefore, the alerting system has a mechanism to inform the content source which words it's interested in. This way the alert rules are only evaluated for words which are included in the rule set. The alerting system calls these terms 'External terms' because the comparison is done outside the alerting system.
  • An administrator may define alert rules such that more than one may trigger for a given access or object. For this reason each rule can be given a precedence value. This value is an integer; lower values have a higher precedence and higher values have a lower precedence. An administrator is then able to define which rule should take effect if more than one rule is triggered by a single access or object.
  • When an alert rule is submitted to the alerting system, that rule contains one or more actions that should be taken if and when the rule is triggered. These actions are submitted to the alerting system in the form of pointers to either an access object, which defines the access that triggered the alert, or to a layer object which defines the particular object within the access that caused the alert to be triggered. There are several categories of alert actions, including “Log action”, “e-mail action”, “Command action” and “log-level action”, as described below.
  • a log action simply logs the details of the triggered alert to a database.
  • the log action enters a single record in the database each time it is run.
  • An email action causes an e-mail to be sent to the recipient specified by the alert rule whenever it is run.
  • Full Descriptions are sent when the quantity of alerts is not hitting the rate limit.
  • One e-mail is sent per alert and each notification to an administrator includes a full description of the alert that has just fired.
  • Summaries are sent when the quantity of alerts is hitting the rate limit. The number of times each alert is triggered is remembered by the action and one e-mail is sent whenever the rate limit allows. This e-mail contains a summary of the alerts that have been triggered since the last notification.
  • the format of the bodies and subjects of these e-mails can be hard coded into the email action for speed of processing, or alternatively, an administrator may specify the content of both the subject and the body by using templates.
  • An entry can be placed in the main log file so an administrator can see why the e-mails are not being sent.
  • the command action causes a command line program to be run whenever the action is run.
  • the parameter specified in the alert rule defines the program that should be run and the parameters that should be passed to that program.
  • the parameters are passed by string substitution within the parameter string supplied by the alert rule.
  • the log-level action causes the logging level to be adjusted for the access that triggered the alert.
  • the parameter specified by the alert rule defines the new logging level.
  • the log-level action makes use of the rule precedence to decide which rule applies when more than one rule matches a given access.
  • Each rule has a unique precedence; this precedence is an integer between 0 and 0xFFFFFFFF, and the lower the number, the higher the precedence.
  • the log-level action also has a stored default logging level, which is used if no alert rule is triggered for the object or access. The log-level action always starts by using the default logging level with a precedence of 0xFFFFFFFF. Whenever a rule triggers, the precedence of that rule is examined.
  • If the precedence is higher (a lower value) then the current logging level is changed to that specified by the rule. If the precedence is lower (a higher value) then it is ignored. As it is not possible to know in advance which alert rules will trigger, it is not possible to know when the final logging level is reached until the object or access has completed.
  • Some rules may only include comparisons of properties that are global to all objects within an access, for example if the rule only references addresses or the protocol used to transport the objects. Other rules may only reference properties specific to one object within the access, for example the object's type. Still others may reference both properties global to the access and properties specific to one or more objects within that access. It is necessary to have a separate logging level for each object within an access as the administrator may wish to log different amounts of data for different object types. For example the administrator may wish to log all data for word documents, but only log the first 100kb of an MP3 file and not log any CSS stylesheets.
  • FIG 4 is a schematic illustration of a computer system 40 according to the invention.
  • the computer system 40 is adapted to intercept and monitor data traffic on a network line 41.
  • Input access is provided at 42 for a user to define rules concerning the monitoring and the system analyses these rules and compiles them at 43 into a unified data structure at 44 comprising a term database 45, a comparison operator database 46 and a rule graph 47.
  • the data traffic on line 41 is intercepted and scanned at 48 and comparator 49 looks up each term of the data in the term database 45.
  • the links associated with the term are followed to evaluate the rule via the comparison operator database 46 and the rule graph 47. If the evaluation triggers the rule then pointers are generated and control 50 will take appropriate action such as logging details in memory 51, executing a program and/or sending an e-mail via network line 41 or via another line 52.
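
The precedence handling described above for the log-level action can be illustrated with a minimal Python sketch. The class name, method name and logging-level strings are hypothetical and chosen only for illustration; the strict "less than" test for a rule whose precedence equals the current one is also an assumption, since the text does not cover that case.

```python
DEFAULT_PRECEDENCE = 0xFFFFFFFF  # lowest possible precedence, used for the default level


class LogLevelAction:
    """Tracks the logging level for one access or object (hypothetical helper)."""

    def __init__(self, default_level):
        # Start from the stored default level at the lowest precedence.
        self.level = default_level
        self.precedence = DEFAULT_PRECEDENCE

    def rule_triggered(self, precedence, level):
        # A lower number means a higher precedence, so only a strictly
        # lower value replaces the logging level currently in effect.
        if precedence < self.precedence:
            self.precedence = precedence
            self.level = level


action = LogLevelAction("headers-only")
action.rule_triggered(500, "full")          # higher precedence than the default: applied
action.rule_triggered(900, "none")          # lower precedence than 500: ignored
action.rule_triggered(10, "first-100kb")    # higher precedence than 500: applied
```

Because any later rule may still override the level, the final value is only known once the object or access has completed, as the text notes.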

Abstract

A computer system for assembling a single electronic data structure from a plurality of logical rules for the treatment of objects in electronic data traffic, wherein each rule comprises at least one sub-expression which comprises component parts including a term, a comparison operator and a value; means for user input of a plurality of said logical rules; means for dividing each rule into its sub-expressions and dividing each sub-expression into its component parts; a term database for storing the terms of the rules with links to each of the other component parts of the sub-expression of the rule; means for comparing a term of a newly input rule with the terms stored in the term database, and if said term is not found in the term database then storing said term with a link to each of the other component parts of the related sub-expression of the newly input rule; and if said term is found in the term database, then following the stored links to the first of the previous comparisons associated with the stored term, comparing the associated stored value of the comparison with the value of the term in the newly input rule; if the value is stored then continuing to the next sub-expression; if the value is not stored then storing the value as a link to said term; then continuing for the next sub-expression of the newly input rule.

Description

SYSTEM
The present invention relates to a system for monitoring electronic data traffic, and particularly to a system which watches for and alerts a network administrator to specific data traffic on the computer network. For example an administrator can define rules to instruct the system to watch for particular content and actions can be triggered when matching content is found.
An example of a known such alerting system is found in many e-mail clients. An administrator, or a user, can create logical rules from a set of predefined rules which can be matched against incoming e-mails. If a match to the rule is found in an e-mail then a pre-selected action is performed automatically by the e-mail client. For example, a rule may be created in which any e-mails with the subject heading containing the word "confidential" are automatically moved to a folder labelled "private" in the e-mail client.
Multiple rules may be defined for a system, each rule having a combination of predefined terms and resulting actions. If multiple rules are defined then each rule must be tested against each incoming object.
A problem with known such alerting systems occurs when the number of defined rules becomes large. The more defined rules there are, the longer it takes to test the rule set against each incoming object. This is particularly problematic when the testing is performed in systems with a high rate of incoming objects, for example when testing rules against network traffic.
In US 5,571,914 a system is disclosed for handling multiple rules using a template for an unskilled user to edit the rules and a tree structure for evaluation.
The present invention seeks to provide a faster and more efficient system for monitoring data traffic by taking all of the alert rules defined by the administrator and merging them into a single combined rule represented by a data structure that allows those rules to be more efficiently and quickly matched against incoming content in each object.
According to a first aspect of the present invention there is provided a computer system for assembling a single electronic data structure from a plurality of logical rules for the treatment of objects in electronic data traffic, wherein each rule comprises at least one sub-expression which comprises component parts including a term, a comparison operator and a value; means for user input of a plurality of said logical rules; means for dividing each rule into its sub-expressions and dividing each sub-expression into its component parts; a term database for storing the terms of the rules with links to each of the other component parts of the sub-expression of the rule; means for comparing a term of a newly input rule with the terms stored in the term database, and if said term is not found in the term database then storing said term with a link to each of the other component parts of the related sub-expression of the newly input rule; and if said term is found in the term database, then following the stored links to the first of the previous comparisons associated with the stored term, comparing the associated stored value of the comparison with the value of the term in the newly input rule; if the value is stored then continuing to the next sub-expression; if the value is not stored then storing the value as a link to said term; then continuing for the next sub-expression of the newly input rule.
The rule set state for an object being tested may be stored using an array of tri-state values (unknown, true or false) wherein one value corresponds with each node in the data structure and one value corresponds to each comparison. Two bits per node is preferably used for storage.
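
As an illustration of the two-bits-per-node storage, the following Python sketch packs tri-state values four to a byte. It is only a sketch under stated assumptions: the numeric encoding (0 for unknown, 1 for true, 2 for false) and the class name `TriStateArray` are hypothetical and are not taken from the specification.

```python
UNKNOWN, TRUE, FALSE = 0, 1, 2  # assumed encoding; each value fits in two bits


class TriStateArray:
    """Stores one tri-state value per node or comparison, two bits each."""

    def __init__(self, count):
        self.count = count
        # Four 2-bit slots fit in one byte; zeroed bytes mean every slot is UNKNOWN.
        self.bits = bytearray((count + 3) // 4)

    def get(self, index):
        byte, slot = divmod(index, 4)
        return (self.bits[byte] >> (slot * 2)) & 0b11

    def set(self, index, value):
        byte, slot = divmod(index, 4)
        shift = slot * 2
        # Clear the two bits belonging to this slot, then write the new value.
        cleared = self.bits[byte] & ~(0b11 << shift) & 0xFF
        self.bits[byte] = cleared | (value << shift)


state = TriStateArray(10)   # ten nodes need only three bytes
state.set(4, TRUE)
state.set(5, FALSE)
```

One such array would be kept per access or object being monitored, so the per-object overhead grows by a quarter of a byte for each node in the merged rule set.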
According to a second aspect of the present invention there is provided a computer system for automatically processing logical rules governing the treatment of objects in electronic data traffic, the system comprising: means for combining a plurality of user input logical rules into a single data structure comprising a term database, a comparison operator database and a rule graph; means for intercepting data traffic comprising a plurality of said objects; means for scanning the intercepted traffic and extracting one or more terms from said objects; means for comparing an intercepted term with terms stored in the term database; if the term does not find a match, then stopping; if the term does find a match, then following stored links associated with the term to each comparison that includes the term and checking the state of the comparison; if the state is known then stopping; if the state is not known then evaluating the comparison, and triggering the associated rule if the comparison is true.
Preferably the object state is updated to reflect whether the term evaluated to true or false, and the parents of each node are evaluated in turn. Advantageously means are provided for triggering a particular rule if a node in the data structure (rule set) is evaluated to true and the node is the head node for that rule.
Action may be taken when a rule is triggered such as details being logged, e-mail alerts being sent, running command line programs and/or adjusting the logging level for the object.
According to a third aspect of the invention there is provided a data structure for processing a plurality of logical rules for the treatment of objects in electronic data traffic, wherein each rule comprises at least one sub-expression which comprises component parts including a term, a comparison operator and a value; wherein the data structure takes the form of a rule graph in which any one term appears only once, and any sub-expression occurs only once.
Corresponding methods, apparatus, computer programs and other associated aspects are also provided as can be envisaged by a person skilled in the art.
The invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
Figure 1 is a schematic illustration of one embodiment of a rule graph used in the present invention;
Figure 2 is a flow diagram illustrating a sequence of operations of a system according to one aspect of the invention, used to evaluate a received term;
Figure 3 is a flow diagram illustrating a sequence of operations of one step of figure 2 (the step "evaluate graph"); and
Figure 4 is a schematic illustration of a computer system according to the invention.
In a computer network an administrator will define rules which are used to trigger actions when matching content is captured from data traffic on the network. These rules will typically monitor client and server details such as username, IP address and MAC address, the type of the object's content, the size of the object, the protocol used to transport it, the direction (upload or download), and data information regarding the transfer of the object. In addition, rules would typically monitor keywords within the object's content, and its checksum (e.g. the MD4 checksum). Other properties dependent on the content type or protocol can also be included, as can the ID of a previous alert that was triggered. However this is not an exhaustive list.
When a rule is matched then various actions may be taken by the system, such as logging the access and object details that triggered the alert rule into a database for later review by the administrator. Alternatively or in addition the system might notify the administrator of the details e.g. by e-mail, SMS or MSN. It could automatically take a particular action such as blocking the access immediately to prevent the object being transferred, or redirect it to a predefined resource such as a quarantine area. Alternatively the system could inject predefined data into the communication channel, for example to allow the access but accompany it with a warning message to the user.
Evidently a large number of rules will usually be generated on a typical network and checking each rule against each data transmission in the network utilises network computing capacity and memory and can slow the speed of the network.
The system of the present invention can speed up the process of checking the rules because it provides for effectively merging the rules into a more unified data structure in which common sub-expressions are combined or eliminated.
The unified data structure compiled from the administrator's rules contains three main parts:
• Terms - a collection of all the terms present in the complete merged rule set.
• Comparisons - a collection of all the comparisons in the complete merged rule set.
• Rule Graph - a graph representing the logic that joins the comparisons together.
Figure 1 shows a schematic illustration of a rule graph for three rules:
• Rule 1: DstIP = 192.168.1.1
• Rule 2: SrcIP = 192.168.1.1 & Type = GIF & Protocol = FTP
• Rule 3: SrcIP = 192.168.1.1 & Type = GIF & Protocol = HTTP
The terms of the rules comprise the nodes: DstIP (Destination IP address), SrcIP (Source IP address), Type and Protocol. The comparisons comprise the nodes denoted by "=" and the remainder of the nodes form the rule graph.
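
To make the three-part structure concrete, the Python sketch below merges the three example rules into one structure in which each term, each comparison and each shared sub-expression is stored once. It is an illustrative assumption throughout: the class `RuleSet`, its fields, the restriction to rules joined only by "&", and the left-to-right pairing of comparisons are all hypothetical, and the resulting graph is not claimed to match the layout of Figure 1.

```python
class RuleSet:
    """Hypothetical merged structure: terms, comparisons and a rule graph."""

    def __init__(self):
        self.terms = {}     # term name -> ids of the comparisons that use it
        self.nodes = []     # node id -> ("cmp", term, op, value) or ("and", left, right)
        self.index = {}     # node tuple -> node id, so nothing is stored twice
        self.parents = {}   # node id -> ids of the nodes that use it as a child
        self.heads = {}     # head node id -> names of the rules it represents

    def _intern(self, node):
        # Return the existing id for an identical node, or allocate a new one.
        if node not in self.index:
            self.index[node] = len(self.nodes)
            self.nodes.append(node)
            self.parents[self.index[node]] = []
        return self.index[node]

    def _comparison(self, term, op, value):
        key = ("cmp", term, op, value)
        is_new = key not in self.index
        node_id = self._intern(key)
        if is_new:
            # Link the term to the new comparison for direct lookup later.
            self.terms.setdefault(term, []).append(node_id)
        return node_id

    def add_rule(self, name, comparisons):
        """Add a rule given as (term, op, value) tuples joined by '&'."""
        current = self._comparison(*comparisons[0])
        for comparison in comparisons[1:]:
            right = self._comparison(*comparison)
            key = ("and", current, right)
            is_new = key not in self.index
            parent = self._intern(key)
            if is_new:
                self.parents[current].append(parent)
                self.parents[right].append(parent)
            current = parent
        self.heads.setdefault(current, []).append(name)
        return current


rules = RuleSet()
rules.add_rule("Rule 1", [("DstIP", "=", "192.168.1.1")])
rules.add_rule("Rule 2", [("SrcIP", "=", "192.168.1.1"),
                          ("Type", "=", "GIF"), ("Protocol", "=", "FTP")])
rules.add_rule("Rule 3", [("SrcIP", "=", "192.168.1.1"),
                          ("Type", "=", "GIF"), ("Protocol", "=", "HTTP")])
```

In this sketch Rules 2 and 3 share the node for "SrcIP = 192.168.1.1 & Type = GIF", so adding Rule 3 creates only one new comparison and one new graph node.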
The rule graph can be constructed using techniques used to create compilers and interpreters and will be familiar to persons skilled in the art. Tools such as lex, yacc, flex and bison can be employed to transform the text of the rules into the graph. These tools are adapted so that with each additional rule which is added by the administrator, the system does not start from the beginning again but instead continues where it finished with the previous rule. In this way the single large graph can be generated containing all the rules instead of lots of different graphs. The assembly of such a rule graph is described in more detail below.
The rule graph avoids duplication of terms, saving only one copy of each possible value in the rules, and thus achieves a significant space saving over known systems. It allows for logical operators within rules, which again allows for significant space saving. Each rule can have as many or as few comparisons as required and allows for multiple actions to be performed. Of course the graph structure created depends on the rule set provided, and the number of levels in the structure is dictated by the size of the rule and not the number of fields.
The rule graph format, with links from the terms to the relevant comparisons, allows direct access to the relevant comparisons without passing events from parent nodes to child nodes and the graph can be traversed multidirectionally going from the comparison to its parent and continuing both up and down as necessary for evaluation of a term. This is in contrast to prior art systems in which rules are evaluated one by one. This improves speed and memory use.
Once the data structure and rule graph have been established for an existing set of rules it will be used in monitoring accesses and content in network traffic. The steps in this procedure are illustrated in figures 2 and 3.
First, each term in received content or accesses is looked up in the term collection. If the term is not found, then the rule set does not contain the term and therefore that term cannot cause a rule to be matched. If the term is not found then there is no need to waste processing time evaluating the term and therefore the processing stops, as indicated by the terminator box "done".
If the term is found then it means that one or more comparisons contain the term. Each term contains a list of links to each comparison that includes that term so it is possible to very efficiently locate the comparisons which need to be evaluated. The state of the first comparison in the list is checked to see if it already has a value other than 'unknown'. If the comparison does have a value then the processing of that comparison stops, and the system proceeds to the next comparison until all comparisons have been processed.
If, on the other hand, the comparison does not have a value (its value is 'unknown'), the term is evaluated to either 'true' or 'false' and the object state is updated accordingly. At this stage the rule graph is evaluated according to the sequence of steps illustrated in figure 3. Each of the parent nodes in the graph is evaluated against the new term value. As each node is evaluated, its parent node is then also evaluated, and this process repeats until the head node for the original rule has been evaluated. The head nodes represent the original rules as created by an administrator; therefore, if a head node is evaluated as 'true' then the corresponding rule is triggered.
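The lookup and upward evaluation just described might be sketched as below. This is an illustrative Python sketch under stated assumptions: Node, combine, propagate and on_term are hypothetical names, the state is a plain dictionary in which a missing entry means 'unknown', and each term is assumed to carry a single value.

```python
UNKNOWN = None

class Node:
    def __init__(self, kind, children=(), test=None, rule=None):
        self.kind = kind                  # "cmp", "and" or "or"
        self.children = list(children)
        self.parents = []
        self.test = test                  # predicate applied by comparison nodes
        self.rule = rule                  # rule name if this node is a head node
        for child in self.children:
            child.parents.append(self)

def combine(node, state):
    """Return True, False or UNKNOWN for a logical node from its children's states."""
    values = [state.get(child) for child in node.children]
    if node.kind == "and":
        if False in values:
            return False
        return True if all(v is True for v in values) else UNKNOWN
    if True in values:
        return True
    return False if all(v is False for v in values) else UNKNOWN

def propagate(node, state, triggered):
    """Re-evaluate each parent in turn, continuing upward while values become known."""
    for parent in node.parents:
        if state.get(parent) is not UNKNOWN:
            continue
        value = combine(parent, state)
        if value is UNKNOWN:
            continue
        state[parent] = value
        if value and parent.rule:
            triggered.append(parent.rule)
        propagate(parent, state, triggered)

def on_term(term_index, state, term, observed):
    triggered = []
    for cmp_node in term_index.get(term, []):    # term absent from the rule set: done
        if state.get(cmp_node) is not UNKNOWN:   # comparison already has a value: skip it
            continue
        state[cmp_node] = cmp_node.test(observed)
        if state[cmp_node] and cmp_node.rule:
            triggered.append(cmp_node.rule)
        propagate(cmp_node, state, triggered)
    return triggered

# Hypothetical rule: Protocol = http & Port = 80
proto = Node("cmp", test=lambda v: v == "http")
port = Node("cmp", test=lambda v: v == 80)
head = Node("and", [proto, port], rule="web-traffic")
index = {"Protocol": [proto], "Port": [port]}

state = {}
first = on_term(index, state, "Protocol", "http")
second = on_term(index, state, "Port", 80)
third = on_term(index, state, "Colour", "red")
```

Here the first value leaves the head node unknown, the second makes it true and triggers the rule, and the third is dropped immediately because no comparison refers to that term.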
As described below it is also necessary to evaluate whether other child nodes can affect the outcome and if so then the state of other child nodes and sub-child nodes are set to false until the graph branches. When a node in the graph is evaluated, an additional action can be taken depending on the type of the node. For example, assume a rule reads:
SrcIP = 192.168.1.0/24 | DstIP = 192.168.1.0/24 (meaning if the source OR destination address is 192.168.1.0/24).
If either side of this rule is evaluated to true then there is no need to evaluate the other side of the rule as the OR will be true regardless. Similarly, if the rule reads:
SrcIP = 192.168.1.0/24 & DstIP = 10.0.0.0/8 (meaning if the source address is 192.168.1.0/24 AND destination address is 10.0.0.0/8).
Then either side evaluating to false would mean that there is no need to evaluate the other side of the rule. The alerting system takes advantage of this fact and when it decides that there is no need to evaluate one side of a node it will mark all child nodes (as far as it can) on that side to either true or false so that if the system receives a value for one of these terms it will not evaluate it, knowing in advance that its value is irrelevant to the final outcome.
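The short-circuit marking can be illustrated with the OR rule above. The following Python sketch is an assumption-laden illustration rather than the system's code: Node, mark_irrelevant and in_network are hypothetical names, and stopping at a node with more than one parent is one possible reading of marking child nodes "as far as it can".

```python
import ipaddress

class Node:
    def __init__(self, name, kind, children=()):
        self.name, self.kind = name, kind
        self.children = list(children)
        self.parents = []
        for child in self.children:
            child.parents.append(self)

def in_network(address, cidr):
    """Comparison for address terms: is the address inside the CIDR block?"""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)

def mark_irrelevant(node, state):
    """Give still-unknown descendants a value so later term values are skipped.

    A child shared with another rule (more than one parent) is left alone,
    because its value may still matter elsewhere in the graph.
    """
    for child in node.children:
        if state.get(child.name) is None and len(child.parents) == 1:
            state[child.name] = False
            mark_irrelevant(child, state)

src = Node("SrcIP = 192.168.1.0/24", "cmp")
dst = Node("DstIP = 192.168.1.0/24", "cmp")
head = Node("rule", "or", [src, dst])

state = {}
state[src.name] = in_network("192.168.1.17", "192.168.1.0/24")
if state[src.name]:                  # one side of an OR is true: the OR is decided
    state[head.name] = True
    mark_irrelevant(head, state)
```

After this runs, the DstIP comparison already has a value, so a destination address arriving later is not evaluated.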
The current evaluation state of the rule set for each access being monitored is maintained by the alerting system using an array of tri-state values (unknown, true or false), one for each node in the graph and for each comparison. This array uses two bits per node to reduce the overhead in tracking this state and that combined with the common sub-expression elimination allows a very large rule set to be used before memory consumption becomes a problem.
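A two-bit-per-node state array of the kind described can be sketched in Python as follows. The class name TriStateArray and the particular bit encoding (0 for unknown, 1 for false, 2 for true) are assumptions made for illustration; the text specifies only that two bits are used per node.

```python
UNKNOWN, FALSE, TRUE = 0, 1, 2       # assumed encoding; any three distinct 2-bit codes work

class TriStateArray:
    """Packs one tri-state value per node into two bits, four nodes per byte."""

    def __init__(self, count):
        self.count = count
        self.bits = bytearray((count + 3) // 4)    # all zero, i.e. every node unknown

    def get(self, index):
        byte, slot = divmod(index, 4)
        return (self.bits[byte] >> (slot * 2)) & 0b11

    def set(self, index, value):
        byte, slot = divmod(index, 4)
        shift = slot * 2
        cleared = self.bits[byte] & ~(0b11 << shift) & 0xFF   # zero the slot's two bits
        self.bits[byte] = cleared | (value << shift)

states = TriStateArray(10)           # ten nodes fit in three bytes
states.set(5, TRUE)
states.set(9, FALSE)
```

One such array would be kept per access being monitored, so the per-access overhead in this sketch is a quarter of a byte for each node and comparison in the graph.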
A further complication while processing the rule set is that many of the terms can have more than one value. The keyword term is an example: most content includes hundreds, if not thousands, of words, so when a rule wants to know whether the keyword is 'abc' it is important to ignore cases where the comparison evaluates to false, because it could evaluate to true later on.
The keyword term is a special case in another way. It is not desirable to have the alerter informed about the value of the keyword term for every word in every item of content monitored. Therefore, the alerting system has a mechanism to inform the content source which words it's interested in. This way the alert rules are only evaluated for words which are included in the rule set. The alerting system calls these terms 'External terms' because the comparison is done outside the alerting system.
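The handling of multi-valued keyword terms and external terms might look like the sketch below. It is a simplified Python illustration: KeywordComparison, interesting_words and scan are hypothetical names, and splitting content on whitespace stands in for whatever tokenisation the content source actually performs.

```python
class KeywordComparison:
    def __init__(self, wanted):
        self.wanted = wanted
        self.state = None            # None means 'unknown'

    def observe(self, word):
        if word == self.wanted:
            self.state = True
        # A non-matching word leaves the state unknown, because a later
        # word in the same content could still match.

def interesting_words(comparisons):
    """The set of words the alerting system asks the content source to report."""
    return {comparison.wanted for comparison in comparisons}

def scan(content, comparisons):
    """Content-source side: report only words that occur in the rule set."""
    wanted = interesting_words(comparisons)
    reported = [word for word in content.lower().split() if word in wanted]
    for word in reported:
        for comparison in comparisons:
            comparison.observe(word)
    return reported

comparisons = [KeywordComparison("abc"), KeywordComparison("secret")]
reported = scan("The ABC report is public", comparisons)
```

Of the five words in the example content only one reaches the alerting system, and the comparison for the word that never appeared remains unknown rather than false.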
An administrator may define alert rules such that more than one may trigger for a given access or object. For this reason each rule can be given a precedence value. This value is an integer: lower values have a higher precedence and higher values have a lower precedence. An administrator is then able to define which rule should take effect if more than one rule is triggered by a single access or object.
When an alert rule is submitted to the alerting system, that rule contains one or more actions that should be taken if and when the rule is triggered. These actions are submitted to the alerting system in the form of pointers to either an access object, which defines the access that triggered the alert, or to a layer object which defines the particular object within the access that caused the alert to be triggered. There are several categories of alert actions, including "Log action", "e-mail action", "Command action" and "log-level action", as described below.
A log action simply logs the details of the triggered alert to a database. The log action enters a single record in the database each time it is run.
An e-mail action causes an e-mail to be sent to the recipient specified by the alert rule whenever it is run. There are two types of e-mail which can be sent by this action: full descriptions and summaries. Full descriptions are sent when the quantity of alerts is not hitting the rate limit. One e-mail is sent per alert and each notification to an administrator includes a full description of the alert that has just fired. Summaries are sent when the quantity of alerts is hitting the rate limit. The number of times each alert is triggered is remembered by the action and one e-mail is sent whenever the rate limit allows. This e-mail contains a summary of the alerts that have been triggered since the last notification. The format of the bodies and subjects of these e-mails can be hard coded into the e-mail action for speed of processing or, alternatively, an administrator may specify the content of both the subject and the body by using templates. To minimise the processing requirements of the action, if an e-mail fails to be sent it should not be retried. An entry can be placed in the main log file so an administrator can see why the e-mails are not being sent.
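The switch between full descriptions and summaries can be sketched as below. This Python sketch makes several assumptions not found in the text: the rate limit is modelled as a minimum interval between e-mails, the send function and clock are injected by the caller, and a pending summary is flushed when the next alert arrives after the interval rather than by a timer. The name EmailAction and the subject formats are hypothetical.

```python
import logging
from collections import Counter

class EmailAction:
    def __init__(self, send, min_interval, clock):
        self.send = send                  # callable(subject, body), supplied by the caller
        self.min_interval = min_interval  # assumed form of the rate limit, in seconds
        self.clock = clock                # callable returning the current time in seconds
        self.last_sent = None
        self.pending = Counter()          # alert name -> times triggered since last e-mail

    def run(self, alert_name, description):
        now = self.clock()
        if self.last_sent is not None and now - self.last_sent < self.min_interval:
            self.pending[alert_name] += 1          # rate limit hit: only count the alert
            return
        if self.pending:                           # alerts were held back: send a summary
            self.pending[alert_name] += 1
            subject = "Alert summary"
            body = "\n".join(f"{name}: {count} alert(s)"
                             for name, count in sorted(self.pending.items()))
            self.pending.clear()
        else:                                      # below the limit: one full description
            subject, body = f"Alert: {alert_name}", description
        self.last_sent = now
        try:
            self.send(subject, body)
        except Exception as exc:                   # log the failure and do not retry
            logging.warning("e-mail not sent: %s", exc)

times = iter([0, 1, 2, 20])
sent = []
action = EmailAction(lambda subject, body: sent.append((subject, body)), 10, lambda: next(times))
action.run("r1", "full details of r1")   # t=0: sent as a full description
action.run("r1", "full details of r1")   # t=1: rate limited, counted
action.run("r2", "full details of r2")   # t=2: rate limited, counted
action.run("r1", "full details of r1")   # t=20: summary of everything since t=0
```

A production version would more likely flush the summary from a timer so that held-back alerts are reported even if no further alert arrives.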
The command action causes a command line program to be run whenever the action is run. The parameter specified in the alert rule defines the program that should be run and the parameters that should be passed to that program. The parameters are passed by string substitution within the parameter string supplied by the alert rule.
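String substitution into the command parameter could be sketched as follows. The placeholder syntax ($src, $rule), the program path and the function names are all hypothetical; the text does not specify them. The sketch splits the parameter string into arguments before substituting, so that a substituted value cannot introduce extra arguments.

```python
import shlex
import subprocess
from string import Template

def build_command(parameter, values):
    """Split the rule's parameter string into arguments, then substitute placeholders."""
    return [Template(part).safe_substitute(values) for part in shlex.split(parameter)]

def run_command(parameter, values):
    """Run the program named in the parameter string and return its exit status."""
    return subprocess.run(build_command(parameter, values), check=False).returncode

argv = build_command(
    "/usr/local/bin/notify --src $src --rule '$rule'",    # hypothetical program and flags
    {"src": "192.168.1.17", "rule": "web traffic"},
)
```

The resulting argument list is passed to the program without going through a shell, which is one reasonable design choice when the substituted values come from monitored traffic.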
The log-level action causes the logging level to be adjusted for the access that triggered the alert. The parameter specified by the alert rule defines the new logging level. The log-level action makes use of the rule precedence to decide which rule applies when more than one rule matches a given access. Each rule has a unique precedence, which is an integer between 0 and 0xFFFFFFFF; the lower the number, the higher the precedence. The log-level action also has a stored default logging level, which is used if no alert rule is triggered for the object or access. The log-level action always starts by using the default logging level with a precedence of 0xFFFFFFFF. Whenever a rule triggers, the precedence of that rule is examined. If the precedence is higher (a lower value) then the current logging level is changed to that specified by the rule. If the precedence is lower (a higher value) then the rule is ignored. As it is not possible to know in advance which alert rules will trigger, the final logging level cannot be known until the object or access has completed.
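The precedence handling of the log-level action can be sketched in a few lines of Python. The class name LogLevelAction and the logging level names used in the example are hypothetical; only the precedence range and the comparison rule come from the description above.

```python
LOWEST_PRECEDENCE = 0xFFFFFFFF       # precedence assigned to the default logging level

class LogLevelAction:
    def __init__(self, default_level):
        self.level = default_level
        self.precedence = LOWEST_PRECEDENCE

    def rule_triggered(self, precedence, level):
        # A lower number means a higher precedence; weaker rules are ignored.
        if precedence < self.precedence:
            self.level = level
            self.precedence = precedence

action = LogLevelAction("headers-only")      # hypothetical level names
action.rule_triggered(500, "full")           # beats the default
action.rule_triggered(900, "none")           # lower precedence: ignored
after_two = action.level
action.rule_triggered(10, "first-100kb")     # higher precedence: takes over
```

The level held by the action is only final once the object or access has completed, since a rule of higher precedence may trigger at any point before then.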
Some rules may only include comparisons of properties that are global to all objects within an access, for example if the rule only references addresses or the protocol used to transport the objects. Other rules may only reference properties specific to one object within the access, for example the object's type. Still others may reference both properties global to the access and properties specific to one or more objects within that access. It is necessary to have a separate logging level for each object within an access as the administrator may wish to log different amounts of data for different object types. For example the administrator may wish to log all data for Word documents, but only log the first 100kb of an MP3 file and not log any CSS stylesheets. Therefore if a rule triggers for an access and not for a specific object, the logging level for all objects within the access must be adjusted, provided that the rule has a higher precedence than the individual objects.
Figure 4 is a schematic illustration of a computer system 40 according to the invention. The computer system 40 is adapted to intercept and monitor data traffic on a network line 41. Input access is provided at 42 for a user to define rules concerning the monitoring and the system analyses these rules and compiles them at 43 into a unified data structure at 44 comprising a term database 45, a comparison operator database 46 and a rule graph 47. The data traffic on line 41 is intercepted and scanned at 48 and comparator 49 looks up each term of the data in the term database 45. If the term is found then the links associated with the term are followed to evaluate the rule via the comparison operator database 46 and the rule graph 47. If the evaluation triggers the rule then pointers are generated and control 50 will take appropriate action such as logging details in memory 51, executing a program and/or sending an e-mail via network line 41 or via another line 52.

Claims

1. A computer system for assembling a single electronic data structure from a plurality of logical rules for the treatment of objects in electronic data traffic, wherein each rule comprises at least one sub-expression which comprises component parts including a term, a comparison operator and a value; means for user input of a plurality of said logical rules; means for dividing each rule into its sub-expressions and dividing each subexpression into its component parts; a term database for storing the terms of the rules with links to each of the other component parts of the sub-expression of the rule; means for comparing a term of a newly input rule with the terms stored in the term database, and if said term is not found in the term database then storing said term with a link to each of the other component parts of the related sub-expression of the newly input rule; and if said term is found in the term database, then following the stored links to the first of the previous comparisons associated with the stored term, comparing the associated stored value of the comparison with the value of the term in the newly input rule; if the value is stored then continuing to the next sub-expression; if the value is not stored then storing the value as a link to said term; then continuing for the next sub-expression of the newly input rule.
2. A system according to claim 1 comprising storing the data structure using an array of tri-state values wherein one value corresponds with each node in the data structure and one value corresponds to each comparison.
3. A system according to claim 2 comprising using two bits per node for storage.
4. A data structure for processing a plurality of logical rules for the treatment of objects in electronic data traffic, wherein each rule comprises at least one subexpression which comprises component parts including a term, a comparison operator and a value; wherein the data structure takes the form of a rule graph in which any one term appears only once, and any sub-expression occurs only once.
5. The data structure of claim 4 comprising an array of tri-state values for each object wherein one value corresponds with each node in the data structure and one value corresponds to each comparison.
6. A computer system for automatically processing logical rules governing the treatment of objects in electronic data traffic, the system comprising: means for combining a plurality of user input logical rules into a single data structure comprising a term database, a comparison operator database and a rule graph; means for intercepting data traffic comprising a plurality of said objects; means for scanning the intercepted traffic and extracting one or more terms from said objects; means for comparing an intercepted term with terms stored in the term database; if the term does not find a match, then stopping; if the term does find a match, then following stored links associated with the term to each comparison that includes the term and checking the state of the comparison; if the state is known then stopping; if the state is not known then evaluating the comparison, and triggering the associated rule if the comparison is true.
7. A system according to claim 6 wherein the object state is updated to reflect whether the term evaluated to true or false.
8. A system according to claim 7 wherein the parents of each node are evaluated in turn.
9. A system according to claim 8 comprising means for triggering a particular rule if a node in the data structure is evaluated to true and the node is the head node for said rule.
10. A system according to any one of claims 6 to 9 comprising means for taking action in response to the triggering of a rule.
11. A system according to claim 10 wherein said action includes any one or more of: logging details of the object which triggered the rule; sending an e-mail to alert a recipient that the rule has been triggered; running a command line program; and/or adjusting the logging level for the object.
12. A system according to claim 10 or 11 wherein the triggering of a rule generates a pointer to an access object defining the access that triggered the rule.
13. A system according to claim 10, 11 or 12 wherein the triggering of a rule generates a pointer to a layer object defining the particular object within the access that caused the rule to be triggered.
14. A method for assembling a single electronic data structure from a plurality of logical rules for the treatment of objects in electronic data traffic, wherein each rule comprises at least one sub-expression which comprises component parts including a term, a comparison operator and a value, the method comprising the steps: input of a plurality of said logical rules; dividing each rule into its sub-expressions and dividing each sub-expression into its component parts; storing the terms of the rules in a term database with links to each of the other component parts of the sub-expression of the rule; comparing a term of a newly input rule with the terms stored in the term database, and if said term is not found in the term database then storing said term with a link to each of the other component parts of the related sub-expression of the newly input rule; and if said term is found in the term database, then following the stored links to the first of the previous comparisons associated with the stored term, comparing the associated stored value of the comparison with the value of the term in the newly input rule; if the value is stored then continuing to the next sub-expression; if the value is not stored then storing the value as a link to said term; then continuing for the next sub-expression of the newly input rule.
15. A method according to claim 14 further comprising the step of storing the data structure wherein one array of tri-state values exists for each object wherein one value corresponds with each node in the data structure and one value corresponds to each comparison.
16. A method of processing logical rules governing the treatment of objects in electronic data traffic, the method comprising: combining a plurality of user input logical rules into a single data structure comprising a term database, a comparison operator database and a rule graph; intercepting data traffic comprising a plurality of said objects; scanning the intercepted traffic and extracting one or more terms from said objects; comparing an intercepted term with terms stored in the term database; if the term does not find a match, then stopping; if the term does find a match, then following stored links associated with the term to each comparison that includes the term and checking the state of the comparison; if the state is known then stopping; if the state is not known then evaluating the comparison, and triggering the associated rule if the comparison is true.
17. A method according to claim 16 further comprising the step of taking action when a rule is triggered.
18. A method according to claim 17 wherein said action includes any one or more of: logging details of the object which triggered the rule; sending an e-mail to alert a recipient that the rule has been triggered; running a command line program; and/or adjusting the logging level for the object.
PCT/GB2007/050418 2006-07-19 2007-07-19 System WO2008009990A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0614335A GB2440359A (en) 2006-07-19 2006-07-19 Rules for data traffic assembled in single data structure
GB0614335.8 2006-07-19

Publications (1)

Publication Number Publication Date
WO2008009990A1 true WO2008009990A1 (en) 2008-01-24

Family

ID=36998337

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2007/050418 WO2008009990A1 (en) 2006-07-19 2007-07-19 System

Country Status (2)

Country Link
GB (1) GB2440359A (en)
WO (1) WO2008009990A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102214097A (en) * 2011-06-09 2011-10-12 浙江工业大学 Method for visualizing result of program comprehension based on function I.Nassi and B.Schneiderman (NS) graph
CN110505186A (en) * 2018-05-18 2019-11-26 深信服科技股份有限公司 A kind of recognition methods of safety regulation conflict, identification equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0899980A1 (en) * 1997-08-28 1999-03-03 Siemens Aktiengesellschaft Telecommunication network and state propagation method
WO2004015627A2 (en) * 2002-08-09 2004-02-19 Corticon Technologies, Inc. Rule engine
US6876889B1 (en) * 1998-11-17 2005-04-05 Intel Corporation Rule processing system with external application integration

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5751914A (en) * 1995-10-10 1998-05-12 International Business Machines Corporation Method and system for correlating a plurality of events within a data processing system
US20050050060A1 (en) * 2003-08-27 2005-03-03 Gerard Damm Data structure for range-specified algorithms
US7644085B2 (en) * 2003-11-26 2010-01-05 Agere Systems Inc. Directed graph approach for constructing a tree representation of an access control list

Also Published As

Publication number Publication date
GB2440359A (en) 2008-01-30
GB0614335D0 (en) 2006-08-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07766459

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

NENP Non-entry into the national phase

Ref country code: RU

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A) DATED 15.06.09

122 Ep: pct application non-entry in european phase

Ref document number: 07766459

Country of ref document: EP

Kind code of ref document: A1