CN105574032A - Rule matching operation method and device - Google Patents

Info

Publication number
CN105574032A
Authority
CN
China
Prior art keywords
rule
matching
rule condition
unit
units
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410547129.4A
Other languages
Chinese (zh)
Inventor
陈显铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201410547129.4A priority Critical patent/CN105574032A/en
Publication of CN105574032A publication Critical patent/CN105574032A/en
Pending legal-status Critical Current

Abstract

The invention provides a rule matching operation method and device. The method comprises the following steps: decomposing a rule expression into a plurality of rule condition units; adjusting the operation order of the rule condition units according to the priority of operators in the rule expression and the matching success rate, monitored by each rule condition unit, within a certain period; if an AND relationship between the currently operated rule condition units is determined according to the priority of the operators, carrying out priority operation on the rule condition units with relatively low matching success rates; and if an OR relationship between the currently operated rule condition units is determined according to the priority of the operators, carrying out priority operation on the rule condition units with relatively high matching success rates. The rule matching operation method and device can monitor the matching success rates of various rule condition units in the rule expression in real time, and adjust the operation order of the overall rule expression according to the matching success rates, so that the overall operation speed of the rule expression is improved; and the operation efficiency is improved.

Description

Rule matching operation method and device
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a rule matching operation method and device.
Background
With the rapid development of computer technology, users of Internet applications place increasingly strict requirements on operation speed. Traditional rule matching, however, is slow and inefficient.
For example, take the rule expression C != 10 || (A == 0 && B > 5) and the rule context {A = 0, B = 6, C = 10}. A traditional rule operation performs three matches: the first, C != 10, is false; the second, A == 0 inside (A == 0 && B > 5), is true; and the third, B > 5, is true. Only then is the matching result of the rule expression found to be true.
However, adjusting the calculation order of the rule expression can effectively reduce the number of operations and improve system performance. For example, adjust the calculation order of the above rule expression to (A == 0 && B > 5) || C != 10. With the same rule context {A = 0, B = 6, C = 10}, only two matches are needed to obtain the result true. The first operation, A == 0, is true and the second, B > 5, is true, so the matching result of (A == 0 && B > 5) is true. Because (A == 0 && B > 5) and C != 10 are in an OR relationship, once (A == 0 && B > 5) is found to be true there is no need to calculate C != 10; the matching result of the rule expression is already known to be true.
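The saving described above can be reproduced with a short illustrative sketch. It is not part of the disclosure; the helper names `check`, `original_order`, `adjusted_order`, and `count_evaluations` are assumptions introduced only for this illustration. It relies on ordinary short-circuit evaluation and counts how many atomic conditions are actually evaluated for each ordering.

```python
# Illustrative sketch: count atomic evaluations under short-circuit
# evaluation for the two orderings of the rule expression in the text.
context = {"A": 0, "B": 6, "C": 10}
evaluated = []  # names of the conditions actually evaluated, in order


def check(name, predicate):
    """Evaluate one atomic condition and record that it was evaluated."""
    evaluated.append(name)
    return predicate(context)


def original_order():
    # C != 10 || (A == 0 && B > 5)
    return (check("C != 10", lambda c: c["C"] != 10)
            or (check("A == 0", lambda c: c["A"] == 0)
                and check("B > 5", lambda c: c["B"] > 5)))


def adjusted_order():
    # (A == 0 && B > 5) || C != 10
    return ((check("A == 0", lambda c: c["A"] == 0)
             and check("B > 5", lambda c: c["B"] > 5))
            or check("C != 10", lambda c: c["C"] != 10))


def count_evaluations(rule):
    """Return (result, number of atomic conditions evaluated)."""
    evaluated.clear()
    result = rule()
    return result, len(evaluated)


print(count_evaluations(original_order))  # (True, 3)
print(count_evaluations(adjusted_order))  # (True, 2)
```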
Disclosure of Invention
The invention aims to provide a rule matching operation method and a rule matching operation device.
To achieve one of the above objects, an embodiment of the present invention provides a rule matching operation method, including: decomposing the rule expression into a plurality of rule condition units;
adjusting the operation sequence of the rule condition units according to the priority of the operators in the rule expression and the matching success rate monitored for each rule condition unit within a certain period; wherein,
if the rule condition units of the current operation are determined, according to operator priority, to be in an AND relationship, preferentially operating the rule condition unit with the lower matching success rate;
and if the rule condition units of the current operation are determined, according to operator priority, to be in an OR relationship, preferentially operating the rule condition unit with the higher matching success rate.
As a further improvement of an embodiment of the present invention, the rule condition unit is composed of a unit type, a dimension value, an operator, and a comparison value.
As a further improvement of an embodiment of the present invention, "adjusting the rule operation order according to the priority of the operator in the rule expression and the matching success rate monitored by each rule condition unit in a certain period" specifically includes:
analyzing the rule expression into a syntax binary tree according to operator priority and rule condition units, wherein each leaf node of the syntax binary tree corresponds to one rule condition unit;
distributing the same identifier for the same rule condition unit in the leaf node;
replacing the leaf nodes of the syntax binary tree with the identifiers corresponding to the rule condition units, and performing pairwise calculation layer by layer starting from the leaf nodes; and adjusting the operation sequence of the identifiers according to the priority of the operators in the rule expression and the matching success rate monitored, within a certain period, for the rule condition unit corresponding to each identifier.
As a further refinement of an embodiment of the present invention, the method includes:
stacking identifiers according to the unit type of the rule condition unit;
and calculating the matching result of each identifier by combining the dimension value of the rule condition unit according to the execution template corresponding to the unit type.
As a further improvement of an embodiment of the present invention, the method includes monitoring the matching success rate of each rule condition unit within a certain period, which includes:
acquiring, in real time, the matching success rate of each of the plurality of rule condition units within a certain period, where matching success rate = number of successful matches of the rule condition unit within the period / number of matches of the rule condition unit within the period.
As a further improvement of an embodiment of the present invention, calculating the matching success rate specifically includes:
analyzing the rule expression into a rule syntax binary tree according to operator priority and rule condition units, wherein each leaf node of the rule syntax binary tree corresponds to one rule condition unit;
distributing the same identifier for the same rule condition unit in the leaf node;
replacing the leaf nodes of the rule syntax binary tree with the identifiers corresponding to the rule condition units, and calculating the matching results of the identifiers;
where matching success rate = number of successful matches of the identifier corresponding to the rule condition unit / number of matches of the identifier corresponding to the rule condition unit.
As a further improvement of an embodiment of the present invention, "calculating a matching result of the identifier" specifically includes:
stacking identifiers according to the unit type of the rule condition unit;
and calculating the matching result of each identifier by combining the dimension value of the rule condition unit according to the execution template corresponding to the unit type.
To achieve one of the above objects, according to one embodiment of the present invention, there is provided a rule matching calculation apparatus including: the decomposition module is used for decomposing the rule expression into a plurality of rule condition units;
the matching module is used for adjusting the operation sequence of the rule condition units according to the priority of an operator in the rule expression and the matching success rate monitored by each rule condition unit in a certain period; wherein,
if the rule condition units of the current operation are determined, according to operator priority, to be in an AND relationship, preferentially operating the rule condition unit with the lower matching success rate;
and if the rule condition units of the current operation are determined to be in an OR relationship according to the operator priority, preferentially operating the rule condition units with higher matching success rate.
As a further improvement of an embodiment of the present invention, the rule condition unit is composed of a unit type, a dimension value, an operator, and a comparison value.
As a further improvement of an embodiment of the present invention, the matching module is further configured to: analyzing the rule expression into a syntax binary tree according to operator priority and rule condition units, wherein each leaf node of the syntax binary tree corresponds to one rule condition unit;
distributing the same identifier for the same rule condition unit in the leaf node;
replacing the leaf nodes of the syntax binary tree with the identifiers corresponding to the rule condition units, and performing pairwise calculation layer by layer starting from the leaf nodes; and adjusting the operation sequence of the identifiers according to the priority of the operators in the rule expression and the matching success rate monitored, within a certain period, for the rule condition unit corresponding to each identifier.
As a further improvement of an embodiment of the present invention, the matching module is further configured to: stacking identifiers according to the unit type of the rule condition unit;
and calculating the matching result of each identifier by combining the dimension value of the rule condition unit according to the execution template corresponding to the unit type.
As a further improvement of an embodiment of the present invention, the matching module is configured to: monitoring the matching success rate of each rule condition unit in a certain period;
the matching module is specifically configured to acquire, in real time, the matching success rate of each of the plurality of rule condition units within a certain period, where matching success rate = number of successful matches of the rule condition unit within the period / number of matches of the rule condition unit within the period.
As a further improvement of an embodiment of the present invention, the matching module is configured to: analyzing the rule expression into a rule syntax binary tree according to operator priority and rule condition units, wherein each leaf node of the rule syntax binary tree corresponds to one rule condition unit;
distributing the same identifier for the same rule condition unit in the leaf node;
replacing the leaf nodes of the rule syntax binary tree with the identifiers corresponding to the rule condition units, and calculating the matching results of the identifiers;
where matching success rate = number of successful matches of the identifier corresponding to the rule condition unit / number of matches of the identifier corresponding to the rule condition unit.
As a further improvement of an embodiment of the present invention, the matching module is configured to: stacking identifiers according to the unit type of the rule condition unit;
and calculating the matching result of each identifier by combining the dimension value of the rule condition unit according to the execution template corresponding to the unit type.
Compared with the prior art, the invention has the beneficial effects that: the rule matching operation method and the rule matching operation device can monitor the matching success rate of each rule condition unit in the rule expression in real time, and adjust the operation sequence of the whole rule expression according to the matching success rate so as to improve the overall operation speed of the rule expression and further improve the operation efficiency.
Drawings
FIG. 1 is a flow chart illustrating a rule matching operation method according to an embodiment of the present invention;
FIG. 2A is a schematic structural diagram of a rule syntax binary tree according to an embodiment of the present invention;
FIG. 2B is a schematic diagram of the structure of the rule condition unit pool corresponding to FIG. 2A;
FIG. 2C is a schematic diagram of the structure in which the rule condition units in the rule syntax binary tree of FIG. 2A are replaced with the identifiers in the rule condition unit pool;
FIG. 2D is a diagram of the matching results after the identifiers of FIG. 2B are stacked;
FIG. 2E is a diagram of the matching results of FIG. 2C;
FIG. 3 is a block diagram of a rule matching calculation apparatus according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to embodiments shown in the drawings. These embodiments are not intended to limit the present invention, and structural, methodical, or functional changes that can be easily made by one of ordinary skill in the art based on these embodiments are included in the scope of the present invention.
As shown in fig. 1, in an embodiment of the present invention, the rule matching operation method includes:
decomposing the rule expression into a plurality of rule condition units. Accordingly, the rule expression generally includes a number of rule condition units and the operators connecting them.
For example, the rule expression is:
"String:A>1&&int:B!=2||(date:C>201209&&date:C<201211)", where "String:A>1", "int:B!=2", "date:C>201209", and "date:C<201211" are rule condition units, and "&&", "||", "(", and ")" are operators.
Correspondingly, the rule condition unit consists of a unit type, a dimension value, an operator, and a comparison value. Taking one of the rule condition units of the rule expression as an example:
in "String:A>1", "String" is the unit type, "A" is the dimension value, ">" is the operator, and "1" is the comparison value.
It will be appreciated that the unit types include a variety of categories, such as String, int, long, double, money, date, and boolean, and that the operators likewise include multiple categories, such as "==", "!=", ">", "<", "(", ")", "&&", and "||", which are not described in detail herein.
In the above example, the operators involved are as follows: "&&" represents the AND operation, "||" represents the OR operation, and "(" and ")" indicate that the rule condition units enclosed within them are operated on first. The embodiment of the present invention describes the technical solution in detail through the above examples; operators not covered here can be substituted by those skilled in the art according to the embodiment without creative work, and are not described in detail herein.
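As an illustrative sketch only, the four-part composition of a rule condition unit can be modeled as follows. The class name `RuleConditionUnit`, the function `parse_unit`, and the regular-expression pattern are assumptions made for this illustration; the source does not prescribe a concrete representation, and only the comparison operators "==", "!=", ">", and "<" are handled here.

```python
import re
from typing import NamedTuple


class RuleConditionUnit(NamedTuple):
    unit_type: str   # e.g. "String", "int", "date"
    dimension: str   # dimension value looked up in the rule context, e.g. "A"
    operator: str    # comparison operator, e.g. ">"
    comparison: str  # comparison value, kept as text, e.g. "1"


# Two-character operators come first so that "!=" is matched as one operator.
_UNIT_PATTERN = re.compile(r"^\s*(\w+)\s*:\s*(\w+)\s*(==|!=|>|<)\s*(\S+)\s*$")


def parse_unit(text: str) -> RuleConditionUnit:
    """Split text such as 'String:A>1' into its four parts."""
    match = _UNIT_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a rule condition unit: {text!r}")
    return RuleConditionUnit(*match.groups())


print(parse_unit("String:A>1"))
print(parse_unit("int:B!=2"))
```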
In this embodiment, the rule matching operation method further includes:
adjusting the operation sequence of the rule condition units according to the priority of the operators in the rule expression and the matching success rate monitored for each rule condition unit within a certain period. If the rule condition units of the current operation are determined, according to operator priority, to be in an AND relationship, the rule condition unit with the lower matching success rate is operated on first; if they are determined to be in an OR relationship, the rule condition unit with the higher matching success rate is operated on first. An AND is decided as soon as one operand is false and an OR as soon as one operand is true, so this ordering makes it more likely that the first operand alone decides the result.
Correspondingly, matching success rate = number of successful matches of the rule condition unit within a certain period / number of matches of the rule condition unit within that period.
Correspondingly, the rule matching operation method further includes: and monitoring the matching success rate of each rule condition unit in a certain period, and determining and adjusting the operation sequence of the rule expression according to the matching success rate.
Correspondingly, the length of the period is not specifically limited and may be set according to actual needs, for example to 1 minute, 1 hour, or 1 day. The number of times the rule expression is operated within one period is likewise not specifically limited and is not described in detail herein.
For convenience of description, the following description specifically describes, by way of example, a calculation process of matching success rates of all rule condition units included in the rule expression monitored in one period.
In this embodiment, the rule expression may be firstly parsed into a binary syntax tree according to operator priorities and rule condition units, where each leaf node of the binary syntax tree corresponds to one rule condition unit.
For example, the rule expression is:
"String:A>1&&int:B!=2||(date:C>201209&&date:C<201211)".
Correspondingly, referring to fig. 2A, fig. 2A shows the syntax binary tree obtained by parsing the rule expression. The rule condition units "String:A>1", "int:B!=2", "date:C>201209", and "date:C<201211" each correspond to a leaf node, and the leaf nodes are connected and ordered by the operators in the rule expression.
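The parsing step can be sketched with a small recursive-descent parser. This is an illustration under stated assumptions rather than the disclosed implementation: the functions `tokenize` and `parse` and the nested-tuple tree format are invented for this sketch, and "&&" is assumed to bind more tightly than "||", as in most programming languages.

```python
import re


def tokenize(expression):
    """Split a rule expression into unit strings and the tokens &&, ||, (, )."""
    parts = re.split(r"(&&|\|\||\(|\))", expression)
    return [p.strip() for p in parts if p.strip()]


def parse(expression):
    """Return a binary tree: a leaf is a unit string, and an inner node
    is a tuple (operator, left_subtree, right_subtree)."""
    tokens = tokenize(expression)
    pos = 0

    def parse_or():
        nonlocal pos
        node = parse_and()
        while pos < len(tokens) and tokens[pos] == "||":
            pos += 1
            node = ("||", node, parse_and())
        return node

    def parse_and():
        nonlocal pos
        node = parse_primary()
        while pos < len(tokens) and tokens[pos] == "&&":
            pos += 1
            node = ("&&", node, parse_primary())
        return node

    def parse_primary():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return node
        if token in ("&&", "||", ")"):
            raise ValueError(f"unexpected token {token!r}")
        return token  # leaf node: one rule condition unit

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


tree = parse("String:A>1&&int:B!=2||(date:C>201209&&date:C<201211)")
print(tree)
```

Under these assumptions the root of the tree is the "||" node, with one "&&" node on each side and the four rule condition units as leaves.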
Further, in order to reduce the number of times of repeating the calculation of the rule condition unit and improve the calculation efficiency in the following rule matching process, the same identifier may be allocated to the same rule condition unit in the leaf node.
Specifically, as shown in fig. 2B, all leaf nodes of each rule syntax tree are scanned to obtain a corresponding rule condition unit pool, and identifiers are allocated to the same rule condition units in the rule condition unit pool.
For example, the rule condition unit "String:A>1" corresponds to the identifier "decidedUnit 1", the rule condition unit "int:B!=2" corresponds to the identifier "decidedUnit 2", the rule condition unit "date:C>201209" corresponds to the identifier "decidedUnit 3", and the rule condition unit "date:C<201211" corresponds to the identifier "decidedUnit 4".
Further, as shown in fig. 2C, the leaf nodes of the syntax binary tree are replaced with the identifiers corresponding to the rule condition units.
Correspondingly, in the syntax binary tree, "String:A>1" is replaced with "decidedUnit 1", "int:B!=2" is replaced with "decidedUnit 2", "date:C>201209" is replaced with "decidedUnit 3", and "date:C<201211" is replaced with "decidedUnit 4". Identical rule condition units therefore do not need to be calculated one by one; the calculation is carried out once through the identifier they share, which saves a large amount of calculation resources and improves calculation efficiency.
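A minimal sketch of the pooling and replacement steps follows. The functions `build_pool` and `replace_leaves` and the nested-tuple tree format are assumptions made for this illustration; only the identifier naming pattern is taken from the example above.

```python
def build_pool(tree, pool=None):
    """Scan the leaves left to right and map each distinct rule condition
    unit to one identifier; identical units share the same identifier."""
    if pool is None:
        pool = {}
    if isinstance(tree, str):
        # setdefault leaves an existing entry untouched, so a repeated
        # unit keeps the identifier it was given the first time.
        pool.setdefault(tree, f"decidedUnit {len(pool) + 1}")
    else:
        _, left, right = tree
        build_pool(left, pool)
        build_pool(right, pool)
    return pool


def replace_leaves(tree, pool):
    """Return a copy of the tree whose leaves are identifiers."""
    if isinstance(tree, str):
        return pool[tree]
    op, left, right = tree
    return (op, replace_leaves(left, pool), replace_leaves(right, pool))


# Tree for the rule expression in the text, written as nested tuples.
tree = ("||",
        ("&&", "String:A>1", "int:B!=2"),
        ("&&", "date:C>201209", "date:C<201211"))
pool = build_pool(tree)
id_tree = replace_leaves(tree, pool)
print(pool)
print(id_tree)
```

If the same rule condition unit appeared in several leaves, the pool would still contain a single entry for it, so its matching result would need to be computed only once.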
Further, as shown in FIG. 2D, the identifiers are stacked according to the unit type of their rule condition units, and the matching result of each identifier is calculated from the dimension value of the rule condition unit according to the execution template corresponding to the unit type. Identifiers of the same unit type can thus be processed uniformly, which further improves operation efficiency. Preferably, the calculation result obtained in this step is of the boolean type, which is not described in detail herein.
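One possible reading of this step is sketched below. It interprets "stacking" as grouping identifiers by unit type and treats an "execution template" as a per-type conversion applied before the comparison; both interpretations, the names `TEMPLATES`, `COMPARE`, and `evaluate_pool`, and the rule context values are assumptions, since the source gives neither the template contents nor the context.

```python
import operator
from collections import defaultdict

COMPARE = {"==": operator.eq, "!=": operator.ne, ">": operator.gt, "<": operator.lt}

# Hypothetical execution templates: one value converter per unit type.
# Dates are treated as integers such as 201210 for simplicity.
TEMPLATES = {"String": str, "int": int, "date": int}

# identifier -> (unit type, dimension value, operator, comparison value)
pool = {
    "decidedUnit 1": ("String", "A", ">", "1"),
    "decidedUnit 2": ("int", "B", "!=", "2"),
    "decidedUnit 3": ("date", "C", ">", "201209"),
    "decidedUnit 4": ("date", "C", "<", "201211"),
}


def evaluate_pool(pool, context):
    """Group identifiers by unit type, then evaluate each group with the
    template for that type. Returns identifier -> bool."""
    by_type = defaultdict(list)
    for identifier, unit in pool.items():
        by_type[unit[0]].append(identifier)

    results = {}
    for unit_type, identifiers in by_type.items():
        convert = TEMPLATES[unit_type]
        for identifier in identifiers:
            _, dimension, op, comparison = pool[identifier]
            left = convert(context[dimension])
            right = convert(comparison)
            results[identifier] = COMPARE[op](left, right)
    return results


# Assumed rule context, chosen to be consistent with the example results.
context = {"A": "2", "B": 2, "C": 201210}
results = evaluate_pool(pool, context)
print(results)
```

Note that the String comparison here is lexicographic, which is a simplification of this sketch and not a statement about the source's behavior.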
Preferably, after this step, the matching success rate can also be calculated as: matching success rate = number of successful matches of the identifier corresponding to the rule condition unit / number of matches of the identifier corresponding to the rule condition unit.
Correspondingly, a matching result is obtained according to the rule expression, and the matching result is expressed as a Boolean value such as "true" or "false".
In the above example, according to the rule operation context, the matching result of the rule condition unit "String:A>1" corresponding to the identifier "decidedUnit 1" is "true"; the matching result of "int:B!=2" corresponding to "decidedUnit 2" is "false"; the matching result of "date:C>201209" corresponding to "decidedUnit 3" is "true"; and the matching result of "date:C<201211" corresponding to "decidedUnit 4" is "true".
Correspondingly, when the matching result of a rule condition unit is "true", the rule condition unit has matched successfully this time, and both its number of matches and its number of successful matches are recorded. For example, two counters are used: one records the number of successful matches of the rule condition unit or its identifier, and the other records its number of matches. Within a certain period, each time a rule condition unit or its identifier is matched, the counter recording the number of matches is incremented by 1; each time the match succeeds, the counter recording the number of successful matches is also incremented by 1.
Correspondingly, when the matching result corresponding to the corresponding rule condition unit is "false", it indicates that the matching of the rule condition unit fails this time, and meanwhile, the matching times of the rule condition unit also needs to be recorded. Of course, in other embodiments of the present invention, the matching failure times of the rule condition unit may also be recorded at the same time, which is not described in detail herein.
Further, a ternary structure (a three-element record) may be used to hold a single rule condition unit together with the number of successful matches and the number of matches of its identifier.
For example, the ternary structure is represented in Java using three attributes. In the initial state, the ternary structures corresponding to the rule condition units are, respectively: [String:A>1, 0, 0], [int:B!=2, 0, 0], [date:C>201209, 0, 0], and [date:C<201211, 0, 0].
Following the above embodiment, after one operation is completed the ternary structures change along with the counter values, becoming: [String:A>1, 1, 1], [int:B!=2, 0, 1], [date:C>201209, 1, 1], and [date:C<201211, 1, 1].
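The ternary structure and its two counters can be sketched as a small record type. The source mentions Java attributes; the Python dataclass below, with the assumed names `UnitStats`, `success_count`, `match_count`, and `record`, is only an equivalent illustration.

```python
from dataclasses import dataclass


@dataclass
class UnitStats:
    unit: str               # rule condition unit, e.g. "String:A>1"
    success_count: int = 0  # successful matches in the current period
    match_count: int = 0    # all matches in the current period

    def record(self, result: bool) -> None:
        """Count one match, and count it as a success if result is true."""
        self.match_count += 1
        if result:
            self.success_count += 1


units = ("String:A>1", "int:B!=2", "date:C>201209", "date:C<201211")
stats = {unit: UnitStats(unit) for unit in units}  # initial state: all zeros

# Matching results of one operation, as given in the example above.
one_pass = {"String:A>1": True, "int:B!=2": False,
            "date:C>201209": True, "date:C<201211": True}
for unit, result in one_pass.items():
    stats[unit].record(result)

for entry in stats.values():
    print([entry.unit, entry.success_count, entry.match_count])
```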
Further, assume that each rule condition unit is operated 100 times within a certain period. After the rule condition units are matched against the rule operation context according to the above steps, the following ternary structures are obtained: [String:A>1, 80, 100], [int:B!=2, 70, 100], [date:C>201209, 60, 100], and [date:C<201211, 50, 100]. That is, the rule condition unit "String:A>1" was operated 100 times within the period, of which 80 matches succeeded; "int:B!=2" was operated 100 times, of which 70 succeeded; "date:C>201209" was operated 100 times, of which 60 succeeded; and "date:C<201211" was operated 100 times, of which 50 succeeded.
Further, according to the above formula (matching success rate = number of successful matches of the rule condition unit within the period / number of matches of the rule condition unit within the period, or equivalently the same ratio taken over the identifier corresponding to the rule condition unit), the matching success rates of the rule condition units "String:A>1", "int:B!=2", "date:C>201209", and "date:C<201211" are 80%, 70%, 60%, and 50%, respectively.
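The rate calculation for this example can be written out as a short sketch. The function name `success_rate` is an assumption, and returning `None` for a unit that was never matched in the period is a choice made here; the source does not say how that case is handled.

```python
def success_rate(success_count, match_count):
    """Matching success rate for one period, as a fraction between 0 and 1.
    Returns None when the unit was never matched (an assumption)."""
    if match_count == 0:
        return None
    return success_count / match_count


# Ternary structures from the example: (unit, successes, matches).
triples = [
    ("String:A>1", 80, 100),
    ("int:B!=2", 70, 100),
    ("date:C>201209", 60, 100),
    ("date:C<201211", 50, 100),
]
rates = {unit: success_rate(s, n) for unit, s, n in triples}
print(rates)
```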
Correspondingly, the operation sequence of the rule condition units is adjusted according to the priority of the operators in the rule expression and the matching success rate monitored for each rule condition unit within a certain period. If the rule condition units of the current operation are determined, according to operator priority, to be in an AND relationship, the rule condition unit with the lower matching success rate is operated on first; if they are determined to be in an OR relationship, the rule condition unit with the higher matching success rate is operated on first.
Further, as shown in fig. 2E, in the binary syntax tree, pairwise operations are performed layer by leaf nodes; and adjusting the operation sequence of the identifiers according to the priority of an operator in the rule expression and the matching success rate monitored by the rule condition unit corresponding to each identifier in a certain period. Because the calculation results of the leaf nodes are already completed in the previous step, the operation speed of the whole binary syntax tree is high in this step.
Correspondingly, the matching result of the corresponding identifier is replaced into the binary syntax tree, and the operation is performed in a mode of traversing the binary syntax tree according to the operation sequence adjusted by the rule condition unit, so that the final matching result of the single rule expression is obtained.
For example, in the lowest-level structure of the syntax binary tree, the matching success rate of the rule condition unit "date:C>201209" is 60%, which is greater than the 50% matching success rate of the rule condition unit "date:C<201211", and the operator connecting the two units denotes an AND relationship. The rule condition unit "date:C<201211" is therefore calculated first. If its result is true, the rule condition unit "date:C>201209" is then calculated and the operation proceeds to the next step; if its result is false, the operation does not continue, and the result of the AND sub-expression containing this rule condition unit is directly given as false, which is not described in detail herein.
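The reordering rule and the resulting saving can be illustrated with the following sketch. The functions `reorder` and `evaluate` and the nested-tuple tree format are assumptions. Only pairs of sibling leaves are reordered here, because the source does not specify how a subtree's success rate would be ranked against a leaf or another subtree.

```python
def reorder(tree, rate):
    """Return a copy of the tree in which sibling leaves are ordered so the
    leaf more likely to decide the result comes first: the lower rate
    first under "&&", the higher rate first under "||"."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    left, right = reorder(left, rate), reorder(right, rate)
    if isinstance(left, str) and isinstance(right, str):
        if op == "&&" and rate[right] < rate[left]:
            left, right = right, left
        elif op == "||" and rate[right] > rate[left]:
            left, right = right, left
    return (op, left, right)


def evaluate(tree, results, evaluated):
    """Short-circuit evaluation; appends each evaluated leaf to `evaluated`."""
    if isinstance(tree, str):
        evaluated.append(tree)
        return results[tree]
    op, left, right = tree
    first = evaluate(left, results, evaluated)
    if op == "&&" and not first:
        return False  # AND is decided by one false operand
    if op == "||" and first:
        return True   # OR is decided by one true operand
    return evaluate(right, results, evaluated)


tree = ("||",
        ("&&", "String:A>1", "int:B!=2"),
        ("&&", "date:C>201209", "date:C<201211"))
rate = {"String:A>1": 0.8, "int:B!=2": 0.7,
        "date:C>201209": 0.6, "date:C<201211": 0.5}
results = {"String:A>1": True, "int:B!=2": False,
           "date:C>201209": True, "date:C<201211": True}

before, after = [], []
value_before = evaluate(tree, results, before)
value_after = evaluate(reorder(tree, rate), results, after)
print(value_before, len(before))  # True 4
print(value_after, len(after))    # True 3
```

With the matching results from the earlier example, the reordered tree reaches the same result while evaluating three rule condition units instead of four, because "int:B!=2" is tried first and decides its AND node on its own.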
Compared with the prior art, the rule matching operation method can monitor the matching success rate of each rule condition unit in the rule expression in real time, and adjust the operation sequence of the whole rule expression according to the matching success rate, so that the overall operation speed of the rule expression is increased, and the operation efficiency is further improved.
As shown in fig. 3, in an embodiment of the present invention, the rule matching calculation device includes: decomposition module 100, matching module 200.
The decomposition module 100 is used for decomposing the rule expression into a plurality of rule condition units. Accordingly, the rule expression generally includes a number of rule condition units and the operators connecting them.
For example, the rule expression is:
"String:A>1&&int:B!=2||(date:C>201209&&date:C<201211)", where "String:A>1", "int:B!=2", "date:C>201209", and "date:C<201211" are rule condition units, and "&&", "||", "(", and ")" are operators.
Correspondingly, the rule condition unit consists of a unit type, a dimension value, an operator, and a comparison value. Taking one of the rule condition units of the rule expression as an example:
in "String:A>1", "String" is the unit type, "A" is the dimension value, ">" is the operator, and "1" is the comparison value.
It will be appreciated that the unit types include a variety of categories, such as String, int, long, double, money, date, and boolean, and that the operators likewise include multiple categories, such as "==", "!=", ">", "<", "(", ")", "&&", and "||", which are not described in detail herein.
In the above example, the operators involved are as follows: "&&" represents the AND operation, "||" represents the OR operation, and "(" and ")" indicate that the rule condition units enclosed within them are operated on first. The embodiment of the present invention describes the technical solution in detail through the above examples; operators not covered here can be substituted by those skilled in the art according to the embodiment without creative work, and are not described in detail herein.
In this embodiment, the matching module 200 is configured to adjust the operation sequence of the rule condition units according to the priority of the operators in the rule expression and the matching success rate monitored for each rule condition unit within a certain period. If the rule condition units of the current operation are determined, according to operator priority, to be in an AND relationship, the matching module 200 preferentially operates the rule condition unit with the lower matching success rate; if they are determined to be in an OR relationship, the matching module 200 preferentially operates the rule condition unit with the higher matching success rate.
Correspondingly, matching success rate = number of successful matches of the rule condition unit within a certain period / number of matches of the rule condition unit within that period.
Correspondingly, the matching module 200 is configured to monitor a matching success rate of each rule condition unit in a certain period, and determine and adjust an operation sequence of the rule expression according to the matching success rate.
Specifically, in a certain period, the matching module 200 is configured to monitor matching results of all operation rule condition units in the rule expression respectively, and calculate a matching success rate of the rule condition unit according to the results.
Correspondingly, the length of the period is not specifically limited and may be set according to actual needs, for example to 1 minute, 1 hour, or 1 day. The number of times the rule expression is operated within one period is likewise not specifically limited and is not described in detail herein.
For convenience of description, the following description specifically describes, by way of example, a calculation process of matching success rates of all rule condition units included in the rule expression monitored in one period.
In this embodiment, the matching module 200 may first parse the rule expression into a binary syntax tree according to the operator priority and the rule condition unit, where each leaf node of the binary syntax tree corresponds to one rule condition unit.
For example, the rule expression is:
"String:A>1&&int:B!=2||(date:C>201209&&date:C<201211)".
Correspondingly, referring to fig. 2A, fig. 2A shows the binary syntax tree obtained by parsing the rule expression. The rule condition units "String:A>1", "int:B!=2", "date:C>201209" and "date:C<201211" each form a leaf node, and the leaf nodes are connected and ordered by the operators in the rule expression.
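The parsing step can be sketched as a small recursive-descent parser. This is only an illustration under stated assumptions, not the parser of the embodiment: it assumes that "&&" binds more tightly than "||" (as in Java), and the names `Leaf`, `Node`, `tokenize` and `parse` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    condition: str  # one rule condition unit, e.g. "String:A>1"


@dataclass(frozen=True)
class Node:
    op: str  # "&&" or "||"
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]

# Operators and parentheses are tokens; anything else is a rule condition unit.
TOKEN = re.compile(r"&&|\|\||[()]|[^&|()]+")


def tokenize(expr: str) -> list:
    return [t.strip() for t in TOKEN.findall(expr) if t.strip()]


def parse(expr: str) -> Tree:
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():  # lowest priority: a || b
        node = parse_and()
        while peek() == "||":
            take()
            node = Node("||", node, parse_and())
        return node

    def parse_and():  # higher priority: a && b
        node = parse_atom()
        while peek() == "&&":
            take()
            node = Node("&&", node, parse_atom())
        return node

    def parse_atom():  # a parenthesised group or a single rule condition unit
        tok = take()
        if tok == "(":
            node = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        return Leaf(tok)

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return tree


tree = parse("String:A>1&&int:B!=2||(date:C>201209&&date:C<201211)")
```

Under these assumptions the root of `tree` is an "||" node whose two children are "&&" nodes, each holding two leaf rule condition units.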
Further, in order to avoid evaluating identical rule condition units repeatedly in the subsequent rule matching process and thereby improve calculation efficiency, the matching module 200 is configured to allocate the same identifier to identical rule condition units among the leaf nodes.
Specifically, as shown in fig. 2B, the matching module 200 is configured to scan all leaf nodes of each rule syntax tree to obtain a corresponding rule condition unit pool, and to allocate one identifier to each group of identical rule condition units in the pool.
For example, the rule condition unit "String:A>1" corresponds to the identifier "decidedUnit 1", the rule condition unit "int:B!=2" corresponds to the identifier "decidedUnit 2", the rule condition unit "date:C>201209" corresponds to the identifier "decidedUnit 3", and the rule condition unit "date:C<201211" corresponds to the identifier "decidedUnit 4".
Further, as shown in fig. 2C, the matching module 200 is configured to replace the leaf nodes of the binary syntax tree with the identifiers corresponding to the rule condition units.
Accordingly, the matching module 200 replaces "String:A>1" with "decidedUnit 1", "int:B!=2" with "decidedUnit 2", "date:C>201209" with "decidedUnit 3", and "date:C<201211" with "decidedUnit 4".
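The identifier allocation can be sketched as follows. The function name `assign_identifiers` and the use of a dictionary as the rule condition unit pool are assumptions made for illustration; the identifier format follows the "decidedUnit N" naming used in the example.

```python
def assign_identifiers(conditions):
    """Build a pool that maps each distinct rule condition unit to one identifier.

    `conditions` is the sequence of leaf-node texts collected from one or more
    rule syntax trees; identical texts share a single identifier.
    """
    pool = {}
    for condition in conditions:
        if condition not in pool:
            pool[condition] = "decidedUnit %d" % (len(pool) + 1)
    return pool


# Leaves of the example rule, followed by a repeated unit from a second
# (hypothetical) rule that reuses "String:A>1".
leaves = [
    "String:A>1",
    "int:B!=2",
    "date:C>201209",
    "date:C<201211",
    "String:A>1",
]
pool = assign_identifiers(leaves)

# Replacing each leaf with its identifier yields the identifier sequence
# that would be written back into the tree.
replaced = [pool[leaf] for leaf in leaves]
```

Because the repeated unit maps to the identifier already in the pool, its matching result only needs to be computed once per evaluation.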
Further, as shown in fig. 2D, the matching module 200 is configured to stack the identifiers by the unit type of their rule condition units, and to calculate the matching result of each identifier from the dimension value of the rule condition unit using the execution template corresponding to the unit type. Identifiers of the same unit type can thus be processed uniformly, which further improves operation efficiency.
Preferably, the calculation result obtained by the matching module 200 is of the Boolean type, which is not described in detail herein.
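One possible reading of this step is sketched below. The patent does not define what an execution template contains, so the sketch assumes that a template is simply a value converter per unit type; `TEMPLATES`, `OPS`, `UNIT` and `evaluate_units` are hypothetical names, and the context values are made up for illustration.

```python
import operator
import re
from collections import defaultdict

# Comparison operators that may appear inside a rule condition unit.
OPS = {
    "!=": operator.ne, "==": operator.eq,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}

# unit type : dimension, operator, comparison value
UNIT = re.compile(r"(\w+):(\w+)\s*(!=|==|>=|<=|>|<)\s*(\w+)")

# Assumed execution templates: one converter per unit type.
# Strings compare lexicographically; dates are treated as YYYYMM integers.
TEMPLATES = {"String": str, "int": int, "date": int}


def evaluate_units(identifiers, context):
    """Return {identifier: bool} for every rule condition unit.

    `identifiers` maps identifier -> rule condition unit text;
    `context` maps dimension name -> current dimension value.
    """
    by_type = defaultdict(list)
    for ident, text in identifiers.items():
        match = UNIT.fullmatch(text)
        if match is None:
            raise ValueError("cannot parse rule condition unit: " + text)
        unit_type, dimension, op, value = match.groups()
        by_type[unit_type].append((ident, dimension, op, value))

    results = {}
    for unit_type, units in by_type.items():
        convert = TEMPLATES[unit_type]  # shared by all units of this type
        for ident, dimension, op, value in units:
            results[ident] = OPS[op](convert(context[dimension]), convert(value))
    return results


identifiers = {
    "decidedUnit 1": "String:A>1",
    "decidedUnit 2": "int:B!=2",
    "decidedUnit 3": "date:C>201209",
    "decidedUnit 4": "date:C<201211",
}
results = evaluate_units(identifiers, {"A": "2", "B": 2, "C": 201210})
```

With this assumed context the four identifiers evaluate to true, false, true and true, which are the Boolean results used in the worked example of the description.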
Preferably, after this step, the matching success rate can be calculated by the following equation: matching success rate = (the number of successful matches of the identifier corresponding to the rule condition unit) / (the number of matches of the identifier corresponding to the rule condition unit).
Correspondingly, the matching module 200 is configured to obtain the corresponding matching result according to the rule expression, where the matching result is expressed as a conditional value such as "true" or "false".
In the above example, according to the rule operation context, the matching result of the rule condition unit "String:A>1" corresponding to the identifier "decidedUnit 1" is "true"; the matching result of the rule condition unit "int:B!=2" corresponding to the identifier "decidedUnit 2" is "false"; the matching result of the rule condition unit "date:C>201209" corresponding to the identifier "decidedUnit 3" is "true"; and the matching result of the rule condition unit "date:C<201211" corresponding to the identifier "decidedUnit 4" is "true".
Correspondingly, when the matching result of a rule condition unit is "true", the rule condition unit has been matched successfully this time, and the matching module 200 records both the number of matches and the number of successful matches of that rule condition unit. For example, the matching module 200 uses two counters for each rule condition unit or its corresponding identifier: one records the number of matches and the other records the number of successful matches. In a certain period, each time a rule condition unit or its corresponding identifier is matched, the matching module 200 adds 1 to the counter recording its number of matches; and each time a rule condition unit or its corresponding identifier is matched successfully, the matching module 200 also adds 1 to the counter recording its number of successful matches.
Correspondingly, when the matching result corresponding to the corresponding rule condition unit is "false", it indicates that the matching of the rule condition unit fails this time, and meanwhile, the matching times of the rule condition unit also needs to be recorded. Of course, in other embodiments of the present invention, the matching failure times of the rule condition unit may also be recorded at the same time, which is not described in detail herein.
Further, a ternary structure (a triple) may be used to record a single rule condition unit together with the number of successful matches and the number of matches of its corresponding identifier.
Accordingly, two counters are employed in the matching module 200 to record the matching success times of the identifiers corresponding to the single rule condition unit and the matching times of the identifiers. For example, in a certain period, each time the rule condition unit is matched, the matching module 200 adds 1 to the counter recording the matching times of the identifier; and the matching module 200 adds 1 to the counter recording the number of matching successes of the identifier each time the matching of the rule condition unit succeeds, i.e., the returned result of a single rule condition unit is "true".
For example, the above-described ternary structure is represented using three attributes in Java. In the initial state, before any evaluation, the ternary structures corresponding to the rule condition units are: [String:A>1, 0, 0], [int:B!=2, 0, 0], [date:C>201209, 0, 0], and [date:C<201211, 0, 0].
As can be seen from the above embodiment, after one evaluation is completed, the ternary structures change with the counter values, and the result is: [String:A>1, 1, 1], [int:B!=2, 0, 1], [date:C>201209, 1, 1], and [date:C<201211, 1, 1].
Further, assuming that each rule condition unit is operated on 100 times in a certain period, the matching module 200, following the above steps, matches the rule condition units according to the rule operation context and obtains the following ternary structures: [String:A>1, 80, 100], [int:B!=2, 70, 100], [date:C>201209, 60, 100], and [date:C<201211, 50, 100]. These ternary structures mean that the rule condition unit "String:A>1" is operated on 100 times in the period, of which 80 match successfully; the rule condition unit "int:B!=2" is operated on 100 times, of which 70 match successfully; the rule condition unit "date:C>201209" is operated on 100 times, of which 60 match successfully; and the rule condition unit "date:C<201211" is operated on 100 times, of which 50 match successfully.
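The ternary structure and its two counters can be sketched as a small record type. The class name `UnitStats` and its field names are hypothetical; the sketch only reproduces the counting rule described above (every evaluation increments the match counter, and a "true" result also increments the success counter).

```python
from dataclasses import dataclass


@dataclass
class UnitStats:
    """Triple: rule condition unit, successful matches, total matches."""
    condition: str
    successes: int = 0
    matches: int = 0

    def record(self, matched: bool) -> None:
        self.matches += 1        # every evaluation counts as a match attempt
        if matched:
            self.successes += 1  # only a "true" result counts as a success


# Initial state: all counters are zero.
stats = [
    UnitStats("String:A>1"),
    UnitStats("int:B!=2"),
    UnitStats("date:C>201209"),
    UnitStats("date:C<201211"),
]

# One evaluation pass with the results from the example: true, false, true, true.
for unit, result in zip(stats, [True, False, True, True]):
    unit.record(result)

triples = [(u.condition, u.successes, u.matches) for u in stats]
```

After this single pass, `triples` holds the same values as the ternary structures listed for the state after one evaluation.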
Further, the matching module 200 is configured to calculate the matching success rate, which is the number of successful matches of the rule condition unit in a certain period divided by the number of matches of the rule condition unit in that period or, equivalently, the number of successful matches of the identifier corresponding to the rule condition unit divided by the number of matches of that identifier. From the ternary structures above, the matching success rates of the rule condition units "String:A>1", "int:B!=2", "date:C>201209" and "date:C<201211" are 80%, 70%, 60% and 50%, respectively.
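The arithmetic can be checked with a few lines of code. The helper name `success_rate` is hypothetical, and returning 0.0 for a unit that has not yet been matched in the period is an assumption of this sketch; the description does not say how that case is handled.

```python
def success_rate(successes: int, matches: int) -> float:
    """Matching success rate = successful matches / total matches in the period."""
    if matches == 0:
        return 0.0  # assumption: a unit with no matches yet has rate 0
    return successes / matches


# Ternary structures after 100 operations in one period (from the example).
period_triples = [
    ("String:A>1", 80, 100),
    ("int:B!=2", 70, 100),
    ("date:C>201209", 60, 100),
    ("date:C<201211", 50, 100),
]

rates = {cond: success_rate(ok, total) for cond, ok, total in period_triples}
```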
Correspondingly, the matching module 200 is configured to adjust the operation sequence of the rule condition units according to the priority of the operators in the rule expression and the matching success rate monitored for each rule condition unit in a certain period. If the rule condition units currently being operated on are determined, according to the operator priority, to be in an AND relationship, the rule condition unit with the lower matching success rate is operated preferentially; if they are determined to be in an OR relationship, the rule condition unit with the higher matching success rate is operated preferentially.
Further, as shown in fig. 2E, the matching module 200 is configured to perform pairwise operations layer by layer upward from the leaf nodes of the binary syntax tree, and to adjust the operation sequence of the identifiers according to the priority of the operators in the rule expression and the matching success rate monitored, in a certain period, for the rule condition unit corresponding to each identifier. Because the results of the leaf nodes have already been calculated in the foregoing steps, the whole binary syntax tree can be evaluated quickly.
Correspondingly, the matching module 200 is configured to substitute the matching results of the corresponding identifiers into the binary syntax tree, and to traverse the binary syntax tree in the operation sequence adjusted for the rule condition units, so as to obtain the final matching result of a single rule expression.
For example, in the lowest layer of the binary syntax tree, the matching success rate of the rule condition unit "date:C>201209" is 60%, which is greater than the 50% matching success rate of the rule condition unit "date:C<201211", and the operator connecting the two rule condition units expresses an AND relationship. The rule condition unit "date:C<201211" is therefore calculated first. If its result is true, the rule condition unit "date:C>201209" is calculated and the operation continues to the next step; if its result is false, the operation does not continue, and the part of the rule expression containing these rule condition units is directly given the result false, which is not described in detail herein.
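The reordering rule can be sketched as a lazy evaluator that picks which operand to test first. Several points are assumptions of this sketch rather than statements of the patent: the tree is written as nested tuples, subtrees are ranked by an estimated rate that treats the units as independent (product for "&&", complement product for "||"), and the names `estimated_rate`, `order` and `evaluate` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple, Union

# A tree is either a rule condition unit (str) or (operator, left, right).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def estimated_rate(tree: Tree, rates: Dict[str, float]) -> float:
    """Estimated success rate of a subtree, assuming independent units."""
    if isinstance(tree, str):
        return rates[tree]
    op, left, right = tree
    a, b = estimated_rate(left, rates), estimated_rate(right, rates)
    return a * b if op == "&&" else a + b - a * b


def order(op: str, left: Tree, right: Tree, rates: Dict[str, float]):
    """AND: lower success rate first. OR: higher success rate first."""
    l, r = estimated_rate(left, rates), estimated_rate(right, rates)
    swap = r < l if op == "&&" else r > l
    return (right, left) if swap else (left, right)


def evaluate(tree: Tree, rates: Dict[str, float],
             check: Callable[[str], bool], trace: List[str]) -> bool:
    """Evaluate with short-circuiting; `trace` lists the units actually tested."""
    if isinstance(tree, str):
        trace.append(tree)
        return check(tree)
    op, left, right = tree
    first, second = order(op, left, right, rates)
    result = evaluate(first, rates, check, trace)
    if op == "&&" and not result:
        return False  # AND fails as soon as one operand is false
    if op == "||" and result:
        return True   # OR succeeds as soon as one operand is true
    return evaluate(second, rates, check, trace)


rates = {"String:A>1": 0.8, "int:B!=2": 0.7,
         "date:C>201209": 0.6, "date:C<201211": 0.5}
outcomes = {"String:A>1": True, "int:B!=2": False,
            "date:C>201209": True, "date:C<201211": True}
rule = ("||",
        ("&&", "String:A>1", "int:B!=2"),
        ("&&", "date:C>201209", "date:C<201211"))

trace: List[str] = []
matched = evaluate(rule, rates, outcomes.__getitem__, trace)
```

With the example rates and results, the rule matches after testing three of the four units: "int:B!=2" fails and short-circuits the first AND, then "date:C<201211" is tested before "date:C>201209", as in the description above.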
Compared with the prior art, the rule matching operation method and the rule matching operation device monitor the matching success rate of each rule condition unit in the rule expression in real time, and adjust the operation sequence of the whole rule expression according to the matching success rate, so that the overall operation speed of the rule expression is increased, and the operation efficiency is further improved.
In the several embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is merely a logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or 2 or more units may be integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may be modified or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (14)

1. A method of rule matching operations, the method comprising:
decomposing a rule expression into a plurality of rule condition units;
adjusting the operation sequence of the rule condition units according to the priority of the operators in the rule expression and the matching success rate monitored for each rule condition unit in a certain period; wherein,
if the rule condition units currently being operated on are determined to be in an AND relationship according to the priority of the operators, preferentially operating the rule condition unit with the lower matching success rate;
and if the rule condition units currently being operated on are determined to be in an OR relationship according to the priority of the operators, preferentially operating the rule condition unit with the higher matching success rate.
2. The rule matching operation method according to claim 1, wherein the rule condition unit is composed of a unit type, a dimension value, an operator, and a comparison value.
3. The rule matching operation method according to claim 1, wherein adjusting the rule operation order according to the priority of an operator in the rule expression and the matching success rate monitored by each rule condition unit in a certain period specifically comprises:
parsing the rule expression into a syntax binary tree according to the operator priority and the rule condition units, wherein each leaf node of the syntax binary tree corresponds to one rule condition unit;
assigning the same identifier to identical rule condition units among the leaf nodes;
replacing the leaf nodes of the syntax binary tree with the identifiers corresponding to the rule condition units, and performing pairwise operations layer by layer upward from the leaf nodes; and adjusting the operation sequence of the identifiers according to the priority of the operators in the rule expression and the matching success rate monitored, in a certain period, for the rule condition unit corresponding to each identifier.
4. The rule matching operation method according to claim 3, wherein the method comprises:
stacking identifiers according to the unit type of the rule condition unit;
and calculating the matching result of each identifier by combining the dimension value of the rule condition unit according to the execution template corresponding to the unit type.
5. The rule matching operation method according to claim 1, wherein the method includes monitoring the matching success rate of each rule condition unit in a certain period, which specifically comprises:
acquiring, in real time, the matching success rate of each of the plurality of rule condition units in a certain period, wherein the matching success rate = the number of successful matches of the rule condition unit in the period / the number of matches of the rule condition unit in the period.
6. The rule matching operation method according to claim 5, wherein calculating the matching success rate specifically comprises:
parsing the rule expression into a rule syntax binary tree according to the operator priority and the rule condition units, wherein each leaf node of the rule syntax binary tree corresponds to one rule condition unit;
assigning the same identifier to identical rule condition units among the leaf nodes;
replacing the leaf nodes of the rule syntax binary tree with the identifiers corresponding to the rule condition units, and calculating the matching results of the identifiers;
wherein the matching success rate = the number of successful matches of the identifier corresponding to the rule condition unit / the number of matches of the identifier corresponding to the rule condition unit.
7. The rule matching operation method according to claim 6, wherein the "calculating the matching result of the identifier" specifically includes:
stacking identifiers according to the unit type of the rule condition unit;
and calculating the matching result of each identifier by combining the dimension value of the rule condition unit according to the execution template corresponding to the unit type.
8. A rule matching operation apparatus, comprising:
the decomposition module is used for decomposing the rule expression into a plurality of rule condition units;
the matching module is used for adjusting the operation sequence of the rule condition units according to the priority of an operator in the rule expression and the matching success rate monitored by each rule condition unit in a certain period; wherein,
if the rule condition units currently being operated on are determined to be in an AND relationship according to the priority of the operators, preferentially operating the rule condition unit with the lower matching success rate;
and if the rule condition units currently being operated on are determined to be in an OR relationship according to the priority of the operators, preferentially operating the rule condition unit with the higher matching success rate.
9. The rule matching operation device of claim 8, wherein the rule condition unit is composed of a unit type, a dimension value, an operator, and a comparison value.
10. The rule matching operation device according to claim 8, wherein the matching module is further configured to:
parsing the rule expression into a syntax binary tree according to the operator priority and the rule condition units, wherein each leaf node of the syntax binary tree corresponds to one rule condition unit;
assigning the same identifier to identical rule condition units among the leaf nodes;
replacing the leaf nodes of the syntax binary tree with the identifiers corresponding to the rule condition units, and performing pairwise operations layer by layer upward from the leaf nodes; and adjusting the operation sequence of the identifiers according to the priority of the operators in the rule expression and the matching success rate monitored, in a certain period, for the rule condition unit corresponding to each identifier.
11. The rule matching operation device according to claim 10, wherein the matching module is configured to:
stacking identifiers according to the unit type of the rule condition unit;
and calculating the matching result of each identifier by combining the dimension value of the rule condition unit according to the execution template corresponding to the unit type.
12. The rule matching operation device according to claim 8, wherein the matching module is configured to monitor a matching success rate of each rule condition unit in a certain period;
the matching module is specifically configured to acquire, in real time, the matching success rate of each of the plurality of rule condition units in a certain period, wherein the matching success rate = the number of successful matches of the rule condition unit in the period / the number of matches of the rule condition unit in the period.
13. The rule matching operation device according to claim 12, wherein the matching module is configured to:
parsing the rule expression into a rule syntax binary tree according to the operator priority and the rule condition units, wherein each leaf node of the rule syntax binary tree corresponds to one rule condition unit;
assigning the same identifier to identical rule condition units among the leaf nodes;
replacing the leaf nodes of the rule syntax binary tree with the identifiers corresponding to the rule condition units, and calculating the matching results of the identifiers;
wherein the matching success rate = the number of successful matches of the identifier corresponding to the rule condition unit / the number of matches of the identifier corresponding to the rule condition unit.
14. The rule matching operation device according to claim 13, wherein the matching module is configured to:
stacking identifiers according to the unit type of the rule condition unit;
and calculating the matching result of each identifier by combining the dimension value of the rule condition unit according to the execution template corresponding to the unit type.
CN201410547129.4A 2014-10-15 2014-10-15 Rule matching operation method and device Pending CN105574032A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410547129.4A CN105574032A (en) 2014-10-15 2014-10-15 Rule matching operation method and device

Publications (1)

Publication Number Publication Date
CN105574032A true CN105574032A (en) 2016-05-11

Family

ID=55884175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410547129.4A Pending CN105574032A (en) 2014-10-15 2014-10-15 Rule matching operation method and device

Country Status (1)

Country Link
CN (1) CN105574032A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090222793A1 (en) * 2008-02-29 2009-09-03 International Business Machines Corporation Virtual Machine and Programming Language for Event Processing
CN101739248A (en) * 2008-11-13 2010-06-16 国际商业机器公司 Method and system for executing rule set
CN102004613A (en) * 2010-12-07 2011-04-06 无锡永中软件有限公司 Dendriform display method of expression and evaluation method
CN103973684A (en) * 2014-05-07 2014-08-06 北京神州绿盟信息安全科技股份有限公司 Rule compiling and matching method and device
CN105573726A (en) * 2014-10-10 2016-05-11 阿里巴巴集团控股有限公司 Rule processing method and equipment

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106534095A (en) * 2016-10-27 2017-03-22 成都知道创宇信息技术有限公司 Fast matching method for WAF security rules
CN108616410A (en) * 2016-12-09 2018-10-02 北京京东尚科信息技术有限公司 Information calibration method and device
CN106780656A (en) * 2016-12-30 2017-05-31 中国民航信息网络股份有限公司 Chart output intent and device
CN109002355A (en) * 2018-06-06 2018-12-14 阿里巴巴集团控股有限公司 Handle distribution method, device and the equipment of request
CN109002355B (en) * 2018-06-06 2022-04-05 创新先进技术有限公司 Distribution method, device and equipment for processing requests
CN109726312A (en) * 2018-12-25 2019-05-07 广州虎牙信息科技有限公司 A kind of regular expression detection method, device, equipment and storage medium
CN110134941A (en) * 2019-04-01 2019-08-16 贵州力创科技发展有限公司 A kind of compound expression analytic method and system
CN115391068A (en) * 2022-10-26 2022-11-25 南方电网数字电网研究院有限公司 Framework construction method based on IT (information technology) resource management and control system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160511