CN106874255A - Method and device for rule matching - Google Patents

Method and device for rule matching

Info

Publication number
CN106874255A
Authority
CN
China
Prior art keywords
rule
current
match
rule set
count value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510921020.7A
Other languages
Chinese (zh)
Inventor
徐文斌
何鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201510921020.7A
Publication of CN106874255A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars

Abstract

The invention discloses a method and device for rule matching. The method includes: obtaining a current rule set during the current round of rule matching, where the current rule set is the set of rules used in the current round and is obtained by sorting the rules of the previous rule set according to the number of times each rule matched within a preset time period; obtaining the corpus on which the rule matching operation is to be performed; and performing rule matching on that corpus using the current rule set. The invention addresses the technical problem in the related art of low matching efficiency caused by a fixed rule order.

Description

Method and device for rule matching
Technical field
The present invention relates to the field of natural language processing, and in particular to a method and device for rule matching.
Background
For structured data, extracting or labeling information with rules has the advantage of being fast and concise, so rule-based modules still hold a large share of the market. Rule matching is usually carried out in the order in which the rules are arranged. However, different corpora require different rules to be matched different numbers of times, so matching speed is limited by the order of the rules.
In the related art, the rule matching mechanism requires the order of the rules to be fixed in advance and then applies a dictionary-matching pattern to each new sentence, so the whole matching process must traverse every rule in the rule set. For example, if N is the size of the rule set, measured as the number of rules, then matching each sentence requires N traversals. Moreover, because matching is based on traversal, even rules that rarely occur require a pass over the rule set. When the rule set is small, this matching mode can still meet production requirements, but when the rule set is large, the time complexity of the whole project increases greatly and production requirements can no longer be met well.
Specifically, in the related art the rule matching process is as follows. 1. Traverse the whole rule set and, for each rule in the set, traverse all words in the sentence and perform one rule matching operation on each word. 2. Determine whether any position in the sentence matches the current rule. If the result is false, no information matches the current rule, and matching continues with the subsequent rules. If the result is true, information matching the current rule exists; the matching position is recorded, control passes to the information processing routine, and this run of the matching algorithm ends. 3. Obtain the matched information. The drawback of this scheme is that the rule set is traversed by enumeration and a word list must be traversed for every rule, which produces a large number of invalid matches, adds extra comparisons, and reduces matching efficiency.
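The fixed-order process described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the rule names and regular expressions are hypothetical stand-ins for real rules, and a regex search over the sentence replaces the per-word traversal. The `tried` counter shows how many rules were attempted before a result was returned.

```python
import re

# Hypothetical rule set in a fixed, hand-chosen order (patterns are illustrative only).
RULES = [
    ("mobile", re.compile(r"(?<!\d)1\d{10}(?!\d)")),
    ("landline", re.compile(r"(?<!\d)010-?\d{8}(?!\d)")),
    ("email", re.compile(r"[\w.]+@[\w.]+\.\w+")),
]

def match_fixed_order(sentence, rules=RULES):
    """Try each rule in its fixed order.

    Returns (rule name, (start, end) position, number of rules tried),
    or (None, None, number of rules tried) when no rule matches.
    """
    tried = 0
    for name, pattern in rules:
        tried += 1
        found = pattern.search(sentence)
        if found:  # "true": record the matching position and stop
            return name, found.span(), tried
    return None, None, tried  # "false" for every rule in the set

name, span, tried = match_fixed_order("call 010-12345678")
```

In this sketch a sentence that only contains a landline number always pays for one failed attempt against the mobile rule first, which is the kind of invalid matching the text describes.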
No effective solution to the above problem has yet been proposed.
Summary of the invention
Embodiments of the present invention provide a method and device for rule matching, to solve at least the technical problem in the related art of low matching efficiency caused by a fixed rule order.
According to one aspect of the embodiments of the present invention, a rule matching method is provided, including: obtaining a current rule set during the current round of rule matching, where the current rule set is the set of rules used in the current round and is obtained by sorting the rules of the previous rule set according to the number of times each rule matched within a preset time period; obtaining the corpus on which the rule matching operation is to be performed; and performing rule matching on that corpus using the current rule set.
Further, each rule has a count value, and the method also includes: during rule matching in the current round, after a rule in the current rule set successfully matches a corresponding position in the corpus, incrementing the count value of that rule; and reordering the rules in the current rule set according to the incremented count value of each rule to obtain a next rule set, where the next rule set is the set of rules used in the next round of rule matching.
Further, after the rules in the current rule set are reordered and before the next rule set is obtained, the method also includes: initializing the count value of each reordered rule, where the next rule set is generated from the rules after reordering and count value initialization.
Further, after the count value of the rule is incremented, the method also includes: determining whether the time elapsed from the start of the current round of rule matching to the current time has reached the length of the preset time period, where, if it has, the rules in the current rule set are reordered according to the incremented count value of each rule to obtain the next rule set.
Further, each rule has a predetermined rule sequence number, and reordering the rules in the current rule set according to the incremented count value of each rule to obtain the next rule set includes: determining the original position of each rule of the current rule set in a data structure according to its predetermined rule sequence number; and adjusting the original position of each rule in the data structure according to the incremented count value of each rule to obtain the next rule set.
According to another aspect of the embodiments of the present invention, a rule matching device is also provided, including: a first acquisition unit, configured to obtain a current rule set during the current round of rule matching, where the current rule set is the set of rules used in the current round and is obtained by sorting the rules of the previous rule set according to the number of times each rule matched within a preset time period; a second acquisition unit, configured to obtain the corpus on which the rule matching operation is to be performed; and a matching unit, configured to perform rule matching on that corpus using the current rule set.
Further, each rule has a count value, and the device also includes: a processing unit, configured to increment the count value of a rule in the current rule set after that rule successfully matches a corresponding position in the corpus during rule matching in the current round; and a sorting unit, configured to reorder the rules in the current rule set according to the incremented count value of each rule to obtain a next rule set, where the next rule set is the set of rules used in the next round of rule matching.
Further, the device also includes: an initialization unit, configured to initialize the count value of each reordered rule after the rules in the current rule set are reordered and before the next rule set is obtained; and a generation unit, configured to generate the next rule set from the rules after reordering and count value initialization.
Further, the device also includes: a judging unit, configured to determine, after the count value of the rule is incremented, whether the time elapsed from the start of the current round of rule matching to the current time has reached the length of the preset time period, where the sorting unit is further configured to reorder the rules in the current rule set according to the incremented count value of each rule to obtain the next rule set when that length has been reached.
Further, each rule has a predetermined rule sequence number, and the sorting unit includes: a determining module, configured to determine the original position of each rule of the current rule set in a data structure according to its predetermined rule sequence number; and an adjusting module, configured to adjust the original position of each rule in the data structure according to the incremented count value of each rule to obtain the next rule set.
In the embodiments of the present invention, the rule order is adjusted dynamically. A current rule set is obtained during the current round of rule matching, where the current rule set is the set of rules used in the current round and is obtained by sorting the rules of the previous rule set according to the number of times each rule matched within a preset time period; the corpus on which the rule matching operation is to be performed is obtained; and rule matching is performed on that corpus using the current rule set. This achieves the purpose of dynamically adjusting the rule order, which speeds up rule matching and improves rule matching efficiency, and in turn solves the technical problem in the related art of low matching efficiency caused by a fixed rule order.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and form part of this application. The exemplary embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of an optional rule matching method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an optional rule matching device according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and not to describe a specific order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the invention described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Embodiment 1
According to an embodiment of the present invention, a method embodiment of a rule matching method is provided. It should be noted that the steps shown in the flowchart of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order.
Fig. 1 is a flowchart of an optional rule matching method according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102: obtain a current rule set during the current round of rule matching, where the current rule set is the set of rules used in the current round and is obtained by sorting the rules of the previous rule set according to the number of times each rule matched within a preset time period;
Step S104: obtain the corpus on which the rule matching operation is to be performed;
Step S106: perform rule matching on that corpus using the current rule set.
A rule set is simply a set of rules. In general, each rule set contains several rules arranged in a certain order, that is, each rule has a predetermined rule sequence number. It should be noted that when a rule set is used for the first time, the predetermined rule sequence number of each rule is set manually; in later uses, the sequence number of each rule is adjusted automatically according to how often the rule matched successfully in earlier matching. Therefore, obtaining during the current round a current rule set whose rules were sorted by their number of matches in the previous rule set within the preset time period, obtaining the corpus to be matched, and matching that corpus with the current rule set reduces the number of invalid matches, speeds up rule matching, and improves matching efficiency.
For example, suppose rule set 1 contains rules A, B, and C, ordered A, B, C, where A represents 11-digit mobile phone numbers, B represents landline numbers beginning with 010, and C represents email addresses. In the related art, even if only the landline numbers beginning with 010 in corpus 1 need to be matched with rule set 1, every match first tries rule A and only then tries rule B. In the present invention, corpus 2 (which is similar to corpus 1) is first matched with rule set 1 to train rule set 2, which is ordered B, A, C, and rule set 2 is then used to match corpus 1. In this way, attempts against rule A are largely avoided during matching, which reduces the number of invalid matches, speeds up rule matching, and improves matching efficiency.
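The A, B, C example can be made concrete with a small sketch. The regular expressions, sample sentences, and function names below are hypothetical; only the idea of counting successes on a similar corpus and then sorting by descending count comes from the text.

```python
import re
from collections import Counter

# Illustrative stand-ins for rules A, B and C from the example.
RULES = {
    "A": re.compile(r"(?<!\d)1\d{10}(?!\d)"),     # 11-digit mobile number
    "B": re.compile(r"(?<!\d)010-?\d{8}(?!\d)"),  # landline number starting with 010
    "C": re.compile(r"[\w.]+@[\w.]+\.\w+"),       # email address
}

def first_match(sentence, order):
    """Return (rule id, rules tried) for the first rule in `order` that matches."""
    for tried, rule_id in enumerate(order, start=1):
        if RULES[rule_id].search(sentence):
            return rule_id, tried
    return None, len(order)

def train_order(corpus, order):
    """Count successes on `corpus` and return the rules sorted by descending count."""
    counts = Counter({rule_id: 0 for rule_id in order})
    for sentence in corpus:
        rule_id, _ = first_match(sentence, order)
        if rule_id is not None:
            counts[rule_id] += 1
    # sorted() is stable, so rules with equal counts keep their previous order.
    return sorted(order, key=lambda rule_id: -counts[rule_id])

corpus_2 = ["ring 010-11112222", "office 01033334444", "mobile 13800001111"]
corpus_1 = ["dial 010-55556666", "fax 010-77778888"]

order_1 = ["A", "B", "C"]                 # rule set 1, manually ordered
order_2 = train_order(corpus_2, order_1)  # rule set 2, learned from corpus 2

attempts_before = sum(first_match(s, order_1)[1] for s in corpus_1)
attempts_after = sum(first_match(s, order_2)[1] for s in corpus_1)
```

With these sample sentences, `order_2` comes out as B, A, C, and matching corpus 1 takes two rule attempts instead of four.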
It should be noted that in the present invention each rule may include a parameter item with two parts, a predetermined rule sequence number and a counter. The predetermined rule sequence number is initially set manually and defines the order in which the rules are matched against the corpus during rule matching. The counter records the number of times the rule has matched successfully on the current corpus, which gives the count value. To make it easy to adjust the rule order dynamically, the ordered rules can be stored in a data structure when they are used. In the embodiments of the present invention, many kinds of data structure are possible, such as tree structures, forest structures, graph models, linked lists, or any other structure that can be used for ordering. The present invention is described in detail below using a data structure based on a Huffman tree. A Huffman tree is easy to adjust, and rules can be attached directly to its nodes, which makes it convenient to use. The purpose of dynamically adjusting the rule order is to give the nodes of the whole data structure an optimal traversal order, with the initial positions determined by the count values of the rules' Counter parameter. In addition, when the current sentence in the corpus is matched, the sequence number of the matching rule may also be recorded.
With the above embodiment, the rule order can be adjusted dynamically, which speeds up rule matching and improves rule matching efficiency. That is, the rule matching order is adjusted dynamically by taking the distribution of the current corpus into account, in order to improve matching efficiency. The method is applicable not only to word and sentence matching in natural language processing but also to other similar scenarios.
Optionally, each rule has a count value, which is recorded by the counter in the rule's parameter item and has an initial value of 0, and the method also includes:
S2: during rule matching in the current round, after a rule in the current rule set successfully matches a corresponding position in the corpus, increment the count value of that rule;
S4: reorder the rules in the current rule set according to the incremented count value of each rule to obtain a next rule set, where the next rule set is the set of rules used in the next round of rule matching.
That is, when a sentence is matched, the count value in the parameter item of the corresponding rule may change. For example, rule matching is performed on all sentences in the corpus according to the Huffman tree structure of the rules in the current rule set, where the matching order of the rules is determined by the data structure that stores them. In practice, matching starts from the root node of the Huffman tree. When a node matches successfully, the count value of the counter in the parameter item of the rule attached to that node is incremented by 1, and the specific position in the sentence that matched the rule is returned, which completes the match. Further, to optimize the efficiency of the next round of rule matching, when a count value reaches a preset value or the elapsed time reaches a preset time, the rules can be reordered according to their count values to obtain a new rule set that speeds up matching, for use the next time a similar corpus is matched.
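A minimal sketch of the counting step, assuming a flat list in place of the Huffman tree. The `Rule` class, its field names, and the `threshold` parameter are hypothetical; `seq` stands for the predetermined rule sequence number and `count` for the counter in the parameter item.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    seq: int             # predetermined rule sequence number (matching order)
    pattern: re.Pattern  # stand-in for the rule body
    count: int = 0       # successful matches so far, initial value 0

def match_sentence(sentence, rules):
    """Try rules in ascending `seq`; on the first success, increment its counter.

    Returns (matching rule, (start, end) position) or (None, None).
    """
    for rule in sorted(rules, key=lambda r: r.seq):
        found = rule.pattern.search(sentence)
        if found:
            rule.count += 1
            return rule, found.span()
    return None, None

def reorder_if_due(rules, threshold):
    """Once any counter reaches `threshold`, renumber rules by descending count."""
    if max(rule.count for rule in rules) < threshold:
        return False
    ranked = sorted(rules, key=lambda r: (-r.count, r.seq))
    for new_seq, rule in enumerate(ranked):
        rule.seq = new_seq
    return True

rules = [Rule(0, re.compile("a")), Rule(1, re.compile("b"))]
for _ in range(3):
    matched, position = match_sentence("xb", rules)
reordered = reorder_if_due(rules, threshold=3)
```

This sketch only covers the count-based trigger; the time-based trigger and the counter reset are described separately in the text.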
With this embodiment of the present invention, the order of the rules in the rule set can be adjusted dynamically for use in subsequent rule matching, which reduces unnecessary matching work in subsequent rounds and improves matching efficiency.
In addition, after the increment, the parameter item of the rule needs to be saved, so that when the same or a similar rule matching task is executed later, the saved result can be used directly for ordering, which saves time.
Further optionally, after the rules in the current rule set are reordered and before the next rule set is obtained, the method also includes:
S6: initialize the count value of each reordered rule, where the next rule set is generated from the rules after reordering and count value initialization.
Specifically, each time the matching order of the rules in the rule set is readjusted, the count value of each rule needs to be reset. The counters can then count afresh during subsequent rule matching, and, building on the foregoing embodiments, the matching order of the rules in the rule set can be adjusted dynamically.
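One way to picture the reset step is the sketch below, which uses hypothetical names and made-up counts. It also compares the result with what a never-reset cumulative count would give; that comparison is an interpretation added here for illustration, not something the source states.

```python
def next_rule_set(order, counts):
    """Sort rule ids by descending count, then hand back zeroed counters."""
    new_order = sorted(order, key=lambda rule_id: -counts[rule_id])
    fresh_counts = {rule_id: 0 for rule_id in new_order}
    return new_order, fresh_counts

# Period 1: only rule A matches (illustrative numbers).
order, counts = ["A", "B"], {"A": 0, "B": 0}
counts["A"] += 100
order, counts = next_rule_set(order, counts)

# Period 2: the corpus shifts and only rule B matches.
counts["B"] += 30
with_reset, _ = next_rule_set(order, counts)

# For comparison: ranking by counts that were never reset.
cumulative = {"A": 100, "B": 30}
without_reset = sorted(order, key=lambda rule_id: -cumulative[rule_id])
```

In this example the reset lets rule B move to the front after the second period, whereas the cumulative counts would keep rule A first.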
Optionally, after the count value of the rule is incremented, the method also includes:
S8: determine whether the time elapsed from the start of the current round of rule matching to the current time has reached the length of the preset time period, where, if it has, the rules in the current rule set are reordered according to the incremented count value of each rule to obtain the next rule set.
That is, the rules are reordered by count value, and the rule data structure is changed, only after a fixed length of time has passed. After matching work has been done, the parameter values of the nodes in the Huffman tree have changed, but to avoid wasting resources by adjusting the rule order too frequently, the original rule order is not modified within a time slice of the preset length. Only after that time slice ends is the rule order adjusted dynamically using the above scheme.
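The time-slice gating might look like the following sketch. The class name, the `period_seconds` parameter, and the injectable `clock` are assumptions made so that the behavior can be exercised without waiting; a flat list again stands in for the Huffman tree.

```python
import time

class PeriodicReorderer:
    """Counts successes continuously but reorders only once per time slice."""

    def __init__(self, rule_ids, period_seconds, clock=time.monotonic):
        self.order = list(rule_ids)
        self.counts = {rule_id: 0 for rule_id in self.order}
        self.period = period_seconds
        self.clock = clock
        self.started = clock()  # start of the current round of rule matching

    def record_success(self, rule_id):
        """Increment the counter, then reorder only if the time slice has elapsed."""
        self.counts[rule_id] += 1
        if self.clock() - self.started < self.period:
            return False  # inside the time slice: leave the order untouched
        self.order.sort(key=lambda r: -self.counts[r])  # stable sort by count
        self.counts = {r: 0 for r in self.order}        # reset for the next round
        self.started = self.clock()
        return True

# A fake clock makes the example deterministic.
now = [0.0]
reorderer = PeriodicReorderer(["A", "B"], period_seconds=10, clock=lambda: now[0])
early = reorderer.record_success("B")
order_inside_slice = list(reorderer.order)
now[0] = 10.0
late = reorderer.record_success("B")
```

Passing the clock in as a parameter is a testing convenience; by default the sketch uses `time.monotonic`, which is not affected by system clock changes.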
Optionally, each rule has a predetermined rule sequence number, and reordering the rules in the current rule set according to the incremented count value of each rule to obtain the next rule set includes:
S10: determine the original position of each rule of the current rule set in a data structure according to its predetermined rule sequence number;
S12: adjust the original position of each rule in the data structure according to the incremented count value of each rule to obtain the next rule set.
Taking Huffman tree storage of the rules as an example, after the time slice ends, the position of each rule in the data structure is adjusted with the following procedure:
Step 1: check whether the current node is the root node. If so, end; if not, go to Step 2.
Step 2: move the current node to the position of its parent node.
Step 3: determine whether the adjusted current node is the largest node of its block. If so, go to Step 4; if not, swap the positions of the current node and the largest node.
Step 4: determine whether the required ordering is broken after the current node is adjusted. If so, decrement the counter by 1 and jump back to Step 3; otherwise, jump back to Step 1 and continue adjusting until the structure of the Huffman tree no longer changes.
Finally, within Step 4, Steps 2 to 3 are repeated while the rule matching work is carried out, and the Huffman tree structure is continually adjusted dynamically so that it becomes better suited to the corpus matching task.
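The incremental node-swapping procedure above is not reproduced here. As a simpler illustration of the underlying idea, the sketch below rebuilds a static Huffman tree from the count values with a heap and reads off the depth of each rule's leaf. Treating leaf depth as the traversal priority is an assumption of this sketch, and the function names are hypothetical.

```python
import heapq
from itertools import count

def huffman_depths(counts):
    """Build a Huffman tree over rule counts and return {rule id: leaf depth}.

    Rule ids must not be tuples, because tuples mark internal nodes here.
    """
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(weight, next(tiebreak), rule_id) for rule_id, weight in counts.items()]
    if not heap:
        return {}
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    depths = {}
    stack = [(heap[0][2], 0)]
    while stack:
        node, depth = stack.pop()
        if isinstance(node, tuple):         # internal node: (left, right)
            stack.extend((child, depth + 1) for child in node)
        else:
            depths[node] = depth
    return depths

def traversal_order(counts):
    """Order rules by leaf depth, then by descending count, then by id."""
    depths = huffman_depths(counts)
    return sorted(depths, key=lambda r: (depths[r], -counts[r], r))

counts = {"A": 1, "B": 5, "C": 1, "D": 2}  # illustrative count values
depths = huffman_depths(counts)
order = traversal_order(counts)
```

With these counts the most frequently matched rule, B, ends up nearest the root and is tried first, while the rarely matched rules A and C sit deepest. A full rebuild costs more than the local swaps described in the text, which is why it is presented only as an illustration.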
With this embodiment of the present invention, the rules are arranged in a structured way, and during rule matching this structured arrangement is adjusted automatically according to the matching results, which speeds up subsequent rule matching. Specifically, during rule matching the order of the rules is adjusted dynamically according to how frequently each rule matches successfully in the current corpus and according to the priority order of the rules, which reduces unnecessary matching and lowers the time complexity of the whole algorithm to O(log(n)), so that engineering-level requirements are met. The method still yields accurate matching position information and minimizes invalid position comparisons, so that the implementation complexity of the model is close to linear. The execution efficiency of the method is improved, which makes it better suited to engineering use.
Embodiment 2
According to an embodiment of the present invention, a device embodiment of a rule matching device is provided.
Fig. 2 is a schematic diagram of an optional rule matching device according to an embodiment of the present invention. As shown in Fig. 2, the device includes: a first acquisition unit 202, configured to obtain a current rule set during the current round of rule matching, where the current rule set is the set of rules used in the current round and is obtained by sorting the rules of the previous rule set according to the number of times each rule matched within a preset time period; a second acquisition unit 204, configured to obtain the corpus on which the rule matching operation is to be performed; and a matching unit 206, configured to perform rule matching on that corpus using the current rule set.
A rule set is simply a set of rules. In general, each rule set contains several rules arranged in a certain order, that is, each rule has a predetermined rule sequence number. It should be noted that when a rule set is used for the first time, the predetermined rule sequence number of each rule is set manually; in later uses, the sequence number of each rule is adjusted automatically according to how often the rule matched successfully in earlier matching. Therefore, obtaining during the current round a current rule set whose rules were sorted by their number of matches in the previous rule set within the preset time period, obtaining the corpus to be matched, and matching that corpus with the current rule set reduces the number of invalid matches, speeds up rule matching, and improves matching efficiency.
For example, suppose rule set 1 contains rules A, B, and C, ordered A, B, C, where A represents 11-digit mobile phone numbers, B represents landline numbers beginning with 010, and C represents email addresses. In the related art, even if only the landline numbers beginning with 010 in corpus 1 need to be matched with rule set 1, every match first tries rule A and only then tries rule B. In the present invention, corpus 2 (which is similar to corpus 1) is first matched with rule set 1 to train rule set 2, which is ordered B, A, C, and rule set 2 is then used to match corpus 1. In this way, attempts against rule A are largely avoided during matching, which reduces the number of invalid matches, speeds up rule matching, and improves matching efficiency.
It should be noted that in the present invention each rule may include a parameter item with two parts, a predetermined rule sequence number and a counter. The predetermined rule sequence number is initially set manually and defines the order in which the rules are matched against the corpus during rule matching. The counter records the number of times the rule has matched successfully on the current corpus, which gives the count value. To make it easy to adjust the rule order dynamically, the ordered rules can be stored in a data structure when they are used. In the embodiments of the present invention, many kinds of data structure are possible, such as tree structures, forest structures, graph models, linked lists, or any other structure that can be used for ordering. The present invention is described in detail below using a data structure based on a Huffman tree. A Huffman tree is easy to adjust, and rules can be attached directly to its nodes, which makes it convenient to use. The purpose of dynamically adjusting the rule order is to give the nodes of the whole data structure an optimal traversal order, with the initial positions determined by the count values of the rules' Counter parameter. In addition, when the current sentence in the corpus is matched, the sequence number of the matching rule may also be recorded.
With the above embodiment, the rule order can be adjusted dynamically, which speeds up rule matching and improves rule matching efficiency. That is, the rule matching order is adjusted dynamically by taking the distribution of the current corpus into account, in order to improve matching efficiency. This approach is applicable not only to word and sentence matching in natural language processing but also to other similar scenarios.
Alternatively, there is a count value per rule, wherein, said apparatus also include:Processing unit, for When carrying out rule match during this rule match, when the correspondence position in current rule set per rule and in language material After the match is successful, the count value of the rule is carried out into increment treatment;Sequencing unit, for according in current rule set Count value increment result per rule is resequenced to the rule in current rule set, obtains next rule Collection, wherein, next rule set is the regular set used during next rule match.
That is, when being matched to sentence, the count value in the parameter item of respective rule can be changed.For example, pressing According to the Huffman tree structures of each rule in current rule set, rule match is carried out to all sentences in language material, wherein, The matching order of each rule is determined by depositing these regular data structures., it is necessary to from the root of Huffman trees during implementation Node starts matching, when the success of certain node matching, the counter in the node in the parameter item of corresponding rule Count value+1, and the particular location in sentence with rule match is returned, complete this time to match.Further, in order to The efficiency of the next rule match of optimization, when count value reaches preset value or the time reaches Preset Time, can basis The count value of each rule is resequenced to it, is obtained beneficial to the new rule set for accelerating matching speed so that next time is to similar Language material used during rule match.
Through the embodiments of the present invention, the order of the rules in the rule set can be adjusted dynamically for use in subsequent rule matching, which reduces unnecessary matching work in subsequent rule matching and improves matching efficiency.
In addition, after the increment, the parameter items of the rules need to be saved, so that when the same or a similar rule matching task is executed later, the rules can be ordered directly from the saved result, saving time.
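One way to persist the counters between runs is sketched below. The storage format is not specified in the source; JSON on disk, the file name `rule_counts.json`, and the helper names are all assumptions made for illustration.

```python
import json
import os
import tempfile


def save_counts(path, counts):
    """Write {rule_id: count} to disk (JSON turns the integer keys into strings)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(counts, fh)


def load_counts(path):
    """Read the counts back, restoring integer rule ids."""
    with open(path, encoding="utf-8") as fh:
        return {int(k): v for k, v in json.load(fh).items()}


counts = {0: 0, 1: 2, 2: 1}
path = os.path.join(tempfile.mkdtemp(), "rule_counts.json")  # hypothetical location
save_counts(path, counts)
restored = load_counts(path)

# A later, similar task can order its rules directly from the saved counts.
order = sorted(restored, key=lambda rid: (-restored[rid], rid))
```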
Further optionally, the device also includes: an initialization unit configured to initialize the count value of each reordered rule after the rules in the current rule set have been reordered and before the next rule set is obtained; and a generation unit configured to generate the next rule set from the reordered rules with initialized count values.
Specifically, each time the matching order of the rules in the rule set has been readjusted, the count value of each rule needs to be reset to zero. In this way, the counters count afresh during subsequent rule matching and, together with the foregoing embodiments, the matching order of the rules in the rule set is adjusted dynamically.
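The reorder-then-reset step can be sketched as below, assuming for simplicity that a rule set is a list of `(rule_id, count)` pairs in matching order. The function name `next_rule_set` is hypothetical, and leaving tied rules in their existing relative order is a design choice of this sketch, not something the source specifies.

```python
def next_rule_set(current):
    """current: list of (rule_id, count) pairs in the current matching order.

    Returns the next rule set: rules ranked by descending count, with every
    counter reset to zero so the next round counts afresh.
    """
    # sorted() is stable, so rules with equal counts keep their relative order.
    ranked = sorted(current, key=lambda rc: rc[1], reverse=True)
    return [(rule_id, 0) for rule_id, _ in ranked]


current = [(0, 3), (1, 7), (2, 3), (3, 9)]
following = next_rule_set(current)
```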
Optionally, the device also includes: a judging unit configured to judge, after the count value of the rule has been incremented, whether the time elapsed from the start of the current rule matching process to the current time has reached the duration of the preset time period, where the sorting unit is further configured to reorder the rules in the current rule set according to the incremented count value of each rule, obtaining the next rule set, when the judgment result is that the elapsed time has reached the duration of the preset time period.
That is, only after a fixed duration are the rules reordered according to their count values and the rule data structure changed. After matching work, the parameter values of the nodes in the Huffman tree have changed, but to avoid wasting resources by adjusting the rule order too frequently, the original rule order is not modified within a time slice of preset duration. Only after the time slice ends is the rule order adjusted dynamically using the above scheme.
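The time-slice check can be sketched as a small gate object that is consulted after each counter increment. The class name `ReorderGate`, the parameter `period_s`, and the injectable `clock` are assumptions of this sketch; the source only states that reordering waits until the preset duration has elapsed.

```python
import time


class ReorderGate:
    """Allows a reorder only once per time slice of period_s seconds."""

    def __init__(self, period_s, clock=time.monotonic):
        self.period_s = period_s
        self.clock = clock           # injectable so the gate can be tested
        self.started = clock()       # start of the current time slice

    def due(self):
        """Return True once the slice has elapsed, and start a new slice."""
        now = self.clock()
        if now - self.started >= self.period_s:
            self.started = now
            return True
        return False


# Demonstration with a fake clock instead of real waiting.
fake_now = [0.0]
gate = ReorderGate(10.0, clock=lambda: fake_now[0])
fake_now[0] = 5.0
early = gate.due()    # still inside the slice
fake_now[0] = 10.0
on_time = gate.due()  # slice elapsed, reorder allowed
```

In a matching loop, the rule order would be left untouched while `due()` returns False and rebuilt from the counters when it returns True.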
Optionally, each rule has a predefined rule number, and the sorting unit includes: a determining module configured to determine the initial position of each rule of the current rule set in the data structure according to its predefined rule number; and an adjusting module configured to adjust the initial position of each rule in the data structure according to the incremented count value of each rule in the current rule set, obtaining the next rule set.
Taking rules stored in a Huffman tree as an example, when the time slice ends, the position of each rule in the data structure is adjusted using the following procedure:
Step 1: Check whether the current node is the root node. If so, stop; if not, go to Step 2.
Step 2: Move the current node to the position of its parent node.
Step 3: Judge whether, after the adjustment, the current node is the maximum node of the block it belongs to. If so, go to Step 4; if not, exchange the positions of the current node and the maximum node.
Step 4: Judge whether the required ordering is broken after the current node is adjusted. If so, decrement the counter by 1 and jump back to Step 3; otherwise, jump back to Step 1 and continue adjusting until the structure of the Huffman tree no longer changes.
Finally, in Step 4, Step 2 and Step 3 are repeated as the rule matching work proceeds, so that the Huffman tree structure is adjusted continuously and becomes better suited to the corpus matching task.
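The steps above appear to resemble the node-swap update used in adaptive Huffman coding, where a node is exchanged with the leading node of its equal-weight block before its weight is increased. The sketch below does not implement that tree update. It shows a deliberately simplified flat-list analogue: rule ids are kept sorted by non-increasing count, and a matched rule is swapped with the first rule of its equal-count block before its counter is incremented, which keeps the list sorted without a full re-sort. The function name `record_match` is hypothetical.

```python
def record_match(order, counts, rule_id):
    """Update order and counts in place after rule_id matched.

    order  : list of rule ids, sorted by non-increasing count
    counts : dict mapping rule id to its current count
    """
    i = order.index(rule_id)
    c = counts[rule_id]
    j = i
    # Walk left to the leader of the block of rules sharing the same count.
    while j > 0 and counts[order[j - 1]] == c:
        j -= 1
    # Swap with the block leader, then increment: the list stays sorted.
    order[i], order[j] = order[j], order[i]
    counts[rule_id] = c + 1


order = [0, 1, 2, 3]
counts = {0: 2, 1: 1, 2: 1, 3: 1}
record_match(order, counts, 3)   # order becomes [0, 3, 2, 1]
record_match(order, counts, 3)   # order becomes [3, 0, 2, 1]
```

Each update touches only one block of equal counts, so frequently matched rules migrate toward the front gradually instead of the whole set being re-sorted on every match.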
Through the embodiments of the present invention, the rules are arranged in a structured manner, and during rule matching the structured arrangement is adjusted automatically according to the matching results, which speeds up subsequent rule matching. Specifically, during rule matching, the order of the rules to be matched is adjusted dynamically according to how often each rule matches successfully in the current corpus and the priority order of the rules, which reduces unnecessary matching so that the time complexity of the whole algorithm is reduced to O(log(n)), meeting engineering-level requirements. The method also locates match positions accurately and minimizes invalid position comparisons, so that the implementation complexity of the model is close to linear. The execution efficiency of the method is improved, making it better suited to engineering use.
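The logarithmic figure can be related to a standard property of Huffman trees. The note below is a sketch under assumptions that the source does not state: each of the n rules sits at a leaf, rule i matches with probability p_i, and the cost of reaching a rule is proportional to its depth d_i in the tree.

```latex
% Assumptions (not from the source): rules at leaves, match probabilities p_i,
% cost proportional to leaf depth d_i in a Huffman tree built from the p_i.
\[
  H(p) \;\le\; \sum_{i=1}^{n} p_i\, d_i \;<\; H(p) + 1,
  \qquad
  H(p) = -\sum_{i=1}^{n} p_i \log_2 p_i \;\le\; \log_2 n .
\]
% Hence the expected depth is below \log_2 n + 1.
% For comparison, a linear scan over rules in a fixed order with uniform p_i = 1/n costs
\[
  \sum_{i=1}^{n} \frac{i}{n} \;=\; \frac{n+1}{2}
\]
% comparisons on average.
```

Under these assumptions the expected number of steps to reach a matching rule grows at most logarithmically in n, and it is smaller still when a few rules account for most matches.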
The rule matching device includes a processor and a memory. The first acquisition unit, the second acquisition unit, the matching unit, and so on are stored in the memory as program units, and the processor executes the program units stored in the memory.
The processor contains a kernel, which retrieves the corresponding program unit from the memory. One or more kernels may be provided, and the text content is parsed by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
The present application also provides an embodiment of a computer program product which, when executed on a data processing device, is adapted to execute program code that initializes the following method steps: obtaining a current rule set during the current rule matching process, where the current rule set is obtained by sorting the rules of a previous rule set according to the number of times each rule matched within a preset time period, and is the set of rules used in the current rule matching process; obtaining the corpus on which the rule matching operation is to be performed; and performing rule matching on that corpus using the current rule set.
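The three method steps can be sketched end to end as follows. This is an illustration under stated assumptions: rules are plain keyword strings, the previous rule set and its match counts are given, and the function names `build_current_rule_set` and `match_corpus` are hypothetical. The sketch also records how many rules had to be tried before each hit, to make the effect of the ordering visible.

```python
def build_current_rule_set(previous, counts):
    """Step 1: sort the previous rule set by match count, most frequent first."""
    return sorted(previous, key=lambda kw: counts.get(kw, 0), reverse=True)


def match_corpus(rule_set, corpus):
    """Step 3: match every sentence; return (rule, position, rules_tried) per hit."""
    hits = []
    for sentence in corpus:
        for tried, keyword in enumerate(rule_set, start=1):
            pos = sentence.find(keyword)
            if pos != -1:
                hits.append((keyword, pos, tried))
                break  # first matching rule wins for this sentence
    return hits


previous = ["refund", "price", "deliver"]
counts = {"refund": 1, "price": 4, "deliver": 9}    # matches in the last period
current = build_current_rule_set(previous, counts)  # step 1
corpus = ["please deliver soon", "price check"]     # step 2: corpus to match
hits = match_corpus(current, corpus)                # step 3
```

With the frequency-based order, the most common rule is found on the first try, whereas the previous order would have needed three tries for the same sentence.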
The above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also fall within the protection scope of the present invention.

Claims (10)

1. A rule matching method, characterized by comprising:
obtaining a current rule set during the current rule matching process, wherein the current rule set is obtained by sorting the rules of a previous rule set according to the number of times each rule matched within a preset time period, and is the set of rules used in the current rule matching process;
obtaining the corpus on which a rule matching operation is to be performed;
performing rule matching on the corpus on which the rule matching operation is to be performed, using the current rule set.
2. The method according to claim 1, characterized in that each rule has a count value, wherein the method further comprises:
when rule matching is performed during the current rule matching process, incrementing the count value of a rule in the current rule set after that rule successfully matches a corresponding position in the corpus;
reordering the rules in the current rule set according to the incremented count value of each rule in the current rule set to obtain a next rule set, wherein the next rule set is the set of rules used in the next rule matching process.
3. The method according to claim 2, characterized in that after the rules in the current rule set are reordered and before the next rule set is obtained, the method further comprises:
initializing the count value of each reordered rule,
wherein the next rule set is generated from the reordered rules with initialized count values.
4. The method according to claim 2, characterized in that after the count value of the rule is incremented, the method further comprises:
judging whether the time elapsed from the start of the current rule matching process to the current time has reached the duration of the preset time period,
wherein, when the judgment result is that the elapsed time has reached the duration of the preset time period, the rules in the current rule set are reordered according to the incremented count value of each rule in the current rule set to obtain the next rule set.
5. The method according to claim 2, characterized in that each rule has a predefined rule number, and reordering the rules in the current rule set according to the incremented count value of each rule in the current rule set to obtain the next rule set comprises:
determining the initial position of each rule of the current rule set in a data structure according to the predefined rule number of each rule;
adjusting the initial position of each rule of the current rule set in the data structure according to the incremented count value of each rule in the current rule set to obtain the next rule set.
6. A rule matching device, characterized by comprising:
a first acquisition unit configured to obtain a current rule set during the current rule matching process, wherein the current rule set is obtained by sorting the rules of a previous rule set according to the number of times each rule matched within a preset time period, and is the set of rules used in the current rule matching process;
a second acquisition unit configured to obtain the corpus on which a rule matching operation is to be performed;
a matching unit configured to perform rule matching on the corpus on which the rule matching operation is to be performed, using the current rule set.
7. The device according to claim 6, characterized in that each rule has a count value, wherein the device further comprises:
a processing unit configured to, when rule matching is performed during the current rule matching process, increment the count value of a rule in the current rule set after that rule successfully matches a corresponding position in the corpus;
a sorting unit configured to reorder the rules in the current rule set according to the incremented count value of each rule in the current rule set to obtain a next rule set, wherein the next rule set is the set of rules used in the next rule matching process.
8. The device according to claim 7, characterized in that the device further comprises:
an initialization unit configured to initialize the count value of each reordered rule after the rules in the current rule set are reordered and before the next rule set is obtained;
a generation unit configured to generate the next rule set from the reordered rules with initialized count values.
9. The device according to claim 7, characterized in that the device further comprises:
a judging unit configured to judge, after the count value of the rule is incremented, whether the time elapsed from the start of the current rule matching process to the current time has reached the duration of the preset time period,
wherein the sorting unit is further configured to, when the judgment result is that the elapsed time has reached the duration of the preset time period, reorder the rules in the current rule set according to the incremented count value of each rule in the current rule set to obtain the next rule set.
10. The device according to claim 7, characterized in that each rule has a predefined rule number, and the sorting unit comprises:
a determining module configured to determine the initial position of each rule of the current rule set in a data structure according to the predefined rule number of each rule;
an adjusting module configured to adjust the initial position of each rule of the current rule set in the data structure according to the incremented count value of each rule in the current rule set to obtain the next rule set.
CN201510921020.7A 2015-12-11 2015-12-11 Method and device for rule matching Pending CN106874255A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510921020.7A CN106874255A (en) 2015-12-11 2015-12-11 Method and device for rule matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510921020.7A CN106874255A (en) 2015-12-11 2015-12-11 Method and device for rule matching

Publications (1)

Publication Number Publication Date
CN106874255A true CN106874255A (en) 2017-06-20

Family

ID=59177448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510921020.7A Pending CN106874255A (en) 2015-12-11 2015-12-11 Method and device for rule matching

Country Status (1)

Country Link
CN (1) CN106874255A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182234A (en) * 2017-12-27 2018-06-19 中科鼎富(北京)科技发展有限公司 Regular expression screening technique and device
CN108959573A (en) * 2018-07-05 2018-12-07 京东方科技集团股份有限公司 Data migration method, device, electronic equipment and storage medium based on desktop cloud
CN109445797A (en) * 2018-10-24 2019-03-08 北京奇虎科技有限公司 Handle task executing method and device
CN112910831A (en) * 2019-12-04 2021-06-04 中兴通讯股份有限公司 Message matching method and device, firewall equipment and storage medium
CN113778858A (en) * 2021-08-05 2021-12-10 深圳开源互联网安全技术有限公司 Component detection method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101739248A (en) * 2008-11-13 2010-06-16 国际商业机器公司 Method and system for executing rule set
CN101753542A (en) * 2008-12-03 2010-06-23 北京天融信网络安全技术有限公司 Method and device for speeding up matching of filter rules of firewalls
CN104092612A (en) * 2014-06-05 2014-10-08 汉柏科技有限公司 Method and device for updating matching order of fast forwarding table
CN104468161A (en) * 2013-09-17 2015-03-25 中国移动通信集团设计院有限公司 Configuration method and apparatus of firewall rule set, and firewall

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182234A (en) * 2017-12-27 2018-06-19 中科鼎富(北京)科技发展有限公司 Regular expression screening technique and device
CN108182234B (en) * 2017-12-27 2021-07-09 鼎富智能科技有限公司 Regular expression screening method and device
CN108959573A (en) * 2018-07-05 2018-12-07 京东方科技集团股份有限公司 Data migration method, device, electronic equipment and storage medium based on desktop cloud
CN109445797A (en) * 2018-10-24 2019-03-08 北京奇虎科技有限公司 Handle task executing method and device
CN112910831A (en) * 2019-12-04 2021-06-04 中兴通讯股份有限公司 Message matching method and device, firewall equipment and storage medium
CN113778858A (en) * 2021-08-05 2021-12-10 深圳开源互联网安全技术有限公司 Component detection method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Applicant after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing

Applicant before: Beijing Guoshuang Technology Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20170620
