CN108628966B - A kind of quick matching and recognition method and device based on character string - Google Patents

A kind of quick matching and recognition method and device based on character string

Info

Publication number
CN108628966B
Authority
CN
China
Prior art keywords
character
character string
array
string
bit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810362354.9A
Other languages
Chinese (zh)
Other versions
CN108628966A (en)
Inventor
李小坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Greenet Information Service Co Ltd
Original Assignee
Wuhan Greenet Information Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Greenet Information Service Co Ltd
Priority to CN201910339586.7A (patent CN110096628B)
Priority to CN201810362354.9A (patent CN108628966B)
Priority to CN201910339570.6A (patent CN110008385B)
Priority to CN201910339599.4A (patent CN110083746B)
Publication of CN108628966A
Application granted
Publication of CN108628966B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Character Discrimination (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to the field of computer technology and provides a quick matching and recognition method and device based on character strings. The method comprises: determining one or more dynamically changing character positions in a character string and the static character positions in that string; and updating a string mapping library according to the content of the static character positions in the string and the one or more dynamic character positions. The invention marks the dynamically changing character positions, for example by adding a 257th unit to each array of a conventional dictionary tree (trie) to store the link to the next-level array for a dynamically changing character position, thereby greatly reducing the redundancy of the dictionary tree.

Description

A kind of quick matching and recognition method and device based on character string
[technical field]
The present invention relates to the field of computer technology, and in particular to a quick matching and recognition method and device based on character strings.
[background art]
Deep packet inspection (Deep Packet Inspection, abbreviated DPI) is an application-layer traffic detection and control technology. When IP packets or TCP or UDP data streams pass through a DPI-based bandwidth management system, the system reads deep into the IP payload and reassembles the application-layer information of the seven-layer OSI model to recover the content of the entire application, and then shapes the traffic according to the management policies defined in the system.
In DPI, when network packets undergo application identification and malicious traffic analysis, features of certain bytes within the first n bytes of the packet payload are usually collected. For example, the two bytes "QQ" appear at a specified position in the network packets of Tencent QQ. A specific rule base is then generated, and finally a matching engine matches the rules against the packets. In practice, however, some of the n bytes are uncertain, so a state machine cannot be built with the Aho-Corasick automaton (abbreviated AC) algorithm to do the matching, and the rules are generally traversed one by one to check for a hit. Traversing the rules is feasible when there are few of them, but once the number of rules grows by orders of magnitude, matching performance becomes very low and the matching rate quite slow. This wastes a large amount of computing resources, and the prior art offers no concise, efficient solution for this situation.
The patent document with application number CN201210132834.9 discloses a multi-pattern matching method and device. The method comprises: arranging multiple pattern strings, each as a sequence of its characters, downward from the root node of a tree structure, writing each character into one node, to generate a decision tree; and matching the main string to be matched downward along the decision tree. That solution achieves exact matching of multiple pattern strings. Because child nodes are looked up by their hash values, changes in the width of the decision tree do not affect the CPU time of string matching; the time cost of the algorithm depends only on the average depth of the decision tree and is independent of the number of pattern strings. For matching against many pattern strings, the algorithm can greatly reduce CPU time and improve application response speed. However, that patent does not support matching when the string contains undetermined characters.
The patent document with application number CN201310744154.7 discloses a string search method based on a nondeterministic finite automaton (NFA), comprising: constructing an NFA and setting a state variable for it; loading a matching expression into the NFA and converting it into a directed graph according to directed-graph operator conversion rules; matching the characters of the string entering the NFA, starting from the state position in the state variable; if a character matches, updating the state variable according to the end position pointed to from that position in the directed graph, and matching the next character from the position in the updated state variable, until a string satisfying the matching expression is obtained or a character fails to match, at which point matching is complete; and, when matching is complete, resetting the state variable to the start position. That patent performs string matching with logical operators such as "((A*B | AC) D)", and its NFA algorithm supports patterns such as abc*cd, in which the number of uncertain characters between abc and cd is unlimited. Both the NFA algorithm and the ordinary AC algorithm can therefore solve the technical problem addressed by the present invention. However, the AC algorithm is too rigid in its implementation and the NFA algorithm too flexible in its applicability, so neither achieves the efficient use of resources and the improved computing performance sought in the application scenario of the present invention.
[summary of the invention]
The technical problem to be solved by the present invention is as follows. When matching and recognizing a string of fixed length n, the string may contain one or more uncertain bytes while still corresponding to a single recognition result. The existing AC algorithm still sets matching entries for 0-255 for each of the uncertain bytes, even though every match result 0-255 in those bytes corresponds to that same recognition result, which wastes computation. In the same situation, the existing nondeterministic finite automaton (NFA) algorithm switches between transition targets for the current input by means of a transition function and finally reaches a final (accepting) state, so it likewise needs a transition function for each of the uncertain bytes. Compared with the AC algorithm, therefore, the NFA algorithm brings no saving in computing resources and no improvement in computing performance for the problem in the application scenario of the present invention.
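The scale of this waste can be illustrated with a small count. The following Python sketch is not part of the patent text. It assumes a conventional trie in which every uncertain byte is expanded into 256 branches, and compares it with a layout that reserves one extra unit per array for uncertain bytes; the function name `arrays_needed` and the example pattern are hypothetical.

```python
def arrays_needed(pattern, wildcard_slot):
    """Count the per-level arrays a trie needs for one fixed-length pattern.

    pattern: list of byte values, with None marking an uncertain byte.
    wildcard_slot: True if each array has an extra unit for uncertain bytes.
    """
    total = 0
    branches = 1  # number of arrays at the current level
    for value in pattern:
        total += branches
        if value is None and not wildcard_slot:
            branches *= 256  # one child array per possible byte value
    return total


pattern = [0x04, 0x04, None, 0xFD]  # one uncertain byte at the third position
print(arrays_needed(pattern, wildcard_slot=False))  # 259
print(arrays_needed(pattern, wildcard_slot=True))   # 4
```

Under these assumptions a single uncertain byte in a four-byte pattern raises the array count from 4 to 259, and each further uncertain byte multiplies the number of arrays below it by 256.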
The present invention adopts the following technical solutions:
In a first aspect, the present invention provides a quick matching and recognition method based on character strings, comprising:
determining one or more dynamically changing character positions in a character string and the static character positions in that string;
updating a string mapping library according to the content of the static character positions in the string and the one or more dynamic character positions;
wherein, in the string mapping library, the one or more dynamic character positions are marked with a preset special character position.
Preferably, the string mapping library comprises one or more array groups (hierarchically linked sets of arrays). Each array group consists of one or more arrays arranged level by level in the order of the corresponding characters, and the number of array levels corresponds to the number of characters in the string. Each array contains as many array units as there are values in the complete character set, and the preset special character position is appended after the last character position of each array. Each array unit stores the address of its associated next-level array.
Preferably, the array units for the complete character set comprise 256 units in total, corresponding to 0x00-0xFF, and the special character is accordingly set as the 257th array unit of the array. Each array unit stores the address of its next-level array, or stores the information used to exit the current array group and obtain the matching result.
Preferably, storing the information used to exit the current array group and obtain the matching result specifically comprises:
storing a jump address link in the last-level array of the array group corresponding to each string, the jump address link being used to obtain the parsing result that matches the string; or
storing the parsing result that matches the string in the last-level array of the array group corresponding to each string.
Preferably, where the string mapping library already stores a first string, importing a new second string into the string mapping library specifically comprises:
if the first string and the second string have the same first character, reusing the first-level array of the first string for the second string;
for the i-th character position, at which the first string and the second string differ, adding a new array at the link where the corresponding level-i array is located in the array group of the first string, to hold the content of the i-th character position of the second string, so that two level-i arrays are linked below the level-(i-1) array.
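The import procedure above can be sketched in Python. This is a minimal illustration rather than the patented implementation. It assumes the conventional trie convention, in which the unit indexed by the character at one level links to the array for the next level, so the source's level numbering may be offset by one from this sketch. The names `WILDCARD`, `new_array`, and `insert` are hypothetical, and the byte values are borrowed from the Fig. 3 example.

```python
WILDCARD = 256  # assumed index of the extra unit for dynamic positions


def new_array():
    """One level array: 256 units for 0x00-0xFF plus one wildcard unit."""
    return [None] * 257


def insert(root, pattern, result):
    """Add a fixed-length pattern; None marks a dynamic position.

    Existing arrays along a shared prefix are reused; a new array is
    created only where the pattern leaves the stored paths.
    """
    array = root
    for value in pattern[:-1]:
        unit = WILDCARD if value is None else value
        if array[unit] is None:
            array[unit] = new_array()
        array = array[unit]
    last = pattern[-1]
    array[WILDCARD if last is None else last] = result


root = new_array()
insert(root, [0x04, 0x05, 0xFB, 0x02], "result-1")  # first string
insert(root, [0x04, 0x04, None, 0xFD], "result-2")  # second string

shared = root[0x04]                      # reached by both strings
print(shared[0x05] is not shared[0x04])  # True: separate arrays after the split
```

Both strings pass through the same arrays until they diverge, after which each continues in its own array; the dynamic position of the second string occupies only the single wildcard unit.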
Preferably, when a third string is obtained and the information it represents needs to be parsed through the string mapping library, the method further comprises:
according to the content of the first character position of the third string, finding, among the array groups in the string mapping library, one or more candidate array groups whose first-level array records information consistent with that content;
screening the one or more candidate array groups successively according to the content of the subsequent character positions of the third string, to obtain the parsing result corresponding to the third string.
Preferably, screening the one or more candidate array groups successively according to the content of the subsequent character positions of the third string, to obtain the parsing result corresponding to the third string, specifically comprises:
treating the subsequent character positions as static character positions and matching them; if no unique result is obtained, selectively setting a subsequent character position as a dynamic character position and matching with the adjusted position, until a unique result is matched or, once the condition for leaving the matching loop is reached, a message reporting the failed match is fed back to the operator.
Preferably, selectively setting a subsequent character position as a dynamic character position specifically comprises:
adjusting the character position that mismatched last in the previous round to a dynamic character position, and carrying out the current round of matching starting from the array that corresponds to the character position immediately preceding the newly adjusted dynamic character position;
if a later character position also mismatches, repeating the above adjustment and completing the matching of the entire string;
wherein, if a character position still fails to match after it has been adjusted to a dynamic character position, the condition for leaving the matching loop is deemed to be reached, and a message reporting the failed match is fed back to the operator.
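The static-first matching with a dynamic retry can be sketched as follows. This Python example is a simplified illustration under stated assumptions: the arrays follow the conventional trie convention, the wildcard unit sits at index 256, all patterns have the same fixed length as the input, and the retry is applied only at the position that mismatched, without backtracking over earlier positions. The names `WILDCARD`, `new_array`, `insert`, and `lookup` are hypothetical.

```python
WILDCARD = 256  # assumed index of the extra unit for dynamic positions


def new_array():
    return [None] * 257


def insert(root, pattern, result):
    """Store a fixed-length pattern; None marks a dynamic position."""
    array = root
    for value in pattern[:-1]:
        unit = WILDCARD if value is None else value
        if array[unit] is None:
            array[unit] = new_array()
        array = array[unit]
    last = pattern[-1]
    array[WILDCARD if last is None else last] = result


def lookup(root, data):
    """Return the stored result for data, or None if nothing matches."""
    array = root
    for i, byte in enumerate(data):
        entry = array[byte]              # first treat the position as static
        if entry is None:
            entry = array[WILDCARD]      # then retry it as a dynamic position
        if entry is None:
            return None                  # still no match: leave the loop
        is_last = i == len(data) - 1
        if is_last:
            return None if isinstance(entry, list) else entry
        if not isinstance(entry, list):
            return None                  # data is longer than the stored pattern
        array = entry
    return None


root = new_array()
insert(root, [0x04, 0x05, 0xFB, 0x02], "A")
insert(root, [0x04, 0x04, None, 0xFD], "B")

print(lookup(root, bytes([0x04, 0x04, 0x77, 0xFD])))  # B
print(lookup(root, bytes([0x04, 0x05, 0xFB, 0x03])))  # None
```

A `None` return stands in for the message reporting the failed match; a fuller implementation would also need to decide how to handle inputs whose static path dead-ends after a wildcard alternative was available earlier.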
Preferably, determining the one or more dynamically changing character positions in the string and the static character positions in that string specifically comprises:
comparing a fourth string and a fifth string obtained from packets within a preset time period; if the ratio of the number of identical character positions to the number of differing character positions between the fourth string and the fifth string is greater than a preset threshold, flagging the fourth string and the fifth string;
according to a confirmation message, fed back from the input terminal, that the fourth string and the fifth string belong to the same parsing result, determining that the character positions at which the fourth string and the fifth string differ are the dynamically changing character positions, and that the character positions whose content is identical in both strings are the static character positions.
Preferably, determining the one or more dynamically changing character positions in the string and the static character positions in that string specifically comprises:
at a preset period, analyzing whether multiple last-level arrays in the string mapping library match the same target matching result;
if it is confirmed that multiple last-level arrays match the same target matching result, obtaining, from the links between those last-level arrays and their preceding arrays, at least two strings to be merged that correspond to the respective array groups;
comparing the at least two strings to be merged, to obtain the dynamically changing character positions and the static character positions among them;
adjusting the corresponding array groups in the string mapping library according to the dynamically changing character positions and the static character positions.
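The periodic merge can be illustrated with a short Python sketch. It makes a simplifying assumption that is not in the source: the stored strings have already been recovered from the array links as (pattern, result) pairs, so the sketch works on those pairs instead of walking the arrays. The function name `merge_same_result` and the sample data are hypothetical.

```python
from collections import defaultdict


def merge_same_result(entries):
    """Merge equal-length patterns that lead to the same result.

    entries: list of (pattern, result), where pattern is a tuple of byte
    values or None (an already dynamic position). Positions whose values
    differ across the merged patterns become None, i.e. dynamic.
    """
    by_result = defaultdict(list)
    for pattern, result in entries:
        by_result[(result, len(pattern))].append(pattern)

    merged = []
    for (result, _length), patterns in by_result.items():
        combined = tuple(
            column[0] if all(v == column[0] for v in column) else None
            for column in zip(*patterns)  # one column per character position
        )
        merged.append((combined, result))
    return merged


entries = [
    ((0x48, 0x34, 0x77), "hello"),
    ((0x48, 0x35, 0x77), "hello"),
    ((0x51, 0x51, 0x00), "qq"),
]
print(merge_same_result(entries))
# [((72, None, 119), 'hello'), ((81, 81, 0), 'qq')]
```

The merged patterns would then be used to rebuild or adjust the arrays. Merging every pair that shares a result may over-generalize, so a real system would probably add a similarity check or operator confirmation before collapsing branches.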
In a second aspect, the present invention further provides a quick matching and recognition device based on character strings, for implementing the quick matching and recognition method based on character strings of the first aspect, the device comprising:
at least one processor, and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are configured by a program to perform the quick matching and recognition method based on character strings of the first aspect.
In a third aspect, the present invention further provides a non-volatile computer storage medium storing computer-executable instructions which, when executed by one or more processors, carry out the quick matching and recognition method based on character strings of the first aspect.
The present invention marks the dynamically changing character positions, for example by adding a 257th unit to each array of a conventional dictionary tree to store the link to the next-level array for a dynamically changing character position, thereby greatly reducing the redundancy of the dictionary tree.
[description of the drawings]
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of a quick matching and recognition method based on character strings provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the array structure after an array unit is added, provided by an embodiment of the present invention;
Fig. 3 is a structural schematic diagram of a typical array group provided by an embodiment of the present invention;
Fig. 4 is a flowchart of a method for determining dynamic byte positions and static byte positions provided by an embodiment of the present invention;
Fig. 5 is a flowchart of a method for adding the array group of a new string to an existing array group provided by an embodiment of the present invention;
Fig. 6 is the first schematic diagram of the process of adding the array group of a new string to an existing array group provided by an embodiment of the present invention;
Fig. 7 is the second schematic diagram of the process of adding the array group of a new string to an existing array group provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of the effect of an array group in the prior art provided by an embodiment of the present invention;
Fig. 9 is a flow diagram, provided by an embodiment of the present invention, of looking up the target matching result of a third string using the string mapping library proposed by the present invention;
Fig. 10 is a flowchart of an expanded implementation of step 402 in Fig. 9 provided by an embodiment of the present invention;
Fig. 11 is a flowchart of another expanded implementation of step 402 in Fig. 9 provided by an embodiment of the present invention;
Fig. 12 is a flowchart of a method for updating the string mapping library provided by an embodiment of the present invention;
Fig. 13 is a structural schematic diagram of a quick matching and recognition device based on character strings provided by an embodiment of the present invention.
[specific embodiment]
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it.
In the description of the present invention, terms such as "inner", "outer", "longitudinal", "transverse", "upper", "lower", "top", and "bottom" indicate orientations or positional relationships based on those shown in the drawings. They are used only for convenience of description and do not require the invention to be constructed or operated in a particular orientation, so they should not be construed as limiting the invention.
In the embodiments of the present invention, qualifiers such as "first" and "second" are used only to distinguish two objects of the same name that appear together in a technical solution. They impose no special limitation; in a concrete scenario, each may be any object that meets the corresponding functional description.
In addition, the technical features of the embodiments described below may be combined with one another as long as they do not conflict.
Embodiment 1:
Embodiment 1 of the present invention provides a quick matching and recognition method based on character strings. It is suitable for any situation in which a target matching result is obtained by string matching, especially where the string contains one or more dynamically changing character positions. For example, when network packets undergo application identification and malicious traffic analysis in DPI, specific character positions of the string may change dynamically. As shown in Fig. 1, the method comprises the following steps:
In step 201, one or more dynamically changing character positions in a string and the static character positions in that string are determined.
In a specific implementation, the determination of the one or more dynamically changing character positions can be carried out at several stages of generating or updating the string mapping library. Depending on when and how the determination is made, the following modes can be distinguished:
Mode one: based on the operator's statistics or observational experience, the dynamic character positions of a string are determined when its initial array group is generated in the string mapping library. For example, when creating the array group for the string "Hello4world" in the string mapping library, the operator has already confirmed that its 6th character position is a dynamic byte position. That is, the parameter value it holds can change dynamically: the value currently shown is "4", while the value in the next packet may be "5", yet both strings correspond to the same matching result, for example packets sent by the "HelloWorld application".
Mode two is more automated than mode one: the task of identifying strings that contain uncertain positions is handed to the computer. Specifically:
After the computer (which in the embodiments of the present invention can be understood as a server) has created the mapping array groups for multiple strings, it traverses each string in the string mapping library together with its matching result. If two different strings are found to have the same matching result, the string mapping library is updated by merging their array groups. The mapping array groups for the multiple strings may themselves be confirmed and created by the operator. Compared with mode one, mode two uses the computer's after-the-fact analysis to reduce the complexity of the operator's up-front confirmation and setup work.
Merging the two array groups is implemented as follows. In the array at the level corresponding to the dynamic character position, the unit that stores the link to the next-level array is changed to the unit reserved for dynamic character positions (for example, position 256 in the array; in a specific implementation it may instead be placed before position 0 or added at any other position in the array, although the head or tail of the array is preferred), and the multiple branches below that level are reduced to a single branch (this considers only the case where exactly one character position changes dynamically between the two strings).
In step 202, the string mapping library is updated according to the content of the static character positions in the string and the one or more dynamic character positions.
In the string mapping library, the one or more dynamic character positions are marked with a preset special character position. As described for step 201, the special character position is preferably placed before the start or after the end of the array. As shown in Fig. 2, an array unit is appended to the end of a standard 0-255 array to serve as the special character position, and it stores the link address of the next-level array.
The embodiment of the present invention thus first provides a way of building a string mapping library that accommodates dynamically changing character positions. In the prior art, for a string that contains uncertain positions, the dictionary tree must be built for every parameter value that may occur at each such position. In the extreme case one uncertain character position has 256 possible values (0-255), which greatly increases the size of the dictionary tree. Because the computer must load the dictionary tree into memory to process it, this wastes memory unnecessarily. In the embodiment of the present invention, the strings used to create the dictionary tree are preprocessed, using computer statistics and/or operator identification, to mark their dynamically changing character positions, and a unit such as a 257th one is added to each array of the conventional dictionary tree to store the link to the next-level array for a dynamically changing character position, which greatly reduces the redundancy of the dictionary tree.
To explain more clearly the processing and operations involved in the embodiments of the present invention, the relationship between the array groups used here and the strings is set out in more detail below. Specifically, as shown in Fig. 3, the string mapping library comprises one or more array groups. Each array group consists of one or more arrays arranged level by level in the order of the corresponding characters, and the number of array levels corresponds to the number of characters in the string. Each array contains as many array units as there are values in the complete character set, and the preset special character position is appended after the last character position of each array. Each array unit stores the address of its associated next-level array.
Fig. 3 shows a typical array group that maps two strings, drawn in the same way as the array in Fig. 2. We call the two strings string A and string B. As Fig. 3 shows, the first character position of string A and string B has the same parameter value, 0x04, so the two strings share the first-level array; they diverge from the second character position onward. In Fig. 3, thick and thin arrows respectively show the links between the arrays at each level of the array groups for string A and string B. From this analysis and Fig. 3, the content of string A is "0x04, 0x05, 0xFB, 0x02" and the content of string B is "0x04, 0x04, X, 0xFD", where X denotes a dynamically changing parameter value; the third character position of string B, which holds it, is therefore a dynamic character position.
Taking Fig. 3 as an example again, the array units for the complete character set comprise 256 units in total, corresponding to 0x00-0xFF, and the special character is accordingly set as the 257th array unit of the array. Each array unit stores the address of its next-level array, or stores the information used to exit the current array group and obtain the matching result. The arrows in Fig. 3 show each array unit storing the address of its next-level array. The information used to exit the current array group and obtain the matching result is normally held in an array unit of the array at the level corresponding to the last character position of the string. For string A in Fig. 3, the unit marked with grid shading in the fourth-level array stores this information. For example, the target matching result may be stored directly in the grid-shaded unit, or the unit may store identification information for locating the target matching result; that identification information may be a numeric value used to look up the target matching result in a specified table.
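The array layout and the two ways of storing the result can be sketched in Python. This is only an illustration under stated assumptions: arrays are plain lists of 257 units with the extra unit at index 256, the conventional trie convention is used for linking levels, and the results table, its identifier 17, and the result names are invented for the example.

```python
WILDCARD = 256  # assumed index of the 257th unit, placed at the tail


def new_array():
    """256 units for 0x00-0xFF plus one unit for a dynamic position."""
    return [None] * 257


results_table = {17: "application A"}  # hypothetical lookup table


def resolve(entry):
    """Turn a last-level unit into a result: follow an id or use it directly."""
    return results_table[entry] if isinstance(entry, int) else entry


# String A: 0x04, 0x05, 0xFB, 0x02 (result stored as an identifier)
level1, level2 = new_array(), new_array()
level3_a, level4_a = new_array(), new_array()
level1[0x04] = level2
level2[0x05] = level3_a
level3_a[0xFB] = level4_a
level4_a[0x02] = 17

# String B: 0x04, 0x04, X, 0xFD (result stored directly in the unit)
level3_b, level4_b = new_array(), new_array()
level2[0x04] = level3_b
level3_b[WILDCARD] = level4_b   # X occupies only the wildcard unit
level4_b[0xFD] = "application B"

print(resolve(level1[0x04][0x05][0xFB][0x02]))      # application A
print(resolve(level1[0x04][0x04][WILDCARD][0xFD]))  # application B
```

Storing an identifier keeps the arrays uniform and lets several strings point at one table entry, while storing the result directly saves one lookup; the source presents both as options without preferring either.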
In the embodiments of the present invention, a specific implementation is also provided for determining, as in step 201, the one or more dynamically changing character bits in a character string and the static character bits in that character string (it may be regarded as a specific implementation of mode one described in step 201). As shown in Fig. 4, it includes the following steps:
In step 2011, a fourth character string and a fifth character string obtained from data packets within a preset time period are compared. If the ratio of the number of identical character bits to the number of differing character bits between the fourth character string and the fifth character string is greater than a preset threshold, the fourth character string and the fifth character string are marked.
Marking the fourth character string and the fifth character string can take the form of sending a confirmation request message to the operator. The message carries the content of the fourth character string and the fifth character string, so that the operator can confirm whether the two map to the same object matching result and thereby confirm the dynamically changing character bits in the character strings.
The preset time period can be a time parameter that the operator sets on the system side; it gives the system (also called the server) the basis for deciding whether to run a round of dynamic character bit identification. The preset threshold can be a fixed value or a result calculated from a relational expression; the embodiments of the present invention prefer the latter. For example, the preset threshold can be 60%-80% of the string length.
In step 2012, a confirmation message fed back from the input terminal, stating that the fourth character string and the fifth character string belong to the same parsing result, is received. Accordingly, the character bits that differ between the fourth character string and the fifth character string are determined to be the dynamically changing character bits, and the character bits with identical content are determined to be the static character bits.
The input terminal here is the operator's input terminal, which can be a keyboard, a touch pad, a handheld smart terminal, and so on.
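Steps 2011-2012 can be sketched as below. This is only an illustration under stated assumptions: the threshold is read as the fraction of the string length that must be identical (the text also phrases it as a ratio of identical to differing character bits), the strings are of equal length, and the operator confirmation is taken as given. The function names and the default of 0.7 are hypothetical.

```python
def compare_positions(s4: bytes, s5: bytes):
    """Return (static, dynamic) 0-based positions for two equal-length strings."""
    if len(s4) != len(s5):
        raise ValueError("this sketch assumes equal-length strings")
    static = [i for i, (a, b) in enumerate(zip(s4, s5)) if a == b]
    dynamic = [i for i, (a, b) in enumerate(zip(s4, s5)) if a != b]
    return static, dynamic

def should_flag(s4: bytes, s5: bytes, fraction: float = 0.7) -> bool:
    """Step 2011 (assumed reading): flag the pair for operator confirmation
    when the count of identical positions reaches `fraction` of the length."""
    static, _ = compare_positions(s4, s5)
    return len(static) >= fraction * len(s4)

s4 = bytes([0x04, 0x04, 0x04, 0xFD])
s5 = bytes([0x04, 0x04, 0xFC, 0xFD])
flagged = should_flag(s4, s5)                # 3 of 4 positions are identical
static, dynamic = compare_positions(s4, s5)  # step 2012, after confirmation
```

Here the pair is flagged, and after confirmation the third character bit (index 2) is treated as dynamic while the other three are static.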
Embodiment 2:
Embodiment 1 described the structural features of a fairly complete character string mapping library. This embodiment describes how to add a new character string and its array array to an existing character string mapping library. Before the detailed description, assume that the character string mapping library already stores a first character string and that a new second character string is now imported into it. As shown in Fig. 5, the method includes:
In step 301, when the first character string and the second character string have the same first character, the first-level array of the first character string is reused for the second character string.
Character string A and character string B introduced in Embodiment 1 serve as the first character string and the second character string of this embodiment. Fig. 6 shows the effect after step 301 has been performed and character string B has reused the first-level array of character string A. Compared with the initial state, the array location marked with a solid dark circle in the first-level array of Fig. 6 no longer stores only the link address information of the second-level array of character string A; it now stores the link address information of the second-level arrays of both character string A and character string B. In a specific implementation, if the first-level array is reused by N character strings, the corresponding array location in the first-level array (the one marked with a solid dark circle in Fig. 6) likewise stores the link address information of N second-level arrays at the same time.
In step 302, for the i-th character bit, at which the first character string and the second character string differ, a new array is added in the array array of the first character string, on the link where the corresponding i-th-level array is located, to hold the content of the i-th character bit of the second character string. This forms two i-th-level arrays as lower-level links of the (i-1)-th-level array. Here i is a natural number greater than or equal to 2.
Between character string A and character string B, the i-th character bit is the second character bit. Fig. 7 shows the effect after the first-level array of character string A has been reused and a link has been established between the reused first-level array and the newly generated second-level array corresponding to the second character bit of character string B. Establishing the corresponding links for the arrays of the subsequent character bits of character string B then yields the array array shown in Fig. 3.
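The insertion of steps 301-302 can be sketched as follows. Note one deliberate simplification: where the text stores two second-level link addresses in a single first-level array location, this sketch uses one shared second-level array that holds both the 0x05 and the 0x04 locations, as an ordinary trie would. All names (`new_array`, `insert`, `WILDCARD`) and the result placeholders are hypothetical.

```python
UNITS = 257     # 256 byte values plus the additional 257th location
WILDCARD = 256  # index used for a dynamic character bit (assumed name)

def new_array():
    return [None] * UNITS

def insert(root, pattern, result):
    """Insert `pattern` (byte values, or WILDCARD for a dynamic character
    bit) below `root`, reusing arrays that already exist."""
    node = root
    for value in pattern[:-1]:
        if node[value] is None:      # step 302: strings differ here, add an array
            node[value] = new_array()
        node = node[value]           # step 301: an existing array is reused
    node[pattern[-1]] = result       # last level stores the matching result

root = new_array()                   # first-level array
insert(root, [0x04, 0x05, 0xFB, 0x02], "result-A")       # first string (A)
shared = root[0x04]
insert(root, [0x04, 0x04, WILDCARD, 0xFD], "result-B")   # second string (B)
```

After both insertions the link stored under 0x04 in the first-level array is unchanged, which is the reuse described in step 301, and string B's third level is reached only through the 257th location.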
From the process of building the array array of the second character string in steps 301-302, it is easy to see how the embodiments of the present invention compare with existing dictionary tree (trie) techniques: as in Embodiment 1, a dynamically changing character bit is identified by an added array location, which markedly reduces memory usage. Taking character string A and character string B of Fig. 3 again, Fig. 8 shows the prior-art case in which the third character bit of character string B takes three dynamic parameter values. The arrays marked with thin lines in the third-level array of Fig. 8 show that the three dynamic parameter values are 0x04, 0xFC and 0xFD. Fig. 8 shows that in the prior art the third level holds as many arrays as there are dynamic parameter values, and the fourth-level array corresponding to the same static character bit must also be copied three times, so that each copy can be linked to the third-level array for 0x04, 0xFC or 0xFD. The prior art does not treat dynamic character bits separately. Comparing, for the same pair of character strings A and B, the array array obtained by the scheme of the embodiments of the present invention (Fig. 3) with the array array obtained by the prior-art way of generating a character mapping library (Fig. 8) shows directly that the embodiments of the present invention require less storage. It should be emphasized that Fig. 8 shows only the case in which the third character bit of character string B has three dynamic parameter values. In the extreme case in which the third character bit of character string B has 256 dynamic parameter values, the arrays corresponding to the character bits of character string B from the third level onward would each have to be copied 256 times, which would waste a great deal of memory during parsing and matching.
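The storage comparison can be made concrete with a small count. This is a rough model, assuming that every array from the dynamic level to the last level is duplicated once per observed dynamic value in the prior-art layout, and exists once in the proposed layout; the function name and parameters are hypothetical.

```python
def arrays_from_dynamic_level(length, dynamic_bit, values, use_additional_unit):
    """Arrays needed from the dynamic level to the last level for one string.

    length: number of character bits; dynamic_bit: 1-based position of the
    dynamic character bit; values: number of distinct dynamic values seen.
    """
    levels = length - dynamic_bit + 1
    # Prior art (Fig. 8): one chain of arrays per observed value.
    # Proposed (Fig. 3): a single chain reached through the 257th location.
    return levels if use_additional_unit else levels * values

fig8 = arrays_from_dynamic_level(4, 3, 3, use_additional_unit=False)       # 6
fig3 = arrays_from_dynamic_level(4, 3, 3, use_additional_unit=True)        # 2
extreme = arrays_from_dynamic_level(4, 3, 256, use_additional_unit=False)  # 512
```

Under this model the extreme case needs 512 arrays for character string B's last two levels, against 2 with the additional array location.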
Embodiment 3:
Embodiment 1 described the character mapping library architecture proposed by the invention, and Embodiment 2 described how, for a newly added character string, its array array is generated in the character mapping library. Embodiment 3 turns to the use of the character mapping library and explains in detail how, once a third character string has been obtained, the object matching result is obtained through the proposed character mapping library architecture. As shown in Fig. 9, the process includes the following steps:
In step 401, according to the content of the first character bit of the third character string, the array arrays in the character string mapping library are searched for the one or more candidate array arrays whose first-level array records information consistent with the content of the first character bit of the third character string.
Regarding "one or more candidate array arrays": if only one link after the first-level array reaches a final object matching result, there is one candidate array array; if multiple links after the first-level array reach multiple object matching results, there are multiple candidate array arrays.
In step 402, the one or more candidate array arrays are screened according to the content of the subsequent character bits of the third character string, taken in order, to obtain the parsing result corresponding to the third character string.
In the embodiments of the present invention, the screening in step 402 specifically includes, as shown in Fig. 10:
In step 4021, the subsequent character bits are treated as static character bits and matched.
In step 4022, if matching does not produce a unique result, subsequent character bits are selectively set as dynamic character bits and matched again, until a unique result is matched or the condition for exiting the matching loop is reached, in which case a message reporting the unsuccessful match is fed back to the operator.
Selectively setting subsequent character bits as dynamic character bits specifically includes: the character bit that mismatched most recently in the previous round of matching is adjusted to a dynamic character bit, and the current round of matching starts from the array corresponding to the character bit immediately preceding the newly adjusted dynamic character bit. If a later character bit also mismatches, the adjustment is repeated until the matching of the entire character string is complete. For any given character bit, if the match still fails after it has been adjusted to a dynamic character bit, the condition for exiting the matching loop is considered reached, and a message reporting the unsuccessful match is fed back to the operator.
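The static-first, dynamic-on-mismatch matching of steps 4021-4022 can be sketched as below. This is not the patent's round-by-round procedure verbatim: recursion stands in for restarting from the preceding array, and returning `None` stands in for the "unsuccessful match" message. The trie layout and all names are hypothetical.

```python
UNITS = 257     # 256 byte values plus the additional 257th location
WILDCARD = 256  # index used for a dynamic character bit (assumed name)

def new_array():
    return [None] * UNITS

def insert(root, pattern, result):
    node = root
    for value in pattern[:-1]:
        if node[value] is None:
            node[value] = new_array()
        node = node[value]
    node[pattern[-1]] = result

def match(node, data, pos=0):
    """Treat each character bit as static first; on mismatch retry it as
    dynamic. Returns the stored result, or None when nothing matches."""
    if not data:
        return None
    last = pos == len(data) - 1
    for unit in (data[pos], WILDCARD):       # static attempt, then dynamic
        nxt = node[unit]
        if nxt is None:
            continue
        is_array = isinstance(nxt, list)
        if last and not is_array:
            return nxt                        # reached a stored result
        if not last and is_array:
            found = match(nxt, data, pos + 1)
            if found is not None:
                return found
    return None                               # caller reports "no match"

root = new_array()
insert(root, [0x04, 0x05, 0xFB, 0x02], "result-A")
insert(root, [0x04, 0x04, WILDCARD, 0xFD], "result-B")
hit_b = match(root, [0x04, 0x04, 0x7A, 0xFD])
hit_a = match(root, [0x04, 0x05, 0xFB, 0x02])
miss = match(root, [0x04, 0x05, 0xFB, 0x03])
```

The input `0x04, 0x04, 0x7A, 0xFD` mismatches at the third character bit when treated as static, then succeeds through the 257th location.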
In the embodiments of the present invention, besides the single-threaded trial matching of steps 4021-4022, a multithreaded parallel trial method can also be used, as shown in Fig. 11 and described below:
In step 4021', the character bit information in the subsequent-level arrays of each candidate array array is read for a preset length and matched against the parameter values of the corresponding character bits in the third character string.
In step 4022', step 4021' yields one round of screening results, which can eliminate some of the candidate array arrays. The operation of step 4021' is then repeated on the remaining candidate array arrays until a unique result (the object matching result) is matched or the condition for exiting the matching loop is reached, in which case a message reporting the unsuccessful match is fed back to the operator.
Compared with the single-threaded matching with backtracking of steps 4021-4022, the method of steps 4021'-4022' is more efficient. However, running multiple threads, or directly extracting a preset length of character bit information from the subsequent-level arrays of the candidate array arrays, also uses more CPU resources. Each of the two process flows therefore has its own advantages.
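The chunk-by-chunk screening of steps 4021'-4022' might look like the sketch below. It makes two simplifying assumptions: each candidate array array is flattened into a pattern list (with `WILDCARD` for a dynamic character bit), and a thread pool checks the candidates of one round in parallel. The names and the preset length of 2 are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

WILDCARD = 256  # stands for the additional 257th location (assumed name)

def chunk_matches(pattern, data, start, length):
    """Compare `length` character bits of one candidate, beginning at `start`."""
    end = min(start + length, len(data))
    return all(p == WILDCARD or p == d
               for p, d in zip(pattern[start:end], data[start:end]))

def screen(candidates, data, length=2):
    """candidates: dict mapping a result to its flattened pattern.
    Returns the results whose patterns survive every screening round."""
    remaining = {r: p for r, p in candidates.items() if len(p) == len(data)}
    start = 0
    with ThreadPoolExecutor() as pool:
        while start < len(data) and remaining:
            items = list(remaining.items())
            # One round: check every remaining candidate in parallel.
            keep = list(pool.map(
                lambda item: chunk_matches(item[1], data, start, length),
                items))
            remaining = {r: p for (r, p), ok in zip(items, keep) if ok}
            start += length
    return list(remaining)

candidates = {
    "result-A": [0x04, 0x05, 0xFB, 0x02],
    "result-B": [0x04, 0x04, WILDCARD, 0xFD],
}
survivors = screen(candidates, [0x04, 0x04, 0x10, 0xFD])
nothing = screen(candidates, [0x09, 0x09, 0x09, 0x09])
```

The first round already eliminates character string A's candidate, so the second round only examines one candidate; an empty list corresponds to the "unsuccessful match" case.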
Embodiment 4:
Embodiment 1 gave a specific method for determining the one or more dynamically changing character bits in a character string and the static character bits in that character string. That method, however, requires operator intervention and suits the initial stage, when no array array has yet been generated for any of the character strings being analyzed. For the character mapping library architecture proposed by the embodiments of the present invention, in addition to the initial identification of dynamically changing and static character bits described in Embodiment 1, this embodiment proposes a method that does not require repeated operator intervention to make the determination: after the array array of each character string has been generated, the server (system) automatically identifies the dynamically changing character bits between character strings. The array arrays are built coarsely at first and gradually consolidated later (this may be regarded as a specific implementation of mode two described in step 201 of Embodiment 1). As shown in Fig. 12, the method includes the following steps:
In step 501, at a preset period, the character mapping library is analyzed to determine whether multiple final-level arrays can be matched to the same object matching result.
The preset period can be 1 day, 1 week or 1 month; it is set according to the actual situation and is not specifically limited here.
The final-level array is the last-level array of a complete character string in the character mapping library. In Fig. 3, for example, the fourth-level array is the final-level array of both character string A and character string B in their array arrays.
Following the two implementations described in the earlier embodiments: in mode one, where the final-level array stores the object matching result directly, the object matching results stored in the final-level arrays of the character mapping library can be retrieved and compared directly; identical results confirm that multiple final-level arrays can be matched to the same object matching result. In mode two, where the final-level array stores the identification information of the corresponding object matching result, the identification information stored in the final-level arrays can likewise be retrieved and compared directly; identical identification information confirms that multiple final-level arrays can be matched to the same object matching result.
In step 502, if it is confirmed that multiple final-level arrays can be matched to the same object matching result, at least two character strings to be integrated, corresponding to the respective array arrays, are obtained from the links between each final-level array and its preceding-level arrays.
The character strings to be integrated are derived in reverse from the links between the arrays in each array array. Taking the array array of Fig. 6 as an example, the index of the array location that holds content in the fourth-level array determines the parameter value of the character bit corresponding to the fourth-level array, which in Fig. 6 is 0x02. Following the links between arrays, the array location that holds content in the third-level array has index 0xFB, so the parameter value of the character bit corresponding to the third-level array is 0xFB. Continuing in the same way gives the complete character string A to be integrated: "0x04, 0x05, 0xFB, 0x02".
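Steps 501-502 can be sketched as follows, using the prior-art layout of Fig. 8 as input. The sketch walks forward from the first-level array rather than backward from the final-level arrays; this recovers the same strings, since each string is just the sequence of occupied location indexes along one chain. The trie layout, names and result placeholders are hypothetical.

```python
from collections import defaultdict

UNITS = 257  # 256 byte values plus the additional 257th location

def new_array():
    return [None] * UNITS

def insert(root, pattern, result):
    node = root
    for value in pattern[:-1]:
        if node[value] is None:
            node[value] = new_array()
        node = node[value]
    node[pattern[-1]] = result

def paths(node, prefix=()):
    """Yield (string, result): the indexes of the occupied locations along
    each chain spell out the character string."""
    for unit, nxt in enumerate(node):
        if nxt is None:
            continue
        if isinstance(nxt, list):
            yield from paths(nxt, prefix + (unit,))
        else:
            yield prefix + (unit,), nxt

def strings_to_integrate(root):
    """Step 501: group strings by stored result; step 502: keep the groups
    in which at least two strings share the same result."""
    groups = defaultdict(list)
    for string, result in paths(root):
        groups[result].append(string)
    return {r: s for r, s in groups.items() if len(s) >= 2}

root = new_array()
insert(root, [0x04, 0x05, 0xFB, 0x02], "result-A")
for third in (0x04, 0xFC, 0xFD):             # B1, B2, B3 of Fig. 8
    insert(root, [0x04, 0x04, third, 0xFD], "result-B")
groups = strings_to_integrate(root)
```

Only the three B strings are returned, because character string A is the sole string with its result.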
In step 503, the at least two character strings to be integrated are compared to obtain the dynamically changing character bits and the static character bits among them.
Take the array array of Fig. 8 as an application scenario for this embodiment. It contains character string A, character string B1, character string B2 and character string B3 (character strings B1, B2 and B3 correspond to the arrays marked by thin arrows in Fig. 8, arranged from top to bottom within each level). Step 501 confirms, from the content of the fourth-level arrays, that character strings B1, B2 and B3 have the same object matching result. Step 502 derives in reverse that character strings B1, B2 and B3 are "0x04, 0x04, 0x04, 0xFD", "0x04, 0x04, 0xFC, 0xFD" and "0x04, 0x04, 0xFD, 0xFD", respectively. Step 503 then confirms that the static character bits are the first character bit "0x04", the second character bit "0x04" and the fourth character bit "0xFD", and that the dynamically changing character bit is the third character bit.
In step 504, the corresponding array arrays in the character mapping library are adjusted according to the dynamically changing character bits and the static character bits.
Taking the state shown in Fig. 8 as the array array before step 504 is executed, the corresponding array array in the character mapping library is updated to the state shown in Fig. 3 once step 504 has been executed.
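The comparison of step 503 reduces to a column-wise check, sketched here for B1, B2 and B3. The function name is hypothetical, and character bits are numbered from 1 to match the text.

```python
def split_character_bits(strings):
    """Return (static, dynamic) 1-based character bits for equal-length strings."""
    static, dynamic = [], []
    for i, column in enumerate(zip(*strings), start=1):
        # A character bit is static when every string holds the same value there.
        (static if len(set(column)) == 1 else dynamic).append(i)
    return static, dynamic

b1 = (0x04, 0x04, 0x04, 0xFD)
b2 = (0x04, 0x04, 0xFC, 0xFD)
b3 = (0x04, 0x04, 0xFD, 0xFD)
static, dynamic = split_character_bits([b1, b2, b3])
```

For these three strings the first, second and fourth character bits come out static and the third comes out dynamic, as stated above.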
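The adjustment of step 504 can be sketched on a flattened view of the library. This is an assumption made for brevity: a dictionary from pattern to result stands in for the linked arrays, and `WILDCARD` stands for the additional 257th array location. The names are hypothetical.

```python
WILDCARD = 256  # stands for the additional 257th location (assumed name)

def collapse(library, result):
    """Replace all equal-length patterns that map to `result` with one
    pattern whose differing character bits become WILDCARD."""
    group = [p for p, r in library.items() if r == result]
    if len(group) < 2 or len({len(p) for p in group}) != 1:
        return dict(library)                 # nothing to consolidate
    merged = tuple(col[0] if len(set(col)) == 1 else WILDCARD
                   for col in zip(*group))
    kept = {p: r for p, r in library.items() if r != result}
    kept[merged] = result
    return kept

fig8 = {
    (0x04, 0x05, 0xFB, 0x02): "result-A",
    (0x04, 0x04, 0x04, 0xFD): "result-B",    # B1
    (0x04, 0x04, 0xFC, 0xFD): "result-B",    # B2
    (0x04, 0x04, 0xFD, 0xFD): "result-B",    # B3
}
fig3 = collapse(fig8, "result-B")
```

The three B entries collapse into the single pattern `0x04, 0x04, X, 0xFD`, which corresponds to moving from the Fig. 8 state to the Fig. 3 state.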
Embodiment 5:
Having provided the character mapping library architecture and the methods of using each of its parts, the embodiments of the present invention also provide a character-string-based rapid matching and recognition device for executing the methods of the above embodiments. As shown in Fig. 13, the device of this embodiment includes one or more processors 21 and a memory 22. Fig. 13 takes one processor 21 as an example.
The processor 21 and the memory 22 can be connected by a bus or in other ways; Fig. 13 takes a bus connection as an example.
The memory 22, as a non-volatile computer-readable storage medium for the character-string-based rapid matching and recognition method and device, can store non-volatile software programs, non-volatile computer-executable programs and modules, such as the character-string-based rapid matching and recognition method of Embodiment 1 (for example, the flowcharts shown in Figs. 1, 4, 5 and 9-12). By running the non-volatile software programs, instructions and modules stored in the memory 22, the processor 21 executes the various functional applications and data processing of the device, that is, implements the character-string-based rapid matching and recognition method of Embodiments 1-4.
The memory 22 can include high-speed random access memory and can also include non-volatile memory, for example at least one magnetic disk storage device, flash memory device or other non-volatile solid-state storage device. In some embodiments, the memory 22 optionally includes memory located remotely from the processor 21, and such remote memory can be connected to the processor 21 through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
It is worth noting that the information exchange and execution processes between the modules and units of the above device are based on the same concept as the method embodiments of the present invention. For details, refer to the description of the method embodiments; they are not repeated here.
Those of ordinary skill in the art will understand that all or part of the steps of the methods in the embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which can include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and so on.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A character-string-based rapid matching and recognition method, characterized by comprising:
Determining the one or more dynamically changing character bits in a character string and the static character bits in that character string;
Updating the character string mapping library according to the content information of the static character bits in the character string and the one or more dynamic character bits;
Wherein the one or more dynamic character bits are marked in the character string mapping library by a preset additional character bit;
When a third character string is obtained and the information it represents needs to be parsed through the character string mapping library, the method further comprises:
According to the content of the first character bit of the third character string, matching, among the array arrays in the character string mapping library, the one or more candidate array arrays whose first-level array records information consistent with the content of the first character bit of the third character string;
Screening the one or more candidate array arrays according to the content of the subsequent character bits of the third character string, taken in order, to obtain the parsing result corresponding to the third character string.
2. The character-string-based rapid matching and recognition method according to claim 1, characterized in that the character string mapping library comprises one or more array arrays, each array array being composed of one or more arrays arranged hierarchically in the order of the corresponding characters; wherein the number of levels of arrays corresponds to the number of characters in the character string; each array contains as many array locations as there are values in the complete character set, and the preset additional character bit is added after the last of these in each array; wherein each array location is used to store the address of the next-level array associated with it.
3. The character-string-based rapid matching and recognition method according to claim 2, characterized in that the array locations for the complete character set comprise 256 array locations corresponding to 0x00-0xFF, and the additional character bit is provided as the 257th array location in the array, wherein each array location is used to store the address information of its next-level array or to store the information for jumping out of the current array array and obtaining the matching result.
4. The character-string-based rapid matching and recognition method according to claim 3, characterized in that storing the information for jumping out of the current array array and obtaining the matching result specifically comprises:
Storing a jump address link in the last-level array of the array array corresponding to each character string, the jump address link being used to obtain the parsing result that matches the character string in the character string mapping library; or,
Storing, in the last-level array of the array array corresponding to each character string, the parsing result that matches the character string in the character string mapping library.
5. The character-string-based rapid matching and recognition method according to claim 2, characterized in that, where the character string mapping library already stores a first character string and a new second character string is imported into the character string mapping library, the method specifically comprises:
When the first character string and the second character string have the same first character, reusing the first-level array of the first character string for the second character string;
For the i-th character bit, at which the first character string and the second character string differ, adding a new array in the array array of the first character string, on the link where the corresponding i-th-level array is located, to hold the content of the i-th character bit of the second character string, thereby forming two i-th-level arrays as lower-level links of the (i-1)-th-level array.
6. The character-string-based rapid matching and recognition method according to claim 1, characterized in that screening the one or more candidate array arrays according to the content of the subsequent character bits of the third character string, taken in order, to obtain the parsing result corresponding to the third character string specifically comprises:
Treating the subsequent character bits as static character bits and matching them; if matching does not produce a unique result, selectively setting subsequent character bits as dynamic character bits and matching again, until a unique result is matched or, once the condition for exiting the matching loop is reached, feeding back a message reporting the unsuccessful match to the operator.
7. The character-string-based rapid matching and recognition method according to claim 6, characterized in that selectively setting subsequent character bits as dynamic character bits specifically comprises:
Adjusting the character bit that mismatched most recently in the previous round of matching to a dynamic character bit, and carrying out the current round of matching starting from the array corresponding to the character bit immediately preceding the newly adjusted dynamic character bit;
If a later character bit also mismatches, repeating the above adjustment until the matching of the entire character string is complete;
Wherein, for any given character bit, if the match still fails after it has been adjusted to a dynamic character bit, confirming that the condition for exiting the matching loop is reached and feeding back a message reporting the unsuccessful match to the operator.
8. The character-string-based rapid matching and recognition method according to any one of claims 1-7, characterized in that determining the one or more dynamically changing character bits in a character string and the static character bits in that character string specifically comprises:
Comparing a fourth character string and a fifth character string obtained from data packets within a preset time period, and, if the ratio of the number of identical character bits to the number of differing character bits between the fourth character string and the fifth character string is greater than a preset threshold, marking the fourth character string and the fifth character string;
Receiving a confirmation message, fed back from the input terminal, that the fourth character string and the fifth character string belong to the same parsing result; and determining that the character bits that differ between the fourth character string and the fifth character string are the dynamically changing character bits and that the character bits with identical content are the static character bits.
9. The character-string-based rapid matching and recognition method according to any one of claims 1-7, characterized in that determining the one or more dynamically changing character bits in a character string and the static character bits in that character string specifically comprises:
Analyzing, at a preset period, whether multiple final-level arrays in the character mapping library can be matched to the same object matching result;
If it is confirmed that multiple final-level arrays can be matched to the same object matching result, obtaining at least two character strings to be integrated, corresponding to the respective array arrays, from the links between each final-level array and its preceding-level arrays;
Comparing the at least two character strings to be integrated to obtain the dynamically changing character bits and the static character bits among them;
Adjusting the corresponding array arrays in the character mapping library according to the dynamically changing character bits and the static character bits.
10. A character-string-based rapid matching and recognition device, characterized by comprising at least one processor and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being configured by a program to execute the character-string-based rapid matching and recognition method of any one of claims 1-9.
CN201810362354.9A 2018-04-20 2018-04-20 A kind of quick matching and recognition method and device based on character string Active CN108628966B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201910339586.7A CN110096628B (en) 2018-04-20 2018-04-20 Quick matching identification method and device based on character strings
CN201810362354.9A CN108628966B (en) 2018-04-20 2018-04-20 A kind of quick matching and recognition method and device based on character string
CN201910339570.6A CN110008385B (en) 2018-04-20 2018-04-20 Quick matching identification method and device based on character strings
CN201910339599.4A CN110083746B (en) 2018-04-20 2018-04-20 Quick matching identification method and device based on character strings

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810362354.9A CN108628966B (en) 2018-04-20 2018-04-20 A kind of quick matching and recognition method and device based on character string

Related Child Applications (3)

Application Number Relation Publication Number Priority Date Filing Date Title
CN201910339570.6A Division CN110008385B (en) 2018-04-20 2018-04-20 Quick matching identification method and device based on character strings
CN201910339599.4A Division CN110083746B (en) 2018-04-20 2018-04-20 Quick matching identification method and device based on character strings
CN201910339586.7A Division CN110096628B (en) 2018-04-20 2018-04-20 Quick matching identification method and device based on character strings

Publications (2)

Publication Number Publication Date
CN108628966A CN108628966A (en) 2018-10-09
CN108628966B CN108628966B (en) 2019-06-14

Family

ID=63694204

Family Applications (4)

Application Number Status Publication Number Priority Date Filing Date Title
CN201910339586.7A Active CN110096628B (en) 2018-04-20 2018-04-20 Quick matching identification method and device based on character strings
CN201910339570.6A Active CN110008385B (en) 2018-04-20 2018-04-20 Quick matching identification method and device based on character strings
CN201910339599.4A Active CN110083746B (en) 2018-04-20 2018-04-20 Quick matching identification method and device based on character strings
CN201810362354.9A Active CN108628966B (en) 2018-04-20 2018-04-20 A kind of quick matching and recognition method and device based on character string

Country Status (1)

Country Link
CN (4) CN110096628B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110659489B (en) * 2019-09-20 2023-03-24 安天科技集团股份有限公司 Threat detection method, device and storage medium for character string splicing behavior
CN111061972B (en) * 2019-12-25 2023-05-16 武汉绿色网络信息服务有限责任公司 AC searching optimization method and device for URL path matching
US11586615B2 (en) * 2020-07-29 2023-02-21 Bank Of America Corporation System for generation of resource identification numbers to avoid electronic misreads
CN113641672A (en) * 2021-07-30 2021-11-12 武汉思普崚技术有限公司 Multi-dimensional rapid matching method and device and storage medium
CN113609352B (en) * 2021-08-03 2023-08-04 北京恒安嘉新安全技术有限公司 Character string retrieval method, device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102142009A (en) * 2010-12-09 2011-08-03 华为技术有限公司 Method and device for matching regular expressions
CN102646115A (en) * 2012-02-17 2012-08-22 北京星网锐捷网络技术有限公司 Method and device for constructing AC (aho-corasick) state machine
CN104750725A (en) * 2013-12-30 2015-07-01 亿阳信通股份有限公司 Character string searching method and device based on non-determined finite automaton
CN107545071A (en) * 2017-09-21 2018-01-05 北京神州泰岳智能数据技术有限公司 A kind of method and apparatus of string matching

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6718325B1 (en) * 2000-06-14 2004-04-06 Sun Microsystems, Inc. Approximate string matcher for delimited strings
US6917936B2 (en) * 2002-12-18 2005-07-12 Xerox Corporation Method and apparatus for measuring similarity between documents
CN101441664A (en) * 2008-12-03 2009-05-27 北京启明星辰信息技术股份有限公司 Paralleling multiple-mode matching method and system of matching regulation including choosing character
CN101807184B (en) * 2009-02-16 2013-05-01 阿尔卡特朗讯 Method for searching character string with wildcard character and system thereof
CN103186640B (en) * 2011-12-31 2016-05-25 百度在线网络技术(北京)有限公司 Adopt traffic filtering method and the device of the canonical coupling based on AC algorithm
CN104160396B (en) * 2012-03-01 2017-06-16 国际商业机器公司 The method and system of best match character string is searched among character trail
US8990232B2 (en) * 2012-05-15 2015-03-24 Telefonaktiebolaget L M Ericsson (Publ) Apparatus and method for parallel regular expression matching
US8972450B2 (en) * 2013-04-17 2015-03-03 National Taiwan University Multi-stage parallel multi-character string matching device
CN103414600B (en) * 2013-07-19 2017-03-08 华为技术有限公司 Approximate adaptation method and relevant device and communication system
CN103685222A (en) * 2013-09-05 2014-03-26 北京科能腾达信息技术股份有限公司 A data matching detection method based on a determinacy finite state automation
CN105404635B (en) * 2014-09-16 2019-05-28 华为技术有限公司 Method, equipment and the heterogeneous computing system of string matching
CN107193843B (en) * 2016-03-15 2020-08-28 阿里巴巴集团控股有限公司 Character string screening method and device based on AC automaton and suffix expression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
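Research on Acceleration Techniques for Pattern Matching Algorithms Based on Many-Core Hardware; Liu Xudong (刘旭东); China Master's Theses Full-text Database, Information Science and Technology; 2015-08-15 (No. 8); p. I137-32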

Also Published As

Publication number Publication date
CN110008385A (en) 2019-07-12
CN110083746B (en) 2021-01-22
CN110096628B (en) 2021-01-22
CN110096628A (en) 2019-08-06
CN108628966A (en) 2018-10-09
CN110083746A (en) 2019-08-02
CN110008385B (en) 2020-12-22

Similar Documents

Publication Publication Date Title
CN108628966B (en) A kind of quick matching and recognition method and device based on character string
CN105224692B (en) Support the system and method for the SDN multilevel flow table parallel searchs of multi-core processor
US8849841B2 (en) Memory circuit for Aho-corasick type character recognition automaton and method of storing data in such a circuit
CN104980418B (en) The compiling of finite automata based on memory hierarchy
CN102857493B (en) Content filtering method and device
CN104503901B (en) A kind of guiding symbolic excution methodology analyzed based on static path
CN106569824A (en) Page data compiling method and apparatus, and page rendering method and apparatus
US8914320B2 (en) Graph generation method for graph-based search
CN104426911A (en) Method and apparatus for compilation of finite automata
CN106416152B (en) A kind of lookup device searches configuration method and lookup method
CN104899264B (en) A kind of multi-mode matching regular expressions method and device
CN108647145A (en) software memory safety detection method and system
US20150156102A1 (en) A Method of and Network Server for Detecting Data Patterns in an Input Data Stream
JP2003196295A (en) Method for improving lookup performance of tree-type knowledge base search
CN109614309A (en) Compare the method, apparatus, computer equipment and storage medium of test result
CN111935081B (en) Data packet desensitization method and device
CN107423391A (en) The information extracting method of Web page structural data
CN110245273A (en) A kind of method obtaining APP service feature library and corresponding device
CN109951495A (en) Network segment lookup method and device
CN107748778A (en) A kind of method and device for extracting address
CN103927325B (en) A kind of method and device classified to URL
CN106657075A (en) Multilayer protocol analysis method and device as well as data matching method and device
CN103685280B (en) Message matching method, state machine compiling method and equipment
CN106790109A (en) Data matching method and device, protocol data analysis method, device and system
CN103957131B (en) Deep massage detection method based on finite automata

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant