CN110413958A - Linear congruence character set transformation method and system for automaton space compression - Google Patents


Info

Publication number
CN110413958A
Authority
CN
China
Prior art keywords
character
state
transformation
automaton
state row
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910505446.2A
Other languages
Chinese (zh)
Other versions
CN110413958B (en)
Inventor
孙恭鑫
卢毓海
刘燕兵
张春燕
谭建龙
郭莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN201910505446.2A
Publication of CN110413958A
Application granted
Publication of CN110413958B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/90335: Query processing
    • G06F16/90344: Query processing by using string matching techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides a linear congruence character set transformation method for automaton space compression, comprising the steps of: building a pattern set into an automaton and generating a state transition matrix; reading each state row of the state transition matrix and computing the optimal transformation parameters and the maximum effective successor state; recording data structures according to the state transition matrix and the optimal transformation parameters, and replacing each state row with its transformed effective state row; and reading each character of the target text, transforming it with a linear congruence function according to the current state to obtain a transformed character, and, if the condition is met, obtaining the successor state, thereby completing the transition. The invention further provides a linear congruence character set transformation system for automaton space compression, comprising a rule compiler, a transformation parameter generator, a state row transformer, a comparator, a compressed automaton memory, a state register, a character set transformer, and a text scanner.

Description

Linear congruence character set transformation method and system for automaton space compression
Technical field
The invention belongs to the field of information technology and specifically relates to a linear congruence character set transformation method and system for automaton space compression.
Background
String matching is a search problem with wide application in bioinformatics, information retrieval, data compression, network intrusion detection, and other fields. A string is a finite sequence of characters over a finite alphabet Σ. A string matching algorithm searches a large string T for every position at which any string P_i of a string set S = {P_i} occurs. T is called the text and each P_i a pattern string; T and P_i are defined over the same alphabet Σ.
In string matching, the automaton is an important data structure. For example, the AC automaton algorithm proposed by Aho and Corasick in 1975 (see Efficient String Matching: An Aid to Bibliographic Search), the KMP algorithm proposed by Knuth, Morris, and Pratt in 1977 (see Fast Pattern Matching in Strings), and the BOM algorithm proposed by Allauzen, Crochemore, and Raffinot in 1999 (see Factor Oracle: A New Structure for Pattern Matching) all achieve fast string matching by means of automata. Because the pattern set in most applications is large, the generated automaton occupies a large amount of space, which in turn affects matching speed. Reducing the resources an automaton occupies is therefore a problem worth studying.
An automaton, also known as a finite state machine, is a data structure that represents a set of strings and supports string matching. Abstractly, the automaton in a string matching algorithm can be expressed as a matrix A of size N × 256, where N is the number of states of the automaton and 256 is the character set size (1 byte). For a current state s and an input character c, A[s, c] denotes the next state, usually represented by a non-negative integer or a pointer. A[s, c] = -1 means that state s has no successor state for input character c. Each state row of A occupies sizeof(int) × 256, so the matrix occupies sizeof(int) × 256 × N in total. Because the pattern set S is very large in many applications, the corresponding automaton has many states and occupies a great deal of space, which limits the practicality of automaton-based string matching algorithms. Efficient compression methods for automata therefore need to be studied.
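To make the cost of the dense representation concrete, the following minimal sketch builds an N × 256 transition matrix for a plain trie (no failure links) and reports its size. The function names, the sample patterns, and the 4-byte integer size are illustrative assumptions, not part of the patent.

```python
ALPHABET_SIZE = 256  # one byte per character, as in the text
INT_SIZE = 4         # assumption: sizeof(int) is 4 bytes


def build_dense_trie(patterns):
    """Return a list of state rows; rows[s][c] is the next state or -1."""
    rows = [[-1] * ALPHABET_SIZE]  # state 0 is the root
    for pattern in patterns:
        state = 0
        for byte in pattern:  # iterating over bytes yields integers 0..255
            if rows[state][byte] == -1:
                # Create a new state the first time this transition is needed.
                rows.append([-1] * ALPHABET_SIZE)
                rows[state][byte] = len(rows) - 1
            state = rows[state][byte]
    return rows


def dense_size_bytes(rows):
    """Space of the dense matrix: sizeof(int) x 256 x N."""
    return INT_SIZE * ALPHABET_SIZE * len(rows)


rows = build_dense_trie([b"he", b"she", b"his", b"hers"])
valid = sum(1 for row in rows for target in row if target != -1)
print(len(rows), dense_size_bytes(rows), valid)
```

Here ten states occupy 10240 bytes although only nine entries hold a real transition, which is the sparsity the compression methods below exploit.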
In 2004, Norton proposed an automaton compression method called Banded-Row in the paper Optimizing Pattern Matching for Intrusion Detection. In string matching algorithms, most states of the automaton have only a few successor states, so representing every state row A[s] with sizeof(int) × 256 wastes a great deal of storage. To compress the storage of the AC automaton, Banded-Row uses two integers lb_s and ub_s to record the characters of the first and the last valid transition in state row A[s], that is:
lb_s = min{c | A[s, c] ≠ -1}, ub_s = max{c | A[s, c] ≠ -1}
The empty transitions at both ends of A[s] are removed, and each row is represented with only sizeof(int) × (ub_s - lb_s + 3) of storage. This keeps the random-access property of an array while saving storage compared with the matrix representation.
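The Banded-Row idea can be sketched as follows. This is a minimal illustration under the definitions above, not Norton's implementation; the function names and the choice to store nothing for an empty row are assumptions.

```python
def banded_compress(row):
    """Keep only row[lb..ub], where lb/ub are the first/last valid characters."""
    valid = [c for c, target in enumerate(row) if target != -1]
    if not valid:
        return None  # assumption: a row without transitions stores nothing
    lb, ub = valid[0], valid[-1]
    return lb, ub, row[lb:ub + 1]


def banded_lookup(banded, c):
    """Return the successor state for character c, or -1 if there is none."""
    if banded is None:
        return -1
    lb, ub, band = banded
    if c < lb or c > ub:  # two range checks per transition
        return -1
    return band[c - lb]


def banded_size_ints(banded):
    """Integers stored for one row: the band plus the two bounds."""
    lb, ub, _ = banded
    return ub - lb + 3


row = [-1] * 256
row[ord("a")], row[ord("e")], row[ord("z")] = 1, 2, 3
narrow = banded_compress(row)

wide_row = [-1] * 256
wide_row[0], wide_row[255] = 1, 2  # two transitions at opposite ends
wide = banded_compress(wide_row)

print(banded_size_ints(narrow), banded_size_ints(wide))
```

The second row shows the weakness discussed later in this section: with valid transitions at characters 0 and 255, the band spans the whole row and nothing is saved.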
In 2018, Liu Yanbing et al. proposed an automaton space compression method based on character set transformation. In string matching algorithms the state rows of an automaton are usually very sparse, so a large number of invalid states remain between lb_s and ub_s. The method transforms the input character with an XOR function parameterized by X[s] and defines the lower and upper bounds of each row over the transformed characters.
With a suitable transformation parameter X[s], the number of invalid states between the transformed bounds is reduced, which saves further storage.
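A rough sketch of the XOR idea follows. The exact formula is not legible in this text, so the sketch assumes the transformation is c XOR X[s] and that X[s] is chosen by exhaustive search to give the narrowest band; both points, and all names, are assumptions made for illustration.

```python
ALPHABET_SIZE = 256


def xor_bounds(row, x):
    """Bounds of the valid transitions after mapping each character c to c ^ x."""
    # Assumes the row has at least one valid transition.
    mapped = [c ^ x for c, target in enumerate(row) if target != -1]
    return min(mapped), max(mapped)


def best_xor_parameter(row):
    """Try every x and keep the first one that gives the narrowest band."""
    best = None
    for x in range(ALPHABET_SIZE):
        lb, ub = xor_bounds(row, x)
        if best is None or ub - lb < best[2] - best[1]:
            best = (x, lb, ub)
    return best


# A row that Banded-Row cannot compress: transitions at characters 0 and 255.
row = [-1] * ALPHABET_SIZE
row[0], row[255] = 1, 2
print(best_xor_parameter(row))
```

Under these assumptions the two distant characters are brought next to each other, so the band shrinks from 256 entries to 2.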
The existing technical solutions are mainly the Banded-Row method and the automaton space compression method based on character set transformation. Both have shortcomings in space and in time. In space, the storage each method needs for a row grows with the width of its band: ub_s - lb_s for the Banded-Row method, and the distance between the transformed bounds for the character set transformation method. When that width is large, the row still occupies considerable space even if it has few successor states. For example, if lb_s = 0 and ub_s = 255, the Banded-Row method cannot compress the row at all, even when A[s] has only two successor states. Likewise, a row A[s] with only three successor states, one of them A[s, 125] ≠ -1, can still require the character set transformation method to consume sizeof(int) × 130 of space. In time, both methods must check during scanning whether the character falls outside the range [lb_s, ub_s] or the corresponding transformed range, which slows matching.
Summary of the invention
The object of the invention is to provide a linear congruence character set transformation method and system for automaton space compression. It is an automaton compression method for string matching that keeps the time complexity of a state transition at O(1) while greatly reducing the storage of the data structure.
To achieve this object, the invention adopts the following technical solution:
A linear congruence character set transformation method for automaton space compression, comprising the following steps:
building a pattern set into an automaton and generating a state transition matrix;
reading each state row of the state transition matrix and computing the optimal transformation parameters and the maximum effective successor state;
obtaining data structures from the state transition matrix and the optimal transformation parameters, and replacing each state row with its transformed effective state row according to the data structures;
reading a character of the target text and, according to the current state, transforming the character with a linear congruence function to obtain a transformed character;
if the transformed character does not exceed the maximum effective successor state, obtaining the final successor state for the transformed character, thereby completing the transition.
Further, for each state row of the state transition matrix and each candidate transformation parameter, the minimum effective successor state is computed; when the minimum effective successor state is zero, the maximum effective successor state is computed, and the first group of transformation parameters that minimizes the maximum effective successor state is recorded as the optimal transformation parameters.
Further, the effective successor states are computed as inf_{s,i,j} = min{((2i + 1) × c + j) mod 256 | A[s, c] ≠ -1} and sup_{s,i,j} = max{((2i + 1) × c + j) mod 256 | A[s, c] ≠ -1}, where A[] is the state row of the state transition matrix, c is a character of the target text, s is the current state, and i, j are the candidate transformation parameters.
Further, the candidate transformation parameter i ranges from 0 to 127 and j ranges from 0 to 255.
Further, the transformed effective state row is <A[s, M[s]], A[s, (N[s] + M[s]) mod 256], A[s, (N[s] × 2 + M[s]) mod 256], ..., A[s, (N[s] × lc[s] + M[s]) mod 256]>, where N[s], M[s], and lc[s] are the data structures, A[] is the state row of the state transition matrix, and s is the current state.
Further, N[s] = 2k + 1, M[s] = m, and lc[s] equals the maximum effective successor state, where k and m are the optimal transformation parameters.
Further, the linear congruence function is f_s(c) = N[s] × c + M[s], where N[s] and M[s] are the data structures, c is a character of the target text, and s is the current state.
Further, the final successor state is A[s, c'], where A[] is the state row of the state transition matrix, s is the current state, and c' is the transformed character.
A linear congruence character set transformation system for automaton space compression, comprising:
a rule compiler, for reading and parsing the pattern set, building the state transition diagram of the automaton, and generating the state transition matrix;
a transformation parameter generator, for generating the optimal transformation parameters;
a state row transformer, for reading the state transition matrix row by row, receiving the optimal transformation parameters, and transforming each state row;
a comparator, for deciding from the transformation result whether to update the compressed automaton memory, and generating a comparison result;
a compressed automaton memory, for reading the transformation result according to the comparison result and updating its internal storage;
a state register, for storing the current state;
a character set transformer, for reading the text character by character and transforming each character according to the current state stored in the state register and the corresponding transformation parameters stored in the compressed automaton;
a text scanner, for computing the next state from the current state stored in the state register, the character sent by the character set transformer, and the state row stored in the compressed automaton, and updating the state register.
A computer-readable storage medium storing a computer program, the computer program comprising instructions which, when executed by a processor of a server, cause the server to perform the steps of the above method.
The method of the invention keeps the time complexity of a state transition at O(1), matches data quickly, and greatly reduces the storage of the data structure.
Brief description of the drawings
Fig. 1 is a schematic diagram of character set transformation.
Fig. 2 is a structural diagram of a linear congruence character set transformation system for automaton space compression.
Fig. 3 is a state transition diagram of the automaton.
Figs. 4A-4D are statistical charts of the results of experiments 1-4.
Detailed description of the embodiments
To make the above features and advantages of the invention clearer and easier to understand, embodiments are described in detail below with reference to the accompanying drawings.
The linear congruence character set transformation method for automaton space compression provided by the invention (hereinafter, the congruence transformation method) takes the automaton space compression method based on character set transformation (hereinafter, the XOR transformation method) as its prototype. It keeps the time complexity of a state transition at O(1) while greatly reducing the storage of the data structure.
As shown in Fig. 1, the main idea of the invention is to transform the character set with a linear congruence function f_s(c) = n_s × c + m_s so that the effective states of a state row become as contiguous as possible. In the figure, A[s] is a state row of the state transition matrix, p is the offset of each successor state within the state row, and c is the corresponding input character. The left side of the figure shows the XOR transformation method, in which an XOR function maps the input character c to the offset p. Although A[s] has only 3 effective successor states, storing everything from the first to the last effective successor state requires 7 successor states in the left shaded region, 4 of them invalid. In the congruence transformation method shown on the right, the linear congruence function f_s(c) = 3 × c + 3 maps the input character c to the offset p, so the offsets of the effective successor states fall in a more contiguous region; storing 4 successor states in the right shaded region is enough to contain all effective successor states.
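One property of the linear congruence function is worth making explicit. Because the character set size 256 is a power of two, every odd multiplier is invertible modulo 256, so the function reduced modulo 256 is a permutation of the character set and no two characters can land on the same offset. This is presumably why the method later stores the multiplier in the odd form 2k + 1. The short check below is an illustrative sketch; the function names are assumptions.

```python
def linear_congruence(n, m, c, modulus=256):
    """f(c) = (n * c + m) mod modulus."""
    return (n * c + m) % modulus


def is_permutation(n, m, modulus=256):
    """True if f maps the character set onto itself without collisions."""
    images = {linear_congruence(n, m, c, modulus) for c in range(modulus)}
    return len(images) == modulus


# Every odd multiplier gives a permutation; an even one does not.
all_odd_ok = all(is_permutation(2 * k + 1, 3) for k in range(128))
print(all_odd_ok, is_permutation(3, 3), is_permutation(2, 3))
```

An even multiplier such as 2 maps c and c + 128 to the same offset, which would merge two different transitions of a state row.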
Like the XOR transformation method, the congruence transformation method is divided into an initialization stage and a matching stage, described as follows.
Initialization stage:
1. Build the pattern set into an automaton in the matrix representation.
2. Compute the optimal transformation parameters for each state row of the state transition matrix: read each state row A[s] of the automaton; let the candidate transformation parameter i range from 0 to 127 and j from 0 to 255, and compute the minimum effective successor state inf_{s,i,j} = min{((2i + 1) × c + j) mod 256 | A[s, c] ≠ -1}. If inf_{s,i,j} = 0, compute the maximum effective successor state sup_{s,i,j} = max{((2i + 1) × c + j) mod 256 | A[s, c] ≠ -1}. Record the first pair of parameters i, j that minimizes sup_{s,i,j} as k, m.
3. Store the transformation parameters and the compressed automaton: record the data structures N[s] = 2k + 1, M[s] = m, lc[s] = sup_{s,k,m}, and replace the state row with the transformed effective state row <A[s, M[s]], A[s, (N[s] + M[s]) mod 256], A[s, (N[s] × 2 + M[s]) mod 256], ..., A[s, (N[s] × lc[s] + M[s]) mod 256]>.
This completes the initialization stage.
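The parameter search and row compression of the initialization stage can be sketched as below. This is a minimal sketch, not the patented implementation: it stores each valid successor at the offset its character is transformed to, which is what the matching stage looks up, and the 16-symbol test rows are assumptions chosen to be consistent with the worked example later in the text. All function names are illustrative.

```python
def best_parameters(row, alphabet_size=256):
    """Return (k, m, sup): the first (i, j) with inf == 0 and minimal sup."""
    valid = [c for c, target in enumerate(row) if target != -1]
    if not valid:
        return None
    best = None
    for i in range(alphabet_size // 2):      # multiplier is 2i + 1 (odd)
        for j in range(alphabet_size):
            mapped = [((2 * i + 1) * c + j) % alphabet_size for c in valid]
            if min(mapped) != 0:             # require inf == 0
                continue
            sup = max(mapped)
            if best is None or sup < best[2]:  # strict '<' keeps the first
                best = (i, j, sup)
    return best


def compress_row(row, alphabet_size=256):
    """Return (N, M, lc, compact_row) for one state row."""
    params = best_parameters(row, alphabet_size)
    if params is None:
        return None
    k, m, sup = params
    n = 2 * k + 1
    compact = [-1] * (sup + 1)
    for c, target in enumerate(row):
        if target != -1:
            # Place the successor where the matching stage will look for it.
            compact[(n * c + m) % alphabet_size] = target
    return n, m, sup, compact


# Assumed 16-symbol row for state 0: transitions on 0, 5 and F.
row0 = [-1] * 16
row0[0x0], row0[0x5], row0[0xF] = 1, 3, 2
print(compress_row(row0, 16))
```

The search is exhaustive (128 × 256 candidates per row for a byte alphabet), which is acceptable because it runs once at initialization and not during scanning.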
Matching stage:
During matching, the four data structures N, M, lc, and A produced by the compression above give the state transition formula of the automaton:
next(s, c) = A[s, c'] if c' ≤ lc[s], and fail otherwise, where c' = (N[s] × c + M[s]) mod 256
The detailed process is as follows:
1. Read a character c of the text to be scanned and, according to the current state s, compute c' = (N[s] × c + M[s]) mod 256;
2. If c' ≤ lc[s], the successor state is A[s, c'];
3. Otherwise, return a match failure.
This completes the matching stage.
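The transition step above amounts to one multiply-add, one reduction modulo 256, and a single comparison. The sketch below illustrates it on a one-state automaton whose valid characters are 0 and 255; the parameter values N = 1, M = 1, lc = 1 were worked out by hand for this illustration and are not taken from the patent.

```python
FAIL = -1


def next_state(N, M, lc, A, s, c, alphabet_size=256):
    """One transition of the compressed automaton."""
    c_prime = (N[s] * c + M[s]) % alphabet_size
    if c_prime <= lc[s]:          # a single bound check
        return A[s][c_prime]      # may still be -1 for a hole inside the row
    return FAIL


# State 0 has transitions on character 0 (to state 1) and 255 (to state 2).
# With N = 1 and M = 1, character 255 maps to offset 0 and character 0 to 1.
N, M, lc = [1], [1], [1]
A = [[2, 1]]

print(next_state(N, M, lc, A, 0, 0),
      next_state(N, M, lc, A, 0, 255),
      next_state(N, M, lc, A, 0, 7))
```

Only the upper bound is compared because the offset is a non-negative residue and the initialization stage forces the smallest used offset to zero.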
As shown in Fig. 2, the congruence transformation method of the invention is implemented by an automaton space compression system based on character set transformation, as follows:
1) The rule compiler reads and parses the pattern set, builds the state transition diagram of the automaton, and generates the state transition matrix;
2) The state row transformer reads the state transition matrix generated by the rule compiler row by row, receives the transformation parameters sent by the transformation parameter generator, transforms the state row, and sends the length of the transformed state row to the comparator;
3) The comparator decides from the transformation result whether to update the compressed automaton memory and sends the comparison result to the compressed automaton memory;
4) The compressed automaton memory receives the result sent by the comparator, reads the transformation result generated by the state row transformer according to the comparison result, and updates its internal storage;
5) The character set transformer reads the text character by character, transforms each character according to the current state stored in the state register and the corresponding transformation parameters stored in the compressed automaton, and sends it to the text scanner;
6) The text scanner computes the next state from the current state stored in the state register, the character sent by the character set transformer, and the state row stored in the compressed automaton, and updates the state register.
A specific example follows.
For ease of description, let the character set be Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}, with character set size |Σ| = 16, and let the text be T = 5C5F. The pattern set S is as follows:
Table 1 Sample rules
Initialization stage:
1. Parse the pattern set and build the state transition diagram of the automaton, as shown in Fig. 3. From the state transition diagram, build the state transition matrix A[s, c], as shown in Table 2, where -1 denotes an invalid transition and any other number is the successor state after the corresponding character is accepted;
Table 2 State transition matrix
2. Read a row of A; for transformation parameters i = 0...7 and j = 0...15, compute inf_{s,i,j} = min{((2i + 1) × c + j) mod 16 | A[s, c] ≠ -1}; if inf_{s,i,j} = 0, compute sup_{s,i,j} = max{((2i + 1) × c + j) mod 16 | A[s, c] ≠ -1}. Denote the smallest sup_{s,i,j} by sup_s and the corresponding indices by k, m. For example, for A[0], k = 1, m = 3, sup_{0,1,3} = ((2 × 1 + 1) × 0 + 3) mod 16 = 3, and A[0] = <2, -1, 3, 1>;
3. Perform the above operation on every row of A to obtain the four data structures N, M, lc, and A, as shown in Tables 3 and 4:
Table 3 Transformation parameters
s   N[s]  M[s]  lc[s]
0   3     3     3
1   3     12    2
2   3     9     1
3   3     12    1
4   1     9     0
5   1     13    0
6   1     11    0
7   1     2     0
8   1     1     0
Table 4 Compressed automaton
s   0   1   2   3
0   2   -1  3   1
1   4   -1  7
2   7   5
3   6   8
4   7
5   8
6   8
7   9
8   9
This completes the initialization stage.
Matching stage:
1. Read the first character 5 of the text T and, with current state 0, compute c' = (3 × 5 + 3) mod 16 = 2;
2. Since lc[0] = 3, c' ≤ lc[0], so the successor state is A[0, c'] = 3;
3. Repeat the above operation until s = 9.
This completes the matching stage.
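The matching walk-through can be replayed directly from the values in Tables 3 and 4. The sketch below is illustrative; only the table values and the text T = 5C5F come from the example, while the function name and the use of hexadecimal parsing for the 16-symbol alphabet are assumptions.

```python
ALPHABET_SIZE = 16

# Table 3: transformation parameters for states 0..8.
N = [3, 3, 3, 3, 1, 1, 1, 1, 1]
M = [3, 12, 9, 12, 9, 13, 11, 2, 1]
LC = [3, 2, 1, 1, 0, 0, 0, 0, 0]

# Table 4: compressed state rows for states 0..8.
A = [[2, -1, 3, 1], [4, -1, 7], [7, 5], [6, 8], [7], [8], [8], [9], [9]]


def scan(text, start=0):
    """Return the list of states visited while reading the text."""
    state, path = start, []
    for ch in text:
        c = int(ch, 16)                      # symbols 0-9, A-F
        c_prime = (N[state] * c + M[state]) % ALPHABET_SIZE
        if c_prime > LC[state] or A[state][c_prime] == -1:
            break                            # match failure
        state = A[state][c_prime]
        path.append(state)
        if state >= len(A):                  # state 9 has no row: stop
            break
    return path


print(scan("5C5F"))
```

The scan passes through states 3, 6, and 8 and ends in state 9, in agreement with step 3 of the example.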
Beneficial effects of the invention:
The following experiments were carried out on a single machine running 64-bit Linux 4.19.3 (8 GB memory, Intel i7 CPU):
The test program randomly generates the pattern set used to build the automaton and the text to be matched. The pattern set size is specified in each experiment, and the text size is fixed at 10 MB.
Statistical indicators: initialization time, space occupied after initialization, time taken to match the data, and matching speed.
The XOR transformation method is used for comparison; the experimental results are shown in Table 5.
In experiment 1, the pattern strings are 16 bytes long and the pattern set contains 262144 pattern strings. The XOR transformation method occupies 344.30 MB, and the congruence transformation method reduces memory usage to 245.23 MB. The scanning speed is 10.661 MB/s for the XOR transformation method and 11.318 MB/s for the congruence transformation method, a slight improvement (Fig. 4A).
In experiment 2, the pattern strings are 16 bytes long and the pattern set contains 524288 pattern strings. The XOR transformation method occupies 751.55 MB, and the congruence transformation method reduces memory usage to 415.10 MB. The scanning speed is 9.406 MB/s for the XOR transformation method and 10.437 MB/s for the congruence transformation method, a slight improvement (Fig. 4B).
In experiment 3, the pattern strings are 32 bytes long and the pattern set contains 262144 pattern strings. The XOR transformation method occupies 846.81 MB, and the congruence transformation method reduces memory usage to 439.35 MB, nearly half. The scanning speed is 15.314 MB/s for the XOR transformation method and 18.231 MB/s for the congruence transformation method, a speed-up of about 19% (Fig. 4C).
In experiment 4, the pattern strings are 32 bytes long and the pattern set contains 524288 pattern strings. The XOR transformation method occupies 2203.26 MB, and the congruence transformation method reduces memory usage to 920.29 MB, saving more than half of the memory. The scanning speed is 13.114 MB/s for the XOR transformation method and 16.751 MB/s for the congruence transformation method, a speed-up of 27.7% (Fig. 4D).
Table 5 Experimental result statistics
The above experiments show that the compressed automaton of the method of the invention occupies significantly less space than that of the XOR transformation compression method and matches data faster, a clear technical effect. The method and system therefore have broad practical value and many application scenarios.
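The relative savings quoted for the four experiments can be recomputed from the reported figures. The short calculation below is an illustrative sketch; the numbers are copied from the experiment descriptions and the function names are assumptions.

```python
# experiment: (XOR MB, congruence MB, XOR MB/s, congruence MB/s)
EXPERIMENTS = {
    1: (344.30, 245.23, 10.661, 11.318),
    2: (751.55, 415.10, 9.406, 10.437),
    3: (846.81, 439.35, 15.314, 18.231),
    4: (2203.26, 920.29, 13.114, 16.751),
}


def memory_saving_percent(xor_mb, congruence_mb):
    """Share of the XOR method's memory that the congruence method saves."""
    return (1 - congruence_mb / xor_mb) * 100


def speedup_percent(xor_speed, congruence_speed):
    """Relative gain in scanning speed over the XOR method."""
    return (congruence_speed / xor_speed - 1) * 100


for number, (xor_mb, con_mb, xor_speed, con_speed) in EXPERIMENTS.items():
    saving = memory_saving_percent(xor_mb, con_mb)
    speedup = speedup_percent(xor_speed, con_speed)
    print(f"experiment {number}: memory -{saving:.1f}%, speed +{speedup:.1f}%")
```

The calculation agrees with the text: experiment 3 saves slightly less than half of the memory, experiment 4 more than half, and the speed-ups in experiments 3 and 4 are about 19% and 27.7%.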
The above embodiments merely illustrate the technical solution of the invention and do not limit it. A person of ordinary skill in the art may modify the technical solution of the invention or replace it with equivalents without departing from the spirit and scope of the invention; the scope of protection of the invention is defined by the claims.

Claims (10)

1. A linear congruence character set transformation method for automaton space compression, characterized by comprising the following steps:
building a pattern set into an automaton and generating a state transition matrix;
reading each state row of the state transition matrix and computing the optimal transformation parameters and the maximum effective successor state;
obtaining data structures from the state transition matrix and the optimal transformation parameters, and replacing each state row with its transformed effective state row according to the data structures;
reading a character of the target text and, according to the current state, transforming the character with a linear congruence function to obtain a transformed character;
if the transformed character does not exceed the maximum effective successor state, obtaining the final successor state for the transformed character, thereby completing the transition.
2. The method of claim 1, characterized in that, for each state row of the state transition matrix and each candidate transformation parameter, the minimum effective successor state is computed; when the minimum effective successor state is zero, the maximum effective successor state is computed, and the first group of transformation parameters that minimizes the maximum effective successor state is recorded as the optimal transformation parameters.
3. The method of claim 2, characterized in that the effective successor states are computed as inf_{s,i,j} = min{((2i + 1) × c + j) mod 256 | A[s, c] ≠ -1} and sup_{s,i,j} = max{((2i + 1) × c + j) mod 256 | A[s, c] ≠ -1}, where A[] is the state row of the state transition matrix, c is a character of the target text, s is the current state, and i, j are the candidate transformation parameters.
4. The method of claim 3, characterized in that the candidate transformation parameter i ranges from 0 to 127 and j ranges from 0 to 255.
5. The method of claim 1, characterized in that the transformed effective state row is <A[s, M[s]], A[s, (N[s] + M[s]) mod 256], A[s, (N[s] × 2 + M[s]) mod 256], ..., A[s, (N[s] × lc[s] + M[s]) mod 256]>, where N[s], M[s], and lc[s] are the data structures, A[] is the state row of the state transition matrix, and s is the current state.
6. The method of claim 5, characterized in that N[s] = 2k + 1, M[s] = m, and lc[s] equals the maximum effective successor state, where k and m are the optimal transformation parameters.
7. The method of claim 1, characterized in that the linear congruence function is f_s(c) = N[s] × c + M[s], where N[s] and M[s] are the data structures, c is a character of the target text, and s is the current state.
8. The method of claim 1, characterized in that the final successor state is A[s, c'], where A[] is the state row of the state transition matrix, s is the current state, and c' is the transformed character.
9. A linear congruence character set transformation system for automaton space compression, characterized by comprising:
a rule compiler, for reading and parsing the pattern set, building the state transition diagram of the automaton, and generating the state transition matrix;
a transformation parameter generator, for generating the optimal transformation parameters;
a state row transformer, for reading the state transition matrix row by row, receiving the optimal transformation parameters, and transforming each state row;
a comparator, for deciding from the transformation result whether to update the compressed automaton memory, and generating a comparison result;
a compressed automaton memory, for reading the transformation result according to the comparison result and updating its internal storage;
a state register, for storing the current state;
a character set transformer, for reading the text character by character and transforming each character according to the current state stored in the state register and the corresponding transformation parameters stored in the compressed automaton;
a text scanner, for computing the next state from the current state stored in the state register, the character sent by the character set transformer, and the state row stored in the compressed automaton, and updating the state register.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program comprises instructions which, when executed by a processor of a server, cause the server to perform the steps of the method of any one of claims 1-8.
Application CN201910505446.2A, priority date 2019-06-12, filing date 2019-06-12: Linear congruence character set transformation method and system for automaton space compression. Status: Active. Granted as CN110413958B (en).

Publications (2)

Publication Number Publication Date
CN110413958A (en) 2019-11-05
CN110413958B (en) 2020-12-04

Family

ID=68358997

Country Status (1)

Country Link
CN (1) CN110413958B (en)

Also Published As

Publication number Publication date
CN110413958B (en) 2020-12-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant