CN1645374A - Digit marking character string searching technology - Google Patents
Digit marking character string searching technology
- Publication number
- CN1645374A
- Authority
- CN
- China
- Prior art keywords
- character string
- character
- bit
- place value
- base
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
Abstract
A retrieval method for bit-marked character strings: the base characters that make up character strings are divided into m groups, and the base characters of each string are marked in an m-bit value by bitwise "or" operations, recording the string's information as its "place value". "Place value" operations are used to select a preliminary result set R1 from the database records; an ordinary character-by-character comparison is then applied as a secondary search to obtain the final result set R2.
Description
Technical field
The present invention is a character string fuzzy search technique whose purpose is to speed up fuzzy searches of character strings in databases. The base characters that make up character strings are divided into m groups, and a data item W of m bits is used to mark which base characters a string contains. If base character C1 of string S belongs to group n, the n-th bit of W counted from right to left (left to right is equally possible) is set to 1; W is marked in the same way according to the groups of the other base characters C2, C3, C4, ... The marking can be performed with the bitwise "or" operation. Once all base characters have been marked, W records the information of string S and is called the "place value" of S. The "place value" Wn of a string Sn and the "place value" Wt of the string T to be retrieved are combined with a bitwise "and", and the result is called Wg. If Wg equals Wt, then Wn equals or contains Wt. Because different strings can have the same place value, the "place value" operation is used to screen the database records and obtain a preliminary result set R1, and an ordinary character-by-character comparison is then run as a secondary search to produce the final result set R2.
Background technology
At present, fuzzy searches of character strings in databases are performed by character-by-character comparison. For example, to decide whether the string bdopfqew contains the character f, the computer compares f with each character of bdopfqew from beginning to end, which is inefficient.
On October 19, 2004, I applied for a patent on a "prime number replacing character string search technology" (application number 200410067258.X). That method effectively improves the speed of fuzzy string search, but for long strings the product of primes has to be recorded in integers spread over several fields, which requires more storage. To improve the speed of fuzzy string search while reducing the storage requirement, the present invention marks the composition of a string with a small number of bits. After the database has been marked, a bitwise "and" operation is used to screen the records, and character-by-character comparison is then applied to the preliminary results to retrieve the final result.
Summary of the invention
The present invention is a character string fuzzy search technique. The base characters that make up character strings are divided into m groups, and a data item W of m bits is used to mark which base characters a string contains. The process has two steps: first, strings are marked using the bitwise "or" operation; second, retrieval is performed using the bitwise "and" operation. The principle is explained below.
The roughly 21,000 Chinese characters and other symbols covered by GBK all have internal codes. According to these codes, all the characters and symbols are divided into 31 groups, and group n is assigned the value 2^(n-1). In binary, each group's value has a 1 in the n-th bit counted from right to left and 0 in all other bits; this value is called the "basic place value".
Group | Numerical value | Basic place value |
---|---|---|
1 | 1 | 00000000000000000000000000000001 |
2 | 2 | 00000000000000000000000000000010 |
3 | 4 | 00000000000000000000000000000100 |
4 | 8 | 00000000000000000000000000001000 |
5 | 16 | 00000000000000000000000000010000 |
6 | 32 | 00000000000000000000000000100000 |
7 | 64 | 00000000000000000000000001000000 |
8 | 128 | 00000000000000000000000010000000 |
9 | 256 | 00000000000000000000000100000000 |
10 | 512 | 00000000000000000000001000000000 |
11 | 1024 | 00000000000000000000010000000000 |
12 | 2048 | 00000000000000000000100000000000 |
13 | 4096 | 00000000000000000001000000000000 |
14 | 8192 | 00000000000000000010000000000000 |
15 | 16384 | 00000000000000000100000000000000 |
16 | 32768 | 00000000000000001000000000000000 |
17 | 65536 | 00000000000000010000000000000000 |
18 | 131072 | 00000000000000100000000000000000 |
19 | 262144 | 00000000000001000000000000000000 |
20 | 524288 | 00000000000010000000000000000000 |
21 | 1048576 | 00000000000100000000000000000000 |
22 | 2097152 | 00000000001000000000000000000000 |
23 | 4194304 | 00000000010000000000000000000000 |
24 | 8388608 | 00000000100000000000000000000000 |
25 | 16777216 | 00000001000000000000000000000000 |
26 | 33554432 | 00000010000000000000000000000000 |
27 | 67108864 | 00000100000000000000000000000000 |
28 | 134217728 | 00001000000000000000000000000000 |
29 | 268435456 | 00010000000000000000000000000000 |
30 | 536870912 | 00100000000000000000000000000000 |
31 | 1073741824 | 01000000000000000000000000000000 |
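The table can be regenerated mechanically, since group n simply receives the value 2^(n-1). The following is a minimal Python sketch of that rule; the names `M` and `basic_place_value` are illustrative and do not come from the patent.

```python
M = 31  # number of groups, matching the 31 usable bits in the table


def basic_place_value(group: int) -> int:
    """Return the value whose only 1 bit is the group-th bit from the right."""
    if not 1 <= group <= M:
        raise ValueError("group must be between 1 and M")
    return 1 << (group - 1)


# Print a few rows in the same layout as the table: group, value, 32-bit binary.
for group in (1, 2, 8, 31):
    value = basic_place_value(group)
    print(f"{group} | {value} | {value:032b}")
```

Each printed row should match the corresponding row of the table, because a left shift by n-1 places is the same as multiplying 1 by 2 a total of n-1 times.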
Take the character string 大漠孤烟直，长河落日圆 ("the straight long river of lonely cigarette, desert setting sun circle" in the machine translation). Then:
Chinese character | Internal code | Group | Numerical value | Basic place value |
---|---|---|---|---|
大 (big) | 22823 | 8 | 128 | 00000000000000000000000010000000 |
漠 (desert) | 28448 | 22 | 2097152 | 00000000001000000000000000000000 |
孤 (lonely) | 23396 | 23 | 4194304 | 00000000010000000000000000000000 |
烟 (smoke) | 28895 | 4 | 8 | 00000000000000000000000000001000 |
直 (straight) | 30452 | 11 | 1024 | 00000000000000000000010000000000 |
长 (long) | 27265 | 17 | 65536 | 00000000000000010000000000000000 |
河 (river) | 27827 | 21 | 1048576 | 00000000000100000000000000000000 |
落 (fall) | 31683 | 2 | 2 | 00000000000000000000000000000010 |
日 (day, sun) | 26085 | 15 | 16384 | 00000000000000000100000000000000 |
圆 (circle) | 22278 | 21 | 1048576 | 00000000000100000000000000000000 |
Sum of the numerical values | | | 8471690 | |
Place value of the whole character string | | | 7423114 | 00000000011100010100010010001010 |
Applying the bitwise "or" operation to the basic place values of the ten characters 大, 漠, 孤, 烟, 直, 长, 河, 落, 日, 圆 (big, desert, lonely, smoke, straight, long, river, fall, day, circle) gives the "place value" of the whole string: 00000000011100010100010010001010.
Seen another way, the values of the ten characters sum to 8471690; removing one copy of the value 1048576, which 河 ("river") and 圆 ("circle") share, leaves 7423114, which corresponds to 00000000011100010100010010001010.
The "place value" of any string can be obtained in the same way. The "place value" of 白云千载空悠悠 ("white clouds thousand years empty long") is 00100010000001001010000000010000,
and the "place value" of 长河落日 ("the long river setting sun") is 00000000000100010100000000000010.
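The marking step can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the internal code is assumed to be the absolute value of the character's UTF-16 code unit read as a signed 16-bit integer (which reproduces the codes 27265 and 31683 listed in the table), and the group index is assumed to be that code modulo 31. The names `internal_code` and `place_value` are hypothetical.

```python
M = 31  # number of groups / usable bits


def internal_code(ch: str) -> int:
    """Absolute value of the 16-bit code unit read as a signed integer (assumption)."""
    code = ord(ch)
    if code > 0xFFFF:
        raise ValueError("this sketch only handles characters in the Basic Multilingual Plane")
    if code >= 0x8000:
        code -= 0x10000  # reinterpret as a negative signed 16-bit value
    return abs(code)


def place_value(s: str) -> int:
    """OR together the basic place values of every character in s."""
    value = 0
    for ch in s:
        value |= 1 << (internal_code(ch) % M)
    return value


for text in ("大漠孤烟直长河落日圆", "白云千载空悠悠", "长河落日"):
    print(text, format(place_value(text), "032b"))
```

The comma of the original verse is left out of the first string because the worked table marks only the ten Chinese characters.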
To judge whether the "place value" Wn of string Sn contains or equals the "place value" Wt of T, apply the bitwise "and" operation to Wn and Wt. If the result Wg equals Wt, then Wn contains or equals Wt, and consequently string Sn may contain or equal T. That is:
Wg = Wn and Wt
If Wg = Wt,
then Wn contains or equals Wt,
and Sn may contain or equal T.
| S1: 大漠孤烟直，长河落日圆 | S2: 白云千载空悠悠 |
---|---|---|
Place value of the string | 00000000011100010100010010001010 (W1) | 00100010000001001010000000010000 (W2) |
Place value of T, 长河落日 | 00000000000100010100000000000010 (Wt) | 00000000000100010100000000000010 (Wt) |
"and" value | 00000000000100010100000000000010 (Wg1) | 00000000000000000000000000000000 (Wg2) |
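The comparison in the table is a single mask test. The sketch below uses the three place values exactly as listed; the function name `may_contain` is illustrative.

```python
def may_contain(wn: int, wt: int) -> bool:
    """True when every bit set in wt is also set in wn, i.e. (wn AND wt) == wt."""
    return (wn & wt) == wt


W1 = 0b00000000011100010100010010001010  # place value of S1
W2 = 0b00100010000001001010000000010000  # place value of S2
WT = 0b00000000000100010100000000000010  # place value of T

print(format(W1 & WT, "032b"), may_contain(W1, WT))
print(format(W2 & WT, "032b"), may_contain(W2, WT))
```

A True result only means the record survives screening; it still has to be confirmed by comparing the characters themselves.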
As can be seen from the above, 河 ("river") and 圆 ("circle") have the same "basic place value", and different strings can have identical "place values". The purpose of bit-marked string retrieval is to use bit operations for a preliminary search of the strings in the database, giving R1, and then to run an ordinary character-by-character comparison on that result as a secondary search, giving the final result R2. The bitwise "and" operation is faster than character-by-character comparison. In practice, to improve retrieval speed, R1 should contain as few surplus records as possible, so that it is close to R2 and the time spent on the secondary search is reduced.
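The two-stage retrieval can be sketched over an in-memory list as below. It is a minimal illustration assuming the same grouping as the worked example (signed 16-bit code, absolute value, modulo 31); `search`, `place_value` and the sample records are hypothetical. The record 落日长河 is included on purpose: it has the same characters as the query in a different order, so it passes the bit screen and is only rejected by the secondary comparison.

```python
M = 31


def internal_code(ch: str) -> int:
    """Absolute value of the 16-bit code unit read as signed (assumption)."""
    code = ord(ch)
    if code > 0xFFFF:
        raise ValueError("BMP characters only in this sketch")
    return abs(code - 0x10000 if code >= 0x8000 else code)


def place_value(s: str) -> int:
    value = 0
    for ch in s:
        value |= 1 << (internal_code(ch) % M)
    return value


def search(records, query):
    """Return (R1, R2): bit-screened candidates, then confirmed substring matches."""
    wt = place_value(query)
    r1 = [s for s, wn in records if (wn & wt) == wt]  # preliminary screening
    r2 = [s for s in r1 if query in s]                # character comparison
    return r1, r2


strings = ["大漠孤烟直长河落日圆", "白云千载空悠悠", "落日长河"]
records = [(s, place_value(s)) for s in strings]  # place values computed once, up front
r1, r2 = search(records, "长河落日")
print(r1)
print(r2)
```

The closer R1 is to R2, the less work the second stage has to do, which is the motivation given in the text for keeping R1 small.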
Some notes:
1. Let the average string length be L, the number of database records be R, the length of the string to be retrieved be l, and the number of bits used for marking be m. The number of records in the preliminary result set R1 can then be roughly estimated by the following formula:
[formula not reproduced in this text]
This formula does not take into account the probability distribution of the strings' place values, so it is not accurate, but it gives a general picture of how the parameters influence R1.
Consider a title database of 3,000,000 records with an average string length of 16, marked with 31 bits, where the user's search keyword has length 4; then
[result not reproduced in this text]
It can be seen that for ordinary Chinese words and phrases, titles, place names, and organization names, strings can be marked effectively with the 31 bits other than the sign bit among the 32 bits of a long integer.
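One rough way to see how m, L, and l influence the size of R1 is a small simulation. The sketch below is not the patent's formula: it assumes that every character falls into one of the m groups uniformly at random, which real text does not, and all names (`random_place_value`, `screened_fraction`) are illustrative.

```python
import random


def random_place_value(length: int, m: int, rng: random.Random) -> int:
    """Place value of a random string: each character lands in a uniform random group."""
    value = 0
    for _ in range(length):
        value |= 1 << rng.randrange(m)
    return value


def screened_fraction(m: int, avg_len: int, query_len: int,
                      n_records: int = 20000, seed: int = 0) -> float:
    """Fraction of random records that pass the (Wn AND Wt) == Wt screen."""
    rng = random.Random(seed)
    wt = random_place_value(query_len, m, rng)
    hits = 0
    for _ in range(n_records):
        wn = random_place_value(avg_len, m, rng)
        if (wn & wt) == wt:
            hits += 1
    return hits / n_records


for m in (8, 31, 63):
    print(m, screened_fraction(m, avg_len=16, query_len=4))
```

Multiplying the returned fraction by R gives a rough size for R1 under the uniformity assumption; more marking bits should let fewer records through.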
For databases with a longer average string length and more records, the 63 bits of the bigint data type in SQL Server 2000 can be used for marking; correspondingly, the base characters are divided into 63 groups. On a 32-bit processor, however, performing the bitwise "and" on bigint values Wn and Wt and comparing Wg with Wt will inevitably take more time, so comparative tests should be run before deciding whether to adopt it. In fact, any data type in any database that conveniently supports bitwise "or" and "and" can be used to mark strings, and a purpose-built data type without a sign bit, constructed by custom programming, would naturally be better still.
3. For Chinese word databases in which the great majority of entries are two-character words, marking with basic place values that have two bits set to 1 can be considered, dividing the Chinese characters into 31!/(2!*(31-2)!) groups, i.e. 465 groups. In that case, however, the "place value" of a two-character word usually has 4 bits set to 1, which can be resolved into 4!/(2!*(4-2)!), i.e. 6, possible character groups. If the user searches a database of 100,000 words and phrases with a single Chinese character, then
[result not reproduced in this text]
It can be seen that marking with basic place values that have two bits set to 1 improves performance slightly, but the range of application of this approach is limited. In other words, when the string to be retrieved is a single character, bit-marked string retrieval performs worse than the prime number replacing character string retrieval method.
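The two-bit variant can be sketched by enumerating every pair of bit positions among 31 bits. The text does not say how characters are assigned to the 465 groups; the sketch below simply takes the internal code modulo 465, which is an assumption, as are all the names.

```python
from itertools import combinations
from math import comb

BITS = 31
PAIRS = list(combinations(range(BITS), 2))  # every way to choose 2 of the 31 bit positions


def two_bit_basic_value(group: int) -> int:
    """Basic place value with exactly two 1 bits for a 0-based group index."""
    i, j = PAIRS[group]
    return (1 << i) | (1 << j)


def internal_code(ch: str) -> int:
    """Absolute value of the 16-bit code unit read as signed (assumption)."""
    code = ord(ch)
    if code > 0xFFFF:
        raise ValueError("BMP characters only in this sketch")
    return abs(code - 0x10000 if code >= 0x8000 else code)


def place_value(s: str) -> int:
    value = 0
    for ch in s:
        value |= two_bit_basic_value(internal_code(ch) % len(PAIRS))
    return value


print(len(PAIRS), comb(4, 2))
print(format(place_value("长河"), "032b"))
```

A two-character word sets at most four bits, and four set bits are consistent with comb(4, 2) = 6 different two-bit groups, which is why a single-character query still lets through words that do not contain it.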
4. For Chinese, the base character is normally the Chinese character; when retrieving individual Chinese characters, a more basic coding unit may be used instead. For Chinese pinyin, it can be a letter, an initial, a final, or a syllable. For other languages, it can be a letter, a syllable, a word, and so on.
5. Besides software factors such as the programming language and the database, the choice of data type for marking should also take the word size of the CPU into account. For a 64-bit CPU, marking strings with 64 bits should be preferred, to make full use of the CPU's performance and to improve the dispersion of the "place values".
6. If the grouping of base characters balances character frequency across the groups, performance is naturally optimal. Grouping by taking the Chinese character's internal code modulo the number of groups is relatively easy to implement, but it is not the optimal grouping.
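A frequency-balanced grouping can be approximated with a simple greedy rule: count the characters in a sample of the database, then hand each character, most frequent first, to the group whose total frequency is currently smallest. This is one common heuristic and is not a procedure given in the patent; `balanced_groups` and the sample data are hypothetical.

```python
import heapq
from collections import Counter


def balanced_groups(texts, m):
    """Map each character seen in texts to a group 0..m-1, balancing total frequency."""
    freq = Counter(ch for text in texts for ch in text)
    heap = [(0, g) for g in range(m)]  # (current load, group index)
    heapq.heapify(heap)
    group_of = {}
    # Most frequent characters first; ties broken by character for determinism.
    for ch, count in sorted(freq.items(), key=lambda kv: (-kv[1], kv[0])):
        load, g = heapq.heappop(heap)  # lightest group so far
        group_of[ch] = g
        heapq.heappush(heap, (load + count, g))
    return group_of


sample = ["aaab", "bc", "cd"]
mapping = balanced_groups(sample, m=2)
print(mapping)
```

Characters that never occur in the sample have no entry in the mapping, so a real system would still need a fallback such as the modulo rule for them.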
Embodiment
The present invention has been implemented successfully for fuzzy string search in databases of Chinese words, phrases, and titles. The following description builds the database with SQL Server 2000 and uses VB 6.0 as the programming language; fuzzy string search in other programming languages and other databases can be implemented by analogy.
1. set up database
Suppose the database shuku has a table biao containing a field shuming whose data type is nvarchar with length 40. Add another field wei whose data type is "long", i.e. 4 bytes or 32 bits, of which one is the sign bit and the remaining 31 bits can be used.
2. Use the bitwise "or" operation to "mark" the database strings
    Dim shuzu(30) As Long        ' Long array of 31 elements
    Dim x As Integer
    shuzu(0) = 1
    For x = 1 To 30
        shuzu(x) = 2 * shuzu(x - 1)
    Next
    ' The 31 elements now hold 1, 2, 4, 8, 16, ... 1073741824. In binary each
    ' has one bit set to 1 and all others 0: these are the "basic place values".

    Dim biaostr As String        ' string currently being processed
    Dim weizhi As Long           ' place value of the whole string
    Dim weizhilin As Long        ' basic place value of one character
    Dim index As Integer         ' group index of one character, 0 to 30

    biaors.MoveFirst             ' move to the first record of recordset biaors
    Do
        weizhilin = 0
        weizhi = 0
        With biaors
            biaostr = .Fields("shuming")     ' read the string of one record
        End With
        For x = 1 To Len(biaostr)
            ' Take one character, reduce its internal code modulo 31 and take
            ' the absolute value; this assigns the base character to a group.
            index = Abs(AscW(Mid(biaostr, x, 1)) Mod 31)
            weizhilin = shuzu(index)         ' one of 1, 2, 4, ... 1073741824
            weizhi = weizhi Or weizhilin     ' bitwise "or" into the place value
        Next
        ' weizhi is now the "place value" of the current string.
        With biaors
            .Fields("wei") = weizhi          ' store it in field wei
        End With
        biaors.Update
        biaors.MoveNext                      ' process the next record
    Loop While Not biaors.EOF
3. Use the bitwise "and" operation to perform the fuzzy search of database strings
    Dim shuzu(30) As Long        ' Long array of 31 elements
    Dim x As Integer
    shuzu(0) = 1
    For x = 1 To 30
        shuzu(x) = 2 * shuzu(x - 1)
    Next
    ' Values 1, 2, 4, 8, 16, ... 1073741824, identical to the array used for "marking".

    Dim weizhi As Long           ' place value of the string to be retrieved
    Dim weizhilin As Long        ' basic place value of one character
    Dim textstr As String        ' current search string
    Dim index As Integer
    Dim strQuery As String

    weizhilin = 0
    weizhi = 0
    textstr = Text1.Text         ' Text1.Text is the string to be retrieved
    For x = 1 To Len(textstr)
        index = Abs(AscW(Mid(textstr, x, 1)) Mod 31)
        weizhilin = shuzu(index)
        weizhi = weizhi Or weizhilin
    Next
    ' weizhi is now the "place value" of the search string, computed in the
    ' same way as when the database strings were "marked".

    ' The inner query applies the bitwise "and" between the search string's place
    ' value and each record's place value (preliminary search); the outer query
    ' applies an ordinary LIKE match (secondary search) to get the final result.
    ' This is SQL Server 2000 syntax; other databases may differ slightly.
    strQuery = "select * from (SELECT * FROM biao WHERE (wei & " & weizhi & ") = " & weizhi & ") DERIVEDTBL WHERE (shuming like '%" & textstr & "%')"

    Adodc1.RecordSource = strQuery
    Adodc1.Refresh               ' execute the retrieval
    DataList1.ListField = "shuming"
    DataList1.ReFill             ' show the current result in the list box
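For readers without SQL Server 2000 and VB 6.0, the same marking and two-stage query can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions, not the embodiment itself: the table and field names follow the text (biao, shuming, wei), the grouping mimics the signed 16-bit AscW result followed by Abs and Mod 31, and `build_db` and `fuzzy_search` are hypothetical helpers. The query uses bound parameters instead of string concatenation.

```python
import sqlite3

M = 31


def internal_code(ch: str) -> int:
    """Absolute value of the 16-bit code unit read as signed (assumption)."""
    code = ord(ch)
    if code > 0xFFFF:
        raise ValueError("BMP characters only in this sketch")
    return abs(code - 0x10000 if code >= 0x8000 else code)


def place_value(s: str) -> int:
    value = 0
    for ch in s:
        value |= 1 << (internal_code(ch) % M)
    return value


def build_db(titles):
    """Create an in-memory table biao and store each title with its place value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE biao (shuming TEXT, wei INTEGER)")
    conn.executemany("INSERT INTO biao (shuming, wei) VALUES (?, ?)",
                     [(t, place_value(t)) for t in titles])
    return conn


def fuzzy_search(conn, text):
    """Bitwise screen in the inner query, ordinary LIKE match in the outer query."""
    wt = place_value(text)
    sql = ("SELECT shuming FROM "
           "(SELECT shuming FROM biao WHERE (wei & ?) = ?) AS derivedtbl "
           "WHERE shuming LIKE ?")
    return [row[0] for row in conn.execute(sql, (wt, wt, f"%{text}%"))]


conn = build_db(["大漠孤烟直长河落日圆", "白云千载空悠悠", "落日长河"])
print(fuzzy_search(conn, "长河落日"))
```

The sketch does not escape `%` or `_` inside the search text, so those characters would act as LIKE wildcards.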
Claims (10)
1. A character string fuzzy search technique, characterized in that: the base characters that make up character strings are divided into m groups, and a data item W of m bits is used to mark the base characters of a string. If base character C1 of string S belongs to group n, the n-th bit of W counted from right to left (or from left to right) is set to 1; W is marked in the same way according to the groups of the other base characters C2, C3, C4, ... After all base characters have been marked, W records the information of string S and is called the "place value" of S. The "place value" Wn of string Sn is compared with the "place value" Wt of the string T to be retrieved; if Wn equals or contains Wt, then Sn may equal or contain T, thereby realizing fuzzy search of character strings.
2. The method according to claim 1, characterized in that: for marking, each group of base characters may first be assigned a "basic place value" in which the corresponding n-th bit is 1 and all other bits are 0; the bitwise "or" operation is then applied to the "basic place values" of all the base characters of a string to obtain the string's "place value".
3. The method according to claim 1, characterized in that: whether one "place value" contains another can be determined with the bitwise "and" operation. The bitwise "and" operation is applied to the "place value" Wn of string Sn and the "place value" Wt of string T, and the result is called Wg; if Wg equals Wt, then Wn equals or contains Wt.
4. The method according to claim 1, characterized in that: because different strings can have the same "place value", the "place value" operation is used to screen the database records and obtain a preliminary result R1, and an ordinary character-by-character comparison is then performed as a secondary search to obtain the final retrieval result R2.
5. The method according to claim 1, characterized in that: for ordinary databases of words, phrases, and proper nouns, strings can be marked effectively with the 31 bits other than the sign bit among the 32 bits of a long integer. For databases with a larger average string length, the 63 bits of the bigint data type in SQL Server 2000 can be used for marking, and the base characters should correspondingly be divided into 63 groups. Any data type in any database that supports bitwise "or" and "and" can be used to mark strings, and a purpose-built data type without a sign bit, constructed by custom programming, is better still.
6. The method according to claims 1 and 5, characterized in that: the number of bits used for marking should take into account not only the average string length but also the number of records in the current database; databases with more records should be marked with more bits, and the base characters should correspondingly be divided into more groups.
7. The method according to claims 1, 5 and 6, characterized in that: besides software factors such as the programming language and the database, the choice of data type for marking should also take the word size of the CPU into account. For a 64-bit CPU, marking strings with 64 bits should be preferred, to make full use of the CPU's performance and to improve the dispersion of the "place values".
8. The method according to claim 1, characterized in that: for Chinese, the base character is normally the Chinese character, and when retrieving individual Chinese characters a more basic coding unit may be used instead; for Chinese pinyin, it can be a letter, an initial, a final, or a syllable; for other languages, it can be a letter, a syllable, a word, and so on.
9. The method according to claim 1, characterized in that: when the base characters are grouped, the number of base characters in each group need not be equal; instead, the sum of the frequencies of the base characters in each group, within the language or the current database, should be made as balanced as possible, and high-frequency base characters in particular should be distributed evenly among the groups, so that performance is optimal.
10. The method according to claim 1, characterized in that: for Chinese word databases in which the great majority of entries are two-character words, basic place values with two bits set to 1 can be used for marking, dividing the Chinese characters into 31!/(2!*(31-2)!) groups, i.e. 465 groups, to improve performance.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNA2005100233835A CN1645374A (en) | 2005-01-17 | 2005-01-17 | Digit marking character string searching technology |
CN200510057491.4A CN101488127B (en) | 2005-01-17 | 2005-09-13 | Bit mark character string fuzzy retrieval method for grouping character and labellng with bit |
PCT/CN2005/001642 WO2006074586A1 (en) | 2005-01-17 | 2005-10-08 | Retrieval technology of character string marked with bit |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNA2005100233835A CN1645374A (en) | 2005-01-17 | 2005-01-17 | Digit marking character string searching technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1645374A (en) | 2005-07-27 |
Family
ID=34875846
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA2005100233835A Pending CN1645374A (en) | 2005-01-17 | 2005-01-17 | Digit marking character string searching technology |
CN200510057491.4A Active CN101488127B (en) | 2005-01-17 | 2005-09-13 | Bit mark character string fuzzy retrieval method for grouping character and labellng with bit |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200510057491.4A Active CN101488127B (en) | 2005-01-17 | 2005-09-13 | Bit mark character string fuzzy retrieval method for grouping character and labellng with bit |
Country Status (2)
Country | Link |
---|---|
CN (2) | CN1645374A (en) |
WO (1) | WO2006074586A1 (en) |
-
2005
- 2005-01-17 CN CNA2005100233835A patent/CN1645374A/en active Pending
- 2005-09-13 CN CN200510057491.4A patent/CN101488127B/en active Active
- 2005-10-08 WO PCT/CN2005/001642 patent/WO2006074586A1/en not_active Application Discontinuation
Also Published As
Publication number | Publication date |
---|---|
CN101488127A (en) | 2009-07-22 |
CN101488127B (en) | 2015-01-07 |
WO2006074586A1 (en) | 2006-07-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication |