CN101751416A - Method for ordering and seeking character strings - Google Patents

Info

Publication number
CN101751416A
CN101751416A CN200810227539A CN200810227539A CN101751416A CN 101751416 A CN101751416 A CN 101751416A CN 200810227539 A CN200810227539 A CN 200810227539A CN 200810227539 A CN200810227539 A CN 200810227539A CN 101751416 A CN101751416 A CN 101751416A
Authority
CN
China
Prior art keywords
character string
Gödel
character
peptide sequence
encoded value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200810227539A
Other languages
Chinese (zh)
Inventor
李由
贺思敏
付岩
袁作飞
迟浩
王海鹏
王乐珩
孙瑞祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2008-11-28
Filing date
2008-11-28
Publication date
2010-06-23
Application filed by Institute of Computing Technology of CAS
Priority to CN200810227539A
Publication of CN101751416A
Legal status: Pending

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides a method for sorting and searching character strings, comprising the following steps: classifying the characters that occur in all the character strings to be sorted and assigning a numerical value to each class of character, with different classes of character receiving different values; using the value assigned to each character, encoding each character string to be sorted by a Gödel coding method, so that each character string obtains one Gödel code value represented by a number; and comparing the Gödel code values of all the character strings to be sorted and ordering the character strings by the magnitude of those values. The method maps each character string to a Gödel code value represented as a floating-point number and sorts the character strings by sorting those values, thereby improving sorting efficiency.

Description

A method for sorting and searching character strings
Technical field
The present invention relates to the field of character processing, and in particular to a method for sorting and searching character strings.
Background art
At present, operations on character strings, including sorting and searching, are in wide demand on computers in everyday work, study, and research. As a simple example, a user of an Excel document may need to sort the character strings in a table. As a more complex example, when a peptide sequence database is built from a protein database, the character strings used to represent the peptide sequences (usually strings of English letters) likewise need to be sorted. Character strings must also be processed when building electronic dictionaries for various languages or when looking up a name in a telephone directory.
In the prior art, character strings are usually processed on a computer by operating on the strings themselves. Take sorting as an example: one way to sort character strings is quicksort. When two character strings are compared during the sort, they are compared character by character from the first position onward. For the strings "ABCDE" and "ABCEF", the fourth character "D" of the former is less than the fourth character "E" of the latter, so "ABCDE" is ordered before "ABCEF". In addition, a string that is a leading substring (prefix) of another string is usually ordered before that string, that is, "ABCE" is less than "ABCEF". Because the characters must be compared one by one, this method is costly in time, particularly when the strings are long. Expressed as a cost formula: if the average string length is M, comparing two character strings takes O(M) time, and quicksorting all the strings takes approximately O(M*n*log(n)) time, where n is the number of strings. The formula shows that when n is very large, or when the average string length is very long, the time spent sorting the strings is considerable.
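The character-by-character comparison described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name compare_strings is hypothetical, and Python's built-in sorted (which is not quicksort) stands in for the sorting routine. The point it shows is that a single comparison may have to inspect up to M characters.

```python
from functools import cmp_to_key


def compare_strings(a: str, b: str) -> int:
    """Return -1, 0, or 1 by comparing two strings character by character.

    Hypothetical helper for illustration; the cost grows with the length
    of the common prefix, i.e. O(M) for strings of average length M.
    """
    for ca, cb in zip(a, b):
        if ca != cb:
            return -1 if ca < cb else 1
    # All shared positions are equal: the shorter string (a prefix) comes first.
    return (len(a) > len(b)) - (len(a) < len(b))


words = ["ABCEF", "ABCDE", "ABCE"]
# sorted() calls compare_strings O(n*log(n)) times; each call is O(M).
ordered = sorted(words, key=cmp_to_key(compare_strings))
```

With the three strings from the text, ordered places "ABCDE" first, then "ABCE", then "ABCEF".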
The discussion above takes string sorting as an example of the time cost involved. Most other string operations, such as string searching, presuppose sorted strings, so they suffer from the same high time cost. Examining how existing string sorting methods work shows that a main reason for the long sorting time is that two objects being compared (that is, two character strings) cannot be ordered with a single comparison: for two strings whose leading characters are the same, several character comparisons are usually needed before it is known whether the strings are identical or different. In short, existing string processing methods do little to reduce time overhead on a computer and stand in the way of further improving the efficiency of string processing.
Summary of the invention
The object of the present invention is to overcome the low processing efficiency and high time overhead of existing string processing methods, which result from having to handle each character of a string separately, and thereby to provide a string processing method based on Gödel coding.
To achieve this object, the invention provides a method for sorting character strings, comprising:
Step 1), classifying the characters in all the character strings to be sorted and assigning a numerical value to each class of character, with different classes of character receiving different values;
Step 2), using the values assigned to the characters in step 1), encoding each character string to be sorted by the Gödel coding method, so that each character string obtains one Gödel code value represented by a number;
Step 3), sorting all the character strings to be sorted according to their Gödel code values.
In the above technical solution, step 2) comprises:
Step 2-1), representing each position in a character string by a prime number, with different positions represented by different primes;
Step 2-2), multiplying the value assigned in step 1) to the character at each position of the character string by the logarithm of the prime that represents that position, and taking the sum of the products obtained over all positions of the string as the Gödel code value.
The above technical solution further comprises a precomputation step, in which the logarithms of the primes corresponding to the string positions are calculated and stored in advance; in step 2-2), the stored logarithms are used directly.
In the precomputation step, the products of the value assigned to each character in step 1) and the logarithm of each prime may also be calculated and stored in advance; in step 2-2), the stored products are used directly.
In the above technical solution, the Gödel code value is represented as a 64-bit double-precision floating-point number.
The invention also provides a method for searching character strings, comprising:
Step 1), sorting all the character strings in the data source to be searched by the above method for sorting character strings;
Step 2), calculating the Gödel code value of the character string to be found;
Step 3), using the Gödel code value of the character string to be found to look up the corresponding character string in the sorted result of step 1).
The invention further provides a method for creating a peptide sequence dictionary, comprising:
Step 1), performing simulated enzymatic digestion on the protein sequences in a protein database to obtain peptide sequences;
Step 2), sorting the peptide sequences obtained in step 1) by mass and, for peptide sequences of equal mass, further sorting the English letter strings that represent the peptide sequences by the above method for sorting character strings;
Step 3), removing redundant peptide sequences according to the sorted result, then writing the de-duplicated peptide sequences to the peptide sequence dictionary.
In the above technical solution, when the English letter strings representing the peptide sequences are further sorted by the above method for sorting character strings, after the Gödel code value of a peptide sequence has been calculated, a field for the Gödel code value is added to the data structure of that peptide sequence, and the peptide sequences are sorted using the Gödel code value in that field.
The invention also provides a device for sorting character strings, comprising a character string assignment module, a Gödel code generation module, and a sorting module, wherein:
The character string assignment module classifies the characters in all the character strings to be sorted and assigns a numerical value to each class of character, with different classes of character receiving different values;
The Gödel code generation module uses the values assigned to the characters by the character string assignment module to encode each character string to be sorted by the Gödel coding method, so that each character string obtains one Gödel code value represented by a number;
The sorting module compares the Gödel code values of all the character strings to be sorted and sorts the character strings by the magnitude of those values.
The advantages of the invention are:
1. In the string sorting method of the invention, the Gödel coding method maps each character string to a Gödel code value represented as a floating-point number, and the character strings are sorted by sorting those values, which significantly improves sorting efficiency.
2. In the string searching method of the invention, the character string to be found and the character strings being searched are each mapped to Gödel code values represented as floating-point numbers, so that searching for a string is carried out as a search over Gödel code values, which greatly improves search efficiency.
3. In the peptide sequence dictionary creation method of the invention, the English letter strings representing peptide sequences are mapped to Gödel code values represented as floating-point numbers, and peptide sequences of equal mass are sorted by those values to create the dictionary, which improves the efficiency of creating the peptide sequence dictionary.
4. In the invention, the logarithms of the primes and their products with the character values are also calculated in advance, so that the precomputed results can be used directly when a Gödel code is calculated, which speeds up the calculation of Gödel codes on a computer.
Description of the drawings
Fig. 1 is a flow chart of an existing method for building a peptide sequence dictionary;
Fig. 2 is a flow chart of the string sorting method of the invention.
Embodiments
The invention is further described below with reference to the drawings and specific embodiments.
In one embodiment of the invention, building a peptide sequence dictionary for a protein database is taken as an example to illustrate the implementation and application of the method.
A protein database stores a large amount of data about protein sequences. A protein sequence is made up of multiple amino acids, and its subsequences are called peptide sequences. Because amino acids are conventionally denoted by English letters, a peptide sequence composed of amino acids is usually written as a string of English letters, for example "AAIK" or "GK".
The process of building a peptide sequence dictionary from a protein database is shown in Fig. 1. It mainly comprises simulating biological enzymatic digestion to cut the protein sequences in the database into peptide sequences, then removing redundant peptide sequences from the result. Because storing peptide sequences in order of mass speeds up reading by a protein search engine, one way to remove redundancy is to sort all peptide sequences by mass and, for peptide sequences of equal mass, to sort further by the order of the English letter strings that represent them. Once all peptide sequences are sorted, they are scanned from first to last. The first peptide sequence is written to the dictionary. Each subsequent peptide sequence is compared with the one before it: if the masses differ, it is written to the dictionary; if the masses are equal, the sequences are compared, and it is written only if the sequences differ. The scan continues to the last peptide sequence, at which point the peptide sequence dictionary is complete.
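The sort-then-scan de-duplication just described might look like the following Python sketch. The names Peptide and build_dictionary are hypothetical, the masses in any example input are made-up integers rather than real peptide masses, and the ordinary string order is used for the tie-break, as in the existing method of Fig. 1.

```python
from typing import List, Tuple

# (mass, sequence); an integer mass is assumed here for simplicity.
Peptide = Tuple[int, str]


def build_dictionary(peptides: List[Peptide]) -> List[Peptide]:
    """Sort by mass, then by sequence, and drop adjacent duplicates."""
    ordered = sorted(peptides)  # tuples compare by mass first, then sequence
    dictionary: List[Peptide] = []
    for mass, seq in ordered:
        if dictionary:
            prev_mass, prev_seq = dictionary[-1]
            # Same mass and same sequence as the previous entry: redundant.
            if mass == prev_mass and seq == prev_seq:
                continue
        dictionary.append((mass, seq))
    return dictionary
```

Because duplicates are adjacent after sorting, comparing each peptide with the last entry written is equivalent to comparing it with the previous peptide in the scan.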
As the description above shows, when peptide sequences have equal mass during dictionary construction, the English letter strings that represent them must be sorted in alphabetical order. A protein database usually contains tens of thousands of proteins, and the number of peptide sequences derived from them can reach millions, tens of millions, or even billions or more, so sorting equal-mass peptide sequences alphabetically takes a great deal of time. For example, the commonly used swiss-prot database contains about 30 million peptide sequences; IndexToolKit2.0, a protein database indexing toolkit that builds the dictionary with the existing string sorting approach, needs about 50 minutes to process this database, which is clearly a substantial time cost. In practice there are databases much larger than swiss-prot, such as the NCBInr database with about 300 million peptide sequences; if the existing string sorting approach is still used for these, the time required becomes even harder to accept.
To reduce the time cost of string sorting effectively, the invention uses the Gödel coding method to convert character strings into numbers. In the prior art, Gödel coding in essence transforms a sequence of natural numbers (including zero) into a single natural number. For example, the Gödel code of the sequence of natural numbers a, b, c is the natural number $P_1^{a} \times P_2^{b} \times P_3^{c}$, where $P_1$, $P_2$, and $P_3$ are distinct primes. That is, in Gödel coding the position of a number in the sequence is identified by a distinct prime: the first position is marked by the first prime 2, the second position by the second prime 3, and so on. The number at a given position appears as the exponent of the prime that represents that position. For example, the sequence 1, 2, 3 is encoded as $2^{1} \times 3^{2} \times 5^{3}$.
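As a small worked illustration of classical Gödel coding, the following Python sketch computes the product of prime powers for a sequence of natural numbers. The function names primes and godel_number are hypothetical, and trial division is used only because the sequences here are short.

```python
from typing import Iterator, List, Sequence


def primes() -> Iterator[int]:
    """Yield 2, 3, 5, 7, ... by trial division (adequate for short sequences)."""
    found: List[int] = []
    candidate = 2
    while True:
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1


def godel_number(seq: Sequence[int]) -> int:
    """Encode seq as 2**seq[0] * 3**seq[1] * 5**seq[2] * ..."""
    result = 1
    for p, exponent in zip(primes(), seq):
        result *= p ** exponent
    return result


# The sequence 1, 2, 3 from the text: 2**1 * 3**2 * 5**3 = 2 * 9 * 125.
code = godel_number([1, 2, 3])  # 2250
```

Python integers have arbitrary precision, so this exact form never overflows, but the value grows very quickly with sequence length.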
In this embodiment, peptide sequences are all represented by English letters, so the idea of Gödel coding can likewise be used to convert the English letter strings that represent peptide sequences into numbers. Refer to Fig. 2, and take the English letter string ACCDDD, which represents a peptide sequence, as an example. Following the Gödel coding idea described above, this string can be expressed as $2^{A} \times 3^{C} \times 5^{C} \times 7^{D} \times 11^{D} \times 13^{D}$. The English letters involved are then assigned values. In principle it is enough that identical letters receive the same value and different letters receive different values; in this embodiment each letter is assigned a value according to its position in the alphabet. For example, A is represented by 0, C by 2, and D by 3. The Gödel code obtained in this way is very large: on a computer a single code would need many bits and could easily overflow. The logarithm of the code is therefore taken, so that the letter string ACCDDD is expressed as:
$A \times \log 2 + C \times \log 3 + C \times \log 5 + D \times \log 7 + D \times \log 11 + D \times \log 13$ (1)
In this way a character string is compressed into one 64-bit double-precision floating-point number (double).
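Formula (1) can be evaluated with a few lines of Python. This is a sketch under stated assumptions: the helper names first_primes, letter_value, and godel_log_code are hypothetical, the natural logarithm is used because the text does not name a base, and only upper-case English letters are handled.

```python
import math
from typing import List


def first_primes(count: int) -> List[int]:
    """Return the first `count` primes by trial division."""
    found: List[int] = []
    candidate = 2
    while len(found) < count:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def letter_value(ch: str) -> int:
    """Value of an upper-case letter by alphabet position: A -> 0, C -> 2, D -> 3."""
    return ord(ch) - ord("A")


def godel_log_code(s: str) -> float:
    """Sum of letter value times log(prime of that position), as in formula (1)."""
    position_primes = first_primes(len(s))
    return sum(letter_value(ch) * math.log(p) for ch, p in zip(s, position_primes))


code = godel_log_code("ACCDDD")
```

A Python float is an IEEE 754 double, which matches the 64-bit representation named in the text.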
When the Gödel coding idea above is applied to the creation of a peptide sequence dictionary, the English letter strings that represent peptide sequences can be converted by the Gödel coding method at the stage where the peptide sequences have been obtained from the protein sequences in the protein database and are being de-duplicated. Before the conversion, the data structure stored on the computer for a peptide sequence is as shown in Table 1: it contains two fields, the peptide mass (size_t, an integer type) and the peptide sequence (string, a character string type). The peptide sequence field holds the English letter string described above. If the peptide sequences are sorted by the prior-art quicksort method, the sort must operate on the contents of the peptide sequence field.
Table 1
Peptide mass | Peptide sequence
After the conversion, the data structure stored on the computer for a peptide sequence changes accordingly, as shown in Table 2: in addition to the peptide mass (size_t) and peptide sequence (string) fields, it contains the Gödel code value of the peptide sequence (double). Once the data structure has this field, peptide sequences can be sorted using its contents.
Table 2
Peptide mass | Peptide sequence | Gödel code value of the peptide sequence
In the prior art, peptide sequences are sorted by the mass in the structure and, where masses are equal, by the lexicographic order of the sequences. With Gödel coding, they are sorted by the mass in the structure and, where masses are equal, by the Gödel code value. After Gödel coding, the object being sorted is a single number rather than a string of English letters, so repeated comparisons of individual letters are replaced by one comparison of two numbers, which clearly improves sorting efficiency considerably.
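The record of Table 2 and the two-level sort can be sketched in Python as below. The names PeptideRecord, godel_code, and sort_records are hypothetical, as is the limit of 64 characters per sequence; the masses used in any example are made-up integers.

```python
import math
from dataclasses import dataclass
from typing import List


def first_primes(count: int) -> List[int]:
    """Return the first `count` primes by trial division."""
    found: List[int] = []
    candidate = 2
    while len(found) < count:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


# Assumption for this sketch: no sequence is longer than 64 letters.
LOG_PRIMES = [math.log(p) for p in first_primes(64)]


def godel_code(seq: str) -> float:
    """Log-form Gödel code with A -> 0, B -> 1, ... (upper-case letters only)."""
    total = 0.0
    for i, ch in enumerate(seq):
        total += (ord(ch) - ord("A")) * LOG_PRIMES[i]
    return total


@dataclass
class PeptideRecord:
    mass: int          # peptide mass field of Table 1
    sequence: str      # peptide sequence field of Table 1
    code: float = 0.0  # extra Gödel code field of Table 2


def sort_records(records: List[PeptideRecord]) -> List[PeptideRecord]:
    """Fill in the code field, then sort by mass and, for equal mass, by code."""
    for record in records:
        record.code = godel_code(record.sequence)
    return sorted(records, key=lambda r: (r.mass, r.code))
```

Note that among different sequences of equal mass the order by code generally differs from alphabetical order; for the duplicate-removal scan this does not matter, since the scan only needs identical sequences to end up adjacent.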
The method of the invention presupposes a one-to-one correspondence between Gödel code values and peptide sequences. In the experiments for the invention, about 100 million peptide sequences were checked, and the Gödel code values and peptide sequences were found to correspond one to one. In theory, because a Gödel code value occupies 64 bits, the values must collide once the number of distinct peptide sequences (character strings) exceeds $2^{64}$, but that many peptide sequences (character strings) are extremely unlikely to occur in practice.
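A one-to-one check of the kind mentioned above could be run on any data set with a sketch like the following. The names godel_code and find_collisions are hypothetical, the 64-character limit is an assumption, and this is not the verification procedure used in the experiments, which the text does not describe.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def first_primes(count: int) -> List[int]:
    """Return the first `count` primes by trial division."""
    found: List[int] = []
    candidate = 2
    while len(found) < count:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


# Assumption for this sketch: no string is longer than 64 letters.
LOG_PRIMES = [math.log(p) for p in first_primes(64)]


def godel_code(s: str) -> float:
    """Log-form Gödel code with A -> 0, B -> 1, ... (upper-case letters only)."""
    total = 0.0
    for i, ch in enumerate(s):
        total += (ord(ch) - ord("A")) * LOG_PRIMES[i]
    return total


def find_collisions(strings: Iterable[str]) -> Dict[float, List[str]]:
    """Map each code value shared by two or more distinct strings to those strings."""
    groups: Dict[float, Set[str]] = defaultdict(set)
    for s in strings:
        groups[godel_code(s)].add(s)
    return {code: sorted(g) for code, g in groups.items() if len(g) > 1}
```

One observation on this sketch, not a statement from the source: with A assigned the value 0 as in the example, a string and the same string followed by one or more A's receive the same value (for instance "GK" and "GKA"), so a check of this kind is worth running on the actual data; assigning values starting from 1 would avoid that particular case.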
The embodiment above describes one way of using Gödel coding to sort peptide sequences and thereby build a peptide sequence dictionary. In another embodiment, the conversion of the letter strings representing peptide sequences into numbers by Gödel coding can be improved further. As formula (1) shows, converting a letter string into a Gödel code involves logarithm calculations, multiplications, and additions. On a computer, a logarithm calculation takes longer than a multiplication, and a multiplication in turn takes longer than an addition. To further improve efficiency, the logarithms of the first N primes can be calculated in advance, where N is greater than the length of the longest letter string; the precomputed logarithms are then used directly when a Gödel code is calculated, saving the time of computing logarithms. In addition, if the alphabet from which the letter strings are formed is known, the products of the letter values and the logarithms can also be calculated in advance and stored in a two-dimensional array, so that the relevant entries are simply read from the array when a Gödel code is calculated. For example, if the alphabet is denoted Σ and has length M, a two-dimensional array Value[N][M] can be calculated in advance, in which each element is the product of the logarithm of a prime and a letter value: Value[i][j] is the product of the logarithm of the i-th prime and the value of the j-th letter of Σ. Formula (1) then becomes:
Value[0][0] + Value[1][2] + Value[2][2] + Value[3][3] + Value[4][3] + Value[5][3] (2)
Compared with formula (1), this formula requires only floating-point additions and avoids both the logarithm calculations and the floating-point multiplications, which further improves computational efficiency.
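The precomputed table of formula (2) can be sketched in Python as follows. The names ALPHABET, INDEX, build_table, and godel_code are hypothetical, and both the 26-letter upper-case alphabet and the table depth N = 64 are assumptions made for the illustration.

```python
import math
from typing import List

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # the alphabet Σ, length M = 26 (assumed)
INDEX = {ch: j for j, ch in enumerate(ALPHABET)}


def first_primes(count: int) -> List[int]:
    """Return the first `count` primes by trial division."""
    found: List[int] = []
    candidate = 2
    while len(found) < count:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def build_table(max_len: int) -> List[List[float]]:
    """VALUE[i][j] = (value of j-th letter) * log(i-th prime), indices from 0."""
    logs = [math.log(p) for p in first_primes(max_len)]
    return [[j * log_p for j in range(len(ALPHABET))] for log_p in logs]


VALUE = build_table(64)  # N = 64 is an assumed bound on string length


def godel_code(s: str) -> float:
    """Evaluate formula (2): table lookups and floating-point additions only."""
    total = 0.0
    for i, ch in enumerate(s):
        total += VALUE[i][INDEX[ch]]
    return total
```

For "ACCDDD" the loop adds exactly the six entries listed in formula (2), so no logarithm or multiplication is performed at encoding time.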
Building the peptide sequence dictionary with the method of the invention greatly improves efficiency. Taking again the swiss-prot database mentioned earlier, IndexToolKit2.1, which uses the method of the invention, needs only about 6 minutes to build the peptide sequence dictionary for this database, a clear improvement over the 50 minutes needed by IndexToolKit2.0. Even the NCBInr database needs only about 3 hours.
In the embodiments above, the building of a peptide sequence dictionary was taken as an example to describe the implementation and application of the method in detail. In that description, English letter strings were Gödel coded and thereby sorted. Those of ordinary skill in the art will understand that the method is not limited to the English letters of this embodiment. Other kinds of characters, such as Roman letters, Greek letters, and even Chinese characters, can be handled in a similar way to improve processing efficiency, provided that the values assigned to the characters stay the same from start to finish within one complete run of processing.
Besides sorting character strings, the method of the invention can be applied to other string processing. For example, to look up a name in a telephone directory, the names in the directory, represented as character strings, are sorted by Gödel code value, and the Gödel code values are recorded together with the strings. The string of the name to be found is then also converted into a Gödel code value, and a binary search for this value is carried out over the values in the directory; the name is found when an identical Gödel code value is found. Databases, which are widely used in daily life, can likewise use the method of the invention to speed up data lookup.
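The telephone directory lookup described above can be sketched with Python's standard bisect module. The class name Directory and its methods are hypothetical, names are assumed to consist of upper-case English letters and to be at most 64 characters long, and the final string comparison is an added safeguard of this sketch against two names sharing a code value, not a step taken from the text.

```python
import bisect
import math
from typing import List


def first_primes(count: int) -> List[int]:
    """Return the first `count` primes by trial division."""
    found: List[int] = []
    candidate = 2
    while len(found) < count:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


# Assumption for this sketch: no name is longer than 64 letters.
LOG_PRIMES = [math.log(p) for p in first_primes(64)]


def godel_code(name: str) -> float:
    """Log-form Gödel code with A -> 0, B -> 1, ... (upper-case letters only)."""
    total = 0.0
    for i, ch in enumerate(name):
        total += (ord(ch) - ord("A")) * LOG_PRIMES[i]
    return total


class Directory:
    """Names stored in ascending order of their Gödel code values."""

    def __init__(self, names: List[str]) -> None:
        pairs = sorted((godel_code(n), n) for n in names)
        self.codes = [code for code, _ in pairs]
        self.names = [name for _, name in pairs]

    def find(self, name: str) -> int:
        """Return the index of `name` in self.names, or -1 if absent."""
        target = godel_code(name)
        i = bisect.bisect_left(self.codes, target)  # binary search on codes
        while i < len(self.codes) and self.codes[i] == target:
            if self.names[i] == name:  # safeguard added in this sketch
                return i
            i += 1
        return -1
```

The binary search compares one floating-point number per step, so a lookup among n names takes O(log n) numeric comparisons plus the one-time cost of encoding the query.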
Based on the same concept as the method, the invention also provides a device for sorting character strings, which comprises a character string assignment module, a Gödel code generation module, and a sorting module, wherein:
The character string assignment module classifies the characters in all the character strings to be sorted and assigns a numerical value to each class of character, with different classes of character receiving different values.
The Gödel code generation module uses the values assigned to the characters by the character string assignment module to encode each character string to be sorted by the Gödel coding method, so that each character string obtains one Gödel code value represented by a number.
The sorting module compares the Gödel code values of all the character strings to be sorted and sorts the character strings by the magnitude of those values.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions of the technical solution that do not depart from its spirit and scope all fall within the scope of the claims of the invention.

Claims (9)

1. method that character string is sorted comprises:
Step 1), for the character classification in all character strings that will sort, be that a numerical value given in the character of a classification, the numerical value that different classes of character is given is different;
Step 2), in conjunction with the value of giving for each character in the described step 1), adopt the Godel coding method that each character string that will sort is encoded respectively, a character string obtains a umerical Godel encoded radio;
Step 3), all character strings that will sort are done sorting operation according to their Godel encoded radio.
2. the method that character string is sorted according to claim 1 is characterized in that, described step 2) comprising:
Step 2-1), represent a position in the character string, different positions is represented with different prime numbers with a prime number;
Step 2-2), numerical value that locational character in the described character string is endowed in described step 1) does product with the logarithm value of the prime number of this position of expression, with all positions in this character string the addition result of getable product as described Godel encoded radio.
3. the method that character string is sorted according to claim 2, it is characterized in that, also comprise a calculation procedure in advance, in this step, calculate in advance and the logarithm value of the pairing prime number of storage and character string position, at described step 2-2) in, the logarithm value of described prime number directly called.
4. the method that character string is sorted according to claim 3, it is characterized in that, in described calculation procedure in advance, the product that also comprises the logarithm value of calculating and storing numerical value that each character is endowed and each prime number in advance in described step 1), at described step 2-2) in, the product of the logarithm value of numerical value that character is endowed and prime number directly called.
5. the method that character string is sorted according to claim 1 is characterized in that, described Godel encoded radio is represented with 64 double-precision floating points.
6. method that character string is searched comprises:
Step 1), adopt the method that character string is sorted of one of claim 1-5 to being carried out sorting operation by all character strings in the data source of searching;
Step 2), character string to be found is calculated its Godel encoded radio;
Step 3), utilize the Godel encoded radio of character string to be found from the result of the sorting operation of step 1), to search the corresponding characters string.
7. the creation method of a peptide sequence dictionary comprises:
Step 1), the protein sequence in the Protein Data Bank is done analogue enztme cut, obtain peptide sequence;
Step 2), the resulting peptide sequence of step 1) is sorted according to quality, for peptide sequence identical in quality, the method that character string is sorted that adopts one of claim 1-5 is done further ordering to being used to represent the English alphabet of peptide sequence;
Step 3), remove redundant peptide sequence, will go the peptide sequence after the redundancy to write the peptide sequence dictionary then according to ranking results.
8. the creation method of peptide sequence dictionary according to claim 7, it is characterized in that, do the process of further ordering in the method that character string is sorted that adopts one of claim 1-5 to being used for representing the English alphabet of peptide sequence, after calculating the Godel encoded radio of described peptide sequence, add the territory that is used to represent the Godel encoded radio for the data structure of described peptide sequence, utilize Godel in this territory to be encoded to described peptide sequence and do ordering.
9. A device for sorting character strings, characterized by comprising a character string assignment module, a Godel code generation module, and a sorting module, wherein:
the character string assignment module classifies the characters in all character strings to be sorted and assigns one numerical value to each class of characters, different classes of characters being assigned different values;
the Godel code generation module uses the values assigned to the characters by the character string assignment module and the Godel coding method to encode each character string to be sorted, each character string obtaining one Godel encoded value expressed as a number;
the sorting module compares the Godel encoded values of all character strings to be sorted and sorts the character strings by the magnitude of those values.
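The three modules can be read as three small functions chained together. The sketch below treats every distinct character as its own class, numbers the classes from 1, and uses the logarithm of the Godel number as the encoded value; these choices and all function names are assumptions for illustration only.

```python
import math


def assign_values(strings):
    """Assignment module: one distinct value per character class.

    Here each distinct character is its own class, numbered from 1.
    """
    alphabet = sorted(set("".join(strings)))
    return {c: i + 1 for i, c in enumerate(alphabet)}


def first_primes(n):
    """Return the first n primes by trial division."""
    found, cand = [], 2
    while len(found) < n:
        if all(cand % p for p in found):
            found.append(cand)
        cand += 1
    return found


def encode(s, char_value, primes):
    """Godel code generation module (assumed log form of the Godel number)."""
    return sum(char_value[c] * math.log(p) for c, p in zip(s, primes))


def sort_strings(strings):
    """Sorting module: order the strings by the magnitude of their codes."""
    char_value = assign_values(strings)
    primes = first_primes(max((len(s) for s in strings), default=0))
    return sorted(strings, key=lambda s: encode(s, char_value, primes))


ordered = sort_strings(["BA", "AB", "A"])
```

Starting the values at 1 rather than 0 matters: a character valued 0 would contribute nothing to the code, so a string and the same string with such characters appended would receive the same value.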
CN200810227539A 2008-11-28 2008-11-28 Method for ordering and seeking character strings Pending CN101751416A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810227539A CN101751416A (en) 2008-11-28 2008-11-28 Method for ordering and seeking character strings

Publications (1)

Publication Number Publication Date
CN101751416A true CN101751416A (en) 2010-06-23

Family

ID=42478407

Country Status (1)

Country Link
CN (1) CN101751416A (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294694A (en) * 2012-02-27 2013-09-11 华为技术有限公司 Method and device for comparing sizes of keywords in database
CN103294694B (en) * 2012-02-27 2016-09-14 华为技术有限公司 Method and device for comparing sizes of keywords in database
CN103514160B (en) * 2012-06-15 2017-04-12 华为终端有限公司 Sorting method and mobile equipment
CN103514160A (en) * 2012-06-15 2014-01-15 华为终端有限公司 Sorting method and mobile equipment
CN102831232A (en) * 2012-08-30 2012-12-19 山石网科通信技术(北京)有限公司 Character string matching method and device
CN102831232B (en) * 2012-08-30 2015-12-16 山石网科通信技术有限公司 Character string matching method and device
CN104202412B (en) * 2014-09-15 2018-01-23 湖北工业大学 Image storage method and system based on multiple cloud terminals
CN104202412A (en) * 2014-09-15 2014-12-10 湖北工业大学 Image storage method and system based on multiple cloud terminals
CN107251018A (en) * 2014-12-10 2017-10-13 凯恩迪股份有限公司 Apparatus and methods for combinatorial hypermap based data representations and operations
CN108256587A (en) * 2018-02-05 2018-07-06 武汉斗鱼网络科技有限公司 Method, apparatus, computer and storage medium for determining character string similarity
CN108763468A (en) * 2018-05-29 2018-11-06 周宇 Dictionary sequence processing method, device and e-learning equipment
CN108763468B (en) * 2018-05-29 2021-06-22 周宇 Dictionary sorting processing method and device and electronic learning equipment
CN109471859A (en) * 2018-10-17 2019-03-15 北京我知科技有限公司 Method for recording sorting results
CN110020954A (en) * 2019-03-26 2019-07-16 阿里巴巴集团控股有限公司 Revenue distribution method, device and computer equipment
CN110020954B (en) * 2019-03-26 2023-09-05 创新先进技术有限公司 Revenue distribution method and device and computer equipment
CN112364214A (en) * 2020-12-17 2021-02-12 深圳市芯天下技术有限公司 Numerical value grade-based character string sorting method and device, storage medium and terminal

Similar Documents

Publication Publication Date Title
CN101751416A (en) Method for ordering and seeking character strings
Giegerich et al. Efficient implementation of lazy suffix trees
CN108573045B (en) Comparison matrix similarity retrieval method based on multi-order fingerprints
CN109325032B (en) Index data storage and retrieval method, device and storage medium
CN102024047B (en) Data searching method and device thereof
CN1924854B (en) Desktop searching method for intelligent mobile terminal
US20070239663A1 (en) Parallel processing of count distinct values
Navarro et al. Space-efficient top-k document retrieval
US20100191717A1 (en) Optimization of query processing with top operations
CN101082918A (en) Method for enquiring electronic dictionary word with letter index table and system thereof
EP3955256A1 (en) Non-redundant gene clustering method and system, and electronic device
CN112434085A (en) Roaring Bitmap-based user data statistical method
US11736119B2 (en) Semi-sorting compression with encoding and decoding tables
CN106855866A (en) XML document storage method and device
Feigenblat et al. Linear time succinct indexable dictionary construction with applications
US8918374B1 (en) Compression of relational table data files
CN113609313A (en) Data processing method and device, electronic equipment and storage medium
CN108595508B (en) Adaptive index construction method and system based on suffix array
CN110909027A (en) Hash retrieval method
CN110609914B (en) Online Hash learning image retrieval method based on rapid category updating
Usmani et al. Modelling of Efficient Graph-aware Data Storage using DNA.
CN103136274A (en) Date retrieval method and device used for content resource data base
Gupta et al. An efficient compressor for biological sequences
Mehta et al. DNA compression using referential compression algorithm
Farina et al. Indexing sequences of ieee 754 double precision numbers

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20100623