CN110287147A - Character string sorting method and device - Google Patents


Info

Publication number
CN110287147A
Authority
CN
China
Prior art keywords
character
section
unicode
code
character string
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910567581.XA
Other languages
Chinese (zh)
Other versions
CN110287147B (en)
Inventor
林荷滨
李鑫辉
黄凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201910567581.XA
Publication of CN110287147A
Application granted
Publication of CN110287147B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/11File system administration, e.g. details of archiving or snapshots
    • G06F16/122File system administration, e.g. details of archiving or snapshots using management policies
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/16File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • G06F16/90344Query processing by using string matching techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Document Processing Apparatus (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a character string sorting method, comprising: obtaining a first Unicode and a second Unicode; extracting code sections one at a time, in order from the high-order end to the low-order end, from the first Unicode and from the second Unicode, until a first code section extracted from the first Unicode differs from a second code section extracted from the second Unicode; when the character corresponding to the first code section and the character corresponding to the second code section are of different character types, determining the size relation between the two characters according to a preset size relation among multiple character types; determining that the character string containing the greater of the two characters is greater than the other character string; and sorting the two character strings according to the size relation between the first character string and the second character string. With the disclosed method, multiple character strings to be sorted can be sorted quickly even when they contain characters of several types.

Description

Character string sorting method and device
Technical field
The application belongs to the field of computer technology, and in particular relates to a character string sorting method and device.
Background
With the continuous development of storage technology, both terminals and servers can store large numbers of files, and the servers of video websites in particular store massive numbers of video files. To make files easier to manage and find, files are usually sorted. Sorting multiple files is in essence sorting the names of those files, that is, sorting multiple character strings.
Because files are often named casually, a file name may contain characters of several types. How to sort multiple character strings composed of characters of several types is a technical problem faced by those skilled in the art.
Summary of the invention
In view of this, the purpose of the application is to provide a character string sorting method and device for sorting multiple character strings composed of characters of several types.
To achieve the above purpose, the application provides the following technical solutions:
The application provides a character string sorting method, comprising:
Obtaining a first Unicode and a second Unicode, wherein the first Unicode is the Unicode corresponding to a first character string to be sorted, and the second Unicode is the Unicode corresponding to a second character string to be sorted;
Extracting code sections one at a time, in order from the high-order end to the low-order end, from the first Unicode and from the second Unicode, until a first code section extracted from the first Unicode differs from a second code section extracted from the second Unicode, wherein each code section extracted from the first Unicode corresponds to one character in the first character string, and each code section extracted from the second Unicode corresponds to one character in the second character string;
In the case where the character corresponding to the first code section and the character corresponding to the second code section are of different character types, determining the size relation between the character corresponding to the first code section and the character corresponding to the second code section according to a preset size relation among multiple character types;
Determining that the character string containing the greater of the character corresponding to the first code section and the character corresponding to the second code section is greater than the other character string;
Sorting the first character string and the second character string according to the size relation between the first character string and the second character string.
Optionally, on the basis of the above method, the method further comprises:
In the case where the character corresponding to the first code section and the character corresponding to the second code section are both of the numeric type, extracting a third code section from the first Unicode and a fourth code section from the second Unicode, wherein the third code section is the code section corresponding to a first numeric string and the fourth code section is the code section corresponding to a second numeric string; the first numeric string is the numeric string in the first character string to which the character corresponding to the first code section belongs, and the second numeric string is the numeric string in the second character string to which the character corresponding to the second code section belongs;
Converting the third code section into first data of floating-point type, and converting the fourth code section into second data of floating-point type;
Determining that the character string corresponding to the greater of the first data and the second data is greater than the other character string.
Optionally, on the basis of the above method, the method further comprises:
In the case where the character corresponding to the first code section and the character corresponding to the second code section are both of the Chinese character type, obtaining a first pinyin and a second pinyin, wherein the first pinyin is the pinyin of the character corresponding to the first code section, and the second pinyin is the pinyin of the character corresponding to the second code section;
Extracting letters one at a time, in order from left to right, from the first pinyin and from the second pinyin;
If the two letters extracted at the same position in the first pinyin and the second pinyin differ, comparing the two letters and determining that the character string corresponding to the greater of the two letters is greater than the other character string;
If the first pinyin and the second pinyin contain the same number of letters and the letters at every position are identical, comparing the first tone of the first pinyin with the second tone of the second pinyin, and determining that the character string corresponding to the greater of the first tone and the second tone is greater than the other character string.
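The pinyin comparison just described can be sketched as follows. This is only an illustrative sketch, not the applicant's implementation: the lookup table `PINYIN`, the `(letters, tone)` tuple representation and the function names are hypothetical, and a real system would obtain pinyin from a full dictionary. The text does not say what happens when one pinyin is a prefix of a longer one, so the sketch assumes the shorter pinyin is smaller.

```python
# Hypothetical lookup table: character -> (pinyin letters, tone number).
# A real system would use a complete pinyin dictionary instead.
PINYIN = {
    "妈": ("ma", 1),
    "马": ("ma", 3),
    "你": ("ni", 3),
    "门": ("men", 2),
}


def compare_pinyin(p1, p2):
    """Return -1, 0 or 1 for two (letters, tone) pairs."""
    letters1, tone1 = p1
    letters2, tone2 = p2
    # Compare the letters position by position, from left to right.
    for a, b in zip(letters1, letters2):
        if a != b:
            return -1 if a < b else 1
    if len(letters1) != len(letters2):
        # Assumption (not stated in the text): a pinyin that is a prefix
        # of the other one is treated as the smaller of the two.
        return -1 if len(letters1) < len(letters2) else 1
    # Same number of letters, all identical: the tone decides.
    return (tone1 > tone2) - (tone1 < tone2)


def compare_chinese(c1, c2):
    """Compare two Chinese characters by pinyin letters, then by tone."""
    return compare_pinyin(PINYIN[c1], PINYIN[c2])


print(compare_chinese("妈", "马"))  # same letters "ma", tone 1 < tone 3
print(compare_chinese("你", "马"))  # "n" > "m" at the first letter
```

A result of 0 means the two pinyin are identical, which is the case the ASCII-code tie-break in the next optional step is meant to handle.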
Optionally, on the basis of the above method, the method further comprises:
If the first pinyin and the second pinyin are identical, obtaining a first ASCII code of the character corresponding to the first code section and a second ASCII code of the character corresponding to the second code section, comparing the first ASCII code with the second ASCII code, and determining that the character string corresponding to the greater of the first ASCII code and the second ASCII code is greater than the other character string.
Optionally, on the basis of the above method, the method further comprises:
In the case where the extraction of all code sections from either the first Unicode or the second Unicode is completed without any differing code sections having been extracted, determining that the character string corresponding to the longer of the first Unicode and the second Unicode is greater than the other character string.
The application also provides a character string sorting device, comprising:
A data acquisition unit, configured to obtain a first Unicode and a second Unicode, wherein the first Unicode is the Unicode corresponding to a first character string to be sorted, and the second Unicode is the Unicode corresponding to a second character string to be sorted;
A code section extraction unit, configured to extract code sections one at a time, in order from the high-order end to the low-order end, from the first Unicode and from the second Unicode, until a first code section extracted from the first Unicode differs from a second code section extracted from the second Unicode, wherein each code section extracted from the first Unicode corresponds to one character in the first character string, and each code section extracted from the second Unicode corresponds to one character in the second character string;
A character type comparing unit, configured to, in the case where the character corresponding to the first code section and the character corresponding to the second code section are of different character types, determine the size relation between the two characters according to a preset size relation among multiple character types, and to determine that the character string containing the greater of the two characters is greater than the other character string;
A sorting unit, configured to sort the first character string and the second character string according to the size relation between the first character string and the second character string.
Optionally, on the basis of the above device, the device further comprises:
A numeric string comparing unit, configured to, in the case where the character corresponding to the first code section and the character corresponding to the second code section are both of the numeric type, extract a third code section from the first Unicode and a fourth code section from the second Unicode, wherein the third code section is the code section corresponding to a first numeric string and the fourth code section is the code section corresponding to a second numeric string; the first numeric string is the numeric string in the first character string to which the character corresponding to the first code section belongs, and the second numeric string is the numeric string in the second character string to which the character corresponding to the second code section belongs; to convert the third code section into first data of floating-point type and the fourth code section into second data of floating-point type; and to determine that the character string corresponding to the greater of the first data and the second data is greater than the other character string.
Optionally, on the basis of the above device, the device further comprises:
A pinyin comparing unit, configured to, in the case where the character corresponding to the first code section and the character corresponding to the second code section are both of the Chinese character type, obtain a first pinyin and a second pinyin, wherein the first pinyin is the pinyin of the character corresponding to the first code section and the second pinyin is the pinyin of the character corresponding to the second code section; to extract letters one at a time, in order from left to right, from the first pinyin and from the second pinyin; if the two letters extracted at the same position in the first pinyin and the second pinyin differ, to compare the two letters and determine that the character string corresponding to the greater of the two letters is greater than the other character string; and if the first pinyin and the second pinyin contain the same number of letters and the letters at every position are identical, to compare the first tone of the first pinyin with the second tone of the second pinyin and determine that the character string corresponding to the greater of the first tone and the second tone is greater than the other character string.
Optionally, on the basis of the above device, the device further comprises:
A standard code comparing unit, configured to, in the case where the pinyin comparing unit determines that the first pinyin and the second pinyin are identical, obtain a first ASCII code of the character corresponding to the first code section and a second ASCII code of the character corresponding to the second code section, compare the first ASCII code with the second ASCII code, and determine that the character string corresponding to the greater of the first ASCII code and the second ASCII code is greater than the other character string.
Optionally, on the basis of the above device, the device further comprises:
A length comparing unit, configured to, in the case where the extraction of all code sections from either the first Unicode or the second Unicode is completed without any differing code sections having been extracted, determine that the character string corresponding to the longer of the first Unicode and the second Unicode is greater than the other character string.
It can be seen that, in the character string sorting method disclosed in the application, for a first character string and a second character string to be sorted, the first Unicode corresponding to the first character string and the second Unicode corresponding to the second character string are obtained first. Code sections are then extracted one at a time, in order from the high-order end to the low-order end, from the first Unicode and from the second Unicode. If the two extracted code sections are identical, extraction continues until the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode. If the characters corresponding to the first code section and the second code section are of different character types, the size relation between the two characters is determined according to the preset size relation among multiple character types, which in turn determines the size relation between the first character string and the second character string; the two character strings are then sorted according to that relation. The disclosed method converts the character strings to be sorted into Unicode and determines their size relation by comparing the Unicode. Even when the character strings to be sorted contain characters of several types, their size relation can be determined quickly, so that the character strings are sorted and the efficiency of sorting is improved.
Description of the drawings
To explain the technical solutions in the embodiments of the application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of a character string sorting method disclosed in the application;
Fig. 2 is a flow diagram of another character string sorting method disclosed in the application;
Fig. 3 is a flow diagram of another character string sorting method disclosed in the application;
Fig. 4 is a flow diagram of another character string sorting method disclosed in the application;
Fig. 5 is a structural diagram of a character string sorting device disclosed in the application;
Fig. 6 is a structural diagram of another character string sorting device disclosed in the application.
Detailed description of the embodiments
To make the purposes, technical solutions and advantages of the embodiments of the application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the application.
Fig. 1 is a flow diagram of a character string sorting method disclosed in the application. The character string sorting method comprises:
S101: Obtain the first Unicode and the second Unicode.
The first Unicode is the Unicode corresponding to the first character string to be sorted, and the second Unicode is the Unicode corresponding to the second character string to be sorted.
In implementation, two character strings to be sorted are obtained and referred to as the first character string and the second character string. The Unicode corresponding to each character string is determined; the Unicode corresponding to the first character string is called the first Unicode, and the Unicode corresponding to the second character string is called the second Unicode. Unicode, also known as universal code or single code, has several encoding schemes, such as UTF-8, UTF-16 and UTF-32.
UTF-8 is a variable-length encoding scheme that represents one character with 1 to 4 bytes, the length depending on the character. The encoding rules are as follows: for a single-byte character, the first bit is set to 0 and the following 7 bits hold the character's Unicode code point; for a character that needs N bytes (N > 1), the first N bits of the first byte are all set to 1 and bit N+1 is set to 0, the first two bits of each of the remaining N-1 bytes are set to 10, and the remaining bits are filled with the character's Unicode code point.
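The bit patterns produced by these rules can be inspected with Python's built-in `str.encode`. The snippet below is only an illustration; the helper name `utf8_bits` and the sample characters are chosen here and do not come from the application.

```python
def utf8_bits(ch: str) -> list[str]:
    """Return the UTF-8 bytes of one character as 8-bit binary strings."""
    return [format(byte, "08b") for byte in ch.encode("utf-8")]


# One sample character for each UTF-8 length (1, 2, 3 and 4 bytes).
for ch in ("A", "é", "汉", "😀"):
    print(ch, utf8_bits(ch))
# A  ['01000001']
# é  ['11000011', '10101001']
# 汉 ['11100110', '10110001', '10001001']
# 😀 ['11110000', '10011111', '10011000', '10000000']
```

The leading bits of the first byte (0, 110, 1110, 11110) give the number of bytes, and every continuation byte starts with 10.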
UTF-16 is also a variable-length encoding and represents one character with 2 or 4 bytes. UTF-32 uses a 32-bit unsigned integer as its unit: the UTF-32 encoding of a Unicode code point is simply the corresponding 32-bit unsigned integer.
Compared with UTF-16 and UTF-32, UTF-8 can save storage space. In a preferred embodiment, the first Unicode and the second Unicode in the application are in UTF-8 format.
S102: Extract code sections one at a time, in order from the high-order end to the low-order end, from the first Unicode and from the second Unicode, until the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode.
In the application, the byte or bytes (one or more) in the first Unicode or the second Unicode that correspond to one character are called a code section. Each code section extracted from the first Unicode corresponds to one character in the first character string; each code section extracted from the second Unicode corresponds to one character in the second character string.
For example, suppose the first Unicode and the second Unicode each cover 5 characters. In order from the high-order end to the low-order end, the first code section in the first Unicode corresponds to the first character of the first character string, the second code section corresponds to the second character of the first character string, and so on, up to the fifth code section, which corresponds to the fifth character of the first character string. Likewise, the first code section in the second Unicode corresponds to the first character of the second character string, the second code section corresponds to the second character of the second character string, and so on, up to the fifth code section, which corresponds to the fifth character of the second character string.
It should be noted that in the first Unicode and the second Unicode the leftmost bit is the highest-order bit and the rightmost bit is the lowest-order bit. Extracting code sections in order from the high-order end to the low-order end therefore means extracting code sections from the first Unicode and the second Unicode in order from left to right.
In implementation, the encoding scheme used by the first Unicode and the second Unicode determines how code sections are extracted from them one at a time.
Take the case where the first Unicode and the second Unicode are in UTF-8 format as an example:
If a character is represented by one byte, the first bit of the corresponding code section is "0". If a character is represented by two bytes, the corresponding code section contains two bytes: the first three bits of the first byte are "110" and the first two bits of the second byte are "10". If a character is represented by three bytes, the corresponding code section contains three bytes: the first four bits of the first byte are "1110" and the first two bits of each of the other bytes are "10". If a character is represented by four bytes, the corresponding code section contains four bytes: the first five bits of the first byte are "11110" and the first two bits of each of the other bytes are "10".
For example, if the first Unicode is "1110011010110001100010010100", it follows from the UTF-8 encoding rules that "111001101011000110001001" corresponds to one character. The first code section extracted from the first Unicode is therefore "111001101011000110001001", which corresponds to the character "汉".
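As a minimal sketch of the extraction rule above, the following code splits a UTF-8 byte sequence into code sections by inspecting the leading bits of each first byte. The function names `section_length` and `code_sections` are illustrative assumptions, and the sketch assumes well-formed UTF-8 input.

```python
def section_length(lead: int) -> int:
    """Number of bytes in the code section that starts with this byte."""
    if lead >> 7 == 0b0:        # 0xxxxxxx: one byte
        return 1
    if lead >> 5 == 0b110:      # 110xxxxx: two bytes
        return 2
    if lead >> 4 == 0b1110:     # 1110xxxx: three bytes
        return 3
    if lead >> 3 == 0b11110:    # 11110xxx: four bytes
        return 4
    raise ValueError("not a valid first byte of a UTF-8 code section")


def code_sections(data: bytes) -> list[bytes]:
    """Split UTF-8 bytes into code sections, from the high-order end."""
    sections = []
    i = 0
    while i < len(data):
        n = section_length(data[i])
        sections.append(data[i:i + n])
        i += n
    return sections


# The 24-bit example from the text decodes to a single character.
first = int("111001101011000110001001", 2).to_bytes(3, "big")
print(first.decode("utf-8"))                    # 汉
print(code_sections("汉3a".encode("utf-8")))     # three code sections
```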
In implementation, starting from the highest-order bit of the first Unicode, the first code section is extracted; it corresponds to the first character of the first character string. Starting from the highest-order bit of the second Unicode, the first code section is extracted; it corresponds to the first character of the second character string. The two extracted code sections are compared. If they are identical, the second code section is extracted from the first Unicode, corresponding to the second character of the first character string, and the second code section is extracted from the second Unicode, corresponding to the second character of the second character string. The two extracted code sections are again compared. If they are identical, extraction continues in order from the high-order end to the low-order end in the first Unicode and the second Unicode, until the code section extracted from the first Unicode differs from the code section extracted from the second Unicode.
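The extraction loop described above can be sketched as follows. The sketch assumes that iterating over the characters of a Python string and encoding each one is equivalent to extracting UTF-8 code sections one at a time; the name `first_difference` is hypothetical.

```python
def first_difference(s1: str, s2: str):
    """Return (position, first code section, second code section) for the
    first pair of differing code sections, or None if one string is
    exhausted before any difference is found."""
    for position, (c1, c2) in enumerate(zip(s1, s2)):
        section1 = c1.encode("utf-8")
        section2 = c2.encode("utf-8")
        if section1 != section2:
            return position, section1, section2
    return None


print(first_difference("ab的", "ab3"))
print(first_difference("ab", "abc"))  # None: no differing code section
```

The `None` result corresponds to the situation handled by the length rule in the summary, where all code sections of one Unicode are extracted without finding a difference.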
S103: In the case where the character corresponding to the first code section and the character corresponding to the second code section are of different character types, determine the size relation between the two characters according to the preset size relation among multiple character types.
When the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode, the character types of the characters corresponding to the first code section and the second code section are determined. If the two character types differ, the size relation between the character corresponding to the first code section and the character corresponding to the second code section is determined according to the preset size relation among multiple character types.
For example, the character types used in a Chinese-language environment generally include special characters, letters, digits and Chinese characters. In one embodiment, the size relation among the character types is predefined as: special character < digit < letter < Chinese character. If the character corresponding to the first code section is a Chinese character and the character corresponding to the second code section is a digit, then by this predefined relation the character corresponding to the first code section is greater than the character corresponding to the second code section.
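A minimal sketch of this type ordering is shown below. The classification rules are assumptions made for illustration: digits and letters are limited to ASCII, Chinese characters are approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF), and everything else counts as a special character. The names `CharType` and `char_type` are hypothetical.

```python
from enum import IntEnum


class CharType(IntEnum):
    # Values encode the preset order:
    # special character < digit < letter < Chinese character.
    SPECIAL = 0
    DIGIT = 1
    LETTER = 2
    CHINESE = 3


def char_type(ch: str) -> CharType:
    """Classify one character (illustrative rules, see the lead-in)."""
    if "0" <= ch <= "9":
        return CharType.DIGIT
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return CharType.LETTER
    if "\u4e00" <= ch <= "\u9fff":
        return CharType.CHINESE
    return CharType.SPECIAL


# A Chinese character ranks above a digit under the preset order.
print(char_type("汉") > char_type("3"))  # True
```

Using `IntEnum` keeps the preset order in one place, so a different order can be configured by changing only the enum values.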
S104: Determine that the character string containing the greater of the character corresponding to the first code section and the character corresponding to the second code section is greater than the other character string.
That is, if the character corresponding to the first code section is greater than the character corresponding to the second code section, the first character string is determined to be greater than the second character string; if the character corresponding to the first code section is less than the character corresponding to the second code section, the first character string is determined to be less than the second character string.
An example is given below to aid understanding of the technical solution:
Code sections are extracted one at a time, in order from the high-order end to the low-order end, from the first Unicode and from the second Unicode. The two code sections extracted the first time are identical, the two extracted the second time are identical, and the two extracted the third time differ. Assume that the character corresponding to the first code section extracted from the first Unicode is a Chinese character and that the character corresponding to the second code section extracted from the second Unicode is "3". The character type of the character corresponding to the first code section is therefore the Chinese character type, and that of the character corresponding to the second code section is the numeric type. According to the preset size relation among multiple character types, the numeric type is less than the Chinese character type, so the character corresponding to the first code section is greater than the character corresponding to the second code section, and the first character string is determined to be greater than the second character string.
S105: Sort the first character string and the second character string according to the size relation between the first character string and the second character string.
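Steps S101 to S105 can be combined into a pairwise comparator and handed to a standard sort, as in the sketch below. Several parts are assumptions rather than the applicant's implementation: the classification in `type_rank` is the illustrative one (ASCII digits and letters, U+4E00 to U+9FFF for Chinese characters), characters of the same type are compared by code point as a placeholder for the numeric and pinyin rules, and the prefix case uses the length rule from the summary.

```python
from functools import cmp_to_key


def type_rank(ch: str) -> int:
    """Preset order: special (0) < digit (1) < letter (2) < Chinese (3)."""
    if "0" <= ch <= "9":
        return 1
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return 2
    if "\u4e00" <= ch <= "\u9fff":
        return 3
    return 0


def compare(s1: str, s2: str) -> int:
    """Return -1, 0 or 1 depending on the size relation of s1 and s2."""
    for c1, c2 in zip(s1, s2):
        if c1 == c2:
            continue  # identical code sections: keep extracting
        r1, r2 = type_rank(c1), type_rank(c2)
        if r1 != r2:
            return -1 if r1 < r2 else 1
        # Placeholder (assumption): same-type characters are compared by
        # code point here instead of the numeric or pinyin rules.
        return -1 if c1 < c2 else 1
    # No differing code section found: the longer string is greater.
    return (len(s1) > len(s2)) - (len(s1) < len(s2))


names = ["ab的", "ab3", "ab#", "abc", "ab"]
print(sorted(names, key=cmp_to_key(compare)))
# ['ab', 'ab#', 'ab3', 'abc', 'ab的']
```

Wrapping the comparator with `functools.cmp_to_key` extends the two-string rule to any number of strings, since the sort only needs the pairwise size relation.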
In the character string sorting method disclosed above, for a first character string and a second character string to be sorted, the first Unicode corresponding to the first character string and the second Unicode corresponding to the second character string are obtained first. Code sections are then extracted one at a time, in order from the high-order end to the low-order end, from the first Unicode and from the second Unicode. If the two extracted code sections are identical, extraction continues until the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode. If the characters corresponding to the first code section and the second code section are of different character types, the size relation between the two characters is determined according to the preset size relation among multiple character types, which in turn determines the size relation between the first character string and the second character string; the two character strings are then sorted according to that relation. The method converts the character strings to be sorted into Unicode and determines their size relation by comparing the Unicode. Even when the character strings to be sorted contain characters of several types, their size relation can be determined quickly, so that the character strings are sorted and the efficiency of sorting is improved.
Fig. 2 is a flow diagram of another character string sorting method provided by an embodiment of the application. The character string sorting method comprises:
S201: Obtain the first Unicode and the second Unicode.
The first Unicode is the Unicode corresponding to the first character string to be sorted, and the second Unicode is the Unicode corresponding to the second character string to be sorted.
S202: Extract code sections one at a time, in order from the high-order end to the low-order end, from the first Unicode and from the second Unicode, until the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode.
Each code section extracted from the first Unicode corresponds to one character in the first character string; each code section extracted from the second Unicode corresponds to one character in the second character string.
The specific implementation of steps S201 and S202 is the same as that of steps S101 and S102 shown in Fig. 1 and is not repeated here.
S203: in the case where the corresponding character of first yard of section and the corresponding character of second code section are numeric type, Third yard section is extracted in first Unicode, and the 4th yard of section is extracted in the second Unicode.
Wherein, third yard section is the corresponding code section of the first numeric string, and the 4th yard of section is the corresponding code section of the second numeric string;The One numeric string is the corresponding character of first yard of section numeric string affiliated in the first character string, and the second numeric string is that second code section is right The character answered numeric string affiliated in the second character string.It should be noted that the first numeric string and the second numeric string may be only Including one-bit digital, the first numeric string and the second numeric string may also include decimal point.
When the first yard of section extracted in the first Unicode is different from the second code section extracted in the second Unicode When, it determines the character types of the corresponding character of first yard of section, determines the character types of the corresponding character of second code section.If first The corresponding character of code section character corresponding with second code section is numeric type, then, third yard is extracted in the first Unicode Section extracts the 4th yard of section in the second Unicode.
Wherein, third yard section includes first yard of section, may further include continuous with first yard of section, and character types are number Type or code section for decimal point.4th yard of section includes second code section, may further include, and character continuous with second code section Type be numeric type or be decimal point code section.
Third yard section: being converted to the first data of floating type by S204, and the 4th yard of section is converted to the second number of floating type According to.
S205: determine that the corresponding character string of the greater is greater than another character string in the first data and the second data.
That is, the first character string is greater than the second character string if the first data are greater than the second data;If the One data are less than the second data, then it is determined that the first character string is less than the second character string.
It should be noted that the first numeric string and the second numeric string may include decimal point, by third yard section and the 4th yard Section is converted to real-coded GA, is compared again later, can guarantee the correctness of comparison result.
An example is given below to aid understanding of the technical solution:
The two character strings to be sorted are "Wife's Password, Episode 90" and "Wife's Password, Episode 100". The first Unicode corresponding to the first character string "Wife's Password, Episode 90" is obtained, and the second Unicode corresponding to the second character string "Wife's Password, Episode 100" is obtained.
In order from high bits to low bits, code sections are extracted successively from the first Unicode and from the second Unicode, until the first code section, corresponding to the character "9", is extracted from the first Unicode and the second code section, corresponding to the character "1", is extracted from the second Unicode. Because the character "9" corresponding to the first code section and the character "1" corresponding to the second code section are both of numeric type, the third code section is extracted from the first Unicode and the fourth code section is extracted from the second Unicode. The third code section is the code section corresponding to the numeric string "90", and the fourth code section is the code section corresponding to the numeric string "100". The third code section is converted into the first data 90 (decimal) of floating-point type, and the fourth code section is converted into the second data 100 (decimal) of floating-point type. Because the second data is greater than the first data, the character string corresponding to the second data is determined to be greater than the character string corresponding to the first data; that is, the second character string is greater than the first character string.
This shows that, when the sizes of the first character string and the second character string must be determined from the numbers they contain, the present application compares the numeric strings by value rather than as text, which gives a more accurate comparison result. Using the example above: if the numbers in the two character strings were compared as text, the first character "1" of "100" would be less than the first character "9" of "90", leading to the conclusion that the first character string is greater than the second character string, which is wrong.
S206: Sort the first character string and the second character string according to the size relation between them.
In the character string sorting method shown in Fig. 2 of the present application, for a first character string and a second character string to be sorted, the first Unicode corresponding to the first character string and the second Unicode corresponding to the second character string are obtained first. Code sections are then extracted successively from the first Unicode and from the second Unicode, in order from high bits to low bits, until the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode. If the character corresponding to the first code section and the character corresponding to the second code section are both of numeric type, the third code section corresponding to the first numeric string is extracted from the first Unicode and the fourth code section corresponding to the second numeric string is extracted from the second Unicode. The third code section is then converted into first data of floating-point type and the fourth code section into second data of floating-point type, and the sizes of the first character string and the second character string are determined by comparing the two floating-point data. When the size relation between the first character string and the second character string must be determined from numbers, comparing the numeric strings in the two character strings by value allows that size relation to be determined accurately.
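The numeric rule of steps S203 to S205 can be sketched in Python as follows. This is a hedged illustration rather than the patented implementation: characters of a Python string stand in for code sections, the function names `numeric_run` and `compare_numeric` are hypothetical, and the sketch assumes each numeric run is a well-formed number (a run such as "1.2.3" would make `float` raise an error).

```python
def numeric_run(s: str, i: int) -> str:
    """Return the maximal run of digits and decimal points in s that contains index i."""
    allowed = "0123456789."
    start = i
    while start > 0 and s[start - 1] in allowed:
        start -= 1
    end = i
    while end < len(s) and s[end] in allowed:
        end += 1
    return s[start:end]


def compare_numeric(a: str, b: str, i: int) -> int:
    """Compare a and b by the value of the numeric runs around index i.

    i is the index of the first differing character, which is assumed
    to be a digit in both strings. Returns -1, 0 or 1.
    """
    x = float(numeric_run(a, i))  # "third code section" as floating-point data
    y = float(numeric_run(b, i))  # "fourth code section" as floating-point data
    return (x > y) - (x < y)


# The two titles differ first at index 1 ("9" versus "1").
first, second = "第90集", "第100集"
result = compare_numeric(first, second, 1)
```

Here `result` is negative, meaning the first string sorts before the second, whereas a plain text comparison of the same two strings would place "第90集" after "第100集".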
Fig. 3 is a flow diagram of another character string sorting method provided by an embodiment of the present application. The character string sorting method includes:
S301: Obtain the first Unicode and the second Unicode.
The first Unicode is the Unicode corresponding to the first character string to be sorted, and the second Unicode is the Unicode corresponding to the second character string to be sorted.
S302: In order from high bits to low bits, extract code sections successively from the first Unicode and from the second Unicode, until the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode.
Each code section extracted from the first Unicode corresponds to one character in the first character string; each code section extracted from the second Unicode corresponds to one character in the second character string.
The specific implementation of steps S301 and S302 is the same as that of steps S101 and S102 shown in Fig. 1 and is not repeated here.
S303: If the character corresponding to the first code section and the character corresponding to the second code section are both of Chinese character type, obtain a first pinyin and a second pinyin.
The first pinyin is the pinyin of the character corresponding to the first code section, and the second pinyin is the pinyin of the character corresponding to the second code section.
In one implementation, when the character corresponding to the first code section and the character corresponding to the second code section are both of Chinese character type, the first pinyin corresponding to the first code section and the second pinyin corresponding to the second code section are obtained by calling a Chinese pinyin package.
S304: In order from left to right, extract letters successively from the first pinyin and from the second pinyin.
S305: If the two letters extracted at the same position in the first pinyin and the second pinyin differ, compare the sizes of the two letters.
S306: Determine that the character string corresponding to the greater of the two letters is greater than the other character string, then execute S309.
S307: If the first pinyin and the second pinyin contain the same number of letters and the letters at every position are identical, compare the first tone in the first pinyin with the second tone in the second pinyin.
S308: Determine that the character string corresponding to the greater of the first tone and the second tone is greater than the other character string, then execute S309.
It should be noted that the first pinyin and the second pinyin may contain different numbers of letters. If all letters of either pinyin have been extracted without finding two different letters at the same position, the character string corresponding to the pinyin with more letters is determined to be greater than the other character string. For example, the first pinyin is "xian" and the second pinyin is "xiang". The 1st to 4th letters of the first pinyin are identical to the 1st to 4th letters of the second pinyin, so after all letters of the first pinyin have been extracted, no two different letters at the same position have been found. The second character string, corresponding to the second pinyin, is therefore determined to be greater than the first character string, corresponding to the first pinyin.
S309: Sort the first character string and the second character string according to the size relation between them.
An example is given below to aid understanding of the technical solution:
Assume that the character corresponding to the first code section is "文" and the character corresponding to the second code section is "为". Both characters are of Chinese character type, so the first pinyin "wen2" and the second pinyin "wei2" are obtained by calling a Chinese pinyin package. It should be noted that a pinyin includes a tone and that the tone is represented by a number. Optionally, the first tone, second tone, third tone, fourth tone and neutral tone are represented by numbers whose values increase successively, or by numbers whose values decrease successively. The tone may be represented by an integer or by a decimal. For example, the first tone, second tone, third tone, fourth tone and neutral tone are represented by the decimal numbers 1, 2, 3, 4 and 5 respectively. The "2" in the first pinyin "wen2" and in the second pinyin "wei2" is the tone.
In order from left to right, letters are extracted successively from the first pinyin and from the second pinyin, until the first letter "n" extracted from the first pinyin differs from the second letter "i" extracted from the second pinyin. Comparing the first letter "n" with the second letter "i" shows that "n" is greater than "i", so the first character string is determined to be greater than the second character string.
Assume that the character corresponding to the first code section is "集" and the character corresponding to the second code section is "季". Both characters are of Chinese character type, so the first pinyin "ji2" and the second pinyin "ji4" are obtained by calling a Chinese pinyin package. In order from left to right, letters are extracted successively from the first pinyin and from the second pinyin. Because the letters at every position in the two pinyin are identical, the first tone in the first pinyin is compared with the second tone in the second pinyin to determine the sizes of the first character string and the second character string. Taking the case where the first tone, second tone, third tone, fourth tone and neutral tone are represented by numbers of successively increasing value, the second tone in the second pinyin is greater than the first tone in the first pinyin, so the second character string is determined to be greater than the first character string.
It should be noted here that, if the tones in the pinyin are represented by decimals, the two decimals may first be converted into floating-point data when the two tones are compared, and the two floating-point data are then compared.
In the character string sorting method shown in Fig. 3 of the present application, for a first character string and a second character string to be sorted, the first Unicode corresponding to the first character string and the second Unicode corresponding to the second character string are obtained first. Code sections are then extracted successively from the first Unicode and from the second Unicode, in order from high bits to low bits, until the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode. If the character corresponding to the first code section and the character corresponding to the second code section are both of Chinese character type, the first pinyin and the second pinyin are obtained, and the letters at the same position in the two pinyin are compared one by one from left to right. If two letters at the same position differ, the size relation between the first character string and the second character string is determined by comparing those two letters. If the letters at every position are identical, the size relation is determined by comparing the tones of the first pinyin and the second pinyin. The character strings are then sorted accordingly.
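The pinyin comparison of steps S304 to S308 can be sketched in Python as follows. The description obtains pinyin from a Chinese pinyin package without naming one, so this sketch substitutes a small hypothetical lookup table, `PINYIN`, in which the trailing digit is the tone (1 to 4, with 5 for the neutral tone, as in the example above). The function names are also assumptions.

```python
# Hypothetical stand-in for a pinyin package; the trailing digit is the tone.
PINYIN = {
    "文": "wen2",
    "为": "wei2",
    "集": "ji2",
    "季": "ji4",
    "先": "xian1",
    "想": "xiang3",
}


def split_pinyin(p: str):
    """Split a pinyin such as 'wen2' into its letters and its integer tone."""
    return p[:-1], int(p[-1])


def compare_pinyin(c1: str, c2: str) -> int:
    """Compare two Chinese characters by pinyin letters, then length, then tone."""
    letters1, tone1 = split_pinyin(PINYIN[c1])
    letters2, tone2 = split_pinyin(PINYIN[c2])
    # S304-S306: compare letters at the same position from left to right.
    for x, y in zip(letters1, letters2):
        if x != y:
            return (x > y) - (x < y)
    # One pinyin is a prefix of the other: the one with more letters is greater.
    if len(letters1) != len(letters2):
        return (len(letters1) > len(letters2)) - (len(letters1) < len(letters2))
    # S307-S308: identical letters, so the tone decides.
    return (tone1 > tone2) - (tone1 < tone2)
```

With this table, "文" compares greater than "为" because "n" is greater than "i", and "集" compares less than "季" because tone 2 is less than tone 4. A return value of 0 indicates homophones with the same tone, which this rule cannot separate.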
Because Chinese has a large number of homophones, the first differing characters encountered in the first character string and the second character string to be sorted, in order from high bits to low bits, may be homophones. In this case, the sizes of the first character string and the second character string can be determined based on the ASCII codes of the two homophones.
Specifically, the following step may be added to the character string sorting method shown in Fig. 3:
If the first pinyin and the second pinyin are identical, obtain a first ASCII code of the character corresponding to the first code section and a second ASCII code of the character corresponding to the second code section, compare the first ASCII code with the second ASCII code, and determine that the character string corresponding to the greater of the two ASCII codes is greater than the other character string.
In the character string sorting method disclosed above, when the character corresponding to the first code section and the character corresponding to the second code section are homophones, the ASCII codes of the two characters are obtained and the sizes of the two character strings are determined by comparing the two ASCII codes, so that the two character strings can be sorted.
In one implementation, when the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode, if the character corresponding to the first code section and the character corresponding to the second code section are both special characters, the first code section is converted into a third ASCII code and the second code section is converted into a fourth ASCII code, and the character string corresponding to the greater of the third ASCII code and the fourth ASCII code is determined to be greater than the other character string.
In one implementation, when the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode, if the character corresponding to the first code section and the character corresponding to the second code section are both letters, the sizes of the two letters are compared, and the character string corresponding to the greater letter is determined to be greater than the other character string.
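The homophone, special-character and letter rules above each come down to comparing two single characters by a numeric code value. A minimal sketch follows, assuming that Python's built-in `ord` (the Unicode code point, which coincides with the ASCII code for ASCII characters) can stand in for the code value named in the description. The description does not say how a code value is derived for a Chinese character, so that part in particular is an assumption, as is the function name.

```python
def compare_code_values(c1: str, c2: str) -> int:
    """Compare two single characters by numeric code value; returns -1, 0 or 1."""
    v1, v2 = ord(c1), ord(c2)  # assumption: code point stands in for the "ASCII code"
    return (v1 > v2) - (v1 < v2)


# Letters: "b" has a larger code value than "a".
letters = compare_code_values("b", "a")

# Special characters: "!" (33) has a smaller code value than "~" (126).
specials = compare_code_values("!", "~")

# Homophones: "集" and "及" share the pinyin "ji2", so the code value decides.
homophones = compare_code_values("集", "及")
```

Because two different characters never share a code value, this rule always produces a strict order, which is why it is usable as a final tie-break.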
In one implementation, if all code sections of either the first Unicode or the second Unicode have been extracted without finding differing code sections, the character string corresponding to the longer of the first Unicode and the second Unicode is determined to be greater than the other character string.
For example, the first character string is "Couple's Password, Episode 100" and the second character string is "Couple's Password, Episode 100 - Highlights". In order from high bits to low bits, code sections are extracted successively from the first Unicode corresponding to the first character string and from the second Unicode corresponding to the second character string. The first character string contains 9 characters (in the original Chinese), and these are identical to the first 9 characters of the second character string. Therefore, after all code sections of the first character string have been extracted, no differing code sections have been found. In this case, the second character string is determined to be greater than the first character string.
Based on the above embodiments, code sections are extracted successively from the first Unicode and from the second Unicode in order from high bits to low bits. If all code sections of either the first Unicode or the second Unicode have been extracted without finding differing code sections, the character string corresponding to the longer of the two Unicodes is determined to be greater than the other character string.
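The length rule can be sketched in Python as follows, again treating the characters of a string as its code sections. The function name and the use of `None` to signal "a differing character exists, so another rule decides" are assumptions made for illustration.

```python
from typing import Optional


def compare_by_length(a: str, b: str) -> Optional[int]:
    """Apply the length rule when one string is a prefix of the other.

    Returns None if a differing character is found (the rule does not apply),
    otherwise -1, 0 or 1 according to the lengths of a and b.
    """
    for x, y in zip(a, b):
        if x != y:
            return None
    return (len(a) > len(b)) - (len(a) < len(b))


first = "Episode 100"
second = "Episode 100-highlights"
result = compare_by_length(first, second)
```

Here `result` is negative because every character of `first` matches the start of `second` and `second` is longer, so the second string is the greater one.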
The character string sorting method disclosed in the present application is described below with reference to Fig. 4. Referring to Fig. 4, the method includes:
S401: Obtain the first Unicode and the second Unicode.
The first Unicode is the Unicode corresponding to the first character string, and the second Unicode is the Unicode corresponding to the second character string.
S402: In order from high bits to low bits, extract the i-th code section from the first Unicode and from the second Unicode respectively, where the initial value of i is 1.
S403: Judge whether the two extracted i-th code sections are identical; if they are identical, execute S404; if they differ, execute S406.
S404: Judge whether the first Unicode and the second Unicode still have unextracted code sections; if both the first Unicode and the second Unicode have unextracted code sections, execute S405; otherwise, execute S411.
S405: Increase i by 1 and execute S402.
S406: Determine the character types of the characters corresponding to the two i-th code sections. If the character types differ, execute S407. If the character types are the same, execute one of S408, S409, S410 and S412 according to the specific character type: execute S408 if both are of numeric type, execute S409 if both are of Chinese character type, execute S410 if both are of special character type, and execute S412 if both are letters.
S407: Determine the size relation between the characters corresponding to the two i-th code sections according to the preset size relation among multiple character types, thereby determining the size relation between the first character string and the second character string, then execute S413.
S408: Extract the third code section from the first Unicode and the fourth code section from the second Unicode, determine the size relation between the first character string and the second character string according to the third code section and the fourth code section, then execute S413.
S409: Obtain the first pinyin and the second pinyin of the characters corresponding to the two code sections, determine the size relation between the first character string and the second character string according to the first pinyin and the second pinyin, then execute S413.
S410: Convert the two code sections into ASCII codes respectively, determine the size relation between the first character string and the second character string according to the two ASCII codes, then execute S413.
S411: Determine the size relation between the first character string and the second character string according to the lengths of the first Unicode and the second Unicode, then execute S413.
S412: Determine the size relation between the first character string and the second character string according to the two letters, then execute S413.
S413: Sort the first character string and the second character string according to the size relation between them.
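The flow of steps S401 to S413 can be put together as a single pairwise comparator. The Python sketch below is one possible reading of the flow, not the patented implementation: characters of a string stand in for code sections, `TYPE_RANK` is a hypothetical preset type order, `PINYIN` is a hypothetical stand-in for a pinyin package, `ord` stands in for the code value used in S410 and S412, and numeric runs are assumed to be well-formed numbers.

```python
from functools import cmp_to_key

# Hypothetical placeholders (assumptions, not taken from the description).
TYPE_RANK = {"special": 0, "digit": 1, "letter": 2, "chinese": 3}
PINYIN = {"文": "wen2", "为": "wei2", "集": "ji2", "季": "ji4"}


def sign(x, y) -> int:
    """Return -1, 0 or 1 according to whether x is less than, equal to or greater than y."""
    return (x > y) - (x < y)


def char_type(ch: str) -> str:
    if "0" <= ch <= "9":
        return "digit"
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return "letter"
    if "\u4e00" <= ch <= "\u9fff":
        return "chinese"
    return "special"


def numeric_run(s: str, i: int) -> str:
    """Maximal run of digits and decimal points in s that contains index i."""
    allowed = "0123456789."
    start, end = i, i
    while start > 0 and s[start - 1] in allowed:
        start -= 1
    while end < len(s) and s[end] in allowed:
        end += 1
    return s[start:end]


def compare_pinyin(c1: str, c2: str) -> int:
    p1, p2 = PINYIN.get(c1), PINYIN.get(c2)
    if p1 is None or p2 is None or p1 == p2:
        return sign(ord(c1), ord(c2))      # unknown pinyin or homophones: code value
    letters1, letters2 = p1[:-1], p2[:-1]
    for x, y in zip(letters1, letters2):   # letters, left to right
        if x != y:
            return sign(x, y)
    if len(letters1) != len(letters2):     # prefix case: more letters is greater
        return sign(len(letters1), len(letters2))
    return sign(int(p1[-1]), int(p2[-1]))  # tones


def compare_strings(a: str, b: str) -> int:
    for i, (x, y) in enumerate(zip(a, b)):                 # S402 to S405
        if x == y:
            continue
        tx, ty = char_type(x), char_type(y)                # S406
        if tx != ty:
            return sign(TYPE_RANK[tx], TYPE_RANK[ty])      # S407
        if tx == "digit":                                  # S408
            return sign(float(numeric_run(a, i)), float(numeric_run(b, i)))
        if tx == "chinese":                                # S409
            return compare_pinyin(x, y)
        return sign(ord(x), ord(y))                        # S410 and S412
    return sign(len(a), len(b))                            # S411


titles = ["第100集", "第90集", "第95集"]
ordered = sorted(titles, key=cmp_to_key(compare_strings))  # S413
```

Wrapping the comparator with `functools.cmp_to_key` lets the standard `sorted` function apply the pairwise size relation to any number of strings, which is one way to realize S413 for more than two inputs.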
The present application discloses the character string sorting method above and, correspondingly, also discloses a character string sorting device. The descriptions of the character string sorting method and of the character string sorting device in this specification may be referred to each other.
Referring to Fig. 5, Fig. 5 is a structural schematic diagram of a character string sorting device disclosed in the present application. The device includes a data acquisition unit 10, a code section extraction unit 20, a character type comparing unit 30 and a sorting unit 40.
Specifically:
The data acquisition unit 10 is configured to obtain the first Unicode and the second Unicode. The first Unicode is the Unicode corresponding to the first character string to be sorted, and the second Unicode is the Unicode corresponding to the second character string to be sorted.
The code section extraction unit 20 is configured to extract code sections successively from the first Unicode and from the second Unicode, in order from high bits to low bits, until the first code section extracted from the first Unicode differs from the second code section extracted from the second Unicode. Each code section extracted from the first Unicode corresponds to one character in the first character string; each code section extracted from the second Unicode corresponds to one character in the second character string.
The character type comparing unit 30 is configured to, when the character corresponding to the first code section and the character corresponding to the second code section are of different character types, determine the size relation between the two characters according to the preset size relation among multiple character types, and determine that the character string corresponding to the greater of the two characters is greater than the other character string.
The sorting unit 40 is configured to sort the first character string and the second character string according to the size relation between them.
The character string sorting device disclosed in the present application converts the first character string and the second character string to be sorted into Unicode and extracts code sections successively from the two Unicodes, in order from high bits to low bits, until the code sections extracted from the two Unicodes differ. If the characters corresponding to the two code sections are of different character types, the sizes of the two characters are determined according to their character types, the sizes of the first character string and the second character string are determined according to the sizes of the two characters, and the first character string and the second character string are then sorted. Because the character strings to be sorted are converted to Unicode and their sizes are determined by comparing the Unicode, the size relation among multiple character strings can be determined quickly even when the strings contain characters of several types, which makes it possible to sort them.
Optionally, a numeric string comparing unit 50 is further provided on the basis of the character string sorting device disclosed above, as shown in Fig. 6.
The numeric string comparing unit 50 is configured to: when the character corresponding to the first code section and the character corresponding to the second code section are both of numeric type, extract the third code section from the first Unicode and the fourth code section from the second Unicode, where the third code section is the code section corresponding to the first numeric string, the fourth code section is the code section corresponding to the second numeric string, the first numeric string is the numeric string in the first character string to which the character corresponding to the first code section belongs, and the second numeric string is the numeric string in the second character string to which the character corresponding to the second code section belongs; convert the third code section into first data of floating-point type and the fourth code section into second data of floating-point type; and determine that the character string corresponding to the greater of the first data and the second data is greater than the other character string.
Optionally, a pinyin comparing unit 60 is further provided on the basis of the character string sorting device disclosed above, as shown in Fig. 6.
The pinyin comparing unit 60 is configured to: when the character corresponding to the first code section and the character corresponding to the second code section are both of Chinese character type, obtain the first pinyin and the second pinyin, where the first pinyin is the pinyin of the character corresponding to the first code section and the second pinyin is the pinyin of the character corresponding to the second code section; extract letters successively from the first pinyin and from the second pinyin in order from left to right; if the two letters extracted at the same position in the first pinyin and the second pinyin differ, compare the sizes of the two letters and determine that the character string corresponding to the greater of the two letters is greater than the other character string; and if the first pinyin and the second pinyin contain the same number of letters and the letters at every position are identical, compare the first tone in the first pinyin with the second tone in the second pinyin and determine that the character string corresponding to the greater of the first tone and the second tone is greater than the other character string.
Optionally, a standard code comparing unit 70 is further provided on the basis of the character string sorting device disclosed above, as shown in Fig. 6.
The standard code comparing unit 70 is configured to: when the pinyin comparing unit 60 determines that the first pinyin and the second pinyin are identical, obtain the first ASCII code of the character corresponding to the first code section and the second ASCII code of the character corresponding to the second code section, compare the first ASCII code with the second ASCII code, and determine that the character string corresponding to the greater of the two ASCII codes is greater than the other character string.
Optionally, a length comparing unit 80 is further provided on the basis of the character string sorting device disclosed above, as shown in Fig. 6.
The length comparing unit 80 is configured to: when all code sections of either the first Unicode or the second Unicode have been extracted without finding differing code sections, determine that the character string corresponding to the longer of the first Unicode and the second Unicode is greater than the other character string.
Optionally, a special character comparing unit is further provided on the basis of the character string sorting device disclosed above. The special character comparing unit is configured to: when the character corresponding to the first code section and the character corresponding to the second code section are both special characters, convert the first code section into the third ASCII code and the second code section into the fourth ASCII code, and determine that the character string corresponding to the greater of the third ASCII code and the fourth ASCII code is greater than the other character string.
Optionally, a letter comparing unit is further provided on the basis of the character string sorting device disclosed above. The letter comparing unit is configured to: when the character corresponding to the first code section and the character corresponding to the second code section are both letters, compare the sizes of the two letters and determine that the character string corresponding to the greater letter is greater than the other character string.
Finally, it should be noted that, herein, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to that process, method, article, or device. Absent further restrictions, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, it is described relatively briefly; for the relevant details, refer to the description of the method.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A character string sorting method, characterized by comprising:
obtaining a first Unicode and a second Unicode, wherein the first Unicode is the Unicode corresponding to a first character string to be sorted, and the second Unicode is the Unicode corresponding to a second character string to be sorted;
extracting code sections successively from the first Unicode and from the second Unicode, in order from high-order to low-order, until a first code section extracted from the first Unicode differs from a second code section extracted from the second Unicode, wherein each code section extracted from the first Unicode corresponds to one character in the first character string, and each code section extracted from the second Unicode corresponds to one character in the second character string;
where the character corresponding to the first code section and the character corresponding to the second code section are of different character types, determining the size relation between the character corresponding to the first code section and the character corresponding to the second code section according to a preset size relation among multiple character types;
determining that the character string corresponding to the larger of the character corresponding to the first code section and the character corresponding to the second code section is greater than the other character string; and
sorting the first character string and the second character string according to the size relation between the first character string and the second character string.
2. The method according to claim 1, characterized by further comprising:
where the character corresponding to the first code section and the character corresponding to the second code section are both of the numeric type, extracting a third code section from the first Unicode and a fourth code section from the second Unicode, wherein the third code section is the code section corresponding to a first numeric string, and the fourth code section is the code section corresponding to a second numeric string; the first numeric string is the numeric string in the first character string to which the character corresponding to the first code section belongs, and the second numeric string is the numeric string in the second character string to which the character corresponding to the second code section belongs;
converting the third code section into first data of a floating-point type, and converting the fourth code section into second data of a floating-point type; and
determining that the character string corresponding to the larger of the first data and the second data is greater than the other character string.
3. The method according to claim 1 or 2, characterized by further comprising:
where the character corresponding to the first code section and the character corresponding to the second code section are both of the Chinese character type, obtaining a first pinyin and a second pinyin, wherein the first pinyin is the pinyin of the character corresponding to the first code section, and the second pinyin is the pinyin of the character corresponding to the second code section;
extracting letters successively from the first pinyin and from the second pinyin, in order from left to right;
if two letters extracted at the same position in the first pinyin and the second pinyin differ, comparing the sizes of the two letters and determining that the character string corresponding to the larger of the two letters is greater than the other character string; and
if the first pinyin and the second pinyin contain the same number of letters, and the letters at each same position in the first pinyin and the second pinyin are all identical, comparing the first tone in the first pinyin with the second tone in the second pinyin, and determining that the character string corresponding to the larger of the first tone and the second tone is greater than the other character string.
4. The method according to claim 3, characterized by further comprising:
if the first pinyin and the second pinyin are identical, obtaining the first ASCII code of the character corresponding to the first code section and the second ASCII code of the character corresponding to the second code section, comparing the first ASCII code with the second ASCII code, and determining that the character string corresponding to the larger of the first ASCII code and the second ASCII code is greater than the other character string.
5. The method according to claim 1 or 2, characterized by further comprising:
where all code sections of either the first Unicode or the second Unicode have been extracted without any differing code sections being found, determining that the character string corresponding to the longer of the first Unicode and the second Unicode is greater than the other character string.
6. A character string sorting device, characterized by comprising:
a data obtaining unit, configured to obtain a first Unicode and a second Unicode, wherein the first Unicode is the Unicode corresponding to a first character string to be sorted, and the second Unicode is the Unicode corresponding to a second character string to be sorted;
a code section extraction unit, configured to extract code sections successively from the first Unicode and from the second Unicode, in order from high-order to low-order, until a first code section extracted from the first Unicode differs from a second code section extracted from the second Unicode, wherein each code section extracted from the first Unicode corresponds to one character in the first character string, and each code section extracted from the second Unicode corresponds to one character in the second character string;
a character type comparing unit, configured to: where the character corresponding to the first code section and the character corresponding to the second code section are of different character types, determine the size relation between the character corresponding to the first code section and the character corresponding to the second code section according to a preset size relation among multiple character types; and determine that the character string corresponding to the larger of the character corresponding to the first code section and the character corresponding to the second code section is greater than the other character string; and
a sorting unit, configured to sort the first character string and the second character string according to the size relation between the first character string and the second character string.
7. The device according to claim 6, characterized by further comprising:
a numeric string comparing unit, configured to: where the character corresponding to the first code section and the character corresponding to the second code section are both of the numeric type, extract a third code section from the first Unicode and a fourth code section from the second Unicode, wherein the third code section is the code section corresponding to a first numeric string, and the fourth code section is the code section corresponding to a second numeric string; the first numeric string is the numeric string in the first character string to which the character corresponding to the first code section belongs, and the second numeric string is the numeric string in the second character string to which the character corresponding to the second code section belongs; convert the third code section into first data of a floating-point type, and convert the fourth code section into second data of a floating-point type; and determine that the character string corresponding to the larger of the first data and the second data is greater than the other character string.
8. The device according to claim 6 or 7, characterized by further comprising:
a pinyin comparing unit, configured to: where the character corresponding to the first code section and the character corresponding to the second code section are both of the Chinese character type, obtain a first pinyin and a second pinyin, wherein the first pinyin is the pinyin of the character corresponding to the first code section, and the second pinyin is the pinyin of the character corresponding to the second code section; extract letters successively from the first pinyin and from the second pinyin, in order from left to right; if two letters extracted at the same position in the first pinyin and the second pinyin differ, compare the sizes of the two letters and determine that the character string corresponding to the larger of the two letters is greater than the other character string; and if the first pinyin and the second pinyin contain the same number of letters, and the letters at each same position in the first pinyin and the second pinyin are all identical, compare the first tone in the first pinyin with the second tone in the second pinyin, and determine that the character string corresponding to the larger of the first tone and the second tone is greater than the other character string.
9. The device according to claim 8, characterized by further comprising:
a standard code comparing unit, configured to: where the pinyin comparing unit determines that the first pinyin and the second pinyin are identical, obtain the first ASCII code of the character corresponding to the first code section and the second ASCII code of the character corresponding to the second code section, compare the first ASCII code with the second ASCII code, and determine that the character string corresponding to the larger of the first ASCII code and the second ASCII code is greater than the other character string.
10. The device according to claim 6 or 7, characterized by further comprising:
a length comparing unit, configured to: where all code sections of either the first Unicode or the second Unicode have been extracted without any differing code sections being found, determine that the character string corresponding to the longer of the first Unicode and the second Unicode is greater than the other character string.
CN201910567581.XA 2019-06-27 2019-06-27 Character string sorting method and device Active CN110287147B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910567581.XA CN110287147B (en) 2019-06-27 2019-06-27 Character string sorting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910567581.XA CN110287147B (en) 2019-06-27 2019-06-27 Character string sorting method and device

Publications (2)

Publication Number Publication Date
CN110287147A true CN110287147A (en) 2019-09-27
CN110287147B CN110287147B (en) 2022-08-19

Family

ID=68019229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910567581.XA Active CN110287147B (en) 2019-06-27 2019-06-27 Character string sorting method and device

Country Status (1)

Country Link
CN (1) CN110287147B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1039132A (en) * 1988-06-28 1990-01-24 原益中 Sound shape stroke integrated encode high-speed Chinese character input method and applied keyboard
CN1063370A (en) * 1992-01-27 1992-08-05 彭鹏 A kind of Roman character spelling of Chinese characters and suitable input equipment
CN101459712A (en) * 2009-01-05 2009-06-17 深圳华为通信技术有限公司 Telephone book ordering method and mobile phone equipment
CN102104741A (en) * 2009-12-16 2011-06-22 新奥特(北京)视频技术有限公司 Method and device for arranging multi-language captions
CN103514160A (en) * 2012-06-15 2014-01-15 华为终端有限公司 Sorting method and mobile equipment
CN103810279A (en) * 2014-02-18 2014-05-21 天津松下汽车电子开发有限公司 Ordering method and device of mixed fields
CN104902091A (en) * 2015-05-27 2015-09-09 广东欧珀移动通信有限公司 Sorting method for address book and terminal
CN106227808A (en) * 2016-07-22 2016-12-14 无锡云商通科技有限公司 A kind of method removing mail interference information and method for judging rubbish mail
CN108134799A (en) * 2018-01-18 2018-06-08 国网湖南省电力有限公司 Novel encipher-decipher method and its device
CN108805132A (en) * 2018-06-01 2018-11-13 华中科技大学 A kind of rubbish text filter method based on deep learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CSDN: "js中中文、空格、数字、字符串混合排序", 《HTTPS://BLOG.CSDN.NET/WEIXIN_39090097/ARTICLE/DETAILS/87428721》 *
CSDN: "关于汉字按拼音音调排序…", 《HTTPS://BBS.CSDN.NET/TOPICS/80314098》 *
百度知道: "java数字字母混合字符串排序", 《HTTPS://ZHIDAO.BAIDU.COM/QUESTION/128188637.HTML》 *
腾讯云: "深入理解苹果系统(Unicode)字符串的排序方法", 《HTTPS://CLOUD.TENCENT.COM/DEVELOPER/ARTICLE/1363377》 *

Also Published As

Publication number Publication date
CN110287147B (en) 2022-08-19

Similar Documents

Publication Publication Date Title
CN104753540B (en) Data compression method, data decompression method and apparatus
CN107807982B (en) Consistency checking method and device for heterogeneous database
CN105260354A (en) Chinese AC (Aho-Corasick) automaton working method based on keyword dictionary tree structure
CN104579360B (en) A kind of method and apparatus of data processing
US11178212B2 (en) Compressing and transmitting structured information
CN104360865A (en) Serialization method, deserialization method and related equipment
CN104008093A (en) Method and system for chinese name transliteration
CA2413055C (en) Method and system of creating and using chinese language data and user-corrected data
CN103473056A (en) Automatic generation method for telemetering configuration files
TW201516715A (en) Method of data sorting
CN107328968A (en) For freezing and event log data storage method for electric energy meter
CN102867049A (en) Chinese PINYIN quick word segmentation method based on word search tree
CN103514160B (en) Sorting method and mobile equipment
CN101551820B (en) Generation method and apparatus for index database of points of interest attribute
CN107679187A (en) A kind of construction method and device of Chinese address tree
CN109446198B (en) Trie tree node compression method and device based on double arrays
CN110287147A (en) A kind of character string sorting method and device
WO2010043117A1 (en) Digital encoding method and application thereof
CN115525728A (en) Method and device for Chinese character sorting, chinese character retrieval and Chinese character insertion
CN112232025B (en) Character string storage method and device and electronic equipment
WO2021102263A1 (en) Computerized data compression and analysis using potentially non-adjacent pairs
CN1916888A (en) Method and system of identifying language of double-byte character set character data
CN114330262B (en) Statistical method and device for material data and electronic equipment
WO2004006123A2 (en) Method and system of creating and using chinese language data and user-corrected data
CN105022799B (en) The primary and secondary keyword of two dimensional extent uncertain data based on TreeMap is from sort algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant