CN109542243A - Word composition method and device, and device for word composition - Google Patents

Word composition method and device, and device for word composition

Info

Publication number
CN109542243A
Authority
CN
China
Prior art keywords
word
group
unit
arithmetic
arithmetic unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710861480.4A
Other languages
Chinese (zh)
Other versions
CN109542243B (en)
Inventor
左艳波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201710861480.4A
Publication of CN109542243A
Application granted
Publication of CN109542243B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G06F3/0236 Character input methods using selection techniques to select from displayed items

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the present invention provide a word composition method and device, and a device for word composition. The method specifically includes: receiving an input string from a user; and, if the way the input string hits a preset data set satisfies a preset condition, obtaining from the input string the word-composition paths that conform to combination rules and using them as word-composition candidates, wherein the preset data set includes a character set and the coding unit set corresponding to the character set. The embodiments can raise the success rate of word composition and improve the plausibility and quality of the word-composition candidates, which in turn improves the user's input efficiency.

Description

Word composition method and device, and device for word composition
Technical field
The present invention relates to the technical field of computer information input, and in particular to a word composition method and device, and a device for word composition.
Background
At present, on devices that involve interaction, a user usually has to convey an intended operation to the device through an input method program. For example, the user enters an input string; the input method program converts the input string into candidates in the corresponding language according to its preset standard mapping rules and displays them; and the candidate that the user selects is then committed to the screen.
When the dictionary contains no entry that the input string hits directly, the input method program can trigger a word composition function. The existing word composition process is as follows: look up the n-gram relations in an n-gram library, use them to compute the path probability of the word sequence in each word composition scheme, and return the scheme with the highest path probability to the user as the preferred candidate. Here an n-gram relation is a collocation between one word and another; for example, "weather - so hot", "I - know", "like - you" and "one hundred thousand - eight thousand" can each be a bigram relation. The word composition function is very important: the quality of its results affects the quality of the input method program and the experience of the user.
In practice, the combinations of digits and units form an infinite set, so composing words that contain digits and units generally requires a very large number of n-gram relations. However, storage space is limited, so only a limited number of n-gram relations can be stored; moreover, the n-gram relations in the library are usually obtained by statistical learning, and the stored relations can hardly be guaranteed to cover every case. If no n-gram relation in the library is hit during word composition, word composition fails. For example, if "一亿零八万九千" (100,089,000) is not stored in the n-gram library, the words "一亿", "零", "八万" and "九千" corresponding to the input string "yiyilingbawanjiuqian" cannot hit any n-gram relation in the library, and word composition fails.
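The failure mode described above can be illustrated with a minimal sketch of bigram-based path scoring. This is not the implementation of any particular input method: the library contents, the probabilities and the name `path_probability` are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical bigram library: (previous word, next word) -> probability.
# The entries and values are invented for this example.
BIGRAMS: Dict[Tuple[str, str], float] = {
    ("天气", "好热"): 0.4,
    ("十万", "八千"): 0.1,
}


def path_probability(words: List[str]) -> Optional[float]:
    """Multiply bigram probabilities along a path; None if any pair is missing."""
    prob = 1.0
    for prev, nxt in zip(words, words[1:]):
        p = BIGRAMS.get((prev, nxt))
        if p is None:
            return None  # one uncovered pair makes the whole path fail
        prob *= p
    return prob


print(path_probability(["十万", "八千"]))
print(path_probability(["一亿", "零", "八万", "九千"]))
```

Because the second path contains pairs that the library never stored, it cannot be scored at all, which is the coverage problem the background describes.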
Summary of the invention
In view of the above problems, embodiments of the present invention are proposed to provide a word composition method, a word composition device, and a device for word composition that overcome the above problems or at least partly solve them. The embodiments can raise the success rate of word composition and improve the plausibility and quality of the word-composition candidates, which in turn improves the user's input efficiency.
To solve the above problems, an embodiment of the present invention discloses a word composition method, comprising:
Receiving an input string from a user;
If the way the input string hits a preset data set satisfies a preset condition, obtaining, according to the input string, the word-composition paths that conform to combination rules as word-composition candidates; wherein the preset data set includes a character set and the coding unit set corresponding to the character set.
In another aspect, an embodiment of the present invention discloses a word composition device, comprising:
An input string receiving module, configured to receive an input string from a user; and
A word-composition candidate obtaining module, configured to obtain, according to the input string, the word-composition paths that conform to combination rules as word-composition candidates if the way the input string hits a preset data set satisfies a preset condition; wherein the preset data set includes a character set and the coding unit set corresponding to the character set.
Optionally, the device further includes:
A judgment module, configured to judge whether the way the input string hits the preset data set satisfies the preset condition;
The judgment module includes:
A segmentation submodule, configured to segment the input string to obtain a corresponding segmentation result;
A judging submodule, configured to judge whether the segmentation result corresponding to the input string hits the preset data set.
Optionally, the word-composition candidate obtaining module includes:
A lookup submodule, configured to look up, according to the segmentation result, the mapping relations between the coding unit set and the character set, and to take the single characters that match the segmentation result as the characters to be composed for the input string;
A path determining submodule, configured to determine word-composition paths according to the characters to be composed for the input string;
A path obtaining submodule, configured to obtain the word-composition paths that conform to the combination rules.
Optionally, the device further includes:
A character determining module, configured to determine the single characters corresponding to the segmentation result according to the context of the input string.
Optionally, the character set includes a digit character set and a unit character set, and the combination rules characterize the rules by which digit characters and/or unit characters combine.
Optionally, the combination rules include:
The word-composition path includes a first place-value unit character, and the number of second place-value unit groups before the leading first place-value unit character, between adjacent first place-value unit characters, or after the trailing first place-value unit character does not exceed 1; and/or
A first place-value unit character is not located at the head of the word-composition path; and/or
If a first place-value unit character is adjacent to a second place-value unit character, or two first place-value unit characters are adjacent, the preceding place-value unit is smaller than the following place-value unit; and/or
The second place-value unit characters in a second place-value unit group of the word-composition path are ordered from larger place-value unit to smaller place-value unit; and/or
No two second place-value unit characters in a second place-value unit group of the word-composition path are adjacent; and/or
When the place-value units of the second place-value unit characters in a second place-value unit group of the word-composition path are interrupted, exactly one 零 (zero) appears at the position of the interruption; and/or
When no second place-value unit group with a digit exists between adjacent first place-value unit characters in the word-composition path, the later first place-value unit character is omitted; and/or
A 零 in the word-composition path is located at a non-end position; and/or
The character immediately before a 零 in the word-composition path is not a digit character, or the character immediately after a 零 in the word-composition path corresponding to the characters to be composed is a digit character or a monetary unit character; and/or
When the first character of the word-composition path is 拾 or 十, the second character is not 拾, 十, 百, 佰, 千, 仟, 整 or 零; and/or
A monetary unit group appears in the word-composition path no more than once; and/or
Place-value unit characters are located before the monetary unit group in the word-composition path; and/or
The monetary unit characters in the monetary unit group of the word-composition path are ordered from larger to smaller; and/or
The monetary unit characters in the monetary unit group of the word-composition path are not adjacent; and/or
The monetary unit group of the word-composition path includes a first monetary unit character, and the character immediately before the first monetary unit character is a digit character; and/or
The digit characters in the word-composition path are not adjacent; and/or
When the word-composition path includes 整 at the end position, the character immediately before 整 is 元.
Optionally, the rule that exactly one 零 appears at the position of the interruption, when the place-value units of the second place-value unit characters in a second place-value unit group of the word-composition path are interrupted, includes:
In a second place-value unit group before the leading first place-value unit character of the word-composition path, when a larger place-value unit is present and the smaller place-value units are interrupted, exactly one 零 appears at the position of the interruption; and/or
In a second place-value unit group of the word-composition path, when a place-value unit is present and several consecutive place-value units are absent, exactly one 零 appears at the position of the interruption.
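The zero rule above can be sketched for a single group of digits and second place-value units. This is a simplified illustration under stated assumptions: the second place-value units are taken to be 千, 百, 十 (thousand, hundred, ten); a bare trailing digit is read as the ones place; a group that starts with 零 (zero) is treated as out of scope, because whether a leading 零 is valid depends on the units that precede the group. The name `zero_rule_ok` is hypothetical.

```python
# Assumed ranks of the second place-value units; the ones place has rank 0.
RANK = {"千": 3, "百": 2, "十": 1}
DIGITS = set("一二三四五六七八九")


def zero_rule_ok(group: str) -> bool:
    """True if every skipped run of units is marked by exactly one 零."""
    prev_rank = None
    pending_zeros = 0
    i = 0
    while i < len(group):
        ch = group[i]
        if ch == "零":
            pending_zeros += 1
            i += 1
            continue
        if ch not in DIGITS:
            return False  # a unit without a leading digit is out of scope here
        if i + 1 < len(group) and group[i + 1] in RANK:
            rank = RANK[group[i + 1]]
            i += 2
        else:
            rank = 0  # a bare digit fills the ones place
            i += 1
        if prev_rank is None:
            if pending_zeros:
                return False  # leading 零 needs outside context; not handled
        else:
            if rank >= prev_rank:
                return False  # units must run from larger to smaller
            gap = prev_rank - rank > 1
            if pending_zeros != (1 if gap else 0):
                return False  # one 零 per interruption, none otherwise
        pending_zeros = 0
        prev_rank = rank
    return pending_zeros == 0 and prev_rank is not None


for g in ["三千零五", "三千五百", "三千零零五", "三千五十"]:
    print(g, zero_rule_ok(g))
```

In 三千零五 both the hundreds and the tens are absent, yet a single 零 marks the interruption, which matches the case of several consecutive absent units.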
Optionally, the first place-value unit character includes 亿 (hundred million) or 万 (ten thousand), and the combination rules include:
The word-composition path includes 亿 and 万, 万 is located before 亿, and the number of second place-value unit groups between 万 and 亿 does not exceed 1;
When a second place-value unit group exists between 万 and 亿, the second place-value unit characters in that group are ordered from larger place-value unit to smaller place-value unit, no two second place-value unit characters in that group are adjacent, and, when the place-value units of those characters are interrupted, exactly one 零 appears at the position of the interruption; or
When neither a second place-value unit group nor a digit exists between 万 and 亿, no 零 appears between 万 and 亿.
Optionally, the first place-value unit character includes 亿 or 万, and the combination rules include:
The word-composition path includes 亿 and 万, 亿 is located before 万, and, when neither a second place-value unit group nor a digit exists between 亿 and 万, 万 is omitted.
Optionally, the first place-value unit character includes 亿 or 万, and the combination rules include:
The word-composition path does not include 亿, and 万 appears in the word-composition path no more than once.
Optionally, the device further includes:
A sorting module, configured to sort the multiple word-composition paths corresponding to the input string according to the positions at which homophones occur in the word-composition paths.
In another aspect, an embodiment of the present invention discloses a device for word composition, including a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
Receiving an input string from a user;
If the way the input string hits a preset data set satisfies a preset condition, obtaining, according to the input string, the word-composition paths that conform to combination rules as word-composition candidates; wherein the preset data set includes a character set and the coding unit set corresponding to the character set.
In yet another aspect, an embodiment of the present invention discloses a machine-readable medium having instructions stored thereon which, when executed by one or more processors, cause a device to perform the foregoing word composition method.
The embodiments of the present invention have the following advantages:
When the way the input string hits the preset data set satisfies the preset condition, the embodiments obtain, according to the input string, the word-composition paths that conform to the combination rules as word-composition candidates. Because the combination rules characterize how digit characters and/or unit characters combine, they apply to arbitrary digit characters and/or unit characters, and the embodiments can therefore raise the success rate of word composition. In addition, because the combination rules reflect the regularities by which digit characters and/or unit characters combine, obtaining the rule-conforming paths from among the word-composition paths formed by the characters to be composed improves the plausibility and quality of the word-composition candidates, which in turn improves the user's input efficiency.
Brief description of the drawings
Fig. 1 is a flowchart of the steps of an embodiment of a word composition method of the present invention;
Fig. 2 is a schematic diagram of an input interface according to an embodiment of the present invention;
Fig. 3 is a flowchart of the steps of an embodiment of a word composition method of the present invention;
Fig. 4 is a structural block diagram of an embodiment of a word composition device of the present invention;
Fig. 5 is a block diagram of a device for word composition serving as a terminal, according to an exemplary embodiment; and
Fig. 6 is a block diagram of a device for word composition serving as a server, according to an exemplary embodiment.
Detailed description
To make the above objects, features and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Embodiments of the present invention can be applied to input method programs with various input modes, such as keyboard symbols, handwriting and voice input; that is, the user can enter the content to be committed to the screen through coding units, handwriting features and the like. Taking voice input as an example, the input method program can capture the voice signal entered by the user, convert it into text information, segment the text information into characters to be composed, and then perform word composition. The following description mainly uses the input mode based on coding units as an example; the other input modes can be understood by cross-reference.
An existing input method program can assign a coding unit to each character or word, and the user obtains the desired character or word by entering the correct coding unit. A coding unit corresponds to a coding rule: the coding rule of a pinyin input method is the syllable rule, and that of a Wubi input method is the Wubi rule. The coding units of the embodiments may therefore include syllables, Wubi units and the like; for example, the syllable corresponding to the character "千" (thousand) is "qian", and the Wubi unit corresponding to "千" is "wtfh". It will be appreciated that those skilled in the art can use the coding unit that matches the coding rule; the embodiments place no restriction on the specific coding unit.
Optionally, the input method program can run on a terminal, which includes but is not limited to: a smartphone, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop computer, an in-vehicle computer, a desktop computer, a set-top box, a smart TV, a wearable device and so on.
In the field of input method technology, an input method program, whether for Chinese, Japanese, Korean or another language, can usually convert the user's input string, which represents coding units, into candidates in the corresponding language; the user then selects the content to be output to the application program, and the content output to the application program through the commit operation is the committed (on-screen) content. While converting the input string into candidates, the program can look up the entry corresponding to the input string directly in the dictionary and, if the lookup hits, use the entry found as a candidate; for example, it can look up directly in the dictionary the entry "hello" for the input string "nihao" or the entry "the weather is fine" for the input string "tianqihenhao". Optionally, the dictionary of the embodiments may specifically include a system dictionary, a user dictionary, a cell dictionary, a cloud dictionary and so on; the embodiments place no restriction on the specific dictionary.
However, in practice many causes can leave the dictionary without an entry that the input string hits directly. For example, when the user wants to enter many words at once (such as a phrase or a long sentence), or intends to enter content that has not been entered before, the dictionary may contain no entry that the input string hits directly, and in such cases the input method program can trigger the word composition function. For instance, when the user wants to input "一亿零八万九千" (100,089,000) through the input string "yiyilingbawanjiuqian", or "九万两千" (92,000) through the input string "jiuwanliangqian", the dictionary may contain no entry that these input strings hit directly.
The existing word composition scheme uses the n-gram relations in an n-gram library (collocations between one word and another) to perform word composition for the input string. However, composing words that contain digits and units generally requires a very large number of n-gram relations. This not only places high demands on the size of the n-gram library and on storage space, but also often causes word composition to fail because the n-gram relations do not cover enough cases. Taking the composition of numbers as an example, the collocations between all numbers would have to be stored in the n-gram library; if the stored coverage is insufficient, word composition fails.
To address the above problem in composing words from digits and units, the embodiments of the present invention propose combination rules for digit characters and/or unit characters. The combination rules characterize how digit characters and/or unit characters combine; for example, they may include rules for combining unit characters with one another and rules for combining digit characters with unit characters. When the way the input string hits the preset data set satisfies the preset condition, the word-composition paths that conform to the combination rules are obtained according to the input string as word-composition candidates.
The preset data set may include a character set and the coding unit set corresponding to the character set, and the character set can be used to store single characters. That the way the input string hits the preset data set satisfies the preset condition indicates, on the one hand, that the characters to be composed for the input string include digit characters and/or unit characters, so that the characters to be composed form a combination of digit characters and/or unit characters. On the other hand, triggering the word composition of the embodiments only when the hit situation satisfies the preset condition improves the efficiency of obtaining the characters to be composed for the input string.
Further, the combination rules make it possible to obtain plausible word-composition paths, from among the paths formed by the characters to be composed, as word-composition candidates, which improves the plausibility of the candidates.
Because the combination rules characterize how digit characters and/or unit characters combine, they apply to arbitrary digit characters and/or unit characters, and the embodiments can therefore raise the success rate of word composition. In addition, because the combination rules reflect the regularities by which digit characters and/or unit characters combine, obtaining the rule-conforming paths from among the word-composition paths formed by the characters to be composed improves the plausibility and quality of the word-composition candidates, which in turn improves the user's input efficiency.
In the embodiments of the present invention, the character set can be used to store single characters; optionally, the character set may include a digit character set and/or a unit character set.
The digit character set may include digit characters, where a digit character is the single character that corresponds to an Arabic numeral in a given language; for example, the Chinese character corresponding to the Arabic numeral "1" can be "一" or "壹". As an example, the digit character set of the embodiments may include: "零", "一", "二", "三", "四", "五", "六", "七", "八", "九", "壹", "贰", "叁", "肆", "伍", "陆", "柒", "捌", "玖" and so on.
Examples of the unit character set may include a place-value unit character set and a monetary unit character set.
The place-value unit character set may include place-value unit characters, which are used to express the place value of a number. For example, place-value unit characters may include: "个", "十", "百", "千", "万", "亿", "兆", "京" and so on.
The monetary unit character set may include monetary unit characters, which are used to measure a currency prescribed by a state. For example, for the currency of modern China, monetary unit characters may include: "佰", "仟", "万", "亿", "圆", "元", "角", "分", "块", "毛", "厘", "美元" (dollar), "英镑" (pound sterling) and so on; for the currency of ancient China, monetary unit characters may include: "钱", "贯", "文", "两", "斤" and so on.
It will be appreciated that the monetary unit character set is only an optional example of a unit character set. In fact, those skilled in the art can use other unit character sets according to practical application requirements, such as a volume unit character set (including cubic meter and so on), a capacity unit character set (including liter and so on), or an area unit character set (including mu and so on).
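The character sets described above can be held as simple lookup sets. The sketch below assumes the Chinese characters behind the translated examples (for instance 零 to 九 for the digits, 万 for ten thousand, 元 for yuan) and restricts the monetary set to single characters that do not also serve as place-value units; the names `DIGIT_CHARS`, `PLACE_UNIT_CHARS`, `MONETARY_UNIT_CHARS` and `classify` are hypothetical.

```python
# Illustrative subsets of the character sets; not the full sets of the patent.
DIGIT_CHARS = frozenset("零一二三四五六七八九壹贰叁肆伍陆柒捌玖")
PLACE_UNIT_CHARS = frozenset("个十百千万亿兆京")
MONETARY_UNIT_CHARS = frozenset("圆元角分块毛厘")


def classify(ch: str) -> str:
    """Return the class of a single character: digit, unit kind, or other."""
    if ch in DIGIT_CHARS:
        return "digit"
    if ch in PLACE_UNIT_CHARS:
        return "place_unit"
    if ch in MONETARY_UNIT_CHARS:
        return "monetary_unit"
    return "other"


print([classify(ch) for ch in "九万元"])
```

Keeping the classes in separate sets lets later rule checks ask only which class a character belongs to, without depending on any specific number.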
It should be noted that the embodiments mainly use Chinese as an example to describe the character set; the character sets for other languages such as Japanese and Korean can be understood by cross-reference.
Method embodiment
Referring to Fig. 1, a flowchart of the steps of an embodiment of a word composition method of the present invention is shown, which may specifically include the following steps:
Step 101: receive an input string from a user;
Step 102: if the way the input string hits a preset data set satisfies a preset condition, obtain, according to the input string, the word-composition paths that conform to combination rules as word-composition candidates; wherein the preset data set may include a character set and the coding unit set corresponding to the character set.
The embodiments of the present invention can realize the input of numeric expressions through number-related word composition. A numeric expression may include units alone, or a combination of digits and units. When a numeric expression consists of units, examples may include "十元" (ten yuan) and "百元" (a hundred yuan); when a numeric expression combines digits and units, examples may include "一亿零八万九千" (100,089,000).
In an optional embodiment of the present invention, the preset condition may include: the segmentation result corresponding to the input string hits the preset data set. Correspondingly, the process of judging whether the way the input string hits the preset data set satisfies the preset condition may include: segmenting the input string to obtain a corresponding segmentation result; and judging whether the segmentation result corresponding to the input string hits the preset data set.
In practice, the input string can be segmented according to the rule that governs it. If the input string is a pinyin string, it can be segmented according to the syllable rule. One input string may have one or more segmentation schemes, each of which may include one or more substrings, and each substring can correspond to a coding unit. For example, the input string "yiyilingbawanjiuqian" can be segmented into "yi'yi'ling'ba'wan'jiu'qian".
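Segmentation by the syllable rule can be sketched as an exhaustive split against a syllable set. The sketch below is only an illustration: the set `SYLLABLES` holds just the syllables needed for the example rather than a full pinyin inventory, and the name `segmentations` is hypothetical.

```python
from typing import List

# Assumed syllable set: only what this example needs.
SYLLABLES = {"yi", "ling", "ba", "wan", "jiu", "qian", "liang", "shi", "bai"}


def segmentations(s: str) -> List[List[str]]:
    """Return every way to split s into syllables from SYLLABLES."""
    if not s:
        return [[]]  # exactly one way to segment the empty string
    results = []
    for end in range(1, len(s) + 1):
        head = s[:end]
        if head in SYLLABLES:
            for rest in segmentations(s[end:]):
                results.append([head] + rest)
    return results


print(segmentations("yiyilingbawanjiuqian"))
```

The function returns a list of schemes because, as the text notes, one input string may admit more than one segmentation; a string that cannot be fully segmented yields an empty list.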
According to one embodiment, the input intention of the input string may be a numeric expression; for example, the user wants to input the numeric expression "一亿零八万九千" (100,089,000) through the input string "yiyilingbawanjiuqian". In this case, that the segmentation result corresponding to the input string hits the preset data set may include: every segment of the segmentation result hits the preset data set.
According to another embodiment, the input intention of the input string may include other expressions in addition to a numeric expression. For example, the user wants to input "三十五万吧" (350,000 followed by the modal particle 吧) through the input string "sanshiwuwanba", or wants to input a sentence such as "7,890 yuan in total", or wants to input "我中了三千五百万人民币" (I won thirty-five million RMB) through the input string "wozhonglesanqianwubaiwanrenminbi"; that is, the input intention includes other expressions besides the numeric expression. In this case, that the segmentation result corresponding to the input string hits the preset data set may include: multiple consecutive segments of the segmentation result each hit the preset data set.
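The case of consecutive hits can be sketched as a search for the longest run of adjacent segments that all belong to the preset data set. This is a minimal illustration under assumptions: the set `NUMERIC_SYLLABLES` is an invented subset of the coding unit set, choosing the longest run is one possible policy rather than something the text prescribes, and the name `longest_hit_run` is hypothetical.

```python
from typing import List, Tuple

# Assumed subset of the coding unit set (syllables of digit and unit characters).
NUMERIC_SYLLABLES = {"san", "qian", "wu", "bai", "wan", "shi", "ba", "yi", "ling", "jiu"}


def longest_hit_run(segments: List[str]) -> Tuple[int, int]:
    """Return (start, end) of the longest run of consecutive hits; end is exclusive."""
    best = (0, 0)
    start = None
    # The empty-string sentinel never hits, so it closes a run that reaches the end.
    for i, seg in enumerate(segments + [""]):
        if seg in NUMERIC_SYLLABLES:
            if start is None:
                start = i
        else:
            if start is not None and i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


segs = ["wo", "zhong", "le", "san", "qian", "wu", "bai", "wan", "ren", "min", "bi"]
start, end = longest_hit_run(segs)
print(segs[start:end])
```

Only the segments inside the run would then be passed on to rule-based word composition, while the surrounding segments stay with the ordinary conversion path.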
For a full-pinyin input string, the embodiments can directly match the substrings of the input string against the coding units in the preset data set; a successful match means that the substring hits the preset data set. For an abbreviated-pinyin input string, the embodiments can obtain the full-pinyin strings corresponding to the substrings of the abbreviated string and then match those full-pinyin strings against the coding units in the preset data set; a successful match means that the full-pinyin string hits the preset data set. In the embodiments, the preset data set may include the coding units corresponding to digit characters and/or unit characters. Coding units can be obtained from the coding rule of the input method program, and the coding unit set may include a syllable set, a Wubi unit set and so on; the embodiments place no restriction on the specific coding unit set.
Integrated for syllable collection by coding unit, referring to table 1, shows a kind of syllable collection and word collection of the embodiment of the present invention Between mapping relations example.It is appreciated that those skilled in the art can according to practical application request, establish syllable collection with Mapping relations between word collection, the embodiment of the present invention are without restriction for specific mapping relations.
Table 1
It should be noted that, for an abbreviated-pinyin input string, embodiments of the present invention can obtain the full-pinyin syllables corresponding to a substring of the abbreviated string by looking the substring up in the full-pinyin syllable set described above. For example, the full-pinyin syllables for the substring "y" may include "yi" and "yuan"; those for "s" may include "si", "san" and "shi"; those for "w" may include "wan" and "wu"; those for "b" may include "bai" and "ba"; those for "j" may include "jiu" and "jiao"; and those for "q" may include "qi" and "qian".
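The lookup from an abbreviated letter to its full-pinyin syllables can be sketched as follows. This is only an illustration: the syllable list is limited to the syllables named in this paragraph, and the names `SYLLABLES` and `expand_initial` are hypothetical rather than taken from the embodiment.

```python
# Hypothetical syllable set: only the syllables mentioned in the text above.
SYLLABLES = ["yi", "yuan", "si", "san", "shi", "wan", "wu",
             "bai", "ba", "jiu", "jiao", "qi", "qian"]


def expand_initial(letter, syllables=SYLLABLES):
    """Return every full-pinyin syllable that starts with the given letter."""
    return [s for s in syllables if s.startswith(letter)]


if __name__ == "__main__":
    for letter in "yswbjq":
        print(letter, expand_initial(letter))
```

A real syllable set would be far larger, so each abbreviated letter would usually expand to many more syllables than shown here.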
As described above, after the input string is segmented and the corresponding segmentation results are obtained, the syllables of the substrings contained in the segmentation results are looked up in the mapping between the coding unit set (for example, the syllable set) and the word set. The characters that match the segmentation results are taken as the characters to be composed for the input string; composition paths are determined from these characters; and the composition paths that satisfy the combination rules are obtained. Obtaining the characters to be composed through the mapping between the coding unit set and the word set narrows the range of candidate characters and therefore reduces, to some extent, the amount of computation during composition.
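A minimal sketch of this lookup is given below, assuming a small syllable-to-character mapping. The contents of `SYLLABLE_TO_CHARS` are an assumption made for illustration (Table 1 itself is not reproduced in this text), and `segment` and `chars_to_compose` are hypothetical helper names.

```python
# Assumed mapping from syllables to digit characters and unit words.
SYLLABLE_TO_CHARS = {
    "yi": ["一", "亿"], "er": ["二"], "san": ["三"], "si": ["四"],
    "wu": ["五"], "liu": ["六"], "qi": ["七"], "ba": ["八"],
    "jiu": ["九"], "ling": ["零"], "shi": ["十"], "bai": ["百"],
    "qian": ["千"], "wan": ["万"], "yuan": ["元"],
}


def segment(s):
    """Return every split of s into known syllables ([] if there is none)."""
    if not s:
        return [[]]
    results = []
    for syllable in SYLLABLE_TO_CHARS:
        if s.startswith(syllable):
            for rest in segment(s[len(syllable):]):
                results.append([syllable] + rest)
    return results


def chars_to_compose(s):
    """List the candidate characters per position for the first segmentation."""
    segmentations = segment(s)
    if not segmentations:
        return None  # the input string does not hit the data set
    return [SYLLABLE_TO_CHARS[syllable] for syllable in segmentations[0]]


if __name__ == "__main__":
    print(segment("yiyilingbawanjiuqian"))
    print(chars_to_compose("yiyilingbawanjiuqian"))
```

Because only characters reachable through the mapping are returned, later steps never consider characters outside the digit and unit sets.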
When looking up the characters that match a segmentation result, one substring may correspond to several characters, which may include capital (financial) numeral characters and lowercase numeral characters. Lowercase numeral characters may include "一", "二", "三", "四", "五", "六", "八", "九" and so on; capital numeral characters may include "壹", "贰", "叁", "肆", "伍", "陆", "柒", "捌", "玖" and so on. For example, "yi" corresponds to both "一" and "壹". In an optional embodiment of the present invention, the character corresponding to a segmentation result can also be determined from the context of the input string. Specifically, if the context of the input string contains keywords such as "钱" (money) or "账" (account), or capital numeral characters such as "壹" through "玖", the character corresponding to the segmentation result can be taken to be a capital numeral character. Conversely, if the context contains neither such keywords nor capital numeral characters, the character corresponding to the segmentation result can be taken to be a lowercase numeral character.
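The context test described above might look like the following sketch. The keyword tuple and the helper names `prefers_capital` and `pick_digit` are assumptions for illustration; the embodiment does not fix a keyword list beyond the examples given.

```python
LOWER = "一二三四五六七八九"
UPPER = "壹贰叁肆伍陆柒捌玖"
KEYWORDS = ("钱", "账")  # assumed keywords, taken from the examples in the text


def prefers_capital(context):
    """True if the context contains a keyword or any capital numeral."""
    has_keyword = any(keyword in context for keyword in KEYWORDS)
    has_capital = any(ch in UPPER for ch in context)
    return has_keyword or has_capital


def pick_digit(lower_char, context):
    """Return the capital form of a lowercase numeral when the context asks for it."""
    if prefers_capital(context):
        return UPPER[LOWER.index(lower_char)]
    return lower_char


if __name__ == "__main__":
    print(pick_digit("一", "转账金额"))
    print(pick_digit("一", "今天天气"))
```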
It should be noted that the characters to be composed may include single characters or words. The characters to be composed for an input string may include several word elements, which may be several single characters, several words, or a mix of both; embodiments of the present invention place no restriction on the specific characters to be composed. Optionally, the characters to be composed may form a character sequence whose order is determined by input time, with characters entered earlier placed before characters entered later. For example, the character sequence corresponding to the input string "yiyilingbawanjiuqian" may include "一", "亿", "零", "八", "万", "九", "千".
In embodiments of the present invention, the characters to be composed may be a character sequence, and the several consecutive characters may correspond to all or part of that sequence.
Where the consecutive characters correspond to the whole character sequence, the input intention of the input string may be a numeric expression. For example, if the user wants to enter the numeric expression "一亿零八万九千" through the input string "yiyilingbawanjiuqian", the character sequence corresponding to the input string may include "一", "亿", "零", "八", "万", "九", "千", and all of the consecutive characters in this sequence hit the digit character set or the unit word set.
Where the consecutive characters correspond to part of the character sequence, the input intention of the input string may include other expressions in addition to a numeric expression. For example, if the user wants to enter text containing "三十五万" (350,000) through the input string "sanshiwuwanba", the character sequence corresponding to the input string may include "三", "十", "五", "万" and one further character, and only the consecutive characters "三", "十", "五", "万" hit the digit character set or the unit word set. Likewise, if the user wants to enter a sentence such as "a total of 7,890 yuan", the input includes other expressions besides the numeric expression.
In practice, the characters or words at different positions can be combined according to their positions in the character sequence of the characters to be composed, so as to obtain the composition paths for those characters. Specifically, the candidate words at position 1, position 2, position 3, ..., position n (n being a positive integer) can be combined to obtain the composition paths, for example: word 1 of position 1, word 1 of position 2, word 1 of position 3, ..., word 1 of position n; or word 2 of position 1, word 1 of position 2, word 1 of position 3, ..., word 1 of position n; and so on. Embodiments of the present invention place no restriction on the specific process of obtaining the composition paths.
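One straightforward way to enumerate such paths is a Cartesian product over the per-position candidates, as in the sketch below. This is an illustrative choice, not a process the embodiment prescribes, and `composition_paths` is a hypothetical name.

```python
from itertools import product


def composition_paths(candidates_per_position):
    """Combine one candidate from every position into each possible path."""
    return ["".join(choice) for choice in product(*candidates_per_position)]


if __name__ == "__main__":
    # Position 1 has two candidates for "yi"; position 2 has one for "wan".
    print(composition_paths([["一", "亿"], ["万"]]))
```

The number of paths is the product of the candidate counts at each position, which is why narrowing the candidates per position beforehand matters.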
In embodiments of the present invention, the word set may include a digit character set and a unit word set. The combination rules characterize how digit characters and/or unit words may be combined and reflect the regularities of such combinations; for example, the combination rules may include rules governing combinations of unit words with each other and rules governing combinations of digit characters with unit words. It will be appreciated that those skilled in the art may adopt combination rules suited to the numeric expression conventions of a given language.
Embodiments of the present invention may provide the following combination rules; those skilled in the art may use any one or several of them according to practical needs:
Rule 1: the composition path includes a first numeric unit word, and the number of second numeric unit groups that the path contains before the first of its first numeric unit words, between adjacent first numeric unit words, or after the last of its first numeric unit words is no more than 1.
Rule 2: a first numeric unit word is not located at the first position of the composition path.
Rule 3: if a first numeric unit word is adjacent to a second numeric unit word, or two first numeric unit words are adjacent, the preceding numeric unit is smaller than the following numeric unit.
Rule 4: the second numeric unit words in a second numeric unit group on the composition path are ordered from the larger numeric unit to the smaller numeric unit.
Rule 5: no two second numeric unit words in a second numeric unit group on the composition path are adjacent.
Rule 6: when there is a gap in the numeric units among the second numeric unit words of a second numeric unit group on the composition path, one 零 (zero) appears at the corresponding gap position.
Rule 7: when neither a second numeric unit group nor a digit exists between adjacent first numeric unit words on the composition path, the later first numeric unit word is omitted.
Rule 8: a 零 on the composition path is not at the end position.
Rule 9: the character immediately before a 零 on the composition path is not a digit character, and the character immediately after a 零 on the composition path is a digit character or a monetary unit word.
Rule 10: when the first position of the composition path is 拾 or 十, the second position is not 拾, 十, 百, 佰, 千, 仟, 整 or 零.
Rule 11: the monetary unit group appears no more than once on the composition path.
Rule 12: numeric unit words are located before the monetary unit group on the composition path.
Rule 13: the monetary unit words in the monetary unit group on the composition path are ordered from large to small.
Rule 14: the monetary unit words in the monetary unit group on the composition path are not adjacent to one another.
Rule 15: the monetary unit group on the composition path includes a first monetary unit word, and the character immediately before the first monetary unit word is a digit character.
Rule 16: the digit characters on the composition path are not adjacent to one another.
Rule 17: when the composition path includes 整 at the end position, the character immediately before 整 is 元.
In practice, a composition path can be traversed to obtain each character on the path and the position of that character on the path. For each character on the path, it can then be judged whether the character satisfies any one or a combination of rule 1 to rule 16 above.
The characters of embodiments of the present invention may include digit characters, numeric unit words and monetary unit words. For convenience of description, the numeric unit words are divided into first numeric unit words and second numeric unit words.
The second numeric unit words may include the smaller numeric units such as 拾 (十), 佰 (百) and 仟 (千). Second numeric unit words form second numeric unit groups. A second numeric unit group includes at least one second numeric unit word, and its second numeric unit words are ordered from the larger numeric unit to the smaller numeric unit, for example the sequence 仟 (千), 佰 (百), 拾 (十), or the sequence 仟 (千), 拾 (十), or the sequence 佰 (百), 拾 (十), and so on.
The first numeric unit words may include the larger numeric units such as 亿, 万, 京 and 兆.
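With these character classes in hand, a few of the simpler rules can be checked in a single traversal of a composition path. The sketch below covers only rules 2, 5, 8 and 16; the set contents and the function name `violated_rules` are assumptions chosen for illustration, and the remaining rules would need additional state.

```python
DIGITS = set("一二三四五六七八九")        # digit characters (零 is handled separately)
FIRST_UNITS = set("亿万京兆")             # first (larger) numeric unit words
SECOND_UNITS = set("十百千拾佰仟")        # second (smaller) numeric unit words


def violated_rules(path):
    """Return the sorted numbers of the rules (2, 5, 8, 16) that the path breaks."""
    violated = set()
    if path and path[0] in FIRST_UNITS:
        violated.add(2)   # a first numeric unit word may not lead the path
    if path.endswith("零"):
        violated.add(8)   # 零 may not sit at the end position
    for prev, cur in zip(path, path[1:]):
        if prev in SECOND_UNITS and cur in SECOND_UNITS:
            violated.add(5)   # two second numeric unit words are adjacent
        if prev in DIGITS and cur in DIGITS:
            violated.add(16)  # two digit characters are adjacent
    return sorted(violated)


if __name__ == "__main__":
    for path in ["一亿零八万九千", "亿一", "一一万", "三十五万零", "三千百"]:
        print(path, violated_rules(path))
```

Returning the rule numbers rather than a single boolean makes it easier to see why a path was rejected.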
Rule 1 applies where the composition path includes a first numeric unit word. In this case, the number of second numeric unit groups that the path contains before the first of its first numeric unit words, between adjacent first numeric unit words, or after the last of its first numeric unit words is no more than 1. In embodiments of the present invention, "no more than 1" means that the number is 0 or 1.
The larger numeric units such as 亿, 万, 京 and 兆 are normally preceded by a digit, so rule 2 requires that a first numeric unit word is not located at the first position of the composition path.
For rule 3, if a first numeric unit word is adjacent to a second numeric unit word, or two first numeric unit words are adjacent, the preceding numeric unit qualifies the following one, so the preceding numeric unit is smaller than the following numeric unit. Corresponding examples may include "一万亿" (1,000,000,000,000), "一千亿" (100,000,000,000) and "一千万" (10,000,000).
Rules 4 to 7 are the rules for the second numeric unit group.
Rule 4 specifically requires that the second numeric unit words in a second numeric unit group on the composition path are ordered from the larger numeric unit to the smaller numeric unit.
Rule 5 specifically requires that no two second numeric unit words in a second numeric unit group on the composition path are adjacent; for example, no two of 拾 (十), 佰 (百) and 仟 (千) are adjacent.
Rule 6 is specifically, when number occurs in the second arithemetic unit word that the second arithemetic unit group in described group of word path includes When word unit is interrupted, corresponding arithemetic unit discontinuity position occurs 1 zero.Wherein, arithemetic unit interruption refers to that the second number is single Hyte exists, but the second arithemetic unit group the second arithemetic unit word for including it is imperfect or including the second arithemetic unit word deposit It is lacking.Wherein, complete second arithemetic unit group may include: that one hundred x of x, one hundred x is picked up, wherein incomplete second arithemetic unit Group may include: that thousand x hundred of x, thousand x of x are picked up, one hundred x of x is picked up, x thousand, x one hundred, x are picked up, wherein " x " expression is taken with the second arithemetic unit word The arithemetic unit word matched.The second arithemetic unit word that second arithemetic unit group of the embodiment of the present invention in group word path includes occurs , it is specified that corresponding arithemetic unit discontinuity position occurs 1 zero in the intermittent situation of arithemetic unit, it is particularly possible to improve numeral expression Reasonability.Such as between " hundred million " and " ten thousand ", there is missing in " thousand ", therefore zero, such as " 100,000,000 liang can occur in corresponding position 1250000 ".
In practical applications, any one arithemetic unit is not present in the second arithemetic unit group, between corresponding arithemetic unit Disconnected position can occur 1 zero.
In an optional embodiment of the present invention, the requirement that one 零 appears at the corresponding gap position when there is a gap in the numeric units of a second numeric unit group on the composition path may specifically include the following:
When, in the second numeric unit group before the first of the first numeric unit words on the composition path, the larger numeric unit is present and there is a gap among the smaller numeric units, one 零 appears at the corresponding gap position. For example, where the first of the first numeric unit words is "亿", corresponding examples may include "101,100,000,000", "100,100,000,000" and "10,010,000".
When the second numeric unit group on the composition path has numeric units present but several consecutive numeric units absent, one 零 appears at the corresponding gap position.
It should be noted that when no numeric unit is present in a second numeric unit group on the composition path, the second numeric unit group itself is absent, and no 零 need appear at the corresponding position. For example, in "一亿元整" (exactly 100,000,000 yuan), zero second numeric unit groups follow "亿", so no 零 appears.
It should also be noted that when several non-consecutive numeric units are absent from a second numeric unit group on the composition path, one 零 appears at the corresponding gap position.
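The zero handling for a single second numeric unit group can be illustrated by rendering a value from 1 to 9999 in Chinese numerals. The sketch below is one conventional way to do this, written under the assumption that each run of missing units between two present units yields one 零 and that trailing missing units yield none; `render_group` is a hypothetical name, and the sketch always writes the leading digit (so 10 becomes "一十").

```python
NUMERALS = "零一二三四五六七八九"
UNITS = ["千", "百", "十", ""]  # units for the four decimal places


def render_group(n):
    """Render 1..9999, writing one 零 for each gap between present units."""
    if not 0 < n < 10000:
        raise ValueError("n must be between 1 and 9999")
    parts = []
    started = False       # has a non-zero digit been written yet?
    pending_zero = False  # is there an unwritten gap since the last digit?
    for digit_char, unit in zip(f"{n:04d}", UNITS):
        digit = int(digit_char)
        if digit == 0:
            pending_zero = started  # leading zeros never produce a 零
            continue
        if pending_zero:
            parts.append("零")      # one 零 for the whole gap
            pending_zero = False
        parts.append(NUMERALS[digit] + unit)
        started = True
    return "".join(parts)           # a trailing gap is dropped (no 零 at the end)


if __name__ == "__main__":
    for value in (1001, 1011, 1010, 1100, 101):
        print(value, render_group(value))
```

A composition path could then be compared against such a rendering, or the same gap bookkeeping could be applied directly while traversing the path.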
Rule 7 constrains the cases in which a first numeric unit word is omitted: when neither a second numeric unit group nor a digit exists between adjacent first numeric unit words on the composition path, the later first numeric unit word is omitted. For example, when neither a second numeric unit group nor a digit exists between "亿" and "万", the character "万" can be omitted. For 300020000, written "三亿零二万", no second numeric unit group exists between "亿" and "万" but a digit does, so "万" is not omitted. For 300002000, written "三亿零二千" or "三亿二千", neither a second numeric unit group nor a digit exists between "亿" and "万", so "万" must be omitted, while the 零 may or may not be omitted.
For rule 8, when nothing follows a 零 on its right, the 零 is omitted.
For rule 9, the character immediately before a 零 on the composition path is not a digit character; for example, "一零" is not valid in a numeric expression. In addition, the character immediately after a 零 on the composition path of the characters to be composed is a digit character or a monetary unit word. For example, "10,000 unitary" and "零元四角一分" are valid.
For rule 10, when the first position of the composition path is 拾 or 十, the second position is not 拾, 十, 百, 佰, 千, 仟, 整 or 零.
Rule 11 is to rule 15 for constraining monetary unit group.
For rule 11, frequency of occurrence of the monetary unit group in described group of word path in described group of word path is no more than 1. For RMB, complete monetary unit group may include: x member x angle x points, which can be imperfect.
For rule 12, arithemetic unit word is located in described group of word path before monetary unit group, and volume is, it is expected that no matter right In picking up the lesser arithemetic unit word of the arithemetic units such as (ten), one hundred (hundred), thousand (thousand), or for hundred million, ten thousand, capital, the numbers such as million it is single The biggish arithemetic unit word in position, is respectively positioned in described group of word path before monetary unit group.
For rule 13, from big to small suitable is presented in monetary unit word that the monetary unit group in described group of word path includes Sequence.
For rule 14, the monetary unit word that the monetary unit group in described group of word path includes is non-conterminous.
For rule 15, the monetary unit group in described group of word path includes the first monetary unit word, the first currency list The previous individual character of position word is digital individual character.Ru Jiao, one and only one the preceding digital individual character divided.
The combination rules for the monetary unit group are illustrated here using RMB:
1) the monetary unit group (元, 角, 分) may appear only once on the composition path;
2) numeric units must come before the monetary unit group (that is, no numeric unit may appear after the monetary unit group);
3) the monetary unit words within the group must be ordered from large to small;
4) the monetary unit words within the group cannot be adjacent to one another;
5) 角 and 分 are each preceded by one and only one digit character.
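The five RMB constraints listed above can be sketched as a single predicate. The character sets and the name `currency_group_ok` are assumptions for illustration; in particular, the sketch treats 零 as a digit character for the purpose of constraint 5 and reads constraint 2 as "no numeric unit after the first monetary unit word".

```python
CURRENCY_RANK = {"元": 0, "角": 1, "分": 2}       # larger unit = smaller rank
DIGITS = set("零一二三四五六七八九")
NUMERIC_UNITS = set("十百千拾佰仟万亿兆京")


def currency_group_ok(path):
    """Check the five RMB constraints on the monetary unit group of a path."""
    found = [(i, ch) for i, ch in enumerate(path) if ch in CURRENCY_RANK]
    if not found:
        return True  # no monetary unit group, nothing to check
    ranks = [CURRENCY_RANK[ch] for _, ch in found]
    # 1) and 3): each unit at most once, ordered from large to small.
    if any(a >= b for a, b in zip(ranks, ranks[1:])):
        return False
    # 4): monetary unit words are never adjacent.
    if any(j - i == 1 for (i, _), (j, _) in zip(found, found[1:])):
        return False
    # 2): no numeric unit after the monetary unit group has started.
    first_index = found[0][0]
    if any(ch in NUMERIC_UNITS for ch in path[first_index + 1:]):
        return False
    # 5): 角 and 分 are preceded by exactly one digit character.
    for i, ch in found:
        if ch in ("角", "分"):
            if i == 0 or path[i - 1] not in DIGITS:
                return False
            if i >= 2 and path[i - 2] in DIGITS:
                return False
    return True


if __name__ == "__main__":
    for path in ["三十五元四角一分", "一角五元", "五元角", "五元三十角"]:
        print(path, currency_group_ok(path))
```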
For rule 16, the digit characters on the composition path are not adjacent to one another; for example, no two digit characters from "一" to "九" are adjacent.
Examples of numeric expressions that satisfy the combination rules are given here:
Numeric unit words/digit characters (combined) + 元 + digit character + 角 + digit character (non-zero) + 分
Numeric unit words/digit characters (combined) + 元 + digit character (non-zero) + 角
Numeric unit words/digit characters (combined) + 元 + digit character (non-zero) + 分
Numeric unit words/digit characters (combined) + 元 or "元整"
Digit character or "拾" + 角 + digit character (non-zero) + 分 (exact match of the composed word)
Digit character or "拾" + 角
Digit character or "拾" + 分
In an embodiment of the present invention, the first numeric unit words include 亿 or 万. Before the first of the first numeric unit words, 0 or 1 second numeric unit group may appear. A second numeric unit group that appears is ordered from the larger numeric unit to the smaller numeric unit, and when the larger numeric unit inside the group is present and there is a gap among the smaller numeric units, one 零 appears at the corresponding gap position. For example, where the first of the first numeric unit words is "亿", corresponding examples may include "101,100,000,000", "100,100,000,000" and "10,010,000".
In an embodiment of the present invention, the first numeric unit words include 亿 or 万. After the last numeric unit word, 0 or 1 second numeric unit group may appear. A second numeric unit group that appears is ordered from the larger numeric unit to the smaller numeric unit. When any one numeric unit inside the group is absent, one 零 appears at the corresponding gap position; when several consecutive numeric units are absent, one 零 appears at the corresponding gap position; when nothing follows a "零" on its right, the "零" is omitted; and when several non-consecutive numeric units are absent, several "零" are used in their place. Corresponding examples may include "300,000,200" or "30,200".
In an embodiment of the present invention, the first numeric unit words include 亿 or 万, and the combination rules may include:
The composition path includes 亿 and 万, 万 is located before 亿, and the number of second numeric unit groups between 万 and 亿 is no more than 1;
When a second numeric unit group exists between 万 and 亿, the second numeric unit words of that group are ordered from the larger numeric unit to the smaller numeric unit, no two second numeric unit words of that group are adjacent, and when there is a gap in the numeric units among the second numeric unit words of that group, one 零 appears at the corresponding gap position; or
When neither a second numeric unit group nor a digit exists between 万 and 亿, no 零 appears between 万 and 亿, as in "1,000,000,000,000 2000 yuan".
The cases in which there is a gap in the numeric units among the second numeric unit words of a second numeric unit group may include the following: when any one numeric unit inside the group is absent, one 零 appears at the corresponding gap position; when several consecutive numeric units are absent, one 零 appears at the corresponding gap position; and when several non-consecutive numeric units are absent, several "零" are used in their place.
In an embodiment of the present invention, the first numeric unit words include 亿 or 万, and the combination rules may include:
The composition path includes 亿 and 万, 亿 is located before 万, and when neither a second numeric unit group nor a digit exists between 亿 and 万, 万 is omitted.
Further, where the composition path includes 亿 and 万, 亿 is located before 万, and a second numeric unit group exists between 亿 and 万, the second numeric unit words of that group are ordered from the larger numeric unit to the smaller numeric unit, no two second numeric unit words of that group are adjacent, and when there is a gap in the numeric units among the second numeric unit words of that group, one 零 appears at the corresponding gap position.
In an embodiment of the present invention, the first numeric unit words include 亿 or 万, and the combination rules include: the composition path does not include 亿, and 万 appears no more than once on the composition path. That is, when the composition path does not include "亿", there can be at most one "万".
Specifically, before "万", 0 or 1 second numeric unit group may appear. A second numeric unit group that appears is ordered from the larger numeric unit to the smaller numeric unit, and when the larger numeric unit inside the group is present and there is a gap among the smaller numeric units, one 零 appears at the corresponding gap position. For example, where the first of the first numeric unit words is "万", corresponding examples may include "10,110,000" and "10,010,000".
After "万", 0 or 1 second numeric unit group may appear. A second numeric unit group that appears is ordered from the larger numeric unit to the smaller numeric unit. When any one numeric unit inside the group is absent, one 零 appears at the corresponding gap position; when several consecutive numeric units are absent, one 零 appears at the corresponding gap position; when nothing follows a "零" on its right, the "零" is omitted; and when several non-consecutive numeric units are absent, several "零" are used in their place.
Where the composition path does not include "万", 0 or 1 second numeric unit group may appear. A second numeric unit group that appears is ordered from the larger numeric unit to the smaller numeric unit. When any one numeric unit inside the group is absent, one 零 appears at the corresponding gap position; when several consecutive numeric units are absent, one 零 appears at the corresponding gap position; when nothing follows a "零" on its right, the "零" is omitted; and when several non-consecutive numeric units are absent, several "零" are used in their place.
In practice, those skilled in the art may, according to practical needs, use any one or a combination of the above combination rules to screen the composition paths and take the composition paths that satisfy the rules as composed-word candidates. Where a combination of the above rules is used, embodiments of the present invention place no restriction on the order in which the rules are applied.
The composition paths that contain homophonous characters can be screened by the above combination rules. Taking the homophones "一" and "亿" as an example, if the homophone is at the first position, rule 2 filters out "亿" and retains "一"; if the homophone is at a non-first position, "一" can be screened by rule 16 (no digit character from "一" to "九" may be adjacent to "一").
In other embodiments of the present invention, if the syllable "yi" of the homophone is at the end position or the adjacent following syllable is "yuan", and the two conditions above are satisfied, both "一" and "亿" can be retained.
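The homophone screening by rules 2 and 16 can be sketched by enumerating the paths for each reading of "yi" and discarding those that break either rule. The sets and the names `keep` and `screen` are illustrative assumptions, and only these two rules are applied here.

```python
from itertools import product

DIGITS = set("一二三四五六七八九")
FIRST_UNITS = set("亿万京兆")


def keep(path):
    """Apply rule 2 (no leading first numeric unit) and rule 16 (no adjacent digits)."""
    if not path or path[0] in FIRST_UNITS:
        return False
    return not any(a in DIGITS and b in DIGITS for a, b in zip(path, path[1:]))


def screen(candidates_per_position):
    """Enumerate all paths and keep those that pass both rules."""
    paths = ("".join(choice) for choice in product(*candidates_per_position))
    return [path for path in paths if keep(path)]


if __name__ == "__main__":
    yi = ["一", "亿"]
    print(screen([yi, yi]))       # readings of "yiyi"
    print(screen([yi, ["万"]]))   # readings of "yiwan"
```

For "yiyi", the readings "亿一" and "亿亿" fall to rule 2 and "一一" falls to rule 16, which leaves "一亿".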
In practice, the composed-word candidates obtained in step 103 can be displayed together with the other candidates corresponding to the input string (such as candidates obtained directly from the dictionary or candidates obtained from the n-gram library). Optionally, a composed-word candidate can be displayed with a mark that identifies it as a composed-word candidate; for example, the composed-word candidate can be highlighted, or a corresponding icon can be added at its upper right corner.
Before the composed-word candidates and the other candidates corresponding to the input string are displayed, they can also be sorted and/or de-duplicated. Fig. 2 shows a schematic diagram of an input interface of an embodiment of the present invention, in which the input method program provides candidates for the input string "sanshiwuwanba". The composed-word candidate "三十五万" (350,000) satisfies the combination rules, whereas candidates such as "350,008" do not.
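The de-duplication step can be sketched as an order-preserving merge. Placing composed-word candidates ahead of the other candidates is an assumption made only for this illustration, since the embodiment does not fix a ranking; `merge_candidates` is a hypothetical name.

```python
def merge_candidates(composed, others):
    """Merge two candidate lists, dropping duplicates and keeping first-seen order."""
    # dict.fromkeys keeps insertion order, so the first occurrence wins.
    return list(dict.fromkeys([*composed, *others]))


if __name__ == "__main__":
    print(merge_candidates(["三十五万"], ["三十五万", "三十五", "三"]))
```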
In summary, in the phrase composing method of the embodiment of the present invention, when the hit situation of the characters to be composed against the character set satisfies the preset condition, the composition paths that satisfy the combination rules are obtained from the composition paths corresponding to those characters and are used as composition candidates. Because the combination rules characterize how digit characters and/or unit characters combine, they apply to arbitrary digit characters and/or unit characters, so the embodiment of the present invention can improve the success rate of composition. Moreover, because the rules reflect the regularities by which digit characters and/or unit characters combine, taking only the valid composition paths as candidates improves the reasonableness and quality of the composition candidates, and in turn the input efficiency of the user.
Moreover, whereas the conventional technique stores n-gram relations in an n-gram library, the embodiment of the present invention can be implemented with a character set and therefore saves storage space.
In addition, the embodiment of the present invention can obtain all composition candidates that satisfy the combination rules, which improves the coverage of the composition candidates.
Further, the embodiment of the present invention applies to numeric expressions of any length. Even if the numeric expression that the user wants to input is long (for example, longer than 10, such as "50,802,009,618 yuan five jiaos"), composition can still succeed.
Referring to Fig. 3, a flowchart of the steps of a phrase composing method embodiment of the present invention is shown, which may specifically include the following steps:
Step 301: receive an input string from a user;
Step 302: if the hit situation of the input string against a preset data set satisfies a preset condition, obtain, according to the input string, the composition paths that satisfy the combination rules as composition candidates; wherein the preset data set may include a character set and the coding unit set corresponding to the character set;
Compared with the method embodiment shown in Fig. 1, the method of this embodiment may further include:
Step 303: sort the multiple composition paths corresponding to the input string according to the position at which a homophone occurs in the composition path.
In practical applications, the characters to be composed that correspond to the input string may include homophone characters. These may be homophones corresponding to a full-pinyin input string or homophones corresponding to an abbreviated-pinyin input string.
The embodiment of the present invention may sort the multiple composition paths corresponding to the input string according to the position at which the homophone occurs in the composition path.
Where the homophone occurs in the first position, taking the homophones "one" (一) and "hundred million" (亿) as an example, rule 2 above filters out "hundred million" and retains "one".
Where the homophone occurs in a position other than the first, rule 16 above (no digit character from one to nine may be adjacent to "one") can be used to screen "one".
In other embodiments of the present invention, if the homophone syllable "yi" is at the end of the input string or the immediately following syllable is "yuan", and rule 2 and rule 16 above are satisfied, then both "one" and "hundred million" may be retained.
Where the homophone occurs in the end position, the corresponding sorting rules may specifically include:
yuan has a higher priority than hundred million, and hundred million has a higher priority than one (in its ordinary or financial form); or
ten (in its ordinary or financial form) has a higher priority than three (in its ordinary or financial form), and three has a higher priority than four (in its ordinary or financial form); or
jiao has a higher priority than nine (in its ordinary or financial form); or
thousand (in its ordinary or financial form) has a higher priority than seven (in its ordinary or financial form); or
hundred (in its ordinary or financial form) has a higher priority than eight (in its ordinary or financial form); or
ten thousand has a higher priority than five (in its ordinary or financial form).
In practical applications, the end of the input string may include an abbreviated-pinyin substring. Table 2 shows the mapping between abbreviated-pinyin substrings, full-pinyin syllables and character priorities.
Table 2
Abbreviated-pinyin substring | Full-pinyin syllables | Character priority
y | yi, yuan | yuan > hundred million (yi) > one (yi)
s | si, san, shi | ten (shi) > three (san) > four (si)
j | jiao, jiu | jiao > nine (jiu)
q | qian, qi | thousand (qian) > seven (qi)
b | ba, bai | hundred (bai) > eight (ba)
w | wan, wu | ten thousand (wan) > five (wu)
With the above homophone rules, the composition path containing the higher-priority character is ranked ahead of the composition path containing the lower-priority character, so that the paths ranked first can be preferentially used as composition candidates. It should be noted that the embodiment of the present invention places no restriction on the execution order of step 302 and step 303: either may be executed first, or the two may be executed in parallel.
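The end-position ordering of Table 2 can be sketched as a stable sort keyed on the last character of each path. The sketch below is an illustration under assumptions: the specific Chinese characters, including the pairing of ordinary and financial numeral forms, are inferred from the pinyin syllables in Table 2, and the names `TAIL_PRIORITY`, `tail_rank` and `rank_by_tail` are hypothetical.

```python
# Priority groups per abbreviated-pinyin tail, highest priority first.
# Each string is one group; ordinary and financial forms share a group
# (an assumption about the variants referred to in the sorting rules).
TAIL_PRIORITY = {
    "y": ["元", "亿", "一壹"],
    "s": ["十拾", "三叁", "四肆"],
    "j": ["角", "九玖"],
    "q": ["千仟", "七柒"],
    "b": ["百佰", "八捌"],
    "w": ["万", "五伍"],
}


def tail_rank(path, abbrev):
    """Return the priority rank of the last character (0 is highest)."""
    groups = TAIL_PRIORITY[abbrev]
    for rank, group in enumerate(groups):
        if path[-1] in group:
            return rank
    return len(groups)  # unknown tail characters sort last


def rank_by_tail(paths, abbrev):
    """Stable sort of non-empty paths by the priority of their last character."""
    return sorted(paths, key=lambda p: tail_rank(p, abbrev))
```

Because Python's `sorted` is stable, paths whose last characters fall in the same priority group keep their original relative order.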
An exemplary flowchart of the steps of a phrase composing method of the present invention is provided here, which may specifically include the following steps:
Step S1: receive an input string from a user;
Step S2: segment the input string to obtain a corresponding segmentation result, and judge whether the segmentation result corresponding to the input string hits the preset data set;
Step S3: if the segmentation result corresponding to the input string hits the preset data set and the hit situation of the characters it contains satisfies the preset condition, look up the mapping between the coding unit set and the character set according to the segmentation result, to obtain the characters that match the segmentation result as the characters to be composed for the input string; determine the composition paths from the characters to be composed, and judge whether a composition path contains "hundred million"; if so, execute step S4, otherwise execute step S5;
Step S4: judge whether "hundred million" itself, the characters before "hundred million", the characters between "hundred million" and "ten thousand", the characters between "ten thousand" and "hundred million", and the characters after the "hundred million" that is the last first numeric unit character (including second numeric unit characters, digit characters and monetary unit characters) satisfy the combination rules; if so, execute step S8, otherwise discard the composition path;
Step S5: judge whether the composition path corresponding to the characters to be composed contains "ten thousand"; if so, execute step S6, otherwise execute step S7;
Step S6: judge whether "ten thousand" itself, the characters before "ten thousand" and the characters after "ten thousand" (including second numeric unit characters, digit characters and monetary unit characters) satisfy the combination rules; if so, execute step S8, otherwise discard the composition path;
Step S7: judge whether the second numeric unit characters, digit characters and monetary unit characters satisfy the combination rules; if so, execute step S8, otherwise discard the composition path;
Step S8: take the composition path that satisfies the combination rules as a composition candidate.
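The branching in steps S3 to S8 can be sketched as a small dispatcher. This is only an outline under assumptions: the three rule checkers are passed in as hypothetical callables rather than implemented, paths are assumed to be strings, and 亿 and 万 are taken to be the characters behind "hundred million" and "ten thousand".

```python
import re


def split_on_first_units(path):
    """Split a path into sections, keeping 亿 and 万 as separate items."""
    return [part for part in re.split(r"([亿万])", path) if part]


def select_candidates(paths, check_with_yi, check_with_wan, check_plain):
    """Route each path to the checker matching steps S4, S6 or S7.

    Paths whose checker returns True become candidates (step S8);
    all other paths are discarded.
    """
    kept = []
    for path in paths:
        if "亿" in path:          # step S3 -> S4
            ok = check_with_yi(path)
        elif "万" in path:        # step S5 -> S6
            ok = check_with_wan(path)
        else:                     # step S5 -> S7
            ok = check_plain(path)
        if ok:
            kept.append(path)
    return kept
```

`split_on_first_units` shows one way a checker might isolate the sections before, between and after the first numeric unit characters before applying the per-section rules.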
It should be noted that for simple description, therefore, it is stated as a series of movement is dynamic for embodiment of the method It combines, but those skilled in the art should understand that, the embodiment of the present invention is not by the limit of described athletic performance sequence System, because according to an embodiment of the present invention, some steps may be performed in other sequences or simultaneously.Secondly, art technology Personnel also should be aware of, and the embodiments described in the specification are all preferred embodiments, and related athletic performance is simultaneously different It surely is necessary to the embodiment of the present invention.
Device embodiment
Referring to Fig. 4, a structural block diagram of a word composition device embodiment of the present invention is shown, which may specifically include:
an input string receiving module 401, configured to receive an input string from a user; and
a composition candidate obtaining module 402, configured to, if the hit situation of the input string against a preset data set satisfies a preset condition, obtain, according to the input string, the composition paths that satisfy the combination rules as composition candidates; wherein the preset data set includes a character set and the coding unit set corresponding to the character set.
Optionally, the device may further include:
a judgment module, configured to judge whether the hit situation of the input string against the preset data set satisfies the preset condition.
The judgment module may include:
a segmentation submodule, configured to segment the input string to obtain a corresponding segmentation result; and
a judging submodule, configured to judge whether the segmentation result corresponding to the input string hits the preset data set.
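The segmentation and hit check performed by these two submodules can be sketched as follows. The sketch assumes full-pinyin input only, an illustrative coding unit set, and the simplest possible condition (the whole string can be split into known coding units); the actual preset condition is defined elsewhere in the specification, and the names used here are hypothetical.

```python
# Illustrative coding unit set (full-pinyin syllables); an assumption.
CODING_UNITS = {
    "yi", "er", "san", "si", "wu", "liu", "qi", "ba", "jiu", "ling",
    "shi", "bai", "qian", "wan", "yuan", "jiao", "fen", "zheng",
}


def segment(input_string, units=CODING_UNITS):
    """Return every way to split input_string into known coding units."""
    if not input_string:
        return [[]]
    results = []
    for end in range(1, len(input_string) + 1):
        head = input_string[:end]
        if head in units:
            # Recurse on the remainder and prepend the matched unit.
            for rest in segment(input_string[end:], units):
                results.append([head] + rest)
    return results


def hits_data_set(input_string, units=CODING_UNITS):
    """True when at least one complete segmentation exists."""
    return bool(segment(input_string, units))
```

For the input string "sanshiwuwanba" used in Fig. 2, this yields a single segmentation into five syllables. Supporting abbreviated pinyin would require adding the single-letter substrings of Table 2 to the coding unit set.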
Optionally, the composition candidate obtaining module 402 may include:
a lookup submodule, configured to look up the mapping between the coding unit set and the character set according to the segmentation result, to obtain the characters that match the segmentation result as the characters to be composed for the input string;
a path determining submodule, configured to determine the composition paths from the characters to be composed that correspond to the input string; and
a path obtaining submodule, configured to obtain the composition paths that satisfy the combination rules.
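The lookup and path determination steps can be sketched as a table lookup followed by a Cartesian product over the per-syllable character choices. The mapping table below is a small illustrative subset, and its contents and the function name are assumptions rather than the patent's actual data.

```python
from itertools import product

# Illustrative mapping from coding units (syllables) to characters.
SYLLABLE_TO_CHARS = {
    "yi": ["一", "亿"],
    "san": ["三"],
    "shi": ["十", "拾"],
    "wu": ["五"],
    "wan": ["万"],
    "ba": ["八"],
    "bai": ["百", "佰"],
    "yuan": ["元"],
}


def composition_paths(segments, table=SYLLABLE_TO_CHARS):
    """Enumerate every character sequence the segmentation could denote."""
    choices = [table[s] for s in segments]  # raises KeyError on unknown units
    return ["".join(chars) for chars in product(*choices)]
```

Each enumerated path would then be passed to the rule checks, so that only the paths that satisfy the combination rules remain as candidates.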
Optionally, the device may further include:
a character determining module, configured to determine the characters corresponding to the segmentation result according to the context corresponding to the input string.
Optionally, the character set may include a digit character set and a unit character set, and the combination rules characterize how digit characters and/or unit characters combine.
Optionally, the combination rules may include:
The composition path includes first numeric unit characters, and the number of second numeric unit groups included before the first first numeric unit character of the composition path, between adjacent first numeric unit characters, or after the last first numeric unit character does not exceed 1; and/or
A first numeric unit character is not located in the first position of the composition path; and/or
If a first numeric unit character is adjacent to a second numeric unit character, or two first numeric unit characters are adjacent, then the preceding numeric unit is smaller than the following numeric unit; and/or
The second numeric unit characters included in a second numeric unit group of the composition path appear in order from larger numeric unit to smaller numeric unit; and/or
No two second numeric unit characters included in a second numeric unit group of the composition path are adjacent; and/or
When the second numeric unit characters included in a second numeric unit group of the composition path have a gap in numeric units, exactly one zero appears at the position of the gap; and/or
When neither a second numeric unit group nor a digit exists between adjacent first numeric unit characters in the composition path, the latter first numeric unit character is omitted; and/or
A zero included in the composition path is not in the end position; and/or
The character preceding a zero included in the composition path is not a digit character, or the character following a zero included in the composition path corresponding to the characters to be composed is a digit character or a monetary unit character; and/or
When the first character of the composition path is ten (in its ordinary or financial form), the second character is not ten, hundred or thousand (each in its ordinary or financial form), zheng or zero; and/or
A monetary unit group appears in the composition path no more than once; and/or
Numeric unit characters are located before the monetary unit group in the composition path; and/or
The monetary unit characters included in the monetary unit group of the composition path appear in descending order; and/or
The monetary unit characters included in the monetary unit group of the composition path are not adjacent; and/or
The monetary unit group of the composition path includes a first monetary unit character, and the character preceding the first monetary unit character is a digit character; and/or
The digit characters included in the composition path are not adjacent; and/or
The composition path includes zheng in the end position, and the character preceding zheng is yuan.
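Several of the rules above reduce to simple checks on a path. The sketch below illustrates three of them (zero not at the end, numeric unit characters before the monetary unit group, and a trailing zheng preceded by yuan). It assumes paths are strings; the character sets and function names are hypothetical, inferred from the rule wording rather than taken from the patent's data.

```python
# Assumed character classes; ordinary and financial unit forms are both listed.
NUMERIC_UNITS = set("十拾百佰千仟万亿")
MONETARY_UNITS = set("元角分整")


def zero_not_at_end(path):
    """A zero may not occupy the end position."""
    return not path.endswith("零")


def units_before_money(path):
    """No numeric unit character may follow the first monetary unit character."""
    first_money = next(
        (i for i, ch in enumerate(path) if ch in MONETARY_UNITS), None
    )
    if first_money is None:
        return True
    return all(ch not in NUMERIC_UNITS for ch in path[first_money:])


def zheng_follows_yuan(path):
    """A trailing 整 (zheng) must be immediately preceded by 元 (yuan)."""
    if path.endswith("整"):
        return len(path) >= 2 and path[-2] == "元"
    return True
```

Writing each rule as an independent predicate matches the "and/or" wording of the list: an implementation can apply any subset of the rules, in any order.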
Optionally, the condition that exactly one zero appears at the position of the gap when the second numeric unit characters included in a second numeric unit group of the composition path have a gap in numeric units includes:
When, in the second numeric unit group before the first first numeric unit character of the composition path, a larger numeric unit is present and the smaller numeric units have a gap, exactly one zero appears at the position of the gap; and/or
When a numeric unit is present in a second numeric unit group of the composition path and multiple consecutive numeric units are absent, exactly one zero appears at the position of the gap.
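The "exactly one zero per gap" convention can be illustrated from the generating side: reading a number below ten thousand aloud writes a single 零 however many consecutive units are missing. The sketch below is not the patent's checker; it is a conventional number-to-reading routine with a hypothetical name, included only to show which strings the rule would accept.

```python
DIGIT_NAMES = "零一二三四五六七八九"
GROUP_UNITS = ["千", "百", "十", ""]   # thousands, hundreds, tens, ones


def read_group(n):
    """Read 0 < n < 10000, emitting one 零 per run of missing units."""
    if not 0 < n < 10000:
        raise ValueError("n must be between 1 and 9999")
    out = []
    started = False        # True once a non-zero digit has been written
    pending_zero = False   # True while inside a gap after a written digit
    for digit_char, unit in zip(f"{n:04d}", GROUP_UNITS):
        d = int(digit_char)
        if d == 0:
            pending_zero = started   # leading zeros never produce 零
        else:
            if pending_zero:
                out.append("零")     # a single zero covers the whole gap
                pending_zero = False
            out.append(DIGIT_NAMES[d] + unit)
            started = True
    return "".join(out)
```

For 3005 both the hundreds and the tens are missing, yet only one 零 is written; trailing gaps, as in 3500, produce no zero at all, which is consistent with the rule that a zero may not occupy the end position.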
Optionally, the first numeric unit characters may include hundred million or ten thousand, and the combination rules may include:
The composition path includes hundred million and ten thousand, ten thousand is located before hundred million, and the number of second numeric unit groups included between ten thousand and hundred million does not exceed 1;
When a second numeric unit group exists between ten thousand and hundred million, the second numeric unit characters included in that group appear in order from larger numeric unit to smaller numeric unit, no two of them are adjacent, and, when they have a gap in numeric units, exactly one zero appears at the position of the gap; or
When neither a second numeric unit group nor a digit exists between ten thousand and hundred million, no zero appears between ten thousand and hundred million.
Optionally, the first numeric unit characters may include hundred million or ten thousand, and the combination rules may include:
The composition path includes hundred million and ten thousand, hundred million is located before ten thousand, and when neither a second numeric unit group nor a digit exists between hundred million and ten thousand, ten thousand is omitted.
Optionally, the first numeric unit characters may include hundred million or ten thousand, and the combination rules may include:
The composition path does not include hundred million, and ten thousand appears in the composition path no more than once.
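The last two optional rules can be sketched as a single check. The sketch assumes paths are strings and that 亿 and 万 are the characters behind "hundred million" and "ten thousand"; the function name is hypothetical.

```python
def wan_usage_ok(path):
    """Check the two optional rules on 万 relative to 亿.

    Without 亿, 万 may appear at most once. With 亿, a 万 directly after
    it (nothing in between) should have been omitted, so the path fails.
    """
    if "亿" not in path:
        return path.count("万") <= 1
    return "亿万" not in path
```

Under these assumptions, a path such as 三亿五千万 passes because a second numeric unit group separates the two unit characters.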
Optionally, the device may further include:
a sorting module, configured to sort the multiple composition paths corresponding to the input string according to the position at which a homophone occurs in the composition path.
Because the device embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.
With regard to the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and will not be elaborated here.
The embodiment of the present invention also provides a device for word composition, including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs include instructions for performing the following operations: receiving an input string from a user; if the hit situation of the input string against a preset data set satisfies a preset condition, obtaining, according to the input string, the composition paths that satisfy the combination rules as composition candidates; wherein the preset data set includes a character set and the coding unit set corresponding to the character set.
Optionally, the device is further configured such that the one or more programs executed by the one or more processors include instructions for performing the following operations: segmenting the input string to obtain a corresponding segmentation result; and judging whether the segmentation result corresponding to the input string hits the preset data set.
Optionally, obtaining, according to the input string, the composition paths that satisfy the combination rules includes:
looking up the mapping between the coding unit set and the character set according to the segmentation result, to obtain the characters that match the segmentation result as the characters to be composed for the input string;
determining the composition paths from the characters to be composed that correspond to the input string; and
obtaining the composition paths that satisfy the combination rules.
Optionally, the device is further configured such that the one or more programs executed by the one or more processors include instructions for performing the following operation: determining the characters corresponding to the segmentation result according to the context corresponding to the input string.
Optionally, the character set includes a digit character set and a unit character set, and the combination rules characterize how digit characters and/or unit characters combine.
Optionally, the combination rules include:
The number of second numeric unit groups included in the composition path before the first first numeric unit character, between adjacent first numeric unit characters, or after the last first numeric unit character does not exceed 1; and/or
A first numeric unit character is not located in the first position of the composition path; and/or
If a first numeric unit character is adjacent to a second numeric unit character, or two first numeric unit characters are adjacent, then the preceding numeric unit is smaller than the following numeric unit; and/or
The second numeric unit characters included in a second numeric unit group of the composition path appear in order from larger numeric unit to smaller numeric unit; and/or
No two second numeric unit characters included in a second numeric unit group of the composition path are adjacent; and/or
When the second numeric unit characters included in a second numeric unit group of the composition path have a gap in numeric units, exactly one zero appears at the position of the gap; and/or
When neither a second numeric unit group nor a digit exists between adjacent first numeric unit characters in the composition path, the latter first numeric unit character is omitted; and/or
A zero included in the composition path is not in the end position; and/or
The character preceding a zero included in the composition path is not a digit character, or the character following a zero included in the composition path corresponding to the characters to be composed is a digit character or a monetary unit character; and/or
When the first character of the composition path is ten (in its ordinary or financial form), the second character is not ten, hundred or thousand (each in its ordinary or financial form), zheng or zero; and/or
A monetary unit group appears in the composition path no more than once; and/or
Numeric unit characters are located before the monetary unit group in the composition path; and/or
The monetary unit characters included in the monetary unit group of the composition path appear in descending order; and/or
The monetary unit characters included in the monetary unit group of the composition path are not adjacent; and/or
The monetary unit group of the composition path includes a first monetary unit character, and the character preceding the first monetary unit character is a digit character; and/or
The digit characters included in the composition path are not adjacent; and/or
The composition path includes zheng in the end position, and the character preceding zheng is yuan.
Optionally, the condition that exactly one zero appears at the position of the gap when the second numeric unit characters included in a second numeric unit group of the composition path have a gap in numeric units includes:
When, in the second numeric unit group before the first first numeric unit character of the composition path, a larger numeric unit is present and the smaller numeric units have a gap, exactly one zero appears at the position of the gap; and/or
When a numeric unit is present in a second numeric unit group of the composition path and multiple consecutive numeric units are absent, exactly one zero appears at the position of the gap.
Optionally, the first numeric unit characters include hundred million or ten thousand, and the combination rules include:
The composition path includes hundred million and ten thousand, ten thousand is located before hundred million, and the number of second numeric unit groups included between ten thousand and hundred million does not exceed 1;
When a second numeric unit group exists between ten thousand and hundred million, the second numeric unit characters included in that group appear in order from larger numeric unit to smaller numeric unit, no two of them are adjacent, and, when they have a gap in numeric units, exactly one zero appears at the position of the gap; or
When neither a second numeric unit group nor a digit exists between ten thousand and hundred million, no zero appears between ten thousand and hundred million.
Optionally, the first numeric unit characters include hundred million or ten thousand, and the combination rules include: the composition path includes hundred million and ten thousand, hundred million is located before ten thousand, and when neither a second numeric unit group nor a digit exists between hundred million and ten thousand, ten thousand is omitted.
Optionally, the first numeric unit characters include hundred million or ten thousand, and the combination rules include:
The composition path does not include hundred million, and ten thousand appears in the composition path no more than once.
Optionally, the device is further configured such that the one or more programs executed by the one or more processors include instructions for performing the following operation: sorting the multiple composition paths corresponding to the input string according to the position at which a homophone occurs in the composition path.
Fig. 5 is a block diagram of a device for word composition as a terminal, according to an exemplary embodiment. For example, the terminal 900 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
Referring to Fig. 5, the terminal 900 may include one or more of the following components: a processing component 902, a memory 904, a power supply component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
The processing component 902 typically controls the overall operation of the terminal 900, such as operations associated with display, telephone calls, data communication, camera operations and recording operations. The processing component 902 may include one or more processors 920 to execute instructions, so as to perform all or part of the steps of the methods described above. In addition, the processing component 902 may include one or more modules that facilitate interaction between the processing component 902 and other components. For example, the processing component 902 may include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
The memory 904 is configured to store various types of data to support operation of the terminal 900. Examples of such data include instructions for any application or method operating on the terminal 900, contact data, phone book data, messages, pictures, videos, and the like. The memory 904 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disc.
The power supply component 906 provides power to the various components of the terminal 900. The power supply component 906 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the terminal 900.
The multimedia component 908 includes a screen that provides an output interface between the terminal 900 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 908 includes a front camera and/or a rear camera. When the terminal 900 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 includes a microphone (MIC), which is configured to receive external audio signals when the terminal 900 is in an operating mode, such as a call mode, a recording mode or a voice recognition mode. The received audio signal may be further stored in the memory 904 or sent via the communication component 916. In some embodiments, the audio component 910 further includes a loudspeaker for outputting audio signals.
The I/O interface 912 provides an interface between the processing component 902 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button and a lock button.
The sensor component 914 includes one or more sensors for providing status assessments of various aspects of the terminal 900. For example, the sensor component 914 can detect the open/closed state of the terminal 900 and the relative positioning of components, such as the display and keypad of the terminal 900. The sensor component 914 can also detect a change in position of the terminal 900 or of a component of the terminal 900, the presence or absence of user contact with the terminal 900, the orientation or acceleration/deceleration of the terminal 900, and a change in temperature of the terminal 900. The sensor component 914 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 914 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 914 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 916 is configured to facilitate wired or wireless communication between the terminal 900 and other devices. The terminal 900 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one exemplary embodiment, the communication component 916 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 916 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the terminal 900 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for performing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 904 including instructions, and the instructions can be executed by the processor 920 of the terminal 900 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Fig. 6 is a block diagram, shown according to an exemplary embodiment, of a device for word formation when the device is a server. The server 1900 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 1922 (for example, one or more processors), a memory 1932, and one or more storage media 1930 (for example, one or more mass storage devices) storing application programs 1942 or data 1944. The memory 1932 and the storage medium 1930 may provide transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations for the server. Further, the central processing unit 1922 may be configured to communicate with the storage medium 1930 and to execute, on the server 1900, the series of instruction operations in the storage medium 1930.
The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input/output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 1932 including instructions, which can be executed by the processor of the server 1900 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of a device (terminal or server), enable the device to execute a word-forming method, the method comprising: receiving an input string from a user; and, if the hit status of the input string in a preset data set satisfies a preset condition, obtaining, according to the input string, word-forming paths that conform to the combination rules, as word-forming candidates; wherein the preset data set includes a word set and a coding unit set corresponding to the word set.
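The flow just described (receive an input string, check whether it hits the preset data set, then enumerate paths that pass the combination rules) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the coding units are assumed to be pinyin syllables, the table `CODE_TO_CHARS` is a tiny hypothetical preset data set, segmentation is assumed to be greedy longest-match, and `is_valid` stands in for whatever combination rules are applied.

```python
from itertools import product

# Hypothetical preset data set (an assumption for illustration): each coding
# unit, here a pinyin syllable, maps to the characters of the word set that
# it can produce.
CODE_TO_CHARS = {
    "yi": ["一", "壹", "亿"],
    "er": ["二", "贰"],
    "san": ["三", "叁"],
    "shi": ["十", "拾"],
    "bai": ["百", "佰"],
    "qian": ["千", "仟"],
    "wan": ["万"],
    "yuan": ["元"],
}


def segment(input_string):
    """Greedily split the input into known coding units (longest match first).

    Returns None when part of the input is not in the preset data set,
    which is treated here as the hit condition failing.
    """
    units, pos = [], 0
    codes = sorted(CODE_TO_CHARS, key=len, reverse=True)
    while pos < len(input_string):
        match = next((c for c in codes if input_string.startswith(c, pos)), None)
        if match is None:
            return None
        units.append(match)
        pos += len(match)
    return units


def candidate_paths(input_string, is_valid):
    """Return every character sequence for the input that passes is_valid."""
    units = segment(input_string)
    if units is None:
        return []  # no hit in the preset data set, so no candidates
    choices = [CODE_TO_CHARS[u] for u in units]
    # Each element of the Cartesian product is one possible path.
    return ["".join(path) for path in product(*choices) if is_valid(path)]
```

For instance, `candidate_paths("sanbai", lambda path: True)` enumerates the four ways of writing "three hundred" with ordinary and formal characters; a real `is_valid` would narrow that list according to the combination rules.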
Those skilled in the art will readily conceive of other embodiments of the invention after considering the specification and practicing the invention disclosed here. The present invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or conventional technical means in the art that are not disclosed herein. The specification and examples are to be regarded as illustrative only, and the true scope and spirit of the invention are indicated by the following claims.
It should be understood that the present invention is not limited to the precise structure described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
The word-forming method, the word-forming apparatus, the device for word formation, and the machine-readable medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the invention, and the above description of the embodiments is intended only to help in understanding the method of the invention and its core ideas. Meanwhile, those of ordinary skill in the art may, in accordance with the ideas of the invention, make changes to the specific embodiments and the scope of application. In summary, the contents of this specification should not be construed as limiting the invention.

Claims (14)

1. A word-forming method, characterized by comprising:
receiving an input string from a user; and
if the hit status of the input string in a preset data set satisfies a preset condition, obtaining, according to the input string, a word-forming path that conforms to the combination rules, as a word-forming candidate; wherein the preset data set includes a word set and a coding unit set corresponding to the word set.
2. The method according to claim 1, characterized in that whether the hit status of the input string in the preset data set satisfies the preset condition is determined as follows:
segmenting the input string to obtain a corresponding segmentation result; and
determining whether the segmentation result corresponding to the input string hits the preset data set.
3. The method according to claim 2, characterized in that obtaining, according to the input string, the word-forming path that conforms to the combination rules comprises:
searching, according to the segmentation result, the mapping between the coding unit set and the word set to obtain the single characters that match the segmentation result, as the single characters to be combined that correspond to the input string;
determining word-forming paths according to the single characters to be combined that correspond to the input string; and
obtaining the word-forming paths that conform to the combination rules.
4. The method according to claim 2, characterized in that the method further comprises:
determining the single character corresponding to the segmentation result according to the context of the input string.
5. The method according to claim 1, characterized in that the word set includes a numeric single-character set and a unit word set, and the combination rules characterize the rules for combining numeric single characters and/or unit words.
6. The method according to any one of claims 1 to 5, characterized in that the combination rules include:
in the word-forming path, the number of second numeric unit groups contained before the first first numeric unit word, between adjacent first numeric unit words, or after the last first numeric unit word does not exceed 1; and/or
a first numeric unit word is not located at the first position of the word-forming path; and/or
if a first numeric unit word is adjacent to a second numeric unit word, or two first numeric unit words are adjacent, the preceding numeric unit is smaller than the following numeric unit; and/or
the second numeric unit words contained in a second numeric unit group of the word-forming path are ordered from the larger numeric unit to the smaller numeric unit; and/or
any two second numeric unit words contained in a second numeric unit group of the word-forming path are not adjacent; and/or
when there is a gap in the numeric units of the second numeric unit words contained in a second numeric unit group of the word-forming path, one zero appears at the position of the gap; and/or
when neither a second numeric unit group nor a digit exists between adjacent first numeric unit words in the word-forming path, the latter first numeric unit word is omitted; and/or
a zero contained in the word-forming path is not in the final position; and/or
the single character preceding a zero contained in the word-forming path is not a numeric single character, or the single character following a zero contained in the word-forming path is a numeric single character or a monetary unit word; and/or
the first position of the word-forming path is ten in its formal form (拾) or ten (十), and the second position is not 拾, 十, hundred (百 or 佰), thousand (千 or 仟), whole (整), or zero; and/or
a monetary unit group appears in the word-forming path no more than once; and/or
numeric unit words are located before the monetary unit group in the word-forming path; and/or
the monetary unit words contained in the monetary unit group of the word-forming path are ordered from largest to smallest; and/or
the monetary unit words contained in the monetary unit group of the word-forming path are not adjacent; and/or
the monetary unit group of the word-forming path contains a first monetary unit word, and the single character preceding the first monetary unit word is a numeric single character; and/or
the numeric single characters contained in the word-forming path are not adjacent; and/or
the word-forming path contains the character for "whole" (整) at the final position, and the single character preceding it is the yuan character (元).
7. The method according to claim 6, characterized in that the rule that one zero appears at the position of the gap, when there is a gap in the numeric units of the second numeric unit words contained in a second numeric unit group of the word-forming path, comprises:
when, in the second numeric unit group before the first first numeric unit word in the word-forming path, a larger numeric unit is present and there is a gap before a smaller numeric unit, one zero appears at the position of the gap; and/or
when, in a second numeric unit group of the word-forming path, a numeric unit is present and several consecutive numeric units are absent, one zero appears at the position of the gap.
8. The method according to claim 6, characterized in that the first numeric unit words include hundred million (亿) or ten thousand (万), and the combination rules include:
the word-forming path includes hundred million and ten thousand, ten thousand is located before hundred million, and the number of second numeric unit groups contained between ten thousand and hundred million does not exceed 1;
when a second numeric unit group exists between ten thousand and hundred million, the second numeric unit words contained in that group are ordered from the larger numeric unit to the smaller numeric unit, any two second numeric unit words contained in that group are not adjacent, and when there is a gap in the numeric units of the second numeric unit words contained in that group, one zero appears at the position of the gap; or
when neither a second numeric unit group nor a digit exists between ten thousand and hundred million, no zero appears between ten thousand and hundred million.
9. The method according to claim 6, characterized in that the first numeric unit words include hundred million (亿) or ten thousand (万), and the combination rules include:
the word-forming path includes hundred million and ten thousand, hundred million is located before ten thousand, and when neither a second numeric unit group nor a digit exists between hundred million and ten thousand, ten thousand is omitted.
10. The method according to claim 6, characterized in that the first numeric unit words include hundred million (亿) or ten thousand (万), and the combination rules include:
the word-forming path does not include hundred million, and ten thousand appears in the word-forming path no more than once.
11. The method according to any one of claims 1 to 5, characterized in that the method further comprises:
ranking the multiple word-forming paths corresponding to the input string according to the positions at which homophones occur in the word-forming paths.
12. A word-forming apparatus, characterized by comprising:
an input string receiving module, configured to receive an input string from a user; and
a word-forming candidate obtaining module, configured to, if the hit status of the input string in a preset data set satisfies a preset condition, obtain, according to the input string, a word-forming path that conforms to the combination rules, as a word-forming candidate; wherein the preset data set includes a word set and a coding unit set corresponding to the word set.
13. A device for word formation, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
receiving an input string from a user; and
if the hit status of the input string in a preset data set satisfies a preset condition, obtaining, according to the input string, a word-forming path that conforms to the combination rules, as a word-forming candidate; wherein the preset data set includes a word set and a coding unit set corresponding to the word set.
14. A machine-readable medium having instructions stored thereon which, when executed by one or more processors, cause a device to execute the word-forming method according to one or more of claims 1 to 11.
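As an illustration only, four of the combination rules listed in claim 6 could be checked along the following lines. This is a minimal sketch under stated assumptions rather than the claimed implementation: the concrete characters in `DIGITS`, `FIRST_UNITS`, and `SECOND_UNITS` are assumed memberships chosen for the example (formal-form characters are left out for brevity), and all function names are hypothetical.

```python
# Assumed character classes for illustration; the claims do not enumerate them.
DIGITS = set("一二三四五六七八九")   # numeric single characters
FIRST_UNITS = set("亿万")            # first numeric unit words (see claim 8)
SECOND_UNITS = set("千百十")         # assumed second numeric unit words
ZERO = "零"


def first_unit_not_leading(path):
    """A first numeric unit word is not at the first position of the path."""
    return not path or path[0] not in FIRST_UNITS


def no_adjacent(path, charset):
    """No two neighbouring characters of the path both belong to charset."""
    return all(not (a in charset and b in charset)
               for a, b in zip(path, path[1:]))


def zero_not_last(path):
    """A zero is not in the final position of the path."""
    return not path or path[-1] != ZERO


def satisfies_rules(path):
    """Combine the four checks; a fuller checker would add the other rules."""
    return (first_unit_not_leading(path)
            and no_adjacent(path, DIGITS)        # digits are not adjacent
            and no_adjacent(path, SECOND_UNITS)  # second unit words are not adjacent
            and zero_not_last(path))
```

Under these assumptions, a path such as "三百二十" passes all four checks, while "万三" (leading first unit word), "三三" (adjacent digits), and "三百零" (trailing zero) are rejected. Because claim 6 joins its rules with "and/or", an implementation may apply any subset of them; the conjunction used here is just one possible choice.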
CN201710861480.4A 2017-09-21 2017-09-21 Word forming method and device and word forming device Active CN109542243B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710861480.4A CN109542243B (en) 2017-09-21 2017-09-21 Word forming method and device and word forming device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710861480.4A CN109542243B (en) 2017-09-21 2017-09-21 Word forming method and device and word forming device

Publications (2)

Publication Number Publication Date
CN109542243A (en) 2019-03-29
CN109542243B (en) 2023-04-18

Family

ID=65828016

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710861480.4A Active CN109542243B (en) 2017-09-21 2017-09-21 Word forming method and device and word forming device

Country Status (1)

Country Link
CN (1) CN109542243B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004010674A1 (en) * 2002-07-18 2004-01-29 Min-Kyum Kim Apparatus and method for inputting alphabet characters
CN101013443A (en) * 2007-02-13 2007-08-08 北京搜狗科技发展有限公司 Intelligent word input method and input method system and updating method thereof
CN101290632A (en) * 2008-05-30 2008-10-22 北京搜狗科技发展有限公司 Input method for user words participating in intelligent word-making and input method system
CN101303625A (en) * 2008-07-04 2008-11-12 上海埃帕信息科技有限公司 Five strokes input words method
CN101359254A (en) * 2007-08-03 2009-02-04 北京搜狗科技发展有限公司 Character input method and system for enhancing input efficiency of name entry
CN101556596A (en) * 2007-08-31 2009-10-14 北京搜狗科技发展有限公司 Input method system and intelligent word making method
CN102012748A (en) * 2010-11-30 2011-04-13 哈尔滨工业大学 Statement-level Chinese and English mixed input method
CN102866782A (en) * 2011-07-06 2013-01-09 哈尔滨工业大学 Input method and input method system for improving sentence generating efficiency
CN103631385A (en) * 2012-08-23 2014-03-12 北京搜狗科技发展有限公司 Method and device for screening candidate items in character input
CN104679278A (en) * 2015-02-28 2015-06-03 广州三星通信技术研究有限公司 Character input method and device
CN105302332A (en) * 2014-07-25 2016-02-03 中国移动通信集团公司 Pinyin input method and realization apparatus thereof
CN105607757A (en) * 2015-12-28 2016-05-25 北京搜狗科技发展有限公司 Input method and device and device used for input

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004010674A1 (en) * 2002-07-18 2004-01-29 Min-Kyum Kim Apparatus and method for inputting alphabet characters
CN101013443A (en) * 2007-02-13 2007-08-08 北京搜狗科技发展有限公司 Intelligent word input method and input method system and updating method thereof
WO2008098507A1 (en) * 2007-02-13 2008-08-21 Beijing Sogou Technology Development Co., Ltd. An input method of combining words intelligently, input method system and renewing method
CN101359254A (en) * 2007-08-03 2009-02-04 北京搜狗科技发展有限公司 Character input method and system for enhancing input efficiency of name entry
CN101556596A (en) * 2007-08-31 2009-10-14 北京搜狗科技发展有限公司 Input method system and intelligent word making method
CN101290632A (en) * 2008-05-30 2008-10-22 北京搜狗科技发展有限公司 Input method for user words participating in intelligent word-making and input method system
CN101303625A (en) * 2008-07-04 2008-11-12 上海埃帕信息科技有限公司 Five strokes input words method
CN102012748A (en) * 2010-11-30 2011-04-13 哈尔滨工业大学 Statement-level Chinese and English mixed input method
CN102866782A (en) * 2011-07-06 2013-01-09 哈尔滨工业大学 Input method and input method system for improving sentence generating efficiency
CN103631385A (en) * 2012-08-23 2014-03-12 北京搜狗科技发展有限公司 Method and device for screening candidate items in character input
CN105302332A (en) * 2014-07-25 2016-02-03 中国移动通信集团公司 Pinyin input method and realization apparatus thereof
CN104679278A (en) * 2015-02-28 2015-06-03 广州三星通信技术研究有限公司 Character input method and device
CN105607757A (en) * 2015-12-28 2016-05-25 北京搜狗科技发展有限公司 Input method and device and device used for input

Also Published As

Publication number Publication date
CN109542243B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN109815314B (en) Intent recognition method, recognition device and computer readable storage medium
US6864809B2 (en) Korean language predictive mechanism for text entry by a user
CN102681658B (en) By display device and the method for controlling operation thereof of action control
CN101256462B (en) Hand-written input method and apparatus based on complete mixing association storeroom
CN107608532B (en) Association input method and device and electronic equipment
AU2013270485C1 (en) Input processing method and apparatus
CN107145571B (en) Searching method and device
CN104281649A (en) Input method and device and electronic equipment
CN109614846A (en) Manage real-time handwriting recognition
CN104735243B (en) Contact list displaying method and device
CN107305438A (en) The sort method and device of candidate item, the device sorted for candidate item
CN108958503A (en) input method and device
CN108073292A (en) A kind of intelligent word method and apparatus, a kind of device for intelligent word
CN109299235A (en) Knowledge base searching method, apparatus and computer readable storage medium
CN103914209B (en) A kind of information processing method and electronic equipment
CN107918496A (en) It is a kind of to input error correction method and device, a kind of device for being used to input error correction
CN109101505A (en) A kind of recommended method, recommendation apparatus and the device for recommendation
CN101405693A (en) Personal synergic filtering of multimodal inputs
CN109783244A (en) Treating method and apparatus, the device for processing
CN101763211A (en) System for analyzing semanteme in real time and controlling related operation
CN108803890A (en) A kind of input method, input unit and the device for input
CN109002184A (en) A kind of association method and device of input method candidate word
CN114328798A (en) Processing method, device, equipment, storage medium and program product for searching text
CN108628461A (en) A kind of input method and device, a kind of method and apparatus of update dictionary
CN100517186C (en) Letter inputting method and apparatus based on press-key and speech recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant