CN1755669A - Name input processing method and system - Google Patents
Name input processing method and system
- Publication number
- CN1755669A (CN 200410083187)
- Authority
- CN
- China
- Prior art keywords
- character
- name
- input
- surname
- given name
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Document Processing Apparatus (AREA)
Abstract
The invention discloses an input method and device capable of recognizing an input name. The method comprises: detecting the surname input code in the input code sequence; selecting from a surname list the candidate characters corresponding to the surname input code; obtaining the number of characters of the given name from the input code sequence; if the given name is a single-character given name, calculating the usage probability of each single-character candidate and sorting the candidates in descending order of usage probability; and if the given name is a two-character given name, calculating the usage probability of each two-character candidate and sorting the candidates in descending order of usage probability.
Description
Technical field
The present invention relates to a method and system for inputting Chinese character names, and in particular to a method and system that allow Chinese names to be input quickly, with less character selection and with continuous input.
Background art
Unlike Indo-European alphabetic scripts, East Asian languages such as Chinese, Japanese and Korean usually have a very large character set. GB2312 contains 6763 Chinese characters and GBK contains more than 20,000, so Chinese character input has always been regarded as the first difficulty of Chinese information processing. The same problem exists for Japanese and Korean input. To solve the Chinese character input problem, numerous input methods have appeared over the past twenty-odd years, including the well-known Wubi (five-stroke) method and Intelligent ABC. The Ziguang (Purple Light) Pinyin input method and the Microsoft Pinyin input method have also, to some extent, reduced character selection and enabled continuous input by means of statistical language models.
Common input methods convert pinyin to Chinese characters by means of a vocabulary. Personal names, however, differ from fixed vocabulary entries in that they are combined dynamically, so a sufficiently large table of names cannot be compiled in advance. As a result, the user usually has to go through tedious, repeated character selection when inputting a name.
Summary of the invention
The object of the present invention is to provide a method for inputting personal names efficiently. The method can be embedded in an existing input method or designed as an independent input method. As a software system, it can be used on any information device that requires name input, such as a desktop computer, a personal digital assistant or a mobile phone.
According to one aspect of the present invention, there is provided an input method capable of recognizing an input name, comprising the steps of: detecting the surname input code in an input code sequence; retrieving from a surname list the surname candidate characters corresponding to the surname input code; obtaining the number of characters of the given name from the input code sequence; when the given name is judged to be a single-character given name, calculating the usage probability of each candidate character for the single-character given name and sorting the candidate characters in descending order of usage probability; and when the given name is judged to be a two-character given name, calculating the usage probability of each candidate for the two-character given name and sorting the candidates in descending order of usage probability.
According to another aspect of the present invention, there is provided an input method capable of recognizing an input name, comprising the steps of: detecting the surname input code in an input code sequence; retrieving from a surname list the Chinese surname candidate characters corresponding to the surname input code; inputting a single-character given-name syllable and looking up the stored single-character given-name index table; and obtaining all given-name candidates corresponding to this pinyin using the expression $S(w) = 0.5\,(f(w, G_0 \mid p) + f(w, G_2 \mid p))$.
According to another aspect of the present invention, there is provided an input method capable of recognizing an input name, comprising the steps of: detecting the surname input code in an input code sequence; retrieving from a surname list the Chinese surname candidate characters corresponding to the surname input code; inputting two-character given-name syllables and looking up the stored two-character given-name index table; generating the two-character given-name candidate combinations corresponding to the input code and evaluating each generated combination using expression (6); and outputting the candidate combination with the highest score as the two-character given-name candidate.
According to another aspect of the present invention, there is provided an input method capable of recognizing an input name, comprising the steps of: obtaining the current syllable from the currently input syllable sequence; querying the surname list and judging whether the surname candidate list is empty; if the surname candidate list is empty, ending the name input and returning an empty result; if the surname candidate list is not empty, taking the first candidate in the list as the output surname and listing the surname candidate characters; if the output surname is judged to be incorrect, selecting a surname candidate character; and if the output is the desired surname, calculating the single-character given-name candidate or the two-character given-name candidate from the input syllable sequence.
According to another aspect of the present invention, there is provided an input device capable of recognizing an input name, comprising: a code input device for converting Chinese characters into an acceptable input code sequence; a surname processing unit for detecting and judging the surname in the input code entered by the user; a name processing unit for recognizing the given name in the subsequent pinyin sequence after the surname processing unit has detected the surname in the input code; and a name output unit for outputting the name candidate Chinese characters corresponding to the input code.
The method according to the present invention allows Chinese character names to be input efficiently and, with minor language-specific modifications, is also applicable to Japanese and Korean input.
Brief description of the drawings
The above and other objects, features and advantages of the present invention will become clearer from the following detailed description of the preferred embodiments in conjunction with the accompanying drawings. Note that the description given below is only an embodiment provided for a better understanding of the present invention and does not limit the invention. In the drawings:
Fig. 1 is a schematic block diagram of a name input system according to an embodiment of the invention;
Fig. 2 shows a reverse index table of Chinese surnames;
Fig. 3 shows the recognition and input of a single-character given name by the name input system in stand-alone input mode according to an embodiment of the invention;
Fig. 4 shows the recognition and input of a two-character given name by the name input system in stand-alone input mode according to an embodiment of the invention; and
Fig. 5 shows an input interface with a name input function according to an embodiment of the invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below with reference to the accompanying drawings. Details and functions that are unnecessary for the present invention are omitted from the description so as not to obscure the understanding of the invention.
The basic principle of the input method according to the present invention is described first. Note that although the embodiments describe the input of Chinese character names, the concept and principle of the invention can be applied to Chinese character input in other fields. In addition, with language-specific modifications, the invention can also be applied to Japanese and Korean input.
The invention is described below using Chinese as an example. Leaving aside the special cases of ethnic minority names, a Chinese name usually takes one of only four forms: a single-character surname with a single-character given name, a single-character surname with a two-character given name, a two-character surname with a single-character given name, or a two-character surname with a two-character given name.
Because the number of possible combinations of surname and given-name characters is very large, a common input method cannot provide a vocabulary that covers all names. However, the characters used in Chinese names and their combinations carry meaning and are not arbitrary, and the set of Chinese surname characters is limited. Names therefore exhibit distribution characteristics in the probabilistic sense. The present invention exploits exactly this property: a name recognition algorithm helps the user reduce the number of character selections during conversion from codes to Chinese characters, so that names can be input efficiently.
For clarity, this embodiment explains the name input method of the invention using pinyin input as an example. The invention is not limited to pinyin input, however; the method proposed here is equally applicable to other types of input method, such as stroke-based input methods.
The name input method of the present invention has two application modes: a stand-alone mode, in which the current input is known to be a name, and a continuous mode, used while Chinese characters are being input continuously.
In stand-alone mode the input method is set to input Chinese names only. Every string input thereafter is treated as a name, and a name recognition algorithm is used to evaluate and select the candidate Chinese characters generated from the input pinyin, so that the user can finish the input task with fewer character selections. Stand-alone mode suits situations in which many names must be input at once, for example when entering a payroll or a staff list. Stand-alone mode can be entered by the user actively setting the input method program to name input mode or switching to the name input method, or the name input method can be activated automatically by the environment, such as the operating system or a web browser, for example when the name field of a web page form is being filled in.
In continuous mode the name recognition algorithm dynamically detects names that may be present in the pinyin string input by the user and assists the user's character selection, thereby improving input efficiency. Continuous mode suits the input of names in general settings such as article input, and improves the efficiency of name input within long passages of text.
In both modes the invention achieves efficient name input by means of a name recognition algorithm. Using the configured recognition rules and a mathematical model, the algorithm ranks the candidates generated from the input pinyin codes and finally produces the Chinese character combination with the maximum probability, which reduces the number of character selections the user makes when inputting a name. The algorithm operates somewhat differently in the two modes, as described separately below.
Following the principle explained above, the name recognition model of the present invention and its construction are described below.
Name recognition model and its construction
The name recognition model is a mathematical model. Its parameters are gathered statistically from a name database built in advance, and at input time it is used to estimate how likely a Chinese character (or character combination) is to be a candidate for a Chinese given name.
The name database lists a series of real names, for example "Chen Geng, Zhang Zhizhong, Feng Yuxiang, Dai Anlan" and so on. Given the four limited forms of Chinese names, single-character and two-character given names are handled separately, because from the standpoint of mathematical statistics they carry different amounts of information.
For a single-character given name, consider a Chinese character $w$ used as the single-character given name $G_0$ of a name. Without loss of generality, take the character "Geng" (glossed "continue") as an example. When the user inputs the pinyin "geng", the given-name characters in the name database that are pronounced "geng" include those glossed "continue", "heptan", "plough", "permanent", "honest and just", "extend" and so on. The occurrences of these homophones as given-name characters are counted, and the characters are sorted in descending order of their frequency in names in order to predict the single-character given name. In general, predicting the single-character given name requires an index from each pinyin code to the homophonous candidate characters, with all single-character given-name characters sorted in descending order of frequency; this constitutes a single-character given-name index table, as shown in Table 1.
The numbers in the table are the frequencies with which each character occurs in given names. The index table can be stored in the storage device of an information processing device such as a computer and serves as the recognition model for predicting single-character given-name characters. Mathematically, the index table gives, for every Chinese character whose pinyin code is $p$, its frequency as a single-character given-name character, as in expression (1).
$f(w, G_0 \mid p)$ = number of times character $w$ is used as a single-character given name … (1)
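The index table described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the sample data, the name `SINGLE_NAMES` and the function `build_single_index` are assumptions introduced here, and the characters are only plausible stand-ins for the glossed homophones.

```python
from collections import Counter, defaultdict

# Hypothetical sample: (pinyin, character) pairs for single-character given
# names found in a name database. The data is illustrative only.
SINGLE_NAMES = [("geng", "赓"), ("geng", "赓"), ("geng", "耕"),
                ("geng", "庚"), ("geng", "赓"), ("geng", "耕")]

def build_single_index(samples):
    """Map each pinyin code p to [(w, f(w, G0 | p)), ...] in descending frequency."""
    counts = defaultdict(Counter)
    for pinyin, char in samples:
        counts[pinyin][char] += 1
    # most_common() sorts by count, highest first.
    return {p: c.most_common() for p, c in counts.items()}

single_index = build_single_index(SINGLE_NAMES)
print(single_index["geng"])
```

The first entry for a pinyin code is the character the system would offer first; the rest of the list is kept for the user's own selection.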
Similarly, for a two-character given name, suppose the Chinese characters $w_1$ and $w_2$ serve as the first character $G_1$ and the second character $G_2$ of the given name respectively. Without loss of generality, take the two-character given name "Zhizhong" as an example. When the user inputs the pinyin "zhizhong", the name database holds, just as for single-character given names, tables of the homophones that occur for "zhi" as the first character and for "zhong" as the second character of two-character given names. The occurrences of the characters of "Zhizhong" are counted and index tables are built in the same way as the single-character given-name index table. The mathematical descriptions are, respectively:
$f(w, G_1 \mid p)$ = number of times character $w$ with code $p$ is used as the first character of a two-character given name
$f(w, G_2 \mid p)$ = number of times character $w$ with code $p$ is used as the second character of a two-character given name
… (2)
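The two position counts of expression (2) can be gathered in a single pass over the two-character given names. The sketch below is illustrative only; `DOUBLE_NAMES` and `position_counts` are hypothetical names, and the sample names are invented.

```python
from collections import Counter

# Hypothetical sample of two-character given names:
# ((pinyin of 1st char, pinyin of 2nd char), (1st char, 2nd char)).
DOUBLE_NAMES = [(("zhi", "zhong"), ("治", "中")),
                (("zhi", "zhong"), ("志", "忠")),
                (("zhi", "yuan"), ("志", "远")),
                (("hai", "zhong"), ("海", "中"))]

def position_counts(samples):
    """Return counters for f(w, G1 | p) and f(w, G2 | p), keyed by (p, w)."""
    first, second = Counter(), Counter()
    for (p1, p2), (w1, w2) in samples:
        first[(p1, w1)] += 1   # w1 seen as first character
        second[(p2, w2)] += 1  # w2 seen as second character
    return first, second

f_g1, f_g2 = position_counts(DOUBLE_NAMES)
```

Keeping the two positions in separate counters reflects the point made in the text: the same character can be common in one position and rare in the other.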
In addition, the connection frequency between the two characters of a two-character given name is an effective measure of how likely two individual Chinese characters are to form a two-character given name. The principle is that this bigram connection parameter has a strong memory effect for names that have already occurred. The character connection frequency $f(w_2 \mid w_1)$ is therefore introduced:
$f(w_2 \mid w_1) = \dfrac{C(w_1 w_2)}{C(w_1, G_1)}$ … (3)
Here $C(w_1 w_2)$ is the number of times the characters $w_1$ and $w_2$ together form a two-character given name, and $C(w_1, G_1)$ is the number of times $w_1$ is the first character of a two-character given name. In other words, the character connection frequency equals the number of times $w_1$ and $w_2$ serve as a two-character given name divided by the number of times $w_1$ serves as the first character of a two-character given name. The number of times $w_1$ and $w_2$ serve as a two-character given name counts the occurrences in which $w_1$ and $w_2$ each appear in a two-character given name and appear together.
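As a worked illustration of the ratio described above (pair count divided by first-character count), the following sketch computes the connection frequency from a list of two-character given names. The function name `connection_frequency` and the sample names are assumptions made for this example.

```python
from collections import Counter

def connection_frequency(given_names):
    """Return f(w2 | w1) for every character pair seen in two-character given names."""
    pair_counts = Counter((n[0], n[1]) for n in given_names)
    first_counts = Counter(n[0] for n in given_names)
    return {(w1, w2): c / first_counts[w1]
            for (w1, w2), c in pair_counts.items()}

# Illustrative data: "治" starts three names, two of which are "治中".
names = ["治中", "治中", "治国", "玉祥"]
freq = connection_frequency(names)
```

With this sample, "治" followed by "中" gets two thirds of the weight and "治" followed by "国" gets one third, so a pair that has already been seen often is strongly preferred.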
Alternatively, the connection frequency of the tone-marked syllables of the two characters of a two-character given name is also a good measure. The principle is that Chinese names pay attention to how smoothly they read aloud. A tone-marked syllable connection frequency function $f(s_2 \mid s_1)$ is therefore introduced as expression (4), where $s$ denotes the syllable of an input character.
In expressions (1)-(4) above, the statistic of expression (1) is the single-character given-name frequency of a character, which describes how often the character is used as a single-character given name; this value can serve as the basis for ranking single-character given-name characters. The statistic of expression (2) is the frequency of a character at each position of a two-character given name. The statistics of expressions (3) and (4) are used to evaluate how well two given-name characters, and their pronunciations, combine.
For a single-character given name, considering that characters used as single-character given names are also commonly used as the second character of two-character given names, the usage score is set to:
E1: $S(w) = 0.5\,(f(w, G_0 \mid p) + f(w, G_2 \mid p))$ … (5)
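Expression (5) averages a character's count as a single-character given name with its count as the second character of a two-character given name. A minimal sketch, assuming the counts are held in dictionaries keyed by (pinyin, character); the function names and the numbers are illustrative, not taken from the patent.

```python
def single_name_score(w, p, f_g0, f_g2):
    """S(w) = 0.5 * (f(w, G0 | p) + f(w, G2 | p)), as in expression (5)."""
    return 0.5 * (f_g0.get((p, w), 0) + f_g2.get((p, w), 0))

def rank_single(p, candidates, f_g0, f_g2):
    """Sort candidate characters for pinyin p by descending score."""
    return sorted(candidates,
                  key=lambda w: single_name_score(w, p, f_g0, f_g2),
                  reverse=True)

# Hypothetical counts: "赓" is more common as a single-character given name,
# but "耕" is far more common as the second character of a two-character one.
f_g0 = {("geng", "赓"): 6, ("geng", "耕"): 4}
f_g2 = {("geng", "赓"): 2, ("geng", "耕"): 10}
ranking = rank_single("geng", ["赓", "耕"], f_g0, f_g2)
```

In this sample the second-position count lifts "耕" (score 7.0) above "赓" (score 4.0), which is the effect the averaging is meant to capture.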
For a two-character given name, suppose the Chinese character candidate combination produced by the two pinyin syllables is $w_1 w_2$; its usage score is set to:
E2: … (6)
Name input system
The name input system of the present invention is described below with reference to the drawings.
The input method with the efficient name input function described above can be used as an add-on to an existing input method, or as stand-alone software on any device that requires name input. Fig. 1 is a block diagram of a name input system according to an embodiment of the invention.
The name input system according to the present invention comprises a code input device 1, a surname processing unit 2, a name processing unit 3, a name output unit 4, a surname list storage unit 5, a recognition rule storage unit 6 and a recognition model storage unit 7.
Code input
To input Chinese characters, the user must convert them, by some coding method, into a code string that the computer can accept. For an input method based on Hanyu Pinyin, the code string is simply the pinyin string of the Chinese characters. Taking the most common full-pinyin input as an example, to input the name "Xia Hairong" the user may type "xia hai rong". After this pinyin string is input to the system, it is parsed into three separate syllables: "xia", "hai" and "rong".
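The parsing step can be sketched as below. The patent does not say how syllable boundaries are found, so greedy longest-match segmentation is an assumption of this example, as are the names `SYLLABLES` and `parse_syllables` and the deliberately tiny syllable inventory.

```python
# Hypothetical, tiny syllable inventory; a real system would list every
# valid pinyin syllable.
SYLLABLES = {"xia", "hai", "rong", "ou", "yang", "chen", "geng"}

def parse_syllables(code):
    """Split a pinyin string into syllables.

    Spaces are treated as syllable boundaries; unspaced runs are split by
    greedy longest match (an assumption, not the patent's stated method).
    """
    result = []
    for chunk in code.split():
        i = 0
        while i < len(chunk):
            for j in range(len(chunk), i, -1):
                if chunk[i:j] in SYLLABLES:
                    result.append(chunk[i:j])
                    i = j
                    break
            else:
                raise ValueError(f"cannot parse {chunk[i:]!r}")
    return result

print(parse_syllables("xia hai rong"))
```

Greedy matching can mis-split genuinely ambiguous strings; a production segmenter would need to consider alternative splits.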
Surname processing
Chinese has a fairly stable set of surnames. By building a surname list (see Table One), the surname in the pinyin input by the user can be detected and judged. The surnames in the list can be retrieved by their codes; in the previous example the surname "Xia" can be retrieved from the input "xia". Likewise, the surname "Ouyang" can be retrieved from the input "ou yang". The surname list can be stored in the surname list storage unit 5.
Table One: Chinese surname list (partial)
Surname | Pinyin code | Surname | Pinyin code |
Zhao | zhao | Qian | qian |
Zhang (glossed "open") | zhang | Zhang (glossed "chapter") | zhang |
Xiahou | xia hou | Xia | xia |
Ouyang | ou yang | Ou | ou |
… | … | … | … |
In practice, a reverse index from codes to surnames usually has to be built from this table; its structure is shown in Fig. 2. With this reverse index table it can be determined whether an input syllable may form the surname of a name, and how many homophonous surnames there are. For example, when the pinyin "ai" is input, the surname processing unit 2 looks up the Chinese surnames corresponding to the pinyin code "ai" in the surname list storage unit 5. Usually only one Chinese surname, "Ai", corresponds to the pinyin code "ai", so the detected surname is "Ai".
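The reverse index from pinyin code to surnames can be sketched as a dictionary of lists. The surname data below is a small illustrative subset, and `build_reverse_index` is a hypothetical helper name.

```python
from collections import defaultdict

# Partial, illustrative surname list: (surname, pinyin code).
SURNAMES = [("赵", "zhao"), ("钱", "qian"), ("张", "zhang"), ("章", "zhang"),
            ("夏侯", "xia hou"), ("夏", "xia"), ("欧阳", "ou yang"),
            ("欧", "ou"), ("艾", "ai")]

def build_reverse_index(surnames):
    """Map each pinyin code to the list of surnames that share it."""
    index = defaultdict(list)
    for surname, code in surnames:
        index[code].append(surname)
    return dict(index)

reverse_index = build_reverse_index(SURNAMES)
print(reverse_index["zhang"])  # homophonous surnames share one entry
```

A code with one entry, such as "ai", resolves directly to a surname; a code that is absent from the index means the syllable cannot start a name.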
Chinese has a small number of two-character surnames, such as "Xiahou", "Ouyang", "Yuchi" and "Zhuge", whose first character can itself also be a single-character surname. In this case, as one example, the system can by default treat the input as forming a two-character surname while allowing the user to change it. That is, when the user inputs the pinyin "ouyang...", the surname processing unit 2 assumes that the surname of the name is "Ouyang" rather than "Ou". When the pinyin "ou" is input, the surname processing unit 2 detects the single-character surnames "Ou" (glossed "Europe") and "Ou" (glossed "district") and, as a configurable option, can also detect the two-character surname "Ouyang".
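The default preference for a two-character surname can be sketched as a longest-match lookup. The index contents and the function `detect_surname` are assumptions for illustration; the user override described above is not modelled.

```python
# Illustrative reverse index: pinyin code -> surname candidates.
REVERSE_INDEX = {"ou": ["欧", "区"], "ou yang": ["欧阳"],
                 "xia": ["夏"], "xia hou": ["夏侯"]}

def detect_surname(syllables, index=REVERSE_INDEX):
    """Return (candidates, syllables consumed), preferring a two-character surname."""
    if len(syllables) >= 2:
        key = " ".join(syllables[:2])
        if key in index:
            return index[key], 2
    if syllables and syllables[0] in index:
        return index[syllables[0]], 1
    return [], 0  # no surname found: fall back to normal input

print(detect_surname(["ou", "yang", "xiu"]))
```

The number of syllables consumed tells the caller where the given name starts, which is also how the length of the given name becomes known in stand-alone mode.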
In addition, Chinese surnames include some homophones, such as the two surnames pronounced "Zhang" listed in Fig. 2. There is no particularly effective method for resolving these, so the user decides through the character selection process.
If the system cannot find a suitable surname for the user's input, it aborts the name input process and switches to the normal character selection process of the input method.
The recognition rules above can be stored in the recognition rule storage unit 6.
Given-name character processing
Compared with surnames, the characters used in given names are far more varied, so this stage involves the main algorithms described above. The process differs slightly depending on whether the system is in stand-alone mode or continuous mode. For a given input method program, a special key combination, or another control such as a button or menu item, can be provided on the information processing device to switch the name input mode.
1. Stand-alone mode
In stand-alone mode the user's purpose is precisely to input a name, so the input code string is always assumed to be a valid name code. Once the user finishes input, the system knows the total number of characters in the name directly, and after processing by the surname processing unit 2, the name processing unit 3 knows the number of characters in the given name, that is, whether it is a single-character or a two-character given name. The name processing unit 3 can therefore handle single-character and two-character given names separately.
Fig. 3 shows how the system of the present invention processes a single-character given name. First, at step S301, a single-character given-name syllable s is input. Then, at step S302, the name processing unit 3 looks up the single-character given-name index table. Next, at step S303, expression (5) is used to obtain all given-name candidates corresponding to this pinyin, and at step S304 the single-character given-name candidate with the highest frequency is selected as the output character. As an example, the single-character given-name index table can be stored in the recognition model storage unit 7, or a separate storage unit can be provided for the given-name character index tables, divided for instance into a single-character given-name index table and a two-character given-name index table.
A detailed example follows, in which the user inputs the name "chen geng". The surname processing unit first determines that "chen" is the single-character surname "Chen" (glossed "old"), and then uses the index table to find the frequency-ordered candidates for "geng" (glossed "continue", "plough", "extend", ...). The first candidate, "Geng" (glossed "continue"), is output as the selection, and the list is retained for the user to choose from.
Fig. 4 shows how the system of the present invention processes a two-character given name. First, at step S401, the two-character given-name syllables s1 and s2 are input. Then, at step S402, the name processing unit 3 looks up the stored two-character given-name index table. Next, at step S403, the two-character given-name candidate combinations corresponding to the input pinyin are generated, and at step S404 expression (6) is used to evaluate each candidate combination generated at step S403. At step S405 the two-character given-name candidate with the highest score is output. The output order can follow the frequency with which the two-character combinations occur.
The process by which the name input system recognizes a two-character given name in stand-alone mode is described in detail below, using the user input "xia hai rong" as an example. The system first determines that "xia" is the single-character surname "Xia" (glossed "summer"), and then uses the two-character given-name index table to find the frequency-ordered candidates for "hai" (glossed "sea", "the last of the twelve Earthly Branches", ...) and for "rong" (glossed "appearance", "honor", "Rong", "melt", "molten", ...). After the candidates are combined and scored with expression (6), the two-character given-name candidates are listed in order of score; they are all pronounced "Hairong" but are written with different characters, and the user can select among the candidates the system provides. Finally, the system outputs "Hairong" as the recognition result.
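The combine-and-score step for a two-character given name can be sketched as follows. Expression (6) is not available in this text, so `score_pair` is a placeholder chosen only for illustration: it multiplies the two position frequencies and boosts pairs with a known connection frequency. All names and numbers here are assumptions.

```python
from itertools import product

def score_pair(w1, w2, p1, p2, f_g1, f_g2, conn):
    # ASSUMPTION: placeholder score, NOT expression (6) of the patent.
    # Position frequencies are multiplied and scaled by the connection frequency.
    return (f_g1.get((p1, w1), 0) * f_g2.get((p2, w2), 0)
            * (1.0 + conn.get((w1, w2), 0.0)))

def rank_double(p1, p2, index, f_g1, f_g2, conn):
    """Generate every w1 w2 combination for syllables p1, p2 and sort by score."""
    combos = product(index.get(p1, []), index.get(p2, []))
    scored = [(w1 + w2, score_pair(w1, w2, p1, p2, f_g1, f_g2, conn))
              for w1, w2 in combos]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Illustrative data for the syllables "hai" and "rong".
index = {"hai": ["海", "亥"], "rong": ["容", "荣"]}
f_g1 = {("hai", "海"): 8, ("hai", "亥"): 1}
f_g2 = {("rong", "容"): 3, ("rong", "荣"): 5}
conn = {("海", "容"): 0.9}  # this pair has been seen together often
ranked = rank_double("hai", "rong", index, f_g1, f_g2, conn)
```

With these sample numbers the connection term moves "海容" ahead of "海荣" even though "荣" alone is the more frequent second character, which illustrates the memory effect attributed to the connection frequency.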
2. Continuous mode
The handling of name input in continuous mode according to the present invention is described below. In continuous mode the user inputs a code string of Chinese characters continuously, with nothing that clearly marks the start and end of a name, so the system must dynamically detect the names present in the code string input by the user. The basic algorithm is as follows:
First, a current-syllable variable s and a candidate w of s are set. The current syllable s is obtained from the currently input syllable sequence. The surname list is then queried to obtain a surname candidate list, and it is judged whether this list is empty. If the list is empty, name detection in continuous mode ends and an empty result is returned. If the list is not empty, the first candidate in the surname list is taken as the output surname. The user can then judge whether the output surname is correct; if it is not the surname the user wants, the user can select another candidate. If the user's selection is a non-surname character, the surname recognition process ends, an empty result is returned, and normal Chinese character input continues. If the output is the surname the user wants, the system hypothesizes that the user is inputting a single-character given name and a two-character given name respectively. The input pinyin codes are then scored according to expressions (5) and (6) above, and the single-character and two-character given names with the highest scores are displayed to the user.
The user then judges whether the selected given name is correct. If it is, this name input recognition ends. If not, the user selects the first character of the given name from the candidates provided. If the user determines that the name input is complete, this name input recognition ends. If the name input is not complete, the system scores according to expression (6) and displays the two-character given name with the highest score. The user can then judge whether the second character of the selected given name is correct. If it is, this name input ends; if not, the user selects the second character of the given name and this name input recognition ends.
For the sake of clarity, the process above can be represented by the following table:
Algorithm: detection of names in continuous mode
Variables: current syllable s, candidate w of s
1. Obtain the current syllable s from the input syllable sequence
…
6.3.1 Score according to E2, select the two-character given name with the highest score and display it to the user
6.4 The user judges whether the selected second character of the given name is correct; if it is correct
6.4.1 End this name recognition input
6.5 Otherwise
6.5.1 The user selects the second character of the given name; end this name recognition input
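The detection part of the continuous-mode procedure can be sketched as a scan over the syllable sequence: every syllable with a non-empty surname candidate list opens a single-character and a two-character given-name hypothesis. This is a simplified illustration that leaves out user interaction and two-character surnames; `name_hypotheses` and the sample index are assumptions.

```python
def name_hypotheses(syllables, surname_index):
    """List, for each syllable that may be a surname, the given-name
    syllable hypotheses that follow it."""
    rows = []
    for i, s in enumerate(syllables):
        candidates = surname_index.get(s, [])
        if not candidates:
            continue  # empty surname list: continue normal character input
        single = syllables[i + 1:i + 2]   # single-character hypothesis
        double = syllables[i + 1:i + 3]   # two-character hypothesis
        rows.append((s, candidates, single,
                     double if len(double) == 2 else []))
    return rows

# Illustrative index covering only two of the syllables.
surname_index = {"li": ["李", "黎"], "bao": ["包", "鲍"]}
rows = name_hypotheses("wo he li bao ying".split(), surname_index)
for row in rows:
    print(row)
```

Each row would then be scored with expressions (5) and (6), and the user's acceptance or correction decides whether the scan continues from the next syllable.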
An example of the recognition process of the name input method of the present invention in continuous mode follows. Suppose the user inputs the pinyin string "wo he li bao ying lao shi you guo yi mian zhi yuan", and the target text means "Teacher Li Baoying and I have met once". The surname corresponding to each syllable is looked up, with the results listed below:
Surname syllable | Surname candidates | Single-character given-name hypothesis | Two-character given-name hypothesis |
wo | Fertile | he | he li |
he | With, what, conspicuous, he, entire | li | li bao |
li | Lee, multitude, strict, chestnut | bao | bao ying |
bao | Bag, Bao | ying | ying lao |
ying | English, should, win | lao | lao shi |
lao | Labor, wine with dregs | shi | shi you |
shi | Stone, teacher, execute, history, the time | you | you guo |
you | Especially | guo | guo yi |
guo | Guo, really, state | yi | yi mian |
yi | Easily, also, she | mian | mian zhi |
mian | | | |
zhi | | | |
yuan | Former, first, Yuan | | |
According to the recognition process of the present invention, the system first converts "wo" into the hypothesized surname "Wo" (glossed "fertile"). At step 4.2.1 the user selects the non-surname character "I" (wo), so an empty result is returned and this name recognition ends. Processing continues in turn until "li", where the user selects "Li" at step 4.2.1. The system then hypothesizes a single-character and a two-character given name respectively and, by calculation, recommends the candidate name "Li Baoying" to the user. At step 6.1 the user judges that the recommended name is wrong and selects "Bao" (glossed "guarantor"); the system then outputs the name "Li Baoying" again, which is now correct, and recognition ends. The system processes each syllable in turn according to the algorithm until the syllable string ends.
Alternatively, each time the user inputs a surname and given name, the name can be recorded and counted. If it matches a previously input surname or given name, the input frequency of the corresponding name is updated using expressions (1)-(6) above. In this way the candidate ranking for the pinyin codes input by the user can be changed in real time according to how often names are used, making name input more efficient.
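The usage-based update can be sketched as a running counter that is consulted when candidates are ranked. The class `NameUsage` and its methods are hypothetical; how the running counts would be merged with the database statistics of expressions (1)-(6) is not specified in the text and is left out here.

```python
from collections import Counter

class NameUsage:
    """Running counts of given-name characters confirmed by the user."""

    def __init__(self):
        self.counts = Counter()  # keyed by (pinyin, character)

    def record(self, pinyin, char):
        """Call once each time the user confirms a character for a syllable."""
        self.counts[(pinyin, char)] += 1

    def ranked(self, pinyin, candidates):
        # sorted() is stable, so candidates with equal counts keep
        # their original (model) order.
        return sorted(candidates,
                      key=lambda w: self.counts[(pinyin, w)],
                      reverse=True)

usage = NameUsage()
usage.record("geng", "耕")
usage.record("geng", "耕")
reordered = usage.ranked("geng", ["赓", "耕", "庚"])
```

After two confirmations of "耕", it moves to the front while the remaining candidates stay in their previous order.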
An input method running on a PC platform usually has an input area and an information area: the user enters codes such as pinyin in the input area, and the Chinese character candidates corresponding to the codes are shown in the information area. If the present invention works in stand-alone mode, or as an independent input method, both the input area and the information area can be used directly, and usage is similar to that of an ordinary input method. If it works in continuous mode, then, so as not to interfere with the normal input and character selection of the host input method, a second information area for displaying name detection results can be added near the input area, for example below the normal input method, as shown in Figure 6.
During normal Chinese character input, the user enters Chinese character codes through the first input area and selects characters from the first information area to correct wrong characters in the input area. When the present invention detects a possible name, it draws the second information area on the screen. The user can switch to the second information area using a key not defined by the input method, such as an arrow key or the Tab key, or using the mouse, and then select from it the candidate name characters provided by the present invention.
The name input method of the present invention can be applied to any device that requires Chinese character input, for example a personal computer (PC), a portable computer, a mobile phone, or a PDA (personal digital assistant).
The name input system according to the present invention can be implemented in hardware. It can also be implemented in software, or in a combination of hardware and software. The program can be recorded on a machine-readable recording medium such as a floppy disk, hard disk, flash disk, CD-ROM, or DVD-ROM.
Although the present invention has been described with reference to preferred embodiments, it is not limited to them and is defined only by the following claims. Those skilled in the art can make various changes and improvements to the embodiments of the present invention without departing from its spirit.
Claims (25)
1. An input method capable of recognizing an input name, comprising the steps of:
Detecting a surname input code in an input code sequence;
Retrieving, from a surname list, the surname candidate characters corresponding to the input surname input code;
Obtaining the number of characters of the given name from the input code sequence;
When the given name is determined to be a single-character given name, calculating the usage probability of each candidate character for the target single-character given name, and sorting the candidate characters in descending order of usage probability; and
When the given name is determined to be a two-character given name, calculating the usage probability of each candidate character for the target two-character given name, and sorting the candidate characters in descending order of usage probability.
2. The method according to claim 1, wherein the step of calculating the usage probability of the single-character given name candidate characters uses the following expression (1):
$f(w, G_0 \mid p)$ = number of times character $w$ is used as a single-character given name …… (1)
where $w$ denotes a Chinese character, $G_0$ denotes the position of the character in the name, and $p$ denotes the pinyin code, the frequency being taken over all Chinese characters with pinyin code $p$ that are used as single-character given names.
3. The method according to claim 1, wherein the step of calculating the usage probability of the candidate characters for the target two-character given name uses the following expression (2):
$f(w, G_1 \mid p)$ = number of times character $w$, with code $p$, is used as the first character of a two-character given name
$f(w, G_2 \mid p)$ = number of times character $w$, with code $p$, is used as the second character of a two-character given name
…… (2)
where $w_1$ and $w_2$ denote Chinese characters, $G_1$ denotes the first character of a two-character given name, $G_2$ denotes the second character of a two-character given name, and $p$ denotes the pinyin code, the counts being taken over the two-character given name characters with pinyin code $p$.
4. The method according to claim 3, wherein the step of calculating the usage probability of the candidate characters for the target two-character given name further comprises a step of calculating the connection frequency of the two-character given name candidate characters using the following expression (3),
where $f(w_2 \mid w_1)$ denotes the single-character connection frequency.
5. The method according to claim 3, wherein the step of calculating the usage probability of the candidate characters for the target two-character given name further comprises a step of calculating the connection frequency of the two-character given name candidate characters using the following expression (4),
where $f(s_2 \mid s_1)$ denotes the connection frequency of toned syllables.
6. The method according to claim 2, wherein the step of calculating the usage probability of the single-character given name candidate characters uses the following expression (5) to calculate their score:
$S(w) = 0.5 \cdot (f(w, G_0 \mid p) + f(w, G_2 \mid p))$ …… (5)
The score takes into account that characters used for single-character given names are also commonly used as the second character of two-character given names.
7. The method according to claim 3, wherein the step of calculating the usage probability of the candidate characters for the target two-character given name further comprises using the following expression (4) to calculate the score of each combination of two-character given name candidate characters,
The score takes into account the two-character candidate combination $w_1 w_2$ produced from the input codes.
8. The method according to any one of claims 1 to 7, wherein the input code is a Chinese pinyin code.
9. An input method capable of recognizing an input name, comprising the steps of:
Detecting a surname input code in an input code sequence;
Retrieving, from a surname list, the Chinese surname candidate characters corresponding to the input surname input code;
Receiving an input single-character given name syllable and looking up a stored single-character given name index table; and
Using the expression $S(w) = 0.5 \cdot (f(w, G_0 \mid p) + f(w, G_2 \mid p))$ to obtain all given name candidates corresponding to this pinyin.
10. The method according to claim 9, wherein the input code is a Chinese pinyin code.
11. An input method capable of recognizing an input name, comprising the steps of:
Detecting a surname input code in an input code sequence;
Retrieving, from a surname list, the Chinese surname candidate characters corresponding to the input surname input code;
Receiving input two-character given name syllables and looking up a stored two-character given name index table;
Generating the two-character given name candidate combinations corresponding to the input codes, and using the expression
to evaluate each generated two-character given name candidate combination; and
Outputting the two-character given name candidate combination with the highest score as the two-character given name candidate characters.
12. The method according to claim 11, wherein the input code is a Chinese pinyin code.
13. An input method capable of recognizing an input name, comprising the steps of:
Obtaining the current syllable from the currently input syllable sequence;
Querying the surname list and determining whether the resulting surname candidate list is empty;
If the surname candidate list is empty, ending the name input and returning an empty result; if the surname candidate list is not empty, outputting the first candidate in the list as the surname and listing the surname candidate characters;
If the output surname is judged to be incorrect, selecting one of the surname candidate characters;
If the output result is the desired surname, calculating the single-character given name candidate characters or the two-character given name candidate characters from the input syllable sequence.
14. The method according to claim 13, further comprising a step of calculating the score of the single-character given name candidate characters according to the expression $S(w) = 0.5 \cdot (f(w, G_0 \mid p) + f(w, G_2 \mid p))$ and displaying the single-character given name candidate character with the highest score.
15. The method according to claim 13, further comprising a step of, according to the expression
calculating the score of the two-character given name candidate characters and displaying the single-character given name candidate character with the highest score.
16. The method according to claim 13, further comprising a step of, when the selected name is judged to be incorrect, selecting the first character of the given name from the candidates provided.
17. The method according to claim 16, further comprising a step of, when the name input has not ended, performing a score calculation according to the expression
and displaying the two-character given name with the highest score to the user.
18. The method according to claim 17, further comprising a step of judging whether the second character of the selected given name is correct and, if it is incorrect, selecting the second character of the given name and ending the name input recognition.
19. The method according to claim 13, further comprising, if the output is a non-surname character, ending the surname recognition process and continuing with Chinese character input.
20. The method according to any one of claims 13 to 19, wherein the input syllables are the pinyin of Chinese characters.
21. An input device capable of recognizing an input name, comprising:
An input coding device for converting Chinese characters into an acceptable input code sequence;
A surname processing device for detecting and judging the surname in the input codes entered by the user;
A name processing device for recognizing the name in the subsequent pinyin sequence after the surname processing device has detected the surname in the input codes; and
A name output device for outputting the name candidate Chinese characters corresponding to the input codes.
22. The device according to claim 21, further comprising a surname list storage device for storing a list of Chinese surnames, the surnames being retrievable by code.
23. The device according to claim 21, further comprising a recognition rule storage unit for storing rules that determine, from the input codes, whether the input pinyin corresponds to a single-character name or a two-character name.
24. The device according to claim 21, further comprising a recognition model storage device for storing the mathematical model used for name recognition, the mathematical model consisting of the necessary mathematical parameters obtained by statistics from a name database established in advance, and being used at input time to estimate the probability that a Chinese character, or a combination of Chinese characters, is a candidate for a Chinese given name.
25. The device according to any one of claims 21 to 24, wherein the input code is a Chinese pinyin code.
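As an illustration of the scoring in expression (5) and the descending sort recited in claim 1, the following is a minimal sketch rather than the claimed implementation. It assumes that the frequencies $f(w, G_0 \mid p)$ and $f(w, G_2 \mid p)$ are relative frequencies among the characters sharing one pinyin code (the source text only names usage counts, so this normalization is an assumption), and the counts, characters, and function names are made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical usage counts for characters sharing the pinyin code "ying".
# G0: uses as a single-character given name.
# G2: uses as the second character of a two-character given name.
COUNTS_G0: Dict[str, int] = {"英": 30, "颖": 50, "莹": 20}
COUNTS_G2: Dict[str, int] = {"英": 60, "颖": 20, "莹": 20}


def relative_freq(counts: Dict[str, int]) -> Dict[str, float]:
    """Normalize counts over all characters with the same code (assumption)."""
    total = sum(counts.values())
    return {w: (n / total if total else 0.0) for w, n in counts.items()}


def single_name_scores(
    g0: Dict[str, int], g2: Dict[str, int]
) -> List[Tuple[str, float]]:
    """Score each candidate with S(w) = 0.5 * (f(w,G0|p) + f(w,G2|p))
    and return (character, score) pairs in descending order of score."""
    f0, f2 = relative_freq(g0), relative_freq(g2)
    chars = set(f0) | set(f2)
    scored = [(w, 0.5 * (f0.get(w, 0.0) + f2.get(w, 0.0))) for w in chars]
    # Sort by score descending; break ties by character for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


ranked = single_name_scores(COUNTS_G0, COUNTS_G2)
for character, score in ranked:
    print(character, round(score, 2))
```

Averaging the two frequencies lets a character that is rare as a single-character name but common in second position still rank well, which is the rationale stated in claim 6.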
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200410083187 CN1755669A (en) | 2004-09-29 | 2004-09-29 | Name input processing method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1755669A true CN1755669A (en) | 2006-04-05 |
Family
ID=36688908
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 200410083187 Pending CN1755669A (en) | 2004-09-29 | 2004-09-29 | Name input processing method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1755669A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102193709A (en) * | 2010-03-01 | 2011-09-21 | 腾讯科技(深圳)有限公司 | Character input method and device |
CN101267635B (en) * | 2008-04-25 | 2011-11-23 | 中兴通讯股份有限公司 | Chinese input device for contact book of mobile phone |
CN101634928B (en) * | 2008-12-04 | 2012-01-25 | 北京搜狗科技发展有限公司 | Method and device for displaying name candidate items |
CN102647503A (en) * | 2011-02-18 | 2012-08-22 | 中兴通讯股份有限公司 | Contact person information processing method and mobile terminal |
CN104008093A (en) * | 2013-02-26 | 2014-08-27 | 国际商业机器公司 | Method and system for chinese name transliteration |
CN107784027A (en) * | 2016-08-31 | 2018-03-09 | 北京国双科技有限公司 | A kind of reminding method and device of judgement document's search key |
CN108090033A (en) * | 2017-12-27 | 2018-05-29 | 北京天融信网络安全技术有限公司 | Name detection method, device, computer-readable medium and equipment |
US10083172B2 (en) | 2013-02-26 | 2018-09-25 | International Business Machines Corporation | Native-script and cross-script chinese name matching |
CN112783333A (en) * | 2019-11-06 | 2021-05-11 | 北京搜狗科技发展有限公司 | Input method, input device and input device |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101267635B (en) * | 2008-04-25 | 2011-11-23 | 中兴通讯股份有限公司 | Chinese input device for contact book of mobile phone |
CN101634928B (en) * | 2008-12-04 | 2012-01-25 | 北京搜狗科技发展有限公司 | Method and device for displaying name candidate items |
CN102193709B (en) * | 2010-03-01 | 2015-05-13 | 深圳市世纪光速信息技术有限公司 | Character input method and device |
CN102193709A (en) * | 2010-03-01 | 2011-09-21 | 腾讯科技(深圳)有限公司 | Character input method and device |
CN102647503A (en) * | 2011-02-18 | 2012-08-22 | 中兴通讯股份有限公司 | Contact person information processing method and mobile terminal |
US9858269B2 (en) | 2013-02-26 | 2018-01-02 | International Business Machines Corporation | Chinese name transliteration |
CN104008093A (en) * | 2013-02-26 | 2014-08-27 | 国际商业机器公司 | Method and system for chinese name transliteration |
US9858268B2 (en) | 2013-02-26 | 2018-01-02 | International Business Machines Corporation | Chinese name transliteration |
US10083172B2 (en) | 2013-02-26 | 2018-09-25 | International Business Machines Corporation | Native-script and cross-script chinese name matching |
US10089302B2 (en) | 2013-02-26 | 2018-10-02 | International Business Machines Corporation | Native-script and cross-script chinese name matching |
CN107784027A (en) * | 2016-08-31 | 2018-03-09 | 北京国双科技有限公司 | A kind of reminding method and device of judgement document's search key |
CN108090033A (en) * | 2017-12-27 | 2018-05-29 | 北京天融信网络安全技术有限公司 | Name detection method, device, computer-readable medium and equipment |
CN112783333A (en) * | 2019-11-06 | 2021-05-11 | 北京搜狗科技发展有限公司 | Input method, input device and input device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |