The Chinese character input method of the present invention is described in detail below with reference to the accompanying drawing.
Chinese speech has a three-level structure: the bottom level is the phoneme, the middle level is the initial and final, and the top level is the syllable. The "Scheme for the Chinese Phonetic Alphabet" adopts a phoneme system in which each syllable consists of 1 to 6 letters. Modern Chinese has 21 initials and 39 finals, 60 in all, whereas the Latin alphabet has only 26 letters. In the "Scheme for the Chinese Phonetic Alphabet", most initials are written with one letter, the exceptions being zh, ch and sh, which are written with two; conversely, most finals are written with 2 to 4 letters, the exceptions being the single-vowel finals, which are written with one.
As for the keyboard, the "Scheme for the Chinese Phonetic Alphabet" maps onto it almost one to one, because the Scheme is a Latinized phoneme system and the international keyboard carries the Latin alphabet. Only two problems need special handling.
The international keyboard has no ü key, so ü is represented by the letter v. The reasons are that v is unused in the "Scheme for the Chinese Phonetic Alphabet", and that v resembles ü in shape and is therefore easily accepted.
Tone is an important component of the Chinese speech system; it distinguishes meaning and cannot be dispensed with. On the international keyboard, coding generally uses only the 26 Latin letter keys and not the 10 digit keys, so tones have to be marked with letters. The first, second, third and fourth tones and the neutral tone are represented by the letters f, x, v, h and q respectively. The main reasons are: 1. the horizontal bar in the middle of f has the same shape as the first-tone mark; the first stroke of x has the same shape as the second-tone mark (only the direction of writing is reversed); v has the same shape as the third-tone mark; the final stroke of h approximates the fourth-tone mark (the direction is the same); and q is the initial of qing ("light"), the first syllable of the Chinese name of the neutral tone. 2. The letters f, x, v and h form a parallelogram on the keyboard, spread over the middle and bottom rows, which helps both the keyboard layout and memorization.
The initial-final double-spelling system is defined relative to the "phoneme system" (the "Scheme for the Chinese Phonetic Alphabet"). The double-spelling key assignment is based on the "Scheme for the Chinese Phonetic Alphabet": each syllable is spelled with two letters, read by position, the first position denoting the initial and the last position denoting the final. For example, xiong ("hero") is written "xiong" in the phoneme system, which takes five letters, but "xp" in the double-spelling system, which takes only two, with "p" standing for "iong". In content, full spelling and double spelling carry the same amount of information; in form, double spelling is built on full spelling and is a compressed outward form of the same information.
Because the double-spelling system requires that a syllable be written with two letters, no more and no fewer, three cases arise: 1. More than two letters: the syllable must be compressed. This covers the great majority of syllables, for example ang ("high"): ang --> a+ng --> ag (ng is compressed to g). 2. Fewer than two letters: the syllable must be padded, which is done by repeating its letter once. This covers a very small number of syllables, for example e ("Russia"): e --> ee. 3. Exactly two letters: nothing is added or removed. This covers a minority of syllables, for example da ("big"): da --> da.
Fig. 1, which maps the initials and finals to the letter keys, and the conversion tables Table one and Table two below present the same assignment from two different angles. They are equivalent in use and may be consulted together or separately.
The double-spelling key assignment of the method of the invention is now described with reference to Fig. 1:
1. most initials and single-vowel finals use the same letters as prescribed by the "Scheme for the Chinese Phonetic Alphabet";
2. the initials zh, ch and sh are each compressed to one letter and represented by i, u and v respectively;
3. the single-vowel final ü is represented by v;
4. compound finals and nasal finals are all represented by a single letter, for example ao by f and ian by b;
5. following the principle of phonetic complementarity, some letters represent two finals, for example p represents both ong and iong;
6. the closed-mouth (hekou) finals and the round-mouth (cuokou) finals are complementary and are merged, for example r represents both uan and üan;
7. the first, second, third and fourth tones and the neutral tone are represented by the letters f, x, v, h and q respectively;
8. the strokes horizontal, vertical, left-falling, dot and turning are represented by the letters h, s, p, d and z respectively.
First, the processing of ordinary initials and two-letter initials
Initials are divided into seven groups by place of articulation: 1. bilabials b, p, m; 2. labiodental f; 3. dentals z, c, s; 4. alveolars d, t, n, l; 5. retroflexes zh, ch, sh, r; 6. palatals j, q, x; 7. velars g, k, h. Ordinary initials keep their positions in the key arrangement of the international keyboard. For example, "g" stays on the "G" key.
The two-letter initials zh, ch and sh are represented by i, u and v respectively. 1. Only the three letters i, u and v are available as key positions for the two-letter initials, so there is no other choice. 2. Mnemonics: i is shaped like a tree, which recalls the zh sound; u is shaped like a pond, which recalls the ch sound; v looks like a tick mark meaning "yes", which recalls the sh sound.
Second, the processing of the zero initial
The initial is the first element of a syllable, and every character has one. Some syllables, however, begin not with a consonant but with a vowel; by tradition this kind of initial is called the "zero initial", and it accounts for 5% of cases. In the double-spelling scheme: 1. zero-initial syllables beginning with a, o or e basically keep their original form, only ang being compressed to ag; 2. the first letter of syllables beginning with y or w stays unchanged.
Third, the processing of single-vowel finals, compound finals and nasal finals
Finals are divided by structure into single-vowel finals, compound finals and nasal finals, 39 in all. The Pinyin scheme itself already merges four of them: "i" represents a dorsal vowel and also two apical vowels, namely the "i" after z, c, s and the "i" after zh, ch, sh, r; "u" represents a velar vowel and, with two dots added above it, a dorsal vowel; "e" represents a velar vowel, a dorsal vowel when "^" is added above it, and a central vowel when it is followed by "i" or "r".
1. Single-vowel finals: following the key arrangement of the international keyboard, the positions of a, o, e, i and u are basically unchanged, but ü is represented by v.
2. Compound finals are placed mostly to the left of the keyboard centre line and nasal finals mostly to the right. For example, in the middle row the nasal finals an, en, ang and eng lie to the right of the centre line and the compound finals ai, ei, ao and ou lie to the left.
3. Compression of compound finals and nasal finals: according to the rules by which initials and finals combine, compound finals and nasal finals can be merged into eight groups, seven on the top row of the keyboard and one on the bottom row. Each group shares one key; the groups are listed here with the keys given in Table one:
r: uan, üan      t: un, ün      p: ong, iong      z: ua, ia
y: uang, iang    w: üe, ui      q: ie, uai        o: o, uo
4. Key layout of the finals by the "four classes of syllables"
Because the single-vowel finals on the top row, i (16.5), u (6.9) and e (10.2), have relatively high frequencies, the lower-frequency closed-mouth (hekou) and round-mouth (cuokou) finals are placed on the top row, the open-mouth (kaikou) finals on the middle row, and the even-teeth (qichi) finals, that is, i and the finals beginning with i, on the bottom row.
Fourth, the processing of tone
The letter keys f, x, v, h and q represent the level tone, the rising tone, the falling-rising tone, the falling tone and the neutral tone respectively.
Table one is arranged by letter key in alphabetical order; to the right of each letter is the final or initial that it represents.
Table one
a----a            n----in
b----ian          o----o, uo
c----iu (iou)     p----ong, iong
d----ei           q----uai, ie
e----e            r----uan, üan
f----ao           s----ai
g----ou, ng       t----ün, un (uen)
h----an           u----u, ch
i----i, zh        v----ü, sh
j----en           w----üe, ui (uei)
k----ang          x----iao
l----eng          y----uang, iang
m----ing          z----ia, ua
Table two is a reverse lookup arranged by final; the letter that represents each final is given in brackets.
Table two
Single-vowel finals
a     o     e     i     u     ü     er
[a]   [o]   [e]   [i]   [u]   [v]   [er]
Compound finals
ai    ei    ao    ou
[s]   [d]   [f]   [g]
ia    ie    ua    uo    üe
[z]   [q]   [z]   [o]   [w]
uai   uei   iao   iou
[q]   [w]   [x]   [c]
Nasal finals
an    en    ang   eng
[h]   [j]   [k]   [l]
ian   in    iang  ing
[b]   [n]   [y]   [m]
uan   uen   uang  ueng  ong
[r]   [t]   [y]   [wl]  [p]
üan   ün    iong
[r]   [t]   [p]
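Table one and Table two carry the same assignment read in opposite directions. The following is a minimal Python sketch of that equivalence, not part of the described method: it stores the finals of Table one as a dictionary and derives the reverse lookup of Table two from it. The names KEY_TO_FINALS and build_final_to_key are hypothetical. The entries er and ueng, which Table two lists with the two-letter codes [er] and [wl], and the ng entry on the g key are left out because they are not single-key finals.

```python
# Table one, finals only: each letter key and the final(s) it stands for.
KEY_TO_FINALS = {
    "a": ["a"], "b": ["ian"], "c": ["iu"], "d": ["ei"], "e": ["e"],
    "f": ["ao"], "g": ["ou"], "h": ["an"], "i": ["i"], "j": ["en"],
    "k": ["ang"], "l": ["eng"], "m": ["ing"], "n": ["in"],
    "o": ["o", "uo"], "p": ["ong", "iong"], "q": ["uai", "ie"],
    "r": ["uan", "üan"], "s": ["ai"], "t": ["ün", "un"], "u": ["u"],
    "v": ["ü"], "w": ["üe", "ui"], "x": ["iao"], "y": ["uang", "iang"],
    "z": ["ia", "ua"],
}


def build_final_to_key(key_to_finals):
    """Invert Table one into the final -> key lookup of Table two."""
    final_to_key = {}
    for key, finals in key_to_finals.items():
        for final in finals:
            if final in final_to_key:
                # A final on two keys would make the code ambiguous.
                raise ValueError(f"final {final!r} is assigned to two keys")
            final_to_key[final] = key
    return final_to_key


FINAL_TO_KEY = build_final_to_key(KEY_TO_FINALS)
```

Because every final appears under exactly one key, the inversion never raises, which is what allows the two tables to be used interchangeably.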
The Chinese character sound-analysis coding input method of the present invention requires no switching between full spelling and double spelling; the two coexist in one system and can be used side by side. The Chinese characters they handle are the same. With double-spelling keycaps fitted, and with double-spelling keystrokes displayed as full spelling, the method is formally identical to full spelling.
Why can full spelling and double spelling be merged into one? The key is this: in a full-spelling syllable the second letter (two-letter initials aside) is always a vowel, and this coincides exactly with the double-spelling syllables whose second letter is a vowel; in the double-spelling scheme, the letters that represent finals include 21 consonants, and these are clearly distinguished from the second letter of a full-spelling syllable, which is a vowel.
In terms of syllable pattern, in both full spelling and double spelling the syllables made up of an initial and a final account for 95% of all syllables. The "+" in the breakdowns below denotes the combination of sounds.
ba---->b+a---->ba ju---->j+u---->ju
lai--->l+ai--->ls guang->g+uang-->gy
shuai->sh+uai-->vq nü--->n+ü--->nv
tuan-->t+uan-->tr zuo--->z+uo--->zo
In terms of sound value, the final in ju ("lift") is ü, not u. The "Scheme for the Chinese Phonetic Alphabet" prescribes that when j, q and x combine with ü the two dots of ü are omitted, whereas the two dots in nü ("woman") cannot be omitted. The ü in ju is therefore still written u, while the ü in nü is converted to v. Remember: v is used only in combination with n and l.
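The breakdowns above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the names INITIAL_KEYS, FINAL_KEYS, split_syllable and encode are hypothetical, the input is assumed to be a valid full-spelling syllable with a consonant initial and with ü typed as "ü", and the final "ue" after j, q, x is assumed to stand for üe in the same way that the text treats the u of ju.

```python
# Two-letter initials are compressed; every other initial keeps its letter.
INITIAL_KEYS = {"zh": "i", "ch": "u", "sh": "v"}

# Final -> key, following Table two.
FINAL_KEYS = {
    "a": "a", "o": "o", "e": "e", "i": "i", "u": "u", "ü": "v",
    "ai": "s", "ei": "d", "ao": "f", "ou": "g",
    "ia": "z", "ie": "q", "ua": "z", "uo": "o", "üe": "w",
    "uai": "q", "ui": "w", "iao": "x", "iu": "c",
    "an": "h", "en": "j", "ang": "k", "eng": "l",
    "ian": "b", "in": "n", "iang": "y", "ing": "m",
    "uan": "r", "un": "t", "uang": "y", "ong": "p",
    "üan": "r", "ün": "t", "iong": "p",
}


def split_syllable(syllable):
    """Split a full-spelling syllable into (initial, final)."""
    for two_letter in ("zh", "ch", "sh"):
        if syllable.startswith(two_letter):
            return two_letter, syllable[2:]
    return syllable[0], syllable[1:]


def encode(syllable):
    """Return the two-letter double-spelling code of an initial+final syllable."""
    initial, final = split_syllable(syllable)
    # Assumption: "ue" after j, q, x is the undotted spelling of üe.
    final = {"ue": "üe"}.get(final, final)
    return INITIAL_KEYS.get(initial, initial) + FINAL_KEYS[final]
```

With these tables, encode("shuai") gives "vq" and encode("nü") gives "nv", while encode("ju") stays "ju" because the undotted u keeps its own key.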
The remaining syllables have a zero initial and account for 5% of all syllables. The "+" in the breakdowns below merely denotes the concatenation of the first letter and the letters that follow it.
yi--->y+i--->yi wai---->w+ai---->ws
yuan->y+uan->yr ang---->a+ng---->ag
ou--->o+u--->ou e------>e--------->ee
A zero-initial syllable is one that has no consonant initial, its final forming a syllable by itself. For example, an is the final in han ("Chinese") and can also form the syllable an ("peace") by itself. In the "Scheme for the Chinese Phonetic Alphabet", a final such as an is written an whether it serves as a final or as a zero-initial syllable; its letter form does not change. A final such as ian, however, can serve only as a final, for example in jian ("between"); as a zero-initial syllable it is written yan ("smoke"). On the double-spelling keyboard, the form used as a final and the form used as a zero-initial syllable are two separate sets of letter forms. For example, an is converted to h when it serves as a final (bh, "do"), but is still written an ("peace") as a zero-initial syllable. Likewise, ian is converted to b when it serves as a final (jb, "between"), and as a zero-initial syllable it is converted ian --> yan --> yh ("smoke").
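The zero-initial rules can be sketched in Python as follows. This is a hedged illustration rather than the method itself: FINAL_KEYS and encode_zero_initial are hypothetical names, the table holds only the finals that can follow y or w, and two points are assumptions not stated in the text: "ue" after y is mapped like üe (key w), and eng is compressed to eg by analogy with ang --> ag.

```python
# Keys for the finals that can follow y or w (a subset of Table two).
FINAL_KEYS = {
    "a": "a", "o": "o", "e": "e", "i": "i", "u": "u",
    "ai": "s", "ei": "d", "ao": "f", "ou": "g",
    "an": "h", "en": "j", "ang": "k", "eng": "l",
    "in": "n", "ing": "m", "ong": "p",
    "uan": "r", "un": "t",
    "ue": "w",  # assumption: yue is treated like üe
}


def encode_zero_initial(syllable):
    """Return the two-letter code of a syllable that has no consonant initial."""
    if syllable[0] in "yw":
        # y and w stay as the first letter; the rest is looked up as a final.
        return syllable[0] + FINAL_KEYS[syllable[1:]]
    if len(syllable) == 1:
        # Fewer than two letters: repeat the letter (e --> ee).
        return syllable * 2
    if syllable.endswith("ng"):
        # More than two letters: ng is compressed to g (ang --> ag).
        return syllable[0] + "g"
    # Exactly two letters: unchanged (an, ou, ...).
    return syllable
```

The three branches for a, o, e syllables correspond to the three cases of padding, compressing and leaving unchanged that the two-letter requirement produces.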
In the Chinese character sound-analysis coding input method of the present invention, single characters are input using a level-type ordering and a plane-type structure.
The level type (each level may be used on its own or together with the others) is divided into:
Two elements: initial + final
Three elements: initial + final + tone
Four elements: initial + final + tone + stroke
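As an illustration of how the levels build on one another, the following Python sketch composes the codes of one character from its double-spelling code, its tone and a stroke. The names TONE_KEYS, STROKE_KEYS and level_codes are hypothetical, the English stroke names are labels chosen for this sketch, and the choice of which stroke of the character supplies the fourth key is left to the caller because the text does not specify it here. The one-key initial level mentioned elsewhere in the description is included as the first entry.

```python
# Tone number -> letter key; 0 stands for the neutral tone.
TONE_KEYS = {1: "f", 2: "x", 3: "v", 4: "h", 0: "q"}

# Stroke name -> letter key.
STROKE_KEYS = {
    "horizontal": "h",
    "vertical": "s",
    "left-falling": "p",
    "dot": "d",
    "turning": "z",
}


def level_codes(double_spelling, tone, stroke):
    """Return the 1-, 2-, 3- and 4-key codes of one character."""
    if len(double_spelling) != 2:
        raise ValueError("expected a two-letter double-spelling code")
    full = double_spelling + TONE_KEYS[tone] + STROKE_KEYS[stroke]
    # Each level is a prefix of the next, so keys are only ever appended.
    return [full[:length] for length in range(1, 5)]
```

For the syllable yi in the fourth tone with a vertical stroke, level_codes("yi", 4, "vertical") returns the list y, yi, yih, yihs.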
Example
Full spelling shi: 1 thing, 2 real, 3 city, 4 formula, 5 examination, 6 look, 7 generation, 8 history, 9 stone, 0 show
Double spelling vi: 1 thing, 2 real, 3 city, 4 formula, 5 examination, 6 look, 7 generation, 8 history, 9 stone, 0 show
Note: in the level-type screen display, each prompt line shows 10 characters in descending order of frequency.
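The prompt line and the digit selection can be sketched in Python as follows. This is a minimal sketch, assuming that the candidates arrive already sorted by descending frequency; SELECTION_KEYS, prompt_line and select are hypothetical names, and the candidate list reuses the English glosses of the shi example in place of the Chinese characters.

```python
# Labels shown on the prompt line, from left to right.
SELECTION_KEYS = "1234567890"

# Candidates for shi (double spelling vi), in the order given in the example.
SHI_CANDIDATES = [
    "thing", "real", "city", "formula", "examination",
    "look", "generation", "history", "stone", "show",
]


def prompt_line(candidates):
    """Pair up to 10 candidates with their selection digits."""
    return list(zip(SELECTION_KEYS, candidates[:10]))


def select(candidates, digit):
    """Return the candidate chosen by pressing a digit key."""
    return candidates[SELECTION_KEYS.index(digit)]
```

Note that the digit 0 labels the tenth and last position of the line, not the first.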
The plane type (each plane may be used on its own or together with the others) refers to:
Initial + final + tone
Note: in the plane-type screen display, after one key or two keys have been struck each prompt line shows a single character, the high-frequency one, and that character appears again at the three-key initial-final-tone stage. Only when all three keys (initial, final and tone) lie in one plane are 10 characters shown, in descending order of frequency.
The technical measures adopted for single-character processing in the level type of this method include:
1. Four codes resolve all homophones. An initial-level character takes one code, an initial-final character two codes, an initial-final-tone character three codes and an initial-final-tone-stroke character four codes; with a digit selection key added, four codes and five keystrokes in total settle every case.
2. There is ample capacity for the Chinese characters of the GB basic set. The capacity exceeds the size of any homophone group: in theory 250 or more homophones can be separated, whereas the largest homophone group in the basic set has only 103 characters.
3. Characters of the same level need only one screen and no page turning. Same-level characters are the 10 characters of one screen; turning to another screen leads to the characters of the next level. Same-level characters use the same information; for example, the 10 characters of one screen share the same initial-final information. For example: zhi 1. straight 2. value 3. finger 4. to 5. [...] 6. system 7. control 8. know 9. will 0. matter. These 10 characters are same-level characters, distinguished from one another only by the digit selection key.
4. Unequal length, non-holographic. Unequal length means that input codes differ in length. For example, "yes", "life", "look", "gesture" and "murder" belong to different levels, and their code lengths range from 1 to 4. Not every character needs holographic input of initial, final, tone and stroke; many common high-frequency characters, such as "life" and "look", can be entered with the initial alone or with the initial and final alone.
5. High frequency first. This has two meanings. Between levels, frequency decreases from level to level: the higher the level, the higher the frequency, and vice versa. Among characters of the same level, frequency decreases from left to right: the smaller the label number, the higher the frequency, and vice versa.
6. Word-characters are distinguished from morpheme-characters. Within one level, word-characters are placed first or among the first few. For example, "justice" has a higher frequency than "easy" but is not used on its own and is a morpheme-character, so it is placed after "easy". Between levels, word-characters are placed in the upper level and morpheme-characters in the lower level. For example, "people" is a morpheme-character that is not used on its own but forms words such as "national, democracy, people, citizen, its people". It ranks 46th in the descending frequency order of GB level-1 characters and 2nd under the initial M, so it ought to be at the initial level; since it is a morpheme-character with little chance of being used alone, it is placed at the initial-final level.
7. Characters of the same kind are grouped together for easy scanning. This measure is particularly useful when arranging low-frequency characters of the same level. Where a group of homophones shares a component, as in "snout moth's larva, close the eyes, set, sea, underworld", all pronounced ming and all containing "underworld", these homophones with a common phonetic element are arranged consecutively within one level. This makes the line easy to scan, so the wanted character is found quickly, and it also helps memorization by position. Another example: "loyalty, clock, swollen, secondary, sad, handleless cup".
8. Familiar and unfamiliar characters are ordered together by sound, so that the familiar lead the unfamiliar. Many of the level-2 characters are unfamiliar and their pronunciation is uncertain; placing them together with characters people know well lets the pronunciation of the unfamiliar characters be read from that of the familiar ones. For example: yihd 1. call on 2. descendants 3. play chess 4. grand 5. bright 6. assist 7. pleased 8. sad 9. hysteria 0. buryplant. By recognizing familiar characters such as "attainments" and "grand", the user can learn unfamiliar ones such as "pleased" and "buryplant".
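Measure 2 above gives a theoretical capacity of 250 homophones without showing the arithmetic. One reading that reproduces the figure, offered here as an assumption rather than as the authors' derivation, is that each initial-final pair can be followed by any of the 5 tone keys and any of the 5 stroke keys, and that each resulting four-key code offers one prompt line of 10 candidates. The Python sketch below computes that product; the constant and function names are hypothetical.

```python
TONE_KEYS = "fxvhq"               # four tones plus the neutral tone
STROKE_KEYS = "hspdz"             # horizontal, vertical, left-falling, dot, turning
SELECTION_DIGITS = "1234567890"   # one prompt line of candidates

# Largest homophone group in the basic set, as stated in measure 2.
LARGEST_GROUP = 103


def homophone_capacity():
    """Characters one initial-final pair can separate using tone, stroke and digit."""
    return len(TONE_KEYS) * len(STROKE_KEYS) * len(SELECTION_DIGITS)
```

Under this reading the capacity is 5 x 5 x 10 = 250, comfortably above the 103 characters of the largest group.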
Everything unfolds in levels; this is a common property and a universal law, and the frequency distribution of Chinese characters is no exception. It shows mainly in the unevenness of that distribution. Whether a character is selected, and how often, is governed by how frequently the words it writes occur in written communication. Some characters are used very often and some very rarely, which shows the unevenness of characters in use.
The unevenness of the frequency distribution of Chinese characters is reflected in the level-based locating method as codes of unequal length. Characters are given code lengths according to their frequency of use: the higher the frequency, the shorter the code, and vice versa. For example, "with" ranks 28th in descending frequency and needs only the initial "y", a code length of 1, whereas "stand erect" ranks 3324th and needs the initial-final-tone-stroke code "yihs", a code length of 4. Full use is made of initial, final and tone information, which matches what users have learned at school. The unequal-length level-based locating method has two salient features. First, the levels are interrelated and interdependent: a character that has appeared at an upper level normally does not appear at the next level, so characters of different levels differ in code length. Second, the levels are progressive: if a character is not found at the initial level, there is no need to go back; adding the final gives the initial-final level, and so on.
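The progressive lookup can be sketched in Python as follows. This is a toy illustration under stated assumptions: the names INDEX and first_level_found are hypothetical, the index holds only the two examples given in the text (English glosses stand in for the characters), and for "with" only the first key of its code is known from the text, so only that key is passed.

```python
# Toy index: code -> candidates shown at that level.
INDEX = {
    "y": ["with"],
    "yihs": ["stand erect"],
}


def first_level_found(target, full_code, index):
    """Return the shortest prefix of full_code at which target is offered.

    Keys are only ever appended: if the target is not offered at one
    level, the next key is added and the lookup moves down one level.
    """
    for length in range(1, len(full_code) + 1):
        prefix = full_code[:length]
        if target in index.get(prefix, []):
            return prefix
    return None
```

A high-frequency character is therefore reached after one key, while a rare one needs all four, which is the unequal code length described above.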
In "holographic" input the term refers to information: holographic input carries the complete information of a character or word, non-holographic input only part of it, and each has its own uses. Chinese character input differs from English input. English input must be holographic: every letter that makes up an English word is entered one by one, with nothing left out. Chinese characters are taught holographically, stroke by stroke and tone by tone, but Chinese character input can be non-holographic, entering the full content in a partial form; this holds for sound-based codes and shape-based codes alike. For example, when a word is entered, generally only the first code of each character is taken, yet what is entered is the whole word, so non-holographic input is essentially a technique for compressing information. This is the greatest difference between Chinese and English input, and the main reason why Chinese input speed exceeds that of English. The level-type and plane-type displays show that the input method of the present invention supports fast input.
In word input, tones are generally not used, and strokes even less so. In the input method of the present invention, words are entered by the non-holographic method.
Two-character words: initial + final, initial + final
Example: believe; great
Full spelling: xiangxin weida
Double spelling: xyxn wdda
Three-character words: the three initials + o
Example: sorry; why
Full spelling: dbqo wshmo
Double spelling: dbqo wvmo
Four-character phrases: the four initials
Example: wholeheartedly; in other words
Full spelling: qxqy zhjshsh
Double spelling: qxqy ijvv
Five or more characters: the first three initials + the last initial
Example: the People's Republic of China
Full spelling: zhhrg
Double spelling: ihrg
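The word and phrase rules illustrated by these examples can be sketched in Python as follows. This is a hedged sketch, not the method's implementation: phrase_code is a hypothetical name, the input is assumed to be the list of two-letter double-spelling codes of the characters in the phrase, the trailing "o" of three-character words is inferred from the examples, and treating a phrase of exactly five characters under the last rule is an assumption.

```python
def phrase_code(char_codes):
    """Build the double-spelling input code of a word or phrase.

    char_codes holds one two-letter double-spelling code per character.
    """
    count = len(char_codes)
    initials = [code[0] for code in char_codes]
    if count == 2:
        # Initial and final of both characters.
        return "".join(char_codes)
    if count == 3:
        # The three initials followed by the letter o.
        return "".join(initials) + "o"
    if count == 4:
        # The four initials.
        return "".join(initials)
    if count >= 5:
        # The first three initials plus the last initial.
        return "".join(initials[:3]) + initials[-1]
    raise ValueError("a word or phrase needs at least two characters")
```

For example, the seven-character phrase for the People's Republic of China, with the per-character codes ip, hz, rj, mn, gp, he, go, yields ihrg, matching the double-spelling line of the example.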