CN1026924C - Chinese-character sound dissection encode and input method - Google Patents



Publication number
CN1026924C
Authority
CN
China
Prior art keywords
code
chinese
initial consonant
chinese character
radical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 92113155
Other languages
Chinese (zh)
Other versions
CN1073539A (en
Inventor
叶冠卿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN 92113155
Publication of CN1073539A
Application granted
Publication of CN1026924C
Anticipated expiration
Expired - Fee Related

Abstract

The present invention relates to a pinyin-based coding system for Chinese characters and an input method. The coding system uses the 26 English letters and comprises six kinds of codes. The basic code has four code positions: an initial, a final, an initial and an initial. The first two are the initial code and final code of the whole character, and the last two are the initial codes of the head part and the tail part of the character. The head part is coded by the principle of taking the largest radical in the forward direction, and the tail part by the principle of taking the largest radical in the reverse direction. The avoidance code chooses its fourth code so that it avoids repeating the same sound. The touch-typing code assigns a special fourth code to a small number of characters. The system also provides an unfamiliar-character code, a symbol code and a phrase code. All six kinds of codes are fully compatible within the system and require no switching. The system has 5074 brevity codes. The invention is easy to learn and use, fast and unambiguous, leaves no character difficult to input, and combines easy popularization with easy touch typing.

Description

Chinese-character sound dissection encode and input method
The present invention is a phonetic coding system and input method for Chinese characters, and belongs to the field of Chinese information processing.
At present there are hundreds of Chinese character coding schemes, but only a few dozen are in common use. They fall into two broad classes: sound-based (pinyin) coding and shape-based coding. Shape-based coding relies on many artificial rules; the codes are complex and ambiguous, and non-professional typists cannot master them. Many good sound-based codes that have been published or are popular, such as "natural code", "phone input method" and "voice-literal coding", have largely solved problems such as high duplicate-code rates, heavy memorization load and difficulty of mastery. However, they contain many rules that depart from the original meaning of the characters, their choice of components easily produces ambiguity, and characters that are hard to read or hard to split are especially difficult to input.
The object of the invention is to address these inherent shortcomings of existing Chinese character codes by designing a sound dissection coding system with a low duplicate-code rate, a small memorization load, intuitive code selection with no rote memorization, no ambiguity and no character that is difficult to input, which is very easy to touch-type and very easy to popularize. It offers an effective route to the standardization of pinyin coding and to natural, simple and fast Chinese input.
The technical solution is realized as follows. In this coding, the full code of a character has four codes. The first and second codes are the initial code and final code of the whole character. The character is split into two parts, a head part and a tail part: the third code is the initial code of the head part and the fourth code is the initial code of the tail part. For a small number of characters the fourth code is the final code of the head part. All six coding methods are compatible, that is, every kind of input can be carried out in the same input state without switching, and a character can be entered in several ways. The six sound dissection codes are: the basic code, the avoidance code, the unfamiliar-character code, the touch-typing code, the phrase code and the symbol code.
Drawing: Fig. 1 is the keyboard chart for computer input with the Chinese-character sound dissection code; it is described in detail in part fourteen.
The technical scheme of the coding system and its realization are further explained below in 14 parts, with tables and examples:
One. Initial codes
The pinyin initials b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, w, x, y and z have the same form as English letters, and each is coded by the corresponding letter; y and w serve both as initials and as virtual initials. The initials ch, sh and zh are coded by the English letters i, u and v respectively. Characters without an initial fall into three groups, a, e and o, and take the corresponding letter a, e or o as a virtual initial. A virtual initial is silent and serves only as a distinguishing mark. In this way every Chinese character has an initial code, the standardization and uniformity of pinyin are further strengthened, and the ambiguity between initials and finals and the multiple spellings that arise in pinyin input are eliminated. For example, in the pinyin input scheme of the Sixth Institute of the Ministry of Machinery and Electronics Industry, a typed a stands both for the a group and for zh; the two are mixed together, so the first code is ambiguous: it may be a final or an initial. In that scheme the final ai is normally replaced by the letter l, yet for "爱" (ai, love) it is typed as a and i, while for "摘" (zhai, pluck) it is typed as l, so the final ai has two input forms, ai and l. In the present coding the first code can only be an initial code, so each final has exactly one input form. The sound codes of "爱" and "摘" are al and vl respectively, which is clear, natural and unambiguous. All initial codes are listed in Table one.
Table one: initial codes
Initial code    Meaning and initial represented
a               virtual initial of the a group, silent
e               virtual initial of the e group, silent
o               virtual initial of the o group, silent
w               initial or virtual initial of the u group, sometimes pronounced
y               initial or virtual initial of the i group and ü group, sometimes pronounced
i               initial ch
u               initial sh
v               initial zh
All other initial codes are the same as the corresponding English letters.
Two. Final codes
Pinyin has 34 finals. Apart from a, e, i, o, u and ü, each final is written with two or more letters; this coding replaces each of them with a single English letter. Since English has only 26 letters, some letters must represent several finals at once. After extensive statistics, the author placed on the same letter those finals that are least likely to cause duplicate codes or confusion, with reference to the final scheme of Liu's "diphthong coding". For ease of memorization, the coding reuses the pinyin codes of the widely used Sixth Institute scheme and arranges the remaining final codes carefully, so that ordinary operators can use them without study or deliberate memorization. The arrangement is shown in Table two.
Table two: final codes
Code  Finals        Code  Finals        Final  Code    Final  Code
a     a             n     ian           a      a       ua     x
b     iang, uang    o     o, ou         e      e       ue     w
c     ie, uai       p     iu            i      i       üe     w
d     un, ün        q     uan, üan      o      o       ui     w
e     e             r     er, ei        u      u       un     d
f     en            s     ong, iong     ü      v       uo     z
g     eng           t     in            ai     l       ang    h
h     ang           u     u             an     j       eng    g
i     i             v     ü             ao     k       ing    y
j     an            w     üe, ui        ei     r       ong    s
k     ao            x     ua, ia        en     f       ian    n
l     ai            y     ing           er     r       iao    m
m     iao           z     uo            ia     x       uai    c
                                        ie     c       uan    q
                                        in     t       iang   b
                                        iu     p       iong   s
                                        ou     o       uang   b
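The two tables together turn any pinyin syllable into a two-letter sound code. The following is a minimal sketch of that lookup, not the patent's own implementation. The function name `sound_code`, the acceptance of "v" as a keyboard stand-in for ü, and the handling of zero-initial syllables by reusing their first letter as the virtual initial are assumptions made for illustration.

```python
# Table one: ch, sh and zh are coded as i, u and v; other initials code as themselves.
DIGRAPH_INITIALS = {"ch": "i", "sh": "u", "zh": "v"}
SINGLE_INITIALS = set("bcdfghjklmnpqrstwxyz")

# Table two, final -> code. "ün" and "üan" appear only in the code -> final half.
FINAL_CODES = {
    "a": "a", "e": "e", "i": "i", "o": "o", "u": "u", "ü": "v",
    "ai": "l", "an": "j", "ao": "k", "ei": "r", "en": "f", "er": "r",
    "ia": "x", "ie": "c", "in": "t", "iu": "p", "ou": "o", "ua": "x",
    "ue": "w", "üe": "w", "ui": "w", "un": "d", "uo": "z",
    "ang": "h", "eng": "g", "ing": "y", "ong": "s", "ian": "n",
    "iao": "m", "uai": "c", "uan": "q", "iang": "b", "iong": "s",
    "uang": "b", "ün": "d", "üan": "q",
}


def sound_code(syllable: str) -> str:
    """Return the two-letter code (initial code + final code) of a pinyin syllable."""
    s = syllable.lower().replace("v", "ü")  # assumption: "v" typed in place of ü
    if s[:2] in DIGRAPH_INITIALS:
        return DIGRAPH_INITIALS[s[:2]] + FINAL_CODES[s[2:]]
    if s[0] in SINGLE_INITIALS:
        return s[0] + FINAL_CODES[s[1:]]
    # Zero-initial syllable (a, e or o group): the first letter is the silent
    # virtual initial and the whole syllable is the final.
    return s[0] + FINAL_CODES[s]


for syllable in ("ai", "zhai", "jia"):
    print(syllable, "->", sound_code(syllable))
```

With these tables, "ai" and "zhai" come out as al and vl, matching the two sound codes given in part one.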
Three. Splitting of Chinese characters
In this coding every Chinese character is split into two parts, except "一" (one) and "乙" (second), which cannot be decomposed. Characters are split according to the following six structural types. In the diagrams the numeral 1 marks the head part and the numeral 2 marks the tail part.
Slanted type
Figure 921131550_IMG3
The slanted portion is taken as the head part of the character. For example, "度" (degree), "戴" (wear) and a third character meaning "distant" take "广", "戈" and "辶" respectively as the head part.
Surround type
Figure 921131550_IMG4
The enclosing portion is taken as the head part and the enclosed portion as the tail part.
Sandwich type
Figure 921131550_IMG5
For example, "街" (street) and "衷" (inner feelings) both belong to the sandwich type: "行" and "衣" are taken as the head part, and "圭" and "中" as the tail part.
A single-component character takes its largest radical, in stroke order, as the head part and the remainder as the tail part, balancing naturalness, intuitiveness and convention. Because single-component characters are hard to decompose, the author has specially provided a "hard-to-split sound dissection code" to solve their input.
Four. Radicals of Chinese characters
The radicals of Chinese characters fall into two broad classes. One class consists of radicals that are characters in their own right; these are naturally coded by their own pinyin. The other class consists of component radicals. These evolved from ancient characters, so most of them also have a pronunciation. Modern characters differ considerably from ancient ones, however, and modern characters cannot be labeled with ancient pronunciations, so radicals can only be coded according to modern reading habits. To reduce memorization, this coding specifies very few standard radicals: only radicals that are common, frequently used and recognizable to everyone are defined as standard. Radicals that are hard to recognize or hard to pronounce are always replaced by their first or last stroke. This coding has only six basic strokes: dot, horizontal (including the horizontal hook and the rising stroke), vertical (including the vertical hook), left-falling, right-falling and turning. The coding is therefore natural, demands very little memorization, is readily accepted by operators and is easy to popularize. Some characters used as radicals are uncommon and their pronunciation is not easy to read. For this reason, besides listing the more commonly used radicals in Table three, the author has also placed all radicals in the ep group of the symbol sound dissection code. When the pronunciation of a radical cannot be determined, the user can type ep and then use the ">" key to look up its pronunciation and code.
Table three: standard radical table
The pronunciation of stem code implication pronunciation radical code implication pronunciation radical code implication
Pig word shi
Figure 921131550_IMG6
Word zhi
Dian d point dian Yan y speech yan pig u shoots a retrievable arrow word yi
The horizontal beng Xin of one h x heart xin insect without feet or legs v gold word qian
Shu l just as 1 Rolling f just as f
Figure 921131550_IMG7
Y narrow-necked earthen jar word fou
Pie p casts aside pie Quan q dog quan
Figure 921131550_IMG8
The blunt word gen of q
Dian a presses down na Zhuang j with jiang narrow-necked earthen jar f Cui word zhi
Figure 921131550_IMG9
G turns the blunt g of guai Cannibals u food shi
Woo u shows shi Cui v bamboo prefix zhu
Yi y clothing yi cutter prefix dao
The stupid v pawl of Ren Chi r people ren Fan Fan w literary composition wen pole prefix zhao
Tou e two er Cannibals j gold jin tortoise strives d Eight characters head ba
Fu Jie e ear er Si Si
Figure 921131550_IMG10
N button niu adopts and is subjected to the rich prefix feng of v
Bing l two shui Dao Bao g cutter dao and younger brother b small character head xiao
Mi Http g lid gai Yin Chuo z walks zou mould f volume prefix huan
The emerging reward of Lv c grass cao h tiger hu x spring prefix chun
The sick bing volume of Xiangxi h fire huo Epileptic b family dependant j
Safe i of three san Contraband Jiong Qian k frame kuang spring of Rui s
Five. The take-largest principle and the replacement principle
The full code has only four codes. The first and second codes are the initial and final of the whole character and are entered according to Table one and Table two. The third and fourth codes are the codes of the radicals obtained after the character is split into two parts. The splitting method and the standard radicals were introduced briefly above, but the way to extract a radical is sometimes not unique. The extraction method is therefore described as follows:
1. The take-largest principle when dividing the parts
In characters of the top-middle-bottom type and in single-component characters, the largest radical in stroke order, which may not be the character itself, is taken as the head part, and the remainder is the tail part. The largest radical is the radical that can no longer form another radical when any further stroke of the character is added to it. For example:
"等" (etc.) consists of the three radicals "竹", "土" and "寸". "竹" and "土" cannot form another radical, so the largest radical is "竹". The remaining "寺" is the largest radical of the tail part.
"辜" (guilt) consists of the four radicals "十", "口", "立" and "十". "十" and "口" form "古", and "古" cannot form another radical with the "立" below it, so "古" is the head part and the remaining "辛" is the tail part.
"晕" (swoon) consists of the three radicals "日", "冖" and "车". "日" and "冖" cannot form another radical, so "日" is the largest radical and serves as the head part. The remaining "军" is the largest radical of the tail part.
"街" (street) has "行" on the left and right and "圭" in the middle, so "行" is the head part and "圭" is the tail part.
In "乘" (ride), "千" is a radical and "禾" is also a radical, so the larger "禾" is taken as the head part and the remaining "北" as the tail part.
In "我" (I), the first stroke "丿" (pie) cannot form a standard radical with the other strokes, so the first stroke "丿" is taken as the head part and the remaining "找" as the tail part.
2. The take-largest principle within one part (replacement principle)
Many characters clearly consist of two parts. When the largest radical taken from the head part does not cover the whole head part, the coding of the head part is nevertheless regarded as complete, and the rest of the head part is not treated as a sub-part of the tail part. In other words, the first largest radical in the head part stands for the entire head part. This is the "replacement principle". For example:
"魏" (Wei) is of the left-right type. The left part consists of "禾" and "女"; by the take-largest principle "委" is taken as the head part, and the remaining "鬼" is the tail part.
"馨" (fragrant) is of the top-bottom type. The top consists of four radicals. By the take-largest principle the code of "声" is taken for the head part, and the coding of the whole head part is regarded as complete. That is, "声" replaces the entire top; the remaining "殳" of the top is not treated as a sub-part of the tail part, and the tail part is still the bottom "香".
"奥" (Austria) is of the top-bottom type. Its first stroke "丿" (pie) cannot form a radical with the other strokes, so the code of the first stroke "丿" stands for the whole top. The rest of the top is not regarded as a sub-part of the bottom, and the bottom is still "大".
"敷" (apply) is of the left-right type. The left part consists of "甫" and "方"; "甫" supplies the code of the whole left part, the remaining "方" is not treated as a sub-part of the tail part, and the tail part is still the right part "攵".
"度" (degree) is of the slanted type. The slanted portion consists of "广" and "廿"; the code of "广" is taken for the whole slanted portion, and the tail part is only "又".
"戴" (wear) is of the slanted type. The slanted portion consists of "十" and "戈"; the code of "戈" is taken for the whole slanted portion, "十" is not coded, and the tail part is still "田" and "共".
3. Forward take-largest and reverse take-largest
Sometimes the split of a character is not unique. After the head part has been taken in stroke order by the forward take-largest rule, several radicals may still remain in the character, and it is unclear which of them should supply the code of the tail part. To remove this ambiguity, the forward and reverse take-largest principles are laid down. In this coding:
(1) The code of the head part is taken by the "forward take-largest principle": starting from the first stroke in writing order, the largest radical is taken in forward order as the code of the head part. The slanted type and the surround type are exceptions.
(2) The code of the tail part is taken by the "reverse take-largest principle": starting from the last stroke in writing order and proceeding in the reverse of the writing order, the largest radical is taken as the code of the tail part.
That is, the third code of the full code is taken by the forward take-largest principle and the fourth code by the reverse take-largest principle. In this way the ambiguity of splitting and the ambiguity of code selection are both resolved. Examples:
"翰" (writing brush) is of the left-right type. The left part is the head part, and by the forward take-largest principle the "十" at its upper left supplies its code. The right part is the tail part, and by the reverse take-largest principle the "羽" at its lower right supplies its code.
"荣" (honor) is of the top-middle-bottom type. The third code of the full code is taken from the "艹" at the top by the forward take-largest principle, and the fourth code from the "木" at the bottom by the reverse take-largest principle. The "冖" in the middle is not coded, so it does not matter which part it belongs to.
"度" (degree) is of the slanted type. The third code is taken from the slanted portion "广", and the fourth code is taken from "又" by the reverse take-largest principle; the "廿" in it is not coded.
Six. Basic sound dissection code
The basic sound dissection code is the easiest of the sound dissection codes to learn and is the basis of all the others. A beginner needs no deliberate study: with a "sound dissection keyboard chart" in hand, one can start entering characters. Its full code is as follows. The first and second codes are the double-pinyin code of the character, that is, the first code is the initial code and the second code is the final code. The third code is the initial code of the head radical taken from the character by the forward take-largest principle, and the fourth code is the initial code of the tail radical taken by the reverse take-largest principle.
First code: initial code (whole character)
Second code: final code (whole character)
Third code: initial code (head part)
Fourth code: initial code (tail part)
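The four positions listed above can be assembled mechanically once the splitting has been done by hand. The sketch below assumes the caller already knows the two-letter sound code of the whole character and the pinyin readings of the head and tail radicals. The function names and the radical readings passed in the example are illustrative assumptions, not part of the patent text.

```python
# Table one: ch, sh and zh are coded as i, u and v; other initials code as themselves.
DIGRAPH_INITIALS = {"ch": "i", "sh": "u", "zh": "v"}


def initial_code(pinyin: str) -> str:
    """Initial code of a syllable: i/u/v for ch/sh/zh, otherwise its first letter."""
    p = pinyin.lower()
    return DIGRAPH_INITIALS.get(p[:2], p[0])


def basic_code(sound_code: str, head_radical: str, tail_radical: str) -> str:
    """Full basic code: sound code of the character + head initial + tail initial."""
    return sound_code + initial_code(head_radical) + initial_code(tail_radical)


# Sound code jx (initial j, final ia), head radical read "ji", tail radical read "jia".
print(basic_code("jx", "ji", "jia"))
```

Selecting the head and tail radicals (forward and reverse take-largest) is the part that needs human judgment or a decomposition table; it is deliberately left outside this sketch.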
Example:
"嘉" (jia, praise): the first code is the initial j; the second code is x, the code of the final ia; the third code is j, the initial of the radical "吉" (ji) taken by the forward take-largest principle; the fourth code is j, the initial of the radical "加" (jia) taken by the reverse take-largest principle. Its full code is jxjj.
Seven. Avoidance sound dissection code
The first three codes of the avoidance code are the same as those of the basic code; only the fourth code is modified, in order to reduce duplicate codes. When the fourth code would be the same as the first code, that is, when the initial of the radical taken by the reverse take-largest principle is the same as the initial of the character itself, the largest tail radical whose initial differs from that of the character is taken in the reverse direction instead, and its initial code is used as the fourth code. This is called the "avoidance principle". Note that only the fourth code uses the avoidance principle. The avoidance is entirely reasonable and natural, because the tail part of a character mostly serves as the phonetic component of the character. There is no need to enter both the pronunciation of the character and its phonetic component, so repeated input of same-sound codes within one character should be avoided. This is one of the basic features of the design of this coding.
First code: initial code (whole character)
Second code: final code (whole character)
Third code: initial code (head part)
Fourth code: initial code (tail part, avoiding the first code)
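The avoidance rule can be sketched as a search over tail radical candidates, ordered from the largest radical found by reverse take-largest down to its sub-radicals. This is a hedged illustration only: the candidate list must be supplied by the caller, the function names are hypothetical, and the fallback used when every candidate collides with the first code is an assumption the text does not address.

```python
# Table one: ch, sh and zh are coded as i, u and v; other initials code as themselves.
DIGRAPH_INITIALS = {"ch": "i", "sh": "u", "zh": "v"}


def initial_code(pinyin: str) -> str:
    """Initial code of a syllable: i/u/v for ch/sh/zh, otherwise its first letter."""
    p = pinyin.lower()
    return DIGRAPH_INITIALS.get(p[:2], p[0])


def avoidance_code(sound_code: str, head_radical: str, tail_candidates: list) -> str:
    """Basic code whose fourth position skips tail radicals that repeat the first code.

    tail_candidates: pinyin readings of tail radicals, largest first.
    """
    first = sound_code[0]
    tail_codes = [initial_code(r) for r in tail_candidates]
    # Assumption: if no candidate differs from the first code, keep the largest one.
    fourth = next((c for c in tail_codes if c != first), tail_codes[0])
    return sound_code + initial_code(head_radical) + fourth


# Tail radical read "jia" repeats the first code j, so its sub-radical "kou" is used.
print(avoidance_code("jx", "ji", ["jia", "kou"]))
```

When the largest tail radical already differs from the first code, the result is identical to the basic code, which is consistent with the statement that only the fourth code is affected.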
Example:
"嘉" (jia, praise): the first code is j, the initial of "嘉" itself. The fourth code was originally j, the initial of the radical "加" (jia), but "嘉" and "加" have the same sound, so it should be avoided. The initial k of "口" (kou), the largest sub-radical of "加" in the reverse direction, is taken instead, changing the fourth code from j to k. The full code of "嘉" is now jxjk.
Eight. Unfamiliar-character sound dissection code
In this coding the unfamiliar-character code is divided into three kinds: hard to read, hard to split, and both hard to read and hard to split.
1. Hard-to-read sound dissection code
A hard-to-read character is one that ordinary people do not recognize and whose pronunciation cannot be determined from its radicals either. Uncommon characters that can be read by pronouncing one half of the character are not counted as hard to read.
The GB level-1 and level-2 character sets, especially level 2, contain many characters that ordinary people do not recognize and whose pronunciation is hard to determine; they account for more than one third of the 6763 characters. Because their sound codes cannot be determined, the first and second codes cannot be entered when the two preceding codes are used. Such characters can only be found by fuzzy input with the wildcard key, searching the whole character set mixed together with common characters, so the duplicate-code rate is very high. A single code may match more than 100 characters, which must be scanned by eye, page by page, with the page key. This is both time-consuming and laborious. It is therefore necessary to code hard-to-read characters separately. Hard-to-read characters also keep the ordinary codes, for the use of people who do recognize them. In this coding, the codes of unfamiliar characters are all placed in the o group. That is:
Hard-to-read sound dissection code
First code: o
Second code: initial code (radical of the head part taken by the forward take-largest principle)
Third code: initial code (radical of the tail part taken by the reverse take-largest principle)
Fourth code: final code (radical of the tail part taken by the reverse take-largest principle)
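The hard-to-read layout above uses only the readings of the two radicals, never the reading of the character itself. A minimal sketch follows, assuming the final table of Table two and hypothetical function names; the radical readings in the example ("cao" for the grass radical, "guan" for the tail radical) are supplied by the caller.

```python
# Table one: ch, sh and zh are coded as i, u and v; other initials code as themselves.
DIGRAPH_INITIALS = {"ch": "i", "sh": "u", "zh": "v"}
SINGLE_INITIALS = set("bcdfghjklmnpqrstwxyz")

# Table two, final -> code.
FINAL_CODES = {
    "a": "a", "e": "e", "i": "i", "o": "o", "u": "u", "ü": "v",
    "ai": "l", "an": "j", "ao": "k", "ei": "r", "en": "f", "er": "r",
    "ia": "x", "ie": "c", "in": "t", "iu": "p", "ou": "o", "ua": "x",
    "ue": "w", "üe": "w", "ui": "w", "un": "d", "uo": "z",
    "ang": "h", "eng": "g", "ing": "y", "ong": "s", "ian": "n",
    "iao": "m", "uai": "c", "uan": "q", "iang": "b", "iong": "s",
    "uang": "b", "ün": "d", "üan": "q",
}


def split_syllable(pinyin: str):
    """Return (initial code, final) of a pinyin syllable."""
    p = pinyin.lower()
    if p[:2] in DIGRAPH_INITIALS:
        return DIGRAPH_INITIALS[p[:2]], p[2:]
    if p[0] in SINGLE_INITIALS:
        return p[0], p[1:]
    return p[0], p  # zero-initial syllable: virtual initial, whole syllable as final


def hard_to_read_code(head_radical: str, tail_radical: str) -> str:
    """o + head initial code + tail initial code + tail final code."""
    head_initial, _ = split_syllable(head_radical)
    tail_initial, tail_final = split_syllable(tail_radical)
    return "o" + head_initial + tail_initial + FINAL_CODES[tail_final]


print(hard_to_read_code("cao", "guan"))
```

Because the leading o is a virtual initial for the o group, these codes share a namespace with ordinary o-group syllables; the text relies on the remaining three positions to keep them apart.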
Example:
"菅" (villous themeda): most people do not recognize it and cannot read it from one half of the character, so its sound code cannot be determined and it should be entered with the hard-to-read code. The first code is the letter o, the second code is c, the initial code of "艹" (cao), the third code is g, the initial code of "官" (guan), and the fourth code is q, the final code of "官". Its full hard-to-read code is ocgq. "菅" also has a basic code, whose full form is jncg.
2. Hard-to-split sound dissection code
Some characters, especially single-component characters, are hard to split or can be split in more than one way. A hard-to-split sound dissection code is therefore provided to set them apart from ordinary characters.
(1) Code for a hard-to-split head part, intended mainly for single-component characters that are easy to recognize
First code: initial code (whole character)
Second code: final code (whole character)
Third code: o
Fourth code: initial code (initial code of an obvious radical in the character)
Example:
"必" (bi, must): most people know how "必" is pronounced, so the first two codes of its full code, bi, are easy to enter. Its split is not easy to see, however, so the hard-to-split code can be used: when the letter o is entered as the third code, "必" appears in the prompt line. The fourth code can also be entered as x, the initial code of "心" (xin).
(2) Code for a hard-to-split tail part
First code: initial code (whole character)
Second code: final code (whole character)
Third code: initial code (head part)
Fourth code: o
Example:
"痹" (bi, numbness): the first two codes are bi, and the third code is b, the initial code of the head part "疒" (bing). By the basic code the fourth code would be b, the initial code of "畀" (bi), but this is the same as the first code, so the avoidance code would apply. Because the lower part of "畀" is not a standard radical, taking a code from it easily raises doubt. The code for a hard-to-split tail part can therefore be used for the fourth code, that is, the fourth code is entered as the letter o.
(3) Code for characters that are both hard to recognize and hard to split, intended mainly for single-component characters that are not easy to recognize
First code: o
Second code: o
Third code: o
Fourth code: initial code (initial code of an obvious radical in the character)
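The three hard-to-split layouts differ only in where the letter o is placed. A small sketch under stated assumptions: the function names are hypothetical, the two-letter sound code of the whole character is supplied by the caller, and the radical readings in the example are illustrative inputs.

```python
# Table one: ch, sh and zh are coded as i, u and v; other initials code as themselves.
DIGRAPH_INITIALS = {"ch": "i", "sh": "u", "zh": "v"}


def initial_code(pinyin: str) -> str:
    """Initial code of a syllable: i/u/v for ch/sh/zh, otherwise its first letter."""
    p = pinyin.lower()
    return DIGRAPH_INITIALS.get(p[:2], p[0])


def head_hard_code(sound_code: str, obvious_radical: str) -> str:
    """Layout (1): sound code + o + initial code of an obvious radical."""
    return sound_code + "o" + initial_code(obvious_radical)


def tail_hard_code(sound_code: str, head_radical: str) -> str:
    """Layout (2): sound code + initial code of the head part + o."""
    return sound_code + initial_code(head_radical) + "o"


def unknown_hard_code(obvious_radical: str) -> str:
    """Layout (3): o, o, o + initial code of an obvious radical."""
    return "ooo" + initial_code(obvious_radical)


print(head_hard_code("bi", "xin"))   # sound code bi, obvious radical read "xin"
print(tail_hard_code("bi", "bing"))  # sound code bi, head radical read "bing"
print(unknown_hard_code("chuan"))    # obvious radical read "chuan" (initial ch -> i)
```

Since the fourth code is optional in practice (the character already appears in the prompt line after the o is typed), a real input method would also accept the three-letter prefixes of these codes.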
Example:
"卅" (thirty): most people do not readily recognize it and it is hard to split, so it can be entered with the code for characters that are both hard to recognize and hard to split. After three letters o are entered, "卅" appears in the prompt line. It can then be selected with a digit key, or its fourth code i, the initial code of "川" (chuan), can be entered. "卅" can also be entered by the basic code as saih.
Nine. Input of Chinese characters and their brevity codes
The preceding three parts introduced three of the coding methods in this system, and the reader now has a general understanding of it. The input of Chinese characters can now be described.
1. Input of Chinese characters
In this coding, when the first code of a character is entered, ten high-frequency characters appear in the prompt line, each followed by its next code. To enter one of them, type the corresponding digit; for the first character, the space bar may be typed instead of the digit 1. If you do not wish to enter the second code, you can search with the page key ">". After the second code is entered, the two-code brevity characters and other high-frequency characters and phrases appear in the prompt line; if the third code cannot be determined, search with the page key. After the third code is entered, the three-code brevity characters and high-frequency characters and phrases appear; if the fourth code cannot be determined, use the page key. After the fourth code is entered, all characters and phrases that share the code appear in the prompt line. The coding has two special keys and one special code:
> Page key, used for cyclic search.
Wildcard (replace) key, used for fuzzy input; it stands for any code.
o Unfamiliar-character code. For a hard-to-read character the first code is o; for a single-component character the third code may be o; and the fourth code may also be o when it is hard to determine. The o code differs from the wildcard key as follows: in the third position, for example, the wildcard key finds every character whose first, second and fourth codes match, whereas the o code finds only those characters whose third code is hard to determine. The o code is a special code within the character code; it is only a learning key and is not itself the code of a character.
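The prompt-line behavior described above can be pictured as a prefix lookup in a single code table, in which one character may be registered under several codes. The sketch below is an assumption-laden illustration, not the patent's data structure. The table layout, the names `register` and `candidates`, and the ordering (exact matches first, then shorter codes) are all hypothetical; the characters 嘉 and 菅 stand for the examples glossed "praising" and "villous themeda", with the codes given for them in parts six to eight.

```python
from collections import defaultdict

# One shared code table: code -> characters carrying that code.
CODE_TABLE = defaultdict(list)


def register(char: str, *codes: str) -> None:
    """Store one character under every code that may produce it."""
    for code in codes:
        if char not in CODE_TABLE[code]:
            CODE_TABLE[code].append(char)


def candidates(typed: str) -> list:
    """Characters whose codes start with the typed prefix, exact matches first."""
    hits = []
    ordered = sorted(CODE_TABLE, key=lambda c: (c != typed, len(c), c))
    for code in ordered:
        if code.startswith(typed):
            for char in CODE_TABLE[code]:
                if char not in hits:
                    hits.append(char)
    return hits


register("嘉", "jxjj", "jxjk")  # basic code and avoidance code
register("菅", "jncg", "ocgq")  # basic code and hard-to-read code

print(candidates("jx"))
print(candidates("o"))
```

A real prompt line would rank candidates by character frequency and show ten per page; frequency data is not given in the text, so the ordering here is purely alphabetical within each code length.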
It should be noted that in this system the six kinds of codes coexist. The same character can be entered by the basic code, by the avoidance code or by the unfamiliar-character code, can even be entered by the touch-typing code, and can of course also be entered by a brevity code. One character has several codes, and the codes neither contradict nor conflict with one another; they differ only in the number of duplicate codes and in how easy they are to enter. Every input method yields the same character. That is, the same character has several codes, and all of them are stored in the same code table. For example:
"俐" (clever) is entered as lirl by the basic code, which has three duplicates and requires selection from the prompt line. By the avoidance code it is entered as lird, which has no duplicates and requires no selection. It has no unfamiliar-character code and no brevity code, and its touch-typing code is the same as its avoidance code.
"廿" (twenty) is entered as nnch by the basic code, which has one duplicate. It is entered as ochg by the hard-to-read code and as oooc by the code for characters that are both hard to recognize and hard to split. Its avoidance code and touch-typing code are the same as its basic code. It has no brevity code.
"牛" (ox) is entered as nppu by the basic code; its brevity code is np, and its hard-to-split code is npou.
2. Brevity code input
A brevity code is a shortened form of the full code: for common characters there is no need to type all four codes. For this reason, Chinese character codings generally provide first-, second-, and third-level brevity codes. The most common characters are represented by the first code alone: enter the first code and press the space bar. This is the first-level brevity code. Somewhat less common characters are represented by the first and second codes: enter the first two codes of the character and press the space bar. This is the second-level brevity code. Third-level brevity codes are entered in the same way. A brevity-code character is always the first character shown in the prompt line, so nothing needs to be memorized; just note that wherever typing the digit 1 would enter the character, the space bar can be used instead. This coding system has 26 first-level, 421 second-level, and 4627 third-level brevity codes, 5074 in total. In addition, the GB level-1 and level-2 character set contains 50 radicals, which should not be counted as Chinese characters. The actual number of Chinese characters is therefore:
6763-50=6713.
Most characters can be entered with brevity codes, which greatly increases input speed. The first- and second-level brevity code tables of this system are listed below. The second-level brevity codes of the unfamiliar-character code are not listed in the second-level table.
Table four: brevity code statistics
                 First level   Second level   Third level   Total
Basic code            26           396           3880        4302
Hard to read           0            25            518         543
Hard to split          0             0            229         229
Total                 26           421           4627        5074
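The totals in Table four, and the character count 6763 - 50 = 6713 given before it, can be checked with a short script. This is only a consistency check on the figures quoted in this document; the variable names are chosen here for illustration.

```python
# Rows of Table four: (first-level, second-level, third-level) brevity codes.
BREVITY = {
    "basic code": (26, 396, 3880),
    "hard to read": (0, 25, 518),
    "hard to split": (0, 0, 229),
}

# Sum each row, then each column, then everything.
row_totals = {name: sum(levels) for name, levels in BREVITY.items()}
level_totals = tuple(sum(column) for column in zip(*BREVITY.values()))
grand_total = sum(level_totals)

GB_ENTRIES = 6763   # entries in the GB level-1 and level-2 character set
GB_RADICALS = 50    # radicals that are not counted as characters
characters = GB_ENTRIES - GB_RADICALS

print(row_totals)
print(level_totals, grand_total)
print(characters)
```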
Table five: first-level brevity code table
Not time two non-and go out and can not have
a b c d e f g h i j k l m
Your Europe sheet seven people three he be I little one
n o p q r s t u v w x y z
Ten, Analysis of duplicate codes
1. Theoretical analysis
In mathematical terms, with letter codes there are 26 one-letter codes, 26 × 26 = 676 two-letter codes, and 26 × 26 × 26 = 17576 three-letter codes. The GB level-1 and level-2 character set has only 6763 entries, so three letters already satisfy its coding needs. Four-letter codes number 26 × 26 × 26 × 26 = 456976, more than 450,000, an astronomical figure compared with the number of Chinese characters. In principle, encoding some 6,000 characters with 450,000 codes should leave ample room, and no duplicate codes should occur. This is only a theoretical expectation, however. In practice, whatever rules are used, as long as the rules are systematic and no special provisions are made for a few or a few dozen individual characters, encoding more than 6,000 characters with four-letter codes will produce duplicate codes. In the GB level-1 and level-2 character set, the actual number of Chinese characters is 6713.
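The size of the code space for each code length follows directly from the 26-letter alphabet. A minimal sketch, with function names chosen here for illustration:

```python
ALPHABET = 26          # letter keys available for coding
GB_CHARACTERS = 6713   # characters in the GB level-1 and level-2 set


def code_space(length: int) -> int:
    """Number of distinct letter codes of exactly the given length."""
    return ALPHABET ** length


def shortest_sufficient_length(n_chars: int) -> int:
    """Smallest code length whose code space can hold n_chars distinct codes."""
    length = 1
    while code_space(length) < n_chars:
        length += 1
    return length


for length in range(1, 5):
    print(length, code_space(length))
print(shortest_sufficient_length(GB_CHARACTERS))
```

The calculation only bounds what is possible; as the text notes, a rule-based coding does not spread characters evenly over the code space, so duplicates still occur.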
2. Duplicate codes in the basic sound-dissection coding
In this coding, the basic sound-dissection coding has 26 one-letter codes, 396 two-letter codes, 3880 three-letter codes, and 5885 four-letter codes. The first-, second-, and third-level brevity codes cover 4302 characters, whose fourth code need not be entered. A fairly common character whose fourth code is the same as that of a brevity-code character is therefore placed at position 1 and entered with the space bar, and the brevity-code character is placed after it. With this arrangement, a further 550 duplicate-code characters can be entered without a selection key, so more than 5885 + 550 = 6435 characters can be entered without selection. Only 6713 - 6435 = 278 characters need selection, and most of them are hard-to-read or hard-to-split characters. The only drawback is that many characters need a fifth keystroke, the space bar.
3. Duplicate codes in the avoidance sound-dissection coding
In this coding, there are 6487 four-letter avoidance codes; that is, these 6487 characters can be entered uniquely, without selection. When the 4302 brevity codes are added, and characters whose fourth code is the same as that of a brevity-code character are placed at position 1 with the brevity-code character after them, more than 6670 characters can be entered without selection. Only a few dozen characters need selection. If the unfamiliar-character code is added as well, almost no characters need selection. The requirements of touch typing are fully met.
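The kind of count used in this analysis, how many characters have a code to themselves and how many share one, can be sketched as follows. The tables are toy data: the touch-typing codes fhyj, fhyw, and fhyu are quoted in the touch-typing section of this document, while the shared code fhyf is an assumption derived here by applying the four-code formula to the same three characters. The function name is hypothetical.

```python
from collections import Counter


def duplicate_stats(char_to_code: dict) -> tuple:
    """Return (characters with a unique code, characters sharing a code)."""
    counts = Counter(char_to_code.values())
    unique = sum(1 for code in char_to_code.values() if counts[code] == 1)
    return unique, len(char_to_code) - unique


# Toy tables; English glosses stand in for the Chinese characters.
# "fhyf" is an assumed basic code, derived for illustration only.
basic = {"visit": "fhyf", "fat": "fhyf", "triangular bream": "fhyf", "ox": "nppu"}
touch = {"visit": "fhyj", "fat": "fhyw", "triangular bream": "fhyu", "ox": "nppu"}

print(duplicate_stats(basic))
print(duplicate_stats(touch))
```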
11, Touch-typing sound-dissection coding
The touch-typing sound-dissection coding is a further refinement of the avoidance sound-dissection coding. Again, only the fourth code is changed, and the purpose is likewise to eliminate cases in which one code stands for several characters.
According to the statistics and analysis above, the avoidance code fully meets the requirements of fast input, but nearly 300 characters still need a fifth keystroke (the space bar or a digit selection). For this reason, this coding also specifies a complete touch-typing sound-dissection coding that eliminates duplicate-code characters, so that every character can be entered in four keystrokes without selection. Correspondingly, the amount to be memorized increases. The touch-typing sound-dissection coding has two rules in all, both established for duplicate-code characters:
1. For several characters whose tails are identical, the final code of the character's radical is taken as the fourth code instead; this is called the "displacement principle".
First code: initial code (whole character)
Second code: final code (whole character)
Third code: initial code (head)
Fourth code: final code (head)
For example:
The radicals of the three characters "visit", "fat", and "triangular bream", namely "Yan", "moon", and "fish", are pronounced yan, yue, and yu respectively, so their initials are all y, and the right-hand part of all three characters is the radical "side". If we only considered how to distinguish them by the right-hand radical, we could only lay down arbitrary rules: for the fourth code, for instance, "visit" would take the dot "Dian", "triangular bream" would take the horizontal stroke, and "fat" would take "side". Such forced rules are hard to remember, so the distinction must be made through the left-hand radical. Many codings, such as "natural code" and "Li's coding", place radicals that share an initial, and therefore easily cause duplicate codes, on different keys. Duplicate codes then largely disappear, but the radicals lose their connection with their pronunciation, the amount to be memorized for the third code increases, and the arrangement runs against people's habits. The author considers this approach effective but far from ideal. The author has found that, to distinguish these characters, it is enough to enter the final of the radical as the fourth code while the third code remains the initial of the radical. That is, the first and second codes are the initial and final of the character itself, and the third and fourth codes are the initial and final of its radical: the "double-sound principle" or "displacement principle". By this principle, the full codes of the three characters are:
"visit": fhyj, the codes of the pinyin fang and yan
"fat": fhyw, the codes of the pinyin fang and yue
"triangular bream": fhyu, the codes of the pinyin fang and yu
2. For several characters whose tails are not entirely identical, the fourth code is changed to the initial code of a sub-part of the tail; this is called the "different-character avoidance principle".
First code: initial code (whole character)
Second code: final code (whole character)
Third code: initial code (head)
Fourth code: initial code (tail, avoiding the duplicate-code characters)
Once the fourth code has been changed by these two principles, no duplicate codes remain.
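Rule 1 above (the displacement principle) can be sketched in Python. The key assignments are transcribed from the correspondence tables in the claims of this document; writing the final ü as "v" and treating the whole spelling of a syllable without an initial as its final are assumptions of this sketch, and the function names are hypothetical.

```python
# Key assignments transcribed from the claims of this document.
INITIAL_KEYS = {"zh": "v", "ch": "i", "sh": "u"}
SINGLE_INITIALS = set("bpmfdtnlgkhjqxzcsrwy")
FINAL_KEYS = {
    "a": "a", "iang": "b", "uang": "b", "ie": "c", "uai": "c", "un": "d",
    "e": "e", "en": "f", "eng": "g", "ang": "h", "i": "i", "an": "j",
    "ao": "k", "ai": "l", "iao": "m", "ian": "n", "o": "o", "ou": "o",
    "iu": "p", "uan": "q", "er": "r", "ei": "r", "ong": "s", "iong": "s",
    "in": "t", "u": "u", "v": "v", "ue": "w", "ui": "w", "ua": "x",
    "ia": "x", "ing": "y", "uo": "z",   # "v" stands for the final ü (assumed)
}


def syllable_code(pinyin: str) -> str:
    """Return the two-letter (initial code, final code) of one syllable."""
    if pinyin[:2] in INITIAL_KEYS:
        initial, final = INITIAL_KEYS[pinyin[:2]], pinyin[2:]
    elif pinyin[0] in SINGLE_INITIALS:
        initial, final = pinyin[0], pinyin[1:]
    else:
        # No initial: the first letter is the virtual initial and the whole
        # spelling is looked up as the final (an assumption of this sketch).
        initial, final = pinyin[0], pinyin
    return initial + FINAL_KEYS[final]


def displacement_code(char_pinyin: str, radical_pinyin: str) -> str:
    """First and second codes from the character, third and fourth from its radical."""
    return syllable_code(char_pinyin) + syllable_code(radical_pinyin)


for radical in ("yan", "yue", "yu"):
    print(displacement_code("fang", radical))
```

Only the finals listed in the claims are covered; a spelling outside that table raises a `KeyError` rather than guessing a key.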
12, Phrase sound-dissection coding
In this coding, phrases and single characters are entered together, with no need to press a key that marks a phrase: once the code of a phrase has been entered, the phrase appears in the prompt line. If the phrase does not exist, pressing Alt+Space after entering its code switches to the phrase-creation state; the characters are then entered one by one, and a space indicates that the phrase is complete. The phrase is stored in the code table and appears in the text and in the prompt line. The phrase sound-dissection codes are:
1. Two-character phrases:
(1) the initials of the two characters;
(2) the initials and finals of the two characters.
2. Three-character phrases:
(1) the initials of the three characters;
(2) the initials of the three characters followed by the final of the last character.
3. Phrases of four or more characters:
the initials of the first three characters followed by the initial of the last character.
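The phrase rules above can be sketched as a small function. It takes the two-letter code (initial code plus final code) of each character and returns the phrase codes. The order "initial, final, initial, final" for the four-letter code of a two-character phrase is an assumption of this sketch, the function name is hypothetical, and the sample phrases are arbitrary combinations of syllable codes that appear in this document (fh, np, li, nn), not real phrases.

```python
def phrase_codes(syllables: list) -> list:
    """Return the possible codes of a phrase.

    syllables holds one two-letter (initial code, final code) string per
    character of the phrase, in order.
    """
    if len(syllables) < 2:
        raise ValueError("a phrase has at least two characters")
    initials = "".join(code[0] for code in syllables)
    if len(syllables) == 2:
        # (1) two initials; (2) initial and final of both characters.
        return [initials, "".join(syllables)]
    if len(syllables) == 3:
        # (1) three initials; (2) three initials plus final of the last character.
        return [initials, initials + syllables[-1][1]]
    # Four or more: initials of the first three characters plus the last initial.
    return [initials[:3] + initials[-1]]


print(phrase_codes(["fh", "np"]))
print(phrase_codes(["fh", "np", "li"]))
print(phrase_codes(["fh", "np", "li", "nn", "fh"]))
```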
13, Sound-dissection coding of symbols
In Chinese information processing, symbols of many kinds inevitably occur in large numbers. Chinese punctuation marks in particular appear in every line and every sentence. English punctuation marks differ considerably from Chinese ones and occupy only one character cell, which easily makes a Chinese text non-standard, so Chinese punctuation should be entered when entering Chinese text. For this reason, this coding provides for "automatic entry into the Chinese punctuation state": on entering the Chinese input state, the system also enters the Chinese punctuation state, and a Chinese punctuation mark is entered simply by typing the corresponding English punctuation mark, with no switching. To enter English punctuation, however, one must switch to the English state. The correspondence between the two sets of punctuation marks is:
Chinese punctuation: ,; : “ ” , ? 《 》
English punctuation: ,/; .: " ",? []
This coding provides dedicated codes for Chinese radicals, which are placed in the ep section, that is:
First code: e
Second code: p
Third code: initial code (of the radical's pronunciation)
Fourth code: final code (of the radical's pronunciation)
The other symbols are likewise placed in the sub-sections of the e section, as shown in the following table:
Table seven: symbol coding table
Section   Symbols and their third and fourth codes
ep        Radicals of Chinese characters; the third and fourth codes are the pronunciation of the radical
          Punctuation; the Chinese punctuation state is entered automatically
et        Graphic symbols
ey        Chinese phonetic alphabet; the last two codes are the pronunciation
eu        Numeric symbols; the last two codes are the pronunciation
eb        Tabulation symbols
ej        Japanese katakana; the last two codes are the pronunciation
ek        Hiragana; the last two codes are the pronunciation
el        Russian letters; the last two codes are the pronunciation
ex        Greek letters; the last two codes are the pronunciation
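The section prefixes of Table seven amount to a lookup on the first two codes. A minimal sketch follows; the function name is hypothetical, and the sample code epyj is an assumption composed here from the ep section and the syllable code yj (yan), not a code quoted in the source.

```python
# Section prefixes from Table seven.
SYMBOL_SECTIONS = {
    "ep": "radicals of Chinese characters",
    "et": "graphic symbols",
    "ey": "Chinese phonetic alphabet",
    "eu": "numeric symbols",
    "eb": "tabulation symbols",
    "ej": "Japanese katakana",
    "ek": "hiragana",
    "el": "Russian letters",
    "ex": "Greek letters",
}


def symbol_section(code: str):
    """Return the symbol section a code belongs to, or None if it has none."""
    return SYMBOL_SECTIONS.get(code[:2])


print(symbol_section("epyj"))   # assumed sample code in the ep section
print(symbol_section("fhyj"))   # a character code, so no symbol section
```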
14, Features of the sound-dissection code and keyboard design
Advantages and positive effects of the present invention:
1 The full code has only four codes.
2 There are few artificial rules and nothing to memorize by rote; the code can be read off from the character.
3 There are many brevity codes: 26 first-level, 421 second-level, and 4627 third-level brevity codes.
4 Because virtual initials are adopted, the ambiguity between initial codes and final codes is eliminated.
5 Because the forward-largest and reverse-largest principles are adopted, the ambiguity of splitting a character is eliminated.
6 There are no hard-to-enter characters: the unfamiliar-character code solves the entry of characters that are hard to read, hard to split, or both hard to recognize and hard to split.
8 One character has several codes; the same character can be entered in several ways.
9 Because principles such as avoidance are adopted, few characters share a code, which makes touch typing very easy.
In this coding, the initial codes and final codes are designed for the QWERTY keyboard. So that the keyboard is easy to remember, easy to use, and produces few duplicate codes, its design has the following features:
1. The finals ai, an, ao, en, ang, eng, ing, ong and the initials sh, ch follow entirely the design scheme of the Sixth Institute of the Ministry of Machinery and Electronics Industry, except that the initial zh is replaced by v.
2. The finals iao, ian, and iang and uang are placed at the lower left of ao, an, and ang respectively. The finals are divided into six groups: (ing, ing), (iu, iou, o, ou), (e, er, ei), (uo, ua, uao, ia, ie), (van, uan, ve, ui), (vn, un, en). Within each group the finals are close in pronunciation and in letter shape, which makes them easy to remember.
3. The main radicals are all placed on the key of their corresponding initial code, so that they are easy to master.

Claims (2)

1. A Chinese-character sound-dissection code computer input keyboard, characterized in that: the English letter keys serve as the codes of Chinese characters; when the first code of a character is entered, the letter key represents the initial of the character; when the second code is entered, the letter key represents the final of the character; when the third code is entered, the letter key represents the initial of the pronunciation of the character's head radical; when the fourth code is entered, the letter key represents the initial of the pronunciation of the character's tail radical;
A. The correspondence between the English letter keys and the pinyin initials is as follows:
The pinyin initials b, p, m, f, d, t, n, l, g, k, h, j, q, x, z, c, s, r, w, y have the same form as English letters; their initial codes are the corresponding letter keys;
The initials zh, ch, sh, which are longer than one letter, are replaced by the letter keys v, i, u respectively;
Characters without an initial fall into three groups, a, e, and o, which take a, e, and o respectively as their virtual initials,
B. The correspondence between the English letter keys and the pinyin finals is as follows:
a-a; b-iang, uang; c-ie, uai; d-un; e-e; f-en; g-eng; h-ang; i-i; j-an; k-ao; l-ai; m-iao; n-ian; o-o, ou; p-iu; q-uan; r-er, ei; s-ong, iong; t-in; u-u; v-ü; w-ue, ui; x-ua, ia; y-ing; z-uo
C. When the radical of a character is itself a character, it is represented by the letter key of its initial; when the radical is a basic stroke or a component radical, it is represented by the initial key of its customary pronunciation.
2. A Chinese-character sound-dissection code computer input method, based on double pinyin and using the 26 English letter keys to encode and enter Chinese characters, characterized in that:
A. Every Chinese character is composed of the four "sound is several" codes:
Full code = whole-character initial code + whole-character final code + head initial code + tail initial code
The initial code is represented by one letter; the initials zh, ch, sh of the pinyin scheme are each replaced by one English letter key; a character without an initial takes the first letter of its spelling as its virtual initial; the other initial codes are the same as in the spelling,
The final code is represented by one letter; a final spelled with more than one letter is replaced by a single letter; the final ü is represented by the letter v; characters without an initial use the same final codes as ordinary characters,
The input method is:
First enter the initial code and final code of the whole character as the first and second codes; then split the character into a head part and a tail part and take the initial codes of their pronunciations as the third and fourth codes,
The head is split by the "forward-largest" principle, that is: starting from the first stroke in writing order, take as many strokes as possible, without taking the whole character, to form the largest radical; radicals include radicals that are themselves characters, standard radicals, and basic strokes; any Chinese character contained in the character may be regarded as a radical,
The tail is coded in reverse, in two ways:
1). The tail is split by the "reverse-largest" principle, that is: starting from the last stroke in writing order, take as many strokes as possible in reverse, without overlapping the head, to form the largest radical; the character code formed in this way is the "basic code",
2). When the tail initial code is the same as the whole-character initial code, the code is taken by the "same-sound avoidance" principle, that is: a sub-part of the tail whose sound differs from that of the character is extracted from the tail by the reverse-largest principle and serves as the code source of the tail; the character code formed in this way is the "avoidance code"; when the tail has the same sound as the whole character, avoidance applies, and otherwise it does not; the avoidance code coexists with the basic code, and both occupy the fourth code; from the standpoint of the sound source, Chinese characters are divided into two classes, those that contain a sound-indicating component and those that do not; but when the head has the same initial code as the whole character, the same-sound avoidance principle is not applied;
B. A code provided specifically for characters that are hard to recognize, and for extended characters outside the GB level-1 and level-2 character set, called the "hard-to-recognize code",
Full code = letter o + head initial code + tail initial code + tail final code
where the initial codes, the final codes, and the splitting of the character are the same as in A,
Hard-to-read characters within the GB level-1 and level-2 character set have a basic code in addition to the hard-to-recognize code; from the standpoint of literacy, this special code divides Chinese characters into two classes, easy to recognize and hard to recognize, and it is applicable to all sound-based character codings;
C. A code provided specifically for characters that are hard to split, mainly single-component characters, called the "hard-to-split code",
Full code = whole-character initial code + whole-character final code + letter o + initial of the obvious radical in the character
where the initial code and the final code are the same as in A,
Characters that are easy to split are mainly compound characters and have no hard-to-split code; hard-to-split characters have a basic code in addition to the hard-to-split code,
For characters that are both hard to recognize and hard to split, the initial code and final code are replaced by the letters oo, so that the first three codes are ooo, and the fourth code is the same as in the hard-to-split code,
From the standpoint of glyph structure, this special code divides Chinese characters into two classes, easy to split and hard to split, and it is applicable to all sound-based character codings.
CN 92113155 1992-11-13 1992-11-13 Chinese-character sound dissection encode and input method Expired - Fee Related CN1026924C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 92113155 CN1026924C (en) 1992-11-13 1992-11-13 Chinese-character sound dissection encode and input method

Publications (2)

Publication Number Publication Date
CN1073539A CN1073539A (en) 1993-06-23
CN1026924C true CN1026924C (en) 1994-12-07

Family

ID=4946268

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 92113155 Expired - Fee Related CN1026924C (en) 1992-11-13 1992-11-13 Chinese-character sound dissection encode and input method

Country Status (1)

Country Link
CN (1) CN1026924C (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1047675C (en) * 1993-08-28 1999-12-22 陈光宇 Audio code and its Chinese character inputting key board
CN1069420C (en) * 1995-05-26 2001-08-08 戴石灵 Method for inputting Chinese characters by using their pronunciations and shapes

Also Published As

Publication number Publication date
CN1073539A (en) 1993-06-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee