CN1073539A - Chinese-character sound dissection encode and input method - Google Patents


Info

Publication number
CN1073539A
Authority
CN
China
Prior art keywords
code
initial consonant
yard
chinese
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 92113155
Other languages
Chinese (zh)
Other versions
CN1026924C (en
Inventor
叶冠卿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN 92113155 priority Critical patent/CN1026924C/en
Publication of CN1073539A publication Critical patent/CN1073539A/en
Application granted granted Critical
Publication of CN1026924C publication Critical patent/CN1026924C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The present invention is a phonetic coding system and input method for Chinese characters. It uses 26 letter codes and comprises six kinds of coding. The basic code has four codes per character: the first two are the initial consonant code and the final code of the character's pronunciation, and the last two are the initial consonant codes of the character's stem and tail parts. The stem code is taken by the forward take-the-largest principle and the tail code by the reverse take-the-largest principle. The avoidance code applies a principle of avoidance to the fourth code when it would repeat the sound of the character itself. The touch-typing code assigns a special fourth code to a small number of characters. Separate unfamiliar-character, symbol, and phrase codings are also provided. The six kinds of coding are fully compatible within the system and require no switching. There are 5074 brevity codes. The system is simple, easy to use, fast, and unambiguous, leaves no character difficult to enter, and combines easy popularization with easy touch typing.

Description

Chinese-character sound dissection encode and input method
The present invention is a phonetic coding system and input method for Chinese characters and belongs to the field of Chinese information processing.
At present there are hundreds of Chinese-character coding schemes, but only a few dozen are in common use. They fall into two broad classes: phonetic coding and shape-based coding. Shape-based coding has many artificial rules, is complicated, and is prone to ambiguity, so non-professional typists cannot master it. Many good phonetic codings that have been published or are popular, such as "natural code", "phone input method", and "voice-literal coding", have largely solved the problems of a high duplicate-code rate, a heavy memorization load, and difficulty of mastery. However, they contain many rules that depart from the original sense of the characters, their choice of components easily produces ambiguity, and characters that are hard to read or hard to split are especially difficult to enter.
The object of the invention is to address the inherent shortcomings of existing Chinese-character codings by providing a sound-dissection coding system that has a low duplicate-code rate and a small memorization load, takes codes intuitively, requires no heavy memorization, is unambiguous, leaves no character difficult to enter, and is very easy to touch-type and to popularize. It thereby offers an effective route toward standardized pinyin coding and natural, simple, and fast Chinese input.
Realization of the technical solution: in this coding, the full code of a Chinese character has four codes. The first and second codes are the initial consonant code and the final code of the whole character. The character is split into two parts, a stem and a tail; the third code is the initial consonant code of the stem and the fourth code is the initial consonant code of the tail. For a small number of characters the fourth code is the final code of the stem. All six coding methods are compatible: any of them can be used in the same Chinese input state without switching, so a character can be entered in several ways. The six sound-dissection codings are the basic, avoidance, unfamiliar-character, touch-typing, phrase, and symbol sound-dissection codings.
Accompanying drawing: Fig. 1 is the sound-dissection coding keyboard diagram; see Part Fourteen for a detailed description.
The technical scheme of this coding system and its realization are further described below in fourteen parts, with reference to charts and examples:
One, initial consonant code
The pinyin initials b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, w, x, y, z have the same form as English letters, and each takes the corresponding English letter as its initial consonant code; y and w serve both as initials and as virtual initials. The initials ch, sh, and zh take the English letters i, u, and v respectively as their initial consonant codes. Characters with no initial fall into three groups, a, e, and o, and take the corresponding letters a, e, and o as their virtual initials. A virtual initial is silent and serves only as a distinguishing mark. In this way every Chinese character has an initial consonant code, the standardization and uniformity of pinyin are further strengthened, and the ambiguity between initials and finals and the multiple spellings that arise in pinyin input are eliminated. For example, in the pinyin input scheme of the Sixth Institute of the Ministry of Machinery and Electronics, typing a represents both the a group and the zh group; the two groups are mixed together, so the first code is ambiguous, being both a final and an initial. The final ai is nominally replaced by the letter l, yet when "love" is entered ai is typed as a and i, whereas when "pluck" is entered the final ai is typed as l. The final ai thus has two input forms, ai and l. In the present coding the first code can only be an initial consonant code, so each final has exactly one input form: the phonetic codes of "love" and "pluck" are al and vl respectively, which is clear, natural, and unambiguous. All initial consonant codes are listed in Table One.
Table One: initial consonant code table
Initial consonant code    Meaning and initial represented
a                         virtual initial of the a group, silent
e                         virtual initial of the e group, silent
o                         virtual initial of the o group, silent
w                         initial or virtual initial of the u group, sometimes pronounced
y                         initial or virtual initial of the i and ü groups, sometimes pronounced
i                         initial ch
u                         initial sh
v                         initial zh
other codes               identical to the English letters
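The mapping in Table One can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be a toneless pinyin syllable in standard spelling, and the only final code included is ai → l, the one pair the text states for "love" and "pluck".

```python
# Digraph initials get single-letter codes; every other syllable is coded by
# its first letter (a, e, o act as silent virtual initials).
DIGRAPH_INITIALS = {"zh": "v", "ch": "i", "sh": "u"}

# Assumption: only the final code quoted in the text is listed here.
KNOWN_FINAL_CODES = {"ai": "l"}


def split_syllable(syllable):
    """Return (initial_code, final) for a toneless pinyin syllable."""
    if syllable[:2] in DIGRAPH_INITIALS:
        return DIGRAPH_INITIALS[syllable[:2]], syllable[2:]
    if syllable[0] in "aeo":
        # No real initial: the virtual initial is the first letter and the
        # whole syllable is the final.
        return syllable[0], syllable
    return syllable[0], syllable[1:]


def phonetic_code(syllable, final_codes=KNOWN_FINAL_CODES):
    """Two-letter phonetic code: initial consonant code plus final code."""
    initial, final = split_syllable(syllable)
    return initial + final_codes[final]


print(phonetic_code("ai"))    # "love"  -> al
print(phonetic_code("zhai"))  # "pluck" -> vl
```

Because the first letter is always an initial consonant code, a lookup keyed on the first letter never has to guess whether it is reading an initial or a final.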
Two, final code
Pinyin has 34 finals. Apart from a, e, i, o, u, and ü, each final is written with two or more letters; this coding replaces each of them with a single English letter. Because English has only 26 letters, some letters must represent several finals at once. On the basis of extensive statistics, the author has placed on the same letter those finals that are least likely to cause duplicate codes or confusion, with reference to the final scheme of Liu's "diphthong coding". For ease of memorization, this coding keeps the phonetic codes of the widely used Sixth Institute scheme and carefully arranges the remaining final codes, so that an ordinary operator can use them without study or deliberate memorization. The specific arrangement is given in Table Two.
Three, the splitting of Chinese characters
In this coding, every Chinese character is split into two parts, except for "one" and "second", which cannot be decomposed. Characters are split according to the following six structural types. In the diagrams the numeral "1" marks the stem and the numeral "2" marks the tail.
Figure 921131550_IMG2
Inclined type
Figure 921131550_IMG3
The inclined part serves as the stem of the character. For example, "degree", "wear", and "distant" take "wide", "dagger-axe", and "Chuo" respectively as their stems.
Enclosing type
Figure 921131550_IMG4
The enclosing part is taken as the stem and the enclosed part as the tail.
Clamping type
Figure 921131550_IMG5
For example, "street" and "inner feelings" both belong to the clamping type; "OK" and "clothing" are taken as their stems, and "Gui" and "in" as their tails.
Single-body characters: following stroke order, the largest radical is taken as the stem and the remainder is the tail, with due regard to what is natural, intuitive, and customary. Single-body characters are hard to decompose, so the author has specially provided the "hard-to-split sound-dissection coding" to solve their input.
Four, the radicals of Chinese characters
The radicals of Chinese characters are divided into two broad classes. One class is the character radicals, which are characters in their own right; their codes are naturally taken from their pinyin. The other class is the component radicals. These evolved from ancient characters, so they generally also have a pronunciation. Modern characters, however, differ considerably from ancient ones, and the pronunciation of a modern character cannot be fixed by an ancient pronunciation, so radicals can only be coded according to modern pronunciation habits. To reduce memorization, this coding specifies very few standard radicals: only radicals that are common, frequently used, and recognizable by everyone are defined as standard radicals. Radicals that are hard to recognize or hard to pronounce are without exception replaced by their first or last stroke. This coding has only six basic strokes: dot, horizontal (including horizontal hook and rising stroke), vertical (including vertical hook), left-falling, right-falling, and turning. The coding is therefore very natural, requires minimal memorization, is readily accepted by operators, and is easy to promote. Some characters used as radicals are of course not very common and are not easy to read. For this reason, besides listing the more commonly used character radicals in Table Three, the author has listed all radicals in the ep group of the symbol sound-dissection coding. When the pronunciation of a radical cannot be determined, the user can enter ep and then use the ">" key to look up its pronunciation and code.
Figure 921131550_IMG6
Five, the take-the-largest principle and the replacement principle
The full code of this coding has only four codes. The first and second codes are the initial and final of the whole character and are simply entered according to Table One and Table Two. The third and fourth codes are the codes of the radicals obtained after the character is split into two parts. The splitting method and the standard radicals were briefly introduced in the preceding sections, but the way to extract a radical is sometimes not unique. The extraction methods are therefore described below:
1, the take-the-largest principle when dividing into parts
In upper-middle-lower type characters and single-body characters, the largest radical in stroke order, which may not be the character itself, is taken as the stem, and the remainder is the tail. The largest radical is a radical within the character that cannot form another radical when any further stroke is added to it. For example:
"Etc." consists of three radicals, "bamboo", "soil", and "very little". "Bamboo" and "soil" cannot form another radical, so the largest radical is "bamboo". The remaining "temple" is the largest radical of the tail.
"Guilt" consists of four character radicals, "ten", "mouth", "standing", and "ten". "Ten" and "mouth" form "Gu", and "Gu" cannot form another radical with the "stand" below it, so "Gu" is the stem and the remaining "suffering" is the tail.
"Swoon" consists of three radicals, "day", "Mi", and "car". "Day" and "Mi" cannot form another radical, so "day" is the largest radical and serves as the stem. The remaining "army" is the largest radical of the tail.
In "street", the left and right parts are "OK" and the middle is "Gui", so "OK" is the stem and "Gui" is the tail.
In "take advantage of", "thousand" is a radical and "standing grain" is also a radical, so "standing grain" is taken as the stem and the remaining "north" as the tail.
In "I", the first stroke "Pie" cannot form a standard radical with the other strokes, so the first stroke "Pie" is taken as the stem and the remaining "looking for" as the tail.
2, the take-the-largest principle within a part (replacement principle)
Many characters clearly consist of two parts. When the largest radical taken from the stem does not exhaust the whole stem part, code-taking for the stem is nevertheless considered complete, and the remainder of the stem part is not treated as a sub-part of the tail. In other words, the first largest radical in the stem replaces the whole stem. This is the "replacement principle". For example:
"Wei" is of the left-right type. The left part consists of "standing grain" and "woman"; by the take-the-largest principle "committee" is taken as the stem, and the remaining "ghost" is the tail.
"Fragrant" is of the upper-lower type. The top consists of four radicals; by the take-the-largest principle the code of "sound" is taken as the stem, and code-taking for the whole stem is considered complete. That is, "sound" replaces the whole top, the remaining "an ancient weapon made of bamboo" at the top is not treated as a sub-part of the tail, and the tail is still the bottom part, "perfume (or spice)".
"Austria" is of the upper-lower type. Its first stroke "Pie" cannot form a radical with the other strokes, so the code of the first stroke "Pie" stands for the whole top. The remainder of the top is not regarded as a sub-part of the bottom, and the bottom is still "greatly".
"Apply" is of the left-right type. The left part consists of "just" and "side"; the code of "just" stands for the whole left part, the remaining "side" is not treated as a sub-part of the tail, and the tail is still the right part, "The-Fan".
"Degree" is of the inclined type. The inclined part consists of "extensively" and "twenty"; the code of "extensively" stands for the whole inclined part, and the tail is only "again".
"Wear" is of the inclined type. The inclined part consists of "ten" and "dagger-axe"; the code of "dagger-axe" stands for the whole inclined part, "ten" yields no code, and the tail is still "field" and "being total to".
3, forward take-the-largest and reverse take-the-largest
Sometimes the split of a character is not unique. After the stem has been taken in forward stroke order, several radicals may still remain in the character, and ambiguity easily arises as to which of them supplies the code of the tail. The forward and reverse take-the-largest principles are laid down for this reason. In this coding:
1. The code of the stem is taken by the "forward take-the-largest principle": starting from the first stroke in writing order, the largest radical is taken in forward order as the code of the stem. The inclined type and the enclosing type are exceptions.
2. The code of the tail is taken by the "reverse take-the-largest principle": starting from the last stroke in writing order and proceeding in the order opposite to writing, the largest radical is taken as the code of the tail.
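The two rules can be sketched as a greedy scan over a character's units in writing order. The sketch below is an illustration under assumptions, not the patented procedure: the unit names are the English glosses used in this text, `radicals` is a hypothetical lookup set of unit sequences that together form a standard radical, and the exceptions for the inclined and enclosing types are not modeled.

```python
def forward_largest(units, radicals):
    """Grow a prefix from the first unit while the grown prefix is a radical.

    The prefix is never allowed to become the whole character.
    """
    end = 1
    while end + 1 < len(units) and tuple(units[:end + 1]) in radicals:
        end += 1
    return tuple(units[:end])


def reverse_largest(units, radicals):
    """Grow a suffix backwards from the last unit in the same way."""
    start = len(units) - 1
    while start - 1 > 0 and tuple(units[start - 1:]) in radicals:
        start -= 1
    return tuple(units[start:])


# Hypothetical radical set covering the two characters used below.
radicals = {
    ("ten", "mouth"),         # forms "Gu"
    ("standing", "ten"),      # forms "suffering"
    ("soil", "very little"),  # forms "temple"
}

guilt = ["ten", "mouth", "standing", "ten"]
print(forward_largest(guilt, radicals))   # stem units
print(reverse_largest(guilt, radicals))   # tail units
```

The stem scan stops as soon as adding one more unit no longer yields a radical, which mirrors the definition of the largest radical given in Part Five.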
That is, the third code of the full code is taken by the forward take-the-largest principle and the fourth code by the reverse take-the-largest principle. In this way both the ambiguity of splitting and the ambiguity of code-taking are resolved. For example:
"Writing brush" is of the left-right type. The left part is the stem, and by the forward take-the-largest principle "ten" at its upper left is taken as its code. The right part is the tail, and by the reverse take-the-largest principle "plumage" at its lower right is taken as its code.
"Honor" is of the upper-middle-lower type. The third code of the full code takes "Lv" at the top by the forward take-the-largest principle, and the fourth code takes "wood" at the bottom by the reverse take-the-largest principle. "Mi" in the middle yields no code, so it does not matter which part it belongs to.
"Degree" is of the inclined type. The third code takes the inclined part "extensively", and the fourth code takes "again" by the reverse take-the-largest principle; "twenty" within it yields no code.
Six, basic sound-dissection coding
The basic sound-dissection coding is the easiest of the sound-dissection codings to learn and is the basis of all the others. A beginner needs no deliberate study: with a copy of the "sound-dissection coding keyboard diagram" in hand, characters can be entered at once. Its full code is as follows. The first and second codes are the double-spelling code of the character, that is, the first code is the initial consonant code and the second code is the final code. The third code is the initial consonant code of the stem radical taken from the character by the "forward take-the-largest principle", and the fourth code is the initial consonant code of the tail radical taken by the "reverse take-the-largest principle".
First code: initial consonant code (whole character)
Second code: final code (whole character)
Third code: initial consonant code (stem)
Fourth code: initial consonant code (tail)
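The four-position layout above can be sketched as a small helper. This is a minimal illustration: `basic_code` is a hypothetical name, and the caller is assumed to have already looked up the four letters from the initial and final code tables and from the split of the character.

```python
import string


def basic_code(char_initial, char_final, stem_initial, tail_initial):
    """Join the four positions of a basic sound-dissection code."""
    codes = (char_initial, char_final, stem_initial, tail_initial)
    for code in codes:
        # Each position holds exactly one of the 26 letter codes.
        if len(code) != 1 or code not in string.ascii_lowercase:
            raise ValueError(f"not a single letter code: {code!r}")
    return "".join(codes)


# Values the text gives for "praising": initial j, final ia coded x,
# stem radical with initial j, tail radical with initial j.
print(basic_code("j", "x", "j", "j"))  # jxjj
```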
For example:
For "praising", the first code is the initial j; the second code is x, the code of the final ia; the third code is j, the initial of the radical "Ji" taken by the forward take-the-largest principle; and the fourth code is j, the initial of the radical "adds" taken by the reverse take-the-largest principle. Its full code is jxjj.
Seven, avoidance sound-dissection coding
The first three codes of the avoidance sound-dissection coding are the same as those of the basic sound-dissection coding; only the fourth code is modified, in order to reduce duplicate codes. When the fourth code would be the same as the first code, that is, when the initial of the radical taken by the reverse take-the-largest principle is the same as the initial of the character itself, the largest tail radical whose initial differs from that of the character is taken in reverse order instead, and its initial consonant code is used as the fourth code. This is called the "principle of avoidance". Note that this coding applies the principle of avoidance only to the fourth code. The avoidance is entirely reasonable and natural: the tail of a Chinese character mostly serves as a phonetic annotation of the character itself, so there is no need to enter both the pronunciation of the character and its phonetic annotation. Repeated entry of same-sound codes within a character should therefore be avoided, and this is one of the essential features of the design of this coding.
First code: initial consonant code (whole character)
Second code: final code (whole character)
Third code: initial consonant code (stem)
Fourth code: initial consonant code (tail, avoiding the first code)
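The principle of avoidance can be sketched as a scan over candidate tail radicals. This is an illustrative sketch only: `avoidance_fourth_code` is a hypothetical name, the candidates are assumed to be supplied as initial consonant codes in reverse take-the-largest order (largest radical first), and falling back to the basic code when every candidate collides is an assumption the text does not state.

```python
def avoidance_fourth_code(first_code, tail_candidates):
    """Pick the fourth code, skipping candidates equal to the first code.

    tail_candidates: initial consonant codes of tail radicals, ordered from
    the largest radical to its smaller sub-radicals.
    """
    if not tail_candidates:
        raise ValueError("at least one tail candidate is required")
    for code in tail_candidates:
        if code != first_code:
            return code
    # Assumption: if nothing differs, keep the basic fourth code.
    return tail_candidates[0]


# "praising": the tail radical "adds" has initial j, the same as the first
# code j, so its largest sub-radical "mouth" (initial k) is used instead.
fourth = avoidance_fourth_code("j", ["j", "k"])
print("jxj" + fourth)  # jxjk
```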
For example:
The first code of "praising" is j, the initial of "praising" itself. The fourth code was originally j, the initial of the radical "adds", but "praising" and "adds" have the same sound, so the fourth code should be avoided. The initial k of "mouth", the largest sub-radical of "adds" taken in reverse, replaces j as the fourth code. The full code of "praising" is now jxjk.
Eight, unfamiliar-character sound-dissection coding
In this coding, the unfamiliar-character sound-dissection coding is divided into three kinds: hard-to-read, hard-to-split, and hard-to-read-and-hard-to-split.
1, hard-to-read sound-dissection coding
Hard-to-read characters are those that ordinary people do not recognize and whose pronunciation cannot be determined from their radicals. Uncommon characters whose pronunciation can be read from one half of the character are not counted as hard-to-read.
In the GB level-1 and level-2 character sets, and especially in level 2, there are many characters that ordinary people do not recognize and whose pronunciation is hard to determine; they account for more than one third of all 6763 characters. Because their phonetic codes cannot be determined, the first and second codes cannot be entered with the two preceding kinds of coding. Such characters can only be searched for, mixed with everyday characters throughout the whole character set, by fuzzy input with the replace key, and the duplicate-code rate is then very high, reaching more than 100 characters at most. The user can only page through the candidates by eye with the page-turning key, which is both time-consuming and laborious. It is therefore entirely necessary to encode hard-to-read characters separately. Hard-to-read characters should of course also have the ordinary codings, for use by people who do recognize them. In this coding, the codes of unfamiliar characters are all placed in the o group. That is:
Hard-to-read sound-dissection coding
First code: o
Second code: initial consonant code (of the stem radical taken by the forward take-the-largest principle)
Third code: initial consonant code (of the tail radical taken by the reverse take-the-largest principle)
Fourth code: final code (of the tail radical taken by the reverse take-the-largest principle)
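The hard-to-read layout differs from the basic layout only in where the letters come from, which a short sketch makes explicit. The function name is hypothetical, and the three letters are assumed to have been looked up already.

```python
def hard_to_read_code(stem_initial, tail_initial, tail_final):
    """Build a hard-to-read code: the o marker followed by shape-only codes.

    No letter depends on the pronunciation of the character itself; the tail
    radical contributes both its initial consonant code and its final code.
    """
    return "o" + stem_initial + tail_initial + tail_final


# Values the text gives for "villous themeda": stem initial c, tail radical
# "official" with initial g and final code q.
print(hard_to_read_code("c", "g", "q"))  # ocgq
```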
For example:
Ordinary people do not recognize "villous themeda" and cannot read it from one half, so its phonetic code cannot be determined and it should be entered with the hard-to-read coding. The first code is the letter o, the second code is c, the initial consonant code of "Lv", the third code is g, the initial consonant code of "official", and the fourth code is q, the final code of "official". Its full hard-to-read code is ocgq. "Villous themeda" also has a basic code, whose full form is jncg.
2, hard-to-split sound-dissection coding
Some characters, especially single-body characters, are hard to split or can be split in more than one way. The hard-to-split sound-dissection coding is provided for them and distinguishes them from ordinary characters.
1. Sound-dissection coding for a hard-to-split stem, aimed mainly at single-body characters that are easy to recognize
First code: initial consonant code (whole character)
Second code: final code (whole character)
Third code: o
Fourth code: initial consonant code (of an obvious radical in the character)
For example:
Ordinary people all know the pronunciation of "must", so the first two codes of its full code, bi, are easy to enter. Its split, however, is not easy to see, so the hard-to-split coding can be used: when the third code is entered as the letter o, "must" appears in the prompt line; the fourth code can also be entered as x, the initial consonant code of "heart".
2. Sound-dissection coding for a hard-to-split tail
First code: initial consonant code (whole character)
Second code: final code (whole character)
Third code: initial consonant code (stem)
Fourth code: o
For example:
The first two codes of "numbness" are bi, and the third code is b, the initial consonant code of the stem "Epileptic". By the basic coding the fourth code can be entered as b, the initial consonant code of "confering", but this is the same as the first code, so the avoidance coding applies. Because the bottom of "confering" is not a standard radical, code-taking is then open to doubt. The coding for a hard-to-split tail can therefore be used for the fourth code, which is entered as the letter o.
3, hard-to-read-and-hard-to-split sound-dissection coding, for characters that are both hard to recognize and hard to split, aimed mainly at single-body characters that are not easy to recognize.
First code: o
Second code: o
Third code: o
Fourth code: initial consonant code (of an obvious radical in the character)
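The three layouts above place the marker o in different positions, which can be sketched as three small builders. The function names are hypothetical; the example values are the ones the text gives for "must", "numbness", and "thirty".

```python
UNFAMILIAR = "o"  # marker letter for a position that is hard to determine


def stem_hard_to_split(char_initial, char_final, obvious_initial):
    """Pronunciation known, stem unclear: o fills the third position."""
    return char_initial + char_final + UNFAMILIAR + obvious_initial


def tail_hard_to_split(char_initial, char_final, stem_initial):
    """Pronunciation and stem known, tail unclear: o fills the fourth position."""
    return char_initial + char_final + stem_initial + UNFAMILIAR


def hard_to_read_and_split(obvious_initial):
    """Neither pronunciation nor split known: three o markers, then one radical."""
    return UNFAMILIAR * 3 + obvious_initial


print(stem_hard_to_split("b", "i", "x"))  # "must"     -> biox
print(tail_hard_to_split("b", "i", "b"))  # "numbness" -> bibo
print(hard_to_read_and_split("i"))        # "thirty"   -> oooi
```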
For example:
Ordinary people do not easily recognize "thirty", and it is also hard to split, so it can be entered with the hard-to-read-and-hard-to-split sound-dissection coding. When three letters o are entered, "thirty" appears in the prompt line; it can then be selected with a numeric key, or its fourth code can be entered as i, the initial consonant code of "river". "Thirty" can of course also be entered as saih by the basic coding.
Nine, input of Chinese characters and their brevity codes
In the preceding three sections the author introduced three of the coding methods of this system, and the reader now has a general understanding of it. The input of Chinese characters can now be described.
1, input of Chinese characters
In this coding, when the first code of a character is entered, ten high-frequency characters appear in the prompt line, each followed by its next code. To enter one of them, key in the corresponding digit; for the first character, a space can be typed instead of the digit 1. If the user does not wish to enter the second code, the candidates can be searched with the page-turning key ">". After the second code is entered, the level-2 brevity-code character and other high-frequency characters and phrases appear in the prompt line; if the third code cannot be determined, the page-turning key can be used to search. After the third code is entered, the level-3 brevity-code character and high-frequency characters and phrases appear; if the fourth code cannot be determined, the page-turning key is used. After the fourth code is entered, all characters and phrases sharing that code appear in the prompt line. This coding has two special keys and one special code:
> : page-turning key, used to cycle through the candidates
Replace key: used for fuzzy input; it stands for any code
o : unfamiliar-character code. For a hard-to-read character the first code is o; for a single-body character the third code can be o; and when the fourth code is hard to determine it can also be o. The o code differs from the replace key as follows: in the third position, for example, the replace key retrieves every character whose first, second, and fourth codes match, whereas the o code retrieves only those characters whose third code is hard to determine. The o code is a special code within the character coding, whereas the replace key is only a learning key and is not part of any character's code.
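The difference between typing a prefix, typing the replace key, and typing o can be sketched as a lookup over one shared code database. Everything here is illustrative: the function name is hypothetical, the tiny database holds only codes quoted in this text, and "?" stands in for the replace key because the text does not show its symbol.

```python
def candidates(code_db, typed, wildcard="?"):
    """Return characters whose codes match what has been typed so far.

    A wildcard position matches any letter; every other position, including
    the letter o, must match the stored code exactly.
    """
    found = []
    for code, chars in code_db.items():
        if len(code) < len(typed):
            continue
        if all(t == wildcard or t == c for t, c in zip(typed, code)):
            for ch in chars:
                if ch not in found:
                    found.append(ch)
    return found


# Assumption: a toy database built from codes quoted in this text.
code_db = {
    "nppu": ["ox"],      # basic code
    "npou": ["ox"],      # hard-to-split code
    "np": ["ox"],        # brevity code
    "nnch": ["twenty"],  # basic code
    "oooc": ["twenty"],  # hard-to-read-and-hard-to-split code
}

print(candidates(code_db, "n"))     # every character whose code starts with n
print(candidates(code_db, "np?u"))  # wildcard in the third position
print(candidates(code_db, "npo"))   # o matches only stored o codes
```

Storing the o codes as ordinary entries keeps all coding kinds in one table, which is how the text describes the six codings coexisting without mode switching.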
It should be noted that the six kinds of coding coexist in this system. The same character can be entered by the basic coding, by the avoidance coding, by the unfamiliar-character coding, by the touch-typing coding, and of course by its brevity code. A character thus has several codes, which neither contradict nor conflict with one another; they differ only in the number of duplicate codes and in how easy they are to enter. Every input method yields the same character; that is, one character has several codes, and all of them are stored in the same code database. For example:
"Clever" is entered as lirl by the basic coding; there are three duplicate codes, so it must be selected from the prompt line. By the avoidance coding it is entered as lird, which has no duplicate code and needs no selection. It has no unfamiliar-character code and no brevity code, and its touch-typing code is the same as its avoidance code.
"Twenty" is entered as nnch by the basic coding, with one duplicate code. It is entered as ochg by the hard-to-read coding and as oooc by the hard-to-read-and-hard-to-split coding. Its avoidance code and touch-typing code are the same as its basic code. It has no brevity code.
"Ox" is entered as nppu by the basic coding; its brevity code is np, and by the hard-to-split coding it is entered as npou.
2, Brevity-code input
The brevity code of a Chinese character is an abbreviation of its full code: for commonly used characters there is no need to key in all four codes one by one. For this reason, Chinese character coding schemes generally provide first-, second- and third-level brevity codes. The most frequently used characters are represented by their first code alone: key in the first code and then press the space bar. This is the first-level brevity code. Fairly common characters are represented by their first and second codes: key in the first two codes and then press the space bar. This is the second-level brevity code. Third-level brevity codes are entered in the same way. A brevity-code character is always the first character shown in the candidate line, so nothing has to be memorized: wherever a character could be selected by keying in the digit 1, it can equally be entered with the space bar. In this coding system there are 26 first-level brevity codes, 421 second-level brevity codes and 4627 third-level brevity codes, 5074 in all. The GB level-1 and level-2 character set also contains 50 radicals, which should not be counted as Chinese characters, so the actual number of Chinese characters is:
6763 - 50 = 6713.
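The totals quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: the variable names are illustrative, and the coverage ratio assumes that every brevity code selects a distinct character, which the text does not state explicitly.

```python
# Brevity-code counts per level, as quoted in the text.
level_counts = {1: 26, 2: 421, 3: 4627}
total_brevity = sum(level_counts.values())

# GB level-1 and level-2 code points, minus the 50 radicals.
GB_CODE_POINTS = 6763
RADICALS = 50
actual_characters = GB_CODE_POINTS - RADICALS

# Upper bound on the share of characters reachable by a brevity code,
# ASSUMING each brevity code maps to a different character.
coverage = total_brevity / actual_characters

print(total_brevity, actual_characters, f"{coverage:.1%}")
```

Under that assumption, roughly three quarters of the character set would be reachable through a brevity code.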
Most characters can be entered with brevity codes, which greatly increases input speed. The first- and second-level brevity code tables of this system are listed below. The second-level table omits the second-level brevity codes of the difficult-character codes.
Figure 921131550_IMG7
Figure 921131550_IMG8
Ten, Analysis of duplicate codes
1, Theoretical analysis
In theory, an alphabetic code gives 26 one-letter codes, 26 × 26 = 676 two-letter codes and 26 × 26 × 26 = 17576 three-letter codes. Since the GB level-1 and level-2 character set contains only 6763 characters, three letters are already enough to give every character its own code. Four-letter codes number 26 × 26 × 26 × 26 = 456976, more than 450,000, which is an astronomical figure compared with the number of Chinese characters. With more than 450,000 codes available for some 6,000 characters there should be room to spare, and no duplicate codes should occur. That, however, is only a theoretical expectation. In practice, whatever rules are used, as long as the codes follow systematic rules and no special provisions are made for a handful or a few dozen individual characters, coding some 6,000 characters with four letters will produce duplicate codes. In the GB level-1 and level-2 character set the actual number of Chinese characters is 6713.
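The size of the code space for each code length can be reproduced directly. The following is a minimal sketch; the function and constant names are illustrative and do not come from the patent.

```python
ALPHABET_SIZE = 26      # letters available per code position
GB_CHARACTERS = 6763    # GB level-1 and level-2 code points

def code_space(length: int) -> int:
    """Number of distinct codes of exactly `length` letters."""
    return ALPHABET_SIZE ** length

# Code-space size for one- to four-letter codes.
sizes = {n: code_space(n) for n in range(1, 5)}

# Shortest code length whose code space can hold every character.
shortest_sufficient = min(n for n, s in sizes.items() if s >= GB_CHARACTERS)

for n, s in sizes.items():
    print(n, s)
```

Three letters already suffice in the purely combinatorial sense; the duplicate codes discussed in this section arise because rule-based codes are not spread evenly over that space.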
2, Duplicate codes in the basic sound-dissection code
In the basic sound-dissection code there are 26 one-letter codes, 396 two-letter codes, 3880 three-letter codes and 5885 four-letter codes. The first-, second- and third-level brevity codes together number 4302 (26 + 396 + 3880), and the fourth code of these characters need not be entered. Therefore, when a fairly common character has the same four-letter code as a brevity-code character, the fairly common character is placed at position 1 of the candidate line, where it can be entered with the space bar, and the brevity-code character is placed after it. With this arrangement a further 550 duplicate-code characters can be entered without a selection key, so that 5885 + 550 = 6435 characters need no selection. Only 6713 - 6435 = 278 characters require selection, and most of them are difficult-to-read or difficult-to-split characters. The only drawback is that many characters need a fifth keystroke, the space bar.
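The figures in this paragraph can be cross-checked as follows. This is a plain arithmetic sketch using only the numbers quoted in the text; the variable names are illustrative. The last line shows, for comparison, the remainder obtained when all 6763 code points (including the 50 radicals) are used as the base.

```python
# Counts quoted for the basic sound-dissection code.
one_letter, two_letter, three_letter = 26, 396, 3880
unique_four_letter = 5885          # characters with a unique four-letter code
first_position_duplicates = 550    # duplicates placed at candidate position 1

ACTUAL_CHARACTERS = 6713           # 6763 code points minus 50 radicals
ALL_CODE_POINTS = 6763

brevity_total = one_letter + two_letter + three_letter
no_selection = unique_four_letter + first_position_duplicates
need_selection = ACTUAL_CHARACTERS - no_selection
need_selection_incl_radicals = ALL_CODE_POINTS - no_selection

print(brevity_total, no_selection, need_selection, need_selection_incl_radicals)
```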
3, Duplicate codes in the avoidance sound-dissection code
In the avoidance code there are 6487 four-letter codes; that is, these 6487 characters can each be entered uniquely without selection. With the 4302 brevity codes added, and with any character whose four-letter code coincides with that of a brevity-code character placed at position 1 and the brevity-code character placed after it, more than 6670 characters need no selection. Only a few dozen characters have to be selected. If the difficult-character codes are added as well, almost no characters need selection. The requirements of touch typing are thus fully met.
11, Touch-typing sound-dissection code
The touch-typing sound-dissection code is a further refinement of the avoidance sound-dissection code. Again, only the fourth code is changed, and the purpose is likewise to eliminate cases where one code maps to several characters.
The statistics and analysis above show that the avoidance code fully meets the needs of fast input, but nearly 300 characters still require a fifth keystroke (the space bar or a digit selection). This coding system therefore also defines a complete touch-typing sound-dissection code that eliminates duplicate-code characters altogether, so that every character can be entered in four keystrokes without selection. Correspondingly, the amount to be memorized increases. The touch-typing code has two rules, both of which apply only to duplicate-code characters:
1, For several characters whose tail parts are identical, the final code of the character's radical is taken as the fourth code instead. This is called the "displacement principle".
First code: initial code (whole character)
Second code: final code (whole character)
Third code: initial code (head part)
Fourth code: final code (head part)
For example:
The characters 访 ("visit"), 肪 ("fat") and 鲂 ("triangular bream") have the radicals 讠, 月 and 鱼, pronounced yan, yue and yu respectively, so the initial of each radical is y, and the right-hand part of all three characters is 方 (fang). If one tried to tell them apart by their components alone, the only option would be arbitrary rules, for instance taking the dot (dian) as the fourth code of 访, the horizontal stroke as that of 鲂, and 方 as that of 肪. Such forced rules are hard to remember, so the distinction has to be made from the left-hand radical. Many coding schemes, such as "natural code" and "Li's code", place radicals that share an initial, and therefore easily cause duplicate codes, on different keys. Duplicate codes then largely disappear, but the radicals lose their connection with their pronunciation, so entering the third code requires more memorization and does not match people's habits. The author considers this method effective but far from ideal. The author has found that these characters can be distinguished simply by entering the final of the radical as the fourth code, with the third code still being the initial of the radical. That is, the first and second codes are the initial and final of the character itself, and the third and fourth codes are the initial and final of the radical. This is the "double-sound principle", or "displacement principle". Under this principle the full codes of the three characters are:
访 ("visit"): fhyj, the codes of the pinyin fang and yan
肪 ("fat"): fhyw, the codes of the pinyin fang and yue
鲂 ("triangular bream"): fhyu, the codes of the pinyin fang and yu
2, For several characters whose tail parts are not identical, the fourth code is changed to the initial code of a sub-part of the tail. This is called the "different-character avoidance principle".
First code: initial code (whole character)
Second code: final code (whole character)
Third code: initial code (head part)
Fourth code: initial code (tail part, chosen to avoid the duplicate-code character)
Once the fourth code has been changed under these two principles, no duplicate codes remain.
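The displacement principle (rule 1 above) can be sketched in Python using the worked example of 访 ("visit"), 肪 ("fat") and 鲂 ("triangular bream"). The final-to-letter table below is an assumption read off that example only (ang → h, an → j, ue → w, u → u); it is not the full key layout, and the function name is hypothetical.

```python
# Final codes inferred ONLY from the worked example (fhyj, fhyw, fhyu);
# the complete final-to-key table is not reproduced here.
FINAL_CODE = {"ang": "h", "an": "j", "ue": "w", "u": "u"}

def displacement_code(char_initial: str, char_final: str,
                      radical_initial: str, radical_final: str) -> str:
    """Build a four-letter code under the displacement principle:
    codes 1-2 come from the character's own pinyin (initial, final),
    codes 3-4 come from the radical's pinyin (initial, final)."""
    return (char_initial + FINAL_CODE[char_final]
            + radical_initial + FINAL_CODE[radical_final])

# All three characters are read fang; their radicals are read yan, yue, yu.
examples = {
    "访": displacement_code("f", "ang", "y", "an"),
    "肪": displacement_code("f", "ang", "y", "ue"),
    "鲂": displacement_code("f", "ang", "y", "u"),
}

for char, code in examples.items():
    print(char, code)
```

Because the three radicals share the initial y, only the fourth letter separates the codes, which is exactly the distinction the displacement principle is meant to provide.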
12, Phrase sound-dissection code
In this coding system phrases and single characters are entered in the same mode, with no special key to mark a phrase: once the code of a phrase has been entered, the phrase appears in the candidate line. If the phrase does not exist, keying in Alt+Space after entering its code switches to the phrase-creation state; the characters are then entered one by one, and the space bar marks the end of phrase creation. The phrase is stored in the code library and appears both in the text and in the candidate line. The phrase sound-dissection codes are:
1. Two-character phrases:
(1) the initials of the two characters;
(2) the initials and finals of the two characters.
2. Three-character phrases:
(1) the initials of the three characters;
(2) the initials of the three characters followed by the final of the last character.
3. Phrases of four or more characters:
the initials of the first three characters followed by the initial of the last character.
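The phrase rules listed above can be expressed as a small function. This is a sketch under stated assumptions: each character is supplied as an (initial code, final code) pair that has already been mapped to keyboard letters, the function name is hypothetical, and the three-character and four-or-more-character rules are read as "initials plus the final (or initial) of the last character".

```python
from typing import List, Tuple

# One character = (initial code letter, final code letter), already mapped to keys.
Syllable = Tuple[str, str]

def phrase_codes(chars: List[Syllable]) -> List[str]:
    """Return the admissible codes for a phrase, following the listed rules."""
    n = len(chars)
    if n < 2:
        raise ValueError("a phrase needs at least two characters")
    initials = "".join(initial for initial, _ in chars)
    if n == 2:
        # (1) both initials; (2) initial and final of both characters
        return [initials, "".join(i + f for i, f in chars)]
    if n == 3:
        # (1) three initials; (2) three initials plus the final of the last character
        return [initials, initials + chars[-1][1]]
    # Four or more characters: initials of the first three plus initial of the last
    return [initials[:3] + initials[-1]]

# Placeholder letter pairs (not a claim about any real phrase).
print(phrase_codes([("f", "h"), ("y", "j")]))
print(phrase_codes([("f", "h"), ("y", "j"), ("y", "w")]))
print(phrase_codes([("f", "h"), ("y", "j"), ("y", "w"), ("y", "u"), ("l", "i")]))
```

Under this reading no phrase code is longer than four letters, which matches the four-letter limit of the single-character full code.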
13, Sound-dissection code for symbols
In Chinese information processing, symbols of all kinds inevitably occur in large numbers. Chinese punctuation marks in particular appear in every line and every sentence. English punctuation marks differ considerably from Chinese ones and occupy only one character position, which easily makes Chinese text non-standard, so Chinese punctuation marks should be used when entering Chinese text. This coding system therefore specifies "automatic entry into the Chinese punctuation state": on entering the Chinese input state, the system also enters the Chinese punctuation state, and a Chinese punctuation mark is entered simply by keying in the corresponding English punctuation mark. No switching is needed; to enter an English punctuation mark, however, one must switch to the English state. The correspondence between the two sets of punctuation marks is:
Figure 921131550_IMG9
This coding system assigns dedicated codes to the Chinese radicals, which are placed in the ep section, that is:
First code: e
Second code: p
Third code: initial code (of the radical's pronunciation)
Fourth code: final code (of the radical's pronunciation)
All other symbols are likewise placed in subsections of the e section, as shown in the following table:
Table 7: Symbol coding table
Section        Symbols and their third and fourth codes
ep             Radicals of Chinese characters; the third and fourth codes are the pronunciation of the radical
punctuation    Entered automatically in the Chinese punctuation state
et             Graphic symbols
ey             Chinese phonetic alphabet; the last two codes are the pronunciation
eu             Numeric symbols; the last two codes are the pronunciation
eb             Table-drawing symbols
ej             Japanese katakana; the last two codes are the pronunciation
ek             Japanese hiragana; the last two codes are the pronunciation
el             Russian letters; the last two codes are the pronunciation
ex             Greek letters; the last two codes are the pronunciation
14, Characteristics of the sound-dissection code and keyboard design
Advantages and beneficial effects of the invention:
1 The full code has only four codes.
2 There are few artificial rules; nothing has to be memorized by rote, and the code can be read off the character on sight.
3 There are many brevity codes: 26 first-level, 421 second-level and 4627 third-level brevity codes, 5074 in all.
4 The use of virtual initials eliminates the ambiguity between initial codes and final codes.
5 The forward and reverse largest-part principles eliminate ambiguity in splitting characters.
6 No character is difficult to enter: the difficult-character codes resolve the input of characters that are difficult to read, difficult to split, or difficult both to recognize and to split.
8 One character has several codes, so the same character can be entered in several ways.
9 With the avoidance principle and related principles, few characters share a code, which makes touch typing very easy.
In this coding system the initial codes and final codes are laid out on the QWERTY keyboard so as to be easy to remember, easy to use and low in duplicate codes. The features of the keyboard design are:
1, The finals ai, an, ao, en, ang, eng, ing and ong and the initials sh and ch follow the scheme of the Sixth Institute of the Ministry of Machinery and Electronics Industry in full; only the initial zh is changed and is represented by v.
2, The finals iao, ian, iang and uang are placed at the lower left of ao, an and ang respectively. The remaining finals are divided into six groups: (in, ing), (iu, iou, o, ou), (e, er, ei), (uo, ua, uao, ia, ie), (van, uan, ve, ui) and (en, vn, un). Within each group the finals are close in pronunciation and similar in letter shape, which makes them easy to remember.
3, The main radicals are all placed on the key of their corresponding initial code, so that they are easy to learn.

Claims (8)

1, A computer Chinese character coding system and input method, characterized in that: the full code of the sound-dissection code consists of four codes, and the system comprises six parts: the basic sound-dissection code, the avoidance sound-dissection code, the difficult-character sound-dissection code, the touch-typing sound-dissection code, the phrase sound-dissection code and the symbol sound-dissection code. The sound-dissection code is composed mainly of initial codes and final codes. The initials of Chinese pinyin are represented by the 26 English letters: i, u and v represent the three initials ch, sh and zh respectively, and a, e, o, w and y are placed before the finals as virtual initials for the a, e, o, u, i and ü sections. The 34 finals of Chinese pinyin are likewise represented by the 26 English letters.
2, The sound-dissection coding method according to claim 1, characterized in that: the first code of the basic sound-dissection code is the initial code of the character's own pinyin, and the second code is its final code. The character is then split into a head part and a tail part; the third code is the initial code of the head part, and the fourth code is the initial code of the tail part. The head part is taken by the forward largest-part principle and the tail part by the reverse largest-part principle, which eliminates ambiguity in choosing the codes.
3, The sound-dissection coding method according to claim 1, characterized in that: in the avoidance sound-dissection code, when the initial code of the tail part is the same as that of the character itself, a sub-part of the tail whose initial code differs from that of the character itself is taken as the fourth code; this is called the "avoidance principle". With the avoidance principle applied, very few characters in the GB level-1 and level-2 character set have duplicate codes, which fully meets the needs of fast input.
4, The sound-dissection coding method according to claim 1, characterized in that: the difficult-character sound-dissection code is divided into three kinds: the difficult-to-read code, the difficult-to-split code and the difficult-to-recognize-and-split code. 1. The first code of the difficult-to-read code is the lowercase letter o, the second code is the initial code of the character's radical, the third code is the initial code of the tail part, and the fourth code is the final code of the tail part. 2. The first and second codes of the difficult-to-split code are the initial and final of the character; the third code is the letter o and the fourth code is the initial code of an obvious sub-character; alternatively, the third code is the initial code of the head part and the fourth code is the letter o. 3. The first, second and third codes of the difficult-to-recognize-and-split code are all the letter o, and the fourth code is the initial code of an obvious sub-character of the character.
5, The sound-dissection coding method according to claim 1, characterized in that: for the very few cases where one code maps to several characters, the touch-typing sound-dissection code eliminates duplicate codes as follows: 1. for duplicate-code characters whose tail parts are identical, the final code of the character's radical is taken as the fourth code instead, which is called the "displacement principle"; 2. for duplicate-code characters whose tail parts are not identical, the fourth code is changed to the initial code of a sub-part of the tail, which is called the "different-character avoidance principle".
6, The coding method according to claim 1, characterized in that: phrases and single characters are entered in the same mode; if a phrase does not exist, keying in Alt+Space after entering its code switches to the phrase-creation state, the characters are then entered one by one, and the space bar marks the end of phrase creation. The phrase is stored in the code library and appears both in the text and in the candidate line. The phrase sound-dissection codes are: 1. two-character phrases: (1) the initials of the two characters, (2) the initials and finals of the two characters; 2. three-character phrases: (1) the initials of the three characters, (2) the initials of the three characters followed by the final of the last character; 3. phrases of four or more characters: the initials of the first three characters followed by the initial of the last character.
7, The coding method according to claim 1, characterized in that: 1. the radicals of Chinese characters are placed in the ep section, that is, the first and second codes are ep and the third and fourth codes are the pronunciation of the radical; 2. the Chinese punctuation state is entered automatically; 3. graphic symbols and other symbols are likewise placed in subsections of the e section.
8, The coding method according to claim 1, characterized in that: all six sound-dissection codes are fully compatible, and the same character can be entered in several ways; that is, different sound-dissection codes can be used during text entry without switching the input state, since all the codes coexist in the same code library and all input takes place in the same input mode.
CN 92113155 1992-11-13 1992-11-13 Chinese-character sound dissection encode and input method Expired - Fee Related CN1026924C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 92113155 CN1026924C (en) 1992-11-13 1992-11-13 Chinese-character sound dissection encode and input method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 92113155 CN1026924C (en) 1992-11-13 1992-11-13 Chinese-character sound dissection encode and input method

Publications (2)

Publication Number Publication Date
CN1073539A true CN1073539A (en) 1993-06-23
CN1026924C CN1026924C (en) 1994-12-07

Family

ID=4946268

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 92113155 Expired - Fee Related CN1026924C (en) 1992-11-13 1992-11-13 Chinese-character sound dissection encode and input method

Country Status (1)

Country Link
CN (1) CN1026924C (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1047675C (en) * 1993-08-28 1999-12-22 陈光宇 Audio code and its Chinese character inputting key board
CN1069420C (en) * 1995-05-26 2001-08-08 戴石灵 Method for inputting Chinese characters by using their pronunciations and shapes

Also Published As

Publication number Publication date
CN1026924C (en) 1994-12-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee