CN1117164A - High-frequency irrational Chinese coding and its keyboard - Google Patents

Info

Publication number
CN1117164A
Authority
CN
China
Prior art keywords
code
word
key
chinese character
chinese
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 95107975
Other languages
Chinese (zh)
Inventor
裴鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN 95107975
Publication of CN1117164A

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The present invention provides a coding concept called high-frequency irrationality. Radicals are selected and their key positions optimized according to how often each radical occurs in characters and how often those characters are used. The keyboard is divided into six zones, and through optimization of the stroke and radical key positions every Chinese character takes at most three codes, while characters accounting for more than 80% of usage have short codes. The coding system has a low duplicate-code rate for both single characters and phrases.

Description

High-frequency irrational Chinese character coding and keyboard thereof
The present invention relates to computer software and hardware, and more particularly to a computer coding scheme for simplified and traditional Chinese characters and a keyboard for it. Many Chinese character coding schemes already exist; they fall mainly into the following classes:
1. Pinyin codes (including full pinyin and double pinyin). The advantage of pinyin codes is that they are easy to learn and relatively well suited to typing from dictation. However, because the pronunciations of Chinese characters are very unevenly distributed, this method produces many duplicate codes. In addition, many people do not know, or mispronounce, the readings of many characters, so the range of application of this method is quite limited. Most very early coding schemes belong to this class, such as pinyin association, sound-number codes and the like. Methods that code by pinyin alone are generally unsuitable for professional data-entry operators.
2. Shape codes. The advantage of shape codes is that they are easy to learn and use and are not limited by the learner's education level or home region. Because input follows the radicals of the character, shape codes more readily build conditioned reflexes, which helps raise typing speed, and there is generally no character that cannot be typed; this last point is especially important for professional data-entry operators. The drawback of shape codes is that more must be memorized when learning them, and they are less suited to typing from dictation and typing while composing. Shape codes are represented by Mr. Wang Yongmin's Five-stroke Method (patent application number: 85100837), along with Mr. Qian Weichang's shape code scheme (patent application number: 85102777), among others; the present invention also belongs to this class.
3. Pinyin plus shape codes. The advantage of this kind of scheme is that it uses features of both pronunciation and shape, which helps disperse duplicate codes. But because it uses pinyin, it inherits the problem of pinyin codes: some characters cannot be pronounced or are mispronounced. And because shape also takes part in the coding, the amount to be memorized necessarily increases. This class mainly includes the smart code of Mr. Wen, among others.
4. Ten-digit codes. When coding with the ten digits, few elements (only ten) take part in the coding, so code length necessarily increases. If code length is not increased, many characters must share codes, which increases keystrokes and slows input. When coding with ten digits it is best to code by shape, which is easy to learn and use. Coding by pinyin, with the ten digits standing in for letters, is inappropriate: the reason for using pinyin coding is precisely the intuitive correspondence between pinyin and the English alphabet, and pinyin coding also has its own unavoidable problems. In general, because of code length, coding Chinese characters with ten digits cannot match the speed of schemes that code with the alphabet. This class mainly includes Mr. Xiao Shuiqing's Xiao code (patent announcement number: 1079562) and Mr. Peng Kenan's dual-part numeric code input method (patent announcement number: 1073023), among others.
In addition, there are the region-position code, the telegraph code and so on, but because they are hard to learn, ordinary people do not type with them directly.
The pronunciations of Chinese characters are very unevenly distributed, and so are their shapes. After extensive analysis of the characteristics of Chinese characters, the applicant believes that, on the whole, input by shape is faster than input by pinyin; one reason is that input by shape is intuitive and readily forms conditioned reflexes. Because the pronunciation of some characters is unknown or mispronounced, shape codes are the inevitable choice for professional data-entry operators. Of course, pinyin codes and shape codes each have their own range of application, and neither fully replaces the other.
Besides pronunciation and radicals, the usage frequency of individual Chinese characters is also very unevenly distributed. For example, the character " " has a usage frequency of nearly 4%, about 250 times the average usage frequency, while many characters have usage frequencies of only a few ten-millionths. Therefore, to improve the input speed of Chinese characters fundamentally, a portion of the high-frequency characters should be given unconditional priority; in other words, they should take as few keystrokes as possible, on the keys that are easiest to strike. This "unconditional priority" is the high-frequency irrational concept proposed and adopted here. Only by using shape coding together with the high-frequency irrational concept proposed by the present invention can the speed of Chinese character input reach a new level.
The purpose of the present invention is to maximize the overall speed of Chinese character input. Compared with earlier schemes, the invention is the first to propose and adopt the high-frequency irrational concept. At the same time, taking the minimum number of single-character duplicate codes as the objective function, the position of every radical on the keyboard is globally optimized, so that each Chinese character takes at most 3 codes. Likewise, taking as the objective function the minimum number of duplicate codes among the two-character phrases formed between first-level short-code characters (high-frequency characters), the positions of the first-level short-code characters are globally optimized. The result of this double optimization is to maximize the overall speed of Chinese character input.
Besides high single-character input speed, another feature of the present invention is that single characters and phrases never share a code. In theory, all 527,280 (26^3 × 30) four-code positions are available to phrases, so however many new phrases are created at any time, the duplicate-code distribution of single characters is unaffected; and because so many positions are available to phrases, duplicate codes among phrases are also rare.
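The capacity figure quoted above can be checked with a short calculation. This is a minimal sketch; the variable names are illustrative, and reading the figure as three letter codes followed by one of the 30 base keys is an assumption drawn from the 26^3 × 30 expression.

```python
# Assumed reading of "26^3 x 30": three codes drawn from the 26 letter keys,
# followed by a fourth key drawn from the 30 base keys.
LETTER_KEYS = 26
BASE_KEYS = 30

four_code_positions = LETTER_KEYS ** 3 * BASE_KEYS
print(four_code_positions)  # 527280
```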
I. Principles of single-character coding
Here, 130 commonly used Chinese characters are selected as first-level short-code characters; together they account for more than 40% of usage. The great majority of these 130 characters were chosen for having the highest usage frequency; a few common characters were included in order to disperse duplicate codes with other common characters. These 130 characters are coded by the high-frequency irrational principle (with the minimum number of duplicate codes among the two-character phrases formed between them as the optimization objective); see Fig. 1. To enter one of these characters, press only its letter key (a single code) and its terminator key (one of the space bar, the comma [,] key, the period [.] key, the slash [/] key and the semicolon [;] key). Of these 130 characters, the 26 most frequently used ones, which are terminated with the space bar, and the 9 Chinese numerals two through ten are coded by usage frequency or by ease of memorization; the codes of all the others are determined by optimization so as to minimize duplicate codes among the two-character phrases formed between first-level short-code characters (the coding principle for phrases formed from these characters is described below). Although the codes of these 130 characters are to some degree "irrational", the characters are common and each has only one code, so with a little practice they are not hard to learn and remember. Because they are common characters, professional data-entry operators enter them largely by conditioned reflex, which greatly helps input speed. Moreover, because characters accounting for about 40% of usage need no decomposition into radicals, the Pei code is also particularly suitable for typing from dictation and typing while composing.
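The count of 130 first-level short-code characters matches the number of (letter key, terminator key) pairs. The following sketch enumerates those pairs; the list and variable names are illustrative and not part of the patent.

```python
import string

# The five terminator keys named in the text:
# space bar, comma, period, slash and semicolon.
TERMINATORS = [" ", ",", ".", "/", ";"]

# One slot per (letter key, terminator key) pair.
slots = [letter + term
         for letter in string.ascii_uppercase
         for term in TERMINATORS]

print(len(slots))  # 130
```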
Other characters are coded by decomposing their shape. Each character takes at most three radicals: its first, second and last radicals take part in the coding in that order, and the space bar terminates the code. If the character has a second-level short code, only its first two radicals need be typed, terminated with the space bar (in the present invention, the second-level short-code characters account for more than 40% of usage). The 30 keys of the base key area are divided into six zones. The 26 letter keys among them encode the characters; the other 4 keys terminate first-level short-code characters and their related phrases. The space bar is used mainly to terminate single characters.
Radicals were chosen, on the basis of extensive practice, by analyzing the shapes of all Chinese characters and their usage frequencies. The single strokes of Chinese characters are placed in the first five zones in the order horizontal, turning, left-falling, right-falling, vertical; see Fig. 2. Our statistics on the structure of all Chinese characters show that the single-stroke radicals horizontal and turning occur in characters more often than other radicals, and that single-stroke radicals occur mainly in common characters. The horizontal and turning strokes are therefore placed in the home-position zones that are easiest to strike (zones one and two), and the other three strokes are placed in the other three zones in turn. The positions of the other radicals were determined by computer through extensive optimization; the objective was to minimize the number of duplicate codes among all 6763 characters of GB-2312 (80), and in particular to keep duplicate codes among common characters to a minimum. The positions of all radicals were fixed in this way (see Fig. 2). When identifying radicals, every "mouth" radical that intersects another stroke is treated as a "mouth" radical. Because "mouth" radicals are so numerous, they are dispersed as far as possible. In addition, if the final stroke is a dot in the upper right corner, it is not taken as the final stroke or final identification code; the stroke before it is taken instead, because a dot in the upper right corner carries very little information (in other words, many characters have a dot in the upper right corner).
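The optimization objective described above, the number of duplicate codes produced by a given radical layout, can be sketched as follows. This is only an illustration under stated assumptions: the radical names, the toy character set and both layouts are hypothetical, the patent does not disclose its optimization procedure, and the complement-code rule for characters with fewer than three radicals is ignored here.

```python
from collections import Counter

def encode(radicals, layout):
    """Code of one character: first, second and last radical mapped to keys."""
    picked = radicals if len(radicals) < 3 else radicals[:2] + radicals[-1:]
    return "".join(layout[r] for r in picked)

def count_duplicates(characters, layout):
    """Number of characters whose code collides with an earlier character."""
    counts = Counter(encode(r, layout) for r in characters.values())
    return sum(n - 1 for n in counts.values())

# Hypothetical toy data: three characters built from five radicals.
characters = {
    "char1": ["r1", "r2", "r3"],
    "char2": ["r1", "r4", "r3"],
    "char3": ["r5", "r2", "r3"],
}
layout_a = {"r1": "A", "r2": "B", "r3": "C", "r4": "B", "r5": "D"}
layout_b = {"r1": "A", "r2": "B", "r3": "C", "r4": "E", "r5": "D"}

print(count_duplicates(characters, layout_a))  # 1
print(count_duplicates(characters, layout_b))  # 0
```

An optimizer would search over layouts for the one with the smallest count; placing r2 and r4 on different keys, as in `layout_b`, removes the collision in this toy case.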
Every Chinese character falls into one of the following 3 cases:
1. When a character has enough radicals for 3 codes, take its first, second and last radicals and press the space bar to terminate. For example: "Hunan" (FLW), "difficult" (XJT), and so on.
2. For a character with only two radicals, the 3rd code is determined by its final stroke together with its structure. The final stroke decides which zone the code is in, and the structure decides which position within that zone. Structures are divided into two classes, left-right and mixed: for a left-right structure press the second key of the zone, and for a mixed structure press the third key of the zone. For example, for the character "flash", first take the two radicals "door" and "person"; because the final stroke is right-falling and the structure is mixed, press the I key, the 3rd key of zone 4, as the complement key. Likewise, for the character " ", first press the two keys for "mouth" and "crust"; because the final stroke is turning and the structure is left-right, press the J key, the 2nd key of zone 2, as the complement key (see Fig. 2).
3. For a character that is itself a radical (having only one radical), press the key where the radical is located, then enter the first and last single strokes of the character in turn, then press the space bar (for the five single-stroke radicals horizontal, turning, left-falling, right-falling and vertical, press the key of the radical and then the R key twice).
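The complement-code rule of case 2 can be sketched as a lookup from final stroke and structure to a (zone, position) pair. This is a minimal illustration; the stroke names are English renderings of the five single strokes (the stroke called "folding" in the translation appears here as "turning"), the function name is hypothetical, and the actual letters at each position come from Fig. 2, which is not reproduced here.

```python
# Zone order of the five single strokes, as given in the text.
STROKE_ZONE = {
    "horizontal": 1,
    "turning": 2,
    "left-falling": 3,
    "right-falling": 4,
    "vertical": 5,
}

# Left-right structures use the 2nd key of the zone, mixed structures the 3rd.
STRUCTURE_POSITION = {"left-right": 2, "mixed": 3}

def complement_key_slot(last_stroke, structure):
    """Return (zone, position) of the complement key for a two-radical character."""
    return STROKE_ZONE[last_stroke], STRUCTURE_POSITION[structure]

# The two worked examples from the text:
print(complement_key_slot("right-falling", "mixed"))  # (4, 3), the I key
print(complement_key_slot("turning", "left-right"))   # (2, 2), the J key
```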
For a character that has a second-level short code, take only its first two radicals in turn and then terminate with the space bar. Both first-level and second-level short-code characters can also be entered with three codes according to the three cases above (the short code is displayed as a hint when the three-code form is entered). However, for first-level and second-level short-code characters the short code should be used wherever possible, and use of the three-code form (full code) is not encouraged.
Traditional Chinese characters are coded by the same principles, without any change.
At this point, in principle, all Chinese characters can be entered. When decomposing a particular character into radicals, the following principles apply: take codes in order, prefer the larger radical, and respect intuitiveness. In practice, once the decomposition of a few special components (those that are not radicals) has been mastered, the decomposition of a great many characters is readily resolved.
For radicals that the user does not know how to enter, the invention provides a universal learning key. If the user does not know how to enter a radical, this key is typed in its place, which makes learning on the fly very convenient.
In addition, the present invention provides a pinyin learning function: when the pinyin of a character is entered, the code of that character is displayed as a hint.
II. Principles of phrase coding
Phrase coding follows two main principles. First, each character in a phrase contributes at most its first two codes to the phrase code. Second, each phrase takes at most 4 codes (that is, the code terminates automatically once 4 codes have been typed).
1. Two-character phrases
(1) For a two-character phrase formed between first-level short-code characters, enter the code of each character in turn (each has only one code), then terminate with the terminator key of the second character. If the terminator key of the second character is the space bar, terminate with the semicolon (;) key instead. For example: "work" (ZG/), "I" (TF;).
(2) For a two-character phrase formed from a first-level short-code character and a character that is not a first-level short-code character: if the first character is not a first-level short-code character, enter its first two codes, then the code of the second character (only one code), then terminate with the terminator key of the second character, for example "going home" (QSZ/). If the first character is a first-level short-code character, enter its code (only one code), then the first two codes of the second character, then terminate with the terminator key of the first character, for example "school work" (NUG.). If the terminator key of the first-level short-code character is the space bar, terminate with the semicolon (;) key instead, for example: "single" (ZBJ;), "people" (LJB;).
(3) For a two-character phrase in which neither character is a first-level short-code character, take the first two codes of each character (4 codes in total), for example: "outstanding" (JAOY), "north" (CFPY).
2. Three-character phrases
For a three-character phrase, first take the first code of each character in turn; the last code falls into one of two cases:
(1) If the third character is a first-level short-code character, the last code is the terminator key of the third character. If that terminator key is the space bar, terminate with the semicolon (;) key instead. For example: "Chinese" (NML;).
(2) If the third character is not a first-level short-code character, the last code is the second code of the last character (4 codes in total). For example: "viaduct" (XELT).
3. Phrases of four or more characters
For a phrase of four or more characters, take the first code of each of the first three characters and of the last character (4 codes in total). For example: "becoming more prosperous every day" (AANC), "People's Republic of China (PRC)" (NJLM).
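The phrase rules for two-, three- and four-or-more-character phrases can be sketched as a single function. This is a minimal illustration, not the patent's implementation: the `Char` record, the field names and the function names are hypothetical, and each character is described only by the leading codes the rules need plus its terminator key when it is a first-level short-code character.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Char:
    codes: str                        # leading radical codes (only the prefix needed here)
    terminator: Optional[str] = None  # terminator key if first-level short code, else None

def end_key(term: str) -> str:
    # Inside phrases, a space terminator is replaced by the semicolon key.
    return ";" if term == " " else term

def phrase_code(chars: List[Char]) -> str:
    n = len(chars)
    if n < 2:
        raise ValueError("a phrase needs at least two characters")
    if n == 2:
        a, b = chars
        if b.terminator is not None:
            # Second character is first-level: end with its terminator key.
            head = a.codes[:1] if a.terminator is not None else a.codes[:2]
            return head + b.codes[:1] + end_key(b.terminator)
        if a.terminator is not None:
            # Only the first character is first-level: end with its terminator key.
            return a.codes[:1] + b.codes[:2] + end_key(a.terminator)
        # Neither is first-level: two codes from each character.
        return a.codes[:2] + b.codes[:2]
    if n == 3:
        head = "".join(c.codes[0] for c in chars)
        last = chars[2]
        if last.terminator is not None:
            return head + end_key(last.terminator)
        return head + last.codes[1]
    # Four or more characters: first code of the first three and of the last.
    return "".join(c.codes[0] for c in chars[:3]) + chars[-1].codes[0]

# Examples taken from the text:
print(phrase_code([Char("QS"), Char("Z", "/")]))            # QSZ/
print(phrase_code([Char("N", "."), Char("UG")]))            # NUG.
print(phrase_code([Char("N"), Char("M"), Char("L", " ")]))  # NML;
```

Keeping the terminator on the character record lets one function cover all the cases, since every branch depends only on which characters are first-level short-code characters.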
4. Handling of administrative place names
To further improve Chinese character input speed overall, this coding system includes the names of all administrative regions at county level and above within China (Taiwan Province is not included for now), more than 2,500 in total, and gives them a special phrase coding scheme in order to disperse duplicate codes.
For two-character place names, the coding method is the same as for general phrases.
For place names of three or more characters, if the last character is "province", "city" or "county", that last character does not take part in the phrase coding. In other words, the last character is dropped, and the remaining characters are coded in the same way as a general phrase.
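The place-name rule can be sketched as a small preprocessing step applied before phrase coding. This is an illustration only: the function name is hypothetical, and the characters 省, 市 and 县 are assumed to be the originals behind the translated "province", "city" and "county".

```python
# Assumed original characters for "province", "city" and "county".
ADMIN_SUFFIXES = ("省", "市", "县")

def coding_characters(place_name: str) -> str:
    """Return the characters of a place name that take part in phrase coding."""
    if len(place_name) >= 3 and place_name.endswith(ADMIN_SUFFIXES):
        return place_name[:-1]  # drop the administrative suffix
    return place_name           # two-character names are coded as general phrases

print(coding_characters("湖南省"))  # 湖南
print(coding_characters("北京"))    # 北京
```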
The present invention has collected more than 40,000 general phrases of all kinds and will continue to add more. The phrase coding principles above also apply to traditional Chinese characters.
III. Selecting among duplicate codes
When a single character or phrase has duplicate codes, the selection is made as follows: simply continuing to type enters the 1st candidate (that is, the 1st candidate needs no selection), the space bar enters the 2nd, the semicolon [;] key enters the 3rd, the comma [,] key enters the 4th, and the period [.] key enters the 5th. In this way duplicate-code characters and phrases can be selected without leaving the base key area. Selecting the 2nd character or phrase with the space bar, as proposed here, significantly improves selection speed. Number keys are not used for selection: pressing a number key enters the 1st candidate followed by that digit, and pressing a cursor key likewise enters the 1st candidate. In our coding no single-character code has more than 5 duplicates, and so far neither does any phrase code. Should any arise later, the [ and ] keys page backward and forward, after which the selection is made by the method above.
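The selection rule can be sketched as a key-to-rank lookup in which any key other than the four selection keys falls through to the 1st candidate. This is a minimal illustration; the names are hypothetical and paging with the bracket keys is omitted.

```python
# Keys that select the 2nd to 5th candidates; any other key enters the 1st.
SELECT_KEYS = {" ": 2, ";": 3, ",": 4, ".": 5}

def pick_candidate(candidates, key):
    """Return the candidate chosen by the key typed after a duplicate code."""
    rank = SELECT_KEYS.get(key, 1)
    return candidates[rank - 1]

candidates = ["first", "second", "third", "fourth", "fifth"]
print(pick_candidate(candidates, " "))  # second
print(pick_candidate(candidates, "a"))  # first
```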
Besides the first-level short-code characters, Fig. 1 also lists a corresponding phrase for each of them. To enter the corresponding phrase, simply press the space bar once more. For example, to enter the two-character phrase "we", press the T key followed by two spaces. Likewise, to enter the two-character phrase "still", press the D and ; keys followed by one space. Each second-level short-code character also has a corresponding phrase, which is selected in the same way as for first-level short codes. This is one concrete application of the duplicate-code selection method of the invention.
IV. User-defined phrases
For user-defined phrases we have invented a reverse word-taking method: the cursor stays at the end of the phrase and need not be moved. To make a user-defined phrase from the n characters before the cursor, press the Alt key together with key N, where N is one of the ten number keys 1234567890, which take 1 to 10 characters respectively. To take more than 10 characters, press Alt together with one of the keys QWERTYUIOP, which take 11 to 20 characters respectively. For example, to make a phrase from the 12 characters before the cursor, press Alt and W together and then press Enter. The code is generated by the computer according to the phrase coding principles; if the operator finds the generated code unsuitable (for example, because it duplicates the code of another phrase), the operator may supply a code of his or her own.
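The reverse word-taking rule can be sketched as follows: the key pressed with Alt gives the number of characters to take, and that many characters are sliced from just before the cursor. This is a minimal illustration; the function names are hypothetical and the text buffer is modelled as a plain string.

```python
DIGIT_ROW = "1234567890"   # Alt + these keys take 1 to 10 characters
LETTER_ROW = "QWERTYUIOP"  # Alt + these keys take 11 to 20 characters

def chars_to_take(key: str) -> int:
    """Number of characters selected by the key pressed together with Alt."""
    key = key.upper()
    if len(key) == 1 and key in DIGIT_ROW:
        return DIGIT_ROW.index(key) + 1
    if len(key) == 1 and key in LETTER_ROW:
        return LETTER_ROW.index(key) + 11
    raise ValueError(f"unsupported key: {key!r}")

def take_before_cursor(text: str, cursor: int, key: str) -> str:
    """Characters immediately before the cursor that form the new phrase."""
    n = chars_to_take(key)
    return text[max(0, cursor - n):cursor]

print(chars_to_take("W"))                        # 12
print(take_before_cursor("abcdefgh", 8, "3"))    # fgh
```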
For user-defined phrases we also provide a new terminator key, the ' key. Depending on its usage frequency, a user-defined phrase may be given a one-code, two-code or three-code form terminated with the ' key.
Fig. 3: Pei code second-level short-code table

Claims (10)

1. A Chinese character shape-coding scheme and keyboard thereof, characterized in that radicals are selected and their keyboard positions optimized according to the character-forming frequency of each radical and the usage frequency of the related characters, so that each Chinese character takes at most three codes; and in that the high-frequency irrational coding principle is proposed and used.
2. A method (high-frequency irrational) of selecting first-level short-code characters in order to reduce the keystrokes of high-frequency characters and their phrases and to disperse duplicate codes among high-frequency characters.
3. According to claim 1, a position map (radical table) of all radicals on the keyboard obtained by optimization, the objective of the optimization being to minimize duplicate codes among all 6763 Chinese characters of GB-2312 (80), and in particular to keep duplicate codes among common characters to a minimum.
4. According to claims 1 and 2, a high-frequency irrational coding scheme in which the codes of the first-level short-code characters are determined by optimization, the objective of the optimization being to minimize duplicate codes among the phrases that the first-level short-code characters form with one another.
5. According to claims 1 and 2, a method that, in addition to using the space bar as a terminator key, uses several other characters as terminator keys, all of which are located in the base key area, such as the (,), (.), (/) and (;) keys.
6. According to claims 1 and 3, a method in which the single strokes of Chinese characters are arranged by optimization on certain key positions in the order horizontal, turning, left-falling, right-falling, vertical, and in which Chinese characters are divided into two structure types, left-right and mixed, which together with the final stroke supply the last code of characters having fewer than three codes.
7. According to claims 1 and 2, a Chinese character coding scheme that can code simplified and traditional Chinese characters alike without modifying any radical or coding principle.
8. According to claims 1 and 2, a reverse word-taking method for user-defined phrases in which the cursor need not be moved when a phrase is defined.
9. According to claims 1, 2, 3 and 4, the coding of all 6763 Chinese characters of GB-2312 (80), forming a high-frequency irrational Chinese character shape-code book and a keyboard position map, which include the first-level short codes, second-level short codes, full codes (three-code forms) and general codes.
10. According to any one of the preceding claims 1 to 9, the method of coding simplified and traditional Chinese single characters and phrases, from which related software and hardware products can be made and which can be used in all large, medium, small and micro Chinese character information processing computer systems, Chinese terminals, Chinese typewriters, communication systems and all fields of Chinese character sorting and retrieval.
CN 95107975 1995-08-08 1995-08-08 High-frequency irrational Chinese coding and its keyboard Pending CN1117164A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 95107975 CN1117164A (en) 1995-08-08 1995-08-08 High-frequency irrational Chinese coding and its keyboard

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 95107975 CN1117164A (en) 1995-08-08 1995-08-08 High-frequency irrational Chinese coding and its keyboard

Publications (1)

Publication Number Publication Date
CN1117164A true CN1117164A (en) 1996-02-21

Family

ID=5076559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 95107975 Pending CN1117164A (en) 1995-08-08 1995-08-08 High-frequency irrational Chinese coding and its keyboard

Country Status (1)

Country Link
CN (1) CN1117164A (en)

Similar Documents

Publication Publication Date Title
CN1129837C (en) Mounting device for universal Chinese phonetic alphabet keyboard
CN1117164A (en) High-frequency irrational Chinese coding and its keyboard
CN1058341C (en) Method for phonetic coding and its deaf-mute computer keyboard
CN1384426A (en) Dian code Chinese character input method for computer
CN1106146A (en) Computer input method by computer Chinese-character phonology-tone coding and its keyboard
CN1135056A (en) High frequency irrational Chinese characters pattern codes and keyboard thereof
CN113253853B (en) Chinese character input method for computer and mobile phone
CN1043381C (en) Four-stroke digit look-up method for Chinese characters
CN1111373A (en) Computer Chinese input scheme based on the Chinese Phonetic Alphabet
CN1017662B (en) Irrational order no. digital coding method and the keyboard thereof
CN1062667C (en) All spelling form guide code Chinese character input system
CN1119742C (en) Pictophonetic code computer keyboard inputting method for Chinese characters
CN2476059Y (en) Keyboard for Jiang code input method
CN1022350C (en) Chinese alphabet coding input method
CN1121007C (en) Chinese-character five tones-digital code input method and keyboard
CN1072519A (en) Chinese character radicals and strokes input method
CN1299189C (en) Phonetic zoned digital code Chinese character indexing system and phonetic zoned digital code input method
CN1074147C (en) Five-code Chinese character input process
CN1612095A (en) Double phonetic alphabet input method
CN1425972A (en) Fast and easy Chinese character input method and keyboard
CN1068444C (en) Method of Chinese-character coding
CN1320426C (en) Chinese character input single stroke digital keyboard for electronic equipment
CN1108553C (en) Universal popular voice form Chinese character coding input method
CN1160243A (en) Character shape stroke order code Chinese character entering system and keyboard thereof
CN1409199A (en) Left and right pictophonetic and digital computer input method for Chinese character and its keyboard

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C01 Deemed withdrawal of patent application (patent law 1993)
WD01 Invention patent application deemed withdrawn after publication