CN1233794A - Chinese character coding input method, keyboard and retrieval method therefor - Google Patents

Chinese character coding input method, keyboard and retrieval method therefor

Info

Publication number
CN1233794A
Authority
CN
China
Prior art keywords
parts
chinese character
scheme
chinese
present
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 99116305
Other languages
Chinese (zh)
Other versions
CN1116635C (en
Inventor
王小军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN 98113555 external-priority patent/CN1204083A/en
Application filed by Individual filed Critical Individual
Priority to CN 99116305 priority Critical patent/CN1116635C/en
Publication of CN1233794A publication Critical patent/CN1233794A/en
Application granted granted Critical
Publication of CN1116635C publication Critical patent/CN1116635C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Landscapes

  • Document Processing Apparatus (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a Chinese character coding and retrieval scheme that is easy to learn and conforms 100% to the "Modern Chinese Common Words Stroke Order Standard". Based on the "Chinese Character Parts Standard" issued by the National Language and Character Committee and on the "Chinese character radical statistical table", the invention selects 122 commonly used parts, none of which is used in a deformed shape, which also makes the scheme helpful for primary and middle school pupils learning characters.

Description

A Chinese character coding input method, its keyboard, and a Chinese character retrieval method
The present invention is mainly a scheme for splitting Chinese characters into parts, together with a computer Chinese character coding scheme and a retrieval scheme designed on the basis of that splitting scheme. This application claims domestic priority: it refines the earlier invention with respect to standardization and ease of memorization and, weighing what is primary and what is secondary, drops a few minor parts so that the essentials stand out.
At present there are many Chinese character coding schemes, but the simple ones produce many duplicate codes and are slow to type, while the ones with few duplicate codes and high input speed are hard to learn and hard to remember. The radical lookup method used in the "Xinhua Dictionary" is not connected to any computer Chinese input method, and looking up a character with it is neither as simple nor as fast as with the lookup method of the present invention.
The two principal themes of current research on computer Chinese character coding should be popularization and standardization, yet most efficient shape-based coding schemes are neither simple nor standardized. In 1997 the State Language Work Committee successively issued the "Modern Chinese General Words Normative Stroke Order" and the "Information Processing GB13000.1 Character Set Chinese Character Parts Standard". The "normative stroke order" is based on the writing habits most people form in primary and middle school, so people accept it readily. The "parts standard" takes "follow the shape, respect the motivation, be grounded in modern usage, refer to history" as its basic rule and rests on recent theoretical research into computer coding techniques and Chinese character structure; it gives important guidance to research on computer Chinese character coding. After first applying for a patent on this invention on May 18 of this year, I began writing to experts, and had the honor of receiving from Professor Chen Yifan of the Beijing Information Technology Institute a copy of the "parts standard" and the first issue of "PC World", which carried important information on computer Chinese character coding research. I then studied the "parts standard" carefully, so that my invention, while obeying the "normative stroke order" completely, also obeys the "parts standard" as far as possible. The recent reworking of the invention is based on the "parts standard" and on the "radical statistical table" in Professor Chen Yifan's "Chinese Character Keyboard Input Technology and Theoretical Foundation", and was designed through careful statistics on the 7000 Chinese characters in the "normative stroke order". I believe this reworking simplifies computer Chinese character coding further while keeping commonly used parts whole as far as possible.
The main reason shape-based coding schemes are hard to learn is generally that they require a lot of memorization. A Chinese character combines sound, shape, and meaning, and only the phonetic alphabet relates directly to the keys of a QWERTY keyboard. For that reason some earlier coding schemes also took the pinyin of the parts into account when assigning keys, to make them easier to remember. But pure shape-based schemes that assign keys by the pinyin of the parts are currently very rare ("cognitive code" is one of them), because handling the parts that way alone is bound to distribute them unevenly over the keys and so produce duplicate codes; moreover, shape-based schemes generally use many parts, which makes this hard to manage. In fact Chinese is a well-developed language with a rich vocabulary: one can use not only the pinyin of a part itself but also the pinyin of words associated with the part to tie it to a key. Because the present invention selects relatively few parts, such treatment is easier. The invention assigns keys using the pinyin of the parts and the pinyin of their associated words; one only needs to learn the names of the parts first and then look at a few associated words to remember which key each part belongs to. The invention also distributes the parts over the keys as evenly as possible (the inventor counted, over the 7000 general-use characters, how many characters each part helps form), so duplicate codes should also be few.
At present there is no shape-based coding scheme that splits characters into parts (including splitting according to the "parts standard") while obeying the "normative stroke order" 100%. The usual way of splitting the following three characters illustrates this: (1) 回 ("return"): 口, 口. (2) 叵 ("impossible"): 匚, 口. (3) 或 ("or"): 戈, 口, 一. Such splits certainly look reasonable, but they have two shortcomings: 1. for schemes that claim to split according to stroke order, the rules are not applied uniformly; 2. for schemes that do not mention stroke order at all, the "parts standard" and the "normative stroke order" are not unified. Of course, unification must also be reasonable: it should be simple and unambiguous. The invention's principle of "take the large, discard the small, 100% under the normative stroke order" is itself a rule-governed, unambiguous way of splitting under the "normative stroke order". The "splitting examples" of the present invention include the splits of these three characters; the reader may judge whether they are simple.
Since Xu Shen of the Eastern Han Dynasty invented the radical lookup method, it has been in use to this day, and no shape-based lookup method has really replaced it. Wang Yunwu's "four-corner code lookup method" is easy to use for those skilled in it, but because it requires a certain amount of memorization, few ordinary people learn it properly. Someone unfamiliar with it finds retrieval difficult even with its index table at hand. The present invention, drawing on the features of its own splitting scheme and on the valuable experience of the "four-corner code lookup method", designs its own numbered-part lookup method. Its characteristic is that nothing needs to be memorized: a character is retrieved simply from the numbers of its first and last parts in the index table. Compared with the radical lookup method, it distributes characters more evenly, is more convenient for retrieval, is unambiguous, and needs no "index of hard-to-find characters".
The object of the present invention is to provide a computer Chinese character coding scheme that is easy to learn, has few duplicate codes, and is fast, and a Chinese character retrieval scheme that is common to dictionaries and computers and is easy to learn and use.
The guiding idea of the invention is: "strengthen the discrimination of characters by the first and last parts, reduce the number of splitting rules, obey the normative stroke order 100%, and balance every indicator." The advantages of the invention are that it is easy to learn and easy to remember and that each character has a unique split. If Chinese people learn to use the method of the invention, they will all be able to write characters in the standard stroke order.
The invention is introduced below in the order "concepts → Chinese character splitting scheme → Chinese character coding scheme → Chinese character retrieval scheme → remarks and outlook".
One, Concepts
(1) First part, second part, third part, last part: the present invention treats the part containing the first stroke of a character and the part containing its last stroke separately, and calls them the "first part" and the "last part". After the first part is removed from a character, the first part of what remains is called the "second part". After the first and second parts are removed, the first part of what remains is called the "third part". (In the earlier patent application documents these four were called "one", "two", "three", and "back pen"; following the suggestion of the senior Shi Yun Cheng of the China Chinese Character Information Institute, I have changed them to these more intuitive names.) Obviously the second and third parts still take the form of first parts. The last part is used at most once when a character is split; in the coding scheme it strengthens the discrimination between characters, much like an "identification code". Every part is chosen strictly according to the principle "take the large, discard the small, 100% under the normative stroke order". In the "holder sound" coding schemes of the invention, the inventor considered the evenness of the key distribution of first parts and last parts together, so when memorizing the keys there is no need to distinguish first parts from last parts; once the keys are memorized, one then learns which parts can serve as first parts and which as last parts. Because the choice of first and last parts follows from the characteristics of the characters themselves, making this distinction after memorizing the keys is not difficult. The greatest benefit of treating first and last parts separately is stronger discrimination between characters and fewer duplicate codes (treating them separately in effect discriminates each character twice).
(2) component names: the present invention chooses 122 parts altogether, their title is in two kinds of situation: 1, character formation component is totally 60, with word this as its name: the ten factories soil three big workers cun bad lichen of wooden king's car Ge Kou at a tenth of the twelve Earthly Branches mountain towel day shellfish end several thousand months gas Niu Hebai of order field worm bone people eight an ancient type of spoons vow the wide Men Huoxin cubic meter of boat fish again the women horse bow of power cutter corpse cling to little the sixth of the twelve Earthly Branches two the dawn ware bird.2, character non-formation component is totally 62, their title is chosen three kinds of situations of branch: 1. following 48 parts are parts in " parts standard ", and their title takes from Beijing spoken and written languages university Xing Hongbing, Cui Yonghua, open kind " suggestion that parts are said the name of sth. ": one (horizontal stroke) Myeon (page or leaf prefix) Lv (grass-character-head) Rolling (handle) (blue or green prefix) Xi (wanting prefix) Shu (erecting)
Figure A9911630500051
(accounting for prefix) ㄇ (abundant word waist) several (with the word frame) (prefix still) Si (sieve prefix) Pie (left-falling stroke) Ren (single upright people) (people crouches) Qe (Ai Zidi)
Figure A9911630500052
(owe prefix, annotate: " angle prefix " also incorporated these parts into) Bao (bag prefix)
Figure A9911630500053
(anti-prefix)
Figure A9911630500054
(all word frames) Chi (one of the Chinese character components) Quan (by the dog) Cannibals (food is other)
Figure A9911630500055
(announcement prefix) Jin (gold is other) (bamboo prefix) Dian (point) Rui (WAWQ) Yan (speech is other) Mi (bald Bao Gai) Tou (capital prefix) Ha (the Eight characters) Rui (3 water) Xin (the perpendicular heart) Http (Bao Gai) Chuo (walking it) (emerging prefix) Woo (showing other) Epileptic (by the sick word) Yi (clothing benefit) Si (private word limit)
Figure A9911630500056
(silk word angle) Si (hank knotting) ㄒ (lower word) Jie (monaural) Xiangxi (horizontal 4 points) (lifting-hook)
Figure A9911630500057
(Cang Zidi); The title of 2. following 4 parts is taken from the popular title method in many dictionaries, the data:
Figure A9911630500058
(rain prefix)
Figure A9911630500059
(other) Ya (folding) of sufficient word (perpendicular crotch); 3. be that characteristics according to the present invention are chosen with lower member, oneself is named the inventor: Network (sunset prefix) ㄅ (bird head)
Figure A99116305000510
(skin prefix) Fu (ear)
Figure A99116305000511
(at the bottom of state's word)
Figure A99116305000512
(at the bottom of the fish word)
Figure A99116305000513
(the horizontal point of going up) (clothing tail) ㄑ (water tail)
Figure A99116305000515
(seeing at the bottom of the word).
In the earlier invention documents, the "fold" was divided into "left fold" and "right fold", and the "dot" into "inner dot" and "outer dot". The inventor's careful statistics on the 7000 general-use characters show that the characters whose last part is "Ya (fold)" fit comfortably in one group, so "left fold" and "right fold" need not be separated. Characters whose last part is "Dian (dot)" are more numerous, but once the inventor split off some commonly used parts ending in "Dian" as last parts in their own right, the remaining characters with "Dian" as last part also fit in a single group, so "inner dot" and "outer dot" need not be separated either.
(3) "Sound code" of a part: the "sound code" is defined in two cases: 1. The sound code of a character-forming part is the initial letter of the pinyin of that character; for example, the sound codes of 口 (kou), 寸 (cun), and 止 (zhi) are "k", "c", and "z" respectively. 2. The sound code of a non-character-forming part is the initial letter of the pinyin of the first character of its name; for example, the sound codes of "Qe, Xi, ㄇ, Si" are "a", "y", "h", and "l" respectively.
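The two-case definition above can be sketched in a few lines of Python. This is only an illustration: the small `PINYIN` table, the function names, and the reading "yao" for the name glossed above as "wanting prefix" are assumptions made for the example, not data taken from the patent's tables.

```python
from typing import Optional

# Illustrative pinyin readings for three character-forming parts
# (assumed lookup table; the patent does not give one in this form).
PINYIN = {
    "口": "kou",
    "寸": "cun",
    "止": "zhi",
}

def sound_code(reading: str) -> str:
    """Return the initial letter of a pinyin reading."""
    if not reading:
        raise ValueError("empty pinyin reading")
    return reading[0].lower()

def part_sound_code(part: str, name_first_reading: Optional[str] = None) -> str:
    """Sound code of a part.

    Case 1 (character-forming part): initial of the character's own pinyin.
    Case 2 (non-character-forming part): initial of the pinyin of the first
    character of the part's name, passed in as name_first_reading.
    """
    if name_first_reading is not None:
        return sound_code(name_first_reading)
    return sound_code(PINYIN[part])

codes = [part_sound_code(p) for p in ("口", "寸", "止")]
print(codes)  # ['k', 'c', 'z']

# Case 2: a part whose name starts with a character read "yao" (assumed reading).
print(part_sound_code("Xi", name_first_reading="yao"))  # y
```

Passing the name reading explicitly keeps the sketch independent of any particular list of part names.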
(4) The "take the large, discard the small" principle: "prefer the larger part", as used in the past, is an imprecise notion; since it is only a "preference", it admits exceptions. The invention's "take the large, discard the small" means that when a character is split, the largest possible part among the parts of the invention is always taken, with no exception whatsoever. The "identification" of some last parts may not look entirely reasonable from the point of view of "splitting" a character, but the last part in the invention does not serve to "split": it is used at most once and serves only to "identify". The cases that look less reasonable from the "splitting" point of view are also all very typical; having met one once, a learner will recognize it afterwards. In the course of the statistics the inventor found only the following seven cases: (1) "mortar, fragrant-flowered garlic, ear, straight": by the "take the large, discard the small" principle their last parts are all "two". (2) The last parts of "face, return, west, the tenth of the twelve Earthly Branches" are all
Figure A9911630500061
(the dotted line also counts under the "take the large, discard the small" principle). (3) The last parts of "field, live, hero, careful" are all "soil". (4) The last parts of "in, Shen, sheep" are all "ten". (5) The last parts of "clothing, group" are all
Figure A9911630500062
(6) The last parts of "water, sudden and violent, million" are all "ㄑ". (7) The last parts of "do not have, see" are all (glyph not reproduced). As these example characters show, given the "normative stroke order", the first and last parts of any Chinese character are easy to identify by the "take the large, discard the small" principle.
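One way to picture the principle is as a greedy longest match over a character's strokes in standard order. The sketch below is a simplification under stated assumptions: strokes are written as digits for the five stroke classes (1 horizontal, 2 vertical, 3 left-falling, 4 dot, 5 fold; the digit labels are chosen for this example), the `PARTS` table is a tiny hypothetical inventory, and two-dimensional layout and the separate recognition of the last part are ignored.

```python
from typing import Dict, List

# Hypothetical mini-inventory: stroke-class sequence -> part.
PARTS: Dict[str, str] = {
    "1": "一",
    "2": "丨",
    "12": "十",
    "251": "口",
    "2511": "日",
}

def split_largest_first(strokes: str, parts: Dict[str, str] = PARTS) -> List[str]:
    """Split a stroke sequence by always taking the longest matching part."""
    result: List[str] = []
    pos = 0
    while pos < len(strokes):
        # Try the longest remaining span first, then shorter ones.
        for length in range(len(strokes) - pos, 0, -1):
            candidate = strokes[pos:pos + length]
            if candidate in parts:
                result.append(parts[candidate])
                pos += length
                break
        else:
            raise ValueError(f"no part matches the strokes at position {pos}")
    return result

# 早 written as vertical, fold, horizontal, horizontal, horizontal, vertical.
print(split_largest_first("251112"))  # ['日', '十']
```

Because the longest match always wins, the strokes of 早 cannot be read as 口 followed by 一 and 十, which is the kind of ambiguity the "no exception" wording is meant to rule out.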
Two, Chinese character splitting scheme
(1) Splitting rules and treatment of parts: any Chinese character is split into four parts by just four simple rules: 1. The first part is recognized directly. 2. The second part is taken 100% according to the normative stroke order; if there is no second part, the first part is used in its place. 3. The third part is taken 100% according to the normative stroke order; if there is no third part, the second part is used in its place (including a second part that is itself a substituted first part). 4. The last part is recognized directly. In accordance with the "normative stroke order", the present invention classes "
Figure A9911630500064
(rising stroke)" under "one (horizontal)" and "Fu (right-falling stroke)" under "Dian (dot)". Except that among the last parts "(lifting hook)" and "(vertical hook)" are listed separately, in all other cases a folded stroke is classed as "Ya (fold)" (because the invention does not consider deformation of parts, " " is regarded as a folded stroke). To make recognition easier, the invention merges "day, say" into "day" and "soil, scholar" into "soil". Table 1 is the "parts list" of the invention: first parts are divided by their first stroke into five zones, "horizontal, vertical, left-falling, dot, fold", and last parts are divided by their last stroke into the same five zones. Two of the last parts are special: 1. (glyph not reproduced): the dotted line indicates a stroke that does not exist, but it distinguishes the part from the final stroke " "; characters with this last part include "state, face, time". 2.
Figure A9911630500066
: the dotted line indicates a stroke that does not exist, but it distinguishes the part from the final stroke "Dian"; characters with this last part include "Jian, one-tenth, force, I". The "dotted line" is the invention's way of letting commonly used parts discriminate between characters while strictly obeying the normative stroke order. Table 1
Figure A9911630500067
(2) Splitting examples: with Table 1 and the splitting rules introduced above in mind, consider the splits of the following characters: (1) one: one (first part, recognized directly), one (no second part, so the first part is used), one (no third part, so the second part is used), one (last part, recognized directly). (2) early: day, ten, ten (no third part, so the second part is used), ten (last part, recognized directly). (3) of a specified duration: Network, Dian, Dian, Dian. (4) I: Pie, Rolling, Ya, (if " " were treated as "Shu", the first part would be "thousand", which does not match people's intuition). (5) in: , one, Yi, Ya (" " counts as a fold among first parts), (" " is a separate last part). (6) do not have: one, one, Pie, (7) also: Ya, Shu, Ya (" " counts as a folded stroke among first parts), (" " is a separate last part). (8) in: mouth, Shu, Shu, ten. (9) China: Ren, Pie, Ya, ten. (10) people: people, people, people, people. (11) people: Ya, Yi, Ya, Ya. (12) altogether: Lv, , eight, eight. (13) and: standing grain, mouth, mouth, mouth. (14) state: ㄇ, king, Dian,
Figure A9911630500073
(15) special: one, Yi, Ya, Dian. (16) profit: standing grain, Shu, Ya, . (17) office: Shi, Ya, mouth, mouth. (18) English: Lv, ㄇ, big, big. (19) hero: one, Pie, Si, soil. (20) impossible: one, Kou, Ya, Ya. (21) or: one, mouth, one,
Figure A9911630500074
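The substitution rules (reuse the first part when there is no second, reuse the second when there is no third) can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the front parts have already been identified in stroke order and that the last part has been recognized separately, and the function name is hypothetical.

```python
from typing import List, Tuple

def four_part_split(front_parts: List[str], last_part: str) -> Tuple[str, str, str, str]:
    """Return (first, second, third, last) with the substitution rules applied.

    front_parts: parts taken from the front of the character in stroke order
                 (at least one; only the first three are used).
    last_part:   the directly recognized last part.
    """
    if not front_parts:
        raise ValueError("a character needs at least a first part")
    first = front_parts[0]
    # Rule 2: no second part -> reuse the first part.
    second = front_parts[1] if len(front_parts) > 1 else first
    # Rule 3: no third part -> reuse the second part (possibly itself a reused first part).
    third = front_parts[2] if len(front_parts) > 2 else second
    return (first, second, third, last_part)

# Example (1): a character that is a single horizontal part.
print(four_part_split(["一"], "一"))        # ('一', '一', '一', '一')
# Example (2), "early": front parts 日 and 十, last part 十.
print(four_part_split(["日", "十"], "十"))  # ('日', '十', '十', '十')
```

Keeping the last part as a separate argument mirrors the text's point that it only identifies the character and takes no part in the front-to-back splitting.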
Three, Chinese character coding scheme
Following the splitting scheme of the invention, the parts are assigned to the keys of the keyboard, and characters are encoded according to the keys of the parts they split into. In designing the coding schemes the invention takes into account ease of learning, the duplicate-code rate, and special requirements, and the accompanying drawings give four specific key arrangements, whose "key maps" are Fig. 1, Fig. 2, Fig. 4, and Fig. 5 respectively. Except in Fig. 4, in each figure the parts marked after an uppercase letter are first parts and those marked after a lowercase letter are last parts. Uppercase and lowercase letters make no difference when coding; their only purpose here is to help people tell first parts from last parts.
For the "holder sound" schemes, there is no need to distinguish first parts from last parts while memorizing the keys; once the keys are memorized, one only needs to learn which parts can serve as first parts and which as last parts. First parts are used for the splitting at the front of the character; last parts are used only for the final identification.
(1) Holder sound (I) scheme: "holder sound" means assigning each part to the key of its sound code. If every part were assigned by sound code, the amount to memorize would be minimal, but the character-forming frequency of the parts on each key would be uneven and duplicate codes would arise easily. So in this scheme the inventor has given some parts simple special treatment. As shown in Fig. 1, most parts are assigned to the key of their sound code; only the underlined parts have had their key changed. The treatment falls into two cases: 1. The part is assigned to the key of the sound code of the underlined word in its associated word or phrase: Rui (dangerous in the water, take care) → a, month (month youngster) → e, Xin ten tors (erect confidence, cross ten bedstone mountains) → f, Shu (setting) → l, (Mr. football) → x, (hard hook) → g, (vertical hook) → w. 2. The key is remembered from a homophone of the underlined word in the associated word or phrase: Lv (love lawn) → i, Jin (oracular words) → o, Rolling (right hand) → u, wood (timber fender) → v. With these mnemonic words and the sound codes of the parts themselves, the key of each part can be remembered easily.
General coding input rule for a single character: press in turn the keys corresponding to the character's first part, second part, third part, and last part to enter the character. Examples of splitting and coding: (1) force: one, one, end,
Figure A9911630500076
; its code is hhzh. (2) Chinese: Rui, again, again, again; its code is ayyy. (3) worker: worker, worker, worker, worker; its code is gggg. (4) already: Shu, Shu, Ha, one; its code is lldh. (5) big: big, big, big, big; its code is dddd. (6) learn: , Mi, son, son; its code is xtzz.
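The input rule amounts to a table lookup per part. The sketch below uses a four-entry excerpt of a key table for holder sound (I); the dictionary itself and the reading of "Rui" as 氵 and "again" as 又 are assumptions made for the example, consistent with the codes ayyy, gggg, and dddd quoted above.

```python
from typing import Dict, Sequence

# Assumed excerpt of a part -> key table for holder sound (I).
KEY_OF_PART: Dict[str, str] = {
    "工": "g",  # sound code of gong
    "大": "d",  # sound code of da
    "又": "y",  # sound code of you
    "氵": "a",  # reassigned by mnemonic, per the text ("Rui ... -> a")
}

def encode(parts: Sequence[str], key_of_part: Dict[str, str] = KEY_OF_PART) -> str:
    """Map (first, second, third, last) parts to a four-letter code."""
    if len(parts) != 4:
        raise ValueError("expected exactly four parts")
    return "".join(key_of_part[p] for p in parts)

print(encode(["工", "工", "工", "工"]))  # gggg
print(encode(["氵", "又", "又", "又"]))  # ayyy
```

A part missing from the table raises `KeyError`, which in a real input method would correspond to a split that uses something outside the 122 selected parts.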
Brevity codes: 1. First-level brevity code: first part + space, 26 in total. 2. Second-level brevity code: first part + last part + space, 26 × 22 = 572 in total. 3. Third-level brevity code: first part + second part + last part + space; up to 26 × 26 × 22 = 14872 could be designed (the number of legal combinations will of course be much smaller).
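The capacity figures for the three brevity-code levels follow from simple multiplication, reproduced here as a quick check. The constant names are hypothetical; the factor 22 is taken as given from the text, and reading it as the number of keys that carry last parts is an assumption.

```python
FIRST_PART_KEYS = 26   # keys that can carry a first (or second) part
LAST_PART_KEYS = 22    # factor used in the text; assumed to be keys carrying last parts

level1 = FIRST_PART_KEYS                                      # first part + space
level2 = FIRST_PART_KEYS * LAST_PART_KEYS                     # first + last + space
level3 = FIRST_PART_KEYS * FIRST_PART_KEYS * LAST_PART_KEYS   # first + second + last + space

print(level1, level2, level3)  # 26 572 14872
```

These are upper bounds on key combinations; as the text notes, far fewer correspond to actual characters.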
Word codes: 1. Two-character words: first part of the first character + last part of the first character + first part of the second character + last part of the second character. For example, "China" is encoded as kshg. 2. Three-character words: first part of the first character + first part of the second character + first part of the third character + last part of the third character. For example, "Patent Office" is encoded as hhsk. 3. Words of four or more characters: first part of the first character + first part of the second character + first part of the third character + first part of the last character. For example, "unites like one man" is encoded as hrhx.
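The three word-code rules can be sketched as one function over per-character key pairs. The `CharKeys` type and function name are hypothetical, and the per-character keys in the examples are inferred from the codes quoted above (kshg, hhsk, hrhx); keys the text does not determine are shown as "?".

```python
from typing import List, NamedTuple

class CharKeys(NamedTuple):
    first: str  # key of the character's first part
    last: str   # key of the character's last part

def word_code(chars: List[CharKeys]) -> str:
    """Build a four-key word code from the characters of a word."""
    n = len(chars)
    if n < 2:
        raise ValueError("word codes apply to words of two or more characters")
    if n == 2:
        return chars[0].first + chars[0].last + chars[1].first + chars[1].last
    if n == 3:
        return chars[0].first + chars[1].first + chars[2].first + chars[2].last
    # Four or more characters: first parts of characters 1-3 and of the last character.
    return chars[0].first + chars[1].first + chars[2].first + chars[-1].first

print(word_code([CharKeys("k", "s"), CharKeys("h", "g")]))                      # kshg
print(word_code([CharKeys("h", "?"), CharKeys("h", "?"), CharKeys("s", "k")]))  # hhsk
print(word_code([CharKeys("h", "?"), CharKeys("r", "?"),
                 CharKeys("h", "?"), CharKeys("x", "?")]))                      # hrhx
```

Every word code has exactly four keys regardless of word length, which matches the four-key single-character codes.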
(2) Holder sound (II) scheme: the key assignments of this scheme are independent of those of holder sound (I); learning either of the two schemes is enough to encode Chinese characters. Holder sound (II) simply distributes the parts more evenly, so in theory it should produce even fewer duplicate codes. It requires slightly more memorization than holder sound (I), but not much, and every case has a mnemonic. As shown in Fig. 2, here too most parts are assigned to the key of their sound code, and only the underlined parts have had their key changed. The treatment again falls into two cases: 1. By sound code: Jin shellfish (treasuring gold gives a farewell dinner) → a, horse order door rice (horse stares fixedly at rice outdoors) → e, 30 Xin (30 people erect confidence and do kind cause) → f, Ha an ancient type of spoon (workman made eight daggers in one day) → g, Chuo (walking energetically) → j, the Yi fish
Figure A9911630500081
(drawing two fishes on the clothes) → l, fire (fiery ox battle array) → n, (cowhide football) → p, Dian (keeping one's hair on a bit) → q, standing grain ㄇ (seedlings of cereal crops are called daylight) → r, (bamboo has been bent) → w, Yan (careful speech) → x, (vertical hook) → w. 2. By homophone: Rui (love is drunk water) → i, tor Shu (on the bare tor of "o" shape there is a tree) → o, Rolling (right hand) → u, Lv (grazing) → v, two (love is drunk strong, colourless liquor distilled from sorghum) → i, Xiangxi (there are four horizontal dots) → u, ("v"-shaped lifting hook) → v.
Apart from the different key assignments of the parts, this scheme uses the same general input rule for single characters and the same design of brevity codes and word codes as holder sound (I).
(3) Numeric-key holder sound scheme: this scheme first links the phonetic letters to the number keys through the sound code of the underlined word in each numeral or numeral phrase: t (one day one night) → 1, e (two) → 2, s (three) → 3, g (enough four times) → 4, w (five) → 5, l (six) → 6, q (seven) → 7, b (eight) → 8, j (nine) → 9, and k (mouth: pictograph) → 0. The phonetic letters are assigned to the number keys according to these relations. The remaining phonetic letters are then linked to them through the following associated words: 1 (there is red sun the sky) tr, 2 (trio) ec, 3 (three) s, 4 (anyway inenough) fmg, 5 (dinner party) wy, 6 (difficult next) nzl, 7 (in high mood) xqh, 8 (seeking treasure in the water) Rui b, 9 (love drive) apj, 0 (big mouth) dk. Fig. 3 is the "key memory map" of this scheme; it marks the associations introduced above and helps people remember the key assignments. With these links, all first parts of the invention other than "Rui" are then assigned to the number key on which their sound code lies, which yields the "key map" shown in Fig. 4. Because this scheme uses only the 10 number keys, it cannot by its nature avoid duplicate codes, so there is no need to distinguish too finely, and only the 106 first parts are used. After the first, second, and third parts are removed from a character, the first part of what remains is called the "fourth part", and one rule is added: the fourth part is taken 100% according to the normative stroke order; if there is no fourth part, the third part is used in its place.
General coding input rule for a single character in this scheme: press in turn the keys corresponding to the character's first part, second part, third part, and fourth part to enter the character. Examples of splitting and coding: (1) material: wood, Yi, Ya, Pie; its code is 4769. (2) material: rice, Dian, Dian, ten; its code is 4003. (3) learn: , Mi, son, son; its code is 7166. (4) institute: Fu, Http, one, one; its code is 2877. (5) worker: worker, worker, worker, worker; its code is 4444. (6) divide: eight, knife, knife, knife; its code is 8000. (7) class: king, Dian, Pie, king; its code is 5095.
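The letter groups given for the ten number keys can be turned into a lookup table, after which a numeric code is just the sound codes of the four parts mapped to digits. The sketch below assumes its input is the sequence of sound codes of the first to fourth parts; the names `DIGIT_LETTERS`, `DIGIT_OF`, and `to_digits` are hypothetical, and the one part the text places on a key by exception ("Rui" on 8) is not modeled.

```python
from typing import Dict

# Letter groups per number key, as listed in the description of this scheme.
DIGIT_LETTERS: Dict[str, str] = {
    "1": "tr", "2": "ec", "3": "s", "4": "fmg", "5": "wy",
    "6": "nzl", "7": "xqh", "8": "b", "9": "apj", "0": "dk",
}

# Invert the table: sound-code letter -> number key.
DIGIT_OF: Dict[str, str] = {
    letter: digit
    for digit, letters in DIGIT_LETTERS.items()
    for letter in letters
}

def to_digits(sound_codes: str) -> str:
    """Convert the sound codes of a character's four parts to a numeric code."""
    return "".join(DIGIT_OF[c] for c in sound_codes)

print(to_digits("gggg"))  # 4444  (worker: four parts with sound code g)
print(to_digits("bddd"))  # 8000  (divide: sound codes b, d, d, d)
print(to_digits("xtzz"))  # 7166  (learn: sound codes x, t, z, z)
```

The table covers 22 letters; i, o, u, and v do not occur as pinyin initials of parts and so have no number key here.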
Brevity codes: 1. First-level brevity code: first part + space. 2. Second-level brevity code: first part + second part + space. 3. Third-level brevity code: first part + second part + third part + space. Word codes: 1. Two-character words: first part of the first character + second part of the first character + first part of the second character + second part of the second character. 2. Three-character words: first part of the first character + first part of the second character + first part of the third character + second part of the third character. 3. Words of four or more characters: first part of the first character + first part of the second character + first part of the third character + first part of the last character.
(4) Pure shape coding scheme: the schemes above suffice for Chinese people who know pinyin. For foreigners with no grounding in pinyin, however, they offer no memorization advantage at all. The inventor has therefore designed the "key map" shown in Fig. 5 to help foreigners learn Chinese character coding. As the figure shows, it arranges the keys by the five zones "horizontal, vertical, left-falling, dot, fold". The general input rule for single characters and the design of brevity codes and word codes in this scheme are all the same as in holder sound (I).
Four, Chinese character retrieval scheme
The first parts are divided evenly into 26 groups according to the number of characters each forms, and the last parts are likewise divided evenly into 26 groups. Using only the first and last parts is then equivalent to having 26 × 26 = 676 radicals, and characters are distributed over them more evenly than over traditional dictionary radicals, while the total number of parts is much smaller than the number of radicals in earlier dictionaries. On this basis the invention designs the "part-number lookup method".
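A short sketch of the arithmetic and of how such number pairs could order dictionary entries. The pairs 8-19, 15-9, and 25-6 are the combinations this description quotes for its example characters "purple", "thousand", and "red"; the stroke counts and the extra entry sharing a pair are hypothetical values added only to show a tie-break by stroke count.

```python
from typing import List, Tuple

FIRST_GROUPS = 26
LAST_GROUPS = 26
print(FIRST_GROUPS * LAST_GROUPS)  # 676 possible (first, last) number pairs

# (label, first-part number, last-part number, stroke count)
Entry = Tuple[str, int, int, int]
entries: List[Entry] = [
    ("red", 25, 6, 6),
    ("thousand", 15, 9, 3),
    ("purple", 8, 19, 12),
    ("hypothetical-x", 15, 9, 7),  # invented entry sharing pair 15-9
]

# Ascending first-part number, then last-part number, then stroke count.
ordered = sorted(entries, key=lambda e: (e[1], e[2], e[3]))
print([e[0] for e in ordered])
# ['purple', 'thousand', 'hypothetical-x', 'red']
```

Sorting on the tuple `(first, last, strokes)` gives the ascending number-combination order with same-numbered characters arranged from fewer to more strokes.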
Table 2 is the index of Chinese Characters of " the parts number indexing system of Chinese Characters "." the parts number indexing system of Chinese Characters " is divided into " two number indexing system for Chinese characters ", " three ones numbers indexing systems of Chinese Characters " and " four number indexing system for Chinese characters " again.Serve as main this cover indexing system of Chinese Characters of introducing with " two number indexing system for Chinese characters " below.
(1) Two number indexing system for Chinese characters:
1. If the dictionary is arranged in ascending order of the number combinations of the first and last parts in Table 2, retrieval can be illustrated with the following example characters: (1)
Figure A9911630500091
: one (first part), [glyph missing] (last part); the corresponding number combination is "1-24". (2) purple: end, eight; the corresponding number combination is "8-19". (3) thousand: thousand, ten; the corresponding number combination is "15-9". (4) red: Si, worker; the corresponding number combination is "25-6". Each of these characters can be found under the corresponding number in the dictionary (the numbers can serve as page numbers).
2. If the dictionary is arranged in some other way, Table 2 can serve as the "elementary index of Chinese Characters". A "concrete index of Chinese Characters" is then designed in addition. It contains all the characters in the dictionary, arranged in ascending order of the number combinations of their first and last parts; after each character, the page number of that character in the dictionary text is marked, and characters with the same number are arranged by stroke count from fewest to most. After each part combination in the "elementary index of Chinese Characters", the page number of that combination in the "concrete index of Chinese Characters" is marked. To retrieve a character, first look up in the "elementary index of Chinese Characters" the page number of its first part number in the "concrete index of Chinese Characters", together with its last part number; then turn to that page of the "concrete index of Chinese Characters", find the first part number, and then find the last part number in sequence. The character can then be retrieved. For example, to retrieve "nationality": the "elementary index of Chinese Characters" (i.e. Table 2) shows that the number of its first part is "17", that its page number in the "concrete index of Chinese Characters" is "34" (hypothetical), and that the number of its last part "day" is "4". Turn to page 34 of the "concrete index of Chinese Characters", find first part number "17", then find last part number "4" in sequence, and the character is found very quickly.
Table 2
Figure A9911630500093
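The ordering used by the two number indexing system (ascending first part number, then last part number, then stroke count) can be sketched in Python. This is a sketch under assumptions: the function names are hypothetical, the characters are identified by their translated names, the stroke counts are illustrative values, and the fourth entry is invented purely to show the tie-break among characters sharing the same number combination.

```python
from bisect import bisect_left

# (character, first part number, last part number, stroke count)
ENTRIES = [
    ("red", 25, 6, 6),
    ("purple", 8, 19, 12),
    ("thousand", 15, 9, 3),
    ("hypothetical", 8, 19, 5),  # invented entry, only to show the tie-break
]


def build_index(entries):
    """Sort by first part number, then last part number, then stroke count."""
    return sorted(entries, key=lambda e: (e[1], e[2], e[3]))


def lookup(index, first_no, last_no):
    """Return all characters filed under the combination first_no-last_no."""
    keys = [(e[1], e[2]) for e in index]
    start = bisect_left(keys, (first_no, last_no))
    hits = []
    for entry in index[start:]:
        if (entry[1], entry[2]) != (first_no, last_no):
            break
        hits.append(entry[0])
    return hits


index = build_index(ENTRIES)
print([e[0] for e in index])  # ['hypothetical', 'purple', 'thousand', 'red']
print(lookup(index, 8, 19))   # ['hypothetical', 'purple']
```

The three number and four number systems would extend the sort key with the second and third part numbers in the same way.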
(2) Three number indexing system for Chinese characters: characters are retrieved according to the number combinations in Table 2 of their first part, second part and last part. The method is similar to the "two number indexing system for Chinese characters", but distinguishes characters more finely.
(3) Four number indexing system for Chinese characters: characters are retrieved according to the number combinations in Table 2 of their first part, second part, third part and last part. The method is again similar to the "two number indexing system for Chinese characters", and under this system almost no two characters share the same code.
(4) Coding indexing system: this system requires the dictionary to be arranged in the order of the letter combinations or number combinations of the Chinese character codes of the present invention, and characters are retrieved by those codes. It is aimed mainly at the "holder sound" schemes (the memory load of the "holder sound" schemes is very small, which makes them well suited to wide adoption).
Of course, if no "elementary index of Chinese Characters" and "concrete index of Chinese Characters" are provided for the retrieval schemes above, retrieval by Table 2 or by the "coding indexing system" needs the support of a dictionary arranged in the corresponding number or code order.
Five, notes and further ideas
The scheme introduced above, which splits a character into first part, second part, third part and last part, may be called the "Chinese character four parts split scheme"; the scheme that splits a character into first part, second part, third part and fourth part may be called the "preceding four parts of Chinese character split scheme". In the same way one can derive the "Chinese character dual-part split scheme", in which a character is split into first part and last part, and the "Chinese character three-parts split scheme", in which a character is split into first part, second part and last part. A corresponding coding scheme can be designed from any of these split schemes together with the key position correspondences introduced earlier, and every such coding scheme can in turn be used in the "coding indexing system". In fact, according to the inventor's processing, with any of the letter key correspondences described above, using only the first and last parts distributes characters more evenly than the double-pinyin (Shuangpin) scheme, and the key positions of the "holder sound" scheme are also easier to memorize than those of double pinyin.
If all parts are defined on their code keys, the coding scheme designed from the split scheme of the present invention may be called "the pronunciation encoding of Chinese characters scheme of boarding at the nursery". The inventor's research on the "holder sound" scheme started from "the sound of boarding at the nursery": a code table source file was compiled according to the "Chinese character four parts split scheme", an input method was generated with the "input method generator" of Windows 98, and "entry sorting" was carried out. According to statistics on the sorted result, even with "the sound of boarding at the nursery" more than half of the 6763 characters in GB2312-80 have no duplicate code, and where duplicate codes occur, a code is generally shared by no more than 5 characters. So "the pronunciation encoding of Chinese characters scheme of boarding at the nursery" is also quite suitable for the "coding indexing system" (an indexing system does not in itself place high demands on duplicate codes).
A Chinese character is a combination of sound, shape and meaning, and Chinese character coding mainly uses sound and shape. Shape is the most intuitive to the eye, and sound is the most intuitive to the mind: when we see a character, the first thing we think of is its sound. Giving a shape code scheme the holder sound treatment therefore fits human psychology very well. However, because earlier coding schemes used many parts, holder sound treatment was difficult to apply to them. The split method of the present invention needs fewer parts, which makes holder sound treatment of a shape code scheme convenient.
Any key position correspondence designed by the present invention, when marked on the keys of a keyboard, constitutes the keyboard of the present invention.
The present invention involves many schemes. The inventor considers "the pronunciation encoding of Chinese characters scheme of boarding at the nursery" and the "two number indexing system for Chinese characters" very suitable for primary school pupils; the other schemes are all of general applicability. The inventor suggests that "holder sound (II)" be applied in school education at junior middle school level and above as a standard Chinese character coding and retrieval scheme.
Chinese characters are not yet unified worldwide. The inventor has so far processed only the 7000 general-purpose characters; traditional characters and large character sets combining traditional and simplified forms can be processed later. Because commonly used traditional and simplified parts generally correspond to one another, mapping each traditional part to its cognate simplified part makes the scheme usable for traditional characters. Of course, promoting simplified characters worldwide would be the best solution.
Because this application claims domestic priority, content already disclosed in the earlier patent document is mentioned only briefly here. Many of the processing methods in that earlier application may also have some value; here the inventor gives only a short account of its ideas on brevity code design and phonetic-stroke code.
On a 105-key English keyboard, the main keypad has, apart from the 26 letter keys, a total of 49 number, symbol and function keys. Any 20 of these that are easy to strike can be chosen as brevity code end keys (as long as a character's code has not been fully entered, nothing is output, so using these keys as brevity code end keys does not conflict with their original functions). 1. Zero-level brevity code: keys on which no first part is defined can serve as "zero-level brevity code keys"; a single keystroke enters a character. In the "word frequency statistic table", the cumulative frequency of the two characters ", once" alone reaches 5.48%; if they are made "zero-level brevity codes", the effect on input speed should be considerable even though they are few. 2. First-level brevity code: taking "holder sound (II)" as an example, first part + any one of the 20 brevity code end keys, giving 26 × 20 = 520 codes in total. 3. Second-level brevity code: again taking "holder sound (II)" as an example, first part + last part + any one of the 20 brevity code end keys, giving 26 × 26 × 20 = 13520 possible codes (from the inventor's statistical processing of the parts, it can be inferred that at least 5000 characters fit this rule). Because the second-level brevity codes are sufficient, third-level brevity codes remain as in the original method.
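The brevity code capacities quoted above follow directly from the key counts, as this short Python check shows. The variable names are chosen here for illustration only.

```python
LETTER_KEYS = 26  # letter keys carrying the first (or last) part groups
END_KEYS = 20     # keys chosen as brevity code end keys

# First-level brevity code: first part + one end key.
level_one = LETTER_KEYS * END_KEYS
# Second-level brevity code: first part + last part + one end key.
level_two = LETTER_KEYS * LETTER_KEYS * END_KEYS

print(level_one, level_two)  # 520 13520
```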
In line with the holder sound scheme of the present invention, the scheme in the earlier patent application that codes characters by their pinyin initial, first part and last part is also very simple and easy to remember. Although Mr. Qian Renju's "Qian code" is formally the same as this processing method, "first part" and "last part" are two special definitions in the present invention and are key features of its split scheme, and the holder sound treatment of the present invention is itself distinctive. Because the "holder sound" scheme of the present invention is easy to memorize, even for a full-spelling scheme, appending the first part of the present invention at the end is a very simple method that greatly reduces duplicate codes; "," can be used between the spelling and the first part to separate them.
The present invention helps to promote the "normative stroke order". Teaching students to write characters stroke by stroke in the standard stroke order through classroom instruction alone is not easy. Through part-based coding, people can become interested via computer input and will have to learn the "normative stroke order" in order to enter characters correctly. The "normative stroke order" thus guides the present invention, and the present invention in turn promotes the "normative stroke order".
Research on Chinese character coding is a systems engineering task. The above is only the inventor's own design for processing Chinese characters and may still be far from perfect; it is hoped that all sectors of society will help to improve it.

Claims (10)

1. A Chinese character split scheme in which the leading parts, beginning with the "first part", are taken in order and the "last part" is taken at the end, characterized in that: 1. the "first part" and the "last part" are handled separately; 2. the "first part" is the part in which the character's first stroke lies, and the "last part" is the part in which the character's last stroke lies; 3. the selection of the "first part" and the "last part" strictly obeys the principle of "taking the larger and discarding the smaller".
2. A computer Chinese character coding input scheme, characterized in that: 1. the parts selected by the present invention are defined on the keys of a keyboard; 2. a single character is entered by striking, in the split order of the present invention, the key corresponding to each part.
3. A Chinese character retrieval scheme common to computers and dictionaries, characterized in that characters are retrieved using the combination of the "first part" and the "last part".
4. Keyboards suitable for Chinese character input, characterized in that the keyboard is marked with the key position correspondences designed by the present invention.
5. The split scheme according to claim 1, characterized in that: 1. the split process obeys the normative stroke order 100%; 2. if a given part is absent during splitting, it is replaced by the immediately preceding part.
6. The split scheme according to claim 1, wherein the present invention provides a method of distinguishing characters by handling shared parts in strict accordance with the normative stroke order, characterized in that parts are delimited with dashed lines (the present invention uses the two parts "
Figure A9911630500021
").
7. The coding scheme according to claim 2, wherein the present invention provides a "holder sound" method of arranging the key positions of parts, characterized in that a part's key position is assigned according to the pinyin of a particular character in a word or phrase associated with that part.
8, according to the described encoding scheme of claim 2, the present invention has designed following four kinds of unit constructions and key position corresponding relation.First kind: A Qe Rui a Qe, the white b eight an ancient type of spoon shellfishes of B eight Bao an ancient type of spoon Http shellfishes crust Epileptic, cun car worm c of C factory
Figure A9911630500022
Very little worm, big bad d Dian Jie big dawn of cutter of D Dian Ren Ha cutter, E Fu ear moon e two Fu, F
Figure A9911630500023
Xin ten tor f ten, G extensively bends the worker Leather bone g Worker , H-ㄇ fire standing grain h-
Figure A9911630500026
The Xiangxi fire, I Lv, a few Si towel of J Tou j
Figure A9911630500027
Towel, K mouth k mouth, L Bing power is found Si Shu l national power Shu, M order door horse rice m horse order ware, N ㄅ woman ox n woman bird, O Jin, P Pie
Figure A9911630500028
P Pie, Q
Figure A9911630500029
Thousand Quan gas , the R day for human beings r day for human beings, S Si
Figure A99116305000210
Three Chi Cannibals corpse in the sixth of the twelve Earthly Branches Woo vow s Si ㄑ, T Mi Jiong soil T soil, U Rolling, V wood v wood, W king w , the careful Network of X X heart ㄒ, Y You Yan Myeon Yi The tenth of the twelve Earthly Branches
Figure A99116305000214
Fish y again
Figure A99116305000215
, Z Ya
Figure A99116305000218
Sub-Chuo ends boat z Ya Chuo;
Second kind: A Qe Jin shellfish a Qe shellfish, the white b eight of B eight Bao Http crust Epileptic, cun car worm c of C factory
Figure A99116305000219
Very little worm, big bad big dawn of d Jie cutter of D Ren cutter, E Fu knowledge Men Mami e Fu male horse order, F
Figure A99116305000220
30 Xin f ten, G extensively bends the worker
Figure A99116305000221
Ge Gu Ha an ancient type of spoon g worker
Figure A99116305000222
An ancient type of spoon, H-h-, I Rui i two, a few Si towel of J Tou Chuo j towel Chuo, K mouth k mouth, L Bing power is found Si Yi fish l power
Figure A99116305000225
, M wood m wood ware, N ㄅ woman ox fire n woman bird fire, O tor Shu o Shu, P Pie
Figure A99116305000226
P Pie, Q
Figure A99116305000229
Quan gas Dian q Dian, R day for human beings standing grain ㄇ r people
Figure A99116305000230
, S Si Chi Cannibals corpse in the sixth of the twelve Earthly Branches Woo vows s Si ㄑ, T Mi Jiong soil field t soil, U Rolling u Xiangxi, V Lv v , W king w , the careful Network of X Yan Jin x heart ㄒ, Y You Myeon month Xi
Figure A99116305000233
The tenth of the twelve Earthly Branches y again, Z Ya
Figure A99116305000235
Son ends boat z Ya;
The third: 1 people is with Mi Jiong Rolling soil field, the 2 Fu ear Lv of factory cun car worms, 3 Shu, ten Si Xin mountain three Chi Cannibals corpse in the sixth of the twelve Earthly Branches Woo stones vow 4
Figure A99116305000237
The wide bow of order Men Mami wood worker
Figure A99116305000238
The leather bone, 5 king You Yan Myeon month Yi Xi tenth of the twelve Earthly Branches
Figure A99116305000239
Fish, 6 ㄅ Nv Niu Ya
Figure A99116305000240
Figure A99116305000241
Sub-Chuo ends the boat
Figure A99116305000242
Bing power is found Si, and 7 Network  are careful
Figure A99116305000244
Thousand Quan gas -ㄇ fire standing grain, Epileptic is white for 8 Rui, eight Bao an ancient type of spoon Http shellfishes crust, 9 Qe Pie
Figure A99116305000245
Tou is towel Jin how, and O Dian Ren Ha cutter is big bad mouthful;
The 4th kind: Q-Myeon q-, W Lv w two, E Rolling cun e mouth, R wood r dawn day
Figure A99116305000246
Ware, T worker removes from office native t soil, the Xi of Y king factory tenth of the twelve Earthly Branches Bad y order Sub-worker, the big ear u woman of U stone 30 cars bird horse, I Shu
Figure A99116305000249
ㄇ ends Jiong towel field i Shu, O mouth o ten, P worm Shellfish bone p Fu ㄒ Jie towel, L day mountain order Si l
Figure A99116305000251
Chuo, K Pie An ancient type of spoon Bao ㄅ k Pie power cutter, a few j Si of J Ren Chi Xiangxi worm, H Jin gas ox h shellfish fire, G month Jiong thousand boat Quan g adults, F
Figure A99116305000254
Network Cannibals fish standing grain f wood, D white man Ba Qe vows d eight, S Dian Bing Yan heart Chuo s Dian, A Rui a feeling, Z Xin Http Mi z
Figure A99116305000255
Qe, the upright door of an X Tou x ㄑ, the wide fiery c of C Epileptic again, V Ha rice Woo Yi v , B Ya pony in the sixth of the twelve Earthly Branches bow b Ya, N Si
Figure A99116305000257
Corpse is n again, M woman Fu Si Baal cutter m an ancient type of spoon
Figure A99116305000259
9. The Chinese character retrieval scheme according to claim 3, wherein the present invention provides "the parts number indexing system of Chinese Characters", characterized in that: 1. the "first parts" and the "last parts" are each grouped and numbered; 2. characters are retrieved using the number combinations of their parts.
10. The Chinese character retrieval scheme according to claim 3, wherein the "coding indexing system" of the present invention is characterized in that characters are retrieved according to the Chinese character codes of the present invention.
CN 99116305 1998-05-18 1999-01-04 Chinese character coding input method, keyboard and retrieval method therefor Expired - Fee Related CN1116635C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 99116305 CN1116635C (en) 1998-05-18 1999-01-04 Chinese character coding input method, keyboard and retrieval method therefor

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN98113555.2 1998-05-18
CN 98113555 CN1204083A (en) 1998-05-18 1998-05-18 A set of Chinese character coding input method and keyboard and indexing system
CN 99116305 CN1116635C (en) 1998-05-18 1999-01-04 Chinese character coding input method, keyboard and retrieval method therefor

Publications (2)

Publication Number Publication Date
CN1233794A true CN1233794A (en) 1999-11-03
CN1116635C CN1116635C (en) 2003-07-30

Family

ID=25744700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 99116305 Expired - Fee Related CN1116635C (en) 1998-05-18 1999-01-04 Chinese character coding input method, keyboard and retrieval method therefor

Country Status (1)

Country Link
CN (1) CN1116635C (en)

Also Published As

Publication number Publication date
CN1116635C (en) 2003-07-30

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee