CN1327185A - Chinese character gene code - Google Patents

Chinese character gene code

Info

Publication number
CN1327185A
Authority
CN
China
Prior art keywords
code
characters
radicals
initial consonant
traditional chinese
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 00116345
Other languages
Chinese (zh)
Inventor
董杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN 00116345
Publication of CN1327185A

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The Chinese character gene code is a "codeless coding technique" that is easy to learn and fast to type, and is based on the principles by which Chinese characters are formed. It converts hard-to-memorize radical (shape) information into easy-to-memorize phonetic information, which reduces the amount to memorize, simplifies the coding rules, optimizes the mapping from radicals to keys, and reduces the number of radicals. It has a low duplicate-code rate, a short code length, and high coding efficiency.

Description

" Chinese character gene code "
I. Technical field: Chinese character encoding for computer input.
II. Prior art:
1. Handwriting recognition input: the user writes characters with a pen on a recognition tablet, imitating ordinary handwriting, and the computer recognizes and enters them.
Advantage: it matches people's writing habits; anyone who can write a Chinese character can input it.
Disadvantage: input is slow, since a character averages about 10 strokes, and the handwriting must be reasonably standard or recognition becomes difficult.
2. Speech input: the computer captures speech and converts it to text to complete the character input.
Advantage: no character encoding is needed.
Disadvantage: it places high demands on the input environment and requires standard pronunciation; Chinese has many homophones that are hard to tell apart, so the input error rate is high. The technology is still immature and at a developmental stage.
3. Optical scanning recognition: printed text is scanned, recognized, and read into the computer.
Advantage: the method is simple.
Disadvantage: an original document is required, and its print quality must be very high or the computer cannot recognize it.
4. Machine translation: foreign-language text is translated directly into Chinese. This is a passive form of input, and the translation success rate is not high.
5. Keyboard input: characters are entered with the keyboard.
Chinese input has long been the "bottleneck" in popularizing computers in China. Many experts in China and abroad have studied it extensively and have proposed more than 1,000 Chinese character input methods.
By coding method these fall into four classes: phonetic codes, shape codes, phonetic-shape codes, and numeric codes.
(1) Phonetic code: encodes a character by its pronunciation. Examples: "Chinese sound is digital" (Fan of Shandong Province Tang is wide), the "double-tone code" of Liu Weimin of Beijing, and Microsoft Pinyin. The advantage of a phonetic code is that it is easy and intuitive.
The disadvantages are a longer code length, a high duplicate-code rate, and the inability to input a character the user cannot pronounce.
(2) Shape code: encodes a character by its shape. Example: the "Five-stroke Method" of Wang Yongmin of Henan Province. The advantages of a shape code are a low duplicate-code rate, a shorter code length, and the ability to input characters the user cannot pronounce. The disadvantage is that the user must learn a set of rules for splitting characters and memorize where the roots sit on the keyboard, so it is hard to learn.
(3) Phonetic-shape code: encodes a character by both its pronunciation and its shape. Examples: the "popular Chinese code" (Sichuan Province old generation in) and the "natural code" of Zhou Zhinong. Because Chinese characters follow the rule "same sound, different shape; same shape, different sound", the duplicate-code rate of a phonetic-shape code is much lower. It is still hard to learn, however, and characters the user cannot pronounce, or pronounces inaccurately, remain hard to input. It also requires frequent switching from sound to shape and from shape to sound, which adds to the mental burden.
(4) Numeric code: encodes a character with digits, as in the region-position code and the telegraph code.
The advantage is that only the 10 digits are used, operation is easy, and there are no duplicate codes. The disadvantage is that the codes are extremely hard to memorize.
III. Purpose of the invention: Keyboard entry is currently the mainstream way of inputting Chinese characters into a computer. It places the lowest demands on environment, equipment, and operator skill, and it will remain dominant as computers become more widespread. Most existing input methods have one defect or another: the easy-to-learn phonetic codes are too slow, the somewhat faster shape codes are very hard to learn, and phonetic-shape codes fall between the two, while their shape-to-sound and sound-to-shape conversions overburden the brain during input.
If the mental burden of the shape-to-sound and sound-to-shape conversions in a phonetic-shape code is removed, the resulting input method will be accepted more readily than existing ones.
The specific aims are:
1. Reduce the duplicate-code rate, and order characters and words that share a code by frequency of use.
2. Simplify the coding rules, optimize the mapping from roots to keys, and reduce the number of roots, so that the method is easy to learn and remember.
3. Solve the problem that characters the user cannot pronounce, or pronounces inaccurately, are hard to input.
4. Shorten the code length, raise keystroke speed, and improve coding efficiency.
5. Standardization: the coding rules are rigorous and standard, and fully consistent with Chinese language and script education.
6. Improve the intelligence of the software.
IV. Summary of the invention:
1. How the word "gene" in "Chinese character gene code" is to be understood, and the origin of the name. A gene is the basic unit of information of a living organism, and different genes control different functions. In a Chinese character, the initial ("sound information") and the dictionary radical and components ("shape information") are the gene feature codes of the character:
1) The initial information governs the pronunciation of the character.
2) The dictionary radical information distinguishes the meaning of the character.
For example, the characters "vapour, ripple" contain the dictionary radical "Rui" (the water radical), which expresses information related to water.
The characters "eat, drink" contain the dictionary radical "mouth", which expresses an action related to the mouth.
3) Components and dictionary radicals form the structure of the character.
These information elements are first decomposed from the characters. When a phrase is formed they are cloned, extracted, and recombined according to their priority, and this produces the new phrase code.
In phrase formation the initial has the highest priority, followed by the dictionary radical, then components and strokes.
(1) Two-character phrases: each of the two characters is decomposed and its two highest-priority information elements are kept, namely the initial ("sound information") and the dictionary radical ("shape information"); the lower-priority component and stroke elements are displaced by the higher-priority ones.
(2) Three- and four-character phrases: each character is decomposed and only its highest-priority element, the initial ("sound information"), is kept; the lower-priority root "shape information" is displaced by the higher-priority sound information.
I liken this process, figuratively, to the replication and inheritance of "information genes".
It resembles the cloning and inheritance of the genes of a living organism.
This is the origin of the name "Chinese character gene code".
I. Design considerations for the code:
The problems a Chinese character encoding must solve first are:
1. The coding rules should be as simple as possible, to reduce the mental burden.
(1) The correspondence between the root table and the keyboard should be simple. To optimize the mapping from roots to keys and reduce the number of roots, so that the method is easy to learn and remember, the Chinese character gene code adopts a "codeless coding technique": the dictionary radicals and components used for coding are assigned to keys by sound ("sound holder" form), and a small number of components are assigned pictographically. About 200 roots are used in total, most of them dictionary radicals from the Chinese voluminous dictionary. The advantage is that the mapping between roots and keys is highly regular. The gene code uses the fewest roots of all input methods, and dictionary radicals are the forms users meet most often and know best. Codes can therefore be read off a character on sight without memorization, which spares the user the chore of memorizing a root table.
(2) The rules for splitting characters should be as simple as possible, to reduce mental fatigue.
The Chinese character gene code combines the ease of learning of phonetic codes with the speed of shape codes, without requiring the user to split characters into roots or memorize a root table. The coding method is based on the principles by which Chinese characters are formed and converts hard-to-remember shape information into easy-to-remember sound information. This thoroughly removes the mental fatigue that the sound-to-shape and shape-to-sound conversions of phonetic-shape codes impose on the user, and achieves several important breakthroughs in coding input theory. The method suits copy typists ("see and type") as well as writers, who "think and type" or "listen and type". It can be learned in a few minutes and is not forgotten afterwards. A single character takes three codes and a word takes four. The average code length is short, the duplicate-code rate is low, and input is fast.
2. Reduce the duplicate-code rate, and order characters and words that share a code by frequency of use. Following the rule "same sound, different shape; same shape, different sound", the gene code, as a phonetic-shape code, has a greatly reduced duplicate-code rate, which raises input speed. All characters and words that share a code are ordered by a weighting algorithm according to their frequency of use. The most frequently used character, word, or sentence is offered first.
3. Solve the problem that characters the user cannot pronounce, or pronounces inaccurately, are hard to input. For such characters the method relies mainly on the shape code, prefixed with "'". The "'" means that the initial is omitted. That is: ' + root 1 + root 2 + last root, or
' + root 1 + root 2 + space bar. Examples: give 'ark (Si population); standard 'li (Bing Cui); is promptly 'ge (blunt Jie);
meeting 'res (people two Si)
4. Standardization. The coding rules are rigorous and standard, and fully consistent with Chinese language and script education. The rules for characters, words, and sentences are systematic, and most roots are dictionary radicals from the Chinese voluminous dictionary, in line with the language education given in schools today.
5. Shorten the code length, raise keystroke speed, and improve coding efficiency. A single character takes three codes, and a word or sentence takes four. Commonly used high-frequency characters are given level-1, level-2, and level-3 short codes, and high-frequency two-character words are also given short codes. This shortens the mean code length.
6. Use the QWERTY keyboard with a reasonable set of code symbols. The gene code uses only 27 symbols: 26 are the letters of the English alphabet and the remaining one is the function key "'".
7. Take the word as the input unit to improve input efficiency. Chinese is expressed mainly in words, and word-based input has the following advantages over character-based input:
(1) It shortens the code length and more than doubles input efficiency. Under word input the code length falls to two keys per character (four keys for a two-character word).
(2) It simplifies the coding rules. In a two-character word each character needs only one shape code, and a four-character word is entered by initials alone, which lightens the mental load and lowers the error rate.
(3) Following the rule "same sound, different shape; same shape, different sound", it reduces the duplicate-code rate of the phonetic-shape code.
(4) From an information-theory standpoint, the zeroth-order entropy of Chinese characters is 9.71 bits per character and that of Chinese words is 11.46 bits per word, whereas the zeroth-order entropy of English letters is 4.03 bits per letter and that of English words is 10 bits per word. Word-based input therefore clearly improves the efficiency of Chinese keyboard input.
(5) The gene code vocabulary contains about 40,000 entries in total,
drawn from: "Essential Terms dictionary", "well-known phrase dictionary in ancient times", "common saying dictionary of proverbs", "dictionary of idioms", "Chinese voluminous dictionary", and the names of the world's countries, capitals, and major international cities. The vocabulary is therefore very rich.
II. Encoding scheme
Single-character input method:
A character takes three codes in total and is terminated with the space bar. For a single-component character, strokes take the place of the dictionary radical and components. The advantage of three codes per character is as follows. Chinese characters divide into two kinds, words and morphemes, and a morpheme has no meaning on its own. For example, "I" is the first-person pronoun and can be used alone. "" is a morpheme: it cannot be used alone and must be combined with "I" to form a phrase. Of 7,000 single characters, more than 1,000 are words and more than 5,000 are morphemes. The code space of three-code encoding is 26*26*26 = 17576. With 17,576 codes for the more than 1,000 word characters, a ratio of about 17:1, the code space is ample in theory. Where one code corresponds to a group of duplicate-code characters, weighting gives words a higher priority than morphemes,
so the word is offered first. Example: "people, go into, earth" all have the code "rty". Among these duplicates "people" is a word, while "go into, earth" are morphemes. After weighting, the word outranks the morphemes, so "people" heads the duplicate-code candidate list.
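The code-space arithmetic and the word-before-morpheme ordering described above can be checked with a short Python sketch. The weights and the `rank_candidates` helper are illustrative assumptions: the text does not disclose its weighting algorithm, and the English glosses stand in for the original characters.

```python
from string import ascii_lowercase

CODES_PER_CHARACTER = 3  # three codes per single character, per the text

# Size of the code space: 26 letters in each of three positions.
code_space = len(ascii_lowercase) ** CODES_PER_CHARACTER

# The text says "more than 1,000" characters are words; 1,000 is used
# here as a round lower bound (assumption).
word_characters = 1000


def rank_candidates(candidates):
    """Order the characters that share one code, highest weight first.

    `candidates` is a list of (gloss, weight) pairs. The weights are
    hypothetical; sorted() is stable, so ties keep their input order.
    """
    ordered = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [gloss for gloss, _ in ordered]


# Duplicates for the code "rty": the word outranks the two morphemes.
rty = [("go into", 0.3), ("people", 0.9), ("earth", 0.2)]

print(code_space)                     # 17576
print(code_space // word_characters)  # 17
print(rank_candidates(rty))           # ['people', 'go into', 'earth']
```

Any weighting that places words above morphemes would give the same first candidate; the numeric weights only fix an order for the sketch.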
1. (1) Single-component character: a character that cannot be split into components.
Initial code + first-stroke code + last-stroke code. Examples: last mgy, bird nng, my wty, agricultural nyy, send out fny, long uty, electric dhn
(2) Quasi-single-component character: a character made up of one stroke and one component.
Initial code + first-stroke code + component code, or initial code + dictionary radical code + last-stroke code. Examples: the beautiful ywy dog of old jho dawn dog coin btj qdy sword rdy too tdy master iyw does ggv
2. Compound-character encoding scheme:
Compound character: a character that can be divided structurally into several components or dictionary radicals, that is, one built from several units. The structures are left-right, top-bottom, full enclosure, and semi-enclosure. Depending on the position from which the dictionary radical is taken, there are the following eight cases; the dictionary radical taken and the component taken lie opposite each other.
(1) Dictionary radical taken from the left
Initial code + left radical code + last-component code. Examples: but drg, sting ykf, brain nou, stay imw, with sez, unrestrained lsg, remove uep
If there is no last component:
Initial code + left radical code + code of the last two consecutive strokes. Examples: whip bgw, row pfg, starve evw, carry tfw
(2) Dictionary radical taken from the right
Initial code + right radical code + first-component code. Examples: portion bel, song gqk
If there is no first component:
Initial code + right radical code + code of the first two consecutive strokes
Examples: than bba, goose ent
(3) Dictionary radical taken from the top
Initial code + top radical code + last-component code. Examples: anxious jdx, true ivb, kind vyk, fennel hck, rope sva, bar tfp, wild duck fnj
If there is no last component:
Initial code + top radical code + code of the last two consecutive strokes. Examples: year nrf, funeral svw, stingy svk, hop how
(4) Dictionary radical taken from the bottom
Initial code + bottom radical code + first-component code. Examples: hope www, green bdw, scold mmk, cry kqk, become bye
If there is no first component:
Initial code + bottom radical code + code of the first two consecutive strokes. Examples: back hke, allusion quotation dbm, soldier bbr, at zyd, shield dme, left zgd, right ykd, smoked xst, black hsm
(5) Dictionary radical taken from the outside
Initial code + outer radical code + last-component code. Examples: prestige wwn or hgg, wear dgb, plant zgp, state gky
If there is no last component:
Initial code + outer radical code + code of the last two consecutive strokes
Examples: more yzw, build jzf
(6) Dictionary radical taken from the inside
Initial code + inner radical code + code of the first two consecutive strokes. Example: chimney cxr
(7) Characters of the "田" (four-quadrant) type
Initial code + upper-left component code + lower-right component code
Examples: energy nsb, device xkk, doubtful ybp, fragrant xvo
If there is no lower-right component:
Initial code + upper-left component code + code of the last two consecutive strokes, or initial code + lower-right component code + upper-left component code,
or initial code + lower-right component code + code of the first two consecutive strokes
(8) Compound characters that have neither a dictionary radical nor components follow the same rule as single-component characters: initial code + first-stroke code + last-stroke code. Examples: visiting bth, don't bgh
3. General notes:
(1) Code of the first two consecutive strokes (see Figure 2): the single key code that corresponds jointly to the first and second strokes in the character's stroke order. If the first stroke is "Pie" and the second stroke is ",", the key code is "W".
(2) Code of the last two consecutive strokes (see Figure 2): the single key code that corresponds jointly to the second-to-last and last strokes in the character's stroke order. If the second-to-last stroke is "Pie" and the last stroke is ",", the key code is "W".
(3) For a zero-initial character, that is, one that has only a final and no initial, the first letter of the final is used as the initial code.
(4) First component: the first two strokes of the character are also the first two strokes of this component (except for semi-enclosed characters with the dictionary radicals "Chuo" and "Yin").
(5) Last component: the last two strokes of the character are also the last two strokes of this component (except for fully enclosed and semi-enclosed characters).
Note:
a. In "send out, sprinkle, pull out, dial" the last two consecutive strokes are "right-falling stroke" and "point", so the component "again" is not the last component of these characters. Their last-stroke code is "o".
b. In characters such as "with, leg" the last component is "Chuo" rather than "month, blunt", because by stroke order the component containing the last two strokes is "Chuo". Likewise, the last component of "strong, key" is "Yin".
(6) Where a character has two dictionary-radical elements, priority follows their positions: the upper one first, the left one first, the outer one first.
Examples: Deng dye: "again", on the left, is the dictionary radical and "Fu", on the right, is the component. Likewise: chicken jyn, sees gyj, enemy dvf, scrapes gvl, draws hgl. The dictionary radicals are respectively
"tongue dagger-axe again" and the components are respectively "bird see The-Fan Dao Dao".
Honor zlc: "Ha", on top, is the dictionary radical and "very little", at the bottom, is the component. Likewise: once clo, honor zlc, cut jld, island dnv, stingy svk, breath xzx. The dictionary radicals are respectively
"Ha ten birds certainly" and the components are respectively "the sun cutter Da Kou mountain hearts".
(7) The first and last components must be taken on the larger-first principle.
Examples: for "quiet, draw together" the last component is "tongue", not "mouth".
For "pigtail, distinguish, debate" the last component is "suffering", not "ten".
For ", stinging" the last component is "father", not "Qe".
(8) For "bundle, bacterium, ascarid, complete sincerity", what remains after removing the dictionary radicals "in, Lv, worm, Xin" is a fully enclosed form. The last component of these characters is then the last component of the remainder, that is, the enclosed part: "wood, standing grain, mouth, wood" respectively.
(9) The last stroke of characters such as "I, become, dagger-axe, Jian" is taken as the dot ",".
II. Phrase input method:
1. Two-character phrases: take the first two codes of the first character and of the second character.
Initial code + dictionary radical code + initial code + dictionary radical code
(first character, then second character)
Examples: knowledge ivvi, commander ifhf, wealth cbfb, information xrxz
2. Three-character phrases: take the initial code of each of the three characters, followed by the vocabulary key "'".
Initial code + initial code + initial code + '
(first, second, and third characters) Examples: computer jsj', republic ghg', Communist Party gud', Chinese igr',
engineer guv'
3. Phrases of four or more characters: take the initial codes of the first, second, third, and last characters.
Initial code + initial code + initial code + initial code
(first, second, third, and last characters) Examples: Xinhua Bookstore xhvd, postcode yibm, People's Republic of China ihrg,
"goes fishing for three days and dry the nets for two" stdw
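The three phrase rules above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: `phrase_code` is a hypothetical helper, and the per-character inputs are simply the pieces of the example codes quoted in the text, since the root-to-key table is not reproduced here.

```python
VOCABULARY_KEY = "'"  # terminates three-character phrase codes


def phrase_code(char_codes):
    """Build a phrase code from per-character codes.

    Each entry of `char_codes` is the code of one character, with the
    initial code first and the dictionary radical code second. Only the
    first two symbols of each entry are ever read.
    """
    count = len(char_codes)
    if count < 2:
        raise ValueError("a phrase needs at least two characters")
    if count == 2:
        # Two characters: initial code + radical code of each character.
        return char_codes[0][:2] + char_codes[1][:2]
    initials = [code[0] for code in char_codes]
    if count == 3:
        # Three characters: the three initials plus the vocabulary key.
        return "".join(initials) + VOCABULARY_KEY
    # Four or more: initials of the first, second, third, and last characters.
    return initials[0] + initials[1] + initials[2] + initials[-1]


print(phrase_code(["iv", "vi"]))          # ivvi  ("knowledge")
print(phrase_code(["j", "s", "j"]))       # jsj'  ("computer")
print(phrase_code(["x", "h", "v", "d"]))  # xhvd  ("Xinhua Bookstore")
```

Every phrase code comes out as four keystrokes, which is what lets the scheme describe a word as "four codes" regardless of its length.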
4. Fuzzy input: the query key "?" stands in for a code the user cannot enter. Example: study xsx? If the user does not know how to enter the dictionary radical of the character "habit",
"?" can be used in its place. A list of candidate phrases then appears, and the user selects one with a number key.
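A minimal Python sketch of this wildcard lookup is given here, assuming the lexicon is a list of (code, phrase) pairs. `fuzzy_lookup`, `choose`, and the lexicon entries are hypothetical; only the "xsx" prefix for "study" comes from the text.

```python
import re


def fuzzy_lookup(pattern, lexicon):
    """Return phrases whose code matches `pattern`; "?" matches any one symbol."""
    regex = re.compile(
        "".join("." if ch == "?" else re.escape(ch) for ch in pattern)
    )
    return [phrase for code, phrase in lexicon if regex.fullmatch(code)]


def choose(candidates, digit):
    """Pick a candidate with a number key: 1 selects the first candidate."""
    return candidates[digit - 1]


# Hypothetical lexicon: the fourth symbol of each "xsx" code is made up.
LEXICON = [
    ("xsxa", "study"),
    ("xsxb", "placeholder phrase"),
    ("xhvd", "Xinhua Bookstore"),
]

matches = fuzzy_lookup("xsx?", LEXICON)
print(matches)             # ['study', 'placeholder phrase']
print(choose(matches, 1))  # study
```

A real implementation would also order the candidates by frequency of use before numbering them; the sketch keeps lexicon order.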
5. High-frequency phrase input: only the initials of the first and second characters need to be entered. I designed this class of high-frequency words and phrases specifically for e-commerce: input is faster and saves time and effort, which suits business communication. Examples: we wm, finish wu, today jt, agree ty, Mr. xv, Miss xj
III. Sentence input method: input by "feel for the language" phrasing, breaking the sentence at its natural pauses.
Speech has strong timing and rhythm: the voice rises and falls, and sentences have natural breaks. A sentence can therefore be divided at the pauses between spoken syllables into several segments, entries, or phrases, according to the individual's own feel for the language.
Example: "Comrade Deng is about building a socialism with Chinese characteristics." can be divided as
(1) Comrade Deng is about building a socialism with Chinese characteristics.
(2) Comrade Deng is about building a socialism with Chinese characteristics. The first division gives ten phrases and the second gives three. The whole sentence can also be input in one go.
IV. Short codes for Chinese characters:
A likes | B is not | C from | D 's | E two | F sends out | G | H and | I among | J just | K opens | L | M buys
N you | O mouth | P is afraid of | Q please | R people | S three | T he | U car | V on | W I | X under | Y one | Z exists
Level-1 high-frequency short-code characters (the key, then the space bar)
A' | B' | C' | D' | E' and | F' wind | G' gives | H' very | I' this | J' the present | K' sees | L' two | M' sells
N' that | O' lotus root | P' sheet | Q' goes | R' allows | S' four | T' she | U' factory | V' on | W' is | X' is little | Y' has | Z'
Level-2 high-frequency short-code characters (the initial, then the vocabulary key "'", then the space bar)
A presses | B quilt | C wipes | D beats | E | F flies | G does | H meeting | I | J promptly | K can | L comes | M
N | O vomits | P runs | Q asks | R day | S send | T too | U becomes | V during | W is past | X is new | Y month | Z
Level-3 high-frequency short-code characters (the initial, then the letter "o", then the space bar)
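The keystroke sequences for the three short-code levels follow directly from the captions above and can be sketched in Python. `short_code_keys` is a hypothetical helper name; the character tables themselves are not modelled.

```python
VOCABULARY_KEY = "'"
SPACE = " "

# Suffix typed after the initial for each short-code level.
LEVEL_SUFFIX = {1: "", 2: VOCABULARY_KEY, 3: "o"}


def short_code_keys(initial, level):
    """Return the keystrokes for a short code: initial, level suffix, space."""
    if level not in LEVEL_SUFFIX:
        raise ValueError("level must be 1, 2, or 3")
    return initial.lower() + LEVEL_SUFFIX[level] + SPACE


print(repr(short_code_keys("W", 1)))  # 'w '
print(repr(short_code_keys("W", 2)))  # "w' "
print(repr(short_code_keys("W", 3)))  # 'wo '
```

Each level costs at most three keystrokes including the space bar, which is how the short codes reduce the mean code length below the regular three codes plus space.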
Five, invention effect
Chinese character gene code adopts " codeless coding techniques " coding rule extremely simple, and is directly perceived.Radicals by which characters are arranged in traditional Chinese dictionaries that are used to encode and parts are sorted out with sound holder form, and the small part parts are sorted out (seeing figure one) in the pictograph mode.See that word gets sign indicating number and need not memory, absorbed the sound sign indicating number again and easily learned that the fireballing characteristics of font code need not split radical, do not carry on the back the worry of etymon list.Coding method is based on the coinage principle of Chinese character, and the radical shape information translation that difficulty is remembered is the message breath of easily note.Thoroughly solve the worry that phonetic-stroke code middle pitch shape and shape sound two-dimensional transformations make importer's brain tire, saved brain burden.Realized the multinomial important breakthrough in the coding input hypothesis.Do not destroy writing idea and do not influence the thinking continuity.Promptly be suitable for copying into personnel's " see beat " and " listen and beat " and be suitable for writing personnel " think dozen " again.Association just can never to forget in one's life (being shown in the following table) in a few minutes.Word is a trigram during input, and speech is four yards.Average code length, the repetition rate of coding is low, high input speed.And be basic input block with speech, quick, smooth.The dictionary capacity is big, and has a processing capacity.The rigorous standard of coding rule integrates with the education of Chinese language literal comprehensively.
I spent four years in total on this work and consulted numerous books on the theory of Chinese character encoding for computers. Starting from an understanding of Chinese characters, the scheme gathers the strengths of the various existing coding schemes and draws on computer science, genetics, cognitive psychology, ergonomics, and Chinese information processing. The process of taking a code is synchronized with the thought process by which the human brain perceives and recognizes a Chinese character as a graphic.
Table one: Learning schedule for general users of Chinese character gene code
Stage | Learning time | Proficiency attained
Introductory study | Within 10 minutes | Can input all characters, words, and sentences
Basic study | Within 1 hour | Typing while reading: 30 characters per minute. Typing while listening: 20 characters per minute. Typing while composing: 15 characters per minute. On seeing a character or word, the user can quickly produce its key codes
Skilled operation | About 3 hours | Typing while reading (with fingering practice): characters and words at 60 characters per minute. Typing while listening: 50 characters per minute. Typing while composing: the user can remember which characters and words can be entered with short codes, and types characters and words at 40 characters per minute
Completing the "pen change" | Within one day | Fluent fingering in all three modes (typing while reading, listening, and composing). Character and word input speed both exceed 100 characters per minute, and the user types with ease. The user can also remember the first-choice characters and words among duplicate codes
Table two: Learning schedule for professional users of Chinese character gene code
Stage | Learning time | Proficiency attained
Basic study | Within 10 minutes | Typing while reading: 30 characters per minute. Typing while listening: 20 characters per minute. Typing while composing: 15 characters per minute. On seeing a character or word, the user can quickly produce its key codes
Skilled operation | Within 30 minutes | Typing while reading (with fingering practice): characters and words at 60 characters per minute. Typing while listening: 50 characters per minute. Typing while composing: the user can remember which characters and words can be entered with short codes, and types characters and words at 40 characters per minute
Completing the "pen change" | 2-3 hours | Fluent fingering in all three modes (typing while reading, listening, and composing). Character and word input speed both exceed 100 characters per minute, and the user types with ease. The user can also remember the first-choice characters and words among duplicate codes
Table three: Comparison of the Chinese character gene code input method with other input methods
Index | Phonetic code | Phonetic-shape code | Shape code | Chinese character gene code
Duplicate-code rate | High | Lower | Low | Low
Mean code length | 3.5 | 2.5 | 2.5 | 1.7
Difficulty | Easy | Fairly easy | Hard | Easy
Learning time | None needed | Several days | 1 month | A few minutes for initial mastery
Vocabulary size | 20,000 entries | 10,000 entries | Several thousand entries | 40,000 entries
Suitable for writing | More suitable | Suitable
Suitable for transcription | More suitable | Suitable
Coding theory | New
Sentence-processing capability | Yes
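The mean code lengths in Table three can be compared with a little arithmetic. The sketch below is illustrative only: it assumes that "mean code length" means keystrokes per character (the table gives no unit), and the function name `relative_saving` is hypothetical.

```python
# Mean code length per input method, copied from Table three.
# Assumption: the unit is keystrokes per character (not stated in the table).
MEAN_CODE_LENGTH = {
    "phonetic code": 3.5,
    "phonetic-shape code": 2.5,
    "shape code": 2.5,
    "Chinese character gene code": 1.7,
}


def relative_saving(scheme: str, baseline: str) -> float:
    """Fraction of keystrokes saved by `scheme` relative to `baseline`."""
    return 1.0 - MEAN_CODE_LENGTH[scheme] / MEAN_CODE_LENGTH[baseline]


for baseline in ("phonetic code", "phonetic-shape code", "shape code"):
    saving = relative_saving("Chinese character gene code", baseline)
    print(f"vs {baseline}: {saving:.0%} fewer keystrokes")
```

Under that assumption, the stated figures correspond to roughly 51% fewer keystrokes than the phonetic code and 32% fewer than the phonetic-shape and shape codes.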

Claims (3)

  1. The rights claimed are:
    1. Coding scheme
    One. Character input method:
    Each character takes three codes in total, and the space bar is pressed at the end to finish. For a single-component character, the radicals and components are its strokes.
    1. (1) Single-component character: a character form that cannot be split into components.
    Initial consonant code + first stroke code + last stroke code
    Example: last mgy, bird nng, my wty, agricultural nyy, send out fny, long uty, electric dhn
    (2) Quasi-single-component character: formed from one stroke and one component.
    Initial consonant code + first stroke code + component code, or initial consonant code + radical code + last stroke code
    Example: the beautiful ywy dog of old jho dawn dog coin btj qdy sword rdy too tdy master iyw does ggv
    2. Compound character coding scheme:
    Compound character: a Chinese character whose structure is built from several components or radicals. The structural types are left-right, top-bottom, full enclosure, and semi-enclosure.
    Depending on the position from which the radical is taken, compound characters are further divided into the following eight cases. The radical taken and the component taken stand in a symmetric relation.
    (1) Radical taken from the left
    Initial consonant code + left radical code + last component code
    Example: but drg, sting ykf, brain nou, stay imw, with sez, unrestrained lsg, remove uep
    If there is no last component, then
    Initial consonant code + left radical code + code for the last two consecutive strokes
    Example: whip bgw, row pfg, starve evw, carry tfw
    (2) Radical taken from the right
    Initial consonant code + right radical code + first component code
    Example: portion bel, song gqk
    If there is no first component, then
    Initial consonant code + right radical code + code for the first two consecutive strokes
    Example: than bba, goose ent
    (3) Radical taken from the top
    Initial consonant code + upper radical code + last component code
    Example: anxious jdx, true ivb, kind vyk, fennel hck, rope sva, bar tfp, wild duck fnj
    If there is no last component, then
    Initial consonant code + upper radical code + code for the last two consecutive strokes
    Example: year nrf, funeral svw, stingy svk, hop how
    (4) Radical taken from the bottom
    Initial consonant code + lower radical code + first component code
    Example: hope www, green bdw, scold mmk, cry kqk, become bye
    If there is no first component, then
    Initial consonant code + lower radical code + code for the first two consecutive strokes
    Example: back hke, allusion quotation dbm, soldier bbr, zyd, shield dme, left zgd, right ykd, smoked xst, black hsm
    (5) Radical taken from the outside
    Initial consonant code + outer radical code + last component code
    Example: prestige wwn or hgg, wear dgb, plant zgp, state gky
    If there is no last component, then
    Initial consonant code + outer radical code + code for the last two consecutive strokes
    Example: more yzw, build jzf
    (6) Radical taken from the inside
    Initial consonant code + inner radical code + code for the first two consecutive strokes
    Example: chimney cxr
    (7) For characters with a four-quadrant (田-shaped) structure
    Initial consonant code + upper-left component code + lower-right component code
    Example: energy nsb, device xkk, doubtful ybp, fragrant xvo
    If there is no lower-right component, then
    Initial consonant code + upper-left component code + code for the last two consecutive strokes
    Or initial consonant code + lower-right component code + upper-left component code
    Or initial consonant code + lower-right component code + code for the first two consecutive strokes
    (8) For compound characters that have neither a radical nor a component, the code-taking rule is the same as for single-component characters:
    Initial consonant code + first stroke code + last stroke code
    Example: visiting bth, don't bgh
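The fallback order shared by the cases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `CodeParts` and the function `take_code` are hypothetical names, the key letters are assumed to have been looked up already in the radical table and the two-stroke table (which are given only as figures), and case (7), which pairs two components, is not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodeParts:
    """Key letters already looked up for one character (hypothetical container)."""
    initial: str                        # initial consonant code
    radical: Optional[str] = None       # code of the radical taken (left, right, top, ...)
    component: Optional[str] = None     # code of the component paired with that radical
    two_strokes: Optional[str] = None   # code for two consecutive strokes
    first_stroke: Optional[str] = None  # first stroke code (used in case 8)
    last_stroke: Optional[str] = None   # last stroke code (used in case 8)


def take_code(p: CodeParts) -> str:
    """Assemble a three-letter code following the fallback order of the rules."""
    if p.radical is None:
        # Case (8): no radical and no component, so use the single-character rule.
        if p.first_stroke is None or p.last_stroke is None:
            raise ValueError("case 8 needs first and last stroke codes")
        return p.initial + p.first_stroke + p.last_stroke
    if p.component is not None:
        # Main rule of cases (1)-(5): radical code plus component code.
        return p.initial + p.radical + p.component
    if p.two_strokes is not None:
        # Fallback when the component does not exist, and the rule of case (6).
        return p.initial + p.radical + p.two_strokes
    raise ValueError("need a component code or a two-stroke code")


# Letters below are read off three of the listed example codes (drg, bgw, bth).
print(take_code(CodeParts("d", radical="r", component="g")))
print(take_code(CodeParts("b", radical="g", two_strokes="w")))
print(take_code(CodeParts("b", first_stroke="t", last_stroke="h")))
```

Which physical part of a character supplies the radical (left, right, top, bottom, outside, or inside) is decided before this step; the sketch only shows how the three letters are assembled once the parts are known.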
    3. For Chinese characters whose pronunciation is unknown or uncertain, the method used is:
    Input is based mainly on the shape code, prefixed with " ' ". The " ' " means that the initial consonant is omitted.
    That is: ' + radical 1 + radical 2 + last radical, or
    ' + radical 1 + radical 2 + space bar
    Example: give 'ark (Si population), standard 'li (Bing Cui), promptly 'ge (blunt Jie), meeting 'res (people two Si)
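The two key sequences for a character of unknown pronunciation can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `shape_only_keys` is hypothetical, and the radical codes are assumed to be single letters already looked up in the radical table.

```python
from typing import Sequence


def shape_only_keys(radical_codes: Sequence[str]) -> str:
    """Build the keystrokes for a character whose initial consonant is omitted.

    Follows the two forms given in the text:
      ' + radical 1 + radical 2 + last radical
      ' + radical 1 + radical 2 + space bar   (when only two radicals exist)
    """
    if len(radical_codes) < 2:
        raise ValueError("at least two radical codes are required")
    prefix = "'" + radical_codes[0] + radical_codes[1]
    if len(radical_codes) >= 3:
        # Three or more radicals: the last radical supplies the final letter.
        return prefix + radical_codes[-1]
    # Exactly two radicals: the space bar takes the place of the final letter.
    return prefix + " "


print(repr(shape_only_keys(["a", "r", "k"])))
print(repr(shape_only_keys(["l", "i"])))
```

The two calls reproduce the letter patterns of the listed examples 'ark and 'li; whether a further space bar is pressed after the three-radical form is not stated in the text and is left out here.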
    Four. Short-code handling of Chinese characters:
    A likes | B is not | C from | D 's | E two | F sends out | G | H and | I among | J just | K opens | L | M buys
    N you | O mouth | P is afraid of | Q please | R people | S three | T he | U car | V is | W I | X under | Y one | Z exists
    Level-1 high-frequency short-code characters (press the space bar to finish)
    A' | B' | C' | D' | E' and | F' wind | G' gives | H' very | I' this | J' the present | K' sees | L' two | M' sells
    N' that | O' lotus root | P' sheet | Q' goes | R' allows | S' four | T' she | U' factory | V' on | W' is | X' is little | Y' has | Z'
    Level-2 high-frequency short-code characters (after the initial consonant, add the vocabulary key "." and then the space bar to finish)
    A presses | B quilt | C wipes | D beats | E | F flies | G does | H meeting | I | J promptly | K can | L comes | M
    N | O vomits | P runs | Q asks | R day | S send | T too | U becomes | V during | W is past | X is new | Y month | Z
    Level-3 high-frequency short-code characters (after the initial consonant, add the letter "o" and then the space bar to finish)
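The keystrokes for the three short-code levels can be sketched as below. This is an illustrative sketch only: the names `LEVEL_SUFFIX` and `short_code_keys` are hypothetical, and the level-2 key is taken as "." exactly as printed in the rule text.

```python
# Key typed after the initial-consonant letter at each short-code level,
# as printed in the three rules; the space bar then finishes the entry.
LEVEL_SUFFIX = {1: "", 2: ".", 3: "o"}


def short_code_keys(initial: str, level: int) -> str:
    """Return the keystrokes for a short-code entry at the given level."""
    if len(initial) != 1 or not initial.isalpha():
        raise ValueError("initial must be a single letter")
    if level not in LEVEL_SUFFIX:
        raise ValueError("level must be 1, 2, or 3")
    return initial.lower() + LEVEL_SUFFIX[level] + " "


for level in (1, 2, 3):
    print(level, repr(short_code_keys("d", level)))
```

A level-1 entry therefore costs two keystrokes (letter plus space bar), and level-2 and level-3 entries cost three.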
  2. 2, Chinese character gene code radical table:
    Figure A0011634500051
  3. 3, table of roots for two consecutive strokes
    Figure A0011634500061
CN 00116345 2000-06-05 2000-06-05 Chinese character gene code Pending CN1327185A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 00116345 CN1327185A (en) 2000-06-05 2000-06-05 Chinese character gene code

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 00116345 CN1327185A (en) 2000-06-05 2000-06-05 Chinese character gene code

Publications (1)

Publication Number Publication Date
CN1327185A true CN1327185A (en) 2001-12-19

Family

ID=4585753

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 00116345 Pending CN1327185A (en) 2000-06-05 2000-06-05 Chinese character gene code

Country Status (1)

Country Link
CN (1) CN1327185A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102163087A (en) * 2011-03-29 2011-08-24 陈长俊 Chinese character shape code input method
CN108183712A (en) * 2017-12-28 2018-06-19 北京华生恒业科技有限公司 A kind of Chinese character gene coding and decoding method and system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102163087A (en) * 2011-03-29 2011-08-24 陈长俊 Chinese character shape code input method
CN102163087B (en) * 2011-03-29 2013-08-07 陈长俊 Chinese character shape code input method
CN108183712A (en) * 2017-12-28 2018-06-19 北京华生恒业科技有限公司 A kind of Chinese character gene coding and decoding method and system
CN108183712B (en) * 2017-12-28 2019-04-16 北京华生恒业科技有限公司 A kind of Chinese character gene coding and decoding method and system

Similar Documents

Publication Publication Date Title
CN1095137C (en) Dictionary retrieval device
CN1523518A (en) Intelligent Chinese cultural dictionary system
CN1278209C (en) Composite phonetic alphabet Chinese character coding input method and its keyboard
CN1327185A (en) Chinese character gene code
CN1110741C (en) Pictophonetic code Chinese character input method
CN1949148A (en) Chinese characters inputting method and device
CN1010989B (en) Input system and keyboards for ideographic characters
CN1059281C (en) Chinese phonetic coding method with initial consonant, simple or compound vowel and tone
CN1129058C (en) Chinese character phonetic code and keyboard design
CN1347023A (en) Intelligent two-stroke handwriting input system
CN1455358A (en) Chinese phonetic alphabet unified scheme, and single phonetic alphabet input and intelligent conversion translation
CN1256644C (en) Chinese-character radical input method
CN1102768C (en) Chinese character sound-shape coding input method for electronic computer
CN1366227A (en) Chinese-character fast input method without splitting
CN1825254A (en) Chinese character inputting method and computer keyboard therefor
CN1271492C (en) 26104 computer Chinese character
CN1058342C (en) Chinese character byte codes and its keyboard of using the same
CN1026829C (en) Chinese-character first and last codes inputing method and keyboard
CN1056007C (en) Codes for inputting Chinese characters
CN1123818C (en) Computer inputting method of electric spelling Chinese characters, applied keyboard and its Chinese internal code
CN1114146C (en) Chinese morpheme code and its computer keyboard input
CN1139774A (en) Chinese character coding method
CN1379307A (en) Chinese-character universal normalized holographic encode and high-speed input method
CN1467614A (en) Three-in-one encode character for computer and keyboard input method
CN1269010C (en) Chinese bit coding keyboard inputting method

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication