CN101339466B - United Chinese characters font commandcode for Chinese, Japanese and Korean - Google Patents

Info

Publication number
CN101339466B
Authority
CN
China
Prior art keywords
code
parts
chinese character
character
chinese
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN200810212411.1A
Other languages
Chinese (zh)
Other versions
CN101339466A (en)
Inventor
曹述交
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN200810212411.1A priority Critical patent/CN101339466B/en
Publication of CN101339466A publication Critical patent/CN101339466A/en
Application granted granted Critical
Publication of CN101339466B publication Critical patent/CN101339466B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The invention provides a "CJK Unified Ideographs" operating code in the technical field of Chinese character encoding for computer keyboard input. It is a specific embodiment of the earlier invention "Chinese character natural component encoding", further unified with Chinese character ordering, character lookup, character teaching, and Chinese character informatization. The invention selects 638 components and 27 index numbers and places them in a two-dimensional table that is in essence a rectangular coordinate system. The components are grouped by starting stroke, subsequent strokes, and starting-stroke components; the index numbers are divided into main codes and auxiliary codes. This establishes a correspondence between the components and the main and auxiliary codes, so that each component has a composite code. Characters are encoded by ordering the main and auxiliary codes, which yields a Chinese character operating system for the 20,902 characters of "CJK Unified Ideographs" with a stated accuracy of 100% and a duplicate-code rate of 0%, realizing the informatization of the international standard Chinese character set.

Description

"CJK Unified Ideographs" glyph operating code
The invention belongs to the technology of Chinese character encoding for computer keyboard input.
The invention is a specific embodiment in which the keyboard input technology of patent No. ZL94111115.6, "Chinese character natural component encoding", is further unified with Chinese character ordering, character lookup, character teaching, and Chinese character informatization.
"CJK Unified Ideographs" is the unified set of Chinese characters used in China, Japan and Korea. It incorporates the relevant standard character sets of each of the three countries, and is the international standard character set that currently contains the most characters (20,902) and gives them a unified encoding linked to binary numbers. The unified encoding of "CJK Unified Ideographs" is an "information interchange code". It is a sequential code for whole characters, used for exchanging information between computers. The "CJK Unified Ideographs" glyph operating code is a component-based random code, used for exchanging information between the human brain and the computer.
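The "information interchange code" can be illustrated with a minimal Python sketch. It assumes that the 20,902-character set corresponds to the code point range U+4E00 to U+9FA5; the patent text itself names no code points, and the function names below are illustrative only.

```python
# Assumption: "CJK Unified Ideographs" here means the code point range
# U+4E00..U+9FA5; the patent text itself does not state code points.
CJK_FIRST = 0x4E00
CJK_LAST = 0x9FA5


def block_size(first: int, last: int) -> int:
    """Number of code points in an inclusive range."""
    return last - first + 1


def interchange_code(ch: str) -> str:
    """Return the code point of a single character as a 16-bit binary string."""
    return format(ord(ch), "016b")


def in_block(ch: str) -> bool:
    """True if the character lies inside the assumed range."""
    return CJK_FIRST <= ord(ch) <= CJK_LAST


if __name__ == "__main__":
    print(block_size(CJK_FIRST, CJK_LAST))  # 20902
    print(interchange_code("一"))            # 0100111000000000
    print(in_block("A"))                     # False
```

Under this assumption the range holds exactly 20,902 positions, which matches the character count given in the text. Each character is identified by its position in the sequence, not by its internal structure.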
I. Chinese character encoding has turned into Chinese character input
In the 1980s the State Scientific and Technological Commission promoted the "Five-stroke Method" (Wubi), and the Chinese people for the first time put Chinese character encoding into practice on a large scale. Chinese characters became the legally recognized national script only in the first year of the 21st century. In the 1980s their legal status was still unsettled, the academic climate had not yet worked through the political ideal of "Chinese character spelling", the conditions did not yet exist for the state to declare clearly that "Chinese character encoding is the sunlit road toward Chinese character spelling", and the corresponding technology had not been invented in time. Sailing on the long river of history, advance and retreat go together. Today "Chinese character input methods" quietly dominate computer keyboard input technology as a whole: the starting character set for encoding need not follow the international standard, the components need not recompose every character that was split, duplicate codes need not be fully eliminated, and the hard targets of "Chinese character encoding" have been shelved entirely. Even the "Peking University Chinese Forum" on the Internet has only a "Chinese information processing" board and an "input method discussion area". The concept of "Chinese character informatization" has not yet formed. The way forward for "Chinese information processing" is "Chinese informatization"; the key to "Chinese informatization" is "Chinese character informatization"; and the basis of "Chinese character informatization" is "Chinese character encoding". An input method is not the same thing as Chinese character encoding. Chinese character encoding is the most basic informatization technology, yet there is no venue in which to discuss it, and people's efforts have departed completely from the original ideal goal of promoting the "Five-stroke Method".
There are many reasons for the present situation. Restating which of "encoding technology" and "input technology" is primary and which is secondary may give Chinese character encoding a new lease on life. In "Chinese character encoding keyboard input technology", encoding is the premise, the condition, and the foundation of the technology; keyboard input is the application of the encoding and the test of it. Chinese character encoding keyboard input technology is in essence a technology of Chinese character encoding. Remove the word "encoding" and it becomes "Chinese character keyboard input technology", and the constraints imposed by that premise, condition, and foundation disappear. This maneuver sidesteps the technical difficulty of Chinese character encoding, but it completely changes the substance of the innovation; it is a form of escapism in Chinese character technology. Old habits die hard. For political circles, academia, the technology sector, investors, and the media to become clear again about the goal, task, and significance of Chinese character encoding is a major matter for China's informatization.
II. A statement of the long-term goal of Chinese character encoding
Chinese character encoding is divided into whole-character sequential codes and component-based random codes. The long-term goal of component-based random coding is to build a set of code symbols that can be operated on linearly and that serves as the shadow of the Chinese characters, thereby realizing the informatization of Chinese characters.
1. Encoding is the foundation work of Chinese character informatization
What is the informatization of Chinese characters? First consider what "information" is. The Modern Chinese Dictionary gives "information" two senses: "1. news; message. 2. in information theory, a report transmitted by symbols, whose content the recipient does not know in advance". The "information" in Chinese character informatization refers, of course, to reports, instructions, sounds, pictures, numbers, and the like that are transmitted by symbols and not known to the recipient in advance.
The "symbol transmission" of informatization is a very complicated process. Because the symbols of "sound" are momentary, their transmission distance is very limited. The "transmission symbols" of informatization generally run: glyph symbol → (digit or letter) code symbol → (binary) numeric symbol → (positive or negative potential) state symbol. Together they form an informatization "transmission symbol chain". People who speak Chinese transmit language information using glyph symbols, called written Chinese. People who speak English transmit language information using (letter) code symbols, called English. Software encoding uses code symbols and numeric symbols; electronic hardware and communication lines use state symbols in operation. It follows that for Chinese characters to become the "terminal symbol" of the transmission symbol chain, they need one more step than English: Chinese characters must first be converted into code symbols. Random coding of Chinese characters creates the conditions for converting them into unified code symbols. Once Chinese characters have their own unified code symbols linked to (binary) numeric symbols, their informatization is realized.
2. Chinese characters should complete the third stage of their development
The "Latinization" of Chinese characters and the "one language, two scripts" proposal of the last century were both rejected by the spoken and written language law enacted in the first year of this century. But Latinization was not a "groundless rumor". What exactly is the problem with Chinese characters? We must look again with open eyes and find the cause.
The four ancient original scripts of the world were all pictographic in their earliest period, expressing meaning with pictures. Humans express and exchange thought through "language". In this period the expression and exchange of thought was divided into expression by sound and expression by shape; that is, "language is divided into sound expression and shape expression". There is no basis for affirming or denying that sound expression and shape expression corresponded to each other at that time, nor can the level of development of each be investigated. This was the initial stage of the development of language, and it lasted a very long time.
Later, "meaning symbols" (class symbols) and "sound symbols" gradually differentiated within writing, sound expression and shape expression began to correspond, and writing began to represent sound. This is the second stage of the development of language, and its duration should be reckoned in units of a thousand years.
Around 2000 BC the Semitic people set aside the traditional written forms of the time and began to record the consonants of speech with letters (vowels were not recorded separately). Speech, the sound expression, thus acquired a shape code, and inside the shape code the letters are arranged linearly as units. From the viewpoint of the "transmission symbol chain", traditional writing gave rise to a "code-symbol script" (belonging to the second link of the chain). That the internal units of a code-symbol script can be operated on linearly became the third important property of writing. Expressing meaning, representing sound, and linear operation of internal units are together called the three major functions of writing.
English is a script used worldwide, and Chinese is set to spread beyond China. It is useful to compare Chinese characters and English words on the three functions. Expressing meaning: the pictographs, indicatives, and compound ideographs of Chinese, and the "class symbols" of the phono-semantic characters that make up more than 80 percent of characters, all express meaning. In English words, only a few variations such as gender and number express meaning. Chinese characters have thus retreated from expressing meaning entirely to expressing it in part, whereas English words express lexical meaning essentially by sound, and expression by shape has atrophied. Representing sound: in Chinese the "sound symbol" represents the sound of a single character, and Chinese dictionaries need Hanyu Pinyin for phonetic notation; in English the "letters" represent the sound of a word, and English dictionaries need the International Phonetic Alphabet for phonetic notation (Russian and some other languages need none). Linear operation: the internal units of a Chinese character are arranged in a plane and cannot be operated on linearly; the internal units of an English word are arranged in a line and can.
From the above account and comparison: for a "glyph-symbol script", expressing meaning is the first stage of development, representing sound is the second stage, and linear operation of the internal units, particularly in the computer age, is the third stage, which the glyph-symbol script, Chinese characters, has not yet completed. The only thing Chinese characters now lack is linear operation of their internal units. For a "code-symbol script", representing sound by letters and linear operation of internal units arrive together, so there is no separate second and third stage. The so-called theory that writing develops as "meaning, then sound, then phonetic spelling (which is also representing sound)" is purely a conclusion reached by grafting the sound-representing attribute across the two classes of script, glyph-symbol and code-symbol, that are connected in series in the "transmission symbol chain" above; it is wishful thinking. The essence of writing is the symbol. Glyph-symbol scripts and code-symbol scripts each have their own form of development. Once Chinese characters have completed the third stage, linear operation of internal units, they will be even better than code-symbol (phonetic) scripts.
III. Component construction in the Chinese character encoding scheme
The technical design of a Chinese character encoding scheme for computer keyboard input falls roughly into four steps: component construction, code construction, matching components to codes, and encoding each character by its component codes. Each of the four steps leaves room for technique, and the amount and quality of technique applied across the steps together determine the technical quality of the scheme.
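The four steps can be sketched in Python as a toy pipeline. This is only an illustration under stated assumptions: the two-digit codes are invented placeholders and not the patent's main or auxiliary codes, the decompositions are assumed examples, and joining the component codes in writing order is an assumed stand-in for the scheme's actual ordering rule.

```python
from collections import defaultdict

# Steps 1-3 (toy data): components and their codes. The two-digit codes are
# invented placeholders, NOT the patent's actual main/auxiliary codes.
COMPONENT_CODE = {"至": "11", "刂": "32", "正": "12", "攵": "51"}

# Hypothetical decomposition table: character -> ordered components.
DECOMPOSITION = {"到": ["至", "刂"], "政": ["正", "攵"]}


def encode(ch: str) -> str:
    """Step 4: join the codes of a character's components in writing order."""
    return "".join(COMPONENT_CODE[part] for part in DECOMPOSITION[ch])


def duplicate_codes(chars) -> dict:
    """Group characters by code and keep only codes shared by 2+ characters."""
    by_code = defaultdict(list)
    for ch in chars:
        by_code[encode(ch)].append(ch)
    return {code: group for code, group in by_code.items() if len(group) > 1}


if __name__ == "__main__":
    print(encode("到"))                    # 1132
    print(duplicate_codes(DECOMPOSITION))  # {}
```

An empty result from `duplicate_codes` over a whole character set is what a duplicate-code rate of 0% would mean in this simplified model.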
In encoding Chinese characters for keyboard input, not every Chinese character (including characters yet to be discovered) has to be given a random code. For the choice of character set, it is realistic to target two sets: at the upper end the Hanyu Da Zidian with more than 50,000 characters, and at the lower end "CJK Unified Ideographs" with more than 20,000. Characters outside the Hanyu Da Zidian can be handled by "insertion" and the "information interchange code".
1. Choice of character set
Random coding of Chinese characters should start from the largest feasible character set, because the components and their codes are the foundation of random coding. Once the components of the largest set have been carefully selected, they can recompose every split character and guarantee that there are no duplicate codes. For any smaller set, the selected components and character codes are then already contained in the results for the largest set. If the chosen set is not the largest, for example the 6,763 characters of the national standard, then the components for the 20,902 characters of the international standard will not necessarily contain the components chosen for the 6,763 characters. Once random coding of the international standard set succeeds, success on the national standard set has no significance for a unified Chinese character encoding. This is a piece of "strategic thinking" that experts and scholars devoted to Chinese character encoding must adhere to.
This scheme takes the "CJK Unified Ideographs" character set as the starting target of random coding. The design of the component code table and the selection of components, however, are both carried out on the Hanyu Da Zidian character set, leaving room for upgrading.
2. Definition of components
On the definition of a component, some say a component is a stroke-structure block larger than a stroke and smaller than a whole character; others add that components stand in relations of separation, crossing, touching, and so on; opinions differ. This line of thinking defines components for their own sake. This scheme instead defines components in terms of the goal of Chinese character informatization:
A component is an internal unit of a Chinese character that carries the three tasks of expressing meaning, representing sound, and linear operation.
In this way the meaning-expressing, sound-representing, and linear-operation functions of Chinese characters are realized entirely by the internal units, that is, by components that express meaning, represent sound, and guarantee linear operation. A Chinese character is a carrier of meaning with several senses.
3. Splitting into components
Following the above definition, this scheme splits each character, as far as possible according to the character-formation principles of indicatives, pictographs, compound ideographs, and phono-semantic characters, into five kinds of component: whole characters, left and right radicals, character tops and bottoms, abbreviated-character components, and auxiliary strokes. Components thus have five identities. Pictographic and indicative characters are generally not split. The specific splitting method is as follows:
到 ("arrive"). Shuowen: "到 means to reach. From 至, with 刀 as the phonetic." (Hanyu Da Zidian, compact edition, p. 141.) 到 is split into two components, 至 刂. The "Five-stroke Method" splits it into 一 厶 土 刂.
嬴 ("win"). Shuowen: "嬴 is the surname of the Shaohao clan. From 女, with abbreviated 羸 as the phonetic." (Hanyu Da Zidian, compact edition, p. 886.) 嬴 is split into two components: [glyph image GFW00000053702900041] and 女. The first is what is left of 羸 after abbreviation, and this is where the "abbreviated-character component" identity comes from. The "Five-stroke Method" splits it as "Tou", "winning", 口, 丶.
含 ("contain"). "含 means held in the mouth. From 口, with 今 as the phonetic", a phono-semantic character. "今 means the present time. From "Ji" (deformed into "Ra"), from "Off". "Off" is the ancient form of 及." 含 is split into three components: "Ra", "Off", and 口. The "Five-stroke Method" splits it into 人 丶 "Off" 口.
藏 ("hide"). "藏 means to conceal. From 艹, with 臧 as the phonetic", a phono-semantic character. "臧 means good. From 臣, with 戕 as the phonetic", a phono-semantic character. "戕 means a spear. When a minister of another state kills the ruler, it is called 戕. From 戈, with 爿 as the phonetic", a phono-semantic character. 藏 is split into four components, 艹 臣 戈 爿. The "Five-stroke Method" splits it into 艹 厂 [glyph image GFW00000053702900047] 丿.
舔 ("lick"). "舔 means to wipe with the tongue. From 舌, with 忝 as the phonetic", a phono-semantic character. "舌 is what one speaks with. From 千, from 口, with 千 also as the phonetic." "忝 means to disgrace. From 心 (in its variant form), with 天 as the phonetic." "天 means the top, the highest point. From 一 and 大." 舔 is split into five components: 千, 口, 一, 大, and the variant form of 心. The "Five-stroke Method" splits it into 丿 古 一 and the same variant form of 心. (In this scheme a character is split into at most five components. This follows from how the component summary table and the component primary-key and auxiliary-code table were selected and designed.)
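The splitting examples above can be expressed as a small data structure. The following Python sketch assumes that two of the example characters are 到 and 藏, as reconstructed from the dictionary quotations, and compares the scheme's split of 到 with the "Five-stroke Method" split quoted in the text. All names are illustrative.

```python
MAX_COMPONENTS = 5  # the scheme splits a character into at most five components

# Assumed reconstructions of two worked examples (character -> components).
SCHEME_SPLIT = {
    "到": ["至", "刂"],
    "藏": ["艹", "臣", "戈", "爿"],
}

# The "Five-stroke Method" split of the first example, as quoted in the text.
WUBI_SPLIT = {"到": ["一", "厶", "土", "刂"]}


def is_valid_split(parts: list) -> bool:
    """A split is valid here if it has between one and five components."""
    return 1 <= len(parts) <= MAX_COMPONENTS


def saved_components(ch: str) -> int:
    """How many fewer components the scheme uses than the other split."""
    return len(WUBI_SPLIT[ch]) - len(SCHEME_SPLIT[ch])


if __name__ == "__main__":
    print(all(is_valid_split(p) for p in SCHEME_SPLIT.values()))  # True
    print(saved_components("到"))                                  # 2
```

Splitting along the phonetic and semantic parts of a character tends to give fewer, larger components, which is the contrast the examples draw.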
Whole characters, left and right radicals, character tops and bottoms, abbreviated-character components, and auxiliary strokes are the hallmark of component construction in this scheme, and they also offer a new approach to Chinese character teaching and standardization.
A special note: choosing whole characters as components has particular significance. A whole character ranks above a component; a whole character used as a component is a whole character that has been demoted. This does not contradict the view that "Chinese characters have three levels: stroke, component, whole character"; it is a practical solution. Whole characters become the main body of the components because most Chinese characters have a remainder besides the radical; some remainders can again be divided into a radical and a remainder, and most remainders are whole characters. The number of radicals depends on how fine the classification is, while the number of whole characters is determined by the remainders of individual characters. It is therefore natural that most components are whole characters. Practice has shown that the more whole characters are chosen as components, the smaller the total number of components. This was key to achieving a 100 percent recomposition rate for the more than 20,000 characters of "CJK Unified Ideographs". Whole characters make up 67% of the components in this scheme, which is another of its key features.
4. Splitting and recomposition
Components are divided into character-building components and splitting components. Character-building components were invented first, splitting components later. What character-building components spell out is a Chinese character. For splitting components, nobody had previously required that they recompose every split character in a given character set. But splitting and recomposing are two interdependent aspects of innovating on Chinese characters. If characters are not split, the question of recomposition does not arise; if characters are not recomposed, there is no good reason to split them. Where there is splitting there must be recomposition; this is an instance of the dialectical law of the unity of opposites. Splitting only, or splitting arbitrarily, without recomposing is not innovative thinking.
Recomposition is the principal measure for keeping the "objective form" and the "subjective form" of Chinese characters unified. A character that is carved, written, printed, or displayed is an objective form; a character as mapped in the brain is a subjective form; the two are interdependent aspects of the character. The people who used the Maya script were wiped out by colonists, the subjective form of the Maya script ceased to exist, and the script died out with it. An encoding process that does not capture 100 percent of the glyph information forms in the brain an image (subjective form) with strokes missing. This not only raises the suspicion of a "road to slow extinction" but also violates the requirements of Chinese character standardization, and directly affects attitudes toward learning, writing, and using Chinese characters.
The components selected in this scheme recompose 100 percent of the split characters in "CJK Unified Ideographs". Ranked by the number of characters they help recompose, the leading components are: 口 with 2,164 characters, 艹 with 1,378, 氵 with 1,198, a component whose glyph is missing here with 923, 日 with 849, and so on; 12 components each recompose only a single character.
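The two figures in this paragraph, the recomposition rate and the number of characters each component helps recompose, can be computed from a decomposition table. The sketch below is a minimal Python illustration on assumed toy data; the decompositions and the component set are examples and not the patent's tables.

```python
from collections import Counter

# Assumed toy decomposition table and component set (not the patent's data).
SPLITS = {
    "到": ["至", "刂"],
    "政": ["正", "攵"],
    "激": ["氵", "白", "方", "攵"],
}
COMPONENTS = {"至", "刂", "正", "攵", "氵", "白", "方"}


def recomposition_rate(charset, splits, components) -> float:
    """Fraction of characters whose every part is a selected component."""
    covered = sum(
        1
        for ch in charset
        if ch in splits and all(part in components for part in splits[ch])
    )
    return covered / len(charset)


def usage_counts(splits) -> Counter:
    """For each component, count the characters it appears in (once each)."""
    counts = Counter()
    for parts in splits.values():
        counts.update(set(parts))
    return counts


if __name__ == "__main__":
    print(recomposition_rate(SPLITS, SPLITS, COMPONENTS))  # 1.0
    print(usage_counts(SPLITS)["攵"])                       # 2
```

Removing a single component from the selected set lowers the rate for every character that needs it, which is why the text treats 100 percent recomposition as a hard target for component selection.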
5. Total number of components
The total number of components in this scheme can be justified. Take the Hanyu Da Zidian as an example: it contains more than 50,000 characters and uses more than 200 radicals. 水 and 氵 count as one radical, but they differ in form and identity and should count as two components. Counted this way, the Hanyu Da Zidian has some 320 to 330 components serving as radicals. Assuming that "radicals" and "remainders" each make up half, the number of components able to recompose every split character in the Hanyu Da Zidian character set should be a little over 600. Is 600-odd components a lot? Compared with the 540 radical characters that Xu Shen used for the more than 9,000 characters of the Shuowen Jiezi, using 600-odd components for the more than 20,000 characters of "CJK Unified Ideographs" is not many for the starting components of "Chinese character informatization".
The components finally selected for this scheme number 638. Arranged by starting stroke and by identity, they are as follows:
(1) Whole characters
(i) Characters included in the Modern Chinese Dictionary (329)
[329 component characters; the glyphs did not survive text extraction, and one was shown as glyph image GFW00000053702900054. Group sizes as given: 107, 65, 71, 40, 46.]
(ii) Characters included in the Hanyu Da Zidian (98)
[98 component characters, most of them shown as inline glyph images (GFW00000053702900057 to GFW000000537029000621). Group sizes as given: 24, 19, 26, 9, 20.]
(2) Left and right radicals (41)
[41 radical glyphs, most of them shown as inline glyph images (GFW000000537029000623 to GFW000000537029000631).]
(3) Character tops and bottoms (32)
[32 glyphs, most of them shown as inline glyph images (GFW000000537029000632 to GFW000000537029000642).]
(4) Abbreviated-character components (124)
[124 glyphs, shown as inline glyph images (GFW000000537029000644 to GFW000000537029000652). Group sizes as given: 34, 36, 27, 9, 18.]
(5) Auxiliary strokes (19)
[19 stroke glyphs, most of them shown as inline glyph images (GFW000000537029000653 to GFW000000537029000659); the legible ones include フ and 亅.]
6. Classification of components
Components are generally classified in four ways.
(1) Classification by "starting stroke". Components, like characters, have five kinds of starting stroke, "horizontal, vertical, left-falling, dot, turning", and correspondingly there are five kinds of starting-stroke component. 豆 is a horizontal-start component, 田 a vertical-start component, 矢 a left-falling-start component, 音 a dot-start component, 矛 a turning-start component, and so on. Classifying components by starting stroke is the basis for assigning them to the nine digits that correspond to them. It has a shortcoming, however: except for the dot, each of "horizontal, vertical, left-falling, turning" corresponds to two digits, so a further device is needed to make the correspondence one-to-one.
(2) Classification by "starting-stroke continuation feature". The starting stroke is the first stroke; the subsequent strokes are all the strokes after it. The starting-stroke continuation feature is the relative position and connection between the starting stroke and the subsequent strokes. With this feature as the criterion, horizontal-start components, for example, can be divided on two further levels. Take the components 豆 工 石 十 大 其. On the first level, 豆 工 石 are components whose horizontal stroke does not protrude (corresponding to one), and 十 大 其 are components whose horizontal stroke protrudes (corresponding to two). This level makes the assignment of components to digits one-to-one. On the second level, 豆 is horizontal, not protruding, separated; 工 is horizontal, not protruding, touching a vertical; 石 is horizontal, not protruding, touching a left-falling stroke; 十 is horizontal, protruding, crossing a vertical; 大 is horizontal, protruding, crossing a left-falling stroke; 其 is horizontal, protruding, with multiple crossings; and so on. This level is an important means of assigning components to the 27 key positions (forming 27 families) in one-to-one correspondence.
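The two levels described above can be written down as a lookup table. The Python sketch below covers only the six horizontal-start examples named in the text, assuming they are 豆 工 石 十 大 其; the field names and relation labels are illustrative, and the scheme's actual key assignments are not reproduced here.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    protrudes: bool  # does the horizontal starting stroke stick out?
    relation: str    # how the next stroke meets the starting stroke


# The six horizontal-start examples (labels are illustrative paraphrases).
HORIZONTAL_EXAMPLES = {
    "豆": Feature(False, "separated"),
    "工": Feature(False, "touches vertical"),
    "石": Feature(False, "touches left-falling"),
    "十": Feature(True, "crosses vertical"),
    "大": Feature(True, "crosses left-falling"),
    "其": Feature(True, "multiple crossings"),
}


def first_level_digit(component: str) -> int:
    """First level: not protruding -> digit 1, protruding -> digit 2."""
    return 2 if HORIZONTAL_EXAMPLES[component].protrudes else 1


def family(component: str) -> tuple:
    """Second level: the digit plus the relation identifies the family."""
    feature = HORIZONTAL_EXAMPLES[component]
    return (first_level_digit(component), feature.relation)


if __name__ == "__main__":
    for name in HORIZONTAL_EXAMPLES:
        print(name, family(name))
```

In this sketch each of the six examples lands in its own family, which is the one-to-one property the second level is meant to provide.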
(3) Components are classified by their position in the writing order relative to the first stroke. Components are divided into dual-purpose components, which can serve either as first-stroke components or as continuation components, and continuation-only components. As an example of a dual-purpose component, "leather" is the first-stroke component of the character "saddle" (leather Http female), the first continuation component of "despot" (rain leather month), the second continuation component of "Tiao" ( leather), and the third continuation component of "Gong" (a few Dian leather of work). As an example of a continuation-only component, "The-Fan" is never the first-stroke component of any character; it is the first continuation component of "political affairs" (positive The-Fan), the second continuation component of "religion" (Uu The-Fan), the third continuation component of "sharp" (the white side The-Fan of Rui), the fourth continuation component of "common vetch" (Lv Chi [Figure GFW000000537029000662] several The-Fan), and so on. Of the 638 components, 175 are continuation-only components and 463 are dual-purpose components.
The components of this scheme are grouped into component families by first stroke, first-stroke continuation feature and first-stroke component, and are distributed over the 27 primary codes derived from the nine digits. Components and characters thus follow the first-stroke correspondence rule "horizontal one two, vertical three four; left-falling five six, dot seven; turning eight nine, with digits standing in for strokes; continuation strokes are judged as joined, crossed or detached", which lays the foundation for encoding Chinese characters numerically by component.
7, the authentication of parts
Computer encoding of Chinese characters has to deal with three major character sets: the national standard with 6,763 characters, the international standard with 20,902 characters, and the "Chinese Big Dictionary" with more than 50,000 characters. Different coding experts who complete the encoding of these three sets may each arrive at a different set of components. Whose set of components is best? The selection can be objective and of a high standard only if it passes checks and authentication in several respects.
(1), motivation authentication.
Most of the character-forming components of Chinese characters are motivated. It should be a principle of computer encoding that the components into which a character is split agree with its character-forming components. The motivation of character-forming components is discussed extensively in the "shape explanation" entries of the "Chinese Big Dictionary"; Xu Shen's "Shuowen Jiezi" ("Origin of Chinese Characters") is the principal ancient text on component motivation.
The direct result of motivation authentication is that components have five identities. That is, components should be selected from demoted full characters, left and right radicals, character heads and bases, abbreviated-character members, and auxiliary strokes. This is the principal feature of component selection in this scheme.
This scheme has demonstrated the motivation of all 638 components and has listed the characters that each component helps to compose. The listed "first-stroke component characters" are used to compile a "first-stroke component character library". The listed "continuation component characters" show the composing ability of a component and display the regularities of character formation. Two components are excerpted below from the manuscript of the "CJK unified ideograph font operational code" to illustrate the form and content of the demonstration. In items (2) and (3), labels such as "1 2", "1 3" and "1 4" have the following meaning: the main digit "1" indicates that the component is the first component of the character, and the subscript "2", "3" or "4" gives the total number of components in that character. In "2 2", the main digit "2" indicates that the component is the second component of the character and the subscript "2" gives the total number of components. The rest follow by analogy.
126 north (b ě i)
(1) select reason:
< < says civilian > >: " the back of the body.From meat, northern sound "." north " is the watch sound parts of the words such as the back of the body.“ Ji, thoroughbred horse " etc. be to take the four parts words that " north " be parts.The principle of drawing connected structure by general not demolition pen, " north " can only be split at most two parts.At this moment “ Ji Ji " be five yards of repeated code words of five parts." north " should elect parts word as.
(2) first stroke of a Chinese character parts word:
1 2the Bei Qiu back of the body
Figure GFW00000053702900071
1 3portion Ji 1 4
Figure GFW00000053702900072
(7 word).
(3) a continuous parts word:
2 2bei perylene is well-behaved takes advantage of 2 3back Disobey Bei Bei remains Dou Chui 2 4 ji Ji Ji 3 3cheng Cheng Cheng Sheng 3 4dou (19 word).
482  [people (r é n) prefix]
(1) select reason:
< < says civilian > >: “ Wei.In Cong Ren factory "." Negative.From people, keeping Tony relies on also to some extent ".“ Xian.From people on mortar "." look.From people , Cong Ran "." Xiong.From from people on cave ".< < Plot microcurie little Learn states woods > >: " for a long time, at the beginning of moxibustion, word is also.From sleeping people.The shape resembling with the bright body of thing is drawn at end ".Investigate prefix Shi “  " word, although not completely from people, most of from people.Dictionary radicals by which characters are arranged in traditional Chinese dictionaries catalogue in modern age variant using "  " as " cutter " all, "  " should just be called " people's prefix ", is more conducive to the understanding of the meaning of word.
(2) first stroke of a Chinese character parts word:
1 2jiu Ma Negative Fu Xian Jiu Zhai look hay is exempted to strive and is resembled Tortoises Cu 1 3anxious young Zou of imperial or royal seal Zhen Er Huan Huan Gui Kamei Huan Xiang Discontented-with-oneself Peck Qian Flame that wrinkles exerts oneself 1 4evil spirit Xiong Factories 1 5
Figure GFW00000053702900082
(37 word).
(3) a continuous parts word:
2 3
Figure GFW00000053702900083
Quiet Cheng earns towering Open savage
Figure GFW00000053702900085
Polished criticizing sb's faults frankly
Figure GFW00000053702900086
Figure GFW00000053702900087
Clean
Figure GFW00000053702900088
Zheng zither Zhen Take-advantage-of claim your Xi Mi Multitudinous of Rhesus more You Kan Hom pinch Taste Han flame Kan Qian Stuffing filling Flatter and flatter Flooded and fall into the remorse Jiu Escort of lotus Dan Yan Yan Jiu Jiu Jiu Liu Jiu Jiu Pull Wan and draw fasten hide or leather on Man Of-Records in evening Wrists catfish Exhort Wan Mei and contaminate the gorgeous Jing Zui of Man childbirth Wen Frequently-Mailed
Figure GFW00000053702900089
Yan Cui Se Cesium an angry look absolutely absolutely rubber Xiang Xiang Xiang Xiang Silicon-Image Xiang Yang Yu Lai Fu Confide Fu Fu relies and becomes to fabricating ancient attendants in charge of cart and horses for aristocrats crape 2 4Huan Change Call Huan Huan Huan Huan Paralyzed changes and calling out the shining paralysis that melts and support by the arm your lot Xian 2 of the steady hidden Pi Fan Mian Heng Qiu of gluttonous slander Worcestershire 5Xuan Buckle Juan Joan Qiong Wan ?3 4The rapids an ancient musical pipe leprosy of La La La Lazy Otter Lai Seto Lazy Lan Lai Scoundrel Disease otter
Figure GFW000000537029000810
Yan
Figure GFW000000537029000811
Yan Chan Jue Jue Ping Jing
Figure GFW000000537029000812
Yan Yan coffin with a corpse in it Mu Yu Mi Ching You Mu
Figure GFW000000537029000813
Crown
3 5Qiu Heng Bao addiction (180 word).
(2), spell multiple authentication
A Chinese character coding scheme should first have its own library of numerically coded characters or of alphabetically coded characters. This is the basic guarantee that the code symbols of Chinese characters can rival the code symbols of English words. In that library, each character should be followed by the components into which it is split and from which it is recomposed, showing that the recomposition rate of the split characters is 100%. For the recomposition ("spell multiple") authentication of this scheme, see the coded character library below.
(3), encoded authentication
Why are Chinese characters split into components? Components serve encoding: once the components have been given codes, characters can be encoded by the codes of their components. Encoding authentication is the authentication of the components' goal; motivation authentication and recomposition ("spell multiple") authentication merely authenticate their function.
If the components selected in a coding scheme pass motivation authentication and recomposition authentication but fail encoding authentication, that is, if duplicate codes occur within the specified character set, the selected components cannot accomplish the encoding task. Even a component list that many people approve of then has no practical significance. It is like a rocket for launching an artificial satellite: every part has passed inspection against the design requirements and the staff are satisfied, but until the satellite has actually been sent into orbit the parts have not passed the final practical test. The duplicate-code rate of this scheme is 0%; see the coded character library below.
Four, the code construction of Hanzi coding scheme
The choice of code is also key to the success of encoding Chinese characters. Why do coding experts always reach without hesitation for linearly arranged letters? If chemists had confined themselves to linearly arranged letters and never stepped beyond that boundary, today's students would have no molecular formulas to represent the structure of matter. Likewise, if today's coding experts confine themselves to linearly arranged letters, tomorrow's students will have no codes to represent the internal operation of Chinese characters.
1, the origin of code
About 4,000 years ago the Semitic people used a linear arrangement of letters that "express" consonants to "record" the sounds of speech. Linearly arranged letters thereby became the code of phonetic writing, or "word code". This was the beginning of humanity's use of codes.
Word codes were combined according to grammar to form "code words". Code words later diverged into two forms: one represents vowels with letters, the other represents vowels by adding affixes to consonants. Code words thus formed a linear series and an affix series. English belongs to the linear series; Arabic belongs to the affix series. If a letter that stands for a phoneme of speech is a "code", then "d" and "o" in English are "codes", the spoken words "do" and "dog" are "secondary codes", and the acronym "AIDS" is a "tertiary code".
2, the form of code
Codes are not exclusive to alphabetic writing; they appear in every field of the arts and sciences. "Algebra" in mathematics is in effect a mathematics of codes: a+b=c can stand for 2+3=5 or for 3+4=7. Molecular formulas in chemistry, such as K2O for potassium oxide and KNO3 for potassium nitrate, are also a kind of code. Hence one can see that:
The form of code is by the actual needs design of code things.
3, the code of this programme
This scheme uses numeric codes. Four-digit numeric codes provide at most 9,999 integers, which is not enough to give each of the 20,902 characters of the "CJK unified ideograph" set its own ordered code. This scheme therefore extends ordinary digits through a "capacity expansion" ("dilatation") design and a "level" design.
(1) capacity of integer
In their original form there were ten numerals, the "base numerals". Once the ancients had mastered carrying, 11 and beyond appeared and the natural numbers were formed. When numbers came to be drawn on a number axis, the base numerals became 0123456789, called "lattice numerals" on the diagram. The diagram developed further into (two-dimensional) plane coordinates, (three-dimensional) space coordinates and (four-dimensional) space-time coordinates. Because numerals were used only for calculation, the lattice numerals on the different axes of a coordinate system had to be identical and could not be distinguished; otherwise calculation would be impossible. When such numerals are used to express order, two-, three- and four-dimensional numerals therefore all collapse onto a one-dimensional number axis. To meet the needs of expressing order, the capacity-expansion design marks the lattice numerals on the four axes of the (four-dimensional) space-time coordinates with affixes, forming affix numeral codes, as shown in Fig. 1:
The labeling method adds the affixes ( ), (-), (∨) and (∧), in the order "none, bar, hook, cap", to the lattice numerals of the four axes respectively, forming 37 affix numerals:
Figure GFW00000053702900091
Zero is the origin of the coordinate diagram, so adding an affix to it has no mathematical meaning. The affix numerals are pronounced with the four tones of standard Chinese:
Figure GFW00000053702900092
In this way, the one-digit integers, which excluding "0" originally numbered only 9, number 36 on the four axes after "capacity expansion". The two-digit integers originally ran from 10 to 99, 90 in all. After "capacity expansion", "1" and "1", "1" with [Figure GFW00000053702900101], "1" with [Figure GFW00000053702900102], "1" and "2", "1" with [Figure GFW00000053702900103], "1" with [Figure GFW00000053702900104], and so on can all be combined into two-digit ordinal integers, and their number surges from 90 to 1,332.
On the X number axis alone, integers of one to four digits give a capacity of 9,999 and integers of one to five digits a capacity of 99,999, belonging to the ten-thousand level. In the XYZT coordinate system, four digits give a capacity of 1,823,508 and five digits a capacity of 67,469,796, belonging to the hundred-million level. Integer "capacity expansion" has reached an unprecedented level.
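The capacity figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that an affix numeral with n positions has 36 possible leading symbols (nine digits on four axes) and 37 possible symbols (those 36 plus zero) in every later position. The function names are illustrative and do not come from the original text. Under that assumption, the quoted 1,332, 1,823,508 and 67,469,796 come out as the counts of affix numerals with exactly two, four and five positions.

```python
# Assumption: 9 digits x 4 affix forms = 36 non-zero symbols, plus a single zero.
NONZERO_SYMBOLS = 9 * 4
ALL_SYMBOLS = NONZERO_SYMBOLS + 1  # the 37 affix numerals


def decimal_capacity(max_digits: int) -> int:
    """Number of positive decimal integers with at most max_digits digits."""
    return 10 ** max_digits - 1


def affix_numeral_count(digits: int) -> int:
    """Number of affix numerals with exactly `digits` positions and a non-zero lead."""
    return NONZERO_SYMBOLS * ALL_SYMBOLS ** (digits - 1)


if __name__ == "__main__":
    print("decimal:", decimal_capacity(4), decimal_capacity(5))
    for n in (1, 2, 4, 5):
        print("affix numerals with", n, "positions:", affix_numeral_count(n))
```

Note that the decimal figures are cumulative (all integers up to the given length), whereas the affix figures match the count for one exact length, so the two are not measured in quite the same way.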
Encoding Chinese characters with the 26 letters can be regarded as a "capacity expansion" of numerals on a single number axis: the lattice numerals grow from 9 to 25 and carrying changes from decimal to base 26. Many coding experts have tried this kind of expansion, however, and it cannot solve the duplicate-code problem of Chinese character encoding.
From the 37 affix numerals this scheme selects [Figure GFW00000053702900105] as the codes of components, 27 in all, giving a much larger capacity than codes made from the 26 letters.
(2) level of numeral
Chinese characters have five kinds of first stroke. If these are distributed over 25 letters, as in "the Five-stroke Method", determining the first-stroke code of a component is at best a one-in-five choice. If the first strokes are distributed over 9 digits, it is at worst a one-in-two choice. A one-in-five choice is correct with probability 20%, a one-in-two choice with probability 50%, and when there is only one choice the probability is 100%.
In classifying components, this scheme takes the first stroke and the first-stroke continuation feature as levels. So that every component corresponds to its numeral by a single choice, the numerals are likewise divided into two levels, without affix and with affix:
The first level: 0123456789.
The second level:
(3) code element of code
The code elements of this scheme, shown in Fig. 2, the "Chinese character component primary code and auxiliary code table", are 27 affix numerals chosen from the second-level numerals:
Figure GFW00000053702900107
Codes and encodings formed from the 26 letters are integer points in an interval on a letter number axis. The codes and encodings formed from the code elements of this scheme are points, lines, surfaces and solids in a four-dimensional coordinate space. This is an important technical measure by which this scheme brings the duplicate-code rate to 0%. The code elements, lattice numerals on the four axes of a four-dimensional coordinate system distinguished by affixes, are the feature that sets this scheme apart from all other coding schemes.
Five, parts are corresponding with code
The feature that this programme parts are corresponding with code is
1, it is corresponding with code that parts form component family
Parts are divided into 27Ge family (seeing Fig. 2) by the first stroke of a Chinese character, the continuous feature of the first stroke of a Chinese character, three levels of first stroke of a Chinese character parts.They are left-handed watch eyebrows:
Figure GFW00000053702900108
four kinds of numerals that the first stroke of a Chinese character is corresponding are two discriminations " anyhow to skim folding ", as " horizontal one or two ", are subdivided into the continuous feature " horizontal stroke is not lifted one's head " of the first stroke of a Chinese character
Figure GFW00000053702900111
the first stroke of a Chinese character continues a feature " horizontal lifting one's head "
Figure GFW00000053702900112
two discriminations of having eliminated numeral are corresponding.Wherein
Figure GFW00000053702900113
be subdivided into again horizontal stroke from " 1 " family, anyhow connect " 1" family, horizontal slash connects
Figure GFW00000053702900114
family.Guaranteed that each family component is single corresponding with second layer numeral.As shown in Figure 2, horizontal stroke from the parts of " 1 " family is: " one
Figure GFW00000053702900115
show
Figure GFW00000053702900116
two lose kidney bean more
Figure GFW00000053702900117
wang Yu can separate dry horse
Figure GFW00000053702900119
rain Two Seoul
Figure GFW000000537029001110
ya Xi tenth of the twelve Earthly Branches two ".Anyhow connect " 1" family is: " the positive Gong Ding of work ㄒ Yu
Figure GFW000000537029001112
ear
Figure GFW000000537029001113
the drooping Contraband of the Long Zhang Contraband of hideing
Figure GFW000000537029001114
minister
Figure GFW000000537029001116
ratio
Figure GFW000000537029001117
" etc.
The components of each family are further arranged into associative structures (see Fig. 2) by similarity of shape. In the horizontal-detached "1" family, for example, "two lose cloud" and "Wang Yu [Figure GFW000000537029001118]" are associative structures. Associative structures serve two purposes. The direct purpose is to make the component codes easier to memorize. The indirect purpose is to enlarge the number of components that a family can hold.
A component family in this scheme holds at most 27 components, and each component has its own single code. For example, the code of "gold" (a demoted full character) is [Figure GFW000000537029001119], the code of "Jin" (a left/right radical) is [Figure GFW000000537029001120], the code of "Jin" (a left/right radical) is [Figure GFW000000537029001121], the code of [glyph missing] (an abbreviated-character member) is "67", the code of "sunset" is [Figure GFW000000537029001123], and so on (see Fig. 2). The distinction between one component's code and another's is much finer than that of a radical index and is accurate to every small difference taught in the classroom. This is one feature of this scheme.
Each component has its own single code. The first benefit is that the correspondence between component and code is single-valued and reversible: given the component "gold", the code is determined to be [Figure GFW000000537029001124]; given the code [Figure GFW000000537029001125], the component is determined to be "gold". The second benefit is that the details of encoding are confined to their own level, "character" or "component", comparable to spelling out phonemes in alphabetic writing. This is also a feature of this scheme.
In the radical table of "the Five-stroke Method", a single key position can hold more than ten radicals. For example, the radicals represented by key Q are gold, Jin, [Figure GFW000000537029001126], bao, sunset, [Figure GFW000000537029001127] and others. Which of them does Q actually represent? That cannot be determined, so ambiguity arises from the very start. The radicals of "the Five-stroke Method" have no definite code, only a definite key position. Characters are therefore keyed by radical, not encoded by radical. Although "the Five-stroke Method" also has encodings, technically the "keystroke code" of the radical, the "font identification code" of the character and the "end identification code" of the character are all lumped together. The encoding process does not distinguish the character level from the radical level: sometimes the basis is supplied by the radical and sometimes by the character, which inevitably makes learning harder.
2, code divide primary key and auxiliary code corresponding with parts
Codes are divided by function into primary codes and auxiliary codes, which together form the multiplexed code of a component (see Fig. 2). The left table header [Figure GFW000000537029001128] gives the primary codes; the top table header [Figure GFW000000537029001129] [Figure GFW000000537029001130] gives the auxiliary codes. A multiplexed code is formed in the same way as a table is read. Take the component "fourth" in the table body of the horizontal-vertical-joined "1" family as an example: first take "1" on the left, then take "2" above, and together they give the multiplexed code "12" of "fourth". The rest follow by analogy. The multiplexed codes of all horizontal-first-stroke family components can be read off the table in this way.
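Reading a multiplexed code off the table amounts to a row-and-column lookup, and the reverse lookup recovers the component. The following is a minimal sketch using only the three components whose codes are quoted in this text ("fourth" 12, "minister" 18, "rain" 17). The table excerpt, the function names, and the way the affixed primary code is written ("1a") are all assumptions made for illustration, because the affix marks themselves are lost in the extracted text.

```python
# Hypothetical excerpt of the component table: row = primary code, column = auxiliary code.
# "1" stands for the horizontal-detached family and "1a" for the horizontal-vertical-joined
# family; the suffix "a" is a placeholder for an affix mark that is not legible here.
TABLE = {
    "1": {"7": "rain"},
    "1a": {"2": "fourth", "8": "minister"},
}


def multiplexed_code(component: str) -> tuple[str, str]:
    """Return (primary, auxiliary): take the row header first, then the column header."""
    for primary, row in TABLE.items():
        for auxiliary, name in row.items():
            if name == component:
                return primary, auxiliary
    raise KeyError(component)


def component_at(primary: str, auxiliary: str) -> str:
    """Reverse lookup: a multiplexed code identifies exactly one component."""
    return TABLE[primary][auxiliary]


if __name__ == "__main__":
    code = multiplexed_code("fourth")
    print(code, component_at(*code))
```

Because every table cell holds at most one component, the mapping works in both directions, which is the reversibility the text claims for multiplexed codes.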
The primary code of a multiplexed code represents the "component family" and is used to encode the character; the auxiliary code represents the "component order" and is used to eliminate duplicate codes between characters. Because the auxiliary code can be taken from more than one component, the duplicate-elimination mechanism has several choices available at the component level, which greatly improves its efficiency and ensures that no duplicate codes arise between characters. Specifically, because a multiplexed code is a point in the mathematical plane described above, the code of a character formed from two components is never duplicated (the character encoding method is explained below). For a character formed from three components there are three choices of supplementary auxiliary code. For example, "mist" is split into "rain Yan how", and the "CJK unified ideograph" set also contains a second character (shown as "?" in this text) that is split into "rain speech how". If both supplemented the auxiliary code of the first component, a duplicate code would result. The second character supplements the auxiliary code of the first component "rain" and is encoded "1747". "Mist" instead supplements the auxiliary code of the second component and is encoded [Figure GFW00000053702900121]. If a duplicate still arose, the auxiliary code of the third component could be supplemented instead, leaving room for choice.
Multiplexed code makes the conversion between parts and code have uniqueness and reversibility.
Multiplexed code is when parts upgrade to into word level, and code also upgrades to coding level (" the Five-stroke Method " be connect strike four times key) thereupon, be formed into word level " coded word " (see that coding numerical value character library horizontal stroke is from " 1 " family:
Figure GFW00000053702900122
as 1 one, 100,1 1show, suan [2 show] etc.).
Precisely because this scheme offers several choices of supplementary auxiliary code, it readily achieves a duplicate-code rate of 0% for the "CJK unified ideograph" set.
3, code symbol is corresponding with key mapping
(1) desk-top QWERTY keyboard:
Right-hand operated district key mapping: N---1, M--- 1,
Figure GFW00000053702900124
h---2, J--- 2,
Figure GFW00000053702900125
Figure GFW00000053702900126
l---[< ,], Y---3, U--- 3,
Figure GFW00000053702900127
o---4, P--- 4,
Figure GFW00000053702900128
Figure GFW00000053702900129
Left-handed operation district key mapping: B---5, G--- 5,
Figure GFW000000537029001210
v---6, F--- 6,
Figure GFW000000537029001211
c---7, D--- 7,
Figure GFW000000537029001212
x---8, S--- 8,
Figure GFW000000537029001213
z---9, A--- 9,
Figure GFW000000537029001214
Figure GFW000000537029001215
Upper row keyboard:
Figure GFW000000537029001216
Figure GFW000000537029001217
Inputting Chinese characters by component encoding uses 27 keys in all. Only one of them, the [Figure GFW000000537029001218] key, lies in the upper row; the rest are ordinary typing keys. When Chinese characters are entered by initial-final double spelling plus the first-stroke code of the first-stroke component (or additionally of the first continuation component), [Figure GFW000000537029001219] can all be used.
(2) hand-held QWERTY keyboard
On a hand-held keyboard, affix numerals such as [Figure GFW000000537029001220] word [Http] and [Figure GFW000000537029001221] type [a European-allies Dao soil] can be converted to "7∨882 word [Http]" and "12∧32 type [a European-allies Dao soil]", so that digits and affixes are entered separately. Because the digits of a code are entered consecutively whereas affix symbols are not, the three affix keys can share keys with function keys. Input on a hand-held keyboard takes more keystrokes, but it need not rely on the screen.
Six, individual character is by part codes coding
To encode characters by component codes, this scheme needs only one table and one rule. The table is the "component primary code and auxiliary code table", which has to be memorized. The rule is "arrange the primary codes, supplement an auxiliary code".
Component codes are the direct and only basis of a character's code. The character code becomes the code symbol that links the second stage of Chinese character information processing, the "transmission symbol" stage, directly to binary symbols, and may become the only form of Chinese character code in the future.
The specific method by which this scheme encodes characters from component codes is as follows:
1, a parts word
A one-component character is a character that appears in the "component primary code and auxiliary code table" itself, also called a table character or component character. Its encoding is simply its multiplexed code in the table. For example, the code of the component "minister" (see Fig. 2) is "18", and the encoding of the character "minister" is also "18". This is fixed by the table, and the mapping between character and code is reversible.
2, two parts words
" establish " and be split as " Yan an ancient weapon made of bamboo " two parts." establishing " is exactly two parts words.The code (seeing Fig. 2) of " establishing " first parts " Yan " is the code of second parts " an ancient weapon made of bamboo " is
Figure GFW00000053702900132
The symbols above are the primary codes; arranging the primary codes gives [Figure GFW00000053702900133]. The symbols below are the auxiliary codes; supplementing both auxiliary codes gives four codes. The encoding of "establish" is therefore [Figure GFW00000053702900134]. Encoding a two-component character is equivalent to fixing a line segment by two points in a plane: it is fully determined, so there can be no duplicate code. Two-component characters, like the three-, four- and five-component characters below, are coded characters.
3, three parts words
" volume " is split as " Si family Books " three parts." volume " is exactly three parts words.The code (seeing Fig. 2) of first parts " Si " of " volume " is
Figure GFW00000053702900135
the code of second parts " family " be " 71", the code of the 3rd parts " Books " is arrange primary key and be " 974 ".Supplement auxiliary code, generally supplement the auxiliary code of first parts.The coding of " volume " be exactly "
Figure GFW00000053702900137
"Heliopolis" is split into three components, "Si", a second component whose glyph is not rendered here, and "second". The code of the first component "Si" (see Fig. 2) is "92", the code of the second component is "54", and the code of the third component "second" is "85". Arranging the primary codes gives "958". If the auxiliary code "2" of the first component were supplemented, the encoding would be "9582", which duplicates the encoding "9582" of "Ting". The encoding of "Heliopolis" can therefore only supplement the auxiliary code "4" of the second component, giving "9584 Heliopolis [Si (glyph) *second]". The "*" marks an auxiliary code supplemented by exception to the general rule.
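The three-component rule can be sketched as a short search over the three candidate auxiliary codes. This is only an illustration under stated assumptions: each component is given as a (primary, auxiliary) pair of plain digits (the affix marks are not legible in this text), the set of already assigned encodings is passed in explicitly, and the function name is hypothetical.

```python
def encode_three_component(components, taken):
    """Arrange the three primary codes, then supplement one auxiliary code.

    components: three (primary, auxiliary) pairs in writing order.
    taken: set of encodings already assigned to other characters.
    The auxiliary code of the first component is tried first, then the
    second, then the third, until the result is not a duplicate.
    """
    if len(components) != 3:
        raise ValueError("expected exactly three components")
    primaries = "".join(primary for primary, _ in components)
    for _, auxiliary in components:
        candidate = primaries + auxiliary
        if candidate not in taken:
            return candidate
    raise ValueError("all three auxiliary codes lead to duplicates")


if __name__ == "__main__":
    # Codes 92, 54, 85 as quoted in the text; "9582" is already used by "Ting".
    heliopolis = [("9", "2"), ("5", "4"), ("8", "5")]
    print(encode_three_component(heliopolis, taken={"9582"}))
```

With "9582" already taken, the search falls through to the second component's auxiliary code and yields "9584", matching the encoding given in the text.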
4, four parts words
" defeated " is split as " car Ji
Figure GFW00000053702900138
dao" four components, so "defeated" is a four-component character. The "CJK unified ideograph" set also contains the character "Lose", split into the four components "Trucks Ji dao". The two differ as simplified and traditional forms. A computer can convert between simplified and traditional forms and let the two share one encoding, but that is a secondary operation. Sorting within a character library allows only a one-pass operation, so every character must have its own encoding as its position. Computer encoding of Chinese characters is thus not limited to keyboard input. A four-component character generally arranges only its primary codes, giving a four-code encoding. If a duplicate code arises, the auxiliary code of a component is supplemented to eliminate it, generally the auxiliary code of the first component, giving a five-code encoding. The specific encodings of "defeated" and "Lose" are as follows:
" defeated ", the code (seeing Fig. 2) of parts " car " is " 2 2", the code of " Ji " is
Figure GFW000000537029001310
"
Figure GFW000000537029001311
" code be " 43 ", the code of " Dao " is " 32 ".Arrangement primary key is because of " car " in parts primary key auxiliary code table, come " Trucks " after, get and supplement auxiliary code for coding.So have "
Figure GFW000000537029001313
defeated [car Ji ] Dao ".
" Lose ", the code (seeing Fig. 2) of parts " Trucks " is " 22 ", the code of " Ji " is "
Figure GFW000000537029001316
" code be " 43 ", the code of " Dao " is that " 32 " are arranged primary key and are
Figure GFW000000537029001317
so have "
Figure GFW000000537029001318
defeated [Trucks Ji ] Dao ".
It follows that fixing the encoding of every character at exactly four codes is unrealistic. This scheme allows one to five codes per character.
5, five parts words
In "crow is covered boundless and walks small muddy pill" there is the character "boundless", which is split into the five components "stone Lv Rui Pu cun". The code of "stone" (see Fig. 2) is [Figure GFW00000053702900141], the code of "Lv" is [Figure GFW00000053702900142], the code of "Rui" is [Figure GFW00000053702900143], the code of "just" is "24", and the code of "very little" is "29". A five-component character arranges only its primary codes. Within the "CJK unified ideograph" set, all duplicate codes among five-component characters were resolved in a unified way when the "component primary code and auxiliary code table" was designed, through the positions assigned to components. Arranging the primary codes of "boundless" gives [Figure GFW00000053702900144], so we have "[Figure GFW00000053702900145] boundless [stone Lv Rui Pu cun]".
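Sections 1 to 5 above give a default code layout for each component count. The following sketch collects those defaults in one function, before any duplicate-code adjustment is applied. It assumes each component is supplied as a (primary, auxiliary) pair of plain symbols; the function name and the sample two- and four-component inputs are hypothetical.

```python
def default_encoding(components):
    """Default layout of a character code by number of components.

    1 component : its multiplexed code (primary + auxiliary)
    2 components: both primary codes, then both auxiliary codes
    3 components: three primary codes, then the first auxiliary code
    4 components: four primary codes only
    5 components: five primary codes only
    """
    count = len(components)
    if not 1 <= count <= 5:
        raise ValueError("a character is split into one to five components")
    primaries = "".join(primary for primary, _ in components)
    if count <= 2:
        return primaries + "".join(auxiliary for _, auxiliary in components)
    if count == 3:
        return primaries + components[0][1]
    return primaries


if __name__ == "__main__":
    print(default_encoding([("1", "8")]))                          # "minister"
    print(default_encoding([("9", "2"), ("5", "4"), ("8", "5")]))  # codes 92, 54, 85
```

The one-component case returns the table code unchanged, so "minister" keeps its code "18", while the three-component input with codes 92, 54 and 85 yields "9582" by default.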
6, Characters with repeated components
"Ying" splits into the three components "2 Tony female"; it is a character in which the component "Tony" is repeated. The first component of "Ying", [component image], has the code [code image]; the second component, "Tony", has the code "49"; and the third component, "female", has the code "98". The arranged main codes are "1 49". Because "Tony" occurs twice, that is, it is repeated once, one dot is added above the "4", giving [code image]. The arranged main codes are then supplemented with an auxiliary code, normally that of the first component. The result is "[code image] Ying [[component image] 2 Tony female]".
"Ling" splits into the three components "rain 3 mouth female". The code of "rain" is "17", the code of "mouth" is "48", and the code of "female" is "98". Because "mouth" occurs three times, that is, it is repeated twice, two dots are added above the "4", giving [code image]. The arranged main codes are [code image], supplemented with an auxiliary code, normally that of the first component. The result is "[code image] Ling [rain 3 mouth female]".
Repeated components in an encoded character are written in the form shown above, for example "[code image] brother [2 can]" and "[3 horse]".
At most two dots are added to a component code. Where a component occurs four times in a character, the four-fold group was itself selected as a component of the table when the component main code and auxiliary code table was designed, for example [component image] [component image] [component image] li.
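The dot rule for repeated components can be illustrated with a short Python sketch. It rests on assumptions not stated in the text: the repeated components are taken to be adjacent in the split, and a dotted code is represented simply as a pair of component name and dot count rather than as the actual affixed numeral, whose glyph survives only as an image.

```python
from typing import List, Tuple

MAX_DOTS = 2  # the text states that at most two dots are added


def collapse_repeats(components: List[str]) -> List[Tuple[str, int]]:
    """Collapse runs of an identical component into (component, dots)."""
    result: List[Tuple[str, int]] = []
    for name in components:
        if result and result[-1][0] == name:
            prev_name, dots = result[-1]
            if dots == MAX_DOTS:
                # A four-fold repeat is a table component of its own in the
                # text, so it should not reach this function as four items.
                raise ValueError(f"{name!r} repeats more than three times")
            result[-1] = (prev_name, dots + 1)
        else:
            result.append((name, 0))
    return result


print(collapse_repeats(["rain", "mouth", "mouth", "mouth", "female"]))
# prints [('rain', 0), ('mouth', 2), ('female', 0)]
```

A component that occurs twice gets one dot and one that occurs three times gets two, matching the "Ying" and "Ling" examples; a fourth occurrence is rejected because the text treats four-fold groups as table components in their own right.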
Some characters consist entirely of one component repeated four times, such as "(zhàn)", radical character No. 148 of < < origin of Chinese character > >. The CJK Unified Ideographs set contains no character that uses [component image] as a component, whereas < < Chinese big dictionary > > contains the character [glyph image] (zhàn), which does. < < says civil > > glosses it as: "[glyph image] (clothing also for queen consort) Dan Silk-gauze in ancient times (thin thin,tough silk). From clothing, [glyph image] sound". This scheme encodes the character as [code image] [glyph image] △ [2 work 2 works]. Family "1" of the component main code and auxiliary code table nevertheless leaves room for the component [component image], allowing for future expansion of the character set.
In this scheme each single character is encoded from the codes of its components. Every character that is split into components can be reassembled from them, a reassembly rate of 100 percent, and the duplicate-code rate among single-character codes is 0 percent. Sorting the characters by code value naturally yields the CJK Unified Ideographs coding numerical value character library, ordered in the same way as characters arranged by the Chinese phonetic alphabet. Sorting the characters by starting-stroke component naturally yields the CJK Unified Ideographs starting-stroke component character library, ordered in the same way as characters arranged by radical and stroke. The first and last parts of the coding numerical value character library (coded library for short) and of the starting-stroke component character library (starting-stroke library for short), excerpted from the manuscript of the CJK Unified Ideographs font operating code, are reproduced on the following pages so that the actual encoding of the CJK Unified Ideographs set can be checked against the component main code and auxiliary code table of Fig. 2. In these libraries, a character marked "△" at its upper right is a radical character of < < origin of Chinese character > >.
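The two properties claimed here, no duplicate codes and a library obtained by sorting on code value, can be checked mechanically. The Python sketch below is illustrative only: the sample codes are hypothetical, and plain string comparison stands in for the ordering of the affixed numerals, which this text does not specify.

```python
from collections import Counter
from typing import Dict, List


def build_code_library(codes: Dict[str, str]) -> List[str]:
    """Check that no two characters share a code, then sort by code."""
    counts = Counter(codes.values())
    duplicates = sorted(code for code, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate codes: {duplicates}")
    return sorted(codes, key=lambda char: codes[char])


# Hypothetical codes, for illustration only.
sample = {"defeated": "23433", "Lose": "2343", "boundless": "11224"}
print(build_code_library(sample))
# prints ['boundless', 'Lose', 'defeated']
```

Because every code is unique, the sort order is fully determined by the codes themselves, which is what lets a single pass over the codes produce the library.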
China, Japan and Korea (CJK) Unified Chinese Characters: coding numerical value character library
1 Characters starting with a horizontal stroke
[Table images not reproduced: opening excerpt of the coding numerical value character library]
9 Characters starting with a folding stroke [99]
[Table images not reproduced: closing excerpt of the coding numerical value character library]
China, Japan and Korea (CJK) Unified Chinese Characters: starting-stroke component character library
1 Characters starting with a horizontal stroke
[Table images not reproduced: opening excerpt of the starting-stroke component character library]
9 Characters starting with a folding stroke [99]
[Table images not reproduced: closing excerpt of the starting-stroke component character library]

Claims (5)

1. A Chinese character encoding computer keyboard and input method, being a specific embodiment in which the keyboard input technology of patent No. ZL94111115.6, "Chinese character natural component coding", is further unified with Chinese character ordering, Chinese character detection, Chinese character teaching and Chinese character informatization, wherein the CJK Unified Ideographs font operating code takes the 20902 characters of the international standard CJK Unified Ideographs as its coding objects; uses 638 components as the internal units for splitting and reassembling single characters; and, out of the 37 affixed numerals
"0 1 1 [image] 2 2 [image] 3 3 [image] 4 4 [image] 5 5 [image] [image] 6 6 [image] 7 7 [image] 8 8 [image] 9 9 [image]",
uses the 27 numerals
"1 1 2 2 [image] 3 3 [image] 4 4 [image] 5 5 [image] 6 6 [image] 7 7 [image] 8 8 [image] 9 9"
as the main codes and auxiliary codes of the components; a two-dimensional table with the character of a rectangular coordinate system puts the components in correspondence with main codes and auxiliary codes, forming the composite code of each Chinese character component; single characters are encoded by the rule of arranging the main codes of the component composite codes and supplementing auxiliary codes, forming a linear operating system for Chinese characters; because this linear operating system offers multiple choices of supplementary auxiliary code, the duplicate-code rate for the CJK Unified Ideographs is 0 percent; the code symbols of this linear operating system correspond to key positions as follows:
(1) desktop QWERTY keyboard:
right-hand zone: N---1, M--- 1, [image]; H---2, J--- 2, [image]; L---[< ,]; Y---3, U---3, [image]; O---4, P--- 4, [image];
left-hand zone: B---5, G--- 5, [image]; V---6, F--- 6, [image]; C---7, D--- 7, [image]; X---8, S--- 8, [image]; Z---9, A--- 9, [image];
upper row: [image];
Chinese characters are input by component, following the codes, using 27 keys in all, of which the [image] keys are on the upper row and the rest are ordinary typing keys, the L key being converted into a symbol input key;
(2) handheld QWERTY keyboard: numerals and affixes are input separately;
neither the desktop QWERTY keyboard nor the handheld QWERTY keyboard needs to work in conjunction with the screen;
characterized in that the components are grouped into component families by starting stroke, by the feature of the stroke that follows the starting stroke, and by starting-stroke component, and the families are assigned to the 27 main codes derived from the nine numerals, so that Chinese characters and their components follow the starting-stroke rule "horizontal one and two, vertical three and four; left-falling start five and six, dot start seven; folding strokes eight and nine, the numbers alternating; the following stroke is also judged as connected, crossing or separate"; the code symbols of the code system are grid numerals in which the four number axes of a four-dimensional coordinate system are distinguished by affixes; the main code of a composite code represents the "component family", and single characters are encoded by component; the auxiliary code of a composite code represents the "component order" and eliminates duplicates among single-character codes, forming at the component level a duplicate-elimination mechanism with multiple choices, which guarantees that no duplicate codes arise between single-character codes; the conversion between components and codes is unique and reversible; the component code is the sole basis of the single-character code, and the single-character code is the sole basis of desktop QWERTY keyboard input; the single-character codes, being free of duplicates, naturally form two character libraries, as shown in the tables below:
(1) coding numerical value character library
1 Characters starting with a horizontal stroke
[Table images not reproduced]
(2) starting-stroke component character library
1 Characters starting with a horizontal stroke
[Table images not reproduced]
2. The Chinese character encoding computer keyboard and input method according to claim 1, characterized in that the components are selected from demoted whole characters, left and right radicals, character tops and bottoms, abbreviated members within characters, and auxiliary strokes, which constitute the five identities of a component, and each component passes three checks: rationale, reassembly and encoding.
3. The Chinese character encoding computer keyboard and input method according to claim 1, characterized in that the code symbols follow the four-tone pattern of Chinese pronunciation and are thereby suited to Chinese characters.
4. The Chinese character encoding computer keyboard and input method according to claim 1, characterized in that sorting single characters by starting-stroke component naturally forms the CJK Unified Ideographs starting-stroke component character library, ordered in the same way as characters arranged by radical and stroke, together with the input key codes for a standard desktop keyboard, as shown in the table below:
5. The Chinese character encoding computer keyboard and input method according to claim 1, characterized in that sorting single characters by code value naturally forms the CJK Unified Ideographs coding numerical value character library, ordered in the same way as characters arranged by the Chinese phonetic alphabet, together with the key codes for input on a standard desktop keyboard, as shown in the table below:
[Table image not reproduced]
CN200810212411.1A 2008-08-18 2008-08-18 United Chinese characters font commandcode for Chinese, Japanese and Korean Expired - Fee Related CN101339466B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810212411.1A CN101339466B (en) 2008-08-18 2008-08-18 United Chinese characters font commandcode for Chinese, Japanese and Korean

Publications (2)

Publication Number Publication Date
CN101339466A CN101339466A (en) 2009-01-07
CN101339466B true CN101339466B (en) 2014-02-26

Family

ID=40213545

Country Status (1)

Country Link
CN (1) CN101339466B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1124366A (en) * 1994-08-08 1996-06-12 曹述交 Chinese character natural component coding
CN1337613A (en) * 2001-06-25 2002-02-27 曹述交 Chinese character part digital code

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140226

Termination date: 20140818

EXPY Termination of patent right or utility model