CN1172983A - Phonetic Chinese word encoding and its keyboard - Google Patents

Phonetic Chinese word encoding and its keyboard

Info

Publication number
CN1172983A
Authority
CN
China
Prior art keywords
chinese
speech
replaces
joint
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 97113313
Other languages
Chinese (zh)
Other versions
CN1109283C (en)
Inventor
赵延胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN 96107547 (CN1142077A)
Application filed by Individual
Priority to CN97113313A (CN1109283C)
Publication of CN1172983A
Application granted
Publication of CN1109283C
Anticipated expiration
Legal status: Expired - Fee Related (current)

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

A phonetic Chinese phrase encoding method and its keyboard are disclosed. Based on word processing, the invention creates two new Chinese-character encoding units, the "Chinese phrase" and the "sentence reading", and provides two new encoding forms, the "phonetic Chinese phrase" and the "sentence language". Their mathematical methods can be used in speech information processing and Chinese-character information processing. The advantages are the absence of duplicate codes and good readability.

Description

Chinese word phonetic-alphabet code and keyboard thereof
The invention belongs to the field of Chinese character information processing. It is mainly used for the encoding, input, and output of Chinese characters and for the computer processing of Chinese vocabulary and sentences.
Chinese character keyboard input methods fall into four broad classes according to the character attribute on which the encoding is based: shape codes, sound codes, shape-sound codes, and sound-shape codes. Each of these methods has its strengths and weaknesses and addresses a different set of problems; they have been described at length elsewhere and are not discussed further here. They share three shortcomings. First, duplicate codes are generally resolved by manual character selection, which is inconvenient for users. Second, Chinese character codes cannot be entered into a computer as easily as English words, which hinders the spread of computers. Third, none of these methods helps to solve the various application problems of Chinese character information processing. For example, the well-known "natural code input method" invented by Mr. Zhou Zhinong has the following main drawbacks: its pinyin encoding relies on manual selection to resolve duplicate codes, is not as convenient as typing English words, and leaves the Chinese word segmentation problem unsolved; its shape-meaning encoding does not provide a good environment for solving the application problems of Chinese character information processing.
The purpose of this invention is to provide a Chinese character encoding (keyboard) input method that supports the solution of the application problems of Chinese character information processing, is based on word processing and sentence processing, has no duplicate codes, and can be read aloud. To this end, the invention provides a new Chinese character encoding unit, the "Chinese speech", with which Chinese text can be segmented exhaustively during Chinese character information processing and encoding. It provides a new Chinese character encoding form, the "Zhao's speech", that is, the "Chinese word phonetic-alphabet"; Zhao's speech is written word by word with spaces between words, can be read aloud, needs no manual character selection, and, under input conditions much like those of English words, leaves Chinese character encoding without a single duplicate code. It provides a keyboard suited to Zhao's speech brevity code input, so as to increase input speed. The Zhao's speech all-key code uses the standard keyboard.
To achieve this purpose, the invention provides a Chinese word phonetic-alphabet encoding method and a keyboard suited to it, characterized by the following:
1. A Chinese character input keyboard dedicated to Chinese word phonetic-alphabet code, characterized in that the code uses two keyboards, an "all-key" keyboard and a "brevity code" keyboard; no figure is given for the all-key keyboard, and the brevity code keyboard is shown in Fig. 7;
Both the all-key code and the brevity code use the standard keyboard. The 26 initials, 38 finals, and 26 joint-tone letters of Chinese word phonetic-alphabet code, together with the corresponding 22 initials, 38 finals, and 4 tone marks of Hanyu Pinyin, are assigned to the 26 English letter keys of the QWERTY keyboard. In what follows, "replaced" always means that a pinyin letter is replaced by an English letter on the QWERTY keyboard;
The initials of the all-key code and the brevity code are identical: zh is replaced by y, ch by w, and sh by v. Compared with pinyin, five "silent" initials a, i, e, o, u are added, written with the same English letters; all other initials also keep the same English letters (Fig. 2);
The finals of the all-key code all consist of two English letters: er → eh, a → al, o → oj, e → ef, ai → ak, ei → ec, ao → ag, ou → od, an → am, en → en (unchanged), ang → at, eng → eb, ong → oy, i → ih, ia → il, ie → if, iao → ig, iou → id, ian → im, in → in (unchanged), iang → it, ing → ib, iong → iy, u → uh, ua → ul, uo → uj, uai → uk, uei → uc, uan → um, uen → un, uang → ut, ueng → ub, ü → oh, üe → of, üan → om, ün → on. The pinyin final ê is merged into the final ei and replaced by ec; the pinyin initial ng is used as a final and replaced by ob; and, compared with pinyin, a "silent" final ot is added, written with the same English letters (Fig. 3);
The finals of the brevity code all consist of one English letter: er, ia, and the silent final ot of the all-key code → Q; iou → W; e → E; üan and uan → R; üe and uei → T; ian → Y; u → U; i → I; o and uo → O; ün and uen → P; a → A; iong and ong → S; iang and uang → D; en → F; eng and ueng → G; ang → H; an → J; ao → K; ai → L; ei and ê → Z; ie → X; ü and ua → C; iao → V; ou → B; in and ng → N; ing and uai → M (Fig. 3);
The joint-tone letters of the all-key code and the brevity code are identical: the high-level tone is replaced by s, t, u, v, w, x, z; the rising tone by m, n, o, p, q, r, z; the falling-rising tone by g, h, i, j, k, l, y; and the falling tone by a, b, c, d, e, f, y (Fig. 1).
2. the method for Chinese character coding of a Chinese word phonetic-alphabet, it is characterized in that, with Chinese speech is encode Chinese characters for computer unit, with Chinese word phonetic-alphabet and phonetic sentence speech is the encode Chinese characters for computer form, encode one to one with Chinese speech and Chinese word phonetic-alphabet, with sentence make peace Chinese word phonetic-alphabet serve as the input unit, be the Chinese character meaning and pronunciation coding method of output unit with sentences and phrases and Chinese speech, content comprises:
1) Chinese speech and coding sentences and phrases are the Chinese character encoding units. An encoding unit composed of one Chinese character or of two Chinese characters is called a Chinese-character Chinese speech. A Chinese speech of one character is called a "Chinese word character" (or "Chinese word character Chinese speech"); a Chinese speech of two characters is called a "two Chinese character" (or "two Chinese character Chinese speech"); when no distinction is needed, both are called "Chinese speech". The mathematical definition of Chinese speech is $c^2 + c^1$, where $c = 0, 1, 2, 3, \ldots$ is the number of distinct Chinese characters, $c^1$ is the number of Chinese word character Chinese speech, and $c^2$ is the number of two Chinese character Chinese speech. A Chinese speech has only one meaning, called its "generic meaning" ("synonymity" for short). The mathematical model of the generic meaning of Chinese speech is $H_1 = \log_2(c^2 + c^1)$, $c > 0$, where $H_1$ is the average information of the generic meaning, in bits; $c$ is the number of distinct Chinese characters; $c^1$ is the number of generic meanings of Chinese word character Chinese speech; and $c^2$ is the number of generic meanings of two Chinese character Chinese speech. Chinese speech has a prescribed written form and meaning, and Chinese speech are separated by spaces. An encoding unit composed of two Chinese speech is called "coding sentences and phrases", or simply "sentences and phrases". It takes four forms: Chinese word character + Chinese word character, Chinese word character + two Chinese character, two Chinese character + Chinese word character, and two Chinese character + two Chinese character;
2) Chinese word phonetic-alphabet and phonetic sentence speech are the Chinese character encoding forms. Chinese word phonetic-alphabet code uses two encoding forms: the "all-key" code, also called "Zhao's speech all-key", and the "brevity code", also called "Zhao's speech brevity code";
The initials of the all-key code and the brevity code are identical, 26 in all: b, p, m, f, d, t, n, l, g, k, h, j, q, x, y, w, v, r, z, c, s, a, i, e, o, u (Fig. 2);
The all-key code has 38 finals: eh, al, oj, ef, ak, ec, ag, od, am, en, at, eb, oy, ih, il, if, ig, id, im, in, it, ib, iy, uh, ul, uj, uk, uc, um, un, ut, ub, oh, of, om, on, ot, ob (Fig. 3);
The brevity code has 26 finals: Q, W, E, R, T, Y, U, I, O, P, A, S, D, F, G, H, J, K, L, Z, X, C, V, B, N, M (Fig. 3);
The "joint-tone letters" of the all-key code and the brevity code, also called "joint tones", are identical, 26 in all; two of them (z and y) are each shared by two tones. The high-level joint tones are s, t, u, v, w, x, z; the rising joint tones are m, n, o, p, q, r, z; the falling-rising joint tones are g, h, i, j, k, l, y; and the falling joint tones are a, b, c, d, e, f, y (Fig. 1);
Using initials, finals, and joint tones, the all-key code and the brevity code each encode the roughly 1300 distinct toned syllables of Hanyu Pinyin into about 8580 distinct toned codes. These 8580 codes are called "Chinese word phonetic-alphabet". A Chinese word phonetic-alphabet of one sound joint is called a "monophone joint" (or "monophone joint Zhao's speech"); a Chinese word phonetic-alphabet of two sound joints is called an "alliteration joint" (or "alliteration joint Zhao's speech"); when no distinction is needed, both are called "Chinese word phonetic-alphabet" or "Zhao's speech". The mathematical definition of Zhao's speech is $a^2 + a^1$, where $a = 0, 1, 2, 3, \ldots$ is the number of distinct sound joints, $a^1$ is the number of monophone joint Zhao's speech, and $a^2$ is the number of alliteration joint Zhao's speech. A Zhao's speech has only one standard pronunciation, namely the standard Mandarin pronunciation. The mathematical model of the Mandarin pronunciation of Zhao's speech is $H_2 = \log_2(a^2 + a^1)$, $a > 0$, where $H_2$ is the average information of the Mandarin pronunciation, in bits; $a$ is the number of distinct sound joints; $a^1$ is the number of Mandarin pronunciations of monophone joint Zhao's speech; and $a^2$ is the number of Mandarin pronunciations of alliteration joint Zhao's speech. With 8580 sound joints, the total number of Zhao's speech is $7.362498 \times 10^7$, and the entropy of Zhao's speech pronunciation, i.e. the average information of the Mandarin pronunciation, is 26.134 bits. Zhao's speech are separated by spaces. A monophone joint consists of three parts: initial, final, joint tone. An alliteration joint consists of six parts: initial, final, joint tone, initial, final, joint tone. An encoding form composed of two Chinese word phonetic-alphabets is called "phonetic sentence speech", or simply "sentence speech". It takes four forms: monophone joint + monophone joint, monophone joint + alliteration joint, alliteration joint + monophone joint, and alliteration joint + alliteration joint;
3) There are three basic rules for the corresponding encoding of Chinese speech and Chinese word phonetic-alphabet: a two Chinese character is fixed to one alliteration joint code; a Chinese word character is fixed to one monophone joint code; and a Chinese word character is fixed to one alliteration joint code. There is one auxiliary rule, which governs the ordering of Chinese characters against joint-tone letters. In what follows, "sequence number" always means the "joint-tone letter sequence number" of Fig. 1. The first ordering is the "character frequency ranking method": among Chinese characters with the same sound and the same tone, the characters are assigned in order of frequency of use to sequence numbers 1 through 6, six characters at a time, and the assignment is repeated until all such characters are placed. The second ordering is the "character meaning ranking method", also called the "pronunciation and meaning ranking method": among Chinese characters with the same sound and the same tone, the correspondence between character and joint-tone letter is fixed by one basic meaning of the character. The basic meanings of all Chinese characters are classified into two "generic meanings", "noun" and "verb", which are subdivided into six fixed meanings: concrete noun, abstract noun, time noun, action verb, stative verb, and process verb. Among characters with the same sound and the same tone, the characters are assigned by basic meaning to sequence numbers 1 through 6, six characters at a time, and the assignment is repeated until all such characters are placed (Fig. 1);
4) Sentence speech and Chinese word phonetic-alphabet are the input units. An input unit composed of two Chinese word phonetic-alphabets separated by one space is called "input sentence speech", or simply "sentence speech"; the space bar is pressed twice after a sentence speech. If a monophone joint is denoted by the digit "1" and an alliteration joint by the digit "2", the sentence speech has four combinations: "1+1", "1+2", "2+1", "2+2". When the input unit is a single Chinese word phonetic-alphabet, the space bar is pressed once after it;
5) Sentences and phrases and Chinese speech are the output units. An output unit composed of two Chinese speech separated by one space is called "output sentences and phrases", or simply "sentences and phrases"; a sentences and phrases is followed by two spaces. If a Chinese word character is denoted by the digit "1" and a two Chinese character by the digit "2", the sentences and phrases has four combinations: "1+1", "1+2", "2+1", "2+2". When the output unit is a single Chinese speech, it is followed by one space.
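The spacing convention of items 4) and 5) can be illustrated with a small sketch. It assumes all-key input, where a monophone joint is four letters and an alliteration joint is eight; the function names and the sample codes are hypothetical and serve only to show how the "1+1" to "2+2" patterns could be recovered from the spaces.

```python
def classify_unit(code: str) -> str:
    """Return "1" for a monophone joint (4 letters) or "2" for an
    alliteration joint (8 letters) in all-key form."""
    if len(code) == 4:
        return "1"
    if len(code) == 8:
        return "2"
    raise ValueError(f"not an all-key Zhao's speech: {code!r}")


def sentence_speech_patterns(text: str) -> list[str]:
    """Split input on double spaces (end of a sentence speech) and
    describe each two-unit sentence speech as "1+1", "1+2", "2+1" or "2+2"."""
    patterns = []
    for chunk in text.split("  "):      # two spaces close a sentence speech
        units = chunk.split()           # one space separates its two units
        if len(units) == 2:
            patterns.append("+".join(classify_unit(u) for u in units))
    return patterns


# Hypothetical input: two sentence speech, each followed by two spaces.
sample = "xinaxihs iiha  iiha xinaxihs  "
print(sentence_speech_patterns(sample))
```

A real input method would also have to handle a lone Zhao's speech followed by a single space; that case is left out of this sketch.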
With classification and qualification for professional technical use, the above method and keyboard are applicable to all large, medium, small, and micro computer Chinese character information processing systems, Chinese character teleprinters, Chinese character computer typewriters, Chinese character terminals, all kinds of electronic printing and typesetting systems, information retrieval and file management, office automation systems, expert systems, translation systems, Chinese speech recognition systems and Chinese character pattern recognition systems, Chinese character information communication systems, advertising systems, telephone directory systems, and public consultation service systems.
Chinese text is always composed of distinct Chinese characters. Taking the 6763 distinct Chinese characters of GB2312-80, a total of $4.5744932 \times 10^7$ distinct Chinese speech can be constructed, which form the unique set of Chinese speech. The entropy of each Chinese speech, i.e. its average information, is 25.447 bits, computed as follows:
When $c = 6763$,
$c^2 + c^1 = 6763^2 + 6763^1 = 4.5744932 \times 10^7$
$H_1 = \log_2(c^2 + c^1) = \log_2(4.5744932 \times 10^7) = 25.447$ (bits)
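The calculation above can be checked numerically. The following is a minimal sketch; the function names are illustrative and not taken from the patent.

```python
import math


def chinese_speech_count(c: int) -> int:
    """Number of one- and two-character units over c distinct characters: c^2 + c^1."""
    return c**2 + c


def average_information_bits(count: int) -> float:
    """Average information (entropy) of a uniform choice among `count` units, in bits."""
    return math.log2(count)


c = 6763                               # distinct characters in GB2312-80
total = chinese_speech_count(c)        # 45744932, i.e. 4.5744932e7
h1 = average_information_bits(total)
print(total, round(h1, 3))             # 45744932 25.447
```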
The number of possible Chinese speech is very large, but the number actually used in Modern Chinese is not. The Chinese speech actually used in Modern Chinese can be estimated by comparison with the number of Chinese grammatical words. Formally, every disyllabic grammatical word can be regarded as a two Chinese character and every monosyllabic word as a Chinese word character; words of three, four, five, or more syllables can be cut into two Chinese characters and Chinese word characters; two-character grammatical phrases are all two Chinese characters; and some Chinese speech have no counterpart among grammatical words (see the example sentences in the embodiment). The number of Chinese speech actually in use is considerably larger than the number of grammatical words. Based on the number of modern general grammatical words, the inventor estimates that Modern Chinese has about 60,000 general Chinese speech, covering 99% of Chinese text; of these, the roughly 12,000 most commonly used cover 95% of Chinese text.
One space between Chinese speech is sufficient. When encoding Chinese characters, the Chinese text is first cut into Chinese speech and then entered into the computer as Zhao's speech codes. The computer outputs Chinese speech in word-segmented writing; it can also output Chinese characters without word segmentation, but Chinese speech is preferable. Word-segmented writing will bring great convenience and benefit to the various application problems of Chinese character information processing, and its importance can hardly be overstated.
Explanation of the mathematical definition of Chinese speech. Chinese speech is a method of arranging distinct Chinese characters with repetition. Fig. 4 shows the arrangement with repetition of the three distinct characters "letter", "breath", and "opinion". From the formula for the number of arrangements with repetition, $m^n$, and the definition of addition, the total number of Chinese speech can be calculated; this total is the mathematical definition of Chinese speech. As the principle illustration of Fig. 4 shows, the three distinct characters "letter", "breath", and "opinion" yield 9 distinct two Chinese character Chinese speech and 3 distinct Chinese word character Chinese speech, 12 distinct Chinese speech in all. Modern Chinese actually uses 4 of them, namely "letter", "breath", "opinion", and "information"; the remaining 8 two Chinese character Chinese speech are held in reserve. The reason for keeping a reserve is simple: before "information theory" arose, nobody used the Chinese speech "information", and now it is used widely.
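The Fig. 4 construction can be reproduced by enumeration. This sketch assumes that the three characters rendered above as "letter", "breath", and "opinion" are 信, 息, and 论 (so that "information" is 信息); that back-translation is an assumption, and any three distinct characters give the same counts.

```python
from itertools import product

# Assumed originals of "letter", "breath", "opinion" (an assumption, see above).
chars = ["信", "息", "论"]

# One-character units: c^1 of them.
single = list(chars)

# Two-character units: arrangements with repetition, c^2 of them.
double = ["".join(pair) for pair in product(chars, repeat=2)]

all_units = single + double
print(len(single), len(double), len(all_units))   # 3 9 12
print(double)
```

Of the nine two-character units only 信息 is in actual use under this assumption; the other eight correspond to the reserve described in the text.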
The mathematical definition of Chinese speech lets both the computer and the ordinary user grasp Chinese speech as a whole and describe its features quantitatively, which is very useful for solving the problems of Chinese character information processing and encoding. As a further example, regard "unlatching of communication function and stop" as one sentence. It uses 10 distinct Chinese characters and 6 Chinese speech. Because the invention stipulates that a Chinese speech has only one meaning, its "generic meaning" ("synonymity" for short), the methods of information theory and the mathematical model of the generic meaning of Chinese speech give a mathematical model of the generic meaning of a sentence of Chinese speech: $H_3 = \log_2 (c^2 + c^1)^n$, $c \ge 1$, $1 \le n \le c$, where
$H_3$ is the average information of the generic meaning of the sentence, in bits;
$n$ is the number of Chinese speech used in the sentence;
the other symbols are as in the mathematical model of the generic meaning of Chinese speech.
The generic meaning of the sentence "unlatching of communication function and stop", i.e. the average information of its meaning, with $c = 10$ and $n = 6$, is $H_3 = \log_2 (c^2 + c^1)^n = \log_2 (10^2 + 10^1)^6 = 6 \times 6.781 = 40.686$ bits.
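The sentence-level model can be evaluated directly, since $\log_2 (c^2 + c^1)^n = n \log_2(c^2 + c^1)$. The sketch below uses an illustrative function name and enforces the stated constraints $c \ge 1$, $1 \le n \le c$.

```python
import math


def sentence_information_bits(c: int, n: int) -> float:
    """H_3 = log2((c^2 + c^1)^n) = n * log2(c^2 + c), in bits.

    c: number of distinct characters in the sentence
    n: number of Chinese speech units in the sentence
    """
    if c < 1 or not (1 <= n <= c):
        raise ValueError("requires c >= 1 and 1 <= n <= c")
    return n * math.log2(c**2 + c)


h3 = sentence_information_bits(c=10, n=6)
print(round(h3, 3))
```

Without intermediate rounding the result is about 40.688 bits; the figure of 40.686 in the text comes from rounding $\log_2 110$ to 6.781 before multiplying by 6.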
A similar scheme would be very difficult to devise for English words or for Chinese grammatical words. The mathematical definition of Chinese speech, the mathematical model of its generic meaning, and the mathematical model of the generic meaning of a sentence will provide a good working environment for third-generation Chinese character encoding input methods and for Chinese character information processing.
Explanation of the mathematical definition of Zhao's speech. The mathematical definition of Zhao's speech does not differ at all from that of Chinese speech; only the written form and the quantity differ. Zhao's speech uses sound joints and is a Chinese character encoding form based on pronunciation; Chinese speech uses Chinese characters and is a regular Chinese character encoding unit. In total number, Chinese speech far outnumber Zhao's speech. Because Zhao's speech can be read aloud, its pronunciation can be described quantitatively. The invention stipulates that a Zhao's speech has only one pronunciation and that different Zhao's speech have different pronunciations; even when different Zhao's speech sound the same, i.e. are homophones with different written forms, they count as different pronunciations. The information of Zhao's speech pronunciation and the information of generic meaning are computed in exactly the same way: if the number of distinct Chinese characters equals the number of distinct sound joints, the information is the same, which accords with common sense. The mathematical model of the Mandarin pronunciation of Zhao's speech, $H_2 = \log_2(a^2 + a^1)$, $a > 0$, will provide a method for speech input recognition and speech synthesis of Chinese characters. For a coded sentence composed of Zhao's speech codes, i.e. a Mandarin pronunciation sentence, the calculation is the same as for the generic meaning of a sentence of Chinese speech: replace $c$ in that model by $a$, let $n$ denote the number of Zhao's speech in the spoken sentence, and let $H_4$ denote the information of the Mandarin pronunciation sentence, that is, $H_4 = \log_2 (a^2 + a^1)^n$, $a \ge 1$, $1 \le n \le a$. The mathematical models for the generic meaning and the pronunciation of a sentence can be written in one unified form: $H = \log_2 (c^2 + c^1)^n$, $c > 0$, $0 < n \le c$.
Using sound joints does not change the pronunciation of Mandarin. The invention does not use neutral-tone syllables: every Chinese character read in the neutral tone is marked with its original tone, and if the original tone cannot be found in a small dictionary, the falling tone is used instead.
The coding principle of the sound joint. See the joint-tone alphabet of Fig. 1. Sequence number 1 consists of the joint-tone letters "s, m, g, a", which represent the four tones: high-level, rising, falling-rising, and falling. Combined with the initials and finals of the invention, the four joint-tone letters of sequence number 1 can encode 1300 distinct sound joints, which is equivalent to combining the four tone marks of Hanyu Pinyin with initials and finals to form 1300 distinct syllables. Applying the method of sequence number 1 repeatedly gives sequence numbers 2 through 7. Sequence numbers 1 through 6 can encode $6 \times 1300 = 7800$ distinct sound joints. Sequence number 7 is a special case: the single tone letter "z" represents both the high-level and the rising tone, and "y" represents both the falling-rising and the falling tone. The tone proportions of the GB2312-80 "primary characters" are approximately high-level 0.25, rising 0.23, falling-rising 0.17, and falling 0.35. Taking the larger value of each pair, high-level 0.25 and falling 0.35, gives $0.25 + 0.35 = 0.6$ and $1300 \times 0.6 = 780$, so the two joint-tone letters "z" and "y" can encode 780 distinct sound joints. Hence $7800 + 780 = 8580$, which is the origin of the 8580 distinct sound joints. By the definition of Zhao's speech, the monophone joint has 8580 distinct coding forms and the alliteration joint has $8580 \times 8580 = 7.36164 \times 10^7$; the total number of Zhao's speech coding forms is $8580 + 8580^2 = 7.362498 \times 10^7$. These $7.362498 \times 10^7$ Zhao's speech are the key technique that frees Chinese character encoding from duplicate codes. Because the total number of Zhao's speech exceeds seventy million, far more than is needed to solve the duplicate-code problem of Chinese character encoding, the invention stipulates that only the joint-tone letters of sequence numbers 1 through 6 in Fig. 1 are used; the joint-tone letters of sequence number 7 are held in reserve.
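The counts in this paragraph follow from a few lines of arithmetic. The sketch below reproduces them; the variable names are illustrative, and the tone proportions are the approximate figures quoted in the text.

```python
import math

BASE_SYLLABLES = 1300                        # distinct toned pinyin syllables (approx.)
TONE_SHARE = {"high-level": 0.25, "rising": 0.23,
              "falling-rising": 0.17, "falling": 0.35}

# Sequence numbers 1..6: one full set of four tone letters each.
regular = 6 * BASE_SYLLABLES                 # 7800

# Sequence number 7: "z" and "y" each cover two tones; the text counts the
# larger share of each pair (high-level for "z", falling for "y").
share = TONE_SHARE["high-level"] + TONE_SHARE["falling"]
extra = round(BASE_SYLLABLES * share)        # 780

sound_joints = regular + extra               # 8580
monophone = sound_joints                     # one sound joint
alliteration = sound_joints ** 2             # two sound joints
total = monophone + alliteration             # a^2 + a^1

print(sound_joints, alliteration, total)     # 8580 73616400 73624980
print(round(math.log2(total), 3))            # 26.134
```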
The sound joint uses 26 initials in all (see the initial table of Fig. 2). The five vowel initials "a, i, e, o, u" appear only in the initial position and have no pronunciation. Because the invention does not allow a sound joint without an initial, the solution is as follows: when a sound joint has only a final and no initial, the first letter of the final must be written once more, so that every sound joint has an initial. Since the first letter of a final is always a vowel, the invention adds only five silent vowel initials to the 21 initials of the Hanyu Pinyin initial table, and in use the initial table of the sound joint does not differ from that of Hanyu Pinyin.
The sound joint uses 38 finals in all (see the final table of Fig. 3). Compared with the final table of Hanyu Pinyin, and apart from the different spelling of most finals, there are four differences. First, the pinyin final table in ordinary dictionaries lists 35 finals and leaves the final er outside the table; the invention includes it in the table. Second, to keep initials and finals uniform, the pinyin initial ng, which is not listed in the initial table, is used by the invention as a final and listed in the final table; its pronunciation and function are unchanged. Third, the invention adds one silent final, which has a written form but no pronunciation. It serves as the final of Mandarin characters that have no final, such as "mouthful admire", "mouthful dance", and "", so that when any Chinese character in a Chinese text is encoded with the invention, its sound joint always consists of three parts, initial, final, and joint tone, without exception. Fourth, the invention merges the pinyin final "ê" into the final "ei".
The Zhao word full code (the "all-key" form) uses lowercase English letters: a single sound joint consists of four letters and a double sound joint of eight. The Zhao word short code (the "brevity code") uses uppercase English letters: a single sound joint consists of three letters and a double sound joint of six. The coding form of a Zhao word is therefore fixed. From the number of letters alone, a Zhao word cannot be confused with an English word or a word of another Western language, nor with a pinyin word, and the boundaries between sound joints cannot be mistaken either. Zhao words are best read in standard Mandarin, but they can also be read in non-standard Mandarin or in the user's own dialect. A Zhao word is a form of Chinese character encoding, not a pinyin word, so whether the pronunciation is standard does not affect normal use.
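The fixed letter counts described above can be illustrated with a minimal sketch. It assumes only what the text states (four or eight letters for the full code, three or six for the short code); the function name `split_zhao_word` is hypothetical, and the sample strings are codes that appear in Example 2 later in the description.

```python
def split_zhao_word(token: str) -> tuple[str, list[str]]:
    """Classify a code word by its letter count and split it into sound joints.

    Assumption taken from the text: full code = 4 letters per sound joint,
    short code = 3 letters per sound joint, one or two joints per word.
    """
    letters = token.strip()
    if not (letters.isascii() and letters.isalpha()):
        raise ValueError(f"not a plain letter string: {token!r}")
    # total length -> (code form, letters per sound joint)
    widths = {4: ("full", 4), 8: ("full", 4), 3: ("short", 3), 6: ("short", 3)}
    if len(letters) not in widths:
        raise ValueError(f"{token!r} has {len(letters)} letters; expected 3, 4, 6 or 8")
    form, width = widths[len(letters)]
    joints = [letters[i:i + width] for i in range(0, len(letters), width)]
    return form, joints


if __name__ == "__main__":
    print(split_zhao_word("vihdjimb"))  # full code, two sound joints
    print(split_zhao_word("GLX"))       # short code, one sound joint
```

Because the four lengths never coincide, the length alone decides both the code form and the joint boundaries, which is the property the paragraph relies on.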
As the codes for "meaning" in Fig. 5 and for "they" in Fig. 6 show, there are roughly 49 times as many double sound joints as pinyin disyllables, and roughly 7 times as many single sound joints as pinyin monosyllables. Take the largest group of two-character homophones in Chinese as an example: "meaning, contrary opinion, objection, discrepancy, free translation, radiating power and vitality, thriving, sparking". In pinyin all eight are written the same way, "yi yi", so there are eight duplicate codes. With the present invention they simply occupy eight different double sound joint codes, and there is no duplicate code. For two-character words in general, groups of more than six homophones are rare; the eight homophones of "meaning" above are one such case. For a group of two-character homophones to reach 36 is practically impossible, even counting ancient, modern and future usage, and reaching 49 is less possible still. Of course, the two-character words used for Chinese personal names, place names and scientific and technical vocabulary, and the two-character words that result when foreign personal names, place names and technical vocabulary are translated into Chinese, are special problems for Chinese words and Zhao words; the inventor will handle them separately according to users' requirements.
By extension, since two-character words encoded with double sound joints have no duplicate codes, the third basic encoding rule technically guarantees that the encoding of all Chinese characters is free of duplicate codes. Even if there are 100,000 distinct Chinese characters and all of them are encoded with double sound joints, they occupy only 100,000 different double sound joints, a tiny fraction of the more than seventy million available. Pinyin disyllables, by contrast, have $1300 \times 1300 = 1.69 \times 10^{6}$ different written forms, but unfortunately they have no ability to resolve homophone codes, that is, duplicate codes.
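The size argument above can be checked with a few lines of arithmetic. This is only a rough sketch: it takes the figure of 1300 pinyin syllables and the rounded ratios "about 7 times" and "about 49 times" from the text as given, and the function name `code_space` is hypothetical.

```python
def code_space(pinyin_syllables: int = 1300, single_ratio: int = 7) -> dict[str, float]:
    """Estimate code-space sizes from the rounded figures quoted in the text."""
    pinyin_double = pinyin_syllables ** 2             # pinyin disyllable forms
    zhao_single = pinyin_syllables * single_ratio     # about 7x the monosyllables
    zhao_double = pinyin_double * single_ratio ** 2   # about 49x the disyllables
    return {
        "pinyin_double": pinyin_double,
        "zhao_single": zhao_single,
        "zhao_double": zhao_double,
        # share of the double-joint space used by 100,000 characters
        "share_used": 100_000 / zhao_double,
    }


if __name__ == "__main__":
    for name, value in code_space().items():
        print(name, value)
```

With these rounded inputs the double sound joint space comes out in the tens of millions, the same order of magnitude as the figure quoted in the text, and 100,000 characters would occupy well under one percent of it.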
Explanation of sentence-phrases and sentence words. A "coding sentence-phrase" and an "output sentence-phrase" have the same form; one is used when segmenting text before encoding and the other in computer output, so both are abbreviated to "sentence-phrase". "Phonetic sentence word" and "input sentence word" share the abbreviation "sentence word" for the same reason. A sentence-phrase is a three-character or four-character Chinese expression; a sentence word is the code of three or four sound joints (a sound joint being the counterpart of a syllable). Sentence-phrases serve five main purposes. First, they solve the duplicate-code problem of Chinese character encoding: when a character has a duplicate code, it is encoded within a three-character sentence-phrase and entered as a three-joint sentence word. The most frequently used forms of sentence-phrase and sentence word are the three-joint forms "1+2" and "2+1"; because the double sound joint of the invention, the "2", has no duplicate code, the four-character form "2+2" has no duplicate-code problem. Second, they make the meaning more definite; for example, the Chinese "Three Character Classic" and four-character idioms each express a definite meaning or story. Third, they provide the basis for a sentence-processing method in which Zhao words and sentence words, once entered into the computer, are converted automatically into Chinese words and sentence-phrases for output. Fourth, they make a statement clearer and more coherent. Fifth, they make it easier to segment Chinese words out of a statement.
Example 1: "The universal joint is a very dexterous mechanical device." Here "A1, A2, A4, A5" denote sentence-phrases and "A3, A11, A12, A21, A22, A41, A42, A51, A52" denote Chinese words. The segmentation result is as follows:
The universal joint is a very dexterous mechanical device.
The method above is called the "sentence-phrase segmentation method". The invention stipulates that a sentence-phrase must be segmented into two Chinese words, and into exactly two. Because a sentence-phrase is followed by two spaces, it has a formal written marker just as a Chinese word does, which is a considerable convenience for automatic word segmentation by computer. Two sentence-phrases are called a "super sentence-phrase", two super sentence-phrases a "sub-statement", two sub-statements a "statement", two statements a "super statement", and so on; units can always be combined in pairs as required. Although super sentence-phrases, statements and so on have no formal marker, as an algorithm they will be useful for natural language understanding, machine translation and similar tasks.
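The "exactly two Chinese words" rule can be sketched on the code side, where a sentence word is two Zhao words. The sketch below assumes full codes (four letters per sound joint, as stated earlier) separated by a single space; the function name `sentence_word_shape` is hypothetical, and the sample strings are full codes taken from Example 2 of the description.

```python
def sentence_word_shape(sentence_word: str) -> str:
    """Return the shape ("1+1", "1+2", "2+1" or "2+2") of a full-code sentence word."""
    parts = sentence_word.split()
    if len(parts) != 2:
        raise ValueError("a sentence word consists of exactly two Zhao words")
    sizes = []
    for part in parts:
        if len(part) not in (4, 8):  # one or two sound joints of four letters
            raise ValueError(f"{part!r} is not a full-code Zhao word")
        sizes.append(len(part) // 4)
    return f"{sizes[0]}+{sizes[1]}"


if __name__ == "__main__":
    print(sentence_word_shape("gaks vihdjimb"))      # single + double
    print(sentence_word_shape("wihmiidg iihciihd"))  # double + double
```

Any shape that contains a "2" includes a double sound joint, which is the part the text relies on to keep the sentence word free of duplicate codes.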
Example 2: "He holds an objection to the meaning of this incident."
Segmenting Example 2 with the sentence-phrase segmentation method feels rather awkward. If Example 2 is rewritten as "He holds an objection regarding the meaning of this incident", which adds one character, the segmentation feels somewhat better but still not smooth. If it is rewritten as "He holds an objection regarding the meaning that this incident produces", the segmentation is smoother and the statement is clearer and more coherent. This shows that sentence-phrases can help the user express things more clearly in writing and have a rhetorical effect. Although Example 2 reads awkwardly, this has no adverse effect on the explanation of how to encode.
How the present invention solves the duplicate-code problem of Chinese character encoding.
The user learns the Zhao word codes of the 3755 characters in the GB2312-80 table of commonly used characters, or of all 6763 characters in GB2312-80. Double sound joint Zhao words have no duplicate codes, sentence words have no duplicate codes, and the invention provides fault-tolerant codes for all commonly used characters. In keyboard input, freedom from duplicate codes is guaranteed as long as the code falls under one of the following three cases. First, the character is one of the first six homophones of its syllable, that is, one of the characters assigned tone-letter sequence numbers 1 to 6. Second, the input is a double sound joint Zhao word. Third, the input is a sentence word, one part of which must be a double sound joint Zhao word. Input that falls under none of these three cases will produce duplicate codes; the remedy is the third basic encoding rule, under which a single-character word is always encoded with one double sound joint.
Keyboard layout for the codes. The full-code keyboard uses only three compressed codes: zh is represented by y, ch by w, and sh by v. The full code uses the QWERTY keyboard; because there are only three compressed codes, the full-code keyboard is not drawn separately.
The "short-code keyboard", shown in Fig. 7, is the dedicated keyboard of the invention. For the key codes of the short code, see Fig. 2 for the initials, Fig. 3 for the finals and Fig. 1 for the tone letters. On each key in Fig. 7, the lower left shows the full-code final and initial, and the bracket on the right shows the pinyin. "/" means that there is no corresponding code.
Auxiliary rule of Zhao word encoding. The auxiliary rule is really part of the three basic encoding rules, all of which depend on it; it is listed separately here only for convenience of exposition. The auxiliary rule is the method by which the six tone letters of a tone are assigned, in order, to the corresponding homophonous characters; it is called "ranking" for short.
Frequency ranking method: characters are ranked for encoding by their relative frequency of use in modern written Chinese. The method is fairly simple but not very systematic, and it places a heavy load on the user's memory.
Meaning ranking method: the inventor holds that a Chinese character expresses either a "name" or an "action". Meanings of the "name" class are called "nouns" and meanings of the "action" class are called "verbs". A character may have many meanings, but it always has one basic meaning. The invention stipulates that a single-character Chinese word expresses only the basic meaning; the other meanings of the character are expressed by two-character Chinese words. For example, the basic meaning of the character "beat" is "to strike an object with the hand or an implement"; as a single-character Chinese word it is a "verb", more precisely an "action verb". Its other meanings always arise in combination with another character, that is, they can only be expressed by two-character Chinese words, such as:
"hired roughneck" (noun), object noun,
"hit a person" (verb), action verb,
"dismiss" (verb), process verb,
"look up and down" (verb), state verb,
"plan" (verb), state verb, and so on.
The basic meanings of Chinese characters fall into two major classes, which are subdivided into six classes of basic meaning; although many words therefore share a class meaning, this does not affect use. The stipulation that a Chinese word has only one meaning makes the "class meaning" computable, which benefits both Chinese character information processing and Chinese character encoding. The main drawback of the meaning ranking method is the heavy load on memory.
Supplementary notes on the six class meanings. "Equivalent to" below always refers to the terms of Chinese grammar.
Referring to Fig. 1:
Sequence number 1, object noun: equivalent to the concrete nouns.
For example: person, mountain, water.
Sequence number 2, abstract noun: equivalent to the abstract nouns.
For example: friend, thought, politics.
Sequence number 3, pronoun-and-time noun: equivalent to pronouns, numeral-classifier compounds, time words, place words, nouns of locality, etc.
For example: he, year, second, up, east, it, with.
Sequence number 4, action verb: equivalent to most verbs.
For example: beat, put, write.
Sequence number 5, state verb: equivalent to some verbs and to all adjectives.
For example: be, large, small, good, fast, slow.
Sequence number 6, process verb: equivalent to some verbs and to adverbs, prepositions, auxiliary words, conjunctions and interjections.
For example: float, flow, very, all, to, get, cross, and, ha.
The class meaning of a single-character word and that of a two-character word sometimes differ. For example, the single-character Chinese word "study" is a process verb and "life" is a process verb, but the two-character Chinese word "student" is an object noun. This process of change in word meaning is called "morphology", so the meaning ranking method is also called "morphological encoding".
The meaning ranking method also has exceptions. For example, "he", "she" and "it" should all belong to sequence number 3, pronoun-and-time noun, but for convenience of use the invention stipulates: "he", pronoun-and-time noun; "she", abstract noun; "it", object noun. Similar cases include "get" and a few other characters. Having to make special stipulations for exceptions is clearly a drawback; fortunately such special characters are few.
Compared with the prior art, the main advantages of the present invention are:
1. Technically, Zhao word encoding achieves Chinese character encoding without a single duplicate code while remaining readable. This creates the conditions for popularizing computer applications.
2. The readability of Zhao word codes is broadly accommodating: they can be used both by people who speak standard Mandarin and by people who do not.
3. There are just three basic encoding rules; from the encoding of one character to the encoding of all characters, only these three rules apply.
4. The mathematical definition of the Chinese word and the mathematical models of the Zhao word and of the sound and meaning of Chinese words and sentence-phrases provide methods for solving the various application problems of Chinese character information processing.
5. The mathematical definition of the Chinese word and the mathematical models of the Zhao word and of sound and meaning show that Zhao words are better suited to computer processing than English words.
The drawings referred to in the description are as follows:
Fig. 1, tone letter table (class meaning letter table);
Fig. 2, initial table;
Fig. 3, final table;
Fig. 4, illustration of the Chinese word principle;
Fig. 5, phonetic Chinese word code table for "meaning";
Fig. 6, phonetic Chinese word code table for "they";
Fig. 7, short-code keyboard layout.
Specific embodiments are described below with reference to the drawings.
To use the phonetic Chinese word code, Chinese words must first be segmented out of the Chinese text. Segmenting Chinese words can be regarded as a rhetorical process carried out with one-character and two-character units. Apart from a similarity in form, therefore, Chinese words have no relation to grammatical words. The "definition of the Chinese word" is the basic method of segmentation, Fig. 4 shows its basic principle, and Example 1 contrasts the results of segmenting into grammatical words and into Chinese words.
Example 1. ① universal joint / is / one / kind / very / dexterous / (particle) / mechanical device. (grammatical-word segmentation)
② The universal joint is a very dexterous mechanical device. (segmentation by the Chinese word definition)
From ① and ② of Example 1, the difference between grammatical words and Chinese words can be seen at a glance. The main problem with grammatical words is that the definition of a word is hard to pin down, which makes segmentation difficult, whereas the definition of the Chinese word is simple and clear: segmentation is carried out in units of one character and two characters. Because segmenting Chinese words depends on the individual's command of rhetoric, the operator must be a native speaker of Chinese with at least a junior secondary education.
For the same Chinese text, the Chinese words segmented by different people are generally the same. People always aim for the best rhetorical expression and hence the best segmentation, and under the same cultural background their ways of thinking and their sense of what is "good" are much alike. Exceptions are also normal. A different segmentation can be regarded as a difference in rhetorical skill or in expression; the differing Chinese words may be seen as an innovation, as a flaw, as non-standard usage, and so on, and all of these may occur. In general there is only one good segmentation, while poor and mediocre segmentations are many and varied, and innovations and flaws are always extremely rare.
Once the Chinese words have been segmented, they can be encoded as Zhao words; see Figs. 1 to 6. For ease of understanding, the inventor first gives the grammatical words of the example sentence and the written form of its pinyin words, and then gives the Chinese words and the phonetic Chinese word code. Example 2 is an example sentence encoded in full code with the frequency ranking method.
Example 2:
① 他/对/该/事/件/的/意义，持/有/异议。 (grammatical words)
② Tā duì gāi shìjiàn de yìyì, chíyǒu yìyì. (pinyin words)
③ 他 对 该 事件 的 意义，持有 异议。 (Chinese words; "He holds an objection to the meaning of this incident.")
④ Tals duca gaks vihdjimb defa iihbiiha, wihmiidg iihciihd.
1 1 1 4 2 1 2 1 1 1 3 [4] (phonetic Chinese word full code, frequency ranking method)
In ④ of Example 2, the Arabic numerals 1, 2, 3, 4 under the Zhao words (5 and 6 are not used here) correspond one to one with the tone letters of sequence numbers 1 to 6 in Fig. 1, and one to one with the first six homophonous characters of each syllable among the 3755 characters of the GB2312-80 table of commonly used modern Chinese characters. The inventor fixed this correspondence by character frequency; see the numeral at the lower right of each character in the homophone table of Example 2 below. The bracketed Arabic numerals [1], [2], [3], [4], [5], [6] under the Zhao words likewise correspond one to one with the tone letters of sequence numbers 1 to 6 in Fig. 1, and one to one with the seventh and subsequent homophonous characters of each syllable among the same 3755 characters; this correspondence is also fixed by the inventor, and is shown by the same numerals in the homophone table. Only [4] is used in ④ of Example 2. The numerals are marked only for convenience of learning and explanation; once Zhao words have been learned, they need not be marked.
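How a rank numeral turns into a tone letter, and how a full code is assembled, can be sketched as follows. The final and tone-letter tables are transcribed from claim 1 of this document and the zero-initial rule from the description; the function and table names are hypothetical, the caller is assumed to supply the pinyin initial, final, tone and rank, and the seventh letter listed for each tone in claim 1 is left out because the text does not explain its use.

```python
# Pinyin final -> two-letter full-code final (transcribed from claim 1).
FULL_FINALS = {
    "a": "al", "o": "oj", "e": "ef", "er": "eh", "ai": "ak", "ei": "ec",
    "ê": "ec", "ao": "ag", "ou": "od", "an": "am", "en": "en", "ang": "at",
    "eng": "eb", "ong": "oy", "i": "ih", "ia": "il", "ie": "if", "iao": "ig",
    "iou": "id", "ian": "im", "in": "in", "iang": "it", "ing": "ib",
    "iong": "iy", "u": "uh", "ua": "ul", "uo": "uj", "uai": "uk", "uei": "uc",
    "uan": "um", "uen": "un", "uang": "ut", "ueng": "ub", "ü": "oh",
    "üe": "of", "üan": "om", "ün": "on", "ng": "ob",
    "": "ot",  # the silent final added by the invention
}
# Only three initials are rewritten; all others keep their pinyin letter.
INITIAL_SUBSTITUTES = {"zh": "y", "ch": "w", "sh": "v"}
# Tone number -> letters for rank 1..6 ([1]..[6] reuse the same letters).
TONE_LETTERS = {1: "stuvwx", 2: "mnopqr", 3: "ghijkl", 4: "abcdef"}


def full_code(initial: str, final: str, tone: int, rank: int) -> str:
    """Build the four-letter full code of one sound joint."""
    body = FULL_FINALS[final]
    if initial:
        head = INITIAL_SUBSTITUTES.get(initial, initial)
    else:
        head = body[0]  # no initial: write the first letter of the final twice
    return head + body + TONE_LETTERS[tone][rank - 1]


if __name__ == "__main__":
    # shì (rank 4) + jiàn (rank 2), the two sound joints of "vihdjimb"
    print(full_code("sh", "i", 4, 4) + full_code("j", "ian", 4, 2))
```

The sketch reproduces the codes printed in ④ of Example 2, for instance "duca" for duì at rank 1 and "iidg" for yǒu, whose missing initial is supplied by doubling the "i" of its final.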
The following uses the phonetic Chinese word short code with the meaning ranking method, again on the sentence of Example 2.
⑤ 他 对 该 事件  的 意义，持有 异议 。
⑥ TAU DTF GLX VIBJYC  DEF IIBIIE, WIPIWK IIFIIA .
3 6 6 2 3 6 [2] [5] 4 5 6 1 (phonetic Chinese word short code, meaning ranking method)
"This incident" and "holds an objection" in ⑤ are sentence-phrases; the rest are Chinese words. A sentence-phrase is followed by two spaces; when a sentence-phrase is followed by a punctuation mark, one space is added before the punctuation mark to show that what precedes it is a sentence-phrase.
"GLX VIBJYC" and "WIPIWK IIFIIA" in ⑥ are sentence words; the rest are Zhao words. The spacing rules are the same as for the sentence-phrases in ⑤.
In the homophone table of Example 2, each group is headed by its pinyin; the numeral at the lower right of a character (printed directly after it) is its rank under the frequency ranking method, and the numeral beneath the character is its rank under the meaning ranking method.
The phrase "fixed by the inventor" used for ④ of Example 2 becomes, for ⑥ of Example 2, "determined by the basic meaning of the character, that is, by its class meaning". The basic meaning of a character can be looked up in the "Modern Chinese Dictionary" or is provided by the inventor.
When the Zhao word short code is used and the sound joint has no initial, the "first letter of the final" that is written twice means the first letter of the Zhao word full-code final. For example, for "watt" (pinyin "wǎ"), the Zhao word full code is "uulg" and the Zhao word short code is "UCG"; the short code must not be written "CCG".
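The short-code rule just described can be sketched in the same way. The one-letter finals are transcribed from claim 1 of this document (written in uppercase, since the short code uses capital letters); the function names are hypothetical, and the helper that finds the first letter of the full-code final is a shortcut that relies on the pattern of the full-code table in claim 1 rather than on anything the text states directly.

```python
# Pinyin final -> one-letter short-code final (transcribed from claim 1).
SHORT_FINALS = {
    "er": "Q", "ia": "Q", "": "Q", "iou": "W", "e": "E", "üan": "R",
    "uan": "R", "üe": "T", "uei": "T", "ian": "Y", "u": "U", "i": "I",
    "o": "O", "uo": "O", "ün": "P", "uen": "P", "a": "A", "iong": "S",
    "ong": "S", "iang": "D", "uang": "D", "en": "F", "eng": "G", "ueng": "G",
    "ang": "H", "an": "J", "ao": "K", "ai": "L", "ei": "Z", "ê": "Z",
    "ie": "X", "ü": "C", "ua": "C", "iao": "V", "ou": "B", "in": "N",
    "ng": "N", "ing": "M", "uai": "M",
}
INITIAL_SUBSTITUTES = {"zh": "y", "ch": "w", "sh": "v"}
TONE_LETTERS = {1: "stuvwx", 2: "mnopqr", 3: "ghijkl", 4: "abcdef"}


def full_final_first_letter(final: str) -> str:
    """First letter of the full-code final (ü-finals, ng and the silent final start with o)."""
    if final.startswith("ü") or final in ("ng", ""):
        return "o"
    return "e" if final == "ê" else final[0]


def short_code(initial: str, final: str, tone: int, rank: int) -> str:
    """Build the three-letter uppercase short code of one sound joint."""
    if initial:
        head = INITIAL_SUBSTITUTES.get(initial, initial)
    else:
        head = full_final_first_letter(final)  # taken from the full code, not the short final
    return (head + SHORT_FINALS[final] + TONE_LETTERS[tone][rank - 1]).upper()


if __name__ == "__main__":
    print(short_code("", "ua", 3, 1))  # wǎ
```

Taking the doubled letter from the full-code final is what yields "UCG" rather than "CCG" for wǎ, and the same function reproduces short codes from ⑥ of Example 2 such as "TAU" and "IWK".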
The homophone table of Example 2 (each character is followed by its frequency rank; its meaning rank is given on the line beneath, in the same order):
1 / tā: collapse 4; he 1; it 3; she 2
meaning rank: 4; 3; 1; 2
2 / duì: convert 3; team 2; right 1
meaning rank: 4; 1; 6
3 / gāi: this 1
meaning rank: 6
4 / shì: formula 6; show [5]; scholar [1]; generation [2]; persimmon [1]; thing 4
meaning rank: 1; 4; [1]; [3]; [1]; 2
wipe away [3]; oath [5]; die [5]; gesture [2]; be 1; have a liking for [4]
meaning rank: [4]; [4]; [6]; [5]; 5; [5]
divination by means of the milfoil [6]; suitable [3]; bodyguard [6]; wait upon [2]; release [3]; decorations [4]
meaning rank: [4]; [5]; [1]; [6]; 6; [5]
family name 5; city 2; rely on [6]; chamber 3; look [3]; examination [1]
meaning rank: [2]; 3; [5]; [3]; [4]; [6]
(The character "轼" (shì, "horizontal bar in the front of a carriage used as an armrest") used in Example 3 belongs to the less common characters of GB2312-80; the inventor assigns it frequency rank [2] and meaning rank [1].)
5 / jiàn: recommend [2]; sill [4]; mirror [1]; trample [5]; low-priced [5]; see 1
meaning rank: 5; [1]; [1]; [4]; [5]; 4
key [2]; arrow 6; part 2; strong [6]; warship [1]; sword 5
meaning rank: 1; [1]; 3; [5]; [1]; [1]
give a farewell dinner [3]; gradually 4; spatter [4]; ravine [5]; build 3
meaning rank: [4]; [5]; [4]; [1]; 6
6 / de: (particle) 1
meaning rank: 6
(Under the rules of the invention, a neutral-tone character whose own tone cannot be found in a small dictionary is given a substitute tone; de is written as dè.)
7 / yì: skill 4; press down [2]; easily [5]; city [1]; towering like a mountain peak [4]; hundred million [6]
meaning rank: 2; 4; [6]; [1]; [5]; 3
subjectively [6]; ease [5]; study [6]; epidemic disease [6]; also [1]; descendants [2]
meaning rank: [1]; [5]; [6]; [1]; [5]; [1]
meaning 2; firm [3]; recall [3]; justice 1; benefit [1]; overflow [4]
meaning rank: [2]; [5]; [6]; 5; 5; [4]
call on [2]; view [4]; friendship [2]; translate [4]; different 3; wing [5]
meaning rank: [5]; 1; [5]; [5]; 6; [1]
next [4]; unravel silk [3]
meaning rank: [3]; [6]
8 / chí: hold 1; spoon 2; pond 3; late 4; relax 5; speed 6
meaning rank: 4; [1]; 1; 5; 6; [5]
9 / yǒu: tenth of the twelve Earthly Branches 3; have 1; friend 2
meaning rank: 3; 5; 1
Example 3:
① 题/西林壁 苏/轼
横/看/成/岭/侧/成/峰，
远/近/高/低/各/不同。
不/识/庐山/真/面目，
只/缘/身/在/此/山中。 (grammatical words)
② Tí Xīlín Bì  Sū Shì
Héng kàn chéng lǐng cè chéng fēng,
yuǎn jìn gāo dī gè bùtóng.
Bù shí Lúshān zhēn miànmù,
zhǐ yuán shēn zài cǐ shānzhōng. (pinyin)
③ 题 西林 壁 苏 轼
横看 成岭 侧 成峰。
远近 高低 各 不同。
不识 庐山 真 面目。
只缘 身在 此 山中。 (Chinese words)
④ TIHN XIHSLINN BIHF SUHS VIHB(YIHV)
2 1 2 6 1 [2] [4]
Hebmkama webmlibh cefc webmfebw.
1 1 1 2 3 1 5
Oomgjinb gagsdihs gefb buhatoym.
1 2 1 1 2 1 1
Buhavihn Luhnvams yens mimamuhb.
1 [2] 2 1 1 1 2
Yihjoomm venuzaka cihg vamsyoys. (phonetic Chinese word full code, frequency ranking method)
4 [1] 3 1 1 1 1
Notes on Example 3:
The homophone table of Example 3 is omitted; it is constructed in the same way as the homophone table of Example 2.
The character "轼" (shì, "horizontal bar in the front of a carriage used as an armrest") is rarely used in modern Chinese except in names. Under the third basic encoding rule of the invention, when "轼" is used as a single-character word it must be written as a double sound joint, "vihb(yihv)" with ranks [2] and [4], that is, "轼之"; this form is stipulated by the inventor. The round brackets ( ) mean that the sound joint inside them does not output a character but must still be entered into the computer as part of the code. According to statistics from the relevant authorities in Taiwan, there are more than 25,000 "name-class" characters used in personal and place names. Those that are also in current use are easy to handle, but a considerable number are used even less than "轼". For ordinary users, learning a large number of two-character forms like "轼之" that modern Chinese does not use, together with their double sound joints, as well as a large number of special-purpose two-character names and their double sound joints, is clearly inappropriate. Specialized scientific and technical characters raise the same problem; the inventor will deal with these separately.
The characters "识" (shí, "know") and "缘" (yuán, "because") are commonly used, but they are not among the six ranked homophones stipulated by the inventor. However, "不识" ("fail to see") and "只缘" ("only because") are two-character words, encoded with the double sound joints "buhavihn" (ranks 1 [2]) and "yihjoomm" (ranks 4 [1]), in accordance with the first basic encoding rule.
The characters "侧" ("side"), "各" ("each"), "真" ("true"), "此" ("this"), "题" ("inscribe"), "壁" ("wall") and "苏" ("Su") are single-character words within the six ranks stipulated by the inventor. They are encoded with the single sound joints "cefc" (rank 3), "gefb" (rank 2), "yens" (rank 1), "cihg" (rank 1), "tihn" (rank 2), "bihf" (rank 6) and "suhs" (rank 1), in accordance with the second basic encoding rule.
Any two-character word falls under the first basic encoding rule: a two-character word is always encoded with one double sound joint. Most commonly used single-character words fall under the second basic encoding rule: a single-character word is always encoded with one single sound joint. A minority of commonly used single-character words, all uncommon single-character words, and any new single-character words that arise in future fall under the third basic encoding rule: a single-character word is always encoded with one double sound joint. Once the user is proficient with the phonetic Chinese word code, the third rule can be applied flexibly, that is, one single-character word may be encoded with several related double sound joints. For example, the character "轼" can also be written as the double sound joint "(pibq)vihb", with the bracketed joint placed before it; users decide for themselves which is more convenient. The first and second basic encoding rules, however, never change. The phonetic Chinese word code is simply the repeated application of these three basic encoding rules.
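The choice among the three basic encoding rules can be summarized in a small sketch. It assumes the caller already knows whether a single character lies within the six ranked homophones of its syllable (that information comes from the inventor's tables, which are not reproduced here); the function name `encoding_rule` and its return format are hypothetical.

```python
def encoding_rule(chinese_word: str, within_first_six: bool = True) -> tuple[int, str]:
    """Pick the basic encoding rule for a Chinese word of one or two characters.

    within_first_six matters only for single-character words: it says whether
    the character has one of the six ranked places among its homophones.
    """
    if len(chinese_word) == 2:
        return 1, "double sound joint"   # rule 1: two-character word
    if len(chinese_word) == 1:
        if within_first_six:
            return 2, "single sound joint"  # rule 2: common single character
        return 3, "double sound joint"      # rule 3: all other single characters
    raise ValueError("a Chinese word has one or two characters")


if __name__ == "__main__":
    print(encoding_rule("不同"))
    print(encoding_rule("侧"))
    print(encoding_rule("轼", within_first_six=False))
```

In Example 3 this corresponds to "buhatoym" for the two-character word, "cefc" for the common single character, and "vihb(yihv)" for the rare character that needs a double sound joint.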

Claims (3)

1. A phonetic Chinese word code and a Chinese character input keyboard dedicated to it, characterized in that: the phonetic Chinese word code and its keyboard use two kinds of keyboard, "full code" and "short code"; the full-code keyboard has no figure, and the short-code keyboard is shown in Fig. 7;
The full code and the short code both use the international standard keyboard; the 26 initials, 38 finals and 26 tone letters of the phonetic Chinese word code, and the corresponding 22 initials, 38 finals and 4 tone marks of the Chinese phonetic alphabet (pinyin), are defined on the 26 English letter keys of the QWERTY keyboard; "replaced" below always means that a pinyin letter is replaced with an English letter on the QWERTY keyboard;
The initials of the full code and the short code are the same: zh is replaced with y, ch with w, and sh with v; compared with pinyin, five "silent" initials a, i, e, o, u are added, written with the same English letters; the other initials are also written with the same English letters; see Fig. 2;
The finals of the full code all consist of two English letters: er is replaced with eh, a with al, o with oj, e with ef, ai with ak, ei with ec, ao with ag, ou with od, an with am, en keeps the same letters, ang is replaced with at, eng with eb, ong with oy, i with ih, ia with il, ie with if, iao with ig, iou with id, ian with im, in keeps the same letters, iang is replaced with it, ing with ib, iong with iy, u with uh, ua with ul, uo with uj, uai with uk, uei with uc, uan with um, uen with un, uang with ut, ueng with ub, ü with oh, üe with of, üan with om, and ün with on; the pinyin final ê is merged into the final ei and replaced with ec; the pinyin initial ng is used as a final and replaced with ob; compared with pinyin, a "silent" final ot is added, written with the same English letters; see Fig. 3;
The finals of the short code all consist of one English letter: er, ia and the silent final ot of the full code are all replaced with Q; iou is replaced with W; e keeps the same letter; üan and uan are replaced with R; üe and uei with T; ian with Y; u, i and o keep the same letters; uo is replaced with O; ün and uen with P; a keeps the same letter; iong and ong are replaced with S; iang and uang with D; en with F; eng and ueng with G; ang with H; an with J; ao with K; ai with L; ei and ê with Z; ie with X; ü and ua with C; iao with V; ou with B; in and ng with N; ing and uai with M; see Fig. 3;
The tone letters of the full code and the short code are the same: the high level tone is replaced with s, t, u, v, w, x, z; the rising tone with m, n, o, p, q, r, z; the falling-rising tone with g, h, i, j, k, l, y; and the falling tone with a, b, c, d, e, f, y; see Fig. 1.
2. A Chinese character encoding method using the phonetic Chinese word code, characterized in that: the Chinese word is the unit of Chinese character encoding; the phonetic Chinese word and the phonetic sentence word are the forms of Chinese character encoding; Chinese words and phonetic Chinese words are encoded one to one; sentence words and phonetic Chinese words are the input units, and sentence-phrases and Chinese words are the output units; the method encodes both the meaning and the pronunciation of Chinese characters, and comprises:
1) Chinese speech and coding sentences and phrases are the Chinese character coding units; the Chinese character coding units formed by one Chinese character and by two Chinese characters are called Chinese-character Chinese speech; the Chinese speech of one Chinese character is called a "single Chinese character" (the "Chinese word character"), or "single-character Chinese speech"; the Chinese speech of two Chinese characters is called a "double Chinese character", or "double-character Chinese speech"; when no distinction is made, both are called "Chinese speech"; the mathematical definition of Chinese speech is $c^2 + c^1$, where c = 0, 1, 2, 3, ... is an integer, c is the number of different Chinese characters, $c^1$ is the number of single-character Chinese speech, and $c^2$ is the number of double-character Chinese speech; a Chinese speech has only one meaning, called the "generic meaning", or "synonymity" for short; the mathematical model of the meaning of Chinese speech is $H_1 = \log_2(c^2 + c^1)$, where c > 0, $H_1$ is the average information of the meaning of Chinese speech in bits, c is the number of different Chinese characters, $c^1$ is the number of single-character Chinese speech meanings, and $c^2$ is the number of double-character Chinese speech meanings; Chinese speech has a prescribed written form and meaning, and there is a space between Chinese speech; a Chinese character coding unit formed by two Chinese speech is called "coding sentences and phrases", or "sentences and phrases" for short; the sentences and phrases have four kinds of coding unit, namely single Chinese character + single Chinese character, single Chinese character + double Chinese character, double Chinese character + single Chinese character, and double Chinese character + double Chinese character;
2) The Chinese word phonetic-alphabet and the phonetic sentence speech are the Chinese character coding forms; the Chinese word phonetic-alphabet code uses two coding forms, the "all-key", also called the "Zhao's speech all-key", and the "brevity code", also called the "Zhao's speech brevity code";
The initials of the all-key and the brevity code are identical, 26 in all: b, p, m, f, d, t, n, l, g, k, h, j, q, x, y, w, v, r, z, c, s, a, i, e, o, u (Fig. 2);
The all-key has 38 finals: eh, al, oj, ef, ak, ec, ag, od, am, en, at, eb, oy, ih, il, if, ig, id, im, in, it, ib, iy, uh, ul, uj, uk, uc, um, un, ut, ub, oh, of, om, on, ot, ob (Fig. 3);
The brevity code has 26 finals: Q, W, E, R, T, Y, U, I, O, P, A, S, D, F, G, H, J, K, L, Z, X, C, V, B, N, M (Fig. 3);
The "syllable-tone letters" (the "joint transfer" letters) of the all-key and the brevity code, also called "syllable tones", are identical, 26 in all; in addition, 2 of these letters are each listed a second time (z and y in the lists that follow); the first-tone (high and level) letters are s, t, u, v, w, x, z, the second-tone (rising) letters are m, n, o, p, q, r, z, the third-tone letters are g, h, i, j, k, l, y, and the fourth-tone (falling) letters are a, b, c, d, e, f, y (Fig. 1);
The all-key and the brevity code each use initials, finals, and syllable-tone letters to encode the approximately 1300 different toned syllables of the Chinese phonetic alphabet (Pinyin) as approximately 8580 different toned codes; these 8580 codes are called "Chinese word phonetic-alphabets"; a Chinese word phonetic-alphabet of one syllable is called a "monosyllable" (the "monophone joint"), or "monosyllabic Zhao's speech"; a Chinese word phonetic-alphabet of two syllables is called a "disyllable" (the "alliteration joint"), or "disyllabic Zhao's speech"; when no distinction is made, both are called "Chinese word phonetic-alphabet" or "Zhao's speech"; the mathematical definition of Zhao's speech is $a^2 + a^1$, where a = 0, 1, 2, 3, ... is an integer, a is the number of different syllables, $a^1$ is the number of monosyllabic Zhao's speech, and $a^2$ is the number of disyllabic Zhao's speech; a Zhao's speech has only one standard pronunciation, namely standard Mandarin; the mathematical model of the Mandarin pronunciation of Zhao's speech is $H_2 = \log_2(a^2 + a^1)$, where a > 0, $H_2$ is the average information of the Mandarin pronunciation of Zhao's speech in bits, a is the number of different syllables, $a^1$ is the number of monosyllabic Zhao's speech Mandarin pronunciations, and $a^2$ is the number of disyllabic Zhao's speech Mandarin pronunciations; calculated with 8580 syllables, the total number of Zhao's speech is $7.362498 \times 10^7$, and the entropy of Zhao's speech pronunciation, that is, the average information of Mandarin pronunciation, is 26.134 bits; there is a space between Zhao's speech; a monosyllable consists of three parts, namely initial, final, and syllable-tone letter; a disyllable consists of six parts, namely initial, final, syllable-tone letter, initial, final, and syllable-tone letter; a Chinese character coding form composed of two Chinese word phonetic-alphabets is called "phonetic sentence speech", or "sentence speech" for short; the sentence speech has four kinds of coding form, namely monosyllable + monosyllable, monosyllable + disyllable, disyllable + monosyllable, and disyllable + disyllable;
3) There are three basic rules for the corresponding coding of Chinese speech and Chinese word phonetic-alphabets: one double Chinese character is fixed to one disyllable code, one single Chinese character is fixed to one monosyllable code, and one single Chinese character is fixed to one disyllable code; there is one auxiliary rule for the corresponding coding of Chinese speech and Chinese word phonetic-alphabets, namely the rule for the corresponding ordering of Chinese characters and syllable-tone letters; the "sequence number" below always means the "syllable-tone letter sequence number" of Fig. 1; the first is the "character frequency ranking method": among homophonous Chinese characters of the same tone, according to how frequently each character is used, 6 Chinese characters are arranged in order from sequence number 1 to sequence number 6, and the arrangement is repeated until all homophonous same-tone Chinese characters are arranged; the second is the "word meaning ranking method", also called the "sound and meaning ranking method": among homophonous Chinese characters of the same tone, the corresponding ordering of Chinese characters and syllable-tone letters is prescribed according to one basic meaning of each Chinese character; the basic meanings of all Chinese characters are classed into two "generic meanings", namely "noun" and "verb", which are subdivided into 6 specific meanings, namely concrete noun, abstract noun, temporal noun, action verb, stative verb, and process verb; among homophonous same-tone Chinese characters, according to one basic meaning of each Chinese character, 6 Chinese characters are arranged in order from sequence number 1 to sequence number 6, and the arrangement is repeated until all homophonous same-tone Chinese characters are arranged (Fig. 1);
4) The sentence speech and the Chinese word phonetic-alphabet serve as input units; provided that there is one space between two Chinese word phonetic-alphabets, an input unit composed of two Chinese word phonetic-alphabets is called an "input sentence speech", or "sentence speech" for short, and the space bar is pressed twice after a sentence speech; if a monosyllable is denoted by the digit "1" and a disyllable by the digit "2", the sentence speech has four combination forms, namely "1+1", "1+2", "2+1", and "2+2"; when the Chinese word phonetic-alphabet is the input unit, the space bar is pressed once after each Chinese word phonetic-alphabet is entered;
5) The sentences and phrases and the Chinese speech serve as output units; provided that there is one space between two Chinese speech, an output unit composed of two Chinese speech is called "output sentences and phrases", or "sentences and phrases" for short, and the sentences and phrases are followed by a gap of two spaces; if a single Chinese character is denoted by the digit "1" and a double Chinese character by the digit "2", the sentences and phrases have four combination forms, namely "1+1", "1+2", "2+1", and "2+2"; when the Chinese speech is the output unit, each Chinese speech that is output is followed by one space.
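The arithmetic and the ordering rules of claim 2 can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the patent's implementation. The function names are hypothetical. The cyclic numbering reflects one reading of "arranged repeatedly" in item 3). The fixed code lengths (4 letters per all-key syllable, i.e. one initial letter, two final letters, and one syllable-tone letter, and 3 letters per brevity-code syllable) are inferred from item 2) and are not stated as lengths in the claim.

```python
import math


def zhao_speech_count(a):
    """Number of disyllables plus monosyllables for a syllable codes: a^2 + a^1."""
    return a ** 2 + a


def zhao_speech_entropy_bits(a):
    """Average information H_2 = log2(a^2 + a^1) in bits, defined for a > 0."""
    return math.log2(zhao_speech_count(a))


def frequency_sequence_numbers(chars_by_descending_frequency):
    """Assign sequence numbers 1..6 to homophonous same-tone characters.

    ASSUMPTION: after every 6 characters the numbering starts again at 1;
    the third tuple element counts how many times it has restarted.
    """
    return [(ch, i % 6 + 1, i // 6)
            for i, ch in enumerate(chars_by_descending_frequency)]


def sentence_shape(sentence, letters_per_syllable=4):
    """Classify a two-code sentence speech as '1+1', '1+2', '2+1', or '2+2'.

    ASSUMPTION: every syllable code has a fixed length
    (4 for the all-key, 3 for the brevity code).
    """
    codes = sentence.split()
    if len(codes) != 2:
        raise ValueError("a sentence speech consists of exactly two codes")
    shape = []
    for code in codes:
        syllables, rest = divmod(len(code), letters_per_syllable)
        if rest or syllables not in (1, 2):
            raise ValueError(f"not a monosyllable or disyllable code: {code!r}")
        shape.append(str(syllables))
    return "+".join(shape)
```

With a = 8580, `zhao_speech_count(8580)` returns 73624980, which is the 7.362498 × 10^7 stated in item 2), and `zhao_speech_entropy_bits(8580)` evaluates to about 26.134 bits. The strings passed to `sentence_shape` in any test are made-up letter sequences of the right length, not codes taken from the patent's figures.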
3. The Chinese word phonetic-alphabet coding method, keyboard, and input method according to claim 1 or claim 2, characterized in that, once the method and keyboard are classified and limited for specialized technical applications, they can be used in all large, medium, small, and micro computer Chinese character information processing systems, Chinese character teleprinters, Chinese character computer typewriters, Chinese character terminals, all kinds of electronic printing and typesetting systems, information retrieval and file management, office automation systems (OAS), expert systems, translation systems, Chinese character speech recognition systems and Chinese character pattern recognition systems, Chinese character information communication systems, advertising systems, telephone directory systems, and public consultation service systems.
CN97113313A 1996-05-29 1997-05-28 Phonetic Chinese word encoding and its keyboard Expired - Fee Related CN1109283C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN97113313A CN1109283C (en) 1996-05-29 1997-05-28 Phonetic Chinese word encoding and its keyboard

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN 96107547 CN1142077A (en) 1996-05-29 1996-05-29 Chinese word phonetic-alphabet code
CN96107547.3 1996-05-29
CN97113313A CN1109283C (en) 1996-05-29 1997-05-28 Phonetic Chinese word encoding and its keyboard

Publications (2)

Publication Number Publication Date
CN1172983A (en) 1998-02-11
CN1109283C (en) 2003-05-21

Family

ID=25743976

Family Applications (1)

Application Number Title Priority Date Filing Date
CN97113313A Expired - Fee Related CN1109283C (en) 1996-05-29 1997-05-28 Phonetic Chinese word encoding and its keyboard

Country Status (1)

Country Link
CN (1) CN1109283C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861487A (en) * 2020-11-30 2021-05-28 新绎健康科技有限公司 Method and system for marking five tones of Chinese characters

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN85102522A (en) * 1985-04-10 1987-02-04 中国中文信息研究会汉字编码专业委员会 Method of computer input in chinese alphabetic writing
CN1006018B (en) * 1986-10-16 1989-12-06 丁飞 Method for inputting chinese word and phrase and its keyboard
CN1054219C (en) * 1994-11-03 2000-07-05 王昭宁 Substitution type Chinese phonetic character, word input coding method and keyboard thereof

Also Published As

Publication number Publication date
CN1109283C (en) 2003-05-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee