CN1109283C - Phonetic Chinese word encoding and its keyboard - Google Patents


Info

Publication number
CN1109283C
Authority
CN
China
Prior art keywords
chinese
speech
joint
alphabet
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN97113313A
Other languages
Chinese (zh)
Other versions
CN1172983A (en
Inventor
赵延胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN 96107547 (CN1142077A)
Application filed by Individual
Priority to CN97113313A (CN1109283C)
Publication of CN1172983A
Application granted
Publication of CN1109283C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The invention relates to pinyin Chinese word coding and a keyboard for it, and belongs to the field of Chinese character coding within Chinese character information processing. It provides a keyboard input method for Chinese characters that is based on word processing and handles words, sentences, sound and meaning together. It introduces two new coding units, the "Chinese word" and the "sentence-phrase", and two new coding forms, the "pinyin Chinese word" and the "sentence-code". The invention gives a mathematical description of the Chinese word, the sentence-phrase, the pinyin Chinese word and the sentence-code, and so offers a method for language information processing, Chinese-language information processing and Chinese character information processing. The coding has no duplicate codes and can be read aloud, and it requires no manual selection of characters.

Description

Pinyin Chinese word code input method using a computer keyboard
The present invention relates to a Chinese character coding input method that uses a computer keyboard. It belongs to the field of computer Chinese character information processing.
Chinese character keyboard input methods fall into four major classes according to the character attribute on which the coding is based: shape codes, sound codes, shape-sound codes and sound-shape codes. Each of these coding methods has advantages and drawbacks and solves a different share of the problems; they are widely described elsewhere. They share three shortcomings. First, characters with duplicate codes generally have to be selected manually, which is inconvenient for users. Second, coded Chinese characters cannot be entered into a computer as easily as English words, which hinders the spread of computers. Third, none of the coding methods has advanced the solution of the various application problems of Chinese character information processing. For example, the well-known "Natural Code input method" invented by Mr. Zhou Zhinong has these main drawbacks: its pinyin coding relies on manual selection to resolve duplicate codes, is not as convenient as typing English words, and leaves the Chinese word segmentation problem unsolved; its shape-meaning coding cannot provide a good environment for solving the broad application problems of Chinese character information processing.
The object of the invention is to provide a Chinese character coding input method using a computer keyboard that addresses the various application problems of Chinese character information processing, is based on word processing, handles words and sentences at the same time, has no duplicate codes, and can be read aloud. To this end the invention provides a new coding unit, the "Chinese word"; in Chinese character information processing and coding, Chinese words allow Chinese text to be segmented exhaustively. It provides a new coding form, the pinyin Chinese word, which can be written with word spacing and read aloud, requires no manual selection, is entered under much the same conditions as English words, and leaves the coding without a single duplicate code. It provides a keyboard layout suited to short-code input of pinyin Chinese words, for high input speed; the pinyin full code uses the international standard keyboard.
The object of the invention is achieved by the following measures:
A pinyin Chinese word code input method using a computer keyboard has two modes, full code and short code. In both modes, keyboard input enters the letters that correspond to the initial, the final and the tone code of a Chinese word, in that order. Full code and short code share the same initials and tone codes; only the finals differ.
Initials: the 21 initials of traditional Hanyu Pinyin plus five silent initials a, i, e, o, u give 26 initials in all, each assigned to the corresponding letter key on the computer keyboard. The pinyin initials "zh, ch, sh" are replaced by the letters "y, w, v" respectively.
Tone codes: following the tones of Hanyu Pinyin, tone codes fall into four classes, namely the high-level tone, the rising tone, the falling-rising tone and the falling tone. Within each tone, the tone code is further divided by class-meaning name into six categories (concrete noun, abstract noun, temporal noun, action verb, stative verb, process verb) plus one reserved slot, and each of these corresponds to a letter on the keyboard. The "reserved" slot of the high-level and rising tones is the same letter, and the "reserved" slot of the falling-rising and falling tones is the same letter.
Finals: the full code has 38 finals, each replaced by two letters on the keyboard. The short code assigns the finals to 26 keys, corresponding to the letters of the QWERTY keyboard.
When Chinese words are entered in full code or short code they are divided, following Chinese usage, into single-syllable and double-syllable Chinese words. A single-syllable Chinese word is entered in the order initial + final + tone code. A double-syllable Chinese word is entered in the order initial + final + tone code + initial + final + tone code.
Tone-code input in turn has two methods. In the "character-frequency ordering method", letters are taken in alphabetical order of the tone letters according to the usage frequency of the characters. In the "word-meaning ordering method", the letter is chosen from the tone and the class meaning, that is, from the noun and verb categories listed above.
The full-code finals correspond to the following letter pairs (pinyin final-code):
er-eh, a-al, o-oj, e-ef, ai-ak, ê and ei-ec, ao-ag, ou-od, an-am, en-en, ang-at, eng-eb, ong-oy;
i-ih, ia-il, ie-if, iao-ig, iou-id, ian-im, in-in, iang-it, ing-ib, iong-iy;
u-uh, ua-ul, uo-uj, uai-uk, uei-uc, uan-um, uen-un, uang-ut, ueng-ub;
ü-oh, üe-of, üan-om, ün-on;
ng-ob; and the silent final, which is written with the letters ot.
The short-code finals correspond to the following keys (finals-key):
er, ia, ot-Q; iou-W; e-E; üan, uan-R; üe, uei-T; ian-Y; u-U; i-I; o, uo-O; ün, uen-P;
a-A; iong, ong-S; iang, uang-D; en-F; eng, ueng-G; ang-H; an-J; ao-K; ai-L;
ei, ê-Z; ie-X; ü, ua-C; iao-V; ou-B; in, ng-N; ing, uai-M.
The tone codes correspond to the following letters:
high-level tone: s, t, u, v, w, x, z;
rising tone: m, n, o, p, q, r, z;
falling-rising tone: g, h, i, j, k, l, y;
falling tone: a, b, c, d, e, f, y.
Within each tone, the letters correspond in order to the class-meaning names "concrete noun, abstract noun, temporal noun, action verb, stative verb, process verb, reserved".
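The three tables above can be combined into a small encoder. The following is only a sketch under stated assumptions, not the patent's implementation: the function names, the English category labels, the use of "ot" as the dictionary key for the silent final, and the decision to raise an error for cases the text does not spell out (for example a zero-initial syllable whose final begins with ü) are all illustrative choices. The zero-initial rule (repeat the first letter of the final) and the upper-case short code follow the description elsewhere in this document.

```python
# Full-code finals: pinyin final -> two-letter code (from the table above).
FULL_FINALS = {
    "er": "eh", "a": "al", "o": "oj", "e": "ef", "ai": "ak", "ê": "ec",
    "ei": "ec", "ao": "ag", "ou": "od", "an": "am", "en": "en",
    "ang": "at", "eng": "eb", "ong": "oy",
    "i": "ih", "ia": "il", "ie": "if", "iao": "ig", "iou": "id",
    "ian": "im", "in": "in", "iang": "it", "ing": "ib", "iong": "iy",
    "u": "uh", "ua": "ul", "uo": "uj", "uai": "uk", "uei": "uc",
    "uan": "um", "uen": "un", "uang": "ut", "ueng": "ub",
    "ü": "oh", "üe": "of", "üan": "om", "ün": "on", "ng": "ob",
    "ot": "ot",  # silent final; the key "ot" is an assumption of this sketch
}

# Short-code finals: key -> finals sharing that key (from the table above).
SHORT_KEYS = {
    "Q": ["er", "ia", "ot"], "W": ["iou"], "E": ["e"], "R": ["üan", "uan"],
    "T": ["üe", "uei"], "Y": ["ian"], "U": ["u"], "I": ["i"],
    "O": ["o", "uo"], "P": ["ün", "uen"], "A": ["a"], "S": ["iong", "ong"],
    "D": ["iang", "uang"], "F": ["en"], "G": ["eng", "ueng"], "H": ["ang"],
    "J": ["an"], "K": ["ao"], "L": ["ai"], "Z": ["ei", "ê"], "X": ["ie"],
    "C": ["ü", "ua"], "V": ["iao"], "B": ["ou"], "N": ["in", "ng"],
    "M": ["ing", "uai"],
}
SHORT_FINALS = {f: key for key, finals in SHORT_KEYS.items() for f in finals}

# Tone letters per tone (1 = high-level, 2 = rising, 3 = falling-rising,
# 4 = falling), indexed by category in the order given in the text.
TONE_LETTERS = {1: "stuvwxz", 2: "mnopqrz", 3: "ghijkly", 4: "abcdefy"}
CATEGORIES = ["concrete noun", "abstract noun", "temporal noun",
              "action verb", "stative verb", "process verb", "reserved"]

PINYIN_INITIALS = {"b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
                   "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s"}
INITIAL_SUBSTITUTES = {"zh": "y", "ch": "w", "sh": "v"}


def initial_letter(initial, final):
    """Return the single letter typed for the initial."""
    if initial:
        if initial not in PINYIN_INITIALS:
            raise ValueError(f"unknown initial: {initial!r}")
        return INITIAL_SUBSTITUTES.get(initial, initial)
    # Zero initial: repeat the first letter of the final as a silent initial.
    first = final[0]
    if first not in "aieou":
        raise ValueError("silent initial not specified for final " + final)
    return first


def encode_syllable(initial, final, tone, category, short=False):
    """Encode one syllable as initial + final + tone letter."""
    ini = initial_letter(initial, final)
    tone_letter = TONE_LETTERS[tone][CATEGORIES.index(category)]
    if short:
        return ini.upper() + SHORT_FINALS[final] + tone_letter.upper()
    return ini + FULL_FINALS[final] + tone_letter


def encode_word(syllables, short=False):
    """Encode a Chinese word of one or two syllables."""
    if len(syllables) not in (1, 2):
        raise ValueError("a Chinese word has one or two syllables")
    return "".join(encode_syllable(*s, short=short) for s in syllables)
```

For instance, `encode_syllable("zh", "ong", 1, "abstract noun")` yields the four-letter full code `yoyt`, and the same call with `short=True` yields the three-letter short code `YST`.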
The pinyin Chinese word code input method using a computer keyboard is a sound-and-meaning coding method for Chinese characters. It takes the Chinese word as the coding unit and the pinyin Chinese word and the pinyin sentence-code as the coding forms, codes Chinese words and pinyin Chinese words one to one, takes the sentence-code and the pinyin Chinese word as input units, and takes the sentence-phrase and the Chinese word as output units. The method is as follows.
(1) The coding units are the Chinese word and the coding sentence-phrase. A coding unit made up of one or two Chinese characters is called a Chinese word. A Chinese word of one character is called a "single-character Chinese word"; a Chinese word of two characters is called a "two-character Chinese word"; where no distinction is needed, both are called "Chinese words". The mathematical definition of the Chinese word is $C^2 + C^1$, where $C = 0, 1, 2, 3, \ldots$ is the number of distinct Chinese characters, $C^1$ is the number of single-character Chinese words and $C^2$ is the number of two-character Chinese words. A Chinese word has exactly one meaning, called its "generic meaning", or "class meaning" for short. The mathematical model of the class meaning of a Chinese word is $H_1 = \log_2(C^2 + C^1)$, $C > 0$, where $H_1$ is the average information of the class meaning in bits, $C$ is the number of distinct Chinese characters, $C^1$ is the number of single-character class meanings and $C^2$ is the number of two-character class meanings. A Chinese word has a prescribed written form and meaning, and a space is typed between two Chinese words. A coding unit made up of two Chinese words is called a "coding sentence-phrase", or "sentence-phrase" for short. A sentence-phrase takes one of four forms: single-character + single-character, single-character + two-character, two-character + single-character, two-character + two-character.
(2) The coding forms are the pinyin Chinese word and the pinyin sentence-code. Pinyin Chinese word codes come in two forms, "full code" and "short code", each built from its own initials, finals and tone codes. Hanyu Pinyin has about 1300 distinct toned syllables, which are coded into about 8580 distinct toned codes; these 8580 codes are called "pinyin Chinese words". A pinyin Chinese word of one coded syllable is called a "single-syllable code", and one of two coded syllables a "double-syllable code"; where no distinction is needed, both are called "pinyin Chinese words". The mathematical definition of the pinyin Chinese word is $a^2 + a^1$, where $a = 0, 1, 2, 3, \ldots$ is the number of distinct coded syllables, $a^1$ is the number of single-syllable codes and $a^2$ is the number of double-syllable codes. A pinyin Chinese word has exactly one standard pronunciation, that of standard Mandarin. The mathematical model of its Mandarin pronunciation is $H_2 = \log_2(a^2 + a^1)$, $a > 0$, where $H_2$ is the average information of the Mandarin pronunciation in bits. With 8580 coded syllables, the total number of pinyin Chinese words is $7.362498 \times 10^7$, and their entropy, that is, the average information of the Mandarin pronunciation, is 26.134 bits. Pinyin Chinese words are separated by a space. A single-syllable code consists of three parts: initial, final and tone code. A double-syllable code consists of six parts: initial, final, tone code, initial, final, tone code. A coding form made up of two pinyin Chinese words is called a "pinyin sentence-code", or "sentence-code" for short. A sentence-code takes one of four forms: single-syllable + single-syllable, single-syllable + double-syllable, double-syllable + single-syllable, double-syllable + double-syllable.
(3) There are three basic rules for matching Chinese words to pinyin Chinese words: a two-character Chinese word always uses one double-syllable code; a single-character Chinese word uses one fixed single-syllable code; and a single-character Chinese word may instead use one fixed double-syllable code. There is one supplementary rule, namely the rule for ordering the tone letters of Chinese characters.
(4) The input units are the sentence-code and the pinyin Chinese word. With one space typed between two pinyin Chinese words, an input unit made up of two pinyin Chinese words is called an "input sentence-code", or "sentence-code" for short, and is followed by two presses of the space bar. If a single-syllable code is written as 1 and a double-syllable code as 2, a sentence-code has four possible shapes: "1+1", "1+2", "2+1" and "2+2". When the input unit is a single pinyin Chinese word, the word is typed and the space bar is pressed once.
(5) The output units are the sentence-phrase and the Chinese word. With one space between two Chinese words, an output unit made up of two Chinese words is called an "output sentence-phrase", or "sentence-phrase" for short, and is followed by two spaces. If a single-character Chinese word is written as 1 and a two-character Chinese word as 2, a sentence-phrase has four possible shapes: "1+1", "1+2", "2+1" and "2+2". When the output unit is a single Chinese word, the word is output followed by one space.
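The spacing convention in items (4) and (5) can be illustrated with a small parser. This is a sketch under assumptions, not the patent's procedure: the function names are hypothetical, code length and letter case are used to tell single-syllable from double-syllable codes (four or eight lower-case letters for full code, three or six upper-case letters for short code, as described later in this document), and a pair of words is treated as a sentence-code only when it is closed by exactly two spaces.

```python
import re


def syllable_count(code):
    """Return 1 or 2 for a single- or double-syllable code."""
    if code.islower() and len(code) in (4, 8):   # full code
        return len(code) // 4
    if code.isupper() and len(code) in (3, 6):   # short code
        return len(code) // 3
    raise ValueError(f"not a valid code: {code!r}")


def parse_units(typed):
    """Split typed text into input units.

    Each unit is (codes, shape): a lone pinyin Chinese word has shape
    "1" or "2"; a sentence-code has shape "1+1", "1+2", "2+1" or "2+2".
    """
    # Each token is (code, run of spaces that follows it).
    tokens = re.findall(r"(\S+)( *)", typed)
    units = []
    i = 0
    while i < len(tokens):
        word, gap = tokens[i]
        is_pair = (
            len(gap) == 1
            and i + 1 < len(tokens)
            and len(tokens[i + 1][1]) == 2
        )
        if is_pair:
            nxt = tokens[i + 1][0]
            shape = f"{syllable_count(word)}+{syllable_count(nxt)}"
            units.append(([word, nxt], shape))
            i += 2
        else:
            units.append(([word], str(syllable_count(word))))
            i += 1
    return units
```

With this sketch, typing `xina yoytxina` followed by two spaces and then `YST` followed by one space gives one "1+2" sentence-code and one single-syllable word.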
Once the above method and keyboard have been classified and restricted for specialised technical use, they are applicable to Chinese character information processing systems on large, medium, small and micro computers, Chinese teleprinters, Chinese computer typewriters, Chinese character terminals, all kinds of electronic printing and typesetting systems, information retrieval and file management, office automation systems, expert systems, translation systems, Chinese speech recognition and Chinese character pattern recognition systems, Chinese character information communication systems, advertising systems, telephone directory systems and public service systems.
Chinese text is always made up of distinct Chinese characters. Taking the 6763 distinct characters of GB2312-80, a total of $4.5744932 \times 10^7$ distinct Chinese words can be constructed; this is the set of unique Chinese words. The entropy of each Chinese word, that is, its average information, is 25.447 bits, computed as follows.
When $c = 6763$,
$c^2 + c^1 = 6763^2 + 6763^1 = 4.5744932 \times 10^7$ (words)
$H_1 = \log_2(c^2 + c^1) = \log_2(4.5744932 \times 10^7) = 25.447$ (bits)
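The arithmetic above can be checked directly. The snippet below is a minimal sketch; the function names are illustrative and not part of the patent.

```python
import math


def chinese_word_count(c):
    """Number of one- and two-character Chinese words over c characters."""
    return c ** 2 + c


def entropy_bits(count):
    """Average information, in bits, of one item among `count` equally likely items."""
    return math.log2(count)


total = chinese_word_count(6763)       # characters in GB2312-80
bits = round(entropy_bits(total), 3)   # matches the 25.447 bits quoted above
```

Here `total` is 45,744,932, which is the $4.5744932 \times 10^7$ of the text.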
The number of possible Chinese words is very large, but the number actually used in modern Chinese is small. It can be estimated by comparison with the number of grammatical words in Chinese. Formally, all disyllabic grammatical words can be regarded as two-character Chinese words and all monosyllabic ones as single-character Chinese words; words of three, four, five or more syllables can be segmented into two-character and single-character Chinese words; grammatical phrases of two characters are all two-character Chinese words; and some further Chinese words can be set against grammatical words, as in the example sentences of the embodiment. The number of Chinese words actually in use is noticeably larger than the number of grammatical words. From the number of modern general-purpose grammatical words, the inventor estimates that the general-purpose Chinese words of modern Chinese number about 60,000 and cover 99% of Chinese text; of these, the most common ones, covering 95% of Chinese text, number about 12,000.
A single space between Chinese words is enough. To code Chinese characters, the text is first segmented into Chinese words and then entered into the computer through this coding method. The computer can output either Chinese words written with word spacing or characters written without word spacing, but Chinese words are preferable. Writing with word spacing brings great convenience and benefit to the various applications of Chinese character information processing, and its importance cannot be overstated.
Explanation of the mathematical definition of the Chinese word. A Chinese word is an arrangement, with repetition, of distinct Chinese characters. See Fig. 4, which shows the arrangements with repetition of three distinct characters, rendered here as "letter" (信), "breath" (息) and "opinion" (论). From the formula for the number of arrangements with repetition, $m^n$, together with addition, the total number of Chinese words can be calculated; calculating this total is the mathematical definition of the Chinese word. As the illustration of the principle in Fig. 4 shows, three distinct single-character Chinese words give 12 distinct Chinese words in all (3 single-character and 9 two-character). Modern Chinese actually uses 4 of them, namely "letter", "breath", "opinion" and "information" (信息); the remaining 8 two-character Chinese words are reserved. The reason for keeping them in reserve is simple: before information theory arose, nobody used the Chinese word "information", and now it is used everywhere.
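The Fig. 4 count can be reproduced by enumerating the arrangements. This is a small sketch only: the function name is hypothetical, and the three characters 信, 息 and 论 are an assumption restored from the translated glosses ("letter", "breath", "opinion", "information"), since the figure itself is not reproduced in this text.

```python
from itertools import product


def chinese_words(chars):
    """All one-character and two-character arrangements (with repetition)."""
    singles = list(chars)
    doubles = ["".join(pair) for pair in product(chars, repeat=2)]
    return singles + doubles


words = chinese_words("信息论")
in_use = [w for w in words if w in {"信", "息", "论", "信息"}]
reserved = [w for w in words if w not in in_use]
```

The enumeration gives $3^1 + 3^2 = 12$ Chinese words, of which 4 are in use and 8 are reserved, as stated above.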
The mathematical definition of the Chinese word lets both the computer and the ordinary user grasp Chinese words as a whole and describe their features quantitatively, which is very useful for solving the problems of Chinese character information processing and coding. To give another example, take "the opening and stopping of the communication function" as one sentence. It uses 10 distinct Chinese characters and 6 Chinese words. Because the invention stipulates that a Chinese word has exactly one meaning, its class meaning, the methods of information theory and the mathematical model of the class meaning of a Chinese word yield a mathematical model of the class meaning of a sentence: $H_3 = \log_2 (c^2 + c^1)^n$, $c \geq 1$, where $H_3$ is the average information of the class meaning of the sentence in bits and $n$ is the number of Chinese words used in the sentence; the other symbols are as in the model for a single Chinese word.
The class meaning of the sentence "the opening and stopping of the communication function", that is, the average information of its meaning, is obtained with $c = 10$ and $n = 6$: $H_3 = \log_2 (c^2 + c^1)^n = \log_2 (10^2 + 10^1)^6 = 6 \times 6.781 = 40.686$ bits.
For English words and Chinese grammatical words, a similar calculation would be very difficult. The mathematical definition of the Chinese word, the mathematical model of its class meaning, and the mathematical model of the class meaning of a sentence will provide a good working environment for third-generation Chinese character coding input methods and for Chinese character information processing.
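The sentence-level figure can be checked the same way. The sketch below uses a hypothetical function name and relies on the identity $\log_2 x^n = n \log_2 x$; note that the text rounds $\log_2 110$ to 6.781 before multiplying, so the unrounded product differs slightly in the third decimal place.

```python
import math


def sentence_information(c, n):
    """Average information, in bits, of a sentence of n Chinese words
    drawn from c distinct characters: log2((c**2 + c)**n)."""
    return n * math.log2(c ** 2 + c)


per_word = round(math.log2(10 ** 2 + 10), 3)   # 6.781, as in the text
unrounded = sentence_information(10, 6)        # about 40.688 bits
```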
Explanation of the mathematical definition of the pinyin Chinese word. The mathematical definition of the pinyin Chinese word does not differ in any fundamental way from that of the Chinese word; only the notation and the quantity differ. The pinyin Chinese word uses coded syllables and is a speech-based coding form, and the totals of Chinese words and pinyin Chinese words differ considerably. Because pinyin Chinese words can be read aloud, their pronunciation can be described quantitatively. The invention stipulates that a pinyin Chinese word has exactly one pronunciation and that different pinyin Chinese words have different pronunciations; if two different pinyin Chinese words sound the same, that is, they are homophones with different written forms, they still count as different pronunciations. The information carried by the pronunciation of a pinyin Chinese word is computed in exactly the same way as the information of its class meaning. If the number of distinct Chinese characters equals the number of distinct coded syllables, the information is the same, which accords with common sense. The mathematical model of the Mandarin pronunciation of the pinyin Chinese word, $H_2 = \log_2(a^2 + a^1)$, $a > 0$, provides a method for speech-input recognition and speech synthesis of Chinese characters. A coded sentence made up of pinyin Chinese words, that is, a Mandarin speech sentence, is computed in the same way as the class meaning of a sentence: in $H = \log_2 (c^2 + c^1)^n$, $c > 0$, $n > 0$, the symbol $c$ is replaced by $a$, and $n$ denotes the number of pinyin Chinese words in the speech sentence.
Using coded syllables does not change the pronunciation of Mandarin. The invention does not use neutral-tone syllables: any character read in the neutral tone is marked with its original tone, and if the original tone cannot be found in a small dictionary, the falling tone is used instead.
The key to coding the coded syllables. See Fig. 1, the tone-letter table. Row 1 consists of the tone letters "s, m, g, a", which represent the high-level, rising, falling-rising and falling tones respectively. Combined with the initials and finals of the invention, the four tone letters of row 1 can code 1300 distinct coded syllables, which is equivalent to using the four tone marks of Hanyu Pinyin with initials and finals to construct 1300 distinct syllables. Applying the method of row 1 repeatedly gives rows 2 to 7. Rows 1 to 6 together can code $6 \times 1300 = 7800$ distinct coded syllables. Row 7 is a special case: one tone letter, "Z", represents both the high-level and the rising tone, and one tone letter, "Y", represents both the falling-rising and the falling tone. The tone proportions of the GB2312-80 level-1 characters are approximately 0.25 for the high-level tone, 0.23 for the rising tone, 0.17 for the falling-rising tone and 0.35 for the falling tone. Taking the larger value in each pair, 0.25 and 0.35, gives $0.25 + 0.35 = 0.6$, that is, $1300 \times 0.6 = 780$, so the two tone letters "Z" and "Y" can code 780 distinct coded syllables. Hence $7800 + 780 = 8580$, which is where the figure of 8580 distinct coded syllables comes from. From the definition of the pinyin Chinese word, there are 8580 distinct single-syllable codes and $8580 \times 8580 = 7.36164 \times 10^7$ distinct double-syllable codes. The total number of pinyin Chinese word codes is therefore $8580 + 8580^2 = 7.362498 \times 10^7$. These $7.362498 \times 10^7$ pinyin Chinese words are the key technique that frees the coding of duplicate codes. Because the total exceeds seventy million, far more than is needed to resolve the duplicate-code problem, the invention stipulates that only the tone letters of rows 1 to 6 of Fig. 1 are used, and the tone letters of row 7 are held in reserve.
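The counting argument above can be replayed step by step. The following sketch only restates the figures quoted in the text; the variable names are illustrative, and the tone proportions are kept as integer percentages to avoid floating-point rounding.

```python
import math

TONED_SYLLABLES = 1300                     # distinct toned pinyin syllables

rows_1_to_6 = 6 * TONED_SYLLABLES          # six full rows of tone letters
# Row 7: "Z" covers tones 1 and 2, "Y" covers tones 3 and 4; the text takes
# the larger share of each pair, 25% and 35% of the level-1 characters.
row_7 = TONED_SYLLABLES * (25 + 35) // 100

coded_syllables = rows_1_to_6 + row_7      # single-syllable codes
double_codes = coded_syllables ** 2        # double-syllable codes
total_codes = coded_syllables + double_codes

entropy = round(math.log2(total_codes), 3)  # bits per pinyin Chinese word
```

The result, 73,624,980 codes and 26.134 bits, agrees with the totals given earlier in the description.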
The coded syllables use 26 initials in all; see Fig. 2, the initial table. The five vowel initials "a, i, e, o, u" appear only in the initial position and are silent. Because the invention does not allow a coded syllable without an initial, when a syllable has only a final and no initial, the first letter of the final must be written once more, so that every coded syllable has an initial. Since the first letter of a final is always a vowel, the invention simply adds five silent vowel initials to the 21 initials of the Hanyu Pinyin initial table; in use, the initial table of the coded syllables does not differ from that of Hanyu Pinyin.
The coded syllables use 38 finals in all; see Fig. 3, the final table. Compared with the final table of Hanyu Pinyin, and apart from the fact that most finals are written differently, there are four differences. First, the Hanyu Pinyin final table in general dictionaries lists 35 finals and excludes the final er, which the invention includes in the table. Second, to balance the initials and finals, ng, which Hanyu Pinyin does not list in its initial table, is used by the invention as a final and listed in the final table; its pronunciation and function are unchanged. Third, the invention adds one silent final, which has a written form but no pronunciation and serves as the final of Mandarin characters that have no final (the two characters shown in the original as images C9711331300101 and C9711331300102), so that when any Chinese character in a text is coded with the invention, its coded syllable consists of three parts, initial, final and tone code, without exception. Fourth, the invention merges the Hanyu Pinyin final "ê" into the final "ei".
The full code of the pinyin Chinese word uses lower-case English letters: a single-syllable code consists of four letters and a double-syllable code of eight. The short code uses capital English letters: a single-syllable code consists of three letters and a double-syllable code of six. The coding form of the pinyin Chinese word is therefore fixed. Judging by the number of letters alone, a pinyin Chinese word cannot be confused with an English word or a word of another Western language, nor with a Hanyu Pinyin word, and the boundaries of the coded syllables cannot be confused either. Pinyin Chinese words are best read in standard Mandarin, but they can also be read in non-standard Mandarin or with the user's own pronunciation. The pinyin Chinese word is a coding form for Chinese characters, not a Hanyu Pinyin word, so whether the pronunciation is standard does not affect normal use.
From " meaning " of Fig. 5, " they " of Fig. 6, as can be seen, the alliteration joint approximately is disyllabic 49 times of the Chinese phonetic alphabet, the monophone joint approximately is monosyllabic 7 times of the Chinese phonetic alphabet." meaning, contrary opinion, objection, discrepancy, free translation, radiating power and vitalitys, thriving, sparking " maximum with Chinese Homophone are example, use the Chinese phonetic alphabet to write, and having only a kind of literary style " y ì y ì " repeated code is eight.Use the present invention, only used eight of the codings of alliteration joint, do not have repeated code.For general two Chinese characters, the sum of the two Chinese characters of unisonance, surpass six be minority, the two Chinese characters of the unisonance of " meaning " for example above-mentioned are eight, and the two Chinese characters of general unisonance will reach 36, are impossible, even ancient times, modern times, following all counting in, possibility is minimum, and the two Chinese characters of general unisonance will reach 49, and is impossible especially.Certainly, two Chinese characters that Chinese person name, place name, scientific and technological specialized vocabulary etc. are used, and foreigner's name, place name, scientific and technological specialized vocabulary translate into two Chinese characters of Chinese back use, belongs to the specific question of Chinese speech and Chinese speech of the present invention, according to user's requirement, the inventor will handle in addition.
By analogy, once two-character words encoded as double sound sections have no duplicate codes, the third basic encoding rule guarantees, from a technical standpoint, that the whole Chinese character encoding has no duplicate codes. Even if there are 100,000 distinct Chinese characters and all of them are encoded as double sound sections, this uses only 100,000 distinct double sound sections, a tiny fraction of the more than seventy million available. The disyllables of pinyin have 1300 × 1300 = 1.69 × 10^6 distinct written forms, but pinyin disyllables have no ability to handle homophones, that is, duplicate codes.
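The counts quoted above can be reproduced with a few lines of arithmetic. This sketch takes the figure of about 8580 distinct sound sections from claim 5 and the figure of about 1300 toned pinyin syllables from the text; the function name `count_codes` is illustrative only.

```python
def count_codes(sections: int):
    """Return (single, double, total) counts for a given number of
    distinct sound sections: a, a**2 and a**2 + a."""
    single = sections
    double = sections ** 2
    return single, double, single + double


single, double, total = count_codes(8580)      # sound sections of the invention
_, pinyin_double, _ = count_codes(1300)        # toned pinyin syllables

print(total)                   # 73624980, i.e. 7.362498 x 10^7 as stated in claim 5
print(pinyin_double)           # 1690000, i.e. 1.69 x 10^6
print(single / 1300)           # ratio of single sound sections to pinyin syllables
print(double / pinyin_double)  # ratio of double sound sections to pinyin disyllables
```

With these inputs the two ratios come out at roughly 6.6 and 43.6, which the description rounds to "about 7" and "about 49" (7 × 7).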
Explanation of sentence phrases and sentence words. "Encoding sentence phrases" and "output sentence phrases" have the same form; one is used when segmenting before encoding and the other in computer output, so they share the same short name. "Pinyin sentence words" and "input sentence words" share a short name for the same reason. A sentence phrase is a three-character or four-character expression in Chinese characters; a sentence word is its encoding in three or four sound sections (a sound section is equivalent to a syllable). The main purposes of sentence phrases are as follows. First, to solve the duplicate-code problem of Chinese character encoding: when a Chinese character has a duplicate code, it is read and encoded within a three-character phrase, and inputting the three-section sentence word resolves the duplicate. The forms most used for sentence phrases and sentence words are the three-section forms "1+2" and "2+1"; because the double sound section of the present invention (the "2") has no duplicate codes, the four-character form "2+2" has no duplicate-code problem. Second, to make the meaning more definite; for example, the Chinese "Three Character Classic" and four-character idioms each express a definite meaning or story. Third, to let sentence phrases and sentence words serve as a sentence-processing method, providing the condition under which the Chinese words and sentence words prepared in pinyin are, once input into the computer, automatically converted into Chinese words and sentence phrases for output. Fourth, to make statements clearer and more coherent. Fifth, to make it easier to segment Chinese words out of a statement.
Example 1: A universal joint is a very dexterous mechanical device.
Figure C9711331300111
A1 A2 A3 A4 A5 A11 A12 A21 A22 A41 A42 A61 A62. "A1, A2, A4, A5" denote sentence phrases; "A3, A11, A12, A21, A22, A41, A42, A51, A52" denote Chinese words. The segmentation result is as follows:
A universal joint is a very dexterous mechanical device.
The above method is called "sentence-phrase segmentation". The present invention stipulates that a sentence phrase must be cut into two Chinese words, and only two. Because two spaces follow a sentence phrase, in writing a sentence phrase has a formal marker just as a Chinese word does, which brings much convenience to automatic word segmentation by computer. Two sentence phrases are called a "super sentence phrase", two super sentence phrases a "sub-statement", two sub-statements a "statement", two statements a "super statement", and so on; units can always be combined in twos as required. Although super sentence phrases, statements and the like have no formal marker, as an algorithm they will bring convenience to natural language understanding, machine translation and the like.
Example 2: He holds an objection to the meaning of this incident.
Segmenting Example 2 with sentence-phrase segmentation feels quite awkward. If Example 2 is rewritten as "He holds an objection regarding the meaning of this incident", one character has been added and the segmentation feels somewhat smoother, but still not smooth. If Example 2 is rewritten as "He holds an objection regarding the meaning produced by this incident", the segmentation is smoother and the statement is also clearer and more coherent. This shows that sentence phrases can help the user express things more clearly in writing and have a rhetorical effect. Although Example 2 does not read smoothly, this has no adverse effect on explaining how to encode.
How the present invention solves the duplicate-code problem of Chinese character encoding.
The user learns the Chinese word encoding of the present invention for the 3755 characters of the GB2312-80 table of commonly used Chinese characters, or for all 6763 characters in GB2312-80. Double-sound-section Chinese words of the present invention have no duplicate codes, sentence words have no duplicate codes, and the present invention provides tolerant codes for all commonly used Chinese characters. During keyboard input, a code is guaranteed to have no duplicates as long as it falls under one of the following. First, the character is among those given the first round of six section-tone sequence numbers within its homophone group. Second, the input uses a double-sound-section Chinese word of the present invention. Third, the input uses a sentence word. Where a duplicate code does occur, it is resolved with the third basic encoding rule, namely that a single-character Chinese word is always encoded with one fixed double sound section.
Explanation of the keyboard layout of the encoding. The full-code keyboard uses only three compressed codes: zh is represented by y, ch by w, and sh by v. The full code uses the Qwerty keyboard; because there are only three compressed codes, the full-code keyboard is not drawn separately.
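The three compressed codes can be written down as a lookup table. The sketch below combines them with the 21 standard pinyin initials and the five "silent" initials a, i, e, o, u listed in claim 1; the names `key_for_initial` and `ALL_KEYS` are illustrative and not part of the patent.

```python
# The 21 initials of pinyin.
PINYIN_INITIALS = [
    "b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
    "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s",
]
# The three compressed codes of the full-code keyboard.
COMPRESSED = {"zh": "y", "ch": "w", "sh": "v"}
# The five silent initials named in claim 1.
SILENT_INITIALS = ["a", "i", "e", "o", "u"]


def key_for_initial(initial: str) -> str:
    """Return the single keyboard letter used for an initial."""
    if initial in COMPRESSED:
        return COMPRESSED[initial]
    if initial in PINYIN_INITIALS or initial in SILENT_INITIALS:
        return initial
    raise ValueError(f"unknown initial: {initial!r}")


# Every initial maps to its own key, and together they use all 26 letters.
ALL_KEYS = sorted(key_for_initial(i) for i in PINYIN_INITIALS + SILENT_INITIALS)
```

Replacing zh, ch and sh by y, w and v is what lets 26 initials occupy exactly the 26 letter keys, one per key.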
" brevity code keyboard " is keyboard special of the present invention referring to Fig. 7, and the key-bit code of brevity code, initial consonant are referring to Fig. 2, and simple or compound vowel of a Chinese syllable is referring to Fig. 3, and joint transfers letter referring to Fig. 1.Key-bit code among Fig. 7, following left side of face is all-key simple or compound vowel of a Chinese syllable and initial consonant, is the Chinese phonetic alphabet in the bracket of right side."/" expression does not have corresponding code.
Explanation of the auxiliary rule of the Chinese word encoding of the present invention. The auxiliary rule is in fact part of the three basic encoding rules, all of which must use it; it is listed separately here only for convenience of description. The auxiliary rule is the method by which the six section-tone letters of one tone are matched, in order, to homophonous Chinese characters; it is called "ordering" for short.
Character-frequency ordering assigns the codes in order of the relative frequency with which Chinese characters are used in modern written Chinese. This method is fairly simple, but it has little regularity and the user has a great deal to memorize.
Character-meaning ordering. The inventor holds that a Chinese character expresses either a "name" or an "action"; meanings of the "name" class are called "nouns" and meanings of the "action" class are called "verbs". Although a character has many meanings, it always has one basic meaning. It is stipulated that a single-character Chinese word expresses only the basic meaning; the other meanings of the character are expressed by two-character Chinese words. For example, the basic meaning of the character "hit" is "to strike an object with the hand or an implement"; the single-character Chinese word is a "verb", or more finely an "action verb". The other meanings of "hit" always occur together with another character, that is, they can only be expressed by two-character Chinese words, such as:
"hired thug" (noun), concrete noun,
"hit a person" (verb), action verb,
"dismiss" (verb), process verb,
"look up and down" (verb), stative verb,
"plan" (verb), stative verb, and so on.
The basic meanings of single-character Chinese words are divided into two major classes, which are subdivided into six classes of basic meaning; although many words share a class meaning, this does not affect use. It is stipulated that a Chinese word has only one meaning, so that "class meaning" becomes computable; this brings convenience to Chinese character information processing and Chinese character encoding. The main drawback of character-meaning ordering is that there is a great deal to memorize.
Supplementary notes on the six class meanings: "equivalent to" below always refers to the word classes of Chinese grammar.
Referring to Fig. 1:
Sequence number 1: concrete noun, equivalent to the concrete nouns among nouns.
For example: person, mountain, water
Sequence number 2: abstract noun, equivalent to the abstract nouns among nouns.
For example: friend, thought, politics
Sequence number 3: pronoun-time noun, equivalent to pronouns, numerals and measure words, and nouns of time, place and direction, etc.
For example: he, year, second, last, eastern, it, with.
Sequence number 4: action verb, equivalent to most verbs.
For example: hit, put, write
Sequence number 5: stative verb, equivalent to some verbs and all adjectives.
For example: be, large, small, good, fast, slow
Sequence number 6: process verb, equivalent to some verbs and to adverbs, prepositions, particles, conjunctions and interjections.
For example: float, flow, very, all,, to,,, get,,, cross and, breathe out.
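The six sequence numbers above, together with the section-tone letters listed in claim 4, form a small two-way lookup table. The sketch below assumes the letter order given in claim 4 (seven letters per tone, the seventh being "standby"); the English tone and class names are this sketch's own renderings, and the function names are illustrative.

```python
# Section-tone letters per tone, in class-meaning order (from claim 4).
TONE_LETTERS = {
    "high-level": "stuvwxz",
    "rising": "mnopqrz",
    "falling-rising": "ghijkly",
    "falling": "abcdefy",
}
# Class meanings for positions 1..7 (sequence numbers 1..6 plus standby).
CLASS_MEANINGS = [
    "concrete noun", "abstract noun", "pronoun-time noun",
    "action verb", "stative verb", "process verb", "standby",
]


def section_tone_letter(tone: str, class_meaning: str) -> str:
    """Letter for a given tone and class meaning."""
    return TONE_LETTERS[tone][CLASS_MEANINGS.index(class_meaning)]


def describe_letter(letter: str):
    """All (tone, class meaning) pairs a section-tone letter can stand for.

    The standby letters z and y are shared by two tones, so they
    return two pairs; every other letter returns exactly one.
    """
    return [
        (tone, CLASS_MEANINGS[letters.index(letter)])
        for tone, letters in TONE_LETTERS.items()
        if letter in letters
    ]
```

For instance, the final letter of the brevity code "TAU" used later for "he" is u, which this table reads as high-level tone, pronoun-time noun.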
The class meaning of a single-character Chinese word sometimes changes with the class meaning of the two-character word it belongs to. For example, the single-character Chinese word " " is a process verb and "life" is a process verb, but the two-character Chinese word "student" is a concrete noun. Reflecting this process of change in word meaning, "character-meaning ordering" is also called "form coding".
In addition, character-meaning ordering has exceptions. For example, "he, she, it" should all belong to sequence number 3, pronoun-time noun, but for convenience of use it is stipulated that "he" is a pronoun-time noun, "she" an abstract noun and "it" a concrete noun. Similar cases include " ", "get", " " and so on. Making special stipulations for exceptions is obviously a shortcoming; fortunately such special characters are few.
Compared with the prior art, the main advantages of the present invention are:
1. The Chinese word encoding of the present invention technically achieves Chinese character encoding without a single duplicate code, on the premise that the codes can be read aloud. This creates the conditions for popularizing computer applications.
2. The readability of the Chinese word encoding of the present invention is widely adaptable: anyone who can speak everyday Chinese or Mandarin can use it.
3. There are just three basic encoding rules; from the encoding of one Chinese character to the encoding of all Chinese characters, the same three basic encoding rules apply.
4. The mathematical definitions and mathematical models of Chinese characters and of the Chinese word of the present invention, and of the sound and meaning of Chinese words and sentences, will provide methods for solving the various application problems of Chinese character information processing.
5. The mathematical definitions and mathematical models of the Chinese word of the present invention and of its sound and meaning show that the Chinese word of the present invention is better suited to computer processing than English.
The drawings of the present invention are as follows:
Fig. 1, section-tone alphabet (class-meaning alphabet);
Fig. 2, table of initials;
Fig. 3, table of finals;
Fig. 4, illustration of the Chinese word principle;
Fig. 5, pinyin Chinese word code table for "meaning";
Fig. 6, pinyin Chinese word code table for "they";
Fig. 7, brevity code keyboard layout.
Specific embodiment, with reference to the drawings:
When using the pinyin Chinese word code, Chinese words must first be segmented out of the Chinese text. Segmenting Chinese words can be regarded as a process of rhetoric carried out with single characters and pairs of characters. Therefore, apart from a similarity in form, Chinese words have no relation to grammatical words. The "definition of the Chinese word" is the basic method for segmenting Chinese words; Fig. 4 shows the basic principle, and Example 1 contrasts the segmentation results for grammatical words and for Chinese words.
Example 1: ① universal joint / is / one / kind / very / dexterous / [particle] / mechanical device. (grammatical-word segmentation)
② A universal joint is a very dexterous mechanical device. (segmentation by the definition of the Chinese word)
From ① and ② of Example 1 the difference between grammatical words and Chinese words can be seen directly. The main problem with grammatical words is that the definition of a word is hard to pin down, which makes segmentation difficult, whereas the definition of the Chinese word is simple and clear: segmentation proceeds by single characters and pairs of characters. Because segmenting Chinese words depends on the individual's level of rhetoric, the operator must be a native speaker of Chinese with at least a junior middle school education.
For the same Chinese text, the Chinese words segmented out by different people are generally the same. People always tend to want the best rhetorical expression and the best segmentation, and under the same cultural background people's ways of thinking and their sense of what is "good" or "bad" are also generally the same. Exceptions are normal. Different segmentations can be regarded as differences in rhetorical level or in expression; the differing Chinese words may be regarded as innovation, as dross, as non-standard, and so on, and all of these may occur. Generally speaking, there is only one good segmentation result, while bad and mediocre results are many and varied, and innovation and dross are always very rare.
After the Chinese words have been segmented, they can be encoded as Chinese words of the present invention; see Figs. 1 to 6. For ease of understanding, the inventor first gives the example sentence written as Chinese grammatical words and as pinyin words, and then gives the Chinese words and the pinyin Chinese word code. Example 2 is an example sentence using the full code with character-frequency ordering.
Example 2:
① he / to / this / matter / item / [particle] / meaning, hold / have / objection. (Chinese grammatical words)
② Tā duì gāi shìjiàn de yìyì, chí yǒu yìyì. (pinyin words)
③ He holds an objection to the meaning of this incident. (Chinese words)
④ Taisduca gaks vihdjimb defa iihbiiha, wihmiidg iihciihd.
1 1 1 4 2 1 2 1 1 1 3 [4]
(pinyin Chinese word full code, character-frequency ordering)
In ④ of Example 2, the Arabic numerals 1, 2, 3, 4 below the Chinese words (5 and 6 do not occur here) correspond one to one with the section-tone letters of sequence numbers 1 to 6 in Fig. 1, and with the first six homophonous characters of each syllable among the 3755 characters of the table of commonly used modern Chinese characters in GB2312-80. The correspondence is stipulated by the inventor according to character frequency; see the numeral at the lower right of each character in the homophone table of Example 2 below. The bracketed Arabic numerals [1], [2], [3], [4], [5], [6] below the Chinese words likewise correspond one to one with the section-tone letters of sequence numbers 1 to 6 in Fig. 1, and with the seventh and later homophonous characters of each syllable among the 3755 characters of the GB2312-80 table of commonly used characters; this correspondence is also stipulated by the inventor, see the numeral at the lower right of each character in the homophone table of Example 2 below. The [4] used in ④ of Example 2 is marked only for convenience of learning and explanation; once the Chinese word of the present invention has been learned, it need not be marked.
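The relation between a full-code section and the numerals printed below it can be made concrete with a small decoder. This is a sketch only: the final table is transcribed from claim 2 and the section-tone letters from claim 4, the English tone names are this sketch's own renderings, and the function names are hypothetical.

```python
# Full-code finals (claim 2): two-letter code -> candidate pinyin finals.
FULL_FINALS = {
    "eh": ["er"], "al": ["a"], "oj": ["o"], "ef": ["e"], "ak": ["ai"],
    "ec": ["ê", "ei"], "ag": ["ao"], "od": ["ou"], "am": ["an"],
    "en": ["en"], "at": ["ang"], "eb": ["eng"], "oy": ["ong"],
    "ih": ["i"], "il": ["ia"], "if": ["ie"], "ig": ["iao"], "id": ["iou"],
    "im": ["ian"], "in": ["in"], "it": ["iang"], "ib": ["ing"],
    "iy": ["iong"], "uh": ["u"], "ul": ["ua"], "uj": ["uo"], "uk": ["uai"],
    "uc": ["uei"], "um": ["uan"], "un": ["uen"], "ut": ["uang"],
    "ub": ["ueng"], "oh": ["ü"], "of": ["üe"], "om": ["üan"], "on": ["ün"],
    "ob": ["ng"], "ot": [""],  # "ot" is the silent final
}
# Section-tone letters (claim 4); position 1..6 is the sequence number.
TONE_LETTERS = {
    "high-level": "stuvwxz", "rising": "mnopqrz",
    "falling-rising": "ghijkly", "falling": "abcdefy",
}
COMPRESSED_INITIALS = {"y": "zh", "w": "ch", "v": "sh"}
SILENT_INITIALS = set("aieou")


def decode_full_section(code: str):
    """Decode one 4-letter full-code section into
    (initial, candidate finals, candidate tones, sequence number)."""
    if len(code) != 4:
        raise ValueError("a full-code sound section has four letters")
    init_key, final_key, tone_key = code[0], code[1:3], code[3]
    if init_key in SILENT_INITIALS:
        initial = ""  # silent initial: the syllable starts with its final
    else:
        initial = COMPRESSED_INITIALS.get(init_key, init_key)
    finals = FULL_FINALS[final_key]
    tones = [t for t, letters in TONE_LETTERS.items() if tone_key in letters]
    if not tones:
        raise ValueError(f"not a section-tone letter: {tone_key!r}")
    position = TONE_LETTERS[tones[0]].index(tone_key) + 1
    return initial, finals, tones, position


def decode_full_word(code: str):
    """Decode a 4- or 8-letter full code section by section."""
    return [decode_full_section(code[i:i + 4]) for i in range(0, len(code), 4)]
```

Applied to "vihdjimb" from ④, the two sections decode to shi and jian, both in the falling tone, with sequence numbers 4 and 2, which matches the numerals printed under that word. A plain and a bracketed numeral share the same letter, so the decoder cannot tell them apart.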
The following uses the pinyin Chinese word brevity code with character-meaning ordering, still on the sentence of Example 2.
⑤ He holds an objection to the meaning of this incident.
⑥ TAU DTF GLX VIBJYC DEF IIBIIE, WIPIWK IIFIIA.
3 6 6 2 3 6 [2] [5] 4 5 6 1 (pinyin Chinese word brevity code, character-meaning ordering)
"This incident" and "holds an objection" in ⑤ are sentence phrases; the others are Chinese words. Two spaces follow a sentence phrase; when a punctuation mark follows a sentence phrase, one space is added before the punctuation mark to indicate that what precedes it is a sentence phrase.
"GLX VIBJYC" and "WIPIWK IIFIIA" in ⑥ are sentence words; the others are Chinese words of the present invention. The rule for spaces is the same as for the sentence phrases in ⑤.
In the homophone table of Example 2, the label at the front of each row is the pinyin; the numeral at the lower right of each character is its sequence number under character-frequency ordering, and the numeral below each character is its sequence number under character-meaning ordering.
The phrase "stipulated by the inventor" used for ④ of Example 2 becomes, for ⑥ of Example 2, "determined by the basic meaning of the character, that is, by its class meaning". The basic meaning of a character can be looked up in the "Modern Chinese Dictionary" or is provided by the inventor.
When using the brevity code of the present invention, if a sound section has no initial, the first letter of the final, meaning the first letter of the full code, is written once more. For example, for the character "watt" (pinyin "wa"), the full code is "uulg" and the brevity code is "UGG". The brevity code cannot be written as "CCG".
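A brevity-code section has three letters, so a Chinese word in brevity code splits into one or two groups of initial key, final key and section-tone letter. The sketch below uses the final table transcribed from claim 3, in which several keys stand for more than one final, so the decoder returns a list of candidates rather than a single final. The function name `decode_brevity` is illustrative, and the section-tone letter is returned undecoded.

```python
# Brevity-code finals (claim 3): one key -> candidate pinyin finals.
BREVITY_FINALS = {
    "Q": ["er", "ia", "ot"], "W": ["iou"], "E": ["e"], "R": ["üan", "uan"],
    "T": ["üe", "uei"], "Y": ["ian"], "U": ["u"], "I": ["i"],
    "O": ["o", "uo"], "P": ["ün", "uen"], "A": ["a"], "S": ["iong", "ong"],
    "D": ["iang", "uang"], "F": ["en"], "G": ["eng", "ueng"], "H": ["ang"],
    "J": ["an"], "K": ["ao"], "L": ["ai"], "Z": ["ei", "ê"], "X": ["ie"],
    "C": ["ü", "ua"], "V": ["iao"], "B": ["ou"], "N": ["in", "ng"],
    "M": ["ing", "uai"],
}
COMPRESSED_INITIALS = {"Y": "zh", "W": "ch", "V": "sh"}
SILENT_INITIALS = set("AIEOU")


def decode_brevity(code: str):
    """Split a 3- or 6-letter brevity code into sections of
    (initial, candidate finals, section-tone letter)."""
    if len(code) not in (3, 6) or not (code.isascii() and code.isalpha() and code.isupper()):
        raise ValueError("expected 3 or 6 uppercase letters")
    sections = []
    for i in range(0, len(code), 3):
        init_key, final_key, tone_key = code[i:i + 3]
        if init_key in SILENT_INITIALS:
            initial = ""  # silent initial
        else:
            initial = COMPRESSED_INITIALS.get(init_key, init_key.lower())
        sections.append((initial, BREVITY_FINALS[final_key], tone_key.lower()))
    return sections
```

For the sentence word "GLX VIBJYC" in ⑥, `decode_brevity("GLX")` yields g + ai and `decode_brevity("VIBJYC")` yields sh + i followed by j + ian. Where a key lists two or three finals, choosing among them needs knowledge of which initial and final combinations occur in Chinese, which this sketch does not model.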
Homophone table of Example 2:
1/t ā collapse 4 he 1 it 3 she 2
4 3 1 2
Example 3:
2/du ì converts 3 teams 2 pairs 1
4 1 6
3/g ā i this 1
6
4/sh ì formula 6 shows that [5] scholar [1] generation [2] persimmon [1] thing 4 wipes away [3] oaths [5] [5] gesture [2] that dies
1 4 [1] [1] [1] 2 [4] [4] [6] [5]
Be 1 to have a liking for suitable [5] bodyguard [6] of [6] divination by means of the milfoil [6] and wait upon [2] and release [3] decorations [4] family name 5 cities 2
5 [5] [4] [5] [1] [6] 6 [5] [2] 3
Rely on [6] chamber 3 and look [3] examinations [1]
[5] [3] [4] [6]
(The character "Shi", as in the name Su Shi, used in Example 3 belongs to the less commonly used characters of GB2312-80; the inventor assigns it the sequence number [2].)
5/ji à n recommends [2] sill [4] mirror [1] and tramples [5] low-priced [5] and see 1
5 [1] [1] [4] [5] 4
62 strong [6] warship [1] swords 5 of key [2] arrow
1 [1] 3 [5] [1] [1]
Giving a farewell dinner [3] gradually 4 spatters [4] ravine [5] and builds 3
[4] [5] [4] [1] 6
1 (according to regulation of the present invention, Chinese character spends several accent and substitutes as can not find out this accent of Chinese character on small-sized dictionary softly, and de is write out into de) of 6/de
7/y ì skill 4 presses down [2] easily [5] city [1] towering like a mountain peak [4] hundred million [6]
2 4 [6] [1] [5] 3
[6] ease [5] also [1] descendants [2] of [6] epidemic disease [6] that study subjectively
[1] [5] [6] [1] [5] [1]
[3] adopted 1 benefits [1] overflow [4] are recalled in 2 firm [3] of anticipating
[2] [5] [6] 5 6 [4]
Call on [2] view [2] friendship [4] and translate [4] different 3 wings [5]
[5] 1 [5] [5] 6 [1]
Next [4] unravel silk [3]
[3] [6]
8/ch í holds 1 spoon of 2 pond 3 slow 4 relaxation 5 and speeds 6
4 [1] 1 5 6 [5]
Have 1 friend 2 9/y ǒ u tenth of the twelve Earthly Branches 3
3 5 1
Example 3: ① inscribe / Xilin Wall   Su / Shi
across / look / become / ridge / side / become / peak,
far / near / high / low / each / different.
not / know / Mount Lu / true / face,
only / because / body / at / this / mountain-within. (Chinese grammatical words)
② Ti xiLinBi   Su Shi
Héng kàn chéng Lǐng cè chéng fēng,
Yuǎn jìn gāo dī gè bùtóng.
Bù shí Lúshān zhēn miànmù,
Zhǐ yuán shēn zài cǐ shānzhōng. (pinyin)
③ inscribe Xilin wall Su Shi
across-look become-ridge side become-peak,
far-near high-low each not-same.
not-know Mount-Lu true face-appearance,
only-because body-at this mountain-within. (Chinese words)
④ TIHN XIHSLINN BIHF SUHS VIHB(YIHV)
2 1 2 6 1 [2] [4]
Hebmkama webmlibh cefc webmfebw.
1 1 1 2 3 1 5
Oomgjinb gagsdihs gefb buhatoym.
1 2 1 1 2 1 1
Buhavihn Luhnvams yens mimamuhb.
1 [2] 2 1 1 1 2
Yihjoomm venuzaka cihg vamsyoys. (pinyin Chinese word full code, character-frequency ordering)
4 [1] 3 1 1 1 1
Notes on Example 3:
The homophone table of Example 3 is omitted; its method is the same as that of the homophone table of Example 2. The character "Shi" (as in the name Su Shi) is seldom used in modern Chinese except in personal names. Under the third basic encoding rule of the present invention, when "Shi" is used as a single-character Chinese word it must be written as a double sound section, "vihb (yihv)" with sequence numbers [2] [4], that is, "Shi (zhī)"; this spelling is stipulated by the inventor. The round brackets ( ) indicate that the sound section inside them outputs no Chinese character but must still be typed into the computer as part of the code. According to statistics from the relevant authorities in Taiwan, there are more than 25,000 "name-class" characters used in personal and place names. Those among them that are in current use are easy to handle, but a considerable number are used even less than "Shi". For ordinary users, learning a large number of two-character words and double sound sections that, like "Shi (zhī)", are not used in modern Chinese, together with a large number of two-character proper names and their double sound sections, is obviously inappropriate. Specialized scientific and technical characters and the like raise the same problem, which the inventor will seek to handle separately.
Although the characters "shí" (know) and "yuán" (because) are commonly used, they are not among the six characters given sequence numbers within their homophone groups as stipulated by the inventor. However, "bù shí" (not know) and "zhǐ yuán" (only because) are two-character words, so they are encoded as double sound sections, "buhavihn" (1 [2]) and "yihjoomm" (4 [1]), in accordance with the first basic encoding rule.
The characters "side", "each", "true", "this", "inscribe", "wall" and "Su" are single-character Chinese words within the six sequence numbers stipulated by the inventor. They are encoded as single sound sections, "cefc" (3), "gefb" (2), "yens" (1), "cihg" (1), "tihn" (2), "bihf" (6) and "suhs" (1), in accordance with the second basic encoding rule.
Any two-character word falls under the first basic encoding rule: a two-character word is always encoded with one fixed double sound section. Most commonly used single-character words fall under the second basic encoding rule: a single-character word is always encoded with one fixed single sound section. A minority of commonly used single-character words, all uncommon single-character words and any single-character words newly created in future fall under the third basic encoding rule: a single-character word is always encoded with one fixed double sound section. Once you are proficient with the pinyin Chinese word code, the third basic encoding rule can be applied flexibly, that is, a single-character word may use several related double-sound-section codes. For example, the character "Shi" can also be written as the double sound section "(pibq) vihb", that is, "(píng) Shi"; the user decides according to convenience. The first and second basic encoding rules, however, never change. The pinyin Chinese word code is simply the repeated application of these three basic encoding rules.
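The three basic encoding rules summarized above amount to a two-step decision. The sketch below is a simplification under stated assumptions: the flag `in_first_six` stands for "the character is among the six sequence-numbered characters of its homophone group", which is how the notes on Example 3 distinguish the second rule from the third; the function name and return format are hypothetical.

```python
def basic_rule(char_count: int, in_first_six: bool = False):
    """Return (rule number, code form) for a Chinese word.

    char_count   -- number of Chinese characters in the word (1 or 2)
    in_first_six -- for single characters only: whether the character is
                    among the six sequence-numbered homophones (assumed flag)
    """
    if char_count == 2:
        return 1, "double sound section"      # rule 1: two-character word
    if char_count == 1:
        if in_first_six:
            return 2, "single sound section"  # rule 2: common single character
        return 3, "double sound section"      # rule 3: remaining single characters
    raise ValueError("a Chinese word has one or two characters")
```

In Example 3, "side" (cefc) is a rule 2 case, "not know" (buhavihn) a rule 1 case, and "Shi" written as "vihb (yihv)" a rule 3 case, where the bracketed section is typed but outputs no character.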

Claims (5)

1. A pinyin Chinese word code input method using a computer keyboard, characterized in that the input method has two modes, full code and brevity code. For both full code and brevity code, keyboard input is made by typing, in this order, the keyboard letters corresponding to the initial, the final and the section tone of the Chinese word. The full code and the brevity code have the same initials and section tones; only the finals differ. The initials consist of the 21 initials of pinyin plus five "silent" initials a, i, e, o, u, making 26 initials in all, which are assigned to the corresponding letter keys of the computer keyboard; the pinyin initials "zh, ch, sh" are replaced by the letters "y, w, v" respectively. Section tones are divided, according to the "tone" of pinyin, into four classes: high-level tone, rising tone, falling-rising tone and falling tone. Within each section tone there is a further division by class-meaning name into six kinds, concrete noun, abstract noun, pronoun-time noun, action verb, stative verb and process verb, plus standby. Each kind within each section tone corresponds to a letter on the computer keyboard; the "standby" letter is the same for the high-level tone and the rising tone, and the "standby" letter is the same for the falling-rising tone and the falling tone. The finals of the full code and of the brevity code differ: the full code has 38 finals, each replaced by two letters on the keyboard; the brevity code has 26 finals, corresponding to the letters of the Qwerty keyboard of the computer. When Chinese words are input in full code or brevity code they are divided, following Chinese usage, into single-section Chinese words and double-section Chinese words. A single-section Chinese word is input in the order initial + final + section tone; a double-section Chinese word is input in the order initial + final + section tone + initial + final + section tone. Section-tone input is further divided into "character-frequency ordering" and "character-meaning ordering": character-frequency ordering takes letters in the alphabetical order of the section-tone letters according to the frequency of use of the Chinese characters; character-meaning ordering takes the section-tone letter corresponding to the class meaning of the character, that is, to the "noun" and "verb" kinds listed above.
2. The pinyin Chinese word code input method using a computer keyboard as claimed in claim 1, characterized in that the English letters corresponding to the full-code finals are as follows: er-eh, a-al, o-oj, e-ef, ai-ak, ê, ei-ec, ao-ag, ou-od, an-am, en-en, ang-at, eng-eb, ong-oy, i-ih, ia-il, ie-if, iao-ig, iou-id, ian-im, in-in, iang-it, ing-ib, iong-iy, u-uh, ua-ul, uo-uj, uai-uk, uei-uc, uan-um, uen-un, uang-ut, ueng-ub, ü-oh, üe-of, üan-om, ün-on, ng-ob, and one silent final, ot, which is written with its own letters.
3. The pinyin Chinese word code input method using a computer keyboard as claimed in claim 1, characterized in that the English letters corresponding to the brevity-code finals are as follows: er, ia, ot-Q, iou-W, e-E, üan, uan-R, üe, uei-T, ian-Y, u-U, i-I, o, uo-O, ün, uen-P, a-A, iong, ong-S, iang, uang-D, en-F, eng, ueng-G, ang-H, an-J, ao-K, ai-L, ei, ê-Z, ie-X, ü, ua-C, iao-V, ou-B, in, ng-N, ing, uai-M.
4. The pinyin Chinese word code input method using a computer keyboard as claimed in claim 1, characterized in that the English letters corresponding to the section tones are as follows: high-level tone: s, t, u, v, w, x, z; rising tone: m, n, o, p, q, r, z; falling-rising tone: g, h, i, j, k, l, y; falling tone: a, b, c, d, e, f, y; wherein, within each section tone, the class-meaning names corresponding to the letters are, in order, "concrete noun, abstract noun, pronoun-time noun, action verb, stative verb, process verb, standby".
5. The Chinese word phonetic-alphabet code input method using a computer keyboard as claimed in claim 1, characterized in that it is a Chinese-character sound-and-meaning coding method that takes Chinese words ("Chinese speech") as the Chinese character coding unit, takes pinyin Chinese words ("Chinese word phonetic-alphabet") and pinyin sentence-words as the Chinese character coding form, encodes Chinese words and pinyin Chinese words one to one, takes sentence-words and pinyin Chinese words as the input units, and takes sentence-words and Chinese words as the output units, as follows:

(1) Chinese words and coded sentence-words are the Chinese character coding units. A coding unit formed of one Chinese character or of two Chinese characters is called a Chinese-character Chinese word. A Chinese word of one Chinese character is called a "single Chinese character" or "single-character Chinese word"; a Chinese word of two Chinese characters is called a "double Chinese character" or "double-character Chinese word"; when no distinction is made, both are called "Chinese words". The mathematical definition of the Chinese word is $C^2 + C^1$, where $C = 0, 1, 2, 3, \ldots$ is an integer, $C$ denotes the number of distinct Chinese characters, $C^1$ the number of single-character Chinese words, and $C^2$ the number of double-character Chinese words. A Chinese word has only one meaning, called the "Chinese word meaning", or "word meaning" for short. The mathematical model of the Chinese word meaning is $H_1 = \log_2(C^2 + C^1)$, where $C > 0$, $H_1$ denotes the average information of the Chinese word meaning in bits, $C$ the number of distinct Chinese characters, $C^1$ the number of single-character Chinese word meanings, and $C^2$ the number of double-character Chinese word meanings. Each Chinese word has a prescribed written form and meaning, and the space bar is pressed between two Chinese words on input. A coding unit formed of two Chinese words is called a "coded sentence-word", or "sentence-word" for short. There are four kinds of sentence-word coding unit: single character + single character, single character + double character, double character + single character, and double character + double character.

(2) Pinyin Chinese words and pinyin sentence-words are the Chinese character coding forms. The pinyin Chinese word code uses two coding forms, the "full code" ("all-key") and the "brevity code"; each encodes with its corresponding initials, finals and syllable tones. Hanyu Pinyin has about 1300 distinct toned syllables, which are encoded as about 8580 distinct toned codes; these 8580 codes are the "pinyin Chinese words". A pinyin Chinese word of one syllable is called a "monosyllable", a pinyin Chinese word of two syllables is called a "disyllable", and when no distinction is made both are called "pinyin Chinese words". The mathematical definition of the pinyin Chinese word is $a^2 + a^1$, where $a = 0, 1, 2, 3, \ldots$ is an integer, $a$ denotes the number of distinct syllables, $a^1$ the number of monosyllabic pinyin Chinese words, and $a^2$ the number of disyllabic pinyin Chinese words. A pinyin Chinese word has only one standard pronunciation, namely the standard Mandarin pronunciation. The mathematical model of the Mandarin pronunciation of pinyin Chinese words is $H_2 = \log_2(a^2 + a^1)$, where $a > 0$, $H_2$ denotes the average information of the Mandarin pronunciation of pinyin Chinese words in bits, $a$ the number of distinct syllables, $a^1$ the number of monosyllabic Mandarin pronunciations, and $a^2$ the number of disyllabic Mandarin pronunciations. Calculated with 8580 syllables, the total number of pinyin Chinese words is $7.362498 \times 10^7$. Pinyin Chinese words are separated by the space bar. A monosyllable consists of three parts: initial, final and syllable tone. A disyllable consists of six parts: initial, final, syllable tone, initial, final and syllable tone. A coding form composed of two pinyin Chinese words is called a "pinyin sentence-word", or "sentence-word" for short. There are four kinds of sentence-word coding form: monosyllable + monosyllable, monosyllable + disyllable, disyllable + monosyllable, and disyllable + disyllable.

(3) There are three basic rules for the corresponding coding of Chinese words and pinyin Chinese words: one double Chinese character is coded with one fixed disyllable; one single Chinese character is coded with one fixed monosyllable; and one single Chinese character is coded with one fixed disyllable. There is one supplementary rule for the corresponding coding of Chinese words and pinyin Chinese words, namely the rule for the corresponding ordering of the syllable-tone letters of Chinese characters.

(4) Sentence-words and pinyin Chinese words are the input units. An input unit composed of two pinyin Chinese words with one space between them is called an "input sentence-word", or "sentence-word" for short; the space bar is struck twice in succession after a sentence-word. If a monosyllable is denoted by the numeral "1" and a disyllable by the numeral "2", the sentence-word has four combination forms: "1+1", "1+2", "2+1" and "2+2". When the pinyin Chinese word is the input unit, the space bar is struck once after each pinyin Chinese word is entered.

(5) Sentence-words and Chinese words are the output units. An output unit composed of two Chinese words with one space between them is called an "output sentence-word", or "sentence-word" for short; a sentence-word is followed by a gap of two spaces. If a single Chinese character is denoted by the numeral "1" and a double Chinese character by the numeral "2", the sentence-word has four combination forms: "1+1", "1+2", "2+1" and "2+2". When the Chinese word is the output unit, one space is entered after each Chinese word is output.
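The counting formula and spacing convention of claim 5 can be checked with a short sketch. It assumes the formula is read as the square of the unit count plus the unit count, a reading supported by the stated total, since 8580² + 8580 = 73,624,980 = 7.362498 × 10^7. All function names are hypothetical, and the codes passed to `split_input` are placeholders rather than real codes from the scheme.

```python
import math


def total_codes(units):
    """Count one-unit plus two-unit combinations: units**2 + units**1."""
    return units ** 2 + units


def average_information_bits(units):
    """log2 of the combination count, defined only for units > 0."""
    if units <= 0:
        raise ValueError("units must be greater than 0")
    return math.log2(total_codes(units))


def split_input(text):
    """Split typed text into two-word units.

    A double space ends a two-word unit; a single space separates the
    two words inside it.
    """
    return [chunk.split(" ") for chunk in text.split("  ") if chunk]


def combination_form(first_length, second_length):
    """Render a two-word unit as "1+1", "1+2", "2+1" or "2+2"."""
    if first_length not in (1, 2) or second_length not in (1, 2):
        raise ValueError("each word has one or two syllables or characters")
    return f"{first_length}+{second_length}"


print(total_codes(8580))  # 73624980, i.e. 7.362498 x 10^7
```

For example, `split_input("w1 w2  w3 w4  ")` yields two units of two placeholder words each, mirroring the rule that one space separates words and two spaces close a sentence-word.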
CN97113313A 1996-05-29 1997-05-28 Phonetic Chinese word encoding and its keyboard Expired - Fee Related CN1109283C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN97113313A CN1109283C (en) 1996-05-29 1997-05-28 Phonetic Chinese word encoding and its keyboard

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN 96107547 CN1142077A (en) 1996-05-29 1996-05-29 Chinese word phonetic-alphabet code
CN96107547.3 1996-05-29
CN97113313A CN1109283C (en) 1996-05-29 1997-05-28 Phonetic Chinese word encoding and its keyboard

Publications (2)

Publication Number Publication Date
CN1172983A CN1172983A (en) 1998-02-11
CN1109283C CN1109283C (en) 2003-05-21

Family

ID=25743976

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN97113313A Expired - Fee Related CN1109283C (en) 1996-05-29 1997-05-28 Phonetic Chinese word encoding and its keyboard

Country Status (1)

Country Link
CN (1) CN1109283C (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861487A (en) * 2020-11-30 2021-05-28 新绎健康科技有限公司 Method and system for marking five tones of Chinese characters

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN85102522A (en) * 1985-04-10 1987-02-04 中国中文信息研究会汉字编码专业委员会 Method of computer input in chinese alphabetic writing
CN86107214A (en) * 1986-10-16 1987-08-12 丁飞 A kind of Chinese word input method and keyboard thereof
CN1054219C (en) * 1994-11-03 2000-07-05 王昭宁 Substitution type Chinese phonetic character, word input coding method and keyboard thereof

Also Published As

Publication number Publication date
CN1172983A (en) 1998-02-11

Similar Documents

Publication Publication Date Title
CN1205572C (en) Language input architecture for converting one text form to another text form with minimized typographical errors and conversion errors
CN1143769A (en) System and method for processing chinese language text
CN1384940A (en) Language input architecture for converting one text form to another text form with modeless entry
CN1648828A (en) System and method for disambiguating phonetic input
CN87107540A (en) Choose the method and apparatus of storage and demonstration Chinese character
CN1591414A (en) Automatic translating converting method for Chinese language to braille
CN1896923A (en) Method for inputting English Bashu railing Chinese morphology translation intermediate text by computer
CN1109283C (en) Phonetic Chinese word encoding and its keyboard
CN1387109A (en) Numeral (keypad) input method for braille
CN1110738C (en) Literal character input method for notobook computer
CN1731389A (en) Braille-Chinese contrapositive editing/typesetting system and editing/typesetting method
CN1121645C (en) Sound and shape word code Chinese character input method
CN1129058C (en) Chinese character phonetic code and keyboard design
CN1053976C (en) Full and double phoneticizing combined type Chinese input method
CN85100087A (en) " Chinese coded sound " scheme and its implementation
CN1142077A (en) Chinese word phonetic-alphabet code
CN1123818C (en) Computer inputting method of electric spelling Chinese characters, applied keyboard and its Chinese internal code
CN1779624A (en) Code, inputting method and keyboard for Chinese on syllable compressed platform
CN1801056A (en) Chinese phonetic input method for digital keyboard
CN1196989C (en) Chinese character pattern schematic input method and keyboard thereof
CN1114146C (en) Chinese morpheme code and its computer keyboard input
CN1734404A (en) Phonetic code and recognition phonetic code, database technology, stroke code and numeric stroke code
CN1037043A (en) Computer Chinese input method
CN1089175C (en) Chinese character input method and product thereof
CN1042174C (en) Holographic natural code Chinese input method and relative keyboard apparatus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee