CN1099494A - Encoding method for identification Chinese by initial consonant and strokes and keyboard thereof - Google Patents

Info

Publication number
CN1099494A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 94110908
Other languages
Chinese (zh)
Inventor
唐晓卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN 94110908
Publication of CN1099494A

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

Exploiting the fact that the high-frequency parts (components) of Chinese characters are easy to name phonetically, 180 high-frequency parts of 120 kinds are chosen. Most of them are assigned part codes through the relationship between the key word of the part's customary reading and its initial consonant, and between that initial consonant and an English letter. The code for each character consists of part codes, a first-and-last-stroke identification code, and a duplicate-code handling code for low-frequency characters. The maximum code length is four codes. The method is easy to learn, produces few duplicate codes, and gives high input efficiency.

Description

Encoding method for identification Chinese by initial consonant and strokes and keyboard thereof
One. The present invention is a Chinese character encoding method and its keyboard. It exploits the fact that the high-frequency parts of Chinese characters are easy to name phonetically: 181 high-frequency, easily pronounced parts (radicals) of 121 kinds were carefully chosen. Each part is linked to a pinyin initial consonant through the key word of its customary reading, and each initial consonant is in turn linked to an English letter, which establishes the Chinese character part keyboard of this method on a QWERTY keyboard. The keyboard thus embodies the full correspondence among three things: the part, the part's pinyin (mainly its initial consonant), and the part code (the English letter on the key). Parts are read by the initial consonant of their pinyin (the parts on the English keys O and V are exceptions). The correspondence between initials and English letters is as follows: the initials ZH and Z, CH, and SH correspond to the English letters A, I, and U respectively; the pinyin syllables YU and YUE, and SHUI, correspond to the English letters V and O respectively; all other initials correspond to the English letter of the same shape.
For example, the customary reading of the part 'wide' is 'the wide-character side' (GuangZiPangr). Its key word is 'wide' (Guang), whose pinyin initial is 'G', so the part 'wide' sits on the G key (read 'ge') of the part keyboard. By this method, 80% of the parts can be located on the part keyboard; the remaining parts are located by elements such as stroke, stroke count, and position. The parts and their codes are detailed in the accompanying drawing, the part keyboard of this encoding method.
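The initial-to-letter correspondence just described can be sketched as a small lookup. This is a minimal illustration, not the inventor's implementation: the function name `key_for_initial` is hypothetical, only the mappings stated in the text (ZH, CH, SH and same-shape letters) are encoded, and the plain initial Z is deliberately left undefined because the wording about it is ambiguous. The whole-syllable exceptions on the O and V keys are not modeled.

```python
import string

# Mappings stated in the text: ZH -> A, CH -> I, SH -> U.
DIGRAPH_KEYS = {"ZH": "A", "CH": "I", "SH": "U"}


def key_for_initial(initial: str) -> str:
    """Return the keyboard letter for a pinyin initial (hypothetical helper)."""
    initial = initial.upper()
    if initial in DIGRAPH_KEYS:
        return DIGRAPH_KEYS[initial]
    if initial == "Z":
        # Assumption: left undefined here, since the text does not spell out
        # the handling of plain Z clearly and the Z key has a special role.
        raise ValueError("plain initial Z is not modeled in this sketch")
    if len(initial) == 1 and initial in string.ascii_uppercase:
        # "All other initials correspond to the English letter of the same shape."
        return initial
    raise ValueError(f"no key defined for initial {initial!r}")
```

With this sketch, the key word 'wide' (Guang, initial G) resolves to the G key, and a part whose key word begins with SH resolves to the U key.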
Two. Characters are split into parts according to the conventional rules: follow the writing order; split intuitively; prefer larger parts over smaller ones; prefer separated parts over connected ones; and prefer connected parts over crossing ones.
Three. Encoding formula and rules
The encoding formula is:
Character code = part codes + first-and-last-stroke identification code + low-frequency-character duplicate-code handling code. The part codes are mandatory and the latter two items are optional; the maximum code length is four codes.
Single-character encoding rules:
1. For a character made up of four or more parts, take the codes of its first three parts and of its last part.
2. For a character with fewer than four parts, take its part codes and add the first-and-last-stroke identification code (identification code for short). There are 20 identification codes, divided into 5 groups by first stroke.
(1) First stroke horizontal: for a last stroke that is horizontal, vertical, left-falling or right-falling (dot), or fold, the identification codes are respectively G, F, D, S.
(2) First stroke vertical: for a last stroke that is horizontal, vertical, left-falling or right-falling (dot), or fold, the identification codes are respectively H, J, K, L.
(3) First stroke left-falling: for a last stroke that is horizontal, vertical, left-falling or right-falling (dot), or fold, the identification codes are respectively T, R, E, W.
(4) First stroke right-falling (dot): for a last stroke that is horizontal, vertical, left-falling or right-falling (dot), or fold, the identification codes are respectively Y, U, I, O.
(5) First stroke fold: for a last stroke that is horizontal, vertical, left-falling or right-falling (dot), or fold, the identification codes are respectively B, V, C, X.
In the identification code, a last stroke that is a left-falling stroke, a right-falling stroke, or a dot is treated as one kind of stroke, which this method calls 'oblique'. 'Fold' includes all hook-type strokes except the lifting hook.
Example: 'right' - YC (part codes) + C (identification code: first stroke is a fold, last stroke is a dot)
'beg for' - YC (part codes) + I (identification code: first stroke is a dot, last stroke is a dot)
3. For parts on the part keyboard that are themselves characters, such as 'big' and 'upright': first take the code of the part itself, then the codes of its first, second, and last strokes. If this gives fewer than four codes, the actual code length is used.
For example, 'big' on the D key is encoded as DGTY, and 'two' on the F key as FGG.
4. If the codes formed under rules 1, 2, and 3 produce duplicates, the higher-frequency character is listed first. If the duplicates include level-2 characters of the GB basic set, a level-2 character with a three-code code has 'Z' appended, and one with a four-code code has its fourth code changed to 'Z', so as to reduce duplicates.
5. Brevity codes:
(1) The level-1 brevity codes (one code) are:
The C of a little B of A cross the big E of D by F G not H and I at a few K state L of J
M with the good O of N do not have Q than R be S can T I U V also this X of W want Y to say
(2) Basic rule for choosing level-2 brevity codes (two codes):
Among the two-part characters of the level-1 character set, with no identification code added, a character whose code has no duplicate takes that code as its brevity code; where there are duplicates, the highest-frequency character is chosen.
(3) Basic rule for choosing level-3 brevity codes (three codes):
Among the combined codes of level-1 three-part characters with no identification code added and level-1 two-part characters with the identification code added, a character whose code has no duplicate takes that code as its brevity code; where there are duplicates, the highest-frequency character is chosen.
The character frequencies used in 5(2) and 5(3) follow the "Modern Chinese Frequency Dictionary" (compiled by the Language Teaching Research Institute of Beijing Language Institute, 1996.06).
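Single-character rules 1, 2, and 4 above can be sketched as follows. This is an illustrative reading of the text under stated assumptions, not the inventor's FoxPro implementation: the function names are hypothetical, each part code is assumed to be a single letter, the stroke labels ("dot" for a right-falling or dot first stroke, "oblique" for a left-falling, right-falling, or dot last stroke) are names chosen for this sketch, and rule 3 and the brevity codes are not modeled.

```python
# Identification codes: one row per first stroke, one column per last stroke.
LAST_STROKES = ("horizontal", "vertical", "oblique", "fold")
ID_CODES = {
    "horizontal": "GFDS",
    "vertical": "HJKL",
    "left-falling": "TREW",
    "dot": "YUIO",   # right-falling stroke or dot as the first stroke
    "fold": "BVCX",
}


def identification_code(first_stroke: str, last_stroke: str) -> str:
    """Look up the first-and-last-stroke identification code."""
    return ID_CODES[first_stroke][LAST_STROKES.index(last_stroke)]


def encode_character(part_codes, first_stroke, last_stroke) -> str:
    """Rules 1 and 2: part codes, plus an identification code for < 4 parts."""
    if len(part_codes) >= 4:
        # Rule 1: first three parts and the last part.
        return "".join(part_codes[:3]) + part_codes[-1]
    # Rule 2: all part codes followed by the identification code.
    return "".join(part_codes) + identification_code(first_stroke, last_stroke)


def mark_low_frequency(code: str) -> str:
    """Rule 4: 'Z' handling for a level-2 character involved in a duplicate."""
    if len(code) == 3:
        return code + "Z"
    if len(code) == 4:
        return code[:3] + "Z"
    # Assumption: the text defines the rule only for three- and four-code codes.
    raise ValueError("Z rule is only defined for three- and four-code codes")
```

For the two examples in rule 2, `encode_character(["Y", "C"], "fold", "oblique")` gives YCC and `encode_character(["Y", "C"], "dot", "oblique")` gives YCI, which shows how the identification code separates two characters that share the part codes YC.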
Four. Phrase encoding rules
Two-character phrases: take the first two part codes of each character. Example: 'theory' - WRYR
Three-character phrases: take the first code of each of the first two characters and the first two codes of the third character.
Example: 'computer' - YAMJ
Four-character phrases: take the first code of each of the four characters. Example: 'market economy' - YTSO
Phrases of more than four characters: take the first code of each of the first three characters and the first code of the last character.
Example: 'computer application' - YAJV
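The four phrase rules above can be sketched as one function. This is a minimal illustration under assumptions: `encode_phrase` is a hypothetical name, its input is the list of part-code strings of the phrase's characters in order, and every character is assumed to have at least as many part codes as the rule takes from it.

```python
def encode_phrase(char_codes) -> str:
    """Build a phrase code from the part-code strings of its characters."""
    n = len(char_codes)
    if n < 2:
        raise ValueError("a phrase needs at least two characters")
    if n == 2:
        # First two part codes of each character.
        return char_codes[0][:2] + char_codes[1][:2]
    if n == 3:
        # First code of the first two characters, first two codes of the third.
        return char_codes[0][0] + char_codes[1][0] + char_codes[2][:2]
    if n == 4:
        # First code of each character.
        return "".join(codes[0] for codes in char_codes)
    # More than four characters: first code of the first three and of the last.
    return "".join(codes[0] for codes in char_codes[:3]) + char_codes[-1][0]
```

With the per-character part codes WR and YR implied by the 'theory' example, `encode_phrase(["WR", "YR"])` returns WRYR.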
Five. The advantages of this Chinese character encoding method are that it is easy to learn and use, intuitive, and efficient for input.
Six. Description of the drawing
Title: part keyboard of the encoding method for identifying Chinese characters by initial consonant and strokes.
1. Parts are arranged on the keyboard according to 5 cases:
(1) By the correspondence between the pinyin initial of the key word of the part's customary reading and the English letter. Examples: 'soil' - T, 'mouth' - K, 'mountain' - U (pinyin initial SH).
'Stone' (Shi) is an exception.
(2) By the correspondence between the part's whole pinyin syllable and the English letter.
For example: 'water' (pinyin Shui) - I; 'rain' (pronounced Yu) - V
(3) By sharing the same dictionary radical as a part already placed, or by being similar to it in shape.
For example: 'dog' and 'Quan'; 'oneself' and 'the sixth of the Earthly Branches'
(4) By elements such as stroke, stroke count, and position.
For example: the codes for one, two, three, and four horizontal strokes are respectively G, F, D, S
The codes for one fold, two folds, and three folds are all V
(5) 'Clothing' and similar parts are placed on the E key, because their Chinese pronunciation is close to that of the English letter E.
2. The position of each part on its English letter key is fixed, in an upper, middle, or lower row.
3. The Z key is the low-frequency-character duplicate-code handling key.
4. In a Chinese character input system, the P key stands for any part, to realize fuzzy search (matching on the front, or on the front and back with anything in between).
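The fuzzy search of item 4 can be sketched with a regular expression. This is only one possible reading: the name `fuzzy_lookup` and the codebook layout are hypothetical, and the sketch assumes that each 'P' in the typed pattern stands for exactly one code letter, which the text does not state.

```python
import re


def fuzzy_lookup(pattern: str, codebook: dict) -> list:
    """Return the entries whose code matches `pattern`, 'P' meaning any one code."""
    regex = re.compile(
        "".join("[A-Z]" if ch == "P" else re.escape(ch) for ch in pattern.upper())
    )
    # Sort so that the result order does not depend on dictionary order.
    return sorted(entry for entry, code in codebook.items() if regex.fullmatch(code))
```

Using the codes given earlier for 'right' (YCC), 'beg for' (YCI), and 'big' (DGTY) as a toy codebook, the pattern YCP matches both 'right' and 'beg for', while DPPY matches only 'big'.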
Seven. This Chinese character encoding method and its keyboard are easily applied in a Chinese character input system. The inventor has used the development tools of the relational database management system Foxpro V2.5 to implement encoding, database building, and character input and editing functions for the 6763 Chinese characters of the "National Standard Chinese Character Set Code for Information Interchange (basic set)" and 6000 common phrases.

Claims (5)

1. An encoding method for identifying Chinese characters by initial consonant and strokes, and a keyboard thereof, characterized in that all Chinese characters are composed of the 180 parts of 120 kinds fixed on the part keyboard of this method; a character's code is generated, according to the five single-character encoding rules of this method, as part codes + first-and-last-stroke identification code + low-frequency-character duplicate-code handling code, where the part codes are mandatory and the identification code and the duplicate-code handling code are optional. Phrases are encoded according to the phrase encoding rules of this method.
2. The parts and their codes according to claim 1, characterized in that 140 parts are linked to pinyin initial consonants through the key word of their customary reading, and the initial consonants are in turn linked to English letters, finally establishing a direct correspondence (the code relation) between parts and English letters; the 140 parts referred to here include both the case of sharing the same dictionary radical and the case of similar part shape.
3. The generation of character codes according to claim 1, characterized in that the maximum code length is four codes; for a character of four or more parts, the codes of its first three parts and of its last part are taken; for a character of fewer than four parts, its part codes are taken and the first-and-last-stroke identification code is added.
4. The first-and-last-stroke identification code according to claim 1, characterized in that, for a first stroke that is respectively horizontal, vertical, left-falling, right-falling (dot), or fold, and a last stroke that is respectively horizontal, vertical, left-falling or right-falling (dot), or fold, the five groups of identification codes are respectively (G F D S), (H J K L), (T R E W), (Y U I O), (B V C X).
5. The low-frequency-character duplicate-code handling code according to claim 1, characterized in that when part codes + first-and-last-stroke identification code produce duplicates, for the level-2 characters of the GB basic set among the duplicates, 'Z' is appended to three-code codes, and the fourth code of four-code codes is changed to 'Z'.
CN 94110908 1994-04-01 1994-04-01 Encoding method for identification Chinese by initial consonant and strokes and keyboard thereof Pending CN1099494A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 94110908 CN1099494A (en) 1994-04-01 1994-04-01 Encoding method for identification Chinese by initial consonant and strokes and keyboard thereof

Publications (1)

Publication Number Publication Date
CN1099494A (en) 1995-03-01

Family

ID=5034825

Country Status (1)

Country Link
CN (1) CN1099494A (en)

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C01 Deemed withdrawal of patent application (patent law 1993)
WD01 Invention patent application deemed withdrawn after publication