CN1006017B - Nested grapheme coding method and input keyboard thereof

Nested grapheme coding method and input keyboard thereof

Publication number: CN1006017B
Authority: CN (China)
Prior art keywords: grapheme, word, head, key, Chinese character
Legal status: Expired
Application number: CN86104174.7A
Other languages: Chinese (zh)
Other versions: CN86104174A (en)
Inventors: 萧忠义, 余锦凤
Current assignee: Individual
Original assignee: Individual
Priority date: 1986-06-28
Filing date: 1986-06-28
Publication of CN86104174A: 1988-01-13
Publication of CN1006017B: 1989-12-06

Abstract

The invention belongs to the field of Chinese information science. It is characterized in that it adopts nested grapheme coding and distributes one hundred twenty graphemes evenly over 31 keys of the input keyboard. Chinese characters are encoded, in character mode or in mixed character-and-word mode, according to the rule 'head → middle → tail', and a 16-bit Chinese character code can be formed directly. The most common punctuation marks and Chinese characters (31 in total) need only two keystrokes (including the space key), and the duplicate-code rate is below 0.1%. Both sight typing and touch typing are easy. The arrangement of graphemes is based on reducing the duplicate-code rate, and the usage frequency of each grapheme determines its position on the keyboard. The method can be used in computers, terminals with graphics capability, printers, plotters and electronic typewriters.

Description

Nested grapheme coding method and input keyboard thereof
The present invention, called the "nested grapheme coding method", is a novel shape-based encoding of Chinese characters that takes the needs of users and the requirements of computers as its starting point; it is also an accurate and fast indexing system for Chinese characters. Among earlier coding methods, some need few keystrokes per character but require three to four hundred keys; some use only eight keys but need 6 to 8 keystrokes per character and are ambiguous; some use fewer than 32 keys, but the number of symbols per key is uneven, four keystrokes are needed and mental association is required; some need only three keystrokes, but use more than 32 keys with more than four symbols per key, unevenly distributed, for a total of more than 170 symbols. With these schemes, a beginner who has trained for one month still reaches a coding rate (the average number of characters correctly encoded per unit time) of less than 50 characters per minute. Most of them lack a complete coding theory and cannot directly form the shortest Chinese character code (16 bits).
The purpose of this coding method is to overcome the shortcomings of most existing schemes: lack of a theoretical basis, ambiguity, low coding rate, long codes, poor uniqueness, the need for association, the lack of a keyboard table shared by traditional and simplified characters, more than three keystrokes per character, and difficulty of touch typing. While keeping the traditional habits of writing Chinese characters as far as possible, it sets up a scientific theory of Chinese character encoding for computers and provides a simple, direct coding method that is easy to master.
A Chinese character is formed by an ordered set of graphemes that carries positional information, and its structure is nested. Graphemes are the indispensable elements, selected after statistical analysis of character structure, from which all Chinese characters can be completely assembled. This coding method takes the grapheme as its coding unit and divides Chinese characters into single-block characters (such as "only") and multi-block characters (such as " "). Starting from the first stroke written and taking graphemes and grapheme sets as units, a single-block character is one that cannot be separated naturally into left and right parts when viewed from top to bottom along the horizontal direction of the character; a character that can be so separated is a multi-block character. Its leftmost part is called the first block, the part immediately following is the second block, and the rightmost part is the last block. Within each block, again taking graphemes and grapheme sets as units and looking down the vertical direction (scanning from left to right or from right to left), every part that can be separated naturally and without obstruction into upper and lower parts is called a character layer: the topmost is the first layer, the next is the second layer, and the bottom one is the last layer. Within each layer, taking graphemes and grapheme sets as units and looking from top to bottom along the horizontal direction, each stroke structure that can be separated naturally into left and right parts is called a character slice: the leftmost is the first slice and the one immediately to its right is the second slice. The head grapheme is normally the first grapheme of the first block; it lies at the left, the top or the upper left corner of the character. Characters whose head grapheme is a traditional dictionary section header are mostly single-block characters, while characters whose head grapheme is a side component ("radical") are almost all multi-block characters. This tightly interlinked character structure is evidently a nested structure.
The basic rule by which this coding method encodes a character is as follows. Starting from the first stroke with which the character is written and using the nested grapheme input keyboard (Fig. 1), the grapheme that matches the shape of the largest number of strokes is taken as the coding unit and serves as the head grapheme of the character. The middle grapheme is then chosen, without repetition, according to the head-priority principle of block, layer and slice, and the tail grapheme finally completes the code of the whole character. The tail grapheme is the grapheme with the largest number of strokes that includes the last stroke written; it is usually located at the right, the bottom or the lower right of the character. In short, the rule is:
Chinese character code = head → middle → tail (1)
The head, middle and tail must not overlap one another.
In word-primary mode, Chinese character code = head → middle → character key → tail (2)
(1) When the character consists of at least three graphemes:
First, when the character is a single-block character: 1. If the character has only two layers and the second layer has only one grapheme, the middle grapheme is the head (first grapheme) of the second slice in the first layer; that is, the head of the second slice takes precedence over the remaining graphemes of the first slice. For example, for "gully, keep away, rash, ripe" the middle graphemes are "again, upright, Fan, nine" respectively. 2. If several graphemes follow the first layer, the middle grapheme is the head of the second layer; that is, the layer head takes precedence over the slice head. For example, for "fragrant, clamor, climb" the middle graphemes are "standing grain, page or leaf, greatly" respectively.
Second, when the character is a multi-block character: 1. If only one grapheme follows the first block, the middle grapheme is the head of the second layer within the first block; that is, within the first block the layer head takes precedence over the slice head. For example, for "put down, parrot, jaw, Qu" the middle graphemes are "really, woman, two, good" respectively. 2. If more than one grapheme follows the first block, the middle grapheme is the head of the second block; that is, the block head takes precedence over the layer head. For example, for "stone roller, writing brush, *, weed, mole" the middle graphemes are "corpse, people, woman, two, say" respectively.
(2) When the character consists of only two graphemes, there is normally no middle grapheme. To keep the code at the same length of 16 bits, the tail grapheme is used as the middle and the position key code is added as the tail; the purpose is to reduce duplicate codes. The position key code is the code corresponding to the position of the tail grapheme on its key. However, when the two graphemes are embedded in each other, or when the character is a single-block character whose head is one of "nine, mouthful, day", the tail is instead the embedding key code or the "layer" code, respectively.
(3) When the character consists of only one grapheme, there is neither a middle nor a tail. To keep the code length equal, the middle is the code corresponding to the position of this grapheme on its key, and the tail is a reserved key code. Because the graphemes themselves are nested, the number of graphemes shown on each key is kept to a minimum and sight typing is possible.
(4) For the everyday characters and punctuation marks that account for one fifth of total character frequency, code = code of the key on which they are placed → sup (3)
The purpose of this rule is to improve the coding rate.
(5) Word codes are provided so that text can conveniently be entered in character-and-word mode, which improves the coding rate. The coding rules are as follows.
Let the word = X1 X2 ... Xn, where Xi (i = 1, 2, ..., n) is the i-th character.
Let "X1 head" and "X1 middle" denote the head and the middle of the first character, and let "Xn head" denote the head of the n-th character.
X middle = X1 middle, or Xn head when X1 consists of only one grapheme
X tail = Xn head, or sup when X1 consists of only one grapheme
First, in character-primary mode: word code = X1 head → X middle → word key → X tail (4)
For example, carefully = little → heart → word key → sup
Computer = speech → ten → word key → wood
If one does not exert oneself in youth, one will regret it in old age = little →  → word key → non-
Second, in word-primary mode, simply omit the word key from the examples above:
Word code = X1 head → X middle → X tail (5)
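The coding rules (1), (2), (3) and (5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the inventors' implementation: the nested list representation of a character, the helper names, the placeholder key names (WORD_KEY, RESERVED, POSn) and the choice of the last grapheme in block, layer, slice order as the tail are all hypothetical. The exceptions in rule (2) (embedded graphemes and heads "nine, mouthful, day") and rule (4) are not modelled.

```python
from typing import Dict, List

# Hypothetical nested representation (an assumption for illustration):
#   character = blocks (left to right); block = layers (top to bottom);
#   layer = slices (left to right); slice = grapheme names in writing order.
Slice = List[str]
Layer = List[Slice]
Block = List[Layer]
Character = List[Block]

WORD_KEY, RESERVED, SUP = "WORD_KEY", "RESERVED", "sup"  # placeholder key names


def flatten(char: Character) -> List[str]:
    """Graphemes in block, layer, slice order."""
    return [g for block in char for layer in block for sl in layer for g in sl]


def count(layers: List[Layer]) -> int:
    """Number of graphemes in a run of layers."""
    return sum(len(sl) for layer in layers for sl in layer)


def pick_middle(char: Character) -> str:
    """Rule (1): middle grapheme of a character with at least three graphemes."""
    first_block = char[0]
    if len(char) == 1:  # single-block character
        rest = count(first_block[1:])
        if len(first_block) == 2 and rest == 1 and len(first_block[0]) > 1:
            return first_block[0][1][0]   # head of the second slice in the first layer
        if rest > 1:
            return first_block[1][0][0]   # head of the second layer
    else:  # multi-block character
        rest = sum(count(block) for block in char[1:])
        if rest == 1 and len(first_block) > 1:
            return first_block[1][0][0]   # head of the second layer of the first block
        if rest > 1:
            return char[1][0][0][0]       # head of the second block
    return flatten(char)[1]  # fallback for cases the text does not spell out (assumption)


def char_code(char: Character, position: Dict[str, int]) -> List[str]:
    """Key sequence for one character in character-primary mode, rules (1) to (3)."""
    g = flatten(char)
    if len(g) == 1:   # rule (3): head, position code, reserved key
        return [g[0], f"POS{position[g[0]]}", RESERVED]
    if len(g) == 2:   # rule (2): tail acts as middle, position code acts as tail
        return [g[0], g[1], f"POS{position[g[1]]}"]
    return [g[0], pick_middle(char), g[-1]]


def word_code(first: Character, last: Character, char_primary: bool = True) -> List[str]:
    """Rule (5): key sequence for a word from its first and last characters."""
    g1, last_head = flatten(first), flatten(last)[0]
    if len(g1) == 1:
        middle, tail = last_head, SUP
    else:
        middle = g1[1] if len(g1) == 2 else pick_middle(first)
        tail = last_head
    keys = [g1[0], middle]
    if char_primary:
        keys.append(WORD_KEY)
    return keys + [tail]


# The two complete word examples from the text, using the translated grapheme names.
# Only the head of the last character matters, so it is given as a single grapheme.
print(word_code([[[["little"]]]], [[[["heart"]]]]))
print(word_code([[[["speech"]]], [[["ten"]]]], [[[["wood"]]]]))
```

With these inputs the first call yields little, heart, word key, sup and the second yields speech, ten, word key, wood, matching the examples given for formula (4); passing char_primary=False drops the word key as in formula (5).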
The input keyboard has at least 32 keys, of which only 32 are used here. Graphemes are assigned to keys according to their usage frequency: high-frequency graphemes are placed on the central keys of the keyboard, and each key carries at most four graphemes, which may be marked on the key by any technical means so as to facilitate sight typing. The arrangement of graphemes over the whole keyboard is based on reducing the duplicate-code rate. The little fingers account for less than 10% of keystrokes in coding.
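The frequency-based placement can be illustrated with a small Python sketch. It is only an approximation under stated assumptions: the function name, the placeholder grapheme and key names, and the round-robin strategy are hypothetical, and the sketch does not model the minimization of duplicate codes or the limit on little-finger use, which the actual arrangement also takes into account.

```python
from typing import Dict, List


def assign_graphemes(graphemes_by_freq: List[str],
                     keys_center_first: List[str],
                     per_key: int = 4) -> Dict[str, List[str]]:
    """Deal graphemes (most frequent first) onto keys (most central first)."""
    if len(graphemes_by_freq) > per_key * len(keys_center_first):
        raise ValueError("too many graphemes for this keyboard")
    layout: Dict[str, List[str]] = {key: [] for key in keys_center_first}
    for rank, grapheme in enumerate(graphemes_by_freq):
        # Round-robin: the most frequent graphemes land on the most central keys,
        # and no key receives a second grapheme before every key has one.
        key = keys_center_first[rank % len(keys_center_first)]
        layout[key].append(grapheme)
    return layout


graphemes = [f"g{i}" for i in range(120)]  # placeholder names, most frequent first
keys = [f"K{i}" for i in range(31)]        # placeholder key names, most central first
layout = assign_graphemes(graphemes, keys)
print(layout["K0"])  # ['g0', 'g31', 'g62', 'g93']
```

With 120 graphemes on 31 keys, this dealing order gives every key three or four graphemes, which respects the limit of four graphemes per key.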
The number of coding keys N, the number of keystrokes per character X and the code length L (in bits) satisfy a relational expression. For the shortest Chinese character code, L = 16 bits, the best combinations are:
N = 8, 16, 32
Average keystrokes per character X = 5, 4, 3
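The values above can be reproduced with a short Python calculation. The relation used here is an assumption, not a formula taken from the text: each keystroke on an N-key keyboard is taken to carry log2(N) bits, so a code of L bits needs about L / log2(N) keystrokes, rounded down to a whole number. The function name is hypothetical.

```python
import math


def keystrokes_per_character(num_keys: int, code_bits: int = 16) -> float:
    """Keystrokes needed if each keystroke carries log2(num_keys) bits (assumed relation)."""
    return code_bits / math.log2(num_keys)


for n in (8, 16, 32):
    x = keystrokes_per_character(n)
    print(n, round(x, 2), math.floor(x))
```

Under this assumption N = 8, 16 and 32 give about 5.33, 4 and 3.2 keystrokes, that is 5, 4 and 3 when rounded down, which agrees with the listed values.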
In Fig. 1, only about 30 graphemes differ from dictionary radicals; they were created in order to solve the computer input coding problem better.
Compared with other similar schemes, this coding method has the following features. It is guided by a coding theory that is consistent with language and script, document retrieval and computer science. With the optimal number of information keys, N = 32, the duplicate-code rate is below 0.1% and touch typing is easy. In character-primary mode each character normally takes three keystrokes and directly forms the shortest Chinese character code (16 bits), while a word takes one additional stroke on the word key; in word-primary mode the reverse holds. High-frequency punctuation marks and Chinese characters need only two keystrokes. More than 100 graphemes are placed evenly on 31 keys, and one further key is reserved for words (or characters), minority languages, foreign scripts, scientific and technical symbols and other symbols. The rules are clear, rigorous and complete, and no association is needed. Compatibility is good and extensibility is strong: the coding of 22,000 Chinese characters has been considered. The coding table applies to both traditional and simplified Chinese characters. The method creates conditions for machine recognition of Chinese characters. Its code contains a large amount of information on character structure, which makes it feasible to build a Chinese character generator of minimal size. Encoding in character-and-word mode makes the method easy to learn and improves typing speed. It can be used in computers, terminals with graphics capability, printers, plotters and electronic typewriters, and in communication systems, office automation systems and printing automation systems.
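The keystroke counts stated above can be tallied with a short Python sketch. The token format and the function name are hypothetical, and it is assumed here that high-frequency characters and punctuation marks take two keystrokes in both modes, which the text does not state explicitly.

```python
from typing import List, Tuple


def keystrokes(tokens: List[Tuple[str, str]], char_primary: bool = True) -> int:
    """Total keystrokes for (kind, text) tokens; kind is 'char', 'word' or 'frequent'."""
    total = 0
    for kind, _text in tokens:
        if kind == "frequent":
            total += 2          # high-frequency character or punctuation mark
        elif kind == "word":
            total += 4 if char_primary else 3   # extra word key in character-primary mode
        elif kind == "char":
            total += 3 if char_primary else 4   # extra character key in word-primary mode
        else:
            raise ValueError(f"unknown token kind: {kind}")
    return total


# Placeholder tokens: one ordinary character, two words, one frequent punctuation mark.
sample = [("char", "c1"), ("word", "w1"), ("word", "w2"), ("frequent", ",")]
print(keystrokes(sample, char_primary=True))   # 13
print(keystrokes(sample, char_primary=False))  # 12
```

A text dominated by multi-character words therefore costs fewer keystrokes in word-primary mode, while a text dominated by single characters favors character-primary mode.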

Claims (24)

1. A nested grapheme coding method, characterized in that the nested grapheme input keyboard (Fig. 1) is used to decompose Chinese characters according to the nested grapheme structure theory and to input them by nested grapheme "block, layer, slice":
A. The nested grapheme coding method holds that a Chinese character is formed by an ordered set of graphemes that carries positional information, and that its structure is nested; the method takes the grapheme as the coding unit and divides Chinese characters into single-block characters (for example "only") and multi-block characters (for example " "): starting from the first stroke written and taking graphemes and grapheme sets as units, a single-block character is one that cannot be separated naturally into left and right parts when viewed from top to bottom along the horizontal direction of the character; a character that can be so separated is a multi-block character; its leftmost part is called the first block, the part immediately following is the second block, and the rightmost part is the last block; within each block, taking graphemes and grapheme sets as units and looking down the vertical direction (scanning from left to right or from right to left), every part that can be separated naturally and without obstruction into upper and lower parts is called a character layer, the topmost being the first layer, the next the second layer, and the bottom one the last layer; within each layer, taking graphemes and grapheme sets as units and looking from top to bottom along the horizontal direction, each stroke structure that can be separated naturally into left and right parts is called a character slice, the leftmost being the first slice and the one immediately to its right the second slice;
B. The method decomposes a Chinese character for input as follows: starting from the first stroke with which the character is written and using the nested grapheme input keyboard (Fig. 1), the grapheme that matches the shape of the largest number of strokes is taken as the coding unit and serves as the "head" grapheme of the character; the "middle" grapheme is then chosen, without repetition, according to the head-priority principle of block, layer and slice; the "tail" grapheme finally completes the code of the whole character; the "tail" grapheme is the grapheme with the largest number of strokes that includes the last stroke written; the "head" grapheme is normally the first grapheme of the first block;
C. The graphemes shown on the nested grapheme input keyboard of Fig. 1 are themselves nested (for example, "fish" embeds "field", "all" embeds "several", "being" embeds "power", "body" embeds "Dao" and "*", and "Door" embeds "door"); among them are 30 newly created graphemes that differ from dictionary radicals: it is thousand not educate two li five * of electricity for well person Ba Yu asks to go out in the worm city that internal beam is partly sought an ancient type of spoon centre by ancient I a Woo Gan Feng of volume, etc.
2. The nested grapheme input method according to claim 1, characterized in that:
(1) in character-primary mode, the rule for coding a single Chinese character is: Chinese character code = head → middle → tail; the head, middle and tail must not overlap one another; (2) in word-primary mode, Chinese character code = head → middle → character key code → tail;
a. When the character consists of at least three graphemes:
First, when the character is a single-block character: 1. if the character has only two layers and the second layer has only one grapheme, the "middle" is the head of the second slice in the first layer, that is, the head of the second slice takes precedence over the remaining graphemes of the first slice (for example, for "gully, keep away" the "middle" is "again, upright" respectively); 2. if several graphemes follow the first layer, the "middle" is the head of the second layer, that is, the layer head takes precedence over the slice head (for example, for "fragrant, climb" the "middle" is "standing grain, big" respectively);
Second, when the character is a multi-block character: 1. if only one grapheme follows the first block, the "middle" is the head of the second layer within the first block, that is, within the first block the layer head takes precedence over the slice head (for example, for "parrot, jaw" the "middle" is "woman, two" respectively); 2. if more than one grapheme follows the first block, the "middle" is the head of the second block, that is, the block head takes precedence over the layer head (for example, for "writing brush, weed" the "middle" is "people, two" respectively);
b. When the character consists of only two graphemes, the "tail" is used as the "middle" and the position key code is added as the "tail"; the position key code is the code corresponding to the position of the "tail" on its key;
c. When the character consists of only one grapheme, the "middle" is the code corresponding to the position of this grapheme on its key, and the "tail" is a reserved key code; because the graphemes themselves are nested, the number of graphemes shown on each key is kept to a minimum and sight typing is possible;
d. For the everyday characters and punctuation marks that account for one fifth of total character frequency, code = code of the key on which they are placed → sup.
3. The nested grapheme input method according to claim 1, characterized in that word codes are provided for character-and-word mode, with the following coding rules:
Let the word = X1 X2 ... Xn, where Xi (i = 1, 2, ..., n) is the i-th character.
Let "X1 head" and "X1 middle" denote the head and the middle of the first character, and let "Xn head" denote the head of the n-th character; then
X middle = X1 middle, or Xn head when X1 consists of only one grapheme
X tail = Xn head, or sup when X1 consists of only one grapheme
a. In character-primary mode, word code = X1 head → X middle → word key → X tail
For example, carefully = little → heart → word key → sup
Computer = speech → ten → word key → wood
b. In word-primary mode, word code = X1 head → X middle → X tail.
4. The nested grapheme coding method according to claim 1, characterized in that the input keyboard contains 32 keys; the more than 100 selected graphemes are distributed evenly over 31 information keys; one further key is reserved for words (or characters), minority languages, foreign scripts, scientific and technical symbols and other symbols; graphemes are assigned to keys according to their usage frequency, with high-frequency graphemes on the central keys of the keyboard and at most four graphemes on each key; the arrangement of graphemes over the whole keyboard is based on reducing the duplicate-code rate; and the little fingers account for less than 10% of keystrokes in coding.
5. The nested grapheme input keyboard according to claim 1, characterized in that traditional and simplified Chinese characters share this one input keyboard table, which can be marked directly on the keys so that sight typing is possible.
6. The coding rules of the nested grapheme coding method according to claim 1, characterized in that touch typing is possible with 32 keys; each character normally takes three keystrokes and a word takes one additional stroke on the word key, or each word takes three keystrokes and a character takes one additional stroke on the character key. Only the three punctuation marks (such as ",.") and 29 Chinese characters (such as "have,, be, one and,, or not ...") that account for one fifth of character frequency need just two keystrokes (including the space key); 16-bit Chinese character codes can be formed directly.
CN86104174.7A 1986-06-28 1986-06-28 Nested grapheme coding method and input keyboard thereof Expired CN1006017B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN86104174.7A CN1006017B (en) 1986-06-28 1986-06-28 Nested grapheme coding method and input keyboard thereof

Publications (2)

Publication Number    Publication Date
CN86104174A           1988-01-13
CN1006017B            1989-12-06

Family

ID=4802373

Country Status (1)

Country Link
CN (1) CN1006017B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE0102294L (en) * 2001-06-28 2002-12-29 Anoto Ab Ways to handle information

Also Published As

Publication number Publication date
CN86104174A (en) 1988-01-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C13 Decision
GR02 Examined patent application
C14 Grant of patent or utility model
GR01 Patent grant
C57 Notification of unclear or unknown address
DD01 Delivery of document by public notice

Addressee: Yu Jinfeng

Document name: Notification of Termination of Patent Right

C17 Cessation of patent right
CX01 Expiry of patent term