WO2004036404A1

WO2004036404A1 - Pattern-code Chinese character input method with “three patterns association”

Info

Publication number: WO2004036404A1
Application number: PCT/CN2003/000858
Authority: WO
Inventors: Zongxing Lin; Zongfan Lin
Original assignee: Zongxing Lin; Zongfan Lin
Priority date: 2002-10-16
Filing date: 2003-10-14
Publication date: 2004-04-29
Also published as: AU2003272866A1; CN1328649C; CN1455316A

Abstract

The present invention discloses a pattern-code Chinese character input method with 'Three Patterns Association', which aims to provide a Chinese character input method that has a low duplicate-code rate and high input speed and is easy to learn and hard to forget. The method selects 300 pattern units, divides them into 26 classes by shape according to a classification rule, and associates the pattern units in each class with the 26 English letter keys on the keyboard by shape, pronunciation, or meaning, forming classes of pattern units that can be read aloud by homophone. A portion of the 300 pattern units is used for inputting the 3000 most frequently used Chinese characters, and a portion is used for inputting 7000 frequently used Chinese characters. All 300 pattern units are used for inputting simplified, traditional, and ancient Chinese characters. Chinese characters are classified into nine types in three classes, and codes are selected with the basic pattern unit as the unit: three pattern units are selected and typed in sequence, and when more than one pattern unit could be chosen, the larger one is preferred. The invention has the advantages of short code length and a low duplicate-code rate; the pattern units required for coding can be picked out of a character with the help of pronunciation, and Chinese characters can be typed quickly and easily by following the homophone cues.

Description

Chinese character "Three-shape association" shape code input method

Technical field

The present invention relates to a word processing method, and more particularly, to a Chinese character input method.

Background technique

The modernization of Chinese information processing is the foundation of national economic and social informatization, and the efficiency of Chinese keyboard input is its bottleneck. That "efficiency is life" has become a matter of general consensus.

Although there are thousands of coding schemes for inputting Chinese characters on computer keyboards today, none is both easy to learn and efficient to use. In other words, a scheme is either easy to learn but slow to type, or fast to type but hard to learn. The so-called easy-to-learn pinyin codes have many duplicate codes and are difficult to touch-type. Problems such as homophones, rare characters, and non-standard pronunciation cannot be solved, so their efficiency is hard to improve. The commonly used shape codes can be typed fast, and a skilled typist can exceed English typing speed, but they are difficult to learn and easy to forget, so learning efficiency is very low. Some people say it is like learning a "foreign language": if you do not use it often, you forget most of it. Such shape codes can therefore be used only by professional typists. If Chinese character coding cannot be widely adopted, or its efficiency is low, drafting and recording a document cannot be done at the same time; a separate typing step is needed, unlike English. Only when most people can type after practicing fingering and a little study can the computer keyboard input of Chinese characters be said to be truly modernized.

There are also many codes that combine sound and shape, as well as codes based on a small number of strokes. Although these codes are easier to learn or reduce the duplicate-code rate somewhat, their many shortcomings make it difficult to improve efficiency fundamentally. There are also voice input and handwriting input methods. Voice input suffers from the difficulty of unifying the differing pronunciations of north and south. Handwriting input is slow because of the large number of strokes, computer recognition is difficult, and efficiency is very low. These methods can therefore be used only under specific conditions, are difficult to popularize, and are unlikely to become the mainstream of computer input of Chinese characters.

To sum up, although Chinese character coding has developed from nothing and to a certain extent satisfies the demands of modernizing Chinese information processing, it remains inefficient and is far from meeting the growing demands of national economic and social informatization. Coding schemes for keyboard input of Chinese characters have accumulated, with more than 1,000 patent applications alone, yet Chinese character coding still requires a qualitative leap. It must be easy to learn, easy to remember, and fast in order to meet the requirements.

Summary of the Invention

The present invention overcomes the shortcomings of the prior art and provides a Chinese character input method that has a low duplicate-code rate and is easy to learn, easy to remember, and fast.

The present invention is achieved through the following technical solutions:

A Chinese character "three-shape association" shape code input method includes the following steps:

(1) Select glyph elements as the basic components of the coding, divide them by shape into 26 families according to the same-family principle, and map the families to the 26 letters on the keyboard. The correspondence is shown in Table 1.

(2) Input Chinese characters.

The Chinese character input rules are as follows: taking the basic glyph element as the unit, the code points are determined according to the three classes and nine types, three glyph elements are selected, and they are typed in sequence to complete the input of one Chinese character. Where more than one glyph element could be chosen, the larger glyph element takes priority. Where glyph elements are connected or intersect, the code is obtained by breaking the shared strokes vertically or horizontally.

When inputting Chinese characters, the characters are divided into three classes and nine types, and the glyph elements are selected according to the code points. The three classes are single-column, double-column, and three-column. The nine types and their code-taking methods are:

"01 type" is a character that is itself a glyph element; two "W" identification codes are added after the glyph element when typing;

"02 type" is a character that divides vertically into two glyph elements, followed by an "L" or "R" identification code to complete the three-code input;

"03 type" is a character that divides vertically into three or more glyph elements; the first, second, and last glyph elements are typed in sequence;

"012 type" is a 品-shaped character that divides into upper and lower parts; one glyph element is taken from the upper part, and the first and last glyph elements are taken from the lower part;

"021 type" is an inverted 品-shaped character that divides into upper and lower parts; the first glyph element and the last glyph element are taken from the upper and lower parts respectively;

"11 type" is a left-right character; one glyph element is taken from each of the left and right sides, and an "R" or "L" identification code is added to complete the input;

"12 type" is a left-right character; one glyph element is taken on the left and the first two glyph elements on the right;

"21 type" is a left-right character; the first two glyph elements are taken on the left and the first glyph element on the right;

"111 type" is a 川-shaped character; the first, second, and last glyph elements are taken from left to right.

When entering commonly used Chinese characters, the correspondence between the 26 letters on the keyboard and the glyph elements is shown in Table 2.

When inputting Chinese characters, the rules for splitting and coding are as follows: glyph elements that merely touch are split apart; glyph elements that are connected or intersect are coded by breaking the shared strokes vertically or horizontally, and where a vertical break is possible a horizontal break is not used; when the glyph elements outside a frame are sufficient, the glyph elements inside the frame are not coded; when the glyph elements are sufficient, corner points are not coded; when the glyph elements are insufficient, corner points are coded; with a downward frame, code outside the frame first and then inside; with an upward frame, code inside the frame first and then outside; for characters with two or more glyph elements on the left side, take the first and last codes on the left side and then the last code; characters with certain side radicals are coded from the left, regardless of stroke order; characters with a discrete structure are split naturally.

Table 3 shows the correspondence between the glyph elements and the 26 letters on the keyboard when entering both simplified and traditional characters or ancient Chinese characters.

The invention has the following beneficial effects:

This code is based on the "six books" (the six traditional categories of Chinese character formation: pictographs, phono-semantic compounds, compound ideographs, indicatives, loan characters, and derivative cognates). Using this method, formed over the history of character creation, it extracts the glyph elements (abbreviated "shape elements") used for encoding, and its code-taking method is designed around the association of sound, shape, and meaning. Chinese characters are strictly classified, and the code points of each character are fixed. As long as the rules are followed, the encoding of any Chinese character is unique, and so-called fault-tolerant codes are rarely needed. Its logic is much stronger than that of the codes in use today. The code length is short, and the coverage of characters and words is large. Moreover, code, sound, and meaning are closely connected in the manner of the six books, so that logic and imagery are well combined. Learning is fun rather than boring, which improves the efficiency of learning and use and achieves more with less effort.

Detailed description

The present invention is further described below with reference to specific embodiments.

This coding is based on the theory of the six books, and its code-taking method uses the association of sound, shape, and meaning. It selects 210 components from the "Chinese Character Component Specification GF3001-1997" and adds 90 newly devised components, for a total of 300, as the lowest level of the code. These are defined as "glyph elements" (abbreviated "shape elements") and are used to encode, with three shape elements per character, a total of 21,003 Chinese characters used at home and abroad.

The 300 "shape elements" are divided into 26 families according to the same-family principle. Based on the association of sound, shape, and meaning, each family is mapped to one of the 26 Hanyu Pinyin letters, that is, the 26 English letters on the keyboard. In other words, each English letter serves as the pinyin initial of its family, forming families of "shape elements" that can be read aloud by homophone. Guided by the sound, the glyph elements required for encoding can be picked out of a Chinese character, and the homophone indicates the keystroke, so that Chinese characters can be input quickly and easily.

See Table 1, Table 2, and Table 3 for the glyph elements, the composition of each family, and the correspondence with the 26 letters on the keyboard. The glyph elements shown in Table 1 are used to input the 3000 most commonly used simplified characters and suit applications that require simple input, such as mobile phones. Table 2 is used to input 7000 commonly used Chinese characters. Table 3 adds the glyph elements needed for traditional and ancient Chinese characters and is used to input simplified and traditional characters together, or ancient Chinese characters. Characters marked with "mouth" (口) in the table, such as "gold" and "rice", should be split when they are not used alone.

In order to easily memorize glyph elements during learning, the association method is added to this input method, as shown in Table 2.

To make the division of Chinese characters more reasonable, the structure of Chinese characters is classified into "three classes and nine types". Table 4 shows this classification and the corresponding coding rules. In the table, a "dot" indicates the position from which a shape element is taken when the character is encoded.

In Table 4, "mouth" (口) represents a single shape element; the other symbols represent, respectively, any pure single-column structure, any single-column class, and any type of structure.

(1) "Three classes and nine types" classification method and glyph identification codes:

The structure of Chinese characters is very complicated. Usually it is divided into only three kinds: top-bottom, left-right, and mixed. As a result, the code points of many complex characters are hard to determine, and the user does not know where to start coding. To determine code points more intuitively and quickly and to improve coding efficiency, this code classifies the structure of Chinese characters into "three classes and nine types".

"Three classes" refers to single-column, double-column, and three-column.

"Nine types" are as follows:

"01 type" is a character that is itself a "shape element"; two identification codes "WW" are added after the "shape element" to complete the input. For example, the code of "person" is A, so the input code of the character "person" is AWW.

"02 type" is a character that splits vertically into two "shape elements" (two codes); "L" or "R" is then added as the identification code to complete the three-code input. For example, the character "gu" is split into "ten" and "mouth", so its input code is SOL.

"03 type" is a character that splits vertically into three or more "shape elements" (three codes). For example, the character "ting", whose second and last elements are "mouth" and "ding", is coded LOT.

"012 type" is a 品-shaped character that divides into upper and lower parts; one glyph element is taken from the upper part, and the first and last glyph elements are taken from the lower part. For example, the character "salary", whose last element is "ding", is coded NLT.

"021 type" is an inverted 品-shaped character that divides into upper and lower parts; the first glyph element and the last glyph element are taken from the upper and lower parts respectively. For example, the character "type" is coded HZT.

"11 type" is a character with one "shape element" (one code) on each side. For example, the character "ding" is split into "mouth" and "ding"; with the supplementary identification code "R", its code is OTR.

"12 type" has one "shape element" (one code) on the left and two "shape elements" (two codes) on the right. For example, the character "chest", whose first two elements are "month" and "spoon", is coded MPU.

"21 type" has two "shape elements" (two codes) on the left and one "shape element" (one code) on the right. For example, the character "while" is split into "soil", "people", and "more"; its code is TAP.

"111 type" is a 川-shaped character; the first, second, and last "shape elements" (three codes) are taken from left to right. For example, the character "tide", whose second and last elements are "ten" and "month", is coded DSM.

(2) Special cases of glyph classification:

① Characters with more than three side-by-side parts, such as "state" and "continent", belong to the three-column class. They are coded as "111 type", taking the first, second, and last elements from left to right. For example, the code of "continent" is DDZ.

② Fully enclosed characters belong to the single-column class. Code outside the frame first and then inside the frame, according to the type. For example, "return" is "02 type" and "country" is "03 type".

③ Characters enclosed on three sides from above or from below, or enclosed from the upper left or upper right, belong to the single-column class. For example, "wind", "fierce", "hall", and "sentence" are treated as "02 type", and "same", "disease", and "lice" are treated as "03 type".

④ Characters enclosed on three sides from the left, or enclosed from the lower left, belong to the double-column class. For example, "Ju" and "Da" are treated as "11 type"; "medical treatment" and "treatment" are treated as "12 type"; "trips" and "starts" are treated as "21 type". These classifications and treatments follow naturally and need no rote memorization.

(3) "Three-shape association" single-character coding principles

1. Take codes by shape, following the order of writing: from left to right, from top to bottom, and then inside and outside (except for characters with certain side radicals, which are covered by the splitting rules).

2. Take the basic "shape element" as the unit, and take codes according to the "three classes and nine types" glyph structure and code points.

3. Generally take the codes of the first, second, and last "shape elements" in order; at most three codes are taken.

4. When splitting a single structure, split off the larger "shape element", that is, the one with more strokes.

5. When a single character yields fewer than three "shape element" codes, add the glyph identification code.

6. Some intersecting or connected characters do not resolve into "shape elements" directly; the "breaking method" is used to divide them into "shape elements" before the code is taken.
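
The code-taking rules for most of the nine types can be sketched in Python as follows. This is only an illustrative sketch, not the method's reference implementation: the function name `encode`, the representation of a character as a list of parts (each part a list of letter codes already assigned to its shape elements), and the `ident` parameter are assumptions introduced here. The 012 and 021 types are omitted for brevity, and splitting a glyph into shape elements is assumed to have been done beforehand.

```python
from typing import List


def encode(structure: str, parts: List[List[str]], ident: str = "L") -> str:
    """Return the three-key code of one character (illustrative sketch).

    structure: type label such as "01", "02", "03", "11", "12", "21", "111".
    parts:     letter codes of the shape elements, grouped by part
               (rows for single-column types, columns for the others).
    ident:     identification code used by types 02 and 11 ("L" or "R").
    """
    flat = [code for part in parts for code in part]

    if structure == "01":
        # A character that is itself a shape element: pad with "WW".
        return flat[0] + "WW"
    if structure in ("02", "11"):
        # Two shape elements plus one identification code.
        if ident not in ("L", "R"):
            raise ValueError("identification code must be 'L' or 'R'")
        return flat[0] + flat[1] + ident
    if structure in ("03", "111"):
        # Three or more shape elements: first, second, and last.
        if len(flat) < 3:
            raise ValueError("need at least three shape elements")
        return flat[0] + flat[1] + flat[-1]
    if structure == "12":
        # One element on the left, first two on the right.
        left, right = parts
        return left[0] + right[0] + right[1]
    if structure == "21":
        # First two elements on the left, first on the right.
        left, right = parts
        return left[0] + left[1] + right[0]
    raise ValueError(f"unsupported structure type: {structure}")


print(encode("02", [["S"], ["O"]], ident="L"))  # SOL
```

With the element codes given in the examples of section (1), `encode("01", [["A"]])` gives `AWW` and `encode("111", [["D"], ["S"], ["M"]])` gives `DSM`.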

(4) Chinese character split coding rules and examples

1. The connected "shape elements" can be disassembled

2. Connected and intersecting "shape elements" can be separated by breaking the shared strokes vertically (break vertically where possible, not horizontally)

Example word    Split                Code
;               | v                  BCX
East            Seventy Little       CSX
Wing            , τ Water            DIW
Black           W Soil '",           FTD
Ping            Yi Xiaoshi           HXS
Koda            1                    ZL
Well                                 NNL
World                                NJL
Mao                                  PQJ
Wei             * p                  QUL
Car             Ten                  CQL
Beam            Ten Middle School    SQX
Fire            villain              XAL
Middle          I Tian I             ZKZ
Mi              Xiao ten Xiao        XSX
Seeking         Shishui,             SWD
Yi              Shixiu               SQN
Li              Shizhong             SQC

3. Where shape elements intersect, the shared strokes can be broken horizontally

4. When the shape elements of the frame are sufficient (including corner points), the shape elements inside the frame are not coded

5. When the shape elements are sufficient, corner points are not coded

6. When the shape elements are insufficient, corner points are coded

7. Downward-frame coding (including frames open outward and to the right)

Example word    Split            Code
Zhou            Mentukou         MTO
plaque          门, door         EDM
En              □ big heart      KYD
for             , force,         DU
¾               small doorway    XMO
door            7 people         DIC

8. Upward-frame coding: first inside the frame and then outside the frame

9. For characters with two or more shape elements on the left, take the first and last codes on the left and then the last code ("川" type)

The exceptions to the above "川" type are:

10. Characters with certain side radicals are coded from the left (not in stroke order).

Example word    Split    Code

Jian Zheng Niu, Γ E Q Road _ Head B F

Ting. I P T Rough α X W 0

Yan ZLIPJ Force: L Yitian W Η Κ

11. Characters with a discrete structure are split naturally

12. Special regulations:

; |, Er, and three form element combination structures are not classified as double-column or triple-column, but are classified as single-column.

(5) Shortcode input

To improve input speed, this code gives commonly used Chinese characters (high-frequency characters as far as possible) a short code: only the first one or two "shape elements" are taken, followed by a number key or the space key to end the entry. Because this is a three-code scheme (entry ends at the third key), only one-code and two-code short codes are provided.

1. First-level short code

The first-level short code has two tiers. In the first tier, each family assigns one high-frequency character to its first code plus the space bar, for a total of 26 short codes. In the second tier, each family of "shape elements" assigns 10 high-frequency characters sharing the same first code, each selected with one of the 10 number keys, giving 10 short codes per family. There are 26 families in this code, so 26 × 10 = 260 can be assigned.

The first-level short codes thus total 26 + 260 = 286; see Table 5.

2. Second-level short code

The second-level short code takes only the first two "shape elements" of a single character and ends with the space key; frequently used Chinese characters are selected for it. With the codes of the 26 families of "shape elements", the first two codes can be combined in 26 × 26 = 676 ways, giving 676 second-level short codes (including 9 vacant positions); see Table 6.
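
The short-code counts above can be checked by enumerating the key sequences. The following is a minimal sketch; representing the space bar as the character `" "` and the variable names are assumptions made for illustration, and the 9 vacant second-level positions are not modeled because the text does not say which they are.

```python
import string

LETTERS = string.ascii_uppercase  # one key per shape-element family (26)
DIGITS = "0123456789"             # the 10 number keys
SPACE = " "                       # assumed stand-in for the space bar

# First-level short codes: first code + space, or first code + number key.
first_level = [a + SPACE for a in LETTERS] + [a + d for a in LETTERS for d in DIGITS]

# Second-level short codes: first two codes + space.
second_level = [a + b + SPACE for a in LETTERS for b in LETTERS]

print(len(first_level), len(second_level))  # 286 676
```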

3. The first- and second-level short codes are shown in on-screen prompts and need not be memorized. After a little practice, professionals can touch-type them.

(6) Vocabulary coding rules

Practice with many encoding methods has confirmed that word-level code input effectively reduces the duplicate-code rate and significantly shortens the code length, greatly improving input speed and efficiency. This code uses three-code word input, mainly for words of three or more characters, with very high speed and efficiency.

Coding rules for two-character words

Two-character words in this code are mostly used in "single-character" input. The first character contributes its first two codes according to the rules, and the second character contributes its first code.

For example:

3. Coding rules for multi-character words

For words of more than two characters, take the first code of the first, second, and last characters. For example:

Example word                        Split                Code
science and technology              Wo, w ten            PXS
Serving the People                  , People 夂           DAY
Scientific Socialism                , ",                 PXD
Chinese People's Liberation Army    Zhongkou             QKG
One Country Two Systems             One Person Canton    HAR
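
The two word-coding rules in this section can be sketched as follows. The function names are assumptions introduced here, and the per-character codes passed in are hypothetical placeholders: only their first letters are chosen to agree with the PXS example above.

```python
from typing import Sequence


def two_char_word_code(first: str, second: str) -> str:
    """First two codes of the first character plus first code of the second."""
    if len(first) < 2 or len(second) < 1:
        raise ValueError("character codes are too short")
    return first[:2] + second[0]


def multi_char_word_code(char_codes: Sequence[str]) -> str:
    """First code of the first, second, and last characters of the word."""
    if len(char_codes) < 3:
        raise ValueError("a multi-character word needs at least three characters")
    return char_codes[0][0] + char_codes[1][0] + char_codes[-1][0]


# Hypothetical single-character codes for a four-character word.
print(multi_char_word_code(["PAA", "XBB", "CCC", "SDD"]))  # PXS
```

A three-character word is handled by the same function, since its last character is simply the third one.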

(7) Encoding of traditional and ancient Chinese characters:

To meet a wider range of needs and facilitate communication at home and abroad, this code provides coding systems for traditional and ancient Chinese characters in addition to the simplified Chinese character system. It encodes a total of 21,003 Chinese characters used at home and abroad. The coding rules for traditional and ancient Chinese characters are the same as for simplified characters, except that a prescribed end key (or conversion key) is added in each case.

1. Example of Traditional Chinese Character Encoding

'-Thrift (thrift) people enter. • AAA / Sword (sword) person ¹ j: AAZ / Generation (generation) "Ten ZZS / Tooth (Tooth) Factory Bu u. ZZ l) /.

2. Example of ancient Chinese character encoding

Character    Split    Code

'佳 1ί A A A //

-Only. Renren A A Y // Walk Ί Bu Ren 'I l //

. 4 ^ 1 1 1 //

This encoding adopts a multi-set design. Each code in the "main set" has only three character positions (main character, secondary character, and remainder character) and is used to type commonly used and general-purpose Chinese characters. Each code in the "secondary set" is given 5 characters, and each code in the "remainder set" 10 characters, for selection on the screen. The secondary set and the remainder set are used to type traditional Chinese characters, ancient Chinese characters, and Chinese characters used overseas. According to character-frequency statistics provided by the National Language Commission, main characters in the main set account for 93% of usage, secondary characters for 6.1%, and remainder characters for 0.85%. The remaining 0.05% of characters must be looked up in the secondary set. To select a secondary character, continue by typing <2> or <9> on the keyboard. To select a remainder character, use the "prompt line". The probability of this situation is less than 1%, so after a few repetitions the choices are quickly remembered. In the simplified Chinese character system of this coding, the duplicate-code rate for commonly used characters is only about 5% under three-code input. If first- and second-level short codes are used and word codes predominate, there are almost no duplicate codes. The traditional and ancient Chinese character systems require the help of the prompt line.

The technical characteristics of this coding are:

1. This code is a pure shape code characterized by spelling "shape sounds":

This code is based on shape, and the code letters used are those of Hanyu Pinyin. The structural elements of Chinese character glyphs, such as strokes, radicals, standard components, and newly devised components, are collectively called glyph elements (abbreviated "shape elements"). Through the association of sound, shape, and meaning, these "shape elements" are treated as Hanyu Pinyin letters that can be read aloud. The glyph of each Chinese character is coded with only three pinyin letters. This is a way of spelling a character by the sounds of its shapes. It not only works like keyboard input of alphabetic scripts (such as English) but is also faster and more concise. For example, the character "jiang" is divided into three "shape elements": the dot element, "丁", and "一". The dot element (diǎn) is associated with D, "丁" with T, and "一" (héng) with H, so its code is DTH.

Chinese characters are entered by spelling "shape sounds" because their glyphs are ideographic. Speakers from the south and the north may not understand each other's speech, but everything becomes clear as soon as the character (glyph) is written. Chinese characters are not merely "written symbols that record language", like Western alphabetic scripts; they are also "written symbols that record people's thoughts, consciousness, and ideas". For keyboard input, only the glyph of a Chinese character conveys meaning accurately. Because Chinese has so many homophones, a low duplicate-code rate can be achieved only by inputting characters according to their shape. This code goes further than the ordinary shape code: it is not simple mechanical assembly of shapes but spelling by "shape sounds". It is based on association, requires no rote memorization, and codes in a way similar to the abbreviated-spelling ("Jianpin") method of pinyin codes. It is nevertheless fundamentally different from a pinyin code: a pinyin code spells the sound of the whole character, whereas this code spells the sounds of the "shape elements".

Although this code has features of both sound and shape, it is fundamentally different from the ordinary sound-shape code (or shape-sound code). In an ordinary sound-shape code, sound and shape are separate: the sound is spelled first, and then the first or last shape of the character is appended to reduce the duplicate-code rate. Because sound and shape are not synchronized in such a code, the thinking required in use is inconsistent, and it cannot escape the defect it shares with pinyin codes: characters that the user cannot read, or pronounces inaccurately, cannot be input. Practice has shown that its input performance is not as good as that of a pure shape code. This is the fundamental reason why the commonly used shape codes, though difficult to learn, still occupy the main market. In this code, sound and shape are synchronized and integrated: what is spelled is the sound of the "shape element", not the sound of the whole character. With this "shape sound" spelling, the code remains a pure shape code. Although the spelled result cannot be understood as speech, the computer can identify the character accurately, because typing on a computer keyboard relies mainly on the glyphs on the display screen to transmit information and perform operations. Only glyphs appear on the screen, not speech sounds, and alphabetic scripts are no exception. Chinese characters are built by combining shapes, and their glyphs convey meaning most accurately, which matches the operating characteristics of electronic computers. It is this objective fact that determines the role of square Chinese characters in the era of electronic high technology. As long as keyboard input can pin down the glyph accurately, information can be transmitted accurately, and spelling is the most effective means of pinning it down. Therefore, square Chinese characters pinned down by "shape sound" spelling can meet the needs of the electronic high-technology era, and the revival of the ancient shape-based Chinese characters is at hand!

Spelling by "shape sounds" in effect makes square Chinese characters "quasi-pinyin" while leaving their written form unchanged. This has been the dream of many generations, and it can now be realized under computer conditions. In modern information transmission, this easy-to-learn "quasi-pinyin" will outperform alphabetic scripts (such as English); that is, it will transmit information better than they do. In the era of mechanical typewriters, square Chinese characters could not be spelled out with pinyin letters: the number of individual characters is huge and their structure is complicated, so shape-based Chinese characters were at a disadvantage and lagged behind for more than a hundred years. Today, once the problem of learnability is solved, computer typing can convert separate "shape elements" into Chinese characters. This rejuvenates the ancient Chinese characters and makes it possible to stay ahead of the times by methods that go beyond alphabetic spelling!

We believe that the technical way forward for entering Chinese characters on a computer keyboard lies in shape codes. The way forward for shape codes is to solve the problem of learnability, shorten the code length appropriately, and keep the duplicate-code rate relatively low. Imitating the use of shape, sound, and meaning in the ancestors' "six books" is the most effective way to make a code easy to learn and easy to remember. This is determined by the inherent characteristics and profound connotations of square Chinese characters.

The coding is based on pronunciation, shape, and meaning. Through association, mappings form between the human senses and thinking on one side and the glyphs, codes, and key positions on the other. Therefore, when typing Chinese characters on the keyboard, one can not only type from what one sees but also conveniently type from what one thinks or hears. It can thus meet the needs of all kinds of users.

Association is the bond by which things are understood. When you see the character "person", you think of a human figure standing on two feet. When you see the characters "sun" and "moon", you think of the sun and moon in the sky. When you see the character "fire", you think of a bonfire burning in the wild. When you see the character "rain", you feel the drizzle. When you see the character "tears", you feel tears dripping from the eyes. All this seems primitive, yet it is very modern and scientific. The computer is especially "fond of" these remarkable Chinese characters and can identify them accurately and quickly, which shows that they adapt well to contemporary technology. It is high technology that revitalizes ancient Chinese characters. As long as the right method is used, the technology of inputting Chinese characters on the keyboard will become more widespread and better than that of alphabetic scripts (such as English). That day is just around the corner.

2. This code is very easy to learn and remember, and its keyboard input technology is advanced: There are three hundred "shape elements" in this code. Although there are many, some of them are used for Traditional Chinese, there are only more than 200 commonly used Chinese characters. Only a few dozen are used frequently. However, due to the use of similar family principles, sound, form, and meaning association, it is very easy to remember. As long as you remember a representative of the same family and its code, you can remember a series of "shape elements" of the same family. Once remembered, it is hard to forget. As long as you have a little understanding of Chinese pinyin, many important "shape elements" and their codes will always be remembered, and you will never forget them. For example, the most frequently used "shape element""一" horizontal (H 0 ng), "1" Straight (Zh ί), ")" 撇 (?), "" 捺 (Νέ), "," points (Di δ η). The above "shape element" uses only the first letter of its pronunciation as the code. Then "a" is H, "I" is Z, "J" is P, "" is N, ", and" D ". And (,, ",";, ",,, heart) are all D. (J,; /,, spoon) are Po. For example, "ten" is S, "soil" is T, "mouth" is o, and so on. This is actually "tube spelling", only spelling the first pinyin letter of the "shape element" pronunciation. Because this pinyin is easy to learn, as long as you know a little bit about Chinese pinyin or the pronunciation of dialects is not accurate, it is not difficult to learn. If you have learned to type the Chinese phonetic alphabet, it is even easier to learn this coding. Thus the popularity of this coding is predictable _β As the input speed, because the shorter the code length, weight code rate, high input accuracy suitable for comfortably. Therefore, its input speed should be higher than any keyboard text input (including Chinese and foreign text). This is not a joke! 
The reason for making this judgment is as follows: with today's commonly used shape codes, the typing speed of skilled users has already exceeded that of English (see related reports); with this code, it will not be just a few individuals but many people who exceed it. This is determined by the technical characteristics and advantages of this code. Only then can we say that the computer keyboard input technology of Chinese characters has taken its place among the nations of the world!
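The first-letter rule described in item 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table covers just the five basic strokes named above, the pinyin is written without tone marks, and the names `STROKE_PINYIN` and `first_letter_code` are hypothetical and not part of the code scheme itself.

```python
# Hypothetical table: the five basic strokes named in the text,
# with their pinyin readings written without tone marks.
STROKE_PINYIN = {
    "一": "heng",  # horizontal
    "丨": "zhi",   # vertical
    "丿": "pie",   # left-falling
    "㇏": "na",    # right-falling
    "丶": "dian",  # dot
}


def first_letter_code(pinyin: str) -> str:
    """Return the upper-case first letter of a pinyin syllable."""
    if not pinyin:
        raise ValueError("empty pinyin syllable")
    return pinyin[0].upper()


# Build the stroke-to-letter table by applying the first-letter rule.
codes = {stroke: first_letter_code(p) for stroke, p in STROKE_PINYIN.items()}

if __name__ == "__main__":
    for stroke, letter in codes.items():
        print(stroke, letter)
```

An element such as "mouth", which the text codes as O rather than by the first letter of its pinyin, is not covered by this rule and would need an explicit table entry.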

3. This code uses the "disconnect" split method

For Chinese characters whose glyph parts are connected or intersecting, this code splits the character by "disconnecting" the parts instead of "extracting" a "shape element" from the glyph. In this way the shape of the glyph is kept intact while the code can be taken intuitively and quickly. The so-called "break" is made only at the point where two "shape elements" meet or intersect, so the shape of the glyph is not changed. This is very important for taking codes intuitively by shape. Since this code does not adopt the "extraction" splitting method, it avoids what is recognized as the main difficulty in learning and using shape codes, thus greatly improving coding efficiency.

The disconnection (break) method of this code is now compared with the extraction method of commonly used shape codes:

[Comparison examples: "disconnect" splitting in this code versus "extraction" splitting in commonly used shape codes. The example glyphs are illegible in the source.]

The prerequisite for speed: only when the code can be seen accurately and determined quickly can the keys be struck quickly. "Disconnect" code taking follows the shape of the glyph as it is and requires no abstract thinking, which creates favorable conditions for fast keystrokes. It should be noted that typing speed depends not only on the number of keystrokes but, more importantly, on whether the code can be determined intuitively and quickly.

4. This code uses the three-class, nine-type classification of Chinese characters and the three-point scattered coding method: Because this code classifies Chinese characters into three classes and nine types, the code points used for more complex glyphs are scattered and do not go deep into the glyph. Code taking is therefore relatively simple and intuitive, and the code points are easy to determine. At the same time, because the code points are scattered, each independent structural part of the Chinese character can contribute a code, which inevitably reduces the duplicate-code rate. The code points of this code for characters shaped like "品" (type 012), inverted "品" characters (type 021) and left-right characters (type 21) are compared below with examples from today's commonly used shape codes:

[Comparison examples: three-point scattered code points of this code versus the code points of commonly used shape codes, for characters of types 012, 021 and 21. The example glyphs are illegible in the source.]

As can be seen from the above examples, there is an obvious difference in intuitiveness between taking three scattered code points from the outside of the glyph and taking two scattered code points from the inside of the glyph. With three-point scattered coding the code points are fixed, so one only has to consider which code sits at each code point and need not look inside the glyph. Inside the glyph, "shape elements" are often hard to identify and determine because they cross and connect. Therefore, for more complex glyphs, taking three scattered fixed points from the outside of the glyph is intuitive, fast, and accurate. Writing a character is like drawing: you must first sketch the outline. Scattered fixed-point code taking at the periphery is therefore an important method for grasping the shape of a character. At the same time, because code taking reaches every corner of the glyph, the number and variety of the "shape elements" that must be taken increase, and the duplicate-code rate falls. This is how the code achieves a low duplicate-code rate with only three codes. Because the "three classes and nine types" classification and the three-point scattered fixed-point code-taking method are used, the more complicated the Chinese character, the easier the code taking. This simplified method greatly improves the keyboard input efficiency of this code.

5. The font recognition of this code is very simple, without cross recognition

In today's commonly used shape codes, last-stroke glyph-type identification is one of the learning difficulties. When a single character has fewer than four codes, cross-identification is required, combining the five kinds of last stroke (horizontal, vertical, left-falling, right-falling, and turning) with the three glyph types (upper-lower, left-right, and mixed). This gives fifteen identification codes. Beginners have to memorize a complex identification code table. In particular, the boundaries between the three glyph types are not always clear, and the mixed type especially is difficult for beginners to learn and judge.

A character in this code uses at most three codes. When a character that is itself a single "shape element" (type 01) has fewer than three codes, "WW" is added to make up three codes. If the code of the "shape element" "person" is A, then the code of the character "person" is AWW. When an upper-lower character with two codes (type 02) has fewer than three codes, "L" is added to make up three codes. If the first two codes of the character "gu" are SO, then the code of "gu" is SOL. When a left-right character (type 11) has fewer than three codes, "R" is added to make up three codes. If the first two codes of the character "ting" are DT, then the code of "ting" is DTR. Only three kinds of character have fewer than three codes: single-element, upper-lower, and left-right characters. Only the three letters W, L, and R are used as identification codes, and there are no complicated situations such as cross-identification. Characters of the other types all have three codes, so no identification code needs to be added. The glyph-type identification of this code is therefore very easy to learn and remember. The use of the three letters W, L, and R as glyph-type identification codes is one of the important technical features of this code.
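The padding rule above can be sketched as a small Python function. This is a minimal illustration, assuming the glyph type and the element codes have already been determined by the user; the names `SUPPLEMENT` and `pad_code` are hypothetical.

```python
# Identification letters named in the text for the three glyph types
# that can have fewer than three codes.
SUPPLEMENT = {
    "01": "W",  # single shape element
    "02": "L",  # upper-lower, two elements
    "11": "R",  # left-right, two elements
}


def pad_code(element_codes: str, glyph_type: str) -> str:
    """Pad the element codes of a character to exactly three letters."""
    if not element_codes:
        raise ValueError("at least one element code is required")
    if len(element_codes) > 3:
        raise ValueError("a character uses at most three codes")
    if len(element_codes) == 3:
        return element_codes  # no identification code needed
    if glyph_type not in SUPPLEMENT:
        raise ValueError(f"type {glyph_type} has no identification code")
    # Fill on the right with the identification letter of the glyph type.
    return element_codes.ljust(3, SUPPLEMENT[glyph_type])


if __name__ == "__main__":
    print(pad_code("A", "01"))   # "person"
    print(pad_code("SO", "02"))  # "gu"
    print(pad_code("DT", "11"))  # "ting"
```

The three calls reproduce the examples given in the text (AWW, SOL, DTR); a code that already has three letters is returned unchanged.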

6. The first-level and second-level short codes of this code are numerous and of high quality, and do not need to be memorized: This code has 286 first-level short codes (one code plus the space bar or a number key) and more than 660 second-level short codes (two codes plus the space bar). The first-level short codes are all high-frequency characters and include almost all of the first-level short codes of today's shape codes and phonetic codes.

The first-level and second-level short codes of this code are designed with on-screen prompts, so there is no need to memorize them. Professionals can touch-type them with a little practice.

7. This code is very convenient for foreigners to learn and input Chinese characters

This code also opens the door for foreigners (especially English speakers) to learn and input Chinese characters. For "shape elements" such as "person", "3", "7", "Π", "mouth", "factory", "ding", and "U", foreigners will readily make the association as well. The code of "water" is W, which coincides with the English word WATER. The other "shape elements" are not difficult to learn as long as you understand Chinese characters. When foreigners and southerners speak Mandarin, their pronunciation is often inaccurate and the finals are hard to master, but the initial sound is generally no problem. The "phonetic" spelling of this code is in fact only "abbreviated spelling": it uses only the first letter and is compatible with pinyin codes. In the same way, it directly uses the computer's English keyboard layout, putting the foreign layout to Chinese use rather than applying it mechanically as general shape codes do. It is therefore not difficult to learn and use, and this code will help cultural exchange between China and other countries.

To sum up, this code has superior technical performance, so it is not only easy to use but can also meet the needs of professionals. Students, teachers, writers, journalists, secretaries, and others can learn and master it quickly. It can also be promoted in dialect areas. Anyone who knows a little Hanyu Pinyin, young or old, can learn to use it.

Because this code uses the "phonetic" method, which is consistent with the Hanyu Pinyin teaching in primary and secondary schools, it is suitable for general promotion in primary and secondary education. In this way, Chinese computer coding can start from childhood, which will popularize and improve the input of Chinese characters on computer keyboards and lay a solid foundation for the modernization of Chinese information processing.

The practicality of this code also extends to compiling lists of common characters, dictionaries, and electronic dictionaries. Looking up a character or word dictionary with this code is like looking up an English dictionary, and it is even more convenient and quick. Because the code length is short (a single character uses only three letters), a lookup can basically be done in one step, going straight to the matching entry. By contrast, a lookup in a Hanyu Pinyin dictionary often brings up a whole series of homophones, sometimes running over several pages. A dictionary based on this code is basically arranged by radicals and components, so characters of the same or similar shape are grouped together. This makes it easy to compare, learn, remember, and deepen understanding. The arrangement is close to the traditional arrangement of the Kangxi dictionary, yet the dictionary can also be searched in "pinyin" fashion, so it is very convenient to use. In this way both the tradition is preserved and the technological content is increased. This is completely different from today's Hanyu Pinyin dictionaries, which are convenient for pinyin search but disorderly in their arrangement of characters. Imitating the dictionary layout of alphabetic scripts in this way is not very scientific, and it deviates from the characteristics of Chinese characters. For example, 行 (háng) as in "bank", 行 (xíng) as in "walk", 行 (héng), and 行 (hàng) as in "row of trees" are the same character in shape, but because their pronunciations differ they cannot be arranged together and are scattered in different places. This makes it difficult to compare, learn, and remember them. There are many such characters. Using this code to compile a dictionary can fundamentally overcome this shortcoming.
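The one-step dictionary lookup described above can be sketched as a binary search over entries sorted by their three-letter codes. This is only an illustration: the three entries are the example codes given earlier in the text, and the names `ENTRIES`, `index`, `sorted_codes`, and `look_up` are hypothetical.

```python
import bisect

# Example entries (code, romanized character) taken from the text.
ENTRIES = [("SOL", "gu"), ("AWW", "person"), ("DTR", "ting")]

# A dictionary arranged by code: sort once, then search by code.
index = sorted(ENTRIES)
sorted_codes = [code for code, _ in index]


def look_up(code: str):
    """Return the entry stored under a three-letter code, or None."""
    i = bisect.bisect_left(sorted_codes, code)
    if i < len(sorted_codes) and sorted_codes[i] == code:
        return index[i][1]
    return None


if __name__ == "__main__":
    print(sorted_codes)
    print(look_up("SOL"))
```

Because every code has the same fixed length of three letters, alphabetical order over the codes is enough to arrange the whole dictionary.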

This code includes simplified and traditional Chinese characters, ancient and modern Chinese characters, and Chinese characters at home and abroad. This can not only study and excavate the vast cultural treasure trove of ancient China and conduct ancient text research, but also adapt to the needs of Hong Kong, Macao, Taiwan and Chinese-speaking regions and the growing international information exchange, and serve international friends and overseas Chinese who use Chinese characters in the world!

The significance of this code also lies in the fact that it is a "full-size code", which can promote the modernization of information processing while preserving the square Chinese characters. Now that the internal rules of the square Chinese characters have been fully revealed, a huge potential has been released, so that the ancient Chinese characters can adapt to the rapid development of science and technology in the new era. Why should we not preserve the square Chinese characters and carry them forward? Chinese characters have their own laws of development. They should not be completely westernized but should follow a Chinese-style path of development. That is to say, Chinese characters should be appropriately simplified, and Pinyin should be learned and used, but Chinese characters should not be replaced by a Latinized phonetic script.

Although the disclosed Chinese character "three-shape association" shape code input method has been specifically described with reference to the embodiments, those skilled in the art will understand that obvious changes in form and detail may be made without departing from the scope and spirit of the invention. Therefore, the embodiments described above are illustrative and not restrictive, and all variations and modifications that do not depart from the spirit and scope of the present invention fall within its scope.

Table 1

Table 2

Table 3

Table 4 Examples

Single-column class

Type 01 (code supplement WW): man, eight, gate, cricket, mountain, water, bug, seven

Type 02 (code supplement L): lun, union, text, fire, wood, sentence, phoenix, hall

Type 03: umbrella, hood, rice, barren, Xiang, with, disease, lice

Type 012: public, crime, crown, arrogance, dysfunction, Jia, Ji, Lu, Fan, comfort, capital, dysfunction, dysfunction, panic, disadvantage, feeling

Type 021: looking down, north, phase, recognition, middle, only, giant, up to

Double-column class (type 11, type 12): Jiang, He, Yun, Jubilee, Refutation, Seeing, Bai, Medical

Type 21: regulation, brake, daring, quail, missing, scattered, wading, offering

Three-column class ("Sichuan" type): 撖, 铤, 淋, invitation, pelican, incubation


Claims

1. A Chinese character "three-shape association" shape code input method, characterized by comprising the following steps:

(1) Glyph elements are selected as the basic units of coding, divided by shape into 26 families according to the same-family principle, and assigned to the 26 letters on the keyboard. The correspondence is:

A person, / (, into, good,

β A, evening,, / w,,

C meaning, 3, seven, 1 (, ΐ, 弋,, 屮,

D,, n,, "" ,, heart, i, door

E 3, Ding, ,,,, Yin,, corpse, district ...

F head, once, and, ππ, four, tttr,

Q r ^>,, rA, ',>, bone'.

H ―, Wang., Jun, ′ 5, ^ w, i¾.

X, 7, ",, H, Ί,", V.,

J L,, 纟,, i,, ί, i :, l_, L /, dagger, factory

□, Yue, ca, Tian,, ί,, corpse, Qu Feng

L ^,,

Door,> Αι rL,, towel,:

N 廿, f, 廿,,,, ¾, ... ,,

D o, B, P, :: £

P,, evening,, 々,,,, body, #, ^, Taiwan

Q ,; ί =, φ,, Feng, sound,. ¾,, this,

R r,, / ,: f Guang, household, f, e.

S +, ten,, ice

T: t, Shi, Shi, Ding, Le

V mountain,, .LJ, 6, force, † l, card,

V Female,, υ, Xi,. Xi Xi, w; ^,,;,,,,》, ί ^

^-, Small,, |,, Ψ,,,.

γ,,, 夂, Ge, Da, Zhang.

. I, 丄, 'J,, †, ^,

(2) Inputting Chinese characters;

The Chinese character input rules are as follows: taking basic glyph elements as the unit, three glyph elements are selected according to the code points of the three classes and typed in sequence to complete the input of a character; when glyph elements are otherwise the same, the larger glyph element takes precedence; when glyph elements connect or intersect, the code is obtained by disconnecting the shared stroke vertically or horizontally.

2. The Chinese character "three-shape association" shape code input method as claimed in claim 1, characterized in that, when inputting Chinese characters, the Chinese characters are divided into three classes and nine types, and the glyph elements are selected according to the code points; the three classes are the single-column, double-column, and three-column classes; the nine types and their code-taking methods are:

"Type 01" is a single glyph element, and two identification codes are added after the glyph element when it is input;

"Type 02" is a character that can be divided into two glyph elements, upper and lower, followed by an "L" or "R" as the identification code to complete the three-code input;

"Type 012" is a "品"-shaped character, which can be split into upper and lower parts, taking the first glyph element of the upper part, and one glyph element and the last glyph element of the lower part;

"Type 021" is an inverted "品"-shaped character, which can be divided into upper and lower parts, taking the first glyph element and the last glyph element of the upper and lower parts respectively;

"Type 12" is a left-right character: take the glyph element on the left, and take the first and last glyph elements on the right;

"Type 21" is a left-right character: take the first and last glyph elements on the left, and take one glyph element on the right;

3. The Chinese character "three-shape association" shape code input method according to claim 2, characterized in that, when inputting Chinese characters, the correspondence between the 26 letters on the keyboard and the glyph elements is:

A person, (, ---, Jia ,.

B A, Xi, \ /, JL-, 4,

C,, »», 弋, six,, car,

D,,, Beam ,,

E 3, n, ^, ^, Yin, 匸,, corpse '., District,,,,

F, ..., head, and, and, ΠΠ ,, 2ξ), d, c¾,, JCE2,

Q,, ^, Ί b, 'ψ

H ―, Wang ,? , ^., ~ Ί, ί,, 1, ί¾,

I 7 ·· ,, "7, 5 * Α," "L, J, ¾, ¾,

r L, 幺, 纟, ^, (, χ, Γ, dagger, L, L, b Γ, "<,, J,,., i. X,

κ, 8, α = ιϋ V-, corpse, Qu Ξ.,

",

, —,

Μ π., R), ^ i, n s ^,, rm,,,

廿, 廿, 卄, "^, 廿, 讲, mother, mother, \-,, · ί,

Soil, Shi: k, Ding, Ding

Mountain,, u, nine, force, h, card, fji, i¾

0 V Female,,,, force, evening, little,, fish,

W.;,,, Ice, 氽,, A, t,

x 4,, small,, i, U, / i), y, ^

Y Shen,, 乂, 夂, Ge, Da, Zhang,,

I, J, 'j, Xiao Bu H. Worm,

4. The Chinese character "three-shape association" shape code input method according to any one of claims 1 to 3, characterized in that the rules for splitting Chinese characters for coding are as follows: glyph elements that are merely connected are split apart; glyph elements that connect or intersect are coded by disconnecting the shared stroke vertically or horizontally; where a vertical disconnection is possible, a horizontal one is not used; when the glyph elements outside a frame are sufficient, the glyph elements inside the frame are not coded; when the glyph elements are insufficient, the corners are coded; where there is a frame opening downward, the code is taken first outside the frame and then inside the frame; where the frame opens upward, the code is taken first inside the frame and then outside the frame; the first and last codes on the left and the last code of the character are taken; characters with a side component are coded from the left regardless of stroke order; characters with a discrete structure are split naturally.

5. The Chinese character "three-shape association" shape code input method according to claim 4, characterized in that the correspondence between the glyph elements and the 26 letters on the keyboard when inputting simplified characters together with traditional characters or ancient Chinese characters is:

A person,, into, good, food, gold

B eight,, /,, ~,, / ^,

C XL 、、七、、 t 、 ^ · 、古. 、 ^ 、车、 Ά

D, heart, must, door

£ 3, ΐ, 匸,], hut,, Ε,, Yin,,, ψ

, ^,, ¾ ,, sense.

P b. 01 ?, · and, and, dish, '.Ε? , Ψ, ΐ &,, with

Q,, bone, L

7, ", 孓 H, H, B,

J 5>>

厶,, ^,

J

'', Factory, Γ,

D.,,,,,

, Households, households, Τ B SZ How many -2 P,

, 丄,.,

S,, towel, ■ ¾, door, shell--,,,:

"

v

,,,,,,,,,,,,,,,,, Z z 'J BU ¾, F3