CN112783336A - New phoneme same-tone near-bit Chinese character code input method - Google Patents

Info

Publication number
CN112783336A
CN112783336A CN202011143150.XA CN202011143150A CN112783336A CN 112783336 A CN112783336 A CN 112783336A CN 202011143150 A CN202011143150 A CN 202011143150A CN 112783336 A CN112783336 A CN 112783336A
Authority
CN
China
Prior art keywords
chinese character
code
chinese
codes
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011143150.XA
Other languages
Chinese (zh)
Inventor
王治阳
王亭朝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202011143150.XA
Publication of CN112783336A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

A Chinese character coding input method for computers. The method relies on the observation that, among compound characters that share both pronunciation and radical, the first strokes of the non-radical parts are seldom basic strokes of the same kind. Multi-stroke components whose codes share the same initial letter are laid out on the keyboard by the same-tone near-position method, and the second code of the shape-part code is taken by a two-layer scanning technique. As a result, only 21 multi-stroke components and five basic strokes are needed to input Chinese characters simply and quickly.

Description

New phoneme same-tone near-bit Chinese character code input method
Technical Field
The invention relates to a Chinese character coding input method for computers. It is a significant improvement on the T-shaped Chinese character code input method previously invented by the inventor. It adopts the same-tone (homophonic) near-position technique, which took the inventor ten years to arrive at, and, for taking the second code of the shape-part code, a left-to-right scanning technique of at most two layers, which took more than twenty years to arrive at.
Background
Keyboard input is currently the most widely used way of entering Chinese characters. It requires the characters to be coded: each Chinese character is represented by a group of codes, the keys carrying those codes are pressed, and one character is usually entered with 1 to 4 keystrokes. By type of code, keyboard input methods fall into three categories: phonetic codes, shape codes, and phonetic-shape codes.
Phonetic codes are generally based on the Chinese phonetic alphabet (pinyin) and encode characters by their pronunciation. Shape codes encode characters by the features of their written form. Phonetic-shape codes use both the pronunciation and the written form. Phonetic-shape codes can be divided into two types: those that use only the initial consonant, and full phonetic-shape codes that use the whole pronunciation of the character (initial and final) and specify that the sound comes first and the shape second. The latter basically do not interrupt thinking: the thinking involved is similar to that of a phonetic code, the duplicate-code rate is similar to that of a shape code, and they are compatible with pinyin, so their advantages are increasingly apparent. Phonetic-shape codes invented by others usually either use many character components or have a high duplicate-code rate, whereas the Chinese character code previously invented by the inventor, built on innovative coding rules, could input characters simply and quickly with only about 28 character components. Further research, however, showed that the earlier invention has shortcomings that hinder its adoption.
In that earlier code, the five basic strokes (horizontal, vertical, left-falling, dot (including right-falling) and turning) occur more often than any multi-stroke component, yet they were placed on the punctuation keys at the lower right of the keyboard and on the z key, which are awkward to strike. This hurts keying efficiency and comfort. Its phonetic code also could not be made compatible with the phonemic letters invented by the inventor. Further research by the inventor showed that coding horizontal, vertical, left-falling, dot (right-falling) and turning as a, o, e, i and u respectively has the unexpected advantage of reducing duplicate codes among characters and words, and also helps with precise arrangement and positioning. Radicals were assigned to keys by the first letter of their pinyin, but many common radicals share the same first letter, so some radicals were placed by shape resemblance to a letter instead. Many radicals hardly resemble any English letter, so these placements were forced and far-fetched. The inventor later devised the homophonic near-position arrangement, which places radicals with the same pinyin first letter or the same initial consonant at neighboring positions. That arrangement, however, had no quantitative method behind it and relied only on experience and feel.
In addition, with five basic strokes and 21 selected radicals, the total number of radicals and strokes is 26, the same as the number of English letters. This makes them easy to display on small-screen keyboards such as those of mobile phones, and 21 radicals are easier to remember than the 28 selected earlier.
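The key budget described above can be checked with a minimal sketch. It assumes the stroke-to-letter assignment stated in the text (horizontal a, vertical o, left-falling e, dot i, turning u) and one component per letter key; the function name is hypothetical and the 21 multi-stroke components themselves are not modeled.

```python
import string

# Stroke-to-key assignment as stated in the text.
STROKE_KEYS = {
    "horizontal": "a",
    "vertical": "o",
    "left-falling": "e",
    "dot": "i",       # the right-falling stroke is grouped with the dot
    "turning": "u",
}

def free_keys_for_components(stroke_keys=STROKE_KEYS):
    """Letter keys left over for multi-stroke components, one per key."""
    used = set(stroke_keys.values())
    return [k for k in string.ascii_lowercase if k not in used]

print(len(free_keys_for_components()))  # 21
```

With five letters taken by the strokes, exactly 21 letter keys remain, which is why 21 multi-stroke components fit a 26-letter keyboard without any key carrying two components.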
Taking the second code of the shape-part code was also a vexing problem. If characters are divided into left-right structured characters and non-left-right structured characters for code taking, it is easy to tell whether a given character is left-right structured, but having to make that distinction continually while coding is tiring. A later improvement took codes according to a T-shaped pattern, but the direction of code taking was then sometimes left to right and sometimes top to bottom, which is not smooth. Only a few days ago, in the middle of the night, did the inventor arrive at a significant innovation: the code-taking rule problem is solved thoroughly only when the upper layer and the lower layer of the character are each scanned from left to right, and when brevity codes are assigned to non-left-right structured characters in preference to left-right structured characters.
However, some radicals are in common use but cause few duplicate codes. Selecting them would remove only a little over 10 pairs of duplicate codes, so they are left out in order to use only 26 key positions. Selecting these components would nevertheless benefit users who pursue typing speed. How to resolve this contradiction is also a problem addressed by the new invention.
In addition, the phoneme homophonic near-position Chinese character code input method, like other input methods, performed no quantitative calculation on the radicals. The selection and rejection of a few radicals was therefore not reasonable, and the few radicals placed by the homophonic near-position method were arranged only by experience, so their positions on the keyboard were not reasonable.
Disclosure of Invention
Existing Chinese character input methods thus have one or more of the following defects: the character components are non-standard, or the number selected is unreasonable; the radicals (character components) are not selected according to their character-forming frequency, usage frequency and duplicate-code rate among the common characters typed by pinyin; the five basic strokes are placed unreasonably on the keyboard, are incompatible with phonemic letters, and make character and word codes prone to duplication; the code length is too long; the duplicate-code rate is too high, which slows input; only the initial consonant or pinyin first letter of the character is used; the method is not intuitive enough; the code-taking rule is unreasonable and slows mental reaction; the user must continually decide whether a character is left-right structured, or take codes horizontally at one moment and vertically at the next; the arrangement of components on the keyboard has little regularity or is even far-fetched; or the multi-stroke components are selected, rejected and arranged on the keyboard by experience and intuition rather than by quantitative calculation. These methods do not solve the technical problem of being simple, fast and easy to learn, and Chinese character input remains inconvenient and slow.
The invention aims to provide a computer Chinese character coding input method with reasonable component selection, a reasonable stroke layout, standard and intuitive components, ease of learning, a reasonable code-taking rule, and convenient, fast input: the new phoneme same-tone near-bit (near-position) Chinese character code input method.
To this end, the codes of the new phoneme same-tone near-bit Chinese character code input method consist of two parts: a sound code and a shape-part code.
The sound code may use mainland Chinese pinyin or the phonetic notation used in Taiwan. The inventor suggests the phonemic-letter initial-medial-final input method that he invented, which is similar to the Taiwan phonetic notation input method except that the finals are written phonemically and the initials are basically taken from Latin letters, in line with international practice. The sound code may be full spelling, any double-spelling scheme, phonemic-letter spelling, initial-medial-final spelling, or incomplete spelling.
The shape-part code occupies at most two codes. Five basic strokes and about 21 multi-stroke components are preferably selected to take part in coding; together they are called basic components. The multi-stroke components are all selected from the radicals of Chinese characters, so they are simple, common, intuitive and few in number. Their layout uses an entirely new method, the same-tone near-position method, which the inventor hit upon only after ten years of painstaking research, and which makes the keyboard layout reasonable and easy to memorize. Since the national language commission also treats the five basic strokes as Chinese character components, they are called single-stroke components in the present invention. The other roughly 21 preferred components consist of several strokes and are called multi-stroke components; since all of them are radicals, they are also called radicals. When the shape-part code is taken, the basic component with more strokes is coded in preference, since otherwise selecting multi-stroke components would be pointless. There are three types of shape-part coding rule:
The code-taking rule of the first type is as follows. A single-body character is coded with the codes of its first two basic components in writing order, or alternatively with the codes of its first and last basic components in writing order; when the character has only one basic component, that component's code is taken once, or twice in succession. A compound character is divided into two parts according to its overall structure: the part written first is the head and the part written afterwards is the remainder, and the codes of the first basic component of the head and the first basic component of the remainder, in writing order, are taken.
This coding rule has a weakness. After the first basic component of each character has been taken, the user must consider the character's structure, that is, decide whether it is a single-body character or a compound character, and then apply one of two different code-taking rules, which slows mental reaction. For some characters it is hard to judge whether they are compound, and some compound characters are hard to divide into two parts. Coding by left-right structure versus non-left-right structure is easier, because it is easy to tell whether a character is left-right structured: there is a gap between its left and right parts, and the character is readily divided into a left part and a right part at that gap. Characters with a left-middle-right structure are generally divided at the first gap and the middle part is counted with the right part; that is, everything except the left part counts as the right part.
The code-taking rule of the second type is as follows. A left-right structured character is coded with the codes of the first basic components, in writing order, of its left part and of its right part. A non-left-right structured character is coded with the codes of its first and last basic components in writing order; if it has only one basic component, that component's code is taken once or twice. To prevent circumvention of the patent, it may alternatively be provided that a non-left-right structured character is coded with the codes of its first and second basic components in writing order, but this provision tends to add a large number of duplicate codes.
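The second rule can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Glyph` record, the `right_start` field and the component letters used in the example are hypothetical, and a character's decomposition into basic components is assumed to be given.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Glyph:
    # Codes of the basic components in writing order (hypothetical letters).
    components: List[str]
    # Index of the first component of the right part for a left-right
    # structured character; None for a non-left-right structured character.
    right_start: Optional[int] = None

def shape_code_rule2(glyph: Glyph, double_single: bool = True) -> str:
    """Shape-part code under the second rule described in the text."""
    comps = glyph.components
    first = comps[0]
    if glyph.right_start is not None:
        # Left-right structure: first component of left part + first of right part.
        return first + comps[glyph.right_start]
    if len(comps) == 1:
        # Single component: its code taken once or twice (both are allowed).
        return first * 2 if double_single else first
    # Non-left-right structure: first + last component in writing order.
    return first + comps[-1]

print(shape_code_rule2(Glyph(["s", "a", "k"], right_start=1)))  # sa
print(shape_code_rule2(Glyph(["a", "o", "k"])))                 # ak
```

Two characters that share a phonetic component therefore get different second codes when one is left-right structured and the other is not, which is the duplicate-code argument made in the following paragraph.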
It should also be explained why not all characters are coded with the codes of their first two basic components, or with the codes of their first and last basic components. Such a rule looks simple and easy to remember, but in practice it causes a large number of duplicate codes or must be paid for with many more multi-stroke components. Why does the rule "a left-right structured character takes the codes of the first basic components, in writing order, of its left and right parts" reduce duplicate codes? Because in most such characters the left side is the radical and the right side is the phonetic component, and the phonetic component is often a single-body character indicating the sound. If the first and last basic components were taken in writing order, as in ordinary input methods, cases would arise in which the first stroke of the radical is the same as the first stroke of the phonetic component, producing a large number of duplicate codes. To reduce the duplicate codes, more radicals would have to be selected, which makes them hard to memorize. Why, then, is a non-left-right structured character coded with the codes of its first and last basic components in writing order? The answer is again to reduce the number of radicals. The first stroke and the last stroke of a phonetic component usually differ. For a given phonetic component, a left-right structured character takes the first stroke of the phonetic component as its second code, whereas a non-left-right structured character takes the last stroke of the phonetic component as its second code, so the two second codes differ and duplicate codes are better avoided.
In addition, coding a non-left-right structured character with its first two basic components in writing order would cause more duplicate codes, because many characters with top-bottom or enclosing structures share the same first two basic components but differ in the last one. Taking the last basic component in stroke order as the second code effectively reduces duplicate codes. Because this code-taking rule effectively reduces duplicate codes, the method uses far fewer radicals than input methods invented by others and needs no two-stroke or three-stroke components. It is the product of long and repeated refinement, and the duplicate-code rate is very low within the 3755 common Chinese characters, within the 6763 Chinese characters of the national standard, and within the Xinhua Dictionary.
However, this coding rule still requires the user to decide continually, during coding, whether a character is left-right structured. Although that is clear for any single character, having to decide it continually when entering a long text is still mentally tiresome. Therefore, in actual code taking, the third rule of shape-part coding is used, a method that came to the inventor in a dream a few days ago after more than twenty years of work. The first code of the shape-part code is the code of the character's first basic component in writing order, whatever the character's structure. The second code follows a rule that the inventor arrived at, after long and hard thought, in a flash of inspiration while half awake. Starting from the right side of the first basic component of the character, scan (look) from left to right. If a vertical line can divide the character into two parts without cutting any stroke, the character is left-right structured, the part to the right of the line is the right part, and the code of the first basic component of the right part, in writing order, is taken. If the character cannot be divided into two parts without cutting a stroke, scan the lower layer (lower half) of the character from left to right and take the code of the character's last basic component in writing order, or the code of the basic component in which the lower right corner of the character lies.
Scanning the lower half of the character is specified because it makes the last basic component easy to find. The lower layer is scanned from left to right so that the scanning direction is the same as for left-right structured characters and consistent with the direction in which lines of text run. This is easier on the mind than the earlier T-shaped Chinese character code, in which the eye travelled from the top down to the lower right corner, and code taking never alternates between left to right at one moment and top to bottom at the next. Scanning the lower layer or lower half of a character from left to right is unheard of among Chinese character input methods and is a major innovation.
Left-right structured characters usually have an obvious gap and are easy to recognize, so no dividing line actually needs to be drawn. For the second code it is enough to scan from left to right, starting from the right side of the first basic component, and find the gap between the left and right parts of the whole character. The part to the right of the gap is the right part, and the code of its first basic component in writing order is taken. If the character has no gap between left and right, scan (look at) the lower layer (lower half) of the character from left to right, which makes it easy to find the code of the character's last basic component in writing order.
Briefly, the first code of the shape-part code is the code of the character's first basic component in writing order. For the second code, scan the character from left to right. If the character is left-right structured, its right part will be found, and the code of the first basic component of the right part, in writing order, is taken. If no right part can be found, scan the lower layer of the character from left to right and take the code of the character's last basic component in writing order. There is no need to look directly for the lower right corner of the character as before, which easily confused the mind.
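The "vertical line that cuts no stroke" test in the third rule can be illustrated on a toy bitmap. This is a sketch under stated assumptions: the character is given as rows of text with `#` for ink, the function names are hypothetical, and the codes of the right part's first component and of the last component are assumed to be known already. It does not model the integral components that the text says must not be split.

```python
from typing import List, Optional

def find_vertical_gap(bitmap: List[str], start_col: int = 0) -> Optional[int]:
    """Return the first all-blank column at or after start_col that has ink
    on both sides, or None if no vertical line can split the glyph."""
    width = len(bitmap[0])
    has_ink = [any(row[c] == "#" for row in bitmap) for c in range(width)]
    for c in range(start_col, width):
        if not has_ink[c] and any(has_ink[:c]) and any(has_ink[c + 1:]):
            return c
    return None

def second_code(bitmap: List[str], right_first_code: str, last_code: str,
                start_col: int = 0) -> str:
    """Pick the second shape-part code: the right part's first component if a
    gap is found, otherwise the last component in writing order."""
    if find_vertical_gap(bitmap, start_col) is not None:
        return right_first_code
    return last_code

LEFT_RIGHT = ["#..##",
              "#..#.",
              "#..##"]
TOP_BOTTOM = ["#####",
              ".....",
              "#####"]

print(find_vertical_gap(LEFT_RIGHT))   # 1
print(find_vertical_gap(TOP_BOTTOM))   # None
```

In the left-right example the scan finds a blank column, so the second code comes from the right part; in the top-bottom example every column is crossed by ink, so the second code falls back to the last component.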
Note that certain components, such as "gate" used as a radical or "heart" used as the lower half of a character, are treated as integral parts and cannot be divided by a vertical line. When the last component of a character in writing order is a component such as 'love, dog, ge, shoot', etc., the second code may be taken either from the last stroke or from what remains after the last stroke is removed. Both are accepted, with hardly any effect on duplicate codes, which is one of the strengths of the invention's error-tolerant coding.
From the code-taking rule of the shape-part code it can be seen that non-left-right structured characters are slightly less convenient than left-right structured characters: a left-right structured character needs only one scan from left to right, whereas a non-left-right structured character needs a second left-to-right scan of its lower half. The invention therefore introduces an innovation: brevity (simplified) codes are assigned preferentially to non-left-right structured characters, even where their usage frequency is much lower than that of left-right structured characters. That is, when a non-left-right structured character and a left-right structured character have the same first shape-part code, the brevity code goes to the non-left-right structured character. After the sound code and the first shape-part code are typed, pressing the space key enters that character. When two or more non-left-right structured characters have the same first shape-part code, only one of them can be given the brevity code. The advantage of this provision is that non-left-right structured characters that have brevity codes need no second left-to-right scan of the lower half.
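The priority rule for brevity codes can be sketched as below. The entry fields, the placeholder character names and the frequency numbers are hypothetical. Two choices are assumptions not stated in the text: ties among characters of the same structure class are broken by higher frequency, and a left-right structured character receives the brevity code when no non-left-right structured character competes for it.

```python
from collections import defaultdict

def assign_brief_codes(entries):
    """Map each brevity code (sound code + first shape-part code) to one character."""
    groups = defaultdict(list)
    for e in entries:
        groups[e["sound"] + e["shape1"]].append(e)
    brief = {}
    for code, group in groups.items():
        # False sorts before True, so non-left-right characters win first;
        # within a class the higher frequency wins (an assumption).
        winner = min(group, key=lambda e: (e["left_right"], -e["freq"]))
        brief[code] = winner["char"]
    return brief

entries = [
    {"char": "X1", "sound": "ma", "shape1": "k", "left_right": True,  "freq": 900},
    {"char": "X2", "sound": "ma", "shape1": "k", "left_right": False, "freq": 10},
    {"char": "X3", "sound": "li", "shape1": "a", "left_right": True,  "freq": 50},
]
print(assign_brief_codes(entries))  # {'mak': 'X2', 'lia': 'X3'}
```

Here X2 takes the brevity code "mak" although X1 is far more frequent, which mirrors the stated preference for non-left-right structured characters.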
Incidentally, for most characters, taking the code of the last basic component in writing order and taking the code of the basic component in which the lower right corner lies give the same code. For a few characters the last basic component is not at the lower right corner, and from the viewpoint of visual search the basic component at the lower right corner is more convenient to take. For some characters, however, the lower right corner is not clear-cut, and taking the code of the last basic component in writing order is better. The method handles this with error-tolerant codes: either the code of the last basic component in writing order or the code of the basic component at the lower right corner may be used. This provision also serves to prevent circumvention of the patent.
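The error-tolerant provision amounts to accepting more than one full code for the same character. A minimal sketch, with a hypothetical function name and placeholder letters, could look like this:

```python
def accepted_codes(sound, first, last_in_order, lower_right=None):
    """All full codes accepted for one character: the second shape-part code
    may come from the last component in writing order or, when it differs,
    from the component at the lower right corner."""
    seconds = {last_in_order}
    if lower_right is not None:
        seconds.add(lower_right)
    return {sound + first + s for s in seconds}

print(sorted(accepted_codes("ma", "k", "i", lower_right="e")))  # ['make', 'maki']
```

When both choices give the same component, the set collapses to a single code, which is the common case the text describes.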
The inventor also found that, after compound characters are divided into two parts, cases in which compound characters with the same pronunciation and the same radical also have the same kind of basic stroke as the first stroke of the non-radical part are unexpectedly few. Only a little over 100 pairs coincide, so the duplicate-code rate is very low. This finding, together with the creative code-taking rule, is why only 5 basic strokes and about 21 multi-stroke components need to be selected to take part in coding.
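Counting duplicate-code pairs, as the figure of "a little over 100 pairs" implies, can be sketched as follows. The counting convention is an assumption: every unordered pair of characters that share a full code counts as one pair, so a group of three characters sharing a code contributes three pairs. The character names and codes are placeholders.

```python
from collections import Counter

def duplicate_pairs(codes):
    """Number of unordered character pairs that share the same full code."""
    sizes = Counter(codes.values())
    return sum(n * (n - 1) // 2 for n in sizes.values())

toy_codes = {"X1": "maka", "X2": "maka", "X3": "lio", "X4": "maka"}
print(duplicate_pairs(toy_codes))  # 3
```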
The original input method selected 28 basic components and, to make them easier to memorize, coded them by the first letter of their pinyin. Several radicals, however, share the same pinyin first letter, and there was no definite standard for which radicals were coded by their initial and which were not. The same-tone radicals are concentrated mainly on s, h, j, r, y, z and c: the radicals with first letter s are ' radicals, perpendicular, mountain and stone'; those with h are 'fire and standing grain'; those with j are ' radicals and Chinese radicals'; those with r are ' radicals and days'; those with y are 'months, radicals and fish'; those with z are 'bamboos, feet and branches'; and there are also multi-stroke components whose first letter is c.

In the original Chinese character code input method the multi-stroke components were not arranged by stroke count or by the order horizontal, vertical, left-falling, dot, turning, but by pinyin or by shape resemblance. Basic components whose pinyin first letter caused no duplicate code were placed by that letter, and basic components sharing a pinyin first letter or initial were placed by shape resemblance. But the square, stroke-built components of Chinese characters differ from Western letters and can hardly resemble them, so these placements were somewhat forced. To avoid clashes among same-tone components, some components were coded by a claimed resemblance to a letter (one was coded F, for instance), and the other same-tone radicals were placed with similarly strained reasoning.

The inventor was aware of this problem in the original Chinese character code but had no good remedy, until nearly ten years of hard exploration and a sudden inspiration produced an entirely new way of arranging same-tone radicals on the keyboard: the same-tone near-position method. Among several multi-stroke components with the same initial or pinyin first letter, the one that is easiest to memorize is coded by that initial or first letter and may be called the team leader; the rest are called team members. Basic components with the same initial or pinyin first letter are generally placed side by side in the same keyboard row. Where a stroke key or the key of another multi-stroke component intervenes, a member is separated from the leader by that key but still lies to the left or right of the leader's key. Positions are thus firmly fixed and plainly easy to find and memorize, easier than arrangements by shape resemblance, by stroke, or by table, and the method is a world first. However, which component should be the leader, which should be members, and how to arrange them on the keyboard was not quantified, and input methods of the time offered no precedent for such quantitative calculation.
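The leader and member placement described above can be sketched on a standard QWERTY layout. This is an illustration under assumptions: the stroke keys a, o, e, i, u are treated as already occupied, members take the nearest free keys in the leader's row, and ties are resolved in favor of the key further left. The function and the component names are hypothetical, and the patent's actual key assignments are not reproduced.

```python
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
STROKE_KEYS = set("aoeiu")  # keys already taken by the five basic strokes

def place_group(leader, leader_key, members, occupied=STROKE_KEYS):
    """Put the leader on its own letter key and each member on the nearest
    free key in the same keyboard row."""
    row = next(r for r in ROWS if leader_key in r)
    pos = row.index(leader_key)
    free = [k for k in row if k != leader_key and k not in occupied]
    # Stable sort: nearer keys first, left neighbor before right on ties.
    free.sort(key=lambda k: abs(row.index(k) - pos))
    layout = {leader: leader_key}
    layout.update(zip(members, free))  # extra members beyond free keys are dropped
    return layout

print(place_group("leader", "r", ["member1", "member2"]))
# {'leader': 'r', 'member1': 't', 'member2': 'w'}
```

In this example the second member lands on w: the stroke key e separates it from the leader on r, but it still sits in the same row to the leader's left, as the text describes.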
The original invention had 28 radicals and five basic strokes, 33 basic components in all, but only 30 keys, so some keys had to carry two basic components. With the spread of mobile phone touch screens, which commonly show only the 26 letter keys, a layout with two components on one key is difficult to fit onto the phone's letter keys; one basic component per letter key is therefore more reasonable. It is thus preferable to use 21 multi-stroke components and 5 basic strokes. The Chinese character code of the invention is based on the 6763 Chinese characters of the national standard, of which combined characters make up the vast majority, about 95%. A large number of combined characters, about five or six hundred pairs, share both pronunciation and radical. The 28 radicals 'Chinese character, Zhi, mu, Dou, Chinese character, Qi, woman, T, , Yue, Ri, Chong, Tu, Xia, fire, , bamboo, mountain, stone, , He, Fish, shellfish, bird, foot, Zhi, etc.' generate the most homophones; to reduce duplicate codes, these radicals were selected and each coded with a letter or another symbol. Later, only 21 multi-stroke components were kept so that they could be displayed on a mobile phone keypad and similar devices, and 7 multi-stroke components were dropped. The inventor also realized that, because pinyin is used in the invention, an ordinary user masters the pronunciation of only the 3500 or 3755 common Chinese characters in pinyin input and in daily use basically uses only those characters. It is therefore more reasonable to select components for coding according to the character-forming frequency and duplicate-code rate of the radicals within the 3500 or 3755 characters.
The character-forming frequency and duplicate-code rate of radicals among the 3500 or 3755 first-level common Chinese characters differ from those among the 6763 characters of the national standard. For example, "" has a modest character-forming frequency among the 6763 characters but a high one among the 3755 first-level common characters, so it is recommended for selection. Another radical has a fairly high frequency among the 6763 characters but not among the 3755 first-level common characters, and a lower one still among the 2500 common characters, where it is clearly below 'the keyboard, , vegetable, etc.' and slightly below 'He, , worm and stone'. 'Wang, shan and Zu' form characters very frequently in the Xinhua dictionary and fairly frequently among the 6763 national-standard characters, but not so frequently among the 3500 or 3755 characters; there they are not the equal of 'He, mountain, , worm and stone', nor are they like 'Xian and ', which are dropped from the phoneme same-tone near-bit Chinese character code. The character-forming frequency of the 'bamboo' radical among the 3755 characters is almost the same as that of the 'stone' radical, but among the 2500 common characters it is clearly lower, so 'stone' is selected and 'bamboo' is dropped; alternatively, 'stone' may be dropped and 'bamboo' selected. 'Worm' forms characters very frequently among the 6763 characters but not among the 3755, and only barely makes the list of 21 multi-stroke components. The preferred components are thus '(u), +, zhi, mu, i, u, character radicals, u, , u, g, li, fire, u, ri, u, i, , a Chinese character roll, , a grain, a worm, stone'. It is of course also possible to change "" to "", or to change 'stone' to 'mountain', 'bamboo' or "".
If 30 keys are used, 'bamboo' (or 'Guang'), which has the highest character-forming frequency, can be selected, and two further basic components can then be chosen from 'Wang, shan, Zu and '. From the standpoint of popularization, the simpler the better, and one could even discard 'ri, xi, , wu, chong, shi, and he'. Until now, the selection and rejection of these components and their placement on the keyboard rested on experience and intuition gained over years of coding work. In the latest invention, the components are analyzed quantitatively, and precise selection, rejection and scientific placement on the keyboard keys are achieved by calculation.
The original invention held that the basic strokes (horizontal, vertical, left-falling, dot (right-falling) and turning) occur very frequently in shape coding and, in order to reduce duplicate codes, should not share a key with other basic components. This is correct. However, the original invention placed four basic strokes (horizontal, vertical, left-falling and dot) on punctuation keys and coded them with punctuation marks, while the turning stroke was placed according to its pinyin first letter. At the time this was thought to help reduce duplicate codes, but further study showed the arrangement to be unreasonable. Because this Chinese character code selects few multi-stroke components, the horizontal, vertical, left-falling, dot and turning strokes are used far more often than any multi-stroke component, which is entirely unlike common input methods. Even the most common multi-stroke component, 'mouth', cannot match the coding frequency of the five basic strokes. Placing five high-frequency basic strokes on punctuation keys and on the z key, which are awkward to hit, with the z key also carrying a coded multi-stroke component, harms typing comfort and speed and is also hard to fit onto small screens such as those of mobile phones. Coding the five basic strokes by their pinyin first letters was considered later, but the keys for the left-falling and turning strokes would still be inconvenient to hit.
Placing the five basic strokes on 'd, f, g, l, ;' was also considered, but that uses a punctuation key and interferes with arranging the radicals whose pinyin first letter is s by the homophonic near-position method. More importantly, the arrangement should be compatible with the phoneme letters invented by the inventor, which represent A, O, E, I, U by the five basic strokes 'vertical stroke, downward stroke, horizontal stroke, , ', each taken from the first stroke of the corresponding letter and therefore easy to memorize. On the English keyboard, e, u, i and o are not struck with the little finger and lie on the upper row, so they are comfortable to hit; a is struck with the little finger but lies on the middle row, so it is also comfortable. The inventor further found, unexpectedly, that representing A, O, E, I, U by the five basic strokes also plays an important role in the present invention in reducing duplicate codes, and that it facilitates the quantitative analysis, selection and key placement of the radicals.
Quantitative analysis is a notable improvement of the new phoneme same-tone near-bit Chinese character code input method over the original one: 21 multi-stroke components are selected through quantitative calculation and positioned precisely on the keyboard. In detail, the most productive radicals have a very high character-forming frequency, each forming more than three hundred Chinese characters; if they were coded by strokes, a large number of duplicate codes would result, so they are selected, placed on keys and each given its own code. The multi-stroke components or radicals 'worm, woman and moon' can each form about 250 Chinese characters. The first strokes of 'worm' form 'mouth'; to avoid the large number of duplicate codes that would arise if 'worm' were coded as 'mouth', 'worm' is selected and coded with a letter of its own. If 'woman' and 'moon' were coded by strokes, they would also bring forty or fifty pairs of duplicate codes, so they too are selected and each coded with its own letter.
The components ', Huo, xi, Xia and Shi' are somewhat less productive, each forming about 200 Chinese characters. If they were coded by strokes, one multi-stroke component would bring about 40 pairs of duplicate codes; 'fire' nearly 40 pairs; 'the smart' 36 pairs; 'the different property' 41 pairs; 'stone' 35 pairs; the radical 'sun' and two further radicals 40, 47 and 36 pairs respectively; and 'wang' and "" about 20 pairs each. Because of their ability to avoid duplicate codes, these components, among them 'Shi, Ri, Huo, foot, stone and Wang', are likewise each coded with a letter of their own. In this way 21 radicals are selected, each coded with one letter. The radical 'fish' would bring 24 pairs of duplicate codes, so its ability to avoid duplicate codes is stronger than that of 'king' or ""; under the homophonic near-position method, however, it could only be placed on the ';' key and coded with the semicolon, which lies close to p and barely counts as a near position. Because the characters formed with the radical 'fish' often appear in phrases of the form 'a certain fish', such as 'carp' and 'silver carp', and only 21 radicals are selected, 'fish' is dropped.
Other radicals such as ', , , mountain, rice, bird' generate few duplicate codes. Although "" forms many characters, 210 in all, these have a top-bottom structure; "" generates 14 pairs of duplicate codes, two further radicals generate 11 and 13 pairs, 'mountain' 15 pairs, 'rice' 14 pairs and 'bird' 13 pairs, so they are dropped from the new phoneme same-tone near-bit Chinese character code. Compared with the phoneme same-tone near-bit Chinese character code input method applied for earlier, ', He and ' are dropped and 'Wang, and Zu' are selected. If the radical 'foot' were not selected, duplicate codes would easily arise with the Chinese characters whose radical is 'mouth'.
For ease of memory, most components are mapped to the key of their pinyin first letter and coded with that letter. The multi-stroke components 'Chinese characters, characters radicals, mouths, , , women, wood' and others are all coded by pinyin first letter; the remaining multi-stroke components are arranged by the homophonic near-position method, which is explained in detail below:
Quantitative analysis is also applied when components with the same initial consonant or the same pinyin first letter are arranged by the homophonic near-position method, and the calculation provides an objective and reliable basis. The Chinese characters counted are those that appear in the Xinhua dictionary app. The characters formed with a highly productive radical are not evenly distributed over the syllables beginning with each pinyin letter: among the syllables beginning with certain letters, very few characters begin with a given productive radical. If such a radical or multi-stroke component is coded with that particular letter, then, whenever the first shape code is that letter, fewer characters share both the pinyin syllable and the two shape codes, so duplicate codes for characters and words are effectively avoided. This principle is the objective basis for the quantitative calculation of the homophonic near-position method. For example, one radical has a very strong character-forming ability, yet among the Chinese characters whose pinyin first letter is f only 2 begin with it; coding that radical with f is therefore very effective in avoiding duplicate codes.
The four radicals 'xi, Shi, Pan, xi' all have the pinyin first letter s and are arranged on the keys s, d, f, g by the homophonic near-position method rather than all being coded with s. Counted by pinyin lookup, the numbers of characters they head among the syllables beginning with s are 11, 7, 18 and 32 respectively; among the syllables beginning with d, 5, 15, 11 and 12; among the syllables beginning with f, 7, 3, 2 and 15; and among the syllables beginning with g, 9, 7, 12 and 17. The third radical is weakest under the first letter f, where it heads only 2 characters, so it is most reasonably coded with f, and the shapes are also similar. 'Stone' is weakest under the first letter s, where it heads only 7 characters, so coding 'stone' with s is reasonable; many input methods also code 'stone' with s. The fourth radical heads 32 characters under the first letter s, which is too many and easily causes duplicate codes for characters and words. It was nevertheless coded with s in the original phoneme same-tone near-bit Chinese character code input method, and this error is corrected in the new one.
For the first radical, the count under d is the lowest among s, d, f and g, only 5; for the fourth radical, the count under d is only 12, also its lowest. At first glance it seems more reasonable to code the first radical with d, but this is not so: the fourth radical would then have to be coded with g, where its count is not low, whereas the first radical's count under g is 9, which is also quite low. Considering everything, and aiming for the lower total (9 + 12 = 21 against 5 + 17 = 22), it is better to code the fourth radical with d and the first radical with g. This is also easier to remember.
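The choice of keys for the four s-initial radicals can be checked mechanically. The sketch below is an illustration rather than the inventor's procedure: it assumes the goal is simply to minimize the summed counts, and it searches all 24 possible assignments instead of reasoning pairwise as the text does. The counts are the ones quoted above; only 'stone' is legibly named in the text, so the labels `radical_1`, `radical_3` and `radical_4` are placeholders, and `best_assignment` is a hypothetical helper name.

```python
from itertools import permutations

# Counts quoted in the text: for each key (pinyin first letter), the number of
# characters headed by each of the four s-initial radicals, in the order listed.
# Only "stone" is named legibly in the source; the other labels are placeholders.
RADICALS = ["radical_1", "stone", "radical_3", "radical_4"]
COUNTS = {
    "s": [11, 7, 18, 32],
    "d": [5, 15, 11, 12],
    "f": [7, 3, 2, 15],
    "g": [9, 7, 12, 17],
}


def best_assignment(radicals, counts):
    """Return (total, {radical: key}) with the smallest summed count.

    Assumption: "lower total number of characters" means the sum, over all
    radicals, of the count of characters headed by that radical under the
    key it is assigned to.
    """
    keys = list(counts)
    best = None
    for perm in permutations(keys):  # perm[i] is the key given to radical i
        total = sum(counts[key][i] for i, key in enumerate(perm))
        if best is None or total < best[0]:
            best = (total, dict(zip(radicals, perm)))
    return best


total, assignment = best_assignment(RADICALS, COUNTS)
print(total, assignment)
```

With the quoted counts the smallest total is 30, reached by placing 'stone' on s, the third radical on f, the fourth on d and the first on g. The s and f placements are the ones the text argues for directly.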
The radicals 'day' and "" (including 'person') both have the pinyin first letter r and are arranged on the keys q and r by the homophonic near-position method. The number of characters headed by 'day' under the first letter r is given as 1; the further counts given are 2 under q, 5 under q and, for "", 9 under r. From the viewpoint of a lower total number of characters, 'day' is coded with q and "" with r.
The radicals 'moon' and "" both have the pinyin first letter y and are arranged on the y and p keys by the homophonic near-position method. The counts of headed characters given are 8 under the first letter y, 3 under p, 7 under p and 16 under y. From the viewpoint of a lower total number of characters, 'moon' is coded with y and "" with p.
The radicals "" and 'foot' both have the pinyin first letter z and, under the homophonic near-position method, can only be placed on the keys l and z; l is at the right end of the second keyboard row and z at the left end of the third row, so the two can be regarded as adjacent. The other radical can only appear at the end of a Chinese character, so only the characters headed by 'foot' are counted: the count given is 11 among the characters with first letter z and 11 among the characters with first letter l. From the viewpoint of a lower total number of characters, 'foot' is therefore coded with l and the other radical with z.
The radicals '+' and 'worm' both have the pinyin first letter c and, under the homophonic near-position method, can only be placed on the two keys c and v. Because v is a vowel, only the counts of headed characters under the first letter c are considered: 3 for one radical and 11 for the other. From the viewpoint of a lower total number of characters, 'worm' is coded with c and '+' with v.
Some radicals are common but bring few duplicate codes, each removing only a little over 10 pairs, and with only 26 keys they are not selected. Selecting them would nevertheless benefit users who pursue typing speed. In the new invention, such components are double-coded: they can be coded either by strokes or as radicals, and they are inconvenient to display on small-screen keyboards such as those of mobile phones. These components are called dual components or virtual components, and may also be called dual radicals or virtual radicals. They are called virtual because they do not appear on the letter keys of small screens such as mobile phones but can be coded with punctuation keys. That is, a dual component may be coded by strokes or with a punctuation key. 'Fish' has strong character-forming ability and avoids 24 pairs of duplicate codes; it is placed on the ';' key and coded with ';'. According to the frequency of the radicals, ', , and cereal' are placed on the ',', '.' and '/' keys respectively and coded with ',', '.' and '/'; see FIG. 2. This is a great improvement over the original phoneme same-tone near-bit Chinese character code and makes high-speed input more convenient for expert users.
By selecting about 21 multi-stroke components and five basic strokes, by specifying the code-taking rule for the second shape code, by arranging the multi-stroke components with the homophonic near-position method and placing the five basic strokes on the vowel letters, and by quantitative calculation and precise positioning, the shape code is made simple and easy to remember. It effectively distinguishes homophonic characters, its duplicate-code rate is very low both among the 3500 common Chinese characters and among the 6763 Chinese characters of the national standard, and its input speed is comparable to that of input methods such as the five-stroke (Wubi) method. The method solves a problem that no other input method has solved: it is simple and intuitive, has a very low duplicate-code rate and a high input speed, is compatible with the most popular pinyin or Zhuyin input methods, and is a unique, ideal and complete Chinese character input method that can be taught to primary and secondary school students.
Drawings
FIG. 1 is the first layout diagram of the shape code keyboard.
FIG. 2 is the second layout diagram of the shape code keyboard.
FIG. 3 is the first mapping diagram of the phoneme-letter finals on the keyboard.
FIG. 4 is the second mapping diagram of the phoneme-letter finals on the keyboard.
Detailed Description
The new phoneme same-tone near-bit Chinese character code input method consists of two parts: a phonetic code, i.e. the pronunciation code, and a shape code. The two parts may be entered with the phonetic code first and the shape code second, or with the shape code first; once an order has been chosen, however, it cannot be changed. For ease of memory, for consistency with habits of thought and for full compatibility with the pinyin input method, it is recommended that the pinyin come first and the shape code second, and this order is used in the coding examples. The pinyin may be full pinyin, double pinyin, simplified pinyin or incomplete pinyin. Full pinyin uses the standard pinyin of Chinese characters; the Zhuyin (phonetic notation) input method used in Taiwan may also be used, with the part that represents tone removed, because the shape code of the invention distinguishes duplicate codes better than tone does. Double pinyin has never been widely popular because there are as many as 35 finals, which are inconvenient to arrange and memorize; in the new invention, double pinyin is therefore not recommended for users who are not professional typists. If Zhuyin is adopted, its code length is shorter: not counting tone, a syllable generally takes only two or three codes, whereas pinyin takes up to 6, so its input speed is theoretically faster than that of pinyin. However, its initials are not represented in Latin letters and its finals are not represented phonemically.
The phoneme letters invented by the inventor represent initials with Latin letters and finals phonemically. They are simple to write, convenient to display on small screens such as those of mobile phones, shorter in code length than pinyin and faster to input. Their drawback is that, when each phoneme letter takes one keystroke, punctuation or number keys are needed, and striking several punctuation or number keys is slightly inconvenient. The single-letter initials of the phoneme letters are the same as in pinyin, and the retroflex initials of pinyin can be placed on the v, u and i keys. The finals of the phoneme letters are simple and convert readily into the finals of the Scheme for the Chinese Phonetic Alphabet: one need only memorize that the eight symbols 'horizontal, vertical, horizontal, leftward, rightward, [Figure BSA0000222540200000191], フ, r' respectively represent the letters e, i, a, o, u, n, ng, r from which the finals of the Scheme are formed, and then write them in writing order. [Figure BSA0000222540200000205] may also be written as <.
The mapping between the letter, punctuation and number keys of the English keyboard, the pinyin finals and the phoneme letters is shown in FIG. 3:
Key ... pinyin final ... phoneme letter
a ... a ... [Figure BSA00002225402000002010]
left-hand or left-hand ... [Figure BSA0000222540200000206]
e
i ... [Figure BSA0000222540200000207]
v ... ü ... v
V. ao.. .. alpha, an.. [Figure BSA0000222540200000208]
; ... ang ... [Figure BSA0000222540200000209]
6 ... ou ... [Figure BSA0000222540200000201]
7 ... ong ... [Figure BSA0000222540200000202]
8 ... ei ... [Figure BSA0000222540200000203]
9 ... en ... [Figure BSA0000222540200000204]
0 ... eng ... ラ
In the drawings, the "." key is the key that bears ">".
The shape portion coding will be described in detail below.
The classification of Chinese characters is introduced first. Chinese characters can be divided into two types: single-body characters and combined (multi-body) characters. A combined character is a Chinese character with a left-right, top-bottom, enclosing or embedded structure. A single-body character consists of a single body; most are simple pictographs and indicative characters, and because they evolved from pictures, each is a whole or is formed of discrete strokes. A combined character can be divided into two parts according to its overall structure, that is, into its left and right parts, its top and bottom parts, its enclosing and enclosed parts, or its embedding and embedded parts; the two parts are called the head part and the remaining part. The part containing the first stroke in the writing order of the character is the head part, and everything other than the head part is the remaining part. This division is useful, for example, for some characters with an enclosing structure, such as 'or' and 'carrying', whose enclosing part is written in separate stages in stroke order. Because the part containing the first stroke is defined as the head part and the part not containing it as the remaining part, the head part of the first character is 'go' and the rest is its remaining part, while the remaining part of 'carrying' is 'car' and the other part is its head part. For a Chinese character with a left-middle-right or top-middle-bottom structure, the middle part may be assigned either to the remaining part or to the head part; it is generally assigned to the remaining part. In a top-middle-bottom character, this means that the middle part may be assigned either to the top part or to the bottom part.
To reduce duplicate codes, the invention also provides a division principle of character-forming priority. When a Chinese character has a top-middle-bottom or left-middle-right structure, a division in which both sides form characters is preferred ('both sides form characters first'); if no such division exists but one side can form a character, the division in which one side forms a character is preferred ('one side forms a character first'). For example, 'ying' has a top-middle-bottom structure: if '+' is taken as the head part, neither side forms a character, whereas if 'lu' is taken as the remaining part, one side forms a character, so 'lu' is taken as the remaining part. For the character 'case', if "" is taken as the head part, the two sides do not both form characters; if 'an' is taken as the head part and 'mu' as the remaining part, both sides form characters, so 'an' is the head part and 'mu' is the remaining part. Such characters can of course also be handled with fault-tolerant coding, which allows a combined character to be coded according to different divisions.
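The character-forming priority rule can be sketched as a small selection between the two possible divisions of a three-part character. This is a minimal illustration under stated assumptions: the lookup table `FORMS_CHARACTER` and the function names are hypothetical, the table contains only the entries needed for the 'case' (案) example, and ties fall back to putting the middle part with the remaining part, which is the default the description gives elsewhere.

```python
# Hypothetical lookup: component sequences that themselves form a character.
# Only the entries needed for the example 案 = 宀 + 女 + 木 are listed.
FORMS_CHARACTER = {
    ("宀", "女"): "安",
    ("木",): "木",
}


def count_formed(split, table=FORMS_CHARACTER):
    """Number of sides of a (head, rest) split that form a character."""
    return sum(1 for part in split if part in table)


def choose_division(parts, table=FORMS_CHARACTER):
    """Choose the (head, rest) split of a three-part character.

    parts: the three parts in writing order, e.g. ("宀", "女", "木").
    The split with more character-forming sides wins; on a tie the first
    candidate (middle part joins the remaining part) is kept.
    """
    first, middle, last = parts
    candidates = [
        ((first,), (middle, last)),   # middle joins the remaining part
        ((first, middle), (last,)),   # middle joins the head part
    ]
    return max(candidates, key=lambda split: count_formed(split, table))


head, rest = choose_division(("宀", "女", "木"))
print(head, rest)
```

For 案 the second candidate wins because both sides (安 and 木) form characters, which matches the division described for 'case'. A real implementation would need a full table of standalone characters, which the description does not provide.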
The best way to divide a combined character is into a shape part and a sound part, since most combined characters are pictophonetic characters. For the character 'case', for example, the sound part 'an' is taken as the head part and the meaning-bearing shape part 'wood' as the remaining part. For the character 'ying', 'lu' is taken as the remaining part, and the other part, which indicates the sound, is taken as the head part. The other combined characters are compound ideographs and can be split into two parts according to their ideographic structure. For example, 'Hu' is divided into the two parts 'alpha' and 'mu'.
Under the regulations of the national language commission, the strokes of Chinese characters are classified into five basic strokes: horizontal, vertical, left-falling, dot and turning. A stroke is a line written in one continuous motion when writing a Chinese character. When only the direction of a stroke is considered and its length is ignored, strokes fall into these five basic strokes: the rising stroke is merged into the horizontal stroke, the vertical hook into the vertical stroke, the right-falling stroke into the dot, and all other strokes containing a turn into the turning stroke. Some input methods use other names, for example calling the vertical stroke 'straight'. To reduce duplicate codes, about 21 Chinese character components that have a high character-forming or usage frequency and consist of two or more strokes are selected and placed on letter keys to take part in coding. Because they consist of two or more strokes, they are called multi-stroke components, or roots, or radicals, to distinguish them from single-stroke components, i.e. basic strokes. Multi-stroke components and single-stroke components are collectively called basic components, sometimes simply components.
The code-taking rule of the first type of shape code is as follows. For a single-body character, take the codes of the first two basic components in writing order, or alternatively the codes of the first and last basic components in writing order; when there is only one basic component, take the code of that component. For a combined character, divide it into two parts according to its overall structure, the part written first being the head part and the part written later the remaining part, and take, in writing order, the code of the first basic component of the head part and the code of the first basic component of the remaining part.
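The first code-taking rule can be sketched as a short function. This is an illustration under assumptions, not the patent's implementation: the table `KEY_OF` is hypothetical and lists only four components that the description says are coded by the first letter of their pinyin (kou, nü, mu, shi), whereas the real layout is the one in FIG. 1; the function name, the `single_uses_last` switch and the three-component single-body example are also invented for illustration.

```python
# Illustrative component-to-key table (assumption). The description says these
# components are coded by the first letter of their pinyin: kou, nü, mu, shi.
KEY_OF = {"mouth": "k", "woman": "n", "wood": "m", "stone": "s"}


def shape_code_type1(head, rest=None, key_of=KEY_OF, single_uses_last=False):
    """Shape code under the first code-taking rule.

    head: basic components of the head part (or of the whole single-body
          character), in writing order.
    rest: basic components of the remaining part, or None for a single-body
          character.
    single_uses_last: for single-body characters, take first + last component
          instead of the first two (the alternative the rule allows).
    """
    if rest:  # combined character: first component of each part
        return key_of[head[0]] + key_of[rest[0]]
    if len(head) == 1:  # single-body character with a single basic component
        return key_of[head[0]]
    second = head[-1] if single_uses_last else head[1]
    return key_of[head[0]] + key_of[second]


# Combined character made of "woman" (head) and "mouth" (remaining part).
print(shape_code_type1(["woman"], ["mouth"]))
# Single-body character with only one basic component.
print(shape_code_type1(["wood"]))
```

The first call yields "nk" and the second "m". Whichever variant is chosen for single-body characters (first two, or first and last) has to be applied consistently, since the two variants give different codes for characters with three or more components.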
In long-term coding research, the inventor realized that whether a Chinese character has a left-right structure is clear at a glance and that a left-right character is easily divided into two parts at the gap, whereas top-bottom and enclosing characters are sometimes not easy to divide, and it is sometimes hard even to tell whether a character is a single-body character, a top-bottom character or an enclosing character. Classifying characters simply by whether or not they have a left-right structure is the easiest to learn. When a character with a left-middle-right structure is encountered, the middle and right parts together count as the right part.
If all Chinese characters are divided into left-right and non-left-right structures, they can still be coded using Figures 1 and 3; that is, the selected pinyin, basic components, and codes remain unchanged. The code again consists of a phonetic (pinyin) code and a shape code. The code-taking rule of the shape code is as follows: for a character with a left-right structure, take the codes of the first basic component, in writing order, of the left part and of the right part; for a character without a left-right structure, take the codes of the first and last basic components in writing order, and if the character has only one basic component, take the code of that component once or twice. Under this rule a non-left-right character must not be coded with its first two basic components, because that would cause duplicate codes; the codes of its first and last basic components in writing order should be taken instead. Whether a character has a left-right structure is clear and unambiguous, except for a few characters such as 'shun, Chuan, Zhou, Er': a left-right character normally shows a gap between its left and right parts, and a vertical line through that gap divides the character in two.
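The second-type rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Decomposed` record, its `right_start` field, and the sample decompositions are hypothetical helpers, not part of the patent, and the sketch assumes the left part is written first so that the first code in writing order belongs to the left part.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Decomposed:
    """Hypothetical pre-split character: key codes of its basic components in writing order."""
    codes: List[str]
    # Index of the first component of the right part; None if the character is not left-right.
    right_start: Optional[int] = None


def second_type_shape_code(ch: Decomposed, repeat_single: bool = False) -> str:
    """Return the shape code under the second-type rule described in the text."""
    if not ch.codes:
        raise ValueError("a character needs at least one basic component")
    first = ch.codes[0]
    if ch.right_start is not None:
        # Left-right structure: first component of the left part + first component of the right part.
        return first + ch.codes[ch.right_start]
    if len(ch.codes) == 1:
        # Single component: take its code once, or twice if a uniform length is wanted.
        return first * 2 if repeat_single else first
    # Non-left-right structure: first and last components in writing order.
    return first + ch.codes[-1]


# Illustrative decompositions (assumed, only the first/last codes matter here).
print(second_type_shape_code(Decomposed(["d", "u", "o"], right_start=1)))
print(second_type_shape_code(Decomposed(["o", "u", "e"])))
```

The structural decision (left-right or not) is passed in as data on purpose, since the text treats it as something the user sees at a glance rather than something computed.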
Among characters with a left-right structure, one occasionally meets individual characters such as 'Chuan', 'shun', and 'state'. 'Chuan' consists of discrete strokes and is treated as a single-body character. 'shun' consists of discrete strokes plus one character component; it is generally recommended that all the discrete strokes be counted as the left part and the other component as the right part, so that for 'shun' the left part is 'Chuan' and the right part is 'page'. The input method is highly fault-tolerant, however: taking only the leftmost left-falling stroke as the left part and the remainder as the right part is also accepted. In addition, "" cannot be bisected by a vertical line.
To reduce unnecessary duplicate codes, it may also be specified that, for a small number of center-of-gravity characters, the second code of the shape code is the code of the first or last basic component of the part where the center of gravity lies; taking the last basic component of that part is recommended. A center-of-gravity character is a pictophonetic character whose meaning-indicating radical sits in the middle or at the end of the character, such as 'win', 'carry', 'Ying', and 'Comte', for which the second shape code can be the code of the basic component 'woman', where the center of gravity lies. Similarly, for the character 'fluoro', the second shape code can be the code of the basic component 'fire', since the part of 'fluoro' other than 'fire' is actually phonetic. A character whose left part is the same as its right part has its center of gravity in the middle part, so its second shape code can be the code of the last basic component of the middle part; for the character rendered 'left-falling', for example, the second shape code can be that of the last basic component of the middle part. For characters with left, middle, and right parts, the shape code (or auxiliary code) is usually taken from the 'bird' part, and the second code is taken according to the position of the center of gravity.
The last basic component of a Chinese character is almost always in the lower layer of the character, generally at the lower right corner. The exceptions are components such as 'pu' and 'ge', in which the dot at the upper right corner is, by writing order, the last component. Therefore, when a character containing such a component is encountered, the dot at the upper right corner may be ignored as an error-tolerant code; that is, the vertical stroke and the left-falling stroke, respectively, are treated as the last strokes of these two components.
The preferred arrangement of the 21 multi-stroke components and the five basic strokes on the keyboard is shown in Figure 1. The mapping between the 21 multi-stroke components, the five basic strokes, and the letter keys is set as follows:
a-left-falling b- c-insect d-Zhi e-horizontal f-inverter g-zhu h-fire i-vertical j-Chinese character k-mouth l-foot m-wood n-female o-dot p- q-day r- s-stone t-soil u-turning v-triglycidyl w-king x- y-month z-lateral word.
The multi-stroke components and the basic strokes are each coded with the corresponding letter according to this mapping.
In more detail: for ease of memorization, and taking into account character-forming frequency, duplicate-code rate, and similar factors, the invention has already described a quantitative method for arranging multi-stroke components that share the same pinyin initial letter. As a memory aid, "people, moon, stone, "", and worms" can be regarded as the formation, memorized with the phrase "people find the worms walking on the moon stone". The remaining radicals with the same initial sound are regarded as team members.
A preferred arrangement of the 25 multi-stroke components and the five basic strokes on the keyboard is shown in Figure 2. The mapping between the 25 multi-stroke components, the five basic strokes, and the letter and punctuation keys is set as follows:
a-left-falling b- c-insect d-wind e-horizontal f-inverter g-hot-row h-fire i-vertical j-Chinese character k-mouth l-foot m-wood n-female o-dot p- q-day r- s-stone t-soil u-turning v-triglycidyl w-king x- y-month z-lateral; -fish, - . /-standing grain
Some radicals change shape slightly when they form characters, and the traditional and simplified forms of a radical also differ. Such variants must be treated as the same basic component and coded with the same letter; examples are the variant radical forms that correspond to 'people', 'characters', 'gold', 'water', 'hands', and 'heart', which share a common origin. A basic component may also cover individual components that merely resemble it, coded with the same letter. For example, the component "soil" may also cover "soldiers": since the two differ only in stroke length, giving them the same code better matches the user's intuitive reaction. The different from each other is also called the different, this is the different from.
When codes are taken according to the second type of shape coding, the problem of deciding whether a character has a left-right structure remains. Coding according to the third type of shape-coding rule is simple and easy to remember, so the coding examples below adopt that rule together with the layout shown in Figure 1. For the phonetic code, both pinyin and phoneme letters are listed for selection.
Examples of encoding. For the character "Han", the initial is h and the final is an, so the phonetic code is han. Its first basic component in writing order is a multi-stroke component whose code is d. The character has a left-right structure, and the first stroke of its right part is a turning stroke, whose code is u, so the code of the character is handu. If phoneme letters are used, the final is typed with the "," key, so the phonetic code is "h," and the code of "Han" is "h,du". For the character "zi", which does not have a left-right structure, the first basic component is the dot, code o, and the last basic component is the horizontal stroke, code e, so the character is coded zioe; with phoneme letters the phonetic code is also zi, so the code is again zioe. For the character 'this', the full pinyin is zhe; its first basic component in writing order is the dot, code 'o', and, as a non-left-right character, its last basic component in writing order has the code 'z', so the shape code is 'oz' and the full code is 'zheoz'. Because the retroflex sound matters little here, and southern speakers often do not pronounce it, the retroflex can be dropped and the code becomes 'zeoz'. For the character "wood", the double-pinyin phonetic code is mu; the character has only one basic component, "wood", whose code is m, so the shape code of "wood" is "m" and the code of the character is mum.
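The worked examples follow one pattern: the full code is the phonetic code followed by the shape code, with an optional step that drops the retroflex "h". A minimal Python sketch of that composition is given below. The function names are hypothetical, and extending the retroflex step from zh to ch and sh is an assumption; the text only shows zhe becoming ze.

```python
def strip_retroflex(phonetic: str) -> str:
    """Drop the 'h' of a retroflex initial, e.g. zhe -> ze.

    The text only shows zh; treating ch and sh the same way is an assumption.
    """
    if phonetic[:2] in ("zh", "ch", "sh"):
        return phonetic[0] + phonetic[2:]
    return phonetic


def full_code(phonetic: str, shape: str, drop_retroflex: bool = False) -> str:
    """Phonetic code first, shape code second, as in the coding examples."""
    if drop_retroflex:
        phonetic = strip_retroflex(phonetic)
    return phonetic + shape


print(full_code("han", "du"))                       # full pinyin
print(full_code("h,", "du"))                        # phoneme-letter / double-key form
print(full_code("zi", "oe"))
print(full_code("zhe", "oz", drop_retroflex=True))
```

Keeping the retroflex step optional mirrors the text, which presents it as something that "can" be done rather than a fixed rule.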
To obtain a uniform code length, it may also be specified that a character having only one basic component takes, as the second code of its shape code, the code of its first stroke or of its last stroke, or repeats the code of the basic component. The present coding examples do not adopt this rule.
The layout of Figure 3 uses the number keys, which requires reaching across rows and is inconvenient. The w and y keys are free, the initial p has a very low frequency in Chinese so that placing a final on the p key hardly produces duplicate codes for words, and the same holds for the n and r keys. The finals ei, en, eng, ou, and ong are therefore arranged on the w, r, y, n, and p keys respectively. The resulting mapping between the letter, punctuation, and number keys of the English keyboard and the pinyin finals and phoneme-letter finals is shown in Figure 4:
[Figure 4: key ... pinyin final ... phoneme-letter final. The phoneme-letter glyphs are images in the original publication; the rows for o, e, i, ao, and ai did not survive as text.]
a ... a
v ... ü
, ... an
; ... ang
n ... ou
p ... ong
w ... ei
r ... en
y ... eng
The frequency difference between the initial consonants k and r in fig. 4 is not large, and the key r in the figure can be replaced by a key k.
The arrangement of Figure 4 is more regular: the finals are divided into an a region, an o region, and an e region according to their first pinyin letter, and within each region they are arranged in the order a, o, e, i, u, n, ng. The a region contains ao, ai, an, and ang, which are placed on four punctuation keys. The o region contains ou and ong, which are placed on the n and p keys, or alternatively on the k and p keys. The e region contains ei, en, and eng, which are placed on the w, r, and y keys respectively. This matches keystroke habits and is convenient to type. Finals with higher frequency are placed on keys that are easier to strike: for example, en and ou, which are frequent in Chinese, are placed on the r and n keys, which are struck by the index fingers, while the less frequent finals beginning with e or o are placed on the other keys.
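As a rough illustration of how such a layout turns a syllable into a two-key phonetic code, the following Python sketch encodes only the assignments stated in the text (ei, en, eng, ou, ong on w, r, y, n, p; ang on ";"; an on ","; ü on v). The names `FINAL_KEYS` and `double_key_code` are hypothetical, and mapping the simple finals a, o, e, i, u to their own letter keys is an assumption. Finals whose keys are not stated here, such as ao and ai, are deliberately left out.

```python
# Final -> key assignments taken from the text; nothing else is filled in.
FINAL_KEYS = {
    "ei": "w", "en": "r", "eng": "y",   # e region
    "ou": "n", "ong": "p",              # o region
    "an": ",", "ang": ";",              # part of the a region
    "ü": "v",
}

# Assumption: single-letter finals are typed with their own letter key.
SIMPLE_FINALS = set("aoeiu")


def double_key_code(initial: str, final: str) -> str:
    """Return the two-key phonetic code for a single-letter initial plus a final."""
    if final in FINAL_KEYS:
        return initial + FINAL_KEYS[final]
    if final in SIMPLE_FINALS:
        return initial + final
    raise KeyError(f"key for final {final!r} is not given in this sketch")


print(double_key_code("h", "an"))   # the 'Han' example
print(double_key_code("f", "en"))
```

Raising an error for unknown finals, rather than guessing a key, keeps the sketch limited to what the description actually states.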
For the few users who do not wish to memorize the multi-stroke components, the shape code can also use pure strokes: according to the regulations of the National Language Commission, the strokes of Chinese characters are reduced to the five basic strokes horizontal (including rising), vertical, left-falling, dot (including right-falling), and turning. After the pinyin of a character is entered, the codes E, I, A, O, and U that correspond to these five basic strokes are entered in writing order. In this case the shape code may be 2 codes long, or of indefinite length, that is, all strokes of the character are coded.
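The pure-stroke variant is simple enough to sketch directly. In the Python below, the stroke names and the function `stroke_shape_code` are illustrative choices, and the stroke sequence of a character is assumed to be supplied by the caller; the letter assignment (horizontal e, vertical i, left-falling a, dot o, turning u) and the merging of rising, vertical-hook, and right-falling strokes follow the description.

```python
from typing import List, Optional

# Stroke class -> key, with the merged stroke types folded in.
STROKE_KEY = {
    "horizontal": "e", "rising": "e",
    "vertical": "i", "vertical hook": "i",
    "left-falling": "a",
    "dot": "o", "right-falling": "o",
    "turning": "u",
}


def stroke_shape_code(strokes: List[str], max_len: Optional[int] = 2) -> str:
    """Code a character by its strokes in writing order.

    max_len=2 gives the two-code form; max_len=None codes every stroke.
    """
    letters = "".join(STROKE_KEY[s] for s in strokes)
    return letters if max_len is None else letters[:max_len]


# "wood": horizontal, vertical, left-falling, right-falling.
wood = ["horizontal", "vertical", "left-falling", "right-falling"]
print(stroke_shape_code(wood))                # two-code form
print(stroke_shape_code(wood, max_len=None))  # all strokes
```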
To improve input speed, brevity codes are designed for frequently used characters. A brevity code means that, for a common Chinese character, only the first 1, 2, or 3 codes of its complete code are typed, followed by one press of the space key. Because the phonetic code comes first and the shape code second, many characters can be entered with their brevity code alone, so single-character coding relies mainly on the phonetic code, with the shape code serving as an auxiliary code.
Because Chinese has only about four hundred pinyin syllables, there are only about four hundred second-level brevity codes for single characters, whereas the two-key coding space of the invention is 729. The remaining three hundred or so codes can therefore be assigned to brevity-code words to further increase typing speed. For example, pinyin has no syllable of the form kian, and the double-pinyin code has no form ky; since k and y are the initials of "may" and "yes" respectively, ky can be used as the code for "ok". Because the input method provides more than three hundred brevity-code words, and entering phrases is in theory faster than entering single characters, the input speed of Chinese characters can be improved noticeably. After the keys of the brevity code of a character or phrase are struck, pressing the space key enters the corresponding character or phrase.
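The counting argument can be made concrete with a small Python sketch. The text gives the figure 729 without deriving it; the sketch assumes it arises from 27 keys in each of two positions (the 26 letters plus one punctuation key), which is only one reading. The `used` set below is a tiny hypothetical sample, not the real syllable inventory.

```python
from itertools import product
import string
from typing import Iterable, Set

# Assumption: 27 keys per position (26 letters + ';'), so 27 * 27 = 729 codes.
KEYS = string.ascii_lowercase + ";"


def all_two_key_codes() -> Set[str]:
    """Every possible two-key code over the assumed key set."""
    return {a + b for a, b in product(KEYS, repeat=2)}


def free_codes(used: Iterable[str]) -> Set[str]:
    """Two-key codes not occupied by a syllable, available for brevity-code words."""
    return all_two_key_codes() - set(used)


used = {"mu", "zi", "fr"}          # hypothetical sample of occupied syllable codes
spare = free_codes(used)
print(len(all_two_key_codes()))    # size of the coding space
print("ky" in spare)               # 'ky' is not a syllable code, so it is free
```

With a full table of roughly four hundred syllable codes in place of `used`, the same subtraction would leave the three hundred or so spare codes the description mentions.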
Word input is the most common way to increase Chinese character input speed. Because the phonetic code comes first and the shape code second, words are entered entirely with phonetic codes, which may use full pinyin or double pinyin. Taking Hanyu Pinyin as an example, the rules are as follows:
a. Two-character words: take the codes of the initial and final of each character in sequence; for example, the code of "encoding" is bianma.
b. Three-character words: take the initial (or first pinyin letter) of each character in sequence, then press the space key; for example, the code of "computer" is "jsj". Alternatively, it may be specified that the first code (the initial) of the first and second characters is taken, followed by the first two codes of the third character; or that the first two codes of the first character are taken, followed by the first codes (the initials) of the second and third characters.
c. Words of four or more characters: take the initial codes of the first three characters and of the last character in sequence. For example, "science and technology" is a four-character word, so the initial of each character is taken, giving "kxjs". Likewise, for "Xinjiang Uygur autonomous region" the code "xjwq" takes the initials of the first three characters and of the last character.
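Rules a to c can be sketched as a single function. The Python below is an illustration only: `word_code` is a hypothetical name, it uses full pinyin for two-character words, it takes the first pinyin letter as the initial, and it implements only the first of the three-character variants. The trailing space key is a confirmation keystroke and is not part of the returned code.

```python
from typing import List


def word_code(syllables: List[str]) -> str:
    """Phonetic code of a word given the pinyin of each character."""
    n = len(syllables)
    if n < 2:
        raise ValueError("word input needs at least two characters")
    if n == 2:
        # Rule a: initial and final of each character.
        return "".join(syllables)
    if n == 3:
        # Rule b: first letter of each character (space is pressed afterwards).
        return "".join(s[0] for s in syllables)
    # Rule c: first letters of the first three characters and of the last one.
    return "".join(s[0] for s in syllables[:3]) + syllables[-1][0]


print(word_code(["bian", "ma"]))
print(word_code(["ji", "suan", "ji"]))
print(word_code(["ke", "xue", "ji", "shu"]))
print(word_code(["xin", "jiang", "wei", "wu", "er", "zi", "zhi", "qu"]))
```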
With software implementing the new phoneme homophonic near-bit Chinese character code input method, input is completed by striking, on the computer keyboard, the keys of the code of a character or phrase. Generally, a character or phrase that has no duplicate code and has reached the specified code length is committed to the screen automatically; when the code is shorter than the specified length, the space key is pressed; and characters or phrases with duplicate codes are selected from the prompt line. When the phonetic code uses double pinyin, the maximum code length is four keys; when it uses full pinyin, the code length is variable. Character codes and phrase codes are compatible in the invention.
The shape code can also be used independently as an input method. The code length is then at most 2 codes, and after the shape code is entered the character is selected from the prompt line. When the shape code is not used as an independent input method, it can still be entered on its own by prefixing a leading symbol such as v, after which the character is selected from the prompt line.
At present many Chinese characters are entered by voice or by pinyin, and because Chinese has many homophones, homophone errors occur easily. The input method software therefore provides a homophone correction function. After entering this function, the user moves the cursor to just before or just after the wrong homophone; whether the cursor is placed before or after the character must be fixed by a uniform convention. The software then identifies the pronunciation of that character automatically, so the phonetic code need not be typed: the user enters only the shape code, which completes the full code of the intended character. If the code has no duplicates, the original character is replaced automatically; in the few cases with duplicate codes, the character chosen from the prompt line replaces the wrongly entered character.
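A toy Python sketch of this correction flow is shown below. Everything in it is hypothetical scaffolding: the two lookup tables are made-up sample data (the "du" code matches the 'Han' coding example, assumed here to be 汉, while the "de" code for 汗 is invented for illustration), and `index` stands for the position of the wrong character once the cursor convention has been resolved.

```python
from typing import Dict, List, Tuple

# Hypothetical sample data; a real system would load full dictionaries.
PINYIN_OF: Dict[str, str] = {"汉": "han", "汗": "han"}
CODE_TABLE: Dict[Tuple[str, str], List[str]] = {
    ("han", "du"): ["汉"],
    ("han", "de"): ["汗"],   # illustrative shape code, not from the source
}


def correct_homophone(text: str, index: int, shape: str, choice: int = 0) -> str:
    """Replace text[index] with the homophone selected by a shape code.

    The pronunciation comes from the wrong character itself, so only the
    shape code is typed. `choice` picks from the prompt line on duplicates.
    """
    pinyin = PINYIN_OF[text[index]]
    candidates = CODE_TABLE.get((pinyin, shape), [])
    if not candidates:
        return text                      # unknown code: leave the text unchanged
    return text[:index] + candidates[choice] + text[index + 1:]


print(correct_homophone("大汗", 1, "du"))
```

Looking up the pronunciation from the existing character is the step that saves keystrokes: the user never retypes the phonetic code.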
Because the phonetic code comes first, the method is fully compatible with the pinyin input method and the phonetic notation (zhuyin) input method. To ease adoption and preserve this compatibility, the invention uses a two-color candidate technique: after letters are typed, characters and phrases appear in the candidate window for selection. Characters and phrases that do not use the shape code are shown in one color, such as green, and characters that use the shape code, that is, the Chinese character code, are shown in another color, such as black. After black candidates have been entered several times, the system assumes that the user understands the Chinese character code technique and gives priority to input by Chinese character code, which increases speed.
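The two-color behavior can be illustrated with a small Python sketch. The class names, the `threshold` of three selections (the text only says "several times"), and the ranking policy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    uses_shape_code: bool

    @property
    def color(self) -> str:
        # Shape-coded candidates are black, pinyin-only candidates green.
        return "black" if self.uses_shape_code else "green"


class CandidateRanker:
    """Promotes shape-coded candidates once the user has picked enough of them."""

    def __init__(self, threshold: int = 3) -> None:   # threshold is hypothetical
        self.threshold = threshold
        self.black_selections = 0

    def record_selection(self, cand: Candidate) -> None:
        if cand.uses_shape_code:
            self.black_selections += 1

    def rank(self, cands: List[Candidate]) -> List[Candidate]:
        if self.black_selections < self.threshold:
            return list(cands)
        # Stable sort: black candidates first, original order otherwise kept.
        return sorted(cands, key=lambda c: not c.uses_shape_code)


ranker = CandidateRanker(threshold=2)
cands = [Candidate("A", False), Candidate("B", True), Candidate("C", False)]
ranker.record_selection(cands[1])
ranker.record_selection(cands[1])
print([c.text for c in ranker.rank(cands)])
```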
For convenience of use, error-tolerant codes are also provided: for characters whose codes are easily mistaken, the intended character still appears when the erroneous code is entered.
It should be noted that letters in the specification, claims, and drawings appear in both upper and lower case, and that upper-case and lower-case letters are equivalent.

Claims (8)

1. A computer keyboard input method for Chinese character coding, namely a new phoneme homophonic near-bit Chinese character code input method, in which the strokes of Chinese characters are classified into five basic strokes (horizontal, vertical, left-falling, dot, and turning) according to the specification of the National Language Commission, characterized in that:
(1) the code consists of two parts, one being the phonetic code (pinyin) and the other the shape code; either part may be placed first, but once the order has been selected it cannot be changed;
(2) the phonetic code may use full pinyin, double pinyin, simple pinyin, or incomplete pinyin, and may also use Taiwan phonetic notation or phoneme letters; one mapping between the letter, punctuation, and number keys and the pinyin finals and phoneme-letter finals is as follows:
Figure FSA0000222540190000011
another mapping between the letter, punctuation, and number keys and the pinyin finals and phoneme-letter finals is as follows:
Figure FSA0000222540190000012
(3) the first type of shape-coding rule is: a single-body character is coded with the codes of its first two basic components in writing order, or with the codes of its first and last basic components in writing order; when it has only one basic component, only the code of that component is taken; it may also be specified that a single-body character is coded with the codes of its first and last basic components in writing order;
a compound character is divided into two parts according to its overall structure; the part containing the first stroke of the character in writing order is the head part and the part written later is the remaining part, and the codes of the first basic component of the head part and of the first basic component of the remaining part are taken, each in writing order;
the second type of shape-coding rule is: a character with a left-right structure is coded with the codes of the first basic component, in writing order, of the left part and of the right part; a character without a left-right structure is coded with the codes of its first and last basic components in writing order, and if it has only one basic component, the code of that component is taken once or twice in succession; or it is specified that a character without a left-right structure is coded with its first basic component in writing order and the basic component at its lower right corner (for an enclosing structure, the lower right corner inside the enclosure);
the third type of shape-coding rule is: for the first code of the shape code, first take, without any other consideration, the code of the first basic component of the character in writing order; for the second code, scan from left to right starting at the right side of the first basic component; if a vertical line can divide the character in two without cutting any stroke, the character has a left-right structure, the part to the right of the vertical line is its right part, and the code of the first basic component of the right part in writing order is taken; if no vertical line can divide the character in two without cutting a stroke, scan the lower half of the character from left to right and take the code of the last basic component of the character in writing order, or the code of the basic component at the lower right corner of the character; characters with a left-right structure usually have an obvious gap and are easy to recognize, so no vertical line actually needs to be drawn: for the second code it is enough to scan from left to right starting at the right side of the first basic component and find the gap between the left and right parts of the whole character, the part to the right of the gap being the right part, whose first basic component in writing order gives the code; if the character has no gap between left and right, the lower half (or lower layer) of the character is scanned from left to right, and the last basic component of the character in writing order is thereby readily found and its code taken;
in brief, the first code of the shape code is the code of the first basic component of the character in writing order; for the second code, the character is scanned from left to right: if the character has a left-right structure, the first basic component of its right part will be found, and the code of that component in writing order is taken; if no right part can be found, the lower half of the character is scanned from left to right and the code of the last basic component of the character in writing order is taken;
(4) when the above shape-coding rules are used, five basic strokes and 21 multi-stroke components are preferably selected to take part in coding; the multi-stroke components are selected mainly by their character-forming frequency and duplicate-code rate within the 3755 common Chinese characters, and are coded by their pinyin first letter or initial; when several multi-stroke components share the same pinyin first letter, they are arranged by the same-tone near-position method with precise positioning calculation; the method relies on the fact that, among the pinyin syllables beginning with certain letters, few characters begin with radicals of strong character-forming capability, so that coding radicals or multi-stroke components of strong character-forming capability that share a pinyin first letter with such a specific letter effectively avoids duplicate codes for words;
one mapping relation of 21 basic components, five basic strokes and letter keys is set as:
Figure FSA0000222540190000041
another mapping between the 25 multi-stroke components, the five basic strokes, and the letter and punctuation keys is set as follows:
Figure FSA0000222540190000042
2. The new phoneme homonymous near-bit Chinese character code input method as claimed in claim 1, wherein: when arranging other multi-stroke components that have the same pinyin first letter or initial as gold, water, fire, , human, and moon, the same-tone near-position method is used, that is, a multi-stroke component with the same pinyin first letter is placed near the key of the component it shares the letter with; because the letter keys of the keyboard are divided into three rows, it is generally placed to the left or right of that component in the same row.
3. The new phoneme homonymous near-bit Chinese character code input method as claimed in claim 1, wherein: the basic components of the Chinese character are selected from radicals of Chinese characters such as Zhi, , kou, mu, Dou, Chinese character radicals, Qi, T, Wan, Chong, Tu, Xian, fire, Ri, Shi, Wang, , Zu, etc.
4. The new phoneme homonymous near-bit Chinese character code input method as claimed in claim 1, wherein: brevity codes are designed for frequently used characters; for a commonly used Chinese character, the first 1, 2, or 3 codes of its complete code are taken and followed by one space key to form the brevity code.
5. The new phoneme homonymous near-bit Chinese character code input method as claimed in claim 1, wherein: when the first code of the shape part code of the Chinese character with the non-left and right structures is the same as the first code of the shape part code of the Chinese character with the left and right structures, the Chinese character with the non-left and right structures is preferably abbreviated, and the Chinese character with the left and right structures can be input only by inputting the first code of the shape part code and knocking a space key after the sound code of the Chinese character is input.
6. The new phoneme homonymous near-bit Chinese character code input method as claimed in claim 1, wherein the word input steps are:
for two-character words, the codes of the initial and final of each character are taken and entered in sequence;
for three-character words, the code of the initial of each character is taken and entered in sequence, followed by a space;
for words of four or more characters, the initial codes of the first three characters and of the last character are taken and entered in sequence.
7. The new phoneme homonymous near-bit Chinese character code input method as claimed in claim 1, wherein: the input method software provides a homophone correction function; after this function is entered, the cursor is moved to just before or just after the wrong homophone, the choice between before and after being fixed by a uniform convention; the software then identifies the pronunciation of the character automatically, so that the phonetic code need not be entered and only the shape code is entered, which completes the full code of the intended character; when there is no duplicate code the original character is replaced automatically, and in the few cases with duplicate codes the character selected from the prompt line automatically replaces the wrongly entered character.
8. The new phoneme homonymous near-bit Chinese character code input method as claimed in claim 1, wherein: a two-color candidate technique is used, that is, after letters are typed, characters and phrases appear in the candidate window for selection; characters and phrases that are not coded with the shape code are shown in one color, such as green, and characters coded with the shape code, that is, the Chinese character code, are shown in another color, such as black; after black candidates have been entered several times, the system assumes that the Chinese character code technique is understood and gives priority to input by Chinese character code, thereby increasing speed.
CN202011143150.XA 2020-10-16 2020-10-16 New phoneme same-tone near-bit Chinese character code input method Pending CN112783336A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011143150.XA CN112783336A (en) 2020-10-16 2020-10-16 New phoneme same-tone near-bit Chinese character code input method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011143150.XA CN112783336A (en) 2020-10-16 2020-10-16 New phoneme same-tone near-bit Chinese character code input method

Publications (1)

Publication Number Publication Date
CN112783336A true CN112783336A (en) 2021-05-11

Family

ID=75750965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011143150.XA Pending CN112783336A (en) 2020-10-16 2020-10-16 New phoneme same-tone near-bit Chinese character code input method

Country Status (1)

Country Link
CN (1) CN112783336A (en)

Similar Documents

Publication Publication Date Title
KR101290071B1 (en) Man-machine interface for real-time forecasting user's input
CN103235696B (en) It is a kind of based on the rapid pinyin input system with touch sensible equipment
CN111880667A (en) Phoneme same-tone near-bit common Chinese character code input method
CN100462901C (en) GB phoneticize input method
CN103488426B (en) Virtual keyboard based on touch screen and input method
CN103616960A (en) Six vowel binary syllabification input method
CN103257715A (en) Pinyin keyboard and input method based on pinyin keyboard
CN103257720B (en) A kind of input method of Chinese character
CN106168858A (en) 26 radical radical and stroke Chinese-character input methods
CN112783336A (en) New phoneme same-tone near-bit Chinese character code input method
CN109308130A (en) A kind of text input and edit methods for digital equipment
CN102511021B (en) Number-order-code-element keyboard and information input method thereof
CN103207684A (en) Phonemic letter double-input method
CN105404402A (en) Chinese character input method applicable to touch screen
CN117111752A (en) New homophonic near-bit Chinese character code input method
CN109739365A (en) Middle and primary schools' teaching Multi-Function Keyboard and professional input Multi-Function Keyboard
CN101957662B (en) Computer with Chinese character elements as well as cell phone keypad for inputting Chinese characters and input method
CN103744535B (en) Homophone Wubi input method
CN1694046A (en) Computer coding Chinese character keyboard input method and information code
CN103616961A (en) Phoneme T-shaped Chinese character code input method
CN115047980A (en) Non-split Chinese character input integrated system capable of accurately inputting Chinese characters
CN106406558B (en) Number key realizes Two bors d's oeuveres, five-stroke etymon input and companion keyboard
CN103941882A (en) T-shaped Chinese character code input method
CN105892704B (en) The first sum of phonemic alphabet phonetic input method
CN111611773A (en) Digital coding method for Chinese and foreign languages and its use

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination