Chinese character encoding and decoding method and system based on Chinese pinyin and Chinese character characteristics

Info

Publication number
CN121881982A
CN121881982A CN202610021821.6A CN202610021821A CN121881982A CN 121881982 A CN121881982 A CN 121881982A CN 202610021821 A CN202610021821 A CN 202610021821A CN 121881982 A CN121881982 A CN 121881982A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202610021821.6A
Other languages
Chinese (zh)
Inventor
王文涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shengfangsi Information Technology Co ltd
Original Assignee
Guangzhou Shengfangsi Information Technology Co ltd

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

This application provides a Chinese character encoding and decoding method and system based on Chinese pinyin and Chinese character characteristics. The method includes: when the keyboard is in a first state, generating a corresponding initial consonant brevity code according to a first position triggered on the keyboard; when the keyboard is in a second state, generating a corresponding final simple code according to a second position and a third position triggered on the keyboard and their triggering order; in any keyboard state, if a currently triggered fourth position belongs to an auxiliary input area of the keyboard, generating a corresponding Chinese character initial stroke code and switching the keyboard state to an auxiliary input state; when the keyboard is in the auxiliary input state, generating a corresponding Chinese character feature code according to a fifth position triggered in the auxiliary input area; and combining the generated initial consonant brevity code, final simple code, Chinese character initial stroke code and Chinese character feature code, in the order in which they were generated, into a Chinese character code, thereby improving the matching accuracy between codes and Chinese characters and the user's input experience.

Description

Chinese character encoding and decoding method and system based on Chinese pinyin and Chinese character characteristics
Technical Field
The application relates to the technical field of computers and text input methods, in particular to a Chinese character encoding and decoding method and system based on Chinese pinyin and Chinese character characteristics.
Background
Existing input methods have several shortcomings in practice. Shape-code input methods are accurate, but their learning cost is too high, they adapt poorly to the trend toward intelligent input, and they are being phased out. Pinyin input methods are simple and easy to learn, but have a high duplicate-code rate and low accuracy. Double-pinyin input methods effectively shorten the code length, but the correspondence between finals and keys follows no rule and is hard to memorize, and they do not solve the problem of the high duplicate-code rate. In addition, although some input methods introduce tones, they handle them crudely: either separate keys are set aside for entering the tone, or a separate state is set up for entering the tone. In both cases the code becomes longer and harder to parse, and direct input of tone-marked symbols such as ā, a and the like is not supported, so input methods that support tones are often difficult to put into practical use.
Some input methods introduce other elements to improve accuracy, but these often bring problems of their own. For example, introducing shape codes (usually more than one hundred) on top of double pinyin increases the number of effective combinations and the accuracy, but greatly increases the learning cost. Introducing multiple external elements such as strokes and structure improves the accuracy of entering a single Chinese character, but is overly complex, and for words of two or more characters the code becomes too long and is inconvenient to parse.
Disclosure of Invention
To address the above technical problems, the application provides a Chinese character encoding and decoding method and system based on Chinese pinyin and Chinese character characteristics, which improve the matching accuracy between codes and Chinese characters and the user's input experience.
In a first aspect, an embodiment of the present application provides a Chinese character encoding method based on Chinese pinyin and Chinese character characteristics, including:
when the keyboard state is a first state, generating a corresponding initial consonant brevity code according to a first triggered position on the keyboard;
when the keyboard state is a second state, generating a corresponding final simple code according to the triggered second position, the triggered third position and the triggering sequence on the keyboard;
in any keyboard state, if the currently triggered fourth position belongs to an auxiliary input area of the keyboard, generating a corresponding Chinese character initial stroke code according to the fourth position, and switching the keyboard state to an auxiliary input state;
when the keyboard state is the auxiliary input state, generating a corresponding Chinese character feature code according to a fifth position triggered in the auxiliary input area;
when the input is completed, combining the generated initial consonant brevity code, final simple code, Chinese character initial stroke code and Chinese character feature code into a Chinese character code according to the code generation sequence.
The embodiment of the application provides a Chinese character encoding method based on Chinese pinyin and Chinese character characteristics, in which an initial consonant brevity code, a final simple code, a Chinese character initial stroke code and a Chinese character feature code are generated in different keyboard states and combined in order into a complete Chinese character code. First, the embodiment integrates the pinyin characteristics and the structural characteristics of Chinese characters, so that a code reflects both pronunciation and character shape, and the duplicate-code rate is significantly reduced. For example, in a traditional pinyin input method there are many homophones, so the user often has to page through candidates to find the target Chinese character; in this embodiment, characters that share the same initial and final in the pinyin code are further distinguished by the initial stroke information and the structural features, so that the most likely candidate Chinese characters can be screened out quickly in the subsequent decoding stage, which shortens the user's search time and improves the matching accuracy between codes and Chinese characters. Second, the embodiment adopts a state-dependent coding mechanism, so that the same key position can generate different codes in different input states, which reduces the space occupied by the keyboard layout and keeps the input interface simple and tidy. In addition, the embodiment provides an auxiliary input area: the user can encode the structural characteristics of the current Chinese character at any time by triggering the auxiliary input area as needed, without frequently switching modes during input, so that operation is smooth and natural and the user's input experience is improved.
Finally, because the coding rule takes into account both the naturalness of pinyin and the structure of Chinese characters, the learning threshold is low and a user can master the method in a short time, which further improves the user's input experience.
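The state-dependent behaviour described above can be illustrated with a small state machine. This is only a minimal sketch under stated assumptions, not the implementation of the application: the class `Encoder`, the key set `AUX_KEYS`, the rule that a final always consumes two key presses, and the transitions back to the first state are all hypothetical choices made for illustration, since the text does not spell them out.

```python
from enum import Enum, auto


class KeyboardState(Enum):
    FIRST = auto()      # next main-area key yields an initial consonant brevity code
    SECOND = auto()     # next main-area keys yield a final simple code
    AUXILIARY = auto()  # next auxiliary-area key yields a feature code


# Hypothetical auxiliary input area; the real key positions are not given in the text.
AUX_KEYS = {"1", "2", "3", "4", "5"}


class Encoder:
    def __init__(self):
        self.state = KeyboardState.FIRST
        self.codes = []     # (kind, value) pairs in generation order
        self._pending = []  # main-area keys collected for the current final

    def _flush_final(self):
        # Emit whatever has been collected for the final (assumption: 1 or 2 keys).
        if self._pending:
            self.codes.append(("final", "".join(self._pending)))
            self._pending.clear()

    def press(self, key):
        if key in AUX_KEYS:
            if self.state is KeyboardState.AUXILIARY:
                # Auxiliary input state: an auxiliary key yields a feature code.
                self.codes.append(("feature", key))
                self.state = KeyboardState.FIRST  # assumption: character is complete
            else:
                # Any other state: an auxiliary key yields an initial stroke code.
                self._flush_final()
                self.codes.append(("stroke", key))
                self.state = KeyboardState.AUXILIARY
            return
        if self.state is KeyboardState.AUXILIARY:
            # Assumption: a main-area key starts the next character.
            self.state = KeyboardState.FIRST
        if self.state is KeyboardState.FIRST:
            self.codes.append(("initial", key))
            self.state = KeyboardState.SECOND
        else:
            self._pending.append(key)
            if len(self._pending) == 2:  # second and third positions, in order
                self._flush_final()
                self.state = KeyboardState.FIRST

    def finish(self):
        # Combine all generated codes in generation order.
        self._flush_final()
        return "".join(value for _, value in self.codes)
```

In this sketch the same main-area key produces an initial code in the first state and part of a final code in the second state, which mirrors the idea that one key position can generate different codes in different states.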
Further, when the keyboard state is the second state, generating a corresponding final simple code according to the triggered second position, the triggered third position and the triggering sequence on the keyboard includes:
generating a corresponding second character and a corresponding third character according to the second position, the third position and the triggering sequence on the keyboard;
determining a tone and a tone position according to the second character and the third character;
generating the corresponding final simple code according to the second character, the third character, the tone and the tone position.
The embodiment of the application provides a way of generating the final simple code, namely constructing a final code that carries a tone from the combination of the second character and the third character. This addresses the difficulty of tone input in traditional input methods. In the prior art, tones are marked with separate key positions or through complicated mode switching, which makes operation more complex and easily leads to input errors. In contrast, the embodiment embeds the tone through a simple combined action, so that the user can enter a final with its tone intuitively and quickly. For example, by pressing two related keys in turn, the user obtains the correct final form automatically, avoiding a cumbersome input flow. In addition, the design is compatible with the various tones and supports direct input of special characters such as ā, a and the like, which is rare in existing input methods. In summary, the final coding mechanism enriches the functions of the input method and improves the matching accuracy between codes and Chinese characters and the user's input experience.
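Once a tone and a tone position have been determined, a tone-marked final such as ā can be produced with Unicode combining marks. The sketch below shows only that last step; the function name `apply_tone`, the tone numbering 1 to 4 and the zero-based `position` argument are assumptions for illustration, and the way the application derives the tone from the second and third characters is not reproduced here because the text does not specify it.

```python
import unicodedata

# Unicode combining marks for the four pinyin tones.
TONE_MARKS = {
    1: "\u0304",  # macron  (first tone)
    2: "\u0301",  # acute   (second tone)
    3: "\u030c",  # caron   (third tone)
    4: "\u0300",  # grave   (fourth tone)
}


def apply_tone(final, tone, position):
    """Return `final` with the tone mark placed on the letter at `position`.

    `tone` outside 1..4 is treated as "no tone mark" (assumption).
    """
    mark = TONE_MARKS.get(tone)
    if mark is None:
        return final
    marked = final[: position + 1] + mark + final[position + 1 :]
    # NFC composes letter + combining mark into a single precomposed character.
    return unicodedata.normalize("NFC", marked)
```

For instance, `apply_tone("a", 1, 0)` yields the single character ā, so the decoded final can be shown to the user in its usual tone-marked form.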
Furthermore, the Chinese character initial stroke code corresponds to Chinese character initial stroke information, and the Chinese character initial stroke information is one of 'horizontal', 'vertical', 'left-falling', 'dot' and 'folding'.
The embodiment of the application defines five basic initial stroke types corresponding to the Chinese character initial stroke codes, namely 'horizontal', 'vertical', 'left-falling', 'dot' and 'folding'. These types cover the initial stroke forms of most Chinese characters, so the embodiment adds a shape characteristic to the Chinese character code without occupying many key positions, keeping the encoding process simple. In practice, the user can quickly choose the initial stroke code that matches how the target Chinese character is written, which improves input efficiency. The learning difficulty is also reduced, because initial stroke information matches people's intuitive perception of character structure and requires no additional memorization of complex rules or of the complete stroke sequence. In the subsequent decoding process, the initial stroke code further narrows the search space, speeds up the matching of Chinese characters and improves the matching accuracy between codes and Chinese characters.
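How an initial stroke code narrows the search space can be shown with a short filter. This is an illustrative sketch only: the code values "1" to "5" in `STROKE_CODES` and the tiny lookup table `FIRST_STROKE` are assumptions, and a real system would draw the first-stroke data from its own database.

```python
# Hypothetical code values for the five initial stroke types named in the text.
STROKE_CODES = {
    "1": "horizontal",
    "2": "vertical",
    "3": "left-falling",
    "4": "dot",
    "5": "folding",
}

# Small illustrative table of first strokes for a few common characters.
FIRST_STROKE = {
    "一": "horizontal",
    "中": "vertical",
    "人": "left-falling",
    "文": "dot",
    "也": "folding",
}


def filter_by_initial_stroke(candidates, stroke_code):
    """Keep only the candidates whose first stroke matches the given code."""
    wanted = STROKE_CODES[stroke_code]
    return [ch for ch in candidates if FIRST_STROKE.get(ch) == wanted]
```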
In one possible implementation manner, the Chinese character feature code corresponds to Chinese character feature information, and the Chinese character feature information is a first feature, a second feature, a third feature or a fourth feature;
the first feature is that the target Chinese character to be encoded can be split left-right into two or more first independent components, and at least one of the first independent components can be split top-bottom into two or more second independent components;
the second feature is that the target Chinese character can be split left-right into two or more first independent components, and none of the first independent components can be split top-bottom into two or more second independent components;
the third feature is that the target Chinese character cannot be split left-right into two or more first independent components, but can be split top-bottom into two or more second independent components;
the fourth feature is that the target Chinese character can be split neither left-right nor top-bottom.
The embodiment of the application provides four kinds of Chinese character feature information. These features describe the structural attributes of a Chinese character according to whether it can be split left-right or top-bottom, and provide a retrieval basis for the subsequent Chinese character matching process. The embodiment thus gives Chinese character coding a hierarchical structural framework in which every Chinese character falls into exactly one class. For example, the first feature applies to characters with relatively complex structures, such as "inert", "Shao", etc., and the fourth feature applies to Chinese characters with relatively simple structures, such as "Ding", "Guo", "Shao", etc. Compared with a traditional single-dimension coding method, the multi-dimension feature coding of the embodiment captures the features of Chinese characters more comprehensively, so that the target Chinese character can be located faster in the decoding stage. This reduces the duplicate-code rate in actual input, improves the matching accuracy, fits the way users remember Chinese characters, and improves the user's input experience.
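The four-way classification can be written as a short decision function. The sketch below is a hedged illustration: the `Structure` record and its two fields are hypothetical, and how a real system determines whether a character or component is splittable is outside the scope of the text.

```python
from dataclasses import dataclass, field


@dataclass
class Structure:
    # One flag per left-right component: can that component be split top-bottom?
    # Fewer than two entries means the character cannot be split left-right.
    lr_components_tb_splittable: list = field(default_factory=list)
    # Can the character as a whole be split top-bottom?
    tb_splittable: bool = False


def classify_feature(s):
    """Return 1, 2, 3 or 4 for the first to fourth feature."""
    if len(s.lr_components_tb_splittable) >= 2:
        # Splittable left-right: feature 1 if any component splits top-bottom.
        return 1 if any(s.lr_components_tb_splittable) else 2
    # Not splittable left-right: feature 3 if the whole splits top-bottom.
    return 3 if s.tb_splittable else 4
```

Because the two tests are applied in a fixed order (left-right first, then top-bottom), every structure falls into exactly one of the four classes.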
In one possible implementation manner, the first character and the second character of the final simple code are generated by triggering different input areas, specifically:
when the keyboard state is the second state, generating a corresponding first character according to the triggered second position on the keyboard;
in any keyboard state, if the currently triggered third position belongs to the auxiliary input area of the keyboard, generating a corresponding second character according to the third position, and switching the keyboard state to the auxiliary input state;
when the keyboard state is the auxiliary input state, generating a corresponding Chinese character initial stroke code and a corresponding Chinese character feature code, in sequence, according to the fourth position and the fifth position triggered in the auxiliary input area;
generating the final simple code according to the first character and the second character.
The embodiment of the application specifies how the auxiliary input area is used: the second character of the final simple code, the Chinese character initial stroke code and the Chinese character feature code are all entered by reusing the auxiliary input area on the keyboard, without adding dedicated keys for each type of code. This optimizes the keyboard layout, so that the keyboard can carry richer input functions within a limited screen (especially a mobile device screen) or keyboard space while the input interface stays clear and concise, avoiding the interface bloat and visual burden caused by too many keys. In addition, the embodiment allows the user, during or after entry of the main code (such as the initial and the first part of the final), to trigger the auxiliary input area as needed to supply supplementary final information and structural feature information for the Chinese character being entered (the second part of the final, the initial stroke, and the component-split classification), which improves the matching accuracy between codes and Chinese characters. This design makes the code input process more coherent and natural, and noticeably improves the user's input experience and operating fluency.
In a second aspect, an embodiment of the present application provides a Chinese character decoding method based on Chinese pinyin and Chinese character characteristics, for decoding Chinese character codes generated by any one of the Chinese character encoding methods based on Chinese pinyin and Chinese character characteristics described in the present application, including:
acquiring a Chinese character code;
splitting the Chinese character code into a plurality of pieces of sub-code information according to the positions of the initial consonant brevity codes in the Chinese character code, wherein each piece of sub-code information contains at most one initial consonant brevity code, and the initial consonant brevity code is the first code in the sub-code information;
generating corresponding retrieval information from each piece of sub-code information through a preset mapping table, wherein the retrieval information comprises a plurality of complete initials generated according to the initial consonant brevity code; if a final simple code exists in the sub-code information, generating a plurality of corresponding complete finals according to the final simple code; if a Chinese character initial stroke code exists in the sub-code information, generating corresponding Chinese character initial stroke information according to the Chinese character initial stroke code; if a Chinese character feature code exists in the sub-code information, generating corresponding Chinese character feature information according to the Chinese character feature code;
for each piece of sub-code information, retrieving a plurality of corresponding candidate Chinese characters from a preset database according to the corresponding retrieval information;
if the number of pieces of sub-code information is 1, displaying the plurality of candidate Chinese characters on a preset interface.
The embodiment of the application provides a Chinese character decoding method corresponding to the encoding method. First, the embodiment takes into account that a user may input several Chinese characters at once: according to the positions of the initial consonant brevity codes, a Chinese character code that may cover several Chinese characters is split into a plurality of pieces of sub-code information, and each piece is decoded separately, so that brevity codes are quickly converted into complete information. During information decoding, the sub-code information is checked in turn for a final simple code, a Chinese character initial stroke code and a Chinese character feature code, and the corresponding complete finals, Chinese character initial stroke information and Chinese character feature information are generated. In other words, even when one or more of the final simple code, the Chinese character initial stroke code and the Chinese character feature code are absent, the embodiment can still complete the decoding process and display the matching result, which improves the flexibility of the user's Chinese character input. Second, the decoded retrieval information can cover up to four dimensions (initials, finals, initial strokes and features), so that database queries are more precise and the returned results more targeted, which improves the matching accuracy between codes and Chinese characters.
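The splitting step can be sketched as follows. This is a minimal sketch that assumes each code symbol is already tagged with its kind, as a `(kind, value)` pair with kinds such as "initial", "final", "stroke" and "feature"; that tagging scheme and the function name `split_subcodes` are assumptions for illustration, since the text does not say how an initial consonant brevity code is recognised inside the code string.

```python
def split_subcodes(codes):
    """Split a tagged code sequence into pieces of sub-code information.

    A new piece starts at every initial code, so each piece contains at most
    one initial code, and that code (if present) is the first element.
    """
    subcodes = []
    for kind, value in codes:
        if kind == "initial" or not subcodes:
            subcodes.append([])
        subcodes[-1].append((kind, value))
    return subcodes
```

A piece may consist of an initial code alone, which matches the idea that the final, initial stroke and feature codes are all optional at decoding time.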
Further, generating a plurality of corresponding complete initials according to the initial consonant brevity code includes:
if the initial consonant brevity code is a character in a preset first character set, generating a blank placeholder as the complete initial in the retrieval information;
if the initial consonant brevity code is a character in a preset second character set, generating the flat-tongue initial and the retroflex initial corresponding to the initial consonant brevity code as the plurality of complete initials.
In the embodiment of the application, handling initial consonant brevity codes by class lets the system adapt more flexibly to the user's input habits when resolving the corresponding initial information. First, in some cases the user does not need to enter a specific initial, or the Chinese character has no initial; the user then enters a character from the first character set in the encoding stage, and in the decoding stage a blank placeholder is generated as the complete initial in the retrieval information, so that every Chinese character has corresponding initial information and the subsequent retrieval proceeds normally. Second, for flat-tongue and retroflex sounds, which are easily confused, the embodiment generates both the flat-tongue initial and the retroflex initial from the single-letter brevity code entered by the user, which reduces the number of characters the user has to enter, improves input efficiency and markedly improves the fault tolerance of Chinese character input. For example, a user's pronunciation may be inaccurate because of dialect or accent; with this design, a reasonable set of complete initials can still be generated even when the input is not fully accurate, which strengthens the robustness of the system. The design also reduces the number of key positions, optimizes the keyboard layout, keeps the input interface simple and tidy, and gives the user a more convenient and efficient interaction.
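The two-class handling of initial consonant brevity codes can be sketched as a lookup. The concrete contents of the two character sets are not given in the text, so the placeholder key "o" in `NO_INITIAL_CODES` and the z/c/s pairs in `FLAT_AND_RETROFLEX` are assumptions chosen for illustration, as is the fallback that any other code maps to itself.

```python
# Hypothetical "first character set": codes that stand for "no initial".
NO_INITIAL_CODES = {"o"}

# Hypothetical "second character set": one code covers a flat-tongue initial
# and its retroflex counterpart.
FLAT_AND_RETROFLEX = {
    "z": ["z", "zh"],
    "c": ["c", "ch"],
    "s": ["s", "sh"],
}


def expand_initial(code):
    """Return the list of complete initials to use in the retrieval information."""
    if code in NO_INITIAL_CODES:
        return [""]  # blank placeholder as the complete initial
    if code in FLAT_AND_RETROFLEX:
        return list(FLAT_AND_RETROFLEX[code])
    return [code]  # assumption: every other code is already a complete initial
```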
Further, if the number of pieces of sub-code information is greater than 1, the corresponding candidate Chinese characters are arranged and combined according to a preset word stock and the order of the pieces of sub-code information in the Chinese character code to generate a plurality of words or phrases, and each word or phrase is displayed on a preset interface.
The embodiment of the application further considers how results are displayed after the user inputs several Chinese characters at once. Each piece of sub-code information can match a plurality of Chinese characters, so when there is only one piece of sub-code information the matched Chinese characters can be displayed directly. When there are several pieces of sub-code information, however, the number of possible combinations of the matched Chinese characters grows sharply, and the user would have to pick the intended input from a large number of results, which would greatly degrade the experience. The embodiment therefore uses the preset word stock together with the order of the pieces of sub-code information to combine the candidate Chinese characters into coherent words or phrases, which greatly reduces the number of results, lets the user quickly select the Chinese characters to be input, and improves the matching accuracy between codes and Chinese characters and the user's input experience.
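The pruning effect of the word stock can be shown with a Cartesian product filtered by a lexicon. This is a simplified sketch: the function name `combine_candidates` and the use of a plain set as the word stock are assumptions, and a practical system would likely avoid enumerating every combination.

```python
from itertools import product


def combine_candidates(candidates_per_subcode, lexicon):
    """Combine per-position candidates, in order, keeping only known words.

    `candidates_per_subcode` is a list of candidate lists, one per piece of
    sub-code information, in the order they appear in the Chinese character code.
    """
    words = []
    for combo in product(*candidates_per_subcode):
        word = "".join(combo)
        if word in lexicon:
            words.append(word)
    return words
```

With two positions of two candidates each there are four raw combinations, but only those present in the word stock are shown to the user.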
In a third aspect, an embodiment of the present application provides a Chinese character encoding system based on Chinese pinyin and Chinese character characteristics, including an initial consonant brevity code generation module, a final simple code generation module, a Chinese character initial stroke code generation module, a Chinese character feature code generation module, and a combination module;
the initial consonant brevity code generation module is used for generating a corresponding initial consonant brevity code according to a first triggered position on the keyboard when the keyboard state is a first state;
the final simple code generation module is used for generating a corresponding final simple code according to the triggered second position, the triggered third position and the triggering sequence on the keyboard when the keyboard state is a second state;
the Chinese character initial stroke code generation module is used for, in any keyboard state, generating a corresponding Chinese character initial stroke code according to a fourth position when the currently triggered fourth position belongs to an auxiliary input area of the keyboard, and switching the keyboard state to an auxiliary input state;
the Chinese character feature code generation module is used for generating a corresponding Chinese character feature code according to a fifth position triggered in the auxiliary input area when the keyboard state is the auxiliary input state;
the combination module is used for combining the generated initial consonant brevity code, final simple code, Chinese character initial stroke code and Chinese character feature code into a Chinese character code according to the code generation sequence when the input is completed.
Further, the Chinese character feature code corresponds to Chinese character feature information, and the Chinese character feature information is a first feature, a second feature, a third feature or a fourth feature;
the first feature is that the target Chinese character to be encoded can be split left-right into two or more first independent components, and at least one of the first independent components can be split top-bottom into two or more second independent components;
the second feature is that the target Chinese character can be split left-right into two or more first independent components, and none of the first independent components can be split top-bottom into two or more second independent components;
the third feature is that the target Chinese character cannot be split left-right into two or more first independent components, but can be split top-bottom into two or more second independent components;
the fourth feature is that the target Chinese character can be split neither left-right nor top-bottom.
In a fourth aspect, an embodiment of the present application provides a Chinese character decoding system, which is used for decoding Chinese character codes generated by any one of the Chinese character encoding systems based on Chinese pinyin and Chinese character characteristics, and comprises an acquisition module, a splitting module, an information decoding module, a retrieval module and a display module;
the acquisition module is used for acquiring a Chinese character code;
the splitting module is used for splitting the Chinese character code into a plurality of pieces of sub-code information according to the positions of the initial consonant brevity codes in the Chinese character code, wherein each piece of sub-code information contains at most one initial consonant brevity code, and the initial consonant brevity code is the first code in the sub-code information;
the information decoding module is used for generating corresponding retrieval information from each piece of sub-code information through a preset mapping table, wherein each piece of retrieval information at least comprises a plurality of complete initials corresponding to the initial consonant brevity code; if a final simple code exists in the sub-code information, the retrieval information also comprises a plurality of corresponding complete finals; if a Chinese character initial stroke code exists in the sub-code information, the retrieval information also comprises corresponding Chinese character initial stroke information; and if a Chinese character feature code exists in the sub-code information, the retrieval information also comprises corresponding Chinese character feature information;
the retrieval module is used for retrieving, for each piece of sub-code information, a plurality of corresponding candidate Chinese characters from a preset database according to the corresponding retrieval information;
the display module is used for displaying the plurality of candidate Chinese characters on a preset interface if the number of pieces of sub-code information is 1.
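The retrieval step, in which optional parts of the retrieval information simply add further filters, might look like the sketch below. The `Entry` record, the `search` function and the placeholder database rows ("X1" to "X3", with made-up attribute values) are all hypothetical and serve only to show how missing dimensions are skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    char: str      # the Chinese character (placeholders are used below)
    initial: str   # complete initial, "" for none
    final: str     # complete final
    stroke: str    # initial stroke information
    feature: int   # feature class, 1 to 4


def search(db, initials, finals=None, stroke=None, feature=None):
    """Return candidate characters; a dimension set to None is not filtered on."""
    hits = []
    for e in db:
        if e.initial not in initials:
            continue
        if finals is not None and e.final not in finals:
            continue
        if stroke is not None and e.stroke != stroke:
            continue
        if feature is not None and e.feature != feature:
            continue
        hits.append(e.char)
    return hits


# Placeholder rows with illustrative values, not real character data.
DB = [
    Entry("X1", "z", "ai", "horizontal", 4),
    Entry("X2", "zh", "ai", "dot", 2),
    Entry("X3", "z", "ao", "horizontal", 3),
]
```

Each additional dimension present in the retrieval information shrinks the candidate list, while a code that carries only an initial still returns a (larger) result.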
Drawings
FIG. 1 is a flow chart of a Chinese character encoding method based on Pinyin and Chinese character features according to an embodiment of the present application;
FIG. 2 is a diagram of a coding interface of a Chinese character coding method based on Pinyin and Chinese character features according to an embodiment of the present application;
FIG. 3 is a flowchart of a Chinese character decoding method based on Pinyin and Chinese character features according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a Chinese character encoding system based on Pinyin and Chinese character features according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a Chinese character decoding system based on Pinyin and Chinese character features according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of protection of the application.
It should be noted that, the step numbers herein are only for convenience of explanation of the specific embodiments, and are not used as limiting the order of execution of the steps. In the description of the present application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature.
Embodiment one:
As shown in Fig. 1, a first embodiment provides a Chinese character encoding method based on pinyin and Chinese character features, comprising steps S1-S5:
Step S1, when the keyboard state is a first state, generating a corresponding initial brevity code according to a first triggered position on the keyboard;
Step S2, when the keyboard state is a second state, generating a corresponding final brevity code according to a second triggered position and a third triggered position on the keyboard and the order in which they are triggered;
Step S3, in any keyboard state, if a currently triggered fourth position belongs to an auxiliary input area of the keyboard, generating a corresponding Chinese character first-stroke code according to the fourth position, and switching the keyboard state to an auxiliary input state;
Step S4, when the keyboard state is the auxiliary input state, generating a corresponding Chinese character feature code according to a fifth triggered position in the auxiliary input area;
Step S5, when input is completed, combining the generated initial brevity code, final brevity code, Chinese character first-stroke code and Chinese character feature code into a Chinese character code in the order in which the codes were generated.
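Steps S1-S5 can be read as a small state machine. The following Python sketch is only an illustration under stated assumptions: the key tables are hypothetical and contain just the handful of keys needed for the demonstration, the class name `CharacterEncoder` is invented, and the auxiliary input state is modelled simply as a counter of presses inside the auxiliary area.

```python
# Hypothetical, partial key tables (row, column) -> symbol; not the full layout.
AUX_KEYS = {(1, 8): "Λ", (2, 8): "Φ", (3, 8): "Π"}   # auxiliary input area
INITIAL_KEYS = {(3, 1): "Z"}                          # first state
FINAL_FIRST_KEYS = {(1, 1): "ā"}                      # second state, first press
FINAL_SECOND_KEYS = {(3, 5): "n"}                     # second state, second press


class CharacterEncoder:
    """Collects the code parts of one Chinese character in generation order."""

    def __init__(self):
        self.main_state = "first"   # first -> second_a -> second_b -> done
        self.aux_presses = 0        # 0 means the auxiliary state was not entered
        self.parts = []             # list of (kind, symbol) pairs

    def press(self, key):
        if key in AUX_KEYS:
            # S3: the first press in the auxiliary area yields the first-stroke
            # code and enters the auxiliary state; S4: the next press yields
            # the character feature code. The main state is left unchanged.
            if self.aux_presses >= 2:
                raise ValueError("both auxiliary codes were already entered")
            kind = "stroke" if self.aux_presses == 0 else "feature"
            self.parts.append((kind, AUX_KEYS[key]))
            self.aux_presses += 1
        elif self.main_state == "first":       # S1: initial brevity code
            self.parts.append(("initial", INITIAL_KEYS[key]))
            self.main_state = "second_a"
        elif self.main_state == "second_a":    # S2: first symbol of the final
            self.parts.append(("final", FINAL_FIRST_KEYS[key]))
            self.main_state = "second_b"
        elif self.main_state == "second_b":    # S2: second symbol of the final
            self.parts.append(("final", FINAL_SECOND_KEYS[key]))
            self.main_state = "done"
        else:
            raise ValueError("the main code is already complete")

    def code(self):
        # S5: join the parts in the order in which they were generated.
        return "".join(symbol for _, symbol in self.parts)


encoder = CharacterEncoder()
for key in [(3, 1), (1, 1), (3, 5), (3, 8), (2, 8)]:
    encoder.press(key)
print(encoder.code())
```

Because presses in the auxiliary area never touch `main_state`, the sketch also accepts an auxiliary press directly after the initial, which mirrors the "in any keyboard state" wording of step S3.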
The embodiment of the application provides a Chinese character encoding method based on pinyin and Chinese character features, in which an initial brevity code, a final brevity code, a Chinese character first-stroke code and a Chinese character feature code are generated in different keyboard states and combined in sequence into a complete Chinese character code. The embodiment integrates the pinyin features and the structural features of Chinese characters, so that the code reflects both pronunciation and character shape and the rate of duplicate codes is reduced. In a traditional pinyin input method, for example, homophones are numerous, so the user must frequently page through candidates to find the target character. In this embodiment, characters that share the same initial and final codes are further distinguished by first-stroke information and structural features, so the most likely candidates can be screened out quickly in the subsequent decoding stage, which shortens the user's search time and improves the precision with which codes match Chinese characters. Secondly, the embodiment uses a state-dependent dynamic coding mechanism, so that the same key generates different codes in different input states, which reduces the space occupied by the keyboard layout and keeps the input interface simple and uncluttered. In addition, the embodiment provides an auxiliary input area: by triggering it, the user can encode structural features of the current Chinese character at any time as needed, without frequently switching modes during input, so operation remains smooth and the input experience improves. Finally, because the coding rules respect both the naturalness of pinyin and the structure of Chinese characters, the learning threshold is low and the user can master the method in a short time, which further improves the input experience.
In a preferred embodiment, in step S1, the main input area of the input keyboard corresponding to the encoding method is a 3 × 8 grid whose keys are denoted 1-1 to 3-8 (from row 1, column 1 to row 3, column 8), with commonly used function keys arranged at the bottom. The encoding method is suitable for mobile touch-screen operating systems and, after being adapted to the keyboard, is also suitable for an ordinary keyboard.
The keyboard is divided into different states according to the content being input, and in each state a mapping is established between keys and codes.
The codes input by the user through the keyboard in the first state are called "initial brevity codes". The initial brevity codes comprise Q, R, W, T, Y, P, L, S, D, F, G, H, J, K, Z, X, C, B, N, M and V, where V represents a pinyin syllable with no initial. The correspondence with the keyboard layout is as follows:
Here keys 1-8, 2-8 and 3-8 correspond to Λ, Φ and Π respectively in this state, and these three Greek letters are used to refer to the three keys. In the non-input state, clicking them inputs punctuation marks; in the input state, clicking them is likewise never resolved as an initial. In addition, the embodiment of the application provides a corresponding input rule for initial brevity codes: for the retroflex initials, the h is simply dropped so that only one character needs to be input. For example, sh is input as S, ch as C and zh as Z, so in this embodiment the dental sibilants z, c and s share their codes with the corresponding retroflex initials. For a pinyin syllable without an initial, the initial is denoted by V. For convenience, the same keyboard layout is used for illustration throughout this embodiment, and the initial symbols are used to name key positions: key 2-1 is called the S key, key 3-5 the N key, and key 2-8 the Φ key.
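The input rule for initial brevity codes can be sketched in a few lines of Python. The function name `initial_brevity_code` and the choice of an empty string to mean "no initial" are assumptions made for illustration; the set of valid codes is the one listed above.

```python
# The initial brevity codes listed in the text; V stands for "no initial".
INITIAL_CODES = set("QRWTYPLSDFGHJKZXCBNMV")


def initial_brevity_code(initial):
    """Map a pinyin initial such as 'zh' or 'b' to its one-letter brevity code.

    An empty string is used here (as an assumption) for a syllable
    that has no initial.
    """
    initial = initial.lower()
    if initial == "":
        return "V"
    if initial in ("zh", "ch", "sh"):
        # Retroflex initials drop the h and share a code with z, c, s.
        return initial[0].upper()
    code = initial.upper()
    if len(code) != 1 or code not in INITIAL_CODES or code == "V":
        raise ValueError(f"not a pinyin initial covered by the scheme: {initial!r}")
    return code


print(initial_brevity_code("zh"), initial_brevity_code("s"), initial_brevity_code(""))
```

Since zh and z both map to Z, the mapping is deliberately lossy; the decoding side has to expand Z back into both candidates.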
Further, in step S2, generating the corresponding final brevity code according to the second triggered position, the third triggered position and the triggering order when the keyboard state is the second state comprises:
generating a corresponding second character and a corresponding third character according to the second position, the third position and the triggering order on the keyboard;
determining a tone and a tone position according to the second character and the third character;
and generating the corresponding final brevity code according to the second character, the third character, the tone and the tone position.
The embodiment of the application provides a way of generating the final brevity code, namely constructing a tone-bearing final code from the combination of the second character and the third character. This addresses the difficulty of entering tones in traditional input methods. In the prior art, tones are entered through separate keys or through complicated mode switching, which complicates operation and easily leads to input errors. By contrast, this method embeds the tone through a simple combined action, so the user can enter a tone-bearing final quickly and intuitively: pressing two related keys in turn is enough to synthesize the correct final form, and no cumbersome input flow is needed. In addition, the design is compatible with the various tones and supports direct input of tone-marked characters such as ā, which is rare in existing input methods. In summary, the final coding mechanism enriches the functions of the input method and improves both the precision with which codes match Chinese characters and the user's input experience.
In a preferred embodiment, before entering a Chinese character the user processes its final and enters the corresponding final brevity code according to the following rules: a) only the first two letters of the final are kept and the rest are discarded; b) a final with fewer than two letters is padded with v; c) the vowels a, e, i, o, u and ü that carry no tone mark within the final are written â, ê, î, ô and û respectively (ü is also written û). The conversion of some finals to final brevity codes is as follows:
Specifically, in the second keyboard state the first position of the final brevity code is entered first (this is also called keyboard state 2-1). Its values fall into two categories: 1) the tone is on the first position of the final brevity code (for example, finals beginning with a, e or o, and some finals beginning with u or i, namely un, uv, in and iv), in which case the first position is one of a, ā, á, ǎ, à, o, ō, ó, ǒ, ò, e, ē, é, ě, è, i, ī, í, ǐ, ì, u, ū, ú, ǔ, ù, ü, ǖ, ǘ, ǚ, ǜ, where a, o, e, i, u and ü denote the neutral tone; 2) the tone is on the second position of the final (for example ua, ui, uo, ue, io, ie, iu and ia), in which case the first position is represented by the special symbols û and î, where û denotes a final that begins with u and carries its tone on the second position, and î denotes a final that begins with i and carries its tone on the second position.
Keyboard state 2-1 therefore has 32 symbols. Merging u with ü and merging the neutral tone with the first tone reduces this to 22 symbols. In addition, the tilde symbols ã, ẽ, ĩ, õ and ũ denote a vowel whose tone is unspecified: ã and ẽ share one key, which stands for a or e in any tone; ĩ, õ and ũ share one key, which stands for i, o or u in any tone. The correspondence between keyboard state 2-1 and the input symbols is as follows:
Furthermore, the input content of keyboard state 2-2 corresponds to the second position of the final brevity code, whose underlying letters are a, e, i, o, u, r, n and v. When the content entered in the previous state carries the tone, this position carries none, so its range is â, ê, û, î, ô, r, n and v. When the content entered in the previous state is û or î, this position carries the tone, so its range is a, ā, á, ǎ, à, o, ō, ó, ǒ, ò, e, ē, é, ě, è, i, ī, í, ǐ, ì, u, ū, ú, ǔ and ù. Together these make 33 codes, more than the 24 available keys, so some keys are merged. The neutral tone and the first tone are again merged. Moreover, if a tone-marked symbol was entered in the previous step, none is entered in this step, and if none was entered in the previous step, one must be entered in this step; in keyboard state 2-2, therefore, ā and â are mutually exclusive (they can never both be selectable), as are ī and î, ō and ô, and ū and û, and each such pair is merged onto one key. Because few Chinese characters correspond to "r", it also shares a key with another symbol.
The 3-7 key represents a or e in any tone, and the 3-8 key represents i, o or u in any tone; the user can click these two keys when unsure of the tone. Thus, when the keyboard is in the second state, the corresponding final brevity code is generated from the combination of the second triggered position, the third triggered position and the triggering order.
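The trimming and padding rules for the final can be illustrated with a short Python sketch. It is an approximation under explicit assumptions: the function name `final_brevity_code` is invented, the input is a final written with its tone mark (for example "uáng"), and the neutral tone, the key merging and the any-tone keys described above are not modelled, so every unmarked vowel is simply given a circumflex.

```python
import unicodedata

# Unmarked vowels are rewritten with a circumflex; ü shares û with u.
TONELESS = {"a": "â", "e": "ê", "i": "î", "o": "ô", "u": "û", "\u00fc": "û"}


def final_brevity_code(final):
    """Reduce a tone-marked pinyin final to its two-symbol brevity code."""
    # Work on precomposed characters so that 'á' counts as one letter.
    letters = list(unicodedata.normalize("NFC", final))[:2]  # rule a): keep two
    while len(letters) < 2:                                   # rule b): pad with v
        letters.append("v")
    # Rule c): vowels without a tone mark get the circumflex form;
    # tone-marked vowels, consonants and the padding v are kept as they are.
    code = "".join(TONELESS.get(letter, letter) for letter in letters)
    return unicodedata.normalize("NFC", code)


for final in ["āng", "uáng", "óng", "ù"]:
    print(final, "->", final_brevity_code(final))
```

Under these assumptions the tone mark always ends up on the first or the second symbol of the brevity code, which is what makes the two-keystroke entry of finals possible.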
Further, in step S3, the Chinese character first-stroke code corresponds to the first-stroke information of a Chinese character, which is "horizontal", "vertical", "left-falling", "dot" or "folding".
The embodiment of the application defines five basic first-stroke types corresponding to the Chinese character first-stroke code, namely "horizontal", "vertical", "left-falling", "dot" and "folding". These types cover the first strokes of most Chinese characters, so the embodiment adds shape information to the encoding process without occupying many keys, and the encoding process stays simple. In practice, the user can quickly select the first-stroke code according to how the target character is written, which improves the efficiency of code input. The learning difficulty is also reduced, because first-stroke information matches people's visual perception of character structure and requires no memorization of complex rules or of a character's complete stroke sequence. In the subsequent decoding process, the first-stroke code further narrows the search space, speeds up character matching and improves the precision with which codes match Chinese characters.
In one possible implementation, in step S4, the Chinese character feature code corresponds to Chinese character feature information, which is a first feature, a second feature, a third feature or a fourth feature;
the first feature is that the target Chinese character to be encoded can be split left-to-right into two or more first independent components, and at least one of the first independent components can be split top-to-bottom into two or more second independent components;
the second feature is that the target Chinese character can be split left-to-right into two or more first independent components, and none of the first independent components can be split top-to-bottom into two or more second independent components;
the third feature is that the target Chinese character cannot be split left-to-right into two or more first independent components, but can be split top-to-bottom into two or more second independent components;
the fourth feature is that the target Chinese character can be split neither left-to-right nor top-to-bottom.
The embodiment of the application provides four kinds of Chinese character feature information. These features describe the structure of a Chinese character according to whether it can be split left-to-right or top-to-bottom, and provide a retrieval basis for the subsequent matching process. The embodiment thus gives Chinese character encoding a hierarchical structural framework in which every Chinese character can be classified. For example, the first feature applies to characters with relatively complex structures, such as "inert" and "Shao", and the fourth feature applies to characters with relatively simple structures, such as "Ding", "Guo" and "Shao". Compared with a traditional single-dimension encoding method, the multi-dimensional feature coding of this embodiment captures the features of a Chinese character more fully, so the target character can be located faster in the decoding stage. This reduces the rate of duplicate codes in actual input, improves matching precision, fits the way users remember Chinese characters, and improves the input experience.
In the prior art, a newly added code type (such as tone, structure or strokes) is usually handled in one of two ways: a separate area is provided specifically for entering that code type (many patent applications, for instance, provide several separate keys just for entering tones), or a state is added for entering that code type. The former requires independent keys, which is unsuitable for mobile terminals with limited space and leads to low key utilization and long codes. The latter lacks flexibility and cannot support "on-demand input", because a state is generally hard to skip during normal input. Unlike both, the dynamic input area requires neither independent keys nor a new state for the whole keyboard. Instead, a certain area is designated within a certain state and given its own sub-state. Once the area becomes available for input, clicking its keys changes only the sub-state; the state of the main keyboard does not change and the input of the main code is not affected. Clicking a key outside the dynamic area switches the keyboard to the next state.
Advantages of the dynamic input area:
1. No input area needs to be arranged separately outside the main keyboard, so the keyboard does not become bloated;
2. No dedicated keyboard state needs to be added, so code parsing does not become harder;
3. It is highly flexible: the main code and the auxiliary codes are separated, and auxiliary codes can be entered on demand without affecting main-code input;
4. It avoids the drawback of combined input. Combined input merges two different types of elements and maps them to the keyboard together, for example merging the second letter (without tone) of the final brevity code with the last stroke of the Chinese character, so that one key press enters both the letter and the last stroke. Although this is efficient in keystrokes, the user has to combine the two elements mentally before locating the key, which costs thinking time and makes for a poor input experience;
5. It supports an unlimited number of auxiliary codes (although there should not be too many auxiliary code types, so that codes do not become too long).
A dynamic input area is set up by: 1) designating the state to which the dynamic input area belongs; 2) reserving fixed blank keys (keys with no mapping) in the designated state; and 3) defining the sub-states of the dynamic area, such that clicking a dynamic-area key changes only the sub-state, does not change the keyboard state, and does not affect normal input on the other keys.
In a preferred embodiment, the keyboard state to which the dynamic input area belongs is the first state (any keyboard state may be chosen to trigger the dynamic input area). The keys of the dynamic area are the three rightmost keys in the first state (1-8, 2-8 and 3-8), which are blank keys, and the dynamic area has two sub-states: state 1-1 corresponds to the first stroke of the Chinese character, and state 1-2 corresponds to the component-count classification.
The first stroke of a Chinese character is classified as horizontal (一), vertical (丨), left-falling (丿), dot (丶) or folding (乙, 乚), so Chinese characters fall into five classes by first stroke. For example, characters whose first stroke is "horizontal" include "south", "Su", "lower", "have" and the like; characters whose first stroke is "vertical" include "Japanese" and the like; characters whose first stroke is "dot" include "Song", "talking", "send", "Guang" and the like; and characters whose first stroke is "folding" include "silk", "nine", "pre", "aim", "good" and the like.
Regarding the component-count classification, the component count of a Chinese character is not the number of independent components in the whole character but the number of components along each of two dimensions, horizontal and vertical. Classifying by component count means classifying Chinese characters by their number of vertical components and their number of horizontal components. The class is written x-y, where x is the number of components in the horizontal direction and y is the number of components in the vertical direction, and Chinese characters fall into the following categories:
Chinese characters that can be divided horizontally into 2 or more independent parts, where one of the left and right sides can be divided vertically into 2 or more independent parts, are recorded as 2-2, such as "inert", "Shao", "shao" and the like;
Chinese characters that can be divided horizontally into 2 or more independent parts, where neither the left nor the right side can be divided vertically into 2 or more independent parts, are recorded as 2-1, such as "punching", "arranging", "digging" and the like;
Chinese characters that cannot be divided horizontally into 2 or more independent parts but can be divided vertically into 2 or more independent parts are recorded as 1-2, such as "paste", "high", "Hua" and the like;
Chinese characters that can be divided into 2 or more independent parts neither horizontally nor vertically are recorded as 1-1, such as "Ding", "Guo", "Suo" and the like.
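The four categories amount to a small decision procedure. The Python sketch below assumes the three structural judgements are already available as booleans (deciding them for a real character would need a decomposition database, which is outside this illustration); the function name `component_class` and its parameter names are invented.

```python
def component_class(splits_left_right, a_side_splits_top_bottom=False,
                    splits_top_bottom=False):
    """Return the x-y component-count class of a character.

    splits_left_right:        the character divides horizontally into 2+ parts
    a_side_splits_top_bottom: one of those parts divides vertically into 2+ parts
    splits_top_bottom:        the whole character divides vertically into 2+ parts
    """
    if splits_left_right:
        # The horizontal split takes priority; the vertical test is then
        # applied to the individual parts, not to the whole character.
        return "2-2" if a_side_splits_top_bottom else "2-1"
    return "1-2" if splits_top_bottom else "1-1"


print(component_class(True, True))                      # first feature
print(component_class(True, False))                     # second feature
print(component_class(False, splits_top_bottom=True))   # third feature
print(component_class(False))                           # fourth feature
```

The classes 2-2, 2-1, 1-2 and 1-1 correspond to the first, second, third and fourth features of step S4 respectively.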
There are five first-stroke types but only three dynamic-area keys, so some strokes must be merged. The correspondence is that key 1-8 corresponds to 一 and 丨, key 2-8 corresponds to 丿 and 丶, and key 3-8 corresponds to 乚 (folding), as follows:
Mapping between the dynamic area and the component-count classification
According to the definition above, Chinese characters are divided into four classes, namely 2-2, 2-1, 1-2 and 1-1, by their component counts in the horizontal and vertical dimensions, while the dynamic area has only three keys, so some classes are merged. According to the applicant's statistics on the four classes, Chinese characters of classes 1-2 and 1-1 are relatively few, so these two classes are merged. Classes 1-2 and 1-1 are denoted I, class 2-1 is denoted II, and class 2-2 is denoted III; that is, key 1-8 corresponds to class I, key 2-8 to class II, and key 3-8 to class III, as shown below
Therefore, as shown in the following table, the complete code of a Chinese character consists of the initial brevity code, the final brevity code, the first stroke and the component-count classification. In actual encoding the user may omit any code other than the initial brevity code as needed, which improves input efficiency, and the embodiment can still perform the subsequent decoding and character display normally.
In a preferred embodiment, examples of Chinese character codes are as follows:
1. Coding of single Chinese characters:
As described above, a Chinese character code is composed of the initial brevity code, the final brevity code, the first-stroke code and the component-count classification code. The following are examples of Chinese character codes:
The pinyin of the character "Zhang" is zhāng. Its initial brevity code is Z, corresponding to key Z; its final brevity code is ān, where ā corresponds to key Q and n corresponds to key N; its first stroke is "folding", corresponding to key Π; and its component-count classification is 2-1, corresponding to key Φ. The complete key code is therefore ZQNΠΦ. As shown in Fig. 2, when the code is entered, key 3-1 is first triggered in the first keyboard state to generate the initial brevity code Z; keys 1-1 and 3-5 are then triggered in sequence in the second keyboard state to generate the final brevity code ān; key 3-8 in the auxiliary input area is then triggered, which switches the keyboard to the auxiliary input state and generates the corresponding first-stroke code; finally, key 2-8 is triggered in the auxiliary input state to generate the corresponding Chinese character feature code, which completes the code of the character. Meanwhile, the system decodes and searches according to the entered code and displays a number of candidate Chinese characters at the designated position.
The pinyin of the character "yellow" is huáng. Its initial brevity code is H, corresponding to key H; its final brevity code is ûá, where u is a vowel without a tone mark and is therefore written û, corresponding to key N, and á corresponds to key W; its first stroke is "horizontal" (一), corresponding to key Λ; and its component-count classification is 1-2, corresponding to key Λ. The complete key code is therefore HNWΛΛ;
The pinyin of the character "monk" is sēng. Its initial brevity code is S, corresponding to key S; its final brevity code is ēn, where ē corresponds to key Y and n corresponds to key N; its first stroke is "left-falling" (丿), corresponding to key Φ; and its component-count classification is 2-2, corresponding to key Π. The complete key code is SYNΦΠ;
The pinyin of the character "same" is tóng. Its initial brevity code is T, corresponding to key T; its final brevity code is ón, where ó corresponds to key J and n corresponds to key N; its first stroke is "vertical" (丨), corresponding to key Λ; and its component-count classification is 1-1, corresponding to key Λ. The complete key code is TJNΛΛ;
For the character "Fu", the initial brevity code is F, corresponding to key F; the final brevity code is a tone-marked u followed by the padding symbol v, where the tone-marked u corresponds to key X and v corresponds to key V; the first stroke corresponds to key Φ; and the component-count classification is 2-2, corresponding to key Π. The complete key code is therefore FXVΦΠ;
For the character "running", whose pinyin begins with chu, the initial brevity code is C, corresponding to key C; the final brevity code is û followed by a tone-marked a, where u is a vowel without a tone mark and is therefore written û, corresponding to key N, and the tone-marked a corresponds to key R; the component-count classification is 1-1, corresponding to key Λ; and the first stroke corresponds to key Φ. The complete key code is CNRΛΦ.
The pinyin of the character "vault" is qióng. Its initial brevity code is Q, corresponding to key Q; its final brevity code is îó, where i is a vowel without a tone mark and is therefore written î, corresponding to key V, and ó corresponds to key J; its first stroke corresponds to key Φ; and its component-count classification is 1-2, corresponding to key Λ. The complete key code is QVJΦΛ.
The pinyin of the character "stolen" is qiè. Its initial brevity code is Q, corresponding to key Q; its final brevity code is îè, where i is a vowel without a tone mark and is therefore written î, corresponding to key V, and è corresponds to key Λ; its first stroke corresponds to key Φ; and its component-count classification is 1-2, corresponding to key Λ. The complete key code is therefore QVΛΦΛ.
The pinyin of the character "enemy" is dí. Its initial brevity code is D, corresponding to key D; its final brevity code is ív, where í corresponds to key D and v corresponds to key V; its first stroke is "left-falling" (丿), corresponding to key Φ; and its component-count classification is 2-1, corresponding to key Φ. The complete key code is DDVΦΦ.
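The way the single-character examples are assembled can be sketched in Python. The two tables encode the merged dynamic-area mapping as it is read here (horizontal and vertical on Λ, left-falling and dot on Φ, folding on Π; classes 1-1 and 1-2 on Λ, 2-1 on Φ, 2-2 on Π). The function name `key_code` is an assumption, and the keys of the initial and final are passed in directly because the full key tables are not reproduced in this text.

```python
# Merged mappings of the three dynamic-area keys (1-8 = Λ, 2-8 = Φ, 3-8 = Π).
STROKE_KEY = {
    "horizontal": "Λ", "vertical": "Λ",
    "left-falling": "Φ", "dot": "Φ",
    "folding": "Π",
}
CLASS_KEY = {"1-1": "Λ", "1-2": "Λ", "2-1": "Φ", "2-2": "Π"}


def key_code(initial_key, final_keys, first_stroke=None, component_class=None):
    """Join the key symbols of one character in the order
    initial, final, first stroke, component-count class.

    The two auxiliary parts are optional, mirroring on-demand input.
    """
    code = initial_key + final_keys
    if first_stroke is not None:
        code += STROKE_KEY[first_stroke]
    if component_class is not None:
        code += CLASS_KEY[component_class]
    return code


print(key_code("Z", "QN", "folding", "2-1"))   # the "Zhang" example
print(key_code("T", "JN", "vertical", "1-1"))  # the "same" example
print(key_code("Z", "QN"))                     # auxiliary codes omitted
```

Because several strokes and classes share a key, the auxiliary codes narrow the candidate set without identifying the stroke or class uniquely.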
2. Vocabulary coding method
The five codes of a Chinese character are denoted C1, C2, C3, C4 and C5 respectively. For example, if the code of the character "enemy" is DXVΦΦ, then C1 is D, C2 is X, and so on.
2.1, Two-word assembly code rules
The code of a two-character word is C1C2C3C1C2C3, that is, the first three codes of the two Chinese characters are joined. For example, if the codes of the two Chinese characters are JXMΦΠ and SBMΛ respectively, the code of the word is JXMSBM. Because the dynamic input area makes the input method highly flexible, when higher precision is needed the last two codes can still be entered after the first three codes of a character, for example JXV SBV ΦΠΛ, JXVΦΠ SBV or JXVΦΠ SBVΛ; the user enters the last two codes only as needed.
2.2, Three-word assembly code rule
There are two coding rules for three-character words:
C1C2C3C1C2C3C1C2C3, that is, the first three codes of each Chinese character are entered in sequence; after the first three codes of each character have been entered, the user can decide as needed whether to enter the auxiliary codes C4 and C5;
The first two codes of each Chinese character are taken. For example, the complete codes of the three Chinese characters of "kindergarten" are YΦZΠΦ, VPVΦΛ and YNWΛ respectively, and the code is then YΦVPYZ; in general, the three-character word "kindergarten" can be matched once the first 6 codes have been entered;
2.3, Four-word assembly code rules. There are two coding rules for four-character words:
C1C1C1C1C2C3, that is, the first code of each of the first three Chinese characters is taken, followed by the first three codes of the last Chinese character, and C4 and C5 can be entered as needed. For example, for "east, south, west, north" (dōng nán xī běi), the first codes of the first three Chinese characters are D, N and X and the code of the last Chinese character is BLSΛΦ, so the complete code is DNXBLS, and the auxiliary codes Λ and Φ can be entered as needed;
C1C2C1C2C1C2C1C2, that is, the first two codes of each Chinese character are taken. For example, the main codes of the four Chinese characters of "Huiyin like torch" are HNG, YRN, RXV and JBV respectively, so the code is HNYRRXJB;
2.4, Coding of words with five or more characters. There are two coding rules for words with five or more characters:
C1C1C1C1C1, that is, the first code of each Chinese character is taken; for example, the code of "fashion hero" (shí shì zào yīng xióng) is SSZYX;
The first two codes of each Chinese character are taken. For example, the main codes of the five Chinese characters of "more haste, less speed" (yù sù zé bù dá) are YBV, SBV, ZPV, BBV and DWV respectively, so the code is YBSBZPBBDW.
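The vocabulary coding rules above can be summarised in one Python sketch. The function name `word_code` and the `scheme` parameter are invented for illustration. The "two_each" scheme is limited here to words of four or more characters, for which the text gives consistent examples; the second three-character rule is left out because its example is not unambiguous. Each input string holds the code symbols of one character, of which only the leading ones are used.

```python
def word_code(char_codes, scheme="default"):
    """Build the code of a multi-character word from per-character codes.

    scheme="default":  2 or 3 characters -> C1C2C3 of every character
                       4 characters      -> C1 of the first three, then C1C2C3 of the last
                       5+ characters     -> C1 of every character
    scheme="two_each": 4+ characters     -> C1C2 of every character
    """
    count = len(char_codes)
    if count < 2:
        raise ValueError("a word needs at least two characters")
    if scheme == "two_each":
        if count < 4:
            raise ValueError("two_each is only sketched for four or more characters")
        return "".join(code[:2] for code in char_codes)
    if scheme != "default":
        raise ValueError(f"unknown scheme: {scheme!r}")
    if count <= 3:
        return "".join(code[:3] for code in char_codes)
    if count == 4:
        return "".join(code[0] for code in char_codes[:3]) + char_codes[3][:3]
    return "".join(code[0] for code in char_codes)


print(word_code(["JXMΦΠ", "SBMΛ"]))                              # two characters
print(word_code(["D", "N", "X", "BLSΛΦ"]))                       # four characters
print(word_code(["HNG", "YRN", "RXV", "JBV"], scheme="two_each"))
print(word_code(["YBV", "SBV", "ZPV", "BBV", "DWV"], scheme="two_each"))
```

Auxiliary codes C4 and C5 entered on demand are not handled by this sketch; they would simply be appended after the main codes of the character they refine.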
In one possible implementation, the first character and the second character of the final brevity code are generated by triggering different input areas, specifically:
when the keyboard state is the second state, generating a corresponding first character according to the second triggered position on the keyboard;
in any keyboard state, if a currently triggered third position belongs to the auxiliary input area of the keyboard, generating a corresponding second character according to the third position, and switching the keyboard state to the auxiliary input state;
when the keyboard state is the auxiliary input state, generating the corresponding Chinese character first-stroke code and Chinese character feature code in sequence according to the fourth and fifth triggered positions in the auxiliary input area;
and generating the final brevity code from the first character and the second character.
The embodiment of the application specifies how the auxiliary input area is used: the second character of the final brevity code, the Chinese character first-stroke code and the Chinese character feature code are all entered by reusing the auxiliary input area of the keyboard, without adding dedicated keys for the different code types. This optimizes the keyboard layout, so that the keyboard can carry richer input functions within a limited screen (especially the screen of a mobile device) or keyboard space while the input interface stays clear and concise, and avoids the interface bloat and visual burden caused by too many keys. In addition, the embodiment allows the user, during or after the input of the main code (such as the initial and the first part of the final), to trigger the auxiliary input area as needed to add supplementary final information and structural feature information for the character being entered (the second part of the final, the first stroke and the component-count classification), which improves the precision with which codes match Chinese characters. This design makes the input process more coherent and natural and improves the user's input experience and smoothness of operation.
In summary, the embodiments of the application solve the problem of entering toned symbols: by truncating and padding the pinyin, the pinyin part of the code has a consistent, regular length, and the tone is fixed within the second and third positions. Existing pinyin input methods discard essentially none of the pinyin letters, so their codes are irregular and vary in length, and they do not exploit the regularities of tone placement within finals. In addition, the embodiments introduce two Chinese character features, the initial stroke and the structure-count classification, which improve accuracy without affecting the input of the main code; the Chinese character feature part can be entered as needed (the pinyin part is primary and the Chinese character feature part is auxiliary, so the two are clearly distinguished and are not confused when the code is parsed). In the prior art, which lacks the "dynamic input area" of the present embodiments, the pinyin part may be confused with other codes when the code is parsed.
Embodiment two:
As shown in fig. 3, the second embodiment provides a Chinese character decoding method based on Chinese pinyin and Chinese character features, which is used to decode a Chinese character code generated by any one of the Chinese character encoding methods based on Chinese pinyin and Chinese character features according to the present application, and which includes steps S101 to S501:
Step S101, acquiring a Chinese character code;
Step S201, splitting the Chinese character code into a plurality of pieces of sub-code information according to the positions of the initial consonant brevity codes in the Chinese character code, wherein each piece of sub-code information contains at most one initial consonant brevity code, and that initial consonant brevity code is the first code position of the sub-code information;
Step S301, generating, for each piece of sub-code information, corresponding retrieval information through a preset mapping table, including: generating a plurality of corresponding complete initials according to the initial consonant brevity code; if a final brevity code exists in the sub-code information, generating a plurality of corresponding complete finals according to the final brevity code; and if a Chinese character initial stroke code exists in the sub-code information, generating corresponding Chinese character initial stroke information according to the Chinese character initial stroke code;
Step S401, for each piece of sub-code information, retrieving a plurality of corresponding candidate Chinese characters from a preset database according to the corresponding retrieval information;
Step S501, if the number of pieces of sub-code information is 1, displaying the plurality of candidate Chinese characters on a preset interface.
This embodiment provides a Chinese character decoding method corresponding to the encoding method. First, the embodiment takes into account the case in which the user enters several Chinese characters at once: a Chinese character code that may cover several Chinese characters is split, at the positions of the initial consonant brevity codes, into a plurality of pieces of sub-code information, and each piece is decoded separately, so that brevity codes are quickly converted into complete information. During information decoding, the sub-code information is checked in turn for a final brevity code, a Chinese character initial stroke code and a Chinese character feature code, and the corresponding complete finals, Chinese character initial stroke information and Chinese character feature information are generated. In other words, even when one or more of the final brevity code, the Chinese character initial stroke code and the Chinese character feature code are absent, the embodiment can still complete the decoding process and display the matching Chinese characters, which makes Chinese character input more flexible for the user. Second, the decoded retrieval information can cover up to four dimensions (initial, final, initial stroke and structural feature), so the database query is more precise, the returned results are more targeted, and the matching precision between codes and Chinese characters is improved.
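The splitting of step S201 can be sketched as follows. This is a minimal illustration that assumes initial consonant brevity codes can be recognised by a predicate (here, hypothetically, uppercase letters); the application does not state how the codes are distinguished, and the function name `split_code` and the sample code string are invented for the example.

```python
from typing import Callable


def split_code(code: str, is_initial: Callable[[str], bool]) -> list[str]:
    """Split a Chinese character code into pieces of sub-code information.

    A new piece starts at every initial consonant brevity code, so each piece
    holds at most one such code and, if present, it is the first position.
    """
    pieces: list[str] = []
    for ch in code:
        if is_initial(ch) or not pieces:
            pieces.append(ch)   # start a new piece of sub-code information
        else:
            pieces[-1] += ch    # final, initial stroke or feature code of the same character
    return pieces


# Hypothetical convention: uppercase letters are initial consonant brevity codes.
print(split_code("Ba1Tq", str.isupper))
```

Each returned piece can then be decoded independently, which matches the per-piece processing of steps S301 and S401.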
Further, in step S301, the generating a plurality of complete initials according to the initial consonant brevity code includes:
if the initial consonant brevity code is a character in a preset first character set, generating a blank placeholder as a complete initial consonant in the search information;
and if the initial consonant brevity code is a character in a preset second character set, generating the flat-tongue initial and the retroflex initial corresponding to the initial consonant brevity code as the plurality of complete initials.
In this embodiment, classifying the initial consonant brevity codes lets the system adapt more flexibly to the user's input habits and resolve the corresponding initial information. For example, the user may not need to enter a specific initial, or some Chinese characters have no initial; in that case a character in the first character set is entered at the encoding stage, and at the decoding stage a blank placeholder is generated as the complete initial in the retrieval information, so that every Chinese character has corresponding initial information and the subsequent retrieval proceeds normally. Second, for flat-tongue and retroflex sounds, which are easily confused, the embodiment generates both the flat-tongue initial and the retroflex initial from the single-letter brevity code entered by the user, which reduces the number of characters the user has to enter, improves input efficiency and improves the fault tolerance of Chinese character input. For example, a user's pronunciation may be inaccurate because of dialect or accent; with this design, reasonable candidate initials can still be generated when the input is not fully accurate, which makes the system more robust. The design also reduces the number of keys, simplifies the keyboard layout and keeps the input interface uncluttered.
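A minimal sketch of this classification is given below. The contents of the two character sets are assumptions chosen for illustration (a hypothetical marker `o` for a missing initial, and `z`, `c`, `s` each expanding to a flat-tongue initial and its retroflex counterpart); the application does not list the actual sets, and the pass-through behaviour for other codes is likewise assumed.

```python
# Hypothetical first character set: codes that stand for "no initial".
FIRST_SET = {"o"}

# Hypothetical second character set: one brevity code covers a flat-tongue
# initial and its retroflex counterpart.
SECOND_SET = {
    "z": ("z", "zh"),
    "c": ("c", "ch"),
    "s": ("s", "sh"),
}


def expand_initial(brevity_code: str) -> list[str]:
    """Return the complete initials to use as retrieval information."""
    if brevity_code in FIRST_SET:
        return [""]                        # blank placeholder as the complete initial
    if brevity_code in SECOND_SET:
        return list(SECOND_SET[brevity_code])
    return [brevity_code]                  # assumption: other codes map to themselves


print(expand_initial("z"))
```

Returning a list in every case, including the blank placeholder, lets the later retrieval step treat all initials uniformly.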
Further, in step S501, if the number of pieces of sub-code information is greater than 1, the corresponding candidate Chinese characters are permuted and combined according to a preset word stock and the order of the pieces of sub-code information in the Chinese character code to generate a plurality of words or phrases, and each word or phrase is displayed on the preset interface.
This embodiment further considers how results are displayed after the user enters several Chinese characters at once. Because each piece of sub-code information can match several Chinese characters, when there is only one piece of sub-code information the matching Chinese characters can be displayed directly. When there are several pieces, the number of possible combinations of the matched Chinese characters grows multiplicatively, and requiring the user to pick the intended characters from that many results would seriously degrade the user experience. The embodiment therefore uses the preset word stock together with the order of the pieces of sub-code information to combine the candidate Chinese characters into coherent words or phrases, which greatly reduces the number of matches, lets the user quickly select the intended Chinese characters, and improves both the matching precision between codes and Chinese characters and the user's input experience.
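The combination step can be illustrated with the following sketch, which assumes the preset word stock is available as a plain set of strings and that the candidate lists are already ordered as the pieces of sub-code information appear in the Chinese character code. The function name `combine_candidates` and the sample data are hypothetical.

```python
from itertools import product


def combine_candidates(candidates: list[list[str]], word_stock: set[str]) -> list[str]:
    """Keep only those ordered combinations that appear in the word stock."""
    words = []
    for combo in product(*candidates):   # one candidate per piece, in code order
        word = "".join(combo)
        if word in word_stock:
            words.append(word)
    return words


# Two pieces of sub-code information, each matching two candidate characters.
candidates = [["中", "钟"], ["国", "果"]]
word_stock = {"中国", "水果"}
print(combine_candidates(candidates, word_stock))
```

Of the four raw combinations only the one found in the word stock is returned, which is the reduction in the number of matches that the embodiment describes.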
Embodiment three:
As shown in fig. 4, the third embodiment provides a Chinese character encoding system based on Chinese pinyin and Chinese character features, which includes an initial consonant brevity code generating module 10, a final brevity code generating module 20, a Chinese character initial stroke code generating module 30, a Chinese character feature code generating module 40 and a combining module 50;
The initial consonant brevity code generating module 10 is configured to generate a corresponding initial consonant brevity code according to a first position triggered on the keyboard when the keyboard state is a first state;
The final brevity code generating module 20 is configured to generate a corresponding final brevity code according to a second position and a third position triggered on the keyboard and their triggering sequence when the keyboard state is a second state;
The Chinese character initial stroke code generating module 30 is configured to, in any keyboard state, generate a corresponding Chinese character initial stroke code according to a currently triggered fourth position if the fourth position belongs to an auxiliary input area of the keyboard, and switch the keyboard state to an auxiliary input state;
The Chinese character feature code generating module 40 is configured to generate a corresponding Chinese character feature code according to a fifth position triggered in the auxiliary input area when the keyboard state is the auxiliary input state;
The combining module 50 is configured to combine, when the input is completed, the generated initial consonant brevity code, final brevity code, Chinese character initial stroke code and Chinese character feature code into a Chinese character code according to the code generation sequence.
Further, when the keyboard state is the second state, the final brevity code generating module 20 generating a corresponding final brevity code according to the second position and the third position triggered on the keyboard and their triggering sequence includes:
generating a corresponding second character and a corresponding third character according to the second position and the third position on the keyboard;
combining the second character with the third character to generate a combined character;
and determining the tone and the tone position of the combined character according to the triggering sequence, thereby generating the corresponding final brevity code.
Furthermore, the Chinese character initial stroke code corresponds to Chinese character initial stroke information, and the Chinese character initial stroke information is 'horizontal', 'vertical', 'left-falling', 'dot' or 'folding'.
Further, the Chinese character feature codes correspond to Chinese character feature information, and the Chinese character feature information is a first feature, a second feature, a third feature or a fourth feature;
the first feature is that the target Chinese character to be encoded can be split left and right into two or more first independent components, and any one of the first independent components can be split up and down into two or more second independent components;
the second feature is that the target Chinese character can be split left and right into two or more first independent components, and none of the first independent components can be split up and down into two or more second independent components;
the third feature is that the target Chinese character cannot be split left and right into two or more first independent components, but can be split up and down into two or more second independent components;
the fourth feature is that the target Chinese character can be split neither left and right nor up and down.
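The four features form a simple decision tree, which may be sketched as follows. The function name `classify_feature` and its three boolean inputs are hypothetical; how the splittability of a given Chinese character is determined is outside this sketch and is not specified here.

```python
def classify_feature(splits_left_right: bool,
                     any_part_splits_up_down: bool,
                     splits_up_down: bool) -> int:
    """Return 1 to 4 for the first to fourth feature.

    splits_left_right:       the character splits left and right into components
    any_part_splits_up_down: at least one of those components splits up and down
                             (only meaningful when splits_left_right is True)
    splits_up_down:          the character as a whole splits up and down
                             (only consulted when splits_left_right is False)
    """
    if splits_left_right:
        return 1 if any_part_splits_up_down else 2
    return 3 if splits_up_down else 4


print(classify_feature(True, False, False))
```

The left-right test is applied first, so the up-down test on the whole character is only consulted when no left-right split exists, mirroring the wording of the third and fourth features.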
In one possible implementation, the final brevity code generating module 20 includes a first character generating unit, a second character generating unit and a combining unit, which are respectively configured to generate the first character of the final brevity code, the second character of the final brevity code, and the final brevity code itself, specifically:
when the keyboard state is the second state, the first character generating unit generates a corresponding first character according to the second position triggered on the keyboard;
in any keyboard state, if a currently triggered third position belongs to the auxiliary input area of the keyboard, the second character generating unit generates a corresponding second character according to the third position and switches the keyboard state to the auxiliary input state;
when the keyboard state is the auxiliary input state, the Chinese character initial stroke code generating module 30 and the Chinese character feature code generating module 40 respectively generate a corresponding Chinese character initial stroke code and a corresponding Chinese character feature code according to the fourth position and the fifth position triggered in the auxiliary input area;
the combining unit generates the final brevity code according to the first character and the second character.
This embodiment provides a Chinese character encoding system based on Chinese pinyin and Chinese character features, in which the initial consonant brevity code, the final brevity code, the Chinese character initial stroke code and the Chinese character feature code are generated in different keyboard states and combined in sequence into a complete Chinese character code. The embodiment integrates the pinyin features and the structural features of Chinese characters, so that the code reflects both pronunciation and glyph shape and the rate of duplicate codes is reduced. For example, in traditional pinyin input methods homophones are numerous, so the user often has to page through candidates to find the target Chinese character; in this embodiment, Chinese characters whose pinyin codes share the same initial and final are further distinguished by initial stroke information and structural features, so that the most likely candidate Chinese characters can be screened out quickly at the subsequent decoding stage, which shortens the user's search time and improves the matching precision between codes and Chinese characters. Second, the embodiment adopts a state-dependent dynamic coding mechanism, so that the same key position can generate different codes in different input states, which reduces the space occupied by the keyboard layout and keeps the input interface simple. In addition, the embodiment provides an auxiliary input area: the user can encode the structural features of the current Chinese character at any time by triggering the auxiliary input area as needed, without frequently switching modes during input, so that operation is smooth and natural.
Finally, because the coding rule takes into account both the naturalness of pinyin and the structure of Chinese characters, the learning threshold is low and the user can master the method in a short time, which further improves the user's input experience.
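The state-dependent behaviour of the encoding system can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not the system itself: the transition from the first state to the second state after the initial consonant brevity code, the treatment of every main-area key in the second state as part of the final brevity code (tone handling is omitted), and the rejection of main-area keys in the auxiliary input state are all assumptions, and the names `State` and `Encoder` are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    FIRST = auto()      # waiting for the initial consonant brevity code
    SECOND = auto()     # entering the final brevity code
    AUXILIARY = auto()  # auxiliary input state


class Encoder:
    def __init__(self) -> None:
        self.state = State.FIRST
        self.codes: list[tuple[str, str]] = []  # (kind, character) in generation order

    def press(self, key: str, in_aux_area: bool = False) -> None:
        if in_aux_area:
            if self.state is State.AUXILIARY:
                self.codes.append(("feature", key))
            else:
                # An auxiliary-area key in any other state yields the initial
                # stroke code and switches to the auxiliary input state.
                self.codes.append(("stroke", key))
                self.state = State.AUXILIARY
        elif self.state is State.FIRST:
            self.codes.append(("initial", key))
            self.state = State.SECOND           # assumed transition
        elif self.state is State.SECOND:
            self.codes.append(("final", key))   # simplified: tone logic omitted
        else:
            raise ValueError("main-area key pressed in the auxiliary input state")

    def finish(self) -> str:
        """Combine the generated codes in the order they were produced."""
        return "".join(ch for _, ch in self.codes)


enc = Encoder()
for key, aux in [("b", False), ("a", False), ("n", False), ("h", True), ("2", True)]:
    enc.press(key, aux)
print(enc.finish())
```

The same `press` entry point produces four kinds of code depending only on the current state and on whether the key lies in the auxiliary input area, which is the dynamic coding mechanism the embodiment relies on.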
Embodiment four:
As shown in fig. 5, the fourth embodiment provides a Chinese character decoding system based on Chinese pinyin and Chinese character features, which is configured to decode a Chinese character code generated by any one of the Chinese character encoding systems based on Chinese pinyin and Chinese character features according to the present application, and which includes an obtaining module 101, a splitting module 102, an information decoding module 103, a retrieving module 104 and a display module 105;
wherein the obtaining module 101 is configured to obtain a Chinese character code;
the splitting module 102 is configured to split the Chinese character code into a plurality of pieces of sub-code information according to the positions of the initial consonant brevity codes in the Chinese character code, where each piece of sub-code information contains at most one initial consonant brevity code, and that initial consonant brevity code is the first code position of the sub-code information;
the information decoding module 103 is configured to generate, for each piece of sub-code information, corresponding retrieval information through a preset mapping table, including: generating a plurality of corresponding complete initials according to the initial consonant brevity code; if a final brevity code exists in the sub-code information, generating a plurality of corresponding complete finals according to the final brevity code; and if a Chinese character initial stroke code exists in the sub-code information, generating corresponding Chinese character initial stroke information according to the Chinese character initial stroke code;
the retrieving module 104 is configured to retrieve, for each piece of sub-code information, a plurality of corresponding candidate Chinese characters from a preset database according to the corresponding retrieval information;
the display module 105 is configured to display the plurality of candidate Chinese characters on a preset interface if the number of pieces of sub-code information is 1.
Further, the information decoding module 103 generates a plurality of complete initials according to the initial consonant brevity codes, including:
if the initial consonant brevity code is a character in a preset first character set, generating a blank placeholder as a complete initial consonant in the search information;
and if the initial consonant brevity code is a character in a preset second character set, generating the flat-tongue initial and the retroflex initial corresponding to the initial consonant brevity code as the plurality of complete initials.
Further, if the number of pieces of sub-code information is greater than 1, the display module 105 permutes and combines the corresponding candidate Chinese characters according to a preset word stock and the order of the pieces of sub-code information in the Chinese character code, generates a plurality of words or phrases, and displays each word or phrase on the preset interface.
This embodiment provides a Chinese character decoding system corresponding to the encoding system. First, the embodiment takes into account the case in which the user enters several Chinese characters at once: a Chinese character code that may cover several Chinese characters is split, at the positions of the initial consonant brevity codes, into a plurality of pieces of sub-code information, and each piece is decoded separately, so that brevity codes are quickly converted into complete information. During information decoding, the sub-code information is checked in turn for a final brevity code, a Chinese character initial stroke code and a Chinese character feature code, and the corresponding complete finals, Chinese character initial stroke information and Chinese character feature information are generated. In other words, even when one or more of the final brevity code, the Chinese character initial stroke code and the Chinese character feature code are absent, the embodiment can still complete the decoding process and display the matching Chinese characters, which makes Chinese character input more flexible for the user. Second, the decoded retrieval information can cover up to four dimensions (initial, final, initial stroke and structural feature), so the database query is more precise, the returned results are more targeted, and the matching precision between codes and Chinese characters is improved.
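How retrieval can tolerate absent codes may be sketched as follows: each optional dimension only narrows the result when it is present. The record layout `Entry`, the function `search` and the three hand-written records are hypothetical illustrations; the feature values in particular are assigned by hand for the example and are not taken from the application's database.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Entry:
    char: str
    initial: str
    final: str
    stroke: str   # initial stroke information
    feature: int  # 1 to 4, the first to fourth feature


# Toy records for illustration only.
TOY_DB = [
    Entry("中", "zh", "ong", "vertical", 4),
    Entry("钟", "zh", "ong", "left-falling", 2),
    Entry("宗", "z", "ong", "dot", 3),
]


def search(db: Iterable[Entry],
           initials: Iterable[str],
           finals: Optional[Iterable[str]] = None,
           stroke: Optional[str] = None,
           feature: Optional[int] = None) -> list[str]:
    """Return candidate characters; a dimension left as None is not filtered on."""
    initial_set = set(initials)
    final_set = None if finals is None else set(finals)
    return [
        e.char for e in db
        if e.initial in initial_set
        and (final_set is None or e.final in final_set)
        and (stroke is None or e.stroke == stroke)
        and (feature is None or e.feature == feature)
    ]


print(search(TOY_DB, ["z", "zh"]))                # initials only
print(search(TOY_DB, ["z", "zh"], stroke="dot"))  # narrowed by initial stroke
```

With only the initials supplied, every record with a matching initial is returned; each additional dimension shrinks the candidate list, which is how the extra codes improve matching precision.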
The foregoing embodiments have been provided for the purpose of illustrating the general principles of the present application, and are not to be construed as limiting the scope of the application. It should be noted that any modifications, equivalent substitutions, improvements, etc. made by those skilled in the art without departing from the spirit and principles of the present application are intended to be included in the scope of the present application.

Claims (10)

1. A Chinese character encoding method based on Chinese phonetic alphabet and Chinese character features is characterized by comprising the following steps:
When the keyboard state is the first state, generating a corresponding initial consonant brevity code according to the triggered first position on the keyboard;
when the keyboard state is the second state, generating a corresponding final simplified code according to the triggered second position, the triggered third position and the triggering sequence on the keyboard;
in any keyboard state, if the fourth position which is triggered currently belongs to an auxiliary input area in the keyboard, generating a corresponding Chinese character starting code according to the fourth position, and switching the keyboard state into an auxiliary input state;
When the keyboard state is an auxiliary input state, generating a corresponding Chinese character feature code according to a fifth position triggered in the auxiliary input area;
When the input is completed, the generated initial simplified codes, the generated final simplified codes, the initial codes of the Chinese characters and the Chinese character characteristic codes are combined to construct Chinese character codes according to the code generation sequence.
2. The method for encoding Chinese characters based on pinyin and Chinese character features as claimed in claim 1, wherein when the keyboard state is the second state, generating the corresponding final simplified code according to the triggered second position, third position and triggering sequence on the keyboard comprises:
Generating a corresponding second character and a corresponding third character according to the second position, the third position and the triggering sequence on the keyboard;
Determining a tone and a tone position according to the second character and the third character;
and generating a corresponding final simple code according to the second character, the third character, the tone and the tone position.
3. The method for encoding Chinese characters based on pinyin and Chinese character features as claimed in claim 1, wherein the Chinese character initial stroke code corresponds to Chinese character initial stroke information, and the Chinese character initial stroke information is "horizontal", "vertical", "left-falling", "dot" or "folding".
4. The method for encoding Chinese characters based on pinyin and character features as claimed in claim 1, wherein the character feature codes correspond to character feature information, and the character feature information is a first feature, a second feature, a third feature or a fourth feature;
the first characteristic is that the target Chinese character to be encoded can be split into two or more first independent components left and right, and any one of the first independent components can be split into two or more second independent components up and down;
The second characteristic is that the target Chinese character can be split into two or more first independent components left and right, and each first independent component cannot be split into two or more second independent components up and down;
the third characteristic is that the target Chinese character cannot be split into two or more first independent components left and right, and the target Chinese character can be split into two or more second independent components up and down;
the fourth characteristic is that the target Chinese character can be split neither left and right nor up and down.
5. The method for encoding Chinese characters based on pinyin and Chinese character features as claimed in claim 1, wherein the first character and the second character in the final brevity code are triggered and generated by different input areas respectively, specifically:
when the keyboard state is the second state, generating a corresponding first character according to the triggered second position on the keyboard;
in any keyboard state, if a third position which is triggered currently belongs to an auxiliary input area in the keyboard, generating a corresponding second character according to the third position, and switching the keyboard state into an auxiliary input state;
When the keyboard state is an auxiliary input state, generating corresponding Chinese character initial codes and Chinese character feature codes according to the triggered fourth position and fifth position in the auxiliary input area in sequence;
and generating the final brevity code according to the first character and the second character.
6. A method for decoding Chinese characters based on pinyin and Chinese character features, wherein the method is used for decoding a Chinese character code generated by the method for encoding Chinese characters based on pinyin and Chinese character features as claimed in any one of claims 1 to 5, and comprises:
Acquiring Chinese character codes;
splitting the Chinese character code into a plurality of sub-code information according to the positions of initial consonant brevity codes in the Chinese character code, wherein at most only one initial consonant brevity code exists in the sub-code information, and the initial consonant brevity code is the first bit code in the sub-code information;
Generating corresponding retrieval information through a preset mapping table according to the sub-coding information, wherein the retrieval information comprises a plurality of complete initials which are generated according to the initial brevity codes; generating a plurality of corresponding complete vowels according to the vowel simple codes if the vowel simple codes exist in the sub-coding information, generating corresponding Chinese character starting information according to the Chinese character starting codes if the Chinese character starting codes exist in the sub-coding information;
For each piece of sub-coding information, searching a plurality of corresponding candidate Chinese characters from a preset database according to the corresponding searching information;
and if the number of the subcode information is 1, displaying the plurality of candidate Chinese characters on a preset interface.
7. The method for decoding Chinese characters based on pinyin and character features of claim 6, wherein generating a plurality of complete initials corresponding to the initial brevity code comprises:
if the initial consonant brevity code is a character in a preset first character set, generating a blank placeholder as a complete initial consonant in the search information;
and if the initial consonant brevity code is a character in a preset second character set, generating the flat-tongue initial and the retroflex initial corresponding to the initial consonant brevity code as the plurality of complete initials.
8. The method of claim 6, wherein if the number of the sub-code information is greater than 1, according to a preset word stock and the sequence of each sub-code information in the Chinese character code, arranging and combining the corresponding candidate Chinese characters to generate a plurality of words or phrases, and displaying each word or phrase on a preset interface.
9. The Chinese character coding system based on the Chinese phonetic alphabet and the Chinese character characteristics is characterized by comprising an initial simplified code generation module, a final simplified code generation module, a Chinese character initial stroke code generation module, a Chinese character characteristic code generation module and a combination module;
the initial consonant brevity code generation module is used for generating a corresponding initial consonant brevity code according to a first triggered position on the keyboard when the keyboard state is a first state;
the final simple code generation module is used for generating a corresponding final simple code according to the triggered second position, the triggered third position and the triggering sequence on the keyboard when the keyboard state is the second state;
The Chinese character initial code generating module is used for generating a corresponding Chinese character initial code according to a fourth position when the fourth position which is triggered currently belongs to an auxiliary input area in the keyboard in any keyboard state, and switching the keyboard state into an auxiliary input state;
The Chinese character feature code generating module is used for generating a corresponding Chinese character feature code according to a fifth triggered position in the auxiliary input area when the keyboard state is the auxiliary input state;
And the combination module is used for constructing the generated initial consonant brevity codes, the generated final simple codes, the Chinese character initial stroke codes and the Chinese character characteristic codes into Chinese character codes according to the code generation sequence when the input is completed.
10. A Chinese character decoding system based on Chinese phonetic alphabets and Chinese character features, which is used for decoding a Chinese character code generated by a Chinese character coding system based on Chinese phonetic alphabets and Chinese character features as claimed in claim 9, and comprises an acquisition module, a splitting module, an information decoding module, a retrieval module and a display module;
The acquisition module is used for acquiring Chinese character codes;
The splitting module is used for splitting the Chinese character code into a plurality of sub-code information according to the position of an initial consonant simple code in the Chinese character code, wherein at most one initial consonant simple code exists in the sub-code information, and the initial consonant simple code is the first bit code in the sub-code information;
The information decoding module is used for generating corresponding retrieval information according to the sub-coding information through a preset mapping table, and generating a plurality of corresponding complete initials according to the initial consonant brevity codes; generating a plurality of corresponding complete vowels according to the vowel simple codes if the vowel simple codes exist in the sub-coding information, generating corresponding Chinese character starting information according to the Chinese character starting codes if the Chinese character starting codes exist in the sub-coding information;
The searching module is used for searching a plurality of corresponding candidate Chinese characters from a preset database according to the corresponding searching information for each piece of sub-coding information;
And the display module is used for displaying the plurality of candidate Chinese characters on a preset interface if the number of the subcode information is 1.
CN202610021821.6A 2026-01-08 2026-01-08 Chinese character encoding and decoding method and system based on Chinese pinyin and Chinese character characteristics Pending CN121881982A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202610021821.6A CN121881982A (en) 2026-01-08 2026-01-08 Chinese character encoding and decoding method and system based on Chinese pinyin and Chinese character characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202610021821.6A CN121881982A (en) 2026-01-08 2026-01-08 Chinese character encoding and decoding method and system based on Chinese pinyin and Chinese character characteristics

Publications (1)

Publication Number Publication Date
CN121881982A (en) 2026-04-17

Family

ID=99417579

Family Applications (1)

Application Number Priority Date Filing Date Title
CN202610021821.6A Pending CN121881982A (en) 2026-01-08 2026-01-08 Chinese character encoding and decoding method and system based on Chinese pinyin and Chinese character characteristics

Country Status (1)

Country Link
CN (1) CN121881982A (en)

Similar Documents

Publication Publication Date Title
CN101310244B (en) Information input method based on Chinese phonetic alphabets
CN117519493B (en) Full-spelling input method based on 10-key keyboard and applied to small-screen electronic equipment
US8977535B2 (en) Transliterating methods between character-based and phonetic symbol-based writing systems
CN106716396A (en) Input of characters based on sign-based written language
RU2671043C1 (en) Method, system and keypad for input of hieroglyphs
CN117590953B (en) Double-spelling input method based on 10-key keyboard and applied to small-screen electronic equipment
CN102455845A (en) Character input method and device
JP4662667B2 (en) Character input device and method for small keypad
US9171234B2 (en) Method of learning a context of a segment of text, and associated handheld electronic device
CN103246354B (en) Input method and the keyboard thereof of Chinese character is expressed with common language literal code
CN121881982A (en) Chinese character encoding and decoding method and system based on Chinese pinyin and Chinese character characteristics
JP2002207728A (en) Phonetic character generation device and recording medium storing program for realizing the same
KR102517021B1 (en) Hangul input device and method for language with rhythm and prosody
CN101017397A (en) Computer Chinese character input system and input method
JP5751537B2 (en) International Japanese input system
KR101777141B1 (en) Apparatus and method for inputting chinese and foreign languages based on hun min jeong eum using korean input keyboard
JP4302918B2 (en) Hangul character generation method and dictionary lookup method
KR101140767B1 (en) input device using extended HANGUL inscription method and input method using it
US7665037B2 (en) Method of learning character segments from received text, and associated handheld electronic device
JPH01287774A (en) Japanese data input processor
CA2658586C (en) Learning character segments from received text
CN119987568A (en) A method for generating a short spelling input method
JPH08272780A (en) Chinese input processing apparatus, Chinese input processing method, language processing apparatus and language processing method
TW552517B (en) Classified input method
CA2653823C (en) Method of learning a context of a segment of text, and associated handheld electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination