CN105912139B - Method for correspondingly recognizing modular stroke coding Chinese characters

Publication number: CN105912139B
Authority: CN (China)
Application number: CN201610216758.8A
Other languages: Chinese (zh)
Other versions: CN105912139A (en)
Inventor: 金云中
Current Assignee: Individual
Original Assignee: Individual
Application filed by: Individual
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Prior art keywords: module, stroke, vertical, chinese character, divided

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02: Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023: Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233: Character input methods
    • G06F3/0237: Character input methods using prediction or retrieval techniques

Landscapes

  • Engineering & Computer Science
  • General Engineering & Computer Science
  • Theoretical Computer Science
  • Human Computer Interaction
  • Physics & Mathematics
  • General Physics & Mathematics
  • Document Processing Apparatus

Abstract

A method for recognizing Chinese characters by modular stroke codes, in the field of Chinese character stroke coding. The two-dimensional glyph of a Chinese character is compressed in a modular way and coded in sequence, which overcomes the duplicate codes and unreadability of plain stroke codes. The method fills a gap in existing input methods, which lack a simple and accurate stroke-shape input method that matches how people habitually write Chinese characters. It also addresses the complicated and unreasonable lookup steps of existing Chinese dictionaries, and the unsuitability of pinyin typesetting for polyphones, whose readings are listed separately.

Description

Method for correspondingly recognizing modular stroke coding Chinese characters
Technical field:
The invention relates to a method for coding Chinese characters by modular strokes, and in particular to a method for directly looking up and outputting Chinese characters by modular strokes.
Background art:
The glyph of a Chinese character is the final carrier of the character's meaning and pronunciation. The stroke is the smallest unit of the glyph, and the set of basic stroke shapes is limited, so a stroke can be represented by a single byte in computer coding.
Stroke coding of Chinese characters has historically had two major problems. First, compared with the existing 2-byte coding of Chinese characters in the UCS-2 standard of the Unicode character set, a Chinese character has on average far more than 2 strokes, so a stroke-coded character takes far more than 2 bytes, which places high demands on computer hardware. Second, the glyph of a Chinese character is a square two-dimensional figure; coding the strokes directly cannot resolve duplicate codes, and the resulting string cannot be read back as a two-dimensional glyph. For these two reasons, stroke codes have made no headway in Chinese character coding.
Today, with the rapid development of the computer electronics industry and of information transmission technology, fiber-to-the-home network coverage has been achieved in China. In wireless transmission, 4G networks are being rolled out in full swing, with transmission speeds of up to 100 Mbps, which is equivalent to 12.5 MB/s. In processing, supercomputers reach billions of operations per second, and on the computer bus a single lane of the newly adopted PCI-E 3.0 reaches a bandwidth of about 1 GB/s. The performance needed to handle stroke-coded Chinese characters on computers is already more than sufficient.
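The unit conversion behind the 100 Mbps figure can be checked with a minimal sketch. The function name is hypothetical and chosen only for illustration; the only fact used is that one byte is 8 bits.

```python
def mbps_to_megabytes_per_second(mbps: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8


# 100 megabits per second, the 4G figure quoted in the text.
print(mbps_to_megabytes_per_second(100))
```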
The three most widely used input methods are the pinyin input method, the Wubi input method and the stroke input method. The pinyin input method is common because people are used to pinyin. However, it has no typing depth: once the pinyin has been typed in full, the desired character is often still not shown, and paging through the candidates takes a long time. The Wubi input method has few duplicate codes and a high input speed, but its character roots are hard to memorize and their layout on the keyboard must be thoroughly familiar, so few people other than professional typists study it seriously. The stroke input method is simple and easy to learn, and is the easiest method for a first-time typist. Because it uses few stroke types, however, it suffers from many duplicate codes, characters that are hard to find and hard to type, and poor continuous multi-character input, so it is suitable only for small input devices such as mobile phones.
When learning Chinese characters, an unknown character has to be looked up in the dictionary. But the dictionary is typeset by pinyin, and the pronunciation of an unknown character is exactly what the reader does not know, so the pinyin ordering is of no use. The only option is the stroke lookup method: find the radical, count the strokes of the radical, and find the page where the radical is listed; then count the strokes of the character, find the character's page number in a group of small-print characters, and finally turn to that page to read the entry. Even then, the last line of the entry may give only another pinyin and page number, because the character is a polyphone. The steps for consulting a Chinese dictionary are thus complicated and unreasonable.
In summary, the prior art has the following disadvantages: stroke coding is the key to single-byte coding of Chinese characters, but it suffers from duplicate codes and unreadability; existing input methods lack a simple and accurate stroke-shape input method that matches how Chinese characters are habitually written; and Chinese dictionaries require complicated and unreasonable lookup steps, while pinyin typesetting is ill-suited to polyphones, whose readings are listed separately.
Summary of the invention:
The invention aims to provide a method for ordering and coding the strokes of Chinese characters that follows the compositional structure of the glyph, is easy to remember and easy to learn, and matches the standard writing habits of Chinese characters, and thereby to create a new stroke coding method for use in computer coding of Chinese characters, in input methods and in Chinese dictionaries.
To achieve the above object, the present invention provides a method for coding Chinese characters by modular strokes. The square glyph of a Chinese character is assigned to one of 13 module types according to how it divides; the content of each small module is then coded by strokes; the small modules are arranged from left to right and from top to bottom, following the writing order of Chinese characters; and a leading module type code, together with the segment points of the small-module segment codes, completes the stroke code of the character. The modular stroke coding steps are as follows:
(1) The square glyphs of Chinese characters are classified into 13 standard module types according to how they divide. The module types are as follows:
a. The "one" type represents a simple glyph: an indivisible Chinese character. The characters it contains are generally the building blocks of the multi-module types.
b. The "two" type represents a glyph composed of an upper module and a lower module; the glyph within each module is indivisible, and the character divides across the middle.
c. The "three" type represents a glyph composed of an upper, a middle and a lower module, taken from top to bottom in writing order. Glyphs with more than three such blocks are also assigned to this type, and no further types are added for them.
d. The "vertical two" type represents a glyph composed of a left module and a right module, where the glyph within each small module cannot be divided.
e. The "vertical three" type represents a glyph composed of a left, a middle and a right module, taken from left to right in writing order.
f. The "right two" type represents a glyph whose main body is divided into a left part and a right part, where the right part can be further divided into an upper and a lower part, and into no more than two.
g. The "left two" type represents a glyph whose main body is divided into a left part and a right part, where the left part is first divided into an upper and a lower part, and into no more than two. It is the mirror image of the "right two" type.
h. The "right three" type represents a glyph whose main body is divided into a left part and a right part, where the right part can be divided into upper, middle and lower parts, written from left to right and then from top to bottom. Glyphs whose right part divides into more than three parts are also assigned to this type.
i. The "left three" type represents a glyph whose main body is divided into a left part and a right part, where the left part is first divided into upper, middle and lower parts, written from top to bottom and then from left to right. Glyphs whose left part divides into more than three parts are also assigned to this type. It is the mirror image of the "right three" type.
j. The "upper two" type represents a glyph whose main body is divided into an upper part and a lower part, where the upper part can be divided into a left and a right small part, written from left to right and then from top to bottom. The upper part is divided into no more than two.
k. The "lower two" type represents a glyph whose main body is divided into an upper part and a lower part, where the lower part can be divided into a left and a right small part, written from top to bottom and then from left to right. The lower part is divided into no more than two. It is the mirror image of the "upper two" type.
l. The "upper three" type represents a glyph whose main body is divided into an upper part and a lower part, where the upper part is first divided into left, middle and right small parts, written from left to right and then from top to bottom.
m. The "lower three" type represents a glyph whose main body is divided into an upper part and a lower part, where the lower part can be divided into left, middle and right parts, written from top to bottom and then from left to right. It is the mirror image of the "upper three" type.
These 13 module types can be combined with one another to form other layouts, so every glyph maps onto the 13 module types. The types are few in number, follow an obvious pattern and are easy to remember. They are shown as two-dimensional diagrams in figure 1.
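The 13 module types can be written down as a small lookup table. The following is a minimal sketch, not the patent's own data format: the region names ("left", "right-top", and so on) and the function name are hypothetical labels, and each list gives the small modules in the writing order described for that type. For the "three", "right three" and "left three" types only the base three-way split is shown.

```python
# Each module type maps to its small modules, listed in writing order.
# Region names are illustrative labels, not terms from the patent.
MODULE_TYPES = {
    "one": ["whole"],
    "two": ["top", "bottom"],
    "three": ["top", "middle", "bottom"],
    "vertical two": ["left", "right"],
    "vertical three": ["left", "middle", "right"],
    "right two": ["left", "right-top", "right-bottom"],
    "left two": ["left-top", "left-bottom", "right"],
    "right three": ["left", "right-top", "right-middle", "right-bottom"],
    "left three": ["left-top", "left-middle", "left-bottom", "right"],
    "upper two": ["top-left", "top-right", "bottom"],
    "lower two": ["top", "bottom-left", "bottom-right"],
    "upper three": ["top-left", "top-middle", "top-right", "bottom"],
    "lower three": ["top", "bottom-left", "bottom-middle", "bottom-right"],
}


def module_count(module_type: str) -> int:
    """Number of small modules (stroke code segments) for a module type."""
    return len(MODULE_TYPES[module_type])


print(module_count("right two"))
```

Under this representation, the number of stroke code segments in a character's code follows directly from its module type.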
Because Chinese characters are the crystallization of five thousand years of Chinese culture, their glyphs are varied and did not evolve according to a fixed pattern. The modular stroke coding method therefore summarizes 6 rules for dividing a glyph into a module type:
1. Divisible: the glyph is first divided into small blocks wherever its strokes neither overlap nor interleave, and the result is then compared against the 13 module types to find the best match.
2. Divide first: for some glyphs, rule 1 allows the main body to be divided either into upper and lower parts or into left and right parts. In that case the main body is divided according to which division comes first in the stroke writing order.
3. When a glyph is complex and combines several module types, the direction with more modules is kept and the direction with fewer modules is compressed into a single module; the result is then compared against the module types.
4. When only a single stroke does not intersect the rest of the glyph, the glyph is not divided at that stroke; apart from lone horizontal and vertical strokes, the leading part should contain as many strokes as possible.
5. When the main body is divided into left and right parts or into upper and lower parts and the first part is a radical, the first part is not divided further. The original character from which the radical derives may still be divided when it meets the other division conditions.
6. All division into modules follows the standard writing order of the glyph. When a glyph appears to divide into left and right or upper and lower parts but its first and last strokes are completed in the same part, the glyph is not divided.
(2) After the modular division of the glyph is complete, the second step is to code each small module by strokes in stroke order, which serves as the basis for each application. The stroke code table is as follows:
[Figure: stroke code table; image not reproduced]
To mark the position of each stroke within the code, the invention adds a segment point code, which separates the segments of the two-dimensional code; its symbol is provisionally ",". A complete modular stroke code therefore has the form "module type code, then stroke code segments separated by segment points (,)". The first byte of a character's code carries the two-dimensional layout information, and the following codes carry the stroke information. This reduces, as far as possible, the duplicate codes that arise in stroke coding when different glyphs have the same strokes, and gives the code a degree of reverse readability.
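The composition of a complete code can be sketched as follows. This is a minimal illustration under stated assumptions: the stroke code table is an image that is not reproduced here, so strokes are assumed to be written as the digits 0-9; the module type is assumed to be a single character (here the list letters a-m, which is a hypothetical choice); and the segment point is assumed to sit between consecutive segments.

```python
SEGMENT_POINT = ","  # provisional segment point symbol named in the text


def encode(module_code: str, segments: list[str]) -> str:
    """Build a code: one module type character, then segments joined by ','.

    module_code: hypothetical single-character module type code (e.g. 'd').
    segments: stroke code strings, one per small module, in writing order.
    """
    if len(module_code) != 1:
        raise ValueError("module type code is assumed to be one character")
    return module_code + SEGMENT_POINT.join(segments)


def decode(code: str) -> tuple[str, list[str]]:
    """Split a code back into its module type character and its segments."""
    return code[0], code[1:].split(SEGMENT_POINT)


code = encode("d", ["123", "45"])
print(code)
print(decode(code))
```

Because the first character fixes the layout and the segment points fix the module boundaries, the code can be read back into a module type plus per-module stroke sequences, which is the "reverse readability" the text refers to.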
Different stroke coding combinations are selected for different applications:
1. The first application is computer coding, which requires that there be no duplicate codes, so that Chinese characters and codes correspond one to one. Modular stroke coding already reduces duplicate codes as far as it can, but modularization only turns one large block into several small blocks, and each small block is still a two-dimensional figure. To handle this, the invention adds a "deformation code", whose symbol is provisional: among different glyphs with the same strokes, a deformation code is appended to the code of one of them to tell them apart, which eliminates duplicate codes in modular stroke coding. Because a modular stroke coded Chinese character consists of several single-byte codes, the garbled-text problem of traditional double-byte coded Chinese characters is also avoided.
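The role of the deformation code can be sketched as follows. This is an illustration under assumptions, not the patent's procedure: the mark "'" is a hypothetical stand-in for the provisional symbol, the glyph labels and codes are made up, and the handling of three or more glyphs sharing one code (one extra mark per additional glyph) is an extension the text does not specify.

```python
DEFORMATION_MARK = "'"  # hypothetical stand-in for the provisional symbol


def make_unique(entries: list[tuple[str, str]]) -> dict[str, str]:
    """Append deformation marks so that every glyph gets a distinct code.

    entries: (glyph label, modular stroke code) pairs; the codes are assumed
    not to contain the deformation mark already.
    """
    seen: dict[str, int] = {}
    result: dict[str, str] = {}
    for label, code in entries:
        count = seen.get(code, 0)
        # The first glyph keeps the plain code; later ones get extra marks.
        result[label] = code + DEFORMATION_MARK * count
        seen[code] = count + 1
    return result


codes = make_unique([
    ("glyph_a", "d12,34"),
    ("glyph_b", "d12,34"),  # same strokes and layout as glyph_a
    ("glyph_c", "b5,6"),
])
print(codes)
```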
The greatest advantage of modular stroke coding is that it can be updated. When a character has not yet been recorded in the computer's code set, its code can still be typed out in full according to the modular coding method, and the data can be stored in that form. Once the character is later recorded in the computer system under the same modular coding method, the code segments already stored display correctly from then on.
2. The second application is the Chinese dictionary. Each character in the dictionary is coded with the modular stroke code. The characters are first grouped by module class, and characters of the same module class are typeset together (as shown in figure 2). Then, using the correspondence between the first and third columns of the table (the ten Arabic numerals 0-9 represent the ten stroke types), the first three codes of the first segment and the first three codes of the second segment of each modular stroke code are expressed as digits from the first column of the table, giving two three-digit numbers (for the "one" module class, which has only one segment, only that segment is converted). Finally, the characters within a module class are typeset in ascending order of the three-digit number of the first module; where those numbers are equal, they are typeset in ascending order of the three-digit number of the second module (as shown in figure 3). This forms the new dictionary of the modular stroke coding method.
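The dictionary ordering described above amounts to a three-part sort key. The sketch below assumes that stroke segments are already written as digit strings, that module classes appear in the order a-m in which they are listed, and that a missing or short segment sorts before a longer one; none of these details is fixed by the text, and the entries are made-up placeholders rather than real characters.

```python
# Assumed ordering of module classes in the dictionary (list order a-m).
MODULE_ORDER = [
    "one", "two", "three", "vertical two", "vertical three",
    "right two", "left two", "right three", "left three",
    "upper two", "lower two", "upper three", "lower three",
]


def sort_key(entry: dict) -> tuple[int, str, str]:
    """Key: module class, then first three digits of segments one and two."""
    segments = entry["segments"]
    first = segments[0][:3]
    second = segments[1][:3] if len(segments) > 1 else ""
    return (MODULE_ORDER.index(entry["module_type"]), first, second)


entries = [
    {"label": "w", "module_type": "two", "segments": ["251", "12"]},
    {"label": "x", "module_type": "one", "segments": ["12"]},
    {"label": "y", "module_type": "two", "segments": ["125", "341"]},
    {"label": "z", "module_type": "two", "segments": ["125", "121"]},
]
ordered = [e["label"] for e in sorted(entries, key=sort_key)]
print(ordered)
```

Comparing equal-length digit strings gives the same order as comparing the three-digit numbers themselves, which is why plain string comparison is enough here.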
3. The third application is the input method. Because an input method does not require unique codes, the present invention carries over the module types and the ten stroke classes used for the dictionary. The 13 module types ("one", "two", "three", "vertical two", "vertical three", "right two", "left two", "right three", "left three", "upper two", "lower two", "upper three", "lower three"), the ten stroke classes (horizontal, vertical, dot, left-falling and right-falling, plus five classes shown only as stroke images in the original: a left-falling variant, a vertical variant, a hook variant, a vertical-horizontal turn and a horizontal-turn variant) and the segment point (","), 24 key positions in total, are blended into the existing keyboard.
The module types "three", "two", "one", "vertical two" and "vertical three" map to the five keys QWERT; "upper two", "lower two", "right three" and "left three" map to the four keys ZXCV; "upper three" and "lower three" map to the Y and B keys; "right two" and "left two" map to the N and M keys. The stroke classes [stroke image] (left-falling variant), left-falling, dot, horizontal, [stroke image] (vertical-horizontal turn) and [stroke image] (horizontal-turn variant) correspond to the six keys ASDFGH; [stroke image] (vertical variant), vertical, [stroke image] (hook variant) and [stroke image] (right-falling) correspond to the four keys UIJL; the segment point "," corresponds to the K key. The O and P keys and the "<" (less-than) and ">" (greater-than) keys are kept in reserve, which is convenient for programmers.
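The key assignment can be checked with a short sketch. It assumes that each group of module types maps to its keys in the order listed (for example "three" to Q and "two" to W), which the text implies but does not state outright. The ten stroke classes are kept only as a string of keys, because five of them survive only as images.

```python
import string

# Module type -> key, assuming each group maps to its keys in listed order.
MODULE_KEYS = dict(zip(
    ["three", "two", "one", "vertical two", "vertical three"], "QWERT"))
MODULE_KEYS.update(zip(
    ["upper two", "lower two", "right three", "left three"], "ZXCV"))
MODULE_KEYS.update(zip(["upper three", "lower three"], "YB"))
MODULE_KEYS.update(zip(["right two", "left two"], "NM"))

STROKE_KEYS = "ASDFGH" + "UIJL"  # ten stroke classes
SEGMENT_KEY = "K"                # segment point ","
RESERVED_LETTERS = "OP"          # letter keys left unassigned


def used_keys() -> set[str]:
    """All letter keys taken by module types, strokes and the segment point."""
    return set(MODULE_KEYS.values()) | set(STROKE_KEYS) | {SEGMENT_KEY}


print(len(used_keys()))
print(sorted(set(string.ascii_uppercase) - used_keys()))
```

The count confirms the arithmetic in the text: 13 module keys, 10 stroke keys and 1 segment key occupy 24 of the 26 letter keys, leaving O and P free.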
The modular stroke coding input method follows the same steps as the dictionary lookup. First, type the module type key to fix the type of the character. Next, type the stroke keys for the first three strokes of the character (for the "one" type, which has a single module, strokes can be typed continuously to narrow the query range). Then type the segment point key, followed by the three strokes of the next module. If the desired character has still not appeared, continue typing strokes, or go on to the strokes of the third small module. The characters within the module type that meet the typed conditions are retrieved, and the desired character is output.
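The lookup steps above behave like prefix matching within one module type. The following is a minimal sketch under assumptions: the candidate table, its labels and its digit stroke codes are made up for illustration, the module type is a hypothetical single-character code, and each typed segment is treated as a prefix of the corresponding stored segment.

```python
def matches(typed: list[str], segments: list[str]) -> bool:
    """True if every typed segment is a prefix of the stored segment."""
    if len(typed) > len(segments):
        return False
    return all(seg.startswith(t) for t, seg in zip(typed, segments))


def lookup(table: dict, module_code: str, typed: list[str]) -> list[str]:
    """Return labels of candidates in the module type matching the input."""
    return [
        label
        for label, (module, segments) in table.items()
        if module == module_code and matches(typed, segments)
    ]


# Hypothetical candidate table: label -> (module type code, stroke segments).
table = {
    "glyph_a": ("d", ["1251", "34"]),
    "glyph_b": ("d", ["125", "52"]),
    "glyph_c": ("b", ["125", "34"]),
}

print(lookup(table, "d", ["125"]))       # module type, then three strokes
print(lookup(table, "d", ["125", "3"]))  # after the segment point, one more
```

Each additional stroke or segment narrows the candidate list, which mirrors the "continue typing strokes" step in the text.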
Description of the drawings:
FIG. 1 is a modular diagram of Chinese characters;
FIG. 2 is a side view of a modular Chinese character dictionary;
FIG. 3 is a layout diagram of Chinese character parsing indexes of a modular Chinese character dictionary;
Detailed Description
The present invention will be described in further detail with reference to specific examples.
A method for coding Chinese characters by modular strokes assigns the square glyph of a Chinese character to one of 13 module types according to how it divides, codes the content of each small module by strokes, arranges the small modules from left to right and from top to bottom following the writing order of Chinese characters, and adds a leading module type code and the segment points of the small-module segment codes to form the complete stroke code of the character. The modular stroke coding steps are as follows:
(1) The square glyphs of Chinese characters are classified into 13 standard module types according to how they divide. The module types are as follows:
a. The "one" type represents a simple glyph: an indivisible Chinese character. The characters it contains are generally the building blocks of the multi-module types.
Examples are the characters glossed "I", "B", "ten", "D", "factory", "seven", "man", "income" and "nine": their structure is simple and the glyph cannot be divided.
b. The "two" type represents a glyph composed of an upper module and a lower module; the glyph within each module is indivisible, and the character divides across the middle.
For example, "thunder" (雷) divides into an upper module and a lower module, whose strokes form "rain" (雨) and "field" (田). Other characters of this type include those glossed "stomach", "jiu" and "need".
c. The "three" type represents a glyph composed of an upper, a middle and a lower module, taken from top to bottom in writing order. Glyphs with more than three such blocks are also assigned to this type, and no further types are added for them.
For example, "bank" (岸) divides into upper, middle and lower modules, whose strokes form "mountain" (山), "factory" (厂) and "stem" (干). The character glossed "happiness" can be divided into as many as four parts from top to bottom, but the number of module types is limited and is not increased, so it still belongs to the "three" type.
d. The "vertical two" type represents a glyph composed of a left module and a right module, where the glyph within each small module cannot be divided.
For example, the character glossed "it" (的) divides into left and right modules, whose strokes form "white" (白) and "spoon" (勺). Other characters of this type include those glossed "proportion", "leaf" and "hook".
e. The "vertical three" type represents a glyph composed of a left, a middle and a right module, taken from left to right in writing order.
For example, "speckle" (斑) divides into left, middle and right modules, whose strokes form "king" (王), "wen" (文) and "king" (王). Other characters of the same type are classified in the same way.
f. The "right two" type represents a glyph whose main body is divided into a left part and a right part, where the right part can be further divided into an upper and a lower part, and into no more than two.
For example, the main body of "cat" is divided into a left part and a right part, the right part is divided into an upper and a lower part, and the strokes are written from left to right and then from top to bottom.
g. The "left two" type represents a glyph whose main body is divided into a left part and a right part, where the left part is first divided into an upper and a lower part, and into no more than two. It is the mirror image of the "right two" type.
For example, the main body of the character glossed "quick" is divided into a left part and a right part, the left part is divided into an upper and a lower part, and the strokes are written from top to bottom and then from left to right.
h. The "right three" type represents a glyph whose main body is divided into a left part and a right part, where the right part can be divided into upper, middle and lower parts, written from left to right and then from top to bottom. Glyphs whose right part divides into more than three parts are also assigned to this type.
In such a character, the main body is divided into a left part and a right part, the right part is divided into upper, middle and lower parts, and the strokes are written from left to right and then from top to bottom.
i. The "left three" type represents a glyph whose main body is divided into a left part and a right part, where the left part is first divided into upper, middle and lower parts, written from top to bottom and then from left to right. Glyphs whose left part divides into more than three parts are also assigned to this type. It is the mirror image of the "right three" type.
For example, in a character whose modules are "ten" (十), "day" (日), "ten" (十) and "month" (月), as in 朝, the main body is divided into a left part and a right part, the left part is divided into upper, middle and lower parts, and the strokes are written from top to bottom and then from left to right.
j. The "upper two" type represents a glyph whose main body is divided into an upper part and a lower part, where the upper part can be divided into a left and a right small part, written from left to right and then from top to bottom. The upper part is divided into no more than two.
For example, a character composed of two "people" (人) in the upper part and a "one" (一) in the lower part, as in 丛, is written from left to right and then from top to bottom.
k. The "lower two" type represents a glyph whose main body is divided into an upper part and a lower part, where the lower part can be divided into a left and a right small part, written from top to bottom and then from left to right. The lower part is divided into no more than two. It is the mirror image of the "upper two" type.
For example, a character composed of three "people" (人), as in 众, is written from top to bottom and then from left to right.
l. The "upper three" type represents a glyph whose main body is divided into an upper part and a lower part, where the upper part is first divided into left, middle and right small parts, written from left to right and then from top to bottom.
For example, the upper part of the character glossed "pen" can be divided into three modules and the lower part forms one block; it is written from left to right and then from top to bottom.
m. The "lower three" type represents a glyph whose main body is divided into an upper part and a lower part, where the lower part can be divided into left, middle and right parts, written from top to bottom and then from left to right. It is the mirror image of the "upper three" type.
For example, the character glossed "goose" is divided into an upper part and a lower part, the lower part can be divided into left, middle and right parts, and the strokes are written from top to bottom and then from left to right.
In total there are 13 module types. They can be combined with one another to form other layouts, so every glyph maps onto the 13 module types. The types are few in number, follow an obvious pattern and are easy to remember. They are shown as two-dimensional diagrams in fig. 1.
Because Chinese characters are the crystallization of five thousand years of Chinese culture, their glyphs are varied and did not evolve according to a fixed pattern. The modular stroke coding method therefore summarizes 6 rules for dividing a glyph into a module type:
1. Divisible: the glyph is first divided into small blocks wherever its strokes neither overlap nor interleave, and the result is then compared against the 13 module types to find the best match.
2. Divide first: for some glyphs, rule 1 allows the main body to be divided either into upper and lower parts or into left and right parts. In that case the main body is divided according to which division comes first in the stroke writing order.
3. When a glyph is complex and combines several module types, the direction with more modules is kept and the direction with fewer modules is compressed into a single module; the result is then compared against the module types.
4. When only a single stroke does not intersect the rest of the glyph, the glyph is not divided at that stroke; apart from lone horizontal and vertical strokes, the leading part should contain as many strokes as possible.
5. When the main body is divided into left and right parts or into upper and lower parts and the first part is a radical, the first part is not divided further. The original character from which the radical derives may still be divided if it meets the other division conditions.
6. All division into modules follows the standard writing order of the glyph. When a glyph appears to divide into left and right or upper and lower parts but its first and last strokes are completed in the same part, the glyph is not divided.
Examples of modular classification under the six rules above:
The left and right parts of the "can" character could each be divided vertically under the first rule, but under the second rule it is classified as a "left two" module type character.
Under the fourth rule, "good" is not divided into "left-to-strokes" and "blunt"; in "Jiang", the three small strokes in the right part, "one" and "Tian", are put together and the last one remains.
The "winning" character can be divided into upper and lower parts under the first rule, and its lower part into left, middle and right parts. It is a stack of the "two" and "vertical three" types, which makes it complex and hard to classify. Under the third rule, the direction with more modules is kept and the direction with fewer modules is compressed, so the character is classified as the "lower three" type.
Radicals are important components of Chinese characters. Under the fifth rule, a leading radical is treated as one module and is not divided into strokes; the bamboo head is therefore not divided, and the "bamboo" character is classified as the "vertical two" type.
The character "can" cannot be divided into "T" and "kou", because that does not follow the writing order of the character; under the sixth rule it is classified as the "one" type.
(2) After the modular division of the character form is complete, the second step is to encode each small module stroke by stroke, in stroke order, as the basis for each application. The stroke code table is as follows:
Figure DEST_PATH_GSB0000154202190000061
Figure DEST_PATH_GSB0000154202190000071
To mark the position of each stroke segment within the code, the invention adds a segment point code, which separates the stroke code segments of the two-dimensional modules; the symbol "," is used provisionally. A complete modular stroke code therefore has the form "module type code + stroke code segment (,) ...". The first single-byte code in a character's code carries the two-dimensional layout information, and the codes that follow carry the stroke information. This reduces, as far as possible, the duplicate codes that plain stroke coding produces for different characters with the same strokes, and it gives the code a degree of reverse readability.
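The composition described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the module type code is a single character and that each stroke is written as a single character. The letters used in the usage note are hypothetical stand-ins, because the actual stroke code table is available only as a figure.

```python
SEGMENT_POINT = ","  # segment point code, the provisional symbol named in the text


def compose_code(module_type_code: str, module_strokes: list[str]) -> str:
    """Build 'module type code + stroke segment , stroke segment ...'.

    module_strokes holds one stroke string per small module, in writing order.
    """
    if len(module_type_code) != 1:
        raise ValueError("module type code is assumed to be a single character")
    if not module_strokes or any(not segment for segment in module_strokes):
        raise ValueError("every small module needs at least one stroke")
    return module_type_code + SEGMENT_POINT.join(module_strokes)


def split_code(code: str) -> tuple[str, list[str]]:
    """Recover the module type code and the per-module stroke segments."""
    return code[0], code[1:].split(SEGMENT_POINT)
```

With the stand-in letters "N" for a module type and "AAS" and "SLF" for two stroke segments, `compose_code("N", ["AAS", "SLF"])` yields a string whose first character identifies the layout and whose remaining characters can be split back into segments, which is the reverse readability the text refers to.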
Selecting different stroke coding combinations according to different application conditions:
1. The first application is computer encoding, which requires that there be no duplicate codes and that Chinese characters and codes correspond one to one. Modular stroke coding already reduces duplicate codes as far as possible, but modularization only turns large blocks into small blocks, and the small blocks are themselves two-dimensional graphics. To address this, the invention adds a "deformation code" (provisionally represented by the symbol "'"): among different characters with the same strokes, a deformation code is appended after the code of one of them to tell them apart, which eliminates duplicate codes in modular stroke coding completely. Because a modular stroke coded character is composed of several single-byte codes, the garbled-text problem of traditional double-byte character encodings is also avoided.
The greatest advantage of modular stroke coding is that it is extensible. When a character has not yet been included in the computer's code set, its code can still be written out in full according to the modular coding method, and data can be stored in that form. Once the character is later added to the computer system under the same modular coding method, the code segments already stored display correctly from that moment on.
For example, the "I", the "T", the "Shi" and the "T" all belong to the "one" module type and share the stroke sequence horizontal, vertical, horizontal. They are already minimal units of modular strokes, so duplicate codes arise, and the characters can only be distinguished by adding deformation codes.
The I code is: "one" + "one";
the "soil" code is: "one" - "type code +" -;
"Shi" codes are: "one" + "one" - ";
The coding difference between the "stomach" and the "armature":
The "stomach" is encoded as: "two" type code + "| [Figure DEST_PATH_GSB0000154202190000072] horizontal, vertical [Figure DEST_PATH_GSB0000154202190000074] one to one";
The "armature" is encoded as: "two" type code + "| [Figure DEST_PATH_GSB0000154202190000073] horizontal, vertical [Figure DEST_PATH_GSB0000154202190000075] one to one";
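A minimal sketch of how deformation codes could make colliding codes unique. Several points are assumptions rather than statements from the patent: the mark "'" is the provisional symbol as it appears in this text, the rule that the n-th colliding character receives n marks is one possible reading, and the base code "EFIF" is a hypothetical stand-in for a "one" type code followed by horizontal, vertical, horizontal. The characters 工, 土 and 士 are chosen here only because they share that stroke order.

```python
from collections import defaultdict

DEFORMATION_MARK = "'"  # provisional symbol; treat as an assumption


def assign_unique_codes(base_codes: list[tuple[str, str]]) -> dict[str, str]:
    """Append deformation marks so that no two characters share a code.

    base_codes is an ordered list of (character, base code) pairs. The first
    character with a given base code keeps it unchanged; each later character
    with the same base code gets one more mark than the previous one.
    """
    seen: defaultdict[str, int] = defaultdict(int)
    unique: dict[str, str] = {}
    for character, code in base_codes:
        unique[character] = code + DEFORMATION_MARK * seen[code]
        seen[code] += 1
    return unique
```

For instance, passing `[("工", "EFIF"), ("土", "EFIF"), ("士", "EFIF")]` gives three distinct codes that differ only in the number of trailing marks, so a one-to-one mapping between characters and codes is preserved.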
2. The second application is a Chinese dictionary, in which each character is coded with the basic modular stroke code. Characters are first classified by module type, and characters of the same module type are typeset together (as shown in Fig. 2). Then, using the correspondence between the first and third columns of the stroke code table (the ten Arabic numerals 0-9 represent the ten stroke classes), the first three codes of the first segment and of the second segment of each modular stroke code are converted into the digits of the first column, giving two three-digit numbers (for the "one" module type only one segment is converted). Within a module type, characters are typeset in ascending order of the first module's three-digit number; when that number is the same, they are ordered by the second module's three-digit number (as shown in Fig. 3); characters whose numbers are all the same are ordered with fewer strokes first. This forms a new dictionary based on the modular stroke coding method.
To look up "give", the process is as follows. First analyze the character to find its module type: read from left to right and top to bottom by strokes, it belongs to the "right two" type. Then number the first three strokes of the first module and of the second module; by the first and third columns of the table this gives "772" and "230". Open the dictionary to the "right two" module type section, then search the coding area by the first module's "772", paging forward through the numbers in ascending order until that number or an adjacent one is found. If the area is large and many characters share the number, repeat the step with the second module's number "230" to find the character's area; searching from the characters with the fewest strokes then quickly yields the required character. Compared with lookup in existing dictionaries, this method is more concise and intuitive. In this example "give" is a polyphone; because this method orders characters by strokes, the pronunciations of a polyphone can be gathered under a single entry.
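The ordering rule behind this lookup can be sketched as a sort key. This is a minimal sketch under stated assumptions: the order of module type sections follows the order in which the 13 types are listed in this document, a module with fewer than three strokes simply yields a shorter number, and the `Entry` type and its field names are hypothetical. Only the numbers "772" and "230" for "give" come from the text; its stroke count and any other entries are illustrative.

```python
from dataclasses import dataclass

# Assumed section order: the order in which the 13 module types are listed.
MODULE_TYPES = [
    "one", "two", "three", "vertical two", "vertical three",
    "right two", "left two", "right three", "left three",
    "upper two", "lower two", "upper three", "lower three",
]


@dataclass(frozen=True)
class Entry:
    label: str                     # the character, or a gloss of it
    module_type: str               # one of MODULE_TYPES
    module_digits: tuple[str, ...]  # stroke digits (0-9) per module, in order
    stroke_count: int


def three_digit(digits: str) -> int:
    """Number formed by the first three stroke digits of a module."""
    return int(digits[:3])


def sort_key(entry: Entry) -> tuple[int, int, int, int]:
    """Module type, first module number, second module number, stroke count."""
    first = three_digit(entry.module_digits[0])
    # The "one" type has a single module, so it has no second number.
    second = three_digit(entry.module_digits[1]) if len(entry.module_digits) > 1 else -1
    return (MODULE_TYPES.index(entry.module_type), first, second, entry.stroke_count)
```

Sorting a list of entries with `sorted(entries, key=sort_key)` places "give" (right two, 772, 230) after every "right two" entry with a smaller first number and after entries that share 772 but have a smaller second number, which mirrors the paging procedure described above.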
3. The third application is an input method. Because an input method does not require unique codes, the invention generalizes the ten stroke classes and the dictionary arrangement: the 13 module types "one, two, three, vertical two, vertical three, right two, left two, right three, left three, upper two, lower two, upper three, lower three", the ten stroke classes "一 (horizontal), 丨 (vertical), 丿 (left-falling), [Figure DEST_PATH_GSB00001542021900000813] (right-falling), 丶 (dot), [Figure DEST_PATH_GSB0000154202190000081] (hook variant), [Figure DEST_PATH_GSB0000154202190000082] (vertical variant), [Figure DEST_PATH_GSB0000154202190000083] (left-falling variant), [Figure DEST_PATH_GSB0000154202190000084] (horizontal-vertical variant), [Figure DEST_PATH_GSB0000154202190000085] (horizontal-fold variant)" and the segment point ",", 24 key positions in total, are merged into the existing keyboard.
The module types "three, two, one, vertical two, vertical three" map to the five keys "QWERT"; "upper two, lower two, right three, left three" map to the four keys "ZXCV"; "upper three, lower three" map to the keys "YB"; and "right two, left two" map to the keys "NM". The strokes "[Figure DEST_PATH_GSB0000154202190000086] (left-falling variant), 丿 (left-falling), 丶 (dot), 一 (horizontal), [Figure DEST_PATH_GSB0000154202190000087] (horizontal-vertical variant), [Figure DEST_PATH_GSB0000154202190000088] (horizontal-fold variant)" map to the six keys "ASDFGH"; the strokes "[Figure DEST_PATH_GSB0000154202190000089] (vertical variant), 丨 (vertical), [Figure DEST_PATH_GSB00001542021900000810] (hook variant), [Figure DEST_PATH_GSB00001542021900000814] (right-falling)" map to the four keys "UIJL"; the segment point "," maps to the "K" key; and the "OP" keys are reserved for the less-than and greater-than signs "<>". All 26 English letter keys are thus used, which is also convenient for programmers.
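The key assignment described above can be written down as a lookup table. This is a sketch of the mapping as read from the text, not an official layout: the English stroke names are approximate renderings of the glyphs, which appear as figures in the original, and the names `MODULE_TYPE_KEYS`, `STROKE_KEYS` and `describe_key` are hypothetical.

```python
import string

# 13 module type keys, in the order the text pairs them with keys.
MODULE_TYPE_KEYS = {
    "Q": "three", "W": "two", "E": "one", "R": "vertical two", "T": "vertical three",
    "Z": "upper two", "X": "lower two", "C": "right three", "V": "left three",
    "Y": "upper three", "B": "lower three",
    "N": "right two", "M": "left two",
}

# 10 stroke class keys; variant names are approximate English renderings.
STROKE_KEYS = {
    "A": "left-falling variant", "S": "left-falling", "D": "dot", "F": "horizontal",
    "G": "horizontal-vertical variant", "H": "horizontal-fold variant",
    "U": "vertical variant", "I": "vertical", "J": "hook variant", "L": "right-falling",
}

SEGMENT_POINT_KEY = "K"
RESERVED_KEYS = {"O": "<", "P": ">"}


def describe_key(key: str) -> str:
    """Return the role of a letter key in this layout."""
    key = key.upper()
    if key in MODULE_TYPE_KEYS:
        return f"module type: {MODULE_TYPE_KEYS[key]}"
    if key in STROKE_KEYS:
        return f"stroke: {STROKE_KEYS[key]}"
    if key == SEGMENT_POINT_KEY:
        return "segment point"
    if key in RESERVED_KEYS:
        return f"reserved: {RESERVED_KEYS[key]}"
    raise KeyError(key)


def covered_letters() -> set[str]:
    """All letter keys that have a role in the layout."""
    return set(MODULE_TYPE_KEYS) | set(STROKE_KEYS) | {SEGMENT_POINT_KEY} | set(RESERVED_KEYS)
```

Checking `covered_letters()` against `string.ascii_uppercase` confirms the arithmetic in the text: 13 module type keys, 10 stroke keys and 1 segment point key give 24 positions, and the two reserved keys bring the total to all 26 letters.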
The modular stroke coding input method follows the same steps as the dictionary lookup. First type the module type key to fix the character's type, then type the stroke keys for the first three strokes of the first module (for the "one" type, which has a single module, strokes can simply be entered continuously to narrow the query range), then type the segment point key, and then enter three strokes of the next module. If the required character has not yet appeared, continue typing strokes, or go on to the strokes of the third small module. The characters in that module type that satisfy the conditions are queried, and the character is output.
Compared with dictionary lookup, the input method is more convenient and faster. Taking the same character as in the dictionary lookup as an example, one only needs to type "right two", "[Figure DEST_PATH_GSB00001542021900000811] (left-falling variant)", "[Figure DEST_PATH_GSB00001542021900000812] (left-falling variant)", "丿 (left-falling)", ", (segment point)", "丿 (left-falling)", "[Figure DEST_PATH_GSB00001542021900000815] (right-falling)" and "一 (horizontal)", eight keys in all, which correspond to the key sequence "NAASKSLF". The computer queries continuously and narrows the range while the keys are typed, until the candidate bar displays the character and it is output. If the character has still not appeared, the strokes of a third module can be typed until the character is found.
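The narrowing behaviour during typing can be sketched as prefix matching per module. This is an illustrative sketch, not the patent's query algorithm: the lexicon format, the function names and the two extra entries are hypothetical, and only the first three stroke keys of each module are stored. The key sequence "NAASKSLF" for "give" is the one given in the text.

```python
SEGMENT_POINT_KEY = "K"


def parse_keys(keys: str) -> tuple[str, list[str]]:
    """Split typed keys into the module type key and per-module stroke keys."""
    if not keys:
        raise ValueError("at least the module type key is required")
    return keys[0], keys[1:].split(SEGMENT_POINT_KEY)


def candidates(keys: str, lexicon: dict[str, tuple[str, list[str]]]) -> list[str]:
    """Return lexicon entries consistent with the keys typed so far.

    An entry matches when its module type key is equal and each typed segment
    is a prefix of the stored stroke keys of the corresponding module.
    """
    module_key, typed = parse_keys(keys)
    matches = []
    for label, (entry_module, entry_segments) in lexicon.items():
        if entry_module != module_key or len(typed) > len(entry_segments):
            continue
        if all(stored.startswith(part) for stored, part in zip(entry_segments, typed)):
            matches.append(label)
    return matches


# Illustrative lexicon: "give" uses the keys from the text; the rest is made up.
LEXICON = {
    "give": ("N", ["AAS", "SLF"]),
    "hypothetical-1": ("N", ["AAS", "SLD"]),
    "hypothetical-2": ("M", ["AAS", "SLF"]),
}
```

Typing "NAAS" leaves two candidates in this toy lexicon, and completing the sequence to "NAASKSLF" leaves only "give", which illustrates how each additional key narrows the range without paging through candidates.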
Compared with existing pinyin input methods, the modular stroke input method has greater typing depth: there is no need to page through candidates, because one keeps typing until the character appears, and unlike pinyin input it is not helpless when it meets a character whose pronunciation is unknown. Compared with the five-stroke input method, it is easier to memorize, learn and understand. Compared with stroke input methods, it supports more Chinese characters and allows characters to be spelled out continuously. Its most distinctive feature is that it is backed throughout by the modular stroke coding system for Chinese characters.

Claims (1)

1. A method for correspondingly recognizing modular stroke coded Chinese characters, based on stroke coding, in which the square form of a Chinese character is divided into 13 module types according to different division modes; the strokes within each small module of each module type are then coded; the small modules are arranged from left to right and from top to bottom according to the writing sequence of the character; and the leading module type code and the segment points of the small module segment codes are added to form the complete stroke code of the character; the modular stroke coding of Chinese characters comprises the following steps:
(1) dividing Chinese character types into: the model is a 'one', a 'two', a 'three', a 'vertical two', a 'vertical three', a 'right two', a 'left two', a 'right three', a 'left three', an 'upper two', a 'lower two', an 'upper three' and a 'lower three' 13 module types;
(2) the Chinese character codes are applied, according to different code sets, to stroke input method coding to form Chinese character codes for query and output; the stroke input method merges the 13 module types "one, two, three, vertical two, vertical three, right two, left two, right three, left three, upper two, lower two, upper three, lower three", the ten stroke classes "一 (horizontal), 丨 (vertical), 丿 (left-falling), right-falling [Figure FSB0000199591780000011], 丶 (dot), hook variant [Figure FSB0000199591780000012], vertical variant [Figure FSB0000199591780000013], left-falling variant [Figure FSB0000199591780000014], horizontal-vertical variant [Figure FSB0000199591780000015], horizontal-fold variant [Figure FSB0000199591780000016]" and the segment point ",", 24 key positions in total, into the existing keyboard:
wherein "three, two, one, vertical two, vertical three" map to the five keys "QWERT"; "upper two, lower two, right three, left three" map to the four keys "ZXCV"; "upper three, lower three" map to the keys "YB"; "right two, left two" map to the keys "NM"; "left-falling variant [Figure FSB0000199591780000017], 丿 (left-falling), 丶 (dot), 一 (horizontal), horizontal-vertical variant [Figure FSB0000199591780000018], horizontal-fold variant [Figure FSB0000199591780000019]" map to the six keys "ASDFGH"; "vertical variant [Figure FSB00001995917800000110], 丨 (vertical), hook variant [Figure FSB00001995917800000111], right-falling [Figure FSB00001995917800000112]" map to the four keys "UIJL"; the segment point "," maps to the "K" key; and the "OP" keys are reserved for the less-than and greater-than signs "<>", so that all 26 English letter keys are used;
the typing mode is: type the module type key, type the stroke keys, press the segment point key, type the next module, press the segment point key, and continue until the number of modules of that module type has been typed, whereupon the Chinese character is queried and output.
CN201610216758.8A 2016-01-11 2016-04-06 Method for correspondingly recognizing modular stroke coding Chinese characters Active CN105912139B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201610030705.7A CN105807947A (en) 2016-01-11 2016-01-11 Method for correspondingly identifying modular stroke coded Chinese characters
CN2016100307057 2016-01-11
CN201610030705.7 2016-01-11

Publications (2)

Publication Number Publication Date
CN105912139A CN105912139A (en) 2016-08-31
CN105912139B true CN105912139B (en) 2022-08-30

Family

ID=56465685

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201610030705.7A Pending CN105807947A (en) 2016-01-11 2016-01-11 Method for correspondingly identifying modular stroke coded Chinese characters
CN201610216758.8A Active CN105912139B (en) 2016-01-11 2016-04-06 Method for correspondingly recognizing modular stroke coding Chinese characters

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201610030705.7A Pending CN105807947A (en) 2016-01-11 2016-01-11 Method for correspondingly identifying modular stroke coded Chinese characters

Country Status (1)

Country Link
CN (2) CN105807947A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776499B9 (en) * 2016-12-09 2021-02-12 哈尔滨工业大学 Digital Chinese character spelling realization method and device
CN107292936B (en) * 2017-05-18 2020-08-11 湖南大学 Chinese character font vectorization method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1700157A (en) * 2005-07-19 2005-11-23 庚以津 Modular Chinese character coding method
CN1841365A (en) * 2005-02-02 2006-10-04 李梧杰 Segment code input method for Chinese character
CN101315579A (en) * 2007-05-28 2008-12-03 白春荣 Chinese character input method and its key board
CN102799282A (en) * 2012-07-13 2012-11-28 潘刚禹 Stroke etymon holographic code Chinese character input method
CN104731362A (en) * 2015-03-22 2015-06-24 邵德子 Chinese character code fast entry

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1091529C (en) * 1993-01-12 2002-09-25 陈劲松 Full-shape code for characters
CN1104673C (en) * 1996-12-23 2003-04-02 林兵 Popularized Lin code inputting method for Chinese characters
CN100380291C (en) * 2004-11-09 2008-04-09 刘金远 Method for searching words through ten initial and final strokes and digitalized input method through ten initial and final strokes
CN102830809B (en) * 2011-06-15 2016-05-11 高静敏 Encode method for entering Chinese characters
CN102339139A (en) * 2011-10-28 2012-02-01 王治阳 Three-class five-field input method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1841365A (en) * 2005-02-02 2006-10-04 李梧杰 Segment code input method for Chinese character
CN1700157A (en) * 2005-07-19 2005-11-23 庚以津 Modular Chinese character coding method
CN101315579A (en) * 2007-05-28 2008-12-03 白春荣 Chinese character input method and its key board
CN102799282A (en) * 2012-07-13 2012-11-28 潘刚禹 Stroke etymon holographic code Chinese character input method
CN104731362A (en) * 2015-03-22 2015-06-24 邵德子 Chinese character code fast entry

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Six-digit stroke-based Chinese input method;Lai-Man Po 等;《2009 IEEE International Conference on Systems, Man and Cybernetics》;20091214;104-116 *
汉字的数字编码输入研究;金如集;《中国中文信息学会汉字编码专业委员会第九届年会暨学术研讨会论文集》;20111001;818-823 *

Also Published As

Publication number Publication date
CN105807947A (en) 2016-07-27
CN105912139A (en) 2016-08-31

Similar Documents

Publication Publication Date Title
US5119296A (en) Method and apparatus for inputting radical-encoded chinese characters
US5197810A (en) Method and system for inputting simplified form and/or original complex form of Chinese character
KR860001068B1 (en) An ideogram generator
WO2003104963A1 (en) Input method for optimizing digitize operation code for the world characters information and information processing system thereof
CN100462901C (en) GB phoneticize input method
CN109271610A (en) A kind of vector expression of Chinese character
CN105912139B (en) Method for correspondingly recognizing modular stroke coding Chinese characters
CN102830809A (en) Chinese character coding input method
CN103616960A (en) Six vowel binary syllabification input method
CN101952790B (en) Method for inputting chinese characters apapting for chinese teaching
CN100520685C (en) Chinese characters pinyin identification code input method
CN101231558A (en) Oracle spelling and component resolution input method
Hung et al. Boxing code for stroke-order free handprinted Chinese character recognition
CN103324299A (en) Chinese character pictographic code computer input method based on Chinese character basic components
CN110879668A (en) Chinese character input method by expanding strokes in large character library
CN100428118C (en) Inputting method of Chinese code series
CN1196057C (en) One-code two-form quick Chinese digital coding input method
CN105278697B (en) Combined double-spelling class major-minor code Chinese character, word coded input method and its keyboard
CN1027839C (en) Chinese character encoding input method
CN113253853B (en) Chinese character input method for computer and mobile phone
CN1079060A (en) Sound-figure word code input system for Chinese character
CN108845680A (en) A kind of two word Chinese Computers looking into word typewriting one and same coding look into word typewriting method
CN102637077A (en) Phonological, calligraphic and tone hybrid coding method for inputting Chinese characters to computer
CN1340754A (en) Chinese-character digital input method and its keyboard
CN117917621A (en) Chinese character input method and system and keyboard

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant