A method and device for digital Chinese character assembly
Technical field
The present invention relates to a method and device for digital Chinese character assembly, and belongs to the technical field of digital Chinese character assembly.
Background technology
Most existing methods for representing the structure of Chinese characters are aimed at stroke-based input. Such representations do not carry complete information about the spatial structure of each component, so they cannot reproduce every possible way of splitting a character. For example, many of these methods can only represent a compound character such as 思 as 田 on top and 心 below, and cannot further represent a single-component character such as 田. Others can represent 田 as three horizontal and three vertical strokes but cannot express the spatial relationships between those strokes, so they cannot distinguish 田, 由, and 甲, which all consist of three horizontal and three vertical strokes.
Another class of representations is aimed mainly at building font libraries. In these, the spatial position of every component is fixed. If a displayed character has the correct relative arrangement of components but the specific positions of its strokes differ, this representation cannot recognize it as the same character. As shown in Fig. 1, a person reads both 思 characters as the same character, yet the one on the right clearly differs from the one on the left, which was generated from the font library.
In literacy teaching, users are often asked to assemble a character from strokes or radicals themselves, after which the character's pronunciation, meaning, usage, and so on are explained in detail. Current character-assembly games on digital devices usually work by limiting either the number of characters they can handle or the spatial positions in which the user may place components. They therefore cannot assemble an arbitrary character from arbitrary components. For example, some games can only assemble 品 from three 口 components and cannot assemble 品 directly from six horizontal and six vertical strokes, or they require the strokes to be assembled into three 口 first and then into 品. Other games recognize the result as 品 only if the six horizontal and six vertical strokes are placed precisely within a designated area.
Another possible approach is to recognize the assembled character by pattern recognition. Although the recognition rate of this approach is high, misrecognition, or even failure to recognize, is unavoidable.
In short, current character-assembly games on digital devices can hardly let the user assemble an arbitrary character from arbitrary components and then decide which character it is solely from the relative spatial positions of the assembled components.
Summary of the invention
To solve the above technical problems in the prior art, the present invention proposes a method and device for digital Chinese character assembly.
A device for digital Chinese character assembly, characterized in that the device comprises:
a representation module that represents a Chinese character by the relative spatial positions of its minimal components;
a recognition module that recognizes the character assembled from the minimal components according to their relative spatial positions;
a display module that displays information related to the character.
Further, the device also comprises:
an index building module that builds an index table of the basic strokes that make up Chinese characters, the radicals that are difficult to build from basic strokes, and the single-component characters;
a character splitting module that splits a Chinese character into minimal components, each of which is a basic stroke, a radical that is difficult to build from basic strokes, or a single-component character;
a rectangle determination module that determines the minimum rectangle of each minimal component;
a spatial relationship determination module that determines the relative spatial relationships among the four edges of the minimum rectangles.
Further, the device also comprises:
an in-character numbering module that numbers all minimal components of a Chinese character;
a minimal component table building module that builds a minimal component table from the index table and the in-character numbers;
a position relationship table building module that builds a position relationship table from the relative spatial relationships of all minimum rectangles of the character;
a data storage module that stores the index table, the minimal component table, and the position relationship table.
Further, the index table contains index numbers and the basic stroke, radical that is difficult to build from basic strokes, or single-component character that each index number stands for; the minimal component table records the correspondence between the in-character number of each basic stroke, radical, or single-component character and its index number in the index table.
Further, the in-character numbering module numbers the minimal components of a Chinese character in order from left to right and from top to bottom.
A method for digital Chinese character assembly, characterized in that the method comprises:
a representation step of representing a Chinese character by the relative spatial positions of its minimal components;
a recognition step of recognizing the character assembled from the minimal components according to their relative spatial positions;
a display step of displaying information related to the character.
Further, the method also comprises:
an index building step of building an index table of the basic strokes that make up Chinese characters, the radicals that are difficult to build from basic strokes, and the single-component characters;
a character splitting step of splitting a Chinese character into minimal components, each of which is a basic stroke, a radical that is difficult to build from basic strokes, or a single-component character;
a rectangle determination step of determining the minimum rectangle of each minimal component;
a spatial relationship determination step of determining the relative spatial relationships among the four edges of the minimum rectangles.
Further, the method also comprises:
an in-character numbering step of numbering all minimal components of a Chinese character;
a minimal component table building step of building a minimal component table from the index table and the in-character numbers;
a position relationship table building step of building a position relationship table from the relative spatial relationships of all minimum rectangles of the character;
a data storage step of storing the index table, the minimal component table, and the position relationship table.
Further, the index table contains index numbers and the basic stroke, radical that is difficult to build from basic strokes, or single-component character that each index number stands for; the minimal component table records the correspondence between the in-character number of each basic stroke, radical, or single-component character and its index number in the index table.
Further, the in-character numbering step numbers the minimal components of a Chinese character in order from left to right and from top to bottom.
Beneficial effects of the present invention:
The method and device for digital Chinese character assembly provided by the present invention can accurately recognize an assembled character and allow any character to be assembled from any components, with no restriction on the region of space used. As long as the representation of a character contains no errors, the assembled character cannot be misrecognized, and accuracy reaches 100%. This solves the problem that, in current character-assembly games on digital devices, an arbitrary character assembled from arbitrary components cannot be identified solely from the relative spatial positions of its components.
Brief description of the drawings
Fig. 1 is a schematic comparison of a character generated from a font library with fixed stroke positions and a character assembled by relative spatial position in a character-assembly game;
Fig. 2 is a structural diagram of the device of the present invention;
Fig. 3 is a schematic diagram of the index table;
Fig. 4 is a schematic diagram of the splitting of the character 八 and the minimum rectangle of each minimal component;
Fig. 5 is a schematic diagram of the in-character numbering of 八 and the table mapping in-character numbers to unified numbers;
Fig. 6 is a schematic diagram of the minimal components as initially displayed and after being moved;
Fig. 7 is a schematic diagram of the display of information related to a character.
Detailed description of the embodiments
The present invention is further described below with reference to specific embodiments, but the invention is not limited by these embodiments.
Embodiment 1
A device for digital Chinese character assembly, characterized in that the device comprises:
a representation module that represents a Chinese character by the relative spatial positions of its minimal components;
a recognition module that recognizes the character assembled from the minimal components according to their relative spatial positions;
a display module that displays information related to the character.
Further, the device also comprises:
an index building module that builds an index table of the basic strokes that make up Chinese characters (e.g., horizontal, vertical, left-falling, right-falling, dot), the radicals that are difficult to build from basic strokes (e.g., 乙, Fu, Yin), and the single-component characters (e.g., the 〇 in 一九〇〇);
a character splitting module that splits a Chinese character into minimal components, each of which is a basic stroke, a radical that is difficult to build from basic strokes, or a single-component character;
a rectangle determination module that determines the minimum rectangle of each minimal component;
a spatial relationship determination module that determines the relative spatial relationships among the four edges of the minimum rectangles.
Further, the device also comprises:
an in-character numbering module that numbers all minimal components of a Chinese character;
a minimal component table building module that builds a minimal component table from the index table and the in-character numbers;
a position relationship table building module that builds a position relationship table from the relative spatial relationships of all minimum rectangles of the character;
a data storage module that stores the index table, the minimal component table, and the position relationship table.
Here, the index table contains index numbers and the basic stroke, radical that is difficult to build from basic strokes, or single-component character that each index number stands for; the minimal component table records the correspondence between the in-character number of each basic stroke, radical, or single-component character and its index number in the index table. The in-character numbering module numbers the minimal components of a Chinese character in order from left to right and from top to bottom.
A method for digital Chinese character assembly, characterized in that the method comprises:
a representation step of representing a Chinese character by the relative spatial positions of its minimal components;
a recognition step of recognizing the character assembled from the minimal components according to their relative spatial positions;
a display step of displaying information related to the character.
Further, the method also comprises:
an index building step of building an index table of the basic strokes that make up Chinese characters, the radicals that are difficult to build from basic strokes, and the single-component characters;
a character splitting step of splitting a Chinese character into minimal components, each of which is a basic stroke, a radical that is difficult to build from basic strokes, or a single-component character;
a rectangle determination step of determining the minimum rectangle of each minimal component;
a spatial relationship determination step of determining the relative spatial relationships among the four edges of the minimum rectangles.
Further, the method also comprises:
an in-character numbering step of numbering all minimal components of a Chinese character;
a minimal component table building step of building a minimal component table from the index table and the in-character numbers;
a position relationship table building step of building a position relationship table from the relative spatial relationships of all minimum rectangles of the character;
a data storage step of storing the index table, the minimal component table, and the position relationship table.
Here, the index table contains index numbers and the basic stroke, radical that is difficult to build from basic strokes, or single-component character that each index number stands for; the minimal component table records the correspondence between the in-character number of each basic stroke, radical, or single-component character and its index number in the index table. The in-character numbering step numbers the minimal components of a Chinese character in order from left to right and from top to bottom.
The digital Chinese character assembly method proposed by the present invention represents a Chinese character by the relative spatial positions of its minimal components, then recognizes the character assembled from the minimal components according to their relative spatial positions, and finally displays information related to the character.
The method works as follows. First, an index table is built for the basic strokes that make up Chinese characters (e.g., horizontal, vertical, left-falling, right-falling, dot), the radicals that are difficult to build from strokes (e.g., 乙, Fu, Yin), and the single-component characters (e.g., the 〇 in 一九〇〇). Each number in the index table stands for the stroke, radical, or single-component character listed after it (Fig. 3) and is hereinafter called the unified number. Then, each Chinese character is split into the strokes that make it up, or into radicals and single-component characters that are difficult to express with strokes, until no further splitting is possible. The strokes, radicals, and single-component characters that cannot be split further are called minimal components. For each minimal component, the minimum rectangle that encloses it is determined. The minimum rectangle is defined as the rectangle that just encloses the minimal component (that is, no part of the component lies outside it) and whose horizontal and vertical edges are both as short as possible. Next, the minimal components split from a character are numbered in a fixed order (hereinafter the in-character number), and the correspondence is established between each in-character number and the unified number, in the index table built in the first step, of the stroke, radical, or single-component character that the component represents. This forms the minimal component table.
After that, for any two minimal components split from the character, the relative spatial relationships among the four edges of their minimum rectangles are determined. From these, a table of all relative spatial relationships among all minimal components of the character is built, namely the position relationship table.
Finally, the index table and each character's minimal component table and position relationship table are stored in the processing device (for example, in its random access memory). Each character is represented by a character code (e.g., GB18030 or UTF-8), and the related information for the character, such as pronunciation, meaning, word formation, and example sentences, is stored with it.
Embodiment 2:
A device for digital Chinese character assembly, characterized in that the device comprises:
a representation module that represents a Chinese character by the relative spatial positions of its minimal components;
a recognition module that recognizes the character assembled from the minimal components according to their relative spatial positions;
a display module that displays information related to the character.
The device also comprises:
an index building module that builds an index table of the basic strokes that make up Chinese characters, the radicals that are difficult to build from basic strokes, and the single-component characters;
a character splitting module that splits a Chinese character into minimal components, each of which is a basic stroke, a radical that is difficult to build from basic strokes, or a single-component character;
a rectangle determination module that determines the minimum rectangle of each minimal component;
a spatial relationship determination module that determines the relative spatial relationships among the four edges of the minimum rectangles;
an in-character numbering module that numbers all minimal components of a Chinese character;
a minimal component table building module that builds a minimal component table from the index table and the in-character numbers;
a position relationship table building module that builds a position relationship table from the relative spatial relationships of all minimum rectangles of the character;
a data storage module that stores the index table, the minimal component table, and the position relationship table.
Here, the index table contains index numbers and the basic stroke, radical that is difficult to build from basic strokes, or single-component character that each index number stands for; the minimal component table records the correspondence between the in-character number of each basic stroke, radical, or single-component character and its index number in the index table. The in-character numbering module numbers the minimal components of a Chinese character in order from left to right and from top to bottom.
A method for digital Chinese character assembly, characterized in that the method comprises:
First, the representation module represents a Chinese character by the relative spatial positions of its components, down to the level of individual strokes;
Second, when the components of a character can form a character in terms of their relative spatial positions, the recognition module recognizes the assembled character;
Third, the display module displays information related to the character, such as pronunciation, meaning, word formation, and example sentences.
The method also comprises:
Step 1: The index building module builds an index table for the basic strokes that make up Chinese characters (e.g., horizontal, vertical, left-falling, right-falling, dot), the radicals that are difficult to build from strokes (e.g., 乙, Fu, Yin), and the single-component characters (e.g., the 〇 in 一九〇〇). Each number in the index table is hereinafter called the unified number.
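As an illustration, the index table of Step 1 could be held as a simple mapping from unified number to component. This is a minimal sketch: the numbers assigned below and the helper name `unified_number` are assumptions made for the example, not the contents of Fig. 3.

```python
# Hypothetical index table: unified number -> component.
# The numbering is illustrative only; Fig. 3 is not reproduced here.
INDEX_TABLE = {
    1: "横",  # horizontal stroke
    2: "竖",  # vertical stroke
    3: "撇",  # left-falling stroke
    4: "捺",  # right-falling stroke
    5: "点",  # dot
    6: "乙",  # radical that is difficult to build from basic strokes
    7: "〇",  # single-component character
}

# Reverse mapping, built once, for finding the unified number of a component.
UNIFIED_NUMBER = {component: number for number, component in INDEX_TABLE.items()}


def unified_number(component: str) -> int:
    """Return the unified number of a stroke, radical, or single-component character."""
    return UNIFIED_NUMBER[component]
```

Keeping both directions of the mapping makes it cheap to go from a displayed component back to its unified number and vice versa.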
Step 2: For each Chinese character, the character splitting module splits the character into the strokes that make it up, or into radicals and single-component characters that are difficult to express with strokes, until no further splitting is possible. The strokes, radicals, and single-component characters that cannot be split further are called minimal components. The rectangle determination module determines, for each minimal component, the minimum rectangle that encloses it. The minimum rectangle is defined as the rectangle that just encloses the minimal component (that is, no part of the component lies outside it) and whose horizontal and vertical edges are both as short as possible. Fig. 4 shows the splitting of the character 八 and the minimum rectangle of each minimal component, with the edges of the minimum rectangles drawn as dashed lines. Because this is an abstract representation of the character, strokes have no width. This means that the upper and lower edges of the minimum rectangle of a horizontal stroke coincide, so the rectangle is actually a horizontal line. For uniform terminology, this special case is still called a minimum rectangle (regarded as a rectangle whose left and right edges have zero length and whose upper and lower edges coincide).
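The minimum rectangle can be sketched as an axis-aligned bounding box. The example assumes, purely for illustration, that a component's outline is available as sampled (x, y) points with y growing downward; the names `Rect` and `minimum_rectangle` are hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def minimum_rectangle(points: Iterable[Tuple[float, float]]) -> Rect:
    """Smallest axis-aligned rectangle that encloses every sampled point."""
    pts = list(points)
    if not pts:
        raise ValueError("a component needs at least one point")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return Rect(min(xs), min(ys), max(xs), max(ys))


# A horizontal stroke has no width, so its upper and lower edges coincide
# and the "rectangle" degenerates to a horizontal line.
heng = minimum_rectangle([(10, 40), (90, 40)])
# A left-falling stroke sampled at three points.
pie = minimum_rectangle([(45, 20), (30, 60), (10, 90)])
```

The degenerate case needs no special handling here: the same function returns a rectangle whose `top` equals its `bottom`.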
Step 3: The in-character numbering module numbers each minimal component split from a character in a fixed order (hereinafter the in-character number), and the minimal component table building module establishes the correspondence between each in-character number and the unified number, in the index table of Step 1, of the stroke, radical, or single-component character that the component represents. The left side of Fig. 5 shows the in-character numbering of the character 八 in left-to-right order; the right side shows the table mapping in-character numbers to unified numbers, that is, the minimal component table.
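The numbering and the resulting minimal component table might look like the following sketch. It assumes that "left to right, top to bottom" means sorting by the left edge first and the upper edge second, and it uses hypothetical unified numbers; neither detail is fixed by the text.

```python
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)

# Hypothetical unified numbers, standing in for the index table of Step 1.
UNIFIED_NUMBER = {"撇": 3, "捺": 4}


def number_components(components: List[Tuple[str, Rect]]) -> List[Tuple[str, str, Rect]]:
    """Assign in-character numbers 001, 002, ... in left-to-right, top-to-bottom order.

    Assumption: the order is decided by the left edge first, the upper edge second.
    """
    ordered = sorted(components, key=lambda item: (item[1][0], item[1][1]))
    return [(f"{i:03d}", name, rect) for i, (name, rect) in enumerate(ordered, start=1)]


def minimal_component_table(numbered: List[Tuple[str, str, Rect]]) -> Dict[str, int]:
    """Map each in-character number to the unified number of its component."""
    return {in_no: UNIFIED_NUMBER[name] for in_no, name, _ in numbered}


# 八, listed right-to-left on purpose to show that the order comes from the geometry.
ba = [("捺", (50, 10, 95, 90)), ("撇", (10, 20, 45, 90))]
numbered = number_components(ba)
table = minimal_component_table(numbered)
```

Because the numbering depends only on the rectangles, the same components always receive the same in-character numbers regardless of the order in which they were created.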
Step 4: For any two minimal components split from a character, the spatial relationship determination module determines the relative spatial relationships among the four edges of their minimum rectangles. For example, for the character 八, the right edge of the minimum rectangle of the left-falling stroke should lie to the left of, and not coincide with, the left edge of the minimum rectangle of the right-falling stroke, and the upper edge of the minimum rectangle of the right-falling stroke should lie above the upper edge of the minimum rectangle of the left-falling stroke. In the position relationship table building module, 0 denotes the left edge, 1 the upper edge, 2 the right edge, and 3 the lower edge. Using the in-character numbers of 八 in Fig. 5 and following the left-to-right, top-to-bottom order, this relationship is expressed as { (001/2<002/0),[001/1>002/1] }. Here the part before "/" is the in-character number of a minimal component and the part after it is the digit denoting an edge; the greater-than and less-than signs express the spatial relationship; parentheses enclose a left-to-right relationship and square brackets a top-to-bottom relationship; commas separate the relationships; and the braces enclose all spatial relationships among all minimal components of the character. In this way a table of all relative spatial relationships among all minimal components of the character is built, hereinafter called the position relationship table.
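One way to build such a table is sketched below. The text does not say exactly which edge pairs are recorded, so this sketch assumes that every pair of horizontal-axis edges (0, 2) and every pair of vertical-axis edges (1, 3) is compared for each pair of components, and that coordinates grow rightward and downward. The function name and the use of "=" for coinciding edges are likewise assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Tuple

LEFT, TOP, RIGHT, BOTTOM = 0, 1, 2, 3   # edge digits used after "/"
Rect = Tuple[float, float, float, float]  # indexed by the edge digits above


def _op(a: float, b: float) -> str:
    """Comparison sign for two edge coordinates ("=" marks coinciding edges)."""
    return "<" if a < b else ">" if a > b else "="


def position_relation_table(rects: Dict[str, Rect]) -> FrozenSet[str]:
    """Relations between the edges of every pair of minimum rectangles.

    Parentheses hold left-to-right relations, square brackets top-to-bottom ones.
    """
    relations = set()
    for (p, rp), (q, rq) in combinations(sorted(rects.items()), 2):
        for ea in (LEFT, RIGHT):
            for eb in (LEFT, RIGHT):
                relations.add(f"({p}/{ea}{_op(rp[ea], rq[eb])}{q}/{eb})")
        for ea in (TOP, BOTTOM):
            for eb in (TOP, BOTTOM):
                relations.add(f"[{p}/{ea}{_op(rp[ea], rq[eb])}{q}/{eb}]")
    return frozenset(relations)


# 八: left-falling stroke 001 and right-falling stroke 002 (illustrative coordinates).
ba = {"001": (10, 20, 45, 90), "002": (50, 10, 95, 90)}
table = position_relation_table(ba)

# Moving the whole character leaves the table unchanged.
moved = {k: (l + 100, t + 30, r + 100, b + 30) for k, (l, t, r, b) in ba.items()}
```

Since only comparisons between edges are stored, the table depends on the relative arrangement of the components and not on where on the screen they sit, which is the property the method relies on.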
Step 5: The data storage module stores the index table of Step 1 and each character's minimal component table and position relationship table in the processing device (for example, in its random access memory). Each character is represented by a character code (e.g., GB18030 or UTF-8), and the related information for the character, such as pronunciation, meaning, word formation, and example sentences, is stored with it.
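A stored record for one character could be sketched as follows. The class name, field names, and sample values are assumptions for illustration; only the idea of keeping the character code, the two tables, and the related information together comes from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass
class CharacterRecord:
    """Hypothetical stored record for one character (field names are assumptions)."""
    code: bytes                          # character code in the chosen encoding
    encoding: str                        # e.g. "utf-8" or "gb18030"
    minimal_components: Dict[str, int]   # in-character number -> unified number
    relations: FrozenSet[str]            # position relationship table
    info: Dict[str, str] = field(default_factory=dict)  # pronunciation, meaning, ...

    @property
    def character(self) -> str:
        """Decode the stored character code back into the character."""
        return self.code.decode(self.encoding)


record = CharacterRecord(
    code="八".encode("gb18030"),
    encoding="gb18030",
    minimal_components={"001": 3, "002": 4},
    relations=frozenset({"(001/2<002/0)"}),
    info={"pronunciation": "bā", "meaning": "eight"},
)
```

The same character has different byte codes under GB18030 and UTF-8, so the record keeps the encoding name next to the code.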
Embodiment 3
This embodiment is described with reference to Fig. 6. It describes a method for dragging the components of a Chinese character on a display device. The display device in this embodiment is preferably one with touch capability, that is, a touch display screen. The method comprises the following steps:
A1: Select a Chinese character. From the minimal component table stored in Step 5 of Embodiment 2, find the unified numbers of the minimal components that make up the character. Using these unified numbers, find the corresponding strokes, radicals, and single-component characters in the index table of Step 1 of Embodiment 2, and display these minimal components on the touch display screen of the display device (left side of Fig. 6). At the same time, record the position of each minimal component, expressed as the absolute coordinates of each edge of its minimum rectangle; the absolute coordinates may be the distances, in pixels, of each edge of the rectangle from the left and upper edges of the touch display screen.
A2: When a finger touches the touch display screen, compute the relationship between the touched point and the minimum rectangle of each minimal component. If the point lies inside a minimum rectangle, that minimal component is picked up and moves with the finger.
A3: When the finger leaves the touch display screen, update the position of the picked-up minimal component (right side of Fig. 6).
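Steps A2 and A3 amount to a hit test followed by a translation, as in the sketch below. The names `Part`, `hit_test`, and `drag` are hypothetical, and the `slop` parameter is an assumption added here because a zero-height rectangle (a horizontal stroke) is otherwise hard to touch; no real touch API is used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Part:
    """A displayed minimal component with the absolute edges of its minimum rectangle."""
    name: str
    left: float
    top: float
    right: float
    bottom: float


def hit_test(parts: List[Part], x: float, y: float, slop: float = 0.0) -> Optional[Part]:
    """Return the part whose minimum rectangle contains (x, y), or None.

    `slop` widens each rectangle by a few pixels on every side (an assumption).
    """
    for part in reversed(parts):  # later parts are assumed to be drawn on top
        if (part.left - slop <= x <= part.right + slop
                and part.top - slop <= y <= part.bottom + slop):
            return part
    return None


def drag(part: Part, dx: float, dy: float) -> None:
    """Move a picked-up part by the finger displacement (dx, dy)."""
    part.left += dx
    part.right += dx
    part.top += dy
    part.bottom += dy


parts = [Part("撇", 10, 20, 45, 90), Part("捺", 50, 10, 95, 90)]
picked = hit_test(parts, 60, 50)   # the touch lands inside the rectangle of 捺
if picked is not None:
    drag(picked, 100, 0)           # the finger moved 100 pixels to the right
missed = hit_test(parts, 47, 50)   # the gap between the two strokes
```

Only the stored rectangle edges change during a drag; the position relationship table is recomputed from them when recognition is attempted.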
Embodiment 4
This embodiment describes a method for recognizing the assembled character when the components of a character can form a character in terms of their relative spatial positions. The method comprises the following steps:
B1: Number all minimal components displayed on the touch display screen in the same order as in Step 3 of Embodiment 2, generate minimal component table A by the method of Step 3 of Embodiment 2, and then generate position relationship table X by the method of Step 4 of Embodiment 2.
B2: For the minimal component table B of each character stored in the processing device, if B is identical to A, compare the character's position relationship table Y with X. If these are also identical, the character is judged to be the assembled character.
B3: If, for every character on the processing device, its minimal component table B differs from A, or B is identical to A but Y differs from X, display "This is not a Chinese character", "Not found in the system", or another prompt.
B4: The character recognized in step B2 is represented by the character code stored in Step 5 of Embodiment 2.
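The comparison in steps B2 and B3 can be sketched as two equality checks per stored character. The stored data below is illustrative: the unified numbers are hypothetical and each position relationship table is abbreviated to a single relation, which is enough to show how 八 and 人, built from the same two strokes, are told apart.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

# (character, minimal component table B, position relationship table Y)
Stored = Tuple[str, Dict[str, int], FrozenSet[str]]

# Illustrative stored data; both characters consist of 撇 (3) and 捺 (4).
STORED = [
    ("八", {"001": 3, "002": 4}, frozenset({"(001/2<002/0)"})),
    ("人", {"001": 3, "002": 4}, frozenset({"(001/2>002/0)"})),
]


def recognize(table_a: Dict[str, int], table_x: FrozenSet[str],
              stored: Iterable[Stored]) -> Optional[str]:
    """Return the stored character whose tables B and Y equal A and X, else None."""
    for character, table_b, table_y in stored:
        if table_b == table_a and table_y == table_x:
            return character
    return None


def prompt(result: Optional[str]) -> str:
    """Text shown to the user: the character, or the message of step B3."""
    return result if result is not None else "This is not a Chinese character"


a = {"001": 3, "002": 4}
found = recognize(a, frozenset({"(001/2<002/0)"}), STORED)
other = recognize(a, frozenset({"(001/2>002/0)"}), STORED)
nothing = recognize({"001": 1}, frozenset(), STORED)
```

Because the match is an exact comparison of tables rather than a statistical classification, a correct stored representation cannot produce a wrong character; it can only fail to match.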
Embodiment 5
This embodiment is described with reference to Fig. 7. It describes a method for displaying information related to the character, such as pronunciation, meaning, word formation, and example sentences. The method comprises the following steps:
C1: Using the character code of the character recognized in step B4 of Embodiment 4, look up the related information for the character stored in the processing device, such as pronunciation, meaning, word formation, and example sentences.
C2: Display the related information found in C1 on the touch display screen (Fig. 7).
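Steps C1 and C2 reduce to a lookup keyed by character code followed by formatting for display. In this sketch the table name, field names, and the one-line text layout are assumptions, and a returned string stands in for drawing on the touch display screen.

```python
from typing import Dict, Optional

# Hypothetical related-information store, keyed by the UTF-8 character code.
RELATED_INFO: Dict[bytes, dict] = {
    "八".encode("utf-8"): {
        "pronunciation": "bā",
        "meaning": "eight",
        "words": ["八月"],  # "August"
    },
}


def lookup(code: bytes) -> Optional[dict]:
    """C1: find the related information stored for a character code."""
    return RELATED_INFO.get(code)


def render(code: bytes, encoding: str = "utf-8") -> str:
    """C2: build the text to show on the display for a character code."""
    info = lookup(code)
    if info is None:
        return "Not found in the system"
    character = code.decode(encoding)
    words = ", ".join(info["words"])
    return f"{character}  {info['pronunciation']}  {info['meaning']}  {words}"


line = render("八".encode("utf-8"))
```

Keying the store by character code means the recognition step only has to hand over the code it found, with no further parsing.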
Although the present invention has been disclosed above by way of preferred embodiments, these are not intended to limit the invention. Anyone skilled in the art may make various changes and modifications without departing from the spirit and scope of the invention, so the scope of protection of the invention shall be as defined by the claims.