CN102915550A - Character joining detection method - Google Patents


Info

Publication number
CN102915550A
Authority
CN
China
Prior art keywords
parts
literal
amalgamation
detection method
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012103953541A
Other languages
Chinese (zh)
Inventor
金连文
陈心涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN2012103953541A
Publication of CN102915550A
Legal status: Pending

Abstract

The invention provides a character joining detection method. A Chinese character image is processed and segmented to decompose the character into its component parts, and the parts are scrambled. The user then reassembles the parts into a correctly formed Chinese character according to the structural rules of Chinese characters. While the user recombines the character, the method checks the assembled result and judges how well the parts fit together. As the user moves the parts, the variance over all part images is computed in real time, a score is derived from the variance by a quadratic decision function, and the score is compared with a preset threshold to decide whether the joining has succeeded. The method is suitable for joining detection of all characters, is highly adaptable, and has the advantages of a small amount of calculation and a small storage footprint.

Description

Character joining detection method
Technical field
The present invention relates to computer image data processing and Chinese information processing technology, and in particular to a character joining detection method.
Technical background
At present, common character-assembly activities are mainly Chinese character assembly and English word assembly. Common Chinese character assembly is in practice a character-assembly toy: most such toys print character heads, radicals and other parts of Chinese characters on building blocks, and the blocks are combined to spell out Chinese characters. English word assembly mainly scrambles the letters that form an English word, and the learner recombines the scrambled letters in the correct order into a meaningful English word.
Chinese character-assembly toys have two drawbacks. First, they are inconvenient to use and learn with: to play, the learner must carry the blocks for the character heads, radicals and other parts of all Chinese characters. Second, they lack intelligent judgment: the toy cannot evaluate the learner's assembly result. The English word-assembly method, for its part, has narrow applicability. Because every English word is composed of the 26 English letters, scrambling the letters is trivial, but the method does not suit other scripts (such as Chinese characters). The main reason is that this method treats the English word as a character string and splits the string, rather than splitting at the level of the computer image.
Summary of the invention
To overcome the deficiencies of the prior art, the present invention proposes a character joining detection method. After the parts of a character have been randomly scrambled, the method measures the goodness of fit while the user joins them back together. It is highly adaptable, requires little computation, and occupies little storage space.
To achieve these goals, the technical solution of the present invention is as follows.
A character joining detection method comprises the following steps:
1) read a character from a database and split the character into a plurality of parts;
2) scramble the parts of the character and save the position information of all parts;
3) as the parts are moved and joined, compute the variance var between the current position of each part after the move and the position saved in step 2);
4) compute the current score Mark according to Mark = -0.0001*var*var+100, and use it to detect the goodness of fit of the character joining.
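The scoring formula in step 4) can be sketched directly. The function name `mark_from_variance` is an assumption introduced for illustration; only the formula itself comes from the text. Under this formula the score is 100 when var is 0, 75 when var is 500, and 0 when var is 1000.

```python
def mark_from_variance(var: float) -> float:
    """Goodness-of-fit score from the stated formula: -0.0001 * var * var + 100.

    `var` is the variance computed in step 3); how it is obtained is not
    part of this sketch.
    """
    return -0.0001 * var * var + 100.0
```

Because the formula is quadratic in var, the score falls slowly near a correct layout and quickly once the parts are far apart.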
Further, in step 1) the character is split into a plurality of parts either according to the splitting method of a parts database or by dividing it equally into N parts.
Further, N is 3, 4 or 5.
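The equal split into N parts might look like the following minimal sketch. The text does not say in which direction the character image is divided, so splitting into vertical strips by column is an assumption, as are the names `equal_split_bounds` and `split_image` and the representation of the image as a list of pixel rows.

```python
from typing import List, Tuple


def equal_split_bounds(width: int, n: int) -> List[Tuple[int, int]]:
    """Return n half-open column ranges [start, end) that cover 0..width."""
    if n <= 0 or width < n:
        raise ValueError("n must be positive and no larger than width")
    # Integer arithmetic spreads any remainder across the strips.
    return [(i * width // n, (i + 1) * width // n) for i in range(n)]


def split_image(image: List[List[int]], n: int) -> List[List[List[int]]]:
    """Cut a row-major image into n vertical strips (assumed direction)."""
    width = len(image[0])
    return [[row[a:b] for row in image] for a, b in equal_split_bounds(width, n)]
```

For a 400-pixel-wide image and N = 4 this gives four strips of 100 columns each; for N = 3 the strips are 133, 133 and 134 columns wide.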
Further, the variance var in step 3), between the current position of each part after the move and the position saved in step 2), is calculated as follows:
Each Chinese character part is drawn in a rectangular region of the same size, the position of the part is represented by the coordinates of the upper-left corner of that region, and during the joining process the variance of the upper-left corner points of all parts is calculated.
Further, the rectangular region is 400*400.
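A minimal sketch of the variance calculation follows. The text does not define how a variance over two-dimensional points is formed, so summing the population variances of the x and y coordinates is an assumption, as is the name `corner_variance`. The sketch also assumes each part sits at its correct offset inside its own region, so a correctly joined character has all upper-left corners coinciding and a variance of zero.

```python
from statistics import pvariance
from typing import Sequence, Tuple

Point = Tuple[float, float]


def corner_variance(corners: Sequence[Point]) -> float:
    """Spread of the regions' upper-left corners.

    Assumption: 2-D variance = population variance of the x coordinates
    plus population variance of the y coordinates.
    """
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return pvariance(xs) + pvariance(ys)
```

For two regions whose corners are at (0, 0) and (10, 0), the x variance is 25 and the y variance is 0, so the result is 25.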
Further, in step 3), while the parts are being moved and joined, a histogram is used to display the goodness of fit of the current character combination.
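The text gives no detail on how the goodness of fit is drawn, so the following is only a text-mode stand-in for such a display. The function name `fit_bar`, the bar width, and the clamping of the score to the range 0 to 100 are all assumptions.

```python
def fit_bar(mark: float, width: int = 20) -> str:
    """Render a score as a fixed-width text bar, clamped to 0..100."""
    clamped = max(0.0, min(100.0, mark))
    filled = round(clamped / 100.0 * width)
    return "[" + "#" * filled + "-" * (width - filled) + f"] {clamped:.0f}"
```

Clamping matters because the quadratic score becomes negative once the variance exceeds 1000.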
Further, when the current score Mark in step 4) is greater than a threshold X, the character joining has succeeded and the joined character is displayed.
Further, the threshold X is 90% to 97%.
Further, the threshold X is 95%; that is, when the current score Mark is greater than the 95% threshold, the character joining has succeeded and the joined character is displayed.
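The threshold test can be sketched as below, assuming that a threshold of X% means X points on the 0 to 100 scale of Mark. The names `is_joined` and `max_variance_for` are hypothetical. The second function inverts Mark = -0.0001*var*var+100 to show the variance bound that a threshold implies: roughly 316.2 for X = 90, 223.6 for X = 95 and 173.2 for X = 97.

```python
import math


def is_joined(mark: float, threshold: float = 95.0) -> bool:
    """Joining succeeds only when the score is strictly greater than X."""
    return mark > threshold


def max_variance_for(threshold: float) -> float:
    """Variance at which Mark = -0.0001 * var * var + 100 equals the threshold."""
    if not 0.0 <= threshold <= 100.0:
        raise ValueError("threshold must lie in 0..100")
    return math.sqrt((100.0 - threshold) / 0.0001)
```

A higher threshold therefore demands a tighter alignment of the parts before success is reported.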
Compared with existing character joining detection methods, the present invention has the following advantages:
(1) Wide applicability: the method draws the different parts of each Chinese character at different positions within rectangular regions of the same size. Whether joining succeeds is decided independently of the character itself and depends only on the positional relationship between the parts. The method therefore applies to all Chinese characters in different fonts rather than being limited to particular fonts. It also applies to the characters of other languages, which strengthens the extensibility of the whole system.
(2) Small storage footprint: the part information stores only the index of the last stroke of each part of each character, which greatly reduces the size of the whole database.
(3) Little computation: deciding whether joining has succeeded involves no complicated mathematics. It only requires computing the variance over all parts and evaluating the fitted quadratic function to obtain the Mark value, so the amount of calculation is very small and the system responds quickly.
(4) Flexible part splitting: the system keeps the part data of a Chinese character separate from its general information, and the part data records only the last-stroke index of each part of each character and no other information about the parts. The number of parts of a Chinese character can therefore be changed easily and flexibly in the system.
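Advantages (2) and (4) both rest on storing only the last-stroke index of each part. A minimal sketch of how parts could be recovered from such a record is shown below. It assumes that a character is held as an ordered stroke list, that each part is a contiguous run of strokes, and that the stored indices are 1-based; the function name `parts_from_end_strokes` and the stroke labels are hypothetical.

```python
from typing import List, Sequence


def parts_from_end_strokes(strokes: Sequence[str],
                           end_indices: Sequence[int]) -> List[List[str]]:
    """Group an ordered stroke list into parts using 1-based last-stroke indices."""
    parts: List[List[str]] = []
    start = 0
    for end in end_indices:
        if end <= start or end > len(strokes):
            raise ValueError("end indices must increase and stay within the stroke list")
        parts.append(list(strokes[start:end]))
        start = end
    return parts
```

With five strokes, the record `[2, 5]` yields two parts and `[1, 3, 5]` yields three, so changing the number of parts only means editing this short index list.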
Description of drawings
Fig. 1 is a system architecture diagram of the present invention.
Fig. 2 is a schematic diagram of the main system interface of the present invention.
Fig. 3 is a flow chart of the present invention.
Embodiment
The present invention is further described below with reference to the drawings and an embodiment.
The system architecture of the present invention is shown in Fig. 1. The system comprises five modules: a processing module, a database module, a settings module, an interface module and a prompt module. The modules are interconnected and work together, and the processing module is the core. The database module and the settings module supply the processing module with the part images of each Chinese character and the number of parts, and the processing module uses this part information to display the parts at random positions in the interface. A schematic diagram of the main system interface is shown in Fig. 2. While the user moves the Chinese character parts, the processing module computes the variance of all parts and the score Mark in real time and passes the Mark value to the prompt module, which displays the goodness of fit of the user's current assembly. When joining succeeds, the processing module notifies the prompt module, which displays the complete Chinese phrase and reads it aloud.
The flow of the present invention is shown in Fig. 3. Specifically, the outline information of a Chinese character is first read from the database, and the number of parts of the character is determined from the parts database. These two steps yield the image of each part of the character. Each part is then drawn in a separate rectangular region of size 400*400, and the parts are scrambled using the random function provided by the system. As the user moves the parts, the variance of all parts is computed in real time and the fitted quadratic curve is used to compute the score Mark. When the Mark value exceeds the threshold, the character is judged to have been joined successfully; otherwise the Mark value continues to be recomputed as the parts move. After a character has been joined successfully, the system checks whether there is a next character: if so, the joining process is repeated; if not, the process ends. The present invention can be implemented well as described above.
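The flow just described can be simulated end to end as a rough sketch. Everything beyond the formula, the 400*400 extent and the threshold is an assumption: the function names are hypothetical, the two-dimensional variance is taken as the sum of the x and y population variances, and the user is replaced by a stand-in that drags every part halfway toward the first part on each move.

```python
import random
from statistics import pvariance
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def mark_for(corners: List[Point]) -> float:
    """Score a layout: corner variance fed into Mark = -0.0001*var*var+100."""
    # Assumption: 2-D variance = population variance of x plus that of y.
    var = pvariance([x for x, _ in corners]) + pvariance([y for _, y in corners])
    return -0.0001 * var * var + 100.0


def scramble(n_parts: int, rng: random.Random, extent: float = 400.0) -> List[Point]:
    """Give every part a random upper-left corner (the scrambling step)."""
    return [(rng.uniform(0.0, extent), rng.uniform(0.0, extent)) for _ in range(n_parts)]


def join_one(n_parts: int, threshold: float = 95.0, seed: int = 0,
             max_moves: int = 50) -> Optional[int]:
    """Return the number of moves until Mark exceeds the threshold, or None."""
    rng = random.Random(seed)
    corners = scramble(n_parts, rng)
    anchor = corners[0]
    for move in range(1, max_moves + 1):
        # Hypothetical stand-in for the user: drag each part halfway to the anchor.
        corners = [((x + anchor[0]) / 2.0, (y + anchor[1]) / 2.0) for x, y in corners]
        if mark_for(corners) > threshold:  # success: stop recomputing
            return move
    return None


def run_session(part_counts: Dict[str, int]) -> Dict[str, Optional[int]]:
    """Repeat the joining process for each character in turn, then end."""
    return {ch: join_one(n, seed=i) for i, (ch, n) in enumerate(part_counts.items())}
```

Calling `run_session({"好": 2, "明": 3})` returns, for each character, how many simulated moves were needed before the score passed the threshold. In a real interface the move loop would be driven by drag events rather than by this stand-in.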

Claims (8)

1. A character joining detection method, characterized by comprising the following steps:
1) read a character from a database and split the character into a plurality of parts;
2) scramble the parts of the character and save the position information of all parts;
3) as the parts are moved and joined, compute the variance var between the current position of each part after the move and the position saved in step 2);
4) compute the current score Mark according to Mark = -0.0001*var*var+100, and use it to detect the goodness of fit of the character joining.
2. The character joining detection method according to claim 1, characterized in that in step 1) the character is split into a plurality of parts either according to the splitting method of a parts database or by dividing it equally into N parts.
3. The character joining detection method according to claim 2, characterized in that N is 3, 4 or 5.
4. The character joining detection method according to claim 1, characterized in that the variance var in step 3), between the current position of each part after the move and the position saved in step 2), is calculated as follows:
Each Chinese character part is drawn in a rectangular region of the same size, the position of the part is represented by the coordinates of the upper-left corner of that region, and during the joining process the variance of the upper-left corner points of all parts is calculated.
5. The character joining detection method according to claim 4, characterized in that the rectangular region is 400*400.
6. The character joining detection method according to claim 1, characterized in that in step 3), while the parts are being moved and joined, a histogram is used to display the goodness of fit of the current character combination.
7. The character joining detection method according to claim 1, characterized in that when the current score Mark in step 4) is greater than a threshold X, the character joining has succeeded and the joined character is displayed.
8. The character joining detection method according to claim 1, characterized in that the threshold X is 90% to 97%.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012103953541A CN102915550A (en) 2012-10-17 2012-10-17 Character joining detection method

Publications (1)

Publication Number Publication Date
CN102915550A true CN102915550A (en) 2013-02-06

Family

ID=47613902

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012103953541A Pending CN102915550A (en) 2012-10-17 2012-10-17 Character joining detection method

Country Status (1)

Country Link
CN (1) CN102915550A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1534516A (en) * 2003-04-02 2004-10-06 孙 勇 Chinese alphabet letter composing system its realizing method
US20050132342A1 (en) * 2003-12-10 2005-06-16 International Business Machines Corporation Pattern-matching system
CN102122298A (en) * 2011-03-07 2011-07-13 清华大学 Method for matching Chinese similarity

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897253A (en) * 2017-01-06 2017-06-27 燕山大学 A kind of font composition method based on improvement expansion algorithm
CN106897253B (en) * 2017-01-06 2020-05-29 燕山大学 Font synthesis method based on improved expansion algorithm

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20130206